Preface |
|
xv | |
Acknowledgements |
|
xvii | |
List of Contributors |
|
xix | |
List of Figures |
|
xxi | |
List of Tables |
|
xxix | |
List of Abbreviations |
|
xxxi | |
1 Introduction |
|
1 | (20) |
|
1.1 State of the Art in Engineering Data-Intensive Systems |
|
|
2 | (3) |
|
|
4 | (1) |
|
1.2 State of the Art in Semantics-Driven Software Engineering |
|
|
5 | (3) |
|
|
8 | (1) |
|
1.3 State of the Art in Data Quality Engineering |
|
|
8 | (4) |
|
|
11 | (1) |
|
|
12 | (3) |
|
|
15 | (2) |
|
1.5.1 Trinity College Dublin |
|
|
15 | (1) |
|
1.5.2 Oxford University - Department of Computer Science |
|
|
15 | (1) |
|
1.5.3 Oxford University - School of Anthropology and Museum Ethnography |
|
|
15 | (1) |
|
1.5.4 University of Leipzig - Agile Knowledge Engineering and Semantic Web (AKSW) |
|
|
15 | (1) |
|
1.5.5 Semantic Web Company |
|
|
16 | (1) |
|
1.5.6 Wolters Kluwer Germany |
|
|
16 | (1) |
|
1.5.7 Adam Mickiewicz University in Poznan |
|
|
16 | (1) |
|
1.5.8 Wolters Kluwer Poland |
|
|
17 | (1) |
|
|
17 | (4) |
2 ALIGNED Use Cases - Data and Software Engineering Challenges |
|
21 | (20) |
|
|
|
|
21 | (3) |
|
2.2 The ALIGNED Use Cases |
|
|
24 | (9) |
|
2.2.1 Seshat: Global History Databank |
|
|
24 | (2) |
|
2.2.2 PoolParty Enterprise Application Demonstrator System |
|
|
26 | (1) |
|
|
27 | (2) |
|
2.2.4 Jurion and Jurion IPG |
|
|
29 | (2) |
|
2.2.5 Health Data Management |
|
|
31 | (2) |
|
2.3 The ALIGNED Use Cases and Data Life Cycle. Major Challenges and Offered Solutions |
|
|
33 | (3) |
|
2.4 The ALIGNED Use Cases and Software Life Cycle. Major Challenges and Offered Solutions |
|
|
36 | (3) |
|
|
39 | (2) |
3 Methodology |
|
41 | (38) |
|
|
|
|
|
|
|
|
41 | (2) |
|
3.2 Software and Data Engineering Life Cycles |
|
|
43 | (6) |
|
3.2.1 Software Engineering Life Cycle |
|
|
43 | (4) |
|
3.2.2 Data Engineering Life Cycle |
|
|
47 | (2) |
|
3.3 Software Development Processes |
|
|
49 | (4) |
|
3.3.1 Model-Driven Approaches |
|
|
49 | (2) |
|
|
51 | (1) |
|
3.3.3 Test-Driven Development |
|
|
52 | (1) |
|
3.4 Integration Points and Harmonisation |
|
|
53 | (7) |
|
|
54 | (1) |
|
3.4.2 Barriers to Harmonisation |
|
|
55 | (3) |
|
3.4.3 Methodology Requirements |
|
|
58 | (2) |
|
3.5 An ALIGNED Methodology |
|
|
60 | (5) |
|
3.5.1 A General Framework for Process Management |
|
|
60 | (3) |
|
3.5.2 An Iterative Methodology and Illustration |
|
|
63 | (2) |
|
|
65 | (4) |
|
|
66 | (3) |
|
3.7 Sample Synchronisation Point Activities |
|
|
69 | (5) |
|
3.7.1 Model Catalogue: Analysis and Search/Browse/Explore |
|
|
70 | (1) |
|
3.7.2 Model Catalogue: Design and Classify/Enrich |
|
|
71 | (1) |
|
3.7.3 Semantic Booster: Implementation and Store/Query |
|
|
72 | (1) |
|
3.7.4 Semantic Booster: Maintenance and Search/Browse/Explore |
|
|
72 | (2) |
|
|
74 | (2) |
|
|
74 | (2) |
|
|
76 | (3) |
4 ALIGNED MetaModel Overview |
|
79 | (46) |
|
|
|
|
|
|
80 | (3) |
|
|
80 | (1) |
|
4.1.2 Namespaces and URIs |
|
|
81 | (1) |
|
4.1.3 Expressivity of Vocabularies |
|
|
82 | (1) |
|
4.1.4 Reference Style for External Terms |
|
|
82 | (1) |
|
4.1.5 Links with W3C PROV |
|
|
82 | (1) |
|
4.2 ALIGNED Generic Metamodel |
|
|
83 | (1) |
|
4.2.1 Design Intent Ontology (DIO) |
|
|
83 | (1) |
|
|
83 | (3) |
|
4.3.1 Software Life Cycle Ontology |
|
|
83 | (2) |
|
4.3.2 Software Implementation Process Ontology (SIP) |
|
|
85 | (1) |
|
|
86 | (1) |
|
4.4.1 Data Life Cycle Ontology |
|
|
86 | (1) |
|
4.5 DBpedia DataID (DataID) |
|
|
87 | (2) |
|
4.6 Unified Quality Reports |
|
|
89 | (36) |
|
4.6.1 Reasoning Violation Ontology (RVO) Overview |
|
|
89 | (2) |
|
4.6.2 W3C SHACL Reporting Vocabulary |
|
|
91 | (2) |
|
4.6.3 Data Quality Vocabulary |
|
|
93 | (3) |
|
4.6.4 Test-Driven RDF Validation Ontology (RUT) |
|
|
96 | (13) |
|
4.6.5 Enterprise Software Development (DIOPP) |
|
|
109 | (2) |
|
4.6.6 Unified Governance Domain Ontologies |
|
|
111 | (1) |
|
4.6.7 Semantic Booster and Model Catalogue Domain Ontology |
|
|
112 | (1) |
|
|
112 | (1) |
|
|
113 | (1) |
|
|
113 | (2) |
|
|
115 | (2) |
|
|
117 | (2) |
|
|
119 | (2) |
|
|
121 | (4) |
5 Tools |
|
125 | (76) |
|
|
|
|
|
|
|
|
|
|
125 | (30) |
|
|
125 | (2) |
|
|
127 | (11) |
|
|
127 | (3) |
|
5.1.2.2 Searching and browsing the catalogue |
|
|
130 | (1) |
|
5.1.2.3 Editing the catalogue contents |
|
|
131 | (3) |
|
|
134 | (1) |
|
5.1.2.5 Eclipse integration and model-driven development |
|
|
134 | (2) |
|
5.1.2.6 Semantic reasoning |
|
|
136 | (1) |
|
5.1.2.7 Automation and search |
|
|
137 | (1) |
|
|
138 | (17) |
|
|
138 | (1) |
|
|
139 | (16) |
|
|
155 | (9) |
|
5.2.1 RDFUnit Integration |
|
|
157 | (7) |
|
5.2.1.1 JUnit XML report-based integration |
|
|
158 | (1) |
|
5.2.1.2 Custom Apache Maven-based integration |
|
|
158 | (2) |
|
5.2.1.3 The Shapes Constraint Language (SHACL) |
|
|
160 | (1) |
|
5.2.1.4 Comparison of SHACL to schema definition using RDFUnit test patterns |
|
|
161 | (1) |
|
5.2.1.5 Comparison of SHACL to auto-generated RDFUnit tests from RDFS/OWL axioms |
|
|
162 | (1) |
|
5.2.1.6 Progress on the SHACL specification and standardisation process |
|
|
163 | (1) |
|
5.2.1.7 SHACL support in RDFUnit |
|
|
163 | (1) |
|
5.3 Expert Curation Tools and Workflows |
|
|
164 | (8) |
|
|
165 | (2) |
|
5.3.1.1 Graduated application of semantics |
|
|
165 | (1) |
|
5.3.1.2 Graph - object mapping |
|
|
165 | (1) |
|
5.3.1.3 Object/document level state management and versioning |
|
|
166 | (1) |
|
5.3.1.4 Object-based workflow interfaces |
|
|
166 | (1) |
|
5.3.1.5 Integrated, automated, constraint validation |
|
|
166 | (1) |
|
5.3.1.6 Result interpretation |
|
|
167 | (1) |
|
|
167 | (1) |
|
5.3.2 Workflow/Process Models |
|
|
167 | (5) |
|
5.3.2.1 Process model 1 linked data object creation |
|
|
167 | (1) |
|
5.3.2.2 Process model 2 object - linked data object updates |
|
|
168 | (1) |
|
5.3.2.3 Process model 3 updates to deferred updates |
|
|
168 | (1) |
|
5.3.2.4 Process model 4 schema updates |
|
|
169 | (1) |
|
5.3.2.5 Process model 5 validating schema updates |
|
|
170 | (1) |
|
5.3.2.6 Process model 6 named graph creation |
|
|
170 | (1) |
|
5.3.2.7 Process model 7 instance data updates and named graphs |
|
|
171 | (1) |
|
5.4 Dacura Approval Queue Manager |
|
|
172 | (1) |
|
5.5 Dacura Linked Data Object Viewer |
|
|
172 | (4) |
|
5.5.1 CSP Design of Seshat Workflow Use Case |
|
|
173 | (1) |
|
|
174 | (2) |
|
5.6 Dacura Quality Service |
|
|
176 | (8) |
|
5.6.1 Technical Overview of Dacura Quality Service |
|
|
177 | (1) |
|
5.6.2 Dacura Quality Service API |
|
|
178 | (6) |
|
5.6.2.1 Resource and interchange format |
|
|
178 | (1) |
|
|
178 | (1) |
|
|
178 | (1) |
|
|
178 | (1) |
|
|
179 | (1) |
|
|
180 | (1) |
|
|
180 | (1) |
|
5.6.2.8 Required schema tests |
|
|
180 | (1) |
|
|
181 | (1) |
|
|
182 | (1) |
|
|
182 | (2) |
|
5.7 Linked Data Model Mapping |
|
|
184 | (11) |
|
5.7.1 Interlink Validation Tool |
|
|
184 | (6) |
|
5.7.1.1 Interlink validation |
|
|
185 | (2) |
|
5.7.1.2 Technical overview |
|
|
187 | (1) |
|
5.7.1.3 Configuration via iv_config.txt |
|
|
188 | (1) |
|
5.7.1.4 Configuration via external_datasets.txt |
|
|
189 | (1) |
|
5.7.1.5 Execute the interlink validator tool |
|
|
190 | (1) |
|
5.7.2 Dacura Linked Model Mapper |
|
|
190 | (3) |
|
5.7.3 Model Mapper Service |
|
|
193 | (2) |
|
5.7.3.1 Modelling tool - creating mappings |
|
|
193 | (1) |
|
5.7.3.2 Importing semi-structured data with data harvesting tool |
|
|
193 | (2) |
|
5.8 Model-Driven Data Curation |
|
|
195 | (6) |
|
5.8.1 Dacura Quality Service Frame Generation |
|
|
196 | (1) |
|
5.8.2 Frames for User Interface Design |
|
|
197 | (1) |
|
5.8.3 Semi-Formal Frame Specification |
|
|
197 | (2) |
|
5.8.4 Frame API Endpoints |
|
|
199 | (2) |
6 Use Cases |
|
201 | (104) |
|
|
|
|
|
|
|
|
|
6.1 Wolters Kluwer - Re-Engineering a Complex Relational Database Application |
|
|
201 | (34) |
|
|
201 | (1) |
|
|
202 | (2) |
|
|
204 | (2) |
|
|
206 | (9) |
|
6.1.4.1 PoolParty notification extension |
|
|
206 | (1) |
|
6.1.4.2 rsine notification extension |
|
|
206 | (1) |
|
|
206 | (1) |
|
6.1.4.3 RDFUnit for data transformation |
|
|
207 | (4) |
|
6.1.4.4 PoolParty external link validity |
|
|
211 | (3) |
|
6.1.4.5 Statistical overview |
|
|
214 | (1) |
|
|
215 | (4) |
|
|
217 | (1) |
|
|
217 | (1) |
|
|
217 | (1) |
|
6.1.5.4 Measuring overall value |
|
|
218 | (1) |
|
6.1.5.5 Data quality dimensions and thresholds |
|
|
218 | (1) |
|
|
219 | (1) |
|
|
219 | (1) |
|
|
219 | (16) |
|
|
219 | (6) |
|
|
225 | (2) |
|
6.1.6.3 Tools and features |
|
|
227 | (1) |
|
|
228 | (4) |
|
|
232 | (2) |
|
6.1.6.6 Experimental evaluation |
|
|
234 | (1) |
|
6.2 Seshat - Collecting and Curating High-Value Datasets with the Dacura Platform |
|
|
235 | (24) |
|
|
237 | (1) |
|
6.2.1.1 Problem statement |
|
|
237 | (1) |
|
|
238 | (2) |
|
6.2.2.1 Tools and features |
|
|
240 | (1) |
|
|
240 | (6) |
|
6.2.3.1 Dacura data curation platform |
|
|
240 | (1) |
|
6.2.3.2 General description |
|
|
240 | (1) |
|
|
241 | (5) |
|
6.2.4 Overview of the Model Catalogue |
|
|
246 | (7) |
|
6.2.4.1 Model catalogue in the demonstrator system |
|
|
250 | (3) |
|
6.2.5 Seshat Trial Platform Evaluation |
|
|
253 | (6) |
|
6.2.5.1 Measuring overall value |
|
|
253 | (1) |
|
6.2.5.2 Data quality dimensions and thresholds |
|
|
253 | (6) |
|
6.3 Managing Data for the NHS |
|
|
259 | (13) |
|
|
259 | (1) |
|
|
260 | (1) |
|
|
260 | (1) |
|
|
260 | (1) |
|
|
261 | (2) |
|
|
263 | (5) |
|
|
263 | (1) |
|
6.3.4.2 NIHR Health Informatics Collaborative |
|
|
263 | (5) |
|
|
268 | (4) |
|
|
269 | (2) |
|
|
271 | (1) |
|
|
272 | (1) |
|
6.4 Integrating Semantic Datasets into Enterprise Information Systems with PoolParty |
|
|
272 | (30) |
|
|
272 | (2) |
|
|
274 | (1) |
|
|
274 | (1) |
|
|
274 | (2) |
|
|
276 | (8) |
|
6.4.4.1 Consistency violation detector |
|
|
276 | (1) |
|
6.4.4.2 RDFUnit test generator |
|
|
277 | (1) |
|
6.4.4.3 PoolParty integration |
|
|
277 | (1) |
|
6.4.4.4 Notification adaptations |
|
|
277 | (1) |
|
|
278 | (1) |
|
6.4.4.6 Validation on import |
|
|
278 | (6) |
|
|
284 | (11) |
|
6.4.5.1 RDF constraints check |
|
|
285 | (1) |
|
|
286 | (3) |
|
6.4.5.3 Improved notifications |
|
|
289 | (4) |
|
6.4.5.4 Unified governance |
|
|
293 | (2) |
|
|
295 | (7) |
|
6.4.6.1 Measuring overall value |
|
|
295 | (4) |
|
6.4.6.2 Data quality dimensions and thresholds |
|
|
299 | (1) |
|
|
300 | (2) |
|
6.5 Data Validation at DBpedia |
|
|
302 | (11) |
|
|
302 | (1) |
|
|
302 | (1) |
|
|
303 | (1) |
|
|
303 | (1) |
|
|
304 | (1) |
|
|
305 | (4) |
|
|
309 | (4) |
|
|
309 | (1) |
|
|
310 | (2) |
|
|
312 | |
7 Evaluation |
|
305 | (20) |
|
|
|
|
|
|
|
|
|
|
7.1 Key Metrics for Evaluation |
|
|
313 | (5) |
|
|
315 | (1) |
|
|
316 | (1) |
|
|
316 | (1) |
|
|
317 | (1) |
|
7.2 ALIGNED Ethics Processes |
|
|
318 | (2) |
|
7.3 Common Evaluation Framework |
|
|
320 | (3) |
|
|
320 | (1) |
|
|
320 | (1) |
|
|
321 | (2) |
|
7.4 ALIGNED Evaluation Ontology |
|
|
323 | (2) |
Appendix A Requirements |
|
325 | (70) |
Index |
|
395 | (4) |
About the Editors |
|
399 | |