(continued from Part I) / Library and Information Science, 43.1 (2017): 7-46. / [[中文]]
# An old record is no longer treated as mere data; it is now defined as a new semantic dataset, i.e. its triples, graphs, links, and file formats, together with its revised, vocabulary-encoded versions.
  ex. data:d2148340 a dcat:Dataset. # files: JSON-LD, TTL, XML
# A new method to curate, publish, and visualize LOD graphs via a CKAN portal, i.e. two models for one dataset, published in two views.
  ex. data:d2148340 a dcat:Dataset. # Dublin Core @schema1
  ex. data:d2148340 a data:Refined. # more semantics @schema2
# Validation and reproducibility: provenance and contexts are described in detail.
|Example: data:d2148340|
We then make use of structured records (XML files) from a digital archive catalogue and convert them into semantically rich and interlinked resources on the Web. This is realized as a unified Linked Data catalogue for several digital archive collections. Our work results in a LOD catalogue that is publicly available at data.odw.tw. The following five parts are involved in realizing this website.
A catalogue record about the species Pleione formosana (data:d2148340) is used throughout the paper as an example to demonstrate how we model, convert, and represent the semantics of a structured record.
|R4R Ontology|
Part 1: Exploring data reuse relations in a shared context -- We review our previous research on the Relation for Reuse Ontology (R4R). In particular, we provide mechanisms for reusing articles, data, and code, with some flexibility in encoding provenance and license information.
Part 2: Comparing two different data conversion approaches to providing LOD for an archive catalogue -- We show two different scenarios: (1) The LOD catalogue is converted directly from a relational database, and (2) the LOD catalogue is generated from a series of format conversions --- from XML to CSV, and then to RDF.
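The second scenario (XML to CSV, then CSV to RDF) can be illustrated with a minimal sketch. It is not the pipeline used in the paper: the XML element names, the sample record content, the `DATA_NS` namespace, and the choice of Dublin Core Terms properties are all assumptions made for illustration, and the RDF is emitted as plain N-Triples strings using only the Python standard library.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical sample record; element names and values are illustrative only.
SAMPLE_XML = """<records>
  <record id="d2148340">
    <title>Pleione formosana</title>
    <description>An orchid species</description>
  </record>
</records>"""

# Assumed namespace for record IRIs (not confirmed by the source).
DATA_NS = "http://data.odw.tw/record/"
# Dublin Core Terms namespace.
DCT_NS = "http://purl.org/dc/terms/"

FIELDS = ["id", "title", "description"]


def xml_to_csv(xml_text):
    """Step 1: flatten each <record> element into one CSV row."""
    root = ET.fromstring(xml_text)
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    for record in root.iter("record"):
        writer.writerow({
            "id": record.get("id"),
            "title": record.findtext("title", default=""),
            "description": record.findtext("description", default=""),
        })
    return buffer.getvalue()


def escape_literal(value):
    """Escape characters that are not allowed raw in an N-Triples literal."""
    return (value.replace("\\", "\\\\")
                 .replace('"', '\\"')
                 .replace("\n", "\\n")
                 .replace("\r", "\\r"))


def csv_to_ntriples(csv_text):
    """Step 2: turn each non-empty CSV cell into one N-Triples statement."""
    triples = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        subject = f"<{DATA_NS}{row['id']}>"
        for column in ("title", "description"):
            if row[column]:
                literal = escape_literal(row[column])
                triples.append(f'{subject} <{DCT_NS}{column}> "{literal}" .')
    return triples


if __name__ == "__main__":
    for line in csv_to_ntriples(xml_to_csv(SAMPLE_XML)):
        print(line)
```

Keeping CSV as an explicit intermediate step makes the flattened records easy to inspect or clean before any RDF is produced, which is one plausible reason to prefer this route over a direct database mapping.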
|KB links Example|
Part 4: Using CKAN as a Linked Data platform -- We briefly introduce CKAN, an open source web-based data portal software package for curating and publishing datasets. CKAN provides data preview, search, and discovery, especially with regard to geospatial datasets. We built several extensions to CKAN in order to deposit, publish, browse, and search Linked Data. Various Linked Data representations of a catalogue record --- Turtle, RDF/XML, and JSON-LD --- can all be downloaded and reused.
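To make the idea of several representations of one record concrete, the following sketch writes the same single statement (the example record typed as a DCAT dataset, written here as `dcat:Dataset`, the class name in the DCAT vocabulary) in Turtle and in JSON-LD. It is only an illustration: the `DATA_NS` namespace is an assumption, the function names are hypothetical, and the actual files served by the portal carry many more triples. RDF/XML is omitted for brevity.

```python
import json

# DCAT namespace as published by the W3C.
DCAT_NS = "http://www.w3.org/ns/dcat#"
# Assumed namespace for the "data:" prefix (not confirmed by the source).
DATA_NS = "http://data.odw.tw/record/"


def as_turtle(record_id):
    """Return a Turtle document stating that the record is a dcat:Dataset."""
    return (
        f"@prefix dcat: <{DCAT_NS}> .\n"
        f"@prefix data: <{DATA_NS}> .\n"
        "\n"
        f"data:{record_id} a dcat:Dataset .\n"
    )


def as_jsonld(record_id):
    """Return the same statement as a JSON-LD document."""
    document = {
        "@context": {"dcat": DCAT_NS, "data": DATA_NS},
        "@id": f"data:{record_id}",
        "@type": "dcat:Dataset",
    }
    return json.dumps(document, indent=2)


if __name__ == "__main__":
    print(as_turtle("d2148340"))
    print(as_jsonld("d2148340"))
```

Both outputs describe the same RDF triple; only the serialization differs, which is why a consumer can pick whichever format suits its tooling.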
Part 5: Designing an ontology for data representation and reuse -- We design an ontology, voc4odw, which includes the following three modules:
(1) The Core Model. It comprises a data model and a conceptual model.
The data model represents the key data structures and relations. It is a framework for describing data source, derivation, and provenance.
|The voc4odw Data Model|
The conceptual model incorporates Simple Knowledge Organization System (SKOS); it also connects to key event concepts. The conceptual model allows for data contextualization using common and domain knowledge vocabularies.
(2) The Curation Model. It describes the identification, classification, and publication of structured records on a curation platform, such as the classification of themes, the assignment of data identifiers, and the publication of datasets.
(3) A vocabulary voaf:Vocabulary. It is defined as "A vocabulary used in the Linked Data cloud" in the Vocabulary of a Friend (VOAF). This module relates the Core Model to external common vocabularies. Some hierarchical relations between different external vocabularies can be traced with this vocabulary.
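The data model in module (1) is described as a framework for source, derivation, and provenance. The actual voc4odw terms are not shown here, so the sketch below swaps in the W3C PROV-O property `prov:wasDerivedFrom` and hypothetical resource names (`ex:record.xml` and so on) to show how a derivation chain could be traced back to its original source once such relations are recorded as triples.

```python
# PROV-O property used as a stand-in for the ontology's own derivation term.
PROV_DERIVED = "http://www.w3.org/ns/prov#wasDerivedFrom"

# Hypothetical triples: an RDF file derived from a CSV file derived from XML.
TRIPLES = [
    ("ex:record.rdf", PROV_DERIVED, "ex:record.csv"),
    ("ex:record.csv", PROV_DERIVED, "ex:record.xml"),
]


def derivation_chain(resource, triples):
    """Follow wasDerivedFrom links from a resource back to its origin.

    Returns the list of resources visited, starting with `resource`.
    If a resource has several sources, only the first one is followed.
    """
    chain = [resource]
    seen = {resource}
    while True:
        sources = [o for s, p, o in triples
                   if s == chain[-1] and p == PROV_DERIVED]
        # Stop at the original source, or if the links form a cycle.
        if not sources or sources[0] in seen:
            return chain
        chain.append(sources[0])
        seen.add(sources[0])


if __name__ == "__main__":
    print(" <- ".join(derivation_chain("ex:record.rdf", TRIPLES)))
```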
11. GeoNames Entity
20. Wikidata Entity
4. The Encyclopaedia of Life (EOL)
http://data.odw.tw/r1/ (r2, r3…)