D3.7 Technical report on information extraction from heterogeneous data using TDM

This deliverable summarises the activities within Task 3.3 that were led by CLARIN ERIC with the main partners CLARIN/Athena, CLARIN/CUNI, and DARIAH/UGOE, in additional collaboration with SciencesPo. Over the course of M1-M38, the partners developed three demonstration scenarios that highlight the value of NLP (Natural Language Processing) technologies for the SSH field, and investigated which outcomes of T3.3 can be shared via the SSH Open Marketplace, and in what form.

D3.3 Populating EQB, meeting DDI standards

The European Question Bank (EQB) aims to provide a central search facility across all survey holdings of the Consortium of European Social Science Data Archives (CESSDA). It uses a question-level metadata schema based on the DDI-Lifecycle standard (DDI Alliance 2020). The ambition is for users to be able to find survey questions, their translations into the applicable languages, answer categories, pre- and post-question texts, and the study title.

D3.10 A 20th century version of the occupation multilingual ontology

Task 3.2, "Selected SSH Ontologies and Vocabularies", in SSHOC Work Package 3, "Lifting Technologies and Services into the SSH Cloud", aimed to foster the use of selected global ontologies in the social sciences and humanities, covering occupational titles, educational categories, sectors of industry, geographical regions, food items, and religions. These ontologies support the use of vocabularies both for classifying text corpora and for predefined response categories that facilitate self-identification in survey questions.

Many domains lack multilingual terminological resources. Making data and services accessible and usable in SSH is very much a matter of providing terminology across languages and multilingual vocabularies. A shortage of multilingual terminologies and vocabularies is an obstacle to the access and reuse of information, whereas using appropriate vocabularies can greatly improve both discovery and classification. Consequently, it is important for SSHOC to address this issue with respect to the SSH domain.

D3.9 Report on Ontology and Vocabulary Collection and Publication

This deliverable pertains to SSHOC Task 3.1, which was responsible for investigating and providing resources and tools to support the multilingual aspects of the future pan-EU SSH infrastructure.

Making data and services accessible and usable in SSH is also very much a matter of providing relevant translations, translations of metadata concepts, multilingual vocabularies, terminology extraction across languages, and multilingual databases.

D3.8 Implementation report and available SSHOC Switchboard and VCR services

SSHOC Task 3.6 (Making Data Re-usable and Actionable) focuses on two services that originate from the CLARIN infrastructure domain and that, in discussion and collaboration with the SSHOC communities, are being generalised and adapted to serve the broader Humanities and Social Sciences, with the specific purpose of facilitating better sharing and re-use of data and services in innovative ways.

D3.6 Report on SSHOC format interoperability solution services, including new software

SSHOC Task 3.5 has worked on metadata and data format interoperability issues and built an interoperability hub consisting of a portal (called Conversion Hub) and selected metadata conversion solutions that the task team is currently developing. The work builds on earlier work in Task 3.5 that produced Deliverable 3.1, Report on SSHOC (meta)data interoperability problems, which provided an inventory of the metadata and data formats used by the SSHOC communities, together with recommendations for metadata standards and data file formats.

D3.5 Report on citation enabled SSH catalogues and SSH citation exploitation

Citation is a pillar for the construction of knowledge. By creating proper citations in a standardized way, researchers can constitute a mesh of linked information for various purposes (from credit to reuse). This becomes increasingly important as the SSHOC Task 3.4 team confronts the realities of Social Sciences and Humanities research in a digital age, when machine actionability takes on a renewed and vital importance.

MS8 Choice of Vocabulary Publication platform for SSHOC

It is important to understand the general framework of use for a vocabulary publication platform for the SSHOC project. In SSHOC, it is crucial to support better discovery of SSH research data in order to ensure better access and reusability. This will be made possible through support for multilinguality. Infrastructures provide metadata aggregation platforms that map metadata to a shared common ontology, usually in English.