News

EDDI 2022 Conference Report

The aim of the European DDI (EDDI) conference is to be a place where the social science data management community can meet, exchange ideas, report progress and build DDI capabilities and capacity across Europe. Moving around different cities in Europe has been an integral part of that process. Since the beginning, the tutorials, long breaks between sessions, and meetups in the evening have all been part of encouraging a friendly and conducive atmosphere. This was very much challenged by COVID and the move to online-only events. SciencesPo, who had great plans for an in-person conference in 2020, graciously offered to host it online, stuck with us for 2021 and ably carried off the hybrid event in 2022 in Paris.

Up until 2020, over 500 people from 170 organizations had come to an EDDI conference. Going online does seem to have stimulated more interest: the three recent conferences hosted in Paris drew 274 new participants and 122 new organizations, whilst still retaining the participation of 40 percent of those who had previously attended an in-person conference. Paris 2022 had 120 participants, 50 of them new to EDDI, and 19 of the 40 presentations were from first-time attendees.

Whatever the advantages of online, the pleasure of actually meeting in person and having unprompted discussions can’t easily be reproduced, especially for those new to EDDI.

The conference was preceded by an online CODATA / DDI Alliance event, Introduction to DDI, with presentations on DDI-Codebook, DDI-Lifecycle and DDI-CDI, and demonstrations of various tools that support DDI. The presentations are available online at https://codata.org/initiatives/data-skills/ddi-training-webinars/europea....

The conference was opened by the new CESSDA Director, Bonnie Wolff-Boenisch, with a keynote entitled “The European Research Area - So Far and Yet So Close”, which drew on the development of a European vision for cooperation and her experiences at both Science Europe and the European Research Infrastructure (ESFRI) to illustrate the way in which the incorporation of infrastructure and collaboration is a critical part of delivering the European Research Area’s policy objectives. She concluded by saying that “without metadata experts and data infrastructures, there will not be a data highway for research and innovation”.

NESSTAR has been an important part of the DDI Codebook landscape since its launch in 2000, but with the decision to stop support in 2015, many organizations have been looking at the alternatives available. This was the subject of two sessions and a number of related presentations that showcased a range of different solutions, including using Dataverse as the backend repository and using the add-on developed at CDSP in 2018 to access variable-level information (https://www.odesi.ca), migrating to NADA (INED, Tulsa), using MTNA Rich Data Services (Statistics Canada) and Colectica (Sikt), through to the development of a completely new repository at Progedo and SSJDA. Other presentations from CDSP looked at what workflows might be needed to support these more heterogeneous environments, and at providing a DDI-Codebook feed from Dataverse. The World Bank also presented the latest upgrade to their NADA software.

DDI-Lifecycle has steadily increased its visibility at EDDI. A number of different groups are providing harmonizable / concorded data to the research community, and the presentations from CDSP, NACDA and FSD showed a range of different approaches to creating that content, both within a study and, increasingly, across studies using Colectica. Adoption of DDI-Lifecycle at CESSDA was the spur for a number of presentations on how the many archives are managing to provide content for the CESSDA Data Catalogue, and how CESSDA interacts with other European infrastructures, including ECRIN and EOSC. There were a number of presentations from INSEE, including a keynote from Franck Cotton on how, over a 10-year period, they have developed a system that can specify, field and manage the data from their business surveys natively in DDI-Lifecycle. New software supporting extraction of DDI-Lifecycle from datasets, by Colectica, GESIS and CLOSER, was also showcased. Presentations from CLOSER and INSEE explored the challenges of creating questionnaires in DDI-Lifecycle, which was also the subject of a one-day workshop preceding the conference that proposed the establishment of a DDI Alliance questionnaire working group.

SciencesPo were able to offer a diversity scholarship, and this led to two very interesting talks from researchers on their perspective on DDI.

There were a number of presentations focusing specifically on interoperability with other standards; in recent years, however, many presentations have been related to DDI being used with or alongside other standards. The DDI Alliance Technical Committee, along with the DDI-CDI Working Group, held a post-conference workshop on implementation languages for DDI, looking at ways in which DDI can be represented in forms other than XML, which has been the only representation thus far.

The Program Committee would also like to place on record our thanks to CDSP at SciencesPo for hosting what was a memorable in-person conference.

EDDI 2023 will be hosted by the Slovenian Social Science Data Archives at the University of Ljubljana.

Presentations from the Conference are available from https://zenodo.org/communities/eddi2022

Jon Johnson & Mari Kleemola
EDDI co-chairs

DDI Alliance Secretariat Offices Closed Dec. 24-Jan. 2

The DDI Alliance Secretariat offices will be closed from December 24th through January 2nd.  Our online resources remain open 24/7 at https://ddialliance.org/.  Happy Holidays!

Registration Open: DDI-CDI: Optimising Your Data Description for Integration and Reuse, Workshop 24 March 2023

Time: Friday 24 March 2023, 13:00-16:30 UTC

Location: online and Lindholmen Conference Centre, 5 Lindholmspiren, 417 56 Lindholmen, Sweden (co-located side event to the RDA Plenary)

Register here: https://www.eventbrite.com/e/ddi-cdi-optimising-your-data-description-for-integration-and-reuse-tickets-486696691907

Event Goal and Structure:

The goal of this workshop is to explain the mechanism employed by DDI-CDI and how it can most easily be leveraged to enhance the reusability of research data. DDI-CDI is a model-based, platform- and technology-independent specification designed to supplement the metadata holdings of data disseminators, archives, and producers. By allowing for an expression of structural metadata, with references to external controlled vocabularies and ontologies, and by connecting metadata records intended for discovery, provenance, and process description, it can act as a connector format which is independent of domain standards. Typically, it can be produced in a programmatic fashion from existing metadata records held in more domain-specific models, although it can also be used as a stand-alone specification. It supports granular, machine-actionable description of a wide variety of data, from traditional wide data files to event/streaming data to key-value (“big”) data and multidimensional cubes.
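As a rough illustration of why structural description matters across these data layouts, the sketch below holds the same values first as wide rows and then as key-value records. It is plain Python and does not use DDI-CDI syntax or its model; the field names (`person_id`, `age`, `income`) and both helper functions are hypothetical and exist only for this example.

```python
# Hypothetical wide records: one row per unit, one column per variable.
wide_rows = [
    {"person_id": 1, "age": 34, "income": 52000},
    {"person_id": 2, "age": 51, "income": 61000},
]


def wide_to_key_value(rows, id_field):
    """Unpivot wide rows into (unit id, variable name, value) records."""
    records = []
    for row in rows:
        for name, value in row.items():
            if name != id_field:
                records.append((row[id_field], name, value))
    return records


def key_value_to_wide(records, id_field):
    """Pivot (unit id, variable name, value) records back into wide rows."""
    by_unit = {}
    for unit, name, value in records:
        by_unit.setdefault(unit, {id_field: unit})[name] = value
    return list(by_unit.values())


key_value = wide_to_key_value(wide_rows, "person_id")
round_trip = key_value_to_wide(key_value, "person_id")
print(key_value)
```

The round trip only works because the code knows which field identifies the unit and which fields are measures. That knowledge is the kind of structural metadata a machine-actionable description needs to carry.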

This workshop will present an overview followed by a series of worked examples, with an exploration of different types of implementations and features of the standard in each. The intent is to give more than an overview, to help participants understand not only what DDI-CDI is intended to do, but also how it works to complement other popular metadata models and standards. Different syntax representations of the standards will be discussed.

DDI has long published metadata standards for the social, economic and behavioural sciences, which are widely used among data producers and archives, including those in the CESSDA network, such as the Swedish National Data Service, the UK Data Archive, Sciences Po, Gesis, Sikt – the Norwegian Agency for Shared Services in Education and Research and many more. DDI-CDI represents an evolution reflecting the growing importance of cross-disciplinary research and the requirement for data services to describe new types of data coming from other domains. The result is a specification which can describe any data in a domain agnostic fashion and is useful within domains for which other DDI specifications are not relevant. Because of this domain independent feature, it has become central to the WorldFAIR project work on the Cross-Domain Interoperability Framework.

The workshop is divided into two parts. Each part comprises three topics, each of which will be structured around a presentation and discussion.

Target Audience:

This workshop is intended to be useful to both technical and operational staff working in organisations which produce, archive, integrate, and disseminate quantitative research data, regardless of domain orientation. It is intended to address questions about what the practical implementation of systems supporting the FAIR principles will look like, and will appeal to infrastructure players who are concerned with broadening and deepening the reusability of their data holding through enhanced data and provenance description.

Part One: FAIR Functional Drivers and Requirements

13:00-13:30: The Variable Cascade: concepts, measures and observations.

13:30-14:00: Data Structures: the roles of concepts and variables.

14:00-14:30: Provenance: connecting data through process.

14:30-15:00 UTC: Break

Part Two: System Functions and Supporting Standards

15:00-15:30: Data integration across domains and structures.

15:30-16:00: Process description and alignment with PROV.

16:00-16:30: DDI-CDI as the connection point for a set of related specifications (CDIF example).

Organisational Note:

The workshop will be recorded and the recordings will be made available via CODATA Vimeo. If you plan to attend the event virtually, kindly note the Data Statement for CODATA Zoom at: https://drive.google.com/file/d/1QdZMRNs9h3Md4ArIiJepR15f3MLOYros/view?usp=sharing

All attendees, onsite and online are expected to comply with the CODATA Code of Conduct: https://codata.org/about-codata/codata-policies-and-guidelines/code-of-c...

EDDI Tutorial: 'Introduction to DDI', Paris, 28 Nov - recording and slides now available!

CODATA and the DDI Alliance collaborated to present free online training at the European DDI conference 2022. The training event introduced DDI and described the major specifications, DDI Codebook and DDI Lifecycle, as well as the upcoming DDI Cross-Domain Integration. In addition, DDI tool demonstrations showed how DDI can be used in practice.
 
You can catch up or revisit the workshop via the recording and slide decks now available at:
 
A wide range of speakers shared their expertise, as follows:
 
Introduction: Elizabeth Bishop (GESIS) & Wolfgang Zenk-Möltgen (GESIS);
DDI Codebook: Katja Moilanen (Finnish Social Science Data Archive);
DDI Lifecycle: Hayley Mills (CLOSER);
DDI Cross-Domain Integration: Arofan Gregory (CODATA);
 
plus DDI Tools demonstrations:
 
World Bank tools: Olivier Dupriez (World Bank);
Codebook Statconverter: Adrian Dușa (RODA);
Archivist: Becky Oldroyd (CLOSER);
DDI-CDI Tools implementation: Deirdre M. Lungley (UKDA);
Rich Data Services: Andrew DeCarlo (Metadata Technology).
 
Check out their presentations, now available at:
 
Find out more about the CODATA / DDI Alliance training webinar series at: https://codata.org/initiatives/data-skills/ddi-training-webinars/
 
Visit the DDI Alliance: https://ddialliance.org/
Visit CODATA: https://codata.org/

Save the Date: DDI Developers Hackathon in Gothenburg, March 24 & 25, 2023

Dear all

As some of you might have heard, we are planning to revive the DDI Developers Group, which has been dormant since 2014. To initiate this revival we plan to host a two-day DDI Developers Hackathon from Friday 24 March until Saturday 25 March 2023 at the Swedish National Data Service (SND) in Gothenburg. This event directly follows the Research Data Alliance (RDA) plenary in the same week, so we believe some participants could already be at the location, saving travel costs.

If you are a developer, software engineer or programmer using or implementing tools around the DDI suite of metadata standards, this event is a chance to exchange ideas with like-minded people. During the two days we would also like to create some prototype software implementations addressing current pain points or needed features for DDI tools.

The event is sponsored by the DDI Alliance, the Swedish National Data Service (SND) and the University of Applied Sciences of the Grisons, which will provide catering and the location for the whole event. Sponsorship of travel costs can also be provided for a limited number of people if their member organizations cannot cover them.

Please treat this mail as a Save-the-Date for this event and forward it to interested technical personnel in your organizations. We will soon follow up with another mail after EDDI containing a link for registration as well as an online document where proposals for topics can be entered.

Thanks in advance.

Best Regards

The DDI Developers Hackathon Organizing Team

Ingo Barkow

Johan Fihn Marberg

Olof Olsson

DDI Scientific Board Newsletter, Fall 2022

Ingo Barkow and Hilde Orten, Chair and Vice Chair of the DDI Scientific Board, distributed a newsletter about Scientific Board activities during the fall.  See:

https://ddi-alliance.atlassian.net/wiki/spaces/DDI4/pages/2884993035/News+from+the+Scientific+Board

Free, Virtual DDI Metadata Training: 28 November (13:00-15:30 CET)

The DDI Alliance and CODATA are hosting a free, virtual DDI metadata training workshop 13:00-15:30 CET on 28 November 2022. Anyone interested in metadata is welcome to attend!

Part of the European DDI User Conference, the training workshop will show how DDI metadata can describe single and longitudinal data collections, as well as preview the upcoming DDI-Cross Domain Integration (DDI-CDI) version that helps share and reuse data across domain boundaries and within and between research infrastructures.  Finally, DDI tools developers will demo tools and services for implementing DDI metadata.

To attend, please register by 22 November 2022 via the EDDI conference web site: https://eddi22.sciencesconf.org/resource/page/id/5

DDI Promotional Materials Available

Looking for DDI promotional materials?  Visit https://ddialliance.org/about/promotion to find digital and physical items, including logos, handouts, brochures, and presentation templates.

Need physical materials -- like brochures, buttons, and stickers -- shipped to you?  Just let us know!

Call for Contributions: Implementation Languages Across the DDI Suite of Products

The DDI Technical Committee is asking for ideas and thoughts on the identification and use of implementation languages in the DDI suite of products.

The purpose of this is to:

  • Identify priority implementation languages for DDI products (e.g. RDF, JSON, UML, XML, etc.)
  • Identify style options for implementation languages
  • Mappings to produce syntax representations
    • Moving from conceptual models to serialization
  • What aspects of implementation should be consistent
    • Document options, decisions, and reasoning
  • Provide guidance for variation from the agreed model
    • based on applied use of product
    • what needs to be noted and how (e.g. a consistent expression of exceptions and reasons)

There will be several ways to get involved, including:

  • Email ideas, proposals and thoughts to the DDI Technical Committee Chair, Wendy Thomas, at wlt@umn.edu
  • Attend the consultation and requirements gathering meeting on 2 December in Paris (after the European DDI User Conference)

Registration for the meeting on 2 December is not mandatory, but if you wish to attend it would be helpful to let us know so that we can plan for numbers.

A workshop will be held on 5-6 December 2022 in Paris, France to produce a workplan and recommendations arising from the consultation.

Panel Session at DCMI Conference: The Cross-Domain Interoperability Framework: Coordinating Standards for Scalable, Practical FAIR Sharing

Oct 4, 2022 1:15 PM Eastern Daylight Time

To register: https://www.dublincore.org/conferences/2022/sessions/panel-cross-domain-interoperability-framework/

We are now witnessing the emergence of FAIR data-sharing mechanisms in many areas, with the focus having shifted from the "what" to the "how" in many organizations. In many domains, there are a number of common standards – some of which can apply equally across domains, and some specific to the data, processes, and practices within that domain. The challenge of FAIR data sharing – ubiquitous, automated reuse of data and metadata – is particularly acute across domain and infrastructure boundaries, demanding a change in how data are described.

To meet this challenge, it is important to first understand how the different standards and models used to describe data can be employed, so that they speak not only to traditional users, but also to users coming from other domains. One major development in this area is the idea of a FAIR Digital Object Framework (FDOF), where information - both data and metadata - of interest for the discovery and reuse of data can be identified and obtained. The FDOF represents an initial step, but does not address many of the practical issues of interoperability. We must look at the intersection of standards of different types and how they fit into this picture: the idea that every FAIR resource is implemented according to an entirely new set of technical standards is not realistic. The FDOF serves as an agreed way to obtain needed FAIR resources and to learn enough about them to understand some related resources (e.g., metadata schemas) at the level of a protocol. It is not sufficient on its own to produce interoperability, which will require an ability to actually understand the metadata schemas being used. When it comes to standards, some parts of FAIR are better supported than others.

Discovery of FAIR resources increasingly relies on standards and approaches which are widely adopted, and often much the same across domains and institutional boundaries. DCAT, Schema.org, and Dublin-Core-based cataloguing metadata are commonly found in many areas. For other aspects of FAIR, however, this degree of domain-agnostic standardization does not exist. Semantics and vocabularies are often deeply domain-dependent, and other important types of metadata needed for effective reuse - structural metadata, provenance, etc. - are also seen in many different forms, reflecting domain practice. Within any given domain, the standards requiring support may be well-understood, and limited in number. The same cannot typically be said when data from other domains is the target of reuse. If we are to make use of the FDOF as intended, we need to have a second tier of domain-agnostic standards which makes this profusion of models, schemas, etc. tractable. Such a second tier should be developed as a mechanism for domain-specific standards to be more easily exchanged and transformed. Technical standards such as RDF, JSON, XML (etc.) may provide a useful foundation, but they are not themselves sufficient.

The standard vocabularies and models which are understandable across domains provide an additional needed layer of interoperability. One good example of this is SKOS: many domains use concept systems of different types. If they are described in SKOS, they can at least be exchanged and processed in a coherent way across domain boundaries, even if the specifics of the concepts themselves need further attention. The EOSC Interoperability Framework introduced this idea of a leveled hierarchy of standards, and it is a useful way to understand what a practical approach to interoperability looks like as we progress from the universal toward the domain- and community-specific. This session presents the requirements which lead us to a middle tier of domain-agnostic standards in support of the FDOF, and proposes some candidates for consideration based on implementations and explorations to date. Some examples of such standards are provided, showing how they can work together to provide the complete information set needed to reuse data in a FAIR data-sharing scenario across domain and institutional boundaries.
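To make the SKOS point concrete, here is a minimal sketch of two concepts described with SKOS terms and written out as N-Triples. The SKOS and RDF property URIs are the standard ones. The `http://example.org/concepts/` namespace, the concept identifiers, the labels and both helper functions are invented for illustration and do not come from any session material. The sketch assumes plain labels with no quotes or backslashes, so it does no string escaping.

```python
# Standard vocabulary URIs.
SKOS = "http://www.w3.org/2004/02/skos/core#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
# Hypothetical namespace for the example concepts.
BASE = "http://example.org/concepts/"


def concept_triples(scheme, concept_id, label, broader=None):
    """Return (subject, predicate, object) triples for one SKOS concept."""
    subject = BASE + concept_id
    triples = [
        (subject, RDF_TYPE, "<" + SKOS + "Concept>"),
        (subject, SKOS + "inScheme", "<" + BASE + scheme + ">"),
        (subject, SKOS + "prefLabel", '"' + label + '"@en'),
    ]
    if broader is not None:
        triples.append((subject, SKOS + "broader", "<" + BASE + broader + ">"))
    return triples


def to_ntriples(triples):
    """Serialise triples as N-Triples lines; objects are already formatted."""
    return "\n".join(f"<{s}> <{p}> {o} ." for s, p, o in triples)


triples = concept_triples("topics", "employment", "Employment")
triples += concept_triples("topics", "unemployment", "Unemployment", broader="employment")
ntriples = to_ntriples(triples)
print(ntriples)
```

A consumer from another domain can follow `skos:broader` and read `skos:prefLabel` without knowing anything about the subject matter, which is the limited but useful kind of interoperability described above.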

The focus of the session is on the "interoperability" and "reuse" elements of FAIR, but the session will touch on all aspects of FAIR data sharing, and how it might practically be realized. In particular, we aim to present these ideas to the DCMI community, to get feedback and to understand how this approach may intersect with current activities and thinking in the DCMI community and with related initiatives.

Speakers: Arofan Gregory (DDI Alliance and CODATA), Flavio Rizzolo (Statistics Canada), Franck Cotton (INSEE), Simon Hodson (CODATA).