What is a controlled vocabulary?
The DDI Controlled Vocabularies Group (CVG) has created a set of controlled vocabularies that can be used with DDI as well as for other purposes and applications. Many of the DDI Alliance vocabularies are already in use worldwide: across Europe by countries belonging to the Consortium of European Social Science Data Archives (CESSDA), in the United States by the Inter-university Consortium for Political and Social Research (ICPSR) and Mathematica Policy Research, in Canada, and elsewhere. DDI Controlled Vocabularies are also incorporated in editing and publishing tools that work with the DDI specification, such as Colectica and Nesstar Publisher.
A paper on "Controlled Vocabularies for DDI 3: Enhancing Machine-Actionability" provides additional background on this effort.
Available Formats
On the CESSDA Vocabulary Service website, the controlled vocabularies may be downloaded in SKOS, PDF, and HTML formats.
Usage
Usage information for each controlled vocabulary is available in the vocabulary documentation. The published DDI-CVs may be used with relevant classes from both DDI-Codebook and DDI-Lifecycle.
Translations
The languages in which translations are available are listed in the HTML presentation of the individual CVs, as shown on the CESSDA Vocabulary Service website. There users can select the language(s) and format(s) they wish to download.
Publication, Maintenance, and Management
Versioning Policy
The DDI versioning policy described below was approved by the DDI Alliance in November 2022 and has been published and implemented since January 2023. This protocol supersedes the previous policy, which was based on a two-digit version numbering system. Users who referenced these vocabularies prior to February 1, 2023 will need to retroactively change any version reference from V. x.x (two digits -- e.g., V. 1.0) to V. x.x.x (three digits -- e.g., V. 1.0.0). From that point on, new versions can be used and referenced normally.
Versioning is done at the level of each published controlled vocabulary (CV), and not at the code/concept level.
A code/concept in a vocabulary consists of the following parts:
Code value | The specific content that can be entered into the DDI specification as an identifier of the code across languages. In hierarchical lists, all of the levels are always mentioned in each code value, and are separated by a period (for example, AutomatedDataExtraction.ApiQuery). |
Descriptive term | The display label associated with the code. This may be available in multiple languages. |
Definition | The definition of the code. This may be available in multiple languages. |
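As a small illustration of the period-separated code values described above, the following Python sketch splits a hierarchical code value into its levels and derives the parent code value. The function names are hypothetical and are not part of any DDI tool.

```python
from typing import List, Optional


def split_code_value(code_value: str) -> List[str]:
    """Return the hierarchy levels of a code value, top level first."""
    return code_value.split(".")


def parent_code_value(code_value: str) -> Optional[str]:
    """Return the code value one level up, or None for a top-level code."""
    head, separator, _ = code_value.rpartition(".")
    return head if separator else None


print(split_code_value("AutomatedDataExtraction.ApiQuery"))
print(parent_code_value("AutomatedDataExtraction.ApiQuery"))
```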
Both the vocabularies and the individual concepts/codes have a persistent identifier (PID), that is, a Linked Data URI available in the SKOS/RDF export. Even though the URIs include a CV version number that may change over time, the vocabulary and the codes will remain findable based on the CV short name and the 7-character alphanumeric ID in the URI.
- The vocabulary URI pattern is: http://rdf-vocabulary.ddialliance.org/cv/<CV_SHORT_NAME>/<VERSION_NUMBER>/
  e.g. http://rdf-vocabulary.ddialliance.org/cv/AnalysisUnit/2.1.0/
- The code/concept URI pattern is: http://rdf-vocabulary.ddialliance.org/cv/<CV_SHORT_NAME>/<VERSION_NUMBER>/<7-character-alphanumeric_id>
  e.g. http://rdf-vocabulary.ddialliance.org/cv/AnalysisUnit/2.1.0/d56e194 (concept ID d56e194 with label “OrganizationOrInstitution”).
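The URI patterns can be handled programmatically. The following Python sketch builds and parses URIs of this shape, and derives a version-independent key from the CV short name and the concept ID. The names `CvUri`, `build_uri`, `parse_uri` and `stable_key` are hypothetical, and the regular expression assumes a three-digit version number and a concept ID of exactly seven ASCII letters or digits.

```python
import re
from typing import NamedTuple, Optional, Tuple

BASE = "http://rdf-vocabulary.ddialliance.org/cv/"


class CvUri(NamedTuple):
    short_name: str
    version: str
    concept_id: Optional[str]  # None for a vocabulary-level URI


# Assumption: versions look like 2.1.0 and concept IDs are 7 alphanumerics.
_URI_RE = re.compile(
    r"^http://rdf-vocabulary\.ddialliance\.org/cv/"
    r"(?P<short_name>[^/]+)/(?P<version>\d+\.\d+\.\d+)/"
    r"(?P<concept_id>[0-9A-Za-z]{7})?$"
)


def build_uri(short_name: str, version: str, concept_id: Optional[str] = None) -> str:
    """Build a vocabulary URI, or a concept URI when concept_id is given."""
    return f"{BASE}{short_name}/{version}/{concept_id or ''}"


def parse_uri(uri: str) -> CvUri:
    """Split a vocabulary or concept URI into its parts."""
    match = _URI_RE.match(uri)
    if match is None:
        raise ValueError(f"not a recognised CV URI: {uri}")
    return CvUri(match["short_name"], match["version"], match["concept_id"])


def stable_key(uri: str) -> Tuple[str, Optional[str]]:
    """Return (short name, concept ID), ignoring the version number."""
    parsed = parse_uri(uri)
    return parsed.short_name, parsed.concept_id


print(parse_uri("http://rdf-vocabulary.ddialliance.org/cv/AnalysisUnit/2.1.0/d56e194"))
```

Comparing `stable_key` values rather than full URIs is one way to match the same concept across CV versions.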
‘English’ refers to the source language of DDI vocabularies, that is, American English.
A change in the first digit of the version number will indicate a major change in the controlled vocabulary. Major changes are any substantive amendments in the content or meaning of a vocabulary (concept scheme) or code (concept), as specified below. Changes in the second digit of the version number will indicate a minor change. Minor changes are changes in wording, spelling in English, etc. (i.e., "form") that do not involve changes in intellectual content or meaning. Major and minor changes affect the source language (English) only.
A change in the third digit will indicate a sub-minor change. Sub-minor changes are changes in the content of language variants at vocabulary or code level, for instance, title, definition or descriptive term amendments in any language other than English. Sub-minor changes include the addition of a new language.
Users who have referenced these vocabularies prior to February 1, 2023 will need to retroactively change any version reference to a three-digit version by checking the value of the new third digit number on the DDI Alliance controlled vocabularies website and updating the version reference accordingly. For example, they should change ModeOfCollection 3.0 to ModeOfCollection 3.0.0. After February 1, 2023, new versions can be used and referenced normally. The previous URIs with the old two-digit version number will still find the published vocabularies with the new three-digit version number.
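For collections holding many old references, the retrofit could be scripted. The sketch below is a minimal Python example with a hypothetical function name. It assumes the correct third digit has already been looked up on the controlled vocabularies website, and it defaults to 0 only because that matches the examples given above.

```python
import re


def to_three_digit(version: str, third_digit: int = 0) -> str:
    """Convert a two-digit version reference (x.x) to three digits (x.x.x).

    third_digit is an assumption supplied by the caller; the policy asks
    users to check its actual value before updating a reference.
    """
    if re.fullmatch(r"\d+\.\d+\.\d+", version):
        return version  # already in the three-digit form
    if not re.fullmatch(r"\d+\.\d+", version):
        raise ValueError(f"unexpected version format: {version}")
    return f"{version}.{third_digit}"


print("ModeOfCollection", to_three_digit("3.0"))
```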
Major, minor and sub-minor changes are specified below. Note that these changes do not include changes to the short name of a vocabulary. Any change to the short name of a vocabulary results in the deprecation of that vocabulary. CV notes, usage information and version history can be edited without impact on versioning.
Major changes that may break backward compatibility - X.0.0 first digit changes
- Vocabulary definition is amended with meaning change in English
- Code/concept is added
- Code/concept is deprecated
- Code is replaced by another code
- Code value of a concept is changed
- Code/concept definition change with meaning change in English
- Descriptive term change with meaning change in English
Minor change - 1.X.0 second digit changes
- Vocabulary long name (title) is rephrased in English with no meaning change, for example, due to a typo
- Vocabulary definition is rephrased with no meaning change in English
- Code/concept definition is added in English
- Code/concept definition is rephrased without meaning change in English
- Descriptive term is rephrased without meaning change in English
Sub-minor change - 1.0.X third digit changes
- Changes in any language other than English that affect versioning, including:
  - Vocabulary long name rephrased with no meaning change
  - Vocabulary definition amended with meaning change
  - Vocabulary definition rephrased with no meaning change
  - Vocabulary definition added
  - Code/concept definition is added
  - Code/concept definition is amended with meaning change
  - Code/concept definition rephrased with no meaning change
  - Descriptive term is amended with meaning change
  - Descriptive term rephrased with no meaning change
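The three levels of change can be read as a rule for computing the next version number. The following Python sketch is one interpretation, with a hypothetical function name. It assumes that lower-order digits reset to zero when a higher-order digit changes, as the X.0.0 and 1.X.0 patterns above suggest; the policy text does not state this explicitly.

```python
def bump_version(version: str, change: str) -> str:
    """Return the next three-digit version for a given level of change.

    Assumption: digits to the right of the changed digit reset to 0.
    """
    major, minor, sub_minor = (int(part) for part in version.split("."))
    if change == "major":
        return f"{major + 1}.0.0"
    if change == "minor":
        return f"{major}.{minor + 1}.0"
    if change == "sub-minor":
        return f"{major}.{minor}.{sub_minor + 1}"
    raise ValueError(f"unknown change level: {change}")


for level in ("major", "minor", "sub-minor"):
    print(level, bump_version("2.1.0", level))
```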
Deprecation of a vocabulary (concept scheme):
- The CV short name is an identifier that is the same across languages. It is not translated.
- If the short name of a vocabulary is changed, the whole vocabulary, including all its language variants, is deprecated, and a new vocabulary is published with the new name. Versioning of the new vocabulary starts from scratch. Changing the short name signifies that it is a different vocabulary used for a different element in the DDI standard, and the DDI Alliance therefore considers it to be a new vocabulary.
- A vocabulary must also be deprecated when its short name changes because that change entails a change to all the other machine-actionable vocabulary identifiers, including the PID and URI.
- If the vocabulary's long name (title) changes with a change in meaning, then the short name also changes, thus causing the vocabulary to be deprecated. Typo corrections and other small changes in the long name do not cause the short name to change.
Vocabulary-level changes in English
DOCUMENTATION CHANGE EXPRESSED | LOGICAL EXPRESSION | EXPLANATION | CHANGE TYPE |
---|---|---|---|
CV DEFINITION AMENDED WITH MEANING CHANGE | X | The CV definition is amended to reflect a change in meaning for the CV in English. | Major |
CV LONG NAME REPHRASED | X | The CV title is amended without a change to the meaning (short name does not change). | Minor |
CV DEFINITION REPHRASED | X | The definition for the CV is rephrased for clarity, edited for accuracy without a change in meaning in English. | Minor |
Vocabulary-level changes in language variants
DOCUMENTATION CHANGE EXPRESSED | LOGICAL EXPRESSION | EXPLANATION | CHANGE TYPE |
---|---|---|---|
CV LONG NAME REPHRASED | X | The CV title is amended without a meaning change in a language other than English. | Sub-minor |
CV DEFINITION REPHRASED | X | The CV definition is rephrased without a meaning change in a language other than English. | Sub-minor |
CV DEFINITION AMENDED WITH MEANING CHANGE | X | The CV definition is amended with a meaning change in a language other than English. | Sub-minor |
CV DEFINITION ADDED | X | A CV definition is added in a language other than English. | Sub-minor |
Code/concept-level changes in English
DOCUMENTATION CHANGE EXPRESSED | LOGICAL EXPRESSION | EXPLANATION | CHANGE TYPE |
---|---|---|---|
CODE ADDED | --> Z | A new code Z is added to the CV. | Major |
CODE DEPRECATED | X --> | Code X is deprecated from the CV. | Major |
CODE IS REPLACED BY | X, Y (n) --> Z | One or more codes (X, n) are deprecated, and their meaning is taken over by a new Z. | Major |
CODE VALUE CHANGED | X --> Z | The value of code X is changed to Z but its definition remains the same. | Major |
CODE DEFINITION AMENDED WITH MEANING CHANGE | X | The definition for code X is amended to reflect a change in meaning for code X. | Major |
CODE DESCRIPTIVE TERM AMENDED WITH MEANING CHANGE | X | The descriptive term for code X is amended with a change in meaning. | Major |
CODE DEFINITION ADDED | X | Definition is added for the code X. | Minor |
CODE DEFINITION REPHRASED | X | The definition for code X is rephrased for clarity, edited for accuracy, or an example is added or deleted without a change in meaning. | Minor |
DESCRIPTIVE TERM REPHRASED | X | The term describing code X is rephrased for clarity or edited for accuracy, without a change in meaning. | Minor |
Code/concept-level changes in language variants
DOCUMENTATION CHANGE EXPRESSED | LOGICAL EXPRESSION | EXPLANATION | CHANGE TYPE |
---|---|---|---|
CODE DEFINITION ADDED | X | A definition for code X is added. | Sub-minor |
CODE DEFINITION AMENDED WITH MEANING CHANGE | X | The definition for code X is amended to reflect a change in meaning for code X. | Sub-minor |
CODE DEFINITION REPHRASED | X | The definition for code X is rephrased for clarity, edited for accuracy, or an example is added or deleted without a change in meaning. | Sub-minor |
CODE DESCRIPTIVE TERM AMENDED WITH MEANING CHANGE | X | The descriptive term for code X is amended with a change in meaning. | Sub-minor |
CODE DESCRIPTIVE TERM REPHRASED | X | The term describing code X is rephrased for clarity or edited for accuracy, without a change in meaning. | Sub-minor |
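The four tables above can be expressed as a lookup from a documented change and its language to a change type. The Python sketch below transcribes the table labels; the function names are hypothetical. Taking the most significant change in a batch as the overall change type is an assumption of this sketch rather than something the policy states.

```python
from typing import Iterable, Tuple

MAJOR, MINOR, SUB_MINOR = "Major", "Minor", "Sub-minor"

# Change types for changes in the source language (English).
ENGLISH_CHANGE_TYPES = {
    "CV DEFINITION AMENDED WITH MEANING CHANGE": MAJOR,
    "CV LONG NAME REPHRASED": MINOR,
    "CV DEFINITION REPHRASED": MINOR,
    "CODE ADDED": MAJOR,
    "CODE DEPRECATED": MAJOR,
    "CODE IS REPLACED BY": MAJOR,
    "CODE VALUE CHANGED": MAJOR,
    "CODE DEFINITION AMENDED WITH MEANING CHANGE": MAJOR,
    "CODE DESCRIPTIVE TERM AMENDED WITH MEANING CHANGE": MAJOR,
    "CODE DEFINITION ADDED": MINOR,
    "CODE DEFINITION REPHRASED": MINOR,
    "DESCRIPTIVE TERM REPHRASED": MINOR,
}

# Labels listed for language variants; all of them are sub-minor.
LANGUAGE_VARIANT_LABELS = {
    "CV LONG NAME REPHRASED",
    "CV DEFINITION REPHRASED",
    "CV DEFINITION AMENDED WITH MEANING CHANGE",
    "CV DEFINITION ADDED",
    "CODE DEFINITION ADDED",
    "CODE DEFINITION AMENDED WITH MEANING CHANGE",
    "CODE DEFINITION REPHRASED",
    "CODE DESCRIPTIVE TERM AMENDED WITH MEANING CHANGE",
    "CODE DESCRIPTIVE TERM REPHRASED",
}

_RANK = {SUB_MINOR: 0, MINOR: 1, MAJOR: 2}


def change_type(label: str, english: bool) -> str:
    """Look up the change type for one documented change."""
    if english:
        return ENGLISH_CHANGE_TYPES[label]
    if label not in LANGUAGE_VARIANT_LABELS:
        raise KeyError(label)
    return SUB_MINOR


def overall_change_type(changes: Iterable[Tuple[str, bool]]) -> str:
    """Return the most significant change type in a non-empty batch.

    Assumption: the highest-ranked change decides the version digit.
    """
    types = [change_type(label, english) for label, english in changes]
    return max(types, key=_RANK.__getitem__)


print(overall_change_type([
    ("CODE DEFINITION REPHRASED", True),
    ("CODE DEFINITION ADDED", False),
]))
```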
In addition to a change in the version number, each new version of a CV will contain documentation in the version history about how the new CV compares with the previous version, filtered by language. The changes will be documented using the following structure:
- CV LONG NAME REPHRASED:
  - Reponse Unit changed to Response Unit
- CV DEFINITION AMENDED WITH MEANING CHANGE:
  - AnalysisUnit
- CV DEFINITION REPHRASED:
  - ModeOfCollection
- CODE ADDED:
  - AutomatedDataExtraction
- CODE DEPRECATED:
  - SelfAdministeredQuestionnaire.FixedForm, is replaced by SelfAdministeredQuestionnaire
- CODE VALUE CHANGED:
  - Interview.FaceToFace.CAPICAMI changed to Interview.FaceToFace.CAPIorCAMI
- CODE DESCRIPTIVE TERM AMENDED WITH MEANING CHANGE:
  - From Geographical to Geospatial
- CODE DEFINITION ADDED:
  - Longitudinal.Panel
- CODE DEFINITION AMENDED WITH MEANING CHANGE:
  - Longitudinal.Panel
- CODE DEFINITION REPHRASED:
  - SelfAdministeredQuestionnaire.CAWI
- DESCRIPTIVE TERM REPHRASED:
  - Interview: Face-to-face: CAOI rephrased into Interview: Face-to-face: CAPI
- DESCRIPTIVE TERM REPHRASED:
  - Haastattelu: Kasvokkainen haastattelu: CAOI rephrased into Haastattelu: Kasvokkainen haastattelu: CAPI (an example in the language variant Finnish)
Note: DDI-CVG has also produced a set of guidelines to support users of controlled vocabularies in retrofitting their collections following the publication of new CV versions. These guidelines are intended only as recommendations and are not enforced as part of the versioning policy.
For an up-to-date list and download links of the latest versions of the CVs currently available, please see the CESSDA Vocabulary Service.
Name | Title | Description | Link |
---|---|---|---|
AggregationMethod | Aggregation Method | Identifies the type of aggregation used to combine related categories, usually within a common branch of a hierarchy, to provide information at a broader level than the level at which detailed observations are taken. (From: The OECD Glossary of Statistical Terms) | CESSDA |
AnalysisUnit | Analysis Unit | Describes the entity being analyzed in the study or in the variable. | CESSDA |
CharacterSet | Character Set | Standard set of characters upon which many character encodings are based (Wikipedia). | CESSDA |
CommonalityType | Commonality Type | Describes the degree of similarity between two items or schemes (collections of items). | CESSDA |
ContributorRole | Contributor Role | A classification of contributor roles. | CESSDA |
DataSourceType | Data Source Type | Includes a typology of data sources. | CESSDA |
DataType | Data Type | Identifies the type of data, which has a bearing on the acceptable data values, the operations that can be performed with the data, and the ways in which the data are stored. The present list is based on the W3C data types, and includes the terms relevant for documenting research data. | CESSDA |
DateType | Date Type | Specifies the type of date. The present list is based on ISO 8601 usage. | CESSDA |
GeneralDataFormat (formerly KindOfDataFormat) | General Data Format | Describes the physical format(s) of the data documented in the logical product(s) of a study unit. | CESSDA |
LanguageProficiency | Language Proficiency | Describes the level of proficiency of an individual in a natural language. | CESSDA |
LifecycleEventType | Lifecycle Event Type | Specifies the event happening over the data life cycle that is considered significant enough to document. | CESSDA |
ModeOfCollection | Mode of Collection | The procedure, technique, or mode of inquiry used to attain the data. | CESSDA |
NumericType | Numeric Type | Specifies the type of numeric data. | CESSDA |
ResponseUnit | Response Unit | Indicates the entity that provided the information carried by the variable. | CESSDA |
SamplingProcedure | Sampling Procedure | Includes a typology of sampling methods. | CESSDA |
SoftwarePackage | Software Package | Indicates the statistical software package used in the production/processing/dissemination of the data. Data collection software is not covered in this list. | CESSDA |
SummaryStatisticType | Summary Statistic Type | Specifies the type of summary statistic. Summary statistics are a single number representation of the characteristics of a set of values. | CESSDA |
TimeMethod | Time Method | Describes the time dimension of the data collection. | CESSDA |
TimeZone | Time Zone | Time zone specification as an offset from UTC (Coordinated Universal Time) in terms of hours and minutes. | CESSDA |
TypeOfAddress | Type of Address | Identifies the type of address entered as contact information for an individual or an organization. | CESSDA |
TypeOfConceptGroup | Type of Concept Group | Specifies the rationale for creating a concept group. | CESSDA |
TypeOfFrequency | Type of Frequency | Indicates the frequency of data collection events. | CESSDA |
TypeOfInstrument | Type of Instrument | Includes a typology of data collection instruments. | CESSDA |
TypeOfNote | Type of Note | Includes a typology of notes. | CESSDA |
TypeOfTelephone | Type of Telephone | Identifies the type of telephone entered as contact information for an individual or an organization. | CESSDA |
TypeOfTranslationMethod | Type of Translation Method | A typology of methods used to translate data collection instruments, including questionnaires, individual questions, measurements, data capture flows, etc. | CESSDA |
See archived versions.