Data Dictionary Table
The Data Dictionary Table (right side of Create a dataset) provides an interactive way to filter your variables & meta-data of interest. The table lets you build on the filtering provided by the ontology tree through its filter & search functions. Additionally, the data dictionary table has flexible display options, including controlling which data dictionary meta-data is displayed and adjusting the table's appearance (rows displayed, column order, width, etc.).
The data dictionary table is also where you create datasets and modify existing datasets.
Table display
By default, the data dictionary table displays the following columns.
ABCD:
- Variable name
- Variable label
- Variable type
- Data type
- Level of measurement
- Instructions
- Unit
For more information on ABCD’s data dictionary, visit their documentation website.
HBCD:
- Variable name
- Variable label
- Variable type
- Data type
- Level of measurement
- Unit
For more information on HBCD’s data dictionary, visit their documentation website.
Column display: To change which columns are displayed, select the column display button located at the top right of the table. In the widget, you can toggle columns on or off individually, or show or hide all columns at once using the buttons at the top. You can reset to the default settings by clicking the reset button.
You can make further changes to the appearance of the data dictionary table by clicking the menu icon next to each column header. In the column menu you can:
- Sort the column (ascending / descending).
- Pin the column to the left or right to keep it in view as you scroll horizontally. Pinned columns appear in the order selected; the column pinned first appears furthest out in the selected direction.
- Filter the column (learn more about filtering below).
- Hide column hides the column (it also deselects the column in the column display widget).
- Manage columns is another way to view the menu for selecting/deselecting the displayed columns.
Other ways to modify the table display:
- Click the arrows next to each column header to sort ascending or descending. This can also be done in the column menu. To sort by more than one column, click on the first column, then hold your system’s modifier key (cmd on a Mac; ctrl on Windows/Linux) and click on the second column (and any further columns).
Ordering the Variable name column alphanumerically may not best represent the logical progression of variables. The Sort order column is often more useful for seeing the order in which variables are displayed.
By default, the data dictionary table is sorted by the Table name and Sort order columns. Sort order best represents the variables in the order they were displayed to participants or in a more logical progression (such as grouping administrative variables together).
In some tables, Sort order is equivalent to the variable’s alphanumeric order, but in most tables Sort order has been set manually for a more intuitive display.
- Use the sliders to the right of each column to change the column’s width.
- At the bottom of the table, you can adjust the number of variables that are displayed on one page (10, 25, 50 or 100 rows; default: 25). Use the button with the page numbers or the arrows to navigate to different table pages.
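As a rough illustration of why the default ordering can differ from an alphanumeric sort on Variable name, the following Python sketch sorts a handful of rows both ways. The rows, table names, variable names, and dictionary keys are all hypothetical, and the code only mimics the behavior described above; it is not how the application itself is implemented.

```python
# Hypothetical rows; real table names, variable names, and sort orders will differ.
rows = [
    {"table_name": "tbl_b", "variable_name": "tbl_b_score", "sort_order": 2},
    {"table_name": "tbl_a", "variable_name": "tbl_a_age", "sort_order": 2},
    {"table_name": "tbl_b", "variable_name": "tbl_b_id", "sort_order": 1},
    {"table_name": "tbl_a", "variable_name": "tbl_a_id", "sort_order": 1},
    {"table_name": "tbl_a", "variable_name": "tbl_a_date", "sort_order": 3},
]


def default_sort(rows):
    """Sort by Table name first, then by Sort order within each table."""
    return sorted(rows, key=lambda r: (r["table_name"], r["sort_order"]))


def alphanumeric_sort(rows):
    """Sort by Variable name only, as when clicking that column header."""
    return sorted(rows, key=lambda r: r["variable_name"])


by_default = [r["variable_name"] for r in default_sort(rows)]
by_name = [r["variable_name"] for r in alphanumeric_sort(rows)]

print(by_default)  # the identifier variable comes first within each table
print(by_name)     # "age" and "date" now precede "id" in tbl_a
```

In this made-up example the manually assigned sort order keeps the identifier variable at the top of each table, whereas the alphanumeric sort does not.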
Data warnings
Data warnings have special indicators in the data dictionary table. These warnings exist as columns in the data dictionary and are viewable in the data dictionary table & the variable details.
There are two kinds of warnings in the ABCD & HBCD datasets: responsible data use & data quality. Warnings may be applied to individual variables or whole tables. Learn more about these warnings in the study documentation.
The presence of a warning is displayed as an icon next to the Table name and/or Variable name:
- Responsible data use warning
- Data quality warning
Hover over the alert icon to reveal a link on the word "details" that takes you to the variable details tab; from there you can navigate to the warning information on each study's respective website.
Table contents
This section covers how to control what is included in the displayed data dictionary table. As a reminder, the ontology tree can be used to adjust the variables included in the data dictionary.
Selections in the ontology tree are applied before filtering or the search function. Filtering also takes precedence over the search function. That is, the search function only searches among the variables that are displayed in the table given the selections in the ontology tree and any filters.
Order of priority: Ontology Tree > Filter > Search
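The order of priority can be pictured as three successive narrowing steps. The sketch below is only an illustration of that ordering under stated assumptions: the rows, the dictionary keys, and the `visible_rows` helper are hypothetical, and the real application may implement this differently.

```python
# Hypothetical rows standing in for data dictionary entries.
rows = [
    {"domain": "Physical Health", "table_name": "ph_example", "variable_label": "Hours of sleep"},
    {"domain": "Physical Health", "table_name": "ph_example", "variable_label": "Height"},
    {"domain": "Novel Technology", "table_name": "nt_example", "variable_label": "Sleep tracker minutes"},
]


def visible_rows(rows, selected_domains, column_filter, search_term):
    """Apply ontology selection, then the column filter, then the search."""
    # 1. Ontology tree: keep only rows in the selected domains.
    after_tree = [r for r in rows if r["domain"] in selected_domains]
    # 2. Filter: applied only to what the ontology tree left in the table.
    after_filter = [r for r in after_tree if column_filter(r)]
    # 3. Search: applied only to what the filter left in the table.
    term = search_term.lower()
    return [
        r for r in after_filter
        if any(term in str(value).lower() for value in r.values())
    ]


# Search for "sleep" with only Physical Health selected and no column filter.
sleep_hits = visible_rows(rows, {"Physical Health"}, lambda r: True, "sleep")

# A filter on table names starting with "nt" finds nothing, because the
# ontology selection already removed the Novel Technology rows.
nt_hits = visible_rows(
    rows, {"Physical Health"}, lambda r: r["table_name"].startswith("nt"), ""
)

print([r["variable_label"] for r in sleep_hits])
print(nt_hits)
```

Note that the Novel Technology row mentions sleep but is never searched, because it was excluded at the first step.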
To learn more about the data dictionary meta-data and to better understand the full scope of information on each variable, please visit each study's documentation.
Filter
Variable names are structured to provide you with information about the domain, source, measure, and question order. Understanding the variable name structure for the study can aid in sorting.
Available filters
- contains
- equals
- starts with
- ends with
- is empty
- is not empty
- is any of
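One way to read the operators listed above is as simple text predicates applied to a single column. The following Python sketch is an interpretation, not the application's implementation: the `OPERATORS` mapping, the `apply_filter` helper, the sample rows, and the choice to compare case-insensitively are all assumptions made for illustration.

```python
def norm(value):
    """Treat missing values as empty text and compare case-insensitively (assumption)."""
    return "" if value is None else str(value).strip().lower()


# Hypothetical mapping from operator label to a predicate on (cell, argument).
OPERATORS = {
    "contains": lambda cell, arg: norm(arg) in norm(cell),
    "equals": lambda cell, arg: norm(cell) == norm(arg),
    "starts with": lambda cell, arg: norm(cell).startswith(norm(arg)),
    "ends with": lambda cell, arg: norm(cell).endswith(norm(arg)),
    "is empty": lambda cell, arg=None: norm(cell) == "",
    "is not empty": lambda cell, arg=None: norm(cell) != "",
    "is any of": lambda cell, arg: norm(cell) in {norm(a) for a in arg},
}


def apply_filter(rows, column, operator, value=None):
    """Keep the rows whose value in `column` satisfies the chosen operator."""
    predicate = OPERATORS[operator]
    return [r for r in rows if predicate(r.get(column), value)]


# Hypothetical rows; real column values will differ.
rows = [
    {"Domain": "Physical Health", "Source": "Parent", "Atlas": ""},
    {"Domain": "Mental Health", "Source": "Youth", "Atlas": None},
    {"Domain": "Imaging", "Source": "Youth", "Atlas": "Example atlas"},
]

health = apply_filter(rows, "Domain", "contains", "Health")
parent = apply_filter(rows, "Source", "equals", "Parent")
with_atlas = apply_filter(rows, "Atlas", "is not empty")
either = apply_filter(rows, "Source", "is any of", ["Parent", "Youth"])

print(len(health), len(parent), len(with_atlas), len(either))
```

Each call corresponds to one [column] [operator] [value] filter; "is empty" and "is not empty" take no value.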
The ontology tree acts as the primary filter of the data dictionary table, but you can also filter the table itself. To specify column-specific filters, select the filter button at the top right of the table or Filter in the column menu. Selecting the filter from a column's menu defaults the filter to that column. You can only filter on columns that are actively selected for display. Filters can be modified and/or removed within the same pop-up.
The filter function only works on the variables that are currently listed in the table, i.e., after the selections in the ontology tree have been applied. Filtering on [Table name] [starts with] [nt] will not bring up variables from the Novel Technology domain if you have already used the ontology tree to limit the data dictionary table to variables from Physical Health.
All the categories of the ontology tree are also columns in the data dictionary, so you can filter on those columns instead of using the ontology tree. Some examples of these types of filters are:
- [Domain] [contains] [Health] - All variables in the ‘Physical Health’ or ‘Mental Health’ domain
- [Source] [equals] [Parent] - Only parent variables
- [Atlas] [is not empty] - Only variables with a defined ‘Atlas’, which by default will limit to only imaging variables
Example from ABCD dataset
Table names are structured to tell you the domain, source, and measure. If you know the abbreviations for these, you can filter by any of those criteria using the ‘Table name’ or ‘Variable name’ column.
- [Table name] [contains] [le_l_] - Limit the search to variables where ‘Linked External Data’ is the source and the data belongs to the main ABCD study.
- [Table name] [starts with] [nt] - Display variables in the ‘Novel Technology’ domain
- [Table name] [equals] [nc_y_lmt] - Variables related to the Little Man neurocognitive task
Example from ABCD dataset
Searching by keywords in the Domain, Subdomain, or Variable label columns can be a good way to explore the variables if you don’t know exactly what you are looking for.
For example, searching for [Variable label] [contains] [sleep] may give you a better idea of where to start looking for sleep-related variables or may tell you about measures you did not know about that also cover sleep.
To learn more about using keywords effectively, read about the study glossaries:
- ABCD: https://docs.abcdstudy.org/latest/
- HBCD: https://docs.hbcdstudy.org/latest/
Search
The search function works across all columns in the data dictionary, even if they are not actively displayed in the data dictionary table. For example, even if the Name (REDCap) column is not displayed, the search function will still search for matching terms in that column. Similarly, variables (rows) not on the currently viewable page of the data dictionary table are still searchable.
Search terms do not need to contain the full term. For example, you can search bisbas__bas_001 or mh_y_bisbas__bas__dr_001.
The search function is not case sensitive but, when searching by variable name, it will be sensitive to spelling and the number of underscores ( _ ) between terms.
Multiple search terms can be used by separating terms with a space; “harm mh_y” will search for variables that contain both harm AND mh_y.
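The search rules described in this section (partial matches, no case sensitivity, all columns searched, space-separated terms combined with AND) can be sketched in a few lines of Python. This is an illustration under assumptions rather than the application's actual logic: the `search` helper and the rows are hypothetical, and the sketch assumes each term may match in any column of the same row.

```python
def search(rows, query):
    """Return rows in which every space-separated term appears in some column."""
    terms = query.lower().split()

    def matches(row):
        cells = [str(value).lower() for value in row.values() if value is not None]
        # Every term must be a substring of at least one cell (AND across terms).
        return all(any(term in cell for cell in cells) for term in terms)

    return [r for r in rows if matches(r)]


# Hypothetical rows; these are not real variable names.
rows = [
    {"variable_name": "mh_y_example__harm_001", "variable_label": "Example item about harm"},
    {"variable_name": "mh_y_example__mood_001", "variable_label": "Example item about mood"},
    {"variable_name": "ph_p_example__harm_001", "variable_label": "Example parent item about harm"},
]

both_terms = search(rows, "harm mh_y")        # both terms required
any_case = search(rows, "HARM")               # case is ignored
one_underscore = search(rows, "example_harm")  # the names use two underscores

print([r["variable_name"] for r in both_terms])
print(len(any_case), len(one_underscore))
```

The last query returns nothing because the hypothetical names contain `example__harm` with two underscores, which mirrors the note above about underscore sensitivity.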