OpenRefine Instructions for Merging New Reconciled Data with Existing Reconciliations in Data Dictionaries
- Load the enriched data CSV into OpenRefine and rename it `VALUE-add`, where `VALUE` is the type of metadata (genres, languages, names, subjects, etc.)
- On the `structured_value` column, facet by selecting `Facet -> Customized facets -> Facet by blank`
- Select `TRUE` for the facet applied to the `structured_value` column
- On the `structured_value_add` column, facet by selecting `Facet -> Customized facets -> Facet by blank`
- Select `FALSE` for the facet applied to the `structured_value_add` column
- Use `Export -> Comma-separated value` to export the file from OpenRefine to the local machine
- Copy `VALUE.csv` from the `ds-data/terms/reconciled` repository to the local machine for the type of metadata to be merged
- Load `VALUE.csv` and `VALUE-add.csv` simultaneously into OpenRefine
- Name the file `VALUE-new`
- On the `VALUE_as_recorded` column, apply `Sort...` with the following parameters: text, case sensitive, a-z
- From the row view selector, find `Sort` and choose `Reorder rows permanently`
- On the `VALUE_as_recorded` column, select `Edit column -> Join columns...`
- Within the join columns menu, choose every column that helps make a row unique, select `Write result in new column named`, and type `duplicate_check` into the text box
- On the newly created `duplicate_check` column, facet by selecting `Facet -> Customized facets -> Duplicates facet`
- Select `TRUE` for the facet applied to the `duplicate_check` column to find any duplicate rows
- Star one row in each set of duplicates to mark it for removal, then on the `All` column, facet by selecting `Facet -> Facet by star`
- Select `TRUE` for the facet applied to starred rows
- On the `All` column, select `Edit rows -> Remove matching rows`
- Apply the JSON recipe (path listed after these steps) to remove unnecessary columns, merge the new data values into the existing data, and remove whitespace
- Use `Export -> Comma-separated value` to export the file from OpenRefine to the local machine
- Add the exported `VALUE-new.csv` to the `ds-data/terms/reconciled` repository from the local machine
- Deprecate/archive the old version of `VALUE.csv` by moving it to `ds-data/terms/batch`, under the appropriately dated directory
- Rename `VALUE-new.csv` to `VALUE.csv`
JSON recipe: `json/all/010-merge-clean-recon-data.json`
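
For readers who want to sanity-check the result outside OpenRefine, the filtering and duplicate-removal logic in the steps above can be approximated in a few lines of Python. This is only a rough sketch, not the project's recipe: the function names, the sample rows, and the choice of `names_as_recorded` as the sole uniqueness key are all hypothetical. It also keeps the first row of each duplicate set automatically, whereas the instructions leave that choice to the person starring rows, and Python's string ordering may not match OpenRefine's case-sensitive sort exactly.

```python
def is_blank(value):
    """Mimic OpenRefine's 'Facet by blank': None, empty, or whitespace-only."""
    return value is None or value.strip() == ""


def select_new_rows(rows, existing_col, added_col):
    """Keep rows where existing_col is blank (TRUE) and added_col is not blank (FALSE)."""
    return [
        row for row in rows
        if is_blank(row.get(existing_col)) and not is_blank(row.get(added_col))
    ]


def merge_without_duplicates(existing_rows, added_rows, sort_col, key_cols):
    """Combine both row sets, sort a-z on sort_col, and drop repeated keys.

    Assumption: the first row of each duplicate set is kept. Because the
    sort is stable and existing rows are listed first, an existing row
    wins over a newly added row with the same key.
    """
    combined = sorted(
        existing_rows + added_rows,
        key=lambda row: row.get(sort_col) or "",
    )
    seen = set()
    merged = []
    for row in combined:
        # Equivalent of the joined 'duplicate_check' column.
        key = tuple((row.get(col) or "").strip() for col in key_cols)
        if key in seen:
            continue
        seen.add(key)
        merged.append(row)
    return merged


# Hypothetical sample data standing in for the enriched CSV and names.csv.
enriched = [
    {"names_as_recorded": "Example B", "structured_value": "", "structured_value_add": "ID-2"},
    {"names_as_recorded": "Example A", "structured_value": "", "structured_value_add": "ID-1"},
    {"names_as_recorded": "Example C", "structured_value": "ID-3", "structured_value_add": "ID-3"},
    {"names_as_recorded": "Example E", "structured_value": "", "structured_value_add": ""},
]
existing = [
    {"names_as_recorded": "Example D", "structured_value": "ID-4"},
    {"names_as_recorded": "Example A", "structured_value": "ID-1"},
]

new_rows = select_new_rows(enriched, "structured_value", "structured_value_add")
merged = merge_without_duplicates(
    existing, new_rows, sort_col="names_as_recorded", key_cols=["names_as_recorded"]
)
```

In practice `key_cols` would list whichever columns were joined into `duplicate_check`, and the rows would be read from the exported CSV files (for example with `csv.DictReader`) rather than typed inline.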