Posts

Showing posts from December, 2016

Inserting a New Element into an XML File Using Oxygen

ISSUE: We want to be able to batch update all values in our ISO 19139 XML metadata files that reside in the GeoNetwork editing application. We have a Python script (CSW-Update) that can import a spreadsheet of values to make batch changes to the metadata files.

CHALLENGE: Our CSW-Update script cannot (yet) create certain elements that are not nested or that contain special attributes.

SOLUTION: Use XML Refactoring in Oxygen to create blank elements. Re-upload the metadata files to GeoNetwork, and then the CSW-Update script works.

Steps:
1. Download the metadata files from GeoNetwork and unzip them.
2. Open one file in Oxygen and go to Tools > XML Refactoring.
3. Select Insert Element.
4. The fields are not well documented in the Oxygen application, so below the image I have indicated what they mean.

Local Name: This is the text that will be inside the tag. You cannot specify a prefix here.
Namespace: You have to type in the full URI of
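For readers without Oxygen, the same idea (adding a blank, namespaced element that a later batch update can fill in) can be sketched in Python with the standard library. This is only an illustration, not the Oxygen operation and not part of CSW-Update: the helper name `insert_blank_child`, the sample document, and the choice of `gmd:purpose` as the element to add are all assumptions, and the sketch does not check schema element order.

```python
import xml.etree.ElementTree as ET

# ISO 19139 "gmd" namespace URI; registering it keeps the gmd: prefix on output.
GMD = "http://www.isotc211.org/2005/gmd"
ET.register_namespace("gmd", GMD)


def insert_blank_child(root, parent_path, local_name, namespace):
    """Append an empty element to the first parent matching parent_path.

    Hypothetical helper: the element is identified by a local name plus a
    full namespace URI, with no prefix in the local name.
    """
    parent = root.find(parent_path, {"gmd": GMD})
    if parent is None:
        raise ValueError(f"No element matches {parent_path!r}")
    # ElementTree expects the "{namespace-uri}localname" form for tags.
    return ET.SubElement(parent, f"{{{namespace}}}{local_name}")


# Minimal made-up record, just enough structure to show the insertion.
sample = (
    '<gmd:MD_Metadata xmlns:gmd="http://www.isotc211.org/2005/gmd">'
    "<gmd:identificationInfo><gmd:MD_DataIdentification/>"
    "</gmd:identificationInfo></gmd:MD_Metadata>"
)
root = ET.fromstring(sample)
new_element = insert_blank_child(
    root, "gmd:identificationInfo/gmd:MD_DataIdentification", "purpose", GMD
)
result = ET.tostring(root, encoding="unicode")
```

In a real record the new element is simply appended as the last child, so its position would still need to be checked against the schema before re-uploading to GeoNetwork.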

Deleting duplicate keywords in OpenRefine

We are planning to perform some keyword remediation on the Big Ten Academic Alliance Geoportal records starting in 2017. This process includes normalizing values by spelling, capitalization, and pluralization.

ISSUE: Duplicate keywords in metadata

SOLUTION: Use a GREL expression in OpenRefine

One of the challenges our project has encountered is duplicated keywords. Thanks to the code provided on the Free Your Metadata site, fixing this is an easy process with OpenRefine. Here are the steps:
1. Create a Project in OpenRefine from a CSV file.
2. The keywords should be combined in one cell per row. Take note of the separator character (usually a comma, but our CSV files often use triple hash marks: ###).
3. Click the dropdown arrow next to the column name and select Edit Cells > Transform.
4. Enter this GREL expression: value.split(", ").uniques().join(", "). Note: The character between the quotes in the expression needs to match the separato
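To make the logic of the split / uniques / join expression concrete, here is a rough Python equivalent. It is a sketch for illustration only, not something that runs inside OpenRefine; the function name `dedupe_keywords` is hypothetical, and keeping the first occurrence of each keyword in its original position is a choice made for this sketch.

```python
def dedupe_keywords(value: str, separator: str = ", ") -> str:
    """Split a keyword cell on the separator, drop repeats, and rejoin."""
    parts = value.split(separator)
    # dict.fromkeys keeps only the first occurrence of each keyword, in order.
    unique_parts = list(dict.fromkeys(parts))
    return separator.join(unique_parts)


comma_cell = dedupe_keywords("Roads, Rivers, Roads")
hash_cell = dedupe_keywords("Roads###Rivers###Roads", separator="###")
```

As in the GREL expression, the separator passed to the split and the join has to match the one used in the cell. The comparison is exact, so "roads" and "Roads" count as different keywords until the values have been normalized.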

A method for adding technical metadata to existing records

ISSUE: When aggregating records for inclusion in a geoportal, we frequently harvest just the metadata records. Unfortunately, these sometimes are missing needed technical information. Important technical metadata elements that might be missing include: bounding box coordinates, coordinate system, geometry type, distribution format, and file size. When we harvest large collections of standalone metadata files, it might not be apparent that certain values are missing until we have already worked on the descriptive metadata values in GeoNetwork.

CHALLENGE: This type of metadata is difficult or impossible to obtain without downloading the dataset and using ArcCatalog or GDAL to extract the values. If we already have a metadata file associated with the resource, what is the best way to insert the technical metadata values?

SOLUTION: Download the dataset and use ArcCatalog to automatically generate the technical metadata. Next, use the Import Metadata tool to merge the e