Posts

Showing posts from 2016

Inserting a New Element into an XML File Using Oxygen

ISSUE: We want to be able to batch update all values in our ISO 19139 XML metadata files that reside in the GeoNetwork editing application. We have a Python script (CSW-Update) that can import a spreadsheet of values to make batch changes to the metadata files. CHALLENGE: Our CSW-Update script cannot (yet) create certain elements that are not nested or contain special attributes. SOLUTION: Use XML Refactoring in Oxygen to create blank elements. Re-upload the metadata files to GeoNetwork, and then the CSW-Update script works. Steps: 1. Download the metadata files from GeoNetwork and unzip them. 2. Open one file in Oxygen and go to Tools > XML Refactoring. 3. Select Insert Element. 4. The fields are not well documented in the Oxygen application, so I have indicated what they mean below the image. Local Name: This is the text that will be inside the tag. You cannot specify a prefix here. Namespace:  Y...

Deleting duplicate keywords in OpenRefine

We are planning to perform some keyword remediation on the Big Ten Academic Alliance Geoportal records starting in 2017. This process includes normalizing values by spelling, capitalization, and pluralization. ISSUE: Duplicate keywords in metadata. SOLUTION: Use a GREL expression in OpenRefine. One of the challenges our project has encountered is duplicated keywords. Thanks to the code provided on the Free Your Metadata site, fixing this is an easy process with OpenRefine. Here are the steps: 1. Create a project in OpenRefine from a CSV file. 2. The keywords should be combined in one cell per row. Take note of the separator character (usually a comma, but our CSV files often use triple hash marks: ###). 3. Click the dropdown arrow next to the column name and select Edit Cells > Transform. 4. Enter this GREL expression: value.split(", ").uniques().join(", "). Note: The character between the quotes in the expression needs to match the...
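For readers who want to check the same logic outside OpenRefine, the split / uniques / join pattern in the GREL expression above can be sketched in Python. This is only an illustration, not part of the original workflow: the function name `dedupe_keywords` is hypothetical, the sketch assumes exact, case-sensitive matching, and it keeps the first occurrence of each keyword in its original position.

```python
def dedupe_keywords(cell, separator=", "):
    """Drop repeated keywords from one delimited cell.

    Hypothetical helper: matching is exact and case-sensitive, and the
    first occurrence of each keyword keeps its place in the list.
    """
    seen = set()
    unique = []
    for keyword in cell.split(separator):
        if keyword not in seen:
            seen.add(keyword)
            unique.append(keyword)
    return separator.join(unique)


# The separator must match the one used in the data, e.g. "###".
print(dedupe_keywords("Lakes, Rivers, Lakes"))
print(dedupe_keywords("Lakes###Rivers###Lakes", separator="###"))
```

As with the GREL expression, a mismatched separator leaves the cell unsplit, so no duplicates are removed.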

A method for adding technical metadata to existing records

ISSUE: When aggregating records for inclusion in a geoportal, we frequently harvest just the metadata records. Unfortunately, these are sometimes missing needed technical information. Important technical metadata elements that might be missing include: bounding box coordinates, coordinate system, geometry type, distribution format, and file size. When we harvest large collections of standalone metadata files, it might not be apparent that certain values are missing until we have already worked on the descriptive metadata values in GeoNetwork. CHALLENGE: This type of metadata is difficult or impossible to obtain without downloading the dataset and using ArcCatalog or GDAL to extract the values. If we already have a metadata file associated with the resource, what is the best way to insert the technical metadata values? SOLUTION: Download the dataset and use ArcCatalog to automatically generate the technical metadata. Next, use the Import Metadata tool to merge t...

GeoBlacklight Plugin for Omeka: Finding the correct URIs for U.S. states from the GeoNames API

Technology: The GeoBlacklight plugin for Omeka includes a very handy feature in the Spatial Coverage field. A user can type in a place term, which queries GeoNames and produces a dropdown list of options. The user selects a value from the list, and this pulls in the GeoNames URI as well as the bounding box coordinates. Issue: The dropdown list will only display the first 12 items returned for any given term. Occasionally, this list will not include the desired value. Searching for several different U.S. states will trigger this problem. The screenshots below show a search for Florida. Typing in “Florida” will not return the GeoNames URI for the state in the selectable list. Typing in “State of Florida” does return the desired value at the top of the list. A Discovery: All of the states in the U.S. have two possible formatting options that may (or may not) show up in the dropdown list in Omeka. On...

Batch Transform MARC to Dublin Core

Overview: This post lists a set of steps that can transform MARC metadata into the GeoBlacklight Metadata schema. The process involves changing a .mrc file into a .csv file. It also shows how to access the Math Tool in MarcEdit that will transform coordinates in the 034 field from degrees-minutes-seconds into decimal degrees. Tools Required: MarcEdit, spreadsheet editor. 1. Obtain the MARC metadata as a single file in the .mrc format. (MARCXML works too.) 2. Open MarcEdit and select the icon called MarcEditor. 3. Go to File > Open and select the .mrc file. 4. If you need to convert the coordinates in the 034 field from degrees-minutes-seconds, go to Edit > Edit Shortcuts > Math Functions > Convert To Decimal Degrees. (Otherwise, skip to step 5.) If your coordinates have been entered using the standard d,e,f,g part of the 034 field, accept the defaults. Review the results: Before: After: Go to File-Save As to save your work as...
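The arithmetic behind the degrees-minutes-seconds conversion in step 4 can be sketched in Python. This is a minimal illustration and not MarcEdit's own implementation. It assumes each coordinate is an eight-character `hdddmmss` string (hemisphere letter, three degree digits, two minute digits, two second digits), and the function name `dms_to_decimal` is hypothetical.

```python
def dms_to_decimal(coordinate):
    """Convert an 'hdddmmss' string such as 'W0931500' to signed decimal degrees.

    Hypothetical helper: assumes the hemisphere-degrees-minutes-seconds
    form, with no decimal fractions in the input.
    """
    if len(coordinate) != 8:
        raise ValueError(f"expected 8 characters, got {coordinate!r}")
    hemisphere = coordinate[0].upper()
    digits = coordinate[1:]
    if hemisphere not in ("N", "S", "E", "W") or not digits.isdigit():
        raise ValueError(f"not an hdddmmss coordinate: {coordinate!r}")

    degrees = int(digits[0:3])
    minutes = int(digits[3:5])
    seconds = int(digits[5:7])

    # decimal degrees = degrees + minutes/60 + seconds/3600
    value = degrees + minutes / 60 + seconds / 3600

    # South and West are negative in signed decimal degrees.
    return -value if hemisphere in ("S", "W") else value


print(dms_to_decimal("W0931500"))  # 93 degrees 15 minutes West
print(dms_to_decimal("N0450000"))  # 45 degrees North
```

Coordinates already stored as decimal degrees, or with fractional seconds, would need different parsing than this sketch provides.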

Technology List for Project

This post is a quick rundown of the main software applications used for the Big Ten Academic Alliance Geospatial Data Project. Hosted Applications: GeoBlacklight. Purpose: This application is the public geoportal. Authors: GeoBlacklight project. Code Base: Open source application built on Blacklight and Ruby on Rails. Project GitHub repository fork: https://github.umn.edu/Libraries/geoblacklight/ Associated Scripts: Csw-to-geoblacklight, Python and XSLT used for publishing, updating, and deleting records. Comments: This is the core output of the project. GeoBlacklight is actively developed, and new functionality has been added in version 1.x that we hope to migrate to soon. GeoNetwork. Purpose: This online application is for collaboratively editing XML files in the ISO 19139 format. Authors: Open Source Geospatial Foundation. Code Base: Open source application built on Java with Apache Tomcat. Project GitHub repository: http...