Posts

Generate Technical Metadata for GIS Data

Technical metadata can be programmatically extracted from the dataset.

Step 1: Generate or Update the Technical Metadata

How to generate an ArcGIS 1.0 XML file for one dataset at a time:
Open ArcCatalog and use the Catalog Tree to navigate to the location of the dataset.
Click on the dataset in the Catalog Tree.
Open the Description tab in the main window. This action will update or create a new metadata file in the ArcGIS Metadata 1.0 format.

How to generate an ArcGIS 1.0 XML file for a batch of files:
Download the datasets and unzip all of the files into a flat folder. With 7-Zip, use the extract option without the “*/”, because the “*/” option keeps the files in separate folders. Folders can be flattened with the following command line script (Mac tested):
find [DIRECTORY] -mindepth 2 -type f -exec mv -i '{}' [DIRECTORY] ';'
Download a zip of the project’s Custom ArcGIS Toolbox of Batch Metadata Scripts. Unzip t...
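For readers who are not on a Mac or do not have find available, the flattening command above can be approximated in Python. This is a minimal sketch, not part of the project's toolbox: the function name flatten is hypothetical, and unlike mv -i it skips files whose names clash instead of prompting.

```python
import shutil
from pathlib import Path


def flatten(directory):
    """Move every file nested below `directory` up into `directory` itself.

    Hypothetical helper that mirrors the `find ... -mindepth 2 -type f` recipe.
    Returns the list of new file locations.
    """
    root = Path(directory)
    moved = []
    # Build the full listing first so that moving files does not disturb the walk.
    nested_files = sorted(p for p in root.rglob("*") if p.is_file())
    for path in nested_files:
        if path.parent == root:
            continue  # already at the top level, like -mindepth 2
        target = root / path.name
        if target.exists():
            # Assumption: skip name clashes rather than prompt as `mv -i` would.
            continue
        shutil.move(str(path), str(target))
        moved.append(target)
    return moved
```

Calling flatten("[DIRECTORY]") with the real folder path leaves the now-empty subfolders in place, as the shell command does.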

Batch Transform FGDC to ISO 19139

There are two options for how to do this.
Option 1: Obtain standalone XML files and use the FGDC to ArcGIS Translator tool.
Option 2: Download the dataset and use the ArcGIS Update tool.

Before you begin: Download a zip of the project’s Custom ArcGIS Toolbox of Batch Metadata Scripts. Unzip the folder into a location that you can access from within ArcCatalog.

Option 1: Standalone XML files
Use the Esri Translator with FGDC2Esri_ISO selected on standalone XML files. All metadata except the reference system information will be retained.
1. Place the XML files in their own folder. Make sure that no other types of files are in the folder.
2. Open ArcCatalog and use the Catalog Tree to navigate to the downloaded folder of custom scripts.
3. Expand the toolbox and select the tool called “Translate FGDC to ArcGIS for Standalone XML file”.
4. The input is only the folder containing the XML files.
Details about the tool: The input is a fol...

Harvest datasets in a batch with WGET or a Browser Plugin

WGET
This command line program can download batches of datasets or just XML files.

Some recipes to try:
Download all zipped files from an FTP site:
wget -r --no-parent -A.zip [name of site]
Download only the XML files from an online folder:
wget -r -l 1 -np -A '*.xml' [name of site]
To download all ZIP files from a DCAT JSON:
Create a CSV from the JSON file (see above section called Harvesting Metadata from ArcGIS Open Data or Socrata portals).
Save a copy with only the download links (ZIP files).
Use this spreadsheet to download the datasets:
wget -i filename.csv

Troubleshooting Note: ArcGIS portals create the downloadable shapefiles on the fly, which means that they may time out or cause errors when downloading in batches or even one at a time.

Browser Plug-ins
A browser plugin is another way of downloading datasets. This option may work if WGET fails. As of this writing, a useful one for Chrome is Tab Save, where the user can just p...
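The DCAT JSON recipe above can also be scripted instead of going through a spreadsheet. The following is a minimal sketch that assumes the catalog follows the common data.json layout, with a top-level "dataset" list whose entries carry a "distribution" list with "downloadURL" or "accessURL" and "format" fields; a given portal may differ. The function names are hypothetical.

```python
import json
from pathlib import Path


def zip_links(catalog):
    """Collect distribution URLs that look like ZIP downloads.

    Assumes a data.json-style layout: catalog["dataset"][i]["distribution"][j].
    """
    links = []
    for dataset in catalog.get("dataset", []):
        for dist in dataset.get("distribution", []):
            url = dist.get("downloadURL") or dist.get("accessURL") or ""
            fmt = str(dist.get("format", "")).lower()
            # Keep the link if either the URL or the declared format says ZIP.
            if url and (url.lower().endswith(".zip") or fmt == "zip"):
                links.append(url)
    return links


def write_link_list(json_path, out_path):
    """Write one URL per line, the input format that `wget -i` reads."""
    catalog = json.loads(Path(json_path).read_text(encoding="utf-8"))
    links = zip_links(catalog)
    Path(out_path).write_text("".join(f"{u}\n" for u in links), encoding="utf-8")
    return len(links)
```

The resulting file can then be passed to wget -i in place of filename.csv.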

Geo4LibCamp 2017

This post was co-written with Mara Blake. Three members from the Big Ten Academic Alliance Geospatial Data Project attended Geo4LibCamp 2017. The organizers describe this annual event as “a hands-on meeting to bring together those building repository services for geospatial data. The main focus is to share best-practices, solve common problems, and address technical issues with integrating geospatial data into a repository and associated services.”

The Setting
Geo4LibCamp is hosted by Stanford University in Palo Alto, California. Stanford is a fitting location for this event, because it has been a leader in the development of geospatial repositories, or georepositories, for libraries, notably through its work with the University of California, Santa Barbara to develop the National Geospatial Digital Archive, as well as its more recent contributions to discovery platforms for archived GIS data, particularly GeoBlacklight. It is also home to the interactive Rumsey Map Center, which i...

Using sentence-case for keywords in OpenRefine

Issue: Capitalization and pluralization of ingested keywords vary. Our keyword list in GeoBlacklight is somewhat messy and contains near duplicates.
Challenge: Our instance of Solr for GeoBlacklight indexes Dog, dog, and dogs as separate keywords.
Solution: Use OpenRefine to normalize keywords before importing to Solr.
Description: As we aggregate metadata records from multiple sources, we found that the keywords need attention. The GIS records have keyword groups that may or may not come from a thesaurus, but frequently come from the TAGS field in ArcGIS Open Data Portals. As a result, the keywords are frequently just regional acronyms or abbreviations and often have many spelling variants. We also anticipate combining our metadata records with those made at other institutions outside of the Big Ten Academic Alliance Geospatial Data Project. After reviewing records from other universities and consulting the RDA rules on capitalization, we decided to...
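To illustrate the kind of normalization described above outside of OpenRefine, here is a minimal Python sketch. It is not the OpenRefine transform used in the post; the function names are hypothetical, and it only handles capitalization and whitespace. It does not merge plurals, and it would lowercase acronyms such as GIS, so those would need separate handling.

```python
def sentence_case(keyword):
    """Collapse whitespace, then capitalize only the first character."""
    cleaned = " ".join(keyword.split())
    return cleaned[:1].upper() + cleaned[1:].lower()


def normalize_keywords(keywords):
    """Sentence-case each keyword and drop exact duplicates, keeping order."""
    seen = set()
    result = []
    for keyword in keywords:
        normalized = sentence_case(keyword)
        if normalized and normalized not in seen:
            seen.add(normalized)
            result.append(normalized)
    return result


print(normalize_keywords(["Dog", "dog", "DOG parks", "dogs"]))
# prints ['Dog', 'Dog parks', 'Dogs']
```

Note that "Dogs" survives as a separate keyword, so pluralization variants still need a second pass.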

Inserting a New Element into an XML File Using Oxygen

ISSUE: We want to be able to batch update all values in our ISO 19139 XML metadata files that reside in the GeoNetwork editing application. We have a Python script (CSW-Update) that can import a spreadsheet of values to make batch changes to the metadata files.
CHALLENGE: Our CSW-Update script cannot (yet) create certain elements that are not nested or contain special attributes.
SOLUTION: Use XML Refactoring in Oxygen to create blank elements. Re-upload the metadata files to GeoNetwork, and then the CSW-Update script works.
Steps:
1. Download the metadata files from GeoNetwork and unzip them.
2. Open one file in Oxygen and go to Tools > XML Refactoring.
3. Select Insert Element.
4. The fields are not well documented in the Oxygen application, so I have indicated what they mean below the image.
Local Name: This is the text that will be inside the tag. You cannot specify a prefix here.
Namespace: Y...
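As an alternative to the Oxygen workflow above, a blank element can also be added with Python's standard library. This is only a sketch of the same idea, not the CSW-Update script or the Oxygen refactoring itself. The function name is hypothetical, the gmd namespace URI is the standard ISO 19139 one, and parentIdentifier in the usage line is just an illustrative choice. The sketch appends the new element as the last child and does not enforce the element order required by the schema.

```python
import xml.etree.ElementTree as ET

GMD = "http://www.isotc211.org/2005/gmd"
# Keep the familiar gmd: prefix when the tree is written back out.
ET.register_namespace("gmd", GMD)


def insert_blank_element(root, parent_path, local_name, namespace=GMD):
    """Append an empty <namespace:local_name/> to each matching parent.

    Parents that already have such a child are left alone.
    Returns the number of elements added.
    """
    tag = f"{{{namespace}}}{local_name}"
    added = 0
    for parent in root.findall(parent_path, {"gmd": GMD}):
        if parent.find(tag) is None:
            ET.SubElement(parent, tag)  # blank element, no text or attributes
            added += 1
    return added


xml_text = (
    '<gmd:MD_Metadata xmlns:gmd="http://www.isotc211.org/2005/gmd">'
    "<gmd:contact/></gmd:MD_Metadata>"
)
root = ET.fromstring(xml_text)
insert_blank_element(root, ".", "parentIdentifier")
```

For a folder of files, the same function could be applied to each parsed tree before re-uploading to GeoNetwork.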