
Posts

Showing posts from 2017

Harvesting from CKAN and Sorting Adjacent Key Value Columns

Many open data portals are built on the open source application CKAN. Metadata can be harvested from these portals using the CKAN API. Many CKAN items include numerous resource URLs, including download links in varying formats, landing page links, web services, and applications. Sorting through the myriad links can be challenging. This post describes how to: (1) harvest metadata with the ckan-exporter script, and (2) use OpenRefine to sort the resource URLs. Part 1: Harvest metadata with the ckan-exporter script. The CKAN API has a number of calls that return information such as a list of items, tags, or organizations. It will also return an item's metadata; the API calls refer to an item as a package. The ckan-exporter script lets the user define a list of desired metadata elements to harvest via the command line. The readme file documents how to set up the harvest files, and examples are included. The BTAA GDP fork o...
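To make the "package" idea concrete, here is a minimal Python sketch of the two pieces involved: building a `package_show` call against the CKAN Action API, and pulling the resource links out of the package it returns. This is not the ckan-exporter script; the helper names `package_show_url` and `resource_links` and the sample portal URL are assumptions for illustration, and the sketch only relies on the standard `resources`, `url`, and `format` fields of a CKAN package.

```python
from urllib.parse import urlencode


def package_show_url(portal, package_id):
    """Build the CKAN Action API URL that returns one package's metadata."""
    query = urlencode({"id": package_id})
    return f"{portal.rstrip('/')}/api/3/action/package_show?{query}"


def resource_links(package):
    """Return (format, url) pairs for a package dict, sorted by format, then URL."""
    links = []
    for res in package.get("resources", []):
        url = (res.get("url") or "").strip()
        if not url:
            continue  # skip resources that carry no link
        # Resources without a declared format are grouped under "unknown".
        fmt = (res.get("format") or "").strip().lower() or "unknown"
        links.append((fmt, url))
    return sorted(links)


# Hypothetical package, shaped like the "result" object of a package_show response.
sample_package = {
    "name": "parcels",
    "resources": [
        {"format": "SHP", "url": "https://example.org/parcels.zip"},
        {"format": "", "url": "https://example.org/landing"},
        {"format": "CSV", "url": "https://example.org/parcels.csv"},
    ],
}

for fmt, url in resource_links(sample_package):
    print(fmt, url)
```

Grouping links by format in this way is one simple stand-in for the sorting step; resources with no declared format (often landing pages) end up together under `unknown`.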

Exporting Metadata from Omeka to CSV

February 2018 Update: There is now a plugin for Omeka that allows users to export to CSV via the interface. The Omeka digital collections platform features many easy-to-use plugins to facilitate editing and sharing metadata. Oddly, there isn't a built-in option or even a plugin available yet that allows a user to export the metadata directly to a spreadsheet. This post lists a step-by-step method to do this using a PHP script written by an Omeka developer. Note: This script will export all items that have been marked Public. It will also export all of the elements, even if they are empty. 1. Clone or download OmekaApiToCsv. This is a version of the original script with the addition of a pipe delimiter for multivalued elements. 2. Upload and extract this set of scripts to the same web server as the Omeka installation. [Image: folder of the OmekaApiToCsv scripts within the rest of the Omeka files] 3. Open the php file within the OmekaApi...
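The pipe delimiter for multivalued elements can be illustrated with a short Python sketch. This is not the OmekaApiToCsv PHP script: the function names are hypothetical, and the input is assumed to be a list of items shaped like Omeka API JSON, each with an `element_texts` list whose entries carry `text` and `element.name`. Unlike the script described above, this sketch only writes columns for elements that actually appear in the input.

```python
import csv
import io


def item_to_row(item, delimiter="|"):
    """Collapse one item into {element name: text}, joining repeated elements."""
    row = {}
    for element_text in item.get("element_texts", []):
        name = element_text["element"]["name"]
        text = element_text.get("text", "")
        # A repeated element (e.g. two Subjects) is appended after the delimiter.
        row[name] = f"{row[name]}{delimiter}{text}" if name in row else text
    return row


def items_to_csv(items, delimiter="|"):
    """Return CSV text with one row per item and one column per element name."""
    rows = [item_to_row(item, delimiter) for item in items]
    # Keep columns in first-seen order across all items.
    fields = list(dict.fromkeys(name for row in rows for name in row))
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fields, restval="", lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical items for illustration only.
sample_items = [
    {"element_texts": [
        {"text": "Map A", "element": {"name": "Title"}},
        {"text": "roads", "element": {"name": "Subject"}},
        {"text": "rivers", "element": {"name": "Subject"}},
    ]},
    {"element_texts": [{"text": "Map B", "element": {"name": "Title"}}]},
]

print(items_to_csv(sample_items))
```

Joining on a pipe keeps each element in a single spreadsheet column, and the values can be split apart again later, provided the pipe character does not occur in the metadata itself.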

Geospatial Metadata Contact Types and Roles

Geospatial metadata standards include multiple elements for contacts: persons or organizations that play some kind of role in the creation or maintenance of the resource. Although many metadata profiles require listing multiple contact types/roles, they are often the same entity. This post lists the main contact types in use, and their locations in the ISO 19139 and FGDC CSDGM standards. Point of Contact: This is the person/organization that the user should contact for questions. This contact type should include an address, email, and phone number. Their role in the creation or maintenance of the resource is not specified here; it is just an all-purpose contact for the resource. ISO ⇨ MD_DataIdentification.gmd:pointOfContact; FGDC ⇨ 1.9-10. Metadata Point of Contact: This is the person/organization that is responsible for the metadata. They are not necessarily part of the publisher or distributor organization. They are the perso...

ArcGIS Metadata Toolbox Guide

These charts list the most useful metadata tools/models, when to use them, and problems they may cause. For more information, see the ArcGIS online documentation. Name: Synchronize. Type: Tool. When to Use: When there is no metadata file yet or the dataset has changed. Description: Uses the dataset to create or update technical metadata. This script runs automatically when you open the dataset's Description tab in ArcCatalog (this is set in the ArcCatalog Options, Metadata tab). It will create or update many fields, including extent, coordinate system, geometry, format, size, and attribute field names. Since this script can run automatically when opening the item in ArcCatalog, it is not often necessary to call the script manually. However, if you have turned off the automatic option, it can be called to update the item. It is helpful for batch technical metadata creation or updating with an ArcPy script that iterates ...

Adding Place keywords from GeoNames to map records

The GeoBlacklight plugin for Omeka includes a custom feature in the Spatial Coverage field. A user can type in a place term, which queries GeoNames and produces a dropdown list of options. The user selects a value from the list, and this pulls in the GeoNames URI. The user can select multiple place names using multiple inputs. A second feature of the plugin is that it can pull the bounding box coordinates from the official GeoNames record. This is triggered during export to the GeoBlacklight schema JSON file if the Bounding Box element for the item is empty. Although users can add multiple place names, only the first input will be queried for this function. When you begin typing in this input, it will query the GeoNames API and pull suggestions for defined place names. NOTE: This feature is constrained by the fact that our site is served over https, but the GeoNames API query uses http. This means that your brows...

Generate Technical Metadata for GIS Data

Technical metadata can be programmatically extracted from the dataset. Step 1: Generate or Update the Technical Metadata. How to generate an ArcGIS 1.0 XML file for one dataset at a time: Open ArcCatalog and use the Catalog Tree to navigate to the location of the dataset. Click on the dataset in the catalog tree. Open the Description tab in the main window. This action will update or create a new metadata file in the ArcGIS Metadata 1.0 Format. How to generate an ArcGIS 1.0 XML file for a batch of files: Download the datasets and unzip all of the files into a flat folder. With 7-Zip, use the option without the “*/”, which will keep the files in separate folders. Folders can be flattened with the following command line script (Mac tested): find [DIRECTORY] -mindepth 2 -type f -exec mv -i '{}' [DIRECTORY] ';' Download a zip of the project’s Custom ArcGIS Toolbox of Batch Metadata Scripts. Unzip t...
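For systems where the `find` command above is not available, the same flattening step can be sketched in Python. This is a rough equivalent under stated assumptions, not a drop-in replacement: the function name `flatten` is hypothetical, and where `mv -i` would prompt before overwriting, this sketch skips the file and reports it instead.

```python
import shutil
from pathlib import Path


def flatten(directory):
    """Move every file nested in a subfolder of `directory` up into `directory`.

    Returns (moved, skipped): destination paths of moved files, and source
    paths left in place because a file with that name already exists.
    """
    root = Path(directory)
    # Collect the nested files first so moving them does not disturb the walk.
    nested = sorted(
        p for p in root.rglob("*")
        if p.is_file() and not p.is_symlink() and p.parent != root
    )
    moved, skipped = [], []
    for src in nested:
        dest = root / src.name
        if dest.exists():
            skipped.append(src)  # never overwrite an existing file
            continue
        shutil.move(str(src), str(dest))
        moved.append(dest)
    return moved, skipped


if __name__ == "__main__":
    import sys

    # Usage: python flatten.py [DIRECTORY]
    if len(sys.argv) == 2:
        moved, skipped = flatten(sys.argv[1])
        print(f"moved {len(moved)} file(s), skipped {len(skipped)}")
```

As with the shell version, the emptied subfolders are left behind. Skipped files are worth checking by hand, since two datasets with identically named files would otherwise overwrite each other in a flat folder.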