Posts

Showing posts from March, 2018

Scraping Portal Discovery Metadata and Merging it with Standards Documentation Metadata

Summary

This post describes a technique for scraping Portal Discovery Metadata from a custom site and merging it with Standards Documentation Metadata in accompanying XML files. The example portal used is PASDA, but the technique could be adapted for other repositories.

Background

The BTAA GDP aggregates metadata to provide a catalog of geospatial resources from public data providers. There are generally two types of sources for the metadata:

Portal Discovery Metadata: This is found within the data provider's portal application and may include minimal elements, such as title, date, description, and links. Several structured data portal applications, such as ArcGIS Hub and Socrata, provide this through their API as DCAT. Other portals, such as CKAN, have APIs that expose the Portal Discovery Metadata in a custom schema.

Standards Documentation Metadata: This is a file that accompanies the dataset and includes much more detail, such as spatial reference systems and