
Batch transformations: scripts & techniques

Summary:  This post shares several scripts and techniques for working with multiple files at a time.

Intro
The metadata workflow for our project was designed to support collaborative metadata creation, transformation, and editing.  In its current form, the basic steps are:
     1. Submit records (individuals from various institutions)
     2. Metadata transition (performed by myself, the Metadata Coordinator)
     3. Edit records (online editor accessible by all)

This post is about the second step, the metadata transition.  The goal of this step is to transform all of the metadata into ISO and add template information.  This is all about working with the files in batches.  Any individual edits, even if done with spreadsheets in groups, are part of the third step.

Here are some of the scripts I have copied or written for this work, as well as a few useful tools. I will add to this post from time to time as I develop more techniques.

Flattening a folder

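If you have downloaded and unzipped many shapefiles, each one will be in its own folder.  If you want to run scripts on them, one easy way is to flatten the folder structure: take all of the files out of their individual folders and put them in a single directory. A quick way to do this is to run the following command in the command line, replacing [DIRECTORY] with the path to the parent folder.  The -i option makes mv ask before overwriting a file that has the same name: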

find [DIRECTORY] -mindepth 2 -type f -exec mv -i '{}' [DIRECTORY] ';'
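For anyone working where find is not available (for example, a default Windows command prompt), the same idea can be sketched in Python. This is only an illustration under my own assumptions: the function name flatten_folder is made up for this post, and instead of prompting as mv -i does, it skips any file whose name already exists at the top level.

```python
import os
import shutil


def flatten_folder(root):
    """Move every file found in a subfolder of `root` up into `root` itself.

    Hypothetical helper, not part of any library. Files whose names already
    exist at the top level are skipped rather than overwritten.
    """
    moved = []
    # Snapshot the walk first so that moving files does not disturb it.
    for folder, _subdirs, files in list(os.walk(root)):
        if os.path.abspath(folder) == os.path.abspath(root):
            continue  # files already at the top level stay where they are
        for name in files:
            source = os.path.join(folder, name)
            target = os.path.join(root, name)
            if os.path.exists(target):
                print("skipped (name already exists): " + source)
                continue
            shutil.move(source, target)
            moved.append(name)
    return moved
```

Calling flatten_folder("path/to/unzipped") returns the list of file names that were moved. The emptied subfolders are left in place, just as they are with the find command.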

Adobe Bridge Batch Renaming

We often need to rename the metadata files.  This is especially useful when using a batch script to merge metadata files based upon matching names.  Regex can be used to do more complicated renames.  To remove an underscore and all of the letters, digits, and underscores that follow it, use the following pattern as the text to find and leave the replacement empty:

\_\w*
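To preview what this pattern will match before running a batch rename, it can be tried out in Python. This is a rough sketch: the helper name strip_underscore_suffix and the sample file names are invented for illustration, and Bridge's regex engine may not behave identically to Python's re module in every case. In Python the underscore needs no backslash, so the pattern is written as _\w*.

```python
import re

# An underscore followed by any run of letters, digits, or underscores.
UNDERSCORE_SUFFIX = re.compile(r"_\w*")


def strip_underscore_suffix(filename):
    """Remove each underscore and the word characters that follow it."""
    return UNDERSCORE_SUFFIX.sub("", filename)


for sample in ["parcels_2015_final.xml", "roads.xml"]:
    print(sample + " -> " + strip_underscore_suffix(sample))
```

The match stops at the first character that is not a letter, digit, or underscore, so the period and the file extension are kept: parcels_2015_final.xml becomes parcels.xml, and roads.xml is unchanged.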

ArcGIS Metadata Conversion Tools

If you read the ArcGIS help pages, they will suggest running ArcGIS tools in batch mode.  However, this usually involves so much clicking and file browsing that it doesn’t save any time.  It is much better to open the Python window and run scripts from there.  

All of the ArcGIS scripts should start with this code:

import arcpy
from arcpy import env
env.workspace = "the path to your folder of files"

Scripts for converting metadata with the resource present:

Upgrade:
# Upgrade the FGDC metadata of every feature class in the workspace.
fcList = arcpy.ListFeatureClasses()
for fc in fcList:
    arcpy.UpgradeMetadata_conversion(fc, "FGDC_TO_ARCGIS")

Synchronize:
# Synchronize every feature class in the workspace using the NOT_CREATED option.
fcList = arcpy.ListFeatureClasses()
for fc in fcList:
    arcpy.SynchronizeMetadata_conversion(fc, "NOT_CREATED")

Scripts for converting metadata without the resource (a group of XML files only):

Translator:
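# Locate the FGDC-to-ArcGIS translator that ships with ArcGIS Desktop.
install_dir = arcpy.GetInstallInfo("desktop")["InstallDir"]
translator = install_dir + "Metadata/Translator/FGDC2ESRI_ISO.xml"
# Translate each XML file in the workspace, writing name_arc.xml beside it.
for xml_file in arcpy.ListFiles("*.xml"):
    outfile = xml_file.replace(".xml", "_arc.xml")
    arcpy.ESRITranslator_conversion(xml_file, translator, outfile)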


Update, April 2016: The Translator script above will sometimes cause ArcCatalog to crash.  One cause is having any files in the folder whose names end with shp.xml.  The batch script will crash on these files even if they are in FGDC format.  They can still be translated one at a time, or in a batch after they have been renamed.
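If there are many such files, the renaming itself can be scripted. The sketch below is one possible approach, not something I have tested against every case: the function name rename_shp_xml and the choice of _shp.xml as the replacement ending are my own assumptions, and any name that no longer ends with shp.xml could be used instead.

```python
import os


def rename_shp_xml(folder):
    """Rename 'name.shp.xml' to 'name_shp.xml' for each such file in `folder`.

    Hypothetical helper; the '_shp.xml' ending is an arbitrary choice.
    """
    suffix = ".shp.xml"
    renamed = []
    for name in sorted(os.listdir(folder)):
        if not name.lower().endswith(suffix):
            continue
        new_name = name[:-len(suffix)] + "_shp.xml"
        target = os.path.join(folder, new_name)
        if os.path.exists(target):
            continue  # never overwrite an existing file
        os.rename(os.path.join(folder, name), target)
        renamed.append(new_name)
    return renamed
```

Run it on the folder of XML files before starting the Translator script. It returns the new names, so you can check what was changed.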