This post explores in greater detail how the ArcGIS Upgrade metadata process works. The tool can run on either a standalone metadata file or a dataset in conjunction with a metadata file. Although the tool will work in both cases, the results will be different. I specifically wanted to examine how it processes a standalone FGDC XML file to identify possible data loss and formatting issues. My investigations showed that the tool was designed for the second case (dataset + metadata) and has flaws when used for a standalone file.
Esri definition: The Upgrade Metadata tool copies information in existing FGDC or ESRI-ISO metadata elements that are not included in the ArcGIS metadata format to the equivalent ArcGIS metadata elements. Upgrading doesn't alter the item's ArcGIS-internal content: the geoprocessing history, thumbnail, enclosures, and so on. Upgrading doesn't remove any existing FGDC- and ESRI-ISO-format elements. Properties of the item that were recorded in its metadata by ArcGIS Desktop 9.3.1 aren't upgraded. The current version of ArcGIS automatically updates the item's metadata to include its current properties at the end of the upgrade process.
Upgrading a standalone metadata file from FGDC to the ArcGIS 1.0 metadata format takes about 25 seconds per file. In contrast, other XSLT tools can do this in just a few seconds. The upgrade is slow because the tool is actually a model that runs a series of scripts and XSLT transformations. To see or edit the model yourself in ArcGIS, find the Upgrade Metadata tool in ArcToolbox under Conversion Tools > Metadata. Right-click the tool name and select Edit. This will take you to the following model view.
So what exactly is going on in each of these steps? I performed the upgrade process on a standalone FGDC XML file and analyzed the log. I then ran each step of the log separately and compared the before and after files. Here are descriptions of each of the steps based upon my observations:
1. Remove Synchronized Elements
This XSLT removes the attribute Sync="TRUE" from tags. These attributes were generated by older versions of ArcGIS. On its own, this XSLT can be helpful, because the attribute can cause validation errors if the file is used outside of the ArcGIS environment. For most elements, it removes only the Sync="TRUE" attribute. For some elements, however, it also removes the value, so for a standalone file this XSLT can cause data loss.
The comments in the XSLT file include the following description: "Processes ArcGIS metadata to remove all elements containing information added automatically by ArcGIS." If you are running this process on a standalone metadata file, this is not necessarily desirable. The full model appears to have been designed to remove certain technical pieces of metadata at the beginning and then generate them again at the end. However, if the dataset is not present, the software has nothing to extract that information from.
In the file for this example, the geographic extents disappeared after this step. On a second try, I used a text editor to find and erase all instances of Sync="TRUE" before running the upgrade script. That worked and the extents were retained.
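The text-editor workaround above can also be scripted. The following is a minimal sketch using only Python's standard library, not part of the Esri model: it deletes every Sync attribute while leaving element values in place. The function name `strip_sync_attributes` and the sample FGDC fragment are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET


def strip_sync_attributes(root):
    """Delete every Sync attribute in place and return how many were removed.

    Element text is left untouched, so values such as bounding
    coordinates survive.
    """
    removed = 0
    for element in root.iter():
        if "Sync" in element.attrib:
            del element.attrib["Sync"]
            removed += 1
    return removed


# Hypothetical FGDC fragment with synchronized bounding coordinates.
sample = """<metadata>
  <idinfo>
    <spdom>
      <bounding>
        <westbc Sync="TRUE">-97.5</westbc>
        <eastbc Sync="TRUE">-89.4</eastbc>
      </bounding>
    </spdom>
  </idinfo>
</metadata>"""

root = ET.fromstring(sample)
count = strip_sync_attributes(root)
cleaned = ET.tostring(root, encoding="unicode")
print(count)
print(cleaned)
```

For a real file, `ET.parse(path)` followed by `tree.write(path, encoding="utf-8", xml_declaration=True)` would play the same role. Note that ElementTree drops XML comments by default when parsing, so keep a copy of the original file.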
2. Remove FGDC Required Hints
This XSLT appears to remove the word "Required" from FGDC XML created in earlier versions of ArcGIS. My test file did not contain this text, so this XSLT made no changes to it.
3. Remove Empty Elements
This step cleaned up the file by removing tag pairs without values. Several of these tag pairs had values in the original file that were stripped out by the Remove Synchronized Elements XSLT. Some of the tags removed from the test file were: <serinfo/>, <spdom><bounding/><lboundng/></spdom>, <qvertpa/>.
4. Transform from FGDC to ESRI_ISO
This is the most essential step, because it changes the format from FGDC to ArcGIS. The title of the step is something of a misnomer: it does not change the file into the older ESRI_ISO format, but rather into the new ArcGIS 1.0 format. This process takes the FGDC read-only metadata and transfers as much of it as possible to the editable elements in the ArcGIS metadata format.
This step is more complex and difficult to reproduce outside of ArcGIS. It requires the Translation tool, which calls upon an XSL as well as a series of rules files in txt and xml format.
Note: The file lost the metadata element of transfer file size during this step. This is puzzling, because the field was included in the XSL document used for the translation.
For a standalone XML metadata file, the transformation is now complete. These steps took 14 seconds to run. The following steps still run in the upgrade script but will not actually affect the file, and they added another 10 seconds. For a large number of files, this needlessly increases the processing time.
5. Merge new FGDC with existing
This tool doesn't appear to merge anything if using a standalone file; it just generates a new ArcGIS format file with creation date, time, and ArcGIS Format fields.
6. Run the Import metadata tool
It appears that this imports the ArcGIS metadata that is being stored in the upgraded FGDC file into the new ArcGIS XML file from the last step.
7. Run the Synchronize metadata tool
This synchronizes the dataset. (See this post.) However, if there is no dataset to synchronize, nothing will be inserted. Unfortunately, the first step might have erased this metadata, and now it can't be replaced.
8. Upgrade ESRI_ISO to ArcGIS94
This final step addresses various details that might be out of date or need new references. However, I am unclear about the specifics, since it changed only one value in the test metadata file. I suspect that it is most useful with older metadata files.
Prior to this step: <EX_TemporalExtent><extent><gml:TimePeriod gml:id="d1e411">
was changed to: <EX_TemporalExtent><extent><gml:TimePeriod gml:id="d1e405">
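The before-and-after comparison used throughout this post can be approximated with a short script. The following is a rough sketch under stated assumptions, not the method actually used here: it lists the element values and attributes that exist in one file but not the other. The helper names `leaf_values` and `compare` and the sample fragments are hypothetical. It is only meaningful for steps that keep the same schema (such as steps 1 to 3), since step 4 renames the elements.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def leaf_values(xml_text):
    """Count (path, kind, value) entries for all text values and attributes."""
    root = ET.fromstring(xml_text)
    values = Counter()

    def walk(element, path):
        current = f"{path}/{element.tag}"
        text = (element.text or "").strip()
        if text:
            values[(current, "text()", text)] += 1
        for name, value in element.attrib.items():
            values[(current, "@" + name, value)] += 1
        for child in element:
            walk(child, current)

    walk(root, "")
    return values


def compare(before_xml, after_xml):
    """Return (lost, gained) entries between two versions of a file."""
    before = leaf_values(before_xml)
    after = leaf_values(after_xml)
    lost = sorted((before - after).elements())
    gained = sorted((after - before).elements())
    return lost, gained


# Hypothetical fragments: a value disappears together with its Sync attribute.
before_xml = (
    '<metadata><idinfo><title>Roads</title>'
    '<westbc Sync="TRUE">-97.5</westbc></idinfo></metadata>'
)
after_xml = "<metadata><idinfo><title>Roads</title><westbc/></idinfo></metadata>"

lost, gained = compare(before_xml, after_xml)
for entry in lost:
    print("lost:", entry)
for entry in gained:
    print("gained:", entry)
```

Running this after each step makes silent losses, such as the missing geographic extents, easier to spot than reading the two files side by side. Namespaced files need their prefixes declared in the document, otherwise the parser will reject them.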