How Well do you Know Your Data? Converting an Archive of Proprietary Markup Schemes to JATS: A Case Study


Speech transcript
Well, OK — good morning, everyone. We're here to talk about a journey: our migration to JATS. We've done that over the last year. We were participants at the last two conferences just as spectators, but now we have some real-life experience that we'd like to share with you. We're going to present it in three specific parts: the content transformations and the trouble spots, which matches up with the title of the presentation, and then the validation and QC piece that had to go on after we had done all these transformations. What you'll find, if you've been working with any kind of converters of your own, is that a lot of what you'll see is probably very familiar; you'll see many commonalities, and much of it might even seem self-evident, depending on what you've come across in your own data. But one of the things pointed out by our peer reviewers — and thank you, whoever you are — is that while everyone understands the trouble spots, a lot of times we underestimate how important they are, and it's really, really important to make sure that you know exactly what you need to do and how you go about doing it. We didn't want to take anything for granted. The fact is that the systems, tools, and markup that you've used over years, sometimes decades, change over the course of time; but the data itself — the thing we're trying to capture with all these tools, our most important asset — doesn't really change, and we want its quality preserved throughout its historical lifetime. So what we found as we were doing this is that it's not really just about the tags: there's a ripple effect, and it also touches the surrounding infrastructure and the support that you need to build to make the tags as useful as possible. I will be speaking about what AIP chose to do, and of course your circumstances may vary. The key thing is that, as we all know, the world
doesn't stop for a project like this: you all have obligations and responsibilities that you need to keep up with as you go about doing it. So, just a little bit about AIP. You can
see the metrics and some factoids about AIP here: we've been around for quite a while, and we support a wide portion of the physics community. But what I'd like to draw your attention to mostly is the mission statement at the bottom, because the premise that runs throughout it forms the background and backbone of why we wanted to move to JATS, and of how JATS can help us achieve that particular mission. The earlier presentation actually led into this nicely, because the things it brought up will come back throughout. We also want to mention that we found JATS a really great vehicle for achieving our goal, and as you can see, the mission statement is really applicable to anybody out there: just replace the physical and applied sciences with your own consumers, clients, or end users, and replace AIP with your own organization, and you'll see this is what we're all really trying to achieve. So even though we work in the physics space, it really can apply and be helpful to just about anyone. This is where we had
to start: the scope of the challenge ahead of us. The AIP content collection was about 800,000 records, divided up over three particular markup schemes. They all arose, at one point, out of the ISO 12083 standard, but over the years they morphed into a very AIP-specific type of format. We had everything primarily in a header-plus-references XML/SGML format; we had full-text SGML, which was used for our online platform for quite a while; and we had an XML that was derived from the full-text SGML, which evolved mostly over the course of this particular time. How did we use it? Like most people: to create the PDFs and print PDFs, and as a source for HTML rendering on the online platform. And quite frankly, it really did work well. But, as was noted, with the evolution of the XML world into JATS, with more and more people using it and new uses coming along all the time, AIP — which had a number of new products in train at the time — needed to do something; and if we were to have any kind of effective archive of our own XML, it would really involve something
special. So what was the real problem — why did we want to change? The format had morphed into something very AIP-centric: it became overly specialized to our products, and it was becoming a product-based model as opposed to a content-based model. With that comes the support, infrastructure, and expense of maintaining something proprietary, and the many data transformations, and so on and so forth; it was becoming quite cumbersome to work with. And as new standards come along and the community starts embracing them, it becomes more difficult and costly to enhance our products and incorporate those standards into them. In the end, what we had to do was recognize that the format was at the end of its life cycle: it had matured to a point where we really couldn't take it any further. So what we needed to
do was focus on what we wanted to do next. The key was two areas: standardization, and joining the community. As was mentioned earlier, the community is a very important thing. It used to be that, say, over a lunch break, we could only discuss AIP XML among ourselves; now we have a whole wider community that we can take advantage of. We wanted to better position ourselves to adopt standards, and obviously the best practices that go along with using them, which would make data interchange and distribution of the data — our primary function — that much easier; and mostly, it was just a great way to showcase the content. At first we thought it would simply be a matter of re-tagging the content we had, but it quickly became evident that it was not only a matter of changing tags: to do it right, we needed to take our systems themselves and transform them too, and that would mean setting ourselves up for success. Converting to JATS is not the answer in and of itself; it's really about making sure you make the best use of it. And that affected workflow, the content management aspects, staff roles, and, further down the line, the business rules and everything else that comes with something like this. So the first area I want to talk about is the standards. What
happened? Everybody was doing it, and we succumbed to peer pressure. Yes, we did — but it's not just the XML; what also mattered were the tools that support it. Once you have a widely used XML tag set such as JATS, you have XSLT, schema languages, Schematron, and all the other standards that can help, and that eliminates the proprietary types of work that might be too costly to maintain yourself. It also widens the base of your own staff who can operate on, manipulate, and work with the content, so you can make the process as efficient as you can. And, as was said earlier, it enables us to evolve with the community, and to participate, join in, contribute, and hopefully innovate along with the community — obviously our presence here is testament to that. JATS fit nicely in with our traditional usage, so it wasn't so bad, and we had the benefit of dealing with standards to begin with, so we already knew there would be a benefit in working that way. This is what you see:
pages and pages and pages of math. You would think that extremely long mathematical equations, and tables that take up 400 pages in a journal, would be the most devastating, difficult part of your transform. But since we were using MathML already, and JATS uses MathML as well, these were actually the easiest part of the transformation, believe it or not. Why all this math? We don't know; it's just the way it is. All we needed to do in the transform was change the namespace prefix to mml:. If we had had to go in and change all of this markup by hand, it would have been a very different story.
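Since both the old markup and JATS carry math as MathML, the math "transform" really was little more than a prefix change. Here is a minimal sketch of that idea using Python's standard library — the source prefix `m` and the tiny fragment are invented for illustration, not taken from AIP's actual data:

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: MathML bound to a non-standard prefix in the source.
src = ('<article xmlns:m="http://www.w3.org/1998/Math/MathML">'
       '<m:math><m:mi>x</m:mi></m:math></article>')

# ElementTree stores names as {namespace-uri}local, so re-serializing with a
# registered "mml" prefix is all this "transform" has to do.
ET.register_namespace('mml', 'http://www.w3.org/1998/Math/MathML')
root = ET.fromstring(src)
out = ET.tostring(root, encoding='unicode')
print(out)
```

Serializing with a registered `mml` prefix is all it takes; the element names and the math content pass through untouched.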
We were using JATS, XSLT, and Schematron — obviously no surprise — and we chose the green Archiving and Interchange tag set, because it was very important for us to have an open model and to be able to distribute our content; that is a large part of what we profess to do. And since you all know these tools well too, we get a lot of benefit from that. The other aspect was that our old systems, the old management structures, and everything else really weren't sufficient to make maximum use of this, so I'll speak a little bit about the foundational work we needed
to do to get this started. The first thing was, obviously, to communicate. It's really easy to sit in a room and say, "Let's adopt a standard — that's a great idea," put your foot down, and make the decision; but that's only half the battle. What we needed to do was communicate it out to everybody. Of course there was a small core within the content technology group working on this particular project, but everyone was aware of it; they understood its importance, and they understood that converting to JATS and maximizing the potential of the data was really a cornerstone of where we wanted to go as an organization. So it was not done in a vacuum. The next thing
we needed for success was ownership. The way AIP was initially set up, we had a couple of different content groups that would work with the material: we had online groups, we had production groups, we had support groups for the platform, and they reported to different managerial structures. Although everyone had AIP's best interests at heart, they had their own projects and occasionally their own agendas, and occasionally there would be completely unintentional conflicts; or sometimes a decision would be made that wasn't officially communicated out, and then we'd get surprises in the content — you'll hear about some later — that we wouldn't necessarily have anticipated. So the first thing we did was make sure everyone understood that we have a unified message and a certain way we want to approach and handle the data. We consolidated the content groups into a single group following that same principle, and officially designated it as the owner of the content and the gatekeeper, to make sure that any change to the markup, or anything that's going to affect the product, is handled properly. You know the Monty Python Black Knight — "None shall pass"? We don't really intend to end up in his state, but we do intend to embody (or disembody) his tenacity and make sure that everything is done properly. And believe it or not, there was surprisingly little resistance to this reorganization; everybody realized it was the best thing to do. In a lot of ways it actually freed quite a few departments, because they no longer had to worry about making unilateral decisions: if you work in the online area and support our platform, you concentrate on online; if you're a project manager, you manage projects; if you work in production, you focus on production. Focus on your core skill sets, do the best you can at your own job, and leave the content technical details to the content technology group to pass muster on. Everyone still has a say in how we change things and what we need to do, but it has to go through a more formal process, and the result is that we're able to maintain the content better. That leads to the next piece, which was the infrastructure. We did invest
in a new content management system; the JATS content is the first piece implemented in it. What it will allow us to do is effectively manage the content, avoid unneeded work and duplication, and avoid those workarounds — done to get content to look right or fit right — that don't really have a solid XML-based reason behind them. We'll have more extensibility; but most importantly, we're going to have great versioning capabilities, and that's going to be very important to everything, so we can maintain and document every change. Because, in a way, we're almost back on square one: we had all our old content, built over the years, and now we're going back to JATS 1.0. This is where we start to be able to manage the content with a really firm grasp, so that it goes where it needs to go and we know exactly who, what, when, and where whenever we need to manipulate something. When you have authors submitting corrections and publisher's notes and things like that, that's very important information to track. You can have the greatest data in the world, and no matter how well you tag it, if you don't manage it properly it's not going to be effective for you.

So what was the next step? We knew we had JATS — check — and we decided that we're going to use XSLT instead of the custom transformation programs we had been using. We did need to decide what we were going to convert. I mentioned 800,000 records; we really have more than that, but we decided to translate just the header records with the references, plus our full-text XML, since it goes back to about 2005. We decided to put the full-text SGML on hold for now — we made a business decision to do that. The next step was to test the XSLT transformation and feed the findings back into the specifications based on the results. We needed to introduce a good quality-control system — this is where we started to implement the Schematron process, so again we're taking up another standardization tool. And we needed to document what we were doing, so that future generations of AIP-ers will not curse the day we were born, but will instead be thankful that we took care of everything and they know exactly why something was done. Then, obviously, we had to train the staff and our partners in the AIP usage, because the key factor is this: yes, we're using the green Archiving DTD, but that doesn't mean we don't have our own way we would like to use it — for instance, how we use named-content, what custom metadata and attributes we allow, all those types of things — and those details get incorporated into the Schematron, so we can manage the data and have it stay as predictable as possible for us. With that foundation set, the next step was: time to go wrestle with all the angle brackets and find out what surprises are lurking in the forest. And with that, I'll hand it over.

Yes — when it comes to wrestling with angle brackets, that has always been my favorite part.
So — our process. Once we agreed on what the challenges ahead of us were, we had to get specific about how we were going to go about converting: what would need to be done, what it would be helpful to do, and what was going to have to wait for another time.
The first and most critical piece in getting started was our document analysis. We needed to know what was out there, what we were dealing with, and how we were going to handle it. Keeping our datasets in mind, we needed to determine the tagging rules we were going to follow; and once those decisions were made — at least enough to get started — we needed a means of keeping track of them. So the creation and maintenance of a document map, which we refer to internally as our spec, our specification, became the first piece of work and the first challenge. This is also where the time goes. And this is also a really good time to gather good sample XML files, and to take copies of the very specific bits of data structure that may cause problems later and will need to be checked.
This is not a big surprise slide: these are the tagging principles that we decided upon. One of the first things we did was define how AIP was going to use JATS, and that was our first-step approach to all of these things — the main facets of the document analysis and the creation of the spec. So, what were we converting? The header records and the full-text XML,
both converted to JATS under the tagging principles agreed upon earlier in the analysis. We made very strict distinctions about element and attribute usage. We decided to hold off on customized content models — that is, not to use them right away; we could use JATS just out of the box, as it were, in the short term. But we did get a little bit tricky with the x markup: we reserved it, and in
this instance we used it to wrap problem areas in our data that we knew we would have to go back to later. For instance, wherever we had an unnumbered list in our source that erroneously failed to indicate what type of label was to be used — a bullet or a number, say — we output a label anyway, but we wrapped that label in x tags, so that later on we could do a data search, find those instances, and deal with them on a case-by-case basis. Eventually we can remove the x tags; for now, we were willing to live with them.
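The mechanics of that x-tag trick can be sketched like this — a hypothetical list item and a stand-in bullet character; whether `<x>` is actually permitted at a given spot is of course governed by the tag set:

```python
import xml.etree.ElementTree as ET

# Hypothetical list item whose source never said what kind of label to use.
item = ET.fromstring('<list-item><p>First point</p></list-item>')

# Output the label we guessed at, but wrap the guess in <x> so it is findable.
label = ET.Element('label')
guess = ET.SubElement(label, 'x')
guess.text = '*'          # stand-in for the assumed bullet character
item.insert(0, label)

# Later, one search turns up every guess for case-by-case review.
flagged = item.findall('.//label/x')
print(len(flagged))
```

One query over the converted archive then lists every place a human assumption was baked into the output.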
In creating the document map, we came upon this very rough formula: our tagging principles, multiplied by our existing documentation — all the piles and piles of paper sitting in the backs of people's desks, in basements, holding up chairs — and then the institutional memory. We were fortunate to have a lot of people who were there at the very inception of our markup and our online publications, so they were able to help us with why certain decisions were made, so that we could carry that decision-making forward and not lose anything, or what it meant, along the way. This is a
sample of one of the pages of the resulting documentation. The first column is what the element is, followed by an example of our proprietary tagging, then the target JATS. The fourth column was the most critical for us: the instructions to the programmer — the person writing our XSLT — explaining exactly what needs to be done. Frequently there were notes on why a decision was made, and then, as you can see, any updates that came later in the process. The idea is that we have a document we can always go back to, to see why we made a choice, what changed, and how it was handled.

One of the benefits of the conversion process was the ability to polish up the archive. Step away from reviewing your completely tagged articles and instead just read the DTD from top to bottom; it helps identify the ambiguities in the existing tag set. Taken outside of an article — where you get a sense of what tag names mean just from reading their content — a tag sitting on its own can turn out to make no sense at all by itself. In our case, tags along the lines of "xa1" through "xa3" are pretty darn meaningless if the tag name is all you have. We were able to take situations like that and use the more meaningful, intuitive tag set from JATS, which allows anyone to open up an XML file and say, "OK, I see what this is — that all makes sense." The conversion to JATS was an ideal opportunity to adjust these cases. The only trick came in deciding whether the adjustments should be made using XSLT, or whether it was wiser to pre-process in advance of the conversion. In this instance, doing it with XSLT was fairly straightforward for us.
Going in, we had some expected trouble spots — things we knew were going to be a problem. Of course, the very first one, which so many people face, is generated text. It's the most pervasive of the known trouble spots. We handled it both by accounting for instances of generated text in the specification — for instance, a spec rule that says "take this part and output this text" — and by leveraging some of our existing conversions, existing programs that in the past would populate data that had been generated text. We would pre-process to a first XML file, then transform, so that the final output file had all the data within it. Style
variations were another issue for us. Each of these scenarios is a possible way of presenting the title "Introduction," depending on which journal you're in and what the styles used to be. Previously we could handle this with CSS or with on-page layouts; but now everything has to be in the file. So we had to locate and document every instance of every possible configuration of our titles.

Multi-purpose tags, or tag reuse — we also call these our heavy-lifting tags — were a number of tags that, while the tag stayed the same, were handled very differently. One of the most problematic is shown here: it's a catch-all in our file which, for some reason, we called "other-info." Its position within the reference determined its handling: it could indicate straight text to be carried forward; it could indicate punctuation that specifically needed to be removed before carrying the rest forward; or it could have a mathematical formula within it and need a whole separate treatment. So again, each of these instances had to be located, documented, and accounted for in the specification. Of course,
no list of problems is complete without the various forms of multimedia. This example was mentioned earlier, and it really reflects what we saw at the very beginning: for a time we had multimedia as an external file — we had a link to it within the main XML, and it linked over to our supplementary database. That was our first form. Then they started coming in such that we had to find a way to tag the multimedia in our data, because people wanted the file right in the article file, so we came up with an embedding structure. Then we saw a lot more of them, so we had to move to a real solution — one that accounted for the different formats we were receiving. So now we have, not one — one, two, three, four, five — six separate possible tagging structures for multimedia that we had to locate, account for, write transforms for, and beg the programmer to include. And this only works if all the tagging inside the file is correct; since the tagging was done by hand, we weren't all that confident it would be. And finally,
probably the biggest hurdle, as Rich mentioned before: we are the whole technology chain — we look larger in real life — and we had to do all of this work we're talking about on top of everything else you do in a regular workday: e-mails, meetings, more e-mails, support calls. That in itself was a balancing act. The biggest reason we were able to do it is that we had the support of the rest of the staff and management; everybody understood the importance of this project to the company as a whole. OK, we found a middle ground — we couldn't promise "sure, we can do that, it'll be ready tomorrow" — but for the most part that support was a big bonus. Now, the unexpected trouble spots. Language: we
had this group of people who speak in angle brackets all day long writing a specification for someone who speaks in Perl. We thought maybe English would be a good answer. That turned out to not really be the answer — it wasn't a common language either. What we decided to do was use XPath, which in retrospect is the obvious solution. In the example you can see "aff" — a tag that we all know: we know where it goes, we know what it does, we know what its limitations are, so it's not going to trip anybody up. But there were other questions: where can I find it? Is it always going to be in the front matter? Could it be in a reference? These are things we had to consider, because that knowledge was all in our heads. Using XPath in our specification let us show our programmers — and anyone else down the line who doesn't have the strange knowledge pattern we've built up — exactly what we meant, so that anybody else can follow it.
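As a sketch of why an XPath says more than a tag name: in an invented skeleton where `aff` occurs in two places, the spec row can point at exactly the occurrence a rule covers, while a bare "aff" matches them all (the document shape below is illustrative, not AIP's actual model):

```python
import xml.etree.ElementTree as ET

# Hypothetical skeleton: <aff> can occur in more than one place, so the
# spec row names the exact occurrence with an XPath instead of prose.
doc = ET.fromstring(
    '<article>'
    '<front><contrib-group><aff>Univ. A</aff></contrib-group></front>'
    '<back><ref-list><ref><aff>Univ. B</aff></ref></ref-list></back>'
    '</article>')

in_contribs = doc.findall('./front/contrib-group/aff')  # the rule's target
everywhere = doc.findall('.//aff')                      # what bare "aff" matches
print(len(in_contribs), len(everywhere))
```

The programmer no longer has to guess which occurrence the analyst meant.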
Nasty surprises: you only think you know your data — you really don't. The moment a human being puts fingers on data, you've lost a measure of positive control. These nasty surprises are out there no matter how careful you are, and we were all pretty darn careful throughout the years; we have a rigorous quality-checking process. Still, there were unforeseen gaps, and a lot of them were identified on our first conversion run. The example we're showing here is bold text: we — perhaps foolishly — assumed it was there, because it's in our rules; it's there in the instructions we give to our vendors that this gets included in the first paragraph. We were confident it was there — until we saw the online displays, and all the paragraphs which needed to appear bold online weren't. We had no idea. So this turned into a whole round of going back to identify the problem areas, and rules were written for the transform again; and the documentation of what was done now guards against it happening
again. That leads, loosely, into quality control and testing. As discussed, we had about 800,000 files, produced and marked up over 20-plus years, and during that time business rules change, publishing styles change, and technology changes. We separated this phase — quality control and testing — into four pieces: the prerequisite training, content tagging checks, incorporating Schematron, and the online displays. The
first step: the prerequisite training. AIP chose to expand the knowledge of the entire team. Jennifer and I were already DTD experts, but sending us out to learn the NLM/JATS DTD was very valuable: we learned the industry tagging practices at the source, and we came to understand the customization possibilities that JATS allows. We also had a two-day class in XSLT and Schematron; with a two-day class it's not as if we came away speaking fluent XSLT, but we learned to write much clearer and more concise instructions for the programmer, and learning Schematron actually enabled us to write our own Schematron — we didn't have to rely on the programmers. So it was really a great class. The next step was the content and
tagging checks. The first part of that was the QC performed while the XSLT was in progress — while the analyst was typing lots of XSLT code — and the first job was to confirm that the programmer understood the instructions. Daily meetings were held to discuss any new findings or clarifications to the instructions. It was early on, at this point, that we found one trouble spot in the specification, which Jennifer touched on: we realized our specification was way too simple. We would write something like, "Convert the AIP artwork tag to the JATS graphic tag." To us that made sense; it sounds simple, but it's actually much more complicated when you're writing the mapping, because you have to think about context: the AIP artwork tag converts to a JATS graphic when it's a child of, for example, a display formula, but in other places it's an inline-graphic — when it falls, say, within an inline formula.
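That context-dependence can be sketched as a simple dispatch on the parent element — the parent names and the dispatch rule here are illustrative, not AIP's actual content model:

```python
import xml.etree.ElementTree as ET

# "Convert artwork to graphic" is really context-dependent. The parent
# names below are illustrative, not AIP's actual model.
BLOCK_CONTEXTS = {'disp-formula', 'fig'}

def jats_name_for_artwork(parent_tag):
    # Display contexts get <graphic>; running text gets <inline-graphic>.
    return 'graphic' if parent_tag in BLOCK_CONTEXTS else 'inline-graphic'

doc = ET.fromstring('<body><disp-formula><artwork/></disp-formula>'
                    '<p>see <artwork/> here</p></body>')

for parent in doc.iter():
    for child in parent:
        if child.tag == 'artwork':
            child.tag = jats_name_for_artwork(parent.tag)

out = ET.tostring(doc, encoding='unicode')
print(out)
```

A spec row that spells out each context, rather than a one-line "convert A to B," is what keeps the analyst and the programmer in sync.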
The next step during this phase was the batch processing, performed once the whole XSLT was complete. The first goal was to see that the XSLT was working and that the files were valid — that was basically our goal at this point. Here we found, as we expected to, some hidden problems; and when you get to these hidden problems you have to decide: is it better to fix the source material, to fix the XSLT, or to fix the JATS after the converted files are complete? You have to step back and ask yourself some questions: How many errors are there? Will fixing this impact your schedule? Is it easy to fix? Each problem was handled on a case-by-case basis. I'll give you one example. In our AIP markup, a reference item had two parts, with a comma between the two parts, and that was allowed in our old DTD. When it converted to JATS, you had the two parts, but the comma had to be put inside one of them — it could not be left floating as PCDATA. In this case we decided to make an XSLT change; of course you're dealing with spaces as well, so it wasn't quite that easy, but it was something we worked out in the XSLT. So you have to decide where it's best to fix the errors.
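A rough sketch of that punctuation fix, using invented element names (in ElementTree terms, the floating comma shows up as the first part's `.tail`):

```python
import xml.etree.ElementTree as ET

# Hypothetical two-part reference with a comma floating between the parts;
# the target model wants that punctuation inside an element, not loose.
ref = ET.fromstring(
    '<citation><surname>Smith</surname>, <year>2005</year></citation>')

# Fold each child's trailing punctuation (its .tail) into the child,
# keeping a single space between the parts.
for part in ref:
    if part.tail and part.tail.strip():
        part.text = (part.text or '') + part.tail.strip()
        part.tail = ' '

out = ET.tostring(ref, encoding='unicode')
print(out)
```

The whitespace handling is exactly the fiddly part the speaker mentions: the comma has to move, but the word spacing has to survive.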
The next step was the group testing, performed when all the converted files were valid. It was done by running approximately 200 files from various journals and different article types, and the entire group checked the same files. More hidden problems were found as we went. We looked, for example, for dropped text — we actually proofread the files: if we knew we had a tag in the source, we made sure the tag was also in the target and that all the text was inside it. This is also where we started running the Schematron. For example, if we knew a table footnote had to be in the format "T1" and it came in in a different format, we would get an error. I'll explain the Schematron a little later in the slides; at this stage we would also check the Schematron errors. The next
step we refer to as bulk processing, performed once those 200 files were approved in the group testing. At this point we ran all 800,000 files through the XSLT, and it was great: we had a 99 percent accuracy rate. You might think that's a great number; however, when you're dealing with 800,000 files, that still leaves you with about 8,000 files with errors. So again we had to decide where to fix the errors: do we fix the XSLT, or do we wait and fix it in the JATS when it's done? I'll give you an example of what we found: with our conference proceedings we have multiple editors, and they were grouped together in a way that didn't follow our intended specifications. In this case we decided to fix the source. We found a few other errors like that, and then we re-ran the XSLT and all the files came out valid. So
here, the final step, is what we refer to as "analyze flagged data," and this was actually done on the converted JATS files — Jennifer touched on this. What we call analyzing flagged data was basically finding the known problems that we had identified at the beginning. I'll give you an example. Here we have a 10 with a superscript minus 8 — that's how our PDF looked. In our older data we had a tag called "other," and this "other" tag represented unknown characters from our first generation of transforms, done in the early nineties. If we didn't know what a character was at that time, we tagged it as "other" and put in — in this case — an at sign (@), which really had no meaning; it was just an unknown character. Because we didn't know what was behind this tag, and we still had to convert the data, we chose to carry it through wrapped in an x tag. That way it would go all the way through the conversion and still be a valid file, and at the end the programmer would supply a list of all the x-tag cases; then we could analyze them and modify the JATS files. Again, a business decision was made to do this at the end, but it really made the most sense to us. There were a few leftovers, and those we fixed manually. The
The next step in the quality control and testing was incorporating the Schematron, which I have to say was my favorite piece of the process. It is the centerpiece of the QC process, and it was derived from a pre-existing proprietary QC program. The Schematron is a list of checks — assertions written in XPath — and most of them are specific to our data. As mentioned by Rich, we decided early on not to make any modifications — not to customize the JATS DTD — and to rely on the Schematron as the way to have more control over AIP styles. For example, we have one journal, our Chaos journal, which has structured abstracts; the rest don't. So here was a case where, in the Schematron, we could make sure that every Chaos article will always have a structured abstract. That is the type of thing we checked for, and the Schematron was run throughout the whole QC process from the start, because every time we found something specific to a journal we would write up a Schematron rule. If a style rule only holds, say, 80 percent of the time, we would put it in as a warning. This way we captured every single rule, as far as we could, in the Schematron. It is extremely helpful: you don't have to say "I'm reviewing this journal, let me make sure that A, B, C, and D are correct" — if you put it all in the Schematron, it makes sure everything is correct. Right now we have about 250 rules in place, and we are continuing to write more.
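As a sketch of what such rules might look like — the journal-id value and the warning condition are illustrative assumptions, not AIP's actual rules:

```xml
<pattern xmlns="http://purl.oclc.org/dsdl/schematron">
  <!-- House rule: every Chaos article must carry a structured abstract -->
  <rule context="article[front/journal-meta/journal-id = 'chaos']">
    <assert test="front/article-meta/abstract/sec">
      A Chaos article must have a structured abstract.
    </assert>
  </rule>
  <!-- A convention that holds only most of the time is flagged as a warning -->
  <rule context="abstract">
    <report test="count(p) &gt; 1" role="warning">
      Abstract has multiple paragraphs; verify against house style.
    </report>
  </rule>
</pattern>
```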
Here is an example. If you look at the top of the screen, at the content keywords: the orange compound keyword actually has four parts inside, and that is valid in JATS — this is a problem we found earlier. However, our style wants only two parts inside a compound keyword, as at the bottom: one part carrying the code and one carrying the value, for this specific content type. So what did we do? We wrote a Schematron rule to allow only two parts inside a compound keyword. So here the file was valid, but the Schematron check on the first one actually gave us an error. You don't have to
understand the actual Schematron rule; what you see is the error that comes out when you run the Schematron: the first one saying a content keyword must have two parts, and the second one saying that the value of the attribute must be "code" or "value." So again, we were able to raise an error that enforces our style even though the file was valid JATS — without having to modify the JATS DTD.
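A rule of this shape might read as follows — the element names follow the JATS compound-keyword model, and the messages are paraphrased from the slide:

```xml
<pattern xmlns="http://purl.oclc.org/dsdl/schematron">
  <rule context="compound-kwd">
    <!-- House style: exactly one code part and one value part -->
    <assert test="count(compound-kwd-part) = 2">
      A content keyword must have exactly two parts.
    </assert>
    <assert test="not(compound-kwd-part[not(@content-type = 'code' or
                                            @content-type = 'value')])">
      The content-type of a keyword part must be "code" or "value".
    </assert>
  </rule>
</pattern>
```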
The final stage, which again Jennifer touched on, is the online display — the icing on the cake, as we call it. The assumption at this point is that the file is valid, the Schematron has been run, and everything is perfect. The testing here was expanded beyond our small group to the online publishing group, and we selected random testers throughout the organization so they could also check the files. Of course, at this point there were still errors along the way, but fewer and fewer. And, as Jennifer found, when we were actually viewing the files we saw that a paragraph in one of our journals was emboldened — sure enough it was, and it was so much easier to find when viewing the files. So what did we do? We wrote a Schematron check so this will never,
ever happen again, because we found the error. It really is a great way to confirm that all of your business rules are being followed — this online display testing.
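A check of that kind could be sketched as follows — the XPath test is our guess at how one would detect a fully emboldened paragraph, not AIP's actual rule:

```xml
<pattern xmlns="http://purl.oclc.org/dsdl/schematron">
  <rule context="p">
    <!-- Flag paragraphs whose entire text content sits inside <bold> -->
    <report test="bold and normalize-space(bold) = normalize-space(.)">
      Entire paragraph is bold; check the source styling.
    </report>
  </rule>
</pattern>
```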
So, to sum it up, here are some general lessons learned and conclusions. The bottom line is: don't go it alone. I think that was a key point brought out in the first presentation — you have this whole community out
here: brilliant people who have often gone through exactly what you are trying to do. When you follow industry best practices and standards, it enables you to get that much more in the way of resources from the community. Set yourself up for success: make sure that when you do all this work, you build a system and an environment in which you can constantly stay successful with it. It is also impossible to overstate the importance of document analysis — but as we pointed out, no matter how much document analysis you do, there is always something that slips through. With a figure like 800 thousand records, you forget how many characters could possibly be in there after all those years; something is going to look good on the surface but not be right underneath the covers. And use the analysis as an opportunity to correct any known problems you always wanted to fix but never had the opportunity to — because when you undertake a project like this, when else do you ever get
the resources and the attention of the organization to make sure you get it all done? So take advantage of it while you can. [Audience question:] Hi, good to see you again. Could you go back over the distinction you drew between garbage data and incorrect data?
[Answer:] I apologize — I didn't write my notes on that one. The difference is this: garbage data is data that simply makes no sense — for instance, a user put in an empty tag that serves no purpose. Incorrect data is where someone chose a tag that may be valid but is not really what you want in that particular location. As you go through your process of analysis, that distinction makes a big difference in how you go about fixing things. One of the solutions is the mapping document I showed an example of before: the more information you can put into it, the better off you are going forward. It is a living document that hopefully gets passed on to anyone who joins the organization, or who is already in the organization and wants to understand what you've done and why you've done it — keep it as detailed and as current as possible. The Schematron, as they mentioned, was very valuable, mostly because it gave us a second language to work in. I may not know XPath cold, having worked in markup for so long, but it has become a tool I didn't even realize would be so useful until I started working with it — it has changed how I think about our text. The Schematron is the centerpiece of the QC process. We have always had QC checks, but they had been done first by our programmers — proprietary, written in-house. The Schematron has given Faye and me and Rich the opportunity to create our own rules; we do not have to go to the programmer. We know the data best — not to sound vain, but we work closely with it all day, every day, so we know what has to happen, and we know whom to ask to make sure that, say, a particular heading is accurate. It is much more expedient for us to
update and maintain that document ourselves than to have to put it all into some kind of request, have someone else create it, and go through another round of questions. And the first and foremost issue, of course, is to work as a team. Hopefully, by watching us up here, you get the idea that we do, and that spirit was really with us throughout the entire process — the whole organization was behind this project, and that made it so much easier: the support of the staff, the managers, and the whole production group. Just a little closing statement, as in the paper that we submitted: we really did achieve what we wanted to do. Of course there is always room for improvement, and we will continue to improve, and we hope we will be active participants in the community with all of you. And with that, we will open it up to questions from the audience.
[Audience question:] I'm from a publishing organization, and I'm mostly interested in the slide you had about the heading "Introduction." It is handled differently by different journals — each journal has its own particular style — and what really surprised me was that in your old tagging these headings were not tagged uniformly, and yet they were handled correctly by a stylesheet on output; but in the new tagging you instead wanted to encode them differently — that is, to move this information out of the stylesheet and into the JATS markup. Why did you choose that rather than rewriting the CSS to continue to handle it? [Answer:] The decision was made so that we had more control over the data. Our work process can be fragmented: if we put that decision in the CSS, someone can change the CSS going forward, and who is going to make sure that all these journals still come out right? Taking it out of the stylesheet and putting it in our XML — where, you could argue, it is overstepping to have that kind of instruction inside the XML — gives us the control. It lets us say: OK, in this scenario this list has a label number on it, so wherever we send this content, we know exactly how it is going to come out. I don't know too terribly much about CSS, but browsers render things differently — what Firefox does, Explorer doesn't — and we have hit those situations before. This gives us the ability to always make that call ourselves. [Second answer:] To follow up on that: on one slide we showed that we used to generate tags and text on the fly to help our production systems and processes. We had an awful lot of generated text — all those labels were generated — and what we wanted to do was put that into the content, for the broader reason Jennifer just spoke to: when we
distribute the content, it is up to whoever receives it to display it however they need to. They do not have to guess, "do I generate arabic numerals for this particular journal, or roman numerals for that one?" And quite frankly, in this day and age AIP is always reassessing its display requirements anyway; so by removing the generated text from the environment and putting the labels explicitly in the content, we keep the control we were just speaking about over the final product. [Audience question:] Here's another question — I work for PMC. You mentioned that you use RSuite, and I was curious about that; infrastructure was probably important to your success, so what do you think of RSuite? [Answer:] I can't speak in great detail — that hasn't really been my focus — so let's have that conversation offline if you like. We are in the middle of a major RSuite implementation. The core of the facility is to have a real archive with version control — a master archive — so we have a production archive and then the publication archive, both under version control. Also, we are in a world where we have a lot of external systems:
external typesetters, external hosting providers, external content-enrichment vendors, and so on. That particular project is to provide the plumbing that connects all of that together. And of course we have
a large number of external data-feed customers, which Rich mentioned in his slides. One of the real benefits of JATS, and one of the reasons we chose not to customize JATS at all, is that we can hand our data to any external provider and say: this is JATS according to the Tag Set, out of the box — don't bother us, go read the Tag Library. [Audience question:] Wendell Piez, Piez Consulting Services — thanks for the plug. I have two related questions. One has to do with the fact that you were implementing a migration from one tag set into another, and therefore you had control over both ends — whereas a lot of people are coming at this with something rather different in terms of production, pulling data in from outside and trying to force it into shape. So the question is: what lessons from your experience would also apply to those other kinds of applications of JATS? The second part has to do with something you touched on with respect to training — the expertise you developed internally, which I think is really interesting, because one of the trends we are seeing is that the boundary line between the technical people and the editorial people is not as strong as it used to be; the editorial people are getting a lot of technical expertise, leveraging it, and that improves communication as well as control. So what lessons would you offer others who are working with JATS as a team? [Answer:] I would think, from the
training perspective: AIP was fortunate to have very talented technology programmers whom we could work with. But if you need to use a consulting firm, or you don't have those resources in house, having the training just to speak their language matters, because nobody you hire is going to know your data as well as you do — but if you are able to communicate in a common language with them and explain what you need, it makes it that much easier. So even if you are not writing the Schematron or the XSLT yourself, understanding how they work, and knowing that those are more than likely the tools someone is going to use to do the work for you, is a great benefit. Another thing Jennifer has touched on quite a bit: we would think, "we know the data, so the program will do this every single time" — well, that's not what happens. It really makes you think through every single scenario, because if a rule is not written right, files just drop through and you never know about it. You need to make sure that every instance is accounted for — only you know that — and you need to communicate in a way that is almost neutral: not your organization's jargon, but an industry-standard kind of speech. That is just so much more helpful. And the first part of your question — oh yes, what lessons would apply to organizations that are not migrating an old dataset into JATS but are actively publishing new material sourced from Word or whatever the case may be — a scenario that is maybe a little less stable. In that scenario, do you think that
the same idea about the technical level of expertise within the editorial group applies in the same way? [Answer:] Certainly — and something I want to clarify for myself, primarily because I am now also involved in the data analysis and in writing specifications. I don't mean to sound trite, but: documentation, at any point, whether you come from a migration or are starting new. The JATS documentation online is wonderful, with great examples, but when we went to figure out some of the tags that were sitting in our own dataset, we were at a loss — why is this a string of numbers? I had no clue; nobody was there who really remembered, and it took a lot of digging. So even at the very beginning, something might seem minor and make perfect sense to you and everybody around you — it is still important to have it noted somewhere, to have that information in a central repository that anybody can get hold of, so that going forward there is a trail for people to find how you got from A to B. And we are in a situation just as you are describing: we have a production unit, an editorial unit, that processes information on a daily basis. They had been familiar with working in XML editors, and their role has changed — now they are looking at the content itself, making sure that the authors' information is being preserved properly. What it comes down to is that documentation, and — more importantly, from our standpoint or anybody's standpoint — you need to annotate it with your own usage. Just because the JATS documentation says A, B, and C — well, how do you interpret A, B, and C in your environment? One of the things we have been doing to help with that is, when we write a Schematron rule, typing out why we are doing it; it really
helps, through the rule's comments, because it becomes that extra piece of documentation. It helps people in production understand: why are we doing this, why is this happening, what business rule is being enforced in this particular case — we can break it down and translate it into language the editorial side is more familiar with. One thing I'd like to add about the XSLT: if you could say it in a pattern, it could be written in the XSLT. That is basically the philosophy we went with — if you can find a pattern, it is really pretty simple to write, and that is how we worked. And as a little more context, just so the audience understands: AIP no longer copyedits or typesets in-house. The authors' manuscripts all go to offshore vendors and come back to us — right now in AIP XML, but the documentation that was produced as part of this project will of course be the basis for them returning JATS to us. They can't do that until we switch our platform, because we cannot currently host JATS online until we launch the new platform — it is never easy to change all the parts at the same time. But in the end, documentation is absolutely central, and the Schematron enforces that as well. [Audience question:] National Library of Medicine — thank you for this paper; I have already shared it, from the preliminary proceedings, with some organizations who were thinking about their data, as an example of doing it the right way. And I wanted to follow on the back of one of those questions: when you were writing your transformation specifications using XPath, did you realize how close you were to actually writing your own XSLT? [Answer:] We were excited about that — and we do write XSLT now. Jennifer and I now write small XSLTs ourselves; the big one is still to come, but whenever an XSLT needs to be written in the company, we are now the official XSLT writers.
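The "if you can say it in a pattern" philosophy mentioned above might look like this in practice — the legacy input element names (auth, sur, fnm) are invented for illustration; the output side is JATS:

```xml
<!-- Sketch: one proprietary construct, one template — a legacy author
     element is rewritten as a JATS <contrib>. -->
<xsl:template match="auth" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <contrib contrib-type="author">
    <name>
      <surname><xsl:value-of select="sur"/></surname>
      <given-names><xsl:value-of select="fnm"/></given-names>
    </name>
  </contrib>
</xsl:template>
```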
That was something we gained from XPath — learning it was really great. [Comment from the questioner:] We had a similar experience with a strictly written specification format — by the time it is handed over, you are most of the way there. The key thing is that organizations used to do these transformations in Perl or something like that, depending on what practical resources were available, and now it is much more self-contained. [Audience question:] Hi, I'm from Nature Publishing Group, and I'm just wondering how many people were involved in this project and how long it took, end to end. [Answer:] Primarily it was the three of us, and we had one dedicated resource to handle the main XSLT, which was really monstrous, plus maybe two or three developers giving a piece of their time to do the large amounts of processing — we work on Unix platforms, so we needed their help to run the files. So really it boiled down to four individuals spending most of their time on it, and we all had other responsibilities at the same time: we were responsible for sustaining the whole publishing operation, with its questions and support, simultaneously. We started around August–September 2011, when we began writing the initial specs, but because of other responsibilities and resources it was really about January of this year that we got into it full-fledged, and we spent most of the first and second quarters finishing
up and running tests on the files. So overall, about nine to ten months. One more piece I'd like to add: at the same time this team was converting the XML, we were also restructuring all the content assets — the graphics, the packaging of the content — as we moved from our old repository to our new repository, and of course that had to tie in with the XML correctly. So there was another set of projects going on as well. Thank you.


Formal Metadata

Title How Well do you Know Your Data? Converting an Archive of Proprietary Markup Schemes to JATS: A Case Study
Title of Series JATS-Con 2012
Part Number 2
Number of Parts 16
Author Faye Krawitz,
Jennifer McAndrews,
Richard O'Keeffe
License CC Attribution 3.0 Unported:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
DOI 10.5446/30581
Publisher River Valley TV
Release Date 2016
Language English
Production Year 2012
Production Place Washington, D.C.

Content Metadata

Subject Area Information technology
Abstract The presentation will describe the challenges, benefits, and opportunities resulting from converting an archival collection of approximately 750,000 files to JATS. The goal was to migrate the American Institute of Physics (AIP) and member society archival collection from multiple generations of proprietary markup to an industry standard to create a true archive, all managed within a new, more controlled content management system. Integral to the process was the adoption and application of the XML technologies XSLT, XPath, and Schematron to transform and check the content. Sounds straightforward doesn't it? Perform a thorough document analysis, map out the transformation rules, convert the data. But is it? Have you accounted for all historical variations, generated text, metadata, nomenclature variations on XML file assets? Beside your core, don't forget about reuse for other products, edge cases, online presentation, distribution channels and staff training!
