Related technophoo articles: WorldCat API; Join the Movement
Standards developed by librarians and archivists allow modern researchers internet access to information about sources. Those same standards might seem a natural fit with the development of new genealogy technology standards. Think of the possibilities--if genealogists describe sources using the same standards by which librarians describe sources, how could this not improve our ability to share information?
There are skeptics. About a month ago I gave a few-minute overview of "metadata standards" to a small group. Some genealogists in attendance passed chat comments--"I don't buy it" and "neither do I." I'm blogging today about standardized metadata as powerful collaboration tools--information about sources developed by those who know sources well.
Metadata and standards
Wikipedia defines metadata as "data about data," saying, "a good example ... is the cataloging system found in libraries, which records ... author, title, subject ..."
Genealogical software users work to identify the author, title, date, etc., for each source. We type and type and type sets of catalog-like data into our database. Using yet other techniques, the software may manipulate the user data to create reference notes and bibliographies.
Librarians and archivists around the world routinely create catalogs about their holdings--working with the same data and often the same sources with which genealogists work. They have been doing this for a long time, long enough that standards have been developed.
Relevant metadata standards are the product of librarians and archivists working across disciplines, together with professionals from the more extended family of information science and technology. MARC and DACS are two examples of metadata standards (there are others).
- MARC is the acronym for MAchine-Readable Cataloging, with a history dating back to the 1960s--before some genealogists were born. From the Library of Congress (US) MARC Standards website: "The MARC formats are standards for the representation and communication of bibliographic and related information in machine-readable form."
- DACS is the acronym for Describing Archives: A Content Standard. It was adopted by the Society of American Archivists in 2004. Wikipedia describes DACS as "a set of rules for describing archives, personal papers and manuscript collections." (See also EAD, Encoded Archival Description, "an XML standard for encoding archival finding aids.")
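For the technically curious, here is a minimal sketch of what "machine-readable" can mean in practice. It assumes a simplified record held as a plain Python dictionary and uses three real MARC 21 tags (100 for the main author, 245 for the title, 260 for publication details). The sample values, the FIELD_MAP table and the marc_to_source function are all hypothetical, invented for illustration; real MARC records carry far more structure than this.

```python
# Hypothetical, simplified MARC-style record: tag -> subfield code -> value.
# Tags 100, 245 and 260 are real MARC 21 bibliographic tags; the values are made up.
record = {
    "100": {"a": "Doe, Jane"},
    "245": {"a": "A history of the Doe family"},
    "260": {"a": "Hartford", "b": "Example Press", "c": "1901"},
}

# Assumed mapping from (tag, subfield) to the fields of an "add new source" form.
FIELD_MAP = {
    ("100", "a"): "author",
    ("245", "a"): "title",
    ("260", "a"): "place",
    ("260", "b"): "publisher",
    ("260", "c"): "date",
}


def marc_to_source(rec):
    """Copy each mapped subfield into a flat dictionary, skipping absent ones."""
    source = {}
    for (tag, code), name in FIELD_MAP.items():
        value = rec.get(tag, {}).get(code)
        if value is not None:
            source[name] = value
    return source


source = marc_to_source(record)
print(source)
```

The point of the sketch is only that once author, title and date sit in agreed-upon slots, software can pick them up without anyone retyping them.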
Standards pave the way for other supporting technologies. Today, more library and archive catalogs are available online. AND ... increasing amounts of online catalog data can be extracted by users with the click of a mouse.
Powerful, complex and user friendly, too
To a family historian like me, the actual standardized metadata and supporting technologies are complex, but shhhh.... the results wind up in the most user-friendly places. You're probably already working with standardized metadata!
- Do you use WorldCat.org to search, find or extract information about a source? Various libraries and archives--the OCLC network--supply WorldCat with descriptive cataloging information about their individual holdings using international metadata standards. The result of this collaboration is a virtual library catalog of more than 239 million bibliographic records--1 billion plus individual items ... 470 languages and 112 countries. Users can search the catalog by author name, title, etc. and learn more detailed information about a source.
- Visit the Library of Congress (US) website and review a title in the Online Catalog--you'll find associated “MARC TAGS” displayed in a tab. Metadata.
- Perhaps you've tried Zotero or EndNote? Maybe EasyBib (...there's an app for that)? Yup. Yup. Yup--all implementations of metadata.
Despite so many user-friendly implementations of standardized metadata, I still find myself adding sources to my genealogical database by filling out forms. I type and type and type the name of every author, editor, and the titles of each and every website, book, article, journal and newspaper ... With a single mouse click, I'm able to extract metadata to Zotero or EndNote, but I can't port that same information to the "add new source" form in my genealogical software. Nor can my genealogical software access that information directly from the archive. So I type and type and type.
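To make the wish concrete, here is a rough sketch of the missing step. It assumes the metadata has been exported as RIS text, a tagged format that reference managers such as Zotero and EndNote can write, and it fills a dictionary standing in for the "add new source" form. The sample record, the form field names and both function names are hypothetical; no genealogy program is claimed to work this way.

```python
import re

# Made-up sample in RIS layout: a two-character tag, two spaces, a hyphen, a value.
RIS_TEXT = """\
TY  - BOOK
AU  - Doe, Jane
AU  - Roe, Richard
TI  - A history of the Doe family
PB  - Example Press
CY  - Hartford
PY  - 1901
ER  -
"""

LINE = re.compile(r"^([A-Z][A-Z0-9])  -\s?(.*)$")


def parse_ris(text):
    """Return a list of records; each record maps a tag to a list of values."""
    records, current = [], {}
    for line in text.splitlines():
        match = LINE.match(line)
        if not match:
            continue  # ignore blank or malformed lines
        tag, value = match.group(1), match.group(2).strip()
        if tag == "ER":  # ER marks the end of a record
            records.append(current)
            current = {}
        else:
            current.setdefault(tag, []).append(value)
    return records


def to_source_form(rec):
    """Fill a hypothetical 'add new source' form from one parsed record."""
    def first(tag):
        return rec.get(tag, [""])[0]

    return {
        "authors": rec.get("AU", []),
        "title": first("TI"),
        "publisher": first("PB"),
        "place": first("CY"),
        "date": first("PY"),
    }


records = parse_ris(RIS_TEXT)
form = to_source_form(records[0])
print(form)
```

Nothing here is clever. The typing disappears simply because the export already labels each piece of information.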
But is the "metadata" useful?
In the case of the catalog API from The National Archives (TNA, UK), the metadata communicates how specific material can be retrieved from the archive--what is the Item, Piece, Sub-sub series, Sub series, Series, Division and/or Department. As well, the catalog provides "summary descriptions" of the record references; a glossary supports the catalog. TNA is inviting users to access the metadata directly via their API.
Without the API/metadata, we could each research various documents in the archive to develop a summary and learn the hierarchical organization--most of us would probably make a good record--but why not begin with imported data the library actually uses for identification? Should the next record we seek come from an archive elsewhere, why not again begin with the identification principles used by that archive?
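The hierarchy described above lends itself to a small sketch. This one assumes a catalog record has already been retrieved and stored as a dictionary keyed by level name; it does not call TNA's API, and the record values, hierarchy_path and locator are hypothetical placeholders rather than anything taken from the real catalog.

```python
# Level names as listed in this post, ordered from broadest to narrowest.
LEVELS = [
    "Department",
    "Division",
    "Series",
    "Sub series",
    "Sub-sub series",
    "Piece",
    "Item",
]

# Hypothetical record: not every level has to be present.
record = {
    "Department": "XX",
    "Series": "XX 1",
    "Piece": "XX 1/23",
    "summary": "Example summary description",
}


def hierarchy_path(rec):
    """Return the (level, value) pairs that are present, in top-down order."""
    return [(level, rec[level]) for level in LEVELS if level in rec]


def locator(rec):
    """Return the narrowest level present, i.e. what to ask the archive for."""
    path = hierarchy_path(rec)
    return path[-1] if path else None


for level, value in hierarchy_path(record):
    print(f"{level}: {value}")
print("Request:", locator(record))
```

Keeping the archive's own level names in the record means the source can always be described, and requested, in the archive's own terms.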
Does it deliver a citation?
Not directly, but neither does the data entered now into genealogical software about author, title, date, etc. Citations consider context, and form preferences vary from user to user. Most supported citation styles change frequently, calling for separate supporting technologies and tools.
Products like WorldCat, EndNote and Zotero process the metadata separately to form citations. See the graphic below for the WorldCat options.
You might say standardized metadata is not so much about the comma as it is the substance between the commas.
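Here is a small sketch of that separation: one set of metadata, two outputs. The formatting rules are deliberately simplified and are my own assumptions, not any published citation style, and the sample source and both function names are hypothetical.

```python
# One hypothetical source record: the "substance between the commas".
source = {
    "author_first": "Jane",
    "author_last": "Doe",
    "title": "A history of the Doe family",
    "place": "Hartford",
    "publisher": "Example Press",
    "date": "1901",
}


def reference_note(s, page=None):
    """Simplified reference note: name in natural order, imprint in parentheses."""
    note = (
        f'{s["author_first"]} {s["author_last"]}, {s["title"]} '
        f'({s["place"]}: {s["publisher"]}, {s["date"]})'
    )
    if page is not None:
        note += f", {page}"
    return note + "."


def bibliography_entry(s):
    """Simplified bibliography entry: surname first, elements separated by periods."""
    return (
        f'{s["author_last"]}, {s["author_first"]}. {s["title"]}. '
        f'{s["place"]}: {s["publisher"]}, {s["date"]}.'
    )


print(reference_note(source, page=12))
print(bibliography_entry(source))
```

If a style guide changes its punctuation, only the formatting functions change. The stored metadata stays exactly as it was.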
Share and Share Alike
We are fortunate to live in an age where librarians and archivists come prepared to share data with us in modern formats they have worked to document and standardize. In many respects, they've done the heavy lifting.
I'm surely among those interested in working with their standards as we advance genealogical technology standards.
Mark Tucker, for introducing me to the MARC standard.
Geir Thorud, for directing me to higher-level thinking about metadata.
Adrian Bruce, because he finds it interesting.
Tamura Jones, for discovering the link to "Discovery System RESTful API."
Nancy, Godfrey Memorial Library (Connecticut), interview of 19 Aug 2011, about the cataloging process.
Cynthia Whitacre, quality control and customer service, OCLC (Ohio), interview of 19 Aug 2011.
Rebecca Baker Hill, Head Librarian, Rutherford B. Hayes Presidential Center (Fremont, Ohio), interview of 19 Aug 2011.
Valerie A. Metzler, archivist and historian, about archival materials and DACS, interview of 22 Aug 2011.
Stephen Slovasky, Bibliographic Servicing, Connecticut State Library, interview about DACS, cataloging and OCLC, 23 Aug 2011.
Revised. TY to Heather and Geir for input.
Revision 2. TY to Geir for input.