Reference Backtalk: Digitizing Diesel (and Other Great Minds)

By the end of 2013 the Springer Book Archives (SBA) will contain around 100,000 ebooks dating back to the 1840s, nearly the entire Springer book catalog. Identifying, locating, sourcing, and digitizing all of this content, securing the rights to it, and assigning its metadata will have been completed in under three years, giving new meaning to the term “time management.” The story of the project, perhaps as much as the resulting product, is compelling and relevant to the larger scholarly community, given that there is no precedent for an effort of this type and extent.

The case for preserving the SBA at such great expense, with an untested product, no less, was not a foregone conclusion. Why do something like this? Will customers buy it? Is there enough interest in older titles? How long is the long tail? Librarians and researchers who still have difficulty gaining access to original publications know the inherent value of reading a seminal work, as opposed to reading an analysis of it. But most have made do with only limited access so far, so what makes something like the SBA different?

The initial scoping of the undertaking was deceptively simple: how many books, and which ones, make up Springer’s catalog? Gathering 170 years of publishing history, though, which began long before standardized identifiers like ISBNs and was spread across multiple countries, languages, and disciplines, was daunting. With acquisitions, mergers, and sales along the way, the investigation continued to grow in complexity. Library catalogs helped enormously with this phase of a still-ongoing process.

One of the early, fascinating pieces of the project was the identification of key titles and authors in our collection. With so much content online today, it is difficult to comprehend that the original writings of some key researchers (Rudolf Diesel, for example) are still available only in print. His eponymous engine remains a mainstay of modern transportation, making anytime, anywhere availability of his research invaluable.

Once the books were identified, we had to determine whether most of the material could be found. Fortunately our warehouses had a majority of the titles, often unused and in excellent condition. Employees contributed as well; anyone familiar with libraries and publishing knows that there are staff whose offices closely resemble storage facilities. These repositories were a huge help, as were the libraries, antiquarian booksellers, and others who proved to be goldmines of material.

The investment required for all of this was immense. We physically collected, inventoried, packaged, and shipped these books to a centralized digitization vendor, essentially creating a master print collection from myriad locations around the globe. Beyond the physical demands of digitization on this scale, working through the Springer catalog revealed both the extent of the author outreach that would be necessary and the metadata that would be needed.

With respect to metadata, some books had ISBNs, but they did not always conform to today’s 13-digit standard. Every title also needed DOIs assigned at both the book and chapter levels; these now number in the millions. The full extent of that work became obvious only once we grasped the breadth of the catalog. Like any major undertaking, the SBA required flexibility and patience.
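For readers curious about what bringing an older ISBN up to the 13-digit standard involves, the following is a minimal sketch of the generally used conversion: prefix the first nine digits of a 10-digit ISBN with 978 and recompute the check digit. It is an illustration only, not a description of the workflow Springer used, and the function name isbn10_to_isbn13 is hypothetical.

```python
def isbn10_to_isbn13(isbn10: str) -> str:
    """Convert a 10-digit ISBN to its 13-digit form (hypothetical helper)."""
    digits = "0123456789"
    # Hyphens and spaces are presentational only, so drop them.
    chars = isbn10.replace("-", "").replace(" ", "").upper()
    if (
        len(chars) != 10
        or any(c not in digits for c in chars[:9])
        or chars[9] not in digits + "X"
    ):
        raise ValueError(f"not a well-formed ISBN-10: {isbn10!r}")

    # Verify the ISBN-10 check digit: weights 10 down to 1, sum divisible by 11.
    # A final 'X' stands for the value 10.
    values = [int(c) for c in chars[:9]] + [10 if chars[9] == "X" else int(chars[9])]
    if sum(w * v for w, v in zip(range(10, 0, -1), values)) % 11 != 0:
        raise ValueError(f"ISBN-10 check digit does not match: {isbn10!r}")

    # Build the 12-digit core and compute the ISBN-13 check digit:
    # weights alternate 1, 3, 1, 3, ... and the total must reach a multiple of 10.
    core = "978" + chars[:9]
    total = sum(int(d) * (1 if i % 2 == 0 else 3) for i, d in enumerate(core))
    check = (10 - total % 10) % 10
    return core + str(check)
```

Under these assumptions, isbn10_to_isbn13("0-306-40615-2") returns "9780306406157". Titles published before ISBNs existed have no such identifier to convert, which is one reason catalog records and newly assigned identifiers matter so much for a project like this.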

Other matters, like rights, were yet more complicated. At the most basic level, just understanding and conforming to copyright laws in different countries requires a lot of time, attention, and expertise. Springer launched a massive effort to identify, find, and contact authors and rights holders (in multiple languages) to educate them about the project, royalties, and rights, a successful and ongoing initiative.

We can confidently report that this huge project was well worth our time, sweat, and, yes, sometimes frustration. In December 2012 the first 37,000 English-language titles of the SBA were loaded onto the new SpringerLink, which had launched just two months earlier; testing and readiness required coordinating two major projects, not a task for the faint of heart! Years of hard work, collaboration, and creative problem solving culminated in the launch of the SBA at the ALA 2013 Midwinter Meeting in Seattle. At 605 feet above the Emerald City, in the Space Needle, customers and Springer personnel toasted the beginning of the final phase of the project. The remainder of the titles will be online by the end of 2013, marking the end of the project. But a new future for these important scientific works is just beginning. For more information, visit springer.com/bookarchives.

Jennifer Kemp is eProduct Manager, eBooks, Springer