26 posts categorized "Digital Services"

February 06, 2012

Digitization Dispatch: Call for Entries!

Hello again from the ground floor at the National Museum of Natural History! I thought I'd take this opportunity to refresh the collective memory about the digital History, Art, and Culture collection we have been working to build. 

Ideally, we'd have a digital copy of every book on every shelf in the Libraries, always available to researchers anywhere with a wifi signal. We'd have digital copies that mirror physical collections in both conventional and unconventional ways, supporting any number of simultaneous collections and uses. But until then, here we are, still in the early stages of (more) complete digital access. So we have to make selection decisions informed by criteria particular to the way we do digitization.

First, what makes a book a good scanning candidate, or how do we select?

  • It's in our collection. We need to have access to the book in order to scan it.
  • It was published before 1923. That means the publication date needs to be 1922 or prior. More on copyright can be found here
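
For readers who think in code, the two first-level criteria above can be sketched as a simple check. This is only an illustration: the `Title` record, its field names, and the function name are assumptions made for this example, not part of any Libraries system.

```python
from dataclasses import dataclass

# Titles published before this year pass the date check (1922 or prior).
CUTOFF_YEAR = 1923


@dataclass
class Title:
    """Hypothetical record for a candidate book."""
    name: str
    year: int            # publication year
    in_collection: bool  # do we physically hold the book?


def passes_first_level(title: Title) -> bool:
    """True when both first-level criteria are met."""
    return title.in_collection and title.year < CUTOFF_YEAR
```

A title from 1922 that we hold passes; one from 1923, or one we don't hold, does not. The second-level factors (margins, contrast, binding, page sturdiness) are judgment calls and are deliberately left out of this sketch.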

And that's it for first level clearance. Anything we have that was published prior to 1923 has potential. Second level clearance factors are a little more subjective and have to do with the book's physicality. Logistically, we are talking about creating image files of pages with a digital camera. Our scanning partner, the Internet Archive, literally takes a picture of every page. So, that means wide margins help. A high contrast between the letters and the page helps. Binding that easily opens fairly wide helps. Pages that are sturdy enough to handle being turned at a fairly rapid pace help. There's some wiggle room with each of these criteria, so if you have a title in mind that's no longer in copyright, speak up! We'll have a look to determine scannability and eventual access online. That's the gist of the criteria for selecting titles for digitization. Please feel free to suggest titles in the comments!

Photo courtesy of Flickr user matt707

Thanks! 

 

February 02, 2012

Smithsonian Tropical Research Institute and Smithsonian Research Online

During the week of January 16-19, I visited the Smithsonian Tropical Research Institute (STRI) to discuss several matters relating to the Smithsonian Research Online (SRO) program and to offer technical support and training to STRI library staff. I was accompanied from Washington by Digital Services Head Martin Kalfatovic, who was to attend a three-day Encyclopedia of Life meeting at Barro Colorado Island during the same week.

Together we met with Oris Sanjur (STRI Associate Director for Science Administration), Vielka Chang-Yau (STRI head librarian), Angel Aguirre (librarian), Klaus Winter (STRI scientist) and Eldredge Bermingham (STRI Director). Everyone agreed that STRI-authored publication data ought to be collected in one place and that the SIL is doing a good job of coordinating this program across all Institution units. The Director and Associate Director will discuss the specific needs of their unit and report back to SIL, which will then propose a workflow to meet them.

Meanwhile, I gave STRI library staff and volunteers a brief introduction to the bibliographic tools EndNote and Zotero. While we had a training room available to us, unfortunately not every participant had a copy of these programs to work with. But they were still able to see the possibilities of using these tools in day-to-day library services.

Alvin and Vielka review the SRO website and list of Smithsonian-authored publications using the newly-installed LCD screen in the STRI library. Photo courtesy of martin_kalfatovic via Flickr.

Finally, I met with Fernando Bouché (Head, Office of Information Technology) and STRI programmer, Carlos Caballero, to discuss the management of publication data, its re-use on the STRI web page and inclusion in the SI Collections search system (EDAN).

STRI scientists publish over 300 scholarly papers every year. Approximately 70% of them are captured automatically by the SRO via websites and associated tools, which avoids the need for manual data entry. Including the complete corpus of STRI's work is an essential part of representing the research being conducted at the Institution, and the cooperation between the SI Libraries and STRI will bring the project to fruition.

 

 

 

December 26, 2011

Announcing TL-2 Online

The Smithsonian Libraries is pleased to announce that Taxonomic Literature II, or TL-2, is now available online on the Libraries' website. We are calling this TL-2 Online.

What is TL-2?

TL-2 is an essential tool for botany research that covers botanists and their publications from 1753 to the present. Comprising fifteen volumes, seven original and eight supplemental, TL-2 is organized alphabetically by author and includes some biographical information about each author. The main content of each author entry is the list of publications that he or she has written. TL-2 was constructed such that each author is assigned a unique abbreviation and each publication a unique number. There are nearly 10,000 authors and over 37,000 publications in TL-2, and the entire set of data is cross-referenced in the two indexes in each of the fifteen volumes.
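
To make that structure concrete, here is a minimal sketch of how authors keyed by unique abbreviation and publications keyed by unique number could be cross-referenced. The class names, field names, and the `build_indexes` helper are assumptions made for illustration; they do not describe the actual TL-2 data or its file format.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Publication:
    number: int  # unique publication number
    title: str


@dataclass
class Author:
    abbreviation: str  # unique author abbreviation
    name: str
    publications: List[Publication] = field(default_factory=list)


def build_indexes(authors: List[Author]) -> Tuple[Dict[str, Author], Dict[int, str]]:
    """Build two lookups: abbreviation -> author, publication number -> abbreviation."""
    by_abbreviation: Dict[str, Author] = {}
    by_number: Dict[int, str] = {}
    for author in authors:
        if author.abbreviation in by_abbreviation:
            raise ValueError(f"duplicate abbreviation: {author.abbreviation}")
        by_abbreviation[author.abbreviation] = author
        for pub in author.publications:
            if pub.number in by_number:
                raise ValueError(f"duplicate publication number: {pub.number}")
            by_number[pub.number] = author.abbreviation
    return by_abbreviation, by_number
```

With both lookups in hand, a publication number leads back to its author and an abbreviation leads to every publication listed for that author, which is roughly the job the printed indexes do.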

To put it simply, TL-2 is a database published in the form of a book. Now that the Libraries, with generous permission from the publisher, has digitized the content and placed it online, the door has been opened to using the data from TL-2 in new ways, some of which we haven't even imagined yet.

What can I do with TL-2?

Currently the website allows you to search TL-2 Online either via a simple keyword search, or a more advanced search on several fields including logical AND and OR operators. Additionally, all volumes of TL-2 may be read online using a simple page-turning application. Finally, in addition to the scan of the page, all pages that contain searchable data are presented with the corrected OCR text that was created during the digitization process. 
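
As a rough illustration of what a fielded search with AND and OR means, the sketch below filters a list of records by several field criteria at once. It is not the site's implementation; the record layout, the field names, and the `search` function are hypothetical.

```python
from typing import Dict, List


def matches(record: Dict[str, str], criteria: Dict[str, str], operator: str) -> bool:
    """Case-insensitive substring match of each criterion against its field."""
    hits = [
        term.lower() in record.get(field_name, "").lower()
        for field_name, term in criteria.items()
    ]
    # AND requires every field to match; OR requires at least one.
    return all(hits) if operator == "AND" else any(hits)


def search(records: List[Dict[str, str]], criteria: Dict[str, str],
           operator: str = "AND") -> List[Dict[str, str]]:
    """Return the records that satisfy the criteria under the given operator."""
    if operator not in ("AND", "OR"):
        raise ValueError("operator must be 'AND' or 'OR'")
    return [r for r in records if matches(r, criteria, operator)]
```

Searching on an author and a title with AND narrows the results to entries matching both, while OR widens them to entries matching either one.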

Our goal was to construct TL-2 Online using modern web development techniques to minimize page refreshes in order to offer a better experience for readers. The result is that viewing search results and reading the volumes online is very, very fast. 

The data used to create TL-2 Online is also available for download and use by other people and organizations. The download file contains the full corrected text as well as the XML version of the parsed data. Because we continue to work on the data and plan to do additional parsing, this data is subject to change, and a version number and last modified date are provided for reference.

What else do you have planned?

We're very glad you asked that! As part of the Libraries' website redesign, TL-2 Online will be one of the first components of the new Digital Library to be presented entirely as Linked Open Data (LOD). Overall, LOD will be integral to the entire Digital Library, but TL-2 will be the first data set that we make available in that manner. The Smithsonian Libraries aims to become the permanent home for TL-2 and the authority for TL-2 Linked Open Data identifiers. Although LOD is not directly visible to visitors to our site, making it available allows other computers and software to more easily reuse and query the data without extensive programming.

We also plan to continue parsing the data inside TL-2 in order to provide new avenues for using and analyzing the data. Expect the TL-2 Online website to expand to include new downloadable data and new features when the time comes. For example, many botanists contributed specimens to herbaria (libraries of plant specimens) around the world. We would like to present that data in a searchable fashion on the site when the data is ready.

Lastly, a note: Since TL-2 Online was digitized from a printed work, there are bound to be errors in the OCR and places where the parsing was not quite accurate. Although we have eliminated many of these, some may remain. Please be patient and feel free to contact us if you'd like to bring anything to our attention.

We hope that botanists around the world continue to use TL-2 and that they find our new online offering even easier to use than the printed work. 

November 03, 2011

Digitization Dispatch: Patents!

Over 150 volumes of US government patent records have recently been digitized and are currently accessible from Internet Archive here. Originally from the main library at NMAH and on their way to SILRA, these volumes represent patents from Agriculture, Arts and Manufacture, and Mechanics during the 1800s and well into the 20th century. In general, the patent materials have annual indexes organized alphabetically by topic with corresponding patent numbers. Amid the chronological organization, evidence of Thomas Edison's impact is highlighted through unique binder's choices. For example, a volume made up entirely of patents belonging to Thomas A. Edison is available here. While he accrued thousands of patents in his lifetime, among his most cited is the "Incandescing Electric Lamp", pictured. (Actually, he has many patents for improvements of the light bulb, but this one is the cutest, in this blogger's humble opinion.)

The other notable exception to the chronological organization includes dozens of discretely bound "graphophone" patents, a.k.a. the phonograph. Interestingly, the earliest record players were a result of the work Edison did on two other inventions--the telegraph and the telephone.
From the Library of Congress:

 In 1877, Edison was working on a machine that would transcribe telegraphic messages through indentations on paper tape, which could later be sent over the telegraph repeatedly. This development led Edison to speculate that a telephone message could also be recorded in a similar fashion. He experimented with a diaphragm which had an embossing point and was held against rapidly-moving paraffin paper. The speaking vibrations made indentations in the paper. Edison later changed the paper to a metal cylinder with tin foil wrapped around it. The machine had two diaphragm-and-needle units, one for recording, and one for playback. When one would speak into a mouthpiece, the sound vibrations would be indented onto the cylinder by the recording needle in a vertical (or hill and dale) groove pattern. Edison gave a sketch of the machine to his mechanic, John Kruesi, to build, which Kruesi supposedly did within 30 hours. Edison immediately tested the machine by speaking the nursery rhyme into the mouthpiece, "Mary had a little lamb." To his amazement, the machine played his words back to him.

 

 
