Five of the world's largest libraries have joined Google to digitize millions of books and make every sentence searchable. Nothing in today's announcement mentions genealogy books, but with millions of out-of-print books being digitized, one has to believe that at least a handful of them will be genealogies or local histories.
The project involves libraries at Harvard and Stanford Universities, the University of Michigan at Ann Arbor, and the University of Oxford, as well as the New York Public Library. It could soon turn Google into the single largest holder of digitized published material. In effect, it will become the world's largest digital library and one of the world's largest libraries of any kind. It will also provide researchers and students with an unprecedented tool for finding information.
The company will begin by scanning works that are in the public domain, and the full contents of those books will be accessible online through the popular Google search engine. But the company also plans to scan copyrighted books in some of the libraries. The search engine will not return the full texts of those volumes, but will instead provide up to three short excerpts, each consisting of only a few lines of text in which a search term appears.
Google officials and librarians hope the excerpts will be sufficient to let researchers determine whether they want to check out or purchase the book. Google will include links to online booksellers and local library catalogues along with search results.
The number of volumes that could be scanned is interesting to contemplate:
Harvard University: 15 million volumes
New York Public Library: 20 million
Stanford: more than 7.6 million
University of Michigan: 7.8 million
Oxford: more than 6.5 million books
Harvard, Stanford, and the New York Public Library have agreed only to pilot projects with the company. Harvard University, for example, has agreed to let Google scan only 40,000 books during the pilot phase of the project. The books will be selected randomly from the five million volumes in the Harvard Depository, an off-site storage facility for seldom-requested books.
During the pilot phase of the project, the New York Public Library has agreed to let Google scan more than 10,000 but fewer than 100,000 public domain books. Oxford will allow Google to scan only books published before 1900, while officials at the University of Michigan have agreed to allow all of their books to be scanned. All of the projects are expected to take years to complete.
Susan Wojcicki, director of product management for Google, said that the Google Print project would lead to an increase in book sales because it would show readers what the volumes contain. "For publishers, we believe that this will be beneficial," she said.