IBM to lead creation of global film database

More than 100 years of motion pictures, TV broadcasts and other images, now scattered in museums and collections around the globe, have never been catalogued in one massive worldwide database.

That will change next year when three U.S. universities and the U.S. Library of Congress begin work on an online catalogue of the world’s movie and broadcasting treasures for researchers, historians, educators and the public. The database will initially include information on the images, such as when they were made, who created them and where they are kept, but some of the material will be available for viewing online.

In a recent announcement, IBM Corp. was named the lead hardware vendor for the database project, which will run on the company’s eServer pSeries servers, built on IBM’s Power processors.

Jim DeRoest, assistant director of computing and communications at the University of Washington in Seattle, which is helping develop the database, said the project has long been a goal of researchers and is coming together now with help from a US$900,000 grant from the National Science Foundation.

Until now, the only catalogues of films and broadcast images have covered individual private collections or museums, he said, which has hindered knowledge about what remains from the early days of the industry. “There are some large (collections), but there hasn’t been this cross-genre type of catalogue,” DeRoest said.

Also participating in the project are Rutgers University Libraries in New Jersey and the Georgia Institute of Technology Media Center. The National Science Foundation grant was commissioned by the Association of Moving Image Archivists in Hollywood through a grant from the National Film Preservation Board of the Library of Congress.

The database will run on SuSE Linux Enterprise Server 8 on the IBM hardware, along with a collection of open-source software used to keep costs down, DeRoest said. The Power processor servers were chosen, he said, because each of the participating universities has had good experiences with them. “All of us were fairly satisfied with the scalability,” he said.

One problem for Linux on Intel-based hardware, DeRoest said, has arisen when vendors made hardware changes and Linux didn’t include the proper device drivers. Using the pSeries servers should solve that problem through a “consistency of hardware,” he said.

Barbara Humphrys, who works in the Library of Congress motion picture, broadcast and recorded sound division and was a member of an early Association of Moving Image Archivists subcommittee for the project, said the database will solve many problems for historians and researchers. “We’re kind of starting at the beginning,” she said. “You’d be surprised where some things are held.”

Once the database is built, links directly to the content can be added so that images and movies can be viewed, Humphrys said. And users who find the images they’re seeking will be able to contact the collection owner to try to obtain viewing rights or more information.

In addition to motion pictures, TV broadcasts and other images, the database will feature archives from the Smithsonian Institution, including video from the Hubble Space Telescope and other notable or historical images.

The Library of Congress will host the catalogue’s Web site when the Moving Images Collection debuts next year. An early version of the Web site is already online.

The Moving Images Collection databases and Web portal will run on two IBM eServer p630 and two IBM eServer p610 servers, using SuSE Linux and IBM Directory Server.

The University of Washington and Rutgers University are designing and developing the directory and catalogue databases of digital images, and the Georgia Institute of Technology is developing the Web portal for the project.