UCSC computer scientists develop solutions for long-term storage of digital data

By Tim Stephens (831) 459-2495; stephens@ucsc.edu

 

Although the digital age is well under way, one crucial detail remains to be worked out: how to store vast amounts of digital information in a way that allows future generations to recover it.

The team that developed Pergamum includes graduate students Kevin Greenan and Mark Storer and associate professor of computer science Ethan Miller.

 

"The problem is how to build a large-scale data storage system to last 50 to 100 years," said Ethan Miller, associate professor of computer science in the Baskin School of Engineering at the University of California, Santa Cruz.

 

Tape libraries are widely used for data storage, but digital tape has many shortcomings as an archival medium. Miller's group has come up with a new approach, called Pergamum, which uses hard disk drives to provide energy-efficient, cost-effective storage. The declining cost of hard drives has made them more competitive with tape, and they offer numerous advantages for searching and retrieving data. "It's like the difference between a VCR and TiVo," Miller said.

 

Pergamum, named after the ancient Greek library that made the transition from fragile papyrus to more durable parchment, is a distributed network of intelligent, disk-based storage devices. The team that developed it includes UCSC graduate students Mark Storer and Kevin Greenan, along with researcher Kaladhar Voruganti of NetApp (formerly Network Appliance), a company that focuses on storage and data management solutions.

 

Archival storage is a big issue for businesses, partly because of legal requirements for the preservation of financial and business records, and partly because data mining strategies can turn stored data into a valuable resource. Long-term storage is also a growing issue for individuals who are filling their personal computers with digital photos, movies, and documents.

 

"There is a risk that an entire generation's cultural history could be lost if people aren't able to retrieve that data," Storer said. "Everyone is switching to digital cameras, but we've never demonstrated that digital data can be reliably preserved for a long time."

 

The researchers designed the system to provide reliable, energy-efficient data storage using off-the-shelf components. It can also evolve over time as storage technologies change. "You want to avoid 'forklift upgrades,' where you have to get rid of the old system and transfer all your data to a whole new system," Miller said.

 

According to Storer, businesses are beginning to recognize that archival storage is very different from simply backing up their data. "A backup is a safety net--you hope you won't need it. Archival data you do want to use--it's a valuable resource and you want to be able to mine it for information," he said.

 

Tapes work well for backups, in which data are written once, rarely read, and not kept indefinitely. But archival data should be easy to read, query, browse, and search, and tape has inherent weaknesses in these areas. Existing disk-based systems offer excellent performance, but rely on power-hungry central controllers.

 

Pergamum is one of several related projects being developed by researchers in the Storage Systems Research Center (SSRC) at UCSC's Baskin School of Engineering.

 

Read the full news release at http://www.ucsc.edu/news_events/text.asp?pid=2130

Last Updated: May 2, 2008 - 10:38am