Library of Congress has archive of tweets

Jan. 7, 2013 at 6:33 PM

WASHINGTON, Jan. 7 (UPI) -- The U.S. Library of Congress says it has compiled more than 170 billion tweets, which will be available to researchers and other interested parties.

The library and Twitter entered into an agreement in 2010 giving the library access to all public tweets since Twitter's founding in 2006.

"Twitter is a new kind of collection for the Library of Congress but an important one to its mission," Gayle Osterberg, the library's director of communications, wrote in a blog post. "As society turns to social media as a primary method of communication and creative expression, social media is supplementing, and in some cases supplanting, letters, journals, serial publications and other sources routinely collected by research libraries."

The library has archived in digital form all of the tweets it currently possesses and is now working on plans to make them available to researchers, CNN reported Monday.

The library gathers roughly 500 million tweets per day, but making the archive publicly available is proving difficult, library officials said.

"It is clear that technology to allow for scholarship access to large data sets is lagging behind technology for creating and distributing such data," library executives wrote last week in a government white paper discussing the effort. "Even the private sector has not yet implemented cost-effective commercial solutions because of the complexity and resource requirements of such a task."

Since 2000, the library has also been archiving pages from websites dealing with government information and activity, creating a database of more than 300 terabytes.
