Interview: The second Gutenberg - I

By SAM VAKNIN, UPI Senior Business Correspondent  |  May 27, 2002 at 3:36 AM

SKOPJE, Macedonia, May 27 (UPI) -- The champion of what he says is a dwindling public domain, Michael S. Hart is the founder of a now-legendary project that publishes on the Internet texts whose copyright has expired.

The ethos of the early Internet owes a lot to Hart, who, as the father of e-publishing and e-books, created the first online library, Project Gutenberg.

A professor of electronic text at Benedictine University in Illinois, Hart launched a mass movement of volunteers who scan, proofread and upload dozens of new e-texts every week, all in the most common, most easily accessible format.

"Michael Hart is a visionary who was quite ahead of his time. In fact, it may still be several years before his dream of universally-available literature comes true," says Glenn Sanders, director of

"Nevertheless, Michael's efforts have inspired thousands of people around the world who now share his vision."

Hart is a former visiting scientist at Carnegie Mellon University where he was a fellow of the Internet Archive in 2000. He founded Project Gutenberg in 1971 when $100 million of computer time was made available to him by the University of Illinois operators of a Xerox mainframe. Currently he is the project's executive coordinator.

He pioneered not only the dissemination of electronic texts -- but also some of the working models that underpinned the Internet until the crash two years ago.

Project Gutenberg is, by now, an integral part of the myth and history of our networked world. Most texts online are in the public domain. But a few are copyrighted -- with permission to store the work granted by authors and publishers or other copyright holders.

As copyrights expire, thousands of works are added monthly to the public domain, where they can be freely replicated and distributed. Most of these books are out of print and saved only by the project from obscurity and ultimate oblivion.

The recurrent extension of copyright terms by Congress hampers this work by restricting the growth of the public domain or even by removing texts from it. It benefits, Hart says, very few copyright holders at the expense of universal access to literature and knowledge.

Hart mourns the rapidly dwindling public domain.

"In the U.S.A., no copyrights will expire from now to 2019! It is even much worse in many other countries, where they actually removed 20 years from the public domain," Hart said. "Books that had been legal to publish all of a sudden were not. Friends told me that in Italy, for example, all the great Italian operas that had entered the public domain are no longer there."

"Same goes for the United Kingdom. Germany increased its copyright term to more than 70 years back in the 1960s. It is a domino effect. Australia is the only country I know of that has officially stated they will not extend the copyright term by 20 years to more than 70."

Such vocations as Hart's carry a heavy price tag in recurrent frustration and cumulative exhaustion. Hart may be tired, but he does not sound bitter. He is still a fount of brilliant ideas, thought-provoking insights, exuberant optimism, and titillating predictions.

Three decades of constant battle ended in partial victory -- but Hart is as energetic as ever, straining at the next, seemingly implausible target.

"A million books to a billion people in all corners of the globe," Hart said.

Inevitably, he sometimes feels cornered. "They" figure in many of his statements -- the cynical and avaricious establishment that will sacrifice anything to secure the diminishing returns of a few more copies sold. In the project's lifetime, the period of copyright has been extended from an average of 30 years to 95 years.

Moreover, no notice of renewal is required to enjoy the copyright extensions.

This protectionism, Hart believes, hinders the spread of literacy, deprives the masses of much needed knowledge, discriminates against the poor, and, ultimately, undermines democracy.

Q: "Project Gutenberg" is a self-conscious name. In which ways is the project comparable to Gutenberg's revolution?

A: When I chose the name, the major factor in mind was that publishing e-books would change the map of literacy and education as much as did the Gutenberg press, which reduced the price of books to 1/400th of their previous price tag. From the equivalent of the cost of an average family farm, books became so inexpensive that you could see a wagonload of them in the weekend marketplace in small villages at prices that even these people could afford.

Another way our project compares to Gutenberg's revolution is that copyright laws were created to stop both.

When we only had a dozen e-books online, the price of putting one on a computer was about 1/400th the price of a paperback. But obviously with 100 gigabyte drives coming down to $100, the price of putting e-books on computers has fallen so low as to be literally "too cheap to meter." Those who like to meter everything on the cash scale are incredibly upset about Project Gutenberg.

Project Gutenberg is the first example of a paradigm shift from "limited distribution" to "unlimited distribution", now touted as "the information age". However, you should be aware that this is the fourth such information age. Each such phase has been stifled by making it illegal to use new technologies to copy texts. In 1710, the Statute of Anne copyright made it illegal for any but members of the ancient Stationers' Guild to use a Gutenberg press. Then, in 1909, the United States doubled the term of all copyrights to eliminate "reprint houses" who were using the new steam and electric powered presses to compete with the old-boy publishing network.

The third information age came in 1976 when the United States increased the copyright term to 75 years and eliminated the requirement to file copyright renewals, to stifle changes brought on by Xerox machines. In 1998, the United States extended the copyright term yet again, to 95 years, to eliminate publication via the Internet.

Q: The concept of e-texts or e-books back in 1971 was novel. What made you think of this particular use for the $100 million in spare computer time you were given by the University of Illinois?

A: What allowed me to think of this particular use for computers so long before anyone else did is the same thing that allows every other inventor to create their inventions: being at the right place, at the right time, with the right background. As Lermontov said in The Red Shoes: "Not even the greatest magician in the world can pull a rabbit out of a hat if there isn't already a rabbit in it."

I owe this background to my parents, and to my brother. I grew up in a house full of books and electronics, so the idea of combining the two was obviously not as great a leap as it would have been for someone else. I repaired my Dad's hi-fi the first time when I was in the second grade, and was also the kid who adjusted everyone's TV and antennas when they were so new everyone was scared of them.

I have always had a knack for electronics, and built and rebuilt radios and other electronics all my life, even though I never read an electronics book or manuals. It was just natural.

Let me tell you a story about how the project started: I happened to stop at our local IGA grocery store on the way. We were just coming up on the American Bicentennial and they put faux parchment historical documents in with the groceries. So, as I fumbled through my backpack for something to eat, I found the Declaration of Independence and had a "light-bulb moment."

I thought for a while to see if I could figure out anything I could do with the computer that would be more important than typing in the Declaration of Independence, something that would still be there 100 years later, but couldn't come up with anything, and so Project Gutenberg was born.

You have to remember that the Internet had just gone transcontinental and this was one of the very first computers on it. Somehow I had envisioned the Net in my mind very much as it would become 30 years later.

I envisioned sending the Declaration of Independence to everyone on the Net, all 100 of them, which would have crashed the whole thing. Luckily Fred Ranck stopped me, and we just posted a notice in what would later become comp.gen. I think about six out of the 100 users at the time downloaded it.

Q: Between 1971 and 1993 you produced 100 e-texts. And then, in less than nine years, an additional few thousand. What happened?

A: People rarely understand the power of doubling something every so often. In 1991, we were doing one e-book per month. This was totally revolutionary at the time. People kept predicting that we couldn't continue, but we were planning on doubling production every year, which we did for most years. We are now adding 200 e-texts a month.

Q. Can you give us some current download statistics?

A. As for stats, this is pretty much impossible since we don't directly control any but one or two of what I presume are hundreds of sites around the world that have our files up for download. What I can tell you is that the one site we have the most control of gives away over a million e-books per month.

Q. The Internet is often castigated as an English-language, affluent people's toy. Project Gutenberg includes predominantly English-language, Western-world texts. Do you intend to make it more multicultural and multilingual?

A. I encourage all languages as hard as I possibly can. So far we have English, Latin, French, Italian, German, Spanish, Chinese, Japanese, Swedish, Danish, Welsh, Portuguese, Old Dutch, Bulgarian, Dutch/Flemish, Greek, Hebrew. We have texts in Old French, Polish, Russian, Romanian, and Farsi in progress. I wonder if we should count mathematics as a language? I was surprised at how many people were interested when we first uploaded Pi to a million places.

Q. Why are stand-alone images (e.g., films, photographs) and sound excluded or rare?

A. We have tried some, but haven't received much feedback. Still, we will continue to experiment with all formats. Also, these files are total hogs for drives and bandwidth. Our short movie of the lunar landing is twice as big as Shakespeare and the Bible combined in uncompressed format. It's only a couple minutes long, and low-resolution. Think how big a whole movie would be, even not at high resolution. It would take up a couple CD-ROMs.

Q. Project Gutenberg now makes files available as DOC/RTF and HTML -- as well as plain vanilla ASCII. Yet, plain-text delivery seemed to have been a basic tenet of the project. What made you change your mind?

A. We're willing to post in all kinds of file formats, but the only format everyone can read is plain vanilla ASCII, so we always try to include that. Project Gutenberg has been available on CDs for years.

Q. The failure of the advertising-sponsored revenue model forces Internet-based content generators and aggregators to charge for their wares. Will Project Gutenberg continue to be free -- and, if so, how will it finance itself? Example: who is paying for the hosting and bandwidth now?

A. It's all volunteer. And the number of sites continues to grow, and to reach more and more regions around the world for easier local access. Actually, all the hosting, bandwidth, etc. are voluntary, too. However, we desperately need donations to do copyright research, cataloging, to hire librarians and library and information science professors, to support the Project Gutenberg spin-offs in other languages and countries, not to mention mundane things such as phone and utility bills, computers, drives, backups, etc. We need volunteers equally desperately. Volunteering is perhaps the only way for one person to work for a week or a month on a book and get it to a hundred million people.

Q. The reaction to e-books fluctuates wildly between euphoria and gloom.

A. This is only the commercial point of view. They want to take it over or sink it to the bottom. There are no other commercial perspectives. Between 1500 and 1550, thanks to the Gutenberg press, more books were printed than in all of history previous to Gutenberg. I have hopes like that for e-books.

Q. Some say that e-books are doomed, having miserably failed to capture the public's imagination and devotion. Others predict a future of ubiquitous, ATM-printed, e-books, replete with olfactory, tactile, audio, and 3-D effects. What is your scenario?

A. The main trouble with these predictions is not only that they are made solely with the commercial aspects in mind, but that they are made by an assortment of people from pre-e-book generations, who have no idea that you could use the same gizmo to play MP3s as to read or listen to e-books. The younger generations have no doubt about e-books. It's only the dinosaurs that have no idea what's going on. We are still getting e-mail stating that not one person is ever going to read books from computers!

Who will be the more well-read -- those who can carry at most a dozen books with them, or those who have a PDA in their pocket with a hundred or more e-books in it? Who will look up more quotations in context? Who will use the dictionary more often? Who will look up geographical information more often?

These are all things I do with my little antique PDA and the new ones are already a dozen times more powerful.

Part 2 of this interview will run Tuesday.
