Wednesday, September 22, 2010

Tech Tip: DocumentCloud for Librarians

Documents - and the information they contain - are the lifeblood of news organizations. We read them, write about them, discuss them. And, every so often, we do the right thing and let our readers and listeners do the same. DocumentCloud is a project that could change the way that newsrooms deal with primary source materials such as government reports or court decisions. And news researchers should take full advantage of it.

First, a disclosure: DocumentCloud is run by a group that includes my current boss, Aron Pilhofer, and when I'm in New York I sit near the site's developers. So I've got a bias. But I think that when you look at what other journalists have been able to do with it, and consider the internal newsroom uses as well, you'll agree that it's a valuable tool.

One of DocumentCloud's great strengths is freeing information from file formats like the PDF, allowing it to be read and searched as you would almost any Web page. In this way, documents become a seamless part of the story, not a distracting trip away from it. The Memphis Commercial-Appeal recently used DocumentCloud to help present its project on civil rights era photographer Ernest Withers, who was also an FBI informant.

DocumentCloud made it easy not just to view the FBI reports, but also for reporters to draw readers' attention to the important bits via its annotation feature (it's the part in yellow here). That static PDF can now be a more interactive document, more Webby, if you will. The Arizona Republic used it to help explain SB1070 and the impact of a federal judge's ruling on it, inviting two attorneys to add their expertise.

These are two examples of public projects, but DocumentCloud can also be used to store documents that newsrooms might not want to share externally (yet). It's a great way to maintain a set of files that anyone from the newsroom can access and annotate, making it a good candidate for long-term project work. And when you're ready to show that work to the world, you can make any or all of the files public.

So how does it work? If you have, say, a PDF, you can upload it to DocumentCloud's servers, where it will be processed and its text extracted (PDFs created electronically yield better results, but DocumentCloud does its best with scanned images by using Optical Character Recognition software). Then you can annotate the finished document. For more details, check out the FAQ.

DocumentCloud is free to use, so there's not much stopping you from giving it a try. Contact the folks there to get an invite; they love working with news organizations.

1 Comment:

At 2:28 PM, Blogger Angie said...

We have a DocumentCloud account. I love how simple and easy to use the interface is.

--Angie Holan
PolitiFact

 
