Wednesday, September 22, 2010

Tech Tip: DocumentCloud for Librarians

Documents - and the information they contain - are the lifeblood of news organizations. We read them, write about them, discuss them. And, every so often, we do the right thing and let our readers and listeners do the same. DocumentCloud is a project that could change the way that newsrooms deal with primary source materials such as government reports or court decisions. And news researchers should take full advantage of it.

First, a disclosure: DocumentCloud is run by a group that includes my current boss, Aron Pilhofer, and when I'm in New York I sit near the site's developers. So I've got a bias. But I think that when you look at what other journalists have been able to do with it, and consider the internal newsroom uses as well, you'll agree that it's a valuable tool.

One of DocumentCloud's great strengths is freeing information from file formats like the PDF, allowing it to be read and searched as you would almost any Web page. In this way, documents become a seamless part of the story, not a distracting trip away from it. The Memphis Commercial-Appeal recently used DocumentCloud to help present its project on civil rights era photographer Ernest Withers, who was also an FBI informant.

DocumentCloud made it easy not just to view the FBI reports, but also for reporters to draw readers' attention to the important bits via its annotation feature (it's the part in yellow here). That static PDF can now be a more interactive document, more Webby, if you will. The Arizona Republic used it to help explain SB1070 and the impact of a federal judge's ruling on it, inviting two attorneys to add their expertise.

These are two examples of public projects, but DocumentCloud can also be used to store documents that newsrooms might not want to share externally (yet). It's a great way to maintain a set of files that anyone from the newsroom can access and annotate, making it a good candidate for long-term project work. And when you're ready to show that work to the world, you can make any or all of the files public.

So how does it work? If you have, say, a PDF, you can upload it to DocumentCloud's servers, where it will be scanned and have the text extracted (electronic PDFs yield a better result, but DocumentCloud tries its best for images using Optical Character Recognition software). Then you can annotate the finished document. For more details, check out the FAQ.

DocumentCloud is free to use, so there's not much stopping you from giving it a try. Contact the folks there to get an invite; they love working with news organizations.

Thursday, September 02, 2010

Notes from the Chair

I can hardly believe summer is coming to an end and that the SLA annual conference was over two months ago.

In addition to the conference, my summer was consumed with a huge project at work: the simultaneous migration of our text and photo archive to a new system. I was, however, able to get in a 10-day vacation back home in Wisconsin, where I celebrated the “big birthday” I mentioned in my initial chair column with family and friends.

In a past chair column and messages posted to NewsLib, I mentioned that the News Division was facing some serious challenges to its future, namely a marked decrease in our membership.

At the division’s annual board meeting in New Orleans, I introduced the idea of exploring an alignment with IRE (Investigative Reporters and Editors). I had a conversation with IRE’s executive director, Mark Horvit, prior to the conference, and he was very excited about the possibility of working with our membership in either a formal or informal partnership. An enthusiastic conversation with one of our paper’s projects reporters, a long-standing member of IRE, made me even more excited about the idea.

Like many of you, I hold membership in both SLA and IRE and find both equally important to my professional development goals. Two things about IRE that I find particularly appealing are the many workshops held across the country throughout the year and the organization’s increased emphasis on Web-based training, both attractive for those of us who rarely get the opportunity to attend the SLA annual conference or take advantage of training opportunities.

I appreciate the attendees at our business meeting letting me present my idea and sharing their thoughts. Concerns were voiced about an arrangement with IRE, namely the feeling that the organization would fail to meet the needs of some of our division members who work as full-time text or photo archivists or who work outside news organizations. I appreciate those concerns, but I also think some type of collaboration with IRE would be beneficial for us and provide growth opportunities.

A suggestion was made to survey our membership about their chosen professional memberships and about the IRE idea. I intended to create an online survey for that purpose after the conference, but as I’m sure you can appreciate, things got crazy at work with vacations and the migration project and my best intentions went awry. Look for the survey soon, I promise.

* * * *

In the meantime, our Chair-Elect-Elect and SLA 2011 conference planner Eli Edwards has been hard at work planning division programming for next year. Unfortunately, I will not be able to attend the conference, which leads me to a request. Since I won’t be in Philadelphia, I would deeply appreciate a volunteer to take over the role of planning the News Division’s Awards Reception. If you’re interested, please send me a message at adisch[at] Thanks.

--Amy Disch