|
| Chat Replay: How Can Journalists and Programmers Collaborate More Effectively? |
| It's oversimplified to call it a right-brain, left-brain difference, but it's clear that while programmers and journalists need each other, they don't always find it easy to work together. Differences in project needs and personal styles can add to the disconnect.
Below, you can replay a chat we held about the practical ways to help journalists and programmers collaborate. Here are the folks we talked to:

07/13/2010
|
| From Open Mics to Buzz Brokers, 'Content Farms' are Not all Created Equal |
| They go by a variety of euphemisms, from "content mills" or "content farms" to "content creation houses" and the Fifth Estate, but make no mistake: sites that specialize in the production and distribution of user-generated content are influencing the news industry and journalism.
The evergreen content produced by Demand Media, Helium.com, and Associated Content finds its way from these platforms to a variety of media partners, including newspapers, magazines and online news providers seeking to add local or evergreen content to their sites.
These partnerships generate low-cost content for publications and revenue for the content provider. And for some writers, these opportunities provide them with credibility and a small amount of regular income.
In a recent webinar hosted by Poynter's News University, Mitch Gelman, Vice President of Special Projects at Examiner.com -- a relatively recent addition to the stable of content creation houses -- discussed the differences between these sites.
Gelman introduced three basic models that describe the writers drawn to the sites and the content that they provide.
- Open Mic sites have their roots in the "Speaker's Corner." People drive the content production on these sites. Both Associated Content and Helium.com have Open Mic components to their content production models. Demand Media may be adding this to its offerings in the near future as well.
- Buzz Brokers analyze search trends and put out calls for stories. Associated Content has incorporated this model, and this is the primary content model at Demand Media.
- Pro-Am sites reach out to people in neighborhoods who can contribute. The model has its roots in the stringer system of local newspapers, and these sites seek to develop their contributors' skills. Helium and Examiner.com make use of aspects of this model.
The three pieces in this series touch on elements of each model. The partnership between Demand Media and USA Today for travel content reflects some buzz brokering. The purchase of Associated Content by Yahoo! may demonstrate the value of a local open mic. And Helium's strategy of credentialing writers underlines a distinct pro-am influence.
Key to how each of these develops will be the evolution of their contributor communities. Their respective page views make it clear there is a huge demand for the content produced. And if contributor participation and collaboration continue, maybe these upstart disruptors will begin to replicate a virtual newsroom experience while expanding their business models.

07/09/2010
|
| Helium Hopes Credentialing Sets it Apart from other Social Content Producers |
| For those who are concerned about the future of news, the notion that a "content mill" could produce quality journalism seems to be anathema. But Mark Ranalli, CEO of Helium.com, has been working towards building the kind of online community that could do that.
In a recent conversation, Ranalli explained that since its launch in 2006, Helium has been growing as both a content platform and community in many different ways. One of the significant changes is Helium's Credentialed Professional Program. As more professionals have come to Helium, some via its partnership with the Society of Professional Journalists, Helium needed a system that brought their offline credentials into the online community.
For example, a journalist or SPJ member can apply to Helium's credentialing board with all the necessary information, and the board will check those credentials. A writer who is credentialed as a journalist receives the appropriate site badge and a four-star ranking. A paramedic who is writing on health issues might apply to be credentialed as a medical professional. Credentialing and badges let others know that the writers are people who have substantial experience in particular fields and that their work can be trusted.
Helium has also assembled a credentialed Editorial Team. Potential editors must apply for a position, and pass what Ranalli describes as a very stringent test of their editorial skills before being considered for the team. "I know that the people on our editorial team are top-notch," Ranalli said, "because even I can't pass our editor's test."
Since the implementation of credentialing and the introduction of the Editorial Team, Ranalli noted that more magazine and online publishers are turning to Helium's content rather than to freelancers. The pay, however, is lower than what freelancers may once have made. Ranalli sees a downward trend for wages: "People might not get paid the same amounts as in the past, but they will be paid and published."
Credentialing may also be important as Helium considers doing investigative journalism. In December of 2009, Helium News was introduced to encourage more news-style reporting, as well as collaboration between contributors. Ranalli believes these changes begin creating an online newsroom experience, where seasoned reporters mingle and exchange ideas with new writers. This "virtual newsroom" community does not fast-track publishing on Helium. All articles, whether or not they come from a credentialed writer, are submitted first to a blind peer-review process.
This process has always been part of the Helium model of editorial oversight, partly because it can lessen the likelihood stories will be approved based on a writer's popularity. Ranalli also considers the blind review process an important way to bring forth new voices that might otherwise never be heard and have the potential to make a strong contribution to journalism.
Helium, over the years, has created partnerships with prestigious organizations in order to raise the profile of its writers. A partnership with the National Press Club has opened the doors of that 100-year-old organization to Helium contributors who have earned a five-star ranking. And an ongoing relationship with The Pulitzer Center for Crisis Reporting brings the Global Issues/Citizen Voices essay contest on under-reported topics to the Helium community. The current and ninth contest is focused on global maternal health.
This partnership and others have led to the Citizen Journalism Awards, which cover a broad range of topics and are sponsored by organizations as diverse as The Sunshine Foundation, the Knight Center for International Media and ITVS (for the 1H2O Project), and PETA. Helium hopes these partnerships and its editorial processes elevate it above other content-creation companies. At least some believe it has. The Massachusetts firm was recently named one of the "Hottest Boston Companies."

07/09/2010
|
| How Associated Content Helps Yahoo Go Local |
| Since 2004, Associated Content -- "The People's Media Company" -- has grown a stable of over 380,000 loyal content producers who have contributed over two million pieces of text, audio, video, and photographic content to its distribution platform. In mid-May, it was announced that Associated Content had been sold to Yahoo! for a little more than $100 million, and that there are plans to shut down the Associated Content website when the sale is complete in the third quarter of this year.
How will Associated Content continue to court the loyalty of its contributors while the sale and shutdown are pending? And what -- beyond the obvious advantages of loyal contributors and a huge cache of money-earning, evergreen content -- does Yahoo get? I posed these questions to Patrick Keane, CEO of Associated Content.
But first, here's how it works now. Associated Content's writers create self-selected and assignment-based content. Most of what is produced is evergreen content, but there are also personal essays, product reviews, and the like. While some content is paid at scale or "upfront," Keane explained that various types of content are often valued individually, according to the form (text, video, etc.) and potential earnings.
Since monetization happens over the lifetime of an article, and articles are considered annuities for both Associated Content and the producers, potential earnings are determined by a number of factors, including Web search results and AdSense metrics. Content contributors have a number of options to help them distribute and promote their work across social networking sites and blogs. Contributors can also rely on the site's search engine optimization and how-to guides for creating headlines and leads with search-friendly keywords. The combination of self-promotion and search engine optimization helps producers maximize what is available to them beyond the upfront payment system.
Keane said that "no immediate changes" would be made to the payment process. He elaborated: "We remain committed to the people that produce content. The acquisition by Yahoo! brings a great deal of opportunity for them and this will increase our contributor base. Contributors will continue to create and upload content onto Associated Content's platform. They will now be supported with a much larger distribution -- 600 million unique monthly visitors.
"There may be tweaks and changes to the process of content creation in the future," he continued, "but both Yahoo! and Associated Content are committed to maintaining the standards our contributors are used to in order to produce the most useful, original content by the people, for the people. Yahoo! plans to leverage content from our contributors across its leading media properties including Yahoo! News, Yahoo! Sports and Yahoo! Finance."
Contributors will also have the opportunity to produce for Shine, Yahoo! Movies, OMG, and most of the Yahoo! network. Associated Content currently partners with media organizations, including Thomson Reuters, Cox Newspapers and CNN. Keane said the sale is viewed positively by them. "Yahoo! partners and collaborates with publishers and they view the acquisition of Associated Content as an opportunity to extend those partnerships," he told me. "We envision that this agreement will open new opportunities to partner with other companies that share the same mission of producing high-quality original content at scale. No specific changes have been made at this point."
There may be a battle brewing over who will produce fresh, news-style content, though. Even though Associated Content's focus until now has been on the production of evergreen content, with less than 10 percent considered "news," a number of seasoned journalists contribute news-style content to several of AC's verticals, including Sports and Society.
Prior to the sale, I had asked Keane about the potential for Associated Content to create local news. "Using the virtual assignment desk, we can activate any audience in any ZIP code," he responded. "So, then, we could potentially have someone follow the story of a plane crash. We can activate people in any community to create news stories if we'd like to do that. But that's not our focus." Yahoo!, however, will now be able to throw its hat in the ring, alongside others such as AOL, in the fight to produce content for local portals. When asked after the sale whether Associated Content might begin to produce more locally-focused content, Keane responded: "The local section on Associated Content's site already has a library full of locally-focused content across several topics. Yahoo! intends to leverage the Associated Content platform to generate content across their properties, including local content. This deal will help Yahoo! provide useful local content, as Associated Content has the unique ability to tap 380,000 'man-on-the-street' contributors who are experts in their locale and can produce high-quality content in real time from any DMA."

07/09/2010
|
| Why USA Today Partnered with Demand Media |
| As more news organizations begin to consider integrating user-generated content into their daily offerings, several traditional news publishers (Hearst) have started using various forms of user-generated content from content production sites like Helium.com and Associated Content. Demand Media is the newest and perhaps most closely watched of the content production sites.
Concern over Demand comes not just from its 2008 merger with blog syndicator and aggregation software developer Pluck, but also from its proprietary algorithm, which is said to help content producers generate keyword-rich content that increases reach into the first pages of Google and other search results.
In the deal between Demand Media and USA Today, Demand provides 4,000-plus keyword-rich "Travel Tips" articles and other types of content that will be cached in USA Today's Travel Section. Demand Media will also provide keyword-rich advertising to accompany the content. While the article content will be free to USA Today, the revenue generated from the ads will be split between the news organization and Demand.
Recently, I had the opportunity to correspond with Victoria Borton, General Manager of the Travel section at USA Today, on the decision to partner with Demand Media and the benefits.
Tish Grier: Were other content outlets considered before Demand Media was chosen?
Victoria Borton: USA TODAY had a long-standing relationship with Pluck through our integration of their social media tools on USATODAY.com, enabling community as part of our Network Journalism launch in 2007. Pluck introduced us to Demand Studios about extending our relationship around their search optimized content model, we agreed travel was a category where creating a co-branded section using their approach made sense.
... We've had a positive relationship with Pluck since the launch of Network Journalism on USATODAY.com in 2007.
How important was Demand's ad production and placement plan to the deal?
Borton: Demand Media's system to create content based on search trends and the corresponding advertising model provided a strong business case for entering into this relationship.
Why was the Travel section chosen over other USAT sections that feature evergreen content? Is there an expectation that Demand's content will help the USAT Travel section become a "destination site" known for its travel info?
Borton: Travel is an area where consumers are always looking for functional, actionable tips and information around a wide variety of topics. It's an ideal area to offer travel tips. USA TODAY Travel is already a popular destination site for original, trusted travel information, and the addition of Travel Tips is one way to broaden our overall content offering.
What might be expected earnings from the travel section now that this deal is in place?
Borton: We are most excited about the demand-driven, search friendliness of this content, and its ability to bring new users to our site. As traffic increases over time, advertising revenues will follow those traffic increases.
Are there any plans to extend Demand content to other sections in USA Today, or to use them for any news or investigative reporting?
Borton: We will watch the performance of the section over time and make further decisions on whether to extend to other areas if it makes sense for both our audience and our business. While Demand Media's co-branded content expands our overall offering to our audience, there has been no thought that it would replace our existing content coverage, news and investigative reporting in any way.
Much has been made of the possibility of Demand's content not meeting with prevailing journalistic standards. Could you comment on Demand's standards and how those standards relate to the journalistic standards maintained by USA Today?
Borton: We worked with Demand Media to share our overall editorial guidelines, and they selected their top writers with existing travel experience for our project. USA TODAY reserves the right to remove content we don't feel is up to our standards. For this type of consumer service content, we are happy with the quality to date.
How might you describe the relationship between "content" and "journalism"?
Borton: Journalism is core to the USA TODAY brand -- it's our unique investigation and reporting around timely events and items of interest. Content can be anything consumed by a user: data, information, listings, photos, videos, maps and so on.
Note: On June 14, USA Today announced a partnership with location-based social network Gowalla. I asked Borton if any of the Demand Media content would be served on the three Gowalla applications. She responded: "All three of the USA TODAY content features appearing on the Gowalla application are written by USA TODAY staffers and freelance columnists."

07/09/2010
|
| How to be a social climber on the digital ladder |
| There are a variety of ways to participate in or experience news via social media. Twitter, Facebook, LinkedIn, FriendFeed, Yelp, Foursquare, Gowalla... the list goes on. But in what ways should a journalist utilize social technology?
A few years ago, Forrester researchers Charlene Li (a Poynter National Advisory Board member) and Josh Bernoff created the Social Technographics Ladder. This graphic (below) defines the behaviors and interactions associated with social media by placing users into overlapping categories. Each rung on the ladder represents a specific set of behaviors, and people can move up and down these rungs. (The most recent addition to the ladder is the "Conversationalists" category.)
How many of these rungs should today's journalist climb? I say every rung above "Inactive." Why? Because while there may be a learning curve for using specific tools, these categories describe behaviors that defined journalism before social media became the "it girl."
Here's how each rung relates to journalism, from the top of the ladder to the bottom.
- Creators author a story.
- Conversationalists talk to people about stories, find sources, break news.
- Critics review, offer opinion pieces.
- Collectors research, create contacts and read publications on a regular basis.
- Joiners are part of a community, professional or personal group.
- Spectators keep up with competitors and other publications.
I am a Creator, Conversationalist, Critic, Collector, Joiner and Spectator. But, I'm not all of these things on every social network. I focus on the networks that I see being used heavily in my Lawrence, Kansas community: Twitter, Facebook, Foursquare and Gowalla. LinkedIn, MySpace and FriendFeed are not used as often by our audience at the Lawrence Journal-World, so I'm more of a Joiner/Spectator when it comes to those. But our websites have an active presence on them all.
Being an active part of these networks keeps us in touch with a tech-savvy, information-hungry portion of our audience. They're willing to participate in and share our content on a daily basis. On Twitter alone, if a handful of people retweet a link, it could reach hundreds of thousands of users new to LJWorld.com.
Where are you on this ladder of social interaction today? Have you been a social climber over the last few years? Hint: If you're reading this, you're at least a Spectator. If you have an account on Facebook, you're a Joiner. If you leave me a comment, you're a Critic.

07/08/2010
|
| How the Semantic Web Can Connect News and Make Stories More Accessible |
| Tom Tague isn't content to let an article just be an article. "How do I take a chunk of text," he asked, "and turn it into a chunk of data?"
He was speaking Thursday night at a panel discussion hosted by Hacks/Hackers, a San Francisco-based group that bridges the worlds of journalism and engineering. Coinciding with the 2010 Semantic Technology Conference, Thursday's presentation dealt with the Web's evolution from a tangle of text to a database capable of understanding its own content.
Tague, vice president for platform strategy with Thomson Reuters, was joined by New York Times Semantic Technologist Evan Sandhaus, allVoices CEO Amra Tareen, and Read It Later creator Nate Weiner. The semantic Web is already here, they explained; and it's getting smarter.
Make news worth more
Simply put, the semantic Web is a strategy for enabling communication between independent databases on the Web.
For example, Sandhaus said, there's a wealth of priceless data in databases at Amazon, the Environmental Protection Agency, the Census Bureau, Twitter and Wikipedia. "But they don't know anything about one another," he said, so there's no way to answer questions like, "What is the impact of pollution on population?" or "What do people tweet about on smoggy days?" (Sandhaus said he did not do his presentation as a representative of the Times.)
This is a particular problem for news publishers, said Tague. Publishers need to monetize content, engage with users and launch new products; since news articles lie in a "sweet spot" between fleeting tweets and durable scientific journals, they have the most potential to grab and retain readers.
In other words, it's possible for publishers to improve the value and shelf life of news. All that's required is rich metadata.
Metadata, Tague said, improves reader engagement by linking together related media. For readers, that means more context on each story and a more personalized experience. And for advertisers, it means better demographic data than ever before.
But there's a problem: Currently, the economics of online news doesn't support the manual creation of metadata.
Let algorithms curate
Tague's solution to the Internet's overwhelming volume of news is OpenCalais, a Thomson Reuters tool that can examine any news article, understand what it's about, and connect it to related media.
This is more than a simple keyword search. OpenCalais extracts "named entities," analyzing sentence structure to determine the topic of the article. It is able to understand facts and events. For example, when fed a short article about a hurricane forming near Mexico, an OpenCalais demo tool recognized locations like Acapulco, facilities like The National Hurricane Center and even occupations like "hurricane specialist." It also understood facts, synthesizing a subject-verb-object phrase to express that a hurricane center had predicted a hurricane.
OpenCalais has already been put to work at a wide range of news organizations, including The Nation, The New Republic, Slate, and Al Jazeera. Each site's implementation is unique; for example, DailyMe uses semantic data to monitor each user's reading habits, presenting the user with personalized reading suggestions.
Both The Nation and The New Republic saw immediate benefits to the use of OpenCalais, Tague said; the tool coincided with significant gains in time-on-site, and it automatically generates pages dedicated to a single topic, which had been a labor-intensive process for editors.
Overcome overwhelming content
As OpenCalais frees editors from the minutiae of searching for complementary stories, Nate Weiner's software facilitates the gathering of reading material. Read It Later integrates with browsers and RSS readers; when users see something that they want to read later, they simply flag the page and the application gathers it for later consumption.
Unfortunately, users can sometimes wind up with an overwhelming, disorganized collection of articles. So Weiner decided to teach the application how to group similar items, making them easier to skim and select.
Initial experiments with manual tagging didn't work out, since users weren't interested in taking the time to add tags to every article they collected. So Weiner turned to semantic applications that could automatically analyze each article and organize related topics. His tool of choice: OpenCalais, which turned Read It Later's "Digest" view from an unwieldy list into a magazine-like layout.
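To make the grouping step concrete, here is a minimal sketch in Python. It assumes each saved article already carries a list of topic tags; the tags below are typed in by hand and only stand in for whatever a semantic service would return. The function name `group_by_topic` and the sample data are hypothetical and do not describe Read It Later's actual code.

```python
from collections import defaultdict


def group_by_topic(articles):
    """Map each topic tag to the titles of the saved articles that carry it."""
    groups = defaultdict(list)
    for title, topics in articles:
        for topic in topics:
            groups[topic].append(title)
    return dict(groups)


# Hypothetical reading list: (title, topic tags) pairs.
# The tags are hand-written placeholders for automatically extracted topics.
saved = [
    ("Hurricane forms near Mexico", ["Weather", "Mexico"]),
    ("Acapulco braces for storm", ["Weather", "Mexico"]),
    ("Yahoo buys Associated Content", ["Business"]),
]

digest = group_by_topic(saved)
for topic, titles in digest.items():
    print(topic, "->", titles)
```

An article with several tags appears under each of them, which is one simple way to turn a flat list into sections a reader can skim.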
Organize the organizing
Sandhaus described the alchemy of the semantic Web as "graphs of triples," which drew furrowed brows from his audience. But it turned out not to be as complicated as it sounds; the "triples" are just simple subject-verb-object sentences, chained together. For example, if a tool detects "Barack Obama" in an article, it will scan nearby words to create a relationship like "Barack Obama is the President." Then it can build on its knowledge of "the President" to branch further out: "The President lives in the White House," "The White House was burned in 1814," and so on.
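The chaining Sandhaus described can be sketched in a few lines of Python. This is only an illustration: the three triples come from the example above, while the function names `build_graph` and `chain_from` are hypothetical. Real semantic Web tools store triples in dedicated formats and databases, which this sketch does not attempt to model.

```python
from collections import defaultdict

# The three subject-verb-object triples from the example in the text.
TRIPLES = [
    ("Barack Obama", "is", "the President"),
    ("the President", "lives in", "the White House"),
    ("the White House", "was burned in", "1814"),
]


def build_graph(triples):
    """Index triples by subject so each entity maps to its outgoing edges."""
    graph = defaultdict(list)
    for subject, verb, obj in triples:
        graph[subject].append((verb, obj))
    return graph


def chain_from(graph, start):
    """Follow the first edge out of each entity and collect the sentences visited."""
    sentences = []
    seen = set()
    current = start
    # Stop when an entity has no outgoing edge or has already been visited.
    while current in graph and current not in seen:
        seen.add(current)
        verb, obj = graph[current][0]
        sentences.append(f"{current} {verb} {obj}")
        current = obj
    return sentences


for sentence in chain_from(build_graph(TRIPLES), "Barack Obama"):
    print(sentence)
```

Because the object of one triple is the subject of the next, starting from a single name detected in an article is enough to branch out to related facts.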
These relationships are derived from massive databases that grow larger and larger by the day. For example, DBpedia has turned Wikipedia into a database of 2.6 million entities; Freebase is a database of databases with 11 million topics; GeoNames tracks 8 million place names, and MusicBrainz can recognize 9 million songs.
But the real magic happens when the databases come together, such as when the BBC wanted to create a comprehensive resource for information about bands. By merging its own information with entries from Wikipedia and MusicBrainz, the BBC created a website that seems to know everything about music.
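As a rough illustration of what "coming together" means, the Python sketch below merges records from three sources that share an identifier. Everything in it is an assumption made for the example: the identifier `band-001`, the field names, and the rule that the first source to supply a field wins. It does not describe how the BBC actually built its site.

```python
def merge_records(*sources):
    """Combine per-source records keyed by a shared ID.

    When two sources supply the same field, the earlier source wins.
    """
    merged = {}
    for source in sources:
        for key, fields in source.items():
            record = merged.setdefault(key, {})
            for name, value in fields.items():
                record.setdefault(name, value)
    return merged


# Hypothetical records; "band-001" is a made-up shared identifier.
broadcaster = {"band-001": {"name": "Example Band", "sessions": 3}}
encyclopedia = {"band-001": {"name": "Example Band (group)", "origin": "Manchester"}}
discography = {"band-001": {"albums": 4}}

profile = merge_records(broadcaster, encyclopedia, discography)["band-001"]
print(profile)
```

The shared identifier does the real work here: without a key that all three sources agree on, there is nothing to tell the program that the records describe the same band.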
Trust algorithms, but trust humans more
As smart as the semantic Web can be, it's still not as smart as a human editor. "Our algorithms can never be perfect," said allVoices CEO Amra Tareen. Her company provides citizen journalists with their own news platform, incentivizing high-quality reporting with payments based on page views.
Since its launch in 2008, allVoices has scanned articles to generate what Tareen called a "bag of words" that connects each story to complementary reporting. Depending on a reporter's algorithmically calculated reputation and users' engagement with the story, the story can work its way up from a local section to national or even global focus on the site.
Tareen estimates that the curating of news on the site is about 20 percent human and 80 percent algorithmic.
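The "bag of words" idea can be sketched as follows. This is a generic textbook version, not allVoices' algorithm, which the panel did not describe in detail: each story becomes a count of its words, and two stories are compared with cosine similarity. The function names and sample headlines are hypothetical.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Count lowercase word occurrences, ignoring word order and punctuation."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine_similarity(a, b):
    """Cosine of the angle between two word-count vectors; 0.0 if either is empty."""
    dot = sum(a[word] * b[word] for word in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


# Hypothetical headlines used only to exercise the functions.
story = bag_of_words("Hurricane forms near Mexico coast")
related = bag_of_words("Hurricane warning issued for Mexico")
unrelated = bag_of_words("City council approves new budget")

print(cosine_similarity(story, related))    # shares "hurricane" and "mexico"
print(cosine_similarity(story, unrelated))  # shares no words
```

A score near 1 means two stories use much the same vocabulary, and a score of 0 means they share no words at all. Word overlap is a blunt signal, which fits Tareen's point that the algorithms need human judgment alongside them.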
Expect to see more semantic Web tools
Expect to see more semantic Web technology -- lots more, and soon. "There's growing momentum in this space," said Sandhaus, gesturing to a slide showing exponential growth of connected databases. "The more that you put yourself out there and people point back to you, the easier you are to find."
Fortunately for journalists, the semantic Web will work for humans, not the other way around. "We don't want to get in the way of the journalistic process," said OpenCalais' Tague. That's welcome news to any reporter who has been frustrated by a clunky content management system, a labyrinthine tagging and categorization system or manual photo management.
Semantic Web developers' goal, Tague said, is to free journalists to report, rather than sentencing them to generate endless metadata for the sake of SEO. "I hate the idea of journalists writing for searchability," he said. "That's a problem we should solve on the tech side."
Weiner of Read It Later agreed. Speaking on behalf of developers, he advised journalists, "Keep doing what you're doing. We'll try to adapt."

06/25/2010
|
| Miami Herald Marks Anniversary of Mariel Boatlift with Database of Passengers, Vessels |
| A Miami Herald database has publicized in-depth information on one of the most important events of Cuban emigration. A reporter, data analyst and Web developer worked for months to digitize and organize little-known data about the 1980 Mariel boatlift, published in late May to commemorate the 30th anniversary of the vessels' arrivals in the United States.
The data sets are more than mere numbers and names; every record hints at the story of someone beginning a new chapter of his or her life. It's a powerful example that demonstrates that data-driven projects can be much more than stark, emotionless series of numbers.
The project tracks more than 125,000 passengers of the 1980 Mariel boatlift from Cuba to Florida, which was one of three post-Castro exoduses. The idea behind the database was to create a master list of people who arrived during the boatlift, culled from data obtained from an unknown government source of raw, unstandardized logs. The Herald planned to encourage people who were part of the boatlift to help create a comprehensive list of vessels that made the trip and match people to vessels.
For the reporter who compiled the data, this was more than a special assignment; it was an opportunity to bring in-depth coverage to an experience relevant to her own life.
Staff writer Luisa Yanez came to the U.S. on the Freedom Flights, another exodus from Cuba to Florida. "Today, there is no master list, no Ellis Island-type record to mark the arrival of Cubans in Miami," Yanez wrote in an e-mail. "The goal of the Mariel Database is to fill that hole for one of our best-known exoduses by creating a passenger list for each vessel."
Cleaning the list of refugee names, which mostly meant double-checking every record for accuracy and removing obvious errors, took Yanez about five months. She said she was freed from her daily deadlines to work with the data.
As part of her research, Yanez said she had hoped to find more complete information about who was on which boat. About four months into the project, she requested records related to the Mariel boatlift from a U.S. Coast Guard historian. He mentioned a document called the Marine Safety Log, a list of boat manifests.
At the time, it was only available in handwritten form, although it was scheduled to be digitized. To expedite the process, Yanez hired a researcher in Washington, D.C., to copy and send the data to her.
After ensuring the information was relevant, Yanez and a group of transcribers hired for the project digitized the boat names. The process took about two weeks.
While not comprehensive, the Marine Safety Log provided more information than Yanez, Database Editor Rob Barry and Web Developer Stephanie Rosenblatt originally expected to be able to provide.
In its final form, the Herald's list aggregates, and makes searchable, two data sets. "One is a list of more than 130,000 names of Cubans who arrived in Key West via Cuba's Mariel Harbor between late April and late September 1980," Yanez wrote. "The other is a list of the names of more than 1,600 boats used during that very boatlift."
The design of the site, which Yanez said transforms the data into a community project, encourages readers to contribute missing records and assign or remove anyone from a boat list. People can also share their anecdotes and memories.
Yanez said public reaction both online and in person has been strong and emotional, which reinforces the idea that historical databases are more than numbers.
"We had people burst into tears at the simple sight of their name on our database," said Yanez. "I like to call this 'the power of the list.' There is something tremendously moving about experiencing a traumatic event in your life -- war, migration, persecution -- then seeing your name among all the other survivors or veterans. It's affirmation that I was there, that I counted, that I mattered."

06/11/2010
|
| 'All Facebook' Blogger Explains Reporting Process, Decision to Unpublish Erroneous Post |
| Last week I saw a refreshingly honest post on a site called All Facebook, which provides reporting and analysis of, you guessed it, all things Facebook. The post that attracted my attention was a follow-up to something that had been posted on the site the day before. Nick O'Neill, blogger and founder of the site, wrote:
"Yesterday I posted an article on here which suggested Facebook or Google had accidentally 'leaked' user emails, through Facebook's opt-out system. The logic we used at the time to deduce this was completely off. ...
"Since the article was so off base, we decided to pull it all together. While the logic we used was a round-about logic, we weren't the only ones confused. However rather than updating a post which has practically become useless, we've pulled it all together."
Talk about transparency being the new objectivity. O'Neill's approach struck me as more forthright than that of many bloggers, who use the term "update" when they really mean "correction." And it was more up-front than that of news organizations that are willing to correct a minor factual error but won't acknowledge if the premise of a story is "completely off," to use O'Neill's words.
The 'unpublishing' question
While some areas of online publishing have developed rapidly -- how user comments are handled, for example -- the "unpublishing" dilemma is as perplexing as ever. A decision to unpublish a story is wrapped up in how a site handles corrections, just as newspaper and TV retractions are reserved for extreme cases in which a correction isn't enough. And, as I learned from my conversation with O'Neill, it has a lot to do with your standards for publishing in the first place.
When my colleague Bill Mitchell wrote about unpublishing a couple of years ago, he concluded that pulling a story should be the last resort, not the first. (Many, if not most, requests we get for unpublishing -- or hear about -- come from people who say they've been harmed by some piece of ever-Googleable news, not an error by the news organization.)
I bet most online publishers, whether All Facebook or the Chicago Tribune, would agree with Mitchell. In fact, O'Neill told me that he rarely removes content from his site. (We're learning more at Poynter about how others handle this, and as we do, we want to help you learn more.)
All Facebook is an example of what we've taken to calling the Fifth Estate at Poynter -- an enterprise that creates journalism, but without the trappings or conventions of a traditional news outlet. More and more these niche sites are tracking the ripples of news caused by companies like Facebook. When the ripples become a big wave, their work quickly moves from the Fifth Estate to the more established Fourth Estate.
All of this makes it important to understand how such sites gather, publish and in some cases, unpublish. This is how O'Neill described his process and values.
He's comfortable with reporting the truth as it develops
Last Thursday, O'Neill found a link on a site that aggregates news of interest to programmers. Someone had blogged that his e-mail address was exposed by a Facebook page that was apparently visible to the public and had been indexed by Google.
It appeared that someone had uncovered yet another Facebook privacy lapse, and O'Neill tried to figure out what was going on.
O'Neill said he contacted Facebook's communications department, which usually responds quickly to his inquiries. When he didn't hear back after 30 minutes or so, he posted the item. He temporarily pulled the item when the company asked for more time to respond, and he posted it again with a comment from the company.
Already, this looks very different from the traditional newsgathering process.
"Writing on a blog," O'Neill said, "I'm pretty flexible with the truth evolving. Because that happens with a news story sometimes, that we get a half-piece of the story and you wonder, do you post the article or not?"
That was not the end of the process. He and someone with Facebook's communications department started debating the accuracy of his post. He thought the company's disagreement amounted to a technical argument that didn't invalidate his post.
But after discussing the issue with a Facebook developer who also voiced his disagreement, O'Neill updated the post about 10 times, striking through sentences and adding new information. One of the last updates said, "This post is effectively destroyed," and O'Neill decided to just take down the whole thing.
O'Neill felt he could effectively argue the post was accurately updated, but beneath the strikethroughs and other changes "it still showed all this false information."
He believes there's merit in publishing information even if it turns out to be incorrect
Despite the multiple changes and corrections and eventual removal of the post, O'Neill stands by the process. "Eventually the truth comes out in one form or another," he said.
"I honestly think that publishing information reveals the truth quicker, as it creates an opportunity for others to come forward with more information," he told me in a follow-up e-mail after our conversation. "In this instance it was Google and Facebook that needed to provide information; however that's not always the case. Sometimes it's a tipster or someone else that can help create the truth."
He noted that he was right about one important element: If someone publicized the link to this particular Facebook page (for instance, by linking to it on a blog), Google would index that page and reveal the e-mail addresses on that page. Facebook and Google quickly worked to fix that.
You can't really un-ring the bell online
Some bloggers have said that readers understand the information they read is often in the process of being reported. O'Neill believes this. He also believes it's the readers' job to make sure they're properly informed, but he didn't make it easy for readers to get the correct information in this case. The first people to see the post, he acknowledged, saw incorrect information. If they checked back, the link to that post returned a 404 or "page not found" error. He didn't explain why he removed the post until the next day.
Likewise, he couldn't pull the story from his site's RSS feed, so those readers didn't know the story had been unpublished. Again, clicking on the links wouldn't have told them anything.
And one commenter pointed out that unpublishing didn't help to inform readers of the hundreds of blogs that had cited and linked to his post.
O'Neill said he would have liked to redirect to the post in which he explained his retraction, but he doesn't have an easy way of doing that and didn't have the time. "I justified it out of the perspective that it's ultimately the user's job to become educated."
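The redirect O'Neill describes can be sketched in a few lines. This is only an illustration, not his site's actual setup: the paths and the `resolve` function are hypothetical, and a real site would normally configure this in its web server or publishing software rather than in application code.

```python
# Hypothetical paths for illustration; the article does not give the real URLs.
RETRACTED = {
    "/2010/06/original-post/": "/2010/06/why-we-pulled-the-post/",
}
PUBLISHED = {"/2010/06/why-we-pulled-the-post/"}


def resolve(path):
    """Return (status_code, redirect_target) for a requested path."""
    if path in RETRACTED:
        # 301 Moved Permanently: send readers and feed links to the explanation.
        return 301, RETRACTED[path]
    if path in PUBLISHED:
        return 200, None
    # Anything else is the "page not found" case readers actually hit.
    return 404, None


print(resolve("/2010/06/original-post/"))
```

With a mapping like this, anyone following an old link from a blog or an RSS reader would land on the explanation instead of an error page.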
He knows that how he deals with these issues affects his credibility, reputation and survival
O'Neill started his site three years ago, sold it to the company that owns Mediabistro.com, and acknowledges "large ambitions" for the site (as well as related ones covering social media). He attempts to balance his desire for more traffic with his desire for credibility -- among his readers and among the people who work at Facebook.
"At the end of the day, the more content I have, the more traffic I get. So I need to publish as much information as possible about Facebook," he said.
However, he said he's conscious about maintaining his relationship with Facebook, which he described as being cooperative over the years, even back when he was starting out and his work wasn't as accurate. (Facebook had 15 million users when he first started blogging about it. Now it has more than 400 million.)
"If you're going to post something damaging, then you better have the facts right," he said. "Because the last thing I want to be doing is not only damaging their brand, but damaging their brand without any sort of legitimate backing."
So he won't bury the lead if he learns of a security glitch, he said, but he won't sensationalize it, either. "That will drive traffic, but it's not good traffic, and it's going to hurt my reputation."
That brings us to a value that he shares with journalists old and new: He wants to be seen as an unbiased source of news about the social network. "If you want to maintain a trustworthy reputation with your readers, you need to publish the truth eventually, right?" he asked. "As close as you can come to truth."

06/11/2010
|
| How to Deal with Web Browser 'Fingerprints' |
| A few years ago, The New York Times exposed how "anonymous" search data isn't anonymous by using saved AOL search terms to track down an elderly widow in Georgia. Now, the Electronic Frontier Foundation has revealed that Web browsers leave information on websites you visit, which could be used to track your digital movements.
Volunteers for an EFF experiment visited panopticlick.eff.org. The website logged data that are automatically collected when you visit most sites: configuration and version of a user's operating system, browser and plug-ins.
That information was compared with a database of configurations from other visitors.
EFF found that 84 percent of the configuration combinations ended up identifying unique browsers -- essentially acting as fingerprints. Browsers installed with Adobe Flash or Java plug-ins were unique and trackable 94 percent of the time.
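To see why a combination of ordinary settings can act as a fingerprint, consider this minimal Python sketch. It is not EFF's actual method: the attribute names, the sample visitors and the `fingerprint` and `unique_share` functions are assumptions made up for illustration.

```python
import hashlib
from collections import Counter


def fingerprint(config):
    """Hash a browser configuration dict into a short, stable identifier."""
    # Sort the keys so identical settings always produce the same string.
    canonical = "|".join(f"{key}={config[key]}" for key in sorted(config))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]


def unique_share(configs):
    """Return the fraction of visitors whose fingerprint nobody else shares."""
    prints = [fingerprint(c) for c in configs]
    counts = Counter(prints)
    return sum(1 for p in prints if counts[p] == 1) / len(prints)


# Made-up visitors, for illustration only.
visitors = [
    {"os": "Windows XP", "browser": "Firefox 3.6", "plugins": "Flash 10.0"},
    {"os": "Windows XP", "browser": "Firefox 3.6", "plugins": "Flash 10.0"},
    {"os": "Mac OS X 10.6", "browser": "Safari 4", "plugins": "Flash 10.1, Java 1.6"},
    {"os": "Windows 7", "browser": "Chrome 5", "plugins": "Flash 10.1"},
]

print(unique_share(visitors))  # 0.5: two of the four visitors stand out
```

The two identical Windows XP setups hide each other, while the other two visitors can be singled out. Each extra detail a browser reports, such as a plug-in version, makes it less likely that anyone else shares the same combination.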
The privacy concerns are obvious. Do you want others to find out that you visit NSFW sites like Hawtness? Advertising networks could (and some do) use this information to secretly monitor you across websites and build a profile of your behavior and interests.
Implications for journalists
For journalists, the problem is compounded. A government agency or corporation could track your research and maybe even sources through your browser. If you cover the Pentagon, for example, would you want your fingerprints on the databases and public records that you review on defense.gov? What if you clicked on the e-mail link for a top-level executive at a major corporation?
Stephen Doig, Knight Chair in Journalism at the Walter Cronkite School of Journalism at Arizona State University, has spoken at IRE and NICAR conferences about "spycraft" -- how to keep sources safe from the government or corporations. He discusses ways to keep Internet searches and e-mail private, make untraceable phone calls, use encryption programs and deal with keyloggers. (If you are an IRE member, you can download tipsheets from one of his talks.)
Self-defense
The Electronic Frontier Foundation found that browsers that block JavaScript blend in because their configurations look more like other browsers. You may be able to find browser plug-ins that reduce how much information is shared with sites. But there doesn't seem to be much else you can do.
The Panopticlick site offers a few tips that could help keep you anonymous online:
- Use a standard browser. EFF says the most common browser is the latest release of Firefox on a Windows computer. But then you have to consider all the plug-ins you use, which makes using a "standard" version harder than you'd expect. Oddly enough, your best bet is to use a smart phone browser. They offer fewer configuration options and are harder to trace.
- Disable JavaScript. This is easy, but it makes a lot of websites unusable. An alternative is to use Firefox plug-ins like NoScript or AdBlock Plus.
- TorButton is a plug-in that sends incorrect browser configuration data to websites, covering your tracks.
- "Private browsing" is now available on several modern browsers. This prevents your computer from storing cookies, browsing history, images and other data from websites that you visit. It doesn't affect what information a website collects about your browser, but it does clear the evidence of your activity from your own computer.
Seem paranoid? Maybe, but if it's important that you not leave fingerprints when you're online, better safe than sorry.

06/03/2010
|
|