In the crucial early hours after the Las Vegas mass shooting, it happened again: Hoaxes, completely unverified rumors, failed witch hunts, and blatant falsehoods spread across the internet.
But they did not do so by themselves: They used the infrastructure that Google and Facebook and YouTube have built to achieve wide distribution. These companies are the most powerful information gatekeepers that the world has ever known, and yet they refuse to take responsibility for their active role in damaging the quality of information reaching the public.
BuzzFeed’s Ryan Broderick found that Google’s “top stories” results surfaced 4chan forum posts about a man that right-wing amateur sleuths had incorrectly identified as the Las Vegas shooter.
4chan is a known source not just of racism, but hoaxes and deliberate misinformation. In any list a human might make of sites to exclude from being labeled as “news,” 4chan would be near the very top.
Yet, there Google was surfacing 4chan as people desperately searched for information about this wrongly accused man, adding fuel to the fire, amplifying the rumor. This is playing an active role in the spread of bad information, poisoning the news ecosystem.
The problem can be traced back to a change Google made in October 2014 to include non-journalistic sites in the “In the News” box instead of pulling from Google News.
But one might have imagined that not every forum site could be included. The idea that 4chan would be within the universe that Google might scrape is horrifying.
Worse, when I asked Google about this, and indicated why I thought it was a severe problem, they sent back boilerplate.
Unfortunately, early this morning we were briefly surfacing an inaccurate 4chan website in our Search results for a small number of queries. Within hours, the 4chan story was algorithmically replaced by relevant results. This should not have appeared for any queries, and we’ll continue to make algorithmic improvements to prevent this from happening in the future.
It’s no longer good enough to note that something was algorithmically surfaced and then replaced. It’s no longer good enough to shrug off (“briefly,” “for a small number of queries”) the problems in the system simply because it has computers in the decision loop.
After I followed up with Google, they sent a more detailed response, which I cannot directly quote, but can describe. It was primarily an attempt to minimize the mistake Google had made, while acknowledging that they had made a mistake.
4chan results, they said, had not shown up for general searches about Las Vegas, but only for the name of the misidentified shooter. The reason the 4chan forum post showed up was that it was “fresh” and there were relatively few searches for the falsely accused man. Basically, the algorithms controlling what to show didn’t have a lot to go on, and when something new popped up as searches for the name were ramping up, it was happy to slot it as the first result.
The note further explained that what shows up in “In the News” derives from the “authoritativeness” of a site as well as the “freshness” of the content on it. And Google acknowledged they’d made a mistake in this case.
The thing is: This is a predictable problem. In fact, there is already a similar example in the extant record. After the Boston bombings, we saw a very similar “misinformation disaster.”
Gabe Rivera, who runs a tech-news service called Techmeme that uses humans and algorithms to identify important stories, addressed the problem in a tweet. Google, he said, couldn’t be asked to hand-sift all content but “they do have the resources to moderate the head,” i.e., the most important searches.
The truth is that machines need many examples to learn from. That’s something we know from all the current artificial-intelligence research. They’re not good at “one-shot” learning. But humans are very good at dealing with new and unexpected situations. Why are there not more humans inside Google who are tasked with basic information-filtering tasks? How can this not be part of the system, given that we know the machines will struggle with rare, breaking-news situations?
Google is too important, and from what I’ve seen reporting on them for 10 years, the company does care about information quality. Even from a pure corporate-trust and brand perspective, wouldn’t it be worth it to have a large enough team to make sure they get these situations right across the globe?
Of course, it is not just Google.
On Facebook, a simple search for “Las Vegas” yields a Group called “Las Vegas Shooting /Massacre,” which sprung up after the shooting and already has more than 5,000 members.
The group is run by Jonathan Lee Riches, who gained notoriety by filing 3,000 frivolous lawsuits while he was in prison for 10 years for stealing money by impersonating people whose bank credentials had been phished. Now, he calls himself an “investigative journalist” with Infowars, though there is no indication he’s been published on the site, and given that he also lists himself as a former Male Underwear Model at Victoria’s Secret, a former Nuclear Scientist at Chernobyl, and a former bodyguard at Buckingham Palace, his work history may not be reliable.
The problems with surfacing this man’s group to Facebook users is obvious to literally any human. But to Facebook’s algorithms, it’s just a fast-growing group with an engaged community.
Most people who joined the group looking for information presumably don’t know that the founder is notorious for legal and informational hijinx.
Meanwhile, Kevin Roose of The New York Times, pointed out that Facebook’s Trending Stories page was surfacing stories about the shooting from Sputnik, a known source of Russian propaganda. Their statement was, like Google’s, designed to minimize what had happened.
“Our Global Security Operations Center spotted these posts this morning and we have removed them. However, their removal was delayed, allowing them to be screen captured and circulated online,” a spokesperson responded. “We are working to fix the issue that allowed this to happen in the first place and deeply regret the confusion this caused.”
All across the information landscape, looking for news about the shooting within the dominant platforms delivered horrifying results. “Managing breaking news is an extremely difficult problem but it’s incredible that asking the search box of *every major platform* returns raw toxic sewage,” wrote John Hermann, who covers the platforms for The New York Times.
For example, he noted that Google’s conglomerate mate at Alphabet, YouTube, was also surfacing absolutely wild things and no respected news organization.
managing breaking news is an extremely difficult problem but it’s incredible that asking the search box of *every major platform* returns raw toxic sewage
— John Herrman (@jwherrman) October 2, 2017
As news consumers, we can say this: It does not have to be like this. Imagine a newspaper posting unverified rumors about a shooter from a bunch of readers who had been known to perpetuate hoaxes. There would be hell to pay—and for good reason. The standards of journalism are a set of tools for helping to make sense of chaotic situations, in which bad and good information about an event coexist. These technology companies need to borrow our tools—and hire the people to execute on the principles—or stop saying that they care about the quality of information that they deliver to people.
There’s no hiding behind algorithms anymore. The problems cannot be minimized. The machines have shown they are not up to the task of dealing with rare, breaking news events, and it is unlikely that they will be in the near future. More humans must be added to the decisionmaking process, and the sooner the better.