Private tech firms have found a new market for their sophisticated software capable of analyzing vast segments of the Internet – local police departments looking for ways to pre-empt the next mass shooting or other headline-grabbing event.
Twitter, Facebook and other popular sites are 24-hour fire hoses of raw information that need an automated tool for deciding what’s important and what is not. So technology companies are pushing products at law enforcement conferences, in trade publications and through white papers that promise to help police filter the deluge for terrorists, traffickers, pedophiles and rioters.
In the process, privacy advocates and other critics fear these tools – once reserved for corporate branding – could ensnare Internet users who happen to be at the wrong cyberspace destination at the wrong time.
Some 400 million tweets now flow across the Web every day, Twitter CEO Dick Costolo said in June. Facebook today reportedly boasts more than 900 million users, each pumping out a ceaseless stream of family photos, relationship updates, political manifestos, impulsive reactions to celebrity news and even criminal confessions.
It’s increasingly clear that random keyword searches for “burn,” “collapse,” “public health” and “cloud” – among dozens of terms the Department of Homeland Security considers worth monitoring – won’t produce actionable intelligence when hunted on crude and readily available tools like TweetDeck. “Cocaine” as a search term will net more tweets about Charlie Sheen than plans for the Sinaloa cartel’s next illicit shipment.
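A rough sketch of why bare keyword matching drowns in noise, assuming nothing about how any agency or vendor actually searches: the watch terms are the four quoted above, while the sample posts and the function names `matched_terms` and `flag_posts` are invented for illustration.

```python
import re

# Terms quoted in the article. Everything else here is hypothetical.
WATCH_TERMS = ["burn", "collapse", "public health", "cloud"]

# Invented sample posts, none of which describes a threat.
SAMPLE_POSTS = [
    "I burn my toast every single morning",
    "Moving all my photos to the cloud this weekend",
    "New public health clinic opens downtown",
    "Great game last night",
]

def matched_terms(text, terms=WATCH_TERMS):
    """Return the watch terms that appear as whole words or phrases in text."""
    lowered = text.lower()
    return [t for t in terms if re.search(r"\b" + re.escape(t) + r"\b", lowered)]

def flag_posts(posts, terms=WATCH_TERMS):
    """Pair each post with its matched terms, keeping only posts with a match."""
    flagged = []
    for post in posts:
        hits = matched_terms(post, terms)
        if hits:
            flagged.append((post, hits))
    return flagged

if __name__ == "__main__":
    for post, hits in flag_posts(SAMPLE_POSTS):
        print(hits, "<-", post)
```

In this toy sample, three of the four harmless posts are flagged, which is the kind of false-positive load a human reviewer would have to wade through.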
“Twitter’s, like, 90 percent noise – bots that are producing erroneous or extraneous tweets,” said Tim Gasper, product manager for Infochimps, which helps companies produce meaning from extremely large sets of data. “So you’d be scrolling through all of that just to see if anything caught your eye. Obviously, that’s not a very efficient use of people.”
Now that greater surveillance capabilities exist, some law enforcement agencies are eager for the prestige of running their own intelligence arm even without a clear target in mind, raising sticky questions about whom or what they want to spy on en masse, and why.
“I follow lots of people on Twitter that I don’t agree with at all,” said Ginger McCall, open government program director for the Electronic Privacy Information Center. “ … I follow a lot of accounts of people who are potentially breaking various U.S. laws. Does that association necessarily mean that I am?”
Many sites make their basic architecture open to all, which is why you can share Flickr photos on Facebook with a single click. It’s also why anyone can turn your public status updates or tweets into data points and combine them with billions of others to better understand consumer habits – or produce government intelligence.
One company, SAS Institute Inc. of North Carolina, teaches police that they can scrape and analyze massive volumes of data from the back ends of Facebook and Twitter – something not everyone even knows is possible.
Your data, now just a drop of ocean water, can be processed for keywords and geographic locations that reveal “patterns of interest” to police in real time. Relying on the SAS Institute’s Text Miner tool, police can single out both words and phrases and determine if a word is being used as a noun, verb or adjective. “Bomb,” for instance, can be all three.
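To make the idea of combining keywords with location concrete, here is a minimal sketch. It is not SAS Text Miner and does not attempt part-of-speech tagging; the `Post` and `BoundingBox` types, the `posts_of_interest` function, the coordinates and the sample posts are all assumptions made up for this illustration.

```python
from dataclasses import dataclass

@dataclass
class Post:
    """A hypothetical geotagged public post."""
    text: str
    lat: float
    lon: float

@dataclass
class BoundingBox:
    """A simple latitude/longitude rectangle (does not handle the antimeridian)."""
    south: float
    west: float
    north: float
    east: float

    def contains(self, lat, lon):
        return self.south <= lat <= self.north and self.west <= lon <= self.east

def posts_of_interest(posts, box, keywords):
    """Keep posts located inside the box whose text contains any keyword.

    Matching is a plain case-insensitive substring test, so it cannot tell
    whether a word like "bomb" is used as a noun, verb or adjective.
    """
    lowered = [k.lower() for k in keywords]
    return [
        p for p in posts
        if box.contains(p.lat, p.lon)
        and any(k in p.text.lower() for k in lowered)
    ]

if __name__ == "__main__":
    area = BoundingBox(south=35.0, west=-80.0, north=36.0, east=-78.0)  # invented area
    posts = [
        Post("That movie was a bomb", 35.5, -79.0),
        Post("That movie was a bomb", 40.0, -75.0),   # outside the box
        Post("Lovely weather today", 35.5, -79.0),    # no keyword
    ]
    for p in posts_of_interest(posts, area, ["bomb"]):
        print(p.text, p.lat, p.lon)
```

The single match in this example is a film review, which shows why telling apart the different uses of a word matters before anyone acts on a hit.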
“Unlike their commercial counterparts who monitor the Twitter stream for any mention of a product, law enforcement clients don’t necessarily know what they need to monitor on Twitter,” the institute wrote in a white paper earlier this year titled, “Twitter and Facebook Analysis: It’s Not Just for Marketing Anymore.”
With just one suspect’s name, they can do more: Draw in his or her followers from Twitter or read Facebook wall posts and status updates of their “friends.” Using the company’s social network analysis tool, police can visualize the connections among these individuals and see what was said among them.
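How a single name can fan out into a web of connections can be sketched with a breadth-first search over follower lists. This is an illustration only, not the company's social network analysis tool: the `FOLLOWERS` data, the account names and the `connections_within` function are hypothetical.

```python
from collections import deque

# Hypothetical follower lists: account -> accounts that follow it.
FOLLOWERS = {
    "suspect": ["alice", "bob"],
    "alice": ["carol"],
    "bob": ["alice", "dave"],
    "carol": [],
    "dave": [],
}

def connections_within(seed, followers, max_hops):
    """Breadth-first search from seed.

    Returns a dict mapping each reachable account to its distance in hops
    from the seed, stopping at max_hops.
    """
    hops = {seed: 0}
    queue = deque([seed])
    while queue:
        current = queue.popleft()
        if hops[current] == max_hops:
            continue  # do not expand past the hop limit
        for neighbor in followers.get(current, []):
            if neighbor not in hops:
                hops[neighbor] = hops[current] + 1
                queue.append(neighbor)
    return hops

if __name__ == "__main__":
    print(connections_within("suspect", FOLLOWERS, max_hops=1))
    print(connections_within("suspect", FOLLOWERS, max_hops=2))
```

Going from one hop to two in this toy data pulls in accounts that never followed the starting account at all, which is the kind of association McCall questions above.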
The result “speeds investigation and creates a story line that can be crucial for investigators who want to harness the information available in social media,” the company writes. “This information around conversations is rarely available anywhere else to investigators.”
The company would not make anyone available for an interview, but spokesman Trent Smith offered a brief email statement.
“If it’s not publicly available data, then law enforcement officers must adhere to usual due process,” Smith wrote. “Also, human investigators should analyze what the technology produces, with no actions taken without a person validating the results.”
SAS, which specializes in corporate intelligence and business analytics, was founded more than 30 years ago by statisticians at North Carolina State University.
Two years ago, SAS made its pursuit of law enforcement customers official by acquiring the British firm Memex, which converts disparate pieces of data like fingerprints and mug shots into intelligence by making it more easily available for sharing and analysis. Memex aggressively marketed itself to the dozens of intelligence “fusion centers” created after Sept. 11 that allow local, state and federal police to swap digital tips in a command center-like setting.
Other tech tools
Then there’s 3i-MIND, a Swiss company that last year prominently showcased Web surveillance products at a law enforcement conference in San Diego. There, it pitched OpenMIND, developed specifically for intelligence and law enforcement agencies, which “automatically finds suspicious patterns and behaviors” across the Internet. It digs not just within social media, but also through blogs, online forums and the “deep Web,” where many chat rooms exist.
“OpenMIND helps analysts to find insights they were not even looking for, about entities they had not previously queried,” boasts the company’s product literature. “It also helps to pinpoint specific websites not regularly monitored that may be relevant to research being performed.”
The company claims it can analyze text “according to its semantic meaning” and show whether “C4” is referring to explosives or something else.
While 3i-MIND says it has more than 500 clients, it did not respond to calls and emails seeking to determine who those clients are and how many are taxpayer-funded. The SAS Institute also declined a request for names of its government customers.
Other examples of tech tools include TACTrend, which can monitor social media within a geographic area determined by the customer. West Virginia maker HMS Technologies Inc. says the tool was developed by former law enforcement and special operations personnel and is being used by federal, state and local agencies. The company in March tipped North Carolina police to a tweet by a student threatening his teacher.
An investment arm of the CIA called In-Q-Tel raised eyebrows in 2009 when it pumped money into a social media monitoring firm called Visible Technologies based in Massachusetts. In-Q-Tel also has reportedly invested in a company called Attensity that offers text and semantic analysis. Elsewhere, the British company CrowdControlHQ offers its products to both police and business executives and created a Twitter feed dedicated to the intersection of public safety and social media.
Information distribution and other uses
Law enforcement officials have embraced social media for simpler purposes, like notifying the public of disaster procedures, distributing information about crime trends and arrests, and receiving feedback from taxpayers about new initiatives.
Police in Michigan reportedly used social media to nab a serial burglar and motorcyclist who boasted online about racing from the cops. A sheriff’s office in Louisiana posts suspect information on its own social media sites first before turning to TV stations and newspapers. The Colorado Office of Emergency Management used Twitter for updates on recent wildfires, and the Aurora Police Department tweeted critical updates following the theater shooting there in July.
Some argue that online surveillance shouldn’t matter if we freely make so many of our tweets and status updates available to all on the Internet. One site developed by a British teenager compiles examples of Internet users openly describing their own misconduct, confessing to being hungover at work and threatening their bosses.
However, public notification is different from intelligence gathering. Dallas-area Internet lawyer Benjamin Wright said courts are sending mixed signals when it comes to using social media as evidence. But, he said, there’s one concurring opinion worth noting from the Supreme Court’s landmark January privacy case known as United States v. Jones. Justice Sonia Sotomayor argued that it may be time to reconsider the assumption that any reasonable expectation of privacy is lost when we hand over personal information to third parties.
“This approach is ill suited to the digital age, in which people reveal a great deal of information about themselves to third parties in the course of carrying out mundane tasks,” she wrote.
More troubling, Wright said, is deceptively “friending” people online for surveillance purposes. A college campus police investigator in Boston recently told Security Management magazine that he created fake identities to watch people and used a profile image his targets might “consider attractive.” Interns at a New Jersey prosecutor’s office were instructed to monitor as many alleged gang members as possible. Tax collectors have been using the Web for years, Wright said, and they view it as no different from reading newspapers for tips.
“It’s one thing for the IRS or a police officer to read the newspaper,” Wright said. “But it’s a completely different ballgame when that very same officer now is using some automated tool to track me around the Internet and read thousands of blog posts and tweets and filter it all through some kind of artificial intelligence software. People just start getting creeped out.”