Persuading David Simon (Pinboard Blog)

[June 19 update: David Simon has been kind enough to respond at length here. He points out that I falsely stated that collecting call records requires a warrant; I have corrected that statement in the post below.]

I read with interest David Simon's recent blog post in which he responds to revelations that the NSA has been collecting the call records of all American mobile phone users.

David Simon, of course, created The Wire, a television series where institutions take on lives of their own and defy attempts by well-meaning people to reform them from within. So it came as a real shock to find Simon criticizing pundits who have objected to the extent of NSA surveillance, and accusing them of wilful ignorance about the nature of police work.

Mr. Simon pointed out that law enforcement agencies have been allowed to capture call records for decades, including in cases where the information harvested includes calls from people who are not under suspicion. In other words, there's nothing new going on to get worked up about.

Having labored as a police reporter in the days before the Patriot Act, I can assure all there has always been a stage before the wiretap, a preliminary process involving the capture, retention and analysis of raw data. It has been so for decades now in this country. The only thing new here, from a legal standpoint, is the scale on which the FBI and NSA are apparently attempting to cull anti-terrorism leads from that data. But the legal and moral principles? Same old stuff.

Seeing no difference in principle, only a difference in degree, in the NSA's surveillance program, Simon expresses annoyance with Americans who demand total protection from terrorism and then purport to be shocked when their government takes their requests seriously.

Mr. Simon cites the specific example of an investigation he covered as a police reporter in Baltimore in the 1980s. Criminals were using pay phones and pagers to evade detection, and tracking them down required indiscriminately recording numbers dialed from those pay phones, with the goal of sifting through the data later to find the pager numbers.

He argues that this kind of investigation, which targeted pay phones, was in some ways more invasive than the kind of tracking the NSA is accused of, since people expect to be anonymous when using a pay phone in a way that doesn't apply when they're calling from their own cell.

There is certainly a public expectation of privacy when you pick up a pay phone on the streets of Baltimore, is there not? And certainly, the detectives knew that many, many Baltimoreans were using those pay phones for legitimate telephonic communication. Yet, a city judge had no problem allowing them to place dialed-number recorders on as many pay phones as they felt the need to monitor, knowing that every single number dialed to or from those phones would be captured. So authorized, detectives gleaned the numbers of digital pagers and they began monitoring the incoming digitized numbers on those pagers — even though they had yet to learn to whom those pagers belonged. The judges were okay with that, too, and signed another order allowing the suspect pagers to be “cloned” by detectives, even though in some cases the suspect in possession of the pager was not yet positively identified.

I think Simon's fundamental argument, “same old stuff”, is mistaken in a number of important ways, and that some of this reflects our failure as technologists to communicate what modern surveillance can do.

First, there is the scope of the order. The Baltimore operation, and others like it, were limited to a specific criminal investigation. They were obtained ~~under a warrant~~ under a subpoena setting limits on what would be collected, and for how long.

The NSA program is universal and appears to be open-ended. Information is collected in aggregate. The program operates under the authority of a secret court order, not a warrant. It is not clear whether the Administration even believes this type of surveillance requires a court order.

Second is the nature of the body carrying out the surveillance. In Simon's example, this was a municipal police force, overseen by a local court.

In our case, it's the NSA, a Federal agency whose job has traditionally been to collect foreign signals intelligence ①. The operation is overseen by a secret court system called FISC.

Third is the nature of the data being collected. When the Baltimore investigation took place, it collected a simple list of telephone numbers dialed from the monitored phones.

Modern call records contain much more data, reflecting the fact that almost all of us carry cell phones. A call record now includes unique device identifiers, routing information, cell tower IDs, and a wealth of additional information about the circumstances and location of the call. The location data is particularly powerful, turning mobile phones into de facto tracking devices whenever they are turned on.

Fourth is the question of oversight. The evidence used in the Baltimore case was collected by municipal police and presented (I'm assuming) in open court. Those against whom it was used had the chance to mount a defense, appeal the verdict to state and Federal courts, and enjoyed the presumption of innocence guaranteed to them by the Constitution.

The NSA call data is collected and used in secret. The agency is overseen as part of the very large national security establishment by a small, overworked group of legislators and senior government officials who have the requisite security clearance.

So I contend that the parallel Simon makes is false. The NSA is not a law enforcement organization, and intuitions from police reporting don't carry over.

But even if we grant the analogy, I think there's a more dangerous argument in Simon's essay, which is the contention that two programs that differ only in degree are necessarily "the same old thing". I believe this is not a safe assumption to make when talking about computers and their use in domestic surveillance.

In the portion of his essay that excited the most comment, Simon appears to express disbelief that the NSA can make broad use of the data it gathers:

When the government grabs every single fucking telephone call made from the United States over a period of months and years, it is not a prelude to monitoring anything in particular. Why not? Because that is tens of billions of phone calls and for the love of god, how many agents do you think the FBI has? How many computer-runs do you think the NSA can do — and then specifically analyze and assess each result?

Well, of course, the answer is "you would not believe how many 'computer-runs' the NSA can do". I believe this part of the essay especially caught tech people's attention, since it suggested that Simon might be naive about the capabilities of a modern datacenter. It's certainly the part Clay Shirky pounced on in his rebuttal.

But Simon is not a fogey who doesn't understand how powerful computers have become (though I feel that there are such people in positions of oversight in the House and Senate). I believe his error is in assuming that the analysis of these 'computer-runs' is any kind of bottleneck. There are powerful techniques for surfacing interesting features in any comprehensive list of interactions between human beings. I've written in the past about my distaste for the 'social graph' and the perverse worldview it imposes on our projects, but part of the appeal of that worldview is the real power of mathematics applied to exactly this kind of data. The analysis can be automated, and no good comes of it.

In a beautiful worked example, Kieran Healy has shown how a precocious British intelligence service could have identified Paul Revere as a person of particular interest based only on a set of membership lists of organizations he belonged to.
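To make the Paul Revere idea concrete, here is a minimal sketch of that style of analysis. The names, groups, and the choice of a simple "how many people are you linked to" score are all invented for illustration; this is not the historical data, and a real analysis would likely use more refined centrality measures.

```python
# Toy membership lists: every name and group here is hypothetical.
memberships = {
    "alice": {"tavern_club", "print_guild"},
    "bob": {"tavern_club"},
    "carol": {"print_guild", "militia", "tavern_club", "harbor_lodge"},
    "dave": {"militia"},
    "erin": {"harbor_lodge"},
}


def shared_groups(memberships):
    """For each pair of people, count the groups they have in common.

    This is the person-by-person table you get by multiplying the
    people-by-groups membership matrix with its own transpose.
    """
    people = sorted(memberships)
    return {
        a: {b: len(memberships[a] & memberships[b]) for b in people if b != a}
        for a in people
    }


def degree(links):
    """Count how many other people each person shares at least one group with."""
    return {p: sum(1 for n in row.values() if n > 0) for p, row in links.items()}


links = shared_groups(memberships)

# Most-connected people first; ties are broken alphabetically.
ranking = sorted(degree(links).items(), key=lambda kv: (-kv[1], kv[0]))
print(ranking[0])  # ('carol', 4)
```

Nobody had to read a single letter or overhear a single conversation. The person who bridges the most groups falls out of the arithmetic.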

The point is, you don't need human investigators to find leads, you can have the algorithms do it. They will find people of interest, assemble the watch lists, and flag whomever you like for further tracking. And since the number of actual terrorists is very, very, very small, the output of these algorithms will consist overwhelmingly of false positives.
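The false-positive claim is just base-rate arithmetic, and a few lines make it visible. Every number below is an assumption I picked for illustration (population size, number of real targets, and the accuracy of the hypothetical classifier); none of them comes from any real program.

```python
def expected_flags(population, true_targets, hit_rate, false_alarm_rate):
    """Expected outcome of running a classifier over a whole population.

    hit_rate:         chance that a real target gets flagged
    false_alarm_rate: chance that an innocent person gets flagged
    Returns (true positives, false positives, share of flags that are real).
    """
    true_pos = true_targets * hit_rate
    false_pos = (population - true_targets) * false_alarm_rate
    precision = true_pos / (true_pos + false_pos)
    return true_pos, false_pos, precision


# Hypothetical inputs: a very generous 99% hit rate and a 0.1% false alarm rate.
tp, fp, precision = expected_flags(
    population=300_000_000,
    true_targets=3_000,
    hit_rate=0.99,
    false_alarm_rate=0.001,
)
print(f"real targets flagged: {tp:,.0f}")
print(f"innocent people flagged: {fp:,.0f}")
print(f"share of flags that are real: {precision:.1%}")
```

Under these made-up numbers, roughly one flagged person in a hundred is a real target, and that is with a classifier far more accurate than anyone should expect in practice.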

It's at this point that Simon's logic starts to work in the other direction. Given a long list of potential leads, investigators are going to focus on vetting the most likely, rather than taking any steps to clear false positives out of the system. The penalty for missing a real terrorist is catastrophic, while the penalty for falsely accusing someone (when not only the accusation, but the very existence of the program, is secret) is nonexistent, even if the secret accusation ends up doing real harm. Limits on manpower won't constrain the investigation; they will only reduce its overall quality.

This isn't an abstract argument. We are all familiar with the tenebrous no-fly list, a document that prevents several thousand people from traveling by air, and condemns thousands more to intrusive security measures each time they want to get on a plane. After 2001, this list rapidly expanded to thousands of names, with no avenues of appeal and no way to even check whether your name appeared in the document, to the point where the government finally had to improvise a 'redress' policy for travelers who found themselves living out a Kafka novel.

Characteristically, proposals for fixing the no-fly list and similar watch lists now call for collecting even more information, to help disambiguate people who share a name but not a date of birth with someone on the watch list. The basic problem—that lists of suspects are generated without accountability, without oversight, and with no incentive to avoid mistakes—persists.

There's also a more dangerous institutional problem to consider. When a system like this exists, it creates pressure for its own use. What is the point, after all, of having a very elaborate, extremely expensive ② database if you are only ever going to use it in exceptional cases? It is the nature of law enforcement to want to go after bad guys with all available tools. We saw a vivid demonstration of this in the years after the 2001 attacks, when the administration attempted to blur the lines between the 'War on Drugs' and the 'War on Terror', arguing that the proceeds from narcotics sales paid for terrorism.

Consider, too, a technique that has become standard in Federal investigations. It is a felony to make false statements to a Federal agent, and investigators routinely make use of this fact to gain leverage over a witness or suspect. People tend to be nervous when they talk to police, and unless they know better are liable to give inconsistent answers during questioning. Good interrogators can convert each of these inconsistencies into a felony count. Imagine how much more potent this tactic becomes when investigators can gain access to a database of your movements and contacts for the past decade.

The security state operates as a ratchet. Once you click in a new level of surveillance or intrusiveness, it becomes the new baseline. What was unthinkable yesterday becomes permissible in exceptional cases today, and routine tomorrow. The people who run the American security apparatus are in the overwhelming majority diligent people with a deep concern for civil liberties. But their job is to find creative ways to collect information. And they work within an institution that, because of its secrecy, is fundamentally inimical to democracy and to a free society.

I can't believe that David Simon, of all people, doesn't see the danger inherent in a permanent domestic surveillance program. I doubt that he would support a government initiative for all Americans to wear tracking devices in the name of fighting terrorism. Yet the NSA data collection program, whose output is functionally identical, seems not to trip the same alarm bells with him.

-:-

In public statements, the NSA director has defended domestic surveillance as a vital tool in preventing terrorism.

The term 'terrorism' is a magic word, unlocking government powers we normally associate with wartime. The current and previous Administrations have, at various times, asserted the right of the government to conduct invasive and open-ended surveillance on people it suspects of terrorism, detain suspects in terrorism cases indefinitely without trial, 'render' them to countries for interrogation and torture, kill people it considers terrorists, including American citizens, with giant flying robots, or keep such people alive against their own will.

This is total power over human life. The authorities assure us that numerous checks exist to prevent abuses of this power, but of course the checks are also classified. The government is promising that the secret police won't put innocent people in the secret prisons because the secret courts would never allow it.

This system puts enormous pressure on a small group of fallible human beings. For the secrecy to work, the number of people in on the secret must be small ③. But this group is all part of the same hierarchy, subject to the same pressures, and unable to communicate its concerns outside the same closed circle.

Talk of secret prisons, indefinite detention, and force-feeding can sound tendentious (though it's all uncontested public record!). Americans have a deep faith in the rule of law and have not proven receptive to the argument that truly innocent people will find themselves placed in the "terrorist" category by accident.

There is a tendency among those who grew up under the rule of law to treat it like the Rock of Ages, an immovable substrate in which all the institutions of the state are forever anchored. And so even ordinarily skeptical people tend to assume that the government obeys its own laws when no one is looking. To an astonishing extent, and to the great credit of American civic life, this is actually true.

But I think a better metaphor for the rule of law is that it is the soil in which democratic institutions take root. Like the soil, it can be depleted. And once depleted, it is not easily replenished.

Secrecy erodes the rule of law because it makes democratic accountability impossible. Secrets can't be held too broadly, so secrecy concentrates responsibility and asks too much of human nature. That is why every intelligence agency, unless given rigorous outside oversight, commits terrible excesses.

I think Simon agrees about the perniciousness of this secrecy. In a later rebuttal he's called for a modern-day version of the Church Committee, a group of people from outside the security establishment with top-secret access and the power to compel testimony.

And I agree with Simon that the current state of affairs is the "inevitable consequence of legislation that we drafted and passed."

American politics since the Cold War has operated under the conceit that national security must transcend partisan differences. And so we have seen large bipartisan majorities voting for pre-emptive war and domestic surveillance even though both of those policies were highly controversial outside Congress.

This tradition has created a vast space beyond political accountability. When both political parties pursue a nearly identical policy, there are no electoral consequences when the policy proves disastrously wrong. Who do you vote against?

People have good intuitions about the danger of indiscriminate collection and retention of their data. They're not being hysterical. For the last decade, we've been concentrating on how to regulate the way this data gets used in the private sector. But now that the coercive power of the state has entered the picture, the stakes are much higher, and we have an opportunity to politicize the debate. David Simon tells us to resign ourselves to the consequences of technological change:

"The question is not should the resulting data exist. It does. And it forever will, to a greater and greater extent."

But I think that is wrong. Whether the data should exist, and for how long, is exactly the question. The answer is not a technological inevitability, but a political choice.

I believe a world in which everything is recorded and persists forever carries the seeds of something monstrous ④. It is in the nature of computer systems to remember things indefinitely, but there's nothing difficult about programming machines to forget. It just requires laws to do it. We can't treat it as a technical problem. And to get the laws passed, we need to politicize the issue.
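To show how little code forgetting takes, here is a minimal sketch of a retention rule. The record layout, the `collected_at` field, and the 90-day window are all hypothetical; choosing the real window is exactly the part that belongs to lawmakers rather than programmers.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical retention window. The hard part is agreeing on this number.
RETENTION = timedelta(days=90)


def purge_expired(records, now=None, retention=RETENTION):
    """Keep only the records collected within the retention window."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - retention
    return [r for r in records if r["collected_at"] >= cutoff]


# Usage with a fixed clock so the result is reproducible.
now = datetime(2013, 6, 15, tzinfo=timezone.utc)
records = [
    {"id": 1, "collected_at": now - timedelta(days=10)},
    {"id": 2, "collected_at": now - timedelta(days=400)},
]
kept = purge_expired(records, now=now)
print([r["id"] for r in kept])  # [1]
```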

Still, these barricades are going to seem awfully lonely if we can't even get David Simon up there with us. The man should be a natural ally, and the fact that he sounds so exasperated troubles me. The fact that he seems resigned to a future of total information retention troubles me. The fact that we are talking past each other troubles me most of all.

① Simon also mentions the FBI, but it's unclear to me that this agency has anything to do with the accusations of widespread call monitoring.

② The expensive part is keeping everything secret, and staffing it with people cleared for such access. The database itself is likely quite modest in size.

③ The Washington Post has estimated the number of people with Top Secret clearance at 854,000. The number of people with full knowledge of all secret programs is much smaller, as this information is carefully compartmentalized.

④ Except Pinboard archives. Those are great!

—maciej on June 15, 2013
