22 August 2013 by Published in: rants 6 comments

The EFF managed to get one of the FISC court rulings declassified:

NSA’s targeting and minimization procedures, as the government proposes to apply them to MCTs as to which the “active user” is not known to the a tasked selector, are inconsistent with the requirements of the Fourth Amendment.

However, there is one point of fact that, as best as I can tell, has escaped public analysis. I am not a real lawyer, I just play one on TV, so take my analysis with a grain of salt.

Now I preface this by saying that, I am really not a conspiracy theorist. As this story has developed, I have been, perhaps, more pro-NSA than many others in the hacker community would like, and I have intended to interpret facts, if perhaps not in the best possible light for the NSA, certainly not the worst.

However, I have tried very hard to come up with a reasonable explanation for this new set of facts, and so far the only reasonable explanation I can see… well… sounds like a conspiracy theory. Keep in mind that I have written this blog post very fast, and it’s possible I have made a mistake somewhere.

The declassified decision deals primarily with what they call NSA’s “upstream collection”, as opposed to its “non-upstream collection”. Unfortunately, the details of what these terms mean are largely classified. However, by looking carefully at the footnotes, a picture begins to emerge. Footnote 3 says that

The term “upstream collection” refers to NSA’s interception of Internet communications as they transit _______ , _______, rather than to acquisitions directly from Internet service providers such as ______, ____ ____ ____ ____________________

Over on p. 29, the decision notes that

NSA requires more than two hundred fifty million Internet communications each year pursuant to Section 702, but the vast majority of these communications are obtained from Internet service providers and are not at issue here. Indeed, NSA’s upstream collection constitutes only approximately 9% of the total Internet communications being acquired by NSA under Section 702.

A footnote then clarifies as follows:

In addition to its upstream collection, NSA acquires discrete Internet communications from Internet service providers such as _____ _____ ____ … NSA refers to this non-upstream collection as its “PRISM collection”.

Based on these facts, a picture begins to emerge.

  1. Approximately 9% of the NSA’s collections, or more than 22.5 million “Internet communications each year”, are intercepted “as they transit” some classified thing. I presume the classified thing is something like “undersea cable” or “Internet backbones”.
  2. The remaining 91% of the NSA’s collections, or more than 227 million “Internet communications each year”, are acquired as part of the PRISM program, “from Internet service providers”.

Now the particular Internet service providers that are listed are redacted from the ruling. And you may think, being a reasonably well-informed technical person, that “Internet service providers” refers perhaps to companies like AT&T, who have been known to forward their traffic to the NSA. But I think a careful reading of this decision precludes that possibility.

Footnote 24 says that

The Court understands that NSA does not acquire “Internet transactions” through its PRISM collection.

What is an “Internet transaction” you ask? Footnote 23 provides the answer:

The government describes an Internet “transaction” as “a complement of ‘packets’ traversing the Internet that together they may be understood by a device on the Internet and, where applicable, rendered in an intelligible form to the user of that device.”

In other words, the “PRISM collection” does not consist of TCP or UDP connections. And so if it does not consist of packets, it probably does not originate at the router level. In fact, those “router backdoor” programs are probably part of the 9%–that is, the “upstream collection”.

So what does the document mean by “Internet service providers” that comprise the PRISM program? Well, I think it must mean companies like Microsoft, Google, and Facebook et al, who “provide” “services” over the Internet. Even the NSA’s own slides call the members of the PRISM program “providers”:

NSA calls these "providers"

Well yes, we all knew that these were the PRISM companies, ever since the Guardian originally broke this story back in June. So what is the news here? The news is as follows: 91% of the NSA’s total data collection comes from these companies, or companies like them. And in absolute terms, more than 227 million “Internet communications each year” are sourced from these companies, or companies like them.

Now I have seen it speculated, by reasonable, smart people, that PRISM is merely an API that the NSA uses to access data pursuant to a lawful order, such as a court order. In fact, until about 60 minutes ago, this view was also mine. Claire Miller of the New York Times explains:

Each of the nine companies said it had no knowledge of a government program providing officials with access to its servers, and drew a bright line between giving the government wholesale access to its servers to collect user data and giving them specific data in response to individual court orders. Each said it did not provide the government with full, indiscriminate access to its servers.

The companies said they do, however, comply with individual court orders, including under FISA. The negotiations, and the technical systems for sharing data with the government, fit in that category because they involve access to data under individual FISA requests. And in some cases, the data is transmitted to the government electronically, using a company’s servers.

However, in light of new facts, the theory doesn’t hold up. The notion that there is any court in this country that has issued within three orders of magnitude of 225 million specific “individual requests” is ludicrous. For example, a Wired story reports that there were 1,856 FISA orders in 2012, and Facebook reports that they received “between 9,000 and 10,000” requests “from any and all government entities in the U.S.” which includes routine criminal requests over a six-month period. It is unclear what the FISC decision means by “Internet communications” (individual emails? individual chat messages? wall posts?) but it should be clear that volumes of 10,000 specific requests across all federal, state, and local law enforcement anywhere in the United States should not get you within a few lightyears of the NSA collecting 225 million anything.

So, we are faced with a rather significant question. Who is providing the NSA more than 225 million “Internet communications” per year under PRISM?

Take a look at Google’s statement for a minute:

Now, what does happen is that we get specific requests from the government for user data. We review each of those requests and push back when the request is overly broad or doesn’t follow the correct process. There is no free-for-all, no direct access, no indirect access, no back door, no drop box.

Now I’m certainly not accusing Google of lying. But clearly, if all the PRISM companies are only complying with “specific requests” that are not “overly broad” and are “each” “reviewed”, it is impossible to end up collecting 225 million anything.

Well, I have only had an hour or so to process this information. But, as a avowed non-conspiracy-theorist — all the explanations I can think of are — well — conspiracy theories.

I mean either:

  • The NSA’s collection of “more than two hundred fifty million Internet communications each year” is a misrepresentation to the court. While the idea that the NSA would misrepresent facts to the Court apparently does not surprise the Court (see Footnote 14), it would certainly surprise me if they are exaggerating the extent of their collection, rather than playing it down.
  • One or more of the PRISM companies has told a bald-faced lie about the scope of their involvement with the NSA. I have seen this idea floated, and I have until this time found the suggestion more than a little crazy that 9 publcly-traded companies would all conspire to go out and tell lies, no matter how ostensibly similar their statements seem.
  • There is some new company — that apparently didn’t rate a mention on any of the PRISM slides — that is actually responsible for the bulk of the PRISM data. Why leave them out? And what company that is large enough to have access to hundreds of millions of non-packet-level “Internet communications” has not yet denied involvement with PRISM?

I’m struggling — really struggling — to find an explanation for these facts that isn’t totally crazy. Anyone want to walk me back from this cliff?


Want me to build your app / consult for your company / speak at your event? Good news! I'm an iOS developer for hire.

Like this post? Contribute to the coffee fund so I can write more like it.

Comments

  1. ix
    Thu 22nd Aug 2013 at 5:09 am

    Isn’t this just xkeyscore, which would not be collection of data moving between ISPs (which is what I think it would say in that blacked out “transit …” footnote), but collection of data straight from user connections by a “black box” that sends data on-demand to agents. As I understand it, these boxes filter the traffic they get and do not capture the full stream but only the content of that stream (so they don’t get the full transaction) or maybe only metadata in some cases (URLs might be enough for HTTP for instance).

    Since it necessarily must filter broadly (because deep filters take too much processing time) it ends up catching a lot of data, which might add up to the 200 million something. Or if you count every time an agent looks something up in this system, suppose he wants everything from one IP for the last 3 days. This would be hundreds or thousands of http connections, tens of e-mails sent, etc. Might well add up easily to 200 million.

  2. xo
    Thu 22nd Aug 2013 at 9:05 am

    I’ve read that in some (many?) cases, there are people at the nine Internet companies (among others) with very high security clearances, such that even the CEOs and legal counsel do not know what they work on. It has not been clear if these people are employees of the firms, or employees of the government, but if the CEO has actual deniability (as opposed to plausible deniability, thank you Oliver North) then it would explain their “we know nothing” statements. This would seem like a preferred stance by all participants, despite unusual warping of concepts like “employment” and “democracy.”

  3. Thu 22nd Aug 2013 at 10:00 am

    There are plenty of ways the facts could be interpreted so that each party could claim to be giving the “least untruthful” statement.

    For example, some leaked docs have claimed that FISA requests are always specific, while others have cited FISA requests that are basically a rubber stamp for any future collection. Likewise, PRISM has sometimes been used to designate only voluntary provision via a participating ISP of a specific user’s metadata. But in other places, it’s much more general. To @ix’s point, we already know that xkeyscore sits atop much more than metadata, and there are presumably systems we still don’t know about. We see this all the time in large IT systems — people just use the term that they think will get across the basic point to their audience.

    To your point about conspiracy, it was a conspiracy from the start, by definition. The government needed cooperation and silence from many parties, and used many forms of persuasion to ensure silence. Of course, they claim it is all totally constitutional, but the point is that all of the parties continue conceal the truth while striving to provide the “least untruthful” answer possible within the constraints of concealing the truth.

  4. Hugh
    Thu 22nd Aug 2013 at 12:27 pm

    If I wanted information on particular people, and presuming I got or need a court order, I’d ensure that it required the companies that held that information (email/social/etc) to give me their archives of this person, and also to update me as new information became available (private posts/location tagged photos auto-uploaded/cell tower id/etc). It’s not too hard to imagine a few thousand such entries per person of interest, which could scale into the 225 million.

  5. Fri 06th Sep 2013 at 9:44 pm

    Yes ix is right, in that these companies consider a “specific” court order to still be specific even if it is broad as saying that they are looking for anyone who has sent an email whose contents included the term “Vlademir Putin”.

    Also, they used to call “conspiracy theorists” detectives. There’s this illogical notion that simply because some people jump to off-the-wall theories (Alien reptiods from Andromeda run the NSA!!!) that anyone positing a theory that two or more people are working in secret toward a common goal, is not sane.

    The fact is that conspiracy theories occur CONSTANTLY. Every time a guy cheats on his wife with her friend or sister, the two are conspiring toward a common goal. If I tell you this is going on, it’s simply irrational to dismiss it out of hand because of the fact that I have a theory that your husband is boning your sister.

    To dismiss such theories is to accept what you are told at face value and be gullible as though you were born yesterday.

    If a detective sees a red stain on Bob’s shirt, and the Bob says it’s ketchup, and he sees hair under Bob’s finger nails, and Bob says it’s from petting his dog, and then says his wife is lying on the ground because she fell on a knife and died, and the detective suspects foul play, we do not call him a conspiracy theorist.

    We say he’s seeking the truth. That is all it is, and such professions do not have a monopoly on seeking it. I encourage you to seek it out and question everything.

  6. Jeff Berkowitz
    Tue 10th Sep 2013 at 11:45 pm

    Drew, you very nicely stated a question that has been troubling me since the whole thing started. My hypothesis is that the answer to your question is something like CDNs. I don’t want to name names because I don’t want to get sued, but you should look into the history of some of the early entrants in the CDN business.

Comments are closed.

Powered by WordPress