I know what you downloaded from Freenet

The Freenet Project has been around since 2000. It was designed as a stealthy P2P network (some have called it a “darknet”) that distributes its content so broadly that it’s impossible to censor.

There are a number of security features in Freenet that other P2P networks lack. Because data that the network’s various nodes exchange is encrypted, it’s difficult, though not impossible, for an outside observer to know what’s being passed between two nodes. It is also nearly impossible to identify the author of a Freesite, or to identify the person responsible for inserting content into the network, unless they wish to be known. Most importantly, it’s nearly impossible for an outside attacker to determine whether a given node is requesting the data being sent to it, or is merely relaying it to another node.

These layers of obscurity and limited anonymity are what enable Freenet participants to exchange information freely. Content that is illegal, whether rightly or wrongly, flows freely through the network, cached on thousands of computers worldwide.

Who knows where that stuff came from? Each participant necessarily operates a Freenet node, which caches encrypted data that has either been requested by that node’s owner, or requested by other Freenet nodes. That is, one’s node will cache data that it is merely proxying for others. Caching enables the broad distribution of content that makes Freenet impossible to censor. It also introduces doubt about the origin of any data found in a given node’s cache. It helps to provide deniability.

Of course, anyone can find out what data is in their own cache by decrypting it: apply the correct Content Hash Key (CHK), and the data is revealed. But because the cache is encrypted, one can avoid knowing what it contains simply by neglecting to run a list of CHKs against it – hence deniability, should a forensic examiner locate illegal files in one’s Freenet cache. It is, or rather ought to be, impossible to determine whether the owner of a particular machine requested the files in his cache, or whether his node merely proxied and cached them for others.
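
The mechanics are easy to model. Here is a minimal Python sketch of the idea – not Freenet’s actual implementation, which uses AES and a more elaborate key format; the function names and the CHK string layout below are our invention. The point it illustrates is that the decryption key is derived from the content itself, so a node operator can only recognize a cached block by already knowing what to look for.

```python
import hashlib

def xor_stream(data: bytes, key: bytes) -> bytes:
    # Toy keystream cipher: NOT real crypto, illustration only.
    stream = (key * (len(data) // len(key) + 1))[: len(data)]
    return bytes(a ^ b for a, b in zip(data, stream))

def insert(plaintext: bytes, datastore: dict) -> str:
    # Encrypt content under a key derived from its own hash, then index
    # the ciphertext by a second hash. Returns the full CHK string.
    crypt_key = hashlib.sha256(plaintext).digest()
    ciphertext = xor_stream(plaintext, crypt_key)
    routing_key = hashlib.sha256(ciphertext).hexdigest()
    datastore[routing_key] = ciphertext
    return "CHK@" + routing_key + "," + crypt_key.hex()  # hypothetical format

def fetch(chk: str, datastore: dict):
    # Anyone holding the CHK can locate and decrypt the block; without it,
    # the cached ciphertext reveals nothing about its contents.
    routing_key, crypt_hex = chk[len("CHK@"):].split(",")
    ciphertext = datastore.get(routing_key)
    if ciphertext is None:
        return None
    return xor_stream(ciphertext, bytes.fromhex(crypt_hex))

# A node operator could scan the store against a list of known CHKs,
# but by never doing so remains ignorant of what the cache holds.
datastore = {}
chk = insert(b"some freesite content", datastore)
assert fetch(chk, datastore) == b"some freesite content"
```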

Obviously, this works only so long as cached data that the node’s owner has requested, and cached data that his node has proxied, are indistinguishable. Unfortunately, The Register has discovered that this is not the case for large files.

Behavioral differences

Let there be a file of, say, 700 MB – maybe a movie, maybe warez, possibly illegal – that you wish to have. Your node will download portions of this “splitfile” from numerous other nodes, where they are distributed. To enable you to recover quickly from interruptions during the download, your node will cache all of the chunks it receives. Thus when you restart the download after an interruption, you will download only those portions of the file that you haven’t already received. When the download is complete, the various chunks will be decrypted and assembled, and the file will be saved in your ~/freenet-downloads directory.
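
That resume logic is simple enough to sketch in Python. Everything here is assumed rather than taken from Freenet’s code: fetch_remote stands in for the network request, and the on-disk chunk naming is invented. What matters is the structure – every chunk lands in the local cache whether or not the download completes.

```python
import hashlib
from pathlib import Path

def download_splitfile(chunk_keys, fetch_remote, cache_dir: Path, out_path: Path):
    # Fetch each chunk, skipping any already present in the local cache,
    # then reassemble the file. Chunk naming and sizes are hypothetical.
    cache_dir.mkdir(parents=True, exist_ok=True)
    chunks = []
    for key in chunk_keys:
        cached = cache_dir / hashlib.sha256(key.encode()).hexdigest()
        if cached.exists():           # resume: this chunk survived an interruption
            data = cached.read_bytes()
        else:
            data = fetch_remote(key)  # slow path: request it from other nodes
            cached.write_bytes(data)  # keep it, so a restart needn't re-fetch
        chunks.append(data)
    out_path.write_bytes(b"".join(chunks))
```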

If you destroy the file but leave your cache intact, you can request it again, and the file will appear almost instantly. And there’s the problem.

Freenet distributes files in a way that tends to select for frequently requested, or “popular”, data. This is partly because the other nodes that one’s requests pass through will also cache parts of any files one requests.
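
A toy model makes the effect plain. In the sketch below – a deliberate simplification; Freenet’s real routing and storage policies are far more elaborate – every node on a request’s path caches what it forwards, so a second request for the same data is answered closer to home:

```python
class RelayNode:
    # Toy model of en-route caching: each node on a request's path keeps a
    # copy of every block it forwards, so popular data migrates toward the
    # nodes that ask for it.
    def __init__(self, upstream=None):
        self.cache = {}
        self.upstream = upstream

    def request(self, key):
        if key in self.cache:              # cache hit: answered locally, fast
            return self.cache[key]
        data = self.upstream.request(key)  # miss: forwarded upstream, slow
        self.cache[key] = data             # cached while merely proxying
        return data

# The first request walks the whole chain; a second request for the same
# key is served by the nearest node, which is why a re-download is faster.
origin = RelayNode()
origin.cache["k"] = b"block"
leaf = RelayNode(upstream=RelayNode(upstream=origin))
leaf.request("k")  # populates both intermediate caches along the way
```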

We tested this, and found that a 50 MB file took six hours to download the first time we tried. After we cleared our own local cache, we requested the file again, and it took only two hours, 20 minutes. Clearly, our “neighborhood” nodes had cached a good deal of it while we downloaded it the first time. That behavior is by design, and it’s nothing to be concerned about. The difference in download times between files never downloaded before and ones cached nearby is not revealing, because anyone else nearby might have initiated the request.

However, it is quite easy to distinguish between a large file cached in nearby nodes and one cached locally. And that is a very big deal.

As we noted earlier, a large splitfile will be cached locally to enable quick recovery from download interruptions. The problem is that the entire file is cached. This means that once a file has been downloaded, so long as the local cache remains intact, it can be reconstructed wholly from that cache in minutes, even when the computer is disconnected from the internet. And this holds even when the browser cache is eliminated as a factor.
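
Reusing the invented cache layout from the earlier sketch, the telltale condition is one line: every chunk of the splitfile present locally means the file can be rebuilt offline.

```python
import hashlib
from pathlib import Path

def fully_cached(chunk_keys, cache_dir: Path) -> bool:
    # True when every chunk of a splitfile is present locally: the
    # condition that lets a file rebuild in minutes with no network at all.
    return all(
        (cache_dir / hashlib.sha256(k.encode()).hexdigest()).exists()
        for k in chunk_keys
    )
```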

We tested this by downloading the same 50 MB file and removing it from our ~/freenet-downloads directory, while leaving the local Freenet cache intact. On our second attempt, it “downloaded” in one minute, nine seconds.

We ran the test again after disconnecting our computer from the internet, with the Freenet application still running, and it “downloaded” in one minute, fifty seconds.

So, it took six hours initially; two hours, 20 minutes with neighboring nodes caching it thanks to our request; and less than two minutes with our local cache intact, even when disconnected from the net.

The difference in download time between a splitfile cached locally (a couple of minutes) and one cached nearby (hours) is so great that we can safely dismiss the possibility that any part of it is coming from nearby nodes, even under the best possible network conditions. It’s absolutely clear that the entire file is being rebuilt from the local cache. Forensically speaking, that information is golden.
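
In forensic terms, the test reduces to a stopwatch. Here is a hedged sketch, with fetch_file standing in for whatever client interface an examiner would drive, and thresholds drawn loosely from our own timings:

```python
import time

def classify(fetch_file, chk: str) -> str:
    # Time a full re-request of a splitfile. The threshold is drawn
    # loosely from the tests above and is illustrative, not calibrated.
    start = time.monotonic()
    fetch_file(chk)
    elapsed = time.monotonic() - start
    if elapsed < 10 * 60:  # minutes, not hours: rebuilt from the local cache
        return "downloaded on this machine"
    return "origin indeterminate (served by the network)"
```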