How Governments Censor the Internet in Real Life: Analyzing Blocking Mechanisms
A group of researchers from India has conducted a thorough analysis of new censorship mechanisms, approaches, and technologies used by governments worldwide. They took their home country as an example and examined blocking tools used by internet providers in India to restrict access to forbidden information. Researchers also studied the effectiveness of these methods and the probability of bypassing them. Today we will review the main facts from this study.
In recent years, researchers around the world have conducted dozens of studies of internet censorship methods. The main issue with these studies is that most of them focus on countries that are not considered free societies, such as North Korea or Iran. These states implement hard blocks: North Korea does not even allow its citizens to connect to the internet. Instead, they are confined to a large national intranet.
However, even democratic countries like India are building massive infrastructure for internet censorship. The methods used by these countries are much more flexible and therefore more interesting to study.
During the preparation of the study, researchers came up with a list of 1200 potentially blocked websites (PBWs). They gathered information from open sources like Citizen Lab or Herdict. Then they acquired internet connections via the nine most popular ISPs in India.
To determine whether a site was censored, the researchers first turned to the OONI tool.
OONI vs custom script for detecting blockades
OONI proved ineffective, however: it produced many false positives, and manual verification of the results revealed numerous inaccuracies.
The poor results may have been due to outdated detection mechanisms in OONI. For example, to detect DNS filtering, the tool compares the IP address that Google DNS (assumed to be uncensored) returns for a host with the IP address returned for the same host over the ISP connection.
If the addresses do not match, OONI reports a block. However, there are many legitimate situations in which the addresses differ for reasons unrelated to censorship, for example when the site is served through a CDN.
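To illustrate why a plain mismatch test misfires, here is a minimal sketch of the comparison described above. It is not OONI's actual code: the function name and the sample addresses (taken from the reserved documentation ranges) are assumptions made for illustration.

```python
def naive_dns_mismatch(control_ips, isp_ips):
    """Report a 'block' whenever the two resolvers disagree on the answer set."""
    return set(control_ips) != set(isp_ips)


# Hypothetical answers for a CDN-hosted site: both resolvers are honest,
# but each one is directed to a different edge server.
control_answer = ["192.0.2.10", "192.0.2.11"]  # e.g. from the control resolver
isp_answer = ["198.51.100.20"]                 # e.g. from the ISP resolver

cdn_flagged = naive_dns_mismatch(control_answer, isp_answer)
print(cdn_flagged)  # the uncensored CDN site is still reported as blocked
```

The check returns a positive result even though nothing was censored, which is the kind of false positive that manual verification would have to weed out.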
Thus, the researchers had to develop their own censorship detection script. Below is a list of the popular blocking methods analyzed in the study, along with an assessment of their effectiveness.
Censorship mechanisms, and what exactly are middleboxes?
The analysis revealed that all blocking methods have one thing in common: they are implemented using dedicated network elements, which the researchers call middleboxes. These intermediaries intercept and analyze user traffic. If an attempt to connect to a censored website is detected, the middlebox injects specially crafted packets into the traffic.
To detect the presence of middleboxes, the researchers developed a new method called Iterative Network Tracing (INT). It works on the same principle as the traceroute tool: web requests to censored websites are sent with incrementally increasing TTL values in the IP header, so that the hop at which the censorship response first appears indicates where the middlebox sits.
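The core loop of such a trace can be sketched as follows. This is only an illustration of the idea, not the researchers' implementation: the `probe` callable is hypothetical and stands in for code that would send the web request with the given TTL and classify what comes back, and the three outcome labels are assumptions.

```python
from typing import Callable, Optional

# Assumed outcome labels returned by the hypothetical probe.
EXPIRED = "expired"    # a router dropped the packet because the TTL ran out
CENSORED = "censored"  # a block notification or injected packet came back
SERVER = "server"      # the genuine server answered


def find_middlebox_hop(probe: Callable[[int], str], max_ttl: int = 30) -> Optional[int]:
    """Return the smallest TTL at which a censorship response appears, or None."""
    for ttl in range(1, max_ttl + 1):
        outcome = probe(ttl)
        if outcome == CENSORED:
            return ttl   # the request first reached a middlebox at this hop
        if outcome == SERVER:
            return None  # reached the real server without interference
    return None


def fake_probe(ttl: int) -> str:
    """Simulated path with a middlebox at hop 4 (for demonstration only)."""
    return CENSORED if ttl >= 4 else EXPIRED


print(find_middlebox_hop(fake_probe))  # prints 4
```

A real probe would need to set the TTL on outgoing packets, which typically requires raw sockets or a packet-crafting library. That part is deliberately left out of the sketch.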
DNS resolution is the first step in accessing any website: the domain name in the URL entered by the user is resolved to an IP address. With DNS blocking, the censor intervenes at this stage. A resolver controlled by the censor returns an incorrect IP address, and as a result the target website does not open (DNS poisoning).
Another blocking method is DNS injection. In this case, a middlebox intercepts the DNS request and sends back a forged response containing an incorrect IP address.
To detect DNS blocking applied by ISPs, the researchers used Tor circuits with exit nodes in uncensored countries. If a website opens via Tor but does not load over the ISP connection, it is considered blocked.
After detecting a website blocked via DNS, the researchers determined the exact blocking method.
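The Tor-based comparison described above boils down to a small decision rule. The sketch below is one way to write it down. The function name and the "inconclusive" outcome (for sites that fail over both paths and may simply be offline) are assumptions, not part of the study.

```python
def classify_reachability(opens_via_tor: bool, opens_via_isp: bool) -> str:
    """Compare reachability over Tor (control path) and over the ISP connection."""
    if opens_via_isp:
        return "reachable"     # nothing to explain: the ISP path works
    if opens_via_tor:
        return "blocked"       # works from an uncensored vantage point only
    return "inconclusive"      # fails everywhere: the site may just be down


print(classify_reachability(opens_via_tor=True, opens_via_isp=False))  # prints blocked
```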
TCP/IP packet filtering
Blocking via packet header filtering is a popular form of internet censorship. Many studies and experiments have been dedicated to detecting such blocks.
The problem is that this method is easily confused with ordinary network errors or failures. Unlike HTTP-based blocking, TCP/IP filtering gives the user no notification from the censor: the target website simply does not load. This makes it hard to distinguish censorship from common network failures.
Nevertheless, the researchers tried to solve this problem using the TCP handshake. Handshake packets were first tunneled via Tor circuits with exit nodes in uncensored countries. If the website could be reached via Tor, the handshake was then repeated over the ISP connection five times in a row, with a delay of approximately two seconds between attempts. If all of these attempts failed, deliberate filtering was highly probable.
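The retry procedure can be sketched roughly as below. This is a simplified illustration under stated assumptions: the function names are hypothetical, the handshake is attempted with an ordinary TCP connect over the direct connection, and the Tor reachability result is passed in as a flag rather than measured.

```python
import socket
import time


def handshake_ok(host, port=80, timeout=5.0, connect=socket.create_connection):
    """Return True if a TCP connection (three-way handshake) to host succeeds."""
    try:
        conn = connect((host, port), timeout=timeout)
    except OSError:  # covers timeouts, resets, and unreachable hosts
        return False
    conn.close()
    return True


def looks_filtered(host, reachable_via_tor, attempts=5, delay=2.0,
                   try_handshake=handshake_ok, sleep=time.sleep):
    """Suspect filtering only if the site works via Tor but every direct try fails."""
    if not reachable_via_tor:
        return False  # the site may simply be down, so no conclusion is drawn
    for i in range(attempts):
        if try_handshake(host):
            return False  # one successful handshake rules out filtering
        if i < attempts - 1:
            sleep(delay)  # pause between attempts, roughly two seconds
    return True
```

The `try_handshake` and `sleep` parameters exist only so that the logic can be exercised without touching the network, for example `looks_filtered("example.com", True, try_handshake=lambda h: False, sleep=lambda s: None)`.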
This type of blocking was not detected during the research.
Unlike TCP/IP filtering, HTTP-based blocking was detected at five of the nine ISPs. This method inspects the content of HTTP packets and is carried out by middleboxes.
To detect HTTP filtering, the researchers again created Tor circuits with exit nodes in uncensored countries. They then compared the web content received in response to requests sent via the ISP with the content received via Tor.
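One simple way to compare the two responses is a text similarity score. The sketch below uses Python's standard `difflib` module. The function name and the 0.5 threshold are assumptions chosen for illustration, and the study may have compared the content differently.

```python
from difflib import SequenceMatcher


def content_differs(isp_body: str, tor_body: str, threshold: float = 0.5) -> bool:
    """Return True when the page fetched via the ISP barely resembles the Tor copy."""
    similarity = SequenceMatcher(None, isp_body, tor_body).ratio()
    return similarity < threshold


# Hypothetical page bodies, for demonstration only.
tor_copy = "<html><body><h1>Daily news</h1><p>Top stories of the day</p></body></html>"
isp_copy = "<html><body>This website has been blocked.</body></html>"
print(content_differs(isp_copy, tor_copy))
```

Dynamic pages differ slightly on every load, so an exact equality test would produce false positives. A tolerance of this kind is one way to absorb that noise.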
The main task here was to determine the moment at which the block is applied. For example, for some ISPs, the reply to an HTTP GET request was an HTTP 200 OK response containing a blocking notification, sent with the TCP FIN flag set. This response forces the browser to close the connection to the target website. However, a packet from the real site could still arrive afterwards. It was therefore hard to tell what triggered the block: the request sent by the user or the website's response.
The researchers resolved this with a simple manipulation. In the header of the HTTP GET request, they replaced the field name Host with HOST. HTTP header names are case-insensitive, so the server still accepts such a request, but this change was enough to make the blocked website open again. This is evidence that the censors monitor only the requests sent by the user, not the server's replies.
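The manipulation amounts to changing the case of one header name in the raw request. A minimal sketch of building such a request is shown below. The function name and parameters are hypothetical, and `example.com` is a placeholder host.

```python
def build_get_request(host: str, path: str = "/", host_header_name: str = "Host") -> bytes:
    """Build a raw HTTP/1.1 GET request with a configurable Host header name."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"{host_header_name}: {host}",  # e.g. "HOST: example.com" instead of "Host: ..."
        "Connection: close",
        "",  # blank line terminates the header section
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


normal = build_get_request("example.com")
altered = build_get_request("example.com", host_header_name="HOST")
print(altered.decode("ascii"))
```

Sending `altered` over a plain TCP socket to port 80 of the target would reproduce the test. A filter that matches the exact string `Host:` would not recognize the altered request, while a standards-compliant server treats both forms identically.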
The research also showed that an ISP often did not apply a block itself, leaving this task to the ISPs that manage neighboring network segments. During the experiments, several providers did not perform any blocking themselves, yet their users were still unable to connect to websites blocked in the country.