How to Catch When Proxies Lie: Network Proxy Service Location Verification Using An Active Geolocation Algorithm
People all over the world use commercial proxies to hide their location, either to access blocked content or to protect their privacy. However, little is known about the actual physical location of the servers that proxy providers use. This is a crucial factor that directly affects whether a particular proxy is secure enough to use.
A group of US researchers from the University of Massachusetts, Carnegie Mellon University, and Stony Brook University published a paper examining the accuracy of claims made by seven popular proxy providers concerning the physical location of their servers. Here are the main takeaways of this research.
Proxy operators often do not share any information that might be used to verify the accuracy of their claims about the physical location of the servers used. IP-to-location databases usually confirm the data shared by providers. However, there is evidence of errors in such databases.
During their work, the researchers assessed the location of 2269 proxy servers operated by seven companies and advertised as located in 222 countries and territories. The analysis showed that at least one-third of these servers are not in the countries the providers claim. Instead, they are in countries with cheap and reliable hosting, such as the Czech Republic, Germany, the Netherlands, the UK, or the US.
Server location analysis
Commercial proxy and VPN providers can manipulate IP-to-location databases, for example, by changing the location codes in router names. As a result, they can advertise a wide range of server locations and point to IP-to-location databases as confirmation, while in reality their hardware sits in a short list of countries.
To verify the actual location of proxy servers, the researchers used an active geolocation algorithm, which measures the round-trip time of a packet sent to the server and then on to other known hosts on the internet.
Fewer than 10% of proxy servers respond to a ping, and the researchers could not install any tracking software on the servers themselves. They could only send packets through the proxy and measure the round-trip time. The overall round-trip time of a packet sent to a website via a proxy is the time needed to reach the proxy plus the time from the proxy to the destination.
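The relationship between the two legs can be written as a small calculation. The following is a minimal sketch, not the researchers' code: the function name and the sample values are hypothetical, and it assumes the client-to-proxy time has been measured separately.

```python
def proxy_to_host_rtt(total_rtt_ms: float, client_to_proxy_rtt_ms: float) -> float:
    """Estimate the proxy-to-destination round-trip time.

    total_rtt_ms: time measured for a request sent through the proxy.
    client_to_proxy_rtt_ms: time for the client-to-proxy leg alone
    (assumed here to be measured separately).
    """
    leg = total_rtt_ms - client_to_proxy_rtt_ms
    # Measurement noise can push the difference below zero; clamp it.
    return max(leg, 0.0)


# Hypothetical measurements, in milliseconds.
print(proxy_to_host_rtt(180.0, 65.0))  # 115.0
```

The remaining value is the part that carries information about where the proxy is relative to the destination host.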
The researchers developed specialized software based on four active geolocation algorithms: CBG, Octant, Spotter, and a hybrid of Octant and Spotter. The code of the solution is available on GitHub.
Since they could not rely on IP-to-location databases, the researchers used a list of RIPE Atlas anchor hosts. This database is accessible online, regularly updated, and contains locations known to be correct. Moreover, hosts from the list continually ping each other and publish the round-trip times in a public database.
The final solution is a web app that tries to establish an HTTPS connection over TCP to the unsecured HTTP port 80. If the server is not listening on port 80, the connection fails after a single round-trip. If the server is listening on this port, the browser replies to the SYN-ACK with a TLS ClientHello packet, which triggers a protocol error, and the browser reports a failure after a second round-trip.
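To give a feel for how delay-based geolocation works, here is a simplified sketch in the spirit of CBG (constraint-based geolocation). It is an illustration under stated assumptions, not the researchers' implementation: it uses a fixed speed bound of about 200 km per millisecond (roughly two-thirds of the speed of light) instead of per-landmark calibration, and all coordinates and timings are hypothetical. Each round-trip time limits how far the target can be from a landmark, and a claimed location is plausible only if it falls within every such limit.

```python
import math

EARTH_RADIUS_KM = 6371.0
# Assumed upper bound on signal speed: roughly 2/3 of the speed of light,
# i.e. about 200 km per millisecond.
MAX_SPEED_KM_PER_MS = 200.0


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance between two points given in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def max_distance_km(rtt_ms):
    """Farthest a host can be from a landmark given the round-trip time."""
    return (rtt_ms / 2) * MAX_SPEED_KM_PER_MS


def location_is_feasible(claimed, measurements):
    """True if `claimed` (lat, lon) lies inside every landmark's disc.

    measurements: iterable of ((lat, lon), rtt_ms) pairs.
    """
    return all(
        great_circle_km(*claimed, *landmark) <= max_distance_km(rtt_ms)
        for landmark, rtt_ms in measurements
    )


# Hypothetical data: landmarks in Frankfurt and Amsterdam, both a few ms away.
measurements = [((50.11, 8.68), 6.0), ((52.37, 4.90), 4.0)]
print(location_is_feasible((50.94, 6.96), measurements))     # Cologne: True
print(location_is_feasible((-33.87, 151.21), measurements))  # Sydney: False
```

A server that answers nearby European landmarks within a few milliseconds cannot be in Australia, whatever its provider or an IP-to-location database says.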
Using this method, the app can measure the time of one or both round-trips. There is also a CLI version of the tool.
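One common way to time a single round-trip from a command-line program is to time the TCP handshake. The sketch below illustrates that idea only; it is not the researchers' tool, and the function name and defaults are assumptions. To measure through a proxy, it would have to run on a machine whose traffic is routed through that proxy.

```python
import socket
import time


def tcp_connect_rtt_ms(host: str, port: int = 80, timeout: float = 5.0) -> float:
    """Time one TCP handshake, which takes about one network round-trip.

    Raises OSError if the name cannot be resolved, the connection is
    refused, or the attempt times out.
    """
    # Resolve the name first so DNS lookup time is not counted.
    family, socktype, proto, _, sockaddr = socket.getaddrinfo(
        host, port, type=socket.SOCK_STREAM
    )[0]
    with socket.socket(family, socktype, proto) as sock:
        sock.settimeout(timeout)
        start = time.perf_counter()
        sock.connect(sockaddr)  # returns once the handshake completes
        return (time.perf_counter() - start) * 1000.0
```

Calling the function several times against the same host and keeping the smallest value reduces the effect of queuing delays, since delay-based geolocation is interested in the fastest path a packet can take.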
None of the tested proxy providers shares the specific location of its servers. They might name the city, but most often they give only the country. Moreover, even when a city is mentioned, there is no guarantee that the information is correct. The researchers found cases where a server's configuration file was named after one city (usa.new-york-city.cfg) but the instructions in the file directed the connection to another city (chicago.vpn-provider.example).
The researchers were able to confirm the physical location of 989 of the 2269 IP addresses tested during the experiment. For 642 addresses they could not get reliable data, while 638 were confirmed not to be in the country the provider had claimed. More than 400 of these misplaced addresses were not even on the continent where they should have been according to the provider's marketing materials.
Suspicious hosts were detected at every one of the seven tested providers. The researchers contacted these companies directly, but they declined to comment on the matter.