Using stand-off observation and measurement to understand aspects of the global internet
Date
2022
Publisher
University of Delaware
Abstract
The Internet has become a central part of our daily lives. At the same time, it is a very complex system, and understanding the nature of its ecosystem from different perspectives is challenging. To extend our knowledge of the global Internet, we design unique active and passive measurements to study several of its crucial components, including anycast in global routing, the open proxy ecosystem, and transparent proxy systems.

Anycast has been widely adopted by today's Internet services, including DNS, CDN, and DDoS protection. Prior research has focused on various aspects of anycast, either its usage in particular services such as DNS or the characterization of its adoption through Internet-wide active probing. We first explore an alternative approach that characterizes anycast based on previously collected global BGP routing information. Using state-of-the-art active measurement results as near-ground-truth, we show that our passive method, which requires no Internet-wide probes, can achieve high accuracy in detecting anycast prefixes. While investigating the root causes of inaccuracy, we reveal that anycast routing has become entangled with the increased adoption of remote peering. Because remote peering is invisible at layer 3, it breaks the shortest-AS-path assumption in BGP and has an unintended impact on anycast performance.

Open proxies provide free relay services and are widely used to browse the Internet anonymously, avoid geographic restrictions, and circumvent censorship. To shed light on the open proxy ecosystem and characterize the behavior of open proxies, we conduct a large-scale, comprehensive study. We characterize open proxies based on active and passive measurements and examine their network and geographic distributions, performance, and deployment. In particular, to obtain a broader and more in-depth understanding, we analyze two groups of open proxies: cloud-based proxies and long-term proxies. To process and analyze the enormous number of responses, we design a lightweight method that classifies and labels proxies based on DOM structure, which defines the logical structure of Web documents. Furthermore, we parse the response contents to extract information that identifies the owners of proxies and to track their activities in deploying malicious proxies. We reveal that some owners regularly change their proxy deployment to avoid being blocked and deploy more proxies to expand their malicious attacks.

Transparent proxies are a type of web proxy that sits between clients and web servers and intercepts the requests and responses exchanged between them. In this work, we study an overlooked issue in web browsing: the hidden interception of the HTTP path by on-path devices, which previous work has not thoroughly studied or fully understood. We propose a novel method that uses designed requests to detect this interception and thereby discover hidden transparent proxies. We characterize various aspects of transparent proxies, including their geographic and AS-level distribution, server hosting, software, and services. We also investigate the vulnerabilities of transparent proxies and examine their impact on end users.
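The abstract mentions labeling proxy responses by DOM structure but does not give the procedure. As a rough illustration only, and not the authors' actual method, one way to group responses by structure is to reduce each HTML body to its sequence of tags and hash the result. The names `TagSkeleton`, `dom_signature`, and `group_by_structure` below are hypothetical, and the choice to ignore text and attributes is an assumption made for this sketch.

```python
import hashlib
from collections import defaultdict
from html.parser import HTMLParser


class TagSkeleton(HTMLParser):
    """Records the sequence of start and end tags, ignoring text and attributes."""

    def __init__(self):
        super().__init__()
        self.tokens = []

    def handle_starttag(self, tag, attrs):
        self.tokens.append("<" + tag + ">")

    def handle_endtag(self, tag):
        self.tokens.append("</" + tag + ">")


def dom_signature(html):
    """Return a hash of the tag skeleton of an HTML document (assumed fingerprint)."""
    parser = TagSkeleton()
    parser.feed(html)
    parser.close()
    skeleton = "".join(parser.tokens)
    return hashlib.sha256(skeleton.encode("utf-8")).hexdigest()


def group_by_structure(responses):
    """Group proxy identifiers whose response bodies share the same tag skeleton.

    `responses` maps a proxy identifier to the HTML body it returned.
    """
    groups = defaultdict(list)
    for proxy_id, body in responses.items():
        groups[dom_signature(body)].append(proxy_id)
    return dict(groups)


if __name__ == "__main__":
    sample = {
        "proxy-a": "<html><body><h1>Hello</h1><p>one</p></body></html>",
        "proxy-b": "<html><body><h1>Other</h1><p class='x'>two</p></body></html>",
        "proxy-c": "<html><body><script>x()</script><h1>Hello</h1></body></html>",
    }
    for signature, members in group_by_structure(sample).items():
        print(signature[:12], members)
```

In this sketch, two responses that differ only in text or attribute values fall into the same group, while a response with an injected element (such as an extra `script` tag) gets a different signature. An exact hash is deliberately strict; a real pipeline would likely need a tolerance for small structural differences.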
Keywords
Anycast, Computer networking, Cybersecurity, Measurement, Open proxy, Transparent proxy