What is the open web?
The internet that we use for everyday activities, like browsing, searching, reading the news, online shopping and social media, is known as the open web. Organizations also rely on it for their operations, from customer outreach and advertising to sophisticated commerce engines and live event streaming. This layer of the web is also commonly referred to as the “surface” or “clear” web. These terms help differentiate it from the deep web, which contains unindexed content (such as database records and research papers, often protected by paywalls), and from the dark web, which requires specialized software for access and is designed to safeguard its website owners’ anonymity. Unlike its deep and dark counterparts, the open web is truly open: its content is indexed by common search engines, which collect it and present it to users in response to their queries.
What should online investigators watch out for when researching on the open web?
Even when investigating on the open web, analysts need to take measures to protect their devices from accidental exposure to malware, and to protect themselves from inadvertently exposing their identity and intent. Traditional browsers such as Chrome, Firefox or Safari expose an array of information about the user’s device, browsing activity and online behaviors, which websites use for tracking. Search engines and websites may display differently based on your location, how much time you spend on a particular page, the browser and device you’re using, etc. These details are correlated with your online behaviors across different sites to paint a more complete picture of your specific interests. Most of this information is monetized and resold, but for an online investigator this type of tracking can present a problem. The same tracking mechanisms that enable personalized ads and simplify shopping experiences can be exploited by adversaries and investigative targets. Data collected over time and across sites can give away an investigator’s identity and intent.
What specifically is being tracked when investigating on the open web?
In addition to cookies, there are many other types of data that websites track to help profile and identify you. Your digital fingerprint includes everything from which links you click on (and which ones you skip) to the type of connection you use (IP address and provider), your hardware (device type, OS, video and audio cards), configurations (keyboard and language settings, time zones, etc.), installed software and plugins, and even seemingly random things like battery status. All of this information helps websites and trackers recognize you across sessions. And while millions of web users around the world have similar devices and search for the same terms, the data exposed by traditional browsers allows users to be fingerprinted based on small differences and distinct combinations of settings and behaviors that make your online presence highly distinctive.
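To illustrate why a combination of ordinary settings can be so distinctive, here is a minimal sketch that hashes a set of attributes into a single identifier. The attribute names and values are hypothetical, and real trackers use their own signals and methods; the point is only that changing one setting yields a different fingerprint, while an unchanged profile yields the same one every time.

```python
import hashlib
import json


def fingerprint(attributes: dict) -> str:
    """Hash a set of browser/device attributes into a short, stable identifier."""
    # Sort keys so the same attributes always serialize the same way.
    canonical = json.dumps(attributes, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]


# Hypothetical attribute values, for illustration only.
visitor_a = {
    "os": "Windows 11",
    "language": "en-US",
    "time_zone": "America/New_York",
    "screen": "1920x1080",
    "plugins": ["pdf-viewer"],
    "battery_level": 0.87,
}

# Same device profile except for a single setting.
visitor_b = dict(visitor_a, time_zone="America/Chicago")

fp_a = fingerprint(visitor_a)
fp_b = fingerprint(visitor_b)
print("visitor A:", fp_a)
print("visitor B:", fp_b)
print("same fingerprint?", fp_a == fp_b)
```

Because the identifier is derived from the device and its settings rather than stored on it, clearing cookies does not change it.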
Do VPNs and private browsing make investigating on the open web safer?
It’s important to understand that simply turning off the most commonly used cookies, connecting through a VPN or switching to Chrome’s Incognito or another “private browsing” mode does not fully protect investigators’ identities or ensure anonymity.
A VPN can change or “spoof” the geographic location that your device appears to connect from, but it’s unable to alter any other elements of your online identity, leaving your device susceptible to identification. While it’s important for analysts to disguise their machine’s location, VPNs don’t address many other components of the “location narrative,” such as language, time zone and keyboard settings, among others.
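As a rough sketch of how an inconsistent “location narrative” could be spotted, the example below compares the country implied by a session’s IP address with the time zone and language the browser reports. The `EXPECTED` table, the `Session` fields and the `narrative_mismatches` function are all hypothetical, chosen for illustration; this is not how any particular website performs the check.

```python
from dataclasses import dataclass

# Hypothetical lookup table: country code -> settings that would look
# ordinary for a visitor connecting from that country.
EXPECTED = {
    "DE": {"time_zones": {"Europe/Berlin"}, "languages": {"de", "en"}},
    "US": {
        "time_zones": {
            "America/New_York",
            "America/Chicago",
            "America/Denver",
            "America/Los_Angeles",
        },
        "languages": {"en", "es"},
    },
}


@dataclass
class Session:
    ip_country: str  # country implied by the (possibly VPN) IP address
    time_zone: str   # time zone reported by the browser
    language: str    # language tag reported by the browser, e.g. "en-US"


def narrative_mismatches(session: Session) -> list:
    """Return the settings that contradict the country implied by the IP."""
    expected = EXPECTED.get(session.ip_country)
    if expected is None:
        return []  # no baseline for this country in the toy table
    problems = []
    if session.time_zone not in expected["time_zones"]:
        problems.append("time_zone")
    # Compare only the primary language subtag ("en" from "en-US").
    if session.language.split("-")[0].lower() not in expected["languages"]:
        problems.append("language")
    return problems


# An analyst in New York connecting through a VPN exit node in Germany.
vpn_session = Session(ip_country="DE", time_zone="America/New_York", language="en-US")
print(narrative_mismatches(vpn_session))
```

In this sketch the VPN changes only `ip_country`; the time zone still comes from the analyst’s machine, so the session is flagged.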
Private browsing blocks certain cookies to limit tracking while you search the web. However, search engines and websites can still track your activity in other ways. Even if cookies are blocked in the browser settings, supercookies and other trackers can remain active. Below are some examples of information being tracked online, even while using private browsing:
- Canvas fingerprinting: Draws a hidden image in the browser to take a fingerprint of the rendering engine
- ETags: Cache identifiers that can continue to track items such as what content you’ve already viewed or clicked on a page
- Battery status API: Can be used to continuously identify a mobile phone across multiple contexts
With all this unmanaged information still flowing from analysts’ machines, VPNs and private browsing modes fall short of truly concealing the investigator’s identity and intent.
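The ETag item in the list above can be sketched as a small simulation. `ETag` and `If-None-Match` are standard HTTP headers; the `TrackingServer` and `BrowserCache` classes are hypothetical stand-ins, and the sketch does not model how any specific browser handles its cache in private mode. It only shows how a cache validator can double as a visitor identifier with no cookie involved.

```python
import uuid


class TrackingServer:
    """Toy server that uses the ETag cache validator as a visitor ID."""

    def __init__(self):
        self.visits = {}  # ETag value -> number of requests seen

    def handle(self, request_headers: dict):
        etag = request_headers.get("If-None-Match")
        if etag in self.visits:
            # The browser echoed back an ETag we issued: a returning visitor.
            self.visits[etag] += 1
            return 304, {"ETag": etag}
        # First contact: issue a unique ETag that the browser will cache.
        etag = '"' + uuid.uuid4().hex + '"'
        self.visits[etag] = 1
        return 200, {"ETag": etag}


class BrowserCache:
    """Toy browser that revalidates a cached resource using its ETag."""

    def __init__(self):
        self.etag = None

    def request(self, server: TrackingServer) -> int:
        headers = {"If-None-Match": self.etag} if self.etag else {}
        status, response_headers = server.handle(headers)
        self.etag = response_headers["ETag"]
        return status


server = TrackingServer()
browser = BrowserCache()
first = browser.request(server)   # nothing cached yet
second = browser.request(server)  # cached ETag is sent back automatically
print(first, second, server.visits[browser.etag])
```

The second request identifies the browser purely through normal cache revalidation, which is why blocking cookies alone does not stop this kind of tracking.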
How to minimize risk when investigating on the open web
Increasing the success rate of investigations relies on secure, anonymous access to credible information. Minimizing risk is key, and that requires a solution purpose-built to protect analysts, organizations and the integrity of data collected as evidence. A managed attribution service like Silo for Research conceals identities during online research, providing the anonymity and access investigators need. From financial fraud specialists to corporate security and trust and safety teams to law enforcement, analysts can more safely, easily and efficiently conduct anonymous research on both the dark and open web to maximize productivity and improve outcomes.