Google’s mission is to “organize the world's information and make it universally accessible and useful.” With this mission statement, the search engine strives to make it easy for people to find information on a variety of topics, from many sources, in any format, and present it in the most useful way. Some answers are easy to find — most people know how to type specific terms and phrases into the search bar. But research analysts and online investigators need to become familiar with more advanced tools and techniques to get higher fidelity outcomes and zero in on specific search results.
For people who make online research their profession, it is important to know how to get the best, most accurate data and how to do it safely, without jeopardizing their mission or alerting the person or organization they are investigating.
The Google search engine is free to use, but, as a former Google design ethicist famously put it, “if you're not paying for the product, then you are the product.” It's no secret that Google and its parent company, Alphabet, collect dozens of data points about their users: interests, locations, language preferences, activities, patterns of life and more. This data helps the search engine personalize results and present information that’s more relevant to each user. It also makes it harder to do anonymous research without being tracked online. For analysts, investigators and online researchers, Google is a treasure trove of information, but they need to tread carefully to protect their missions and identities.
Getting higher fidelity results with basic search techniques
To rank search results, Google runs webpages through a proprietary algorithm that reportedly takes into account over 200 different factors and uses AI to interpret natural-language queries. Some of the elements that determine how a page is ranked include the quality and distinctiveness of its content, the frequency and location of keywords, the page’s overall appearance, loading speed, ease of navigation and keyword organization.
Google also looks at how a website relates to other sites, such as how many pages it links to and how many pages link back to it. Naturally, website owners will use all available means, legitimate and otherwise, to improve their ranking: refreshing their sites, adding visual elements, targeting specific keywords and content, and maximizing their backlinks to appear more relevant.
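To make the idea of outbound links and backlinks concrete, here is a minimal Python sketch that counts both for a toy link graph. The site names and the LINKS data are hypothetical, and the counting is only an illustration of the concept; it is not Google's ranking algorithm, which is proprietary.

```python
from collections import defaultdict

# Hypothetical toy link graph: each key is a page, each value lists the pages it links to.
LINKS = {
    "a.example": ["b.example", "c.example"],
    "b.example": ["c.example"],
    "c.example": ["a.example"],
    "d.example": ["c.example"],
}


def count_links(links):
    """Return {page: (outbound_count, inbound_count)} for every page in the graph."""
    inbound = defaultdict(set)
    for source, targets in links.items():
        for target in targets:
            if target != source:  # ignore pages linking to themselves
                inbound[target].add(source)
    pages = set(links) | set(inbound)
    return {
        page: (len(set(links.get(page, [])) - {page}), len(inbound[page]))
        for page in pages
    }


counts = count_links(LINKS)
# List pages with the most backlinks first.
for page in sorted(counts, key=lambda p: counts[p][1], reverse=True):
    outbound_count, inbound_count = counts[page]
    print(f"{page}: links out={outbound_count}, links in={inbound_count}")
```

In this toy graph, c.example collects the most backlinks, which is the kind of signal that site owners try to maximize.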
Searching with Google means sifting through lots of information, and many marketers will stop at nothing to push their pages higher in the rankings, which can translate into sizable increases in revenue. For analysts, it helps to know a few tricks that narrow down search results and surface the precise information they need faster.
- Putting a phrase inside quotes returns only results that contain that exact phrase (e.g.: “cyber security”)
- Adding “OR” in capital letters between keywords or phrases returns results that match either term (e.g.: malware OR ransomware)
- To limit a search to a specific website or domain, add “site:” directly before the website or domain name, with no space in between (e.g.: site:authentic8.com or site:.edu)
- To exclude a word from a search, put a minus sign (-) directly in front of the term you want to omit (e.g.: virus -COVID). Google does not treat the word “NOT” as an operator
All of these are relatively common search techniques, and analysts can find many helpful articles, including from Google itself, to learn what’s available to them.
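The basic operators can also be combined programmatically. The following Python sketch assembles a query string from an exact phrase, OR alternatives, a site restriction and excluded terms, then URL-encodes it. The function names build_query and search_url are hypothetical helpers invented for this illustration, and the sketch assumes the familiar google.com/search address with its q parameter.

```python
from urllib.parse import urlencode


def build_query(exact=None, any_of=(), site=None, exclude=()):
    """Combine the basic operators into one search string (hypothetical helper)."""
    parts = []
    if exact:
        parts.append(f'"{exact}"')            # exact phrase in quotes
    if any_of:
        parts.append(" OR ".join(any_of))     # match either term
    if site:
        parts.append(f"site:{site}")          # no space after the colon
    parts.extend(f"-{term}" for term in exclude)  # minus sign excludes a term
    return " ".join(parts)


def search_url(query):
    """URL-encode the query; assumes the standard 'q' search parameter."""
    return "https://www.google.com/search?" + urlencode({"q": query})


query = build_query(
    exact="cyber security",
    any_of=["malware", "ransomware"],
    site="authentic8.com",
    exclude=["COVID"],
)
print(query)
print(search_url(query))
```

Building the string in one place makes it easier to keep the operator syntax consistent, such as the missing space after site: and the leading minus sign.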
Searching with Google dorking and other advanced tools
Beyond simple queries, analysts can use advanced operators to find information that’s not readily available to a casual searcher. Google dorking (a.k.a. “Google hacking”) is a technique for using specific search terms to locate information that’s not meant to be seen but is not well guarded. Google dorks are often used by hackers to find usernames and passwords, steal financial information and uncover vulnerabilities within web servers. A whole repository, the Google Hacking Database, exists to compile and document specific dork queries used to find poorly configured servers that can expose sensitive information, reveal unsecured webcams, give unprotected access to internal corporate documents and directories, and more. New dorks are being invented and added every day, and they are used both by webmasters to test and locate flaws within their pages and by hackers to exploit these flaws.
Here are a few examples of Google dorks:
- intitle: tells Google to show only those pages that have the desired search term in their HTML title
- inurl: searches for a specific term contained within the URL
- define: returns definitions for a specific search term
- link: was designed to find pages that link to a specific URL, although Google has reportedly deprecated it and its results are no longer reliable
- filetype: returns results for a specific file type (e.g.: filetype:ppt)
- allinanchor: shows pages for which the anchor text of links pointing to them contains all of the search terms
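As an illustration of how these operators combine, the Python sketch below builds a dork-style query that a webmaster might use to check their own domain for exposed documents. The dork function and the KNOWN_OPERATORS set are hypothetical names created for this example, and example.com stands in for a domain you are authorized to audit.

```python
# Operators taken from the lists in this article; each is followed by a colon in a query.
KNOWN_OPERATORS = {"site", "intitle", "inurl", "define", "link", "filetype", "allinanchor"}


def dork(**filters):
    """Build a query such as 'site:example.com filetype:pdf' (hypothetical helper)."""
    parts = []
    for operator, value in filters.items():
        if operator not in KNOWN_OPERATORS:
            raise ValueError(f"unsupported operator: {operator}")
        value = str(value).strip()
        if " " in value:
            value = f'"{value}"'  # quote multi-word values so they stay attached to the operator
        parts.append(f"{operator}:{value}")
    return " ".join(parts)


# Look for PDF files on your own domain whose title suggests they were not meant to be public.
audit_query = dork(site="example.com", filetype="pdf", intitle="internal use only")
print(audit_query)
```

Rejecting unknown operator names is a deliberate choice in this sketch: a mistyped operator would otherwise be treated as a plain keyword and quietly change what the search returns.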
For research analysts, Google dorks can mean getting results faster and with better precision, but analysts need to exercise extreme caution, as many of these dorks are created for nefarious purposes. Just last year, a Swiss hacker named Till Kottmann was indicted for conspiracy, wire fraud and aggravated identity theft. Using Google dorking techniques, Kottmann and their associates reportedly found “super admin” credentials to internet-connected cameras in several U.S. facilities, including prisons, hospitals, warehouses, and even a Tesla factory. A few years prior, Iranian hackers reportedly used Google dorks to gain access to government-owned computer systems at a dam near New York City.
Aside from dorking, researchers have at their disposal several other tools that can help refine their results. For example:
- Google Alerts: A tool from Google that’s designed to update users on new search results for their queries. When new results appear that match the searcher’s topic of interest, Google sends an email to the user (who controls the frequency of updates)
- Google Trends: A Google tool that shows which search topics are trending over a given period in a given geographic location. Analysts can use Google Trends to stay updated on current events in a location they are researching, keep track of what’s happening in the news, and gauge public interest in topics such as crime, markets and more
- Reverse Image Search: Allows analysts to upload an image and locate similar images anywhere on the web. Matching images can be useful for identifying places, brands, buildings and more, and can help researchers work out where an image was taken and even when it was first published. It’s a useful tool for verifying the authenticity of viral images.
Successful online investigators typically combine these techniques to achieve their specific goals.
Research securely and anonymously
All the data that Google collects is great for research, but anyone who browses the web or uses a search engine is leaving behind a trail of crumbs that builds a profile on them too. Online investigators often work on sensitive projects, and revealing their identity and intent can compromise their mission, alert their research subjects or even invite retaliation from hackers. Using a VPN or a dedicated machine that’s not connected to the organization’s network helps reduce the amount of data that Google and other websites are able to collect, but these measures fall far short of guaranteeing online anonymity and protection from web-borne threats.
For online researchers, it’s important to use tools that ensure their browsing and search activity cannot be traced back to them or their organization. The same tools should ensure that if, during a search, they accidentally click on a malicious link or download an infected file, no malware reaches their endpoint or spreads to the network.
How Silo can help
Silo for Research is purpose-built for secure and anonymous online investigations. It provides users with a familiar web experience through an isolated, cloud-based browsing environment that air gaps researchers’ computers from the web, eliminating the threat of malware. Each session is launched as a one-time-use browser, ensuring cookies and supercookies don’t follow investigators — even between sessions. Silo for Research enables users to search the web anonymously through a global managed research network with non-attributed IP addresses and robust control of the digital fingerprint.
Learn more on our website.