Advanced search operators surface content that standard search engines have indexed but that ordinary keyword searches rarely reach, material often loosely called the "deep web".
Core Operators
site:
Restricts results to a specific domain or top-level domain.
Syntax: site:domain.com or site:.gov
site:gov "climate change" # Government sites about climate change
site:edu filetype:pdf # PDFs from educational institutions
site:pacer.gov "case number" # Indexed public pages on the federal courts' PACER site
Strategic Use:
- Target official sources (.gov, .mil, .edu)
- Find country-specific information (.uk, .ru, .cn)
- Exclude commercial results
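The site: patterns above can be assembled programmatically. The following is a minimal Python sketch, not an official client: the helper names `site_query` and `search_url` are hypothetical, and the only external detail assumed is the standard `https://www.google.com/search?q=` URL form.

```python
from urllib.parse import quote_plus


def site_query(terms, include=None, exclude=()):
    """Build a query limited to one site or TLD, minus any excluded domains.

    Hypothetical helper for illustration; it only formats a string.
    """
    parts = []
    if include:
        parts.append(f"site:{include}")
    # A leading minus sign negates an operator, e.g. -site:example.com
    parts.extend(f"-site:{domain}" for domain in exclude)
    parts.append(terms)
    return " ".join(parts)


def search_url(query):
    """URL-encode the query so it can be opened in a browser."""
    return "https://www.google.com/search?q=" + quote_plus(query)


query = site_query('"climate change"', include="gov")
print(query)
print(search_url(query))
```

Passing `exclude=["example.com"]` adds a `-site:` filter, which is one way to express the "exclude" use listed above.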
filetype:
Restricts results to specific file formats.
Syntax: filetype:extension
Supported formats: pdf, doc, docx, xls, xlsx, ppt, pptx, txt, csv, kml, kmz
filetype:pdf "budget report" 2024 # Budget PDFs from 2024
filetype:xls "confidential" # Excel files marked confidential
filetype:kml "military base" # Google Earth files of military bases
Why it matters:
- PDFs often contain scanned documents not indexed as text
- Spreadsheets reveal raw data
- Presentations show internal strategies
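When several formats are of interest at once, the filetype: filters can be joined with OR. The sketch below assumes that OR applies to the filters on either side of it; the function name `filetype_query` is hypothetical.

```python
def filetype_query(terms, extensions):
    """Prefix search terms with one or more filetype: filters joined by OR.

    Assumption: the search engine treats OR as binding the adjacent filters.
    """
    # Normalize ".PDF" or "Pdf" to "pdf" so the filters are consistent.
    cleaned = [ext.lstrip(".").lower() for ext in extensions]
    if not cleaned:
        return terms
    filters = " OR ".join(f"filetype:{ext}" for ext in cleaned)
    return f"{filters} {terms}"


print(filetype_query('"budget report" 2024', ["pdf", "xlsx", ".PPTX"]))
```

Running one query per format is the more conservative alternative if the combined form returns unexpected results.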
intitle:
Searches for keywords specifically in the page title.
Syntax: intitle:"keyword" or allintitle:keyword1 keyword2
intitle:"index of" "confidential" # Open directories with confidential files
intitle:"login" inurl:admin # Admin login pages
allintitle:budget 2024 draft # Pages with all three words in title
inurl:
Searches for keywords within the URL structure.
Syntax: inurl:keyword
inurl:admin login # Admin login pages
inurl:pdf site:gov # Government PDFs (URL contains "pdf")
AROUND(X)
Finds pages where two terms appear within X words of each other. Google does not officially document this operator, so results can be inconsistent.
Syntax: term1 AROUND(5) term2
Biden AROUND(5) "tax policy" # Biden mentioned near tax policy
"climate change" AROUND(10) funding # Climate change near funding discussions
Why it's powerful: Finds contextual relationships more precisely than simple keyword matching.
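To make "within X words" concrete, here is a local Python approximation of a proximity check on a piece of text. It is not how the search engine implements AROUND; the gap definition (number of words strictly between the two terms) and the function names are assumptions made for illustration.

```python
import re


def tokens(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def phrase_positions(words, phrase):
    """Return (start, end) index pairs where the phrase occurs in words."""
    target = tokens(phrase)
    if not target:
        return []
    n = len(target)
    return [(i, i + n - 1) for i in range(len(words) - n + 1)
            if words[i:i + n] == target]


def near(text, term1, term2, max_gap):
    """True if the terms occur with at most max_gap words between them."""
    words = tokens(text)
    for s1, e1 in phrase_positions(words, term1):
        for s2, e2 in phrase_positions(words, term2):
            # Count the words between the two occurrences, in either order.
            gap = s2 - e1 - 1 if s2 > e1 else s1 - e2 - 1
            if 0 <= gap <= max_gap:
                return True
    return False


sample = "Climate change remains a priority, but federal funding is flat."
print(near(sample, "climate change", "funding", 10))
print(near(sample, "climate change", "funding", 3))
```

In the sample sentence five words separate "climate change" from "funding", so the check passes with a limit of 10 and fails with a limit of 3.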
Strategic Combinations
The "Government Leak" Query
site:gov filetype:pdf "not for public release"
Finds government PDFs marked as restricted.
The "Corporate Intelligence" Query
filetype:xls "confidential" -site:example.com
Finds Excel files marked confidential, excluding a specific domain.
The "Open Directory" Query
intitle:"index of" "parent directory" "budget"
Finds open web directories containing budget files.
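Combined queries like these follow a regular pattern: operator filters first, then quoted phrases, then exclusions. A small Python sketch of that pattern is shown below; the `Dork` class and its field names are hypothetical and cover only the operators used in this section.

```python
from dataclasses import dataclass, field


@dataclass
class Dork:
    """Hypothetical container for the parts of a combined query."""
    phrases: list = field(default_factory=list)        # quoted in the output
    site: str = ""
    filetype: str = ""
    intitle: str = ""
    exclude_sites: list = field(default_factory=list)  # emitted as -site:

    def build(self):
        parts = []
        if self.site:
            parts.append(f"site:{self.site}")
        if self.filetype:
            parts.append(f"filetype:{self.filetype}")
        if self.intitle:
            parts.append(f'intitle:"{self.intitle}"')
        parts.extend(f'"{phrase}"' for phrase in self.phrases)
        parts.extend(f"-site:{domain}" for domain in self.exclude_sites)
        return " ".join(parts)


government = Dork(site="gov", filetype="pdf", phrases=["not for public release"])
open_directory = Dork(intitle="index of", phrases=["parent directory", "budget"])
print(government.build())
print(open_directory.build())
```

Keeping the parts in separate fields makes it easy to vary one filter (for example the file type) while holding the rest of the query fixed.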
Practical Workflows
Investigating a Company
site:sec.gov "Company Name" filetype:pdf
"Company Name" AROUND(5) lawsuit
site:pacer.gov "Company Name"
"Company Name" filetype:xls "financial"
Researching a Person
"Full Name" -obituary -facebook"Full Name" site:linkedin.com"Full Name" filetype:pdf"Full Name" AROUND(5) "arrested" OR "charged"
Legal & Ethical Considerations
- Legal: Google Dorking accesses only publicly indexed information. No hacking is involved.
- Ethical: Just because something is indexed doesn't mean it was meant to be public. Use judgment.