Infoga: Email Harvesting & OSINT Enrichment for Pentesters
In OSINT work for authorized engagements, discovering an organization’s email addresses is often the first step of a social engineering risk assessment or of external attack-surface mapping. Infoga is a small, focused Python tool that automates the search for emails tied to a domain and correlates them with public sources (Shodan, breach databases).
Legal & ethical scope: Collecting and processing email addresses falls under the GDPR (Regulation 2016/679) in the EU and equivalent privacy laws elsewhere. Use Infoga only in engagements with written authorization, against your own domains, or against lab placeholders (example.com, test.local). Using harvested addresses for unsolicited phishing is a criminal offense.
What Infoga is
Infoga is an open-source Python script (available on GitHub) aimed at information gathering around email addresses. Its purpose is twofold:
- Email harvesting: Search for public emails of a domain via search engines (Google, Bing, Yahoo, Baidu, and more).
- Email enrichment: Check a specific address against sources like Shodan and breach databases.
It has a small footprint, is written in Python 2/3 (newer forks support Python 3 natively), and is useful when you want fast discovery without installing a heavy framework.
Where it fits in the methodology
Email harvesting belongs to the External Reconnaissance / OSINT phase and maps to MITRE ATT&CK T1589.002 — Gather Victim Identity Information: Email Addresses. It usually precedes:
- Phishing simulations (red-team awareness assessments)
- Credential exposure analysis (HIBP, leak databases)
- Email pattern enumeration (firstname.lastname@, initial+lastname@, and so on)
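The email pattern enumeration step above can be illustrated with a minimal sketch. The function name `candidate_addresses` and the choice of four patterns are assumptions made for illustration; they are not part of Infoga. Use it only with names and domains that are in scope.

```python
def candidate_addresses(first, last, domain):
    """Return common address patterns for one person at one domain."""
    first, last = first.strip().lower(), last.strip().lower()
    if not first or not last:
        raise ValueError("first and last name must be non-empty")
    local_parts = [
        f"{first}.{last}",    # firstname.lastname
        f"{first[0]}{last}",  # initial + lastname
        f"{first}{last}",     # firstnamelastname
        first,                # firstname only
    ]
    return [f"{local}@{domain}" for local in local_parts]


if __name__ == "__main__":
    # Lab placeholder only.
    for address in candidate_addresses("Jane", "Doe", "example.com"):
        print(address)
```

Candidates produced this way are guesses, not findings; they still need to be confirmed against harvested addresses before they go into a report.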
Installation
# Clone from GitHub (lab / authorized use only)
git clone https://github.com/GiJ03/Infoga.git
cd Infoga
# Install Python dependencies
pip install -r requirements.txt
# or for Python 3 forks:
pip3 install requests
# Test run
python infoga.py -h
On newer systems where Python 2 is unavailable, look for an actively maintained Python 3 fork of Infoga.
Core parameters
Parameter | Description
-d, --domain | Domain to search for emails (for example, example.com).
-i, --info | Specific email address for enrichment.
-s, --source | Search source (all, google, bing, yahoo, baidu, shodan, …).
-b, --breach | Check against breach databases.
-v, --verbose | Verbosity level (1, 2, or 3; level 3 prints the queries in detail).
--report <file> | Save results to a file.
Practical examples (lab placeholders)
1. Search emails for a domain
python infoga.py --domain example.com -v 3
Infoga queries 8 search engines and returns the publicly indexed email addresses related to the domain. With -v 3 you see exactly which query is sent to each source.
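To make the harvesting idea concrete, here is a minimal sketch of pulling addresses for one domain out of text that has already been fetched (for example, a saved search-result page). It is not Infoga's actual implementation; the function name `extract_emails` and the regular expression are assumptions chosen for illustration.

```python
import re


def extract_emails(text, domain):
    """Return sorted, de-duplicated addresses at `domain` or its subdomains."""
    pattern = re.compile(
        r"[A-Za-z0-9._%+-]+@(?:[A-Za-z0-9-]+\.)*"
        + re.escape(domain)
        # Reject look-alikes such as example.community or example.com.evil.org
        + r"(?![A-Za-z0-9-])(?!\.[A-Za-z0-9])",
        re.IGNORECASE,
    )
    return sorted({match.lower() for match in pattern.findall(text)})


if __name__ == "__main__":
    sample = (
        "Contact Info@Example.com or sales@mail.example.com. "
        "Not ours: bob@example.community"
    )
    print(extract_emails(sample, "example.com"))
```

Lowercasing and de-duplicating up front keeps later comparisons between sources simple, and the trailing look-aheads reduce false positives from domains that merely start with the target name.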
2. Targeted source
python infoga.py --domain example.com --source bing -v 2
When you want to reduce noise or avoid rate-limits from Google, target a specific engine.
3. Enrichment of a single address
python infoga.py --info user@example.com -v 3
For an already-known email, Infoga searches for additional context in public sources.
4. Breach check + report
python infoga.py --domain example.com --breach -v 3 --report results.txt
python infoga.py --info user@example.com --breach -v 3 --report results.txt
--breach checks whether the discovered emails appear in public breach datasets. This is valuable input for a credential-exposure risk assessment of the in-scope organization or of your own domain.
Complementary tools
Infoga is not used in isolation. In a serious OSINT engagement it is combined with:
- theHarvester: a more complete email + subdomain harvesting tool.
- Hunter.io / Snov.io: commercial APIs with higher-quality data (paid).
- EmailHarvester: alternative CLI across multiple sources.
- HaveIBeenPwned API: the canonical source for breach checks (paid API key required since 2019).
- Email permutator scripts: for patterns such as {first}.{last}@example.com.
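For the HaveIBeenPwned item in the list above, the sketch below shows how a breach lookup request for one address could be prepared with the standard library. It assumes the v3 `breachedaccount` endpoint with the `hibp-api-key` and `user-agent` headers; the helper name `build_hibp_request` and the default user-agent string are hypothetical. The request is only built, not sent, so no API key or network access is needed to try it.

```python
from urllib.parse import quote
from urllib.request import Request

# Assumed HIBP v3 endpoint for per-account breach lookups.
HIBP_ENDPOINT = "https://haveibeenpwned.com/api/v3/breachedaccount/"


def build_hibp_request(account, api_key, user_agent="authorized-pentest-lab"):
    """Prepare (but do not send) a breach lookup for a single address."""
    url = HIBP_ENDPOINT + quote(account.strip(), safe="")  # '@' becomes %40
    headers = {"hibp-api-key": api_key, "user-agent": user_agent}
    return Request(url, headers=headers)


if __name__ == "__main__":
    request = build_hibp_request("jane.doe@example.com", "YOUR-API-KEY")
    print(request.full_url)
```

Sending it with `urllib.request.urlopen(request)` should return JSON for a breached account and an HTTP 404 for an address with no known breach; check the current API documentation and rate limits before running it in an engagement.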
Common mistakes
- Collecting personal data out of scope: Email is personal data under GDPR. Always stay inside the engagement scope.
- Google rate-limits: Many queries in a short time → captcha. Use --source bing or --source yahoo to spread the load.
- Outdated Python 2 environment: Modern Kali no longer ships Python 2. Use an active fork or pyenv.
- Blind trust in results: Cross-check emails; scraping does produce false positives.
- Phishing without authorization: Never. Findings → report → recommendations, not active attack.
Defensive / Blue team perspective
- Email obfuscation: Avoid publishing plain-text email addresses on public web pages. Use contact forms instead.
- DMARC/SPF/DKIM: Correctly configured email-authentication records make spoofing difficult.
- Breach monitoring: Subscribe to HIBP for organizational emails (official enterprise API).
- Security awareness training: Emails will leak sooner or later — users need to recognize phishing.
- MFA everywhere: When email addresses become credential-stuffing targets, MFA is the strongest defense.
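As a small illustration of the DMARC point above, the sketch below parses the text of a DMARC record and reports whether it enforces a policy. The function names `parse_dmarc` and `enforces_policy` are hypothetical, and the record is passed in as a string; fetching the `_dmarc.<domain>` TXT record is left to whatever DNS tooling you already use.

```python
def parse_dmarc(record):
    """Split a DMARC TXT record into a tag -> value dictionary."""
    tags = {}
    for part in record.split(";"):
        if "=" in part:
            tag, value = part.split("=", 1)
            tags[tag.strip().lower()] = value.strip()
    return tags


def enforces_policy(record):
    """True if the record is DMARC1 with p=quarantine or p=reject."""
    tags = parse_dmarc(record)
    policy = tags.get("p", "").lower()
    return tags.get("v") == "DMARC1" and policy in {"quarantine", "reject"}


if __name__ == "__main__":
    print(enforces_policy("v=DMARC1; p=reject; rua=mailto:dmarc@example.com"))
    print(enforces_policy("v=DMARC1; p=none"))
```

A record with `p=none` only monitors, so treating it as unenforced is a reasonable finding to raise in a report.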
Best practices
- Confirm scope and legal authorization before every run.
- Record timestamps and sources for the pentest report.
- Apply minimal data retention — delete the emails when the engagement closes.
- Cross-validate with at least two different tools (Infoga + theHarvester).
- Do not run extensive queries from a production network — prefer a VPN or proxy to avoid IP-based rate limits.
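The cross-validation and record-keeping practices above can be sketched as follows. This assumes you have already exported each tool's addresses into Python lists; the function names and the report layout are illustrative, not an Infoga or theHarvester feature.

```python
from datetime import datetime, timezone


def cross_validate(results_by_tool):
    """Map each normalized address to the sorted list of tools that found it."""
    seen = {}
    for tool, addresses in results_by_tool.items():
        for address in addresses:
            seen.setdefault(address.strip().lower(), set()).add(tool)
    return {address: sorted(tools) for address, tools in sorted(seen.items())}


def confirmed(results_by_tool, minimum=2):
    """Addresses reported by at least `minimum` different tools."""
    merged = cross_validate(results_by_tool)
    return [address for address, tools in merged.items() if len(tools) >= minimum]


if __name__ == "__main__":
    results = {
        "infoga": ["Jane.Doe@example.com", "info@example.com"],
        "theHarvester": ["jane.doe@example.com"],
    }
    # A UTC timestamp and the per-address sources go into the report.
    print(datetime.now(timezone.utc).isoformat())
    for address, tools in cross_validate(results).items():
        print(address, tools)
    print("confirmed:", confirmed(results))
```

Addresses seen by only one tool are not necessarily wrong, but they deserve a manual check before being counted as findings.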
Summary
Infoga is a light, focused tool for email harvesting and enrichment during authorized OSINT engagements. Combined with theHarvester and HIBP, it gives a quick view of an organization’s exposure to email-based threats. Always run it with due regard for GDPR and within a clearly defined engagement scope.
Next steps
- theHarvester — more complete email/subdomain harvesting
- External references: MITRE ATT&CK T1589.002, HaveIBeenPwned.

