Infoga: Email Harvesting & OSINT Enrichment for Pentesters

In OSINT work for authorized engagements, discovering an organization’s email addresses is often the first step of a social engineering risk assessment or of external attack-surface mapping. Infoga is a small, focused Python tool that automates the search for email addresses tied to a domain and correlates them with public sources (Shodan, breach databases).

Legal & ethical scope: Collecting and processing email addresses falls under the GDPR (Regulation 2016/679) in the EU and equivalent privacy laws elsewhere. Use Infoga only in engagements with written authorization, against your own domains, or against lab placeholders (example.com, test.local). Sending phishing emails to harvested addresses without authorization is a criminal offense in most jurisdictions.

What Infoga is

Infoga is an open-source Python script (available on GitHub) aimed at information gathering around email addresses. Its purpose is twofold:

  • Email harvesting: Search for public emails of a domain via search engines (Google, Bing, Yahoo, Baidu, and more).
  • Email enrichment: Check a specific address against sources like Shodan and breach databases.

It has a small footprint, is written in Python 2/3 (newer forks support Python 3 natively), and is useful when you want fast discovery without installing a heavy framework.

Where it fits in the methodology

Email harvesting belongs to the External Reconnaissance / OSINT phase and maps to MITRE ATT&CK T1589.002 — Gather Victim Identity Information: Email Addresses. It usually precedes:

  • Phishing simulations (red-team awareness assessments)
  • Credential exposure analysis (HIBP, leak databases)
  • Email pattern enumeration (firstname.lastname@, initial+lastname@, and so on)
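
The pattern enumeration step in the last bullet can be sketched in a few lines of Python. This is a minimal illustration, not part of Infoga: the function name candidate_addresses and the exact set of patterns are assumptions chosen for the example, and the output is a list of guesses to verify, not confirmed addresses.

```python
def candidate_addresses(first: str, last: str, domain: str) -> list[str]:
    """Return common address patterns for one person (illustrative only)."""
    first, last = first.strip().lower(), last.strip().lower()
    if not first or not last:
        raise ValueError("first and last name must be non-empty")
    local_parts = [
        f"{first}.{last}",    # firstname.lastname@
        f"{first[0]}{last}",  # initial+lastname@
        f"{first}{last}",
        f"{first}_{last}",
        f"{last}.{first}",
        first,
    ]
    # dict.fromkeys drops duplicates while keeping the original order
    return [f"{local}@{domain}" for local in dict.fromkeys(local_parts)]


# Lab placeholder names and domain only
for address in candidate_addresses("Jane", "Doe", "example.com"):
    print(address)
```

Comparing these candidates against addresses that were actually harvested usually reveals which convention the organization uses.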

Installation

# Clone from GitHub (lab / authorized use only)
git clone https://github.com/GiJ03/Infoga.git
cd Infoga

# Install Python dependencies
pip install -r requirements.txt
# or for Python 3 forks:
pip3 install requests

# Test run
python infoga.py -h

On newer systems where Python 2 is unavailable, look for an actively maintained Python 3 fork of Infoga.

Core parameters

-d, --domain       Domain to search for emails (for example, example.com).
-i, --info         Specific email address for enrichment.
-s, --source       Search source (all, google, bing, yahoo, baidu, shodan, …).
-b, --breach       Check against breach databases.
-v, --verbose      Verbosity level (1, 2, 3; level 3 prints the queries in detail).
--report <file>    Save results to a file.

Practical examples (lab placeholders)

1. Search emails for a domain

python infoga.py --domain example.com -v 3

Infoga queries 8 search engines and returns the publicly indexed email addresses related to the domain. With -v 3 you see exactly which query is sent to each source.
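
Conceptually, harvesting comes down to pulling address-shaped strings out of search-result text and keeping those that belong to the target domain. The sketch below illustrates that idea only; it is not Infoga's actual implementation, and the regular expression and the name emails_for_domain are assumptions made for this example.

```python
import re

# Deliberately simple pattern; real address syntax (RFC 5322) is broader.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def emails_for_domain(text: str, domain: str) -> list[str]:
    """Extract unique addresses at `domain` (or its subdomains) from raw text."""
    domain = domain.lower()
    found = set()
    for match in EMAIL_RE.findall(text):
        address = match.lower()
        host = address.rsplit("@", 1)[1]
        if host == domain or host.endswith("." + domain):
            found.add(address)
    return sorted(found)


sample = "Contact Alice@Example.com or sales@mail.example.com; not bob@other.org."
print(emails_for_domain(sample, "example.com"))
# ['alice@example.com', 'sales@mail.example.com']
```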

2. Targeted source

python infoga.py --domain example.com --source bing -v 2

When you want to reduce noise or avoid rate-limits from Google, target a specific engine.

3. Enrichment of a single address

python infoga.py --info user@example.com -v 3

For an already-known email, Infoga searches for additional context in public sources.

4. Breach check + report

python infoga.py --domain example.com --breach -v 3 --report results.txt
python infoga.py --info user@example.com --breach -v 3 --report results.txt

--breach checks whether the discovered emails appear in public breach datasets, which is valuable input for a credential-exposure risk assessment of the target organization or of your own domain.

Complementary tools

Infoga is not used in isolation. In a serious OSINT engagement it is combined with:

  • theHarvester: a more complete email + subdomain harvesting tool.
  • Hunter.io / Snov.io: commercial APIs with higher-quality data (paid).
  • EmailHarvester: alternative CLI across multiple sources.
  • HaveIBeenPwned API: the canonical source for breach checks (paid API key required since 2019).
  • Email permutator scripts: for patterns such as {first}.{last}@example.com.
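
For the HaveIBeenPwned item above, a breach lookup can be sketched with the Python standard library alone. The code assumes the HIBP v3 breachedaccount endpoint, which expects the hibp-api-key and user-agent headers and answers 404 when an account has no known breaches. The user-agent value and the function names are placeholders invented for this example, and the lookup is independent of whatever Infoga does internally for --breach.

```python
import json
import urllib.error
import urllib.parse
import urllib.request

HIBP_URL = "https://haveibeenpwned.com/api/v3/breachedaccount/"


def build_hibp_request(account: str, api_key: str) -> urllib.request.Request:
    """Prepare (but do not send) a breach lookup for one address."""
    url = HIBP_URL + urllib.parse.quote(account, safe="")
    return urllib.request.Request(url, headers={
        "hibp-api-key": api_key,
        "user-agent": "authorized-osint-audit",  # hypothetical application name
    })


def breaches_for(account: str, api_key: str) -> list[str]:
    """Return breach names for `account`; an empty list means none are known."""
    request = build_hibp_request(account, api_key)
    try:
        with urllib.request.urlopen(request, timeout=10) as response:
            return [breach["Name"] for breach in json.load(response)]
    except urllib.error.HTTPError as err:
        if err.code == 404:  # HIBP uses 404 for "account not found in any breach"
            return []
        raise  # 401 (bad key), 429 (rate limit) and others should surface


# No network traffic here: only show the URL that would be requested.
print(build_hibp_request("user@example.com", "YOUR_API_KEY").full_url)
```

Calling breaches_for requires a valid key, and a 429 response should be handled by waiting before retrying rather than by looping immediately.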

Common mistakes

  • Collecting personal data out of scope: Email is personal data under GDPR. Always stay inside the engagement scope.
  • Google rate-limits: Many queries in a short time → captcha. Use --source bing or --source yahoo to spread the load.
  • Outdated Python 2 environment: Modern Kali no longer ships Python 2. Use an active fork or pyenv.
  • Blind trust in results: Cross-check emails — scraping does produce false positives.
  • Phishing without authorization: Never. Findings → report → recommendations, not active attack.
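
One way to reduce the false positives mentioned above is a cleanup pass over the raw results before they reach the report. The following is a rough sketch under stated assumptions: the list of asset suffixes, the strictness of the pattern, and the name clean_results are all illustrative choices, not behavior of Infoga.

```python
import re

# Asset names such as "logo@2x.png" look like addresses to a naive scraper.
ASSET_SUFFIXES = (".png", ".jpg", ".jpeg", ".gif", ".svg", ".webp")
ADDRESS_RE = re.compile(r"^[a-z0-9._%+-]+@[a-z0-9.-]+\.[a-z]{2,}$")


def clean_results(raw: list[str], domain: str) -> list[str]:
    """Normalize scraped strings and keep plausible addresses at `domain`."""
    suffix = "@" + domain.lower()
    kept = set()
    for item in raw:
        # Trim whitespace and punctuation that scraping leaves around matches
        address = item.strip().strip(".,;:<>\"'").lower()
        if address.endswith(ASSET_SUFFIXES):
            continue  # image file name, not a mailbox
        if not ADDRESS_RE.match(address):
            continue  # not address-shaped after cleanup
        if not address.endswith(suffix):
            continue  # belongs to a different domain, so out of scope
        kept.add(address)
    return sorted(kept)


raw = ["Info@Example.com", "info@example.com.", "logo@2x.png", "jane@other.org"]
print(clean_results(raw, "example.com"))
# ['info@example.com']
```

Dropping out-of-domain addresses here also supports the first point in the list, since data outside the engagement scope never enters the working set.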

Defensive / Blue team perspective

  • Email obfuscation: Avoid publishing plain-text email addresses on public web pages. Use contact forms instead.
  • DMARC/SPF/DKIM: Correctly configured email-authentication records make spoofing difficult.
  • Breach monitoring: Subscribe to HIBP for organizational emails (official enterprise API).
  • Security awareness training: Emails will leak sooner or later — users need to recognize phishing.
  • MFA everywhere: When email addresses become credential-stuffing targets, MFA is the strongest defense.
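
For the DMARC point, the policy tag in the published record determines how much protection against spoofing is actually enforced. The sketch below classifies a DMARC TXT record that has already been retrieved (for example with a DNS lookup of _dmarc.example.com); the function names and the wording of the returned labels are assumptions for illustration.

```python
def parse_dmarc(record: str) -> dict[str, str]:
    """Split a DMARC TXT record into its tag=value pairs."""
    tags = {}
    for part in record.split(";"):
        if "=" in part:
            key, value = part.split("=", 1)
            tags[key.strip().lower()] = value.strip()
    return tags


def spoofing_posture(record: str) -> str:
    """Summarize how strictly a DMARC record asks receivers to act."""
    tags = parse_dmarc(record)
    if tags.get("v") != "DMARC1":
        return "no valid DMARC record"
    policy = tags.get("p", "").lower()
    if policy == "reject":
        return "enforced (reject)"
    if policy == "quarantine":
        return "partially enforced (quarantine)"
    if policy == "none":
        return "monitoring only (none)"
    return "invalid or missing policy"


print(spoofing_posture("v=DMARC1; p=reject; rua=mailto:dmarc@example.com"))
# enforced (reject)
```

A record with p=none only collects reports, so spoofed mail is still delivered until the policy is raised to quarantine or reject.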

Best practices

  • Confirm scope and legal authorization before every run.
  • Record timestamps and sources for the pentest report.
  • Apply minimal data retention — delete the emails when the engagement closes.
  • Cross-validate with at least two different tools (Infoga + theHarvester).
  • Do not run extensive queries from a production network — prefer a VPN or proxy to avoid IP-based rate limits.
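
The timestamp, source, and cross-validation practices above can be combined in a small merge step. This is a hedged sketch: it assumes each tool's output has already been reduced to a plain list of addresses, and the names merge_findings, confirmed, and recorded_utc are invented for the example.

```python
from datetime import datetime, timezone


def merge_findings(results_by_tool: dict[str, list[str]]) -> list[dict]:
    """Combine per-tool results, recording which tools reported each address."""
    recorded = datetime.now(timezone.utc).isoformat(timespec="seconds")
    sources: dict[str, set[str]] = {}
    for tool, addresses in results_by_tool.items():
        for address in addresses:
            sources.setdefault(address.strip().lower(), set()).add(tool)
    return [
        {
            "address": address,
            "sources": sorted(tools),
            "confirmed": len(tools) >= 2,  # seen by at least two tools
            "recorded_utc": recorded,
        }
        for address, tools in sorted(sources.items())
    ]


findings = merge_findings({
    "infoga": ["a@example.com", "b@example.com"],
    "theHarvester": ["A@example.com"],
})
for row in findings:
    print(row["address"], row["sources"], row["confirmed"])
```

Keeping the merged records in one structure also makes the retention rule easy to follow, because there is a single data set to delete when the engagement closes.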

Summary

Infoga is a light, focused tool for email harvesting and enrichment during authorized OSINT engagements. Combined with theHarvester and HIBP, it gives a quick view of an organization’s exposure to email-based threats. Always run it with care for GDPR obligations and within a clear engagement scope.

Next steps

For training in OSINT, social engineering risk assessment, and ethical hacking, explore the courses at Audax Cybersecurity Academy.
