Automated IOC threat intelligence aggregator that collects, normalizes, deduplicates, and exports indicators of compromise from 29 open-source threat feeds into Sumo Logic SIEM-compatible STIX 2.1 CSV format for enterprise threat detection.
Aggregates from: AlienVault OTX, PhishTank, ThreatFox, Feodo, Spamhaus, URLhaus, MalwareBazaar, Pulsedive, Hybrid Analysis, CAPE Sandbox, OpenPhish, FireHOL, Botvrij, ThreatView, C2IntelFeeds, C2Tracker, DataPlane, AbuseSSL, IPsum, CINScore, PhishingArmy, VXVault, URLAbuse, TweetFeed, VirusShare, MISP CERT-FR, Blocklist.de, EmergingThreats, and more.
A threat intelligence analyst runs this aggregator weekly to pull fresh IOCs from 29 feeds and upload them to Sumo Logic, so L1 analysts get automatic alerts when those indicators appear in live logs. This closes the gap between threat feed discovery and SIEM detection, enabling rapid response to known malicious infrastructure.
| Feature | Details |
|---|---|
| Parallel Fetching | 29 threat intelligence sources with 10 concurrent threads for speed |
| Confidence Scoring | Per-source trust scores (range 65–95) based on feed reputation |
| IOC Deduplication | Automatic deduplication keeping highest-confidence threat type |
| 5-Week Rolling Window | Master CSV maintains rolling history; auto-archives oldest week |
| Timestamp Refresh | Re-seen IOCs get validity extended so they don't expire in Sumo Logic |
| STIX 2.1 Export | Sumo Logic-compatible 10-column CSV format, no header |
| Smart Splitting | Split output files at 9,999 rows per file for SIEM compatibility |
| Auto-Archiving | Week 6+ IOCs move to permanent Archive_IOC.csv |
| Cross-Platform | Runs on Windows (Task Scheduler) and Linux (Cron) |
| Multi-IOC Type | IP addresses, domains, URLs, MD5/SHA-1/SHA-256 hashes |
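
The deduplication rule in the table (keep the highest-confidence record when several feeds report the same indicator) can be sketched as follows. This is a minimal illustration, not the script's actual code: the function name `dedupe_keep_highest` and the record keys (`indicator`, `source`, `confidence`, `threat_type`) are assumptions made for the example.

```python
from typing import Iterable


def dedupe_keep_highest(records: Iterable[dict]) -> list[dict]:
    """Collapse duplicate indicators, keeping the highest-confidence record.

    Hypothetical helper: record keys are illustrative, not taken from the script.
    """
    best: dict[str, dict] = {}
    for rec in records:
        key = rec["indicator"].strip()  # exact match after trimming whitespace
        current = best.get(key)
        if current is None or rec["confidence"] > current["confidence"]:
            best[key] = rec
    return list(best.values())


sample = [
    {"indicator": "192.0.2.1", "source": "Botvrij", "confidence": 70, "threat_type": "Malware"},
    {"indicator": "192.0.2.1", "source": "Feodo", "confidence": 95, "threat_type": "C2"},
    {"indicator": "evil.com", "source": "PhishTank", "confidence": 90, "threat_type": "Phishing"},
]
deduped = dedupe_keep_highest(sample)
for rec in deduped:
    print(rec["indicator"], rec["source"], rec["confidence"])
```

In this sample the Feodo record (confidence 95) replaces the Botvrij record for the same IP, so the threat type kept is the one from the more trusted feed.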
This aggregator enables detection across multiple attack phases:
| Threat Category | Sources | MITRE ID | Tactic | Use Case |
|---|---|---|---|---|
| C2 Infrastructure | Feodo, C2IntelFeeds, C2Tracker | T1071 | Command & Control | Block outbound comms to known C2 servers |
| Phishing URLs | PhishTank, OpenPhish, PhishingArmy | T1566.002 | Initial Access | Alert on phishing landing page visits |
| Malware Hashes | MalwareBazaar, Hybrid Analysis, CAPE | T1204.002 | Execution | Detect malware execution by file hash |
| Botnet IPs | Spamhaus, FireHOL, DataPlane | T1583.005 | Resource Development | Block traffic from botnet source IPs |
| Malware Domains | URLhaus, Botvrij, ThreatView | T1566.002 | Initial Access | Block DNS requests to malware domains |
| # | Source | IOC Types | Threat Category | Confidence | API Key? |
|---|---|---|---|---|---|
| 1 | Feodo | IP | C2 | 95 | No |
| 2 | PhishTank | URL / Domain | Phishing | 90 | No |
| 3 | C2IntelFeeds | IP | C2 | 90 | No |
| 4 | OpenPhish | URL / Domain | Phishing | 88 | No |
| 5 | MalwareBazaar | SHA-256 Hash | Malware | 88 | No |
| 6 | ThreatFox | IP / Domain / URL / Hash | Malware | 85 | No |
| 7 | URLhaus | URL / Domain | Malware | 85 | No |
| 8 | AbuseSSL | IP / Hash | Malware | 83 | No |
| 9 | MISP CERT-FR | Hash | Malware | 83 | No |
| 10 | Spamhaus | IP | Botnet | 80 | No |
| 11 | OTX (AlienVault) | IP / Domain / URL / Hash | Malware | 70 | Yes |
| 12 | Pulsedive | IP / Domain / URL | Malware | 65 | Yes |
| 13 | Hybrid Analysis | Hash | Malware | 82 | Yes |
| 14 | CAPE Sandbox | Hash | Malware | 80 | Yes |
| 15 | Botvrij | IP / Domain / URL / Hash | Malware | 70 | No |
+ 14 more sources including FireHOL, Blocklist.de, C2Tracker, DataPlane, ThreatView (5 feeds), Bazaar, IPsum, CINScore, PhishingArmy, VXVault, URLAbuse, TweetFeed, VirusShare, Botvrij Hashes
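
Fetching this many sources with 10 concurrent threads, while letting a failed source be skipped without stopping the others, might look roughly like the sketch below. It uses the standard library `concurrent.futures.ThreadPoolExecutor`; the `fetch_all` function and the stand-in fetchers are hypothetical, and no real feed is contacted.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable


def fetch_all(fetchers: dict[str, Callable[[], list[str]]], max_workers: int = 10):
    """Run every fetcher in a thread pool; collect results and non-fatal failures."""
    results: dict[str, list[str]] = {}
    failures: dict[str, str] = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(fn): name for name, fn in fetchers.items()}
        for future in as_completed(futures):
            name = futures[future]
            try:
                results[name] = future.result()
            except Exception as exc:  # one broken feed must not stop the run
                failures[name] = str(exc)
    return results, failures


# Stand-in fetchers for illustration only (no network access).
def _ok() -> list[str]:
    return ["192.0.2.1", "192.0.2.2"]


def _broken() -> list[str]:
    raise ConnectionError("feed unreachable")


results, failures = fetch_all({"FeedA": _ok, "FeedB": _broken})
print(results, failures)
```

A real fetcher would typically wrap an HTTP request with a timeout; keeping fetchers as plain callables makes each source easy to test in isolation.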
- Python: 3.10 or higher
- OS: Windows 10+ or Ubuntu 20.04+
- Internet: Required (fetches from 29 external sources)
- Disk: ~500 MB for master + archive + weekly CSVs
# 1. Clone the repository
git clone https://github.com/siva404e/IOC_AUTOMATION.git
cd IOC_AUTOMATION
# 2. Install Python dependencies
pip install -r requirements.txt
# 3. Create config file from template
cp config.ini.example config.ini
# 4. Add your API keys (optional but recommended)
# Edit config.ini and fill in:
# - otx_api_key (AlienVault OTX)
# - pulsedive_api_key (Pulsedive)
# - hybrid_analysis_api_key (Hybrid Analysis)
# - cape_api_token (CAPE Sandbox)
# 5. Run the aggregator
python final_ioc_weekly_split.py
19:12:40 INFO Config loaded from : /IOC_Scripts/config.ini
19:12:40 INFO Output directory : /home/analyst/IOC_Output
19:12:40 INFO Starting IOC aggregator — 29 sources configured
19:12:47 INFO [+] ThreatFox fetched
19:12:50 INFO [+] PhishTank fetched (56294 rows)
19:12:52 WARNING [-] CAPE skipped — CAPE_API_TOKEN not set
19:13:28 INFO Total unique IOCs fetched: 438977
19:13:31 INFO Master updated — 343349 new | 95628 re-seen | 438977 total
19:13:34 INFO IP 39379 rows → 4 part file(s)
19:13:34 INFO Domain 273135 rows → 28 part file(s)
19:13:34 INFO URL 20912 rows → 3 part file(s)
19:13:34 INFO Hash 9923 rows → 1 part file(s)
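
The "skipped" warning in the log above suggests that key-protected sources are dropped when their credential is missing. A minimal sketch of that check with the standard library `configparser` is shown here. The option names come from the setup steps; the `[api_keys]` section name, the `KEYED_SOURCES` mapping, and `enabled_sources` are assumptions, and the real `config.ini.example` may be laid out differently.

```python
import configparser

# Sample config text; the [api_keys] section name is an assumption.
SAMPLE_CONFIG = """
[api_keys]
otx_api_key = abc123
pulsedive_api_key =
hybrid_analysis_api_key = def456
"""

# Source name -> config option (option names as listed in the setup steps).
KEYED_SOURCES = {
    "OTX": "otx_api_key",
    "Pulsedive": "pulsedive_api_key",
    "Hybrid Analysis": "hybrid_analysis_api_key",
    "CAPE": "cape_api_token",
}


def enabled_sources(config_text: str, section: str = "api_keys"):
    """Return (enabled, skipped) source names based on which keys are filled in."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    enabled, skipped = [], []
    for source, option in KEYED_SOURCES.items():
        # fallback="" covers both a missing option and a missing section
        value = parser.get(section, option, fallback="").strip()
        (enabled if value else skipped).append(source)
    return enabled, skipped


enabled, skipped = enabled_sources(SAMPLE_CONFIG)
print("enabled:", enabled)
print("skipped:", skipped)
```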
IOC_AUTOMATION/
├── final_ioc_weekly_split.py ← Main aggregator script (1200+ lines)
├── config.ini ← Your API keys (git ignored)
├── config.ini.example ← Configuration template
├── requirements.txt ← Python dependencies
├── README.md ← This file
├── SOP_IOC_ThreatIntelligence_Aggregator.md ← Complete operational guide
├── .gitignore ← Prevents API key leaks
└── LICENSE ← MIT License
The aggregator produces Sumo Logic-compatible STIX 2.1 CSV files:
| File | Format | Updated | Purpose |
|---|---|---|---|
| Master_IOC.csv | CSV (8 cols) | Every run | Rolling 5-week IOC history with WeekTag |
| Archive_IOC.csv | CSV (8 cols) | Week 6+ | Permanent archive of evicted batches |
| IOC_Weekly_IP_PartN_.csv | CSV (10 cols) | Every run | IPv4 addresses (max 9,999 rows/file) |
| IOC_Weekly_Domain_PartN_.csv | CSV (10 cols) | Every run | Domain names (max 9,999 rows/file) |
| IOC_Weekly_URL_PartN_.csv | CSV (10 cols) | Every run | Malware/phishing URLs (max 9,999 rows/file) |
| IOC_Weekly_Hash_PartN_.csv | CSV (10 cols) | Every run | MD5/SHA-1/SHA-256 hashes (max 9,999 rows/file) |
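
The 9,999-row cap per part file amounts to simple chunking. The sketch below is an illustration under assumptions: `split_rows` and `part_filenames` are hypothetical helpers, and the file name pattern follows the `IOC_Weekly_<Type>_Part<N>.csv` form used in the upload steps.

```python
def split_rows(rows: list, max_rows: int = 9_999) -> list[list]:
    """Split rows into consecutive chunks of at most max_rows each."""
    return [rows[i:i + max_rows] for i in range(0, len(rows), max_rows)]


def part_filenames(ioc_type: str, n_parts: int) -> list[str]:
    """Build part file names; the exact naming used by the script is assumed."""
    return [f"IOC_Weekly_{ioc_type}_Part{n}.csv" for n in range(1, n_parts + 1)]


ip_rows = list(range(39_379))  # same row count as the IP line in the sample run
chunks = split_rows(ip_rows)
names = part_filenames("IP", len(chunks))
print(len(chunks), [len(c) for c in chunks], names)
```

With 39,379 rows this yields four chunks, which agrees with the "4 part file(s)" reported for IPs in the sample log.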
id,indicator,type,source,validFrom,validUntil,confidence,threatType,actors,killChain
0001,192.0.2.1,ipv4-addr,Feodo,2026-03-18T13:00:00.000Z,2027-03-18T13:00:00.000Z,95,malicious-activity,,command-and-control
0002,evil.com,domain-name,PhishTank,2026-03-18T13:00:00.000Z,2027-03-18T13:00:00.000Z,90,malicious-activity,,delivery
0003,https://malware.xyz/pay.html,url,URLhaus,2026-03-18T13:00:00.000Z,2027-03-18T13:00:00.000Z,85,malicious-activity,,initial-access
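
A row in this 10-column layout could be assembled as in the sketch below. It treats the first line of the sample as column labels only, since the export itself is described as having no header. The helper names are hypothetical, and the one-year validity (365 days) is inferred from the sample rows rather than confirmed from the script.

```python
import csv
import io
from datetime import datetime, timedelta, timezone


def stix_timestamp(dt: datetime) -> str:
    """Format an aware datetime as UTC with millisecond precision and a Z suffix."""
    dt = dt.astimezone(timezone.utc)
    return dt.strftime("%Y-%m-%dT%H:%M:%S") + f".{dt.microsecond // 1000:03d}Z"


def build_row(seq, indicator, ioc_type, source, confidence, kill_chain,
              seen_at, validity=timedelta(days=365)):
    """Return one 10-column row; the 365-day validity is an assumption."""
    return [
        f"{seq:04d}", indicator, ioc_type, source,
        stix_timestamp(seen_at), stix_timestamp(seen_at + validity),
        str(confidence), "malicious-activity", "", kill_chain,
    ]


def rows_to_csv(rows) -> str:
    """Serialize rows without a header line."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


seen = datetime(2026, 3, 18, 13, 0, tzinfo=timezone.utc)
row = build_row(1, "192.0.2.1", "ipv4-addr", "Feodo", 95, "command-and-control", seen)
csv_text = rows_to_csv([row])
print(csv_text, end="")
```

Using `csv.writer` rather than string joins keeps URLs that contain commas correctly quoted.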
The master file automatically maintains a rolling window:
| Week | Master Contains | Archive Action |
|---|---|---|
| 1 | W01 | — |
| 2 | W01–W02 | — |
| 3 | W01–W03 | — |
| 4 | W01–W04 | — |
| 5 | W01–W05 | — |
| 6 | W02–W06 | W01 → Archive |
| 7 | W03–W07 | W02 → Archive |
Each IOC gets a WeekTag (e.g., 2026-W11) so eviction is deterministic. Re-seen IOCs get a timestamp refresh to prevent expiration in Sumo Logic.
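
The eviction logic can be sketched as below: keep the five most recent WeekTags in the master and move everything older to the archive. This assumes WeekTags are ISO year/week pairs and that each record carries a `week_tag` field; both are assumptions for illustration, as are the function names.

```python
from datetime import date


def week_tag(d: date) -> str:
    """Build a tag such as 2026-W11 from the ISO calendar (ISO weeks assumed)."""
    iso = d.isocalendar()
    return f"{iso[0]}-W{iso[1]:02d}"


def _tag_key(tag: str) -> tuple[int, int]:
    """Sort key that orders tags correctly across year boundaries."""
    year, week = tag.split("-W")
    return int(year), int(week)


def roll_window(master: list[dict], window: int = 5):
    """Split records into (kept, archived) by the most recent `window` WeekTags."""
    tags = sorted({rec["week_tag"] for rec in master}, key=_tag_key)
    keep = set(tags[-window:])
    kept = [rec for rec in master if rec["week_tag"] in keep]
    archived = [rec for rec in master if rec["week_tag"] not in keep]
    return kept, archived


master = [{"indicator": f"192.0.2.{n}", "week_tag": f"2026-W{n:02d}"} for n in range(1, 7)]
kept, archived = roll_window(master)
print([r["week_tag"] for r in kept], [r["week_tag"] for r in archived])
```

With six weeks of data, W01 is moved to the archive and W02 through W06 stay in the master, matching the week 6 row of the table.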
Program: C:\Python310\python.exe
Arguments: C:\IOC_Scripts\final_ioc_weekly_split.py
Start in: C:\IOC_Scripts
Trigger: Weekly, Monday 07:00
crontab -e
# Add line:
0 7 * * 1 cd /path/to/IOC_Scripts && python3 final_ioc_weekly_split.py >> ~/IOC_Scripts/cron.log 2>&1
- Run the aggregator → generates CSV files in IOC_Output/
- Log in to Sumo Logic → Security → Threat Intelligence
- Click Add Source → Manual Upload → CSV
- Upload each IOC_Weekly_<Type>_Part<N>.csv file
- Sumo Logic auto-detects indicators and creates detection rules
Verify: Row count in Sumo Logic matches your CSV file row count.
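
To get the local side of that comparison, the rows in each part file can be counted with a short helper like the one below. This is a sketch: `count_rows` and the glob pattern are assumptions, and the demo writes a throwaway file to a temporary directory instead of reading a real IOC_Output/ folder.

```python
import csv
import tempfile
from pathlib import Path


def count_rows(output_dir, pattern: str = "IOC_Weekly_*.csv") -> dict[str, int]:
    """Count non-empty CSV rows per part file (the files carry no header)."""
    counts = {}
    for path in sorted(Path(output_dir).glob(pattern)):
        with path.open(newline="", encoding="utf-8") as fh:
            counts[path.name] = sum(1 for row in csv.reader(fh) if row)
    return counts


# Demo against a temporary directory with one small sample file.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "IOC_Weekly_IP_Part1.csv"
    sample.write_text(
        "0001,192.0.2.1,ipv4-addr\n0002,192.0.2.2,ipv4-addr\n", encoding="utf-8"
    )
    demo_counts = count_rows(tmp)

print(demo_counts)
```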
For detailed setup, troubleshooting, and operational procedures, see:
👉 SOP_IOC_ThreatIntelligence_Aggregator.md
Covers:
- System overview & architecture
- Pre-requisites & dependencies
- Initial setup procedure
- Running & scheduling (Windows + Linux)
- Monitoring & log interpretation
- 29 threat source details
- Master file rolling window
- Sumo Logic upload steps
- Troubleshooting guide
- Internet Dependent: Requires connectivity to all 29 source URLs
- Rate Limits: Free API tiers have rate limits; may skip sources if throttled
- Batch Only: Weekly batch aggregation, not real-time feed updates
- Manual Upload: Sumo Logic upload requires manual CSV import (can automate with API)
- API Keys: OTX, Pulsedive, Hybrid Analysis, CAPE require free registration for full functionality
- GitHub Actions – Scheduled weekly runs with auto-upload
- AbuseIPDB Integration – Add IP reputation scoring
- Sumo Logic API – Automated upload via API (no manual CSV import)
- Slack Alerts – Notify SOC on completion with summary stats
- Database Backend – PostgreSQL for historical queries
- Web Dashboard – Real-time feed status & IOC analytics
- Elasticsearch Export – Alternative to Sumo Logic
Solution: Copy config.ini.example to config.ini and place in same directory as script.
Solution: Run pip install -r requirements.txt
Solution: Check internet connection. Source failures are non-fatal; other sources continue.
Solution: Check internet connection and verify firewall allows HTTPS outbound.
For more troubleshooting, see SOP_IOC_ThreatIntelligence_Aggregator.md#11-troubleshooting
Total IOCs fetched: 438,977
- New IOCs: 343,349
- Re-seen IOCs: 95,628
- Unique IOCs: 438,977 (after dedup)
Breakdown by type:
- IP addresses: 39,379 (4 part files)
- Domains: 273,135 (28 part files)
- URLs: 20,912 (3 part files)
- Hashes: 9,923 (1 part file)
Confidence distribution:
- 95 (Critical): 12,450 (Feodo, C2Intel)
- 85-90 (High): 187,234 (PhishTank, ThreatFox, URLhaus)
- 70-82 (Medium): 239,293 (OTX, Hybrid Analysis, others)
Processing time: ~2-3 minutes
Archive size: ~2.5 GB (cumulative)
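
The new versus re-seen split in these statistics comes from merging each run into the master file. A minimal sketch of that merge, including the validity refresh for re-seen indicators, is given below. The in-memory dict layout, the `merge_into_master` name, and the 365-day validity are assumptions for illustration.

```python
from datetime import datetime, timedelta, timezone


def merge_into_master(master: dict, fetched: list[str], now: datetime,
                      validity: timedelta = timedelta(days=365)):
    """Add unseen indicators and extend validity of re-seen ones.

    Returns (new_count, reseen_count). The master is mutated in place.
    """
    new = reseen = 0
    for indicator in fetched:
        if indicator in master:
            master[indicator]["valid_until"] = now + validity  # refresh expiry
            reseen += 1
        else:
            master[indicator] = {"valid_from": now, "valid_until": now + validity}
            new += 1
    return new, reseen


last_week = datetime(2026, 3, 11, 7, 0, tzinfo=timezone.utc)
this_week = datetime(2026, 3, 18, 7, 0, tzinfo=timezone.utc)

master: dict = {}
merge_into_master(master, ["192.0.2.1", "evil.com"], last_week)
new, reseen = merge_into_master(master, ["evil.com", "192.0.2.9"], this_week)
print(f"{new} new | {reseen} re-seen | {len(master)} total")
```

The re-seen indicator keeps its original `valid_from` while its `valid_until` moves forward, so it does not expire as long as feeds keep reporting it.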
Found an issue? Want to add a source? Open an issue or pull request!
Areas for contribution:
- Additional threat feeds
- Performance optimizations
- Enhanced logging
- Integration examples (Splunk, ELK, etc.)
This project is licensed under the MIT License — see LICENSE for details.
Sivamuthu Selvadurai M
- GitHub: @siva404e
- Repository: IOC_AUTOMATION
- Threat Feeds: AlienVault OTX, abuse.ch, MalwareBazaar, Spamhaus, PhishTank, and 23+ open-source feeds
- Libraries: requests, pandas, beautifulsoup4, python-whois
- SIEM Integration: Sumo Logic STIX 2.1 format compliance
Last Updated: May 2026
Version: 1.0
Status: ✅ Production Ready