# park-intel

Self-hosted market intelligence pipeline -- collect, enrich, and surface trading signals from 10+ sources.
park-intel is a self-hosted market intelligence pipeline. It collects articles from 10+ source types (RSS, Hacker News, Reddit, GitHub, and more), enriches them with keyword tagging and optional LLM-based relevance scoring, clusters related articles into narrative events, and serves everything through a REST API with a feed-first frontend.
Core sources work out of the box with zero API keys. Optional sources (Xueqiu, LLM tagging) activate when you add their credentials.
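The optional env vars match the sources table below. A minimal `.env` might look like this (all values are placeholders, not real credentials):

```bash
# Optional -- raises the GitHub API rate limit for release polling
GITHUB_TOKEN=ghp_your_token_here

# Required only for the Xueqiu collector
XUEQIU_COOKIE=your_xueqiu_session_cookie

# Required only for LLM relevance scoring and narrative tags
ANTHROPIC_API_KEY=sk-ant-your-key-here
```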
## Quick Start

```bash
git clone https://github.com/zinan92/intel.git park-intel
cd park-intel

# Backend
python -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt
cp .env.example .env    # edit .env to add optional API keys
python main.py          # starts on http://localhost:8001

# Frontend
cd frontend && npm install && npm run dev
# open http://localhost:5174
```

The built-in scheduler starts collecting automatically. Check http://localhost:8001/health to see source status.
## Sources

| Source | Key Required | Env Var | Notes |
|---|---|---|---|
| RSS Feeds (50+) | No | -- | Blogs, newsletters, tech and crypto media |
| Hacker News | No | -- | Algolia API, score >= 20 filter |
| Reddit | No | -- | 13 subreddits via RSS |
| GitHub Trending | No | -- | Keyword-filtered trending repos |
| Yahoo Finance | No | -- | Ticker news via yfinance |
| Google News | No | -- | Query-driven news aggregation |
| GitHub Releases | Optional | GITHUB_TOKEN | Increases API rate limit |
| Xueqiu (Chinese market) | Yes | XUEQIU_COOKIE | Chinese market KOL commentary |
| Social KOL | Optional | -- | Requires clawfeed CLI installed |
| LLM Tagging | Optional | ANTHROPIC_API_KEY | AI relevance scoring + narrative tags |
Without ANTHROPIC_API_KEY, articles still collect and get keyword tags -- they just won't have LLM-based relevance scores or narrative tags.
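The keyword-tag fallback can be pictured as simple set intersection between an article's words and per-category keyword sets. The category names and keywords below are illustrative stand-ins -- the real tagger's 13 categories live in `tagging/` and may work differently:

```python
# Hypothetical keyword map -- the actual 13 categories are defined in tagging/
KEYWORD_CATEGORIES = {
    "ai": {"llm", "gpt", "anthropic", "inference"},
    "crypto": {"bitcoin", "btc", "ethereum", "defi"},
    "macro": {"fed", "inflation", "rates"},
}

def keyword_tags(text: str) -> list[str]:
    """Return every category whose keyword set intersects the article text."""
    words = set(text.lower().split())
    return sorted(cat for cat, kws in KEYWORD_CATEGORIES.items() if kws & words)
```

This kind of tagging is cheap enough to run on every article at ingest time, which is why it works without any API key.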
## Architecture

```
Source Registry (DB)
        |
        v
Adapters --> Collectors (fetch + dedup) --> SQLite
        |
        v
Keyword Tagger (13 categories)
Ticker Extractor ($NVDA, etc.)
        |
        v
LLM Tagger (optional)
  relevance_score 1-5
  narrative_tags
        |
        v
Event Aggregator (48h window)
  cross-source clustering
  signal scoring
        |
        v
FastAPI REST API
  /api/* + /api/ui/*
        |
   +-----------+------------+-----------+
   |           |            |           |
React UI   Quant Bridge   User        Health
Feed +     price impact   profiles    Dashboard
Events     from ext.      topic       /health
           service        weights
```
Data flow: Sources are registered in a database table (not config files). The scheduler runs one job per source type. Collectors fetch, deduplicate, and auto-tag articles on ingest. An optional LLM tagger scores relevance and generates narrative labels. The event aggregator clusters articles sharing the same narrative tag within 48-hour windows, computing a signal score (source count x avg relevance).
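The aggregation step above can be sketched as: keep articles inside the 48-hour window, group them by narrative tag, and score each cluster as distinct source count times average relevance. The function and field names here are illustrative, not the actual `events/` API:

```python
from collections import defaultdict
from datetime import datetime, timedelta

WINDOW = timedelta(hours=48)

def aggregate_events(articles: list[dict], now: datetime) -> list[dict]:
    """Cluster recent articles by narrative tag and compute a signal score.

    Each article dict (assumed shape): {"source": str, "narrative_tag": str,
    "relevance": int 1-5, "published": datetime}.
    Signal score = distinct source count x average relevance.
    """
    clusters: dict[str, list[dict]] = defaultdict(list)
    for art in articles:
        if now - art["published"] <= WINDOW:  # drop articles outside the window
            clusters[art["narrative_tag"]].append(art)

    events = []
    for tag, arts in clusters.items():
        sources = {a["source"] for a in arts}
        avg_rel = sum(a["relevance"] for a in arts) / len(arts)
        events.append({"tag": tag, "articles": len(arts),
                       "signal_score": len(sources) * avg_rel})
    # Highest-signal events first, as served by /api/events/active
    return sorted(events, key=lambda e: e["signal_score"], reverse=True)
```

Weighting by distinct sources (rather than raw article count) means a story echoed across RSS, Hacker News, and Reddit outranks one source posting repeatedly.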
The /health endpoint shows per-source status including last collection time, article counts, error rates, and volume anomalies. When running the frontend, navigate to the health page to see a visual overview.
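For scripting against the health endpoint, a small checker might flag sources that look stale or error-prone. The payload shape below is an assumption for illustration, not the documented `/health` schema:

```python
from datetime import datetime, timedelta

STALE_AFTER = timedelta(hours=6)   # assumed threshold, not from the project
MAX_ERROR_RATE = 0.2               # assumed threshold, not from the project

def unhealthy_sources(health: dict, now: datetime) -> list[str]:
    """Return source names that look stale or error-prone.

    Assumes a payload like
    {"sources": [{"name": ..., "last_collected": iso8601, "error_rate": float}]}
    -- an illustrative shape; check the real /health response.
    """
    flagged = []
    for src in health["sources"]:
        last = datetime.fromisoformat(src["last_collected"])
        if now - last > STALE_AFTER or src["error_rate"] > MAX_ERROR_RATE:
            flagged.append(src["name"])
    return flagged
```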
## Running as a Service

Optional: run park-intel as a persistent background service using launchd. The service auto-restarts on crash.

```bash
./scripts/install-service.sh    # installs LaunchAgent and starts the service
./scripts/service-status.sh     # check if the service is running
./scripts/uninstall-service.sh  # stop and remove the service
```

Logs go to the logs/ directory with automatic rotation.
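The install script presumably writes a LaunchAgent along these lines (label, paths, and interpreter here are guesses, not the script's actual output); `KeepAlive` is the launchd key that provides the auto-restart-on-crash behavior:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN"
  "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
  <key>Label</key><string>com.park-intel.server</string>
  <key>WorkingDirectory</key><string>/path/to/park-intel</string>
  <key>ProgramArguments</key>
  <array>
    <string>/usr/bin/env</string>
    <string>python3</string>
    <string>main.py</string>
  </array>
  <!-- restart the process whenever it exits -->
  <key>KeepAlive</key><true/>
</dict>
</plist>
```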
## API

### Articles

| Endpoint | Description |
|---|---|
| `GET /api/health` | Per-source health status (registry-driven) |
| `GET /api/articles/latest` | Recent articles (`?limit=20&source=rss&min_relevance=4`) |
| `GET /api/articles/search` | Keyword search (`?q=bitcoin&days=7`) |
| `GET /api/articles/digest` | Articles grouped by source with top tags |
| `GET /api/articles/signals` | Topic heat + narrative momentum (`?hours=24`) |
| `GET /api/articles/sources` | Historical source statistics |
### UI

| Endpoint | Description |
|---|---|
| `GET /api/ui/feed` | Priority-scored feed (`?user=myname&window=24h`) |
| `GET /api/ui/items/{id}` | Article detail with related items |
| `GET /api/ui/topics` | Topic list |
| `GET /api/ui/sources` | Active source list |
| `GET /api/ui/search` | Frontend search (`?q=openai`) |
### Events

| Endpoint | Description |
|---|---|
| `GET /api/events/active` | Active events ranked by signal score |
| `GET /api/events/{id}` | Event detail with article timeline + price impacts |
| `GET /api/events/history` | Closed events archive (`?tag=btc&days=30`) |
### Users

| Endpoint | Description |
|---|---|
| `POST /api/users` | Create user profile |
| `GET /api/users/{username}` | Get user profile and topic weights |
| `PUT /api/users/{username}/weights` | Update topic weights (0.0-3.0 per topic) |
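A client sketch for the weights endpoint, using only the standard library. It validates the documented 0.0-3.0 range per topic and builds (but does not send) the request; the JSON body shape is an assumption, so check the API for the exact schema:

```python
import json
from urllib import request

BASE = "http://localhost:8001"

def build_weight_update(username: str, weights: dict[str, float]) -> request.Request:
    """Build the PUT request for /api/users/{username}/weights.

    Enforces the documented 0.0-3.0 per-topic range client-side.
    The {"weights": ...} body shape is a guess at the schema.
    """
    for topic, w in weights.items():
        if not 0.0 <= w <= 3.0:
            raise ValueError(f"weight for {topic!r} must be in [0.0, 3.0], got {w}")
    return request.Request(
        url=f"{BASE}/api/users/{username}/weights",
        data=json.dumps({"weights": weights}).encode(),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )

# Send with: urllib.request.urlopen(build_weight_update("myname", {"ai": 2.5}))
```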
## Development

```bash
# Run tests
pytest tests/

# Run in development mode (auto-reload on file changes)
PARK_INTEL_DEV=1 python main.py

# Run collectors manually
python scripts/run_collectors.py                   # all sources
python scripts/run_collectors.py --source reddit   # single source

# Run LLM tagger
python scripts/run_llm_tagger.py --limit 10    # score 10 unscored articles
python scripts/run_llm_tagger.py --backfill    # backfill historical articles

# Backfill ticker extraction
python scripts/backfill_tickers.py
```

## Project Structure

```
main.py         # FastAPI entry point (port 8001)
config.py       # Source seed data, collector config, env loading
scheduler.py    # Registry-driven APScheduler
sources/        # Source registry, adapters, seeding
collectors/     # 10 source-type collectors (BaseCollector pattern)
events/         # Event aggregation (48h clustering, narratives)
tagging/        # Keyword tagger, LLM tagger, ticker extractor
users/          # User profiles and topic weights
bridge/         # Quant bridge (price impact from external service)
api/            # REST API routes
db/             # SQLAlchemy models, migrations, database init
frontend/       # React + TypeScript + Vite frontend
scripts/        # Management and utility scripts
tests/          # 290+ pytest tests
```
## Contributing

- Fork the repository
- Create a feature branch (`git checkout -b feature/my-feature`)
- Make your changes and add tests
- Run the test suite (`pytest tests/`)
- Commit and push (`git push origin feature/my-feature`)
- Open a Pull Request
## License

MIT
