park-intel

Self-hosted market intelligence pipeline -- collect, enrich, and surface trading signals from 10+ sources

What It Does

park-intel is a self-hosted market intelligence pipeline. It collects articles from 10+ source types (RSS, Hacker News, Reddit, GitHub, and more), enriches them with keyword tagging and optional LLM-based relevance scoring, clusters related articles into narrative events, and serves everything through a REST API with a feed-first frontend.

Core sources work out of the box with zero API keys. Optional sources (Xueqiu, LLM tagging) activate when you add their credentials.

Quick Start

git clone https://github.com/zinan92/intel.git park-intel
cd park-intel

# Backend
python -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt
cp .env.example .env          # edit .env to add optional API keys
python main.py                # starts on http://localhost:8001

# Frontend
cd frontend && npm install && npm run dev
# open http://localhost:5174

The built-in scheduler starts collecting automatically. Check http://localhost:8001/health to see source status.

Core vs Optional Sources

Source	Key Required	Env Var	Notes
RSS Feeds (50+)	No	--	Blogs, newsletters, tech and crypto media
Hacker News	No	--	Algolia API, score >= 20 filter
Reddit	No	--	13 subreddits via RSS
GitHub Trending	No	--	Keyword-filtered trending repos
Yahoo Finance	No	--	Ticker news via yfinance
Google News	No	--	Query-driven news aggregation
GitHub Releases	Optional	`GITHUB_TOKEN`	Increases API rate limit
Xueqiu (Chinese market)	Yes	`XUEQIU_COOKIE`	Chinese market KOL commentary
Social KOL	Optional	--	Requires `clawfeed` CLI installed
LLM Tagging	Optional	`ANTHROPIC_API_KEY`	AI relevance scoring + narrative tags

Without ANTHROPIC_API_KEY, articles are still collected and keyword-tagged; they just won't have LLM-based relevance scores or narrative tags.
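
To make the "activate when you add their credentials" behavior concrete, here is a minimal sketch of how optional features could be detected from the environment. The env var names come from the table above; the function name and feature labels are hypothetical, and the project's own logic in config.py may differ.

```python
import os
from typing import Mapping

# Env var names are taken from the source table; the labels are illustrative only.
OPTIONAL_FEATURES = {
    "GITHUB_TOKEN": "github_releases_higher_rate_limit",
    "XUEQIU_COOKIE": "xueqiu",
    "ANTHROPIC_API_KEY": "llm_tagging",
}


def enabled_optional_features(env: Mapping[str, str] = os.environ) -> list[str]:
    """Return labels of optional features whose env var is set and non-blank."""
    return [
        label
        for var, label in OPTIONAL_FEATURES.items()
        if env.get(var, "").strip()
    ]
```

Calling `enabled_optional_features()` with no arguments reads the current process environment; passing a plain dict makes the check easy to test.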

Architecture

Source Registry (DB)
       |
       v
   Adapters  -->  Collectors (fetch + dedup)  -->  SQLite
                                                      |
                        Keyword Tagger (13 categories) |
                        Ticker Extractor ($NVDA, etc.) |
                                                      v
                                              LLM Tagger (optional)
                                              relevance_score 1-5
                                              narrative_tags
                                                      |
                                                      v
                                          Event Aggregator (48h window)
                                          cross-source clustering
                                          signal scoring
                                                      |
                                                      v
                                              FastAPI REST API
                                             /api/* + /api/ui/*
                                                      |
                              +-----------+-----------+-----------+
                              |           |           |           |
                           React UI   Quant Bridge  User        Health
                           Feed +     price impact  profiles    Dashboard
                           Events     from ext.     topic       /health
                                      service       weights
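
The Ticker Extractor stage in the diagram picks up symbols such as $NVDA. As a rough illustration only, cashtag extraction might look like the sketch below. The regex (a dollar sign followed by one to five uppercase letters) and the function name are assumptions, not the project's actual rules.

```python
import re

# Assumed pattern: "$" followed by 1-5 uppercase letters, ending at a word boundary.
CASHTAG_RE = re.compile(r"\$([A-Z]{1,5})\b")


def extract_tickers(text: str) -> list[str]:
    """Return unique cashtag symbols in order of first appearance."""
    seen: dict[str, None] = {}  # dict preserves insertion order
    for symbol in CASHTAG_RE.findall(text):
        seen.setdefault(symbol, None)
    return list(seen)
```

Plain dollar amounts such as "$5" are ignored because the pattern only accepts letters after the dollar sign.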

Data flow: Sources are registered in a database table rather than in config files, and the scheduler runs one job per source type. Collectors fetch, deduplicate, and auto-tag articles on ingest. An optional LLM tagger then scores relevance and generates narrative labels. Finally, the event aggregator clusters articles that share a narrative tag within a 48-hour window and computes a signal score for each cluster (source count × average relevance).
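
The signal score described above can be sketched as follows. This is a simplified illustration under stated assumptions: "source count" is taken to mean the number of distinct sources in a cluster, the 48-hour window is measured back from a single `now` timestamp, and the `Article` class and `signal_scores` function are hypothetical names rather than the project's own.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Article:
    source: str            # e.g. "rss", "reddit"
    narrative_tag: str     # label produced by the LLM tagger
    relevance_score: int   # 1-5
    published_at: datetime


def signal_scores(articles, now, window=timedelta(hours=48)):
    """Group in-window articles by narrative tag and score each group."""
    groups = defaultdict(list)
    for article in articles:
        if now - window <= article.published_at <= now:
            groups[article.narrative_tag].append(article)

    scores = {}
    for tag, items in groups.items():
        source_count = len({a.source for a in items})  # distinct sources (assumed)
        avg_relevance = sum(a.relevance_score for a in items) / len(items)
        scores[tag] = source_count * avg_relevance
    return scores
```

Under this reading, a narrative covered by several independent sources outranks one repeated many times by a single source, even at the same average relevance.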

Health Dashboard

The /health endpoint shows per-source status including last collection time, article counts, error rates, and volume anomalies. When running the frontend, navigate to the health page to see a visual overview.

Run as Background Service (macOS)

Optional: run park-intel as a persistent background service using launchd. The service auto-restarts on crash.

./scripts/install-service.sh    # installs LaunchAgent and starts the service
./scripts/service-status.sh     # check if the service is running
./scripts/uninstall-service.sh  # stop and remove the service

Logs go to the logs/ directory with automatic rotation.

API Endpoints

Core Data

Endpoint	Description
`GET /api/health`	Per-source health status (registry-driven)
`GET /api/articles/latest`	Recent articles `?limit=20&source=rss&min_relevance=4`
`GET /api/articles/search`	Keyword search `?q=bitcoin&days=7`
`GET /api/articles/digest`	Articles grouped by source with top tags
`GET /api/articles/signals`	Topic heat + narrative momentum `?hours=24`
`GET /api/articles/sources`	Historical source statistics
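
As a small illustration of the query parameters shown in the table, the sketch below builds a request URL for `/api/articles/latest`. The base URL is the default from Quick Start; the helper name is hypothetical, and the response format is not documented here, so the sketch stops at constructing the URL.

```python
from urllib.parse import urlencode

BASE_URL = "http://localhost:8001"  # default backend address from Quick Start


def latest_articles_url(limit=20, source=None, min_relevance=None, base_url=BASE_URL):
    """Build the URL for GET /api/articles/latest, skipping unset filters."""
    params = {"limit": limit}
    if source is not None:
        params["source"] = source
    if min_relevance is not None:
        params["min_relevance"] = min_relevance
    return f"{base_url}/api/articles/latest?{urlencode(params)}"
```

The resulting string can be passed to any HTTP client, for example `urllib.request.urlopen`, once the backend is running.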

Frontend Read Model

Endpoint	Description
`GET /api/ui/feed`	Priority-scored feed `?user=myname&window=24h`
`GET /api/ui/items/{id}`	Article detail with related items
`GET /api/ui/topics`	Topic list
`GET /api/ui/sources`	Active source list
`GET /api/ui/search`	Frontend search `?q=openai`

Events

Endpoint	Description
`GET /api/events/active`	Active events ranked by signal score
`GET /api/events/{id}`	Event detail with article timeline + price impacts
`GET /api/events/history`	Closed events archive `?tag=btc&days=30`

Users

Endpoint	Description
`POST /api/users`	Create user profile
`GET /api/users/{username}`	Get user profile and topic weights
`PUT /api/users/{username}/weights`	Update topic weights (0.0-3.0 per topic)
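
Before sending a weights update, a client may want to check the documented 0.0-3.0 range locally. The sketch below assumes weights are a mapping of topic name to number; the exact request body shape and the topic names are not specified here, so treat both as hypothetical.

```python
MIN_WEIGHT = 0.0
MAX_WEIGHT = 3.0


def validate_topic_weights(weights):
    """Return weights as floats, or raise ValueError if any is outside 0.0-3.0."""
    out_of_range = {
        topic: value
        for topic, value in weights.items()
        if not (MIN_WEIGHT <= value <= MAX_WEIGHT)
    }
    if out_of_range:
        raise ValueError(f"weights outside {MIN_WEIGHT}-{MAX_WEIGHT}: {out_of_range}")
    return {topic: float(value) for topic, value in weights.items()}
```

The server presumably enforces the same range, so this is only a convenience check on the client side.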

Development

# Run tests
pytest tests/

# Run in development mode (auto-reload on file changes)
PARK_INTEL_DEV=1 python main.py

# Run collectors manually
python scripts/run_collectors.py                # all sources
python scripts/run_collectors.py --source reddit # single source

# Run LLM tagger
python scripts/run_llm_tagger.py --limit 10     # score 10 unscored articles
python scripts/run_llm_tagger.py --backfill     # backfill historical articles

# Backfill ticker extraction
python scripts/backfill_tickers.py

Project Structure

main.py                  # FastAPI entry point (port 8001)
config.py                # Source seed data, collector config, env loading
scheduler.py             # Registry-driven APScheduler
sources/                 # Source registry, adapters, seeding
collectors/              # 10 source-type collectors (BaseCollector pattern)
events/                  # Event aggregation (48h clustering, narratives)
tagging/                 # Keyword tagger, LLM tagger, ticker extractor
users/                   # User profiles and topic weights
bridge/                  # Quant bridge (price impact from external service)
api/                     # REST API routes
db/                      # SQLAlchemy models, migrations, database init
frontend/                # React + TypeScript + Vite frontend
scripts/                 # Management and utility scripts
tests/                   # 290+ pytest tests

Contributing

1. Fork the repository
2. Create a feature branch (git checkout -b feature/my-feature)
3. Make your changes and add tests
4. Run the test suite (pytest tests/)
5. Commit and push (git push origin feature/my-feature)
6. Open a Pull Request

License

MIT
