author TheSiahxyz <164138827+TheSiahxyz@users.noreply.github.com> 2026-04-02 13:31:40 +0900
committer TheSiahxyz <164138827+TheSiahxyz@users.noreply.github.com> 2026-04-02 13:31:40 +0900
commit fea55160794bcd048496b2ee3e282180d8869e2a
tree cef1625b30a894a9323782c63bbea4ffc2b757f7 /docs/superpowers
parent e2344bc310817db763f697df3fb2a9be987ce6b0
docs: add news-driven stock selector design spec
Design for upgrading MOC strategy from fixed symbols to dynamic, news-driven stock selection with sentiment analysis pipeline.
Diffstat (limited to 'docs/superpowers')
-rw-r--r-- docs/superpowers/specs/2026-04-02-news-driven-stock-selector-design.md 418
1 files changed, 418 insertions, 0 deletions
diff --git a/docs/superpowers/specs/2026-04-02-news-driven-stock-selector-design.md b/docs/superpowers/specs/2026-04-02-news-driven-stock-selector-design.md
new file mode 100644
index 0000000..d439154
--- /dev/null
+++ b/docs/superpowers/specs/2026-04-02-news-driven-stock-selector-design.md
@@ -0,0 +1,418 @@
+# News-Driven Stock Selector Design
+
+**Date:** 2026-04-02
+**Goal:** Upgrade the MOC (Market on Close) strategy from fixed symbol lists to dynamic, news-driven stock selection. The system collects news/sentiment data continuously, then selects 2-3 optimal stocks daily before market close.
+
+---
+
+## Architecture Overview
+
+```
+[Continuous Collection]                                    [Pre-Close Decision]
+Finnhub News  ─┐
+RSS Feeds     ─┤
+SEC EDGAR     ─┤
+Truth Social  ─┼→ DB (news_items)  → Sentiment Aggregator → symbol_scores
+Reddit        ─┤  + Redis "news"     (every 15 min)         market_sentiment
+Fear & Greed  ─┤
+FOMC/Fed      ─┘
+
+                  15:00 ET ─→ Candidate Pool (sentiment top + LLM picks)
+                  15:15 ET ─→ Technical Filter (RSI, EMA, volume)
+                  15:30 ET ─→ LLM Final Selection (2-3 stocks) → Telegram
+                  15:50 ET ─→ MOC Buy Execution
+                  09:35 ET ─→ Next-day Sell (existing MOC logic)
+```
+
+## 1. News Collector Service
+
+New service: `services/news-collector/`
+
+### Structure
+
+```
+services/news-collector/
+├── Dockerfile
+├── pyproject.toml
+├── src/news_collector/
+│   ├── __init__.py
+│   ├── main.py              # Scheduler: runs each collector on its interval
+│   ├── config.py
+│   └── collectors/
+│       ├── __init__.py
+│       ├── base.py          # BaseCollector ABC
+│       ├── finnhub.py       # Finnhub market news (free, 60 req/min)
+│       ├── rss.py           # Yahoo Finance, Google News, MarketWatch RSS
+│       ├── sec_edgar.py     # SEC EDGAR 8-K/10-Q filings
+│       ├── truth_social.py  # Truth Social scraping (Trump posts)
+│       ├── reddit.py        # Reddit (r/wallstreetbets, r/stocks)
+│       ├── fear_greed.py    # CNN Fear & Greed Index scraping
+│       └── fed.py           # FOMC statements, Fed announcements
+└── tests/
+```
+
+### BaseCollector Interface
+
+```python
+from abc import ABC, abstractmethod
+
+from shared.models import NewsItem
+
+
+class BaseCollector(ABC):
+    name: str
+    poll_interval: int  # seconds
+
+    @abstractmethod
+    async def collect(self) -> list[NewsItem]:
+        """Collect and return list of NewsItem."""
+
+    @abstractmethod
+    async def is_available(self) -> bool:
+        """Check if this source is accessible (API key present, endpoint reachable)."""
+```
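A minimal sketch of a concrete collector under this interface. `StaticCollector`, `run_once`, and the trimmed `NewsItem` dataclass are hypothetical stand-ins (not part of the spec) so the snippet runs without the shared package or network access.

```python
import asyncio
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class NewsItem:
    """Trimmed stand-in for shared.models.NewsItem (illustration only)."""
    source: str
    headline: str


class BaseCollector(ABC):
    name: str
    poll_interval: int  # seconds

    @abstractmethod
    async def collect(self) -> list[NewsItem]: ...

    @abstractmethod
    async def is_available(self) -> bool: ...


class StaticCollector(BaseCollector):
    """Hypothetical collector that returns canned headlines instead of calling an API."""
    name = "static"
    poll_interval = 300

    def __init__(self, headlines: list[str]) -> None:
        self._headlines = headlines

    async def collect(self) -> list[NewsItem]:
        return [NewsItem(source=self.name, headline=h) for h in self._headlines]

    async def is_available(self) -> bool:
        # A real collector would check for an API key or probe the endpoint here.
        return bool(self._headlines)


async def run_once(collector: BaseCollector) -> list[NewsItem]:
    """Skip an unavailable source instead of failing the whole cycle."""
    if not await collector.is_available():
        return []
    return await collector.collect()


items = asyncio.run(run_once(StaticCollector(["Fed holds rates steady"])))
```

Checking `is_available()` before `collect()` lets the scheduler degrade gracefully when one source is down or unconfigured.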
+
+### Poll Intervals
+
+| Collector | Interval | Notes |
+|-----------|----------|-------|
+| Finnhub | 5 min | Free tier: 60 calls/min |
+| RSS (Yahoo/Google/MarketWatch) | 10 min | Headlines only |
+| SEC EDGAR | 30 min | Focus on 8-K filings |
+| Truth Social | 15 min | Scraping |
+| Reddit | 15 min | Hot posts from relevant subs |
+| Fear & Greed | 1 hour | Updates once daily but check periodically |
+| FOMC/Fed | 1 hour | Infrequent events |
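One way the scheduler in `main.py` could decide which collectors to run on each tick is sketched below. The dictionary keys and the `due_collectors` helper are assumptions for illustration; only the interval values come from the table.

```python
# Intervals in seconds, copied from the table above. The keys are hypothetical
# collector names chosen to match the module names in the directory tree.
POLL_INTERVALS = {
    "finnhub": 5 * 60,
    "rss": 10 * 60,
    "sec_edgar": 30 * 60,
    "truth_social": 15 * 60,
    "reddit": 15 * 60,
    "fear_greed": 60 * 60,
    "fed": 60 * 60,
}


def due_collectors(now: float, last_run: dict[str, float]) -> list[str]:
    """Return collectors whose interval has elapsed since their last run.

    A collector that has never run is always due.
    """
    due = []
    for name, interval in POLL_INTERVALS.items():
        last = last_run.get(name)
        if last is None or now - last >= interval:
            due.append(name)
    return due
```

Keeping the decision in a pure function makes the schedule easy to unit test without sleeping or mocking the clock.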
+
+### Provider Abstraction (for paid upgrade path)
+
+```python
+# config.yaml
+collectors:
+  news:
+    provider: "finnhub"  # swap to "benzinga" for paid
+    api_key: ${FINNHUB_API_KEY}
+  social:
+    provider: "reddit"  # swap to "stocktwits_pro" etc.
+  policy:
+    provider: "truth_social"  # swap to "twitter_api" etc.
+
+# Factory
+COLLECTOR_REGISTRY = {
+    "finnhub": FinnhubCollector,
+    "rss": RSSCollector,
+    "benzinga": BenzingaCollector,  # added later
+}
+```
+
+## 2. Shared Models (additions to shared/)
+
+### NewsItem (shared/models.py)
+
+```python
+import uuid
+from datetime import datetime, timezone
+from enum import Enum
+
+from pydantic import BaseModel, Field
+
+
+class NewsCategory(str, Enum):
+    POLICY = "policy"
+    EARNINGS = "earnings"
+    MACRO = "macro"
+    SOCIAL = "social"
+    FILING = "filing"
+    FED = "fed"
+
+
+class NewsItem(BaseModel):
+    id: str = Field(default_factory=lambda: str(uuid.uuid4()))
+    source: str  # "finnhub", "rss", "sec_edgar", etc.
+    headline: str
+    summary: str | None = None
+    url: str | None = None
+    published_at: datetime
+    symbols: list[str] = []  # Related tickers (if identifiable)
+    sentiment: float  # -1.0 to 1.0 (first-pass analysis at collection)
+    category: NewsCategory
+    raw_data: dict = {}
+    created_at: datetime = Field(default_factory=lambda: datetime.now(timezone.utc))
+```
+
+### SymbolScore (shared/sentiment_models.py — new file)
+
+```python
+from datetime import datetime
+
+from pydantic import BaseModel
+
+from shared.models import OrderSide  # assumed location of the existing enum
+
+
+class SymbolScore(BaseModel):
+    symbol: str
+    news_score: float  # -1.0 to 1.0, weighted avg of news sentiment
+    news_count: int  # Number of news items in last 24h
+    social_score: float  # Reddit/social sentiment
+    policy_score: float  # Policy-related impact
+    filing_score: float  # SEC filing impact
+    composite: float  # Weighted final score
+    updated_at: datetime
+
+
+class MarketSentiment(BaseModel):
+    fear_greed: int  # 0-100
+    fear_greed_label: str  # "Extreme Fear", "Fear", "Neutral", "Greed", "Extreme Greed"
+    vix: float | None = None
+    fed_stance: str  # "hawkish", "neutral", "dovish"
+    market_regime: str  # "risk_on", "neutral", "risk_off"
+    updated_at: datetime
+
+
+class SelectedStock(BaseModel):
+    symbol: str
+    side: OrderSide  # BUY or SELL
+    conviction: float  # 0.0 to 1.0
+    reason: str  # Selection rationale
+    key_news: list[str]  # Key news headlines
+
+
+class Candidate(BaseModel):
+    symbol: str
+    source: str  # "sentiment" or "llm"
+    direction: OrderSide | None = None  # Suggested direction (if known)
+    score: float  # Relevance/priority score
+    reason: str  # Why this candidate was selected
+```
+
+## 3. Sentiment Analysis Pipeline
+
+### Location
+
+Refactor existing `shared/src/shared/sentiment.py`.
+
+### Two-Stage Analysis
+
+**Stage 1: Per-news sentiment (at collection time)**
+- VADER (nltk.sentiment, free) for English headlines
+- Keyword rule engine for domain-specific terms (e.g., "tariff" → negative for importers, positive for domestic producers)
+- Score stored in `NewsItem.sentiment`
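The keyword rule engine in Stage 1 could look roughly like the sketch below. The keyword table and weights are invented for illustration, since the spec does not define them. `base_score` stands in for a general-purpose score such as VADER's compound value, so the example needs no third-party package. The per-symbol direction mentioned above (a tariff hurting importers but helping domestic producers) is deliberately left out.

```python
# Hypothetical keyword adjustments; the spec does not define terms or weights.
KEYWORD_ADJUSTMENTS = {
    "tariff": -0.3,
    "subsidy": 0.3,
    "bankruptcy": -0.6,
    "beats estimates": 0.4,
}


def score_headline(headline: str, base_score: float = 0.0) -> float:
    """Add keyword adjustments to a base sentiment and clamp to [-1.0, 1.0]."""
    text = headline.lower()
    adjustment = sum(
        weight for keyword, weight in KEYWORD_ADJUSTMENTS.items() if keyword in text
    )
    return max(-1.0, min(1.0, base_score + adjustment))
```

Clamping keeps the result inside the range that `NewsItem.sentiment` documents, even when several keywords fire on one headline.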
+
+**Stage 2: Per-symbol aggregation (every 15 minutes)**
+
+```
+composite = (
+    news_score * 0.3 +
+    social_score * 0.2 +
+    policy_score * 0.3 +
+    filing_score * 0.2
+) * freshness_decay
+```
+
+Freshness decay:
+- < 1 hour: 1.0
+- 1-6 hours: 0.7
+- 6-24 hours: 0.3
+- > 24 hours: excluded
+
+The policy score is weighted heavily because the US stock market is strongly influenced by policy events (tariffs, regulation, subsidies).
+
+### Market-Level Gating
+
+`MarketSentiment.market_regime` determination:
+- `risk_off`: Fear & Greed < 20 OR VIX > 30 → **block all trades**
+- `risk_on`: Fear & Greed > 60 AND VIX < 20
+- `neutral`: everything else
+
+This extends the existing `sentiment.py` `should_block()` logic.
+
+## 4. Stock Selector Engine
+
+### Location
+
+`services/strategy-engine/src/strategy_engine/stock_selector.py`
+
+### Three-Stage Selection Process
+
+**Stage 1: Candidate Pool (15:00 ET)**
+
+Two candidate sources, results merged (deduplicated):
+
+```python
+class CandidateSource(ABC):
+    @abstractmethod
+    async def get_candidates(self) -> list[Candidate]:
+        ...
+
+
+class SentimentCandidateSource(CandidateSource):
+    """Top N symbols by composite SymbolScore from DB."""
+
+
+class LLMCandidateSource(CandidateSource):
+    """Send today's top news summary to Claude, get related symbols + direction."""
+```
+
+- SentimentCandidateSource: top 20 by composite score
+- LLMCandidateSource: Claude analyzes today's major news and recommends affected symbols
+- Merged pool: typically 20-30 candidates
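A sketch of the merge step. The spec says only that results are deduplicated; keeping the higher-scoring entry per symbol is an assumption, and it presumes both sources report scores on a comparable scale. `merge_candidates` and the dataclass are illustrative stand-ins.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """Trimmed stand-in for the Candidate model (illustration only)."""
    symbol: str
    source: str  # "sentiment" or "llm"
    score: float
    reason: str


def merge_candidates(*pools: list[Candidate]) -> list[Candidate]:
    """Merge pools, keep the best-scoring entry per symbol, sort by score descending."""
    best: dict[str, Candidate] = {}
    for pool in pools:
        for candidate in pool:
            current = best.get(candidate.symbol)
            if current is None or candidate.score > current.score:
                best[candidate.symbol] = candidate
    return sorted(best.values(), key=lambda c: c.score, reverse=True)
```

If the LLM scores turn out not to be comparable with composite scores, a rank-based merge would be a safer alternative.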
+
+**Stage 2: Technical Filter (15:15 ET)**
+
+Apply existing MOC screening criteria to candidates:
+- Fetch recent price data from Alpaca for all candidates
+- RSI 30-60
+- Price > 20-period EMA
+- Volume > average
+- Bullish candle pattern
+- Result: typically 5-10 survivors
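The numeric parts of the filter can be sketched as below, under several assumptions the spec does not state: a 14-period RSI computed from simple averages (not Wilder smoothing), an EMA seeded with the first close, and "average volume" meaning the mean of all bars before the latest one. The candle-pattern check is omitted because it needs OHLC data. The existing MOC screening code may differ.

```python
def ema(values: list[float], period: int) -> float:
    """Exponential moving average, seeded with the first value."""
    alpha = 2 / (period + 1)
    result = values[0]
    for value in values[1:]:
        result = alpha * value + (1 - alpha) * result
    return result


def rsi(closes: list[float], period: int = 14) -> float:
    """Simple-average RSI over the last `period` price changes."""
    changes = [b - a for a, b in zip(closes, closes[1:])][-period:]
    gains = sum(c for c in changes if c > 0)
    losses = sum(-c for c in changes if c < 0)
    if gains + losses == 0:
        return 50.0  # flat series: treat as neutral
    return 100 * gains / (gains + losses)


def passes_filter(closes: list[float], volumes: list[float], ema_period: int = 20) -> bool:
    """Apply the RSI 30-60, price > EMA, and volume > average checks."""
    prior_volumes = volumes[:-1]
    average_volume = sum(prior_volumes) / len(prior_volumes)
    return (
        30 <= rsi(closes) <= 60
        and closes[-1] > ema(closes, ema_period)
        and volumes[-1] > average_volume
    )
```

The RSI expression is the usual `100 - 100 / (1 + RS)` rearranged, which avoids a division by zero when there are no losses.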
+
+**Stage 3: LLM Final Selection (15:30 ET)**
+
+Send to Claude:
+- Filtered candidate list with technical indicators
+- Per-symbol sentiment scores and top news headlines
+- Market sentiment (Fear & Greed, VIX, Fed stance)
+- Prompt: "Select 2-3 stocks for MOC trading with rationale"
+
+Response parsed into `list[SelectedStock]`.
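Parsing might look like the sketch below. It assumes the prompt instructs Claude to reply with a bare JSON array of objects carrying `symbol`, `side`, `conviction`, `reason`, and optionally `key_news`. The spec does not fix a response format, so the schema and the `parse_selection` helper are hypothetical.

```python
import json
from dataclasses import dataclass, field


@dataclass
class SelectedStock:
    """Stand-in for the SelectedStock model, with side as a plain string."""
    symbol: str
    side: str
    conviction: float
    reason: str
    key_news: list[str] = field(default_factory=list)


def parse_selection(raw: str, max_picks: int = 3) -> list[SelectedStock]:
    """Parse the LLM reply, drop invalid entries, and cap the number of picks."""
    picks = []
    for entry in json.loads(raw):
        side = str(entry["side"]).upper()
        conviction = float(entry["conviction"])
        # Discard entries the model filled in outside the documented ranges.
        if side not in {"BUY", "SELL"} or not 0.0 <= conviction <= 1.0:
            continue
        picks.append(
            SelectedStock(
                symbol=str(entry["symbol"]).upper(),
                side=side,
                conviction=conviction,
                reason=str(entry["reason"]),
                key_news=list(entry.get("key_news", [])),
            )
        )
    picks.sort(key=lambda p: p.conviction, reverse=True)
    return picks[:max_picks]
```

Validating and capping on the application side means a malformed or over-eager reply cannot push more than `SELECTOR_MAX_PICKS` trades downstream.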
+
+### Integration with MOC Strategy
+
+Current: MOC strategy receives candles for fixed symbols and decides internally.
+
+New flow:
+1. `StockSelector` publishes `SelectedStock` list to Redis stream `selected_stocks` at 15:30 ET
+2. MOC strategy reads `selected_stocks` to get today's targets
+3. MOC still applies its own technical checks at 15:50-16:00 as a safety net
+4. If a selected stock fails the final technical check, it's skipped (no forced trades)
+
+## 5. Database Schema
+
+Four new tables via Alembic migration:
+
+```sql
+CREATE TABLE news_items (
+    id UUID PRIMARY KEY,
+    source VARCHAR(50) NOT NULL,
+    headline TEXT NOT NULL,
+    summary TEXT,
+    url TEXT,
+    published_at TIMESTAMPTZ NOT NULL,
+    symbols TEXT[],
+    sentiment FLOAT NOT NULL,
+    category VARCHAR(50) NOT NULL,
+    raw_data JSONB DEFAULT '{}',
+    created_at TIMESTAMPTZ DEFAULT NOW()
+);
+CREATE INDEX idx_news_items_published ON news_items(published_at);
+CREATE INDEX idx_news_items_symbols ON news_items USING GIN(symbols);
+
+CREATE TABLE symbol_scores (
+    id UUID PRIMARY KEY,
+    symbol VARCHAR(10) NOT NULL,
+    news_score FLOAT NOT NULL DEFAULT 0,
+    news_count INT NOT NULL DEFAULT 0,
+    social_score FLOAT NOT NULL DEFAULT 0,
+    policy_score FLOAT NOT NULL DEFAULT 0,
+    filing_score FLOAT NOT NULL DEFAULT 0,
+    composite FLOAT NOT NULL DEFAULT 0,
+    updated_at TIMESTAMPTZ NOT NULL
+);
+CREATE UNIQUE INDEX idx_symbol_scores_symbol ON symbol_scores(symbol);
+
+CREATE TABLE market_sentiment (
+    id UUID PRIMARY KEY,
+    fear_greed INT NOT NULL,
+    fear_greed_label VARCHAR(30) NOT NULL,
+    vix FLOAT,
+    fed_stance VARCHAR(20) NOT NULL DEFAULT 'neutral',
+    market_regime VARCHAR(20) NOT NULL DEFAULT 'neutral',
+    updated_at TIMESTAMPTZ NOT NULL
+);
+
+CREATE TABLE stock_selections (
+    id UUID PRIMARY KEY,
+    trade_date DATE NOT NULL,
+    symbol VARCHAR(10) NOT NULL,
+    side VARCHAR(4) NOT NULL,
+    conviction FLOAT NOT NULL,
+    reason TEXT NOT NULL,
+    key_news JSONB DEFAULT '[]',
+    sentiment_snapshot JSONB DEFAULT '{}',
+    created_at TIMESTAMPTZ DEFAULT NOW()
+);
+CREATE INDEX idx_stock_selections_date ON stock_selections(trade_date);
+```
+
+`stock_selections` stores an audit trail: why each stock was selected, enabling post-hoc analysis of selection quality.
+
+## 6. Redis Streams
+
+| Stream | Producer | Consumer | Payload |
+|--------|----------|----------|---------|
+| `news` | news-collector | strategy-engine (sentiment aggregator) | NewsItem |
+| `selected_stocks` | stock-selector | MOC strategy | SelectedStock |
+
+Existing streams (`candles`, `signals`, `orders`) unchanged.
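Redis stream entries are flat field/value maps, so a `SelectedStock` payload needs some encoding before it is published. The sketch below JSON-encodes non-string values and writes one entry per stock. That encoding and the helper names are assumptions, since the spec names only the stream and the payload type. `client` can be any object exposing redis-py's `xadd(name, fields)` method.

```python
import json


def to_stream_fields(selection: dict) -> dict[str, str]:
    """Flatten a payload: strings pass through, everything else is JSON-encoded."""
    return {
        key: value if isinstance(value, str) else json.dumps(value)
        for key, value in selection.items()
    }


def publish_selection(client, selection: dict, stream: str = "selected_stocks"):
    """Append one selection to the stream and return the entry ID from xadd."""
    return client.xadd(stream, to_stream_fields(selection))
```

A consumer would reverse the encoding with `json.loads` on the non-string fields, so both sides need to agree on which fields those are.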
+
+## 7. Docker Compose Addition
+
+```yaml
+news-collector:
+  build:
+    context: .
+    dockerfile: services/news-collector/Dockerfile
+  env_file: .env
+  ports:
+    - "8084:8084"
+  depends_on:
+    redis: { condition: service_healthy }
+    postgres: { condition: service_healthy }
+  healthcheck:
+    test: ["CMD", "python", "-c", "import urllib.request; urllib.request.urlopen('http://localhost:8084/health')"]
+    interval: 10s
+    timeout: 5s
+    retries: 3
+  restart: unless-stopped
+```
+
+## 8. Environment Variables
+
+```bash
+# News Collector
+FINNHUB_API_KEY= # Free key from finnhub.io
+NEWS_POLL_INTERVAL=300 # Default 5 min (overrides per-collector defaults)
+SENTIMENT_AGGREGATE_INTERVAL=900 # 15 min
+
+# Stock Selector
+SELECTOR_CANDIDATES_TIME=15:00 # ET, candidate pool generation
+SELECTOR_FILTER_TIME=15:15 # ET, technical filter
+SELECTOR_FINAL_TIME=15:30 # ET, LLM final pick
+SELECTOR_MAX_PICKS=3
+
+# LLM (for stock selector + screener)
+ANTHROPIC_API_KEY=
+ANTHROPIC_MODEL=claude-sonnet-4-20250514
+```
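A sketch of how the selector could read its three schedule variables. `read_selector_schedule` is a hypothetical helper; the defaults mirror the values listed above, and rejecting out-of-order stages is an added safety check, not something the spec requires.

```python
import os
from collections.abc import Mapping
from datetime import time

# Defaults mirror the values in the env listing above.
SCHEDULE_DEFAULTS = {
    "SELECTOR_CANDIDATES_TIME": "15:00",
    "SELECTOR_FILTER_TIME": "15:15",
    "SELECTOR_FINAL_TIME": "15:30",
}


def read_selector_schedule(env: Mapping[str, str] = os.environ) -> dict[str, time]:
    """Parse the HH:MM stage times (interpreted as ET) and check their order."""
    schedule = {
        name: time.fromisoformat(env.get(name, default))
        for name, default in SCHEDULE_DEFAULTS.items()
    }
    stage_times = list(schedule.values())
    if stage_times != sorted(stage_times):
        raise ValueError("selector stages must run in order: candidates, filter, final")
    return schedule
```

The returned values are naive `time` objects; attaching the ET time zone is left to the scheduler.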
+
+## 9. Telegram Notifications
+
+Extend existing `shared/notifier.py` with:
+
+```python
+async def send_stock_selection(self, selections: list[SelectedStock], market: MarketSentiment):
+    """
+    📊 Today's Stock Picks (2/3)
+
+    1. NVDA 🟢 BUY (conviction: 0.85)
+       Rationale: Trump announces expanded semiconductor subsidies, RSI 42
+       Key news: "Trump signs CHIPS Act expansion..."
+
+    2. XOM 🟢 BUY (conviction: 0.72)
+       Rationale: Rising oil prices + earnings surprise, volume surge
+
+    Market sentiment: Fear & Greed 55 (Neutral) | VIX 18.2
+    """
+```
+
+## 10. Testing Strategy
+
+**Unit tests:**
+- Each collector: mock HTTP responses → verify NewsItem parsing
+- Sentiment analysis: verify VADER + keyword scoring
+- Aggregator: mock news data → verify SymbolScore calculation and freshness decay
+- Stock selector: mock scores → verify candidate/filter/selection pipeline
+- LLM calls: mock Claude response → verify SelectedStock parsing
+
+**Integration tests:**
+- Full pipeline: news collection → DB → aggregation → selection
+- Market gating: verify `risk_off` blocks all trades
+- MOC integration: verify selected stocks flow to MOC strategy
+
+**Post-hoc analysis (future):**
+- Use `stock_selections` audit trail to measure selection accuracy
+- Historical news data replay for backtesting requires paid data (deferred)
+
+## 11. Out of Scope (Future)
+
+- Paid API integration (designed for, not implemented)
+- Historical news backtesting
+- WebSocket real-time news streaming
+- Multi-language sentiment analysis
+- Options/derivatives signals