Product Requirements Document (PRD)¶
DSTA — Dr. Strange Trading Analysis¶
Version: 2.0
Last Updated: 2026-05-20
Status: Active Development
1. Executive Summary¶
1.1 Vision¶
DSTA (Dr. Strange Trading Analysis) is a personal, end-to-end cryptocurrency trading platform built as a microservices monorepo. It covers the full lifecycle from raw market data ingestion through strategy backtesting, ML-driven signal generation, live order execution, and portfolio monitoring — all running on self-hosted infrastructure.
The platform is evolving across four sequential phases:
| Phase | Theme | Timeline |
|---|---|---|
| 1 | Close the Loop — first live trade end-to-end | Weeks 1–4 |
| 2 | Intelligence Upgrade — SOTA ML signals | Weeks 5–10 |
| 3 | Alternative Data & Portfolio | Weeks 10–18 |
| 4 | LLM Agent Layer | Month 5+ |
1.2 Objectives¶
- Deliver a live trading pipeline where a strategy running in `dsta-trading-svc` places real orders on Binance/Huobi/Gate.io through `dsta-exchange-svc`.
- Validate every strategy against walk-forward backtests before live deployment.
- Replace heuristic-only signals with an ensemble that combines technical indicators, SOTA time-series ML models, and sentiment analysis.
- Extend to alternative data sources (funding rates, on-chain metrics, options flow, order book depth) and portfolio-level optimization.
- Build a `MarketAnalystAgent` using the Claude API that autonomously discovers, backtests, and proposes strategies, with human approval before execution.
1.3 Target Users¶
This is a personal research platform. The primary user is a quantitative developer who:
- Understands Python, Docker, REST/WebSocket APIs, and basic quantitative finance.
- Wants full transparency and control over every layer of the stack.
- Is willing to iterate on strategy logic, ML models, and infra in parallel.
2. Current State of the Codebase (as of 2026-05-20)¶
Understanding what already exists is required before planning new work.
2.1 Services¶
| Service | Port | Status | Notes |
|---|---|---|---|
| `dsta-core-svc` | 8001 | Partial | JWT auth works; Telegram notifications not wired |
| `dsta-trading-svc` | 8002 | Partial | Backtesting engine works; live execution is dry-run only; strategy registry is in-memory |
| `dsta-data-svc` | 8003 | Partial | Historical download works; WebSocket ingestion to TimescaleDB not implemented |
| `dsta-ml-svc` | 8004 | Partial | LSTM predictor and PPO RL agent exist; not wired to trading-svc |
| `dsta-exchange-svc` | 8005 | Partial | Binance + Huobi adapters work; Gate.io adapter incomplete; circuit breaker missing |
| `dsta-cli` | — | Skeleton | Commands exist but no end-to-end smoke test |
| `dsta-qa` | — | Partial | Quant analytics scripts exist; not wired to live data |
| `dsta-web` | — | Shell | React/Vite shell; no dashboard pages implemented |
2.2 What Works¶
- Backtesting engine (event-driven, supports slippage + fees)
- Binance and Huobi exchange adapters (REST + WebSocket)
- Feature engineering pipeline
- LSTM price predictor (training + inference)
- PPO reinforcement learning agent
- CLI skeleton
- JWT authentication in core-svc
2.3 What Does Not Work Yet¶
- Live end-to-end trading pipeline (strategy → exchange order)
- Gate.io adapter
- ML models connected to trading signal flow
- WebSocket OHLCV ingestion persisted to TimescaleDB
- Web dashboard (UI is an empty shell)
- Telegram notifications
- Docker Compose full-stack deployment
- All services deployed simultaneously with API Gateway
3. Market Context¶
3.1 Opportunity¶
Cryptocurrency markets operate 24/7 with high volatility and thin institutional participation in altcoin pairs, creating exploitable inefficiencies for systematic strategies. Pain points with existing solutions:
- Commercial bots (3Commas, Cryptohopper): opaque signal logic, expensive subscriptions, no ML customization.
- Open-source backtesting libraries (Backtrader, VectorBT): backtest only, no live execution path.
- Full platforms (Freqtrade, Hummingbot): rigid strategy DSLs, limited ML integration surface.
3.2 DSTA Differentiators¶
- Full vertical stack: data ingestion → feature engineering → ML training → live execution → portfolio monitoring.
- Bring-your-own model: plug in any PyTorch model for signal generation.
- Alternative data first-class: funding rates, on-chain, options flow, sentiment built into the feature store.
- LLM-assisted strategy discovery: automated hypothesis generation and backtesting via Claude API tool-use.
- Self-hosted: no SaaS fees, no data leaving the machine, reproducible experiments via MLflow.
4. User Requirements¶
4.1 Data & Research¶
- As a developer, I want WebSocket OHLCV data ingested continuously into TimescaleDB so I can build features from recent candles without manual downloads.
- As a developer, I want historical OHLCV data available for at least 2 years at 1m resolution for all tracked pairs, so backtests cover multiple market regimes.
- As a developer, I want a feature store (Feast + Redis) that serves consistent features at both training time and inference time, so there is no train/serve skew.
4.2 Strategy & Backtesting¶
- As a developer, I want to run walk-forward backtests on any registered strategy and get a structured report (Sharpe, Calmar, max drawdown, win rate, trade log) before deploying live.
- As a developer, I want strategy parameters optimized with Optuna walk-forward search rather than a grid search over the full history, so parameter selection never sees out-of-sample data (no look-ahead bias).
- As a developer, I want to compare multiple strategy results in a single CLI command.
4.3 Live Trading¶
- As a developer, I want trading-svc to call exchange-svc's order API so that a BUY/SELL signal from a strategy results in a real order on the exchange.
- As a developer, I want the strategy registry persisted in the database so running strategies survive a service restart.
- As a developer, I want per-strategy risk controls (max position size, max drawdown circuit breaker) enforced before any order is submitted.
4.4 ML & Signals¶
- As a developer, I want an ensemble signal aggregator that combines technical indicators, ML price predictions, and sentiment scores into a single confidence-weighted signal.
- As a developer, I want live market regime detection (HMM/BOCPD) so strategies can adapt their parameters to trending vs. mean-reverting conditions.
- As a developer, I want MLflow experiment tracking so I can compare model versions and reproduce any training run.
4.5 Alternative Data¶
- As a developer, I want funding rate data from Binance and Gate.io ingested and exposed as a feature, so I can use leverage imbalance as a signal.
- As a developer, I want on-chain metrics (SOPR, MVRV, whale flows) from Glassnode free tier / CryptoQuant ingested on a daily schedule.
- As a developer, I want L2 order book imbalance computed from WebSocket depth streams and stored as a feature.
4.6 Portfolio Management¶
- As a developer, I want simultaneous multi-asset positions managed by a portfolio layer that tracks correlation and enforces Kelly Criterion sizing.
- As a developer, I want CVaR-constrained portfolio optimization (PyPortfolioOpt + CVXPY) to compute target weights.
- As a developer, I want auto-rebalancing triggered when actual weights drift beyond a configurable threshold from target weights.
4.7 Notifications & Monitoring¶
- As a developer, I want Telegram notifications on: trade fill, PnL alert (daily summary), max drawdown breach, and service health failure.
- As a developer, I want a Prometheus/Grafana stack where each service exposes `/metrics` and dashboards are version-controlled.
4.8 LLM Agent¶
- As a developer, I want a `MarketAnalystAgent` that can autonomously generate strategy hypotheses, run backtests, and present ranked results, but requires my explicit approval before placing any live order.
- As a developer, I want the agent to query a RAG knowledge base built from my research notebooks and past backtest results.
5. Functional Requirements¶
5.1 Data Collection (dsta-data-svc)¶
5.1.1 Historical Data¶
- REQ-DATA-001: Fetch OHLCV candlestick data from Binance, Huobi, and Gate.io REST APIs.
- REQ-DATA-002: Support timeframes: 1m, 3m, 5m, 15m, 30m, 1h, 2h, 4h, 6h, 12h, 1d.
- REQ-DATA-003: Implement rate-limit-aware download with exponential backoff and resume capability.
- REQ-DATA-004: Store candlesticks in TimescaleDB hypertables partitioned by time and symbol.
- REQ-DATA-005: Validate candles on ingest (gap detection, outlier rejection, OHLC consistency checks).
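The ingest checks in REQ-DATA-005 could look roughly like the sketch below. The `Candle` dataclass and both function names are hypothetical, and the sketch assumes candles arrive sorted by timestamp; outlier rejection is left out.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Candle:  # hypothetical in-memory shape, not the DB model
    timestamp: datetime
    open: float
    high: float
    low: float
    close: float
    volume: float


def is_ohlc_consistent(c: Candle) -> bool:
    """True when high/low bound open and close, price is positive, volume is non-negative."""
    return (
        c.low <= min(c.open, c.close)
        and c.high >= max(c.open, c.close)
        and c.low > 0
        and c.volume >= 0
    )


def find_gaps(candles: list[Candle], step: timedelta) -> list[tuple[datetime, datetime]]:
    """Return (first_missing, last_missing) timestamps for each hole in a sorted series."""
    gaps = []
    for prev, cur in zip(candles, candles[1:]):
        if cur.timestamp - prev.timestamp > step:
            gaps.append((prev.timestamp + step, cur.timestamp - step))
    return gaps
```

Each returned gap interval can be handed to a backfill job or logged as a missing-candle alert.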
5.1.2 Real-Time Ingestion Pipeline¶
- REQ-DATA-006: Maintain persistent WebSocket connections to Binance, Huobi, and Gate.io for kline/candlestick streams.
- REQ-DATA-007: Write arriving candles to TimescaleDB within 500ms of receipt.
- REQ-DATA-008: Reconnect automatically on WebSocket drop; log gap intervals.
- REQ-DATA-009: Expose an HTTP SSE endpoint (`/stream/ohlcv`) so other services can subscribe to the live candle feed.
5.1.3 Order Book Data¶
- REQ-DATA-010: Subscribe to L2 order book depth streams (top-20 levels) from WebSocket.
- REQ-DATA-011: Compute and persist order book imbalance (bid volume / (bid + ask volume)) as a time-series feature at 1-second resolution.
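A minimal sketch of the imbalance formula in REQ-DATA-011, together with the `spread_bps` value stored next to it in the data model. The `(price, quantity)` tuple layout and the function names are assumptions; real depth messages differ per exchange.

```python
from typing import Optional

Level = tuple[float, float]  # (price, quantity), assumed layout


def order_book_imbalance(bids: list[Level], asks: list[Level], depth: int = 20) -> Optional[float]:
    """bid_vol / (bid_vol + ask_vol) over the top `depth` levels; None for an empty book."""
    bid_vol = sum(qty for _, qty in bids[:depth])
    ask_vol = sum(qty for _, qty in asks[:depth])
    total = bid_vol + ask_vol
    if total == 0:
        return None
    return bid_vol / total


def spread_bps(best_bid: float, best_ask: float) -> float:
    """Bid/ask spread in basis points of the mid price."""
    mid = (best_bid + best_ask) / 2
    return (best_ask - best_bid) / mid * 10_000
```

A value above 0.5 means more resting volume on the bid side; 0.5 is a balanced book.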
5.1.4 Alternative Data¶
- REQ-DATA-012: Ingest funding rates from Binance Futures and Gate.io Futures on a 1-hour schedule via REST.
- REQ-DATA-013: Ingest Glassnode free-tier metrics (SOPR, MVRV, exchange flows) on a daily schedule via REST.
- REQ-DATA-014: Ingest CryptoPanic RSS feed and score each article with FinBERT; store score + timestamp per token.
- REQ-DATA-015: Fetch Google Trends weekly data for top-10 tracked tokens via `pytrends`.
- REQ-DATA-016: Pull Deribit options data (put/call ratio, gamma exposure) for BTC and ETH on an hourly schedule.
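FinBERT classifies text as positive, negative, or neutral, so REQ-DATA-014 needs a rule that turns class probabilities into the single score in [-1, 1] that the `SentimentScore` table stores. One common choice, assumed here rather than specified by this document, is P(positive) minus P(negative). The function names are hypothetical and the model call itself is omitted.

```python
def sentiment_score(probs: dict[str, float]) -> float:
    """Map class probabilities to [-1, 1] as P(positive) - P(negative) (assumed rule)."""
    return probs.get("positive", 0.0) - probs.get("negative", 0.0)


def mean_score(scores: list[float]) -> float:
    """Average article scores for one token; 0.0 (neutral) when there are no articles."""
    return sum(scores) / len(scores) if scores else 0.0
```

A fully neutral article scores 0.0, which keeps quiet news periods from biasing the rolling mean.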
5.1.5 Data Export¶
- REQ-DATA-017: Provide a CLI command and REST endpoint to export any symbol/timeframe slice to CSV, JSON, or Parquet.
5.2 Exchange Adapters (dsta-exchange-svc)¶
- REQ-EXCH-001: Implement a unified `ExchangeAdapter` interface with methods: `get_ticker`, `get_orderbook`, `get_ohlcv`, `place_order`, `cancel_order`, `get_position`, `get_balance`.
- REQ-EXCH-002: Complete the Gate.io adapter implementing all `ExchangeAdapter` methods.
- REQ-EXCH-003: Implement circuit breaker per adapter: after 5 consecutive failures within 60 seconds, open the circuit for 30 seconds before retrying.
- REQ-EXCH-004: Expose a WebSocket proxy endpoint so other services can subscribe to real-time exchange feeds without holding their own exchange connections.
- REQ-EXCH-005: All order placement calls must be idempotent (client-order-id based) to prevent double-submission on retry.
- REQ-EXCH-006: Log every order request and response (masked API keys) to an audit table.
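The circuit breaker in REQ-EXCH-003 can be sketched as a small state holder. The class name, method names, and the injectable `clock` are assumptions made for illustration and testability; the thresholds are the ones stated above.

```python
import time
from typing import Callable, Optional


class CircuitBreaker:
    """Opens after `max_failures` consecutive failures inside `window_s`; stays open `open_s`."""

    def __init__(self, max_failures: int = 5, window_s: float = 60.0,
                 open_s: float = 30.0, clock: Callable[[], float] = time.monotonic):
        self.max_failures = max_failures
        self.window_s = window_s
        self.open_s = open_s
        self._clock = clock
        self._failures: list[float] = []  # timestamps of the current failure streak
        self._opened_at: Optional[float] = None

    def allow(self) -> bool:
        """Return True if a call to the exchange may be attempted now."""
        if self._opened_at is None:
            return True
        if self._clock() - self._opened_at >= self.open_s:
            # Cool-down elapsed: close the circuit and start a fresh streak.
            self._opened_at = None
            self._failures.clear()
            return True
        return False

    def record_success(self) -> None:
        self._failures.clear()  # any success breaks the consecutive streak

    def record_failure(self) -> None:
        now = self._clock()
        self._failures = [t for t in self._failures if now - t <= self.window_s]
        self._failures.append(now)
        if len(self._failures) >= self.max_failures:
            self._opened_at = now
```

An adapter would call `allow()` before each request and `record_success()` or `record_failure()` after it, keeping one breaker instance per exchange.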
5.3 Backtesting Engine (dsta-trading-svc)¶
- REQ-BT-001: Event-driven architecture: data events → strategy signal → order event → fill event → position update.
- REQ-BT-002: Simulate realistic fills: market orders fill at next-candle open with configurable slippage model; limit orders fill when price crosses limit.
- REQ-BT-003: Apply per-exchange fee schedules (maker/taker).
- REQ-BT-004: Support long, short, and leveraged positions.
- REQ-BT-005: Support portfolio-level backtesting across multiple symbols simultaneously.
- REQ-BT-006: Walk-forward validation: split data into rolling in-sample/out-of-sample windows; report out-of-sample performance only.
- REQ-BT-007: Output metrics: total return, annualized return, Sharpe ratio, Calmar ratio, Sortino ratio, max drawdown, win rate, profit factor, average trade duration, trade count.
- REQ-BT-008: Generate equity curve (JSON + optional PNG) and per-trade log (CSV).
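The rolling split in REQ-BT-006 can be sketched with index ranges. The function name and the default of stepping forward by one test window are assumptions; the engine may use different window policies.

```python
from typing import Optional


def walk_forward_windows(n_samples: int, train_size: int, test_size: int,
                         step: Optional[int] = None) -> list[tuple[range, range]]:
    """Return (in_sample, out_of_sample) index ranges that roll forward through the data."""
    step = step or test_size  # default: out-of-sample windows do not overlap
    windows = []
    start = 0
    while start + train_size + test_size <= n_samples:
        train = range(start, start + train_size)
        test = range(start + train_size, start + train_size + test_size)
        windows.append((train, test))
        start += step
    return windows
```

Parameters are fitted on each in-sample range and scored only on the range that follows it, so reported performance never includes data the fit has seen.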
5.4 Strategy Framework (dsta-trading-svc)¶
- REQ-STRAT-001: Provide a `BaseStrategy` abstract class with lifecycle hooks: `on_candle`, `on_fill`, `on_stop`.
- REQ-STRAT-002: Bundle the following reference strategies: SMA crossover, RSI mean-reversion, Bollinger Band breakout.
- REQ-STRAT-003: Persist strategy registry in PostgreSQL. Each row stores: `strategy_id`, `name`, `class_path`, `parameters` (JSONB), `status` (enabled/disabled/paper), `exchange`, `symbol`, `created_at`.
- REQ-STRAT-004: Support hot-reload: mark a strategy as disabled in DB; the execution scheduler detects the change within one candle interval and stops the strategy without restart.
- REQ-STRAT-005: Optuna walk-forward hyperparameter optimization: given a strategy class and parameter search space, run N walk-forward trials and return the best parameter set.
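As an illustration of REQ-STRAT-001 and REQ-STRAT-002, the sketch below pairs a `BaseStrategy` with an SMA crossover. The hook signatures (a bare close price in, an optional `"BUY"`/`"SELL"` string out) are assumptions; the existing class in trading-svc may take richer candle and fill objects.

```python
from abc import ABC, abstractmethod
from collections import deque
from typing import Optional


class BaseStrategy(ABC):
    @abstractmethod
    def on_candle(self, close: float) -> Optional[str]:
        """Return 'BUY', 'SELL', or None for each closed candle."""

    def on_fill(self, side: str, price: float, quantity: float) -> None:
        """Called after an order fills; default is a no-op."""

    def on_stop(self) -> None:
        """Called once when the strategy is disabled; default is a no-op."""


class SmaCrossover(BaseStrategy):
    def __init__(self, fast: int = 10, slow: int = 30):
        if fast >= slow:
            raise ValueError("fast window must be shorter than slow window")
        self.fast, self.slow = fast, slow
        self._closes: deque = deque(maxlen=slow)
        self._prev_diff: Optional[float] = None

    def on_candle(self, close: float) -> Optional[str]:
        self._closes.append(close)
        if len(self._closes) < self.slow:
            return None  # not enough history yet
        closes = list(self._closes)
        diff = sum(closes[-self.fast:]) / self.fast - sum(closes) / self.slow
        signal = None
        if self._prev_diff is not None:
            if self._prev_diff <= 0 < diff:
                signal = "BUY"   # fast average crossed above slow
            elif self._prev_diff >= 0 > diff:
                signal = "SELL"  # fast average crossed below slow
        self._prev_diff = diff
        return signal
```

Only the crossing candle emits a signal; while the fast average stays on one side of the slow average the strategy returns `None`.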
5.5 Live Execution (dsta-trading-svc)¶
- REQ-EXEC-001: The execution engine subscribes to the live candle feed from data-svc and delivers candles to each enabled strategy.
- REQ-EXEC-002: When a strategy emits a signal, the execution engine validates risk controls (position size limit, max drawdown circuit breaker, open order count) before calling exchange-svc.
- REQ-EXEC-003: On order fill confirmation from exchange-svc, update the position tracker and emit a fill event to core-svc for notification.
- REQ-EXEC-004: Maintain a paper trading mode where signals are executed against a virtual account at mid-price; no exchange API calls are made.
- REQ-EXEC-005: Emergency stop: a CLI command or API call halts all strategies and cancels all open orders on all exchanges within 5 seconds.
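The pre-trade gate in REQ-EXEC-002 might be expressed as a pure function that returns the list of violated controls. All names, and the choice to measure position size as notional value, are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RiskLimits:
    max_position_notional: float  # quote-currency cap per strategy
    max_drawdown: float           # fraction of peak equity, e.g. 0.2
    max_open_orders: int


@dataclass(frozen=True)
class AccountState:
    position_notional: float
    peak_equity: float
    equity: float
    open_orders: int


def check_order(order_notional: float, state: AccountState, limits: RiskLimits) -> list[str]:
    """Return the names of violated controls; an empty list means the order may proceed."""
    violations = []
    if state.position_notional + order_notional > limits.max_position_notional:
        violations.append("position_size")
    drawdown = 1 - state.equity / state.peak_equity if state.peak_equity > 0 else 0.0
    if drawdown >= limits.max_drawdown:
        violations.append("max_drawdown")
    if state.open_orders >= limits.max_open_orders:
        violations.append("open_order_count")
    return violations
```

Returning every violation, rather than stopping at the first, gives the audit log a complete reason for each rejected signal.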
5.6 Ensemble Signal Aggregator (dsta-trading-svc)¶
- REQ-ENS-001: Aggregate signals from three sources per symbol: (a) technical indicator strategy output, (b) ml-svc price prediction confidence score, (c) sentiment pipeline score from data-svc.
- REQ-ENS-002: Apply confidence weighting: each signal source has a configurable weight; weights sum to 1.0.
- REQ-ENS-003: Emit a composite signal (`BUY`/`SELL`/`HOLD`) with a confidence value [0, 1] to the execution engine.
- REQ-ENS-004: Log each component signal and the composite result per candle for post-hoc analysis.
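One way to read REQ-ENS-002 and REQ-ENS-003 is a weighted sum of signed scores. The sketch assumes each source is normalized to [-1, 1] (negative meaning bearish) and uses a hypothetical decision threshold of 0.2; neither is fixed by the requirements.

```python
import math


def aggregate(signals: dict[str, float], weights: dict[str, float],
              threshold: float = 0.2) -> tuple[str, float]:
    """Combine signed source scores into (BUY | SELL | HOLD, confidence in [0, 1])."""
    if signals.keys() != weights.keys():
        raise ValueError("signals and weights must cover the same sources")
    if not math.isclose(sum(weights.values()), 1.0, abs_tol=1e-9):
        raise ValueError("weights must sum to 1.0")
    score = sum(weights[name] * signals[name] for name in weights)
    confidence = min(abs(score), 1.0)
    if score >= threshold:
        return "BUY", confidence
    if score <= -threshold:
        return "SELL", confidence
    return "HOLD", confidence
```

With weights of 0.5, 0.3, and 0.2, a strong technical buy can be tempered, though not overturned, by a mildly negative sentiment score.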
5.7 ML Models (dsta-ml-svc)¶
5.7.1 Price Prediction¶
- REQ-ML-001: Retain the existing LSTM model as a baseline; add PatchTST and iTransformer as alternative architectures selectable via config.
- REQ-ML-002: Integrate Chronos (Amazon, 2024) as a zero-shot inference option — no fine-tuning required, inference only.
- REQ-ML-003: Expose a REST endpoint `POST /predict` accepting `{symbol, features: [...], horizon: int}`, returning `{point_estimate, confidence_interval, model_id}`.
- REQ-ML-004: Track all training runs in MLflow (self-hosted): log hyperparameters, validation metrics, and the serialized model artifact.
5.7.2 Market Regime Detection¶
- REQ-ML-005: Implement a Hidden Markov Model (HMM) regime detector with states: trending-up, trending-down, mean-reverting, high-volatility.
- REQ-ML-006: Expose `GET /regime/{symbol}` returning current regime and posterior probabilities.
- REQ-ML-007: Store regime transitions in the database so trading-svc can filter strategy signals by regime.
5.7.3 Feature Store¶
- REQ-ML-008: Deploy Feast with Redis online store and TimescaleDB offline store.
- REQ-ML-009: Define feature views for: OHLCV-derived technical features, order book imbalance, funding rate, on-chain metrics, sentiment score.
- REQ-ML-010: Training pipelines read features from the Feast offline store; inference pipelines read from the Redis online store.
5.7.4 Reinforcement Learning¶
- REQ-ML-011: The existing PPO agent is retrained against the backtesting environment after each new month of data is available.
- REQ-ML-012: The PPO agent's action (buy/sell/hold + position fraction) is exposed as a signal source via the same `POST /predict` interface.
5.8 Core Services (dsta-core-svc)¶
- REQ-CORE-001: Issue and validate JWTs for all API requests; token rotation supported without service restart.
- REQ-CORE-002: Telegram notification dispatcher: receive `NotificationEvent` messages from RabbitMQ/Kafka and send to configured chat ID.
- REQ-CORE-003: Notification event types: `TRADE_FILL`, `PNL_DAILY_SUMMARY`, `DRAWDOWN_ALERT`, `SERVICE_DOWN`, `STRATEGY_STOPPED`.
- REQ-CORE-004: Rate-limit outgoing Telegram messages to avoid hitting Bot API limits (30 messages/second per bot).
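The outgoing rate limit in REQ-CORE-004 could be enforced with a token bucket. This is a sketch, not the dispatcher's actual design: the class name and injectable `clock` are assumptions, and the 30 per second figure is the one quoted above.

```python
import time
from typing import Callable


class TokenBucket:
    """Allows bursts up to `capacity` and refills at `rate` tokens per second."""

    def __init__(self, rate: float, capacity: int,
                 clock: Callable[[], float] = time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self._clock = clock
        self._tokens = float(capacity)
        self._last = clock()

    def try_acquire(self) -> bool:
        """Take one token if available; the caller sends only when this returns True."""
        now = self._clock()
        self._tokens = min(self.capacity, self._tokens + (now - self._last) * self.rate)
        self._last = now
        if self._tokens >= 1.0:
            self._tokens -= 1.0
            return True
        return False
```

When `try_acquire()` returns `False`, the dispatcher can requeue the event instead of dropping it, so alerts are delayed rather than lost.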
5.9 Portfolio Management (dsta-trading-svc)¶
- REQ-PORT-001: Track simultaneous positions across multiple symbols; maintain a correlation matrix updated daily.
- REQ-PORT-002: Implement Kelly Criterion position sizing: given win rate and payoff ratio estimated from recent trades, compute the fraction of capital to deploy.
- REQ-PORT-003: CVaR-constrained optimization: use PyPortfolioOpt + CVXPY to compute target portfolio weights given expected returns and covariance.
- REQ-PORT-004: Auto-rebalancing: when any asset weight deviates from target by more than a configurable threshold (default 5%), trigger a rebalance order.
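For REQ-PORT-002, the Kelly fraction for win rate p and payoff ratio b (average win divided by average loss) is f* = p - (1 - p) / b. The sketch below also covers the drift check in REQ-PORT-004, reading the 5% threshold as absolute weight points, which is an assumption. Function names and the `cap` parameter are hypothetical.

```python
def kelly_fraction(win_rate: float, payoff_ratio: float, cap: float = 1.0) -> float:
    """Kelly fraction f* = p - (1 - p) / b, clamped to [0, cap]."""
    if not 0.0 <= win_rate <= 1.0:
        raise ValueError("win_rate must be in [0, 1]")
    if payoff_ratio <= 0:
        raise ValueError("payoff_ratio must be positive")
    f = win_rate - (1.0 - win_rate) / payoff_ratio
    return max(0.0, min(f, cap))  # a negative edge means no position


def needs_rebalance(actual: dict[str, float], target: dict[str, float],
                    threshold: float = 0.05) -> bool:
    """True when any target asset's weight drifts more than `threshold` from target."""
    return any(abs(actual.get(sym, 0.0) - w) > threshold for sym, w in target.items())
```

Because win rate and payoff ratio are estimated from a short window of recent trades, a `cap` well below 1.0 (fractional Kelly) is a common safeguard against estimation error.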
5.10 LLM Agent Layer (dsta-ml-svc or standalone dsta-agent-svc)¶
- REQ-AGENT-001: Implement `MarketAnalystAgent` using the Claude API (tool-use pattern). Tools: `fetch_news`, `query_onchain_metrics`, `run_backtest`, `get_portfolio_state`, `recommend_trade`.
- REQ-AGENT-002: Human-in-the-loop gate: the agent's `recommend_trade` tool call produces a pending recommendation; the user must confirm via CLI or Telegram before execution.
- REQ-AGENT-003: RAG knowledge base: index research notebooks, backtest result JSON files, and strategy documentation using LlamaIndex with pgvector. The agent can query this via a `search_knowledge_base` tool.
- REQ-AGENT-004: Autonomous strategy discovery loop: agent generates a hypothesis, calls `run_backtest`, evaluates Sharpe/Calmar against a threshold, and either proposes deployment or discards.
- REQ-AGENT-005: Use FinBERT for domain-specific financial NLP inside the sentiment pipeline; expose as a callable function to the agent.
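The approval gate in REQ-AGENT-002 comes down to keeping the agent's tool handler and the order path separate. The sketch below is one possible shape with hypothetical class and method names; the Claude API call and the Telegram/CLI confirmation transport are omitted.

```python
import uuid
from dataclasses import dataclass, field
from enum import Enum


class Status(Enum):
    PENDING = "pending"
    APPROVED = "approved"
    REJECTED = "rejected"


@dataclass
class Recommendation:
    symbol: str
    side: str
    quantity: float
    id: str = field(default_factory=lambda: uuid.uuid4().hex)
    status: Status = Status.PENDING


class ApprovalGate:
    def __init__(self) -> None:
        self._items: dict[str, Recommendation] = {}

    def recommend_trade(self, symbol: str, side: str, quantity: float) -> Recommendation:
        """Handler behind the agent's tool call: it only records a pending item."""
        rec = Recommendation(symbol, side, quantity)
        self._items[rec.id] = rec
        return rec

    def confirm(self, rec_id: str, approve: bool) -> Recommendation:
        """Called from the human-facing CLI or Telegram handler, never by the agent."""
        rec = self._items[rec_id]
        if rec.status is not Status.PENDING:
            raise ValueError("recommendation already decided")
        rec.status = Status.APPROVED if approve else Status.REJECTED
        return rec

    def executable(self) -> list[Recommendation]:
        """The only list the execution engine reads from."""
        return [r for r in self._items.values() if r.status is Status.APPROVED]
```

Since the execution engine reads only from `executable()`, nothing the agent calls can reach the order API without a human decision in between.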
5.11 Web Dashboard (dsta-web)¶
- REQ-WEB-001: Dashboard page: real-time portfolio value, unrealized and realized PnL, positions table, open orders table.
- REQ-WEB-002: Strategy management page: list strategies with status, parameters, and last backtest summary; enable/disable toggle.
- REQ-WEB-003: Market data page: interactive candlestick chart (with overlaid technical indicators), order book heatmap, funding rate chart.
- REQ-WEB-004: Backtesting page: trigger a backtest run, view progress, and download report.
- REQ-WEB-005: Alerts page: list all notification events with acknowledge functionality.
- REQ-WEB-006: Authenticate using JWT from core-svc; all API calls pass the token in the Authorization header.
- REQ-WEB-007: Real-time price and position updates via WebSocket connection to exchange-svc proxy.
5.12 CLI (dsta-cli)¶
- REQ-CLI-001: Commands: `backtest run`, `strategy list`, `strategy enable/disable`, `order place`, `order list`, `position list`, `data download`, `data status`, `health check`.
- REQ-CLI-002: Each command targets a specific service endpoint; authentication via stored JWT (`login` command).
- REQ-CLI-003: Output format: human-readable table by default; `--json` flag for machine-readable output.
6. Non-Functional Requirements¶
6.1 Performance¶
- REQ-PERF-001: WebSocket candle ingestion to DB write: < 500ms end-to-end.
- REQ-PERF-002: ML inference latency (`/predict`): < 200ms for LSTM/PatchTST; < 1s for Chronos.
- REQ-PERF-003: Order placement to confirmation: < 2s on Binance; < 5s on Gate.io.
- REQ-PERF-004: Backtesting 1 year of 1m data for a single symbol: < 60 seconds.
- REQ-PERF-005: Ensemble signal computation per candle: < 50ms.
6.2 Reliability¶
- REQ-REL-001: Exchange WebSocket connections recover automatically within 10 seconds of disconnect.
- REQ-REL-002: No trade is placed without a corresponding audit log entry.
- REQ-REL-003: Database migrations run automatically on service startup via Alembic/Django migrations.
- REQ-REL-004: Each service exposes `GET /health` and `GET /metrics` (Prometheus format).
6.3 Security¶
- REQ-SEC-001: Exchange API keys stored encrypted at rest (Fernet or similar); never logged.
- REQ-SEC-002: All inter-service communication over HTTPS/WSS in production.
- REQ-SEC-003: API Gateway enforces JWT validation before routing to downstream services.
- REQ-SEC-004: Secrets managed via environment variables or a secrets manager; never committed to the repository.
6.4 Maintainability¶
- REQ-MAINT-001: Each service has its own `Makefile` with standard targets: `install`, `test`, `lint`, `run`, `docker-build`.
- REQ-MAINT-002: Unit test coverage > 80% for all new code in trading-svc, data-svc, and ml-svc.
- REQ-MAINT-003: All services follow the same structured logging format (JSON, with `service`, `level`, `timestamp`, `trace_id` fields).
- REQ-MAINT-004: OpenAPI specs generated from code and committed to `docs/openapi/`.
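The log format in REQ-MAINT-003 can be produced with a standard-library formatter. The class name and the extra `message` field are assumptions; the four required fields are the ones named above.

```python
import json
import logging
from datetime import datetime, timezone


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object with the shared DSTA fields."""

    def __init__(self, service: str):
        super().__init__()
        self.service = service

    def format(self, record: logging.LogRecord) -> str:
        return json.dumps({
            "service": self.service,
            "level": record.levelname,
            "timestamp": datetime.fromtimestamp(record.created, tz=timezone.utc).isoformat(),
            # trace_id is attached per call via logging's `extra` mapping, if present
            "trace_id": getattr(record, "trace_id", None),
            "message": record.getMessage(),
        })
```

A service would attach it with `handler.setFormatter(JsonFormatter("dsta-trading-svc"))` and pass the trace ID per call, for example `logger.info("order placed", extra={"trace_id": trace_id})`.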
6.5 Deployment¶
- REQ-DEPLOY-001: A single `docker compose up` in the `deploy/` directory starts the full stack.
- REQ-DEPLOY-002: Environment-specific configs managed via `.env` files; a `.env.example` is committed.
- REQ-DEPLOY-003: TimescaleDB hypertable migrations run as a one-shot init container.
7. AI/ML Architecture¶
7.1 Signal Pipeline¶
[TimescaleDB OHLCV] [Alt Data sources] [Sentiment pipeline]
| | |
v v v
[Feast offline store] ——> [Feature Engineering] ——> [Feast online store (Redis)]
|
┌────────────────────────────────────┤
| | |
[Technical [ML Price [Sentiment
indicators] predictor] score]
| | |
└────────────────────────────────────┘
|
[Ensemble aggregator]
|
[Regime filter (HMM)]
|
[Risk controls]
|
[Order execution]
7.2 ML Model Stack¶
| Model | Purpose | Architecture | Notes |
|---|---|---|---|
| LSTM | Price prediction (baseline) | 2-layer LSTM | Existing; kept for comparison |
| PatchTST | Price prediction (SOTA) | Transformer on patches | Replace LSTM in Phase 2 |
| iTransformer | Price prediction (SOTA) | Inverted transformer | Alternative to PatchTST |
| Chronos | Zero-shot forecasting | Foundation model | Amazon 2024; inference only |
| PPO RL agent | Position sizing signal | Proximal Policy Optimization | Existing; monthly retraining |
| HMM / BOCPD | Regime detection | Hidden Markov Model | Online inference; state stored in DB |
| FinBERT | Sentiment scoring | Domain-adapted BERT | Inference only; weights from HuggingFace |
7.3 Feature Store Design (Feast + Redis)¶
Feature Views:
- `ohlcv_features`: returns, log-returns, rolling volatility (5/20/60 periods), RSI, MACD, ATR, Bollinger width.
- `orderbook_features`: bid/ask imbalance, spread, depth ratio.
- `funding_rate_features`: current rate, 24h rolling mean, deviation from neutral.
- `onchain_features`: SOPR 7d MA, MVRV Z-score, exchange net flows.
- `sentiment_features`: FinBERT score 24h rolling mean, momentum (score delta 1h).
Entities: symbol (BTC/USDT, ETH/USDT, etc.)
Offline store: TimescaleDB (training data reads)
Online store: Redis (sub-10ms feature retrieval at inference)
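Two of the `ohlcv_features` listed above, log-returns and rolling volatility, are sketched here in plain Python. The function names are hypothetical and the sample standard deviation is an assumed convention; the real pipeline would more likely use Pandas or Polars and register the result as a Feast feature view.

```python
import math
from statistics import stdev


def log_returns(closes: list[float]) -> list[float]:
    """ln(close_t / close_{t-1}) for each consecutive pair of closes."""
    return [math.log(cur / prev) for prev, cur in zip(closes, closes[1:])]


def rolling_volatility(returns: list[float], window: int) -> list[float]:
    """Sample standard deviation over each trailing `window` of returns (window >= 2)."""
    return [stdev(returns[i - window:i]) for i in range(window, len(returns) + 1)]
```

Running the same functions in the training and inference paths is what keeps the offline and online stores consistent.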
7.4 MLflow Tracking¶
- Self-hosted MLflow server (`mlflow-svc`) with PostgreSQL backend store and S3-compatible (MinIO) artifact store.
- Every training run logs: architecture name, hyperparameters, train/val loss curve, validation Sharpe, model artifact.
- Registered model versions are tagged `staging` or `production`. `dsta-ml-svc` loads the model version tagged `production` at startup.
7.5 LLM Agent Design (Phase 4)¶
User query / automated trigger
|
v
MarketAnalystAgent (Claude API, tool-use)
|
┌─────┴─────────────────────────────┐
| | | | |
fetch query run get recommend
news onchain backtest portfolio trade
| | | | |
└──── RAG KB ────┤ └── Human approval gate
(LlamaIndex (Telegram confirm / CLI confirm)
+ pgvector) |
v
[Execution engine]
8. Technology Stack¶
8.1 Core Runtime¶
| Component | Technology | Notes |
|---|---|---|
| Backend services | Python 3.12+, FastAPI (new) / Django 4.x (core-svc) | exchange-svc already migrated to FastAPI |
| OHLCV storage | TimescaleDB (PostgreSQL extension) | Hypertables for time-series |
| Relational storage | PostgreSQL 17 | Strategies, orders, positions, configs |
| Cache / pub-sub | Redis 8 | Online feature store + session cache |
| Message broker | RabbitMQ (Phase 1–2), Kafka (Phase 3+) | Kafka adds replay capability |
| Workflow orchestration | Prefect (Phase 3+) | Ingestion pipelines, retraining jobs |
| Feature store | Feast + Redis | Phase 2+ |
| Experiment tracking | MLflow (self-hosted) | Phase 2+ |
| Hyperparameter optimization | Optuna | Phase 2+ |
| Containerization | Docker, Docker Compose | All services |
| API Gateway | Traefik or Kong | Phase 1 |
8.2 ML / Data Science¶
| Component | Technology |
|---|---|
| Data manipulation | Pandas, NumPy, Polars (performance-critical paths) |
| ML framework | PyTorch 2.x |
| Time-series models | PatchTST, iTransformer, Chronos |
| RL | Stable-Baselines3 (PPO) |
| NLP | HuggingFace Transformers (FinBERT) |
| Portfolio optimization | PyPortfolioOpt + CVXPY |
| Regime detection | hmmlearn |
| RAG | LlamaIndex + pgvector |
| LLM agent | Anthropic Claude API (tool-use) |
8.3 Exchange Connectivity¶
| Exchange | REST | WebSocket | Status |
|---|---|---|---|
| Binance | python-binance | native | Working |
| Huobi | custom client | custom | Working |
| Gate.io | gate-api SDK | gate-api SDK | Incomplete — Phase 1 |
| Deribit | deribit-api | — | Phase 3 (options data only) |
8.4 Observability¶
| Component | Technology |
|---|---|
| Metrics | Prometheus |
| Dashboards | Grafana |
| Logs | Loki + Promtail |
| Tracing | Jaeger (OpenTelemetry) |
9. Development Roadmap¶
Phase 1 — Close the Loop (Weeks 1–4)¶
Goal: The first live trade is placed by a strategy and confirmed on the exchange.
- WebSocket OHLCV ingestion pipeline writing to TimescaleDB (data-svc).
- Gate.io adapter completing the exchange-svc adapter set.
- trading-svc execution engine wired to exchange-svc live order API (no more dry-run).
- Strategy registry migrated from in-memory to PostgreSQL.
- Telegram notifications on `TRADE_FILL`, `DRAWDOWN_ALERT`, `PNL_DAILY_SUMMARY` (core-svc).
- Full-stack Docker Compose deployment with API Gateway.
- Walk-forward backtest validation of SMA crossover and RSI mean-reversion before going live.
Exit criteria: A paper-mode strategy running in trading-svc generates signals from live TimescaleDB data, calls exchange-svc, and core-svc sends a Telegram fill notification. All services start with docker compose up.
Phase 2 — Intelligence Upgrade (Weeks 5–10)¶
Goal: Replace LSTM-only signals with a SOTA ensemble; introduce feature store and MLflow.
- PatchTST and iTransformer added to ml-svc as selectable architectures.
- Chronos integrated as zero-shot inference option.
- Live regime detection endpoint (HMM) in ml-svc.
- Sentiment pipeline: CryptoPanic RSS → FinBERT → score stored in TimescaleDB.
- Ensemble signal aggregator in trading-svc combining technical + ML + sentiment.
- Optuna walk-forward hyperparameter optimization for strategy parameters.
- Feast + Redis feature store with defined feature views for all data sources.
- MLflow self-hosted for experiment tracking.
Exit criteria: The ensemble signal aggregator is live; at least one trading strategy uses a PatchTST or iTransformer signal; all training runs are logged in MLflow.
Phase 3 — Alternative Data & Portfolio (Weeks 10–18)¶
Goal: Richer signals and true portfolio-level management.
- Funding rate ingestion (Binance + Gate.io) as a feature.
- On-chain data ingestion (Glassnode / CryptoQuant): SOPR, MVRV, whale flows.
- Deribit options flow ingestion: put/call ratio, gamma exposure (BTC + ETH).
- L2 order book imbalance feature from WebSocket depth.
- Multi-asset simultaneous positions with correlation tracking.
- Kelly Criterion position sizing.
- CVaR portfolio optimization (PyPortfolioOpt + CVXPY).
- Auto-rebalancing on weight drift.
- Kafka streaming to replace/augment RabbitMQ (enables ML training replay).
- Prefect for scheduling ingestion, retraining, and reporting pipelines.
Exit criteria: The platform runs at least 3 simultaneous strategies across 5+ symbols; portfolio weights are rebalanced automatically; funding rate is an active feature in at least one live strategy.
Phase 4 — LLM Agent Layer (Month 5+)¶
Goal: Autonomous strategy discovery with human-in-the-loop execution.
- `MarketAnalystAgent` using Claude API with 5 tools: `fetch_news`, `query_onchain_metrics`, `run_backtest`, `get_portfolio_state`, `recommend_trade`.
- Human-in-the-loop approval gate (Telegram confirm or CLI confirm).
- RAG knowledge base over research notebooks + backtest results (LlamaIndex + pgvector).
- Autonomous strategy discovery: agent generates → backtests → ranks by Sharpe/Calmar → proposes deployment.
- FinBERT / FinGPT integration for financial NLP tasks inside the agent tool chain.
Exit criteria: Agent can autonomously generate a strategy hypothesis, backtest it, and present a ranked recommendation requiring only human approval to deploy.
10. Data Models¶
10.1 Market Data (TimescaleDB hypertables)¶
Candlestick
symbol TEXT NOT NULL
exchange TEXT NOT NULL
timeframe TEXT NOT NULL -- '1m', '1h', etc.
timestamp TIMESTAMPTZ NOT NULL
open NUMERIC(24,8)
high NUMERIC(24,8)
low NUMERIC(24,8)
close NUMERIC(24,8)
volume NUMERIC(28,8)
PRIMARY KEY (symbol, exchange, timeframe, timestamp)
-- Hypertable partitioned on timestamp
OrderBookImbalance
symbol TEXT
exchange TEXT
timestamp TIMESTAMPTZ
imbalance FLOAT8 -- bid_vol / (bid_vol + ask_vol)
spread_bps FLOAT8
PRIMARY KEY (symbol, exchange, timestamp)
FundingRate
symbol TEXT
exchange TEXT
timestamp TIMESTAMPTZ
rate NUMERIC(18,8)
PRIMARY KEY (symbol, exchange, timestamp)
SentimentScore
symbol TEXT
source TEXT -- 'cryptopanic', 'reddit'
timestamp TIMESTAMPTZ
score FLOAT4 -- FinBERT output [-1, 1]
article_url TEXT
10.2 Trading (PostgreSQL)¶
Strategy
id UUID PRIMARY KEY
name TEXT NOT NULL
class_path TEXT NOT NULL
parameters JSONB
status TEXT CHECK (status IN ('enabled','disabled','paper'))
exchange TEXT
symbol TEXT
created_at TIMESTAMPTZ
updated_at TIMESTAMPTZ
Order
id UUID PRIMARY KEY
exchange_order_id TEXT
strategy_id UUID REFERENCES Strategy
symbol TEXT
exchange TEXT
side TEXT -- 'buy' | 'sell'
order_type TEXT -- 'market' | 'limit'
quantity NUMERIC(24,8)
price NUMERIC(24,8)
status TEXT -- 'open' | 'filled' | 'canceled' | 'partial'
created_at TIMESTAMPTZ
filled_at TIMESTAMPTZ
canceled_at TIMESTAMPTZ
client_order_id TEXT UNIQUE
Position
id UUID PRIMARY KEY
strategy_id UUID REFERENCES Strategy
symbol TEXT
exchange TEXT
side TEXT
quantity NUMERIC(24,8)
entry_price NUMERIC(24,8)
current_price NUMERIC(24,8)
unrealized_pnl NUMERIC(24,8)
opened_at TIMESTAMPTZ
closed_at TIMESTAMPTZ
TradeHistory
id UUID PRIMARY KEY
order_id UUID REFERENCES Order
timestamp TIMESTAMPTZ
symbol TEXT
exchange TEXT
side TEXT
quantity NUMERIC(24,8)
price NUMERIC(24,8)
fee NUMERIC(18,8)
fee_asset TEXT
pnl NUMERIC(24,8)
strategy_id UUID
10.3 System / Config (PostgreSQL)¶
ExchangeAccount
id UUID PRIMARY KEY
exchange TEXT
api_key TEXT -- encrypted
api_secret TEXT -- encrypted
label TEXT
is_active BOOLEAN
created_at TIMESTAMPTZ
NotificationEvent
id UUID PRIMARY KEY
event_type TEXT -- 'TRADE_FILL' | 'PNL_DAILY_SUMMARY' | ...
payload JSONB
sent_at TIMESTAMPTZ
acknowledged BOOLEAN DEFAULT FALSE
11. Success Metrics¶
11.1 Development¶
- Phase 1 exit: first real (or paper) order placed end-to-end by Week 4.
- Phase 2 exit: ensemble signal live in production by Week 10.
- Test coverage > 80% on all new code.
- Zero broken builds on `main` branch.
11.2 Performance¶
- WebSocket ingestion lag < 500ms (P99).
- ML inference < 200ms (P95).
- Backtest 1 year of 1m data: < 60 seconds.
- Order-to-fill confirmation round-trip: < 2s (Binance).
11.3 Trading (Post-Phase 1 live)¶
- Strategy Sharpe ratio > 1.5 (out-of-sample walk-forward).
- Maximum drawdown < 20%.
- Win rate > 52% on live trades.
- Live Sharpe ratio within 15% of the backtest Sharpe ratio.
12. Risk Management¶
12.1 Technical Risks¶
| Risk | Impact | Mitigation |
|---|---|---|
| Exchange API changes | High | Unified adapter interface; versioned client libs; smoke tests against testnet |
| Data gaps in TimescaleDB | High | Gap detection on ingest; backfill job; alert on missing candles |
| ML model degradation | High | MLflow model versioning; shadow mode comparison; automatic rollback |
| Service downtime during live trading | High | Health checks, auto-restart policies, emergency stop command |
| Train/serve feature skew | Medium | Feast enforces consistent feature definitions across offline/online |
| Double order submission on retry | High | Idempotent client-order-id on all order calls |
12.2 Trading Risks¶
| Risk | Impact | Mitigation |
|---|---|---|
| Strategy overfitting | High | Mandatory walk-forward validation before live deployment |
| Regime change making model stale | High | Regime detector gates strategy signals; monthly retraining |
| Flash crash | High | Circuit breaker in exchange-svc; position size limits; emergency stop |
| Runaway losses | Critical | Max drawdown circuit breaker per strategy; portfolio-level hard stop |
| Execution slippage vs. backtest | Medium | Conservative slippage model in backtest; post-live slippage report |
12.3 Operational Risks¶
| Risk | Impact | Mitigation |
|---|---|---|
| API key compromise | Critical | Encrypted at rest, never logged, rotation procedure documented |
| LLM agent runaway execution | Critical | Hard human-approval gate before any order; agent cannot call order API directly |
| Infrastructure cost blowup | Low | Self-hosted stack; resource limits in Docker Compose |
13. Compliance & Legal¶
- DSTA is a personal research platform; no trading advice is given to third parties.
- Users trade at their own risk; no guarantee of profit or performance.
- Users are responsible for their own tax reporting and local regulatory compliance.
- API keys are the user's responsibility; the platform stores them encrypted but provides no custodial guarantees.
- Open source under MIT License.
14. Appendices¶
14.1 Glossary¶
| Term | Definition |
|---|---|
| OHLCV | Open, High, Low, Close, Volume — standard candlestick format |
| Sharpe Ratio | Annualized risk-adjusted return: (mean return − risk-free rate) / std dev of return |
| Calmar Ratio | Annualized return / max drawdown |
| Slippage | Difference between expected fill price and actual fill price |
| Walk-forward | Rolling in-sample/out-of-sample validation; avoids look-ahead bias |
| Feature skew | Discrepancy between features computed at training vs. inference time |
| HMM | Hidden Markov Model — used for regime detection |
| BOCPD | Bayesian Online Change Point Detection |
| CVaR | Conditional Value at Risk — tail-loss measure for portfolio optimization |
| Kelly Criterion | Formula for optimal bet size given edge and odds |
| PatchTST | 2023 transformer architecture that patches time series before self-attention |
| iTransformer | 2024 transformer that inverts attention to the feature (variate) dimension |
| Chronos | Amazon 2024 zero-shot time-series foundation model |
| Feast | Open-source feature store for ML |
| RAG | Retrieval-Augmented Generation — LLM answers grounded in retrieved documents |
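The Sharpe, Calmar, and drawdown definitions in the glossary translate directly into code. This sketch uses hypothetical function names and takes per-period returns plus an explicit `periods_per_year`, since crypto markets trade every day of the year (365 daily periods rather than the 252 used for equities).

```python
import math
from statistics import mean, stdev


def sharpe_ratio(returns: list[float], periods_per_year: int,
                 risk_free_per_period: float = 0.0) -> float:
    """Annualized (mean excess return / sample std dev of excess return)."""
    excess = [r - risk_free_per_period for r in returns]
    return mean(excess) / stdev(excess) * math.sqrt(periods_per_year)


def max_drawdown(equity: list[float]) -> float:
    """Largest peak-to-trough decline, as a fraction of the running peak."""
    peak, worst = equity[0], 0.0
    for value in equity:
        peak = max(peak, value)
        worst = max(worst, (peak - value) / peak)
    return worst


def calmar_ratio(annualized_return: float, max_dd: float) -> float:
    """Annualized return divided by max drawdown (both as fractions)."""
    return annualized_return / max_dd
```

`sharpe_ratio` raises `ZeroDivisionError` for a constant return series and `statistics.StatisticsError` for fewer than two returns, so a report generator should handle both cases explicitly.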
14.2 References¶
- "Advances in Financial Machine Learning" — Marcos López de Prado
- "Algorithmic Trading: Winning Strategies" — Ernest P. Chan
- PatchTST: https://arxiv.org/abs/2211.14730
- iTransformer: https://arxiv.org/abs/2310.06625
- Chronos: https://arxiv.org/abs/2403.07815
- Feast documentation: https://docs.feast.dev
- Claude API tool-use: https://docs.anthropic.com/en/docs/tool-use
14.3 Related Documents¶
- `TASKS.md`: Phased task breakdown (Phase 1–4)
- `docs/CHANGELOG.md`: Service-level change history
- `deploy/docker-compose.yaml`: Full-stack deployment spec
- `docs/openapi/`: Per-service OpenAPI specs
Document Control
| Version | Date | Author | Changes |
|---|---|---|---|
| 1.0 | 2025-11-21 | AI Assistant | Initial PRD creation |
| 2.0 | 2026-05-20 | minhdqdev | Full rewrite: 4-phase roadmap, ML architecture, alt data, agent layer |