Troubleshooting Guide¶
This guide helps you diagnose and resolve common issues in the DSTA trading system.
Quick Diagnostics¶
Health Check¶
# Check system health
curl http://localhost:8000/health
# Expected response:
{
  "status": "healthy",
  "database": "connected",
  "redis": "connected",
  "timestamp": "2024-01-15T10:30:00Z"
}
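If you want to script this check, the following is a minimal sketch. It assumes the health endpoint returns the JSON fields shown above; the function names are illustrative and not part of DSTA.

```python
import json
import urllib.request

HEALTH_URL = "http://localhost:8000/health"  # endpoint from this guide


def unhealthy_components(payload):
    """Return the names of components that are not in their expected state."""
    expected = {"status": "healthy", "database": "connected", "redis": "connected"}
    return [name for name, want in expected.items() if payload.get(name) != want]


def check_health(url=HEALTH_URL, timeout=5):
    """Fetch the health endpoint and return the list of failing components."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return unhealthy_components(json.load(response))
```

Calling `check_health()` against a running stack should return an empty list when everything is connected.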
Service Status¶
# Check Docker containers
docker-compose ps
# Expected: All services should be "Up"
# If "Exit" or "Restarting", check logs
docker-compose logs <service-name>
Quick Fixes¶
# Restart all services
docker-compose restart
# Rebuild and restart
docker-compose up -d --build
# Reset everything (WARNING: deletes data)
docker-compose down -v
docker-compose up -d
Common Issues¶
1. Service Won't Start¶
Symptoms¶
- Container keeps restarting
- Service shows as "Exit 1" or "Exit 137"
Diagnosis¶
# Check container logs
docker-compose logs api-server
# Check events
docker events --filter container=<container-name>
# Check resource usage
docker stats
Common Causes and Solutions¶
A. Database Not Ready
# Error in logs:
# django.db.utils.OperationalError: could not connect to server
# Solution: Wait for database to be ready
docker-compose exec postgres pg_isready
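# Add a healthcheck to postgres in docker-compose.yaml and make api-server wait for it
services:
  postgres:
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U dsta_user"]
      interval: 5s
      retries: 5
  api-server:
    depends_on:
      postgres:
        condition: service_healthy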
B. Missing Environment Variables
# Error in logs:
# KeyError: 'POSTGRES_PASSWORD'
# Solution: Check environment file
cat .env_files/dev.env
# Ensure all required variables are set:
# - POSTGRES_DB
# - POSTGRES_USER
# - POSTGRES_PASSWORD
# - SECRET_KEY
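A startup check can report every missing variable at once instead of failing on the first `KeyError`. This is a sketch that assumes the four variables listed above are the full required set; `missing_env_vars` is a hypothetical helper name.

```python
import os

# Variable names taken from the list above.
REQUIRED_VARS = ("POSTGRES_DB", "POSTGRES_USER", "POSTGRES_PASSWORD", "SECRET_KEY")


def missing_env_vars(environ=None, required=REQUIRED_VARS):
    """Return required variable names that are unset or empty."""
    environ = os.environ if environ is None else environ
    return [name for name in required if not environ.get(name)]
```

Run it inside the container (for example from `python manage.py shell`) so that it sees the same environment as the service.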
C. Port Already in Use
# Error in logs:
# Error starting userland proxy: listen tcp 0.0.0.0:8000: bind: address already in use
# Find process using port
lsof -i :8000
# Kill process
kill -9 <PID>
# Or use different port in docker-compose.yaml
ports:
  - "8001:8000"
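Where `lsof` is not available, a short Python check can tell you whether something is already listening on the port. This sketch only tests whether a TCP connection succeeds; it does not identify the owning process.

```python
import socket


def port_in_use(port, host="127.0.0.1"):
    """Return True if something is accepting TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(1)
        # connect_ex returns 0 on success instead of raising an exception
        return sock.connect_ex((host, port)) == 0
```

For example, `port_in_use(8000)` should return `True` while the conflicting process is running.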
D. Out of Memory
# Error in logs:
# Exit 137 (killed by system)
# Check memory usage
free -h
docker stats
# Solution: Increase memory limit
services:
  api-server:
    mem_limit: 2g
    mem_reservation: 1g
# Or restart Docker daemon
sudo systemctl restart docker
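Exit codes above 128 follow the convention 128 + signal number, so 137 means signal 9 (SIGKILL). The out-of-memory killer is a common sender, but a manual `docker kill` produces the same code. A small sketch for decoding codes (the helper name is illustrative, and signal names are resolved for the platform it runs on):

```python
import signal


def describe_exit_code(code):
    """Explain a container exit code using the 128 + signal convention."""
    if code == 0:
        return "exited normally"
    if code > 128:
        try:
            name = signal.Signals(code - 128).name
        except ValueError:
            return f"exit code {code} (unknown signal {code - 128})"
        return f"killed by {name}"
    return f"application error (exit code {code})"
```

An "Exit 1" therefore points at the application itself (check its logs), while "Exit 137" points at something outside the process killing it.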
2. Database Issues¶
Cannot Connect to Database¶
Symptoms:
- psycopg2.OperationalError: could not connect to server
- django.db.utils.OperationalError
Diagnosis:
# Test database connection
docker-compose exec api-server python manage.py dbshell
# Test from command line
docker-compose exec postgres psql -U dsta_user -d dsta_dev
# Check postgres logs
docker-compose logs postgres
Solutions:
# 1. Ensure postgres is running
docker-compose ps postgres
# 2. Check credentials
docker-compose exec api-server env | grep POSTGRES
# 3. Restart database
docker-compose restart postgres
# 4. Recreate database (WARNING: deletes data)
docker-compose exec postgres psql -U postgres -c "DROP DATABASE dsta_dev;"
docker-compose exec postgres psql -U postgres -c "CREATE DATABASE dsta_dev OWNER dsta_user;"
docker-compose exec api-server python manage.py migrate
Database Lock¶
Symptoms:
- Queries hang indefinitely
- database is locked error
Diagnosis:
# Connect to database (run from the shell; the queries below run inside psql)
docker-compose exec postgres psql -U dsta_user dsta_dev
-- Check for long-running queries
SELECT
pid,
now() - pg_stat_activity.query_start AS duration,
query,
state
FROM pg_stat_activity
WHERE state != 'idle'
ORDER BY duration DESC;
-- Check for locks
SELECT
locktype,
database,
relation::regclass,
mode,
granted
FROM pg_locks
WHERE NOT granted;
Solutions:
-- Kill long-running query
SELECT pg_terminate_backend(<pid>);
-- Or kill all connections (use carefully)
SELECT pg_terminate_backend(pg_stat_activity.pid)
FROM pg_stat_activity
WHERE datname = 'dsta_dev' AND pid <> pg_backend_pid();
Slow Queries¶
Symptoms:
- API requests time out
- Database CPU at 100%
Diagnosis:
-- Enable query logging
ALTER DATABASE dsta_dev SET log_min_duration_statement = 1000; -- Log queries > 1s
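-- Find slow queries (requires the pg_stat_statements extension)
-- On PostgreSQL 13 and later, use total_exec_time, mean_exec_time, and max_exec_time instead
SELECT
query,
calls,
total_time,
mean_time,
max_time
FROM pg_stat_statements
ORDER BY mean_time DESC
LIMIT 10;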
Solutions:
-- Analyze query
EXPLAIN ANALYZE SELECT * FROM candlesticks WHERE symbol = 'BTCUSDT';
-- Add missing index
CREATE INDEX idx_symbol ON candlesticks(symbol);
-- Update statistics
ANALYZE candlesticks;
-- Vacuum database
VACUUM ANALYZE;
3. Redis Issues¶
Cannot Connect to Redis¶
Symptoms:
- redis.exceptions.ConnectionError
- Cache not working
Diagnosis:
# Test Redis connection
docker-compose exec redis redis-cli ping
# Expected: PONG
# Test from Python
docker-compose exec api-server python
>>> import redis
>>> r = redis.Redis(host='redis', port=6379)
>>> r.ping()
True
Solutions:
# Restart Redis
docker-compose restart redis
# Check Redis logs
docker-compose logs redis
# Clear Redis cache (WARNING: FLUSHALL deletes every key in every Redis database)
docker-compose exec redis redis-cli FLUSHALL
Redis Out of Memory¶
Symptoms:
- OOM command not allowed when used memory > 'maxmemory'
Solutions:
# Check memory usage
docker-compose exec redis redis-cli INFO memory
# Increase maxmemory (in redis.conf or command line)
docker-compose exec redis redis-cli CONFIG SET maxmemory 2gb
# Or flush unnecessary data (WARNING: FLUSHDB deletes every key in the current database)
docker-compose exec redis redis-cli FLUSHDB
4. API/Django Issues¶
Import Errors¶
Symptoms:
- ModuleNotFoundError: No module named 'XXX'
- ImportError: cannot import name 'YYY'
Solutions:
# Rebuild container
docker-compose up -d --build api-server
# Check installed packages
docker-compose exec api-server pip list
# Install missing package
docker-compose exec api-server pip install <package-name>
# Update requirements.txt and rebuild
echo "package-name==1.0.0" >> src/requirements.txt
docker-compose up -d --build
Migrations Issues¶
Symptoms:
- django.db.migrations.exceptions.InconsistentMigrationHistory
- django.db.migrations.exceptions.NodeNotFoundError
Solutions:
# Check migration status
docker-compose exec api-server python manage.py showmigrations
# Fake initial migrations (if database already has tables)
docker-compose exec api-server python manage.py migrate --fake-initial
# Reset migrations (WARNING: may lose data)
docker-compose exec api-server python manage.py migrate <app> zero
docker-compose exec api-server python manage.py migrate
# Recreate migrations
docker-compose exec api-server python manage.py makemigrations
docker-compose exec api-server python manage.py migrate
Static Files Not Loading¶
Symptoms:
- CSS/JS not loading in admin
- 404 errors for static files
Solutions:
# Collect static files
docker-compose exec api-server python manage.py collectstatic --noinput
# Check STATIC_ROOT setting
docker-compose exec api-server python manage.py shell
>>> from django.conf import settings
>>> print(settings.STATIC_ROOT)
# Ensure volume is mounted correctly in docker-compose.yaml
services:
  api-server:
    volumes:
      - static_files:/app/staticfiles
5. Exchange Connectivity Issues¶
API Authentication Failures¶
Symptoms:
- binance.exceptions.BinanceAPIException: Invalid API-key
- 401 Unauthorized
Solutions:
# Verify API keys
docker-compose exec api-server env | grep BINANCE
# Test API keys
docker-compose exec api-server python
>>> from binance.client import Client
>>> client = Client(api_key='YOUR_KEY', api_secret='YOUR_SECRET')
>>> client.ping()
# Ensure API keys have correct permissions:
# - Spot trading (if trading)
# - Market data (required)
# - Futures trading (if using futures)
# Check IP whitelist on exchange
# Some exchanges require whitelisting server IP
Rate Limiting¶
Symptoms:
- 429 Too Many Requests
- Rate limit exceeded
Solutions:
# Add rate limiting to API calls
import time
from functools import wraps
def rate_limit(calls_per_second=10):
    """Rate limiting decorator."""
    min_interval = 1.0 / calls_per_second
    last_called = [0.0]
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            elapsed = time.time() - last_called[0]
            wait = min_interval - elapsed
            if wait > 0:
                time.sleep(wait)
            result = func(*args, **kwargs)
            last_called[0] = time.time()
            return result
        return wrapper
    return decorator
@rate_limit(calls_per_second=5)
def fetch_data(symbol):
    # API call here
    pass
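Client-side throttling reduces 429 responses but does not handle the ones that still occur. A common complement is to retry with exponential backoff. The sketch below uses a hypothetical `RateLimitError` as a stand-in for whatever exception your exchange client raises on HTTP 429; replace it with the real exception type.

```python
import random
import time


class RateLimitError(Exception):
    """Hypothetical stand-in for the client's HTTP 429 exception."""


def call_with_backoff(func, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call func(), retrying on RateLimitError with exponential backoff."""
    for attempt in range(max_attempts):
        try:
            return func()
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise  # give up after the final attempt
            # Delay doubles each attempt; jitter spreads out simultaneous retries.
            delay = base_delay * (2 ** attempt)
            sleep(delay + random.uniform(0, delay / 2))
```

The `sleep` parameter exists so the delay can be replaced in tests. If the exchange sends a retry hint in its response headers, honoring that value is usually preferable to a computed delay.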
WebSocket Disconnections¶
Symptoms:
- WebSocket connections keep dropping
- Connection closed errors
Solutions:
# Implement reconnection logic
import logging
import time
import websocket  # provided by the websocket-client package
logger = logging.getLogger(__name__)
class WebSocketReconnect:
    def __init__(self, url, on_message):
        self.url = url
        self.on_message = on_message
        self.ws = None
        self.reconnect_delay = 5  # seconds to wait between connection attempts
    def connect(self):
        """Connect and reconnect automatically whenever the connection ends."""
        while True:
            try:
                self.ws = websocket.WebSocketApp(
                    self.url,
                    on_message=self.on_message,
                    on_error=self.on_error,
                    on_close=self.on_close,
                )
                # run_forever() returns once the connection is closed
                self.ws.run_forever()
            except Exception as e:
                logger.error(f"WebSocket error: {e}")
            # Wait once before reconnecting, whether the socket closed or raised
            time.sleep(self.reconnect_delay)
    def on_error(self, ws, error):
        logger.error(f"WebSocket error: {error}")
    def on_close(self, ws, close_status_code, close_msg):
        logger.info(f"WebSocket closed: {close_msg}")
6. Data Collection Issues¶
No Data Being Collected¶
Symptoms:
- Database tables empty
- Candlesticks not updating
Diagnosis:
# Check data collection workers
docker-compose logs dsta-sync-schedule
# Check for errors
docker-compose logs dsta-sync-schedule | grep ERROR
# Verify database connection
docker-compose exec api-server python manage.py dbshell
# Then, at the database prompt:
SELECT COUNT(*) FROM candlesticks;
Solutions:
# Restart data collection
docker-compose restart dsta-sync-schedule
# Manual data collection test
docker-compose exec api-server python manage.py shell
>>> from api.models import Candlestick
>>> from datetime import datetime
>>> # Test creating a candlestick
>>> candle = Candlestick.objects.create(...)
# Check exchange connectivity
docker-compose exec api-server python
>>> from binance.client import Client
>>> client = Client()
>>> candles = client.get_klines(symbol='BTCUSDT', interval='1h', limit=1)
>>> print(candles)
Duplicate Data¶
Symptoms:
- IntegrityError: duplicate key value violates unique constraint
- Multiple entries for same timestamp
Solutions:
# Check for duplicates
docker-compose exec postgres psql -U dsta_user dsta_dev
SELECT exchange, symbol, interval, timestamp, COUNT(*)
FROM candlesticks
GROUP BY exchange, symbol, interval, timestamp
HAVING COUNT(*) > 1;
-- Remove duplicates (keeps the row with the lowest id in each group)
DELETE FROM candlesticks a USING candlesticks b
WHERE a.id > b.id
AND a.exchange = b.exchange
AND a.symbol = b.symbol
AND a.interval = b.interval
AND a.timestamp = b.timestamp;
-- Ensure unique constraint exists (fails until the duplicates are removed)
ALTER TABLE candlesticks ADD CONSTRAINT unique_candlestick
UNIQUE (exchange, symbol, interval, timestamp);
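To preview which rows the DELETE statement would remove, the same rule can be applied to rows loaded into memory. This is an illustrative sketch that assumes each row is a dict with the column names used above; it is not DSTA code.

```python
def rows_to_delete(rows):
    """Return ids of duplicate rows, keeping the lowest id per candle key."""
    keep = {}
    for row in rows:
        key = (row["exchange"], row["symbol"], row["interval"], row["timestamp"])
        # Remember the smallest id seen so far for this key
        if key not in keep or row["id"] < keep[key]:
            keep[key] = row["id"]
    kept_ids = set(keep.values())
    return sorted(row["id"] for row in rows if row["id"] not in kept_ids)
```

Comparing the returned ids with the duplicate-count query above is a cheap sanity check before running the delete on real data.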
7. Performance Issues¶
High CPU Usage¶
Diagnosis:
# Check which service is using CPU
docker stats
# Check processes inside container
docker-compose exec api-server top
# Profile Python code
docker-compose exec api-server python -m cProfile -o profile.stats manage.py <command>
Solutions:
# Optimize database queries (add indexes)
# See DATABASE_SCHEMA.md
# Reduce worker processes
# In gunicorn config: workers = 2
# Enable query caching
# In Django settings:
CACHES = {
    'default': {
        'BACKEND': 'django_redis.cache.RedisCache',
        'LOCATION': 'redis://redis:6379/1',
    }
}
# Use select_related and prefetch_related
queryset = Candlestick.objects.select_related('symbol').all()
High Memory Usage¶
Diagnosis:
# Check memory usage
docker stats
# Check memory inside container
docker-compose exec api-server free -h
# Python memory profiling
docker-compose exec api-server python -m memory_profiler script.py
Solutions:
# Limit container memory
services:
  api-server:
    mem_limit: 2g
# Use iterator for large querysets
for obj in MyModel.objects.iterator(chunk_size=1000):
    process(obj)
# Clear the query log periodically (Django records queries only when DEBUG=True)
from django.db import reset_queries
reset_queries()
# Recycle workers periodically to release leaked memory
# In gunicorn: max_requests = 1000
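If `memory_profiler` is not installed, the standard-library `tracemalloc` module can show where allocations happen. The following is a minimal sketch; `top_allocations` is an illustrative helper, not part of DSTA.

```python
import tracemalloc


def top_allocations(func, limit=5):
    """Run func() under tracemalloc and return its result and top allocation sites."""
    tracemalloc.start()
    try:
        result = func()
        # Snapshot while the result is still referenced so its memory is counted
        snapshot = tracemalloc.take_snapshot()
    finally:
        tracemalloc.stop()
    return result, snapshot.statistics("lineno")[:limit]
```

Each returned statistic has `size` and `count` attributes and prints as a file and line number with its total allocated size, which helps locate a queryset that is being fully materialized.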
Slow API Responses¶
Diagnosis:
# Enable Django debug toolbar
pip install django-debug-toolbar
# Check query count (connection.queries is populated only when DEBUG=True)
from django.db import connection
print(len(connection.queries))
# Time endpoints
time curl http://localhost:8000/api/candlesticks/
Solutions:
# Use database indexes
# Add select_related / prefetch_related
# Enable caching
from django.views.decorators.cache import cache_page
@cache_page(60 * 5)  # Cache for 5 minutes
def my_view(request):
    # View logic
    pass
# Use pagination
from rest_framework.pagination import PageNumberPagination
class StandardResultsSetPagination(PageNumberPagination):
    page_size = 100
    page_size_query_param = 'page_size'
    max_page_size = 1000
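To find which functions are slow without installing extra tooling, you can log calls that exceed a time threshold. This is a generic sketch using only the standard library; the decorator name and the 0.5 second default are assumptions to adjust for your workload.

```python
import functools
import logging
import time

logger = logging.getLogger(__name__)


def log_slow_calls(threshold_seconds=0.5):
    """Decorator that logs a warning when the wrapped call exceeds the threshold."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return func(*args, **kwargs)
            finally:
                # Runs even if func raises, so failing slow calls are logged too
                elapsed = time.perf_counter() - start
                if elapsed > threshold_seconds:
                    logger.warning("%s took %.3fs", func.__name__, elapsed)
        return wrapper
    return decorator
```

Apply it to a view or a helper that queries the database, then look for the warnings in the service logs.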
8. Testing Issues¶
Tests Failing¶
Common Causes:
# Database not clean between tests
# Solution: Use pytest fixtures
import pytest
@pytest.fixture(autouse=True)
def reset_db(db):
    """Request pytest-django's db fixture so each test runs in a rolled-back transaction."""
    yield
    # Cleanup happens automatically
# Missing test dependencies
pip install pytest pytest-django pytest-cov
# Incorrect test settings
# Ensure using test settings
pytest --ds=dsta.settings_test
Test Database Issues¶
# Permission denied creating test database
# Grant the CREATEDB privilege to the application role
docker-compose exec postgres psql -U postgres
ALTER ROLE dsta_user CREATEDB;
# Test database not being destroyed
# Manually drop test databases
DROP DATABASE test_dsta_dev;
Debug Logging¶
Enable Debug Mode¶
View Logs¶
# Follow all logs
docker-compose logs -f
# Specific service
docker-compose logs -f api-server
# Last 100 lines
docker-compose logs --tail=100 api-server
# Search logs
docker-compose logs api-server | grep ERROR
# Export logs
docker-compose logs api-server > api-logs.txt
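When many services log at once, a per-service error count shows where to look first. The sketch below assumes the usual `docker-compose logs` layout in which each line starts with the service name followed by `|`; the function name is illustrative.

```python
from collections import Counter


def count_errors_by_service(lines, marker="ERROR"):
    """Count lines containing the marker, grouped by the service-name prefix."""
    counts = Counter()
    for line in lines:
        prefix, sep, message = line.partition("|")
        # Lines without a "service |" prefix are skipped
        if sep and marker in message:
            counts[prefix.strip()] += 1
    return counts
```

You can feed it the lines of an exported file such as `api-logs.txt`, for example `count_errors_by_service(open("api-logs.txt"))`.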
Django Shell Debugging¶
# Access Django shell
docker-compose exec api-server python manage.py shell
# Test code interactively
>>> from api.models import Candlestick
>>> candles = Candlestick.objects.all()
>>> print(candles.count())
# Test imports
>>> from backtesting.backtest import Backtest
>>> print(Backtest.__doc__)
Python Debugger (pdb)¶
# Add breakpoint in code
import pdb; pdb.set_trace()
# Or use debugpy for remote debugging
import debugpy
debugpy.listen(("0.0.0.0", 5678))
debugpy.wait_for_client()
FAQ¶
Q: Docker containers keep restarting¶
A: Check logs for the specific error:
docker-compose logs <service-name>
Common causes: missing env vars, database not ready, port conflicts.
Q: How do I reset the entire system?¶
A: WARNING - This deletes all data:
docker-compose down -v
rm -rf data/docker-storage/*
docker-compose up -d
docker-compose exec api-server python manage.py migrate
Q: API returns 500 errors¶
A: Check Django logs:
docker-compose logs api-server
Enable DEBUG mode (in development only) to see detailed error pages.
Q: Cannot access admin panel¶
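A:
1. Create superuser:
docker-compose exec api-server python manage.py createsuperuser
2. Collect static files:
docker-compose exec api-server python manage.py collectstatic --noinput
Q: Backtests run very slowly¶
A:
- Add database indexes (see DATABASE_SCHEMA.md)
- Use smaller date ranges
- Reduce data frequency (use 1h instead of 1m)
- Check if running in DEBUG mode (should be False)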
Q: Out of disk space¶
A:
# Check disk usage
df -h
# Clean Docker (WARNING: removes all unused images, stopped containers, networks, and volumes)
docker system prune -a --volumes
# Clean logs
truncate -s 0 /var/log/dsta/*.log
# Archive old data
pg_dump dsta_dev | gzip > backup.sql.gz
# Then delete old candlesticks (run inside psql)
DELETE FROM candlesticks WHERE timestamp < '2023-01-01';
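To catch low disk space before it causes failures, a scheduled job can check usage with the standard library. This is a sketch; the 90% threshold is an assumption, and the percentage may differ slightly from the `Use%` column of `df` because of reserved blocks.

```python
import shutil


def disk_usage_percent(path="/"):
    """Return the percentage of the filesystem holding path that is in use."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def is_disk_low(path="/", threshold_percent=90.0):
    """Return True when usage is at or above the threshold."""
    return disk_usage_percent(path) >= threshold_percent
```

Point `path` at the filesystem that holds the Docker data directory, since that is where images, volumes, and container logs accumulate.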
Q: How to update DSTA?¶
A:
# Backup first!
./backup-db.sh
# Pull latest changes
git pull origin main
# Rebuild containers
docker-compose down
docker-compose up -d --build
# Run migrations
docker-compose exec api-server python manage.py migrate
Getting Help¶
Information to Provide¶
When asking for help, include:
- Error message (full traceback)
- Steps to reproduce
- Docker compose logs:
- System info:
- Configuration (sanitize secrets):
Resources¶
- Documentation: Check all docs in docs/
- GitHub Issues: Search existing issues
- Docker Logs: Always check logs first
- Django Debug Toolbar: Enable in development
Contact¶
- Open a GitHub issue with the bug label
- Include the information from "Information to Provide"
- Be specific about the problem
- Share relevant logs and error messages
Checklist for Production Issues¶
- Check service status: docker-compose ps
- Review logs: docker-compose logs
- Verify health endpoint: curl http://localhost:8000/health
- Check database connection: docker-compose exec postgres pg_isready
- Check Redis connection: docker-compose exec redis redis-cli ping
- Verify disk space: df -h
- Check memory: free -h
- Review recent changes: git log --oneline -10
- Check environment variables: docker-compose config
- Verify backups are current
- Test API manually: curl http://localhost:8000/api/
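Part of the checklist above can be scripted. The sketch below runs a few of the listed commands and records whether each exited with code 0. The `CHECKS` mapping and the function name are illustrative, and the `-T` flag is added so that `docker-compose exec` works without a terminal.

```python
import subprocess

# Commands taken from the checklist; adjust to your environment.
CHECKS = {
    "services": ["docker-compose", "ps"],
    "database": ["docker-compose", "exec", "-T", "postgres", "pg_isready"],
    "redis": ["docker-compose", "exec", "-T", "redis", "redis-cli", "ping"],
    "disk": ["df", "-h"],
}


def run_checks(checks=CHECKS, run=subprocess.run):
    """Run each command and map its name to True (exit code 0) or False."""
    results = {}
    for name, command in checks.items():
        try:
            completed = run(command, capture_output=True, text=True, timeout=30)
            results[name] = completed.returncode == 0
        except (OSError, subprocess.TimeoutExpired):
            results[name] = False  # command missing or hung
    return results
```

An exit code of 0 only shows that the command ran successfully. For checks such as `docker-compose ps` and `df -h` you still need to read the output.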
Remember: Most issues can be resolved by checking logs and following the error messages. When in doubt, restart the service or rebuild the container.