Database Initialization Guide¶
Overview¶
This guide explains how to initialize and manage the DSTA database, including schema creation, migrations, and test data generation.
Quick Start¶
Automatic Initialization¶
The easiest way to initialize the database is using the provided script:
# Basic initialization
./scripts/init_db.sh
# With sample data
./scripts/init_db.sh --with-sample-data
# With tests
./scripts/init_db.sh --with-tests
# Custom database settings
./scripts/init_db.sh --db-host localhost --db-name dsta_dev --with-sample-data
Manual Initialization¶
If you prefer manual control:
cd src
# 1. Run migrations
python manage.py migrate
# 2. Create superuser
python manage.py createsuperuser
# 3. (Optional) Load sample data
python manage.py loaddata api/fixtures/sample_data.json
# 4. (Optional) Generate test data
python manage.py generate_test_data --days 30
Database Schema¶
Tables¶
The DSTA database consists of four main tables:
1. Candlesticks¶
Stores OHLCV (Open, High, Low, Close, Volume) candlestick data.
Fields: - exchange: Exchange name (binance, gateio, huobi) - symbol: Trading pair (e.g., BTCUSDT) - interval: Timeframe (1m, 5m, 15m, 30m, 1h, 4h, 1d, 7d) - timestamp: Candlestick opening time - open_price: Opening price - high_price: Highest price - low_price: Lowest price - close_price: Closing price - volume: Trading volume (base currency) - quote_volume: Trading volume (quote currency) - trades_count: Number of trades
Indexes: - Composite: (exchange, symbol, interval, timestamp) - (symbol, timestamp) - (exchange, timestamp)
Constraints: - Unique: (exchange, symbol, interval, timestamp)
2. Trades¶
Stores individual trade execution records.
Fields: - exchange: Exchange name - symbol: Trading pair - trade_id: Unique trade ID from exchange - timestamp: Trade execution time - price: Trade price - quantity: Trade quantity (base currency) - quote_quantity: Trade quantity (quote currency) - is_buyer_maker: Whether buyer was maker
Indexes: - (exchange, symbol, timestamp) - (symbol, timestamp)
Constraints: - Unique: (exchange, symbol, trade_id)
3. Order Books¶
Stores order book snapshot data.
Fields: - exchange: Exchange name - symbol: Trading pair - timestamp: Snapshot time - bids: JSON array of [price, quantity] bid levels - asks: JSON array of [price, quantity] ask levels - last_update_id: Exchange update ID
Indexes: - (exchange, symbol, timestamp) - (symbol, timestamp) - (timestamp)
4. Market Metrics¶
Stores market-wide indicators.
Fields: - metric_type: Type of metric (btc_dominance, total_market_cap, fear_greed_index, other) - timestamp: Metric timestamp - value: Metric value - metadata: Additional JSON metadata
Indexes: - (metric_type, timestamp)
Constraints: - Unique: (metric_type, timestamp)
Migrations¶
Creating Migrations¶
# After modifying models.py
python manage.py makemigrations
# With custom name
python manage.py makemigrations --name add_new_field
Applying Migrations¶
# Apply all pending migrations
python manage.py migrate
# Apply specific app migrations
python manage.py migrate api
# Show migration status
python manage.py showmigrations
# See SQL for a migration
python manage.py sqlmigrate api 0001
Rolling Back Migrations¶
# Rollback to specific migration
python manage.py migrate api 0001_initial
# Rollback all migrations for an app
python manage.py migrate api zero
Test Data Generation¶
Using Django Command¶
# Generate 30 days of data
python manage.py generate_test_data --days 30
# Multiple symbols
python manage.py generate_test_data --days 30 --symbols BTCUSDT ETHUSDT BNBUSDT
# Different exchange
python manage.py generate_test_data --exchange gateio --days 7
# Quiet mode
python manage.py generate_test_data --quiet
Using Python¶
from api.fixtures.test_data_generator import generate_test_data
# Generate test data
stats = generate_test_data(
days=30,
symbols=['BTCUSDT', 'ETHUSDT', 'BNBUSDT'],
exchange='binance',
verbose=True
)
print(f"Created {stats['candlesticks']} candlesticks")
print(f"Created {stats['trades']} trades")
print(f"Created {stats['market_metrics']} market metrics")
Generated Data Characteristics¶
- Candlesticks: Random walk with realistic OHLC patterns
- Trades: Random price movements with varying quantities
- Market Metrics: Daily BTC dominance and Fear & Greed index
Sample Data¶
Loading Sample Data¶
# Load from fixture file
python manage.py loaddata api/fixtures/sample_data.json
# Export current data as fixture
python manage.py dumpdata api.Candlestick api.Trade --indent 2 > my_data.json
Sample Data Contents¶
The sample_data.json includes: - 2 candlesticks (BTCUSDT, ETHUSDT) - 2 market metrics (BTC dominance, Fear & Greed)
Database Management¶
Backup Database¶
# Using pg_dump
pg_dump -h localhost -U dsta -d dsta_dev -F c -f backup.dump
# Plain SQL format
pg_dump -h localhost -U dsta -d dsta_dev > backup.sql
Restore Database¶
# From custom format
pg_restore -h localhost -U dsta -d dsta_dev backup.dump
# From SQL file
psql -h localhost -U dsta -d dsta_dev < backup.sql
Reset Database¶
# Using Django
python manage.py flush
# Or drop and recreate
dropdb dsta_dev
createdb dsta_dev
python manage.py migrate
Verification¶
Check Schema¶
# List all tables
psql -h localhost -U dsta -d dsta_dev -c "\dt"
# Describe table
psql -h localhost -U dsta -d dsta_dev -c "\d candlesticks"
# Show indexes
psql -h localhost -U dsta -d dsta_dev -c "\di"
Check Data¶
# Count rows
psql -h localhost -U dsta -d dsta_dev -c "SELECT
(SELECT COUNT(*) FROM candlesticks) as candlesticks,
(SELECT COUNT(*) FROM trades) as trades,
(SELECT COUNT(*) FROM order_books) as order_books,
(SELECT COUNT(*) FROM market_metrics) as market_metrics;"
# Sample data
psql -h localhost -U dsta -d dsta_dev -c "SELECT * FROM candlesticks LIMIT 5;"
Using Django Shell¶
python manage.py shell
>>> from api.models import Candlestick, Trade, MarketMetric
>>>
>>> # Count records
>>> print(f"Candlesticks: {Candlestick.objects.count()}")
>>> print(f"Trades: {Trade.objects.count()}")
>>> print(f"Market Metrics: {MarketMetric.objects.count()}")
>>>
>>> # Latest candlestick
>>> latest = Candlestick.objects.latest('timestamp')
>>> print(f"{latest.symbol} @ {latest.timestamp}: ${latest.close_price}")
>>>
>>> # Query examples
>>> btc_candles = Candlestick.objects.filter(symbol='BTCUSDT', interval='1h')
>>> print(f"BTC hourly candles: {btc_candles.count()}")
Performance Optimization¶
Analyze Queries¶
-- Enable query timing
\timing on
-- Analyze query plan
EXPLAIN ANALYZE SELECT * FROM candlesticks
WHERE exchange = 'binance' AND symbol = 'BTCUSDT' AND interval = '1h'
ORDER BY timestamp DESC LIMIT 100;
Add Indexes¶
# In models.py
class Meta:
indexes = [
models.Index(fields=['exchange', 'symbol', 'timestamp']),
]
# Create migration
python manage.py makemigrations
python manage.py migrate
Vacuum and Analyze¶
# Vacuum database
psql -h localhost -U dsta -d dsta_dev -c "VACUUM ANALYZE;"
# Specific table
psql -h localhost -U dsta -d dsta_dev -c "VACUUM ANALYZE candlesticks;"
Troubleshooting¶
Migration Conflicts¶
# Show migration status
python manage.py showmigrations
# Merge conflicting migrations
python manage.py makemigrations --merge
# Fake a migration (if manually applied)
python manage.py migrate --fake api 0001_initial
Connection Issues¶
# Test database connection
psql -h localhost -U dsta -d dsta_dev -c "SELECT 1;"
# Check if PostgreSQL is running
docker-compose ps postgres
# View logs
docker-compose logs postgres
Data Issues¶
# Check for duplicate entries
psql -h localhost -U dsta -d dsta_dev -c "
SELECT exchange, symbol, interval, timestamp, COUNT(*)
FROM candlesticks
GROUP BY exchange, symbol, interval, timestamp
HAVING COUNT(*) > 1;"
# Remove duplicates (backup first!)
python manage.py shell
>>> from api.models import Candlestick
>>> # Identify and remove duplicates...
Environment-Specific Settings¶
Development¶
Testing¶
Production¶
Best Practices¶
- Always backup before migrations in production
- Test migrations on a copy of production data
- Use transactions for bulk operations
- Monitor query performance regularly
- Archive old data to keep tables manageable
- Use database constraints to ensure data integrity
- Index frequently queried fields
- Document schema changes in migrations