Parameter Sensitivity Analyzer¶
Overview¶
The Parameter Sensitivity Analyzer evaluates how robust a backtesting strategy's parameters are. It helps identify optimal parameter ranges, detect overfitting, and assess strategy stability.
Created for: TASK-RESEARCH-005
Author: DSTA Team
Date: 2026-01-27
Features¶
1. Grid Search¶
- Exhaustive search over parameter combinations
- Multi-dimensional parameter space exploration
- Automatic identification of optimal parameters
- Performance surface visualization (2D heatmaps)
2. Sensitivity Analysis¶
- Single parameter sensitivity testing
- Multi-parameter comparative analysis
- Sensitivity score calculation (coefficient of variation)
- Performance degradation metrics
3. Overfitting Detection¶
- Parameter robustness scoring
- Sharp peak detection
- Near-optimal region analysis
- Risk level classification (LOW/MEDIUM/HIGH)
4. Monte Carlo Perturbation¶
- Random parameter perturbation testing
- Robustness score calculation
- Performance distribution analysis
- Type-preserving perturbations (int/float)
5. Walk-Forward Analysis¶
- Time-based parameter stability testing
- Sequential period evaluation
- Parameter consistency metrics
6. Heatmap Generation¶
- 2D performance surface visualization
- Parameter interaction analysis
- Ready-to-plot data formats
Installation¶
The module is part of the DSTA analytics package:
from analytics.sensitivity_analyzer import (
    ParameterRange,
    SensitivityAnalyzer,
    SensitivityResult,
    GridSearchResult,
)
Quick Start¶
Basic Grid Search¶
from analytics.sensitivity_analyzer import ParameterRange, SensitivityAnalyzer
# Define your backtest function
def my_backtest(params):
    # Run the backtest with params and compute sharpe_ratio (strategy-specific)
    return sharpe_ratio
# Create analyzer
analyzer = SensitivityAnalyzer(
    backtest_func=my_backtest,
    metric_name='sharpe_ratio',
    maximize=True
)
# Define parameter ranges
ranges = [
    ParameterRange(name='period', min_value=10, max_value=50, step=5),
    ParameterRange(name='threshold', min_value=0.01, max_value=0.1, step=0.02)
]
# Run grid search
result = analyzer.grid_search(ranges)
print(f"Best params: {result.best_params}")
print(f"Best performance: {result.best_performance}")
Single Parameter Sensitivity¶
baseline_params = {'period': 20, 'threshold': 0.05}
param_range = ParameterRange(name='period', min_value=10, max_value=30, step=5)
result = analyzer.analyze_single_parameter(baseline_params, param_range)
print(f"Sensitivity score: {result.sensitivity_score:.3f}")
print(f"Overfitted: {result.is_overfitted}")
Monte Carlo Robustness Testing¶
mc_result = analyzer.monte_carlo_perturbation(
    baseline_params={'period': 20, 'threshold': 0.05},
    perturbation_pct=0.2,  # ±20% perturbation
    n_samples=100
)
print(f"Robustness score: {mc_result['robustness_score']:.2%}")
Overfitting Detection¶
# Reuses `ranges` from the Basic Grid Search example
grid_result = analyzer.grid_search(ranges)
overfitting = analyzer.detect_overfitting(grid_result)
print(f"Risk level: {overfitting['risk_level']}")
print(f"Robustness: {overfitting['robustness']:.2%}")
API Reference¶
ParameterRange¶
Define a search space for a parameter.
Parameters:
- name (str): Parameter name
- min_value (float/int, optional): Minimum value
- max_value (float/int, optional): Maximum value
- step (float/int, optional): Step size for linear ranges, or number of points for log ranges
- values (list, optional): Explicit list of values
- scale (str): 'linear' or 'log' (default: 'linear')
Methods:
- get_values(): Generate list of parameter values to test
Examples:
# Linear range (integers)
ParameterRange(name='period', min_value=10, max_value=50, step=5)
# Returns: [10, 15, 20, 25, 30, 35, 40, 45, 50]
# Logarithmic range
ParameterRange(name='alpha', min_value=0.001, max_value=1.0, step=10, scale='log')
# Returns: 10 logarithmically-spaced values
# Explicit values
ParameterRange(name='method', values=['sma', 'ema', 'wma'])
# Returns: ['sma', 'ema', 'wma']
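How a range expands into concrete values can be sketched as follows. This is a minimal illustration, not the module's actual implementation: the helper name expand_range is hypothetical, and it assumes that step is the increment for linear ranges and the number of points for log ranges, as described above.

```python
import math


def expand_range(min_value, max_value, step, scale="linear"):
    """Hypothetical stand-in for ParameterRange.get_values()."""
    if scale == "linear":
        # Number of whole steps that fit in [min_value, max_value].
        n_steps = int(math.floor((max_value - min_value) / step + 1e-9))
        values = [min_value + i * step for i in range(n_steps + 1)]
        # Keep integers as integers when every input is an integer.
        if all(isinstance(v, int) for v in (min_value, max_value, step)):
            return values
        # Round away floating-point noise such as 0.030000000000000002.
        return [round(v, 10) for v in values]
    if scale == "log":
        # For log ranges, `step` is the number of points (assumed >= 2).
        lo, hi = math.log10(min_value), math.log10(max_value)
        n_points = int(step)
        return [10 ** (lo + i * (hi - lo) / (n_points - 1)) for i in range(n_points)]
    raise ValueError(f"unknown scale: {scale!r}")


linear_values = expand_range(10, 50, 5)
float_values = expand_range(0.01, 0.1, 0.02)
log_values = expand_range(0.001, 1.0, 4, scale="log")
```

Note that with a float step the upper bound is only included when it lies exactly on the grid; in the sketch, a range from 0.01 to 0.1 in steps of 0.02 stops at 0.09.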
SensitivityAnalyzer¶
Main class for parameter sensitivity analysis.
Constructor Parameters:
- backtest_func (callable): Function that takes params dict and returns performance metric
- metric_name (str): Name of the performance metric (default: 'sharpe_ratio')
- maximize (bool): Whether to maximize the metric (default: True)
- n_jobs (int): Number of parallel workers (default: 1)
- random_state (int, optional): Random seed for reproducibility
Methods:
grid_search(param_ranges, base_params=None)¶
Perform exhaustive grid search over parameter ranges.
Returns: GridSearchResult with:
- parameter_combinations: All tested combinations
- performances: Performance for each combination
- best_params: Optimal parameter set
- best_performance: Best performance achieved
- performance_surface: DataFrame (for 2D grids)
- heatmap_data: Data for visualization
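The idea behind an exhaustive grid search can be sketched in a few lines. The function below is a hypothetical stand-in, not the module's grid_search(): it takes a plain dict of value lists instead of ParameterRange objects and returns a dict with a subset of the fields listed above.

```python
import itertools


def grid_search_sketch(backtest_func, grid, base_params=None, maximize=True):
    """Evaluate backtest_func on every combination in `grid` (name -> values)."""
    names = list(grid)
    combinations, performances = [], []
    for combo in itertools.product(*(grid[name] for name in names)):
        params = dict(base_params or {})
        params.update(zip(names, combo))
        combinations.append(params)
        performances.append(backtest_func(params))
    # Pick the index of the best score, honouring the optimisation direction.
    pick = max if maximize else min
    best_index = pick(range(len(performances)), key=performances.__getitem__)
    return {
        "parameter_combinations": combinations,
        "performances": performances,
        "best_params": combinations[best_index],
        "best_performance": performances[best_index],
    }


# Toy objective that peaks at period=20, threshold=0.05 (illustrative only).
def toy_backtest(params):
    return 2.0 - abs(params["period"] - 20) / 10 - abs(params["threshold"] - 0.05) * 10


result = grid_search_sketch(
    toy_backtest,
    {"period": [10, 15, 20, 25, 30], "threshold": [0.01, 0.03, 0.05, 0.07, 0.09]},
)
```

Five values per axis give 25 backtests here; every extra parameter multiplies that count by the number of values on its axis.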
analyze_single_parameter(baseline_params, param_range, overfitting_threshold=0.2)¶
Analyze sensitivity to a single parameter.
Returns: SensitivityResult with:
- parameter_name: Name of analyzed parameter
- baseline_value: Original value
- baseline_performance: Performance at baseline
- test_values: Values tested
- performances: Performance at each value
- sensitivity_score: Coefficient of variation
- degradation_pct: Performance degradation (%)
- is_overfitted: Overfitting flag
- confidence_interval: 95% CI
- statistics: Additional metrics
analyze_multiple_parameters(baseline_params, param_ranges, overfitting_threshold=0.2)¶
Analyze sensitivity to multiple parameters independently.
Returns: Dictionary mapping parameter names to SensitivityResult objects
monte_carlo_perturbation(baseline_params, perturbation_pct=0.1, n_samples=100, param_names=None)¶
Perform Monte Carlo parameter perturbation analysis.
Returns: Dictionary with:
- baseline_performance: Baseline metric value
- mean_performance: Mean across perturbations
- std_performance: Standard deviation
- degradation_pct: Performance degradation
- robustness_score: Fraction of runs that achieve at least 90% of baseline performance
- perturbed_params: List of perturbed parameter sets
- performances: List of performance values
walk_forward_analysis(baseline_params, param_range, time_periods, refit_frequency=1)¶
Perform walk-forward parameter analysis across time periods.
Returns: Dictionary with:
- period_performances: Performance for each period
- optimal_params_per_period: Optimal parameters per period
- parameter_stability: Parameter consistency metric
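A walk-forward check can be sketched as re-optimising the parameter in each period and then measuring how much the optimum moves. The code below is an illustration under two loudly stated assumptions: the backtest function accepts the period as a second argument, and stability is measured as the coefficient of variation of the per-period optimum (so larger values mean less stable parameters). The module may define both differently.

```python
import statistics


def walk_forward_sketch(backtest_func, baseline_params, name, values, time_periods):
    """Re-optimise one parameter in each period and measure how much it moves."""
    optimal_values, period_performances = [], []
    for period in time_periods:
        scored = [
            (backtest_func({**baseline_params, name: v}, period), v) for v in values
        ]
        best_perf, best_value = max(scored, key=lambda item: item[0])
        optimal_values.append(best_value)
        period_performances.append(best_perf)
    mean = statistics.fmean(optimal_values)
    # Assumed definition: coefficient of variation of the per-period optimum.
    spread = statistics.pstdev(optimal_values) / abs(mean) if mean else float("inf")
    return {
        "period_performances": period_performances,
        "optimal_params_per_period": optimal_values,
        "parameter_stability": spread,
    }


# Toy backtest whose best period length differs by time window (illustrative).
PEAKS = {
    ("2020-01", "2020-06"): 20,
    ("2020-06", "2020-12"): 20,
    ("2021-01", "2021-06"): 30,
}


def toy_backtest(params, period):
    return 1.0 - abs(params["period"] - PEAKS[period]) / 100


wf = walk_forward_sketch(
    toy_backtest, {"period": 20, "threshold": 0.05}, "period",
    [10, 15, 20, 25, 30], list(PEAKS),
)
```

In this toy run the optimum shifts from 20 to 30 in the last window, which shows up as a non-zero spread.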
calculate_sensitivity_scores(sensitivity_results)¶
Calculate and rank sensitivity scores for multiple parameters.
Returns: DataFrame with parameters ranked by sensitivity
generate_heatmap_data(param_range1, param_range2, base_params=None)¶
Generate 2D heatmap data for two parameters.
Returns: Dictionary with:
- values: 2D array of performances
- x_labels: X-axis labels
- y_labels: Y-axis labels
- x_param: X parameter name
- y_param: Y parameter name
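The shape of such heatmap data can be sketched as below. The function is hypothetical and takes plain value lists rather than ParameterRange objects; the orientation (one row per y value, one column per x value) is an assumption that matches what most plotting libraries expect, not a confirmed detail of the module.

```python
def heatmap_data_sketch(backtest_func, x_name, x_values, y_name, y_values,
                        base_params=None):
    """Build a grid with one row per y value and one column per x value."""
    base = dict(base_params or {})
    values = [
        [backtest_func({**base, x_name: x, y_name: y}) for x in x_values]
        for y in y_values
    ]
    return {
        "values": values,
        "x_labels": list(x_values),
        "y_labels": list(y_values),
        "x_param": x_name,
        "y_param": y_name,
    }


# Toy metric (illustrative only): the product of the two parameters.
heat = heatmap_data_sketch(
    lambda p: p["period"] * p["k"], "period", [10, 20, 30], "k", [1, 2]
)
```

The nested list in heat["values"] can be handed to a 2D plotting call, with x_labels and y_labels used as tick labels.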
detect_overfitting(grid_result, threshold_percentile=90)¶
Detect overfitting indicators in grid search results.
Returns: Dictionary with:
- risk_level: 'LOW', 'MEDIUM', or 'HIGH'
- risk_score: Numeric risk score (0-7)
- robustness: Fraction of near-optimal combinations
- coefficient_of_variation: Performance CV
- performance_gap_pct: Gap between best and median
- Additional statistics
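The statistics behind these indicators can be sketched as follows. This is an illustration only: it assumes that "near-optimal" means reaching at least 90% of the best performance, that the best performance is positive, and it does not attempt to reproduce how the module turns the statistics into the 0-7 risk score.

```python
import statistics


def overfitting_stats(performances, near_optimal_fraction=0.9):
    """Summarise a grid of results; assumes best > 0 and a non-zero mean."""
    best = max(performances)
    median = statistics.median(performances)
    mean = statistics.fmean(performances)
    near_optimal = [p for p in performances if p >= near_optimal_fraction * best]
    return {
        # Share of combinations that perform close to the best one.
        "robustness": len(near_optimal) / len(performances),
        "coefficient_of_variation": statistics.pstdev(performances) / abs(mean),
        # How far the best result sits above the typical result, in percent.
        "performance_gap_pct": 100.0 * (best - median) / abs(best),
    }


plateau = overfitting_stats([1.9, 2.0, 1.95, 1.85, 1.9, 1.82])
sharp_peak = overfitting_stats([0.2, 0.3, 2.0, 0.25, 0.1, 0.3])
```

A broad plateau yields high robustness and a small gap, while a single sharp peak yields low robustness and a large gap between best and median.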
Interpretation Guide¶
Sensitivity Score¶
The sensitivity score is the coefficient of variation (CV) of performance across parameter values:
- CV < 0.1: Low sensitivity (robust parameter)
- 0.1 ≤ CV < 0.3: Moderate sensitivity
- CV ≥ 0.3: High sensitivity (parameter requires careful tuning)
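As a worked illustration of these bands, the coefficient of variation is the standard deviation of the performances divided by the absolute value of their mean. The helper names below are hypothetical, and the sketch uses the population standard deviation; the module may use the sample version instead.

```python
import statistics


def sensitivity_score(performances):
    """Coefficient of variation: standard deviation divided by |mean|."""
    mean = statistics.fmean(performances)
    if mean == 0:
        return float("inf")  # CV is undefined when the mean is zero
    return statistics.pstdev(performances) / abs(mean)


def classify_sensitivity(cv):
    """Map a CV onto the three bands from the interpretation guide."""
    if cv < 0.1:
        return "low"
    if cv < 0.3:
        return "moderate"
    return "high"


stable = [1.50, 1.52, 1.48, 1.51, 1.49]   # barely reacts to the parameter
fragile = [0.2, 1.8, 0.4, 1.5, 0.1]       # swings widely with the parameter

stable_band = classify_sensitivity(sensitivity_score(stable))
fragile_band = classify_sensitivity(sensitivity_score(fragile))
```

Because the CV divides by the mean, it becomes unstable when average performance is close to zero, so scores for near-zero metrics should be read with care.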
Overfitting Indicators¶
A parameter is flagged as overfitted if:
- Performance degrades significantly (>20% by default) away from optimal
- Robustness score is low (<20% of combinations near optimal)
- Sharp performance peak with rapid degradation
Risk Levels¶
LOW Risk (score 0-2):
- Robust parameter set
- Many near-optimal combinations
- Low performance variance
- Small gap between best and median
MEDIUM Risk (score 3-4):
- Some sensitivity to parameters
- Moderate robustness
- Consider validation on out-of-sample data
HIGH Risk (score 5-7):
- High overfitting risk
- Few near-optimal combinations
- Large performance variance
- Significant gap between best and median
- Action: Use wider parameter ranges or different optimization approach
Robustness Score¶
Monte Carlo robustness score = % of perturbed parameter sets that achieve ≥90% of baseline performance
- ≥70%: Highly robust
- 50-70%: Moderately robust
- <50%: Sensitive to parameter variations
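The score can be sketched with the standard library alone. The functions below are hypothetical stand-ins for monte_carlo_perturbation(): they assume each numeric parameter is scaled by a uniform random factor within the perturbation percentage, that integers are rounded back to integers, and that the baseline performance is positive.

```python
import random


def perturb(params, perturbation_pct, rng):
    """Scale each numeric parameter by a random factor in [1 - pct, 1 + pct]."""
    perturbed = {}
    for name, value in params.items():
        factor = 1.0 + rng.uniform(-perturbation_pct, perturbation_pct)
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            perturbed[name] = value  # leave non-numeric values untouched
        elif isinstance(value, int):
            perturbed[name] = round(value * factor)  # keep ints as ints
        else:
            perturbed[name] = value * factor
    return perturbed


def robustness_score(backtest_func, baseline_params, perturbation_pct=0.1,
                     n_samples=100, seed=0):
    rng = random.Random(seed)
    baseline = backtest_func(baseline_params)
    performances = [
        backtest_func(perturb(baseline_params, perturbation_pct, rng))
        for _ in range(n_samples)
    ]
    # Share of runs reaching at least 90% of baseline (assumes baseline > 0).
    within = sum(1 for p in performances if p >= 0.9 * baseline)
    return within / n_samples, performances


def flat_backtest(params):
    return 1.0  # insensitive to its parameters


def peaky_backtest(params):
    return 1.0 if params["period"] == 20 else 0.0  # only one good setting


baseline = {"period": 20, "threshold": 0.05}
flat_score, _ = robustness_score(flat_backtest, baseline, perturbation_pct=0.2)
peaky_score, _ = robustness_score(peaky_backtest, baseline, perturbation_pct=0.2)
```

The flat strategy passes every perturbed run, while the strategy that only works at exactly period=20 fails most of them, which is the pattern the thresholds above are meant to separate.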
Best Practices¶
1. Start with Wide Ranges¶
Begin with broad parameter ranges to understand the performance landscape:
ranges = [
    ParameterRange(name='period', min_value=5, max_value=100, step=5),
    ParameterRange(name='threshold', min_value=0.001, max_value=0.5, step=20, scale='log')
]
2. Use Multiple Analysis Methods¶
Combine different analyses for comprehensive evaluation:
# Grid search for optimal region
grid_result = analyzer.grid_search(param_ranges)
# Sensitivity analysis at optimal point
sensitivity_results = analyzer.analyze_multiple_parameters(
    grid_result.best_params,
    param_ranges
)
# Monte Carlo for robustness
mc_result = analyzer.monte_carlo_perturbation(
    grid_result.best_params,
    perturbation_pct=0.15
)
# Overfitting detection
overfitting = analyzer.detect_overfitting(grid_result)
3. Set Appropriate Thresholds¶
Adjust overfitting thresholds based on strategy type:
- High-frequency strategies: Lower threshold (0.1-0.15)
- Daily strategies: Medium threshold (0.15-0.25)
- Position strategies: Higher threshold (0.2-0.3)
4. Validate with Walk-Forward¶
Use walk-forward analysis to verify parameter stability over time:
time_periods = [
    ('2020-01', '2020-06'),
    ('2020-06', '2020-12'),
    ('2021-01', '2021-06')
]
wf_result = analyzer.walk_forward_analysis(
    baseline_params,
    param_range,
    time_periods
)
# Flag parameter drift; this check assumes a higher parameter_stability value
# means the optimal parameters vary more between periods
if wf_result['parameter_stability'] > 0.3:
    print("Warning: Parameters not stable over time")
5. Document Your Findings¶
Always document sensitivity analysis results:
# Calculate sensitivity rankings
scores = analyzer.calculate_sensitivity_scores(sensitivity_results)
scores.to_csv('sensitivity_rankings.csv')
# Generate heatmap for visualization
heatmap = analyzer.generate_heatmap_data(param_range1, param_range2)
# Use with matplotlib, seaborn, or plotly
Testing¶
Run the comprehensive test suite:
Test coverage includes:
- ParameterRange validation and value generation
- Grid search with single and multiple parameters
- Sensitivity analysis (single and multiple parameters)
- Overfitting detection
- Monte Carlo perturbation
- Walk-forward analysis
- Edge cases (single values, categorical parameters, etc.)
- Integration tests
Example Output¶
See src/analytics/sensitivity_example.py for complete examples with output.
Run the examples:
Limitations¶
- Computational Cost: Grid search complexity grows exponentially with the number of parameters
  - Use coarse grids initially
  - Refine around promising regions
  - Consider parallel execution (n_jobs > 1)
- Independence Assumption: analyze_multiple_parameters() varies parameters independently
  - Use grid_search() for interaction effects
  - Generate 2D heatmaps for parameter pairs
- Backtest Function: Must be deterministic or use random_state for reproducibility
  - Monte Carlo analysis requires some randomness to be meaningful
- Memory Usage: Large grid searches store all results in memory
  - For very large searches, consider implementing incremental results
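To make the computational cost concrete, the number of backtests in an exhaustive grid is the product of the number of values on each axis. The helper name grid_size below is hypothetical and only illustrates the arithmetic.

```python
import math


def grid_size(values_per_parameter):
    """Number of backtests an exhaustive grid needs: the product of axis lengths."""
    return math.prod(values_per_parameter)


# Ten values per parameter, for a growing number of parameters.
sizes = {n: grid_size([10] * n) for n in (1, 2, 3, 4, 6)}

# A coarse first pass keeps the count small before refining a promising region.
coarse = grid_size([5, 5, 5])
fine = grid_size([20, 20, 20])
```

With ten values per axis, six parameters already require a million backtests, and the three-parameter example shows a coarse pass costing 125 runs against 8000 for the fine grid.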
Future Enhancements¶
Potential additions:
- Bayesian optimization integration
- Parallel coordinate plots
- Automatic parameter range suggestion
- Statistical significance testing
- Interactive visualization exports
License¶
Part of the DSTA project.
Support¶
For issues or questions, contact the DSTA team or file an issue in the project repository.