Parameter Sensitivity Analyzer¶

Overview¶

The Parameter Sensitivity Analyzer is a comprehensive tool for evaluating the robustness of backtesting strategy parameters. It helps identify optimal parameter ranges, detect overfitting, and assess strategy stability.

Created for: TASK-RESEARCH-005

Author: DSTA Team
Date: 2026-01-27

Features¶

1. Grid Search¶

Exhaustive search over parameter combinations
Multi-dimensional parameter space exploration
Automatic identification of optimal parameters
Performance surface visualization (2D heatmaps)

2. Sensitivity Analysis¶

Single parameter sensitivity testing
Multi-parameter comparative analysis
Sensitivity score calculation (coefficient of variation)
Performance degradation metrics

3. Overfitting Detection¶

Parameter robustness scoring
Sharp peak detection
Near-optimal region analysis
Risk level classification (LOW/MEDIUM/HIGH)

4. Monte Carlo Perturbation¶

Random parameter perturbation testing
Robustness score calculation
Performance distribution analysis
Type-preserving perturbations (int/float)

5. Walk-Forward Analysis¶

Time-based parameter stability testing
Sequential period evaluation
Parameter consistency metrics

6. Heatmap Generation¶

2D performance surface visualization
Parameter interaction analysis
Ready-to-plot data formats

Installation¶

The module is part of the DSTA analytics package:

from analytics.sensitivity_analyzer import (
    ParameterRange,
    SensitivityAnalyzer,
    SensitivityResult,
    GridSearchResult
)

Quick Start¶

Basic Grid Search¶

from analytics.sensitivity_analyzer import ParameterRange, SensitivityAnalyzer

# Define your backtest function
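def my_backtest(params):
    # Run your backtest with `params` (a dict of parameter values) and
    # compute the metric here. The constant below is only a placeholder.
    sharpe_ratio = 0.0
    return sharpe_ratio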

# Create analyzer
analyzer = SensitivityAnalyzer(
    backtest_func=my_backtest,
    metric_name='sharpe_ratio',
    maximize=True
)

# Define parameter ranges
ranges = [
    ParameterRange(name='period', min_value=10, max_value=50, step=5),
    ParameterRange(name='threshold', min_value=0.01, max_value=0.1, step=0.02)
]

# Run grid search
result = analyzer.grid_search(ranges)
print(f"Best params: {result.best_params}")
print(f"Best performance: {result.best_performance}")

Single Parameter Sensitivity¶

baseline_params = {'period': 20, 'threshold': 0.05}
param_range = ParameterRange(name='period', min_value=10, max_value=30, step=5)

result = analyzer.analyze_single_parameter(baseline_params, param_range)
print(f"Sensitivity score: {result.sensitivity_score:.3f}")
print(f"Overfitted: {result.is_overfitted}")

Monte Carlo Robustness Testing¶

mc_result = analyzer.monte_carlo_perturbation(
    baseline_params={'period': 20, 'threshold': 0.05},
    perturbation_pct=0.2,  # ±20% perturbation
    n_samples=100
)
print(f"Robustness score: {mc_result['robustness_score']:.2%}")

Overfitting Detection¶

# `ranges` is the list of ParameterRange objects defined in the Basic Grid Search example
grid_result = analyzer.grid_search(ranges)
overfitting = analyzer.detect_overfitting(grid_result)
print(f"Risk level: {overfitting['risk_level']}")
print(f"Robustness: {overfitting['robustness']:.2%}")

API Reference¶

ParameterRange¶

Define a search space for a parameter.

Parameters:

- name (str): Parameter name
- min_value (float/int, optional): Minimum value
- max_value (float/int, optional): Maximum value
- step (float/int, optional): Step size for linear ranges, or number of points for log ranges
- values (list, optional): Explicit list of values
- scale (str): 'linear' or 'log' (default: 'linear')

Methods:

- get_values(): Generate list of parameter values to test

Examples:

# Linear range (integers)
ParameterRange(name='period', min_value=10, max_value=50, step=5)
# get_values() returns: [10, 15, 20, 25, 30, 35, 40, 45, 50]

# Logarithmic range (here `step` is the number of points)
ParameterRange(name='alpha', min_value=0.001, max_value=1.0, step=10, scale='log')
# get_values() returns: 10 logarithmically-spaced values

# Explicit values
ParameterRange(name='method', values=['sma', 'ema', 'wma'])
# get_values() returns: ['sma', 'ema', 'wma']
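
To make the linear and log conventions concrete, here is a minimal standalone sketch of how such ranges could be expanded. The helper name `expand_range` is hypothetical and the rounding choices are assumptions; this is not the module's actual get_values() implementation.

```python
import math


def expand_range(min_value, max_value, step, scale="linear"):
    """Hypothetical helper: expand a range spec into a list of test values."""
    if scale == "linear":
        # Number of whole steps that fit between min and max. The small
        # tolerance guards against float rounding in the division.
        n_steps = int(math.floor((max_value - min_value) / step + 1e-9))
        values = [min_value + i * step for i in range(n_steps + 1)]
        # Keep integers as integers when every input is an int.
        if all(isinstance(v, int) for v in (min_value, max_value, step)):
            return values
        return [round(v, 10) for v in values]
    if scale == "log":
        # Here `step` is the number of points, spaced evenly in log10 space.
        n_points = int(step)
        if n_points < 2:
            return [min_value]
        lo, hi = math.log10(min_value), math.log10(max_value)
        return [10 ** (lo + i * (hi - lo) / (n_points - 1)) for i in range(n_points)]
    raise ValueError(f"unknown scale: {scale!r}")


period_values = expand_range(10, 50, 5)
alpha_values = expand_range(0.001, 1.0, 10, scale="log")
print(period_values)
print(len(alpha_values))
```

Note that with a linear range whose span is not a multiple of the step, this sketch stops at the last value that does not exceed max_value.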

SensitivityAnalyzer¶

Main class for parameter sensitivity analysis.

Constructor Parameters:

- backtest_func (callable): Function that takes params dict and returns performance metric
- metric_name (str): Name of the performance metric (default: 'sharpe_ratio')
- maximize (bool): Whether to maximize the metric (default: True)
- n_jobs (int): Number of parallel workers (default: 1)
- random_state (int, optional): Random seed for reproducibility

Methods:

`grid_search(param_ranges, base_params=None)`¶

Perform exhaustive grid search over parameter ranges.

Returns: GridSearchResult with:

- parameter_combinations: All tested combinations
- performances: Performance for each combination
- best_params: Optimal parameter set
- best_performance: Best performance achieved
- performance_surface: DataFrame (for 2D grids)
- heatmap_data: Data for visualization
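
As an illustration of what an exhaustive search does, the following standalone sketch evaluates every combination with `itertools.product` and picks the best one. The function `grid_search_sketch`, the plain-dict grid format, and `toy_backtest` are assumptions made for this example; the real method takes ParameterRange objects and returns a GridSearchResult.

```python
import itertools


def grid_search_sketch(backtest_func, grid, maximize=True):
    """Evaluate every combination in `grid` (dict of name -> list of values)."""
    names = list(grid)
    combinations = [
        dict(zip(names, combo)) for combo in itertools.product(*grid.values())
    ]
    performances = [backtest_func(params) for params in combinations]
    # Pick the index of the best score, respecting the optimization direction.
    pick = max if maximize else min
    best_index = pick(range(len(performances)), key=performances.__getitem__)
    return {
        "parameter_combinations": combinations,
        "performances": performances,
        "best_params": combinations[best_index],
        "best_performance": performances[best_index],
    }


def toy_backtest(params):
    # Toy metric with a single peak at period=20, threshold=0.05.
    return 2.0 - abs(params["period"] - 20) / 10 - abs(params["threshold"] - 0.05) * 10


result = grid_search_sketch(
    toy_backtest,
    {"period": [10, 15, 20, 25, 30], "threshold": [0.01, 0.03, 0.05, 0.07, 0.09]},
)
print(result["best_params"], result["best_performance"])
```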

`analyze_single_parameter(baseline_params, param_range, overfitting_threshold=0.2)`¶

Analyze sensitivity to a single parameter.

Returns: SensitivityResult with:

- parameter_name: Name of analyzed parameter
- baseline_value: Original value
- baseline_performance: Performance at baseline
- test_values: Values tested
- performances: Performance at each value
- sensitivity_score: Coefficient of variation
- degradation_pct: Performance degradation (%)
- is_overfitted: Overfitting flag
- confidence_interval: 95% CI
- statistics: Additional metrics

`analyze_multiple_parameters(baseline_params, param_ranges, overfitting_threshold=0.2)`¶

Analyze sensitivity to multiple parameters independently.

Returns: Dictionary mapping parameter names to SensitivityResult objects

`monte_carlo_perturbation(baseline_params, perturbation_pct=0.1, n_samples=100, param_names=None)`¶

Perform Monte Carlo parameter perturbation analysis.

Returns: Dictionary with:

- baseline_performance: Baseline metric value
- mean_performance: Mean across perturbations
- std_performance: Standard deviation
- degradation_pct: Performance degradation
- robustness_score: Share of runs that achieve at least 90% of baseline performance
- perturbed_params: List of perturbed parameter sets
- performances: List of performance values
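
The perturbation idea can be sketched in a few lines of standalone Python. Everything here is an assumption for illustration: the helper names, the uniform multiplicative noise, the `seed` argument, and rounding as the way to keep integer parameters integral. The library may sample differently.

```python
import random


def perturb(params, perturbation_pct, rng):
    """Scale each numeric parameter by a random factor in [1 - pct, 1 + pct]."""
    perturbed = {}
    for name, value in params.items():
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            perturbed[name] = value  # leave non-numeric values untouched
            continue
        new_value = value * (1 + rng.uniform(-perturbation_pct, perturbation_pct))
        # Preserve the original type: ints stay ints, floats stay floats.
        perturbed[name] = round(new_value) if isinstance(value, int) else new_value
    return perturbed


def monte_carlo_sketch(backtest_func, baseline_params, perturbation_pct=0.1,
                       n_samples=100, seed=0):
    rng = random.Random(seed)
    baseline = backtest_func(baseline_params)
    samples = [perturb(baseline_params, perturbation_pct, rng) for _ in range(n_samples)]
    performances = [backtest_func(p) for p in samples]
    # Share of runs reaching at least 90% of baseline (assumes baseline > 0).
    robust = sum(perf >= 0.9 * baseline for perf in performances) / n_samples
    return {
        "baseline_performance": baseline,
        "mean_performance": sum(performances) / n_samples,
        "robustness_score": robust,
        "perturbed_params": samples,
        "performances": performances,
    }


def toy_backtest(params):
    # Toy metric with a single peak at period=20, threshold=0.05.
    return 2.0 - abs(params["period"] - 20) / 10 - abs(params["threshold"] - 0.05) * 10


mc = monte_carlo_sketch(
    toy_backtest, {"period": 20, "threshold": 0.05}, perturbation_pct=0.2
)
print(f"Robustness score: {mc['robustness_score']:.2%}")
```

The 90% comparison only makes sense for a positive baseline on a maximized metric; a negative or minimized metric would need a different comparison.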

`walk_forward_analysis(baseline_params, param_range, time_periods, refit_frequency=1)`¶

Perform walk-forward parameter analysis across time periods.

Returns: Dictionary with:

- period_performances: Performance for each period
- optimal_params_per_period: Optimal parameters per period
- parameter_stability: Parameter consistency metric
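
The following standalone sketch shows one way a walk-forward pass could work: re-optimize a single parameter in each period and measure how much the winning value moves. Several things are assumptions here and are not taken from the module: the backtest receiving the period as a second argument, the helper names, and the stability metric itself (coefficient of variation of the per-period optima, where 0.0 means no drift).

```python
import statistics


def walk_forward_sketch(backtest_func, baseline_params, param_name,
                        candidate_values, time_periods):
    """For each period, find the best value of one parameter, then measure drift."""
    period_performances = []
    optimal_values = []
    for period in time_periods:
        scored = [
            (backtest_func({**baseline_params, param_name: value}, period), value)
            for value in candidate_values
        ]
        best_perf, best_value = max(scored, key=lambda pair: pair[0])
        period_performances.append(best_perf)
        optimal_values.append(best_value)
    # Assumed metric: coefficient of variation of the per-period optima.
    # 0.0 means the same value won in every period; larger means more drift.
    mean_opt = statistics.mean(optimal_values)
    drift = statistics.pstdev(optimal_values) / abs(mean_opt) if mean_opt else float("inf")
    return {
        "period_performances": period_performances,
        "optimal_params_per_period": optimal_values,
        "parameter_stability": drift,
    }


# Toy setup: the best `period` value shifts from 20 to 30 in the last window.
PEAKS = {
    ("2020-01", "2020-06"): 20,
    ("2020-07", "2020-12"): 20,
    ("2021-01", "2021-06"): 30,
}


def toy_backtest(params, period):
    return 2.0 - abs(params["period"] - PEAKS[period]) / 10


wf = walk_forward_sketch(
    toy_backtest, {"period": 20}, "period", [10, 15, 20, 25, 30], list(PEAKS)
)
print(wf["optimal_params_per_period"], round(wf["parameter_stability"], 3))
```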

`calculate_sensitivity_scores(sensitivity_results)`¶

Calculate and rank sensitivity scores for multiple parameters.

Returns: DataFrame with parameters ranked by sensitivity

`generate_heatmap_data(param_range1, param_range2, base_params=None)`¶

Generate 2D heatmap data for two parameters.

Returns: Dictionary with:

- values: 2D array of performances
- x_labels: X-axis labels
- y_labels: Y-axis labels
- x_param: X parameter name
- y_param: Y parameter name
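
A rough standalone sketch of how such a dictionary could be assembled is shown below. The function name, the plain lists of values, and the row/column orientation (rows follow the y parameter, columns follow the x parameter) are assumptions for illustration, not documented behavior.

```python
def heatmap_sketch(backtest_func, x_name, x_values, y_name, y_values, base_params=None):
    """Evaluate a metric on a 2D grid; rows follow y_values, columns follow x_values."""
    base = dict(base_params or {})
    values = [
        [backtest_func({**base, x_name: x, y_name: y}) for x in x_values]
        for y in y_values
    ]
    return {
        "values": values,
        "x_labels": [str(x) for x in x_values],
        "y_labels": [str(y) for y in y_values],
        "x_param": x_name,
        "y_param": y_name,
    }


def toy_backtest(params):
    # Toy metric with a single peak at period=20, threshold=0.05.
    return 2.0 - abs(params["period"] - 20) / 10 - abs(params["threshold"] - 0.05) * 10


heatmap = heatmap_sketch(
    toy_backtest, "period", [10, 20, 30], "threshold", [0.01, 0.05, 0.09]
)

# Print the surface as a small text table.
print(f"{heatmap['y_param']} / {heatmap['x_param']}", *heatmap["x_labels"], sep="\t")
for label, row in zip(heatmap["y_labels"], heatmap["values"]):
    print(label, *(f"{v:.2f}" for v in row), sep="\t")
```

A structure like this maps directly onto the matrix-plus-tick-labels input that most plotting libraries expect for heatmaps.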

`detect_overfitting(grid_result, threshold_percentile=90)`¶

Detect overfitting indicators in grid search results.

Returns: Dictionary with:

- risk_level: 'LOW', 'MEDIUM', or 'HIGH'
- risk_score: Numeric risk score (0-7)
- robustness: Fraction of near-optimal combinations
- coefficient_of_variation: Performance CV
- performance_gap_pct: Gap between best and median
- Additional statistics

Interpretation Guide¶

Sensitivity Score¶

The sensitivity score is the coefficient of variation (CV) of performance across parameter values:

CV < 0.1: Low sensitivity (robust parameter)
0.1 ≤ CV < 0.3: Moderate sensitivity
CV ≥ 0.3: High sensitivity (parameter requires careful tuning)
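
As a worked example of these bands, the sketch below computes a coefficient of variation and labels it. Using the population standard deviation is an assumption (the module may use the sample version), and the function names are made up for this example.

```python
import statistics


def sensitivity_score(performances):
    """Coefficient of variation: standard deviation divided by |mean|."""
    mean = statistics.mean(performances)
    if mean == 0:
        return float("inf")  # CV is undefined when the mean is zero
    return statistics.pstdev(performances) / abs(mean)


def sensitivity_label(cv):
    """Map a CV onto the bands from the interpretation guide."""
    if cv < 0.1:
        return "low"
    if cv < 0.3:
        return "moderate"
    return "high"


flat = [1.00, 1.02, 0.98, 1.01, 0.99]      # barely reacts to the parameter
sloped = [1.0, 1.2, 0.8, 1.1, 0.9]         # noticeable but bounded variation
peaked = [0.2, 0.5, 1.5, 0.5, 0.2]         # sharp peak at one value

for name, series in [("flat", flat), ("sloped", sloped), ("peaked", peaked)]:
    cv = sensitivity_score(series)
    print(f"{name}: CV={cv:.3f} -> {sensitivity_label(cv)}")
```

Because the CV divides by the mean, it becomes unstable when average performance is close to zero, so scores for near-zero metrics deserve extra care.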

Overfitting Indicators¶

A parameter is flagged as overfitted if:

- Performance degrades significantly (>20% by default) away from optimal
- Robustness score is low (<20% of combinations near optimal)
- Sharp performance peak with rapid degradation

Risk Levels¶

LOW Risk (score 0-2):

- Robust parameter set
- Many near-optimal combinations
- Low performance variance
- Small gap between best and median

MEDIUM Risk (score 3-4):

- Some sensitivity to parameters
- Moderate robustness
- Consider validation on out-of-sample data

HIGH Risk (score 5-7):

- High overfitting risk
- Few near-optimal combinations
- Large performance variance
- Significant gap between best and median
- Action: Use wider parameter ranges or different optimization approach
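
The indicators behind these levels can be approximated with a few standalone helpers. The score bands (0-2, 3-4, 5-7) come from the guide above; the 10% tolerance used to define "near-optimal" and the helper names are assumptions, and the way the library combines indicators into the 0-7 score is not reproduced here.

```python
import statistics


def near_optimal_fraction(performances, tolerance=0.10):
    """Share of combinations within `tolerance` of the best (maximize, best > 0)."""
    best = max(performances)
    return sum(p >= (1 - tolerance) * best for p in performances) / len(performances)


def best_median_gap_pct(performances):
    """Gap between the best and the median result, as a percentage of the best."""
    best = max(performances)
    return (best - statistics.median(performances)) / abs(best) * 100


def risk_level(risk_score):
    """Map a 0-7 risk score onto the documented LOW/MEDIUM/HIGH bands."""
    if not 0 <= risk_score <= 7:
        raise ValueError("risk_score must be between 0 and 7")
    if risk_score <= 2:
        return "LOW"
    if risk_score <= 4:
        return "MEDIUM"
    return "HIGH"


plateau = [1.9, 2.0, 1.95, 1.85, 1.9, 1.2]  # broad region of good results
spike = [0.3, 0.4, 2.0, 0.5, 0.3, 0.2]      # one isolated winner

for name, perf in [("plateau", plateau), ("spike", spike)]:
    print(
        f"{name}: near-optimal={near_optimal_fraction(perf):.0%}, "
        f"gap={best_median_gap_pct(perf):.1f}%"
    )
print(risk_level(1), risk_level(4), risk_level(6))
```

A surface like `spike`, where a single combination dominates and the median is far below the best, is the pattern the HIGH risk level is meant to catch.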

Robustness Score¶

Monte Carlo robustness score = % of perturbed parameter sets that achieve ≥90% of baseline performance

≥70%: Highly robust
50-70%: Moderately robust
<50%: Sensitive to parameter variations

Best Practices¶

1. Start with Wide Ranges¶

Begin with broad parameter ranges to understand the performance landscape:

ranges = [
    ParameterRange(name='period', min_value=5, max_value=100, step=5),
    ParameterRange(name='threshold', min_value=0.001, max_value=0.5, step=20, scale='log')
]

2. Use Multiple Analysis Methods¶

Combine different analyses for comprehensive evaluation:

# Grid search for optimal region
# (param_ranges is a list of ParameterRange objects, such as `ranges` from step 1)
grid_result = analyzer.grid_search(param_ranges)

# Sensitivity analysis at optimal point
sensitivity_results = analyzer.analyze_multiple_parameters(
    grid_result.best_params,
    param_ranges
)

# Monte Carlo for robustness
mc_result = analyzer.monte_carlo_perturbation(
    grid_result.best_params,
    perturbation_pct=0.15
)

# Overfitting detection
overfitting = analyzer.detect_overfitting(grid_result)

3. Set Appropriate Thresholds¶

Adjust overfitting thresholds based on strategy type:

High-frequency strategies: Lower threshold (0.1-0.15)
Daily strategies: Medium threshold (0.15-0.25)
Position strategies: Higher threshold (0.2-0.3)

4. Validate with Walk-Forward¶

Use walk-forward analysis to verify parameter stability over time:

# Consecutive, non-overlapping (start, end) windows
time_periods = [
    ('2020-01', '2020-06'),
    ('2020-07', '2020-12'),
    ('2021-01', '2021-06')
]

wf_result = analyzer.walk_forward_analysis(
    baseline_params,
    param_range,
    time_periods
)

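# Flag parameter drift. This check assumes parameter_stability grows as the
# per-period optimal parameters diverge; invert it if the metric measures consistency.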
if wf_result['parameter_stability'] > 0.3:
    print("Warning: Parameters not stable over time")

5. Document Your Findings¶

Always document sensitivity analysis results:

# Calculate sensitivity rankings
scores = analyzer.calculate_sensitivity_scores(sensitivity_results)
scores.to_csv('sensitivity_rankings.csv')

# Generate heatmap for visualization
heatmap = analyzer.generate_heatmap_data(param_range1, param_range2)
# Use with matplotlib, seaborn, or plotly

Testing¶

Run the comprehensive test suite:

pytest tests/analytics/test_sensitivity.py -v

Test coverage includes:

- ParameterRange validation and value generation
- Grid search with single and multiple parameters
- Sensitivity analysis (single and multiple parameters)
- Overfitting detection
- Monte Carlo perturbation
- Walk-forward analysis
- Edge cases (single values, categorical parameters, etc.)
- Integration tests

Example Output¶

See src/analytics/sensitivity_example.py for complete examples with output.

Run the examples:

PYTHONPATH=src python src/analytics/sensitivity_example.py

Limitations¶

- Computational Cost: Grid search cost grows exponentially with the number of parameters
  - Use coarse grids initially
  - Refine around promising regions
  - Consider parallel execution (n_jobs > 1)
- Independence Assumption: analyze_multiple_parameters() varies parameters independently
  - Use grid_search() for interaction effects
  - Generate 2D heatmaps for parameter pairs
- Backtest Function: Must be deterministic, or use random_state for reproducibility
  - Monte Carlo analysis relies on random perturbations, so its results vary between runs unless the seed is fixed
- Memory Usage: Large grid searches store all results in memory
  - For very large searches, consider storing results incrementally
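
To see how quickly the grid grows, the short sketch below counts combinations before anything is run. The helper `grid_size` and the dict-of-lists grid format are illustrative assumptions, not part of the module.

```python
import math


def grid_size(grid):
    """Number of backtests an exhaustive search over `grid` would run."""
    return math.prod(len(values) for values in grid.values())


# Ten values per parameter: each extra parameter multiplies the cost by ten.
sizes = []
for n_params in range(1, 6):
    grid = {f"param_{i}": list(range(10)) for i in range(n_params)}
    sizes.append(grid_size(grid))
    print(f"{n_params} parameter(s): {sizes[-1]} combinations")

# A coarse first pass is far cheaper than a fine one over the same span.
coarse = {"period": list(range(5, 101, 20)), "lookback": list(range(10, 201, 40))}
fine = {"period": list(range(5, 101, 5)), "lookback": list(range(10, 201, 10))}
print(grid_size(coarse), grid_size(fine))
```

Checking the count first makes it easy to decide whether to coarsen the grid or raise n_jobs before committing to a long run.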

Future Enhancements¶

Potential additions:

- Bayesian optimization integration
- Parallel coordinate plots
- Automatic parameter range suggestion
- Statistical significance testing
- Interactive visualization exports

License¶

Part of the DSTA project.

Support¶

For issues or questions, contact the DSTA team or file an issue in the project repository.