Intermediate • Week 13 • Lesson 40 • Duration: 35 min

RPT Validation Reporting

Presenting results clearly and honestly — no cherry-picking

Learning Objectives

  • Learn how to create clear, honest validation reports
  • Understand what metrics to include and how to present them
  • Build reports that enable informed deployment decisions

Explain Like I'm 5

A validation report should make it obvious whether to deploy or not. Include the good and the bad. If your strategy has a weakness, report it. Hiding it doesn't make it go away — it just means you'll be surprised when it matters most.

Think of It This Way

A good validation report is like a medical checkup report. It shows everything — blood pressure, cholesterol, good numbers and bad numbers. Hiding the bad numbers doesn't make you healthy. Reporting them honestly lets you make informed decisions.

1. Required Metrics

Every validation report should include the following (a sketch of how some of the performance metrics can be computed follows the list):

Performance metrics:

  • Total return (in R and %)
  • Win rate (with confidence interval)
  • Profit factor
  • Sharpe ratio
  • Number of trades

Risk metrics:

  • Maximum drawdown
  • 95th and 99th percentile drawdown (Monte Carlo)
  • Breach probability
  • Expected shortfall

Validation quality:

  • PBO
  • Walk-forward efficiency ratio
  • Sensitivity analysis results
  • Benchmark comparisons

Deployment recommendation:

  • Clear DEPLOY / DO NOT DEPLOY recommendation
  • Specific conditions under which the recommendation changes
  • Monitoring requirements post-deployment
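Here is a minimal sketch of how a few of these performance metrics might be computed from a list of per-trade R multiples. The function name, the input format, and the normal-approximation confidence interval are illustrative assumptions, not part of the lesson's engine code.

python
import math

def performance_metrics(r_multiples):
    """Compute core performance metrics from a list of per-trade R multiples."""
    n = len(r_multiples)
    wins = [r for r in r_multiples if r > 0]
    losses = [r for r in r_multiples if r <= 0]

    win_rate = len(wins) / n
    # Normal-approximation 95% confidence interval for the win rate
    win_rate_ci = 1.96 * math.sqrt(win_rate * (1 - win_rate) / n)

    gross_profit = sum(wins)
    gross_loss = abs(sum(losses))
    profit_factor = gross_profit / gross_loss if gross_loss else float("inf")

    return {
        "n_trades": n,
        "win_rate": win_rate,
        "win_rate_ci": win_rate_ci,   # report as win_rate ± win_rate_ci
        "total_r": sum(r_multiples),
        "profit_factor": profit_factor,
    }

Whatever helper you use, the point is that every number in the report should come from one function you can re-run, not from numbers copied by hand.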

2. The Report Structure That Works

After writing dozens of these, here's the format that works best:

Page 1: Executive summary. One-line recommendation: DEPLOY or DO NOT DEPLOY. Key metrics: WR, total R, max DD, breach probability. Biggest strength and biggest risk.

Page 2: Performance deep dive. Equity curve (walk-forward OOS). Win rate by regime, by cluster, by time period. Drawdown analysis. R distribution.

Page 3: Validation quality. PBO score and interpretation. Walk-forward window-by-window results. Sensitivity analysis summary. Benchmark comparisons.

Page 4: Risk assessment. Monte Carlo results. Tail risk analysis. Known failure modes. Monitoring plan.

The executive summary is the most important page. If someone has to read 20 pages to figure out whether to deploy, the report has failed its purpose.
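One way to keep the executive summary consistent from report to report is to model it as a small data structure, so the same fields are always present. This is a sketch under that assumption; the class and field names are illustrative, not a prescribed format.

python
from dataclasses import dataclass

@dataclass
class ExecutiveSummary:
    """Page-1 contents: the one-line recommendation plus key metrics."""
    recommendation: str          # "DEPLOY" or "DO NOT DEPLOY"
    win_rate: float
    total_r: float
    max_drawdown: float
    breach_probability: float
    biggest_strength: str
    biggest_risk: str

    def one_liner(self) -> str:
        # The single line a reader should be able to stop at
        return (f"{self.recommendation}: WR {self.win_rate:.1%}, "
                f"{self.total_r:+.1f}R, max DD {self.max_drawdown:.1%}, "
                f"breach prob {self.breach_probability:.1%}")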

3. Honest Reporting

The temptation to cherry-pick results is real. Resist it.

Things people hide in reports (don't do this):

  • The worst-case drawdown scenario
  • The one walk-forward window where the model tanked
  • The regime where the strategy underperforms
  • The benchmark comparison that's close
  • The sensitivity parameter that's borderline fragile

Why honesty matters: you're the one trading this. If you hide a weakness from yourself, you won't prepare for it, and it will show up in live trading.

Better format: "Win rate ranges from 55-63% across windows, with one window at 52% during the 2020 volatility spike. Strategy underperforms in extreme vol regimes but recovers quickly." That's honest and actionable. You now know to watch for extreme vol regimes and potentially reduce size during them. A sketch of generating that kind of per-window statement follows.
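The "better format" sentence above can be produced mechanically from per-window results, which makes it harder to quietly drop the bad window. This sketch assumes a hypothetical list of dicts with "label" and "win_rate" keys; the structure is illustrative.

python
def summarize_window_win_rates(window_results):
    """Summarize per-window win rates honestly: the full range plus the worst window.

    window_results: list of dicts like {"label": "2020-Q1", "win_rate": 0.52}
    (hypothetical structure for this example).
    """
    rates = [w["win_rate"] for w in window_results]
    worst = min(window_results, key=lambda w: w["win_rate"])
    return (f"Win rate ranges from {min(rates):.0%} to {max(rates):.0%} "
            f"across {len(rates)} windows; worst window is "
            f"{worst['label']} at {worst['win_rate']:.0%}.")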

[Figure: In-Sample vs Out-of-Sample Performance Gap]

4. Red Flags in Other People's Reports

When reviewing someone else's validation report (or a vendor's pitch), watch for the following (a sketch of this checklist in code follows the list):

  • Only in-sample results. If they don't show OOS results, assume the worst.
  • No confidence intervals. A point estimate without a CI is meaningless. "59% WR" tells you nothing without "±3%".
  • Missing transaction costs. If the backtest doesn't model slippage and commissions, mentally cut the reported return by 30-50%.
  • Suspiciously low drawdown. If they claim <2% max DD over 5 years, they either used tiny position sizes (boring) or the backtest is wrong.
  • No benchmark comparison. "We make 30% annually" means nothing without context.
  • Vague methodology. "We use advanced AI" without specifics = marketing, not engineering.

90% of trading system pitch decks would fail a proper validation review. Be skeptical by default.
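These checks are easy to encode as a reviewer's checklist. The sketch below assumes a hypothetical dictionary describing the claimed results; every field name here is an assumption made for illustration, not a standard format.

python
def review_red_flags(claim):
    """Return the red flags found in a claimed backtest result (dict of fields)."""
    flags = []
    if not claim.get("out_of_sample"):
        flags.append("Only in-sample results shown")
    if claim.get("win_rate_ci") is None:
        flags.append("No confidence interval on the reported win rate")
    if not claim.get("includes_costs"):
        flags.append("Transaction costs (slippage, commissions) not modeled")
    if claim.get("max_drawdown", 1.0) < 0.02 and claim.get("years", 0) >= 5:
        flags.append("Suspiciously low drawdown for a multi-year backtest")
    if claim.get("benchmark") is None:
        flags.append("No benchmark comparison")
    if not claim.get("methodology"):
        flags.append("Vague or missing methodology")
    return flags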

Hands-On Code

Automated Validation Report

python
def generate_validation_report(results, mc_results, pbo, benchmarks):
    """Generate structured validation report."""
    report = []
    report.append("=" * 60)
    report.append("ENGINE VALIDATION REPORT")
    report.append("=" * 60)
    
    report.append("\n--- PERFORMANCE ---")
    report.append(f"Win Rate: {results['win_rate']:.1%} "
                  f"(n={results['n_trades']})")
    report.append(f"Total R: {results['total_r']:+.1f}")
    report.append(f"Profit Factor: {results['profit_factor']:.2f}")
    
    report.append("\n--- RISK ---")
    report.append(f"Max Drawdown: {results['max_dd']:.2%}")
    report.append(f"95th Pctl DD: {mc_results['p95_dd']:.2%}")
    report.append(f"Breach Prob: {mc_results['breach_prob']:.2%}")
    
    report.append("\n--- VALIDATION ---")
    report.append(f"PBO: {pbo:.3f} "
                  f"{'[PASS]' if pbo < 0.25 else '[FAIL]'}")
    report.append(f"Walk-Forward: "
                  f"{'Yes' if results['walk_forward'] else 'No'}")
    
    deploy = (results['win_rate'] > 0.55
              and mc_results['breach_prob'] < 0.05
              and pbo < 0.25)
    
    report.append("\n--- RECOMMENDATION ---")
    report.append(f"{'DEPLOY' if deploy else 'DO NOT DEPLOY'}")
    return "\n".join(report)
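A quick usage example, with made-up input values whose dictionary keys simply match what the function above reads:

python
# Illustrative inputs only -- the numbers are invented for the example.
results = {
    "win_rate": 0.58, "n_trades": 240, "total_r": 41.5,
    "profit_factor": 1.62, "max_dd": 0.084, "walk_forward": True,
}
mc_results = {"p95_dd": 0.112, "breach_prob": 0.03}

print(generate_validation_report(results, mc_results, pbo=0.18, benchmarks={}))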

Automated reports ensure nothing gets missed. Every deployment decision should be backed by a thorough, honest validation report.

Knowledge Check

Q1. Your strategy passes all validation checks except that PBO is 0.35. Should you deploy?

Assignment

Generate a complete validation report for your strategy. Share it with someone who can review it critically. Does the report provide enough information for an informed deployment decision?