← Back to Learn
II IntermediateWeek 13 • Lesson 39Duration: 40 min

DOC Documentation Standards

If it's not documented, it doesn't exist

Learning Objectives

  • Learn documentation standards for trading strategies and models
  • Understand what to document and why
  • Build documentation habits that save you from future headaches

Explain Like I'm 5

You will forget why you made a decision 3 months from now. Document everything: what the model does, why you chose these parameters, what the validation results were, and what assumptions you're making. Future you will be grateful. Past me has burned current me too many times by not writing things down.

Think of It This Way

Documentation is like leaving yourself notes for a scavenger hunt. Without notes, you'll be lost retracing your steps. With good notes, you can quickly understand past decisions and build on them instead of re-deriving everything from scratch.

1What to Document

Model documentation: - Architecture description (layers, parameters, activation functions) - Feature list with descriptions and data sources - Training procedure (hyperparameters, validation method, stopping criteria) - Performance metrics (IS and OOS, with dates) - Known limitations and failure modes - Version history Strategy documentation: - Entry and exit rules (complete specification) - Risk management parameters with rationale - Prop firm compliance measures - Backtesting methodology and results - PBO and Monte Carlo validation Operational documentation: - Deployment procedure - Monitoring checklist - Escalation procedures - Disaster recovery plan A good production system has: frozen config JSON for complete parameter specification, baseline results JSON for official validation, model manifest for hash-verified model versions, and architecture overview explaining how everything fits together.

2The Decision Log

This is the most underrated piece of documentation. Write down WHY you made each major decision. Not just "set max_depth = 6." But "set max_depth = 6 because sensitivity analysis showed the performance plateau spans depth 4-8, and 6 minimizes overfit risk while capturing enough complexity." Template for decision log entries: - Date: when was this decided? - Decision: what was decided? - Alternatives considered: what else did you try? - Evidence: what data or analysis supported the decision? - Risks: what could go wrong? - Review date: when should this decision be reconsidered? Six months from now, when you're wondering "why on earth did I set this parameter to 6?", you can look it up in 30 seconds instead of spending 2 hours re-deriving it.

3Version Control for Models

Code has git. What do models have? You need a model versioning system. Minimum viable model versioning: - Every model file gets an MD5 hash at deployment - Hash stored in a manifest JSON alongside performance metrics - Before deployment: new hash vs. manifest hash = verification - Rollback: keep previous model version, switch back if needed What to version: - Model weights/files (the .pkl, .pt files) - Training config (hyperparameters, feature list) - Training data snapshot (or hash of training data) - Validation results at time of training The number of times "which model is actually deployed right now?" has caused a crisis in production trading is too many. Hash everything. Verify everything. Trust nothing.

4Documentation Debt Is Real

Just like technical debt, documentation debt compounds. Week 1: you skip documenting one design decision. No big deal. Month 3: you can't remember why feature X was excluded. You spend 4 hours investigating. Month 6: a bug appears. Is it in the model or the data pipeline? No docs say which model version is deployed. 8 hours lost. Month 12: you want to improve the system but can't understand your own code. You rewrite from scratch. The 15-minute rule: spend 15 minutes documenting after every significant change. That's it. 15 minutes now saves hours later. What counts as "significant": - Any model change - Any config change - Any new feature added or removed - Any bug fix that changes behavior - Any deployment Write it down. Every single time.

Hands-On Code

Model Documentation Template

python
MODEL_DOC = {
    "name": "L1 XGBoost Signal Model",
    "version": "7.0.43",
    "created": "2026-01-15",
    
    "architecture": {
        "type": "XGBoost Classifier",
        "n_estimators": 200,
        "max_depth": 6,
        "learning_rate": 0.03,
        "features": 38,
    },
    
    "training": {
        "method": "Walk-forward, 12mo train / 3mo test",
        "data_period": "2018-01 to 2026-01",
        "total_trades": 4505,
    },
    
    "performance": {
        "oos_win_rate": 0.592,
        "total_r": 533.9,
        "max_drawdown": 0.0149,
        "pbo": 0.112,
        "mc_breach_prob": 0.0008,
    },
    
    "limitations": [
        "Recent data density higher than 2018-2023",
        "Entry gate has high SKIP rate (60%+)",
        "Performance varies by market regime",
    ],
    
    "frozen": True,
    "hash": "md5:abc123...",
}

Complete documentation captured in a structured format. Every important decision is recorded. Future you — or another developer — can understand the entire system from this document.

Knowledge Check

Q1.You find an old model with great performance but no documentation. Should you deploy it?

Assignment

Create a complete documentation package for your trading strategy using the template above. Include: model architecture, training procedure, validation results, limitations, and operational procedures.