III Advanced • Week 30 • Lesson 82 • Duration: 35 min

MLOps & Deployment

The full lifecycle — from training to production to retraining

Learning Objectives

  • Understand MLOps principles for trading systems
  • Learn model retraining pipelines and schedules
  • Build CI/CD for model deployment

Explain Like I'm 5

MLOps = DevOps for machine learning. It covers the full lifecycle: train model -> validate -> deploy -> monitor -> detect degradation -> retrain. For a trading system, this means having a systematic process for updating models without breaking live trading.

Think of It This Way

MLOps is like car maintenance. You don't just build a car and drive it forever. You need regular oil changes (model retraining), inspections (validation), and occasionally replacement parts (new model versions). Skipping maintenance leads to breakdowns.

1. The ML Lifecycle for Trading

1. Training: train models on historical data.
   - V7: retrain monthly using a walk-forward approach.
   - Use the most recent data, but validate on a held-out period.
2. Validation: thorough testing before deployment.
   - Walk-forward IC
   - PBO check (< 0.50)
   - Monte Carlo breach probability (< 5%)
   - Sensitivity analysis
   - Comparison to the current production model
3. Shadow deployment: run the new model alongside production (see the sketch after this list).
   - Both models receive the same data.
   - Only the production model's signals are executed.
   - Compare predictions and hypothetical performance.
4. Gradual rollout: proceed only if shadow testing passes.
   - Deploy to one account first.
   - Monitor for 2 weeks.
   - If performance meets expectations, deploy to all accounts.
5. Monitoring: continuous performance tracking.
   - Rolling IC, win rate, drawdown
   - Feature drift detection
   - Prediction distribution monitoring
6. Trigger retraining: when monitoring detects degradation.
   - IC drops below threshold
   - Feature distributions shift significantly
   - Scheduled monthly retrain
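
Steps 3 and 5 lend themselves to a small side-by-side monitor. Below is a minimal sketch, assuming Spearman rank IC as the comparison metric and a two-sample KS test for feature drift; the ShadowComparator name, the 60-observation window, and the alpha threshold are illustrative choices, not part of any specific production system.

python
import numpy as np
from scipy.stats import spearmanr, ks_2samp

class ShadowComparator:
    """Run a candidate model in shadow: both models score the same
    data, but only the production model's signals are executed."""

    def __init__(self, window=60):
        self.window = window  # lookback for rolling comparisons
        self.prod_preds, self.shadow_preds, self.realized = [], [], []

    def record(self, prod_pred, shadow_pred, realized_return):
        """Log one observation: both predictions plus the realized return."""
        self.prod_preds.append(prod_pred)
        self.shadow_preds.append(shadow_pred)
        self.realized.append(realized_return)

    def rolling_ic(self):
        """Rank IC of each model over the most recent window."""
        if len(self.realized) < self.window:
            return None  # not enough history yet
        r = self.realized[-self.window:]
        prod_ic, _ = spearmanr(self.prod_preds[-self.window:], r)
        shadow_ic, _ = spearmanr(self.shadow_preds[-self.window:], r)
        return {'production_ic': prod_ic, 'shadow_ic': shadow_ic}

    @staticmethod
    def feature_drift(train_values, live_values, alpha=0.01):
        """Two-sample KS test: has a feature's live distribution
        shifted away from its training distribution?"""
        stat, p_value = ks_2samp(train_values, live_values)
        return {'statistic': stat, 'p_value': p_value, 'drifted': p_value < alpha}

The same comparator serves step 6: a shadow IC that consistently beats production IC, or a drifted feature, are both concrete retraining triggers.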

2. V7 Model Update Process

The V7 frozen model update process:

Current state: models are FROZEN (hash-verified). No ad-hoc changes.

Update procedure:

1. Train a new model version on expanded data.
2. Hash the new model files (see the sketch after this list).
3. Run the FULL validation suite:
   - Baseline backtest: WR > 57%, profit factor > 1.5
   - Walk-forward: WFER > 0.85
   - PBO: < 0.15
   - Monte Carlo: breach probability < 3%
4. Shadow test for a minimum of 2 weeks.
5. Update V7_FROZEN_MODELS_MANIFEST.json with the new hashes.
6. Deploy with the new config.
7. Monitor for 30 days.

If ANY validation step fails -> reject the update and keep the current model. This is conservative by design: a slightly worse model deployed hastily can lose real money, so the bar for updates should be HIGH.
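
Steps 2 and 5 both revolve around hashing. Here is a minimal sketch of the verification side, assuming V7_FROZEN_MODELS_MANIFEST.json maps model file paths to SHA-256 hex digests (the real schema may differ); sha256_of and verify_manifest are illustrative helper names:

python
import hashlib
import json
from pathlib import Path

def sha256_of(path, chunk_size=1 << 20):
    """Stream a file through SHA-256 so large model files never sit in memory."""
    h = hashlib.sha256()
    with open(path, 'rb') as f:
        for chunk in iter(lambda: f.read(chunk_size), b''):
            h.update(chunk)
    return h.hexdigest()

def verify_manifest(manifest_path):
    """Compare on-disk model hashes against the frozen manifest.
    Assumed schema: {"models/v7_classifier.pkl": "ab12..."}."""
    manifest = json.loads(Path(manifest_path).read_text())
    mismatches = {}
    for rel_path, expected in manifest.items():
        actual = sha256_of(rel_path)
        if actual != expected:
            mismatches[rel_path] = {'expected': expected, 'actual': actual}
    return mismatches  # empty dict -> every model file is intact

Running verify_manifest at process startup turns "models are FROZEN" from a policy into a check: if any hash mismatches, refuse to trade.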

Hands-On Code

Model Retraining Pipeline

python
import hashlib
import json
from datetime import datetime

class ModelRetrainPipeline:
    """Systematic model retraining with validation gates."""
    
    def __init__(self, current_manifest_path):
        with open(current_manifest_path) as f:
            self.manifest = json.load(f)
    
    def validate_new_model(self, model, test_data):
        """Run all validation gates."""
        results = {}
        
        # Gate 1: Baseline performance
        wr = self._compute_win_rate(model, test_data)
        results['win_rate'] = {'value': wr, 'threshold': 0.57, 'pass': wr > 0.57}
        
        # Gate 2: Walk-forward
        wfer = self._walk_forward(model, test_data)
        results['wfer'] = {'value': wfer, 'threshold': 0.85, 'pass': wfer > 0.85}
        
        # Gate 3: PBO
        pbo = self._compute_pbo(model, test_data)
        results['pbo'] = {'value': pbo, 'threshold': 0.15, 'pass': pbo < 0.15}
        
        # Gate 4: Monte Carlo
        breach = self._monte_carlo_breach(model, test_data)
        results['breach_prob'] = {'value': breach, 'threshold': 0.03, 'pass': breach < 0.03}
        
        all_pass = all(r['pass'] for r in results.values())
        
        print(f"=== VALIDATION RESULTS ===")
        for name, r in results.items():
            status = "[PASS]" if r['pass'] else "[FAIL]"
            print(f"  {status} {name}: {r['value']:.4f} (threshold: {r['threshold']})")
        
        print(f"\nOverall: {'[PASS] APPROVED for shadow testing' if all_pass else '[FAIL] REJECTED'}")
        
        return all_pass, results
    
    def _compute_win_rate(self, model, data):
        # Placeholder: swap in a real out-of-sample win-rate computation.
        return 0.592
    
    def _walk_forward(self, model, data):
        # Placeholder: swap in a walk-forward efficiency ratio (WFER) calculation.
        return 0.91
    
    def _compute_pbo(self, model, data):
        # Placeholder: swap in a probability-of-backtest-overfitting estimate.
        return 0.112
    
    def _monte_carlo_breach(self, model, data):
        # Placeholder: swap in a Monte Carlo drawdown-breach simulation.
        return 0.008

Every model update must pass ALL validation gates. One failure = rejection. This prevents deploying degraded models that could lose real money in production.
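
A minimal driver for the pipeline above, assuming V7_FROZEN_MODELS_MANIFEST.json exists on disk; with the placeholder metric methods, the model and test_data arguments can be anything:

python
# Hypothetical usage -- the stubbed metrics ignore their arguments.
pipeline = ModelRetrainPipeline('V7_FROZEN_MODELS_MANIFEST.json')
approved, results = pipeline.validate_new_model(model=None, test_data=None)

if approved:
    print("Proceed to shadow testing (2 weeks minimum).")
else:
    print("Update rejected: keep the current frozen model.")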

Knowledge Check

Q1. Your new model has a better WR (61%) but a higher PBO (0.25). Should you deploy it?

Assignment

Build a model retraining pipeline with all 4 validation gates (WR, walk-forward, PBO, Monte Carlo). Run it on your current model. Does the current frozen model pass all gates?