SHIP ML Model Deployment
From .pkl files to live predictions — shipping your models is the hard part
Learning Objectives
- Deploy ML models for real-time inference
- Handle model versioning and rollback
- Monitor model performance in production
Explain Like I'm 5
Here's the dirty secret of machine learning: training a model is about 20% of the work. Deploying it so it actually runs on live data, correctly, reliably, 24/7 — that's the other 80%. Most ML tutorials end at "model.fit()" and leave you completely unprepared for deployment. Deployment means: loading the right model version, feeding it correctly formatted features, getting predictions at the right time, and monitoring for degradation. It's a fundamentally different skillset from training.
Think of It This Way
Training a model is like building a prototype car in the garage. Deploying it is like mass-producing that car so it runs reliably on real roads, in all weather, with real drivers. Your prototype worked great in the garage — but does it handle rain? Potholes? 100K miles? That's the difference between training and deployment.
1. Model Deployment in Practice
2. Model Versioning and Rollback
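One minimal way to make versioning and rollback concrete is a small registry that maps version tags to model files and their hashes, with an "active" pointer that production reads at load time. This is only a sketch; the ModelRegistry class, the registry.json file, and the tag names are illustrative, not part of any specific library.

import json
from pathlib import Path

class ModelRegistry:
    """Toy registry mapping version tags to model artifacts, with an 'active' pointer."""

    def __init__(self, registry_file='registry.json'):
        self.path = Path(registry_file)
        if self.path.exists():
            self.state = json.loads(self.path.read_text())
        else:
            self.state = {'active': None, 'versions': {}}

    def register(self, version, model_path, expected_hash):
        """Record a new model version without activating it."""
        self.state['versions'][version] = {'path': model_path, 'hash': expected_hash}
        self._save()

    def activate(self, version):
        """Point production at a version. Rollback is just activating an older tag."""
        if version not in self.state['versions']:
            raise KeyError(f"Unknown model version: {version}")
        self.state['active'] = version
        self._save()

    def active_model(self):
        """Return (path, expected_hash) for the currently active version."""
        entry = self.state['versions'][self.state['active']]
        return entry['path'], entry['hash']

    def _save(self):
        self.path.write_text(json.dumps(self.state, indent=2))

The (path, hash) pair from active_model() plugs straight into the hash-verified loader in the Hands-On Code below, and rolling back after a bad deploy is a single activate() call on the previous tag.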
3. Monitoring — Your Model Will Degrade
[Chart: Model Signal Quality Degradation Over Time (Rolling IC)]
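The chart above tracks a rolling information coefficient (IC). Assuming you log each prediction next to the realized return it was trying to forecast, a minimal rolling rank-IC computation could look like the sketch below; the 60-observation window is an arbitrary choice, not a prescription from the course.

import pandas as pd
from scipy.stats import spearmanr

def rolling_ic(predictions, realized_returns, window=60):
    """Rolling rank IC: Spearman correlation of predictions vs. realized forward returns."""
    preds = pd.Series(predictions).reset_index(drop=True)
    rets = pd.Series(realized_returns).reset_index(drop=True)
    values = []
    for end in range(window, len(preds) + 1):
        rho, _ = spearmanr(preds.iloc[end - window:end], rets.iloc[end - window:end])
        values.append(rho)
    # Each IC value is stamped at the observation that closes its window
    return pd.Series(values, index=range(window - 1, len(preds)))

An alert rule can be as blunt as "rolling IC below half its backtest average for several consecutive windows"; what matters is that the threshold is fixed before the model degrades, not after.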
4. The Feature-Training Mismatch Problem
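A common guard against training/serving mismatch is to freeze the feature schema (names, order, rough value ranges) at training time and validate every live feature vector against it before prediction. The sketch below assumes that approach; the feature names and ranges are made up for illustration.

import numpy as np

# Feature schema frozen at training time: name, plausible low, plausible high.
FEATURE_SCHEMA = [
    ('momentum_20d',   -0.50, 0.50),
    ('volatility_10d',  0.00, 0.20),
    ('volume_zscore',  -5.00, 5.00),
]

def validate_features(features):
    """Reject live feature vectors that don't match the training-time schema."""
    if len(features) != len(FEATURE_SCHEMA):
        raise ValueError(f"Expected {len(FEATURE_SCHEMA)} features, got {len(features)}")
    for value, (name, lo, hi) in zip(features, FEATURE_SCHEMA):
        if not np.isfinite(value):
            raise ValueError(f"Feature '{name}' is NaN or inf")
        if not lo <= value <= hi:
            raise ValueError(f"Feature '{name}'={value} is outside training range [{lo}, {hi}]")
    return np.asarray(features, dtype=float)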
5. Slippage — The Silent Performance Killer
[Chart: Slippage Impact on Annual Returns (600 trades/year)]
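The arithmetic behind the chart is simple: at 600 trades per year, each extra basis point of slippage per trade subtracts roughly 600 bps (6 percentage points) from gross annual return, ignoring compounding. The sketch below uses an illustrative 20% gross strategy; none of these numbers come from the course.

def net_annual_return(gross_return, slippage_bps_per_trade, trades_per_year=600):
    """Annual return after per-trade slippage (simple drag, ignoring compounding)."""
    slippage_drag = trades_per_year * slippage_bps_per_trade / 10_000
    return gross_return - slippage_drag

# Illustrative only: a strategy grossing 20% per year at 600 trades/year
for bps in (1, 2, 5, 10):
    print(f"{bps:>2} bps/trade -> net {net_annual_return(0.20, bps):.1%}")

At 5 bps per trade the illustrative 20% gross strategy is already underwater, which is why slippage assumptions belong in the backtest rather than as an afterthought.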
Hands-On Code
Model Deployment with Monitoring
import hashlib
import logging
import pickle

import numpy as np

logger = logging.getLogger('model_deployment')


class ModelDeployment:
    """Deploy and monitor ML models in production."""

    def __init__(self, model_path, expected_hash):
        self.model = self._load_verified(model_path, expected_hash)
        self.predictions = []
        self.prediction_count = 0

    def _load_verified(self, path, expected_hash):
        """Load model with hash verification."""
        with open(path, 'rb') as f:
            data = f.read()
        actual_hash = hashlib.md5(data).hexdigest()
        if actual_hash != expected_hash:
            raise ValueError(
                f"Model hash mismatch! Expected {expected_hash}, got {actual_hash}"
            )
        logger.info(f"Model loaded and verified: {path}")
        return pickle.loads(data)

    def predict(self, features):
        """Make prediction with monitoring."""
        try:
            pred = self.model.predict_proba(features.reshape(1, -1))[0, 1]

            # Monitor prediction distribution
            self.predictions.append(pred)
            self.prediction_count += 1

            if self.prediction_count % 100 == 0:
                recent = self.predictions[-100:]
                logger.info(
                    f"Prediction stats (last 100): "
                    f"mean={np.mean(recent):.3f}, "
                    f"std={np.std(recent):.3f}"
                )
            return pred
        except Exception as e:
            logger.error(f"Prediction failed: {e}")
            return None  # graceful fallback

Hash verification ensures you're running the correct model version. Prediction monitoring catches drift and degradation early. Graceful error handling prevents failures from cascading. Unglamorous code, but it's the difference between a hobby project and a production system.
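To make the pattern concrete, here is one way the class above might be wired up; the model path, hash, and feature values are placeholders, not real artifacts.

# Hypothetical usage: path, hash, and feature values below are placeholders
deployment = ModelDeployment(
    model_path='models/signal_model_v3.pkl',
    expected_hash='0123456789abcdef0123456789abcdef',  # recorded when the model was trained
)

features = np.array([0.12, -0.40, 1.70])  # must match the training feature order
prob_up = deployment.predict(features)
if prob_up is not None:
    print(f"Predicted probability: {prob_up:.3f}")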
Knowledge Check
Q1. Your model's rolling win rate dropped from 59% to 48% over the last month. What should you do?
Assignment
Implement model deployment with hash verification and prediction monitoring. Load your L1 model, make 1000 predictions on test data, and track the prediction distribution. Set up alerts for distribution shift.
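A possible starting point for the distribution-shift alert is shown below; the thresholds are placeholders you should tune against your own baseline, not recommended values.

import numpy as np

def check_distribution_shift(baseline_preds, recent_preds,
                             max_mean_shift=0.05, max_std_ratio=1.5):
    """Return a list of alert messages if recent predictions drift from the baseline."""
    alerts = []
    mean_shift = abs(np.mean(recent_preds) - np.mean(baseline_preds))
    std_ratio = np.std(recent_preds) / max(np.std(baseline_preds), 1e-9)
    if mean_shift > max_mean_shift:
        alerts.append(f"mean shifted by {mean_shift:.3f}")
    if std_ratio > max_std_ratio or std_ratio < 1 / max_std_ratio:
        alerts.append(f"std ratio {std_ratio:.2f} vs. baseline")
    return alerts  # an empty list means no alert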