Performance Optimization
Making your system fast enough without wasting time on unnecessary speed
Learning Objectives
- Identify bottlenecks in trading systems
- Optimize data processing and model inference
- Balance speed vs accuracy
Explain Like I'm 5
Performance optimization is about making your system FAST ENOUGH without wasting time on unnecessary speed. For M15 trading, you have 15 minutes between bars — latency isn't critical. But for backtesting 7.5 years of data, speed matters significantly. Optimize what's slow, ignore what's already fast.
Think of It This Way
Performance optimization is like preparing for a marathon, not a sprint. You don't need to be the fastest — you need to be consistent and efficient. Find the bottleneck (shoes, hydration, or pacing?) and fix THAT. Don't buy fancy shoes if you're dehydrated.
1. Common Bottlenecks
2. Practical Optimization Tips
Hands-On Code
Vectorized vs Loop Performance
import numpy as np
import time

def compute_rsi_loop(prices, period=14):
    """RSI computed with Python loops (SLOW)."""
    rsi_values = []
    for i in range(period, len(prices)):
        gains, losses = [], []
        for j in range(i - period, i):
            change = prices[j + 1] - prices[j]
            if change > 0:
                gains.append(change)
            else:
                losses.append(abs(change))
        avg_gain = np.mean(gains) if gains else 0
        avg_loss = np.mean(losses) if losses else 1e-10
        rs = avg_gain / avg_loss
        rsi_values.append(100 - 100 / (1 + rs))
    return rsi_values

def compute_rsi_vectorized(prices, period=14):
    """RSI computed with numpy vectorization (FAST)."""
    deltas = np.diff(prices)
    gains = np.where(deltas > 0, deltas, 0)
    losses = np.where(deltas < 0, -deltas, 0)
    avg_gain = np.convolve(gains, np.ones(period) / period, mode='valid')
    avg_loss = np.convolve(losses, np.ones(period) / period, mode='valid')
    avg_loss = np.where(avg_loss == 0, 1e-10, avg_loss)
    rs = avg_gain / avg_loss
    rsi = 100 - 100 / (1 + rs)
    return rsi

# Benchmark
prices = np.random.randn(10000).cumsum() + 100
t1 = time.time()
r1 = compute_rsi_loop(prices)
t2 = time.time()
r2 = compute_rsi_vectorized(prices)
t3 = time.time()
print(f"Loop: {t2-t1:.3f}s")
print(f"Vectorized: {t3-t2:.6f}s")
print(f"Speedup: {(t2-t1)/(t3-t2):.0f}x")

Numpy vectorization typically gives a 100-1000x speedup over Python loops. This is the single most impactful optimization for data processing in Python.
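If your features already live in a DataFrame, pandas rolling operations give the same kind of speedup with less index bookkeeping. A minimal sketch (the function name `compute_rsi_pandas` is hypothetical; it uses the same simple-average RSI definition as the numpy version above, not Wilder's smoothing):

```python
import numpy as np
import pandas as pd

def compute_rsi_pandas(prices, period=14):
    """Simple-average RSI via pandas rolling means (hypothetical helper)."""
    s = pd.Series(prices)
    deltas = s.diff()
    gains = deltas.clip(lower=0)          # positive changes, else 0
    losses = (-deltas).clip(lower=0)      # magnitudes of negative changes
    avg_gain = gains.rolling(period).mean()
    avg_loss = losses.rolling(period).mean().replace(0, 1e-10)
    rs = avg_gain / avg_loss
    return (100 - 100 / (1 + rs)).to_numpy()

prices = np.random.randn(1000).cumsum() + 100
rsi = compute_rsi_pandas(prices)
print(rsi[-5:])  # last five RSI values; first `period` entries are NaN
```

The first `period` entries are NaN because the rolling window (and the initial diff) are not yet full — the same warm-up region the loop version skips by starting at index `period`.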
Knowledge Check
Q1. Your backtest takes 3 hours. Profiling shows 90% of time is in feature computation (Python loops). What should you do?
Assignment
Profile your backtest with cProfile. Identify the top 3 bottlenecks. Optimize the biggest one (likely feature computation or bar-by-bar loops). Measure the speedup.
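A minimal sketch of that profiling workflow, using the standard-library cProfile and pstats modules. The `run_backtest` function here is a hypothetical stand-in for your real entry point:

```python
import cProfile
import io
import pstats

def run_backtest():
    """Stand-in for your real backtest entry point (hypothetical)."""
    total = 0.0
    for i in range(200_000):
        total += i ** 0.5
    return total

profiler = cProfile.Profile()
profiler.enable()
run_backtest()
profiler.disable()

# Report the 3 entries with the highest cumulative time -- these are
# the bottlenecks worth optimizing first.
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream)
stats.sort_stats('cumulative').print_stats(3)
print(stream.getvalue())
```

Sorting by cumulative time surfaces the functions your run actually spends its time inside (including their callees), which is usually more actionable than raw call counts.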