Python for Quantitative Finance
The tools most quant teams actually use every day
Learning Objectives
- Set up a Python environment for quantitative finance
- Use pandas DataFrames for financial time series
- Understand numpy vectorized operations and why they matter
- Build a basic data pipeline from raw data to analysis
Explain Like I'm 5
Python is a powerful calculator that crunches millions of numbers in seconds. Instead of staring at charts hoping to spot patterns, you write a few instructions and the computer checks for them quickly, consistently, and without emotional bias. Python is widely used as a primary research language at quant firms.
Think of It This Way
If quant trading is professional cooking, Python is your industrial kitchen. You could cook over a campfire (manual analysis), but a professional kitchen — with pandas for prep, numpy for fast computation, and scikit-learn for ML — lets you operate at institutional scale.
1. Why Python Won
2. pandas: Working with Financial Data
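As a minimal sketch of what working with a financial time series in pandas can look like, the example below builds a small synthetic price series on a DatetimeIndex and applies time-based slicing, resampling, and shifting. The prices are randomly generated for illustration and are not market data; the column name `close` and the 15-minute frequency are assumptions chosen to mirror the pipeline later in this lesson.

```python
import numpy as np
import pandas as pd

# Synthetic 15-minute closes covering two full days (illustrative values only)
idx = pd.date_range("2024-01-02 00:00", periods=192, freq="15min")
rng = np.random.default_rng(seed=0)
close = 1.10 + np.cumsum(rng.normal(0.0, 0.0002, size=len(idx)))
df = pd.DataFrame({"close": close}, index=idx)

# Time-based slicing: select every bar that falls on the first day
first_day = df.loc["2024-01-02"]

# Resampling: collapse 15-minute bars into daily open/high/low/close
daily_ohlc = df["close"].resample("1D").ohlc()

# Shifting: place the previous bar's close next to the current one
df["prev_close"] = df["close"].shift(1)

print(len(first_day))
print(daily_ohlc)
```

The datetime index is what makes the slicing and resampling calls possible, which is why the pipeline below sets `time` as the index before doing anything else.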
3. numpy: The Fast Math Layer
[Figure: Python loops vs numpy vectorization, execution time]
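The comparison can be reproduced with a rough sketch like the one below, which computes simple returns once with an explicit Python loop and once with a single numpy array expression. The function names `returns_loop` and `returns_vectorized` are hypothetical, the prices are synthetic, and the measured timings will vary by machine, so treat the numbers as indicative only.

```python
import time
import numpy as np

# Synthetic, strictly positive price path (illustrative values only)
rng = np.random.default_rng(seed=42)
prices = 100.0 * np.exp(np.cumsum(rng.normal(0.0, 0.001, size=200_000)))

def returns_loop(p):
    """Simple returns computed one element at a time in Python."""
    out = np.empty(len(p) - 1)
    for i in range(1, len(p)):
        out[i - 1] = p[i] / p[i - 1] - 1.0
    return out

def returns_vectorized(p):
    """The same calculation as a single array expression."""
    return p[1:] / p[:-1] - 1.0

start = time.perf_counter()
loop_result = returns_loop(prices)
loop_seconds = time.perf_counter() - start

start = time.perf_counter()
vec_result = returns_vectorized(prices)
vec_seconds = time.perf_counter() - start

print(f"Loop:       {loop_seconds:.4f} s")
print(f"Vectorized: {vec_seconds:.4f} s")
print("Results match:", np.allclose(loop_result, vec_result))
```

Both functions return the same values; the vectorized version is typically much faster because the per-element work runs in compiled code rather than in the Python interpreter.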
Key Formulas
Simple Returns
Arithmetic returns — computed with df["close"].pct_change(). Intuitive for single-period analysis but not additive across time.
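Written out, with $P_t$ denoting the closing price of bar $t$ (notation chosen for this sketch), the simple return and its two-period compounding rule are:

```latex
R_t = \frac{P_t - P_{t-1}}{P_{t-1}} = \frac{P_t}{P_{t-1}} - 1

R_{t-2 \to t} = (1 + R_{t-1})(1 + R_t) - 1 \neq R_{t-1} + R_t
```

For example, a 10% gain followed by a 10% loss compounds to $(1.10)(0.90) - 1 = -1\%$, not 0%, which is what "not additive across time" means in practice.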
Logarithmic Returns
Log returns are additive over time: multi-period log return = sum of individual-period log returns. This is why they're preferred for statistical modeling and multi-period analysis.
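The additivity property can be checked numerically. The sketch below uses the usual definition, log return = ln(P_t / P_{t-1}), on a handful of made-up prices and compares the sum of per-period returns against the return over the whole window for both return types.

```python
import numpy as np

# Made-up closing prices (illustrative values only)
prices = np.array([100.0, 102.0, 99.0, 101.0, 105.0])

# Per-period simple and log returns
simple = prices[1:] / prices[:-1] - 1.0
log_ret = np.log(prices[1:] / prices[:-1])

# Return over the whole window, computed directly from first and last price
total_simple = prices[-1] / prices[0] - 1.0
total_log = np.log(prices[-1] / prices[0])

# Log returns sum to the total log return; simple returns do not sum
# to the total simple return
log_is_additive = np.isclose(log_ret.sum(), total_log)
simple_is_additive = np.isclose(simple.sum(), total_simple)

# Converting the summed log return back recovers the total simple return
recovered_simple = np.exp(log_ret.sum()) - 1.0

print("Log returns additive:   ", log_is_additive)
print("Simple returns additive:", simple_is_additive)
print("Recovered total simple return:", recovered_simple)
```

The last step shows that nothing is lost by working in log space: exponentiating the summed log return gives back the simple return over the full window.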
Hands-On Code
Financial Data Pipeline — From Raw Data to Analysis
import pandas as pd
import numpy as np
# --- Load and prepare financial data ---
df = pd.read_csv('EURUSD_M15.csv', parse_dates=['time'])
df = df.set_index('time').sort_index()
# --- Compute returns and core features ---
df['returns'] = df['close'].pct_change()
df['log_returns'] = np.log(df['close'] / df['close'].shift(1))
df['volatility'] = df['returns'].rolling(20).std()
df['sma_50'] = df['close'].rolling(50).mean()
df['sma_200'] = df['close'].rolling(200).mean()
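# --- Summary statistics (fully vectorized) ---
# The file holds 15-minute bars, so resample to daily closes before
# reporting daily figures or annualizing with sqrt(252).
daily_returns = df['close'].resample('1D').last().dropna().pct_change().dropna()
print(f"Mean daily return: {daily_returns.mean():.6f}")
print(f"Daily volatility: {daily_returns.std():.6f}")
# Sharpe ratio with the risk-free rate assumed to be zero
sharpe = daily_returns.mean() / daily_returns.std() * np.sqrt(252)
print(f"Annualized Sharpe: {sharpe:.2f}")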
# --- Identify high-volatility regimes ---
vol_95 = df['volatility'].quantile(0.95)
high_vol = df[df['volatility'] > vol_95]
print(f"High-vol bars: {len(high_vol)} ({len(high_vol)/len(df)*100:.1f}%)")
This is the skeleton of every quant pipeline: load data, compute features with vectorized operations, generate summary stats. No for loops anywhere. This pattern scales from prototype to production.
Knowledge Check
Q1. Why should for loops be avoided when processing pandas DataFrames?
Q2. What is the primary advantage of logarithmic returns over simple returns?
Assignment
Download daily OHLCV data for any major currency pair. Compute daily returns, 20-day rolling volatility, 50-day and 200-day moving averages, and RSI(14). Create a 4-panel matplotlib figure showing all indicators. Find the periods where volatility exceeds its 95th percentile and research what was happening in markets at those times.