Alpha Research Process
The systematic discipline that separates real research from expensive data mining
Learning Objectives
- Establish a rigorous alpha research pipeline
- Distinguish hypothesis-driven from data-driven research
- Account for multiple testing and the probability of false discoveries
- Know when to kill a signal and when to iterate
Explain Like I'm 5
Alpha research is the process of finding new trading signals. The challenge isn't finding things that worked in the past — it's finding things that will work in the future. Most "discoveries" are noise.
Think of It This Way
Alpha research is like drug discovery. You screen thousands of candidates, most fail in testing, and only a few survive to "production." The key is having rigorous trials (not just looking at backtest equity curves) so you don't deploy placebos.
1. The Alpha Research Pipeline
2. Hypothesis-Driven vs. Data Mining
3. The Multiple Testing Problem
4. Signal Kill Criteria
5. Building Your Research Lab
Hands-On Code
Research Pipeline Validator
import numpy as np
from scipy.stats import spearmanr

def validate_signal(signal_is, returns_is, signal_oos, returns_oos,
                    n_total_tests=1, min_ic=0.02, alpha=0.05):
    """Run a candidate signal through the validation pipeline."""
    results = {'passed': True, 'stages': {}}

    # Stage 1: In-sample IC
    ic_is, p_is = spearmanr(signal_is, returns_is)
    results['stages']['in_sample'] = {
        'ic': round(ic_is, 4),
        'p_value': round(p_is, 4),
        'pass': ic_is > min_ic and p_is < alpha
    }
    if not results['stages']['in_sample']['pass']:
        results['passed'] = False
        results['kill_reason'] = 'In-sample IC below threshold'
        return results

    # Stage 2: Multiple testing correction (Bonferroni)
    adjusted_alpha = alpha / max(n_total_tests, 1)
    results['stages']['multiple_testing'] = {
        'bonferroni_alpha': round(adjusted_alpha, 6),
        'pass': p_is < adjusted_alpha
    }
    if not results['stages']['multiple_testing']['pass']:
        results['passed'] = False
        results['kill_reason'] = 'Fails multiple testing correction'
        return results

    # Stage 3: Out-of-sample validation
    ic_oos, p_oos = spearmanr(signal_oos, returns_oos)
    ic_decay = 1 - (ic_oos / ic_is) if ic_is > 0 else 1.0
    results['stages']['out_of_sample'] = {
        'ic': round(ic_oos, 4),
        'p_value': round(p_oos, 4),
        'ic_decay': round(ic_decay * 100, 1),
        'pass': ic_oos > min_ic and p_oos < alpha
    }
    if not results['stages']['out_of_sample']['pass']:
        results['passed'] = False
        results['kill_reason'] = 'Out-of-sample IC below threshold'
        return results

    results['recommendation'] = 'PROCEED to live simulation'
    return results

Provides a framework that evaluates candidate signals through multiple validation stages including IC testing, statistical significance with Bonferroni correction, and out-of-sample confirmation.
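To see the gates work end to end, the sketch below replays the same three stage checks on synthetic data with a small planted edge, using the same thresholds `validate_signal` would apply with `n_total_tests=20`. Every data-generating parameter here (the 0.15 factor loading, the noise scales, the seed) is an illustrative assumption, not part of the lesson:

```python
import numpy as np
from scipy.stats import spearmanr

rng = np.random.default_rng(7)
n = 5000

# Synthetic world: returns load weakly on a hidden factor that the
# signal observes with noise. Parameters are illustrative only.
factor_is = rng.normal(size=n)
factor_oos = rng.normal(size=n)
signal_is = factor_is + rng.normal(size=n)
returns_is = 0.15 * factor_is + rng.normal(size=n)
signal_oos = factor_oos + rng.normal(size=n)
returns_oos = 0.15 * factor_oos + rng.normal(size=n)

min_ic, alpha, n_tests = 0.02, 0.05, 20

# Stage 1: in-sample IC and raw significance
ic_is, p_is = spearmanr(signal_is, returns_is)
print(f"in-sample IC {ic_is:.3f}, p-value {p_is:.1e}")

# Stage 2: Bonferroni-adjusted significance across 20 candidate signals
print("survives Bonferroni:", p_is < alpha / n_tests)

# Stage 3: out-of-sample confirmation and IC decay
ic_oos, p_oos = spearmanr(signal_oos, returns_oos)
print(f"out-of-sample IC {ic_oos:.3f}, decay {100 * (1 - ic_oos / ic_is):.0f}%")
```

With this construction the population rank IC is roughly 0.10, so a planted edge of that size comfortably clears all three gates at n = 5000; shrink the loading toward zero and the signal starts dying at stage 1 or stage 2.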
Knowledge Check
Q1. You test 50 candidate signals and find 3 significant at p < 0.05. After Bonferroni correction (alpha = 0.05/50 = 0.001), how many are likely genuine?
Q2. What distinguishes hypothesis-driven research from data mining?
Q3. A signal's rolling IC has been below 0.02 for 7 months but was previously strong. What should you do?
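The arithmetic behind Q1 can be checked by simulation: pure-noise signals clear a raw p < 0.05 bar about 5% of the time, so 50 tests are expected to produce 50 × 0.05 = 2.5 false discoveries even with zero real alpha, while the Bonferroni threshold of 0.05/50 = 0.001 screens almost all of them out. A minimal sketch (sample sizes and seed are arbitrary choices):

```python
import numpy as np
from scipy.stats import spearmanr

rng = np.random.default_rng(0)
n_obs, n_signals, alpha = 500, 50, 0.05

# 50 pure-noise "signals" tested against the same random returns:
# any hits are false discoveries by construction.
returns = rng.normal(size=n_obs)
p_values = np.array([spearmanr(rng.normal(size=n_obs), returns)[1]
                     for _ in range(n_signals)])

raw_hits = int((p_values < alpha).sum())     # expected count: 50 * 0.05 = 2.5
bonf_hits = int((p_values < alpha / n_signals).sum())
print(f"raw p < {alpha}: {raw_hits} false discoveries")
print(f"Bonferroni p < {alpha / n_signals}: {bonf_hits} false discoveries")
```

Under the null, the raw count is Binomial(50, 0.05), so a handful of "significant" findings is exactly what noise produces; the Bonferroni count is usually zero.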
Assignment
Build a research notebook template with these sections: Hypothesis, Economic Mechanism, Data Sources, Feature Engineering, In-Sample Results (IC, t-stat), Out-of-Sample Results, PBO Score, Decision (deploy/iterate/kill). Then populate it for one hypothesis of your choosing. Follow the full pipeline honestly, including the uncomfortable step of checking if your idea survives multiple testing correction.