This directory contains Jupyter notebook tutorials demonstrating the features of the diff-diff library.
Introduction to Difference-in-Differences with diff-diff:
- Basic 2x2 DiD estimation
- Column-name and formula interfaces
- Adding covariates
- Fixed effects (dummy and absorbed)
- Two-Way Fixed Effects (TWFE)
- Cluster-robust standard errors
- Wild cluster bootstrap
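
For orientation, the basic 2x2 estimate listed above is a difference of four cell means. The sketch below computes it by hand in plain Python. It does not use diff-diff's API, and the toy data and the helper names `cell_mean` and `did_2x2` are hypothetical, chosen only for illustration.

```python
from statistics import mean

# Hypothetical toy data: (treated, post, outcome) for eight observations.
rows = [
    (0, 0, 10.0), (0, 0, 12.0),   # control, pre-period
    (0, 1, 13.0), (0, 1, 15.0),   # control, post-period
    (1, 0, 20.0), (1, 0, 22.0),   # treated, pre-period
    (1, 1, 28.0), (1, 1, 30.0),   # treated, post-period
]

def cell_mean(rows, treated, post):
    """Mean outcome in one of the four group-by-period cells."""
    return mean(y for d, p, y in rows if d == treated and p == post)

def did_2x2(rows):
    """(treated post - treated pre) minus (control post - control pre)."""
    treated_change = cell_mean(rows, 1, 1) - cell_mean(rows, 1, 0)
    control_change = cell_mean(rows, 0, 1) - cell_mean(rows, 0, 0)
    return treated_change - control_change

estimate = did_2x2(rows)
print(estimate)  # 5.0
```

The same number is the coefficient on the treated-by-post interaction in an OLS regression, which is the form that makes covariates, fixed effects, and cluster-robust standard errors possible.
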
Handling staggered treatment adoption with the Callaway-Sant'Anna estimator:
- Understanding staggered adoption
- Problems with TWFE in staggered settings
- Goodman-Bacon decomposition: Diagnosing why TWFE fails
- Group-time effects ATT(g,t)
- Aggregation methods (simple, group, event-study)
- Control group specifications
- Visualization
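
As a rough illustration of the group-time effects listed above, ATT(g,t) for cohort g compares that cohort's outcome change from period g-1 to period t against the same change among comparison units. The sketch below assumes a never-treated comparison group and no covariates. It is plain Python rather than the library's estimator, and the `panel` data and the `att_gt` helper are hypothetical.

```python
from statistics import mean

# Hypothetical balanced panel over periods 1..4.
# "cohort" is the first treated period; None marks never-treated units.
panel = {
    "A": {"cohort": 3, "y": {1: 1.0, 2: 2.0, 3: 5.0, 4: 6.0}},
    "B": {"cohort": 3, "y": {1: 2.0, 2: 3.0, 3: 6.0, 4: 7.0}},
    "C": {"cohort": None, "y": {1: 1.0, 2: 2.0, 3: 3.0, 4: 4.0}},
    "D": {"cohort": None, "y": {1: 3.0, 2: 4.0, 3: 5.0, 4: 6.0}},
}

def att_gt(panel, g, t):
    """Unconditional ATT(g,t) with never-treated controls and base period g-1."""
    base = g - 1
    treated = [u["y"][t] - u["y"][base]
               for u in panel.values() if u["cohort"] == g]
    control = [u["y"][t] - u["y"][base]
               for u in panel.values() if u["cohort"] is None]
    return mean(treated) - mean(control)

for t in (3, 4):
    print(f"ATT(3,{t}) = {att_gt(panel, 3, t)}")
```

Averaging these cells with different weights gives the simple, group, and event-study aggregations.
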
Synthetic Difference-in-Differences for few treated units:
- When to use Synthetic DiD
- Understanding unit and time weights
- Pre-treatment fit diagnostics
- Inference methods (bootstrap, placebo)
- Regularization tuning
- Comparison with standard DiD
Testing assumptions and diagnostics:
- Visual inspection of trends
- Simple parallel trends tests
- Robust Wasserstein-based tests
- Equivalence testing (TOST)
- Placebo tests (timing, group, permutation)
- Event study as a diagnostic
- What to do if parallel trends fails
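
To make the timing placebo concrete: drop the real post-treatment periods, pretend treatment started earlier, and re-estimate. An estimate near zero is consistent with parallel pre-trends. The following is a minimal sketch in plain Python with hypothetical data and a hypothetical `did` helper. It produces a point estimate only, with no inference, and is not the library's placebo routine.

```python
from statistics import mean

# Hypothetical toy panel over periods 1..4; real treatment starts in period 4.
panel = {
    "T1": {"treated": True,  "y": {1: 10.0, 2: 11.0, 3: 12.0, 4: 18.0}},
    "T2": {"treated": True,  "y": {1: 12.0, 2: 13.0, 3: 14.0, 4: 20.0}},
    "C1": {"treated": False, "y": {1: 5.0,  2: 6.0,  3: 7.0,  4: 8.0}},
    "C2": {"treated": False, "y": {1: 7.0,  2: 8.0,  3: 9.0,  4: 10.0}},
}

def did(panel, periods, first_post):
    """2x2 DiD on the given periods, treating periods >= first_post as 'post'."""
    def change(treated):
        units = [u for u in panel.values() if u["treated"] == treated]
        pre = [u["y"][t] for u in units for t in periods if t < first_post]
        post = [u["y"][t] for u in units for t in periods if t >= first_post]
        return mean(post) - mean(pre)
    return change(True) - change(False)

# Real estimate: all periods, treatment begins in period 4.
real_effect = did(panel, periods=[1, 2, 3, 4], first_post=4)

# Placebo: pre-treatment periods only, with a fake start date in period 3.
placebo_effect = did(panel, periods=[1, 2, 3], first_post=3)

print(real_effect, placebo_effect)  # 5.0 0.0
```
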
Efficient Difference-in-Differences (Chen, Sant'Anna & Xie 2025):
- Optimal weighting across comparison groups and baselines
- PT-All vs PT-Post assumptions
- Efficiency gains vs Callaway-Sant'Anna
- Event study and group-level aggregation
- Bootstrap inference and diagnostics
Wooldridge Extended Two-Way Fixed Effects (ETWFE) for staggered DiD:
- Basic OLS estimation with cohort x time ATT cells
- Aggregation methods: event-study, group, calendar, simple
- Poisson QMLE for count / non-negative outcomes
- Logit for binary outcomes
- Comparison with Callaway-Sant'Anna
- Delta-method standard errors
Survey-aware DiD with complex sampling designs (strata, PSU, FPC, weights):
- Why survey design matters for DiD inference
- Setting up `SurveyDesign` (weights, strata, PSU, FPC)
- Basic DiD and staggered DiD with survey design
- Replicate weights (JK1, BRR, Fay, JKn)
- Subpopulation analysis
- DEFF diagnostics
- Repeated cross-sections with survey design
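
One simple way to see why the design matters is Kish's approximate design effect from unequal weighting, `n * sum(w^2) / (sum(w))^2`. The sketch below is plain Python with hypothetical weights and a hypothetical `kish_deff` helper. It ignores strata and clustering, so the DEFF diagnostics in the notebook may be computed differently.

```python
def kish_deff(weights):
    """Kish's approximate design effect due to unequal weighting only."""
    n = len(weights)
    total = sum(weights)
    return n * sum(w * w for w in weights) / (total * total)

# Hypothetical sampling weights.
equal_weights = [1.0, 1.0, 1.0, 1.0]
unequal_weights = [1.0, 1.0, 3.0, 3.0]

deff = kish_deff(unequal_weights)
effective_n = len(unequal_weights) / deff

print(kish_deff(equal_weights))  # 1.0: equal weights carry no penalty
print(deff, effective_n)         # 1.25 3.2
```

A design effect above 1 means the sample carries less information than a simple random sample of the same size, which is why ignoring the design tends to understate standard errors.
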
Practitioner walkthrough for measuring brand-campaign lift on survey data with complex sampling:
- The brand-tracker problem framed for marketing analytics
- Naive vs survey-aware DiD comparison (the naive analysis is overconfident)
- `SurveyDesign` setup (strata, PSU, FPC, weights) wired into the fit
- Funnel-metric extension across awareness / consideration / purchase intent
- Diagnostics (parallel trends, placebo, automated `practitioner_next_steps()`)
- Stakeholder communication template
Practitioner walkthrough for marketing analytics teams measuring geo-experiment lift:
- The geo-experiment problem framed for marketing analytics
- Synthetic panel of 80 markets with simulated campaign launch
- `SyntheticDiD` fit, diagnostics, and inference (placebo + bootstrap)
- Unit weights and time weights interpretation
- Stakeholder communication template (Tutorial 17 Section 9 pattern)
Practitioner walkthrough for measuring lift from on/off promotional pulses across markets, where treatment can switch in both directions:
- The marketing-pulse problem framed for reversible (non-absorbing) treatment
- TWFE decomposition diagnostic (`twowayfeweights`) showing why standard regression misleads on reversible panels (de Chaisemartin & D'Haultfoeuille 2020 Theorem 1)
- `DCDH` Phase 1: DID_M, joiners-vs-leavers decomposition, single-lag placebo
- Multi-horizon event study with `L_max` + multiplier bootstrap
- Stakeholder communication template + drift guards
Practitioner walkthrough for measuring per-dollar lift when every market is treated at a different dose level and no never-treated unit exists (comparison comes from the dose variation across markets):
- The measurement problem framed for heterogeneous-adoption (no-untreated-control) panels
- `HAD` overall fit on a 2-period collapse, with `design="auto"` resolving to `continuous_near_d_lower` (Design 1) and target `WAS_d_lower` (per-$1K marginal effect above the lightest-touch DMA's spend)
- Multi-week event study showing per-week dynamics with pre-launch placebos
- Stakeholder communication template flagging the Assumption 5/6 identification caveat
- Companion drift-test file (`tests/test_t20_had_brand_campaign_drift.py`)
Composite pre-test walkthrough for `HeterogeneousAdoptionDiD`, building on Tutorial 20's brand-campaign framing, on a panel where the dose distribution has a strictly positive but very near-zero lower bound (so the QUG step fails to reject H0: d_lower = 0):
- Paper Section 4.2 step taxonomy (QUG support-infimum, parallel pre-trends, linearity)
- `did_had_pretest_workflow(aggregate="overall")` on a two-period collapse: Step 1 + Step 3 only, verdict explicitly flags Step 2 as deferred
- Upgrade to `did_had_pretest_workflow(aggregate="event_study")` on the multi-week panel: adds the joint pre-trends Stute and joint homogeneity Stute diagnostics (none of the three testable steps reject)
- Side panel comparing `yatchew_hr_test` `null="linearity"` (default, paper Theorem 7) vs `null="mean_independence"` (Phase 4 R-parity with R `YatchewTest::yatchew_test(order=0)`)
- Companion drift-test file (`tests/test_t21_had_pretest_workflow_drift.py`)
- Install diff-diff with dependencies:

      pip install diff-diff
      pip install matplotlib  # for visualizations
      pip install jupyter     # to run notebooks

- Start Jupyter:

      jupyter notebook

- Open any notebook and run the cells.

- Python 3.8+
- diff-diff
- numpy
- pandas
- matplotlib (optional, for visualizations)