diff-diff Tutorials

This directory contains Jupyter notebook tutorials demonstrating the features of the diff-diff library.

Notebooks

1. Basic DiD (`01_basic_did.ipynb`)

Introduction to Difference-in-Differences with diff-diff:

Basic 2x2 DiD estimation
Column-name and formula interfaces
Adding covariates
Fixed effects (dummy and absorbed)
Two-Way Fixed Effects (TWFE)
Cluster-robust standard errors
Wild cluster bootstrap
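
As a rough illustration of what the basic 2x2 estimate measures, the sketch below computes it by hand from four cell means on made-up data. It deliberately avoids the diff-diff API; the `rows` layout and the helper names `cell_mean` and `did_2x2` are assumptions for this example only.

```python
from statistics import mean

# Toy panel rows: (unit, treated_group, post_period, outcome). Numbers are made up.
rows = [
    ("a", 1, 0, 10.0), ("a", 1, 1, 15.0),
    ("b", 1, 0, 12.0), ("b", 1, 1, 17.0),
    ("c", 0, 0, 9.0),  ("c", 0, 1, 11.0),
    ("d", 0, 0, 11.0), ("d", 0, 1, 13.0),
]


def cell_mean(rows, treated, post):
    """Mean outcome in one (group, period) cell."""
    return mean(y for _, g, p, y in rows if g == treated and p == post)


def did_2x2(rows):
    """(treated post - treated pre) - (control post - control pre)."""
    treated_change = cell_mean(rows, 1, 1) - cell_mean(rows, 1, 0)
    control_change = cell_mean(rows, 0, 1) - cell_mean(rows, 0, 0)
    return treated_change - control_change


att = did_2x2(rows)
print(att)  # 3.0
```

The treated group rises by 5 and the control group by 2, so the estimate is 3. The notebook covers how the library produces the same quantity together with standard errors.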

2. Staggered DiD (`02_staggered_did.ipynb`)

Handling staggered treatment adoption with the Callaway-Sant'Anna estimator:

Understanding staggered adoption
Problems with TWFE in staggered settings
Goodman-Bacon decomposition: Diagnosing why TWFE fails
Group-time effects ATT(g,t)
Aggregation methods (simple, group, event-study)
Control group specifications
Visualization
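
To make the group-time effect ATT(g,t) concrete, here is a minimal sketch of the unconditional version with a never-treated control group, using period g-1 as the baseline. It is hand-rolled on made-up data and is not the library's implementation; `outcomes`, `cohort`, and `att_gt` are hypothetical names.

```python
from statistics import mean

# outcomes[unit] maps period -> outcome. Numbers are made up.
outcomes = {
    "u1": {1: 5.0, 2: 6.0, 3: 9.0, 4: 10.0},
    "u2": {1: 7.0, 2: 8.0, 3: 11.0, 4: 12.0},
    "u3": {1: 4.0, 2: 5.0, 3: 6.0, 4: 9.0},
    "u4": {1: 3.0, 2: 4.0, 3: 5.0, 4: 6.0},
    "u5": {1: 6.0, 2: 7.0, 3: 8.0, 4: 9.0},
}
# First treated period per unit; None marks never-treated units.
cohort = {"u1": 3, "u2": 3, "u3": 4, "u4": None, "u5": None}


def att_gt(g, t):
    """Change from period g-1 to t for cohort g, minus the same change for never-treated units."""
    base = g - 1

    def mean_change(units):
        return mean(outcomes[u][t] - outcomes[u][base] for u in units)

    treated_units = [u for u, c in cohort.items() if c == g]
    never_treated = [u for u, c in cohort.items() if c is None]
    return mean_change(treated_units) - mean_change(never_treated)


# One estimate per (cohort, post-treatment period) cell.
for g, t in [(3, 3), (3, 4), (4, 4)]:
    print(g, t, att_gt(g, t))
```

Each cell compares a cohort only with units that are never treated, which avoids the already-treated comparisons that cause trouble for TWFE in staggered settings.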

3. Synthetic DiD (`03_synthetic_did.ipynb`)

Synthetic Difference-in-Differences for few treated units:

When to use Synthetic DiD
Understanding unit and time weights
Pre-treatment fit diagnostics
Inference methods (bootstrap, placebo)
Regularization tuning
Comparison with standard DiD
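
The sketch below shows how unit and time weights enter a synthetic DiD point estimate as a weighted double difference. The weights here are hand-picked for illustration rather than estimated, the data are made up, and the names `unit_w`, `time_w`, and `sdid_estimate` are assumptions, not the library's API.

```python
from statistics import mean

N_PRE = 2  # number of pre-treatment periods in each series

# Each series lists the pre-period outcomes first, then the post-period outcomes.
control = {"c1": [10.0, 11.0, 12.0], "c2": [20.0, 21.0, 22.0]}
treated = {"t1": [14.0, 15.0, 19.0]}

# Hand-picked weights; each set sums to 1. A real fit would estimate these.
unit_w = {"c1": 0.6, "c2": 0.4}
time_w = [0.25, 0.75]


def weighted_change(series):
    """Post-period mean minus the time-weighted pre-period average."""
    pre = sum(w * y for w, y in zip(time_w, series[:N_PRE]))
    post = mean(series[N_PRE:])
    return post - pre


def sdid_estimate():
    treated_change = mean(weighted_change(s) for s in treated.values())
    control_change = sum(unit_w[u] * weighted_change(s) for u, s in control.items())
    return treated_change - control_change


tau = sdid_estimate()
print(round(tau, 6))  # 3.0
```

With uniform unit and time weights the same formula reduces to the standard DiD contrast, which is one way to read the comparison covered in the notebook.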

4. Parallel Trends (`04_parallel_trends.ipynb`)

Testing assumptions and diagnostics:

Visual inspection of trends
Simple parallel trends tests
Robust Wasserstein-based tests
Equivalence testing (TOST)
Placebo tests (timing, group, permutation)
Event study as a diagnostic
What to do if parallel trends fails
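
One way to picture a permutation placebo is to reassign the treatment label to every possible set of units and ask how often a placebo gap is at least as large as the observed one. The sketch below does this exhaustively on made-up unit-level changes. It is a toy under stated assumptions, not the library's test; `changes`, `gap`, and `p_value` are hypothetical names.

```python
from itertools import combinations
from statistics import mean

# Post-minus-pre outcome change per unit. Numbers are made up.
changes = {"a": 5.0, "b": 4.0, "c": 1.0, "d": 2.0, "e": 1.5, "f": 0.5}
treated = {"a", "b"}


def gap(treated_units):
    """Mean change among the labelled-treated units minus mean change among the rest."""
    treated_mean = mean(changes[u] for u in treated_units)
    control_mean = mean(v for u, v in changes.items() if u not in treated_units)
    return treated_mean - control_mean


observed = gap(treated)

# Enumerate every way to label the same number of units as treated.
placebo_gaps = [
    gap(set(combo)) for combo in combinations(sorted(changes), len(treated))
]

# Share of labellings whose gap is at least as extreme as the observed one.
extreme = sum(abs(g) >= abs(observed) - 1e-12 for g in placebo_gaps)
p_value = extreme / len(placebo_gaps)
print(observed, len(placebo_gaps), p_value)
```

Here only the true labelling reaches the observed gap of 3.25, so the p-value is 1 out of 15. With more units, random draws of labellings would replace the full enumeration.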

15. Efficient DiD (`15_efficient_did.ipynb`)

Efficient Difference-in-Differences (Chen, Sant'Anna & Xie 2025):

Optimal weighting across comparison groups and baselines
PT-All vs PT-Post assumptions
Efficiency gains vs Callaway-Sant'Anna
Event study and group-level aggregation
Bootstrap inference and diagnostics

16. Wooldridge ETWFE (`16_wooldridge_etwfe.ipynb`)

Wooldridge Extended Two-Way Fixed Effects (ETWFE) for staggered DiD:

Basic OLS estimation with cohort x time ATT cells
Aggregation methods: event-study, group, calendar, simple
Poisson QMLE for count / non-negative outcomes
Logit for binary outcomes
Comparison with Callaway-Sant'Anna
Delta-method standard errors
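
As a sketch of how cohort x time cells can be combined into a single number, the example below forms a size-weighted average of cell estimates and propagates their covariance to a standard error. For a linear combination the delta method reduces to w'Vw. All numbers are made up, and the weighting scheme is an assumption rather than the library's documented choice.

```python
from math import sqrt

# Hypothetical cell-level ATT estimates, their covariance matrix, and cell sizes.
cell_att = [1.0, 2.0, 3.0]
cell_cov = [
    [0.04, 0.01, 0.00],
    [0.01, 0.01, 0.00],
    [0.00, 0.00, 0.04],
]
cell_n = [10, 20, 10]


def aggregate(estimates, cov, sizes):
    """Size-weighted average of the cells and its standard error sqrt(w' V w)."""
    total = sum(sizes)
    w = [n / total for n in sizes]
    point = sum(wi * b for wi, b in zip(w, estimates))
    variance = sum(
        w[i] * w[j] * cov[i][j]
        for i in range(len(w))
        for j in range(len(w))
    )
    return point, sqrt(variance)


att, se = aggregate(cell_att, cell_cov, cell_n)
print(round(att, 6), round(se, 6))  # 2.0 0.1
```

Ignoring the off-diagonal covariance terms would understate the variance here, which is why the full matrix is used rather than the cell-level standard errors alone.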

16. Survey-Aware DiD (`16_survey_did.ipynb`)

Survey-aware DiD with complex sampling designs (strata, PSU, FPC, weights):

Why survey design matters for DiD inference
Setting up SurveyDesign (weights, strata, PSU, FPC)
Basic DiD and staggered DiD with survey design
Replicate weights (JK1, BRR, Fay, JKn)
Subpopulation analysis
DEFF diagnostics
Repeated cross-sections with survey design
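
For intuition on why unequal weights matter for inference, the sketch below computes Kish's approximate design effect from weighting alone, n * sum(w^2) / (sum(w))^2, and the implied effective sample size. This approximation ignores strata and clustering, and the library's DEFF diagnostics may use a different definition; `kish_deff` is a hypothetical helper.

```python
def kish_deff(weights):
    """Kish's approximate design effect due to unequal weighting."""
    n = len(weights)
    total = sum(weights)
    total_sq = sum(w * w for w in weights)
    return n * total_sq / (total * total)


equal = [1.0, 1.0, 1.0, 1.0]
unequal = [1.0, 1.0, 3.0, 3.0]

print(kish_deff(equal))    # 1.0
print(kish_deff(unequal))  # 1.25

# Effective sample size: nominal n deflated by the design effect.
n_eff = len(unequal) / kish_deff(unequal)
print(n_eff)  # 3.2
```

A design effect above 1 means the sample carries less information than its nominal size suggests, so analyses that ignore the weights tend to report standard errors that are too small.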

17. Brand Awareness Survey (`17_brand_awareness_survey.ipynb`)

Practitioner walkthrough for measuring brand-campaign lift on survey data with complex sampling:

The brand-tracker problem framed for marketing analytics
Naive vs survey-aware DiD comparison (overconfidence under naive)
SurveyDesign setup (strata, PSU, FPC, weights) wired into the fit
Funnel-metric extension across awareness / consideration / purchase intent
Diagnostics (parallel trends, placebo, automated practitioner_next_steps())
Stakeholder communication template

18. Geo-Experiment Analysis with SyntheticDiD (`18_geo_experiments.ipynb`)

Practitioner walkthrough for marketing analytics teams measuring geo-experiment lift:

The geo-experiment problem framed for marketing analytics
Synthetic panel of 80 markets with simulated campaign launch
SyntheticDiD fit, diagnostics, and inference (placebo + bootstrap)
Unit weights and time weights interpretation
Stakeholder communication template (Tutorial 17 Section 9 pattern)

19. dCDH Marketing Pulse Campaigns (`19_dcdh_marketing_pulse.ipynb`)

Practitioner walkthrough for measuring lift from on/off promotional pulses across markets, where treatment can switch in both directions:

The marketing-pulse problem framed for reversible (non-absorbing) treatment
TWFE decomposition diagnostic (twowayfeweights) showing why standard regression misleads on reversible panels (de Chaisemartin & D'Haultfoeuille 2020 Theorem 1)
DCDH Phase 1: DID_M, joiners-vs-leavers decomposition, single-lag placebo
Multi-horizon event study with L_max + multiplier bootstrap
Stakeholder communication template + drift guards

20. HAD for National Brand Campaign with Regional Spend Intensity (`20_had_brand_campaign.ipynb`)

Practitioner walkthrough for measuring per-dollar lift when every market is treated at a different dose level and no never-treated unit exists (comparison comes from the dose variation across markets):

The measurement problem framed for heterogeneous-adoption (no-untreated-control) panels
HAD overall fit on a 2-period collapse, with design="auto" resolving to continuous_near_d_lower (Design 1) and target WAS_d_lower (per-$1K marginal effect above the lightest-touch DMA's spend)
Multi-week event study showing per-week dynamics with pre-launch placebos
Stakeholder communication template flagging the Assumption 5/6 identification caveat
Companion drift-test file (tests/test_t20_had_brand_campaign_drift.py)

21. HAD Pre-test Workflow (`21_had_pretest_workflow.ipynb`)

Composite pre-test walkthrough for HeterogeneousAdoptionDiD. It builds on Tutorial 20's brand-campaign framing, using a panel where the dose distribution has a strictly positive but very near-zero lower bound (so the QUG step fails to reject H0: d_lower = 0):

Paper Section 4.2 step taxonomy (QUG support-infimum, parallel pre-trends, linearity)
did_had_pretest_workflow(aggregate="overall") on a two-period collapse: Step 1 + Step 3 only, verdict explicitly flags Step 2 as deferred
Upgrade to did_had_pretest_workflow(aggregate="event_study") on the multi-week panel: adds the joint pre-trends Stute and joint homogeneity Stute diagnostics (none of the three testable steps reject)
Side panel comparing yatchew_hr_test null="linearity" (default, paper Theorem 7) vs null="mean_independence" (Phase 4 R-parity with R YatchewTest::yatchew_test(order=0))
Companion drift-test file (tests/test_t21_had_pretest_workflow_drift.py)

Running the Notebooks

Install diff-diff with dependencies:

pip install diff-diff
pip install matplotlib  # for visualizations
pip install jupyter     # to run notebooks

Start Jupyter:

jupyter notebook

Open any notebook and run the cells.

Requirements

Python 3.8+
diff-diff
numpy
pandas
matplotlib (optional, for visualizations)
