igerber · igerber · Feb 7, 2026 · Feb 7, 2026
diff --git a/METHODOLOGY_REVIEW.md b/METHODOLOGY_REVIEW.md
@@ -21,7 +21,7 @@ Each estimator in diff-diff should be periodically reviewed to ensure:
 | Estimator | Module | R Reference | Status | Last Review |
 |-----------|--------|-------------|--------|-------------|
 | DifferenceInDifferences | `estimators.py` | `fixest::feols()` | **Complete** | 2026-01-24 |
-| MultiPeriodDiD | `estimators.py` | `fixest::feols()` | Not Started | - |
+| MultiPeriodDiD | `estimators.py` | `fixest::feols()` | **Complete** | 2026-02-02 |
 | TwoWayFixedEffects | `twfe.py` | `fixest::feols()` | Not Started | - |
 | CallawaySantAnna | `staggered.py` | `did::att_gt()` | **Complete** | 2026-01-24 |
 | SunAbraham | `sun_abraham.py` | `fixest::sunab()` | Not Started | - |
@@ -99,16 +99,48 @@ Each estimator in diff-diff should be periodically reviewed to ensure:
 | Field | Value |
 |-------|-------|
 | Module | `estimators.py` |
-| Primary Reference | Freyaldenhoven et al. (2021) |
+| Primary Reference | Freyaldenhoven et al. (2021), Wooldridge (2010), Angrist & Pischke (2009) |
 | R Reference | `fixest::feols()` |
-| Status | Not Started |
-| Last Review | - |
+| Status | **Complete** |
+| Last Review | 2026-02-02 |
+
+**Verified Components:**
+- [x] Full event-study specification: treatment × period interactions for ALL non-reference periods (pre and post)
+- [x] Reference period coefficient is zero (normalized by omission from design matrix)
+- [x] Default reference period is last pre-period (e=-1 convention, matches fixest/did)
+- [x] Pre-period coefficients available for parallel trends assessment
+- [x] Average ATT computed from post-treatment effects only, with covariance-aware SE
+- [x] Returns PeriodEffect objects with confidence intervals for all periods
+- [x] Supports balanced and unbalanced panels
+- [x] NaN inference: t_stat/p_value/CI use NaN when SE is non-finite or zero
+- [x] R-style NA propagation: avg_att is NaN if any post-period effect is unidentified
+- [x] Rank-deficient design matrix: warns and sets NaN for dropped coefficients (R-style)
+- [x] Staggered adoption detection warning (via `unit` parameter)
+- [x] Treatment reversal detection warning
+- [x] Time-varying D_it detection warning (advises creating ever-treated indicator)
+- [x] Single pre-period warning (ATT valid but pre-trends assessment unavailable)
+- [x] Post-period reference_period raises ValueError (would bias avg_att)
+- [x] HonestDiD/PreTrendsPower integration uses interaction sub-VCV (not full regression VCV)
+- [x] All REGISTRY.md edge cases tested
+
+**Test Coverage:**
+- 50 tests across `TestMultiPeriodDiD` and `TestMultiPeriodDiDEventStudy` in `tests/test_estimators.py`
+- 18 new event-study specification tests added in PR #125
 
 **Corrections Made:**
-- (None yet)
+- **PR #125 (2026-02-02)**: Transformed from post-period-only estimator into full event-study
+  specification with pre-period coefficients. Reference period default changed from first
+  pre-period to last pre-period (e=-1 convention). HonestDiD/PreTrendsPower VCV extraction
+  fixed to use interaction sub-VCV instead of full regression VCV.
 
 **Outstanding Concerns:**
-- (None yet)
+- No R comparison benchmarks yet (unlike DifferenceInDifferences and CallawaySantAnna which
+  have formal R benchmark tests). Consider adding `benchmarks/R/multiperiod_benchmark.R`.
+- Default SE is HC1 (not cluster-robust at unit level as fixest uses). Cluster-robust
+  available via `cluster` parameter but not the default.
+- Endpoint binning for distant event times not yet implemented.
+- FutureWarning for reference_period default change should eventually be removed once
+  the transition is complete.
 
 ---
 

diff --git a/docs/methodology/REGISTRY.md b/docs/methodology/REGISTRY.md
@@ -160,6 +160,11 @@ where V is the VCV sub-matrix for post-treatment δ_e coefficients.
 *Edge cases:*
 - Reference period: omitted from design matrix; coefficient is zero by construction.
   Default is last pre-treatment period (e=-1). User can override via `reference_period`.
+- Post-period reference: raises ValueError. Post-period references would exclude a
+  post-treatment period from estimation, biasing avg_att and breaking downstream inference.
+- Reference period default change: FutureWarning emitted when `reference_period` is not
+  explicitly specified and ≥2 pre-periods exist, noting the default changed from first
+  to last pre-period (e=-1 convention, matching fixest/did).
 - Never-treated units: all event-time indicators are zero; they identify the time
   fixed effects and serve as comparison group.
 - Endpoint binning: distant event times (e.g., e < -K or e > K) should be binned
@@ -171,6 +176,17 @@ where V is the VCV sub-matrix for post-treatment δ_e coefficients.
   coefficients (R-style, matches `lm()`)
 - Average ATT (`avg_att`) is NA if any post-period effect is unidentified
   (R-style NA propagation)
+- NaN inference for undefined statistics:
+  - t_stat: Uses NaN (not 0.0) when SE is non-finite or zero
+  - p_value and CI: Also NaN when t_stat is NaN
+  - avg_se: Checked for finiteness before computing avg_t_stat
+  - **Note**: Defensive enhancement matching CallawaySantAnna NaN convention
+- Treatment reversal: warns if any unit transitions from treated to untreated
+  (non-absorbing treatment violates the simultaneous adoption assumption)
+- Time-varying treatment (D_it): warns when `unit` parameter is provided and
+  within-unit treatment variation is detected. Advises creating an ever-treated
+  indicator. Without ever-treated D_i, pre-period interaction coefficients are
+  unidentified.
 - Pre-test of parallel trends: joint F-test on pre-treatment δ_e coefficients.
   Low power in pre-test does not validate parallel trends (Roth 2022).