Add convex hull violation diagnostic for synthetic control methods #599

Copilot · 2025-12-20T11:36:34Z

Synthetic control methods construct counterfactuals using non-negative weights that sum to one, i.e. a convex combination. This constrains predictions to lie within the convex hull of the control units. When the treated unit's pre-intervention trajectory falls outside this hull (e.g., consistently above or below all controls), no admissible weighting can reproduce it, so the pre-intervention fit is poor and the resulting counterfactual is unreliable.

Changes

Core diagnostic

Added a check_convex_hull_violation() utility that checks whether the treated unit's values fall within the range of the control units at each time point
Integrated an automatic check into SyntheticControl.__init__() that warns when the condition is violated:

result = cp.SyntheticControl(
    df,
    treatment_time,
    control_units=["a", "b", "c"],
    treated_units=["treated"],
    model=cp.pymc_models.WeightedSumFitter(...)
)
# UserWarning: Convex hull assumption may be violated: 30 pre-intervention 
# time points (100.0% above, 0.0% below control range). Consider: (1) adding 
# more diverse control units, (2) using ITS with intercept, or (3) using 
# Augmented Synthetic Control Method.

Testing

5 unit tests covering violation scenarios (above/below/both/boundary/pass cases)
2 integration tests verifying warning behavior in actual usage

Documentation

Brief subsection in sc_pymc.ipynb introducing the concept
Detailed pedagogical section in sc_pymc_brexit.ipynb with mathematical explanation, visualization code, and alternatives when violated
Glossary term "Convex hull condition" with citation to Abadie et al. (2010)
Added references: Abadie et al. (2010), Ben-Michael et al. (2021)

Original prompt

This section details the original issue you should resolve

<issue_title>Add Diagnostic Test for Convex Hull Assumption in Synthetic Control</issue_title>
<issue_description>## Problem Description

Background

The synthetic control method constructs a counterfactual by finding a weighted combination of control units that best approximates the treated unit in the pre-intervention period. In CausalPy, this is implemented via the WeightedSumFitter (Bayesian) and WeightedProportion (OLS) models, both of which impose:

Non-negativity constraint: All weights β_i ≥ 0
Sum-to-one constraint: Σ β_i = 1

These constraints mean the synthetic control prediction μ is a convex combination of the control units:

$$\mu = \sum_{i=1}^{n} \beta_i x_i \quad \text{where } \beta_i \geq 0 \text{ and } \sum_{i=1}^{n} \beta_i = 1$$

The Convex Hull Assumption

By definition, a convex combination can only produce values that lie within the convex hull of the input points. In the context of synthetic control, this means:

The treated unit's pre-intervention outcomes must be expressible as a weighted average of the control units' outcomes at each time point.

When this assumption is violated:

If all control series are above the treated series → the minimum achievable synthetic control value is the smallest control value (putting all weight on the lowest control), which is still too high
If all control series are below the treated series → the maximum achievable value is the largest control value (putting all weight on the highest control), which is still too low

In either case, no valid convex combination can match the treated unit's trajectory, leading to:

Poor pre-intervention fit (low R²)
Biased treatment effect estimates
Unreliable counterfactual projections
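
The bound described above can be illustrated numerically. The following is a minimal sketch with made-up control values (not CausalPy data): every weight vector drawn from the simplex gives a prediction between the smallest and largest control value, so a treated value above all controls stays out of reach.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical control values at a single time point (illustrative only)
controls = np.array([1.0, 2.5, 4.0])
treated = 5.0  # lies above every control

# Draw random weight vectors that are non-negative and sum to one
weights = rng.dirichlet(np.ones(len(controls)), size=10_000)
predictions = weights @ controls

# Every convex combination stays within [min(controls), max(controls)]
tol = 1e-12  # guard against floating-point round-off
within_range = bool(
    (predictions >= controls.min() - tol).all()
    and (predictions <= controls.max() + tol).all()
)

# The closest achievable prediction puts all weight on the largest control,
# which still leaves a gap to the treated value
best_gap = treated - controls.max()
```

Here `within_range` is true for all sampled weights and `best_gap` is the irreducible error at this time point.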

Current State

CausalPy does not currently:

Check whether the convex hull assumption is satisfied
Warn users when the assumption is violated
Provide educational content about this critical assumption

Proposed Solution

1. Implement Diagnostic Test

Add a function to check whether the treated unit's pre-intervention values fall within the convex hull of the control units at each time point. A simplified but effective approach:

import numpy as np


def check_convex_hull_violation(
    treated_series: np.ndarray,
    control_matrix: np.ndarray,
) -> dict:
    """
    Check if treated series values fall within the range of control series.
    
    For each time point, verify that:
    min(controls) <= treated <= max(controls)
    
    This is a necessary (but not sufficient) condition for the treated unit
    to lie within the convex hull of control units.
    
    Returns:
        dict with keys:
        - 'passes': bool - whether the check passes
        - 'n_violations': int - number of time points with violations
        - 'pct_above': float - percentage of points where treated > max(controls)
        - 'pct_below': float - percentage of points where treated < min(controls)
    """
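    # Rows are time points, columns are control units
    control_min = control_matrix.min(axis=1)
    control_max = control_matrix.max(axis=1)

    # Strict inequalities: values exactly on the boundary count as inside
    above = treated_series > control_max
    below = treated_series < control_min

    n_points = len(treated_series)
    return {
        'passes': not (above.any() or below.any()),
        # Cast NumPy scalars to the built-in types promised in the docstring
        'n_violations': int(above.sum() + below.sum()),
        'pct_above': float(100 * above.sum() / n_points),
        'pct_below': float(100 * below.sum() / n_points),
    }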
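
The docstring notes that the per-time-point range check is necessary but not sufficient. As a hedged illustration of that gap, the sketch below compares the range check with an exact membership test posed as a linear feasibility problem. The helper `in_convex_hull` is hypothetical and not part of CausalPy; it assumes SciPy is available and uses `scipy.optimize.linprog`.

```python
import numpy as np
from scipy.optimize import linprog


def in_convex_hull(treated: np.ndarray, controls: np.ndarray) -> bool:
    """Return True if `treated` is a convex combination of the columns of `controls`.

    controls has shape (n_timepoints, n_controls); treated has shape (n_timepoints,).
    """
    n_controls = controls.shape[1]
    # Feasibility problem: find w >= 0 with controls @ w == treated and sum(w) == 1
    A_eq = np.vstack([controls, np.ones((1, n_controls))])
    b_eq = np.append(treated, 1.0)
    result = linprog(
        c=np.zeros(n_controls),  # no objective, only feasibility matters
        A_eq=A_eq,
        b_eq=b_eq,
        bounds=(0, None),
    )
    return result.status == 0  # 0 means a feasible solution was found


# Two controls that cross over time; the treated unit sits at zero throughout
controls = np.array([[0.0, 1.0],
                     [1.0, 0.0]])
treated = np.array([0.0, 0.0])

# The per-time-point range check passes...
range_ok = bool(
    ((treated >= controls.min(axis=1)) & (treated <= controls.max(axis=1))).all()
)
# ...but no single weight vector reproduces the whole trajectory
hull_ok = in_convex_hull(treated, controls)
```

In this toy case `range_ok` is true while `hull_ok` is false. With noisy data and more time points than control units, exact membership will almost never hold, which is one reason a range-based diagnostic is the more practical warning trigger.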

2. Issue Warning with Instructive Message

In SyntheticControl.__init__(), after preparing the data but before fitting, run the diagnostic and issue a warning if violated:

import warnings

# Check convex hull assumption
hull_check = check_convex_hull_violation(
    self.datapre_treated.values.flatten(),
    self.datapre_control.values
)

if not hull_check['passes']:
    warnings.warn(
        f"Convex hull assumption may be violated: {hull_check['n_violations']} "
        f"pre-intervention time points ({hull_check['pct_above']:.1f}% above, "
        f"{hull_check['pct_below']:.1f}% below control range). "
        "The synthetic control method requires the treated unit to lie within "
        "the convex hull of control units. Consider: (1) adding more diverse "
        "control units, (2) using a model with an intercept (e.g., ITS with "
        "control predictors), or (3) using the Augmented Synthetic Control Method. "
        "See glossary term 'Convex hull condition' for more details.",
        UserWarning,
        stacklevel=2
    )
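
Because the diagnostic is emitted as a standard UserWarning, callers can record or filter it with the standard library. The sketch below uses a hypothetical stand-in function, `fit_with_hull_check`, in place of the real constructor so that it runs without CausalPy; only the `warnings` calls are the point.

```python
import warnings


def fit_with_hull_check(passes: bool) -> None:
    """Hypothetical stand-in for a constructor that runs the diagnostic."""
    if not passes:
        warnings.warn(
            "Convex hull assumption may be violated",
            UserWarning,
            stacklevel=2,
        )


# Record warnings instead of printing them, e.g. inside a test
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    fit_with_hull_check(passes=False)

messages = [str(w.message) for w in caught if issubclass(w.category, UserWarning)]

# Silence the warning once the violation has been reviewed and accepted
with warnings.catch_warnings(record=True) as silenced:
    warnings.filterwarnings("ignore", message="Convex hull assumption")
    fit_with_hull_check(passes=False)
```

The `message` argument of `warnings.filterwarnings` is a regular expression matched against the start of the warning text, so the filter only suppresses this diagnostic and leaves other warnings visible.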

3. Update Notebooks

`sc_pymc.ipynb`

Add a brief subsection titled "Convex Hull Assumption" that:

Explains the constraint imposed by non-negative weights summing to one
Notes that this means the treated unit must lie within the "range" of control units
Links to the glossary term

`sc_pymc_brexit.ipynb`

Add a more detailed pedagogical subsection titled "Understanding the Convex Hull Assumption" t...

Fixes #598 (Add Diagnostic Test for Convex Hull Assumption in Synthetic Control)

Co-authored-by: drbenvincent <[email protected]>

drbenvincent · 2025-12-20T12:45:49Z

bugbot run

cursor · 2025-12-20T12:45:53Z

PR Summary

Adds check_convex_hull_violation() and integrates a pre-fit convex hull warning into SyntheticControl, with tests and glossary docs.

Synthetic Control:
- Adds pre-fit convex hull check in causalpy/experiments/synthetic_control.py using check_convex_hull_violation; issues a UserWarning when violated.
Utilities:
- Introduces check_convex_hull_violation() in causalpy/utils.py to detect treated series outside control range; returns pass/violation stats.
Tests:
- Adds unit tests in causalpy/tests/test_utils.py covering pass/above/below/both/boundary cases.
- Adds integration tests in causalpy/tests/test_input_validation.py verifying warning emission and no-warning scenarios in SyntheticControl.
Docs:
- Adds glossary entry for "Convex hull condition" in docs/source/knowledgebase/glossary.rst.

Written by Cursor Bugbot for commit 8f8d79d. This will update automatically on new commits.

codecov · 2025-12-20T13:18:43Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 93.38%. Comparing base (2d6bba7) to head (6a22940).

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #599      +/-   ##
==========================================
+ Coverage   93.27%   93.38%   +0.11%     
==========================================
  Files          37       37              
  Lines        5632     5733     +101     
  Branches      367      370       +3     
==========================================
+ Hits         5253     5354     +101     
  Misses        248      248              
  Partials      131      131

drbenvincent · 2025-12-20T13:58:30Z

Note to self: the "What to Do if Violated" section needs a bit more work.

Initial plan

d35dbde

Copilot AI assigned Copilot and drbenvincent Dec 20, 2025

Copilot started work on behalf of drbenvincent December 20, 2025 11:37

Copilot AI and others added 5 commits December 20, 2025 11:41

Add convex hull diagnostic function and integrate into SyntheticControl

b67ee78

Co-authored-by: drbenvincent <[email protected]>

Add convex hull documentation to synthetic control notebooks

3ac03b2

Co-authored-by: drbenvincent <[email protected]>

Apply pre-commit formatting fixes

3e287d4

Co-authored-by: drbenvincent <[email protected]>

Add integration tests for convex hull warning behavior

d9b90bf

Co-authored-by: drbenvincent <[email protected]>

Apply final formatting fixes

4a804b8

Co-authored-by: drbenvincent <[email protected]>

Copilot AI changed the title ~~[WIP] Add diagnostic test for convex hull assumption~~ Add convex hull violation diagnostic for synthetic control methods Dec 20, 2025

Copilot AI requested a review from drbenvincent December 20, 2025 11:56

Copilot finished work on behalf of drbenvincent December 20, 2025 11:56

minor modifications to notebooks + re-execute

8f8d79d

cursor bot reviewed Dec 20, 2025

causalpy/experiments/synthetic_control.py (outdated, resolved)

causalpy/utils.py (resolved)

drbenvincent marked this pull request as ready for review December 20, 2025 12:51

drbenvincent requested review from ErikRingen, cetagostini and juanitorduz December 20, 2025 12:51

drbenvincent added 3 commits December 20, 2025 13:04

get tests passing for convex hull stuff

41c7e12

Update interrogate badge to 96.4%

4bb0cbf

The interrogate coverage badge in the documentation was updated to reflect a new coverage value of 96.4%.

Merge branch 'main' into copilot/add-diagnostic-test-convex-hull

7de2c52

drbenvincent added the documentation and enhancement labels Dec 20, 2025

drbenvincent added 3 commits December 20, 2025 20:15

Handle empty input in check_convex_hull_violation

dd62321

Added a check in check_convex_hull_violation to safely handle empty treated and control arrays, returning default pass results. Added a corresponding test to ensure correct behavior for this edge case.
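
A minimal sketch of what such a guard might look like follows. The exact shape of the "default pass results" is an assumption here (a passing result with zero counts), and the function body is illustrative rather than the merged implementation. Without a guard, an empty pre-intervention window could lead to a division by zero when the percentages are computed.

```python
import numpy as np


def check_convex_hull_violation(treated_series, control_matrix) -> dict:
    """Range check with an early return for empty input (illustrative sketch)."""
    treated = np.asarray(treated_series, dtype=float)
    controls = np.asarray(control_matrix, dtype=float)

    # Assumed meaning of "default pass results": nothing to check, so report a pass
    if treated.size == 0 or controls.size == 0:
        return {"passes": True, "n_violations": 0, "pct_above": 0.0, "pct_below": 0.0}

    # Rows are time points, columns are control units
    above = treated > controls.max(axis=1)
    below = treated < controls.min(axis=1)
    n_points = treated.size
    return {
        "passes": not (above.any() or below.any()),
        "n_violations": int(above.sum() + below.sum()),
        "pct_above": float(100 * above.sum() / n_points),
        "pct_below": float(100 * below.sum() / n_points),
    }


# Empty input returns the default pass result instead of failing
empty_result = check_convex_hull_violation(np.array([]), np.empty((0, 3)))

# Treated values above every control at both time points
violating_result = check_convex_hull_violation(
    np.array([5.0, 6.0]), np.array([[1.0, 2.0], [1.5, 2.5]])
)
```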

Clarify recommendations in Brexit notebook

f46670e

Reworded suggestions for alternative methods in the synthetic control discussion to improve clarity and consistency. Adjusted phrasing for Augmented Synthetic Control and Comparative Interrupted Time Series methods.

Add glossary link for Comparative interrupted time-series term

6a22940

drbenvincent approved these changes Dec 20, 2025

drbenvincent requested a review from NathanielF December 20, 2025 21:19
