generate_piecewise_its_data
- causalpy.data.simulate_data.generate_piecewise_its_data(N=100, interruption_times=None, baseline_intercept=10.0, baseline_slope=0.1, level_changes=None, slope_changes=None, noise_sigma=1.0, seed=None)
Generate piecewise Interrupted Time Series data with known ground truth parameters.
This function creates synthetic data for testing and demonstrating piecewise ITS / segmented regression models. The data follows the model:
y_t = β₀ + β₁t + Σₖ(level_k · I_k(t) + slope_k · R_k(t)) + ε_t
Where:
- I_k(t) = 1 if t >= T_k else 0 (step function for level change)
- R_k(t) = max(0, t - T_k) (ramp function for slope change)
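
The model above can be sketched directly in NumPy. This is an illustrative re-implementation of the formula only, not CausalPy's actual code; the function name `piecewise_its_sketch` and its return layout are assumptions made for this example.

```python
import numpy as np


def piecewise_its_sketch(
    n=100,
    interruption_times=(50,),
    baseline_intercept=10.0,
    baseline_slope=0.1,
    level_changes=(5.0,),
    slope_changes=(0.0,),
    noise_sigma=1.0,
    seed=None,
):
    """Hypothetical helper: evaluate the piecewise ITS formula term by term."""
    rng = np.random.default_rng(seed)
    t = np.arange(n)

    # Baseline trend: beta_0 + beta_1 * t
    counterfactual = baseline_intercept + baseline_slope * t

    # Sum of step and ramp contributions over all interruptions
    effect = np.zeros(n)
    for t_k, level_k, slope_k in zip(interruption_times, level_changes, slope_changes):
        step = (t >= t_k).astype(float)  # I_k(t)
        ramp = np.maximum(0, t - t_k)    # R_k(t)
        effect += level_k * step + slope_k * ramp

    y_true = counterfactual + effect
    y = y_true + rng.normal(0.0, noise_sigma, size=n)  # epsilon_t
    return t, y, y_true, counterfactual, effect


# Noise-free call, so y equals y_true exactly
t, y, y_true, counterfactual, effect = piecewise_its_sketch(
    interruption_times=(50,),
    level_changes=(5.0,),
    slope_changes=(0.2,),
    noise_sigma=0.0,
    seed=0,
)
```

In this sketch the effect is zero before the interruption, jumps by the level change at t = T_k, and then grows by the slope change per time step.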
- Parameters:
  - N (int) – Number of time points in the series.
  - interruption_times (list[int] | None) – List of time indices where interruptions occur. Defaults to [50].
  - baseline_intercept (float) – The intercept (β₀) of the baseline trend.
  - baseline_slope (float) – The slope (β₁) of the baseline trend.
  - level_changes (list[float] | None) – List of level changes at each interruption. Length must match interruption_times. If None, defaults to [5.0] for a single interruption.
  - slope_changes (list[float] | None) – List of slope changes at each interruption. Length must match interruption_times. If None, defaults to [0.0] (no slope change).
  - noise_sigma (float) – Standard deviation of the Gaussian noise.
- Returns:
  - df (pd.DataFrame) – DataFrame with columns:
    - ‘t’: time index (0 to N-1)
    - ‘y’: observed outcome with noise
    - ‘y_true’: outcome without noise (ground truth)
    - ‘counterfactual’: baseline trend without intervention effects
    - ‘effect’: true causal effect at each time point
  - params (dict) – Dictionary containing the true parameters:
    - ‘baseline_intercept’: β₀
    - ‘baseline_slope’: β₁
    - ‘level_changes’: list of level changes
    - ‘slope_changes’: list of slope changes
    - ‘interruption_times’: list of interruption times
    - ‘noise_sigma’: noise standard deviation
- Return type:
Examples
>>> from causalpy.data.simulate_data import generate_piecewise_its_data
>>> # Single interruption with level and slope change
>>> df, params = generate_piecewise_its_data(
...     N=100,
...     interruption_times=[50],
...     level_changes=[5.0],
...     slope_changes=[0.2],
...     seed=42,
... )
>>> df.shape
(100, 5)

>>> # Multiple interruptions
>>> df, params = generate_piecewise_its_data(
...     N=150,
...     interruption_times=[50, 100],
...     level_changes=[3.0, -2.0],
...     slope_changes=[0.1, -0.15],
...     seed=42,
... )
>>> len(params["interruption_times"])
2

>>> # Level change only (no slope change)
>>> df, params = generate_piecewise_its_data(
...     N=100,
...     interruption_times=[50],
...     level_changes=[5.0],
...     slope_changes=[0.0],
...     seed=42,
... )
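
Because the ground-truth parameters are known, data of this form is convenient for checking that a segmented regression recovers them. The following is a minimal sketch that uses plain NumPy least squares rather than a CausalPy model; it builds its own series from the formula above so that it runs standalone, and the variable names (`true`, `estimates`) are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(42)
n, t_k = 100, 50

# Ground-truth parameters for a single interruption (illustrative values)
true = {
    "baseline_intercept": 10.0,
    "baseline_slope": 0.1,
    "level_change": 5.0,
    "slope_change": 0.2,
}

t = np.arange(n)
step = (t >= t_k).astype(float)              # I(t): 1 from the interruption onward
ramp = np.maximum(0, t - t_k).astype(float)  # R(t): time elapsed since the interruption

# Simulate y_t = b0 + b1 * t + level * I(t) + slope * R(t) + noise
y = (
    true["baseline_intercept"]
    + true["baseline_slope"] * t
    + true["level_change"] * step
    + true["slope_change"] * ramp
    + rng.normal(0.0, 1.0, size=n)
)

# Design matrix with one column per model term, then ordinary least squares
X = np.column_stack([np.ones(n), t, step, ramp])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)

# Pair each estimate with the parameter it corresponds to
estimates = dict(zip(true, coef))
for name, value in estimates.items():
    print(f"{name}: true={true[name]:.2f}, estimated={value:.2f}")
```

With unit noise and 100 points the estimates should land near the true values but will not match them exactly; the same comparison can be made against the `params` dictionary returned by `generate_piecewise_its_data`.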