Event studies

We look at Sun and Abraham.

they raise concerns with dynamic specification typicaly used
estimated parameters might not be the average of positively weighted well defined causal effects
can contaminate pre-trends as well as main estimates

We consider an event study framework.

we have N units and T periods
for each units, $E_i$ denotes the start of the treatment, never treated have $E_i = \infty$
the potential outcomes are $Y_{it}(e)$ , the outcome in period $t$ for each potential treatment date $e$
the observed outcome is $Y_{it} = Y_{it}(e {=} E_i)$
we define the following Cohort Average Treatment on the Treated:

$CATT_{e, \ell} = E[Y_{i, e + \ell} - Y_{i, e + \ell}(\infty)|E_i = e]$

for $l \geq 0 $ these are ATT:

$l\geq 0 , \quad CATT_{e, \ell} = E[Y_{i, e + \ell}(e) - Y_{i, e + \ell}(\infty)|E_i = e]$

for $l < 0 $, under the no anticipation assumption, they end up being 0

$l<0, \quad CATT_{e, \ell} = E[Y_{i, e + \ell}(\infty) - Y_{i, e + \ell}(\infty)|E_i = e] = 0$

Define lags of interest $\mathcal{G}$ (e.g., pre-trend coef, coef of interest, avg post-treatment)
Define $D_{it}^\ell = 1[ t-E_i {=}\ell ]$
Consider the usual dynamic specification for an event study:

$Y_{i, t} = \alpha_i + \lambda_t + \sum_{\ell \in \mathcal{G}} \mu_{\ell} D^{\ell}_{i, t} + \nu_{i, t} \tag{1}$

The $\mu_{\ell}$ can be decomposed as $$ \mu_{\ell} = \sum_{e, \ell'}\omega_{e, \ell'}^{\ell} CATT_{e, \ell'} \tag{2} $$
In general, the $\omega$ 's from (2) can be estimated through the regressions that are ran separatly of each value of $e$ and $l$ : $$ D_{i, t}^{\ell} \cdot \mathbb{1} {E_i = e} = \alpha_i + \lambda_t + \sum_{g \in \mathcal{G}} \omega_{e, \ell}^g \mathbb{1}{t - E_i \in g} + u_{i, t} \tag{4} $$

A simple example from the paper

We consider a simple case with $T = 0,1,2$ and cohorts $E_i \in \{1, 2, \infty\}$ .

We define $g^{incl} = \{−2, 0\}$ and $g^{excl} = \{−1, 1\}$ - exclude -1 to normalize relative to -1 - exclude 1 to drop "distant" lags.

In the simple case we consider, if all individuals are treated we have $$ \mu_{-2} = \underbrace{CATT_{2, -2}}{\text{own period}} + \underbrace{\frac{1}{2}CATT - \frac{1}{2}CATT_{2, 0}}{\ell' \in g^{incl}, \ell' \neq -2} + \underbrace{\frac{1}{2}CATT - CATT_{1, -1} - \frac{1}{2}CATT_{2, -1}}_{\ell' \in g^{excl}} \tag{5} $$

With no anticipation we get that $CATT_{e,l} = 0 $ for $l<0$ and so we get:

$\mu_{-2} = \underbrace{\frac{1}{2}CATT_{1, 0} - \frac{1}{2}CATT_{2, 0}}_{\ell' \in g^{incl}, \ell' \neq -2} + \underbrace{\frac{1}{2}CATT_{1, 1}}_{\ell' \in g^{excl}} \tag{5}$

We can still get non zero $\mu_{-2}$ even in the absence of anticipation effects!
This is true even in the case where CATT are homogenous.

Generating panel data

import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from matplotlib import pyplot as plt
import tqdm

def gen_panel(share_treated, CATT):
    '''
    Purpose:
        Generate panel data.
    Inputs:
        share_treated (float): integer multiple of 0.01 in (0, 1]
        CATT (dict): dictionary giving CATT values
    Returns:
        panel (Pandas DataFrame)
    '''
    T = 2
    N = 200
    n_treated   = N * share_treated
    CATT_E1_lm1 = CATT['CATT_E1_lm1']
    CATT_E1_l0  = CATT['CATT_E1_l0']
    CATT_E1_lp1 = CATT['CATT_E1_lp1']
    CATT_E2_lm2 = CATT['CATT_E2_lm2']
    CATT_E2_lm1 = CATT['CATT_E2_lm1']
    CATT_E2_l0  = CATT['CATT_E2_l0']

    panel = pd.DataFrame()
    panel['t'] = np.tile(np.array(np.arange(T + 1)), N)
    panel['i'] = np.repeat(np.arange(N), T + 1)

    # Add individual and time effects
    panel['alpha_i'] = np.repeat(np.random.random(size=(N)), T + 1)
    panel['lambda_t'] = np.tile(np.random.random(size=(T + 1)), N)

    # Set half the treated to E_1 and the other half to E_2
    panel.eval(
        'E = (i < ({} // 2)) + 2 * (({} // 2) <= i < {})'.format(n_treated,n_treated,n_treated)
        , inplace=True)
    panel.loc[panel['E'] == 0, 'E'] = 100 # Set very large

    # Generate Y
    panel.eval(
        'Y = (E == 1) * ('
            '@CATT_E1_lm1 * (t == 0)'
            '+ @CATT_E1_l0 * (t == 1)'
            '+ @CATT_E1_lp1 * (t == 2)'
        ') +'
        '(E == 2) * ('
            '@CATT_E2_lm2 * (t == 0)'
            '+ @CATT_E2_lm1 * (t == 1)'
            '+ @CATT_E2_l0 * (t == 2)'
        ') + alpha_i + lambda_t'
        , inplace=True)

    # Add group-specific means to alpha_i
    panel.loc[panel['E'] == 1, 'Y'] += 0.3 #np.random.random()
    panel.loc[panel['E'] == 2, 'Y'] += 0.2 #np.random.random()
    panel.loc[panel['E'] == 100, 'Y'] += 0 #np.random.random()

    # Generate D
    panel.eval('D_lm2 = 1 * (t - E == -2)', inplace=True)
    panel.eval('D_l0 = 1 * (t - E == 0)', inplace=True)

    return panel

Viewing some panel data

# Choose share_treated
share_treated = 0.03
# Generate CATT
CATT_E1_lm1 = 0 # no anticipation
CATT_E1_l0 = 0.4 # np.random.random()
CATT_E1_lp1 = 0.8 #np.random.random()

CATT_E2_lm2 = 0 # no anticipation
CATT_E2_lm1 = 0 # no anticipation
CATT_E2_l0 = 0.6 #np.random.random()

CATT = {
    'CATT_E1_lm1': CATT_E1_lm1, 
    'CATT_E1_l0': CATT_E1_l0, 
    'CATT_E1_lp1': CATT_E1_lp1, 
    'CATT_E2_lm2': CATT_E2_lm2, 
    'CATT_E2_lm1': CATT_E2_lm1, 
    'CATT_E2_l0': CATT_E2_l0}

# Generate panel
panel = gen_panel(share_treated, CATT)
display(panel.head(n=20))

	t	i	alpha_i	lambda_t	E	Y	D_lm2	D_l0
0	0	0	0.982422	0.909437	1.0	2.191859	0.0	0.0
1	1	0	0.982422	0.144418	1.0	1.826840	0.0	1.0
2	2	0	0.982422	0.470141	1.0	2.552563	0.0	0.0
3	0	1	0.605144	0.909437	1.0	1.814581	0.0	0.0
4	1	1	0.605144	0.144418	1.0	1.449562	0.0	1.0
5	2	1	0.605144	0.470141	1.0	2.175285	0.0	0.0
6	0	2	0.774368	0.909437	1.0	1.983805	0.0	0.0
7	1	2	0.774368	0.144418	1.0	1.618786	0.0	1.0
8	2	2	0.774368	0.470141	1.0	2.344509	0.0	0.0
9	0	3	0.534308	0.909437	2.0	1.643745	1.0	0.0
10	1	3	0.534308	0.144418	2.0	0.878726	0.0	0.0
11	2	3	0.534308	0.470141	2.0	1.804449	0.0	1.0
12	0	4	0.784083	0.909437	2.0	1.893520	1.0	0.0
13	1	4	0.784083	0.144418	2.0	1.128501	0.0	0.0
14	2	4	0.784083	0.470141	2.0	2.054224	0.0	1.0
15	0	5	0.178669	0.909437	2.0	1.288106	1.0	0.0
16	1	5	0.178669	0.144418	2.0	0.523086	0.0	0.0
17	2	5	0.178669	0.470141	2.0	1.448810	0.0	1.0
18	0	6	0.727277	0.909437	100.0	1.636714	0.0	0.0
19	1	6	0.727277	0.144418	100.0	0.871695	0.0	0.0

Plotting Y over time, grouped by E

panel_sum = panel.groupby(['E', 't'])['Y'].mean()
plt.plot(range(3), panel_sum[:3], label='$E_i=1$')
plt.plot(range(3), panel_sum[3:6], label='$E_i=2$')
plt.plot(range(3), panel_sum[6:9], label='$E_i = \infty$')
plt.xlabel('Time')
plt.xticks([0,1,2])
plt.ylabel('Y')
plt.legend()
plt.show()

png

Computing the weights

Recall that the $\omega$ 's from (2) can be estimated through the regressions for each $(e,\ell)$ $$ D_{i, t}^{\ell} \cdot \mathbb{1} {E_i = e} = \alpha_i + \lambda_t + \sum_{g \in \mathcal{G}} \omega_{e, \ell}^g \mathbb{1}{t - E_i \in g} + \nu_{i, t} \tag{4} $$

def get_weights(panel):
    '''
    Purpose:
        Compute the weights using OLS.
    Inputs:
        panel (Pandas DataFrame): panel to compute weights
    Returns:
        gm2 (dict)
        g0 (dict)
    '''    
    gm2 = {}
    g0 = {}

    # We loop over e and l
    for e in [1, 2]:
        for l in range(-2, 2):
            # Avoid out-of-bounds cases
            if not (((e == 1) and (l == -2)) or ((e == 2) and (l == 1))):

                # construct the regressors as in equation (4)
                panel.eval('D = 1 * ((t - E == @l) & (E == @e))', inplace=True)
                panel.eval('gm2 = (t - E) == -2', inplace=True)
                panel.eval('g0 = (t - E) == 0', inplace=True)

                # run the regression as in equation (4)
                ols = smf.ols('D ~ C(t) + C(i) + gm2 + g0', panel)
                ols_result = ols.fit().params

                # collect the parameters
                gm2[(e, l)] = ols_result['gm2[T.True]']
                g0[(e, l)] = ols_result['g0[T.True]']

    return gm2, g0

Apply the functions

# Loop over share_treated
share_treated = np.linspace(0.01, 1, 20)

mu_lm2 = []
CATT_mu_lm2 = []

gm2_full = {}
g0_full = {}

for share in tqdm.tqdm(share_treated):
    panel = gen_panel(share, CATT)

    # Use Y from panel to estimate mu_l
    # -1 to drop constant
    ols = smf.ols('Y ~ C(i) + C(t) + D_lm2 + D_l0 -1', panel)
    ols_result = ols.fit().params

    # Extract each of the weights for all (e,l) combinations
    gm2, g0 = get_weights(panel)
    wt_E2_lm2 = gm2[(2, -2)]
    wt_E1_l0 = gm2[(1, 0)]
    wt_E2_l0 = gm2[(2, 0)]
    wt_E1_lp1 = gm2[(1, 1)]
    wt_E1_lm1 = gm2[(1, -1)]
    wt_E2_lm1 = gm2[(2, -1)]

    # Save results
    mu_lm2.append(ols_result['D_lm2'])
    CATT_mu_lm2.append(wt_E2_lm2 * CATT_E2_lm2
                       + wt_E1_l0 * CATT_E1_l0
                       + wt_E2_l0 * CATT_E2_l0
                       + wt_E1_lp1 * CATT_E1_lp1
                       + wt_E1_lm1 * CATT_E1_lm1
                       + wt_E2_lm1 * CATT_E2_lm1)

    # Append weights to gm2, g0
    for key in gm2.keys():
        try:
            gm2_full[key].append(gm2[key])
        except:
            gm2_full[key] = [gm2[key]]

    for key in g0.keys():
        try:
            g0_full[key].append(g0[key])
        except:
            g0_full[key] = [g0[key]]

1	`100%\|██████████\| 20/20 [00:05<00:00, 3.93it/s]`

Graph the weights for different treated shares

plt.figure(figsize=(10,5))

plt.subplot(1,2,1)
plt.plot(share_treated, gm2_full[(1, 0)], label='$g=-2,\; e=1,\; \ell=0$')
plt.plot(share_treated, gm2_full[(2, 0)], label='$g{=}2,\; e{=}2,\; \ell{=}0$')
plt.xlabel('Share treated')
plt.ylabel('Weight')
plt.legend()

plt.subplot(1,2,2)
plt.plot(share_treated, gm2_full[(1, 1)], label='$g=-2,\; e=1,\; \ell=1$')
plt.plot(share_treated, gm2_full[(2, -1)], label='$g=-2,\; e=2,\; \ell=-1$')
plt.plot(share_treated, gm2_full[(1, -1)], label='$g=-2,\; e=1,\; \ell=-1$')
plt.xlabel('Share treated')
plt.ylabel('Weight')
plt.legend()
plt.show()

png

Using true CATT to check weights

plt.plot(np.array(mu_lm2), np.array(CATT_mu_lm2),'o', label='OLS mu versus weighted mu')
plt.xlabel('value of $\mu_{-2}$')
plt.ylabel('value of weighted CATTs')
plt.show()

png

Estimating the CATTs

We estimate the CATTs using the following regression: $$ Y_{i, t} = \alpha_i + \lambda_t + \sum_{e \notin C} \sum_{\ell \neq -1} \delta_{e, \ell}(\mathbb{1}{E_i = e} \cdot D_{i, t}^{\ell}) + \epsilon_{i, t} \tag{6} $$

where in this case, $C = \{\infty\}$ gives that the never-treated group serves as the cohort we exclude (if there are no never-treated groups, $C$ could include the last treated group, and any always-treated cohorts must be included in $C$ ) and we treat $\ell = -1$ as a baseline.

$\hat{\delta}_{e, \ell}$ gives a DID estimator for $CATT_{e, \ell}$ .

# Set share_treated in (0, 1)
share_treated = 0.1
panel = gen_panel(share_treated, CATT)

panel.eval('delta_E1_l0  = 1 * ((t - E == 0) & (E == 1))', inplace=True)
panel.eval('delta_E1_lp1 = 1 * ((t - E == 1) & (E == 1))', inplace=True)
panel.eval('delta_E2_lm2 = 1 * ((t - E == -2) & (E == 2))', inplace=True)
panel.eval('delta_E2_l0  = 1 * ((t - E == 0) & (E == 2))', inplace=True)

ols = smf.ols('Y ~ C(i) + C(t) + delta_E1_l0 + delta_E1_lp1 + delta_E2_lm2 + delta_E2_l0 -1', panel)
ols_result = ols.fit().params

delta_E1_l0 = ols_result['delta_E1_l0']
delta_E1_lp1 = ols_result['delta_E1_lp1']
delta_E2_lm2 = ols_result['delta_E2_lm2']
delta_E2_l0 = ols_result['delta_E2_l0']

# Compare to truth:
print('CATT_E1_l0:', CATT_E1_l0)
print('CATT_E1_lp1:', CATT_E1_lp1)
print('CATT_E2_lm2:', CATT_E2_lm2)
print('CATT_E2_l0:', CATT_E2_l0)

# Estimates:
print('delta_E1_l0:', delta_E1_l0)
print('delta_E1_lp1:', delta_E1_lp1)
print('delta_E2_lm2:', delta_E2_lm2)
print('delta_E2_l0:', delta_E2_l0)

# # Differences:
# print('E1_l0:', delta_E1_l0 - CATT_E1_l0)
# print('E1_lp1:', delta_E1_lp1 - CATT_E1_lp1)
# print('E2_lm2:', delta_E2_lm2 - CATT_E2_lm2)
# print('E2_l0:', delta_E2_l0 - CATT_E2_l0)

CATT_E1_l0: 0.4
CATT_E1_lp1: 0.8
CATT_E2_lm2: 0
CATT_E2_l0: 0.6
delta_E1_l0: 0.4000000000000064
delta_E1_lp1: 0.8000000000000045
delta_E2_lm2: 9.764199571088277e-15
delta_E2_l0: 0.6000000000000056

how to construct inference on average over cohort (inference on linear combination of parameters)
how to include covariates