DiD and synthetic control

In this lecture, we are going to look a bit deeper into extensions of Diff-in-Diff when longer lags and leads are available to us.

We follow the presentation of Kellogg, Mogstad, Pouliot and Torgovitsky (2020).

We consider an outcome Y_{it} observed over multiple periods. We similarly assume a binary treatment assignment D_i for each unit i. Finally, we assume that the time-specific treatment status is of the form D_{it} = D_i 1[ t \geq t^* ].

Empirical example: California Prop 99

We are going to use the California tobacco tax (Prop 99) as a running example. This is a common example in the literature.

In 1988, California passed Prop 99, a law taxing tobacco products including cigarette packs. This law was the first of its kind in the US.

We would like to evaluate the effect of passing that regulation on tobacco consumption. We use state-level data on cigarette consumption, as in Abadie et al.

import pandas as pd
import matplotlib.pyplot as plt
import numpy as np

data = pd.read_csv('../data/cigs.csv')
data

We start by plotting cigarette consumption in California before and after the enactment of the law.

dcal = data.query("state=='CA'")
plt.plot(dcal['year'],dcal['cigs']) 
plt.axvline(1988,linestyle=":",label="Prop 99")
plt.title("Cigaret consumption in Califirnia")
plt.legend()
plt.show()

[Figure: cigarette consumption in California over time, with a vertical line marking Prop 99 in 1988]

Measuring the causal effect using Before/After

As we know well by now, to measure the causal effect of the law we need some notion of the right control group.

A before-after approach compares cigarette consumption in California before and after the law. We can compute such a value for different windows around the law.
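In symbols, for a window of w years around 1988 (matching the computation below, which averages over the w-1 years strictly before and strictly after 1988), the before-after estimate is

\Delta_{BA}(w) = \bar{Y}_{CA, pre(w)} - \bar{Y}_{CA, post(w)}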

res_ba = pd.DataFrame([ {'window':w, 
  'before': dcal.query("year > 1988 - @w & (year < 1988)")['cigs'].mean(), 
  'after' : dcal.query("year < 1988 + @w & (year > 1988)")['cigs'].mean()} for w in range(2,10)])
res_ba['DBA'] = res_ba.eval("before - after")
res_ba
window      before       after        DBA
     2   97.500000   82.400000  15.100000
     3   98.600000   80.100000  18.500000
     4  100.000000   76.300000  23.700000
     5  101.200000   74.100000  27.100000
     6  103.120000   71.960000  31.160000
     7  105.166667   69.733333  35.433333
     8  107.085714   67.828571  39.257143
     9  108.725000   66.162500  42.562500

DiD using the rest of the US

We can do better and use the rest of the US states as the control group, to allow for a common trend in cigarette consumption.
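In symbols (matching the table below), for each window w we compute

\Delta_{DiD}(w) = ( \bar{Y}_{CA, pre(w)} - \bar{Y}_{CA, post(w)} ) - ( \bar{Y}_{US, pre(w)} - \bar{Y}_{US, post(w)} )

where the second difference is computed on the average of all non-California states.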

dcal = data.query("state=='CA'")
plt.plot(dcal['year'],dcal['cigs'],label="CA") 
data.query("state != 'CA'").groupby('year')['cigs'].mean().plot(label="US")
plt.axvline(1988,linestyle=":") 
plt.legend()
plt.show() 

[Figure: cigarette consumption in California vs. the average of the other US states, with Prop 99 marked in 1988]

res_did = pd.DataFrame([ 
{'window':w, 
  'before_t': data.query('state == "CA"').query("year>1988 - @w & (year<1988)")['cigs'].mean(), 
  'after_t' : data.query('state == "CA"').query("year<1988 + @w & (year>1988)")['cigs'].mean(),
  'before_c': data.query('state != "CA"').query("year>1988 - @w & (year<1988)")['cigs'].mean(), 
  'after_c' : data.query('state != "CA"').query("year<1988 + @w & (year>1988)")['cigs'].mean()}
                        for w in range(2,11)])
res_did['DBAT'] = res_did.eval("before_t - after_t")
res_did['DBAC'] = res_did.eval("before_c - after_c")
res_did['DiD']  = res_did.eval("DBAT - DBAC")
res_did
window    before_t    after_t    before_c     after_c       DBAT       DBAC        DiD
     2   97.500000  82.400000  117.586842  109.663158  15.100000   7.923684   7.176316
     3   98.600000  80.100000  119.090789  107.664474  18.500000  11.426316   7.073684
     4  100.000000  76.300000  120.432456  106.557018  23.700000  13.875439   9.824561
     5  101.200000  74.100000  121.550000  105.766447  27.100000  15.783553  11.316447
     6  103.120000  71.960000  123.490000  105.152105  31.160000  18.337895  12.822105
     7  105.166667  69.733333  125.624123  104.646491  35.433333  20.977632  14.455702
     8  107.085714  67.828571  127.390226  104.433835  39.257143  22.956391  16.300752
     9  108.725000  66.162500  128.727632  104.027632  42.562500  24.700000  17.862500
    10  110.188889  64.788889  129.767544  103.778947  45.400000  25.988596  19.411404

We note that the DiD estimates are noticeably smaller than the before-after estimates. This is not so surprising given the overall downward trend in cigarette consumption.

All leads and lags - checking the common trend

Remember that the assumption behind the DiD estimator is that changes in the non-treated outcome are mean independent of treatment status. Formally, we have been writing

E[ Y_{it}(0) - Y_{it'}(0) | D_i = 1 ] = E[ Y_{it}(0) - Y_{it'}(0) | D_i = 0 ]

This allowed us to use the change in outcome in the control group at each time t as a counterfactual for the change in outcome in the treatment group.

While this assumption does not provide any restrictions when there is only one pre-treatment period, it does if we have multiple lead periods. Indeed, for any two periods before t^* we can check whether the constraint is satisfied.

We then run a regression that includes time-specific terms interacted with treatment, leaving the interaction coefficients in the pre-treatment period unrestricted.
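In equation form (matching the regression cigs ~ C(year)*C(Di) estimated below, and assuming the first sample year is the omitted base category, which is the default dummy coding), the specification is

Y_{it} = \gamma_t + \delta D_i + \sum_{s \neq s_0} \beta_s 1[ t = s ] D_i + u_{it}

Under the common trend assumption, the pre-treatment coefficients \beta_s should be close to zero, while the post-treatment coefficients trace out the effect of the policy.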

data['Di'] = (data['state']=='CA')
data['Dit'] = (data['state']=='CA') & (data['year']>=1988) 
import statsmodels.formula.api as smf

# Initialise and fit linear regression model using `statsmodels`
model = smf.ols("cigs ~ C(year)*C(Di)", data=data)
model = model.fit()
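# The next two lines compare the model's prediction for California with the
# prediction obtained after switching Di off: for each year this recovers the
# treatment-group deviation (the Di main effect plus the year-by-Di interaction).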

data_cal = data.query("state == 'CA'").copy()
data_cal['dep'] = model.predict(data_cal) - model.predict( data_cal.eval("Di=False") )
data_cal.plot(x='year',y='dep',label="Treat")
plt.axvline(x=1988, linestyle=':',label="policy")
plt.axhline(data_cal.set_index('year')['dep'][1987],linestyle='--',color='grey')
plt.legend()

# the value at 1988 would be normalized at 0,  
# @fixme change the intercept to be centered at 0 on y-axis

[Figure: estimated year-by-treatment deviations for California from the interacted regression, with the policy year 1988 marked and a horizontal reference at the 1987 value]

#df_pred = pd.concat( [ data_cal.reset_index(), model.get_prediction(data_cal).summary_frame() ],axis=1)
#df_pred

#ax = df_pred.plot(x='year',y='mean')
#df_pred.plot(x='year',y='mean_ci_upper',color="red",linestyle=":",ax=ax,label="mean_ci")
#df_pred.plot(x='year',y='mean_ci_lower',color="red",linestyle=":",ax=ax,label="mean_ci")
#plt.axvline(x=1988, linestyle=':',label="policy")
#plt.legend()
#plt.show()

Constructing the non-treated outcome using observables

In this section we think about directly constructing a control group as a weighted sum of the available non-treated units. A simple choice would be to pick the single state with the most similar observables. Alternatively, we can weight states based on their distance in observables.

Let's look at both ideas: a quick nearest-neighbor comparison first, and then a counterfactual for California built as a weighted combination of states chosen on observable characteristics.
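As a first pass, here is a minimal nearest-neighbor sketch of the first idea above: pick the single non-treated state closest to California in observables. The file path and column names are the ones used in the cells below; standardizing the variables before computing distances, and the object names (cols, Z, dist), are my own choices.

import pandas as pd
import numpy as np

data = pd.read_csv('../data/cigs.csv')
cols = ['loggdppercap', 'retprice', 'pctage1524', 'beerpercap']

# state-level values of the observables (one row per state)
df_X = pd.pivot_table(data[['state'] + cols].drop_duplicates(), index="state")

# standardize each variable so that no single one dominates the distance
Z = (df_X - df_X.mean()) / df_X.std()

# Euclidean distance from every other state to California; the closest state
# is the nearest-neighbor control
dist = np.sqrt(((Z.drop('CA') - Z.loc['CA'])**2).sum(axis=1))
dist.sort_values().head()

We now turn to the weighted counterfactual.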

# we start by computing pair-wise distances between observations 

df_X = (pd
    .pivot_table(data[['state','loggdppercap','retprice','pctage1524','beerpercap']]
    .drop_duplicates(), index="state"))

X1 = df_X.query('state =="CA"').to_numpy()
X0 = df_X.query('state !="CA"').to_numpy()
state_order = df_X.query('state !="CA"').index

ns = X0.shape[0]
w0 = np.ones(ns)/ns

def loss_obj(w):
    return( ((X1 - w @ X0 )**2).sum())
from scipy.optimize import Bounds
# bound the weight in [0,1]
bounds = Bounds(np.zeros(ns), np.ones(ns))

# the weights sum to 1
from scipy.optimize import LinearConstraint
sum_to_one = LinearConstraint(np.ones(ns) , [1], [1])
from scipy.optimize import minimize
res = minimize(loss_obj, w0, constraints=sum_to_one, bounds=bounds)
weights = pd.DataFrame({'weight':res['x']},index=state_order)
weights.query('weight > 0.01')
state    weight
CO       0.122732
CT       0.426237
NH       0.167628
VA       0.283403
# we use the estimated weights to compute the counterfactual outcome.
data_t = (data.query("state !='CA'")
    .join(weights,on="state")
    .assign(dep = lambda d: d['cigs'] * d['weight'])
    .groupby('year')
    ['dep'].agg('sum'))
ax = data_t.plot(label="synthetic control")   # data_t is a Series indexed by year
data_cal.plot(x='year', y='cigs', ax=ax, label="CA")
plt.axvline(x=1988, linestyle=':', label="policy")
plt.legend()
plt.show()

[Figure: weighted synthetic control vs. actual cigarette consumption in California, with Prop 99 marked in 1988]

data_diff = data_cal.set_index('year')['cigs'] - data_t
data_diff = data_diff - data_diff[1987]
data_diff.plot()
plt.axvline(x=1988, linestyle=':',label="policy")
plt.axhline(0, linestyle='--',color='grey',label="policy")
plt.show()

[Figure: difference between California and the synthetic control, normalized to zero in 1987]

# plot all of these states

ax = data.query("state == 'CA'").plot(x='year', y='cigs',label="CA")
for s in weights.query('weight > 0.01').index:
    data.query("state == '{}'".format(s)).plot(x='year', y='cigs',ax=ax,label=s,alpha=0.3)
plt.axvline(x=1988, linestyle=':',label="policy")
plt.plot()

[Figure: cigarette consumption in California and in the states receiving positive synthetic-control weight]

Constructing the non-treated outcome using observables and pre-treatment outcomes

We can also use pre-treatment outcomes to select the weights of the synthetic control group. To this end we add these variables to our minimization problem.

df_X = pd.pivot_table(data[['state','loggdppercap','retprice','pctage1524','beerpercap','cigsalespercap88','cigsalespercap80','cigsalespercap75']].drop_duplicates(), index="state")

X1 = df_X.query('state =="CA"').to_numpy()
X0 = df_X.query('state !="CA"').to_numpy()
state_order = df_X.query('state !="CA"').index

ns = X0.shape[0]
w0 = np.ones(ns)/ns

def loss_obj(w):
    return( ( (X1 - w @ X0 )**2).sum())
res = minimize(loss_obj, w0, constraints=sum_to_one, bounds=bounds)
weights = pd.DataFrame({'weight':res['x']},index=state_order)
weights.query('weight > 0.01')
state    weight
CO       0.024353
MT       0.195348
ND       0.162775
NV       0.259584
UT       0.357162
ax = data.query("state == 'CA'").plot(x='year', y='cigs',label="CA")
for s in weights.query('weight > 0.01').index:
    data.query("state == '{}'".format(s)).plot(x='year', y='cigs',ax=ax,label=s,alpha=0.3)
plt.axvline(x=1988, linestyle=':',label="policy")
plt.plot() 

[Figure: cigarette consumption in California and in the positively weighted states when pre-treatment outcomes are included]

# we use the estimated weights to compute the counterfactual outcome.
data_t = (data.query("state !='CA'")
    .join(weights,on="state")
    .assign(dep = lambda d: d['cigs'] * d['weight'])
    .groupby('year')
    ['dep'].agg('sum'))

data_diff = data_cal.set_index('year')['cigs'] - data_t
data_diff = data_diff - data_diff[1987]
data_diff.plot()
plt.axvline(x=1988, linestyle=':',label="policy") 
plt.axhline(0, linestyle='--',color='grey',label="policy")
plt.show()

[Figure: difference between California and the synthetic control built with pre-treatment outcomes, normalized to zero in 1987]

An extra step: picking the distance in the best possible way

The Abadie et al. paper suggests choosing the weighting of the distance across variables so as to minimize the distance between the outcome variables in the pre-treatment period. For a given variable weighting \omega, the state weights solve

w_s(\omega) = \arg \min_w \sum_v \omega_v ( X_v(1) - w \times X_v(0) )^2

We then pick \omega to minimize the realized distance between the outcomes in the pre-treatment period:

\sum_{t<t^*} ( Y_{t,CA} - w_s(\omega) \times Y_{t,control} )^2
def loss_obj2(variable_weights):
    """
    variable_weights is the weighting of the variables entering the distance.

    Given this weighting we solve for the best state weights, then return the
    resulting overall distance on the pre-treatment outcomes.
    """

    def loss_within(state_weights):
        return(( variable_weights * (X1 - state_weights @ X0 )**2).sum())

    w0 = np.ones(ns)/ns
    res = minimize(loss_within, w0, constraints=sum_to_one, bounds=bounds)

    # collect the implied state weights
    weights = pd.DataFrame({'weight':res['x']},index=state_order)

    # compute the implied counterfactual on pre-treatment years
    data_pre = (data.query("state !='CA'").query('year < 1988')
        .join(weights,on="state")
        .assign(dep = lambda d: d['cigs'] * d['weight'])
        .groupby('year')
        ['dep'].agg('sum'))

    # squared distance between California and the counterfactual before 1988
    dist = ((data.query("state =='CA'").set_index('year').query('year < 1988')['cigs'] - data_pre)**2).sum()
    return(dist)

nv = X0.shape[1]
w2_0 = np.random.uniform(size=nv)
w2_0 = w2_0/w2_0.sum()

# variable weights are bounded in [0,1] and their sum is constrained to lie in [1,2]
bounds2 = Bounds(np.zeros(nv), np.ones(nv))
sum_to_one2 = LinearConstraint(np.ones(nv), [1], [2])
res2 = minimize(loss_obj2, w2_0, constraints=[sum_to_one2], bounds=bounds2)

res2
     fun: 158.99419348271718
     jac: array([-7.59670596e+04,  7.82092205e+04,  6.30907956e+04,  5.76088512e+04,
        4.44411431e+04, -6.96439667e+01,  5.65734923e+04])
 message: 'Optimization terminated successfully'
    nfev: 231
     nit: 18
    njev: 14
  status: 0
 success: True
       x: array([0.47950806, 0.00253885, 0.43252488, 0.00544681, 0.00329743,
       0.00317937, 0.08694906])
w2 = res2['x']

def loss_within(w):    
    return( ( w2 * (X1 - w @ X0 )**2).sum())

w0 = np.ones(ns)/ns
res3 = minimize(loss_within, w0, constraints=sum_to_one, bounds=bounds)
weights = pd.DataFrame({'weight':res3['x']},index=state_order)
weights.query('weight > 0.01')
state    weight
ID       0.037504
MT       0.200644
ND       0.142321
NV       0.248184
TX       0.015679
UT       0.349253
# we use the estimated weights to compute the counterfactual outcome.
data_t = (data.query("state !='CA'")
    .join(weights,on="state")
    .assign(dep = lambda d: d['cigs'] * d['weight'])
    .groupby('year')
    ['dep'].agg('sum'))

data_diff = data_cal.set_index('year')['cigs'] - data_t
data_diff = data_diff - data_diff[1988]
data_diff.plot()
plt.axvline(x=1988, linestyle=':',label="policy")
plt.axhline(0, linestyle='--',color='grey',label="policy")
plt.show()

[Figure: difference between California and the synthetic control with optimized variable weights, normalized to zero in 1988]

Taking a step back

Of course, now that we have used the pre-treatment outcome to form the control group, it becomes difficult to use the pre-trend to check the common trend assumption.

In each of these approaches, we are effectively building a model for the non-treated outcome. In DiD we simply use the change in the control group, i.e. we assume

Y_{it}(0) = \alpha_i + \gamma_t + u_{it}

The synthetic control, through its ability to pick the weights, constructs a more flexible model of the non-treated outcome. For instance:

Y_{it}(0) = \alpha_i + \gamma_t + \lambda_t \delta_i + u_{it}

The different parameters can be estimated using all the Y_{it}(0) at our disposal; this includes pre-treatment outcomes for D_i = 1 and all outcomes for D_i = 0.

A unit-specific trend would look like:

Y_{it}(0) = \alpha_i + \gamma_t + \delta_i \cdot t + u_{it}
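As a minimal sketch (my own illustration, not the specification estimated in the next cell), the unit-specific-trend model can be fit with statsmodels by interacting the state dummies with a linear year term; the common year effects \gamma_t are dropped here to keep the design full rank, and the object names (trend_model, data_ca, cigs_trend) are mine.

import pandas as pd
import statsmodels.formula.api as smf

data = pd.read_csv('../data/cigs.csv')
data['Dit'] = (data['state'] == 'CA') & (data['year'] >= 1988)

# Y_it(0) = alpha_i + delta_i * t, estimated on non-treated observations only
trend_model = smf.ols("cigs ~ C(state)*year", data=data.query("Dit == False")).fit()

# extrapolate California's estimated trend into the post-period as a counterfactual
data_ca = data.query("state == 'CA'").copy()
data_ca['cigs_trend'] = trend_model.predict(data_ca)
data_ca[['year', 'cigs', 'cigs_trend']].tail()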

# Fit a model with state and time 
# 'loggdppercap','retprice','pctage1524','beerpercap'
model = smf.ols("cigs ~ C(year) + C(state) ", data=data.query("Dit == False"))
model = model.fit()

# Predict the counterfactual
data_c = model.predict(data.query("state == 'CA'").assign(Dit=False))
data_did = data.query("state == 'CA'").assign(cigs_c = data_c)
data_did.eval("diff = cigs - cigs_c").plot(x='year',y="diff")

[Figure: difference between California's observed consumption and the two-way fixed-effects counterfactual]

A general view of the problem

In flexible terms, it looks like we are searching for a model of the form

E[ Y_{it}(0) | X ] = h(\alpha_i,\lambda_t,X_{it})
  • One can tackle this issue directly and jointly think about identification and overfitting.
  • Common parameters and individual-specific parameters can be estimated using pre-treatment data from the treated and all data from the controls.

  • One can add the pre-treatment values to the conditioning set (see the sketch after this list).

  • Instead, one can try to estimate unit-specific parameters.
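For the bullet on adding pre-treatment values to the conditioning set, here is one hedged sketch of what this could look like: fit the mapping from pre-period sales to post-period consumption on the control states, then apply it to California. The columns cigsalespercap75, cigsalespercap80 and cigsalespercap88 are the pre-period variables already used above; pooling all post-treatment years (and ignoring year effects), as well as the object names, are simplifications on my part.

import pandas as pd
import statsmodels.formula.api as smf

data = pd.read_csv('../data/cigs.csv')

# fit the mapping on control states and post-treatment years only
controls_post = data.query("state != 'CA' & year > 1988")
m_pre = smf.ols("cigs ~ cigsalespercap75 + cigsalespercap80 + cigsalespercap88",
                data=controls_post).fit()

# apply the fitted mapping to California to obtain an implied non-treated outcome
ca_post = data.query("state == 'CA' & year > 1988").copy()
ca_post['cigs_hat0'] = m_pre.predict(ca_post)
(ca_post['cigs'] - ca_post['cigs_hat0']).mean()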
data_fit = data.query("Dit == False")

from patsy import dmatrices, dmatrix
outcome, predictors = dmatrices("cigs ~ C(year) + C(state)", data=data_fit)

# number of columns in the design matrix (avoid overwriting the numpy alias np)
n_pred = predictors.shape[1]

from keras.models import Sequential
from keras.layers import Dense, Activation
from keras.optimizers import Adam

# we then create a neural net model that takes these variables as inputs
model3 = Sequential()
model3.add(Dense(20, input_shape=(n_pred,)))
model3.add(Activation('elu'))
model3.add(Dense(1))
model3.add(Activation('elu'))
model3.add(Dense(1))
model3.summary()

# compile and fit the model
model3.compile(optimizer=Adam(learning_rate=0.1), loss='mse')
hist = model3.fit(predictors, outcome, epochs=100, batch_size=50, verbose=1)
Model: "sequential_12"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
dense_36 (Dense)             (None, 20)                1400      
_________________________________________________________________
activation_24 (Activation)   (None, 20)                0         
_________________________________________________________________
dense_37 (Dense)             (None, 1)                 21        
_________________________________________________________________
activation_25 (Activation)   (None, 1)                 0         
_________________________________________________________________
dense_38 (Dense)             (None, 1)                 2         
=================================================================
Total params: 1,423
Trainable params: 1,423
Non-trainable params: 0
_________________________________________________________________
Epoch 1/100
24/24 [==============================] - 0s 5ms/step - loss: 5270.6152
Epoch 2/100
24/24 [==============================] - 0s 3ms/step - loss: 622.4312
...
Epoch 99/100
24/24 [==============================] - 0s 3ms/step - loss: 14.7545
Epoch 100/100
24/24 [==============================] - 0s 4ms/step - loss: 15.2182
# use the fitted model to predict the entire history for California
outcome_c, predictors_c = dmatrices("cigs ~ C(year) + C(state)", data=data.eval("Dit = False"))
data['cigs_c'] = model3.predict(predictors_c)

# we plot treated and control for CA

ax = data.query("state=='CA'").plot(x='year',y='cigs',label = "CA - data")
data.query("state=='CA'").plot(x='year',y='cigs_c',ax=ax,label="CA - predicted control")

plt.show()

[Figure: observed cigarette consumption in California vs. the neural-net predicted control series]

data.query("state=='CA'").eval('diff = cigs - cigs_c').plot(x='year',y='diff')
plt.show()

[Figure: difference between California's observed consumption and the neural-net predicted control]
