state <fct> | year <fct> | deaths <dbl> | cell_plans <dbl> |
---|---|---|---|
Alabama | 2012 | 13.316056 | 9433.800 |
Alaska | 2012 | 12.311976 | 8872.799 |
Arizona | 2012 | 13.720419 | 8810.889 |
Arkansas | 2012 | 16.466730 | 10047.027 |
California | 2012 | 8.756507 | 9362.424 |
Colorado | 2012 | 10.092204 | 9403.225 |

state <fct> | year <fct> | deaths <dbl> | cell_plans <dbl> |
---|---|---|---|
Maryland | 2007 | 10.866679 | 8942.137 |
Maryland | 2008 | 10.740963 | 9290.689 |
Maryland | 2009 | 9.892754 | 9339.452 |
Maryland | 2010 | 8.783883 | 9630.120 |
Maryland | 2011 | 8.626745 | 10335.795 |
Maryland | 2012 | 8.941916 | 10393.295 |

state <fct> | year <fct> | deaths <dbl> | cell_plans <dbl> |
---|---|---|---|
Alabama | 2007 | 18.075232 | 8135.525 |
Alabama | 2008 | 16.289227 | 8494.391 |
Alabama | 2009 | 13.833678 | 8979.108 |
Alabama | 2010 | 13.434084 | 9054.894 |
Alabama | 2011 | 13.771989 | 9340.501 |
Alabama | 2012 | 13.316056 | 9433.800 |
Alaska | 2007 | 16.301184 | 6730.282 |
Alaska | 2008 | 12.744090 | 5580.707 |
Alaska | 2009 | 12.973849 | 8389.730 |
Alaska | 2010 | 11.670893 | 8560.595 |

Panel or longitudinal data contains repeated observations over time ($t$) on multiple individuals or groups ($i$)
Thus, our regression equation looks like:
\begin{align} \hat{Y}_{\color{red}{i}\color{blue}{t}} = \beta_0 + \beta_1 X_{\color{red}{i}\color{blue}{t}} + u_{\color{red}{i}\color{blue}{t}} \end{align}
for individual $i$ in time $t$.
Example: Do cell phones cause more traffic fatalities?
No measure of cell phone use while driving is available
We use `cell_plans` as a proxy for cell phone usage
State-level data over 6 years
glimpse(phones)
## Rows: 306
## Columns: 8
## $ year          <fct> 2007, 2007, 2007, 2007, 2007, 2007, 2007, 2007, 2007, 20…
## $ state         <fct> Alabama, Alaska, Arizona, Arkansas, California, Colorado…
## $ urban_percent <dbl> 30, 55, 45, 21, 54, 34, 84, 31, 100, 53, 39, 45, 11, 56,…
## $ cell_plans    <dbl> 8135.525, 6730.282, 7572.465, 8071.125, 8821.933, 8162.0…
## $ cell_ban      <fct> 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
## $ text_ban      <fct> 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
## $ deaths        <dbl> 18.075232, 16.301184, 16.930578, 19.595430, 12.104340, 1…
## $ year_num      <dbl> 2007, 2007, 2007, 2007, 2007, 2007, 2007, 2007, 2007, 20…
phones %>% count(year)

year <fct> | n <int> |
---|---|
2007 | 51 |
2008 | 51 |
2009 | 51 |
2010 | 51 |
2011 | 51 |
2012 | 51 |

phones %>% summarize(States = n_distinct(state), Years = n_distinct(year))

States <int> | Years <int> |
---|---|
51 | 6 |

\begin{align} \hat{Y}_{it} = \beta_0 + \beta_1 X_{it} + u_{it} \end{align}
pooled <- lm(deaths ~ cell_plans, data = phones)
pooled %>% tidy()

term <chr> | estimate <dbl> | std.error <dbl> | statistic <dbl> | p.value <dbl> |
---|---|---|---|---|
(Intercept) | 17.3371034167 | 0.975384504 | 17.774635 | 5.821724e-49 |
cell_plans | -0.0005666385 | 0.000106975 | -5.296926 | 2.264086e-07 |

# scatterplot of deaths against cell phone plans
ggplot(data = phones)+
  aes(x = cell_plans, y = deaths)+
  geom_point()+
  labs(x = "Cell Phones Per 10,000 People",
       y = "Deaths Per Billion Miles Driven")+
  theme_bw(base_family = "Fira Sans Condensed", base_size = 14)

# same scatterplot with the pooled regression line added
ggplot(data = phones)+
  aes(x = cell_plans, y = deaths)+
  geom_point()+
  geom_smooth(method = "lm", color = "red")+
  labs(x = "Cell Phones Per 10,000 People",
       y = "Deaths Per Billion Miles Driven")+
  theme_bw(base_family = "Fira Sans Condensed", base_size = 14)
1) The expected value of the residuals is 0: $E[u] = 0$
2) The variance of the residuals over $X$ is constant: $var(u|X) = \sigma^2_u$
3) Errors are not correlated across observations: $cor(u_i, u_j) = 0 \quad \forall i \neq j$
4) There is no correlation between $X$ and the error term: $cor(X, u) = 0$ or $E[u|X] = 0$
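\begin{align} \hat{Y}_{it} = \beta_0 + \beta_1 X_{it} + \epsilon_{it} \end{align}
Assumption 3: $cor(u_i, u_j) = 0 \quad \forall i \neq j$
Pooled regression model is biased because it ignores that the data contain multiple observations from the same group $i$ and multiple observations from the same time period $t$
Thus, errors are serially or auto-correlated: $cor(u_i, u_j) \neq 0$ within the same $i$ and within the same $t$
\begin{align} \widehat{\text{Deaths}}_{it} = \beta_0 + \beta_1 \text{Cell Phones}_{it} + u_{it} \end{align}
Multiple observations from same state $i$
Multiple observations from same year $t$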
# five example states, one regression line per state
phones %>%
  filter(state %in% c("District of Columbia", "Maryland", "Texas", "California", "Kansas")) %>%
  ggplot(data = .)+
  aes(x = cell_plans, y = deaths, color = state)+
  geom_point()+
  geom_smooth(method = "lm")+
  labs(x = "Cell Phones Per 10,000 People",
       y = "Deaths Per Billion Miles Driven",
       color = NULL)+
  theme_bw(base_family = "Fira Sans Condensed", base_size = 14)+
  theme(legend.position = "top")

# same five states, one panel per state
phones %>%
  filter(state %in% c("District of Columbia", "Maryland", "Texas", "California", "Kansas")) %>%
  ggplot(data = .)+
  aes(x = cell_plans, y = deaths, color = state)+
  geom_point()+
  geom_smooth(method = "lm")+
  labs(x = "Cell Phones Per 10,000 People",
       y = "Deaths Per Billion Miles Driven",
       color = NULL)+
  theme_bw(base_family = "Fira Sans Condensed", base_size = 14)+
  theme(legend.position = "none")+
  facet_wrap(~state, ncol = 3)

# all states, one panel per state
ggplot(data = phones)+
  aes(x = cell_plans, y = deaths, color = state)+
  geom_point()+
  geom_smooth(method = "lm")+
  labs(x = "Cell Phones Per 10,000 People",
       y = "Deaths Per Billion Miles Driven",
       color = NULL)+
  theme_bw(base_family = "Fira Sans Condensed")+
  theme(legend.position = "none")+
  facet_wrap(~state, ncol = 7)
\begin{align} \widehat{\text{Deaths}}_{it} = \beta_0 + \beta_1 \text{Cell Phones}_{it} + u_{it} \end{align}
\begin{align} cor(u_{it}, \text{Cell Phones}_{it}) \neq 0 \qquad E[u_{it} | \text{Cell Phones}_{it}] \neq 0 \end{align}
A simple pooled model likely contains lots of omitted variable bias
Many (often unobservable) factors that determine both Phones & Deaths
But the beauty of this is that most of these factors systematically vary by U.S. State and are stable over time!
We can simply “control for State” to safely remove the influence of all of these factors!
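A minimal sketch of this idea on simulated data (not the `phones` data): all names below (`alpha`, `group`, `true_beta`, and so on) are hypothetical. An unobserved, time-invariant group factor `alpha` drives both `x` and `y`, so the pooled slope is pulled away from the true value, while adding group dummies removes its influence.

```r
# Simulated panel (hypothetical data, for illustration only)
set.seed(42)
n_groups  <- 50   # groups i
n_periods <- 6    # time periods t
group <- factor(rep(seq_len(n_groups), each = n_periods))

# alpha: unobserved group-level factor that is constant over time
alpha <- rep(rnorm(n_groups, mean = 0, sd = 2), each = n_periods)

# x is correlated with alpha, so omitting alpha biases the pooled slope
x <- alpha + rnorm(n_groups * n_periods)
true_beta <- -1
y <- 2 + true_beta * x + 3 * alpha + rnorm(n_groups * n_periods)

pooled_fit <- lm(y ~ x)          # ignores group membership
fe_fit     <- lm(y ~ x + group)  # "controls for group" with dummies

pooled_slope <- unname(coef(pooled_fit)["x"])
fe_slope     <- unname(coef(fe_fit)["x"])

# compare both estimates against true_beta
c(pooled = pooled_slope, fixed_effects = fe_slope)
```

In this setup the fixed effects slope should land close to `true_beta`, while the pooled slope should not. The gap depends entirely on how strongly the simulated `alpha` is tied to `x`.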
Much of the endogeneity in $X_{it}$ can be explained by systematic differences across $i$ (groups)
Exploit the systematic variation across groups with a fixed effects model
Decompose the model error term into two parts:
\begin{align} u_{it} = \alpha_i + \epsilon_{it} \end{align}
$\alpha_i$ are group-specific fixed effects
This includes all factors that do not change within group $i$ over time
$\epsilon_{it}$ is the remaining random error
$\epsilon_{it}$ includes all other factors affecting $Y_{it}$ not contained in group effect $\alpha_i$
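\begin{align} \hat{Y}_{it} = \beta_0 + \beta_1 X_{it} + \alpha_i + \epsilon_{it} \end{align}
We've pulled $\alpha_i$ out of the original error term into the regression
Essentially we’ll estimate an intercept for each group (minus one, which is $\beta_0$)
Must have multiple observations (over time) for each group (i.e. panel data)
\begin{align} \widehat{\text{Deaths}}_{it} = \beta_0 + \beta_1 \text{Cell phones}_{it} + \alpha_i + \epsilon_{it} \end{align}
$\alpha_i$ is the State fixed effect
There could still be factors in $\epsilon_{it}$ that are correlated with $\text{Cell phones}_{it}$!
\begin{align} \hat{Y}_{it} = \beta_0 + \beta_1 X_{it} + \alpha_i + \epsilon_{it} \end{align}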
Least Squares Dummy Variable (LSDV) approach
De-meaned data approach
\begin{align} \hat{Y}_{it} = \beta_0 + \beta_1 X_{it} + \beta_2 D_{1i} + \beta_3 D_{2i} + \cdots + \beta_N D_{(N-1)i} + \epsilon_{it} \end{align}
Example: \begin{align} \widehat{\text{Deaths}}_{it} = \beta_0 + \beta_1 \text{Cell Phones}_{it} + \text{Alaska}_i + \cdots + \text{Wyoming}_i \end{align}
If `state` is a `factor` variable, just include it in the regression
`R` automatically creates $N-1$ dummy variables and includes them in the regression
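To see the dummy variables R builds from a factor, one option is to inspect the design matrix directly. This is a small sketch on a hypothetical toy data frame (`toy`, with groups A to D), not the `phones` data. It relies on R's default treatment contrasts, which drop the first level as the reference group.

```r
# Hypothetical toy data: 4 groups observed twice each
toy <- data.frame(
  group = factor(rep(c("A", "B", "C", "D"), each = 2)),
  x     = c(1, 2, 3, 4, 5, 6, 7, 8)
)

# model.matrix() returns the design matrix lm() would build for this formula
design <- model.matrix(~ x + group, data = toy)

# N = 4 groups gives N - 1 = 3 dummies; group "A" is absorbed by the intercept
colnames(design)
# → "(Intercept)" "x" "groupB" "groupC" "groupD"
```

Each dummy's coefficient is then read as that group's intercept shift relative to the omitted reference group.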
fe_reg_1 <- lm(deaths ~ cell_plans + state, data = phones)
fe_reg_1 %>% tidy()

term <chr> | estimate <dbl> | std.error <dbl> | statistic <dbl> | p.value <dbl> |
---|---|---|---|---|
(Intercept) | 25.507679925 | 1.0176400289 | 25.06552337 | 1.241581e-70 |
cell_plans | -0.001203742 | 0.0001013125 | -11.88147584 | 3.483442e-26 |
stateAlaska | -2.484164783 | 0.6745076282 | -3.68293060 | 2.816972e-04 |
stateArizona | -1.510577383 | 0.6704569688 | -2.25305643 | 2.510925e-02 |
stateArkansas | 3.192662931 | 0.6664383936 | 4.79063476 | 2.829319e-06 |
stateCalifornia | -4.978668651 | 0.6655467951 | -7.48056889 | 1.206933e-12 |
stateColorado | -4.344553493 | 0.6654735335 | -6.52851432 | 3.588784e-10 |
stateConnecticut | -6.595185530 | 0.6654428902 | -9.91097152 | 8.698802e-20 |
stateDelaware | -2.098393628 | 0.6666483193 | -3.14767707 | 1.842218e-03 |
stateDistrict of Columbia | 6.355790010 | 1.2897172620 | 4.92804911 | 1.499627e-06 |
Alternatively, we can control our regression for group fixed effects without directly estimating them
We simply de-mean the data for each group to remove the group fixed effect
For each group $i$, find the means (over time, $t$): $\bar{Y}_i = \beta_0 + \beta_1 \bar{X}_i + \bar{\alpha}_i + \bar{\epsilon}_i$
Subtract the group means from the original model, where $u_{it} = \alpha_i + \epsilon_{it}$. Because $\alpha_i$ does not change over time, $\bar{\alpha}_i = \alpha_i$, so the group effect (along with $\beta_0$) cancels out:
\begin{align}
\hat{Y}_{it} &= \beta_0 + \beta_1 X_{it} + u_{it} \\
\bar{Y}_i &= \beta_0 + \beta_1 \bar{X}_i + \bar{\alpha}_i + \bar{\epsilon}_i \\
Y_{it} - \bar{Y}_i &= \beta_1 (X_{it} - \bar{X}_i) + \tilde{\epsilon}_{it} \\
\tilde{Y}_{it} &= \beta_1 \tilde{X}_{it} + \tilde{\epsilon}_{it}
\end{align}
Within each group $i$, the de-meaned variables $\tilde{Y}_{it}$ and $\tilde{X}_{it}$ all have a mean of 0†
Variables that don't change over time will drop out of analysis altogether
Removes any source of variation across groups (all now have mean of 0) to only work with variation within each group
† Recall Rule 4 from the 2.3 class notes on the Summation Operator: $\sum (X_i - \bar{X}) = 0$
\begin{align} \tilde{Y}_{it} = \beta_1 \tilde{X}_{it} + \tilde{\epsilon}_{it} \end{align}
Yields identical results to dummy variable approach
More useful when we have many groups (would be many dummies)
Demonstrates intuition behind fixed effects:
We are basically comparing groups to themselves over time
Ignore all differences between groups, only look at differences within groups over time
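The equivalence between the two approaches can be checked on a small simulated panel. This is a sketch under stated assumptions: the data and all names (`panel`, `y_tilde`, `x_tilde`, and so on) are made up for illustration, and base R's `ave()` is used to compute each group's mean.

```r
# Simulated balanced panel (hypothetical data, for illustration only)
set.seed(1)
n_groups  <- 20
n_periods <- 5
panel <- data.frame(
  group = factor(rep(seq_len(n_groups), each = n_periods)),
  x     = rnorm(n_groups * n_periods)
)
# one time-invariant effect per group, repeated for each of its observations
group_effect <- rnorm(n_groups)[as.integer(panel$group)]
panel$y <- 1 + 0.5 * panel$x + group_effect + rnorm(nrow(panel))

# De-mean: subtract each group's own mean (ave() returns group means by default)
panel$y_tilde <- panel$y - ave(panel$y, panel$group)
panel$x_tilde <- panel$x - ave(panel$x, panel$group)

# LSDV: one dummy per group (minus one)
lsdv_slope <- unname(coef(lm(y ~ x + group, data = panel))["x"])

# De-meaned regression: no intercept, since all group means are now 0
within_slope <- unname(coef(lm(y_tilde ~ 0 + x_tilde, data = panel))["x_tilde"])

# the two slope estimates agree up to floating-point error
all.equal(lsdv_slope, within_slope)
```

Only the point estimates are compared here. The standard errors that `lm()` reports for the de-meaned regression do not account for the group means having been estimated, which is one reason to prefer a dedicated fixed effects routine in practice.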
# get means of Y and X by state
means_state <- phones %>%
  group_by(state) %>%
  summarize(avg_deaths = mean(deaths),
            avg_phones = mean(cell_plans))
# look at it
means_state

state <fct> | avg_deaths <dbl> | avg_phones <dbl> |
---|---|---|
Alabama | 14.786711 | 8906.370 |
Alaska | 13.612953 | 7817.759 |
Arizona | 14.249825 | 8097.482 |
Arkansas | 17.543881 | 9268.153 |
California | 9.659712 | 9029.594 |
Colorado | 10.351405 | 8981.762 |
Connecticut | 8.141739 | 8947.729 |
Delaware | 12.209610 | 9304.052 |
District of Columbia | 8.015895 | 19811.205 |
Florida | 13.544635 | 9078.592 |
ggplot(data = means_state)+
  aes(x = fct_reorder(state, avg_deaths), y = avg_deaths, color = state)+
  geom_point()+
  geom_segment(aes(y = 0, yend = avg_deaths, x = state, xend = state))+
  coord_flip()+
  # x is mapped to state (shown on the vertical axis after coord_flip())
  labs(x = "State",
       y = "Deaths Per Billion Miles Driven",
       color = NULL)+
  theme_bw(base_family = "Fira Sans Condensed", base_size = 10)+
  theme(legend.position = "none")
The `fixest` package is designed for running regressions with fixed effects
`feols()` function is just like `lm()`, with some additional arguments:
#install.packages("fixest")
library(fixest)
fe_reg_1_alt <- feols(deaths ~ cell_plans | state, data = phones)
fe_reg_1_alt %>% summary()
## OLS estimation, Dep. Var.: deaths
## Observations: 306
## Fixed-effects: state: 51
## Standard-errors: Clustered (state)
##             Estimate Std. Error  t value  Pr(>|t|)
## cell_plans -0.001204   0.000143 -8.41708 3.792e-11 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## RMSE: 1.05007   Adj. R2: 0.886524
##               Within R2: 0.357238
# or using broom's tidy()
fe_reg_1_alt %>% tidy()

term <chr> | estimate <dbl> | std.error <dbl> | statistic <dbl> | p.value <dbl> |
---|---|---|---|---|
cell_plans | -0.001203742 | 0.0001430118 | -8.417077 | 3.791955e-11 |
State fixed effect controls for all factors that vary by state but are stable over time
But there are still other (often unobservable) factors that affect both Phones and Deaths, that don’t vary by State
If these factors systematically vary over time, but are the same by State, then we can “control for Year” to safely remove the influence of all of these factors!
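A one-way fixed effects model estimates a fixed effect for groups
Two-way fixed effects model estimates fixed effects for both groups and time periods
\begin{align} \hat{Y}_{it} = \beta_0 + \beta_1 X_{it} + \alpha_i + \theta_t + \nu_{it} \end{align}
$\alpha_i$: group fixed effects
$\theta_t$: time fixed effects
$\nu_{it}$: remaining random error
\begin{align} \widehat{\text{Deaths}}_{it} = \beta_0 + \beta_1 \text{Cell phones}_{it} + \alpha_i + \theta_t + \nu_{it} \end{align}
$\alpha_i$: State fixed effects
$\theta_t$: Year fixed effects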
# find averages for years
means_year <- phones %>%
  group_by(year) %>%
  summarize(avg_deaths = mean(deaths),
            avg_phones = mean(cell_plans))
means_year

year <fct> | avg_deaths <dbl> | avg_phones <dbl> |
---|---|---|
2007 | 14.00751 | 8064.531 |
2008 | 12.87156 | 8482.903 |
2009 | 12.08632 | 8859.706 |
2010 | 11.61487 | 9134.592 |
2011 | 11.36431 | 9485.238 |
2012 | 11.65666 | 9660.474 |

ggplot(data = phones)+
  aes(x = year, y = deaths)+
  geom_point(aes(color = year))+
  # Add the yearly means as black points
  geom_point(data = means_year, aes(x = year, y = avg_deaths), size = 3, color = "black")+
  # connect the means with a line
  geom_line(data = means_year, aes(x = as.numeric(year), y = avg_deaths), color = "black", size = 1)+
  theme_bw(base_family = "Fira Sans Condensed", base_size = 14)+
  theme(legend.position = "none")
\begin{align} \hat{Y}_{it} = \beta_0 + \beta_1 X_{it} + \alpha_i + \theta_t + \nu_{it} \end{align}
1) Least Squares Dummy Variable (LSDV) Approach: add dummies for both groups and time periods (separate intercepts for groups and times)
2) Fully De-meaned data: $\tilde{Y}_{it} = \beta_1 \tilde{X}_{it} + \tilde{\nu}_{it}$
where for each variable: $\widetilde{var}_{it} = var_{it} - \overline{var}_t - \overline{var}_i$
3) Hybrid: de-mean for one effect (groups or years) and add dummies for the other effect (years or groups)
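The three approaches can be compared on a simulated balanced panel. This is a sketch, not the `phones` analysis: the data and all names (`panel`, `demean2`, `y_g`, and so on) are hypothetical. The fully de-meaned regression below keeps an intercept, an assumption made here so that the constant left over after subtracting both sets of means is absorbed.

```r
# Simulated balanced panel (hypothetical data, for illustration only)
set.seed(7)
n_groups  <- 15
n_periods <- 6
panel <- expand.grid(period = factor(seq_len(n_periods)),
                     group  = factor(seq_len(n_groups)))
panel$x <- rnorm(nrow(panel)) + as.integer(panel$group) / 5
panel$y <- 1 - 0.3 * panel$x +
  rnorm(n_groups)[as.integer(panel$group)] +    # group fixed effects
  rnorm(n_periods)[as.integer(panel$period)] +  # time fixed effects
  rnorm(nrow(panel))

# 1) LSDV: dummies for both groups and time periods
lsdv <- unname(coef(lm(y ~ x + group + period, data = panel))["x"])

# 2) Fully de-meaned: subtract the group mean and the period mean
demean2 <- function(v, g, t) v - ave(v, g) - ave(v, t)
panel$y_tilde <- demean2(panel$y, panel$group, panel$period)
panel$x_tilde <- demean2(panel$x, panel$group, panel$period)
demeaned <- unname(coef(lm(y_tilde ~ x_tilde, data = panel))["x_tilde"])

# 3) Hybrid: de-mean by group, add dummies for time periods
panel$y_g <- panel$y - ave(panel$y, panel$group)
panel$x_g <- panel$x - ave(panel$x, panel$group)
hybrid <- unname(coef(lm(y_g ~ x_g + period, data = panel))["x_g"])

# all three slope estimates
c(lsdv = lsdv, demeaned = demeaned, hybrid = hybrid)
```

The three slopes coincide here because the simulated panel is balanced (every group is observed in every period). With an unbalanced panel, simple subtraction of means no longer reproduces the LSDV estimate.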
fe2_reg_1 <- lm(deaths ~ cell_plans + state + year, data = phones)
fe2_reg_1 %>% tidy()

term <chr> | estimate <dbl> | std.error <dbl> | statistic <dbl> | p.value <dbl> |
---|---|---|---|---|
(Intercept) | 18.9304707399 | 1.4511323962 | 13.0453092 | 5.427406e-30 |
cell_plans | -0.0002995294 | 0.0001723149 | -1.7382677 | 8.339982e-02 |
stateAlaska | -1.4998292482 | 0.6241082951 | -2.4031554 | 1.698648e-02 |
stateArizona | -0.7791714713 | 0.6113519094 | -1.2745057 | 2.036724e-01 |
stateArkansas | 2.8655344756 | 0.5985062952 | 4.7878101 | 2.895040e-06 |
stateCalifornia | -5.0900897113 | 0.5956293282 | -8.5457338 | 1.299236e-15 |
stateColorado | -4.4127241692 | 0.5953924847 | -7.4114543 | 1.945083e-12 |
stateConnecticut | -6.6325834801 | 0.5952933996 | -11.1417051 | 1.169797e-23 |
stateDelaware | -2.4579829953 | 0.5991822226 | -4.1022295 | 5.546475e-05 |
stateDistrict of Columbia | -3.5044963616 | 1.9710939218 | -1.7779449 | 7.663326e-02 |
fe2_reg_2 <- feols(deaths ~ cell_plans | state + year, data = phones)
fe2_reg_2 %>% summary()
## OLS estimation, Dep. Var.: deaths
## Observations: 306
## Fixed-effects: state: 51, year: 6
## Standard-errors: Clustered (state)
##            Estimate Std. Error   t value Pr(>|t|)
## cell_plans   -3e-04   0.000305 -0.980739  0.33144
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## RMSE: 0.930036   Adj. R2: 0.909197
##                Within R2: 0.011989
fe2_reg_2 %>% tidy()

term <chr> | estimate <dbl> | std.error <dbl> | statistic <dbl> | p.value <dbl> |
---|---|---|---|---|
cell_plans | -0.0002995294 | 0.0003054118 | -0.9807394 | 0.3314431 |
State fixed effect absorbs all unobserved factors that vary by state, but are constant over time
Year fixed effect absorbs all unobserved factors that vary by year, but are constant across States
But there are still other (often unobservable) factors that affect both Phones and Deaths, that vary by State and change over time!
We will also need to control for these variables (not picked up by fixed effects!)
\begin{align} \widehat{\text{Deaths}}_{it} = \beta_1 \text{Cell Phones}_{it} + \alpha_i + \theta_t + \text{urban pct}_{it} + \text{cell ban}_{it} + \text{text ban}_{it} \end{align}
fe2_controls_reg <- feols(deaths ~ cell_plans + text_ban + urban_percent + cell_ban | state + year,
                          data = phones)
fe2_controls_reg %>% summary()
## OLS estimation, Dep. Var.: deaths
## Observations: 306
## Fixed-effects: state: 51, year: 6
## Standard-errors: Clustered (state)
##                Estimate Std. Error  t value Pr(>|t|)
## cell_plans    -0.000340   0.000277 -1.22780 0.225269
## text_ban1      0.255926   0.243444  1.05127 0.298188
## urban_percent  0.013135   0.009815  1.33822 0.186878
## cell_ban1     -0.679796   0.335655 -2.02528 0.048194 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## RMSE: 0.920123   Adj. R2: 0.910039
##                Within R2: 0.032939
fe2_controls_reg %>% tidy()

term <chr> | estimate <dbl> | std.error <dbl> | statistic <dbl> | p.value <dbl> |
---|---|---|---|---|
cell_plans | -0.0003403735 | 0.0002772212 | -1.227805 | 0.22526919 |
text_ban1 | 0.2559261569 | 0.2434442111 | 1.051272 | 0.29818803 |
urban_percent | 0.0131347657 | 0.0098150705 | 1.338224 | 0.18687751 |
cell_ban1 | -0.6797956522 | 0.3356553662 | -2.025279 | 0.04819377 |
library(huxtable)
huxreg("Pooled" = pooled,
       "State Effects" = fe_reg_1,
       "State & Year Effects" = fe2_reg_1,
       "With Controls" = fe2_controls_reg,
       coefs = c("Intercept" = "(Intercept)",
                 "Cell phones" = "cell_plans",
                 "Cell Ban" = "cell_ban1",
                 "Texting Ban" = "text_ban1",
                 "Urbanization Rate" = "urban_percent"),
       statistics = c("N" = "nobs",
                      "R-Squared" = "r.squared",
                      "SER" = "sigma"),
       number_format = 4)

| | Pooled | State Effects | State & Year Effects | With Controls |
|---|---|---|---|---|
| Intercept | 17.3371 *** | 25.5077 *** | 18.9305 *** | |
| | (0.9754) | (1.0176) | (1.4511) | |
| Cell phones | -0.0006 *** | -0.0012 *** | -0.0003 | -0.0003 |
| | (0.0001) | (0.0001) | (0.0002) | (0.0003) |
| Cell Ban | | | | -0.6798 * |
| | | | | (0.3357) |
| Texting Ban | | | | 0.2559 |
| | | | | (0.2434) |
| Urbanization Rate | | | | 0.0131 |
| | | | | (0.0098) |
| N | 306 | 306 | 306 | 306 |
| R-Squared | 0.0845 | 0.9055 | 0.9259 | 0.9274 |
| SER | 3.2791 | 1.1526 | 1.0310 | 1.0262 |

*** p < 0.001; ** p < 0.01; * p < 0.05.
ABCDEFGHIJ0123456789 |
state <fct> | year <fct> | deaths <dbl> | cell_plans <dbl> |
---|---|---|---|
Alabama | 2012 | 13.316056 | 9433.800 |
Alaska | 2012 | 12.311976 | 8872.799 |
Arizona | 2012 | 13.720419 | 8810.889 |
Arkansas | 2012 | 16.466730 | 10047.027 |
California | 2012 | 8.756507 | 9362.424 |
Colorado | 2012 | 10.092204 | 9403.225 |
ABCDEFGHIJ0123456789 |
state <fct> | year <fct> | deaths <dbl> | cell_plans <dbl> |
---|---|---|---|
Alabama | 2012 | 13.316056 | 9433.800 |
Alaska | 2012 | 12.311976 | 8872.799 |
Arizona | 2012 | 13.720419 | 8810.889 |
Arkansas | 2012 | 16.466730 | 10047.027 |
California | 2012 | 8.756507 | 9362.424 |
Colorado | 2012 | 10.092204 | 9403.225 |
ABCDEFGHIJ0123456789 |
state <fct> | year <fct> | deaths <dbl> | cell_plans <dbl> |
---|---|---|---|
Maryland | 2007 | 10.866679 | 8942.137 |
Maryland | 2008 | 10.740963 | 9290.689 |
Maryland | 2009 | 9.892754 | 9339.452 |
Maryland | 2010 | 8.783883 | 9630.120 |
Maryland | 2011 | 8.626745 | 10335.795 |
Maryland | 2012 | 8.941916 | 10393.295 |
ABCDEFGHIJ0123456789 |
state <fct> | year <fct> | deaths <dbl> | cell_plans <dbl> |
---|---|---|---|
Alabama | 2007 | 18.075232 | 8135.525 |
Alabama | 2008 | 16.289227 | 8494.391 |
Alabama | 2009 | 13.833678 | 8979.108 |
Alabama | 2010 | 13.434084 | 9054.894 |
Alabama | 2011 | 13.771989 | 9340.501 |
Alabama | 2012 | 13.316056 | 9433.800 |
Alaska | 2007 | 16.301184 | 6730.282 |
Alaska | 2008 | 12.744090 | 5580.707 |
Alaska | 2009 | 12.973849 | 8389.730 |
Alaska | 2010 | 11.670893 | 8560.595 |
ABCDEFGHIJ0123456789 |
state <fct> | year <fct> | deaths <dbl> | cell_plans <dbl> |
---|---|---|---|
Alabama | 2007 | 18.075232 | 8135.525 |
Alabama | 2008 | 16.289227 | 8494.391 |
Alabama | 2009 | 13.833678 | 8979.108 |
Alabama | 2010 | 13.434084 | 9054.894 |
Alabama | 2011 | 13.771989 | 9340.501 |
Alabama | 2012 | 13.316056 | 9433.800 |
Alaska | 2007 | 16.301184 | 6730.282 |
Alaska | 2008 | 12.744090 | 5580.707 |
Alaska | 2009 | 12.973849 | 8389.730 |
Alaska | 2010 | 11.670893 | 8560.595 |
Panel or Longitudinal data contains
Thus, our regression equation looks like:
\begin{align} \hat{Y}_{\color{red}{i}\color{blue}{t}}} = \beta_0 + \beta_1 X_{\color{red}{i}\color{blue}{t}} + u_{\color{red}{i}\color{blue}{t}} \end{align}
for individual i in time t.
ABCDEFGHIJ0123456789 |
state <fct> | year <fct> | deaths <dbl> | cell_plans <dbl> |
---|---|---|---|
Alabama | 2007 | 18.075232 | 8135.525 |
Alabama | 2008 | 16.289227 | 8494.391 |
Alabama | 2009 | 13.833678 | 8979.108 |
Alabama | 2010 | 13.434084 | 9054.894 |
Alabama | 2011 | 13.771989 | 9340.501 |
Alabama | 2012 | 13.316056 | 9433.800 |
Alaska | 2007 | 16.301184 | 6730.282 |
Alaska | 2008 | 12.744090 | 5580.707 |
Alaska | 2009 | 12.973849 | 8389.730 |
Alaska | 2010 | 11.670893 | 8560.595 |
Example: Do cell phones cause more traffic fatalities?
No measure of cell phones used while driving
cell_plans
as a proxy for cell phone usageState-level data over 6 years
glimpse(phones)
## Rows: 306## Columns: 8## $ year <fct> 2007, 2007, 2007, 2007, 2007, 2007, 2007, 2007, 2007, 20…## $ state <fct> Alabama, Alaska, Arizona, Arkansas, California, Colorado…## $ urban_percent <dbl> 30, 55, 45, 21, 54, 34, 84, 31, 100, 53, 39, 45, 11, 56,…## $ cell_plans <dbl> 8135.525, 6730.282, 7572.465, 8071.125, 8821.933, 8162.0…## $ cell_ban <fct> 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…## $ text_ban <fct> 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…## $ deaths <dbl> 18.075232, 16.301184, 16.930578, 19.595430, 12.104340, 1…## $ year_num <dbl> 2007, 2007, 2007, 2007, 2007, 2007, 2007, 2007, 2007, 20…
phones %>% count(year)
ABCDEFGHIJ0123456789 |
year <fct> | n <int> |
---|---|
2007 | 51 |
2008 | 51 |
2009 | 51 |
2010 | 51 |
2011 | 51 |
2012 | 51 |
phones %>% summarize(States = n_distinct(state), Years = n_distinct(year))
ABCDEFGHIJ0123456789 |
States <int> | Years <int> |
---|---|
51 | 6 |
^Yit=β0+β1Xit+uit
^Yit=β0+β1Xit+uit
pooled <- lm(deaths ~ cell_plans, data = phones)pooled %>% tidy()
ABCDEFGHIJ0123456789 |
term <chr> | estimate <dbl> | std.error <dbl> | statistic <dbl> | p.value <dbl> |
---|---|---|---|---|
(Intercept) | 17.3371034167 | 0.975384504 | 17.774635 | 5.821724e-49 |
cell_plans | -0.0005666385 | 0.000106975 | -5.296926 | 2.264086e-07 |
ggplot(data = phones)+ aes(x = cell_plans, y = deaths)+ geom_point()+ labs(x = "Cell Phones Per 10,000 People", y = "Deaths Per Billion Miles Driven")+ theme_bw(base_family = "Fira Sans Condensed", base_size=14)
ggplot(data = phones)+ aes(x = cell_plans, y = deaths)+ geom_point()+ geom_smooth(method = "lm", color = "red")+ labs(x = "Cell Phones Per 10,000 People", y = "Deaths Per Billion Miles Driven")+ theme_bw(base_family = "Fira Sans Condensed", base_size=14)
The expected value of the residuals is 0 E[u]=0
The variance of the residuals over X is constant: var(u|X)=σ2u
Errors are not correlated across observations: cor(ui,uj)=0∀i≠j
There is no correlation between X and the error term: cor(X,u)=0 or E[u|X]=0
^Yit=β0+β1Xit+ϵit
Assumption 3: cor(ui,uj)=0∀i≠j
Pooled regression model is biased because it ignores:
Thus, errors are serially or auto-correlated; cor(ui,uj)≠0 within same i and within same t
^Deathsit=β0+β1Cell Phonesit+uit
Multiple observations from same state i
Multiple observations from same year t
phones %>% filter(state %in% c("District of Columbia", "Maryland", "Texas", "California", "Kansas")) %>%ggplot(data = .)+ aes(x = cell_plans, y = deaths, color = state)+ geom_point()+ geom_smooth(method = "lm")+ labs(x = "Cell Phones Per 10,000 People", y = "Deaths Per Billion Miles Driven", color = NULL)+ theme_bw(base_family = "Fira Sans Condensed", base_size=14)+ theme(legend.position = "top")
phones %>% filter(state %in% c("District of Columbia", "Maryland", "Texas", "California", "Kansas")) %>%ggplot(data = .)+ aes(x = cell_plans, y = deaths, color = state)+ geom_point()+ geom_smooth(method = "lm")+ labs(x = "Cell Phones Per 10,000 People", y = "Deaths Per Billion Miles Driven", color = NULL)+ theme_bw(base_family = "Fira Sans Condensed", base_size=14)+ theme(legend.position = "none")+ facet_wrap(~state, ncol=3)
ggplot(data = phones)+ aes(x = cell_plans, y = deaths, color = state)+ geom_point()+ geom_smooth(method = "lm")+ labs(x = "Cell Phones Per 10,000 People", y = "Deaths Per Billion Miles Driven", color = NULL)+ theme_bw(base_family = "Fira Sans Condensed")+ theme(legend.position = "none")+ facet_wrap(~state, ncol=7)
^Deathsit=β0+β1Cell Phonesit+uit
^Deathsit=β0+β1Cell Phonesit+uit
cor(uit,Cell Phonesit)≠0E[uit|Cell Phonesit]≠0
^Deathsit=β0+β1Cell Phonesit+uit
cor(uit,Cell Phonesit)≠0E[uit|Cell Phonesit]≠0
^Deathsit=β0+β1Cell Phonesit+uit
cor(uit,Cell Phonesit)≠0E[uit|Cell Phonesit]≠0
A simple pooled model likely contains lots of omitted variable bias
Many (often unobservable) factors that determine both Phones & Deaths
A simple pooled model likely contains lots of omitted variable bias
Many (often unobservable) factors that determine both Phones & Deaths
But the beauty of this is that most of these factors systematically vary by U.S. State and are stable over time!
We can simply “control for State” to safely remove the influence of all of these factors!
Much of the endogeneity in Xit can be explained by systematic differences across i (groups)
Exploit the systematic variation across groups with a fixed effects model
Much of the endogeneity in Xit can be explained by systematic differences across i (groups)
Exploit the systematic variation across groups with a fixed effects model
Decompose the model error term into two parts:
uit=αi+ϵit
uit=αi+ϵit
αi are group-specific fixed effects
This includes all factors that do not change within group i over time
uit=αi+ϵit
ϵit is the remaining random error
ϵit includes all other factors affecting Yit not contained in group effect αi
ˆYit=β0+β1Xit+αi+ϵit
We've pulled αi out of the original error term into the regression
Essentially we’ll estimate an intercept for each group (minus one, which is β0)
Must have multiple observations (over time) for each group (i.e. panel data)
^Deathsit=β0+β1Cell phonesit+αi+ϵit
αi is the State fixed effect
There could still be factors in ϵit that are correlated with Cell phonesit!
ˆYit=β0+β1Xit+αi+ϵit
Least Squares Dummy Variable (LSDV) approach
De-meaned data approach
^Yit=β0+β1Xit+β2D1i+β3D2i+⋯+βND(N−1)i+ϵit
^Yit=β0+β1Xit+β2D1i+β3D2i+⋯+βND(N−1)i+ϵit
^Yit=β0+β1Xit+β2D1i+β3D2i+⋯+βND(N−1)i+ϵit
R
^Yit=β0+β1Xit+β2D1i+β3D2i+⋯+βND(N−1)i+ϵit
R
Example: ^Deathsit=β0+β1Cell Phonesit+Alaskai+⋯+Wyomingi
^Deathsit=β0+β1Cell Phonesit+Alaskai+⋯+Wyomingi
If state
is a factor
variable, just include it in the regression
R
automatically creates N−1 dummy variables and includes them in the regression
fe_reg_1 <- lm(deaths ~ cell_plans + state, data = phones)fe_reg_1 %>% tidy()
ABCDEFGHIJ0123456789 |
term <chr> | estimate <dbl> | std.error <dbl> | statistic <dbl> | p.value <dbl> |
---|---|---|---|---|
(Intercept) | 25.507679925 | 1.0176400289 | 25.06552337 | 1.241581e-70 |
cell_plans | -0.001203742 | 0.0001013125 | -11.88147584 | 3.483442e-26 |
stateAlaska | -2.484164783 | 0.6745076282 | -3.68293060 | 2.816972e-04 |
stateArizona | -1.510577383 | 0.6704569688 | -2.25305643 | 2.510925e-02 |
stateArkansas | 3.192662931 | 0.6664383936 | 4.79063476 | 2.829319e-06 |
stateCalifornia | -4.978668651 | 0.6655467951 | -7.48056889 | 1.206933e-12 |
stateColorado | -4.344553493 | 0.6654735335 | -6.52851432 | 3.588784e-10 |
stateConnecticut | -6.595185530 | 0.6654428902 | -9.91097152 | 8.698802e-20 |
stateDelaware | -2.098393628 | 0.6666483193 | -3.14767707 | 1.842218e-03 |
stateDistrict of Columbia | 6.355790010 | 1.2897172620 | 4.92804911 | 1.499627e-06 |
Alternatively, we can control our regression for group fixed effects without directly estimating them
We simply de-mean the data for each group to remove the group fixed-effect
Alternatively, we can control our regression for group fixed effects without directly estimating them
We simply de-mean the data for each group to remove the group fixed-effect
For each group i, find the means (over time, t): ˉYi=β0+β1ˉXi+ˉαi+ˉϵit
Alternatively, we can control our regression for group fixed effects without directly estimating them
We simply de-mean the data for each group to remove the group fixed-effect
For each group i, find the means (over time, t): ˉYi=β0+β1ˉXi+ˉαi+ˉϵit
^Yit=β0+β1Xit+uitˉYi=β0+β1ˉXi+ˉαi+ˉϵi
^Yit=β0+β1Xit+uitˉYi=β0+β1ˉXi+ˉαi+ˉϵi
Yit−ˉYi=β1(Xit−ˉXi)+˜ϵit˜Yit=β1˜Xit+˜ϵit
^Yit=β0+β1Xit+uitˉYi=β0+β1ˉXi+ˉαi+ˉϵi
Yit−ˉYi=β1(Xit−ˉXi)+˜ϵit˜Yit=β1˜Xit+˜ϵit
Within each group i, the de-meaned variables ˜Yit and ˜Xit's all have a mean of 0†
Variables that don't change over time will drop out of analysis altogether
Removes any source of variation across groups (all now have mean of 0) to only work with variation within each group
† Recall Rule 4 from the 2.3 class notes on the Summation Operator: ∑(Xi−ˉX)=0
˜Yit=β1˜Xit+˜ϵit
Yields identical results to dummy variable approach
More useful when we have many groups (would be many dummies)
Demonstrates intuition behind fixed effects:
We are basically comparing groups to themselves over time
Ignore all differences between groups, only look at differences within groups over time
# get means of Y and X by statemeans_state <- phones %>% group_by(state) %>% summarize(avg_deaths = mean(deaths), avg_phones = mean(cell_plans))# look at itmeans_state
# get means of Y and X by statemeans_state <- phones %>% group_by(state) %>% summarize(avg_deaths = mean(deaths), avg_phones = mean(cell_plans))# look at itmeans_state
ABCDEFGHIJ0123456789 |
state <fct> | avg_deaths <dbl> | avg_phones <dbl> |
---|---|---|
Alabama | 14.786711 | 8906.370 |
Alaska | 13.612953 | 7817.759 |
Arizona | 14.249825 | 8097.482 |
Arkansas | 17.543881 | 9268.153 |
California | 9.659712 | 9029.594 |
Colorado | 10.351405 | 8981.762 |
Connecticut | 8.141739 | 8947.729 |
Delaware | 12.209610 | 9304.052 |
District of Columbia | 8.015895 | 19811.205 |
Florida | 13.544635 | 9078.592 |
ggplot(data = means_state)+ aes(x = fct_reorder(state, avg_deaths), y = avg_deaths, color = state)+ geom_point()+ geom_segment(aes(y = 0, yend = avg_deaths, x = state, xend = state))+ coord_flip()+ labs(x = "Cell Phones Per 10,000 People", y = "Deaths Per Billion Miles Driven", color = NULL)+ theme_bw(base_family = "Fira Sans Condensed", base_size=10)+ theme(legend.position = "none")
The fixest package is designed for running regressions with fixed effects
The feols() function is just like lm(), with some additional arguments:
#install.packages("fixest")
library(fixest)
fe_reg_1_alt <- feols(deaths ~ cell_plans | state, data = phones)
fe_reg_1_alt %>% summary()
## OLS estimation, Dep. Var.: deaths
## Observations: 306
## Fixed-effects: state: 51
## Standard-errors: Clustered (state)
##             Estimate Std. Error  t value  Pr(>|t|)
## cell_plans -0.001204   0.000143 -8.41708 3.792e-11 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## RMSE: 1.05007     Adj. R2: 0.886524
##                 Within R2: 0.357238
# or using broom's tidy()
fe_reg_1_alt %>% tidy()
term <chr> | estimate <dbl> | std.error <dbl> | statistic <dbl> | p.value <dbl> |
---|---|---|---|---|
cell_plans | -0.001203742 | 0.0001430118 | -8.417077 | 3.791955e-11 |
State fixed effect controls for all factors that vary by state but are stable over time
But there are still other (often unobservable) factors that affect both Phones and Deaths, that don’t vary by State
If these factors vary systematically over time but are the same across States, then we can “control for Year” to safely remove the influence of all of these factors!
A one-way fixed effects model estimates a fixed effect for groups
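A two-way fixed effects model estimates fixed effects for both groups and time periods
\begin{align} \hat{Y}_{it} = \beta_0 + \beta_1 X_{it} + \alpha_i + \theta_t + \nu_{it} \end{align}
$\alpha_i$: group fixed effects
$\theta_t$: time fixed effects
$\nu_{it}$: remaining random error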
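\begin{align} \widehat{\text{Deaths}}_{it} = \beta_0 + \beta_1 \text{Cell phones}_{it} + \alpha_i + \theta_t + \nu_{it} \end{align}
$\alpha_i$: State fixed effects
$\theta_t$: Year fixed effects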
# find averages for years
means_year <- phones %>%
  group_by(year) %>%
  summarize(avg_deaths = mean(deaths),
            avg_phones = mean(cell_plans))
means_year
year <fct> | avg_deaths <dbl> | avg_phones <dbl> |
---|---|---|
2007 | 14.00751 | 8064.531 |
2008 | 12.87156 | 8482.903 |
2009 | 12.08632 | 8859.706 |
2010 | 11.61487 | 9134.592 |
2011 | 11.36431 | 9485.238 |
2012 | 11.65666 | 9660.474 |
ggplot(data = phones)+
  aes(x = year,
      y = deaths)+
  geom_point(aes(color = year))+
  # Add the yearly means as black points
  geom_point(data = means_year,
             aes(x = year,
                 y = avg_deaths),
             size = 3,
             color = "black")+
  # connect the means with a line
  geom_line(data = means_year,
            aes(x = as.numeric(year),
                y = avg_deaths),
            color = "black",
            size = 1)+
  theme_bw(base_family = "Fira Sans Condensed", base_size = 14)+
  theme(legend.position = "none")
\begin{align} \hat{Y}_{it} = \beta_0 + \beta_1 X_{it} + \alpha_i + \theta_t + \nu_{it} \end{align}
1) Least Squares Dummy Variable (LSDV) Approach: add dummies for both groups and time periods (separate intercepts for groups and times)
2) Fully De-meaned data: $\tilde{Y}_{it} = \beta_1 \tilde{X}_{it} + \tilde{\nu}_{it}$
where for each variable: $\widetilde{var}_{it} = var_{it} - \overline{var}_t - \overline{var}_i + \overline{var}$ (the overall mean $\overline{var}$ is added back so that each de-meaned variable has mean 0; this is exact for a balanced panel)
3) Hybrid: de-mean for one effect (groups or years) and add dummies for the other effect (years or groups)
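The three approaches can be compared on a small simulated balanced panel. This is a sketch in base R under stated assumptions: the data frame `panel`, its columns, and the helper `demean2()` are hypothetical names introduced here, not part of the phones analysis. The helper subtracts the group mean and the time mean and adds back the overall mean, so the de-meaned variables have mean 0 and the regression can omit the intercept.

```r
# Simulated balanced panel (hypothetical data, not the phones dataset)
set.seed(1)
n_groups  <- 8
n_periods <- 5
panel <- expand.grid(time  = factor(seq_len(n_periods)),
                     group = factor(seq_len(n_groups)))
g_idx <- as.integer(panel$group)
t_idx <- as.integer(panel$time)

alpha <- rnorm(n_groups,  sd = 2)   # group effects
theta <- rnorm(n_periods, sd = 2)   # time effects
panel$x <- rnorm(nrow(panel)) + alpha[g_idx] + theta[t_idx]
panel$y <- 1 - 0.3 * panel$x + alpha[g_idx] + theta[t_idx] +
  rnorm(nrow(panel), sd = 0.5)

# Hypothetical helper: two-way de-meaning for a balanced panel
demean2 <- function(v, group, time) {
  v - ave(v, group) - ave(v, time) + mean(v)
}

# 1) LSDV: dummies for both groups and time periods
lsdv <- lm(y ~ x + group + time, data = panel)

# 2) Fully de-meaned data, no intercept
panel$y_dm <- demean2(panel$y, panel$group, panel$time)
panel$x_dm <- demean2(panel$x, panel$group, panel$time)
demeaned <- lm(y_dm ~ 0 + x_dm, data = panel)

# 3) Hybrid: de-mean by group, add dummies for time
panel$y_g <- panel$y - ave(panel$y, panel$group)
panel$x_g <- panel$x - ave(panel$x, panel$group)
hybrid <- lm(y_g ~ x_g + time, data = panel)

# All three slope estimates should match
c(lsdv = unname(coef(lsdv)["x"]),
  demeaned = unname(coef(demeaned)["x_dm"]),
  hybrid = unname(coef(hybrid)["x_g"]))
```

The agreement relies on the panel being balanced (every group observed in every period); with an unbalanced panel the simple subtraction in `demean2()` is no longer exact.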
fe2_reg_1 <- lm(deaths ~ cell_plans + state + year, data = phones)
fe2_reg_1 %>% tidy()
term <chr> | estimate <dbl> | std.error <dbl> | statistic <dbl> | p.value <dbl> |
---|---|---|---|---|
(Intercept) | 18.9304707399 | 1.4511323962 | 13.0453092 | 5.427406e-30 |
cell_plans | -0.0002995294 | 0.0001723149 | -1.7382677 | 8.339982e-02 |
stateAlaska | -1.4998292482 | 0.6241082951 | -2.4031554 | 1.698648e-02 |
stateArizona | -0.7791714713 | 0.6113519094 | -1.2745057 | 2.036724e-01 |
stateArkansas | 2.8655344756 | 0.5985062952 | 4.7878101 | 2.895040e-06 |
stateCalifornia | -5.0900897113 | 0.5956293282 | -8.5457338 | 1.299236e-15 |
stateColorado | -4.4127241692 | 0.5953924847 | -7.4114543 | 1.945083e-12 |
stateConnecticut | -6.6325834801 | 0.5952933996 | -11.1417051 | 1.169797e-23 |
stateDelaware | -2.4579829953 | 0.5991822226 | -4.1022295 | 5.546475e-05 |
stateDistrict of Columbia | -3.5044963616 | 1.9710939218 | -1.7779449 | 7.663326e-02 |
fe2_reg_2 <- feols(deaths ~ cell_plans | state + year, data = phones)
fe2_reg_2 %>% summary()
## OLS estimation, Dep. Var.: deaths
## Observations: 306
## Fixed-effects: state: 51, year: 6
## Standard-errors: Clustered (state)
##            Estimate Std. Error   t value Pr(>|t|)
## cell_plans   -3e-04   0.000305 -0.980739  0.33144
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## RMSE: 0.930036     Adj. R2: 0.909197
##                  Within R2: 0.011989
fe2_reg_2 %>% tidy()
term <chr> | estimate <dbl> | std.error <dbl> | statistic <dbl> | p.value <dbl> |
---|---|---|---|---|
cell_plans | -0.0002995294 | 0.0003054118 | -0.9807394 | 0.3314431 |
State fixed effect absorbs all unobserved factors that vary by state, but are constant over time
Year fixed effect absorbs all unobserved factors that vary by year, but are constant over States
But there are still other (often unobservable) factors that affect both Phones and Deaths, that vary by State and change over time!
We will also need to control for these variables (not picked up by fixed effects!)
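\begin{align} \widehat{\text{Deaths}}_{it} = \beta_1 \text{Cell Phones}_{it} + \alpha_i + \theta_t + \text{urban pct}_{it} + \text{cell ban}_{it} + \text{text ban}_{it} \end{align}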
fe2_controls_reg <- feols(deaths ~ cell_plans + text_ban + urban_percent + cell_ban | state + year,
                          data = phones)
fe2_controls_reg %>% summary()
## OLS estimation, Dep. Var.: deaths
## Observations: 306
## Fixed-effects: state: 51, year: 6
## Standard-errors: Clustered (state)
##                Estimate Std. Error  t value Pr(>|t|)
## cell_plans    -0.000340   0.000277 -1.22780 0.225269
## text_ban1      0.255926   0.243444  1.05127 0.298188
## urban_percent  0.013135   0.009815  1.33822 0.186878
## cell_ban1     -0.679796   0.335655 -2.02528 0.048194 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## RMSE: 0.920123     Adj. R2: 0.910039
##                  Within R2: 0.032939
fe2_controls_reg %>% tidy()
term <chr> | estimate <dbl> | std.error <dbl> | statistic <dbl> | p.value <dbl> |
---|---|---|---|---|
cell_plans | -0.0003403735 | 0.0002772212 | -1.227805 | 0.22526919 |
text_ban1 | 0.2559261569 | 0.2434442111 | 1.051272 | 0.29818803 |
urban_percent | 0.0131347657 | 0.0098150705 | 1.338224 | 0.18687751 |
cell_ban1 | -0.6797956522 | 0.3356553662 | -2.025279 | 0.04819377 |
library(huxtable)
huxreg("Pooled" = pooled,
       "State Effects" = fe_reg_1,
       "State & Year Effects" = fe2_reg_1,
       "With Controls" = fe2_controls_reg,
       coefs = c("Intercept" = "(Intercept)",
                 "Cell phones" = "cell_plans",
                 "Cell Ban" = "cell_ban1",
                 "Texting Ban" = "text_ban1",
                 "Urbanization Rate" = "urban_percent"),
       statistics = c("N" = "nobs",
                      "R-Squared" = "r.squared",
                      "SER" = "sigma"),
       number_format = 4)
| | Pooled | State Effects | State & Year Effects | With Controls |
|---|---|---|---|---|
| Intercept | 17.3371 *** | 25.5077 *** | 18.9305 *** | |
| | (0.9754) | (1.0176) | (1.4511) | |
| Cell phones | -0.0006 *** | -0.0012 *** | -0.0003 | -0.0003 |
| | (0.0001) | (0.0001) | (0.0002) | (0.0003) |
| Cell Ban | | | | -0.6798 * |
| | | | | (0.3357) |
| Texting Ban | | | | 0.2559 |
| | | | | (0.2434) |
| Urbanization Rate | | | | 0.0131 |
| | | | | (0.0098) |
| N | 306 | 306 | 306 | 306 |
| R-Squared | 0.0845 | 0.9055 | 0.9259 | 0.9274 |
| SER | 3.2791 | 1.1526 | 1.0310 | 1.0262 |

*** p < 0.001; ** p < 0.01; * p < 0.05.