The data come from the Panel Study of Income Dynamics (PSID) for the years 1981 to 1992; earnings data from 1980 are also included. The sample consists of 579 white females who were followed over the whole period, giving 6,948 observations in total (579 individuals × 12 years). This data frame contains the following columns:
id: Individual identifier
year: Survey year
age: Calculated age in years (based on year and month of birth)
educ: Years of schooling
children: Total number of children in family unit, ages 0-17
s: Participation dummy, =1 if worked (hours>0)
lnw: Log of real average hourly earnings
lnw80: Log earnings in 1980
agesq: Age squared
children_lag1: Number of children in t-1
children_lag2: Number of children in t-2
lnw2: Log of real average hourly earnings
Lnw: Log of real average hourly earnings
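The derived columns in the list above (agesq, children_lag1, children_lag2) can be illustrated on a toy panel. This is only a sketch: the values are invented, lag_k is a hypothetical helper, and the source does not say how PSID2 fills the lags for the earliest survey years, so the NA padding used here is an assumption.

```r
# Toy panel with invented values; only the column names mirror PSID2.
toy <- data.frame(
  id       = rep(1:2, each = 4),
  year     = rep(1981:1984, times = 2),
  age      = c(30:33, 41:44),
  children = c(0, 1, 1, 2, 3, 3, 2, 2)
)
toy <- toy[order(toy$id, toy$year), ]

# Hypothetical helper: shift a vector back by k periods, padding with NA.
lag_k <- function(v, k) c(rep(NA, k), head(v, -k))

# Age squared is a plain transformation of age.
toy$agesq <- toy$age^2

# Lags are taken within each individual, never across individuals.
toy$children_lag1 <- ave(toy$children, toy$id, FUN = function(v) lag_k(v, 1))
toy$children_lag2 <- ave(toy$children, toy$id, FUN = function(v) lag_k(v, 2))

print(toy)
```

Grouping by id before shifting matters in a panel: a plain shift of the whole column would carry the last value of one individual into the first year of the next.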
PSID2
An object of class data.frame with 6948 rows and 13 columns.
Anastasia Semykina, Jeffrey M. Wooldridge (2013). “Estimation of dynamic panel data models with sample selection.” Journal of Applied Econometrics, 28(1), 47-61.
Mikhail Zhelonkin, Marc G. Genton, Elvezio Ronchetti (2019). ssmrob: Robust Estimation and Inference in Sample Selection Models. R package version 0.7, https://CRAN.R-project.org/package=ssmrob.
Ott Toomet, Arne Henningsen (2008). “Sample Selection Models in R: Package sampleSelection.” Journal of Statistical Software, 27(7). https://www.jstatsoft.org/article/view/v027i07.
library(ssmodels)
data(PSID2)
# Access the column through the data frame rather than attach(), which masks
# same-named objects (age, educ, id) already on the search path.
hist(PSID2$Lnw)
selectEq <- s ~ educ + age + children + year
outcomeEq <- Lnw ~ educ + age + children
HCinitial(selectEq, outcomeEq, data = PSID2)
#> xs(Intercept)        xseduc         xsage    xschildren        xsyear
#>   1.904417294   0.021724081  -0.019771859  -0.169149600  -0.021483592
#> xo(Intercept)        xoeduc         xoage    xochildren         sigma
#>   0.492835520   0.128685876  -0.009081435  -0.119664822   0.854195970
#>           rho
#>   1.426127369
# Note that the two-step estimate of rho is greater than 1, which lies
# outside the admissible range [-1, 1] for a correlation.
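A two-step estimate of rho can leave the interval [-1, 1] because it is obtained as a ratio, the coefficient on the inverse Mills ratio divided by an estimate of sigma, and nothing bounds that ratio. The sketch below runs a generic textbook two-step on simulated data using base R only. It is not the code of HCinitial, whose internals are not shown here; the simulated design and all variable names are assumptions made for illustration.

```r
# Generic Heckman two-step on simulated data (base R only, hypothetical names).
set.seed(1)
n <- 2000
x <- rnorm(n)   # covariate in both equations
z <- rnorm(n)   # covariate in the selection equation only
rho_true   <- 0.5
sigma_true <- 1

u1 <- rnorm(n)                                                       # selection error
u2 <- sigma_true * (rho_true * u1 + sqrt(1 - rho_true^2) * rnorm(n)) # outcome error
sel <- as.integer(0.5 + x + z + u1 > 0)         # 1 if the outcome is observed
y   <- ifelse(sel == 1, 1 + 2 * x + u2, NA)     # outcome missing when sel == 0

# Step 1: probit for selection, then the inverse Mills ratio (IMR).
probit <- glm(sel ~ x + z, family = binomial(link = "probit"))
idx    <- predict(probit, type = "link")        # estimated linear predictor
imr    <- dnorm(idx) / pnorm(idx)

# Step 2: OLS on the selected sample with the IMR as an extra regressor.
d   <- data.frame(y, x, imr, idx)[sel == 1, ]
ols <- lm(y ~ x + imr, data = d)
beta_lambda <- unname(coef(ols)["imr"])         # estimates rho * sigma

# Usual variance correction; delta lies in (0, 1).
delta     <- d$imr * (d$imr + d$idx)
sigma_hat <- sqrt(mean(resid(ols)^2) + mean(delta) * beta_lambda^2)

# rho is recovered as a ratio, so it is not forced into [-1, 1].
rho_hat <- beta_lambda / sigma_hat
c(beta_lambda = beta_lambda, sigma = sigma_hat, rho = rho_hat)
```

Maximum likelihood, by contrast, typically estimates rho through a bounded parameterization, which is one reason to treat two-step values only as starting points.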
summary(HeckmanGe(selectEq, outcomeEq, 1, 1, data = PSID2))
#>
#> --------------------------------------------------------------
#> Generalized Heckman Model (Package: ssmodels)
#> --------------------------------------------------------------
#> --------------------------------------------------------------
#> Maximum Likelihood estimation
#> optim function with method BFGS-iterations numbers: 36
#> Log-Likelihood: -7461.423
#> AIC: 14944.85 BIC: 15020.15
#> Number of observations: ( 1057 censored and 5891 observed )
#> 11 free parameters ( df= 6937 )
#> --------------------------------------------------------------
#> Probit selection equation:
#>              Estimate Std. Error t value Pr(>|t|)
#> (Intercept)  1.887573   0.156411  12.068  < 2e-16 ***
#> educ         0.021938   0.008207   2.673 0.007532 **
#> age         -0.019202   0.002390  -8.034 1.10e-15 ***
#> children    -0.166495   0.020306  -8.199 2.85e-16 ***
#> year        -0.023440   0.006078  -3.857 0.000116 ***
#> --------------------------------------------------------------
#> Outcome equation:
#>               Estimate Std. Error t value Pr(>|t|)
#> (Intercept)  0.4878258  0.0566149   8.617   <2e-16 ***
#> educ         0.1205652  0.0031053  38.826   <2e-16 ***
#> age         -0.0003632  0.0008558  -0.424    0.671
#> children    -0.0494552  0.0075144  -6.581    5e-11 ***
#> --------------------------------------------------------------
#> Dispersion terms:
#>       Estimate Std. Error t value Pr(>|t|)
#> Sigma   1.6934     0.0096   176.4   <2e-16 ***
#> --------------------------------------------------------------
#> Correlation terms:
#>     Estimate Std. Error t value Pr(>|t|)
#> Rho  0.08879    0.08450   1.051    0.293
#> --------------------------------------------------------------
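The header of the summary can be cross-checked by hand. Assuming the usual definitions AIC = 2k - 2 logLik and BIC = k log(n) - 2 logLik (the package's own formulas are not shown here), the printed values follow from the log-likelihood, the parameter count, and the sample size. The variable names below are hypothetical; the numbers are copied from the output above.

```r
# Values copied from the printed summary.
loglik <- -7461.423   # Log-Likelihood
n_cens <- 1057        # censored observations (s = 0)
n_obs  <- 5891        # observed outcomes (s = 1)

# 5 selection + 4 outcome coefficients, plus one Sigma and one Rho term.
k <- 5 + 4 + 1 + 1    # 11 free parameters

n  <- n_cens + n_obs  # 6948, the number of rows in PSID2
df <- n - k           # 6937, as printed

aic <- 2 * k - 2 * loglik        # rounds to the printed 14944.85
bic <- k * log(n) - 2 * loglik   # rounds to the printed 15020.15

round(c(n = n, df = df, AIC = aic, BIC = bic), 2)
```

The censored and observed counts also show that about 15% of the person-years (1057 of 6948) have no recorded earnings.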