Homework 1
Eco 231 – Undergraduate Econometrics
Spring 2021

1. Assume that you are hired to investigate the causal effect of being raised in high-poverty neighborhoods in the US on outcomes during adulthood (such as health, well-being, social networks and economic self-sufficiency). Use the sources in the Datasets file on Blackboard and succinctly answer the following questions:
   (a) Identify a suitable dataset that can help you answer the question above. Provide its name and the website where it can be downloaded.
   (b) What is the sample size of this dataset? Is this a reasonable number for your research?
   (c) Briefly describe the data you found in part (a). Using the codebook, discuss which variables are crucial to answering the research question posed above (no more than 10 lines).

2. Suppose you are a researcher interested in studying the relationship between household characteristics and future educational outcomes of children. You have been advised that one dataset that satisfies your requirements is the Early Childhood Longitudinal Study, Birth Cohort. Try to find the data through the sources discussed in the Stata lecture. To answer the following questions, you will also need to locate the codebooks for this database. (Note: You do not need the data; the codebooks and webpage PDFs contain all the information you require.)
   (a) Briefly describe the objectives of this study and the different rounds of the survey. Mention the methods employed for data collection. At what ages are the interviews conducted? (Your answer should not exceed 10 lines.)
   (b) Describe the restrictions on the use of this database.
   (c) How many children are classified as low birth weight in the first round of the survey?
   (d) Describe the groups of variables available in the first round.
       Classify them into child characteristics, mother characteristics and household characteristics.
   (e) Choose two variables you could employ as baseline characteristics of the household. Describe how these variables would be relevant for studying future outcomes of children.
   (f) Calculate the nonresponse rate in each of the two following rounds of the survey, relative to the initial number of individuals interviewed.
   (g) Suppose you are interested in studying how socio-emotional skills are developed before the age of two. Describe which assessments included in this study could be employed for this purpose. Does the study have similar assessments for older ages?
   (h) Describe which measurements can be used to analyze the cognitive skills of children in kindergarten.

3. This problem asks you to work directly with Stata. Suppose you are a researcher interested in studying the labor market outcomes of recent college graduates. One suitable public-use dataset for this purpose is the National Survey of College Graduates (NSCG). To answer the following questions, you will need to use the attached documentation to identify the variables of interest.
   (a) Explore the survey using the interview questionnaire. Based on this, write down one scientific question (related to the topic mentioned above) that could be answered using the NSCG.
   (b) Use the interview questionnaire provided with the database to identify the variables related to hours worked per week, weeks worked per year and yearly earnings. Notice that information about weeks worked can be derived using two variables. Also note that in NSCG15 a value of 98 for hours worked per week denotes a logical skip.
   (c) After handling invalid values properly, create a table showing the mean and standard deviation of the three variables described in part (b) for men and women separately.
   (d) To see the distribution of hours worked per week, create a histogram of this variable for men and women.
       Plot the density on the y-axis and use a bin width of 10 for the x-axis.
   (e) Create a new variable lnhourwage, defined as the (natural) logarithm of yearly earnings divided by total hours worked during the year. Produce a table showing the mean, standard deviation, and 10th and 90th percentiles of this variable for men and women separately. Drop observations that yield a negative value of this variable.
   (f) Use the interview questionnaire to identify the variable that indicates whether a respondent changed employer and/or job between 2013 and 2015, as well as the variables describing the reason for the change when the employer differs between these two years. What proportion of respondents stayed with the same employer and job during this period?
   (g) As a researcher, you are also interested in studying how the gender wage gap varies across major fields. Using the variable for the first bachelor's degree (nbamemg) and your variable lnhourwage, create a table showing the mean hourly wage for women and men across different majors. Which major shows the largest wage gap?
   (h) Run a regression of hourly wages on education separately for men and women. How does the coefficient on education differ by gender?
   (i) Create a variable of potential experience, ptlexper, defined as age - education - 6. Run a regression of hourly wages on education, potential experience and potential experience squared separately for men and women. Interpret your results.
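
For reference, the constructed variables in question 3 can be written out as follows. This is only a sketch of the definitions under stated assumptions, not a prescribed solution: it assumes that total hours worked during the year equal hours per week times weeks per year, that education is measured in years, and that the dependent variable in parts (h) and (i) is lnhourwage. The names hrsweek, wksyear, earnings, age and educ are placeholders for whichever NSCG variables you identify in the questionnaire; only lnhourwage and ptlexper come from the problem statement. The block uses the amsmath align* environment.

```latex
% Placeholder names: hrsweek, wksyear, earnings, age, educ (not NSCG variable names).
\begin{align*}
\text{hours}_i &= \text{hrsweek}_i \times \text{wksyear}_i
  && \text{(assumed definition of total yearly hours)} \\
\texttt{lnhourwage}_i &= \ln\!\left(\frac{\text{earnings}_i}{\text{hours}_i}\right)
  && \text{(part e)} \\
\texttt{ptlexper}_i &= \text{age}_i - \text{educ}_i - 6
  && \text{(part i)} \\
\texttt{lnhourwage}_i &= \beta_0 + \beta_1\,\text{educ}_i
  + \beta_2\,\texttt{ptlexper}_i + \beta_3\,\texttt{ptlexper}_i^{2} + u_i
  && \text{(part i, estimated separately by gender)}
\end{align*}
```

Under these definitions, a negative value of lnhourwage corresponds to an implied hourly wage below one unit of earnings per hour, which is the case part (e) asks you to drop.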