Modelling annual earnings with unemployment: Non-random selection in female workers
Female earnings are underrepresented in the earnings and earnings dynamics literature. This underrepresentation is largely a result of the differences in participation rates between male and female workers. Female workers tend to have more frequent changes in employment status, and more periods of unemployment, than their male counterparts. These periods of unemployment result in observations with zero earnings, and common transformations such as the logarithm are not defined for zero values. This means that any analysis of the logarithm of earnings is forced to exclude periods where an individual does not work, and cannot take into account the effect of moving into or out of employment. The higher rate of unemployment in female workers also increases the risk of sample selection bias. If selection into employment is non-random, then estimating earnings equations using only workers will result in biased estimates. This thesis takes a novel approach by focusing on the annual earnings of females, and in doing so introduces two methods for addressing the issues associated with zero earnings observations. First, the Inverse Hyperbolic Sine (IHS) function is introduced as an alternative to the logarithm. The IHS is defined for zero values, allowing for the creation of descriptive statistics that take into account periods of unemployment and changes in employment status. While the IHS has many properties that are useful when working with annual earnings, this thesis also highlights a number of estimation issues that can arise when using the function that have not previously been mentioned in the literature. Second, a new correction for sample selection bias, proposed by Semykina and Wooldridge (2013), is used to model the annual earnings of female workers. Both the sample selection bias correction and the IHS are applied to data on prime-aged females from the Survey of Families, Income, and Employment (SoFIE) data set.
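
To illustrate why the IHS can accommodate zero earnings where the logarithm cannot, the following is a minimal Python sketch. It assumes the standard unit-scale form of the function, asinh(y) = ln(y + sqrt(y^2 + 1)); the thesis may use a scaled variant. The function names `ihs` and `log_or_none` and the earnings values are hypothetical, chosen only for illustration.

```python
import math


def ihs(y: float) -> float:
    """Inverse hyperbolic sine, ln(y + sqrt(y^2 + 1)); unit scale is an assumption."""
    return math.asinh(y)


def log_or_none(y: float):
    """Natural log of earnings, or None where the log is undefined (y <= 0)."""
    return math.log(y) if y > 0 else None


# Hypothetical annual earnings, including a year with no employment.
earnings = [0.0, 1_000.0, 50_000.0]

for y in earnings:
    # The IHS returns a finite value for every observation, including zero;
    # the logarithm has to drop the zero-earnings year.
    print(y, ihs(y), log_or_none(y))
```

At zero the IHS equals zero, so the observation stays in the sample. For large earnings the IHS is close to ln(2y), which is why it is often treated as behaving like the logarithm for values far from zero.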