Data Mining in Automotive Warranty Analysis
This thesis addresses data mining in automotive warranty analysis, with an emphasis on modeling the mean cumulative warranty cost or number of claims per vehicle. We deal with a type of truncation that is typical of automotive warranty data: warranty coverage, and therefore the resulting warranty data, is limited by both age and mileage. Age, as a function of time, is known for all sold vehicles at all times. Mileage, however, is observed only for vehicles with at least one claim, and only at the time of each claim. To deal with this incomplete mileage information, we consider a linear approach and a piecewise linear approach within a nonparametric framework. We explore both the univariate and the bivariate case. For the univariate case, we evaluate the mean cumulative warranty cost and its standard error as a function of age, as a function of mileage, and as a function of actual (calendar) time. For the bivariate case, we evaluate the mean cumulative warranty cost as a function of age and mileage jointly. We also consider the effect of claim reporting delays and several methods for making predictions. Throughout the thesis, we illustrate the ideas with examples based on real data.
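To make the central quantity concrete, the following is a minimal sketch of a nonparametric estimate of the mean cumulative warranty cost as a function of age. It is an illustration under stated assumptions, not the estimator developed in the thesis: the function names, the data layout, and the rule that a vehicle counts as "at risk" at a given age if its age at the data cutoff is at least that age are all hypothetical choices made here. The helper `linear_mileage` shows one simple reading of a linear approach to the missing mileage, namely that a vehicle accumulates mileage at the constant rate implied by a single observed claim; this too is an assumption.

```python
from collections import defaultdict


def mean_cumulative_cost(vehicle_ages, claims, age_limit):
    """Sketch of a mean cumulative cost curve by vehicle age.

    vehicle_ages: dict mapping vehicle id -> age (e.g. days) at data cutoff.
    claims: list of (vehicle_id, age_at_claim, cost) tuples.
    age_limit: warranty age limit; claims beyond it are treated as truncated.
    Returns a list of (age, estimate) pairs at ages where claims occurred.
    """
    # Total claim cost at each distinct claim age, keeping only claims that
    # fall inside both the observation window and the warranty age limit.
    cost_at_age = defaultdict(float)
    for vehicle_id, age, cost in claims:
        if age <= min(vehicle_ages[vehicle_id], age_limit):
            cost_at_age[age] += cost

    curve = []
    cumulative = 0.0
    for age in sorted(cost_at_age):
        # Assumption: a vehicle is at risk at this age if it has been
        # observed at least this long.
        at_risk = sum(1 for a in vehicle_ages.values() if a >= age)
        cumulative += cost_at_age[age] / at_risk
        curve.append((age, cumulative))
    return curve


def linear_mileage(age, claim_age, claim_mileage):
    """Hypothetical linear mileage model: constant accumulation rate
    estimated from one claim (requires claim_age > 0)."""
    return claim_mileage / claim_age * age
```

Calling `mean_cumulative_cost` with a dictionary of vehicle ages, a list of claim records, and an age limit returns the step heights of the estimated curve; the mileage-limit side of the truncation, standard errors, and reporting delay are deliberately left out of this sketch.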