Abstract:
Life expectancy at birth (LEB) is an indicator of the overall
development of a nation. Identifying the prominent determinant factors
that affect LEB can therefore inform decisions regarding national
development. Studies have been conducted to
identify the prominent determinant factors of LEB using the ordinary
least squares procedure in linear regression models with a limited number
of determinant factors. Problems regarding multicollinearity, prediction
accuracy and model interpretation occur when using this procedure with
a wider range of determinant factors. Two families of machine learning
techniques, shrinkage and dimensionality reduction, were applied to
overcome these problems. Seventeen determinant factors were identified,
and data on them were obtained for 193 countries from United Nations
agencies for the year 2016. Ridge and lasso regression were applied as
shrinkage techniques, and principal components regression and partial
least squares regression as dimensionality reduction techniques. These
regression techniques were compared in terms of mean squared error,
goodness of fit, and ranking based on regression coefficient estimates.
The ridge regression model gave the best fit to the data at hand, with
the highest adjusted R² on the training data. The lasso regression model
showed the highest adjusted R² and the lowest mean squared error on the
test data, making it the best predictive model.
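
The two comparison criteria named above, mean squared error and adjusted R², can be sketched as follows. This is only an illustration: the abstract does not state which formulas were used, so the standard definitions are assumed, with adjusted R² = 1 - (1 - R²)(n - 1)/(n - p - 1) for n observations and p predictors. The function names and the sample values are hypothetical and are not taken from the study's data.

```python
def mean_squared_error(y_true, y_pred):
    """Average of the squared residuals."""
    n = len(y_true)
    return sum((yt - yp) ** 2 for yt, yp in zip(y_true, y_pred)) / n


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((yt - yp) ** 2 for yt, yp in zip(y_true, y_pred))
    ss_tot = sum((yt - mean_y) ** 2 for yt in y_true)
    return 1.0 - ss_res / ss_tot


def adjusted_r_squared(y_true, y_pred, n_predictors):
    """R-squared penalised for the number of predictors (standard form, assumed)."""
    n = len(y_true)
    if n - n_predictors - 1 <= 0:
        raise ValueError("need more observations than predictors + 1")
    r2 = r_squared(y_true, y_pred)
    return 1.0 - (1.0 - r2) * (n - 1) / (n - n_predictors - 1)


# Made-up life expectancy values (years), for illustration only.
observed = [70.0, 75.0, 80.0, 65.0, 60.0]
predicted = [71.0, 74.0, 79.0, 66.0, 61.0]

mse = mean_squared_error(observed, predicted)
adj_r2 = adjusted_r_squared(observed, predicted, n_predictors=2)
print(f"MSE = {mse:.3f}, adjusted R^2 = {adj_r2:.3f}")
```

Computing both quantities on held-out test data rather than on the training data is what separates the "best fit" and "best predictive model" conclusions above.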