Stern School Data Analysis A Crucial Method Used by So Many Scientists R Coding Task


In this individual project, you will use the diabetes data in Efron et al. (2003) to examine the effects of ten baseline predictor variables [age, sex, body mass index (bmi), average blood pressure (map), and six blood serum measurements (tc, ldl, hdl, tch, ltg, glu)] on a quantitative measure of disease progression one year after baseline. There are 442 diabetes patients in this data set. The data are available in the R package "lars". You must use the diabetes data to fit linear regression, ridge regression, and lasso models, and you must also incorporate best subset selection and cross-validation.
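As a starting point, the workflow described above might be sketched as follows. This is a minimal sketch under stated assumptions, not a required solution: the brief names only the "lars" package, so the use of "glmnet" for ridge and lasso, "leaps" for best subset selection, the seed, the 10 folds, and BIC as the subset criterion are all illustrative choices.

```r
# Assumption: lars, glmnet, and leaps are installed. Only lars is named in the brief.
library(lars)    # supplies the diabetes data
library(glmnet)  # ridge and lasso (assumed choice of package)
library(leaps)   # best subset selection (assumed choice of package)

data(diabetes)
x <- unclass(diabetes$x)      # plain 442 x 10 numeric matrix of predictors
y <- diabetes$y               # disease progression one year after baseline
df <- data.frame(x, y = y)    # data frame form for the formula interface

# Ordinary least squares on all ten predictors
ols <- lm(y ~ ., data = df)

# Best subset selection over model sizes 1..10; BIC is an assumed criterion
subsets <- regsubsets(y ~ ., data = df, nvmax = 10)
best_size <- which.min(summary(subsets)$bic)
best_coef <- coef(subsets, best_size)

# Ridge (alpha = 0) and lasso (alpha = 1), with lambda chosen by 10-fold CV
set.seed(1)  # arbitrary seed so the fold assignment is reproducible
ridge_cv <- cv.glmnet(x, y, alpha = 0, nfolds = 10)
lasso_cv <- cv.glmnet(x, y, alpha = 1, nfolds = 10)

# Coefficients at the lambda that minimizes cross-validated error
ridge_coef <- coef(ridge_cv, s = "lambda.min")
lasso_coef <- coef(lasso_cv, s = "lambda.min")
```

To compare the models fairly, you would typically hold out a test set before fitting and report test error for each method; the sketch fits on all 442 observations only to keep it short.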