Friday, September 29…

 
In my previous blog post, I forgot to mention that I had also performed quadratic regression on the urban and rural datasets, with diagnosed diabetes as the dependent variable and obesity, inactivity, and food insecurity as the independent variables. For the rural dataset, the mean of the model's predicted Diagnosed Diabetes percentages (y_pred) was 8.379, and the R-squared value was 0.58. This R-squared value tells us that about 58% of the variation in Diagnosed Diabetes percentage is explained by the model, which is a moderate fit.

For the urban dataset, the mean predicted Diagnosed Diabetes percentage (y_pred) was 8.971, and the R-squared value was 0.61. This means that approximately 61% of the variation in Diagnosed Diabetes percentage in urban areas is explained by the model, a slightly better fit than for the rural dataset.
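To make the quadratic fit and the R-squared figures above more concrete, here is a minimal sketch of how such a model could be fit and scored. It is only an illustration under assumptions: the data is synthetic (not the urban or rural dataset), the function names are hypothetical, and including squared and pairwise interaction terms is just one common way to define a quadratic model with three predictors.

```python
import numpy as np

def quadratic_features(X):
    """Build columns: intercept, each predictor, and all squares and pairwise products."""
    n, p = X.shape
    cols = [np.ones(n)]
    cols += [X[:, j] for j in range(p)]
    cols += [X[:, j] * X[:, k] for j in range(p) for k in range(j, p)]
    return np.column_stack(cols)

def fit_quadratic(X, y):
    """Ordinary least squares on the quadratic feature matrix."""
    coef, *_ = np.linalg.lstsq(quadratic_features(X), y, rcond=None)
    return coef

def predict_quadratic(coef, X):
    return quadratic_features(X) @ coef

def r_squared(y, y_pred):
    """R^2 = 1 - (residual sum of squares) / (total sum of squares)."""
    ss_res = np.sum((y - y_pred) ** 2)
    ss_tot = np.sum((y - np.mean(y)) ** 2)
    return 1.0 - ss_res / ss_tot

# Synthetic stand-in data (assumption): three predictors playing the role of
# obesity, inactivity, and food insecurity percentages.
rng = np.random.default_rng(0)
X = rng.uniform(10, 40, size=(200, 3))
y = 2.0 + 0.15 * X[:, 0] + 0.002 * X[:, 1] ** 2 + rng.normal(0, 1.0, 200)

coef = fit_quadratic(X, y)
y_pred = predict_quadratic(coef, X)
r2 = r_squared(y, y_pred)

print("mean of y_pred:", y_pred.mean())
print("mean of y:     ", y.mean())
print("R-squared:     ", r2)
```

One thing this sketch makes visible: when a least-squares model includes an intercept and is scored on the same data it was fit to, the mean of y_pred matches the mean of the observed values. So the mean prediction mostly reflects the average diabetes percentage in the data, while R-squared is the figure that describes the quality of the fit.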

After that, I moved on to cross-validation, as I explained in my previous blog post. Today, I explored different cross-validation methods to get a more complete picture of my model’s performance. I plan to learn more about “bootstrapping” in particular.
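For readers who want to see how these two ideas differ in practice, the sketch below compares k-fold cross-validation with a simple bootstrap evaluation. It rests on several assumptions: the data is synthetic, the helper names are hypothetical, the model is the same kind of quadratic least-squares fit, and the bootstrap variant shown scores each refit on the rows left out of the resample (the "out-of-bag" rows), which is only one of several ways bootstrapping can be used.

```python
import numpy as np

def design(X):
    """Quadratic design matrix: intercept, linear terms, squares, pairwise products."""
    n, p = X.shape
    cols = [np.ones(n)] + [X[:, j] for j in range(p)]
    cols += [X[:, j] * X[:, k] for j in range(p) for k in range(j, p)]
    return np.column_stack(cols)

def fit(X, y):
    coef, *_ = np.linalg.lstsq(design(X), y, rcond=None)
    return coef

def r_squared(y, y_pred):
    return 1.0 - np.sum((y - y_pred) ** 2) / np.sum((y - np.mean(y)) ** 2)

def k_fold_r2(X, y, k=5, seed=0):
    """Fit on k-1 folds, score on the held-out fold, repeat for every fold."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), k)
    scores = []
    for i in range(k):
        test = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        coef = fit(X[train], y[train])
        scores.append(r_squared(y[test], design(X[test]) @ coef))
    return np.array(scores)

def bootstrap_r2(X, y, n_boot=200, seed=0):
    """Resample rows with replacement, fit, and score on the out-of-bag rows."""
    rng = np.random.default_rng(seed)
    n = len(y)
    scores = []
    for _ in range(n_boot):
        sample = rng.integers(0, n, size=n)           # indices drawn with replacement
        oob = np.setdiff1d(np.arange(n), sample)      # rows never drawn this round
        coef = fit(X[sample], y[sample])
        scores.append(r_squared(y[oob], design(X[oob]) @ coef))
    return np.array(scores)

# Synthetic stand-in data (assumption), not the diabetes dataset.
rng = np.random.default_rng(1)
X = rng.uniform(10, 40, size=(200, 3))
y = 2.0 + 0.15 * X[:, 0] + 0.002 * X[:, 1] ** 2 + rng.normal(0, 1.0, 200)

cv_scores = k_fold_r2(X, y, k=5)
boot_scores = bootstrap_r2(X, y, n_boot=200)
low, high = np.percentile(boot_scores, [2.5, 97.5])

print("5-fold mean R-squared:   ", cv_scores.mean())
print("bootstrap mean R-squared:", boot_scores.mean())
print("bootstrap 95% range:     ", low, high)
```

Both approaches score the model on rows it did not see during fitting, so the resulting R-squared values are usually somewhat lower than the in-sample figure. The bootstrap additionally gives a spread of scores, which hints at how stable the estimate is.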

Following this exploration, I reviewed all the tests and analyses I’ve done so far to understand the data better. This review will be useful when I discuss our findings with my project group as we work on the project report.
