For the project, my primary goal today was to assess the relationship between the percentage of diagnosed diabetes and three key independent variables: obesity, inactivity, and food insecurity. For the rural dataset, the multiple regression analysis gave an R-squared value of 0.549, indicating that about 54.9% of the variation in diagnosed diabetes percentage across rural areas can be explained by obesity, inactivity, and food insecurity. I then ran the same multiple regression on the urban dataset, which gave an R-squared value of 0.602, implying that about 60.2% of the variation in diagnosed diabetes percentage across urban areas can be explained by the same three factors.
For Rural Dataset: [regression output not reproduced here]
For Urban Dataset: [regression output not reproduced here]
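To make the R-squared figures concrete, here is a minimal sketch of how a multiple regression with three predictors and its R-squared can be computed. It is not the code behind the numbers above: the data are synthetic stand-ins, and the helper name `fit_ols`, the variable names, and the coefficients used to generate the data are all assumptions for illustration.

```python
import numpy as np

def fit_ols(X, y):
    """Fit y = b0 + X @ b by ordinary least squares; return (coefficients, R^2)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Prepend a column of ones so the first coefficient is the intercept.
    design = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ coef
    ss_res = float(residuals @ residuals)          # unexplained variation
    ss_tot = float(((y - y.mean()) ** 2).sum())    # total variation
    return coef, 1.0 - ss_res / ss_tot

# Synthetic stand-in for county-level data (hypothetical values, not the real dataset).
rng = np.random.default_rng(0)
n = 200
obesity = rng.normal(30, 4, n)
inactivity = rng.normal(25, 5, n)
food_insecurity = rng.normal(14, 3, n)
diabetes = (1.0 + 0.15 * obesity + 0.10 * inactivity
            + 0.05 * food_insecurity + rng.normal(0, 1, n))

X = np.column_stack([obesity, inactivity, food_insecurity])
coef, r_squared = fit_ols(X, diabetes)
print("intercept and slopes:", coef)
print("R-squared:", r_squared)
```

R-squared here is one minus the ratio of residual to total sum of squares, so a value of 0.549 means the three predictors together account for about 54.9% of the variation in the response.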
Further, I looked into t-tests, which were discussed in today’s class. I conducted a two-sample t-test on the diabetes data for urban and rural counties, which produced a t-statistic of about -9.26 (-9.25558917924443). A t-statistic this far from zero points to a significant disparity in diabetes rates between the two types of counties, with urban counties having a notably higher diabetes rate than rural counties. The sign of the statistic only reflects the order in which the two groups were passed to the test.
Additionally, the very low p-value of about 3.83e-20 (3.832914332163736e-20) confirms strong statistical significance: urban counties have a significantly higher diabetes rate than rural counties, and a difference this large is very unlikely to be due to random fluctuations in the data.
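As a rough illustration of the t-test step, the sketch below runs a two-sample t-test with `scipy.stats.ttest_ind` on synthetic data. The group means, sample sizes, and variable names are made up, and the choice of Welch's variant (`equal_var=False`) and of passing the rural group first are assumptions rather than a record of what I actually ran.

```python
import numpy as np
from scipy import stats

# Synthetic diabetes percentages (hypothetical values, not the real county data).
rng = np.random.default_rng(1)
rural = rng.normal(7.0, 1.0, 300)
urban = rng.normal(7.6, 1.0, 300)

# Welch's t-test does not assume the two groups have equal variances.
# With the rural group passed first, a negative t-statistic means the
# rural mean is lower than the urban mean.
t_stat, p_value = stats.ttest_ind(rural, urban, equal_var=False)

print("t-statistic:", t_stat)
print("p-value:", p_value)
```

Swapping the argument order flips the sign of the t-statistic but leaves the p-value unchanged, so the direction of the difference should be read from the group means together with the order of the arguments.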