I began by converting the categorical columns into numerical form so that clustering techniques could be applied. As a first step, I ran DBSCAN on the ‘latitude’, ‘longitude’, and ‘total_shootings’ columns of the dataset. I then plotted the resulting clusters on a map of the USA with the folium library, which made it possible to see shooting incidents that occurred in close geographic proximity.
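A minimal sketch of this DBSCAN step, written with scikit-learn, is shown here. The data are synthetic stand-ins rather than the real dataset, and the feature scaling, `eps`, and `min_samples` values are my assumptions for illustration, not the settings used in the original analysis.

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (assumption): two dense groups of incidents
# plus one isolated incident far from both.
rng = np.random.default_rng(0)
group_a = rng.normal(loc=[34.0, -118.2], scale=0.05, size=(20, 2))
group_b = rng.normal(loc=[40.7, -74.0], scale=0.05, size=(20, 2))
isolated = np.array([[47.0, -100.0]])
coords = np.vstack([group_a, group_b, isolated])

# Columns: latitude, longitude, total_shootings (counts are made up here).
total_shootings = rng.integers(1, 4, size=len(coords))
features = np.column_stack([coords, total_shootings])

# Assumption: standardize so that degrees and counts are on comparable scales.
scaled = StandardScaler().fit_transform(features)

# eps and min_samples are illustrative values, not tuned ones.
labels = DBSCAN(eps=0.5, min_samples=3).fit_predict(scaled)

# DBSCAN marks noise points with the label -1.
n_clusters = len(set(labels)) - (1 if -1 in labels else 0)
n_noise = int(np.sum(labels == -1))
print(f"clusters: {n_clusters}, noise points: {n_noise}")
```

For the map, each labeled point could then be added to a `folium.Map` as a `folium.CircleMarker` colored by its cluster label, with noise points drawn in a neutral color.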
Next, I looked into clustering methods that could extract meaningful insights from the dataset. I chose K-Means because it is an unsupervised method and does not require a target variable. I split the data into training and testing sets, trained the K-Means model on the training data, and then used the fitted model to predict cluster labels for the testing data. These results still need further analysis and evaluation.
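The split, fit, and predict steps can be sketched roughly as follows with scikit-learn. The feature matrix is synthetic, and the number of clusters, the 80/20 split, and the scaling step are assumptions made only for this example. The silhouette score at the end is one possible starting point for the evaluation that remains to be done, not a result from the actual data.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic numeric features (assumption): three loose groups of points.
rng = np.random.default_rng(42)
centers = np.array([[0.0, 0.0], [5.0, 5.0], [0.0, 5.0]])
X = np.vstack([rng.normal(c, 0.5, size=(50, 2)) for c in centers])

# No target variable is needed, so only the feature matrix is split.
X_train, X_test = train_test_split(X, test_size=0.2, random_state=42)

# Fit the scaler on the training data only, then reuse it for the test data.
scaler = StandardScaler().fit(X_train)
X_train_scaled = scaler.transform(X_train)
X_test_scaled = scaler.transform(X_test)

# n_clusters=3 is an illustrative choice, not a tuned value.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=42)
kmeans.fit(X_train_scaled)

# Assign each test point to the nearest learned centroid.
test_labels = kmeans.predict(X_test_scaled)

# Silhouette score lies in [-1, 1]; higher means tighter, better separated clusters.
score = silhouette_score(X_test_scaled, test_labels)
print(f"test points: {len(test_labels)}, silhouette: {score:.2f}")
```

Repeating this for several values of `n_clusters` and comparing the scores is one common way to judge whether the chosen number of clusters is reasonable.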