Each observation in the gun violence dataset is a single incident of a crime involving a gun. So far, I have focused on only three columns:
n_killed: The number of people killed in the incident
n_injured: The number of people injured in the incident
n_guns_involved: The number of guns involved in the incident
I also used the latitude and longitude columns in the process of grouping incidents by area.
To compare this dataset with the TEDS-D dataset, I had to determine the CBSA code for each incident. To do this, I obtained a list of CBSA codes and then used Google's Geocoding API to get a latitude and longitude for each one.
Using this list and the latitude and longitude columns of the gun violence dataset, I found the closest CBSA for each incident and assigned that CBSA's code to the incident.
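The nearest-CBSA step can be sketched roughly as follows. This is a minimal illustration, not the exact code used here: it assumes "closest" means smallest great-circle (haversine) distance, and the names haversine_km, nearest_cbsa, and cbsa_centers, along with the sample coordinates, are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def nearest_cbsa(lat, lon, cbsa_centers):
    """Return the code of the CBSA whose geocoded point is closest to (lat, lon)."""
    return min(
        cbsa_centers,
        key=lambda code: haversine_km(lat, lon, *cbsa_centers[code]),
    )


# Illustrative geocoded points (assumed values): CBSA code -> (latitude, longitude)
cbsa_centers = {
    "16980": (41.8781, -87.6298),  # point near Chicago
    "35620": (40.7128, -74.0060),  # point near New York
}

# One incident row, using the dataset's latitude and longitude columns
incident = {"latitude": 41.90, "longitude": -87.70}
incident["cbsa_code"] = nearest_cbsa(
    incident["latitude"], incident["longitude"], cbsa_centers
)
```

A linear scan like this is fine for a sketch; with many incidents and CBSAs, a spatial index would likely be faster. Note also that the nearest geocoded point is only an approximation of which CBSA an incident actually falls inside.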
From there, I grouped the data by CBSA code and totaled the columns (n_killed, n_injured, n_guns_involved) for each area. Because areas with more reported incidents would otherwise have larger totals simply by volume, I divided each total by the number of incidents reported for that area, which gives per-incident averages.
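As a rough sketch of the grouping and per-incident scaling, assuming the data is in a pandas DataFrame with a cbsa_code column (a hypothetical column name) and using made-up rows:

```python
import pandas as pd

# Made-up incidents for illustration; cbsa_code is an assumed column name
incidents = pd.DataFrame({
    "cbsa_code": ["16980", "16980", "35620"],
    "n_killed": [1, 0, 2],
    "n_injured": [0, 3, 1],
    "n_guns_involved": [1, 2, 1],
})

cols = ["n_killed", "n_injured", "n_guns_involved"]
grouped = incidents.groupby("cbsa_code")

totals = grouped[cols].sum()   # column totals per CBSA
n_incidents = grouped.size()   # number of incidents per CBSA

# Divide each CBSA's totals by its incident count (aligned on the CBSA index)
per_incident = totals.div(n_incidents, axis=0)
```

When no values are missing, this matches grouped[cols].mean(). If a column contains missing values, the two differ, because the sum skips missing entries while the incident count does not.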
Finally, I merged the treatment scores with the gun violence dataset by joining on the CBSA code.
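The join might look something like the sketch below. The join type is an assumption (an inner join, which keeps only CBSAs present in both tables), and the column name treatment_score and all values are placeholders rather than real results.

```python
import pandas as pd

# Placeholder per-CBSA gun violence averages (made-up values)
gun_by_cbsa = pd.DataFrame({
    "cbsa_code": ["16980", "35620", "99999"],
    "n_killed": [0.5, 2.0, 1.0],
})

# Placeholder treatment scores; treatment_score is a hypothetical column name
treatment_scores = pd.DataFrame({
    "cbsa_code": ["16980", "35620"],
    "treatment_score": [0.42, 0.57],
})

# Inner join on the CBSA code; validate raises if a code is duplicated on either side
merged = gun_by_cbsa.merge(
    treatment_scores,
    on="cbsa_code",
    how="inner",
    validate="one_to_one",
)
```

Keeping the CBSA code as a string in both tables avoids mismatches that can occur when one side is parsed as an integer.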