Talk:List of U.S. states and territories by intentional homicide rate

2014 info available now

Data for 2014 is here: https://www.fbi.gov/about-us/cjis/ucr/crime-in-the-u.s/2014/crime-in-the-u.s.-2014 Also, the bottom of this article links to a secondary source that re-digested the primary source; it would probably be better to link directly to the FBI. -SColombo (talk) 23:07, 28 October 2015 (UTC)

Data table

The table desperately needs to be sortable by homicide rate. Unfortunately, I don't know how to do that. Bond Head (talk) 05:38, 4 January 2021 (UTC)

Table moved here

I removed the US table from List of countries by intentional homicide rate#United States and put it here.

The regular editors there (like me) want the individual country tables moved to the separate articles, both to shorten that article and, hopefully, to get more concentrated editing on the individual country tables. They tend to get forgotten on the global page.

Feel free to integrate the two tables, or do whatever you want. --Timeshifter (talk) 01:23, 28 January 2021 (UTC)

Table needs a new legend

@Abbasi786786 and Guarapiranga: See diff. The legend was removed by Abbasi786786. Edit summary: "Removed now-obsolete legend; Great work on the shading!"

Old legend:

  0.0 - 0.9
  1.0 - 4.9
  5.0 - 9.9
  10.0 +

Could you guys create a new legend? I have no idea what the break points are for the new shading that Guarapiranga added. --Timeshifter (talk) 02:08, 22 May 2021 (UTC)

@Timeshifter: I don't believe we can make a legend anymore; it looks like the shading is now based on a color spectrum. There are far too many breakpoints for us to make one. Unless we have an option to add a color spectrum at the top? -- Abbasi786786 (talk) 02:15, 22 May 2021 (UTC)
Abbasi786786, I don't understand. The changes in color must be based on the numbers. I find it hard to believe it is using an infinite number of gradations. Can someone show me an example of a color spectrum in use on a Wikipedia table, not on an image embedded in an article? I have seen color spectrums in those embedded table images. I want to see some color spectrum legends used for actual Wikipedia tables. --Timeshifter (talk) 02:37, 22 May 2021 (UTC)
"I find it hard to believe it is using an infinite number of gradations."
Indeed it isn't; just 256. -- 𝐆𝐮𝐚𝐫𝐚𝐩𝐢𝐫𝐚𝐧𝐠𝐚 (talk) 02:48, 22 May 2021 (UTC)
@Timeshifter and Guarapiranga: So I guess the natural solution would be to embed a color spectrum explaining the shading scheme just above the chart. Does either of you know how to do that? I think it'd be a great improvement to the article. -- Abbasi786786 (talk) 02:52, 22 May 2021 (UTC)
A key is hardly necessary when all values are stated in each table cell (it's not a choropleth map), but if you find it utterly necessary, you could try copying it from {{Graph:Map}}. -- 𝐆𝐮𝐚𝐫𝐚𝐩𝐢𝐫𝐚𝐧𝐠𝐚 (talk) 03:46, 22 May 2021 (UTC)
OK. With 256 gradations, a legend will not work and is not needed. And as you say, the numbers are in each cell, unlike in most maps. So I agree that neither a legend nor a spectrum is needed for this table. --Timeshifter (talk) 04:53, 22 May 2021 (UTC)
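For anyone wondering what "256 gradations" amounts to in practice, here is a minimal sketch in R of how a rate could be mapped onto one of 256 shades. The linear scale, the white-to-red ramp, and the helper name rate_to_shade are all assumptions made for illustration; the template that actually shades the table may work differently.

```r
# Hypothetical helper: map rates linearly onto 256 shades (index 0..255).
# The linear scale and the white-to-red ramp are assumptions, not the
# method used by the table's shading template.
rate_to_shade <- function(rate, lo = min(rate), hi = max(rate)) {
  idx  <- round(255 * (rate - lo) / (hi - lo))  # 0 = lightest, 255 = darkest
  ramp <- grDevices::colorRampPalette(c("white", "red"))(256)
  data.frame(rate = rate, index = idx, colour = ramp[idx + 1])
}

# The upper ends of the old legend's first three bands, plus a high outlier
shades <- rate_to_shade(c(0.9, 4.9, 9.9, 21.3))
print(shades)
```

With this many steps, neighbouring values get nearly identical colours, which is why a band-style legend like the old one no longer fits.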

Please explain how all of the 2011 to 2020 data is derived

Abbasi786786. Guarapiranga. Please explain step-by-step exactly how the 2011 through 2020 data used in the table and map is derived from the references.

Map:

References:

Is there a download with all the 2011 through 2020 data?

Please provide exact links too.

And is this diff justified by data from the references? --Timeshifter (talk) 07:25, 22 February 2022 (UTC)

@Timeshifter: No, there is no single download link, at least not one that I can find (and I looked for a while). That's a big issue with the whole dataset, especially since, when I originally made these tables, I couldn't copy-paste and instead spent time manually entering all the numbers from the UCR Crime Reports.
Every crime report only had data for that specific year (numbers I now know are preliminary) and the year right before it (numbers that are finalized). While manually inputting the data, I skipped years, thinking that the numbers were already finalized and that, if there were differences, they wouldn't be too vast. As a result, 2010 and all the even-year data are preliminary, while 2011 and all the odd-year data are final (to my knowledge).
I realized this was a mistake this year when comparing the graph data from the crime explorer to the Wikipedia numbers, so I fixed the 2018 data (and, I think, the 2016 data too) by finalizing it. While the last few years are up to date and precise now, this is probably not true for the previous years (which I've been meaning to get around to but don't have much time to fix).
The last update isn't justified by anything; that's just vandalism. -- Preceding unsigned comment added by Abbasi786786 (talk • contribs) 22 February 2022 (UTC)

Conflating intentional homicide with murder

As I understand it, all murder is intentional homicide, but not all intentional homicide is murder, since it may be justified. This page seems to use the two terms interchangeably. Oktayey (talk) 15:43, 30 April 2023 (UTC)

How do you recommend fixing this issue? Dronebogus (talk) 00:02, 12 May 2023 (UTC)

Louisiana listed as 300+

Louisiana is listed at more than 300 per 100,000. This must of course be an error. 70.163.237.44 (talk) 22:32, 17 October 2023 (UTC)

I changed the 2021 table entry for Louisiana to match the map above. They now both say 21.3. 70.163.237.44 (talk) 22:36, 17 October 2023 (UTC)

Contradiction between zero rates and counts

I initially changed the zeros to "N/A" in the table for rates, but then realized that the problem lies with the CDC. According to the CDC source for 2021, New Hampshire had 15 homicides and a rate of 0, Vermont had 10 homicides and a rate of 0, and Wyoming had 16 homicides and a rate of 0. That makes no sense at all (I've also done a quick check of some non-zero states on the CDC's page, and it appears the problem only exists for the zero states). In 2021, New Hampshire had a population of 1.389 million (= homicide rate of 1.1 per 100K), Vermont had a population of 0.646 million (= homicide rate of 1.5 per 100K), and Wyoming had a population of 0.578 million (= homicide rate of 2.8 per 100K). We have the exact same problem for earlier years on the CDC's page, where they also list a rate of zero for some states, directly contradicting the adjacent count.
For now, I have left "N/A", as I think this is better than using wrong numbers for rates, but the best solution would be to put in the real numbers, despite the ref issue. If necessary, one can use WP:HIDDEN to explain and/or {{round|{{#expr: 15 / 13.89000 }}|1}} for the math (that example is for New Hampshire 2021, where "15" is the count, "13.89000" is the population in units of 100K, and "1" is the number of decimals in the result, so the template just shows the result "1.1" in the table). Unless someone objects or has a better idea, I'll do it later. RN1970 (talk) 11:18, 26 October 2023 (UTC)

If you click on the individual state links in question on that page, the destination pages list 'N/A'. This suggests that the values are not necessarily verified as homicides, or the data is/was preliminary, or it's just a badly constructed table on the main page. The CDC isn't a great source for homicide data to begin with, unfortunately, but that's a separate matter. cheers. anastrophe, an editor he is. 18:01, 26 October 2023 (UTC)
The CDC is trying to follow the law by suppressing any mortality data point with 20 deaths or fewer. Every time you use their query tool, they show a notice that it's a violation of federal law to report any count or rate based on such a tiny number. This is all for privacy reasons. Wizmut (talk) 06:36, 27 November 2024 (UTC)
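The rates quoted in the first comment of this thread can be rechecked from the counts and populations given there. A minimal sketch in R follows; the data frame and column names are made up for illustration, and the figures are the 2021 ones quoted above rather than a fresh download from the CDC.

```r
# 2021 counts and populations as quoted in this thread
# (typed in by hand, not re-downloaded from the CDC)
zero_states <- data.frame(
  state      = c("New Hampshire", "Vermont", "Wyoming"),
  homicides  = c(15, 10, 16),
  population = c(1389000, 646000, 578000)
)

# rate per 100,000 residents, rounded to one decimal
zero_states$rate <- round(zero_states$homicides / zero_states$population * 1e5, 1)
print(zero_states)
```

This is the same division the {{#expr:}} example performs, just with the population written in full instead of in units of 100K.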

Type table

I added a table of homicide rates by type. Unfortunately, the CDC loves to suppress small data values, so even public data such as homicide is suppressed if a given category has a small value (1 to 9). This makes the table incomplete, and the sorting meaningless for most columns. For this reason, I may chop off the columns with mostly suppressed values. This would leave only Total, Gun and Stab homicides. Wizmut (talk) 01:04, 13 February 2024 (UTC)

I would keep them. It is better than nothing. It at least gives people an idea that those methods happen less often.
The table needs a caption for screen reader accessibility. The "rate per ?" basis needs to go there too, to keep the header line narrow for cell phones. --Timeshifter (talk) 01:30, 13 February 2024 (UTC)
For those interested, here are the instructions for downloading the data from the CDC site and the R code for producing the rate table (and a table for totals).
The language used is R (programming language).
R code for US homicide by type
# Attempt to set working directory
# setwd(getSrcDirectory()[1]) # if running entire file
setwd(dirname(rstudioapi::getActiveDocumentContext()$path)) # if running section
options(scipen=999) # don't use scientific notation
library(dplyr)

# https://wonder.cdc.gov/ucd-icd10-expanded.html
# Section 1 will vary depending on the data needed (see below). Leave sections 2, 3 and 5 as default. In section 4 select the most recent year. In section 6, in the largest box, scroll down and select the last value "External causes of morbidity and mortality". In section 7 check all boxes except "Show totals".

# state homicide and population totals
# In section 1, group results by State and by Injury Intent; type in a title such as "CDC - State Intent".
state.homicide = read.table("CDC - State Intent.txt", header=T, sep="\t", fill=T) %>%
  filter(Injury.Intent == "Homicide") %>%
  mutate(Deaths = as.numeric(Deaths)) %>%
  select(State,Deaths,Population)

# homicide mechanism totals
# In section 1, group results by State and by Injury Mechanism; type in a title such as "CDC - Intent Mechanism".
homicide.mechanism = read.table("CDC - Intent Mechanism.txt", header=T, sep="\t", fill=T) %>%
  filter(Injury.Intent == "Homicide") %>%
  select(4,6) %>%
  setNames(nm = c("Mechanism","Deaths")) %>%
  mutate(Mechanism = c("Stab","Drown.Other","Fall.Other","Fire","Hot.Other","Gun","Road",
                       "Poison","Struck","Choke","Other1","Other2","Other3")) %>%
  filter(!stringr::str_detect(Mechanism, "Other")) %>%
  mutate(Deaths = as.numeric(Deaths)) %>%
  arrange(desc(Deaths))

# state homicide by mechanism
# In section 1, group results by State, Injury Intent, and Injury Mechanism; type in a title such as "CDC - State Intent Mechanism".
state.homicide.mechanism = read.table("CDC - State Intent Mechanism.txt", header=T, sep="\t", fill=T) %>%
  filter(Injury.Intent == "Homicide") %>%
  select(2,6,8) %>%
  setNames(nm = c("State","Mechanism","Deaths")) %>%
  mutate(Deaths = as.numeric(ifelse(Deaths == "Suppressed", "-1", Deaths))) %>%
  tidyr::pivot_wider(names_from = Mechanism, values_from = Deaths, values_fill=0) %>%
  setNames(nm = c("State","Stab","Drown","Fall","Fire","Hot.Other","Gun",
                  "Road","Poison","Struck","Choke","Other1","Other2","Other3")) %>%
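  inner_join(state.homicide, by = "State") %>%
  select(State,Population,Deaths,Gun,Stab,Choke,Struck,Poison,Road,Fire) %>%
  # national row: mechanism totals are matched to the columns by position, so this relies on
  # homicide.mechanism (sorted by deaths) coming out in the same order as Gun..Fire above
  bind_rows(setNames(data.frame("United States", sum(.$Population), sum(.$Deaths), t(homicide.mechanism$Deaths)), names(.))) %>%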
  
  mutate(State = paste0("{{flagg|uspeft|pref=Crime in|",State,"}}")) %>%
  mutate(State = ifelse(stringr::str_detect(State,"United States"), 
                        "{{noflag|'''United States'''}}", State)) %>%
  mutate(State = ifelse(stringr::str_detect(State,"Georgia"), 
                        "{{flagg|uspeft|pref=Crime in|Georgia (U.S. state)|name=Georgia}}", State)) %>%
  mutate(State = ifelse(stringr::str_detect(State,"District"), 
                        "{{flagg|uspeft|pref=Crime in|District of Columbia|name=District of Columbia}}", State))

state.homicide.totals = state.homicide.mechanism %>%
  select(-Population) %>%
  arrange(desc(Deaths)) %>%
  mutate(across(where(is.numeric), ~ifelse(. == -1, "", as.character(.))))
write.csv(state.homicide.totals,"state_homicide_totals.csv",row.names=F)

state.homicide.rates = state.homicide.mechanism %>%
  mutate(across(c(Deaths:Fire), ~./Population*10^5)) %>%
  select(-Population) %>%
  arrange(desc(Deaths)) %>%
  arrange(-stringr::str_detect(State, "United States")) %>%
  mutate(across(where(is.numeric), ~ifelse(. < 0, "", 
                                    ifelse(. == 0, "0",
                                    ifelse(. < 0.05, "0.0",
                                    sprintf("%.1f", .) )))))
write.csv(state.homicide.rates,"state_homicide_rates.csv",row.names=F)
Wizmut (talk) 02:41, 13 February 2024 (UTC)
Thanks. I suggest creating a subpage for these instructions. See example subpages linked from the note at the top of this talk page:
Talk:List of countries by intentional homicide rate
--Timeshifter (talk) 03:42, 13 February 2024 (UTC)
Good idea. I created the page Talk:List of U.S. states and territories by intentional homicide rate/Table creation code.
Wizmut (talk) 03:54, 13 February 2024 (UTC)
Thanks. Please add an undated note with that link at the top of the page just below all the header stuff. Since it is there and undated, the talk bot will not archive it. And it is easily visible to other editors who want to learn more. --Timeshifter (talk) 04:36, 13 February 2024 (UTC)

Decade table

I replaced the table for rates by year with a table of rates by decade. If you would still like to see yearly data, I suppose I could put that up as well, but I think decade alone might be good enough (and better value for each column).

I also used FBI data instead of CDC data, which solves the problem of missing values.

Here is the relevant R code for generating this and similar tables:

R code for US state homicide data by decade
# Attempt to set working directory
# setwd(getSrcDirectory()[1]) # if running entire file
setwd(dirname(rstudioapi::getActiveDocumentContext()$path)) # if running section
options(scipen=999) # don't use scientific notation
library(dplyr)

# FBI data
# https://cde.ucr.cjis.gov/LATEST/webapp/#/pages/downloads
# Additional Datasets > Summary Reporting System (SRS) > Download

# all crimes, states, years
crime_yearly = read.csv("estimated_crimes_1979_2022.csv") %>%
  mutate(across(where(is.numeric), ~ifelse(is.na(.), 0, .))) %>%
  filter(state_abbr != "") %>%
  mutate(rape_legacy = rape_legacy + rape_revised) %>%
  rename(State = state_name) %>%
  select(-c(state_abbr,rape_revised))

# US overall
us_yearly = crime_yearly %>%
  group_by(year) %>%
  summarise(across(where(is.numeric), sum)) %>%
  ungroup %>%
  mutate(State = "United States")

# append US totals
crime_yearly = crime_yearly %>%
  bind_rows(us_yearly)

# homicide by location and year
homicide_yearly = crime_yearly %>%
  select(year,State,population,homicide) %>%
  mutate(rate = homicide/population*10^5)

# average of each decade
homicide_decade = homicide_yearly %>%
  filter(year > 1979) %>%
  mutate(decade = paste0(floor(year/10)*10,"s")) %>%
  select(decade,State,homicide,rate) %>%
  
  group_by(decade,State) %>%
  summarise(across(where(is.numeric),mean)) %>%
  ungroup %>%
  
  tidyr::pivot_wider(names_from = decade, values_from = c(homicide,rate)) %>%
  
  mutate(State = paste0("{{flagg|uspeft|pref=Crime in|",State,"}}")) %>%
  mutate(State = ifelse(stringr::str_detect(State,"United States"), 
                        "{{noflag|'''United States'''}}", State)) %>%
  mutate(State = ifelse(stringr::str_detect(State,"Georgia"), 
                        "{{flagg|uspeft|pref=Crime in|Georgia (U.S. state)|name=Georgia}}", State)) %>%
  mutate(State = ifelse(stringr::str_detect(State,"District"), 
                        "{{flagg|uspeft|pref=Crime in|District of Columbia|name=District of Columbia}}", State))

total_table = homicide_decade %>%
  select(State,homicide_1980s:homicide_2020s) %>%
  arrange(desc(homicide_2020s)) %>%
  arrange(-stringr::str_detect(State, "United States"))
write.csv(total_table,"state_homicide_total_decade.csv",row.names=F)

rate_table = homicide_decade %>%
  select(State,rate_1980s:rate_2020s) %>%
  arrange(desc(rate_2020s)) %>%
  arrange(-stringr::str_detect(State, "United States"))
write.csv(rate_table,"state_homicide_rate_decade.csv",row.names=F)

Wizmut (talk) 03:45, 13 February 2024 (UTC)

Wizmut. How is this calculated? Do you total the rate for each year, and then divide by ten? Or do you add up all the homicides, divide by ten, and then divide by the population in the middle of the decade? Or what? That needs to be in the reference. --Timeshifter (talk) 14:33, 3 March 2024 (UTC)
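Going by the code posted above, the line `summarise(across(where(is.numeric),mean))` takes the plain mean of the yearly rates within each decade, which is the first of the options in the question. The sketch below, in base R with made-up numbers for one hypothetical state, shows that this is not quite the same as pooling homicides and population over the whole period.

```r
# Made-up yearly figures for one hypothetical state (not FBI data)
yearly <- data.frame(
  year       = c(2020, 2021, 2022),
  homicide   = c(100, 150, 180),
  population = c(1000000, 1100000, 1500000)
)
yearly$rate <- yearly$homicide / yearly$population * 1e5

# Method 1: mean of the yearly rates (what the decade code above appears to compute)
mean_of_rates <- mean(yearly$rate)

# Method 2: pooled rate, total homicides divided by total person-years
pooled_rate <- sum(yearly$homicide) / sum(yearly$population) * 1e5

print(round(c(mean_of_rates = mean_of_rates, pooled_rate = pooled_rate), 2))
```

The two methods agree only when the population is constant across the years, so whichever one is used is worth stating in the reference.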

Homicides by year

As an aside, I am not sure how the last table, total homicides by year, should be refurbished. Averages for each decade don't make as much sense for raw totals. I would be open to suggestions; most of the code is already written. Wizmut (talk) 03:49, 13 February 2024 (UTC)
I added back the homicides by year table before I saw this talk section. There are many list articles with more tables in them. I put it on top since the article's title is about homicide rate. People expect the latest numbers first, then the other tables. --Timeshifter (talk) 04:05, 13 February 2024 (UTC)
Alrighty, but which years should be in a yearly table? The latest five? I wouldn't do more than eight to keep it skinny, especially since the latest year would be on the right, and that's what people want to see first. Wizmut (talk) 04:09, 13 February 2024 (UTC)
OK. I kept the last 5 years for the yearly rates and counts. In landscape view I can barely see 6 years on my iPhone SE 2020 (a smaller cell phone). --Timeshifter (talk) 04:38, 13 February 2024 (UTC)
I have further updated those tables so they now have the years 2018-2022. Please let me know if you see any errors or have any further comments. Wizmut (talk) 05:09, 13 February 2024 (UTC)
Thanks. I don't have time nowadays to do detailed checks. One thing I have learned is that it is better to do a full update of all years than to try to paste on the latest year. Many mistakes are possible that way. The numbers for all years are often continually updated. And it is easy to mix up the states, depending on alphabetization, DC placement, totals, etc. --Timeshifter (talk) 05:39, 13 February 2024 (UTC)
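If a single year ever does have to be added on its own, matching rows by state name rather than by position guards against the mix-ups described above. A minimal sketch in base R; the two small data frames, their column names, and all the numbers in them are invented for illustration.

```r
# Invented example data: the existing table and a newly downloaded year,
# deliberately listed in a different row order
old_years <- data.frame(State = c("Alabama", "Alaska", "District of Columbia"),
                        rate_2021 = c(11.0, 6.0, 30.0))
new_year  <- data.frame(State = c("District of Columbia", "Alabama", "Alaska"),
                        rate_2022 = c(29.0, 10.0, 9.0))

# merge() matches on the State column, so row order in either input is irrelevant
combined <- merge(old_years, new_year, by = "State", all = TRUE)

# a state present in only one input would show up here with an NA
unmatched <- combined[!complete.cases(combined), ]

print(combined)
print(nrow(unmatched))
```

A full regeneration of all years from the latest download, as suggested above, still avoids the separate problem of earlier years being revised.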

Wizmut. I can't figure out where you got the new 2018-2022 FBI homicide rates from. See diff where you added new 2018-2022 rates.

The FBI reference currently in the article only shows counts, not rates, for the years 2018 to 2022:

  • At the bottom under 'Additional Datasets' find 'Summary Reporting System (SRS)' and click 'Download'. "Crime Data Explorer"

Please provide a clear step-by-step reference that meets WP:V, or I will have to revert the page to the earlier charts and maps.

Or I guess you used the code in the talk subpage. In that case a reference is still needed in the article for population, and it needs to explain whether you used mid-year, beginning-of-year, or end-of-year population. WP:CALC says this is not original research, since anybody could calculate the rate state by state by dividing the number of homicides by the population.

The R code does not need to be discussed in the article or its references. The talk subpage is good enough. --Timeshifter (talk) 21:49, 2 March 2024 (UTC)

The CSV file gives population estimates right in the file download. I can't find where the FBI discusses the time of year for the most recent estimates, but in 2019 they used calculations from the US Census[1], which are usually mid-year (in 2020 they gave estimates for both census day and mid-year). And their 2022 estimates are almost exactly the same as the US Census's[2], perhaps from a different revision. Wizmut (talk) 22:15, 2 March 2024 (UTC)
Thanks for updating the reference! I had a brain fart and didn't notice the population column.
Now it is easy for any reader to check a random state or two and see if the calculation you describe in the reference ends up as the same number in the table. Which I just did. --Timeshifter (talk) 22:22, 2 March 2024 (UTC)

Map update

I am going to try to update this map:

Old instructions:

New instructions (a work in progress):

--Timeshifter (talk) 16:42, 26 February 2024 (UTC)

I will create a separate map with 2022 FBI rates. --Timeshifter (talk) 23:13, 2 March 2024 (UTC)
Both maps (FBI and CDC) exist now.
Commons:File:Homicide rates per 100,000 by state. FBI. US map.svg
Commons:File:Homicide rates per 100,000 by state. CDC. US map.svg
--Timeshifter (talk) 19:54, 5 March 2024 (UTC)