Talk:Histogram/Archives/2016

Assessment comment

The comment(s) below were originally left at Talk:Histogram/Archives/Comments, and are posted here for posterity. Following several discussions in past years, these subpages are now deprecated. The comments may be irrelevant or outdated; if so, please feel free to remove this section.

This is a decent article but to reach "good article" status it could do with some general clean-up and other improvements, e.g.:
  • Make clear that although bin widths can vary, in practice this is unusual.
  • Better figures for the "travel time" example: show rectangles, not just outlines (some stats software can show histograms with varying bin widths), and add a clearer caption. Done --Qwfp (talk) 13:50, 23 February 2008 (UTC)
  • The "See also" link to kernel density estimation should perhaps be in the main text, instead of the link to kernel near the start.
  • Better referencing.

This article gets a lot of views so seems worth some effort. I may have an edit myself at some point...

Qwfp (talk) 12:37, 22 February 2008 (UTC)

Last edited at 13:50, 23 February 2008 (UTC). Substituted at 17:58, 29 April 2016 (UTC)

Error: Histogram vs. Bar Chart

Most of the images shown on the page are bar charts, not histograms. This could lead to confusion amongst readers. —Preceding unsigned comment added by 129.170.241.32 (talk) 19:10, 8 January 2010 (UTC)

The main difference between bar charts and histograms is that in a histogram there is no natural separation between the rectangles. The bars in a bar chart are separated to clarify the separation of classes. In fact the graphics in the article are histograms, not bar charts. Phill779 (talk) 09:20, 6 April 2011 (UTC)

I understand what you are saying, no need to re-explain it to me, and a vast number of people may agree with you, but it makes no particular sense for that to be the definition of a histogram. It's just plotting points and then arbitrarily drawing rectangles; alternatively, you could connect the points with line segments and call it a line chart, because your definitions of bar chart and line chart are the same as your definition of histogram. A much more useful definition of histogram (and what I was taught was a histogram at a Major University(TM) a long time ago) is the one with the widths varying such that the area of each "bar" comes out the same. The benefit of this histogram is that it more meaningfully represents a population density function, which is what these charts are trying to represent: it represents equally well the many points sampled near any meaty part of the distribution, while a wider bar encompasses the same number of sparser points sampled in low-density areas, smoothing out random variation in a natural way. (The distinction between the sets of numbers you apply your histograms and bar charts to is a valid one, but the same distinction applies to my/our definition of histogram too, so it's not a meaningful one, save for the meaning to the datasets.) 68.173.49.156 (talk) 20:25, 20 February 2015 (UTC)
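For readers following the thread, the variable-width idea in the comment above (each bar covering the same number of observations, hence the same area) can be sketched roughly as follows. This is only an illustration: the function name `equal_count_edges`, the sample values, and the choice of placing each edge midway between neighbouring data points are all assumptions made here, not something taken from the article or from the commenter.

```python
def equal_count_edges(data, n_bins):
    """Bin edges placed so each bin holds (roughly) the same number of points.

    Assumes no tied values and len(data) divisible by n_bins.
    """
    xs = sorted(data)
    n = len(xs)
    edges = [xs[0]]
    for k in range(1, n_bins):
        # Put the edge midway between the two neighbouring sorted values.
        i = k * n // n_bins
        edges.append((xs[i - 1] + xs[i]) / 2)
    edges.append(xs[-1])
    return edges

# Hypothetical skewed sample: dense at the low end, sparse in the tail.
data = [1.0, 1.2, 1.5, 1.7, 2.0, 2.2, 2.6, 3.1, 4.0, 5.5, 8.0, 14.0]
edges = equal_count_edges(data, 4)

# Count the points in each bin; the last bin includes its right edge.
counts = []
for j in range(len(edges) - 1):
    lo, hi = edges[j], edges[j + 1]
    last = j == len(edges) - 2
    counts.append(sum(1 for x in data if lo <= x < hi or (last and x == hi)))

widths = [hi - lo for lo, hi in zip(edges, edges[1:])]
# Height is count / width, so every bar has the same area (its count).
heights = [c / w for c, w in zip(counts, widths)]

for lo, hi, c, h in zip(edges, edges[1:], counts, heights):
    print(f"[{lo:.2f}, {hi:.2f}]  count={c}  height={h:.3f}")
```

In this sketch the bars are narrow and tall where the sample is dense and wide and short in the tail, which is the behaviour the comment describes.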

I agree "histogram" is the common term for these things that have bar heights equal to data counts, as shown in the black cherry tree example. But the description describes something else: "A histogram consists of tabular frequencies, shown as adjacent rectangles, erected over discrete intervals (bins), with an area equal to the frequency of the observations in the interval. The height of a rectangle is also equal to the frequency density of the interval, i.e., the frequency divided by the width of the interval. The total area of the histogram is equal to the number of data." One could define a plot this way, but in typical use and in the cherry tree example, the height is equal to the count, and the area is equal to the count times the bin width. — Preceding unsigned comment added by Tplane (talk • contribs) 15:20, 13 July 2011 (UTC)

It's not a case of "one could define it in this way"; this is the definition of a histogram! Only when the bin width equals 1 does the numerical value of the bar height equal the frequency. Incidentally, this is a serious problem with all the histograms in this article: the vertical axes are labelled "frequency" or "count", even for non-unity widths. The second paragraph of the article even explicitly warns against doing this: "The vertical axis is not frequency but density...". 129.67.149.107 (talk) 13:34, 18 March 2016 (UTC)
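To make the two conventions in this exchange concrete, here is a small sketch that bins the same data once and then reports both the raw count and the frequency density (count divided by bin width) for each bin. The data, the bin edges, and the helper names `bin_counts` and `frequency_density` are made up for illustration; they are not taken from the article's examples.

```python
from bisect import bisect_right

def bin_counts(data, edges):
    """Count observations per bin; bins are [edges[i], edges[i+1]), last bin closed."""
    counts = [0] * (len(edges) - 1)
    for x in data:
        if x < edges[0] or x > edges[-1]:
            continue  # outside the binned range
        i = min(bisect_right(edges, x) - 1, len(counts) - 1)
        counts[i] += 1
    return counts

def frequency_density(counts, edges):
    """Bar height under the area convention: count divided by bin width."""
    return [c / (edges[i + 1] - edges[i]) for i, c in enumerate(counts)]

# Hypothetical data and unequal bin edges, chosen only for illustration.
data = [1, 2, 2, 3, 4, 6, 7, 12, 15, 18]
edges = [0, 5, 10, 20]

counts = bin_counts(data, edges)             # heights if the axis is "count"
heights = frequency_density(counts, edges)   # heights if the axis is "density"
areas = [h * (edges[i + 1] - edges[i]) for i, h in enumerate(heights)]

for i, (c, h, a) in enumerate(zip(counts, heights, areas)):
    print(f"[{edges[i]}, {edges[i + 1]}]  count={c}  density={h:.2f}  area={a:.1f}")
print("total area:", sum(areas), " n:", len(data))
```

With unequal widths the two sets of heights differ in shape, and only the density version has bar areas that sum to the number of observations; with a bin width of 1 the two coincide numerically, as noted above.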

"The total area of the histogram is equal to the number of data." does not compute - an area cannot be "equal" to a number. ITOtto (talk) 05:56, 2 April 2013 (UTC)

There is a much simpler distinction, one so obvious and important that I cannot understand why it has not been added already. A bar chart is a visualization of the relationship between TWO variables (one categorical, one quantitative). A histogram is a visualization of the distribution of ONE variable (quantitative). Is there some reason this is not the accepted difference? 104.194.119.200 (talk) 06:32, 28 September 2016 (UTC)
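A rough sketch of the one-variable versus two-variable distinction, in terms of the data each chart starts from. All names and figures here are hypothetical and serve only to show the shape of the inputs.

```python
from collections import Counter

# Bar chart input: one categorical variable paired with one quantitative value.
# Hypothetical figures for illustration only.
sales_by_region = {"North": 120, "South": 95, "East": 143}

# Histogram input: a single quantitative variable, binned into intervals.
heights_cm = [151, 158, 162, 163, 167, 170, 171, 174, 179, 186]
bin_width = 10
bin_counts = Counter((h // bin_width) * bin_width for h in heights_cm)
histogram = sorted(bin_counts.items())  # (left bin edge, count) pairs

print(sales_by_region)
print(histogram)
```

One consequence of this framing is that the bars of a bar chart can be reordered freely, since the categories have no inherent order, whereas the bins of a histogram must stay in numerical order.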