Talk:Empirical distribution function

Wiki Education Foundation-supported course assignment

  This article was the subject of a Wiki Education Foundation-supported course assignment, between 27 August 2021 and 19 December 2021. Further details are available on the course page. Student editor(s): Shg7D1. Peer reviewers: Jx2022, Philipphaku, Joycecs.

Above undated message substituted from Template:Dashboard.wikiedu.org assignment by PrimeBOT (talk) 20:33, 16 January 2022 (UTC)

"Logical problems" section

This section either needs serious rework or to be removed entirely. It reads like gibberish and isn't particularly relevant to the article topic. 74.124.58.202 (talk) 17:02, 21 December 2016 (UTC)

A visual example

A visual example would be helpful here. — Preceding unsigned comment added by 70.90.143.154 (talk) 13:13, 13 October 2011 (UTC)

I just created a visualization and added it. Bscan (talk) 15:15, 25 March 2013 (UTC)

Quantile Function

Isn't it generally easier both to generate and to use the quantile function when working with empirical or simulated data? If so, then yes, the "Empirical Distribution Function" is an analog of the cdf, but we might restructure this article to emphasize that when one wishes to characterize a distribution empirically, it is more common to do so via quantiles. Dbooksta (talk) 01:09, 20 March 2014 (UTC)
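
To make the comparison in the comment above concrete, here is a minimal sketch of both objects computed from the same sample. The function names `ecdf` and `empirical_quantile` are hypothetical and chosen for illustration, and the quantile uses the generalized-inverse convention (smallest observation x with F_n(x) >= p), which is only one of several conventions in use.

```python
from bisect import bisect_right
import math


def ecdf(sample):
    """Return F_n, where F_n(x) = (number of observations <= x) / n."""
    data = sorted(sample)
    n = len(data)
    if n == 0:
        raise ValueError("sample must be non-empty")

    def F(x):
        # bisect_right counts how many sorted observations are <= x.
        return bisect_right(data, x) / n

    return F


def empirical_quantile(sample, p):
    """Generalized inverse of the ECDF: smallest observation x with F_n(x) >= p."""
    if not 0 < p <= 1:
        raise ValueError("p must be in (0, 1]")
    data = sorted(sample)
    n = len(data)
    if n == 0:
        raise ValueError("sample must be non-empty")
    k = math.ceil(p * n)  # rank of the order statistic to return
    return data[k - 1]


if __name__ == "__main__":
    sample = [3, 1, 2, 5, 4]
    F = ecdf(sample)
    print("F_n(2) =", F(2))
    print("median =", empirical_quantile(sample, 0.5))
```

Both functions need only one sort of the data, so in this sketch neither is meaningfully harder to compute than the other; which is more convenient depends on the question being asked of the sample.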

External links modified

Hello fellow Wikipedians,

I have just modified one external link on Empirical distribution function. Please take a moment to review my edit. If you have any questions, or need the bot to ignore the links, or the page altogether, please visit this simple FaQ for additional information. I made the following changes:

When you have finished reviewing my changes, please set the checked parameter below to true or failed to let others know (documentation at {{Sourcecheck}}).

This message was posted before February 2018. After February 2018, "External links modified" talk page sections are no longer generated or monitored by InternetArchiveBot. No special action is required regarding these talk page notices, other than regular verification using the archive tool instructions below. Editors have permission to delete these "External links modified" talk page sections if they want to de-clutter talk pages, but see the RfC before doing mass systematic removals. This message is updated dynamically through the template {{source check}} (last update: 5 June 2024).

  • If you have discovered URLs which were erroneously considered dead by the bot, you can report them with this tool.
  • If you found an error with any archives or the URLs themselves, you can fix them with this tool.

Cheers. InternetArchiveBot (Report bug) 01:19, 24 December 2016 (UTC)

Graph needs to be redrawn

The lead contains this image:

[Image caption: The blue line shows an empirical distribution function. The black bars represent the observations in the sample corresponding to the sample’s empirical distribution function, and the gray curve is the true cumulative distribution function.]

This graph is self-contradictory: The true population cumulative distribution reaches 1 at about X = 3, meaning that in the entire population there is no value greater than 3. But the empirical distribution function shows that there is a value of about 3.5 sampled from that population, which is impossible.

Can someone please redraw the graph to fix this? Thanks. Loraof (talk) 22:09, 8 January 2018 (UTC)

New graph

{{CSS image crop|Image = ECDF-100.png|bSize = 2500|cWidth = 250|cHeight = 250|oTop = {{#invoke:MjolnirPants|GraphPickerX}}|oLeft = {{#invoke:MjolnirPants|GraphPickerY}}|Location = right|Description = The blue step function graph shows an empirical distribution function. The grey bars represent the observations in the sample corresponding to the sample’s empirical distribution function, and the green curve, which asymptotically approaches a height of 1 without reaching it, is the true cumulative distribution function. ({{purge|Click here to load a new graph.}})}}

I've swapped the graph for a new specially designed widget (left). The specific graph shown is just one selected at random on-the-fly from a hundred such graphs. Click on the graph to see the whole set. Clicking on the link in the caption will purge the page, replacing the graph with another one selected at random. The rationale behind this is given at User talk:Loraof#New graph.

Please take a look at the set of 100 graphs, maybe read through the discussion, and most importantly try clicking the "Click here to load a new graph" link. Any feedback would be welcome. Cheers. nagualdesign 19:03, 30 January 2018 (UTC)

@Loraof: @Nagualdesign: Hi, I'm the original creator of the graph and I feel like I missed the party. I just saw all of the lengthy discussions posted on https://en.wikipedia.org/wiki/User_talk:Nagualdesign/Archive_6 and https://en.wikipedia.org/wiki/User_talk:Loraof . For the random value > 3, you are correct that the CDF is near 1 at x=3 (99.7% of samples will have magnitude less than 3), but since we are drawing 20 points, the probability of having at least one with magnitude greater than 3 is 1 - 0.997^20, or about 6%, which isn't too unlikely. I likely also reran the script a few times until I got a chart that had some interesting characteristics and mostly "looked right". The original graph had the Matlab code attached (on https://commons.wikimedia.org/wiki/File:Empirical_CDF.png), which I believe was helpful for people trying to dig further into the graph. Can you add the code you used to generate the new graphs as info on the file? Thanks! Bscan (talk) 18:42, 13 April 2018 (UTC)
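
The arithmetic in the comment above can be checked with a short sketch. It assumes, as the 99.7% figure suggests but the comment does not state outright, that the 20 points are independent draws from a standard normal distribution; the helper names `normal_cdf` and `prob_at_least_one_beyond` are hypothetical.

```python
import math


def normal_cdf(z):
    """Standard normal CDF, written in terms of the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def prob_at_least_one_beyond(threshold, n):
    """P(at least one of n i.i.d. standard normal draws has |X| > threshold).

    Assumption: the draws are independent and standard normal.
    """
    p_within = normal_cdf(threshold) - normal_cdf(-threshold)
    return 1.0 - p_within ** n


# Figure quoted in the comment, using the rounded 99.7% value.
rounded = 1.0 - 0.997 ** 20

# Same quantity using the unrounded normal probability.
unrounded = prob_at_least_one_beyond(3.0, 20)

print(f"with rounded 0.997: {rounded:.4f}")
print(f"with normal CDF:    {unrounded:.4f}")
```

Either way the result is roughly 5 to 6 percent, so a sample of 20 containing one observation beyond 3 in magnitude is uncommon but not implausible.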

@Bscan: Done. I've copy/pasted the code from my talk page archive to the image file page.[1] Regards, nagualdesign 19:50, 13 April 2018 (UTC)

@Nagualdesign: Great, thanks! One more request if you have time. One mistake I made in the original was using English instead of symbols for "Cumulative Probability". If we use less text, the image can be used on all the other language wikis. As it stands, each translated copy needs to be updated separately (for example, I noticed the Ukrainian one: https://commons.wikimedia.org/wiki/File:Empirical_CDF_uk.png ), or the image is displayed in the wrong language, such as on the Chinese Wikipedia: https://zh.wikipedia.org/wiki/%E7%BB%8F%E9%AA%8C%E5%88%86%E5%B8%83%E5%87%BD%E6%95%B0 . Some discussion about that is here too: https://en.wikipedia.org/wiki/Template_talk:Infobox_probability_distribution#Standard_Plots . I looked at a bunch of the probability distributions to see how they label CDFs and there isn't much uniformity. Lowercase x is generally on the x-axis, and the y-axis is either some capital-letter function like P(x), F(x), or P(X <= x), or it's unlabeled. I think I like P(x), but I'm flexible. At some point, I'll need to update the KS test ones too: https://en.wikipedia.org/wiki/Kolmogorov%E2%80%93Smirnov_test Bscan (talk) 18:03, 15 April 2018 (UTC)

Relevant: I built another empirical CDF chart and added it to the Dvoretzky–Kiefer–Wolfowitz inequality page. Bscan (talk) 20:26, 16 April 2018 (UTC)

Random graph display

I'm not sure why, but neither method (compare the original method we used, seen in this revert, to Pppery's direct invocation of Random|number) of displaying a random portion of the graph on a page reload is working, at least for me in Firefox. ᛗᛁᛟᛚᚾᛁᚱPants Tell me all about it. 15:12, 16 October 2018 (UTC)

@Pppery: See above... ᛗᛁᛟᛚᚾᛁᚱPants Tell me all about it. 19:02, 16 October 2018 (UTC)
Caching. {{3x|p}}ery (talk) 19:03, 16 October 2018 (UTC)
I'm Ctrl+F5ing... so try again with less condescension this time. ᛗᛁᛟᛚᚾᛁᚱPants Tell me all about it. 19:05, 16 October 2018 (UTC)

Clicking the "click here to load a new graph" link works for me. I don't think it's possible to make refreshing the page work, because, as I said, the result of parsing Wikitext is cached. {{3x|p}}ery (talk) 19:08, 16 October 2018 (UTC)

Do you know what Ctrl+F5 does? It force clears the cache. It's the method I was using to test the function that I wrote specifically for this page. And I've tried the link, as well. For future reference, it does the exact same thing as Ctrl+F5. ᛗᛁᛟᛚᚾᛁᚱPants Tell me all about it. 19:15, 16 October 2018 (UTC)
  • So I just checked it in Chrome, and I can get a new graph using the functions in the MjolnirPants module, whether I Ctrl+F5 or hit the "click here to load a new..." link. I cannot, however, get it to display a new graph using the #expr call, no matter what I do. And I can't get it to display a new graph in Firefox under any circumstances. ᛗᛁᛟᛚᚾᛁᚱPants Tell me all about it. 19:22, 16 October 2018 (UTC)
Checking some more machines, I'm seeing something similar. I found that loading a new tab in FF will almost always generate some new numbers, no matter how fast I open new tabs (so it's probably working, just occasionally hitting the same combination of values twice in a row because it's not a real RNG). Then, I found that every single browser/OS combination I tried at work (FF, Chrome and IE11 on Win Server 2008, Ubuntu, Win7 and Win10, going out on a cable line), using either Ctrl+F5 or hitting the link (if you enable the time at the top right, hitting that does the same thing), will always give a new set of numbers, using either version, after one set of numbers has been used for about 4 minutes. So I'm still not sure if this is a parsing issue or something else. To make things even weirder, I'm getting the exact same results at home now (checking using every combination of FF, Chrome and Safari on Raspbian, WinXP, Win7 and Win10, as well as an Epiphany version on Raspbian and Edge on Win7 and Win10; all going over a pair of bonded cable connections), checking both Ctrl+F5 and the clock link, only having to wait 1-2 minutes between refreshes.
I give up and I'm too lazy to file a bug report. Just be aware that it's not always working. For the record, I was verifying by inspecting the element, then tracking the top and left position of the div that contains the image (line 47 of the source). ᛗᛁᛟᛚᚾᛁᚱPants Tell me all about it. 23:14, 16 October 2018 (UTC)

"Statistical distribution" redirect page

I noticed that "statistical distribution" redirects to "Empirical distribution function," but "statistical distributions" redirects to "probability distribution." Is this an error? Jarble (talk) 21:50, 12 October 2019 (UTC)

Mean, variance, MSE, quantiles

I don't really understand the inclusion of these sections. These are more general concepts and they don't really fit into this article. I have removed these sections, but feel free to revert this if you think I am mistaken. Harrydiv321 (talk) 10:35, 7 April 2024 (UTC)