Talk:Estimator

Wiki Education Foundation-supported course assignment

  This article was the subject of a Wiki Education Foundation-supported course assignment, between 27 August 2021 and 19 December 2021. Further details are available on the course page. Student editor(s): Ziyanggod. Peer reviewers: Yungam99, GeorgePan1012, Jiang1725.

Above undated message substituted from Template:Dashboard.wikiedu.org assignment by PrimeBOT (talk) 20:52, 16 January 2022 (UTC)

near-circularity of definition

One should be able to define this word better, without immediately linking it to "estimate".

:In mathematics designations need to be redefined. This is fine. Limit-theorem (talk) 19:14, 6 August 2018 (UTC)

Unbiased section is confusing

The 2nd paragraph of the subsection titled "Unbiased" is quite confusing. I'm not sure what it's trying to say. It should be rewritten. Vired (talk) 04:05, 6 April 2024 (UTC)

I think the confusion comes from the statement  . Indeed, if we consider   and follow this statement, then we get a contradiction  . I believe this statement should be changed into  , the expression that is also used in the Unbiased estimation of standard deviation Wikipedia page and also is used as an example in the Bias of an estimator Wikipedia page. Then indeed we would get  . Does this look correct, and if so is it okay for me to make this edit? Daan314 (talk) 23:48, 6 August 2024 (UTC)

I deleted the erroneous Sampling Distribution section

I found this section, labeled "Sampling Distribution" [sic: the "D" was incorrectly capitalized]. I deleted it for reasons that should be obvious to those who know the subject.

The sampling distribution can be shown by the estimator  . represented by the random sample :  The sampling distribution is equivalent to the probability distribution of the estimator S which can also be represented by the equation:

 

where Y is the number of   equal to zero and n is the number of trials. To understand why the expectation value is dependent on the probability (p0) we need to understand the distribution. For example, in the sampling distribution for each i in the random dataset X it can be considered a success when X = 0. This makes Y is equal to the success of X = 0 in n trials. With the concept of Y either being a success or not it can be thought of as a binomial distribution with constant probability  . Therefore, the sampling distribution S can be seen as the distribution   making S a discrete random variable. As a result, the expectation for the sampling distribution can be thought of as

 

proving that the property holds regardless of what the value of   is. This shows that despite   values fluctuating between samples estimators can be on target regardless of the differences.

Start with the first sentence: "The sampling distribution can be shown by the estimator  ." What does that mean? Presumably the estimator is  , and this section is about the sampling distribution of the estimator. But it says the sampling distribution "can be shown by the estimator". What??

Then "The sampling distribution is equivalent to the probability distribution of the estimator S". Indeed. The sampling distribution of the estimator is the sampling distribution of the estimator. A tautologous sentence.

Then: "which can also be represented by the equation  " What? No statistic called   has been defined, and obviously not all estimators are something divided by n.

Then: "where Y is the number of   equal to zero and n is the number of trials." So all estimators count the number of observations equal to 0 and divide by the sample size? Obviously that is false.

Then: "To understand why the expectation value is dependent on the probability (p0) we need to understand the distribution." What is this probability? Obviously the expectation of an estimator depends on its probability distribution. Why are we talking about the "expectation value" anyway? There is no reason why that should be our focus here.

Then: "With the concept of Y either being a success or not it can be thought of as a binomial distribution with constant probability  ." How often does one see a sentence that is this badly written? Does this mean Y can be thought of as a binomial distribution, or that Y has a binomial distribution?

The succeeding sentences seem to be devoted to showing that a certain statistic is an unbiased estimator of a certain probability of success. Why is that what matters in a section titled "sampling distribution"? It's not telling us anything about a sampling distribution.

The sampling distribution of the sample variance from a normally distributed population is a scaled chi-square distribution. That is an example of a sampling distribution. Does this section even tell us what a sampling distribution is? It appears that whoever wrote it knows nothing about that. Michael Hardy (talk) 21:35, 20 July 2024 (UTC)
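
:For anyone who wants to check the chi-square example numerically, here is a minimal sketch using only the Python standard library. It relies on the standard fact that, for a normal sample of size n with variance σ², the quantity (n − 1)S²/σ² has a chi-square distribution with n − 1 degrees of freedom, hence mean n − 1 and variance 2(n − 1). The function name, the sample size, σ, the number of replications and the seed are all illustrative assumptions and do not come from the article.

```python
import random
import statistics


def simulate_scaled_sample_variances(n, sigma, reps, seed=0):
    """Draw `reps` normal samples of size `n` and return (n-1)*S^2/sigma^2 for each.

    S^2 is the sample variance with the n-1 denominator. All parameter
    values used below are hypothetical choices for illustration.
    """
    rng = random.Random(seed)
    scaled = []
    for _ in range(reps):
        sample = [rng.gauss(0.0, sigma) for _ in range(n)]
        mean = sum(sample) / n
        s2 = sum((x - mean) ** 2 for x in sample) / (n - 1)
        scaled.append((n - 1) * s2 / sigma ** 2)
    return scaled


n, sigma = 5, 2.0
scaled = simulate_scaled_sample_variances(n, sigma, reps=20000)

# The empirical moments should be close to those of chi-square with n-1 df.
empirical_mean = statistics.fmean(scaled)
empirical_var = statistics.variance(scaled)
print("mean:", empirical_mean, "expected about", n - 1)
print("variance:", empirical_var, "expected about", 2 * (n - 1))
```

:The list `scaled` is a simulated draw from the sampling distribution itself, so a histogram of it could be compared with the chi-square density directly. Matching two moments is only a sanity check, not a proof.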

Estimate versus estimator

An estimate is not the same thing as an estimator: an estimate is a specific value dependent on only the dataset while an estimator is a method for estimation that is realized through random variables.

The first two boxes under "estimate" look ok. The third, "Provides 'true' value of the parameter" seems at best misleading. It estimates the value of the parameter. It does not say with certainty what that value is, but the way this is phrased could give the impression that that is what is meant.

Under "estimator", the first box looks ok. The "realization" box does not. A realization is what an estimate is, not what an estimator is. And the third box has the same problem: "Special cases"? An estimate, rather than an estimator, is a special case.

And why does every word except "of" in that last box begin with a capital letter? That's not what is in those other boxes. That is at best substandard. Wikipedia generally is fairly sparing in the use of capital letters. See WP:MOS.

Michael Hardy (talk) 21:49, 20 July 2024 (UTC)
I agree, the idea behind this figure is terrific, to visually summarize the duality between a (model) estimator and a (data-driven) estimate. However, the actual realization of the parallels between the two concepts is somewhat diffused. I wonder if a new diagram illustrating this estimate-estimator duality may be useful to construct and insert in the article (mostly to aid learners)... We can probably generate a schematic like this?
 
VodnaTopka (talk) 18:56, 1 August 2024 (UTC)
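
:Until a better diagram exists, the distinction discussed in this thread can perhaps be illustrated in code. This is a rough sketch, not proposed article content: the estimator is a rule (a function of the sample), and an estimate is the number that rule returns for one particular dataset. The sample mean is used only as a familiar example, and the names `sample_mean`, `observed`, and the simulated population parameters are hypothetical.

```python
import random
from statistics import fmean


def sample_mean(data):
    """Estimator: a rule that maps any sample to a number."""
    return fmean(data)


# Estimate: the value the rule gives for one fixed, observed dataset.
observed = [2.0, 4.0, 9.0]
estimate = sample_mean(observed)
print("estimate from observed data:", estimate)

# Applied to random samples, the same estimator yields a different estimate
# each time, which is why the estimator itself is treated as a random variable.
rng = random.Random(1)
estimates = [
    sample_mean([rng.gauss(5.0, 1.0) for _ in range(10)])
    for _ in range(3)
]
print("estimates from three random samples:", estimates)
```

:In these terms a "realization" belongs on the estimate side: `estimate` is one realized value, while `sample_mean` is the object whose sampling distribution one would study.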