Wikipedia:Reference desk/Archives/Mathematics/2024 August 13

Mathematics desk


August 13


Another probability question


Hello, this is not homework. It is a conundrum that arose out of something I was trying to work out for myself. If we have only one sample of a random variable, and no other information at all, then the best estimate of the mean of the distribution is the sample itself. It may be a rubbish estimate, but nothing more can be said. However, suppose we also know that the mean is >= 0. With this extra information, can any better estimate of the mean be achieved from a single data point? (If necessary, please define "better" in any sensible way that aids the question.) It seems to me that, if the sample is negative, replacing it with zero would always give a better estimate. However, assuming that positive samples are left alone, the average of many adjusted samples will no longer converge to the true mean, which seems undesirable. It's not obvious to me anyway that positive samples have to be left alone. Can anything better be achieved? What if we also know everything else about the distribution, even down to its exact "shape", except for the mean, about which we know only that it is >= 0? Does that extra information help at all? 2A00:23C8:7B0C:9A01:8D5A:A879:6AFC:AB6 (talk) 20:56, 13 August 2024 (UTC)
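
Both halves of the observation above (clamping never moves an estimate further from a non-negative mean, yet the average of the clamped values is biased upward) can be checked numerically. This is only a rough sketch: the question names no distribution, so a normal distribution with true mean 0.5 and standard deviation 1 is assumed purely for illustration, and the function name `compare_estimators` is hypothetical.

```python
import random

def compare_estimators(true_mean=0.5, sd=1.0, n=100_000, seed=0):
    """Compare the raw sample x with the clamped estimate max(x, 0).

    Assumption for illustration only: samples are normal(true_mean, sd)
    with true_mean >= 0. Returns the mean squared error of each estimate
    and the average of the clamped estimates.
    """
    rng = random.Random(seed)
    raw_sq_err = 0.0
    clamped_sq_err = 0.0
    clamped_sum = 0.0
    for _ in range(n):
        x = rng.gauss(true_mean, sd)
        c = max(x, 0.0)  # replace a negative sample with zero
        raw_sq_err += (x - true_mean) ** 2
        clamped_sq_err += (c - true_mean) ** 2
        clamped_sum += c
    return raw_sq_err / n, clamped_sq_err / n, clamped_sum / n

raw_mse, clamped_mse, clamped_avg = compare_estimators()
print("MSE of raw samples:     ", raw_mse)
print("MSE of clamped samples: ", clamped_mse)
print("average of clamped samples:", clamped_avg, "(true mean is 0.5)")
```

Here "better" is taken to mean smaller squared error. Whenever the true mean is non-negative and the sample is negative, zero is at least as close to the mean as the sample is, so the clamped error can never be larger; the price is that the clamped average settles above the true mean.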

Adjusting positive samples will do you little good if you don't know how to adjust them – but if nothing else is known about the distribution than μ ≥ 0, there is no information to base an adjustment on. For an extreme case, assume all samples have sample size 1. Unknown to the sampler, the distribution is discrete with possible outcomes {−9, 1}. The sample means are then equal to the single element in each sample. Adjusting them to be nonnegative amounts to left-censoring. If μ is the true mean of the distribution, the average of the adjusted (left-censored) sample means tends to (μ + 9) / 10, which is the probability of the outcome 1. For the average of the adjusted sample means to tend to the true mean, the positive sample means should be replaced by 10μ / (μ + 9). But this replacement requires already knowing both the value of μ and the outcome space – in fact, all there is to know about the distribution. --Lambiam 07:33, 14 August 2024 (UTC)
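
The arithmetic in the {−9, 1} example can be reproduced with a short sketch. The function names `exact_limits` and `simulate_censored` are hypothetical, and the value μ = 1/2 is an arbitrary choice for illustration (for this two-point distribution with μ ≥ 0, μ must lie between 0 and 1).

```python
from fractions import Fraction
import random

def exact_limits(mu):
    """Exact long-run averages for the two-point distribution on {-9, 1}.

    mu is the true mean; it must satisfy -9 <= mu <= 1 so that p is a
    valid probability.
    """
    mu = Fraction(mu)
    p = (mu + 9) / 10                 # P(X = 1), from mu = 1*p + (-9)*(1 - p)
    censored = p * 1 + (1 - p) * 0    # limit when each -9 is replaced by 0
    replacement = 10 * mu / (mu + 9)  # value to substitute for each +1
    corrected = p * replacement       # limit after that substitution
    return p, censored, replacement, corrected

def simulate_censored(mu, n=200_000, seed=1):
    """Monte Carlo average of left-censored single-element samples."""
    rng = random.Random(seed)
    p = (mu + 9) / 10
    total = 0
    for _ in range(n):
        x = 1 if rng.random() < p else -9
        total += max(x, 0)            # left-censor at zero
    return total / n

p, censored, replacement, corrected = exact_limits(Fraction(1, 2))
print(p, censored, replacement, corrected)  # 19/20 19/20 10/19 1/2
print(simulate_censored(0.5))               # close to 0.95, not 0.5
```

With μ = 1/2 the censored average tends to 19/20, and only the substitution 10/19, which depends on μ itself, brings the limit back to 1/2.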