Talk:Box–Cox distribution

It says m is the location parameter. The Greek letter μ also appears. It doesn't say what μ is. Should m and μ both be the same letter? Michael Hardy (talk) 17:46, 31 August 2008 (UTC)

Φ

The formula for the pdf contains a three-argument Φ term, and the text below says "Φ is the cumulative distribution function of the standard normal distribution". But the cdf of the standard normal takes one argument, not three. A quick look at the reference suggests the single argument should be a function of m, s and f. Qwfp (talk) 17:40, 25 May 2010 (UTC)

Research Psychologist (talk) 18:41, 21 September 2022 (UTC) Does the Box-Cox math (as shown in this Wikipedia article) work? A colleague of mine, a PhD physicist, offered the following critique of Wikipedia’s article on the Box-Cox distribution:

Wikipedia defines the Box-Cox distribution as “the distribution of a random variable X for which the Box–Cox transformation on X follows a truncated normal distribution.” The truncated-normal variate is defined as Y, which is the Box-Cox transformation of X: Y = (X^a – 1)/a if a is not equal to zero, else Y = ln(X). I denote the probability density functions (pdfs) of X and Y as fx(X) and fy(Y), and I use symbol a instead of Wikipedia’s notation lambda.

This is a textbook-level problem. Upon solving it, I find five errors in the Wikipedia equation that ostensibly represents fx(X). The derivation goes as follows:

After equating the CDFs Fy(Y(X)) = Fx(X), differentiating both sides with respect to X gives

fx(X) = fy(Y) |dY/dX|. (1)

Substituting Y(X) into Eq. 1 gives an expression for fx(X):

fx(X) = fy(Y(X)) |X^(a-1)|. (2)

Recall that Y(X) = (X^a – 1)/a if a is nonzero, else Y(X) = ln(X). Also, Y is undefined when X < 0, and hence Y must also be non-negative by dint of the Box-Cox transformation. The non-negativity enforced on Y forces fy(Y) to be NOT a Gaussian as one would have hoped, but a truncated Gaussian with truncation at Y = 0 and keeping only the part of the Y domain that is either greater than or less than zero. With this information, one can write Eq. 2 (with substitution from Eq. 3) understanding that Y is an implicit function of X. Also, in this case we can drop the absolute-value sign, so

fx(X) = fy(Y) X^(a-1). (3)
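As a numerical sanity check on the Jacobian factor in Eqs. 2 and 3, here is a minimal Python sketch. The function names (box_cox, jacobian, numeric_derivative) are illustrative and do not come from the article; the check assumes X > 0, where X^(a-1) is positive and the absolute-value sign is redundant.

```python
import math

def box_cox(x, a):
    """Box-Cox transform Y(X) as stated in the comment; defined here for x > 0."""
    if x <= 0:
        raise ValueError("x must be positive")
    return math.log(x) if a == 0 else (x**a - 1.0) / a

def jacobian(x, a):
    """|dY/dX| = |X^(a-1)|; for a == 0 this reduces to 1/X."""
    return abs(x ** (a - 1.0))

def numeric_derivative(x, a, h=1e-6):
    """Central-difference estimate of dY/dX, used only as a cross-check."""
    return (box_cox(x + h, a) - box_cox(x - h, a)) / (2.0 * h)

# Compare the analytic factor with the finite-difference estimate.
for a in (-1.0, 0.0, 0.5, 2.0):
    for x in (0.5, 1.0, 3.0):
        assert abs(numeric_derivative(x, a) - jacobian(x, a)) < 1e-6
```

The a = 0 case needs no special handling in jacobian, since d ln(X)/dX = 1/X = X^(0-1).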

In Eq. 3, we can substitute for fy(Y) the following:

fy(Y) = (1/K) fyo(Y) = (1/K) [1/sqrt(2 pi s^2)] exp[-(Y – m)^2/(2s^2)]. (4)

Here, fyo is the pdf of the Gaussian that has not yet been truncated, m is its mean, s is its standard deviation, and K is a constant that normalizes the post-truncated Gaussian so its integral is 1 over the truncated variate Y.

In Eq. 4, the constant K has two possible values depending on whether we select the negative or positive half-line of the variate Y. (We consider only these two options.) Denote by Phi(z) the standard normal CDF in z, in which case Phi((Y – m)/s) is the CDF of the un-truncated Gaussian fyo(Y) with mean m and standard deviation s. Then Phi(-m/s) is the CDF of fyo at the cut point 0, which includes all of fyo(Y) to the left of 0. Accordingly, 1 – Phi(-m/s) is the CDF of the complementary distribution that includes all of fyo(Y) to the right of 0. Whichever decision is made, we make the truncation at 0. Note that we have freedom to choose the values of m and s. Accordingly, the choice to use the left side of zero receives the normalization K = Phi(-m/s), and the choice to use the right side of zero receives the normalization K = [1 – Phi(-m/s)]. This choice can be effected by the sign of a parameter f and the following expression for K:

K = 1 – I(f<0) – sgn(f) Phi(-m/s), (5)

where I is the indicator function (in this case evaluating to 1 when f < 0, else 0) and sgn(f) is the algebraic sign of f if f is nonzero, else sgn(f) = 0.
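Eq. 5 can be sketched in a few lines of Python. This is only an illustration under the definitions given above: the helper names (phi, sgn, normalizer) are made up for this sketch, and the standard normal CDF is taken from the standard library's statistics.NormalDist.

```python
from statistics import NormalDist

def phi(z):
    """Standard normal CDF Phi(z), a function of a single argument."""
    return NormalDist().cdf(z)

def sgn(f):
    """Algebraic sign of f: +1, -1, or 0 when f == 0."""
    return (f > 0) - (f < 0)

def normalizer(m, s, f):
    """K from Eq. 5: 1 - I(f<0) - sgn(f) * Phi(-m/s)."""
    indicator = 1.0 if f < 0 else 0.0
    return 1.0 - indicator - sgn(f) * phi(-m / s)

# f > 0 keeps the mass to the right of 0, f < 0 keeps the mass to the left,
# so the two normalizers should add up to 1 for any m and s.
k_right = normalizer(1.0, 2.0, 1.0)
k_left = normalizer(1.0, 2.0, -1.0)
assert abs(k_right + k_left - 1.0) < 1e-12
# The left-side normalizer is the general normal CDF evaluated at the cut point 0.
assert abs(k_left - NormalDist(1.0, 2.0).cdf(0.0)) < 1e-12
```

With f = 0 the expression evaluates to K = 1, which corresponds to no truncation at all.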

Our final expression for fx(X) is Eq. 3 with substitution of fy(Y) from Eq. 4 and substitution of K into Eq. 4 from Eq. 5. It differs in five ways from the Wikipedia equation:

1. Wikipedia omits the factor |dY/dX|, but should include it.

2. Where one expects Y in the exponential, Wikipedia takes (Y ^ f)/f, where f is called the “family parameter.” I think this is a mistake.

3. The argument sqrt(s) in the CDF Phi should be plain s. By writing the “dispersion factor” s with a square root over it, as Wikipedia does, it gives the impression that s is a variance and not a standard deviation as it plainly is elsewhere.

4. Where the Wikipedia article introduces the indicator function, there is a hyperlink to the correct definition but the subset of text available to the floating cursor (prior to actual click) renames this term as “characteristic function.” The Wikipedia article on Characteristic function shows abundant definitions of the term, including the topic-relevant definition as the Fourier transform of a pdf. To avoid confusion, I recommend that the words “characteristic function” not be included in the floating text excerpted in the present article.

5. As used in the article, the standard normal CDF is not Phi, but Phi(z, 0, 1); with general m and s, Phi(Y, m, s) is the general normal CDF in Y. A preferable notation might cite Phi(z) as the standard normal CDF in z, and hence the single-argument function Phi((Y-m)/s) is the general normal CDF in Y.
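The final expression from the derivation (Eq. 3 with Eqs. 4 and 5 substituted) can be assembled and checked numerically. The Python sketch below is illustrative only: the function names and the parameter values (a = 0.5, m = 1, s = 1, f = 1) are assumptions chosen for the check, not values from the article. It takes the truncation at Y = 0 at face value, which corresponds to X = 1, and tests only the case a > 0 and f > 0, where X > 1 maps onto the whole positive half-line of Y.

```python
import math

def std_normal_cdf(z):
    """Standard normal CDF Phi(z) via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def box_cox(x, a):
    """Box-Cox transform Y(X) for x > 0."""
    return math.log(x) if a == 0 else (x**a - 1.0) / a

def normalizer(m, s, f):
    """K from Eq. 5."""
    indicator = 1.0 if f < 0 else 0.0
    sign = (f > 0) - (f < 0)
    return 1.0 - indicator - sign * std_normal_cdf(-m / s)

def fx(x, a, m, s, f):
    """Density of X per Eqs. 3-5; zero outside the half-line selected by f."""
    if x <= 0:
        return 0.0
    y = box_cox(x, a)
    if (f > 0 and y < 0) or (f < 0 and y > 0):
        return 0.0
    gauss = math.exp(-((y - m) ** 2) / (2.0 * s * s)) / math.sqrt(2.0 * math.pi * s * s)
    return gauss / normalizer(m, s, f) * x ** (a - 1.0)

def integrate_midpoint(func, lo, hi, n=100_000):
    """Plain midpoint rule; adequate for this smooth integrand."""
    h = (hi - lo) / n
    return h * sum(func(lo + (i + 0.5) * h) for i in range(n))

# The upper limit 60 is far enough out that the neglected tail is negligible here.
total = integrate_midpoint(lambda x: fx(x, a=0.5, m=1.0, s=1.0, f=1.0), 1.0, 60.0)
assert abs(total - 1.0) < 1e-3
```

Dropping the factor x ** (a - 1.0) from fx makes the same integral deviate from 1, which is the practical content of point 1 in the list.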

I also suggest a bit more specificity in the terminology. Given the specificity of the distribution fyo, there is no reason to call m a location parameter and not a mean; and no reason to call s a dispersion and not a standard deviation.

Still to be resolved: What can a truncated Gaussian offer in conceptual simplification of the problem of non-normal statistics? A truncated Gaussian is already VERY non-normal!

Dr. Michael H. Brill, (609) 375-6368, mhbrill2001@gmail.com

The Wikipedia editorial interlocutor for Dr. Brill is Research Psychologist (talk) 16:24, 9 July 2022 (UTC)

Separately from Dr. Brill's comments, it seems to me that this article and https://en.wikipedia.org/wiki/Power_transform ought to link to each other. Research Psychologist (talk) 18:41, 21 September 2022 (UTC)