Talk:Importance sampling

On January 21st, 2006, this article was expanded by Izar (199.111.224.202; I was logged out while editing). I am a 4th-year undergraduate student majoring in electrical engineering, and my concentration is communications and signal processing. As an undergraduate researcher, I am interested in information theory and coding theory.

Before I expanded this article, it was a mathematics stub, so I added some sections. However, it can still be expanded in many respects. For example, there are many other biasing methods, such as 'exponential twisting', which could be explained briefly or in some detail. Also, some applications of the importance sampling technique could be discussed in a separate section. Izar 05:06, 21 January 2006 (UTC)


Introduction

I don't see why importance sampling is a variance reduction technique in MC estimation. It is a technique for estimating E[X|A] when all you have is E[X|B]. If I remember correctly, it is the 'weighted importance sampling' techniques that have reduced variance compared to the standard 'importance sampling' technique, at the expense of becoming biased. --Olethros 16:15, 9 February 2006 (UTC)
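
For concreteness, here is a minimal Matlab/Octave sketch contrasting the two estimators mentioned above (the toy target p = N(0,1), proposal q = N(1,1), and quantity of interest h(x) = x are assumptions for illustration, not taken from the article):

 % Standard vs. self-normalized ('weighted') importance sampling.
 % Assumed toy setup: target p = N(0,1), proposal q = N(1,1),
 % estimating E_p[h(X)] with h(x) = x (true value 0).
 N = 100000;
 x = randn(N,1) + 1;                  % samples from the proposal q = N(1,1)
 p = exp(-x.^2/2)/sqrt(2*pi);         % target density p(x)
 q = exp(-(x-1).^2/2)/sqrt(2*pi);     % proposal density q(x)
 w = p./q;                            % importance weights
 h = x;                               % quantity of interest
 est_is  = mean(h.*w);                % standard IS: unbiased
 est_wis = sum(h.*w)/sum(w);          % weighted (self-normalized) IS: biased, often lower variance
 fprintf('standard IS: %g, weighted IS: %g\n', est_is, est_wis);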

Mathematical approach

Why is this talking about the binomial distribution and the event (X > t)? Just treating the expectation of a general random variable would have been much simpler and much more general. The (X > t) treatment is confusing and overlong. --Olethros 16:15, 9 February 2006 (UTC)

I totally agree! The edits by 199.111.224.202 [1] made the article very difficult to understand. Someone should really perform some cleanup here. --Fredrik Orderud 17:51, 9 February 2006 (UTC)


What's q?

This has been unfixed for more than a year, so I have simply commented out the offending paragraph which mentions q. Better that it not be there than that it be gibberish.

--Rpgoldman (talk) 19:56, 2 April 2010 (UTC)

Rewrite

In my opinion the article here is fundamentally flawed. If X is the outcome of some experiment, I don't have f(X) or F(x), so I can't possibly calculate W(x); I only have f(x) for the input parameters of the simulation. So a distinction has to be made between the distributions of the input parameters and the distribution of the simulation result. Furthermore, as was already pointed out before, the restriction to the event (X > t) is unnecessary; that there is some binomial distribution for this event is true, but completely off topic. What I would propose for a rewrite is along the following lines:

Let <math>y = f(x)</math> be some simulation depending on some input parameters <math>x</math>, where <math>x</math> is itself a random variable with some known (and preferably analytical) distribution <math>p(x)</math>. The problem is to determine an estimate of the expected value of a function of the solution, e.g. <math>E[\phi(y)]</math>, where <math>\phi(y)</math> could be something like <math>[y \ge t]</math> (which would correspond to the current version of the article, i.e. <math>P(y \ge t)</math>). Now we have

<math>E[\phi(y)] = E[\phi(f(x))] = \int \phi(f(x))\, p(x)\, dx \approx \frac{1}{N} \sum_{i=1}^N \phi(f(x_i)),</math>

where the <math>x_i</math> are i.i.d. according to <math>p(x)</math>.

Now we rewrite this as

<math>E[\phi(y)] = \int \phi(f(x))\, \frac{p(x)}{q(x)}\, q(x)\, dx \approx \frac{1}{N} \sum_{i=1}^N \phi(f(x_i))\, w(x_i),</math>

with <math>w(x) = p(x)/q(x)</math> and the <math>x_i</math> i.i.d. according to <math>q(x)</math>.

Example: Let <math>y = f(x) = x^2</math> be the simulation outcome (OK, a bit simple for a simulation, but it's just an example, right?), let the distribution of input values be uniform on <math>[0,1]</math> (i.e. <math>p(x) = 1</math> for <math>x \in [0,1]</math>), and let the function of interest be <math>\phi(y) = [y \ge \alpha]</math>. Then the normal estimator for <math>E[\phi] = P(x^2 \ge \alpha) = 1 - \sqrt{\alpha}</math> would be

<math>\hat{\phi}_1 = \frac{1}{N} \sum_{i=1}^N [u_i^2 \ge \alpha],</math>

where the <math>u_i</math> are <math>U(0,1)</math> distributed and <math>[b]</math> is 1 if b is true and 0 otherwise (see Knuth: A short note on notation). Taking for <math>q(x)</math> a normal distribution <math>N(\mu, \sigma^2)</math> (with <math>\mu = (1+\sqrt{\alpha})/2</math> and <math>\sigma = 0.3\sqrt{1-\sqrt{\alpha}}</math> giving quite good results) we get

<math>\hat{\phi}_2 = \frac{1}{N} \sum_{i=1}^N [n_i^2 \ge \alpha]\, [0 \le n_i \le 1]\, \sqrt{2\pi}\,\sigma\, e^{(n_i-\mu)^2 / (2\sigma^2)},</math>

and the <math>n_i</math> are <math>N(\mu, \sigma^2)</math> distributed.

Some Matlab code illustrating the example (it should run with Octave too):

 % Compare the plain Monte Carlo estimator phi1 with the importance
 % sampling estimator phi2 over m independent runs.
 N=10000;                      % samples per run
 m=100;                        % number of independent runs
 alpha=0.7;                    % threshold: estimating P(x^2 >= alpha) = 1 - sqrt(alpha)
 phi1=[];
 phi2=[];
 for i=1:m
     % Plain MC: u_i ~ U(0,1), i.e. sampling from p(x) = 1 on [0,1].
     u_i=rand(N,1);
     phi1=[phi1 1/N*sum(u_i.^2>=alpha)];
 
     % IS: n_i ~ N(mu, sig^2); weight w = p/q = 1/q(n_i) on [0,1], 0 outside.
     mu=(1+sqrt(alpha))/2;
     sig=0.3*sqrt((1-sqrt(alpha)));
     n_i=sig*randn(N,1)+mu;
     phi2=[phi2 1/N*sum( (n_i.^2>=alpha) .* (0<=n_i) .* (n_i<=1) .* (sqrt(2*pi)*sig*exp( (((n_i-mu)/sig).^2)/2) ) )];
 end
 fprintf( '\n\n\n------------\n' );
 fprintf( '%10.6f\n', 1-sqrt(alpha) );              % exact value
 fprintf( '%10.6f+-%g\n', mean(phi1), std(phi1) );  % plain MC: mean +- std over runs
 fprintf( '%10.6f+-%g\n', mean(phi2), std(phi2) );  % IS: smaller std for the same N

Prints:

 0.163340
 0.163480+-0.00397507
 0.163354+-0.0015864

134.169.77.186 14:46, 8 March 2007 (UTC) (ezander)

OK, I have re-written it, basically adding a short introduction plus the basic theory on top. The material I added was verified against the IS lecture notes which I linked to, and the book 'Monte Carlo Methods in Practice'.

The rest of the material that was there before can be thought of as an application, so I left it in; in particular, it was an application to simulation. I guess the binomial thing made sense in the context of an 'event simulator', but it does not really help anywhere else. Someone more knowledgeable about the application of MC and IS methods to simulation should try to fix those sections, or perhaps they could be moved to another article.

I currently don't have time to update the probabilistic inference section, but all that would be required is some links to other sections. I might do that eventually if nobody else does. --Olethros 17:37, 21 May 2007 (UTC)

Why unweighted average for importance sampling estimator?

edit

In the "Mathematical approach" section, why is an straight arithmetic average used for  ? This is supposed to be an estimation of an expected value, correct? Normally, wouldn't this be, for instance:

 

So I would expect the estimator to be:

 

B.Mearns*, KSC 19:10, 5 December 2012 (UTC)

So this makes me think that <math>P</math> is a uniform distribution, but that's not the case, is it?

Ok, so this is just Monte Carlo estimation: the expected value of <math>1(X \ge t)</math> is equal to <math>p_t</math>, so we estimate that expected value simply by averaging lots of values of the random variable (the random variable being <math>1(X \ge t)</math>). Since we're drawing values <math>x_i</math> according to <math>P</math>, values that are more likely under <math>P</math> show up more often, and so they naturally end up more heavily weighted in the average.
In other words, the expected value is really defined as:
<math>E[X] = \sum_{x} x\, P(x)</math>
But instead of using the probability of each item in X, we're using the sampled frequency of each item in X, which, for a sufficient number of samples, closely approximates the probability:
<math>E[X] \approx \sum_{x} x\, \frac{N_x}{N}</math>
where <math>N_x</math> is the number of times the specific value <math>x</math> shows up in the sampled values, which we expect to be proportional to the probability, <math>P(x) \approx N_x/N</math>, for a large number of samples N.
B.Mearns*, KSC 19:10, 5 December 2012 (UTC)
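
For what it's worth, a quick Matlab/Octave illustration of the frequency-weighting point (the discrete toy distribution below is an assumption for illustration, not from the article): averaging raw samples is exactly the same as weighting each distinct value by its sampled frequency <math>N_x/N</math>, which in turn approximates <math>P(x)</math>:

 % Assumed toy distribution: X takes values 1,2,3 with P = [0.5 0.3 0.2].
 vals = [1 2 3];
 P = [0.5 0.3 0.2];
 N = 100000;
 u = rand(N,1);                                % draw N samples via inverse CDF
 edges = cumsum(P);
 x = vals(1)*(u<=edges(1)) + vals(2)*(u>edges(1) & u<=edges(2)) + vals(3)*(u>edges(2));
 exact  = sum(vals.*P);                        % E[X] = sum_x x P(x)
 plain  = mean(x);                             % straight arithmetic average
 freq   = [sum(x==1) sum(x==2) sum(x==3)]/N;   % sampled frequencies N_x/N
 byfreq = sum(vals.*freq);                     % sum_x x (N_x/N), identical to plain
 fprintf('exact %g, sample mean %g, frequency-weighted %g\n', exact, plain, byfreq);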

Basic theory section

I think the symbols need to be better defined. For example, when L is introduced, L(omega) is not defined at all. What is omega? I found the article difficult to read beyond this point. I think this section generally needs better clarification of the notation and better explanation of the concepts. Srfahmy1 (talk) 18:27, 23 January 2015 (UTC)

In the first section (Basic Theory)

In:

If we have random samples <math>x_1, \ldots, x_n</math>, generated according to P, then an empirical estimate of E[X;P] is
<math>\hat{\mathbf{E}}_{n}[X;P] = \frac{1}{n} \sum_{i=1}^n x_i\, p(x_i)</math>

I think this is wrong. If <math>x_i</math> is generated according to P, then I think the correct formula would be: <math>\hat{\mathbf{E}}_{n}[X;P] = \frac{1}{n} \sum_{i=1}^n x_i</math>

189.103.24.199 (talk) 15:38, 21 November 2015 (UTC)
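
As a quick sanity check of this correction in Matlab/Octave (the exponential toy distribution is an assumption for illustration): with P = Exp(1) we have E[X;P] = 1, and the plain average recovers it, while the extra p(x_i) factor instead gives E[X p(X)] = E[X e^{-X}] = 1/4:

 % Assumed toy setup: P = Exp(1), density p(x) = exp(-x), so E[X;P] = 1.
 n = 100000;
 x = -log(rand(n,1));             % samples from Exp(1) via inverse CDF
 correct = mean(x);               % (1/n) sum x_i          -> approx 1
 wrong   = mean(x.*exp(-x));      % (1/n) sum x_i p(x_i)   -> approx 1/4
 fprintf('plain average: %g, with extra p(x_i) factor: %g\n', correct, wrong);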