Why consider sampling without replacement at all in a practical application?

by Raffael   Last Updated July 17, 2017 09:19 AM

Sampling with replacement has two advantages over sampling without replacement as I see it:

1) You don't need to worry about the finite population correction.

2) There is a chance that elements from the population are drawn multiple times - then you can recycle the measurements and save time.

Of course, from an academic POV one has to investigate both methods. But from a practical POV I don't see why one would consider sampling without replacement, given the advantages of sampling with replacement.

But I am a beginner in statistics so there might be plenty of good reasons why without replacement might be the superior choice - at least for specific use cases. Please, unconfuse me!

Tags : sampling

Answers 3

Expanding on the answer of @Scortchi . . .

Suppose the population has 5 members and you have the budget to sample 5 individuals. You are interested in the population mean of a variable X, a characteristic of individuals in this population. You could do it your way, and randomly sample with replacement. The variance of the sample mean will be V(X)/5, where V(X) is the population variance of X.

On the other hand, suppose you sample the five individuals without replacement. Then, the variance of the sample mean is 0. You've sampled the whole population, each individual exactly once, so there is no distinction between "sample mean" and "population mean." They are the same thing.
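The two cases above can be checked exactly by enumerating every equally likely sample. This is only a sketch: the population values 1 to 5 are made up for illustration, and `variance_of_sample_mean` is a hypothetical helper, not something from the answer.

```python
from fractions import Fraction
from itertools import permutations, product


def variance_of_sample_mean(samples):
    """Exact variance of the sample mean over a list of equally likely samples."""
    means = [Fraction(sum(s), len(s)) for s in samples]
    grand_mean = sum(means) / len(means)
    return sum((m - grand_mean) ** 2 for m in means) / len(means)


population = [1, 2, 3, 4, 5]  # assumed values of X; population variance V(X) = 2
n = 5

# With replacement: all 5**5 ordered draws are equally likely.
with_repl = variance_of_sample_mean(list(product(population, repeat=n)))

# Without replacement: all 5! orderings of the whole population are equally likely.
without_repl = variance_of_sample_mean(list(permutations(population, n)))

print(with_repl)     # equals V(X)/5
print(without_repl)  # every sample is the whole population, so this is 0
```

Exact fractions are used instead of simulation so the comparison has no Monte Carlo noise.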

In the real world, you should jump for joy each time you have to do the finite population correction because (drumroll . . .) it makes the variance of your estimator go down without you having to collect more data. Almost nothing does this. It's like magic: good magic.

Saying the exact same thing in math (pay attention to the <, and assume the sample size n is greater than 1):

\begin{equation}
\textrm{finite population correction} = \frac{N-n}{N-1} < \frac{N-1}{N-1} = 1
\end{equation}

Correction < 1 means that applying the correction makes the variance go DOWN, 'cause you apply the correction by multiplying the variance by it. Variance DOWN == good.
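To put numbers on this, here is a small sketch of the correction factor and the resulting variance of the sample mean under sampling without replacement. The function names and the example sizes are assumptions for illustration. `pop_var` is taken to be the population variance computed with divisor N, which is the convention that goes with the (N-n)/(N-1) form of the correction.

```python
from fractions import Fraction


def fpc(N, n):
    """Finite population correction (N - n) / (N - 1) for sample size n out of N."""
    return Fraction(N - n, N - 1)


def var_mean_without_replacement(pop_var, N, n):
    """Variance of the sample mean without replacement: (pop_var / n) * fpc."""
    return Fraction(pop_var) / n * fpc(N, n)


N = 100  # hypothetical population size
for n in (2, 10, 50, 100):
    # The factor shrinks from just under 1 towards 0 as n approaches N.
    print(n, float(fpc(N, n)))
```

With n = 1 the factor is exactly 1, so nothing changes, and with n = N it is 0, which matches the census case described above.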

Moving in the opposite direction, entirely away from math, think about what you are asking. If you want to learn about the population and you can sample 5 people from it, does it seem likely that you will learn more by taking the chance of sampling the same guy 5 times or does it seem more likely that you will learn more by ensuring that you sample 5 different guys?

The real-world case is almost the opposite of what you are saying. You almost never sample with replacement; it's only when you are doing special things like bootstrapping. In that case, you are actually trying to screw up the estimator and give it a "too big" variance.

September 11, 2013 16:26 PM

The precision of estimates is usually higher for sampling without replacement than for sampling with replacement.

For example, in an extreme case, sampling with replacement can select the same single element all $n$ times. That could lead to a very imprecise estimate of the population parameter of interest. Such a situation is not possible under sampling without replacement, so the variance is usually lower for estimates made from sampling without replacement.
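How likely that extreme case is, and how many distinct elements a with-replacement sample contains on average, can be worked out directly. This is a sketch assuming simple random sampling with equal probabilities; the function names are hypothetical.

```python
from fractions import Fraction


def prob_single_element_only(N, n):
    """P(all n with-replacement draws hit the same population element)."""
    # The first draw is free; each of the remaining n - 1 draws must match it.
    return Fraction(1, N) ** (n - 1)


def expected_distinct(N, n):
    """Expected number of distinct elements among n with-replacement draws."""
    # Each element is missed by all n draws with probability ((N - 1) / N) ** n.
    return N * (1 - Fraction(N - 1, N) ** n)


N, n = 5, 5  # assumed sizes, matching a population of 5 and a sample of 5
print(prob_single_element_only(N, n))  # 1/625
print(float(expected_distinct(N, n)))  # about 3.36 distinct elements out of 5 draws
```

Without replacement, the same n draws always contain exactly n distinct elements.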

September 11, 2013 16:28 PM

I have a result that lets you treat the without-replacement case practically like the with-replacement case and removes the difficulties. Note that with-replacement calculations are much easier. So, if a probability in the with-replacement case involves $p$ and $q$, the probabilities of success and failure, the corresponding probability in the without-replacement case is obtained simply by replacing $p^a q^b$ with $\binom{N-a-b}{R-a} / \binom{N}{R}$ for any $a$ and $b$, where $N$ and $R$ are the total number of balls and the number of white balls. Remember that $p$ is treated as $R/N$.
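One way to sanity-check this substitution is to compare it with the direct draw-by-draw product for one fixed sequence of a white and b non-white balls. This is a sketch under one reading of the result: the binomial coefficient is taken to be normalised by the number of ways to choose R out of N, and both function names are hypothetical.

```python
from fractions import Fraction
from math import comb


def seq_prob_direct(N, R, a, b):
    """Probability of one fixed order of a whites then b non-whites, no replacement."""
    prob = Fraction(1)
    whites, others, total = R, N - R, N
    for _ in range(a):
        prob *= Fraction(whites, total)
        whites -= 1
        total -= 1
    for _ in range(b):
        prob *= Fraction(others, total)
        others -= 1
        total -= 1
    return prob


def seq_prob_binomial_form(N, R, a, b):
    """Same probability written as C(N - a - b, R - a) / C(N, R)."""
    return Fraction(comb(N - a - b, R - a), comb(N, R))


# Assumed example: 10 balls, 4 white; a sequence with 2 whites and 3 non-whites.
print(seq_prob_direct(10, 4, 2, 3))
print(seq_prob_binomial_form(10, 4, 2, 3))
```

The two functions agree whenever a is at most R and b is at most N - R. The order of whites and non-whites within the sequence does not affect the probability.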


Krish Balasubramanian
July 17, 2017 08:45 AM
