Estimate the number of elements by random sampling with replacement
Setup:
$N$ numbered balls are in a bag. $N$ is unknown.
Pick a ball uniformly at random, record its number, replace it, shuffle.
After $M$ samples, of which we noticed $R$ repeated numbers, how can we
estimate the value of $N$?
Clearly, after we've covered the set, only repeats will appear. However,
there is a vanishingly small probability we've just missed one.
Simplification:
Assume we know $N<n^\star$, which should make it easy to compute
$P(N=n|R=r,M=m)$ for all $n<n^\star$.
Possibly related:
probability distribution of coverage of a set after X independently,
randomly selected members of the set. Though $N$ is known and coverage is
estimated? Here we'd like to estimate $N$.
How many random samples needed to pick all elements of set? Not the same
problem, however.
The exact probability of observing x unique elements after sampling with
replacement from a set with N elements, A sub-problem? Could we compute
the probability $N=n$ by using an answer to this question.
Application:
Estimating the number of webcomics when I'm pressing "random," assuming
random is uniform over the comics.
No comments:
Post a Comment