Peeking into the abyss

Here is a weird question: how “zero” can samples from a standard Gaussian actually get? Sure, the mean is zero. But that’s just a statistic. How much mass actually sits near zero? Gaussian-weighted components are abundant in inference; can they ever truly “switch off”?

Let $x_1, \dots, x_n \sim \mathcal{N}(0,1)$ be independent samples and consider the smallest magnitude among them:

$$m_n = \min_{i \le n} |x_i|.$$

Near zero the Gaussian density is essentially flat,

$$p(x) \approx \frac{1}{\sqrt{2\pi}},$$

so the mass in a small interval around zero scales linearly with its width:

$$P(|x| < \varepsilon) \approx \sqrt{\frac{2}{\pi}}\,\varepsilon.$$

Invert this heuristic and you obtain a clean rule:

After $n$ samples, the smallest magnitude you typically see is on the order of $1/n$.

More precisely,

$$\mathbb{E}[m_n] \approx \sqrt{\frac{\pi}{2}}\,\frac{1}{n}.$$
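The constant comes from a short calculation, sketched here under the assumption that the density is exactly flat on the scale of $m_n$. The minimum exceeds $t$ only if all $n$ independent samples do, so

$$P(m_n > t) = \bigl(1 - P(|x| < t)\bigr)^n \approx \exp\!\left(-n\sqrt{\tfrac{2}{\pi}}\,t\right),$$

and integrating the tail probability gives the mean:

$$\mathbb{E}[m_n] = \int_0^\infty P(m_n > t)\,dt \approx \frac{1}{n}\sqrt{\frac{\pi}{2}}.$$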

So:

| Samples from $\mathcal{N}(0,1)$ | Expected minimum $\mathbb{E}[m_n]$ |
|---|---|
| $10^3$ | $\sim 10^{-3}$ |
| $10^6$ | $\sim 10^{-6}$ |

Every additional decade of samples gets you one order of magnitude closer to zero, toward what I call “the abyss floor.”
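The $1/n$ rule is easy to check numerically. Below is a minimal Monte Carlo sketch using only the Python standard library; the function names, the seed, and the trial count are illustrative choices, not part of the derivation.

```python
import math
import random


def mean_min_abs_gaussian(n, trials, rng):
    """Monte Carlo estimate of E[min_i |x_i|] for n standard normal draws."""
    total = 0.0
    for _ in range(trials):
        total += min(abs(rng.gauss(0.0, 1.0)) for _ in range(n))
    return total / trials


def predicted_min(n):
    """Flat-density approximation sqrt(pi/2) / n from the text."""
    return math.sqrt(math.pi / 2) / n


rng = random.Random(0)  # fixed seed, chosen arbitrarily for reproducibility
results = {}
for n in (100, 1000):
    results[n] = mean_min_abs_gaussian(n, trials=1000, rng=rng)
    print(f"n={n:5d}  simulated={results[n]:.3e}  predicted={predicted_min(n):.3e}")
```

With 1000 trials per setting, the simulated means should land within a few percent of the prediction, since a single minimum fluctuates by roughly its own mean.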

Looking at the abyss on a log scale

A convenient way to visualize this is to define

$$y = \log_{10} |x|.$$

Now each unit step left corresponds to another factor of ten toward zero.

The density becomes

$$p(y) \propto 10^y,$$

for very negative $y$. Each additional decade toward zero holds ten times less probability mass.
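This decade-by-decade loss can be checked exactly, without sampling, because $P(|x| < \varepsilon) = \operatorname{erf}(\varepsilon/\sqrt{2})$ for a standard Gaussian. A minimal sketch (the helper names are illustrative):

```python
import math


def mass_below(eps):
    """Exact P(|x| < eps) for x ~ N(0, 1), via the error function."""
    return math.erf(eps / math.sqrt(2))


def decade_mass(k):
    """Probability that log10|x| falls in the decade [-(k + 1), -k)."""
    return mass_below(10.0 ** -k) - mass_below(10.0 ** -(k + 1))


decades = [decade_mass(k) for k in range(1, 7)]
# Ratio of each decade's mass to the next one closer to zero.
ratios = [decades[i] / decades[i + 1] for i in range(len(decades) - 1)]

for k, mass in enumerate(decades, start=1):
    print(f"y in [-{k + 1}, -{k}): mass = {mass:.3e}")
print("ratios between neighbouring decades:", [round(r, 3) for r in ratios])
```

The ratios sit just below ten for the first decade and approach ten as $y$ becomes more negative, where the flat-density approximation becomes essentially exact.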

Surely a sparse prior fixes this?

A common intuition is that Gaussian priors are “dense,” while Laplace priors encourage sparsity. So perhaps a Laplace distribution explores the abyss more eagerly.

Take the maximum entropy distribution with fixed mean $\mathbb{E}[x]=0$ and mean absolute deviation $\mathbb{E}|x|=1$, which is the unit Laplace distribution:

$$p(x) = \frac12 e^{-|x|}.$$

Let $r = |x|$. Then

$$r \sim \mathrm{Exp}(1).$$

The minimum of $n$ independent $\mathrm{Exp}(1)$ variables is again exponential, now with rate $n$:

$$m_n \sim \mathrm{Exp}(n),$$

and therefore

$$\mathbb{E}[m_n] = \frac{1}{n}.$$

Same scaling as for the Gaussian above!
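The Laplace case can be simulated the same way. The sketch below draws the magnitudes directly as $\mathrm{Exp}(1)$ variables, relying on the identity $|x| \sim \mathrm{Exp}(1)$ stated above; the function name, seed, and trial count are again illustrative.

```python
import random


def mean_min_abs_laplace(n, trials, rng):
    """Monte Carlo estimate of E[min_i |x_i|] for n unit Laplace draws.

    Since |x| ~ Exp(1) for the unit Laplace, magnitudes are sampled directly.
    """
    total = 0.0
    for _ in range(trials):
        total += min(rng.expovariate(1.0) for _ in range(n))
    return total / trials


rng = random.Random(1)  # fixed seed, chosen arbitrarily for reproducibility
laplace_results = {}
for n in (100, 1000):
    laplace_results[n] = mean_min_abs_laplace(n, trials=1000, rng=rng)
    print(f"n={n:5d}  simulated={laplace_results[n]:.3e}  predicted={1.0 / n:.3e}")
```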

Near zero, Gaussian and Laplace behave the same way, differing only by a constant factor.

Both satisfy

$$P(|x| < \varepsilon) \propto \varepsilon.$$

So neither prior truly “dives into zero”.
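To put a number on the difference, the two masses can be compared exactly for small $\varepsilon$. This is a minimal sketch with illustrative helper names; it uses the closed forms $\operatorname{erf}(\varepsilon/\sqrt{2})$ for the Gaussian and $1 - e^{-\varepsilon}$ for the unit Laplace.

```python
import math


def gaussian_mass(eps):
    """Exact P(|x| < eps) for x ~ N(0, 1)."""
    return math.erf(eps / math.sqrt(2))


def laplace_mass(eps):
    """Exact P(|x| < eps) for density 0.5 * exp(-|x|).

    expm1 keeps full precision when eps is tiny.
    """
    return -math.expm1(-eps)


for eps in (1e-2, 1e-4, 1e-6):
    g = gaussian_mass(eps)
    lap = laplace_mass(eps)
    print(f"eps={eps:.0e}  gaussian={g:.3e}  laplace={lap:.3e}  ratio={lap / g:.4f}")
```

The ratio settles at $\sqrt{\pi/2} \approx 1.25$: the Laplace prior puts about a quarter more mass in a tiny neighborhood of zero, a constant factor rather than a change in scaling.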

The Laplace prior changes the shape of the peak (it has a cusp), which strongly affects optimization and MAP estimates. But in terms of raw probability mass around microscopic neighborhoods, it is far less different than our intuition suggests.
