In this post we will cover the fundamental facts of the gamma
function, $\Gamma$. The gamma function is usually defined as
$\Gamma(x) = \int_0^{\infty}t^{x-1}e^{-t}\mathrm{d}t$, and it has
the property $\Gamma(n+1) = n!$ for all non-negative integers
$n$. In a way, we can consider the gamma function as a
continuous extension of the factorial operator. I highly
recommend that the interested reader check out the small book
The Gamma Function [2] by Artin
for its self-contained treatment of the gamma function. For a
thorough treatment check out Chapter 12 in [5], which covers probably everything that was
known about the gamma function up to the end of the nineteenth
century.
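Before we get going, here is a quick numerical sanity check of the factorial property. This is only an illustrative sketch: it leans on Python's built-in `math.gamma`, which is not otherwise used in this post.

```python
import math

# Spot-check Gamma(n + 1) == n! for small non-negative integers n.
for n in range(10):
    gamma_value = math.gamma(n + 1)      # Gamma(n + 1)
    factorial_value = math.factorial(n)  # n!
    assert math.isclose(gamma_value, factorial_value)
    print(f"Gamma({n} + 1) = {gamma_value:.1f} = {n}!")
```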
Convexity
As background material the first major concept we will explore is
convexity. A convex function $f$ essentially has the property
that the secant line connecting any two points on the graph of $f$
does not lie below the graph of $f$. Here is a formal definition.
Definition.
Assume $V$ is a real vector space. We say a real-valued function $f$
is convex on a convex set $D \subseteq V$ if for every $x$ and $y$ in $D$
and for all $t \in [0,1]$, we have
$$
f(tx + (1-t)y) \le tf(x) + (1-t)f(y).
$$
Furthermore, we say $f$ is strictly convex on $D$ if for all
$x \neq y$ in $D$ and $t \in (0,1)$, we have
$$
f(tx + (1-t)y) < tf(x) + (1-t)f(y).
$$
It follows from the definition of convexity that if $f$ and $g$
are convex on $D$ then $f + g$ is convex on $D$. Here is a useful
proposition we will need later.
Proposition A.
Suppose $f$ is convex on the interval $I$ with $a < b < c$
in $I$. Then we have
$$
\frac{f(b) - f(a)}{b - a} \le \frac{f(c) - f(a)}{c - a} \le \frac{f(c) - f(b)}{c - b}.
$$
Definition.
Suppose $f$ is a real-valued function on the real vector space
$V$ such that $f(x) > 0$ for all $x \in V$. If $\log f$ is
convex, then we call $f$ logarithmically convex.
In fact, if $f$ is logarithmically convex then $f$ is also
convex. The proof of this follows directly from $f = e^{\log
f}$ and $e^x$ being monotone increasing and convex. Moreover, if
$f$ and $g$ are logarithmically convex then their product $fg \colon x \mapsto
f(x)g(x)$ is logarithmically convex.
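To make this concrete, here is a small sketch that spot-checks the midpoint version of the convexity inequality for $\log\Gamma$ on a grid; the grid and tolerance are arbitrary choices, and a finite check is of course not a proof.

```python
import math

# Spot-check midpoint convexity of log(Gamma) on a grid in (0, 10]:
# log Gamma((x + y) / 2) <= (log Gamma(x) + log Gamma(y)) / 2.
points = [0.1 * k for k in range(1, 101)]
for x in points:
    for y in points:
        lhs = math.lgamma((x + y) / 2)
        rhs = (math.lgamma(x) + math.lgamma(y)) / 2
        assert lhs <= rhs + 1e-12  # tiny tolerance for rounding
print("midpoint convexity of log Gamma holds on the sampled grid")
```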
Hölder's Inequality
We use the concept of convexity to prove Hölder's
inequality, a fundamental result that we will need later.
The usual way to prove Hölder's inequality is to derive
it from the following inequality:
Proposition (Young's inequality).
Let $p, q > 0$ such that
$$
\frac{1}{p} + \frac{1}{q} = 1.
$$
If $a$ and $b$ are both non-negative real numbers, then
$$
ab \le \frac{a^p}{p} + \frac{b^q}{q}.
$$
Proof.
The proof is trivial for $a = 0$ or $b = 0$ so we may assume $a$
and $b$ are both strictly positive. We see
$$
ab = e^{\log{ab}} = e^{\frac{1}{p}\log{a^p} + \frac{1}{q}\log{b^q}}
\le \frac{1}{p}e^{\log{a^p}} + \frac{1}{q}e^{\log{b^q}}
= \frac{a^p}{p} + \frac{b^q}{q},
$$
where the inequality follows from the convexity of $e^x$ (applied with weights $\frac{1}{p} + \frac{1}{q} = 1$).
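As a quick sanity check of the statement, here is a small randomized sketch; the ranges and trial count are arbitrary choices.

```python
import random

# Randomized spot check of a*b <= a^p / p + b^q / q with 1/p + 1/q = 1.
random.seed(0)
for _ in range(10_000):
    p = random.uniform(1.01, 10.0)  # any p > 1 works
    q = p / (p - 1)                 # conjugate exponent, so 1/p + 1/q = 1
    a = random.uniform(0.0, 100.0)
    b = random.uniform(0.0, 100.0)
    assert a * b <= (a**p / p + b**q / q) * (1 + 1e-12)
print("Young's inequality held on all sampled inputs")
```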
Theorem (Hölder's inequality).
Let $f$ and $g$ be real-valued functions on the open interval
$(a,b)$, $-\infty \le a < b \le \infty$ such that
$$
A \defeq \left(\int_a^b\left|f\right|^p\right)^\frac{1}{p} \quad \quad
B \defeq \left(\int_a^b\left|g\right|^q\right)^\frac{1}{q}
$$
are both well-defined, where $\frac{1}{p} + \frac{1}{q} = 1$. Then
$$
\int_a^b\left|fg\right| \le AB.
$$
Proof.
Again if $A$ or $B$ is $0$ then the proof is trivial, so we can
assume $A, B > 0$. Taking $a$ and $b$ in Young's inequality to be
$\frac{|f(t)|}{A}$ and $\frac{|g(t)|}{B}$ (these play the role of the
numbers $a$ and $b$ there, not of the endpoints of the interval), we have
$$
\frac{|f(t)g(t)|}{AB} \le \frac{|f(t)|^p}{pA^p} + \frac{|g(t)|^q}{qB^q},
$$
and integrating over $(a,b)$ gives
$$
\frac{1}{AB}\int_a^b\left|fg\right| \le \frac{1}{p} + \frac{1}{q} = 1,
$$
which is the desired inequality.
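Here is a hedged numerical illustration of the theorem on $(0,1)$, replacing the integrals by a crude midpoint Riemann sum; the particular choices $f(t) = \sqrt{t}$, $g(t) = 1 - t$, $p = 3$, and $q = 3/2$ are mine, not from the text.

```python
import math

def riemann(h, n=100_000):
    """Midpoint Riemann sum of h over (0, 1)."""
    dt = 1.0 / n
    return sum(h((k + 0.5) * dt) for k in range(n)) * dt

p, q = 3.0, 1.5                 # conjugate exponents: 1/3 + 2/3 = 1
f = lambda t: math.sqrt(t)
g = lambda t: 1.0 - t

lhs = riemann(lambda t: abs(f(t) * g(t)))
A = riemann(lambda t: abs(f(t)) ** p) ** (1 / p)
B = riemann(lambda t: abs(g(t)) ** q) ** (1 / q)
print(f"int |fg| = {lhs:.6f}  <=  A * B = {A * B:.6f}")
assert lhs <= A * B
```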
The Gamma Function
We first work on proving that $\Gamma$ is well-defined. Let $x
> 0$ be fixed. Since we get issues at the $0$ endpoint when
$x < 1$, we take $\epsilon > 0$ and we see that
$$
\int_{\epsilon}^{1}t^{x-1}e^{-t}\mathrm{d}t \le \int_{\epsilon}^{1}t^{x-1}\mathrm{d}t = \frac{1 - \epsilon^x}{x}.
$$
Taking $\epsilon \to 0$, we get $\frac{1}{x}$ as an upper bound.
Next choose any integer $n$ such that $n > x + 1$, and as
$\frac{t^n}{n!} < e^t$, we have
$$
t^{x-1}e^{-t} < n!\,t^{x-1-n} \quad \text{for } t \ge 1,
$$
and since $x - 1 - n < -1$, the integral $\int_1^{\infty}n!\,t^{x-1-n}\mathrm{d}t$ is finite.
Since $x$ is fixed and the integral $\int_{1}^{M}
t^{x-1}e^{-t}\mathrm{d}t$ is monotonically increasing in $M$ and bounded
above by the estimate just given, the limit as $M \to \infty$ exists.
Together with the bound near $0$, this shows $\Gamma(x)$ is well-defined
for every $x > 0$.
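To see these bounds in action, here is a rough numerical sketch that truncates the integral at both ends and watches the value stabilize as the cutoffs widen; the sample point $x = 1/2$, the cutoffs, and the step count are arbitrary choices, and the uniform grid is crude near the singularity at $0$.

```python
import math

def truncated_gamma(x, eps, big, steps=200_000):
    """Midpoint Riemann sum of t^(x-1) * e^(-t) over [eps, big]."""
    dt = (big - eps) / steps
    total = 0.0
    for k in range(steps):
        t = eps + (k + 0.5) * dt
        total += t ** (x - 1) * math.exp(-t)
    return total * dt

x = 0.5  # the integrand blows up at 0 here, but is still integrable
for eps, big in [(1e-2, 10.0), (1e-4, 20.0), (1e-6, 40.0)]:
    print(f"eps = {eps:.0e}, M = {big:>4}: {truncated_gamma(x, eps, big):.5f}")
print(f"math.gamma({x}) = {math.gamma(x):.5f}")
```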
And finally, from Proposition B, we
have that $\Beta(x,y)$ is also logarithmically convex in $x$ for
each fixed $y$.
The Bohr-Mollerup Theorem
Now that we know the basic properties of $\Gamma$ and $\Beta$,
we can prove a powerful theorem that has many useful
applications.
Theorem (Bohr–Mollerup).
Let $f$ be a real-valued function defined on $(0, \infty)$ that
satisfies the following properties:
$f(1) = 1$,
$f(x + 1) = xf(x)$, and
$f$ is logarithmically convex.
Then $f$ is uniquely determined; in other words, $f = \Gamma$.
Proof.
We know such an $f$ exists as $\Gamma$ satisfies all three
properties.
Let $\varphi(x) \defeq \log{f(x)}$, so that $\varphi(1) = 0$, and
$\varphi(x+1) = \varphi(x) + \log{x}$. Furthermore
$\varphi(n+1) = \log(n!)$ for any integer $n \ge 0$.
Fix $0 < x < 1$ and let $n$ be any positive integer. Then
by applying Proposition A to the points $n < n + 1 < n + 1 + x$ and to
$n + 1 < n + 1 + x < n + 2$ we have
$$
\log{n} = \varphi(n+1) - \varphi(n) \le \frac{\varphi(n+1+x) - \varphi(n+1)}{x} \le \varphi(n+2) - \varphi(n+1) = \log{(n+1)}.
$$
Since the recursion gives $\varphi(n+1+x) = \varphi(x) + \log\big(x(x+1)\cdots(x+n)\big)$
and $\varphi(n+1) = \log{(n!)}$, this rearranges to
$$
0 \le \varphi(x) - \log{\frac{n!\,n^x}{x(x+1)\cdots(x+n)}} \le x\log{\left(1 + \frac{1}{n}\right)},
$$
and letting $n \to \infty$ yields
$$
\varphi(x) = \lim_{n \to \infty}\log{\frac{n!\,n^x}{x(x+1)\cdots(x+n)}}.
$$
This implies that $f$ is uniquely determined on
the interval $0 < x \le 1$. Moreover, from the recursion
formula $f(x+1) = xf(x)$, we see $f$ is uniquely determined
for all $x > 0$, and the proof is complete.
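To see the uniqueness concretely, the squeeze in the proof produces Euler's limit formula $\Gamma(x) = \lim_{n\to\infty}\frac{n!\,n^x}{x(x+1)\cdots(x+n)}$. Here is a small sketch comparing truncations of that limit against `math.gamma`; the test point and the values of $n$ are arbitrary choices.

```python
import math

def euler_limit(x, n):
    """n-th term of Euler's limit: n! * n^x / (x * (x+1) * ... * (x+n))."""
    # Work with logarithms to avoid overflowing n!.
    log_value = math.lgamma(n + 1) + x * math.log(n)
    log_value -= sum(math.log(x + k) for k in range(n + 1))
    return math.exp(log_value)

x = 0.3
for n in (10, 1_000, 100_000):
    print(f"n = {n:>6}: {euler_limit(x, n):.8f}")
print(f"math.gamma({x}) = {math.gamma(x):.8f}")
```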
The Bohr–Mollerup theorem gives us some far-reaching
applications. We will present two of them in this blog post.
Here is the first one.
Taking $x = y = 1/2$, we get $\Beta(x,y) = \pi$. Therefore
$\Gamma(1/2) = \sqrt{\pi}$.
Substituting $t = s^2$ in the integral definition of $\Gamma$,
we get
$$
\Gamma(x) = 2\int_0^{\infty}s^{2x-1}e^{-s^2}\mathrm{d}s,
$$
and in particular $\int_0^{\infty}e^{-s^2}\mathrm{d}s = \frac{\Gamma(1/2)}{2} = \frac{\sqrt{\pi}}{2}$.
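Here is a quick numerical sketch of both identities: $\Gamma(1/2) = \sqrt{\pi}$ via `math.gamma`, and the Gaussian integral via a crude midpoint sum truncated at $s = 10$ (the cutoff and step count are arbitrary choices).

```python
import math

# Gamma(1/2) versus sqrt(pi).
print(f"Gamma(1/2)   = {math.gamma(0.5):.10f}")
print(f"sqrt(pi)     = {math.sqrt(math.pi):.10f}")

# Crude midpoint sum for the Gaussian integral int_0^inf e^(-s^2) ds.
steps, upper = 1_000_000, 10.0
ds = upper / steps
gauss = sum(math.exp(-(((k + 0.5) * ds) ** 2)) for k in range(steps)) * ds
print(f"integral     = {gauss:.10f}")
print(f"sqrt(pi) / 2 = {math.sqrt(math.pi) / 2:.10f}")
```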
The function $f$ is logarithmically convex as it is a product of
logarithmically convex functions. From $\Gamma(1/2) =
\sqrt{\pi}$, we have $f(1) = 1$. And finally,
so that $\{H_n - \log{(n+1)}\}$ is a monotone increasing
sequence converging to $\gamma$. Since $(H_n - \log{n}) +
\log{n} = H_n$ and $H_n = (H_n - \log(n+1)) + \log(n+1)$, we
conclude that
$$
\gamma + \log{n} \le H_n \le \gamma + \log{(n+1)},
$$
giving us a decent approximation to $H_n$. Let's tie in a
probability example:
Example.
Suppose there are exactly $n$ different toys that we are
collecting from cereal boxes. Each toy has a $\frac{1}{n}$
chance of appearing in any box. How many cereal boxes are we
expected to buy in order to get all $n$ different toys?
Let's start by defining a sample space $S$.
Let's label the toys $1$ through $n$, and let's take
$S \defeq \{1, 2, \dots, n\}^{\infty}$. That is, $S$ is the
set of infinite tuples made up of integers between
$1$ and $n$ inclusive. The sample space $S$ is in a one-to-one
correspondence with indefinitely purchasing cereal boxes and
writing the toy numbers down in order.
Now take $X \colon S \to [0, \infty]$ to be the
number of boxes it takes to get all $n$ toys. Then $X$
is a random variable, and the question we posed is asking us to
find $\mathbb{E}(X)$.
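Before deriving $\mathbb{E}(X)$ exactly, here is a hedged Monte Carlo sketch of the answer the argument below builds toward, namely $\mathbb{E}(X) = nH_n \approx n(\log{n} + \gamma)$; the toy count $n = 20$ and the number of trials are arbitrary choices.

```python
import math
import random

def coupon_trial(n):
    """Simulate buying boxes until all n toys are collected; return the box count."""
    seen, boxes = set(), 0
    while len(seen) < n:
        seen.add(random.randrange(n))  # each toy equally likely in each box
        boxes += 1
    return boxes

n, trials = 20, 20_000
random.seed(0)
average = sum(coupon_trial(n) for _ in range(trials)) / trials
harmonic = sum(1.0 / k for k in range(1, n + 1))  # H_n
print(f"simulated E(X)      approx {average:.2f}")
print(f"n * H_n             =      {n * harmonic:.2f}")
print(f"n * (log n + gamma) approx {n * (math.log(n) + 0.5772156649):.2f}")
```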
Let's take $X_i \colon S \to [0, \infty]$ to be the number
of boxes it takes to get a new toy once we already have $i - 1$
distinct toys.
Let's do a quick example. Suppose $s \in S$ is
$(1,2,4,4,5,4,4,4,3,\dots)$ with $n=5$. We see $X_1(s) = 1$,
which is always the case, $X_2(s) = 1$, $X_3(s) = 1$, $X_4(s) =
2$, and $X_5(s) = 4$. And so
$$
X(s) = X_1(s) + X_2(s) + X_3(s) + X_4(s) + X_5(s) = 1 + 1 + 1 + 2 + 4 = 9.
$$
In general, $X = X_1 + X_2 + \cdots + X_n$. The probability that $X_i$ equals $k$ is
$$
\Pr(X_i = k) = \left(\frac{i-1}{n}\right)^{k-1}\frac{n-(i-1)}{n},
$$
that is, we collect $k-1$ toys that we already own, with the $k$th
toy not being in our collection. This just follows the geometric
distribution, but we can derive the expectation of $X_i$
directly. Take $p$ to be $\frac{n-(i-1)}{n}$ so that the above
formula reads as $(1-p)^{k-1}p$. Now
which converges. Therefore the series in \eqref{eq:w2} is
uniformly convergent (see Theorem 7.7 on page 152 in [4]). It follows
almost immediately from \eqref{eq:w2} that
$$
\Gamma'(1) = -\gamma.
$$
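As a last sanity check, here is a tiny sketch that estimates $\Gamma'(1)$ with a central difference and compares it with $-\gamma$; the step size $h$ is an arbitrary choice.

```python
import math

# Central-difference estimate of Gamma'(1), compared with minus the
# Euler-Mascheroni constant gamma.
h = 1e-5
derivative = (math.gamma(1 + h) - math.gamma(1 - h)) / (2 * h)
euler_mascheroni = 0.5772156649015329
print(f"central difference: {derivative:.8f}")
print(f"-gamma:             {-euler_mascheroni:.8f}")
```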
References
[1] Lars Ahlfors.
Complex Analysis.
McGraw-Hill, third edition, 1979.
[2] Emil Artin.
The Gamma Function.
Dover, 2015.
[3] Donald E. Knuth.
The Art of Computer Programming, volume 1.
Addison-Wesley, third edition, 1997.
[4] Walter Rudin.
Principles of Mathematical Analysis.
McGraw-Hill, third edition, 1976.
[5] E. T. Whittaker and G. N. Watson.
A Course of Modern Analysis.
Cambridge University Press, fifth edition, 2021.