The Gamma Function


filed under : #math #probability #analysis

$$\DeclareMathOperator{\vol}{Vol}$$ $$\DeclareMathOperator{\div}{div}$$ $$\DeclareMathOperator{\tr}{tr}$$ $$\DeclareMathOperator{\det}{det}$$ $$\newcommand{\lib}[2][]{\frac{\mathrm{d}#1}{\mathrm{d}#2}}$$ $$\newcommand{\pd}[2][]{\frac{\partial#1}{\partial#2}}$$ $$\newcommand{\defeq}{=}$$ $$\newcommand\at[2]{\left.#1\right|_{#2}}$$ $$\newcommand\Beta{\mathrm{B}}$$

In this post we will cover the fundamental facts about the gamma function, $\Gamma$. The gamma function is usually defined as $\Gamma(x) = \int_0^{\infty}t^{x-1}e^{-t}\mathrm{d}t$ and has the property that $\Gamma(n+1) = n!$ for all non-negative integers $n$. In this sense, we can consider the gamma function a continuous extension of the factorial. I highly recommend that the interested reader check out the small book The Gamma Function [2] by Artin for its self-contained treatment of the gamma function. For a thorough treatment, see Chapter 12 of [5], which covers probably everything that was known about the gamma function up to the end of the nineteenth century.

Convexity

As background material the first major concept we will explore is convexity. A convex function $f$ essentially has the property that the secant line connecting any two points on the graph of $f$ does not lie below the graph of $f$. Here is a formal definition.

Definition. Assume $V$ is a real vector space. We say a real-valued function $f$ is convex on a convex set $D \subseteq V$ if for every $x$ and $y$ in $D$ and for all $t \in [0,1]$, we have
$$ f(tx + (1-t)y) \le tf(x) + (1-t)f(y). $$
Furthermore we say $f$ is strictly convex on $D$ if for all $x \neq y$ in $D$ and all $t \in (0,1)$, we have
$$ f(tx + (1-t)y) < tf(x) + (1-t)f(y). $$

It follows from the definition of convexity that if $f$ and $g$ are convex on $D$ then $f + g$ is convex. Here is a useful proposition we will need later.

Proposition A. Suppose $f$ is convex on the interval $I$ with $a < b < c$ in $I$. Then we have
$$ \frac{f(b) - f(a)}{b - a} \le \frac{f(c) - f(a)}{c - a} \le \frac{f(c) - f(b)}{c - b}. $$
Proof. Take $t \defeq \frac{b-a}{c-a}$, so $t \in (0,1)$ and $b = (1-t)a + tc$. Now by convexity, we have
$$ \frac{f(b) - f(a)}{b - a} = \frac{f((1-t)a + tc) - f(a)}{(1-t)a + tc - a} \le \frac{(1-t)f(a) + tf(c) - f(a)}{t(c-a)} = \frac{f(c) - f(a)}{c - a} $$
Now take $t \defeq \frac{c-b}{c-a}$ so $t \in (0,1)$ and $b = ta + (1-t)c$. Now by convexity, we have
$$ \frac{f(c) - f(b)}{c - b} = \frac{f(c) - f(ta + (1-t)c)}{c - ta - (1-t)c} \ge \frac{f(c) - tf(a) - (1-t)f(c)}{tc - ta} = \frac{f(c) - f(a)}{c - a}, $$
and the proof is complete.

Suppose $f$ is a real-valued function on the real vector space $V$ such that $f(x) > 0$ for all $x \in V$. If $\log f$ is convex, then we call $f$ logarithmically convex. In fact, if $f$ is logarithmically convex then $f$ is also convex; this follows directly from $f = e^{\log f}$ together with $e^x$ being monotonically increasing and convex. Moreover, if $f$ and $g$ are logarithmically convex then so is their product $fg$, defined by $(fg)(x) \defeq f(x)g(x)$, since $\log(fg) = \log f + \log g$ is a sum of convex functions.

Hölder's Inequality

We use the concept of convexity to prove Hölder's inequality, which is a fundamental inequality we will need later. The usual way to prove Hölder's inequality is to derive it from the following inequality:

Proposition (Young's inequality). Let $p, q > 0$ such that
$$ \frac{1}{p} + \frac{1}{q} = 1. $$
If $a$ and $b$ are both non-negative real numbers, then
$$ ab \le \frac{a^p}{p} + \frac{b^q}{q}. $$
Proof. The proof is trivial for $a = 0$ or $b = 0$ so we may assume $a$ and $b$ are both strictly positive. We see
$$ ab = e^{\log{ab}} = e^{\frac{1}{p}\log{a^p} + \frac{1}{q}\log{b^q}}. $$
As $e^x$ is convex, we have
$$ e^{\frac{1}{p}\log{a^p} + \frac{1}{q}\log{b^q}} \le \frac{e^{\log{a^p}}}{p} + \frac{e^{\log{b^q}}}{q} = \frac{a^p}{p} + \frac{b^q}{q}, $$
which completes the proof.
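Young's inequality is easy to spot-check numerically. Here is a minimal Python sketch; the sampled ranges, the seed, and the rounding tolerance are arbitrary choices, and of course the check is not a proof.

```python
import random

# Spot-check Young's inequality: a*b <= a^p/p + b^q/q whenever 1/p + 1/q = 1.
random.seed(0)
for _ in range(10_000):
    a, b = random.uniform(0.0, 10.0), random.uniform(0.0, 10.0)
    p = random.uniform(1.01, 10.0)      # any p > 1
    q = p / (p - 1)                     # conjugate exponent, so 1/p + 1/q = 1
    assert a * b <= a ** p / p + b ** q / q + 1e-9   # small tolerance for rounding
print("Young's inequality held on all sampled triples")
```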
Theorem (Hölder's inequality). Let $p, q > 1$ with $\frac{1}{p} + \frac{1}{q} = 1$, and let $f$ and $g$ be real-valued functions on the open interval $(a,b)$, where $-\infty \le a < b \le \infty$, such that
$$ A \defeq \left(\int_a^b\left|f\right|^p\right)^\frac{1}{p} \quad \quad B \defeq \left(\int_a^b\left|g\right|^q\right)^\frac{1}{q} $$
are both well defined. Then
$$ \int_a^b\left|fg\right| \le AB. $$
Proof. Again, if $A$ or $B$ is $0$ then the result is trivial, so we may assume $A, B > 0$. Applying Young's inequality pointwise with $\frac{|f|}{A}$ and $\frac{|g|}{B}$ in place of $a$ and $b$, we have
$$ \frac{|f||g|}{AB} \le \frac{|f|^p}{pA^p} + \frac{|g|^q}{qB^q}. $$
Integrating both sides we see
$$ \frac{1}{AB}\int_a^b|fg| \le \frac{\int_a^b|f|^p}{pA^p} + \frac{\int_a^b|g|^q}{qB^q} = \frac{1}{p} + \frac{1}{q} = 1, $$
and the inequality follows.
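Here is a small numerical illustration of Hölder's inequality on $(0,1)$, a sketch that assumes SciPy is available for the quadrature; the particular $f$, $g$, and exponents are arbitrary choices.

```python
import math
from scipy.integrate import quad  # assumes SciPy is installed

# Spot-check Hölder's inequality on (0, 1) with f(t) = t, g(t) = e^t, p = 3, q = 3/2.
p, q = 3.0, 1.5                                   # 1/p + 1/q = 1
f = lambda t: t
g = lambda t: math.exp(t)

lhs = quad(lambda t: abs(f(t) * g(t)), 0.0, 1.0)[0]
A = quad(lambda t: abs(f(t)) ** p, 0.0, 1.0)[0] ** (1.0 / p)
B = quad(lambda t: abs(g(t)) ** q, 0.0, 1.0)[0] ** (1.0 / q)
print(f"{lhs:.6f} <= {A * B:.6f}")
assert lhs <= A * B + 1e-9
```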

As an application of Hölder's inequality we have

Proposition B. Let $\varphi$ be a positive real-valued continuous function on $(a,b)$, where $0 \le a < b \le \infty$. Define $f(x)$ as
$$ f(x) = \int_a^b \varphi(t)t^{x-1}\mathrm{d}t, $$
then $f$ is logarithmically convex on any interval where this integral is finite.
Proof. Fix $\lambda \in (0,1)$ and let $\frac{1}{p} = \lambda$ and $\frac{1}{q} = 1-\lambda$, so that $\frac{1}{p} + \frac{1}{q} = 1$. By Hölder's inequality, we see
$$ f\left(\frac{x}{p} + \frac{y}{q} \right) = \int_a^b\varphi(t)^{\frac{1}{p} + \frac{1}{q}}t ^{\frac{x}{p} - \frac{1}{p} + \frac{y}{q} - \frac{1}{q}}\mathrm{d}t = \int_a^b\left(\varphi(t)t^{x-1}\right)^{\frac{1}{p}} \left(\varphi(t)t^{y-1}\right)^{\frac{1}{q}}\mathrm{d}t \le f(x)^{\frac{1}{p}}f(y)^{\frac{1}{q}}. $$
Taking the logarithm on both sides shows
$$ \log f(\lambda x + (1-\lambda)y) \le \lambda \log f(x) + (1-\lambda) \log f(y), $$
and the proof is complete.
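To see Proposition B in action, the sketch below takes $\varphi(t) = e^{-t}$ on $(0, \infty)$, so that $f = \Gamma$, and checks the log-convexity inequality at randomly sampled points. It assumes SciPy is available; the sample ranges and tolerance are arbitrary.

```python
import math
import random
from scipy.integrate import quad  # assumes SciPy is installed

def f(x):
    """f(x) = integral of phi(t) * t^(x-1) over (0, infinity) with phi(t) = e^(-t), i.e. Gamma(x)."""
    return quad(lambda t: math.exp(-t) * t ** (x - 1), 0.0, math.inf)[0]

random.seed(1)
for _ in range(20):
    x, y = random.uniform(1.0, 5.0), random.uniform(1.0, 5.0)
    lam = random.uniform(0.01, 0.99)
    lhs = math.log(f(lam * x + (1 - lam) * y))
    rhs = lam * math.log(f(x)) + (1 - lam) * math.log(f(y))
    assert lhs <= rhs + 1e-7
print("log-convexity inequality held on all sampled points")
```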

The Gamma and Beta Functions

Definition. The gamma function $\Gamma$ is defined for all real $x > 0$ by the following formula
$$ \Gamma(x) \defeq \int_0^{\infty}t^{x-1}e^{-t}\mathrm{d}t. $$
We first work on proving that $\Gamma$ is well defined. Let $x > 0$ be fixed. Since the integrand may blow up at the $0$ endpoint when $x < 1$, we take $\epsilon \in (0,1)$ and see that
$$ \int_{\epsilon}^1t^{x-1}e^{-t}\mathrm{d}t \le \int_{\epsilon}^1t^{x-1}\mathrm{d}t = \frac{1}{x} - \frac{\epsilon^x}{x}. $$
Since the integrand is positive, letting $\epsilon \to 0$ shows that $\int_0^1 t^{x-1}e^{-t}\mathrm{d}t$ converges with $\frac{1}{x}$ as an upper bound. Next choose any integer $n$ such that $n > x + 1$; since $\frac{t^n}{n!} < e^t$ for all $t > 0$, we have
$$ t^{x-1}e^{-t} < n!{t^{x-n-1}}. $$
Now for any $\epsilon > 1$, we see
$$ \int_{1}^{\epsilon} t^{x-1}e^{-t}\mathrm{d}t < n! \int_{1}^{\epsilon}t^{x - n - 1}\mathrm{d}t = n! \left(\frac{\epsilon^{x-n}}{x-n} + \frac{1}{n-x}\right) < \frac{n!}{n-x}. $$
Since $x$ is fixed and the integral ${\int_{1}^{\epsilon} t^{x-1}e^{-t}\mathrm{d}t}$ is monotonically increasing in $\epsilon$ and bounded above by $\frac{n!}{n-x}$, we have that the following limit
$$ \int_{1}^{\infty}t^{x-1}e^{-t}\mathrm{d}t \defeq \lim_{\epsilon \to \infty} \int_{1}^{\epsilon}t^{x-1}e^{-t}\mathrm{d}t $$
converges. Thus
$$ \Gamma(x) = \int_0^{1}t^{x-1}e^{-t}\mathrm{d}t + \int_1^{\infty}t^{x-1}e^{-t}\mathrm{d}t $$
is well defined for all $x > 0$.
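Here is a quick numerical sanity check of this splitting, assuming SciPy is available for the quadrature; the standard library's `math.gamma` serves as the reference value and the sample points are arbitrary.

```python
import math
from scipy.integrate import quad  # assumes SciPy is installed

def gamma_by_quadrature(x):
    """Evaluate Gamma(x) as the sum of the two pieces used above."""
    head = quad(lambda t: t ** (x - 1) * math.exp(-t), 0.0, 1.0)[0]
    tail = quad(lambda t: t ** (x - 1) * math.exp(-t), 1.0, math.inf)[0]
    return head + tail

for x in (0.5, 1.0, 2.5, 5.0, 7.0):
    approx, exact = gamma_by_quadrature(x), math.gamma(x)
    print(f"x = {x}: quadrature = {approx:.10f}, math.gamma = {exact:.10f}")
    assert abs(approx - exact) < 1e-7
```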

Next we arrive at the most important property of $\Gamma$.

Proposition. For all $x > 0$ we have
$$ \Gamma(x+1) = x\Gamma(x). $$
Furthermore, for any positive integer $n$, $\Gamma(n) = (n-1)!$.
Proof. We compute using integration by parts:
$$ \Gamma(x+1) = \int_{0}^{\infty}t^xe^{-t}\mathrm{d}t = \at{\frac{t^x}{e^t}}{t=0} - \lim_{t \to \infty}\frac{t^x}{e^t} + x \int_{0}^{\infty}t^{x-1}e^{-t}\mathrm{d}t = x \Gamma(x). $$
For the second part, we have
$$ \Gamma(1) = \int_{0}^{\infty}e^{-t}\mathrm{d}t = \at{e^{-t}}{t=0} - \lim_{t\to \infty}e^{-t} = 1, $$
with the rest following from the recursive form $\Gamma(x+1) = x\Gamma(x)$.
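Both facts are easy to spot-check with the standard library's `math.gamma`; the sample points are arbitrary.

```python
import math

# Spot-check Gamma(x+1) = x * Gamma(x) and Gamma(n) = (n-1)!.
for x in (0.5, 1.3, 2.0, 4.7):
    assert math.isclose(math.gamma(x + 1), x * math.gamma(x), rel_tol=1e-12)
for n in range(1, 15):
    assert math.isclose(math.gamma(n), math.factorial(n - 1), rel_tol=1e-12)
print("recursion and factorial identities verified on the sampled points")
```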

Lastly, applying Proposition B with $\varphi(t) = e^{-t}$ on $(0, \infty)$ shows that $\Gamma$ is logarithmically convex.

Definition. The beta function $\Beta(x,y)$ is defined for all $x > 0$ and $y > 0$ as
$$ \Beta(x,y) \defeq \int_{0}^{1}t^{x-1}(1-t)^{y-1}\mathrm{d}t. $$

Let's run through some basic properties.

Proposition. We have $\Beta(1,y) = 1/y$.
Proof. The computation is trivial:
$$ \Beta(1,y) = \int_0^1(1-t)^{y-1}\mathrm{d}t = \frac{1}{y}. $$
Proposition. We have $\Beta(x+1,y) = \frac{x}{x+y}\Beta(x,y)$.
Proof. We compute via integration by parts, taking $u = \left(\frac{t}{1-t}\right)^x$ and $\mathrm{d}v = (1-t)^{x+y-1}\mathrm{d}t$; the boundary terms vanish since $x, y > 0$:
$$ \Beta(x+1,y) = \int_0^1\left(\frac{t}{1-t}\right)^x(1-t)^{x+y-1}\mathrm{d}t = \frac{x}{x+y}\int_0^1\left(\frac{t}{1-t}\right)^{x-1}\frac{(1-t)^{x+y}}{(1-t)^{2}}\mathrm{d}t = \frac{x}{x+y}\Beta(x,y). $$

And finally, applying Proposition B with $\varphi(t) = (1-t)^{y-1}$ on $(0,1)$, we have that $\Beta(x,y)$ is also logarithmically convex in $x$ for each fixed $y$.
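Here is a quick numerical spot-check of the two beta identities just proved, again assuming SciPy for the quadrature; the sample points are arbitrary.

```python
from scipy.integrate import quad  # assumes SciPy is installed

def beta(x, y):
    """B(x, y) as the defining integral over (0, 1)."""
    return quad(lambda t: t ** (x - 1) * (1 - t) ** (y - 1), 0.0, 1.0)[0]

for x, y in [(0.7, 1.3), (2.0, 3.5), (4.2, 0.9)]:
    assert abs(beta(1.0, y) - 1.0 / y) < 1e-7                        # B(1, y) = 1/y
    assert abs(beta(x + 1, y) - x / (x + y) * beta(x, y)) < 1e-7     # the recursion
print("beta identities held on the sampled points")
```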

The Bohr-Mollerup Theorem

Now that we know the basic properties of $\Gamma$ and $\Beta$, we can prove a powerful theorem that has many useful applications.

Theorem (Bohr–Mollerup). Let $f$ be a positive real-valued function defined on $(0, \infty)$ that satisfies the following properties:
  1. $f(1) = 1$,
  2. $f(x + 1) = xf(x)$, and
  3. $f$ is logarithmically convex.
then $f$ is uniquely determined; in particular, $f = \Gamma$.
Proof. We know such an $f$ exists as $\Gamma$ satisfies all three properties. Let $\varphi(x) \defeq \log{f}$, so that $\varphi(1) = 0$, and $\varphi(x+1) = \varphi(x) + \log{x}$. Furthermore $\varphi(n+1) = \log(n!)$ for any integer $n \ge 0$. Fix $0 < x < 1$ and let $n$ be any positive integer. Then by applying Proposition A we have
$$ \frac{\varphi(n+1) - \varphi(n)}{(n+1) - n} \le \frac{\varphi(n+1+x) - \varphi(n+1)}{(n+1+x) - (n+1)} \le \frac{\varphi(n+2) - \varphi(n+1)}{(n+2) - (n+1)}. $$
Simplifying this and using our properties of $\varphi$ we get
$$ \log{n!} - \log{(n-1)!} \le \frac{\varphi(x+n+1) - \log{n!}}{x} \le \log{(n+1)!} - \log{n!}. $$
Again simplifying the logarithms and expanding $\varphi(x+n+1) = \varphi(x) + \log\left(x(x+1)\cdots(x+n)\right)$, we see
$$ x\log{n} \le \varphi(x) + \log\left(\frac{x(x+1)\cdots(x+n)}{n!}\right) \le x\log{(n+1)}. $$
And now we move some terms around:
$$ 0 \le \varphi(x) - \log\left(\frac{n!n^x}{x(x+1)\cdots(x+n)}\right) \le x\log{\left(1 + \frac{1}{n}\right)}. $$
Finally, we have
$$ \lim_{n \to \infty} x\log{\left(1 + \frac{1}{n}\right)} = 0, $$
and so
$$ f(x) = \lim_{n \to \infty}\left(\frac{n!n^x}{x(x+1)\cdots(x+n)}\right). $$
This implies that $f$ is uniquely determined on the interval $0 < x \le 1$. From the recursion formula $f(x+1) = xf(x)$, it follows that $f$ is uniquely determined for all $x > 0$, and the proof is complete.
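The limit formula extracted in this proof can be evaluated directly with the standard library; the sketch below shows the partial products drifting toward `math.gamma(x)` (the choice of $x$ and of the cut-offs is arbitrary).

```python
import math

def gauss_term(x, n):
    """n-th term of the limit above: n! * n^x / (x (x+1) ... (x+n)), computed in log space."""
    log_value = math.lgamma(n + 1) + x * math.log(n)          # log(n!) + x log(n)
    log_value -= sum(math.log(x + k) for k in range(n + 1))   # minus log(x (x+1) ... (x+n))
    return math.exp(log_value)

x = 0.5
for n in (10, 100, 1_000, 10_000):
    print(f"n = {n:>6}: {gauss_term(x, n):.8f}   (math.gamma(x) = {math.gamma(x):.8f})")
```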

The Bohr-Mollerup theorem has some far-reaching applications. We will present two of them in this blog post. Here is the first one.

Fix any $y > 0$ and define $f$ as

$$ f(x) = \frac{\Gamma(x + y)}{\Gamma(y)}\Beta(x,y) $$
Let's show $f$ satisfies the three conditions of the Bohr-Mollerup theorem. First we see
$$ f(1) = \frac{\Gamma(1+y)}{\Gamma(y)}\Beta(1,y) = y \frac{1}{y} = 1. $$
Next we have
$$ f(x + 1) = \frac{\Gamma(x+y+1)}{\Gamma(y)}\Beta(x+1, y) = \frac{\Gamma(x+y)}{\Gamma(y)} \frac{x(x+y)}{x+y}\Beta(x,y) = xf(x). $$
And finally $f$ is logarithmically convex as it's a product of logarithmically convex functions. Therefore $f = \Gamma$ and we conclude
$$ \Beta(x,y) = \frac{\Gamma(x)\Gamma(y)}{\Gamma(x + y)} $$
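A quick numerical confirmation of this identity, with SciPy again assumed for the beta integral and `math.gamma` for the right-hand side; the sample points are arbitrary.

```python
import math
from scipy.integrate import quad  # assumes SciPy is installed

def beta(x, y):
    return quad(lambda t: t ** (x - 1) * (1 - t) ** (y - 1), 0.0, 1.0)[0]

for x, y in [(0.5, 0.5), (1.7, 2.3), (3.0, 4.5)]:
    lhs = beta(x, y)
    rhs = math.gamma(x) * math.gamma(y) / math.gamma(x + y)
    print(f"B({x}, {y}) = {lhs:.9f}   Gamma(x)Gamma(y)/Gamma(x+y) = {rhs:.9f}")
    assert abs(lhs - rhs) < 1e-6
```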

Now let's substitute $t = \sin^2{\theta}$ into the beta definition

$$ \Beta(x,y) \defeq \int_0^1t^{x-1}(1-t)^{y-1}\mathrm{d}t = 2\int_{0}^{\frac{\pi}{2}}\sin^{2x-1}{\theta}\cos^{2y-1}{\theta}\mathrm{d}\theta $$
Taking $x = y = 1/2$, we get $\Beta(1/2,1/2) = \pi$. Since $\Beta(1/2,1/2) = \Gamma(1/2)^2/\Gamma(1)$, it follows that $\Gamma(1/2) = \sqrt{\pi}$. Substituting $t = s^2$ in the integral definition of $\Gamma$, we get
$$ \Gamma(x) = 2\int_{0}^{\infty}s^{2x-1}e^{-s^2}\mathrm{d}s. $$
In particular with $\Gamma(1/2) = \sqrt{\pi}$, we derive the probability integral
$$ \int_{-\infty}^{\infty}e^{-s^2}\mathrm{d}s = \sqrt{\pi}. $$

We finish this section with one last application of the Bohr-Mollerup theorem. Define $f$ as

$$ f(x) \defeq \frac{2^{x-1}}{\sqrt{\pi}} \Gamma\left(\frac{x}{2}\right)\Gamma\left(\frac{x+1}{2}\right). $$
The function $f$ is logarithmically convex as it is a product of logarithmically convex functions ($\log 2^{x-1}$ is linear in $x$, hence convex). From $\Gamma(1/2) = \sqrt{\pi}$, we have $f(1) = 1$. And finally,
$$ f(x+1) = \frac{2^x}{\sqrt{\pi}} \Gamma\left(\frac{x+1}{2}\right) \Gamma\left(\frac{x}{2} + 1\right) = \frac{2^x}{\sqrt{\pi}} \Gamma\left(\frac{x+1}{2}\right) \Gamma\left(\frac{x}{2}\right)\frac{x}{2} = xf(x). $$
Therefore $f = \Gamma$ by the Bohr-Mollerup theorem, and replacing $x$ with $2x$ proves the Legendre duplication formula
$$ \Gamma(x)\Gamma\left(x + \frac{1}{2}\right) = 2^{1-2x}\sqrt{\pi}\Gamma(2x). $$
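The duplication formula is also easy to spot-check with `math.gamma`; the sample points are arbitrary.

```python
import math

# Spot-check the Legendre duplication formula at a few points.
for x in (0.3, 1.0, 2.7, 5.5):
    lhs = math.gamma(x) * math.gamma(x + 0.5)
    rhs = 2 ** (1 - 2 * x) * math.sqrt(math.pi) * math.gamma(2 * x)
    print(f"x = {x}: {lhs:.12g} vs {rhs:.12g}")
    assert math.isclose(lhs, rhs, rel_tol=1e-10)
```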

The Euler-Mascheroni Constant

The Euler-Mascheroni constant $\gamma$ is defined by the following limit

$$ \lim_{n \to \infty}\left(1 + \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{n} - \log{n}\right). $$
Usually the sum $ 1 + \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{n}$ is denoted by $H_n$ and called the $n$th harmonic number.

Let's verify that this limit exists. Take $u_n$ to be

$$ u_n \defeq \int_0^1\frac{t}{n(n+t)}\mathrm{d}t = \int_0^1\left(\frac{1}{n} - \frac{1}{n+t}\right)\mathrm{d}t = \frac{1}{n} - \log{\frac{n+1}{n}}, $$
and observe that
$$ u_n = \int_0^1\frac{t}{n^2+nt}\mathrm{d}t \le \int_0^1\frac{t}{n^2}\mathrm{d}t \le \int_0^1\frac{1}{n^2}\mathrm{d}t = \frac{1}{n^2}, $$
so that the sum $\sum_{n=1}^{\infty}u_n$ converges. Moreover,
$$ \left(1 + \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{n} - \log{n}\right) = \sum_{i=1}^nu_i + \log{\frac{n+1}{n}}, $$
and taking the limit on both sides as $n \to \infty$ gives our conclusion. According to Table 1 in Appendix A of [3], the first 40 decimals of $\gamma$ are
$$ \gamma = 0.5772156649015328606065120900824024310421\dots $$

Recall $\log{n} = \int_1^n\frac{1}{t}\mathrm{d}t$, and observe that since $\frac{1}{t}$ is monotone decreasing, we have

$$ \frac{1}{n+1} \le \int_{n}^{n+1}\frac{1}{t}\mathrm{d}t \le \frac{1}{n}. $$
Since $\log{(n+1)} = \log{n} + \int_n^{n+1}\frac{1}{t}\mathrm{d}t$, it follows that
$$ \log{n} + \frac{1}{n+1} \le \log{(n+1)} \le \log{n} + \frac{1}{n}. $$
We see that
$$ H_{n+1} - \log{(n+1)} \le H_{n+1} - \left(\log{n} + \frac{1}{n+1}\right) = H_{n} - \log{n}, $$
so that $\{H_n - \log{n}\}$ is a monotone decreasing sequence converging to $\gamma$, and we see that
$$ H_n - \log(n+1) \ge H_n - \left(\log{n} + \frac{1}{n}\right) = H_{n-1} - \log{n}, $$
so that $\{H_n - \log{(n+1)}\}$ is a monotone increasing sequence converging to $\gamma$. A monotone decreasing sequence is bounded below by its limit and a monotone increasing sequence is bounded above by its limit, so $H_n - \log{n} \ge \gamma$ and $H_n - \log{(n+1)} \le \gamma$; that is,
$$ \gamma + \log{n} \le H_n \le \gamma + \log(n+1), $$
giving us a decent approximation to $H_n$. Below is a quick numerical check of these bounds, after which we tie in a probability example.
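This is a small standard-library check, with $\gamma$ hard-coded to double precision from the digits quoted above; the cut-off is arbitrary.

```python
import math

EULER_GAMMA = 0.5772156649015328606   # gamma, rounded to double precision

H = 0.0
for n in range(1, 10_001):
    H += 1.0 / n
    assert EULER_GAMMA + math.log(n) <= H <= EULER_GAMMA + math.log(n + 1)
print("gamma + log(n) <= H_n <= gamma + log(n + 1) held for all n up to 10000")
```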

Example. Suppose there are exactly $n$ different toys that we are collecting from cereal boxes. Each toy has a $\frac{1}{n}$ chance of appearing in any box. How many cereal boxes are we expected to buy in order to get all $n$ different toys?

Let's start by defining a sample space $S$. Label the toys $1$ through $n$ and take $S \defeq \{1, 2, \dots, n\}^{\infty}$; that is, $S$ is the set of infinite tuples of integers between $1$ and $n$ inclusive. The sample space $S$ is in one-to-one correspondence with indefinitely purchasing cereal boxes and writing down the toy numbers in order.

Now take $X \colon S \to [0, \infty]$ to be the number of boxes it takes to get all $n$ toys. Then $X$ is a random variable, and the question we posed asks us to find $\mathbb{E}(X)$.

Let's take $X_i \colon S \to [0, \infty]$ to be the number of additional boxes it takes to get a new toy once we already have $i - 1$ distinct toys.

Let's do a quick example. Suppose $s \in S$ is $(1,2,4,4,5,4,4,4,3,\dots)$ with $n=5$. We see $X_1(s) = 1$ (which is always the case), $X_2(s) = 1$, $X_3(s) = 1$, $X_4(s) = 2$, and $X_5(s) = 4$, so that

$$ X(s) = X_1(s) + \cdots + X_5(s) = 9. $$
Overall, the probability of $X_i = k$ is given by
$$ \left(\frac{i-1}{n}\right)^{k-1} \frac{n-(i-1)}{n}, $$
that is, the first $k-1$ boxes contain toys we already own and the $k$th box contains a toy not yet in our collection. This is just the geometric distribution, but we can derive the expectation of $X_i$ directly. Take $p$ to be $\frac{n-(i-1)}{n}$ so that the above formula reads as $(1-p)^{k-1}p$. Now
$$ \mathbb{E}(X_i) = \sum_{k=1}^{\infty}kp(1-p)^{k-1} = -p \sum_{k=1}^{\infty}\lib[]{p}(1-p)^k = -p \lib[]{p}\left(\frac{1}{p} - 1\right) = \frac{1}{p}. $$
Finally, we see
$$ \mathbb{E}(X) = \mathbb{E}(X_1) + \cdots + \mathbb{E}(X_n) = n\left(1 + \frac{1}{2} + \cdots + \frac{1}{n}\right), $$
giving us
$$ n\gamma + n\log{n} \le \mathbb{E}(X) \le n\gamma + n\log{(n+1)}. $$
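To close the loop on the example, here is a small Monte Carlo sketch using only the standard library; the number of toys, the trial count, and the seed are arbitrary choices.

```python
import math
import random

def boxes_needed(n, rng):
    """Simulate buying boxes until all n distinct toys have been collected."""
    seen, boxes = set(), 0
    while len(seen) < n:
        seen.add(rng.randrange(n))
        boxes += 1
    return boxes

n, trials = 20, 20_000
rng = random.Random(42)
estimate = sum(boxes_needed(n, rng) for _ in range(trials)) / trials
exact = n * sum(1.0 / k for k in range(1, n + 1))            # n * H_n
gamma = 0.5772156649015328606
print(f"simulated E(X) ~ {estimate:.2f}, exact n*H_n = {exact:.2f}")
print(f"bounds: {n * (gamma + math.log(n)):.2f} <= E(X) <= {n * (gamma + math.log(n + 1)):.2f}")
```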

The Weierstrass Representation of Gamma

In this final section we derive another representation of $\Gamma$, which is usually the definition given in books on complex analysis such as [1].

Recall from the proof of the Bohr-Mollerup theorem that

$$ \Gamma(x) = \lim_{n \to \infty}\left(\frac{n!n^x}{x(x+1)\cdots(x+n)}\right). $$
Take $\Gamma_n(x)$ to be the $n$th term in this limit, and observe
$$ n^x = e^{x\log{n}} = e^{x\log{n} - xH_n + xH_n} = e^{-x(H_n - \log{n})}e^{x}e^{x/2}\cdots{e^{x/n}}, $$
and so $\Gamma_n(x)$ equals
$$ e^{-x(H_n - \log{n})} \frac{1}{x}\frac{e^x}{1+x} \frac{2e^{x/2}}{2+x}\cdots \frac{ne^{x/n}}{n + x} = e^{-x(H_n - \log{n})} \frac{1}{x}\frac{e^x}{1+x} \frac{e^{x/2}}{1+x/2} \cdots \frac{e^{x/n}}{1 + x/n}. $$
Letting $n \to \infty$ gives us
$$ \label{eq:w}\tag{1} \Gamma(x) = \frac{e^{-x\gamma}}{x}\prod_{n=1}^{\infty} \left(1+\frac{x}{n}\right)^{-1}e^{x/n}, $$
or more naturally,
$$ \frac{1}{\Gamma(x)} = xe^{\gamma{x}} \prod_{n=1}^{\infty}\left(1+\frac{x}{n}\right)e^{-x/n} $$
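As a numerical illustration, the standard-library sketch below multiplies out the first $N$ factors of the product and compares against $1/\Gamma(x)$ from `math.gamma`, with $\gamma$ hard-coded to double precision; the truncated product converges only slowly, since we simply drop the tail.

```python
import math

EULER_GAMMA = 0.5772156649015328606   # gamma, rounded to double precision

def reciprocal_gamma(x, terms):
    """Truncated Weierstrass product: x * e^(gamma*x) * prod_{n=1}^{terms} (1 + x/n) * e^(-x/n)."""
    value = x * math.exp(EULER_GAMMA * x)
    for n in range(1, terms + 1):
        value *= (1 + x / n) * math.exp(-x / n)
    return value

x = 2.5
for terms in (10, 1_000, 100_000):
    print(f"N = {terms:>6}: {reciprocal_gamma(x, terms):.8f}   (1/Gamma(x) = {1 / math.gamma(x):.8f})")
```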
Taking the log of \eqref{eq:w} we get
$$ \tag{2}\label{eq:w1} \log{\Gamma(x)} = -\gamma{x} - \log{x} + \sum_{n=1}^{\infty} \left[\frac{x}{n} - \log\left(1+\frac{x}{n}\right)\right]. $$
Differentiating the expression \eqref{eq:w1} we get
$$ \tag{3}\label{eq:w2} \frac{\Gamma'(x)}{\Gamma(x)} = -\gamma - \frac{1}{x} + \sum_{n=1}^{\infty} \left(\frac{1}{n} - \frac{1}{n+x}\right) = -\gamma - \frac{1}{x} + \sum_{n=1}^{\infty}\frac{x}{n(x+n)}. $$
We can verify that differentiating \eqref{eq:w1} term by term is justified by noting that for any $R > 0$ and any $x \in (0, R)$, we have
$$ \sum_{n=1}^{\infty}\frac{x}{n(x+n)} \le \sum_{n=1}^{\infty}\frac{x}{n^2} \le \sum_{n=1}^{\infty}\frac{R}{n^2}, $$
which converges, so the series in \eqref{eq:w2} converges uniformly on $(0, R)$ (see Theorem 7.17 on page 152 in [4] for the relevant theorem on term-by-term differentiation). Setting $x = 1$ in \eqref{eq:w2} and using the telescoping sum $\sum_{n=1}^{\infty}\frac{1}{n(n+1)} = \sum_{n=1}^{\infty}\left(\frac{1}{n} - \frac{1}{n+1}\right) = 1$, it follows almost immediately that
$$ \Gamma'(1) = -\gamma $$
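As a last numerical check, note that $\Gamma(1) = 1$, so $\Gamma'(1) = (\log\Gamma)'(1)$, which we can approximate by a central difference of the standard library's `math.lgamma`; the step size is an arbitrary choice.

```python
import math

# Gamma(1) = 1, so Gamma'(1) = (log Gamma)'(1); approximate it with a central difference.
h = 1e-6
approx = (math.lgamma(1 + h) - math.lgamma(1 - h)) / (2 * h)
print(f"numerical Gamma'(1) ~ {approx:.10f}")
print(f"-gamma             = {-0.5772156649015328606:.10f}")
```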

References

  1. Lars Ahlfors. Complex Analysis. McGraw-Hill, third edition, 1979.
  2. Emil Artin. The Gamma Function. Dover, 2015.
  3. Donald E. Knuth. The Art of Computer Programming, volume 1. Addison Wesley, third edition, 1997.
  4. Walter Rudin. Principles of Mathematical Analysis. McGraw-Hill, third edition, 1976.
  5. E. T. Whittaker and G. N. Watson. A Course of Modern Analysis. Cambridge University Press, fifth edition, 2021.