Courses and textbooks, at least at the undergraduate level, often gloss over the details of the Legendre transformation, which converts a convex function of one variable into another convex function of the “conjugate” variable. Used ubiquitously in physics, from thermodynamics to quantum field theory, this mathematical method plays a central role in connecting some of the most fundamental concepts, and yet for a long while it was, at least to me, a mysterious black-box procedure. Here I will attempt to illuminate the technique.

Let’s begin with the technical definition:

Definition. Given a convex function $f: I \subset \mathbb{R} \to \mathbb{R}$ of a variable $x$, the Legendre transform, or convex conjugate, $f^* : I^* \to \mathbb{R}$ in terms of the conjugate variable $x^*$ is given by

$$ \begin{equation} f^*(x^*) = \sup_{x \in I} \big( x^* x - f(x)\big), \ \ x^* \in I^* \end{equation} $$

where

$$ I^* = \left\{ x^* \in \mathbb{R} : \sup_{x \in I} \big( x^* x - f(x)\big) < \infty \right\} $$

This definition generalizes to convex functions of several variables by replacing the product $x^{*}x$ with the appropriate inner product $\langle x^*, x \rangle$.
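
As a quick one-dimensional illustration of how the domain $I^*$ is determined, here is a worked example. The choice $f(x) = e^x$ on $I = \mathbb{R}$ is my own and is not used elsewhere in this discussion. For $x^* > 0$, setting the derivative of $x^* x - e^x$ to zero gives the maximizer $x = \ln x^*$. For $x^* = 0$ the supremum of $-e^x$ is $0$, approached as $x \to -\infty$. For $x^* < 0$ the expression grows without bound as $x \to -\infty$. Hence

$$ f^*(x^*) = \begin{cases} x^* \ln x^* - x^*, & x^* > 0 \\ 0, & x^* = 0 \end{cases} \qquad \text{and} \qquad I^* = [0, \infty). $$

So even though $f$ is defined on all of $\mathbb{R}$, its transform is finite only on a half-line, which is why the definition has to specify $I^*$.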

Below is an interactive plot (made with Julia) demonstrating how $f^*(p)$, writing $p$ for the conjugate variable $x^*$, arises as the maximum over $x$ of the transformed function $px-f(x)$ for the function $f(x) = \frac{1}{2}ax^2 + c$ with $a = 1$ and $c = 4$. The derivation of these functions will be made clear below, but the plot might give some intuition for the transformation.
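
Independently of the plot, the supremum in the definition can be approximated numerically by brute force. What follows is only a minimal sketch in Python (not the Julia code behind the plot). It assumes the supremum is attained inside a finite grid, and the helper name `legendre_transform`, the grid bounds and the sample values of $p$ are all my own illustrative choices. For this quadratic the known closed form is $f^*(p) = \frac{p^2}{2a} - c$, which the sketch uses as a sanity check.

```python
import numpy as np

def legendre_transform(f, p, xs):
    """Approximate f*(p) = sup_x (p*x - f(x)) by maximizing over the grid xs.

    Assumption: the true maximizer lies inside the range covered by xs,
    otherwise the grid maximum underestimates the supremum.
    """
    return np.max(p * xs - f(xs))

# Parameters of the example function f(x) = (1/2) a x^2 + c
a, c = 1.0, 4.0

def f(x):
    return 0.5 * a * x**2 + c

# Illustrative grid; for this f the maximizer is x = p / a, so the grid
# must contain p / a for every p we evaluate.
xs = np.linspace(-10.0, 10.0, 200_001)

for p in (-2.0, 0.0, 1.0, 3.0):
    numeric = legendre_transform(f, p, xs)
    closed_form = p**2 / (2 * a) - c  # closed form for this quadratic
    print(f"p = {p:5.1f}   grid sup = {numeric:.6f}   closed form = {closed_form:.6f}")
```

The grid search is deliberately naive: it mirrors the definition literally (take the largest value of $px - f(x)$ over all $x$) rather than using calculus, so it also works for convex functions that are not differentiable, at the cost of accuracy limited by the grid spacing.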