Courses and textbooks, at least at the undergraduate level, often gloss over the details of *the Legendre transformation*, which converts a *convex* function of one variable into another convex function of the “*conjugate*” variable. It is used throughout physics, from thermodynamics to quantum field theory, and plays a central role in connecting some of the most fundamental concepts; yet for a long time it was, at least to me, a mysterious black-box procedure. Here I will attempt to illuminate the technique.

Let’s begin with the technical definition:

**Definition.** Given a convex function $f: I \subset \mathbb{R} \to \mathbb{R}$ of a variable $x$, the Legendre transform, or convex conjugate, $f^* : I^* \to \mathbb{R}$ in terms of the conjugate variable $x^*$ is given by

$$ \begin{equation} f^*(x^*) = \sup_{x \in I} \big( x^* x - f(x)\big), \ \ x^* \in I^* \end{equation} $$

where

$$ I^* = \left\{ x^* \in \mathbb{R} : \sup_{x \in I} \big( x^* x - f(x)\big) < \infty \right\} $$
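
To see how $I^*$ can be strictly smaller than $\mathbb{R}$, consider $f(x) = e^x$ on $I = \mathbb{R}$. This function is chosen here purely for illustration and is not used elsewhere in the discussion. For $x^* > 0$ the supremum is attained where $x^* = e^x$, that is, at $x = \ln x^*$; for $x^* = 0$ it is approached as $x \to -\infty$; and for $x^* < 0$ the expression grows without bound as $x \to -\infty$:

$$ \sup_{x \in \mathbb{R}} \big( x^* x - e^x \big) = \begin{cases} x^* \ln x^* - x^*, & x^* > 0 \\ 0, & x^* = 0 \\ +\infty, & x^* < 0 \end{cases} $$

So in this example $I^* = [0, \infty)$, and $f^*(x^*) = x^* \ln x^* - x^*$ on it, with the convention $f^*(0) = 0$.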

This definition generalizes to convex functions of several variables by replacing the product $x^{*}x$ with the appropriate inner product $\langle x^*, x \rangle$.

Below is an interactive plot (made with Julia) showing how $f^*(p)$ is obtained as the maximum over $x$ of the transformed function $px-f(x)$ for the function $f(x) = \frac{1}{2}ax^2 + c$ with $a = 1$ and $c = 4$; here $p$ plays the role of the conjugate variable $x^*$. The derivation of these functions will be made clear below, but this might give some intuition into the transformation.
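
The same maximization can be checked numerically. The following is a minimal sketch in Python with NumPy (not the Julia code behind the plot): it approximates the supremum by taking the maximum of $px - f(x)$ over a finite grid, which is only reasonable when the maximizer lies inside the grid. The helper name `legendre_transform`, the grid range, and the sample values of $p$ are all illustrative assumptions. The result is compared with $\frac{p^2}{2a} - c$, the value obtained by setting the derivative of $px - f(x)$ to zero at $x = p/a$.

```python
import numpy as np

def legendre_transform(f, xs, p):
    """Approximate f*(p) = sup_x (p*x - f(x)) by a maximum over the grid xs.

    Hypothetical helper for illustration; returns the approximate value of
    f*(p) together with the grid point where the maximum is reached.
    """
    values = p * xs - f(xs)
    i = np.argmax(values)
    return values[i], xs[i]

# Parameters from the text: f(x) = (1/2) a x^2 + c with a = 1, c = 4.
a, c = 1.0, 4.0

def f(x):
    return 0.5 * a * x**2 + c

# Assumed grid: wide enough to contain the maximizer x = p / a for each p below.
xs = np.linspace(-10.0, 10.0, 20001)

results = {}
for p in (-2.0, 0.0, 1.0, 3.0):
    fstar, x_max = legendre_transform(f, xs, p)
    results[p] = (fstar, x_max)
    closed_form = p**2 / (2 * a) - c
    print(f"p = {p:5.1f}  grid max = {fstar:.6f}  "
          f"closed form = {closed_form:.6f}  maximizer = {x_max:.3f}")
```

A grid search like this is crude but makes the definition concrete: for each slope $p$, the transform simply records the largest gap between the line $px$ and the graph of $f$.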