Linear Transformations and the Scaling of Lebesgue Measure
2025-04-06 · 18 min read
How the determinant controls volume distortion, and why translation invariance uniquely characterizes Lebesgue measure up to a scalar.
The problem
Throughout this article, ∣E∣ denotes the outer Lebesgue measure of a set
E in its ambient Euclidean space. When E is a rectangle, ball, or other
elementary region, this agrees with its usual volume. Thus the same notation
is used whether or not the set has already been shown to be measurable.
Let Φ:Rn→Rn be a linear map and E⊆Rn be any set. To determine ∣Φ(E)∣, we need to investigate how Φ affects the measure of E. One intuitive conclusion is that
∣Φ(E)∣=∣detΦ∣∣E∣.
The formula does not originate in measure theory; it comes from the geometric
meaning of the determinant.
From the geometric point of view, if v1,…,vn∈Rn, then ∣det(v1,…,vn)∣ is the n-dimensional volume of the parallelepiped spanned by these vectors.
Thus, if A:Rn→Rn is linear, then A sends the unit
cube to the parallelepiped spanned by the column vectors of A, and the volume
of that parallelepiped is ∣detA∣.
The determinant records the signed volume distortion of a linear map.
Image: Claudio Rocchini, Wikimedia Commons, CC BY 3.0/GFDL.
This is the finite-dimensional geometric core of the theorem.
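The planar case can be checked by hand or with a few lines of code. The sketch below is only an illustration: the matrix `A` is an arbitrary example, and the helper names (`det2`, `apply`, `shoelace_area`) are hypothetical. It maps the unit square through `A` and compares the area of the resulting parallelogram, computed with the shoelace formula, against ∣detA∣.

```python
def det2(a):
    """Determinant of a 2x2 matrix given as nested lists."""
    return a[0][0] * a[1][1] - a[0][1] * a[1][0]


def apply(a, p):
    """Apply the 2x2 matrix a to the point p = (x, y)."""
    x, y = p
    return (a[0][0] * x + a[0][1] * y, a[1][0] * x + a[1][1] * y)


def shoelace_area(polygon):
    """Unsigned area of a simple polygon from its vertices listed in order."""
    total = 0.0
    for (x0, y0), (x1, y1) in zip(polygon, polygon[1:] + polygon[:1]):
        total += x0 * y1 - x1 * y0
    return abs(total) / 2.0


A = [[2.0, 1.0], [0.5, 3.0]]  # arbitrary example matrix
unit_square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
image = [apply(A, p) for p in unit_square]  # vertices of the parallelogram
area = shoelace_area(image)
print(area, abs(det2(A)))
```

Both printed numbers should agree, since the image of the unit square is the parallelogram spanned by the columns of `A`.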
For rectangles, parallelepipeds, and simple polyhedral regions, the determinant
already explains volume distortion. The deeper question is whether the same
formula remains valid for an arbitrary set E⊆Rn:
∣A(E)∣=∣detA∣∣E∣.
This is no longer merely a problem of linear algebra. It requires a theory of
volume that applies to irregular sets. This is exactly the role of Lebesgue
measure. Lebesgue's measure-theoretic viewpoint makes it possible to assign
volume not only to elementary regions, but also to much more complicated subsets
of Euclidean space.
Therefore, the theorem can be read as follows: the classical determinant
formula for parallelepipeds extends to all subsets of Euclidean space, once
volume is interpreted as outer Lebesgue measure.
Why should this hold? We give two detailed proofs: the first relies on explicit analytic estimates, while the second is more structural and uses only a few general properties of measures and set systems.
First Proof
Lemma
Suppose Φ:Rn→Rn is a Lipschitz mapping. If E⊆Rn is Lebesgue measurable, then Φ(E) is measurable.
Proof.
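Step 1. If $|E|=0$, then $|\Phi(E)|=0$. Let $C>0$ be a Lipschitz constant of $\Phi$ and let $\varepsilon>0$ be arbitrary. There exist cubes $\{C_i\}$ covering $E$, with side lengths $s_i$, such that
$$\sum_i |C_i| < \frac{\varepsilon}{\omega_n\, n^{n/2}\, C^n},$$
where $\omega_n=\pi^{n/2}/\Gamma(n/2+1)$ is the volume of the unit ball $B(0,1)\subseteq\mathbb{R}^n$. Each cube $C_i$ has diameter $\sqrt{n}\,s_i$, so $\Phi(C_i)$ is contained in a ball of radius $C\sqrt{n}\,s_i$. We have
$$|\Phi(E)| \le \sum_i |\Phi(C_i)| \le \sum_i \omega_n \big(C\sqrt{n}\,s_i\big)^n = \omega_n\, n^{n/2}\, C^n \sum_i |C_i| < \varepsilon.$$
Since $\varepsilon$ was arbitrary, $|\Phi(E)|=0$.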
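Step 2. If $F\subseteq\mathbb{R}^n$ is compact, then $\Phi(F)$ is compact.
Let $(x_k)\subseteq\Phi(F)$ be such that $x_k\to x\in\mathbb{R}^n$. For each $k$, choose $y_k\in F$ such that $\Phi(y_k)=x_k$. Since $F$ is compact, there is a subsequence $(y_{k_j})$ converging to some $y\in F$, and by continuity $x_{k_j}=\Phi(y_{k_j})\to\Phi(y)$. As the limit is unique, $x=\Phi(y)\in\Phi(F)$, so $\Phi(F)$ is closed. It is also bounded, because $\Phi$ is Lipschitz and $F$ is bounded. Hence $\Phi(F)$ is compact.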
Step 3. Let H be an Fσ set in Rn; then Φ(H) is an Fσ set in Rn.
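Write $H=\bigcup_{i=1}^\infty F_i$, where the $F_i$ are closed. For each $i$, define $K_{ij}=F_i\cap\overline{B}(0,j)$, where $\overline{B}(0,j)$ is the closed ball of radius $j$, so that
$$F_i=\bigcup_{j=1}^\infty K_{ij}.$$
Since $F_i$ is closed, $K_{ij}$ is a closed subset of the compact set $\overline{B}(0,j)$, hence compact. It follows from the previous step that $\Phi(K_{ij})$ is compact for all $i,j$, and one can write
$$\Phi(H)=\bigcup_{i,j=1}^\infty \Phi(K_{ij}).$$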
Therefore Φ(H) is Fσ.
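Step 4. Let $A$ be measurable with $|A|<\infty$. By inner regularity of Lebesgue measure,
$$|A|=\sup\{|K| : K\subseteq A,\ K \text{ compact}\}.$$
For each $k\ge 1$, choose a compact subset $K_k\subseteq A$ such that
$$|K_k|>|A|-\frac{1}{k}.$$
Set $H=\bigcup_{k=1}^\infty K_k$; then $H$ is $F_\sigma$. Since $H\subseteq A$, one has $|H|\le|A|$, and $|H|\ge|K_k|>|A|-\frac{1}{k}$ for all $k$, so $|H|=|A|$. Then $|A\setminus H|=|A|-|H|=0$, so $N=A\setminus H$ is a null set. If $|A|=\infty$, apply this to each bounded piece $A\cap B(0,j)$ and take the union over $j$ of the resulting $F_\sigma$ sets and of the resulting null sets.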
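Step 5. By Step 4, write $E=H\cup N$, where $H$ is $F_\sigma$ and $N$ is a null set, so that $\Phi(E)=\Phi(H)\cup\Phi(N)$. Let $A\subseteq\mathbb{R}^n$ be any subset. Since $H$ is $F_\sigma$, it follows from Step 3 that $\Phi(H)$ is $F_\sigma$, hence Borel and therefore measurable. Since $\Phi(H)\subseteq\Phi(E)$, we have $\Phi(E)^c\subseteq\Phi(H)^c$, which gives the estimate
$$|\Phi(E)^c\cap A|\le|\Phi(H)^c\cap A|.$$
By Step 1, $\Phi$ maps null sets to null sets, so $|\Phi(N)|=0$. Then
$$|A\cap\Phi(E)|+|A\cap\Phi(E)^c| \le |A\cap\Phi(H)|+|A\cap\Phi(N)|+|A\cap\Phi(H)^c| = |A\cap\Phi(H)|+|A\cap\Phi(H)^c| = |A|,$$
where the last equality uses the measurability of $\Phi(H)$. The reverse inequality follows from subadditivity of outer measure. Hence $\Phi(E)$ satisfies the Carathéodory condition.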
■
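Now let $A:\mathbb{R}^n\to\mathbb{R}^n$ be any linear map. We show $A$ is Lipschitz. Writing $x=\sum_{i=1}^n x_i e_i$,
$$\|Ax\|=\Big\|\sum_{i=1}^n x_i\,Ae_i\Big\|\le\sum_{i=1}^n |x_i|\,\|Ae_i\|.$$
Applying Cauchy–Schwarz,
$$\|Ax\|\le\Big(\sum_i\|Ae_i\|^2\Big)^{1/2}\Big(\sum_i x_i^2\Big)^{1/2}=C\|x\|,$$
where $C=\big(\sum_i\|Ae_i\|^2\big)^{1/2}<\infty$. By linearity, $\|Ax-Ay\|=\|A(x-y)\|\le C\|x-y\|$. Thus $A$ is Lipschitz.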
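The constant C above is the square root of the sum of the squared column norms of A. As a rough numerical sanity check (not a proof), the following sketch computes C for an arbitrary example matrix and confirms on random pairs of points that the ratio ∥Ax−Ay∥/∥x−y∥ never exceeds it. The matrix entries, the seed, and the helper names are assumptions chosen for illustration.

```python
import math
import random


def matvec(a, x):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]


def norm(v):
    """Euclidean norm."""
    return math.sqrt(sum(t * t for t in v))


def lipschitz_bound(a):
    """C = (sum_i ||A e_i||^2)^(1/2), where A e_i is the i-th column of A."""
    n = len(a[0])
    columns = [[row[i] for row in a] for i in range(n)]
    return math.sqrt(sum(norm(c) ** 2 for c in columns))


rng = random.Random(0)  # fixed seed so the run is reproducible
A = [[1.0, -2.0, 0.5], [0.0, 3.0, 1.0], [4.0, 0.0, -1.0]]
C = lipschitz_bound(A)

worst_ratio = 0.0
for _ in range(1000):
    x = [rng.uniform(-1, 1) for _ in range(3)]
    y = [rng.uniform(-1, 1) for _ in range(3)]
    d = norm([xi - yi for xi, yi in zip(x, y)])
    if d > 0:
        image_d = norm([u - v for u, v in zip(matvec(A, x), matvec(A, y))])
        worst_ratio = max(worst_ratio, image_d / d)

print(worst_ratio, "<=", C)
```

The bound is usually not sharp: the best Lipschitz constant is the largest singular value of A, which is at most C.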
We state the SVD theorem, to be proved later using the spectral theorem.
Theorem (SVD)
Let $A\in M_{m\times n}(\mathbb{R})$. Then there exist $U\in O(m)$, $V\in O(n)$, and a diagonal matrix $\Sigma\in M_{m\times n}(\mathbb{R})$ with nonnegative entries such that $A=U\Sigma V^T$.
SVD decomposes a linear map into rotations and coordinate-axis scaling.
Image: Georg-Johann, Wikimedia Commons, CC BY-SA 3.0/GFDL.
Historically, this decomposition was discovered independently by Eugenio
Beltrami in 1873 and Camille Jordan in 1874 in the context of bilinear forms.
James Joseph Sylvester later arrived at a related decomposition for real square
matrices and called the singular values the canonical multipliers of the matrix.
In the twentieth century, the theory was extended and connected with integral
operators by Schmidt and Weyl, while Eckart and Young made the decomposition
central to low-rank approximation.
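With this theorem, it suffices to verify the result when Φ is an orthogonal map or a diagonal (scaling) matrix. We also treat translations, which are needed for affine maps.
Case 1: $\Phi$ is orthogonal (a rotation, possibly composed with a reflection), i.e. $\Phi\in O(n)=\{A\in M_n(\mathbb{R}) : A^TA=AA^T=I_n\}$.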
Step 1: Φ is an isometry. For x,y∈Rn,
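$$\langle\Phi x,\Phi y\rangle=(\Phi x)^T(\Phi y)=x^T\Phi^T\Phi y=x^Ty=\langle x,y\rangle.$$
This implies $\|\Phi x\|=\|x\|$ and, by linearity, $\|\Phi x-\Phi y\|=\|x-y\|$. Since $\Phi$ is invertible with $\Phi^{-1}=\Phi^T\in O(n)$, the inverse $\Phi^{-1}$ is also an isometry.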
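For a concrete instance of Step 1, the sketch below takes a planar rotation by an arbitrary angle and checks numerically that it preserves inner products and norms. The angle, the test vectors, and the helper names are illustrative assumptions.

```python
import math


def rotation(theta):
    """2x2 rotation matrix by the angle theta (an element of O(2))."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]


def matvec(a, x):
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]


def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


Phi = rotation(0.7)  # arbitrary angle
x, y = [3.0, -1.0], [0.5, 2.0]

inner_before = dot(x, y)
inner_after = dot(matvec(Phi, x), matvec(Phi, y))
norm_before = math.sqrt(dot(x, x))
norm_after = math.sqrt(dot(matvec(Phi, x), matvec(Phi, x)))
print(inner_before, inner_after, norm_before, norm_after)
```

Up to floating-point rounding, the inner product and the norm are unchanged.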
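Step 2: $\Phi(B(x_i,r))=B(\Phi(x_i),r)$ and covers are preserved. Since $\Phi$ is an invertible linear isometry,
$$y\in\Phi(B(x_i,r)) \iff \|\Phi^{-1}y-x_i\|<r \iff \|y-\Phi(x_i)\|<r \iff y\in B(\Phi(x_i),r).$$
It follows that $\{B(x_i,r_i)\}$ covers $E$ if and only if $\{B(\Phi(x_i),r_i)\}$ covers $\Phi(E)$.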
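Step 3: $|\Phi(E)|=|E|$. Let $\varepsilon>0$. Choose open balls $\{B(x_i,r_i)\}$ covering $E$ such that $\sum_i|B(x_i,r_i)|<|E|+\varepsilon$; here we use that outer Lebesgue measure can equivalently be computed with countable covers by open balls. Then $\{B(\Phi(x_i),r_i)\}$ covers $\Phi(E)$. Since $\Phi$ is an isometry and the volume of a ball depends only on its radius, $|B(\Phi(x_i),r_i)|=|B(x_i,r_i)|$, so
$$|\Phi(E)|\le\sum_i|B(\Phi(x_i),r_i)|=\sum_i|B(x_i,r_i)|<|E|+\varepsilon.$$
Now choose open balls $\{B(y_i,\rho_i)\}$ covering $\Phi(E)$ such that $\sum_i|B(y_i,\rho_i)|<|\Phi(E)|+\varepsilon$. Then $\{B(\Phi^{-1}(y_i),\rho_i)\}$ covers $E$. Since $\Phi^{-1}\in O(n)$ is also an isometry,
$$|E|\le\sum_i|B(\Phi^{-1}(y_i),\rho_i)|=\sum_i|B(y_i,\rho_i)|<|\Phi(E)|+\varepsilon.$$
Letting $\varepsilon\to0$ in both inequalities gives $|\Phi(E)|=|E|$.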
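Case 2: $\Phi=\operatorname{diag}[\sigma_1,\dots,\sigma_n]$. Let $R=\prod_{i=1}^n(a_i,b_i]$ be any rectangle. Then
$$\Phi(R)=\prod_{i=1}^n(\sigma_ia_i,\sigma_ib_i].$$
One can assume $\sigma_i>0$ for all $i$: if some $\sigma_i=0$, the image lies in a coordinate hyperplane, giving $|\Phi(E)|=0$, and $\sigma_i<0$ merely reflects the $i$-th interval to $[\sigma_ib_i,\sigma_ia_i)$, which has the same length $|\sigma_i|(b_i-a_i)$. Since $\Phi(R)$ is a rectangle,
$$|\Phi(R)|=\prod_{i=1}^n(\sigma_ib_i-\sigma_ia_i)=|\det\Phi|\cdot|R|.$$
Since $\Phi$ is bijective and maps rectangles onto rectangles, $\{R_i\}$ covers $E$ if and only if $\{\Phi(R_i)\}$ covers $\Phi(E)$, and every rectangle cover of $\Phi(E)$ arises in this way. Therefore
$$|\Phi(E)|=\inf\Big\{\sum_i|\Phi(R_i)| : E\subseteq\bigcup_iR_i\Big\}=|\det\Phi|\,\inf\Big\{\sum_i|R_i| : E\subseteq\bigcup_iR_i\Big\}=|\det\Phi|\cdot|E|.$$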
Case 3: $\Phi(x)=x+x_0$, where $x_0=(x_1,\dots,x_n)\in\mathbb{R}^n$. Let $R=\prod_{i=1}^n(a_i,b_i]$ be any rectangle. Then $\Phi(R)=\prod_{i=1}^n(a_i+x_i,b_i+x_i]$, and since $\Phi(R)$ is a rectangle,
$$|\Phi(R)|=\prod_{i=1}^n(b_i+x_i-a_i-x_i)=|R|.$$
Since Φ is bijective, {Ri} covers E if and only if {Φ(Ri)} covers Φ(E). Therefore ∣Φ(E)∣=∣E∣.
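The rectangle computations in Cases 2 and 3 are easy to reproduce. In the sketch below a rectangle is stored as a list of intervals (the endpoint conventions play no role for volume); the specific rectangle `R`, scaling factors `sigma`, and shift `x0` are made-up examples, and the function names are hypothetical.

```python
from math import prod


def volume(rect):
    """Volume of a rectangle given as a list of (a_i, b_i) with a_i <= b_i."""
    return prod(b - a for a, b in rect)


def scale(rect, sigma):
    """Image under diag(sigma); a negative sigma_i swaps the two endpoints."""
    return [tuple(sorted((s * a, s * b))) for (a, b), s in zip(rect, sigma)]


def translate(rect, x0):
    """Image under x -> x + x0."""
    return [(a + t, b + t) for (a, b), t in zip(rect, x0)]


R = [(0.0, 2.0), (1.0, 4.0), (-1.0, 1.0)]  # volume 2 * 3 * 2 = 12
sigma = [2.0, -0.5, 3.0]                   # det diag(sigma) = -3
x0 = [5.0, -7.0, 0.25]

abs_det = abs(prod(sigma))
print(volume(scale(R, sigma)), abs_det * volume(R))  # scaled volume vs |det| * |R|
print(volume(translate(R, x0)), volume(R))           # translation keeps the volume
```

The negative entry in `sigma` illustrates the remark in Case 2: a reflection changes the orientation of an interval but not its length.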
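For any linear map $\Phi:\mathbb{R}^n\to\mathbb{R}^n$, the SVD theorem gives $\Phi=U\Sigma V^T$ with $U,V\in O(n)$ and $\Sigma$ diagonal. Let $E\subseteq\mathbb{R}^n$ be arbitrary. Since $U$ and $V^T$ are isometries, Cases 1 and 2 give
$$|\Phi(E)|=|U\Sigma V^T(E)|=|\Sigma V^T(E)|=|\det\Sigma|\cdot|V^T(E)|=|\det\Sigma|\cdot|E|.$$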
Since ∣detU∣=∣detV∣=1, we have ∣detΦ∣=∣detΣ∣. Hence
∣Φ(E)∣=∣detΦ∣⋅∣E∣.
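The formula can also be tested on a set that is not a rectangle. The following Monte Carlo sketch estimates the area of the image of the closed unit disk E under an arbitrary invertible 2×2 matrix and compares it with ∣detΦ∣⋅π. Everything here is an illustrative assumption: the matrix, the seed, the sample size, and the helper name `in_image_of_disk`. The estimate is statistical, so only approximate agreement should be expected.

```python
import math
import random

A = [[2.0, 1.0], [0.0, 3.0]]  # arbitrary invertible example matrix
det_A = A[0][0] * A[1][1] - A[0][1] * A[1][0]


def in_image_of_disk(p):
    """p lies in A(E) iff A^{-1} p lies in E, the closed unit disk."""
    x, y = p
    u = (A[1][1] * x - A[0][1] * y) / det_A
    v = (-A[1][0] * x + A[0][0] * y) / det_A
    return u * u + v * v <= 1.0


# Bounding box of A(E): the i-th coordinate ranges over [-||row_i||, ||row_i||].
hx = math.hypot(A[0][0], A[0][1])
hy = math.hypot(A[1][0], A[1][1])
box_area = (2 * hx) * (2 * hy)

rng = random.Random(12345)  # fixed seed so the run is reproducible
n_samples = 100_000
hits = 0
for _ in range(n_samples):
    point = (rng.uniform(-hx, hx), rng.uniform(-hy, hy))
    if in_image_of_disk(point):
        hits += 1

estimate = box_area * hits / n_samples
exact = abs(det_A) * math.pi  # |det A| * |E|
print(estimate, exact)
```

Membership in A(E) is tested by pulling the sample point back through the inverse matrix, which avoids having to describe the ellipse A(E) explicitly.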
Second Proof
A more standard way to prove such a result is to begin with a small class of sets
where the formula is transparent, and then extend it to a larger
σ-algebra.
For example, one first verifies the formula on half-open rectangles or cubes.
These sets generate the Borel σ-algebra of Rn. Measure
theory then supplies extension tools which allow a statement known on the
generating class to be promoted first to Borel sets and then, by the outer
measure definition, to all subsets.
This is the conceptual role of the Carathéodory extension principle and related
monotone-class or π-λ arguments. In this section we introduce two kinds of set systems that are well suited to comparing measures, and we prove one theorem and two lemmas.
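A nonempty collection of subsets $\mathcal{P}\subseteq 2^X$ is a π-system if
$$A,B\in\mathcal{P}\Rightarrow A\cap B\in\mathcal{P}.$$
A collection of subsets $\mathcal{L}\subseteq 2^X$ is a λ-system if:
(i) $X\in\mathcal{L}$.
(ii) $A,B\in\mathcal{L}$ and $A\subseteq B$ implies $B\setminus A\in\mathcal{L}$.
(iii) If $\{A_k\}\subseteq\mathcal{L}$ and $A_k\subseteq A_{k+1}$ for all $k$, then $\bigcup_{k=1}^\infty A_k\in\mathcal{L}$.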
Theorem (π-λ Theorem)
If P is a π-system and L is a λ-system with P⊆L, then σ(P)⊆L.
Proof.
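Let
$$\mathcal{S}=\bigcap_{\substack{\mathcal{L}'\supseteq\mathcal{P}\\ \mathcal{L}'\text{ is a }\lambda\text{-system}}}\mathcal{L}'$$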
be the smallest λ-system containing P. Then P⊆S⊆L, and S is itself a λ-system by construction.
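Claim: $\mathcal{S}$ is a π-system.
For any $A\in\mathcal{S}$, define $\mathcal{D}_A=\{C\subseteq X : A\cap C\in\mathcal{S}\}$. One can verify directly that $\mathcal{D}_A$ is a λ-system: $A\cap X=A\in\mathcal{S}$; if $C\subseteq D$ are in $\mathcal{D}_A$, then $A\cap(D\setminus C)=(A\cap D)\setminus(A\cap C)\in\mathcal{S}$; and $A\cap\bigcup_kC_k=\bigcup_k(A\cap C_k)\in\mathcal{S}$ for every increasing sequence $(C_k)$ in $\mathcal{D}_A$.
Step 1: First take $A\in\mathcal{P}$. For any $P\in\mathcal{P}$, since $\mathcal{P}$ is a π-system, $A\cap P\in\mathcal{P}\subseteq\mathcal{S}$, so $P\in\mathcal{D}_A$. Hence $\mathcal{P}\subseteq\mathcal{D}_A$, and since $\mathcal{S}$ is the smallest λ-system containing $\mathcal{P}$, we get $\mathcal{S}\subseteq\mathcal{D}_A$. This means $A\cap C\in\mathcal{S}$ for all $A\in\mathcal{P}$ and all $C\in\mathcal{S}$.
Step 2: Now take any $A\in\mathcal{S}$. By Step 1, for any $P\in\mathcal{P}$, $A\cap P\in\mathcal{S}$, so $P\in\mathcal{D}_A$. Hence $\mathcal{P}\subseteq\mathcal{D}_A$, and again $\mathcal{S}\subseteq\mathcal{D}_A$. In particular, for any $B\in\mathcal{S}$, $B\in\mathcal{D}_A$, which means $A\cap B\in\mathcal{S}$.
Hence $\mathcal{S}$ is closed under finite intersections, i.e., it is a π-system.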
S is a σ-algebra. Since S is a λ-system, X∈S and X∖X=∅∈S. If A∈S, then Ac=X∖A∈S. For countable unions: given {Ak}⊆S, set Bk=A1∪⋯∪Ak. Since S is a π-system, it is closed under finite unions (by De Morgan and closure under complements and finite intersections), so Bk∈S. Since Bk↗⋃kAk and S is a λ-system, ⋃kAk∈S. Hence S is a σ-algebra.
Therefore σ(P)⊆σ(S)=S⊆L.
■
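On a finite set the theorem can be explored by brute force. The sketch below, with hypothetical helper names `lambda_closure` and `sigma_closure`, generates the smallest λ-system and the smallest σ-algebra containing a given family on X={0,1,2,3}. On a finite set, closure under increasing unions is automatic, so the λ-closure only needs X and differences of nested sets. The two example families are assumptions chosen to show that the π-system hypothesis matters.

```python
def lambda_closure(X, gens):
    """Smallest lambda-system on the finite set X containing gens."""
    fam = set(gens) | {X}
    changed = True
    while changed:
        changed = False
        for A in list(fam):
            for B in list(fam):
                if A <= B and (B - A) not in fam:
                    fam.add(B - A)  # difference of nested sets
                    changed = True
    return fam


def sigma_closure(X, gens):
    """Smallest sigma-algebra on the finite set X containing gens."""
    fam = set(gens) | {X}
    changed = True
    while changed:
        changed = False
        for A in list(fam):
            for B in list(fam):
                for C in (X - A, A | B):  # complements and unions
                    if C not in fam:
                        fam.add(C)
                        changed = True
    return fam


X = frozenset({0, 1, 2, 3})
# Q is not a pi-system: {0,1} and {1,2} intersect in {1}, which is missing.
Q = {frozenset({0, 1}), frozenset({1, 2})}
# P adds the missing intersection, so P is a pi-system.
P = Q | {frozenset({1})}

print(len(lambda_closure(X, Q)), len(sigma_closure(X, Q)))
print(len(lambda_closure(X, P)), len(sigma_closure(X, P)))
```

For the family `Q`, the generated λ-system is strictly smaller than the generated σ-algebra; once the intersection {1} is added, the two closures coincide, as the theorem predicts.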
Historically, this result is closely associated with Eugene Dynkin and is also
known under the name Sierpiński-Dynkin theorem.
Lemma
Let A⊆Rn be a compact subset whose intersection with {c}×Rn−1 has (n−1)-dimensional measure zero for every c∈R. Then A has measure zero.
Proof.
Since A is compact, there exists a closed interval [a,b] such that A⊆[a,b]×Rn−1. For each c∈[a,b], denote Ac={x∈Rn−1∣(c,x)∈A}.
Let $\varepsilon>0$ be arbitrary. Since $A_c$ is compact and $|A_c|=0$ in $\mathbb{R}^{n-1}$, there exist finitely many $(n-1)$-dimensional open cubes $\{C_1,\dots,C_k\}$ covering $A_c$ with $\sum_i|C_i|<\varepsilon$. Set $U_c=C_1\cup\dots\cup C_k$.
Claim: There exists an open interval Jc∋c such that A∩(Jc×Rn−1)⊆Jc×Uc.
Suppose not. Then there exists a sequence $(c_i,x_i)\in A$ with $c_i\to c$ and $x_i\notin U_c$. Since $A$ is compact, passing to a subsequence, $(c_i,x_i)\to(c,x)$ for some $(c,x)\in A$. In particular $x\in A_c$. But $U_c$ is open and $x_i\notin U_c$ for all $i$, so $x\notin U_c$, contradicting $A_c\subseteq U_c$. This proves the claim.
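Since $\{J_c\}_{c\in[a,b]}$ is an open cover of the compact set $[a,b]$, it admits a finite subcover $\{J_{c_1},\dots,J_{c_m}\}$. Shrinking overlapping intervals if necessary, we may assume $\sum_k|J_{c_k}|\le2(b-a)$. Then
$$A\subseteq\bigcup_{k=1}^m J_{c_k}\times U_{c_k},\qquad |A|\le\sum_{k=1}^m|J_{c_k}|\,|U_{c_k}|<2(b-a)\,\varepsilon.$$
Since $\varepsilon>0$ was arbitrary, $|A|=0$.
■
Lemma
Every proper affine subspace $V\subseteq\mathbb{R}^n$ has measure zero.
Proof.
We argue by induction on $n$. For $n=1$, $V$ is empty or a single point. For $n\ge2$, $V$ is contained in an affine hyperplane $\{x : a_1x_1+\dots+a_nx_n=d\}$ with some $a_i\neq0$, and by monotonicity it suffices to treat the case where $V$ is this hyperplane. Permuting coordinates, which does not change the volume of rectangles and hence does not change outer measure, we may assume $a_n\neq0$. Solving for $x_n$ gives $x_n=F(x_1,\dots,x_{n-1})$ with $F$ affine.
So $V$ is the graph of the continuous function $F:\mathbb{R}^{n-1}\to\mathbb{R}$. For any $c\in\mathbb{R}$, the slice $V\cap(\{c\}\times\mathbb{R}^{n-1})$ is the affine hyperplane $\{x_n=F(c,x_2,\dots,x_{n-1})\}$ of $\{c\}\times\mathbb{R}^{n-1}$ (a single point when $n=2$), since $x_n$ is uniquely determined by the remaining coordinates. By the induction hypothesis it has $(n-1)$-dimensional measure zero. Since $V$ is closed, it is the countable union of the compact sets $V\cap\overline{B}(0,j)$, whose slices are contained in those of $V$. Applying the previous lemma to each of these compact sets, it follows that $V$ has Lebesgue measure zero.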
■
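In the plane, the fact that the graph of an affine function is a null set can be made concrete with an explicit cover. The sketch below is an illustration under stated assumptions: the function `F`, its Lipschitz constant `L`, and the helper names are all chosen for the example. It covers the graph of F over [0,1] by N thin rectangles and shows that the total area behaves like 2L/N.

```python
def cover(F, L, N):
    """N rectangles (x0, x1, y0, y1) covering the graph of F over [0, 1].

    Piece k spans [k/N, (k+1)/N] horizontally and F(k/N) +- L/N vertically,
    which contains the graph over that interval whenever F is L-Lipschitz.
    """
    rects = []
    for k in range(N):
        x0, x1 = k / N, (k + 1) / N
        y = F(x0)
        rects.append((x0, x1, y - L / N, y + L / N))
    return rects


def total_area(rects):
    return sum((x1 - x0) * (y1 - y0) for x0, x1, y0, y1 in rects)


def F(x):
    return 2.0 * x + 1.0  # affine example with Lipschitz constant 2


L = 2.0
areas = {N: total_area(cover(F, L, N)) for N in (10, 100, 1000)}
print(areas)
```

Since the total area can be made smaller than any ε by taking N large, the piece of the graph over [0,1] has outer measure zero; the whole line is a countable union of such pieces.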
There is an even more structural interpretation. The Lebesgue measure ∣E∣ is
translation invariant:
∣E+x∣=∣E∣.
If A∈GL(n,R) and we define
ν(E)=∣AE∣,
then ν is again a translation-invariant measure on Rn. Hence
ν should be a constant multiple of Lebesgue measure:
ν(E)=c∣E∣.
The constant is determined by evaluating both measures on the unit cube:
c=ν([0,1)n)=∣A[0,1)n∣=∣detA∣.
Thus,
∣AE∣=∣detA∣∣E∣.
This viewpoint is closely related to Haar measure. On the additive group
(Rn,+), Lebesgue measure is the canonical translation-invariant
measure, unique up to multiplication by a positive constant.
We now state the main theorem again and prove it. Let Φ:Rn→Rn be
an affine map Φ(x)=Ax+b, where A∈Mn(R) and
b∈Rn. Then
$$|\Phi(U)|=|\det(A)|\,|U|\quad\text{for all }U\subseteq\mathbb{R}^n.$$
Proof.
Reduction. Since Φ(U)=A(U)+b and outer Lebesgue measure is translation invariant, ∣A(U)+b∣=∣A(U)∣. So it suffices to prove ∣A(U)∣=∣det(A)∣∣U∣ for all U⊆Rn.
Case 1: $\det(A)=0$. Then $\operatorname{im}(A)$ is a proper linear subspace of $\mathbb{R}^n$, which has measure zero by the previous lemma. Since $A(U)\subseteq\operatorname{im}(A)$, monotonicity gives $|A(U)|=0=|\det(A)|\,|U|$.
Case 2: $\det(A)\neq0$. Then $A$ is invertible. We proceed in five steps.
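Step 1: Define a candidate measure ν. Let $\nu:\mathcal{B}(\mathbb{R}^n)\to[0,+\infty]$ be defined by $\nu(U)=|A(U)|$. Since $A$ is a homeomorphism, $A(U)$ is Borel whenever $U$ is. Clearly $\nu(\emptyset)=0$. For any countable disjoint collection $\{U_i\}\subseteq\mathcal{B}(\mathbb{R}^n)$, since $A$ is a bijection, the sets $A(U_i)$ are again disjoint, so
$$\nu\Big(\bigsqcup_iU_i\Big)=\Big|\bigsqcup_iA(U_i)\Big|=\sum_i|A(U_i)|=\sum_i\nu(U_i).$$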
Hence ν is a measure on B(Rn).
Step 2: ν is translation invariant. For any U∈B(Rn) and x∈Rn, since A is linear,
ν(U+x)=∣A(U+x)∣=∣A(U)+Ax∣=∣A(U)∣=ν(U),
where the second equality uses linearity of A and the third uses translation invariance of outer measure.
Step 3: ν agrees with ∣detA∣∣⋅∣ on half-open cubes.
For any half-open cube $Q=\prod_{i=1}^n[a_i,b_i)$ of side length $s$, write $Q=a+s\cdot[0,1)^n$. By Step 2, $\nu(Q)=\nu(s\cdot[0,1)^n)$. Tile $[0,1)^n$ by $m^n$ disjoint half-open cubes $\{Q_j\}$ of side $\frac{1}{m}$. By translation invariance, all $\nu(Q_j)$ are equal, so $\nu([0,1)^n)=m^n\,\nu([0,\frac{1}{m})^n)$. A scaling argument gives $\nu([0,s)^n)=s^n\,\nu([0,1)^n)$ for rational $s$, and monotonicity extends this to all $s>0$. The image $A([0,1)^n)$ is a parallelepiped whose volume is $|\det A|$ by the geometric interpretation of the determinant. Hence
$$\nu(Q)=s^n|\det A|=|\det A|\,|Q|.$$
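Two ingredients of Step 3 can be checked with exact rational arithmetic: the unit cube splits into mⁿ subcubes of side 1/m, and the parallelepiped A(s⋅[0,1)ⁿ), spanned by the columns of sA, has volume sⁿ∣detA∣. The sketch below does this for n=3; the matrix, the values of `m` and `s`, and the helper names are assumptions made for the example.

```python
from fractions import Fraction
from itertools import product


def det3(a):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
            - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
            + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))


def tile_corners(m, n):
    """Lower corners of the m**n half-open subcubes of side 1/m tiling [0,1)^n."""
    return [tuple(Fraction(k, m) for k in idx) for idx in product(range(m), repeat=n)]


n, m = 3, 4
corners = tile_corners(m, n)
tile_volume = Fraction(1, m) ** n
total_volume = len(corners) * tile_volume  # the tiles fill the unit cube exactly

A = [[Fraction(2), Fraction(1), Fraction(0)],
     [Fraction(0), Fraction(3), Fraction(1)],
     [Fraction(1), Fraction(0), Fraction(1)]]
s = Fraction(3, 4)
sA = [[s * entry for entry in row] for row in A]

print(len(corners), total_volume)
print(abs(det3(sA)), s ** n * abs(det3(A)))
```

Using `Fraction` keeps every quantity exact, so the two sides of the scaling identity can be compared with equality rather than a tolerance.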
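Step 4: Conclude $\nu(U)=|\det A|\,|U|$ on $\mathcal{B}(\mathbb{R}^n)$. Write $\mu(U)=|\det A|\,|U|$. Both $\nu$ and $\mu$ are σ-finite Borel measures that agree on all half-open cubes. General half-open cubes are not closed under intersection, so we use the dyadic half-open cubes of side length at most $1$: together with $\emptyset$ they form a π-system generating $\mathcal{B}(\mathbb{R}^n)$, because two dyadic cubes are either nested or disjoint. To keep all quantities finite, fix $N\in\mathbb{N}$, let $Q_N=[-N,N)^n$, and define
$$\mathcal{D}_N=\{U\in\mathcal{B}(\mathbb{R}^n) : \nu(U\cap Q_N)=\mu(U\cap Q_N)\}.$$
One checks that $\mathcal{D}_N$ is a λ-system: $\mathbb{R}^n\in\mathcal{D}_N$ by Step 3 applied to $Q_N$; if $U\subseteq V$ are in $\mathcal{D}_N$, then $V\setminus U\in\mathcal{D}_N$ by additivity, since all measures involved are finite; and $\mathcal{D}_N$ is closed under increasing unions by continuity from below. Each dyadic cube $P$ of side at most $1$ lies in $\mathcal{D}_N$, since $P\cap Q_N$ is either $P$ or $\emptyset$. The π-λ theorem therefore gives $\mathcal{B}(\mathbb{R}^n)\subseteq\mathcal{D}_N$ for every $N$. Letting $N\to\infty$ and using continuity from below once more,
$$\nu(U)=|\det A|\,|U|\quad\text{for all }U\in\mathcal{B}(\mathbb{R}^n).$$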
Step 5: Extend to all subsets. For arbitrary U⊆Rn,
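$$|U|=\inf\{|V| : V\supseteq U,\ V\in\mathcal{B}(\mathbb{R}^n)\}.$$
Since $A$ is a homeomorphism, $V\mapsto A(V)$ is a bijection between the Borel sets containing $U$ and the Borel sets containing $A(U)$. Therefore, by Step 4,
$$|A(U)|=\inf\{|A(V)| : V\supseteq U,\ V\in\mathcal{B}(\mathbb{R}^n)\}=|\det A|\,\inf\{|V| : V\supseteq U,\ V\in\mathcal{B}(\mathbb{R}^n)\}=|\det A|\,|U|.$$
■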
[1] J. Serra, Analysis II, ETH Zürich lecture notes.
[2] J. M. Lee, Introduction to Smooth Manifolds, 2nd ed., Grad. Texts in Math., vol. 218, Springer, New York, 2013.
[3] A.-L. Cauchy, “Mémoire sur les fonctions qui ne peuvent obtenir que deux
valeurs égales et de signes contraires par suite des transpositions opérées
entre les variables qu'elles renferment,” 1812.
[4] H. Lebesgue, Intégrale, longueur, aire, Annali di Matematica Pura ed
Applicata, 1902.
[5] C. Carathéodory, Vorlesungen über reelle Funktionen, Teubner, 1918.
[6] A. Haar, “Der Massbegriff in der Theorie der kontinuierlichen Gruppen,”
Annals of Mathematics, 34 (1933), 147–169.
[7] P. R. Halmos, Measure Theory, Springer, 1950.
[8] G. B. Folland, Real Analysis: Modern Techniques and Their Applications,
2nd ed., Wiley, 1999.
[9] G. W. Stewart, “On the Early History of the Singular Value Decomposition,”
SIAM Review, 35(4), 1993, pp. 551–566.
[10] E. Beltrami, “Sulle funzioni bilineari,” Giornale di Matematiche ad Uso
degli Studenti Delle Universita, 11, 1873, pp. 98–106.
[11] C. Jordan, “Mémoire sur les formes bilinéaires,” Journal de Mathématiques
Pures et Appliquées, 19, 1874, pp. 35–54.
[12] C. Eckart and G. Young, “The approximation of one matrix by another of lower
rank,” Psychometrika, 1, 1936, pp. 211–218.
[13] G. H. Golub and C. Reinsch, “Singular Value Decomposition and Least Squares
Solutions,” Numerische Mathematik, 14, 1970, pp. 403–420.