-- jensen.lean
-- @authors: Colin J Fan, Brandon H Gomes, Yasa Syed
/-!
# Jensen's Inequality
In this file, we prove a famous and important inequality. The file is self-contained and
requires no definitions other than those in the Lean core.
We first state the classical Jensen's Inequality:
- Theorem. Let `μ` be a positive measure on a σ-algebra `𝔐` in a set `Ω`, so that
`μ(Ω) = 1`. If `f` is a real function in `L¹(μ)` such that `∀x ∈ Ω, f(x) ∈ (a,b)` and if
`φ` is convex on `(a,b)`, then `φ (∫_Ω f dμ) ≤ ∫_Ω (φ ∘ f) dμ`. More succinctly,
`φ (∫ f) ≤ ∫ (φ ∘ f)`.
The inequality follows from certain facts about integration, real numbers, and convex
functions. The theorem we prove is a strict generalization: we state the theorem in as
abstract a form as possible, using only facts about subtraction and partial orders. We
will define a type-theoretic analogue of integration theory and of the theory of convex
functions. More details are discussed below.
This theorem has some important corollaries for real analysis, measure theory, probability
theory, statistics, and statistical physics:
- Corollary 1 (AM-GM). Fix `n ∈ ℕ`. It follows for any `a₁, a₂, ..., aₙ ∈ ℝ₀` that
`ⁿ√(a₁a₂ ⋯ aₙ) ≤ (a₁ + a₂ + ⋯ + aₙ) / n`.
- Corollary 2 (Young's Inequality). Fix `a b ∈ ℝ₀` and `p q ∈ ℝ` such that `p > 1`, and
`1/p + 1/q = 1`. Then, `ab ≤ a^p / p + b^q / q`.
- Corollary 3 (Hölder's Inequality). Fix `p q ∈ ℝ` such that `p > 1`, and `1/p + 1/q = 1`.
If `X` is a measure space, with measure `μ` and `f`,`g` measurable functions on `X` with
codomain `[0,∞]`, then `∫ fg dμ ≤ (∫ f^p dμ)^(1/p) * (∫ g^q dμ)^(1/q)` where the
integrals are taken over `X`.
- Corollary 4 (Minkowski's Inequality). With the same assumptions as Hölder's Inequality
we have `(∫ (f+g)^p dμ)^(1/p) ≤ (∫ f^p dμ)^(1/p) + (∫ g^p dμ)^(1/p)`. In particular, for
`p > 1`, the map `f ↦ (∫ |f|^p dμ)^(1/p)` is a norm (up to a relevant equivalence
relation) on the space of functions `f` with finite `p`-norm.
- Corollary 5 (Chandler): `e^(E[X]) ≤ E[e^X]`.
- Corollary 6 (Gibbs' Inequality). `D_KL(P||Q) ≥ 0`.
- Corollary 7: Rao-Blackwell Theorem.
Now, we begin our discussion with power sets.
-/
--———————————————————————————————————————————————————————————————————————————————————————--
/--
`Power X` (`𝒫 X`) The powerset of a type.
Given a type `X : 𝒰i` we can consider the type of functions from `X` into a fixed type
universe `𝒰j`.
One can view this type of functions `𝒫 X` as the space of characteristic functions on `X`.
Classically, we can associate any subset `A ⊆ S` to an appropriate characteristic function
`χ_A : S → {0,1}` which takes value `1` on `A` and `0` elsewhere in `S`. This association
is a "bijective correspondence", that is, we can freely change our interpretation of the
subset `A` either as an actual subset or as a characteristic function. In the language of
types there is no natural notion of "subset" but there is a natural (but non-canonical
since it depends on the universe level `j`) notion of a characteristic function.
In general `𝒫 X` behaves like the classical power set of `X`, i.e. a complete lattice:
- Bottom: `⊥ := λ x, 𝟘`
- Top: `⊤ := λ x, 𝟙`
- Union: `P ∪ Q := λ x, P x ∨ Q x`
- Intersection: `P ∩ Q := λ x, P x ∧ Q x`
- Arbitrary Union: `⋃ ℱ := λ x, Σ i, (ℱ i) x`
- Arbitrary Intersection: `⋂ ℱ := λ x, Π i, (ℱ i) x`
where `𝟘` and `𝟙` are the empty type (the type with no terms) and unit type (the type with
one canonical term) respectively. The `⊥` element can be called the "empty set" and the
`⊤` element can be called the "total space" and can be associated to `X` itself.
Classically one can think of the union as the sum of characteristic functions `χ_P + χ_Q`
and the intersection as the product `χ_P * χ_Q`. There is a similar story for the
arbitrary union/intersection.
NB: To know that a term `x : X` is contained in the subset `A : 𝒫 X`, we need a witness of
the form `w : A x`, since `A` is a function and juxtaposition is function application.
We will be using the `Power` construction many times throughout since partial functions
play an important role in the `jensen_inequality` and also in the theory of integration in
general; partial functions only make sense if we have defined the notion of power set.
-/
definition {𝒰i 𝒰j} Power (X : Type 𝒰i)
:= X → Type 𝒰j
notation `𝒫` X := Power X
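/--
`Power.union`, `Power.inter` A small illustration (the names are introduced here only as
a sketch) of the binary lattice operations described above: a witness of membership in
the union is a witness for either component (a sum type `⊕`), and a witness of membership
in the intersection is a pair of witnesses (a product type `×`).
-/
definition {𝒰i 𝒰j} Power.union {X : Type 𝒰i} (P Q : X → Type 𝒰j) : 𝒫 X
:= λ x, P x ⊕ Q x
definition {𝒰i 𝒰j} Power.inter {X : Type 𝒰i} (P Q : X → Type 𝒰j) : 𝒫 X
:= λ x, P x × Q x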
/--
`const b` (`↓b`) The constant function at a point.
Given a type `X` and a pointed type `⟨Y, b⟩`, we can consider the constant function which
takes every point of `X` to the basepoint `b`.
Such functions are important for the `jensen_inequality`, and are also important in the
study of pointed spaces in general.
-/
definition {𝒰x 𝒰y} const {X : Type 𝒰x} {Y : Type 𝒰y}
:= λ b:Y, (λ x:X, b)
notation `↓`:max y:max := const y
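/-
As a quick sanity check of the notation, a constant function returns its basepoint
definitionally:
-/
example {X Y : Type} (b : Y) (x : X) : (↓b : X → Y) x = b
:= rfl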
section difference_domain --—————————————————————————————————————————————————————————————--
universes 𝒰y
variables (Y : Type 𝒰y)
/--
`DifferenceDomain Y extends has_zero Y, has_sub Y` A good place to do subtraction.
A `DifferenceDomain` is a minimal structure that admits a version of subtraction like the
one we are familiar with from abelian groups. The main property that distinguishes
subtraction is that the image of the diagonal always vanishes. This property is called
`vanishing_diagonal` below.
There are other axioms which would make sense to add like the following two:
- `zero_is_right_id : Π y, y - 0 = y`
- `sub_associativity : Π a b c, a - (b - c) = (a - b) - (0 - c)`
which make the subtraction more like the inverse of some addition operation like
`a + b := a - (0 - b)`, but such details are not considered here. For the
`jensen_inequality`, we need only the `vanishing_diagonal` property.
-/
class DifferenceDomain extends has_zero Y, has_sub Y
:= (vanishing_diagonal : Π y, y - y = zero)
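/--
As a minimal sanity check (assuming the core lemma `nat.sub_self` and the core instances
`nat.has_zero` and `nat.has_sub`), the natural numbers with truncated subtraction form a
`DifferenceDomain`.
-/
instance nat_difference_domain : DifferenceDomain ℕ
:= ⟨nat.has_zero, nat.has_sub, nat.sub_self⟩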
/--
`OrderedDifferenceDomain Y extends has_le Y, DifferenceDomain Y`
We will need an order structure on `Y` to state and prove the `jensen_inequality` so we
add it here. There are no assumptions on the behavior of `≤` like reflexivity/transitivity.
-/
class OrderedDifferenceDomain
extends has_le Y, DifferenceDomain Y
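/--
Continuing the sketch above: adding the usual order on `ℕ` (assuming the core instance
`nat.has_le`) gives an `OrderedDifferenceDomain`.
-/
instance nat_ordered_difference_domain : OrderedDifferenceDomain ℕ
:= ⟨nat.has_le, nat_difference_domain⟩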
end difference_domain --—————————————————————————————————————————————————————————————————--
section reduction --—————————————————————————————————————————————————————————————————————--
universes 𝒰y 𝒰x
variables {Y : Type 𝒰y} {X : Type 𝒰x} (ℱ : 𝒫 (X → Y))
/--
`Reduction` A generalized functional.
Given types `X` and `Y`, a `Reduction` is a partial function from the type of functions
`X → Y` to the type `Y`. The domain of this function is denoted `ℱ` throughout. One reads
the definition of a reduction as "a function which takes a function `f : X → Y` and a
proof that `f` is contained in the family `ℱ` (`p : ℱ f`, see `Power` above), and sends
this pair to a term of type `Y`".
Examples of reductions:
- evaluation at a point (`x₀ : X`, `eval : (X → Y) → Y := λ f, f x₀`)
(`ℱ` can be taken to be the total or "everywhere true" predicate)
- integration (`∫_Ω (—) dμ : (X → Y) → Y`)
(`ℱ` is the appropriate subset corresponding to the integrable functions on `Ω`)
- limit of a sequence (`lim : (ℕ → R) → R`)
(`ℱ` is the appropriate subset corresponding to the convergent sequences)
For the `jensen_inequality` we will be using something like the second example as we want
to imitate a weak kind of integration.
For a fixed reduction we call a function `f` *reducible* if it is contained in the family
corresponding to that reduction, i.e. `reducible f := ℱ f`. We say that a reduction
*admits (a reduction of) a function `f`* if `f` is reducible with respect to the family
corresponding to the given reduction.
-/
definition Reduction
:= Π f, ℱ f → Y
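/--
`eval_at x₀` Evaluation at a point, as a reduction.
For illustration (the name `eval_at` is introduced here only as a sketch), the first
example in the list above can be written down directly: taking `ℱ` to be the "everywhere
true" family, encoded by assigning the unit type to every function, evaluation at a fixed
point `x₀ : X` is a reduction.
-/
definition eval_at (x₀ : X) : Reduction (λ f : X → Y, unit)
:= λ f _, f x₀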
/--
`left_closed_at ℱ f g` Left closedness of `g` at `f`.
We say that a function `g : Y → Y` is *left closed at `f` with respect to `ℱ`* when the
reducibility of `f` implies the reducibility of the composition `g ∘ f` with respect to
the fixed family `ℱ`.
-/
definition left_closed_at (f : X → Y) (g : Y → Y)
:= ℱ f → ℱ (g ∘ f)
/--
`left_closed ℱ g` Left closedness of `g`.
We say that a function `g : Y → Y` is *left closed with respect to `ℱ`* when it is left
closed at every function `f : X → Y`.
-/
definition left_closed (g : Y → Y)
:= Π f, left_closed_at ℱ f g
/--
`PointFamily` Family of functions which contains all constant functions.
A reasonable reduction family might admit constant functions as a trivial case of some
more interesting property.
-/
class PointFamily
:= (has_constants : Π y, ℱ ↓y)
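/--
For illustration, the "everywhere true" family used by `eval_at` above admits every
function, so in particular it admits every constant function and is a `PointFamily`.
-/
instance trivial_point_family : PointFamily (λ f : X → Y, unit)
:= ⟨λ y, unit.star⟩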
section subtraction --———————————————————————————————————————————————————————————————————--
/--
`pointwise_subtraction` Lifting codomain subtraction to pointwise subtraction.
If `Y` has a subtraction structure then for any type `X`, the function space `X → Y` has a
canonical pointwise subtraction. This `instance` is added to simplify the notation below.
-/
instance pointwise_subtraction [has_sub Y] : has_sub (X → Y)
:= ⟨λ f g, (λ x, f x - g x)⟩
/--
`DifferenceFamily` Family of functions closed under pointwise subtraction.
-/
class DifferenceFamily [has_sub Y]
:= (closure : Π f g (fℱ : ℱ f) (gℱ : ℱ g), ℱ (f - g))
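/--
For illustration, the "everywhere true" family is closed under pointwise subtraction for
free: given any subtraction on `Y`, it is a `DifferenceFamily`.
-/
instance trivial_difference_family [has_sub Y] : DifferenceFamily (λ f : X → Y, unit)
:= ⟨λ f g fℱ gℱ, unit.star⟩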
--———————————————————————————————————————————————————————————————————————————————————————--
/--
We fix `Int : Reduction ℱ`, which will be written symbolically as `∫` to connote
something like an integral. This reduction will be used below.
-/
variables (Int : Reduction ℱ)
/--
`left_factors ℱ β lc_β` Left factorizability of `β`.
We say that a left closed function `β` *left factors with respect to `Int`* when for every
reducible `f` we have the identity `∫ (β ∘ f) = β (∫ f)`.
-/
definition left_factors (β : Y → Y) (lc_β : left_closed ℱ β)
:= Π f (fℱ : ℱ f), Int (β ∘ f) (lc_β f fℱ) = β (Int f fℱ)
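/-
For illustration, with respect to evaluation at a point (`eval_at` above) every left
closed `β` left factors, since both sides of the identity compute to `β (f x₀)`.
-/
example (x₀ : X) (β : Y → Y) (lc_β : left_closed (λ f : X → Y, unit) β)
  : left_factors (λ f : X → Y, unit) (eval_at x₀) β lc_β
:= λ f fℱ, rfl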
--———————————————————————————————————————————————————————————————————————————————————————--
/-- For the rest of this section we assume `ℱ` is a `PointFamily`. -/
variables [PointFamily ℱ]
/--
`UnitalReduction` A reduction which is friendly to constant functions.
We say that a reduction is unital if it admits all constant functions and that the
reduction of a constant function is the constant which defines it, i.e. `∫ ↓y = y`.
-/
class UnitalReduction
:= (constant_reduction : Π y, Int ↓y (PointFamily.has_constants ℱ y) = y)
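/-
For illustration, evaluation at a point is a unital reduction: reducing a constant
function `↓y` by evaluating it anywhere returns `y` definitionally.
-/
example (x₀ : X) : UnitalReduction (λ f : X → Y, unit) (eval_at x₀)
:= ⟨λ y, rfl⟩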
--———————————————————————————————————————————————————————————————————————————————————————--
/-- For the rest of this section we assume `Y` has a `≤` structure. -/
variables [has_le Y]
/--
`pointwise_le` Lifting codomain order to pointwise order.
If `Y` has an order structure `≤` then for any type `X`, the function space `X → Y` has a
canonical pointwise order. This `instance` is added to simplify the notation below.
-/
instance pointwise_le : has_le (X → Y)
:= ⟨λ f g, (Π x, f x ≤ g x)⟩
/--
`MonotonicReduction Y` A reduction which is functorial over `≤`.
We can also make a reduction functorial over the order structure on the space of
functions `X → Y` (restricted to `ℱ`) by requiring that `Int` be a monotonic operator.
-/
class MonotonicReduction
:= (monotonicity : Π f g {fℱ gℱ}, (f ≤ g) → (Int f fℱ ≤ Int g gℱ))
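/-
For illustration, evaluation at a point is a monotonic reduction: a pointwise inequality
`f ≤ g` specializes at `x₀` to an inequality between the two reductions.
-/
example (x₀ : X) : MonotonicReduction (λ f : X → Y, unit) (eval_at x₀)
:= ⟨λ f g fℱ gℱ le_f_g, le_f_g x₀⟩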
--———————————————————————————————————————————————————————————————————————————————————————--
/--
For the rest of this section we assume `Y` has a zero element and subtraction structure
and that `ℱ` is a `DifferenceFamily`. We want to consider a reduction that interacts with
a subtraction structure on its codomain.
-/
variables [has_zero Y] [has_sub Y] [DifferenceFamily ℱ]
/--
`constant_difference_property` A weak homomorphism property.
For a reduction compatible with subtraction, we would like to have that reductions are
homomorphisms of difference domains, but this turns out to be too strong a property. In
the case of classical integration we do not, in general, have the property
`∫ (f - g) = (∫ f) - (∫ g)`, since both integrals on the right may take the value `±∞`,
so the subtraction is not well defined. We instead use the weaker property that we can
commute with subtraction of a constant function.
-/
definition constant_difference_property
:= Π f k {fℱ},
Int (f - ↓k) (DifferenceFamily.closure f ↓k fℱ (PointFamily.has_constants ℱ k))
= Int f fℱ - Int ↓k (PointFamily.has_constants ℱ k)
/--
`translation_invariance_property` A weak translation property across inequalities.
For a reduction compatible with subtraction, we would like to capture the following
property of the classical integral: if the integral of the difference `f - g` lies in
the positive cone, then `∫ g ≤ ∫ f`.
-/
definition translation_invariance_property
:= Π f g {fℱ gℱ}, 0 ≤ Int (g - f) (DifferenceFamily.closure g f gℱ fℱ)
→ Int f fℱ ≤ Int g gℱ
/--
`TranslativeReduction` A reduction which is compatible with subtraction.
We call a reduction a `TranslativeReduction` when it satisfies the
`constant_difference_property` and the `translation_invariance_property` defined above.
-/
class TranslativeReduction
:= (constant_difference : constant_difference_property ℱ Int)
(translation_invariance : translation_invariance_property ℱ Int)
end subtraction --———————————————————————————————————————————————————————————————————————--
end reduction --—————————————————————————————————————————————————————————————————————————--
section subdifferential --———————————————————————————————————————————————————————————————--
universes 𝒰y
variables {Y : Type 𝒰y} [has_zero Y] [has_sub Y] [has_le Y]
(φ : Y → Y) (t : Y) (𝒩 : 𝒫 Y)
/--
`SubDifferential φ t 𝒩` The subdifferential property.
The primary role of convexity in the classical inequality is the following. Given a convex
function `φ` on `(a,b)` we know that for any `s,t,u ∈ (a,b)` such that `s < t < u`, it
follows that
`(φ t - φ s) / (t - s) ≤ (φ u - φ t) / (u - t)`
From this we can deduce the existence of the supremum
`β := sup_{s ∈ (a,t)} (φ t - φ s) / (t - s)`
which then gives us the following inequality:
`β ⬝ (s - t) ≤ φ s - φ t`
To generalize this to a domain of discourse that does not have convex functions, we
instead accept the last inequality by hypothesis and generalize multiplication by `β` from
the left to function application. To ensure that this function application behaves well
with zero, we require that it vanish there, which is analogous to the fact that
`y ⬝ 0 = 0` in the real numbers.
Since the inequality above holds only in `(a,b)` in the classical case, not on all of `ℝ`,
we need to restrict the inequality to a specific subset of `Y`. We call this subset `𝒩`.
-/
structure SubDifferential
:= (map : Y → Y)
(root_at_zero : map 0 = 0)
(lower_bound_property : Π s, 𝒩 s → map (s - t) ≤ (φ s) - (φ t))
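/-
As a minimal sanity check (over `ℕ`, assuming the core lemma `nat.le_refl`), the identity
map is a subdifferential of the identity function at any point and with respect to any
subset, since both sides of the lower bound reduce to `s - t₀`.
-/
example (t₀ : ℕ) (𝒩₀ : ℕ → Type) : SubDifferential (id : ℕ → ℕ) t₀ 𝒩₀
:= { map := id,
     root_at_zero := rfl,
     lower_bound_property := λ s h, nat.le_refl _ }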
--———————————————————————————————————————————————————————————————————————————————————————--
universes 𝒰x
variables {X : Type 𝒰x} (ℱ : 𝒫 (X → Y)) (Int : Reduction ℱ)
/--
`LeftFactorSubDifferential φ t 𝒩` A left factorizable subdifferential.
If `β` is a subdifferential corresponding to `φ`, `t`, and `𝒩`, then we say that it is a
*left factor subdifferential* when its map is left closed with respect to `ℱ` and left
factors with respect to the reduction `Int`.
-/
structure LeftFactorSubDifferential extends SubDifferential φ t 𝒩
:= (is_left_closed : left_closed ℱ map)
(left_factors : left_factors ℱ Int map is_left_closed)
end subdifferential --———————————————————————————————————————————————————————————————————--
section jensen_inequality --—————————————————————————————————————————————————————————————--
universes 𝒰y 𝒰x
variables {Y : Type 𝒰y} [OrderedDifferenceDomain Y]
{X : Type 𝒰x}
(ℱ : 𝒫 (X → Y)) [PointFamily ℱ] [DifferenceFamily ℱ]
(Int : Reduction ℱ)
/--
`JensenReduction` A reduction which is strong enough to prove Jensen's Inequality.
We have the following properties inherited from the individual structures:
- `UnitalReduction`
- `∫ ↓t = t`
- `MonotonicReduction`
- `(Π x, f x ≤ g x) → (∫ f ≤ ∫ g)`
- `TranslativeReduction`
- `∫ (f - ↓k) = (∫ f) - (∫ ↓k)`
- `(0 ≤ ∫ (g - f)) → (∫ f ≤ ∫ g)`
-/
class JensenReduction
extends UnitalReduction ℱ Int,
MonotonicReduction ℱ Int,
TranslativeReduction ℱ Int
/--
`jensen_inequality` Jensen's Inequality (`φ (∫ f) ≤ ∫ (φ ∘ f)`).
The theorem exactly generalizes the classical Jensen inequality and so it is sufficient to
prove that the measure-theoretic integral is a `JensenReduction` and that convex functions
have the `LeftFactorSubDifferential` property to apply the following proof to the
classical case. Proof sketches for these two facts are discussed above.
NB: In this proof, we will be working to find a term which has the type of the goal
instead of modifying the goal directly, so it is not necessary to look at the goals.
Instead, follow the proof by reading the commentary, which describes the changes to the
main term `inequality` that will satisfy the goal in the end. For simplicity, the
reducibility proofs are left unspecified while following the main term.
-/
theorem jensen_inequality
-- Proof that Int is a nice reduction.
[JensenReduction ℱ Int]
-- A reducible function f.
(f : X → Y) (fℱ : ℱ f)
-- A distinguished superset of the image of f.
(𝒩 : 𝒫 Y) (image_contained_in_𝒩 : Π x, 𝒩 (f x))
-- A function φ which is left closed at f and has a LeftFactorSubDifferential at the
-- reduction of f with the above distinguished superset 𝒩.
(φ : Y → Y) (φfℱ : ℱ (φ ∘ f))
(subdifferential : LeftFactorSubDifferential φ (Int f fℱ) 𝒩 ℱ Int)
-- From above, the Jensen Inequality follows:
: φ (Int f fℱ) ≤ Int (φ ∘ f) φfℱ
:= begin -- We begin by introducing some relevant variables:
-- The reduction of f.
let t := Int f fℱ,
-- The function F := λ x, f x - t and its proof of reducibility.
let F := f - ↓t,
let Fℱ := DifferenceFamily.closure f ↓t _ _,
-- The subderivative of φ.
let β := subdifferential.map,
/-
Π s, 𝒩 s → β (s - t) ≤ (φ s) - (φ t)
—————————————————————————————————————— image_contained_in_𝒩
Π x, β (F x) ≤ (φ (f x)) - (φ t)
To begin the proof, we use the fact that the subdifferential has the lower bound
property inside of 𝒩 which contains the image of f, so we have that the inequality
holds for all points of X. This is the main term to follow throughout the proof.
-/
let inequality
:= λ x, subdifferential.lower_bound_property (f x) (image_contained_in_𝒩 x),
/-
Π x, β (F x) ≤ (φ (f x)) - (φ t)
————————————————————————————————— monotonicity
∫ β ∘ F ≤ ∫ (φ ∘ f - ↓(φ t))
Next, we use the fact that the reduction is monotonic to "integrate" both sides.
-/
let inequality
:= MonotonicReduction.monotonicity Int (β ∘ F) (φ ∘ f - ↓(φ t)) inequality,
/-
∫ β ∘ F ≤ ∫ (φ ∘ f - ↓(φ t))
————————————————————————————— left_factors
β (∫ F) ≤ ∫ (φ ∘ f - ↓(φ t))
Here we use the fact that the subdifferential left factors with respect to the
reduction family.
-/
rewrite subdifferential.left_factors F Fℱ at inequality,
/-
β (∫ F) ≤ ∫ (φ ∘ f - ↓(φ t))
———————————————————————————————————————— constant_difference
β ((∫ f) - (∫ ↓t)) ≤ ∫ (φ ∘ f - ↓(φ t))
Now we use the fact that our reduction has the constant difference property to
distribute the reduction over the difference of functions.
-/
rewrite TranslativeReduction.constant_difference Int at inequality,
/-
β ((∫ f) - (∫ ↓t)) ≤ ∫ (φ ∘ f - ↓(φ t))
———————————————————————————————————————— constant_reduction
β ((∫ f) - t) ≤ ∫ (φ ∘ f - ↓(φ t))
Since our reduction is unital we can replace the reduction of the constant function
with the defining constant.
-/
rewrite UnitalReduction.constant_reduction Int at inequality,
/-
β ((∫ f) - t) ≤ ∫ (φ ∘ f - ↓(φ t))
——————————————————————————————————— vanishing_diagonal
β 0 ≤ ∫ (φ ∘ f - ↓(φ t))
Since
t := ∫ f
we have that,
(∫ f) - t := (∫ f) - (∫ f)
so we can cancel like terms using the fact that Y is a difference domain.
-/
rewrite DifferenceDomain.vanishing_diagonal at inequality,
/-
β 0 ≤ ∫ (φ ∘ f - ↓(φ t))
————————————————————————— root_at_zero
0 ≤ ∫ (φ ∘ f - ↓(φ t))
To simplify the left hand side of the inequality we use the fact that the
subderivative has a root at zero.
-/
rewrite subdifferential.root_at_zero at inequality,
/-
0 ≤ ∫ (φ ∘ f - ↓(φ t))
——————————————————————— translation_invariance
∫ ↓(φ t) ≤ ∫ (φ ∘ f)
To convert the nonnegativity of the reduced difference into an inequality between
reductions, we use the translation invariance property, because our reduction is
translative.
-/
let inequality
:= TranslativeReduction.translation_invariance Int ↓(φ t) (φ ∘ f) inequality,
/-
∫ ↓(φ t) ≤ ∫ (φ ∘ f)
————————————————————— constant_reduction
φ t ≤ ∫ (φ ∘ f)
Again we use the fact that our reduction is unital to pull out the constant on the
left hand side.
-/
rewrite UnitalReduction.constant_reduction Int at inequality,
/-
φ t ≤ ∫ (φ ∘ f)
———————————————————— definition
φ (∫ f) ≤ ∫ (φ ∘ f)
And now the proof is complete. The inequality has the correct type to satisfy the main
goal. All other goals can be deduced canonically by the proof assistant since the
relevant proof terms are in the given context.
-/
exact inequality,
end -- □
end jensen_inequality --—————————————————————————————————————————————————————————————————--