Capacity of Neural Networks (5): Sauer’s Lemma

1 minute read

In this note we prove the Sauer’s lemma which plays the key role in establishing the connection between VC-dimension and Rademacher complexity. We use the proof in Section 8.3 of (Vershynin, 2017). There are two steps in the proof. First, we prove Pajor’s lemma that for any boolean function class defined on a finite set , its cardinality is bounded by

Then if the VC-dimension of is and , the maximum number of shattered subsets of is

Now let’s use induction to prove Pajor’s lemma. If , then there are either 1 or 2 boolean functions defined on . If there is 1 function, then no subset can be shattered. We take the convention that . If there are 2 functions, then all subsets can be shattered, so the cardinality of shattered subsets is 2. Assuming the conclusion holds for , for an with points, we take out one point and denote the rest points by . Then the function class can be splitted into two disjoint subsets and according to their values at . Denote the class of subsets shattered by in by and , respectively. By induction hypothesis, and . Since , if we can show that , then we are done. For the subsets in , they can be shattered by ; for the subsets in , they can be shattered by . For the subsets in , we consider and it can be shattered by because if the label of is 0, we can find a function in to shatter it and if the label of is 1, we can find a function in to shatter it.

Reference

  1. Vershynin, R. (2017). High Dimensional Probability.