"Rate-distortion optimal approximation with deep ReLU networks"
Elbrächter, Dennis

We will present some neural network approximation results with an information-theoretic flavour. Specifically, we will require that a network of "size" M can, canonically, be represented as a bit-string of length M (up to polylogarithmic factors). This is based on adapting David Donoho's notion of "best M-term approximation rate under polynomial depth search constraint" from the dictionary setting to the neural network setting. In particular, this allows us to compare the asymptotic expressivity of ReLU neural networks to that of a wide range of dictionaries, and thereby to establish optimality for various classically studied function classes.
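As a rough illustration of the encoding idea (not the construction from the talk), the following sketch quantizes the weights of a toy two-layer ReLU network to b bits each, so that the whole network is describable by roughly M·b bits, where M is the number of parameters; all names and the uniform quantizer are illustrative assumptions.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def quantize(w, bits, w_max=1.0):
    # Uniform quantization of weights in [-w_max, w_max] to 2**bits levels,
    # so each weight is encodable with `bits` bits (illustrative choice).
    levels = 2 ** bits
    step = 2 * w_max / (levels - 1)
    return np.round((w + w_max) / step) * step - w_max

rng = np.random.default_rng(0)
# Toy two-layer ReLU network; M counts all weights and biases.
W1 = rng.uniform(-1, 1, (8, 1)); b1 = rng.uniform(-1, 1, 8)
W2 = rng.uniform(-1, 1, (1, 8)); b2 = rng.uniform(-1, 1, 1)
M = W1.size + b1.size + W2.size + b2.size

bits = 8  # precision per parameter; description length ~ M * bits
Wq1, bq1 = quantize(W1, bits), quantize(b1, bits)
Wq2, bq2 = quantize(W2, bits), quantize(b2, bits)

x = np.linspace(-1, 1, 100).reshape(1, -1)
exact = W2 @ relu(W1 @ x + b1[:, None]) + b2[:, None]
coded = Wq2 @ relu(Wq1 @ x + bq1[:, None]) + bq2[:, None]
print("description length:", M * bits, "bits")
print("max deviation from unquantized network:", np.max(np.abs(exact - coded)))
```

Increasing `bits` shrinks the deviation while the description length grows only linearly in M, which is the trade-off the rate-distortion framework makes precise.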