We study how the learnability of a target function by deep networks depends on the ability of simpler classes to approximate that target. A necessary condition for a function to be learnable by gradient descent on deep neural networks is that it can be approximated, at least in a weak sense, by shallow neural networks. We also show that a class of functions can be learned by an efficient statistical query algorithm if and only if it can be approximated in a weak sense by some kernel class. We give several examples of functions which demonstrate depth separation from simpler classes such as shallow networks or kernel classes, and conclude that they cannot be learned efficiently, even by a hypothesis class that can efficiently approximate them.

**Author(s):** Eran Malach, Gilad Yehudai, Shai Shalev-Shwartz, Ohad Shamir

**Links:** PDF - Abstract

Keywords: networks - approximate - conclude - learned - depth