Training Fully Connected Neural Networks is ∃R-Complete

9 mai 22

9th May 2022, at 11:00 (morning)

Room: A711

(Hybrid) Cliquez ici pour participer à la réunion

Speaker: Tillman Miltzow, Utrecht University (https://sites.google.com/view/miltzow/home)

Title: Training Fully Connected Neural Networks is ∃R-Complete

Abstract:

We consider the algorithmic problem of finding the optimal weights and biases for a two layer fully connected neural network to fit a given set of data points. This problem is known as "empirical risk minimization" in the machine learning community.

We show that the problem is ∃R-complete. This complexity class can be defined as the set of algorithmic problems that are polynomial-time equivalent to finding real roots of a polynomial with integer coefficients. Our results hold even if the following restrictions are all added simultaneously.
- There are exactly two output neurons.
- There are exactly two input neurons.
- The data has only 13 different labels.
-The number of hidden neurons is a constant fraction of the number of data points.
- The target training error is zero.
- The \ReLU activation function is used.

This shows that even very simple networks are difficult to train. The result offers an explanation (though far from a complete understanding) on why only gradient descent is widely successful in training neural networks in practice. We generalize a recent result by Abrahamsen, Kleist and Miltzow [NeurIPS 2021].

This result falls into a recent line of research that tries to unveil that a series of central algorithmic problems from widely different areas of computer science and mathematics are ∃R-complete: This includes the art gallery problem [STOC 2018], geometric packing [FOCS 2020], covering polygons with convex polygons [FOCS 2021], and continuous constraint satisfaction problems [FOCS 2021].

Joint work with Daniel Bertschinger, Christoph Hertrich, Paul Jungeblut, Tillmann Miltzow, and Simon Weber