9th May 2022, at 11:00 (morning)
Speaker: Tillman Miltzow, Utrecht University (https://sites.google.com/view/miltzow/home)
Title: Training Fully Connected Neural Networks is ∃R-Complete
We consider the algorithmic problem of finding the optimal weights and biases for a two layer fully connected neural network to fit a given set of data points. This problem is known as "empirical risk minimization" in the machine learning community.
We show that the problem is ∃R-complete. This complexity class can be defined as the set of algorithmic problems that are polynomial-time equivalent to finding real roots of a polynomial with integer coefficients. Our results hold even if the following restrictions are all added simultaneously.
- There are exactly two output neurons.
- There are exactly two input neurons.
- The data has only 13 different labels.
-The number of hidden neurons is a constant fraction of the number of data points.
- The target training error is zero.
- The \ReLU activation function is used.
This shows that even very simple networks are difficult to train. The result offers an explanation (though far from a complete understanding) on why only gradient descent is widely successful in training neural networks in practice. We generalize a recent result by Abrahamsen, Kleist and Miltzow [NeurIPS 2021].
This result falls into a recent line of research that tries to unveil that a series of central algorithmic problems from widely different areas of computer science and mathematics are ∃R-complete: This includes the art gallery problem [STOC 2018], geometric packing [FOCS 2020], covering polygons with convex polygons [FOCS 2021], and continuous constraint satisfaction problems [FOCS 2021].
Joint work with Daniel Bertschinger, Christoph Hertrich, Paul Jungeblut, Tillmann Miltzow, and Simon Weber