Notes on the Symmetries of 2-Layer ReLU-Networks
Symmetries in neural networks allow different weight configurations to realize the same network function. For odd activation functions, the set of transformations mapping between such configurations has been studied extensively, but less is known for neural networks with ReLU activation functions. We give a complete characterization for fully-connected networks with two layers. Apart from two well-known transformations, only degenerate situations allow additional transformations that leave the network function unchanged. Reduction steps can remove only some of the degenerate cases. Finally, we present a non-degenerate situation for deep neural networks that leads to new transformations leaving the network function intact.
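The abstract does not name the two well-known transformations. Assuming they are the ones standard in the ReLU literature, namely permuting the hidden units and rescaling a unit's incoming weights by a positive factor while dividing its outgoing weight by the same factor, the following minimal sketch checks numerically that both leave a two-layer network function unchanged. The network form f(x) = sum_i v_i * relu(w_i . x + b_i) + c and all names (network, permute, rescale) are illustrative assumptions, not notation taken from the paper.

```python
import random

def relu(z):
    return z if z > 0.0 else 0.0

def network(x, W, b, v, c):
    """Assumed two-layer form: f(x) = sum_i v[i] * relu(W[i] . x + b[i]) + c."""
    hidden = [relu(sum(w * xj for w, xj in zip(row, x)) + bi)
              for row, bi in zip(W, b)]
    return sum(vi * hi for vi, hi in zip(v, hidden)) + c

def permute(W, b, v, perm):
    """Reorder hidden units; perm[k] is the old index of the new k-th unit."""
    return [W[i] for i in perm], [b[i] for i in perm], [v[i] for i in perm]

def rescale(W, b, v, scales):
    """Scale unit i's incoming weights and bias by scales[i] > 0 and
    divide its outgoing weight by the same factor."""
    assert all(s > 0 for s in scales), "only positive factors commute with ReLU"
    W2 = [[s * w for w in row] for row, s in zip(W, scales)]
    b2 = [s * bi for bi, s in zip(b, scales)]
    v2 = [vi / s for vi, s in zip(v, scales)]
    return W2, b2, v2

# Random illustrative parameters (3 inputs, 4 hidden units).
rng = random.Random(0)
n_in, n_hidden = 3, 4
W = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_hidden)]
b = [rng.uniform(-1, 1) for _ in range(n_hidden)]
v = [rng.uniform(-1, 1) for _ in range(n_hidden)]
c = 0.5

Wp, bp, vp = permute(W, b, v, [2, 0, 3, 1])
Ws, bs, vs = rescale(W, b, v, [0.5, 2.0, 3.0, 10.0])

# Compare the original and transformed networks on random inputs.
inputs = [[rng.uniform(-2, 2) for _ in range(n_in)] for _ in range(100)]
max_diff = max(
    max(abs(network(x, W, b, v, c) - network(x, Wp, bp, vp, c)),
        abs(network(x, W, b, v, c) - network(x, Ws, bs, vs, c)))
    for x in inputs
)
print("largest deviation:", max_diff)
```

The rescaling relies on the positive homogeneity of ReLU, relu(s * z) = s * relu(z) for s > 0, so any deviation reported should be at the level of floating-point rounding. A numerical check like this only illustrates the two symmetries; it says nothing about the degenerate cases the paper characterizes.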
Francesca Albertini, Eduardo D. Sontag, and Vincent Maillot. Uniqueness of weights for neural networks. In Artificial Neural Networks with Applications in Speech and Vision, pages 115–125. Chapman and Hall, 1993.
Julius Berner, Dennis Elbrächter, and Philipp Grohs. How degenerate is the parametrization of neural networks with the relu activation function? Proceedings of the Thirty-third Conference on Neural Information Processing Systems (NeurIPS), 2019.
An Mei Chen, Hawminn Lu, and Robert Hecht-Nielsen. On the geometry of feedforward neural network error surfaces. Neural computation, 5(6):910–927, 1993.
Laurent Dinh, Razvan Pascanu, Samy Bengio, and Yoshua Bengio. Sharp minima can generalize for deep nets. In Proceedings of the 34th International Conference on Machine Learning, Volume 70, pages 1019–1028. JMLR.org, 2017.
Vera Kurková and Paul C Kainen. Functionally equivalent feedforward neural networks. Neural Computation, 6(3):543–558, 1994.
Behnam Neyshabur, Ruslan R Salakhutdinov, and Nati Srebro. Path-sgd: Path-normalized optimization in deep neural networks. In Advances in Neural Information Processing Systems, pages 2422–2430, 2015.
Mary Phuong and Christoph H. Lampert. Functional vs. parametric equivalence of relu networks. International Conference on Learning Representations 2020, 2020.
David Rolnick and Konrad P. Kording. Identifying weights and architectures of unknown relu networks. arXiv preprint, arXiv:1910.00744v1, 2019.
Héctor J Sussmann. Uniqueness of the weights for minimal feedforward nets with a given input-output map. Neural networks, 5(4):589–593, 1992.
Copyright (c) 2020 Henning Petzka
This work is licensed under a Creative Commons Attribution 4.0 International License.