Mutual information estimation for graph convolutional neural networks




mutual information, graph convolutional neural networks, inductive bias


Measuring model performance is a key issue for deep learning practitioners. However, we often lack the ability to explain why a specific architecture attains superior predictive accuracy for a given data set. Validation accuracy is commonly used as a performance heuristic that quantifies how well a network generalizes to unseen data. Mutual information can be used as a measure of the quality of internal representations in deep learning models, and the information plane provides insight into whether the model exploits the information available in the data.

The information plane has previously been explored for fully connected neural networks and convolutional architectures. We present an architecture-agnostic method for tracking a network's internal representations during training; these representations are then used to construct the information plane. The method is exemplified for a graph convolutional neural network trained on the Cora citation data set. We compare how the inductive bias introduced by the graph convolutional architecture changes the information plane relative to a fully connected neural network.
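
To illustrate what one point in the information plane involves, the following is a minimal sketch that estimates I(X;T) and I(T;Y) for a single layer's recorded activations T, given class labels Y. It assumes a simple equal-width binning estimator and a deterministic network with distinct inputs, so that I(X;T) reduces to H(T); the estimator actually used in this work may differ. The function names and the `n_bins` parameter are hypothetical and chosen only for illustration.

```python
import numpy as np


def discrete_entropy(symbols):
    """Shannon entropy (in bits) of a 1-D array of discrete symbols."""
    _, counts = np.unique(symbols, return_counts=True)
    p = counts / counts.sum()
    return float(-np.sum(p * np.log2(p)))


def information_plane_point(activations, labels, n_bins=30):
    """Estimate (I(X;T), I(T;Y)) for one layer at one training step.

    activations: array of shape (n_samples, n_units), the recorded layer output.
    labels:      array of shape (n_samples,), the class label of each sample.
    n_bins:      number of equal-width bins (an assumption of this sketch).
    """
    activations = np.asarray(activations, dtype=float)
    labels = np.asarray(labels)

    # Discretize every activation into equal-width bins over the observed range.
    edges = np.linspace(activations.min(), activations.max(), n_bins + 1)
    binned = np.digitize(activations, edges[1:-1])

    # Treat each binned activation vector (row) as one discrete symbol.
    _, t = np.unique(binned, axis=0, return_inverse=True)
    t = t.reshape(-1)

    h_t = discrete_entropy(t)

    # H(T|Y): entropy of T within each class, weighted by class frequency.
    h_t_given_y = 0.0
    for y in np.unique(labels):
        mask = labels == y
        h_t_given_y += mask.mean() * discrete_entropy(t[mask])

    # Assumption: T is a deterministic function of X and inputs are distinct,
    # so H(T|X) = 0 and I(X;T) = H(T). I(T;Y) = H(T) - H(T|Y).
    return h_t, h_t - h_t_given_y
```

Calling such a function once per layer at each recorded training step yields the trajectories that make up the information plane. Because it only needs the stored activations and labels, the same routine could be applied to node representations from a graph convolutional layer or to hidden units of a fully connected network.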