Recent advances in machine learning have led to more accurate software-based predictions by leveraging the vast amounts of data that are currently produced, stored, and analyzed by modern computers. Many of these advances are due to deep learning, in which the available data is used to train artificial neural networks whose layers of artificial neurons sequentially analyze every input to identify increasingly complex relationships. When the prediction task is more challenging or the predictions need to be more accurate, considerably larger networks are trained with dedicated hardware such as general-purpose Graphics Processing Units (GPUs). Nevertheless, individuals and organizations with more constrained resources may not have access to GPUs, and those larger networks may not fit on embedded systems such as Internet of Things (IoT) and mobile devices. On the one hand, there are many inexact pruning methods for reducing the size of a neural network after training. These methods may reduce accuracy, affect the robustness of the network when it is applied to slightly modified data, and lead to fairness issues because the effect of pruning is uneven and disproportionately affects groups that are underrepresented in the data. On the other hand, the relationships that trained neural networks represent are often not as complex as they could potentially be, which implies that it is possible to obtain smaller neural networks representing the same relationships and hence to avoid the side effects of conventional pruning methods. This project aims to improve our understanding of what neural networks can represent and how they can be exactly compressed for more efficient use.

This project aims to develop exact neural network compression algorithms and to investigate the relationship between network expressiveness, measured by the number of linear regions, and network compressibility by leveraging polyhedral theory and discrete optimization techniques. Our primary goal is to develop faster and more scalable algorithms that identify network modifications with limited or no effect on the model represented by a trained neural network. Secondarily, we aim to identify theoretical connections between representability and compressibility and to develop more efficient methods for measuring the number of linear regions.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
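To make the notion of exact compression concrete, the following minimal Python (NumPy) sketch is illustrative only and not taken from the project: over a bounded input domain, a ReLU unit whose pre-activation is provably never positive can never fire, so it can be deleted without changing the function the network computes. The one-hidden-layer toy network, the interval-arithmetic bound, and all names below are assumptions chosen for the example.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def forward(x, W1, b1, W2, b2):
    # One hidden layer: x -> ReLU(W1 x + b1) -> W2 h + b2
    return W2 @ relu(W1 @ x + b1) + b2

def stably_inactive(W1, b1, lo, hi):
    # Interval arithmetic over the input box [lo, hi]:
    # ub_i = b_i + sum_j max(W_ij * lo_j, W_ij * hi_j) bounds the pre-activation.
    ub = b1 + np.sum(np.maximum(W1 * lo, W1 * hi), axis=1)
    return ub <= 0.0  # units with ub <= 0 never fire on the domain

# Toy network with one unit engineered to be stably inactive on [0, 1]^3.
rng = np.random.default_rng(0)
W1 = rng.normal(size=(4, 3))
b1 = rng.normal(size=4)
W1[2] = -np.abs(W1[2])   # non-positive weights ...
b1[2] = -1.0             # ... plus a negative bias: pre-activation <= -1
W2 = rng.normal(size=(2, 4))
b2 = rng.normal(size=2)

lo, hi = np.zeros(3), np.ones(3)
keep = ~stably_inactive(W1, b1, lo, hi)
W1c, b1c, W2c = W1[keep], b1[keep], W2[:, keep]  # exact compression

# The compressed network computes the same function on the whole domain.
for _ in range(1000):
    x = rng.uniform(lo, hi)
    assert np.allclose(forward(x, W1, b1, W2, b2),
                       forward(x, W1c, b1c, W2c, b2))
print(f"removed {np.sum(~keep)} of {len(b1)} hidden units exactly")
```

Unlike inexact pruning, the removal above is verified against the original function rather than against held-out accuracy, which is why none of the accuracy, robustness, or fairness side effects mentioned earlier can arise.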
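Similarly, a hedged sketch of the most naive way to measure the number of linear regions, assuming a single hidden layer, a box input domain, and SciPy's linprog: enumerate every ReLU activation pattern and count those whose polyhedron has nonempty interior. The project targets far more efficient methods; exhaustive enumeration is shown only to make the quantity being measured concrete.

```python
import itertools
import numpy as np
from scipy.optimize import linprog

def count_linear_regions(W, b, lo, hi, tol=1e-7):
    """Count linear regions of x -> ReLU(W x + b) over the box [lo, hi]
    by solving, for each activation pattern, a small LP that maximizes
    the slack t with which the pattern's inequalities can be satisfied."""
    n, d = W.shape
    regions = 0
    for pattern in itertools.product([0, 1], repeat=n):
        # Unit i active (1) means W_i x + b_i >= t; inactive (0) means
        # W_i x + b_i <= -t. LP variables are (x, t).
        A_ub, b_ub = [], []
        for i, s in enumerate(pattern):
            if s == 1:
                A_ub.append(np.append(-W[i], 1.0)); b_ub.append(b[i])
            else:
                A_ub.append(np.append(W[i], 1.0)); b_ub.append(-b[i])
        c = np.zeros(d + 1); c[-1] = -1.0  # minimize -t, i.e., maximize t
        bounds = [(lo[j], hi[j]) for j in range(d)] + [(0.0, 1.0)]
        res = linprog(c, A_ub=np.array(A_ub), b_ub=np.array(b_ub), bounds=bounds)
        if res.status == 0 and -res.fun > tol:  # nonempty interior
            regions += 1
    return regions

rng = np.random.default_rng(1)
W = rng.normal(size=(5, 2))   # 5 hidden ReLUs over a 2-D input box
b = rng.normal(size=5)
lo, hi = -np.ones(2), np.ones(2)
print(count_linear_regions(W, b, lo, hi))
```

For a single layer, the regions are the cells that n hyperplanes cut out of the input box, so this 2-D example has at most 1 + 5 + 10 = 16 regions, well below the 2^5 = 32 activation patterns enumerated. That gap between what an architecture could represent and what a trained network actually represents is precisely the slack that exact compression exploits.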