Machine learning is an increasingly important approach in tackling many difficult scientific, engineering and artificial intelligence tasks, ranging from machine translation and speech recognition, through control of self driving cars, to protein structure prediction and drug design. The core idea of machine learning is to use examples and data to automatically train a system to perform some task. Accordingly, the success of machine learning is tied to availability of large amounts of training data and our ability to process it. Much of the recent success of machine learning is fueled by the large amounts of data (text, images, videos, etc) that can now be collected. But all this data also needs to be processed and learned from---indeed this data flood has shifted the bottleneck, to a large extent, from availability of data to our ability to process it. In particular, the amounts of data involved can no longer be stored and handled on single computers. Consequently, distributed machine learning, where data is processed and learned from on many computers that communicate with each other, is a crucial element of modern large scale machine learning.<br/><br/>The goal of this project is to provide a rigorous framework for studying distributed machine learning, and through it develop efficient methods for distributed learning and a theoretical understanding of the benefits of these methods, as well as the inherent limitations of distributed learning. A central component in the PIs' approach is to model distributed learning as a stochastic optimization problem, where different machines receive samples drawn from the same source distribution, thus allowing methods and analysis that specifically leverage the relatedness between data on different machines. This is crucial for studying how availability of multiple computers can aid in reducing the computational cost of learning. Furthermore, the project also encompasses the more challenging case where there are significant differences between the nature of the data on different machines (for instance, when different machines serve different geographical regions, or when each machine is a personal device, collecting data from a single user). In such a situation, the proposed approach to be studied is to integrate distributed learning with personalization or adaptation, which the PIs argue can not only improve learning performance, but also better leverage distributed computation.<br/><br/>This is an international collaboration, made possible through joint funding with the US-Israel Binational Science Foundation (BSF). The project brings together two PIs that have worked together extensively on related topics in machine learning and optimization.