Hands are the primary way that humans interact with and manipulate the world. Intelligent machines will need to understand how humans use their hands if they are to understand human actions and to work in the world humans have built with their hands. Unfortunately, videos that show people using their hands are surprisingly difficult for current artificial intelligence (AI) systems to understand. Hands may be temporarily hidden as people interact with objects, and even when they are visible, hands can interact with a myriad of different objects, ranging from refrigerator handles to coffee mugs to garage door openers. This project develops systems that enable learning how humans use their hands from large-scale Internet video data. Because hands are central to many other areas of study, this project has the potential to empower research in many other disciplines. For instance, robotics researchers may use the systems to teach robots how to interact with objects by observation. Similarly, kinesiologists and mechanical engineers who study how the human hand is used could use the systems to better quantify hand motions and thus improve people's lives.

This project aims to achieve its goal via three technical directions that together advance the science of understanding human activities and affordances (human/object interaction). The first direction will build systems for automatically parsing hand interaction data from large-scale video. The goal of this direction is to understand what the hand is doing in physical terms as it interacts with the world, rather than by naming the interaction with nouns and verbs. To help understand the context of an interaction, the second direction aims to build learning-based systems that can understand human poses from the partial observations that occur naturally in video data. Finally, the third direction puts these systems together by building a graph of interactions in which hand interaction examples are nodes and edges are induced by observations of human pose. This web of interactions will enable systems to learn how humans can manipulate objects from large-scale data across viewpoints and examples, and will enable new applications of computer vision.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.