The present disclosure relates generally to computer user interfaces, and more specifically to techniques for managing a live video communication session and/or managing digital content.
Computer systems can include hardware and/or software for displaying an interface for a live video communication session.
Some techniques for managing a live video communication session using electronic devices, however, are generally cumbersome and inefficient. For example, some existing techniques use a complex and time-consuming user interface, which may include multiple key presses or keystrokes. Existing techniques require more time than necessary, wasting user time and device energy. This latter consideration is particularly important in battery-operated devices.
Accordingly, the present technique provides electronic devices with faster, more efficient methods and interfaces for managing a live video communication session and/or managing digital content. Such methods and interfaces optionally complement or replace other methods for managing a live video communication session and/or managing digital content. Such methods and interfaces reduce the cognitive burden on a user and produce a more efficient human-machine interface. For battery-operated computing devices, such methods and interfaces conserve power and increase the time between battery charges.
In accordance with some embodiments, a method performed at a computer system that is in communication with a display generation component, one or more cameras, and one or more input devices is described. The method comprises: displaying, via the display generation component, a live video communication interface for a live video communication session, the live video communication interface including a representation of at least a portion of a field-of-view of the one or more cameras; while displaying the live video communication interface, detecting, via the one or more input devices, one or more user inputs including a user input directed to a surface in a scene that is in the field-of-view of the one or more cameras; and in response to detecting the one or more user inputs, displaying, via the display generation component, a representation of the surface, wherein the representation of the surface includes an image of the surface captured by the one or more cameras that is modified based on a position of the surface relative to the one or more cameras.
In accordance with some embodiments, a non-transitory computer-readable storage medium is described. The non-transitory computer-readable storage medium stores one or more programs configured to be executed by one or more processors of a computer system that is in communication with a display generation component, one or more cameras, and one or more input devices, the one or more programs including instructions for: displaying, via the display generation component, a live video communication interface for a live video communication session, the live video communication interface including a representation of at least a portion of a field-of-view of the one or more cameras; while displaying the live video communication interface, detecting, via the one or more input devices, one or more user inputs including a user input directed to a surface in a scene that is in the field-of-view of the one or more cameras; and in response to detecting the one or more user inputs, displaying, via the display generation component, a representation of the surface, wherein the representation of the surface includes an image of the surface captured by the one or more cameras that is modified based on a position of the surface relative to the one or more cameras.
In accordance with some embodiments, a transitory computer-readable storage medium is described. The transitory computer-readable storage medium stores one or more programs configured to be executed by one or more processors of a computer system that is in communication with a display generation component, one or more cameras, and one or more input devices, the one or more programs including instructions for: displaying, via the display generation component, a live video communication interface for a live video communication session, the live video communication interface including a representation of at least a portion of a field-of-view of the one or more cameras; while displaying the live video communication interface, detecting, via the one or more input devices, one or more user inputs including a user input directed to a surface in a scene that is in the field-of-view of the one or more cameras; and in response to detecting the one or more user inputs, displaying, via the display generation component, a representation of the surface, wherein the representation of the surface includes an image of the surface captured by the one or more cameras that is modified based on a position of the surface relative to the one or more cameras.
In accordance with some embodiments, a computer system that is configured to communicate with a display generation component, one or more cameras, and one or more input devices is described. The computer system comprises: one or more processors; and memory storing one or more programs configured to be executed by the one or more processors, the one or more programs including instructions for: displaying, via the display generation component, a live video communication interface for a live video communication session, the live video communication interface including a representation of at least a portion of a field-of-view of the one or more cameras; while displaying the live video communication interface, detecting, via the one or more input devices, one or more user inputs including a user input directed to a surface in a scene that is in the field-of-view of the one or more cameras; and in response to detecting the one or more user inputs, displaying, via the display generation component, a representation of the surface, wherein the representation of the surface includes an image of the surface captured by the one or more cameras that is modified based on a position of the surface relative to the one or more cameras.
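For illustration only, and not as a statement of the disclosed or claimed technique: one common way an image of a surface can be "modified based on a position of the surface relative to the one or more cameras" is a perspective (homography) warp that maps the surface's four corners, as seen in the camera frame, onto an axis-aligned rectangle, yielding a head-on view of the surface. The sketch below assumes OpenCV and NumPy, and the corner coordinates are hypothetical placeholders for the output of a surface-detection step.

```python
# Illustrative sketch only: perspective-correct a camera frame so a desk or
# table surface appears as if viewed from directly above. Assumes OpenCV and
# NumPy; the corner coordinates are hypothetical and would normally come from
# surface detection.
import cv2
import numpy as np

def rectify_surface(frame, surface_corners_px, out_size=(1280, 960)):
    """Warp `frame` so that the quadrilateral `surface_corners_px`
    (top-left, top-right, bottom-right, bottom-left) fills the output image."""
    w, h = out_size
    src = np.float32(surface_corners_px)
    dst = np.float32([[0, 0], [w, 0], [w, h], [0, h]])
    homography = cv2.getPerspectiveTransform(src, dst)  # 3x3 projective map
    return cv2.warpPerspective(frame, homography, (w, h))

# Example usage with a synthetic frame and assumed corner positions:
frame = np.zeros((1080, 1920, 3), dtype=np.uint8)
corners = [(600, 520), (1320, 520), (1700, 1020), (220, 1020)]
top_down_view = rectify_surface(frame, corners)
```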
In accordance with some embodiments, a computer system that is configured to communicate with a display generation component, one or more cameras, and one or more input devices is described. The computer system comprises: means for displaying, via the display generation component, a live video communication interface for a live video communication session, the live video communication interface including a representation of a first portion of a scene that is in a field-of-view captured by the one or more cameras; and means, while displaying the live video communication interface, for obtaining, via the one or more cameras, image data for the field-of-view of the one or more cameras, the image data including a first gesture; and means, responsive to obtaining the image data for the field-of-view of the one or more cameras, for: in accordance with a determination that the first gesture satisfies a first set of criteria, displaying, via the display generation component, a representation of a second portion of the scene that is in the field-of-view of the one or more cameras, the representation of the second portion of the scene including different visual content from the representation of the first portion of the scene; and in accordance with a determination that the first gesture satisfies a second set of criteria different from the first set of criteria, continuing to display, via the display generation component, the representation of the first portion of the scene.
In accordance with some embodiments, a computer program product is described. The computer program product comprises one or more programs configured to be executed by one or more processors of a computer system that is in communication with a display generation component, one or more cameras, and one or more input devices. The one or more programs include instructions for: displaying, via the display generation component, a live video communication interface for a live video communication session, the live video communication interface including a representation of a first portion of a scene that is in a field-of-view captured by the one or more cameras; and while displaying the live video communication interface, obtaining, via the one or more cameras, image data for the field-of-view of the one or more cameras, the image data including a first gesture; and in response to obtaining the image data for the field-of-view of the one or more cameras: in accordance with a determination that the first gesture satisfies a first set of criteria, displaying, via the display generation component, a representation of a second portion of the scene that is in the field-of-view of the one or more cameras, the representation of the second portion of the scene including different visual content from the representation of the first portion of the scene; and in accordance with a determination that the first gesture satisfies a second set of criteria different from the first set of criteria, continuing to display, via the display generation component, the representation of the first portion of the scene.
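As a purely illustrative sketch (the gesture types, fields, and thresholds below are hypothetical and are not the claimed criteria), the gesture-handling behavior described above reduces to a conditional: if the detected gesture satisfies a first set of criteria, the displayed portion of the scene changes; if it satisfies a different set, the current portion continues to be displayed.

```python
# Illustrative sketch only: decide whether a detected gesture should change
# which portion of the scene is displayed. Gesture fields and thresholds are
# hypothetical placeholders, not the claimed criteria.
from dataclasses import dataclass

@dataclass
class Gesture:
    kind: str          # e.g. "point", "frame_with_hands", "wave"
    duration_s: float  # how long the gesture was held
    target: str        # e.g. "surface", "self", "object"

def select_displayed_portion(gesture: Gesture, current_portion: str) -> str:
    # First set of criteria: a sustained pointing gesture toward the surface
    # switches the view to the portion of the scene containing that surface.
    if gesture.kind == "point" and gesture.target == "surface" and gesture.duration_s >= 0.5:
        return "surface_portion"
    # Second (different) set of criteria: brief or unrelated gestures leave
    # the currently displayed portion unchanged.
    return current_portion

print(select_displayed_portion(Gesture("point", 0.8, "surface"), "speaker_portion"))
```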
In accordance with some embodiments, a method performed at a computer system that is in communication with a display generation component, one or more first cameras, and one or more input devices is described. The method comprises: detecting a set of one or more user inputs corresponding to a request to display a user interface of a live video communication session that includes a plurality of participants; in response to detecting the set of one or more user inputs, displaying, via the display generation component, a live video communication interface for a live video communication session, the live video communication interface including: a first representation of a field-of-view of the one or more first cameras of the first computer system; a second representation of the field-of-view of the one or more first cameras of the first computer system, the second representation of the field-of-view of the one or more first cameras of the first computer system including a representation of a surface in a first scene that is in the field-of-view of the one or more first cameras of the first computer system; a first representation of a field-of-view of one or more second cameras of a second computer system; and a second representation of the field-of-view of the one or more second cameras of the second computer system, the second representation of the field-of-view of the one or more second cameras of the second computer system including a representation of a surface in a second scene that is in the field-of-view of the one or more second cameras of the second computer system.
In accordance with some embodiments, a non-transitory computer-readable storage medium is described. The non-transitory computer-readable storage medium stores one or more programs configured to be executed by one or more processors of a computer system that is in communication with a display generation component, one or more first cameras, and one or more input devices, the one or more programs including instructions for: detecting a set of one or more user inputs corresponding to a request to display a user interface of a live video communication session that includes a plurality of participants; in response to detecting the set of one or more user inputs, displaying, via the display generation component, a live video communication interface for a live video communication session, the live video communication interface including: a first representation of a field-of-view of the one or more first cameras of the first computer system; a second representation of the field-of-view of the one or more first cameras of the first computer system, the second representation of the field-of-view of the one or more first cameras of the first computer system including a representation of a surface in a first scene that is in the field-of-view of the one or more first cameras of the first computer system; a first representation of a field-of-view of one or more second cameras of a second computer system; and a second representation of the field-of-view of the one or more second cameras of the second computer system, the second representation of the field-of-view of the one or more second cameras of the second computer system including a representation of a surface in a second scene that is in the field-of-view of the one or more second cameras of the second computer system.
In accordance with some embodiments, a transitory computer-readable storage medium is described. The transitory computer-readable storage medium stores one or more programs configured to be executed by one or more processors of a computer system that is in communication with a display generation component, one or more first cameras, and one or more input devices, the one or more programs including instructions for: detecting a set of one or more user inputs corresponding to a request to display a user interface of a live video communication session that includes a plurality of participants; in response to detecting the set of one or more user inputs, displaying, via the display generation component, a live video communication interface for a live video communication session, the live video communication interface including: a first representation of a field-of-view of the one or more first cameras of the first computer system; a second representation of the field-of-view of the one or more first cameras of the first computer system, the second representation of the field-of-view of the one or more first cameras of the first computer system including a representation of a surface in a first scene that is in the field-of-view of the one or more first cameras of the first computer system; a first representation of a field-of-view of one or more second cameras of a second computer system; and a second representation of the field-of-view of the one or more second cameras of the second computer system, the second representation of the field-of-view of the one or more second cameras of the second computer system including a representation of a surface in a second scene that is in the field-of-view of the one or more second cameras of the second computer system.
In accordance with some embodiments, a computer system that is configured to communicate with a display generation component, one or more first cameras, and one or more input devices is described. The computer system comprises: one or more processors; and memory storing one or more programs configured to be executed by the one or more processors, the one or more programs including instructions for: detecting a set of one or more user inputs corresponding to a request to display a user interface of a live video communication session that includes a plurality of participants; in response to detecting the set of one or more user inputs, displaying, via the display generation component, a live video communication interface for a live video communication session, the live video communication interface including: a first representation of a field-of-view of the one or more first cameras of the first computer system; a second representation of the field-of-view of the one or more first cameras of the first computer system, the second representation of the field-of-view of the one or more first cameras of the first computer system including a representation of a surface in a first scene that is in the field-of-view of the one or more first cameras of the first computer system; a first representation of a field-of-view of one or more second cameras of a second computer system; and a second representation of the field-of-view of the one or more second cameras of the second computer system, the second representation of the field-of-view of the one or more second cameras of the second computer system including a representation of a surface in a second scene that is in the field-of-view of the one or more second cameras of the second computer system.
In accordance with some embodiments, a computer system that is configured to communicate with a display generation component, one or more first cameras, and one or more input devices is described. The computer system comprises: means for detecting a set of one or more user inputs corresponding to a request to display a user interface of a live video communication session that includes a plurality of participants; means, responsive to detecting the set of one or more user inputs, for displaying, via the display generation component, a live video communication interface for a live video communication session, the live video communication interface including: a first representation of a field-of-view of the one or more first cameras of the first computer system; a second representation of the field-of-view of the one or more first cameras of the first computer system, the second representation of the field-of-view of the one or more first cameras of the first computer system including a representation of a surface in a first scene that is in the field-of-view of the one or more first cameras of the first computer system; a first representation of a field-of-view of one or more second cameras of a second computer system; and a second representation of the field-of-view of the one or more second cameras of the second computer system, the second representation of the field-of-view of the one or more second cameras of the second computer system including a representation of a surface in a second scene that is in the field-of-view of the one or more second cameras of the second computer system.
In accordance with some embodiments, a computer program product is described. The computer program product comprises one or more programs configured to be executed by one or more processors of a computer system that is in communication with a display generation component, one or more first cameras, and one or more input devices. The one or more programs include instructions for: detecting a set of one or more user inputs corresponding to a request to display a user interface of a live video communication session that includes a plurality of participants; in response to detecting the set of one or more user inputs, displaying, via the display generation component, a live video communication interface for a live video communication session, the live video communication interface including: a first representation of a field-of-view of the one or more first cameras of the first computer system; a second representation of the field-of-view of the one or more first cameras of the first computer system, the second representation of the field-of-view of the one or more first cameras of the first computer system including a representation of a surface in a first scene that is in the field-of-view of the one or more first cameras of the first computer system; a first representation of a field-of-view of one or more second cameras of a second computer system; and a second representation of the field-of-view of the one or more second cameras of the second computer system, the second representation of the field-of-view of the one or more second cameras of the second computer system including a representation of a surface in a second scene that is in the field-of-view of the one or more second cameras of the second computer system.
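As an illustrative data-structure sketch only (all names are hypothetical and not part of the disclosure), the four concurrently displayed representations described above can be modeled as a layout that holds, for each participant, both a camera-view tile and a surface-view tile:

```python
# Illustrative sketch only: a layout model holding, for each participant, a
# representation of the camera field-of-view and a representation of a
# surface within that field-of-view. All names are hypothetical.
from dataclasses import dataclass, field

@dataclass
class ParticipantTiles:
    participant_id: str
    camera_view: str   # identifier for the full field-of-view stream
    surface_view: str  # identifier for the rectified surface stream

@dataclass
class LiveSessionLayout:
    tiles: list[ParticipantTiles] = field(default_factory=list)

    def add_participant(self, participant_id: str) -> None:
        self.tiles.append(ParticipantTiles(
            participant_id=participant_id,
            camera_view=f"{participant_id}/camera",
            surface_view=f"{participant_id}/surface",
        ))

layout = LiveSessionLayout()
layout.add_participant("first_system")
layout.add_participant("second_system")
# The interface would then render both tiles for every participant.
```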
In accordance with some embodiments, a method is described. The method comprises: at a first computer system that is in communication with a first display generation component and one or more sensors: while the first computer system is in a live video communication session with a second computer system: displaying, via the first display generation component, a representation of a first view of a physical environment that is in a field of view of one or more cameras of the second computer system; while displaying the representation of the first view of the physical environment, detecting, via the one or more sensors, a change in a position of the first computer system; and in response to detecting the change in the position of the first computer system, displaying, via the first display generation component, a representation of a second view of the physical environment in the field of view of the one or more cameras of the second computer system that is different from the first view of the physical environment in the field of view of the one or more cameras of the second computer system.
In accordance with some embodiments, a non-transitory computer-readable storage medium is described. The non-transitory computer-readable storage medium stores one or more programs configured to be executed by one or more processors of a computer system that is in communication with a first display generation component and one or more sensors, the one or more programs including instructions for: while the first computer system is in a live video communication session with a second computer system: displaying, via the first display generation component, a representation of a first view of a physical environment that is in a field of view of one or more cameras of the second computer system; while displaying the representation of the first view of the physical environment, detecting, via the one or more sensors, a change in a position of the first computer system; and in response to detecting the change in the position of the first computer system, displaying, via the first display generation component, a representation of a second view of the physical environment in the field of view of the one or more cameras of the second computer system that is different from the first view of the physical environment in the field of view of the one or more cameras of the second computer system.
In accordance with some embodiments, a transitory computer-readable storage medium is described. The transitory computer-readable storage medium stores one or more programs configured to be executed by one or more processors of a computer system that is in communication with a first display generation component and one or more sensors, the one or more programs including instructions for: while the first computer system is in a live video communication session with a second computer system: displaying, via the first display generation component, a representation of a first view of a physical environment that is in a field of view of one or more cameras of the second computer system; while displaying the representation of the first view of the physical environment, detecting, via the one or more sensors, a change in a position of the first computer system; and in response to detecting the change in the position of the first computer system, displaying, via the first display generation component, a representation of a second view of the physical environment in the field of view of the one or more cameras of the second computer system that is different from the first view of the physical environment in the field of view of the one or more cameras of the second computer system.
In accordance with some embodiments, a computer system configured to communicate with a first display generation component and one or more sensors is described. The computer system comprises: one or more processors; and memory storing one or more programs configured to be executed by the one or more processors, the one or more programs including instructions for: while the first computer system is in a live video communication session with a second computer system: displaying, via the first display generation component, a representation of a first view of a physical environment that is in a field of view of one or more cameras of the second computer system; while displaying the representation of the first view of the physical environment, detecting, via the one or more sensors, a change in a position of the first computer system; and in response to detecting the change in the position of the first computer system, displaying, via the first display generation component, a representation of a second view of the physical environment in the field of view of the one or more cameras of the second computer system that is different from the first view of the physical environment in the field of view of the one or more cameras of the second computer system.
In accordance with some embodiments, a computer system configured to communicate with a first display generation component and one or more sensors is described. The computer system comprises: means for, while the first computer system is in a live video communication session with a second computer system: displaying, via the first display generation component, a representation of a first view of a physical environment that is in a field of view of one or more cameras of the second computer system; while displaying the representation of the first view of the physical environment, detecting, via the one or more sensors, a change in a position of the first computer system; and in response to detecting the change in the position of the first computer system, displaying, via the first display generation component, a representation of a second view of the physical environment in the field of view of the one or more cameras of the second computer system that is different from the first view of the physical environment in the field of view of the one or more cameras of the second computer system.
In accordance with some embodiments, a computer program product is described. The computer program product comprises one or more programs configured to be executed by one or more processors of a computer system that is in communication with a first display generation component and one or more sensors, the one or more programs including instructions for: while the first computer system is in a live video communication session with a second computer system: displaying, via the first display generation component, a representation of a first view of a physical environment that is in a field of view of one or more cameras of the second computer system; while displaying the representation of the first view of the physical environment, detecting, via the one or more sensors, a change in a position of the first computer system; and in response to detecting the change in the position of the first computer system, displaying, via the first display generation component, a representation of a second view of the physical environment in the field of view of the one or more cameras of the second computer system that is different from the first view of the physical environment in the field of view of the one or more cameras of the second computer system.
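Purely as a sketch, and not as the claimed method: the behavior of changing the displayed view when the viewing device moves can be approximated by mapping a change in the first device's orientation to a pan of a crop window inside the remote camera's wider field of view. The field-of-view values, gains, and frame sizes below are assumptions for illustration.

```python
# Illustrative sketch only: pan a crop window within a wide remote camera
# frame in proportion to how far the local device has been tilted.
# Field-of-view values and frame sizes are hypothetical.
def crop_offset_for_tilt(yaw_deg: float, pitch_deg: float,
                         frame_w: int, frame_h: int,
                         camera_hfov_deg: float = 120.0,
                         camera_vfov_deg: float = 90.0) -> tuple[int, int]:
    """Convert a change in local-device orientation (degrees) into a pixel
    offset of the crop window inside the remote frame."""
    dx = int((yaw_deg / camera_hfov_deg) * frame_w)
    dy = int((pitch_deg / camera_vfov_deg) * frame_h)
    return dx, dy

def clamp_crop(x: int, y: int, crop_w: int, crop_h: int,
               frame_w: int, frame_h: int) -> tuple[int, int]:
    """Keep the crop window inside the remote frame."""
    x = max(0, min(x, frame_w - crop_w))
    y = max(0, min(y, frame_h - crop_h))
    return x, y

# Tilting the local device 10 degrees to the right pans the remote view:
dx, dy = crop_offset_for_tilt(10.0, 0.0, frame_w=3840, frame_h=2160)
x, y = clamp_crop(960 + dx, 540 + dy, crop_w=1920, crop_h=1080,
                  frame_w=3840, frame_h=2160)
```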
In accordance with some embodiments, a method is described. The method comprises: at a computer system that is in communication with a display generation component: displaying, via the display generation component, a representation of a physical mark in a physical environment based on a view of the physical environment in a field of view of one or more cameras, wherein: the view of the physical environment includes the physical mark and a physical background, and displaying the representation of the physical mark includes displaying the representation of the physical mark without displaying one or more elements of a portion of the physical background that is in the field of view of the one or more cameras; while displaying the representation of the physical mark without displaying the one or more elements of the portion of the physical background that is in the field of view of the one or more cameras, obtaining data that includes a new physical mark in the physical environment; and in response to obtaining data representing the new physical mark in the physical environment, displaying a representation of the new physical mark without displaying the one or more elements of the portion of the physical background that is in the field of view of the one or more cameras.
In accordance with some embodiments, a non-transitory computer-readable storage medium is described. The non-transitory computer-readable storage medium stores one or more programs configured to be executed by one or more processors of a computer system that is in communication with a display generation component, the one or more programs including instructions for: displaying, via the display generation component, a representation of a physical mark in a physical environment based on a view of the physical environment in a field of view of one or more cameras, wherein: the view of the physical environment includes the physical mark and a physical background, and displaying the representation of the physical mark includes displaying the representation of the physical mark without displaying one or more elements of a portion of the physical background that is in the field of view of the one or more cameras; while displaying the representation of the physical mark without displaying the one or more elements of the portion of the physical background that is in the field of view of the one or more cameras, obtaining data that includes a new physical mark in the physical environment; and in response to obtaining data representing the new physical mark in the physical environment, displaying a representation of the new physical mark without displaying the one or more elements of the portion of the physical background that is in the field of view of the one or more cameras.
In accordance with some embodiments, a transitory computer-readable storage medium is described. The transitory computer-readable storage medium stores one or more programs configured to be executed by one or more processors of a computer system that is in communication with a display generation component, the one or more programs including instructions for: displaying, via the display generation component, a representation of a physical mark in a physical environment based on a view of the physical environment in a field of view of one or more cameras, wherein: the view of the physical environment includes the physical mark and a physical background, and displaying the representation of the physical mark includes displaying the representation of the physical mark without displaying one or more elements of a portion of the physical background that is in the field of view of the one or more cameras; while displaying the representation of the physical mark without displaying the one or more elements of the portion of the physical background that is in the field of view of the one or more cameras, obtaining data that includes a new physical mark in the physical environment; and in response to obtaining data representing the new physical mark in the physical environment, displaying a representation of the new physical mark without displaying the one or more elements of the portion of the physical background that is in the field of view of the one or more cameras.
In accordance with some embodiments, a computer system configured to communicate with a display generation component is described. The computer system comprises: one or more processors; and memory storing one or more programs configured to be executed by the one or more processors, the one or more programs including instructions for: displaying, via the display generation component, a representation of a physical mark in a physical environment based on a view of the physical environment in a field of view of one or more cameras, wherein: the view of the physical environment includes the physical mark and a physical background, and displaying the representation of the physical mark includes displaying the representation of the physical mark without displaying one or more elements of a portion of the physical background that is in the field of view of the one or more cameras; while displaying the representation of the physical mark without displaying the one or more elements of the portion of the physical background that is in the field of view of the one or more cameras, obtaining data that includes a new physical mark in the physical environment; and in response to obtaining data representing the new physical mark in the physical environment, displaying a representation of the new physical mark without displaying the one or more elements of the portion of the physical background that is in the field of view of the one or more cameras.
In accordance with some embodiments, a computer system configured to communicate with a display generation component is described. The computer system comprises: means for displaying, via the display generation component, a representation of a physical mark in a physical environment based on a view of the physical environment in a field of view of one or more cameras, wherein: the view of the physical environment includes the physical mark and a physical background, and displaying the representation of the physical mark includes displaying the representation of the physical mark without displaying one or more elements of a portion of the physical background that is in the field of view of the one or more cameras; means for, while displaying the representation of the physical mark without displaying the one or more elements of the portion of the physical background that is in the field of view of the one or more cameras, obtaining data that includes a new physical mark in the physical environment; and means for, in response to obtaining data representing the new physical mark in the physical environment, displaying a representation of the new physical mark without displaying the one or more elements of the portion of the physical background that is in the field of view of the one or more cameras.
In accordance with some embodiments, a computer program product is described. The computer program product comprises one or more programs configured to be executed by one or more processors of a computer system that is in communication with a display generation component, the one or more programs including instructions for: displaying, via the display generation component, a representation of a physical mark in a physical environment based on a view of the physical environment in a field of view of one or more cameras, wherein: the view of the physical environment includes the physical mark and a physical background, and displaying the representation of the physical mark includes displaying the representation of the physical mark without displaying one or more elements of a portion of the physical background that is in the field of view of the one or more cameras; while displaying the representation of the physical mark without displaying the one or more elements of the portion of the physical background that is in the field of view of the one or more cameras, obtaining data that includes a new physical mark in the physical environment; and in response to obtaining data representing the new physical mark in the physical environment, displaying a representation of the new physical mark without displaying the one or more elements of the portion of the physical background that is in the field of view of the one or more cameras.
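As an illustrative image-processing sketch only (not the claimed embodiment), displaying a representation of a physical mark without elements of the physical background can be approximated by thresholding: dark pen strokes are kept, and everything else is rendered transparent. The threshold parameters below are assumptions; a production system would adapt them to lighting. Assumes OpenCV and NumPy.

```python
# Illustrative sketch only: keep dark handwriting strokes from a camera view
# of paper and drop the surrounding background. Threshold parameters are
# hypothetical and would normally adapt to lighting conditions.
import cv2
import numpy as np

def extract_marks(frame_bgr: np.ndarray) -> np.ndarray:
    """Return an RGBA image containing only the detected marks."""
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
    # Adaptive threshold: stroke pixels become 255 in `mask`
    # (block size 25, constant offset 10).
    mask = cv2.adaptiveThreshold(gray, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C,
                                 cv2.THRESH_BINARY_INV, 25, 10)
    rgba = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2BGRA)
    rgba[:, :, 3] = mask  # background pixels become fully transparent
    return rgba

# A newly captured frame can be re-processed the same way, so a new physical
# mark appears in the output while the background stays hidden.
```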
In accordance with some embodiments, a method is described. The method comprises: at a computer system that is in communication with a display generation component and one or more cameras: displaying, via the display generation component, an electronic document; detecting, via the one or more cameras, handwriting that includes physical marks on a physical surface that is in a field of view of the one or more cameras and is separate from the computer system; and in response to detecting the handwriting that includes physical marks on the physical surface that is in the field of view of the one or more cameras and is separate from the computer system, displaying, in the electronic document, digital text corresponding to the handwriting that is in the field of view of the one or more cameras.
In accordance with some embodiments, a non-transitory computer-readable storage medium is described. The non-transitory computer-readable storage medium stores one or more programs configured to be executed by one or more processors of a computer system that is in communication with a display generation component and one or more cameras, the one or more programs including instructions for: displaying, via the display generation component, an electronic document; detecting, via the one or more cameras, handwriting that includes physical marks on a physical surface that is in a field of view of the one or more cameras and is separate from the computer system; and in response to detecting the handwriting that includes physical marks on the physical surface that is in the field of view of the one or more cameras and is separate from the computer system, displaying, in the electronic document, digital text corresponding to the handwriting that is in the field of view of the one or more cameras.
In accordance with some embodiments, a transitory computer-readable storage medium is described. The transitory computer-readable storage medium stores one or more programs configured to be executed by one or more processors of a computer system that is in communication with a display generation component and one or more cameras, the one or more programs including instructions for: displaying, via the display generation component, an electronic document; detecting, via the one or more cameras, handwriting that includes physical marks on a physical surface that is in a field of view of the one or more cameras and is separate from the computer system; and in response to detecting the handwriting that includes physical marks on the physical surface that is in the field of view of the one or more cameras and is separate from the computer system, displaying, in the electronic document, digital text corresponding to the handwriting that is in the field of view of the one or more cameras.
In accordance with some embodiments, a computer system configured to communicate with a display generation component and one or more cameras is described. The computer system comprises: one or more processors; and memory storing one or more programs configured to be executed by the one or more processors, the one or more programs including instructions for: displaying, via the display generation component, an electronic document; detecting, via the one or more cameras, handwriting that includes physical marks on a physical surface that is in a field of view of the one or more cameras and is separate from the computer system; and in response to detecting the handwriting that includes physical marks on the physical surface that is in the field of view of the one or more cameras and is separate from the computer system, displaying, in the electronic document, digital text corresponding to the handwriting that is in the field of view of the one or more cameras.
In accordance with some embodiments, a computer system configured to communicate with a display generation component and one or more cameras is described. The computer system comprises: means for displaying, via the display generation component, an electronic document; means for detecting, via the one or more cameras, handwriting that includes physical marks on a physical surface that is in a field of view of the one or more cameras and is separate from the computer system; and means for, in response to detecting the handwriting that includes physical marks on the physical surface that is in the field of view of the one or more cameras and is separate from the computer system, displaying, in the electronic document, digital text corresponding to the handwriting that is in the field of view of the one or more cameras.
In accordance with some embodiments, a computer program product is described. The computer program product comprises one or more programs configured to be executed by one or more processors of a computer system that is in communication with a display generation component and one or more cameras, the one or more programs including instructions for: displaying, via the display generation component, an electronic document; detecting, via the one or more cameras, handwriting that includes physical marks on a physical surface that is in a field of view of the one or more cameras and is separate from the computer system; and in response to detecting the handwriting that includes physical marks on the physical surface that is in the field of view of the one or more cameras and is separate from the computer system, displaying, in the electronic document, digital text corresponding to the handwriting that is in the field of view of the one or more cameras.
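Illustrative sketch of the detection-to-text step only, not the claimed method: an off-the-shelf OCR engine can convert the camera view of handwritten marks into digital text, which is then inserted into the electronic document. The sketch assumes the pytesseract wrapper around Tesseract as a stand-in recognizer and a hypothetical document model; a production system would use a handwriting-specific recognizer.

```python
# Illustrative sketch only: recognize handwriting seen by the camera and
# insert the result into an electronic document. Uses pytesseract as a
# stand-in OCR engine; the file path and document model are hypothetical.
import cv2
import pytesseract

def handwriting_to_text(frame_bgr) -> str:
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
    # Binarize to improve recognition of pen strokes on paper.
    _, binarized = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    return pytesseract.image_to_string(binarized)

class ElectronicDocument:
    def __init__(self) -> None:
        self.text = ""

    def append(self, new_text: str) -> None:
        self.text += new_text

document = ElectronicDocument()
frame = cv2.imread("camera_frame.png")  # hypothetical captured frame
if frame is not None:
    document.append(handwriting_to_text(frame))
```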
In accordance with some embodiments, a method performed at a first computer system that is in communication with a display generation component, one or more cameras, and one or more input devices is described. The method comprises: detecting, via the one or more input devices, one or more first user inputs corresponding to a request to display a user interface of an application for displaying a visual representation of a surface that is in a field of view of the one or more cameras; and in response to detecting the one or more first user inputs: in accordance with a determination that a first set of one or more criteria is met, concurrently displaying, via the display generation component: a visual representation of a first portion of the field of view of the one or more cameras; and a visual indication that indicates a first region of the field of view of the one or more cameras that is a subset of the first portion of the field of view of the one or more cameras, wherein the first region indicates a second portion of the field of view of the one or more cameras that will be presented as a view of the surface by a second computer system.
In accordance with some embodiments, a non-transitory computer-readable storage medium is described. The non-transitory computer-readable storage medium stores one or more programs configured to be executed by one or more processors of a first computer system that is in communication with a display generation component, one or more cameras, and one or more input devices, the one or more programs including instructions for: detecting, via the one or more input devices, one or more first user inputs corresponding to a request to display a user interface of an application for displaying a visual representation of a surface that is in a field of view of the one or more cameras; and in response to detecting the one or more first user inputs: in accordance with a determination that a first set of one or more criteria is met, concurrently displaying, via the display generation component: a visual representation of a first portion of the field of view of the one or more cameras; and a visual indication that indicates a first region of the field of view of the one or more cameras that is a subset of the first portion of the field of view of the one or more cameras, wherein the first region indicates a second portion of the field of view of the one or more cameras that will be presented as a view of the surface by a second computer system.
In accordance with some embodiments, a transitory computer-readable storage medium is described. The transitory computer-readable storage medium stores one or more programs configured to be executed by one or more processors of a first computer system that is configured to communicate with a display generation component, one or more cameras, and one or more input devices, the one or more programs including instructions for: detecting, via the one or more input devices, one or more first user inputs corresponding to a request to display a user interface of an application for displaying a visual representation of a surface that is in a field of view of the one or more cameras; and in response to detecting the one or more first user inputs: in accordance with a determination that a first set of one or more criteria is met, concurrently displaying, via the display generation component: a visual representation of a first portion of the field of view of the one or more cameras; and a visual indication that indicates a first region of the field of view of the one or more cameras that is a subset of the first portion of the field of view of the one or more cameras, wherein the first region indicates a second portion of the field of view of the one or more cameras that will be presented as a view of the surface by a second computer system.
In accordance with some embodiments, a first computer system that is configured to communicate with a display generation component, one or more cameras, and one or more input devices is described. The computer system comprises: one or more processors; and memory storing one or more programs configured to be executed by the one or more processors, the one or more programs including instructions for: detecting, via the one or more input devices, one or more first user inputs corresponding to a request to display a user interface of an application for displaying a visual representation of a surface that is in a field of view of the one or more cameras; and in response to detecting the one or more first user inputs: in accordance with a determination that a first set of one or more criteria is met, concurrently displaying, via the display generation component: a visual representation of a first portion of the field of view of the one or more cameras; and a visual indication that indicates a first region of the field of view of the one or more cameras that is a subset of the first portion of the field of view of the one or more cameras, wherein the first region indicates a second portion of the field of view of the one or more cameras that will be presented as a view of the surface by a second computer system.
In accordance with some embodiments, a first computer system that is configured to communicate with a display generation component, one or more cameras, and one or more input devices is described. The computer system comprises: means for detecting, via the one or more input devices, one or more first user inputs corresponding to a request to display a user interface of an application for displaying a visual representation of a surface that is in a field of view of the one or more cameras; and means, responsive to detecting the one or more first user inputs, for: in accordance with a determination that a first set of one or more criteria is met, concurrently displaying, via the display generation component: a visual representation of a first portion of the field of view of the one or more cameras; and a visual indication that indicates a first region of the field of view of the one or more cameras that is a subset of the first portion of the field of view of the one or more cameras, wherein the first region indicates a second portion of the field of view of the one or more cameras that will be presented as a view of the surface by a second computer system.
In accordance with some embodiments, a computer program product is described. The computer program product comprises one or more programs configured to be executed by one or more processors of a first computer system that is in communication with a display generation component, one or more cameras, and one or more input devices. The one or more programs include instructions for: detecting, via the one or more input devices, one or more first user inputs corresponding to a request to display a user interface of an application for displaying a visual representation of a surface that is in a field of view of the one or more cameras; and in response to detecting the one or more first user inputs: in accordance with a determination that a first set of one or more criteria is met, concurrently displaying, via the display generation component: a visual representation of a first portion of the field of view of the one or more cameras; and a visual indication that indicates a first region of the field of view of the one or more cameras that is a subset of the first portion of the field of view of the one or more cameras, wherein the first region indicates a second portion of the field of view of the one or more cameras that will be presented as a view of the surface by a second computer system.
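For illustration only, the following Swift sketch shows one way a region of a camera's field of view could be represented and mapped to pixel coordinates before being presented as a view of a surface by another system. The types, names, and values are hypothetical assumptions introduced for this sketch and are not part of the techniques described herein.

```swift
import CoreGraphics

// Hypothetical sketch: given the full camera frame and the portion of the
// field of view shown in a preview, compute the sub-region that a second
// computer system would receive as the "surface" view.
struct SurfaceRegion {
    /// Region of interest, normalized to the camera frame (0...1 on both axes).
    var normalizedRect: CGRect

    /// Maps the normalized region into pixel coordinates of a captured frame.
    func pixelRect(inFrameOf size: CGSize) -> CGRect {
        CGRect(x: normalizedRect.origin.x * size.width,
               y: normalizedRect.origin.y * size.height,
               width: normalizedRect.size.width * size.width,
               height: normalizedRect.size.height * size.height)
    }
}

// Example: the preview shows the lower half of the frame; the surface region
// (the visual indication shown to the user) is a subset of that portion.
let previewPortion = SurfaceRegion(normalizedRect: CGRect(x: 0, y: 0.5, width: 1.0, height: 0.5))
let surfaceRegion = SurfaceRegion(normalizedRect: CGRect(x: 0.2, y: 0.6, width: 0.6, height: 0.3))
let frameSize = CGSize(width: 1920, height: 1080)
print(surfaceRegion.pixelRect(inFrameOf: frameSize))  // sub-rect that would be sent to the second computer system
```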
In accordance with some embodiments, a method is described. The method comprises: at a computer system that is in communication with a display generation component and one or more input devices: detecting, via the one or more input devices, a request to use a feature on the computer system; and in response to detecting the request to use the feature on the computer system, displaying, via the display generation component, a tutorial for using the feature that includes a virtual demonstration of the feature, including: in accordance with a determination that a property of the computer system has a first value, displaying the virtual demonstration having a first appearance; and in accordance with a determination that the property of the computer system has a second value, displaying the virtual demonstration having a second appearance that is different from the first appearance.
In accordance with some embodiments, a non-transitory computer-readable storage medium is described. The non-transitory computer-readable storage medium stores one or more programs configured to be executed by one or more processors of a computer system that is in communication with a display generation component and one or more input devices, the one or more programs including instructions for: detecting, via the one or more input devices, a request to use a feature on the computer system; and in response to detecting the request to use the feature on the computer system, displaying, via the display generation component, a tutorial for using the feature that includes a virtual demonstration of the feature, including: in accordance with a determination that a property of the computer system has a first value, displaying the virtual demonstration having a first appearance; and in accordance with a determination that the property of the computer system has a second value, displaying the virtual demonstration having a second appearance that is different from the first appearance.
In accordance with some embodiments, a transitory computer-readable storage medium is described. The transitory computer-readable storage medium stores one or more programs configured to be executed by one or more processors of a computer system that is in communication with a display generation component and one or more input devices, the one or more programs including instructions for: detecting, via the one or more input devices, a request to use a feature on the computer system; and in response to detecting the request to use the feature on the computer system, displaying, via the display generation component, a tutorial for using the feature that includes a virtual demonstration of the feature, including: in accordance with a determination that a property of the computer system has a first value, displaying the virtual demonstration having a first appearance; and in accordance with a determination that the property of the computer system has a second value, displaying the virtual demonstration having a second appearance that is different from the first appearance.
In accordance with some embodiments, a computer system configured to communicate with a display generation component and one or more input devices is described. The computer system comprises: one or more processors; and memory storing one or more programs configured to be executed by the one or more processors, the one or more programs including instructions for: detecting, via the one or more input devices, a request to use a feature on the computer system; and in response to detecting the request to use the feature on the computer system, displaying, via the display generation component, a tutorial for using the feature that includes a virtual demonstration of the feature, including: in accordance with a determination that a property of the computer system has a first value, displaying the virtual demonstration having a first appearance; and in accordance with a determination that the property of the computer system has a second value, displaying the virtual demonstration having a second appearance that is different from the first appearance.
In accordance with some embodiments, a computer system configured to communicate with a display generation component and one or more input devices is described. The computer system comprises: means for detecting, via the one or more input devices, a request to use a feature on the computer system; and means for, in response to detecting the request to use the feature on the computer system, displaying, via the display generation component, a tutorial for using the feature that includes a virtual demonstration of the feature, including: means for, in accordance with a determination that a property of the computer system has a first value, displaying the virtual demonstration having a first appearance; and means for, in accordance with a determination that the property of the computer system has a second value, displaying the virtual demonstration having a second appearance that is different from the first appearance.
In accordance with some embodiments, a computer program product is described. The computer program product comprises one or more programs configured to be executed by one or more processors of a computer system that is in communication with a display generation component and one or more input devices, the one or more programs including instructions for: detecting, via the one or more input devices, a request to use a feature on the computer system; and in response to detecting the request to use the feature on the computer system, displaying, via the display generation component, a tutorial for using the feature that includes a virtual demonstration of the feature, including: in accordance with a determination that a property of the computer system has a first value, displaying the virtual demonstration having a first appearance; and in accordance with a determination that the property of the computer system has a second value, displaying the virtual demonstration having a second appearance that is different from the first appearance.
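For illustration only, the following Swift sketch shows the branching described above, in which the appearance of a tutorial's virtual demonstration is selected based on the value of a property of the computer system. The property (a device color) and the names used here are hypothetical assumptions for this sketch.

```swift
// Hypothetical sketch: the virtual demonstration shown in a feature tutorial
// takes a first appearance when a device property has a first value and a
// second appearance when the property has a second value.
enum DeviceColor { case silver, spaceGray, gold }

struct VirtualDemonstration {
    var modelImageName: String
}

func demonstration(for deviceColor: DeviceColor) -> VirtualDemonstration {
    switch deviceColor {
    case .silver:    return VirtualDemonstration(modelImageName: "demo_device_silver")
    case .spaceGray: return VirtualDemonstration(modelImageName: "demo_device_space_gray")
    case .gold:      return VirtualDemonstration(modelImageName: "demo_device_gold")
    }
}

print(demonstration(for: .spaceGray).modelImageName)  // the demonstration resembles the user's own device
```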
Executable instructions for performing these functions are, optionally, included in a non-transitory computer-readable storage medium or other computer program product configured for execution by one or more processors. Executable instructions for performing these functions are, optionally, included in a transitory computer-readable storage medium or other computer program product configured for execution by one or more processors.
Thus, devices are provided with faster, more efficient methods and interfaces for managing a live video communication session, thereby increasing the effectiveness, efficiency, and user satisfaction with such devices. Such methods and interfaces may complement or replace other methods for managing a live video communication session.
For a better understanding of the various described embodiments, reference should be made to the Description of Embodiments below, in conjunction with the following drawings in which like reference numerals refer to corresponding parts throughout the figures.
The following description sets forth exemplary methods, parameters, and the like. It should be recognized, however, that such description is not intended as a limitation on the scope of the present disclosure but is instead provided as a description of exemplary embodiments.
There is a need for electronic devices that provide efficient methods and interfaces for managing a live video communication session and/or managing digital content. For example, there is a need for electronic devices to improve the sharing of content. Such techniques can reduce the cognitive burden on a user who shares content during a live video communication session and/or manages digital content in an electronic document, thereby enhancing productivity. Further, such techniques can reduce processor and battery power otherwise wasted on redundant user inputs.
The processes described below enhance the operability of the devices and make the user-device interfaces more efficient (e.g., by helping the user to provide proper inputs and reducing user mistakes when operating/interacting with the device) through various techniques, including by providing improved visual feedback to the user, reducing the number of inputs needed to perform an operation, providing additional control options without cluttering the user interface with additional displayed controls, performing an operation when a set of conditions has been met without requiring further user input, improving efficiency in managing digital content, improving collaboration between users in a live communication session, improving the live communication session experience, and/or additional techniques. These techniques also reduce power usage and improve battery life of the device by enabling the user to use the device more quickly and efficiently.
In addition, in methods described herein where one or more steps are contingent upon one or more conditions having been met, it should be understood that the described method can be repeated in multiple repetitions so that over the course of the repetitions all of the conditions upon which steps in the method are contingent have been met in different repetitions of the method. For example, if a method requires performing a first step if a condition is satisfied, and a second step if the condition is not satisfied, then a person of ordinary skill would appreciate that the claimed steps are repeated until the condition has been both satisfied and not satisfied, in no particular order. Thus, a method described with one or more steps that are contingent upon one or more conditions having been met could be rewritten as a method that is repeated until each of the conditions described in the method has been met. This, however, is not required of system or computer readable medium claims where the system or computer readable medium contains instructions for performing the contingent operations based on the satisfaction of the corresponding one or more conditions and thus is capable of determining whether the contingency has or has not been satisfied without explicitly repeating steps of a method until all of the conditions upon which steps in the method are contingent have been met. A person having ordinary skill in the art would also understand that, similar to a method with contingent steps, a system or computer readable storage medium can repeat the steps of a method as many times as are needed to ensure that all of the contingent steps have been performed.
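As a minimal illustration of this point (with hypothetical names that are not part of any claim), a method with a contingent step can simply be performed repeatedly; across the repetitions the condition is eventually both satisfied and not satisfied, so each contingent branch is performed:

```swift
// Minimal sketch: one repetition performs the first step when the condition is
// satisfied and the second step when it is not; repeating the method across
// different observations eventually exercises both branches.
func performMethod(conditionSatisfied: Bool) -> String {
    conditionSatisfied ? "performed first step" : "performed second step"
}

for observation in [true, false, true] {   // condition as observed on each repetition
    print(performMethod(conditionSatisfied: observation))
}
```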
Although the following description uses terms “first,” “second,” etc. to describe various elements, these elements should not be limited by the terms. In some embodiments, these terms are used to distinguish one element from another. For example, a first touch could be termed a second touch, and, similarly, a second touch could be termed a first touch, without departing from the scope of the various described embodiments. In some embodiments, the first touch and the second touch are two separate references to the same touch. In some embodiments, the first touch and the second touch are both touches, but they are not the same touch.
The terminology used in the description of the various described embodiments herein is for the purpose of describing particular embodiments only and is not intended to be limiting. As used in the description of the various described embodiments and the appended claims, the singular forms “a,” “an,” and “the” are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will also be understood that the term “and/or” as used herein refers to and encompasses any and all possible combinations of one or more of the associated listed items. It will be further understood that the terms “includes,” “including,” “comprises,” and/or “comprising,” when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
The term “if” is, optionally, construed to mean “when” or “upon” or “in response to determining” or “in response to detecting,” depending on the context. Similarly, the phrase “if it is determined” or “if [a stated condition or event] is detected” is, optionally, construed to mean “upon determining” or “in response to determining” or “upon detecting [the stated condition or event]” or “in response to detecting [the stated condition or event],” depending on the context.
Embodiments of electronic devices, user interfaces for such devices, and associated processes for using such devices are described. In some embodiments, the device is a portable communications device, such as a mobile telephone, that also contains other functions, such as PDA and/or music player functions. Exemplary embodiments of portable multifunction devices include, without limitation, the iPhone®, iPod Touch®, and iPad® devices from Apple Inc. of Cupertino, California. Other portable electronic devices, such as laptops or tablet computers with touch-sensitive surfaces (e.g., touch screen displays and/or touchpads), are, optionally, used. It should also be understood that, in some embodiments, the device is not a portable communications device, but is a desktop computer with a touch-sensitive surface (e.g., a touch screen display and/or a touchpad). In some embodiments, the electronic device is a computer system that is in communication (e.g., via wireless communication, via wired communication) with a display generation component. The display generation component is configured to provide visual output, such as display via a CRT display, display via an LED display, or display via image projection. In some embodiments, the display generation component is integrated with the computer system. In some embodiments, the display generation component is separate from the computer system. As used herein, “displaying” content includes causing to display the content (e.g., video data rendered or decoded by display controller 156) by transmitting, via a wired or wireless connection, data (e.g., image data or video data) to an integrated or external display generation component to visually produce the content.
In the discussion that follows, an electronic device that includes a display and a touch-sensitive surface is described. It should be understood, however, that the electronic device optionally includes one or more other physical user-interface devices, such as a physical keyboard, a mouse, and/or a joystick.
The device typically supports a variety of applications, such as one or more of the following: a drawing application, a presentation application, a word processing application, a website creation application, a disk authoring application, a spreadsheet application, a gaming application, a telephone application, a video conferencing application, an e-mail application, an instant messaging application, a workout support application, a photo management application, a digital camera application, a digital video camera application, a web browsing application, a digital music player application, and/or a digital video player application.
The various applications that are executed on the device optionally use at least one common physical user-interface device, such as the touch-sensitive surface. One or more functions of the touch-sensitive surface as well as corresponding information displayed on the device are, optionally, adjusted and/or varied from one application to the next and/or within a respective application. In this way, a common physical architecture (such as the touch-sensitive surface) of the device optionally supports the variety of applications with user interfaces that are intuitive and transparent to the user.
Attention is now directed toward embodiments of portable devices with touch-sensitive displays.
As used in the specification and claims, the term “intensity” of a contact on a touch-sensitive surface refers to the force or pressure (force per unit area) of a contact (e.g., a finger contact) on the touch-sensitive surface, or to a substitute (proxy) for the force or pressure of a contact on the touch-sensitive surface. The intensity of a contact has a range of values that includes at least four distinct values and more typically includes hundreds of distinct values (e.g., at least 256). Intensity of a contact is, optionally, determined (or measured) using various approaches and various sensors or combinations of sensors. For example, one or more force sensors underneath or adjacent to the touch-sensitive surface are, optionally, used to measure force at various points on the touch-sensitive surface. In some implementations, force measurements from multiple force sensors are combined (e.g., a weighted average) to determine an estimated force of a contact. Similarly, a pressure-sensitive tip of a stylus is, optionally, used to determine a pressure of the stylus on the touch-sensitive surface. Alternatively, the size of the contact area detected on the touch-sensitive surface and/or changes thereto, the capacitance of the touch-sensitive surface proximate to the contact and/or changes thereto, and/or the resistance of the touch-sensitive surface proximate to the contact and/or changes thereto are, optionally, used as a substitute for the force or pressure of the contact on the touch-sensitive surface. In some implementations, the substitute measurements for contact force or pressure are used directly to determine whether an intensity threshold has been exceeded (e.g., the intensity threshold is described in units corresponding to the substitute measurements). In some implementations, the substitute measurements for contact force or pressure are converted to an estimated force or pressure, and the estimated force or pressure is used to determine whether an intensity threshold has been exceeded (e.g., the intensity threshold is a pressure threshold measured in units of pressure). Using the intensity of a contact as an attribute of a user input allows for user access to additional device functionality that may otherwise not be accessible by the user on a reduced-size device with limited real estate for displaying affordances (e.g., on a touch-sensitive display) and/or receiving user input (e.g., via a touch-sensitive display, a touch-sensitive surface, or a physical/mechanical control such as a knob or a button).
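For illustration only, the following Swift sketch shows one way force measurements from multiple sensors could be combined as a weighted average, as mentioned above, and compared against an intensity threshold. The sensor model, weights, units, and threshold value are hypothetical assumptions for this sketch.

```swift
// Sketch: estimate contact force as a weighted average of several sensor
// readings, then compare the estimate against an intensity threshold.
struct ForceSensorReading {
    var force: Double   // e.g., in newtons (illustrative units)
    var weight: Double  // contribution, e.g., based on proximity to the contact point
}

func estimatedContactForce(from readings: [ForceSensorReading]) -> Double {
    let totalWeight = readings.reduce(0) { $0 + $1.weight }
    guard totalWeight > 0 else { return 0 }
    let weightedSum = readings.reduce(0) { $0 + $1.force * $1.weight }
    return weightedSum / totalWeight
}

let readings = [ForceSensorReading(force: 0.8, weight: 0.7),
                ForceSensorReading(force: 0.5, weight: 0.2),
                ForceSensorReading(force: 0.3, weight: 0.1)]
let intensityThreshold = 0.6
print(estimatedContactForce(from: readings) > intensityThreshold)  // true: the threshold has been exceeded
```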
As used in the specification and claims, the term “tactile output” refers to physical displacement of a device relative to a previous position of the device, physical displacement of a component (e.g., a touch-sensitive surface) of a device relative to another component (e.g., housing) of the device, or displacement of the component relative to a center of mass of the device that will be detected by a user with the user's sense of touch. For example, in situations where the device or the component of the device is in contact with a surface of a user that is sensitive to touch (e.g., a finger, palm, or other part of a user's hand), the tactile output generated by the physical displacement will be interpreted by the user as a tactile sensation corresponding to a perceived change in physical characteristics of the device or the component of the device. For example, movement of a touch-sensitive surface (e.g., a touch-sensitive display or trackpad) is, optionally, interpreted by the user as a “down click” or “up click” of a physical actuator button. In some cases, a user will feel a tactile sensation such as a “down click” or “up click” even when there is no movement of a physical actuator button associated with the touch-sensitive surface that is physically pressed (e.g., displaced) by the user's movements. As another example, movement of the touch-sensitive surface is, optionally, interpreted or sensed by the user as “roughness” of the touch-sensitive surface, even when there is no change in smoothness of the touch-sensitive surface. While such interpretations of touch by a user will be subject to the individualized sensory perceptions of the user, there are many sensory perceptions of touch that are common to a large majority of users. Thus, when a tactile output is described as corresponding to a particular sensory perception of a user (e.g., an “up click,” a “down click,” “roughness”), unless otherwise stated, the generated tactile output corresponds to physical displacement of the device or a component thereof that will generate the described sensory perception for a typical (or average) user.
It should be appreciated that device 100 is only one example of a portable multifunction device, and that device 100 optionally has more or fewer components than shown, optionally combines two or more components, or optionally has a different configuration or arrangement of the components. The various components shown in
Memory 102 optionally includes high-speed random access memory and optionally also includes non-volatile memory, such as one or more magnetic disk storage devices, flash memory devices, or other non-volatile solid-state memory devices. Memory controller 122 optionally controls access to memory 102 by other components of device 100.
Peripherals interface 118 can be used to couple input and output peripherals of the device to CPU 120 and memory 102. The one or more processors 120 run or execute various software programs (such as computer programs (e.g., including instructions)) and/or sets of instructions stored in memory 102 to perform various functions for device 100 and to process data. In some embodiments, peripherals interface 118, CPU 120, and memory controller 122 are, optionally, implemented on a single chip, such as chip 104. In some other embodiments, they are, optionally, implemented on separate chips.
RF (radio frequency) circuitry 108 receives and sends RF signals, also called electromagnetic signals. RF circuitry 108 converts electrical signals to/from electromagnetic signals and communicates with communications networks and other communications devices via the electromagnetic signals. RF circuitry 108 optionally includes well-known circuitry for performing these functions, including but not limited to an antenna system, an RF transceiver, one or more amplifiers, a tuner, one or more oscillators, a digital signal processor, a CODEC chipset, a subscriber identity module (SIM) card, memory, and so forth. RF circuitry 108 optionally communicates with networks, such as the Internet, also referred to as the World Wide Web (WWW), an intranet and/or a wireless network, such as a cellular telephone network, a wireless local area network (LAN) and/or a metropolitan area network (MAN), and other devices by wireless communication. The RF circuitry 108 optionally includes well-known circuitry for detecting near field communication (NFC) fields, such as by a short-range communication radio. The wireless communication optionally uses any of a plurality of communications standards, protocols, and technologies, including but not limited to Global System for Mobile Communications (GSM), Enhanced Data GSM Environment (EDGE), high-speed downlink packet access (HSDPA), high-speed uplink packet access (HSUPA), Evolution, Data-Only (EV-DO), HSPA, HSPA+, Dual-Cell HSPA (DC-HSPDA), long term evolution (LTE), near field communication (NFC), wideband code division multiple access (W-CDMA), code division multiple access (CDMA), time division multiple access (TDMA), Bluetooth, Bluetooth Low Energy (BTLE), Wireless Fidelity (Wi-Fi) (e.g., IEEE 802.11a, IEEE 802.11b, IEEE 802.11g, IEEE 802.11n, and/or IEEE 802.11ac), voice over Internet Protocol (VoIP), Wi-MAX, a protocol for e-mail (e.g., Internet message access protocol (IMAP) and/or post office protocol (POP)), instant messaging (e.g., extensible messaging and presence protocol (XMPP), Session Initiation Protocol for Instant Messaging and Presence Leveraging Extensions (SIMPLE), Instant Messaging and Presence Service (IMPS)), and/or Short Message Service (SMS), or any other suitable communication protocol, including communication protocols not yet developed as of the filing date of this document.
Audio circuitry 110, speaker 111, and microphone 113 provide an audio interface between a user and device 100. Audio circuitry 110 receives audio data from peripherals interface 118, converts the audio data to an electrical signal, and transmits the electrical signal to speaker 111. Speaker 111 converts the electrical signal to human-audible sound waves. Audio circuitry 110 also receives electrical signals converted by microphone 113 from sound waves. Audio circuitry 110 converts the electrical signal to audio data and transmits the audio data to peripherals interface 118 for processing. Audio data is, optionally, retrieved from and/or transmitted to memory 102 and/or RF circuitry 108 by peripherals interface 118. In some embodiments, audio circuitry 110 also includes a headset jack (e.g., 212,
I/O subsystem 106 couples input/output peripherals on device 100, such as touch screen 112 and other input control devices 116, to peripherals interface 118. I/O subsystem 106 optionally includes display controller 156, optical sensor controller 158, depth camera controller 169, intensity sensor controller 159, haptic feedback controller 161, and one or more input controllers 160 for other input or control devices. The one or more input controllers 160 receive/send electrical signals from/to other input control devices 116. The other input control devices 116 optionally include physical buttons (e.g., push buttons, rocker buttons, etc.), dials, slider switches, joysticks, click wheels, and so forth. In some embodiments, input controller(s) 160 are, optionally, coupled to any (or none) of the following: a keyboard, an infrared port, a USB port, and a pointer device such as a mouse. The one or more buttons (e.g., 208,
A quick press of the push button optionally disengages a lock of touch screen 112 or optionally begins a process that uses gestures on the touch screen to unlock the device, as described in U.S. patent application Ser. No. 11/322,549, “Unlocking a Device by Performing Gestures on an Unlock Image,” filed Dec. 23, 2005, U.S. Pat. No. 7,657,849, which is hereby incorporated by reference in its entirety. A longer press of the push button (e.g., 206) optionally turns power to device 100 on or off. The functionality of one or more of the buttons are, optionally, user-customizable. Touch screen 112 is used to implement virtual or soft buttons and one or more soft keyboards.
Touch-sensitive display 112 provides an input interface and an output interface between the device and a user. Display controller 156 receives and/or sends electrical signals from/to touch screen 112. Touch screen 112 displays visual output to the user. The visual output optionally includes graphics, text, icons, video, and any combination thereof (collectively termed “graphics”). In some embodiments, some or all of the visual output optionally corresponds to user-interface objects.
Touch screen 112 has a touch-sensitive surface, sensor, or set of sensors that accepts input from the user based on haptic and/or tactile contact. Touch screen 112 and display controller 156 (along with any associated modules and/or sets of instructions in memory 102) detect contact (and any movement or breaking of the contact) on touch screen 112 and convert the detected contact into interaction with user-interface objects (e.g., one or more soft keys, icons, web pages, or images) that are displayed on touch screen 112. In an exemplary embodiment, a point of contact between touch screen 112 and the user corresponds to a finger of the user.
Touch screen 112 optionally uses LCD (liquid crystal display) technology, LPD (light emitting polymer display) technology, or LED (light emitting diode) technology, although other display technologies are used in other embodiments. Touch screen 112 and display controller 156 optionally detect contact and any movement or breaking thereof using any of a plurality of touch sensing technologies now known or later developed, including but not limited to capacitive, resistive, infrared, and surface acoustic wave technologies, as well as other proximity sensor arrays or other elements for determining one or more points of contact with touch screen 112. In an exemplary embodiment, projected mutual capacitance sensing technology is used, such as that found in the iPhone® and iPod Touch® from Apple Inc. of Cupertino, California.
A touch-sensitive display in some embodiments of touch screen 112 is, optionally, analogous to the multi-touch sensitive touchpads described in the following U.S. Pat. No. 6,323,846 (Westerman et al.), 6,570,557 (Westerman et al.), and/or 6,677,932 (Westerman), and/or U.S. Patent Publication 2002/0015024A1, each of which is hereby incorporated by reference in its entirety. However, touch screen 112 displays visual output from device 100, whereas touch-sensitive touchpads do not provide visual output.
A touch-sensitive display in some embodiments of touch screen 112 is described in the following applications: (1) U.S. patent application Ser. No. 11/381,313, “Multipoint Touch Surface Controller,” filed May 2, 2006; (2) U.S. patent application Ser. No. 10/840,862, “Multipoint Touchscreen,” filed May 6, 2004; (3) U.S. patent application Ser. No. 10/903,964, “Gestures For Touch Sensitive Input Devices,” filed Jul. 30, 2004; (4) U.S. patent application Ser. No. 11/048,264, “Gestures For Touch Sensitive Input Devices,” filed Jan. 31, 2005; (5) U.S. patent application Ser. No. 11/038,590, “Mode-Based Graphical User Interfaces For Touch Sensitive Input Devices,” filed Jan. 18, 2005; (6) U.S. patent application Ser. No. 11/228,758, “Virtual Input Device Placement On A Touch Screen User Interface,” filed Sep. 16, 2005; (7) U.S. patent application Ser. No. 11/228,700, “Operation Of A Computer With A Touch Screen Interface,” filed Sep. 16, 2005; (8) U.S. patent application Ser. No. 11/228,737, “Activating Virtual Keys Of A Touch-Screen Virtual Keyboard,” filed Sep. 16, 2005; and (9) U.S. patent application Ser. No. 11/367,749, “Multi-Functional Hand-Held Device,” filed Mar. 3, 2006. All of these applications are incorporated by reference herein in their entirety.
Touch screen 112 optionally has a video resolution in excess of 100 dpi. In some embodiments, the touch screen has a video resolution of approximately 160 dpi. The user optionally makes contact with touch screen 112 using any suitable object or appendage, such as a stylus, a finger, and so forth. In some embodiments, the user interface is designed to work primarily with finger-based contacts and gestures, which can be less precise than stylus-based input due to the larger area of contact of a finger on the touch screen. In some embodiments, the device translates the rough finger-based input into a precise pointer/cursor position or command for performing the actions desired by the user.
In some embodiments, in addition to the touch screen, device 100 optionally includes a touchpad for activating or deactivating particular functions. In some embodiments, the touchpad is a touch-sensitive area of the device that, unlike the touch screen, does not display visual output. The touchpad is, optionally, a touch-sensitive surface that is separate from touch screen 112 or an extension of the touch-sensitive surface formed by the touch screen.
Device 100 also includes power system 162 for powering the various components. Power system 162 optionally includes a power management system, one or more power sources (e.g., battery, alternating current (AC)), a recharging system, a power failure detection circuit, a power converter or inverter, a power status indicator (e.g., a light-emitting diode (LED)) and any other components associated with the generation, management and distribution of power in portable devices.
Device 100 optionally also includes one or more optical sensors 164.
Device 100 optionally also includes one or more depth camera sensors 175.
In some embodiments, a depth map (e.g., depth map image) contains information (e.g., values) that relates to the distance of objects in a scene from a viewpoint (e.g., a camera, an optical sensor, a depth camera sensor). In one embodiment of a depth map, each depth pixel defines the position in the viewpoint's Z-axis where its corresponding two-dimensional pixel is located. In some embodiments, a depth map is composed of pixels wherein each pixel is defined by a value (e.g., 0-255). For example, the “0” value represents pixels that are located at the most distant place in a “three dimensional” scene and the “255” value represents pixels that are located closest to a viewpoint (e.g., a camera, an optical sensor, a depth camera sensor) in the “three dimensional” scene. In other embodiments, a depth map represents the distance between an object in a scene and the plane of the viewpoint. In some embodiments, the depth map includes information about the relative depth of various features of an object of interest in view of the depth camera (e.g., the relative depth of eyes, nose, mouth, ears of a user's face). In some embodiments, the depth map includes information that enables the device to determine contours of the object of interest in a z direction.
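For illustration only, the following Swift sketch applies the 0-255 convention described above to find the pixel closest to the viewpoint. The data layout and names are hypothetical assumptions for this sketch.

```swift
// Sketch of the depth-map convention: each pixel holds a value in 0...255,
// where 0 is the most distant point in the scene and 255 is the closest to
// the viewpoint. The helper returns the coordinate of the closest pixel.
struct DepthMap {
    var width: Int
    var height: Int
    var pixels: [UInt8]   // row-major, count == width * height

    /// Returns the (x, y) coordinate of the pixel closest to the viewpoint.
    func closestPixel() -> (x: Int, y: Int)? {
        guard let maxIndex = pixels.indices.max(by: { pixels[$0] < pixels[$1] }) else { return nil }
        return (x: maxIndex % width, y: maxIndex / width)
    }
}

let map = DepthMap(width: 3, height: 2, pixels: [10, 40, 200,
                                                 30, 255, 90])
print(map.closestPixel()!)   // (x: 1, y: 1) — the value 255, closest to the camera
```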
Device 100 optionally also includes one or more contact intensity sensors 165.
Device 100 optionally also includes one or more proximity sensors 166.
Device 100 optionally also includes one or more tactile output generators 167.
Device 100 optionally also includes one or more accelerometers 168.
In some embodiments, the software components stored in memory 102 include operating system 126, communication module (or set of instructions) 128, contact/motion module (or set of instructions) 130, graphics module (or set of instructions) 132, text input module (or set of instructions) 134, Global Positioning System (GPS) module (or set of instructions) 135, and applications (or sets of instructions) 136. Furthermore, in some embodiments, memory 102 (
Operating system 126 (e.g., Darwin, RTXC, LINUX, UNIX, OS X, iOS, WINDOWS, or an embedded operating system such as VxWorks) includes various software components and/or drivers for controlling and managing general system tasks (e.g., memory management, storage device control, power management, etc.) and facilitates communication between various hardware and software components.
Communication module 128 facilitates communication with other devices over one or more external ports 124 and also includes various software components for handling data received by RF circuitry 108 and/or external port 124. External port 124 (e.g., Universal Serial Bus (USB), FIREWIRE, etc.) is adapted for coupling directly to other devices or indirectly over a network (e.g., the Internet, wireless LAN, etc.). In some embodiments, the external port is a multi-pin (e.g., 30-pin) connector that is the same as, or similar to and/or compatible with, the 30-pin connector used on iPod® (trademark of Apple Inc.) devices.
Contact/motion module 130 optionally detects contact with touch screen 112 (in conjunction with display controller 156) and other touch-sensitive devices (e.g., a touchpad or physical click wheel). Contact/motion module 130 includes various software components for performing various operations related to detection of contact, such as determining if contact has occurred (e.g., detecting a finger-down event), determining an intensity of the contact (e.g., the force or pressure of the contact or a substitute for the force or pressure of the contact), determining if there is movement of the contact and tracking the movement across the touch-sensitive surface (e.g., detecting one or more finger-dragging events), and determining if the contact has ceased (e.g., detecting a finger-up event or a break in contact). Contact/motion module 130 receives contact data from the touch-sensitive surface. Determining movement of the point of contact, which is represented by a series of contact data, optionally includes determining speed (magnitude), velocity (magnitude and direction), and/or an acceleration (a change in magnitude and/or direction) of the point of contact. These operations are, optionally, applied to single contacts (e.g., one finger contacts) or to multiple simultaneous contacts (e.g., “multitouch”/multiple finger contacts). In some embodiments, contact/motion module 130 and display controller 156 detect contact on a touchpad.
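For illustration only, the following Swift sketch estimates the velocity (magnitude and direction) of a point of contact from its two most recent samples, one of the quantities mentioned above. The types and sampling are hypothetical assumptions for this sketch.

```swift
import CoreGraphics

// Sketch: estimate the velocity of the point of contact (points per second,
// with direction) from the last two contact samples.
struct ContactSample {
    var position: CGPoint
    var timestamp: Double   // seconds
}

func velocity(of samples: [ContactSample]) -> CGVector? {
    guard samples.count >= 2 else { return nil }
    let previous = samples[samples.count - 2]
    let latest = samples[samples.count - 1]
    let dt = CGFloat(latest.timestamp - previous.timestamp)
    guard dt > 0 else { return nil }
    return CGVector(dx: (latest.position.x - previous.position.x) / dt,
                    dy: (latest.position.y - previous.position.y) / dt)
}

let samples = [ContactSample(position: CGPoint(x: 100, y: 200), timestamp: 0.00),
               ContactSample(position: CGPoint(x: 130, y: 200), timestamp: 0.05)]
print(velocity(of: samples)!)   // dx: 600, dy: 0 — moving right at 600 points per second
```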
In some embodiments, contact/motion module 130 uses a set of one or more intensity thresholds to determine whether an operation has been performed by a user (e.g., to determine whether a user has “clicked” on an icon). In some embodiments, at least a subset of the intensity thresholds are determined in accordance with software parameters (e.g., the intensity thresholds are not determined by the activation thresholds of particular physical actuators and can be adjusted without changing the physical hardware of device 100). For example, a mouse “click” threshold of a trackpad or touch screen display can be set to any of a large range of predefined threshold values without changing the trackpad or touch screen display hardware. Additionally, in some implementations, a user of the device is provided with software settings for adjusting one or more of the set of intensity thresholds (e.g., by adjusting individual intensity thresholds and/or by adjusting a plurality of intensity thresholds at once with a system-level click “intensity” parameter).
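For illustration only, the following Swift sketch shows a software-adjustable threshold of the kind described above, in which a system-level setting scales a base click threshold without any change to the hardware. The names and numbers are hypothetical assumptions for this sketch.

```swift
// Sketch: a system-level "click intensity" setting adjusts the effective
// threshold used to decide whether a contact registers as a click.
struct IntensitySettings {
    var baseClickThreshold: Double = 0.5
    var systemClickIntensityMultiplier: Double = 1.0   // user-adjustable, e.g., 0.5 (light) ... 1.5 (firm)

    var effectiveClickThreshold: Double {
        baseClickThreshold * systemClickIntensityMultiplier
    }
}

var settings = IntensitySettings()
settings.systemClickIntensityMultiplier = 1.5   // the user prefers a firmer "click"
print(0.6 > settings.effectiveClickThreshold)   // false: an intensity of 0.6 no longer registers as a click
```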
Contact/motion module 130 optionally detects a gesture input by a user. Different gestures on the touch-sensitive surface have different contact patterns (e.g., different motions, timings, and/or intensities of detected contacts). Thus, a gesture is, optionally, detected by detecting a particular contact pattern. For example, detecting a finger tap gesture includes detecting a finger-down event followed by detecting a finger-up (liftoff) event at the same position (or substantially the same position) as the finger-down event (e.g., at the position of an icon). As another example, detecting a finger swipe gesture on the touch-sensitive surface includes detecting a finger-down event followed by detecting one or more finger-dragging events, and subsequently followed by detecting a finger-up (liftoff) event.
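For illustration only, the following Swift sketch classifies a sequence of sub-events as a tap or a swipe using the contact patterns described above. The event names and the position tolerance are hypothetical assumptions for this sketch.

```swift
import CoreGraphics

// Sketch: a tap is a finger-down followed by a finger-up at (substantially)
// the same position; a swipe is a finger-down, one or more drags, then a
// finger-up at a different position.
enum TouchEvent {
    case fingerDown(CGPoint)
    case fingerDrag(CGPoint)
    case fingerUp(CGPoint)
}

func classifyGesture(_ events: [TouchEvent], tapTolerance: CGFloat = 10) -> String {
    guard case .fingerDown(let start)? = events.first,
          case .fingerUp(let end)? = events.last else { return "unrecognized" }
    let dx = end.x - start.x
    let dy = end.y - start.y
    let distance = (dx * dx + dy * dy).squareRoot()
    let dragged = events.dropFirst().dropLast().contains {
        if case .fingerDrag = $0 { return true } else { return false }
    }
    if !dragged && distance <= tapTolerance { return "tap" }
    if dragged && distance > tapTolerance { return "swipe" }
    return "unrecognized"
}

print(classifyGesture([.fingerDown(CGPoint(x: 10, y: 10)), .fingerUp(CGPoint(x: 12, y: 11))]))   // tap
print(classifyGesture([.fingerDown(CGPoint(x: 10, y: 10)), .fingerDrag(CGPoint(x: 80, y: 12)),
                       .fingerUp(CGPoint(x: 150, y: 14))]))                                      // swipe
```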
Graphics module 132 includes various known software components for rendering and displaying graphics on touch screen 112 or other display, including components for changing the visual impact (e.g., brightness, transparency, saturation, contrast, or other visual property) of graphics that are displayed. As used herein, the term “graphics” includes any object that can be displayed to a user, including, without limitation, text, web pages, icons (such as user-interface objects including soft keys), digital images, videos, animations, and the like.
In some embodiments, graphics module 132 stores data representing graphics to be used. Each graphic is, optionally, assigned a corresponding code. Graphics module 132 receives, from applications etc., one or more codes specifying graphics to be displayed along with, if necessary, coordinate data and other graphic property data, and then generates screen image data to output to display controller 156.
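For illustration only, the following Swift sketch shows one way a code received from an application could be resolved to a drawable resource together with coordinate data before screen image data is generated. The registry and its contents are hypothetical assumptions for this sketch.

```swift
import CoreGraphics

// Sketch: applications specify a graphic by code plus coordinate data; the
// graphics module resolves the code to a resource it has stored.
struct DrawCommand {
    var resourceName: String
    var origin: CGPoint
}

let graphicsRegistry: [Int: String] = [1: "soft_key_background", 2: "app_icon_phone"]

func drawCommand(forCode code: Int, at origin: CGPoint) -> DrawCommand? {
    guard let resource = graphicsRegistry[code] else { return nil }
    return DrawCommand(resourceName: resource, origin: origin)
}

print(drawCommand(forCode: 2, at: CGPoint(x: 16, y: 32))!)
```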
Haptic feedback module 133 includes various software components for generating instructions used by tactile output generator(s) 167 to produce tactile outputs at one or more locations on device 100 in response to user interactions with device 100.
Text input module 134, which is, optionally, a component of graphics module 132, provides soft keyboards for entering text in various applications (e.g., contacts module 137, e-mail client module 140, IM module 141, browser module 147, and any other application that needs text input).
GPS module 135 determines the location of the device and provides this information for use in various applications (e.g., to telephone module 138 for use in location-based dialing; to camera module 143 as picture/video metadata; and to applications that provide location-based services such as weather widgets, local yellow page widgets, and map/navigation widgets).
Applications 136 optionally include the following modules (or sets of instructions), or a subset or superset thereof:
Examples of other applications 136 that are, optionally, stored in memory 102 include other word processing applications, other image editing applications, drawing applications, presentation applications, JAVA-enabled applications, encryption, digital rights management, voice recognition, and voice replication.
In conjunction with touch screen 112, display controller 156, contact/motion module 130, graphics module 132, and text input module 134, contacts module 137 is, optionally, used to manage an address book or contact list (e.g., stored in application internal state 192 of contacts module 137 in memory 102 or memory 370), including: adding name(s) to the address book; deleting name(s) from the address book; associating telephone number(s), e-mail address(es), physical address(es) or other information with a name; associating an image with a name; categorizing and sorting names; providing telephone numbers or e-mail addresses to initiate and/or facilitate communications by telephone module 138, video conference module 139, e-mail client module 140, or IM module 141; and so forth.
In conjunction with RF circuitry 108, audio circuitry 110, speaker 111, microphone 113, touch screen 112, display controller 156, contact/motion module 130, graphics module 132, and text input module 134, telephone module 138 is, optionally, used to enter a sequence of characters corresponding to a telephone number, access one or more telephone numbers in contacts module 137, modify a telephone number that has been entered, dial a respective telephone number, conduct a conversation, and disconnect or hang up when the conversation is completed. As noted above, the wireless communication optionally uses any of a plurality of communications standards, protocols, and technologies.
In conjunction with RF circuitry 108, audio circuitry 110, speaker 111, microphone 113, touch screen 112, display controller 156, optical sensor 164, optical sensor controller 158, contact/motion module 130, graphics module 132, text input module 134, contacts module 137, and telephone module 138, video conference module 139 includes executable instructions to initiate, conduct, and terminate a video conference between a user and one or more other participants in accordance with user instructions.
In conjunction with RF circuitry 108, touch screen 112, display controller 156, contact/motion module 130, graphics module 132, and text input module 134, e-mail client module 140 includes executable instructions to create, send, receive, and manage e-mail in response to user instructions. In conjunction with image management module 144, e-mail client module 140 makes it very easy to create and send e-mails with still or video images taken with camera module 143.
In conjunction with RF circuitry 108, touch screen 112, display controller 156, contact/motion module 130, graphics module 132, and text input module 134, the instant messaging module 141 includes executable instructions to enter a sequence of characters corresponding to an instant message, to modify previously entered characters, to transmit a respective instant message (for example, using a Short Message Service (SMS) or Multimedia Message Service (MMS) protocol for telephony-based instant messages or using XMPP, SIMPLE, or IMPS for Internet-based instant messages), to receive instant messages, and to view received instant messages. In some embodiments, transmitted and/or received instant messages optionally include graphics, photos, audio files, video files and/or other attachments as are supported in an MMS and/or an Enhanced Messaging Service (EMS). As used herein, “instant messaging” refers to both telephony-based messages (e.g., messages sent using SMS or MMS) and Internet-based messages (e.g., messages sent using XMPP, SIMPLE, or IMPS).
In conjunction with RF circuitry 108, touch screen 112, display controller 156, contact/motion module 130, graphics module 132, text input module 134, GPS module 135, map module 154, and music player module, workout support module 142 includes executable instructions to create workouts (e.g., with time, distance, and/or calorie burning goals); communicate with workout sensors (sports devices); receive workout sensor data; calibrate sensors used to monitor a workout; select and play music for a workout; and display, store, and transmit workout data.
In conjunction with touch screen 112, display controller 156, optical sensor(s) 164, optical sensor controller 158, contact/motion module 130, graphics module 132, and image management module 144, camera module 143 includes executable instructions to capture still images or video (including a video stream) and store them into memory 102, modify characteristics of a still image or video, or delete a still image or video from memory 102.
In conjunction with touch screen 112, display controller 156, contact/motion module 130, graphics module 132, text input module 134, and camera module 143, image management module 144 includes executable instructions to arrange, modify (e.g., edit), or otherwise manipulate, label, delete, present (e.g., in a digital slide show or album), and store still and/or video images.
In conjunction with RF circuitry 108, touch screen 112, display controller 156, contact/motion module 130, graphics module 132, and text input module 134, browser module 147 includes executable instructions to browse the Internet in accordance with user instructions, including searching, linking to, receiving, and displaying web pages or portions thereof, as well as attachments and other files linked to web pages.
In conjunction with RF circuitry 108, touch screen 112, display controller 156, contact/motion module 130, graphics module 132, text input module 134, e-mail client module 140, and browser module 147, calendar module 148 includes executable instructions to create, display, modify, and store calendars and data associated with calendars (e.g., calendar entries, to-do lists, etc.) in accordance with user instructions.
In conjunction with RF circuitry 108, touch screen 112, display controller 156, contact/motion module 130, graphics module 132, text input module 134, and browser module 147, widget modules 149 are mini-applications that are, optionally, downloaded and used by a user (e.g., weather widget 149-1, stocks widget 149-2, calculator widget 149-3, alarm clock widget 149-4, and dictionary widget 149-5) or created by the user (e.g., user-created widget 149-6). In some embodiments, a widget includes an HTML (Hypertext Markup Language) file, a CSS (Cascading Style Sheets) file, and a JavaScript file. In some embodiments, a widget includes an XML (Extensible Markup Language) file and a JavaScript file (e.g., Yahoo! Widgets).
In conjunction with RF circuitry 108, touch screen 112, display controller 156, contact/motion module 130, graphics module 132, text input module 134, and browser module 147, the widget creator module 150 is, optionally, used by a user to create widgets (e.g., turning a user-specified portion of a web page into a widget).
In conjunction with touch screen 112, display controller 156, contact/motion module 130, graphics module 132, and text input module 134, search module 151 includes executable instructions to search for text, music, sound, image, video, and/or other files in memory 102 that match one or more search criteria (e.g., one or more user-specified search terms) in accordance with user instructions.
In conjunction with touch screen 112, display controller 156, contact/motion module 130, graphics module 132, audio circuitry 110, speaker 111, RF circuitry 108, and browser module 147, video and music player module 152 includes executable instructions that allow the user to download and play back recorded music and other sound files stored in one or more file formats, such as MP3 or AAC files, and executable instructions to display, present, or otherwise play back videos (e.g., on touch screen 112 or on an external, connected display via external port 124). In some embodiments, device 100 optionally includes the functionality of an MP3 player, such as an iPod (trademark of Apple Inc.).
In conjunction with touch screen 112, display controller 156, contact/motion module 130, graphics module 132, and text input module 134, notes module 153 includes executable instructions to create and manage notes, to-do lists, and the like in accordance with user instructions.
In conjunction with RF circuitry 108, touch screen 112, display controller 156, contact/motion module 130, graphics module 132, text input module 134, GPS module 135, and browser module 147, map module 154 is, optionally, used to receive, display, modify, and store maps and data associated with maps (e.g., driving directions, data on stores and other points of interest at or near a particular location, and other location-based data) in accordance with user instructions.
In conjunction with touch screen 112, display controller 156, contact/motion module 130, graphics module 132, audio circuitry 110, speaker 111, RF circuitry 108, text input module 134, e-mail client module 140, and browser module 147, online video module 155 includes instructions that allow the user to access, browse, receive (e.g., by streaming and/or download), play back (e.g., on the touch screen or on an external, connected display via external port 124), send an e-mail with a link to a particular online video, and otherwise manage online videos in one or more file formats, such as H.264. In some embodiments, instant messaging module 141, rather than e-mail client module 140, is used to send a link to a particular online video. Additional description of the online video application can be found in U.S. Provisional Patent Application No. 60/936,562, “Portable Multifunction Device, Method, and Graphical User Interface for Playing Online Videos,” filed Jun. 20, 2007, and U.S. patent application Ser. No. 11/968,067, “Portable Multifunction Device, Method, and Graphical User Interface for Playing Online Videos,” filed Dec. 31, 2007, the contents of which are hereby incorporated by reference in their entirety.
Each of the above-identified modules and applications corresponds to a set of executable instructions for performing one or more functions described above and the methods described in this application (e.g., the computer-implemented methods and other information processing methods described herein). These modules (e.g., sets of instructions) need not be implemented as separate software programs (such as computer programs (e.g., including instructions)), procedures, or modules, and thus various subsets of these modules are, optionally, combined or otherwise rearranged in various embodiments. For example, video player module is, optionally, combined with music player module into a single module (e.g., video and music player module 152,
In some embodiments, device 100 is a device where operation of a predefined set of functions on the device is performed exclusively through a touch screen and/or a touchpad. By using a touch screen and/or a touchpad as the primary input control device for operation of device 100, the number of physical input control devices (such as push buttons, dials, and the like) on device 100 is, optionally, reduced.
The predefined set of functions that are performed exclusively through a touch screen and/or a touchpad optionally include navigation between user interfaces. In some embodiments, the touchpad, when touched by the user, navigates device 100 to a main, home, or root menu from any user interface that is displayed on device 100. In such embodiments, a “menu button” is implemented using a touchpad. In some other embodiments, the menu button is a physical push button or other physical input control device instead of a touchpad.
Event sorter 170 receives event information and determines the application 136-1 and application view 191 of application 136-1 to which to deliver the event information. Event sorter 170 includes event monitor 171 and event dispatcher module 174. In some embodiments, application 136-1 includes application internal state 192, which indicates the current application view(s) displayed on touch-sensitive display 112 when the application is active or executing. In some embodiments, device/global internal state 157 is used by event sorter 170 to determine which application(s) is (are) currently active, and application internal state 192 is used by event sorter 170 to determine application views 191 to which to deliver event information.
In some embodiments, application internal state 192 includes additional information, such as one or more of: resume information to be used when application 136-1 resumes execution, user interface state information that indicates information being displayed or that is ready for display by application 136-1, a state queue for enabling the user to go back to a prior state or view of application 136-1, and a redo/undo queue of previous actions taken by the user.
Event monitor 171 receives event information from peripherals interface 118. Event information includes information about a sub-event (e.g., a user touch on touch-sensitive display 112, as part of a multi-touch gesture). Peripherals interface 118 transmits information it receives from I/O subsystem 106 or a sensor, such as proximity sensor 166, accelerometer(s) 168, and/or microphone 113 (through audio circuitry 110). Information that peripherals interface 118 receives from I/O subsystem 106 includes information from touch-sensitive display 112 or a touch-sensitive surface.
In some embodiments, event monitor 171 sends requests to the peripherals interface 118 at predetermined intervals. In response, peripherals interface 118 transmits event information. In other embodiments, peripherals interface 118 transmits event information only when there is a significant event (e.g., receiving an input above a predetermined noise threshold and/or for more than a predetermined duration).
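For illustration only, the following Swift sketch filters raw input down to the "significant" events described above, using a noise threshold and a minimum duration. The names and numbers are hypothetical assumptions for this sketch.

```swift
// Sketch: only inputs above a noise threshold, or lasting longer than a
// minimum duration, are treated as significant and forwarded to the event monitor.
struct RawInput {
    var magnitude: Double     // arbitrary units
    var duration: Double      // seconds
}

func isSignificant(_ input: RawInput,
                   noiseThreshold: Double = 0.2,
                   minimumDuration: Double = 0.03) -> Bool {
    input.magnitude > noiseThreshold || input.duration > minimumDuration
}

let inputs = [RawInput(magnitude: 0.05, duration: 0.01),   // below both limits: dropped as noise
              RawInput(magnitude: 0.7,  duration: 0.02)]   // significant: forwarded
print(inputs.filter { isSignificant($0) })
```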
In some embodiments, event sorter 170 also includes a hit view determination module 172 and/or an active event recognizer determination module 173.
Hit view determination module 172 provides software procedures for determining where a sub-event has taken place within one or more views when touch-sensitive display 112 displays more than one view. Views are made up of controls and other elements that a user can see on the display.
Another aspect of the user interface associated with an application is a set of views, sometimes herein called application views or user interface windows, in which information is displayed and touch-based gestures occur. The application views (of a respective application) in which a touch is detected optionally correspond to programmatic levels within a programmatic or view hierarchy of the application. For example, the lowest level view in which a touch is detected is, optionally, called the hit view, and the set of events that are recognized as proper inputs are, optionally, determined based, at least in part, on the hit view of the initial touch that begins a touch-based gesture.
Hit view determination module 172 receives information related to sub-events of a touch-based gesture. When an application has multiple views organized in a hierarchy, hit view determination module 172 identifies a hit view as the lowest view in the hierarchy which should handle the sub-event. In most circumstances, the hit view is the lowest level view in which an initiating sub-event occurs (e.g., the first sub-event in the sequence of sub-events that form an event or potential event). Once the hit view is identified by the hit view determination module 172, the hit view typically receives all sub-events related to the same touch or input source for which it was identified as the hit view.
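For illustration only, the hit-view rule described above can be sketched in Swift as a recursive search for the deepest view whose bounds contain the initial touch location. The `View` type and the `hitView(for:in:)` function below are hypothetical names introduced for this sketch; they are not hit view determination module 172 itself, and a real implementation would also account for hidden views, coordinate conversions, and hit-testing overrides.

```swift
import CoreGraphics

/// A minimal stand-in for a view in a view hierarchy (hypothetical type for illustration).
final class View {
    let name: String
    let frame: CGRect          // Frame expressed in the shared (window) coordinate space.
    var subviews: [View] = []

    init(name: String, frame: CGRect) {
        self.name = name
        self.frame = frame
    }
}

/// Returns the lowest (deepest) view in the hierarchy whose frame contains `point`.
/// This mirrors the hit-view rule: the deepest view containing the initiating sub-event
/// becomes the hit view and typically receives all later sub-events of the same touch.
func hitView(for point: CGPoint, in root: View) -> View? {
    guard root.frame.contains(point) else { return nil }
    // Search subviews front-to-back; the first descendant containing the point wins.
    for subview in root.subviews.reversed() {
        if let deeper = hitView(for: point, in: subview) {
            return deeper
        }
    }
    return root
}

// Example: a window containing a content view that contains a button.
let window = View(name: "window", frame: CGRect(x: 0, y: 0, width: 320, height: 480))
let content = View(name: "content", frame: CGRect(x: 0, y: 100, width: 320, height: 300))
let button = View(name: "button", frame: CGRect(x: 20, y: 120, width: 80, height: 44))
window.subviews = [content]
content.subviews = [button]

print(hitView(for: CGPoint(x: 30, y: 130), in: window)?.name ?? "none")  // prints "button"
```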
Active event recognizer determination module 173 determines which view or views within a view hierarchy should receive a particular sequence of sub-events. In some embodiments, active event recognizer determination module 173 determines that only the hit view should receive a particular sequence of sub-events. In other embodiments, active event recognizer determination module 173 determines that all views that include the physical location of a sub-event are actively involved views, and therefore determines that all actively involved views should receive a particular sequence of sub-events. In other embodiments, even if touch sub-events were entirely confined to the area associated with one particular view, views higher in the hierarchy would still remain as actively involved views.
Event dispatcher module 174 dispatches the event information to an event recognizer (e.g., event recognizer 180). In embodiments including active event recognizer determination module 173, event dispatcher module 174 delivers the event information to an event recognizer determined by active event recognizer determination module 173. In some embodiments, event dispatcher module 174 stores in an event queue the event information, which is retrieved by a respective event receiver 182.
In some embodiments, operating system 126 includes event sorter 170. Alternatively, application 136-1 includes event sorter 170. In yet other embodiments, event sorter 170 is a stand-alone module, or a part of another module stored in memory 102, such as contact/motion module 130.
In some embodiments, application 136-1 includes a plurality of event handlers 190 and one or more application views 191, each of which includes instructions for handling touch events that occur within a respective view of the application's user interface. Each application view 191 of the application 136-1 includes one or more event recognizers 180. Typically, a respective application view 191 includes a plurality of event recognizers 180. In other embodiments, one or more of event recognizers 180 are part of a separate module, such as a user interface kit or a higher level object from which application 136-1 inherits methods and other properties. In some embodiments, a respective event handler 190 includes one or more of: data updater 176, object updater 177, GUI updater 178, and/or event data 179 received from event sorter 170. Event handler 190 optionally utilizes or calls data updater 176, object updater 177, or GUI updater 178 to update the application internal state 192. Alternatively, one or more of the application views 191 include one or more respective event handlers 190. Also, in some embodiments, one or more of data updater 176, object updater 177, and GUI updater 178 are included in a respective application view 191.
A respective event recognizer 180 receives event information (e.g., event data 179) from event sorter 170 and identifies an event from the event information. Event recognizer 180 includes event receiver 182 and event comparator 184. In some embodiments, event recognizer 180 also includes at least a subset of: metadata 183, and event delivery instructions 188 (which optionally include sub-event delivery instructions).
Event receiver 182 receives event information from event sorter 170. The event information includes information about a sub-event, for example, a touch or a touch movement. Depending on the sub-event, the event information also includes additional information, such as location of the sub-event. When the sub-event concerns motion of a touch, the event information optionally also includes speed and direction of the sub-event. In some embodiments, events include rotation of the device from one orientation to another (e.g., from a portrait orientation to a landscape orientation, or vice versa), and the event information includes corresponding information about the current orientation (also called device attitude) of the device.
Event comparator 184 compares the event information to predefined event or sub-event definitions and, based on the comparison, determines an event or sub-event, or determines or updates the state of an event or sub-event. In some embodiments, event comparator 184 includes event definitions 186. Event definitions 186 contain definitions of events (e.g., predefined sequences of sub-events), for example, event 1 (187-1), event 2 (187-2), and others. In some embodiments, sub-events in an event (187) include, for example, touch begin, touch end, touch movement, touch cancellation, and multiple touching. In one example, the definition for event 1 (187-1) is a double tap on a displayed object. The double tap, for example, comprises a first touch (touch begin) on the displayed object for a predetermined phase, a first liftoff (touch end) for a predetermined phase, a second touch (touch begin) on the displayed object for a predetermined phase, and a second liftoff (touch end) for a predetermined phase. In another example, the definition for event 2 (187-2) is a dragging on a displayed object. The dragging, for example, comprises a touch (or contact) on the displayed object for a predetermined phase, a movement of the touch across touch-sensitive display 112, and liftoff of the touch (touch end). In some embodiments, the event also includes information for one or more associated event handlers 190.
In some embodiments, event definition 187 includes a definition of an event for a respective user-interface object. In some embodiments, event comparator 184 performs a hit test to determine which user-interface object is associated with a sub-event. For example, in an application view in which three user-interface objects are displayed on touch-sensitive display 112, when a touch is detected on touch-sensitive display 112, event comparator 184 performs a hit test to determine which of the three user-interface objects is associated with the touch (sub-event). If each displayed object is associated with a respective event handler 190, the event comparator uses the result of the hit test to determine which event handler 190 should be activated. For example, event comparator 184 selects an event handler associated with the sub-event and the object triggering the hit test.
In some embodiments, the definition for a respective event (187) also includes delayed actions that delay delivery of the event information until after it has been determined whether the sequence of sub-events does or does not correspond to the event recognizer's event type.
When a respective event recognizer 180 determines that the series of sub-events does not match any of the events in event definitions 186, the respective event recognizer 180 enters an event impossible, event failed, or event ended state, after which it disregards subsequent sub-events of the touch-based gesture. In this situation, other event recognizers, if any, that remain active for the hit view continue to track and process sub-events of an ongoing touch-based gesture.
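To make the sequence-matching behavior concrete, the following Swift sketch feeds sub-events to a toy recognizer that matches one fixed definition (here, a double tap) and enters a failed state as soon as the sequence deviates, after which it ignores further sub-events of the gesture. The `SubEvent`, `RecognizerState`, and `SequenceRecognizer` names, and the omission of the timing ("predetermined phase") constraints, are simplifications made for this illustration; they are not event recognizer 180 or event definitions 186 themselves.

```swift
/// Sub-events of a touch-based gesture (illustrative subset).
enum SubEvent: Equatable {
    case touchBegin, touchMove, touchEnd, touchCancel
}

/// Recognizer states, loosely mirroring the "possible", "recognized", and "failed" notions above.
enum RecognizerState: Equatable {
    case possible, recognized, failed
}

/// A toy recognizer that matches one fixed sequence of sub-events and fails on any deviation.
final class SequenceRecognizer {
    let definition: [SubEvent]
    private var matched = 0
    private(set) var state: RecognizerState = .possible

    init(definition: [SubEvent]) { self.definition = definition }

    /// Feeds the next sub-event. Once failed, later sub-events of the gesture are disregarded.
    func handle(_ subEvent: SubEvent) {
        guard state == .possible else { return }
        if subEvent == definition[matched] {
            matched += 1
            if matched == definition.count { state = .recognized }
        } else {
            state = .failed
        }
    }
}

// Double tap defined as two begin/end pairs; a drag's "move" sub-event causes it to fail.
let doubleTap = SequenceRecognizer(definition: [.touchBegin, .touchEnd, .touchBegin, .touchEnd])
[SubEvent.touchBegin, .touchMove].forEach(doubleTap.handle)
print(doubleTap.state)  // failed — this recognizer now ignores the rest of the gesture
```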
In some embodiments, a respective event recognizer 180 includes metadata 183 with configurable properties, flags, and/or lists that indicate how the event delivery system should perform sub-event delivery to actively involved event recognizers. In some embodiments, metadata 183 includes configurable properties, flags, and/or lists that indicate how event recognizers interact, or are enabled to interact, with one another. In some embodiments, metadata 183 includes configurable properties, flags, and/or lists that indicate whether sub-events are delivered to varying levels in the view or programmatic hierarchy.
In some embodiments, a respective event recognizer 180 activates event handler 190 associated with an event when one or more particular sub-events of an event are recognized. In some embodiments, a respective event recognizer 180 delivers event information associated with the event to event handler 190. Activating an event handler 190 is distinct from sending (and deferred sending) sub-events to a respective hit view. In some embodiments, event recognizer 180 throws a flag associated with the recognized event, and event handler 190 associated with the flag catches the flag and performs a predefined process.
In some embodiments, event delivery instructions 188 include sub-event delivery instructions that deliver event information about a sub-event without activating an event handler. Instead, the sub-event delivery instructions deliver event information to event handlers associated with the series of sub-events or to actively involved views. Event handlers associated with the series of sub-events or with actively involved views receive the event information and perform a predetermined process.
In some embodiments, data updater 176 creates and updates data used in application 136-1. For example, data updater 176 updates the telephone number used in contacts module 137, or stores a video file used in video player module. In some embodiments, object updater 177 creates and updates objects used in application 136-1. For example, object updater 177 creates a new user-interface object or updates the position of a user-interface object. GUI updater 178 updates the GUI. For example, GUI updater 178 prepares display information and sends it to graphics module 132 for display on a touch-sensitive display.
In some embodiments, event handler(s) 190 includes or has access to data updater 176, object updater 177, and GUI updater 178. In some embodiments, data updater 176, object updater 177, and GUI updater 178 are included in a single module of a respective application 136-1 or application view 191. In other embodiments, they are included in two or more software modules.
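A minimal Swift sketch of this division of labor is shown below: a recognized event is handled by delegating, in turn, to a data updater, an object updater, and a GUI updater. The protocol and type names are hypothetical stand-ins used only for illustration; they are not data updater 176, object updater 177, or GUI updater 178 themselves.

```swift
/// Hypothetical protocols mirroring the three updater roles described above.
protocol DataUpdating { func updateData(for event: String) }
protocol ObjectUpdating { func updateObjects(for event: String) }
protocol GUIUpdating { func refreshDisplay() }

struct ContactDataUpdater: DataUpdating {
    func updateData(for event: String) { print("updating stored data for \(event)") }
}
struct ViewObjectUpdater: ObjectUpdating {
    func updateObjects(for event: String) { print("creating or repositioning UI objects for \(event)") }
}
struct DisplayUpdater: GUIUpdating {
    func refreshDisplay() { print("sending prepared display information to the graphics layer") }
}

/// An event handler that delegates to the three updaters once an event is recognized.
struct EventHandler {
    let dataUpdater: DataUpdating
    let objectUpdater: ObjectUpdating
    let guiUpdater: GUIUpdating

    func handle(event: String) {
        dataUpdater.updateData(for: event)
        objectUpdater.updateObjects(for: event)
        guiUpdater.refreshDisplay()
    }
}

let handler = EventHandler(dataUpdater: ContactDataUpdater(),
                           objectUpdater: ViewObjectUpdater(),
                           guiUpdater: DisplayUpdater())
handler.handle(event: "double tap")
```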
It shall be understood that the foregoing discussion regarding event handling of user touches on touch-sensitive displays also applies to other forms of user inputs to operate multifunction devices 100 with input devices, not all of which are initiated on touch screens. For example, mouse movement and mouse button presses, optionally coordinated with single or multiple keyboard presses or holds; contact movements such as taps, drags, scrolls, etc. on touchpads; pen stylus inputs; movement of the device; oral instructions; detected eye movements; biometric inputs; and/or any combination thereof are optionally utilized as inputs corresponding to sub-events which define an event to be recognized.
Device 100 optionally also includes one or more physical buttons, such as “home” or menu button 204. As described previously, menu button 204 is, optionally, used to navigate to any application 136 in a set of applications that are, optionally, executed on device 100. Alternatively, in some embodiments, the menu button is implemented as a soft key in a GUI displayed on touch screen 112.
In some embodiments, device 100 includes touch screen 112, menu button 204, push button 206 for powering the device on/off and locking the device, volume adjustment button(s) 208, subscriber identity module (SIM) card slot 210, headset jack 212, and docking/charging external port 124. Push button 206 is, optionally, used to turn the power on/off on the device by depressing the button and holding the button in the depressed state for a predefined time interval; to lock the device by depressing the button and releasing the button before the predefined time interval has elapsed; and/or to unlock the device or initiate an unlock process. In an alternative embodiment, device 100 also accepts verbal input for activation or deactivation of some functions through microphone 113. Device 100 also, optionally, includes one or more contact intensity sensors 165 for detecting intensity of contacts on touch screen 112 and/or one or more tactile output generators 167 for generating tactile outputs for a user of device 100.
Each of the above-identified elements in
Attention is now directed towards embodiments of user interfaces that are, optionally, implemented on, for example, portable multifunction device 100.
It should be noted that the icon labels illustrated in
Although some of the examples that follow will be given with reference to inputs on touch screen display 112 (where the touch-sensitive surface and the display are combined), in some embodiments, the device detects inputs on a touch-sensitive surface that is separate from the display, as shown in
Additionally, while the following examples are given primarily with reference to finger inputs (e.g., finger contacts, finger tap gestures, finger swipe gestures), it should be understood that, in some embodiments, one or more of the finger inputs are replaced with input from another input device (e.g., a mouse-based input or stylus input). For example, a swipe gesture is, optionally, replaced with a mouse click (e.g., instead of a contact) followed by movement of the cursor along the path of the swipe (e.g., instead of movement of the contact). As another example, a tap gesture is, optionally, replaced with a mouse click while the cursor is located over the location of the tap gesture (e.g., instead of detection of the contact followed by ceasing to detect the contact). Similarly, when multiple user inputs are simultaneously detected, it should be understood that multiple computer mice are, optionally, used simultaneously, or a mouse and finger contacts are, optionally, used simultaneously.
Exemplary techniques for detecting and processing touch intensity are found, for example, in related applications: International Patent Application Serial No. PCT/US2013/040061, titled “Device, Method, and Graphical User Interface for Displaying User Interface Objects Corresponding to an Application,” filed May 8, 2013, published as WIPO Publication No. WO/2013/169849, and International Patent Application Serial No. PCT/US2013/069483, titled “Device, Method, and Graphical User Interface for Transitioning Between Touch Input to Display Output Relationships,” filed Nov. 11, 2013, published as WIPO Publication No. WO/2014/105276, each of which is hereby incorporated by reference in its entirety.
In some embodiments, device 500 has one or more input mechanisms 506 and 508. Input mechanisms 506 and 508, if included, can be physical. Examples of physical input mechanisms include push buttons and rotatable mechanisms. In some embodiments, device 500 has one or more attachment mechanisms. Such attachment mechanisms, if included, can permit attachment of device 500 with, for example, hats, eyewear, earrings, necklaces, shirts, jackets, bracelets, watch straps, chains, trousers, belts, shoes, purses, backpacks, and so forth. These attachment mechanisms permit device 500 to be worn by a user.
Input mechanism 508 is, optionally, a microphone, in some examples. Personal electronic device 500 optionally includes various sensors, such as GPS sensor 532, accelerometer 534, directional sensor 540 (e.g., compass), gyroscope 536, motion sensor 538, and/or a combination thereof, all of which can be operatively connected to I/O section 514.
Memory 518 of personal electronic device 500 can include one or more non-transitory computer-readable storage mediums, for storing computer-executable instructions, which, when executed by one or more computer processors 516, for example, can cause the computer processors to perform the techniques described below, including processes 700, 800, 1000, 1200, 1400, 1500, 1700, and 1900 (
As used here, the term “affordance” refers to a user-interactive graphical user interface object that is, optionally, displayed on the display screen of devices 100, 300, and/or 500 (
As used herein, the term “focus selector” refers to an input element that indicates a current part of a user interface with which a user is interacting. In some implementations that include a cursor or other location marker, the cursor acts as a “focus selector” so that when an input (e.g., a press input) is detected on a touch-sensitive surface (e.g., touchpad 355 in
As used in the specification and claims, the term “characteristic intensity” of a contact refers to a characteristic of the contact based on one or more intensities of the contact. In some embodiments, the characteristic intensity is based on multiple intensity samples. The characteristic intensity is, optionally, based on a predefined number of intensity samples, or a set of intensity samples collected during a predetermined time period (e.g., 0.05, 0.1, 0.2, 0.5, 1, 2, 5, 10 seconds) relative to a predefined event (e.g., after detecting the contact, prior to detecting liftoff of the contact, before or after detecting a start of movement of the contact, prior to detecting an end of the contact, before or after detecting an increase in intensity of the contact, and/or before or after detecting a decrease in intensity of the contact). A characteristic intensity of a contact is, optionally, based on one or more of: a maximum value of the intensities of the contact, a mean value of the intensities of the contact, an average value of the intensities of the contact, a top 10 percentile value of the intensities of the contact, a value at the half maximum of the intensities of the contact, a value at the 90 percent maximum of the intensities of the contact, or the like. In some embodiments, the duration of the contact is used in determining the characteristic intensity (e.g., when the characteristic intensity is an average of the intensity of the contact over time). In some embodiments, the characteristic intensity is compared to a set of one or more intensity thresholds to determine whether an operation has been performed by a user. For example, the set of one or more intensity thresholds optionally includes a first intensity threshold and a second intensity threshold. In this example, a contact with a characteristic intensity that does not exceed the first threshold results in a first operation, a contact with a characteristic intensity that exceeds the first intensity threshold and does not exceed the second intensity threshold results in a second operation, and a contact with a characteristic intensity that exceeds the second threshold results in a third operation. In some embodiments, a comparison between the characteristic intensity and one or more thresholds is used to determine whether or not to perform one or more operations (e.g., whether to perform a respective operation or forgo performing the respective operation), rather than being used to determine whether to perform a first operation or a second operation.
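A short worked example may help: the Swift sketch below computes a characteristic intensity as the mean of the intensity samples (one of the several aggregate choices listed above) and then selects among three operations using two thresholds, matching the three-way example in this paragraph. The sample values, threshold values, and function names are assumptions made for illustration.

```swift
/// Computes a characteristic intensity as the mean of the sampled intensities.
/// (Other choices described above include the maximum or a top-10-percentile value.)
func characteristicIntensity(of samples: [Double]) -> Double {
    guard !samples.isEmpty else { return 0 }
    return samples.reduce(0, +) / Double(samples.count)
}

/// Maps a characteristic intensity to one of three operations using two thresholds:
/// not exceeding the first threshold -> first operation; exceeding the first but not
/// the second -> second operation; exceeding the second -> third operation.
func operation(for intensity: Double,
               firstThreshold: Double,
               secondThreshold: Double) -> String {
    if intensity <= firstThreshold {
        return "first operation"
    } else if intensity <= secondThreshold {
        return "second operation"
    } else {
        return "third operation"
    }
}

let samples = [0.2, 0.35, 0.4, 0.3]                    // intensity samples collected near the press
let intensity = characteristicIntensity(of: samples)  // 0.3125
print(operation(for: intensity, firstThreshold: 0.25, secondThreshold: 0.5))  // "second operation"
```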
In
Device 500A displays, via display 504A, communication UI 520A, which is a user interface for facilitating a communication session (e.g., a video conference session) between device 500B and device 500C. Communication UI 520A includes video feed 525-1A and video feed 525-2A. Video feed 525-1A is a representation of video data captured at device 500B (e.g., using camera 501B) and communicated from device 500B to devices 500A and 500C during the communication session. Video feed 525-2A is a representation of video data captured at device 500C (e.g., using camera 501C) and communicated from device 500C to devices 500A and 500B during the communication session.
Communication UI 520A includes camera preview 550A, which is a representation of video data captured at device 500A via camera 501A. Camera preview 550A represents to User A the prospective video feed of User A that is displayed at respective devices 500B and 500C.
Communication UI 520A includes one or more controls 555A for controlling one or more aspects of the communication session. For example, controls 555A can include controls for muting audio for the communication session, changing a camera view for the communication session (e.g., changing which camera is used for capturing video for the communication session, adjusting a zoom value), terminating the communication session, applying visual effects to the camera view for the communication session, and/or activating one or more modes associated with the communication session. In some embodiments, one or more controls 555A are optionally displayed in communication UI 520A. In some embodiments, one or more controls 555A are displayed separate from camera preview 550A. In some embodiments, one or more controls 555A are displayed overlaying at least a portion of camera preview 550A.
In
Device 500B displays, via touchscreen 504B, communication UI 520B, which is similar to communication UI 520A of device 500A. Communication UI 520B includes video feed 525-1B and video feed 525-2B. Video feed 525-1B is a representation of video data captured at device 500A (e.g., using camera 501A) and communicated from device 500A to devices 500B and 500C during the communication session. Video feed 525-2B is a representation of video data captured at device 500C (e.g., using camera 501C) and communicated from device 500C to devices 500A and 500B during the communication session. Communication UI 520B also includes camera preview 550B, which is a representation of video data captured at device 500B via camera 501B, and one or more controls 555B for controlling one or more aspects of the communication session, similar to controls 555A. Camera preview 550B represents to User B the prospective video feed of User B that is displayed at respective devices 500A and 500C.
In
Device 500C displays, via touchscreen 504C, communication UI 520C, which is similar to communication UI 520A of device 500A and communication UI 520B of device 500B. Communication UI 520C includes video feed 525-1C and video feed 525-2C. Video feed 525-1C is a representation of video data captured at device 500B (e.g., using camera 501B) and communicated from device 500B to devices 500A and 500C during the communication session. Video feed 525-2C is a representation of video data captured at device 500A (e.g., using camera 501A) and communicated from device 500A to devices 500B and 500C during the communication session. Communication UI 520C also includes camera preview 550C, which is a representation of video data captured at device 500C via camera 501C, and one or more controls 555C for controlling one or more aspects of the communication session, similar to controls 555A and 555B. Camera preview 550C represents to User C the prospective video feed of User C that is displayed at respective devices 500A and 500B.
While the diagram depicted in
The embodiment depicted in
Attention is now directed towards embodiments of user interfaces (“UI”) and associated processes that are implemented on an electronic device, such as portable multifunction device 100, device 300, or device 500.
With reference to
With reference to
As shown, user 622 (“John”) is positioned (e.g., seated) in front of desk 621 (and device 600-1) in environment 615. In some examples, user 622 is positioned in front of desk 621 such that user 622 is captured within field-of-view 620 of camera 602. In some embodiments, one or more objects proximate user 622 are positioned such that the objects are captured within field-of-view 620 of camera 602. In some embodiments, both user 622 and objects proximate user 622 are captured within field-of-view 620 simultaneously. For example, as shown, drawing 618 is positioned in front of user 622 (relative to camera 602) on surface 619 such that both user 622 and drawing 618 are captured in field-of-view 620 of camera 602 and displayed in representation 622-1 (displayed by device 600-1) and representation 622-2 (displayed by device 600-2).
Similarly, user 623 (“Jane”) is positioned (e.g., seated) in front of desk 686 (and device 600-2) in environment 685. In some examples, user 623 is positioned in front of desk 686 such that user 623 is captured within field-of-view 688 of camera 682. As shown, user 623 is displayed in representation 623-1 (displayed by device 600-1) and representation 623-2 (displayed by device 600-2).
Generally, during operation, devices 600-1, 600-2 capture image data, which is in turn exchanged between devices 600-1, 600-2 and used by devices 600-1, 600-2 to display various representations of content during the live video communication session. While each of devices 600-1, 600-2 is illustrated, described examples are largely directed to the user interfaces displayed on and/or user inputs detected by device 600-1. It should be understood that, in some examples, electronic device 600-2 operates in a manner analogous to electronic device 600-1 during the live video communication session. In some examples, devices 600-1, 600-2 display similar user interfaces and/or cause similar operations to be performed as those described below.
As will be described in further detail below, in some examples such representations include images that have been modified during the live video communication session to provide improved perspective of surfaces and/or objects within a field-of-view (also referred to herein as “field of view”) of cameras of devices 600-1, 600-2. Images may be modified using any known image processing technique including but not limited to image rotation and/or distortion correction (e.g., image skew). Accordingly, although image data may be captured from a camera having a particular location relative to a user, representations may provide a perspective showing a user (and/or surfaces or objects in an environment of the user) from a perspective different than that of the camera capturing the image data. The embodiments of
With reference to
Video conference interface 604-1 further includes representation 623-1 which in turn includes an image of a physical environment within the field-of-view 688 of camera 682. In some examples, the image of representation 623-1 includes the entire field-of-view 688. In other examples, the image of representation 623-1 includes a portion (e.g., a cropped portion or subset) of the entire field-of-view 688. As shown, in some examples, the image of representation 623-1 includes user 623. As shown, representation 623-1 is displayed at a larger magnitude than representation 622-1. In this manner, user 622 may better observe and/or interact with user 623 during the live communication session.
Device 600-2 displays, on display 683, video conference interface 604-2. Video conference interface 604-2 includes representation 622-2 which in turn includes an image of the physical environment within the field-of-view 620 of camera 602. Video conference interface 604-2 further includes representation 623-2 which in turn includes an image of a physical environment within the field-of-view 688 of camera 682. As shown, representation 622-2 is displayed at a larger magnitude than representation 623-2. In this manner, user 623 may better observe and/or interact with user 622 during the live communication session.
At
In some embodiments, settings interface 606 includes one or more affordances for controlling settings of device 600-1 (e.g., volume, brightness of display, and/or Wi-Fi settings). For example, settings interface 606 includes a view affordance 607-1, which when selected causes device 600-1 to display a view menu, as shown in
As shown in
Generally, view menu 616-1 includes one or more affordances which may be used to manage (e.g., control) the manner in which representations are displayed during a live video communication session. By way of example, selection of a particular affordance may cause device 600-1 to display, or cease displaying, representations in an interface (e.g., interface 604-1 or interface 604-2).
View menu 616-1, for instance, includes a surface view affordance 610, which when selected, causes device 600-1 to display a representation including a modified image of a surface. In some embodiments, when surface view affordance 610 is selected, the user interfaces transition directly to the user interfaces of
In some embodiments, the image of representation 624-1 is modified to provide a desired perspective (e.g., a surface view). In some embodiments, the image of representation 624-1 is modified based on a position of surface 619 relative to camera 602. By way of example, device 600-1 can rotate the image of representation 624-1 a predetermined amount (e.g., 45 degrees, 90 degrees, or 180 degrees) such that surface 619 can be more intuitively viewed in representation 624-1. As shown in
In some embodiments, to ensure that user 623 maintains a view of user 622 while representation 624-2 includes a modified image of surface 619, device 600-2 maintains display of representation 622-2. As shown in
Representations 624-1, 624-2 include an image of drawing 618 that is modified with respect to the position (e.g., location and/or orientation) of drawing 618 relative to camera 602. For example, as depicted in
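One possible way to perform this kind of modification, assuming the four corners of surface 619 have already been located in the camera image, is to apply a perspective (keystone) correction and then rotate the result so the drawing reads right-side up for the remote participant. The Swift sketch below uses Core Image's built-in perspective-correction filter for this purpose; it is an illustrative approach only, and the disclosure is not limited to this particular filter or to Core Image.

```swift
import CoreImage

/// Produces a "surface view" from a camera frame: the quadrilateral bounded by the four
/// corner points (in image coordinates) is warped to a rectangle, then rotated 180 degrees
/// so the surface appears upright to the remote participant.
func surfaceView(from frame: CIImage,
                 topLeft: CGPoint, topRight: CGPoint,
                 bottomLeft: CGPoint, bottomRight: CGPoint) -> CIImage? {
    guard let filter = CIFilter(name: "CIPerspectiveCorrection") else { return nil }
    filter.setValue(frame, forKey: kCIInputImageKey)
    filter.setValue(CIVector(cgPoint: topLeft), forKey: "inputTopLeft")
    filter.setValue(CIVector(cgPoint: topRight), forKey: "inputTopRight")
    filter.setValue(CIVector(cgPoint: bottomLeft), forKey: "inputBottomLeft")
    filter.setValue(CIVector(cgPoint: bottomRight), forKey: "inputBottomRight")
    guard let corrected = filter.outputImage else { return nil }

    // Rotate 180 degrees about the image center so the edge of the desk nearest the user
    // appears at the bottom of the displayed representation.
    let center = CGPoint(x: corrected.extent.midX, y: corrected.extent.midY)
    let rotation = CGAffineTransform(translationX: center.x, y: center.y)
        .rotated(by: .pi)
        .translatedBy(x: -center.x, y: -center.y)
    return corrected.transformed(by: rotation)
}
```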
As described, a representation including a modified image of a surface is provided in response to selection of a surface image affordance (e.g., surface view affordance 610). In some examples, a representation including a modified view of a surface is provided in response to detecting other types of inputs.
With reference to
In some embodiments, the set of criteria includes a requirement that a gesture be performed for at least a threshold amount of time. For example, with reference to
In some embodiments, graphical object 626 includes timer 628 indicating an amount of time gesture 612d has been detected (e.g., a numeric timer, a ring that is filled over time, and/or a bar that is filled over time). In some embodiments, timer 628 also (or alternatively) indicates a threshold amount of time gesture 612d is to continue to be provided to satisfy the set of criteria. In response to gesture 612d satisfying the threshold amount of time (e.g., 0.5 second, 2 seconds, and/or 5 seconds), device 600-1 displays representation 624-1 including a modified image of a surface (
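The hold-to-activate criterion can be implemented by recording when the qualifying gesture is first detected and comparing the elapsed time against the threshold on each processed frame; the same progress value can drive a fill indicator such as timer 628. The sketch below is illustrative only; the type name and the example threshold are assumptions.

```swift
import Foundation

/// Tracks how long a qualifying gesture (e.g., pointing at the surface) has been held and
/// reports progress toward a hold threshold, suitable for driving a fill-style indicator.
struct GestureHoldTracker {
    let threshold: TimeInterval            // e.g., 0.5, 2, or 5 seconds
    private var startedAt: Date?

    /// Call once per processed camera frame with whether the gesture is currently detected.
    /// Returns progress in 0...1; reaching 1 means the hold criterion is satisfied.
    mutating func update(gestureDetected: Bool, now: Date = Date()) -> Double {
        guard gestureDetected else {
            startedAt = nil                // gesture interrupted: reset the timer
            return 0
        }
        if startedAt == nil { startedAt = now }
        let elapsed = now.timeIntervalSince(startedAt!)
        return min(elapsed / threshold, 1)
    }
}

var tracker = GestureHoldTracker(threshold: 0.5)
let start = Date()
_ = tracker.update(gestureDetected: true, now: start)
let progress = tracker.update(gestureDetected: true, now: start.addingTimeInterval(0.3))
print(progress)   // 0.6 — the gesture has been held for 60% of the required time
```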
In some examples, graphical object 626 indicates the type of gesture currently detected by device 600-1. In some examples, graphical object 626 is an outline of a hand performing the detected type of gesture and/or an image of the detected type of gesture. Graphical object 626 can, for instance, include a hand performing a pointing gesture in response to device 600-1 detecting that user 622 is performing a pointing gesture. Additionally or alternatively, the graphical object 626 can, optionally, indicate a zoom level (e.g., zoom level at which the representation of the second portion of the scene is or will be displayed).
In some examples, a representation having an image that is modified is provided in response to one or more speech inputs. For example, during the live communication session, device 600-1 receives a speech input, such as speech input 614 (“Look at my drawing.”) in
In some examples, speech inputs received by device 600-1 can include references to any surface and/or object recognizable by device 600-1, and in response, device 600-1 provides a representation including a modified image of the referenced object or surface. For example, device 600-1 can receive a speech input that references a wall (e.g., a wall behind user 622). In response, device 600-1 provides a representation including a modified image of the wall.
In some embodiments, speech inputs can be used in combination with other types of inputs, such as gestures (e.g., gesture 612d). Accordingly, in some embodiments, device 600-1 displays a modified image of a surface (or object) in response to detecting both a gesture and a speech input corresponding to a request to provide a modified image of the surface.
In some embodiments, a surface view affordance is provided in other manners. With reference to
While displaying options menu 608, device 600-1 detects an input 612f corresponding to a selection of view affordance 607-2. In response to detecting input 612f, device 600-1 displays view menu 616-2, as shown in
While options menu 608 is illustrated as being persistently displayed in video conference interface 604-1 throughout the figures, options menu 608 can be hidden and/or re-displayed at any point during the live video communications session by device 600-1. For example, options menu 608 can be displayed and/or removed from display in response to detecting one or more inputs and/or a period of inactivity by a user.
While detecting an input directed to a surface has been described as causing device 600-1 to display a representation including a modified image of a surface (for example, in response to detecting input 612c of
In some embodiments, prior to operating in the preview mode, device 600-1 detects an input (e.g., input 612c) directed to surface view affordance 610. In response, device 600-1 initiates the preview mode. While operating in the preview mode, device 600-1 displays preview interface 674-1. Preview interface 674-1 includes a left scroll affordance 634-2, a right scroll affordance 634-1, and preview 636.
In some embodiments, selection of the left scroll affordance causes device 600-1 to change (e.g., replace) preview 636. For example, selection of the left scroll affordance 634-2 or the right scroll affordance 634-1 causes device 600-1 to cycle through various images (image of a user, unmodified image of a surface, and/or modified image of surface 619) such that a user can select a particular perspective to be shared upon exiting the preview mode, for instance, in response to detecting an input directed to preview 636. Additionally or alternatively, these techniques can be used to cycle through and/or select a particular surface (e.g., vertical and/or horizontal surface) and/or particular portion (e.g., cropped portion or subset) in the field-of-view.
As shown, in some embodiments, preview user interface 674-1 is displayed at device 600-1 and is not displayed at device 600-2. For example, device 600-2 displays video conference interface 604-2 (including representation 622-2) while device 600-1 displays preview interface 674-1. As such, preview user interface 674-1 allows user 622 to select a view prior to sharing the view with user 623.
In some embodiments, region 636-1 and region 636-2 correspond to respective portions of representation 676. For example, as shown, region 636-1 corresponds to an upper portion of representation 676 (e.g., a portion including an upper body of user 622), and region 636-2 corresponds to a lower portion of representation 676 (e.g., a portion including a lower body of user 622 and/or drawing 618).
In some embodiments, region 636-1 and region 636-2 are displayed as distinct regions (e.g., non-overlapping regions). In some embodiments, region 636-1 and region 636-2 overlap. Additionally or alternatively, one or more graphical objects 638-1 (e.g., lines, boxes, and/or dashes) can distinguish (e.g., visually distinguish) region 636-1 from region 636-2.
In some embodiments, preview interface 674-2 includes one or more graphical objects to indicate whether a region is active or inactive. In the example of
When active, a region is shared with one or more other users of a live video communication session. For example, with reference to
While displaying interface 674-2, device 600-1 detects an input 612i at a location corresponding to region 636-2. Input 612i is a touch input in some embodiments. In response to detecting input 612i, device 600-1 activates region 636-2. As a result, device 600-2 displays a representation including a modified image of surface 619, such as representation 624-2. In some embodiments, region 636-1 remains active in response to input 612i (e.g., user 623 can see user 622, for example, in representation 622-2). Optionally, in some embodiments, device 600-1 deactivates region 636-1 in response to input 612i (e.g., user 623 can no longer see user 622, for example, in representation 622-2).
While the example of
In some embodiments, a plurality of regions are active (and/or can be activated). For example, as shown, device 600-1 displays regions 636a-636i, of which regions 636a-f are active. As a result, device 600-2 displays representation 622-2.
In some embodiments, device 600-1 modifies an image of a surface having any type of orientation, including any angle (e.g., between zero and ninety degrees) with respect to gravity. For example, device 600-1 can modify an image of a surface when the surface is a horizontal surface (e.g., a surface that is in a plane that is within the range of 70 to 110 degrees of the direction of gravity). As another example, device 600-1 can modify an image of a surface when the surface is a vertical surface (e.g., a surface that is in a plane that is within 30 degrees of the direction of gravity).
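This horizontal/vertical distinction reduces to a test of the angle between the surface's plane and the direction of gravity, using the example ranges given above (roughly 70 to 110 degrees for a table-like surface, within 30 degrees for a wall-like surface). The function and enumeration below are hypothetical names used only to illustrate that test.

```swift
enum SurfaceOrientation {
    case horizontal, vertical, other
}

/// Classifies a surface by the angle (in degrees) between its plane and the direction of
/// gravity: about 90 degrees indicates a table-like surface, about 0 degrees a wall-like one.
func classifySurface(angleFromGravity degrees: Double) -> SurfaceOrientation {
    switch degrees {
    case 70...110:
        return .horizontal     // e.g., desk surface 619
    case 0...30:
        return .vertical       // e.g., a wall or freestanding whiteboard
    default:
        return .other
    }
}

print(classifySurface(angleFromGravity: 88))   // horizontal
print(classifySurface(angleFromGravity: 12))   // vertical
```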
While displaying interface 674-3, device 600-1 detects input 612j at a location corresponding to region 636h. In response to detecting input 612j, device 600-1 activates region 636h. As a result, device 600-2 displays a representation including a modified image of surface 619, such as representation 624-2. In some embodiments, regions 636a-f remain active in response to input 612j (e.g., user 623 can see user 622, for example, in representation 622-2). Optionally, in some embodiments, device 600-1 deactivates regions 636a-f in response to input 612j (e.g., user 623 can no longer see user 622, for example, in representation 622-2).
With further reference to
As shown, in some embodiments, device 600-1 displays representation 624-1 including a modified image of a surface. Rotation affordance 648-1, when selected, causes device 600-1 to rotate the image of representation 624-1. For example, while displaying interface 678, device 600-1 detects input 650a corresponding to a selection of rotation affordance 648-1. In response to input 650a, device 600-1 modifies the orientation of the image of representation 624-1 from a first orientation (shown in
Zoom affordance 648-2, when selected, modifies the zoom level of the image of representation 624-1. For example, as depicted in
Additionally or alternatively, in some embodiments, video conference interface 604-1 includes an option to display a magnified view of at least a portion of the image of representation 624-1, as shown in
As depicted in
As depicted in
As shown in
While in some embodiments, as shown in
As depicted in
With reference to
While description is made herein with respect to increasing a zoom level of an image in response to a spread gesture 656e, in some examples, a zoom level of an image is decreased in response to a gesture (e.g., another type of gesture, such as a pinch gesture).
As another example, a gesture in which user 622 curls their fingers can be used to adjust a zoom level. For instance, gesture 668 (e.g., a gesture in which fingers of a user's hand are curled in a direction 668b away from a camera, for example, when the back of the hand 668a is oriented toward the camera) can be used to indicate that a zoom level of an image should be increased (e.g., zoomed in). Gesture 670 (e.g., a gesture in which fingers of a user's hand are curled in a direction 670b toward a camera, for example, when the palm of the hand 668a is oriented toward the camera) can be used to indicate that a zoom level of an image should be decreased (e.g., zoomed out).
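Taken together, these zoom gestures amount to a small mapping from a recognized hand gesture to an adjustment of the current zoom level. The sketch below illustrates that mapping; the enumeration cases, the 25% step size, and the 0.5x-4x bounds are assumptions made for this example.

```swift
/// Hand gestures relevant to zooming (illustrative subset of the gestures described above).
enum ZoomGesture {
    case spread                 // fingers spreading apart: zoom in
    case pinch                  // fingers pinching together: zoom out
    case curlAwayFromCamera     // back of hand toward camera, fingers curling away: zoom in
    case curlTowardCamera       // palm toward camera, fingers curling toward it: zoom out
}

/// Applies a gesture to the current zoom level, stepping by 25% and clamping to sane bounds.
func zoomLevel(after gesture: ZoomGesture, current: Double) -> Double {
    let step = 0.25
    let adjusted: Double
    switch gesture {
    case .spread, .curlAwayFromCamera:
        adjusted = current + step
    case .pinch, .curlTowardCamera:
        adjusted = current - step
    }
    return min(max(adjusted, 0.5), 4.0)    // keep the zoom level within 0.5x to 4x
}

print(zoomLevel(after: .spread, current: 1.0))            // 1.25
print(zoomLevel(after: .curlTowardCamera, current: 1.0))  // 0.75
```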
As an example, as shown in
In some embodiments, device 600-2 is positioned in front of user 623 on desk 686 in a manner that corresponds to the position of surface 619 relative to user 622. Accordingly, user 623 can view representation 624-2 (including an image of surface 619) in a manner analogous to that of user 622 viewing surface 619 in the physical environment.
As shown in
In
In
In
In
In the embodiment depicted in
Video conferencing application window 6120 includes menu option 6126, which can be selected to display different options for sharing content in the live video communication session. In
In
Additionally, the applications, interfaces (e.g., 604-1, 604-2, 6121, and/or 6131) and field-of-views (e.g., 620, 688, 6145-1, and 6147-2) provided by one or more cameras (e.g., 602, 682, and/or 6102) discussed with respect to
At
At
At
At
At
At
At
At
At
At
At
At
At
At
At
At
At
At
At
At
For the sake of clarity, shaded regions 6217 and 6206 and field-of-view 620 have been omitted from
At
At
At
At
Returning briefly to
At
At
At
At
At
At
At
At
As described below, method 700 provides an intuitive way for managing a live video communication session. The method reduces the cognitive burden on a user for managing a live video communication session, thereby creating a more efficient human-machine interface. For battery-operated computing devices, enabling a user to manage a live video communication session faster and more efficiently conserves power and increases the time between battery charges.
In method 700, computer system (e.g., 600-1, 600-2, 6100-1, and/or 6100-2) displays (702), via the display generation component, a live video communication interface (e.g., 604-1, 604-2, 6120, 6121, 6130, and/or 6131) for a live video communication session (e.g., an interface for an incoming and/or outgoing live audio/video communication session). In some embodiments, the live communication session is between at least the computer system (e.g., a first computer system) and a second computer system.
The live video communication interface includes a representation (e.g., 622-1, 622-2, 6124, and/or 6132) of at least a portion of a field-of-view (e.g., 620, 688, 6144, 6146, and/or 6148) of the one or more cameras (e.g., a first representation). In some embodiments, the first representation includes images of a physical environment (e.g., a scene and/or area of the physical environment that is within the field-of-view of the one or more cameras). In some embodiments, the representation includes a portion (e.g., a first cropped portion) of the field-of-view of the one or more cameras. In some embodiments, the representation includes a static image. In some embodiments, the representation includes series of images (e.g., a video). In some embodiments, the representation includes a live (e.g., real-time) video feed of the field-of-view (or a portion thereof) of the one or more cameras. In some embodiments, the field-of-view is based on physical characteristics (e.g., orientation, lens, focal length of the lens, and/or sensor size) of the one or more cameras. In some embodiments, the representation is displayed in a window (e.g., a first window). In some embodiments, the representation of at least the portion of the field-of-view includes an image of a first user (e.g., a face of a first user). In some embodiments, the representation of at least the portion of the field-of-view is provided by an application (e.g., 6110) providing the live video communication session. In some embodiments, the representation of at least the portion of the field-of-view is provided by an application (e.g., 6108) that is different from the application providing the live video communication session (e.g., 6110).
While displaying the live video communication interface, the computer system (e.g., 600-1, 600-2, 6100-1, and/or 6100-2) detects (704), via the one or more input devices (e.g., 601, 683, and/or 6103), one or more user inputs including a user input (e.g., 612c, 612d, 614, 612g, 612i, 612j, 6112, 6118, 6128, and/or 6138) (e.g., a tap on a touch-sensitive surface, a keyboard input, a mouse input, a trackpad input, a gesture (e.g., a hand gesture), and/or an audio input (e.g., a voice command)) directed to a surface (e.g., 619) (e.g., a physical surface; a surface of a desk and/or a surface of an object (e.g., book, paper, tablet) resting on the desk; or a surface of a wall and/or a surface of an object (e.g., a whiteboard or blackboard) on a wall; or other surface (e.g., a freestanding whiteboard or blackboard)) in a scene (e.g., physical environment) that is in the field-of-view of the one or more cameras. In some embodiments, the user input corresponds to a request to display a view of the surface. In some embodiments, detecting user input via the one or more input devices includes obtaining image data of the field-of-view of the one or more cameras that includes a gesture (e.g., a hand gesture, eye gesture, or other body gesture). In some embodiments, the computer system determines, from the image data, that the gesture satisfies predetermined criteria.
In response to detecting the one or more user inputs, the computer system (e.g., 600-1, 600-2, 6100-1, and/or 6100-2) displays, via the display generation component (e.g., 601, 683, and/or 6101), a representation (e.g., image and/or video) of the surface (e.g., 624-1, 624-2, 6140, and/or 6142) (e.g., a second representation). In some embodiments, the representation of the surface is obtained by digitally zooming and/or panning the field-of-view captured by the one or more cameras. In some embodiments, the representation of the surface is obtained by moving (e.g., translating and/or rotating) the one or more cameras. In some embodiments, the second representation is displayed in a window (e.g., a second window, the same window in which the first representation is displayed, or a different window than a window in which the first representation is displayed). In some embodiments, the second window is different from the first window. In some embodiments, the second window (e.g., 6140 and/or 6142) is provided by the application (e.g., 6110) providing the live video communication session (e.g., as shown in
The representation (e.g., 624-1, 624-2, 6140, and/or 6142) of the surface includes an image (e.g., photo, video, and/or live video feed) of the surface (e.g., 619) captured by the one or more cameras (e.g., 602, 682, and/or 6102) that is (or has been) modified (e.g., to correct distortion of the image of the surface) (e.g., adjusted, manipulated, corrected) based on a position (e.g., location and/or orientation) of the surface relative to the one or more cameras (sometimes referred to as the representation of the modified image of the surface). In some embodiments, the image of the surface displayed in the second representation is based on image data that is modified using image processing software (e.g., skewing, rotating, flipping, and/or otherwise manipulating image data captured by the one or more cameras). In some embodiments, the image of the surface displayed in the second representation is modified without physically adjusting the camera (e.g., without rotating the camera, without lifting the camera, without lowering the camera, without adjusting an angle of the camera, and/or without adjusting a physical component (e.g., lens and/or sensor) of the camera). In some embodiments, the image of the surface displayed in the second representation is modified such that the camera appears to be pointed at the surface (e.g., facing the surface, aimed at the surface, pointed along an axis that is normal to the surface). In some embodiments, the image of the surface displayed in the second representation is corrected such that the line of sight of the camera appears to be perpendicular to the surface. In some embodiments, an image of the scene displayed in the first representation is not modified based on the location of the surface relative to the one or more cameras. In some embodiments, the representation of the surface is concurrently displayed with the first representation (e.g., the first representation (e.g., of a user of the computer system) is maintained and an image of the surface is displayed in a separate window). In some embodiments, the image of the surface is automatically modified in real time (e.g., during the live video communication session). In some embodiments, the image of the surface is automatically modified (e.g., without user input) based on the position of the surface relative to the one or more first cameras. Displaying a representation of a surface including an image of the surface that is modified based on a position of the surface relative to the one or more cameras enhances the video communication session experience by providing a clearer view of the surface despite its position relative to the camera without requiring further input from the user, which provides improved visual feedback and reduces the number of inputs needed to perform an operation.
In some embodiments, the computer system (e.g., 600-1 and/or 600-2) receives, during the live video communication session, image data captured by a camera (e.g., 602) (e.g., a wide-angle camera) of the one or more cameras. The computer system displays, via the display generation component, the representation of the at least a portion of the field-of-view (e.g., 622-1 and/or 622-2) (e.g., the first representation) based on the image data captured by the camera. The computer system displays, via the display generation component, the representation of the surface (e.g., 624-1 and/or 624-2) (e.g., the second representation) based on the image data captured by the camera (e.g., the representation of at least a portion of the field-of-view of the one or more cameras and the representation of the surface are based on image data captured by a single (e.g., only one) camera of the one or more cameras). Displaying the representation of the at least a portion of the field-of-view and the representation of the surface captured from the same camera enhances the video communication session experience by displaying content captured by the same camera at different perspectives without requiring input from the user, which reduces the number of inputs (and/or devices) needed to perform an operation.
In some embodiments, the image of the surface is modified (e.g., by the computer system) by rotating the image of the surface relative to the representation of at least a portion of the field-of-view of the one or more cameras (e.g., the image of the surface in 624-2 is rotated 180 degrees relative to representation 622-2). In some embodiments, the representation of the surface is rotated 180 degrees relative to the representation of at least a portion of the field-of-view of the one or more cameras. Rotating the image of the surface relative to the representation of at least a portion of the field-of-view of the one or more cameras enhances the video communication session experience as content associated with the surface can be viewed from a different perspective than other portions of the field-of-view without requiring input from the user, which provides improved visual feedback and reduces the number of inputs needed to perform an operation.
In some embodiments, the image of the surface is rotated based on a position (e.g., location and/or orientation) of the surface (e.g., 619) relative to a user (e.g., 622) (e.g., a position of a user) in the field-of-view of the one or more cameras. In some embodiments, a representation of the user is displayed at a first angle and the image of the surface is rotated to a second angle that is different from the first angle (e.g., even though the image of the user and the image of the surface are captured at the same camera angle). Rotating the image of the surface based on a position of the surface relative to a user in the field-of-view of the one or more cameras enhances the video communication session experience as content associated with the surface can be viewed from a perspective that is based on the position of the surface without requiring input from the user, which provides improved visual feedback and reduces the number of inputs needed to perform an operation.
In some embodiments, in accordance with a determination that the surface is in a first position (e.g., surface 619 is positioned in front of user 622 on desk 621 in
In some embodiments, the representation of the at least a portion of the field-of-view includes a user and is concurrently displayed with the representation of the surface (e.g., representations 622-1 and 624-1 or representations 622-2 and 624-2 in
In some embodiments, in response to detecting the one or more user inputs and prior to displaying the representation of the surface, the computer system displays a preview of image data for the field-of-view of the one or more cameras (e.g., as depicted in
In some embodiments, displaying the preview of image data for the field-of-view of the one or more cameras includes displaying a plurality of selectable options (e.g., 636-1 and/or 636-2 of
In some embodiments, displaying the preview of image data for the field-of-view of the one or more cameras includes displaying a plurality of regions (e.g., distinct regions, non-overlapping regions, rectangular regions, square regions, and/or quadrants) of the preview (e.g., 636-1, 636-2 of
In some embodiments, the one or more user inputs include a gesture (e.g., 612d) (e.g., a body gesture, a hand gesture, a head gesture, an arm gesture, and/or an eye gesture) in the field-of-view of the one or more cameras (e.g., a gesture performed in the field-of-view of the one or more cameras that is directed to the physical position surface). Utilizing a gesture in the field-of-view of the one or more cameras as an input enhances the video communication session experience by allowing a user to control what is displayed without physically touching a device, which provides additional control options without cluttering the user interface.
In some embodiments, the computer system displays a surface-view option (e.g., 610) (e.g., icon, button, affordance, and/or user-interactive graphical interface object), wherein the one or more user inputs include an input (e.g., 612c and/or 612g) directed to the surface-view option (e.g., a tap input on a touch-sensitive surface, a click with a mouse while a cursor is over the surface-view option, or an air gesture while gaze is directed to the surface-view option). In some embodiments, the surface-view option is displayed in the representation of at least a portion of a field-of-view of the one or more cameras. Displaying a surface-view option enhances the video communication session experience by allowing a user to efficiently manage what is displayed in the live video communication interface, which provides additional control options without cluttering the user interface.
In some embodiments, the computer system detects a user input corresponding to selection of the surface-view option. In response to detecting the user input corresponding to selection of the surface-view option, the computer system displays a preview of image data for the field-of-view of the one or more cameras (e.g., as depicted in
In some embodiments, the computer system detects a user input corresponding to selection of the surface-view option (e.g., 612c, 612d, 614, 612g, 612i, and/or 612j). In response to detecting the user input corresponding to selection of the surface-view option, the computer system displays a preview (e.g., 674-2 and/or 674-3) of image data for the field-of-view of the one or more cameras (e.g., as described in
In some embodiments, the surface is a vertical surface (e.g., as depicted in
In some embodiments, the surface is a horizontal surface (e.g., 619) (e.g., table, floor, and/or desk) in the scene (e.g., the surface is within a predetermined angle (e.g., 5 degrees, 10 degrees, or 20 degrees of a plane that is perpendicular to the direction of gravity)). Displaying a representation of a horizontal surface that includes an image of the horizontal surface that is modified based on a position of the horizontal surface relative to the one or more cameras enhances the video communication session experience by providing a clearer view of the horizontal surface despite its position relative to the camera without requiring further input from the user, which provides improved visual feedback and reduces the number of inputs needed to perform an operation.
In some embodiments, displaying the representation of the surface includes displaying a first view of the surface (e.g., 624-1 in
In some embodiments, displaying the first view of the surface includes displaying an image of the surface that is modified in a first manner (e.g., as depicted in
In some embodiments, the representation of the surface is displayed at a first zoom level (e.g., as depicted in
In some embodiments, while displaying the live video communication interface, the computer system displays (e.g., in a user interface (e.g., a menu, a dock region, a home screen, and/or a control center) that includes a plurality of selectable control options that, when selected, perform a function and/or set a parameter of the computer system, in the representation of at least a portion of the field-of-view of the one or more cameras, and/or in the live video communication interface) a selectable control option (e.g., 610, 6126, and/or 6136-1) (e.g., a button, icon, affordance, and/or user-interactive graphical user interface object) that, when selected, causes the representation of the surface to be displayed. In some embodiments, the one or more inputs include a user input corresponding to selection of the control option (e.g., 612c and/or 612g). In some embodiments, the computer system displays (e.g., in the live video communication interface and/or in a user interface of a different application) a second control option that, when selected, causes a representation of a user to be displayed in the live video communication session and causes the representation of the surface to cease being displayed. Displaying the control option that, when selected, displays the representation of the surface enhances the video communication session experience by allowing a user to modify what content is displayed, which provides additional control options without cluttering the user interface.
In some embodiments, the live video communication session is provided by a first application (e.g., 6110) (e.g., a video conferencing application and/or an application for providing an incoming and/or outgoing live audio/video communication session) operating at the computer system (e.g., 600-1, 600-2, 6100-1, and/or 6100-2). In some embodiments, the selectable control option (e.g., 610, 6126, 6136-1, and/or 6136-3) is associated with a second application (e.g., 6108) (e.g., a camera application and/or a presentation application) that is different from the first application.
In some embodiments, in response to detecting the one or more inputs, wherein the one or more inputs include the user input (e.g., 6128 and/or 6138) corresponding to selection of the control option (e.g., 6126 and/or 6136-3), the computer system (e.g., 600-1, 600-2, 6100-1, and/or 6100-2) displays a user interface (e.g., 6140) of the second application (e.g., 6108) (e.g., a first user interface of the second application). Displaying a user interface of the second application in response to detecting the one or more inputs, wherein the one or more inputs include the user input corresponding to selection of the control option, provides access to the second application without having to navigate various menu options, which reduces the number of inputs needed to perform an operation. In some embodiments, displaying the user interface of the second application includes launching, activating, opening, and/or bringing to the foreground the second application. In some embodiments, displaying the user interface of the second application includes displaying the representation of the surface using the second application.
In some embodiments, prior to displaying the live video communication interface (e.g., 6121 and/or 6131) for the live video communication session (e.g., and before the first application (e.g., 6110) is launched), the computer system (e.g., 600-1, 600-2, 6100-1, and/or 6100-2) displays a user interface (e.g., 6114 and/or 6116) of the second application (e.g., 6108) (e.g., a second user interface of the second application). Displaying a user interface of the second application prior to displaying the live video communication interface for the live video communication session, provides access to the second application without having to access the live video communication interface, which provides additional control options without cluttering the user interface. In some embodiments, the second application is launched before the first application is launched. In some embodiments, the first application is launched before the second application is launched.
In some embodiments, the live video communication session (e.g., 6120, 6121, 6130, and/or 6131) is provided using a third application (e.g., 6110) (e.g., a video conferencing application) operating at the computer system (e.g., 600-1, 600-2, 6100-1, and/or 6100-2). In some embodiments, the representation of the surface (e.g., 6116 and/or 6140) is provided by (e.g., displayed using a user interface of) a fourth application (e.g., 6108) that is different from the third application.
In some embodiments, the representation of the surface (e.g., 6116 and/or 6140) is displayed using a user interface (e.g., 6114) of the fourth application (e.g., 6108) (e.g., an application window of the fourth application) that is displayed in the live video communication session (e.g., 6120 and/or 6121) (e.g., the application window of the fourth application is displayed with the live video communication interface that is being displayed using the third application (e.g., 6110)). Displaying the representation of the surface using a user interface of the fourth application that is displayed in the live video communication session provides access to the fourth application, which provides additional control options without cluttering the user interface. In some embodiments, the user interface of the fourth application (e.g., the application window of the fourth application) is separate and distinct from the live video communication interface.
In some embodiments, the computer system (e.g., 600-1, 600-2, 6100-1, and/or 6100-2) displays, via the display generation component (e.g., 601, 683, and/or 6101) a graphical element (e.g., 6108, 6108-1, 6126, and/or 6136-1) corresponding to the fourth application (e.g., a camera application associated with camera application icon 6108) (e.g., a selectable icon, button, affordance, and/or user-interactive graphical user interface object that, when selected, launches, opens, and/or brings to the foreground the fourth application), including displaying the graphical element in a region (e.g., 6104 and/or 6106) that includes (e.g., is configurable to display) a set of one or more graphical elements (e.g., 6110-1) corresponding to an application other than the fourth application (e.g., a set of application icons each corresponding to different applications). Displaying a graphical element corresponding to the fourth application in a region that includes a set of one or more graphical elements corresponding to an application other than the fourth application, provides controls for accessing the fourth application without having to navigate various menu options, which provides additional control options without cluttering the user interface. In some embodiments, the graphical element corresponding to the fourth application is displayed in, added to, and/or displayed adjacent to an application dock (e.g., 6104 and/or 6106) (e.g., a region of a display that includes a plurality of application icons for launching respective applications). In some embodiments, the set of one or more graphical elements includes a graphical element (e.g., 6110-1) that corresponds to the third application (e.g., video conferencing application associated with video conferencing application icon 6110) that provides the live video communication session. In some embodiments, in response to detecting the one or more user inputs (e.g., 6112 and/or 6118) (e.g., including an input on the graphical element corresponding to the fourth application), the computer system displays an animation of the graphical element corresponding to the fourth application, e.g., bouncing in the application dock.
In some embodiments, displaying the representation of the surface includes displaying, via the display generation component, an animation of a transition (e.g., a transition that gradually progresses through a plurality of intermediate states over time including one or more of a pan transition, a zoom transition, and/or a rotation transition) from the display of the representation of at least a portion of a field-of-view of the one or more cameras to the display of the representation of the surface (e.g., as depicted in
In some embodiments, the computer system is in communication (e.g., via the live communication session) with a second computer system (e.g., 600-1 and/or 600-2) (e.g., desktop computer and/or laptop computer) that is in communication with a second display generation component (e.g., 683). In some embodiments, the second computer system displays the representation of at least a portion of the field-of-view of the one or more cameras on the display generation component (e.g., as depicted in
In some embodiments, in response to detecting a change in an orientation of the second computer system (or receiving an indication of a change in an orientation of the second computer system) (e.g., the second computer system is tilted), the second computer system updates the display of the representation of the surface that is displayed at the second display generation component from displaying a first view of the surface to displaying a second view of the surface that is different from the first view (e.g., as depicted in
In some embodiments, displaying the representation of the surface includes displaying an animation of a transition from the display of the representation of the at least a portion of the field-of-view of the one or more cameras to the display of the representation of the surface, wherein the animation includes panning a view of the field-of-view of the one or more cameras and rotating the view of the field-of-view of the one or more cameras (e.g., as depicted in
In some embodiments, displaying the representation of the surface includes displaying an animation of a transition from the display of the representation of the at least a portion of the field-of-view of the one or more cameras to the display of the representation of the surface, wherein the animation includes zooming (e.g., zooming in or zooming out) a view of the field-of-view of the one or more cameras and rotating the view of the field-of-view of the one or more cameras (e.g., as depicted in
Note that details of the processes described above with respect to method 700 (e.g.,
As described below, method 800 provides an intuitive way for managing a live video communication session. The method reduces the cognitive burden on a user for managing a live video communication session, thereby creating a more efficient human-machine interface. For battery-operated computing devices, enabling a user to manage a live video communication session faster and more efficiently conserves power and increases the time between battery charges.
In method 800, the computer system displays (802), via the display generation component, a live video communication interface (e.g., 604-1) for a live video communication session (e.g., an interface for an incoming and/or outgoing live audio/video communication session). In some embodiments, the live communication session is between at least the computer system (e.g., a first computer system) and a second computer system. The live video communication interface includes a representation (e.g., 622-1) (e.g., a first representation) of a first portion of a scene (e.g., a portion (e.g., area) of a physical environment) that is in a field-of-view captured by the one or more cameras. In some embodiments, the first representation is displayed in a window (e.g., a first window). In some embodiments, the first portion of the scene corresponds to a first portion (e.g., a cropped portion (e.g., a first cropped portion)) of the field-of-view captured by the one or more cameras.
While displaying the live video communication interface, the computer system obtains (804), via the one or more cameras, image data for the field-of-view of the one or more cameras, the image data including a first gesture (e.g., 656b) (e.g., a hand gesture). In some embodiments, the gesture is performed within the field-of-view of the one or more cameras. In some embodiments, the image data is for the field-of-view of the one or more cameras. In some embodiments, the gesture is displayed in the representation of the scene. In some embodiments, the gesture is not displayed in the representation of the first scene (e.g., because the gesture is detected in a portion of the field-of-view of the one or more cameras that is not currently being displayed). In some embodiments, while displaying the live video communication interface, audio input is obtained via the one or more input devices. In some embodiments, a determination that the audio input satisfies a set of audio criteria may take the place of (e.g., is in lieu of) the determination that the gesture satisfies the first set of criteria.
In response to obtaining the image data for the field-of-view of the one or more cameras (and/or in response to obtaining the audio input) and in accordance with a determination that the first gesture satisfies a first set of criteria, the computer system displays, via the display generation component, a representation (e.g., 622-2′) (e.g., a second representation) of a second portion of the scene that is in the field-of-view of the one or more cameras, the representation of the second portion of the scene including different visual content from the representation of the first portion of the scene. In some embodiments, the second representation is displayed in a window (e.g., a second window). In some embodiments, the second window is different from the first window. In some embodiments, the first set of criteria is a predetermined set of criteria for recognizing the gesture. In some embodiments, the first set of criteria includes a criterion for a gesture (e.g., movement and/or static pose) of one or more hands of a user (e.g., a single-hand gesture and/or two-hand gesture). In some embodiments, the first set of criteria includes a criterion for position (e.g., location and/or orientation) of the one or more hands (e.g., position of one or more fingers and/or one or more palms) of the user. In some embodiments, the criteria include a criterion for a gesture of a portion of a user's body other than the user's hand(s) (e.g., face, eyes, head, and/or shoulders). In some embodiments, the computer system displays the representation of the second portion of the scene by digitally panning and/or zooming without physically adjusting the one or more cameras. In some embodiments, the representation of the second portion includes visual content that is not included in the representation of the first portion. In some embodiments, the representation of the second portion does not include at least a portion of the visual content that is included in the representation of the first portion. In some embodiments, the representation of the second portion includes at least a portion (but not all) of the visual content included in the first portion (e.g., the second portion and the first portion include some overlapping visual content). In some embodiments, displaying the representation of the second portion includes displaying a portion (e.g., a cropped portion) of the field-of-view of the one or more cameras. In some embodiments, the representation of the first portion and the representation of the second portion are based on the same field-of-view of the one or more cameras (e.g., a single camera). In some embodiments, displaying the representation of the second portion includes transitioning from displaying the representation of the first portion to displaying the representation of the second portion in the same window. In some embodiments, in accordance with a determination that the audio input satisfies a set of audio criteria, the representation of the second portion of the scene is displayed.
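The digital panning and/or zooming described above (i.e., deriving the representation of the second portion from the same camera feed without physically adjusting the one or more cameras) can be sketched as a crop-and-resample step. The following Python sketch assumes the camera frame is available as a NumPy array; the function name and parameter values are illustrative assumptions only, not the specific processing of the described embodiments.

```python
import numpy as np

def digital_pan_zoom(frame, center_xy, zoom):
    """Return a cropped-and-resampled view of `frame`.

    frame     -- H x W x 3 image array from the camera
    center_xy -- (x, y) point the new view should be centered on
    zoom      -- zoom factor (>1 zooms in); 1.0 returns the full frame
    """
    h, w = frame.shape[:2]
    crop_w, crop_h = int(w / zoom), int(h / zoom)
    cx, cy = center_xy
    # Clamp the crop window so it stays inside the frame.
    x0 = min(max(int(cx - crop_w / 2), 0), w - crop_w)
    y0 = min(max(int(cy - crop_h / 2), 0), h - crop_h)
    crop = frame[y0:y0 + crop_h, x0:x0 + crop_w]
    # Nearest-neighbor resample back to the original output size.
    ys = np.arange(h) * crop_h // h
    xs = np.arange(w) * crop_w // w
    return crop[ys][:, xs]

frame = np.zeros((720, 1280, 3), dtype=np.uint8)
second_portion = digital_pan_zoom(frame, center_xy=(900, 400), zoom=2.0)
print(second_portion.shape)  # (720, 1280, 3)
```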
In response to obtaining the image data for the field-of-view of the one or more cameras (and/or in response to obtaining the audio input) and in accordance with a determination that the first gesture satisfies a second set of criteria (e.g., does not satisfy the first set of criteria) different from the first set of criteria, the computer system continues to display (810) (e.g., maintain the display of), via the display generation component, the representation (e.g., the first representation) of the first portion of the scene (e.g., representations 622-1, 622-2 in
In some embodiments, the representation of the first portion of the scene is concurrently displayed with the representation of the second portion of the scene (e.g., representations 622-1, 624-1 in
In some embodiments, in response to obtaining the image data for the field-of-view of the one or more cameras and in accordance with a determination that the first gesture satisfies a third set of criteria different from the first set of criteria and the second set of criteria, the computer system displays, via the display generation component, a representation of a third portion of the scene that is in the field-of-view of the one or more cameras, the representation of the third portion of the scene including different visual content from the representation of the first portion of the scene and different visual content from the representation of the second portion of the scene (e.g., as depicted in
In some embodiments, while displaying the representation of the second portion of the scene, the computer system obtains image data including movement of a hand of a user (e.g., a movement of frame gesture 656c in
In some embodiments, the computer system obtains (e.g., while displaying the representation of the first portion of the scene or the representation of the second portion of the scene) image data including a third gesture (e.g., 612d, 654, 656b, 656c, 656e, 664, 666, 668, and/or 670). In response to obtaining the image data including the third gesture and in accordance with a determination that the third gesture satisfies zooming criteria, the computer system changes a zoom level (e.g., zooming in and/or zooming out) of a respective representation of a portion of the scene (e.g., the representation of the first portion of the scene and/or a zoom level of the representation of the second portion of the scene) from a first zoom level to a second zoom level that is different from the first zoom level (e.g., as depicted in
In some embodiments, the third gesture includes a pointing gesture (e.g., 656b), and wherein changing the zoom level includes zooming into an area of the scene corresponding to the pointing gesture (e.g., as depicted in
In some embodiments, the respective representation displayed at the first zoom level is centered on a first position of the scene, and wherein the respective representation displayed at the second zoom level is centered on the first position of the scene (e.g., in response to gestures 664, 666, 668, or 670 in
In some embodiments, changing the zoom level of the respective representation includes changing a zoom level of a first portion of the respective representation from the first zoom level to the second zoom level and displaying (e.g., maintaining display of) a second portion of the respective representation, the second portion different from the first portion, at the first zoom level (e.g., as depicted in
In some embodiments, in response to obtaining the image data for the field-of-view of the one or more cameras and in accordance with the determination that the first gesture satisfies the first set of criteria, displaying a first graphical indication (e.g., 626) (e.g., text, a graphic, a color, and/or an animation) that a gesture (e.g., a predefined gesture) has been detected. Displaying a first graphical indication that a gesture has been detected in response to obtaining the image data for the field-of-view of the one or more cameras enhances the user interface by providing an indication of when a gesture is detected, which provides improved visual feedback.
In some embodiments, displaying the first graphical indication includes in accordance with a determination that the first gesture includes (e.g., is) a first type of gesture (e.g., framing gesture 656c of
In some embodiments, in response to obtaining the image data for the field-of-view of the one or more cameras and in accordance with the determination that the first gesture satisfies a fourth set of criteria, displaying (e.g., before displaying the representation of the second portion of the scene) a second graphical object (e.g., 626) (e.g., a countdown timer, a ring that is filled in over time, and/or a bar that is filled in over time) indicating a progress toward satisfying a threshold amount of time (e.g., a progress toward transitioning to displaying the representation of the second portion of the scene and/or a countdown of an amount of time until the representation of the second portion of the scene will be displayed). In some embodiments, the first set of criteria includes a criterion that is met if the first gesture is maintained for the threshold amount of time. Displaying a second graphical object indicating a progress toward satisfying a threshold amount of time when the first gesture satisfies a fourth set of criteria enhances the user interface by providing an indication of how long a gesture should be performed before the device executes a requested function, which provides improved visual feedback.
In some embodiments, the first set of criteria includes a criterion that is met if the first gesture is maintained for the threshold amount of time (e.g., as described with reference to FIGS. 6D-6E) (e.g., the computer system displays the representation of the second portion if the first gesture is maintained for the threshold amount of time). Including a criterion in the first set of criteria that is met if the first gesture is maintained for the threshold amount of time enhances the user interface by reducing the number of unwanted operations based on brief, accidental gestures, which reduces the number of inputs needed to cure an unwanted operation.
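One non-limiting way to realize the hold-for-a-threshold behavior and the associated progress indication described in the preceding paragraphs is sketched below in Python; the class name, threshold value, and per-frame detection input are assumptions for illustration. The progress value returned by `update` is what a ring or bar indicator could be filled with.

```python
import time

class GestureHoldTracker:
    """Track how long a gesture has been continuously held."""

    def __init__(self, threshold_s=1.0):
        self.threshold_s = threshold_s
        self._start = None

    def update(self, gesture_detected, now=None):
        """Feed one frame's detection result; return progress in [0, 1]."""
        now = time.monotonic() if now is None else now
        if not gesture_detected:
            self._start = None            # brief, accidental gestures reset
            return 0.0
        if self._start is None:
            self._start = now
        return min((now - self._start) / self.threshold_s, 1.0)

    def satisfied(self, gesture_detected, now=None):
        """True once the gesture has been held for the full threshold."""
        return self.update(gesture_detected, now) >= 1.0

# Example: the gesture is held across three frames spaced 0.6 s apart.
tracker = GestureHoldTracker(threshold_s=1.0)
for t in (0.0, 0.6, 1.2):
    print(round(tracker.update(True, now=t), 2))  # 0.0, 0.6, 1.0
```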
In some embodiments, the second graphical object is a timer (e.g., as described with reference to
In some embodiments, the second graphical object includes an outline of a representation of a gesture (e.g., as described with reference to
In some embodiments, the second graphical object indicates a zoom level (e.g., 662) (e.g., a graphical indication of “1×” and/or “2×” and/or a graphical indication of a zoom level at which the representation of the second portion of the scene is or will be displayed). In some embodiments, the second graphical object is selectable (e.g., a switch, a button, and/or a toggle) that, when selected, selects (e.g., changes) a zoom level of the representation of the second portion of the scene. Displaying the second graphical object as indicating a zoom level enhances the user interface by providing an indication of a current and/or future zoom level, which provides improved visual feedback.
In some embodiments, prior to displaying the representation of the second portion of the scene, the computer system detects an audio input (e.g., 614), wherein the first set of criteria includes a criterion that is based on the audio input (e.g., the first gesture is detected concurrently with the audio input and/or that the audio input meets audio input criteria (e.g., includes a voice command that matches the first gesture)). In some embodiments, in response to detecting the audio input and in accordance with a determination that the audio input satisfies an audio input criteria, the computer system displays the representation of the second portion of the scene (e.g., even if the first gesture does not satisfy the first set of criteria, without detecting the first gesture, the audio input is sufficient (by itself) to cause the computer system to display the representation of the second portion of the scene (e.g., in lieu of detecting the first gesture and a determination that the first gesture satisfies the first set of criteria)). In some embodiments, the criterion based on the audio input must be met in order to satisfy the first set of criteria (e.g., both the first gesture and the audio input are required to cause the computer system to display the representation of the second portion of the scene). Detecting an audio input prior to displaying the representation of the second portion of the scene and utilizing a criterion that is based on the audio input enhances the user interface as a user can control visual content that is displayed by speaking a request, which provides additional control options without cluttering the user interface.
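The relationship between the gesture criteria and the audio input criterion described above can be summarized with a small decision sketch; the predicate names and flags below are hypothetical and merely illustrate the alternative configurations (audio in lieu of the gesture versus audio required in addition to the gesture).

```python
def should_show_second_portion(gesture_ok, audio_ok, audio_required=False,
                               audio_sufficient=True):
    """Decide whether to switch to the second portion of the scene.

    gesture_ok       -- gesture satisfied the first set of criteria
    audio_ok         -- audio input satisfied the audio input criteria
    audio_required   -- if True, both the gesture and the audio are needed
    audio_sufficient -- if True, audio alone can trigger the switch
    """
    if audio_required:
        return gesture_ok and audio_ok
    if audio_sufficient and audio_ok:
        return True
    return gesture_ok

print(should_show_second_portion(gesture_ok=False, audio_ok=True))   # True
print(should_show_second_portion(gesture_ok=True, audio_ok=False,
                                 audio_required=True))               # False
```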
In some embodiments, the first gesture includes a pointing gesture (e.g., 656b). In some embodiments, the representation of the first portion of the scene is displayed at a first zoom level. In some embodiments, displaying the representation of the second portion includes, in accordance with a determination that the pointing gesture is directed to an object in the scene (e.g., 660) (e.g., a book, drawing, electronic device, and/or surface), displaying a representation of the object at a second zoom level different from the first zoom level. In some embodiments, the second zoom level is based on a location and/or size of the object (e.g., a distance of the object from the one or more cameras). For example, the second zoom level can be greater (e.g., larger amount of zoom) for smaller objects or objects that are farther away from the one or more cameras than for larger objects or objects that are closer to the one or more cameras. In some embodiments, a distortion correction (e.g., amount and/or manner of distortion correction) applied to the representation of the object is based on a location and/or size of the object. For example, distortion correction applied to the representation of the object can be greater (e.g., more correction) for larger objects or objects that are closer to the one or more cameras than for smaller objects or objects that are farther from the one or more cameras. Displaying a representation of the object at a second zoom level different from the first zoom level when a pointing gesture is directed to an object in the scene enhances the user interface by allowing a user to zoom into an object without touching the device, which provides additional control options without cluttering the user interface.
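A non-limiting sketch of choosing a zoom level (and a distortion-correction strength) from the apparent size of the pointed-at object follows; the scaling constants, target fraction, and function names are assumptions introduced only for illustration. It reflects the idea above that smaller or more distant objects receive more zoom, while larger or closer objects receive more distortion correction.

```python
def zoom_for_object(object_height_px, frame_height_px,
                    target_fraction=0.6, max_zoom=4.0):
    """Pick a zoom level so the pointed-at object fills roughly
    `target_fraction` of the frame height: smaller or farther objects
    receive a larger zoom, clamped to [1.0, max_zoom]."""
    if object_height_px <= 0:
        return 1.0
    zoom = target_fraction * frame_height_px / object_height_px
    return max(1.0, min(zoom, max_zoom))

def distortion_strength(object_height_px, frame_height_px, k=1.0):
    """Larger or closer objects receive more distortion correction."""
    return k * (object_height_px / frame_height_px)

print(zoom_for_object(object_height_px=120, frame_height_px=720))  # ~3.6
print(zoom_for_object(object_height_px=600, frame_height_px=720))  # 1.0
```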
In some embodiments, the first gesture includes a framing gesture (e.g., 656c) (e.g., two hands making a square). In some embodiments, the representation of the first portion of the scene is displayed at a first zoom level. In some embodiments, displaying the representation of the second portion includes, in accordance with a determination that the framing gesture is directed to (e.g., frames, surrounds, and/or outlines) an object in the scene (e.g., 660) (e.g., a book, drawing, electronic device, and/or surface), displaying a representation of the object at a second zoom level different from the first zoom level (e.g., as depicted in
In some embodiments, the first gesture includes a pointing gesture (e.g., 656d). In some embodiments, displaying the representation of the second portion includes, in accordance with a determination that the pointing gesture is in a first direction, panning image data (e.g., without physically panning the one or more cameras) in the first direction of the pointing gesture (e.g., as depicted in
In some embodiments, displaying the representation of the first portion of the scene includes displaying a representation of a user. In some embodiments, displaying the representation of the second portion includes maintaining display of the representation of the user (e.g., as depicted in
In some embodiments, the first gesture includes (e.g., is) a hand gesture (e.g., 656e). In some embodiments, displaying the representation of the first portion of the scene includes displaying the representation of the first portion of the scene at a first zoom level. In some embodiments, displaying the representation of the second portion of the scene includes displaying the representation of the second portion of the scene at a second zoom level different from the first zoom level (e.g., as depicted in
In some embodiments, the hand gesture to display the representation of the second portion of the scene at the second zoom level includes a hand pose holding up two fingers (e.g., 666) corresponding to an amount of zoom. In some embodiments, in accordance with a determination that the hand gesture includes a hand pose holding up two fingers, the computer system displays the representation of the second portion of the scene at a predetermined zoom level (e.g., 2× zoom). In some embodiments, the computer system displays a representation of the scene at a zoom level that is based on how many fingers are being held up (e.g., one finger for 1× zoom, two fingers for 2× zoom, or three fingers for a 0.5× zoom). In some embodiments, the first set of criteria includes a criterion that is based on a number of fingers being held up in the hand gesture. Utilizing a number of fingers to change a zoom level enhances the user interface by allowing a user to switch between zoom levels quickly and efficiently, which performs an operation when a set of conditions has been met without requiring further user input.
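The finger-count-to-zoom example above can be illustrated with a trivial lookup; the mapping table and function name are assumptions that simply mirror the example values given in this paragraph (one finger for 1×, two fingers for 2×, three fingers for 0.5×).

```python
# Illustrative mapping from number of raised fingers to a zoom level.
FINGER_ZOOM = {1: 1.0, 2: 2.0, 3: 0.5}

def zoom_from_fingers(finger_count, current_zoom):
    """Return the zoom level implied by the hand pose, or keep the
    current zoom level if the pose does not match a known mapping."""
    return FINGER_ZOOM.get(finger_count, current_zoom)

print(zoom_from_fingers(2, current_zoom=1.0))  # 2.0
print(zoom_from_fingers(5, current_zoom=1.0))  # 1.0 (unchanged)
```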
In some embodiments, the hand gesture to display the representation of the second portion of the scene at the second zoom level includes movement (e.g., toward and/or away from the one or more cameras) of a hand corresponding to an amount of zoom (e.g., 668 and/or 670 as depicted in
In some embodiments, the representation of the first portion of the scene includes a representation of a first area of the scene (e.g., 658-1) (e.g., a foreground and/or a user) and a representation of a second area of the scene (e.g., 658-2) (e.g., a background and/or a portion outside of the user). In some embodiments, displaying the representation of the second portion of the scene includes maintaining an appearance of the representation of the first area of the scene and modifying (e.g., darken, tinting, and/or blurring) an appearance of the representation of the second area of the scene (e.g., as depicted in
Note that details of the processes described above with respect to method 800 (e.g.,
At
At
At
Similar to first electronic device 906a, at
At
Further still, at
In some embodiments, electronic devices 906a-906d are configured to modify an image of one or more representations. In some embodiments, modifications are made to images in response to detecting user input. During the live video communication session, for example, first electronic device 906a receives data (e.g., image data, video data, and/or audio data) from electronic devices 906b-906d and in response displays representations 918b-918d based on the received data. In some embodiments, first electronic device 906a thereafter adjusts, transforms, and/or manipulates the data received from electronic devices 906b-906d to modify (e.g., adjust, transform, manipulate, and/or change) an image of representations 918b-918d. For example, in some embodiments, first electronic device 906a applies skew and/or distortion correction to an image received from second electronic device 906b, third electronic device 906c, and/or fourth electronic device 906d. In some examples, modifying an image in this manner allows first electronic device 906a to display one or more of physical environments 904b-904d from a different perspective (e.g., an overhead perspective of surfaces 908b-908d). In some embodiments, first electronic device 906a additionally or alternatively modifies one or more images of representations by applying rotation to the image data received from electronic devices 906b-906d. In some embodiments, first electronic device 906a receives adjusted, transformed, and/or manipulated data from at least one of electronic devices 906b-906d, such that first electronic device 906a displays representations 918b-918d without applying skew, distortion correction, and/or rotation to the image data received from at least one of electronic devices 906b-906d. At
With reference to
At
In response to receiving an indication of gesture 949 (e.g., via image data and/or video data received from second electronic device 906b and/or via data indicative of second electronic device 906b detecting gesture 949) and/or the one or more user inputs provided by users 902b-902d, first electronic device 906a modifies image data so that representations 918b-918d include an enlarged and/or close-up view of surfaces 908b-908d from a perspective of users 902b-902d sitting in front of respective surfaces 908b-908d without moving and/or otherwise changing an orientation of cameras 909b-909d with respect to surfaces 908b-908d. In some embodiments, modifying images of representations in this manner includes applying skew, distortion correction, and/or rotation to image data corresponding to the representations. In some embodiments, the amount of skew and/or distortion correction applied is determined based at least partially on a distance between cameras 909b-909d and respective surfaces 908b-908d. In some such embodiments, first electronic device 906a applies different amounts of skew and/or distortion correction to the data received from each of second electronic device 906b, third electronic device 906c, and fourth electronic device 906d. In some embodiments, first electronic device 906a modifies the data, such that a representation of the physical environment captured via cameras 909b-909d is rotated relative to an actual position of cameras 909b-909d (e.g., representations of surfaces 908b-908d displayed on first communication user interfaces 916a-916d appear rotated 180 degrees and/or from a different perspective relative to an actual position of cameras 909b-909d with respect to surfaces 908b-908d). In some embodiments, first electronic device 906a applies an amount of rotation to the data based on a position of cameras 909b-909d with respect to surfaces 908b-908d, respectively. As such, in some embodiments, first electronic device 906a applies a different amount of rotation to the data received from second electronic device 906b, third electronic device 906c, and/or fourth electronic device 906d.
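The skew/distortion correction and rotation discussed here are commonly implemented as a perspective (homography) warp of the region of the frame that contains the surface. The Python sketch below uses OpenCV as one possible realization; the corner coordinates, output size, and 180-degree rotation are illustrative assumptions rather than the specific processing performed by devices 906a-906d.

```python
import cv2
import numpy as np

def surface_top_down_view(frame, surface_corners_px, out_w=800, out_h=600):
    """Warp the quadrilateral containing the surface into a rectangle,
    simulating an overhead view without moving the camera.

    surface_corners_px -- four (x, y) corners of the surface in the frame,
                          ordered top-left, top-right, bottom-right,
                          bottom-left as seen by the camera.
    """
    src = np.array(surface_corners_px, dtype=np.float32)
    dst = np.array([[0, 0], [out_w, 0], [out_w, out_h], [0, out_h]],
                   dtype=np.float32)
    homography = cv2.getPerspectiveTransform(src, dst)
    top_down = cv2.warpPerspective(frame, homography, (out_w, out_h))
    # Rotate 180 degrees so the surface content reads right-side-up for
    # the remote participant (the surface faces the local user).
    return cv2.rotate(top_down, cv2.ROTATE_180)

frame = np.zeros((1080, 1920, 3), dtype=np.uint8)
corners = [(600, 700), (1320, 700), (1700, 1050), (220, 1050)]
print(surface_top_down_view(frame, corners).shape)  # (600, 800, 3)
```

Because the source quadrilateral shrinks with distance, a surface farther from the camera implies a stronger warp, which is consistent with basing the amount of correction on the camera-to-surface distance.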
Accordingly, at
In some embodiments, first electronic device 906a determines (e.g., detects) that an external device (e.g., an electronic device that is not being used to participate in the live video communication session) is displayed and/or included in one or more of the representations. In response, first electronic device 906a can, optionally, enable a view of content displayed on the screen of the external device to be shared and/or otherwise included in the one or more representations. For instance, in some such embodiments, fifth electronic device 914 communicates with first electronic device 906a (e.g., directly, via fourth electronic device 906d, and/or via another external device, such as a server) and provides (e.g., transmits) data related to the user interface and/or other images that are currently being displayed by fifth electronic device 914. Accordingly, first electronic device 906a can cause fourth representation 918d to include the user interface and/or images displayed by fifth electronic device 914 based on the received data. In some embodiments, first electronic device 906a displays fourth representation 918d without fifth electronic device 914, and instead only displays fourth representation 918d with the user interface and/or images currently displayed on fifth electronic device 914 (e.g., a user interface of fifth electronic device 914 is adapted to substantially fill the entirety of representation 918d).
In some embodiments, further in response to modifying an image of a representation, first electronic device 906a also displays a representation of the user. In this manner, user 902a may still view the user while a modified image is displayed. For example, as shown in
Similarly, at
While fifth representation 928a is shown as being displayed wholly within second representation 918b, in some embodiments, fifth representation 928a is displayed adjacent to and/or partially within second representation 918b. Similarly, in some embodiments, sixth representation 930a and seventh representation 932a are displayed adjacent to and/or partially within third representation 918c and fourth representation 918d. In some embodiments, fifth representation 928a is displayed within a predetermined distance (e.g., a distance between a center of fifth representation 928a and a center of second representation 918b) of second representation 918b, sixth representation 930a is displayed within a predetermined distance (e.g., a distance between a center of sixth representation 930a and a center of third representation 918c) of third representation 918c, and seventh representation 932a is displayed within a predetermined distance (e.g., a distance between a center of seventh representation 932a and a center of fourth representation 918d) of fourth representation 918d. In some embodiments, first communication user interface 916a does not include one or more of representations 928a, 930a, and/or 932a.
At
At
At
In some embodiments, a table view region (e.g., table view region 940) includes sub-regions for each electronic device providing a modified surface view at a time when selection of table view user interface object 934a is detected. For example, as shown in
As shown at
In some embodiments, surface 954 is not representative of any surface within physical environments 904a-904d in which users 902a-902d are located. In some embodiments, surface 954 is a reproduction of (e.g., an extrapolation of, an image of, a visual replica of) an actual surface located in one of physical environments 904a-904d. For instance, in some embodiments, surface 954 includes a reproduction of surface 908a within first physical environment 904a when first electronic device 906a detects user input 950b. In some embodiments, surface 954 includes a reproduction of an actual surface corresponding to a particular position (e.g., first position 940a) of table view region 940. For instance, in some embodiments, surface 954 includes a reproduction of surface 908b within second physical environment 904b when first sub-region 944 is at first position 940a of table view region 940 and first sub-region 944 corresponds to surface 908b.
In addition, at
In some embodiments, table view region 940 is displayed by each of devices 906a-906d with the same orientation (e.g., sub-regions 944, 946, and 948 are in the same positions on each of second communication user interfaces 938a-938d).
In some embodiments, user 902a may wish to modify an orientation (e.g., a position of sub-regions 944, 946, and 948 with respect to an axis 952a formed by boundaries 952) of table view region 940 to view one or more representations of surfaces 908b-908d from a different perspective. For example, at
In some embodiments, when rotating table view region 940, electronic device 906a displays an animation illustrating the rotation of table view region 940. For example, at
As shown in
At
In some embodiments, electronic devices 906a-906d do not display table view region 940 in the same orientation (e.g., sub-regions 944, 946, and 948 positioned at the same positions 940a-940c) as one another. In some such embodiments, table view region 940 includes a sub-region 944, 946, and/or 948 at first position 940a that corresponds to a respective electronic device 906a-906d displaying table view region 940 (e.g., second electronic device 906b displays sub-region 944 at first position 940a, third electronic device 906c displays sub-region 946 at first position 940a, and fourth electronic device 906d displays sub-region 948 at first position 940a). In some embodiments, in response to detecting user input 950c, first electronic device 906a only causes a modification to the orientation of table view region 940 displayed on first electronic device 906a (and not table view region 940 shown on electronic devices 906b-906d).
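The per-device orientation described in this paragraph (each device showing its own sub-region at first position 940a) can be modeled as a cyclic rotation of the ordered sub-regions. The Python sketch below is illustrative only; the function name and list ordering are assumptions, and the reference numerals are used merely as labels.

```python
def positions_for_device(sub_regions, own_index):
    """Rotate the ordered list of sub-regions so that the sub-region
    belonging to the viewing device appears at the first position."""
    return sub_regions[own_index:] + sub_regions[:own_index]

sub_regions = ["944 (device 906b)", "946 (device 906c)", "948 (device 906d)"]
# On device 906c, its own sub-region 946 is shown at the first position.
print(positions_for_device(sub_regions, own_index=1))
# ['946 (device 906c)', '948 (device 906d)', '944 (device 906b)']
```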
At
At
Second communication user interfaces 938a-938d enable users 902a-902d to also share digital markups during a live video communication session. Digital markups shared in this manner are, in some instances, displayed by electronic devices 906a-906d, and optionally, overlaid on one or more representations included on second communication user interfaces 938a-938d. For instance, while displaying communication user interface 938a, first electronic device 906a detects user input 950e (e.g., a tap gesture, a tap and swipe gesture, and/or a scribble gesture) corresponding to a request to add and/or display a markup (e.g., digital handwriting, a drawing, and/or scribbling) on first representation 944a (e.g., overlaid on first representation 944a including book 910), as shown at
At
In some embodiments, one or more devices may be used to project an image and/or rendering of markup 956 within a physical environment. For example, as shown in
At
At
Further, first electronic device 906a displays movement of book 910 on second communication user interface 938a based on physical movement of book 910 by second user 902b. For example, in response to detecting movement of book 910 from first position 962a to second position 962b, first electronic device 906a displays movement of book 910 (e.g., first representation 944a) within table view region 940, as shown at
Electronic devices 906a-906d can also modify markup 956. For instance, in response to detecting one or more user inputs, electronic devices 906a-906d can add to, change a color of, change a style of, and/or delete all or a portion of markup 956 that is displayed on each of second communication user interfaces 938a-938d. In some embodiments, electronic devices 906a-906d can modify markup 956, for instance, based on user 902b turning pages of book 910. At
In response to detecting one or more user inputs, electronic devices 906a-906d can further provide one or more outputs (e.g., audio outputs and/or visual outputs, such as notifications) based on an analysis of content included in one or more representations displayed during the live video communication session. At
At
At
At
After performing the task (e.g., the calculation of the square root of 121), second electronic device 906b provides (e.g., outputs) a response including the answer. In some examples, the response is provided as audio output 968, as shown at
In some embodiments, during a live video communication session, electronic devices 906a-906d are configured to display different user interfaces based on the type of objects and/or content positioned on surfaces.
In response to receiving a request to display representations of multiple drawings, electronic devices 906a-906d are configured to overlay the drawings 970, 972, and 974 onto one another and/or remove physical objects within physical environments 904a-904d from the representations (e.g., remove physical objects via modifying data captured via cameras 909a-909d). At
In some embodiments, surface 982 is a virtual surface that is not representative of any surface within physical environments 904a-904d in which users 902a-902d are located. In some embodiments, surface 982 is a reproduction of (e.g., an extrapolation of, an image of, a visual replica of) an actual surface and/or object (e.g., piece of paper) located in one of physical environments 904a-904d.
In addition, drawing region 978 includes fourth representation 983a of second user 902b, fifth representation 983b of third user 902c, and sixth representation 983c of fourth user 902d. In some embodiments, first electronic device 906a does not display fourth representation 983a, fifth representation 983b, and sixth representation 983c, and instead, only displays first drawing representation 978a, second drawing representation 978b, and third drawing representation 978c.
Electronic devices 906a-906d can also display and/or overlay content that does not include drawings onto drawing region 978. At
At
As described below, method 1000 provides an intuitive way for displaying images of multiple different surfaces during a live video communication session. The method reduces the cognitive burden on a user for managing a live video communication session, thereby creating a more efficient human-machine interface. For battery-operated computing devices, enabling a user to manage a live video communication session faster and more efficiently conserves power and increases the time between battery charges.
In method 1000, the first computer system detects (1002) a set of one or more user inputs (e.g., 949, 950a, and/or 950b) (e.g., one or more taps on a touch-sensitive surface, one or more gestures (e.g., a hand gesture, head gesture, and/or eye gesture), and/or one or more audio inputs (e.g., a voice command)) corresponding to a request to display a user interface (e.g., 916a-916d, 938a-938d, and/or 976a-976d) of a live video communication session that includes a plurality of participants (e.g., 902a-902d). In some embodiments, the plurality of participants include a first user and a second user.
In response to detecting the set of one or more user inputs (e.g., 949, 950a, and/or 950b), the first computer system (e.g., 906a, 906b, 906c, and/or 906d) displays (1004), via the display generation component (e.g., 907a, 907b, 907c, and/or 907d), a live video communication interface (e.g., 916a-916d, 938a-938d, and/or 976a-976d) for a live video communication session (e.g., an interface for an incoming and/or outgoing live audio/video communication session). In some embodiments, the live communication session is between at least the computer system (e.g., a first computer system) and a second computer system. The live video communication interface (e.g., 916a-916d, 938a-938d. and/or 976a-976d) includes (1006) (e.g., concurrently includes) a first representation (e.g., 928a-928d, 930a-930d, 932a-932d, 944a, 946a, 948a, 983a, 983b, and/or 983c) of a field-of-view of the one or more first cameras (e.g., 909a, 909b, 909c, and/or 909d) of the first computer system (e.g., 906a, 906b, 906c, and/or 906d). In some embodiments, the first representation (e.g., 928a-928d, 930a-930d, 932a-932d, 944a, 946a, 948a, 983a, 983b, and/or 983c) of the field-of-view of the one or more first cameras (e.g., 909a, 909b, 909c, and/or 909d) includes a first user (e.g., a face of the first user). In some embodiments, the first representation (e.g., 928a-928d, 930a-930d, 932a-932d, 944a, 946a, 948a, 983a, 983b, and/or 983c) of the field-of-view of the one or more first cameras (e.g., 909a, 909b, 909c, and/or 909d) of the first computer system (e.g., 906a, 906b, 906c, and/or 906d) is a portion (e.g., a cropped portion) of the field-of-view of the one or more first cameras.
The live video communication interface (e.g., 916a-916d, 938a-938d, and/or 976a-976d) includes (1008) (e.g., concurrently includes) a second representation (e.g., 918b-918d, 922b-922d, 924b-924d, 926b-926d, 944a, 946a, 948a, 978a, 978b, and/or 978c) of the field-of-view of the one or more first cameras (e.g., 909a, 909b, 909c, and/or 909d) of the first computer system (e.g., 906a, 906b, 906c, and/or 906d), the second representation (e.g., 918b-918d, 922b-922d, 924b-924d, 926b-926d, 944a, 946a, 948a, 978a, 978b, and/or 978c) of the field-of-view of the one or more first cameras (e.g., 909a, 909b, 909c, and/or 909d) of the first computer system (e.g., 906a, 906b, 906c, and/or 906d) including a representation of a surface (e.g., 908a, 908b, 908c, and/or 908d) (e.g., a first surface) in a first scene that is in the field-of-view of the one or more first cameras (e.g., 909a, 909b, 909c, and/or 909d) of the first computer system (e.g., 906a, 906b, 906c, and/or 906d). In some embodiments, the second representation (e.g., 918b-918d, 922b-922d, 924b-924d, 926b-926d, 944a, 946a, 948a, 978a, 978b, and/or 978c) of the field-of-view of the one or more first cameras (e.g., 909a, 909b, 909c, and/or 909d) of the first computer system (e.g., 906a, 906b, 906c, and/or 906d) is a portion (e.g., a cropped portion) of the field-of-view of the one or more first cameras (e.g., 909a, 909b, 909c, and/or 909d). In some embodiments, the first representation (e.g., 928a-928d, 930a-930d, 932a-932d, 944a, 946a, 948a, 983a, 983b, and/or 983c) and the second representation (e.g., 918b-918d, 922b-922d, 924b-924d, 926b-926d, 944a, 946a, 948a, 978a, 978b, and/or 978c) are based on the same-field-of-view of the one or more first cameras (e.g., 909a, 909b, 909c, and/or 909d). In some embodiments, the one or more first cameras (e.g., 909a, 909b, 909c, and/or 909d) is a single, wide angle camera.
The live video communication interface includes (1010) (e.g., concurrently includes) a first representation (e.g., 928a-928d, 930a-930d, 932a-932d, 944a, 946a, 948a, 983a, 983b, and/or 983c) of a field-of-view of one or more second cameras (e.g., 909a, 909b, 909c, and/or 909d) of a second computer system (e.g., 906a, 906b, 906c, and/or 906d). In some embodiments, the first representation (e.g., 928a-928d, 930a-930d, 932a-932d, 944a, 946a, 948a, 983a, 983b, and/or 983c) of the field-of-view of the one or more second cameras (e.g., 909a, 909b, 909c, and/or 909d) includes a second user (e.g., 902a, 902b, 902c, and/or 902d) (e.g., a face of the second user). In some embodiments, the first representation (e.g., 928a-928d, 930a-930d, 932a-932d, 944a, 946a, 948a, 983a, 983b, and/or 983c) of the field-of-view of the one or more second cameras (e.g., 909a, 909b, 909c, and/or 909d) is a portion (e.g., a cropped portion) of the field-of-view of the one or more second cameras (e.g., 909a, 909b, 909c, and/or 909d).
The live video communication interface (e.g., 916a-916d, 938a-938d, and/or 976a-976d) includes (1012) (e.g., concurrently includes) a second representation (e.g., 918b-918d, 922b-922d, 924b-924d, 926b-926d, 944a, 946a, 948a, 978a, 978b, and/or 978c) of the field-of-view of the one or more second cameras (e.g., 909a, 909b, 909c, and/or 909d) of the second computer system (e.g., 906a, 906b, 906c, and/or 906d), the second representation (e.g., 918b-918d, 922b-922d, 924b-924d, 926b-926d, 944a, 946a, 948a, 978a, 978b, and/or 978c) of the field-of-view of the one or more second cameras (e.g., 909a, 909b, 909c, and/or 909d) of the second computer system (e.g., 906a, 906b, 906c, and/or 906d) including a representation of a surface (e.g., 908a, 908b, 908c, and/or 908d) (e.g., a second surface) in a second scene that is in the field-of-view of the one or more second cameras (e.g., 909a, 909b, 909c, and/or 909d) of the second computer system (e.g., 906a, 906b, 906c, and/or 906d). In some embodiments, the second representation (e.g., 918b-918d, 922b-922d, 924b-924d, 926b-926d, 944a, 946a, 948a, 978a, 978b, and/or 978c) of the field-of-view of the one or more second cameras (e.g., 909a, 909b, 909c, and/or 909d) of the second computer system (e.g., 906a, 906b, 906c, and/or 906d) is a portion (e.g., a cropped portion) of the field-of-view of the one or more second cameras (e.g., 909a, 909b, 909c, and/or 909d). In some embodiments, the first representation (e.g., 928a-928d, 930a-930d, 932a-932d, 944a, 946a, 948a, 983a, 983b, and/or 983c) and the second representation (e.g., 918b-918d, 922b-922d, 924b-924d, 926b-926d, 944a, 946a, 948a, 978a, 978b, and/or 978c) are based on the same field-of-view of the one or more second cameras (e.g., 909a, 909b, 909c, and/or 909d). In some embodiments, the one or more second cameras (e.g., 909a, 909b, 909c, and/or 909d) is a single, wide angle camera. Displaying a first and second representation of the field-of-view of the one or more first cameras of the first computer system (where the second representation includes a representation of a surface in a first scene) and a first and second representation of the field-of-view of the one or more second cameras of the second computer system (where the second representation includes a representation of a surface in a second scene) enhances the video communication session experience by improving how participants collaborate and view each other's shared content, which provides improved visual feedback.
In some embodiments, the first computer system (e.g., 906a, 906b, 906c, and/or 906d) receives, during the live video communication session, image data captured by a first camera (e.g., 909a, 909b, 909c, and/or 909d) (e.g., a wide angle camera) of the one or more first cameras (e.g., 909a, 909b, 909c, and/or 909d). In some embodiments, displaying the live video communication interface (e.g., 916a-916d, 938a-938d, and/or 976a-976d) for the live video communication session includes displaying, via the display generation component, the first representation (e.g., 928a-928d, 930a-930d, 932a-932d, 944a, 946a, 948a, 983a, 983b, and/or 983c) of the field-of-view of the one or more first cameras (e.g., 909a, 909b, 909c, and/or 909d) of the first computer system (e.g., 906a, 906b, 906c, and/or 906d) based on the image data captured by the first camera (e.g., 909a, 909b, 909c, and/or 909d) and displaying, via the display generation component (e.g., 907a, 907b, 907c, and/or 907d), the second representation (e.g., 918b-918d, 922b-922d, 924b-924d, 926b-926d, 944a, 946a, 948a, 978a, 978b, and/or 978c) of the field-of-view of the one or more first cameras (e.g., 909a, 909b, 909c, and/or 909d) of the first computer system (e.g., 906a, 906b, 906c, and/or 906d) (e.g., including the representation of a surface) based on the image data captured by the first camera (e.g., 909a, 909b, 909c, and/or 909d) (e.g., the first representation of the field-of-view of the one or more first cameras of the first computer system and the second representation of the field-of-view of the one or more first cameras of the first computer system include image data captured by the same camera (e.g., a single camera)). Displaying the first representation of the field-of-view of the one or more first cameras of the first computer system and the second representation of the field-of-view of the one or more first cameras of the first computer system based on the image data captured by the first camera enhances the video communication session experience by displaying multiple representations using the same camera at different perspectives without requiring further input from the user, which reduces the number of inputs (and/or devices) needed to perform an operation.
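As a non-limiting illustration of deriving both representations from a single wide-angle frame, the Python sketch below crops an upper portion of the frame for the first representation (the user) and reserves the remainder for the second representation (the surface), which would then be perspective-corrected (e.g., as in the earlier homography sketch). The split fraction and function name are assumptions introduced only for illustration.

```python
import numpy as np

def split_wide_angle_frame(frame, face_fraction=0.45):
    """Derive two views from one wide-angle camera frame.

    Returns (face_view, surface_view): the top `face_fraction` of the
    frame for the first representation (the user), and the remainder
    for the second representation (the surface), which would then be
    perspective-corrected before display.
    """
    h = frame.shape[0]
    split = int(h * face_fraction)
    face_view = frame[:split]
    surface_view = frame[split:]
    return face_view, surface_view

frame = np.zeros((1080, 1920, 3), dtype=np.uint8)
face_view, surface_view = split_wide_angle_frame(frame)
print(face_view.shape, surface_view.shape)  # (486, 1920, 3) (594, 1920, 3)
```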
In some embodiments, displaying the live video communication interface (e.g., 916a-916d, 938a-938d, and/or 976a-976d) for the live video communication session includes displaying the first representation (e.g., 928a-928d, 930a-930d, 932a-932d, 944a, 946a, 948a, 983a, 983b, and/or 983c) of the field-of-view of the one or more first cameras (e.g., 909a, 909b, 909c, and/or 909d) of the first computer system (e.g., 906a, 906b, 906c, and/or 906d) within a predetermined distance (e.g., a distance between a centroid or edge of the first representation and a centroid or edge of the second representation) from the second representation (e.g., 918b-918d, 922b-922d, 924b-924d, 926b-926d, 944a, 946a, 948a, 978a, 978b, and/or 978c) of the field-of-view of the one or more first cameras (e.g., 909a, 909b, 909c, and/or 909d) of the first computer system (e.g., 906a, 906b, 906c, and/or 906d) and displaying the first representation (e.g., 928a-928d, 930a-930d, 932a-932d, 944a, 946a, 948a, 983a, 983b, and/or 983c) of the field-of-view of the one or more second cameras (e.g., 909a, 909b, 909c, and/or 909d) of the second computer system (e.g., 906a, 906b, 906c, and/or 906d) within the predetermined distance (e.g., a distance between a centroid or edge of the first representation and a centroid or edge of the second representation) from the second representation (e.g., 918b-918d, 922b-922d, 924b-924d, 926b-926d, 944a, 946a, 948a, 978a, 978b, and/or 978c) of the field-of-view of the one or more second cameras (e.g., 909a, 909b, 909c, and/or 909d) of the second computer system (e.g., 906a, 906b, 906c, and/or 906d). Displaying the first representation of the field-of-view of the one or more first cameras of the first computer system within a predetermined distance from the second representation of the field-of-view of the one or more first cameras of the first computer system and the first representation of the field-of-view of the one or more second cameras of the second computer system within the predetermined distance from the second representation of the field-of-view of the one or more second cameras of the second computer system enhances the video communication session experience by allowing a user to easily identify which representation of the surface is associated with (or shared by) which participant without requiring further input from the user, which provides improved visual feedback.
In some embodiments, displaying the live video communication interface (e.g., 916a-916d, 938a-938d, and/or 976a-976d) for the live video communication session includes displaying the second representation (e.g., 918b-918d, 922b-922d, 924b-924d, 926b-926d, 944a, 946a, 948a, 978a, 978b, and/or 978c) of the field-of-view of the one or more first cameras (e.g., 909a, 909b, 909c, and/or 909d) of the first computer system (e.g., 906a, 906b, 906c, and/or 906d) overlapping (e.g., at least partially overlaid on or at least partially overlaid by) the second representation (e.g., 918b-918d, 922b-922d, 924b-924d, 926b-926d, 944a, 946a, 948a, 978a, 978b, and/or 978c) of the field-of-view of the one or more second cameras (e.g., 909a, 909b, 909c, and/or 909d) of the second computer system (e.g., 906a, 906b, 906c, and/or 906d). In some embodiments, the second representation (e.g., 918b-918d, 922b-922d, 924b-924d, 926b-926d, 944a, 946a, 948a, 978a, 978b, and/or 978c) of the field-of-view of the one or more first cameras (e.g., 909a, 909b, 909c, and/or 909d) of the first computer system (e.g., 906a, 906b, 906c, and/or 906d) and the second representation (e.g., 918b-918d, 922b-922d, 924b-924d, 926b-926d, 944a, 946a, 948a, 978a, 978b, and/or 978c) of the field-of-view of the one or more second cameras (e.g., 909a, 909b, 909c, and/or 909d) of the second computer system (e.g., 906a, 906b, 906c, and/or 906d) are displayed on a common background (e.g., 954 and/or 982) (e.g., a representation of a table, desk, floor, or wall) or within a same visually distinguished area of the live video communication interface (e.g., 916a-916d, 938a-938d, and/or 976a-976d). In some embodiments, overlapping the second representation (e.g., 918b-918d, 922b-922d, 924b-924d, 926b-926d, 944a, 946a, 948a, 978a, 978b, and/or 978c) of the field-of-view of the one or more first cameras (e.g., 909a, 909b, 909c, and/or 909d) of the first computer system (e.g., 906a, 906b, 906c, and/or 906d) with the second representation (e.g., 918b-918d, 922b-922d, 924b-924d, 926b-926d, 944a, 946a, 948a, 978a, 978b, and/or 978c) of the field-of-view of the one or more second cameras (e.g., 909a, 909b, 909c, and/or 909d) of the second computer system (e.g., 906a, 906b, 906c, and/or 906d) enables collaboration between participants (e.g., 902a, 902b, 902c, and/or 902d) in the live video communication session (e.g., by allowing users to combine their content). Displaying the second representation of the field-of-view of the one or more first cameras of the first computer system overlapping the second representation of the field-of-view of the one or more second cameras of the second computer system enhances the video communication session experience by allowing participants to integrate representations of different surfaces, which provides improved visual feedback.
In some embodiments, displaying the live video communication interface (e.g., 916a-916d, 938a-938d, and/or 976a-976d) for the live video communication session includes displaying the second representation (e.g., 918b-918d, 922b-922d, 924b-924d, 926b-926d, 944a, 946a, 948a, 978a, 978b, and/or 978c) of the field-of-view of the one or more first cameras (e.g., 909a, 909b, 909c, and/or 909d) of the first computer system (e.g., 906a, 906b, 906c, and/or 906d) in a first visually defined area (e.g., 918b-918d, 922b-922d, 924b-924d, 926b-926d) of the live video communication interface (e.g., 916a-916d) and displaying the second representation (e.g., 918b-918d, 922b-922d, 924b-924d, 926b-926d, 944a, 946a, 948a, 978a, 978b, and/or 978c) of the field-of-view of the one or more second cameras (e.g., 909a, 909b, 909c, and/or 909d) of the second computer system (e.g., 906a, 906b, 906c, and/or 906d) in a second visually defined area (e.g., 918b-918d, 922b-922d, 924b-924d, 926b-926d) of the live video communication interface (e.g., 916a-916d) (e.g., adjacent to and/or side-by-side with the second representation of the field-of-view of the one or more first cameras of the first computer system). In some embodiments, the first visually defined area (e.g., 918b-918d, 922b-922d, 924b-924d, 926b-926d) does not overlap the second visually defined area (e.g., 918b-918d, 922b-922d, 924b-924d, 926b-926d). In some embodiments, the second representation of the field-of-view of the one or more first cameras of the first computer system and the second representation of the field-of-view of the one or more second cameras of the second computer system are displayed in a grid pattern, in a horizontal row, or in a vertical column. Displaying the second representation of the field-of-view of the one or more first cameras of the first computer system and the second representation of the field-of-view of the one or more second cameras of the second computer system in a first and second visually defined area, respectively, enhances the video communication session experience by allowing participants to readily distinguish between representations of different surfaces, which provides improved visual feedback.
In some embodiments, the second representation (e.g., 918b-918d, 922b-922d, 924b-924d, 926b-926d, 944a, 946a, 948a, 978a, 978b, and/or 978c) of the field-of-view of the one or more first cameras (e.g., 909a, 909b, 909c, and/or 909d) of the first computer system (e.g., 906a, 906b, 906c, and/or 906d) is based on image data captured by the one or more first cameras (e.g., 909a, 909b, 909c, and/or 909d) of the first computer system (e.g., 906a, 906b, 906c, and/or 906d) that is corrected with a first distortion correction (e.g., skew correction) to change a perspective from which the image data captured by the one or more first cameras (e.g., 909a, 909b, 909c, and/or 909d) of the first computer system (e.g., 906a, 906b, 906c, and/or 906d) appears to be captured. In some embodiments, the second representation (e.g., 918b-918d, 922b-922d, 924b-924d, 926b-926d, 944a, 946a, 948a, 978a, 978b, and/or 978c) of the field-of-view of the one or more second cameras (e.g., 909a, 909b, 909c, and/or 909d) of the second computer system (e.g., 906a, 906b, 906c, and/or 906d) is based on image data captured by the one or more second cameras (e.g., 909a, 909b, 909c, and/or 909d) of the second computer system (e.g., 906a, 906b, 906c, and/or 906d) that is corrected with a second distortion correction (e.g., skew correction) to change a perspective from which the image data captured by the one or more second cameras (e.g., 909a, 909b, 909c, and/or 909d) of the second computer system (e.g., 906a, 906b, 906c, and/or 906d) appears to be captured. In some embodiments, the distortion correction (e.g., skew correction) is based on a position (e.g., location and/or orientation) of the respective surface (e.g., 908a, 908b, 908c, and/or 908d) relative to the one or more respective cameras (e.g., 909a, 909b, 909c, and/or 909d). In some embodiments, the first representation (e.g., 928a-928d, 930a-930d, 932a-932d, 944a, 946a, 948a, 983a, 983b, and/or 983c) and the second representation (e.g., 918b-918d, 922b-922d, 924b-924d, 926b-926d, 944a, 946a, 948a, 978a, 978b, and/or 978c) are based on image data taken from the same perspective (e.g., a single camera having a single perspective), but the second representation (e.g., 918b-918d, 922b-922d, 924b-924d, 926b-926d, 944a, 946a, 948a, 978a, 978b, and/or 978c) is corrected (e.g., skewed or skewed by a different amount) so as to give the effect that the user is using multiple cameras that have different perspectives. Basing the second representations on image data that is corrected using distortion correction to change a perspective from which the image data is captured enhances the video communication session experience by providing a better perspective to view shared content without requiring further input from the user, which reduces the number of inputs needed to perform an operation.
In some embodiments, the second representation (e.g., 918b-918d, 922b-922d, 924b-924d, 926b-926d, 944a, 946a, 948a, 978a, 978b, and/or 978c) of the field-of-view of the one or more first cameras (e.g., 909a, 909b, 909c, and/or 909d) of the first computer system (e.g., 906a, 906b, 906c, and/or 906d) (e.g., the representation of the surface in the first scene) is based on image data captured by the one or more first cameras (e.g., 909a, 909b, 909c, and/or 909d) of the first computer system (e.g., 906a, 906b, 906c, and/or 906d) that is corrected with a first distortion correction (e.g., a first skew correction) (In some embodiments, the first distortion correction is based on a position (e.g., location and/or orientation) of the surface in the first scene relative to the one or more first cameras). In some embodiments, the second representation (e.g., 918b-918d, 922b-922d, 924b-924d, 926b-926d, 944a, 946a, 948a, 978a, 978b, and/or 978c) of the field-of-view of the one or more second cameras (e.g., 909a, 909b, 909c, and/or 909d) of the second computer system (e.g., 906a, 906b, 906c, and/or 906d) (e.g., the representation of the surface in the second scene) is based on image data captured by the one or more second cameras (e.g., 909a, 909b, 909c, and/or 909d) of the second computer system (e.g., 906a, 906b, 906c, and/or 906d) that is corrected with a second distortion correction (e.g., second skew correction) different from the first distortion correction (e.g., the second distortion correction is based on a position (e.g., location and/or orientation) of the surface in the second scene relative to the one or more second cameras). Basing the second representation of the field-of-view of the one or more first cameras of the first computer system on image data captured by the one or more first cameras of the first computer system that is corrected by a first distortion correction and basing the second representation of the field-of-view of the one or more second cameras of the second computer system on image data captured by the one or more second cameras of the second computer system that is corrected by a second distortion correction different than the first distortion correction enhances the video communication session experience by providing a non-distorted view of a surface regardless of its location in the respective scene, which provides improved visual feedback.
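By way of a non-limiting sketch only, one way such a perspective-changing distortion correction could be approximated is shown below in Swift. The type and member names are illustrative and are not part of the embodiments described above; a full implementation would typically use a projective homography rather than the simplified bilinear warp shown, with the corners of the surface derived from its detected position relative to the camera.

// A point in image coordinates.
struct Point { var x: Double; var y: Double }

// The four observed corners of the surface in the captured image, e.g.,
// derived from the position of the surface relative to the camera.
struct SurfaceQuad {
    var topLeft: Point
    var topRight: Point
    var bottomRight: Point
    var bottomLeft: Point

    // Maps normalized rectified coordinates (u, v) in [0, 1] to the
    // corresponding sample location in the captured image. Sampling the
    // captured image at these locations yields a representation of the
    // surface that appears to be captured from a different perspective.
    // (Bilinear warp shown for brevity; a projective homography would be
    // used for a true perspective correction.)
    func sourceLocation(u: Double, v: Double) -> Point {
        let top = Point(x: topLeft.x + (topRight.x - topLeft.x) * u,
                        y: topLeft.y + (topRight.y - topLeft.y) * u)
        let bottom = Point(x: bottomLeft.x + (bottomRight.x - bottomLeft.x) * u,
                           y: bottomLeft.y + (bottomRight.y - bottomLeft.y) * u)
        return Point(x: top.x + (bottom.x - top.x) * v,
                     y: top.y + (bottom.y - top.y) * v)
    }
}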
In some embodiments, the second representation (e.g., 918b-918d, 922b-922d, 924b-924d, 926b-926d, 944a, 946a, 948a, 978a, 978b, and/or 978c) of the field-of-view of the one or more first cameras (e.g., 909a, 909b, 909c, and/or 909d) of the first computer system (e.g., 906a, 906b, 906c, and/or 906d) (e.g., the representation of the surface in the first scene) is based on image data captured by the one or more first cameras (e.g., 909a, 909b, 909c, and/or 909d) of the first computer system (e.g., 906a, 906b, 906c, and/or 906d) that is rotated relative to a position of the surface (e.g., 908a, 908b, 908c, and/or 908d) in the first scene (e.g., the position of the surface in the first scene relative to the position of the one or more first cameras of the first computer system). In some embodiments, the second representation (e.g., 918b-918d, 922b-922d, 924b-924d, 926b-926d, 944a, 946a, 948a, 978a, 978b, and/or 978c) of the field-of-view of the one or more second cameras (e.g., 909a, 909b, 909c, and/or 909d) of the second computer system (e.g., 906a, 906b, 906c, and/or 906d) (e.g., the representation of the surface in the second scene) is based on image data captured by the one or more second cameras (e.g., 909a, 909b, 909c, and/or 909d) of the second computer system (e.g., 906a, 906b, 906c, and/or 906d) that is rotated relative to a position of the surface (e.g., 908a, 908b, 908c, and/or 908d) in the second scene (e.g., the position of the surface in the second scene relative to the position of the one or more second cameras of the second computer system). In some embodiments, the first representation (e.g., 928a-928d, 930a-930d, 932a-932d, 944a, 946a, 948a, 983a, 983b, and/or 983c) of the field-of-view and the representation of the surface (e.g., 918b-918d, 922b-922d, 924b-924d, 926b-926d, 944a, 946a, 948a, 978a, 978b, and/or 978c) are based on image data taken from the same perspective (e.g., a single camera having a single perspective), but the representation of the surface (e.g., 918b-918d, 922b-922d, 924b-924d, 926b-926d, 944a, 946a, 948a, 978a, 978b, and/or 978c) is rotated so as to give the effect that the user is using multiple cameras that have different perspectives. Basing the second representation of the field-of-view of the one or more first cameras of the first computer system on image data captured by the one or more first cameras of the first computer system that is rotated relative to a position of the surface in the first scene and/or basing the second representation of the field-of-view of the one or more second cameras of the second computer system on image data captured by the one or more second cameras of the second computer system that is rotated relative to a position of the surface in the second scene enhances the video communication session experience by providing a better view of a surface that would have otherwise appeared upside down or turned around, which provides improved visual feedback.
In some embodiments, the second representation (e.g., 918b-918d, 922b-922d, 924b-924d, 926b-926d, 944a, 946a, 948a, 978a, 978b, and/or 978c) of the field-of-view of the one or more first cameras (e.g., 909a, 909b, 909c, and/or 909d) of the first computer system (e.g., 906a, 906b, 906c, and/or 906d) (e.g., the representation of the surface in the first scene) is based on image data captured by the one or more first cameras (e.g., 909a, 909b, 909c, and/or 909d) of the first computer system (e.g., 906a, 906b, 906c, and/or 906d) that is rotated by a first amount relative to a position of the surface (e.g., 908a, 908b, 908c, and/or 908d) in the first scene (e.g., the position of the surface in the first scene relative to the position of the one or more first cameras of the first computer system). In some embodiments, the second representation (e.g., 918b-918d, 922b-922d, 924b-924d, 926b-926d, 944a, 946a, 948a, 978a, 978b, and/or 978c) of the field-of-view of the one or more second cameras (e.g., 909a, 909b, 909c, and/or 909d) of the second computer system (e.g., 906a, 906b, 906c, and/or 906d) (e.g., the representation of the surface in the second scene) is based on image data captured by the one or more second cameras (e.g., 909a, 909b, 909c, and/or 909d) of the second computer system (e.g., 906a, 906b, 906c, and/or 906d) that is rotated by a second amount relative to a position of the surface (e.g., 908a, 908b, 908c, and/or 908d) in the second scene (e.g., the position of the surface in the second scene relative to the position of the one or more second cameras of the second computer system), wherein the first amount is different from the second amount. In some embodiments, the representation of a respective surface (e.g., 918b-918d, 922b-922d, 924b-924d, 926b-926d, 944a, 946a, 948a, 978a, 978b, and/or 978c) in a respective scene is displayed in the live video communication interface (e.g., 916a-916d, 938a-938d, and/or 976a-976d) at an orientation that is different from the orientation of the respective surface (e.g., 908a, 908b, 908c, and/or 908d) in the respective scene (e.g., relative to the position of the one or more respective cameras). In some embodiments, the second representation (e.g., 918b-918d, 922b-922d, 924b-924d, 926b-926d, 944a, 946a, 948a, 978a, 978b, and/or 978c) of the field-of-view of the one or more first cameras (e.g., 909a, 909b, 909c, and/or 909d) of the first computer system (e.g., 906a, 906b, 906c, and/or 906d) (e.g., the representation of the surface in the first scene) is based on image data captured by the one or more first cameras (e.g., 909a, 909b, 909c, and/or 909d) of the first computer system (e.g., 906a, 906b, 906c, and/or 906d) that is corrected with a first distortion correction. In some embodiments, the second representation (e.g., 918b-918d, 922b-922d, 924b-924d, 926b-926d, 944a, 946a, 948a, 978a, 978b, and/or 978c) of the field-of-view of the one or more second cameras (e.g., 909a, 909b, 909c, and/or 909d) of the second computer system (e.g., 906a, 906b, 906c, and/or 906d) (e.g., the representation of the surface in the second scene) is based on image data captured by the one or more second cameras (e.g., 909a, 909b, 909c, and/or 909d) of the second computer system (e.g., 906a, 906b, 906c, and/or 906d) that is corrected with a second distortion correction that is different from the first distortion correction. 
Basing the second representation of the field-of-view of the one or more first cameras of the first computer system on image data captured by the one or more first cameras of the first computer system that is rotated by a first amount and basing the second representation of the field-of-view of the one or more second cameras of the second computer system on image data captured by the one or more second cameras of the second computer system that is rotated by a second amount different from the first amount enhances the video communication session experience by providing a more intuitive, natural view of a surface regardless of its location in the respective scene, which provides improved visual feedback.
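As a minimal, non-limiting sketch of how a per-scene rotation amount could be derived, the hypothetical function below rotates each participant's captured surface image so that it appears upright to viewers; the parameter names and the heading-based model are illustrative only and are not part of the embodiments described above.

// Rotation applied to a captured surface image so that it appears upright
// in the displayed representation, regardless of how the physical surface
// is oriented relative to that participant's camera. Because each scene has
// its own surface and camera orientation, the rotation amount differs per scene.
func uprightRotationDegrees(surfaceHeadingDegrees: Double,
                            cameraHeadingDegrees: Double) -> Double {
    // Angle needed so that the edge of the surface nearest the participant
    // faces the bottom of the displayed representation.
    let delta = (surfaceHeadingDegrees - cameraHeadingDegrees)
        .truncatingRemainder(dividingBy: 360)
    return delta < 0 ? delta + 360 : delta
}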
In some embodiments, displaying the live video communication interface (e.g., 916a-916d, 938a-938d, and/or 976a-976d) includes displaying, in the live video communication interface (e.g., 916a-916d, 938a-938d, and/or 976a-976d), a graphical object (e.g., 954 and/or 982) (e.g., in a background, a virtual table, or a representation of a table based on captured image data). Displaying the live video communication interface (e.g., 916a-916d, 938a-938d, and/or 976a-976d) includes concurrently displaying, in the live video communication interface (e.g., 916a-916d, 938a-938d, and/or 976a-976d) and via the display generation component (e.g., 907a, 907b, 907c, and/or 907d), the second representation (e.g., 918b-918d, 922b-922d, 924b-924d, 926b-926d, 944a, 946a, 948a, 978a, 978b, and/or 978c) of the field-of-view of the one or more first cameras (e.g., 909a, 909b, 909c, and/or 909d) of the first computer system (e.g., 906a, 906b, 906c, and/or 906d) (e.g., the representation of the surface in the first scene) on (e.g., overlaid on) the graphical object (e.g., 954 and/or 982) and the second representation (e.g., 918b-918d, 922b-922d, 924b-924d, 926b-926d, 944a, 946a, 948a, 978a, 978b, and/or 978c) of the field-of-view of the one or more second cameras (e.g., 909a, 909b, 909c, and/or 909d) of the second computer system (e.g., 906a, 906b, 906c, and/or 906d) (e.g., the representation of the surface in the second scene) on (e.g., overlaid on) the graphical object (e.g., 954 and/or 982) (e.g., the representation of the surface in the first scene and the representation of the surface in the second scene are both displayed on a virtual table in the live video communication interface). Displaying both the second representation of the field-of-view of the one or more first cameras of the first computer system and the second representation of the field-of-view of the one or more second cameras of the second computer system on the graphical object enhances the video communication session experience by providing a common background for shared content regardless of the appearance of the surface in the respective scene, which provides improved visual feedback, reduces visual distraction, and removes the need for the user to manually place different objects on a background.
In some embodiments, while concurrently displaying the second representation (e.g., 918b-918d, 922b-922d, 924b-924d, 926b-926d, 944a, 946a, 948a, 978a, 978b, and/or 978c) of the field-of-view of the one or more first cameras (e.g., 909a, 909b, 909c, and/or 909d) of the first computer system (e.g., 906a, 906b, 906c, and/or 906d) (e.g., the representation of the surface in the first scene) on the graphical object (e.g., 954 and/or 982) and the second representation (e.g., 918b-918d, 922b-922d, 924b-924d, 926b-926d, 944a, 946a, 948a, 978a, 978b, and/or 978c) of the field-of-view of the one or more second cameras (e.g., 909a, 909b, 909c, and/or 909d) of the second computer system (e.g., 906a, 906b, 906c, and/or 906d) (e.g., the representation of the surface in the second scene) on the graphical object (e.g., 954 and/or 982), the first computer system (e.g., 906a, 906b, 906c, and/or 906d) detects, via the one or more input devices (e.g., 907a, 907b, 907c, and/or 907d), a first user input (e.g., 950d). In response to detecting the first user input (e.g., 950d) and in accordance with a determination that the first user input (e.g., 950d) corresponds to the second representation (e.g., 918b-918d, 922b-922d, 924b-924d, 926b-926d, 944a, 946a, 948a, 978a, 978b, and/or 978c) of the field-of-view of the one or more first cameras (e.g., 909a, 909b, 909c, and/or 909d) of the first computer system (e.g., 906a, 906b, 906c, and/or 906d) (e.g., the representation of the surface in the first scene), the first computer system (e.g., 906a, 906b, 906c, and/or 906d) changes (e.g., increases) a zoom level of (e.g., zooming in) the second representation (e.g., 918b-918d, 922b-922d, 924b-924d, 926b-926d, 944a, 946a, 948a, 978a, 978b, and/or 978c) of the field-of-view of the one or more first cameras (e.g., 909a, 909b, 909c, and/or 909d) of the first computer system (e.g., 906a, 906b, 906c, and/or 906d) (e.g., the representation of the surface in the first scene). In some embodiments, the computer system (e.g., 906a, 906b, 906c, and/or 906d) changes the zoom level of the second representation (e.g., 918b-918d, 922b-922d, 924b-924d, 926b-926d, 944a, 946a, 948a, 978a, 978b, and/or 978c) of the field-of-view of the one or more first cameras (e.g., 909a, 909b, 909c, and/or 909d) of the first computer system (e.g., 906a, 906b, 906c, and/or 906d) without changing a zoom level of other objects in the user interface (e.g., 916a-916d, 938a-938d, and/or 976a-976d) of the live video communication session (e.g., the first representation (e.g., 928a-928d, 930a-930d, 932a-932d, 944a, 946a, 948a, 983a, 983b, and/or 983c) of the field-of-view of the one or more first cameras (e.g., 909a, 909b, 909c, and/or 909d) of the first computer system (e.g., 906a, 906b, 906c, and/or 906d), the first representation (e.g., 928a-928d, 930a-930d, 932a-932d, 944a, 946a, 948a, 983a, 983b, and/or 983c) of the field-of-view of the one or more second cameras (e.g., 909a, 909b, 909c, and/or 909d) of the second computer system (e.g., 906a, 906b, 906c, and/or 906d), and/or the second representation (e.g., 918b-918d, 922b-922d, 924b-924d, 926b-926d, 944a, 946a, 948a, 978a, 978b, and/or 978c) of the field-of-view of the one or more second cameras (e.g., 909a, 909b, 909c, and/or 909d) of the second computer system (e.g., 906a, 906b, 906c, and/or 906d)).
In response to detecting the first user input (e.g., 950d) and in accordance with a determination that the first user input (e.g., 950d) corresponds to the second representation (e.g., 918b-918d, 922b-922d, 924b-924d, 926b-926d, 944a, 946a, 948a, 978a, 978b, and/or 978c) of the field-of-view of the one or more second cameras (e.g., 909a, 909b, 909c, and/or 909d) of the second computer system (e.g., 906a, 906b, 906c, and/or 906d) (e.g., the representation of the surface in the second scene), the first computer system (e.g., 906a, 906b, 906c, and/or 906d) changes (e.g., increases) a zoom level of (e.g., zooming in) the second representation (e.g., 918b-918d, 922b-922d, 924b-924d, 926b-926d, 944a, 946a, 948a, 978a, 978b, and/or 978c) of the field-of-view of the one or more second cameras (e.g., 909a, 909b, 909c, and/or 909d) of the second computer system (e.g., 906a, 906b, 906c, and/or 906d) (e.g., the representation of the surface in the second scene). In some embodiments, the computer system (e.g., 906a, 906b, 906c, and/or 906d) changes the zoom level of the second representation (e.g., 918b-918d, 922b-922d, 924b-924d, 926b-926d, 944a, 946a, 948a, 978a, 978b, and/or 978c) of the field-of-view of the one or more second cameras (e.g., 909a, 909b, 909c, and/or 909d) of the second computer system (e.g., 906a, 906b, 906c, and/or 906d) without changing a zoom level of other objects in the user interface (e.g., 916a-916d, 938a-938d, and/or 976a-976d) of the live video communication session (e.g., the first representation (e.g., 928a-928d, 930a-930d, 932a-932d, 944a, 946a, 948a, 983a, 983b, and/or 983c) of the field-of-view of the one or more first cameras (e.g., 909a, 909b, 909c, and/or 909d) of the first computer system (e.g., 906a, 906b, 906c, and/or 906d), the second representation (e.g., 918b-918d, 922b-922d, 924b-924d, 926b-926d, 944a, 946a, 948a, 978a, 978b, and/or 978c) of the field-of-view of the one or more first cameras (e.g., 909a, 909b, 909c, and/or 909d) of the first computer system (e.g., 906a, 906b, 906c, and/or 906d), and/or the first representation (e.g., 928a-928d, 930a-930d, 932a-932d, 944b, 946b, and/or 948b) of the field-of-view of the one or more second cameras (e.g., 909a, 909b, 909c, and/or 909d) of the second computer system (e.g., 906a, 906b, 906c, and/or 906d)). Changing a zoom level of the second representation of the field-of-view of the one or more first cameras of the first computer system or the second representation of the field-of-view of the one or more second cameras of the second computer system enhances the live video communication interface by offering an improved input (e.g., gesture) system, which performs an operation when a set of conditions has been met without requiring the user to navigate through complex menus. Additionally, changing a zoom level of the second representation of the field-of-view of the one or more first cameras of the first computer system or the second representation of the field-of-view of the one or more second cameras of the second computer system enhances the video communication session experience by allowing a user to view content associated with the surface at different levels of granularity, which provides improved visual feedback.
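A minimal, illustrative sketch of changing the zoom level of only the representation that the input corresponds to, while leaving the other representations in the interface unchanged, is shown below. The type names, the rectangular hit-testing, and the zoom arithmetic are hypothetical and are given for illustration only.

struct SurfaceRepresentation {
    let participantID: String
    var frame: (x: Double, y: Double, width: Double, height: Double)
    var zoomLevel: Double = 1.0
}

// Changes the zoom level of whichever surface representation the input
// landed on; other representations keep their zoom levels.
func handleZoomInput(at location: (x: Double, y: Double),
                     representations: inout [SurfaceRepresentation],
                     zoomDelta: Double) {
    for index in representations.indices {
        let f = representations[index].frame
        let hit = location.x >= f.x && location.x <= f.x + f.width &&
                  location.y >= f.y && location.y <= f.y + f.height
        if hit {
            // Only the targeted representation changes.
            representations[index].zoomLevel =
                max(1.0, representations[index].zoomLevel + zoomDelta)
            break
        }
    }
}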
In some embodiments, the graphical object (e.g., 954 and/or 982) is based on an image of a physical object (e.g., 908a, 908b, 908c, and/or 908d) in the first scene or the second scene (e.g., an image of an object captured by the one or more first cameras or the one or more second cameras). Basing the graphical object on an image of a physical object in the first scene or the second scene enhances the video communication session experience by providing a specific and/or customized appearance of the graphical object without requiring further input from the user, which provides improved visual feedback and reduces the number of inputs needed to perform an operation.
In some embodiments, while concurrently displaying the second representation (e.g., 918b-918d, 922b-922d, 924b-924d, 926b-926d, 944a, 946a, 948a, 978a, 978b, and/or 978c) of the field-of-view of the one or more first cameras (e.g., 909a, 909b, 909c, and/or 909d) of the first computer system (e.g., 906a, 906b, 906c, and/or 906d) (e.g., the representation of the surface in the first scene) on the graphical object (e.g., 954 and/or 982) and the second representation (e.g., 918b-918d, 922b-922d, 924b-924d, 926b-926d, 944a, 946a, 948a, 978a, 978b, and/or 978c) of the field-of-view of the one or more second cameras (e.g., 909a, 909b, 909c, and/or 909d) of the second computer system (e.g., 906a, 906b, 906c, and/or 906d) (e.g., the representation of the surface in the second scene) on the graphical object (e.g., 954 and/or 982), the first computer system (e.g., 906a, 906b, 906c, and/or 906d) detects, via the one or more input devices (e.g., 907a, 907b, 907c, and/or 907d), a second user input (e.g., 950c) (e.g., tap, mouse click, and/or drag). In response to detecting the second user input (e.g., 950c), the first computer system (e.g., 906a, 906b, 906c, and/or 906d) moves (e.g., rotates) the first representation (e.g., 928a-928d, 930a-930d, 932a-932d, 944a, 946a, 948a, 983a, 983b, and/or 983c) of the field-of-view of the one or more first cameras (e.g., 909a, 909b, 909c, and/or 909d) of the first computer system (e.g., 906a, 906b, 906c, and/or 906d) from a first position (e.g., 940a, 940b, and/or 940c) on the graphical object (e.g., 954 and/or 982) to a second position (e.g., 940a, 940b, and/or 940c) on the graphical object (e.g., 954 and/or 982). In response to detecting the second user input (e.g., 950c), the first computer system (e.g., 906a, 906b, 906c, and/or 906d) moves (e.g., rotates) the second representation (e.g., 918b-918d, 922b-922d, 924b-924d, 926b-926d, 944a, 946a, 948a, 978a, 978b, and/or 978c) of the field-of-view of the one or more first cameras (e.g., 909a, 909b, 909c, and/or 909d) of the first computer system (e.g., 906a, 906b, 906c, and/or 906d) from a third position (e.g., 940a, 940b, and/or 940c) on the graphical object (e.g., 954 and/or 982) to a fourth position (e.g., 940a, 940b, 940c, and/or 940d) on the graphical object (e.g., 954 and/or 982). In response to detecting the second user input (e.g., 950c), the first computer system (e.g., 906a, 906b, 906c, and/or 906d) moves (e.g., rotates) the first representation (e.g., 928a-928d, 930a-930d, 932a-932d, 944a, 946a, 948a, 983a, 983b, and/or 983c) of the field-of-view of the one or more second cameras (e.g., 909a, 909b, 909c, and/or 909d) of the second computer system (e.g., 906a, 906b, 906c, and/or 906d) from a fifth position (e.g., 940a, 940b, and/or 940c) on the graphical object (e.g., 954 and/or 982) to a sixth position (e.g., 940a, 940b, and/or 940c) on the graphical object (e.g., 954 and/or 982). In response to detecting the second user input (e.g., 950c), the first computer system (e.g., 906a, 906b, 906c, and/or 906d) moves (e.g., rotates) the second representation (e.g., 918b-918d, 922b-922d, 924b-924d, 926b-926d, 944a, 946a, 948a, 978a, 978b, and/or 978c) of the field-of-view of the one or more second cameras (e.g., 909a, 909b, 909c, and/or 909d) of the second computer system (e.g., 906a, 906b, 906c, and/or 906d) from a seventh position (e.g., 940a, 940b, and/or 940c) on the graphical object (e.g., 954 and/or 982) to an eighth position (e.g., 940a, 940b, and/or 940c) on the graphical object (e.g., 954 and/or 982).
In some embodiments, the representations maintain positions relative to each other. In some embodiments, the representations are moved concurrently. In some embodiments, the representations are rotated around a table (e.g., clockwise or counterclockwise) while optionally maintaining their positions around the table relative to each other, which can give a participant an impression that he or she has a different position (e.g., seat) at the table. In some embodiments, each representation is moved from an initial position to a previous position of another representation (e.g., a previous position of an adjacent representation). In some embodiments, moving the first representations (which include, e.g., a representation of a user, such as the user who is sharing a view of his or her drawing) allows a participant to know which surface is associated with which user. In some embodiments, in response to detecting the second user input (e.g., 950c), the computer system moves a position of at least two representations of a surface (e.g., the representation of the surface in the first scene and the representation of the surface in the second scene). In some embodiments, in response to detecting the second user input (e.g., 950c), the computer system moves a position of at least two representations of a user (e.g., the first representation of the field-of-view of the one or more first cameras and the first representation of the field-of-view of the one or more second cameras). Moving the respective representations in response to the second user input enhances the video communication session experience by allowing a user to shift multiple representations without further input, which performs an operation when a set of conditions has been met without requiring further user input.
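A minimal sketch of moving representations around a table in response to a single input while maintaining their positions relative to each other is shown below. The generic seat-assignment model is illustrative only; in practice each element could pair a participant's first representation with the corresponding second representation so that both move together.

// Seats arranged around a representation of a table, listed clockwise.
// Rotating the array by one step moves every representation to the previous
// position of an adjacent representation, while the positions of the
// representations relative to each other are maintained.
func rotatedSeatAssignments<T>(_ seats: [T], bySteps steps: Int) -> [T] {
    guard !seats.isEmpty else { return seats }
    let shift = ((steps % seats.count) + seats.count) % seats.count
    return Array(seats[shift...] + seats[..<shift])
}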
In some embodiments, moving the first representation (e.g., 928a-928d, 930a-930d, 932a-932d, 944a, 946a, 948a, 983a, 983b, and/or 983c) of the field-of-view of the one or more first cameras (e.g., 909a, 909b, 909c, and/or 909d) of the first computer system (e.g., 906a, 906b, 906c, and/or 906d) from a first position (e.g., 940a, 940b, and/or 940c) on the graphical object (e.g., 954 and/or 982) to a second position (e.g., 940a, 940b, and/or 940c) on the graphical object (e.g., 954 and/or 982) includes displaying an animation of the first representation (e.g., 928a-928d, 930a-930d, 932a-932d, 944a, 946a, 948a, 983a, 983b, and/or 983c) of the field-of-view of the one or more first cameras (e.g., 909a, 909b, 909c, and/or 909d) of the first computer system (e.g., 906a, 906b, 906c, and/or 906d) moving from the first position (e.g., 940a, 940b, and/or 940c) on the graphical object (e.g., 954 and/or 982) to the second position (e.g., 940a, 940b, and/or 940c) on the graphical object (e.g., 954 and/or 982). In some embodiments, moving the second representation (e.g., 918b-918d, 922b-922d, 924b-924d, 926b-926d, 944a, 946a, 948a, 978a, 978b, and/or 978c) of the field-of-view of the one or more first cameras (e.g., 909a, 909b, 909c, and/or 909d) of the first computer system (e.g., 906a, 906b, 906c, and/or 906d) from a third position (e.g., 940a, 940b, and/or 940c) on the graphical object (e.g., 954 and/or 982) to a fourth position (e.g., 940a, 940b, and/or 940c) on the graphical object (e.g., 954 and/or 982) includes displaying an animation of the second representation (e.g., 918b-918d, 922b-922d, 924b-924d, 926b-926d, 944a, 946a, 948a, 978a, 978b, and/or 978c) of the field-of-view of the one or more first cameras (e.g., 909a, 909b, 909c, and/or 909d) of the first computer system (e.g., 906a, 906b, 906c, and/or 906d) moving from the third position (e.g., 940a, 940b, and/or 940c) on the graphical object (e.g., 954 and/or 982) to the fourth position (e.g., 940a, 940b, and/or 940c) on the graphical object (e.g., 954 and/or 982). In some embodiments, moving the first representation (e.g., 928a-928d, 930a-930d, 932a-932d, 944a, 946a, 948a, 983a, 983b, and/or 983c) of the field-of-view of the one or more second cameras (e.g., 909a, 909b, 909c, and/or 909d) of the second computer system (e.g., 906a, 906b, 906c, and/or 906d) from a fifth position (e.g., 940a, 940b, and/or 940c) on the graphical object (e.g., 954 and/or 982) to a sixth position (e.g., 940a, 940b, and/or 940c) on the graphical object (e.g., 954 and/or 982) includes displaying an animation of the first representation (e.g., 928a-928d, 930a-930d, 932a-932d, 944a, 946a, 948a, 983a, 983b, and/or 983c) of the field-of-view of the one or more second cameras (e.g., 909a, 909b, 909c, and/or 909d) of the second computer system (e.g., 906a, 906b, 906c, and/or 906d) moving from the fifth position (e.g., 940a, 940b, and/or 940c) on the graphical object (e.g., 954 and/or 982) to the sixth position (e.g., 940a, 940b, and/or 940c) on the graphical object (e.g., 954 and/or 982). 
In some embodiments, moving the second representation (e.g., 918b-918d, 922b-922d, 924b-924d, 926b-926d, 944a, 946a, 948a, 978a, 978b, and/or 978c) of the field-of-view of the one or more second cameras (e.g., 909a, 909b, 909c, and/or 909d) of the second computer system (e.g., 906a, 906b, 906c, and/or 906d) from a seventh position (e.g., 940a, 940b, and/or 940c) on the graphical object (e.g., 954 and/or 982) to an eighth position (e.g., 940a, 940b, and/or 940c) on the graphical object (e.g., 954 and/or 982) includes displaying an animation of the second representation (e.g., 918b-918d, 922b-922d, 924b-924d, 926b-926d, 944a, 946a, 948a, 978a, 978b, and/or 978c) of the field-of-view of the one or more second cameras (e.g., 909a, 909b, 909c, and/or 909d) of the second computer system (e.g., 906a, 906b, 906c, and/or 906d) moving from the seventh position (e.g., 940a, 940b, and/or 940c) on the graphical object (e.g., 954 and/or 982) to the eighth position (e.g., 940a, 940b, and/or 940c) on the graphical object (e.g., 954 and/or 982). In some embodiments, moving the representations includes displaying an animation of the representations rotating (e.g., concurrently or simultaneously) around a table, while optionally maintaining their positions relative to each other. Displaying an animation of the respective movement of the representations enhances the video communication session experience by allowing a user to quickly identify how and/or where the multiple representations are moving, which provides improved visual feedback.
In some embodiments, displaying the live video communication interface (e.g., 916a-916d, 938a-938d, and/or 976a-976d) includes displaying the first representation (e.g., 928a-928d, 930a-930d, 932a-932d, 944a, 946a, 948a, 983a, 983b, and/or 983c) of the field-of-view of the one or more first cameras (e.g., 909a, 909b, 909c, and/or 909d) with a smaller size than (and, optionally, adjacent to, overlaid on, and/or within a predefined distance from) the second representation (e.g., 918b-918d, 922b-922d, 924b-924d, 926b-926d, 944a, 946a, 948a, 978a, 978b, and/or 978c) of the field-of-view of the one or more first cameras (e.g., 909a, 909b, 909c, and/or 909d) (e.g., the representation of a user in the first scene is smaller than the representation of the surface in the first scene) and displaying the first representation (e.g., 928a-928d, 930a-930d, 932a-932d, 944a, 946a, 948a, 983a, 983b, and/or 983c) of the field-of-view of the one or more second cameras (e.g., 909a, 909b, 909c, and/or 909d) with a smaller size than (and, optionally, adjacent to, overlaid on, and/or within a predefined distance from) the second representation (e.g., 918b-918d, 922b-922d, 924b-924d, 926b-926d, 944a, 946a, 948a, 978a, 978b, and/or 978c) of the field-of-view of the one or more second cameras (e.g., 909a, 909b, 909c, and/or 909d) (e.g., the representation of a user in the second scene is smaller than the representation of the surface in the second scene). Displaying the first representation of the field-of-view of the one or more first cameras with a smaller size than the second representation of the field-of-view of the one or more first cameras and displaying the first representation of the field-of-view of the one or more second cameras with a smaller size than the second representation of the field-of-view of the one or more second cameras enhances the video communication session experience by allowing a user to quickly identify the context of who is sharing the view of the surface, which provides improved visual feedback.
In some embodiments, while concurrently displaying the second representation (e.g., 918b-918d, 922b-922d, 924b-924d, 926b-926d, 944a, 946a, 948a, 978a, 978b, and/or 978c) of the field-of-view of the one or more first cameras (e.g., 909a, 909b, 909c, and/or 909d) of the first computer system (e.g., 906a, 906b, 906c, and/or 906d) (e.g., the representation of the surface in the first scene) on the graphical object (e.g., 954 and/or 982) and the second representation (e.g., 918b-918d, 922b-922d, 924b-924d, 926b-926d, 944a, 946a, 948a, 978a, 978b, and/or 978c) of the field-of-view of the one or more second cameras (e.g., 909a, 909b, 909c, and/or 909d) of the second computer system (e.g., 906a, 906b, 906c, and/or 906d) (e.g., the representation of the surface in the second scene) on the graphical object (e.g., 954, and/or 982), the first computer system (e.g., 906a, 906b, 906c, and/or 906d) displays the first representation (e.g., 928a-928d, 930a-930d, 932a-932d, 944a, 946a, 948a, 983a, 983b, and/or 983c) of the field-of-view of the one or more first cameras (e.g., 909a, 909b, 909c, and/or 909d) at an orientation that is based on a position of the second representation (e.g., 918b-918d, 922b-922d, 924b-924d, 926b-926d, 944a, 946a, 948a, 978a, 978b, and/or 978c) of the field-of-view of the one or more first cameras (e.g., 909a, 909b, 909c, and/or 909d) on the graphical object (e.g., 954 and/or 982) (and/or, optionally, based on a position of the first representation of the field-of-view of the one or more first cameras in the live video communication interface). Further, the first computer system (e.g., 906a, 906b, 906c, and/or 906d) displays the first representation (e.g., 928a-928d, 930a-930d, 932a-932d, 944a, 946a, 948a, 983a, 983b, and/or 983c) of the field-of-view of the one or more second cameras (e.g., 909a, 909b, 909c, and/or 909d) at an orientation that is based on a position of the second representation (e.g., 918b-918d, 922b-922d, 924b-924d, 926b-926d, 944a, 946a, 948a, 978a, 978b, and/or 978c) of the field-of-view of the one or more second cameras (e.g., 909a, 909b, 909c, and/or 909d) on the graphical object (e.g., 954 and/or 982) (and/or, optionally, based on a position of the first representation of the field-of-view of the one or more second cameras in the live video communication interface). 
In some embodiments, in accordance with a determination that a first representation (e.g., 928a-928d, 930a-930d, 932a-932d, 944a, 946a, 948a, 983a, 983b, and/or 983c) of the field-of-view of the one or more respective cameras (e.g., 909a, 909b, 909c, and/or 909d) of the respective computer system (e.g., 906a, 906b, 906c, and/or 906d) is displayed at a first position in the live video communication interface (e.g., 916a-916d, 938a-938d, and/or 976a-976d), the first computer system (e.g., 906a, 906b, 906c, and/or 906d) displays the first representation (e.g., 928a-928d, 930a-930d, 932a-932d, 944a, 946a, 948a, 983a, 983b, and/or 983c) of the field-of-view of the one or more respective cameras (e.g., 909a, 909b, 909c, and/or 909d) of the respective computer system (e.g., 906a, 906b, 906c, and/or 906d) at a first orientation; and in accordance with a determination that a first representation (e.g., 928a-928d, 930a-930d, 932a-932d, 944a, 946a, 948a, 983a, 983b, and/or 983c) of the field-of-view of the one or more respective cameras (e.g., 909a, 909b, 909c, and/or 909d) of the respective computer system (e.g., 906a, 906b, 906c, and/or 906d) is displayed at a second position in the live video communication interface (e.g., 916a-916d, 938a-938d, and/or 976a-976d) different from the first position, the first computer system (e.g., 906a, 906b, 906c, and/or 906d) displays the first representation (e.g., 928a-928d, 930a-930d, 932a-932d, 944a, 946a, 948a, 983a, 983b, and/or 983c) of the field-of-view of the one or more respective cameras (e.g., 909a, 909b, 909c, and/or 909d) of the respective computer system (e.g., 906a, 906b, 906c, and/or 906d) at a second orientation different from the first orientation. Displaying the first representation of the field-of-view of the one or more first cameras at an orientation that is based on a position of the second representation of the field-of-view of the one or more first cameras on the graphical object and displaying the first representation of the field-of-view of the one or more second cameras at an orientation that is based on a position of the second representation of the field-of-view of the one or more second cameras on the graphical object enhances the video communication session experience by improving how representations are displayed on the graphical object, which performs an operation when a set of conditions has been met without requiring further user input.
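A non-limiting sketch of deriving an orientation from a representation's position on the graphical object is shown below, assuming, for illustration only, a circular virtual table centered at a known point; the function and parameter names are hypothetical.

import Foundation

// Orientation for a participant's representation, derived from where the
// associated surface representation sits on the shared graphical object
// (e.g., around a virtual table centered at `center`). Different positions
// around the table therefore yield different orientations.
func orientationDegrees(forPosition position: (x: Double, y: Double),
                        center: (x: Double, y: Double)) -> Double {
    let angle = atan2(position.y - center.y, position.x - center.x)
    return angle * 180 / .pi
}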
In some embodiments, the second representation (e.g., 918b-918d, 922b-922d, 924b-924d, 926b-926d, 944a, 946a, 948a, 978a, 978b, and/or 978c) of the field-of-view of the one or more first cameras (e.g., 909a, 909b, 909c, and/or 909d) of the first computer system (e.g., 906a, 906b, 906c, and/or 906d) (e.g., the representation of the surface in the first scene) includes a representation (e.g., 978a, 978b, and/or 978c) of a drawing (e.g., 970, 972, and/or 974) (e.g., a marking made using a pen, pencil, and/or marker) on the surface (e.g., 908a, 908b, 908c, and/or 908d) in the first scene and/or the second representation (e.g., 918b-918d, 922b-922d, 924b-924d, 926b-926d, 944a, 946a, 948a, 978a, 978b, and/or 978c) of the field-of-view of the one or more second cameras (e.g., 909a, 909b, 909c, and/or 909d) of the second computer system (e.g., 906a, 906b, 906c, and/or 906d) (e.g., the representation of the surface in the second scene) includes a representation (e.g., 978a, 978b, and/or 978c) of a drawing (e.g., 970, 972, and/or 974) (e.g., a marking made using a pen, pencil, and/or marker) on the surface (e.g., 908a, 908b, 908c, and/or 908d) in the second scene. Including a representation of a drawing on the surface in the first scene as part of the second representation of the field-of-view of the one or more first cameras of the first computer system and/or including a representation of a drawing on the surface in the second scene as part of the second representation of the field-of-view of the one or more second cameras of the second computer system enhances the video communication session experience by allowing participants to discuss particular content, which provides improved collaboration between participants and improved visual feedback.
In some embodiments, the second representation (e.g., 918b-918d, 922b-922d, 924b-924d, 926b-926d, 944a, 946a, 948a, 978a, 978b, and/or 978c) of the field-of-view of the one or more first cameras (e.g., 909a, 909b, 909c, and/or 909d) of the first computer system (e.g., 906a, 906b, 906c, and/or 906d) (e.g., the representation of the surface in the first scene) includes a representation (e.g., 918b-918d, 922b-922d, 924b-924d, 926b-926d, 944a, 946a, 948a, 978a, 978b, and/or 978c) of a physical object (e.g., 910, 912, 914, 970, 972, and/or 974) on the surface (e.g., 908a, 908b, 908c, and/or 908d) (e.g., dinner plate and/or electronic device) in the first scene and/or the second representation (e.g., 918b-918d, 922b-922d, 924b-924d, 926b-926d, 944a, 946a, 948a, 978a, 978b, and/or 978c) of the field-of-view of the one or more second cameras (e.g., 909a, 909b, 909c, and/or 909d) of the second computer system (e.g., 906a, 906b, 906c, and/or 906d) (e.g., the representation of the surface in the second scene) includes a representation (e.g., 918b-918d, 922b-922d, 924b-924d, 926b-926d, 944a, 946a, 948a, 978a, 978b, and/or 978c) of a physical object (e.g., 910, 912, 914, 970, 972, and/or 974) (e.g., dinner plate and/or electronic device) on the surface (e.g., 908a, 908b, 908c, and/or 908d) in the second scene. Including a representation of a physical object on the surface in the first scene as part of the second representation of the field-of-view of the one or more first cameras of the first computer system and/or including a representation of a physical object on the surface in the second scene as part of the second representation of the field-of-view of the one or more second cameras of the second computer system enhances the video communication session experience by allowing participants to view physical objects associated with a particular participant, which provides improved collaboration between participants and improved visual feedback.
In some embodiments, while displaying the live video communication interface (e.g., 916a-916d, 938a-938d, and/or 976a-976d), the first computer system (e.g., 906a, 906b, 906c, and/or 906d) detects, via the one or more input devices (e.g., 907a, 907b, 907c, and/or 907d), a third user input (e.g., 950e). In response to detecting the third user input (e.g., 950e), the first computer system (e.g., 906a, 906b, 906c, and/or 906d) displays visual markup content (e.g., 956) (e.g., handwriting) in (e.g., adding visual markup content to) the second representation (e.g., 918b-918d, 922b-922d, 924b-924d, 926b-926d, 944a, 946a, 948a, 978a, 978b, and/or 978c) of the field-of-view of the one or more second cameras (e.g., 909a, 909b, 909c, and/or 909d) of the second computer system (e.g., 906a, 906b, 906c, and/or 906d) in accordance with the third user input (e.g., 950e). In some embodiments, the visual markings (e.g., 956) are concurrently displayed at both the first computing system (e.g., 906a, 906b, 906c, and/or 906d) and at the second computing system (e.g., 906a, 906b, 906c, and/or 906d) using their respective display generation components (e.g., 907a, 907b, 907c, and/or 907d). Displaying visual markup content in the second representation of the field-of-view of the one or more second cameras of the second computer system in accordance with the third user input enhances the video communication session experience by improving how participants collaborate and share content, which provides improved visual feedback.
In some embodiments, the visual markup content (e.g., 956) is displayed on a representation (e.g., 918b-918d, 922b-922d, 924b-924d, 926b-926d, 944a, 946a, 948a, 978a, 978b, and/or 978c) of an object (e.g., 910, 912, 914, 970, 972, and/or 974) (e.g., a physical object in the second scene or a virtual object) in the second representation (e.g., 918b-918d, 922b-922d, 924b-924d, 926b-926d, 944a, 946a, 948a, 978a, 978b, and/or 978c) of the field-of-view of the one or more second cameras (e.g., 909a, 909b, 909c, and/or 909d) of the second computer system (e.g., 906a, 906b, 906c, and/or 906d). In some embodiments, while displaying the visual markup content (e.g., 956) on the representation (e.g., 918b-918d, 922b-922d, 924b-924d, 926b-926d, 944a, 946a, 948a, 978a, 978b, and/or 978c) of the object (e.g., 910, 912, 914, 970, 972, and/or 974) in the second representation (e.g., 918b-918d, 922b-922d, 924b-924d, 926b-926d, 944a, 946a, 948a, 978a, 978b, and/or 978c) of the field-of-view of the one or more second cameras (e.g., 909a, 909b, 909c, and/or 909d) of the second computer system (e.g., 906a, 906b, 906c, and/or 906d), the first computer system (e.g., 906a, 906b, 906c, and/or 906d) receives an indication of movement (e.g., detecting movement) of the object (e.g., 910, 912, 914, 970, 972, and/or 974) in the second representation (e.g., 918b-918d, 922b-922d, 924b-924d, 926b-926d, 944a, 946a, 948a, 978a, 978b, and/or 978c) of the field-of-view of the one or more second cameras (e.g., 909a, 909b, 909c, and/or 909d) of the second computer system (e.g., 906a, 906b, 906c, and/or 906d). In response to receiving the indication of movement of the object (e.g., 910, 912, 914, 970, 972, and/or 974) in the second representation (e.g., 918b-918d, 922b-922d, 924b-924d, 926b-926d, 944a, 946a, 948a, 978a, 978b, and/or 978c) of the field-of-view of the one or more second cameras (e.g., 909a, 909b, 909c, and/or 909d) of the second computer system (e.g., 906a, 906b, 906c, and/or 906d), the first computer system (e.g., 906a, 906b, 906c, and/or 906d) moves the representation (e.g., 918b-918d, 922b-922d, 924b-924d, 926b-926d, 944a, 946a, 948a, 978a, 978b, and/or 978c) of the object (e.g., 910, 912, 914, 970, 972, and/or 974) in the second representation (e.g., 918b-918d, 922b-922d, 924b-924d, 926b-926d, 944a, 946a, 948a, 978a, 978b, and/or 978c) of the field-of-view of the one or more second cameras (e.g., 909a, 909b, 909c, and/or 909d) of the second computer system (e.g., 906a, 906b, 906c, and/or 906d) in accordance with the movement of the object (e.g., 910, 912, 914, 970, 972, and/or 974) and moves the visual markup content (e.g., 956) in the second representation (e.g., 918b-918d, 922b-922d, 924b-924d, 926b-926d, 944a, 946a, 948a, 978a, 978b, and/or 978c) of the field-of-view of the one or more second cameras (e.g., 909a, 909b, 909c, and/or 909d) of the second computer system (e.g., 906a, 906b, 906c, and/or 906d) in accordance with the movement of the object (e.g., 910, 912, 914, 970, 972, and/or 974), including maintaining a position of the visual markup content (e.g., 956) relative to the representation of the object (e.g., 910, 912, 914, 970, 972, and/or 974). 
Moving the representation of the object in the second representation of the field-of-view of the one or more second cameras of the second computer system in accordance with the movement of the object and moving the visual markup content in the second representation of the field-of-view of the one or more second cameras of the second computer system in accordance with the movement of the object, including maintaining a position of the visual markup content relative to the representation of the object, enhances the video communication session experience by automatically moving representations and visual markup content in response to physical movement of the object in the physical environment without requiring any further input from the user, which reduces the number of inputs needed to perform an operation.
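A minimal, illustrative sketch of maintaining the position of visual markup content relative to the representation of an object as that object moves is shown below; the type and member names are hypothetical and only model the bookkeeping described above.

// Keeps visual markup content attached to the representation of an object:
// when an indication of movement for the object is received, the markup
// follows, so its position relative to the object is maintained.
struct AnchoredMarkup {
    var objectPosition: (x: Double, y: Double)
    var markupOffset: (dx: Double, dy: Double) // offset from the object

    var markupPosition: (x: Double, y: Double) {
        (objectPosition.x + markupOffset.dx, objectPosition.y + markupOffset.dy)
    }

    mutating func objectMoved(to newPosition: (x: Double, y: Double)) {
        objectPosition = newPosition // offset is unchanged, so the markup follows
    }
}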
In some embodiments, the visual markup content (e.g., 956) is displayed on a representation of a page (e.g., 910) (e.g., a page of a physical book in the second scene, a sheet of paper in the second scene, a virtual page of a book, or a virtual sheet of paper) in the second representation (e.g., 918b-918d, 922b-922d, 924b-924d, 926b-926d, 944a, 946a, 948a, 978a, 978b, and/or 978c) of the field-of-view of the one or more second cameras (e.g., 909a, 909b, 909c, and/or 909d) of the second computer system (e.g., 906a, 906b, 906c, and/or 906d). In some embodiments, the first computer system (e.g., 906a, 906b, 906c, and/or 906d) receives an indication (e.g., detects) that the page has been turned (e.g., the page has been flipped over; the surface of the page upon which the visual markup content is displayed is no longer visible to the one or more second cameras of the second computer system). In response to receiving the indication (e.g., detecting) that the page has been turned, the first computer system (e.g., 906a, 906b, 906c, and/or 906d) ceases display of the visual markup content (e.g., 956). Ceasing display of the visual markup content in response to receiving the indication that the page has been turned enhances the video communication session experience by automatically removing content when it is no longer relevant without requiring any further input from the user, which reduces the number of inputs needed to perform an operation.
In some embodiments, after ceasing display of the visual markup content (e.g., 956), the first computer system (e.g., 906a, 906b, 906c, and/or 906d) receives an indication (e.g., detecting) that the page is re-displayed (e.g., turned back to; the surface of the page upon which the visual markup content was displayed is again visible to the one or more second cameras of the second computer system). In response to receiving an indication that the page is re-displayed, the first computer system (e.g., 906a, 906b, 906c, and/or 906d) displays (e.g., re-displays) the visual markup content (e.g., 956) on the representation of the page (e.g., 910) in the second representation (e.g., 918b-918d, 922b-922d, 924b-924d, 926b-926d, 944a, 946a, 948a, 978a, 978b, and/or 978c) of the field-of-view of the one or more second cameras (e.g., 909a, 909b, 909c, and/or 909d) of the second computer system (e.g., 906a, 906b, 906c, and/or 906d). In some embodiments, the visual markup content (e.g., 956) is displayed (e.g., re-displayed) with the same orientation with respect to the page as the visual markup content (e.g., 956) had prior to the page being turned. Displaying the visual markup content on the representation of the page in the second representation of the field-of-view of the one or more second cameras of the second computer system in response to receiving an indication that the page is re-displayed enhances the video communication session experience by automatically re-displaying content when it is relevant without requiring any further input from the user, which reduces the number of inputs needed to perform an operation.
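A minimal, illustrative sketch of one way the page-turn behavior could be modeled is shown below: markup is stored per page so that it ceases to be displayed when the page is turned and is re-displayed, with its original content, when the page is turned back. The page-keyed store and its names are hypothetical.

// Markup stored per page. Turning to a new page hides the previous page's
// markup; turning back re-displays the markup previously made on that page.
struct PageMarkupStore {
    private var markupByPage: [Int: [String]] = [:]
    private(set) var visibleMarkup: [String] = []

    mutating func addMarkup(_ markup: String, toPage page: Int) {
        markupByPage[page, default: []].append(markup)
        visibleMarkup = markupByPage[page] ?? []
    }

    mutating func pageChanged(to page: Int) {
        // Ceases display of markup for the previous page and re-displays
        // any markup previously made on the newly visible page.
        visibleMarkup = markupByPage[page] ?? []
    }
}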
In some embodiments, while displaying the visual markup content (e.g., 956) in the second representation (e.g., 918b-918d, 922b-922d, 924b-924d, 926b-926d, 944a, 946a, 948a, 978a, 978b, and/or 978c) of the field-of-view of the one or more second cameras (e.g., 909a, 909b, 909c, and/or 909d) of the second computer system (e.g., 906a, 906b, 906c, and/or 906d), the first computer system (e.g., 906a, 906b, 906c, and/or 906d) receives an indication of a request detected by the second computer system (e.g., 906a, 906b, 906c, and/or 906d) to modify (e.g., remove all or part of and/or add to) the visual markup content (e.g., 956) in the live video communication session. In response to receiving the indication of the request detected by the second computer system (e.g., 906a, 906b, 906c, and/or 906d) to modify the visual markup content (e.g., 956) in the live video communication session, the first computer system (e.g., 906a, 906b, 906c, and/or 906d) modifies the visual markup content (e.g., 956) in the second representation (e.g., 918b-918d, 922b-922d, 924b-924d, 926b-926d, 944a, 946a, 948a, 978a, 978b, and/or 978c) of the field-of-view of the one or more second cameras (e.g., 909a, 909b, 909c, and/or 909d) of the second computer system (e.g., 906a, 906b, 906c, and/or 906d) in accordance with the request to modify the visual markup content (e.g., 956). Modifying the visual markup content in the second representation of the field-of-view of the one or more second cameras of the second computer system in accordance with the request to modify the visual markup content enhances the video communication session experience by allowing participants to modify other participants' content without requiring input from the original visual markup content creator, which reduces the number of inputs needed to perform an operation.
In some embodiments, after displaying (e.g., after initially displaying) the visual markup content (e.g., 956) in the second representation (e.g., 918b-918d, 922b-922d, 924b-924d, 926b-926d, 944a, 946a, 948a, 978a, 978b, and/or 978c) of the field-of-view of the one or more second cameras (e.g., 909a, 909b, 909c, and/or 909d) of the second computer system (e.g., 906a, 906b, 906c, and/or 906d), the first computer system (e.g., 906a, 906b, 906c, and/or 906d) fades out (e.g., reducing visibility of, blurring out, dissolving, and/or dimming) the display of the visual markup content (e.g., 956) over time (e.g., five seconds, thirty seconds, one minute, and/or five minutes). In some embodiments, the computer system (e.g., 906a, 906b, 906c, and/or 906d) begins to fade out the display of the visual markup content (e.g., 956) in accordance with a determination that a threshold time has passed since the third user input (e.g., 950e) has been detected (e.g., zero seconds, thirty seconds, one minute, and/or five minutes). In some embodiments, the computer system (e.g., 906a, 906b, 906c, and/or 906d) continues to fade out the visual markup content (e.g., 956) until the visual markup content (e.g., 956) ceases to be displayed. Fading out the display of the visual markup content over time after displaying the visual markup content in the second representation of the field-of-view of the one or more second cameras of the second computer system enhances the video communication session experience by automatically removing content when it is no longer relevant without requiring any further input from the user, which reduces the number of inputs needed to perform an operation.
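A non-limiting sketch of fading out markup after a threshold time has passed since the markup input was detected is shown below; the specific durations and the linear fade are illustrative defaults only, not required values.

// Opacity of the markup as a function of time since the markup input was
// detected: fully visible until a hold threshold elapses, then fades
// linearly until the markup ceases to be displayed.
func markupOpacity(elapsedSeconds: Double,
                   holdDuration: Double = 30,
                   fadeDuration: Double = 5) -> Double {
    if elapsedSeconds <= holdDuration { return 1.0 }
    let faded = 1.0 - (elapsedSeconds - holdDuration) / fadeDuration
    return max(0.0, faded)
}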
In some embodiments, while displaying the live video communication interface (e.g., 916a-916d, 938a-938d, and/or 976a-976d) including the representation (e.g., 918b-918d, 922b-922d, 924b-924d, 926b-926d, 944a, 946a, 948a, 978a, 978b, and/or 978c) of the surface (e.g., 908a, 908b, 908c, and/or 908d) (e.g., a first surface) in the first scene, the first computer system (e.g., 906a, 906b, 906c, and/or 906d) detects, via the one or more input devices, a speech input (e.g., 950f) that includes a query (e.g., a verbal question). In response to detecting the speech input (e.g., 950f), the first computer system (e.g., 906a, 906b, 906c, and/or 906d) outputs a response (e.g., 968) to the query (e.g., an audio and/or graphic output) based on visual content (e.g., 966) (e.g., text and/or a graphic) in the second representation (e.g., 918b-918d, 922b-922d, 924b-924d, 926b-926d, 944a, 946a, 948a, 978a, 978b, and/or 978c) of the field-of-view of the one or more first cameras (e.g., 909a, 909b, 909c, and/or 909d) of the first computer system (e.g., 906a, 906b, 906c, and/or 906d) and/or the second representation (e.g., 918b-918d, 922b-922d, 924b-924d, 926b-926d, 944a, 946a, 948a, 978a, 978b, and/or 978c) of the field-of-view of the one or more second cameras (e.g., 909a, 909b, 909c, and/or 909d) of the second computer system (e.g., 906a, 906b, 906c, and/or 906d). Outputting a response to the query based on visual content in the second representation of the field-of-view of the one or more first cameras of the first computer system and/or the second representation of the field-of-view of the one or more second cameras of the second computer system enhances the live video communication user interface by automatically outputting a relevant response based on visual content without the need for further speech input from the user, which reduces the number of inputs needed to perform an operation.
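A minimal, illustrative sketch of answering a query from visual content is shown below, assuming, hypothetically, that text appearing in the displayed representations has already been recognized (e.g., by an OCR step) and is available as strings; the matching approach and names are illustrative only and not part of the embodiments described above.

import Foundation

// Answers a spoken query using lines of text recognized in the displayed
// surface representations. Returns the first recognized line that contains
// all of the query's terms, or nil if the visual content does not answer
// the query.
func response(toQuery query: String, recognizedText: [String]) -> String? {
    let terms = query.lowercased().split(separator: " ").map(String.init)
    return recognizedText.first { line in
        let lowered = line.lowercased()
        return terms.allSatisfy { lowered.contains($0) }
    }
}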
In some embodiments, while displaying the live video communication interface (e.g., 916a-916d, 938a-938d, and/or 976a-976d), the first computer system (e.g., 906a, 906b, 906c, and/or 906d) detects that the second representation (e.g., 918b-918d, 922b-922d, 924b-924d, 926b-926d, 944a, 946a, 948a, 978a, 978b, and/or 978c) of the field-of-view of the one or more first cameras (e.g., 909a, 909b, 909c, and/or 909d) of the first computer system (e.g., 906a, 906b, 906c, and/or 906d) (or, optionally, the second representation of the field-of-view of the one or more second cameras of the second computer system) includes a representation (e.g., 918d, 922d, 924d, 926d, and/or 948a) of a third computer system (e.g., 914) in the first scene (or, optionally, in the second scene, respectively) that is in communication with (e.g., includes) a third display generation component. In response to detecting that the second representation (e.g., 918b-918d, 922b-922d, 924b-924d, 926b-926d, 944a, 946a, 948a, 978a, 978b, and/or 978c) of the field-of-view of the one or more first cameras (e.g., 909a, 909b, 909c, and/or 909d) of the first computer system (e.g., 906a, 906b, 906c, and/or 906d) includes the representation (e.g., 918d, 922d, 924d, 926d, and/or 948a) of the third computer system (e.g., 914) in the first scene that is in communication with the third display generation component, the first computer system (e.g., 906a, 906b, 906c, and/or 906d) displays, in the live video communication interface (e.g., 916a-916d, 938a-938d, and/or 976a-976d), visual content corresponding to display data received from the third computer system (e.g., 914) that corresponds to visual content displayed on the third display generation component. In some embodiments, the computer system (e.g., 906a, 906b, 906c, and/or 906d) receives, from the third computer system (e.g., 914), display data corresponding to the visual content displayed on the third display generation component. In some embodiments, the first computer system (e.g., 906a, 906b, 906c, and/or 906d) is in communication with the third computer system (e.g., 914) independent of the live communication session (e.g., via screen share). In some embodiments, displaying visual content corresponding to the display data received from the third computer system (e.g., 914) enhances the live video communication session by providing a higher resolution, and more accurate, representation of the content displayed on the third display generation component. Displaying visual content corresponding to display data received from the third computer system that corresponds to visual content displayed on the third display generation component enhances the video communication session experience by providing a higher resolution and more accurate representation of what is on the third display generation component without requiring any further input from the user, which provides improved visual feedback and reduces the number of inputs needed to perform an operation.
In some embodiments, the first computer system (e.g., 906a, 906b, 906c, and/or 906d) displays (or, optionally, projects, e.g., via a second display generation component in communication with the first computer system), onto a physical object (e.g., 910, 912, 914, 970, 972, and/or 974) (e.g., a physical object such as, e.g., a table, book, and/or piece of paper in the first scene), content (e.g., 958) that is included in the live video communication session (e.g., virtual markup content and/or visual content in the second scene that is, e.g., represented in the second representation of the field-of-view of the one or more second cameras of the second computer system). In some embodiments, the content (e.g., 958) displayed onto the physical object (e.g., 910, 912, 914, 970, 972, and/or 974) includes the visual markup content (e.g., 956) (e.g., the visual markup content in the second representation of the field-of-view of the one or more second cameras of the second computer system that is received in response to detecting the third user input). In some embodiments, a computer system (e.g., 906a, 906b, 906c, and/or 906d) receives an indication of movement (e.g., detecting movement) of the physical object (e.g., 910, 912, 914, 970, 972, and/or 974), and in response, moves the content (e.g., 958) displayed onto the physical object (e.g., 910, 912, 914, 970, 972, and/or 974) in accordance with the movement of the physical object (e.g., 910, 912, 914, 970, 972, and/or 974), including maintaining a position (e.g., 961) of the content (e.g., 958) relative to the physical object (e.g., 910, 912, 914, 970, 972, and/or 974). In some embodiments, the content (e.g., 958) is displayed onto a physical page (e.g., a page of book 910) and, in response to receiving an indication that the page has been turned, a computer system (e.g., 906a, 906b, 906c, and/or 906d) ceases display of the content (e.g., 958) onto the page. In some embodiments, after ceasing display of the content (e.g., 958), a computer system (e.g., 906a, 906b, 906c, and/or 906d) receives an indication that the page has been turned back to, and in response, displays (e.g., re-displays) the content (e.g., 958) onto the page. In some embodiments, a computer system (e.g., 906a, 906b, 906c, and/or 906d) modifies the content (e.g., 958) in response to receiving an indication (e.g., from the first and/or second computer system) of a request to modify the content (e.g., 958). In some embodiments, after displaying the content (e.g., 958) onto the physical object (e.g., 910, 912, 914, 970, 972, and/or 974), a computer system (e.g., 906a, 906b, 906c, and/or 906d) fades out the display of the content (e.g., 958) over time. Displaying, onto a physical object, content that is included in the live video communication session enhances the video communication session experience by allowing users to collaborate in a mixed reality environment, which provides improved visual feedback.
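A minimal sketch of keeping projected content positioned relative to a moving physical object, assuming (hypothetically) that the object's origin is tracked in a two-dimensional coordinate space, might store the content's offset from the object and reapply it whenever the object moves:

```swift
import CoreGraphics

/// Hypothetical anchor: the projected content keeps a fixed offset from the tracked object.
struct AnchoredContent {
    var offsetFromObject: CGVector   // the content's position relative to the physical object

    /// Returns where the content should be rendered after the object moves.
    func position(forObjectAt objectOrigin: CGPoint) -> CGPoint {
        CGPoint(x: objectOrigin.x + offsetFromObject.dx,
                y: objectOrigin.y + offsetFromObject.dy)
    }
}

// Example: the object (e.g., a book) is detected at a new origin after movement;
// the projected markup follows so its position relative to the object is maintained.
let anchored = AnchoredContent(offsetFromObject: CGVector(dx: 12, dy: -30))
print(anchored.position(forObjectAt: CGPoint(x: 140, y: 220)))
```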
Note that details of the processes described above with respect to method 1000 (e.g.,
As described below, method 1200 provides an intuitive way for managing digital content. The method reduces the cognitive burden on a user to manage digital content, thereby creating a more efficient human-machine interface. For battery-operated computing devices, enabling a user to manage digital content faster and more efficiently conserves power and increases the time between battery charges.
The computer system displays (1202), via the display generation component (and/or in a virtual environment, in an electronic document, and/or in a user interface of an application, such as a presentation application and/or a live video communication application), a representation of a physical mark (e.g., 1134 and/or 1152) (e.g., a pen, marker, crayon, pencil mark and/or other drawing implement mark) (e.g., drawing and/or writing) in a physical environment (e.g., physical environment of user 1104a) (e.g., an environment that is in the field-of-view of one or more cameras and/or an environment that is not a virtual environment) based on a view of the physical environment (e.g., 1108 and/or 1106) in a field of view (e.g., 620) of one or more cameras (e.g., image data, video data, and/or a live camera feed by one or more cameras of the computer system and/or one or more cameras of a remote computer system, such as a computer system associated with a remote participant in a live video communication session). In some embodiments, the view of the physical environment includes (or represents) the physical mark and a physical background (e.g., 1106a and/or notebook of
While displaying the representation of the physical mark without displaying the one or more elements of the portion of the physical background that is in the field of view of the one or more cameras, the computer system obtains (1204) (e.g., receives and/or detects) data (e.g., image data, video data, and/or a live camera feed captured by one or more cameras of the computer system and/or one or more cameras of a remote computer system, such as a computer system associated with a remote participant in a live video communication session) (e.g., in near-real-time and/or in real-time) that includes (or represents) a new physical mark in the physical environment (e.g., 1128 and/or 1150).
In response to obtaining data representing the new physical mark in the physical environment, the computer system displays (1206) a representation of the new physical mark (e.g., 1134 and/or 1152) without displaying the one or more elements of the portion of the physical background that is in the field of view of the one or more cameras (e.g., as depicted in
In some embodiments, the portion of the physical background is adjacent to and/or at least partially (e.g., completely or only partially) surrounds the physical mark (e.g., as depicted in
In some embodiments, the portion of the physical background is at least partially surrounded by the physical mark (e.g., as depicted in
In some embodiments, the computer system displays (e.g., concurrently with the representation of the physical mark and/or the representation of the new physical mark) a representation of a hand of a user (e.g., 1136) that is in the field of view of the one or more cameras without displaying the one or more elements of the portion of the physical background that is in the field of view of the one or more cameras, wherein the hand of the user is in a foreground of the one or more elements of the portion of the physical background that is in the field of view of the one or more cameras (e.g., as depicted in
In some embodiments, the computer system displays (e.g., concurrently with the representation of the physical mark, the representation of the new physical mark, and/or the representation of a hand of a user) a representation of a marking utensil (e.g., 1138) (e.g., a pen, marker, crayon, pencil mark, and/or other drawing tool) without displaying the one or more elements of the portion of the physical background that is in the field of view of the one or more cameras (e.g., as depicted in
In some embodiments, before displaying the representation of the physical mark without displaying one or more elements of a portion of the physical background that is in the field of view of the one or more cameras (e.g.,
In some embodiments, detecting the user input corresponding to the request to modify the representation of the one or more elements of a portion of the physical background includes detecting a user input (e.g., 1115d, 1115e) directed at a control (e.g., 1140b and/or 1140c) (e.g., a selectable control, a slider, and/or option picker) that includes a set (e.g., a continuous set or a discrete set) of emphasis options (e.g., 1140b and/or 1140c as depicted in
In some embodiments, the user input corresponding to the request to modify the representation of the one or more elements of a portion of the physical background includes detecting a user input directed at a selectable user interface object (e.g., 1140a) (e.g., an affordance and/or button). In some embodiments, the affordance is a toggle that, when enabled, sets the degree of emphasis to 100% and, when disabled, sets the degree of emphasis to 0%. In some embodiments, the computer system detects a request (e.g., a number of inputs on a button, such as an up and/or down button) to gradually change the degree of emphasis. In some embodiments, the affordance does not modify the degree of emphasis for the representation of the physical mark. Displaying the representation of the physical mark with the second degree of emphasis greater than the first degree of emphasis relative to the representation of the one or more elements of the portion of the physical background in response to detecting an input directed at a selectable user interface object improves the user interface because it provides additional control options that allow the user to change an emphasis of the background (e.g., fully and/or partially remove the background), provides visual feedback that the camera is on, and provides visual feedback that input was detected.
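As an illustrative sketch (the function name and the clamping behavior are assumptions), the emphasis options described above could be mapped to a background opacity while the representation of the physical mark is left unchanged; a toggle would select the two endpoints and a slider any value between them:

```swift
/// Hypothetical mapping from an emphasis degree (0...1) to the opacity of the
/// physical-background layer; the physical-mark layer is not affected.
func backgroundOpacity(forEmphasisDegree degree: Double) -> Double {
    let clamped = min(max(degree, 0), 1)   // 1 fully de-emphasizes (hides) the background
    return 1 - clamped
}

// Toggle enabled -> degree 1.0 -> background hidden; disabled -> degree 0.0 -> background shown.
print(backgroundOpacity(forEmphasisDegree: 1.0))   // 0.0
print(backgroundOpacity(forEmphasisDegree: 0.0))   // 1.0
```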
In some embodiments, the physical mark in the physical environment is a first physical mark (e.g., 1128 and/or 1150), and the first physical mark is in the field of view of one or more cameras of the computer system (e.g., 1102a). In some embodiments, the computer system displays, via the display generation component, a representation (e.g., 1175b and/or 1179) of a second physical mark in a physical environment (e.g., the physical marks on 1172 as depicted in
In some embodiments, the representation of the first physical mark is a first representation (e.g., 1175a-1175c) of the first physical mark and is displayed in a first portion (e.g., 1175a-1175d) of a user interface (e.g., 1174a-1174d). In some embodiments, while displaying the first representation of the first physical mark in the first portion of the user interface, the computer system detects a first set of one or more user inputs (e.g., 1115m1 and/or 1115m2) including an input directed at a first selectable user interface object (e.g., an input directed at 1154a-1154d, 1176a-1176d) (e.g., that is adjacent to, next to, and/or within a predefined distance from the representation of the first physical mark). In some embodiments, the second portion of the user interface is a collaborative area of the user interface and/or a shared area of the user interface. In some embodiments, in response to detecting the first set of one or more user inputs, the computer system displays a second representation (e.g., 1154, 1156, 1179, and/or 1182) of the first physical mark in a second portion (e.g., 1118) of the user interface different from the first portion of the user interface (e.g., while continuing to display the representation of the first physical mark in the first portion of the user interface and/or while ceasing to display the representation of the first physical mark in the first portion of the user interface). In some embodiments, the second representation of the first physical mark displayed in the second portion of the user interface is based on image data (e.g., a still image, a video and/or a live camera feed) captured by the one or more cameras of the computer system. In some embodiments, the computer system displays the second representation of the first physical mark in the second portion without displaying the one or more elements of the portion of the physical background that is in the field of view of the one or more cameras of the computer system. In some embodiments, the computer system concurrently displays, in the second portion, the representation of the second physical mark with the second representation of the first physical mark. Displaying the second representation of the first physical mark in the second portion of the user interface in response to detecting input improves the video communication session experience because a user can move the user's mark and/or another user's physical marks to a shared collaboration space, which improves how users collaborate and/or communicate during a live video communication session and provides improved visual feedback that input was detected.
In some embodiments, the representation of the second physical mark is a first representation (e.g., 1175a-1175d) of the second physical mark and is displayed in a third portion (e.g., 1175a-1175d) of the user interface (e.g., 1174a-1174d) (e.g., different from the first portion and/or second portion). In some embodiments, the computer system detects (e.g., while displaying the second representation of the first physical mark in the third portion) a second set of one or more user inputs (e.g., 1115m1 and/or 1115m2) corresponding to a request to display a second representation (e.g., 1154, 1156, 1179, or 1182) of the second physical mark in a fourth portion (e.g., 1118) of the user interface different from the third portion of the user interface. In some embodiments, the second set of one or more user inputs includes a user input directed at a second affordance. In some embodiments, the fourth portion of the user interface is a collaborative area of the user interface and/or a shared area of the user interface. In some embodiments, in response to detecting the set of one or more user inputs corresponding to the request to display the second representation of the second physical mark in the fourth portion of the user interface, the computer system displays the second representation of the second physical mark (e.g., associated with a user different from the user associated with a first physical mark) in the fourth portion of the user interface (e.g., while continuing to display the first representation of the second physical mark in the third portion of the user interface and/or while ceasing to display the first representation of the second physical mark in the third portion of the user interface). In some embodiments, the computer system displays the second representation of the second physical mark in the fourth portion without displaying one or more elements of the portion of the physical background that is in the field of view of the one or more cameras of the external computer system. Displaying the second representation of the second physical mark in the fourth portion of the user interface in response to detecting user input during a live video communication session improves the video communication session experience because a user can move other participants' physical marks to a shared collaboration space, which improves how users collaborate and/or communicate during a live video communication session and provides improved visual feedback that input was detected.
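The movement of mark representations into a shared portion of the interface could be modeled, purely for illustration, by the small data structure below; the type, property names, and string-based mark descriptions are assumptions rather than disclosed implementation details:

```swift
/// Hypothetical model: each participant has a personal portion of the interface,
/// and marks can additionally be represented in a shared (collaborative) portion.
struct CollaborationBoard {
    var personalPortions: [String: [String]]   // participant name -> representations of marks
    var sharedPortion: [String] = []

    mutating func share(markDescription: String, from participant: String) {
        guard let marks = personalPortions[participant],
              let index = marks.firstIndex(of: markDescription) else { return }
        // The mark remains in the participant's portion; a second representation
        // of it is added to the shared portion of the interface.
        sharedPortion.append(marks[index])
    }
}

var board = CollaborationBoard(personalPortions: ["Jane": ["sketch of a circuit"]])
board.share(markDescription: "sketch of a circuit", from: "Jane")
print(board.sharedPortion)   // ["sketch of a circuit"]
```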
In some embodiments, the computer system detects a request to display a digital mark (e.g., 1151g and/or 1151m1) (e.g., a digital representation of a physical mark and/or machine-generated mark) that corresponds to a third physical mark. In some embodiments, in response to detecting the request to display the digital mark, the computer system displays the digital mark that corresponds to the third physical mark (e.g., 1154, 1156, and/or 1180). In some embodiments, in response to detecting the request to display the digital mark, the computer system displays the digital mark and ceases to display the third physical mark. In some embodiments, displaying the digital mark includes obtaining data that includes an image of the third physical mark and generating a digital mark based on the third physical mark. In some embodiments, the digital mark has a different appearance than the representation of the third physical mark based on the digital mark being machine-generated (e.g., as if the mark were inputted directly on the computer, for example, using a mouse or stylus, as opposed to being made on a physical surface). In some embodiments, the representation of the third physical mark is the same as or different from the representation of the physical mark. In some embodiments, the third physical mark is captured by one or more cameras of a computer system that is different from the computer system detecting the request to display the representation of the digital mark. Displaying a digital mark that corresponds to the third physical mark provides additional control options of how physical marks are displayed within the user interface and/or how users collaborate during a live video communication session.
In some embodiments, while displaying the digital mark, the computer system detects a request to modify (e.g., 1115h and/or 1115i) (e.g., edit and/or change) (e.g., a visual characteristic of and/or visual appearance of) the digital mark corresponding to the third physical mark. In some embodiments, in response to detecting the request to modify the digital mark corresponding to the third physical mark, the computer system displays a new digital mark (e.g., 1156 in
In some embodiments, displaying the representation of the physical mark is based on image data captured by a first camera (e.g., a wide angle camera and/or a single camera) having a field of view (e.g., 1120) that includes a face of a user (e.g., shaded region 1108) and the physical mark (e.g., shaded region 1109) (e.g., a surface such as, for example, a desk and/or table, positioned between the user and the first camera in the physical environment that includes the physical mark). In some embodiments, the computer system displays a representation of a face of a user (e.g., a user of the computer system and/or a remote user associated with a remote computer system, such as a different participant in the live video communication session) in the physical environment based on the image data captured by the first camera (e.g., the representation of the physical mark and the representation of the user are based on image data captured by the same camera (e.g., a single camera)). Displaying the representation of the physical mark based on the image data captured by the first camera improves the computer system because a user can view different angles of a physical environment using the same camera, viewing different angles does not require further action from the user (e.g., moving the camera), doing so reduces the number of devices needed to perform an operation, the computer system does not need to have two separate cameras to capture different views, and the computer system does not need a camera with moving parts to change angles, which reduces cost, complexity, and wear and tear on the device.
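One way a single wide-angle camera could serve both views, sketched here under the assumption that the face region and the surface/mark region of each frame are known rectangles, is to crop two regions from the same captured image (with the surface crop later corrected for perspective):

```swift
import CoreGraphics

/// Hypothetical helper: derives both the face view and the mark/surface view from a
/// single wide-angle frame by cropping two different regions of the same image,
/// so no second camera or moving parts are needed.
func views(from frame: CGImage,
           faceRegion: CGRect,
           markRegion: CGRect) -> (face: CGImage?, mark: CGImage?) {
    let face = frame.cropping(to: faceRegion)
    let mark = frame.cropping(to: markRegion)   // later corrected for perspective
    return (face, mark)
}
```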
In some embodiments, the computer system displays a representation of the face of the user (e.g., 1104a-1104d) (e.g., a user of the computer system and/or a remote user associated with a remote computer system, such as a different participant in the live video communication session) based on the image data captured by the first camera. In some embodiments, the field of view of the first camera includes (or represents) the face of the user and a physical background of the user (e.g., the physical area in the background of a face of user 1104a, 1104b, 1104c, or 1104d in
Note that details of the processes described above with respect to method 1200 (e.g.,
Device 1100a of
In some embodiments, device 1100a adds digital text to document 1306 in response to an input at device 1100a (e.g., at a button, keyboard, or touchscreen of device 1100a). In some embodiments, elements other than text are optionally added to document 1306. For example, in some embodiments, device 1100a adds images and/or content similar to images and/or slide content of
As described below, method 1400 provides an intuitive way for managing digital content. The method reduces the cognitive burden on a user to manage digital content, thereby creating a more efficient human-machine interface. For battery-operated computing devices, enabling a user to manage digital content faster and more efficiently conserves power and increases the time between battery charges.
In method 1400, the computer system displays (1402), via the display generation component, an electronic document (e.g., 1306 and/or 118) (e.g., a virtual document, an editable electronic document, a document generated by the computer system, and/or a document stored on the computer system). In some embodiments, the electronic document is displayed in a graphical user interface of an application (e.g., a word processor application and/or a note-taking application).
The computer system detects (1404), via the one or more cameras, handwriting (e.g., 1310) (e.g., physical marks such as pen marks, pencil marks, marker marks, and/or crayon marks, handwritten characters, handwritten numbers, handwritten bullet points, handwritten symbols, and/or handwritten punctuation) that includes physical marks on a physical surface (e.g., 1106a and/or 1308) (e.g., a piece of paper, a notepad, a white board, and/or a chalk board) that is in a field of view (e.g., 1120a, 620, 6204, and/or 688) of the one or more cameras and is separate from the computer system. In some embodiments, the handwriting (and/or the physical surface) is within a field-of-view of the one or more cameras. In some embodiments, the physical surface is not an electronic surface such as a touch-sensitive surface. In some embodiments, the physical surface is in a designated position relative to a user (e.g., in front of the user, between the user and the one or more cameras, and/or in a horizontal plane). In some embodiments, the computer system does not add (e.g., foregoes adding) digital text for handwriting that is not on the physical surface. In some embodiments, the computer system only adds digital text for handwriting that is on the physical surface (e.g., the handwriting has to be in a designated area and/or physical surface).
In response to detecting the handwriting that includes physical marks on the physical surface that is in the field of view of the one or more cameras and is separate from the computer system, the computer system displays (1406) (e.g., automatically and/or manually (e.g., in response to user input)), in the electronic document (or, optionally, adds to the electronic document), digital text (e.g., 1320) (e.g., letters, numbers, bullet points, symbols, and/or punctuation) corresponding to the handwriting that is in the field of view of the one or more cameras (e.g., the detected handwriting). In some embodiments, the digital text is generated by the computer system (and/or is not a captured image of the handwriting). In some embodiments, the handwriting has a first appearance (e.g., font style, color, and/or font size) and the digital text has a second appearance (e.g., font style, color, and/or font size) different from the first appearance. In some embodiments, the physical surface is positioned between the user and the one or more cameras. Displaying digital text corresponding to the handwriting that is in the field of view of one or more cameras enhances the computer system because it allows a user to add digital text without typing, which reduces the number of inputs needed to perform an operation and provides additional control options without cluttering the user interface and improves how a user can add digital text to an electronic document.
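For illustration only, the sketch below uses Apple's Vision framework to recognize handwriting in a captured frame; the disclosure does not specify this framework or approach, and the function name and parameters are assumptions. The recognized strings could then be inserted into the electronic document as digital text:

```swift
import Vision
import CoreGraphics

/// Hypothetical helper: recognizes handwritten text in a camera frame and passes
/// the recognized lines to a completion handler.
func recognizeHandwriting(in frame: CGImage, completion: @escaping ([String]) -> Void) {
    let request = VNRecognizeTextRequest { request, _ in
        let observations = request.results as? [VNRecognizedTextObservation] ?? []
        // Take the top candidate for each detected line of handwriting.
        let lines = observations.compactMap { $0.topCandidates(1).first?.string }
        completion(lines)
    }
    request.recognitionLevel = .accurate   // favor accuracy over speed for handwriting
    let handler = VNImageRequestHandler(cgImage: frame, options: [:])
    try? handler.perform([request])
}
```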
In some embodiments, while (or after) displaying the digital text, the computer system obtains (e.g., receives or detects) data representing new handwriting that includes a first new physical mark (e.g., 1310 as depicted in
In some embodiments, obtaining data representing the new handwriting includes the computer system detecting (e.g., capturing an image and/or video of) the new physical marks while the new physical marks are being applied to the physical surface (e.g., “Jane,” “Mike,” and “Sarah” of 1320 are added to document 1306 while the names are being written on notebook 1308, as described in reference to
In some embodiments, obtaining data representing the new handwriting includes detecting the new physical marks when the physical surface including the new physical marks is brought into the field of view of the one or more cameras (e.g., page turn 1315h brings a new page having new handwriting 1310 into the field of view of camera 1102a, as depicted in
In some embodiments, while (or after) displaying the digital text, the computer system obtains (e.g., receives or detects) data representing new handwriting that includes a second new physical mark (e.g., 1334) (e.g., the same or different from the first new physical mark) (e.g., a change to a portion of the handwriting that includes the physical marks; in some embodiments, the change to the portion of the handwriting includes a change to a first portion of the handwriting without a change to a second portion of the handwriting) (e.g., the second new physical mark includes adding a letter in an existing word, adding punctuation to an existing sentence, and/or crossing out an existing word) on the physical surface that is in the field of view of the one or more cameras. In some embodiments, in response to obtaining data representing the new handwriting, the computer system displays updated digital text (e.g., 1320 in
In some embodiments, displaying the updated digital text includes modifying the digital text corresponding to the handwriting (e.g., with reference to
In some embodiments, displaying the updated digital text includes ceasing to display a portion (e.g., a letter, punctuation mark, and/or symbol) of the digital text (e.g., “conclusion” is no longer displayed in 1320, as depicted in
In some embodiments, displaying the updated digital text includes: in accordance with a determination that the second new physical mark meets first criteria (e.g., 1310 in FIGS. 13C-13J) (e.g., the physical mark includes one or more new written characters, for example one or more letters, numbers, and/or words), the computer system displays new digital text (e.g., 1320 in
In some embodiments, while displaying a representation (e.g., 1316) (e.g., still image, video, and/or live video feed) of respective handwriting that includes respective physical marks on the physical surface, the computer system detects an input corresponding to a request to display digital text corresponding to the respective physical marks (e.g., 1315c, 1315f, and/or 1315g) (e.g., physical marks that have been detected, identified, and/or recognized as including text) in the electronic document. In some embodiments, the request includes a request to add (e.g., copy and paste) a detected portion of the respective handwriting to the electronic document. In some embodiments, in response to detecting the input corresponding to a request to display digital text corresponding to the respective physical marks, the computer system displays, in the electronic document, digital text (e.g., 1320) corresponding to the respective physical marks (e.g., as depicted in
In some embodiments, the computer system detects a user input (e.g., 1315c or 1315g) directed to a selectable user interface object (e.g., 1318). In some embodiments, in response to detecting the user input directed to a selectable user interface object and in accordance with a determination that the second new physical mark meets first criteria (e.g., as depicted in
In some embodiments, the computer system displays, via the display generation component, a representation (e.g., 1316) (e.g., still image, video, and/or live video feed) of the handwriting that includes the physical marks. In some embodiments, the representation of the handwriting that includes physical marks is concurrently displayed with the digital text (e.g., as depicted in
In some embodiments, the computer system displays, via the display generation component, a graphical element (e.g., 1322) (e.g., a highlight, a shape, and/or a symbol) overlaid on a respective representation of a physical mark that corresponds to respective digital text of the electronic document. In some embodiments, the computer system visually distinguishes (e.g., highlights and/or outlines) portions of handwriting (e.g., detected text) from other portions of the handwriting and/or the physical surface. In some embodiments, the graphical element is not overlaid on a respective representation of a physical mark that does not correspond to respective digital text of the electronic document. In some embodiments, in accordance with a determination that the computer system is in a first mode (e.g., a live text capture mode is enabled and/or a live text detection mode is enabled), the computer system displays the graphical element. In some embodiments, in accordance with a determination that the computer system is in a second mode (e.g., a live text capture mode is disabled and/or a live text detection mode is disabled), the computer system does not display the graphical element. Displaying a graphical element overlaid on a representation of a physical mark when it has been added as digital text improves the computer system because it provides visual feedback of what portions of the physical handwriting have been added as digital text, which provides improved visual feedback and improves how digital text is added to an electronic document.
In some embodiments, detecting the handwriting is based on image data captured by a first camera (e.g., 602, 682, 6102, and/or 906a-906d) (e.g., a wide angle camera and/or a single camera) having a field of view (e.g., 620, 688, 1120a, 6145-1, and 6147-2) that includes a face of a user (e.g., face of 1104a, face of 622, and/or face of 623) and the physical surface (e.g., 619, 1106a, 1130, and/or 618). In some embodiments, the computer system displays a representation of the handwriting (e.g., 1316) based on the image data captured by the first camera. In some embodiments, the computer system displays a representation of the face of the user (e.g., a user of the computer system) based on the image data captured by the first camera (e.g., the representation of the physical mark and the representation of the user are based on image data captured by the same camera (e.g., a single camera)). In some embodiments, the computer system concurrently displays the representation of the handwriting and the representation of the face of the user. Displaying the representation of the handwriting and the representation of the face of the user based on the image data captured by the first camera improves the computer system because a user can view different angles of a physical environment using the same camera, viewing different angles does not require further action from the user (e.g., moving the camera), doing so reduces the number of devices needed to perform an operation, the computer system does not need to have two separate cameras to capture different views, and the computer system does not need a camera with moving parts to change angles, which reduces cost, complexity, and wear and tear on the device.
Note that details of the processes described above with respect to method 1400 (e.g.,
As described below, method 1500 provides an intuitive way for managing a live video communication session. The method reduces the cognitive burden on a user to manage a live video communication session, thereby creating a more efficient human-machine interface. For battery-operated computing devices, enabling a user to manage a live video communication session faster and more efficiently conserves power and increases the time between battery charges.
In method 1500, while (1502) the first computer system is in a live video communication session (e.g., live video communication session of
While (1502) the first computer system is in a live video communication session (e.g., live video communication session of
While (1502) the first computer system is in a live video communication session (e.g., live video communication session of
Changing a view of a physical space in the field of view of a second computer system in response to detecting a change in position of the first computer system enhances the video communication session experience because it provides different views without displaying additional user interface objects and provides visual feedback about a detected change in position of the first computer system, which provides additional control options without cluttering the user interface and provides improved visual feedback about the detected change of position of the first computer system.
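A hedged sketch of the tilt-to-view mapping described above and detailed in the paragraphs that follow: the scale factor, step count, and linear interpolation are assumptions, but they illustrate how a change in the first computer system's angle could be translated into an angular change of the displayed view, rendered as a gradual transition rather than an abrupt jump:

```swift
/// Hypothetical mapping from a change in device angle to a change in the displayed view angle.
func viewAngleOffset(forDeviceTiltDegrees tilt: Double, scale: Double = 1.0) -> Double {
    tilt * scale   // a larger tilt of the first computer system shifts the view by a larger amount
}

/// Intermediate view angles for a gradual transition from the first view to the second view.
func transitionAngles(from start: Double, to end: Double, steps: Int = 10) -> [Double] {
    guard steps > 0 else { return [end] }
    return (0...steps).map { start + (end - start) * Double($0) / Double(steps) }
}

// Example: tilting the device 15 degrees pans the remote view 15 degrees in the
// corresponding direction, shown as a sequence of intermediate views.
let target = viewAngleOffset(forDeviceTiltDegrees: 15)
print(transitionAngles(from: 0, to: target, steps: 5))   // [0.0, 3.0, 6.0, 9.0, 12.0, 15.0]
```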
In some embodiments, while the first computer system (e.g., 100, 300, 500, 600-1, and/or 600-2) is in the live video communication session with the second computer system: the first computer system detects, from image data (e.g., image data captured by camera 602 in
In some embodiments, displaying the representation of the second view of the physical environment in the field of view of the one or more cameras of the second computer system includes: in accordance with a determination that the change in the position of the first computer system includes a first amount of change in angle of the first computer system (e.g., the amount of change in angle caused by 6218ao, 6218aq, 6218ar, 6218av, and/or 6218aw), the second view of the physical environment is different from the first view of the physical environment by a first angular amount (e.g., as schematically depicted by the change of the position of shaded region 6217 in
In some embodiments, displaying the representation of the second view of the physical environment in the field of view of the one or more cameras of the second computer system includes: in accordance with a determination that the change in the position of the first computer system includes (e.g., is in) a first direction (e.g., the direction of change caused by 6218ao, 6218aq, 6218ar, 6218av, and/or 6218aw) (e.g., tilts up and/or rotates a respective edge of the first device toward the user) of change in position of the first computer system (e.g., based on a user tilting the first computer system), the second view of the physical environment is in a first direction in the physical environment from the first view of the physical environment (e.g., as schematically depicted by the direction of change in the position of shaded region 6217 in
In some embodiments, the change in the position of the first computer system includes a change in angle of the first computer system (e.g., 6218ao, 6218aq, 6218ar, 6218av, and/or 6218aw). In some embodiments, displaying the representation of the second view of the physical environment in the field of view of the one or more cameras of the second computer system includes: displaying a gradual transition (e.g., as depicted in
In some embodiments, the representation of the first view includes a representation of a face of a user in the field of view of the one or more cameras of the second computer system (e.g., 6214 in
In some embodiments, while displaying the representation of the physical mark, the first computer system detects, via one or more input devices (e.g., a touch-sensitive surface, a keyboard, a controller, and/or a mouse), a user input (e.g., a set of one or more user inputs) corresponding to a digital mark (e.g., 6222 and/or 6223) (e.g., a drawing, text, a virtual mark, and/or a mark made in a virtual environment). In some embodiments, in response to detecting the user input, the first computer system displays (e.g., via the first display generation component and/or a display generation component of the second computer system) a representation of the digital mark concurrently with the representation of the physical mark (e.g., as depicted in
In some embodiments, the representation of the digital mark is displayed via the first display generation component (e.g., 683 and/or as depicted in
In some embodiments, in response to detecting the digital mark, the first computer system causes (e.g., transmits and/or communicates) a representation of the digital mark to be displayed at the second computer system (e.g., 6216 and/or as depicted in
In some embodiments, the representation of the digital mark is displayed on (e.g., concurrently with) the representation of the physical mark at the second computer system (e.g., 6216 and/or as depicted in
In some embodiments, the representation of the digital mark is displayed on (or, optionally, projected onto) a physical object (e.g., 619 and/or 618) (e.g., a table, book, and/or piece of paper) in the physical environment of the second computer system. In some embodiments, the second computer system is in communication with a second display generation component (e.g., a projector) that displays the representation of the digital mark onto a surface (e.g., paper, book, and/or whiteboard) that includes the physical mark. In some embodiments, the representation of the digital mark is displayed adjacent to the physical mark in the physical environment of the second computer system. Displaying the digital mark by projecting the digital mark onto a physical object (e.g., the surface on which the physical marks are made) enhances the video communication session by allowing a user to view the digital mark with respect to the physical mark and provides visual feedback that input was detected at the first computer system, which improves visual feedback.
In some embodiments, while the first computer system is in the live video communication session with the second computer system: the first computer system displays, via the first display generation component, a representation of a third view of the physical environment in the field of view of the one or more cameras of the second computer system (e.g., as depicted in 6214 of
In some embodiments, displaying the representation of the first view of the physical environment includes displaying the representation of the first view of the physical environment based on the image data captured by a first camera (e.g., 602 and/or 6202) of the one or more cameras of the second computer system. In some embodiments, displaying the representation of the second view of the physical environment includes displaying the representation of the second view (e.g., shaded regions 6206 and/or 6217) of the physical environment based on the image data captured by the first camera of the one or more cameras of the second computer system (e.g., the representation of the first view of the physical environment and the representation of the second view of the physical environment are based on image data captured by the same camera (e.g., a single camera)). Displaying the first view and the second view based on the image data captured by the first camera enhances the video communication session experience because different perspectives can be displayed based on image data from the same camera without requiring further input from the user, which improves how users collaborate and/or communicate during a live communication session and reduces the number of inputs (and/or devices) needed to perform an operation. Displaying the first view and the second view based on the image data captured by the first camera improves the computer system because a user can view different angles of a physical environment using the same camera, viewing different angles does not require further action from the user (e.g., moving the camera), and doing so reduces the number of devices needed to perform an operation, the computer system does not need to have two separate cameras to capture different views, and/or the computer system does not need a camera with moving parts to change angles, which reduces cost, complexity, and wear and tear on the device.
In some embodiments, displaying the representation of the second view of the physical environment in the field of view of the one or more cameras of the second computer system is performed in accordance with a determination that authorization has been provided (e.g., user 622 and/or device 600-1 grants permission for user 623 and/or device 600-4 to change the view) (e.g., granted or authorized at the second computer system and/or by a user of the second computer system) for the first computer system to change the view of the physical environment that is displayed at the first computer system. In some embodiments, in response to detecting the change in the position of the first computer system, and in accordance with a determination that authorization has been provided for the first computer system to change the view, the first computer system displays the representation of the second view of the physical environment in the field of view of the one or more cameras of the second computer system. In some embodiments, in response to detecting the change in the position of the first computer system, and in accordance with a determination that authorization has not been provided for the first computer system to change the view, the first computer system foregoes displaying the representation of the second view of the physical environment in the field of view of the one or more cameras of the second computer system. In some embodiments, authorization can be provided by enabling an authorization affordance (e.g., a user interface object and/or a setting) at the second computer system (e.g., a user of the second computer system grants permission to the user of the first computer system to view different portions of the physical environment based on movement of the first computer system). In some embodiments, the authorization affordance is disabled (e.g., automatically) in response to detecting a termination of the live video communication session. Displaying the representation of the second view based on a determination that authorization has been provided for the first computer system to change the view enhances the video communication session by providing additional security, which improves how users collaborate and/or communicate during a live communication session.
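The authorization check described above could be gated with a simple policy object, as in the following sketch; the type and property names are hypothetical, and the string-valued views stand in for whatever view state the system actually maintains:

```swift
/// Hypothetical policy: movement of the first computer system only changes the
/// displayed view when the second computer system has authorized view changes.
struct RemoteViewPolicy {
    var remoteViewChangesAuthorized: Bool

    /// Returns the view to display after a detected change in device position.
    func viewAfterDeviceMoved(requestedView: String, currentView: String) -> String {
        remoteViewChangesAuthorized ? requestedView : currentView
    }
}

let policy = RemoteViewPolicy(remoteViewChangesAuthorized: false)
print(policy.viewAfterDeviceMoved(requestedView: "surface view", currentView: "face view"))
// Prints "face view": without authorization the displayed view does not change.
```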
In some embodiments, while displaying a representation of a third view of the physical environment (e.g., 6214 and/or 6216 in
In some embodiments, in response to detecting the respective change in the position of the first computer system: in accordance with the determination that the respective change in the position of the first computer system corresponds to the view that is not within the defined portion of the physical environment, the first computer system displays, via the first display generation component, an obscured (e.g., blurred and/or greyed out) representation (e.g., 6226) of the portion of the physical environment that is not within the defined portion of the physical environment (e.g., as described in reference to
In some embodiments, the second view of the physical environment includes a physical object in the physical environment. In some embodiments, while displaying the representation of the second view of the physical environment, the first computer system obtains image data that includes movement of the physical object in the physical environment (e.g., 6230 and/or 6232) (e.g., movement of the physical mark, movement of a piece of paper, and/or movement of a hand of a user). In some embodiments, in response to obtaining image data that includes the movement of the physical object: the first computer system displays a representation of a fourth view of the physical environment that is different from the second view and that includes the physical object (e.g., 6214 and/or 6216 in
In some embodiments, the first computer system is in communication (e.g., via a local area network, via short-range wireless Bluetooth connection, and/or the live communication session) with a second display generation component (e.g., 6201) (e.g., via another computer system such as a tablet computer, a smartphone, a laptop computer, and/or a desktop computer). In some embodiments, the first computer system displays, via the second display generation component, a representation of a user (e.g., 622) in the field of view of the one or more cameras of the second computer system (e.g., 622-4), wherein the representation of the user is concurrently displayed with the representation of the second view of the physical environment that is displayed via the first display generation component (e.g., 6214 in
In some embodiments, while the first computer system is in the live video communication session with the second computer system, and in accordance with a determination that a third computer system (e.g., 600-2) (e.g., a smartphone, a tablet computer, a laptop computer, a desktop computer, and/or a head mounted device (e.g., a head mounted augmented reality and/or extended reality device)) satisfies a first set of criteria (e.g., as described in reference to
In some embodiments, the first set of criteria includes a second set of criteria (e.g., a subset of the first set of criteria) that is different from the location criterion (e.g., the set of criteria includes at least one criterion other than the location criterion) and that is based on a characteristic (e.g., an orientation and/or user account) of the third computer system (e.g., as described in reference to
In some embodiments, the second set of criteria includes an orientation criterion that is satisfied when the third computer system is in a predetermined orientation (e.g., as described in reference to
In some embodiments, the second set of criteria includes a user account criterion that is satisfied when the first computer system and the third computer system are associated with (e.g., logged into or otherwise connected to) a same user account (e.g., as described in reference to
Note that details of the processes described above with respect to method 1500 (e.g.,
John's device 6100-1 of
It should be appreciated that the embodiments illustrated in
John's device 6100-1 also displays dock 6104, which includes various application icons, including a subset of icons that are displayed in dynamic region 6106. The icons displayed in dynamic region 6106 represent applications that are active (e.g., launched, open, and/or in use) on John's device 6100-1. In
In some embodiments, as depicted in
Top-down preview 1613 is updated (e.g., compared to
Additionally or alternatively, in some embodiments, John's device 6100-1 modifies a size of region indicator 1610 (and/or region 1616) based on a change in visual content that is displayed in preview 1606 and/or the change in the physical environment in the field of view (e.g., a difference in the size and/or length of surface 619 and/or a difference in objects detected on surface 619). In some embodiments, John's device 6100-1 does not modify the size of region indicator 1610 (and/or region 1616) based on a change in visual content that is displayed in preview 1606 and/or the change in the physical environment in the field of view (e.g., the size of region indicator 1610 and/or region 1616 is independent of the visual content that is displayed in preview 1606 and/or the change in the physical environment in the field of view).
In some embodiments, during movement 1650e in
In some embodiments, the communication session between John's device 6100-1 and Jane's tablet 600-2 was a communication session that was most recent in time to the communication session between John's device 6100-1 and Sam's device 1634 (e.g., there were no intervening communication sessions that included a sharing of a surface view and/or change in a surface view). As such, in some embodiments, John's device 6100-1 activates the settings for region control 1628 (and/or region control 1612 of preview user interface 1604) based on the most recent settings for region control 1628 (and/or most recent settings for region control 1612 of preview user interface 1604) that were used for the communication session between John's device 6100-1 and Jane's device 600-2. In some embodiments, John's device 6100-1 detects that there has been no significant change in position, such as a translation, rotation, and/or change in orientation, of camera 6102 and/or John's device 6100-1 (e.g., there has been no change and/or the changes are within a threshold amount of change). In such embodiments, John's device 6100-1 activates the settings for region control 1628 (and/or region control 1612 of preview user interface 1604) based on the most recent settings for region control 1628 (and/or most recent settings for region control 1612 of preview user interface 1604) that were used for the communication session between John's device 6100-1 and Jane's device 600-2. Additionally or alternatively, in embodiments where there has been no significant change in position of camera 6102 and/or John's device 6100-1, John's device 6100-1 optionally does not display preview user interface 1604 and, instead, displays a surface view based on the most recent settings for region control 1628 (and/or region control 1612 of preview user interface 1604).
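A sketch of the settings-reuse logic described above, with hypothetical types and an assumed movement threshold: the most recent region-control setting is reused only when the camera has not moved beyond the threshold since the prior session; otherwise the preview user interface would be shown again:

```swift
/// Hypothetical record of the most recent surface-sharing settings.
struct SurfaceShareSettings {
    var regionControlValue: Double   // e.g., the last value of the region control
}

/// Returns settings to reuse for a new session, or nil if the preview should be shown again.
func settingsForNewSession(lastSettings: SurfaceShareSettings?,
                           cameraMovement: Double,
                           movementThreshold: Double = 5.0) -> SurfaceShareSettings? {
    guard let lastSettings = lastSettings, cameraMovement < movementThreshold else {
        return nil   // significant movement (or no history): show the preview user interface
    }
    return lastSettings   // no significant change in position: reuse the prior region setting
}

print(settingsForNewSession(lastSettings: SurfaceShareSettings(regionControlValue: 0.6),
                            cameraMovement: 0.2) != nil)   // true: reuse prior settings
```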
As described below, method 1700 provides an intuitive way for managing a live video communication session. The method reduces the cognitive burden on a user to manage a live video communication session, thereby creating a more efficient human-machine interface. For battery-operated computing devices, enabling a user to manage a live video communication session faster and more efficiently conserves power and increases the time between battery charges.
In method 1700, the first computer system detects (1702), via the one or more input devices, one or more first user inputs (e.g., 1650a and/or 1650b) (e.g., a tap on a touch-sensitive surface, a keyboard input, a mouse input, a trackpad input, a gesture (e.g., a hand gesture), and/or an audio input (e.g., a voice command)) corresponding to a request (e.g., a first request) to display a user interface (e.g., 1606) of an application (e.g., the camera application associated with camera application icon 6136-1 and/or the video conferencing application associated with video conferencing application icon 6110) for displaying a visual representation (e.g., 1606) (e.g., a still image, a video, and/or a live camera feed captured by the one or more cameras) of a surface (e.g., 619 and/or 618) that is in a field of view of the one or more cameras (e.g., a physical surface; a horizontal surface, such as a surface of a table, floor, and/or desk; a vertical surface, such as a wall, whiteboard, and/or blackboard; a surface of an object, such as a book, a piece of paper, a display of tablet; and/or other surfaces). In some embodiments, the application (e.g., a camera application and/or a surface view application) provides the image of the surface to be shared in a separate application (e.g., a presentation application, a video communications application, and/or an application for providing an incoming and/or outgoing live audio/video communication session). In some embodiments, the application that displays the image of the surface is capable of sharing the image of the surface (e.g., without a separate video communication application).
In response (1704) to detecting the one or more first user inputs and in accordance with a determination that a first set of one or more criteria is met (e.g., 6100-1 and/or 6102 has moved; 1610 and/or 1616 has not been previously defined; a request to display 6100-1 and/or 6102 is detected; and/or 1610 and/or 1616 are automatically displayed unless one or more conditions are satisfied, including a condition that a setting corresponding to a request not to display 1610 and/or 1616 has been enabled), the first computer system concurrently displays (1706), via the display generation component, a visual representation (1708) (e.g., 1616) of a first portion of the field of view of the one or more cameras and a visual indication (1710) (e.g., 1606 and/or visual emphasis of 1616) (e.g., a highlight, a shape, and/or a symbol) (e.g., a first indication) that indicates a first region (e.g., 1616) of the field of view of the one or more cameras that is a subset of the first portion of the field of view of the one or more cameras, wherein the first region indicates a second portion (e.g., portion of the field of view in region 1616) of the field of view of the one or more cameras that will be presented as a view of the surface (e.g., 1618-1, 1618-2, and/or 1618-3) by a second computer system (e.g., 100, 300, 500, 600-1, 600-2, 600-4, 1100a, 1634, 6100-1, and/or 6100-2) (e.g., a remote computer system, an external computer system, a computer system associated with a user different from a user associated with the first computer system, a smartphone, a tablet computer, a laptop computer, desktop computer, and/or a head mounted device). In some embodiments, the first set of one or more criteria includes a criterion that the user has not previously defined a region of the field of view that will be presented as a view of a surface by an external computer system. In some embodiments, the first set of one or more criteria includes a criterion that the one or more cameras has exceeded a threshold amount of change in position (e.g., a change in location in space, a change in orientation, a translation, and/or a change of a horizontal and/or vertical angle). In some embodiments, the first computer system displays the portion of the image data that will be displayed by the second computer system with a first degree of emphasis (e.g., opacity, transparency, translucency, darkness, and/or brightness) relative to at least a portion of the image data that will not be displayed by the second computer system. In some embodiments, in response to detecting one or more inputs, the first computer system displays a second indication of a second portion of the image data, different from the first portion of the image data, that will be displayed by the second computer system. In some embodiments, the indication is overlaid on the displayed image data. In some embodiments, the indication is displayed over at least a portion of the displayed image data that includes the surface. In some embodiments, the surface is positioned between the user and the one or more cameras. In some embodiments, the surface is positioned beside (e.g., to the left or right of) the user.
In some embodiments, in accordance with a determination that the first set of one or more criteria is not met, the first computer system forgoes displaying the user interface of the application for sharing the image of the surface that is in the field of view of the one or more cameras, including not displaying (e.g., within the user interface) the image data captured by the one or more cameras and the indication of the portion of the image data that will be displayed by the second computer system. Concurrently displaying the visual representation of the first portion of the field of view and the visual indication that indicates the first region of the field of view that is a subset of the first portion of the field of view, where the first region indicates that the second portion of the field of view will be presented as a view of the surface by the second computer system, enhances a video communication session experience because it provides visual feedback of what portion of the field of view will be shared and improves security of what content is shared in a video communication session since a user can view what area of a physical environment will be shared as visual content.
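The following is a minimal, illustrative sketch in Swift of one way such a conditional check could be structured; the type names, the movement threshold, and the setting flag are assumptions introduced for illustration and are not drawn from the embodiments described above.

```swift
import CoreGraphics

// Hypothetical state tracked by the first computer system (illustrative names only).
struct SurfaceViewState {
    var previouslyDefinedRegion: CGRect?    // region defined during a prior use, if any
    var cameraMovementSinceLastUse: Double  // e.g., degrees of rotation since the region was defined
}

// Returns true when the visual representation and the visual indication should be
// concurrently displayed so the user can confirm or adjust the region to be shared.
func shouldShowRegionPreview(state: SurfaceViewState,
                             movementThreshold: Double = 10.0,
                             previewDisabledBySetting: Bool = false) -> Bool {
    if previewDisabledBySetting { return false }
    // First set of criteria: no previously defined region, or the cameras moved too much.
    if state.previouslyDefinedRegion == nil { return true }
    if state.cameraMovementSinceLastUse > movementThreshold { return true }
    // Otherwise the previously defined region can be presented directly (second set of criteria).
    return false
}
```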
In some embodiments, the visual representation of the first portion of the field of view of the one or more cameras and the visual indication of the first region of the field of view is concurrently displayed while the first computer system is not sharing (e.g., not providing for display, not transmitting, and/or not communicating to an external device) the second portion of the field of view of the one or more cameras with the second computer system (e.g., 6100-1 is not sharing 1616, 1618-1, 1618-2, and/or 1618-3). Concurrently displaying the visual representation of the first portion of the field of view of the one or more cameras and the visual indication of the first region of the field of view enhances a video communication session experience because it provides a preview of the portion of the field of view that will be shared as a surface view, which provides improved security regarding what area of a physical environment will be shared in a video communication session prior to sharing the surface view and provides improved visual feedback about what will be presented by the second computer system.
In some embodiments, the second portion of the field of view of the one or more cameras includes an image of a surface (e.g., image of 619) (e.g., a substantially horizontal surface and/or a surface of a desk or table) that is positioned between the one or more cameras and a user (e.g., 622 and/or 623) in the field of view of the one or more cameras. In some embodiments, the surface is in front of the user. In some embodiments, the surface is within a predetermined angle (e.g., 70 degrees, 80 degrees, 90 degrees, 100 degrees, or 110 degrees) of the direction of gravity. Because the second portion of the field of view of the one or more cameras includes an image of a surface that is positioned between the one or more cameras and a user in the field of view of the one or more cameras, a user can share a surface view of a table or desk, which improves a video communication session experience since it offers a view of particular surfaces in specific locations and/or improves how users communicate, collaborate, or interact in a video communication session.
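A brief sketch of one way the "horizontal surface" condition could be evaluated is shown below; the vector type, the gravity convention, and the 20-degree tolerance are illustrative assumptions rather than details of the embodiments.

```swift
import Foundation

// Illustrative 3D vector type used only for this sketch.
struct Vector3 { var x, y, z: Double }

// A surface qualifies as "horizontal" (e.g., a desk or table) when its plane lies
// within a predetermined angle of the direction of gravity, which is equivalent to
// its normal being nearly parallel to gravity.
func isHorizontalSurface(normal: Vector3,
                         gravity: Vector3 = Vector3(x: 0, y: -1, z: 0),
                         toleranceDegrees: Double = 20) -> Bool {
    func length(_ v: Vector3) -> Double { sqrt(v.x * v.x + v.y * v.y + v.z * v.z) }
    let dot = normal.x * gravity.x + normal.y * gravity.y + normal.z * gravity.z
    let cosine = abs(dot) / (length(normal) * length(gravity))
    let angleFromGravityAxis = acos(min(1, cosine)) * 180 / .pi
    return angleFromGravityAxis <= toleranceDegrees
}
```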
In some embodiments, the surface includes (e.g., is) a vertical surface (e.g., as described in reference to
In some embodiments, the view of the surface that will be presented by the second computer system includes an image (e.g., photo, video, and/or live video feed) of the surface that is (or has been) modified (e.g., to correct distortion of the image of the surface) (e.g., adjusted, manipulated, and/or corrected) based on a position (e.g., location and/or orientation) of the surface relative to the one or more cameras (e.g., as described in greater detail with reference to
In some embodiments, the first portion of the field of view of the one or more cameras includes an image (e.g., 1616) of a user (e.g., 622 and/or 623) in the field of view of the one or more cameras. Including an image of a user in the first portion of the field of view of the one or more cameras improves a video communication session experience by providing improved feedback about which portions of the field of view are captured by the one or more cameras.
In some embodiments, after detecting a change in position of the one or more cameras (e.g., 1650e and/or 1650f), the first computer system concurrently displays, via the display generation component (and, optionally, based on the change in position of the one or more cameras) (e.g., before or after concurrently displaying the visual representation of the first portion of the field of view of the one or more cameras and the visual indication): a visual representation of a third portion (e.g., 1606 of
In some embodiments, while the one or more cameras are substantially stationary (e.g., stationary or having moved less than a threshold amount) and while displaying the visual representation of the first portion of the field of view of the one or more cameras and the visual indication (e.g., and/or before or after concurrently displaying the visual representation of the third portion of the field of view of the one or more cameras and the visual indication), the first computer system detects, via the one or more user input devices, one or more second user inputs (e.g., 1650c and/or 1650d) (e.g., corresponding to a request to change the portion of the field of view of the one or more cameras that is indicated by the visual indication). In some embodiments, in response to detecting the one or more second user inputs and while the one or more cameras remain substantially stationary, the first computer system concurrently displays, via the display generation component the visual representation of the first portion of the field of view and the visual indication, wherein the visual indication indicates a third region (e.g., 1616 of
In some embodiments, while displaying the visual representation of the first portion of the field of view of the one or more cameras and the visual indication, the first computer system detects, via the one or more user input devices, a user input (e.g., 1650c and/or 1650d) directed at a control (e.g., 1612) (e.g., a selectable control, a slider, and/or option picker) that includes a set (e.g., a continuous set or a discrete set) of options (e.g., sizes, dimensions, and/or magnitude) for the visual indication. In some embodiments, in response to detecting the user input directed at the control, the first computer system displays (e.g., changes, updates, and/or modifies) the visual indication to indicate a fourth region (e.g., 1616 of
In some embodiments, in response to detecting the user input directed at the control, the first computer system maintains a position (e.g., relative to the field of view of the one or more cameras) of a first portion (e.g., 1614, and as described in reference to
In some embodiments, the first portion of the visual indication corresponds to an upper most edge (e.g., 1614) of the second portion of the field of view that will be presented as the view of the surface by the second computer system. In some embodiments, the first portion of the visual indication corresponds to a lower most edge of the visual indication. When the first portion of the visual indication corresponds to an upper most edge of the second portion of the field of view that will be presented as the view of the surface by the second computer system, it improves a video communication session experience and provides additional control options because it allows at least the upper most edge of the visual indication to remain fixed as a user adjusts what portion of the field of view will be shared in the communication session.
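The following Swift sketch illustrates one way a resize option could be applied while keeping the upper edge of the indicated region fixed; the anchoring choice and scaling behavior are assumptions for illustration, not a description of the specific control in the embodiments.

```swift
import CoreGraphics

// Resizes the region indicated by the visual indication while keeping its upper
// edge fixed, so that edge stays in place as the region grows or shrinks.
// Coordinates follow an image-space convention with the origin at the top-left.
func resizeIndicatedRegion(current: CGRect,
                           scale: CGFloat,
                           within fieldOfView: CGRect) -> CGRect {
    let newWidth = current.width * scale
    let newHeight = current.height * scale
    let resized = CGRect(x: current.midX - newWidth / 2,
                         y: current.minY,              // upper edge remains anchored
                         width: newWidth,
                         height: newHeight)
    return resized.intersection(fieldOfView)           // clamp to the available field of view
}
```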
In some embodiments, the first portion of the field of view of the one or more cameras and the second portion of the field of view of the one or more cameras that will be presented as the view of the surface by the second computer system is based on image data captured by a first camera (e.g., 6102 is a wide angle camera) (e.g., a wide angle camera and/or a single camera). In some embodiments, the field of view of the first camera includes the surface and a face of a user. Basing the first portion of the field of view of the one or more cameras and the second portion of the field of view of the one or more cameras that will be presented as the view of the surface by the second computer system on the image data captured by the first camera enhances the video communication session experience because different portions of the field of view can be displayed based on image data from the same camera without requiring further input from the user, which improves how users collaborate and/or communicate during a live communication session and reduces the number of inputs (and/or devices) needed to perform an operation. Basing the first portion of the field of view of the one or more cameras and the second portion of the field of view of the one or more cameras that will be presented as the view of the surface by the second computer system on the image data captured by the first camera improves the computer system because a user can view which portions of the field of view of a single camera can be presented at a different angle without requiring further action from the user (e.g., moving the camera), and doing so reduces the number of devices needed to perform an operation: the computer system does not need to have two separate cameras to capture different views, and/or the computer system does not need a camera with moving parts to change angles, which reduces cost, complexity, and wear and tear on the device.
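As an illustrative sketch only, the following shows how two different portions could be derived from a single wide-angle frame, one containing the user's face and one containing the surface region; the split proportion and coordinate conventions are assumptions and not details of the embodiments.

```swift
import CoreGraphics

// Derives two portions from the field of view of a single wide-angle camera
// (normalized coordinates, origin at the top-left): a portion expected to contain
// the user's face and the region that will be corrected and presented as the surface view.
func portions(ofFullFrame fullFrame: CGRect,
              surfaceRegion: CGRect) -> (facePortion: CGRect, surfacePortion: CGRect) {
    // Assume the upper part of the frame contains the user and the lower part the desk.
    let facePortion = CGRect(x: fullFrame.minX,
                             y: fullFrame.minY,
                             width: fullFrame.width,
                             height: fullFrame.height * 0.6)
    let surfacePortion = surfaceRegion.intersection(fullFrame)
    return (facePortion, surfacePortion)
}
```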
In some embodiments, the first computer system detects, via the one or more user input devices, one or more third user inputs (e.g., 1650a and/or 1650b) corresponding to a request (e.g., a second request) to display (e.g., re-display) the user interface of the application for displaying a visual representation (e.g., 1606) of a surface (e.g., 619) that is in the field of view of the one or more cameras. In some embodiments, in response to detecting the one or more third user inputs and in accordance with a determination that the first set of one or more criteria is met, the first computer system concurrently displays, via the display generation component, a visual representation of a seventh portion (e.g., 1606 in
In some embodiments, a visual characteristic (e.g., a scale, a size, a dimension, and/or a magnitude) of the visual indication is user-configurable (e.g., 1616 and/or 1610 is user-configurable) (e.g., adjustable and/or modifiable) (e.g., when a user desires to change what region of the field of view will be (e.g., is) presented as a surface view by a remote computer system), and wherein the first computer system displays the visual indication that indicates the fifth region as having a visual characteristic that is based on a visual characteristic of the visual indication that was used during a recent use (e.g., a most recent use and/or a recent use that corresponds to a use during a most recent communication session to a current communication session) of the one or more cameras to present as a view of the surface by a remote computer system (e.g., 1616 and/or 1610 in
In some embodiments, while displaying the visual representation of the first portion of the field of view of the one or more cameras and the visual indication, the first computer system detects, via the one or more user input devices, one or more fourth user inputs (e.g., 1650c and/or 1650d) corresponding to a request to modify a visual characteristic (e.g., a scale, a size, a dimension, and/or a magnitude) of the visual indication. In some embodiments, in response to detecting the one or more fourth user inputs, the first computer system displays (e.g., changes, updates, and/or modifies) the visual indication to indicate a sixth region (e.g., 1616 of
In some embodiments, in response to detecting the one or more first user inputs and in accordance with a determination that a second set of one or more criteria is met (e.g., as described in 16N, preview user interface 1604 is optionally not displayed if movement of camera 6102 and/or John's laptop 6100-1 is less than a threshold amount) (e.g., in accordance with a determination that the first set of one or more criteria is not met), wherein the second set of one or more criteria is different from the first set of one or more criteria, the first computer system displays the second portion of the field of view as a view of the surface that will be presented by the second computer system (e.g., 1618-1 and/or 1618-3 are displayed instead of displaying preview user interface 1604). In some embodiments, the second portion of the field of view includes an image of the surface that is modified based on a position of the surface relative to the one or more cameras (e.g., 1618-1 and/or 1618-3). In some embodiments, displaying the second portion of the field of view as a view of the surface that will be presented by the second computer system includes providing (e.g., sharing, communicating and/or transmitting) the second portion of the field of view for presentation by the second computer system. In some embodiments, the second set of one or more criteria includes a criterion that the user has previously defined a region of the field of view that will be presented as a view of a surface by an external computer system. In some embodiments, the second set of one or more criteria includes a criterion that at least a portion of the first computer system (e.g., the one or more cameras) has not exceeded a threshold amount of change in position (e.g., a change in location in space, a change in orientation, a translation, and/or a change of a horizontal and/or vertical angle). Conditionally displaying the second portion of the field of view as a view of the surface that will be presented by the second computer system, where the second portion of the field of view includes an image of the surface that is modified based on a position of the surface relative to the one or more cameras, reduces the number of inputs needed to configure the visual indication and/or reduces the number of inputs needed to request display of an image of the surface that has a corrected view.
In some embodiments, while providing (e.g., communicating and/or transmitting) the second portion of the field of view as a view of the surface for presentation by the second computer system, the first computer system displays, via the display generation component, a control (e.g., 1628) to modify (e.g., expand or shrink) a portion (e.g., the portion displayed in 1618-1, 1618-2, and/or 1618-3) of the field of view of the one or more cameras that is to be presented as a view of the surface by the second computer system. In some embodiments, the first computer system displays, via the display generation component, the second portion of the field of view as a view of the surface (e.g., while the second computer system displays the second portion of the field of view as a view of the surface). In some embodiments, the first computer system detects, via the one or more input devices, one or more inputs directed at the control to modify (e.g., expand or shrink) the portion of the field of view of the one or more cameras that is to be presented as a view of the surface by the second computer system. In some embodiments, in response to detecting the one or more inputs directed at the control to modify a portion of the field of view that is provided as a surface view, the first computer system provides a tenth portion of the field of view, different from the second portion, as a view of the surface for presentation by the second computer system. Displaying a control to modify a portion of the field of view of the one or more cameras that is to be presented as a view of the surface by the second computer system improves security of what content is shared in a video communication session since a user can adjust what area of a physical environment is being shared as visual content and improves how users communicate, collaborate, or interact in a video communication session.
In some embodiments, in accordance with a determination that focus (e.g., mouse, pointer, gaze and/or other indication of user attention) is directed to a region (e.g., of a user interface) corresponding to the view of the surface (e.g., cursor in
In some embodiments, the second portion of the field of view includes a first boundary (e.g., boundary along a top of 1618-1 and/or 1618-2, such as the boundary that is cutting off a view of John's laptop in
In some embodiments, while the camera is substantially stationary (e.g., stationary or having moved less than a threshold amount) and while displaying the visual representation (e.g., 1606 in
In some embodiments, displaying the visual indication in response to detecting the one or more sixth user inputs and while the camera remains substantially stationary includes maintaining the position (e.g., including the size and shape) of the visual indication relative to the user interface of the application (e.g., the first computer system changes a zoom level of the visual representation of the field of view of the one or more cameras, while the visual indication remains unchanged). In some embodiments, changing the portion of the field of view in the visual representation, without changing the visual indication, changes the region of the field of view that is indicated by the visual indication, and thus changes the portion of the field of view of the one or more cameras that will be presented as a view of the surface by a second computer system.
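A small Swift sketch of this relationship follows: the indication's on-screen rectangle stays constant while the displayed crop changes, so re-projecting the same rectangle through the new crop yields a different region of the field of view. The coordinate conventions and names are assumptions made for illustration.

```swift
import CoreGraphics

// Maps a fixed on-screen indication rectangle (in window points) into field-of-view
// coordinates (normalized 0...1), given the portion of the field of view currently
// displayed. Changing displayedCrop (e.g., zooming) changes the returned region even
// though the indication itself does not move on screen.
func indicatedFieldOfViewRegion(indicationInWindow: CGRect,
                                windowSize: CGSize,
                                displayedCrop: CGRect) -> CGRect {
    let normalizedX = indicationInWindow.minX / windowSize.width
    let normalizedY = indicationInWindow.minY / windowSize.height
    let normalizedW = indicationInWindow.width / windowSize.width
    let normalizedH = indicationInWindow.height / windowSize.height
    return CGRect(x: displayedCrop.minX + normalizedX * displayedCrop.width,
                  y: displayedCrop.minY + normalizedY * displayedCrop.height,
                  width: normalizedW * displayedCrop.width,
                  height: normalizedH * displayedCrop.height)
}
```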
In some embodiments, displaying the visual indication includes: in accordance with a determination that a set of one or more alignment criteria are met, wherein the set of one or more alignment criteria include an alignment criterion that is based on an alignment between a current region of the field of view of the one or more cameras indicated by the visual indication and a designated portion (e.g., a target, suggested, and/or recommended portion) of the field of view of the one or more cameras, displaying the visual indication having a first appearance (e.g., the appearance of 1610 in
Displaying the visual indication having an appearance that is based on whether or not an alignment criterion is met, where the alignment criterion is based on an alignment between a current region of the field of view of the one or more cameras indicated by the visual indication and a designated portion of the field of view of the one or more cameras, enables the computer system to indicate when a recommended (e.g., optimal) portion of the field of view is indicated by the visual indication and reduces the number of inputs needed to properly adjust the visual indication, which provides improved visual feedback to the user and reduces the number of inputs needed to perform an operation.
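One plausible way to express such an alignment criterion is an overlap test between the indicated region and the designated portion, as in the hedged sketch below; the intersection-over-union measure and the 0.9 tolerance are illustrative choices, not values taken from the embodiments.

```swift
import CoreGraphics

// The two appearances the indication could take (e.g., a highlight color change).
enum IndicationAppearance { case aligned, notAligned }

// Compares the region currently indicated by the visual indication with the
// designated (recommended) portion using intersection-over-union.
func appearance(forIndicated indicated: CGRect,
                designated: CGRect,
                tolerance: CGFloat = 0.9) -> IndicationAppearance {
    let intersection = indicated.intersection(designated)
    guard !intersection.isEmpty else { return .notAligned }
    let intersectionArea = intersection.width * intersection.height
    let unionArea = indicated.width * indicated.height
                  + designated.width * designated.height
                  - intersectionArea
    return intersectionArea / unionArea >= tolerance ? .aligned : .notAligned
}
```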
In some embodiments, while the visual indication indicates an eighth region of the field of view of the one or more cameras, the first computer system displays, concurrently with a visual representation of a thirteenth portion of the field of view of the one or more cameras and the visual indication (e.g., in response to detecting the one or more first user inputs and in accordance with a determination that a first set of one or more criteria is met), a target area indication (e.g., 1611) (e.g., that is visually distinct and different from the visual indication) that indicates a first designated region (e.g., a target, suggested, and/or recommended region) of the field of view of the one or more cameras (e.g., that is different from the eighth region of the field of view of the one or more cameras indicated by the visual indication), wherein the first designated region indicates a determined portion (e.g., a target, suggested, selected, and/or recommended portion) of the field of view of the one or more cameras that is based on a position of the surface in the field of view of the one or more cameras. Displaying a target area indication concurrently with the visual indication provides additional information to the user about how to adjust the visual indication to align with a recommended (e.g., optimal) portion of the field of view and reduces the number of inputs needed to properly adjust the visual indication, which provides improved visual feedback to the user and reduces the number of inputs needed to perform an operation.
In some embodiments, the target area indication (e.g., 1611) (e.g., the position, size, and/or shape of the target area indication) is stationary (e.g., does not move, is locked, or is fixed) relative to the surface (e.g., 619) (or the visual representation of the surface) (e.g., 1611a in
In some embodiments, the portion of the physical environment in the field of view of the one or more cameras indicated by the target area indication remains constant as the portion of the field of view of the one or more cameras represented by the visual representation changes (e.g., due to a change in the position of the one or more cameras and/or in response to user input corresponding to a request to change the portion of the field of view of the one or more cameras represented by the visual representation, such as a request to zoom in or zoom out). In some embodiments, the target area indication moves within the visual representation of the field of view to remain locked to the determined portion of the field of view of the one or more cameras.
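The sketch below illustrates this scene-locked behavior: the target area is defined relative to the field of view of the scene, and its on-screen placement is recomputed from the currently displayed crop rather than held fixed in the window. Names and coordinate conventions are assumptions for illustration.

```swift
import CoreGraphics

// Converts a target area that is fixed relative to the scene (normalized 0...1
// field-of-view coordinates) into window points, given the portion of the field of
// view currently displayed. As displayedCrop changes, the indication moves on screen
// so that it keeps pointing at the same part of the physical environment.
func targetAreaInWindow(targetInFieldOfView: CGRect,
                        displayedCrop: CGRect,
                        windowSize: CGSize) -> CGRect {
    let x = (targetInFieldOfView.minX - displayedCrop.minX) / displayedCrop.width
    let y = (targetInFieldOfView.minY - displayedCrop.minY) / displayedCrop.height
    return CGRect(x: x * windowSize.width,
                  y: y * windowSize.height,
                  width: targetInFieldOfView.width / displayedCrop.width * windowSize.width,
                  height: targetInFieldOfView.height / displayedCrop.height * windowSize.height)
}
```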
In some embodiments, after detecting a change in position of the one or more cameras, the first computer system displays the target area indication, where the target area indication indicates the first designated region (e.g., the same designated region) of the field of view of the one or more cameras after the change in position of the one or more cameras (e.g., the target area indication indicates the same portion of the surface after the one or more cameras is moved). In some embodiments, when the one or more cameras are moved, the target area indication does not move with the field of view of the one or more cameras (e.g., maintains the same position relative to the surface).
In some embodiments, the target area indication (e.g., the position, size, and/or shape of the target area indication) is selected (e.g., automatically selected, without detecting user input selecting the target area indication) based on an edge of the surface (e.g., 619) (e.g., a position such as a location and/or orientation of an edge of the surface that is, optionally, automatically detected by the device based on one or more sensor inputs such as a camera or other sensor that acquires information about the physical environment that can be used to detect edges of surfaces). Selecting the target area indication based on an edge of the surface enables the computer system to select a relevant target area without requiring a user to provide inputs to select the criteria for selecting the target area indication, which provides improved visual feedback to the user and reduces the number of inputs needed to perform an operation.
In some embodiments, in accordance with a determination that the edge of the surface is in a first position in the field of view of the one or more cameras, the first computer system displays the target area indication in a first position (e.g., relative to the visual representation of the field of view of the one or more cameras); and in accordance with a determination that the edge of the surface is in a second position in the field of view of the one or more cameras that is different from the first position of the edge of the surface in the field of view of the one or more cameras, the first computer system displays the target area indication in a second position (e.g., relative to the visual representation of the field of view of the one or more cameras) that is different from the first position relative to the visual representation of the field of view of the one or more cameras.
In some embodiments, the target area indication (e.g., the position, size, and/or shape of the target area indication) is selected (e.g., automatically selected, without detecting user input selecting the target area indication) based on a position of a person (e.g., 622) (e.g., a user of the first computer system) in the field of view of the one or more cameras (or a position of a representation of a person in the visual representation of the field of view of the one or more cameras that is, optionally, automatically detected by the device based on one or more sensor inputs such as a camera or other sensor that acquires information about the physical environment that can be used to detect a position of a person). Selecting the target area indication based on a position of a user in the field of view of the one or more cameras enables the computer system to select a relevant target area without requiring a user to provide inputs to select the criteria for selecting the target area indication, which provides improved visual feedback to the user and reduces the number of inputs needed to perform an operation.
In some embodiments, in accordance with a determination that the user is in a first position in the field of view of the one or more cameras, the first computer system displays the target area indication in a first position (e.g., relative to the visual representation of the field of view of the one or more cameras); and in accordance with a determination that the person is in a second position in the field of view of the one or more cameras that is different from the first position of the person in the field of view of the one or more cameras, the first computer system displays the target area indication in a second position (e.g., relative to the visual representation of the field of view of the one or more cameras) that is different from the first position relative to the visual representation of the field of view of the one or more cameras.
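As an illustrative combination of the two preceding heuristics, the following sketch positions a suggested region starting at the detected edge of the surface and centered on the detected person; the default size and the detection inputs are assumptions, and edge and person detection are presumed to be supplied by other components.

```swift
import CoreGraphics

// Builds a suggested (target) region from the detected desk edge and the detected
// person, in normalized field-of-view coordinates with the origin at the top-left.
func targetAreaIndication(surfaceEdgeY: CGFloat,        // vertical position of the detected surface edge
                          personCenterX: CGFloat,       // horizontal center of the detected person
                          suggestedSize: CGSize = CGSize(width: 0.5, height: 0.35)) -> CGRect {
    CGRect(x: personCenterX - suggestedSize.width / 2,  // centered under the person
           y: surfaceEdgeY,                             // region begins at the edge of the surface
           width: suggestedSize.width,
           height: suggestedSize.height)
}
```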
In some embodiments, after detecting a change in position of the one or more cameras (e.g., movement 1650e), the first computer system displays, via the display generation component, the target area indication (e.g., 1611 or 1611b), wherein the target area indication indicates a second designated region (e.g., the region indicated by 1611b in
In some embodiments, the first computer system displays, concurrently with the visual representation of the field of view of the one or more cameras and the visual indication (e.g., in response to detecting the one or more first user inputs and in accordance with a determination that a first set of one or more criteria is met), a surface view representation (e.g., 1613) (e.g., image and/or video) of the surface in a ninth region of the field of view of the one or more cameras indicated by the visual indication that will be presented as a view of the surface by a second computer system, wherein the surface view representation includes an image (e.g., photo, video, and/or live video feed) of the surface captured by the one or more cameras that is (or has been) modified based on a position of the surface relative to the one or more cameras to correct a perspective of the surface (e.g., as described in greater detail with respect to methods 700 and 1700). Displaying a surface view representation of the region indicated by the visual indication that includes an image of the surface captured by the one or more cameras that is modified based on a position of the surface relative to the one or more cameras provides the user with additional information about the view that will be presented as a view of the surface by the second computer system based on the current state (e.g., position and/or size) of the visual indication and reduces the number of inputs required for the user to adjust the visual indication, which provides improved visual feedback to the user and reduces the number of inputs needed to perform an operation.
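One conventional way to produce such a corrected image is a four-corner perspective remap; the Swift sketch below uses Core Image's CIPerspectiveCorrection filter for illustration, with the corner points assumed to be detected elsewhere. This is offered only as a sketch and is not presented as the specific correction used by the embodiments.

```swift
import CoreImage

// Remaps the four corners of the surface, as seen by the camera, to a rectangle so
// that a desk viewed at an angle appears as if seen from directly above. Corner
// points are expressed in the image's own coordinate space.
func correctedSurfaceImage(from frame: CIImage,
                           topLeft: CGPoint, topRight: CGPoint,
                           bottomLeft: CGPoint, bottomRight: CGPoint) -> CIImage? {
    guard let filter = CIFilter(name: "CIPerspectiveCorrection") else { return nil }
    filter.setValue(frame, forKey: kCIInputImageKey)
    filter.setValue(CIVector(cgPoint: topLeft), forKey: "inputTopLeft")
    filter.setValue(CIVector(cgPoint: topRight), forKey: "inputTopRight")
    filter.setValue(CIVector(cgPoint: bottomLeft), forKey: "inputBottomLeft")
    filter.setValue(CIVector(cgPoint: bottomRight), forKey: "inputBottomRight")
    return filter.outputImage
}
```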
In some embodiments, displaying the surface view representation (e.g., 1613) includes displaying the surface view representation in (e.g., within, on, overlaid on, and/or in a portion of) a visual representation (e.g., 1606) of a portion of the field of view of the one or more cameras that includes a person (e.g., 622). In some embodiments, displaying the surface view representation includes displaying the surface view preview representation as a window within the user interface of the application and/or as a picture-in-picture in the user interface of the application. Displaying the surface view representation in a visual representation of a portion of the field of view of the one or more cameras that includes a user provides the user with additional contextual information about the state (e.g., position) of the user relative to the view that will be presented as a view of the surface by the second computer system (e.g., proximity of the user to the view that will be presented by the second computer system) without requiring the user to provide additional inputs to adjust the one or more cameras and/or the visual indication, which provides improved visual feedback to the user and reduces the number of inputs needed to perform an operation.
In some embodiments, after displaying the surface view representation of the surface in the ninth region of the field of view of the one or more cameras indicated by the visual indication, the first computer system detects a change in the field of view of the one or more cameras indicated by the visual indication (e.g., due to a change in the position of the one or more cameras and/or in response to user input corresponding to a request to change the portion of the field of view of the one or more cameras represented by the visual representation, such as a request to zoom in or zoom out); and in response to detecting the change in the field of view of the one or more cameras indicated by the visual indication, the first computer system displays (e.g., updates and/or updates in real-time) the surface view representation, wherein the surface view representation includes the surface in the ninth region of the field of view of the one or more cameras indicated by the visual indication after the change in the field of view of the one or more cameras indicated by the visual indication (e.g., the first computer system updates the surface view representation to display the current portion of the field of view of the one or more cameras indicated by the visual indication) (e.g., 1613 updates from:
Note that details of the processes described above with respect to method 1700 (e.g.,
In
In the embodiment illustrated in
In
Feature description portion 1806b includes text and/or graphics with information describing the feature of the camera application corresponding to camera application window 6114. The information describes that a surface view can be shared, and that the camera application will automatically show a top down view of the surface in front of computer system 1800a using camera 1852a of external device 1850a.
Computer system 1800a displays a virtual demonstration in virtual demonstration portion 1806a in which a virtual writing implement creates a simulated mark on a virtual surface.
In the second state, virtual writing implement 1814 has made a simulated mark 1816a (e.g., written the letter “h”) on virtual surface 1812. Concurrently, a simulated image is displayed on virtual computer system 1808a of an image of virtual surface 1812 captured by a camera of virtual external device 1810. The simulated image includes simulated image 1818 of virtual surface 1812, simulated image 1820 of virtual writing implement 1814, and simulated image 1822a of simulated mark 1816a. Simulated image 1818 of virtual surface 1812 is also referred to as simulated surface image 1818; simulated image 1820 of virtual writing implement 1814 is also referred to as simulated writing implement image 1820; and simulated image 1822a of simulated mark 1816a is also referred to as simulated mark image 1822a.
In response to detecting selection of learn more option 1806c, computer system 1800a displays information, or a user interface that provides access to information, for using the feature of the camera application demonstrated by the tutorial. In
In response to detecting selection of continue option 1806d, computer system 1800a initiates the feature demonstrated by the tutorial. In
In some embodiments, the virtual demonstration is repeated or looped (e.g., one or more times). In some embodiments, the virtual demonstration displays (e.g., transitions through) the states described in
In some embodiments, the virtual demonstration includes additional states or omits one or more of the states described in
In some embodiments, the content in feature description portion 1806b remains constant throughout the tutorial (e.g., is the same in all of
In
In some embodiments, virtual external device 1810a is displayed in a selected orientation of a plurality of possible orientations. In some embodiments, the selected orientation represents a recommended orientation of the corresponding physical external device (e.g., 1850a) for the feature demonstrated by the tutorial (e.g., a recommended orientation of external device 1850a when using the camera application). In some embodiments, the selected orientation is based on a property of the computer system and/or the external device. In some embodiments, the selected orientation is selected based on the type of device of the computer system, a height of the camera (e.g., a height of an expected mounting position of the camera), and/or a field of view of the camera. In some embodiments, a portrait orientation is selected when the computer system is a laptop computer because the portrait orientation will result in a greater height of the camera than a landscape orientation when the camera is mounted to the computer system (e.g., as shown in
In
In
In response to detecting selection of camera application icon 6108 in
In
In
In response to detecting selection of camera application icon 6108 in
In
As described below, method 1900 provides an intuitive way for displaying a tutorial for a feature on a computer system. The method reduces the cognitive burden on a user to display a tutorial for a feature on a computer system, thereby creating a more efficient human-machine interface. For battery-operated computing devices, enabling a user to display a tutorial for a feature on a computer system faster and more efficiently conserves power and increases the time between battery charges.
In method 1900, the computer system detects (1902), via the one or more input devices, a request (e.g., an input, a touch input, a voice input, a button press, a mouse click, a press on a touch-sensitive surface, an air gesture, selection of a user-interactive graphical object, and/or other selection input) (e.g., selection of 6108, selection of 6136-1 in
In response to detecting the request to use the feature on the computer system, the computer system displays (1904), via the display generation component, a tutorial (e.g., 1806, 1806a, and/or 1806b) for using the feature that includes a virtual demonstration of the feature (e.g., the virtual demonstration in 1806a described in
In some embodiments, the computer system displays the tutorial for using the feature that includes the virtual demonstration of the feature in response to detecting the request to use the feature on the computer system in accordance with a determination that a set of criteria is met (e.g., a set of one or more criteria and/or predetermined criteria); and the computer system forgoes displaying the tutorial for using the feature that includes the virtual demonstration of the feature in response to detecting the request to use the feature on the computer system in accordance with a determination that the set of criteria is not met. In some embodiments, the set of criteria includes a criterion that is met if the feature has been used (e.g., initiated, activated, opened, and/or launched on the computer system or, optionally, on another computer system associated with a same user as the computer system) a number of times that satisfies (e.g., is equal to; is less than or equal to; or is less than) a threshold amount (e.g., zero times, one time, two times, or three times) (e.g., the set of criteria is based on whether the feature has been used by a user at least a threshold amount (e.g., one or more times)). In some embodiments, the computer system displays the tutorial only if the feature has not been used on the computer system (or, optionally, on another computer system associated with a same user as the computer system). In some embodiments, the computer system forgoes displaying the tutorial if the feature has been used one or more times on the computer system.
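A minimal sketch of such a usage-count criterion follows; persisting the count in UserDefaults, the key name, and the threshold of zero prior uses are assumptions for illustration only.

```swift
import Foundation

// Returns true when the tutorial should be shown: the feature has been used at most
// `threshold` times so far (illustrative key name and threshold).
func shouldShowTutorial(featureKey: String = "surfaceViewUsageCount",
                        threshold: Int = 0,
                        defaults: UserDefaults = .standard) -> Bool {
    defaults.integer(forKey: featureKey) <= threshold
}

// Records one use of the feature so the tutorial is eventually suppressed.
func recordFeatureUse(featureKey: String = "surfaceViewUsageCount",
                      defaults: UserDefaults = .standard) {
    defaults.set(defaults.integer(forKey: featureKey) + 1, forKey: featureKey)
}
```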
In some embodiments, the virtual demonstration has an appearance that is based on which type of device is being used to provide access to the feature (e.g., virtual computer system 1808a is a desktop computer because computer system 1800a is a desktop computer, as shown in
In some embodiments, the virtual demonstration has an appearance that is based on which model of device is being used to provide access to the feature (e.g., virtual computer system 1808b is a model of a laptop computer with sharp corners because computer system 1800b is a laptop computer with sharp corners, as shown in
In some embodiments, the virtual demonstration has an appearance that is based on whether or not the computer system is coupled to (e.g., in communication with) an external device to provide access to the feature (e.g., the virtual demonstration in
In some embodiments, in accordance with a determination that the computer system is coupled to an external device, displaying the tutorial includes displaying a graphical (e.g., virtual) representation of the external device in a selected orientation (e.g., a predetermined orientation, a recommended orientation, a vertical orientation, a horizontal orientation, a landscape orientation, and/or a portrait orientation) of a plurality of possible orientations (e.g., virtual external device 1810a is displayed in a horizontal orientation in the virtual demonstration of
In some embodiments, the virtual demonstration has an appearance that is based on a system language of the computer system (e.g., a language setting of an operating system of the computer system) (e.g., simulated mark 1816a and/or simulated mark image 1822a is in English because a system language of computer system 1800a is English; simulated mark 1816b and/or simulated mark image 1822b is in Spanish because a system language of computer system 1800b is Spanish). In some embodiments, the first value is a first language, and the second value is a second language that is different from the first language. In some embodiments, in accordance with a determination that the system language is the first language, the virtual demonstration (or the first appearance of the virtual demonstration) includes a graphical representation (e.g., writing) in the first language; and in accordance with a determination that the system language is the second language, the virtual demonstration (or the first appearance of the virtual demonstration) includes the graphical representation in the second language. Basing the appearance of the virtual demonstration on a system language of the computer system enables the computer system to customize the virtual demonstration to the user's computer system, provides a more realistic and useful demonstration of the feature to the user, and reduces the need for a user to provide additional inputs to select a system language for the virtual demonstration, which provides improved visual feedback to the user and reduces the number of inputs needed to perform an operation.
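The following sketch illustrates, under stated assumptions, how demonstration assets could be chosen from the device type and system language; the asset names and the greeting table are invented for illustration and do not come from the embodiments above.

```swift
import Foundation

// Illustrative device categories and demonstration configuration.
enum DeviceKind { case desktop, laptop, tablet }

struct DemoConfiguration {
    let deviceImageName: String       // which virtual computer to draw in the demonstration
    let simulatedHandwriting: String  // the mark the virtual writing implement makes
}

// Chooses a virtual computer matching the device running the tutorial and a simulated
// handwriting string matching the system language.
func demoConfiguration(for device: DeviceKind, locale: Locale = .current) -> DemoConfiguration {
    let imageName: String
    switch device {
    case .desktop: imageName = "demo-desktop"
    case .laptop:  imageName = "demo-laptop"
    case .tablet:  imageName = "demo-tablet"
    }
    let greetings = ["en": "hello", "es": "hola", "fr": "bonjour"]
    let languageCode = String(locale.identifier.prefix(2))
    return DemoConfiguration(deviceImageName: imageName,
                             simulatedHandwriting: greetings[languageCode] ?? "hello")
}
```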
In some embodiments, the virtual demonstration has an appearance that is based on a color associated with the computer system (e.g., an accent color used in the computer system, a color setting such as for an operating system of the computer system, and/or a color scheme for a user interface of the computer system) (e.g., simulated mark 1816a and/or simulated mark image 1822a is a first color because the first color is associated with computer system 1800a; simulated mark 1816c and/or simulated mark image 1822c is a second color, different from the first color, because the second color is associated with computer system 1800b in
In some embodiments, displaying the tutorial includes (e.g., the virtual demonstration includes) displaying a graphical (e.g., virtual) indication (e.g., 1824) of an extent of a field of view of one or more cameras (e.g., 1852) (e.g., one or more cameras of the computer system or of an external device in communication with or coupled to the computer system) in a simulated representation of a physical environment (e.g., the simulated representation of the physical environment shown in virtual demonstration portion 1806a in
In some embodiments, displaying the tutorial includes (e.g., the virtual demonstration includes) displaying a graphical representation (e.g., 1812) of an input area (e.g., a simulated input area) and a graphical representation (e.g., the virtual display of 1808a, 1808b, and/or 1808c) of an output area (e.g., a simulated output area). In some embodiments, the input area includes a surface (e.g., a physical surface; a horizontal surface, such as a surface of a table, floor, and/or desk; a vertical surface, such as a wall, whiteboard, and/or blackboard; a surface of an object, such as a book, a piece of paper, and/or a display of a tablet; and/or other surface). In some embodiments, the output area includes a display and/or a monitor. Displaying a graphical representation of an input area and a graphical representation of an output area provides the user with information about possible areas of user inputs for the feature and expected areas for receiving outputs of the feature, and reduces the need for the user to make additional user input to determine what input areas are possible, which provides improved visual feedback to the user and reduces the number of inputs needed to perform an operation.
In some embodiments, displaying the tutorial includes (e.g., the virtual demonstration includes) displaying a graphical representation of an input (e.g., virtual writing implement 1814 making a mark on virtual surface 1812) (e.g., a simulated input and/or a user input). In some embodiments, the input includes a marking device (e.g., a pen, marker, pencil, crayon, stylus, or finger) making a mark (e.g., handwriting) on a surface (e.g., a piece of paper or a display of a tablet). In some embodiments, the graphical representation of the input includes movement of a graphical representation of the marking device making the mark on the surface and, optionally, a graphical representation of a user's hand moving and/or holding the marking device. In some embodiments, displaying the graphical representation of the input includes displaying an animation of the input over time (e.g., animating the graphical representation of the input over time; displaying an animation of a graphical representation of a marking device moving over time). In some embodiments, the computer system displays an animation of an output of the input (e.g., a mark made by a marking device), where the output (e.g., marks) appears (e.g., updates) gradually over time as the input progresses. Displaying a graphical representation of an input as part of the tutorial provides the user with information about possible user inputs for the feature and reduces the need for the user to make additional user input to determine what inputs are possible, which provides improved visual feedback to the user and reduces the number of inputs needed to perform an operation.
In some embodiments, displaying the tutorial includes (e.g., the virtual demonstration includes) displaying (e.g., concurrently displaying) a graphical representation (e.g., 1816a, 1816b, and/or 1816c) of a first output of (or response to) the input (e.g., a simulated physical output, such as a simulated mark on a surface) and a graphical representation (e.g., 1822a, 1822b, and/or 1822c) of a second output of (or response to) the input (e.g., a simulated image of the mark on the surface captured by a camera of the computer system is displayed on a virtual representation of a display of the computer system). Displaying a graphical representation of a first output of the input and a graphical representation of a second output of the input provides the user with additional information about the expected operation and output of the feature, which provides improved visual feedback to the user.
In some embodiments, displaying the graphical representation of the first output includes displaying the graphical representation of the first output on a graphical (e.g., simulated or virtual) representation of a physical (e.g., real-world) surface (e.g., on virtual surface 1812) (e.g., a horizontal surface, such as a surface of a table, floor, and/or desk; a vertical surface, such as a wall, whiteboard, and/or blackboard; a surface of an object, such as a book, a piece of paper, and/or a display of a tablet; and/or other physical surface); and displaying the graphical representation of the second output includes displaying the graphical representation of the second output on a graphical (e.g., simulated or virtual) representation of the computer system (e.g., on 1808a, 1808b, and/or 1808c) (e.g., on a graphical representation of the display generation component). Displaying the graphical representation of the first output on a graphical representation of a physical surface and displaying the graphical representation of the second output on a graphical representation of the computer system provides the user with additional information about where output of the feature occurs and reduces the need for the user to provide additional user inputs to locate an output of the feature, which provides improved visual feedback to the user and reduces the number of inputs needed to perform an operation.
In some embodiments, displaying the graphical representation of the input includes displaying a graphical (e.g., virtual) representation (e.g., 1814) of a writing implement (e.g., a writing utensil, such as a pen, pencil, marker, crayon, and/or stylus) making a mark (e.g., 1816a, 1816b, and/or 1816c); and displaying the tutorial includes (e.g., the virtual demonstration includes) displaying movement of the graphical representation of the writing implement (e.g., away from a surface, from being in contact with a surface to not being in contact with the surface, off to a side of a surface, and/or to a position that does not obscure or overlap a graphical representation of the output) after displaying the graphical representation of the input is complete (e.g., moving 1814 from the position in
In some embodiments, displaying the tutorial includes (e.g., the virtual demonstration includes): displaying a graphical representation of a physical object from a first perspective (e.g., an overhead or top perspective, a side perspective, a front perspective, a back or rear perspective, a bottom perspective, a top-side perspective, and/or a bottom-side perspective) at a first time; and displaying the graphical representation of the physical object from a second perspective at a second time, wherein the second perspective is different from the first perspective, and wherein the second time is different from the first time (e.g., displaying 1808a, 1808b, 1808c, 1810a, 1812, and/or 1814 from the perspective in
In some embodiments, displaying the tutorial includes (e.g., the virtual demonstration includes): displaying a graphical representation of the computer system from a first perspective (e.g., an overhead or top perspective, a side perspective, a front perspective, a back or rear perspective, a bottom perspective, a top-side perspective, and/or a bottom-side perspective) at a first time; and displaying the graphical representation of the computer system from a second perspective at a second time, wherein the second perspective is different from the first perspective, and wherein the second time is different from the first time (e.g., displaying 1808a, 1808b, and/or 1808c, from the perspective in
In some embodiments, displaying the tutorial includes: displaying a first virtual demonstration of the feature (e.g., an animation of the first virtual demonstration and/or a first occurrence of displaying the first virtual demonstration); and after displaying the first virtual demonstration of the feature, displaying a second virtual demonstration of the feature (e.g., displaying the first virtual demonstration again; displaying a second occurrence of displaying the first virtual demonstration; repeating and/or looping display of the first virtual demonstration; or displaying a second virtual demonstration of the feature that is different from the first virtual demonstration of the feature). In some embodiments, the computer system repeats (or loops) display of the first virtual demonstration automatically (e.g., without detecting user input corresponding to a request to repeat display of the first virtual demonstration). In some embodiments, the computer system continues to repeat display of the first virtual demonstration until detecting an input corresponding to a request to cease display of the first virtual demonstration. In some embodiments, the second virtual demonstration is partially the same as the first virtual demonstration (e.g., includes the same device, simulated writing implement, simulated surface, and/or change in perspective over time) and partially different from the first virtual demonstration (e.g., includes different simulated input such as different handwriting). Displaying a first virtual demonstration of the feature and, after displaying the first virtual demonstration of the feature, displaying a second virtual demonstration of the feature provides the user with the ability to view the demonstration multiple times and observe aspects of the demonstration that are difficult to observe in a single instance of the demonstration without having to provide additional input to replay, pause, and/or rewind the demonstration, which provides improved visual feedback to the user and reduces the number of inputs needed to perform an operation.
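A compact sketch of the looping behavior is shown below; the three named states and the wrap-around logic are illustrative and not drawn from the embodiments.

```swift
// Illustrative demonstration states and a driver that repeats them until dismissed.
enum DemoState: CaseIterable {
    case initialScene, markBeingDrawn, markMirroredOnVirtualDisplay
}

final class DemoLoop {
    private var index = 0

    // Returns the next state to display; after the final state the demonstration
    // wraps around to the first state, so playback repeats automatically.
    func nextState() -> DemoState {
        let states = DemoState.allCases
        let state = states[index]
        index = (index + 1) % states.count
        return state
    }
}
```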
In some embodiments, the computer system detects a second request to use the feature on the computer system; and in response to detecting the second request to use the feature on the computer system: in accordance with a determination that a set of criteria is met (e.g., a set of one or more criteria and/or predetermined criteria), the computer system displays the tutorial for using the feature that includes the virtual demonstration of the feature (e.g., display 1806 and the tutorial described in
In some embodiments, the set of criteria includes a criterion that is met if the feature has been used (e.g., initiated, activated, opened, and/or launched on the computer system or, optionally, on another computer system associated with a same user as the computer system) a number of times that satisfies (e.g., is equal to; is less than or equal to; or is less than) a threshold amount (e.g., zero times, one time, two times, or three times) (e.g., the set of criteria is based on whether the feature has been used by a user at least a threshold amount (e.g., one or more times)) (e.g., if selection of 6108 in
In some embodiments, after (e.g., in response to) detecting the request to use the feature on the computer system, the computer system: displays a selectable continue option (e.g., 1806d) (e.g., an affordance, a button, a selectable icon, and/or a user-interactive graphical user interface object); detects selection of the continue option (e.g., selection of 1806d) (e.g., an input, a touch input, a voice input, a button press, a mouse click, a press on a touch-sensitive surface, an air gesture, selection of a user-interactive graphical object, and/or other selection input corresponding and/or directed to the continue option); and in response to detecting selection of the continue option, performs (e.g., initiates or continues) a process for using the feature on the computer system (e.g., displaying 1604 as shown in
In some embodiments, after (e.g., in response to) detecting the request to use the feature on the computer system, the computer system: displays a selectable information option (e.g., 1806c) (e.g., an affordance, a button, a selectable icon, and/or a user-interactive graphical user interface object); detects selection of the information option (e.g., selection of 1806c) (e.g., an input, a touch input, a voice input, a button press, a mouse click, a press on a touch-sensitive surface, an air gesture, selection of a user-interactive graphical object, and/or other selection input corresponding and/or directed to the information option); and in response to detecting selection of the information option, displays a user interface (e.g., 1826) that provides (or provides access to) information (e.g., text, graphics, diagrams, charts, images, and/or animations) for using the feature on the computer system (e.g., instructions for using the feature on the computer system, information about aspects of the feature, and/or examples of the feature). In some embodiments, the computer system concurrently displays the information option, the tutorial, and, optionally, the continue option. In some embodiments, the user interface is a website and/or HTML document displayed in a web browser application. In some embodiments, the user interface is an electronic document (e.g., a PDF document, a text document, and/or a presentation document). Providing an information option and displaying a user interface that provides information for using the feature on the computer system in response to detecting selection of the information option provides an efficient technique for the user to obtain information about the feature without requiring additional inputs to search for the information (e.g., entering the name of the feature in a search field of a web browser application), which provides improved visual feedback to the user and reduces the number of inputs needed to perform an operation.
Note that details of the processes described above with respect to method 1900 (e.g.,
The foregoing description, for purpose of explanation, has been described with reference to specific embodiments. However, the illustrative discussions above are not intended to be exhaustive or to limit the invention to the precise forms disclosed. Many modifications and variations are possible in view of the above teachings. The embodiments were chosen and described in order to best explain the principles of the techniques and their practical applications. Others skilled in the art are thereby enabled to best utilize the techniques and various embodiments with various modifications as are suited to the particular use contemplated.
Although the disclosure and examples have been fully described with reference to the accompanying drawings, it is to be noted that various changes and modifications will become apparent to those skilled in the art. Such changes and modifications are to be understood as being included within the scope of the disclosure and examples as defined by the claims.
As described above, one aspect of the present technology is the gathering and use of data available from various sources to enhance a user's video conferencing experience. The present disclosure contemplates that in some instances, this gathered data may include personal information data that uniquely identifies or can be used to contact or locate a specific person. Such personal information data can include demographic data, location-based data, telephone numbers, email addresses, social network IDs, home addresses, data or records relating to a user's health or level of fitness (e.g., vital signs measurements, medication information, exercise information), date of birth, or any other identifying or personal information.
The present disclosure recognizes that the use of such personal information data, in the present technology, can be used to the benefit of users. For example, the personal information data can be used to customize user profiles for a video conference experience. Accordingly, use of such personal information data enables users to have calculated control of the delivered content. Further, other uses for personal information data that benefit the user are also contemplated by the present disclosure. For instance, health and fitness data may be used to provide insights into a user's general wellness, or may be used as positive feedback to individuals using technology to pursue wellness goals.
The present disclosure contemplates that the entities responsible for the collection, analysis, disclosure, transfer, storage, or other use of such personal information data will comply with well-established privacy policies and/or privacy practices. In particular, such entities should implement and consistently use privacy policies and practices that are generally recognized as meeting or exceeding industry or governmental requirements for maintaining personal information data private and secure. Such policies should be easily accessible by users, and should be updated as the collection and/or use of data changes. Personal information from users should be collected for legitimate and reasonable uses of the entity and not shared or sold outside of those legitimate uses. Further, such collection/sharing should occur after receiving the informed consent of the users. Additionally, such entities should consider taking any needed steps for safeguarding and securing access to such personal information data and ensuring that others with access to the personal information data adhere to their privacy policies and procedures. Further, such entities can subject themselves to evaluation by third parties to certify their adherence to widely accepted privacy policies and practices. In addition, policies and practices should be adapted for the particular types of personal information data being collected and/or accessed and adapted to applicable laws and standards, including jurisdiction-specific considerations. For instance, in the US, collection of or access to certain health data may be governed by federal and/or state laws, such as the Health Insurance Portability and Accountability Act (HIPAA); whereas health data in other countries may be subject to other regulations and policies and should be handled accordingly. Hence different privacy practices should be maintained for different personal data types in each country.
Despite the foregoing, the present disclosure also contemplates embodiments in which users selectively block the use of, or access to, personal information data. That is, the present disclosure contemplates that hardware and/or software elements can be provided to prevent or block access to such personal information data. For example, in the case of video conference interfaces, the present technology can be configured to allow users to select to “opt in” or “opt out” of participation in the collection of personal information data during registration for services or anytime thereafter. In addition to providing “opt in” and “opt out” options, the present disclosure contemplates providing notifications relating to the access or use of personal information. For instance, a user may be notified upon downloading an app that their personal information data will be accessed and then reminded again just before personal information data is accessed by the app.
Moreover, it is the intent of the present disclosure that personal information data should be managed and handled in a way that minimizes risks of unintentional or unauthorized access or use. Risk can be minimized by limiting the collection of data and deleting data once it is no longer needed. In addition, and when applicable, including in certain health-related applications, data de-identification can be used to protect a user's privacy. De-identification may be facilitated, when appropriate, by removing specific identifiers (e.g., date of birth), controlling the amount or specificity of data stored (e.g., collecting location data at a city level rather than at an address level), controlling how data is stored (e.g., aggregating data across users), and/or other methods.
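For purposes of illustration only, the following is a minimal sketch of how the de-identification techniques mentioned above (removing specific identifiers, storing location only at a city level, and aggregating data across users) could be applied to a stored record. The types UserRecord and DeidentifiedRecord and all field names are hypothetical and are not part of the embodiments described above.

```swift
import Foundation

// Hypothetical record as originally collected; illustrative only.
struct UserRecord {
    let name: String          // specific identifier
    let dateOfBirth: Date     // specific identifier
    let streetAddress: String // address-level location
    let city: String          // city-level location
    let minutesInCall: Int    // usage data
}

// Hypothetical record as stored after de-identification.
struct DeidentifiedRecord {
    let city: String          // location retained only at a city level
    let minutesInCall: Int
}

// Remove specific identifiers and reduce location specificity before storage.
func deidentify(_ record: UserRecord) -> DeidentifiedRecord {
    DeidentifiedRecord(city: record.city, minutesInCall: record.minutesInCall)
}

// Aggregate data across users so that only group-level statistics are retained.
func averageCallMinutes(byCity records: [DeidentifiedRecord]) -> [String: Double] {
    Dictionary(grouping: records, by: \.city)
        .mapValues { group in
            Double(group.reduce(0) { $0 + $1.minutesInCall }) / Double(group.count)
        }
}
```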
Therefore, although the present disclosure broadly covers use of personal information data to implement one or more various disclosed embodiments, the present disclosure also contemplates that the various embodiments can also be implemented without the need for accessing such personal information data. That is, the various embodiments of the present technology are not rendered inoperable due to the lack of all or a portion of such personal information data. For example, general user profiles can be created for video conference applications based on non-personal information data or a bare minimum amount of personal information, such as the content being requested by the device associated with a user, other non-personal information available to the video conference provider, or publicly available information.
This application is a continuation of U.S. patent application Ser. No. 17/950,922, entitled “WIDE ANGLE VIDEO CONFERENCE,” filed on Sep. 22, 2022, which claims priority to U.S. Provisional Patent Application No. 63/392,096, entitled “WIDE ANGLE VIDEO CONFERENCE,” filed on Jul. 25, 2022; and claims priority to U.S. Provisional Patent Application No. 63/357,605, entitled “WIDE ANGLE VIDEO CONFERENCE,” filed on Jun. 30, 2022; and claims priority to U.S. Provisional Patent Application No. 63/349,134, entitled “WIDE ANGLE VIDEO CONFERENCE,” filed on Jun. 5, 2022; and claims priority to U.S. Provisional Patent Application No. 63/307,780, entitled “WIDE ANGLE VIDEO CONFERENCE,” filed on Feb. 8, 2022; and claims priority to U.S. Provisional Patent Application No. 63/248,137, entitled “WIDE ANGLE VIDEO CONFERENCE,” filed on Sep. 24, 2021. The contents of each of these applications are hereby incorporated by reference in their entireties.
Provisional Applications:

Number | Date | Country
---|---|---
63/392,096 | Jul. 2022 | US
63/357,605 | Jun. 2022 | US
63/349,134 | Jun. 2022 | US
63/307,780 | Feb. 2022 | US
63/248,137 | Sep. 2021 | US
Parent/Child Continuation Data:

Relation | Number | Date | Country
---|---|---|---
Parent | 17/950,922 | Sep. 2022 | US
Child | 18/499,848 | | US