Traditional videoconference systems display images on individual monitors or individual windows on a single monitor. Each monitor or each window of the single monitor displays an image provided by a corresponding video camera at a particular location. In addition to the video camera image(s), one or more locations can contribute a shared presentation (e.g., Microsoft PowerPoint® slides or the like) for display on a separate monitor or window. In the past, such videoconference systems displayed the shared presentation on a main screen, with the image(s) of participant(s) displayed either on separate screens (allowing the presentation to fill the main screen), or in window(s) surrounding a less-than-full-screen display of the presentation. Alternatively, the windows could overlap, or be hidden by, a full-screen display of the shared presentation.
Typical videoconference systems can easily generate such a display, but participants often find that the resulting display appears unnatural and makes poor use of screen space (already in short supply, particularly if a single monitor must serve multiple purposes). Moreover, in traditional videoconference systems, the remote participants, for the most part, face their respective cameras, giving the appearance that they always look directly at the viewer, an effect that often yields an aesthetically unappealing image.
Various proposals exist to extend teleconferencing to subscribers of shared content delivery networks, such as those networks maintained by cable television companies and telecommunications carriers, to allow subscribers to share content as well as images of each other. Systems which allow both image and content sharing often bear the designation “telepresence systems.” Examples of such telepresence systems appear in applicants' co-pending applications PCT/US11/063036, PCT/US12/050130, PCT/US12/035749, and PCT/US13/24614 (all incorporated by reference herein). As described in these co-pending applications, a typical telepresence system includes a plurality of telepresence stations, each associated with a particular subscriber in communication with other subscribers at their respective telepresence stations. Each telepresence station typically has a monitor, referred to as a “telepresence” monitor, for displaying the image of one or more “remote” participants, e.g., participants at remote stations whose images undergo capture by the camera (the “telepresence” camera) at each participant's station. For ease of discussion, the term “local participant” refers to the participant whose image undergoes capture by the telepresence camera at that participant's station for display at one or more distant (e.g., “remote”) stations. Conversely, the term “remote participant” refers to a participant associated with a remote station whose image undergoes display for observation by a local participant.
In the case of a remote telepresence station whose telepresence monitor and camera lie to one side of the monitor showing shared content (e.g., the “shared content” monitor), the transmitted image of the remote participant will appear in profile to the local participant while the remote participant watches his or her content monitor. However, when that remote participant turns to face his or her telepresence monitor directly, that remote participant now appears to directly face the local participant. Thus, at any given time, some participants will directly face their corresponding telepresence cameras while others will not, giving rise to uncertainty as to how to manage the participants' images for display on the telepresence monitor at each telepresence station.
Thus, a need exists for a technique for managing the images of remote participants in a telepresence system.
Briefly, in accordance with a preferred embodiment of the present principles, a method for managing received images of remote participants displayed to a local participant in a telepresence system commences by first establishing for each remote telepresence station the relative orientation of the corresponding shared content screen and telepresence camera, with respect to the corresponding remote telepresence system participant (e.g., whether the camera is to the left, right, or substantially coincident with the shared content screen, from the vantage of the remote participant). The received images of the remote participant(s) undergo processing for display to the local participant in accordance with the established orientations to control at least one of image visibility and image location within the displayed image observed by the local participant.
The STBs 111, 121, and 131 all enjoy a connection to a communication channel 101, such as provided by a network content provider (e.g., a cable television provider or telecommunications carrier). Alternatively, the communication channel 101 could comprise a link to a broadband network such as the Internet. The communication channel 101 allows the STBs to receive content from a content source as well as to exchange information and video streams with each other, with or without intermediation by a server 103.
At each of the stations 110, 120 and 130, a corresponding one of the STBs 111, 121, and 131, respectively, receives a video signal from its corresponding one of telepresence cameras 117, 127, and 137, respectively. Each of the telepresence cameras 117, 127, and 137 serves to capture the image of a corresponding one of the participants 113, 123 and 133, respectively. As discussed in applicants' co-pending applications, each STB sends the video signals embodying telepresence images captured by its corresponding telepresence camera to the other STBs with or without any intermediate processing. Each STB receiving telepresence images from the STBs at the remote stations will supply the images for display on a display device at the local telepresence station. Some local telepresence stations, for example stations 120 and 130, include telepresence monitors 126 and 136, respectively, for displaying telepresence images of remote participants. At the stations 120 and 130, the telepresence monitors 126 and 136, respectively, support the telepresence cameras 127 and 137, respectively, so the telepresence cameras and monitors are co-located. The station 110 has no telepresence monitor and thus the STB 111 will display telepresence images of remote participants on the shared content monitor 112, which serves to support the telepresence camera 117.
As used herein throughout, “orientation” concerns the relative placement at a station (e.g., 120) of the shared content monitor (e.g., 122) and the telepresence camera (e.g., 127), with respect to the participant (e.g., 123) or, equivalently, the participant's seat (e.g., chair 124). At station 120, from the vantage of participant 123, camera 127 is rightward of shared content monitor 122, which can be called a “right” orientation (whereas station 130 has a “left” orientation). In normal use, the “orientation” of the equipment at a station does not change. This should not be confused with the “facing” of a participant, which is more dynamic. At station 120, participant 123 has facing 128 when watching shared content monitor 122, and facing 129 when looking at telepresence monitor 126 and thereby looking toward camera 127. With the “right” orientation of station 120, an image captured by camera 127 while participant 123 looks at shared content monitor 122 (i.e., has facing 128) will show the participant facing to the right. In the case of station 110, the triangle formed by the participant, camera, and shared content monitor is collapsed, since the camera and shared content monitor are co-located, in which case the station is said to have a “centered” orientation. In some contexts below, a participant “is facing” when the participant is looking toward the camera, and “is non-facing” when not looking toward the camera. Herein, the phrases “orientation of a participant” and “participant having an orientation” mean a participant at a station having that orientation.
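By way of illustration only, the distinction between a station's fixed orientation and a participant's momentary facing might be represented in software roughly as sketched below; the enum and field names are hypothetical and do not come from the co-pending applications.

```python
from dataclasses import dataclass
from enum import Enum

class Orientation(Enum):
    # Fixed placement of the telepresence camera relative to the shared
    # content monitor, from the participant's vantage point.
    LEFT = "left"      # e.g., station 130
    RIGHT = "right"    # e.g., station 120
    CENTER = "center"  # e.g., station 110 (camera co-located with the monitor)

class Facing(Enum):
    # Momentary state of a participant; changes during a session.
    FACING = "facing"          # looking toward the telepresence camera
    NON_FACING = "non_facing"  # e.g., looking at the shared content monitor

@dataclass
class StationConfig:
    station_id: str
    orientation: Orientation  # does not change during normal use

# Example: station 120 has a "right" orientation; participant 123's facing
# varies as he or she turns between the two monitors.
station_120 = StationConfig(station_id="120", orientation=Orientation.RIGHT)
```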
While the participants 113, 123, and 133 watch their shared content monitors 112, 122, and 132, respectively, the participants will have a particular facing relative to their corresponding shared content monitors, indicated by the arrows 118, 128, and 138, respectively. However, when the participants 123 and 133 at the stations 120 and 130, respectively, watch their telepresence monitors 126 and 136, respectively, thereby looking toward the co-located telepresence cameras 127 and 137, respectively, the participants 123 and 133 will have facings 129 and 139, respectively.
At some telepresence stations, the telepresence monitor and telepresence camera can lie to the left of the shared content monitor, as at the station 130. At other telepresence stations, the telepresence monitor and telepresence camera can lie to the right, such as at station 120. In the case of the station 110, which has no separate telepresence monitor, the telepresence camera 117 lies co-located with the shared content monitor 112, and the telepresence images of the remote participants 123 and 133 will appear on that shared content monitor. As described in applicants' co-pending applications (incorporated by reference herein), the STBs can exchange information about the stations' orientations, or interact by assuming a predetermined orientation (e.g., providing and handling telepresence video signals to appear as if they originated from telepresence cameras disposed to a particular side of the shared content monitor, e.g., to a participant's right when the participant faces his or her shared content monitor). An embodiment relying on an assumed orientation supports the interaction of this invention without the need to exchange orientation information.
The content supplied to each STB for sharing among the telepresence stations 110, 120 and 130 could originate from a broadcast station, or could comprise stored content distributed from a head end 102 by a server 103. The server 103 can access a database 104 containing television programs and a database 105 storing advertisements, based on subscription or other access control or access tracking information stored in a database 106. Note that the television programs and advertisements could reside in a single database rather than the separate databases as described. The server 103 can provide other services. For example, in some embodiments, the server 103 could provide the services necessary for setting up a telepresence session, or for inviting participants to join a session. In some embodiments, the server 103 could provide processing assistance (e.g., face detection, as discussed below).
Note that while discussion of the present principles refers to the illustrated embodiment of
In the illustrated embodiment, the STB 111 of
Depending on the remote participant's pose, the remote participant's image will appear as transparent or opaque when processed by either the sending or receiving STB, with or without assistance of the remote server 103. Assume that a remote participant (e.g., participant 133) has a non-facing pose (e.g., looking in the direction 138, so as not facing the corresponding telepresence camera 137), as determined by the face detection and pose estimation algorithm. Under such circumstances, the corresponding participant image 223 becomes at least partially transparent to minimize the impact on the shared content in composite image 221. However, when a remote participant (e.g., 123) has a facing pose (e.g., looking in direction 129 toward corresponding camera 127), then the corresponding participant image 222 becomes substantially opaque.
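The partial transparency described above could plausibly be realized as simple alpha blending of the participant image over the shared content it covers. The sketch below assumes both frames are NumPy arrays of the same size (i.e., the participant cutout and the region of shared content it overlays); the alpha values are illustrative only.

```python
import numpy as np

def blend_participant(shared_region: np.ndarray, participant: np.ndarray,
                      facing: bool) -> np.ndarray:
    """Overlay a participant image on a same-sized region of shared content.

    A facing participant is rendered substantially opaque; a non-facing
    participant is rendered mostly transparent so the shared content
    remains visible. The alpha values are illustrative only.
    """
    alpha = 0.9 if facing else 0.3
    blended = (alpha * participant.astype(np.float32)
               + (1.0 - alpha) * shared_region.astype(np.float32))
    return blended.astype(shared_region.dtype)
```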
The exemplary presentation 230 shown in
In some embodiments, presentation of windowed images such as images 242 and 243 could occur by presenting such images completely outside of the shared content so that they appear in independent windows (rather than being composited into the single image 241). Presenting these images in this manner suffers from the disadvantage that the shared content will appear smaller than it might otherwise appear, depending upon the aspect ratio of the shared content and that of the shared content monitor 112.
In other embodiments, the presentation technique of
At the station 110, a composite image 211 appears on the shared content monitor 112. (In other exemplary embodiments, the image 211 could look like the composite images 221, 231, 241, or 251.) At the other stations 120 and 130, which have independent telepresence monitors 126 and 136, respectively, these telepresence monitors display the composite telepresence images 326 and 336 of their respective remote participants. Depending on the orientation of the corresponding remote telepresence stations, the individual images of the remote participants in the composite images 326 and 336 may require horizontal flipping to support the illusion that the remote participants face the local shared content monitors 122 and 132, respectively. (In the illustrative embodiment, such image flipping remains unnecessary.) Note that no need exists to flip the frontal image 317 when displayed on either of the remote telepresence monitors, since that participant directly faces the telepresence camera. In contrast, the images 327 and 337 typically do not constitute frontal images, because those participants generally do not face their respective telepresence cameras, although, as shown in image 327, such an image can occasionally constitute a “facing” image.
For the exemplary situation 300 of
During step 420, the STB 111 of
In one exemplary embodiment, the STB 111 can transmit the orientation information (e.g., the telepresence station configuration) stored in the settings 513 database to the other participating stations during a configuration step 503, whose execution is optional. Sending the station configuration constitutes one approach to enable a remote station to correctly handle placement and, if necessary, the horizontal flipping of a remote participant image. Alternatively, the telepresence video signal sent to each remote station can include embedded orientation information, typically in the form of metadata, so that the interchange of orientation data occurs concurrently with the interchange of telepresence video signals.
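Purely as an illustration, such orientation information could take the form of a small configuration message (or a per-stream metadata record); the field names below are hypothetical.

```python
import json

# Hypothetical session-setup message carrying a station's orientation,
# sent during the optional configuration step 503 or embedded as metadata
# alongside the telepresence video signal.
config_message = json.dumps({
    "station_id": "120",
    "orientation": "RIGHT",           # LEFT, RIGHT, or CENTER
    "has_telepresence_monitor": True,
})
```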
In other embodiments, no need exists to exchange orientation information if all the stations adhere to a convention that assumes a predetermined orientation. This approach has particular application to those embodiments that gather all remote telepresence images to one side or the other, as in the depicted composite images 241 and 251, but is also more generally applicable. For example, the convention could dictate that all sending STBs provide telepresence images in a particular orientation, for example ‘LEFT’. In other words, the sending STB will pretend that its associated telepresence camera lies to the left of the shared content monitor, whether or not this is actually the case. (It actually is the case at the station 130, where the telepresence monitor 136 and camera 137 lie to the left of the participant's shared content monitor 132.) This corresponds to remote participant images having a generally left-facing profile (i.e., their nose most often points leftward, from the camera's perspective). Since the station 120 has a “RIGHT” orientation, applying the above-identified convention would dictate that the telepresence image of the participant 123 provided by the station 120 undergo horizontal flipping before transmission, so that it appears to have originated from a left-oriented station.
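A minimal sketch of such a sender-side convention is given below, assuming an agreed ‘LEFT’ convention and frames held as NumPy arrays; the function and constant names are hypothetical.

```python
import numpy as np

ASSUMED_CONVENTION = "LEFT"  # every sender pretends its camera is to the left

def conform_outbound_frame(frame: np.ndarray, local_orientation: str) -> np.ndarray:
    """Flip an outbound telepresence frame horizontally when the local
    station's actual orientation is opposite to the assumed convention.

    A CENTER-oriented station, or one already matching the convention,
    sends its frames unmodified.
    """
    if (local_orientation, ASSUMED_CONVENTION) in {("RIGHT", "LEFT"),
                                                   ("LEFT", "RIGHT")}:
        return frame[:, ::-1].copy()  # mirror left-to-right
    return frame
```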
During step 502 of
In some embodiments, the participant can select the mode of display of his or her telepresence images (e.g., images 211, 221, 231, 241, 251, or others) as a participant preference.
In some instances, exchange of orientation information among stations can prove useful, as indicated by the optional nature of step 503, during which exchange of such orientation information would occur. This can be true even when the orientation convention discussed above is in use. For example, telepresence images from participants having a ‘CENTER’ orientation (as shown in image 317) can be arranged to lie ‘behind’ telepresence images from participants having a non-CENTER orientation (such as images 327 and 337), as seen in the composite telepresence images 326 and 336. This provides a more aesthetic composition than if the image positions were swapped, which would make one participant appear to stare at the other (e.g., in image 326, if the head positions were swapped, participant 133 would appear to be looking at participant 113).
During step 504, telepresence images from another station are received by the STB 111. During step 505, the receiving STB (e.g., STB 111) determines whether the received telepresence image is from a left-oriented configuration. This determination is based on the configuration stored in settings 513. If so, the STB 111 will apply a prescribed policy during step 506, for example to exhibit the received telepresence images of that remote participant on the right side of the composite image displayed on the shared content monitor 112 of
If the received telepresence image is not from a left-oriented station when evaluated during step 505, then the STB 111 undertakes an evaluation during step 507 to determine whether the image is from a right-oriented configuration, again based on the configuration stored in settings 513. If so, the STB 111 will apply the prescribed policy during step 508 to display that remote participant image on the left side of the composite image displayed on the shared content monitor 112. As depicted in
If the received remote telepresence image is not from a left- or right-oriented station (i.e., the remote station has a ‘center’ orientation) when evaluated during steps 505 and 507, respectively, then the STB 111 executes step 509 to identify a default placement for the remote participant image on monitor 112 in accordance with a prescribed policy. For example, step 509 undergoes execution upon receipt of a telepresence image from a remote station with a center orientation, such as station 110, which has its telepresence camera co-located with the shared content monitor.
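The placement decision of steps 505 through 509 could be sketched as follows; this is an illustrative rendering only, and the string values and default policy are assumptions.

```python
def choose_placement(remote_orientation: str, default_side: str = "right") -> str:
    """Decide on which side of the composite image to place a remote
    participant's telepresence image (a sketch of steps 505-509).

    A LEFT-oriented remote station yields a right-side placement (step 506),
    a RIGHT-oriented station a left-side placement (step 508), and a
    CENTER-oriented station falls back to a default placement policy
    (step 509).
    """
    if remote_orientation == "LEFT":
        return "right"
    if remote_orientation == "RIGHT":
        return "left"
    return default_side
```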
In an alternative embodiment operating with different policies, some remote telepresence images could undergo a horizontal flip during step 508, corresponding to the flipping of the telepresence images 242 and 252 prior to display on the right side of the composite image.
In other embodiments, the policy applied during the execution step 506, 508, and 509 could consider participant preferences. For example, the telepresence system 100 could apply a policy that prescribes consecutive allocation of on-screen position to the telepresence images of remote participants. For example, at each local station, the STB could allocate a first position in the composite image displayed by the shared content monitor 112 to a first-joined station (e.g., the station that joined the telepresence session first), with subsequent positions allocated to the telepresence images from successively joining stations. In some embodiments, user preferences could identify particular placements for telepresence images of particular participants. For example, a participant at a given station could preferentially assign a particular position (e.g., the bottom right-hand screen corner) to that participant's best friend when that best friend participates in the current telepresence session.
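One hypothetical realization of such a policy, combining join-order allocation with per-participant preferences, appears below; the slot names and data structures are illustrative only.

```python
def allocate_positions(join_order: list[str],
                       preferences: dict[str, str] | None = None) -> dict[str, str]:
    """Assign on-screen positions to remote participants' telepresence images.

    Positions are handed out in the order in which stations joined the
    session, except where a user preference pins a particular participant
    to a particular slot (e.g., a best friend always in the bottom
    right-hand corner).
    """
    slots = ["bottom_right", "bottom_left", "top_right", "top_left"]
    preferences = preferences or {}
    assignment: dict[str, str] = {}
    remaining = [s for s in slots if s not in preferences.values()]
    for participant in join_order:
        if participant in preferences:
            assignment[participant] = preferences[participant]
        elif remaining:
            assignment[participant] = remaining.pop(0)
    return assignment

# Example: station 130 joined first, but participant 123 is pinned to the
# bottom right-hand corner by a local preference.
print(allocate_positions(["133", "123"], {"123": "bottom_right"}))
```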
After determining placement of each telepresence image during step 506, 508, or 509, the process ends during step 510.
Upon determining during step 606 that the remote participant faces his or her telepresence camera, the STB will make the received telepresence image opaque during step 607, as depicted by telepresence image 222 in
If, during step 605, the STB determines that the received telepresence image is from a station with a “CENTER” orientation, then any subsequent determination of whether the remote participant faces his or her telepresence camera in order to control the telepresence image visibility will not prove useful: A remote telepresence image from a “CENTER” oriented station results in a remote participant directly facing his or her telepresence camera almost constantly (e.g., participant 113 will usually have facing 118). Instead, it is the activity of a remote participant at a station with a “CENTER” orientation that constitutes a more useful indicator for controlling the visibility of that participant's image when displayed in connection with the composite image appearing on the shared content monitor. For this reason, during step 609, the receiving STB will determine whether that remote participant is talking. The STB could use either audio-based techniques (i.e., speech determination) or video-based techniques (i.e., a lip movement determination) for this purpose. If the STB determines the remote participant to be talking, then the STB will display that remote participant's telepresence image as more opaque during step 607. Otherwise, the STB will display that remote participant's telepresence image as more transparent during step 608.
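A compact sketch of this visibility decision appears below; the opacity values are illustrative, and the talking/signing test stands in for whatever audio- or video-based activity detection an embodiment employs.

```python
def choose_opacity(orientation: str, is_facing: bool, is_active: bool) -> float:
    """Select an opacity for a remote participant's telepresence image.

    For LEFT- or RIGHT-oriented stations, the facing pose drives visibility
    (steps 606-608). For a CENTER-oriented station, where the participant
    nearly always faces the camera, talking (or signing) activity drives
    it instead (step 609). The numeric values are illustrative only.
    """
    if orientation == "CENTER":
        return 0.9 if is_active else 0.3
    return 0.9 if is_facing else 0.3
```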
When the system is used by individuals who use sign language, the determination at step 609 could also include detection of gestures likely to represent sign language communication, or could simply rely on hand detection at step 609, much as face detection is used in step 606. Hand detection in video is well known in the art, as taught by Ciaramello and Hemami of Cornell University in “Real-Time Face and Hand Detection for Videoconferencing on a Mobile Device”, published in the Fourth International Workshop on Video Processing and Quality Metrics for Consumer Electronics (VPQM), Scottsdale, Ariz., January 2009.
Although not shown in
In an embodiment where the telepresence image undergoes horizontal flipping, when necessary, so as to resemble a particular conventional orientation, an indication in the settings database 513 of a ‘CENTER’ orientation (as might be recorded during step 502), or of the orientation prescribed by the convention, would indicate that no flipping is required, whereas an indication of the opposite orientation would require horizontally flipping the image. The outbound video controller 711 can perform the horizontal flip of the outbound image when needed.
The STB 111 provides its outbound telepresence video signal 741 via communication interface 714 to the communication channel 101 for transmission to each of the remote STBs 121 and 131 at the remote telepresence stations 120 and 130, respectively, as video signals 743 and 742, respectively. In return, the stations 130 and 120 send their outbound telepresence video signals 750 and 760, respectively, through the communication channel 101 for receipt by the STB 111 at its communication interface 714, which passes the signals to a decoder 715. In embodiments where orientation data undergoes exchange during step 503 of
The decoder 715 processes the inbound telepresence video signals 750 and 760 to provide sequences of images 751 and 761 to corresponding inbound video buffers 717A and 717B, respectively. A face detection module 721 analyzes the images in the inbound video buffers 717A and 717B to determine whether the corresponding remote participants 133 and 123 have turned toward their respective telepresence cameras 137 and 127. In some embodiments, the detection module 721 may also detect the presence of hands (e.g., as an indication of sign language), or may analyze the audio streams (not separately shown) corresponding to the image streams 751 and 761 to detect talking, as discussed above.
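As one possible stand-in for the face detection module 721, the sketch below uses OpenCV's bundled frontal-face Haar cascade to decide whether a frame shows a roughly frontal face; any other face detector or pose estimator could serve the same purpose.

```python
import cv2

# Illustrative stand-in for face detection module 721. A detected frontal
# face is taken as a proxy for the participant looking toward the camera.
_face_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def is_facing_camera(frame_bgr) -> bool:
    """Return True if a roughly frontal face is found in the frame."""
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
    faces = _face_cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    return len(faces) > 0
```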
An inbound video controller 718 receives shared content 770, for example as provided from head end 102. For simplicity of explanation,
The inbound video controller 718 composites the shared content 770 with the remote participant's telepresence images stored in inbound video buffers 717A and 717B. The composition performed by the inbound video controller 718 takes account of the orientation information stored in the settings 513 database and the results from detection module 721 to determine position and scale and/or opacity as discussed with respect to processes 500 and 600, and their variants. The inbound video controller 718 writes the resulting composite image to a video output buffer 719, which provides a video signal 720 to shared content display 112, for display, in this example as composite image 211.
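A greatly simplified sketch of the compositing performed by the inbound video controller 718 follows, combining the placement and opacity decisions sketched earlier; the data layout (a list of per-participant dictionaries) is an assumption made for illustration.

```python
import numpy as np

def composite(shared: np.ndarray, participants: list[dict]) -> np.ndarray:
    """Paste remote participant images into the shared content frame.

    Each entry in `participants` carries an already-scaled image, a
    placement side ("left" or "right"), and an opacity, e.g., as produced
    by the placement and opacity sketches above. Images are pasted along
    the bottom edge and alpha-blended into the shared content. Listing a
    CENTER-oriented participant first makes later (non-CENTER) images
    appear in front, as in composite images 326 and 336.
    """
    out = shared.astype(np.float32).copy()
    h, w = out.shape[:2]
    for p in participants:
        img = p["image"].astype(np.float32)
        ph, pw = img.shape[:2]
        y0, x0 = h - ph, (0 if p["side"] == "left" else w - pw)
        region = out[y0:h, x0:x0 + pw]
        out[y0:h, x0:x0 + pw] = p["opacity"] * img + (1 - p["opacity"]) * region
    return out.astype(shared.dtype)
```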
The foregoing describes a technique for enabling a telepresence station having a single monitor to provide an improved experience when showing both shared content and telepresence streams of one or more remote participants whose telepresence cameras do not lie close to their shared content monitor.