1. Field of the Invention
The present invention relates to videoconferencing systems.
2. Description of the Related Art
Videoconferencing systems allow people at two or more different locations to participate in a conference so that the people at each location can see and hear the people at the other location(s). Videoconferencing systems typically perform digital compression of audio and video signals in real time. The hardware or software that performs compression is called a codec (coder/decoder). The resulting digital stream of bits representing the audio and video data is subdivided into packets, which are then transmitted through a network of some kind (usually ISDN or IP) to the other locations or endpoints participating in the videoconference.
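The subdivision of a compressed bitstream into packets can be sketched as follows. This is an illustrative sketch only: the fixed payload size is an assumption standing in for a real transport format (e.g., RTP over IP or H.221 framing over ISDN), whose headers, sequencing, and timing information are omitted.

```python
def packetize(bitstream: bytes, payload_size: int = 1400) -> list[bytes]:
    """Subdivide a compressed audio/video bitstream into packets.

    payload_size approximates a typical network payload; real systems
    add per-packet headers and sequence numbers, omitted here.
    """
    return [bitstream[i:i + payload_size]
            for i in range(0, len(bitstream), payload_size)]

packets = packetize(b"\x00" * 3000, payload_size=1400)
# 3000 bytes -> three packets of 1400, 1400, and 200 bytes
```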
Videoconferences can be performed using dedicated videoconferencing equipment, i.e., devices especially designed for videoconferencing. For example, a dedicated videoconferencing device may include input ports for receiving video signals from local video sources and audio signals from local microphones, network ports for receiving the remote audio/video streams from and sending the local audio/video stream to the remote endpoints, and output ports for displaying the video data on a display device and sending the audio data to an audio output device. The dedicated videoconferencing device may also include specialized software and hardware for compressing and decompressing audiovisual data, generating a composite image of the video streams from the various participants, etc. The dedicated videoconferencing device may also include an interface allowing users to interact with the videoconferencing equipment, e.g., to pan, tilt, and zoom cameras, select a video input source to send to the remote endpoints, control volume levels, control placement of video windows on the display device, etc.
Videoconferences can also be performed using non-dedicated equipment, e.g., a general purpose computer system. For example, a typical desktop PC can be configured with add-on hardware boards and/or software to enable the PC to participate in a videoconference.
Various standards have been established to enable the videoconferencing systems at each endpoint to communicate with each other. In particular, the International Telecommunications Union (ITU) has specified various videoconferencing standards. These standards include:
H.320—This is known as the standard for public switched telephone networks (PSTN) or videoconferencing over integrated services digital networks (ISDN) basic rate interface (BRI) or primary rate interface (PRI). H.320 is also used on dedicated networks such as T1 and satellite-based networks.
H.323—This is known as the standard for video over Internet Protocol (IP). This same standard also applies to voice over IP (VoIP).
H.324—This is the standard for transmission over POTS (Plain Old Telephone Service), or audio telephony networks.
In recent years, IP-based videoconferencing has emerged as a communications interface and standard commonly utilized by videoconferencing equipment manufacturers. Due to the price point and proliferation of the Internet, and broadband in particular, there has been strong growth and use of H.323 IP-based videoconferencing. H.323 has the advantage that it is accessible to anyone with a high speed Internet connection, such as a DSL connection, cable modem connection, or other high speed connection.
A videoconference may include a plurality of endpoints that share video information among each other. At a given endpoint in the videoconference, there may be multiple local video sources, each of which provides a video input signal to a videoconferencing device at the endpoint. Various embodiments of a method for facilitating selection of a desired local video input signal to send to the remote endpoints in the videoconference are described herein. The method may comprise simultaneously displaying a plurality of icons on a display device, where each icon displays a live version of the video input signal from a respective one of the local video sources at the endpoint. These icons are also referred to herein as live video icons. The live video icons may be selectable to select a video input signal to send to the remote endpoints in the videoconference. In other words, by selecting a particular icon, a user can select the video input signal displayed by the icon as the video input signal to send to the remote endpoints.
Various embodiments of a videoconferencing system which utilizes the video input signal selection method are also described.
A better understanding of the present invention may be obtained when the following detailed description is considered in conjunction with the following drawings, in which:
While the invention is susceptible to various modifications and alternative forms, specific embodiments thereof are shown by way of example in the drawings and are described in detail. It should be understood, however, that the drawings and detailed description thereto are not intended to limit the invention to the particular form disclosed, but on the contrary, the intention is to cover all modifications, equivalents and alternatives falling within the spirit and scope of the present invention as defined by the appended claims.
Incorporation by Reference
U.S. Provisional Patent Application Ser. No. 60/676,918, titled “Audio and Video Conferencing”, which was filed May 2, 2005, whose inventors were Michael L. Kenoyer, Wayne Mock, and Patrick D. Vanderwilt, is hereby incorporated by reference in its entirety as though fully and completely set forth herein.
U.S. patent application Ser. No. 11/252,238, titled “Video Conferencing System Transcoder”, which was filed Oct. 17, 2005, whose inventors were Michael L. Kenoyer and Michael V. Jenkins, is hereby incorporated by reference in its entirety as though fully and completely set forth herein.
U.S. patent application Ser. No. 11/251,084, titled “Speakerphone”, which was filed Oct. 14, 2005, whose inventor was William V. Oxford, is hereby incorporated by reference in its entirety as though fully and completely set forth herein.
U.S. patent application Ser. No. 11/251,086, titled “Speakerphone Supporting Video and Audio Features”, which was filed Oct. 14, 2005, whose inventors were Michael L. Kenoyer, Craig B. Malloy and Wayne E. Mock, is hereby incorporated by reference in its entirety as though fully and completely set forth herein.
U.S. patent application Ser. No. 11/251,083, titled “High Definition Camera Pan Tilt Mechanism”, which was filed Oct. 14, 2005, whose inventors were Michael L. Kenoyer, William V. Oxford, Patrick D. Vanderwilt, Hans-Christoph Haenlein, Branko Lukic and Jonathan I. Kaplan, is hereby incorporated by reference in its entirety as though fully and completely set forth herein.
As described in more detail below, a videoconference may include a plurality of endpoints that share video information among each other. At a given endpoint in the videoconference, there may be multiple video sources, each of which provides a video input signal to a videoconferencing device at the endpoint. One (or more) of these local video input signals may be selected as a video input signal to send to the remote endpoints in the videoconference.
Various embodiments of a method for facilitating selection of a desired video input signal to send from a given endpoint to the remote endpoints in the videoconference are described herein. As described in detail below, the method may comprise simultaneously displaying a plurality of icons on a display device, where each icon displays a live version of the video input signal from a respective one of the local video sources at the endpoint. These icons are also referred to herein as live video icons. The live video icons may be selectable to select the video input signal to send to the remote endpoints in the videoconference. In other words, by selecting a particular icon, a user can select the video input signal displayed by the icon as the video input signal to send to the remote endpoints.
Various embodiments of a videoconferencing system which utilizes the video input signal selection method are also described.
Referring now to
The various locations of the videoconference participants are also referred to herein as “endpoints” in the videoconference. For example,
Although there are five endpoints 101 in this example, in other examples there may be any number of endpoints (as long as there are at least two). Also, the participants 80 at a given endpoint 101 may include any number of people. In one embodiment, each endpoint 101 includes at least one person as a participant 80. In other embodiments, one or more of the endpoints 101 may have no persons present as participants 80. For example, video information from a camera stationed at an endpoint 101A with no participants 80 may be sent to other endpoints 101 and viewed by participants 80 at the other endpoints 101, where the other endpoints 101 also share video information among each other.
In one embodiment, each endpoint 101 may send video information to all of the remote endpoints 101. In another embodiment, one or more of the endpoints may send video information to only a subset, but not all, of the remote endpoints. As one example, endpoints 101B-101E may each send video information only to endpoint 101A, and endpoint 101A may send video information to each of the endpoints 101B-101E. As described below, in some embodiments, each endpoint 101 may send video information to a device referred to as a Multipoint Control Unit (MCU). The MCU may then relay the received video information to the various endpoints 101. The MCU may be located at one of the endpoints 101 or may be in a separate location from any of the endpoints 101.
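The bridging role of the MCU described above can be sketched as a simple relay: endpoints register with the MCU, and each video stream the MCU receives is forwarded to the other registered endpoints. The class and method names below are illustrative, not an actual MCU interface.

```python
class MCU:
    """Minimal sketch of a Multipoint Control Unit acting as a bridge:
    each registered endpoint's video is relayed to all other endpoints."""

    def __init__(self):
        self.endpoints = {}  # endpoint name -> deliver(sender, frame) callback

    def connect(self, name, deliver):
        # An endpoint "calls" the MCU and registers how to receive video.
        self.endpoints[name] = deliver

    def relay(self, sender, frame):
        # Forward the received frame to every endpoint except the sender.
        for name, deliver in self.endpoints.items():
            if name != sender:
                deliver(sender, frame)
```

A variant MCU might instead composite the streams into a single image before forwarding, rather than relaying each stream separately.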
In another embodiment, one or more of the endpoints 101 may not send video information to any remote endpoint. As one example, a given endpoint 101 may receive video information from one or more of the remote endpoints, but may not send video information to any remote endpoint. As another example, a given endpoint 101 may not send video information to any remote endpoint or receive video information from any remote endpoint. In this example, the given endpoint 101 may participate in the videoconference by sharing audio information only, e.g., may receive audio information from one or more of the remote endpoints, as well as possibly sending audio information to one or more of the remote endpoints.
As noted above, in addition to sharing video information, the endpoints 101 may also share audio information. In one embodiment, each endpoint 101 that sends video information to one or more remote endpoints may also send audio information to the one or more remote endpoints 101. In one embodiment, each endpoint 101 may receive both video information and audio information from all of the other endpoints 101. In another embodiment, one or more of the endpoints 101 may send video information to one or more remote endpoints, but without sending audio information to the one or more remote endpoints. In another embodiment, one or more of the endpoints 101 may send audio information to one or more remote endpoints, but without sending video information to the one or more remote endpoints.
It will be appreciated that many other possible permutations of sending video and/or audio information among the various endpoints 101 in the videoconference are possible, other than the particular ones described above.
As noted above, in some embodiments, a device referred to as a Multipoint Control Unit (MCU) may be used to facilitate sharing video and audio information among the endpoints 101. The MCU may act as a bridge that interconnects calls from several endpoints. For example, all endpoints may call the MCU, or the MCU can also call the endpoints which are going to participate in the videoconference. An MCU may be located at one of the endpoints 101 of the videoconference or may be in a separate location from any endpoint 101. In one embodiment, the MCU may be embedded in a videoconferencing device at one of the endpoints 101.
At least one of the endpoints 101 in
As shown, the videoconferencing system 119 includes a videoconferencing device 120. As used herein, the term “videoconferencing device” refers to a device operable to receive video information from and send video information to remote endpoints in a videoconference. A videoconferencing device may also receive audio information from and send audio information to the remote endpoints.
In the example of
The videoconferencing device 120 may be operable to select one (or more) of the video input signals received from the video sources 130 as a video input signal to send to one or more of the remote endpoints in the videoconference. Thus, the video sources 130 are also referred to herein as “local video sources” and the respective video input signals that they produce are also referred to herein as “local video signals”. It is noted, however, that the local video sources may or may not be physically located together with, or in proximity to, the videoconferencing device 120. For example, in one embodiment, one or more of the local video sources 130 may be located far away from the videoconferencing device 120 and may connect to the videoconferencing device 120 to provide a video input signal via a network. Thus, the video sources 130 are “local” in the sense of providing video input signals for possible selection for sending from the local endpoint 101 to the remote endpoints 101, but may or may not be local in the sense of physical location.
The local video input signal that is currently selected to be sent to the remote endpoints is also referred to below as the “selected local video input signal” or simply the “selected video signal”. In some embodiments, the videoconferencing device 120 may be operable to send more than one local video input signal to the remote endpoints, and thus, there may be multiple selected video signals.
As shown, the videoconferencing device 120 may be coupled to the network 105. The videoconferencing device 120 may send the selected local video input signal to the remote endpoints 101 via the network 105. The videoconferencing device 120 may also receive video signals from the remote endpoints 101 via the network 105. The video signals received from the remote endpoints 101 are also referred to herein as “remote video signals”.
As used herein, the term “video signal” or “video input signal” refers to any kind of information useable to display video and does not imply that the information is in any particular form or encoded in any particular way. For example, in various embodiments, the local video signal from a local video source may be sent from an endpoint 101 to the remote endpoints 101 in any form and using any of various communication protocols or standards. In a typical embodiment, the local video signal is sent to the remote endpoints 101 as digital information, e.g., as ordered packets of information. Similarly, the remote video signals may be received over the network 105 in a digital form, e.g., as ordered packets of information.
Thus, if the local video source originally produces an analog signal, then the signal may be converted into digital information, or if the local video source originally produces a digital signal, the signal may be encoded in a different way or packetized in various ways. Thus, the video information that originates from a given video source 130 may be encoded, decoded, or converted into other forms at various stages between leaving the video source and arriving at the remote endpoints, possibly multiple times. The term “video signal” is intended to encompass the video information in all of its various forms.
Referring again to
The videoconferencing device 120 may be operable to display the remote video signals from the remote endpoints on the display device 122. The videoconferencing device 120 may also display one or more of the local video signals on the display device 122, e.g., may display the selected local video signal. For example, the videoconferencing device 120 may include hardware logic which receives the remote video signals and the selected local video signal and creates a composite image which is then provided to the display device 122, e.g., so that the various video signals are tiled or displayed in different respective windows on the display device 122.
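The tiling of the various video signals into a composite image can be sketched by computing one window rectangle per signal on a near-square grid. This only computes the layout geometry; it assumes (for illustration) that the device's compositing hardware accepts such rectangles, and the function name is hypothetical.

```python
import math

def tile_windows(display_w: int, display_h: int, num_signals: int):
    """Return (x, y, w, h) rectangles tiling the display into a near-square
    grid, one window per video signal (local or remote)."""
    cols = math.ceil(math.sqrt(num_signals))
    rows = math.ceil(num_signals / cols)
    w, h = display_w // cols, display_h // rows
    return [((i % cols) * w, (i // cols) * h, w, h)
            for i in range(num_signals)]
```

For example, four signals on a 1280x720 display would each occupy a 640x360 quadrant.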
As described in more detail below, the videoconferencing device 120 may also be operable to display live video icons on the display device 122, e.g., where each of the icons displays a live image from one of the local video sources 130. For example, the user may operate the remote control device 128 or provide other input indicating a desire to select which local video input signal to select for sending to the remote endpoints 101. In response, the videoconferencing device 120 may display the icons, and the user may then select the desired icon in order to select the corresponding local video input signal.
In some embodiments the videoconferencing device 120 may be operable to display a graphical user interface (GUI) on the display device 122, where the user (operator of the videoconferencing device 120) can interact with the GUI in order to provide input to the videoconferencing device 120, e.g., similar to the manner in which users commonly provide input to on-screen television displays in order to set various options or perform various functions. For example, the user may operate the remote control device 128 or other input device, such as a keyboard or buttons on the videoconferencing device 120 chassis, in order to request the videoconferencing device 120 to perform a particular operation. In response, the videoconferencing device 120 may display various GUI elements on the display device 122, e.g., where the GUI elements indicate various options or functions related to the requested operation. The user may then scroll to and select a desired GUI element.
In some embodiments the videoconferencing system 119 may include multiple display devices 122. The videoconferencing device 120 may be configured to distribute the various video signals across the multiple display devices 122 in any of various ways.
As shown, the videoconferencing device 120 may also couple to one or more audio devices 124. For example, the audio device(s) 124 may include one or more microphones or other audio input devices for providing local audio input to be sent to the remote endpoints 101, as well as one or more speakers or other audio output devices for audibly projecting audio information received from the remote endpoints 101.
Referring now to
As indicated in 301, a plurality of video input signals may be received from a plurality of local video sources 130. For example, the videoconferencing device 120 may receive a plurality of local video input signals from different respective local video sources 130, as described above.
In 303, a plurality of icons, also referred to as live video icons, may be simultaneously displayed on the display device 122, e.g., may be displayed by the videoconferencing device 120. Each icon displays a live version of the video input signal received from a respective one of the local video sources 130. The icons may be selectable to select a video input signal to send to the remote endpoint(s) in the videoconference. In other words, the user may select a particular one of the displayed icons in order to select the video input signal from the local video source 130 to which the icon corresponds as the video input signal to send to the remote endpoints.
In some embodiments, there may be a corresponding icon for each local video source 130. For example, in the exemplary videoconferencing system 119 of
As used herein, displaying a “live version” of a video input signal refers to displaying the video input signal at a rate of at least one frame per second. A given video input signal may include enough information so that the video input signal could potentially be displayed at a relatively fast frame rate. For example, a particular video camera may encode information at about 30 frames per second. In some embodiments, the videoconferencing device 120 may be able to display the live video icons in such a way that the icons display the respective video signals at their native frame rates, i.e., so that each icon displays the video input signal from the corresponding local video source at the same frame rate at which the video input signal is received. In other embodiments, the frame rate for one or more of the video input signals may be reduced for display in the live video icons.
In some embodiments, the frame rates at which the video signals can be displayed in the icons may depend on the number of icons being displayed and the hardware resources of the videoconferencing device 120. For example, the videoconferencing device 120 may have a limited amount of memory or other resources. In some embodiments, these limited resources, together with the overhead required by the other functions performed by the videoconferencing device 120, may place a limit on the total number of frames per second that can be shown in the live video icons. For example, suppose that a total of N frames per second can be shown in the icons, and suppose that there are M icons. Thus, the number of frames per second that can be shown in each icon may be N/M.
In some embodiments, the value of N may be high enough and/or the value of M may be low enough so that each local video signal is shown in its corresponding icon at its native frame rate. In other embodiments, one or more of the local video signals may be shown in its corresponding icon at a frame rate that is slightly slower than the native frame rate of the video signal, but at a frame rate that is still fast enough for a human viewer (i.e., a user) to perceive full motion. The user may not even notice that the video signal is displayed in the icon at a slower-than-native frame rate. In other embodiments, one or more of the local video signals may be shown in its corresponding icon at a frame rate that is slow enough that the user perceives some delay between frame changes. However, each icon preferably displays its respective video signal at a rate of at least one frame per second or faster.
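The N/M frame-rate budget described above can be expressed directly: each of M icons receives an equal share of the device's total budget of N frames per second, capped at the source's native rate, with one frame per second as the preferred floor. The native rate of 30 frames per second is taken from the camera example above; the function itself is an illustrative sketch.

```python
def icon_frame_rate(total_budget_fps: float, num_icons: int,
                    native_fps: float = 30.0) -> float:
    """Per-icon refresh rate when the device can update at most
    total_budget_fps frames per second across all live video icons (N/M).
    Each icon is capped at its source's native rate, and each icon should
    refresh at one frame per second or faster."""
    share = total_budget_fps / num_icons      # N / M
    rate = min(share, native_fps)             # never exceed the native rate
    if rate < 1.0:
        raise ValueError("budget too small to sustain 1 fps per icon")
    return rate
```

For instance, a 60 fps budget shared among four icons yields 15 fps per icon, while a 90 fps budget shared among three icons is capped at the 30 fps native rate.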
As used herein, the term “icon” refers to any visual information displayed on a portion of the display device. In one embodiment each icon may simply comprise a rectangular window in which the respective local video input signal is displayed. The rectangular window may possibly be delineated from the rest of the display screen by a solid-colored border or other graphical information. In the preferred embodiment, each of the icons is substantially the same size as the others. Also, in the preferred embodiment, each icon is substantially smaller than the size of the display screen of the display device 122, i.e., so that the icon takes up only a small proportion of the display screen. For example, as described in the examples below, the icons may be displayed together with remote video signals received from the remote endpoints 101 and/or together with the currently selected local video signal, where the icons are displayed at a small size so as to not obscure (or to only partially obscure) these main video signals.
In one embodiment, each of the icons displays a live version of the video input signal from its corresponding local video input source, but substantially no other information. In other embodiments, one or more of the icons may display information other than the video input signal from its corresponding local video input source. For example, in one embodiment, each icon may include a name or picture of the local video input source to which the icon corresponds.
Thus, the plurality of displayed icons may present the user with visual information so that the user can simultaneously see the live video input signals from all, or at least a subset of, the local video sources 130. As noted above, the icons are selectable by the user in order to select which of the local video input signals to send to the remote endpoints 101.
As indicated in 305 of
In response to the user input selecting the first icon, the video input signal displayed by the first icon is selected as the local video input signal to send to the remote endpoints 101 in the videoconference, as indicated in 307. In other words, the videoconferencing device 120 begins sending the selected local video input signal to the remote endpoints 101, possibly instead of a local video input signal that was previously selected.
It is noted that in some embodiments the videoconferencing device 120 may be operable to send more than one local video input signal to the remote endpoints 101. Thus, for example, the method of
In some embodiments, icons may be displayed for all local video sources, even if one or more of the local video sources are not turned on or connected to the videoconferencing device 120. For example, for a video source that is not connected, the method may display an icon that shows a “blue screen” or another similar image that indicates that the video source is not connected. In another embodiment, for any video sources that are not connected, the method may not display icons for these video sources. Thus, in this embodiment the user would not be able to select a video source that does not provide a meaningful input.
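The selection method of steps 301 through 307 can be sketched as follows, including the disconnected-source behavior just described. The class, the source names, and the send callback are all illustrative assumptions; a real videoconferencing device would drive compositing hardware rather than a Python object.

```python
class VideoInputSelector:
    """Sketch of the flowchart method: receive local sources (301),
    display one selectable icon per source (303), accept a user
    selection (305), and switch the signal sent to the remote
    endpoints (307)."""

    def __init__(self, sources, send_callback, show_disconnected=True):
        self.sources = sources            # source name -> connected? (bool)
        self.send = send_callback         # called with the selected source
        self.show_disconnected = show_disconnected
        self.selected = None

    def icons(self):
        # 303: one live icon per source; a disconnected source either gets
        # a "blue screen" placeholder icon or, in the other embodiment, no
        # icon at all (so it cannot be selected).
        return [name for name, connected in self.sources.items()
                if connected or self.show_disconnected]

    def select(self, name):
        # 305/307: the user selects an icon; its video input signal
        # becomes the signal sent to the remote endpoints.
        if name not in self.icons():
            raise KeyError(f"no selectable icon for {name!r}")
        self.selected = name
        self.send(name)
```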
In 351, the local endpoint 101 connects to the remote endpoint(s) 101 in the videoconference. For example, this may comprise the videoconferencing device 120 in the local endpoint 101 establishing communication with one or more videoconferencing devices in the remote endpoints 101 and/or establishing communication with a Multipoint Control Unit (MCU).
As described above, one of the local video input signals may initially be selected to send from the local endpoint 101 to the remote endpoints 101. For example, the desired local video input signal may be selected by selecting the corresponding icon from a plurality of displayed icons as described above, or the videoconferencing device 120 may already be configured by default to send the desired local video input signal to the remote endpoints 101.
As indicated in 353, the currently selected local video input signal is displayed in a first portion of the display device 122 and each of the remote video signals from the remote endpoints 101 is displayed in a respective different portion of the display device 122. In various embodiments, the various video signals may be displayed in different respective portions of the display device 122 in any of various ways, e.g., where the respective portions have any spatial layout with respect to each other and may have any of various sizes with respect to each other.
The currently selected local video input signal and the remote video signals from the remote endpoints 101 may also be referred to herein as “main video signals”. In other words, main video signals comprise video signals that are part of the videoconference, i.e., signals sent to other endpoints. Thus, the main video signals are displayed in 353 in order for the participants at the local endpoint to view the videoconference.
Referring again to
In 357, a plurality of live video icons are simultaneously displayed on the display device in response to the user input received in 355. Each icon displays a live version of the video input signal from a respective one of the local video sources, where the icons are selectable to select which local video input signal to send to the remote endpoint(s), similarly as described above with respect to the flowchart of
In 359, user input selecting a first icon from the plurality of simultaneously displayed icons may be received. For example, suppose that the user (i.e., an operator of the videoconferencing device 120) selects the middle icon displayed in the example of
In 361, the local video input signal displayed by the first icon may be selected as the video input signal to send to the remote endpoint(s) in the videoconference in response to the user input selecting the first icon, similarly as described above with respect to the flowchart of
In 363, the previously selected local video input signal that was displayed on the first portion of the display device may be replaced with the newly selected local video input signal, i.e., the local video input signal displayed by the selected first icon. For example, in the example of
In one embodiment the displayed icons may automatically be removed from the display screen after the user selects one of the icons. In another embodiment the icons may not be removed from the display screen until the user requests them to be removed, e.g., by pressing a button on the remote control 128 to exit from the local video input signal selection function.
In the example of
It is noted that
The window for each remote video signal also indicates other information, such as a name of the respective remote endpoint (e.g., “Mock01”) and an IP address of the remote endpoint (e.g., “10.10.11.159”). The windows also illustrate various graphic status indicators or glyphs. For example, each window has a “mute” glyph, shown as a microphone with a diagonal line through it, which indicates that the audio information at the respective endpoint is currently muted. (Thus, in this example, audio information from all endpoints is currently muted.)
In some embodiments the local video sources at a local endpoint may be organized into two or more sets of video sources. For example, the local endpoint may send two video signals to the remote endpoints, where a first video signal is selected from a first set of local video sources and a second video signal is selected from a second set of local video sources. In this embodiment, a first set of live video icons may be displayed in order to select the first video signal, where the first set of icons corresponds to the first set of local video sources. Similarly, a second set of live video icons may be displayed in order to select the second video signal, where the second set of icons corresponds to the second set of local video sources.
As one example, the first set of video sources may include two or more high-definition cameras, where the high definition cameras are aimed at participants at the local endpoint. One of the high definition cameras can be selected as the video source for the first video signal to send to the remote participants. The second set of video sources may include various types of alternate video sources, such as a document camera, VGA screen data, DVD player, or VCR player. One of these alternate video sources may be selected as the second video signal to send to the remote participants.
The different sets of live video icons corresponding to the different sets of local video sources may be accessible via a graphical user interface (GUI) of the videoconferencing device 120. For example, the GUI may enable the user to access each respective set of icons in a hierarchical manner. For example, the GUI may display a first GUI element which the user can select to access the first set of icons and a second GUI element which the user can select to access the second set of icons.
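The two-set arrangement above, with one video signal selected from each set, can be sketched as a simple grouping. The set names and source names below are hypothetical, drawn from the camera and alternate-source examples above.

```python
# Hypothetical grouping of local sources into two selectable sets,
# one video signal sent to the remote endpoints per set.
source_sets = {
    "people": ["hd_camera_1", "hd_camera_2"],
    "content": ["document_camera", "vga_input", "dvd_player", "vcr_player"],
}

def select_per_set(choices: dict, sets: dict) -> list:
    """Validate one chosen source per set and return the list of video
    signals to send (e.g., a people signal and a content signal)."""
    selected = []
    for set_name, source in choices.items():
        if source not in sets[set_name]:
            raise ValueError(f"{source!r} is not in set {set_name!r}")
        selected.append(source)
    return selected
```

A GUI offering the two sets hierarchically would display the "people" set of live video icons under one GUI element and the "content" set under another.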
In one embodiment, the user may be able to select a remote video signal using live video icons in addition to or instead of selecting a local video signal. For example, in one embodiment the videoconferencing device 120 at the local endpoint may communicate with a remote videoconferencing device at a remote endpoint using a protocol which embeds scaled down images of the video signals available at the remote endpoint in the video information sent from the remote endpoint to the local endpoint. The local videoconferencing device 120 may display the scaled down images within icons displayed on a display device at the local endpoint. For example, a user at the local endpoint may select one of the icons in order to cause the remote videoconferencing device to begin sending the video signal from the corresponding video source to the local endpoint.
In various embodiments, the method of
The videoconferencing device 120 of
The videoconferencing device 120 also includes a processor 404 coupled to a memory 406. The memory 406 may be configured to store program instructions and/or data. In particular, the memory 406 may store operating system (OS) software 409, driver software 408, and application software 410. In one embodiment, the memory 406 may include one or more forms of random access memory (RAM) such as dynamic RAM (DRAM) or synchronous DRAM (SDRAM). However, in other embodiments, the memory 406 may include any other type of memory instead or in addition.
It is noted that the processor 404 is representative of any type of processor. For example, in one embodiment, the processor 404 may be compatible with the x86 architecture, while in another embodiment the processor 404 may be compatible with the SPARC™ family of processors. Also, in one embodiment the videoconferencing device 120 may include multiple processors 404.
The processor 404 may be configured to execute the software and to operate on data stored within the memory 406. The application software 410 may interface with the driver software 408 in order to communicate with or control the FPGA hardware 402 in various ways.
In particular, the application software 410 may communicate with the FPGA hardware 402 via the driver software 408 in order to control how the FPGA hardware 402 creates the composite image from the local and remote video signals. For example, suppose that in a videoconference between the local endpoint 101 and a remote endpoint 101, the videoconferencing device 120 displays a composite image of a local video signal and a remote video signal, where the two video signals are displayed in different windows on the display device. The application software 410 may control where to display the windows on the display device in relation to each other, how large to make each window, etc.
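The window-placement control described above might be represented by a simple layout description that the application software hands to the compositing hardware. The following is a minimal sketch under assumed conventions (pixel coordinates, a side-by-side arrangement); it is not the device's actual interface.

```python
# Illustrative sketch (not the actual driver API) of how application software
# might describe window placement for a two-window composite image of a
# local and a remote video signal.

def layout_side_by_side(screen_w, screen_h):
    """Split the screen into two equal windows: local on the left, remote on the right."""
    half = screen_w // 2
    return {
        "local":  {"x": 0,    "y": 0, "w": half, "h": screen_h},
        "remote": {"x": half, "y": 0, "w": half, "h": screen_h},
    }
```

The application software could produce a different layout dictionary (e.g., picture-in-picture) and reconfigure the hardware without the hardware needing any knowledge of GUI policy.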
The application software 410 may also cause the display of a graphical user interface (GUI), e.g., where various GUI elements are superimposed over the displayed video signals in the composite image. For example, the GUI may comprise GUI elements for receiving user input and/or GUI elements for displaying information to the user.
The application software 410 may also control the display of the live video icons described above. For example, the application software 410 may interact with the FPGA hardware 402 in order to control how large to make the icons, the resolution of the video displayed in the icons, the placement of the icons on the display screen, etc. The application software 410 may also specify which of the local video streams to create icons for. For example, in one embodiment an icon corresponding to every local video input source may be displayed. In other embodiments, icons for only a subset of the local video input sources may be displayed.
The application software 410 may be operable to communicate with the FPGA hardware 402 in order to display a GUI that allows the user to configure the display of the live video icons. For example, in various embodiments, the user may be able to specify any of various options related to the display of the icons, such as which video sources to display icons for, whether to display the live video icons at all times or only in response to a user request, how large to make the icons, where to place the icons on the screen, etc.
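The user-configurable icon options listed above might be collected into a single configuration record. The field names below are assumptions introduced for illustration, not the device's real settings.

```python
# Hypothetical configuration record for the live video icon options described
# above. Field names and defaults are invented for the sketch.

DEFAULTS = {
    "sources": ["hd_camera_1", "document_camera"],  # which sources get icons
    "always_visible": False,   # show icons only in response to a user request
    "icon_width": 160,         # icon size in pixels
    "position": "bottom",      # screen edge along which icons are placed
}

def make_icon_config(**overrides):
    """Return the default icon configuration with any user overrides applied."""
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"unknown options: {unknown}")
    return {**DEFAULTS, **overrides}
```

A GUI settings screen could gather the user's choices and pass them as overrides, leaving unspecified options at their defaults.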
Referring now to
As shown, the input FPGA 720 includes a pool of scalers 503. One or more of the input streams may be sent to the scalers 503 in order to change their resolution, e.g., to scale the resolution up or down. As one example, in one embodiment the S-video input streams may be scaled up to a higher resolution, e.g., so that they can be displayed at a larger size on the display screen. As another example, the HB1 and HB2 primary camera input streams, which may be high-definition video, may be scaled down by the scalers 503, e.g., in order to be sent to an S-video output (e.g., for output to a VCR).
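The resolution conversion performed by the scaler pool can be illustrated with a simple aspect-ratio-preserving calculation. The specific resolutions below (e.g., 1280x720 for a high-definition camera stream) are examples, not values mandated by the device.

```python
# Sketch of the kind of resolution conversion a scaler performs, e.g.
# scaling a high-definition camera stream down toward a smaller output size.
# Target widths are illustrative.

def scale_resolution(width, height, target_width):
    """Scale (width, height) to `target_width`, preserving the aspect ratio."""
    target_height = round(height * target_width / width)
    return target_width, target_height
```

The same computation applies whether scaling up (an S-video stream to a larger display window) or down (an HD stream toward S-video resolution).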
After possibly being scaled up or down, the input streams may be serialized by the HS Serial TX module 540 and sent to the output FPGA 730.
As shown, the output FPGA 730 includes a memory-based (MB) scaler 593, which is operable to scale down the input streams for display in the live video icons. The DDR-to-Stream DMA module 562 may read the input streams from DDR memory 555b and feed them to the MB scaler 593. The MB scaler 593 may scale down the input streams to a low resolution for display in the icons, e.g., where the icons are displayed at a relatively small size with respect to the size of the display device screen, as described above.
The MB scaler 593 provides the scaled-down input streams to the Stream-to-DDR DMA module 560. Each of the scaled-down input streams may be written by the Stream-to-DDR DMA module 560 to a different location in the DDR memory 555b than the original input stream.
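The bookkeeping that keeps each scaled-down stream at a DDR location distinct from its full-size original might look like the following. The buffer sizes, base addresses, and pixel format are invented for the sketch; the actual memory map is determined by the application software.

```python
# Illustrative DDR address bookkeeping: each scaled-down stream is written
# to a location distinct from its full-size original. All constants are
# assumptions made for the example.

FRAME_BYTES = 1280 * 720 * 2      # full-size buffer (e.g., YUV 4:2:2, 2 bytes/pixel)
ICON_BYTES = 160 * 90 * 2         # scaled-down icon buffer
ORIG_BASE = 0x0000_0000           # region holding original input streams
ICON_BASE = 0x0100_0000           # separate region for scaled-down copies

def buffer_addresses(stream_index):
    """Return (original, scaled) DDR addresses for stream `stream_index`."""
    return (ORIG_BASE + stream_index * FRAME_BYTES,
            ICON_BASE + stream_index * ICON_BYTES)
```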
One or more composite images may be created from the input streams received from the input FPGA 720 and/or from the scaled-down input streams created by the MB scaler 593. For example, the output FPGA 730 may be operable to provide composite images on various outputs, such as the outputs 580, 582, 584, and 586. Each output may be coupled to a respective compositor 509, which receives one or more of the input streams from the DDR memory 555b and creates a composite image suitable for the output type. For example, the compositor 509b may provide a composite image at S-video resolution on output 584 to an S-video output device, such as a DVD player or VCR.
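The compositing operation itself can be modeled in software as copying each input stream's pixels into its assigned window of a shared output frame. This is a minimal behavioral sketch; the actual compositors operate on DMA'd frame buffers in hardware.

```python
# Minimal software model of a compositor: copy each input stream's pixels
# into its window of a composite frame. Frames are 2-D lists of pixel
# values; later entries draw on top of earlier ones.

def composite(width, height, windows):
    """windows: list of (frame, x, y) tuples placed into a width x height canvas."""
    out = [[0] * width for _ in range(height)]
    for frame, x, y in windows:
        for row, line in enumerate(frame):
            for col, pixel in enumerate(line):
                out[y + row][x + col] = pixel
    return out
```

A live video icon, in this model, is simply one more (small) frame placed into the composite at the position the application software chose.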
In one embodiment, one or more of the composite images may be sent over a network, e.g., to videoconferencing devices at remote endpoints. For example, outputs 586A-C are coupled to video encoders 553. As illustrated in
The compositors 509 may be configured by the application software 410. In other words, the application software 410 may control which input streams are included in each of the composite images, where the respective input streams are placed within the composite image, etc. In particular, the application software 410 may control the display of the live video icons. For example, the application software 410 may control the placement of the scaled-down input streams created by the MB scaler 593 within the composite image and may possibly cause a border to be displayed around each scaled-down input stream or cause the display of other graphical information in each icon.
As described above, the application software 410 may communicate with the FPGA hardware through driver software 408. For example, there may be a driver for the input FPGA 720 and another driver for the output FPGA 730.
In one embodiment, the application software 410 may control memory management for the various input streams. For example, the application software 410 may control where the Stream-to-DDR DMA module 560 writes each stream in the DDR memory 555b, may control which memory locations the compositors 509 read the streams from, etc.
The application software 410 may also control operation of the MB scaler 593. For example, the application software 410 may control which of the input streams are scaled by the MB scaler 593 and from which memory locations in the DDR memory 555b the input streams are read. The application software 410 may also control the resolution to which each of the streams is scaled by the MB scaler 593 and where the scaled-down streams are placed back into the memory 555b.
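The parameters the application software passes down for each scaling operation (source stream, source address and resolution, destination address and resolution) might be bundled into a job descriptor like the one below. The field names and the down-scaling-only restriction are assumptions made for the sketch, not the driver's real interface.

```python
# Hypothetical job descriptor the application software might hand to the
# MB scaler driver. Field names are invented for illustration.

def mb_scale_job(stream_id, src_addr, src_res, dst_addr, dst_res):
    """Describe one scaling operation: read src_res pixels at src_addr,
    write dst_res pixels at dst_addr."""
    src_w, src_h = src_res
    dst_w, dst_h = dst_res
    if dst_w > src_w or dst_h > src_h:
        raise ValueError("this sketch assumes the MB scaler only scales down")
    return {"stream": stream_id, "src": src_addr, "src_res": src_res,
            "dst": dst_addr, "dst_res": dst_res}
```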
The input FPGA 720 and the output FPGA 730 may both be coupled to a bus, such as PCI bus 530, which enables them to communicate with the processor 404, e.g., to receive instructions from the application software 410 through the driver software 408 as described above.
It is noted that various embodiments may further include receiving, sending or storing instructions and/or data implemented in accordance with the foregoing description upon a computer-readable memory medium. Generally speaking, a computer-readable memory medium may include storage media or memory media such as magnetic or optical media, e.g., disk or CD-ROM, volatile or non-volatile media such as RAM (e.g., SDRAM, DDR SDRAM, RDRAM, SRAM, etc.), ROM, etc. for storing program instructions. Such a computer-readable memory medium may store program instructions received from or sent on any transmission media or signals such as electrical, electromagnetic, or digital signals, conveyed via a communication medium such as a network and/or a wireless link.
Although the embodiments above have been described in considerable detail, numerous variations and modifications will become apparent to those skilled in the art once the above disclosure is fully appreciated. It is intended that the following claims be interpreted to embrace all such variations and modifications.
This application claims priority to U.S. Provisional Patent Application Ser. No. 60/676,918, titled “Audio and Video Conferencing”, which was filed May 2, 2005, whose inventors were Michael L. Kenoyer, Wayne Mock, and Patrick D. Vanderwilt, and which is hereby incorporated by reference in its entirety as though fully and completely set forth herein.
Number | Date | Country
---|---|---
60676918 | May 2005 | US