The present invention will be readily understood by the following detailed description in conjunction with the accompanying drawings, and like reference numerals designate like structural elements.
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present invention. However, it will be apparent to one skilled in the art that the present invention may be practiced without some of these specific details. In other instances, well known process operations and implementation details have not been described in detail in order to avoid unnecessarily obscuring the invention.
Conference viewers 150 access video streaming server (VSS) 120 to receive audio, video, and other multimedia content such as images, documents, and annotations in real time. In one embodiment, VSS 120 may receive audio, video, and other multimedia content directly from conferencing server 110 via a local area network (LAN) connection 146. Conference viewers 150 generally will be able to receive a combined audio stream made up of audio streams from all conference participants 130. However, in one embodiment, each conference viewer 150 can only view one high bit-rate video stream from one of the conference participants 130, and generally cannot choose which conference participant 130 to view. Furthermore, conference viewers may be given an opportunity to ask questions by sending a signal along reverse path 157 to VSS 120 indicating a desire to ask a question, and then, after permission is granted, a low bit-rate video and/or audio signal can be sent from the conference viewer to VSS 120 encoding the individual's question.
To select which video feed to send to conference viewers 150, and to permit question or feedback from conference viewers 150, a special conference participant, referred to herein as controller 140, is provided with a control panel client as will be described in greater detail below with reference to
For the conference participant, once the user is authenticated, web server 114 triggers browser plug-in 134 which launches conferencing client 136. The browser plug-in is provided with the IP address of MCU server 112 and an authentication token or other authentication information. In one embodiment, web server 114 provides MCU server 112 with complementary authentication information, such as a key, with which the MCU server can authenticate conference participant 130 when contacted by conferencing client 136. For the conference viewer, after authentication by logging in with web server 114, web server 114 triggers browser plug-in 154 which launches streaming client 156. Web server 114 provides browser plug-in 154 with the IP address of MCU server 122 in VSS server 120, for receiving the streaming content. In addition, an authentication token or other authentication information is provided to browser plug-in 154, and complementary authentication information may be provided to MCU server 122 via LAN connection 146. The IP address and authentication information are passed to streaming client 156 to enable a secure log-in with MCU server 122.
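The token exchange described above can be sketched as follows. This is only an illustration under stated assumptions: the description does not specify a token format, so the use of HMAC-SHA256, the message layout, and the function names `issue_credentials` and `verify_token` are all hypothetical.

```python
import hashlib
import hmac
import secrets


def issue_credentials(username: str, meeting_id: str):
    """Web server side: create a per-session key and a token derived from it.

    Returns (token, key). The token is handed to the client through the
    browser plug-in; the key is the matching information sent to the
    MCU server. (Assumed scheme, not taken from the description.)
    """
    key = secrets.token_bytes(32)
    message = f"{username}:{meeting_id}".encode()
    token = hmac.new(key, message, hashlib.sha256).hexdigest()
    return token, key


def verify_token(token: str, key: bytes, username: str, meeting_id: str) -> bool:
    """MCU server side: recompute the token from the key and compare."""
    message = f"{username}:{meeting_id}".encode()
    expected = hmac.new(key, message, hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(token, expected)
```

In this sketch the MCU server never needs to contact the web server at log-in time; it only needs the key that was delivered to it beforehand.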
Authentication may be achieved in other ways. For example, authentication may use a public key encryption scheme whereby web server 114 passes an encrypted, digitally signed message to either conference participant 130 or conference viewer 150 after authentication with web server 114. The message, which could contain user information, is then relayed to the appropriate one of MCU servers 112, 122, which alone holds the private key for decrypting it. The MCU server can authenticate the message by authenticating the digital signature. This would allow MCU servers 112, 122 to authenticate users without having to compare certificates with separate information supplied by web server 114. It should also be noted that conference participant 130 and conference viewer 150 may be provided with identical software such that conferencing client 136 and viewing client 156 are actually the same computer program that operates in either a conferencing mode or a viewing mode.
VSS server 120 also has a web server 124 which provides an administration interface. Web server 124 can therefore be used to provide various information and controls relating to MCU server 122 to remote administrators (not shown). Once a connection is made between conference viewer 150 and VSS 120, the conference viewer may begin receiving real-time streaming data from VSS 120 as the data is received from conferencing server 110.
Web server 114 will know if a connection is a conference participant or conference viewer by the username. Upon a valid connection, when the web server identifies that the connection is for a conference viewer, the web server redirects the client to VSS 120. This occurs by launching the client, placing the client in viewer mode, and letting it know the IP address of VSS 120. As mentioned, the authentication token is also passed to the client. The same process occurs for a conference participant, except the IP address of MCU server 112 is passed to the client and the security key is sent to MCU server 112 instead of VSS 120.
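A minimal sketch of this routing decision follows. The addresses, the `LaunchInfo` structure, and the function name `route_connection` are hypothetical placeholders introduced for illustration; the description states only that the role is determined from the username.

```python
from dataclasses import dataclass

# Hypothetical addresses standing in for MCU server 112 and VSS 120.
MCU_ADDRESS = "10.0.0.12"
VSS_ADDRESS = "10.0.0.20"


@dataclass(frozen=True)
class LaunchInfo:
    mode: str            # "conferencing" or "viewer"
    server_address: str  # address passed to the launched client
    key_recipient: str   # server that receives the security key


def route_connection(username: str, participants: set, viewers: set) -> LaunchInfo:
    """Decide from the username alone how the client should be launched."""
    if username in viewers:
        return LaunchInfo("viewer", VSS_ADDRESS, "VSS")
    if username in participants:
        return LaunchInfo("conferencing", MCU_ADDRESS, "MCU")
    raise PermissionError(f"unknown user: {username}")
```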
In one embodiment, meetings with both participants and viewers are created using a create-meeting web page (not shown). The web page may have a link that allows a meeting owner to add streaming users to the meeting. The web server will know that the meeting includes conference viewers if conference viewers are added to the meeting. When the meeting owner clicks the “Add Streaming Users” link, the web server will bring up a web page (not shown) for adding conference viewers. By default, the meeting owner will be the controller. However, the web page may allow the meeting controller/owner to designate a different controller. In one embodiment, the controller must be a conference participant and not a conference viewer. In another embodiment, the controller may be either a participant or a viewer. The controller will have access to the control panel, described below with reference to
Because conference viewers 150 may have varying bandwidth availability to receive multimedia data, and because the bandwidth may fluctuate over time, it may be desired to control the bit rate of video data transmitted from VSS server 120 to the conference viewers 150. In one embodiment, the bit rate is controlled while at the same time limiting the amount of encoding/decoding of the audio and video streams by the VSS. The bitrate may be controlled as described in related U.S. patent application Ser. No. 11/051,674 filed on Feb. 4, 2005 and entitled “Adaptive Bit-Rate Adjustment of Multimedia Communications Channels Using Transport Control Protocol,” incorporated herein by reference.
In one embodiment, the audio is a fixed bit rate, so no changes are made to the way audio is currently handled. However, the video is provided to conference viewers 150 in one of a plurality of bit rates. The incoming stream from MCU server 112 is replicated and sent to each streaming client. If all connections and client PCs are of comparable performance levels, then the video could be sent to each client without decoding and encoding. However, if there is a fast client with a fast connection and a slow client with a slow connection, they cannot be sent the video data at the same rate. To solve this problem, there are, in one embodiment, three predetermined video bit rates: slow, medium, and fast. Each video bit rate corresponds to the maximum bit rate of a video frame and the frame rate. MCU server 112 sends the highest bit rate to VSS 120. VSS 120 then decodes the video and re-encodes the video for the two smaller bit rates if needed.
If it is determined that a connection cannot keep up with the current bit rate, then the data sent on that stream will be dropped down to the next lower bit rate of the three fixed rates. At this stepped-down bit rate, an intraframe will be generated and sent to all streaming clients of the same bit rate. As is generally known to those skilled in the art, an intraframe, also referred to as “I-frame” or “key frame,” is a frame of video encoded in such a way that it does not require information from preceding frames to decode it, i.e., it includes all the data necessary to display that frame. By providing an intraframe to all the streaming clients of the same bit rate, each client's video will be up to date and have the needed data to compose the succeeding frames. To determine when to drop a client down to a lower bit rate, a congestion code can be used, the congestion code being a measurement of latency of the connection. There could be times when the bit rate does not need to be reduced, as there is only a blip in the connection's bandwidth, e.g., due to temporary congestion. At this point, the data going out the line could be reset. Resetting effectively pauses the data going out on that connection, generates an intraframe (which will be sent to all connections of the same bit rate), and resumes sending the data on that line. To select the initial bit rate, the bit rate of the connection will be measured and the next smallest of the three predetermined bit rates will be used.
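The tier selection and step-down behavior described above can be sketched as follows. The numeric tier values are hypothetical, since the description names the tiers but gives no figures, and "next smallest" is read here as the highest tier that does not exceed the measured rate. The function names are likewise illustrative.

```python
# Hypothetical tier values in kbps; the description names the tiers
# (slow, medium, fast) but does not give numbers.
TIERS = [("slow", 128), ("medium", 384), ("fast", 768)]


def initial_tier(measured_kbps: float) -> str:
    """Pick the highest tier that does not exceed the measured rate."""
    chosen = TIERS[0][0]  # fall back to the slowest tier
    for name, kbps in TIERS:
        if kbps <= measured_kbps:
            chosen = name
    return chosen


def step_down(current: str) -> str:
    """Return the next lower tier, or the same tier if already at the lowest."""
    names = [name for name, _ in TIERS]
    index = names.index(current)
    return names[max(index - 1, 0)]


def clients_needing_intraframe(client_tiers: dict, tier: str) -> list:
    """All clients at `tier` receive the intraframe generated after a step-down."""
    return sorted(c for c, t in client_tiers.items() if t == tier)
```

For example, under these assumed values a connection measured at 500 kbps would start at the medium tier, and a client stepped down from fast to medium would share the newly generated intraframe with every other medium-tier client.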
In one embodiment, VSS 120 controls each stream and the distribution of H.323 data to each client. H.323 is a recommendation from the ITU Telecommunication Standardization Sector (ITU-T) that defines industry standard protocols for providing audio-visual communication sessions on any packet network. In one embodiment, only one video stream is sent from MCU server 112 to VSS 120. This video stream is chosen from the VSS control panel in the manner described below with reference to
In one embodiment, the bit rate from MCU server 112 to VSS 120 will be maintained very high to keep the video at a high quality. In this case, LAN connection 146 is assumed to be able to sustain high data rates, e.g., at or close to 10 or 100 Mbps. Of course, the quality provided by the MCU server may be selected based on the speed of the LAN connection to ensure real-time data delivery. In various embodiments, the quality of the video may be selected depending on the available LAN bandwidth. Keeping the source quality high minimizes the degradation caused by decoding and re-encoding, since the VSS might have to re-encode for the reduced bit rates. In implementations where the connection between MCU server 112 and VSS 120 has restricted bandwidth, the high bit rate can be lowered at a cost to video quality. VSS 120 may replicate HTTP connections to the conference viewers as described below for delivering supplemental content to the conference viewers. For example, the HTTP connection is used for transferring image and data files to the clients. Upon receiving any HTTP data, the VSS will forward that data to each streaming client.
The VSS may negotiate the same video and audio codecs for all streaming clients. In one embodiment, supported codecs include at a minimum H.263 video and G.711 speech codecs published by the ITU-T or some other standard protocol the VSS negotiates. These same protocols may be used for the VSS to MCU streams also.
Conference viewers 150 are not limited to receiving streaming audio and video. In addition, application sharing data, images and documents, and annotations, along with other multimedia content may be delivered to the conference viewers. In one embodiment, streaming clients may download images or documents from VSS 120 by making an HTTP connection to VSS 120. VSS 120 may then forward this request to web server 114. VSS 120 in this case acts as a proxy for the streaming client. However, the HTTP requests are not blindly forwarded to web server 114. Rather they are transformed to show the origination is from VSS 120. As VSS 120 collects data from the HTTP requests to the web server, VSS 120 can send the data onto the streaming client as the result of its HTTP request. In addition, conference viewers may be permitted to send text messages to each other, to the group at large, and/or to the controller.
In operation 206, it is determined whether the authentication is acceptable. For example, the authentication may be compared with a list of attendees for each online conference, or each conference may simply have an identifier and password, such that any person possessing the identifier and password would be authenticated. In the latter case, the user may then be required to enter a name or select a name from a predefined list so that they can be identified by the system and other participants and users. It is also possible to provide a single password to identify the particular conference and a separate username for each attendee. If the authentication information entered by a user matches previously stored authentication information, then the authentication is acceptable. Otherwise, the authentication would be rejected. If the authentication is rejected, then the procedure flows back to operation 204 to give the user an opportunity to re-enter the information. In one embodiment, the user is only permitted to enter authentication information a limited number of times before being locked out as a security precaution. If the authentication information is acceptable, e.g., matches previously stored authentication information, then the procedure flows to operation 208.
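The retry loop between operations 204 and 206 can be sketched as follows. The limit of three attempts and the function name `authenticate` are assumptions, since the description says only that the number of attempts is limited. Plain-text comparison is used for brevity; a deployed system would normally store and compare hashed credentials.

```python
MAX_ATTEMPTS = 3  # hypothetical limit; the description only says "limited"


def authenticate(attempts, stored):
    """Check (username, password) attempts in order against stored credentials.

    `attempts` is an iterable of (username, password) pairs, one per entry
    at operation 204. Returns the authenticated username, or None if the
    user is locked out after MAX_ATTEMPTS failures or stops trying.
    """
    for count, (username, password) in enumerate(attempts, start=1):
        if username in stored and stored[username] == password:
            return username  # acceptable: proceed to operation 208
        if count >= MAX_ATTEMPTS:
            return None  # locked out as a security precaution
    return None
```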
In operation 208, web server 114 sends authentication data to VSS 120 so that VSS 120 can validate the incoming VSS client connection. Note that this procedure is specifically for conference viewers. If the username had matched a conference participant, then the web server would connect the user to the MCU as described above with reference to
If the authentication information sent from streaming client 156 is not acceptable, then the procedure returns to operation 204 to allow the user to enter different authentication information. On the other hand, if the authentication is acceptable, then the procedure flows to operation 220 wherein it is determined whether this is the first client to connect to VSS 120. If so, then the VSS connects to the MCU to begin receiving streaming data therefrom for conference viewer 150. The procedure then ends as indicated by finish block 224. If, in operation 220, it is determined that the client is not the first client to connect, then the procedure flows directly to finish block 224. Once the user is connected to the streaming client, he or she can view the conference as a conference viewer, as shown in
In one embodiment, streaming client 156 operates in one of three modes: a video mode, a data mode, or a mixed mode.
Thumbnails 234 of the other presenters are “live” and show low bit-rate snapshots of that presenter. For example, in one embodiment, the thumbnails are updated less than once per second and are of small size (e.g. 64 pixels by 64 pixels). In one embodiment, thumbnails 234 are displayed on a strip at the bottom of the display area, although other configurations are possible. Graphical interface 230 also includes a question button 236 to indicate to the controller a desire to ask a question. As will be described in more detail below with reference to
In the data mode, shown by way of example in
In the mixed mode, the document and video might need to be transmitted at the same time. This may result in reduced video quality and frame rate due to bandwidth limitations while a document is being transmitted to all the clients. In one embodiment, if bandwidth is fully available, high quality video, at full size (e.g. 320×240) is displayed at a high frame rate (e.g., 10 fps). This may be the goal under ideal network conditions. Due to network conditions, this goal might not be reached, and the quality and frame rate may be reduced as described in related U.S. patent application Ser. No. 11/051,674 entitled “Adaptive Bit-Rate Adjustment of Multimedia Communications Channels Using Transport Control Protocol.” The other presenters 196 are shown in the usual thumbnail mode, e.g., 64 pixels by 64 pixels updated once per second or less, at the bottom of the display area. A small view of the document 254 may be shown to the right of the video being displayed.
The main purpose of the Video Streaming Server (VSS) control panel will be to manage the questions from streaming clients. The control panel will have to establish a connection to the VSS. The data needed from the VSS will be the name of each connected client, its question state, the audio levels, and the network quality levels. A connection to the Web Server Document Channel will need to be established to authenticate the connection. The control panel has a list 262 of each client connected to VSS 120. Along with the name of the client connected will be an indicator 264 showing whether there is a question from that client.
The control panel can be used to select which client can ask a question. A thumbnail view of the current questioner will be transmitted along with the audio. Note, it is the controller that controls how many streaming clients can ask a question at the same time. The moderator will also be able to control which questioner shows up in the thumbnail view. In one embodiment, the control panel has audio and network indicators 266 for each client. This may help in diagnosing audio and network problems that will come up. In one embodiment, the control panel is written in the JAVA programming language.
In one embodiment, each conference viewer 150 may request permission to ask a question of the conference participants 130. The request is sent to VSS 120 and forwarded to the control panel 260. The controller can then approve the request at his or her convenience. The approval is then sent to VSS 120 and forwarded to the issuing client 150. The client may also cancel the request. In one embodiment, there may be only one streaming client approved to ask questions at a time. The protocol may be a simple request/granting protocol. Whichever client has their request granted will be able to ask questions. The question state of each client is maintained by VSS 120. The state can be none, requesting, canceling, or approved. At startup, control panel 260 can query the questioning state of each client. When the VSS receives an approve request from the control panel, it will then allow audio from the client that issued the request. If VSS 120 receives a cancel request from a streaming client that has the audio enabled, the VSS will disable that client's audio and set its question state to none.
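The request/granting protocol above can be sketched as a small state tracker kept by the VSS. The class and method names are hypothetical. The description lists a canceling state but does not describe its transitions, so this sketch moves a canceled request directly to none.

```python
from enum import Enum


class QuestionState(Enum):
    NONE = "none"
    REQUESTING = "requesting"
    CANCELING = "canceling"  # listed in the description; not used in this sketch
    APPROVED = "approved"


class QuestionTracker:
    """Hypothetical VSS-side bookkeeping of per-client question state."""

    def __init__(self, clients):
        self.state = {c: QuestionState.NONE for c in clients}
        self.audio_enabled = {c: False for c in clients}

    def request(self, client):
        """Client asks for permission to ask a question."""
        if self.state[client] is QuestionState.NONE:
            self.state[client] = QuestionState.REQUESTING

    def approve(self, client):
        """Controller approval: allowed only while no other client is approved."""
        if self.state[client] is not QuestionState.REQUESTING:
            return False
        if QuestionState.APPROVED in self.state.values():
            return False
        self.state[client] = QuestionState.APPROVED
        self.audio_enabled[client] = True  # VSS now allows this client's audio
        return True

    def cancel(self, client):
        """Client cancel: disable audio if enabled and reset the state."""
        self.audio_enabled[client] = False
        self.state[client] = QuestionState.NONE
```

A control panel starting up could read `state` for every client to recover the current question states, matching the startup query described above.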
In one embodiment, the control panel displays the audio levels 268. This may help the moderator diagnose audio problems affecting the streaming clients. The indicators will show the network quality and signal strength of the audio being sent to the streaming clients. Each streaming client that is listed in the control panel will have its own audio indicator. To help determine whether a particular communications problem relates to a network problem, whether the speakers are turned on, whether the microphone is on, whether the Microsoft Windows® control panel settings are correct, etc., the streaming system may have a test mode.
In the test mode, the controller can determine whether all conference viewers are connected and working properly. The controller can check audio to the streaming clients, video to the streaming clients, and audio from the streaming clients. In one embodiment, the test mode is started from the control panel by clicking a test mode button 270. Upon entering test mode, the conference viewer's user interface will change into test mode. The question button 236 (
To help in diagnosing audio and network problems, audio and network levels 266 may be shown on the line item for that client. Also the bit rate of each client will be listed along with the bit rate level (slow, medium, fast) in the line item for that client.
During the start of a meeting or during the audio/video check, it might be determined that a conference viewer should actually be a conference participant or vice versa. The control panel 260 will have the ability to change the client from one mode to the other. This might involve shutting down the client and its connection and restarting the client in the new mode.
Often at the end of a presentation there is a question-and-answer period. In one embodiment, one conference viewer may ask a question at a time. To ask a question, the user of a streaming client 156 will click a button 236 (
With the above embodiments in mind, it should be understood that the invention can employ various computer-implemented operations involving data stored in computer systems. These operations are those requiring physical manipulation of physical quantities. Usually, though not necessarily, these quantities take the form of electrical or magnetic signals capable of being stored, transferred, combined, compared and otherwise manipulated. Further, the manipulations performed are often referred to in terms such as producing, identifying, determining, or comparing.
Any of the operations described herein that form part of the invention are useful machine operations. The invention also relates to a device or an apparatus for performing these operations. The apparatus can be specially constructed for the required purpose, or the apparatus can be a general-purpose computer selectively activated or configured by a computer program stored in the computer. In particular, various general-purpose machines can be used with computer programs written in accordance with the teachings herein, or it may be more convenient to construct a more specialized apparatus to perform the required operations.
The invention can also be embodied as computer readable code on a computer readable medium. The computer readable medium is any data storage device that can store data, which can thereafter be read by a computer system. Examples of the computer readable medium include hard drives, network attached storage (NAS), read-only memory, random-access memory, CD-ROMs, CD-Rs, CD-RWs, magnetic tapes and other optical and non-optical data storage devices. The computer readable medium can also be distributed over a network-coupled computer system so that the computer readable code is stored and executed in a distributed fashion.
Embodiments of the present invention can be processed on a single computer, or using multiple computers or computer components which are interconnected. A computer, as used herein, shall include a standalone computer system having its own processor(s), its own memory, and its own storage, or a distributed computing system, which provides computer resources to a networked terminal. In some distributed computing systems, users of a computer system may actually be accessing component parts that are shared among a number of users. The users can therefore access a virtual computer over a network, which will appear to the user as a single computer customized and dedicated for a single user.
Although the foregoing invention has been described in some detail for purposes of clarity of understanding, it will be apparent that certain changes and modifications may be practiced within the scope of the appended claims. Accordingly, the present embodiments are to be considered as illustrative and not restrictive, and the invention is not to be limited to the details given herein, but may be modified within the scope and equivalents of the appended claims.
This application is related to U.S. patent application Ser. No. 10/192,130 filed on Jul. 10, 2002 and entitled “Method and Apparatus for Controllable Conference Content via Back-Channel Video Interface;” U.S. patent application Ser. No. 10/192,080 filed on Jul. 10, 2002 and entitled “Multi-Participant Conference System with Controllable Content Delivery Using a Client Monitor Back-Channel;” U.S. patent application Ser. No. 11/051,674 filed on Feb. 4, 2005 and entitled “Adaptive Bit-Rate Adjustment of Multimedia Communications Channels Using Transport Control Protocol;” U.S. patent application Ser. No. 11/199,600 filed on Aug. 9, 2005 and entitled “Client-Server Interface to Push Messages to the Client Browser;” and U.S. patent application Ser. No. 11/340,062 filed on Jan. 25, 2006 and entitled “IMX Session Control and Authentication” all of which are incorporated herein by reference.