The present invention relates to network conferencing and more specifically to data collaboration videoconferencing on processor-based packet networks.
Present data collaboration networks, such as IP-based networks, require the mixing of video data with other types of content (e.g., audio data, application data, etc.) from a computer terminal so that a group of geographically diverse terminals may share in the viewing and processing of distributed content. Current generations of data collaboration products require the use of proprietary software applications running on a personal computer (PC) in order to share data, with a hardware or software videoconferencing client dedicated to providing video content.
One example of such videoconferencing systems, using a software-based client, is Microsoft's NetMeeting™, which uses an analog video capture card or a high-speed digital interface to import video data from an external camera to a PC. The imported video data can then be overlaid with local applications, such as Microsoft Office™, for display on a desktop monitor. However, such videoconferencing systems suffer from reduced video quality, since the software-based clients typically lack the processing power to encode high-quality video in real time.
When using hardware-based systems, the conferencing devices typically used either do not have means to facilitate data collaboration (such as the Starback Torrent VCG™), or use an analog audio/video (A/V) capture card on a PC to import analog audio and video from the conferencing device to the PC collaboration client. For example, the capture card of such systems typically performs analog-to-digital (A/D) conversion, and imports video over a dedicated network that complies with National Television Standards Committee (NTSC) or Phase Alternate Line (PAL) standards. While these types of systems are effective for delivering A/V between terminals, repeated A/D conversion tends to introduce data errors, which in turn degrade the quality of A/V transmission. Furthermore, by requiring a separate network connection, conventional hardware-based systems introduce additional complexity in synchronizing data between the conferencing device and the PC.
Technologies such as FireWire™ and i-Link™ provide efficient transfer of A/V data. However, platforms with these interfaces are not designed to support data collaboration and teleconferencing features. Other devices perform streaming multicasts of videoconferencing sessions over an enterprise LAN to PCs, but those devices do not include videoconferencing endpoint functionality. Furthermore, these devices do not avail themselves of high-speed digital interfaces for transmission to PC clients using a unified display.
A videoconferencing and data collaboration system is disclosed, wherein user systems exchange A/V data, along with other computer data, via conferencing devices connected digitally to a packet network. The conferencing devices are configured to process and transmit A/V data to other devices participating in a conference. Each transmitting conferencing device incorporates a DSP or equivalent hardware to encode A/V data for transmission over the packet network. Furthermore, once A/V data is received from the network, each receiving conferencing device decodes the A/V data and forwards it to a respective terminal for viewing. The conferencing devices also share computer data and files over the digital network, where user modifications are tracked by transmitting short messages that indicate key depression or mouse movement.
Since the conferencing device is responsible for decoding the A/V data received from the network, the attached processing terminal is relieved of performing CODEC processing. Also, the digital links used in the system obviate the need for extraneous conversion between the analog and digital domains, resulting in better A/V quality. Furthermore, since digital links come as standard interfaces in modern PCs, availability and support problems are minimized.
Additional features and advantages of the present invention are described in, and will be apparent from, the following Detailed Description of the Invention and the figures.
The first user system 315 includes a first processing terminal 303, which is coupled to a storage unit 306. Storage unit 306 may be a hard drive, a removable drive, a recordable disk, or any other suitable medium capable of storing computer and A/V data. Terminal 303 is further connected to a conferencing device 304 via digital interface 304A. Conferencing device 304 incorporates a digital signal processor (DSP) 305.
The second user system 316 includes devices 308-313, which are equivalent to devices 301-306 of the first user system 315 described above. The second user system includes a conferencing device 308 with digital interface 308A and network interface module 308B, DSP 309, processing terminal 310, audio source 311, video source 312, and storage unit 313.
Under a preferred embodiment, processing terminals 303, 310 provide real-time bidirectional multimedia and data communication through their respective conferencing devices 304, 308 to packet network 307. Terminals 303, 310 can either be a PC or a stand-alone device capable of supporting multimedia applications (i.e., audio, video, data). Packet network 307 may be an IP-based network, an Internet packet exchange (IPX)-based local area network (LAN), an enterprise network (EN), a metropolitan-area network (MAN), a wide-area network (WAN), or any other suitable network. A multipoint control unit (MCU) 314 may also be coupled to packet network 307 to provide support for conferences of three or more user systems. Under this condition, all user systems participating in a conference would establish a connection with the MCU 314. The MCU would then be responsible for managing conference resources and for negotiating between user systems to determine the audio or video coder/decoder (CODEC) to use, and may also handle the media stream being transmitted over packet network 307.
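By way of illustration, the CODEC negotiation performed by MCU 314 can be sketched as selecting the first CODEC that every participating user system advertises. The capability lists, their orderings, and the function names below are illustrative assumptions only; H.323-style systems carry this exchange in capability messages rather than in-memory structures.

```c
#include <stdio.h>
#include <string.h>

#define MAX_CODECS 8

/* Hypothetical capability set advertised by one user system,
 * ordered by that system's preference. */
typedef struct {
    const char *codecs[MAX_CODECS];
    int         count;
} capability_set;

/* Return the first CODEC in `preferred` that every other participant
 * also advertises, or NULL if the systems share no CODEC. */
static const char *negotiate_codec(const capability_set *preferred,
                                   const capability_set *others,
                                   int num_others)
{
    for (int i = 0; i < preferred->count; i++) {
        const char *candidate = preferred->codecs[i];
        int supported_by_all = 1;
        for (int j = 0; j < num_others && supported_by_all; j++) {
            int found = 0;
            for (int k = 0; k < others[j].count; k++)
                if (strcmp(others[j].codecs[k], candidate) == 0)
                    found = 1;
            supported_by_all = found;
        }
        if (supported_by_all)
            return candidate;
    }
    return NULL;
}

int main(void)
{
    capability_set sys315 = { { "H.264", "H.263", "H.261" }, 3 };
    capability_set sys316 = { { "H.263", "H.261" }, 2 };

    const char *chosen = negotiate_codec(&sys315, &sys316, 1);
    printf("negotiated CODEC: %s\n", chosen ? chosen : "none");
    return 0;
}
```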
To illustrate an example of A/V data communicating over system 300, terminal 303 receives A/V data from audio source 301 and video source 302. Alternatively, terminal 303 may also receive A/V data, as well as computer data, transmitted from storage unit 306. Once the data is received at terminal 303, the data is forwarded via the digital link to conferencing device 304. Conferencing device 304 then captures the A/V data and encodes it using DSP 305. Once encoded, the A/V data is transmitted through packet network 307 to either the MCU 314 (if three or more user systems are being used), or directly to conferencing device 308. If the A/V data is received directly at conferencing device 308, the encoded A/V data is then decoded and transmitted to terminal 310 for viewing in a compatible format. If the A/V data is transmitted to MCU 314, the MCU 314 uses conventional methods known in the art to manage and transmit the A/V data to the destination conferencing devices, where the data is decoded in the conferencing device and further transmitted to each respective terminal for viewing. A/V data may include uncompressed digital video (e.g., CCIR601, CCIR656, etc.) or any compressed digital video format that supports streaming (e.g., H.261, H.263, H.264, MPEG1, MPEG2, MPEG4, RealMedia™, QuickTime™). The audio data may be transmitted in half-duplex or full-duplex mode.
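The transmit side of this path (capture, DSP encoding, hand-off to the packet network) is summarized in the minimal sketch below. Every function and type here is a hypothetical stand-in for the corresponding stage in conferencing device 304; no actual capture, codec, or network API is implied.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-ins for the stages in conferencing device 304. */
typedef struct { uint8_t pixels[64]; size_t len; } raw_frame;
typedef struct { uint8_t bytes[64];  size_t len; } encoded_frame;

/* Capture one frame from the video source (stubbed). */
static int capture_frame(raw_frame *f) { f->len = sizeof f->pixels; return 0; }

/* Encode on the DSP; stands in for DSP 305 (stubbed compression). */
static int dsp_encode(const raw_frame *in, encoded_frame *out)
{ out->len = in->len / 2; return 0; }

/* Hand the encoded frame to the packet network (stubbed). */
static int net_send(const encoded_frame *f)
{ return printf("sent %zu bytes\n", f->len) < 0; }

/* Transmit loop: capture from the video source, encode on the DSP,
 * and forward the result to the packet network. */
int main(void)
{
    for (int i = 0; i < 3; i++) {   /* three frames for illustration */
        raw_frame raw;
        encoded_frame enc;
        if (capture_frame(&raw))    return 1;
        if (dsp_encode(&raw, &enc)) return 1;
        if (net_send(&enc))         return 1;
    }
    return 0;
}
```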
System 300 also provides for receiving and transmitting documents separately from, or concurrently with, transmitted A/V data. As an example, a document stored in storage unit 306 of first user system 315 is opened in terminal 303 and is transmitted to conferencing device 304, where the document is processed under a file transfer protocol (FTP) for transmission to packet network 307. The processing is preferably done in the multipoint file transfer protocol block of the T.120 portion of conferencing device 304, which will be explained in further detail below. After transmission from conferencing device 304, the second user system 316 receives the document in conferencing device 308 via packet network 307. Conferencing device 308 would then forward the document to terminal 310, where the document would be viewed. Under an alternate embodiment, if three or more user systems are participating in the conference, MCU 314 would forward the document to each respective conferencing device.
To provide users with the ability to manipulate documents (or A/V data) without taking up unnecessary bandwidth, short data messages (also known as “collaboration cues”) are preferably transmitted when a user has depressed a key or has moved a mouse or other device. Any change a local user makes is then replicated on all remote copies of the same document in accordance with the collaboration cue that is received. Under this configuration, the system does not have to re-transmit multiple graphic copies of a document each time it is altered. If chair control is desired, a token mechanism may be used in the system to allow users to take and pass chair control. The specific processes regarding chair control and token mechanisms are described in greater detail in the International Telecommunication Union (ITU) T.120 standard, particularly in T.122 and T.125. Furthermore, a software plug-in may be used in the conferencing devices to recognize RTP streams, which will be discussed in further detail below.
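A collaboration cue can be pictured as a fixed-size message that carries only the input event. The layout and field names in the following sketch are illustrative assumptions, since the disclosure does not specify a wire format; the point is that a keystroke or mouse movement travels as a handful of bytes rather than as a re-rendered copy of the document.

```c
#include <stdint.h>
#include <stdio.h>

/* Illustrative collaboration-cue message: a few bytes per user action,
 * instead of retransmitting a graphic copy of the document. */
enum cue_type { CUE_KEY_PRESS = 1, CUE_MOUSE_MOVE = 2 };

typedef struct {
    uint8_t  type;        /* enum cue_type */
    uint32_t document_id; /* which shared document the cue applies to */
    uint32_t user_id;     /* originating participant */
    union {
        uint16_t key_code;              /* valid when type == CUE_KEY_PRESS */
        struct { int16_t x, y; } mouse; /* valid when type == CUE_MOUSE_MOVE */
    } event;
} collaboration_cue;

int main(void)
{
    /* A single keystroke travels as one small message; each receiver
     * replays it against its local copy to keep all copies in step. */
    collaboration_cue cue = { .type = CUE_KEY_PRESS, .document_id = 42,
                              .user_id = 1, .event.key_code = 'A' };
    printf("cue message size: %zu bytes\n", sizeof cue);
    return 0;
}
```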
Conferencing device 304 receives A/V data, as well as computer data, from terminal 303: audio data is received at audio application portion 320, video data is received at video application portion 321, and other data, including computer data, is received at terminal manager portion 322 of conferencing device 304. The A/V data transmitted from terminal 303 in user system 315 is thus received at DSP portion 305, which comprises audio application portion 320 and video application portion 321.
Once the A/V data is processed, DSP 305 forwards the encoded data to real-time transport protocol (RTP) portion 323. RTP portion 323 manages end-to-end delivery services for real-time audio and video. RTP 323 typically transports data via the user datagram protocol (UDP). Under this configuration, transport-protocol functionality is established among the various conferencing devices during conferencing, and is further managed by the transport protocols & network interface 329.
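For orientation, RTP prepends a fixed 12-byte header (defined in RFC 3550) to each packet before UDP delivery. The sketch below serializes that header in network byte order, omitting CSRC entries and extensions; the dynamic payload type value of 96 is an assumed example.

```c
#include <stdint.h>
#include <stdio.h>

/* Serialize the 12-byte fixed RTP header (RFC 3550) into `buf`.
 * No CSRC list, padding, or header extension is used here. */
static void write_rtp_header(uint8_t buf[12], uint8_t payload_type,
                             uint16_t seq, uint32_t timestamp, uint32_t ssrc)
{
    buf[0] = 2u << 6;              /* version 2; no padding/extension; CC=0 */
    buf[1] = payload_type & 0x7F;  /* marker bit clear */
    buf[2] = (uint8_t)(seq >> 8);
    buf[3] = (uint8_t)seq;
    buf[4] = (uint8_t)(timestamp >> 24);
    buf[5] = (uint8_t)(timestamp >> 16);
    buf[6] = (uint8_t)(timestamp >> 8);
    buf[7] = (uint8_t)timestamp;
    buf[8]  = (uint8_t)(ssrc >> 24);
    buf[9]  = (uint8_t)(ssrc >> 16);
    buf[10] = (uint8_t)(ssrc >> 8);
    buf[11] = (uint8_t)ssrc;
}

int main(void)
{
    uint8_t header[12];
    /* Payload type 96 (dynamic range) is an assumed example value. */
    write_rtp_header(header, 96, 1, 90000, 0x12345678);
    for (int i = 0; i < 12; i++)
        printf("%02x ", header[i]);
    printf("\n");
    return 0;
}
```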
The registration, admission, and status (RAS) portion 325 implements the signaling protocol used between endpoints (e.g., terminals in a user system, gateways) and a gatekeeper. More specifically, RAS 325 may be used to perform registration, admission control, bandwidth changes, status exchange, and disengagement procedures between endpoints. A RAS channel is preferably used to exchange RAS messages, and this signaling channel may also be opened between an endpoint and any gatekeeper prior to the establishment of any other channels.
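The procedures listed above correspond to well-known request/confirm message pairs in the H.225.0 RAS protocol. The enumeration below names them purely for orientation; the actual messages are ASN.1 PER encoded structures, not C constants.

```c
/* H.225.0 RAS request/confirm pairs corresponding to the procedures
 * performed by RAS portion 325 (names per the H.225.0 recommendation). */
enum ras_message {
    RAS_GRQ, RAS_GCF,   /* gatekeeper discovery: request / confirm  */
    RAS_RRQ, RAS_RCF,   /* registration:         request / confirm  */
    RAS_ARQ, RAS_ACF,   /* admission:            request / confirm  */
    RAS_BRQ, RAS_BCF,   /* bandwidth change:     request / confirm  */
    RAS_IRQ, RAS_IRR,   /* status:               request / response */
    RAS_DRQ, RAS_DCF    /* disengage:            request / confirm  */
};
```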
Call signaling portion 326 establishes and terminates connections between endpoints, preferably by exchanging call setup messages over a call signaling channel in accordance with the ITU H.225.0 recommendation.
The T.120 data portion 328 is based on the ITU-T T.120 standard, which is generally made up of a suite of communication and application protocols developed and approved by the international computer and telecommunications industries. The T.120 data portion 328 includes an applications segment 340, a multi-point file transfer segment 341, an image exchanger segment 342, and an ITU-T standard application protocol segment 343.
Multi-point file transfer segment 341 defines how files are transferred simultaneously among conference participants. The multi-point file transfer segment would preferably be based on the T.127 standard and would enable one or more files to be selected and transmitted, in compressed or uncompressed form, to all selected participants during a conference. The image exchanger segment 342 specifies how an application from applications segment 340 sends and receives whiteboard information, in either compressed or uncompressed form, for viewing and updating among multiple conference participants. The image exchanger segment 342 is preferably based on the T.126 standard. The ITU-T standard application protocol segment 343 provides lower-level networking protocols for connecting and transmitting data, and specifies interaction with higher-level application protocols generated from applications segment 340. The data is then transmitted to packet network 307.
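The simultaneous distribution performed by segment 341 can be sketched as reading a file in fixed-size blocks and handing each block to every participant's channel. The block size, peer count, and send function below are illustrative assumptions; the actual T.127 protocol adds session establishment, token control, and optional compression on top of this basic fan-out.

```c
#include <stdio.h>

#define BLOCK_SIZE 1024   /* assumed chunk size for illustration */
#define NUM_PEERS  3      /* assumed number of conference participants */

/* Stand-in for the per-participant T.127 data channel. */
static void send_block(int peer, const unsigned char *data, size_t len)
{
    printf("peer %d: %zu bytes\n", peer, len);
}

/* Read `path` once and fan each block out to every participant,
 * so all conference members receive the file simultaneously. */
static int multipoint_send(const char *path)
{
    FILE *fp = fopen(path, "rb");
    if (!fp) return -1;

    unsigned char block[BLOCK_SIZE];
    size_t n;
    while ((n = fread(block, 1, sizeof block, fp)) > 0)
        for (int peer = 0; peer < NUM_PEERS; peer++)
            send_block(peer, block, n);

    fclose(fp);
    return 0;
}

int main(int argc, char **argv)
{
    return (argc > 1 && multipoint_send(argv[1]) == 0) ? 0 : 1;
}
```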
While the invention has been described in detail in connection with preferred embodiments known at the time, it should be readily understood that the invention is not limited to the disclosed embodiments. Rather, the invention can be modified to incorporate any number of variations, alterations, substitutions or equivalent arrangements not heretofore described, but which are commensurate with the spirit and scope of the invention.
For example, although the invention has been described in connection with a generic digital link, the invention may be practiced with many types of digital links, such as USB 2.0, IEEE 1394, and even wired or wireless LAN, without departing from the spirit and scope of the invention. In addition, although the invention is described in connection with videoconferencing and data collaboration, it should be readily apparent that the invention may be practiced with any type of collaborative network. It is also understood that the device portions and segments described in the embodiments above can be substituted with equivalent devices to perform the disclosed methods and processes. Accordingly, the invention is not limited by the foregoing description or drawings, but is only limited by the scope of the appended claims.