This application claims priority of Taiwan Patent Application No. 100140245, filed on Nov. 4, 2011, the entirety of which is incorporated by reference herein.
1. Field of the Invention
The present invention relates to video conferencing, and in particular relates to a video conference system and method with a pause mode.
2. Description of the Related Art
In recent years, video conferencing has become an important way for two remote users to communicate, due to the development of network technologies and video compression technologies. In addition, the coverage area of wired and wireless networks has become very wide, and thus video communication over the internet protocol (IP) network is widely used. Although video conference services are also provided by 3G cellular networks (e.g. the video phone protocol 3G-324M over the communications network), such services are not popular because the coverage area is limited and the communications fees are very expensive. Generally, it is necessary for a user to own a dedicated video conference system in order to conveniently conduct video conferencing with other users. However, the sounds and images of a user will always be transmitted to and displayed on the other device after the video conference system is enabled, which may cause inconvenience for users in some conditions.
A detailed description is given in the following embodiments with reference to the accompanying drawings.
An exemplary embodiment provides a video conference system. The video conference system includes an audio processing unit, a video processing unit and a network processing unit. The audio processing unit is configured to encode an audio signal to an audio stream, wherein the audio signal is captured by a sound receiver. The video processing unit is configured to encode a pause image to a first video stream when the video conference system is in a pause mode, and to encode a video signal which is captured by a multimedia capturing unit to a second video stream when the video conference system is in a conference mode. The network processing unit is configured to encode the first video stream to a first network packet or to encode the second video stream and the audio stream to a second network packet, and to send the first and second network packets to a network, wherein the network processing unit encodes the first video stream to the first network packet when the video conference system is in the pause mode, and encodes the second video stream and the audio stream to the second network packet when the video conference system is in the conference mode.
Another exemplary embodiment provides a video conference method which is applied in a video conference system, wherein the video conference system has a pause mode and a conference mode. First, the video conference method determines whether the pause mode has been triggered. When the pause mode has been triggered, a pre-saved pause image is retrieved. Next, the pause image is encoded to a first video stream, and the first video stream is encoded to a first network packet. Finally, the first network packet is sent to a network.
The present invention can be more fully understood by reading the subsequent detailed description and examples with references made to the accompanying drawings, wherein:
The following description is of the best-contemplated mode of carrying out the invention. This description is made for the purpose of illustrating the general principles of the invention and should not be taken in a limiting sense. The scope of the invention is best determined by reference to the appended claims.
The video conference system 100 may comprise a multimedia capturing unit 110, a digital enhanced cordless telecommunications telephone (DECT telephone hereafter) 120, and a video conference terminal apparatus 130. The video conference terminal apparatus 130 is configured to connect with another video conference terminal apparatus to exchange video signals and audio signals through an IP network (e.g. a local area network (LAN)) or a radio telecommunications network, and the details will be described in the following sections. The multimedia capturing unit 110 can be a light-sensitive component (e.g. a CCD or CMOS sensor) configured to capture the images of a user and output a video signal V1 according to the images. The DECT telephone 120 is configured to receive the audio signal from a remote user through the video conference terminal apparatus 130, and play the audio signal. The multimedia capturing unit 110 may further comprise a microphone (not shown).
The video conference terminal apparatus 130, coupled to the multimedia capturing unit 110 and the DECT telephone 120, may comprise an audio processing unit 140, a video processing unit 150, and a network processing unit 160. The audio processing unit 140 is configured to receive the audio signal A1 outputted from the DECT telephone 120 through the network processing unit 160, and encode the audio signal A1 to an audio stream AS1. The video processing unit 150 is configured to receive the video signal V1 (and/or the audio signal A3) from the multimedia capturing unit 110 through the network processing unit 160, or to retrieve a pre-saved pause image V3 through a bus (not shown), and encode the video signal V1 and the pause image V3 to a video stream VS1 and a video stream VS3, respectively. The pause image V3 can be pre-saved in a storage device (not shown) of the video conference terminal apparatus 130 or the multimedia capturing unit 110, but it is not limited thereto.
It should be noted that the video processing unit 150 encodes the pause image V3 to the video stream VS3 when the video conference terminal apparatus 130 is in the pause mode, wherein the video stream VS3 has a first bit rate and a first frame rate. The video processing unit 150 encodes the video signal V1 to the video stream VS1 when the video conference terminal apparatus 130 is in the conference mode, wherein the video stream VS1 has a second bit rate and a second frame rate. For example, the second bit rate can be 2 megabits per second (2 Mbps), and the second frame rate can be 30 frames per second (30 fps). Additionally, the pause image V3 can be a static picture or a sequence of dynamic pictures. Therefore, the video processing unit 150 can encode the pause image V3 to the video stream VS3 with a lower bit rate and a lower frame rate to use the network bandwidth efficiently. For example, the first bit rate can be 500 kilobits per second (500 Kbps), and the first frame rate can be 5 frames per second (5 fps). The above frame rates and bit rates are merely examples of one embodiment of the present invention, and the invention is not limited thereto.
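As a non-limiting illustration of the mode-dependent encoder settings described above, the following Python sketch selects a bit rate and a frame rate according to the current mode. The function and constant names (select_encoder_params, PAUSE_MODE, CONFERENCE_MODE) are hypothetical and are used only for explanation; they do not appear in the embodiment itself.

    # Hypothetical sketch: pick encoder parameters according to the current mode.
    # The values follow the example in this embodiment (500 Kbps / 5 fps for the
    # pause mode, 2 Mbps / 30 fps for the conference mode) and are not limiting.
    PAUSE_MODE = "pause"
    CONFERENCE_MODE = "conference"

    def select_encoder_params(mode):
        """Return (bit_rate_bps, frame_rate_fps) for the video processing unit 150."""
        if mode == PAUSE_MODE:
            # The pause image V3 is static or changes slowly, so a low bit rate
            # and a low frame rate are sufficient and save network bandwidth.
            return 500_000, 5
        if mode == CONFERENCE_MODE:
            # The live video signal V1 from the multimedia capturing unit 110.
            return 2_000_000, 30
        raise ValueError("unknown mode: %r" % (mode,))

    # Example usage: select_encoder_params(PAUSE_MODE) returns (500000, 5).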
The network processing unit 160 further encodes the video stream VS1 and the audio stream AS1 to a network packet P1A, and communicates with another video conference terminal apparatus by network packets through an IP network for the video conference. For example, the network processing unit 160 encodes the video stream VS3, which is encoded from the pause image V3, to a network packet P1B when the video conference terminal apparatus 130 is in the pause mode. The network processing unit 160 encodes the video stream VS1, which is encoded from the video signal V1, and the audio stream AS1 to a network packet P1A when the video conference terminal apparatus 130 is in the conference mode. It should be noted that the network packet P1B does not include the audio stream AS1 when the video conference terminal apparatus 130 is in the pause mode in the present embodiment. In another embodiment, the network packet P1B includes the audio stream AS1 when the video conference terminal apparatus 130 is in the pause mode, but it is not limited thereto.
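The difference between the network packets P1A and P1B may be illustrated by the following sketch, in which build_packet is a hypothetical helper and a Python dictionary merely stands in for a real network packet; it is illustrative only and does not limit the packet format.

    # Hypothetical sketch of packet assembly by the network processing unit 160.
    def build_packet(mode, video_stream, audio_stream=None):
        """Bundle the encoded streams into a dictionary standing in for a packet."""
        if mode == "pause":
            # P1B: carries only the video stream VS3 encoded from the pause image V3.
            # (In another embodiment the audio stream AS1 may also be included.)
            return {"type": "P1B", "video": video_stream}
        # P1A: carries the video stream VS1 and the audio stream AS1.
        return {"type": "P1A", "video": video_stream, "audio": audio_stream}

    # Example usage: build_packet("conference", b"VS1", b"AS1") yields a P1A-style
    # packet, while build_packet("pause", b"VS3") yields a P1B-style packet without audio.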
The network processing unit 160 may comprise a digital enhanced cordless telephone interface (DECT interface hereafter) 161, a network processing unit 162, and a multimedia transmission interface 163. The DECT telephone 120 may communicate with and transmit data to the video conference terminal apparatus 130 through the DECT interface 161 using the DECT protocol. The network processing unit 162 is configured to receive the video stream VS1 or VS3 and the audio stream AS1 from the video processing unit 150 and the audio processing unit 140, respectively, and encode the video stream VS1 or VS3 and the audio stream AS1 to a network packet P1A or P1B, which is further transmitted to the video conference terminal apparatuses of other users in the IP network. The network processing unit 162 is compatible with various wired/wireless communications protocols, such as the local area network (LAN), the intranet, the internet, the radio telecommunications network, the public switched telephone network, Wi-Fi, infrared, and Bluetooth, but the invention is not limited thereto. The network processing unit 162 may further control the real-time media sessions and coordinate the network transfer flows between the users in the video conference. The multimedia transmission interface 163 is compatible with various transmission interfaces, such as USB and HDMI interfaces, for transmitting and receiving the video/audio signals.
The video processing unit 150 may be a video codec (i.e. a video encoder/decoder), configured to receive the video signal V1 from the multimedia capturing unit 110, and encode the video signal V1 to generate a video stream VS1. The video processing unit 150 may further transmit the video stream VS1 and the audio stream AS1 to the video conference terminal apparatus of another user in the video conference through the network processing unit 162. When the network processing unit 162 receives the network packet P2 from the other user in the video conference through the IP network, the network processing unit 162 performs an error concealment process on the network packet P2. After the error concealment process, the audio processing unit 140 and the video processing unit 150 decode the audio stream AS2 and the video stream VS2 of the network packet P2, respectively, to obtain the audio signal A2 and the video signal V2. After the audio signal A2 and the video signal V2 are obtained, the video signal V2 is displayed on the display device and the audio signal A2 is played by the DECT telephone 120 in synchronization. It should be noted that the video processing unit 150 and the audio processing unit 140 can be implemented by hardware or software, but the invention is not limited thereto.
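For illustration only, the receive path described above may be sketched as follows. Every helper name (conceal_errors, decode_video, decode_audio, play_synchronized) is a hypothetical placeholder standing in for the corresponding unit and does not name a real API.

    # Hypothetical sketch of the receive path for a network packet P2.
    def conceal_errors(packet):
        # Placeholder for the error concealment performed by the network processing unit 162.
        return packet

    def decode_video(video_stream):
        # Placeholder for the video processing unit 150 decoding VS2 into the video signal V2.
        return "V2"

    def decode_audio(audio_stream):
        # Placeholder for the audio processing unit 140 decoding AS2 into the audio signal A2.
        return "A2" if audio_stream is not None else None

    def play_synchronized(video_signal, audio_signal):
        # Placeholder: the display apparatus shows V2 while the DECT telephone 120 plays A2.
        print("display", video_signal, "with audio", audio_signal)

    def handle_incoming_packet(packet):
        packet = conceal_errors(packet)
        video_signal = decode_video(packet["video"])
        audio_signal = decode_audio(packet.get("audio"))
        play_synchronized(video_signal, audio_signal)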
In another embodiment, the user may control the video conference terminal apparatus 130 by using the telephone keypad 121 of the DECT telephone 120, for example to dial the telephone numbers of other users in the video conference, control the angle of the camera, or alter the settings of the screen. Specifically, the DECT telephone 120 may transmit the control signal to the video conference terminal apparatus 130 through the DECT interface 161 using the DECT protocol. The connection between the video conference terminal apparatus 130 and the multimedia capturing unit 110 can pass through the multimedia transmission interface 163, such as a wired interface (e.g. USB or HDMI) or a wireless interface (e.g. Wi-Fi). The video conference terminal apparatus 130 can also be connected to a display apparatus (e.g. an LCD TV) through the multimedia transmission interface 163, such as the HDMI interface or the WiDi (Wireless Display) interface, so that the video screens of other users in the video conference and/or the control interface of the video conference terminal apparatus 130 can be displayed on the display apparatus, but the invention is not limited thereto.
In an embodiment, if the user A wants to conduct a video conference with the user B, the user A may use the DECT telephone 120 of the video conference terminal apparatus 130 to dial the telephone number of the video conference terminal apparatus 130 of the user B. Meanwhile, the video conference terminal apparatus 130 of the user A may receive the control message from the DECT telephone 120 through the DECT interface 161, and transmit the control message to the user B. When the video conference terminal apparatus 130 of the user B receives the phone call from the user A, the user B may answer the phone call. A video call can then be established between the users A and B through their respective video conference terminal apparatuses 130. The user A may use the DECT telephone 120 to capture his or her voice, and use the multimedia capturing unit 110 to capture his or her images. Then, the audio processing unit 140 may receive the captured voice of the user A through the DECT interface 161, and encode the captured voice (i.e. the audio signal A1) to an audio stream AS1. The video processing unit 150 may encode the captured images of the user A (i.e. the video signal V1) to the video stream VS1. The audio stream AS1 and the video stream VS1 are transmitted to the video conference terminal apparatus 130 of the user B through the IP network. On the other hand, the video conference terminal apparatus 130 of the user B may decode the received audio stream AS1 and video stream VS1. Then, the video conference terminal apparatus 130 of the user B may transmit the decoded audio signal A1 to its DECT telephone 120 through the DECT interface 161, thereby playing the audio signal A1, and may display the decoded video signal V1 on a display apparatus through the multimedia transmission interface 163. It should be noted that the user B may also use the same procedure performed by the user A for exchanging video/audio signals to conduct the video conference.
In yet another embodiment, the multimedia capturing unit 110 may further comprise a microphone (not shown), which is configured to receive the sounds of the user and output the audio signal A3 to the video processing unit 150.
In the step S100, the video conference system 100 determines whether the pause mode has been triggered by the user. When the pause mode has been triggered, the process goes to step S110; otherwise, the process goes to step S120.
In the step S110, the video processing unit 150 retrieves a pre-saved pause image V3. Next, the process goes to step S130.
In the step S120, the video processing unit 150 receives the video signal V1 from the multimedia capturing unit 110. Next, the process goes to step S130.
In the step S130, the video processing unit 150 encodes the retrieved pause image or the captured video signal. For example, the video processing unit 150 can encode the video signal V1 to a video stream VS1, or encode the pause image V3 to a video stream VS3.
Next, in the step S140, the network processing unit 160 sends the image which has been encoded by the video processing unit 150 to a network. For example, during the pause mode, the network processing unit 160 encodes the video stream VS3, which is encoded from the pause image V3, to a network packet P1B, and sends the network packet P1B to the network. During the conference mode, the network processing unit 160 encodes the video stream VS1, which is encoded from the video signal V1, and the audio stream AS1 to a network packet P1A, and sends the network packet P1A to the network. It should be noted that, in the pause mode, the network packet P1B does not include the audio stream AS1. In another embodiment, the network packet P1B includes the audio stream AS1 in the pause mode, but it is not limited thereto.
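Steps S100 to S140 may be summarized by the following sketch. The helpers encode_video, encode_audio and send_to_network are hypothetical stand-ins for the video processing unit 150, the audio processing unit 140 and the network processing unit 160, and the packet representation is illustrative only.

    # Hypothetical sketch of the transmit-side flow of steps S100 to S140.
    def encode_video(image, bit_rate, frame_rate):
        return ("video-stream", image, bit_rate, frame_rate)   # placeholder encoder

    def encode_audio(audio_signal):
        return ("audio-stream", audio_signal)                  # placeholder encoder

    def send_to_network(packet):
        print("sending", packet)                               # placeholder transmission

    def transmit_flow(pause_triggered, video_signal, audio_signal, pause_image):
        if pause_triggered:                                    # step S100 -> S110
            vs3 = encode_video(pause_image, 500_000, 5)        # step S130: pause image V3
            packet = {"type": "P1B", "video": vs3}             # step S140: no audio stream
        else:                                                  # step S100 -> S120
            vs1 = encode_video(video_signal, 2_000_000, 30)    # step S130: video signal V1
            as1 = encode_audio(audio_signal)                   # audio stream AS1
            packet = {"type": "P1A", "video": vs1, "audio": as1}
        send_to_network(packet)                                # step S140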
Next, in the step S210, the video conference system 100′ receives the network packet P1A or P1B through the network.
Next, in the step S220, the network processing unit 162 of the video conference system 100′ performs an error concealment process on the network packet P1A or P1B.
Next, in the step S230, after the error concealment process, the audio processing unit 140 and the video processing unit 150 of the video conference system 100′ respectively decode the audio stream AS1 and the video stream VS1 of the network packet P1A, or the video processing unit 150 decodes the video stream VS3 of the network packet P1B.
Next, in the step S240, the video conference system 100′ synchronizes the audio signal A1 and video signal V1.
Next, in the step S250, the video conference system 100′ plays the audio signal A1 and displays the video signal V1. For example, when the pause mode of the video conference system 100 has been triggered by the user, the video conference system 100′ displays the pause image V3. When the pause mode of the video conference system 100 has not been triggered, i.e., in the conference mode, the video conference system 100′ displays the video signal V1. The process ends at the step S250.
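On the receiving side, steps S210 to S250 imply a branch on the received packet type, which the following sketch makes explicit. The name receive_flow and the inline placeholders are hypothetical and are provided only to illustrate the flow described above.

    # Hypothetical sketch of the receive-side flow of steps S210 to S250.
    def receive_flow(packet):
        packet = dict(packet)                        # step S220: stand-in for error concealment
        if packet.get("type") == "P1B":
            # The sender is in the pause mode: decode and display the pause image V3.
            video_signal = packet["video"]           # step S230: decode video stream VS3
            audio_signal = None                      # no audio stream in this embodiment
        else:
            # The sender is in the conference mode: decode both streams.
            video_signal = packet["video"]           # step S230: decode video stream
            audio_signal = packet.get("audio")       # decode audio stream
        # Steps S240 and S250: synchronize, then display the video and play the audio.
        print("display", video_signal, "audio", audio_signal)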
For those skilled in the art, it should be appreciated that the aforementioned embodiments of the invention describe different ways of implementation, and each way of implementation of the video conference system and the video conference terminal apparatus can be used in combination. The video conference system 100 of the invention may use the video conference terminal apparatus together with a common DECT telephone and an image capturing unit to conduct a video conference with other users, thereby providing convenience and cost advantages.
While the invention has been described by way of example and in terms of the preferred embodiments, it is to be understood that the invention is not limited to the disclosed embodiments. To the contrary, it is intended to cover various modifications and similar arrangements (as would be apparent to those skilled in the art). Therefore, the scope of the appended claims should be accorded the broadest interpretation so as to encompass all such modifications and similar arrangements.
Number: 100140245
Date: Nov. 4, 2011
Country: TW
Kind: national