System and method for optimizing video communications based on device capabilities

Information

  • Patent Grant
  • Patent Number
    11,770,584
  • Date Filed
    Monday, May 23, 2022
  • Date Issued
    Tuesday, September 26, 2023
Abstract
A system and method for optimizing video for transmission on a device are provided. In one example, the method includes capturing an original video frame and scaling the original video frame down to a lower resolution video frame. The lower resolution video frame is encoded using a first encoder to produce a first layer output, and the first layer output is decoded. The decoded first layer output is upscaled to match a resolution of the original video frame. A difference is obtained between the upscaled decoded first layer output and the original video frame. The difference is independently encoded using a second encoder to create a second layer output. The first and second layer outputs may be stored or sent to another device.
Description
BACKGROUND

The manner in which communication sessions with remote parties occur is currently limited in functionality and flexibility. Accordingly, what is needed are a system and method that address these issues.


SUMMARY

In some example embodiments, a method for optimizing video for transmission on a device based on the device's capabilities includes capturing, by a camera associated with the device, an original video frame, scaling the original video frame down to a lower resolution video frame, encoding the lower resolution video frame using a first encoder to produce a first layer output, decoding the first layer output, upscaling the decoded first layer output to match a resolution of the original video frame, obtaining a difference between the upscaled decoded first layer output and the original video frame, and encoding the difference using a second encoder to create a second layer output, wherein the encoding to produce the second layer output occurs independently from the encoding to produce the first layer output.


In one or more of the above examples, the first and second encoders perform the encoding of the first and second layer outputs, respectively, using different video coding standards.


In one or more of the above examples, the first and second encoders perform the encoding of the first and second layer outputs, respectively, using identical video coding standards.


In one or more of the above examples, the method further includes communicating, by the device, with another device in order to determine which video coding standard is to be used to perform the encoding by each of the first and second encoders.


In one or more of the above examples, the method further includes sending the first and second layer outputs to another device during a video call.


In one or more of the above examples, the method further includes sending the first and second layer outputs to a storage device.


In some example embodiments, a method for decoding video for display by a device includes receiving an encoded first video frame and an encoded second video frame, independently decoding the encoded first and second video frames using a first decoder and a second decoder, respectively, upscaling the decoded first video frame to a resolution matching a resolution of the decoded second video frame, and adding the upscaled decoded first video frame and the decoded second video frame to create an additive video frame.


In one or more of the above examples, the first and second decoders perform the decoding of the encoded first and second video frames, respectively, using different video coding standards.


In one or more of the above examples, the first and second decoders perform the decoding of the encoded first and second video frames, respectively, using identical video coding standards.


In one or more of the above examples, the method further includes sending the additive video frame for display by the device.


In one or more of the above examples, receiving the encoded first video frame and the encoded second video frame includes retrieving the encoded first video frame and the encoded second video frame from a storage device.


In some example embodiments, a device or system for sending and receiving optimized video frames includes a processor, and a memory coupled to the processor, the memory having a plurality of instructions stored therein for execution by the processor, the plurality of instructions including instructions for scaling an original video frame down to a lower resolution video frame, encoding the lower resolution video frame using a first encoder to produce a first layer output, decoding the first layer output, upscaling the decoded first layer output to match a resolution of the original video frame, obtaining a difference between the upscaled decoded first layer output and the original video frame, and encoding the difference using a second encoder to create a second layer output, wherein the encoding to produce the second layer output occurs independently from the encoding to produce the first layer output.


In one or more of the above examples, the first and second encoders perform the encoding of the first and second layer outputs, respectively, using different video coding standards.


In one or more of the above examples, the first and second encoders perform the encoding of the first and second layer outputs, respectively, using identical video coding standards.


In one or more of the above examples, the instructions further include communicating with another device in order to determine which video coding standard is to be used to perform the encoding by each of the first and second encoders.


In one or more of the above examples, the instructions further include sending the first and second layer outputs to another device during a video call.


In one or more of the above examples, the instructions further include sending the first and second layer outputs to a storage device.





BRIEF DESCRIPTION OF THE DRAWINGS

For a more complete understanding, reference is now made to the following description taken in conjunction with the accompanying Drawings in which:



FIGS. 1A-1C illustrate various embodiments of environments within which video communications may be optimized;



FIG. 2 illustrates one embodiment of an encoding process that may be used by a transmitting device to optimize a video frame prior to transmission or storage;



FIG. 3 illustrates one embodiment of a decoding process that may be used by a receiving device to recover a video frame optimized by the encoding process of FIG. 2;



FIG. 4 illustrates a flow chart showing one embodiment of an encoding process that may be used by a transmitting device to optimize a video frame prior to transmission or storage;



FIG. 5 illustrates a flow chart showing one embodiment of a decoding process that may be used by a receiving device to recover a video frame optimized by the encoding process of FIG. 4;



FIG. 6 illustrates a flow chart showing one embodiment of a process that may occur to establish and use video encoding parameters;



FIG. 7 illustrates one embodiment of a server conference call environment within which different encoded frames may be used for video communications;



FIGS. 8A-8D illustrate various embodiments of environments showing different optimization configurations; and



FIG. 9 is a simplified diagram of one embodiment of a computer system that may be used in embodiments of the present disclosure as a communication device or a server.





DETAILED DESCRIPTION

It is understood that the following disclosure provides many different embodiments or examples. Specific examples of components and arrangements are described below to simplify the present disclosure. These are, of course, merely examples and are not intended to be limiting. In addition, the present disclosure may repeat reference numerals and/or letters in the various examples. This repetition is for the purpose of simplicity and clarity and does not in itself dictate a relationship between the various embodiments and/or configurations discussed.


Referring to FIGS. 1A-1C, embodiments of an environment 100 are illustrated within which various aspects of the present disclosure may be practiced. The environment 100 of FIG. 1A includes a first communication device 102 and a second communication device 104. The two devices 102 and 104 may be involved in a one-way or two-way communication session involving video. The two devices may be similar or different, and may include identical or different hardware and/or software capabilities, such as graphics processing units (GPUs), video encoders, and video decoders.


The environment 100 of FIG. 1B illustrates video information being sent from a communication device 102 to a storage 106. The environment of FIG. 1C illustrates a conference call environment where a server 108 uses a selective transmission unit 110 to manage a conference call with multiple communication devices 102, 104, and 112. Although only three communication devices are illustrated, it is understood that any number of devices may be in communication with the server 108, subject to technical limitations such as bandwidth, processing power, and/or similar factors.


The communication devices 102, 104, and 112 may be mobile devices (e.g., tablets, smartphones, personal digital assistants (PDAs), or netbooks), laptops, desktops, workstations, smart televisions, and/or any other computing device capable of receiving and/or sending electronic communications via a wired or wireless network connection. Such communications may be direct (e.g., via a peer-to-peer network, an ad hoc network, or using a direct connection), indirect, such as through a server or other proxy (e.g., in a client-server model), or may use a combination of direct and indirect communications.


One video optimization method involves the use of video scaling, which enables more efficient resource usage in video communications. Generally, the scaling of video may be accomplished using two different methods. The first scaling method is resolution scaling, in which a video frame has similar information at different resolutions, but uses different amounts of bandwidth due to the different resolutions. The second scaling method is temporal scaling, in which reference frames are arranged such that every other frame (or some percentage or number of frames) can be dropped without any real impact on the decoding process. The present disclosure refers generally to resolution scaling, although it is understood that temporal scaling may be incorporated with aspects of the described embodiments.


The present disclosure provides a scaling approach that enables video optimizations for various devices even when those devices do not include support for standards such as Scalable Video Coding (SVC) as embodied in the Annex G extension of the H.264/MPEG-4 AVC video compression standard. This allows the present disclosure's approach to be used with a broad range of devices, including devices such as older mobile phones and devices with different encoding and decoding hardware and/or software. By dynamically adjusting to each device's capabilities, the scaling process may be configured to achieve an optimized outcome that may take into account the device itself, available network bandwidth, and/or other factors. Furthermore, for devices that support standards such as SVC, the present disclosure's approach may provide more flexibility due to its enabling of independent encoding steps and the provision for using different encoders during different steps of the encoding process. For purposes of convenience, the terms “codec,” “video coding format,” and “video coding standard” may be used interchangeably in the present disclosure.


Referring to FIG. 2, one embodiment of an encoding process 200 that may be used by a sending device (e.g., one of the communication devices of FIGS. 1A-1C or the server 108/STU 110) is illustrated. An original video frame 201a is captured by a camera in step 202. The resolution and other parameters of the video frame 201a may depend on the settings used to capture the image, the quality of the camera, and/or similar factors. For purposes of example, the video frame is captured at 1280×720.


The original frame is then scaled down in step 204 to create a scaled down frame 201b. The scaling may be performed, for example, using the device's GPU. For purposes of example, the original video frame 201a is scaled down to 320×180 for the frame 201b. The frame 201b is then encoded in step 206 to produce a Layer 0 output. The Layer 0 output is sent to a server, another device, and/or to storage in step 216, depending on the environment within which the device is operating.


Depending on factors such as the level of scaling and the compression type used, Layer 0 may be significantly smaller than the original frame while containing much of the same information. For example, Layer 0 may be around 1/16 the size of the original image, and the required bandwidth may be reduced to around 1/8 of the bandwidth that would otherwise be needed.
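The size figure above follows directly from the example resolutions: scaling 1280×720 down to 320×180 shrinks each dimension by a factor of four, so the pixel count drops by a factor of sixteen. A minimal check of this arithmetic (the variable names are illustrative):

```python
# Pixel-count arithmetic for the example downscale in the text:
# 1280x720 -> 320x180 shrinks each dimension by 4x, so the scaled
# frame carries 1/16 of the original pixels. (The ~1/8 bandwidth
# figure differs because compression efficiency is not linear in
# pixel count.)
orig_w, orig_h = 1280, 720
low_w, low_h = 320, 180

pixel_ratio = (low_w * low_h) / (orig_w * orig_h)
print(pixel_ratio)  # 0.0625, i.e., 1/16
```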


The Layer 0 output is decoded in step 208 and scaled up to the original resolution in step 210 to create a frame 201c. In the present example, the decoded frame 201b is scaled up from 320×180 to 1280×720 by the GPU. Due to the process of scaling and/or encoding/decoding, the frame 201b will likely not be exactly the same as the original frame 201a even after it is scaled up. For example, if a lossy algorithm is used to scale the frame down to 320×180, then some information will generally be lost during the downscaling process. When the frame is upscaled to the original resolution as frame 201c, the lost information may result in differences between the scaled up frame 201c and the original frame 201a.


In step 212, the difference between the original frame 201a and the scaled up frame 201c is calculated. This operation may be performed, for example, by the GPU. This difference results in a “ghost” image 201d that contains the differences between the original frame 201a and the scaled up frame 201c. The actual content of the ghost image 201d may vary depending on the process used to scale the frame and the encoding process used to create the Layer 0 output. In step 214, the ghost image 201d is encoded to produce a Layer 1 output. The Layer 1 output is sent to a server, another device, and/or storage in step 216, depending on the environment within which the device is operating. It is understood that the terms “Layer 0” and “Layer 1” are used for purposes of illustration and any identifiers may be used for the encoder outputs.
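The steps of the encoding process 200 can be sketched in pure Python. The box-filter downscale, nearest-neighbour upscale, and identity "encoder" stubs below are illustrative stand-ins, not the codecs the disclosure describes, and the function names are assumptions:

```python
# Sketch of the two-layer encoding flow of FIG. 2, using plain Python
# lists of pixel values as frames. Real encoders are replaced by
# identity stubs, so the layers here are uncompressed.

def downscale(frame, factor):
    """Step 204: average each factor-by-factor block (box filter)."""
    h, w = len(frame), len(frame[0])
    return [[sum(frame[y * factor + dy][x * factor + dx]
                 for dy in range(factor) for dx in range(factor)) // factor ** 2
             for x in range(w // factor)]
            for y in range(h // factor)]

def upscale(frame, factor):
    """Step 210: nearest-neighbour upscale back to the original size."""
    return [[frame[y // factor][x // factor]
             for x in range(len(frame[0]) * factor)]
            for y in range(len(frame) * factor)]

def diff(a, b):
    """Step 212: per-pixel difference -- the 'ghost' image 201d."""
    return [[pa - pb for pa, pb in zip(ra, rb)] for ra, rb in zip(a, b)]

def encode_layers(original, factor=2):
    low = downscale(original, factor)   # step 204: scaled down frame 201b
    layer0 = low                        # step 206: first encoder (stub)
    decoded0 = layer0                   # step 208: decode Layer 0 (stub)
    up = upscale(decoded0, factor)      # step 210: scaled up frame 201c
    ghost = diff(original, up)          # step 212: ghost image 201d
    layer1 = ghost                      # step 214: independent second encoder (stub)
    return layer0, layer1

original = [[10, 20, 30, 40],
            [50, 60, 70, 80],
            [15, 25, 35, 45],
            [55, 65, 75, 85]]
layer0, layer1 = encode_layers(original)
```

With identity codec stubs, upscaling Layer 0 and adding the ghost image recovers the original exactly; with real lossy encoders the reconstruction would only approximate it, as the disclosure notes.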


It is noted that the encoding step 214 is independent of the encoding step 206. Accordingly, different encoding processes may be used by the two steps or the same encoding process may be used. This allows flexibility in the encoding processes. For example, a preferred encoder for the low resolution encoding that produces the Layer 0 output may not be ideal for the high resolution encoding of step 214. Accordingly, because of the described independent encoding process, the encoding steps 206 and 214 may be performed using different video coding standards.


The encoders may provide header information, such as encoder type, layer number, timestamps (e.g., to ensure the correct Layer 0 and Layer 1 frames are used properly on the receiving device), resolution information, and/or other information. The encoding process 200 of FIG. 2, including the creation and inclusion of header information, may be managed by an application on the device, and may include coordination with an STU (e.g., the STU 110 of FIG. 1C) and/or other communication devices. Accordingly, determining which video coding standards to use may include a negotiation process with other devices. The encoders may be hardware, while the decoders (which are generally less complex and use fewer resources) may be hardware or software. If hardware encoders are not available, software encoders may be used with adjustments made to account for the slower encoding and higher resource usage.
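The header fields listed above might be modeled as follows; the exact wire format is not specified in the disclosure, so this structure and its field names are purely illustrative:

```python
# Hypothetical layer header carrying the fields named in the text:
# encoder type, layer number, timestamp, and resolution. Matching
# timestamps let the receiver pair a Layer 0 frame with its Layer 1
# ghost image.
from dataclasses import dataclass

@dataclass
class LayerHeader:
    encoder_type: str   # e.g. "Vp9" for Layer 0, "Vp8" for Layer 1
    layer: int          # 0 = low-resolution base, 1 = difference image
    timestamp_ms: int   # pairs Layer 0 and Layer 1 frames on receipt
    width: int
    height: int

h0 = LayerHeader("Vp9", 0, 1_000, 320, 180)
h1 = LayerHeader("Vp8", 1, 1_000, 1280, 720)
frames_match = h0.timestamp_ms == h1.timestamp_ms
```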


It is noted that, in the present embodiment, information may not be transferred between the two independently operating encoders. Instead, each encoder may simply encode the frame it receives without taking information from the other encoder into account. In other embodiments, information may be transferred between the encoders. While two separate encoders are used for purposes of example, both encoding steps may be performed by a single encoder in some embodiments.


Referring to FIG. 3, one embodiment of a decoding process 300 that may be used by a receiving device (e.g., one of the communication devices of FIGS. 1A-1C or the server 108/STU 110) is illustrated. For purposes of example, the receiving device is receiving the Layer 0 and Layer 1 outputs sent by the process 200 of FIG. 2. The Layer 0 and Layer 1 outputs of FIG. 2 are received in step 302. The low resolution Layer 0 stream is decoded in step 304 to recover the scaled down frame 201b. The frame 201b is scaled up (e.g., by the GPU) from its current resolution of 320×180 to 1280×720 to create the frame 201c, which matches the resolution of the ghost image 201d.


The high resolution Layer 1 stream is independently decoded in step 308 to recover the ghost image 201d. Depending on the video coding standards used to encode the Layer 0 and Layer 1 outputs, the decoders for steps 304 and 308 may be different or may be the same. The ghost image 201d and the scaled up frame 201c are added in step 310 (e.g., by the GPU) to recreate the image 201a or an approximation thereof. It is noted that the recreated frame 201a of FIG. 3 may not exactly match the original frame of FIG. 2. The recreated frame 201a is then displayed in step 312.
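The receiving side of FIG. 3 can be sketched similarly; as before, the decoder stubs and helper names are illustrative, and the example layers are taken to be uncompressed:

```python
# Sketch of the decoding flow of FIG. 3: independently decode both
# layers, upscale Layer 0, then add the ghost image to recreate the
# frame. Real decoders are replaced by identity stubs.

def upscale(frame, factor):
    """Nearest-neighbour upscale of the decoded Layer 0 frame."""
    return [[frame[y // factor][x // factor]
             for x in range(len(frame[0]) * factor)]
            for y in range(len(frame) * factor)]

def add(a, b):
    """Step 310: per-pixel addition of frame 201c and ghost image 201d."""
    return [[pa + pb for pa, pb in zip(ra, rb)] for ra, rb in zip(a, b)]

def decode(layer0, layer1, factor=2):
    decoded0 = layer0               # step 304: first decoder (stub)
    up = upscale(decoded0, factor)  # e.g. 320x180 -> 1280x720 in the text
    ghost = layer1                  # step 308: independent second decoder (stub)
    return add(up, ghost)           # step 310: additive (recreated) frame

# Example layers consistent with a 2x box-filter downscale of a 4x4 frame:
layer0 = [[35, 55], [40, 60]]
layer1 = [[-25, -15, -25, -15], [15, 25, 15, 25],
          [-25, -15, -25, -15], [15, 25, 15, 25]]
frame = decode(layer0, layer1)
```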


It is understood that the encoder/decoder may depend on the device and its capabilities. Examples of hardware and software vendors and their supported encoder/decoder standards that may be used with the present disclosure are provided below in Table 1.












TABLE 1

Chipset Vendor/Software Vendor    Encoder/Decoder Standards Supported
Qualcomm                          Vp8, H.264
Samsung Exynos                    Vp8, H.264
MediaTek                          H.264
Google (software)                 Vp9, Vp8, H.264
Apple (iPhone)                    H.264

As can be seen, some devices may not support certain video coding standards, which in turn affects the selection of the encoders used in the encoding process 200 of FIG. 2. The receiving device is also taken into account, as it must be able to decode the received Layer 0 and Layer 1 streams. Examples of possible pairings of sending and receiving devices are provided in the following Tables 2-5. It is noted that if no native compatibility exists between two devices, a software encoder/decoder solution may be provided (identified as Damaka H.264 in the following tables). Listed standards may be in order of preference, but the order may change in some situations.










TABLE 2

Android Transmitter (Encoder)                         Android Receiver (Decoder)
Low Resolution                   Difference Image
Vp9, Vp8, H.264, Damaka H.264    Vp8, H.264           Vp9, Vp8, H.264, Damaka H.264

TABLE 3

Android Transmitter (Encoder)                         iPhone Receiver (Decoder)
Low Resolution                   Difference Image
Vp9, Vp8, H.264, Damaka H.264    Vp8, H.264           Hardware: H.264
                                                      Software: Vp9, Vp8

TABLE 4

iPhone Transmitter (Encoder)                          Android Receiver (Decoder)
Low Resolution                   Difference Image
H.264                            H.264                H.264

TABLE 5

iPhone Transmitter (Encoder)                          iPhone Receiver (Decoder)
Low Resolution                   Difference Image
H.264                            H.264                H.264

It is understood that many different combinations are possible and such combinations may change as new models of devices are introduced, as well as new or modified encoders and decoders. Accordingly, due to the flexibility provided by the encoding process described herein, the process may be applied relatively easily to currently unreleased combinations of hardware and software.
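The pairings in Tables 2-5 reduce to a simple selection rule. The rule sketched below (take the first entry in the sender's preference list that the receiver also supports, falling back to the software codec) is an assumption consistent with the tables, not a procedure the disclosure specifies:

```python
# Hypothetical codec selection from per-device preference lists, run
# once per layer since the two layers are encoded independently and
# may use different standards.

def pick_codec(sender_prefs, receiver_supported):
    """Return the first mutually supported codec, else the software fallback."""
    for codec in sender_prefs:
        if codec in receiver_supported:
            return codec
    return "Damaka H.264"  # software fallback when no native match exists

# Example from Table 3: Android low-resolution layer sent to an iPhone
# whose hardware decoder supports only H.264.
android_low_prefs = ["Vp9", "Vp8", "H.264"]
iphone_hw_decoders = ["H.264"]
chosen = pick_codec(android_low_prefs, iphone_hw_decoders)
print(chosen)  # H.264
```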


Generally, the process described herein encodes both lower resolution video frames and difference video frames independently. The type of encoder used for lower resolutions can be different from the type of encoder used for higher resolution. For example, Vp9 can be used for low resolution encoding, while Vp8 (which may have built-in support in current devices) can be used for high resolution encoding. The process on the receiving end uses independent decoding and the synchronized addition of images.


Referring to FIG. 4, a flowchart illustrates one embodiment of a method 400 that may be used by a device to encode and send video information. In step 402, an original video frame is acquired. In step 404, the original video frame is scaled down. In step 406, the scaled down video frame is encoded to produce a Layer 0 output. In step 408, the Layer 0 output is transmitted or stored. In step 410, the Layer 0 output is decoded. In step 412, a difference between the Layer 0 output and the original video frame is obtained. In step 414, the difference is encoded to produce a Layer 1 output. This encoding is independent of the encoding in step 406 and may use a different video coding standard. In step 416, the Layer 1 output is transmitted or stored.


Referring to FIG. 5, a flowchart illustrates one embodiment of a method 500 that may be used by a device to decode received video information. In step 502, a Layer 0 frame and a Layer 1 frame are obtained. In step 504, the Layer 0 and Layer 1 frames are decoded. In step 506, the decoded Layer 0 frame is scaled up to match the resolution of the decoded Layer 1 frame. In step 508, the scaled up Layer 0 frame and the Layer 1 frame are added to create an additive frame. In step 510, the additive frame is displayed.


Referring to FIG. 6, a flowchart illustrates one embodiment of a method 600 that may be used by a device to establish video parameters. In step 602, video parameters are established during communications with a server and/or another device. In step 604, encoding is performed based on the established parameters. In step 606, the Layer 0 output is sent, and the Layer 1 output is sent if needed.


Referring to FIG. 7, one embodiment of an environment 700 illustrates (from the perspective of the device 102) communication devices 102, 104, 112, and 702 interacting on a conference call via a server 108/STU 110. In the present example, each device 102, 104, 112, and 702 may have the ability to transmit at multiple resolutions and to receive multiple streams of video of different participants. Accordingly, the STU 110 includes logic to determine such factors as what resolution(s) each device should use to send its video to the server 108, how many video streams each device should receive from the server 108, and how many “small” videos and “large” videos should be sent to a device. In the present example, a “small” video uses only Layer 0 frames and a “large” video uses the recreated frames formed by adding the Layer 0 and Layer 1 frames. Accordingly, a device may be showing users in a grid (generally “small” videos) and/or may have one user in a spotlight (a “large” video). The STU 110 then selects and transmits the video streams as needed.
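One way the STU's per-viewer selection could be expressed is sketched below. The mapping of grid tiles to Layer 0 only and spotlight views to both layers follows the description above; the function and field names are assumptions:

```python
# Illustrative STU forwarding logic: a "small" grid tile needs only the
# Layer 0 stream, while a spotlighted "large" view needs both layers so
# the receiver can recreate the full-resolution frame.

def layers_to_forward(view_size):
    if view_size == "small":
        return ["Layer 0"]
    return ["Layer 0", "Layer 1"]

# A receiver showing one spotlighted participant and two grid tiles:
views = {"device_104": "large", "device_112": "small", "device_702": "small"}
plan = {device: layers_to_forward(size) for device, size in views.items()}
```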


Compared to a simulcast conference call model, the described process may provide all required video streams while using less bandwidth (e.g., approximately fifteen to thirty percent less). The process may, in some situations, cause an additional delay (e.g., thirty-three to eighty milliseconds). It is understood that these examples may vary based on a large number of factors and are for purposes of illustration only. Adjustments may be made, for example, by reducing the bit rate, changing the maximum resolution, sending only Layer 0 frames, and/or dropping the frame rate.


Referring to FIGS. 8A-8D, embodiments of an environment 800 are illustrated within which various aspects of the present disclosure may be practiced. In previous embodiments, as shown with respect to FIG. 8A, the server 108/STU 110 was generally managing multiple devices with each device performing the encoding and decoding operations needed for that device. This distribution of encoding/decoding may enable the STU 110 to handle more devices for a particular conference session (e.g., may provide more scalability) as the encoding and decoding processes are offloaded to each device, rather than being performed by the server 108/STU 110. FIG. 8A may also illustrate the storage of encoded data from the device 102 and then the forwarding of the encoded data to the device 104 for decoding. However, in FIGS. 8B-8D, the server 108/STU 110 may perform encoding and/or decoding steps when communicating with a device.


Referring to FIG. 8B, the device 102 may be streaming (or may have previously streamed) video data to the server 108. It is understood that the video stream may be processed by the server 108 without use of the STU 110 or may be managed by the STU 110. The video stream may be sent in encoded format (e.g., using the video scaling optimization process disclosed herein) as shown and the server 108/STU 110 decodes the stream. The server 108/STU 110 then encodes the data prior to sending the data to the device 104, which decodes the data. In the illustration of FIG. 8B, it is understood that encoding/decoding negotiations may occur between each device 102, 104 and the server 108/STU 110, or the server 108/STU 110 may use information from negotiations between the devices 102 and 104 for its encoding and decoding.


Referring to FIG. 8C, the device 102 may be streaming (or may have previously streamed) video data to the server 108. It is understood that the video stream may be processed by the server 108 without use of the STU 110 or may be managed by the STU 110. However, the video stream is not in encoded format (e.g., does not use the video scaling optimization process disclosed herein) as shown and the server 108/STU 110 does not need to decode the stream. The server 108/STU 110 then encodes the data prior to sending the data to the device 104, which decodes the data. In the illustration of FIG. 8C, it is understood that encoding/decoding negotiations may occur between the device 104 and the server 108/STU 110.


Referring to FIG. 8D, the device 102 may be streaming (or may have previously streamed) video data to the server 108. It is understood that the video stream may be processed by the server 108 without use of the STU 110 or may be managed by the STU 110. The video stream may be sent in encoded format (e.g., using the video scaling optimization process disclosed herein) as shown and the server 108/STU 110 decodes the stream. The server 108/STU 110 then sends the data to the device 104 without encoding, and the device 104 does not need to decode the data. In the illustration of FIG. 8D, it is understood that encoding/decoding negotiations may occur between the device 102 and the server 108/STU 110.


As an example scenario using server-side encoding and decoding, the device 102 may stream video data to the server 108 for storage. The device 102 then goes offline. During a later communication session, the server 108/STU 110 retrieves the stored data and provides it to the device 104. As the device 104 was not able to negotiate the encoding/decoding parameters with the device 102, the server 108/STU 110 may perform encoding/decoding in order to establish the parameters with the device 104. It is understood that this process may be used with live streaming video call data, as well as with stored data. It is further understood that this server-side encoding and decoding may occur with only some devices (e.g., the device 102 of FIG. 1C) on a conference call, with other devices (e.g., the devices 104 and 112 of FIG. 1C) being managed as shown in FIG. 8A. This enables the server 108/STU 110 to manage exceptions on a per device basis, while still offloading as much of the encoding/decoding to the remaining devices as possible.


Referring to FIG. 9, one embodiment of a computer system 900 is illustrated. The computer system 900 is one possible example of a system component or computing device such as a communication device or a server. The computer system 900 may include a controller (e.g., a central processing unit (“CPU”)) 902, a memory unit 904, an input/output (“I/O”) device 906, and a network interface 908. The components 902, 904, 906, and 908 are interconnected by a transport system (e.g., a bus) 910. A power supply (PS) 912 may provide power to components of the computer system 900, such as the CPU 902 and memory unit 904. It is understood that the computer system 900 may be differently configured and that each of the listed components may actually represent several different components. For example, the CPU 902 may actually represent a multi-processor or a distributed processing system; the memory unit 904 may include different levels of cache memory, main memory, hard disks, and remote storage locations; the I/O device 906 may include monitors, keyboards, and the like; and the network interface 908 may include one or more network cards providing one or more wired and/or wireless connections to a network 916. Therefore, a wide range of flexibility is anticipated in the configuration of the computer system 900.


The computer system 900 may use any operating system (or multiple operating systems), including various versions of operating systems provided by Microsoft (such as WINDOWS), Apple (such as iOS or Mac OS X), Google (Android), UNIX, and LINUX, and may include operating systems specifically developed for handheld devices, personal computers, and servers depending on the use of the computer system 900. The operating system, as well as other instructions (e.g., for the processes and message sequences described herein), may be stored in the memory unit 904 and executed by the processor 902. For example, if the computer system 900 is the server 108 or a communication device 102, 104, 112, or 702, the memory unit 904 may include instructions for performing some or all of the message sequences and methods described with respect to such devices in the present disclosure.


The network 916 may be a single network or may represent multiple networks, including networks of different types. For example, the server 108 or a communication device 102, 104, 112, or 702 may be coupled to a network that includes a cellular link coupled to a data packet network, or a data packet link such as a wireless local area network (WLAN) coupled to a data packet network. Accordingly, many different network types and configurations may be used to establish communications between the server 108, the communication devices 102, 104, 112, and 702, other servers, and/or other components described herein.


Exemplary network, system, and connection types include the internet, WiMax, local area networks (LANs) (e.g., IEEE 802.11a and 802.11g wi-fi networks), digital audio broadcasting systems (e.g., HD Radio, T-DMB and ISDB-TSB), terrestrial digital television systems (e.g., DVB-T, DVB-H, T-DMB and ISDB-T), WiMax wireless metropolitan area networks (MANs) (e.g., IEEE 802.16 networks), Mobile Broadband Wireless Access (MBWA) networks (e.g., IEEE 802.20 networks), Ultra Mobile Broadband (UMB) systems, Flash-OFDM cellular systems, and Ultra wideband (UWB) systems. Furthermore, the present disclosure may be used with communications systems such as Global System for Mobile communications (GSM) and/or code division multiple access (CDMA) communications systems. Connections to such networks may be wireless or may use a line (e.g., digital subscriber lines (DSL), cable lines, and fiber optic lines).


Communication among the server 108, communication devices 102, 104, 112, 702, servers, and/or other components described herein may be accomplished using predefined and publicly available (i.e., non-proprietary) communication standards or protocols (e.g., those defined by the Internet Engineering Task Force (IETF) or the International Telecommunication Union Telecommunication Standardization Sector (ITU-T)), and/or proprietary protocols. For example, signaling communications (e.g., session setup, management, and teardown) may use a protocol such as the Session Initiation Protocol (SIP), while data traffic may be communicated using a protocol such as the Real-time Transport Protocol (RTP), File Transfer Protocol (FTP), and/or Hyper-Text Transfer Protocol (HTTP). A sharing session and other communications as described herein may be connection-based (e.g., using a protocol such as the transmission control protocol/internet protocol (TCP/IP)) or connection-less (e.g., using a protocol such as the user datagram protocol (UDP)). It is understood that various types of communications may occur simultaneously, including, but not limited to, voice calls, instant messages, audio and video, emails, document sharing, and any other type of resource transfer, where a resource represents any digital data.


While the preceding description shows and describes one or more embodiments, it will be understood by those skilled in the art that various changes in form and detail may be made therein without departing from the spirit and scope of the present disclosure. For example, various steps illustrated within a particular sequence diagram or flow chart may be combined or further divided. In addition, steps described in one diagram or flow chart may be incorporated into another diagram or flow chart. Furthermore, the described functionality may be provided by hardware and/or software, and may be distributed or combined into a single platform. Additionally, functionality described in a particular example may be achieved in a manner different than that illustrated, but is still encompassed within the present disclosure. Therefore, the claims should be interpreted in a broad manner, consistent with the present disclosure.

Claims
  • 1. A method for optimizing video for transmission on a device based on the device's capabilities, the method comprising: capturing, by a camera associated with the device, an original video frame; scaling the original video frame down to a lower resolution video frame; encoding the lower resolution video frame using a first encoder to produce a first layer output; decoding the first layer output; upscaling the decoded first layer output to match a resolution of the original video frame; obtaining a difference between the upscaled decoded first layer output and the original video frame; and encoding the difference using a second encoder to create a second layer output, wherein the encoding to produce the second layer output occurs independently from the encoding to produce the first layer output.
  • 2. The method of claim 1 wherein the first and second encoders perform the encoding of the first and second output layers, respectively, using different video coding standards.
  • 3. The method of claim 1 wherein the first and second encoders perform the encoding of the first and second output layers, respectively, using identical video coding standards.
  • 4. The method of claim 1 further comprising communicating, by the device, with another device in order to determine which video coding standard is to be used to perform the encoding by each of the first and second encoders.
  • 5. The method of claim 1 further comprising sending the first and second output layers to another device during a video call.
  • 6. The method of claim 1 further comprising sending the first and second output layers to a storage device.
  • 7. A method for decoding video for display by a device, the method comprising: receiving an encoded first video frame and an encoded second video frame; independently decoding the encoded first and second video frames using a first decoder and a second decoder, respectively; upscaling the decoded first video frame to a resolution matching a resolution of the decoded second video frame; and adding the upscaled decoded first video frame and the decoded second video frame to create an additive video frame.
  • 8. The method of claim 7 wherein the first and second decoders perform the decoding of the encoded first and second video frames, respectively, using different video coding standards.
  • 9. The method of claim 7 wherein the first and second decoders perform the decoding of the encoded first and second video frames, respectively, using identical video coding standards.
  • 10. The method of claim 7 further comprising sending the additive video frame for display by the device.
  • 11. The method of claim 7 wherein receiving the encoded first video frame and the encoded second video frame includes retrieving the encoded first video frame and the encoded second video frame from a storage device.
  • 12. A device for sending and receiving optimized video frames, the device comprising: a processor; and a memory coupled to the processor, the memory having a plurality of instructions stored therein for execution by the processor, the plurality of instructions including instructions for scaling an original video frame down to a lower resolution video frame; encoding the lower resolution video frame using a first encoder to produce a first layer output; decoding the first layer output; upscaling the decoded first layer output to match a resolution of the original video frame; obtaining a difference between the upscaled decoded first layer output and the original video frame; and encoding the difference using a second encoder to create a second layer output, wherein the encoding to produce the second layer output occurs independently from the encoding to produce the first layer output.
  • 13. The device of claim 12 wherein the first and second encoders perform the encoding of the first and second output layers, respectively, using different video coding standards.
  • 14. The device of claim 12 wherein the first and second encoders perform the encoding of the first and second output layers, respectively, using identical video coding standards.
  • 15. The device of claim 12 wherein the instructions further include communicating with another device in order to determine which video coding standard is to be used to perform the encoding by each of the first and second encoders.
  • 16. The device of claim 12 wherein the instructions further include sending the first and second output layers to another device during a video call.
  • 17. The device of claim 12 wherein the instructions further include sending the first and second output layers to a storage device.
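The two-layer scheme recited in claims 1, 7, and 12 can be illustrated with a minimal sketch. This is not the patented implementation: the first and second encoders are stood in for by simple uniform quantizers (a real system would use video codecs, possibly different standards per claim 2), the scaling steps use 2x block averaging and nearest-neighbor upscaling, and all function names and step sizes below are hypothetical.

```python
import numpy as np

def downscale(frame: np.ndarray) -> np.ndarray:
    """Average 2x2 blocks to halve each dimension (claim 1: scaling down)."""
    h, w = frame.shape
    return frame.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def upscale(frame: np.ndarray) -> np.ndarray:
    """Nearest-neighbor 2x upscaling (claims 1 and 7: upscaling)."""
    return frame.repeat(2, axis=0).repeat(2, axis=1)

def encode(frame: np.ndarray, step: float) -> np.ndarray:
    """Stand-in for a video encoder: uniform quantization."""
    return np.round(frame / step).astype(np.int32)

def decode(code: np.ndarray, step: float) -> np.ndarray:
    """Stand-in for the matching decoder."""
    return code.astype(np.float64) * step

def encode_two_layers(original, base_step=8.0, enh_step=2.0):
    # First layer: encode the downscaled frame.
    layer1 = encode(downscale(original), base_step)
    # Reconstruct what a receiver would recover from the first layer alone.
    recon = upscale(decode(layer1, base_step))
    # Second layer: encode the residual, independently of the first encoder.
    layer2 = encode(original - recon, enh_step)
    return layer1, layer2

def decode_two_layers(layer1, layer2, base_step=8.0, enh_step=2.0):
    # Decode both layers independently, upscale the first, and add (claim 7).
    return upscale(decode(layer1, base_step)) + decode(layer2, enh_step)

frame = np.random.default_rng(0).uniform(0, 255, (8, 8))
l1, l2 = encode_two_layers(frame)
out = decode_two_layers(l1, l2)
# Additive reconstruction error is bounded by half the enhancement step.
assert np.max(np.abs(out - frame)) <= 1.0
```

Under these assumptions, the base layer alone yields a coarse low-resolution picture, while adding the independently coded residual layer recovers the full-resolution frame to within the enhancement quantizer's step, which is the capability-based trade-off the claims describe.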
CROSS-REFERENCE TO RELATED APPLICATIONS

This application claims the benefit of U.S. Provisional Application Ser. No. 63/192,051, filed on May 23, 2021, and entitled “SYSTEM AND METHOD FOR OPTIMIZING VIDEO COMMUNICATIONS BASED ON DEVICE CAPABILITIES,” which is hereby incorporated by reference in its entirety.

US Referenced Citations (150)
Number Name Date Kind
5442637 Nguyen Aug 1995 A
5612744 Lee Mar 1997 A
5761309 Ohashi et al. Jun 1998 A
5790637 Johnson et al. Aug 1998 A
5818447 Wolf et al. Oct 1998 A
5889762 Pajuvirta et al. Mar 1999 A
6031818 Lo et al. Feb 2000 A
6041078 Rao Mar 2000 A
6128283 Sabaa et al. Oct 2000 A
6141687 Blair Oct 2000 A
6161082 Goldberg et al. Dec 2000 A
6195694 Chen et al. Feb 2001 B1
6202084 Kumar et al. Mar 2001 B1
6219638 Padmanabhan et al. Apr 2001 B1
6298129 Culver et al. Oct 2001 B1
6311150 Ramaswamy et al. Oct 2001 B1
6343067 Drottar et al. Jan 2002 B1
6360196 Poznanski et al. Mar 2002 B1
6389016 Sabaa et al. May 2002 B1
6438376 Elliott et al. Aug 2002 B1
6473425 Bellaton et al. Oct 2002 B1
6574668 Gubbi et al. Jun 2003 B1
6606112 Falco Aug 2003 B1
6654420 Snook Nov 2003 B1
6674904 McQueen Jan 2004 B1
6741691 Ritter et al. May 2004 B1
6754181 Elliott et al. Jun 2004 B1
6766373 Beadle et al. Jul 2004 B1
6826613 Wang et al. Nov 2004 B1
6836765 Sussman Dec 2004 B1
6842460 Olkkonen et al. Jan 2005 B1
6850769 Grob et al. Feb 2005 B2
6898413 Yip et al. May 2005 B2
6912278 Hamilton Jun 2005 B1
6940826 Simard et al. Sep 2005 B1
6963555 Brenner et al. Nov 2005 B1
6975718 Pearce et al. Dec 2005 B1
6987756 Ravindranath et al. Jan 2006 B1
6999575 Sheinbein Feb 2006 B1
6999932 Zhou Feb 2006 B1
7006508 Bondy et al. Feb 2006 B2
7010109 Gritzer et al. Mar 2006 B2
7013155 Ruf et al. Mar 2006 B1
7079529 Khuc Jul 2006 B1
7080158 Squire Jul 2006 B1
7092385 Gallant et al. Aug 2006 B2
7117526 Short Oct 2006 B1
7123710 Ravishankar Oct 2006 B2
7184415 Chaney et al. Feb 2007 B2
7185114 Hariharasubrahmanian Feb 2007 B1
7272377 Cox et al. Sep 2007 B2
7302496 Metzger Nov 2007 B1
7304985 Sojka et al. Dec 2007 B2
7345999 Su et al. Mar 2008 B2
7346044 Chou et al. Mar 2008 B1
7353252 Yang et al. Apr 2008 B1
7353255 Acharya et al. Apr 2008 B2
7412374 Seiler et al. Aug 2008 B1
7457279 Scott et al. Nov 2008 B1
7477282 Firestone et al. Jan 2009 B2
7487248 Moran et al. Feb 2009 B2
7512652 Appelman et al. Mar 2009 B1
7542472 Gerendai et al. Jun 2009 B1
7546334 Redlich Jun 2009 B2
7564843 Manjunatha et al. Jul 2009 B2
7570743 Barclay et al. Aug 2009 B2
7574523 Traversat et al. Aug 2009 B2
7590758 Takeda et al. Sep 2009 B2
7613171 Zehavi et al. Nov 2009 B2
7623476 Ravikumar et al. Nov 2009 B2
7623516 Chaturvedi et al. Nov 2009 B2
7656870 Ravikumar et al. Feb 2010 B2
7664495 Bonner et al. Feb 2010 B1
7769881 Matsubara et al. Aug 2010 B2
7774495 Pabla et al. Aug 2010 B2
7778187 Chaturvedi et al. Aug 2010 B2
7782866 Walsh et al. Aug 2010 B1
7917584 Arthursson Mar 2011 B2
8009586 Chaturvedi et al. Aug 2011 B2
8065418 Abuan et al. Nov 2011 B1
8135232 Kimura Mar 2012 B2
8200796 Margulis Jun 2012 B1
8402551 Lee Mar 2013 B2
8407314 Chaturvedi et al. Mar 2013 B2
8407576 Yin et al. Mar 2013 B1
8447117 Liao May 2013 B2
8560642 Pantos et al. Oct 2013 B2
8611540 Chaturvedi et al. Dec 2013 B2
8990877 Hart Mar 2015 B2
9143489 Chaturvedi et al. Sep 2015 B2
9356997 Chaturvedi et al. May 2016 B2
9742846 Chaturvedi et al. Aug 2017 B2
10091258 Carter et al. Oct 2018 B2
10097638 Chaturvedi et al. Oct 2018 B2
10147202 Nystad Dec 2018 B2
10834256 Nair et al. Nov 2020 B1
10887549 Wehrung et al. Jan 2021 B1
10924709 Faulkner et al. Feb 2021 B1
11315158 Lidster et al. Apr 2022 B1
20020112181 Smith Aug 2002 A1
20030036886 Stone Feb 2003 A1
20030164853 Zhu et al. Sep 2003 A1
20040091151 Jin May 2004 A1
20040141005 Banatwala et al. Jul 2004 A1
20050071678 Lee et al. Mar 2005 A1
20050138110 Redlich Jun 2005 A1
20050147212 Benco et al. Jul 2005 A1
20050193311 Das Sep 2005 A1
20060195519 Slater et al. Aug 2006 A1
20060233163 Celi et al. Oct 2006 A1
20070003044 Liang et al. Jan 2007 A1
20080005666 Sefton Jan 2008 A1
20080037753 Hofmann Feb 2008 A1
20080163378 Lee Jul 2008 A1
20090178019 Bahrs Jul 2009 A1
20090178144 Redlich Jul 2009 A1
20090254572 Redlich Oct 2009 A1
20090282251 Cook et al. Nov 2009 A1
20100005179 Dickson Jan 2010 A1
20100064344 Wang Mar 2010 A1
20100158402 Nagase Jun 2010 A1
20100202511 Shin Aug 2010 A1
20100250497 Redlich Sep 2010 A1
20100299529 Fielder Nov 2010 A1
20110044211 Long et al. Feb 2011 A1
20110110603 Ikai May 2011 A1
20110129156 Liao Jun 2011 A1
20110145687 Grigsby et al. Jun 2011 A1
20110164824 Kimura Jul 2011 A1
20120030733 Andrews Feb 2012 A1
20120064976 Gault et al. Mar 2012 A1
20120173971 Sefton Jul 2012 A1
20120252407 Poltorak Oct 2012 A1
20120321083 Phadke Dec 2012 A1
20130051476 Morris Feb 2013 A1
20130063241 Simon Mar 2013 A1
20130091290 Hirokawa Apr 2013 A1
20140096036 Mohler Apr 2014 A1
20140185801 Wang Jul 2014 A1
20150295777 Cholkar et al. Oct 2015 A1
20160057391 Block et al. Feb 2016 A1
20160234264 Coffman et al. Aug 2016 A1
20170249394 Loeb et al. Aug 2017 A1
20180176508 Pell Jun 2018 A1
20190273767 Nelson et al. Sep 2019 A1
20200274965 Ravichandran Aug 2020 A1
20200301647 Yoshida Sep 2020 A1
20200382618 Faulkner et al. Dec 2020 A1
20210099574 Nair et al. Apr 2021 A1
20220086197 Lohita et al. Mar 2022 A1
Foreign Referenced Citations (16)
Number Date Country
1603339 Dec 2005 EP
1638275 Mar 2006 EP
1848163 Oct 2007 EP
1988698 Nov 2008 EP
1404082 Oct 2012 EP
1988697 Feb 2018 EP
2005094600 Apr 2005 JP
2005227592 Aug 2005 JP
2007043598 Feb 2007 JP
20050030548 Mar 2005 KR
03079635 Sep 2003 WO
2005009019 Jan 2005 WO
2004063843 Mar 2005 WO
2006064047 Jun 2006 WO
2006075677 Jul 2006 WO
2008099420 Dec 2008 WO
Non-Patent Literature Citations (28)
Entry
Balamurugan Karpagavinayagam et al. (Monitoring Architecture for Lawful Interception in VoIP Networks, ICIMP 2007, Aug. 24, 2008).
Blanchet et al.; “IPv6 Tunnel Broker with the Tunnel Setup Protocol (TSP)”; May 6, 2008; IETF; IETF draft of RFC 5572, draft-blanchet-v6ops-tunnelbroker-tsp-04; pp. 1-33.
Chathapuram, “Security in Peer-To-Peer Networks”, Aug. 8, 2001, XP002251813.
Cooper et al; “NAT Traversal for dSIP”; Feb. 25, 2007; IETF; IETF draft draft-matthews-p2psip-dsip-nat-traversal-00; pp. 1-23.
Cooper et al; “The Effect of NATs on P2PSIP Overlay Architecture”; IETF; IETF draft draft-matthews-p2psip-nats-and-overlays-01.txt; pp. 1-20.
Dunigan, Tom, “Almost TCP over UDP (atou),” last modified Jan. 12, 2004; retrieved on Jan. 18, 2011 from 18 pgs.
Hao Wang, Skype VoIP service-architecture and comparison, In: INFOTECH Seminar Advanced Communication Services (ASC), 2005, pp. 4, 7, 8.
Isaacs, Ellen et al., “Hubbub: A sound-enhanced mobile instant messenger that supports awareness and opportunistic interactions,” Proceedings of the SIGCHI Conference On Human Factors in Computing Systems; vol. 4, Issue No. 1; Minneapolis, Minnesota; Apr. 20-25, 2002; pp. 179-186.
J. Rosenberg et al., SIP: Session Initiation Protocol (Jun. 2008) retrieved at http://tools.ietf.org/html/rfc3261. Relevant pages provided.
J. Rosenberg et al. “Session Traversal Utilities for NAT (STUN)”, draft-ietf-behave-rfc3489bis-06, Mar. 5, 2007.
Jeff Tyson, “How Instant Messaging Works”, www.verizon.com/learningcenter, Mar. 9, 2005.
Mahy et al., The Session Initiation Protocol (SIP) “Replaces” Header, Sep. 2004, RFC 3891, pp. 1-16.
NiceLog User's Manual 385A0114-08 Rev. A2, Mar. 2004.
Pejman Khadivi, Terence D. Todd and Dongmei Zhao, “Handoff trigger nodes for hybrid IEEE 802.11 WLAN/cellular networks,” Proc. Of IEEE International Conference on Quality of Service in Heterogeneous Wired/Wireless Networks, pp. 164-170, Oct. 18, 2004.
Philippe Bazot et al., Developing SIP and IP Multimedia Subsystem (IMS) Applications (Feb. 5, 2007) retrieved at redbooks IBM form No. SG24-7255-00. Relevant pages provided.
Qian Zhang; Chuanxiong Guo; Zihua Guo; Wenwu Zhu, “Efficient mobility management for vertical handoff between WWAN and WLAN,” Communications Magazine, IEEE, vol. 41, issue 11, Nov. 2003, pp. 102-108.
RFC 5694 (“Peer-to-Peer (P2P) Architecture: Definition, Taxonomies, Examples, and Applicability”, Nov. 2009).
Rory Bland, et al,“P2P Routing” Mar. 2002.
Rosenberg, “STUN—Simple Traversal of UDP Through NAT”, Sep. 2002, XP015005058.
Rosenberg, J.; “Interactive Connectivity Establishment (ICE): A Protocol for Network Address Translator (NAT) Traversal for Offer/Answer Protocols”; Oct. 29, 2007; IETF; IETF draft of RFC 5245, draft-ietf-mmusic-ice-19; pp. 1-120.
Salman A. Baset, et al, “An Analysis Of The Skype Peer-To-Peer Internet Telephony Protocol”, Department of Computer Science, Columbia University, New York, NY, USA, Sep. 15, 2004.
Seta, N.; Miyajima, H.; Zhang, L.; Fujii, T., “All-SIP Mobility: Session Continuity on Handover in Heterogeneous Access Environment,” Vehicular Technology Conference, 2007. VTC 2007—Spring. IEEE 65th, Apr. 22-25, 2007, pp. 1121-1126.
Singh et al., “Peer-to Peer Internet Telephony Using SIP”, Department of Computer Science, Columbia University, Oct. 31, 2004, XP-002336408.
Sinha, S. and Oglieski, A., A TCP Tutorial, Nov. 1998 (Date posted on Internet: Apr. 19, 2001) [Retrieved from the Internet ].
Srisuresh et al.; “State of Peer-to-Peer (P2P) Communication Across Network Address Translators (NATs)”; Nov. 19, 2007; IETF; IETF draft for RFC 5128, draft-ietf-behave-p2p-state-06.txt; pp. 1-33.
T. Dierks & E. Rescorla, The Transport Layer Security (TLS) Protocol (Ver. 1.2, Aug. 2008) retrieved at http://tools.ietf.org/html/rfc5246. Relevant pages provided.
Wireless Application Protocol—Wireless Transport Layer Security Specification, Version 18—Feb. 2000, Wireless Application Forum, Ltd. 2000; 99 pages.
WISPA: Wireless Internet Service Providers Association; WISPA-CS-IPNA-2.0; May 1, 2009.
Provisional Applications (1)
Number Date Country
63192051 May 2021 US