1. Field of the Invention
This invention relates to communications, and more particularly, to a communication apparatus and method for transmitting media between nodes using either a network efficient transmission protocol or a loss tolerant transmission protocol, depending on (i) the condition on the network and (ii) whether the transmitted media is being consumed in real-time or in a time-shifted mode.
2. Description of Related Art
The Transmission Control Protocol or “TCP” is an example of a network efficient protocol. TCP guarantees the delivery of transmitted data between a sender and a recipient at the expense of speed. For this reason, TCP is currently the most common delivery protocol used on the Internet. A feature called “flow control” is the main reason why TCP is able to guarantee the delivery of media. Flow control determines when data needs to be re-sent and stops the flow of data until previous packets are successfully transferred. For example, when a recipient receives a defective packet or a packet is not received (i.e., a missing packet), a request for retransmission of the defective and/or missing packet is made and flow of subsequent packets is stopped until the retransmission request is satisfied. The guaranteed delivery feature of TCP is beneficial for certain applications, such as the transfer of the content of web pages, files or database information. The possibility that the flow of data may be stopped, however, makes TCP less than ideal for delivery of time critical media, such as streaming voice or video.
The User Datagram Protocol or “UDP” is an example of a loss tolerant protocol, commonly used on the Internet for streaming audio, video and other time-based media (i.e., media that changes over time). UDP is mainly concerned with the delivery of the most recently available media, as opposed to quality. To achieve the necessary delivery rate for streaming audio or video, there is no form of flow control or error correction with UDP. Without any mechanism for guaranteeing delivery, packets may be received out of order, defective, or lost altogether, possibly resulting in reduced quality of the media delivered to the recipient. As a result, when the conditions on the network are poor, media may be delivered at a rate sufficient for real-time consumption using UDP when TCP would otherwise be inadequate.
Conventional communication systems are typically either real-time or time-shifted. Consequently, a protocol like UDP, which is optimized for “real-time” delivery, is typically used for real-time systems, while a protocol like TCP, which is optimized for reliable delivery, is used for time-shifted systems.
The invention relates to a method and apparatus for transmitting voice media over a network where the voice media may be consumed either in a real-time mode or a time-shifted mode. The method comprises transmitting the voice media over the network using a network efficient protocol when either (i) the media is not being consumed in the real-time mode or (ii) the condition on the network is good enough to support the real-time transmission and consumption of the voice media in the real-time mode. Alternatively, the voice media is transmitted using a loss tolerant transmission protocol when the media is being consumed in the real-time mode and the condition on the network is sufficiently poor to prevent the real-time consumption of the voice media using the network efficient protocol. The apparatus, which may be a communication device or a server, implements the above-described method.
The invention may best be understood by reference to the following description taken in conjunction with the accompanying drawings, which illustrate specific embodiments of the invention.
It should be noted that like reference numbers refer to like elements in the figures.
The invention will now be described in detail with reference to various embodiments thereof as illustrated in the accompanying drawings. In the following description, specific details are set forth in order to provide a thorough understanding of the invention. It will be apparent, however, to one skilled in the art, that the invention may be practiced without using some of the implementation details set forth herein. It should also be understood that well known operations have not been described in detail in order to not unnecessarily obscure the invention.
The term “media” as used herein is intended to broadly mean virtually any type of media, such as but not limited to, voice, video, text, still pictures, sensor data, GPS data, or just about any other type of media, data or information.
As used herein, the term “conversation” is also broadly construed. In one embodiment, a conversation is intended to mean a thread of messages, strung together by some common attribute, such as a subject matter or topic, by name, by participants, by a user group, or some other defined criteria. In another embodiment, the messages of a conversation do not necessarily have to be tied together by some common attribute. Rather one or more messages may be arbitrarily assembled into a conversation. Thus a conversation is intended to mean two or more messages, regardless if they are tied together by a common attribute or not.
Referring to
The communication services network 12 is IP based and layered over one or more communication networks (not illustrated), such as the Public Switched Telephone Network (PSTN), a cellular network based on CDMA or GSM for example, the Internet, an intranet or private communication network, a tactical radio network, a satellite radio network, a first responder network, or any other communication network, or a combination thereof. In various embodiments, the communication services network 12 is either heterogeneous or homogeneous.
One or more legacy communication devices 26 are connected to the circuit switched network 14 through either wired or wireless connection(s) 28, as is well known in the art. In various embodiments, the legacy device(s) may include conventional land-line telephones, mobile or cellular phones, PTT radios, satellite phones or radios, desktop or mobile computers, or any combination thereof.
One or more client application 30 enabled communication devices 32 are coupled to the communication services network 12 through an IP-based network connection 34. Depending on the type of communication device 32, the connection 34 may be wired (e.g., Ethernet) or wireless (e.g., Wi-Fi, PTT, radio, satellite, CDMA, GSM, etc.). In various embodiments, the client application 30 enabled communication devices 32 may be any type of telephone, including land-line, cellular or mobile phones, a PTT radio, a satellite based communication device, any type of computer, including but not limited to desktop, laptop, and note pad computers, or any other type of wired or wireless communication device.
The client application 30 is a messaging application that operates in a time-shifted mode and a real-time mode, and provides the ability to seamlessly transition between the two modes. With the client application 30, both inbound and outgoing media is simultaneously and progressively stored as it is either (i) received over a network connection 34 at the communication device 32 or (ii) created on the communication device 32 and transmitted over the network connection 34. The storage of the media allows the participants to converse in a time-shifted mode, providing a user experience similar to conventional messaging systems (e.g., email or voice Short Messaging Service (SMS)). The simultaneous and progressive nature of the application
The simultaneous and progressive storage of both transmitted media as it is being created and received media as it is being received further enables a host of rendering options. Such rendering options include, but are not limited to: the real-time rendering of media as the media is received over the network connection 34, pause, replay, play faster, play slower, jump backward, jump forward, catch up to the most recently received media, Catch up to Live (CTL), or jump to the most recently received media. As described in more detail below, the storage of media and certain rendering options allow the participants of a conversation to seamlessly transition the conversation from the time-shifted mode to the real-time mode and vice versa. In addition, the client application 30 is capable of supporting multiple types of media, including but not limited to, voice, video, text, still pictures, sensor data, GPS data, or just about any other type of media, data or information.
Referring to
The MCMS module 40 includes a number of modules and services for creating, managing and conducting multiple conversations. The MCMS module 40 includes a user interface module 40A for enabling the user to interface with and control the audio and video rendering and creating functions on the device 32, a rendering/encoding module 40B for performing rendering and encoding tasks, a contacts service 40C for managing and maintaining information needed for creating and maintaining contact lists (e.g., telephone numbers and/or email addresses), and a presence status service 40D for sharing both the online status of the user of the device 32 and the online status of the other users on the communication services network 12. The MCMS database 40E stores and manages the meta data for messages and conversations conducted using the application 30 running on a device 32 as well as contact and presence status information. In alternative embodiments, the MCMS database 40E may include, but is not limited to, relational databases, file-based databases, object databases, document-oriented databases, or any other type of database and/or database management system that is capable of storing and retrieving data.
The Store and Stream module 42 includes a Permanent Infinite Memory Buffer or PIMB 46 for storing, in an indexed format, the media of received and sent messages. The store and stream module 42 also includes an encode-receive module 42A, net receive module 42B, transmit module 42C and a render module 42D. The encode-receive module 42A performs the function of receiving, encoding, indexing and storing in a time-indexed format in the PIMB 46 media created using the client application 30 on device 32. The net receive module 42B performs the function of indexing and storing in the time-indexed format in the PIMB 46 the media contained in messages received from other devices 32 or 26 through a gateway client 16. The transmit module 42C is responsible for both storing in the PIMB and transmitting to recipients the media of messages created using the device 32. The render module 42D enables the client application 30 to render on device 32 the media of messages, either in the near real-time mode as media is received over the network 12 or in the time-shifted mode by retrieving and rendering the media stored in the PIMB 46.
The MCMS module 40 and the Store and Stream module 42 also each communicate with various hardware components 48 provided on the device 32, including, but not limited to, encoder/decoder hardware 48A, network interface 48B for connecting the device 32 to network connection 34, and media drivers 48C. The encoder/decoder hardware 48A is provided for encoding the media, such as voice, text, video or sensor data, generated by a microphone, camera, keyboard, touch-sensitive display, GPS, sensor, etc. provided on or associated with the device 32 and decoding similar media before it is rendered on the device 32. The media drivers 48C are provided for driving the media generating components, such as a speaker and/or a display (not illustrated), after the media has been decoded. The network interface 48B is provided for connecting the device 32 to a network connection 34, either through a wireless or wired connection. Although not illustrated, the client application 30 runs on or is executed by an underlying processor embedded in device 32, such as a microprocessor or microcontroller.
The transmitted and received media stored in the PIMB 46 is persistently stored. The term persistent storage as used herein is intended to have broad meaning. In various embodiments, persistent storage is intended to mean the storage of media and meta data from just beyond transient storage needed to either transmit or render media in real-time to storage for an indefinite period of time. The term persistent storage therefore may have different meanings, depending on specific implementations or embodiments.
In addition, “real-time” is intended to mean the consumption or rendering of time-based media (i.e., media that changes over time) as the media is being transmitted, regardless if the media is “live” or not. The real-time consumption of “live” media is intended to mean the rendering of time-based media as the media is being created and transmitted. The real-time consumption of non-live media is intended to mean the consumption of previously recorded time-based media that is being transmitted out of storage.
Referring to
(i) The Receipt of Media: Media received from the communication services network 12 is simultaneously and progressively stored in the PIMB 46 by the net receive module 42B as the media is received over the network connection 34, as designated by arrow 50, regardless if the media is to be rendered in real-time or in the time-shifted mode. When in the real-time mode, the media is also simultaneously and progressively provided by the net receive module 42B, as designated by arrow 52, to the render module 42D. In response, the render module 42D simultaneously and progressively renders the media as it is received over connection 34 on a media-rendering device (e.g., a speaker and/or display). In the time-shifted mode, the media is retrieved by the render module 42D, as designated by arrow 54, from the PIMB 46 at an arbitrary time after it was persistently stored. The retrieved media is then rendered on the media rendering devices, such as a speaker and/or display. In this manner, the recipient of the media may review persistently stored media at any time after storage in the time-shifted mode.
(ii) Transmitting Media: Media created on device 32 by a media creating device (e.g. a microphone, keyboard, video and/or still camera, a sensor such as a thermometer or GPS, or any combination thereof) is progressively stored in the PIMB 46 in a time-indexed format as it is created, as designated by arrow 58. In most situations, the media is also provided, as designated by arrow 56, to the transmit module 42C, which simultaneously and progressively transmits the media as it is created. In other situations, media may be transmitted by transmit module 42C out of the PIMB 46 at some arbitrary time after it was created, as designated by arrow 60. Transmissions out of the PIMB 46 typically occur when media is created when the device 32 is disconnected from the network 12. When the device 32 reconnects, the media is read from the PIMB 46 and transmitted by the transmit module 42C.
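The receive path in (i) can be illustrated with a minimal sketch. It assumes media arrives as chunks tagged with a millisecond offset; the class `TimeIndexedStore`, the function `receive` and the chunk format are hypothetical names chosen for illustration and are not part of the described PIMB 46. Every chunk is stored as it arrives, and is additionally handed to a renderer only in the real-time mode.

```python
from bisect import bisect_left
from dataclasses import dataclass, field


@dataclass
class TimeIndexedStore:
    """Hypothetical stand-in for a time-indexed media store."""
    _offsets: list = field(default_factory=list)  # sorted time offsets (ms)
    _chunks: dict = field(default_factory=dict)   # offset -> media bytes

    def store(self, offset_ms: int, chunk: bytes) -> None:
        """Store one chunk, keeping the offset index sorted."""
        if offset_ms not in self._chunks:
            pos = bisect_left(self._offsets, offset_ms)
            self._offsets.insert(pos, offset_ms)
        self._chunks[offset_ms] = chunk

    def read_from(self, start_ms: int):
        """Yield (offset, chunk) pairs at or after start_ms, in time order."""
        start = bisect_left(self._offsets, start_ms)
        for offset in self._offsets[start:]:
            yield offset, self._chunks[offset]


def receive(store, chunk_stream, render=None):
    """Store every incoming chunk; also render it when in real-time mode."""
    for offset_ms, chunk in chunk_stream:
        store.store(offset_ms, chunk)   # always stored (arrow 50)
        if render is not None:          # real-time mode only (arrow 52)
            render(chunk)


# Real-time review: chunks are rendered as they arrive and stored as well.
pimb = TimeIndexedStore()
played = []
receive(pimb, [(0, b"he"), (20, b"ll"), (40, b"o")], render=played.append)

# Time-shifted review: read back from the store at a later time (arrow 54).
later = [chunk for _, chunk in pimb.read_from(20)]
```

Because storage does not depend on whether a renderer is attached, the same stored data serves both modes, which is what makes switching between them possible in this sketch.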
As a clarification, the media creating devices (e.g., microphone, camera, keyboard, etc.) and media rendering devices (e.g., speaker and display) as illustrated in
Referring to
The routers 20 communicate with other routers 20, with header stores 62 for read and/or write operations, and with body stores 64 for read and/or write operations. Routers 20 are further responsible for updating routing tables and maintaining the presence status information of users on the network 12. Routers 20 also perform a number of security functions, including authentication, encryption, and authorization.
In a non-exclusive embodiment, the one or more servers 18 on the network 12 are highly configurable and scalable. For example, if a large number of users subscribe to the services provided by the network 12, then a large number of routers 20 may be needed. If a server 18 routes a high volume of traffic, but the messages tend to be relatively short in duration (e.g., contain minimal media), then the number of header stores 62 may be increased relative to the number of body stores 64. Alternatively, if the traffic handled by a server tends to have large amounts of media (i.e., the messages are long in duration), then more body stores 64 may be needed. Further, the number of servers 18 included on the network may be increased or reduced as needed.
On the network 12, each of the server(s) 18 subscribes to all of the header and body data for a given user and/or group of users. As a result, if a server 18 that holds the header and/or body information for a user becomes unavailable, a router 20 may be able to locate another server 18 to obtain the data. In other embodiments, one or more users, servers or any other entity may subscribe based on the domain of user(s), defined sets of users, media type, codec type used to encode the media, conversation name, conversation subject or topic, time range, or any other type of defined criteria.
Client application 30 enabled devices 32 communicate with one another using individual message units, referred to herein as “Vox messages”. By sending Vox messages back and forth over the communication services network 12, the users of the devices 32 may communicate with one another, either in the real-time mode or in a time-shifted messaging mode, and with the ability to seamlessly transition between the two modes.
There are two types of Vox messages: (i) messages that do not contain media and (ii) messages that do contain media. Vox messages that do not contain media are generally used for message meta data, such as media headers and descriptors, contacts information (telephone numbers or email addresses), presence status information, etc. Message meta data includes such attributes as a message identifier or ID, the identification of the message originator, a recipient list, and a message subject. The identifier information may be used for a variety of reasons, including, but not limited to, building contact lists, associating media with messages, and/or associating messages with conversations. The Vox messages that contain media are used for the transport of media. In one embodiment, messages containing text media may include both meta data and the text media, whereas messages containing time-based media, such as voice or video, do not contain meta data.
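The two message types can be pictured as a small data structure. The following is only a sketch: the class names, field names and the `is_valid` check are assumptions made for illustration, since the text above lists the meta data attributes but does not define a message layout.

```python
from dataclasses import dataclass
from typing import Optional

TIME_BASED = {"voice", "video"}  # time-based media types named in the text


@dataclass
class VoxMetadata:
    """Meta data attributes listed above (field names are hypothetical)."""
    message_id: str
    originator: str
    recipients: list
    subject: str = ""


@dataclass
class VoxMessage:
    """A message carrying meta data, media, or (for text) possibly both."""
    metadata: Optional[VoxMetadata] = None
    media_type: Optional[str] = None   # None means "no media"
    payload: bytes = b""

    @property
    def contains_media(self) -> bool:
        return self.media_type is not None


def is_valid(msg: VoxMessage) -> bool:
    """Check the rules of the one embodiment described above."""
    if msg.media_type in TIME_BASED:
        return msg.metadata is None      # voice/video carry no meta data
    if msg.media_type is None:
        return msg.metadata is not None  # a non-media message is meta data
    return True                          # text may carry both


header = VoxMessage(metadata=VoxMetadata("m1", "alice", ["bob"], "status"))
voice = VoxMessage(media_type="voice", payload=b"\x01\x02")
```

Here `header` is a type (i) message and `voice` is a type (ii) message; a voice message that also carried meta data would fail `is_valid` under this embodiment.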
In one embodiment, Vox messages are layered on top of the application layer of whatever transport protocol or protocols are used on the underlying network infrastructure below the network 12. As a result, a new transport protocol for Vox messages is not needed. Rather, Vox messages are transmitted and routed across the network 12 using current transport protocols running over the existing telecommunications infrastructure.
The presence status information contained in Vox messages may be used to identify the users that are currently authenticated by the system 10 and/or if a given user is reviewing a message in real-time or not. The presence data is therefore useful in determining, in certain embodiments, how messages are delivered across the network 12. In situations where the presence status indicates an authenticated user is reviewing a message in real-time for example, then a transport protocol optimized for timely (i.e., real-time) delivery, such as UDP, may be used, whereas a transport protocol optimized for efficient delivery of messages, such as TCP, may be used when the presence status indicates the authenticated user is not reviewing the message in real-time.
Referring to
In the initial decision 82, a transmission loop is defined at the sending node and the sending node determines if there is any media available for transmission. If not, step 82 is continually repeated until media becomes available for transmission. When media is available, it is next determined (decision 84) if the media is or will be consumed in real-time based on the presence information of the recipient(s). If the presence information of all the recipient(s) indicates none are reviewing in real-time, then the media is transmitted using a network efficient protocol (step 86). If one or more of the recipients is reviewing the media in real-time, then the transmitting node determines if the condition on the network (decision 88) is sufficient for transmitting the media at a rate sufficient to support real-time consumption using the network efficient protocol.
If the condition on the network is sufficient for supporting real-time communication using the network efficient protocol, then the transmitting node continues transmitting using the network efficient protocol (step 86). If the condition is not sufficient, however, then the transmitting node transmits the media using a loss tolerant protocol (step 90).
With media that is transmitted using the loss tolerant protocol, it is determined in decision 92 if the condition on the network is good enough (i.e., the rate of media loss is minimal) to support real-time communication. If the condition on the network is not sufficient, meaning real-time communication is not possible or practical using even the loss tolerant protocol, then the transmitting node stops using the loss tolerant protocol. Instead, the media is transmitted using the network efficient protocol (step 86) as the condition of the network permits. On the other hand, if the condition on the network is good enough to support real-time communication, then the media is transmitted using the loss tolerant protocol.
In decision 94, it is determined if the message has ended or if the recipient is no longer reviewing in the real-time mode. If neither condition is met, then another transmission loop is defined (decision 82) and the above process is repeated. When either condition is met, the media originally transmitted using the loss tolerant protocol is retransmitted when network conditions permit using the network efficient protocol, which guarantees the eventual delivery of a complete copy of the media as originally encoded. In most situations, the retransmission occurs when bandwidth on the network is available beyond what is needed to support real-time communication. In one embodiment, a complete copy of the media previously sent using the loss tolerant protocol is retransmitted. In an alternative embodiment, just the missing media is transmitted.
The aforementioned process is continually repeated while media is available for transmission. With each cycle, a transmission loop is defined and the above-described process is repeated.
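The decisions made in each transmission loop can be summarized in a short sketch. It assumes the three inputs are already available as booleans; how presence is read or how the network condition is measured is not modeled, and the names `Protocol`, `choose_protocol` and `run_loops` are hypothetical.

```python
from enum import Enum


class Protocol(Enum):
    NETWORK_EFFICIENT = "network_efficient"  # e.g., TCP
    LOSS_TOLERANT = "loss_tolerant"          # e.g., UDP


def choose_protocol(recipients_live, efficient_ok, loss_tolerant_ok):
    """Pick the protocol for one transmission loop.

    recipients_live:  one bool per recipient, True if reviewing in real-time.
    efficient_ok:     network supports real-time with the efficient protocol.
    loss_tolerant_ok: network supports real-time with the loss tolerant one.
    """
    if not any(recipients_live):        # decision 84: nobody is live
        return Protocol.NETWORK_EFFICIENT
    if efficient_ok:                    # decision 88: network is good enough
        return Protocol.NETWORK_EFFICIENT
    if loss_tolerant_ok:                # decision 92: lossy real-time works
        return Protocol.LOSS_TOLERANT
    return Protocol.NETWORK_EFFICIENT   # real-time not practical: step 86


def run_loops(loops):
    """Run several loops; return the send log and chunks sent lossy.

    Each loop is (chunk_id, recipients_live, efficient_ok, loss_tolerant_ok).
    Chunks sent with the loss tolerant protocol are remembered so that they
    can be retransmitted later with the network efficient protocol.
    """
    sent, pending_retransmit = [], []
    for chunk_id, live, eff_ok, lt_ok in loops:
        protocol = choose_protocol(live, eff_ok, lt_ok)
        sent.append((chunk_id, protocol))
        if protocol is Protocol.LOSS_TOLERANT:
            pending_retransmit.append(chunk_id)
    return sent, pending_retransmit


log, pending = run_loops([
    (0, [False, False], True, True),   # no live recipient
    (1, [True, False], True, True),    # live, good network
    (2, [True, False], False, True),   # live, poor network
    (3, [True, False], False, False),  # live, network too poor for real-time
])
```

In this example only chunk 2 is sent with the loss tolerant protocol, so it is the only chunk left in `pending` for later retransmission.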
The transmission protocol as described above with respect to
In a specific embodiment using TCP and UDP, the transmission protocol as described above with respect to
The transmission protocol as described above with respect to
Referring to
The timing diagram is intended to illustrate the asynchronous nature of the transmission flow diagram illustrated in
The transmission of media between communication devices 32, as described above, occurs through one or more servers 18 on the network 12. In an alternative peer-to-peer embodiment, however, it would be possible for two communication devices 32 to communicate directly with one another. With this embodiment, the flow and timing diagrams of
It also should be noted that for the sake of simplicity, the transmission of messages as described above has been “one-way”. It should be understood that the transmission of media, using the network efficient and/or the loss tolerant protocol, is often bi-directional or “two-way”. With two-way communication in the real-time mode, the user experience is similar to a conventional full-duplex conversation. In addition, bi-directional communication can also take place between multiple parties (i.e., more than two), similar to a multi-party conference call.
By using either the network efficient or loss tolerant protocol, depending on if media is being consumed in real-time and/or on the condition of the network, transmissions are optimized for real-time when needed or for efficient delivery when real-time delivery is not critical or network conditions are sufficiently good to support real-time communication using the network efficient protocol. On the other hand, when the media is being reviewed in real-time and network conditions are poor, the loss tolerant protocol is typically used to extend or enhance the ability to conduct real-time communication. Since any missing media, or any media transmitted using the loss tolerant protocol, is eventually retransmitted using a network efficient protocol that guarantees delivery, the recipient eventually receives a complete copy (i.e., a full bit rate representation as the media was originally encoded). The full bit rate representation of the media typically replaces any missing, defective or out of order representations of media previously received and stored in the PIMB 46 of the receiving device 32. In this manner, the recipient may review the complete, full quality version of the media in the time-shifted mode at a later arbitrary time.
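The replacement of degraded media at the receiver can be sketched as follows. The sketch assumes chunks are identified by sequence numbers and that the receiver knows which chunks arrived intact; both are illustrative assumptions, as are the function names.

```python
def missing_sequence_numbers(intact, total):
    """Sequence numbers in range(total) that did not arrive intact."""
    return [seq for seq in range(total) if seq not in intact]


def merge_retransmission(stored, retransmitted):
    """Return a new store in which retransmitted chunks replace stored ones."""
    merged = dict(stored)
    merged.update(retransmitted)   # full-quality chunks overwrite lossy ones
    return merged


# Media as originally encoded at the sender (4 chunks).
original = {0: b"aa", 1: b"bb", 2: b"cc", 3: b"dd"}

# What the receiver stored after lossy delivery: chunk 1 and chunk 3 were
# lost, and chunk 2 arrived defective.
stored = {0: b"aa", 2: b"c?"}
intact = {0}

# "Just the missing media" embodiment: resend only what is not intact.
needed = missing_sequence_numbers(intact, total=4)
repaired = merge_retransmission(stored, {seq: original[seq] for seq in needed})

# "Complete copy" embodiment: resend everything and overwrite.
replaced = merge_retransmission(stored, original)
```

Both embodiments end with the same complete copy in the receiver's store; they differ only in how much media is sent over the network efficient protocol.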
In one embodiment, TCP is used as the network efficient protocol, while UDP is used as the loss tolerant protocol. In other embodiments, TCP is used as the network efficient protocol, while the Cooperative Transmission Protocol (CTP) described in U.S. application Ser. No. 12/192,890, incorporated by reference herein, is used as the loss tolerant protocol. In other embodiments, any network efficient or loss tolerant protocol may be used.
Although many of the components and processes are described above in the singular for convenience, it will be appreciated by one of skill in the art that multiple components and repeated processes can also be used to practice the techniques of the system and method described herein. Further, while the invention has been particularly shown and described with reference to specific embodiments thereof, it will be understood by those skilled in the art that changes in the form and details of the disclosed embodiments may be made without departing from the spirit or scope of the invention. For example, embodiments of the invention may be employed with a variety of components and should not be restricted to the ones mentioned above. It is therefore intended that the invention be interpreted to include all variations and equivalents that fall within the true spirit and scope of the invention.
This application claims the benefit of U.S. Provisional Patent Application No. 61/323,609, filed Apr. 13, 2010, entitled “Communication Services Network and Client Enabled Communication Devices,” which is incorporated herein by reference for all purposes.
Number | Date | Country
---|---|---
61323609 | Apr 2010 | US