The invention relates to the field of communications, particularly the transmission of media streams via a network that uses an Internet protocol.
A common problem when transmitting voice and/or video media over networks that use an Internet protocol is minimizing latency. Too long a delay between the sending of media by one user and its reception by another user degrades the quality of long-distance communication and slows exchanges. In the case of a live conversation, latency prevents the parties from conversing naturally as if they were face to face. Low quality in communication during a live conversation, even if temporary, immediately creates perceptible and detrimental interruptions.
In addition, establishing and maintaining real-time and continuous synchronization between two terminals involves continuously exchanging data. Such an exchange is expensive and consumes energy and computing power, which is particularly detrimental in a context of mobile terminals. Such problems exist in point-to-point communications (involving only two parties) and become critical in multipoint communications (or “conference” mode, involving more than two parties).
In the context of a network using an internet protocol, media exchanges, in particular voice and/or video, generally occur on simultaneous bidirectional channels, also called full-duplex. Equivalently, two simultaneous unidirectional channels (“simplex”) in opposite directions can be used. In order to communicate, the terminals must be equipped with compatible hardware and software, which is complex and expensive.
The Applicant makes use of solutions enabling several parties to communicate with each other via networks using an Internet protocol, particularly more than two parties concurrently. Each terminal is connected to a central unit by means of a bidirectional but alternating channel (not simultaneous), meaning a unidirectional channel in which the direction alternates. This is also called half-duplex communication, as opposed to full-duplex.
In a session, meaning a set of terminals wanting to communicate with one another, only one user can speak at a time. A central unit arbitrates who can speak by assigning speaking authorization to each of the users in turn who request it. The terminal of the user authorized to speak then operates in transmitter mode while the other terminals participating in the session operate in receiver mode.
From the point of view of the users, the operation is reminiscent of:
Half-duplex network communications using an internet protocol work similarly for sessions with two and with more than two parties. The Applicant then allowed a user, via a single terminal, to connect and participate in multiple different sessions at the same time. The terminal connects to multiple sessions by establishing as many half-duplex channels as there are sessions. The amount of data sent and received by the terminal is multiplied by the number of sessions, which can clutter the network, and for example saturate the local area network of the terminal. The volume of data exchanged increases even further in the case of full-duplex channels.
In addition, the usual hardware and software of each terminal are not intended to operate simultaneously in multiple sessions. For example, the security software bundled with the most common browsers requests authorization to access the microphone of the terminal each time a session is joined. Multiple sessions opened through a terminal browser require significant computing power for the terminal processor. At best, this results in slowing down the other software running on the terminal, and often in latencies and dropped connections. The user experience becomes unpleasant. The Applicant has furthermore noted that the other participants in the sessions receive a poor quality media stream, for example with choppy and/or distorted sound. On the network, packet losses and jitter effects are observed. In testing by the Applicant, such defects appear when a terminal participates in more than two simultaneous sessions, which is not satisfactory.
The invention improves the situation.
The Applicant proposes a method for transmitting media between a first terminal and at least one second terminal among a plurality of second terminals, the first terminal being identified with a plurality of session servers while each of the second terminals is identified with at least one session server among said plurality of session servers. Each of the session servers establishes communication channels from one terminal to at least one other terminal among the terminals identified with said session server. The method comprises:
Such a method makes it possible to implement a single media stream, alternating between transmission from the first terminal and reception at the first terminal. Thus, regardless of the number of sessions in which the user of the first terminal is participating, the media stream can be processed by the usual computer means of the first terminal, without needing to adapt the hardware or software. In transmission, the media stream is transmitted via a single channel to the proxy server, regardless of the final destination or final destinations. The proxy server then transmits the media stream to each of the session servers.
The proxy server may be remote from the first terminal, for example located in a part of the network arranged to support several media streams concurrently, while the first terminal may be located in a part of the network with more limited capabilities. For the first terminal, participation in a plurality of sessions is equivalent to participation in a single session. The security software only requests access to the microphone once, regardless of the number of sessions in which the user has speaking authorization. For example, the Applicant has not observed any quality deficiencies when the first terminal is connected to fifteen sessions concurrently.
In another aspect, the Applicant proposes a system for transmitting media between a first terminal and at least one second terminal among a plurality of second terminals, the first terminal being identified with a plurality of session servers while each of the second terminals is identified with at least one of said plurality of session servers. Each of the session servers establishes communication channels from a terminal to at least one other terminal among the terminals identified with said session server. The system comprises a proxy server arranged to:
In another aspect, the Applicant proposes a computer program comprising instructions for implementing the above method, when it is implemented by at least one processor of a proxy server of a system as defined herein.
The following features may optionally be implemented. They may be implemented independently of each other or in combination with each other:
This allows a single stream to be transmitted from the first terminal, while enabling the user of the first terminal to speak in one or more sessions at a time according to the user's choices. The workload of the first terminal remains substantially equivalent to what is required to speak in a single session, and the amount of data that must travel over the local area network of the first terminal is also substantially equivalent to what is used to speak in a single session.
This allows the user of the first terminal to remain connected to multiple sessions at a time while receiving a single stream at the first terminal that is selected according to the user's choices. The workload of the first terminal remains substantially equivalent to what is required to listen to a single session, and the amount of data that must travel over the local area network of the first terminal is also substantially equivalent to what is used to listen to a single session. Using the source identification data, the first terminal can be set up to inform the user of the identity of the media stream source in parallel with the retransmission of the media.
This makes it possible to use the proxy server as an intermediary for managing speaking authorizations. The resources of the first terminal can be used to transmit or receive a media stream regardless of the number of sessions.
This reduces the risk of several separate and concomitant media streams being sent to the same session, which affects the quality of the communication or involves downstream filtering of the received media streams.
This enables the first terminal, during transmission of the stream, to indicate to the user of the first terminal a list of sessions and/or second terminals actually listening to the first stream.
This makes it possible to limit the resources required for the one-way transmission of streams at each moment, and to avoid monopolizing unneeded resources.
Other features, details, and advantages of the invention will be apparent from reading the following detailed description, and from an analysis of the appended drawings, in which:
The drawings and the following description contain, for the most part, elements of a specific nature. They therefore not only serve to provide a better understanding of the invention, but where appropriate also contribute to its definition.
From a technical point of view, the constraints and characteristics of radiocommunication and telephony systems are generally not transposable to communications by networks that function using an Internet Protocol (IP). Therefore, references to “walkie-talkie” mode are used merely to illustrate a communication mode of alternating turns from the user's point of view.
In the following, we distinguish the nature of the data exchanged by using the term “media” to designate data relating to media content to be transmitted, and terms such as “identification data” or “addressing data” to designate data other than those relating to the media content itself. The volumes/sizes of data relating to media content to be transmitted are generally greater than those of the other data.
We now refer to
In the following, reference 2 and reference 3 are respectively used to designate the second terminals 2A, 2B, 2C, 2D, 2E for the former and the session servers 3A, 3B, 3C for the latter.
In the example described here, the first terminal 1 is connected to each of the session servers 3 via a single proxy server 4. Each of the second terminals 2 is directly connected to one or more session servers 3. “Directly” is understood here to mean without a server similar to the proxy server 4 being arranged between the second terminal 2 and the session server 3. In the example, there is only one first terminal 1. Alternatively, the system may comprise a plurality of first terminals 1 connected by a common or respective proxy server 4. The terms “first terminal” and “second terminal” are used to distinguish terminals connected to the session servers 3 via a proxy server 4 and those that are not. Otherwise the first terminals 1 and second terminals 2 can be similar.
Each of the first and second terminals 1, 2 may comprise any electronic device comprising at least one processor and communication means, and able to connect to a network that functions using an Internet protocol, for example a computer, tablet, or smartphone. Each communication terminal 1, 2 comprises an operating system and programs, components, modules, applications in the form of software executed by the processors, which can be stored in non-volatile memory.
Here, each terminal 1, 2 is available to a user. Each terminal can be considered as equivalent to a user, each of the users being assumed to be at a distance from the others. Each session server 3 is considered equivalent to a session, or a group of users wishing to exchange media streams between them.
Media stream here refers in particular to audio data, and even more specifically to voice data, such that the system 100 can be used by users to hold voice conversations. Alternatively, the media stream may comprise video or text data, or may be a multimedia stream and comprise a combination of at least two types of data among video, audio, and text. In the following, the term “speaking” is used in general to designate the action of monopolizing the role of media stream transmission among a set of terminals 1, 2 of a session, regardless of the nature of the media.
In
Each session may be of short duration and considered temporary, or of long duration and considered permanent. Allocation of resources of the devices forming a session server 3 can therefore be dynamic.
Here, each second terminal 2A, 2C, 2D, 2E is connected to a single server 3, with the exception of terminal 2B which is connected to both session server 3A and session server 3B.
In
In a first step, the user of the first terminal 1 connects the first terminal 1 to the proxy server 4. For example, after having authorized access to the microphone, the browser of the first terminal 1 establishes a single outgoing call with the proxy server 4. Such a call can be implemented by means of known techniques, for example Web Real-Time Communication technologies, better known by the acronym WebRTC. A single TLS connection is established from the first terminal 1; a “SIP over TLS” connection is implemented here. The first terminal 1 then sends or receives a single media stream at a time. The media uses the WebRTC stack built into the browser of the first terminal 1, which includes RTP transmission and reception.
The media stream travels a first half-duplex channel portion between the first terminal 1 and the proxy server 4.
The proxy server 4 transmits the single media stream received from the first terminal 1 to one of the session servers 3. Similarly, the proxy server 4 transmits to the first terminal 1 a single media stream: either the only media stream received from a session server 3, or one selected among the plurality of media streams received from the respective session servers 3.
The media stream may travel by a second half-duplex channel portion between the proxy server 4 and each of the session servers 3 concerned. The media stream may travel by a third half-duplex channel portion between each session server 3 and the corresponding second terminals 2.
The first terminal 1 transmits a media stream 11 to the proxy server 4, here by RTP and on a half-duplex channel portion. Upon reception of the media stream 11 from the first terminal 1, the proxy server 4 transmits the media stream 11 to at least one of the session servers 3.
In the example described here, the first terminal 1 further transmits addressing data 11′ to the proxy server 4, for example in the form of a set of identifiers (“SessionID”) of the recipient sessions. Such addressing data 11′ may be transmitted, for example, by SIP. The addressing data 11′ are associated with the media stream 11: the addressing data indicate the recipient(s) of the media stream 11.
Upon reception of the media stream 11 associated with the addressing data 11′, the proxy server 4 identifies at least one of the session servers 3, based on the addressing data 11′. Next, the proxy server 4 transmits the media stream 11 to each of the identified session servers 3, here only session server 3A.
In variants, in the absence of addressing data 11′, the proxy server 4 may be arranged to identify default session servers 3 among those with which the first terminal 1 is identified. For example, when a single session is active for the first terminal 1, in other words when the proxy server 4 is connected to a single session server 3, the proxy server 4 can identify by default the only possible session server among those with which the first terminal 1 is identified. When the first terminal 1 is identified with multiple sessions, the proxy server 4 can identify a single session server 3 to which to transmit the media stream 11 based on pre-established priority rules, for example an ordering of the session servers 3 relative to one another. When the first terminal 1 is identified with multiple sessions, the proxy server 4 can also identify, by default, all accessible session servers 3 as recipients of the media stream 11. The proxy server 4 can be configured to reject the request for transmission of a stream from the first terminal 1 when said stream is intended for a session server with which the first terminal 1 is not identified, and to send an error message to the first terminal 1.
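The recipient identification described above can be illustrated by a minimal sketch. It is not the Applicant's implementation: the class `ProxyRouting`, the exception `NotIdentifiedError`, the `default_policy` values, and the representation of the priority rules as an ordered list of session identifiers are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional


class NotIdentifiedError(Exception):
    """Raised when a stream targets a session the terminal is not identified with."""


@dataclass
class ProxyRouting:
    # Session IDs the first terminal is identified with, highest priority first
    # (hypothetical representation of the pre-established priority rules).
    joined_sessions: list[str] = field(default_factory=list)
    # Assumed default policy when no addressing data accompany the stream:
    # "priority" -> highest-priority session only, "all" -> every joined session.
    default_policy: str = "priority"

    def resolve_recipients(self, addressing_data: Optional[list[str]]) -> list[str]:
        """Return the session servers that should receive the media stream."""
        if addressing_data:
            unknown = [s for s in addressing_data if s not in self.joined_sessions]
            if unknown:
                # Reject the request; the caller would send an error to the terminal.
                raise NotIdentifiedError(f"not identified with: {unknown}")
            return list(addressing_data)
        if not self.joined_sessions:
            return []
        if len(self.joined_sessions) == 1 or self.default_policy == "priority":
            return [self.joined_sessions[0]]
        return list(self.joined_sessions)
```

With explicit addressing data the proxy simply validates and follows them; the two default policies correspond to the "single session server by priority" and "all accessible session servers" variants.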
Upon receiving the request 31 and if no one has speaking authorization, meaning transmission is not yet monopolized by another terminal 2A, 2B, the session server 3A grants speaking authorization to the first terminal 1. The session server 3A sends, here in SIP:
Upon receiving a speaking authorization request 31 from the first terminal 1, the proxy server 4 sends the request 31 to at least one of the session servers 3. Upon receiving a speaking authorization 33 from at least one session server 3, the proxy server 4 sends the received authorization 33 to the first terminal 1.
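As a rough sketch of this arbitration, each session server can be modeled as holding at most one speaking authorization, with the proxy server forwarding requests on behalf of the first terminal. The names `SessionFloor` and `relay_request` are hypothetical, and the behavior when the authorization is already held (a plain refusal) is an assumption, since the text does not specify it.

```python
from typing import Optional


class SessionFloor:
    """Per-session arbiter: at most one terminal holds speaking authorization."""

    def __init__(self, session_id: str) -> None:
        self.session_id = session_id
        self.holder: Optional[str] = None  # terminal currently authorized to speak

    def request(self, terminal_id: str) -> bool:
        """Grant the authorization if it is free or already held by the requester."""
        if self.holder is None or self.holder == terminal_id:
            self.holder = terminal_id
            return True
        return False  # assumption: the request is refused while another terminal speaks

    def release(self, terminal_id: str) -> bool:
        """Free the authorization; only the current holder can release it."""
        if self.holder == terminal_id:
            self.holder = None
            return True
        return False


def relay_request(floors: dict[str, SessionFloor], terminal_id: str,
                  session_ids: list[str]) -> dict[str, bool]:
    """Proxy side: forward the request to each targeted session and collect the answers."""
    return {sid: floors[sid].request(terminal_id) for sid in session_ids}
```

The per-session result returned by `relay_request` is what would let the first terminal show the user in which sessions speaking is actually authorized.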
Upon receiving the media stream 11, the session server 3A transmits the media stream 11 to each of the other terminals of the session, here the second terminals 2A and 2B. Transmission of the media stream 11 is carried out here by RTP and on half-duplex channel portions.
Second terminal 2A transmits a media stream 22 to session server 3A, here by RTP and on a half-duplex channel portion. When session server 3A receives the media stream 22 from second terminal 2A, session server 3A in turn transmits the media stream 22 to second terminal 2B and to the proxy server 4, here by RTP and on a respective half-duplex channel portion.
Upon receiving the media stream 22 originating from second terminal 2A from session server 3A, the proxy server 4 transmits the received media stream 22 to the first terminal 1.
In the example described here, the second terminal 2A further transmits identification data 22′ to session server 3A, for example in the form of an identifier of second terminal 2A. Alternatively, the identifier of second terminal 2A is established by session server 3A without it being necessary to transmit it from second terminal 2A. Session server 3A transmits the identification data 22′ comprising the identifier of second terminal 2A and adding to it an identifier of the session. Such identification data 22′ may be transmitted, for example, by SIP. The identification data 22′ is associated with the media stream 22: the identification data 22′ indicate the source of the media stream 22.
When received, the proxy server 4 transmits to the first terminal 1 the received media stream 22 and the associated identification data 22′. The identification data 22′ are able to identify the source of the media stream 22 among the session servers 3 and among the second terminals 2. Alternatively, the identification data 22′ comprise only data able to identify the source of the media stream 22 among the session servers 3, or only data able to identify the source of the media stream 22 among the second terminals 2. Alternatively, the identification data 22′ are absent.
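The reception path can be sketched as follows, under stated assumptions: the media is reduced to a byte string, the identification data to two optional fields, and the names `TaggedStream`, `fan_out`, and `describe_source` are hypothetical. The sketch covers the three variants above (both identifiers, only one of them, or none).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TaggedStream:
    payload: bytes                # stands in for the RTP media
    session_id: Optional[str]     # source among the session servers, if provided
    terminal_id: Optional[str]    # source among the second terminals, if provided


def fan_out(session_id: str, members: list[str], source: str,
            payload: bytes) -> dict[str, TaggedStream]:
    """Session-server side: send the stream to every member except its source."""
    tagged = TaggedStream(payload, session_id, source)
    return {m: tagged for m in members if m != source}


def describe_source(stream: TaggedStream) -> str:
    """Terminal side: text shown to the user while the media is played back."""
    parts = []
    if stream.terminal_id is not None:
        parts.append(f"terminal {stream.terminal_id}")
    if stream.session_id is not None:
        parts.append(f"session {stream.session_id}")
    return " in ".join(parts) if parts else "unknown source"
```

Here the proxy server is simply one member of the session among others (for example `"proxy4"` in the `members` list); it forwards the tagged stream unchanged to the first terminal.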
In the example described here, the transmission of identification data 22′, or signaling, is performed by SIP layered over TLS, so that the communication is secured over TCP. In a prior step, the first terminal 1 is identified with the proxy server 4 during an identification request with a session and a session server 3, by means of an identifier and a temporary password. The temporary password is retrieved by the user of the first terminal 1 from a database, here by XML-RPC, to improve the interoperability of terminals and servers. The session servers 3 identify the first terminal 1 by accessing the temporary password in said database. The first terminal 1 remains identified with the session server 3 for the duration of the session, or as long as the connection is not interrupted.
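The identifier and temporary password check might look like the sketch below. Several points are assumptions and not taken from the description: the database is replaced by an in-memory dictionary (no XML-RPC access is shown), the password lifetime `ttl_s` is invented to give "temporary" a concrete meaning, and the class name `CredentialStore` is hypothetical. A real deployment would also avoid storing passwords in clear text.

```python
import hmac
import time
from typing import Optional


class CredentialStore:
    """In-memory stand-in for the database of temporary passwords."""

    def __init__(self) -> None:
        self._entries: dict[str, tuple[str, float]] = {}

    def issue(self, identifier: str, password: str, ttl_s: float,
              now: Optional[float] = None) -> None:
        """Record a temporary password valid for ttl_s seconds (assumed lifetime)."""
        start = time.time() if now is None else now
        self._entries[identifier] = (password, start + ttl_s)

    def verify(self, identifier: str, password: str,
               now: Optional[float] = None) -> bool:
        """Check an identifier and password the way a session server might."""
        entry = self._entries.get(identifier)
        if entry is None:
            return False
        stored, expires_at = entry
        current = time.time() if now is None else now
        if current >= expires_at:
            return False
        # Constant-time comparison avoids leaking how many characters matched.
        return hmac.compare_digest(stored.encode(), password.encode())
```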
In the example described here, each of the session servers 3 establishes a half-duplex communication channel with each of the terminals identified with said session server 3. The proxy server 4 also establishes a half-duplex communication channel with the first terminal 1. The communication channels are negotiated by SDP when the communication is established. The communication channels are static. In other words, the half-duplex communication channels used are constant during a session between the first terminal 1 and a session server 3, and during a communication between the first terminal 1 and a proxy server 4.
Alternatively, each of the session servers 3 dynamically establishes a half-duplex communication channel from a terminal to at least one other terminal among the terminals identified with said session server 3. In other words, the half-duplex communication channels used change over time, in particular on the basis of the speaking authorizations.
In the two previous scenarios, the media stream 11, 22 is transmitted independently of the sending of the speaking authorization request 31. Alternatively, the source (the first terminal 1 in the case of
We now refer to
Each of the media streams 22 originating from a second terminal 2A, respectively 2C, received by the proxy server 4 from at least one session server 3A, respectively 3B, is associated with identification data 22′. The identification data 22′ are able to identify the source of the media stream 22 among session servers 3A, 3B and/or among second terminals 2A, 2C. In the example described here, the identification data 22′ are able to identify both the transmitting session server 3 and the transmitting second terminal 2.
In the example described here, in the case of receiving at least two media streams 22 originating from at least one second terminal 2A, 2C, from at least two session servers 3A, 3B, the proxy server 4 is arranged to select a single media stream 22 among the received media streams 22. Transmission from the proxy server 4 to the first terminal 1 is applied only to the selected media stream 22. In other words, the proxy server 4 filters the received media streams 22, for example in order to transmit only one. Thus, depending on the parameter settings of the proxy server 4, the first terminal 1 receives only one media stream 22 regardless of the number of media streams 22 received by the proxy server 4.
Selection by the proxy server 4 of the media stream 22 among several streams may be done by applying a set of priority rules. For example, the order of priority of session servers 3 may default to the chronological order of the connection of the user of the first terminal 1 to each of the sessions. An administrator can set the priority rules, said rules being for example stored in a database accessible to the proxy server 4. The user may also establish an order of priority if so desired, for example via the first terminal 1. The user may also make an on-the-fly selection of which media stream 22 to receive among those available. The proxy server 4 may be arranged to transmit a list of active sessions to the first terminal 1, meaning those for which transmission of a media stream is in progress. The session list may further include other information such as an identifier of the second terminal 2 transmitting in each session. The priority rules may also be defined according to a time marker of the speaking authorization in each media stream 22, for example to give priority to selecting the media stream 22 corresponding to the oldest speaking authorization.
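The selection rules above can be sketched as a single function. This is only an illustration: the names `IncomingStream` and `select_stream`, the precedence given to the user's on-the-fly choice over the stored rules, and the ranking of unlisted sessions last are assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class IncomingStream:
    session_id: str      # session server the stream arrives from
    terminal_id: str     # second terminal currently speaking
    granted_at: float    # time marker of the speaking authorization


def select_stream(streams: list[IncomingStream], session_priority: list[str],
                  user_choice: Optional[str] = None,
                  oldest_first: bool = False) -> Optional[IncomingStream]:
    """Pick the single stream to forward to the first terminal."""
    if not streams:
        return None
    if user_choice is not None:
        # Assumed precedence: an on-the-fly user selection overrides stored rules.
        for s in streams:
            if s.session_id == user_choice:
                return s
    if oldest_first:
        # Give priority to the oldest speaking authorization.
        return min(streams, key=lambda s: s.granted_at)
    rank = {sid: i for i, sid in enumerate(session_priority)}
    # Sessions missing from the priority list are ranked last.
    return min(streams, key=lambda s: rank.get(s.session_id, len(rank)))
```

Passing the sessions in the chronological order in which the user joined them as `session_priority` reproduces the default rule described above.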
In the example of
In a variant of the scenario of
The control data from the first terminal 1 are transmitted to each of the second terminals 2 participating in at least one session in common with the first terminal 1. Such transmission is provided via the proxy server 4 and each of the corresponding session servers 3.
The first terminal 1 and each of the second terminals 2 are further arranged to transmit a “MediaBurstRelease” type of SIP message to release speaking authorization at the end of the transmission of the media stream 22. Upon receiving such a message, each of the session servers 3 transmits a “MediaBurstIdle” type of message to each of the other participants of the session, indicating that someone else can speak. In situations where no speaking authorization has been assigned, no media stream 22 is transmitted between the session participants via the corresponding session server 3. In this case, only control data (RTCP) are exchanged, as represented in
In some embodiments, when a user of the first terminal 1 wishes to join a session, a SIP message is sent from the first terminal 1 to the corresponding session server 3 via the proxy server 4. Such a message may, for example, be sent via a browser of the first terminal 1, for example after requesting access to the microphone of the first terminal 1. Once the connection is established, a “JoinSession” type SIP message is sent to the proxy server 4. The SIP message may be accompanied by a session identifier and user identification and/or authentication data. The proxy server 4 authenticates the user and then establishes an outgoing call to the session server 3 corresponding to the session to be joined.
When the connection to the session is established, the proxy server 4 acts as a relay between the first terminal 1 and each of the session servers 3. The proxy server 4 transmits all the messages between the user and the session. During the transmission, the proxy server 4 can modify the message, for example by adding data such as the addressing data 11′ and/or the identification data 22′. Such data may be in the form of headers such as “SessionId” or “UserID”. In the example described here, the user identifier comprises a public identifier for the user, for example his or her telephone number, or MSISDN (“Mobile Station-Integrated Services Digital Network”). The data may be transmitted by RTP. The messages from the first terminal 1 are transmitted to the session corresponding to the identifier contained in such a field. The messages from a session server 3 are transmitted to the first terminal 1, with the session identifier added.
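The relay role of the proxy server can be sketched as two small functions that route and annotate messages. Messages are represented here as flat dictionaries of headers, which is an assumption made for brevity and not a SIP parser; the function names `to_session` and `to_terminal` and the `Type` key are likewise hypothetical, while the `SessionId` and `UserID` header names come from the description.

```python
def to_session(message: dict[str, str], joined: set[str],
               user_id: str) -> tuple[str, dict[str, str]]:
    """Terminal -> session: route on the SessionId header and add the UserID."""
    session_id = message.get("SessionId")
    if session_id not in joined:
        raise KeyError(f"terminal has not joined session {session_id!r}")
    forwarded = dict(message)        # copy so the original message is untouched
    forwarded["UserID"] = user_id
    return session_id, forwarded


def to_terminal(message: dict[str, str], session_id: str) -> dict[str, str]:
    """Session -> terminal: add the identifier of the originating session."""
    forwarded = dict(message)
    forwarded["SessionId"] = session_id
    return forwarded
```

Copying the message before adding headers keeps the relay free of side effects, so the same incoming message can be forwarded to several sessions with different annotations.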
Alternatively, other messages may be exchanged. For example:
In some embodiments, the first terminal 1 and the proxy server 4 are far apart from one another. The first terminal 1 and the proxy server 4 communicate via the Internet. For example, the proxy server 4 may belong to a service provider, while the first terminal 1 belongs to a customer of the service provider. In such a configuration, the proxy server 4 can be likened to a service access point for the user of the first terminal 1, a “front end”. The proxy server 4 and the session servers 3 may for example communicate via a private network managed by the service provider.
Thus, the service provider can adapt the private network to requirements, in particular based on the volume of data and streams to be exchanged. The public network (here the Internet) and the local private network of the user of the first terminal 1 must transport a volume of data and of streams that varies little with the number of simultaneous connections of the first terminal 1. The risk of saturating the local area network is reduced. In practice, connections to fifteen simultaneous sessions do not result in problems with communication quality.
Usually, the term “proxy”, or proxy server, refers to a device that serves as a relay between two other devices. Here, the proxy server 4 acts as a relay between the first terminal 1 and the session servers 3. The proxy server 4 could therefore be called a “proxy”. However, in the context of voice over internet protocol (VoIP) transmission, it is customary to designate as “proxy” the device that merely relays data (usually SIP) for establishing, controlling, and ending a session, while the media stream (usually RTP) does not pass through the proxy. It should be noted that in the present context, the proxy server 4 relays the media stream and possibly other data.
Unless incompatible, the various aspects and features described above may be implemented together, separately, or as substitutions for one another. According to one aspect, the invention can be viewed as a method implemented by computer means. According to another aspect, the invention can be viewed as a system capable of implementing the method.
In the preceding examples, each of the communication channels used is half-duplex. This makes it possible to limit the volume of data exchanged at each moment and therefore to save bandwidth in each part of the network. Combined with the proxy server, this reduces the risk of poor communication quality. Alternatively, at least some of the communication channels may be full-duplex.
The invention may take the form of a computer program comprising instructions for implementing a method or part of the methods described above when it is implemented by at least one processor of a terminal 1, 2 or of a server 3, 4.
The invention is not limited to the examples of software, methods, and systems described above only as examples, but encompasses all variants conceivable to those skilled in the art within the scope of the following claims.
Number | Date | Country | Kind |
---|---|---|---|
17 51092 | Feb 2017 | FR | national |