The present invention relates to a system and method for controlling a server over a network or mobile network and in particular reducing latency for more effective control of the server.
Wireless communications between a mobile device and a remote server typically use mobile networks. Whilst increasing data transfer rates across such networks may improve the delivery of multimedia and other content, latency becomes the limiting factor when such interactions are to include effective real-time control between remote devices and a server.
The Tactile Internet-Applications & Challenges; G. P. Fettweis; IEEE Vehicular Technology Magazine March, 2014, discusses latency requirements of different systems and some benefits of reducing latency in a mobile network.
For a user of a mobile device to be able to operate or control a program operating remotely or in a cloud-computing environment, a dramatic reduction in latency in mobile networks is required. It is therefore currently not possible to effectively run applications, for example word processors or spreadsheets, from a mobile device where the program executes in a cloud environment. Ideally, the latency between issuing a command on a mobile device (e.g. using a touchscreen) and observing its effect should be of the order of 1 ms or less. The current best achievable latency is of the order of 20 ms. This translates to a displacement error or delay in tracking movements across a touch screen of about 2 cm. It is clearly not possible to operate or control remote programs with such latency or delay in command execution.
Reducing latency may be possible but at the expense of other users of the network or with much tighter control of the mobile network. Further reductions may be possible by moving servers closer to mobile base stations but this lacks practicality. Therefore, there is required a system and method for overcoming these limitations and to allow effective control by mobile users of remotely operated servers and programs running on them.
The general concept is for a server or cloud-based service to predict a control command issued by a user of a mobile device based on their interaction with the device before the command is issued or fully completed or fully received by the server. Data describing this initial, preliminary or early interaction is sent to the server or cloud-based service. In the meantime, the user may complete the command but the server has already reacted or been provided with a head start in order to prepare for the command and trigger a corresponding event or action. Therefore, the apparent latency within the network can be mitigated or its effect reduced because the event or action is triggered by the server or cloud-based service earlier than if it had to wait for confirmation of the completed command (and so for the user to wait for their desired function to execute). Alternatively, the prediction may be carried out by the mobile device and this result or related signalling is sent to the server (preferably with an indication of confidence in this prediction).
Against this background and in accordance with a first aspect there is provided a method for controlling a program executing within a server or preferably a cloud-based server from a mobile device, the method comprising the steps of:
Optionally, the event may be an event triggered by a server to occur within the mobile device. However, the event may also be triggered in another device or on the server, for example.
Advantageously, the step of determining the next user interaction may be further based on the user's previous interactions with the user interface. A history of the user's interactions may be built up in order to learn and improve the accuracy of predicted next interactions or selections by the user.
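By way of illustration only, a history of interactions might be kept as a table of transition counts between objects, from which the most frequent successor of the last object is taken as the predicted next selection. The following Python sketch is a minimal example under that assumption; the class name `InteractionHistory`, its methods and the object names are hypothetical and do not form part of the described method.

```python
from collections import Counter, defaultdict


class InteractionHistory:
    """Hypothetical store of how often one object selection follows another."""

    def __init__(self):
        # Maps a previously selected object to a count of the objects
        # that the user selected immediately afterwards.
        self._transitions = defaultdict(Counter)

    def record(self, previous_object, next_object):
        """Add one observed transition to the history."""
        self._transitions[previous_object][next_object] += 1

    def predict(self, previous_object):
        """Return (most likely next object, relative frequency) or None."""
        counts = self._transitions.get(previous_object)
        if not counts:
            return None
        total = sum(counts.values())
        likely_object, occurrences = counts.most_common(1)[0]
        return likely_object, occurrences / total
```

In this sketch the relative frequency could serve as a crude confidence value; any other learning technique could be substituted.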
Preferably, the user interface may be a graphical user interface, GUI.
Preferably, the data describing the GUI may include positions of one or more objects within the GUI.
Advantageously, the GUI may be a touchscreen GUI and further wherein the user interaction is a touch or multi-touch gesture or part of a touch or multi-touch gesture.
Preferably, determining the next user interaction may further comprise determining a trajectory of a gesture on the touchscreen GUI. This may be achieved in various ways such as by using trigonometry, extrapolation, artificial intelligence or machine learning algorithms, for example.
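As one possible illustration of the extrapolation option, the trajectory may be estimated from the first and last captured touch samples and projected forward by a short time horizon. The Python sketch below assumes timestamped samples in pixel coordinates and a constant velocity; the names `TouchSample`, `estimate_velocity` and `extrapolate` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TouchSample:
    t: float  # capture time in seconds
    x: float  # horizontal position in pixels
    y: float  # vertical position in pixels


def estimate_velocity(samples):
    """Average velocity (pixels per second) between first and last sample."""
    if len(samples) < 2:
        raise ValueError("at least two samples are required")
    first, last = samples[0], samples[-1]
    dt = last.t - first.t
    if dt <= 0:
        raise ValueError("samples must span a positive time interval")
    return (last.x - first.x) / dt, (last.y - first.y) / dt


def extrapolate(samples, horizon):
    """Predict the touch position `horizon` seconds after the last sample,
    assuming the finger keeps moving at the estimated constant velocity."""
    vx, vy = estimate_velocity(samples)
    last = samples[-1]
    return last.x + vx * horizon, last.y + vy * horizon
```

A constant-velocity model is the simplest choice; curved gestures would call for a higher-order fit or a learned model.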
Optionally, determining the trajectory may further comprise the step of determining an intersection with an object within the touchscreen GUI.
Optionally, the method may further comprise determining an intersection of the trajectory with a plurality of objects within the touchscreen GUI and triggering an event for each of the plurality of objects.
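The intersection test may, for example, treat the trajectory as a straight ray and each object as an axis-aligned rectangle. The Python sketch below makes those assumptions and returns every object the ray crosses, nearest first, so that an event could be triggered for each. The names `GuiObject`, `ray_hit_distance` and `objects_on_trajectory` are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class GuiObject:
    name: str
    left: float
    top: float
    right: float
    bottom: float


def ray_hit_distance(origin, direction, obj):
    """Return the ray parameter at which the ray enters obj, or None.

    Uses the slab method: the ray is clipped against the horizontal and
    vertical extents of the rectangle in turn.
    """
    t_min, t_max = 0.0, math.inf
    slabs = (
        (origin[0], direction[0], obj.left, obj.right),
        (origin[1], direction[1], obj.top, obj.bottom),
    )
    for start, step, low, high in slabs:
        if step == 0:
            # Ray is parallel to this slab: it must already lie inside it.
            if start < low or start > high:
                return None
            continue
        t1, t2 = (low - start) / step, (high - start) / step
        if t1 > t2:
            t1, t2 = t2, t1
        t_min, t_max = max(t_min, t1), min(t_max, t2)
        if t_min > t_max:
            return None
    return t_min


def objects_on_trajectory(origin, direction, objects):
    """Names of all objects crossed by the ray, ordered nearest first."""
    hits = []
    for obj in objects:
        t = ray_hit_distance(origin, direction, obj)
        if t is not None:
            hits.append((t, obj.name))
    return [name for _, name in sorted(hits)]
```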
Optionally, the method may further comprise the step of synchronising the user's further captured interactions with the user interface and the determined next user interaction. For example, the server trigger is not executed until the finger has reached the intended target. In other words, should the determined next user interaction not correspond with the actual next user interaction then a correction may be made.
Preferably, the information from a mobile device describing the user interaction with a user interface of the mobile device may be received over a network. The network may be a mobile network, other wireless network, or fixed network, for example.
According to a second aspect, there is provided a server or a cloud-based server comprising logic configured to:
Advantageously, triggering the event may further comprise issuing a response to the mobile device. This response may be issued across the network or mobile network.
Optionally, the logic may be further configured to transmit additional data with the response.
Preferably, the additional data may include a value corresponding to a prediction accuracy of the determined next user interaction. In other words, the server may determine an accuracy or confidence level (e.g. percentage) of the determined next user interaction and therefore that the event being triggered is the event that the user requires. In an example, the event may only be triggered if the accuracy or confidence level is above a predetermined threshold (e.g. 50%).
According to a third aspect, there is provided a mobile application comprising logic configured to:
Optionally, the logic may be further configured to predict the event associated with the object and further wherein the data describing the user interaction identifies the predicted event. In other words, the mobile application may carry out the prediction and transmit this prediction to the server or the mobile application may only capture the user interactions, send these to the server and allow the server to make the determination or prediction.
The mobile application may be a word processor, spreadsheet, presentation viewer and editor or other similar productivity tool, for example.
The methods described above may be implemented as a computer program comprising program instructions to operate a computer. The computer program may be stored on a computer-readable medium.
The computer system may include a processor such as a central processing unit (CPU). The processor may execute logic in the form of a software program. The computer system may include a memory including volatile and non-volatile storage media. A computer-readable medium may be included to store the logic or program instructions. The different parts of the system may be connected using a network (e.g. wireless networks and wired networks). The computer system may include one or more interfaces. The computer system may contain a suitable operating system such as UNIX, Windows (RTM) or Linux, for example.
It should be noted that any feature described above may be used with any particular aspect or embodiment of the invention.
The present invention may be put into practice in a number of ways and embodiments will now be described by way of example only and with reference to the accompanying drawings, in which:
It should be noted that the figures are illustrated for simplicity and are not necessarily drawn to scale. Like features are provided with the same reference numerals.
Applications or programs requiring low (or lower) latency may be deployed in a centralized or distributed (edge) cloud. In one example, MS Office (RTM) applications, where the user may be running a program in the cloud from a mobile device, may be controlled from the mobile device. Latency requirements for this type of application are related to the physiological tactile reaction but are typically of the order of 1 ms. In current mobile networks (e.g. using LTE) the best achievable latency, calculated assuming the content at the edge, is of the order of 20 ms, which results in a displacement of 2 cm between the position of the finger and the visual feedback of the application or apparent triggered event or function.
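The relationship between latency and displacement quoted above is simply distance equals speed multiplied by time. The figures of 20 ms and 2 cm together imply a finger speed of about 1 m/s; that speed is inferred here from the two quoted figures rather than stated. The Python sketch below, using hypothetical function names, reproduces the arithmetic.

```python
def displacement_cm(finger_speed_m_per_s, latency_ms):
    """Distance in cm that the finger travels during the latency interval."""
    latency_s = latency_ms / 1000.0
    return finger_speed_m_per_s * latency_s * 100.0


def implied_speed_m_per_s(displacement_cm_value, latency_ms):
    """Finger speed in m/s implied by a displacement observed at a latency."""
    return (displacement_cm_value / 100.0) / (latency_ms / 1000.0)
```

At the same inferred speed, a latency of 1 ms would correspond to a displacement of about 0.1 cm, i.e. roughly 1 mm.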
This latency may be reduced by predicting the user actions, commands and/or requests and triggering the corresponding actions in the cloud software in advance or at least earlier.
A mobile device has a graphical user interface (GUI) which is interfaced with a server preferably in the cloud. Preferably, the cloud server holds information about the GUI, such as a history, e.g. regarding user input(s) into the GUI. This may also or instead be held within the mobile device. Based on the input sequences or user interactions, the cloud server can trigger responses anticipated or predicted to be required by the user of the mobile device. The mobile device may choose to perform the particular triggered response or one of a selection offered by the cloud server, for example. The server may execute a program, which is controlled by the mobile device. The program may cause events to occur. These events may occur on the mobile device, in the server or elsewhere. The events may be computing actions, transmission of data from one entity to another or cause another program to carry out an operation or event.
Prediction of finger movement or other user interactions with the mobile device may occur at either or both the mobile device and at the server. In this way, services may be provisioned at the mobile device but where some or all of the necessary processing is carried out by the server.
In order to determine the relevance or accuracy of the prediction, intelligence or logic may be utilised at the mobile device based upon the confidence of the prediction performed at the cloud. For example, where the cloud server reports a confidence of 50% or below when providing a service, the mobile device may reply with a not acknowledged (NACK) signal and seek (or wait for) a second data set or triggered event from the cloud. In an alternative embodiment the server reports a “hold” message or a “discard” message.
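A minimal Python sketch of this decision at the mobile device is given below for illustration. It assumes the confidence is reported as a fraction between 0 and 1 and uses the example threshold of 50%; the `Reply` enumeration, the explicit ACK value and the function name `reply_for` are hypothetical.

```python
from enum import Enum


class Reply(Enum):
    ACK = "ack"    # hypothetical positive reply; only the NACK is described
    NACK = "nack"


# Example threshold of 50%, expressed as a fraction.
CONFIDENCE_THRESHOLD = 0.5


def reply_for(confidence, threshold=CONFIDENCE_THRESHOLD):
    """Reply NACK when the reported confidence is at or below the threshold."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must lie between 0 and 1")
    return Reply.NACK if confidence <= threshold else Reply.ACK
```

After a NACK, the mobile device would wait for a second data set or triggered event, as described above.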
As a further enhancement, synchronization may be achieved between visual feedback directed to the user on the mobile device (i.e. at the client side) and the motion of the user's finger (especially when there are errors in the predicted event). The overall effect for the user is ideally to observe a visual or other feedback within 1 ms of their finger reaching object 40.
Functional features to achieve this may be present at the mobile device (user equipment, UE) 10, within the network infrastructure, and/or at the server (e.g. cloud server).
These features or components:
A further enhancement allows motion prediction to take place with multiple targets or objects 40. This is illustrated with regards to
Motion Prediction with Multiple Targets
At the cloud server:
At the mobile device 10:
Any or all of the previous embodiments may be further extended or enhanced by allowing motion prediction with different types of motion estimation at the mobile device 10. This further enhancement is shown schematically in
In one example implementation the system provides a prediction based on a “swipe” user interaction on the touchscreen 20 of the mobile device. Such an action is demonstrated in
In one example, these objects 40, 160 may be selected to play different audio tracks (i.e. instruct the server to provide these tracks). The events may be the initialisation of downloads on the mobile device 10 of two different tracks. The mobile device 10 may determine, based on its own data or data accompanying the events that the track associated with object 40 is more likely to be requested by the user and so discard the initial download of the track associated with object 160. Nevertheless, latency may be reduced as both tracks start to download (or at least processing of these alternative requests may start) before the user interaction completes.
Other example user interactions may include taps or multiple taps, dropdown list scrolling, pinch-to-zoom in/out, select text, copy, paste, cut, insert text/images, etc.
User interactions are not limited to the use of the touchscreen 20. For example, other components may be used to capture user interactions including a camera, proximity sensors 300 (as shown in
Any or all of the embodiments may include the network cooperating with the cloud server so that the information may be exchanged between the server and one or more mobile devices with a higher priority, thereby further reducing latency. This higher priority communication may be limited to this specific communication type and then revert to normal priority or latency so as to reduce or minimise adverse effects on other mobile users in the network.
In the case of multiple or alternative feedbacks or event triggers (see
In an example architecture, functions to be included in the different nodes include:
At the server:
A mechanism for predicting the finger's motion based on:
A mechanism for sending one or multiple updated data sets to the UE; and/or
A set of application programming interfaces (APIs) to expose the prediction capabilities to over-the-top (OTT) content providers.
At the base station:
A mechanism to prioritize the messages or event triggers based on confidence levels or other criteria.
At the mobile device 10:
A mechanism to synchronise the information sent by the server with the user's actual (or eventual) motion or interaction.
A mechanism for checking the coherence of the information received from the cloud server with respect to the user motion or interaction.
A mechanism to combine and select between different information sets sent by the cloud server.
A set of APIs to be exposed.
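As an illustration of the base-station function listed above, messages or event triggers might be queued so that those carrying the highest confidence level are forwarded first. The Python sketch below assumes confidence is the only criterion and that ties are served in arrival order; the class name `ConfidencePriorityQueue` is hypothetical.

```python
import heapq
import itertools


class ConfidencePriorityQueue:
    """Hypothetical queue that releases the highest-confidence message first."""

    def __init__(self):
        self._heap = []
        # Monotonic counter keeps arrival order for equal confidence values
        # and avoids comparing the messages themselves.
        self._arrival = itertools.count()

    def push(self, message, confidence):
        # heapq is a min-heap, so the confidence is negated.
        heapq.heappush(self._heap, (-confidence, next(self._arrival), message))

    def pop(self):
        """Remove and return the queued message with the highest confidence."""
        _, _, message = heapq.heappop(self._heap)
        return message

    def __len__(self):
        return len(self._heap)
```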
In any or all of these embodiments, the mechanism for predicting the user interaction or the target object 40 may be realised at the mobile device 10 rather than at the cloud or server.
Benefits of this system and method include the user perceiving a lower latency between their actions and the reactions triggered by the cloud or server. For example, these functions allow an icon that will be clicked by the user to be anticipated, so that the visual feedback or other event triggered by the click is put into effect earlier. The synchronisation function introduced at the mobile device 10 may allow the visual feedback or triggered event to arrive with reduced delay.
Further possibilities of all embodiments include exposing some APIs to the over-the-top (OTT) content or service providers. For example, in order to provide low latency services for MS Office (RTM), this idea may involve opening some APIs to Microsoft.
Line 410 indicates the user's finger motion across the touchscreen 20 of the mobile device 10, which is used to control the program on the server. Data describing this motion is transmitted to the server (arrow 420), which encounters some delay or latency as the data propagates through the mobile network.
Line 430 represents the time taken for the server or cloud to calculate an event, which may be a set of data required by the mobile device 10. The event may result in requested data being transmitted (over the mobile network) to the mobile device 10. Transmitting these data is represented by line 440. Line 450 represents the event (e.g. presenting data) occurring on the mobile device. Therefore, the overall latency may be represented by the distance (time) 460.
This scenario may be compared with that of
In the example scenario of
In this example, the triggered event is determined before the user interaction completes and the event trigger (e.g. sending data) also commences before the user interaction completes. The transmission of the event to the mobile device 10 over the network, illustrated by line 440, starts before the complete user interaction has been received by the server in this example. Again, line 450 represents the event (e.g. presenting data) occurring on the mobile device 10 but in this case the latency 560 is reduced. While the event occurs on the mobile device 10 (e.g. displaying new data), a coherence test between the predicted event and the actual event associated with the object 40 takes place to provide synchronisation.
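The effect on apparent latency can be illustrated with a simplified timing model, given here as a Python sketch. It assumes the latency is the sum of an uplink delay, a server computation time and a downlink delay, and that prediction gives the server a head start equal to the time by which the preliminary interaction data is sent ahead of the completed interaction. The function name, the parameters and the model itself are assumptions for illustration only.

```python
def apparent_latency_ms(uplink_ms, compute_ms, downlink_ms, head_start_ms=0.0):
    """Delay between the user interaction completing and the event occurring.

    The result is clamped at zero because, with synchronisation, the event
    is not presented before the interaction has actually completed.
    """
    round_trip_ms = uplink_ms + compute_ms + downlink_ms
    return max(0.0, round_trip_ms - head_start_ms)
```

With hypothetical delays of 8 ms, 4 ms and 8 ms, the model gives 20 ms without prediction and 5 ms when the preliminary data is sent 15 ms before the interaction completes.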
The server sends an early predicted event (which may be the transmission of data) to the mobile device at line 640. This may be accompanied by optional metadata describing a confidence level in the prediction. These metadata may further include a destination location or object on the touchscreen 20 associated with the prediction. The metadata may further include a hold message to the mobile device 10 if the likelihood or probability that further event transmissions may occur is above a predetermined level (e.g. low probability of location prediction accuracy).
The mobile device 10 may send back to the server a non-acknowledgement (NACK) (e.g. via an application layer or from elsewhere) when the mobile device (or an application running on the mobile device 10) determines the prediction to be invalid. The server transmits a further or updated event (data set) to the mobile device at line 640′ when further data about the user interaction is received. This may or may not be accompanied by a hold message, depending on whether the prediction confidence is greater than the pre-determined threshold (e.g. 50%).
Again, the event occurs over line 650 and latency is shown as 660. When the initial prediction is correct (but with a low confidence level) an advantage may be gained because the correct event including associated data may have started transmission earlier but simply not actioned (due to the low confidence or probability level).
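The handling of held events at the mobile device can be sketched as follows, by way of example only. In this Python sketch an event received with a hold message is buffered rather than actioned, a later event replaces it, and the buffered event is actioned only if its target matches the object the user actually reaches. The names `PredictedEvent` and `EventBuffer`, and the field layout, are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PredictedEvent:
    target: str          # predicted destination object on the touchscreen
    confidence: float    # confidence level reported by the server
    hold: bool           # True when a hold message accompanies the event
    payload: bytes = b""


class EventBuffer:
    """Hypothetical client-side buffer for events received with a hold."""

    def __init__(self):
        self.pending: Optional[PredictedEvent] = None

    def receive(self, event):
        """Return the event to action immediately, or None if it is held."""
        if event.hold:
            self.pending = event  # an updated event replaces any earlier one
            return None
        self.pending = None
        return event

    def confirm(self, actual_target):
        """Called when the user interaction completes.

        Returns the held event if its prediction was correct; otherwise
        returns None, in which case the caller may send a NACK.
        """
        event, self.pending = self.pending, None
        if event is not None and event.target == actual_target:
            return event
        return None
```

This reflects the advantage noted above: a correct but low-confidence event has already been received when the interaction completes and only needs to be actioned.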
The processor 930 also detects user interactions with the user interface 940 and sends data describing the user interaction or interactions to the server 710. These data may be unprocessed user interaction data, or some or all of the prediction may take place within the processor 930 executing the app 910. In the latter case, the data describing the user interactions may be in the form of a predicted object and/or triggerable event. This occurs before the user interaction encounters the object (i.e. typically before the user interaction completes).
The app 910 may be configured to receive the event trigger from the server and then trigger the event (e.g. play content received from the server).
The app may execute within a suitable mobile operating system. Example operating systems include iOS and Android, although others may be used. The server may be of any suitable type and running a suitable operating system such as Windows, Linux, UNIX, or Solaris, or other software, for example.
As will be appreciated by the skilled person, details of the above embodiment may be varied without departing from the scope of the present invention, as defined by the appended claims.
For example, the mobile device may be a smartphone, tablet computer, laptop computer or any device that can interact with a network. The network has been described as a mobile network (e.g. GSM, CDMA, LTE, 2G, 3G or 4G). This may include cellular networks or any network, especially where latency needs to be reduced to improve user experience or control.
Many combinations, modifications, or alterations to the features of the above embodiments will be readily apparent to the skilled person and are intended to form part of the invention. Any of the features described specifically relating to one embodiment or example may be used in any other embodiment by making the appropriate changes.
| Number | Date | Country | Kind |
|---|---|---|---|
| 1408747.2 | May 2014 | GB | national |