The present disclosure relates generally to massively and geographically distributed systems and, more particularly, to systems and methods for establishing a self-stabilizing feedback controller for such massively and geographically distributed systems.
Distributed computing systems are computational systems that include a plurality of computing components that are located in different geographical locations, be they relatively local or trans-national, which coordinate for the purposes of a common task or action, by passing messages to one another. Such distributed systems are commonly used in a variety of industries for a variety of tasks, such as service-oriented architecture (SOA) based systems, online gaming software and systems, and digital advertising platform components, among other things.
Many systems utilize feedback controllers to allow the systems to react to changing environments associated with a task, while maintaining internal states and/or system outputs within desired ranges. Such feedback controllers may be implemented as software elements, hardware elements, or a combination of the two and are based on direct or indirect measurements of the internal states of the given system. In some example large-scale distributed systems, the state can change across different subcomponents much faster than it can be synchronized, due to delays inherent in synchronization. This may be of particular concern in non-homogeneous, massively distributed systems where internal states, in different parts of the system, are frequently operating in very different regimes and the distribution of state is highly irregular.
However, the use of feedback controllers in distributed systems is rare, as synchronization issues may arise. For many distributed systems, the delay in synchronizing state variables over many different subcomponents can prevent standard feedback controllers from operating at optimal efficiency. In such scenarios, system architects may be forced to choose between a system that is stable but slow to respond (e.g., a system that updates no more frequently than the minimum synchronization interval) and a fast-reacting solution, which may potentially become unstable.
Prior attempts to solve the stability versus speed conundrum have resulted in solutions that limit the state update frequency to the minimum synchronization speed. Such controllers can always update based on consistent information, thus remaining stable; however, doing so limits the speed at which the system can react to its environment. Alternatively, some controllers allow the state to be updated faster than the synchronization speed by using a fixed control parameter; however, such controllers are prone to instability, due to the delay in the feedback of control changes.
Accordingly, a distributed system feedback controller that allows a control variable to be updated as frequently as the underlying state variable update events occur, while still maintaining stability, is desired.
In accordance with an embodiment, a system of synchronized computing devices, connected via a common network and configured to operate one or more common computing tasks across the system, is disclosed. The system includes a plurality of subcomponent computing devices, each including, at least, a non-transitory, machine readable storage medium, each of the storage media of the plurality of subcomponent computing devices storing instructions associated with the one or more common computing tasks. The system further includes one or more processors, each of the one or more processors associated with one or more of the plurality of storage media, each of the one or more processors configured to execute instructions which, when executed, at least, output a plurality of events occurring within the context of the one or more computing tasks throughout the system. The system further includes at least one synchronization controller, operatively associated with one or more of the plurality of subcomponent computing devices, configured to receive the plurality of events from the one or more processors, to determine a continually updating state variable and a continually updating sum of error terms based on a symmetric function of the one or more events, provide the continually updating value of interest, subject to a time delay, wherein the time delay is a time period having a length substantially longer than an average time interval between two consecutive members of the plurality of events.
In accordance with another embodiment, a method for synchronizing a plurality of computing devices is disclosed. Each of the plurality of computing devices is connected via a common network and configured to operate one or more common computing tasks, amongst the plurality of computing devices. The one or more computing tasks include, at least, a plurality of events. The method includes outputting, using a processor of at least one of the plurality of computing devices, the plurality of events to at least one synchronization controller. The method further includes determining, using a processor associated with the at least one synchronization controller, a continually updating value of interest, based on a symmetric function of the plurality of events. The method further includes providing, using the processor associated with the at least one synchronization controller, the continually updating value of interest, subject to a time delay, wherein the time delay is a time period having a length substantially longer than an average time interval between two consecutive members of the plurality of events.
In accordance with yet another embodiment, a system for serving online advertisements to a subject to online advertisement is disclosed. The system includes a plurality of synchronized computing devices connected via a common network and configured to operate one or more common advertising data operations across the system. Each of the subcomponent computing devices includes, at least, a non-transitory, machine readable storage medium and each of the storage media of the plurality of subcomponent computing devices stores instructions associated with the one or more common advertising data operations. The system further includes one or more processors, each of the one or more processors associated with one or more of the plurality of storage media and each of the one or more processors being configured to execute instructions, which, when executed, at least, output a plurality of events, occurring within the context of the one or more common advertising data operations throughout the system. The system further includes at least one synchronization controller, operatively associated with one or more of the plurality of subcomponent computing devices, configured to receive the plurality of events from the one or more processors, to determine a continually updating value of interest, based on a symmetric function of the one or more events and provide the continually updating value of interest, subject to a time delay L. L is a time period having a length substantially longer than an average time interval between two consecutive members of the plurality of events.
While the present disclosure is susceptible to various modifications and alternative constructions, certain illustrative examples thereof will be shown and described below in detail. The disclosure is not limited to the specific examples disclosed, but instead includes all modifications, alternative constructions, and equivalents thereof.
Turning now to the drawings and with specific reference to
As depicted, there may be any number (“n” number) of computing device(s) 10A-N, so long as each of the computing devices 10 are connected to one another and function as part of the massively, geographically distributed system 8, to operate one or more common computing tasks across the system 8. As described in the introduction, such distributed systems may be utilized to perform any number of computing tasks across such a plurality of computing device(s) 10, such as, but certainly not limited to, service-oriented architecture (SOA) based systems, online gaming software and systems, and digital advertising platform components, among other things.
Referring now to
As shown in better detail in
One or more of the processors 14 are configured to execute instructions for a synchronization system 30 of the distributed system 8. The synchronization system 30 includes input/output elements 32 at each of the computing devices 10 and a synchronization controller 34, which may be located proximate to and/or may be executed by at least one of the one or more processors 14. Accordingly, in some examples the synchronization controller 34 is executed as instructions on each of the one or more processors 14, wherein the one or more processors 14 are in continuous operative communication amongst themselves via the network 9, as depicted in
Further, as best depicted in
The input/output elements 32 are configured to, at least, output a plurality of events occurring within the context of the one or more common computing tasks, throughout the system 8. Within the context of the distributed system 8, each of the computing devices 10 generates such events, which contribute to the change of an internal state of the system 8. Accordingly, such an internal state may be a variable that is expressed as a symmetric function, which is a function that is invariant under the reordering of its variables. In other words, the order of events that are processed by the system 8 does not change the final results and/or final consistent state of the broader system 8; rather, only the processed events are considered. Common examples of symmetric functions include, but are certainly not limited to, counts, means, variances, medians, percentiles, maximums, and minimums.
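The order-invariance property described above can be illustrated with a minimal Python sketch. The helper name `summarize` and the event values are hypothetical and chosen only for illustration; the sketch simply checks that several of the listed symmetric functions return the same result when the same events are processed in a different order.

```python
import random
import statistics


def summarize(events):
    """Return several symmetric functions of a list of numeric events."""
    return {
        "count": len(events),
        "mean": statistics.fmean(events),   # fmean uses exact summation
        "median": statistics.median(events),
        "max": max(events),
        "min": min(events),
    }


# Hypothetical event values, as they might arrive at one device.
events = [3.0, 1.5, 4.0, 1.5, 9.0]

# The same events arriving in a different order (fixed seed for repeatability).
shuffled = events[:]
random.Random(0).shuffle(shuffled)

# A symmetric function gives the same result for both orderings.
order_invariant = summarize(events) == summarize(shuffled)
```

Because the result does not depend on arrival order, partial results from different computing devices can be reconciled without agreeing on a global ordering of events.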
The synchronization controller 34, distributed amongst one or more of the computing devices 10, is a synchronization mechanism that calculates a running, continually updating, value of a symmetric function of the events associated with the common computing task. As such, the synchronization controller 34 is configured to determine a continually updating value of interest, based on a symmetric function of the events associated with the computing task. Further, the synchronization controller 34 is configured to provide the continually updating value of interest back to each of the computing devices 10 of the system 8, wherein the value of interest is subject to a time delay L, wherein L is a time period having a length substantially longer than an average time interval between two consecutive members of the plurality of events.
In some examples, the synchronization controller 34 may be a proportional-integral-derivative controller (PID controller), which is a control loop feedback mechanism for computing tasks that require continuously modulated control. A PID controller continuously calculates, for each computing device 10, an error value (Δt) as a scaled difference between a desired target value (R) and a measured state variable (Ut), and applies a correction based on proportional, integral, and derivative terms. Accordingly, as a PID controller, the synchronization controller 34 is configured to maintain a state variable U for each of the subcomponent computing devices 10, an events frequency factor T, and an integral term ΣΔ.
To control the state value U, the synchronization controller 34 maintains, in addition to U, an events frequency factor T and the integral term ΣΔ. Both U and ΣΔ are symmetric functions of the data from the events of the common computing task. As best illustrated in
$$K_p (R - U_t) + \Sigma\Delta_t - K_d (U_t - U_{t-1})$$
where Kp is the proportional gain, Kd is the derivative gain, Ut is the latest state variable value received from the synchronization controller 34, Ut−1 is the previous value received from the synchronization controller 34, ΣΔt is the integral term, and R is the reference value or the target value, which may change over time at a much slower pace than the synchronization of event results. The events frequency factor T may be an estimation of how many events, over a given period of time, are expected within the context of the computing task(s). Kp may be how much a factor is weighted in the equation, whereas Kd is an estimated change in error. Both Kp and Kd may be tuned to the system 8, either by manual tuning or by a simulated or estimated value.
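As a worked illustration of the control expression built from Kp, Kd, R, Ut, Ut−1, and the integral term ΣΔt, the following minimal Python sketch evaluates the proportional term, adds the integral term, and subtracts the derivative term. The function name `control_output` and all numeric values are hypothetical and serve only to show the arithmetic.

```python
def control_output(r, u_t, u_prev, sum_delta, kp, kd):
    """Evaluate Kp*(R - Ut) + sum_delta - Kd*(Ut - Ut-1).

    r         -- reference (target) value R
    u_t       -- latest synchronized state variable Ut
    u_prev    -- previous synchronized state variable Ut-1
    sum_delta -- synchronized integral term (sum of error terms)
    kp, kd    -- proportional and derivative gains
    """
    proportional = kp * (r - u_t)
    derivative = kd * (u_t - u_prev)
    return proportional + sum_delta - derivative


# Hypothetical values: target 100, state moved from 85 to 90.
output = control_output(r=100.0, u_t=90.0, u_prev=85.0,
                        sum_delta=2.0, kp=0.5, kd=0.25)
# proportional = 0.5 * 10 = 5.0, derivative = 0.25 * 5 = 1.25
# output = 5.0 + 2.0 - 1.25 = 5.75
```

The derivative term enters with a negative sign, so a state variable that is already rising toward the target reduces the size of the correction.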
As best depicted in the more detailed description of functions of an input/output element 32 in
which is the difference between target R and the last measurement Ut, weighted by system integral gain Ki and the frequency factor Tt. The error term is only sent to the synchronization controller 34 and does not update the local value of the integral term. The computing device 10, then, continues to operate at the existing input level (Ut−1) despite the new event being generated, until the synchronization controller 34 sends back a consistent set of values for Ut, Tt, and ΣΔt.
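The split between locally generated error terms and the synchronized integral term can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the exact error-term expression is not reproduced in the text above, so the form Ki·(R − Ut)/Tt used here is an assumed reading of "weighted by Ki and Tt", and the names `Snapshot`, `error_term`, and `Controller` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Snapshot:
    """A consistent set of values sent back by the synchronization controller."""
    u: float          # state variable Ut
    t: float          # events frequency factor Tt
    sum_delta: float  # integral term (sum of error terms)


def error_term(r, snap, ki):
    # ASSUMPTION: the error term is taken to be Ki * (R - Ut) / Tt.
    return ki * (r - snap.u) / snap.t


class Controller:
    """Accumulates error terms and publishes them only at synchronization."""

    def __init__(self, snap):
        self.snap = snap
        self._pending = 0.0

    def receive(self, delta):
        self._pending += delta  # not visible to devices until synchronize()

    def synchronize(self, new_u, new_t):
        total = self.snap.sum_delta + self._pending
        self.snap = Snapshot(new_u, new_t, total)
        self._pending = 0.0
        return self.snap


# Hypothetical run: five events occur between two synchronizations.
local = Snapshot(u=90.0, t=5.0, sum_delta=0.0)
controller = Controller(local)
for _ in range(5):
    controller.receive(error_term(r=100.0, snap=local, ki=0.5))

before_sync = local.sum_delta               # local integral term is unchanged
local = controller.synchronize(new_u=92.0, new_t=5.0)
after_sync = local.sum_delta                # 5 events * 1.0 each
```

Under the assumed form, when the number of events in the interval matches Tt, the integral term grows by Ki·(R − Ut) per synchronization interval regardless of how many events occurred, which is one way to read the role of the frequency factor.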
To achieve actual, self-stabilizing control, the frequency factor Tt is updated to approximate the mean number of event updates expected between the successive measurement update intervals of length L, which is the time period having a length substantially longer than an average time interval between two consecutive members of the plurality of events. Accordingly, Tt can be calculated by a variety of methods, including, but not limited to: using an external input representing the expected event count over L; using an approximate count of past events over a certain window size (can be performed on a trailing basis to avoid a recursive synchronization problem); and using an estimate based on one of the current state variables (e.g., a function of Ut or a multiple of the current count as a function of the time of day).
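Of the methods listed above, the trailing-window count can be sketched in Python as follows. This is a minimal sketch, not a definitive implementation: the class name `TrailingRateEstimator`, its parameters, and the `floor` safeguard (which keeps the estimate away from zero) are assumptions introduced for illustration.

```python
from collections import deque


class TrailingRateEstimator:
    """Estimate Tt as the expected number of events in an interval of length L,
    based on the count of events seen in a trailing time window."""

    def __init__(self, window, interval_l, floor=1.0):
        self.window = window          # trailing window size, in seconds
        self.interval_l = interval_l  # synchronization interval L, in seconds
        self.floor = floor            # ASSUMPTION: lower bound on the estimate
        self._times = deque()

    def record(self, timestamp):
        """Record the arrival time of one event (timestamps non-decreasing)."""
        self._times.append(timestamp)

    def estimate(self, now):
        """Return the estimated number of events per interval of length L."""
        cutoff = now - self.window
        while self._times and self._times[0] <= cutoff:
            self._times.popleft()     # discard events outside the window
        scaled = len(self._times) * self.interval_l / self.window
        return max(self.floor, scaled)


# Hypothetical traffic: one event every 0.5 s for 60 s, with L = 10 s.
estimator = TrailingRateEstimator(window=60.0, interval_l=10.0)
for i in range(1, 121):
    estimator.record(i * 0.5)

t_estimate = estimator.estimate(now=60.0)  # 120 events * 10 / 60 = 20.0
```

Because the estimate uses only past events, it does not itself depend on the values being synchronized, which is consistent with the trailing-basis remark above.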
Turning now to
For example, the data operation may be bid pricing and/or associated client ad spend with such bid pricing for serving an online ad to the subject. In such examples, the plurality of events may be a plurality of changes in bid price for serving the advertisement to the subject to online advertising. Further, in such examples, the value of interest may be a bid price to be submitted, via at least one of the one or more processors, to an advertising exchange 60. The bid price may then be submitted to the ad exchange 60, upon request, over the network 9 via one or more transceivers. If the bid price is determined by the advertising exchange 60 to be a “win” or selected bid (decision 62), then a win notification is transferred to the ad exchange 40, and an ad is served with the bid to the ad exchange, for publication at one or more publishers 70A-N.
A combination of hardware and software may be used to implement instructions in association with any of the computing devices 10.
The processor 81 includes a local memory 82 and is in communication with a main memory including a read only memory 83 and a random-access memory 84 via a bus 88. The random-access memory 84 may be implemented by Synchronous Dynamic Random Access Memory (SDRAM), Dynamic Random Access Memory (DRAM), RAMBUS Dynamic Random Access Memory (RDRAM) and/or any other type of random access memory device. The read only memory 83 may be implemented by a hard drive, flash memory and/or any other desired type of memory device.
The computer 80 may also include an interface circuit 85. The interface circuit 85 may be implemented by any type of interface standard, such as, for example, an Ethernet interface, a universal serial bus (USB), and/or a PCI express interface. One or more input devices 86 are connected to the interface circuit 85. The input device(s) 86 permit a user to enter data and commands into the processor 81. The input device(s) 86 can be implemented by, for example, a keyboard, a mouse, a touchscreen, a track-pad, a trackball, and/or a voice recognition system. For example, the input device(s) 86 may include any wired or wireless device for connecting the computer 80 to the positioning system 88 to receive positioning signals.
One or more output devices 87 are also connected to the interface circuit 85. The output devices 87 can be implemented by, for example, display devices for associated data (e.g., a liquid crystal display, a cathode ray tube display (CRT), etc.). While depicted, it is certainly possible that an exemplary computer 80 may include no output device(s) 87.
Further, the computer 80 may include one or more network transceivers 89 for connecting to the network 12, such as the Internet, a WLAN, a LAN, a personal network, or any other network for connecting the computer 80 to one or more other computers or network capable devices.
As mentioned above the computer 80 may be used to execute machine readable instructions. For example, the computer 80 may execute machine readable instructions to perform the methods shown in the block diagrams of