Computer programs are groups of instructions that describe operations, or in other words actions, to be performed by a computer or other processor-based device. When a computer program is loaded and executed on computer hardware, the computer will behave in a predetermined manner by following the instructions of the computer program. Accordingly, the computer becomes a specialized machine that performs tasks prescribed by the instructions.
A programmer using one or more programming languages creates the instructions comprising a computer program. Typically, source code is specified or edited by a programmer manually and/or with the help of an integrated development environment (IDE) comprising numerous development services (e.g., editor, debugger, auto-fill, intelligent assistance . . . ). By way of example, a programmer may choose to implement source code utilizing an object-oriented programming language (e.g., C#®, Visual Basic®, Java . . . ) where programmatic logic is specified as interactions between instances of classes or objects, among other things. Subsequently, the source code can be compiled or otherwise transformed to another form to facilitate execution by a computer or like device.
A compiler conventionally produces code for a specific target from source code. For example, some compilers transform source code into native code for execution by a specific machine. Other compilers generate intermediate code from source code, where this intermediate code is subsequently interpreted dynamically at runtime or compiled just-in-time (JIT) to facilitate execution across computer platforms, for instance. Typically, most optimization of a program is performed at compile time when source code is compiled to native or intermediate code. However, limited program optimization can also be performed at runtime during code interpretation or JIT compilation.
The following presents a simplified summary in order to provide a basic understanding of some aspects of the disclosed subject matter. This summary is not an extensive overview. It is not intended to identify key/critical elements or to delineate the scope of the claimed subject matter. Its sole purpose is to present some concepts in a simplified form as a prelude to the more detailed description that is presented later.
Briefly described, the subject disclosure generally pertains to instruction optimization. More particularly, rather than eagerly executing program instructions at runtime, execution can be delayed and the instructions can be recorded. Subsequently or concurrently, the recorded instructions can be optimized utilizing local and/or global optimization techniques. For example, instructions can be removed, reordered, and/or combined based on other recorded instructions. When instructions need to be executed, for instance to supply a result, an optimized group of instructions, which is not worse than an original group of instructions in terms of some metric (e.g., time to run, amount of memory . . . ), is executed.
To the accomplishment of the foregoing and related ends, certain illustrative aspects of the claimed subject matter are described herein in connection with the following description and the annexed drawings. These aspects are indicative of various ways in which the subject matter may be practiced, all of which are intended to be within the scope of the claimed subject matter. Other advantages and novel features may become apparent from the following detailed description when considered in conjunction with the drawings.
Details below are generally directed toward instruction optimization. Instructions can be recorded and transformed at runtime, prior to execution, to enhance execution of operations prescribed thereby. Such a transformation can involve removing, reordering, and/or combining instructions. In other words, execution can be delayed by recording operations that need to be performed and optimizing the operations prior to execution, rather than immediately performing operations. This can be termed just-in-time instruction optimization. Further, such functionality can correspond to instruction virtualization, since a layer of indirection is introduced between the instructions as specified and the instructions that are actually executed. In accordance with one embodiment, optimization can be performed locally, over a small set of instructions (e.g., peephole or window). Additionally or alternatively, a larger, or more global, optimization approach can be employed.
Various aspects of the subject disclosure are now described in more detail with reference to the annexed drawings, wherein like numerals refer to like or corresponding elements throughout. It should be understood, however, that the drawings and detailed description relating thereto are not intended to limit the claimed subject matter to the particular form disclosed. Rather, the intention is to cover all modifications, equivalents, and alternatives falling within the spirit and scope of the claimed subject matter.
Referring initially to FIG. 1, an instruction optimization system 100 is illustrated including a recordation component 110 and an optimization component 120.
The recordation component 110 can receive, retrieve, or otherwise obtain or acquire an instruction stream, or in other words, a series of instructions that specifies one or more actions to be performed, and record those instructions as they are acquired, for example. In some sense, a buffer of instructions is created where instructions are recorded and not executed. The instructions can be recorded on any computer-readable medium.
The optimization component 120 can transform recorded instructions into an optimized form, for instance as a function of algebraic properties, among other things (e.g., domain specific information, cost . . . ). Optimization can be triggered by an internal or external trigger or event. By way of example and not limitation, the optimization can be triggered upon recording of a determined number of instructions and/or upon a request for a result that the instructions produce. Upon occurrence of one or more trigger events, the optimization component 120 can transform the recorded instructions into a better form to facilitate optimized execution of actions specified thereby.
Turning attention to FIG. 2, a representative optimization component 120 is depicted in further detail, including, among other things, a reorder component 220 and a combination component 230.
The reorder component 220 can reorder instructions to optimize computation. In other words, there may be computational costs associated with instruction-set permutations that the reorder component 220 can seek to minimize. For example, execution can be improved by filtering a data set prior to performing some action, since the data set will likely be reduced by the filtering. More particularly, if instructions indicate that an order operation (e.g., OrderBy) is to be performed prior to a filter operation (e.g., Where), the instructions can be reversed such that the filter operation is performed first and the order operation executes with respect to a potentially reduced data set.
The combination component 230 can combine, or in other words coalesce, two or more instructions into a single instruction. More specifically, a new instruction can be generated that captures multiple instructions, and the original instructions can be removed. For instance, rather than performing multiple filter operations requiring a data set to be traversed multiple times, the filter operations can be combined such that the data set need only be traversed once.
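By way of a concrete illustration (the collection and predicates below are illustrative assumptions, not taken from the disclosure), both rewrites can be expressed over LINQ-style operators as follows:

    using System;
    using System.Linq;

    class RewriteDemo
    {
        static void Main()
        {
            int[] xs = { 5, 3, 8, 1, 9, 2 };

            // As recorded: order first, then two separate filter passes.
            var recorded = xs.OrderBy(x => x).Where(x => x > 2).Where(x => x < 9);

            // After reordering and combination: one coalesced filter runs first,
            // so the sort sees fewer elements and the data is traversed once.
            var optimized = xs.Where(x => x > 2 && x < 9).OrderBy(x => x);

            // Both sequences yield 3, 5, 8.
            Console.WriteLine(string.Join(", ", recorded));
            Console.WriteLine(string.Join(", ", optimized));
        }
    }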
Returning to FIG. 1, optimization by way of the optimization component 120 can be performed at various times, for example incrementally as instructions are recorded or upon occurrence of a trigger event, such as a request for a result produced by the recorded instructions.
Optimization, by way of the optimization component 120, can also be performed at various levels of granularity. In accordance with one embodiment, optimization can be performed with respect to a small set of instructions (e.g., peephole, window). For example, optimization can be triggered after each instruction is acquired with respect to a previous “N” adjacent instructions where “N” is a positive integer. Additionally or alternatively, a more global approach can be taken where optimization is performed on a large set of instructions. For instance, the optimization can be initiated just prior to execution, such as when a result produced as a function of the recorded instruction is requested. In one embodiment, simpler optimizations can be designated for performance with a small set of instructions whereas more complex optimizations can be designated for performance with respect to a larger set of instructions to leverage the aggregate knowledge regarding the instructions. Of course, this is not required. In fact, optimization can be very configurable such that one can designate which optimizations to perform and when they should be performed.
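The following is a minimal sketch of the local approach under assumed types; the names “Instruction,” “Filter,” and “Recorder” are hypothetical rather than part of the disclosure:

    using System;
    using System.Collections.Generic;

    // Illustrative instruction representation.
    public abstract class Instruction { }

    public sealed class Filter : Instruction
    {
        public Func<int, bool> Predicate { get; }
        public Filter(Func<int, bool> predicate) { Predicate = predicate; }
    }

    public sealed class Recorder
    {
        private readonly List<Instruction> _recorded = new List<Instruction>();

        // Recording an instruction triggers a local pass over a window of
        // adjacent instructions (here, the last two).
        public void Record(Instruction instruction)
        {
            _recorded.Add(instruction);
            PeepholePass();
        }

        private void PeepholePass()
        {
            int n = _recorded.Count;

            // Rule: Filter(p) followed by Filter(q) is coalesced into a
            // single Filter(x => p(x) && q(x)).
            if (n >= 2 && _recorded[n - 2] is Filter f1 && _recorded[n - 1] is Filter f2)
            {
                _recorded.RemoveRange(n - 2, 2);
                _recorded.Add(new Filter(x => f1.Predicate(x) && f2.Predicate(x)));
            }
        }

        public IReadOnlyList<Instruction> Instructions => _recorded;
    }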
The functionality provided by the instruction optimization system 100 can be implemented in a variety of different ways. In one instance, dynamic dispatch can be utilized, where the result of an operation exposes an object with specialized behavior for consecutive operations (e.g., virtual methods). Similarly, a state machine can be employed, wherein acquisition of additional knowledge by way of instructions moves the machine from one node to another as a function of the knowledge, or in other words, state. Of course, these are but two implementation mechanisms that are contemplated. Other implementations are also possible and will be apparent to one of skill in the art.
The instruction optimization system 100 can be employed alone or in combination with other instruction optimization systems. More specifically, an instruction optimization system 100 can include a number of instruction optimization sub-systems. As illustrated in FIG. 3, for instance, instructions can be divided amongst and processed by instruction optimization sub-systems 310 and 320.
By way of example, and not limitation, instructions can relate to graphics, or more specifically rendering a polygon. Instructions fed into the instruction optimization system 100 can be divided and distributed to instruction optimization sub-systems 310 and 320, which can render triangles, for instance. Accordingly, execution of the render-polygon instruction has been virtualized, since it is divided into simpler operations, namely rendering multiple triangles that together form the polygon. Furthermore, optimization can occur if it can be determined that two polygons overlap, which can result in optimally rendering only one polygon.
In accordance with one exemplary embodiment, the optimization can be performed with respect to query instructions, or operators, comprising a query expression, for instance, such as but not limited to language-integrated query (LINQ) expressions. Query expressions specified in higher-level languages, such as C#® and Visual Basic®, can benefit from optimization strategies that work independently from a back-end query language (e.g., Transact SQL) that is targeted through query providers.
At a local level, optimization can be carried out on query operators, which are represented as methods that implement the functionality of a named operator (e.g., Select, Where . . . ). Furthermore, semantic properties of query operators can be exploited to aid optimization. Consider the following query expression:
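For instance, a representative query expression (the “Person” type and its data are illustrative assumptions) might order a collection twice and then filter it:

    using System;
    using System.Linq;

    record Person(string Name, int Age);

    class QueryDemo
    {
        static void Main()
        {
            Person[] people =
            {
                new("Ann", 34), new("Bob", 19), new("Cid", 27),
            };

            // Two orderings followed by a filter, as the programmer wrote them.
            var result = people
                .OrderBy(p => p.Name)
                .OrderBy(p => p.Age)
                .Where(p => p.Age >= 21);

            foreach (var p in result) Console.WriteLine(p);
        }
    }

As discussed below, only the latest ordering need be honored, and the filter can be hoisted in front of the remaining sort.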
Note that the ability to compose queries can easily lead to suboptimal queries. In particular, one can write nested query expressions in an indirect manner, much like creation of views in database products:
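Continuing the previous example (again with illustrative names), an intermediate query can be defined once, view-style, and composed elsewhere, yielding adjacent operations that a local rewrite can collapse:

    // A reusable, view-like intermediate query: adults only.
    var adults = people.Where(p => p.Age >= 21);

    // Composed later, possibly by unrelated code, yielding
    // Where(...).Where(...).OrderBy(...) under the hood, where the two
    // adjacent filters can be coalesced into a single pass.
    var query = adults
        .Where(p => p.Name.StartsWith("A"))
        .OrderBy(p => p.Name);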
Appendix A provides a few exemplary properties that hold at least for query operators. For the most part, these properties enable local optimization based thereon and often enable two operators to be collapsed into a single operator. These and other optimizations can be realized by using virtual dispatch mechanisms, where the result of a sequence operator exposes an object with specialized behavior for consecutive sequence operator applications. For example, the below sample code illustrates how the result of an “OrderBy” call reacts to “Where” and “OrderBy” operations immediately following the call:
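The listing below is a sketch consistent with the description rather than a verbatim implementation; the member names (e.g., “_left,” “_keySelector”) are assumptions, and the companion base class “Sequence<T>” is sketched after the next paragraph:

    using System;
    using System.Collections.Generic;
    using System.Linq;

    // The object returned by "OrderBy" specializes the behavior of operators
    // that immediately follow it.
    internal sealed class OrderedSequence<T, TKey> : Sequence<T>
    {
        private readonly Sequence<T> _left;          // what OrderBy was applied to
        private readonly Func<T, TKey> _keySelector; // the ordering key

        public OrderedSequence(Sequence<T> left, Func<T, TKey> keySelector)
        {
            _left = left;
            _keySelector = keySelector;
        }

        // OrderBy(k).Where(p) is rewritten as Where(p).OrderBy(k), so the
        // ordering runs over the potentially reduced, filtered data set.
        public override Sequence<T> Where(Func<T, bool> predicate) =>
            _left.Where(predicate).OrderBy(_keySelector);

        // OrderBy(k1).OrderBy(k2) is rewritten as OrderBy(k2); per the text,
        // the earlier ordering is canceled as redundant.
        public override Sequence<T> OrderBy<TOther>(Func<T, TOther> keySelector) =>
            _left.OrderBy(keySelector);

        // Enumerating "Source" triggers the optimized computation.
        protected override IEnumerable<T> Source =>
            Enumerable.OrderBy(_left, _keySelector);
    }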
Each of those operators overrides a virtual method on the base class, “Sequence<T>:”
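A sketch of that base class follows, assuming the members the text names (the abstract “Source” property and virtual query operator methods) together with a hypothetical filter node:

    using System;
    using System.Collections;
    using System.Collections.Generic;
    using System.Linq;

    // "Sequence<T>" implements IEnumerable<T> by means of the abstract
    // "Source" property; subclasses override the virtual query operator
    // methods to perform local optimizations.
    public abstract class Sequence<T> : IEnumerable<T>
    {
        // Rewrites the recorded operations in terms of LINQ to Objects;
        // enumeration triggers the optimized computation.
        protected abstract IEnumerable<T> Source { get; }

        // Defaults: simply record the operation, without optimization.
        public virtual Sequence<T> Where(Func<T, bool> predicate) =>
            new WhereSequence<T>(this, predicate);

        public virtual Sequence<T> OrderBy<TKey>(Func<T, TKey> keySelector) =>
            new OrderedSequence<T, TKey>(this, keySelector);

        public IEnumerator<T> GetEnumerator() => Source.GetEnumerator();
        IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
    }

    // A filter node; consecutive filters are coalesced into a single pass.
    internal sealed class WhereSequence<T> : Sequence<T>
    {
        private readonly Sequence<T> _left;
        private readonly Func<T, bool> _predicate;

        public WhereSequence(Sequence<T> left, Func<T, bool> predicate)
        {
            _left = left;
            _predicate = predicate;
        }

        // Where(p).Where(q) is rewritten as Where(x => p(x) && q(x)).
        public override Sequence<T> Where(Func<T, bool> predicate) =>
            new WhereSequence<T>(_left, x => _predicate(x) && predicate(x));

        protected override IEnumerable<T> Source =>
            Enumerable.Where(_left, _predicate);
    }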
Sequence objects keep their left-side sequence object (e.g., what the operator represented by the type is being applied to). Subclasses can override the virtual query operator methods provided in order to do a local optimization. “Sequence<T>” implements “IEnumerable<T>,” whose implementation is provided by means of an abstract property called “Source.” Here, the sequence operations can be rewritten in terms of LINQ queries. Different strategies exist with respect to creating those “Sequence<T>” objects. For example, “Sequence<T>” objects can be utilized internally to existing “IEnumerable<T>” extension methods, or a user can be allowed to explicitly move into a world of “optimized sequences,” for instance utilizing an extension method on “IEnumerable<T>.”
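For instance, under the latter strategy, a hypothetical extension method (the name “AsOptimized” is an assumption) can move a plain enumerable into the world of optimized sequences:

    using System.Collections.Generic;

    public static class SequenceExtensions
    {
        // Explicitly moves a plain enumerable into the world of
        // "optimized sequences."
        public static Sequence<T> AsOptimized<T>(this IEnumerable<T> source) =>
            new SourceSequence<T>(source);

        // The root node merely wraps the original source.
        private sealed class SourceSequence<T> : Sequence<T>
        {
            private readonly IEnumerable<T> _source;
            public SourceSequence(IEnumerable<T> source) { _source = source; }
            protected override IEnumerable<T> Source => _source;
        }
    }

With these pieces in place, people.AsOptimized().OrderBy(p => p.Name).OrderBy(p => p.Age).Where(p => p.Age >= 21) is rewritten, at query construction time, into a single filter followed by a single ordering, which is precisely the rewriting walked through next.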
More specifically, with reference to FIG. 4, at query execution time, the “source” 400 is encapsulated by an optimized version thereof, namely “Sequence<T>” 410. Operations with respect to data can now be executed with respect to “Sequence<T>” 410 utilizing methods that override virtual methods of the base class “Sequence<T>” 410. When the first operator of the query expression, “OrderBy,” is seen, an object “OrderedSequence<T>” 420 is produced that captures the “OrderBy” query operator with respect to a first key selector “k1.” When the second “OrderBy” operator is seen with respect to a second key selector “k2,” another “OrderedSequence<T>” object 430 cancels the first ordering and replaces it with the second ordering. In other words, the query execution plan for two “OrderBy” operations includes solely the latest “OrderBy” operation, since the first ordering can be canceled as redundant. Subsequently, upon identifying the “Where” operator, the “OrderedSequence<T>” 440 can swap the ordering of the filter and ordering operations to potentially limit the data set prior to performing the ordering. Stated differently, the query plan for two “OrderBy” operations followed by a “Where” is the “Where” operator followed by the second “OrderBy” operation.
Note that the optimizations can be carried out at query construction/formulation time (e.g., after compilation but before execution) as a result of calling query operator methods. This technique can be used with various querying application programming interfaces (APIs) underneath, not just “IEnumerable<T>.” In particular, the “Sequence<T>” layer is an abstraction over the rewrite of operations and their relative ordering. Simply substituting the “Source” property type for another type that supports similar operators will suffice to rewrite operations applied to it. For example, this technique can be used to optimize query operators over “IEnumerable<T>” or “IObservable<T>,” or their respective homo-iconic “IQueryable<T>” and “IQbservable<T>” forms. For the latter forms, an underlying query provider will be provided with a pre-optimized query in terms of high-level querying operations.
To illustrate the generic nature of the rewriting mechanism, a similar set of types isomorphic to “System.String” operations can be built, for example to eliminate conflicting operations (e.g., those that can be canceled out):
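A minimal sketch of that idea follows; the type “Str,” its members, and the assumption of ordinal casing (so that a lower-casing followed by an upper-casing can be canceled) are illustrative rather than part of the disclosure:

    using System;

    public class Str
    {
        protected readonly string Underlying;
        public Str(string value) { Underlying = value; }

        // Materializes pending operations; the root has none.
        public virtual string Value => Underlying;

        // Record a lower-casing instead of performing it.
        public virtual Str ToLower() => new LowerStr(Underlying);

        // The lazy operation that triggers the optimized computation.
        public virtual string ToUpper() => Underlying.ToUpperInvariant();
    }

    public sealed class LowerStr : Str
    {
        public LowerStr(string value) : base(value) { }

        // Only on demand is the recorded lower-casing actually performed.
        public override string Value => Underlying.ToLowerInvariant();

        // ToLower is idempotent; a repeated application is canceled.
        public override Str ToLower() => this;

        // ToLower followed by ToUpper: the pending lower-casing is redundant
        // (ordinal casing assumed) and is skipped entirely.
        public override string ToUpper() => Underlying.ToUpperInvariant();
    }

Here, new Str("MiXeD").ToLower().ToLower().ToUpper() performs exactly one case conversion, although three operations were specified.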
In fact, the general pattern is to have a lazy operation that triggers the optimized computation. In the above “String” example, it is “ToUpper.” In the case of “Sequence<T>,” enumeration of the “Source” property triggers the optimized computation.
Furthermore, the subject optimization mechanisms are beneficial for any immutable type. The problem with immutable types is that whenever something needs to be done that corresponds to a mutation, or change, a new “thing” (e.g., object, element . . . ) needs to be created. For example, there can be many instructions pertaining to creating a thing, deleting the thing, and creating a new thing. Instead of performing several allocations and de-allocations with respect to an immutable thing, all mutations can be recorded and later utilized to create only a single new immutable thing capturing all the mutations up to the point of allocation.
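As a sketch of this pattern for an immutable integer array (the “LazyArray” type and its members are hypothetical), mutations can be recorded as a chain of pending writes and applied in a single allocation when a value is finally demanded:

    // Mutations are recorded as a chain of pending writes; "ToArray" triggers
    // a single allocation that applies all of the surviving writes at once.
    public sealed class LazyArray
    {
        private readonly int[] _source;     // the original, never mutated
        private readonly LazyArray _parent; // earlier recorded state, if any
        private readonly int _index;        // pending write: position
        private readonly int _value;        // pending write: new value

        public LazyArray(int[] source) { _source = source; }

        private LazyArray(LazyArray parent, int index, int value)
        {
            _source = parent._source;
            _parent = parent;
            _index = index;
            _value = value;
        }

        // Record the mutation; no copying takes place here.
        public LazyArray With(int index, int value) => new LazyArray(this, index, value);

        // The lazy trigger: one allocation, with later writes overriding
        // earlier ones.
        public int[] ToArray()
        {
            var result = (int[])_source.Clone();
            Apply(result);
            return result;
        }

        private void Apply(int[] target)
        {
            _parent?.Apply(target);                  // earlier writes first
            if (_parent != null) target[_index] = _value;
        }
    }

For example, new LazyArray(new[] { 1, 2, 3 }).With(0, 9).With(0, 7).ToArray() records two mutations yet allocates a single new array, { 7, 2, 3 }.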
The aforementioned systems, architectures, environments, and the like have been described with respect to interaction between several components. It should be appreciated that such systems and components can include those components or sub-components specified therein, some of the specified components or sub-components, and/or additional components. Sub-components could also be implemented as components communicatively coupled to other components rather than included within parent components. Further yet, one or more components and/or sub-components may be combined into a single component to provide aggregate functionality. Communication between systems, components, and/or sub-components can be accomplished in accordance with a push and/or pull model. The components may also interact with one or more other components not specifically described herein for the sake of brevity, but known by those of skill in the art.
Furthermore, various portions of the disclosed systems above and methods below can include or consist of artificial intelligence, machine learning, or knowledge or rule-based components, sub-components, processes, means, methodologies, or mechanisms (e.g., support vector machines, neural networks, expert systems, Bayesian belief networks, fuzzy logic, data fusion engines, classifiers . . . ). Such components, inter alia, can automate certain mechanisms or processes performed thereby to make portions of the systems and methods more adaptive as well as efficient and intelligent. By way of example and not limitation, the instruction optimization system 100 can employ such mechanisms to determine or infer optimizations, for example as a function of history or context information.
In view of the exemplary systems described supra, methodologies that may be implemented in accordance with the disclosed subject matter will be better appreciated with reference to the flow charts of the annexed drawings.
Referring to the flow charts, a method of instruction optimization is illustrated. As described above, rather than being eagerly executed, instructions can be recorded, optimized locally and/or globally, and subsequently executed upon occurrence of a trigger event, such as a request for a result produced by the instructions.
To facilitate clarity and understanding regarding aspects of the claimed subject matter, consider the following real-world analogy. Suppose a homeowner instructs a contractor to paint his house and install new windows while the homeowner is on vacation for two weeks. The contractor can eagerly paint the house and install the new windows the next day. Alternatively, the contractor can simply note that the house is to be painted and new windows are to be installed and, just before the homeowner returns from vacation, perform all the work. In other words, the contractor can virtualize the instructions. Although the homeowner may think that the work will commence shortly after the instruction, there is no difference to the homeowner as to when the work is performed (e.g., as well as how and by whom the work is performed). However, the efficiency with which the work is performed can be optimized in the latter case. For example, suppose the homeowner calls midway through his vacation and changes the paint color of the house. If the contractor already painted the house, he would have to re-paint the house in the new color, doing twice the work. However, if he had not yet begun, he could just change the color previously noted and utilize the new color when he paints the house the first and only time. Further, suppose the homeowner adds an additional task, such as fixing the roof. If the contractor already performed the work, he has likely removed all his tools from the job site and thus would have to bring them back to fix the roof. However, if the contractor performed work lazily, he could simply add the task to his list. The contractor can then wait until just before the homeowner returns from vacation to complete all the work. Of course, if the homeowner decides to come home early, this could also trigger the contractor to complete all the tasks on the list. Still further yet, it is to be appreciated that the contractor need not complete all tasks himself; rather, the contractor can delegate work to one or more sub-contractors, who can perform similar lazy optimization.
As used herein, the terms “component,” “system,” and “engine” as well as forms thereof are intended to refer to a computer-related entity, either hardware, a combination of hardware and software, software, or software in execution. For example, a component may be, but is not limited to being, a process running on a processor, a processor, an object, an instance, an executable, a thread of execution, a program, and/or a computer. By way of illustration, both an application running on a computer and the computer can be a component. One or more components may reside within a process and/or thread of execution and a component may be localized on one computer and/or distributed between two or more computers.
The word “exemplary” or various forms thereof are used herein to mean serving as an example, instance, or illustration. Any aspect or design described herein as “exemplary” is not necessarily to be construed as preferred or advantageous over other aspects or designs. Furthermore, examples are provided solely for purposes of clarity and understanding and are not meant to limit or restrict the claimed subject matter or relevant portions of this disclosure in any manner. It is to be appreciated a myriad of additional or alternate examples of varying scope could have been presented, but have been omitted for purposes of brevity.
As used herein, the term “inference” or “infer” refers generally to the process of reasoning about or inferring states of the system, environment, and/or user from a set of observations as captured via events and/or data. Inference can be employed to identify a specific context or action, or can generate a probability distribution over states, for example. The inference can be probabilistic—that is, the computation of a probability distribution over states of interest based on a consideration of data and events. Inference can also refer to techniques employed for composing higher-level events from a set of events and/or data. Such inference results in the construction of new events or actions from a set of observed events and/or stored event data, whether or not the events are correlated in close temporal proximity, and whether the events and data come from one or several event and data sources. Various classification schemes and/or systems (e.g., support vector machines, neural networks, expert systems, Bayesian belief networks, fuzzy logic, data fusion engines . . . ) can be employed in connection with performing automatic and/or inferred action in connection with the claimed subject matter.
Furthermore, to the extent that the terms “includes,” “contains,” “has,” “having” or variations in form thereof are used in either the detailed description or the claims, such terms are intended to be inclusive in a manner similar to the term “comprising” as “comprising” is interpreted when employed as a transitional word in a claim.
In order to provide a context for the claimed subject matter, FIG. 7 as well as the following discussion are intended to provide a brief, general description of a suitable environment in which various aspects of the subject matter can be implemented.
While the above disclosed system and methods can be described in the general context of computer-executable instructions of a program that runs on one or more computers, those skilled in the art will recognize that aspects can also be implemented in combination with other program modules or the like. Generally, program modules include routines, programs, components, and data structures, among other things, that perform particular tasks and/or implement particular abstract data types. Moreover, those skilled in the art will appreciate that the above systems and methods can be practiced with various computer system configurations, including single-processor, multi-processor or multi-core processor computer systems, mini-computing devices, mainframe computers, as well as personal computers, hand-held computing devices (e.g., personal digital assistant (PDA), phone, watch . . . ), microprocessor-based or programmable consumer or industrial electronics, and the like. Aspects can also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. However, some, if not all, aspects of the claimed subject matter can be practiced on stand-alone computers. In a distributed computing environment, program modules may be located in one or both of local and remote memory storage devices.
With reference to FIG. 7, illustrated is an example computer 710 including one or more processor(s) 720, memory 730, a system bus 740, mass storage 750, and one or more interface components 770. The system bus 740 communicatively couples at least the above system constituents.
The processor(s) 720 can be implemented with a general purpose processor, a digital signal processor (DSP), an application specific integrated circuit (ASIC), a field programmable gate array (FPGA) or other programmable logic device, discrete gate or transistor logic, discrete hardware components, or any combination thereof designed to perform the functions described herein. A general-purpose processor may be a microprocessor, but in the alternative, the processor may be any processor, controller, microcontroller, or state machine. The processor(s) 720 may also be implemented as a combination of computing devices, for example a combination of a DSP and a microprocessor, a plurality of microprocessors, multi-core processors, one or more microprocessors in conjunction with a DSP core, or any other such configuration.
The computer 710 can include or otherwise interact with a variety of computer-readable media to facilitate control of the computer 710 to implement one or more aspects of the claimed subject matter. The computer-readable media can be any available media that can be accessed by the computer 710 and includes volatile and nonvolatile media, and removable and non-removable media. By way of example, and not limitation, computer-readable media may comprise computer storage media and communication media.
Computer storage media includes volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer-readable instructions, data structures, program modules, or other data. Computer storage media includes, but is not limited to, memory devices (e.g., random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM) . . . ), magnetic storage devices (e.g., hard disk, floppy disk, cassettes, tape . . . ), optical disks (e.g., compact disk (CD), digital versatile disk (DVD) . . . ), and solid state devices (e.g., solid state drive (SSD), flash memory drive (e.g., card, stick, key drive . . . ) . . . ), or any other medium which can be used to store the desired information and which can be accessed by the computer 710.
Communication media typically embodies computer-readable instructions, data structures, program modules, or other data in a modulated data signal such as a carrier wave or other transport mechanism and includes any information delivery media. The term “modulated data signal” means a signal that has one or more of its characteristics set or changed in such a manner as to encode information in the signal. By way of example, and not limitation, communication media includes wired media such as a wired network or direct-wired connection, and wireless media such as acoustic, RF, infrared and other wireless media. Combinations of any of the above should also be included within the scope of computer-readable media.
Memory 730 and mass storage 750 are examples of computer-readable storage media. Depending on the exact configuration and type of computing device, memory 730 may be volatile (e.g., RAM), non-volatile (e.g., ROM, flash memory . . . ) or some combination of the two. By way of example, the basic input/output system (BIOS), including basic routines to transfer information between elements within the computer 710, such as during start-up, can be stored in nonvolatile memory, while volatile memory can act as external cache memory to facilitate processing by the processor(s) 720, among other things.
Mass storage 750 includes removable/non-removable, volatile/non-volatile computer storage media for storage of large amounts of data relative to the memory 730. For example, mass storage 750 includes, but is not limited to, one or more devices such as a magnetic or optical disk drive, floppy disk drive, flash memory, solid-state drive, or memory stick.
Memory 730 and mass storage 750 can include, or have stored therein, operating system 760, one or more applications 762, one or more program modules 764, and data 766. The operating system 760 acts to control and allocate resources of the computer 710. Applications 762 include one or both of system and application software and can exploit management of resources by the operating system 760 through program modules 764 and data 766 stored in memory 730 and/or mass storage 750 to perform one or more actions. Accordingly, applications 762 can turn a general-purpose computer 710 into a specialized machine in accordance with the logic provided thereby.
All or portions of the claimed subject matter can be implemented using standard programming and/or engineering techniques to produce software, firmware, hardware, or any combination thereof to control a computer to realize the disclosed functionality. By way of example and not limitation, the instruction optimization system 100, or portions thereof, can be, or form part of, an application 762, and include one or more modules 764 and data 766 stored in memory 730 and/or mass storage 750 whose functionality can be realized when executed by one or more processor(s) 720.
In accordance with one particular embodiment, the processor(s) 720 can correspond to a system on a chip (SOC) or like architecture including, or in other words integrating, both hardware and software on a single integrated circuit substrate. Here, the processor(s) 720 can include one or more processors as well as memory at least similar to the processor(s) 720 and memory 730, among other things. Conventional processors include a minimal amount of hardware and software and rely extensively on external hardware and software. By contrast, an SOC implementation of a processor is more powerful, as it embeds hardware and software therein that enable particular functionality with minimal or no reliance on external hardware and software. For example, the instruction optimization system 100 and/or associated functionality can be embedded within hardware in an SOC architecture.
The computer 710 also includes one or more interface components 770 that are communicatively coupled to the system bus 740 and facilitate interaction with the computer 710. By way of example, the interface component 770 can be a port (e.g., serial, parallel, PCMCIA, USB, FireWire . . . ) or an interface card (e.g., sound, video . . . ) or the like. In one example implementation, the interface component 770 can be embodied as a user input/output interface to enable a user to enter commands and information into the computer 710 through one or more input devices (e.g., pointing device such as a mouse, trackball, stylus, touch pad, keyboard, microphone, joystick, game pad, satellite dish, scanner, camera, other computer . . . ). In another example implementation, the interface component 770 can be embodied as an output peripheral interface to supply output to displays (e.g., CRT, LCD, plasma . . . ), speakers, printers, and/or other computers, among other things. Still further yet, the interface component 770 can be embodied as a network interface to enable communication with other computing devices (not shown), such as over a wired or wireless communications link.
What has been described above includes examples of aspects of the claimed subject matter. It is, of course, not possible to describe every conceivable combination of components or methodologies for purposes of describing the claimed subject matter, but one of ordinary skill in the art may recognize that many further combinations and permutations of the disclosed subject matter are possible. Accordingly, the disclosed subject matter is intended to embrace all such alterations, modifications, and variations that fall within the spirit and scope of the appended claims.