Hardware description languages (“HDLs”) are modeling languages used by hardware engineers to describe the structure and behavior of electronic circuits, most commonly digital logic circuits. Examples of HDLs include Very High Speed Integrated Circuit (“VHSIC”) HDL and VERILOG.
HDLs commonly require many lines of code to model digital logic circuits. Even for hardware engineers that are very familiar with HDLs, creation of such code can be extremely time consuming. Moreover, the more lines of code present in a design, the more likely it is for the design to include errors or perform poorly.
Additionally, because HDLs typically utilize a different programming paradigm than imperative programming languages, software engineers that are not intimately familiar with HDLs commonly have a very difficult time utilizing these languages. As a result, electronic circuits generated from HDL created by software engineers can also include errors or perform poorly.
C to HDL tools exist that can convert C-language or C-like program code into HDLs, like VHSIC HDL or VERILOG. There are, however, certain types of programming language constructs that these tools implement inefficiently in hardware. For example, these tools typically create multiple instances of the same hardware when implementing a function that is called from multiple locations in program source code. This results in the inefficient use of limited hardware resources and can result in poor performance.
It is with respect to these and other technical challenges that the disclosure made herein is presented.
Technologies are disclosed for generating a synchronous digital circuit (“SDC”) from a source code construct defining a function call. Through implementations of the disclosed technologies, an SDC can be generated that includes a single instance of hardware for implementing a function called from multiple locations in program source code. This results in more efficient utilization of available hardware, such as when the SDC is implemented in a field-programmable gate array (“FPGA”), as compared to C to HDL tools. Other technical benefits not specifically mentioned herein can also be realized through implementations of the disclosed subject matter.
In order to realize the technical benefits mentioned briefly above, program source code is generated in a multi-threaded imperative programming language and stored. The programming language is imperative in that program statements are executed one after another, and multi-threaded in that multiple threads of execution can execute in parallel. A thread, as used herein, refers to a collection of local variables that flow through a hardware circuit and that are operated on as the circuit processes them.
The multi-threaded imperative programming language includes language constructs (or “constructs”) that map to circuit implementations. A language construct is a syntactically allowable part of a program that may be formed from one or more lexical tokens. The circuit implementations can be implemented as a SDC in a FPGA, a Gate Array, an Application-Specific Integrated Circuit (“ASIC”), or another type of suitable device. Another hardware component, such as a network interface card (“NIC”), can be configured with the FPGA, gate array, or ASIC, in order to implement desired functionality.
In one configuration, the multi-threaded imperative programming language includes a language construct that defines a function call (which might be referred to herein as a “function call construct”). This construct maps to a circuit implementation for implementing the function call in hardware. The construct can identify the function call and one or more input parameters for the function (referred to herein as “function parameters”). The same construct can be utilized to enable a called function to call other functions.
The circuit implementation corresponding to the function call construct includes a first hardware pipeline. The first hardware pipeline can implement statements located before the function call in the program source code. The first hardware pipeline outputs variables to a first queue and outputs the function parameters to a second queue.
The circuit implementation corresponding to the function call construct also includes a second hardware pipeline that obtains the function parameters from the second queue. The second hardware pipeline also includes hardware for implementing the function itself. For example, the second hardware pipeline might implement the function by performing operations on the function parameters and/or other values. The second hardware pipeline stores results generated by performance of the function in a third queue.
The circuit implementation for the function call construct also includes a third hardware pipeline. The third hardware pipeline implements statements located after the function call in the program source code. The third hardware pipeline can retrieve the results generated by the second pipeline from the third queue. The third hardware pipeline can also retrieve the variables stored by the first hardware pipeline from the first queue. The third hardware pipeline can perform hardware operations specified by the source code using the variables and the results of the function.
In some configurations, the circuit implementation can include hardware for implementing function invocations from multiple locations within program source code. In these configurations, the circuit implementation for the function call can include a fourth hardware pipeline. The fourth hardware pipeline can implement statements located before a second function call in the program source code.
The fourth hardware pipeline outputs second variables to a fourth queue and outputs second function parameters to a fifth queue. In these configurations, the second hardware pipeline (i.e. the pipeline implementing the function) can receive the second function parameters from the fifth queue and perform the specified function using the second function parameters. The second hardware pipeline can then store the results of the function in a sixth queue.
A fifth hardware pipeline can implement statements located after the second function call in the program source code. The fifth hardware pipeline can retrieve the results generated by the second pipeline from the sixth queue. The fifth hardware pipeline can also retrieve the second variables stored by the fourth hardware pipeline in the fourth queue.
The fifth hardware pipeline can then perform operations specified by the source code using the second variables and the results of the performance of the function using the second function parameters. In these configurations, the second hardware pipeline can utilize a hidden parameter to determine whether results are to be stored in the third queue (i.e. for consumption by the third pipeline) or the sixth queue (i.e. for consumption by the fifth pipeline).
Once program source code has been defined that includes a construct that maps to a circuit implementation for a function, the source code, including the construct, can be compiled to generate a circuit description. The circuit description can be expressed using HDL, for instance. The circuit description can, in turn, be used to generate an SDC that includes the circuit implementation. For example, HDL might be utilized to generate an FPGA image or bitstream that includes the circuit implementation defined by the construct. The FPGA image or bitstream can, in turn, be utilized to program an FPGA that includes the circuit implementation.
As discussed briefly above, implementations of the technologies disclosed herein enable more efficient utilization of available hardware when implementing functions as compared to previous solutions such as, for instance, C to HDL tools. Other technical benefits not specifically identified herein can also be realized through implementations of the disclosed technologies.
It should be appreciated that the above-described subject matter can be implemented as a computer-controlled apparatus, a computer-implemented method, a computing device, or as an article of manufacture such as a computer readable medium. These and various other features will be apparent from a reading of the following Detailed Description and a review of the associated drawings.
This Summary is provided to introduce a brief description of some aspects of the disclosed technologies in a simplified form that are further described below in the Detailed Description. This Summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended that this Summary be used to limit the scope of the claimed subject matter. Furthermore, the claimed subject matter is not limited to implementations that solve any or all disadvantages noted in any part of this disclosure.
The following detailed description is directed to technologies for generating an SDC based on a source code construct that defines a function. As discussed briefly above, implementations of the technologies disclosed herein enable an SDC to be generated that includes a single instance of hardware for implementing a software-defined function that is called from multiple locations in program source code. This results in more efficient utilization of available hardware, such as when the SDC is implemented in an FPGA, for instance, as compared to C to HDL tools. Other technical benefits not specifically mentioned herein can also be realized through implementations of the disclosed subject matter.
While the subject matter described herein is presented in the general context of a computing system executing a compiler configured for compiling source code language constructs that map to circuit implementations, those skilled in the art will recognize that other implementations can be performed in combination with other types of computing systems and modules. Those skilled in the art will also appreciate that the subject matter described herein can be practiced with other computer system configurations, including hand-held devices, multiprocessor systems, microprocessor-based or programmable consumer electronics, computing or processing systems embedded in devices (such as wearable computing devices, automobiles, home automation etc.), minicomputers, mainframe computers, and the like.
In the following detailed description, references are made to the accompanying drawings that form a part hereof, and in which are shown, by way of illustration, specific configurations or examples. Referring now to the drawings, in which like numerals represent like elements throughout the several FIGS., aspects of various technologies for generating an SDC from a source code construct that defines a function will be described.
As illustrated in
The program source code 102 is expressed using a multi-threaded imperative programming language designed to target SDCs 112. The disclosed language provides many of the features of languages such as ‘C’ and ‘JAVA’, including function calls, for-loops, arithmetic operators, and conditional statements. However, the disclosed language includes constructs that map directly to an underlying SDC 112 hardware implementation. This enables both hardware and software engineers to reason about performance and to be effective in optimizing their designs. This can also make the language familiar to software engineers and free hardware engineers from dealing with whole classes of bugs that arise when coding in an HDL.
The disclosed multi-threaded imperative programming language is imperative in that program statements are executed one after another, and multi-threaded in that multiple threads of execution can execute in parallel. As discussed above, a thread is a collection of variables that flow through a hardware circuit and that are operated on as the circuit processes them.
The threads described herein are analogous to, yet different from, software threads. While a software thread maintains a call stack containing variables and executes code in memory, the threads described herein are collections of variables that move through hardware circuits. While a software thread has a location in executable code determined by an instruction pointer, the disclosed thread has a physical location on the SDC at a given point in time. SDCs may execute hundreds, thousands, or even millions of threads, and SDC execution may be pipelined—i.e. different threads may execute within different stages of a circuit at the same time.
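This thread model can be illustrated with a small sketch. This is a hypothetical Python simulation, not part of the disclosed language: each thread is a dictionary of variable values that advances one pipeline stage per simulated clock cycle, so multiple threads occupy different stages at the same time and exit in the order they entered.

```python
from collections import deque

def run_pipeline(stages, threads):
    """Simulate threads-as-variable-collections flowing through a pipeline.

    stages: list of functions, each mapping a dict of variables to a new dict.
    threads: dicts of variables entering the pipeline, one per clock cycle.
    """
    slots = [None] * len(stages)          # one thread (or None) per stage
    results = []
    pending = deque(threads)
    while pending or any(s is not None for s in slots):
        if slots[-1] is not None:         # a thread exits the last stage
            results.append(slots[-1])
        for i in range(len(stages) - 1, 0, -1):   # advance every stage
            slots[i] = stages[i](slots[i - 1]) if slots[i - 1] is not None else None
        slots[0] = stages[0](pending.popleft()) if pending else None
    return results                        # threads exit in entry order

# Two threads in flight at once, exiting in the order they entered:
stages = [lambda t: {"x": t["x"] + 1},    # stage 1: add one
          lambda t: {"x": t["x"] * 2}]    # stage 2: double
run_pipeline(stages, [{"x": 1}, {"x": 2}])   # → [{"x": 4}, {"x": 6}]
```

Note that while the second thread is in the first stage, the first thread is already in the second stage, mirroring the pipelined execution described above.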
As will be described in greater detail below, language constructs can be defined in the program source code 102 that map to a circuit implementation. A language construct is a syntactically allowable part of a program that may be formed from one or more lexical tokens. The language constructs described herein map to circuit implementations that guarantee thread ordering (i.e. that threads will exit a circuit implementation in the same order that they entered).
As will also be described in greater detail below, the circuit implementations generated by the constructs disclosed herein can be implemented as an SDC in an FPGA, a Gate Array, an ASIC, or another type of suitable device. Another hardware component, such as a NIC, can be configured with the FPGA, Gate Array, or ASIC, in order to implement desired functionality.
As shown in
The pipelines 200A-200C can be connected by first-in-first-out (“FIFO”) queues (which might be referred to herein as “FIFOs” or “queues”). The pipelines 200A-200C implement the functionality defined by the program source code 102. The FIFOs 202 store data values, providing input to pipelines 200 as well as storing output generated by pipelines 200. For example, the SDC 112 includes a pipeline 200A that feeds its output to the FIFO 202A. Pipeline 200B, in turn, obtains its input from the FIFO 202A and provides its output to the FIFO 202B. The pipeline 200C obtains its input from the FIFO 202B.
In some configurations, the pipelines 200 implement circuitry that determines when to retrieve the next value(s) from a FIFO 202. For example, a policy may require that an input FIFO (e.g. the FIFO 202A in the case of the pipeline 200B) is not empty and an output FIFO (e.g. the FIFO 202B) is not full before retrieving a value from the input FIFO (e.g. the FIFO 202A) for processing.
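Such a retrieval policy can be sketched as follows. This is a hypothetical Python model, with `deque`s standing in for the FIFOs 202 and `capacity` an assumed parameter:

```python
from collections import deque

def step(pipeline_fn, input_fifo, output_fifo, capacity):
    """One simulated clock cycle of a pipeline guarded by the policy above:
    only dequeue when the input FIFO is non-empty and the output FIFO is
    not full, so a full downstream queue stalls the pipeline rather than
    dropping a value."""
    if input_fifo and len(output_fifo) < capacity:
        output_fifo.append(pipeline_fn(input_fifo.popleft()))
        return True    # pipeline advanced this cycle
    return False       # pipeline stalled this cycle

in_q, out_q = deque([1, 2, 3]), deque()
step(lambda v: v + 1, in_q, out_q, capacity=1)   # advances: out_q now holds 2
step(lambda v: v + 1, in_q, out_q, capacity=1)   # stalls: out_q is full
```

The second call stalls because the downstream FIFO is at capacity; once a consumer drains it, the pipeline resumes.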
As shown in
Each pipeline stage 206 can include one or more computational units 208, such as adder 208A and lookup table (“LUT”) 208B. In the illustrated example, adder 208A can perform basic arithmetic, e.g. addition, subtraction, or multiplication. Computational units can also implement Boolean operators (e.g. “OR”, “NOR”, “XOR”, etc.) or other custom logic provided by the SDC manufacturer.
Computational units can also be implemented by user-programmable lookup tables 208B. The illustrated LUT 208B depicts a two-input truth table that maps two input bits to a single output bit. LUTs 208B can be configured to support different numbers of input bits. To generate more complex output values, e.g. characters or 8-bit integers, multiple LUTs 208B, each connected to a different bit of an input variable, may be used.
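The two-input LUT described above can be sketched as a four-entry table. This is a hypothetical Python illustration; programming the LUT as XOR, and widening to 8 bits with one LUT per bit position, are illustrative assumptions:

```python
def make_lut2(table):
    """A 2-input, 1-output LUT: the 4-entry table is the 'programming',
    and indexing it evaluates the truth table."""
    return lambda a, b: table[(a << 1) | b]

# Program the LUT as XOR: outputs for inputs (0,0), (0,1), (1,0), (1,1).
xor_lut = make_lut2([0, 1, 1, 0])

def xor8(x, y):
    """Wider values use one LUT per bit position, as with the 8-bit
    integers mentioned above."""
    return sum(xor_lut((x >> i) & 1, (y >> i) & 1) << i for i in range(8))

xor8(0xAA, 0x55)   # → 0xFF
```

Reprogramming the table (e.g. `[0, 0, 0, 1]` for AND) changes the implemented logic without changing the surrounding wiring, which is what makes LUTs user-programmable.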
Computational units can temporarily store results in registers 204 (or “flip-flops”). The contents of such a register can be provided to other computation units in the same or different pipeline 200. Registers 204 can capture a value at an input when a connected digital clock transitions from 0 to 1, and provide that value at an output until the end of the next clock cycle (i.e. until the clock transitions from 0 to 1 again). Registers can also include an enable line. If an enable line is set to false, then the register will not perform the operations described above, maintaining the current output value over multiple clock cycles.
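The register behavior just described, rising-edge capture plus an enable line, can be modeled as follows (a hypothetical Python sketch, not any vendor's hardware primitive):

```python
class Register:
    """Model of a rising-edge register ('flip-flop') with an enable line."""

    def __init__(self, initial=0):
        self.output = initial
        self._prev_clock = 0

    def tick(self, clock, data, enable=True):
        # Capture the input only on a 0-to-1 clock transition, and only
        # when the enable line is set; otherwise hold the current output.
        if self._prev_clock == 0 and clock == 1 and enable:
            self.output = data
        self._prev_clock = clock
        return self.output

r = Register()
r.tick(1, 5)                 # rising edge: output becomes 5
r.tick(0, 7)                 # falling edge: output holds 5
r.tick(1, 7, enable=False)   # rising edge but disabled: output holds 5
```

With the enable line false, the register holds its value across clock cycles, matching the behavior described above.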
It is to be appreciated that the pipeline architecture shown in
The construct 302 identifies the function call 306 and one or more input parameters for the function (referred to herein as “function parameters”). For example, in the source code sample shown in Table A, a function call 306 has been defined as “Z=G(X)”, where X is the input parameter for the function “G.”
As shown in
The construct 302 maps to a circuit implementation that includes a first pipeline 200D, a second pipeline 200E, and a third pipeline 200F. The first hardware pipeline 200D can implement the statements 304A located before the function call 306 in the program source code 102. The first hardware pipeline 200D outputs variables to a variable queue 202C (which might be referred to herein as the “first queue”) and outputs parameters for the function to a function parameters queue 202D (which might be referred to herein as the “second queue”).
Variables stored in a variable queue 202 are variables that have a value prior to a function call 306 and that are used after the function call 306. In the sample source code shown in Table A, the variable “Y” will be stored in a variable queue since it has a value prior to the function call 306 and is used after the function call in the statement “Return Y*Z.” The variable “X” is not stored in a variable queue since it is not utilized after the function call 306.
The second hardware pipeline 200E in the circuit implementation corresponding to the function call construct 302 obtains function parameters from the function parameters queue 202D. The second hardware pipeline 200E also includes hardware for implementing the function itself. For example, the second hardware pipeline 200E might implement the function by performing operations on the function parameters and/or other values. In the sample source code shown in Table A, the function is “G( )”, which takes one parameter, “X.” The second hardware pipeline 200E stores results generated by performance of the function in a return queue 202E (which might be referred to herein as the “third queue”).
The third hardware pipeline 200F in the circuit implementation for the function call construct 302 implements the statements 304B located after the function call 306 in the program source code 102. The third hardware pipeline 200F can retrieve the results generated by the second pipeline 200E from the return queue 202E. The third hardware pipeline 200F can also retrieve the variables stored in the variable queue 202C by the first hardware pipeline 200D. The third hardware pipeline 200F can perform hardware operations specified by the statements 304B using the variables and the results of the function. In the sample source code shown in Table A, the third pipeline 200F implements “Return Y*Z.” As discussed above, Y is pushed onto the variable queue 202C by the pipeline 200D and Z is the result of the function “G(X)” pushed onto the return queue 202E by the second pipeline 200E.
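The queue wiring for the Table A sample (“Z=G(X)” followed by “Return Y*Z”) can be sketched as follows. This is a hypothetical Python simulation: the three functions stand in for the pipelines 200D-200F, plain deques stand in for the queues 202C-202E, and the body of G() (squaring) is an assumption, since Table A leaves it unspecified.

```python
from collections import deque

variable_q = deque()   # stands in for the variable queue 202C ("first queue")
param_q = deque()      # stands in for the function parameters queue 202D
return_q = deque()     # stands in for the return queue 202E ("third queue")

def g(x):
    return x * x       # assumed body for G(); Table A does not define it

def pipeline_before(x, y):     # statements 304A located before the call
    variable_q.append(y)       # Y has a value before the call and is used after
    param_q.append(x)          # X is only needed by G()

def pipeline_function():       # the single hardware instance of G()
    return_q.append(g(param_q.popleft()))

def pipeline_after():          # statements 304B after the call: Return Y*Z
    z = return_q.popleft()
    y = variable_q.popleft()
    return y * z

pipeline_before(3, 4)
pipeline_function()
result = pipeline_after()      # 4 * G(3) = 36
```

Note that X never enters the variable queue: as discussed above, only values that are live across the call (here, Y) are saved there.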
In the example shown in
As shown in
In order to implement the function call 306B, the SDC 112 includes a fourth hardware pipeline 200G. The fourth hardware pipeline 200G can implement the statements 304C located before the second function call 306B in the program source code 102.
The fourth hardware pipeline 200G outputs second variables to a second variable queue 202F (which might be referred to herein as the “fourth queue”) and outputs second function parameters to a second function parameters queue 202G (which might be referred to herein as the “fifth queue.”) In these configurations, the second hardware pipeline 200E (i.e. the pipeline implementing the function) can obtain the second function parameters from the second function parameters queue 202G and perform the specified function using the second function parameters. The second hardware pipeline 200E can then store the results of the function in a second results queue 202H (which might be referred to herein as the “sixth queue”).
As shown in
In the configuration shown in
In the example shown in
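The hidden-parameter routing can be sketched similarly (a hypothetical Python model): each request carries a call-site tag alongside its argument, and the single shared function pipeline uses that tag to select the return queue, so only one hardware instance of the function is needed for both call sites.

```python
from collections import deque

param_q = deque()                        # requests: (call_site_tag, argument)
return_qs = {1: deque(), 2: deque()}     # one return queue per call site
                                         # (e.g. the "third" and "sixth" queues)

def g(x):
    return x + 10                        # assumed body for the shared function

def shared_function_pipeline():
    call_site, arg = param_q.popleft()
    return_qs[call_site].append(g(arg))  # hidden parameter selects the queue

param_q.append((1, 5))    # request from the first call site
param_q.append((2, 7))    # request from the second call site
shared_function_pipeline()
shared_function_pipeline()
# return_qs[1] now holds 15; return_qs[2] holds 17
```

Each downstream pipeline drains only its own return queue, so results cannot be delivered to the wrong call site even when requests interleave.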
The particular implementation of the technologies disclosed herein is a matter of choice dependent on the performance and other requirements of the computing device. Accordingly, the logical operations described herein are referred to variously as states, operations, structural devices, acts, or modules. These states, operations, structural devices, acts and modules can be implemented in hardware, software, firmware, in special-purpose digital logic, and any combination thereof. It should be appreciated that more or fewer operations can be performed than shown in the FIGS. and described herein. These operations can also be performed in a different order than those described herein.
The routine 400 begins at operation 402, where program source code 102 is defined and stored that includes a language construct 302 that defines a function call. From operation 402, the routine 400 proceeds to operation 404, where the compiler 104 compiles the program source code 102 to a circuit description for an SDC 112 for implementing the function call. As discussed above, the circuit description might be expressed as HDL code 106.
From operation 404, the routine 400 proceeds to operation 406, where the circuit description (e.g. HDL code) is utilized to generate an SDC 112 that includes the circuit implementation defined by the circuit description. The routine 400 then proceeds from operation 406 to operation 408, where it ends.
The computer 500 illustrated in
The mass storage device 512 is connected to the CPU 502 through a mass storage controller (not shown) connected to the bus 510. The mass storage device 512 and its associated computer readable media provide non-volatile storage for the computer 500. Although the description of computer readable media contained herein refers to a mass storage device, such as a hard disk, CD-ROM drive, DVD-ROM drive, or USB storage key, it should be appreciated by those skilled in the art that computer readable media can be any available computer storage media or communication media that can be accessed by the computer 500.
Communication media includes computer readable instructions, data structures, program modules, or other data in a modulated data signal such as a carrier wave or other transport mechanism and includes any delivery media. The term “modulated data signal” means a signal that has one or more of its characteristics changed or set in a manner so as to encode information in the signal. By way of example, and not limitation, communication media includes wired media such as a wired network or direct-wired connection, and wireless media such as acoustic, radio frequency, infrared and other wireless media. Combinations of any of the above should also be included within the scope of computer readable media.
By way of example, and not limitation, computer storage media can include volatile and non-volatile, removable and non-removable media implemented in any method or technology for storage of information such as computer readable instructions, data structures, program modules or other data. For example, computer storage media includes, but is not limited to, RAM, ROM, EPROM, EEPROM, flash memory or other solid-state memory technology, CD-ROM, digital versatile disks (“DVD”), HD-DVD, BLU-RAY, or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to store the desired information and which can be accessed by the computer 500. For purposes of the claims, the phrase “computer storage medium,” and variations thereof, does not include waves or signals per se or communication media.
According to various configurations, the computer 500 can operate in a networked environment using logical connections to remote computers through a network such as the network 520. The computer 500 can connect to the network 520 through a network interface unit 516 connected to the bus 510. It should be appreciated that the network interface unit 516 can also be utilized to connect to other types of networks and remote computer systems. The computer 500 can also include an input/output controller 518 for receiving and processing input from a number of other devices, including a keyboard, mouse, touch input, an electronic stylus (not shown in
It should be appreciated that the software components described herein, when loaded into the CPU 502 and executed, can transform the CPU 502 and the overall computer 500 from a general-purpose computing device into a special-purpose computing device customized to facilitate the functionality presented herein. The CPU 502 can be constructed from any number of transistors or other discrete circuit elements, which can individually or collectively assume any number of states. More specifically, the CPU 502 can operate as a finite-state machine, in response to executable instructions contained within the software modules disclosed herein. These computer-executable instructions can transform the CPU 502 by specifying how the CPU 502 transitions between states, thereby transforming the transistors or other discrete hardware elements constituting the CPU 502.
Encoding the software modules presented herein can also transform the physical structure of the computer readable media presented herein. The specific transformation of physical structure depends on various factors, in different implementations of this description. Examples of such factors include, but are not limited to, the technology used to implement the computer readable media, whether the computer readable media is characterized as primary or secondary storage, and the like. For example, if the computer readable media is implemented as semiconductor-based memory, the software disclosed herein can be encoded on the computer readable media by transforming the physical state of the semiconductor memory. For instance, the software can transform the state of transistors, capacitors, or other discrete circuit elements constituting the semiconductor memory. The software can also transform the physical state of such components in order to store data thereupon.
As another example, the computer readable media disclosed herein can be implemented using magnetic or optical technology. In such implementations, the software presented herein can transform the physical state of magnetic or optical media, when the software is encoded therein. These transformations can include altering the magnetic characteristics of particular locations within given magnetic media. These transformations can also include altering the physical features or characteristics of particular locations within given optical media, to change the optical characteristics of those locations. Other transformations of physical media are possible without departing from the scope and spirit of the present description, with the foregoing examples provided only to facilitate this discussion.
In light of the above, it should be appreciated that many types of physical transformations take place in the computer 500 in order to store and execute the software components presented herein. It also should be appreciated that the architecture shown in
In a network environment in which the communications network 520 is the Internet, for example, the server computer 600A can be a dedicated server computer operable to process and communicate data to and from the client computing devices 600B-600G via any of a number of known protocols, such as hypertext transfer protocol (“HTTP”), file transfer protocol (“FTP”), or simple object access protocol (“SOAP”). Additionally, the networked computing environment 600 can utilize various data security protocols such as secure sockets layer (“SSL”) or pretty good privacy (“PGP”). Each of the client computing devices 600B-600G can be equipped with an operating system operable to support one or more computing applications or terminal sessions such as a web browser (not shown in
The server computer 600A can be communicatively coupled to other computing environments (not shown in
The data and/or computing applications may be stored on the server 600A, or servers 600A, and communicated to cooperating users through the client computing devices 600B-600G over an exemplary communications network 520. A participating user (not shown in
The server computer 600A can host computing applications, processes and applets for the generation, authentication, encryption, and communication of data and applications, and may cooperate with other server computing environments (not shown in
It should be appreciated that the illustrative computing architecture shown in
The disclosure presented herein also encompasses the subject matter set forth in the following clauses:
Clause 1. A computer-implemented method, comprising: storing program source code in a multi-threaded imperative programming language, the program source code comprising a construct defining a function call; compiling the construct to a circuit description describing a circuit implementation, the circuit implementation comprising a first hardware pipeline configured to output one or more variables to a first queue and to output one or more function parameters to a second queue, a second hardware pipeline configured to receive the function parameters from the second queue, to perform one or more operations using the function parameters, and to store results generated by the one or more operations in a third queue, and a third hardware pipeline configured to obtain the variables from the first queue and to retrieve the results from the third queue; and generating, based on the circuit description, a synchronous digital circuit comprising the circuit implementation.
Clause 2. The computer-implemented method of clause 1, wherein the first hardware pipeline implements statements in the program source code located before the function call.
Clause 3. The computer-implemented method of any of clauses 1-2, wherein the third hardware pipeline implements statements in the program source code located after the function call.
Clause 4. The computer-implemented method of any of clauses 1-3, wherein the circuit implementation further comprises: a fourth hardware pipeline configured to output one or more second variables to a fourth queue and to output one or more second function parameters to a fifth queue, wherein the second hardware pipeline is further configured to receive the second function parameters from the fifth queue, to perform the one or more operations using the second function parameters, and to store results generated by the one or more operations in a sixth queue; and a fifth hardware pipeline configured to receive the second variables from the fourth queue and to receive the results from the sixth queue.
Clause 5. The computer-implemented method of any of clauses 1-4, wherein the second hardware pipeline is further configured to receive a hidden parameter and to store the results in the third queue or the sixth queue based on the hidden parameter.
Clause 6. The computer-implemented method of any of clauses 1-5, wherein the construct identifies the function call and the one or more function parameters.
Clause 7. The computer-implemented method of any of clauses 1-6, wherein the synchronous digital circuit is implemented in a field-programmable gate array (FPGA), a gate array, or an application-specific integrated circuit (ASIC).
Clause 8. The computer-implemented method of any of clauses 1-7, wherein a network interface card (NIC) is configured with the FPGA, gate array, or ASIC.
Clause 9. A synchronous digital circuit generated from program source code in a multi-threaded imperative programming language, the program source code comprising a construct defining a function call, the synchronous digital circuit comprising: a first hardware pipeline configured to output one or more variables to a first queue and to output one or more function parameters to a second queue; a second hardware pipeline configured to receive the function parameters from the second queue, to perform one or more operations using the function parameters, and to store results generated by the one or more operations in a third queue; and a third hardware pipeline configured to obtain the variables from the first queue and to retrieve the results from the third queue.
Clause 10. The synchronous digital circuit of clause 9, wherein the first hardware pipeline implements statements in the program source code located before the function call.
Clause 11. The synchronous digital circuit of any of clauses 9-10, wherein the third hardware pipeline implements statements in the program source code located after the function call.
Clause 12. The synchronous digital circuit of any of clauses 9-11, further comprising: a fourth hardware pipeline configured to output one or more second variables to a fourth queue and to output one or more second function parameters to a fifth queue, wherein the second hardware pipeline is further configured to receive the second function parameters from the fifth queue, to perform the one or more operations using the second function parameters, and to store results generated by the one or more operations in a sixth queue; and a fifth hardware pipeline configured to receive the second variables from the fourth queue and to receive the results from the sixth queue.
Clause 13. The synchronous digital circuit of any of clauses 9-12, wherein the second hardware pipeline is further configured to receive a hidden parameter and to store the results in the third queue or the sixth queue based on the hidden parameter.
Clause 14. The synchronous digital circuit of any of clauses 9-13, wherein the synchronous digital circuit is implemented in a field-programmable gate array (FPGA), a gate array, or an application-specific integrated circuit (ASIC).
Clause 15. The synchronous digital circuit of any of clauses 9-14, wherein a network interface card (NIC) is configured with the FPGA, gate array, or ASIC.
Clause 16. A computer, comprising: a central processing unit (CPU); and at least one computer storage medium storing program source code in a multi-threaded imperative programming language, the program source code comprising a construct defining a function call, and instructions, which when executed by the CPU, will cause the CPU to compile the program source code to a circuit description describing a circuit implementation, the circuit implementation comprising a first hardware pipeline configured to output one or more variables to a first queue and to output one or more function parameters to a second queue, a second hardware pipeline configured to receive the function parameters from the second queue, to perform one or more operations using the function parameters, and to store results generated by the one or more operations in a third queue, and a third hardware pipeline configured to obtain the variables from the first queue and to retrieve the results from the third queue.
Clause 17. The computer of clause 16, wherein the at least one computer storage medium stores further instructions for generating a synchronous digital circuit from the circuit description.
Clause 18. The computer of any of clauses 16-17, wherein the circuit implementation further comprises: a fourth hardware pipeline configured to output one or more second variables to a fourth queue and to output one or more second function parameters to a fifth queue, wherein the second hardware pipeline is further configured to receive the second function parameters from the fifth queue, to perform the one or more operations using the second function parameters, and to store results generated by the one or more operations in a sixth queue; and a fifth hardware pipeline configured to receive the second variables from the fourth queue and to receive the results from the sixth queue.
Clause 19. The computer of any of clauses 16-18, wherein the second hardware pipeline is further configured to receive a hidden parameter and to store the results in the third queue or the sixth queue based on the hidden parameter.
Clause 20. The computer of any of clauses 16-19, wherein the first hardware pipeline implements statements in the program source code located before the function call and wherein the third hardware pipeline implements statements in the program source code located after the function call.
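To make the claimed architecture concrete, the following is a minimal software sketch (not part of the disclosure) of the pipeline-and-queue decomposition described in Clauses 9, 12, and 13: two call sites share a single instance of the called function, each call site's live variables and parameters travel through separate FIFO queues, and a hidden caller parameter routes each result back to the correct return queue. All function and variable names here are illustrative assumptions.

```python
# Illustrative model only: software queues stand in for the hardware
# FIFOs of the claims, and Python functions stand in for pipelines.
from collections import deque

# Queues named after Clauses 9 and 12.
first_q, second_q, third_q = deque(), deque(), deque()   # call site 1
fourth_q, fifth_q, sixth_q = deque(), deque(), deque()   # call site 2

def shared_function(a, b):
    """The single shared function instance (operation assumed: add)."""
    return a + b

# First pipeline (statements before the first call site): save live
# variables to the first queue, send parameters to the second queue.
def first_pipeline(live_var, a, b):
    first_q.append(live_var)
    second_q.append((a, b, 1))   # trailing 1 = hidden caller parameter

# Fourth pipeline: the second call site invoking the same function.
def fourth_pipeline(live_var, a, b):
    fourth_q.append(live_var)
    fifth_q.append((a, b, 2))

# Second pipeline: the one shared instance; drains both parameter
# queues and routes each result by the hidden parameter (Clause 13).
def second_pipeline():
    for q in (second_q, fifth_q):
        while q:
            a, b, caller = q.popleft()
            result = shared_function(a, b)
            (third_q if caller == 1 else sixth_q).append(result)

# Third/fifth pipelines (statements after each call site): rejoin the
# saved live variables with the function results.
def third_pipeline():
    return first_q.popleft(), third_q.popleft()

def fifth_pipeline():
    return fourth_q.popleft(), sixth_q.popleft()

first_pipeline(live_var="state1", a=2, b=3)
fourth_pipeline(live_var="state2", a=10, b=20)
second_pipeline()
r1 = third_pipeline()   # ("state1", 5)
r2 = fifth_pipeline()   # ("state2", 30)
print(r1, r2)
```

The point of the hidden parameter is visible in `second_pipeline`: because both call sites feed one function instance, something beyond the parameters themselves must identify where each result belongs, which is why the claims route results to the third or sixth queue based on that parameter.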
Based on the foregoing, it should be appreciated that technologies for generating a SDC from a source code construct that defines a function call have been disclosed herein. Although the subject matter presented herein has been described in language specific to computer structural features, methodological and transformative acts, specific computing machinery, and computer readable media, it is to be understood that the subject matter set forth in the appended claims is not necessarily limited to the specific features, acts, or media described herein. Rather, the specific features, acts, and media are disclosed as example forms of implementing the claimed subject matter.
The subject matter described above is provided by way of illustration only and should not be construed as limiting. Various modifications and changes can be made to the subject matter described herein without following the example configurations and applications illustrated and described, and without departing from the scope of the present disclosure, which is set forth in the following claims.
Number | Name | Date | Kind |
---|---|---|---|
5343554 | Koza et al. | Aug 1994 | A |
5416719 | Pribetich | May 1995 | A |
5642304 | Simpson | Jun 1997 | A |
5761483 | Trimberger | Jun 1998 | A |
5909572 | Thayer et al. | Jun 1999 | A |
6061521 | Thayer et al. | May 2000 | A |
6112019 | Chamdani et al. | Aug 2000 | A |
6212601 | Shiell | Apr 2001 | B1 |
6275508 | Aggarwal et al. | Aug 2001 | B1 |
6597664 | Mithal et al. | Jul 2003 | B1 |
7028281 | Agrawal et al. | Apr 2006 | B1 |
7111273 | Ganesan et al. | Sep 2006 | B1 |
7203718 | Fu et al. | Apr 2007 | B1 |
7305582 | Moser et al. | Dec 2007 | B1 |
7315991 | Bennett | Jan 2008 | B1 |
7375550 | Redgrave et al. | May 2008 | B1 |
7386820 | Koelbl et al. | Jun 2008 | B1 |
7415681 | Tomar et al. | Aug 2008 | B2 |
7471104 | Chirania | Dec 2008 | B1 |
7516446 | Choi et al. | Apr 2009 | B2 |
7647567 | Esposito et al. | Jan 2010 | B1 |
7735047 | Anderson et al. | Jun 2010 | B1 |
7735050 | Yu et al. | Jun 2010 | B2 |
7823117 | Bennett | Oct 2010 | B1 |
7844924 | Sasao et al. | Nov 2010 | B2 |
8095508 | Chamberlain et al. | Jan 2012 | B2 |
8209580 | Varnica et al. | Jun 2012 | B1 |
8468510 | Sundararajan et al. | Jun 2013 | B1 |
8599049 | Wang et al. | Dec 2013 | B2 |
8620881 | Chamberlain et al. | Dec 2013 | B2 |
8656347 | Ito | Feb 2014 | B2 |
8671371 | Dimond | Mar 2014 | B1 |
8775986 | Mohan et al. | Jul 2014 | B1 |
8881079 | Pan | Nov 2014 | B1 |
8930926 | Bastoul et al. | Jan 2015 | B2 |
9471307 | Giroux et al. | Oct 2016 | B2 |
9690278 | Chen et al. | Jun 2017 | B1 |
9824756 | Brand et al. | Nov 2017 | B2 |
9846623 | Jennings et al. | Dec 2017 | B2 |
9858373 | Cho et al. | Jan 2018 | B2 |
10162918 | Iyer et al. | Dec 2018 | B1 |
10331836 | Hosangadi et al. | Jun 2019 | B1 |
10419338 | Gray | Sep 2019 | B2 |
10474533 | Jennings et al. | Nov 2019 | B2 |
20020080174 | Kodosky et al. | Jun 2002 | A1 |
20030154466 | Snider | Aug 2003 | A1 |
20050050531 | Lee | Mar 2005 | A1 |
20060075180 | Tian et al. | Apr 2006 | A1 |
20060120189 | Beerel et al. | Jun 2006 | A1 |
20060268939 | Dries et al. | Nov 2006 | A1 |
20070094474 | Wilson et al. | Apr 2007 | A1 |
20070143717 | Koelbl et al. | Jun 2007 | A1 |
20070171101 | Siemers et al. | Jul 2007 | A1 |
20070174804 | Sasao et al. | Jul 2007 | A1 |
20070180334 | Jones et al. | Aug 2007 | A1 |
20070300192 | Curtin et al. | Dec 2007 | A1 |
20080005357 | Malkhi et al. | Jan 2008 | A1 |
20080075278 | Gaubatz et al. | Mar 2008 | A1 |
20080111721 | Reznik | May 2008 | A1 |
20080111722 | Reznik | May 2008 | A1 |
20080229141 | Chang et al. | Sep 2008 | A1 |
20080313579 | Larouche et al. | Dec 2008 | A1 |
20090210412 | Oliver et al. | Aug 2009 | A1 |
20090243732 | Tarng et al. | Oct 2009 | A1 |
20100162049 | Stall et al. | Jun 2010 | A1 |
20110078640 | Bruneel | Mar 2011 | A1 |
20120065956 | Irturk et al. | Mar 2012 | A1 |
20130013301 | Subbaraman et al. | Jan 2013 | A1 |
20130054939 | Felch | Feb 2013 | A1 |
20130081060 | Otenko | Mar 2013 | A1 |
20130100750 | Ishiguro et al. | Apr 2013 | A1 |
20130111425 | Kumar et al. | May 2013 | A1 |
20130111453 | Kalogeropulos et al. | May 2013 | A1 |
20130125097 | Ebcioglu | May 2013 | A1 |
20130139122 | Pell et al. | May 2013 | A1 |
20130212365 | Chen et al. | Aug 2013 | A1 |
20130226594 | Fuchs et al. | Aug 2013 | A1 |
20130298130 | Pienaar et al. | Nov 2013 | A1 |
20130335853 | Li et al. | Dec 2013 | A1 |
20140059524 | Kee et al. | Feb 2014 | A1 |
20140237437 | Mang et al. | Aug 2014 | A1 |
20150052298 | Brand et al. | Feb 2015 | A1 |
20150178418 | Gu | Jun 2015 | A1 |
20150178435 | Kumar | Jun 2015 | A1 |
20150295552 | Abou-Chahine et al. | Oct 2015 | A1 |
20150304068 | Xiong et al. | Oct 2015 | A1 |
20160087651 | Lu | Mar 2016 | A1 |
20160180001 | Adler | Jun 2016 | A1 |
20160246571 | Walters | Aug 2016 | A1 |
20160259023 | Overall et al. | Sep 2016 | A1 |
20160299998 | Isshiki | Oct 2016 | A1 |
20160344629 | Gray | Nov 2016 | A1 |
20170140513 | Su et al. | May 2017 | A1 |
20170185508 | Looney et al. | Jun 2017 | A1 |
20170192921 | Wang et al. | Jul 2017 | A1 |
20170251211 | Froehlich et al. | Aug 2017 | A1 |
20170277656 | John et al. | Sep 2017 | A1 |
20170315815 | Smith | Nov 2017 | A1 |
20170316154 | Fitch et al. | Nov 2017 | A1 |
20180129475 | Almagambetov et al. | May 2018 | A1 |
20180143872 | Sun et al. | May 2018 | A1 |
20180232475 | Derisavi et al. | Aug 2018 | A1 |
20180253368 | Villarreal et al. | Sep 2018 | A1 |
20180255206 | Kim et al. | Sep 2018 | A1 |
20180330022 | Choi et al. | Nov 2018 | A1 |
20180342040 | Nguyen et al. | Nov 2018 | A1 |
20180347498 | Maloney | Dec 2018 | A1 |
20190114548 | Wu et al. | Apr 2019 | A1 |
20190138365 | Purnell et al. | May 2019 | A1 |
20190286973 | Kovvuri et al. | Sep 2019 | A1 |
20190303153 | Halpern et al. | Oct 2019 | A1 |
20200167139 | Drepper | May 2020 | A1 |
20200225920 | Pelton et al. | Jul 2020 | A1 |
20200225921 | Pelton et al. | Jul 2020 | A1 |
20200226051 | Pelton et al. | Jul 2020 | A1 |
20200226227 | Pelton et al. | Jul 2020 | A1 |
20200226228 | Pelton et al. | Jul 2020 | A1 |
20210049163 | Levy et al. | Feb 2021 | A1 |
Number | Date | Country |
---|---|---|
2016094012 | Jun 2016 | WO |
2017084104 | May 2017 | WO |
Entry |
---|
“Final Office Action Issued in U.S. Appl. No. 16/247,250”, dated Apr. 13, 2020, 22 Pages. |
“International Search Report and Written Opinion Issued in PCT Application No. PCT/US2019/069030”, dated Mar. 27, 2020, 11 Pages. |
“Operand Forwarding”, Retrieved From: https://en.wikipedia.org/w/index.php?title=Operand_forwarding&oldid=868126536, Retrieved on: Nov. 10, 2018, 2 Pages. |
“Register File”, Retrieved From: https://en.wikipedia.org/w/index.php?title=Register_file&oldid=923242007, Retrieved on: Oct. 27, 2019, 8 Pages. |
“Re-order Buffer”, Retrieved From: https://en.wikipedia.org/w/index.php?title=Re-order_buffer&oldid=928835149 Retrieved on: Dec. 1, 2019, 1 Page. |
“Non-Final Office Action Issued in U.S. Appl. No. 16/247,250”, dated Dec. 27, 2019, 14 Pages. |
“Ex Parte Quayle Action Issued in U.S. Appl. No. 16/247,261”, dated Feb. 4, 2020, 7 Pages. |
“Non Final Office Action Issued in U.S. Appl. No. 16/247,269”, dated Feb. 20, 2020, 14 Pages. |
Cong, et al., “Combinational Logic Synthesis for LUT Based Field Programmable Gate Arrays”, In Proceedings of the ACM Transactions on Design Automation of Electronic Systems, Apr. 1996, pp. 145-204. |
Tan, et al., “Mapping-Aware Constrained Scheduling for LUT-Based FPGAs”, In Proceedings of the ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, Feb. 22, 2015, 10 Pages. |
Ditmar, et al., “Function Call Optimisation in SystemC Hardware Compilation”, In Proceedings of 4th Southern Conference on Programmable Logic, Mar. 26, 2008, pp. 93-98. |
“International Search Report and Written Opinion Issued in PCT Application No. PCT/US2019/069028”, dated Apr. 24, 2020, 13 Pages. |
“International Search Report and Written Opinion Issued in PCT Application No. PCT/US2019/069028”, dated May 4, 2020, 13 Pages. |
“International Search Report and Written Opinion Issued in PCT Application No. PCT/US2019/069032”, dated Apr. 24, 2020, 13 Pages. |
“International Search Report and Written Opinion Issued in PCT Application No. PCT/US2020/012278”, dated Apr. 6, 2020, 15 Pages. |
“Final Office Action Issued in U.S. Appl. No. 16/247,269”, dated Oct. 16, 2020, 15 Pages. |
“Non Final Office Action Issued in U.S. Appl. No. 16/247,226”, dated Sep. 4, 2020, 33 Pages. |
“Non-Final Office Action Issued in U.S. Appl. No. 16/247,203”, dated Nov. 9, 2020, 8 Pages. |
“Non Final Office Action Issued in U.S. Appl. No. 16/247,250”, dated Nov. 12, 2020, 18 Pages. |
Arar, Steve, “Concurrent Conditional and Selected Signal Assignment in VHDL”, https://www.allaboutcircuits.com/technical-articles/concurrent-conditional-and-selected-signal-assignment-in-vhdl/, Jan. 3, 2018, 8 Pages. |
Galloway, et al., “The Transmogrifier C hardware description language and compiler for FPGAs”, In Proceedings IEEE Symposium on FPGAs for Custom Computing Machines, Apr. 19, 1995, pp. 136-144. |
“International Search Report and Written Opinion Issued in PCT Application No. PCT/US2019/069034”, dated Jun. 23, 2020, 19 Pages. |
“Notice of Allowance Issued in U.S. Appl. No. 16/247,250”, dated Apr. 14, 2021, 25 Pages. |
Hatami, et al., “High Performance Architecture for Flow-Table Lookup in SDN on FPGA”, In Repository of arXiv:1801.00840, Jan. 2, 2018, 15 Pages. |
“Final Office Action Issued in U.S. Appl. No. 16/247,203”, dated Apr. 1, 2021, 6 Pages. |
“Final Office Action Issued in U.S. Appl. No. 16/247,226”, dated Mar. 18, 2021, 41 Pages. |
“Notice of Allowance Issued in U.S. Appl. No. 16/247,226”, dated Jun. 25, 2021, 14 Pages. |
Steinberg, et al., “Automatic High-level Programs Mapping onto Programmable Architectures”, In Proceedings of International Conference on Parallel Computing Technologies, Jul. 25, 2015, pp. 474-485. |
“Notice of Allowance Issued in U.S. Appl. No. 16/247,203”, dated Jun. 3, 2021, 8 Pages. |
Number | Date | Country | |
---|---|---|---|
20200225919 A1 | Jul 2020 | US |