Accelerated mathematical engine

Information

  • Patent Grant
  • Patent Number
    11,403,069
  • Date Filed
    Friday, May 29, 2020
  • Date Issued
    Tuesday, August 2, 2022
Abstract
Various embodiments of the disclosure relate to an accelerated mathematical engine. In certain embodiments, the accelerated mathematical engine is applied to image processing such that convolution of an image is accelerated by using a two-dimensional matrix processor comprising sub-circuits that include an ALU, an output register, and a shadow register. The result is a clocked, two-dimensional architecture in which image data and weights are multiplied in a synchronized manner, allowing a large number of mathematical operations to be performed in parallel.
Description
BACKGROUND
A. Technical Field

The present disclosure relates to an accelerated mathematical engine for operating on large amounts of data, and more particularly, to an accelerated mathematical engine for performing complex convolution operations based on matrix multiply operations.


B. Description of the Related Art

One skilled in the art will recognize the ever-increasing demands for speed and performance placed on general processors and systems that are used to implement time-sensitive and complex mathematical operations. As these general systems are used to process large amounts of data and perform complex mathematical operations, the computational resources and the rate of calculations are limited by the capabilities of existing general hardware designs that perform those calculations. For example, general-purpose computing devices and processors that execute matrix operations may be unable to perform these operations in a timely manner under certain circumstances. Many conventional multipliers that perform digital signal processing operations rely on a series of software and hardware matrix manipulation steps (address generation, transpositions, bit-by-bit addition and shifting, etc.) and may represent a bottleneck within a time-sensitive system. Oftentimes, these manipulation steps require the use of a processor's arithmetic functions to generate intermediate results, at the expense of computing time spent storing and fetching those intermediate results from various locations to complete an operation.



FIG. 1 shows an example of a conventional multiplier system. Multiplier system 100 is a scalar machine that comprises computation unit 102, registers 104, cache 106, and memory 108. In operation, computation unit 102 uses registers 104 and cache 106 to retrieve data stored in memory 108. Typically, computation unit 102 is a microprocessor, such as a CPU or GPU, capable of performing various computational procedures including matrix multiplication on input matrices to obtain a resultant matrix, e.g., by converting multiplications into additions and outputting the result into some internal register.


For example, a dot product that represents an output pixel of an image is typically generated by dot-multiplying individual matrix elements from two matrices to obtain partial results, which are then added to obtain the final dot product. A multiplication of individual matrix elements, i.e., a scalar multiplication, is typically performed on individual data elements by breaking up the dot multiplication into a series of individual sub-operations. As a result, partial products have to be stored and fetched from one or more of registers 104, cache 106, and memory 108 to complete a single arithmetic operation.
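
To make the bottleneck concrete, the following sketch (hypothetical, in Python; the function name and structure are illustrative only) mimics how a scalar machine forms one dot product: every partial product makes a round trip through a register file before the final sum is available.

    def scalar_dot_product(row, col):
        # Stand-ins for registers 104 / cache 106: each partial product is
        # stored, then fetched again before the final accumulation.
        stored_partials = []
        for a, b in zip(row, col):
            stored_partials.append(a * b)   # store an intermediate result
        result = 0
        for partial in stored_partials:     # fetch it back to finish the sum
            result += partial
        return result

    assert scalar_dot_product([1, 2, 3], [4, 5, 6]) == 32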


Computationally demanding applications, such as convolutions, oftentimes require that a software function be embedded in computation unit 102 and used to convert convolution operations into alternate matrix-multiply operations. This is accomplished by rearranging and reformatting data into two matrices that can then be matrix-multiplied directly. However, there exists no mechanism in scalar machine 100 to efficiently share or reuse data, such that the data necessary to execute each scalar operation has to be re-stored and re-fetched from registers many times. The complexity and managerial overhead of these operations becomes significantly greater as the amount of image data subject to convolution operations increases.
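
The rearrangement described here is commonly called im2col. A minimal Python/NumPy sketch of the idea (stride 1, no padding; names are illustrative, not the patent's method) shows both the reformatting and the data duplication it causes: overlapping patches copy the same input elements into many columns, which a scalar machine must then repeatedly re-store and re-fetch.

    import numpy as np

    def im2col(image, k):
        # Copy every k x k patch of a 2-D image into its own column.
        h, w = image.shape
        cols = [image[i:i + k, j:j + k].ravel()
                for i in range(h - k + 1) for j in range(w - k + 1)]
        return np.stack(cols, axis=1)        # shape: (k*k, number of output pixels)

    image = np.arange(16, dtype=float).reshape(4, 4)
    kernel = np.ones((3, 3))
    out = kernel.ravel() @ im2col(image, 3)  # convolution as one matrix multiply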


The inability to reuse much of the data in scalar machine 100, coupled with the added and inefficient steps of storing and fetching intermediate results from registers 104, cache 106, and memory 108 to complete an arithmetic operation, are only some of the shortcomings of existing systems, such as multiplier system 100.


Accordingly, what is needed are high-computational-throughput systems and methods that can perform matrix mathematical operations quickly and efficiently.





BRIEF DESCRIPTION OF THE DRAWINGS

References will be made to embodiments of the invention, examples of which may be illustrated in the accompanying figures. These figures are intended to be illustrative, not limiting. Although the invention is generally described in the context of these embodiments, it should be understood that it is not intended to limit the scope of the invention to these particular embodiments. Items in the figures may not be to scale.



FIG. 1 shows an example of a conventional multiplier system.



FIG. 2 illustrates an exemplary matrix processor architecture for performing arithmetic operations according to various embodiments of the present disclosure.



FIG. 3 illustrates details of an exemplary configuration of the matrix processor architecture shown in FIG. 2.



FIG. 4 illustrates an exemplary multiply-and-add circuit implementation of the logic circuit shown in FIG. 3.



FIG. 5 illustrates an exemplary convolution operation according to various embodiments of the present disclosure.



FIG. 6 through FIG. 8 illustrate details of an exemplary convolution operation according to various embodiments of the present disclosure.



FIG. 9 illustrates an exemplary deconvolution operation according to various embodiments of the present disclosure.



FIG. 10 illustrates a process for performing arithmetic operations to make convolutional neural networks faster, according to various embodiments of the present disclosure.





DETAILED DESCRIPTION OF EMBODIMENTS

In the following description, for purposes of explanation, specific details are set forth in order to provide an understanding of the invention. It will be apparent, however, to one skilled in the art that the invention can be practiced without these details. Furthermore, one skilled in the art will recognize that embodiments of the present invention, described below, may be implemented in a variety of ways, such as a process, an apparatus, a system, a device, or a method on a tangible computer-readable medium.


Components, or modules, shown in diagrams are illustrative of exemplary embodiments of the invention and are meant to avoid obscuring the invention. It shall also be understood throughout this discussion that components may be described as separate functional units, which may comprise sub-units, but those skilled in the art will recognize that various components, or portions thereof, may be divided into separate components or may be integrated together, including integrated within a single system or component. It should be noted that functions or operations discussed herein may be implemented as components. Components may be implemented in software, hardware, or a combination thereof. Many components are formed through the interconnection of many subcomponents. Subcomponents may be selected that are logically different in operation from what is shown herein, where these logically different subcomponents can be combined in the aggregate with other subcomponents to provide similar or identical functionality at the aggregated component level to that described herein (e.g., active-high signals can be active-low, AND gates can be replaced with inverted-input NOR gates, etc.).


Furthermore, connections between components or systems within the figures are not intended to be limited to direct connections. Rather, data between these components may be modified, re-formatted, or otherwise changed by intermediary components. Also, additional or fewer connections may be used. It shall also be noted that the terms “coupled,” “connected,” or “communicatively coupled” shall be understood to include direct connections, indirect connections through one or more intermediary devices, and wireless connections.


Reference in the specification to “one embodiment,” “preferred embodiment,” “an embodiment,” or “embodiments” means that a particular feature, structure, characteristic, or function described in connection with the embodiment is included in at least one embodiment of the invention and may be in more than one embodiment. Also, the appearances of the above-noted phrases in various places in the specification are not necessarily all referring to the same embodiment or embodiments.


The use of certain terms in various places in the specification is for illustration and should not be construed as limiting. A service, function, or resource is not limited to a single service, function, or resource; usage of these terms may refer to a grouping of related services, functions, or resources, which may be distributed or aggregated.


The terms “include,” “including,” “comprise,” and “comprising” shall be understood to be open terms, and any lists that follow are examples, not meant to be limited to the listed items, and may include subsets or supersets of the items along with additional items. Any headings used herein are for organizational purposes only and shall not be used to limit the scope of the description or any claims. Each document mentioned in this patent document is incorporated by reference herein in its entirety.


Furthermore, one skilled in the art shall recognize that: (1) certain steps may optionally be performed; (2) steps may not be limited to the specific order set forth herein; (3) certain steps may be performed in different orders; and (4) certain steps may be done concurrently.


Although embodiments herein are discussed mainly in the context of convolutions, one of skill in the art will appreciate that a deconvolution and other matrix operations can also be structured as a matrix-matrix type multiply operation and, thus, the principles of the present invention are equally applicable to deconvolutions. Furthermore, other types of mathematical operations may be implemented in accordance with various embodiments of this disclosure.



FIG. 2 illustrates an exemplary matrix processor architecture for performing arithmetic operations according to various embodiments of the present disclosure. System 200 comprises logic circuits 232 and 234, cache/buffer 224, data formatter 210, weight formatter 212, data input matrix 206, weight input matrix 208, matrix processor 240, output array 226, post-processing units 228, and control logic 250. Matrix processor 240 comprises a plurality of sub-circuits 242, which contain arithmetic logic units (ALUs), registers, and, in some embodiments, encoders (such as Booth encoders). Logic circuit 232 may be a circuit that represents N input operators and data registers, i.e., circuitry that inputs image data operands into matrix processor 240; logic circuit 234 may be circuitry that inputs M weight operands into matrix processor 240. Weight input matrix 208 and data input matrix 206 may be stored in various types of memory, including SRAM devices. One skilled in the art will recognize that various types of operands may be input into the matrix processor 240.


In operation according to certain embodiments, system 200 accelerates convolution operations by reducing redundant operations within the system and implementing hardware-specific logic to perform certain mathematical operations across a large set of data and weights. This acceleration is a direct result of methods (and corresponding hardware components) that retrieve and input image data and weights to the matrix processor 240, as well as of the timing of mathematical operations within the matrix processor 240 on a large scale.


In embodiments, formatters 210 and 212 are implemented as in-line formatters, as in the example in FIG. 2. In certain embodiments, formatters 210 and 212 are discrete components, and in other embodiments the formatters are integrated together and/or with one or more other components. Each is implemented in hardware and converts a matrix into a vector of operands to be operated upon within matrix processor 240. In other embodiments, formatters 210 and 212 are implemented in software, although this typically produces a loss in speed. Data formatter 210 converts two-dimensional or three-dimensional (e.g., a 3×3×3 cube) data comprising data input matrix 206 into a single vector or string that may be represented by a row or column, thereby linearizing or vectorizing data input matrix 206. In detail, formatter 210 receives data input matrix 206 and prepares input data to be processed by matrix processor 240. In embodiments, this is accomplished by mapping parameters of the data input matrix 206 into a suitable format according to the hardware requirements of matrix processor 240 such that matrix processor 240 can efficiently perform a matrix multiply as part of a convolution calculation when generating output pixels.


As an example, assuming matrix processor 240 comprises 96 rows and 96 columns, data mapped into a 96×96 format would cause matrix processor 240 to be utilized to its full computational capacity and, thus, provide a preferred efficiency. In that case, formatter 210 should produce an output that is 96-columns wide. Similarly, formatter 212 should produce an output that is 96-rows wide based on the weight input matrix 208.
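
A shape check makes the sizing rule concrete. The sketch below is a hypothetical NumPy illustration (the inner dimension, here 27, is arbitrary): data vectors 96 columns wide and weight vectors 96 rows tall keep every ALU of the 96×96 array busy in a single pass.

    import numpy as np

    ROWS, COLS = 96, 96   # dimensions of matrix processor 240 in this example

    def check_full_utilization(data_vectors, weight_vectors):
        # Hypothetical shape check: formatter 210 output must be 96 columns
        # wide and formatter 212 output 96 rows tall for full utilization.
        assert data_vectors.shape[1] == COLS
        assert weight_vectors.shape[0] == ROWS
        return weight_vectors @ data_vectors   # one fully utilized pass

    out = check_full_utilization(np.ones((27, 96)), np.ones((96, 27)))
    assert out.shape == (96, 96)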


In embodiments, formatter 210 uses a number of multiplexers or switches to fetch some or all of data input matrix 206 and choose different elements therefrom in order to produce data that is then lined up according to the columns of matrix processor 240. In embodiments, the selection ensures that the appropriate data from data input matrix 206 is passed to each of the columns at defined clock cycles. In embodiments, if weights are static, they may be pre-formatted offline, stored in memory, fetched only once, and fed directly into matrix processor 240 in a modified, vectorized format without the use of formatter 212. In other embodiments, weights may be dynamically adjusted and fed into matrix processor 240 in accordance with various formatting and fetching operations. In embodiments, matrix processor 240 allows for column and row inputs of varying sizes. That is, matrix processor 240 is designed to compute N×M computations of arbitrary size.


In other embodiments, if the number of columns of the matrix processor 240 is limited (for example, to N columns) such that the number of columns in the data input matrix 206 (for example, X) is greater than the number of columns of the matrix processor 240 (i.e., X>N), then the control logic 250 may split the data input matrix 206 into multiple submatrices, with each submatrix computed by a matrix processor 240. In such instances, each matrix processor 240 may be running in a different thread. For example, if data input matrix 206 consists of 192×96 data points, and the matrix processor has 96 columns and 96 rows (i.e., 96×96 computations may occur in one clock cycle), the control logic 250 may split the data input matrix 206 into two submatrices (such as the left half of the data input matrix 206 and the right half of the data input matrix 206). Each submatrix will consist of 96×96 data points. Each separately threaded matrix processor 240 can compute the output channels for the submatrix sent to it, with results placed into the final output array 226, which must be large enough to hold the values from all channels (that is, 192 values). More generally, data input matrix 206 may be split into any number of submatrices and sent to different matrix processors 240, each running in a separate thread. As with the output array 226, the data input matrix 206, data formatter 210, cache/buffer 224, logic circuit 232, and post-processing unit 228 must similarly be able to accommodate the larger data.
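
A minimal sketch of that split, using the 192-column example above (NumPy stands in for the control logic; each submatrix would go to a separately threaded matrix processor):

    import numpy as np

    def split_for_processors(data, n=96):
        # Control-logic sketch: cut a wide input into n-column submatrices.
        return [data[:, j:j + n] for j in range(0, data.shape[1], n)]

    data = np.zeros((96, 192))              # the left-half / right-half example
    left, right = split_for_processors(data)
    assert left.shape == right.shape == (96, 96)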


In alternative embodiments, a CNN may be computed across multiple matrix processors 240 by having control logic 250 split the computations along the inner product. The segments of the inner product are computed, each in a different matrix processor 240, and the partial products are then added together to compute the output vector, which is then stored in output array 226.


Unlike common software implementations of formatting functions that are performed by a CPU or GPU to convert a convolution operation into a matrix-multiply operation (by rearranging data to an alternate format that is suitable for a fast matrix multiplication), various hardware implementations of the present disclosure re-format data on the fly and make it available for execution, e.g., 96 pieces of data every cycle, in effect allowing a very large number of elements of a matrix to be processed in parallel and, thus, efficiently mapping data to a matrix operation. In embodiments, for 2N fetched input data, 2N² compute operations may be obtained in a single clock cycle; for N=96, fetching 192 operands yields 2×96²=18,432 operations per cycle. This architecture results in a meaningful improvement in processing speeds by effectively reducing the number of read or fetch operations employed in a typical processor architecture, as well as by providing a parallelized, efficient, and synchronized process for performing a large number of mathematical operations across a plurality of data inputs.


In embodiments, to increase the efficiency of matrix processor 240, which may have any arbitrary number of columns and rows, formatters 210 and 212 may reformat differently shaped input matrices into the columns and rows suitable for matrix processor 240. In embodiments, formatting is performed dynamically to accommodate processing of matrices having different input sizes. In embodiments, the reformatted matrices comprising input channels are fed into cache/buffer 224.


Cache/buffer 224 may fetch data from data input matrix 206 only 1/k of the time, as various pieces of data may be reused, where k is the convolution kernel width. For example, for any given cycle, once a row is fetched, certain columns will have access to all the data in that row. In embodiments, cache/buffer 224 may be a local buffer that stores a local copy of data that may be reused by a convolution without having to re-access and read data from SRAM.
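
The 1/k figure follows from window overlap. A quick back-of-the-envelope check in Python (numbers chosen for illustration only):

    k = 3                        # convolution kernel width
    row_len = 96                 # elements in one fetched row
    windows = row_len - k + 1    # overlapping k-wide windows in that row
    naive_fetches = windows * k  # re-fetch every element of every window
    cached_fetches = row_len     # fetch each element once and reuse it
    print(cached_fetches / naive_fetches)   # ~0.34, i.e., roughly 1/k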


Once matrix processor 240 has completed a computation, a set of results may be shifted, e.g., from the accumulators in the bottom row of matrix processor 240, to output flip-flops (not shown) that effectively form a shift register that receives a dot product. In embodiments, pulling or shifting results into output array 226, e.g., one per clock cycle, from a row that corresponds to an output channel may be accomplished by a state machine (not shown). The state machine may perform additional operations on the output channel, for example, prior to sending data to SRAM and/or post-processing unit 228. The internal operation of matrix processor 240 will be described in more detail below.


In embodiments, matrix processor 240 comprises shadow registers that enable parallel processing by storing a copy of the results that are passed through matrix processor 240 to output array 226. In embodiments, moving an operation result from the output register to the shadow register allows the next set of values to be loaded into the ALUs.


Once an accumulation has completed, a new convolution may commence and accumulation may start over before all of the data of a prior convolution has been output to output array 226. As a result, in every clock cycle, the data in matrix processor 240 may move down by one row, such that for each cycle the last row may be output to output array 226. In effect, this mode of operation ensures that a new calculation may be made in each consecutive cycle without any interruptions and independent of additional processing operations, such as storing data in SRAM, etc.


Post-processing unit 228 may comprise or interact with a number of devices (not shown), such as a hardware-accelerated pooling unit, a DRAM that may be part of a direct memory access (“DMA”) that retrieves data from memory and stores data (e.g., weights and results) in SRAM, and the like. The devices may be partially or entirely controlled by control logic 250, which may also manage formatters 210 and 212 and other components within system 200.


Not shown in FIG. 2 are auxiliary devices that perform management functions, such as a sequencer that generates addresses for reading the data, writes the results, and keeps track of where system 200 is in the convolution in order to determine from where to get, and how to execute, the data that will be used in a subsequent step of the convolution.


In certain embodiments, weight input matrix 208 is physically split and drives weights from two different sides of matrix processor 240, such that the two-dimensional array is split into two regions (e.g., a left-hand side and a right-hand side) that each receive a portion of the data in weight input matrix 208. Such an implementation reduces data latency by taking advantage of the fact that weights are known. In embodiments, in order to reduce peak power consumption, the timing of operations may be chosen such that multiplications of weight and data are spread out over a certain number of cycles. This efficient timing of operations results in a reduction of energy-consuming steps, including a decrease in the number of read operations performed by the matrix processor and an improvement in the efficiency of data movement within the matrix (e.g., between sub-circuits).


In embodiments, a state machine (not shown) that is configured to identify redundant data may be employed. Identified redundant data may be reused across columns, such that the data does not need to be re-fetched. The state machine may be configured to determine how and where to shift data that is to be executed, e.g., based on inputs related to image size, filter size, stride, number of channels, and similar parameters.


In embodiments, a Booth encoder is shared across a number of elements in the multiplication architecture of matrix processor 240. The Booth encoder may be any Booth encoder known in the art and may be used to multiply two numbers and encode one of the two numbers, e.g., from an 8-bit value to a 12-bit or any other value that makes multiplication operations easier on the multiplier logic and, thus, faster. In embodiments, the Booth encoder may be applied in parallel across an entire row so as to share the same encoded, alternate weight value across all columns. By loading an operand across all columns, a multiplication may be performed in a single clock cycle across an entire row. The cost of leveraging re-encoding to share the same data (e.g., weights) across N computational elements is thus paid only once for each column (or row). In comparison, in existing computing architectures, every single scalar would require a Booth encoder for every single multiplication operation.
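
For readers unfamiliar with Booth encoding, the sketch below shows standard radix-4 Booth recoding in Python (a textbook formulation, not the patent's specific encoder): the multiplier is recoded into digits in {-2, ..., 2}, so each partial product is just the multiplicand shifted and possibly negated, and the recoding cost can be paid once and shared across a row.

    def booth_radix4(y, bits):
        # Recode a two's-complement integer into radix-4 Booth digits.
        digits, prev = [], 0
        for i in range(0, bits, 2):
            b0 = (y >> i) & 1
            b1 = (y >> (i + 1)) & 1
            digits.append(b0 + prev - 2 * b1)   # overlapping 3-bit group
            prev = b1
        return digits                           # y == sum(d * 4**k for k, d)

    def booth_multiply(x, y, bits=8):
        # Each term is 0, +/-x, or +/-2x, shifted: no general multiplier needed.
        return sum((d * x) << (2 * k)
                   for k, d in enumerate(booth_radix4(y, bits)))

    assert booth_multiply(57, -23) == 57 * -23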



FIG. 3 illustrates details of an exemplary configuration of the matrix processor architecture shown in FIG. 2. In embodiments, matrix processor 300 may accommodate a predetermined vector length on each axis. As depicted in FIG. 3, matrix processor 300 may comprise an array of 6×6 tiles 302 that are arranged in a matrix format. Each tile 302 may comprise a matrix 320 that, in turn, comprises sub-circuits 350. As discussed in detail below with reference to FIG. 4, each sub-circuit 350 may be a cell capable of performing arithmetic operations. In embodiments, sub-circuit 350 simultaneously performs multiplication, accumulation, and shift operations.


In embodiments, arithmetic operations are parallelized by utilizing multiple rows and columns of matrix processor 300 to generate an N×N tile output. For example, a given row size of 96 and a corresponding column size of 96 facilitate an output of 2×9216 mathematical calculations per clock cycle. In other embodiments, the number of rows and columns may be different; that is, there may be N rows and M columns, and an N×M tile output may be generated. For example, for a row size of 96 and a corresponding column size of 192, an output of 2×18,432 calculations is generated in a single clock cycle.



FIG. 4 illustrates an exemplary multiply-and-add circuit implementation of the sub-circuit shown in FIG. 3. As depicted in FIG. 4, multiply-and-add circuit 400 comprises multiplier 430, adder 432, logic 434, 436, and 438, accumulator 424, shadow register 428, and output register 440. In embodiments, accumulator 424 may be implemented as an accumulation register.


In embodiments, accumulator 424 may comprise a set of ALUs that comprise registers and shadow register 428 that may be configured to receive the outputs of the ALUs.


In operation, multiplier 430 receives and multiplies weights 402 and data 404 to generate products therefrom. Each product may be provided to adder 432 that, in response to receiving the product from multiplier 430, adds the product to the current value of the accumulator 424.


In embodiments, accumulator 424 generates an accumulated value that is stored, e.g., in output register 440. The accumulated value is the result of a convolution and, as mentioned with reference to FIG. 2, may correspond to the dot product of two formatted matrices.


In embodiments, a copy of the result in output register 440 may be provided to shadow register 428, which may output result 450, such that accumulator 424 can be accessed again to commence new calculations. In embodiments, multiply-and-add circuit 400 in FIG. 4 may perform a multiplication, an addition operation, and a shift operation at the same time, i.e., within a single cycle, thereby doubling the total number of operations that occur each cycle.


In embodiments, ClearAcc signal 408 clears the contents of accumulator 424, e.g., when multiplier 430 performs a multiply operation, such that accumulation operations can start over. In embodiments, ResultEnable signal 412 is activated in response to a determination that data 404 is valid. It is understood that accumulator 424 may accumulate and save data, accumulate and clear data, or just clear data.


In embodiments, results are moved from output register 440 to shadow register 428 in a single clock cycle, i.e., without the need of intermediate execute and save operations.
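
A cycle-level software model of this cell (a hypothetical sketch; register names mirror FIG. 4 but the control details are simplified) illustrates how latching into the shadow register frees the accumulator in the same cycle a new accumulation begins:

    class MACCell:
        # Simplified model of multiply-and-add circuit 400.
        def __init__(self):
            self.acc = 0        # accumulator 424
            self.out = 0        # output register 440
            self.shadow = 0     # shadow register 428

        def cycle(self, weight, data, clear_acc=False):
            if clear_acc:               # ClearAcc signal 408
                self.out = self.acc     # latch the finished dot product
                self.shadow = self.out  # single-cycle copy to the shadow
                self.acc = 0            # accumulation starts over
            self.acc += weight * data   # multiplier 430 feeding adder 432
            return self.shadow          # result 450, free to shift out

    cell = MACCell()
    for w, d in [(1, 2), (3, 4), (5, 6)]:
        cell.cycle(w, d)
    cell.cycle(7, 8, clear_acc=True)    # new work begins as 44 is latched
    assert cell.shadow == 2 + 12 + 30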



FIG. 5 illustrates an exemplary convolution operation according to various embodiments of the present disclosure. Convolution 500 comprises input channels IC of input image 502, weights 532, dot product 514, output channels OC, and accumulator 540.


In embodiments, convolution operation 500 applies individual filters (i.e., weights) 532 to input image 502, e.g., to detect small features within input image 502. By analyzing a sequence of different features in a different order, macro features may then be identified in input image 502. In other embodiments, input 502 is non-image data. For example, input 502 may be non-image sensor data, such as ultrasonic, radar, LIDAR, or other sensor data. Input 502 may also be general mathematical computations or any other types of data known to one of skill in the art.


Convolution 500 may use a different set of weights 532 for each input channel IC, as each input channel IC may contain a different set of information, and each weight matrix 532 may be designed to help identify a different feature. In embodiments, convolution 500 multiplies a rectangular input matrix 504 with a rectangular weight matrix 532 to obtain partial dot products. The partial dot products may then be summed by adder 546 in order to generate an accumulated dot product 514 (i.e., an integer) that represents an output pixel 514 in the output image.


In embodiments, each pixel in output channel OC is generated by multiplier 542 and adder 544. In embodiments, the values of the partial dot products correspond to the application of weight matrix 532 in its entirety to area 504 of the input image 502. In other words, each weight 532 is dot-multiplied by multiplier 542 with area 504 to produce a partial dot product; the partial dot products are then accumulated in accumulator 540 to generate an accumulated output that represents the convolution.


One or more input channels IC, e.g., one for each color (e.g., RGB), may be used. For example, each convolution may use weights 532 that represent three different matrices, one for each color. Each output channel OC 512 may be generated using a different filter or weight 532 that represents a different feature in input data 502. The number of output channels may depend on the number of features. The number of convolutions is equal to the number of output channels OC times the number of input channels IC, and each convolution may have N convolutions for each input channel IC. One skilled in the art will recognize that the number and type of input channels may vary and may include color and/or clear inputs.


As depicted in FIG. 5, input matrix 504 is a Kx×Ky (i.e., 3×3) matrix that may be combined with a 3×3 weight matrix 532 across 3 input channels, i.e., 3×3×IC, such that the depths match and produce a single element, dot product 514, in the output plane. Each dot product 514 in output channel 512 is the result of a dot multiplication.
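
In NumPy terms (a sketch with random data; shapes follow FIG. 5), one output pixel is the accumulation of per-channel partial dot products:

    import numpy as np

    IC = 3                                  # input channels, e.g., RGB
    area = np.random.rand(3, 3, IC)         # input area 504
    weights = np.random.rand(3, 3, IC)      # weight matrix 532, one per channel

    partials = [np.sum(area[..., c] * weights[..., c]) for c in range(IC)]
    pixel = sum(partials)                   # accumulated dot product 514
    assert np.isclose(pixel, np.sum(area * weights))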



FIG. 6 through FIG. 8 illustrate details of an exemplary convolution operation according to various embodiments of the present disclosure. Convolution 600 comprises input data matrix 602, weight data matrix 604, array 606, and dot product 630. In embodiments, array 606 is a matrix processor architecture as shown in FIG. 2 and FIG. 3.


Input data matrix 602 in FIG. 6 comprises column 610 that, in embodiments, may be obtained by linearizing an input matrix, such as rectangular input matrix 504 shown in FIG. 5, to obtain a vectorized form of the input matrix. Similarly, weight data matrix 604 comprises row 620 that may be a vectorized form of a weight matrix, such as rectangular weight matrix 532 in FIG. 5. As an example, a 3×3 input matrix and 3 input channels may be re-formatted into a vector that comprises 3×3×3=27 elements, from which a 27-element column 610 may be produced for use in input data matrix 602. Similarly, a 3×3 weight matrix for the same 3 input channels may be used to generate a 27-element row 620 for use in weight data matrix 604. One skilled in the art will recognize that the sizes of input matrices and the number of input channels may vary across different applications.
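
The 27-element example can be checked directly in Python (illustrative only): flattening the 3×3×3 input area and the matching weights and taking one dot product reproduces the direct element-wise computation.

    import numpy as np

    area = np.random.rand(3, 3, 3)     # 3 x 3 input area across 3 channels
    kernel = np.random.rand(3, 3, 3)   # matching 3 x 3 x 3 weights

    column_610 = area.ravel()          # 27-element column for matrix 602
    row_620 = kernel.ravel()           # 27-element row for matrix 604
    assert column_610.size == row_620.size == 27

    dot_630 = row_620 @ column_610
    assert np.isclose(dot_630, np.sum(area * kernel))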


In embodiments, the input channels and input weights drawn as rectangles in FIG. 5 are reformatted, e.g., by the formatter discussed with reference to FIG. 2, into vector formats (e.g., vectors having 96 elements) that are provided to a matrix multiplier/processor (denoted as element 240 in FIG. 2), such that a 96×96 element dot product operation can be performed in parallel. In detail, input data 504 and input weights 532, shown in FIG. 5 as rectangles for each input channel, are reformatted into vector formats.


In embodiments, the resulting vector formats, illustrated in FIG. 6 as input data 602 and input weights 604 (e.g., each comprising 96 elements), are provided to matrix processor or matrix multiplier 240, which performs a 96×96 element dot product operation in parallel. In embodiments, in the calculation of output channels, the same output pixels are produced using the same set of input data but different sets of weights (i.e., filters), such that by reading the input data once, many output channels can be generated at once. As stated above, it is understood that the number of input and output channels may be arbitrarily chosen.


It is further understood that input data matrix 602, weight data matrix 604, and array 606 may have different numbers of columns and rows than those depicted in FIG. 6. In particular, the shapes of input data matrix 602 and weight data matrix 604 may be formatted so as to accommodate the columns and rows of any arbitrary configuration of array 606. In addition, in circumstances in which weight data matrix 604 is known, row 620 may be generated and stored in a vectorized format without the use of a formatter.


In embodiments, dot product 630 in FIG. 6 is generated by dot-multiplying a vector corresponding to column 610 with a vector corresponding to row 620. In embodiments, as shown in FIG. 7, the next dot product 632 may be obtained by dot-multiplying a vector corresponding to column 612 with the vector corresponding to row 620. As those of skill in the art will recognize, once all dot products in the first row of array 606 are filled, the dot product of the second row of array 606 may be calculated by dot-multiplying the elements in first column 610 of input data matrix 602 with the second row of weight data matrix 604, etc.


It is important to note that FIG. 6 through FIG. 8 merely serve illustrative purposes and that the abovementioned dot-multiplications may be simultaneously performed to generate a one-shot matrix-matrix multiply operation.
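
Equivalently, the filling of array 606 is a single matrix-matrix product: entry (i, j) is the dot product of weight row i with data column j. A NumPy sketch with illustrative sizes:

    import numpy as np

    data_602 = np.random.rand(27, 4)     # one linearized patch per column
    weights_604 = np.random.rand(4, 27)  # one linearized filter per row

    array_606 = weights_604 @ data_602   # every dot product in one shot
    assert array_606.shape == (4, 4)
    assert np.isclose(array_606[0, 0], weights_604[0] @ data_602[:, 0])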



FIG. 9 illustrates an exemplary deconvolution operation according to various embodiments of the present disclosure. Deconvolution system 900 comprises input channels IC of input image 902, weights 922, dot products 904 and 906, and output channels OC. A person of skill in the art will recognize that the deconvolution operation 900 is, in effect, a mathematical transposition (approximately the inverse) of a convolution operation, for example, the convolution shown in FIG. 5. One of skill in the art will further recognize that a neural network may be used to learn deconvolution operation 900 by applying procedures similar to those used for ordinary convolutional neural networks. For purposes of brevity, a description of components similar to those in FIG. 5 is not repeated here.


In embodiments, deconvolution operation 900 in FIG. 9 reassembles matrices 912 by deconstructing dot products 904 and 906 using weights 922. As with a convolution operation, deconvolution 900 may use a different set of weights 922 for each input channel IC. In embodiments, deconvolution 900 may be advantageously applied to an image to perform image deconvolution, for example, to improve robustness against artifacts. Other applications may include analysis and restoration of image data, and the like.
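
Because the convolution has been expressed as a matrix multiply, the transposed operation is simply a multiply by the transposed weight matrix, which scatters dot products back to the input's shape. A minimal NumPy sketch of that relationship (an approximation of the inverse, not an exact one):

    import numpy as np

    weights = np.random.rand(4, 27)          # convolution as a weight matrix
    data = np.random.rand(27, 5)
    activations = weights @ data             # forward convolution (as above)

    reassembled = weights.T @ activations    # transposed (deconvolution) pass
    assert reassembled.shape == data.shape   # back to the input's dimensions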



FIG. 10 illustrates a process for performing arithmetic operations to accelerate convolutional neural networks according to various embodiments of the present disclosure.


Process 1000 for performing arithmetic operations begins at step 1002, when a first set of operands that may be representative of a row in a data matrix is received from a first logic circuit. This first set of operands may be vectorized such that the operands are aligned with inputs into a matrix processor. In certain embodiments, the size of the vectorized operands is directly related to the number of inputs into the matrix processor along one axis.


At step 1004, a second set of operands that may be representative of a column in a weight matrix is received from a second logic circuit. This second set of operands may be vectorized such that the operands are aligned with corresponding inputs into the matrix processor. In certain embodiments, the size of the vectorized operands is directly related to the number of inputs into the matrix processor along a different axis.


At step 1006, the first set of operands is dot-multiplied with the second set of operands to obtain one or more dot products. In certain embodiments, this operation across the sets of operands is performed in a single clock cycle.


At step 1008, the dot products may be used to convolve an image with a filter to produce a convolution result.


At step 1010, the convolution result is further processed to enhance the image output. This further processing may occur using a non-linear function, a normalization operation, or a pooling operation.
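
Putting steps 1002 through 1010 together, a compact end-to-end sketch in NumPy (hypothetical shapes and post-processing choices; ReLU and 2-wide max pooling stand in for the non-linear and pooling operations):

    import numpy as np

    def process_1000(data_vectors, weight_vectors):
        # Steps 1002/1004: vectorized operand sets arrive from the logic circuits.
        # Step 1006: dot-multiply all operand pairs in one shot.
        dot_products = weight_vectors @ data_vectors
        # Step 1008: the dot products form the convolution result.
        # Step 1010: post-process, here a non-linearity then 2-wide max pooling.
        activated = np.maximum(dot_products, 0.0)
        return activated.reshape(activated.shape[0], -1, 2).max(axis=2)

    out = process_1000(np.random.rand(27, 4), np.random.rand(8, 27))
    assert out.shape == (8, 2)               # 8 output channels, pooled pixels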


One skilled in the art will recognize that no computing system or programming language is critical to the practice of the present invention. One skilled in the art will also recognize that a number of the elements described above may be physically and/or functionally separated into sub-modules or combined together.


It shall be noted that elements of the claims below may be arranged differently, including having multiple dependencies, configurations, and combinations. For example, in embodiments, the subject matter of various claims may be combined with other claims.


It will be appreciated by those skilled in the art that the preceding examples and embodiments are exemplary and not limiting to the scope of the present invention. It is intended that all permutations, enhancements, equivalents, combinations, and improvements thereto that are apparent to those skilled in the art upon a reading of the specification and a study of the drawings are included within the true spirit and scope of the present invention.

Claims
  • 1. A matrix processor comprising: a first input circuit configured to receive sensor data; a second input circuit configured to receive one or more filters of a plurality of filters; and a plurality of sub-circuits arranged as a matrix and configured to receive the sensor data and filters, wherein each sub-circuit comprises an arithmetic logic unit, and wherein the sub-circuits are configured to convolve the sensor data and filters, wherein to convolve the sensor data and filters the sub-circuits are configured to sequentially convolve, via the sub-circuits, individual subsets of the sensor data with the one or more filters, wherein one or more of the remaining filters are subsequently received for convolution, wherein a last row of the sub-circuits arranged as the matrix is configured to shift output to an output array, and wherein the output comprises individual subsets of the sensor data convolved with a respective filter of the plurality of filters.
  • 2. The matrix processor of claim 1, wherein the sensor data comprises image data, LIDAR data, ultrasonic data, or radar data.
  • 3. The matrix processor of claim 1, wherein the received sensor data comprises reformatted operands representing linearized sensor data.
  • 4. The matrix processor of claim 1, wherein one or more of the sub-circuits further comprise encoders.
  • 5. The matrix processor of claim 4, wherein at least a portion of the sub-circuits share a particular encoder, and wherein the particular encoder is a booth encoder.
  • 6. The matrix processor of claim 1, wherein the matrix processor implements a state machine configured to identify redundant data.
  • 7. The matrix processor of claim 6, wherein identifying redundant data is based on input comprising respective sizes associated with individual filters of the plurality of filters and/or individual strides of one or more strides.
  • 8. The matrix processor of claim 1, wherein convolving a first subset comprises: determining a convolution of the first subset with one or more of the filters, wherein one or more remaining subsets are sequentially convolved with the one or more of the remaining filters.
  • 9. The matrix processor of claim 1, wherein the matrix processor comprises an array of tiles, and wherein the tiles comprise respective subsets of the sub-circuits.
  • 10. A system comprising: a first logic circuit configured to format sensor data and provide the formatted sensor data to a matrix processor; a second logic circuit configured to provide one or more filters of a plurality of filters to the matrix processor; and the matrix processor comprising a plurality of sub-circuits arranged as a matrix, the sub-circuits being configured to sequentially convolve individual subsets of the sensor data with the one or more filters, wherein one or more of the remaining filters are subsequently received for convolution, wherein a last row of the sub-circuits arranged as the matrix is configured to shift output to an output array, and wherein the output comprises individual subsets of the sensor data convolved with a respective filter of the plurality of filters.
  • 11. The system of claim 10, wherein sequentially convolving individual subsets of the sensor data comprises: determining a convolution of a first subset of the sensor data with one or more of the filters, wherein one or more of the remaining filters of the plurality of filters are received for convolution with the first subset.
  • 12. The system of claim 10, wherein the first logic circuit comprises a plurality of data registers that store portions of the sensor data, the plurality of data registers having a first width corresponding to a size of an input region obtained from sensor data.
  • 13. The system of claim 12, wherein the input region corresponds to an individual subset of the sensor data.
  • 14. The system of claim 10, wherein the system comprises a data formatter configured to linearize sensor data into a plurality of vectors, each vector representing a respective subset of the sensor data.
  • 15. A method implemented by a matrix processor, the method comprising: receiving, from a first logic circuit, sensor data comprising a plurality of subsets; receiving, from a second logic circuit, one or more filters of a plurality of filters; using a plurality of sub-circuits of the matrix processor to sequentially convolve individual subsets with the one or more filters, wherein the sub-circuits are arranged as a matrix, wherein one or more of the remaining filters are subsequently received for sequential convolution with the individual subsets; and outputting, to an output array, output from a last row of the sub-circuits arranged as the matrix, wherein the output comprises individual subsets of the sensor data convolved with a respective filter of the plurality of filters.
  • 16. The method of claim 15, wherein subsequent to convolving a first subset with the one or more filters, a second subset is convolved with the one or more filters.
  • 17. The method of claim 15, wherein the matrix processor sequentially receives a respective one or more filters of the plurality of filters.
  • 18. The matrix processor of claim 1, wherein the output corresponds to a particular output channel of a plurality of output channels, and wherein the plurality of filters is associated with the particular output channel.
  • 19. The system of claim 10, wherein the output corresponds to a particular output channel of a plurality of output channels, and wherein the plurality of filters is associated with the particular output channel.
  • 20. The method of claim 15, wherein the output corresponds to a particular output channel of a plurality of output channels, and wherein the plurality of filters is associated with the particular output channel.
CROSS-REFERENCE TO RELATED APPLICATION

This application is a continuation of, and claims priority to, U.S. patent application Ser. No. 15/710,433 titled “ACCELERATED MATHEMATICAL ENGINE” and filed on Sep. 20, 2017. U.S. Patent App. No. 15/710,433 claims the priority benefit under 35 USC § 119(e) to U.S. Prov. Pat. App. Ser. No. 62/536,399, filed on Jul. 24, 2017, entitled “Accelerated Mathematical Engine,” and listing Peter Joseph Bannon, Kevin Altair Hurd, and Emil Talpes as inventors. Each of the above recited applications is hereby incorporated herein by reference in its entirety and for all purposes.

20200098095 Borcs et al. Mar 2020 A1
20200103894 Cella et al. Apr 2020 A1
20200104705 Bhowmick et al. Apr 2020 A1
20200110416 Hong et al. Apr 2020 A1
20200117180 Cella et al. Apr 2020 A1
20200117889 Laput et al. Apr 2020 A1
20200117916 Liu Apr 2020 A1
20200117917 Yoo Apr 2020 A1
20200118035 Asawa et al. Apr 2020 A1
20200125844 She et al. Apr 2020 A1
20200125845 Hess et al. Apr 2020 A1
20200126129 Lkhamsuren et al. Apr 2020 A1
20200134427 Oh et al. Apr 2020 A1
20200134461 Chai et al. Apr 2020 A1
20200134466 Weintraub et al. Apr 2020 A1
20200134848 El-Khamy et al. Apr 2020 A1
20200143231 Fusi et al. May 2020 A1
20200143279 West et al. May 2020 A1
20200148201 King et al. May 2020 A1
20200149898 Felip et al. May 2020 A1
20200151201 Chandrasekhar et al. May 2020 A1
20200151619 Mopur et al. May 2020 A1
20200151692 Gao et al. May 2020 A1
20200158822 Owens et al. May 2020 A1
20200158869 Amirloo et al. May 2020 A1
20200159225 Zeng et al. May 2020 A1
20200160064 Wang et al. May 2020 A1
20200160104 Urtasun et al. May 2020 A1
20200160117 Urtasun et al. May 2020 A1
20200160178 Kar et al. May 2020 A1
20200160532 Urtasun et al. May 2020 A1
20200160558 Urtasun et al. May 2020 A1
20200160559 Urtasun et al. May 2020 A1
20200160598 Manivasagam et al. May 2020 A1
20200162489 Bar-Nahur et al. May 2020 A1
20200167438 Herring May 2020 A1
20200167554 Wang et al. May 2020 A1
20200174481 Van Heukelom et al. Jun 2020 A1
20200175326 Shen et al. Jun 2020 A1
20200175354 Volodarskiy et al. Jun 2020 A1
20200175371 Kursun Jun 2020 A1
20200175401 Shen Jun 2020 A1
20200183482 Sebot et al. Jun 2020 A1
20200184250 Oko Jun 2020 A1
20200184333 Oh Jun 2020 A1
20200192389 ReMine et al. Jun 2020 A1
20200193313 Ghanta et al. Jun 2020 A1
20200193328 Guestrin et al. Jun 2020 A1
20200202136 Shrestha et al. Jun 2020 A1
20200202196 Guo et al. Jun 2020 A1
20200209857 Djuric et al. Jul 2020 A1
20200209867 Valois et al. Jul 2020 A1
20200209874 Chen et al. Jul 2020 A1
20200210717 Hou et al. Jul 2020 A1
20200210769 Hou et al. Jul 2020 A1
20200210777 Valois et al. Jul 2020 A1
20200216064 du Toit et al. Jul 2020 A1
20200218722 Mai et al. Jul 2020 A1
20200218979 Kwon et al. Jul 2020 A1
20200223434 Campos et al. Jul 2020 A1
20200225758 Tang et al. Jul 2020 A1
20200226377 Campos et al. Jul 2020 A1
20200226430 Ahuja et al. Jul 2020 A1
20200238998 Dasalukunte et al. Jul 2020 A1
20200242381 Chao et al. Jul 2020 A1
20200242408 Kim et al. Jul 2020 A1
20200242511 Kale et al. Jul 2020 A1
20200245869 Sivan et al. Aug 2020 A1
20200249685 Elluswamy et al. Aug 2020 A1
20200250456 Wang et al. Aug 2020 A1
20200250515 Rifkin et al. Aug 2020 A1
20200250874 Assouline et al. Aug 2020 A1
20200257301 Weiser et al. Aug 2020 A1
20200257306 Nisenzon Aug 2020 A1
20200258057 Farahat et al. Aug 2020 A1
20200265247 Musk et al. Aug 2020 A1
20200272160 Djuric et al. Aug 2020 A1
20200272162 Hasselgren et al. Aug 2020 A1
20200272859 Iashyn et al. Aug 2020 A1
20200273231 Schied et al. Aug 2020 A1
20200279354 Klaiman Sep 2020 A1
20200279364 Sarkisian et al. Sep 2020 A1
20200279371 Wenzel et al. Sep 2020 A1
20200285464 Brebner Sep 2020 A1
20200286256 Houts et al. Sep 2020 A1
20200293786 Jia et al. Sep 2020 A1
20200293796 Sajjadi et al. Sep 2020 A1
20200293828 Wang et al. Sep 2020 A1
20200293905 Huang et al. Sep 2020 A1
20200294162 Shah Sep 2020 A1
20200294257 Yoo et al. Sep 2020 A1
20200294310 Lee et al. Sep 2020 A1
20200297237 Tamersoy et al. Sep 2020 A1
20200298891 Liang et al. Sep 2020 A1
20200301799 Manivasagam et al. Sep 2020 A1
20200302276 Yang et al. Sep 2020 A1
20200302291 Hong Sep 2020 A1
20200302627 Duggal et al. Sep 2020 A1
20200302662 Homayounfar et al. Sep 2020 A1
20200304441 Bradley et al. Sep 2020 A1
20200306640 Kolen et al. Oct 2020 A1
20200307562 Ghafarianzadeh et al. Oct 2020 A1
20200307563 Ghafarianzadeh et al. Oct 2020 A1
20200309536 Omari et al. Oct 2020 A1
20200309923 Bhaskaran et al. Oct 2020 A1
20200310442 Halder et al. Oct 2020 A1
20200311601 Robinson et al. Oct 2020 A1
20200312003 Borovikov et al. Oct 2020 A1
20200315708 Mosnier et al. Oct 2020 A1
20200320132 Neumann Oct 2020 A1
20200324073 Rajan et al. Oct 2020 A1
20200327192 Hackman et al. Oct 2020 A1
20200327443 Van et al. Oct 2020 A1
20200327449 Tiwari et al. Oct 2020 A1
20200327662 Liu et al. Oct 2020 A1
20200327667 Arbel et al. Oct 2020 A1
20200331476 Chen et al. Oct 2020 A1
20200334416 Vianu et al. Oct 2020 A1
20200334495 Al et al. Oct 2020 A1
20200334501 Lin et al. Oct 2020 A1
20200334551 Javidi et al. Oct 2020 A1
20200334574 Ishida Oct 2020 A1
20200337648 Saripalli et al. Oct 2020 A1
20200341466 Pham et al. Oct 2020 A1
20200342350 Madar et al. Oct 2020 A1
20200342548 Mazed et al. Oct 2020 A1
20200342652 Rowell et al. Oct 2020 A1
20200348909 Das Sarma et al. Nov 2020 A1
20200350063 Thornton et al. Nov 2020 A1
20200351438 Dewhurst et al. Nov 2020 A1
20200356107 Wells Nov 2020 A1
20200356790 Jaipuria et al. Nov 2020 A1
20200356864 Neumann Nov 2020 A1
20200356905 Luk et al. Nov 2020 A1
20200361083 Mousavian et al. Nov 2020 A1
20200361485 Zhu et al. Nov 2020 A1
20200364481 Kornienko et al. Nov 2020 A1
20200364508 Gurel et al. Nov 2020 A1
20200364540 Elsayed et al. Nov 2020 A1
20200364746 Longano et al. Nov 2020 A1
20200364953 Simoudis Nov 2020 A1
20200372362 Kim Nov 2020 A1
20200372402 Kursun et al. Nov 2020 A1
20200380362 Cao et al. Dec 2020 A1
20200380383 Kwong et al. Dec 2020 A1
20200393841 Frisbie et al. Dec 2020 A1
20200394421 Yu et al. Dec 2020 A1
20200394457 Brady Dec 2020 A1
20200394495 Moudgill et al. Dec 2020 A1
20200394813 Theverapperuma et al. Dec 2020 A1
20200396394 Zlokolica et al. Dec 2020 A1
20200398855 Thompson Dec 2020 A1
20200401850 Bazarsky et al. Dec 2020 A1
20200401886 Deng et al. Dec 2020 A1
20200402155 Kurian et al. Dec 2020 A1
20200402226 Peng Dec 2020 A1
20200410012 Moon et al. Dec 2020 A1
20200410224 Goel Dec 2020 A1
20200410254 Pham et al. Dec 2020 A1
20200410288 Capota et al. Dec 2020 A1
20200410751 Omari et al. Dec 2020 A1
20210004014 Sivakumar Jan 2021 A1
20210004580 Sundararaman et al. Jan 2021 A1
20210004611 Garimella et al. Jan 2021 A1
20210004663 Park et al. Jan 2021 A1
20210006835 Slattery et al. Jan 2021 A1
20210011908 Hayes et al. Jan 2021 A1
20210012116 Urtasun et al. Jan 2021 A1
20210012210 Sikka et al. Jan 2021 A1
20210012230 Hayes et al. Jan 2021 A1
20210012239 Arzani et al. Jan 2021 A1
20210015240 Elfakhri et al. Jan 2021 A1
20210019215 Neeter Jan 2021 A1
20210026360 Luo Jan 2021 A1
20210027112 Brewington et al. Jan 2021 A1
20210027117 McGavran et al. Jan 2021 A1
20210030276 Li et al. Feb 2021 A1
20210034921 Pinkovich et al. Feb 2021 A1
20210042575 Firner Feb 2021 A1
20210042928 Takeda et al. Feb 2021 A1
20210046954 Haynes Feb 2021 A1
20210048984 Bannon Feb 2021 A1
20210049378 Gautam et al. Feb 2021 A1
20210049455 Kursun Feb 2021 A1
20210049456 Kursun Feb 2021 A1
20210049548 Grisz et al. Feb 2021 A1
20210049700 Nguyen et al. Feb 2021 A1
20210056114 Price et al. Feb 2021 A1
20210056306 Hu et al. Feb 2021 A1
20210056317 Golov Feb 2021 A1
20210056420 Konishi et al. Feb 2021 A1
20210056701 Vranceanu et al. Feb 2021 A1
20220050806 Talpes Feb 2022 A1
Foreign Referenced Citations (255)
Number Date Country
2019261735 Jun 2020 AU
2019201716 Oct 2020 AU
110599537 Dec 2019 CN
102737236 Oct 2012 CN
103366339 Oct 2013 CN
104835114 Aug 2015 CN
103236037 May 2016 CN
103500322 Aug 2016 CN
106419893 Feb 2017 CN
106504253 Mar 2017 CN
107031600 Aug 2017 CN
107169421 Sep 2017 CN
107507134 Dec 2017 CN
107885214 Apr 2018 CN
108122234 Jun 2018 CN
107133943 Jul 2018 CN
107368926 Jul 2018 CN
105318888 Aug 2018 CN
108491889 Sep 2018 CN
108647591 Oct 2018 CN
108710865 Oct 2018 CN
105550701 Nov 2018 CN
108764185 Nov 2018 CN
108845574 Nov 2018 CN
108898177 Nov 2018 CN
109086867 Dec 2018 CN
107103113 Jan 2019 CN
109215067 Jan 2019 CN
109359731 Feb 2019 CN
109389207 Feb 2019 CN
109389552 Feb 2019 CN
106779060 Mar 2019 CN
109579856 Apr 2019 CN
109615073 Apr 2019 CN
106156754 May 2019 CN
106598226 May 2019 CN
106650922 May 2019 CN
109791626 May 2019 CN
109901595 Jun 2019 CN
109902732 Jun 2019 CN
109934163 Jun 2019 CN
109948428 Jun 2019 CN
109949257 Jun 2019 CN
109951710 Jun 2019 CN
109975308 Jul 2019 CN
109978132 Jul 2019 CN
109978161 Jul 2019 CN
110060202 Jul 2019 CN
110069071 Jul 2019 CN
110084086 Aug 2019 CN
110096937 Aug 2019 CN
110111340 Aug 2019 CN
110135485 Aug 2019 CN
110197270 Sep 2019 CN
110310264 Oct 2019 CN
110321965 Oct 2019 CN
110334801 Oct 2019 CN
110399875 Nov 2019 CN
110414362 Nov 2019 CN
110426051 Nov 2019 CN
110473173 Nov 2019 CN
110516665 Nov 2019 CN
110543837 Dec 2019 CN
110569899 Dec 2019 CN
110599864 Dec 2019 CN
110619282 Dec 2019 CN
110619283 Dec 2019 CN
110619330 Dec 2019 CN
110659628 Jan 2020 CN
110688992 Jan 2020 CN
107742311 Feb 2020 CN
110751280 Feb 2020 CN
110826566 Feb 2020 CN
107451659 Apr 2020 CN
108111873 Apr 2020 CN
110956185 Apr 2020 CN
110966991 Apr 2020 CN
111027549 Apr 2020 CN
111027575 Apr 2020 CN
111047225 Apr 2020 CN
111126453 May 2020 CN
111158355 May 2020 CN
107729998 Jun 2020 CN
108549934 Jun 2020 CN
111275129 Jun 2020 CN
111275618 Jun 2020 CN
111326023 Jun 2020 CN
111428943 Jul 2020 CN
111444821 Jul 2020 CN
111445420 Jul 2020 CN
111461052 Jul 2020 CN
111461053 Jul 2020 CN
111461110 Jul 2020 CN
110225341 Aug 2020 CN
111307162 Aug 2020 CN
111488770 Aug 2020 CN
111539514 Aug 2020 CN
111565318 Aug 2020 CN
111582216 Aug 2020 CN
111598095 Aug 2020 CN
108229526 Sep 2020 CN
111693972 Sep 2020 CN
106558058 Oct 2020 CN
107169560 Oct 2020 CN
107622258 Oct 2020 CN
111767801 Oct 2020 CN
111768002 Oct 2020 CN
111783545 Oct 2020 CN
111783971 Oct 2020 CN
111797657 Oct 2020 CN
111814623 Oct 2020 CN
111814902 Oct 2020 CN
111860499 Oct 2020 CN
111881856 Nov 2020 CN
111882579 Nov 2020 CN
111897639 Nov 2020 CN
111898507 Nov 2020 CN
111898523 Nov 2020 CN
111899227 Nov 2020 CN
112101175 Dec 2020 CN
112101562 Dec 2020 CN
112115953 Dec 2020 CN
111062973 Jan 2021 CN
111275080 Jan 2021 CN
112183739 Jan 2021 CN
112232497 Jan 2021 CN
112288658 Jan 2021 CN
112308095 Feb 2021 CN
112308799 Feb 2021 CN
112313663 Feb 2021 CN
112329552 Feb 2021 CN
112348783 Feb 2021 CN
111899245 Mar 2021 CN
202017102235 May 2017 DE
202017102238 May 2017 DE
102017116017 Jan 2019 DE
102018130621 Jun 2020 DE
102019008316 Aug 2020 DE
0 422 348 Apr 1991 EP
1215626 Sep 2008 EP
2228666 Sep 2012 EP
2420408 May 2013 EP
2723069 Apr 2014 EP
2741253 Jun 2014 EP
3115772 Jan 2017 EP
2618559 Aug 2017 EP
3285485 Feb 2018 EP
2863633 Feb 2019 EP
3113080 May 2019 EP
3525132 Aug 2019 EP
3531689 Aug 2019 EP
3537340 Sep 2019 EP
3543917 Sep 2019 EP
3608840 Feb 2020 EP
3657387 May 2020 EP
2396750 Jun 2020 EP
3664020 Jun 2020 EP
3690712 Aug 2020 EP
3690742 Aug 2020 EP
3722992 Oct 2020 EP
3690730 Nov 2020 EP
3739486 Nov 2020 EP
3501897 Dec 2020 EP
3751455 Dec 2020 EP
3783527 Feb 2021 EP
2402572 Aug 2005 GB
2548087 Sep 2017 GB
2577485 Apr 2020 GB
2517270 Jun 2020 GB
04-295953 Oct 1992 JP
2578262 Aug 1998 JP
3941252 Jul 2007 JP
4282583 Jun 2009 JP
4300098 Jul 2009 JP
2010-079840 Apr 2010 JP
2015004922 Jan 2015 JP
2015-056124 Mar 2015 JP
5863536 Feb 2016 JP
6044134 Dec 2016 JP
2017-027149 Feb 2017 JP
6525707 Jun 2019 JP
2019101535 Jun 2019 JP
2020101927 Jul 2020 JP
2020173744 Oct 2020 JP
100326702 Feb 2002 KR
101082878 Nov 2011 KR
101738422 May 2017 KR
101969864 Apr 2019 KR
101996167 Jul 2019 KR
102022388 Aug 2019 KR
102043143 Nov 2019 KR
102095335 Mar 2020 KR
102097120 Apr 2020 KR
1020200085490 Jul 2020 KR
102189262 Dec 2020 KR
1020200142266 Dec 2020 KR
200630819 Sep 2006 TW
I294089 Mar 2008 TW
I306207 Feb 2009 TW
WO 9410638 May 1994 WO
WO 02052835 Jul 2002 WO
WO 14025765 Feb 2014 WO
WO 16032398 Mar 2016 WO
WO 16048108 Mar 2016 WO
WO 16099779 Jun 2016 WO
WO 16186811 Nov 2016 WO
WO 16186823 Nov 2016 WO
WO 16207875 Dec 2016 WO
WO 17117186 Jul 2017 WO
WO 17158622 Sep 2017 WO
WO 19005547 Jan 2019 WO
WO 19067695 Apr 2019 WO
WO 19089339 May 2019 WO
WO 19092456 May 2019 WO
WO 19099622 May 2019 WO
WO 19122952 Jun 2019 WO
WO 19125191 Jun 2019 WO
WO 19126755 Jun 2019 WO
WO 19144575 Aug 2019 WO
WO 19182782 Sep 2019 WO
WO 19191578 Oct 2019 WO
WO 19216938 Nov 2019 WO
WO 19220436 Nov 2019 WO
WO 20006154 Jan 2020 WO
WO 20012756 Jan 2020 WO
WO 20025696 Feb 2020 WO
WO 20034663 Feb 2020 WO
WO 20056157 Mar 2020 WO
WO 20076356 Apr 2020 WO
WO 20097221 May 2020 WO
WO 20101246 May 2020 WO
WO 20120050 Jun 2020 WO
WO 20121973 Jun 2020 WO
WO 20131140 Jun 2020 WO
WO 20139181 Jul 2020 WO
WO 20139355 Jul 2020 WO
WO 20139357 Jul 2020 WO
WO 20142193 Jul 2020 WO
WO 20146445 Jul 2020 WO
WO 20151329 Jul 2020 WO
WO 20157761 Aug 2020 WO
WO 20163455 Aug 2020 WO
WO 20167667 Aug 2020 WO
WO 20174262 Sep 2020 WO
WO 20177583 Sep 2020 WO
WO 20185233 Sep 2020 WO
WO 20185234 Sep 2020 WO
WO 20195658 Oct 2020 WO
WO 20198189 Oct 2020 WO
WO 20198779 Oct 2020 WO
WO 20205597 Oct 2020 WO
WO 20221200 Nov 2020 WO
WO 20240284 Dec 2020 WO
WO 20260020 Dec 2020 WO
WO 20264010 Dec 2020 WO
Non-Patent Literature Citations (11)
Author Unknown, "Booth's Multiplication Algorithm", Wikipedia, Version from May 30, 2017 (Year: 2017).
Author Unknown, “Accumulator (computing)”, Wikipedia, Version from Jul. 14, 2017 (Year: 2017).
Cornu, Thierry; Ienne, Paolo; Niebur, Dagmar; Thiran, Patrick; Viredaz, Marc A., "Design, Implementation, and Test of a Multi-Model Systolic Neural-Network Accelerator", Scientific Programming - Parallel Computing Projects of the Swiss Priority Programme, vol. 5, No. 1, Jan. 1, 1996, (14 pgs).
Kim, Sang Kyun; McMahon, Peter L.; Olukotun, Kunle, "A Large-scale Architecture for Restricted Boltzmann Machines", Department of Electrical Engineering, Stanford University, 2010 18th IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, IEEE, Piscataway, NJ, USA, May 2, 2010, (8 pgs).
Krizhevsky, Alex; Sutskever, Ilya; Hinton, Geoffrey E., "ImageNet Classification with Deep Convolutional Neural Networks", The 26th Annual Conference on Neural Information Processing Systems: Dec. 3-8, 2012. Available from Internet, <http://papers.nips.cc/book/advances-in-neural-information-processing-systems-25-2012>, (9 pgs).
Kung, S., "VLSI Array Processors", IEEE ASSP Magazine, IEEE, US, vol. 2, No. 3, Jul. 1985 (1 pg).
Sato, Kaz; Young, Cliff; Patterson, David, "An in-depth look at Google's first Tensor Processing Unit (TPU)", posted in Google Cloud Big Data and Machine Learning Blog, posting date May 12, 2017. Available from Internet, <URL: https://cloud.google.com/blog/big-data/>, (22 pgs).
International Search Report and Written Opinion dated Oct. 1, 2018, in International Patent Application No. PCT/US18/42959.
International Search Report and Written Opinion dated Sep. 10, 2018, in International Patent Application No. PCT/US18/38618.
Jouppi et al., Jun. 26, 2017, In-datacenter performance analysis of a tensor processing unit, 44th International Symposium on Computer Architecture (ISCA), Toronto, Canada, 28 pp.
Oxford Dictionary, Definition of synchronize, retrieved Sep. 12, 2020, https://www.lexico.com/en/definition/synchronize.
Related Publications (1)
Number Date Country
20210048984 A1 Feb 2021 US
Provisional Applications (1)
Number Date Country
62536399 Jul 2017 US
Continuations (1)
Number Date Country
Parent 15710433 Sep 2017 US
Child 16887784 US