With respect to machine learning and cognitive science, a neural network is a statistical learning model that is used to estimate or approximate functions that may depend on a large number of inputs. In this regard, artificial neural networks may include systems of interconnected neurons which exchange messages between each other. The interconnections may include numeric weights that may be tuned based on experience, which makes neural networks adaptive to inputs and capable of learning. For example, a neural network for character recognition may be defined by a set of input neurons which may be activated by pixels of an input image. The activations of the input neurons are then passed on to other neurons after the input neurons are weighted and transformed by a function. This process may be repeated until an output neuron is activated, whereby the character that is read may be determined.
Features of the present disclosure are illustrated by way of example and not limited in the following figure(s), in which like numerals indicate like elements, in which:
For simplicity and illustrative purposes, the present disclosure is described by referring mainly to examples. In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present disclosure. It will be readily apparent, however, that the present disclosure may be practiced without limitation to these specific details. In other instances, some methods and structures have not been described in detail so as not to unnecessarily obscure the present disclosure.
Throughout the present disclosure, the terms “a” and “an” are intended to denote at least one of a particular element. As used herein, the term “includes” means includes but not limited to, and the term “including” means including but not limited to. The term “based on” means based at least in part on.
With respect to neural networks, neuromorphic computing is described as the use of very-large-scale integration (VLSI) systems including electronic analog circuits to mimic neuro-biological architectures present in the nervous system. Neuromorphic computing may be used with recognition, mining, and synthesis (RMS) applications. Recognition may be described as the examination of data to determine what the data represents. Mining may be described as the search for particular types of models determined from the recognized data. Further, synthesis may be described as the generation of a potential model where a model does not previously exist. With respect to RMS applications and other types of applications, specialized neural chips, which may be several orders of magnitude more efficient than central processing unit (CPU) or graphics processing unit (GPU) computations, may provide for the scaling of neural networks to simulate billions of neurons and mine vast amounts of data.
With respect to machine readable instructions to control neural networks, neuromorphic memory arrays may be used for RMS applications and other types of applications by performing computations directly in such memory arrays. The type of memory employed in neuromorphic memory arrays may either be analog or digital. In this regard, the choice of the type of memory may impact characteristics such as accuracy, energy, performance, etc., of the associated neuromorphic system.
In this regard, a hybrid synaptic architecture based neural network apparatus, and a method for implementing the hybrid synaptic architecture based neural network are disclosed herein. The apparatus and method disclosed herein may use a combination of analog and digital memory arrays to reduce energy consumption compared, for example, to state-of-the-art neuromorphic systems. According to examples, the apparatus and method disclosed herein may be used with memristor based neural systems, and/or use a memristor's high on/off ratio and tradeoffs between write latency and accuracy to implement neural cores with varying levels of accuracy and energy consumption. The apparatus and method disclosed herein may achieve a high degree of power efficiency, and may simulate an order of magnitude more neurons per chip compared to a fully digital design. For example, since more neurons per unit area may be simulated for an analog implementation, the apparatus and method disclosed herein may simulate a higher number of neurons per chip (e.g., a higher number of overall neural cores including analog neural cores and digital neural cores) compared to a fully digital design.
Referring to
An information recognition, mining, and synthesis module 108 may determine information that is to be recognized, mined, and/or synthesized from input data 110 (e.g., see
A results generation module 114 may generate, based on the analysis of the data subset 112, results 116 (e.g., see
An interconnect 118 between the analog neural cores 104 and the digital neural cores 106 may be implemented by a CPU, a GPU, a state machine, or other such techniques. For example, the state machine may detect an output of the analog neural cores 104 and direct the output to the digital neural cores 106. In this regard, the CPU, the GPU, the state machine, or other such techniques may be controlled and/or implemented as a part of the information recognition, mining, and synthesis module 108.
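The state machine variant of the interconnect 118 may be pictured with a minimal sketch such as the following. The class name `Interconnect`, the two states, and the use of a Python list as the digital core input queue are all hypothetical and chosen only for illustration; the disclosure does not specify the states or the signaling.

```python
from enum import Enum, auto


class State(Enum):
    IDLE = auto()      # waiting for an analog neural core output
    FORWARD = auto()   # an output was detected and is being routed


class Interconnect:
    """Hypothetical two-state machine routing analog outputs to digital cores."""

    def __init__(self, digital_queue):
        self.state = State.IDLE
        self.digital_queue = digital_queue  # assumed stand-in for digital core inputs

    def step(self, analog_output):
        # Detect an output of the analog neural cores (None means no output yet).
        if self.state is State.IDLE and analog_output is not None:
            self.state = State.FORWARD
        # Direct the detected output to the digital neural cores, then go idle again.
        if self.state is State.FORWARD:
            self.digital_queue.append(analog_output)
            self.state = State.IDLE


queue = []
interconnect = Interconnect(queue)
interconnect.step(None)          # nothing detected, nothing forwarded
interconnect.step([0.2, 0.9])    # detected and forwarded
```

In this sketch a CPU or GPU based interconnect would replace `step` with ordinary control code; the state machine form is shown only because it is the example given in the text.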
The modules and other elements of the apparatus 100 may be machine readable instructions stored on a non-transitory computer readable medium. In this regard, the apparatus 100 may include or be a non-transitory computer readable medium. In addition, or alternatively, the modules and other elements of the apparatus 100 may be hardware or a combination of machine readable instructions and hardware.
Referring to
For example, as shown in
With respect to extraction of features from the data 110, the output values yj may be compared to known values from a database to determine a feature that is represented by the output values yj. For example, the information recognition, mining, and synthesis module 108 may compare the output values yj to known values from a database to determine information (e.g., a feature) that is represented by the output values yj. In this regard, the information recognition, mining, and synthesis module 108 may perform recognition, for example, by examining the data 110 to determine what the data represents, mining to search for particular types of models determined from the recognized data, and synthesis to generate a potential model where a model does not previously exist.
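As a minimal sketch of this comparison, the output values yj may be matched against stored reference vectors and the closest entry reported as the feature. The Euclidean distance criterion, the function name `closest_feature`, and the in-memory dictionary standing in for the database are assumptions made for illustration; the disclosure does not state how the comparison is performed.

```python
import math


def closest_feature(outputs, known_features):
    """Return (label, distance) of the known entry nearest to the outputs.

    outputs        -- list of output values y_j from a neural core
    known_features -- dict mapping a feature label to its reference values
                      (hypothetical stand-in for the database of known values)
    """
    best_label, best_distance = None, math.inf
    for label, reference in known_features.items():
        distance = math.dist(outputs, reference)  # Euclidean distance (assumed metric)
        if distance < best_distance:
            best_label, best_distance = label, distance
    return best_label, best_distance


known = {"edge": [1.0, 0.0], "corner": [0.0, 1.0]}
label, distance = closest_feature([0.9, 0.1], known)
```

Here `label` is the feature whose reference values lie nearest to the observed outputs; a practical system would likely also apply a distance threshold so that outputs resembling no known feature are rejected.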
For the analog neural core 104, instead of the use of the memristor array based analog memory array 300, the analog memory array 300 may be implemented by flash memory (used in an analog mode), or other types of memory.
Referring to
For example, as shown in
The output yj (e.g., y1, y2, etc.) of the multiply-add-accumulate units 402 may be routed to other neural cores (e.g., other analog and/or digital neural cores), where, for a digital neural core, the output is fed as input to the row decoder 404 and the multiply-add-accumulate units 402 of the other neural cores.
For the digital neural core 106, the digital memory array 400 may be implemented by use of a variety of technologies. For example, the digital memory array 400 may be implemented by using memristor based memory, CPU based memory, GPU based memory, a process in memory based solution, etc. For example, with respect to the digital memory array 400, at first w1,1 and a corresponding value for x1 may be read, these values may be multiplied at the multiply-add-accumulate units 402, and so forth for further values of wi,j and xi. In this regard, these operations may be performed by the digital memory array 400 implemented by using memristor based memory, CPU based memory, GPU based memory, a process in memory based solution, etc.
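The read, multiply, and accumulate sequence described above may be sketched as follows. The sketch assumes that each output yj is the weighted sum of the inputs xi with the weights wi,j, which is inferred from the symbol names rather than stated in this passage; the function name and the list-of-rows layout of the weights are hypothetical.

```python
def multiply_accumulate(weights, inputs):
    """Compute y_j = sum over i of w_{i,j} * x_i, one product at a time.

    weights -- weights[i][j] holds w_{i,j} (row i is read for input x_i)
    inputs  -- inputs[i] holds x_i
    """
    n_outputs = len(weights[0])
    outputs = [0.0] * n_outputs
    for i, x_i in enumerate(inputs):          # read x_i and row i of the array
        for j in range(n_outputs):
            outputs[j] += weights[i][j] * x_i  # multiply, then accumulate into y_j
    return outputs


w = [[1, 2],
     [3, 4]]
y = multiply_accumulate(w, [1, 1])
```

The explicit loop mirrors the sequential reads of a digital memory array; an analog array would instead produce all of the sums at once, which is where the energy difference discussed elsewhere in this disclosure arises.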
As disclosed herein, since the apparatus 100 may use a combination of analog neural cores 104 that include analog memory arrays and digital neural cores 106 that include digital memory arrays, the corresponding peripheral circuits may also use analog or digital functional units, respectively.
With respect to the use of the analog neural cores 104 and the digital neural cores 106 as disclosed herein, the choice of the neural core may impact the operating power and accuracy of the neural network. For example, a neural core using an analog memory array may consume an order of magnitude less energy compared to a neural core using a digital memory array. However, in certain instances, the use of the analog memory array 300 may degrade the accuracy of the analog neural core 104. For example, if the values of the weights wi,j are inaccurate, these inaccuracies may further degrade the accuracy of the analog neural core 104.
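The effect of inaccurate weights on the outputs may be illustrated with a small simulation. The Gaussian noise model, the standard deviation `sigma`, and all function names below are assumptions introduced for illustration only; they do not describe the actual error behavior of any particular analog memory array.

```python
import random


def weighted_sums(weights, inputs):
    """y_j = sum over i of w_{i,j} * x_i (assumed form of the core output)."""
    return [sum(weights[i][j] * inputs[i] for i in range(len(inputs)))
            for j in range(len(weights[0]))]


def perturb(weights, sigma, rng):
    """Add zero-mean Gaussian noise to every weight (hypothetical error model)."""
    return [[w + rng.gauss(0.0, sigma) for w in row] for row in weights]


def max_output_error(weights, inputs, sigma, seed=0):
    """Largest absolute difference between exact and noisy outputs."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    exact = weighted_sums(weights, inputs)
    noisy = weighted_sums(perturb(weights, sigma, rng), inputs)
    return max(abs(a - b) for a, b in zip(exact, noisy))


w = [[0.5, -0.25], [0.75, 1.0]]
x = [1.0, 0.5]
error_without_noise = max_output_error(w, x, sigma=0.0)
error_with_noise = max_output_error(w, x, sigma=0.1)
```

With `sigma` set to zero the outputs match the exact values, while any nonzero `sigma` produces an output error, which is the accuracy cost weighed against the energy savings of the analog neural cores.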
The apparatus 100 may therefore selectively actuate a plurality of analog neural cores 104 to increase energy efficiency of the apparatus 100 or a component that utilizes the apparatus 100 and/or the plurality of analog neural cores 104, and selectively actuate a plurality of digital neural cores 106 to increase accuracy of the apparatus 100 or a component that utilizes the apparatus 100 and/or the plurality of digital neural cores 106. In this regard, according to examples, the apparatus 100 may include or be implemented in a component that includes a hybrid analog-digital neural chip. The hybrid analog-digital neural chip may be used to perform coarse level analysis on the data 110 (e.g., all or a relatively high amount of the data 110) using the analog neural cores 104. Based on the results of the coarse level analysis, the data subset 112 (i.e., a subset of the data 110) may be identified for fine grained analysis. For example, the digital neural cores 106 may be used to perform fine grained analysis on the data subset 112. In this regard, the digital neural cores 106 may be used to perform fine grained mining of the data subset 112. The data subset 112 may represent a region of interest related to an object of interest in the data 110.
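The coarse level analysis followed by fine grained analysis may be sketched as a two-stage filter. In the sketch, `coarse_score` and `fine_score` are hypothetical stand-ins for the analog neural cores 104 and the digital neural cores 106, respectively, and the reduced precision of the coarse stage is modeled simply by rounding; the thresholds and the data layout are likewise assumptions.

```python
def coarse_score(item):
    """Hypothetical analog stage: inexpensive, low-precision estimate."""
    return round(item["signal"], 1)


def fine_score(item):
    """Hypothetical digital stage: full-precision value."""
    return item["signal"]


def hybrid_analyze(data, coarse_threshold, fine_threshold):
    # Coarse level analysis over all of the data identifies the data subset.
    subset = [item for item in data if coarse_score(item) >= coarse_threshold]
    # Fine grained analysis runs only on the subset; the rest is never examined.
    results = [item["id"] for item in subset if fine_score(item) >= fine_threshold]
    return subset, results


data = [
    {"id": "a", "signal": 0.12},
    {"id": "b", "signal": 0.58},
    {"id": "c", "signal": 0.91},
]
subset, results = hybrid_analyze(data, coarse_threshold=0.5, fine_threshold=0.8)
```

In this example the fine grained stage examines two of the three items instead of all three, which illustrates how restricting the digital analysis to the data subset may reduce the work performed by the more energy intensive cores.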
According to examples, with respect to determining, based on the information, selected ones of the plurality of analog neural cores 104 that are to be actuated to identify the data subset 112 of the input data 110, the information recognition, mining, and synthesis module 108 may determine, based on the information, selected ones of the plurality of analog neural cores 104 that are to be actuated to identify the data subset 112 of the input data 110 to reduce an energy consumption of the apparatus 100.
According to examples, with respect to determining, based on the information, selected ones of the plurality of analog neural cores 104 that are to be actuated to identify the data subset 112 of the input data 110, the information recognition, mining, and synthesis module 108 may determine, based on the information, selected ones of the plurality of analog neural cores 104 that are to be actuated to identify the data subset 112 of the input data 110 to meet an accuracy specification of the apparatus 100.
According to examples, with respect to accuracy of the apparatus 100, the information recognition, mining, and synthesis module 108 may increase a number of the selected ones of the plurality of digital neural cores 106 that are to be actuated to analyze the data subset 112 to increase an accuracy of the recognition, mining, and/or synthesizing of the information.
According to examples, with respect to energy consumption of the apparatus 100, the information recognition, mining, and synthesis module 108 may reduce an energy consumption of the apparatus 100 by decreasing a number of the selected ones of the plurality of digital neural cores 106 that are to be actuated to analyze the data subset 112.
The apparatus 100 may also selectively actuate a plurality of analog neural cores 104 to reduce the amount of data that is to be buffered for the digital neural cores 106. For example, instead of buffering all of the data for analysis by digital neural cores 106, the buffered data may be limited to the data subset 112 to thus increase energy efficiency of the apparatus 100 or a component that utilizes the apparatus 100. For example, with respect to reducing an amount of data received by the digital neural core input buffers, for an analog neural core input buffer associated with each of the analog neural cores 104 to receive the input data 110 for forwarding to the plurality of memristors, and a digital neural core input buffer associated with each of the digital neural cores 106 to receive the output data from the analog neural cores 104, the information recognition, mining, and synthesis module 108 may reduce an amount of data received by the digital neural core input buffers based on elimination of all but the data subset 112 that is to be analyzed by the selected ones of the plurality of digital neural cores 106.
The apparatus 100 may also selectively actuate the plurality of analog neural cores 104 to increase performance aspects such as an amount of time needed to generate results. For example, based on the faster performance of the analog neural cores 104, the amount of time needed to generate results may be reduced compared to analysis of all of the data 110 by the digital neural cores 106.
According to examples, for the data 110 that includes a streaming video, for the apparatus 100 that operates as or in conjunction with an image recognition system, in order to identify certain aspects of the streaming video (e.g., a moving car, a number plate, or static objects such as buildings, building numbers, etc.), a hybrid analog-digital neural chip (that includes the analog neural cores 104 and the digital neural cores 106) may be used to perform coarse level analysis on the data 110 using the analog neural cores 104 to identify moving features that likely resemble a car. Based on the results of the coarse level analysis, the data subset 112 (i.e., a subset of the data 110 of moving features that likely resemble a car) may be identified for fine grained analysis. For example, the digital neural cores 106 may be used to perform fine grained analysis on the data subset 112 of moving features that likely resemble a car (e.g., a segment of a frame including the moving features that likely resemble a car). In this regard, the digital neural cores 106 may be used to perform fine grained mining of the data subset 112 of moving features that likely resemble a car. The fine grained analysis performed by the digital neural cores 106 may be used to identify components such as number plates, face recognition of a person inside the car, etc. In this regard, as the input set to the digital neural cores 106 is smaller than the original streaming video, a number of the digital neural cores 106 that are utilized may be reduced, compared to use of the digital neural cores 106 for the entire analysis of the original streaming video.
The apparatus 100 may also include the selective feeding of results from the analog neural cores 104 to the digital neural cores 106 for processing. For example, if the output y1 for the example of
Referring to
At block 504, the method may include determining, based on the information, selected ones of the plurality of analog neural cores 104 that are to be actuated to identify a data subset 112 of the input data 110.
At block 506, the method may include discarding, based on the identification of the data subset 112, remaining data, other than the data subset 112, from further analysis.
At block 508, the method may include using, by a processor (e.g., the processor 902), the CPU and/or the GPU to analyze the data subset 112 (i.e., to perform the digital neural processing) to generate, based on the analysis of the data subset 112, results 116 of the recognition, mining, and/or synthesizing of the information.
Referring to
At block 604, the method may include determining, based on the information, selected ones of the plurality of analog neural cores 104 that are to be actuated to identify a data subset 112 of the input data 110.
At block 606, the method may include determining, based on the data subset 112, selected ones of the plurality of digital neural cores 106 that are to be actuated to analyze the data subset 112.
At block 608, the method may include generating, based on the analysis of the data subset 112, results 116 of the recognition, mining, and/or synthesizing of the information.
Referring to
At block 704, the method may include determining an energy efficiency parameter and/or an accuracy parameter related to the plurality of analog neural cores 104 and the plurality of digital neural cores 106. The energy efficiency parameter may represent, for example, an amount (or percentage) of energy efficiency that is to be implemented for the apparatus 100. For example, a higher energy efficiency parameter may be determined to utilize a higher number of analog neural cores 104 compared to a lower energy efficiency parameter. The accuracy parameter may represent, for example, an amount (or percentage) of accuracy that is to be implemented for the apparatus 100. For example, a higher accuracy parameter may be selected to utilize a higher number of digital neural cores 106 compared to a lower accuracy parameter.
At block 706, the method may include determining, based on the information and the energy efficiency parameter and/or the accuracy parameter, selected ones of the plurality of analog neural cores 104 that are to be actuated to identify a data subset 112 of the input data 110.
At block 708, the method may include determining, based on the data subset 112, selected ones of the plurality of digital neural cores 106 that are to be actuated to analyze the data subset 112 to generate, based on the analysis of the data subset 112, results 116 of the recognition, mining, and/or synthesizing of the information.
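The relationship between the parameters of block 704 and the number of cores actuated at blocks 706 and 708 may be sketched as follows. The linear mapping, the expression of each parameter as a fraction between 0 and 1, and the function name `select_core_counts` are assumptions made for illustration; the disclosure states only that a higher parameter corresponds to a higher number of the respective cores.

```python
def select_core_counts(total_analog, total_digital, energy_efficiency, accuracy):
    """Map the two parameters to numbers of cores to actuate (assumed linear rule).

    energy_efficiency -- fraction in [0, 1]; higher selects more analog cores
    accuracy          -- fraction in [0, 1]; higher selects more digital cores
    """
    if not (0.0 <= energy_efficiency <= 1.0 and 0.0 <= accuracy <= 1.0):
        raise ValueError("parameters are expected as fractions between 0 and 1")
    # Actuate at least one core of each kind, and never more than are available.
    n_analog = min(total_analog, max(1, round(total_analog * energy_efficiency)))
    n_digital = min(total_digital, max(1, round(total_digital * accuracy)))
    return n_analog, n_digital


counts = select_core_counts(16, 8, energy_efficiency=0.75, accuracy=0.5)
```

Raising `accuracy` while holding the other arguments fixed increases only the digital count, and raising `energy_efficiency` increases only the analog count, which matches the direction of the relationship described at block 704.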
The computer system 800 may include a processor 802 that may implement or execute machine readable instructions performing some or all of the methods, functions and other processes described herein. Commands and data from the processor 802 may be communicated over a communication bus 804. The computer system may also include a main memory 806, such as a random access memory (RAM), where the machine readable instructions and data for the processor 802 may reside during runtime, and a secondary data storage 808, which may be non-volatile and stores machine readable instructions and data. The memory and data storage are examples of computer readable mediums. The memory 806 may include a hybrid synaptic architecture based neural network implementation module 820 including machine readable instructions residing in the memory 806 during runtime and executed by the processor 802. The hybrid synaptic architecture based neural network implementation module 820 may include the modules of the apparatus 100 shown in
The computer system 800 may include an I/O device 810, such as a keyboard, a mouse, a display, etc. The computer system may include a network interface 812 for connecting to a network which may be further connected to analog neural cores and digital neural cores as disclosed herein with reference to
The computer system 900 may include a processor 902 that may implement or execute machine readable instructions performing some or all of the methods, functions and other processes described herein. Commands and data from the processor 902 may be communicated over a communication bus 904. The computer system may also include a main memory 906, such as a RAM, where the machine readable instructions and data for the processor 902 may reside during runtime, and a secondary data storage 908, which may be non-volatile and stores machine readable instructions and data. The memory and data storage are examples of computer readable mediums. The memory 906 may include a hybrid synaptic architecture based neural network implementation module 920 including machine readable instructions residing in the memory 906 during runtime and executed by the processor 902. The hybrid synaptic architecture based neural network implementation module 920 may include the modules of the apparatus 100 shown in
The computer system 900 may include an I/O device 910, such as a keyboard, a mouse, a display, etc. The computer system may include a network interface 912 for connecting to a network. Other known electronic components may be added or substituted in the computer system.
What has been described and illustrated herein is an example along with some of its variations. The terms, descriptions and figures used herein are set forth by way of illustration only and are not meant as limitations. Many variations are possible within the spirit and scope of the subject matter, which is intended to be defined by the following claims—and their equivalents—in which all terms are meant in their broadest reasonable sense unless otherwise indicated.
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/US2015/058397 | 10/30/2015 | WO | 00 |