[VGG] K. Simonyan, A. Zisserman, Very Deep Convolutional Networks for Large-Scale Image Recognition, ICLR 2015.
[PathNet] Evolution Channels Gradient Descent in Super Neural Networks.
Artificial neural networks have become the backbone engine of computer vision, voice recognition and other applications of artificial intelligence and pattern recognition. The rapid increase in available computation power makes it possible to tackle problems of higher complexity, which in turn requires novel approaches in network architectures and algorithms. Emerging applications such as autonomous vehicles, drones, robots and other multi-sensor systems, smart cities, smart homes, and the Internet of Things, as well as device-to-device and device-to-cloud connectivity, enable novel computational solutions, which improve system efficacy.
In some known neural network systems, the entire neural network is trained on a specific dataset prior to deployment into a product. Then, the trained network is deployed and executed either on a device or on cloud resources. In the latter case the device usually transmits its relevant sensor data, such as a video or audio stream, to the cloud, and receives from the cloud the results of the neural network process.
This approach causes rigidity in the functionality of the trained network, which was trained on the often outdated training set collected prior to deployment. Further, this approach causes rigidity in the utilization of the hardware platform, which may become insufficient for network computations, or to the contrary, may be under-utilized by the network.
[PathNet] describes a somewhat related approach, in which a large neural network has multiple modules in each layer of the network, and an output of any module from layer N can be connected to an input of any module from layer N+1. A specific path between the modules can be selected and/or configured, thus defining a specific network. In such systems, the network can be trained by supervised training along random paths. The performance of different paths is evaluated on the validation set, followed by selection and freezing of the best-performing path, and re-initialization of the parameters of the other modules. This, according to the authors, allows more efficient application of genetic algorithms, where the best trained instances of the network are memorized.
According to an aspect of some embodiments of the present invention, there is provided a modular neural network system comprising: a plurality of neural network modules; and a controller configured to select a combination of at least one of the neural network modules to construct a neural network, dedicated for a specific task.
Optionally, some of the modules are trained with at least partially different sets of input data.
Optionally, some of the modules have different sizes or different amounts of internal parameters.
Optionally, each of the different training input data sets reflects different operation conditions.
Optionally, the controller is configured to receive parameters of the task and to select the combination based on the received parameters and according to known training conditions, wherein the parameters comprise at least one of: type of input data, type of task and available resources.
Optionally, some of the network modules depend on input data and some are independent from input data, wherein the controller is configured to: select a sensor-dependent network module according to a type of input data; and construct a dedicated neural network by using the sensor-dependent network module for sensor-dependent levels of the task, and a sensor-independent network module for sensor-independent levels of the task.
Optionally, the controller is configured to execute at least some of the selected network modules on different platforms.
Optionally, the controller is configured to dynamically change which network modules are executed on which platform according to utilization of at least one of: computation resources, energy and communication resources.
Optionally, the controller is configured to select which network modules are executed on which platform according to data privacy requirements, by executing modules that process privacy-sensitive data on a local device and executing modules that process privacy-insensitive data on a remote platform.
Optionally, the controller is configured to obtain a confidence level of a result of a network module process, and to execute a process with a low confidence level of results by modules and/or platforms that provide stronger computational power.
Optionally, at least one of the network modules is trained according to a result and/or labeled data obtained by another one of the network modules.
Optionally, the controller is configured to construct multiple different dedicated neural networks for a same task, to obtain a rank for results of each of the dedicated neural networks, and to select a dedicated neural network for the task according to the obtained rank.
Optionally, the system further comprises a processor configured to execute code instructions for: analyzing a task to be performed; deciding required properties of a dedicated neural network for performing the task; identifying suitable network modules according to the known training conditions; and linking the identified network modules to construct a dedicated network.
Optionally, the controller is configured to calculate the amount of traffic or computations within layers of the neural network, and to adjust distribution of the network modules between available hardware resources based on a calculated amount of computations within each layer of the network or an amount of data traffic between the layers of the network.
Optionally, the controller is configured to partition the neural network to a separate network module where the data is sufficiently disassociated with the original input data, to process the sufficiently disassociated data on a remote server.
Optionally, the controller is configured to partition the neural network to separate network modules where processing of data sets from different sources is united into a third network module.
Some non-limiting exemplary embodiments or features of the disclosed subject matter are illustrated in the following drawings.
In the drawings:
With specific reference now to the drawings in detail, it is stressed that the particulars shown are by way of example and for purposes of illustrative discussion of embodiments of the invention. In this regard, the description taken with the drawings makes apparent to those skilled in the art how embodiments of the invention may be practiced.
Identical or duplicate or equivalent or similar structures, elements, or parts that appear in one or more drawings are generally labeled with the same reference numeral, optionally with an additional letter or letters to distinguish between similar entities or variants of entities, and may not be repeatedly labeled and/or described. References to previously presented elements are implied without necessarily further citing the drawing or description in which they appear.
Dimensions of components and features shown in the figures are chosen for convenience or clarity of presentation and are not necessarily shown to scale or in true perspective. For convenience or clarity, some elements or structures are not shown, or are shown only partially and/or with a different perspective or from different points of view.
As mentioned above, some layered neural network systems have multiple modules in each layer of the network, where an output of any module from layer N can be connected to an input of any module from Layer N+1. A specific path between the modules can be selected and/or configured, thus defining a specific network. In such systems, the network can be trained by supervised training along random paths. The performance of different paths is evaluated by the validation set, followed by selection and freezing of the best-performing path, and re-initialization of the parameters on other modules. This results in only one network for all the cases, which configuration is frozen upon completion of the training.
In contrast, in some embodiments of the present invention, different modules are trained for different learning tasks, based on corresponding different datasets. Therefore, selection of specific combination of modules allows controlled and dynamic configuration of the resulting network for a specific task, by selecting from the combinatorial amount of possible different configurations and tasks.
Some embodiments of the present invention may include a system, a method, and/or a computer program product. The computer program product may include a tangible non-transitory computer readable storage medium (or media) having computer readable program instructions thereon for causing a processor to carry out aspects of the present invention. Computer readable program instructions for carrying out operations of the present invention may be assembler instructions, instruction-set-architecture (ISA) instructions, machine instructions, machine dependent instructions, microcode, firmware instructions, state-setting data, or either source code or object code written in any combination of one or more programming languages, including any object oriented programming language and/or conventional procedural programming languages.
Before explaining at least one embodiment of the invention in detail, it is to be understood that the invention is not necessarily limited in its application to the details of construction and the arrangement of the components and/or methods set forth in the following description and/or illustrated in the drawings and/or the Examples. The invention is capable of other embodiments or of being practiced or carried out in various ways.
Reference is now made to
ANN cluster 20 may include a modular distributed ANN, including a plurality of distributed sub-networks, for example distributed network modules 21a-21d. Although
For example, the different modules for initial layers of convolutional neural network for object recognition may correspond to different illumination or weather conditions, while deeper layers may correspond to different object classes. Modules with larger layers may correspond to more accurate networks, which demand more computations, while smaller layers may correspond to the faster but possibly less accurate networks.
In some embodiments of the present invention, memory 16 stores code instructions executable by processor 14. When executed, the code instructions cause processor 14 to carry out the methods described herein. According to some embodiments of the present invention, processor 14 executes methods for generation and management of modular distributed ANN.
Processor 14 may send instructions to controller 10, which may select, based on the received instructions, at least one of network modules 21a-21d, and/or operate a combination of the selected network modules as a dedicated neural network, for example for executing a neural network process such as classification of data by the resulting dedicated neural network.
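The selection step can be illustrated with a minimal Python sketch. The `ModuleInfo` record, the `select_modules` function, the assignment of modules 21a-21d to layers and training conditions, and the cost figures are all hypothetical and are not taken from the described embodiments. The sketch only assumes that each module is stored together with its known training condition and a relative computational cost.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModuleInfo:
    name: str        # hypothetical identifier, e.g. "21a"
    layer: int       # position of the module in the network
    condition: str   # known training condition; "any" means condition-independent
    cost: float      # relative computational cost (assumed metric)


def select_modules(registry, condition, max_cost_per_module):
    """Pick one module per layer for the given operating condition.

    A module qualifies if it was trained for the condition (or for "any")
    and fits the cost limit. Among qualifying modules, an exact condition
    match is preferred, then the larger module.
    """
    chosen = {}
    for m in registry:
        if m.condition not in (condition, "any"):
            continue
        if m.cost > max_cost_per_module:
            continue
        best = chosen.get(m.layer)
        key = (m.condition == condition, m.cost)
        if best is None or key > (best.condition == condition, best.cost):
            chosen[m.layer] = m
    return [chosen[layer] for layer in sorted(chosen)]


# Hypothetical registry: two condition-specific first-layer modules and
# two condition-independent second-layer modules of different sizes.
registry = [
    ModuleInfo("21a", 0, "day", 1.0),
    ModuleInfo("21b", 0, "night", 1.0),
    ModuleInfo("21c", 1, "any", 4.0),
    ModuleInfo("21d", 1, "any", 2.0),
]

night_small = [m.name for m in select_modules(registry, "night", 3.0)]
night_large = [m.name for m in select_modules(registry, "night", 5.0)]
print(night_small, night_large)
```

With a tighter cost limit the smaller second-layer module is linked, and with a looser limit the larger one is linked, which mirrors the trade-off between accuracy and computation described above.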
Further reference is now made to
Sensor 110 may be, as a non-limiting example, an image sensor; layers 120-125, and 130 may be convolutional layers, while layer 160 may be a fully connected layer. Modules 125B-125Z may be trained for different illumination conditions, modules 130B-130Z for different object classes, while modules 160B. may be trained for different trade-offs between the quality of object recognition and cost of computations.
Further reference is now made to
In some embodiments of the present invention, processor 14 executes two or more different modular networks, i.e. different combinations of neural network modules, to perform a neural network process with the same input data. Processor 14 may choose between the modular networks according to determined criteria, for example after receiving the results of the neural network process from each of the modular networks. For example, processor 14 may receive from each of the modular networks a result of the process, calculate a confidence level for each of the results, and rank the results accordingly and/or select the modular network that provides the better confidence level. For example, processor 14 may receive votes about the results, and rank the results and/or select the modular network that received the most positive votes for its result. This way, system 100 may obtain the most efficient dedicated network of the possible modular networks. Further, system 100 may facilitate evolution of the modular networks, where only the most reliable combinations and/or modules are selected, further trained and evolved.
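The confidence-based ranking can be sketched as follows. The text does not specify how the confidence level is calculated; this sketch assumes, purely for illustration, that each modular network returns a list of class probabilities and that the confidence is the highest probability. The function name `rank_networks` and the network names are hypothetical.

```python
def rank_networks(results):
    """Rank modular networks by the confidence of their results.

    `results` maps a network name to its list of class probabilities.
    Confidence is taken as the highest class probability (an assumption).
    Returns the names ordered from most to least confident, together
    with the confidence values.
    """
    confidence = {name: max(probs) for name, probs in results.items()}
    ranking = sorted(confidence, key=confidence.get, reverse=True)
    return ranking, confidence


# Three hypothetical module combinations processing the same input.
results = {
    "net_a": [0.60, 0.30, 0.10],
    "net_b": [0.20, 0.75, 0.05],
    "net_c": [0.40, 0.40, 0.20],
}
ranking, confidence = rank_networks(results)
selected = ranking[0]
print(ranking, selected)
```

A vote-based ranking would follow the same pattern, with the number of positive votes replacing the probability-based confidence.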
In some embodiments, by executing multiple different modular network combinations for performing the same task, with the same or with different input sensor data, the different modular network combinations may be used for cross-training, in which results of one network are used for training or re-training of another network.
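A minimal sketch of such cross-training is given below. It assumes that the first network returns a label together with a confidence value, and that only sufficiently confident results are kept as training labels for the second network. The confidence threshold, the function name `collect_labels_from_network` and the toy network are assumptions made for illustration and are not part of the described embodiments.

```python
def collect_labels_from_network(samples, source_network, threshold=0.9):
    """Build a labeled set for re-training another network.

    `source_network(sample)` must return a (label, confidence) pair.
    Samples whose confidence is below `threshold` are skipped
    (the filtering rule is an assumption of this sketch).
    """
    labeled = []
    for sample in samples:
        label, confidence = source_network(sample)
        if confidence >= threshold:
            labeled.append((sample, label))
    return labeled


def toy_network(brightness):
    """Stand-in for a trained modular network (hypothetical)."""
    label = "bright" if brightness > 0.5 else "dark"
    confidence = abs(brightness - 0.5) * 2
    return label, confidence


training_set = collect_labels_from_network([0.05, 0.45, 0.98], toy_network, threshold=0.8)
print(training_set)
```

The resulting list of (sample, label) pairs can then be passed to whatever training procedure is used for the second network.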
Reference is now made to
In some embodiments of the present invention, controller 10 may adapt distributed modular neural network 300 and change dynamically which of the network modules are executed on which device or platform. For example, controller 10 may adapt distributed modular neural network 300 based on the available computational and/or communication resources on each device and/or platform, and/or based on the nature of the task and required solution.
In some embodiments of the present invention, the amount of traffic or computations within layers of the neural network is calculated and/or estimated. The distribution of the network modules between available hardware resources may be optimized and/or adjusted, for example, based on a calculated and/or estimated amount of computations within each layer of the network and/or amount of data traffic between the layers of the network.
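One simple way to use these estimates is sketched below. The sketch assumes a strictly sequential network that is cut at a single point, with the first layers executed on a device and the remaining layers on a server, and it assumes that compute and transfer times simply add up. The function `best_split`, its parameters and the numbers in the example are hypothetical.

```python
def best_split(layer_ops, boundary_bytes, device_ops_per_s,
               server_ops_per_s, link_bytes_per_s):
    """Find the split point that minimizes the estimated total time.

    layer_ops[i]      -- estimated operations in layer i
    boundary_bytes[k] -- bytes sent over the link when the first k layers
                         run on the device (k = 0 .. number of layers);
                         index 0 is the raw input, the last index is the
                         final result
    Returns (k, estimated_time_in_seconds).
    """
    best_k, best_time = None, float("inf")
    for k in range(len(layer_ops) + 1):
        device_time = sum(layer_ops[:k]) / device_ops_per_s
        transfer_time = boundary_bytes[k] / link_bytes_per_s
        server_time = sum(layer_ops[k:]) / server_ops_per_s
        total = device_time + transfer_time + server_time
        if total < best_time:
            best_k, best_time = k, total
    return best_k, best_time


# Hypothetical three-layer network on a slow device with a slow link.
split, seconds = best_split(
    layer_ops=[100, 200, 400],
    boundary_bytes=[1000, 100, 50, 10],
    device_ops_per_s=100,
    server_ops_per_s=1000,
    link_bytes_per_s=100,
)
print(split, seconds)
```

In this example, sending the raw input is expensive and the device is slow, so running only the first layer locally gives the lowest estimate. Re-running the search when the resource figures change corresponds to the dynamic adjustment described above.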
Reference is now made to
Network architecture 400 may include receiving at processor 14 input data array 410, for example an input image. For example, in case network architecture 400 includes a convolutional network process for detecting objects in a video stream, input data array 410 may include a single frame out of the video stream. Input data array 410 may have N rows and M columns, i.e. a frame size of M×N. For example, in case of HD (High Definition) image data, the frame size may be 1920×1080. A new frame can be obtained and/or received at a rate of 30 frames per second.
Server 60 may maintain a repository 420 of K1 first layer operator filters, for processing of input data in a first layer of the neural network, and a repository 430 of K2 second layer operator filters, for processing of input data in a second layer of the neural network, and so forth. In some embodiments of the present invention, processor 14 may apply on multiple locations on the image data array 410 one of the K1 filters, for example a 3×3 filter 415 or a filter of any other suitable size. For example, filter 415 may be applied on an upper left corner of the input image, and on two adjacent positions shifted by stride S1 from the first position, as shown in
Accordingly, in case there are K1 filters in the first layer filter repository 420, processor 14 may produce K1 corresponding arrays 425 of resulting values, each corresponding to another filter and resulting from applying the filter, and each having a size of M*N/(S1*S1). Accordingly, in case the amount of information in an uncompressed input color image of 3 bytes per pixel is M*N*3 bytes, the amount of information in the K1 arrays is K1*M*N/(S1*S1)*B bytes, where B is the number of bytes per value in the arrays. Therefore, for a filter of size D×D applied across the 3 color channels, the amount of computations performed in the first layer is D*D*3*K1*M*N/(S1*S1).
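The three expressions above can be evaluated with a short Python sketch. The frame size 1920×1080 and the 3×3 filter come from the text; the values K1 = 64, S1 = 2 and B = 1 are assumptions chosen only to produce concrete numbers. Border effects are ignored, as in the expressions themselves.

```python
def first_layer_cost(m, n, k1, s1, d, b, channels=3):
    """Evaluate the first-layer size and computation estimates.

    m, n     -- frame size M x N
    k1       -- number of first layer filters
    s1       -- stride
    d        -- filter size (d x d)
    b        -- bytes per value in the resulting arrays
    channels -- bytes per input pixel (3 for uncompressed color)
    """
    positions = (m * n) // (s1 * s1)              # M*N/(S1*S1)
    input_bytes = m * n * channels                # M*N*3
    output_bytes = k1 * positions * b             # K1*M*N/(S1*S1)*B
    operations = d * d * channels * k1 * positions  # D*D*3*K1*M*N/(S1*S1)
    return input_bytes, output_bytes, operations


input_bytes, output_bytes, operations = first_layer_cost(
    m=1920, n=1080, k1=64, s1=2, d=3, b=1
)
print(input_bytes, output_bytes, operations)
# prints: 6220800 33177600 895795200
```

Under these assumed values the first layer output is larger than the input frame, which illustrates why the amount of data between layers matters when deciding where each layer is executed.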
On a second layer of the neural network, each of the K2 second layer operator filters may be applied on each of the K1 arrays 425. Each of the K2 second layer operator filters may be of size D×D×K1, for example 3×3×K1 as shown in
Architecture 400 may continue in multiple layers and/or modules of the neural network, wherein in each layer, respective filters are applied on arrays resulting from the previous layers.
For example, for more difficult neural network computations, controller 10 may utilize a more powerful device and/or platform, particularly for execution of higher level network modules, for example while executing the lower levels on a smaller and/or weaker device.
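One way to decide when a computation is "more difficult" is the confidence-based option described earlier, in which a process with a low confidence level is executed by modules and/or platforms that provide stronger computational power. The following is a minimal sketch of that idea, assuming each model returns a (label, confidence) pair. The function `classify_with_escalation`, the threshold value and the two toy models are hypothetical.

```python
def classify_with_escalation(x, local_model, remote_model, min_confidence=0.8):
    """Run the small local model first and escalate only when needed.

    Each model takes an input and returns a (label, confidence) pair.
    Returns (label, confidence, platform_used).
    """
    label, confidence = local_model(x)
    if confidence >= min_confidence:
        return label, confidence, "local"
    # Low confidence: repeat the process on the stronger platform.
    label, confidence = remote_model(x)
    return label, confidence, "remote"


def small_model(x):
    """Stand-in for a lightweight module on the device (hypothetical)."""
    return ("cat", 0.95) if x == "easy" else ("cat", 0.55)


def large_model(x):
    """Stand-in for a larger module on a stronger platform (hypothetical)."""
    return ("dog", 0.90)


easy_result = classify_with_escalation("easy", small_model, large_model)
hard_result = classify_with_escalation("hard", small_model, large_model)
print(easy_result, hard_result)
```

Only the inputs that the small model cannot resolve confidently incur the cost of the stronger platform.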
In some embodiments of the present invention, system 100 is configured to execute neural network processes with various kinds and/or sets of sensors. For example, system 100 may interface with device A and/or with device B, each having different sensors, receive sensor data and execute the modular neural network based on the input sensor data. For example, some of network modules 21a-21d may depend on the kind of sensors from which data is received and/or the kind of sensor data received. Others of network modules 21a-21d may be sensor independent. Processor 14 may identify suitable network modules for a task involving a certain type of device and/or certain set of sensors, for example by finding the corresponding properties of the modules stored in database 15. For sensor-independent levels of the neural network process, processor 14 may use suitable sensor-independent network modules, which may be used with various kinds of devices.
Controller 10 may link the identified network modules with sensor-independent network modules to construct a resulting dedicated network. Thus, system 100 may save neural network volume by using a certain portion of the network for multiple kinds of devices, without the need to provide a whole different network for each different device. In some cases, controller 10 may include a switch to select between sensor-dependent modules of modular neural network 20, for example by linking the selected network module to a sensor-independent module of modular neural network 20. In some embodiments, cross-training may be performed, in which data collected on one device is used for training of network modules executed on another device, thus enriching the pool of labeled training data.
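The switch between sensor-dependent modules can be sketched as follows. Modules are represented here as plain Python callables, and the sensor names, the `build_dedicated_network` function and the toy modules are hypothetical; a real system would link trained network modules instead.

```python
def build_dedicated_network(sensor_type, sensor_modules, shared_module):
    """Link a sensor-dependent module to a shared sensor-independent one.

    `sensor_modules` maps a sensor type to its sensor-dependent module.
    A KeyError is raised if no module exists for the sensor type.
    """
    front = sensor_modules[sensor_type]

    def network(sensor_data):
        features = front(sensor_data)      # sensor-dependent levels
        return shared_module(features)     # sensor-independent levels

    return network


# Hypothetical sensor-dependent modules that normalize raw data.
sensor_modules = {
    "camera": lambda frame: [pixel / 255 for pixel in frame],
    "microphone": lambda samples: [abs(s) for s in samples],
}


def shared_module(features):
    """Hypothetical sensor-independent module reused by all devices."""
    return "active" if sum(features) / len(features) > 0.5 else "idle"


camera_net = build_dedicated_network("camera", sensor_modules, shared_module)
mic_net = build_dedicated_network("microphone", sensor_modules, shared_module)
print(camera_net([255, 255, 0]), mic_net([-0.1, 0.2]))
```

Both dedicated networks reuse the same `shared_module`, which is the source of the saving in network volume: only the sensor-dependent part is duplicated per device type.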
Reference is now made to
Reference is now made to
Reference is now made to
In some embodiments, at some stage in the neural network the processing of data 515 and data 517 may be united into third network module 555, for example in layer 529 as shown in
In the context of some embodiments of the present disclosure, by way of example and without limiting, terms such as ‘operating’ or ‘executing’ imply also capabilities, such as ‘operable’ or ‘executable’, respectively.
Conjugated terms such as, by way of example, ‘a thing property’ imply a property of the thing, unless otherwise clearly evident from the context thereof.
The terms ‘processor’ or ‘computer’, or system thereof, are used herein as ordinary context of the art, such as a general purpose processor, or a portable device such as a smart phone or a tablet computer, or a micro-processor, or a RISC processor, or a DSP, possibly comprising additional elements such as memory or communication ports. Optionally or additionally, the terms ‘processor’ or ‘computer’ or derivatives thereof denote an apparatus that is capable of carrying out a provided or an incorporated program and/or is capable of controlling and/or accessing data storage apparatus and/or other apparatus such as input and output ports. The terms ‘processor’ or ‘computer’ denote also a plurality of processors or computers connected, and/or linked and/or otherwise communicating, possibly sharing one or more other resources such as a memory.
The terms ‘software’, ‘program’, ‘software procedure’ or ‘procedure’ or ‘software code’ or ‘code’ or ‘application’ may be used interchangeably according to the context thereof, and denote one or more instructions or directives or electronic circuitry for performing a sequence of operations that generally represent an algorithm and/or other process or method. The program is stored in or on a medium such as RAM, ROM, or disk, or embedded in a circuitry accessible and executable by an apparatus such as a processor or other circuitry. The processor and program may constitute the same apparatus, at least partially, such as an array of electronic gates, such as FPGA or ASIC, designed to perform a programmed sequence of operations, optionally comprising or linked with a processor or other circuitry.
The term ‘configuring’ and/or ‘adapting’ for an objective, or a variation thereof, implies using at least a software and/or electronic circuit and/or auxiliary apparatus designed and/or implemented and/or operable or operative to achieve the objective.
A device storing and/or comprising a program and/or data constitutes an article of manufacture. Unless otherwise specified, the program and/or data are stored in or on a non-transitory medium.
In case electrical or electronic equipment is disclosed it is assumed that an appropriate power supply is used for the operation thereof.
The flowchart and block diagrams illustrate architecture, functionality or an operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosed subject matter. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of program code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, illustrated or described operations may occur in a different order or in combination or as concurrent operations instead of sequential operations to achieve the same or equivalent effect.
The corresponding structures, materials, acts, and equivalents of all means or step plus function elements in the claims below are intended to include any structure, material, or act for performing the function in combination with other claimed elements as specifically claimed. As used herein, the singular forms “a”, “an” and “the” are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms “comprising”, “including” and/or “having” and other conjugations of these terms, when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
The terminology used herein should not be understood as limiting, unless otherwise specified, and is for the purpose of describing particular embodiments only and is not intended to be limiting of the disclosed subject matter. While certain embodiments of the disclosed subject matter have been illustrated and described, it will be clear that the disclosure is not limited to the embodiments described herein. Numerous modifications, changes, variations, substitutions and equivalents are not precluded.
Relation | Number | Date | Country
---|---|---|---
Parent | 15672328 | Aug 2017 | US
Child | 17074667 | | US