The present invention relates to a recognition apparatus and a learning system.
In Japanese Patent Laid-open Publication No. H05-290013, there is disclosed “a neural network arithmetic apparatus capable of adapting to an actual environment while maintaining an initial capability”. In Japanese Patent Laid-open Publication No. H07-84984, there is disclosed “a neural network circuit configured to perform image recognition processing, for example”. In Japanese Patent Laid-open Publication No. H09-91263, there is described “a hierarchical neural network configured to combine neurons in a layer shape”.
The environment in which an automobile or other such vehicle travels changes day by day. Therefore, it is necessary for artificial intelligence implemented in the vehicle to collect new learning data that matches a usage environment that changes day by day.
It is also necessary for the artificial intelligence implemented in the vehicle to instantaneously recognize and judge danger during travel, for example. Therefore, it is important for the artificial intelligence implemented in the vehicle to efficiently collect the new learning data of the usage environment without placing a burden on danger recognition processing, for example.
In Japanese Patent Laid-open Publication Nos. H05-290013, H07-84984, and H09-91263, there is no description or disclosure regarding efficient collection of learning data of the usage environment.
In view of the above, it is an object of the present invention to provide a technology for efficiently collecting learning data of a usage environment.
The present invention includes a plurality of solving means for solving at least a part of the problem described above. Examples of those means include the following. In order to solve the above-mentioned problem, according to one embodiment of the present invention, there is provided a recognition apparatus, including: a first neural network configured to receive input of data; a second neural network configured to receive input of the data, the second neural network having a different structure from a structure of the first neural network; a comparison unit configured to compare a first output result of the first neural network and a second output result of the second neural network; and a communication unit configured to wirelessly transmit the data to a host system configured to learn the data when a comparison result between the first output result and the second output result is different by a predetermined standard or more.
According to an embodiment of the present invention, learning data of a usage environment can be efficiently collected. The problem to be solved by the present invention, the configuration, and the advantageous effect other than those described above according to the present invention are made clear based on the following description of the embodiments.
Embodiments of the present invention are now described with reference to the drawings. Herein, there is described an example in which a recognition apparatus according to the present invention is implemented by a field-programmable gate array (FPGA) or other such programmable logic device (PLD).
In
In the DB 4, learning data to be learned by the learning apparatus 3 is stored. The learning data is, for example, image data of a road, image data of a vehicle, for example, an automobile, a motor bike, or a bicycle, and image data of a road sign, for example. The DB 4 is managed by, for example, a data center or a cloud system.
The learning apparatus 3, which includes artificial intelligence, is configured to learn based on the learning data (i.e., image data) stored in the DB 4 (arrow A11 of
The learning apparatus 3 is configured to periodically learn image data of the DB 4, for example, once every few days to a few weeks to generate the structure of the neural network NN of the PLD 2a. The learning apparatus 3 is also configured to transmit the generated structure (i.e., information on the structure) of the neural network NN to the ECU 2 via the network 5 (arrow A12 of
The ECU 2 is configured to receive from the learning apparatus 3 the structure of the neural network NN generated by the learning apparatus 3. The neural network NN having the structure transmitted from the learning apparatus 3 is formed in the PLD 2a included in the ECU 2. More specifically, the structure of the neural network NN of the PLD 2a of the vehicle 1 is periodically updated by the learning apparatus 3.
A camera (not shown) configured to photograph the surroundings of the vehicle 1, for example, a front direction of the vehicle 1, is mounted to the vehicle 1. Image data D1 photographed by the camera mounted to the vehicle 1 is input to the PLD 2a. The PLD 2a is configured to recognize (or perceive) and judge the input image data D1 by using the neural network NN generated by the learning apparatus 3.
For example, the PLD 2a is configured to recognize a state of a traffic crossing and pedestrians, or traffic signals, in the input image data D1, and judge whether or not there is a danger. The PLD 2a outputs, when it is judged that there is a danger in the input image data D1, an instruction to perform an avoidance action in order to avoid the danger. For example, the PLD 2a outputs a braking instruction when the vehicle 1 is likely to hit a vehicle in front.
The PLD 2a is also configured to extract image data D2, which has a feature, from among the input image data D1 by using the neural network NN. The extraction of the image data D2 is described in more detail later. The image data D2 having a feature is image data that has not been learned by the learning apparatus 3 (i.e., image data that has not been stored in the DB 4). The ECU 2 is configured to transmit the image data D2 having a feature extracted by the PLD 2a to the DB 4 via the network 5 (arrow A13 of
The learning apparatus 3 is configured to learn the image data currently in the DB 4 to generate the structure of the neural network NN of the PLD 2a. However, the environment in which the vehicle 1 travels changes day by day. For example, automobile design and danger change day by day. Therefore, when a new automobile design or a new danger appears, the neural network NN of the PLD 2a may not correctly recognize the new automobile design or the new danger.
However, as described above, the PLD 2a is configured to extract the image data D2 having a feature (i.e., image data of an automobile having a new design or a new danger) from among the input image data D1 by using the neural network NN. The extracted image data D2 having a feature is transmitted to and stored in the DB 4 via the network 5.
This enables the learning apparatus 3 to learn the image data D2 having a feature, and to generate a neural network NN structure that can handle new automobile designs and new dangers. This also enables the PLD 2a to correctly recognize and judge a new automobile design or a new danger when a new automobile design or a new danger appears. In other words, the PLD 2a is capable of efficiently collecting learning data of the usage environment, and performing recognition processing, for example, in accordance with the usage environment that changes day by day.
The PLD 2a is configured to extract the image data D2 having a feature to transmit the extracted image data D2 having a feature to the network 5. Specifically, it is not necessary for the vehicle 1 to transmit all of the photographed image data to the network 5. As a result, the storage capacity of the DB 4 may be saved, and the load on the network 5 may be reduced.
In the example described above, the learning apparatus 3 periodically learns the image data in the DB 4. However, the learning apparatus 3 may learn the image data in the DB 4 during a program update of the ECU 2 performed by an automobile manufacturer, for example. The automobile manufacturer may transmit the structure of the neural network NN of the PLD 2a learned and generated by the learning apparatus 3 to the ECU 2 together with the update program of the ECU 2.
In
In
The learning apparatus 3 and the DB 4 may be hereinafter referred to as a “host system”.
A camera is mounted to the vehicle 1 illustrated in
The camera mounted to the vehicle 1 is configured to photograph, for example, the surroundings of the vehicle 1. Image data output from the camera mounted to the vehicle 1 is input to the input unit 11.
The neural networks 12a and 12b correspond to the neural network NN illustrated in
A part of the structure of the neural network 12b is different from the structure of the neural network 12a. For example, the square hatched portions of the neural network 12b illustrated in
An output result (i.e., output value) of the neural network 12a is output to the comparison unit 13 and a vehicle control unit. The vehicle control unit is configured to perform predetermined vehicle control (e.g., braking control or steering wheel control of the vehicle 1) based on the output result of the neural network 12a.
An output result of the neural network 12b is output to the comparison unit 13. The neural network 12b is a neural network for extracting image data having a feature from the image data input to the input unit 11. The image data having a feature may be hereinafter referred to as “feature image data”.
The comparison unit 13 is configured to compare the output result of the neural network 12a and the output result of the neural network 12b. The comparison unit 13 outputs, when the output result of the neural network 12a and the output result of the neural network 12b are different from each other by a predetermined standard or more (i.e., predetermined threshold or more), a feature detection signal to the communication unit 14. For example, the comparison unit 13 outputs the feature detection signal to the communication unit 14 when a degree of similarity between the output result of the neural network 12a and the output result of the neural network 12b is different by a predetermined amount or more.
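The comparison performed by the comparison unit 13 can be sketched as follows. This is a minimal illustration only: it assumes that each output result is a vector of scores and that the difference is measured as the largest element-wise absolute difference. The description does not specify the metric or the value of the predetermined standard, so `outputs_differ` and `THRESHOLD` are hypothetical.

```python
from typing import Sequence

# Hypothetical value; the description only says "a predetermined standard".
THRESHOLD = 0.5


def outputs_differ(out_a: Sequence[float], out_b: Sequence[float],
                   threshold: float = THRESHOLD) -> bool:
    """Return True (i.e., raise the feature detection signal) when the two
    output results differ by the threshold or more."""
    if len(out_a) != len(out_b):
        raise ValueError("output vectors must have the same length")
    # Assumed metric: largest element-wise absolute difference.
    max_diff = max((abs(a - b) for a, b in zip(out_a, out_b)), default=0.0)
    return max_diff >= threshold
```

With these assumptions, two networks that disagree strongly on the same image raise the signal, and two networks that roughly agree do not.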
The communication unit 14 is configured to access the network 5, which may be the Internet, for example, via wireless communication to communicate to/from the host system connected to the network 5. The communication unit 14 outputs to the reconfigurable controller 17 the structure of the neural network 12a transmitted from the host system. The communication unit 14 also transmits to the reconfiguration data memory 15 the structure of the neural network 12b transmitted from the host system. The host system is configured to transmit a plurality of structures (i.e., plurality of patterns) of the neural network 12b. This point is described later.
The communication unit 14 transmits, when the feature detection signal has been output from the comparison unit 13, the image data stored in the data memory 19 to the host system. Specifically, the communication unit 14 transmits, when it is judged by the comparison unit 13 that the image data input to the input unit 11 is feature image data, the image data input to the input unit 11 (i.e., image data stored in the data memory 19) to the host system. As a result, the host system can perform learning in accordance with the usage environment of the vehicle 1.
The structures of the neural network 12b received by the communication unit 14 from the host system are stored in the reconfiguration data memory 15. As described above, there are a plurality of structures of the neural network 12b transmitted from the host system.
The hatched portions in
It is not necessary for all of the structures of the neural network 12b to be stored in the reconfiguration data memory 15. For example, only the parts that are different from those of the structure of the neural network 12a (i.e., only the hatched portions) may be stored in the reconfiguration data memory 15.
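Storing only the differing parts can be sketched as follows. The representation is an assumption made for illustration: a structure is modeled as a mapping from a hypothetical (layer, node) position to a configuration value, and each variant of the neural network 12b is assumed to use the same positions as the neural network 12a.

```python
from typing import Dict, Tuple

# Hypothetical representation: (layer, node) -> configuration value.
Structure = Dict[Tuple[int, int], str]


def diff_structure(base: Structure, variant: Structure) -> Structure:
    """Keep only the entries of `variant` that differ from `base`."""
    return {k: v for k, v in variant.items() if base.get(k) != v}


def apply_diff(base: Structure, delta: Structure) -> Structure:
    """Rebuild the full variant by overlaying the stored delta on `base`."""
    full = dict(base)
    full.update(delta)
    return full
```

Under this model, the memory holds one small delta per variant instead of a full copy of each structure.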
Returning to the description of
Returning to the description of
The reconfigurable controller 17 refers to the reconfiguration data memory 15 to form the neural network 12b in accordance with the sequence stored in the sequence data memory 16. The reconfigurable controller 17 also refers to the reconfiguration data memory 15 to form the neural network 12b in accordance with a periodic instruction from the timer 18.
For example, in the case of the sequence example shown in
The image data input to the input unit 11 is temporarily stored in the data memory 19. The communication unit 14 transmits the image data stored in the data memory 19 to the host system in accordance with the feature detection signal from the comparison unit 13.
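The temporary storage in the data memory 19 can be sketched as a bounded buffer. This is an illustration under assumptions: the description does not state the capacity or how a frame is identified, so the `capacity` parameter and the integer `frame_id` are hypothetical.

```python
from collections import deque
from typing import Deque, Optional, Tuple


class DataMemory:
    """Toy model of the data memory 19: keeps only the most recent frames."""

    def __init__(self, capacity: int = 4):
        # deque(maxlen=...) silently discards the oldest entry when full.
        self._frames: Deque[Tuple[int, bytes]] = deque(maxlen=capacity)

    def store(self, frame_id: int, image: bytes) -> None:
        self._frames.append((frame_id, image))

    def fetch(self, frame_id: int) -> Optional[bytes]:
        """Return the buffered image for `frame_id`, or None if it was evicted."""
        for fid, image in self._frames:
            if fid == frame_id:
                return image
        return None
```

When the feature detection signal arrives, the communication unit would fetch the corresponding frame from this buffer and transmit it.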
In this way, the structures of the neural network 12b formed in the PLD 2a are changed in accordance with a period of the timer 18. Specifically, the comparison unit 13 compares the output result of the neural network 12a and the output result of the neural network 12b, in which a part of the structures is periodically changed. As a result, the PLD 2a can extract various feature image data, and transmit the extracted feature image data to the host system.
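The timer-driven change of structures can be sketched as follows. The pattern names and the wrap-around behavior at the end of the sequence are assumptions; the description only states that the neural network 12b is formed in accordance with the stored sequence at each timer period.

```python
from itertools import cycle
from typing import Dict, Iterator, List, Optional


class ReconfigController:
    """Toy model of the reconfigurable controller 17."""

    def __init__(self, reconfig_memory: Dict[str, str], sequence: List[str]):
        if not sequence:
            raise ValueError("sequence must not be empty")
        self._memory = reconfig_memory               # reconfiguration data memory 15
        self._order: Iterator[str] = cycle(sequence)  # sequence data memory 16
        self.active: Optional[str] = None             # structure currently formed as 12b

    def on_timer(self) -> str:
        """Form the next structure of 12b; assumed to wrap around at the end."""
        name = next(self._order)
        self.active = self._memory[name]
        return name
```

Each call to `on_timer` stands in for one timer signal from the timer 18.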
The host system can learn the new image data extracted by the PLD 2a to generate the neural networks 12a and 12b based on the new learning. The PLD 2a is capable of responding to various environments by receiving the neural networks 12a and 12b that are based on the new learning.
The reference symbol “Clk” in
The reference symbol “Timer” represents a timing at which the timer 18 outputs a timer signal to the reconfigurable controller 17. More specifically, the reconfigurable controller 17 refers to the reconfiguration data memory 15 at the timing indicated by the “Timer” of
The reference symbol “Reconfig” represents the neural network 12b to be formed in the PLD 2a. In the example of
The reference symbol “Input Data (Buffer)” represents the timing at which the data memory 19 stores the image data to be input to the input unit 11.
The reference symbol “Comp Enbl” represents the timing at which the comparison unit 13 compares the output result of the neural network 12a and the output result of the neural network 12b. In the example of
The reference symbol “Comp Enbl” is in an L-state at least during the period in which the neural network 12b is reconfigured. Specifically, the comparison unit 13 is configured to not compare the output result of the neural network 12a and the output result of the neural network 12b during the period in which the neural network 12b is reconfigured.
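The masking of the comparison during reconfiguration can be sketched as follows. This is a simplified, per-cycle model; the list-of-booleans representation and the function name are hypothetical and are not taken from the timing chart.

```python
from typing import List


def comparison_results(reconfiguring: List[bool], differs: List[bool]) -> List[bool]:
    """Per-cycle feature detection signal.

    The comparison is disabled (result forced to False) in every cycle in
    which the neural network 12b is being reconfigured.
    """
    if len(reconfiguring) != len(differs):
        raise ValueError("both timelines must have the same length")
    return [(not r) and d for r, d in zip(reconfiguring, differs)]
```

A difference observed while the neural network 12b is only partly rewritten is therefore never reported as a feature.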
The reference symbol “Comp Rslt” represents the feature detection signal to be output to the communication unit 14 from the comparison unit 13. The feature detection signal is, as indicated by the reference symbol “Mask” of
The reference symbol “Upload” represents the timing at which the image data stored in the data memory 19 is to be transmitted by the communication unit 14 to the host system. When the feature detection signal (Comp Rslt) has become active (i.e., has been output from the comparison unit 13), the communication unit 14 extracts from the data memory 19 the original image data that caused the feature detection signal to be output, and transmits the extracted image data to the host system.
First, the learning apparatus 3 judges whether or not it is the learning period (Step S1). When it is judged that it is not the learning period (“No” in S1), the learning apparatus 3 ends the processing of this sequence.
On the other hand, when it is judged that it is the learning period (“Yes” in S1), the learning apparatus 3 refers to the DB 4, and learns the image data stored in the DB 4 (Step S2). The image data stored in the DB 4 is, for example, as described with reference to
The learning apparatus 3 generates the structures of the neural networks 12a and 12b to be formed in the PLD 2a of the vehicle 1 based on the learning performed in Step S2 (Step S3).
For example, the learning apparatus 3 generates the structure of the neural network 12a for the PLD 2a to recognize a danger to the vehicle 1. The learning apparatus 3 also generates, for example, the structures of the neural network 12b for the PLD 2a to extract the feature image data. As described above, a plurality of structures of the neural network 12b are generated in order to allow various feature image data to be extracted (e.g., refer to
The learning apparatus 3 transmits the structures of the neural networks 12a and 12b generated in Step S3 to the vehicle 1 (Step S4).
The communication unit 14 of the PLD 2a receives the structures of the neural networks 12a and 12b transmitted in Step S4 (Step S5).
The communication unit 14 of the PLD 2a stores the structures of the neural network 12b received in Step S5 in the reconfiguration data memory 15 (Step S6). As a result, as illustrated in
The reconfigurable controller 17 of the PLD 2a forms the neural networks 12a and 12b having the structures of the neural networks 12a and 12b received in Step S5 (Step S7). The reconfigurable controller 17 forms the neural network 12b in accordance with the first sequence of the sequences stored in the sequence data memory 16. For example, in the case of the example of
Based on the processing sequence described above, in the PLD 2a, neural networks 12a and 12b based on the newest learning are formed every learning period.
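Steps S1 to S7 can be sketched as follows. This is an illustration under assumptions: structures are modeled as plain strings, the `train` callable stands in for the learning and generation of Steps S2 and S3, and the class and function names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical trainer type: learning data -> (structure of 12a, structures of 12b).
Trainer = Callable[[List[bytes]], Tuple[str, Dict[str, str]]]


@dataclass
class Vehicle:
    sequence: List[str] = field(default_factory=list)   # sequence data memory 16
    reconfig_memory: Dict[str, str] = field(default_factory=dict)
    net_a: Optional[str] = None
    net_b: Optional[str] = None

    def receive(self, struct_a: str, structs_b: Dict[str, str]) -> None:
        self.reconfig_memory = dict(structs_b)   # Step S6: store structures of 12b
        self.net_a = struct_a                    # Step S7: form 12a
        if self.sequence:                        # Step S7: form 12b from the first entry
            self.net_b = self.reconfig_memory[self.sequence[0]]


def learning_cycle(is_learning_period: bool, db: List[bytes],
                   train: Trainer, vehicle: Vehicle) -> bool:
    if not is_learning_period:                   # "No" in S1
        return False
    struct_a, structs_b = train(db)              # Steps S2 and S3
    vehicle.receive(struct_a, structs_b)         # Steps S4 and S5
    return True
```

Calling `learning_cycle` once per learning period leaves the vehicle with the newest structures, matching the flow described above.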
First, the PLD 2a (i.e., the timer 18) judges whether or not the timer time (refer to “Timer” of
On the other hand, when it is judged by the timer 18 that the timer time has arrived (“Yes” in Step S11), based on the sequence stored in the sequence data memory 16, the reconfigurable controller 17 of the PLD 2a refers to the reconfiguration data memory 15 to form the neural network 12b (refer to “Reconfig” of
The comparison unit 13 of the PLD 2a compares the output result of the neural network 12a and the output result of the neural network 12b (Step S13).
The comparison unit 13 of the PLD 2a judges whether or not the comparison result between the output result of the neural network 12a and the output result of the neural network 12b is different by a predetermined standard or more (Step S14). When the comparison result between the output result of the neural network 12a and the output result of the neural network 12b is not different by the predetermined standard or more (“No” in S14), the comparison unit 13 ends the processing of this sequence.
On the other hand, when the comparison result between the output result of the neural network 12a and the output result of the neural network 12b is different by the predetermined standard or more (“Yes” in S14), the comparison unit 13 outputs the feature detection signal to the communication unit 14 (Step S15).
The communication unit 14 of the PLD 2a transmits the image data (i.e., feature image data) stored in the data memory 19 to the DB 4 based on the feature detection signal output in Step S15 (Step S16).
The DB 4 receives the feature image data transmitted in Step S16 (Step S17), and then stores the feature image data received in Step S17 (Step S18).
Based on the sequence described above, a neural network 12b having different structures is formed in the PLD 2a at every timer time. When image data having a feature has been input to the input unit 11, that image data is transmitted to and stored in the DB 4 as feature image data. As a result, the learning apparatus 3 can learn based on learning data including new learning data (i.e., the feature image data).
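Steps S13 to S18 can be sketched for a single image as follows. This is a minimal illustration: the two networks are modeled as hypothetical callables that return score vectors, the difference metric is an assumption, and a plain list stands in for the DB 4 and the wireless transmission.

```python
from typing import Callable, List, Sequence

# Hypothetical network type: image bytes -> vector of output scores.
Network = Callable[[bytes], Sequence[float]]


def process_frame(frame: bytes, net_a: Network, net_b: Network,
                  threshold: float, db: List[bytes]) -> bool:
    """Return True when the frame was judged to be feature image data."""
    out_a = net_a(frame)                 # Step S13: both networks receive the same frame
    out_b = net_b(frame)
    # Assumed metric: largest element-wise absolute difference.
    diff = max((abs(a - b) for a, b in zip(out_a, out_b)), default=0.0)
    if diff < threshold:                 # "No" in S14: nothing is transmitted
        return False
    db.append(frame)                     # Steps S15 to S18: signal, transmit, receive, store
    return True
```

Only frames on which the two networks disagree reach the list standing in for the DB 4.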
The reconfigurable controller 17 includes a control unit 21 and a read/write (R/W) unit 22. When a power supply is input, for example, the control unit 21 initially configures the PLD 2a by referring to data for initial configuration stored in the reconfiguration data memory 15, which is a nonvolatile memory.
Next, the control unit 21 refers to the reconfiguration data memory 15, and sequentially controls reconfiguration of each of the neural network areas 31a and 31b via the R/W unit 22. The control unit 21 controls reconfiguration in accordance with the sequence in the sequence data memory 16. In the reconfiguration data memory 15, the data for initial configuration and the data for performing reconfiguration (e.g., structures of the neural network 12b) by time sharing are stored separately.
The neural network areas 31a and 31b are a configuration random access memory (CRAM) or other such configuration memory. The neural network 12a is formed in, for example, the neural network area 31a, and the neural network 12b is formed in, for example, the neural network area 31b. In
The neural network area 31a illustrated in
The neural network area 31a includes a storage area 51 configured to store a weighting coefficient, a storage area 52 configured to store information on connection relations among, for example, the layers of the neural network 12a, and a calculation area 53 configured to perform calculations. The neural network area 31b also includes similar storage areas and a similar calculation area.
The R/W unit 22 is configured to read data from and write data to the CRAM. An address output unit 41 is configured to output a physical address of the CRAM. A sequence controller 42 is configured to control the address output unit 41 and a copy address map memory 43 in accordance with the sequence in the sequence data memory 16. The copy address map memory 43 is a memory configured to store a correspondence relation between the physical addresses of the neural network areas 31a and 31b.
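One possible use of the copy address map memory 43 can be sketched as follows. The description does not spell out how the map is applied, so this is only one plausible reading: the CRAM is modeled as a flat list of words, the map is a hypothetical dictionary from a source physical address to a destination physical address, and the copy direction is an assumption.

```python
from typing import Dict, List


def copy_area(cram: List[int], address_map: Dict[int, int]) -> None:
    """Copy each word from its source address to the mapped destination.

    `address_map` stands in for the correspondence relation between the
    physical addresses of the two neural network areas.
    """
    for src, dst in address_map.items():
        cram[dst] = cram[src]
```

In this model the address output unit 41 would supply the `src` and `dst` addresses one pair at a time under control of the sequence controller 42.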
The control unit 21 also stores, in order to change a connection relation with the weighting coefficient of the neural network 12b by time sharing, a weighting coefficient in a CRAM configured to store logic information. The neural network area 31b in which the neural network 12b is formed is rewritten at a timer time interval of the timer 18. As a result, the neural network area 31b in the CRAM is autonomously rewritten by time sharing.
A related-art neural network weighting coefficient is stored in a random-access memory (RAM) block, and R/W is performed every word. However, it is difficult to simultaneously execute a plurality of data sharing operations to computing units formed in parallel, which is a characteristic of FPGA. In the PLD 2a, the weighting coefficient is stored in a CRAM, which enables a weighting coefficient value to be supplied simultaneously to a plurality of computing units. Further, with a RAM block, it is difficult to read the weighting coefficient because the read port is being used when copy processing of the neural networks is executed. However, with the PLD 2a, a configuration bus that is independent of the read path during calculations can be used for updating via the R/W unit 22, without stopping the calculation processing. In addition, because the RAM blocks are grouped and fixed due to the physical arrangement of the FPGA, depending on the arrangement position, a wiring delay is increased when the neural network structures are changed. However, with the PLD 2a, because the networks and the weighting coefficients are implemented in a configuration memory, the arrangement can be closer, which allows the wiring delay to be decreased. As a result, a new memory and logic circuit for rewriting as described above are not needed, which enables implementation in a small-scale FPGA.
As described above, the PLD 2a includes the neural network 12a, to which image data is input; the neural network 12b, to which the image data is also input and which has a different structure from that of the neural network 12a; and the comparison unit 13, which is configured to compare an output result of the neural network 12a and an output result of the neural network 12b. The PLD 2a also includes the communication unit 14, which is configured to wirelessly transmit the image data to the host system configured to learn image data when the output result of the neural network 12a and the output result of the neural network 12b are different from each other by a predetermined standard or more. As a result, the PLD 2a can efficiently collect image data of the usage environment.
Because the PLD 2a is configured to efficiently collect image data of the usage environment, the host system can generate the neural networks 12a and 12b in accordance with the usage environment, which changes day by day. The PLD 2a can also perform appropriate recognition processing, for example, by using, in accordance with the usage environment which changes day by day, the neural networks 12a and 12b generated by the host system.
The PLD 2a is configured to not transmit all of the image data photographed by the camera to the network 5 as learning data. This enables the storage capacity of the DB 4 to be reduced, and the load on the network 5 to be suppressed.
The PLD 2a forms, at a predetermined period, the neural network 12b having a plurality of different structures. This enables the PLD 2a to extract various kinds of feature image data.
As a result of using a programmable logic device and implementing the neural network structures on a configuration memory, high performance can be achieved at a smaller scale and using less power.
Because the PLD 2a includes the two neural networks 12a and 12b, image data of the usage environment can be collected without placing a burden on recognition processing, for example.
In a second embodiment of the present invention, the structures of the neural network 12b are different from those in the first embodiment. In the first embodiment, a part of the structure of each layer of the neural network 12b (refer to
A part of the structure of a neural network 61 illustrated in
The portions of the neural network 12b that are not different from those of the neural network 12a are not illustrated in
The neural network 61 has a plurality of structures. The plurality of structures of the neural network 61 are stored in the reconfiguration data memory 15.
The dotted line portion in
Only the portion that is different from that of the neural network 12a is stored in the reconfiguration data memory 15. In other words, only the structures of the dotted-line portion illustrated in
As described above, a part in the layer direction of the structures of the neural network 61 is different from that of the neural network 12a. Therefore, the PLD 2a can efficiently collect image data of the usage environment.
In the example described above, only the structure portion different from that of the neural network 12a is stored in the reconfiguration data memory 15. However, the structure portions that are the same as those of the neural network 12a may also be stored. For example, the structures other than the dotted-line portion illustrated in
In a third embodiment of the present invention, a whole layer of the structures of the neural network 12b is different from that of the neural network 12a.
A part of the structure of a neural network 71 illustrated in
The neural network 71 has a plurality of structures. The plurality of structures of the neural network 71 are stored in the reconfiguration data memory 15.
The hatched layers illustrated in
Only the portion that is different from the neural network 12a is stored in the reconfiguration data memory 15. In other words, only the structures of the hatched portion illustrated in
In the neural network 71 described above, the structure of one layer is different from that of the neural network 12a. However, the structures of two or more layers may be different.
As described above, at least an entire layer of the structures of the neural network 71 is different from that of the neural network 12a. Therefore, the PLD 2a can efficiently collect image data of the usage environment.
In the example described above, only the layer different in structure from the neural network 12a is stored in the reconfiguration data memory 15. However, the layers having the same structure as that of the neural network 12a may also be stored. For example, the structures of layers other than the hatched portion illustrated in
In the first to third embodiments, a part of the structures of the neural network 12b is different from that of the neural network 12a. In a fourth embodiment of the present invention, all of the structures of the neural network 12b are different from those of the neural network 12a.
The entire structure of a neural network 81 illustrated in
A plurality of neural network 81 structures having an entire structure different from that of the neural network 12a are stored in the reconfiguration data memory 15.
As described above, the entire structure of the neural network 81 is different from that of the neural network 12a. Therefore, the PLD 2a can efficiently collect image data of the usage environment.
The function configuration of the above-mentioned recognition apparatus and learning system has been classified in accordance with the main processing content in order to allow the configuration of the recognition apparatus and the learning system to be understood more easily. The classification method and the names of the components are not limited to those in the present invention. The configuration of the recognition apparatus and the learning system may be classified into even more components in accordance with the processing content. One component can also be classified so that it can execute even more processes. The processing of each component may be executed by one piece of hardware, or by a plurality of pieces of hardware.
Each processing unit of the above-mentioned sequence has been divided in accordance with the main processing content in order to allow the processing of the recognition apparatus and the learning system to be understood more easily. The division method and the names of the processing units are not limited to those in the present invention. The processing of the recognition apparatus and the learning system may be divided into even more processing units in accordance with the processing content. One processing unit can also be divided so that it includes even more processes. The present invention can be provided as a program for implementing the functions of the recognition apparatus and the learning system, and also as a storage medium in which that program is stored.
A part or all of each of the above-mentioned configurations, functions, processing units, and the like may be implemented by hardware by, for example, designing an integrated circuit. The control lines and information lines considered to be necessary for the description are described, but it is not necessarily the case that all the control lines and information lines of a product have been described. It may be considered that in actual practice almost all parts are connected to each other.
The technical elements of the above-mentioned embodiments may be applied independently, or may be applied by being divided into a plurality of portions, such as a program portion and a hardware portion.
1 VEHICLE, 2 ECU, 2a PLD, 3 LEARNING APPARATUS, 4 DB, 5 NETWORK, 11 INPUT UNIT, 12a NEURAL NETWORK, 12b NEURAL NETWORK, 13 COMPARISON UNIT, 14 COMMUNICATION UNIT, 15 RECONFIGURATION DATA MEMORY, 16 SEQUENCE DATA MEMORY, 17 RECONFIGURABLE CONTROLLER, 18 TIMER, 19 DATA MEMORY, 21 CONTROL UNIT, 22 R/W UNIT, 31a NEURAL NETWORK AREA, 31b NEURAL NETWORK AREA, 41 ADDRESS OUTPUT UNIT, 42 SEQUENCE CONTROLLER, 43 COPY ADDRESS MAP MEMORY, 51 STORAGE AREA, 52 STORAGE AREA, 53 CALCULATION AREA, 61 NEURAL NETWORK, 71 NEURAL NETWORK, 81 NEURAL NETWORK
Number | Date | Country | Kind |
---|---|---|---|
2016-195629 | Oct 2016 | JP | national |
This application claims priority based on Japanese Patent Application No. 2016-195629 filed on Oct. 3, 2016, the entire contents of which are incorporated herein by reference for all purposes.