The embodiments discussed herein are related to an analysis device and an analysis program.
Commonly, when image data is recorded or transmitted, the data size is reduced by an image compression process in order to lower the recording cost and the transmission cost.
Japanese Laid-open Patent Publication No. 2018-101406, Japanese Laid-open Patent Publication No. 2019-079445, and Japanese Laid-open Patent Publication No. 2011-234033 are disclosed as related art.
According to an aspect of the embodiments, an analysis device includes: a memory; and a computer coupled to the memory and configured to: store information that indicates a degree of influence of each area of each piece of decoded data on recognition results and is calculated by performing a recognition process on the decoded data obtained by decoding each piece of compressed data when a compression process is performed on image data at different compression levels; and designate the compression levels for each area of the image data, based on the information that corresponds to the different compression levels and indicates the degree of influence of each area of each piece of the decoded data on the recognition results.
The object and advantages of the invention will be realized and attained by means of the elements and combinations particularly pointed out in the claims.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory and are not restrictive of the invention.
Meanwhile, in recent years, there have been an increasing number of cases in which image data is recorded or transmitted for the purpose of being utilized for an image recognition process by artificial intelligence (AI). As a representative model of AI, for example, a model using deep learning or machine learning can be cited.
However, the past compression process is designed around human visual characteristics and is not based on an analysis of how the AI operates. For this reason, there have been cases where an area that is not involved in the image recognition process by AI is not compressed at a sufficiently high compression level.
In one aspect, an object is to implement a compression process suitable for an image recognition process by AI.
Hereinafter, each embodiment will be described with reference to the accompanying drawings. Note that, in the present specification and the drawings, constituent elements having substantially the same functional configuration are denoted by the same reference sign, and redundant description will be omitted.
First, a system configuration of the entire compression processing system including an analysis device according to a first embodiment will be described.
In
As illustrated in 1a of
The imaging device 110 captures an image at a predetermined frame period and transmits image data to the analysis device 120. Note that the image data includes an object targeted for a recognition process.
The analysis device 120 includes a learned model that performs the recognition process. The analysis device 120 inputs, to the learned model, either the image data or decoded data and outputs the recognition result. Here, the decoded data is obtained by decoding the compressed data generated when the compression process is performed on the image data at different compression levels.
In addition, the analysis device 120 generates a map (referred to as an important feature map) indicating the degree of influence on the recognition result by analyzing the operation of the learned model using, for example, an error back propagation method, and aggregates the degree of influence for each predetermined area (for each block used when the compression process is performed).
Note that the analysis device 120 instructs the image compression device 130 to perform the compression process at different compression levels (quantization values) and repeats similar processes on each piece of the compressed data when the compression process is performed at each compression level.
The analysis device 120 calculates an aggregated value of the degree of influence of each block each time the image compression device 130 is instructed to perform the compression process at different compression levels and designates an optimum compression level (quantization value) of each block, based on changes in the aggregated value with respect to each compression level (each quantization value). Note that the optimum compression level (quantization value) refers to the maximum compression level (quantization value) that allows the recognition process to be precisely performed on the object included in the image data.
In this manner, by analyzing the operation of the learned model and calculating the degree of influence on the recognition result, the analysis device 120 may designate the optimum compression level for performing a compression process suitable for the image recognition process by the learned model.
Meanwhile, as illustrated in 1b of
The analysis device 120 transmits the optimum compression levels (quantization values) designated for each block and the image data to the image compression device 130.
The image compression device 130 performs the compression process on the image data, using the designated optimum compression levels (quantization values) and stores the compressed data in the storage device 140.
In this manner, the analysis device 120 according to the present embodiment uses a compression level suitable for the image recognition process by the learned model. For example, the analysis device 120 according to the present embodiment differs from the past compression process in the following respects and can therefore implement the compression process suitable for the image recognition process by the learned model.
The past compression process is based only on the shapes, properties, targets of interest, and the like that can be grasped by human concepts. It does not use the feature part focused on at the time of inference, which usually cannot be demarcated by boundaries in human concepts.
In the past compression process, the internal operation of a convolutional neural network (CNN) unit 320 in the course of outputting the recognition result (for example, the propagation path of signals and processing results from the input of the image data to the output of the recognition result, and the propagation intensity of those signals and processing results) is not analyzed.
Next, a hardware configuration of the analysis device 120 and the image compression device 130 will be described. Note that, since the analysis device 120 and the image compression device 130 have similar hardware configurations, both the devices will be collectively described here with reference to
The processor 201 includes various arithmetic devices such as a central processing unit (CPU) and a graphics processing unit (GPU). The processor 201 reads various programs (for example, an analysis program or an image compression program or the like described later) into the memory 202 and executes the read programs.
The memory 202 includes a main storage device such as a read only memory (ROM) or a random access memory (RAM). The processor 201 and the memory 202 form a so-called computer. The processor 201 executes various programs read into the memory 202 to cause the computer to implement various functions (details of the various functions will be described later).
The auxiliary storage device 203 stores various programs and various pieces of data used when the various programs are executed by the processor 201.
The I/F device 204 is a connection device that connects an operation device 210 and a display device 220, which are examples of external devices, with the analysis device 120 or the image compression device 130. The I/F device 204 receives an operation for the analysis device 120 or the image compression device 130 via the operation device 210. In addition, the I/F device 204 outputs a result of processing by the analysis device 120 or the image compression device 130 and displays the result via the display device 220.
The communication device 205 is a communication device for communicating with another device. In the case of the analysis device 120, communication is performed with the imaging device 110 and the image compression device 130 via the communication device 205. In addition, in the case of the image compression device 130, communication is performed with the analysis device 120 and the storage device 140 via the communication device 205.
The drive device 206 is a device for setting a recording medium 230. The recording medium 230 mentioned here includes a medium that optically, electrically, or magnetically records information, such as a compact disc read only memory (CD-ROM), a flexible disk, or a magneto-optical disk. Alternatively, the recording medium 230 may include a semiconductor memory or the like that electrically records information, such as a ROM or a flash memory.
Note that various programs installed in the auxiliary storage device 203 are installed, for example, by setting the distributed recording medium 230 in the drive device 206 and reading the various programs recorded in the recording medium 230 by the drive device 206. Alternatively, the various programs to be installed in the auxiliary storage device 203 may be installed by being downloaded from a network via the communication device 205.
Next, a functional configuration of the analysis device 120 will be described.
The input unit 310 acquires image data transmitted from the imaging device 110 or compressed data transmitted from the image compression device 130. The input unit 310 notifies the CNN unit 320 and the output unit 340 of the acquired image data and decodes the acquired compressed data using a decoding unit (not illustrated) to also notify the CNN unit 320 of the decoded data.
The CNN unit 320 includes a learned model and, by inputting the image data or the decoded data, performs the recognition process on an object included in the image data or the decoded data to output the recognition result.
The quantization value setting unit 330 notifies the output unit 340 sequentially of the compression levels (from the minimum quantization value (initial value) to the maximum quantization value) used when the image compression device 130 performs the compression process and also stores the compression levels in an aggregation result storage unit 380, which is an example of a storage unit.
The output unit 340 transmits the image data acquired by the input unit 310 to the image compression device 130. In addition, each quantization value notified by the quantization value setting unit 330 is sequentially transmitted to the image compression device 130. Furthermore, the quantization value (designated quantization value) designated by the quantization value designation unit 370 is transmitted to the image compression device 130.
The important feature map generation unit 350 is an example of a map generation unit. The important feature map generation unit 350 acquires the CNN unit structure information obtained when the learned model performs the recognition process on the image data or the decoded data and generates an important feature map by utilizing an error back propagation method based on the acquired CNN unit structure information.
The important feature map generation unit 350 generates the important feature map by using, for example, a back propagation (BP) method, a guided back propagation (GBP) method, or a selective BP method.
Note that the BP method is a method in which the error of each label is computed from a classification probability obtained by performing the recognition process on image data (or decoded data) whose recognition result is the correct answer label, and the feature part is visualized by forming an image of the magnitude of a gradient obtained by back propagation to the input layer. In addition, the GBP method is a method in which the feature part is visualized by forming an image of only the positive values of the gradient information as the feature part.
Furthermore, the selective BP method is a method in which back propagation is performed using the BP method or the GBP method after maximizing only the errors of the correct answer labels. In the case of the selective BP method, the feature part to be visualized is a feature part that affects only the scores of the correct answer labels.
In this manner, by using the BP method, the GBP method, or the selective BP method, the important feature map generation unit 350 analyzes the signal flow and intensity of each path in the CNN unit 320 from the input of the image data or the decoded data to the output of the recognition result. Consequently, according to the important feature map generation unit 350, it may be possible to visualize which part of the input image data or decoded data affects the recognition result and to what extent. Note that, when AI to which the BP method, the GBP method, or the selective BP method is not applied (or is not applicable) is used as the CNN unit 320, the important feature map generation unit 350 generates the important feature map by analyzing similar information.
Note that, for example, the method of generating the important feature map by the error back propagation method is disclosed in documents such as “Selvaraju, Ramprasaath R., et al., “Grad-cam: Visual explanations from deep networks via gradient-based localization.”, The IEEE International Conference on Computer Vision (ICCV), 2017, pp. 618-626”.
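As a minimal sketch of how the gradient-based visualization described above may be computed, the following Python example back propagates the error of each label to the input layer of a toy two-layer classifier. The network, the function name important_feature_map, and the weight arguments w1 and w2 are assumptions introduced only for illustration and are not the learned model of the CNN unit 320. With guided=False the map is the magnitude of the input gradient (BP-style); with guided=True only positive gradient values are kept (GBP-style).

```python
import numpy as np

def important_feature_map(image, w1, w2, label, guided=False):
    """Per-pixel influence map for a toy classifier: linear -> ReLU -> linear -> softmax."""
    x = np.asarray(image, dtype=np.float64).reshape(-1)
    # Forward pass.
    pre = w1 @ x
    hidden = np.maximum(pre, 0.0)
    logits = w2 @ hidden
    exp = np.exp(logits - logits.max())
    prob = exp / exp.sum()
    # Error of each label: classification probability minus the one-hot correct answer.
    err = prob.copy()
    err[label] -= 1.0
    # Backward pass down to the input layer.
    grad_hidden = w2.T @ err
    grad_pre = grad_hidden * (pre > 0)
    if guided:
        # Guided variant: negative gradients are also blocked at the ReLU.
        grad_pre = np.maximum(grad_pre, 0.0)
    grad_x = w1.T @ grad_pre
    if guided:
        fmap = np.maximum(grad_x, 0.0)  # keep only positive gradient values
    else:
        fmap = np.abs(grad_x)           # magnitude of the gradient
    return fmap.reshape(np.shape(image))
```

The returned array has the same shape as the input image, so it can be aggregated in block units directly.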
The aggregation unit 360 aggregates the degree of influence on the recognition result in block units, based on the important feature map and calculates the aggregated value of the degree of influence for each block. In addition, the aggregation unit 360 stores the calculated aggregated value of each block in the aggregation result storage unit 380 in association with the quantization value.
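The aggregation in block units may be sketched as follows. This example assumes that the aggregated value of a block is the sum of the degrees of influence inside that block and that the map size is a multiple of the block size; the function name aggregate_by_block is hypothetical.

```python
import numpy as np

def aggregate_by_block(feature_map, block_size):
    """Sum the degree of influence inside each block_size x block_size block."""
    feature_map = np.asarray(feature_map)
    h, w = feature_map.shape
    if h % block_size or w % block_size:
        raise ValueError("map size must be a multiple of block_size")
    # Split each axis into (block index, offset inside block), then sum the offsets.
    blocks = feature_map.reshape(h // block_size, block_size,
                                 w // block_size, block_size)
    return blocks.sum(axis=(1, 3))
```

One such array would be stored per quantization value, which gives the per-block series that the designation of the quantization value relies on.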
The quantization value designation unit 370 is an example of a designation unit and designates an optimum quantization value for each block, based on the aggregated value of each block (a number of aggregated values according to the number of quantization values) stored in the aggregation result storage unit 380. In addition, the quantization value designation unit 370 notifies the output unit 340 of the designated optimum quantization value for each block.
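One possible way to designate a quantization value from the stored aggregated values is sketched below. The rule used here is an assumption made only for illustration: for each block, the quantization value is raised as long as the aggregated value changes by no more than max_change from one quantization value to the next. The function name, the dictionary layout, and the threshold are hypothetical.

```python
def designate_quantization_values(aggregated, q_values, max_change):
    """aggregated[q] holds the per-block aggregated values for quantization value q.

    Hypothetical rule: keep raising the quantization value of a block until the
    aggregated value jumps by more than max_change between adjacent values.
    """
    n_blocks = len(aggregated[q_values[0]])
    designated = []
    for b in range(n_blocks):
        best = q_values[0]
        for prev_q, q in zip(q_values, q_values[1:]):
            if abs(aggregated[q][b] - aggregated[prev_q][b]) > max_change:
                break  # the influence on recognition changed too much at q
            best = q
        designated.append(best)
    return designated
```

A block whose aggregated value stays flat receives the maximum quantization value, whereas a block whose aggregated value changes sharply at an early step keeps a small one.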
In this manner, the analysis device 120 takes as a reference the concept perceived by the CNN unit 320 instead of the concept perceived by humans. On that basis, the analysis device 120 calculates, as a quantization value, the degree to which the feature part that is important when the CNN unit 320 performs the recognition process tolerates deterioration due to the compression process (influence on the recognition accuracy).
Next, a specific example of the aggregation result stored in the aggregation result storage unit 380 will be described.
As indicated by 4b, an aggregation result 420 includes “block number” and “quantization value” as information items.
In “block number”, the block number of each block in the image data 410 is stored. In “quantization value”, “no compression” indicating a case where the image compression device 130 does not perform the compression process, and the minimum quantization value (“Q1”) to the maximum quantization value (“Qn”) used when the image compression device 130 performs the compression process are stored.
In addition, the area specified by “block number” and “quantization value” stores an aggregated value aggregated in the corresponding block in such a manner that
Next, a specific example of processing by the quantization value designation unit 370 will be described.
Note that the aggregated values of each block used to generate the graphs 510_1 to 510_m, for example,
As illustrated in the graphs 510_1 to 510_m, the change in the aggregated value when changed from the minimum quantization value (Q1) to the maximum quantization value (Qn) differs from block to block. The quantization value designation unit 370 designates the optimum quantization value of each block, for example, when any of the following conditions is satisfied:
In
Note that the size of the block at the time of aggregation and the size of the block used for the compression process do not have to match. In that case, for example, the quantization value designation unit 370 designates the quantization value as follows.
When the size of the block used for the compression process is larger than the size of the block at the time of aggregation, one block used for the compression process contains a plurality of blocks at the time of aggregation. In this case, the average value (alternatively, the minimum value, the maximum value, or a value modified with another index) of the quantization values based on the aggregated values of the contained blocks is adopted as the quantization value of the block used for the compression process.
When the size of the block used for the compression process is smaller than the size of the block at the time of aggregation, one block at the time of aggregation contains a plurality of blocks used for the compression process. In this case, the quantization value based on the aggregated value of the block at the time of aggregation is used as the quantization value of each of the contained blocks used for the compression process.
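The two cases above can be sketched as follows, assuming square blocks whose sizes are integer multiples of each other. The function name remap_quantization_values and its parameters are hypothetical; reducer selects the average, the minimum, or the maximum, and rounding of a non-integer average is left to the caller.

```python
import numpy as np

def remap_quantization_values(q_agg, agg_size, comp_size, reducer=np.mean):
    """Map per-aggregation-block quantization values onto compression blocks."""
    q_agg = np.asarray(q_agg)
    rows, cols = q_agg.shape
    if comp_size >= agg_size:
        # Larger compression block: combine the contained aggregation blocks.
        if comp_size % agg_size:
            raise ValueError("comp_size must be a multiple of agg_size")
        k = comp_size // agg_size
        if rows % k or cols % k:
            raise ValueError("grid is not divisible into compression blocks")
        grouped = q_agg.reshape(rows // k, k, cols // k, k)
        return reducer(grouped, axis=(1, 3))
    # Smaller compression block: every contained block inherits the value.
    if agg_size % comp_size:
        raise ValueError("agg_size must be a multiple of comp_size")
    k = agg_size // comp_size
    return np.repeat(np.repeat(q_agg, k, axis=0), k, axis=1)
```

Choosing reducer=np.min favors recognition accuracy, because the most cautious quantization value among the contained blocks is used, whereas np.max favors the compression rate.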
In addition, the quantization values indicated by the reference sign 530 may be additionally evaluated by the analysis device 120. For example, first, the analysis device 120 decodes the compressed data that has undergone the compression process using the quantization values indicated by the reference sign 530 and performs the recognition process on the decoded data. Subsequently, the analysis device 120 alters the quantization values indicated by the reference sign 530 by increasing the minimum value among them by a predetermined amount (for example, by one). At this time, when the minimum value appears in a plurality of blocks, the same addition is performed on each of those blocks.
Subsequently, the analysis device 120 decodes the compressed data that has undergone the compression process using the altered quantization values indicated by the reference sign 530 and performs the recognition process on the decoded data.
The analysis device 120 repeats these processes until the maximum value among the quantization values indicated by the reference sign 530 is reached and acquires a plurality of pairs of the altered quantization values indicated by the reference sign 530 and the corresponding recognition results.
Subsequently, from among the plurality of pairs, the analysis device 120 selects the pair whose recognition accuracy is above an allowable lower limit and whose minimum quantization value is the largest. The analysis device 120 then replaces the quantization values indicated by the reference sign 530 (before the alteration) with the altered quantization values contained in the selected pair.
In this manner, by additionally evaluating the quantization values indicated by the reference sign 530, a quantization value having a higher compression rate than the compression rates of the quantization values indicated by the reference sign 530 may be designated.
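The additional evaluation described above may be sketched as follows. The callback evaluate is a hypothetical stand-in for the sequence of compressing with the given quantization values, decoding, and performing the recognition process, and it is assumed to return a recognition accuracy. The function name and the step argument are also assumptions.

```python
def refine_quantization_values(q_values, evaluate, min_accuracy, step=1):
    """Raise the smallest quantization values step by step and keep the best set.

    evaluate(q_list) is a hypothetical callback returning a recognition accuracy.
    """
    current = list(q_values)
    q_max = max(current)
    # Pairs of (quantization values, recognition accuracy).
    pairs = [(current[:], evaluate(current))]
    while min(current) < q_max:
        lowest = min(current)
        # Every block that holds the minimum value receives the same addition.
        current = [min(q + step, q_max) if q == lowest else q for q in current]
        pairs.append((current[:], evaluate(current)))
    acceptable = [(q, acc) for q, acc in pairs if acc > min_accuracy]
    if not acceptable:
        return list(q_values)  # nothing passed, keep the original values
    best_q, _ = max(acceptable, key=lambda pair: min(pair[0]))
    return best_q
```

Because each pass costs one compression process and one recognition process, the number of passes is bounded by the gap between the smallest and largest designated quantization values.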
Next, a functional configuration of the image compression device 130 will be described.
The coding unit 620 is an example of a compression unit. The coding unit 620 includes a difference unit 621, an orthogonal transformation unit 622, a quantization unit 623, an entropy coding unit 624, an inverse quantization unit 625, and an inverse orthogonal transformation unit 626. In addition, the coding unit 620 includes an addition unit 627, a buffer unit 628, an in-loop filter unit 629, a frame buffer unit 630, an in-screen prediction unit 631, and an inter-screen prediction unit 632.
The difference unit 621 calculates the difference between the image data (for example, the image data 410) and predicted image data and outputs a predicted residual signal.
The orthogonal transformation unit 622 executes an orthogonal transformation process on the predicted residual signal output by the difference unit 621.
The quantization unit 623 quantizes the predicted residual signal that has undergone the orthogonal transformation process and generates a quantized signal. The quantization unit 623 generates the quantized signal using the quantization value indicated by the reference sign 530 (the quantization value transmitted from the analysis device 120 or the designated optimum quantization value).
The entropy coding unit 624 generates compressed data by performing an entropy coding process on the quantized signal.
The inverse quantization unit 625 inverse-quantizes the quantized signal. The inverse orthogonal transformation unit 626 executes an inverse orthogonal transformation process on the inverse-quantized quantized signal.
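The effect of the quantization value on the quantized signal can be illustrated with a uniform quantizer. This is a simplified sketch: step stands for a quantization step size, whereas actual coding schemes derive the step size from the quantization value through tables defined by the respective standard. The function names are hypothetical.

```python
import numpy as np

def quantize(coefficients, step):
    """Map transformed residual coefficients to integer levels."""
    scaled = np.asarray(coefficients, dtype=np.float64) / step
    return np.rint(scaled).astype(np.int64)

def dequantize(levels, step):
    """Inverse quantization: reconstruct approximate coefficients."""
    return np.asarray(levels, dtype=np.float64) * step
```

A larger step turns more coefficients into zero, which shortens the entropy-coded output, while the reconstruction error of each coefficient grows to at most half the step.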
The addition unit 627 generates reference image data by adding the signal output from the inverse orthogonal transformation unit 626 and the predicted image data. The buffer unit 628 stores the reference image data generated by the addition unit 627.
The in-loop filter unit 629 performs a filter process on the reference image data stored in the buffer unit 628. The in-loop filter unit 629 includes
The frame buffer unit 630 stores the reference image data on which the filter process has been performed by the in-loop filter unit 629, in frame units.
The in-screen prediction unit 631 performs in-screen prediction based on the reference image data and generates the predicted image data. The inter-screen prediction unit 632 performs motion compensation between frames using the input image data (for example, the image data 410) and the reference image data and generates the predicted image data.
Note that the predicted image data generated by the in-screen prediction unit 631 or the inter-screen prediction unit 632 is output to the difference unit 621 and the addition unit 627.
In addition, in the above description, it is assumed that the coding unit 620 performs the compression process using an existing moving image coding scheme such as moving picture experts group (MPEG)-2, MPEG-4, H.264, or high efficiency video coding (HEVC). However, the compression process by the coding unit 620 is not limited to these moving image coding schemes and may be performed using any coding scheme in which the compression rate is controlled by parameters such as quantization.
Next, a flow of an image compression process by a compression processing system 100 will be described.
In step S701, the quantization value setting unit 330 initializes the compression level (sets the minimum quantization value (Q1)) and also sets the upper limit of the compression level (sets the maximum quantization value (Qn)).
In step S702, the input unit 310 acquires image data or compressed data in frame units. In addition, when the compressed data is acquired, the input unit 310 decodes the acquired compressed data and generates decoded data.
In step S703, the CNN unit 320 performs the recognition process on the image data (or the decoded data) and outputs the recognition result.
In step S704, the important feature map generation unit 350 generates the important feature map indicating the degree of influence of each area on the recognition result, based on the CNN unit structure information.
In step S705, the aggregation unit 360 aggregates the degree of influence of each area in block units, based on the important feature map. In addition, the aggregation unit 360 stores the aggregation result in the aggregation result storage unit 380 in association with the current compression level (quantization value).
In step S706, the output unit 340 transmits the image data and the current compression level (quantization value) to the image compression device 130. In addition, the image compression device 130 performs the compression process on the transmitted image data at the current compression level (quantization value) and generates compressed data.
In step S707, the quantization value setting unit 330 raises the compression level (here, sets the quantization value (Q2)).
In step S708, the quantization value designation unit 370 determines whether or not the current compression level exceeds the upper limit (whether or not the current quantization value exceeds the maximum quantization value (Qn)). When it is determined in step S708 that the current compression level does not exceed the upper limit (in the case of No in step S708), the process returns to step S702.
In this case, in step S702, the compressed data generated in step S706 is acquired, and the processes in steps S703 to S707 are performed on decoded data obtained by decoding the acquired compressed data.
On the other hand, when it is determined in step S708 that the current compression level exceeds the upper limit (in the case of Yes in step S708), the process proceeds to step S709.
In step S709, the quantization value designation unit 370 designates the optimum compression level (optimum quantization value) in block units, based on the aggregation result stored in the aggregation result storage unit 380. In addition, the output unit 340 transmits the designated optimum quantization value to the image compression device 130.
In step S710, the image compression device 130 performs the compression process on the image data, using the designated optimum quantization value and stores the compressed data in the storage device 140.
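The data collection part of this flow (steps S702 to S708) may be sketched as the following loop. The callables compress, decode, and analyze are hypothetical stand-ins for the image compression device 130, the decoding unit, and the combination of the recognition process, the important feature map generation, and the aggregation. The sketch assumes that each result is stored under the quantization value with which the analyzed data was actually compressed and that the data compressed with Qn is also analyzed; the steps as written may be read differently.

```python
def collect_aggregation_results(image, q_values, compress, decode, analyze):
    """Return {quantization value: per-block aggregated values}.

    compress(image, q), decode(data), and analyze(data) are hypothetical
    callables supplied by the caller.
    """
    # First pass: the uncompressed image data itself is analyzed.
    results = {"no compression": analyze(image)}
    for q in q_values:                       # Q1 up to Qn
        compressed = compress(image, q)      # compression at the current level
        decoded = decode(compressed)         # acquired and decoded on the next pass
        results[q] = analyze(decoded)        # recognition, map, aggregation
    return results
```

The returned dictionary has one entry per column of the aggregation result, which is the input needed for the designation in step S709.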
As is clear from the above description, the analysis device according to the first embodiment acquires each piece of compressed data when the compression process is performed on the image data using different quantization values. In addition, the analysis device according to the first embodiment generates the important feature map indicating the degree of influence on the recognition result, based on the CNN unit structure information when the decoded data obtained by decoding each piece of the compressed data was input to the learned model and the recognition process was performed. Furthermore, the analysis device according to the first embodiment aggregates the degree of influence in block units, based on the important feature map and designates the compression level of each block of the image data, based on the aggregated values of each block corresponding to different compression levels.
Consequently, according to the first embodiment, the compression process may be performed using the optimum quantization value designated based on the degree of influence on the recognition result. For example, according to the first embodiment, a compression process suitable for an image recognition process by AI may be implemented.
In the first embodiment described above, the optimum quantization value has been described as being designated, based on the degree of influence on the recognition result, by using all of the quantization values from the minimum quantization value to the maximum quantization value that can be set in the image compression device 130.
In contrast to this, in a second embodiment, a case where the optimum quantization value is designated by performing the compression process using a predetermined quantization value will be described. The second embodiment will be described below focusing on differences from the first embodiment described above.
First, a functional configuration of an analysis device 120 according to the second embodiment will be described.
The maximum quantization value setting unit 810 notifies an output unit 340 of the maximum quantization value (Qn). The quantization value designation unit 820 determines a group to which the aggregated value of each block notified by an aggregation unit 360 belongs, from group information stored in the group information storage unit 830, which is an example of the storage unit. In addition, the quantization value designation unit 820 notifies the output unit 340 of the optimum quantization value associated with the determined group in advance.
Next, a specific example of processing by the quantization value designation unit 820 will be described.
As illustrated in
individually.
The quantization value designation unit 820 acquires, from the aggregation unit 360, the aggregated value of each block calculated by performing the recognition process on the decoded data obtained by decoding the compressed data when the compression process is performed on the image data using the maximum quantization value (Qn). In addition, the quantization value designation unit 820 determines which group the aggregated value of each block belongs to.
Furthermore, the quantization value designation unit 820 notifies the output unit 340 of the quantization value associated with the determined group, as the optimum quantization value of each block.
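The group determination may be sketched as a range lookup. The layout of the group information assumed here (ascending boundaries on the aggregated value, with one quantization value per group) and all names are hypothetical.

```python
from bisect import bisect_right

def designate_by_group(aggregated_values, boundaries, group_q_values):
    """Pick, for each block, the quantization value of the group its value falls in.

    boundaries: ascending thresholds that separate the groups (assumed layout).
    group_q_values: one quantization value per group, len(boundaries) + 1 entries.
    """
    if len(group_q_values) != len(boundaries) + 1:
        raise ValueError("need exactly one quantization value per group")
    return [group_q_values[bisect_right(boundaries, value)]
            for value in aggregated_values]
```

For example, with boundaries [1.0, 5.0] and group quantization values [40, 30, 20], a block whose aggregated value is 0.2 is assigned 40 and a block whose aggregated value is 9.0 is assigned 20. A single compression process at the maximum quantization value is enough to obtain the aggregated values used for this lookup.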
Note that, in the example in
In addition, in the example in
Furthermore, in the example in
Next, a flow of an image compression process by a compression processing system 100 will be described.
In step S1001, the maximum quantization value setting unit 810 sets the maximum compression level (maximum quantization value (Qn)).
In step S1002, an input unit 310 acquires image data in frame units.
In step S1003, the output unit 340 transmits the image data and the maximum compression level (maximum quantization value (Qn)) to an image compression device 130. In addition, the image compression device 130 performs the compression process on the transmitted image data at the maximum compression level (maximum quantization value (Qn)) and generates compressed data.
In step S1004, the input unit 310 acquires and decodes the compressed data generated by the image compression device 130. In addition, a CNN unit 320 performs the recognition process on the decoded data and outputs the recognition result.
In step S1005, an important feature map generation unit 350 generates the important feature map indicating the degree of influence on the recognition result, based on the CNN unit structure information.
In step S1006, the aggregation unit 360 aggregates the degree of influence of each area in block units, based on the important feature map. In addition, the aggregation unit 360 notifies the quantization value designation unit 820 of the aggregation result.
In step S1007, the quantization value designation unit 820 refers to the group information stored in the group information storage unit 830 and determines which group the aggregated value of each block notified by the aggregation unit 360 belongs to. In this way, the quantization value designation unit 820 sorts the blocks into groups.
In step S1008, the quantization value designation unit 820 designates the quantization value associated with the group determined for each block, as the optimum quantization value of that block. In addition, the output unit 340 transmits the designated optimum quantization value to the image compression device 130.
In step S1009, the image compression device 130 performs the compression process on the image data, using the designated optimum quantization value and stores the compressed data in a storage device 140.
As is clear from the above description, the analysis device according to the second embodiment acquires the compressed data when the compression process is performed on the image data using the maximum quantization value. In addition, the analysis device according to the second embodiment generates the important feature map indicating the degree of influence on the recognition result, based on the CNN unit structure information when the recognition process was performed by inputting the decoded data obtained by decoding the compressed data to the learned model. Furthermore, the analysis device according to the second embodiment aggregates the degree of influence in block units, based on the important feature map and, by determining a group to which the aggregated value belongs, designates the quantization value associated with the group, as the optimum quantization value.
Consequently, according to the second embodiment, the compression process may be performed using the optimum quantization value designated based on the degree of influence on the recognition result. That is, according to the second embodiment, an effect similar to the effect of the first embodiment described above is obtained. In addition, according to the second embodiment, the optimum quantization value may be designated with a smaller number of compression processes as compared with the first embodiment described above.
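The block-unit aggregation and group lookup of steps S1006 to S1008 may be sketched as follows. This is a minimal illustration only: summation as the aggregation operation, the contents of `GROUPS`, the 0 to 51 quantization range, and the function names are all assumptions introduced for the example and are not taken from the embodiment.

```python
import numpy as np

def aggregate_blocks(feature_map, block):
    """Sum the degree of influence inside each block x block tile.

    Summation is an assumed aggregation; the map height and width are
    assumed to be multiples of the block size.
    """
    h, w = feature_map.shape
    assert h % block == 0 and w % block == 0
    tiles = feature_map.reshape(h // block, block, w // block, block)
    return tiles.sum(axis=(1, 3))

# Hypothetical group information: (upper bound of aggregated value,
# quantization value associated with the group). Lower influence is
# assumed to map to a larger quantization value.
GROUPS = [(1.0, 51), (4.0, 40), (float("inf"), 30)]

def designate_quantization(aggregated, groups=GROUPS):
    """Look up, for each block, the quantization value of its group."""
    q = np.empty(aggregated.shape, dtype=int)
    for idx, value in np.ndenumerate(aggregated):
        for upper, q_value in groups:
            if value <= upper:
                q[idx] = q_value
                break
    return q

feature_map = np.zeros((4, 4))
feature_map[0:2, 0:2] = 2.0  # strongly influential top-left block
feature_map[2, 2] = 0.5      # weakly influential bottom-right block
agg = aggregate_blocks(feature_map, 2)
q = designate_quantization(agg)
```

Because the lookup uses only one important feature map, a single recognition pass is sufficient to assign a quantization value to every block, which corresponds to the reduced number of compression processes noted above.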
In the second embodiment described above, in determining the group to which the aggregated value belongs, the recognition process has been described as being performed on the decoded data obtained by decoding the compressed data generated when the compression process is performed using the maximum quantization value. In contrast to this, in a third embodiment, pseudo-compressed data is generated by performing image processing having an effect equivalent to that of performing the compression process using the maximum quantization value, and the recognition process is performed on the pseudo-compressed data. Consequently, according to the third embodiment, the optimum quantization value may be designated with a still smaller number of compression processes as compared with the second embodiment. The third embodiment will be described below focusing on differences from the second embodiment described above.
First, a functional configuration of an analysis device 120 according to the third embodiment will be described.
The image processing unit 1110 performs a filtering process on the image data acquired by an input unit 310, for example, using a low-pass filter. This causes the image processing unit 1110 to generate the pseudo-compressed data having a similar effect to the effect of performing the compression process on the image data using the maximum quantization value.
In addition, the image processing unit 1110 inputs the generated pseudo-compressed data to a CNN unit 320. This causes the CNN unit 320 to perform the recognition process on the pseudo-compressed data and causes an important feature map generation unit 350 to generate the important feature map based on the CNN unit structure information. Furthermore, an aggregation unit 360 aggregates the important feature map in block units, and the quantization value designation unit 820 determines a group to which the aggregated value of each block belongs, from the group information stored in a group information storage unit 830, whereby the output unit 340 is notified of the optimum quantization value.
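The filtering process performed by the image processing unit 1110 may be sketched as follows. The embodiment only states that a low-pass filter is used; the box (moving average) filter, the `radius` parameter, the replicated-border handling, and the function name `box_low_pass` are assumptions chosen for a minimal, self-contained example on a single-channel image.

```python
import numpy as np

def box_low_pass(image, radius=1):
    """Average each pixel with its neighbors using a (2r+1) x (2r+1) window.

    Borders are handled by replicating edge pixels (an assumption).
    """
    k = 2 * radius + 1
    padded = np.pad(image.astype(float), radius, mode="edge")
    out = np.zeros(image.shape, dtype=float)
    # Accumulate the k*k shifted copies of the padded image.
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + image.shape[0], dx:dx + image.shape[1]]
    return out / (k * k)

image = np.zeros((5, 5))
image[2, 2] = 9.0               # a single sharp detail
pseudo = box_low_pass(image)    # the detail is spread over its neighbors
```

The sharp detail is smoothed without running an encoder, which is the sense in which the pseudo-compressed data stands in for data compressed with the maximum quantization value.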
Next, a specific example of processing by the quantization value designation unit 820 will be described.
Note that the quantization value designation unit 820 determines which group each block belongs to, based on the acquired aggregated value of each block and notifies the output unit 340 of the optimum quantization value associated with the determined group, as the optimum quantization value of each block.
Next, a flow of an image compression process by a compression processing system 100 will be described.
In step S1301, the image processing unit 1110 generates the pseudo-compressed data by the filtering process using the low-pass filter and inputs the generated pseudo-compressed data to the CNN unit 320.
In step S1302, the CNN unit 320 performs the recognition process on the input pseudo-compressed data and outputs the recognition result.
As is clear from the above description, the analysis device according to the third embodiment performs the filtering process on the image data and acquires the pseudo-compressed data. In addition, the analysis device according to the third embodiment generates the important feature map indicating the degree of influence on the recognition result, based on the CNN unit structure information when the recognition process was performed by inputting the pseudo-compressed data to the learned model. Furthermore, the analysis device according to the third embodiment aggregates the degree of influence in block units, based on the important feature map and, by determining a group to which the aggregated value belongs, designates the quantization value associated with the group, as the optimum quantization value.
Consequently, according to the third embodiment, the compression process may be performed using the optimum quantization value designated based on the degree of influence on the recognition result. That is, according to the third embodiment, an effect similar to the effect of the first embodiment described above is obtained. In addition, according to the third embodiment, the optimum quantization value may be designated with a smaller number of compression processes as compared with the first and second embodiments described above.
In the first embodiment described above, the compression process has been described as being performed using different quantization values each time one piece of image data in frame units is input, to designate the optimum quantization value. In contrast to this, in the fourth embodiment, the compression process is performed using different quantization values while a plurality of pieces of image data in frame units is input and the optimum quantization value is designated. The fourth embodiment will be described below focusing on differences from the first embodiment described above.
First, a functional configuration of an analysis device 120 according to the fourth embodiment will be described.
The position determination unit 1410 extracts, from the recognition result output from a CNN unit 320, position information on the object included in the image data or in the decoded data obtained by decoding the compressed data.
In addition, the position determination unit 1410 notifies the quantization value setting unit 1420 of the extracted position information.
The quantization value setting unit 1420 notifies an output unit 1430 of the compression level (quantization value) used when an image compression device 130 performs the compression process. Starting from the minimum quantization value, the quantization value setting unit 1420 sequentially notifies the output unit 1430 of quantization values each obtained by adding a predetermined increment to the previous value.
In addition, each time it makes a notification of the quantization value, the quantization value setting unit 1420 monitors the aggregated value of each block notified by an aggregation unit 360 and, when the aggregated value of a block exceeds a predetermined threshold value, lowers the quantization value. In this manner, the quantization value setting unit 1420 is capable of controlling the quantization value to be notified such that the aggregated value does not exceed the predetermined threshold value.
Note that the quantization value setting unit 1420 specifies a block of which the aggregated value is monitored, based on the position information on the object notified by the position determination unit 1410 and controls the quantization value of the specified block, based on the aggregated value of the specified block.
Next, a specific example of processing by the quantization value setting unit 1420 will be described.
The decoded data 1511 to 1514 obtained by decoding the compressed data each includes an object 1521. The example in
The quantization value setting unit 1420 specifies the position of the object 1521 in the decoded data 1511 to 1514 obtained by decoding the compressed data, based on the position information notified by the position determination unit 1410.
In addition, the quantization value setting unit 1420 acquires the aggregated value of each block included in the specified position, from the aggregation unit 360. In
The example in
Here, it is assumed that the aggregated value (reference sign 1533) of a block included in the object 1521, which has been calculated by performing the recognition process on the decoded data 1513 obtained by decoding the compressed data when the compression process is performed using the quantization value Qx+3, exceeds a predetermined threshold value 1530.
In this case, the quantization value setting unit 1420 makes the quantization value of which the notification is to be made next, be a quantization value smaller than the quantization value Qx+3 (the example in
In this manner, by controlling the quantization value of which the notification is to be made such that the aggregated value of each block included in the object does not exceed a predetermined threshold value, the quantization value setting unit 1420 may continuously make notifications of the optimum quantization value.
Next, a flow of an image compression process by a compression processing system 100 will be described.
In step S1601, the aggregation unit 360 aggregates the degree of influence of each area in block units, based on the important feature map.
In step S1602, the quantization value setting unit 1420 specifies the position of the object, based on the position information notified by the position determination unit 1410 and determines whether or not the aggregated value of each block included in the specified position of the object exceeds a predetermined threshold value.
When it is determined in step S1602 that the predetermined threshold value is not exceeded (in the case of No in step S1602), the process proceeds to step S1603.
In step S1603, the quantization value setting unit 1420 increases the quantization value by a predetermined increment and notifies the output unit 1430 of the increased quantization value.
On the other hand, when it is determined in step S1602 that the predetermined threshold value is exceeded (in the case of Yes in step S1602), the process proceeds to step S1604.
In step S1604, the quantization value setting unit 1420 decreases the quantization value by a predetermined increment and notifies the output unit 1430 of the decreased quantization value.
In step S1605, the image compression device 130 performs the compression process on the image data, using the quantization value transmitted from the output unit 1430 and stores the compressed data in a storage device 140.
In step S1606, the input unit 310 determines whether or not to end the image compression process and, when it is determined not to end (in the case of No in step S1606), the process returns to step S702. On the other hand, when it is determined in step S1606 to end (in the case of Yes in step S1606), the image compression process ends.
As is clear from the above description, the analysis device according to the fourth embodiment acquires each piece of compressed data when the compression process is performed on each of a plurality of pieces of the image data using different quantization values. In addition, the analysis device according to the fourth embodiment generates the important feature map indicating the degree of influence on the recognition result, based on the CNN unit structure information when the decoded data obtained by decoding each piece of the compressed data was input to the learned model and the recognition process was performed. Furthermore, the analysis device according to the fourth embodiment aggregates the important feature map in block units and acquires the aggregated values of the blocks included in the position of the object. Moreover, the analysis device according to the fourth embodiment controls the quantization value such that the acquired aggregated value does not exceed a predetermined threshold value.
In this manner, by controlling the quantization value such that the aggregated value of each block included in the object does not exceed a predetermined threshold value, the analysis device according to the fourth embodiment may continuously output the optimum quantization value.
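The per-frame control of steps S1602 to S1604 may be sketched as follows. The function name `next_quantization`, the default increment of 1, and the clamping range of 0 to 51 are hypothetical values introduced for illustration; the embodiment specifies only that the quantization value is raised or lowered by a predetermined increment depending on the threshold comparison.

```python
def next_quantization(q, object_block_values, threshold,
                      step=1, q_min=0, q_max=51):
    """Return the quantization value to be notified for the next frame.

    object_block_values holds the aggregated values of the blocks that
    lie at the position of the object. Clamping to [q_min, q_max] is an
    assumption added so that the value stays in a usable range.
    """
    if any(value > threshold for value in object_block_values):
        q -= step  # corresponds to step S1604: threshold exceeded
    else:
        q += step  # corresponds to step S1603: threshold not exceeded
    return max(q_min, min(q_max, q))

# Four consecutive frames; the third one exceeds the threshold.
q = 30
history = []
for values in ([0.2, 0.3], [0.4, 0.5], [0.9, 0.6], [0.5, 0.4]):
    q = next_quantization(q, values, threshold=0.8)
    history.append(q)
```

In this sketch the quantization value climbs while the object blocks stay below the threshold and steps back once a block exceeds it, so the notified value hovers around the largest value that the recognition process tolerates.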
In the first to third embodiments described above, the aggregated value has been described as being calculated for each block, and the optimum quantization value has been described as being designated for each block. In contrast to this, in the fifth embodiment, comparison with the aggregated value of a reference block is made, and the optimum quantization value is designated based on the comparison result. The fifth embodiment will be described below focusing on differences from the first embodiment described above.
Here, in the example in
In this case, for example, the quantization value designation unit calculates
In step S1801, the quantization value designation unit compares the aggregated value of the reference block and the aggregated value of each block and designates the optimum quantization value of each block, based on the optimum quantization value of the reference block and the comparison result.
In this manner, by comparing with the aggregated value of the reference block and designating the optimum quantization value based on the comparison result, according to the fifth embodiment, the compression process may be performed at a compression level equal to or higher than a predetermined compression level, regardless of the image data. In addition, according to the fifth embodiment, the quantization values may be aligned between the blocks.
In the first to third embodiments described above, the aggregated value has been described as being calculated for each block, and the quantization value has been described as being designated based on the calculated aggregated value. In contrast to this, in the sixth embodiment, by correcting the quantization value preset in an image compression device 130 (the quantization value set based on the human visual characteristics) using the calculated aggregated value, the optimum quantization value is designated. The sixth embodiment will be described below focusing on differences from the first embodiment described above.
In addition, in
In addition, in
Qa(x, y)=Qpb(x, y)+P(x, y)×Weighting Factor (Equation 1)
Note that, in equation 1, Qa(x, y) refers to the optimum quantization value of a block specified by coordinates (x, y). In addition, in equation 1, Qpb(x, y) refers to a quantization value of the block specified by the coordinates (x, y), which is a quantization value preset in the image compression device 130. Furthermore, in equation 1, P(x, y) refers to an aggregation result of the block specified by the coordinates (x, y) when the recognition process is performed on the decoded data obtained by decoding the predetermined compressed data.
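Equation 1 may be evaluated for all blocks at once as sketched below. The sign and magnitude of the weighting factor, the rounding to an integer, and the clamping range of 0 to 51 are assumptions made for the example; the function name `corrected_quantization` is likewise hypothetical.

```python
import numpy as np

def corrected_quantization(q_preset, aggregated, weight, q_min=0, q_max=51):
    """Apply Qa(x, y) = Qpb(x, y) + P(x, y) * weighting factor per block.

    Rounding and clamping are assumptions added so that the result is a
    usable integer quantization value.
    """
    q = np.asarray(q_preset, dtype=float) + np.asarray(aggregated) * weight
    return np.clip(np.rint(q), q_min, q_max).astype(int)

q_preset = np.full((2, 2), 30)                    # Qpb: preset values
aggregated = np.array([[0.0, 2.0], [4.0, 10.0]])  # P: aggregation results
# A negative weighting factor is assumed here so that blocks with a
# higher degree of influence receive a smaller quantization value.
q_a = corrected_quantization(q_preset, aggregated, weight=-1.5)
```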
Next, a flow of an image compression process by a compression processing system 100 will be described.
In step S2001, the quantization value designation unit determines whether or not a precise recognition result has been output from the CNN unit. When it is determined in step S2001 that a precise recognition result has been output (in the case of Yes in step S2001), the process proceeds to step S704.
In step S704, an important feature map generation unit 350 generates the important feature map indicating the degree of influence of each area on the recognition result, based on the CNN unit structure information.
In step S705, an aggregation unit 360 aggregates the degree of influence of each area in block units, based on the important feature map. In addition, the aggregation unit 360 stores the aggregation result in an aggregation result storage unit 380 in association with the current compression level (quantization value).
In step S2002, a quantization value setting unit 330 raises the compression level (quantization value).
In step S2003, an output unit 340 transmits the image data and the current compression level (quantization value) to the image compression device 130. In addition, the image compression device 130 performs the compression process on the transmitted image data using the current compression level (quantization value) and generates compressed data.
On the other hand, when it is determined in step S2001 that an erroneous recognition result has been output (in the case of No in step S2001), the process proceeds to step S2004.
In step S2004, the quantization value designation unit multiplies the aggregated value of the decoded data regarded as recognizable most recently, by the weighting factor and adds the multiplication result to the quantization value preset in the image compression device 130.
In step S2005, the image compression device 130 performs the compression process on the image data using the quantization value calculated in step S2004 and stores the compressed data in a storage device 140.
In this manner, according to the sixth embodiment, by correcting the quantization value preset in the image compression device (the quantization value set based on the human visual characteristics) using the calculated aggregated value, the optimum quantization value may be designated.
In the first to sixth embodiments described above, the case where the degree of influence on the recognition result is aggregated in block units and the optimum quantization value is designated based on the aggregation result has been described. In contrast to this, in a seventh embodiment, the image data is divided into an effective area and an invalid area based on the aggregation result, and after the blocks included in the invalid area are invalidated, the compression process is performed on the effective area.
Note that invalidation of the blocks included in the invalid area means, for example, making the pixel value of each pixel of the blocks included in the invalid area be “0”, and image data in which the blocks included in the invalid area are invalidated will be hereinafter referred to as “invalidated image data”.
In this manner, by performing the compression process on the invalidated image data (on the effective area included in the invalidated image data), the data size of the compressed data may be further reduced as compared with the case where the compression process is performed on the entire image data.
Note that, in performing the compression process on the invalidated image data, a quantization value assigned in advance may be used, or the optimum quantization value designated based on the methods described in the first to sixth embodiments described above may be used. In addition, in the case of a compression scheme capable of performing the compression process on data in any form, the compression process may be performed on data obtained by removing the invalid area of the invalidated image data. The seventh embodiment will be described below focusing on differences from the first embodiment described above.
First, a functional configuration of an analysis device 120 according to the seventh embodiment will be described.
The invalid area determination unit 2110 determines whether or not each block is a block belonging to the invalid area, based on the aggregated value of the degree of influence of each block on the recognition result (a number of aggregated values according to the number of quantization values) stored in an aggregation result storage unit 380.
Note that, in determining whether or not each block is a block belonging to the invalid area, the invalid area determination unit 2110 first acquires the recognition result from a CNN unit 320 and specifies the quantization value at which the precise recognition result was not output. Subsequently, the invalid area determination unit 2110 determines whether or not each block is a block belonging to the invalid area, based on whether or not the difference between the aggregated value corresponding to the minimum quantization value and the aggregated value corresponding to the specified quantization value is equal to or greater than a predetermined threshold value.
In addition, the invalid area determination unit 2110 notifies the invalidated image generation unit 2120 of the block determined to belong to the invalid area.
The invalidated image generation unit 2120 generates invalidated image data in which the block notified by the invalid area determination unit 2110, among the respective blocks included in the image data, is invalidated. Furthermore, the invalidated image generation unit 2120 notifies an output unit 340 of the generated invalidated image data.
Next, a specific example of processing by the invalid area determination unit 2110 will be described.
The invalid area determination unit 2110 calculates the difference between the aggregated value corresponding to the minimum quantization value and the aggregated value corresponding to the unrecognizable quantization value. The example in
The invalid area determination unit 2110 determines whether or not the corresponding block is a block belonging to the invalid area, based on whether or not the calculated difference is equal to or greater than a predetermined threshold value.
The example in
Next, a specific example of the invalidated image data generated by the invalidated image generation unit 2120 will be described.
In invalidated image data 2300 illustrated in
The output unit 340 invalidates each block included in the area 2301 and transmits image data (invalidated image data 2300) made up of the respective blocks included in the area 2302 to an image compression device 130.
This causes the image compression device 130 to generate the compressed data by performing the compression process on the invalidated image data 2300. Therefore, the data size of the compressed data may be further reduced as compared with the case where the compression process is performed on the entire image data using the optimum quantization value.
Note that, when the image compression device 130 performs the compression process on the invalidated image data 2300, an analysis device 120 may calculate an optimum quantization value according to the degree of influence on the recognition result for each block included in the area 2302 and may transmit the calculated optimum quantization value to the image compression device 130.
Consequently, the data size of the compressed data may be still further reduced as compared with the case where the compression process is performed on the invalidated image data 2300 using a quantization value assigned in advance.
Next, a flow of an image compression process by a compression processing system 100 will be described.
In step S2401, the invalid area determination unit 2110 determines whether or not a precise recognition result has been output from the CNN unit 320. When it is determined in step S2401 that a precise recognition result has been output (in the case of Yes in step S2401), the process returns to step S702.
On the other hand, when it is determined in step S2401 that a precise recognition result has not been output (in the case of No in step S2401), the process proceeds to step S2402.
In step S2402, the invalid area determination unit 2110 calculates the difference between the aggregated value associated with the minimum quantization value and the aggregated value associated with the quantization value at the time of being unrecognizable, for each block. In addition, the invalid area determination unit 2110 determines whether or not each block is a block belonging to the invalid area, based on the calculated difference.
In step S2403, the invalidated image generation unit 2120 generates the invalidated image data by invalidating the block belonging to the invalid area.
In step S2404, the output unit 340 transmits the invalidated image data to the image compression device 130. In addition, the image compression device 130 performs the compression process on the invalidated image data and stores the compressed data in a storage device 140. Note that the image compression device 130 performs the compression process using the last quantization value at which the precise recognition result was output, that is, the quantization value immediately before the one for which it is determined that the precise recognition result was not output.
As is clear from the above description, the analysis device according to the seventh embodiment acquires each piece of compressed data when the compression process is performed on the image data using different quantization values. In addition, the analysis device according to the seventh embodiment generates the important feature map indicating the degree of influence on the recognition result, based on the CNN unit structure information when the recognition process was performed by inputting the decoded data obtained by decoding each piece of the compressed data to the learned model and aggregates the degree of influence for each block. Furthermore, the analysis device according to the seventh embodiment determines whether or not each block belongs to the invalid area, based on the difference between the aggregated value corresponding to the quantization value when the precise recognition result was not output and the aggregated value corresponding to the minimum quantization value. Moreover, the analysis device according to the seventh embodiment performs the compression process on the invalidated image data in which the block belonging to the invalid area is invalidated.
In this manner, by performing the compression process on the image data in which the invalid area determined based on the degree of influence on the recognition result is invalidated, an effect similar to the effect of the first embodiment described above is obtained, and additionally, the data size of the compressed data may be further reduced as compared with the first embodiment described above.
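The determination of step S2402 and the invalidation of step S2403 may be sketched as follows. The description states only that the determination is based on whether the difference is equal to or greater than a threshold value; treating blocks whose difference is less than the threshold as the invalid area, taking the absolute difference, and the function names are assumptions of this sketch.

```python
import numpy as np

def invalid_block_mask(agg_min_q, agg_fail_q, threshold):
    """Return True for blocks treated as belonging to the invalid area.

    agg_min_q:  aggregated values at the minimum quantization value.
    agg_fail_q: aggregated values at the quantization value at which the
                precise recognition result was not output.
    Assumption: a block whose influence barely changes (difference below
    the threshold) is regarded as not involved in recognition.
    """
    diff = np.abs(np.asarray(agg_fail_q) - np.asarray(agg_min_q))
    return diff < threshold

def invalidate(image, mask, block):
    """Set every pixel of each invalid block to 0."""
    out = image.copy()
    for (by, bx), invalid in np.ndenumerate(mask):
        if invalid:
            out[by * block:(by + 1) * block, bx * block:(bx + 1) * block] = 0
    return out

image = np.full((4, 4), 7)
agg_min = np.array([[1.0, 1.0], [1.0, 1.0]])
agg_fail = np.array([[1.1, 3.0], [1.0, 0.2]])
mask = invalid_block_mask(agg_min, agg_fail, threshold=0.5)
invalidated = invalidate(image, mask, block=2)
```

Runs of zero-valued pixels cost very little to encode in typical block-based codecs, which is consistent with the further reduction in data size described above.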
In the seventh embodiment described above, the block belonging to the invalid area has been described as being determined based on the degree of influence on the recognition result. In contrast to this, in the eighth embodiment, the block belonging to the effective area is determined based on the degree of influence on the recognition result.
Note that, in the eighth embodiment, in determining the block belonging to the effective area, the minimal effective area is first set, and the effective area is fixed by gradually expanding the effective area according to changes in the aggregated value of each block when the quantization value is raised. In this manner, in the eighth embodiment, a decrease in recognition accuracy due to raising the quantization value is covered by the expansion of the effective area, whereby a larger quantization value may be designated as the optimum quantization value. The eighth embodiment will be described below focusing on differences from the seventh embodiment described above.
First, a functional configuration of an analysis device 120 according to the eighth embodiment will be described.
The initial invalidated image generation unit 2510 generates invalidated image data including a preset minimal effective area (referred to as initial invalidated image data). In addition, the initial invalidated image generation unit 2510 notifies an output unit 340 of the generated initial invalidated image data.
The effective area determination unit 2520 reads the aggregation result from an aggregation result storage unit 380 and determines whether or not the effective area is to be expanded, based on the amount of change in the aggregated value of each block with respect to the change in the quantization value. In addition, when it is determined that the effective area is to be expanded, the effective area determination unit 2520 notifies the invalidated image generation unit 2530 of the expanded effective area.
The invalidated image generation unit 2530 invalidates the blocks belonging to the area (invalid area) other than the expanded effective area notified by the effective area determination unit 2520 and generates the invalidated image data. In addition, the invalidated image generation unit 2530 notifies the output unit 340 of the generated invalidated image data.
Next, a specific example of processing by the effective area determination unit 2520 will be described.
In the initial invalidated image data 2610, the hatched area is an invalid area 2611. On the other hand, in the initial invalidated image data 2610, a non-hatched area 2612 is the minimal effective area.
Here, an image compression device 130 performs the compression process on the initial invalidated image data 2610 based on different quantization values. This causes a CNN unit 320 to perform the recognition process on the decoded data obtained by decoding the compressed data corresponding to each quantization value and causes an aggregation unit 360 to aggregate the degree of influence on the recognition result corresponding to each quantization value in block units.
In
The effective area determination unit 2520 calculates a difference Δx between the aggregated value corresponding to the current quantization value and the aggregated value corresponding to the minimum quantization value for the block 2612_1, for example. Based on this difference, the effective area determination unit 2520 determines whether or not the effective area should be expanded to a block adjacent to the block 2612_1.
Similarly, the effective area determination unit 2520 calculates a difference Δx+1 between the aggregated value corresponding to the current quantization value and the aggregated value corresponding to the minimum quantization value for the block 2612_2. Based on this difference, the effective area determination unit 2520 determines whether or not the effective area should be expanded to a block adjacent to the block 2612_2.
Note that the effective area determination unit 2520 makes a similar determination for all the blocks located inside the boundary position between the effective area and the invalid area.
The example in
Note that the effective area determination unit 2520 notifies the invalidated image generation unit 2530 of the expanded effective area in which a block adjacent to the block 2612_2 is included into the effective area, and the invalidated image generation unit 2530 generates the invalidated image data based on the notified expanded effective area.
In
As illustrated in
In this manner, the effective area determination unit 2520 fixes the effective area by gradually expanding the effective area according to the change in the aggregated value of each block when the quantization value is raised. Note that, when the aggregated value of a block located inside the boundary position between the effective area and the invalid area is lowered by including an adjacent block into the effective area, and the difference with the aggregated value corresponding to the minimum quantization value becomes less than the predetermined threshold value, the effective area determination unit 2520 continues the expansion of the effective area.
On the other hand, when the aggregated value of a block located inside the boundary position between the effective area and the invalid area is not lowered although an adjacent block is included into the effective area, and the difference with the aggregated value corresponding to the minimum quantization value remains equal to or greater than the predetermined threshold value, the effective area determination unit 2520 terminates the expansion of the effective area.
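One expansion pass by the effective area determination unit 2520 may be sketched as follows. The use of the four directly adjacent blocks as the neighborhood, the absolute difference, and the function name `expand_effective_area` are assumptions of this example; the embodiment specifies only that a block adjacent to a boundary block is included into the effective area when the difference is equal to or greater than the threshold value.

```python
import numpy as np

NEIGHBORS = ((-1, 0), (1, 0), (0, -1), (0, 1))  # assumed 4-neighborhood

def expand_effective_area(effective, agg_current, agg_min_q, threshold):
    """Return the effective-area mask after one expansion pass.

    effective:   boolean mask per block, True inside the effective area.
    agg_current: aggregated values at the current quantization value.
    agg_min_q:   aggregated values at the minimum quantization value.
    """
    rows, cols = effective.shape
    expanded = effective.copy()
    for y in range(rows):
        for x in range(cols):
            if not effective[y, x]:
                continue
            if abs(agg_current[y, x] - agg_min_q[y, x]) < threshold:
                continue  # difference below threshold: no expansion here
            # Include the adjacent blocks; interior blocks are unaffected
            # because their neighbors are already effective.
            for dy, dx in NEIGHBORS:
                ny, nx = y + dy, x + dx
                if 0 <= ny < rows and 0 <= nx < cols:
                    expanded[ny, nx] = True
    return expanded

effective = np.zeros((3, 3), dtype=bool)
effective[1, 1] = True                  # minimal effective area
agg_min = np.ones((3, 3))
agg_cur = np.ones((3, 3))
agg_cur[1, 1] = 2.0                     # influence changed at the boundary
expanded = expand_effective_area(effective, agg_cur, agg_min, threshold=0.5)
unchanged = expand_effective_area(effective, agg_cur, agg_min, threshold=5.0)
```

Repeating the pass after each recompression and recognition, and stopping when a pass no longer brings the difference below the threshold, would correspond to the termination condition described above.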
Next, a flow of an image compression process by a compression processing system 100 will be described.
In step S2701, an input unit 310 acquires image data in frame units.
In step S2702, the CNN unit 320 performs the recognition process on the image data to output the recognition result, and an important feature map generation unit 350 generates the important feature map. In addition, the aggregation unit 360 aggregates the degree of influence in block units. Consequently, the aggregated value corresponding to the minimum quantization value is calculated for each block.
In step S2703, a quantization value setting unit 330 initializes the compression level and sets the upper limit of the compression level. In addition, the initial invalidated image generation unit 2510 generates the initial invalidated image data.
In step S2704, the image compression device 130 performs the compression process on the invalidated image data (here, the initial invalidated image data) using the current quantization value and generates the compressed data.
In step S2705, the CNN unit 320 performs the recognition process on the decoded data obtained by decoding the compressed data to output the recognition result, and the important feature map generation unit 350 generates the important feature map. In addition, the aggregation unit 360 aggregates the degree of influence in block units.
In step S2706, for the block inside the boundary position between the effective area and the invalid area, the effective area determination unit 2520 determines whether or not the difference between the aggregated value corresponding to the current quantization value and the aggregated value corresponding to the minimum quantization value is equal to or greater than a predetermined threshold value.
When it is determined in step S2706 that the difference is less than the predetermined threshold value (in the case of No in step S2706), the process proceeds to step S2712.
On the other hand, when it is determined in step S2706 that the difference is equal to or greater than the predetermined threshold value (in the case of Yes in step S2706), the process proceeds to step S2707.
In step S2707, the effective area determination unit 2520 includes a block adjacent to the block whose difference is equal to or greater than the predetermined threshold value, into the effective area and notifies the invalidated image generation unit 2530 of the expanded effective area.
In step S2708, the invalidated image generation unit 2530 generates the invalidated image data based on the expanded effective area.
In step S2709, the image compression device 130 performs the compression process on the invalidated image data using the current quantization value and generates the compressed data.
In step S2710, the CNN unit 320 performs the recognition process on the decoded data obtained by decoding the compressed data to output the recognition result, and the important feature map generation unit 350 generates the important feature map. In addition, the aggregation unit 360 aggregates the degree of influence in block units.
In step S2711, the effective area determination unit 2520 determines whether or not the aggregated value has been lowered and the difference has become less than the predetermined threshold value for the block determined to be equal to or greater than the predetermined threshold value in step S2706.
When it is determined in step S2711 that the difference has become less than the predetermined threshold value (in the case of Yes in step S2711), the process proceeds to step S2712.
In step S2712, the quantization value setting unit 330 raises the compression level (quantization value), and the process returns to step S2704.
On the other hand, when it is determined in step S2711 that the difference remains equal to or greater than the predetermined threshold value (in the case of No in step S2711), the process proceeds to step S2713.
In step S2713, the invalidated image generation unit 2530 generates the invalidated image data based on the effective area immediately before the effective area is expanded in step S2707.
In step S2714, the image compression device 130 performs the compression process on the invalidated image data generated in step S2713, using the compression level (quantization value) immediately before the effective area is expanded in step S2707 and stores the compressed data.
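For reference, the loop formed by steps S2703 to S2714 may be sketched as below. This is a simplified sketch under stated assumptions: `evaluate` is a hypothetical callback standing in for invalidation, compression, decoding, recognition, and aggregation; `neighbors` is a hypothetical adjacency function; the quantization value is raised in steps of one; the loop also stops at the upper limit set in step S2703; and "the quantization value immediately before the effective area is expanded" is interpreted as the last quantization value that passed the check.

```python
def search_effective_area(baseline, evaluate, neighbors, initial_area,
                          q_min, q_max, threshold):
    """Return (effective_area, quantization_value) for the final compression.

    baseline:      dict block -> aggregated value at the minimum quantization
                   value (step S2702).
    evaluate(a,q): hypothetical callback; returns dict block -> aggregated
                   value for effective area `a` at quantization value `q`.
    neighbors(b):  hypothetical callback; returns blocks adjacent to `b`.
    """
    area = set(initial_area)                      # S2703
    q = q_min
    prev_area, prev_q = set(area), q
    while q <= q_max:
        values = evaluate(area, q)                # S2704, S2705
        # Blocks inside the boundary: effective blocks with an outside neighbor.
        boundary = [b for b in area
                    if any(n not in area for n in neighbors(b))]
        failing = [b for b in boundary
                   if values[b] - baseline[b] >= threshold]    # S2706
        if failing:
            expanded = area | {n for b in failing for n in neighbors(b)}  # S2707
            new_values = evaluate(expanded, q)    # S2708 to S2710
            recovered = all(
                new_values[b] < values[b]
                and new_values[b] - baseline[b] < threshold
                for b in failing)                 # S2711
            if not recovered:
                return prev_area, prev_q          # S2713, S2714
            area = expanded
        prev_area, prev_q = set(area), q
        q += 1                                    # S2712
    return prev_area, prev_q
```

Keeping the last accepted pair `(prev_area, prev_q)` lets the sketch fall back to the state before the failed expansion without recomputing it.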
As is clear from the above description, the analysis device according to the eighth embodiment first sets the minimal effective area and gradually expands the effective area according to changes in the aggregated value of each block when the quantization value is raised.
Consequently, with the analysis device according to the eighth embodiment, a decrease in recognition accuracy caused by raising the quantization value may be compensated for by expanding the effective area, and the compression process may be performed with a larger quantization value as the optimum quantization value.
As a result, according to the eighth embodiment, an effect similar to that of the first embodiment described above may be obtained, and additionally, the data size of the compressed data may be reduced further than in the first embodiment described above.
In the eighth embodiment described above, in expanding the effective area, attention is paid to the aggregated value of the block inside the boundary position between the effective area and the invalid area. In contrast to this, in a ninth embodiment, in expanding the effective area, attention is paid to the aggregated values of blocks adjacent with the boundary position interposed (the aggregated value of the block inside and the aggregated value of the block outside the boundary position between the effective area and the invalid area). The ninth embodiment will be described below focusing on differences from the eighth embodiment described above.
First, a functional configuration of an analysis device 120 according to the ninth embodiment will be described.
The differences from the functional configuration illustrated in
The initial effective area setting unit 2820 first sets a minimal effective area in the effective area determination unit 2810.
The effective area determination unit 2810 reads the aggregation result from an aggregation result storage unit 380 and determines whether or not the effective area is to be expanded, based on the aggregated value of each block at each quantization value.
For example, each time the quantization value is raised for the entire image data, compressed data is generated, and the aggregated value of each block is calculated for that compressed data and stored in the aggregation result storage unit 380. The effective area determination unit 2810 acquires the stored aggregated value of each block.
At that time, the effective area determination unit 2810 calculates the difference in the aggregated values between the block located inside and the block located outside the boundary position between the initial effective area and the invalid area (between blocks adjacent with the boundary position interposed). Then, when it is determined that the calculated difference is equal to or greater than a predetermined threshold value, the effective area determination unit 2810 includes the block located outside the boundary position into the effective area.
Also after that, the aggregated value of each block is similarly acquired for each piece of compressed data generated each time the quantization value is raised continuously for the entire image data. At that time, the effective area determination unit 2810 calculates the difference in the aggregated values between the block located inside and the block located outside the boundary position between the expanded effective area and the invalid area. Then, when it is determined that the calculated difference is equal to or greater than the predetermined threshold value, the effective area determination unit 2810 includes the block located outside the boundary position into the effective area.
The invalidated image generation unit 2830 generates the invalidated image data based on the effective area when the expansion of the effective area by the effective area determination unit 2810 is completed. In addition, the invalidated image generation unit 2830 notifies an output unit 340 of the generated invalidated image data.
Next, a specific example of processing by the effective area determination unit 2810 will be described.
Here, the image compression device 130 performs the compression process on the image data 2910 using each quantization value to generate the compressed data. This causes a CNN unit 320 to perform the recognition process on the decoded data obtained by decoding the compressed data corresponding to each quantization value and causes an aggregation unit 360 to aggregate the degree of influence on the recognition result corresponding to each quantization value in block units.
In addition, a graph 2932 indicates the aggregated values of a block 2922 (the block number=“block X+1”) corresponding to each quantization value. Note that the block 2922 is a block outside the boundary position between the initial effective area 2912 and the invalid area 2911 and is a block adjacent to the block 2921.
The effective area determination unit 2810 calculates the difference between the aggregated value of the block 2921 and the aggregated value of the block 2922 corresponding to the current quantization value and determines whether or not the block 2922 is to be included into the effective area by determining whether or not the calculated difference is equal to or greater than a predetermined threshold value.
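The comparison between the block 2921 and the block 2922 may be sketched as follows. This is an illustrative sketch only: the function name and the dictionary inputs are hypothetical, and the difference is assumed to be the absolute difference of the two aggregated values, since the description does not fix its sign.

```python
def first_inclusion_quantization(inner_values, outer_values, threshold):
    """Find the lowest quantization value at which the outer block is included.

    inner_values: dict quantization value -> aggregated value of the block
                  inside the boundary position (e.g. block 2921).
    outer_values: dict quantization value -> aggregated value of the adjacent
                  block outside the boundary position (e.g. block 2922).
    Returns the quantization value, or None if the threshold is never reached.

    Assumption: the difference is taken as an absolute value.
    """
    for q in sorted(inner_values):
        if abs(inner_values[q] - outer_values[q]) >= threshold:
            return q
    return None
```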
Note that, also after that, the image compression device 130 similarly acquires the aggregated value of each block for each piece of compressed data generated each time the quantization value is raised continuously for the entire image data. At that time, the effective area determination unit 2810 calculates the difference in the aggregated values between the block located inside and the block located outside the boundary position between the expanded effective area 2942 and an invalid area 2941. Then, when it is determined that the calculated difference is equal to or greater than the predetermined threshold value, the effective area determination unit 2810 includes the block located outside the boundary position into the effective area.
When the expansion of the effective area is completed, the effective area determination unit 2810 notifies the invalidated image generation unit 2830 of the effective area at the time of completion, and the invalidated image generation unit 2830 generates the invalidated image data based on the notified effective area.
Next, a flow of an image compression process by a compression processing system 100 will be described.
In step S3001, the initial effective area setting unit 2820 sets the initial effective area.
In step S3002, the image compression device 130 performs the compression process on the image data with the current quantization value and generates the compressed data.
In step S3003, the CNN unit 320 performs the recognition process on the decoded data obtained by decoding the compressed data to output the recognition result, and the important feature map generation unit 350 generates the important feature map. In addition, an aggregation unit 360 aggregates the degree of influence in block units.
In step S3004, the effective area determination unit 2810 calculates the difference in the aggregated values between the block inside and the block outside the boundary position for the current effective area and invalid area and determines whether or not the calculated difference in the aggregated values is equal to or greater than a predetermined threshold value.
When it is determined in step S3004 that the difference is less than the predetermined threshold value (in the case of No in step S3004), the process proceeds to step S3006.
On the other hand, when it is determined in step S3004 that the difference is equal to or greater than the predetermined threshold value (in the case of Yes in step S3004), the process proceeds to step S3005.
In step S3005, the effective area determination unit 2810 includes the block outside the boundary position into the effective area.
In step S3006, a quantization value setting unit 330 raises the compression level (quantization value), and the process proceeds to step S3007.
In step S3007, the quantization value setting unit 330 determines whether or not the compression level (quantization value) exceeds the upper limit and, when it is determined that the upper limit is not exceeded (in the case of No in step S3007), the process returns to step S3002.
On the other hand, when it is determined in step S3007 that the upper limit is exceeded (in the case of Yes in step S3007), the process proceeds to step S3008.
In step S3008, the invalidated image generation unit 2830 generates the invalidated image data based on the current effective area.
In step S3009, the image compression device 130 performs the compression process on the invalidated image data and stores the compressed data. Note that the image compression device 130 performs the compression process on the invalidated image data using, for example, the quantization value when the effective area is expanded.
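Steps S3001 to S3009 may be sketched as the loop below. This is a simplified sketch under stated assumptions: `evaluate` is a hypothetical callback standing in for compression of the entire image data, decoding, recognition, and aggregation; `neighbors` is a hypothetical adjacency function; the quantization value is raised in steps of one; the difference is taken as an absolute value; and the quantization value used in step S3009 is assumed to be the one at which the effective area was last expanded.

```python
def expand_by_boundary_difference(evaluate, neighbors, initial_area,
                                  q_min, q_max, threshold):
    """Return (effective_area, quantization_value_at_last_expansion).

    evaluate(q):  hypothetical callback; returns dict block -> aggregated
                  value when the entire image is compressed at `q`.
    neighbors(b): hypothetical callback; returns blocks adjacent to `b`.
    """
    area = set(initial_area)                  # S3001
    q = q_min
    q_at_last_expansion = q_min
    while q <= q_max:                         # S3007 (upper limit check)
        values = evaluate(q)                  # S3002, S3003
        to_add = set()
        for inner in area:                    # S3004
            for outer in neighbors(inner):
                if outer in area:
                    continue
                if abs(values[inner] - values[outer]) >= threshold:
                    to_add.add(outer)
        if to_add:                            # S3005
            area |= to_add
            q_at_last_expansion = q
        q += 1                                # S3006
    return area, q_at_last_expansion          # inputs to S3008, S3009
```

Unlike the eighth embodiment, the aggregated values here do not depend on the current effective area, so a single evaluation per quantization value is sufficient.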
As is clear from the above description, the analysis device according to the ninth embodiment first sets the minimal effective area and gradually expands the effective area according to the difference in the aggregated values between adjacent blocks at the boundary position when the quantization value is raised.
Consequently, with the analysis device according to the ninth embodiment, a decrease in recognition accuracy caused by raising the quantization value may be compensated for by expanding the effective area, and the compression process may be performed with a larger quantization value as the optimum quantization value.
As a result, according to the ninth embodiment, an effect similar to that of the first embodiment described above may be obtained, and additionally, the data size of the compressed data may be reduced further than in the first embodiment described above.
In the first embodiment described above, the compression process has been described as being performed using every quantization value from the minimum quantization value to the maximum quantization value. However, the quantization values used for the compression process are not limited to this, and the compression process may be performed using a predetermined number of quantization values included between the minimum quantization value and the maximum quantization value. The predetermined number of quantization values refers to a number of quantization values that allows the optimum quantization value to be designated, and is two or more.
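One possible way to pick the predetermined number of quantization values is sketched below. The even spacing and the inclusion of both end points are assumptions made for illustration; the description only requires two or more values between the minimum and the maximum. The function name is hypothetical.

```python
def sample_quantization_values(q_min, q_max, count):
    """Pick `count` integer quantization values from q_min to q_max.

    Assumptions: values are evenly spaced and include both end points.
    Duplicates produced by rounding are removed, so fewer than `count`
    values may be returned when the range is narrow.
    """
    if count < 2:
        raise ValueError("at least two quantization values are required")
    if q_max <= q_min:
        raise ValueError("q_max must be greater than q_min")
    step = (q_max - q_min) / (count - 1)
    return sorted({round(q_min + i * step) for i in range(count)})


# Example: six values between 1 and 51.
print(sample_quantization_values(1, 51, 6))  # [1, 11, 21, 31, 41, 51]
```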
In addition, in the first embodiment described above, the image data has been described as including one object. However, the image data may include a plurality of objects. In this case, the CNN unit structure information may be acquired simultaneously for the plurality of objects in the image data, and the compression levels may be designated simultaneously for the plurality of objects. Alternatively, after the CNN unit structure information is acquired separately for the plurality of objects in the image data and the compression levels are designated for each object, the compression level of the entire image data may be designated by merging the compression level of each object.
In addition, in the third embodiment described above, as the image processing when the pseudo-compressed data is generated, the filtering process using a low-pass filter has been described as an example. However, the image processing when the pseudo-compressed data is generated is not limited to this.
For example, the Fourier transform may be performed on the entire image data, and the inverse Fourier transform may be performed after high-frequency components are cut. Alternatively, the Fourier transform may be performed on the image data in block units, and the inverse Fourier transform may be performed after high-frequency components are cut.
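The Fourier-based variant may be sketched as follows for a small grayscale image given as a list of rows. This is an illustrative sketch using a naive discrete Fourier transform written with the standard library only; the function names and the square cutoff rule (`cutoff` as the highest retained frequency index per axis) are assumptions, and a practical implementation would use a fast Fourier transform.

```python
import cmath


def dft(seq, inverse=False):
    """Naive 1-D discrete Fourier transform (or its inverse)."""
    n = len(seq)
    sign = 1 if inverse else -1
    out = [sum(seq[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n)
               for t in range(n)) for k in range(n)]
    return [v / n for v in out] if inverse else out


def dft2(img, inverse=False):
    """2-D transform applied along rows, then along columns."""
    rows = [dft(r, inverse) for r in img]
    cols = [dft(list(c), inverse) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]


def low_pass(img, cutoff):
    """Cut components whose frequency index exceeds `cutoff` on either axis."""
    h, w = len(img), len(img[0])
    spec = dft2(img)
    for u in range(h):
        for v in range(w):
            fu = min(u, h - u)  # distance from DC, accounting for wrap-around
            fv = min(v, w - v)
            if fu > cutoff or fv > cutoff:
                spec[u][v] = 0
    return [[v.real for v in row] for row in dft2(spec, inverse=True)]
```

Applying `low_pass` to each block instead of the whole image gives the block-unit variant mentioned above.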
Alternatively, the entire image data may be transformed by the discrete cosine transform (DCT) and transformed by the inverse DCT after quantization. Alternatively, the image data may be transformed by the DCT in block units and transformed by the inverse DCT after quantization.
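The DCT-based variant may be sketched for a single square block as follows. This is a minimal sketch assuming an orthonormal DCT-II and uniform scalar quantization with a single step size `q`; actual codecs typically apply per-coefficient quantization tables, and the function names here are hypothetical.

```python
import math


def dct_matrix(n):
    """Orthonormal DCT-II basis matrix of size n x n."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows


def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(m):
    return [list(r) for r in zip(*m)]


def pseudo_compress_block(block, q):
    """DCT, uniform quantization with step q, then inverse DCT."""
    c = dct_matrix(len(block))
    ct = transpose(c)
    coeffs = matmul(matmul(c, block), ct)            # forward 2-D DCT
    quantized = [[round(v / q) * q for v in row]     # quantize and dequantize
                 for row in coeffs]
    return matmul(matmul(ct, quantized), c)          # inverse 2-D DCT
```

A larger `q` zeroes more coefficients, so the output loses more detail, which mimics the effect of raising the quantization value without running an actual encoder.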
In addition, in any of the seventh to ninth embodiments described above or an embodiment combining any of the seventh to ninth embodiments described above, an area having a large degree of influence on the recognition result and an area having a small degree of influence on the recognition result may be separated such that
Note that the compression level calculated in each of the above embodiments and the information indicating the effective area or the invalid area may be used as information for designating the processing content of preprocessing for image data in which the reduction of the data size can be expected by performing the compression process. For example, the preprocessing mentioned here includes a process of reducing color information from image data, a process of reducing high-frequency components from the image data, and the like.
Note that the embodiments are not limited to the configurations described here and may include combinations of the configurations or the like described in the above embodiments with other elements, and the like. These points can be altered without departing from the spirit of the embodiments and can be appropriately assigned according to application modes thereof.
All examples and conditional language provided herein are intended for the pedagogical purposes of aiding the reader in understanding the invention and the concepts contributed by the inventor to further the art, and are not to be construed as limitations to such specifically recited examples and conditions, nor does the organization of such examples in the specification relate to a showing of the superiority and inferiority of the invention. Although one or more embodiments of the present invention have been described in detail, it should be understood that the various changes, substitutions, and alterations could be made hereto without departing from the spirit and scope of the invention.
This application is a continuation application of International Application PCT/JP2019/050896 filed on Dec. 25, 2019 and designated the U.S., the entire contents of which are incorporated herein by reference.
 | Number | Date | Country
---|---|---|---
Parent | PCT/JP2019/050896 | Dec 2019 | US
Child | 17751871 | | US