This application is based on and claims priority under 35 USC 119 from Japanese Patent Application No. 2019-150418 filed Aug. 20, 2019.
The present disclosure relates to an image processing apparatus.
Typical image processing apparatuses process an image in units of windows, each including multiple pixels. In such image processing, one line of the image is read and saved on a memory. Data corresponding to one line responsive to the size of the window is read from the memory and then processed. Data transfer between an arithmetic core performing image processing and the memory storing the data on each line is performed in a direct memory access (DMA) fashion.
Japanese Unexamined Patent Application Publication No. 2014-35619 discloses an image processing apparatus including multiple image processing modules, a memory, and a DMA controller (DMAC) that controls memory access of the image processing modules.
Image quality may be controlled by modifying the size of a window (window size) in the image processing performed by window units. Modifying the number of lines of data for the window to modify the window size involves a change in the number of channels of the DMAC that controls the memory access.
Aspects of non-limiting embodiments of the present disclosure relate to saving time used to change the number of channels of the DMAC in changing the window size.
Aspects of certain non-limiting embodiments of the present disclosure overcome the above disadvantages and/or other disadvantages not described above. However, aspects of the non-limiting embodiments are not required to overcome the disadvantages described above, and aspects of the non-limiting embodiments of the present disclosure may not overcome any of the disadvantages described above.
According to an aspect of the present disclosure, there is provided an image processing apparatus. The image processing apparatus includes a processor configured to process an image, a reading direct memory access controller (DMAC) and a writing direct memory access controller (DMAC), each DMAC configured to control direct memory access to a memory. The reading DMAC is configured to read data from the memory and the writing DMAC is configured to write data to the memory. The image processing apparatus further includes an upper first-in first-out (FIFO) unit that is connected to the reading DMAC and the writing DMAC and includes FIFOs of the number equal to the number of channels of each of the reading DMAC and the writing DMAC and a lower FIFO unit that is connected between the upper FIFO unit and the processor and includes FIFOs that correspond to the FIFOs of the upper FIFO unit at a ratio of 1 FIFO in the upper FIFO unit to F FIFOs in the lower FIFO unit (F being an integer equal to 2 or larger). The number of lines of data input to the processor is modifiable in steps of the number of channels of the reading DMAC by not using part of the F FIFOs corresponding to each FIFO of the upper FIFO unit.
Exemplary embodiment of the present disclosure will be described in detail based on the following figures, wherein:
An exemplary embodiment of the disclosure is described with reference to the drawings. An image processing apparatus of the exemplary embodiment of the disclosure acquires image data read from an image and performs image processing with a processor as a processing unit. An image reading device (such as an image input terminal (IIT)) acquires the image data by reading the image one scan line at a time. The image data per line is referred to as line data. The line data is read line by line and saved on a line buffer.
The processor receives the one line read last and retrieves from the line buffer the line data for the multiple (n−1) lines read immediately before that line, performs image processing on the line data for a total of n lines, and outputs processed data for one line. Specifically, the processor performs the image processing on the line data for the n lines in window units with each window including m pixels×n lines and generates processed data for one pixel representing the window. The processor performs the image processing along the line while shifting the window by one pixel at a time and acquires and outputs the processed data for one line including processed data for a consecutive series of pixels.
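The window-based processing described above can be sketched as follows. This is a minimal illustration only: the function names `process_lines` and `mean_kernel`, the choice of an averaging kernel, and the absence of edge padding (so the output is shorter than the input by m−1 pixels) are all assumptions made for the example and are not taken from the disclosure.

```python
def process_lines(lines, m, kernel):
    """Slide an m-pixel x n-line window along n equal-length lines.

    `lines` holds the n pieces of line data (newest first or oldest first;
    the order does not matter for this sketch). One output pixel is
    produced per window position, shifting by one pixel at a time.
    """
    width = len(lines[0])
    out = []
    for x in range(width - m + 1):
        # The window covers pixels x .. x+m-1 on every one of the n lines.
        window = [row[x:x + m] for row in lines]
        out.append(kernel(window))
    return out


def mean_kernel(window):
    """Hypothetical kernel: integer mean of all pixels in the window."""
    total = sum(sum(row) for row in window)
    count = sum(len(row) for row in window)
    return total // count


# Five identical lines of seven pixels, processed with a 5 x 5 window.
five_lines = [[1, 2, 3, 4, 5, 6, 7] for _ in range(5)]
processed = process_lines(five_lines, m=5, kernel=mean_kernel)
```

Each call consumes n lines and yields a single processed line, which mirrors the one-line-in, one-line-out behavior described for the processor.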
The processor 10 performs the image processing on the acquired line data for the 5 lines (line N0 through N4) while shifting by one pixel at a time the window having a window size of 5 pixels×5 lines and generates the processed data. The processed data is generated as position data of the position of any pixel in the window. Referring to
As described above, the window is a square having a window size of 5 pixels×5 lines. The number of lines of the window size is defined more in detail by the number of pieces of line data input to the processor 10. The number of pixels may be set in accordance with the throughput of the processor 10 or the like. The window set in the processor 10 may not necessarily be limited to the square window that is defined by the number of pixels equal to the number of lines. When the window size is quoted in the following discussion, the number of lines is specified while the number of pixels is not specified.
The processor 10 is a digital filter that performs a filtering process on digital data of an image. The processor 10 performs the filtering process in window units with each window including multiple pixels. As described above, the processor 10 reads the line data of the number of pieces responsive to the window size (lines N0 through N8 in
The memory 20 saves the line data of pixels that are read line by line. The memory 20 may be a dynamic random-access memory (DRAM), a static random-access memory (SRAM), or the like. The memory 20 may be a shared memory that is a working memory used in the arithmetic operation of a central processing unit (CPU).
The DMAC 30 controls memory access in the image processing apparatus 1. The DMAC 30 includes the writing DMAC 31 that is used to write data on the memory 20 and the reading DMAC 32 that is used to read data from the memory 20. The writing DMAC 31 writes data on one channel. The reading DMAC 32 reads data on multiple channels. In the configuration in
The upper FIFO unit 40 and lower FIFO unit 50 are buffers that are used to write or read data on the memory 20 through the DMAC 30. The upper FIFO unit 40 is connected to the DMAC 30. The lower FIFO unit 50 is connected between the upper FIFO unit 40 and the processor 10.
The upper FIFO unit 40 includes first-in first-outs (FIFOs), the number of which corresponds to the number of channels of each of the writing DMAC 31 and reading DMAC 32. The number of FIFOs in the upper FIFO unit 40 connected to the writing DMAC 31 corresponds to the number of channels of the writing DMAC 31, namely, the number of FIFOs is 1. The FIFO connected to the writing DMAC 31 is a FIFO 45. The number of FIFOs in the upper FIFO unit 40 connected to the reading DMAC 32 corresponds to the number of channels of the reading DMAC 32, namely, the number of FIFOs is C (C is an integer equal to or above). In the configuration in
The lower FIFO unit 50 includes F FIFOs corresponding to each of the FIFOs 41 through 45 in the upper FIFO unit 40 (F is an integer equal to or above 2). In the configuration in
Let n represent the total number of pieces of the line data input to the processor 10. Specifically, the total number of pieces of line data read from the memory 20 out of the line data to be input to the processor 10 is (n−1). Since the lower FIFO unit 50 is connected to the processor 10, the number of FIFOs in the lower FIFO unit 50 is (n−1) that corresponds to the number of pieces of the line data read from the memory 20 and input to the processor 10. Specifically, the number of FIFOs in the lower FIFO unit 50 on the side of the reading DMAC 32 is expressed by the following equation:
C×F=n−1.
Since C=4 and n=9 (9 lines from lines N0 through N8) in the configuration in
The bus bridge 60 is a bridge circuit arranged between the memory 20 and the DMAC 30. The DMAC 30 is connected to the memory 20 via the bus bridge 60 and writes or reads data on the memory 20.
Data transfer between the processor 10 and the line buffer is described with reference to the configuration in
The layout order of the input ports N1 through N8 agrees with the age of the line data. Specifically, the line data read immediately before the line data input to the input port N0 is input to the input port N1 and the line data read one line before is input to the input port N2. Similarly, the line data read further before is input to the following ports. Finally, the oldest line data read 8 lines before is input to the input port N8.
The input ports N1 through N8 are respectively connected to FIFOs 51a and 51b through FIFOs 54a and 54b in the lower FIFO unit 50 on the side of the reading DMAC 32. Specifically, the input port N1 is connected to the FIFO 51a in the lower FIFO unit 50. The input port N2 is connected to the FIFO 52a in the lower FIFO unit 50. The input port N3 is connected to the FIFO 53a in the lower FIFO unit 50. The input port N4 is connected to the FIFO 54a in the lower FIFO unit 50. The input port N5 is connected to the FIFO 51b in the lower FIFO unit 50. The input port N6 is connected to the FIFO 52b in the lower FIFO unit 50. The input port N7 is connected to the FIFO 53b in the lower FIFO unit 50. The input port N8 is connected to the FIFO 54b in the lower FIFO unit 50.
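The port-to-FIFO wiring listed above follows a regular pattern: the channel is selected by the port index modulo C, and the FIFO suffix (a, b, ...) by the port index divided by C. A minimal sketch of that pattern is given below. The function name `lower_fifo_for_port` is hypothetical, and the labels 51 through 54 simply reuse the reference numerals of this configuration, so they are only meaningful for C=4.

```python
import string


def lower_fifo_for_port(k, channels=4, fifos_per_channel=2):
    """Return the label of the lower FIFO feeding input port Nk (k >= 1)."""
    total = channels * fifos_per_channel  # equals n - 1
    if not 1 <= k <= total:
        raise ValueError(f"port N{k} is outside N1..N{total}")
    channel = (k - 1) % channels   # 0..C-1, i.e. FIFOs 51..54
    bank = (k - 1) // channels     # 0..F-1, i.e. suffix a, b, ...
    return f"{51 + channel}{string.ascii_lowercase[bank]}"


# Wiring for C = 4 and F = 2 (input ports N1 through N8).
mapping = {f"N{k}": lower_fifo_for_port(k) for k in range(1, 9)}
```

With C=4 and F=2 this reproduces the eight connections enumerated above, for example N1 to FIFO 51a and N5 to FIFO 51b.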
The processor 10 includes the output port (N4 (output)) used to output, as process results, output data and two output ports (N0 and N4) used to output the line data in the original input state. The output ports correspond to the center line (N4) from among all line data N0 through N8 input via the input ports. In the following discussion, the output port that is used to output the input line data just as it is input is referred to as a “transfer port” to differentiate it from the output port that is used to output the processed data. The transfer port N0 outputs the line data that has been input to the input port N0 and the transfer port N4 outputs the line data that has been input to the input port N4.
The transfer port is typically set as described below. The output port corresponding to the input port N0 is the transfer port N0. From among the input ports corresponding to the line data from the line buffer, input ports that are at locations away from N1 (latest) and correspond to an integer multiple of the number of channels C of the reading DMAC 32 are N4 and N8 in
The transfer port N0 is connected to the FIFO 55a in the lower FIFO unit 50. The transfer port N4 is connected to the FIFO 55b in the lower FIFO unit 50. The line data output from the transfer ports N0 and N4 are supplied to the FIFOs 55a and 55b in the lower FIFO unit 50 on the side of the writing DMAC 31. In the example in
Referring to
The line data saved at the FIFO 55a and the line data saved at the FIFO 55b in the lower FIFO unit 50 are alternately transferred to the FIFO 45 in the upper FIFO unit 40. In the configuration in
Typically, in the FIFO 45 in the upper FIFO unit 40 and the FIFOs 55a and 55b in the lower FIFO unit 50 on the side of the writing DMAC 31, the line data transferred from the processor 10 to the lower FIFO unit 50 is successively transferred piece by piece in order from the FIFOs 55a and 55b to the FIFO 45 in the upper FIFO unit 40. The writing DMAC 31 then writes the line data in order on the memory 20.
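The alternating transfer from the lower FIFOs to the single upper FIFO on the writing side can be sketched as a round-robin merge. The sketch below is illustrative only: the function name `merge_round_robin` is hypothetical, and the granularity of one piece per FIFO per turn is an assumption based on the phrase "piece by piece" above.

```python
from collections import deque


def merge_round_robin(lower_fifos):
    """Drain several lower FIFOs into one upper FIFO, one piece per turn."""
    upper = deque()
    while any(lower_fifos):          # an empty deque is falsy
        for fifo in lower_fifos:
            if fifo:
                upper.append(fifo.popleft())
    return upper


# Pieces are tagged (line label, synchronization information).
fifo_55a = deque([("LN0", "P1"), ("LN0", "P2")])
fifo_55b = deque([("LN4", "P1"), ("LN4", "P2")])
fifo_45 = merge_round_robin([fifo_55a, fifo_55b])
```

Under this assumption, the upper FIFO receives the pieces from the two transfer ports interleaved, so a single-channel writing DMAC can write both lines back to the memory.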
The reading DMAC 32 reads from the memory 20 via the bus bridge 60 the line data corresponding to the number of channels of the reading DMAC 32 and causes the FIFOs 41 through 44 in the upper FIFO unit 40 to save the read line data. The FIFO 41 alternately saves the line data to be transferred to the FIFO 51a in the lower FIFO unit 50 and the line data to be transferred to the FIFO 51b in the lower FIFO unit 50. In the example in
The FIFO 42 alternately saves the line data to be supplied to the FIFO 52a in the lower FIFO unit 50 and the line data to be supplied to the FIFO 52b in the lower FIFO unit 50. In the example in
The FIFO 43 alternately saves the line data to be supplied to the FIFO 53a in the lower FIFO unit 50 and the line data to be supplied to the FIFO 53b in the lower FIFO unit 50. In the example in
The FIFO 44 alternately saves the line data to be supplied to the FIFO 54a in the lower FIFO unit 50 and the line data to be supplied to the FIFO 54b in the lower FIFO unit 50. In the example in
The FIFO 41 alternately supplies the line data to the FIFO 51a in the lower FIFO unit 50 and the line data to the FIFO 51b in the lower FIFO unit 50. In the example in
Typically, in the FIFOs 41 through 44 in the upper FIFO unit 40 and in the FIFOs 51a and 51b through the FIFOs 54a and 54b in the lower FIFO unit 50 on the side of the reading DMAC 32, the line data is read from the memory 20 by the reading DMAC 32 and is then saved on the FIFOs 41 through 44. In accordance with the order of the lines input to the memory 20, the line data on the FIFOs 41 through 44 is successively saved piece by piece on the FIFOs 51a and 51b through 54a and 54b in the lower FIFO unit 50 corresponding to the FIFOs 41 through 44 in the upper FIFO unit 40.
The line data of the same synchronization information (P1, P2, P3, P4, . . . ) is supplied to the processor 10 from each of the FIFOs 51a and 51b through 54a and 54b in the lower FIFO unit 50 in each cycle. The input port N0 of the processor 10 receives new line data each time a line is read.
The output unit 13 includes multiple output IFs 131 forming output ports (including transfer ports) and data selectors 132 controlling on and off of the output ports. In the configuration in
Referring to
FIFO 41: (LN1, P1), (LN5, P1)→L99, L95
FIFO 42: (LN2, P1), (LN6, P1)→L98, L94
FIFO 43: (LN3, P1), (LN7, P1)→L97, L93
FIFO 44: (LN4, P1), (LN8, P1)→L96, L92
The line data is distributed into the FIFOs 51a and 51b through 54a and 54b in the lower FIFO unit 50 and then transferred to the input ports N1 through N8 as described below. Specifically, the line data is input to the input ports in the order of age, namely, the older data is input to the input port N8 and the younger data is input to the input port N1. Referring to
Input port N8: L92
Input port N7: L93
Input port N6: L94
Input port N5: L95
Input port N4: L96
Input port N3: L97
Input port N2: L98
Input port N1: L99
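The distribution listed above can be reproduced with a short calculation: input port Nk receives the line read k lines before the current one, and the upper FIFO for a channel holds every C-th such line. The following sketch assumes integer line numbers and uses the hypothetical function name `distribute`; channel indices 0 through 3 stand for the FIFOs 41 through 44.

```python
def distribute(current_line, channels=4, fifos_per_channel=2):
    """Return (upper FIFO contents per channel, line number per input port)."""
    upper = {}
    ports = {}
    for k in range(1, channels * fifos_per_channel + 1):
        line = current_line - k           # port Nk holds the line read k lines ago
        channel = (k - 1) % channels      # 0 -> FIFO 41, ..., 3 -> FIFO 44
        upper.setdefault(channel, []).append(line)
        ports[k] = line
    return upper, ports


# State in the cycle in which line 100 arrives at input port N0.
upper_fifos, input_ports = distribute(100)
```

For line 100 this gives lines 99 and 95 on the channel of the FIFO 41 and line 92 on the input port N8, which agrees with the lists above.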
In this cycle, the line data “L100” on the 100th line is input to the input port N0. In the next cycle, the line data “L101” on the 101st line is input to the input port N0. The line data read by the reading DMAC 32 and saved at the FIFOs 41 through 44 and the line data distributed into the input ports N1 through N8 are shifted from the state described above to the new state line by line.
The line data output from the transfer ports N0 and N4 is now considered. Since the line data input to the input ports N0 and N4 is output in the original input form thereof from the transfer ports N0 and N4, the output of the transfer port N0 is L100 and the output of the transfer port N4 is L96 at the timing when the line data (L100) on the 100th line is input to the input port N0. The FIFO 45 on the side of the writing DMAC 31 holds LN0=L100 and LN4=L96. As illustrated in
Control of the window size of the image processing apparatus 1 constructed as illustrated in
In the example in
In the input ports of the processor 10, the input port N0 having the input line connected thereto and the input ports N1 through N4 having the FIFOs 51a, 52a, 53a, and 54a respectively connected thereto are used while the input ports N5 through N8 are unused. In such a case, the data selectors 122 (see
Since the active input ports in the processor 10 are N0 through N4, the output port is changed to N2 serving as the center line. Concerning the transfer ports in the processor 10, the transfer port N0 through which the input to the input port N0 is output is used while the transfer port N4 through which the input to the input port N4 is output is not used. In such a case, the data selector 132 (see
Let C represent the number of channels of the reading DMAC 32 and n the total number of pieces of the line data input to the processor 10. Since
C×F=n−1,
n=C×F+1=4×3+1=13.
The total number of pieces of the line data input to the processor 10 is 13. In the configuration in
The rest of the configuration of the image processing apparatus 1 in
Referring to
Similarly, the line data read further before is input to the following ports. Finally, the oldest line data read 12 lines before is input to the input port N12.
The input ports N1 through N12 are respectively connected to FIFOs 51a, 51b, and 51c through FIFOs 54a, 54b, 54c in the lower FIFO unit 50 on the side of the reading DMAC 32. Specifically, the input port N1 is connected to the FIFO 51a in the lower FIFO unit 50. The input port N2 is connected to the FIFO 52a in the lower FIFO unit 50. The input port N3 is connected to the FIFO 53a in the lower FIFO unit 50. The input port N4 is connected to the FIFO 54a in the lower FIFO unit 50. The input port N5 is connected to the FIFO 51b in the lower FIFO unit 50. The input port N6 is connected to the FIFO 52b in the lower FIFO unit 50. The input port N7 is connected to the FIFO 53b in the lower FIFO unit 50. The input port N8 is connected to the FIFO 54b in the lower FIFO unit 50. The connection up until now is identical to the connection in the configuration in
The processor 10 includes one output port (N6 (output)) used to output, as process results, output data and three output ports (N0, N4 and N8) used to output the line data in the original input state thereof. The transfer port N0 outputs the line data that has been input to the input port N0, the transfer port N4 outputs the line data that has been input to the input port N4, and the transfer port N8 outputs the line data that has been input to the input port N8.
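The transfer ports in the two configurations (N0 and N4 for 9 lines; N0, N4, and N8 for 13 lines) fit a simple pattern: port N0 plus every multiple of C below n−1. The sketch below expresses that pattern. The generalization to other values of C and F is an inference from these two examples rather than something stated in the text, and the function names are hypothetical.

```python
def transfer_ports(channels, fifos_per_channel):
    """Indices of the transfer ports: N0, N(C), N(2C), ..., N((F-1)*C)."""
    return [bank * channels for bank in range(fifos_per_channel)]


def written_back_lines(current_line, channels, fifos_per_channel):
    """Line numbers leaving each transfer port while `current_line` is at N0."""
    return {
        f"N{port}": current_line - port
        for port in transfer_ports(channels, fifos_per_channel)
    }


ports_9_lines = transfer_ports(4, 2)
ports_13_lines = transfer_ports(4, 3)
writeback_at_100 = written_back_lines(100, 4, 2)
```

For C=4 and F=2, the lines written back while line 100 is being input are lines 100 and 96, matching the earlier description of the transfer ports N0 and N4.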
Through the setting method of the transfer port described with reference to
The transfer port N0 is connected to the FIFO 55a in the lower FIFO unit 50. The transfer port N4 is connected to the FIFO 55b in the lower FIFO unit 50. The transfer port N8 is connected to the FIFO 55c in the lower FIFO unit 50. The line data output from the transfer ports N0, N4, and N8 are supplied to the FIFOs 55a, 55b, and 55c in the lower FIFO unit 50 on the side of the writing DMAC 31. Specifically, in the example in
The line data saved at the FIFOs 55a, 55b, and 55c in the lower FIFO unit 50 is successively supplied to the FIFO 45 in the upper FIFO unit 40. In the example in
The reading DMAC 32 reads the line data for the number of channels of the reading DMAC 32 from the memory 20 via the bus bridge 60 and causes the FIFOs 41 through 44 in the upper FIFO unit 40 to save the read line data. The FIFO 41 saves the line data that is successively supplied to the FIFOs 51a, 51b, and 51c in the lower FIFO unit 50. In the example in
The FIFO 42 saves the line data that is successively supplied to the FIFOs 52a, 52b, and 52c in the lower FIFO unit 50. In the example in
The FIFO 43 saves the line data that is successively supplied to the FIFOs 53a, 53b, and 53c in the lower FIFO unit 50. In the example in
The FIFO 44 saves the line data that is successively supplied to the FIFOs 54a, 54b, and 54c in the lower FIFO unit 50. In the example in
The line data is successively supplied from the FIFO 41 to the FIFOs 51a, 51b, and 51c in the lower FIFO unit 50. In the example in
The FIFOs 51a, 51b, and 51c through 54a, 54b, and 54c in the lower FIFO unit 50 supply the line data with the same synchronization information (P1, P2, P3, P4, . . . ) to the processor 10. Each time the line data is supplied to the processor 10, the input port N0 in the processor 10 receives the newly read line data.
Referring to
FIFO 41: (LN1, P1), (LN5, P1), (LN9, P1)→L99, L95, L91
FIFO 42: (LN2, P1), (LN6, P1), (LN10, P1)→L98, L94, L90
FIFO 43: (LN3, P1), (LN7, P1), (LN11, P1)→L97, L93, L89
FIFO 44: (LN4, P1), (LN8, P1), (LN12, P1)→L96, L92, L88
The line data is distributed into the FIFOs 51a, 51b, and 51c through 54a, 54b, and 54c in the lower FIFO unit 50 and then transferred to the input ports N1 through N12 as described below. Specifically, the line data is input to the input ports in the order of age, namely, the older data is input to the input port N12 and the younger data is input to the input port N1.
Input port N12: L88
Input port N11: L89
Input port N10: L90
Input port N9: L91
Input port N8: L92
Input port N7: L93
Input port N6: L94
Input port N5: L95
Input port N4: L96
Input port N3: L97
Input port N2: L98
Input port N1: L99
In this cycle, the line data “L100” on the 100th line is input to the input port N0. In the next cycle, the line data “L101” on the 101st line is input to the input port N0. The line data read by the reading DMAC 32 and saved at the FIFOs 41 through 44 and the line data distributed into the input ports N1 through N12 are shifted from the state described above to the new state line by line.
The line data output from the transfer ports N0, N4, and N8 is now considered. The line data input to the input ports N0, N4, and N8 is output in the original input state thereof from the transfer ports N0, N4, and N8. At the timing when the line data (L100) on the 100th line is input to the input port N0, the output of the transfer port N0 is L100, the output of the transfer port N4 is L96, and the output of the transfer port N8 is L92. The FIFO 45 on the side of the writing DMAC 31 holds LN0=L100, LN4=L96, and LN8=L92. Referring to
Control of the window size of the image processing apparatus 1 constructed as illustrated in
From among the input ports of the processor 10, the input port N0 connected to the input line and the input ports N1 through N8 connected to the FIFOs 51a and 51b through 54a and 54b are used and the input ports N9 through N12 are not used. In such a case, the data selector 122 (see
Since the active input ports in the processor 10 are N0 through N8, the output port is changed to N4 serving as the center line. Concerning the transfer ports in the processor 10, the transfer port N0 through which the input to the input port N0 is output and the transfer port N4 through which the input to the input port N4 is output are used. On the other hand, the transfer port N8 through which the input to the input port N8 is output is not used. In such a case, the data selector 132 (see
From among the input ports of the processor 10, the input port N0 connected to the input line and the input ports N1 through N4 connected to the FIFOs 51a, 52a, 53a, and 54a are used and the input ports N5 through N12 are not used. In such a case, the data selector 122 (see
Since the active input ports in the processor 10 are N0 through N4, the output port is changed to N2 serving as the center line. Concerning the transfer ports in the processor 10, the transfer port N0 through which the input to the input port N0 is output is used, and the transfer port N4 through which the input to the input port N4 is output and the transfer port N8 through which the input to the input port N8 is output are not used. In such a case, the data selectors 132 (see
In the examples described above, the window size in the processor 10 is controlled by using all or part of FIFOs in the lower FIFO unit 50. The number of lines of the line data may be changed in steps of the number of channels of the reading DMAC 32. In contrast, the window size in the processor 10 may be controlled in steps, with the magnitude of the step smaller than the number of channels of the reading DMAC 32, by activating or inactivating the lines input to the processor 10.
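The coarse control described in the examples above, in which the window size changes in steps of C by using only some of the F lower FIFOs per channel, can be summarized in a small calculation. This is a sketch under stated assumptions: the function name `window_config` and the returned dictionary layout are hypothetical, and the center-line rule (n−1)/2 assumes that n is odd, as it is in every example given.

```python
def window_config(channels, banks_in_use):
    """Derive the active ports when `banks_in_use` lower FIFOs per channel are used."""
    n = channels * banks_in_use + 1                 # lines input to the processor
    return {
        "lines": n,
        "active_inputs": list(range(n)),            # N0 .. N(n-1)
        "output_port": (n - 1) // 2,                # center line
        "transfer_ports": [b * channels for b in range(banks_in_use)],
    }


# Hardware built with C = 4 and F = 3, operated with 3, 2, or 1 FIFOs per channel.
full = window_config(4, 3)
reduced = window_config(4, 2)
minimal = window_config(4, 1)
```

The three results correspond to window sizes of 13, 9, and 5 lines, with the output port moving from N6 to N4 to N2 while the number of channels of the reading DMAC 32 stays fixed at 4.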
This control is described with reference to the configuration in
If all the FIFOs 51a and 51b through 54a and 54b in the lower FIFO unit 50 are used, the processor 10 receives the line data on the 9 lines. The window size in the processor 10 is thus 9 lines. As illustrated in
In contrast with the above cases, the input ports for 2 lines may be inactivated in the processor 10. The switching between the activation and inactivation of the input ports is performed by the data selectors 122 (see
Which lines are to be inactivated may be determined such that the remaining active lines become 7 consecutive lines. For example, the line data (line N0) read last from the image and the line data (line N8 in
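The requirement that the remaining active lines be consecutive limits the possible choices. A minimal sketch that enumerates them is shown below; the function name `consecutive_active_sets` is hypothetical, and the sketch only lists the candidates without implying which one the apparatus selects.

```python
def consecutive_active_sets(total_ports, active_count):
    """List every run of `active_count` consecutive ports among N0..N(total-1)."""
    return [
        list(range(start, start + active_count))
        for start in range(total_ports - active_count + 1)
    ]


# Nine input ports (N0 through N8), of which seven remain active.
candidates = consecutive_active_sets(9, 7)
```

With nine ports and seven active lines there are three candidates: N0 through N6, N1 through N7, and N2 through N8. Inactivating the lines N0 and N8 yields the middle candidate, which keeps N4 as the center line.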
The control method in the configuration in
Transfer performance per channel of the DMAC 30 is determined in accordance with bus width, latency, and the like (step S902). The transfer performance is an index that represents how much line data is transferable to the memory 20 in a processing cycle of one channel of the DMAC 30. The bus width is one of the parameters directly related to the transfer performance and represents how many bits of data are readable (or writable) in a single reading operation or writing operation. The latency is another of the parameters directly related to the transfer performance and represents the number of clocks used from when a data read or write command is issued to when the data reading or writing operation is actually terminated.
The number of channels of the DMAC 30 is determined to satisfy the design performance of the arithmetic core 11 in the processor 10 determined in step S901 (step S903). Specifically, the number of channels of the reading DMAC 32 to supply the line data to the processor 10 is determined. A determination is then made as to whether the number of FIFOs in the lower FIFO unit 50, which sets the number of lines of the line data supplied to the processor 10, is appropriate (in other words, neither excessive nor insufficient) for the determined number of channels of the DMAC 30 (in particular, the reading DMAC 32) (step S904).
If the number of FIFOs in the lower FIFO unit 50 is not appropriate (no path from step S904), the number of FIFOs in the lower FIFO unit 50 is adjusted (step S905). The process returns to step S902. The setting of the DMAC 30 and the lower FIFO unit 50 responsive to the design performance is repeated. If the number of FIFOs in the lower FIFO unit 50 is determined to be appropriate (yes path from step S904), the design of the image processing apparatus 1 is complete.
In the exemplary embodiment above, the term “processor” refers to hardware in a broad sense. Examples of the processor include general processors (e.g., CPU: Central Processing Unit) and dedicated processors (e.g., GPU: Graphics Processing Unit, ASIC: Application Specific Integrated Circuit, FPGA: Field Programmable Gate Array, and programmable logic device).
In the exemplary embodiment above, the term “processor” is broad enough to encompass one processor or plural processors in collaboration which are located physically apart from each other but may work cooperatively. The order of operations of the processor is not limited to one described in the exemplary embodiment above, and may be changed.
The foregoing description of the exemplary embodiment of the present disclosure has been provided for the purposes of illustration and description. It is not intended to be exhaustive or to limit the disclosure to the precise forms disclosed. Obviously, many modifications and variations will be apparent to practitioners skilled in the art. The embodiment was chosen and described in order to best explain the principles of the disclosure and its practical applications, thereby enabling others skilled in the art to understand the disclosure for various embodiments and with the various modifications as are suited to the particular use contemplated. It is intended that the scope of the disclosure be defined by the following claims and their equivalents.
Number | Date | Country | Kind |
---|---|---|---|
JP2019-150418 | Aug 2019 | JP | national |
Number | Name | Date | Kind |
---|---|---|---|
20140043343 | Iwai | Feb 2014 | A1 |
Number | Date | Country |
---|---|---|
2014-35619 | Feb 2014 | JP |
Number | Date | Country | |
---|---|---|---|
20210056368 A1 | Feb 2021 | US |