The present invention pertains to an image processing device, an image processing method, and a storage medium.
Technologies for detecting contamination with detection targets such as foreign matter enclosed in transparent containers have been disclosed. For example, in Patent Document 1, an inspection region is narrowed within an image by means of image processing, and air bubbles and foreign matter in the region are distinguished by using the shapes of the air bubbles. In Patent Document 1, when determining whether a liquid sealed inside a semi-transparent container such as a syringe is contaminated with foreign matter, the presence or absence of foreign matter in the liquid is distinguished by vibrating or rotating the container to put the foreign matter in a suspended state, and by recognizing the suspended foreign matter and air bubbles generated by the vibrations or rotations.
Patent Document 2 discloses an example of a method for detecting contaminants in a liquid. With the method for detecting contaminants in a liquid in Patent Document 2, the amounts of light from a transmission light source and from a reflection light source are relatively adjusted with respect to foreign matter suspended in a container, and foreign matter in the liquid is detected by the differences in brightness between air bubbles, which appear with high brightness, and foreign matter, which appears with low brightness.
Patent Document 3 discloses technology for detecting foreign matter by using feature quantities obtained from differential images of foreign matter suspended by rotation. In Patent Document 3, two sets of images are selected with a prescribed time interval therebetween, and a difference image is generated. Additionally, in Patent Document 3, edge pixels are extracted by performing a differential process on the difference image, and the extracted edges are grouped to extract individual suspension differential images. Additionally, in Patent Document 3, feature quantities such as distributions and shapes of differential values are determined from the grouped individual suspension differential images, and foreign matter in the liquid is detected by distinguishing between foreign matter and air bubbles by these feature quantities.
Patent Document 4 discloses technology pertaining to an external appearance inspection method for transparent films, wherein an appearance inspection regarding the presence or absence of defects such as foreign matter or air bubbles is performed by applying computer image processing to images of transparent films with low reflectance.
In the case in which detection targets such as foreign matter contaminating a medium in a transparent container as mentioned above accumulate in curved portions such as the bottom surfaces of transparent containers or on medium surfaces (i.e., liquid surfaces) inside transparent containers, there has been a problem in that the reflection of light caused by the curvature of the bottom surfaces of the transparent containers and by the shapes of the medium surfaces affects the detection of small detection targets, making such detection difficult.
Therefore, an objective of the present invention is to provide an image processing device, an image processing method, and a storage medium that solve the problems mentioned above.
According to a first aspect of the present invention, an image processing device: compares multiple images capturing a detection target inside a transparent container, the images being captured while rotating the transparent container, and determines, among candidate regions for being the detection target appearing in the images, candidate regions that move in a movement direction in accordance with the rotation; and determines a presence or absence of the detection target by using first determination results obtained by using a first learning model and image information for the candidate regions that move in the movement direction in accordance with the rotation to determine whether or not the candidate regions are the detection target, and second determination results obtained by using a second learning model and information indicating a chronological change in the candidate regions that move in the movement direction in accordance with the rotation to determine whether or not the candidate regions are the detection target.
According to a second aspect of the present invention, an image processing method involves: comparing multiple images capturing a detection target inside a transparent container, the images being captured while rotating the transparent container, and determining, among candidate regions for being the detection target appearing in the images, candidate regions that move in a movement direction in accordance with the rotation; and determining a presence or absence of the detection target by using first determination results obtained by using a first learning model and image information for the candidate regions that move in the movement direction in accordance with the rotation to determine whether or not the candidate regions are the detection target, and second determination results obtained by using a second learning model and information indicating a chronological change in the candidate regions that move in the movement direction in accordance with the rotation to determine whether or not the candidate regions are the detection target.
According to a third aspect of the present invention, a storage medium stores a program for making a computer in an image processing device execute: a process of comparing multiple images capturing a detection target inside a transparent container, the images being captured while rotating the transparent container, and determining, among candidate regions for being the detection target appearing in the images, candidate regions that move in a movement direction in accordance with the rotation; and a process of determining a presence or absence of the detection target by using first determination results obtained by using a first learning model and image information for the candidate regions that move in the movement direction in accordance with the rotation to determine whether or not the candidate regions are the detection target, and second determination results obtained by using a second learning model and information indicating a chronological change in the candidate regions that move in the movement direction in accordance with the rotation to determine whether or not the candidate regions are the detection target.
According to the present invention, detection targets that have accumulated in curved portions inside transparent containers and on medium surfaces inside transparent containers can be more accurately detected.
Hereinafter, embodiments of the present invention will be explained in detail with reference to the drawings.
First, a first embodiment of the invention will be explained in detail with reference to the drawings.
In the first embodiment, the transparent container 2 has a cylindrical shape, and the medium enclosed inside the transparent container 2 is a liquid 3. Foreign matter that is a detection target has accumulated on the bottom surface of the transparent container 2, and the image processing device 1 determines the presence or absence of foreign matter based on images that include foreign matter accumulated on the bottom surface of the transparent container 2. The foreign matter that is the detection target may be accumulated at a liquid surface of the liquid 3 inside the transparent container 2, and the image processing device 1 may determine the presence or absence of foreign matter based on images that include foreign matter accumulated at a liquid surface of the liquid 3. The bottom surface of the transparent container 2 is one example of a curved portion of a wall surface of the transparent container 2. The liquid surface of the liquid 3 inside the transparent container 2 is one embodiment of a surface at which a medium (liquid 3) enclosed inside the transparent container 2 contacts another medium (air) inside the transparent container 2. The image processing device 1 determines, in particular, detection targets in images capturing the detection targets accumulated in curved portions of wall surfaces of the transparent container 2 or detection targets accumulated at surfaces where a medium enclosed inside the transparent container 2 contacts another medium inside the transparent container. The medium enclosed inside the transparent container 2 is not limited to the liquid 3. Additionally, the curved portions of wall surfaces inside the transparent container 2 are not limited to the bottom surface. Additionally, the surface at which the liquid 3 enclosed inside the transparent container 2 contacts another medium inside the transparent container 2 is not limited to the liquid surface of the liquid 3. The image processing device 1 may determine the types (foreign matter, air bubbles, etc.) of the detection targets. The image processing device 1 may detect detection targets accumulated in a medium other than a liquid enclosed inside the transparent container 2.
In other words, the image processing device 1 may detect the detection targets from images capturing the detection targets accumulated on the bottom surface of the transparent container 2 or the detection targets accumulated at a surface at which a medium enclosed inside the transparent container 2 contacts another medium inside the transparent container 2. The image processing device 1 may detect the detection targets from only images capturing detection targets accumulated on the bottom surface of the transparent container 2, or may detect the detection targets from only images capturing detection targets accumulated at the liquid surface of the liquid 3 in the transparent container 2. The image processing device 1 may detect the detection targets in images capturing the detection targets suspended within the liquid 3 in the transparent container 2.
The rotation device 4 is a device that grips the bottom surface with the cylindrical transparent container 2 oriented vertically, and that rotates the transparent container 2 so that the axis of rotation (z axis) passes through the center of the bottom surface. The detection targets that are suspended in the liquid 3 enclosed inside the transparent container 2 rotate about the axis of rotation as the transparent container 2 is rotated. In the present embodiment, the image processing device 1 is provided with an image capture device 10. The image capture device 10 is installed at a position from which images of the detection targets can be captured, with the image capture direction fixed on an image capture axis (x axis) orthogonal to the z axis. While the rotation device 4 is rotating the transparent container 2, the image processing device 1 controls the image capture device 10 to capture images of the detection targets, and can obtain multiple consecutive images with angles of view in which the detection targets appear. While the image capture device 10 is capturing images, the relationship between the z axis and the x axis is maintained. As a result thereof, the image processing device 1 acquires multiple images containing the detection targets, the positions of which change as the rotation device 4 rotates the transparent container 2. In the example in
In this case, when a detection target is suspended in the liquid 3 as in the third example (image 5a) in a state in which there is foreign matter in the liquid 3, the brightness value in regions other than the detection target within the image region can easily be made constant, as illustrated in image 5a, by controlling the illumination environment, making the region of the detection target more easily discernible. In the third example (image 5a), in addition thereto, the suspended detection target gradually descends in the direction of gravity, and the detection target can be easily detected by sensing the movement thereof. In contrast therewith, there is a requirement to accurately detect detection targets accumulated at liquid surfaces or near the bottom surfaces of transparent containers 2, as indicated in the first example (image 4a) and the second example (image 4b). In the first example (image 4a) and the second example (image 4b), the liquid surface and curved portions of the wall surfaces of the transparent container 2, such as the bottom surface regions, appear in the image, so that the brightness values of the regions in the image other than the detection targets are not constant and it becomes difficult to make the region of the detection target discernible. Additionally, since the detection targets accumulate on the bottom surface and near the liquid surface, detection by descending movement is not possible. For this reason, the image processing device 1 has both a function for allowing the regions of detection targets to be made discernible even near the liquid surface or the bottom surface, and a function for detecting accumulated detection targets by rotational motion of the container.
The image processing device 1 illustrated in this drawing is a computer provided with hardware such as a CPU (Central Processing Unit) 101, a ROM (Read-Only Memory) 102, a RAM (Random Access Memory) 103, an HDD (Hard Disk Drive) 104, a communication module 105, and a database 106. Additionally, the image processing device 1 is provided with an image capture device 10.
As indicated in
The image capture unit 301 controls the image capture device 10 to acquire images generated by the image capture device 10 capturing images.
The histogram generation unit 302 generates a reference histogram indicating the frequency of occurrence of each pixel value among the respective pixels in multiple images (1, 2, . . . , N frames) prepared in advance by capturing images while the transparent container is rotating.
The histogram storage unit 303 stores the reference histogram that was generated in advance by the histogram generation unit 302.
The extraction unit 304 compares the frequencies of occurrence (histogram) of the brightnesses of the pixels in one image selected from among multiple images (1, 2, . . . , N frames), acquired by sequentially capturing images while rotating the transparent container 2 with the rotation device 4, against the reference histogram stored in the histogram storage unit 303, and extracts candidate regions for being detection targets, such as foreign matter and air bubbles, in the one selected image.
The difference detection unit 305 sequentially compares pixel values of detection target candidate regions extracted from multiple images with successive image capture timings, and performs a process of excluding, from the detection target candidates, those candidate regions that have not moved.
The clustering unit 306 clusters, as single candidate regions, groups of adjacent pixels among the pixels remaining as candidate regions.
The tracking determination unit 307 identifies, across the pixels of multiple successive images captured at sequentially different timings, successive associated regions among the clustered candidate regions (detection windows) that move in accordance with the rotation of the transparent container.
The recognition unit 308 recognizes detection targets by using first determination results obtained by using a first learning model and image information for candidate regions that move in a movement direction in accordance with the rotation to determine whether or not the candidate regions are the detection target, and second determination results obtained by using a second learning model and information indicating a chronological change in the candidate regions that move in the movement direction in accordance with the rotation to determine whether or not the candidate regions are the detection target. The recognition of the detection target is one embodiment of the process for determining the presence or absence of detection targets.
The functional blocks indicated in
The weight matrix generation unit 401 generates a weight matrix M for calculating the relationship between the respective tracking results for detection target candidate regions up to the image of the (N−1)-th frame and the respective detection target candidate regions in the image of the N-th frame based on weighting information obtained from the new trajectory weighting unit 402, the position weighting unit 403, the size change weighting unit 404, and the rotation direction weighting unit 405.
The new trajectory weighting unit 402, the position weighting unit 403, the size change weighting unit 404, and the rotation direction weighting unit 405 each determine, from a different viewpoint, weight information indicating whether there is a strong or weak correlation between the respective tracking results of detection target candidate regions up to the image of the (N−1)-th frame and the respective detection target candidate regions in the image of the N-th frame. Details regarding the weighting units will be described below.
The link determination unit 406 uses the weight matrix M and an algorithm for solving the assignment problem, such as the Hungarian algorithm, to assign the respective detection target candidate regions in the image of the N-th frame to the respective tracking results of the detection target candidate regions up to the image of the (N−1)-th frame.
The unused region storage unit 407 stores candidate regions that were not assigned to tracking results by the link determination unit 406.
The existing trajectory storage unit 408 stores, for the respective frames, in an associated manner, trajectory recognition information and the respective candidate regions that have been assigned to tracking results by the link determination unit 406.
The trajectory length determination unit 409 identifies, as detection target trajectories, trajectories that have been linked to lengths equal to or greater than a threshold value among the trajectory information stored in the existing trajectory storage unit 408.
The functional blocks indicated in
The recognition unit 308 provides the functions of a simulated texture storage unit 501, a synthesized target image generation unit 502, a chronological tracking data storage unit 503, a real image storage unit 504, a synthesized target image storage unit 505, a classifier 506, a learning unit 507, and a recognition result storage unit 508.
The simulated texture storage unit 501 stores simulated textures.
The synthesized target image generation unit 502 generates synthesized target images simulating detection targets by cutting out simulated textures in random shapes and overlaying them at random positions and angles on images.
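For illustration, the following sketch shows one way such a synthesized target image could be produced, assuming images handled with NumPy and OpenCV; the function name, patch size range, and polygon-based shape generation are assumptions made for this example and are not taken from the present disclosure.

```python
import numpy as np
import cv2  # OpenCV; assumed available for this illustrative sketch


def synthesize_target_image(background, texture, rng=None):
    """Overlay a randomly shaped, rotated cutout of `texture` onto `background`.

    `background` and `texture` are H x W x 3 uint8 arrays; the texture is assumed
    to be larger than the cut patch. All parameter choices are illustrative.
    """
    rng = rng or np.random.default_rng()
    h, w = background.shape[:2]

    # Cut a small patch of the texture at a random location.
    size = int(rng.integers(8, 32))
    ty = int(rng.integers(0, texture.shape[0] - size))
    tx = int(rng.integers(0, texture.shape[1] - size))
    patch = texture[ty:ty + size, tx:tx + size].copy()

    # Build a random polygonal mask to give the cutout a random shape.
    pts = rng.integers(0, size, size=(int(rng.integers(3, 8)), 2)).astype(np.int32)
    mask = np.zeros((size, size), dtype=np.uint8)
    cv2.fillPoly(mask, [cv2.convexHull(pts)], 255)

    # Rotate patch and mask by a random angle about the patch centre.
    angle = float(rng.uniform(0, 360))
    rot = cv2.getRotationMatrix2D((size / 2, size / 2), angle, 1.0)
    patch = cv2.warpAffine(patch, rot, (size, size))
    mask = cv2.warpAffine(mask, rot, (size, size))

    # Paste the shaped cutout at a random position in the background.
    y = int(rng.integers(0, h - size))
    x = int(rng.integers(0, w - size))
    out = background.copy()
    region = out[y:y + size, x:x + size]
    region[mask > 0] = patch[mask > 0]
    return out
```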
The chronological tracking data storage unit 503 stores chronological data of tracking results obtained when tracking detection targets.
The real image storage unit 504 stores multiple images of different shapes and sizes, the images being of actual detection targets such as foreign matter and air bubbles. The synthesized target image storage unit 505 stores synthesized target images generated by the synthesized target image generation unit 502.
The classifier 506 recognizes detection targets such as foreign matter and air bubbles by using a learning model generated by the learning unit 507. The classifier 506 specifically outputs a three-value classification determination result including foreign matter, air bubbles, and erroneous detection of other regions.
The learning unit 507 generates a learning model for recognizing detection targets by using determination results from the tracking determination unit 307.
The recognition result storage unit 508 stores determination results from the classifier 506.
First, the image processing device 1 generates, in advance, a first learning model and a second learning model. The first learning model is a learning model for recognizing whether or not a detection target candidate region moving in a movement direction in accordance with the rotation of the transparent container 2 is a detection target based on images (1, 2, . . . , N frames) generated by repeatedly capturing images while the transparent container 2 is rotating. Additionally, the second learning model is a learning model for recognizing whether or not a detection target candidate region moving in a movement direction in accordance with the rotation of the transparent container is a detection target based on information regarding each of the images (1, 2, . . . , N frames) indicating the chronological change in the candidate regions moving in the movement direction in accordance with the rotation. The image processing device 1 uses these two learning models, i.e., the first learning model and the second learning model, to recognize detection targets.
Specifically, the learning unit 507 performs a first machine learning process using, as correct data, real images in which foreign matter and air bubbles, which are actual detection targets, appear, and simulated images of those detection targets, which have been prepared as training data. The real images are recorded in the real image storage unit 504. The simulated images are recorded in the synthesized target image storage unit 505. An example of a real image or a simulated image is image information having the center of a detection target region at the center of the image.
The learning unit 507 performs the machine learning process using the real images and the simulated images recorded in these storage units. The simulated images are images simulating detection targets such as foreign matter and air bubbles, generated by the synthesized target image generation unit 502 using textures (image color samples, etc.) recorded in the simulated texture storage unit 501, and are prepared by cutting out the textures in random shapes and overlaying them at random positions and angles on images. By the first machine learning process, the learning unit 507 generates, for example, a first learning model that includes information on parameters such as the weights of a neural network which takes images as input and which outputs a result indicating that there is a detection target when a detection target of the kind indicated by the real images or the simulated images appears in those input images. When an image is input to a neural network generated by using the first learning model, the classifier 506 outputs information indicating that there is a detection target in the case in which a detection target appears in that image. Additionally, the first learning model may be information making up a neural network that identifies detection target recognition information. Thus, the learning unit 507 generates a first learning model that can robustly detect even detection targets such as unknown foreign matter, using real images showing the actual appearance of detection targets and simulated images prepared by synthesizing textures.
Additionally, the learning unit 507 generates, as training data, chronological tracking data based on multiple image frames capturing detection targets actually contained in a rotating transparent container 2, and uses this data to perform a second machine learning process. The chronological tracking data is recorded in the chronological tracking data storage unit 503. The chronological tracking data may be, for example, information such as identifiers, positions (in-image coordinates), and numbers of pixels indicating detection target regions in chronologically successive images, and motion amounts and motion vectors based on variations in the positions of those regions in the successive image frames. The learning unit 507, for example, by the second machine learning process, generates a second learning model which, when respective chronological tracking data for images of multiple frames captured in chronological order are input, outputs results indicating that the respective trajectories determined to be detection target candidate regions in the images of the respective frames are trajectories of detection targets. The second learning model is information from training results that include information on parameters such as weights in a neural network, etc. When chronological tracking data relating to images of multiple frames is input to a neural network generated by using the second learning model, the classifier 506 outputs information indicating that a trajectory is a detection target if the respective trajectories determined to be detection target candidate regions in the images of the respective frames are trajectories of a detection target.
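As an illustration of what such chronological tracking data might look like, the following minimal sketch defines a per-trajectory record of positions and pixel counts and derives motion vectors and motion amounts from it; the class and field names are hypothetical and chosen only for this example.

```python
from dataclasses import dataclass, field
from typing import List, Tuple
import math


@dataclass
class TrackingRecord:
    """Illustrative container for the chronological tracking data of one trajectory."""
    region_id: int
    positions: List[Tuple[float, float]] = field(default_factory=list)  # in-image (x, y) per frame
    pixel_counts: List[int] = field(default_factory=list)               # region size per frame

    def motion_vectors(self) -> List[Tuple[float, float]]:
        """Frame-to-frame displacement vectors of the region."""
        return [(x1 - x0, y1 - y0)
                for (x0, y0), (x1, y1) in zip(self.positions, self.positions[1:])]

    def motion_amounts(self) -> List[float]:
        """Frame-to-frame displacement magnitudes."""
        return [math.hypot(dx, dy) for dx, dy in self.motion_vectors()]
```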
Furthermore, the classifier 506 uses the first learning model and the second learning model, which are used for determining whether there are detection targets by means of different information, to determine, from images, whether they include detection targets. Additionally, the classifier 506 may use the first learning model and the second learning model, which are used for determining whether there are detection targets by means of different information, to identify detection target identifiers from images. Since the first learning model and the second learning model are used to identify detection targets, the detection targets can be recognized with higher accuracy. Due to the classifier 506 using the first learning model and the second learning model to determine whether detection targets are included in images, it is possible to perform three-value classification determination including whether a detection target is foreign matter or an air bubble, or an erroneous detection of a background other than the above. The image processing device 1 may use the first learning model and the second learning model to separately recognize multiple detection targets as foreign matter that are detection targets.
Next, the process by which the image processing device 1 generates a reference histogram will be explained.
First, the image capture direction and the angle of view of the image capture device 10 are fixed. The image capture device 10, as one example, captures images including the vicinity of the bottom surface of a transparent container 2. In this state, the rotation device 4 rotates the transparent container 2 about the z axis. The rotational motion of the transparent container 2 is of a constant velocity. As a result thereof, in the images generated by the image capture device 10, the same location on the transparent container 2 appears with a regular periodicity. The image capture unit 301 in the image processing device 1 controls the image capture device 10 so as to continuously capture images. When generating the reference histogram, it is assumed that there are no detection targets (foreign matter, air bubbles, etc.) inside the transparent container 2.
The image capture unit 301 sequentially acquires the images (1, 2, . . . , N frames) generated by the image capture device 10 (step S101). The image capture device 10 may generate a number of images, such as ten images, with each rotation of the transparent container 2. The histogram generation unit 302 acquires the multiple images (1, 2, . . . , N frames). The histogram generation unit 302 sets, in each of the multiple images, rectangular histogram generation ranges a including one reference pixel and surrounding pixels in the image, while shifting the reference pixel one at a time. As one example, the histogram generation range a may be a range of five pixels vertically and five pixels horizontally.
The histogram generation unit 302 generates reference histograms of the histogram generation ranges a based on the respective pixels of the respective histogram generation ranges a at the same in-image positions in multiple consecutive images acquired chronologically from the time 0 to the time t (step S102). The histogram generation unit 302 generates reference histograms for all of the pixels in the images. The reference histogram, as one example, indicates pixel values of brightness or one of 255 tones of color information such as RGB information on the vertical axis and indicates the occurrence frequency on the horizontal axis. The image processing device 1 generates the reference histograms using images generated while the transparent container 2 rotates multiple times. The histogram generation unit 302 records the generated reference histogram in the histogram storage unit 303. The above-mentioned process corresponds to a process whereby the histogram generation unit 302, when an image region captured by the image capture unit 301 is represented by Ω, prepares, for each pixel i∈Ω, a reference histogram h_i accumulated from the time 0 to the time t. The histogram storage unit 303 stores the reference histograms h_i of the respective pixels i∈Ω obtained by the histogram generation unit 302.
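A minimal sketch of this per-pixel reference histogram generation is shown below, assuming grayscale frames and a 5 x 5 histogram generation range; the function and parameter names are illustrative, and a practical implementation would likely be vectorized rather than looped.

```python
import numpy as np


def build_reference_histograms(frames, window=5, n_bins=256):
    """Build a per-pixel reference histogram from frames of an empty, rotating container.

    `frames` is a sequence of H x W uint8 grayscale images captured over several
    rotations. For every reference pixel, brightness values in a `window` x `window`
    neighbourhood are accumulated over all frames.
    """
    frames = np.asarray(frames)
    n, h, w = frames.shape
    half = window // 2
    hist = np.zeros((h, w, n_bins), dtype=np.int64)

    # Pad so the neighbourhood is defined at image borders.
    padded = np.pad(frames, ((0, 0), (half, half), (half, half)), mode="edge")

    for y in range(h):
        for x in range(w):
            patch = padded[:, y:y + window, x:x + window]
            hist[y, x] = np.bincount(patch.ravel(), minlength=n_bins)
    return hist
```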
Next, the process performed when the image processing device 1 actually recognizes a detection target will be explained.
As when generating the reference histogram, the rotation device 4 rotates the transparent container 2. In this state, the image capture unit 301 of the image processing device 1 controls the image capture device 10 so as to continuously capture images. The image capture unit 301 acquires one image (one frame) generated by the image capture device 10 (step S201). The extraction unit 304 acquires a reference histogram generated for one pixel among the reference histograms stored in the histogram storage unit 303 (step S202). The extraction unit 304 uses the reference histogram stored in the histogram storage unit 303 to identify, in images newly obtained from the image capture unit 301, pixels at positions aligned with the pixels for which that reference histogram was generated (step S203). The extraction unit 304 compares the reference histogram with the pixel value of the identified pixel, and determines whether that pixel is in a detection target candidate region or in another region.
In the case in which a detection target such as foreign matter is included in the acquired image, the pixel values of the pixels indicating the detection target become pixel values similar to pixel values with low occurrence frequencies in the reference histogram generated based on images not including detection targets. Meanwhile, the pixel values of pixels not indicating detection targets become pixel values similar to pixel values with high occurrence frequencies in the reference histogram. Therefore, the extraction unit 304, for a pixel identified from an image in which a detection target is to be detected, compares the pixel value thereof with the reference histogram, and determines whether or not the identified pixel can be considered to be a detection target candidate region. Specifically, the extraction unit 304 determines whether or not the occurrence frequency of the pixel value of the pixel identified from the image is lower than a frequency threshold value θ set in the reference histogram (step S204). When the occurrence frequency of the pixel value of the pixel identified from the image is lower than the frequency threshold value θ set in the reference histogram, the extraction unit 304 determines that the pixel is a detection target candidate (step S205). On the other hand, when the occurrence frequency of the pixel value of the pixel identified from the image is higher than the frequency threshold value θ set in the reference histogram, the extraction unit 304 determines that the pixel is not a detection target candidate because it is similar to images not including a detection target (step S206).
In the above-mentioned process in the extraction unit 304, the formula for determining whether or not each pixel i∈Ω in a newly captured image is a candidate region can be represented, using the reference histogram h_i stored in the histogram storage unit 303 and the pixel value b_i of one pixel i of one image newly captured by the image capture unit 301, as follows:
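The formula itself is not reproduced here; one plausible form consistent with the surrounding description, in which h_i(b_i) denotes the occurrence frequency of the pixel value b_i in the reference histogram h_i and θ denotes the frequency threshold value, would be the following.

```latex
c_i =
\begin{cases}
  1 & \text{if } h_i(b_i) < \theta \quad \text{(pixel } i \text{ is a detection target candidate region)} \\
  0 & \text{otherwise.}
\end{cases}
```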
Additionally, the threshold value θ is a threshold value of the cumulative amount in a histogram when distinguishing between detection target candidate regions and other regions. The method for determining the threshold value θ will be further described. By making use of the fact that the transparent container 2 is rotating in a fixed direction, the cosine similarity between the rotation direction vector of the container and the optical flow of each pixel is determined. The extraction unit 304 sets the threshold value θ so that, when the cosine similarity is high, the threshold value becomes high, and when the cosine similarity is low, the threshold value becomes low. That is, when the motion vector of a pixel is aligned with the container rotation direction, the extraction unit 304 more readily determines that the pixel is a detection target candidate region, and when they are not aligned, it is less likely to do so. This makes use of the fact that, for detection targets in the transparent container 2 such as foreign matter and air bubbles, optical flow in the motion direction tends to be detected more easily than in other regions that are not detection targets.
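A minimal sketch of such a similarity-dependent threshold is given below; the linear interpolation between a lower and an upper threshold value is an assumption for illustration, as the description only states that a higher cosine similarity leads to a higher threshold.

```python
import numpy as np


def adaptive_threshold(flow_vector, rotation_direction,
                       theta_min=0.01, theta_max=0.10):
    """Scale the frequency threshold θ with the cosine similarity between a pixel's
    optical-flow vector and the container rotation direction.

    `theta_min` and `theta_max` are illustrative bounds, not values from the disclosure.
    """
    flow = np.asarray(flow_vector, dtype=float)
    rot = np.asarray(rotation_direction, dtype=float)
    denom = np.linalg.norm(flow) * np.linalg.norm(rot)
    cos_sim = 0.0 if denom == 0.0 else float(np.dot(flow, rot) / denom)

    # Map similarity in [-1, 1] to [0, 1] and interpolate the threshold.
    alpha = (cos_sim + 1.0) / 2.0
    return theta_min + alpha * (theta_max - theta_min)
```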
The extraction unit 304 determines, for all of the pixels in one acquired image, whether or not they are similarly detection target candidate regions (step S207). Additionally, the extraction unit 304 similarly determines, for all pixels in respective images sequentially generated by the image capture device 10, whether or not they are detection target candidate regions (step S208).
The difference detection unit 305 performs filtering on a group of detection target candidate regions obtained by the extraction unit 304 and extracts only pixels of candidate regions in which the brightness value has changed between successive frames (step S209). Since the transparent container 2 is undergoing rotational motion, foreign matter and air bubbles move together with the container inside the transparent container 2. For this reason, in the regions of foreign matter and air bubbles, there tend to be large differences in brightness values when taking the difference between successive frames. On the other hand, for pixels making up regions other than detection targets, the amount of change in the brightness value is small, even when the difference between successive frames is taken. For this reason, the difference detection unit 305 identifies candidate regions in which the brightness value has changed between frames by taking the difference between successive frames. Similar effects can be obtained even when the order of the detection target candidate region extraction process by the extraction unit 304 and the filtering process by the difference detection unit 305 is switched, or the processes may be executed in parallel.
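A minimal sketch of this successive-frame difference filtering, assuming grayscale frames and an illustrative brightness-change threshold, might look as follows.

```python
import numpy as np


def filter_moving_candidates(prev_frame, curr_frame, candidate_mask, diff_threshold=10):
    """Keep only candidate pixels whose brightness changed between successive frames.

    `prev_frame` and `curr_frame` are H x W uint8 grayscale images, `candidate_mask`
    is a boolean H x W array from the extraction step, and `diff_threshold` is an
    illustrative value.
    """
    # Cast to a signed type so the absolute difference does not wrap around.
    diff = np.abs(curr_frame.astype(np.int16) - prev_frame.astype(np.int16))
    moved = diff >= diff_threshold
    return candidate_mask & moved
```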
The clustering unit 306, for pixels in detection target candidate regions detected in images processed by the extraction unit 304 and the difference detection unit 305, searches for linked pixels in eight neighboring pixels in the vicinity thereof, and if it is determined that the linked pixels are detection target candidate regions, integrates the candidate regions by clustering so as to be identified as a single candidate region (step S210).
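A minimal sketch of this eight-neighbor clustering, using SciPy's connected-component labeling as a stand-in for the clustering described here, is shown below.

```python
import numpy as np
from scipy import ndimage  # assumed available for this sketch


def cluster_candidates(candidate_mask):
    """Group candidate pixels into single candidate regions using 8-connectivity.

    Returns a label image (0 = background) and the number of clustered regions.
    """
    # A 3 x 3 structure of ones means pixels touching in any of the eight
    # neighbouring directions are merged into one region.
    structure = np.ones((3, 3), dtype=bool)
    labels, n_regions = ndimage.label(candidate_mask, structure=structure)
    return labels, n_regions
```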
The tracking determination unit 307 sequentially tracks candidate regions identified by the clustering unit 306 between successive frames, and when a sufficiently long trajectory has been obtained, determines that the trajectory is that of a detection target such as foreign matter or an air bubble.
In the tracking determination unit 307, tracking results are updated by associating respective candidate regions c_i∈C in a current frame with tracking results t_j∈T obtained up to the current frame. In this case, C represents the set of candidate regions newly detected in the current frame and T represents the set of tracking results obtained up to the current frame. The tracking results are updated by generating a weight matrix M, and using said weight matrix M and the Hungarian algorithm to associate candidate regions having the same detection target trajectory between frames.
The weight matrix generation unit 401 in the tracking determination unit 307 generates the weight matrix M based on respective weight information including weight information α obtained from the new trajectory weighting unit 402, weight information β obtained from the position weighting unit 403, weight information γ obtained from the size change weighting unit 404, and weight information δ obtained from the rotation direction weighting unit 405.
The new trajectory weighting unit 402 uses the weight information α so that, the longer the continuity of candidate regions identified as associated regions in multiple consecutive images from the past, the stronger the association is made between said candidate region in a last image among the multiple consecutive images and the candidate region in the next new image following the last consecutive image. In other words, the new trajectory weighting unit 402 uses the weight information α to update the weights included in the weight matrix M generated by the weight matrix generation unit 401 so that, the longer the trajectory length indicated by the tracking results, the more preference is given to connect candidate regions in new images.
The position weighting unit 403 uses the weight information β so that the farther the distance of the position of a candidate region from the center of rotation in the transparent container 2 appearing in an image, even when the tracking results up to the (N−1)-th frame are distant from the candidate region in the N-th frame, the stronger the association is made between those candidate regions. Due to the rotation of the transparent container 2, for targets at positions more distant from the center of rotation, the distances between the positions at which they appear in successive images will be larger. Therefore, even if a target is far from the center of rotation and the positions at which it appears in successive images are distant from each other, it is preferable to determine that the trajectories thereof are connected. Therefore, the position weighting unit 403 uses the weight information β to update the weights included in the weight matrix M generated by the weight matrix generation unit 401 so that even candidate regions at positions at distances far from the center of rotation of the transparent container 2 can be preferentially connected with tracking results.
The size change weighting unit 404 uses the weight information γ so that the larger the change in shape of a candidate region including multiple pixels, the weaker the association is made between candidate regions in successive images. In the case in which the change in the shape of a candidate region is large, there is an increased likelihood that the relationship between the candidate regions in successive images is interrupted. Therefore, the size change weighting unit 404 uses the weight information γ to update the weights so that, when there is a large difference equal to or greater than a prescribed threshold value between the number of pixels in a candidate region making up tracking results in an (N−1)-th frame and the number of pixels in a candidate region in an N-th frame, the association between the candidate region making up tracking results in the (N−1)-th frame and the candidate region in the N-th frame is made low. In
The rotation direction weighting unit 405 uses the weight information δ so that, in each of successive images, the association between candidate regions moving opposite to the direction of rotation of the transparent container 2 is weakened. In other words, the rotation direction weighting unit 405 uses the weight information δ to update the weights so that the cosine similarities between rotation direction vectors of the container and center-of-gravity vectors from in-image positions of candidate regions in an (N−1)-th frame to in-image positions of respective candidate regions in an N-th frame are calculated, and those with low similarity are excluded. In
The weight matrix generation unit 401 generates a weight matrix M regarding the respective tracking results up to the (N−1)-th frame and the respective candidate regions in the N-th frame. The elements m_ij∈M of the weight matrix M relating to the candidate regions c_i∈C appearing in the N-th frame and tracking results t_j∈T up to the (N−1)-th frame can be expressed by Equation (1). In Equation (1), the symbol “*” indicates multiplication.
In this case, D_ij represents the L2 norm (distance) between the center-of-gravity coordinates of the newest candidate region making up the tracking results t_j and those of the cluster c_i. The weight information α, β, γ, and δ represent the weights generated by the new trajectory weighting unit 402, the position weighting unit 403, the size change weighting unit 404, and the rotation direction weighting unit 405, and λ_1, λ_2, λ_3, and λ_4 represent weights for adjusting the same.
Furthermore, the link determination unit 406 in the tracking determination unit 307 uses the weight matrix M and the Hungarian algorithm to assign the candidate regions in the N-th frame to the tracking results in the (N−1)-th frame (step S211). As a result thereof, the respective candidate regions included in the N-th frame are classified into candidate regions not included in trajectories being tracked and candidate regions included in trajectories being tracked. The processing in the tracking determination unit 307 is one embodiment of a process for using candidate regions in successive images chronologically captured among multiple images and weight information (weight matrix M) indicating the level of continuity of candidate regions in those images to identify, as associated regions, candidate regions in the successive images. The link determination unit 406 records, in the unused region storage unit 407, pixel information regarding candidate regions not included in a trajectory being tracked in the images of the N frames. For example, the link determination unit 406 records, in the unused region storage unit 407, identifiers of the images of the N frames associated with identifiers of candidate regions not included in a trajectory being tracked, image information making up those candidate regions, etc. The link determination unit 406 records, in the existing trajectory storage unit 408, pixel information of candidate regions included in trajectories being tracked in the image of the N frames. For example, the link determination unit 406 records, in the existing trajectory storage unit 408, identifiers of the images of the N frames in association with identifiers of candidate regions included in trajectories being tracked, image information making up those candidate regions, etc.
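A minimal sketch of this assignment step is shown below, using SciPy's linear_sum_assignment as the Hungarian-style solver; treating the weight matrix M directly as a cost matrix and the optional cut-off for leaving candidates unassigned are assumptions made for illustration.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment  # Hungarian-style solver


def assign_candidates_to_tracks(weight_matrix, max_cost=None):
    """Assign candidate regions of the N-th frame to existing tracking results.

    `weight_matrix[i, j]` is taken as the cost of linking candidate i to tracking
    result j. Returns the accepted (candidate, track) pairs and the indices of
    candidates left unassigned (potential new trajectories).
    """
    rows, cols = linear_sum_assignment(weight_matrix)
    assigned, unused = [], set(range(weight_matrix.shape[0]))
    for i, j in zip(rows, cols):
        if max_cost is not None and weight_matrix[i, j] > max_cost:
            continue  # too costly: leave candidate i unassigned
        assigned.append((i, j))
        unused.discard(i)
    return assigned, sorted(unused)
```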
The trajectory length determination unit 409 identifies candidate regions that are linked to a length equal to or greater than a prescribed threshold value, within the existing trajectory storage unit 408, as candidate regions for being detection targets such as foreign matter and air bubbles (step S212). For example, the trajectory length determination unit 409 calculates, based on the identifiers of the images of the N frames, the identifiers of candidate regions included in trajectories being tracked, and the image information (such as pixel positions) making up those candidate regions, which are stored in the existing trajectory storage unit 408, the distance from the initial (i.e., in the image of a first frame) position of a candidate region to the position of the candidate region in the latest image (e.g., in the image of the N-th frame) indicating a trajectory. If this distance is equal to or greater than the threshold value, the trajectory length determination unit 409 identifies that the candidate region indicating the trajectory detected in the respective images from the first frame to the N-th frame is a candidate region for being a detection target such as foreign matter or air bubbles.
The recognition unit 308 acquires N frames of images including information on one or multiple detection target candidate regions identified by the above-mentioned process in the tracking determination unit 307. The candidate region information includes, for example, information, etc. indicating pixels in the images. The recognition unit 308 uses the detection target candidate regions identified in each of the images of the N frames in the tracking determination unit 307 to determine whether the transparent container 2 is contaminated with matter that is a detection target. In this process, the recognition unit 308 uses the positions (in-image coordinates) of the detection target candidate regions in the images of the N frames to generate chronological tracking data. Additionally, the chronological tracking data may be data generated based on successive images among the images of the N frames generated by the link determination unit 406 in the tracking determination unit 307 during the link determination process. As mentioned above, the chronological tracking data may be, for example, information such as identifiers, positions (in-image coordinates), and numbers of pixels indicating detection target regions in chronologically successive images, and motion amounts and motion vectors based on variations in the positions of those regions in the successive image frames. The chronological tracking data may be generated once for the images of the N frames, or the chronological tracking data may be generated for each of the images of the N frames.
The classifier 506 in the recognition unit 308 generates respective square patch images including the candidate regions, centered on the candidate regions in the respective images of the N frames. The classifier 506 selects, from among the respective patch images corresponding to a single trajectory in the respective images of the N frames, based on the respective images of the N frames and the chronological tracking data, the patch image in which the candidate region is the smallest, the patch image in which the candidate region is the largest, and the patch image in which the range of the candidate region is closest to the average among the patch images in that trajectory. The classifier 506 inputs these selected patch images to the first learning model. As a result thereof, the classifier 506 outputs, for the respective patch images that have been input, first determination results indicating whether or not the candidate regions in those patch images are detection targets (step S213). The first determination results may be information regarding results indicating whether the patch images corresponding to one or multiple candidate regions included in an image are foreign matter or air bubbles, which are detection targets, or are other background regions. The first determination results may be information indicating whether or not the patch images corresponding to one or multiple candidate regions included in an image match correct detection targets used as training data. The classifier 506 similarly generates respective patch images indicating other trajectories in the respective images of the N frames based on chronological tracking data, inputs each of the selected patch images in the respective trajectories to the first learning model, and outputs first determination results indicating whether or not the candidate regions in those patch images are detection targets. When outputting that a patch image is a detection target in the first determination results, the classifier 506 records, in the recognition result storage unit 508, information indicating that the transparent container 2 contains foreign matter, which is a detection target. The classifier 506 may record patch images determined to be detection targets in the recognition result storage unit 508 in association with an identifier of the transparent container 2.
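A minimal sketch of the patch selection for one trajectory is shown below; the trajectory representation, the fixed patch size, and the function name are assumptions made for illustration. Each selected patch would then be passed to the classifier built from the first learning model.

```python
import numpy as np


def select_patches_for_trajectory(frames, trajectory, patch_size=32):
    """Select the smallest, largest, and closest-to-average candidate regions of one
    trajectory and cut square patches centred on them.

    `frames` is a list of H x W images and `trajectory` is a list of
    (frame_index, (cx, cy), n_pixels) tuples describing one tracked candidate region.
    """
    sizes = np.array([n for _, _, n in trajectory])
    picks = {
        "smallest": int(np.argmin(sizes)),
        "largest": int(np.argmax(sizes)),
        "closest_to_average": int(np.argmin(np.abs(sizes - sizes.mean()))),
    }

    half = patch_size // 2
    patches = {}
    for name, idx in picks.items():
        frame_idx, (cx, cy), _ = trajectory[idx]
        img = frames[frame_idx]
        y0, x0 = max(int(cy) - half, 0), max(int(cx) - half, 0)
        patches[name] = img[y0:y0 + patch_size, x0:x0 + patch_size]
    return patches
```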
Additionally, the classifier 506 in the recognition unit 308 selects chronological tracking data corresponding to one trajectory in the respective images of the N frames. The classifier 506 inputs the selected chronological tracking data to the second learning model. As a result thereof, the classifier 506 outputs second determination results indicating whether or not the input chronological tracking data is a detection target (step S214). The classifier 506 similarly inputs chronological tracking data for other trajectories in the respective images of the N frames to the second learning model, and outputs second determination results indicating whether or not those trajectories are detection targets. When outputting that chronological tracking data is a detection target in the second determination results, the classifier 506 records, in the recognition result storage unit 508, information indicating that the transparent container 2 contains foreign matter, which is a detection target.
According to the processes described above, foreign matter accumulated at positions that are difficult to see due to the effects of refraction of light, such as at liquid surfaces and bottom surfaces, is rotated together with the transparent container 2. The detection targets such as foreign matter accumulated near the liquid surface or the bottom surface can then be more accurately detected by determining whether or not the motion of the detection targets moving due to the rotation has the characteristics of motion in accordance with the rotation.
The foreign matter detection device 100, as indicated in
In the second embodiment, the transparent container 2 has a cylindrical shape, and the medium enclosed inside the transparent container 2 is a liquid 3. Foreign matter that is a detection target has accumulated at the bottom surface of the liquid 3, and the image processing device 1 determines the presence or absence of foreign matter based on images that include foreign matter accumulated at the bottom surface of the liquid 3. The foreign matter that is the detection target may be accumulated at the liquid surface of the liquid 3, and the image processing device 1 may determine the presence or absence of foreign matter based on images that include foreign matter accumulated at the liquid surface of the liquid 3. The image processing device 1 may determine the types (foreign matter, air bubbles, etc.) of detection targets. The image processing device 1 may detect detection targets accumulated in a medium other than a liquid enclosed inside the transparent container 2. That is, the image processing device 1 may detect detection targets from images capturing detection targets accumulated at the bottom surface of the transparent container 2 or detection targets accumulated at surfaces at which a medium enclosed inside the transparent container 2 contacts another medium inside the transparent container 2. The image processing device 1 may detect detection targets from images capturing the detection targets accumulated at the bottom surface of the transparent container 2, or may detect detection targets from images capturing the detection targets accumulated at a top surface of the transparent container 2.
The rotation device 4 is a device that grips the bottom surface with the cylindrical transparent container 2 oriented vertically, and that rotates the transparent container 2 so that the axis of rotation (z axis) passes through the center of the bottom surface. Detection targets that are suspended, accumulated, etc. at the liquid surface, the liquid bottom, or within the liquid 3 enclosed inside the transparent container 2 rotate about the axis of rotation as the transparent container 2 is rotated. In the present embodiment, the image processing device 1 is provided with an image capture device 10. The image capture device 10 is installed at a position from which images of the detection targets can be captured, with the image capture direction fixed on an image capture axis (x axis) orthogonal to the z axis. While the rotation device 4 is rotating the transparent container 2, the image processing device 1 controls the image capture device 10 to capture images of the detection targets, and can obtain multiple consecutive images with angles of view in which the detection targets appear. While the image capture device 10 is capturing images, the relationship between the z axis and the x axis is maintained. As a result thereof, the image processing device 1 acquires multiple images containing the detection targets, the positions of which change as the rotation device 4 rotates the transparent container 2. In the second embodiment also, reflected illumination 5 may be used instead of the transmitted illumination 6, or both may be used.
Furthermore, the structural difference between the foreign matter detection devices 100 in the present embodiment and in the first embodiment is that the foreign matter detection device 100 of the second embodiment is provided with a shaking device 7, and the shaking device 7 shakes the transparent container 2 before rotation is applied by means of the rotation device 4. In the second embodiment, the shaking device 7, for example, grips the upper portion of the transparent container 2 and shakes the transparent container 2 from left to right in the manner of a pendulum. As a result thereof, the positions of detection targets such as foreign matter accumulated in the liquid, at the liquid surface, and at the liquid bottom inside the transparent container 2 move. The image processing device 1 of the second embodiment detects detection targets by using multiple images of the transparent container 2 being rotated by the rotation device 4 before being shaken by the shaking device 7 and multiple images of the transparent container 2 being rotated by the rotation device 4 after being shaken by the shaking device 7.
As indicated in
The image capture unit 701 controls the image capture device 10 to acquire images generated by the image capture device 10 capturing images.
The image storage unit 702 stores the images that the image capture unit 701 has acquired from the image capture device 10.
The pattern searching unit 703 performs a process for matching patterns that change, between pre-shaking images and post-shaking images, under the influence of the reflection of light, etc. appearing in the images.
The extraction unit 704 uses respective images in which patterns have been matched between pre-shaking images and post-shaking images to extract candidate regions for being detection targets.
The clustering unit 706 clusters, as single candidate regions, groups of adjacent pixels among the pixels remaining as candidate regions.
The tracking determination unit 707 sequentially identifies, across the pixels of multiple successive images captured at sequentially different timings, successive associated regions among the clustered candidate regions (detection windows) that move in accordance with the rotation of the transparent container.
The recognition unit 708 recognizes detection targets by using first determination results obtained by using a first learning model and image information for candidate regions that move in a movement direction in accordance with the rotation to determine whether or not the candidate regions are the detection target, and second determination results obtained by using a second learning model and information indicating a chronological change in the candidate regions that move in the movement direction in accordance with the rotation to determine whether or not the candidate regions are the detection target. The recognition of the detection target is one embodiment of the process for determining the presence or absence of detection targets.
In the second embodiment, the respective functions of the clustering unit 706, the tracking determination unit 707, and the recognition unit 708 are similar to those in the first embodiment.
Next, the processing in the second embodiment when the image processing device 1 actually recognizes a detection target will be explained. As with the processing explained for the first embodiment, it is assumed that a first learning model and a second learning model have been generated.
First, the rotation device 4 rotates the transparent container 2 about the z axis. In this state, the image capture unit 701 of the image processing device 1 controls the image capture device 10 so as to continuously capture images. The image capture unit 701 sequentially acquires images (frames) generated by the image capture device 10 (step S301). The image capture unit 701 sequentially records the respective images that have been acquired in the image storage unit 702 (step S302). The respective images that have been recorded are pre-shaking images.
Next, based on control by the image processing device 1, the shaking device 7 shakes the transparent container 2 in a panning direction (left-right direction) with the upper portion thereof as a fulcrum (step S303). As a result thereof, the transparent container 2 is shaken in the panning direction (left-right direction) viewed from the image capture device 10. Alternatively, a manager may manually shake the transparent container 2 in the panning direction. Then, the rotation device 4 again rotates the transparent container 2 about the z axis (step S304). The image capture unit 701 in the image processing device 1 controls the image capture device 10 so as to continuously capture images. The image capture unit 701 sequentially acquires images (frames) generated by the image capture device 10 (step S305). These respective images are post-shaking images.
The pattern searching unit 703 in the image processing device 1 compares the pixel regions having a brightness that is a prescribed threshold value or higher due to the effects of light in the respective pre-shaking images with the corresponding pixel regions in the respective post-shaking images, and identifies, as origin images to be used in the subsequent difference process, post-shaking images whose patterns match those of the pre-shaking images.
The extraction unit 704 compares the pre-shaking images and the post-shaking images determined to be origin images by the pattern searching unit 703 and takes the differences between the pixels in those images to extract, as detection target candidate regions, pixels at which the difference is a prescribed threshold value or higher.
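By way of illustration only, the following is a minimal sketch, in Python, of the kind of per-pixel difference thresholding described above. The function name, the grayscale representation, and the threshold value are assumptions made for this sketch and do not form part of the claimed configuration.

```python
import numpy as np

def extract_candidate_pixels(pre_image: np.ndarray,
                             post_image: np.ndarray,
                             diff_threshold: float = 30.0) -> np.ndarray:
    """Return a boolean mask of detection target candidate pixels.

    pre_image / post_image: grayscale frames (H x W) whose light-reflection
    patterns have already been matched by the pattern searching step.
    """
    # Absolute per-pixel difference between the matched pre- and post-shaking images.
    diff = np.abs(post_image.astype(np.float32) - pre_image.astype(np.float32))
    # Pixels whose difference is the threshold value or higher remain as candidates.
    return diff >= diff_threshold
```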
The clustering unit 706, for pixels in detection target candidate regions detected in images processed by the extraction unit 704, searches for linked pixels in eight neighboring pixels in the vicinity thereof, and if it is determined that the linked pixels are detection target candidate regions, combines the candidate regions by clustering so as to be identified as a single candidate region (step S309).
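A minimal sketch of the eight-neighbor clustering described above is shown below, assuming the candidate pixels are given as a boolean mask; the use of scipy.ndimage and the returned fields are illustrative assumptions.

```python
import numpy as np
from scipy import ndimage

def cluster_candidate_regions(candidate_mask: np.ndarray):
    # A 3x3 structuring element so that all eight neighboring pixels count as linked.
    structure = np.ones((3, 3), dtype=bool)
    labels, num_regions = ndimage.label(candidate_mask, structure=structure)
    # Each connected component is treated as one candidate region; its pixel
    # coordinates, center of gravity, and size are kept for later tracking.
    regions = []
    for region_id in range(1, num_regions + 1):
        ys, xs = np.nonzero(labels == region_id)
        regions.append({
            "pixels": np.column_stack([ys, xs]),
            "centroid": (float(ys.mean()), float(xs.mean())),
            "size": int(len(ys)),
        })
    return regions
```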
The tracking determination unit 707 sequentially tracks candidate regions identified by the clustering unit 706 between successive frames, and when a sufficiently long trajectory has been obtained, determines that the trajectory is that of a detection target such as foreign matter or an air bubble.
In the tracking determination unit 707, tracking results are updated by associating respective candidate regions c_i∈C in a current frame with tracking results t_j∈T obtained up to the current frame. In this case, C represents the set of candidate regions newly detected in the current frame and T represents the set of tracking results obtained up to the current frame. The tracking results are updated by generating a weight matrix M and using said weight matrix M and the Hungarian algorithm to associate candidate regions having the same detection target trajectory between frames.
The weight matrix generation unit 401 in the tracking determination unit 707 generates the weight matrix M, as in the first embodiment, based on respective weight information including weight information α obtained from the new trajectory weighting unit 402, weight information β obtained from the position weighting unit 403, weight information γ obtained from the size change weighting unit 404, and weight information δ obtained from the rotation direction weighting unit 405.
The weight matrix generation unit 401 generates a weight matrix M regarding the respective tracking results up to the (N−1)-th frame and the respective candidate regions in the N-th frame. The elements m_ij∈M of the weight matrix M relating to the candidate regions c_i∈C appearing in the N-th frame and the tracking results t_j∈T up to the (N−1)-th frame can be expressed by Equation (2). In Equation (2), the symbol “*” indicates multiplication.
In this case, D_ij represents the L2 norm of the difference between the center-of-gravity coordinates of the newest candidate region making up the tracking result t_j and those of the cluster c_i. The weight information α, β, γ, and δ represent the weights generated by the new trajectory weighting unit 402, the position weighting unit 403, the size change weighting unit 404, and the rotation direction weighting unit 405, respectively, and λ_1, λ_2, λ_3, and λ_4 represent weights adjusting the same.
Furthermore, the link determination unit 406 in the tracking determination unit 707 uses the weight matrix M and the Hungarian algorithm to assign the candidate regions in the N-th frame to the tracking results up to the (N−1)-th frame (step S310). As a result thereof, the respective candidate regions included in the N-th frame are classified into candidate regions not included in trajectories being tracked and candidate regions included in trajectories being tracked. The link determination unit 406 records, in the unused region storage unit 407, pixel information regarding candidate regions not included in a trajectory being tracked in the images of the N frames. For example, the link determination unit 406 records, in the unused region storage unit 407, identifiers of the images of the N frames in association with identifiers of candidate regions not included in a trajectory being tracked, image information making up those candidate regions, etc. The link determination unit 406 records, in the existing trajectory storage unit 408, pixel information of candidate regions included in trajectories being tracked in the images of the N frames. For example, the link determination unit 406 records, in the existing trajectory storage unit 408, identifiers of the images of the N frames in association with identifiers of candidate regions included in trajectories being tracked, image information making up those candidate regions, etc.
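Purely for illustration, the following sketch associates candidate regions with tracking results using a weight matrix and the Hungarian algorithm. The simple additive combination of the weight terms shown here is an assumption; the actual combination of α, β, γ, δ, λ_1 to λ_4, and D_ij is defined by Equation (2), and the function and variable names are illustrative.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def associate(candidates, tracks, alpha, beta, gamma, delta,
              lambdas=(1.0, 1.0, 1.0, 1.0)):
    """candidates: list of dicts with a "centroid" key (current frame, c_i in C).
    tracks: list of dicts whose "centroid" is that of the newest region of t_j in T.
    alpha, beta, gamma, delta: (len(candidates) x len(tracks)) weight arrays from the
    new trajectory, position, size change, and rotation direction weighting units.
    """
    num_c, num_t = len(candidates), len(tracks)
    M = np.zeros((num_c, num_t), dtype=np.float32)
    for i, c in enumerate(candidates):
        for j, t in enumerate(tracks):
            # D_ij: L2 norm between the centers of gravity of c_i and the newest
            # candidate region making up tracking result t_j.
            d_ij = np.linalg.norm(np.asarray(c["centroid"]) - np.asarray(t["centroid"]))
            # Assumed additive combination of the weighted terms (illustrative only).
            M[i, j] = (lambdas[0] * alpha[i, j] + lambdas[1] * beta[i, j]
                       + lambdas[2] * gamma[i, j] + lambdas[3] * delta[i, j] + d_ij)
    # The Hungarian algorithm finds the assignment minimizing the total weight.
    row_idx, col_idx = linear_sum_assignment(M)
    matched = list(zip(row_idx.tolist(), col_idx.tolist()))
    unmatched = [i for i in range(num_c) if i not in set(row_idx.tolist())]
    return matched, unmatched
```

Candidate regions returned in matched pairs correspond to those recorded in the existing trajectory storage unit 408, while unmatched candidates correspond to those recorded in the unused region storage unit 407.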
The trajectory length determination unit 409 identifies candidate regions that are linked to a length equal to or greater than a prescribed threshold value, within the existing trajectory storage unit 408, as candidate regions for being detection targets such as foreign matter and air bubbles (step S311). For example, the trajectory length determination unit 409 calculates, based on the identifiers of the images of the N frames, the identifiers of candidate regions included in trajectories being tracked, and the image information (such as pixel positions) making up those candidate regions, which are stored in the existing trajectory storage unit 408, the distance from the initial (i.e., in the image of a first frame) position of a candidate region to the position of the candidate region in the latest image (e.g., in the image of the N-th frame) indicating a trajectory. If this distance is equal to or greater than the threshold value, the trajectory length determination unit 409 identifies that the candidate region indicating the trajectory detected in the respective images from the first frame to the N-th frame is a candidate region for being a detection target such as foreign matter or air bubbles.
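A minimal sketch of this trajectory length determination follows; the threshold value and the function name are assumptions for illustration.

```python
import numpy as np

def is_detection_target_candidate(trajectory_positions, length_threshold=20.0):
    """trajectory_positions: list of (y, x) centroids of one tracked candidate region,
    ordered from the image of the first frame to the latest image."""
    if len(trajectory_positions) < 2:
        return False
    start = np.asarray(trajectory_positions[0], dtype=np.float32)
    latest = np.asarray(trajectory_positions[-1], dtype=np.float32)
    # Keep the trajectory if the distance from the initial position to the latest
    # position is equal to or greater than the prescribed threshold value.
    return float(np.linalg.norm(latest - start)) >= length_threshold
```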
The recognition unit 708 acquires N frames of images including information on one or multiple detection target candidate regions identified by the above-mentioned process in the tracking determination unit 707. The candidate region information includes, for example, information indicating pixels in the images, etc. The recognition unit 708 uses the detection target candidate regions identified in each of the images of the N frames in the tracking determination unit 707 to determine whether the transparent container 2 is contaminated with matter that is a detection target by the process indicated below. In this process, the recognition unit 708 uses the positions (in-image coordinates) of the detection target candidate regions in the images of the N frames to generate chronological tracking data. Additionally, the chronological tracking data may be data generated based on successive images among the images of the N frames generated by the link determination unit 406 in the tracking determination unit 707 during the link determination process. As mentioned above, the chronological tracking data may be, for example, information such as identifiers, positions (in-image coordinates), and numbers of pixels indicating detection target regions in chronologically successive images, and motion amounts and motion vectors based on variations in the positions of those regions in the successive image frames. The chronological tracking data may be generated once for the images of the N frames, or the chronological tracking data may be generated for each of the images of the N frames.
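The following sketch assembles chronological tracking data for one trajectory from the per-frame candidate region information; the field names and data layout are assumptions, while the data items themselves (identifiers, positions, numbers of pixels, motion amounts, and motion vectors) follow the description above.

```python
import numpy as np

def build_tracking_data(frames):
    """frames: list of dicts per frame for one tracked candidate region, each with
    "frame_id", "region_id", "centroid" (y, x), and "num_pixels"."""
    data = []
    prev_centroid = None
    for frame in frames:
        centroid = np.asarray(frame["centroid"], dtype=np.float32)
        # Motion vector between chronologically successive image frames.
        motion_vector = (np.zeros(2, dtype=np.float32) if prev_centroid is None
                         else centroid - prev_centroid)
        data.append({
            "frame_id": frame["frame_id"],
            "region_id": frame["region_id"],
            "position": centroid.tolist(),
            "num_pixels": frame["num_pixels"],
            "motion_amount": float(np.linalg.norm(motion_vector)),
            "motion_vector": motion_vector.tolist(),
        })
        prev_centroid = centroid
    return data
```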
The classifier 506 in the recognition unit 708 generates respective square patch images including the candidate regions, centered on the candidate regions in the respective images of the N frames. The classifier 506 selects, from among the respective patch images corresponding to a single trajectory in the respective images of the N frames, based on the respective images of the N frames and the chronological tracking data, the patch image in which the candidate region is the smallest, the patch image in which the candidate region is the largest, and the patch image in which the range of the candidate region is closest to the average among the patch images in that trajectory. The classifier 506 inputs these selected patch images to the first learning model. As a result thereof, the classifier 506 outputs, for the respective patch images that have been input, first determination results indicating whether or not the candidate regions in those patch images are detection targets (step S312). The first determination results may be information regarding results indicating whether the patch images corresponding to one or multiple candidate regions included in an image are foreign matter or air bubbles, which are detection targets, or are other background regions. The first determination results may be information indicating whether or not the patch images corresponding to one or multiple candidate regions included in an image match correct detection targets used as training data. The classifier 506 similarly generates respective patch images indicating other trajectories in the respective images of the N frames based on chronological tracking data, inputs each of the selected patch images in the respective trajectories to the first learning model, and outputs first determination results indicating whether or not the candidate regions in those patch images are detection targets. When outputting that a patch image is a detection target in the first determination results, the classifier 506 records, in the recognition result storage unit 508, information indicating that the transparent container 2 contains foreign matter, which is a detection target. The classifier 506 may record patch images determined to be detection targets in the recognition result storage unit 508 in association with an identifier of the transparent container 2.
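A minimal sketch of the patch generation and selection described above is shown below; the patch size, the model interface (a generic predict call), and the names are assumptions for illustration, not the claimed implementation of the first learning model.

```python
import numpy as np

def crop_patch(image, centroid, patch_size=32):
    # Square patch image centered on the candidate region (clipped at image borders).
    h, w = image.shape[:2]
    cy, cx = int(round(centroid[0])), int(round(centroid[1]))
    half = patch_size // 2
    y0, y1 = max(0, cy - half), min(h, cy + half)
    x0, x1 = max(0, cx - half), min(w, cx + half)
    return image[y0:y1, x0:x1]

def first_determination(images, tracking_data, first_model, patch_size=32):
    """images / tracking_data: per-frame images and tracking entries of one trajectory."""
    sizes = np.asarray([d["num_pixels"] for d in tracking_data])
    # Select the smallest, the largest, and the closest-to-average candidate regions.
    selected = {int(sizes.argmin()), int(sizes.argmax()),
                int(np.abs(sizes - sizes.mean()).argmin())}
    patches = [crop_patch(images[i], tracking_data[i]["position"], patch_size)
               for i in sorted(selected)]
    # The first learning model returns, per patch, whether the candidate region is a
    # detection target (e.g., foreign matter or an air bubble) or background.
    return [first_model.predict(p) for p in patches]
```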
Additionally, the classifier 506 in the recognition unit 708 selects chronological tracking data corresponding to one trajectory in the respective images of the N frames. The classifier 506 inputs the selected chronological tracking data to the second learning model. As a result thereof, the classifier 506 outputs second determination results indicating whether or not the input chronological tracking data is a detection target (step S313). The classifier 506 similarly inputs chronological tracking data for other trajectories in the respective images of the N frames to the second learning model, and outputs second determination results indicating whether or not those trajectories are detection targets. When outputting that chronological tracking data is a detection target in the second determination results, the classifier 506 records, in the recognition result storage unit 508, information indicating that the transparent container 2 contains foreign matter, which is a detection target.
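The following sketch illustrates feeding the chronological tracking data of one trajectory to the second learning model; the feature layout and the model interface are assumptions for illustration.

```python
import numpy as np

def second_determination(tracking_data, second_model):
    # Arrange the chronological tracking data as a fixed-order feature sequence:
    # position, number of pixels, motion amount, and motion vector per frame.
    sequence = np.asarray([
        [*d["position"], d["num_pixels"], d["motion_amount"], *d["motion_vector"]]
        for d in tracking_data
    ], dtype=np.float32)
    # The second learning model determines, from the chronological change of the
    # candidate region, whether or not the trajectory corresponds to a detection target.
    return second_model.predict(sequence)
```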
According to the processes described above, foreign matter accumulated at positions that are difficult to see due to the effects of refraction of light, such as at liquid surfaces and bottom surfaces, is rotated together with a transparent container 2, and the detection targets such as foreign matter accumulated near the liquid surfaces or bottom surfaces can be more accurately detected by determining whether or not the characteristics of the motion of the detection targets such as foreign matter moving due to the rotation are characteristics of motion in accordance with the rotation.
The image processing device 1 is provided with at least a first determining means and a second determining means.
The first determining means compares multiple images capturing a detection target inside a transparent container 2, the images being captured while rotating the transparent container 2, and determines, among candidate regions for being the detection target appearing in the images, candidate regions that move in a movement direction in accordance with the rotation (step S2101).
The second determining means determines the presence or absence of the detection target by using first determination results obtained by using a first learning model and image information for the candidate regions that move in the movement direction in accordance with the rotation to determine whether or not the candidate regions are the detection target, and second determination results obtained by using a second learning model and information indicating a chronological change in the candidate regions that move in the movement direction in accordance with the rotation to determine whether or not the candidate regions are the detection target (step S2102).
The image processing device 1 described above has an internal computer system. Furthermore, the steps in the respective processes described above are stored in a computer-readable storage medium in the form of a program, and the processes described above are performed by a computer reading and executing this program. In this case, the computer-readable storage medium refers to a magnetic disk, a magneto-optic disk, a CD-ROM, a DVD-ROM, a semiconductor memory, or the like. Additionally, this computer program may be distributed to a computer by means of communication lines, and the computer that has received this distribution may execute said program.
Additionally, the program described above may be for realizing just some of the aforementioned functions. Furthermore, it may be a so-called difference file (difference program) that can realize the aforementioned functions by being combined with a program already recorded in a computer system.
The present invention can be applied to uses such as automated and high-speed implementation, as opposed to manual performance, of inspections regarding whether or not liquids such as medical drug products are contaminated with foreign matter or the like. Additionally, the present invention can also be used for inspecting whether beverages or the like are contaminated with foreign matter when producing the beverages.
Although some or all of the embodiments mentioned above may be described as in the appendices below, they are not limited to what is indicated below.
An image processing device includes:
The image processing device according to appendix 1, provided with:
The image processing device according to appendix 2, provided with:
The image processing device according to any one of appendix 1 to appendix 3, provided with:
The image processing device according to appendix 4, wherein:
The image processing device according to appendix 4 or appendix 5, wherein:
The image processing device according to any one of appendix 4 to appendix 6, wherein:
The image processing device according to any one of appendix 4 to appendix 7, wherein:
The image processing device according to any one of appendix 1 to appendix 8, wherein:
An image processing method includes:
A storage medium in which a program is stored for making a computer in an image processing device execute:
| Filing Document | Filing Date | Country | Kind |
|---|---|---|---|
| PCT/JP2022/007601 | 2/24/2022 | WO | |