The present application claims priority to and the benefit of Korean Patent Application No. 10-2023-0094552, filed on Jul. 20, 2023, the entire disclosure of which is incorporated herein by reference.
The present disclosure relates to an image processing apparatus and method.
The content described in the present section simply provides background information for the present disclosure and does not constitute prior art.
Image object recognition in an autonomous driving vehicle refers to a technology for analyzing a surrounding image of the vehicle and recognizing objects such as roads, lanes, obstacles, and pedestrians. The autonomous driving vehicle may provide a safe and accurate traveling environment by utilizing an image object recognition technology. The image object recognition technology in an autonomous vehicle may be applied through the following process. Images are collected and preprocessed. Objects in the preprocessed image are detected, and the detected objects are separated. The separated objects are tracked, and meanings of the objects are interpreted to make traveling determinations.
In object recognition based on an image processing apparatus and/or method of the related art, an area of non-interest is defined in advance and processing is performed. In the image processing apparatus and/or method of the related art, a size of the area of non-interest is defined to be larger than that of a vehicle to eliminate an error that occurs when surrounding images of the vehicle are combined. However, because the size or position of the area of non-interest may change after the vehicle is shipped, there is a limit to defining the area of non-interest in advance and performing processing. For example, even though an object with a low probability of actually colliding with the vehicle is detected, the object may be incorrectly recognized as an object with a high probability of colliding, and incorrect information may be provided to a driver. Alternatively, the amount of information provided to the driver may be reduced. Such an image processing apparatus and/or method may result in lower quality of object recognition.
To improve on the image processing apparatus and/or method of the related art, SSEG (Semantic Segmentation), based on a learning model that classifies boundaries between objects, may be used. SSEG is an image-based learning model that divides objects in an image in units of pixels. SSEG may be used to recognize a space and to recognize lanes or obstacles in an autonomous vehicle. However, there is a problem in that it is impossible to accurately distinguish between an area of non-interest and an area of interest using only the basic structure of SSEG. For example, when a road around the vehicle is reflected on a painted surface of the car body, SSEG cannot accurately distinguish between the area of non-interest and the area of interest.
In view of the above, the present disclosure is intended to solve these problems, and a main object of the present disclosure is to accurately distinguish between an area of non-interest and an area of interest by inputting an image providing additional information along with an existing input image to an artificial intelligence model.
Further, another main object of the present disclosure is to minimize a negative impact on performance for other classes by inputting an existing input image to an artificial intelligence model without changing the existing input image.
Further, yet another main object of the present disclosure is to accurately distinguish between an area of non-interest and an area of interest and secure the maximum number of objects present around a vehicle by updating additional information in real time and inputting the information to an artificial intelligence model.
The problems to be solved by the present disclosure are not limited to the problems mentioned above, and other problems not mentioned can be clearly understood by those skilled in the art from the description below.
An embodiment of the present disclosure provides an image processing apparatus for distinguishing between an area of interest and an area of non-interest in an image, the image processing apparatus comprising: a memory configured to store a second image including information on an area of non-interest, an area of interest, and a variable area; a receiver configured to receive a first image captured from at least one camera; and a processor including an artificial intelligence model trained to distinguish objects in an area of interest of the first image from input data consisting of the second image and the first image, wherein the processor extracts information on an area of non-interest from a resultant image outputted by the artificial intelligence model, and updates a size of the variable area of the second image using the extracted information on the area of non-interest.
Another embodiment of the present disclosure provides an image processing method of distinguishing between an area of interest and an area of non-interest in an image using an image processing apparatus, the image processing method comprising: receiving a first image captured from at least one camera; preprocessing the first image and a second image that includes information on an area of non-interest, an area of interest, and a variable area; inputting preprocessed data, comprising the first image and the second image, into an artificial intelligence model; outputting a resultant image from the artificial intelligence model; extracting information on an area of non-interest from the resultant image; and updating a size of the variable area of the second image using the extracted information on the area of non-interest, wherein the artificial intelligence model is trained to distinguish objects in the area of interest of the first image from input data comprising the second image and the first image.
As described above, according to the present embodiment, there is an effect that it is possible to accurately distinguish between an area of non-interest and an area of interest by inputting an image providing additional information along with an existing input image to an artificial intelligence model.
Further, according to the present embodiment, there is an effect that it is possible to minimize a negative impact on performance for other classes by inputting an existing input image to an artificial intelligence model without changing the existing input image.
Further, according to the present embodiment, there is an effect that it is possible to accurately distinguish between an area of non-interest and an area of interest and secure the maximum number of objects present around a vehicle by updating additional information in real time and inputting the information to an artificial intelligence model.
Hereinafter, some exemplary embodiments of the present disclosure will be described in detail with reference to the accompanying drawings. In the following description, like reference numerals preferably designate like elements, although the elements are shown in different drawings. Further, in the following description of some embodiments, a detailed description of known functions and configurations incorporated therein will be omitted for the purpose of clarity and for brevity.
Additionally, various terms such as first, second, A, B, (a), (b), etc., are used solely to differentiate one component from the other but not to imply or suggest the substances, order, or sequence of the components.
Throughout the present specification, when a part ‘includes’ or ‘comprises’ a component, the part is meant to further include other components, not to exclude thereof unless specifically stated to the contrary.
The terms such as ‘unit’, ‘module’, and the like refer to one or more units for processing at least one function or operation, which may be implemented by hardware, software, or a combination thereof.
Unless otherwise noted, the description of one embodiment is intended to be applicable to other embodiments.
The following detailed description, together with the accompanying drawings, is intended to describe exemplary embodiments of the present invention and is not intended to represent the only embodiments in which the present invention may be practiced.
Referring to
The receiver 130 may receive a first image captured from at least one camera 140. The camera 140 is mounted on a vehicle and may capture a surrounding image of the vehicle. For example, the first image may include objects such as roads, trees, other vehicles, people, and buildings. The camera 140 may be configured to capture a wide-angle image.
The memory 120 may store the second image. The second image may include information on an area of non-interest, an area of interest, and a variable area. The area of non-interest refers to an unnecessary area within the area that is an object separation target in the first image. For example, the area of non-interest may be a partial area of the vehicle on which the camera 140 is mounted, an area occupied by an object attached to the vehicle by the driver, or the like. A more detailed description of the area of non-interest will be provided later with reference to
The processor 110 may include the artificial intelligence model 111. The artificial intelligence model 111 may be trained to distinguish or recognize objects in an area of interest of the first image from input data consisting of the second image and the first image, and to separate an object in the area of interest of the first image. The artificial intelligence model 111 is SSEG (Semantic Segmentation) and may be trained using an artificial neural network model. Here, the artificial neural network model may be any one of U-Net (Universal Network), V-Net (Vision Network), ResNet (Residual Network), DenseNet (Densely Connected Convolutional Network), InceptionNet (Inception Network), ShuffleNet, MobileNet, EfficientNet, Swin Transformer, CNN (Convolutional Neural Network), LSTM (Long Short-Term Memory), and RNN (Recurrent Neural Network).
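As an illustrative sketch only (not the claimed implementation), the combined input data described above may be formed by stacking the first image (a camera frame) and the second image (an area map) along the channel axis before they are passed to the segmentation model. The function name `build_model_input` and the channel layout are assumptions for illustration:

```python
import numpy as np

def build_model_input(first_image: np.ndarray, second_image: np.ndarray) -> np.ndarray:
    """Stack the camera frame (H x W x 3) and the area map (H x W or
    H x W x 1, encoding non-interest / interest / variable areas) into a
    single multi-channel input for the segmentation model."""
    if second_image.ndim == 2:
        second_image = second_image[..., np.newaxis]  # add a channel axis
    return np.concatenate([first_image, second_image], axis=-1)

first = np.zeros((480, 640, 3), dtype=np.float32)   # first image (camera)
second = np.zeros((480, 640), dtype=np.float32)     # second image (area map)
x = build_model_input(first, second)
print(x.shape)  # (480, 640, 4)
```

Feeding the area map as an extra channel, rather than modifying the camera frame itself, is consistent with the stated goal of leaving the existing input image unchanged.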
The processor 110 may extract information on the area of non-interest from the resultant image, outputted from the artificial intelligence model 111. The processor 110 may update a size of the variable area of the second image using the extracted information on the area of non-interest. Various embodiments in which the processor 110 updates the size of the variable area of the second image will be described later with reference to
Referring to
The image processing apparatus may preprocess the first image and the second image, which is previously stored in the memory 120 (S220). Here, preprocessing means processing the first image and the second image into a form that is easy to input to the artificial intelligence model 111. The preprocessing of the first image and the second image may include at least one of image resizing, image pixel value normalization, image noise removal, image histogram equalization, image embossing, image masking, image cropping, image rotation, image flipping, and image enhancement.
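Two of the preprocessing operations named above, pixel-value normalization and resizing, can be sketched as follows. This is a minimal numpy-only illustration, not the patented method; the target size of 256 x 256 and the nearest-neighbour interpolation are assumptions:

```python
import numpy as np

def normalize(image: np.ndarray) -> np.ndarray:
    """Map 8-bit pixel values to the range [0, 1]."""
    return image.astype(np.float32) / 255.0

def resize_nearest(image: np.ndarray, out_h: int, out_w: int) -> np.ndarray:
    """Nearest-neighbour resize via integer index mapping."""
    h, w = image.shape[:2]
    rows = np.arange(out_h) * h // out_h
    cols = np.arange(out_w) * w // out_w
    return image[rows][:, cols]

img = np.full((480, 640, 3), 255, dtype=np.uint8)  # stand-in camera frame
pre = normalize(resize_nearest(img, 256, 256))
print(pre.shape, pre.max())  # (256, 256, 3) 1.0
```

In practice both the first and second images would be passed through the same resizing step so that their spatial dimensions match before the channel stacking described earlier.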
The image processing apparatus may input the pre-processed first image and the pre-processed second image to the artificial intelligence model 111 (S230). The artificial intelligence model 111 may be configured to separate the area of non-interest and the area of interest from the first image when the image processing apparatus simultaneously inputs the first image and the second image to the artificial intelligence model 111. Accordingly, the artificial intelligence model 111 may separate an object in the area of interest other than the area of non-interest in the first image.
The image processing apparatus may receive the resultant image from the artificial intelligence model 111 (S240). The resultant image may show the first image with the objects that have been distinguished from the input image.
The image processing apparatus may extract the information on the area of non-interest from the resultant image (S250). For example, the image processing apparatus may handle (or treat) a body portion of the vehicle, including a tire portion, as the area of non-interest. The image processing apparatus may update the size of the variable area of the second image using the extracted information on the area of non-interest (S260). A process in which the image processing apparatus updates the variable area of the second image will be described later with reference to
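The extraction and update steps (S250, S260) can be sketched as follows. This is a hypothetical illustration: the bounding-extent update rule and the label value `1` for the variable area are assumptions, not details taken from the disclosure:

```python
import numpy as np

def update_variable_area(second_image: np.ndarray,
                         non_interest_mask: np.ndarray) -> np.ndarray:
    """Grow/shrink the variable area of the second image to cover the
    pixels recognized as non-interest (e.g. the vehicle body) in the
    resultant image."""
    ys, xs = np.nonzero(non_interest_mask)
    if ys.size == 0:
        return second_image  # nothing recognized; keep the previous area
    updated = second_image.copy()
    # Mark the bounding extent of the recognized non-interest pixels.
    updated[ys.min():ys.max() + 1, xs.min():xs.max() + 1] = 1
    return updated

second = np.zeros((4, 4), dtype=np.uint8)      # stand-in second image
mask = np.zeros((4, 4), dtype=bool)            # extracted non-interest pixels
mask[1:3, 1:3] = True
out = update_variable_area(second, mask)
print(out.sum())  # 4
```

Because the update runs on every resultant image, the variable area tracks changes that occur after the vehicle is shipped, which is the motivation stated earlier.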
Referring to
Referring to
Equation 1 shows the method of calculating the reliability of the recognition result of the artificial intelligence model 111 according to the number of accumulated pieces of information on the area of non-interest. In Equation 1, conf means the reliability of the recognition result of the artificial intelligence model 111. n is a variable indicating the number of the accumulated pieces of information on the area of non-interest. n may be determined through an experiment. g is a function that converts input signal values into output signals. g may be set according to an environment of the image processing apparatus.
Equation 2 shows a method of updating conf. In Equation 2, τ denotes a momentum coefficient. τ is a variable used when updating conf. τ is a real number between 0 and 1 and may be determined through an experiment.
Equation 3 shows a process of processing conf through f and inputting the processed conf to the second image. In Equation 3, f is a type of function that processes conf. Using f, a pixel to be input to the second image may be calculated from conf.
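The equations themselves are not reproduced in this text. One plausible form consistent with the surrounding description (g maps the accumulation count n to a confidence, τ is the momentum coefficient, and f maps the confidence to a pixel value) would be the following; this is a reconstruction under stated assumptions, not the disclosed equations:

```latex
% Eq. 1: confidence from the number n of accumulated
% pieces of non-interest information
\mathrm{conf} = g(n)

% Eq. 2: momentum update of the confidence, 0 < \tau < 1
\mathrm{conf}_{t} = \tau \cdot \mathrm{conf}_{t-1} + (1 - \tau) \cdot g(n)

% Eq. 3: pixel value written into the second image
p = f(\mathrm{conf})
```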
Referring to
Referring to
The flowchart of the present disclosure describes processes as being sequentially executed, but this is merely illustrative of the technical idea of an embodiment of the present disclosure. In other words, since it is apparent to those having ordinary skill in the art that an order described in the flowchart may be changed or one or more processes may be executed in parallel without departing from the essential characteristics of an embodiment of the present disclosure, the flowchart is not limited to a time-series order.
Various implementations of systems and techniques described herein may be realized as digital electronic circuits, integrated circuits, field programmable gate arrays (FPGAs), application specific integrated circuits (ASICs), computer hardware, firmware, software, and/or combinations thereof. These various implementations may include one or more computer programs executable on a programmable system. The programmable system includes at least one programmable processor (which may be a special-purpose processor or a general-purpose processor) coupled to receive and transmit data and instructions from and to a storage system, at least one input device, and at least one output device. The computer programs (also known as programs, software, software applications or codes) contain commands for a programmable processor and are stored in a “computer-readable recording medium”.
The computer-readable recording medium includes all types of recording devices in which data readable by a computer system is stored. Such a computer-readable recording medium may be a non-volatile or non-transitory medium, such as ROM, CD-ROM, magnetic tape, floppy disk, memory card, hard disk, magneto-optical disk, or a storage device, and may further include a transitory medium such as a data transmission medium. In addition, the computer-readable recording medium may be distributed in a computer system connected via a network, so that computer-readable codes may be stored and executed in a distributed manner.
Various implementations of systems and techniques described herein may be embodied by a programmable computer. Here, the computer includes a programmable processor, a data storage system (including volatile memory, non-volatile memory, or other types of storage systems, or combinations thereof) and at least one communication interface. For example, the programmable computer may be one of a server, a network device, a set top box, an embedded device, a computer expansion module, a personal computer, a laptop, a personal data assistant (PDA), a cloud computing system, or a mobile device.
Although exemplary embodiments of the present disclosure have been described for illustrative purposes, those having ordinary skill in the art will appreciate that various modifications, additions, and substitutions are possible, without departing from the idea and scope of the claimed invention. Therefore, exemplary embodiments of the present disclosure have been described for the sake of brevity and clarity. The scope of the technical idea of the present embodiments is not limited by the illustrations. Accordingly, one of ordinary skill would understand that the scope of the claimed invention is not to be limited by the above explicitly described embodiments but by the claims and equivalents thereof.
Number | Date | Country | Kind |
---|---|---|---|
10-2023-0094552 | Jul 2023 | KR | national |