The present application claims the benefit under 35 U.S.C. § 119 of European Patent Application No. EP 23 17 6815.1 filed on Jun. 1, 2023, which is expressly incorporated herein by reference in its entirety.
The present invention relates to a device and a computer-implemented method for evaluating a digital image.
Digital images are evaluated for example in the field of at least partially autonomous vehicles or automated optical inspection devices.
A device and a computer-implemented method for evaluating a digital image according to features of the present invention provide a simple score for classifying digital images.
According to an example embodiment of the present invention, the method comprises providing the digital image, providing a first part of a predetermined model, wherein the predetermined model is configured for determining a semantic segmentation of the digital image with a second part of the predetermined model, wherein the first part is configured to determine a feature depending on the digital image, wherein the second part is configured to determine the semantic segmentation depending on the feature, wherein the method comprises determining the feature depending on the digital image with the first part, providing a set of quantizations for quantizing the feature, determining the quantization of the feature depending on the set of quantizations and depending on the feature, determining a quantization error depending on the feature and the quantization, and evaluating the digital image depending on the quantization error. The feature and the quantization may be vectors. The quantization error may be a distance between the vectors that serves as the simple score to evaluate the digital image.
The method may comprise providing the second part of the model, wherein the second part is configured for determining the semantic segmentation of the digital image depending on the quantization, and determining the semantic segmentation of the digital image depending on the quantization with the second part. The score that is provided in addition to the semantic segmentation allows evaluating the semantic segmentation.
In one example embodiment of the present invention, providing the set of quantizations comprises providing a reference for the semantic segmentation of the digital image, determining the semantic segmentation of the digital image, and determining a quantization in the set of quantizations depending on a difference between the reference and the semantic segmentation and depending on the quantization error. The first part and the second part are predetermined parts of the predetermined model. The quantization is determined without modifying the first part or the second part.
In one example embodiment of the present invention, providing the pretrained model comprises training the first part to determine the feature and the second part to determine the semantic segmentation depending on the feature. The set of quantizations is inserted into the pretrained model between the pretrained first part and the pretrained second part.
In one example embodiment of the present invention, the method comprises determining the feature with a predetermined normalization, and determining the quantization for the feature with the predetermined normalization. The normalization normalizes the feature and the quantization in the same way in order to facilitate the determination of the quantization error. The vectors representing feature and quantization may be normalized to unit length.
In one example embodiment of the present invention, the method comprises upscaling the feature from a first scale to a second scale, determining the quantization in the second scale from the feature in the second scale, downscaling the quantization from the second scale to the first scale and determining the quantization error in the first scale. The change of the resolution may improve accuracy.
Preferably, according to an example embodiment of the present invention, the feature is a vector and the quantization of the feature is a vector, wherein determining the quantization error comprises determining a cosine distance between the feature and the quantization of the feature.
According to an example embodiment of the present invention, providing the digital image may comprise capturing the digital image in particular with a camera of an at least partially autonomous vehicle or an automated optical inspection device.
According to an example embodiment of the present invention, evaluating the digital image may comprise detecting an anomaly if the quantization error exceeds a threshold or not detecting the anomaly otherwise.
According to an example embodiment of the present invention, the device for evaluating the digital image comprises at least one processor and at least one storage for storing the digital image and instructions that, when executed by the at least one processor, cause the at least one processor to execute the method, wherein the at least one processor is configured for executing the instructions. The device according to the present invention provides advantages that correspond to the advantages the method of the present invention provides.
A program may be provided, wherein the program comprises instructions that, when executed by at least one processor, cause the at least one processor to execute the method. The program provides advantages that correspond to the advantages the method provides.
Further embodiments of the present invention are derived from the following description and the figures.
The digital image is for example a video image, a radar image, a LiDAR image, an ultrasonic image, a motion image, or a thermal image.
The device 100 comprises at least one processor 102 and at least one storage 104 for storing the digital image x and instructions.
The instructions cause the at least one processor 102 to execute a method for evaluating the digital image x, when executed by the at least one processor 102.
The at least one processor 102 is configured for executing the instructions.
The device 100 may comprise a camera 106 or an interface for receiving a digital image from the camera 106.
A program may comprise the instructions.
The method for evaluating the digital image x comprises a step 202.
In the step 202, the digital image x is provided.
Providing the digital image x may comprise capturing the digital image x.
According to an example, the digital image x is captured with the camera 106.
The camera 106 is for example part of or mounted to an at least partially autonomous vehicle or an automated optical inspection device.
The method comprises a step 204.
In step 204, a predetermined model G̃ is provided.
The predetermined model G̃=D∘F comprises a first part F and a second part D.
In the example, the predetermined model G̃ is a pretrained semantic segmentation network, i.e., an artificial neural network.
The predetermined model G̃ is for example determined in a training based on a data set X={xi,yi}i=1, . . . , N comprising N input images xi∈[0,1]c and corresponding references yi.
The second part D comprises a last layer of the network G̃. The first part F comprises the other layers of the network G̃.
The first part is configured to determine a feature ze=F(x) depending on the digital image x.
The predetermined model G̃ is configured for determining the semantic segmentation ŷ of the digital image x with the second part D.
The second part D is configured to determine the semantic segmentation ŷ depending on the feature ze.
Providing the pretrained model may comprise training the first part F to determine the feature ze and the second part D to determine the semantic segmentation ŷ depending on the feature ze.
The method comprises a step 206.
The step 206 comprises providing a predetermined set of quantizations Q̃ for quantizing the feature ze.
The method comprises a step 208.
In the step 208, the feature ze is determined depending on the digital image x with the first part F.
The feature ze is for example determined with a predetermined normalization N.
The method comprises a step 210.
In the step 210, a quantization zq of the feature ze is determined depending on the set of quantizations Q̃ and depending on the feature ze.
The quantization zq for the feature ze is for example determined with the predetermined normalization N, e.g., as the quantization in the normalized set of quantizations Q=N(Q̃) that is nearest to the normalized feature N(ze).
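For illustration, this quantization step may be sketched as follows, assuming unit-length normalization and a Euclidean nearest-neighbor lookup (the concrete distance measure and all names are illustrative assumptions, not mandated by the description):

```python
import numpy as np

def normalize(v, eps=1e-12):
    # Normalization N: scale vectors to unit length along the last axis.
    return v / (np.linalg.norm(v, axis=-1, keepdims=True) + eps)

def quantize(z_e, codebook):
    # Nearest-neighbor lookup: return the normalized codebook vector
    # (quantization zq) closest to the normalized feature ze.
    q = normalize(codebook)
    z = normalize(z_e)
    idx = np.argmin(np.linalg.norm(q - z, axis=-1))
    return q[idx]

codebook = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])  # set of quantizations
z_e = np.array([0.9, 0.1])                                  # feature from F
z_q = quantize(z_e, codebook)                               # nearest quantization
```

In this sketch the codebook plays the role of the set of quantizations and the quantization zq is the codebook vector nearest to the normalized feature.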
The method comprises a step 212.
In the step 212, a quantization error Equant is determined depending on the feature ze and the quantization zq.
According to an example, the feature ze is a vector and the quantization zq of the feature ze is a vector. In the example, the feature ze has the same dimension as the quantization zq.
According to an example, determining the quantization error comprises determining a cosine distance between the feature ze and the quantization zq of the feature ze.
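The cosine distance used as quantization error may be computed, for example, as follows (names are illustrative):

```python
import numpy as np

def cosine_distance(a, b, eps=1e-12):
    # Quantization error Equant as cosine distance: 1 - cos(angle(a, b)).
    cos = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + eps)
    return 1.0 - cos

e_identical = cosine_distance(np.array([1.0, 2.0]), np.array([1.0, 2.0]))
e_orthogonal = cosine_distance(np.array([1.0, 0.0]), np.array([0.0, 1.0]))
```

For identical vectors the error is near zero; for orthogonal vectors it is near one, which matches its use as a bounded score.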
The method comprises a step 214.
In the step 214, the semantic segmentation ŷ of the digital image x is determined depending on the quantization zq with the second part D.
In one example, the semantic segmentation ŷ is determined with the normalized set of quantizations Q.
In one example, the semantic segmentation ŷ is determined with the set of quantizations Q̃.
G denotes an amended model resulting from adding the set of quantizations Q̃ or the normalized set of quantizations Q to the predetermined model G̃.
The second part D is configured for determining the semantic segmentation ŷ of the digital image x depending on a quantization zq of the feature ze. In the example, the dimensions of the quantization zq and the feature ze are the same.
The method comprises a step 216.
In the step 216, the digital image x is evaluated depending on the quantization error Equant.
Evaluating the digital image may comprise detecting an anomaly if the quantization error Equant exceeds a threshold or not detecting the anomaly otherwise.
The quantization error is in one example used as a signal for determining whether a region in the digital image is anomalous or nominal.
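Evaluating a per-pixel quantization error map against the threshold may be sketched as follows (the error values and the threshold are illustrative):

```python
import numpy as np

def anomaly_mask(e_quant, t):
    # Detect an anomaly wherever the quantization error exceeds threshold t.
    return e_quant > t

e_quant = np.array([[0.02, 0.10],
                    [0.85, 0.01]])     # illustrative per-pixel error map
mask = anomaly_mask(e_quant, t=0.5)    # only the 0.85 pixel is flagged
```

The resulting Boolean mask marks anomalous regions and nominal regions of the digital image.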
The step 216 may comprise triggering an action when the anomaly is detected. The step 216 may comprise triggering an action when the absence of the anomaly is detected.
The action may be emergency braking of the vehicle or alerting a user to the anomaly. The action may be actively selecting the digital image for transmission to a back-end computer for further processing, or not selecting it for that purpose.
Further processing may comprise machine learning with the digital image or testing, verifying or validating a machine learning system with the digital image.
The action may be capturing another digital image without forwarding the digital image in case the anomaly is detected, or directly transmitting the evaluated digital image, in particular to a back-end, otherwise.
The action may be sorting out the evaluated digital image for labelling in case the anomaly is detected.
The step 214 is in the example executed to determine the semantic segmentation ŷ and the evaluation regarding the existence or absence of an anomaly in the digital image x.
The step 214 may be omitted if only the existence or absence of an anomaly is to be evaluated based on the quantization error Equant.
The amended model G represents an anomaly segmentation system.
A use case for this evaluation is to distinguish outliers from nominal data in semantic segmentations of digital images. This is particularly useful in safety relevant applications such as autonomous driving.
For example, the semantic segmentation is provided for detecting a presence of an object representing a road surface, a pedestrian, or a vehicle. An outlier may be an unknown object.
Another use case for this evaluation is to distinguish outliers from nominal data in automated optical inspection.
In one example, the set of quantizations Q̃ or the normalized set of quantizations Q is provided as a module. The module is inserted into an existing neural network that is already trained for semantic segmentation, as a penultimate layer of the neural network.
The set of quantizations Q̃ or the normalized set of quantizations Q may be added to another layer of the neural network.
Preferably, a vector quantization layer representing the set of quantizations Q̃ or the normalized set of quantizations Q is added.
The trained vector quantization layer is used to detect anomalies, e.g., by flagging input data as anomalous if its quantization error, i.e., the error determined by the vector quantization layer, is above the given threshold.
In one example, the only parameters that are retrained are those of the set of quantizations Q̃ or the normalized set of quantizations Q. These are typically much fewer parameters than those of the entire neural network. Thereby it becomes possible to incorporate the module for outlier detection into an already trained neural network.
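Retraining only the quantization parameters can be illustrated with a k-means-style codebook update on features of the frozen network (an illustrative sketch under that assumption; the description does not fix the update rule, and all names and values are invented for the example):

```python
import numpy as np

def update_codebook(features, codebook):
    # One k-means-style step: assign each feature of the frozen network to
    # its nearest codebook vector, then move each codebook vector to the
    # mean of its assigned features. Only the codebook changes.
    d = np.linalg.norm(features[:, None, :] - codebook[None, :, :], axis=-1)
    assign = np.argmin(d, axis=1)
    new_cb = codebook.copy()
    for k in range(len(codebook)):
        members = features[assign == k]
        if len(members) > 0:
            new_cb[k] = members.mean(axis=0)
    return new_cb

rng = np.random.default_rng(0)
features = np.vstack([rng.normal(0.0, 0.1, (50, 2)),   # frozen features, cluster A
                      rng.normal(5.0, 0.1, (50, 2))])  # frozen features, cluster B
cb = np.array([[0.5, 0.5], [4.0, 4.0]])                # initial codebook
cb = update_codebook(features, cb)                     # moves toward cluster means
```

The first part and the second part of the network are never touched; only the codebook vectors move, which is what keeps the retraining cheap.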
In one example, the amended model G is fine-tuned, or the last layer, i.e., a classifier layer of the amended model G, is retrained to further improve the performance.
The method may comprise upscaling the feature ze from a first scale to a second scale, determining the quantization zq in the second scale from the feature ze in the second scale, downscaling the quantization zq from the second scale to the first scale and determining the quantization error Equant in the first scale.
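The scale change may be sketched, for example, with nearest-neighbor upscaling and average-pooling downscaling (illustrative choices; the description does not fix the resampling method):

```python
import numpy as np

def upscale(f, s):
    # Nearest-neighbor upscaling of an (H, W, C) feature map by factor s.
    return f.repeat(s, axis=0).repeat(s, axis=1)

def downscale(f, s):
    # Average-pooling downscaling of an (H, W, C) feature map by factor s.
    h, w, c = f.shape
    return f.reshape(h // s, s, w // s, s, c).mean(axis=(1, 3))

f1 = np.arange(8.0).reshape(2, 2, 2)  # feature ze at the first scale
f2 = upscale(f1, 2)                   # second scale, where zq is determined
back = downscale(f2, 2)               # quantization brought back to the first scale
```

The quantization is determined at the finer second scale and the quantization error is then evaluated at the original first scale.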
The amended model may be trained in a training with training data.
The training may comprise determining parameters that define the set of quantizations Q̃ and/or the normalization N. Preferably, the training comprises determining the parameters that define the set of quantizations Q̃ and/or the normalization N without modifying the first part and/or the second part of the predetermined model.
The training data may comprise pairs of a reference y(k) for the semantic segmentation ŷ(k) and a digital image x(k). The reference y(k) is for example a one-hot encoding of the label y.
Providing the set of quantizations Q̃ in the training comprises determining a quantization in the set of quantizations Q̃ depending on a difference between the reference y(k) and the semantic segmentation ŷ(k) and depending on the quantization error Equant.
The digital image comprises pixel boundaries between classes that are assigned in the semantic segmentation.
The set of quantizations Q̃ according to one example is a codebook comprising codebook vectors, i.e., the vectors in the set of quantizations Q̃.
Pixel boundaries between classes are for example regions with large quantization errors Equant, due to the absence of nearby codebook vectors.
This can cause large false positive rates in anomaly prediction. To mitigate this, the amended model G may be trained depending on an additional loss term, a boundary loss Lboundary, that overweighs the quantization error Equant on the class boundaries during training and encourages the allocation of codebook vectors on the class boundary.
Note that logit-based uncertainty methods cannot avoid larger uncertainties and thus anomaly scores when traversing pixel boundaries between classes, due to the continuity of the output of the network.
Let m denote an appropriately resized binary mask of class boundaries of the reference y, i.e., m=1 on class boundary regions and m=0 elsewhere. The boundary loss Lboundary is then for example the mean of the pointwise product m⊙Equant over the pixels, wherein ⊙ denotes the pointwise product.
The amended model G is trained using for example a loss function combining a segmentation loss, the quantization error Equant and the boundary loss Lboundary, wherein α, λ and β are hyperparameters weighting the respective terms. An anomaly score is for example computed from the quantization error Equant scaled up to the second scale.
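One plausible reading of the boundary loss, assuming it averages the pointwise product of the binary boundary mask m with the per-pixel quantization error Equant (the averaging is an assumption; mask and pointwise product are from the description), is:

```python
import numpy as np

def boundary_loss(e_quant, m):
    # Overweight the quantization error on class boundaries: average the
    # pointwise product of the binary boundary mask m and the error map.
    return (m * e_quant).mean()

e_quant = np.array([[0.1, 0.8],
                    [0.2, 0.9]])       # illustrative per-pixel errors
m = np.array([[0.0, 1.0],
              [0.0, 1.0]])             # boundary pixels in the right column
L_b = boundary_loss(e_quant, m)        # (0.8 + 0.9) / 4
```

Minimizing this term pushes codebook vectors toward boundary features, reducing spurious anomaly flags at class boundaries.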
Whether a pixel is classified as an outlier is decided for example by whether its quantization error Equant lies above or below the threshold. A value t of the threshold is for example 0≤t≤1.
According to one example, the amended model G is a neural network comprising the normalized set of quantizations Q. The neural network is trained with the loss function L. The threshold t is selected to determine the sensitivity. During inference, a digital image is passed through the neural network. At the layer representing the normalized set of quantizations Q, the quantization error Equant is computed. Pixels that correspond to a quantization error Equant>t are labelled as anomalous, the remaining pixels as nominal. The quantization error Equant may be upscaled to the original image resolution.
A neural network comprising the set of quantizations Q̃ instead of the normalized set of quantizations Q is processed in the same way.