The present invention relates to a device configured to classify data, to a system for determining driving warning information, to a method for classifying data, as well as to a computer program element.
The general background of this invention is the field of driving warning systems. Present approaches in the domain of advanced driver assistance systems (ADAS) are increasingly using lifelong learning approaches, with systems being trained to improve detection and classification functionalities. Balanced data input is generally required within such a lifelong learning framework, and its absence can lead to poor detection and classification.
It would be advantageous to have an improved device for classifying data.
The object of the present invention is solved with the subject matter of the independent claims, wherein further embodiments are incorporated in the dependent claims. It should be noted that the following described aspects and examples of the invention apply also for the device configured to classify data, the system for determining driving warning information, the method for classifying data, and for the computer program element.
In a first aspect, there is provided a device configured to classify data, comprising:
an input unit; and
a processing unit.
The input unit is configured to provide the processing unit with a plurality of data samples. The plurality of data samples comprises at least one test sample, a plurality of positive samples and a plurality of negative samples. Each of the positive samples has been determined to contain data relating to at least one object to be detected, and each of the negative samples has been determined not to contain data relating to at least one object to be detected. The processing unit is configured to generate a first plurality of groups. The processing unit is also configured to populate each group of the first plurality of groups with a different at least one of the plurality of negative samples. At least one of the groups of the first plurality of groups contains at least two negative samples. The processing unit is also configured to determine if the at least one test sample contains data relating to the at least one object on the basis of the plurality of the positive samples and the first plurality of groups.
In this manner, the normal situation where there are many more negative samples than positive samples is addressed, by an effect reducing the emphasis of the negative samples by populating a number of groups with the negative samples where the number of groups is less than the number of negative samples, and classifying a test sample as having an object or not on the basis of the positive samples and the groups, rather than on the basis of all the positive samples and all the negative samples. In other words, the effective positive and negative samples are balanced, by placing the negative samples (which generally far outnumber the positive samples) in groups.
The determination that the positive samples contain data relating to an object to be detected, and the determination that the negative samples do not contain data relating to an object, can be provided as input data, validated for example by a human operative, but can also be provided by the device itself, in effect through the device utilising a self-learning algorithm.
In an example, at least some of the first plurality of groups are assigned a weighting factor.
To put this another way, each “subcluster” or “group” is assigned one or more importance weights. In an example, the subclusters for each group are generated based on a distance metric. For example, this can be done manually. Importance weights are then assigned to *each* subcluster automatically. These weights can be re-adjusted to give critical subclasses higher importance during training. Such re-adjustment can be done manually. By assigning importance weights, large numbers of subclusters can be handled in this manner, rather than the small numbers of necessarily hand-designed subclusters used in the state of the art.
In an example, the processing unit is configured to populate each group of the first plurality of groups on the basis of a different feature of sample data similarity for each group.
In other words, negative samples with a degree of similarity relating to a particular feature are grouped into a specific group, while other negative samples sharing a degree of similarity relating to a different particular feature are grouped into a different specific group. In this way, redundancy of information within the negative samples is taken into account, providing for an improved balancing of the positive and negative samples.
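The grouping of negative samples by similarity can be illustrated with a minimal sketch. The description does not fix a particular clustering algorithm, so the greedy "leader" grouping below, the Euclidean distance, the `max_distance` parameter and the two-dimensional feature vectors are all assumptions made purely for illustration.

```python
import math


def group_by_similarity(samples, max_distance):
    """Greedy grouping (an assumed stand-in for the grouping step).

    Each sample joins the first group whose leader (its first member)
    lies within max_distance; otherwise it starts a new group.
    """
    groups = []
    for sample in samples:
        for group in groups:
            if math.dist(sample, group[0]) <= max_distance:
                group.append(sample)
                break
        else:
            # No existing group is similar enough: open a new one.
            groups.append([sample])
    return groups


# Hypothetical feature vectors for six negative samples.
negatives = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
             (5.0, 5.0), (5.1, 5.0), (9.0, 0.0)]
groups = group_by_similarity(negatives, max_distance=0.5)
group_sizes = [len(g) for g in groups]
```

In this sketch six negative samples collapse into three groups, at least one of which contains two or more samples, so the number of groups is smaller than the number of negative samples.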
In an example, the processing unit is configured to determine a first plurality of representative samples for the first plurality of groups, wherein a different representative sample is associated with each group of the first plurality of groups; and wherein the processing unit is configured to determine if the at least one test sample contains data relating to the at least one object on the basis of the plurality of positive samples and the first plurality of representative samples.
In this manner, a balanced dataset can be provided comprising the positive samples and the representatives of each group of the negative samples, supporting classification to better balance data so that not too much emphasis is put on negative samples and/or specific negative samples with high redundancy in one group.
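As a hedged illustration of building such a balanced dataset, the sketch below picks one representative per negative group and combines the representatives with the positive samples. Choosing the medoid (the member with the smallest summed distance to the other members) is an assumption; the description does not state how a representative sample is selected. The feature vectors and the 1/0 labels are likewise hypothetical.

```python
import math


def medoid(group):
    """Return the member with the smallest total distance to all members."""
    return min(group, key=lambda c: sum(math.dist(c, o) for o in group))


# Hypothetical negative groups (already populated) and positive samples.
negative_groups = [
    [(0.0, 0.0), (0.2, 0.0), (0.1, 0.0)],
    [(5.0, 5.0)],
]
positives = [(1.0, 1.0), (2.0, 2.0)]

# One representative per negative group.
representatives = [medoid(g) for g in negative_groups]

# Balanced dataset: label 1 for positives, 0 for negative representatives.
balanced = [(p, 1) for p in positives] + [(r, 0) for r in representatives]
```

Here four negative samples are reduced to two representatives, matching the two positive samples.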
In other words, fine-grained balancing is considered for a *large* number of subclasses (and not only positive vs. negative, or class-based subclustering).
In an example, the processing unit is configured to generate a second plurality of groups. The processing unit is also configured to populate each group of the second plurality of groups with a different at least one of the plurality of positive samples, wherein at least one of the groups of the second plurality of groups contains at least two positive samples. The processing unit is also configured to determine if the at least one test sample contains data relating to the at least one object on the basis of the second plurality of groups and the first plurality of groups.
In this way, better balanced classification is provided, where not too much emphasis is placed on either the positive or negative samples.
In an example, at least some of the second plurality of groups are assigned a weighting factor.
In an example, the processing unit is configured to populate each group of the second plurality of groups on the basis of a different feature of sample data similarity for each group.
In this manner, redundancy of information within the positive samples is taken into account, providing for an improved balancing of the positive and negative samples.
In an example, the processing unit is configured to determine a second plurality of representative samples for the second plurality of groups, wherein a different representative sample is associated with each group of the second plurality of groups. The processing unit is also configured to determine if the at least one test sample contains data relating to the at least one object on the basis of the second plurality of representative samples and the first plurality of representative samples.
In this manner, a balanced dataset can be provided comprising the representatives of each group of the positive samples and the representatives of each group of the negative samples, supporting classification to better balance data so that not too much emphasis is put on positive samples and/or specific positive samples and not put on negative samples and/or specific negative samples.
In an example, the processing unit is configured to process the plurality of negative samples to determine a plurality of features of negative data sample similarity. The processing unit is also configured to populate each group of the first plurality of groups on the basis of a different feature of negative sample data similarity for each group. The processing unit is also configured to process the plurality of positive samples to determine a plurality of features of positive data sample similarity. The processing unit is also configured to process the at least one test sample to determine at least one representative feature. The determination if the at least one test sample contains data relating to the at least one object comprises a comparison of the at least one representative feature with the plurality of features of negative data sample similarity and the plurality of features of positive data sample similarity.
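The comparison of a test sample's representative feature with the positive and negative similarity features can be sketched as follows. This is only an assumed stand-in: the description elsewhere mentions an artificial neural network for this determination, whereas the sketch uses a plain nearest-feature rule with Euclidean distance, and the function name `contains_object` and all feature values are hypothetical.

```python
import math


def contains_object(test_feature, positive_features, negative_features):
    """Nearest-feature rule (assumed): the test sample is taken to contain
    the object if its feature is closer to a positive feature than to any
    negative feature."""
    d_pos = min(math.dist(test_feature, p) for p in positive_features)
    d_neg = min(math.dist(test_feature, n) for n in negative_features)
    return d_pos < d_neg


# Hypothetical similarity features derived from the two sample sets.
positive_features = [(1.2, 1.0), (2.0, 2.0)]
negative_features = [(5.0, 5.0), (9.0, 0.0)]

near_positive = contains_object((1.0, 1.0), positive_features, negative_features)
near_negative = contains_object((8.0, 1.0), positive_features, negative_features)
```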
In an example, the processing unit is configured to process the plurality of negative samples to determine a plurality of features of negative data sample similarity. The processing unit is also configured to populate each group of the first plurality of groups on the basis of a different feature of negative sample data similarity for each group. The processing unit is also configured to process the plurality of positive samples to determine a plurality of features of positive data sample similarity. The processing unit is also configured to process a test sample of the at least one test sample to determine a test representative feature. The processing unit is also configured to add the test sample to the plurality of positive samples or add the test sample to the plurality of negative samples. The selective addition comprises a comparison of the test representative feature with the plurality of features of negative data sample similarity and the plurality of features of positive data sample similarity.
The determination that the negative samples do not contain data relating to an object can be provided by the device itself, in effect through the device utilising a self-learning algorithm.
In an example, if the processing unit determines that the test sample is to be added to the plurality of negative samples, the processing unit is configured to add the test sample to a group of the first plurality of groups based on a similarity metric satisfying a threshold condition.
In an example, if the processing unit determines that the test sample is to be added to the plurality of negative samples, the processing unit is configured to generate a new group of the first plurality of groups and add the test sample to the new group based on a similarity metric not satisfying a threshold condition.
In a second aspect, there is provided a system for determining driving warning information, the system comprising:
at least one data acquisition unit; and
a device configured to classify data according to the first aspect; and
an output unit.
The at least one data acquisition unit is configured to acquire the at least one test sample. The processing unit is configured to determine driver warning information on the basis of the determination if the at least one test sample contains data relating to the at least one object. The output unit is configured to output information indicative of the driver warning information.
In a third aspect, there is provided a method for classifying data, comprising:
According to another aspect, there is provided a computer program element controlling apparatus as previously described which, if the computer program element is executed by a processing unit, is adapted to perform the method steps as previously described.
There is also provided a computer readable medium having stored the computer program element as previously described.
Advantageously, the benefits provided by any of the above aspects equally apply to all of the other aspects and vice versa.
The above aspects and examples will become apparent from and be elucidated with reference to the embodiments described hereinafter.
Exemplary embodiments will be described in the following with reference to the following drawings:
In an example, the determination if the at least one test sample contains data relating to the at least one object comprises the processing unit implementing an artificial neural network. In an example, the artificial neural network comprises a convolutional neural network.
In an example, the population of each group of the first plurality of groups comprises the processing unit implementing the artificial neural network, such as the convolutional neural network.
In an example, the generation of the first plurality of groups comprises the processing unit implementing the artificial neural network.
In an example, the artificial neural network comprises a learning algorithm. In this way, the artificial neural network, by having a learning algorithm, is self-configurable.
According to an example, at least some of the first plurality of groups are assigned a weighting factor.
In examples, the importance weights for each subcluster (or group) can be set: manually, e.g., highly critical subclusters are manually given high importance weights (during training, samples of these subclusters can be drawn with greater probability); automatically using a statistical criterion, e.g., the number of samples in a subcluster; automatically using a relevance criterion, e.g., based on the classification score of the samples in the subcluster or on the mis-classification rate of the samples in a subcluster.
In an example, each group of the first plurality of groups can be assigned a different weighting factor.
In an example, the weighting factor for a group can be the number of samples in that group.
In an example, the weighting factor for a group can be the number manually assigned to that group.
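The ways of setting weighting factors described above can be sketched as follows. The normalisation of weights to sum to one, the fallback to uniform weights when no sample is misclassified, and all function and variable names are assumptions introduced for illustration; the description only states which criteria can be used.

```python
def weights_from_counts(groups):
    """Statistical criterion: weight proportional to the group size."""
    total = sum(len(g) for g in groups)
    return [len(g) / total for g in groups]


def weights_from_error_rate(misclassified, sizes):
    """Relevance criterion: weight proportional to the misclassification
    rate of each group (uniform if no sample is misclassified)."""
    rates = [m / n for m, n in zip(misclassified, sizes)]
    total = sum(rates)
    if total == 0:
        return [1.0 / len(rates)] * len(rates)
    return [r / total for r in rates]


def apply_manual_overrides(weights, overrides):
    """Manual criterion: replace selected weights by hand-assigned values."""
    adjusted = list(weights)
    for index, value in overrides.items():
        adjusted[index] = value
    return adjusted


# Hypothetical groups of negative sample identifiers.
groups = [["n1", "n2", "n3"], ["n4"]]
count_weights = weights_from_counts(groups)
error_weights = weights_from_error_rate([1, 1], [2, 4])
manual_weights = apply_manual_overrides(count_weights, {1: 0.9})
```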
In an example, the samples of the plurality of negative samples are each assigned a weighting factor, which can be different for each sample. For example, negative samples that are different to other negative samples can be given an individual weighting factor. In other words, such a negative sample forms a group of one, for which there is a weighting factor (for the sample) without a group having to be generated as such. In this manner, when there are many more negative samples than positive samples, this imbalance can be addressed and differences between the negative samples can be accounted for.
According to an example, the processing unit 30 is configured to populate each group of the first plurality of groups on the basis of a different feature of sample data similarity for each group.
In an example, the improved balance is achieved by weighting the samples. For example, in continuously collected image data some highly similar samples have up to 100 instances or more, and others have only about 5-10 instances. Even though these samples are likely to be observed in the environment with the same probability, they have a different number of samples because, e.g., in one case the car is slow (more instances) and in the other case faster (fewer instances). The present device addresses this imbalance through the weighting of samples.
In an example, the processing unit is configured to process the negative samples to determine a number of different features of sample similarity in order to populate different groups with negative samples that share or substantially share that or a similar feature.
According to an example, the processing unit 30 is configured to determine a first plurality of representative samples for the first plurality of groups, wherein a different representative sample is associated with each group of the first plurality of groups. The processing unit is also configured to determine if the at least one test sample contains data relating to the at least one object on the basis of the plurality of positive samples and the first plurality of representative samples.
In an example, during training a fine-grained balancing mechanism is applied. The subclusters are drawn randomly based on the importance weights. Subclusters with higher importance weights are more likely to be drawn than subclusters with lower importance weights. All subclusters may have equal importance weights. In this case, the subclustering compensates for the fact that some similar samples occur more often in the collected training set than other similar samples (e.g. samples collected from a slow or fast driving car). When a subcluster is drawn, a sample from this subcluster is drawn randomly. There is no additional balancing for the samples within each subcluster.
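The two-stage draw described above can be sketched with the Python standard library. The subcluster contents, their sizes (100 and 5 instances, echoing the slow and fast car example) and the fixed random seed are hypothetical and chosen only to make the effect visible.

```python
import random
from collections import Counter


def draw_sample(subclusters, weights, rng):
    """Draw a subcluster according to its importance weight, then draw one
    sample uniformly from within it (no further balancing inside)."""
    index = rng.choices(range(len(subclusters)), weights=weights, k=1)[0]
    return index, rng.choice(subclusters[index])


rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical subclusters: many near-identical frames from a slow car,
# few frames from a fast car.
subclusters = [
    ["slow_%d" % i for i in range(100)],
    ["fast_%d" % i for i in range(5)],
]
weights = [1.0, 1.0]  # equal importance weights

draws = 10000
counts = Counter(draw_sample(subclusters, weights, rng)[0] for _ in range(draws))
share_slow = counts[0] / draws
```

With equal weights, each subcluster contributes roughly half of the drawn training samples even though one holds twenty times as many instances as the other.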
In comparison to the state of the art, this approach proposes a fine-grained balancing for hundreds of subclasses. State of the art approaches only consider balancing for a small number of classes.
In an example, the determination of the first plurality of representative samples comprises the processing unit implementing the artificial neural network.
According to an example, the processing unit 30 is configured to generate a second plurality of groups. The processing unit is also configured to populate each group of the second plurality of groups with a different at least one of the plurality of positive samples, wherein at least one of the groups of the second plurality of groups contains at least two positive samples. The processing unit is also configured to determine if the at least one test sample contains data relating to the at least one object on the basis of the second plurality of groups and the first plurality of groups.
In an example, the generation of the second plurality of groups comprises the processing unit implementing the artificial neural network.
In an example, the population of each group of the second plurality of groups comprises the processing unit implementing the artificial neural network, such as the convolutional neural network.
According to an example, at least some of the second plurality of groups are assigned a weighting factor.
In an example, each group of the second plurality of groups can be assigned a different weighting factor.
In an example, the samples of the plurality of positive samples are each assigned a weighting factor, which can be different for each sample. For example, positive samples that are different to other positive samples can be given an individual weighting factor. In other words, such a positive sample forms a group of one, for which there is a weighting factor (for the sample) without a group having to be generated as such. In this manner, any imbalance can be addressed and differences between the positive samples can be accounted for.
According to an example, the processing unit 30 is configured to populate each group of the second plurality of groups on the basis of a different feature of sample data similarity for each group.
In an example, the processing unit is configured to process the positive samples to determine a number of different features of sample similarity in order to populate different groups with positive samples that share or substantially share that or a similar feature.
According to an example, the processing unit 30 is configured to determine a second plurality of representative samples for the second plurality of groups, wherein a different representative sample is associated with each group of the second plurality of groups. The processing unit is also configured to determine if the at least one test sample contains data relating to the at least one object on the basis of the second plurality of representative samples and the first plurality of representative samples.
In an example, the determination of the second plurality of representative samples comprises the processing unit implementing the artificial neural network.
According to an example, the processing unit 30 is configured to process the plurality of negative samples to determine a plurality of features of negative data sample similarity. The processing unit is also configured to populate each group of the first plurality of groups on the basis of a different feature of negative sample data similarity for each group. The processing unit is also configured to process the plurality of positive samples to determine a plurality of features of positive data sample similarity. The processing unit is also configured to process the at least one test sample to determine at least one representative feature. The determination if the at least one test sample contains data relating to the at least one object comprises a comparison of the at least one representative feature with the plurality of features of negative data sample similarity and the plurality of features of positive data sample similarity.
In an example, the processing of the plurality of negative samples to determine a plurality of features of negative data sample similarity comprises the processing unit implementing the artificial neural network.
In an example, the processing of the at least one test sample to determine the at least one representative feature comprises the processing unit implementing the artificial neural network.
According to an example, the processing unit 30 is configured to process the plurality of negative samples to determine a plurality of features of negative data sample similarity. The processing unit is also configured to populate each group of the first plurality of groups on the basis of a different feature of negative sample data similarity for each group; and wherein the processing unit is configured to process the plurality of positive samples to determine a plurality of features of positive data sample similarity. The processing unit is also configured to process a test sample of the at least one test sample to determine a test representative feature. The processing unit is also configured to add the test sample to the plurality of positive samples or add the test sample to the plurality of negative samples. The selective addition comprises a comparison of the test representative feature with the plurality of features of negative data sample similarity and the plurality of features of positive data sample similarity.
The determination that the negative samples do not contain data relating to an object can be provided by the device itself, in effect through the device utilising a self-learning algorithm.
In an example, the adding of the test sample to the plurality of positive or negative samples comprises the processing unit implementing the artificial neural network.
According to an example, if the processing unit 30 determines that the test sample is to be added to the plurality of negative samples, the processing unit is configured to add the test sample to a group of the first plurality of groups based on a similarity metric satisfying a threshold condition.
According to an example, if the processing unit 30 determines that the test sample is to be added to the plurality of negative samples, the processing unit is configured to generate a new group of the first plurality of groups and add the test sample to the new group based on a similarity metric not satisfying a threshold condition.
In an example, if the processing unit determines that the test sample is to be added to the plurality of positive samples, the processing unit is configured to add the test sample to a group of the second plurality of groups based on the similarity metric satisfying a threshold condition.
The determination that the positive samples contain data relating to an object to be detected can be provided by the device itself, in effect through the device utilising a self-learning algorithm.
In an example, if the processing unit determines that the test sample is to be added to the plurality of positive samples, the processing unit is configured to generate a new group of the second plurality of groups and add the test sample to the new group based on a similarity metric not satisfying a threshold condition.
In an example, for “lifelong learning”, new data will be continuously collected and the classifier will be retrained either online (on the car) or offline. To explain further, for this retraining, it is important to know the relevance of the sample. This relevance can be estimated by computing its similarity to the subcluster centers. When the similarity is below a threshold, a new subcluster is defined and an importance weight for this subcluster is computed.
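A minimal sketch of this add-or-create decision is given below. The similarity function `1 / (1 + distance)`, the threshold value, and the use of the first sample as the centre of a newly defined subcluster are assumptions; the description only requires that a similarity to the subcluster centres be compared against a threshold.

```python
import math


def similarity(a, b):
    """Hypothetical similarity: 1.0 for identical features, falling
    towards 0.0 as the Euclidean distance grows."""
    return 1.0 / (1.0 + math.dist(a, b))


def add_sample(centres, members, sample, threshold):
    """Add the sample to the most similar subcluster if the similarity
    satisfies the threshold; otherwise define a new subcluster.
    Returns the index of the subcluster that received the sample."""
    best_index, best_sim = None, -1.0
    for i, centre in enumerate(centres):
        sim = similarity(sample, centre)
        if sim > best_sim:
            best_index, best_sim = i, sim
    if best_index is not None and best_sim >= threshold:
        members[best_index].append(sample)
        return best_index
    # Similarity below threshold: the sample seeds a new subcluster.
    centres.append(sample)
    members.append([sample])
    return len(centres) - 1


# Hypothetical existing subclusters.
centres = [(0.0, 0.0), (10.0, 10.0)]
members = [[(0.0, 0.0)], [(10.0, 10.0)]]

index_near = add_sample(centres, members, (0.5, 0.0), threshold=0.5)
index_far = add_sample(centres, members, (50.0, 50.0), threshold=0.5)
```

An importance weight for the newly defined subcluster would then be computed by whichever weighting criterion is in use.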
The following provides a short summary: For continuous and lifelong learning of a function on a sensor device, the classifier is re-trained with newly collected data. The critical point is the judgment of the importance of newly collected data in an efficient manner regarding run-time and memory constraints. The proposed solution clusters the training data. The importance of newly collected samples is computed from the distance between the newly collected samples and the previously collected clusters. This heavily reduces the amount of data stored on the device and the computation time for distance computation, since the entire dataset to compare against is reduced to the set of clusters. In addition, the clusters have semantic names and allow for a better interpretation.
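The saving described in this summary can be illustrated by comparing a new sample against cluster centres only. In the sketch below, the mean feature vector is used as the cluster centre and the distance to the nearest centre is used as the importance score; both choices, together with the semantic cluster names `road_surface` and `guard_rail`, are assumptions made for illustration.

```python
import math


def centroid(cluster):
    """Mean feature vector of a cluster (assumed form of the cluster centre)."""
    n = len(cluster)
    dims = len(cluster[0])
    return tuple(sum(v[i] for v in cluster) / n for i in range(dims))


def importance(sample, centres):
    """Distance to the nearest centre: larger means less like stored data."""
    return min(math.dist(sample, c) for c in centres)


# Hypothetical clusters with semantic names.
clusters = {
    "road_surface": [(0.0, 0.0), (0.2, 0.0)],
    "guard_rail": [(4.0, 4.0), (4.0, 6.0)],
}
centres = {name: centroid(c) for name, c in clusters.items()}

stored_vectors = len(centres)  # one vector per cluster
original_vectors = sum(len(c) for c in clusters.values())

familiar = importance((0.1, 0.0), list(centres.values()))
novel = importance((4.0, 8.0), list(centres.values()))
```

Each new sample is compared against one vector per cluster rather than against every stored training sample.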
In an example, in addition a subset of old data may be reused for retraining to guarantee the performance on the original training dataset. Storing all training samples requires too much storage. Reducing automatically to a smaller set of training samples, here the centers of the subclusters, is highly relevant, and is addressed by the device and the algorithms it implements.
In an example, the plurality of data samples comprises one or more of: image data, radar data; acoustic data; and lidar data.
In an example, the image data is acquired by a camera. In an example, the radar data is captured by a radar system. In an example, the acoustic data is acquired by an acoustic sensor. In an example, the lidar data is acquired by a laser sensing system.
In an example, the plurality of data samples comprises magnetic resonance data. In an example, the plurality of data samples comprises X-ray image data. In an example, the plurality of data samples comprises X-ray Computer Tomography data. Thus in examples, image data can be acquired by one or more of: an MRI scanner; CT scanner; ultrasound sensor.
In an example, the input unit comprises the at least one data acquisition unit.
In an example, the at least one data acquisition unit comprises a camera.
In an example, the at least one data acquisition unit comprises a radar unit.
In an example, the at least one data acquisition unit comprises an acoustic sensor.
In an example, the at least one data acquisition unit comprises a lidar, laser based sensor.
In an example, the system is located in or on a vehicle, for example forming part of an Advanced Driver Assistance System (ADAS).
in a providing step 210, also referred to as step a), providing a plurality of data samples, wherein the plurality of data samples comprises at least one test sample and comprises a plurality of positive samples and comprises a plurality of negative samples, and wherein, each of the positive samples has been determined to contain data relating to at least one object to be detected, and wherein each of the negative samples has been determined not to contain data relating to at least one object to be detected;
in a generating step 220, also referred to as step e), generating a first plurality of groups;
in a populating step 230, also referred to as step g), populating each group of the first plurality of groups with a different at least one of the plurality of negative samples, wherein at least one of the groups of the first plurality of groups contains at least two negative samples; and
in a determining step 240, also referred to as step m), determining if the at least one test sample contains data relating to the at least one object on the basis of the plurality of the positive samples and the first plurality of groups.
In an example, step g) comprises populating each group of the first plurality of groups on the basis of a different feature of sample data similarity for each group.
In an example, the method comprises step f) determining 250 a first plurality of representative samples for the first plurality of groups, wherein a different representative sample is associated with each group of the first plurality of groups; and wherein step m) comprises determining if the at least one test sample contains data relating to the at least one object on the basis of the plurality of positive samples and the first plurality of representative samples.
In an example, the method comprises step h) generating 260 a second plurality of groups; and comprises step i) populating 270 each group of the second plurality of groups with a different at least one of the plurality of positive samples, wherein at least one of the groups of the second plurality of groups contains at least two positive samples; and wherein step m) comprises determining if the at least one test sample contains data relating to the at least one object on the basis of the second plurality of groups and the first plurality of groups.
In an example, step i) comprises populating each group of the second plurality of groups on the basis of a different feature of sample data similarity for each group.
In an example, the method comprises step j) determining 280 a second plurality of representative samples for the second plurality of groups, wherein a different representative sample is associated with each group of the second plurality of groups; and wherein step m) comprises determining if the at least one test sample contains data relating to the at least one object on the basis of the second plurality of representative samples and the first plurality of representative samples.
In an example, the method comprises step b) processing 290 the plurality of negative samples to determine a plurality of features of negative data sample similarity; and wherein step g) comprises populating each group of the first plurality of groups on the basis of a different feature of negative sample data similarity for each group; and wherein the method comprises step c) processing 300 the plurality of positive samples to determine a plurality of features of positive data sample similarity; and the method comprises step d) processing 310 the at least one test sample to determine at least one representative feature; and wherein step m) comprises a comparison of the at least one representative feature with the plurality of features of negative data sample similarity and the plurality of features of positive data sample similarity.
In an example, the method comprises step b) processing 290 the plurality of negative samples to determine a plurality of features of negative data sample similarity; and wherein step g) comprises populating each group of the first plurality of groups on the basis of a different feature of negative sample data similarity for each group; and wherein the method comprises step c) processing 300 the plurality of positive samples to determine a plurality of features of positive data sample similarity; and wherein the method comprises step k) processing 320 a test sample of the at least one test sample to determine a test representative feature; and wherein the method comprises step l) adding 330, in sub-step l1) 340, the test sample to the plurality of positive samples or adding, in sub-step l2) 350, the test sample to the plurality of negative samples, the selective addition comprising a comparison of the test representative feature with the plurality of features of negative data sample similarity and the plurality of features of positive data sample similarity.
In an example, step l2) 350 comprises adding 352 the test sample to a group of the first plurality of groups based on a similarity metric satisfying a threshold condition.
In an example, step l2) 350 comprises generating 354 a new group of the first plurality of groups and adding the test sample to the new group based on a similarity metric not satisfying a threshold condition.
In an example, the plurality of data samples comprises one or more of: image data, radar data; acoustic data; and lidar data.
Approaches to lifelong learning, applied for example in the domain of advanced driver assistance systems, are increasingly gaining importance. However, such lifelong learning techniques must be trained, and retrained. This can include training on the device continuously, in online mode. This requires training data to be stored on the device and extended by data continuously collected by the device during its lifetime. Convolutional neural networks (CNN) and other machine learning approaches can be utilised in this manner. These approaches, or techniques, require a dataset of positive and negative samples for determining the parameters/weights of the internal net and weight structure. Determining such parameters based on such data is called training. The positive samples include objects to be detected, such as pedestrians, cars, vehicles, trucks or bicycles among others. The negative samples include background or background falsely classified as a valid detection/object. Machine learning approaches, such as those utilising CNNs, require balanced training datasets, containing a balanced (equal) number of positive and negative samples. However, training datasets in, for example, the automotive domain, obtained from either off-line data or data collected online on the device, are not balanced. Usually, the number of positive samples is small in comparison to the possible number of negative samples upon which the device can be trained. This results in CNNs with trained net-topologies and weights that overemphasize the importance of either positive or negative samples, leading to poor classification performance in general and/or to a decrease in performance of the device during its lifetime as it self-learns through a lifelong learning approach. This problem is encountered not only in the automotive domain, but also in other domains requiring the classification of data as described above.
The device, system and method described above in relation to
As part of addressing these issues, a clustering approach is provided that balances the data prior to the online and/or off-line training of the machine learning approach on a device that utilises, for example, a camera sensor, and that provides for the ability to continuously provide lifelong training during online operation of the device. Data, or samples, can be in the form of camera imagery, processed camera imagery, radar data, processed radar data and other vehicular sensor information. In this way there is provided:
The clustering approach processes/clusters the samples in order to form groups of samples with high similarity within one group. Representatives of each group, within the positive/negative sample pools, can then be selected to form a balanced dataset for training the device. Thus, this clustering supports classification by better balancing the data (e.g. samples acquired in real time), so that not too much emphasis is put on either the positive or the negative training group, or on specific samples with high redundancy in one group.
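By way of a non-limiting illustration, the selection of group representatives to form a balanced dataset can be sketched as follows. The sketch assumes that the representative of a group is the sample closest to the group's mean feature, and that the pool with fewer groups is balanced against the other by cycling through its representatives. Both choices are assumptions made for illustration only, and the function names are hypothetical.

```python
import itertools
import math


def centroid(group):
    """Mean feature vector of a group of equally sized feature vectors."""
    return [sum(column) / len(group) for column in zip(*group)]


def representative(group):
    """Assumption: the representative is the sample nearest the centroid."""
    centre = centroid(group)
    return min(group, key=lambda sample: math.dist(sample, centre))


def balanced_dataset(positive_groups, negative_groups):
    """Return (sample, label) pairs with equal counts for labels 1 and 0."""
    positives = [representative(g) for g in positive_groups]
    negatives = [representative(g) for g in negative_groups]
    per_class = max(len(positives), len(negatives))
    # Assumption: the smaller pool is repeated cyclically to reach per_class.
    pos = itertools.islice(itertools.cycle(positives), per_class)
    neg = itertools.islice(itertools.cycle(negatives), per_class)
    return [(s, 1) for s in pos] + [(s, 0) for s in neg]


# One positive group; two negative groups, one of them highly redundant.
positive_groups = [[[1.0, 1.0], [1.2, 1.0], [0.8, 1.0]]]
negative_groups = [[[5.0, 5.0], [5.0, 5.4], [5.0, 5.1]], [[9.0, 0.0]]]
dataset = balanced_dataset(positive_groups, negative_groups)
```

Each group contributes a single representative regardless of how many redundant samples it holds, so a heavily repeated sample does not dominate the resulting dataset.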
In addition, this clustering approach helps to better consider the (few or very few) data of one sub-class within the sample groups, for more balanced training of different classes within one sample group. Such samples may include critical false-positives and critical false-negatives, for which collecting large sets of samples is often impractical.
In relation to a sensor device, and its continuous and lifelong learning, the classifier is retrained with newly collected data. Newly collected data is judged in terms of its importance, in an efficient manner with respect to processing runtime and memory constraints. The training data is clustered, and the importance of newly collected samples (data) is computed through the determination of the distance between the newly collected samples and the previously collected clusters. This heavily reduces the amount of data required to be stored on the device, and the computation time for distance computation. This is because the dataset used for comparison is reduced to the set of clusters, rather than all the samples. In addition, the clusters have semantic names, which allows for a better interpretation.
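By way of a non-limiting illustration, the importance computation against stored clusters can be sketched as follows. The sketch assumes that each cluster is summarised by a mean feature vector and a sample count, that importance is the Euclidean distance to the nearest cluster mean, and that a cluster mean is updated as a running mean. These are illustrative assumptions rather than the prescribed implementation, and the names `importance` and `update_centroid` are hypothetical.

```python
import math


def importance(feature, centroids):
    """Distance from `feature` to the nearest stored cluster centre.

    Assumption: a larger distance means the sample is less redundant and
    therefore more important. The cost is one distance computation per
    cluster, not one per stored sample.
    """
    return min(math.dist(feature, centre) for centre in centroids)


def update_centroid(centre, count, feature):
    """Running-mean update, so raw samples need not be kept (assumption)."""
    return [c + (x - c) / (count + 1) for c, x in zip(centre, feature)]


# Two stored cluster centres (hypothetical 2-D features).
centroids = [[3.0, 4.0], [10.0, 10.0]]
score = importance([0.0, 0.0], centroids)

# Folding a new sample into a cluster that so far holds one sample.
new_centre = update_centroid([0.0, 0.0], 1, [2.0, 4.0])
```

Under these assumptions only the cluster summaries are held on the device, which is one way of obtaining the reduction in storage and computation time noted above.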
With continued reference to
A detailed workflow is now described regarding how clustering is performed:
Complementing the above workflow, each cluster can be extended by samples. Such samples used to extend clusters can be synthetically generated, borrowed from other datasets, or collected in additional data recordings. This enables sharing of different information collected by different devices/sensors online. In addition, it simplifies concatenating information collected by different devices/sensors based on clustering approaches. The above described clustering approach, and workflow, can then be used in a lifelong learning approach where online data acquired, for example, by the device forms part of the training data set for improvement of the device. Furthermore, the way by which data is classified for incorporation into particular groups or clusters or subclusters enables acquired data, for example from a camera forming part of an ADAS system, to be analysed and classified in terms of the content within the image. In other words, the device can determine if an image contains features such as pedestrians, other cars, road signs et cetera, or if the image does not contain such data.
With continued reference to
The assigning of weights to the subclusters (groups), which applies across the approaches, is shown in
During training, a fine-grained balancing mechanism is applied. The subclusters are drawn randomly based on the importance weights. Subclusters with higher importance weights are more likely to be drawn than subclusters with lower importance weights. All subclusters may have equal importance weights. In this case, the subclustering balances out the fact that some similar samples appear more often in the collected training set than other similar samples (e.g. samples collected from a slow or fast driving car). When a subcluster is drawn, a sample from this subcluster is drawn randomly. There is no additional balancing for the samples within each subcluster.
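By way of a non-limiting illustration, the two-stage draw described above can be sketched as follows: a subcluster is first drawn in proportion to its importance weight, and a sample is then drawn uniformly from within that subcluster. The function name `draw_sample`, the sample labels and the weights are hypothetical.

```python
import random


def draw_sample(subclusters, weights, rng):
    """Draw a subcluster by importance weight, then a sample uniformly."""
    subcluster = rng.choices(subclusters, weights=weights, k=1)[0]
    return rng.choice(subcluster)  # no further balancing inside a subcluster


rng = random.Random(0)

# A highly redundant subcluster (e.g. slow-moving vehicle) and a sparse
# one (e.g. fast driving vehicle), with equal importance weights.
slow = [("slow", i) for i in range(100)]
fast = [("fast", i) for i in range(2)]

draws = [draw_sample([slow, fast], [1.0, 1.0], rng) for _ in range(10000)]
fast_fraction = sum(1 for name, _ in draws if name == "fast") / len(draws)

# A zero weight excludes a subcluster from being drawn at all.
only_slow = [draw_sample([slow, fast], [1.0, 0.0], rng) for _ in range(100)]
```

With equal weights, about half of the draws come from the sparse subcluster even though it holds only two of the 102 stored samples, which is the balancing effect described above.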
In comparison to the state of the art, this approach proposes a fine-grained balancing for hundreds of subclasses. State of the art approaches only consider balancing for a small number of classes.
The above described device, system and method, which make use of one or other of the above described clustering approaches, address the issue of an imbalance in the data that is provided to the device for training and lifelong learning processes. This also provides the device, system and method with an efficient way of processing real-time data to determine if imagery contains features of importance, such as whether pedestrians are present or not. This is achieved through the consideration of redundant and multiply appearing samples, which are then clustered, taking into account samples that can occur more often in one sample pool than in others. This also addresses the specific case within the vehicular domain, where a huge number of redundant negative samples extracted from images recorded by a slow-moving vehicle can be obtained, as opposed to very few, nonredundant samples recorded by a fast driving vehicle. For example, some negative samples can have hundreds of repetitions in a dataset, whereas others are less frequently observed. The currently described device, system and method address this, ensuring that greater emphasis is not given to more frequent samples, providing for a general improvement in classification performance. In other words, the device, system and method provide an approach to lifelong learning, online training, and processing of real time data that:
The device, system and method are described with respect to the automotive sector and the provision of warning information. However, the device, system and method are applicable to any other field where data is required to be classified and utilized within a machine learning environment. This includes biomedical image processing, such as within the Magnetic Resonance, Computer Tomography, or ultrasound domains, as well as aerial image processing and synthetic aperture radar image processing. Particularly, the device, system and method can:
In another exemplary embodiment, a computer program or computer program element is provided that is characterized by being configured to execute the method steps of the method according to one of the preceding embodiments, on an appropriate system.
The computer program element might therefore be stored on a computing unit, which might also be part of an embodiment. This computing unit may be configured to perform or induce performing of the steps of the method described above. Moreover, it may be configured to operate the components of the above described apparatus and/or system. The computing unit can be configured to operate automatically and/or to execute the orders of a user. A computer program may be loaded into a working memory of a data processor. The data processor may thus be equipped to carry out the method according to one of the preceding embodiments.
According to a further exemplary embodiment of the present invention, a computer readable medium, such as a CD-ROM, is presented wherein the computer readable medium has a computer program element stored on it which computer program element is described by the preceding section.
It has to be noted that embodiments of the invention are described with reference to different subject matters. In particular, some embodiments are described with reference to method type claims whereas other embodiments are described with reference to the device type claims. However, a person skilled in the art will gather from the above and the following description that, unless otherwise notified, in addition to any combination of features belonging to one type of subject matter also any combination between features relating to different subject matters is considered to be disclosed with this application. However, all features can be combined providing synergetic effects that are more than the simple summation of the features.
While the invention has been illustrated and described in detail in the drawings and foregoing description, such illustration and description are to be considered illustrative or exemplary and not restrictive. The invention is not limited to the disclosed embodiments. Other variations to the disclosed embodiments can be understood and effected by those skilled in the art in practicing a claimed invention, from a study of the drawings, the disclosure, and the dependent claims.
In the claims, the word “comprising” does not exclude other elements or steps, and the indefinite article “a” or “an” does not exclude a plurality. A single processor or other unit may fulfill the functions of several items recited in the claims. The mere fact that certain measures are recited in mutually different dependent claims does not indicate that a combination of these measures cannot be used to advantage. Any reference signs in the claims should not be construed as limiting the scope.
Number | Date | Country | Kind
---|---|---|---
16204089.3 | Dec 2016 | EP | regional

Filing Document | Filing Date | Country | Kind
---|---|---|---
PCT/EP2017/082052 | 12/8/2017 | WO | 00