This application claims the benefit of and priority to Korean Patent Application No. 10-2023-0193510, filed in the Korean Intellectual Property Office on Dec. 27, 2023, the entire contents of which are incorporated herein by reference.
The technical idea of the present disclosure relates to an object detection method and an object detection device, and more specifically, to an emergency vehicle detection method and an emergency vehicle detection device.
Recently, various methods for detecting the location of an object have been studied. Methods for detecting the location of a specific vehicle, for example an emergency vehicle, are also being studied.
Methods for detecting an emergency vehicle include a method of detecting the location of an emergency vehicle based on the sound of a siren of the emergency vehicle and a method of detecting the location of the emergency vehicle based on an image of the emergency vehicle. However, if an emergency vehicle is detected based on sound, the accuracy of detecting the location of the vehicle may decrease due to latency issues. If the location of an emergency vehicle is detected based on an image, the accuracy of determining whether a vehicle is an emergency vehicle may decrease. Therefore, there may be a need for a method of accurately detecting the location of an emergency vehicle by detecting the emergency vehicle based on both sound and images. The subject matter described in this background section is intended to promote an understanding of the background of the disclosure and thus may include subject matter that is not already known to those of ordinary skill in the art.
The present disclosure has been made to solve the above-mentioned problems occurring in the prior art while advantages achieved by the prior art are maintained intact.
Aspects of the present disclosure provide an object detection method and an object detection device that detect whether an external vehicle is an emergency vehicle based on sound and images.
Aspects of the present disclosure provide an object detection method and an object detection device that detect the location of an emergency vehicle based on sound and images.
Aspects of the present disclosure provide an object detection method and an object detection device that detect the location of an emergency vehicle by considering the correlation between at least two of the locations of external vehicles detected based on sound or the location of the emergency vehicle detected based on images.
Aspects of the present disclosure provide an object detection method and an object detection device that detect the location of an emergency vehicle through a neural network.
Aspects of the present disclosure provide an object detection method and an object detection device that improve the accuracy of detection of emergency vehicle location through self-supervised learning.
The technical problems to be solved by the present disclosure are not limited to the aforementioned problems. Any other technical problems not mentioned herein should be more clearly understood from the following description by those having ordinary skill in the art to which the present disclosure pertains.
According to an aspect of the present disclosure, an emergency vehicle detection method includes acquiring, by an acquisition device, sounds of external vehicles including an emergency vehicle and a general vehicle. The method also includes generating, by a generator, a first spectrogram for an emergency vehicle sound and a second spectrogram for a general vehicle sound. The method also includes generating, by the generator, first emergency vehicle location information based on the first spectrogram. The method also includes generating, by the generator, general vehicle location information based on the second spectrogram. The method also includes acquiring, by the acquisition device, images of the external vehicles. The method also includes generating, by the generator, second emergency vehicle location information based on the images of the external vehicles. The method also includes detecting, by a detector, a location of the emergency vehicle based on a correlation of at least two of the first emergency vehicle location information, the general vehicle location information, or the second emergency vehicle location information.
According to an embodiment, generating the first spectrogram and the second spectrogram may include generating, by the generator, a third spectrogram for the sounds of the external vehicles. Generating the first spectrogram and the second spectrogram may also include generating, by the generator, a Fourier spectrum by performing a 2D Fast Fourier Transform (FFT) on the third spectrogram. Generating the first spectrogram and the second spectrogram may also include detecting, by the detector, a first spectrum corresponding to the emergency vehicle sound and a second spectrum corresponding to the general vehicle sound from the Fourier spectrum. Generating the first spectrogram and the second spectrogram may also include generating, by the generator, the first spectrogram and the second spectrogram by performing an inverse 2D FFT on each of the first spectrum and the second spectrum.
According to an embodiment, detecting the first spectrum and the second spectrum may include detecting, by the detector, the first spectrum from the Fourier spectrum by extracting patterns that include peak points and are repeated in a horizontal axis direction. Detecting the first spectrum and the second spectrum may also include detecting, by the detector, the second spectrum by filtering out the first spectrum from the Fourier spectrum.
According to an embodiment, generating the first spectrogram and the second spectrogram may include generating, by the generator, the first spectrogram and the second spectrogram by detecting the emergency vehicle sound and the general vehicle sound from the sounds of the external vehicles through a neural network that separates sounds.
According to an embodiment, generating the first emergency vehicle location information may include generating, by the generator, the first emergency vehicle location information based on the first spectrogram through a neural network that detects a location of an object based on a spectrogram.
According to an embodiment, the first emergency vehicle location information may include information about a first Gaussian function corresponding to the location of the emergency vehicle. The general vehicle location information may include information about a second Gaussian function corresponding to a location of the general vehicle. The second emergency vehicle location information may include information about a third Gaussian function corresponding to the location of the emergency vehicle.
According to an embodiment, the first Gaussian function may be a 2D Gaussian function with a positive output value. The second Gaussian function may be a 2D Gaussian function with a negative output value. The third Gaussian function may be a 2D Gaussian function with a positive output value.
According to an embodiment, detecting the location of the emergency vehicle may include extracting, by an extractor, a fourth Gaussian function whose degree of correlation with at least one of the first Gaussian function or the second Gaussian function is greater than a threshold value, from among the third Gaussian functions. Detecting the location of the emergency vehicle may also include detecting, by the detector, the location of the emergency vehicle by determining a location of a vehicle corresponding to the fourth Gaussian function as the location of the emergency vehicle.
According to an embodiment, generating the first emergency vehicle location information may include generating, by the generator, the first emergency vehicle location information based on the first spectrogram through a neural network that detects a location of an object based on a spectrogram. Generating the first emergency vehicle location information may also include performing, by a learning device, self-supervised learning on the neural network by using the location of the vehicle corresponding to the fourth Gaussian function as GT (Ground Truth) for the location of the emergency vehicle.
According to an embodiment, generating the second emergency vehicle location information may include generating, by the generator, the second emergency vehicle location information based on the images of the external vehicles by a neural network that detects a location of an object based on an image.
According to an aspect of the present disclosure, an emergency vehicle detection device includes a memory that stores computer-executable instructions and includes at least one processor configured to access the memory and execute the instructions. The at least one processor acquires, by an acquisition device, sounds of external vehicles including an emergency vehicle and a general vehicle, and images of the external vehicles. The at least one processor generates, by a generator, a first spectrogram for an emergency vehicle sound and a second spectrogram for a general vehicle sound. The at least one processor generates, by the generator, first emergency vehicle location information based on the first spectrogram. The at least one processor generates, by the generator, general vehicle location information based on the second spectrogram. The at least one processor generates, by the generator, second emergency vehicle location information based on the images of the external vehicles. The at least one processor detects, by a detector, a location of the emergency vehicle based on a correlation of at least two of the first emergency vehicle location information, the general vehicle location information, or the second emergency vehicle location information.
According to an embodiment, the at least one processor may generate, by the generator, a third spectrogram for the sounds of the external vehicles. The at least one processor may also generate, by the generator, a Fourier spectrum by performing a 2D Fast Fourier Transform (FFT) on the third spectrogram. The at least one processor may also detect, by the detector, a first spectrum corresponding to the emergency vehicle sound and a second spectrum corresponding to the general vehicle sound from the Fourier spectrum. The at least one processor may also generate, by the generator, the first spectrogram and the second spectrogram by performing an inverse 2D FFT on each of the first spectrum and the second spectrum.
According to an embodiment, the at least one processor may detect, by the detector, the first spectrum from the Fourier spectrum by extracting patterns that include peak points and are repeated in a horizontal axis direction and detect the second spectrum by filtering out the first spectrum from the Fourier spectrum.
According to an embodiment, the at least one processor may generate, by the generator, the first spectrogram and the second spectrogram by detecting the emergency vehicle sound and the general vehicle sound from the sounds of the external vehicles by using a neural network that separates sounds.
According to an embodiment, the at least one processor may generate, by the generator, the first emergency vehicle location information corresponding to the location of the emergency vehicle based on the first spectrogram by using a neural network that detects a location of an object based on a spectrogram.
According to an embodiment, the first emergency vehicle location information may include information about a first Gaussian function corresponding to the location of the emergency vehicle. The general vehicle location information may include information about a second Gaussian function corresponding to a location of the general vehicle. The second emergency vehicle location information may include information about a third Gaussian function corresponding to the location of the emergency vehicle.
According to an embodiment, the first Gaussian function may be a 2D Gaussian function with a positive output value. The second Gaussian function may be a 2D Gaussian function with a negative output value. The third Gaussian function may be a 2D Gaussian function with a positive output value.
According to an embodiment, the at least one processor may extract, by an extractor, a fourth Gaussian function whose degree of correlation with at least one of the first Gaussian function or the second Gaussian function is greater than a threshold value, from among the third Gaussian functions. The at least one processor may also detect, by the detector, the location of the emergency vehicle by determining a location of a vehicle corresponding to the fourth Gaussian function as the location of the emergency vehicle.
According to an embodiment, the at least one processor may generate, by the generator, the first emergency vehicle location information based on the first spectrogram by using a neural network that detects a location of an object based on a spectrogram. The at least one processor may also perform, by a learning device, self-supervised learning on the neural network by using the location of the vehicle corresponding to the fourth Gaussian function as GT (Ground Truth) for the location of the emergency vehicle.
According to an embodiment, the at least one processor may generate, by the generator, the second emergency vehicle location information based on the images of the external vehicles by using a neural network that detects a location of an object based on an image.
The features briefly summarized above with respect to the present disclosure are merely illustrative aspects of the detailed description of the present disclosure described below, and do not limit the scope of the present disclosure.
The above and other objects, features, and advantages of the present disclosure should be more apparent from the following detailed description taken in conjunction with the accompanying drawings:
Hereinafter, with reference to the accompanying drawings, embodiments of the present disclosure are described in detail such that those of ordinary skill in the art may carry out the present disclosure. However, the present disclosure may be embodied in many different forms and should not be construed as limited to the embodiments set forth herein.
Further, in describing the embodiment of the present disclosure, a detailed description of well-known features or functions has been omitted in order not to unnecessarily obscure the gist of the present disclosure. In the drawings, parts not related to the description have been omitted, and like reference numerals refer to like elements throughout the present disclosure.
In the present disclosure, it should be understood that when an element is referred to as being “connected to”, “coupled to”, or “combined with” another element, the element may be directly connected or coupled to or combined with the other element or intervening elements may be present therebetween. It should be further understood that the terms “comprise”, “include” or “have” and variations thereof, when used in the present disclosure, specify the presence of stated elements but do not preclude the presence or addition of one or more other elements.
In the present disclosure, terms such as first and second are used only for the purpose of distinguishing one element from other elements and do not limit the order or importance of the elements unless specifically mentioned. Accordingly, within the scope of the present disclosure, a first element in one embodiment may be referred to as a second element in another embodiment, and similarly, the second element in one embodiment may be referred to as a first element in another embodiment.
In the present disclosure, distinct elements are only for clearly describing their features and do not necessarily mean that the elements are separated. In other words, a plurality of elements may be integrated to form a single hardware or software unit, or a single element may be distributed to form a plurality of hardware or software units. Accordingly, such integrated or distributed embodiments are included in the scope of the present disclosure, even if not otherwise noted.
In the present disclosure, elements described in the various embodiments are not necessarily essential elements, and some elements may be optional. Accordingly, embodiments including a subset of the elements described in one embodiment are also included in the scope of the present disclosure. Additionally, embodiments that include other elements in addition to elements described in the various embodiments are also included in the scope of the present disclosure.
In the present disclosure, expressions of positional relationships used in the present disclosure, such as top, bottom, left, or right, are described for convenience of description, and when the drawings are viewed in reverse, the positional relationships described in the present disclosure may also be interpreted in the opposite way.
In the present disclosure, each of the phrases “A or B,” “at least one of A and B,” “at least one of A or B,” “A, B or C,” “at least one of A, B and C,” “at least one or more of A, B, or C,” and “at least one of A, B, or C” may include any one of items listed along with a relevant phrase, or any possible combination thereof. When a controller, module, component, device, element, or the like of the present disclosure is described as having a purpose or performing an operation, function, or the like, the controller, module, component, device, element, or the like should be considered herein as being “configured to” meet that purpose or to perform that operation or function. Each controller, module, component, device, element, and the like may separately embody or be included with a processor and a memory, such as a non-transitory computer readable media, as part of the apparatus.
Hereinafter, embodiments of the present disclosure are described in detail with reference to
Methods of detecting an emergency vehicle may include a method of detecting the location of an emergency vehicle based on the sound of the emergency vehicle (for example, a repeated siren sound), a method of detecting the location of an emergency vehicle based on an image of the emergency vehicle, and the like. If the location of an emergency vehicle is detected based on its sound, a latency issue may occur because the location is detected by analyzing sound data for each movement section of the vehicle in chronological order. Therefore, if the location of an emergency vehicle is detected based on sound, an error may occur in detecting the location of the emergency vehicle. In addition, if the location of an emergency vehicle is detected based on an image, an error may occur in which the location of a general vehicle, rather than the emergency vehicle, is detected. For example, because the positions of warning lights differ among emergency vehicles, it may be difficult to extract features of an emergency vehicle from images of external vehicles. Accordingly, although detecting the location of an emergency vehicle based on images of external vehicles may be more accurate than detecting it based on the sounds of external vehicles, an error may still occur in which the location of a general vehicle is detected rather than that of the emergency vehicle. Therefore, an emergency vehicle detection method that accurately detects the location of the emergency vehicle based on both sound and images may be needed. An emergency vehicle detection method according to an embodiment of the present disclosure may accurately detect the location of the emergency vehicle based on the sounds of external vehicles and the images of the external vehicles.
Referring to
According to the emergency vehicle detection method, in S120, a generator may generate a first spectrogram for emergency vehicle sounds and a second spectrogram for general vehicle sounds. For example, the emergency vehicle detection method may include separating the acquired sounds of external vehicles into emergency vehicle sounds and general vehicle sounds. The emergency vehicle detection method may also include generating a first spectrogram for emergency vehicle sounds and a second spectrogram for general vehicle sounds. More specific details on generating a first spectrogram for emergency vehicle sounds and a second spectrogram for general vehicle sounds based on the sounds of external vehicles are described below.
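As a non-limiting illustration of one way such spectrograms could be computed, the sketch below derives a log-magnitude spectrogram from a mono microphone signal with a short-time Fourier transform; the sample rate, window length, and use of scipy are assumptions for illustration and are not specified by the present disclosure.

```python
# Minimal sketch (assumptions: mono audio, 16 kHz sample rate, scipy available).
import numpy as np
from scipy.signal import stft

def make_spectrogram(audio: np.ndarray, sample_rate: int = 16000) -> np.ndarray:
    """Return a log-magnitude spectrogram (frequency x time) of a mono signal."""
    _, _, stft_matrix = stft(audio, fs=sample_rate, nperseg=1024, noverlap=768)
    return np.log1p(np.abs(stft_matrix))  # log scale compresses the dynamic range

# Usage: spectrogram = make_spectrogram(vehicle_sound_samples)
```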
According to the emergency vehicle detection method, in S130, the generator may generate first emergency vehicle location information based on the first spectrogram. Specifically, the emergency vehicle detection method may include generating a first Gaussian function corresponding to the location of the emergency vehicle based on the first spectrogram. For example, the emergency vehicle detection method may include generating the first Gaussian function corresponding to the location of the emergency vehicle based on the first spectrogram by using a neural network that detects the location of an object based on a spectrogram.
According to the emergency vehicle detection method, in S140, the generator may generate general vehicle location information based on the second spectrogram. Specifically, the emergency vehicle detection method may include generating a second Gaussian function corresponding to the location of the general vehicle based on the second spectrogram. For example, the emergency vehicle detection method may include generating the second Gaussian function corresponding to the location of the general vehicle based on the second spectrogram by using a neural network that detects the location of an object based on a spectrogram.
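As a non-limiting sketch of a neural network that detects the location of an object based on a spectrogram, the example below maps a single-channel spectrogram to a 2D location heatmap with a small encoder-decoder; the architecture, layer sizes, and PyTorch implementation are illustrative assumptions, not the network of the present disclosure.

```python
# Minimal sketch of a spectrogram-to-heatmap localizer (architecture and layer
# sizes are assumptions; this is not the network of the present disclosure).
import torch
import torch.nn as nn

class SpectrogramLocalizer(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(32, 16, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(16, 1, 4, stride=2, padding=1),
        )

    def forward(self, spectrogram: torch.Tensor) -> torch.Tensor:
        # spectrogram: (batch, 1, freq, time) -> 2D location heatmap
        return self.decoder(self.encoder(spectrogram))
```

A peak in such an output heatmap would correspond to the Gaussian-function location information described below.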
According to the emergency vehicle detection method, in S150, the acquisition device may acquire images of external vehicles. In other words, the emergency vehicle detection method may include acquiring sounds of the external vehicles and images of the external vehicles.
According to the emergency vehicle detection method, in S160, the generator may generate second emergency vehicle location information based on the images of external vehicles. Specifically, the emergency vehicle detection method may include generating a third Gaussian function corresponding to the location of the emergency vehicle based on the images of the external vehicles. For example, the emergency vehicle detection method may include generating the third Gaussian function corresponding to the location of the emergency vehicle based on the images of the external vehicles through a neural network that detects the location of an object based on an image.
The first emergency vehicle location information may include information about the first Gaussian function corresponding to the location of the emergency vehicle. In addition, the general vehicle location information may include information about the second Gaussian function corresponding to the location of the general vehicle. In addition, the second emergency vehicle location information may include information about the third Gaussian function corresponding to the location of the emergency vehicle.
The first Gaussian function may be a 2D Gaussian function with a positive output value, the second Gaussian function may be a 2D Gaussian function with a negative output value, and the third Gaussian function may be a 2D Gaussian function with a positive output value. On the other hand, the first Gaussian function may be a 2D Gaussian function with a negative output value, the second Gaussian function may be a 2D Gaussian function with a positive output value, and the third Gaussian function may be a 2D Gaussian function with a negative output value.
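For instance, the signed location encoding described above could be realized as sketched below, where emergency vehicles contribute positive 2D Gaussians and general vehicles contribute negative ones on a shared grid; the grid size, the sigma value, and the example coordinates are illustrative assumptions.

```python
# Minimal sketch of the signed Gaussian location encoding (grid size, sigma,
# and coordinates are illustrative assumptions).
import numpy as np

def gaussian_2d(grid_h, grid_w, center_xy, sigma=2.0, sign=+1.0):
    ys, xs = np.mgrid[0:grid_h, 0:grid_w]
    cx, cy = center_xy
    dist_sq = (xs - cx) ** 2 + (ys - cy) ** 2
    return sign * np.exp(-dist_sq / (2.0 * sigma ** 2))

# Emergency vehicle at (12, 5) -> positive bump (first Gaussian function);
# general vehicle at (30, 20) -> negative bump (second Gaussian function).
sound_based_map = (gaussian_2d(40, 40, (12, 5), sign=+1.0)
                   + gaussian_2d(40, 40, (30, 20), sign=-1.0))
```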
According to the emergency vehicle detection method, in S170, a detector may detect the location of the emergency vehicle based on a correlation between at least two of the first emergency vehicle location information, the general vehicle location information, or the second emergency vehicle location information. Specifically, the emergency vehicle detection method may include detecting the location of the emergency vehicle based on at least one of the correlation between the first Gaussian function and the third Gaussian function or the correlation between the second Gaussian function and the third Gaussian function. For example, the emergency vehicle detection method may include extracting, from among the third Gaussian functions, a fourth Gaussian function whose correlation value with at least one of the first Gaussian function or the second Gaussian function is greater than a threshold value. The emergency vehicle detection method may include detecting the location of the emergency vehicle by determining the location of a vehicle corresponding to the fourth Gaussian function as the location of the emergency vehicle. More specific details regarding the correlation between at least two of the first Gaussian function, the second Gaussian function, or the third Gaussian function are described below.
Referring to
According to the emergency vehicle detection method, in S122, the generator may generate a Fourier spectrum by performing a 2D Fast Fourier Transform (FFT) on the third spectrogram. For example, because the third spectrogram contains a component corresponding to the sound repeated by the siren of an emergency vehicle, if a 2D FFT is performed on the third spectrogram, a Fourier spectrum with the repeating pattern 430 may be generated. For example, referring to
According to the emergency vehicle detection method, in S123, the detector may detect a first spectrum corresponding to emergency vehicle sound and a second spectrum corresponding to general vehicle sound in the Fourier spectrum. Specifically, the emergency vehicle detection method may include detecting the first spectrum corresponding to the emergency vehicle sound by extracting patterns that include peak points and are repeated in the horizontal axis direction from the Fourier spectrum. In addition, the emergency vehicle detection method may include detecting the second spectrum corresponding to the general vehicle sound by filtering out the first spectrum in the Fourier spectrum.
According to the emergency vehicle detection method, in S124, the generator may generate a first spectrogram for the emergency vehicle sound and a second spectrogram for the general vehicle sound by performing an inverse 2D FFT on each of the first spectrum and the second spectrum.
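As a non-limiting example, steps S121 through S124 could be sketched as follows; the peak-selection rule (keeping the top percentile of Fourier magnitudes as the repeated siren pattern) is an assumption for illustration, since the present disclosure only specifies that repeated patterns including peak points are extracted.

```python
# Minimal sketch of S121-S124 (the top-percentile peak rule is an assumption).
import numpy as np

def split_siren(third_spectrogram: np.ndarray, percentile: float = 99.0):
    spectrum = np.fft.fft2(third_spectrogram)            # Fourier spectrum of the spectrogram
    magnitude = np.abs(spectrum)
    siren_mask = magnitude >= np.percentile(magnitude, percentile)  # repeated peak pattern
    first_spectrum = np.where(siren_mask, spectrum, 0.0)  # emergency vehicle component
    second_spectrum = spectrum - first_spectrum           # siren component filtered out
    first_spectrogram = np.real(np.fft.ifft2(first_spectrum))    # emergency vehicle sound
    second_spectrogram = np.real(np.fft.ifft2(second_spectrum))  # general vehicle sound
    return first_spectrogram, second_spectrogram
```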
Referring to
The emergency vehicle detection method according to an embodiment of the present disclosure may include detecting the location of a vehicle based on the sound of the vehicle. Specifically, the emergency vehicle detection method may include detecting the location of the vehicle based on a spectrogram of a vehicle sound. However, as described above, if the location of the vehicle is detected based on the sound of the vehicle, it may be necessary to detect the location of the vehicle by analyzing sound data for each of movement sections of the vehicle in chronological order, causing a latency issue.
Referring to
Referring to
For example, the emergency vehicle detection method may include generating the first Gaussian functions 712 and 714 corresponding to the locations of the emergency vehicles based on a first spectrogram for emergency vehicle sound. The emergency vehicle detection method may also include generating the second Gaussian functions 711, 713, and 715 corresponding to the locations of the general vehicles based on a second spectrogram for general vehicle sound. Specifically, the first Gaussian functions 712 and 714 may be 2D Gaussian functions with positive output values, and the second Gaussian functions 711, 713, and 715 may be 2D Gaussian functions with negative output values. It should be noted that, for convenience of description, the first Gaussian function and the second Gaussian function are expressed as 1D Gaussian functions in
As described above, if the location of a vehicle is detected based on sound, a location error may occur. For example, it may be seen that the location of the general vehicle 721 and the location of the second Gaussian function 711 corresponding to the location of the general vehicle 721 do not exactly match each other. In addition, it may be seen that the location of the emergency vehicle 722 and the location of the first Gaussian function 712 corresponding to the location of the emergency vehicle 722 do not exactly match each other. In addition, it may be seen that the location of the general vehicle 723 and the location of the second Gaussian function 713 corresponding to the location of the general vehicle 723 do not exactly match each other. In addition, it may be seen that the location of the emergency vehicle 724 and the location of the first Gaussian function 714 corresponding to the location of the emergency vehicle 724 do not exactly match each other. In addition, it may be seen that the location of the general vehicle 725 and the location of the second Gaussian function 715 corresponding to the location of the general vehicle 725 do not exactly match each other.
Referring to
As the neural network, for example, EfficientDet may be used, but the neural network is not limited thereto.
Referring to
If the location of an object is detected based on an image, the accuracy of location detection may be greater than when the location of the object is detected based on sound. Therefore, the location of the emergency vehicle 722 and the location of the third Gaussian function 732 may match each other, the location of the emergency vehicle 724 and the location of the third Gaussian function 734 may also match each other, and the location of the general vehicle 725 and the location of the third Gaussian function 735 may also match each other. It should be noted that the general vehicle 725 may be determined to be an emergency vehicle and the third Gaussian function 735 may be generated because, as described above, it is difficult to extract the features of an emergency vehicle from the image due to differences in the locations of the warning lights of emergency vehicles.
Referring to
The neural network may be, for example, CenterNet, but is not limited thereto.
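As a non-limiting sketch, each image-based detection could be converted into its own positive 2D Gaussian (a third Gaussian function) so that individual candidates can later be compared against the sound-based map; the projection of image detections onto the shared grid, the grid size, and the sigma value are assumptions.

```python
# Minimal sketch: one positive 2D Gaussian per image-based detection
# (projection onto the grid, grid size, and sigma are assumptions).
import numpy as np

def detections_to_gaussians(centers, grid_h=40, grid_w=40, sigma=2.0):
    ys, xs = np.mgrid[0:grid_h, 0:grid_w]
    return [np.exp(-((xs - cx) ** 2 + (ys - cy) ** 2) / (2.0 * sigma ** 2))
            for cx, cy in centers]  # one candidate map per detected vehicle
```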
Referring to
For example, as shown in
The degree of correlation between Gaussian functions with positive output values (the first Gaussian function and the third Gaussian function) may be greater than the degree of correlation between a Gaussian function with a positive output value (the third Gaussian function) and a Gaussian function with a negative output value (the second Gaussian function). Therefore, the location of the emergency vehicle may be accurately detected by extracting the fourth Gaussian function 732 or 734, whose degree of correlation with at least one of the first Gaussian functions 712 and 714 or the second Gaussian functions 711, 713, and 715 is greater than the threshold value, from among the third Gaussian functions 732, 734, and 735.
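One possible, non-limiting realization of this correlation test is sketched below: each image-based candidate map is correlated with the signed sound-based map, and candidates whose correlation exceeds a threshold are kept as fourth Gaussian functions; the normalized inner product used as the correlation measure and the threshold value are assumptions, as the present disclosure does not fix a particular measure.

```python
# Minimal sketch of the correlation-based selection (the correlation measure
# and threshold are assumptions).
import numpy as np

def select_fourth_gaussians(candidate_maps, sound_based_map, threshold=0.3):
    selected = []
    for candidate in candidate_maps:  # each candidate is a third Gaussian function
        corr = float(np.sum(candidate * sound_based_map)
                     / (np.linalg.norm(candidate) * np.linalg.norm(sound_based_map) + 1e-8))
        # Positive siren Gaussians raise corr; negative general-vehicle Gaussians lower it.
        if corr > threshold:
            selected.append(candidate)
    return selected
```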
Referring to
For example, the emergency vehicle detection method may include acquiring the sounds of external vehicles including emergency vehicles and general vehicles. Additionally, the emergency vehicle detection method may include separating sounds of the external vehicles into emergency vehicle sounds and general vehicle sounds. The emergency vehicle detection method may include generating a Fourier spectrum by performing a 2D FFT on a spectrogram for the sounds of the external vehicles. The emergency vehicle detection method may also include generating a first spectrogram of the emergency vehicle sounds by extracting patterns, which include peak values and are repeated, from the generated Fourier spectrum. In addition, the emergency vehicle detection method may include generating a second spectrogram of the general vehicle sounds based on the Fourier spectrum from which the spectrum caused by the emergency vehicle sounds is removed. For example, the emergency vehicle detection method may include separating the sounds of the external vehicles into the emergency vehicle sounds and the general vehicle sounds through a neural network that separates sounds. The emergency vehicle detection method may also include generating the first spectrogram for the emergency vehicle sounds and the second spectrogram for the general vehicle sounds.
For example, the emergency vehicle detection method may include generating a first Gaussian function corresponding to the location of an emergency vehicle based on the first spectrogram, through a neural network that detects the location of an object based on a spectrogram. The emergency vehicle detection method may also include generating a second Gaussian function corresponding to the location of a general vehicle based on the second spectrogram, through the neural network that detects the location of the object based on the spectrogram.
In addition, the emergency vehicle detection method may include generating a third Gaussian function corresponding to the location of the emergency vehicle based on the images of external vehicles. The emergency vehicle detection method may include generating a third Gaussian function corresponding to the location of the emergency vehicle based on the images of the external vehicles through a neural network that detects the location of an object based on an image.
The first Gaussian function may be a 2D Gaussian function with a positive output value, the second Gaussian function may be a 2D Gaussian function with a negative output value, and the third Gaussian function may be a 2D Gaussian function with a positive output value, but are not limited thereto.
The emergency vehicle detection method may include generating a fourth Gaussian function corresponding to the location of the emergency vehicle based on a correlation between at least two of the first Gaussian function, the second Gaussian function, or the third Gaussian function. Specifically, the emergency vehicle detection method may include extracting, from among the third Gaussian functions, a fourth Gaussian function whose correlation value with at least one of the first Gaussian function or the second Gaussian function is greater than a threshold value. Because both the first Gaussian function and the third Gaussian function have positive output values, the value of the correlation between the two functions may be greater than the threshold value. Because the second Gaussian function and the third Gaussian function are a Gaussian function with negative output values and a Gaussian function with positive output values, respectively, the value of the correlation between the two functions may be less than the threshold value.
The location of the emergency vehicle corresponding to the fourth Gaussian function derived by considering both the sounds of external vehicles and the images of the external vehicles may be more accurate than the location of the emergency vehicle corresponding to the first Gaussian function derived based on the sounds of the external vehicles. Therefore, the data of the fourth Gaussian function corresponding to the location of the emergency vehicle may be the Ground Truth (GT) of a neural network that detects the location of an object based on a spectrogram. Accordingly, the emergency vehicle detection method may include performing self-supervised learning for the neural network by using the location of a vehicle corresponding to the fourth Gaussian function as the GT for the location of the emergency vehicle. The emergency vehicle detection method may increase the prediction accuracy of the neural network that detects the location of a vehicle based on a spectrogram through self-supervised learning for the neural network.
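A non-limiting sketch of such a self-supervised update is shown below, assuming a spectrogram-based localizer such as the hypothetical SpectrogramLocalizer sketched earlier, a standard PyTorch optimizer, and a mean squared error loss between the predicted heatmap and the pseudo ground truth derived from the fourth Gaussian function; these choices are illustrative assumptions.

```python
# Minimal sketch of one self-supervised training step (model, optimizer, and
# MSE loss are assumptions; pseudo_gt_heatmap is built from the fourth Gaussian).
import torch
import torch.nn.functional as F

def self_supervised_step(model, optimizer, spectrogram, pseudo_gt_heatmap):
    """spectrogram: (1, 1, F, T) tensor; pseudo_gt_heatmap: (1, 1, H, W) tensor."""
    optimizer.zero_grad()
    predicted_heatmap = model(spectrogram)        # sound-based location prediction
    loss = F.mse_loss(predicted_heatmap, pseudo_gt_heatmap)
    loss.backward()
    optimizer.step()
    return loss.item()
```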
Referring to
The acquisition device 110 may be configured to acquire sounds of external vehicles, including emergency vehicles and general vehicles, and to acquire images of the external vehicles.
The generator 120 may be configured to generate a first spectrogram for emergency vehicle sounds and a second spectrogram for general vehicle sounds. In addition, the generator 120 may be configured to generate a first spectrogram and a second spectrogram by detecting emergency vehicle sounds and general vehicle sounds from the sounds of external vehicles through a neural network that separates sounds. The generator 120 may be configured to generate a first Gaussian function corresponding to the location of the emergency vehicle based on the first spectrogram. The generator 120 may be configured to generate a second Gaussian function corresponding to the location of the general vehicle based on the second spectrogram. The generator 120 may be configured to generate a third Gaussian function corresponding to the location of the emergency vehicle based on the images of the external vehicles. The first Gaussian function may be a 2D Gaussian function with a positive output value, the second Gaussian function may be a 2D Gaussian function with a negative output value, and the third Gaussian function may be a 2D Gaussian function with a positive output value. Furthermore, the generator 120 may be configured to generate the first Gaussian function corresponding to the location of the emergency vehicle based on the first spectrogram through a neural network that detects the location of an object based on a spectrogram. The generator 120 may be configured to generate the second Gaussian function corresponding to the location of the general vehicle based on the second spectrogram by using a neural network that detects the location of an object based on a spectrogram. The generator 120 may be configured to generate a third spectrogram for the sounds of the external vehicles. The generator 120 may be configured to generate a Fourier spectrum by performing a 2D FFT on the third spectrogram.
The detector 130 may be configured to detect the location of the emergency vehicle based on a correlation between at least two of the first Gaussian function, the second Gaussian function, or the third Gaussian function. The detector 130 may be configured to detect a first spectrum corresponding to emergency vehicle sounds and a second spectrum corresponding to general vehicle sounds from the Fourier spectrum. The detector 130 may be configured to detect the first spectrum by extracting patterns that include peak points and are repeated in the horizontal axis direction from the Fourier spectrum. The detector 130 may be configured to detect the second spectrum from the Fourier spectrum by filtering out the first spectrum.
The generator 120 may be configured to generate a first spectrogram and a second spectrogram by performing an inverse 2D FFT on each of the first spectrum and the second spectrum.
The extractor 140 may be configured to extract, from the third Gaussian functions, a fourth Gaussian function whose value of correlation with at least one of the first Gaussian function or the second Gaussian function is greater than a threshold value.
The detector 130 may be configured to detect the location of the emergency vehicle by determining the location of a vehicle corresponding to the fourth Gaussian function as the location of the emergency vehicle.
The learning device may be configured to perform self-supervised learning for the neural network by using the location of a vehicle corresponding to the fourth Gaussian function as the GT for the location of the emergency vehicle.
Referring to
The processor 1100 may be a central processing unit (CPU) or a semiconductor device that processes instructions stored in the memory 1300 and/or the storage 1600.
The memory 1300 and the storage 1600 may include various types of volatile or non-volatile storage media. For example, the memory 1300 may include a ROM (Read Only Memory) 1310 and a RAM (Random Access Memory) 1320.
Accordingly, the processes of the method or algorithm described in relation to the embodiments of the present disclosure may be implemented directly in hardware, in a software module executed by the processor 1100, or in a combination of the two. The software module may reside in a storage medium (i.e., the memory 1300 and/or the storage 1600), such as a RAM, a flash memory, a ROM, an EPROM, an EEPROM, a register, a hard disk, a solid state drive (SSD), a detachable disk, or a CD-ROM. The storage medium is coupled to the processor 1100, and the processor 1100 may read information from the storage medium and may write information to the storage medium. Alternatively, the storage medium may be integrated with the processor 1100. The processor 1100 and the storage medium may reside in an application specific integrated circuit (ASIC). The ASIC may reside in a user terminal. Alternatively, the processor 1100 and the storage medium may reside in the user terminal as individual components.
The above description is merely illustrative of the technical idea of the present disclosure, and various modifications and variations may be made without departing from the essential characteristics of the present disclosure by those having ordinary skill in the art to which the present disclosure pertains. Therefore, the embodiments of the present disclosure are provided to explain the spirit and scope of the present disclosure but are not intended to limit the spirit and scope of the present disclosure. Thus, the spirit and scope of the present disclosure is not limited by the embodiments. The scope of protection of the present disclosure should be interpreted according to the following claims, and all technical ideas within the scope equivalent thereto should be construed as being included in the scope of the present disclosure.
According to the object detection method and the object detection device of the present disclosure, it is possible to detect whether an external vehicle is an emergency vehicle based on sound and images.
According to the object detection method and the object detection device of the present disclosure, it is possible to detect the location of an emergency vehicle based on sound and images.
According to the object detection method and the object detection device of the present disclosure, it is possible to detect the location of an emergency vehicle by considering a correlation between at least two of the locations of external vehicles detected based on sound or the location of the emergency vehicle detected based on images.
According to the object detection method and the object detection device of the present disclosure, it is possible to detect the location of an emergency vehicle through a neural network.
According to the object detection method and the object detection device of the present disclosure, it is possible to improve the accuracy of detection of emergency vehicle location through self-supervised learning.
The effects obtainable in the present disclosure are not limited to the aforementioned effects. Any other effects not mentioned herein should be more clearly understood from the following description by those having ordinary skill in the art to which the present disclosure pertains.
Hereinabove, although the present disclosure has been described with reference to embodiments and the accompanying drawings, the present disclosure is not limited thereto but may be variously modified and altered by those having ordinary skill in the art to which the present disclosure pertains without departing from the spirit and scope of the present disclosure claimed in the following claims.