This application claims benefit of priority under 35 U.S.C. 119(a)-(d) to a Russian Application No. 2022107829 filed on Mar. 24, 2022, which is incorporated by reference herein.
The present invention relates generally to the field of computational data processing and, in particular, to a system and method for detecting and recognizing small objects in images using a machine learning algorithm.
Unmanned aerial vehicles (UAVs) are increasing in popularity for various applications. For example, UAVs are prevalent among hobbyists and enthusiasts for recreation, and are increasingly considered as viable package delivery vehicles. In aerial photography performed on board a UAV, the processing and analysis of images are traditionally performed after the UAV returns to the ground and the data is extracted from it.
At the same time, there are several conventional solutions that allow data processing to be performed directly on board the UAV, but in this case significant requirements are imposed both on the size of objects in the images and on the photo and video recording itself. Preferably, the objects in the images should be large enough, and the size of the photos should be relatively small due to the limited computing power of the UAV. Usually, in such a picture, the object should occupy a significant part of the image, for example, but not limited to, more than 20%.
Known solutions typically do not allow users to perform a number of tasks related to the rapid search for objects of small and ultra-small size relative to the size of the entire image. Examples of such tasks include, but are not limited to, the search of images to find lost people, the search and counting of affected people and/or buildings in the emergency zone. Search tasks may include: detecting and/or recognizing objects (e.g., a person, pet, building, car). Thus, there are very few optimized solutions dedicated to joint detection and recognition of small and ultra-small objects in images.
Another task that is not solved by existing solutions in the art is the processing of the image in order to detect and recognize small and ultra-small objects directly on the UAV using machine learning algorithms. Typically, if a UAV lacks sufficient computing power, the likelihood of recognizing or classifying the object may be significantly reduced, and/or the time of image processing may be significantly increased. Accordingly, the lack of required computing resources may lead to a loss of information or time (e.g., if the UAV has already left the zone of interest) and may lead to inefficiency in detecting objects in real time. It should be noted that the mentioned methods of image processing by UAVs using machine learning algorithms usually imply only very simplified classification. In such classification, either the number of classes is typically very small (2-3), or the classes have very different characteristics. When the features fed into a classifier describe objects of similar shape and size (for example, different types of equipment), the accuracy of classification is typically not high enough. Generally, the smaller the size of the recognized objects, the more difficult the classification task becomes.
Thus, there is a need for efficient detection, recognition and classification of objects of small and ultra-small sizes directly on board the UAV in real time.
Disclosed is a new approach to the detection and recognition of small objects in high-resolution optical images directly on board the UAV using a computing module that includes a machine learning algorithm. Advantageously, the disclosed solution additionally enables classification of a detected object.
In an aspect, the machine learning algorithm may include a convolutional neural network. Advantageously, the disclosed solution may be implemented using any type of UAV, including, but not limited to, an airplane, a helicopter and the like. The disclosed approach allows the operator at the ground station to be informed about the results of detection and recognition of the detected object in real time, using a radio channel or a cellular network.
In one aspect, a method for detecting small-sized objects based on image analysis using an unmanned aerial vehicle (UAV) includes obtaining object search parameters, wherein the search parameters include at least one characteristic of an object of interest; generating, during a flight of the UAV, at least one image containing a high-resolution image; analyzing the generated image using a machine learning algorithm based on the obtained search parameters; recognizing the object of interest using a machine learning algorithm if at least one object fulfilling the search parameters is detected in the image during the analysis; and determining the location of the detected object, in response to recognizing the object as the object of interest.
The accompanying drawings, which are incorporated into and constitute a part of this specification, illustrate one or more example aspects of the present disclosure and, together with the detailed description, serve to explain their principles and implementations.
Exemplary aspects are described herein in the context of a system, method, and computer program product for detecting and recognizing small objects in images using a machine learning algorithm. Those of ordinary skill in the art will realize that the following description is illustrative only and is not intended to be in any way limiting. Other aspects will readily suggest themselves to those skilled in the art having the benefit of this disclosure. Reference will now be made in detail to implementations of the example aspects as illustrated in the accompanying drawings. The same reference indicators will be used to the extent possible throughout the drawings and the following description to refer to the same or like items.
Glossary: a number of terms are defined herein which will be used to describe variant aspects of the present disclosure.
“Operator,” as used herein, refers to a “user” of the disclosed system.
“Small object,” as used herein, refers to a spot of relatively small or ultra-small size in the image, where the brightness of the spot differs from the brightness of its surroundings. The term “small object” refers, in particular, to an image of an object that occupies a few tens of pixels in an image (frame) having a resolution of at least 4K, for example, 6000×4000 pixels. With the aforementioned ratio of the size of the object to the size of the image itself, examples of a small object in the image may be a picture of a private house or a car, and an example of an ultra-small object may be a person. It should be noted that, with high-resolution images, the disclosed system enables, on the one hand, analysis of an image covering a large area of land. On the other hand, covering such a wide area imposes additional requirements on the analysis of the image itself to detect objects on the ground. For example, with a UAV flight altitude of 150 meters and photo frames of 6000×4000 pixels (24 megapixels), where the area covered by the image is 70×46 meters and the resolution is approximately 1.2 cm per pixel, the coverage of the area with one frame will be approximately 3320 square meters.
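The scale implied by this example can be checked with simple arithmetic. The following is a minimal sketch that uses only the frame size and ground footprint stated above; the object sizes (a person about 0.5 m across when seen from above, a car about 4.5 m long) are illustrative assumptions and are not taken from the disclosure.

```python
# Frame size and ground footprint taken from the example above.
FRAME_W_PX, FRAME_H_PX = 6000, 4000
FOOTPRINT_W_M, FOOTPRINT_H_M = 70.0, 46.0


def ground_sample_distance_cm(footprint_m: float, pixels: int) -> float:
    """Ground distance covered by one pixel, in centimeters."""
    return footprint_m * 100.0 / pixels


def object_extent_px(object_size_m: float, gsd_cm: float) -> float:
    """Approximate number of pixels spanned by an object of the given size."""
    return object_size_m * 100.0 / gsd_cm


gsd_w_cm = ground_sample_distance_cm(FOOTPRINT_W_M, FRAME_W_PX)  # about 1.17 cm per pixel
gsd_h_cm = ground_sample_distance_cm(FOOTPRINT_H_M, FRAME_H_PX)  # 1.15 cm per pixel

# Assumed object sizes, for illustration only.
person_px = object_extent_px(0.5, gsd_w_cm)  # a few tens of pixels
car_px = object_extent_px(4.5, gsd_w_cm)     # a few hundred pixels
```

Under these assumptions a person spans only a few tens of pixels out of 6000 across the frame, which is the regime the term “small object” is meant to describe.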
Aspects of the present disclosure enable solving the shortcomings of existing systems by providing a method and device for detecting and recognizing small objects based on the analysis of high-resolution photo or video images, using a computing module configured to implement at least one machine learning algorithm.
Turning UAV 100 shown in
In an aspect, the UAV 100 may include at least the following components shown in
In one aspect, the detection module 110 may be configured as a mirrorless photo camera capable of generating high-resolution images during a flight of the UAV 100. For example, the detection module 110 may include the Sony Alpha 6000 camera or the like. The computing module 120 may be implemented on an on-board processing computer 125, for example, on a microcomputer (embedded microprocessor system) that has dimensions suitable for installation on a UAV and that allows the necessary calculations of the claimed aspects to be performed at the required speed (for example, no more than 1 second for image processing and analysis). In an aspect, the microcomputer may be a stand-alone device in the form of module boards containing a processor, a GPU, power management systems, high-speed interfaces, and the like. An example of such a stand-alone device may be a device implemented using NVIDIA® Jetson™ or modules such as the MB164.01 from «STC “Modul”». The GPS/Glonass module 130 may be configured to transmit location information (e.g., GPS coordinates of the UAV 100). The data transmission module 140 may be a standard device for transmitting and receiving data over communication channels, such as a radio channel and/or cellular communications.
It should be noted that in the description of the method 200, the features of the method can be used in both singular and plural, while in various aspects of the method both forms can be used, except where it is indicated explicitly. For example, the term “object” may mean both one object of interest and several objects of interest. Accordingly, the term “specified parameters” may mean both the parameters of one search object and the parameters of several search objects.
The result of the detection and recognition of a small object may be the selection of a group of pixels related to the object of interest depicted on the image and the determination of the object of interest using the selected group of pixels.
The claimed method 200 for detecting and recognizing small objects in the image obtained from the detection module 110 using a computing module 120 implementing at least one machine learning algorithm, as shown in
According to the flow chart presented in
In an aspect, step 210 may additionally specify the flight path of the UAV 100, the photography mode, and its parameters (e.g., frames per second, image resolution) prior to image generation. In addition, the computing module 120 may obtain data about at least one object in order to search for it in the obtained image. The data about the object may include another image in which the object is represented, a description of the object, the shape of the object, its color, its estimated size, and the like.
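One way to represent the data obtained at step 210 is as a pair of simple records. The sketch below is illustrative only: the class names `SearchParameters` and `FlightParameters`, and all field names, are hypothetical and are not defined in the disclosure.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class SearchParameters:
    """Characteristics of the object of interest (hypothetical field names)."""
    object_type: str
    colors: List[str] = field(default_factory=list)
    shape: Optional[str] = None
    estimated_size_m: Optional[float] = None
    description: Optional[str] = None
    reference_image_path: Optional[str] = None

    def specified(self) -> Dict[str, object]:
        """Return only the characteristics that were actually provided."""
        return {
            name: value
            for name, value in vars(self).items()
            if value is not None and value != [] and value != ""
        }


@dataclass
class FlightParameters:
    """Flight path and photography settings fixed before image generation."""
    path: List[Tuple[float, float]]      # waypoints as (latitude, longitude)
    frames_per_second: float
    image_resolution: Tuple[int, int]    # (width, height) in pixels


params = SearchParameters(object_type="person", colors=["red"], estimated_size_m=0.5)
flight = FlightParameters(path=[(55.0, 37.0), (55.01, 37.0)],
                          frames_per_second=1.0,
                          image_resolution=(6000, 4000))
```

Keeping unspecified characteristics as `None` lets later steps distinguish "not provided" from "provided and not matched".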
In an aspect, at step 220, in response to obtaining an image, the detection module 110 may transmit the obtained data to the computing module 120. The transmitted data may contain at least one image depicting a detected object. In addition, in real time, the computing module 120 may receive the location information (e.g., GPS coordinates and the altitude of the UAV 100) at the time of generation of the image.
In an aspect, at step 230, the computing module 120 may utilize a machine learning algorithm to analyze in real time each received image to search for at least one object meeting criteria provided in search options.
It should be noted that in one aspect the machine learning algorithm may include an artificial neural network (ANN). In an aspect, the ANN may be preselected for searching generated images to perform detection and recognition of objects of interest based on the specified parameters. In addition, the ANN may be pre-trained and may be configured to perform additional training dynamically. Training may be performed based on a prepared list of images in which similar objects of interest are depicted in different climatic conditions, from different angles and with different background illumination. Advantageously, the training process may utilize images of the object of interest made in conditions of limited visibility. In addition, the ANN may be trained right before starting to search for the object of interest, based on the provided search parameters. The search parameters may be the parameters characterizing the object of interest, such as, but not limited to, dimensions, color, geometric shape or set of shapes. For example, if the object of interest is a person, the provided search parameters may include, but are not limited to, the person's size, hair color, and the type and color of clothes they might be wearing. If the object of interest is a building or a vehicle, then the search parameters may include, but are not limited to, the dimensions, the color of the object or its parts, and the type of object. In other words, the characteristic elements of the building or vehicle may be specified as search parameters.
Limited visibility, as used herein, refers both to difficult weather conditions (fog, side light, setting sun, etc.) and to other reasons why the object of interest, viewed from above or at an angle relative to the flight of the UAV 100, may be only partially visible. For example, when searching for a person in the forest in the summer, the foliage most often covers most of the surface of the earth and may cover most of the object of interest. Accordingly, the ANN may be trained to detect fragments of the object of interest, for example, but not limited to, parts of the human body (arms, legs, shoulders, heads, and the like), which may be the only parts that can be detected in such an image. The ANN may also be trained to make a decision on the detection of the object of interest based on the analysis of the detected fragments.
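The fragment-based decision described above could, for instance, be approximated by a simple rule applied to the fragments reported by a detector. The sketch below is an illustration under stated assumptions, not the trained behavior of the ANN: the fragment labels, the confidence threshold of 0.5, and the requirement of at least two distinct fragment types are hypothetical values chosen only for the example.

```python
from typing import Iterable, Tuple

# Hypothetical labels for body-part fragments a detector might report.
HUMAN_FRAGMENTS = {"arm", "leg", "shoulder", "head"}


def person_detected(
    fragments: Iterable[Tuple[str, float]],
    min_confidence: float = 0.5,   # assumed threshold, not from the disclosure
    min_distinct: int = 2,         # assumed threshold, not from the disclosure
) -> bool:
    """Decide that a person is present when enough distinct body-part
    fragments are detected with sufficient confidence."""
    distinct = {
        label
        for label, confidence in fragments
        if label in HUMAN_FRAGMENTS and confidence >= min_confidence
    }
    return len(distinct) >= min_distinct


# Example: a head and an arm visible through foliage.
partly_hidden = [("head", 0.8), ("arm", 0.6), ("branch", 0.9)]
decision = person_detected(partly_hidden)
```

In practice such a decision may be learned by the network itself; the explicit rule only makes the idea of combining fragment detections concrete.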
In one aspect, a convolutional neural network (CNN) may be used as a machine learning algorithm, which may allow for more efficient detection and recognition of objects in images.
At step 240, in response to the initial detection, in the analyzed image, of an object fulfilling the specified parameters, the computing module 120 may use the machine learning algorithm to recognize the detected object in order to make a final decision on the match between the found object and the object of interest. In this case, depending on a particular implementation, recognition can be carried out using the same machine learning algorithm that detected the object and/or using another machine learning algorithm, which may be implemented by the computing module 120.
In one aspect, recognition may consist of determining the match between the detected object and at least some of the specified parameters of the object of interest. For example, if at least 75% of the parameters match, the detected object may be declared as the object of interest.
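The threshold rule in this example can be sketched as follows. The function names and the use of exact equality to compare a specified characteristic with an observed one are assumptions made for illustration; the disclosure does not state how individual parameters are compared.

```python
from typing import Dict


def match_ratio(specified: Dict[str, object], observed: Dict[str, object]) -> float:
    """Fraction of the specified parameters that the detected object matches."""
    if not specified:
        return 0.0
    matched = sum(
        1 for name, value in specified.items() if observed.get(name) == value
    )
    return matched / len(specified)


def is_object_of_interest(
    specified: Dict[str, object],
    observed: Dict[str, object],
    threshold: float = 0.75,   # the 75% figure from the example above
) -> bool:
    """Declare a match when at least `threshold` of the parameters agree."""
    return match_ratio(specified, observed) >= threshold


specified = {"type": "person", "jacket": "red", "hair": "dark", "size": "adult"}
observed = {"type": "person", "jacket": "red", "hair": "light", "size": "adult"}
recognized = is_object_of_interest(specified, observed)  # 3 of 4 parameters match
```

A real implementation would likely use tolerant comparisons (for example, color distance or size ranges) instead of exact equality.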
In an aspect, at step 250, in response to a successful recognition of the object of interest, the location of the object may be determined by analyzing the location data obtained by the GPS/Glonass module 130. The location data may contain the coordinates and altitude of the UAV 100 at the time of generation of the image. The calculation may also take into account data related to the image (size and format), as well as data from the image, such as the size of the object (in pixels) and its position in the image. Based on the specified information, the location (coordinates) of the object of interest is calculated. Also, to determine the coordinates of the object of interest, optical data for the detection module 110 of the UAV 100 (for example, its image sensor and the lens used by the aforementioned camera) is used. In an aspect, a 50 mm fixed lens may be used.
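A simplified version of this calculation is sketched below. It is not the method of the disclosure; it assumes a camera pointing straight down, flat terrain, an altitude measured above ground level, the top of the image aligned with the UAV heading, and an APS-C sensor of about 23.5 × 15.6 mm. The sensor dimensions, the heading input, and all function names are assumptions that do not appear in the text above; the 50 mm focal length, the 150 m altitude, and the 6000×4000 frame come from the examples in this description.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius, spherical approximation


def pixel_to_ground_offset(px, py, image_w, image_h,
                           altitude_m, focal_mm, sensor_w_mm, sensor_h_mm):
    """Offset of a pixel from the image center, in meters on the ground.

    Returns (right_m, forward_m) relative to the UAV heading for a
    nadir-pointing camera (pinhole model).
    """
    gsd_x = altitude_m * (sensor_w_mm / image_w) / focal_mm  # meters per pixel
    gsd_y = altitude_m * (sensor_h_mm / image_h) / focal_mm
    right_m = (px - image_w / 2) * gsd_x
    forward_m = (image_h / 2 - py) * gsd_y  # image rows grow downward
    return right_m, forward_m


def object_coordinates(lat_deg, lon_deg, heading_deg, right_m, forward_m):
    """Shift the UAV position by a ground offset (small-distance approximation)."""
    h = math.radians(heading_deg)  # heading measured clockwise from north
    north_m = forward_m * math.cos(h) - right_m * math.sin(h)
    east_m = forward_m * math.sin(h) + right_m * math.cos(h)
    dlat = math.degrees(north_m / EARTH_RADIUS_M)
    dlon = math.degrees(east_m / (EARTH_RADIUS_M * math.cos(math.radians(lat_deg))))
    return lat_deg + dlat, lon_deg + dlon


# Object at the right edge of the frame, vertically centered, UAV heading north.
right_m, forward_m = pixel_to_ground_offset(
    6000, 2000, 6000, 4000,
    altitude_m=150.0, focal_mm=50.0, sensor_w_mm=23.5, sensor_h_mm=15.6)
obj_lat, obj_lon = object_coordinates(55.0, 37.0, 0.0, right_m, forward_m)
```

With these assumed values one pixel corresponds to roughly 1.2 cm on the ground, so the half-width of the frame is about 35 m. A production implementation would also need to account for camera tilt, lens distortion and terrain elevation.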
In an aspect, at an additional step 260, a file may be generated. The file may contain a part (fragment) of the image in which the object of interest was detected and recognized, and the calculated coordinates of the object of interest. An example of the selected fragment 303, 306, 309 is shown in
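The contents of such a file can be sketched as a crop region plus a small metadata record. The disclosure does not specify a file format; the JSON layout, the field names, and the 50-pixel margin around the object below are assumptions chosen for illustration.

```python
import json
from typing import Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom) in pixels


def crop_box(bbox: Box, image_w: int, image_h: int, margin_px: int = 50) -> Box:
    """Expand the object's bounding box by a margin, clamped to the image."""
    left, top, right, bottom = bbox
    return (
        max(0, left - margin_px),
        max(0, top - margin_px),
        min(image_w, right + margin_px),
        min(image_h, bottom + margin_px),
    )


def build_report(fragment_file: str, box: Box, lat: float, lon: float, label: str) -> str:
    """Serialize the fragment reference and the object coordinates as JSON."""
    return json.dumps(
        {
            "fragment_file": fragment_file,  # hypothetical name of the cropped image
            "fragment_box_px": list(box),
            "object": {"label": label, "latitude": lat, "longitude": lon},
        },
        sort_keys=True,
    )


box = crop_box((10, 20, 60, 80), image_w=6000, image_h=4000)
report = build_report("fragment_0001.jpg", box, 55.0, 37.000552, "person")
```

Sending only the fragment and its coordinates, instead of the full 24-megapixel frame, keeps the payload small for a radio or cellular link.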
In an aspect, at step 270, the transmission module 140 may transmit the generated file to the ground station. If a connection is not available, the transmission module 140 may wait until the connection becomes available and then retransmit the file. Depending on the implementation, the ground station may be either stationary or mobile (for example, a vehicle equipped with the necessary means for receiving information and visualizing it).
In an aspect, at an optional step 280, the ground station may use the data from the file to visualize the position of the object of interest on a map containing the search location. The ground station may use, for example, a web interface or a software application having similar properties for visualization.
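The wait-and-retransmit behavior at step 270 can be sketched as a queue that holds files until the link accepts them. The `TransmissionQueue` class and the `send` callable, which is assumed to return `True` on success and `False` when no connection is available, are hypothetical and stand in for whatever radio or cellular interface is actually used.

```python
from collections import deque
from typing import Callable, Deque


class TransmissionQueue:
    """Holds payloads until the link is available, preserving their order."""

    def __init__(self, send: Callable[[bytes], bool]) -> None:
        self._send = send                     # hypothetical link-level send function
        self._pending: Deque[bytes] = deque()

    def submit(self, payload: bytes) -> None:
        """Queue a payload and try to deliver everything that is waiting."""
        self._pending.append(payload)
        self.flush()

    def flush(self) -> int:
        """Send queued payloads in order; stop at the first failure."""
        sent = 0
        while self._pending:
            if not self._send(self._pending[0]):
                break                         # link is down, retry on a later flush
            self._pending.popleft()
            sent += 1
        return sent

    def pending(self) -> int:
        return len(self._pending)


# Simulated link that is initially unavailable.
link = {"up": False}
delivered = []


def fake_send(payload: bytes) -> bool:
    if link["up"]:
        delivered.append(payload)
        return True
    return False


queue = TransmissionQueue(fake_send)
queue.submit(b"file-1")   # stays queued while the link is down
queue.submit(b"file-2")
link["up"] = True
queue.flush()             # both files are delivered once the link returns
```

Calling `flush` periodically, or whenever the modem reports connectivity, gives the retransmission behavior described above without losing detections made while out of range.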
It should be noted that in one aspect, the disclosed system may be implemented as a system comprising several UAVs 100 and at least one ground station. When searching for (detecting and recognizing) an object, data may be exchanged between different UAVs 100 for a more efficient search. Data may be exchanged both among the UAVs 100 and between the UAVs 100 and a ground station with which the UAVs 100 communicate. At the same time, each of the UAVs 100 may be controlled from the ground station, for example, through the transmission of data about the object of interest, the flight path and the search modes. In the system, the UAVs 100 may be implemented as various types of UAVs. For example, one UAV 100 may be of an aircraft type, and another of a copter type. Advantageously, the joint use of several UAVs 100 allows data from them to be aggregated in a single interface (at the ground station). The use of several UAVs 100 simultaneously makes it possible to increase the search area to at least several square kilometers.
In Example 1, the image 502 is presented, wherein image 504 represents the image before the application of the disclosed method, and image 505 represents the image after the application of the method. In Example 2, a second image 506 is presented, where image 508 is the incoming image, and image 510 illustrates the result of using the disclosed method.
As shown, the computer system 20 includes a central processing unit (CPU) 21, a system memory 22, and a system bus 23 connecting the various system components, including the memory associated with the central processing unit 21. The system bus 23 may comprise a bus memory or bus memory controller, a peripheral bus, and a local bus that is able to interact with any other bus architecture. Examples of the buses may include PCI, ISA, PCI-Express, HyperTransport™, InfiniBand™, Serial ATA, I2C, and other suitable interconnects. The central processing unit 21 (also referred to as a processor) can include a single or multiple sets of processors having single or multiple cores. The processor 21 may execute one or more computer-executable code implementing the techniques of the present disclosure. The system memory 22 may be any memory for storing data used herein and/or computer programs that are executable by the processor 21. The system memory 22 may include volatile memory such as a random access memory (RAM) 25 and non-volatile memory such as a read only memory (ROM) 24, flash memory, etc., or any combination thereof. The basic input/output system (BIOS) 26 may store the basic procedures for transfer of information between elements of the computer system 20, such as those at the time of loading the operating system with the use of the ROM 24.
The computer system 20 may include one or more storage devices such as one or more removable storage devices 27, one or more non-removable storage devices 28, or a combination thereof. The one or more removable storage devices 27 and non-removable storage devices 28 are connected to the system bus 23 via a storage interface 32. In an aspect, the storage devices and the corresponding computer-readable storage media are power-independent modules for the storage of computer instructions, data structures, program modules, and other data of the computer system 20. The system memory 22, removable storage devices 27, and non-removable storage devices 28 may use a variety of computer-readable storage media. Examples of computer-readable storage media include machine memory such as cache, SRAM, DRAM, zero capacitor RAM, twin transistor RAM, eDRAM, EDO RAM, DDR RAM, EEPROM, NRAM, RRAM, SONOS, PRAM; flash memory or other memory technology such as in solid state drives (SSDs) or flash drives; magnetic cassettes, magnetic tape, and magnetic disk storage such as in hard disk drives or floppy disks; optical storage such as in compact disks (CD-ROM) or digital versatile disks (DVDs); and any other medium which may be used to store the desired data and which can be accessed by the computer system 20.
The system memory 22, removable storage devices 27, and non-removable storage devices 28 of the computer system 20 may be used to store an operating system 35, additional program applications 37, other program modules 38, and program data 39. The computer system 20 may include a peripheral interface 46 for communicating data from input devices 40, such as a keyboard, mouse, stylus, game controller, voice input device, touch input device, or other peripheral devices, such as a printer or scanner via one or more I/O ports, such as a serial port, a parallel port, a universal serial bus (USB), or other peripheral interface. A display device 47 such as one or more monitors, projectors, or integrated display, may also be connected to the system bus 23 across an output interface 48, such as a video adapter. In addition to the display devices 47, the computer system 20 may be equipped with other peripheral output devices (not shown), such as loudspeakers and other audiovisual devices.
The computer system 20 may operate in a network environment, using a network connection to one or more remote computers 49. The remote computer (or computers) 49 may be local computer workstations or servers comprising most or all of the aforementioned elements in describing the nature of a computer system 20. Other devices may also be present in the computer network, such as, but not limited to, routers, network stations, peer devices or other network nodes. The computer system 20 may include one or more network interfaces 51 or network adapters for communicating with the remote computers 49 via one or more networks such as a local-area computer network (LAN) 50, a wide-area computer network (WAN), an intranet, and the Internet. Examples of the network interface 51 may include an Ethernet interface, a Frame Relay interface, SONET interface, and wireless interfaces.
Aspects of the present disclosure may be a system, a method, and/or a computer program product. The computer program product may include a computer readable storage medium (or media) having computer readable program instructions thereon for causing a processor to carry out aspects of the present disclosure.
The computer readable storage medium can be a tangible device that can retain and store program code in the form of instructions or data structures that can be accessed by a processor of a computing device, such as the computing system 20. The computer readable storage medium may be an electronic storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination thereof. By way of example, such computer-readable storage medium can comprise a random access memory (RAM), a read-only memory (ROM), EEPROM, a portable compact disc read-only memory (CD-ROM), a digital versatile disk (DVD), flash memory, a hard disk, a portable computer diskette, a memory stick, a floppy disk, or even a mechanically encoded device such as punch-cards or raised structures in a groove having instructions recorded thereon. As used herein, a computer readable storage medium is not to be construed as being transitory signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through a waveguide or transmission media, or electrical signals transmitted through a wire.
Computer readable program instructions described herein can be downloaded to respective computing devices from a computer readable storage medium or to an external computer or external storage device via a network, for example, the Internet, a local area network, a wide area network and/or a wireless network. The network may comprise copper transmission cables, optical transmission fibers, wireless transmission, routers, firewalls, switches, gateway computers and/or edge servers. A network interface in each computing device receives computer readable program instructions from the network and forwards the computer readable program instructions for storage in a computer readable storage medium within the respective computing device.
Computer readable program instructions for carrying out operations of the present disclosure may be assembly instructions, instruction-set-architecture (ISA) instructions, machine instructions, machine dependent instructions, microcode, firmware instructions, state-setting data, or either source code or object code written in any combination of one or more programming languages, including an object oriented programming language, and conventional procedural programming languages. The computer readable program instructions may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the latter scenario, the remote computer may be connected to the user's computer through any type of network, including a LAN or WAN, or the connection may be made to an external computer (for example, through the Internet). In some embodiments, electronic circuitry including, for example, programmable logic circuitry, field-programmable gate arrays (FPGA), or programmable logic arrays (PLA) may execute the computer readable program instructions by utilizing state information of the computer readable program instructions to personalize the electronic circuitry, in order to perform aspects of the present disclosure.
In various aspects, the systems and methods described in the present disclosure can be addressed in terms of modules. The term “module” as used herein refers to a real-world device, component, or arrangement of components implemented using hardware, such as by an application specific integrated circuit (ASIC) or FPGA, for example, or as a combination of hardware and software, such as by a microprocessor system and a set of instructions to implement the module's functionality, which (while being executed) transform the microprocessor system into a special-purpose device. A module may also be implemented as a combination of the two, with certain functions facilitated by hardware alone, and other functions facilitated by a combination of hardware and software. In certain implementations, at least a portion, and in some cases, all, of a module may be executed on the processor of a computer system. Accordingly, each module may be realized in a variety of suitable configurations, and should not be limited to any particular implementation exemplified herein.
In the interest of clarity, not all of the routine features of the aspects are disclosed herein. It would be appreciated that in the development of any actual implementation of the present disclosure, numerous implementation-specific decisions must be made in order to achieve the developer's specific goals, and these specific goals will vary for different implementations and different developers. It is understood that such a development effort might be complex and time-consuming, but would nevertheless be a routine undertaking of engineering for those of ordinary skill in the art, having the benefit of this disclosure.
Furthermore, it is to be understood that the phraseology or terminology used herein is for the purpose of description and not of restriction, such that the terminology or phraseology of the present specification is to be interpreted by the skilled in the art in light of the teachings and guidance presented herein, in combination with the knowledge of those skilled in the relevant art(s). Moreover, it is not intended for any term in the specification or claims to be ascribed an uncommon or special meaning unless explicitly set forth as such.
The various aspects disclosed herein encompass present and future known equivalents to the known modules referred to herein by way of illustration. Moreover, while aspects and applications have been shown and described, it would be apparent to those skilled in the art having the benefit of this disclosure that many more modifications than mentioned above are possible without departing from the inventive concepts disclosed herein.
Number | Date | Country | Kind |
---|---|---|---|
2022107829 | Mar 2022 | RU | national |