This application generally relates to structure inspection using an unmanned aerial vehicle (UAV), and, more specifically, to semantic three-dimensional (3D) scan systems and techniques for a multi-phase structure inspection performed using a UAV.
UAVs are often used to capture images from vantage points that would otherwise be difficult for humans to reach. Typically, a UAV is operated by a human using a controller to remotely control the movements and image capture functions of the UAV. In some cases, a UAV may have automated flight and autonomous control features. For example, automated flight features may rely upon various sensor input to guide the movements of the UAV.
Systems and techniques for, inter alia, multi-phase semantic 3D scan are disclosed.
In some implementations, a method comprises: performing, using an unmanned aerial vehicle, a first phase inspection of a structure to determine a semantic understanding of components associated with the structure and pose information of the components; determining, based on the semantic understanding of the components and the pose information, a flight path indicating capture points and camera poses associated with the capture points; and performing, using the unmanned aerial vehicle, a second phase inspection of a subset of the components according to the flight path.
In some implementations of the method, performing the first phase inspection of the structure to determine the semantic understanding of the components and the pose information comprises: navigating the unmanned aerial vehicle to a distance from the structure such that all of the structure is within a field of view of a camera of the unmanned aerial vehicle.
In some implementations of the method, performing the first phase inspection of the structure to determine the semantic understanding of the components and the pose information comprises: detecting the components using a camera of the unmanned aerial vehicle; and storing data indicative of the detected components on two-dimensional images associated with corresponding poses of the camera.
In some implementations of the method, detecting the components comprises: triangulating, using the unmanned aerial vehicle, locations of the components.
In some implementations of the method, determining the flight path indicating the capture points and the camera poses comprises: determining a number of columns for the unmanned aerial vehicle to vertically navigate, wherein each column of the number of columns includes one or more of the capture points.
In some implementations of the method, the number of columns is four columns, the four columns form a rectangular boundary surrounding the structure, and performing the second phase inspection of the subset of the components according to the flight path comprises: navigating the unmanned aerial vehicle about the rectangular boundary including ascending to a traversal height while moving between ones of the four columns.
In some implementations of the method, the number of columns is two columns, the two columns are at diagonally opposing locations about the structure, and performing the second phase inspection of the subset of the components according to the flight path comprises: navigating the unmanned aerial vehicle between the two columns over the structure.
In some implementations of the method, performing the second phase inspection of the subset of the components according to the flight path comprises: while the unmanned aerial vehicle is at a capture point of the capture points, aiming a camera of the unmanned aerial vehicle at a component of the components according to a camera pose of the camera poses; and capturing, using the aimed camera, an image of the component.
In some implementations of the method, aiming the camera at the component according to the camera pose comprises: continuously attempting to detect the component within a video feed captured using the camera until the component is centered in images of the video feed.
In some implementations of the method, the method comprises: labeling the image with information associated with one or both of the structure or the flight path.
In some implementations of the method, the method comprises: obtaining user input corresponding to one or more of an object of interest, a traversal height, a flight distance, a maximum speed, an exploration radius, a gimbal angle, or a column path, wherein the user input is used to perform the first phase inspection.
In some implementations, a UAV comprises: one or more cameras; one or more memories; and one or more processors configured to execute instructions stored in the one or more memories to: perform a first phase inspection of a structure to determine, based on one or more images captured using the one or more cameras, a semantic understanding of components associated with the structure and pose information of the components; and perform a second phase inspection of a subset of the components according to a flight path that is based on the components and the pose information.
In some implementations of the UAV, to perform the first phase inspection of the structure to determine the semantic understanding of the components and the pose information, the one or more processors are configured to execute the instructions to: capture the one or more images while all of the structure remains within a field of view of the one or more cameras; and perform image segmentation to detect the components within the one or more images.
In some implementations of the UAV, to detect the components within the one or more images, the one or more processors are configured to execute the instructions to: triangulate locations of the components to determine unique locations in three-dimensional space of the components.
In some implementations of the UAV, to perform the second phase inspection of the subset of the components according to the flight path, the one or more processors are configured to execute the instructions to: aim the one or more cameras at a component; and capture, using the aimed one or more cameras, an image of the component.
In some implementations of the UAV, the one or more processors are configured to execute the instructions to: determine the flight path based on the components and the pose information, wherein the flight path indicates capture points and camera poses associated with the capture points, and wherein the capture points are arranged into a number of columns for the unmanned aerial vehicle to vertically navigate.
In some implementations, a system comprises: an unmanned aerial vehicle; and a user device in communication with the unmanned aerial vehicle, wherein the unmanned aerial vehicle is configured to: perform a first phase inspection of a structure according to user input obtained from the user device to determine a semantic understanding of components associated with the structure and pose information of the components; determine, based on the semantic understanding of the components and the pose information, a flight path indicating a number of columns for the unmanned aerial vehicle to vertically navigate, wherein each column of the number of columns includes one or more capture points each associated with one or more of the components; and perform a second phase inspection according to the flight path.
In some implementations of the system, the first phase inspection is performed while all of the structure is within a field of view of a camera of the unmanned aerial vehicle and the components are determined based on images captured using the camera.
In some implementations of the system, the camera is aimed at a component of the one or more of the components according to a camera pose associated with a respective capture point of the one or more capture points to capture an image of the component during the second phase inspection.
In some implementations of the system, where the number of columns is four columns, the four columns form a rectangular boundary surrounding the structure and the unmanned aerial vehicle navigates about the rectangular boundary including ascending to a traversal height while moving between ones of the four columns, and where the number of columns is two columns, the two columns are at diagonally opposing locations about the structure and the unmanned aerial vehicle navigates between the two columns over the structure.
The disclosure is best understood from the following detailed description when read in conjunction with the accompanying drawings. It is emphasized that, according to common practice, the various features of the drawings are not to scale. On the contrary, the dimensions of the various features are arbitrarily expanded or reduced for clarity.
The versatility of UAVs has made their use in structural inspection increasingly common in recent years. Personnel of various industries operate UAVs to navigate about structures (e.g., buildings, towers, bridges, pipelines, and utility equipment) and capture visual media indicative of the statuses and conditions thereof. Initially, UAV inspection processes involved the manual-only operation of a UAV, such as via a user device wirelessly communicating with the UAV; however, automated approaches have been more recently used in which a UAV determines a target structure and performs a sophisticated navigation and media capture process to automatically fly around the structure and capture images and/or video thereof. In some such cases, these automated approaches may involve the UAV or a computing device in communication therewith performing a 3D scan of a target structure to generate a high-fidelity 3D geometric reconstruction thereof as part of an inspection process. For example, modeling of the 3D geometric reconstruction may be provided to a UAV operator to enable the UAV operator to identify opportunities for a further inspection of the structure.
However, these 3D scan approaches, although representing meaningful improvements over more manual and semi-manual inspection processes, may not be suitable in all structure inspection situations. In particular, such approaches generally involve high-fidelity processing and thus use large amounts of input media to generate the 3D geometric reconstruction. This ultimately involves the capture of large amounts of images or videos over a relatively long period of time. Moreover, the 3D geometric reconstruction is purely geometric in that the reconstruction resulting from the 3D scan approach is limited to geometries of the structure. In some situations, however, a UAV operator may not need high-fidelity data about all of a structure, but may instead want to focus on details related or otherwise limited to certain components of the structure. Similarly, the UAV operator may want to have a semantic understanding of the structure that goes beyond what geometries can convey. Finally, while a UAV may of course be manually operated to focus only on certain aspects of a structure, such manual operation is often labor intensive, slow, imprecise, and non-deterministic and thus offers limited value.
Implementations of this disclosure address problems such as these using multi-phase semantic 3D scan approaches for structure inspection. In particular, multi-phase semantic 3D scan according to the implementations of this disclosure includes a two-phase inspection process in which, during a first phase inspection, a UAV determines a semantic understanding of components associated with a structure under inspection and pose information of the components and, during a second phase inspection, the UAV navigates according to a flight path determined based on the components and pose information to capture images of a subset of the components of the structure. The first phase inspection may be referred to generally as an exploration phase in which structure components of relevance are identified within the inspection scene. The second phase inspection may be referred to generally as an inspection phase in which a detailed inspection limited to those components of relevance is performed. The multi-phase semantic 3D scan approaches disclosed herein involve the unique identification, by type and location, of structure components using triangulation and camera pose information obtained via the first phase inspection. Thus, using the implementations of this disclosure, portions of a structure may be inspected in a time- and media-efficient manner and in detail beyond what is conveyable by structure geometry.
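As a rough, non-authoritative illustration of how exploration output may feed an inspection plan, the following Python sketch reduces a list of semantically labeled components to capture points for a subset of relevant component types. The `Component` and `CapturePoint` structures, the `standoff` parameter, and all values are hypothetical and are not part of this disclosure.

```python
from dataclasses import dataclass
from typing import List, Set, Tuple

@dataclass
class Component:
    kind: str                             # semantic label, e.g. "insulator"
    location: Tuple[float, float, float]  # triangulated (x, y, z) in the scene frame

@dataclass
class CapturePoint:
    position: Tuple[float, float, float]       # where the UAV hovers
    camera_target: Tuple[float, float, float]  # where the camera is aimed

def plan_second_phase(components: List[Component], kinds_of_interest: Set[str],
                      standoff: float = 3.0) -> List[CapturePoint]:
    """Reduce exploration-phase output to a capture plan for the inspection phase."""
    plan = []
    for c in components:
        if c.kind not in kinds_of_interest:
            continue  # only components of relevance are inspected in detail
        x, y, z = c.location
        # Hover a standoff distance away from the component and aim at it.
        plan.append(CapturePoint(position=(x + standoff, y, z), camera_target=c.location))
    return plan

# Example exploration output for a transmission tower (made-up values).
detected = [Component("insulator", (0.0, 0.0, 18.0)),
            Component("conductor", (2.0, 0.0, 20.0)),
            Component("footer", (0.0, 0.0, 0.5))]
print(plan_second_phase(detected, kinds_of_interest={"insulator", "conductor"}))
```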
Generally, the quality of a scan (i.e., an inspection or an operation of an inspection) being “semantic” refers to the scan incorporating information and processing to recognize detailed contextual information for a structure and components thereof. In particular, a UAV performing a semantic 3D scan for a structure inspection may access and leverage comprehensive taxonomies of structures and components, generated empirically and/or via machine learning-based classification and recognition processes modeled across data sets. Using this taxonomical information, the UAV readily identifies components of relevance to a given inspection from amongst a generally or near exhaustive listing of components scanned via the first phase inspection. This semantic understanding of structures and components enables valuable automations for UAV-based structure inspections, improving the accuracy of captured data and materially reducing time and media capture requirements of other scan approaches. For example, a semantic understanding of a structure component may indicate one or more camera poses for a UAV to automatically assume to perform an inspection of (i.e., capture images of) the component, such as based on known angles and views which are desirable for the component.
To describe some implementations in greater detail, reference is first made to examples of hardware and software structures used to implement a semantic 3D scan system.
The UAV 102 is a vehicle which may be controlled autonomously by one or more onboard processing aspects or remotely controlled by an operator, for example, using the controller 104. The UAV 102 may be implemented as one of a number of types of unmanned vehicle configured for aerial operation. For example, the UAV 102 may be a vehicle commonly referred to as a drone but may otherwise be an aircraft configured for flight without a human operator present therein. In particular, the UAV 102 may be a multi-rotor vehicle. For example, the UAV 102 may be lifted and propelled by four fixed-pitch rotors in which positional adjustments in-flight may be achieved by varying the angular velocity of each of those rotors.
The controller 104 is a device configured to control at least some operations associated with the UAV 102. The controller 104 may communicate with the UAV 102 via a wireless communications link (e.g., via a Wi-Fi network, a Bluetooth link, a ZigBee link, or another network or link) to receive video or images and/or to issue commands (e.g., take off, land, follow, manual controls, and/or commands related to conducting an autonomous or semi-autonomous navigation of the UAV 102). The controller 104 may be or include a specialized device. Alternatively, the controller 104 may be or include a mobile device, for example, a smartphone, tablet, laptop, or other device capable of running software configured to communicate with and at least partially control the UAV 102.
The dock 106 is a structure which may be used for takeoff and/or landing operations of the UAV 102. In particular, the dock 106 may include one or more fiducials usable by the UAV 102 for autonomous takeoff and landing operations. For example, the fiducials may generally include markings which may be detected using one or more sensors of the UAV 102 to guide the UAV 102 from or to a specific position on or in the dock 106. In some implementations, the dock 106 may further include components for charging a battery of the UAV 102 while the UAV 102 is on or in the dock 106. The dock 106 may be a protective enclosure from which the UAV 102 is launched. A location of the dock 106 may correspond to the launch point of the UAV 102.
The server 108 is a remote computing device from which information usable for operation of the UAV 102 may be received and/or to which information obtained at the UAV 102 may be transmitted. For example, the server 108 may be used to train a learning model usable by one or more aspects of the UAV 102 to implement functionality of the UAV 102. In another example, signals including information usable for updating aspects of the UAV 102 may be received from the server 108. The server 108 may communicate with the UAV 102 over a network, for example, the Internet, a local area network, a wide area network, or another public or private network.
In some implementations, the system 100 may include one or more additional components not shown in
An example illustration of a UAV 200, which may, for example, be the UAV 102 shown in
The cradle 402 is configured to hold a UAV. The UAV may be configured for autonomous landing on the cradle 402. The cradle 402 has a funnel geometry shaped to fit a bottom surface of the UAV at a base of the funnel. The tapered sides of the funnel may help to mechanically guide the bottom surface of the UAV into a centered position over the base of the funnel during a landing. For example, corners at the base of the funnel may serve to prevent the aerial vehicle from rotating on the cradle 402 after the bottom surface of the aerial vehicle has settled into the base of the funnel shape of the cradle 402. For example, the fiducial 404 may include an asymmetric pattern that enables robust detection and determination of a pose (i.e., a position and an orientation) of the fiducial 404 relative to the UAV based on an image of the fiducial 404, for example, captured with an image sensor of the UAV.
The conducting contacts 406 are contacts of a battery charger on the cradle 402, positioned at the bottom of the funnel. The dock 400 includes a charger configured to charge a battery of the UAV while the UAV is on the cradle 402. For example, a battery pack of the UAV (e.g., the battery pack 224 shown in
The box 408 is configured to enclose the cradle 402 in a first arrangement and expose the cradle 402 in a second arrangement. The dock 400 may be configured to transition from the first arrangement to the second arrangement automatically by performing steps including opening the door 410 of the box 408 and extending the retractable arm 412 to move the cradle 402 from inside the box 408 to outside of the box 408.
The cradle 402 is positioned at an end of the retractable arm 412. When the retractable arm 412 is extended, the cradle 402 is positioned away from the box 408 of the dock 400, which may reduce or prevent propeller wash from the propellers of a UAV during a landing, thus simplifying the landing operation. The retractable arm 412 may include aerodynamic cowling for redirecting propeller wash to further mitigate the problems of propeller wash during landing. The retractable arm 412 supports the cradle 402 and enables the cradle 402 to be positioned outside the box 408, to facilitate takeoff and landing of a UAV, or inside the box 408, for storage and/or servicing of a UAV.
In some implementations, the dock 400 includes a fiducial 414 on an outer surface of the box 408. The fiducial 404 and the fiducial 414 may be detected and used for visual localization of the UAV in relation to the dock 400 to enable a precise landing on the cradle 402. For example, the fiducial 414 may encode data that, when processed, identifies the dock 400, and the fiducial 404 may encode data that, when processed, enables robust detection and determination of a pose (i.e., a position and an orientation) of the fiducial 404 relative to the UAV. The fiducial 414 may be referred to as a first fiducial and the fiducial 404 may be referred to as a second fiducial. The first fiducial may be larger than the second fiducial to facilitate visual localization from farther distances as a UAV approaches the dock 400. For example, the area of the first fiducial may be 25 times the area of the second fiducial.
The dock 400 is shown by example only and is non-limiting as to form and functionality. Thus, other implementations of the dock 400 are possible. For example, other implementations of the dock 400 may be similar or identical to the examples shown and described within U.S. patent application Ser. No. 17/889,991, filed Aug. 31, 2022, the entire disclosure of which is herein incorporated by reference.
The processing apparatus 502 is operable to execute instructions that have been stored in the data storage device 504 or elsewhere. The processing apparatus 502 is a processor with random access memory (RAM) for temporarily storing instructions read from the data storage device 504 or elsewhere while the instructions are being executed. The processing apparatus 502 may include a single processor or multiple processors each having single or multiple processing cores. Alternatively, the processing apparatus 502 may include another type of device, or multiple devices, capable of manipulating or processing data. The processing apparatus 502 may be arranged into one or more processing units, such as a central processing unit (CPU) or a graphics processing unit (GPU).
The data storage device 504 is a non-volatile information storage device, for example, a solid-state drive, a read-only memory device (ROM), an optical disc, a magnetic disc, or another suitable type of storage device such as a non-transitory computer readable memory. The data storage device 504 may include another type of device, or multiple devices, capable of storing data for retrieval or processing by the processing apparatus 502. The processing apparatus 502 may access and manipulate data stored in the data storage device 504 via the interconnect 514, which may, for example, be a bus or a wired or wireless network (e.g., a vehicle area network).
The sensor interface 506 is configured to control and/or receive data from one or more sensors of the UAV 500. The data may refer, for example, to one or more of temperature measurements, pressure measurements, global positioning system (GPS) data, acceleration measurements, angular rate measurements, magnetic flux measurements, a visible spectrum image, an infrared image, an image including infrared data and visible spectrum data, and/or other sensor output. For example, the one or more sensors from which the data is generated may include one or more of each of an image sensor 516, an accelerometer 518, a gyroscope 520, a geolocation sensor 522, a barometer 524, and/or another sensor. In some implementations, the accelerometer 518 and the gyroscope 520 may be combined as an inertial measurement unit (IMU). In some implementations, the sensor interface 506 may implement a serial port protocol (e.g., inter-integrated circuit (I2C) or serial peripheral interface (SPI)) for communications with one or more sensor devices over conductors. In some implementations, the sensor interface 506 may include a wireless interface for communicating with one or more sensor groups via low-power, short-range communications techniques (e.g., using a vehicle area network protocol).
The communications interface 508 facilitates communication with one or more other devices, for example, a paired dock (e.g., the dock 106), a controller (e.g., the controller 104), or another device, for example, a user computing device (e.g., a smartphone, tablet, or other device). The communications interface 508 may include a wireless interface and/or a wired interface. For example, the wireless interface may facilitate communication via a Wi-Fi network, a Bluetooth link, a ZigBee link, or another network or link. In another example, the wired interface may facilitate communication via a serial port (e.g., RS-232 or universal serial bus (USB)). The communications interface 508 further facilitates communication via a network, which may, for example, be the Internet, a local area network, a wide area network, or another public or private network.
The propulsion control interface 510 is used by the processing apparatus to control a propulsion system of the UAV 500 (e.g., including one or more propellers driven by electric motors). For example, the propulsion control interface 510 may include circuitry for converting digital control signals from the processing apparatus 502 to analog control signals for actuators (e.g., electric motors driving respective propellers). In some implementations, the propulsion control interface 510 may implement a serial port protocol (e.g., I2C or SPI) for communications with the processing apparatus 502. In some implementations, the propulsion control interface 510 may include a wireless interface for communicating with one or more motors via low-power, short-range communications (e.g., a vehicle area network protocol).
The user interface 512 allows input and output of information from/to a user. In some implementations, the user interface 512 can include a display, which can be a liquid crystal display (LCD), a light emitting diode (LED) display (e.g., an organic light-emitting diode (OLED) display), or another suitable display. In some such implementations, the user interface 512 may be or include a touchscreen. In some implementations, the user interface 512 may include one or more buttons. In some implementations, the user interface 512 may include a positional input device, such as a touchpad, touchscreen, or the like, or another suitable human or machine interface device.
In some implementations, the UAV 500 may include one or more additional components not shown in
The UAV 602 includes hardware and software that configure the UAV 602 to perform a multi-phase semantic 3D scan of the structure 604. In particular, and in addition to other components as are described with respect to
The components 610 generally are, include, or otherwise refer to components (e.g., objects, elements, pieces, equipment, sub-equipment, tools, or other physical matter) associated with (i.e., on or within) the structure 604. In one non-limiting example, where the structure 604 is a powerline transmission tower, the components 610 may include one or more of an insulator, a static line or connection point, a conductor or overhead wire, a footer, or a transformer. The structure 604 may include any number of types of the components 610 and any number of ones of the components 610 for each of the individual types thereof.
The output of the multi-phase semantic 3D scan inspection of the structure 604 includes at least one or more images, captured using the one or more cameras 608, of a subset of the components 610. The output may be communicated to a user device 612 via a user/UAV interface 614. The user device 612 is a computing device configured to communicate with the UAV 602 wirelessly or by wire. For example, the user device 612 may be one or more of the controller 104 shown in
The UAV 602, via the multi-phase semantic 3D scan software 606, may utilize empirical and/or machine learning-based data modeled for use in structure inspections. In particular, the UAV 602 may communicate with a server 616 that includes a data library 618 usable by the multi-phase semantic 3D scan software 606 to perform a multi-phase semantic 3D scan inspection of the structure 604. The server 616 is a computing device remotely accessible by or otherwise to the UAV 602. The data library 618 may, for example, include one or more of historic inspection data for the structure 604 or like structures, machine learning models (e.g., classification engines comprising trained convolutional or deep neural networks) trained according to inspection image output data sets with user-specific information culled, and/or other information usable by the system 600. In some cases, the data library 618 or other aspects at the server 616 may be accessed by or otherwise using the user device 612 instead of the UAV 602.
To further describe functionality of the multi-phase semantic 3D scan software 606, reference is next made to
The user input processing tool 700 obtains and processes input obtained from an operator of the UAV for the multi-phase semantic 3D scan of the structure (e.g., the structure 604 shown in
As will be described in more detail below, additional user input may be obtained and processed using the user input processing tool 700 following a first inspection phase of the multi-phase semantic 3D scan of the structure. In particular, in some cases, output of the first inspection phase indicating a semantic understanding of components associated with (i.e., on or within) the structure and pose information of the components may be presented to an operator of the UAV (e.g., via the user device or a secondary device in communication with the user device or otherwise registered for use with the UAV). The output of the first inspection phase in such cases is presented to enable the operator of the UAV to indicate ones of the components which they would like the UAV to inspect during a second inspection phase of the multi-phase semantic 3D scan of the structure. For example, a list of components determined via the first inspection phase may be presented within a graphical user interface (GUI) output for display at the user device or a secondary device. The list may include individual entries each corresponding to a different component type. For example, where the structure is a powerline transmission tower, the GUI may include a first entry for insulators, a second entry for connection points, a third entry for conductors, and so on. The entries may each have a selectable or otherwise interactive element (e.g., a checkbox). When such an interactive element is selected or otherwise interacted with, the additional user input is obtained, in which the additional user input indicates an intention of the UAV operator to cause the respective components associated with the subject entry to be inspected during a second phase inspection of the multi-phase semantic 3D scan of the structure. However, selections of components in this manner may instead be performed automatically by the UAV or otherwise by a computing device in communication with the UAV (e.g., using intelligence of detected components independently, relative to the structure, or in view of an inspection purpose for the structure (e.g., known issues resulting in the performance of the multi-phase semantic 3D scan of the structure)). The additional user input as described herein thus refers to manual processes for indicating components, as distinguished from such automated approaches for indicating components.
In some cases, the additional user input may be provided partially before and partially after the second phase inspection begins. For example, after the second phase inspection begins, the user input processing tool 700 may receive, via a GUI output for display at the user device or a secondary device, input indicating control values for capturing the one or more images during the second phase inspection. The control values are generally values usable by the UAV and/or the multi-phase semantic 3D scan software 606 to control some aspect of the second phase inspection and may, in non-limiting examples, refer to one or more of gimbal angles for the UAV camera(s), viewpoints of the UAV relative to components to inspect, or incline values indicating how the UAV is to approach the component in one or more directions.
The structure component determination tool 702 determines semantic understandings of components associated with (i.e., on or within) the structure to inspect and pose information of the components. As will be described in more detail below, a first inspection phase of the multi-phase semantic 3D scan of the structure includes the UAV capturing one or more images (i.e., exploration images) of the structure while navigating about an exploration path around some or all of the structure. The structure component determination tool 702 obtains, as input, those one or more images captured during the first inspection phase and processes them to determine, as output, semantic understandings of the components to inspect and the pose information of the components. In particular, the structure component determination tool 702 processes an image captured by the UAV during the first inspection phase to detect one or more objects, as one or more components, within the image and to determine pose information of those one or more components using navigation system information of the UAV.
Detecting the one or more objects within the image as the one or more components includes performing object detection against the image to identify a bounding box for each component within the image. Objects detected within the image are thus represented via their bounding box, and object recognition is performed against the bounded objects to identify them as components of the structure, as well as to identify what components they are. In some cases, performing object recognition includes comparing objects detected within the images against modeled object data to determine whether an object appears as expected. The objects detected and recognized are identified as components and information indicating the components is stored in connection with the bounding boxes of the components. In some cases, image segmentation may be performed instead of object detection and object recognition. In some cases, other computer vision techniques may be performed instead of either of the above approaches.
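As a hedged sketch of the detect-and-recognize flow described above, the following Python snippet filters bounded objects down to recognized structure components and stores the labels alongside their bounding boxes. The detector output format, the `min_score` threshold, and the small component taxonomy are illustrative assumptions; a real system would obtain the raw detections from a trained model or image segmentation step rather than a hard-coded list.

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class RecognizedComponent:
    bbox: Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max) in image pixels
    label: str                       # component type assigned by recognition
    score: float                     # recognition confidence

def recognize_components(raw_detections: List[Tuple[Tuple[int, int, int, int], str, float]],
                         known_component_types: set,
                         min_score: float = 0.5) -> List[RecognizedComponent]:
    """Keep only bounded objects that are recognized as structure components."""
    components = []
    for bbox, label, score in raw_detections:
        if label in known_component_types and score >= min_score:
            components.append(RecognizedComponent(bbox, label, score))
    return components

# Example raw detector output for one exploration image (made-up values).
raw = [((120, 40, 180, 90), "insulator", 0.91),
       ((300, 10, 640, 30), "conductor", 0.84),
       ((0, 400, 50, 480), "bird", 0.60)]  # detected, but not a structure component
print(recognize_components(raw, {"insulator", "conductor", "footer"}))
```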
Determining the pose information of the one or more components detected within the image using the navigation system information of the UAV includes determining orientations and/or locations of the detected components based on their bounding boxes. The pose information of the components thus corresponds to or otherwise represents the orientations and/or locations of the different components within a scene of the structure. In particular, the pose information of a given component may identify sides, facets, surfaces, or like qualities of the component independently and with relation to others. For example, the pose information may identify a front of a component such that a location of the front of the component may be identifiable. In some cases, the pose information of a component may be based on the type of the component. For example, because inspectors of powerline transmission towers generally need to view insulators from their top and bottom sides, the pose information for an insulator component may include a top pose above the insulator and a bottom pose below the insulator.
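The mapping from a component's type and location to candidate camera poses might be sketched as follows; the insulator-specific top and bottom poses mirror the example above, while the `offset` distance and the pose representation (a camera position plus a look-at point) are assumptions made only for illustration.

```python
from typing import List, Tuple

# A pose is represented here as (camera position, look-at point).
Pose = Tuple[Tuple[float, float, float], Tuple[float, float, float]]

def poses_for_component(kind: str, location: Tuple[float, float, float],
                        offset: float = 2.0) -> List[Pose]:
    """Return candidate camera poses for viewing a component of a given type."""
    x, y, z = location
    if kind == "insulator":
        # Insulators are generally inspected from their top and bottom sides.
        return [((x, y, z + offset), location),   # top pose, looking down
                ((x, y, z - offset), location)]   # bottom pose, looking up
    # Default: a single pose offset horizontally from the component.
    return [((x + offset, y, z), location)]

print(poses_for_component("insulator", (1.0, 2.0, 18.0)))
```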
Because it is important that components be uniquely identified, to prevent multiple components of the same type from being confused as being the same exact component, the structure component determination tool 702, as part of the component and pose information determination, performs a triangulation process to ensure that specific component location information is known. In particular, locations of bounding boxes for detected components may be denoted using location data according to one or more of a visual positioning system (VPS) of the UAV, GPS, or another location-based system. The location data for bounding boxes of the detected components may be compared to determine duplicate components, in which a duplicate component is a component that has already been detected in a previous image. Duplicate components are accordingly culled to prevent them from being considered for scanning multiple times.
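A minimal sketch of the duplicate-culling step is shown below, assuming each detection already carries a triangulated 3D location (e.g., VPS- or GPS-derived) and that detections of the same type closer than a chosen separation correspond to the same physical component; the `min_separation` value is an illustrative assumption.

```python
import math
from typing import List, Tuple

Detection = Tuple[str, Tuple[float, float, float]]  # (component type, triangulated location)

def dedupe_components(detections: List[Detection], min_separation: float = 1.0) -> List[Detection]:
    """Cull duplicate detections of the same physical component.

    Two detections of the same type whose triangulated locations are closer
    than min_separation metres are treated as the same component; only the
    first one encountered is kept.
    """
    unique: List[Detection] = []
    for kind, loc in detections:
        duplicate = any(kind == u_kind and math.dist(loc, u_loc) < min_separation
                        for u_kind, u_loc in unique)
        if not duplicate:
            unique.append((kind, loc))
    return unique

# The same insulator seen in two exploration images, plus a second insulator.
seen = [("insulator", (1.0, 2.0, 18.0)),
        ("insulator", (1.1, 2.0, 18.1)),   # duplicate of the first
        ("insulator", (5.0, 2.0, 18.0))]
print(dedupe_components(seen))             # two unique insulators remain
```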
The flight path determination tool 704 determines a flight path for the UAV to navigate within the scene of the structure based on the components and the pose information of the components, as determined using the structure component determination tool 702. In particular, the flight path includes capture points (e.g., waypoints) to which the UAV will navigate for capturing one or more images of individual components of the components during the second phase inspection of the structure. The capture points are expressed as a sequence within the flight path according to a trajectory determined using the flight path determination tool 704. The capture points are generally arranged into a number of columns, in which a column generally refers to a vertical airspace through which the UAV will ascend, descend, and/or otherwise navigate to capture one or more images of one or more components visible from that column according to the respective pose information. The specific size of a column is non-limiting; generally, however, the height of a column will exceed its width. The flight path thus causes the UAV to navigate between all columns during the second phase inspection.
Determining the flight path for the UAV to navigate includes determining a trajectory for the UAV to navigate proximate to locations of the bounding boxes determined using the structure component determination tool 702. Determining the trajectory includes determining a number of columns for the UAV to vertically navigate based on the locations of the components according to location information identified for their respective bounding boxes, in which each column of the number of columns includes one or more of the capture points. In particular, a bounding box determined for a component may identify a VPS-based, GPS-based, or other location for the component within the scene of the structure. The location of a bounding box may be recorded within data or metadata associated with the bounding box. Thus, the flight path determination tool 704 obtains the location information for each of the bounding boxes (i.e., for each of the components) and determines the trajectory accordingly.
The trajectory generally refers to a manner by which the UAV may fly to visually access each of the components to inspect during the second phase inspection. The flight path determination tool 704 uses the pose information for the components to determine locations at which the UAV may visually access respective ones of the components. Visual access as used herein generally refers to the UAV navigating to a location about a structure at which a subject component may be positioned entirely within a field of view of a camera of the UAV. The visual access may be based on specific pose information for the subject component, for example, to ensure that a specific facet of the component is visible in which the location of that facet is known based on the pose information of the component.
To ensure inspection efficiency, however, and recognizing that pose information may express visual access for a given component at multiple possible locations, determining the flight path includes determining a number of columns for the UAV to vertically navigate. The columns are determined to wholly or partially surround some or all of the structure. For example, the number of columns may be two or four. The particular number of columns to use for a given inspection may be dependent upon the location of the structure relative to obstructions or other structures in the scene, the type of the structure, and/or other factors. The use of columns for vertical navigation restricts horizontal traversal of the UAV proximate to the structure for the safety and security of the UAV and the structure.
The flight path determination tool 704 determines the number of columns according to the trajectory and thus the locations of the components and the pose information thereof. For example, flight path mapping software as is commonly used in UAV inspection systems may be used to determine an optimal trajectory according to the location and pose information of the components. The number of columns may thus be automatically determined based on the output of such flight path mapping functionality. In some cases, the number of columns may be determined based on user input processed using the user input processing tool 700. In either case, the locations of the columns are also determined according to the number of columns. For example, where two columns may be used to visually access all necessary components to inspect during the second phase inspection, the two columns may be located at opposing diagonal ends of the structure. In another example, where four columns may be used to visually access all such components, the four columns may form a rectangular boundary around the relevant part or whole of the structure and thus be located at corners of that rectangular boundary.
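One way the column locations might be computed is sketched below, under the assumption that an axis-aligned rectangular boundary around the component locations (padded by a margin) is acceptable; the `margin` value and the choice of corners for the two-column case are illustrative, not prescribed by this disclosure.

```python
from typing import List, Tuple

def column_locations(component_locations: List[Tuple[float, float, float]],
                     num_columns: int, margin: float = 5.0) -> List[Tuple[float, float]]:
    """Place vertical navigation columns around the components to inspect.

    With four columns, the columns sit at the corners of a rectangular
    boundary surrounding the component locations (padded by a margin); with
    two columns, they sit at diagonally opposing corners of that boundary.
    """
    xs = [x for x, _, _ in component_locations]
    ys = [y for _, y, _ in component_locations]
    x_lo, x_hi = min(xs) - margin, max(xs) + margin
    y_lo, y_hi = min(ys) - margin, max(ys) + margin
    corners = [(x_lo, y_lo), (x_hi, y_lo), (x_hi, y_hi), (x_lo, y_hi)]
    if num_columns == 2:
        return [corners[0], corners[2]]  # diagonally opposing locations
    return corners                       # rectangular boundary

locs = [(0.0, 0.0, 18.0), (2.0, 1.0, 20.0), (1.0, -1.0, 12.0)]
print(column_locations(locs, num_columns=4))
print(column_locations(locs, num_columns=2))
```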
The flight path determination tool 704 then uses the number of columns and the trajectory to determine the flight path for the second phase inspection of the structure. In particular, the flight path generally corresponds to the trajectory but aligns with the locations of the columns to ensure that the UAV efficiently navigates between the columns. The flight path generally will cause the UAV to navigate between the columns at a certain altitude above the structure and then to descend to each of one or more capture point locations associated with the column to enable a capture of images of respective components visually accessible from the subject column. The flight path may generally begin at a first column and end at a last column. For example, navigating the flight path may include the UAV flying from an initial location to a location of the first column to begin the flight path.
The flight path determination tool 704 further determines camera poses for a camera of the UAV, to correspond with the capture points of the flight path. In particular, a camera pose represents a manner by which to orient a camera of the UAV in order to capture an image of a subject component at a subject capture point. The camera pose data may be embedded within the flight path using VPS, GPS, or another location-based system.
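The sequencing of capture points and camera poses into a column-based flight path might look like the following sketch; the waypoint dictionary format, the per-column capture lists, and the handling of the traversal height between columns are assumptions for illustration only.

```python
from typing import Dict, List, Optional, Tuple

Waypoint = Dict[str, Optional[Tuple[float, ...]]]

def build_flight_path(columns: List[Tuple[float, float]],
                      captures: Dict[Tuple[float, float],
                                     List[Tuple[float, Tuple[float, float, float]]]],
                      traversal_height: float) -> List[Waypoint]:
    """Sequence capture points and camera targets into a column-based flight path."""
    waypoints: List[Waypoint] = []
    for cx, cy in columns:
        # Enter the column at the traversal height before descending.
        waypoints.append({"position": (cx, cy, traversal_height), "aim": None})
        for altitude, camera_target in captures.get((cx, cy), []):
            waypoints.append({"position": (cx, cy, altitude), "aim": camera_target})
        # Ascend back to the traversal height before moving to the next column.
        waypoints.append({"position": (cx, cy, traversal_height), "aim": None})
    return waypoints

columns = [(-5.0, -5.0), (7.0, 6.0)]
captures = {(-5.0, -5.0): [(18.0, (0.0, 0.0, 18.0))],
            (7.0, 6.0): [(20.0, (2.0, 1.0, 20.0)), (12.0, (1.0, -1.0, 12.0))]}
for wp in build_flight_path(columns, captures, traversal_height=30.0):
    print(wp)
```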
In some cases, an operator of the UAV, via a GUI output for display at the user device or a secondary device, may alter the flight path after it is determined. For example, the GUI may provide a graphical representation of the flight path about the scene of the structure and may include one or more user interface tools configured to enable a modification of some or all of the flight path. In one non-limiting example, a touch interface or click-and-drag approach may be used to alter one or more lines of the flight path.
In some cases, the flight path determination tool 704 may initiate the determination of the flight path using a template associated with the structure or a type of the structure. For example, the template may include capture points in a number of columns that are most commonly used for the specific structure under inspection or for structures of the same type as that specific structure. In some such cases, the operator of the UAV may modify some or all of the templatized flight path in the manner described above.
While the above describes the functionality of the flight path determination tool 704 with respect to all components determined by the structure component determination tool 702, in some cases, the flight path determination tool 704 determines a flight path for only a subset of those components. For example, the user input processed by the user input processing tool 700 may specify only certain types of components or only certain components to inspect. The flight path determination tool 704 may thus reference the components determined using the structure component determination tool 702 against the components represented within that user input to determine a subset of the components for which to determine the flight path. In some such cases, the subset of components for which to determine the flight path may also or instead be determined based on additional user input obtained using the user input processing tool 700 (e.g., after the first inspection phase of the multi-phase semantic 3D scan of the structure). In another example, an automated determination of a subset of the components determined using the structure component determination tool 702 may be made based on historical scan data for the structure or similar structures, information related to the structure or similar structures, and/or like factors.
The semantic 3D scan inspection tool 706 causes a performance of a multi-phase inspection of the structure, including a first phase inspection performed to determine a semantic understanding of the components and the pose information of the components and a second phase inspection performed according to the flight path to determine output associated with some or all of those components. In particular, to cause a performance of the first phase inspection, the semantic 3D scan inspection tool 706 causes a navigation system of the UAV (e.g., via the propulsion control interface 510 shown in
To begin the first phase inspection, a location of the structure is provided via VPS, GPS, or a like location service. The semantic 3D scan inspection tool 706 instructs the navigation system of the UAV to navigate toward the location of the structure at an altitude above the structure while instructing one or more cameras of the UAV to capture images and/or video of the structure with the structure visible and centered within a field of view of the one or more cameras. The UAV accordingly navigates toward the structure with the entire structure visible and centered within the field of view of the one or more cameras until the structure is large enough to cover a threshold percentage (e.g., 50 percent) of the images or video captured by the one or more cameras (i.e., a threshold percentage of the pixels of the images or video frames). Based on a determination that the structure covers the threshold percentage of the images or video, the UAV begins its orbit around the structure while continuing to capture images or video (i.e., exploration images) until all portions of the structure have been captured within at least one image or video frame. Those images or video captured while the UAV orbits around the structure are then provided to the structure component determination tool 702 for use in determining the components of the structure and pose information of the components.
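A hedged sketch of the exploration decision described above follows: approach while the structure covers less than a threshold fraction of the frame, then begin the orbit. The pixel counts and the way coverage is measured (e.g., from a segmentation mask) are assumptions for illustration.

```python
def structure_coverage(structure_pixels: int, total_pixels: int) -> float:
    """Fraction of the frame covered by the structure (e.g., from a segmentation mask)."""
    return structure_pixels / total_pixels

def exploration_action(coverage: float, threshold: float = 0.5) -> str:
    """Decide the next exploration action from the current frame coverage."""
    if coverage < threshold:
        # Keep the structure centered in the field of view and keep approaching.
        return "approach"
    # The structure is large enough in frame: begin the orbit, capturing
    # exploration images until every portion has been seen at least once.
    return "orbit"

print(exploration_action(structure_coverage(120_000, 1_000_000)))  # approach
print(exploration_action(structure_coverage(600_000, 1_000_000)))  # orbit
```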
To cause a performance of the second phase inspection, the semantic 3D scan inspection tool 706 causes the navigation system of the UAV to fly the UAV according to the flight path determined using the flight path determination tool 704. When the UAV arrives at a given capture point of the flight path, the semantic 3D scan inspection tool 706 causes the UAV, via the one or more cameras of the UAV, to perform a fine-aiming process by which a camera of the UAV is aimed at a component of the structure according to a camera pose represented within or otherwise by the flight path. The semantic 3D scan inspection tool 706 determines whether the component is centered within a field of view of the camera (e.g., based on the component covering a threshold percentage (e.g., fifty percent) of the field of view). Based on a determination that the component is centered within the field of view of the camera, the semantic 3D scan inspection tool 706 causes the camera to capture an image of the component. The UAV then navigates to a next capture point of the flight path and continues to capture images of components and navigate along the flight path until a last capture point is reached and an image is captured thereat. At such a time, the UAV is determined to have completed the flight path and may return to its dock or to another location for further action.
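The fine-aiming check at a capture point might be sketched as follows, treating a component as ready for capture once its detection is centered and covers a threshold fraction of the frame; the frame size, centering tolerance, and simulated detections are illustrative assumptions.

```python
from typing import List, Optional, Tuple

BBox = Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max) in pixels

def is_centered(bbox: BBox, frame_w: int, frame_h: int,
                tolerance: float = 0.1, min_coverage: float = 0.5) -> bool:
    """True when the detected component is centered and large enough in frame."""
    x_min, y_min, x_max, y_max = bbox
    cx = (x_min + x_max) / 2 / frame_w
    cy = (y_min + y_max) / 2 / frame_h
    coverage = ((x_max - x_min) * (y_max - y_min)) / (frame_w * frame_h)
    return abs(cx - 0.5) < tolerance and abs(cy - 0.5) < tolerance and coverage >= min_coverage

def fine_aim_and_capture(detections_per_frame: List[Optional[BBox]]) -> str:
    """Keep adjusting aim over successive video frames until the component is centered."""
    for bbox in detections_per_frame:
        if bbox is not None and is_centered(bbox, frame_w=1920, frame_h=1080):
            return "capture image"
        # Otherwise, the gimbal/vehicle would be nudged toward the component here.
    return "capture not taken"

# Simulated detections over successive video frames (made-up pixel boxes).
frames = [None, (100, 100, 700, 500), (300, 120, 1620, 960)]
print(fine_aim_and_capture(frames))  # "capture image" once centered
```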
The output of the second phase inspection includes the images of the components captured at the capture points of the flight path (i.e., inspection images). Those images are labeled with information associated with one or both of the structure or the flight path. For example, the images may be embedded with data or metadata identifying a location of the structure, a name or other identifier associated with the structure, a name or other identifier of respective components depicted within the images, columns of the flight path in which the UAV was located when the images were captured, a date of the inspection, and/or like information.
The multi-phase semantic 3D scan software 606 is shown and described as being run (e.g., executed, interpreted, or the like) at a UAV used to perform an inspection of a structure. However, in some cases, the multi-phase semantic 3D scan software 606 or one or more of the tools 700 through 706 may be run other than at the UAV. For example, one or more of the tools 700 through 706 may be run at a user device or a server device and the output thereof may be communicated to the UAV for processing.
In some implementations, trajectories other than columns may be used to determine the flight path and to navigate according to the flight path during the inspection phase. For example, the flight path, rather than being based on columns, may instead use another trajectory type, such as an orbit, a lawnmower pattern, a zig-zag pattern, a horizontal line pattern, an oblique line pattern, or even random waypoints. The particular type of trajectory may be based on the structure type, and thus the use of columns or the non-use of columns may be based on the structure type. Implementations of this disclosure that address columns may thus be understood to also address alternative trajectories as set forth herein.
Referring next to
To further describe some implementations in greater detail, reference is next made to examples of techniques for semantic 3D scan for multi-phase structure inspection.
For simplicity of explanation, the technique 1000 is depicted and described herein as a series of steps or operations. However, the steps or operations in accordance with this disclosure can occur in various orders and/or concurrently. Additionally, other steps or operations not presented and described herein may be used. Furthermore, not all illustrated steps or operations may be required to implement a technique in accordance with the disclosed subject matter.
At 1002, user input associated with a structure to inspect is obtained. The user input corresponds to one or more configurations for the inspection of the structure. For example, the user input may correspond to one or more of an object of interest, a traversal height, a flight distance, a maximum speed, an exploration radius, a gimbal angle, or a column path, wherein the user input is used to perform the first phase inspection. The user input may, for example, be obtained from a user device in communication (e.g., wireless communication) with a UAV to use to perform the inspection.
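For illustration, the configuration values named above could be gathered into a simple container like the following; the field names, units, and default values are assumptions for this sketch and are not defined by the disclosure.

```python
from dataclasses import dataclass

@dataclass
class InspectionConfig:
    object_of_interest: str = "insulator"
    traversal_height_m: float = 30.0
    flight_distance_m: float = 5.0
    max_speed_mps: float = 4.0
    exploration_radius_m: float = 25.0
    gimbal_angle_deg: float = -30.0
    column_path: str = "four-column"

# e.g., user input received from the user device before the first phase inspection.
config = InspectionConfig(object_of_interest="conductor", max_speed_mps=3.0)
print(config)
```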
At 1004, a first phase inspection of the structure is performed using the UAV to determine a semantic understanding of components of the structure and pose information of the components. Performing the first phase inspection of the structure includes navigating the UAV to a distance from the structure such that all of the structure is within a field of view of a camera of the UAV. Performing the first phase inspection of the structure further includes detecting the components using a camera of the unmanned aerial vehicle and storing data indicative of the detected components on two-dimensional images associated with corresponding poses of the camera. That is, once the structure is centered within the field of view of the camera of the UAV, and based on a determination that the structure covers a threshold percentage of the field of view, one or more images of the structure are captured using the camera. Thereafter, the components of the structure are detected within the one or more images, and data indicative of the detected components is stored for later use during the multi-phase inspection. For example, the components may be detected within the one or more images using a computer vision technique or multiple such techniques, such as one or more of object detection, object recognition, or image segmentation. Detecting the components may further, for example, include triangulating, using the UAV, locations of the components, such as to determine unique locations of the components in 3D space.
At 1006, a flight path is determined based on the components and the pose information. The flight path indicates capture points and camera poses associated with the capture points. For example, determining the flight path may include determining a number of columns for the unmanned aerial vehicle to vertically navigate, wherein each column of the number of columns includes one or more of the capture points. In non-limiting examples, the number of columns may be two or four.
At 1008, a second phase inspection of the structure is performed using the UAV according to the flight path. The second phase inspection may be performed for all components detected within the one or more images captured during the first phase inspection or for a subset of those components. Performing the second phase inspection may include navigating the UAV in a particular manner based on the number of columns of the flight path. For example, where the number of columns is four columns, the four columns form a rectangular boundary surrounding the structure, and performing the second phase inspection of the subset of the components according to the flight path may include navigating the UAV about the rectangular boundary including ascending to a traversal height while moving between ones of the four columns. In another example, where the number of columns is two, the two columns are at diagonally opposing locations about the structure, and performing the second phase inspection of the subset of the components according to the flight path may include navigating the UAV between the two columns over the structure. In either case, performing the second phase inspection according to the flight path includes, while the UAV is at a capture point of the capture points, aiming a camera of the UAV at a component of the components according to a camera pose of the camera poses, and capturing, using the aimed camera, an image of the component. For example, aiming the camera at the component according to the camera pose may include continuously attempting to detect the component within a video feed captured using the camera until the component is centered in images of the video feed.
At 1010, inspection data produced based on the second phase inspection is output. The inspection data may, for example, include one or more images captured during the second phase inspection. In some cases, the one or more images of the inspection data may be labeled with information associated with one or both of the structure or the flight path. For example, labeling the one or more images of the inspection data may include introducing metadata (e.g., EXIF metadata) on files of the one or more images. Alternatively, labeling the one or more images of the inspection data may include moving files of the one or more images into folders of a storage system (e.g., at the UAV, user device, or server), such that the images are labeled according to the names of their respective folders.
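A minimal sketch of the folder-based labeling option is shown below, assuming a directory layout that encodes the structure identifier, flight-path column, and component type; the layout, file names, and function name are illustrative only.

```python
import shutil
from pathlib import Path

def label_by_folder(image_path: str, structure_id: str, column: int,
                    component: str, root: str = "inspections") -> Path:
    """File an inspection image under folders that encode its labels."""
    destination = Path(root) / structure_id / f"column_{column}" / component
    destination.mkdir(parents=True, exist_ok=True)
    return Path(shutil.move(image_path, destination / Path(image_path).name))

# e.g., label_by_folder("capture_0042.jpg", "tower-17", column=2, component="insulator")
# would file the image under inspections/tower-17/column_2/insulator/capture_0042.jpg
```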
In some implementations, the exploration phase and inspection phase may be collectively referred to as a first inspection, and a second inspection using the UAV may be performed or at least initiated while the first inspection remains in progress. For example, an exploration phase of a second inspection may be interleaved with the inspection phase of the first inspection. In such a case, the exploration of next structure components may be performed at the same time as the inspection of the current structure components.
The implementations of this disclosure can be described in terms of functional block components and various processing operations. Such functional block components can be realized by a number of hardware or software components that perform the specified functions. For example, the disclosed implementations can employ various integrated circuit components (e.g., memory elements, processing elements, logic elements, look-up tables, and the like), which can carry out a variety of functions under the control of one or more microprocessors or other control devices.
Similarly, where the elements of the disclosed implementations are implemented using software programming or software elements, the systems and techniques can be implemented with a programming or scripting language, such as C, C++, Java, JavaScript, assembler, or the like, with the various algorithms being implemented with a combination of data structures, objects, processes, routines, or other programming elements.
Functional aspects can be implemented in algorithms that execute on one or more processors. Furthermore, the implementations of the systems and techniques disclosed herein could employ a number of conventional techniques for electronics configuration, signal processing or control, data processing, and the like. The words “mechanism” and “component” are used broadly and are not limited to mechanical or physical implementations, but can include software routines in conjunction with processors, etc. Likewise, the terms “system” or “tool” as used herein and in the figures, but in any event based on their context, may be understood as corresponding to a functional unit implemented using software, hardware (e.g., an integrated circuit, such as an ASIC), or a combination of software and hardware. In certain contexts, such systems or tools may be understood to be a processor-implemented software system or processor-implemented software tool that is part of or callable by an executable program, which may itself be wholly or partly composed of such linked systems or tools.
Implementations or portions of implementations of the above disclosure can take the form of a computer program product accessible from, for example, a computer-usable or computer-readable medium. A computer-usable or computer-readable medium can be a device that can, for example, tangibly contain, store, communicate, or transport a program or data structure for use by or in connection with a processor. The medium can be, for example, an electronic, magnetic, optical, electromagnetic, or semiconductor device.
Other suitable mediums are also available. Such computer-usable or computer-readable media can be referred to as non-transitory memory or media, and can include volatile memory or non-volatile memory that can change over time. A memory of an apparatus described herein, unless otherwise specified, does not have to be physically contained by the apparatus, but is one that can be accessed remotely by the apparatus, and does not have to be contiguous with other memory that might be physically contained by the apparatus.
While the disclosure has been described in connection with certain implementations, it is to be understood that the disclosure is not to be limited to the disclosed implementations but, on the contrary, is intended to cover various modifications and equivalent arrangements included within the scope of the appended claims, which scope is to be accorded the broadest interpretation so as to encompass all such modifications and equivalent structures as is permitted under the law.
This application claims the benefit of U.S. Provisional Application Ser. No. 63/539,357, filed Sep. 20, 2023, the entire disclosure of which is herein incorporated by reference.