This patent application claims the benefit and priority of Chinese Patent Application No. 202110268931.X, entitled “method for restoring video data of drainage pipe based on computer vision” filed with the China National Intellectual Property Administration on Mar. 12, 2021, the disclosure of which is incorporated by reference herein in its entirety as part of the present application.
The present disclosure relates to the technical field of image and video processing, and in particular, to an image restoration technique based on computer vision, including target detection and straight line detection.
Detection and repair of pipe defects is an important part of urban construction and has become a research hotspot in computer vision. Unfortunately, it is very difficult to obtain high-quality video data of a pipe. At present, pipe detection mainly relies on a pipe robot equipped with a high-definition camera to obtain data from inside a pipe. Mainstream pipe robots on the market are generally pulled forward by ropes, so the ropes inevitably appear in the video data of the pipe and seriously interfere with the identification of defects in the edge information of the pipe. In view of this problem, the present disclosure focuses on a video data restoration algorithm based on computer vision, in which scale-invariant feature transform (SIFT) corner detection is used to recognize a target, Hough transform is used to detect the ropes, and image restoration is performed on the recognized area. This method can effectively eliminate the interference of the ropes and an iron chain in the video data of the pipe.
At present, there are some problems in the field of pipe defect recognition, such as interference from a power unit of a robot in video data of a pipe. As a result, in an actual complex pipe environment, defect features of the pipe are easily affected by a change of the power unit, and subsequent pipe detection cannot be performed efficiently, thereby increasing detection costs.
The present disclosure provides a method for restoring video data of a pipe based on computer vision, which is applicable to the field of pipe defect detection and repair.
In view of the above-mentioned problems in the conventional art, the present disclosure provides a method for restoring video data of a pipe based on computer vision by combining three algorithms from the current computer vision field: the SIFT corner detection algorithm, the Hough transform straight line detection algorithm, and Telea's fast marching method (FMM) image restoration algorithm. This method can effectively eliminate the interference of the rope power source in the video data of a pipe, thereby improving the quality of the data and significantly enhancing the efficiency of pipe corrosion detection, and the method is applicable to the field of urban drainage pipe maintenance.
To achieve the above effect, the present disclosure provides the following technical solution:
Step (1), a pipe robot is controlled to obtain pipe images/videos in a pipe, and gray stretching and smoothing filtering are performed on the pipe images/videos.
Step (2), a clear frame of data is selected to extract an iron chain from the data as a template.
Step (3), target detection is performed on the iron chain in the center of the pipe in the video data by using the SIFT algorithm to determine a position of the iron chain.
Step (4), ropes on left and right sides of the iron chain are detected by using Hough transform to determine positions of the ropes.
Step (5), pixels with a gray value of 0 are used to cover the located iron chain and ropes.
Step (6), the video data is restored by using Telea's FMM image restoration algorithm.
Step (7), the restored video data is obtained.
The present disclosure has the following advantages: the method can effectively eliminate the interference of the iron chain and ropes in the video data of a pipe, thereby improving efficiency of subsequent pipe defect recognition, and thus having a certain reference value for pipe defect detection.
The present disclosure will be further described below in conjunction with the accompanying drawings and embodiments.
To make the objective, technical solutions and effects of the present disclosure clearer and more comprehensible, the present disclosure will be described in more detail with reference to the accompanying drawings and embodiments, but these embodiments should not be construed as a limitation to the present disclosure.
As shown in
S1010: a pipe robot with a high-definition camera enters a pipe to collect image/video information of a pipe, and gray stretching is performed on the collected pipe image/video. Contrast of the pipe image is enhanced to make light and shade contrast of the pipe image more distinct and features more obvious. A gray value f(x,y) of each pixel (x,y) in an input image is used as an independent variable of a function, and H denotes a transform operation performed on f(x,y) in the spatial domain to increase or reduce the gray value thereof, and thus a dependent variable is obtained as a gray value g(x,y) in an output image. Equation (1) is specifically as follows:
g(x,y)=H[f(x,y)] (1)
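By way of a non-limiting sketch, the gray stretching of equation (1) may be illustrated with a linear contrast stretch. The disclosure does not fix the form of H; the percentile-based linear mapping, the function name gray_stretch, and its default parameters are assumptions made only for this illustration.

```python
import numpy as np


def gray_stretch(f, low_pct=1.0, high_pct=99.0):
    """Linear contrast stretch: one possible choice of H in g(x,y) = H[f(x,y)].

    low_pct and high_pct are assumed percentile limits, not values from the source.
    """
    f = np.asarray(f, dtype=np.float64)
    lo, hi = np.percentile(f, [low_pct, high_pct])
    if hi <= lo:
        # Flat image: there is no contrast to stretch, so return it unchanged.
        return np.clip(f, 0, 255).astype(np.uint8)
    # Map [lo, hi] linearly onto the full 8-bit range [0, 255].
    g = (f - lo) / (hi - lo) * 255.0
    return np.clip(np.rint(g), 0, 255).astype(np.uint8)
```

With low_pct=0 and high_pct=100 the mapping reduces to a plain min-max stretch; tighter percentiles make the stretch less sensitive to a few very bright reflections.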
Spatial smoothing filtering enhancement is performed on the gray image by using the neighborhood averaging method, which is a spatial domain method, thereby eliminating jagged contours caused by uneven light, local highlights, and metal reflection from a point light source in a real pipe environment. In the neighborhood averaging method every pixel has an equal weight, that is, every pixel is assumed to be equally important, and equation (2) is specifically as follows:
g(x,y)=(1/M)Σ(i,j)∈s f(i,j) (2)
where s is the set of pixel coordinates in a neighborhood of (x,y), (i,j) are the coordinates of a pixel in the neighborhood, and M is the number of pixels in the set s. A resulting preprocessed image is as shown in
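As a minimal sketch of the neighborhood averaging described for equation (2), the following assumes a square k-by-k neighborhood s (so that M = k*k) and replicated image borders. Neither choice is stated in the disclosure, and the function name neighborhood_average is hypothetical.

```python
import numpy as np


def neighborhood_average(f, k=3):
    """Mean filter: each output pixel is the equal-weight average of a k-by-k window."""
    if k % 2 == 0:
        raise ValueError("k must be odd so the window is centred on (x,y)")
    f = np.asarray(f, dtype=np.float64)
    r = k // 2
    # Replicate the border so every pixel has a full neighborhood of M = k*k pixels.
    padded = np.pad(f, r, mode="edge")
    acc = np.zeros_like(f)
    # Sum the k*k shifted copies of the image; every pixel in s has the same weight.
    for di in range(k):
        for dj in range(k):
            acc += padded[di:di + f.shape[0], dj:dj + f.shape[1]]
    return acc / (k * k)
```

A larger k suppresses more of the specular highlights but also blurs the thin edges of the ropes, so a small window is likely preferable before the later line detection.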
S1110: an iron chain is extracted from the center of the video data as a template for target recognition. After the image is preprocessed, an image of the iron chain is cropped from the center of the video frame, as shown in
S1120: target detection is performed on all data by using the SIFT corner detection algorithm, and the iron chain at the center is found and located.
SIFT is short for scale-invariant feature transform, which is a local feature descriptor used in the field of image processing. The descriptor is scale-invariant and allows key points in an image to be detected. SIFT has good stability and invariance: it adapts to rotation, scaling, and brightness changes, and to a certain extent resists interference from viewing-angle changes, affine transformation, and noise.
The SIFT algorithm uses a Gaussian kernel function for filtering when constructing a scale space, so that the original image preserves the most detail features, and the detail features are gradually reduced after Gaussian filtering to simulate feature representation at a large scale. L(x,y,σ) is defined as the convolution of the original image I(x,y) with a scale-variable two-dimensional Gaussian function G(x,y,σ):
L(x,y,σ)=G(x,y,σ)*I(x,y) (3)
G(x,y,σ)=(1/(2πσ²))exp(−((x−m/2)²+(y−n/2)²)/(2σ²)) (4)
In equations (3) and (4), (x,y) represents a pixel position in the image, and m, n represent the dimensions of the Gaussian template; σ represents the scale space factor, and the smaller its value, the less the image is smoothed and the smaller the corresponding scale; a large scale corresponds to overview features of the image, while a small scale corresponds to detail features of the image; and * represents the convolution operation.
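The construction of the scale space in equations (3) and (4) may be sketched as follows. This is an illustration under stated assumptions rather than the SIFT implementation itself: the template size rule (about three σ on each side), the replicated borders, and the function names are all assumptions, and the kernel is normalized by its discrete sum instead of the analytic constant.

```python
import numpy as np


def gaussian_kernel(sigma, size=None):
    """Sampled 2-D Gaussian G(x,y,sigma) on a size-by-size template."""
    if size is None:
        size = 2 * int(np.ceil(3 * sigma)) + 1  # assumed rule of thumb
    if size % 2 == 0:
        raise ValueError("size must be odd")
    half = size // 2
    # Offsets are measured from the template centre.
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    g = np.exp(-(x ** 2 + y ** 2) / (2.0 * sigma ** 2))
    # Normalise by the discrete sum so that smoothing preserves mean brightness.
    return g / g.sum()


def convolve2d(image, kernel):
    """Direct 2-D convolution with replicated borders (same-size output)."""
    image = np.asarray(image, dtype=np.float64)
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(image, ((ph, ph), (pw, pw)), mode="edge")
    flipped = kernel[::-1, ::-1]  # convolution flips the kernel
    out = np.zeros_like(image)
    for i in range(kh):
        for j in range(kw):
            out += flipped[i, j] * padded[i:i + image.shape[0], j:j + image.shape[1]]
    return out


def scale_space(image, sigmas):
    """One smoothed image L(x,y,sigma) per scale factor sigma."""
    return [convolve2d(image, gaussian_kernel(s)) for s in sigmas]
```

Calling scale_space(image, [1.0, 2.0, 4.0]) would return three progressively smoother copies of the image, the later ones retaining only overview features.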
Extreme points are found based on scale invariance, and a reference direction is then assigned to each key point based on local features of the image, so that the descriptor is invariant to rotation of the image. For each key point detected in the difference of Gaussian (DoG) pyramid, the gradient and direction distribution features of the pixels in a neighborhood window of the Gaussian pyramid image in which the key point is located are collected. The magnitude m(x,y) and the direction θ(x,y) of the gradient are as shown in equations (5) and (6):
m(x,y)=√[(L(x+1,y)−L(x−1,y))²+(L(x,y+1)−L(x,y−1))²] (5)
θ(x,y)=arctan[(L(x,y+1)−L(x,y−1))/(L(x+1,y)−L(x−1,y))] (6)
This algorithm uses a gradient histogram to count the image pixels in an area centered on a key point and thereby determine the direction of the key point. After the gradient calculation for the key points is complete, a histogram is built of the gradient magnitudes and directions of the pixels in the neighborhood. The peak of the direction histogram represents the dominant direction of the neighborhood gradient at the feature point, and this maximum is taken as the main direction of the key point. To enhance the robustness of matching, any other direction whose peak is greater than 80% of the peak of the main direction is kept as an auxiliary direction of the key point.
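The direction assignment described above may be sketched as follows. The sketch computes the gradient magnitude and direction by central differences and accumulates a magnitude-weighted direction histogram around one key point. The 36-bin histogram, the window radius, and the function names are assumptions for illustration; the Gaussian weighting of the window and the peak interpolation used in full SIFT implementations are omitted.

```python
import numpy as np


def gradient_magnitude_direction(L):
    """Central-difference gradient; rows index y, columns index x."""
    L = np.asarray(L, dtype=np.float64)
    dx = np.zeros_like(L)
    dy = np.zeros_like(L)
    dx[:, 1:-1] = L[:, 2:] - L[:, :-2]   # L(x+1,y) - L(x-1,y)
    dy[1:-1, :] = L[2:, :] - L[:-2, :]   # L(x,y+1) - L(x,y-1)
    mag = np.hypot(dx, dy)
    direction = np.degrees(np.arctan2(dy, dx)) % 360.0
    return mag, direction


def keypoint_directions(L, row, col, radius=4, bins=36, aux_ratio=0.8):
    """Return (main direction, auxiliary directions) in degrees for one key point."""
    L = np.asarray(L, dtype=np.float64)
    mag, direction = gradient_magnitude_direction(L)
    r0, r1 = max(row - radius, 0), min(row + radius + 1, L.shape[0])
    c0, c1 = max(col - radius, 0), min(col + radius + 1, L.shape[1])
    m = mag[r0:r1, c0:c1].ravel()
    d = direction[r0:r1, c0:c1].ravel()
    bin_width = 360.0 / bins
    idx = (d // bin_width).astype(int) % bins
    # Each pixel votes for its direction bin with a weight equal to its magnitude.
    hist = np.bincount(idx, weights=m, minlength=bins)
    peak = hist.max()
    if peak == 0:
        return None, []  # flat neighborhood: no direction can be assigned
    main_bin = int(hist.argmax())
    # Keep every other bin whose count exceeds 80% of the main peak.
    aux_bins = [b for b in range(bins) if b != main_bin and hist[b] > aux_ratio * peak]
    centre = lambda b: (b + 0.5) * bin_width
    return centre(main_bin), [centre(b) for b in aux_bins]
```

The returned angles are bin centres, so their resolution is limited to the bin width (10 degrees under the assumed 36 bins).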
The SIFT corner detection is used to perform target detection on the data, and the target detection result is as shown in
S1130: positions of the ropes are detected by using Hough transform after the position of the center of the data is obtained. The main principle is as follows: all straight lines f(x)=kx+b (k representing the slope and b representing the y-intercept) that may pass through each edge pixel point (x0,y0) are mapped into a Hough space, and then appropriate positions are selected. As a straight line perpendicular to the x-axis has no slope, it cannot be expressed in slope-intercept form; a straight line is therefore expressed by the parametric equation r=x*cos(θ)+y*sin(θ), where (x,y) represents a pixel point at an edge, r represents the distance between the origin and a straight line passing through this point, and θ represents the included angle between r and the positive x-axis. Voting is performed in the Hough space after each edge point is mapped: 1 is added to the value of the cell (r,θ) every time the straight line with those parameters passes through an edge point (x,y), so that cells with many votes correspond to straight lines in the image.
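The voting procedure may be sketched as follows for a binary edge map. The one-degree angular step, the rounding of r to the nearest integer, and the function names are assumptions for illustration; the disclosure does not state the quantization it uses, and edge extraction is assumed to have been done beforehand.

```python
import numpy as np


def hough_accumulator(edges, theta_step_deg=1.0):
    """Vote in (r, theta) space for every edge pixel of a binary edge map."""
    edges = np.asarray(edges)
    h, w = edges.shape
    thetas = np.deg2rad(np.arange(0.0, 180.0, theta_step_deg))
    r_max = int(np.ceil(np.hypot(h, w)))
    # Rows cover r in [-r_max, r_max]; columns cover the sampled angles.
    acc = np.zeros((2 * r_max + 1, thetas.size), dtype=np.int64)
    ys, xs = np.nonzero(edges)
    cols = np.arange(thetas.size)
    for x, y in zip(xs, ys):
        # r = x*cos(theta) + y*sin(theta) for every sampled theta at once.
        r = np.rint(x * np.cos(thetas) + y * np.sin(thetas)).astype(int)
        acc[r + r_max, cols] += 1
    return acc, thetas, r_max


def strongest_line(acc, thetas, r_max):
    """Return (r, theta in degrees, votes) of the accumulator cell with most votes."""
    r_idx, t_idx = np.unravel_index(int(acc.argmax()), acc.shape)
    return int(r_idx) - r_max, float(np.degrees(thetas[t_idx])), int(acc[r_idx, t_idx])
```

For the two ropes, one would likely take the strongest cell on each side of the located iron chain rather than only the single global maximum.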
After the ropes are detected by using the Hough transform, the resulting image is as shown in
S1140: the positions of the central iron chain and the ropes on two sides are obtained after the above two steps. The positions of the iron chain and the ropes are then covered with pixels having a gray value of 0, thereby getting ready for restoration. A resulting image is as shown in
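The covering step may be sketched as follows, assuming that each rope is available as an (r, θ) pair in the parametric form r=x*cos(θ)+y*sin(θ) and that the iron chain is available as a bounding box. The half-width of the covered strip and all function names are hypothetical choices for illustration.

```python
import numpy as np


def line_mask(shape, r, theta_deg, half_width=2.0):
    """True for pixels within half_width of the line r = x*cos(theta) + y*sin(theta)."""
    h, w = shape
    y, x = np.mgrid[0:h, 0:w]
    theta = np.deg2rad(theta_deg)
    return np.abs(x * np.cos(theta) + y * np.sin(theta) - r) <= half_width


def box_mask(shape, top, left, height, width):
    """True inside the bounding box of the located iron chain."""
    mask = np.zeros(shape, dtype=bool)
    mask[top:top + height, left:left + width] = True
    return mask


def cover_with_zeros(gray, masks):
    """Set every masked pixel to gray value 0 and return the combined mask."""
    mask = np.zeros(gray.shape, dtype=bool)
    for m in masks:
        mask |= m
    covered = gray.copy()
    covered[mask] = 0
    return covered, mask
```

The combined mask is returned alongside the covered image because the later restoration needs to know which zero-valued pixels are damaged and which are genuinely dark.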
S1150: the data is restored by Telea's FMM (fast marching method) image restoration algorithm.
The fast marching restoration algorithm is a fast, time-efficient image restoration method. The basic idea of this algorithm is to start restoration from the edge pixels of a damaged area, gradually march to the pixels within the damaged area, and finally complete the whole restoration process. Several parameters are defined first: Ω is defined as the damaged area of the image, and ∂Ω is defined as the boundary where the damaged area is in contact with the undamaged area. The essence of fast marching is to obtain the distances T between all pixel points in the area Ω and the boundary ∂Ω; the marching sequence is determined according to the magnitude of T, and restoration continues until all pixels within Ω are processed. The basic principle of the FMM algorithm is as shown in
For a damaged point p on ∂Ω, a neighborhood Bε(p) with width ε is created on the outside of the boundary, and the gray value of pixel point p is calculated from the gray values of all known pixel points q in this neighborhood according to the following equation:
Rq(p)=R(q)+∇R(q)(p−q) (7)
where R(q) and ∇R(q) represent the gray value and the gradient value of the known pixel point q, respectively. The gray value of point p must be calculated by substituting the parameters of all undamaged points in the area Bε(p). These undamaged pixel points carry different weights in the calculation, and the weighted result is obtained by using equation (8):
R(p)=Σq∈Bε(p) w(p,q)[R(q)+∇R(q)(p−q)] / Σq∈Bε(p) w(p,q) (8)
where w(p,q) represents a weight function that determines the contribution of each pixel in the domain Bε(p). w(p,q) depends on the iso-illuminance direction at the damaged point p and on the geometric distance between the two points. In this way the local image structure is, to a certain extent, extended into the damaged area when the parameters of the damaged point p are updated and calculated. The function is defined as equation (9):
w(p,q)=dir(p,q)*dst(p,q)*lev(p,q) (9)
where dir(p,q) represents a texture direction constraint, dst(p,q) represents a geometric distance constraint, and lev(p,q) represents a level set constraint. dir(p,q) reflects the correlation between point p and point q in the texture direction: the closer the two points are in texture direction, the greater the weight. dst(p,q) reflects the geometric distance between point p and point q: the smaller the distance, the greater the weight. lev(p,q) reflects the influence of information arrival: the closer a point is to known information, the greater the weight.
The three constraint conditions are as shown in equations (10):
dir(p,q)=((p−q)/‖p−q‖)·N(p)
dst(p,q)=d0²/‖p−q‖²
lev(p,q)=T0/(1+|T(p)−T(q)|) (10)
where d0 and T0, the distance constraint parameter and the level set constraint parameter, are generally set to 1. dir(p,q) ensures a greater contribution from known pixel points closer to the normal direction N=∇T, where N(p) represents the texture direction at point p. dst(p,q) ensures a greater weight for known points closer to the damaged point p when its gray value is updated and calculated. lev(p,q) ensures a greater contribution from known points whose distance to the boundary ∂Ω is closer to that of point p.
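The estimate of a single damaged pixel according to equations (7) to (9) may be sketched as follows. The component formulas for dir, dst, and lev follow Telea's published definitions and are assumed here to match equations (10). The function names, the (row, col) point convention, and the circular neighborhood are assumptions for illustration, and the sketch presumes that the known pixels lie on the side from which N(p) points away, as on a marching front.

```python
import numpy as np


def fmm_weight(p, q, normal_p, T, d0=1.0, T0=1.0):
    """w(p,q) = dir * dst * lev for points given as (row, col) index pairs."""
    diff = np.array(p, dtype=float) - np.array(q, dtype=float)
    dist = np.linalg.norm(diff)
    direction = np.dot(diff / dist, normal_p)       # dir(p,q): alignment with N(p)
    distance = d0 ** 2 / dist ** 2                   # dst(p,q): nearer points count more
    level = T0 / (1.0 + abs(T[p] - T[q]))            # lev(p,q): similar T counts more
    return direction * distance * level


def estimate_pixel(image, grad_rows, grad_cols, known, p, normal_p, T, eps=3):
    """Weighted first-order estimate of R(p) from known pixels within radius eps."""
    num = den = 0.0
    h, w = image.shape
    for i in range(max(p[0] - eps, 0), min(p[0] + eps + 1, h)):
        for j in range(max(p[1] - eps, 0), min(p[1] + eps + 1, w)):
            q = (i, j)
            if q == p or not known[q]:
                continue
            if (i - p[0]) ** 2 + (j - p[1]) ** 2 > eps ** 2:
                continue
            wgt = fmm_weight(p, q, normal_p, T)
            # Equation (7): extrapolate from q to p using the gradient at q.
            r_q = image[q] + grad_rows[q] * (p[0] - i) + grad_cols[q] * (p[1] - j)
            num += wgt * r_q
            den += wgt
    if den <= 0:
        raise ValueError("no usable known pixels in the neighborhood of p")
    return num / den
```

Because equation (7) is a first-order extrapolation, the estimate is exact wherever the known gray values vary linearly, whatever the individual weights are.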
The direction of the iso-illuminance curve in the FMM algorithm is updated according to the calculated domain T. To ensure that the restoration starts from the initial boundary M and to eliminate interference from the large number of irrelevant internal pixels far away from the boundary, the distance domain T needs to be calculated on both sides of the initial boundary M. As described above, the gray value of pixel point p is calculated from the known pixels in the domain Bε(p). A set −Tout of points outside the boundary is therefore calculated outside the boundary area M within the restricted range T≤ε, and similarly a set Tin of internal points relative to the boundary is calculated inside the boundary area ∂Ω. This defines the whole T domain and guarantees that the restoration calculation of FMM is performed on a narrow band with a width of ε on the outside of the boundary ∂Ω. The T domain of the whole image is then defined as:
T(p)=Tin(p) if p∈Ω; T(p)=−Tout(p) if p∉Ω
For the value of ε of the selected domain Bε(p), 3-10 pixel points are usually selected for a good effect, thereby achieving balance between a restoration rate and a restoration effect.
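The boundary-inward marching order and the role of ε may be illustrated with the following deliberately simplified sketch. It is not Telea's full algorithm: T is obtained by brute-force Euclidean distance rather than by fast marching, only an inverse-square distance weight is used, and the gradient term of equation (7) is dropped. The function name march_and_fill is hypothetical.

```python
import numpy as np


def march_and_fill(image, mask, eps=3):
    """Fill masked pixels in order of increasing distance T from the known area."""
    out = np.asarray(image, dtype=np.float64).copy()
    known = ~np.asarray(mask, dtype=bool)
    if not known.any():
        raise ValueError("mask covers the whole image")
    kr, kc = np.nonzero(known)
    damaged = np.argwhere(~known)
    # T: distance from each damaged pixel to the nearest originally known pixel
    # (brute force here; the fast marching method computes this far more efficiently).
    T = [np.sqrt(((kr - r) ** 2 + (kc - c) ** 2).min()) for r, c in damaged]
    h, w = out.shape
    for k in np.argsort(T, kind="stable"):
        r, c = damaged[k]
        r0, r1 = max(r - eps, 0), min(r + eps + 1, h)
        c0, c1 = max(c - eps, 0), min(c + eps + 1, w)
        rr, cc = np.mgrid[r0:r1, c0:c1]
        d2 = (rr - r) ** 2 + (cc - c) ** 2
        # Use only pixels that are already known and lie within radius eps.
        use = known[r0:r1, c0:c1] & (d2 <= eps ** 2)
        if not use.any():
            continue
        wgt = 1.0 / d2[use]
        out[r, c] = np.sum(wgt * out[r0:r1, c0:c1][use]) / wgt.sum()
        known[r, c] = True  # the filled pixel now feeds pixels deeper inside
    return out
```

A larger eps averages over more known pixels and gives smoother but slower fills, which reflects the trade-off between restoration rate and restoration effect. In practice a library routine such as OpenCV's cv2.inpaint with the cv2.INPAINT_TELEA flag would likely be preferred over a hand-written loop.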
After the image is restored by using the FMM algorithm, the resulting restored image is as shown in
The specific embodiments described herein are merely intended to illustrate the spirit of the present disclosure by way of example. A person skilled in the art can make various modifications or supplements to the specific embodiments described, or replace them in a similar manner, without departing from the spirit of the present disclosure or the scope defined by the appended claims.
Number | Date | Country | Kind |
---|---|---|---|
202110268931.X | Mar 2021 | CN | national |