Method for measuring antenna downtilt angle based on multi-scale deep semantic segmentation network

Description

CROSS REFERENCE TO RELATED APPLICATIONS

This application is a national stage application under 35 U.S.C. 371 of PCT Application No. PCT/CN2019/076718, filed on 1 Mar. 2019, which PCT application claimed the benefit of Chinese Patent Application No. 2018113384154, filed on 9 Nov. 2018, the entire disclosure of each of which is hereby incorporated herein by reference.

TECHNICAL FIELD

The present disclosure relates to the field of mobile communication, and in particular, to a method for measuring an antenna downtilt angle based on a multi-scale deep semantic segmentation network.

BACKGROUND

Nowadays, in the era of network information, the quality of mobile communication networks is extremely important. In GSM-R construction and planning, as shown in FIG. 1, the azimuth angle and the downtilt angle of an antenna affect the coverage of signals and the interference between signals, so these angles need to be measured in a timely manner and adjusted precisely to improve the quality of network signals.

There are two traditional methods for measuring the antenna downtilt angle: the first is to climb the antenna base station manually and measure with an instrument (a compass, a slope meter, or the like); the second is to install an angle sensor on the antenna to return data. The antenna is susceptible to wind, snow and other factors that change the downtilt angle, so it needs to be measured regularly. For the first method, because the base station is high and the number of antennas is large, the safety hazard to personnel and the workload are large, and the practicability is low. For the second method, the installation time is long and the antenna models differ, so the installation cost of the instruments is high and the practicability is also low. Both methods consume a lot of manpower and material resources and are not suitable for today's large-scale measurement.

SUMMARY

To solve the above problems, the present disclosure aims at providing a method for measuring an antenna downtilt angle based on a multi-scale deep semantic segmentation network. The method for measuring a downtilt angle of a mobile base station antenna by calling a target detection algorithm and a semantic segmentation algorithm and using an unmanned aerial vehicle as a carrier is highly applicable, cost-effective, and safe.

The technical scheme adopted by the present disclosure to solve the problems is as follows:

An antenna downtilt angle measuring method based on a multi-scale deep semantic segmentation network, including:

collecting image data: base station antenna data is collected by using an unmanned aerial vehicle and antenna images collected are taken as a data set;

predicting a target bounding box: a target antenna in the data set is positioned, and a bounding box is predicted by logistic regression;

performing target recognition and semantic segmentation: target features of the target antenna in the data set are extracted, the target features are learned and processed by an activation function, a target image is output for semantic image segmentation, and pixel points of the target image and the background are classified; and

calculating an antenna downtilt angle: the width and height of an antenna box are obtained according to a border of the target image to calculate the antenna downtilt angle.

Further, the collecting image data includes:

locating the unmanned aerial vehicle on the top of a pole of a base station antenna, and recording the longitude and latitude (L0, W0) of the pole in the vertical direction; causing the unmanned aerial vehicle to fly around a point of the base station antenna, setting a flight radius of the unmanned aerial vehicle, and the unmanned aerial vehicle moving around the pole along the radius on the same horizontal plane to acquire antenna images with different attitudes and angles of a mobile base station antenna as a data set.
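As a non-limiting illustration of the circular flight described above, the following Python sketch generates evenly spaced waypoints on a circle of a set radius around the recorded pole position (L0, W0). The function name `circular_waypoints`, the number of waypoints, the spherical Earth radius, and the small-offset conversion from metres to degrees are all assumptions made for this sketch and are not part of the disclosure.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # assumed mean Earth radius (spherical approximation)


def circular_waypoints(lon0_deg, lat0_deg, radius_m, n_points):
    """Return n_points (longitude, latitude) pairs on a horizontal circle
    of radius_m metres centred on the pole position (lon0_deg, lat0_deg)."""
    lat0 = math.radians(lat0_deg)
    waypoints = []
    for k in range(n_points):
        bearing = 2.0 * math.pi * k / n_points  # 0 = north, increasing clockwise
        d_north = radius_m * math.cos(bearing)
        d_east = radius_m * math.sin(bearing)
        # Small-offset approximation: adequate for radii of tens of metres.
        d_lat = d_north / EARTH_RADIUS_M
        d_lon = d_east / (EARTH_RADIUS_M * math.cos(lat0))
        waypoints.append((lon0_deg + math.degrees(d_lon),
                          lat0_deg + math.degrees(d_lat)))
    return waypoints
```

In such a sketch the altitude would be held constant so that all images are taken on the same horizontal plane; how the images are triggered at each waypoint is left open.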

Further, the predicting a target bounding box includes:

positioning a target antenna in the antenna image and predicting a bounding box by logistic regression: the entire antenna image is first divided into N*N grids; after the antenna image is input, the entire antenna image is predicted by scanning one grid at a time, and prediction of the target antenna starts when the grid containing the center of the target antenna is located, wherein the 4 coordinate values predicted for each bounding box are t_x, t_y, t_w, and t_h, respectively, the upper-left offset of each target cell is (c_x, c_y), the width and height of the prior bounding boxes are p_w and p_h respectively, and the network predicts their values as:

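$\begin{matrix} b_{x} = σ (t_{x}) + c_{x} & (1) \end{matrix}$
$\begin{matrix} b_{y} = σ (t_{y}) + c_{y} & (2) \end{matrix}$
$\begin{matrix} b_{w} = p_{w} e^{t_{w}} & (3) \end{matrix}$
$\begin{matrix} b_{h} = p_{h} e^{t_{h}} & (4) \end{matrix}$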

where σ(·) denotes the activation function, which can be expressed as:

$σ (x) = \frac{1}{1 + e^{- x}}$

where p_w and p_h denote the width and height of the prior bounding boxes respectively, and e denotes the natural constant, which is approximately equal to 2.71828;

where b_x, b_y, b_w, and b_h can be calculated according to the above formulas, wherein b_w and b_h denote the width and the height of the bounding boxes respectively,

where the input antenna image is divided into N*N grids, each grid includes five predictors (x, y, w, h, confidence) for each bounding box and C class values, and the output of the network is of a size of S*S*(5*B+C), where S=N is the number of grids per side; B is the number of the bounding boxes in each grid, C is the number of classes, which is 1 because the only class in the present disclosure is the antenna, and confidence represents that the predicted grid includes two pieces of information, i.e., confidence of the target antenna and prediction accuracy of the bounding box:

$\begin{matrix} confidence = \Pr (object) * IOU_{pred}^{truth} & (5) \end{matrix}$

where IOU_pred^truth denotes the Intersection over Union between the bounding boxes and the prior bounding boxes, and where a threshold is set to 0.5; when Pr(Object)=1, the target antenna falls in the center of the grid, that is, the bounding box currently predicted coincides with the actual (ground-truth) box object better than before; if the predicted bounding box is not the best currently, the bounding box is not predicted when the value is smaller than the threshold of 0.5, and it is determined that the target antenna does not fall into the grid.
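As a non-limiting illustration, formulas (1) to (4) can be sketched in Python as follows. The function names `sigmoid` and `decode_box` are hypothetical, and the sketch assumes that all quantities are expressed in grid-cell units, which the text does not state explicitly.

```python
import math


def sigmoid(x):
    """Activation function sigma(x) = 1 / (1 + e^(-x))."""
    return 1.0 / (1.0 + math.exp(-x))


def decode_box(t_x, t_y, t_w, t_h, c_x, c_y, p_w, p_h):
    """Map raw network outputs (t_x, t_y, t_w, t_h) to a bounding box.

    (c_x, c_y) is the upper-left offset of the grid cell and
    (p_w, p_h) is the width and height of the prior bounding box.
    """
    b_x = sigmoid(t_x) + c_x      # formula (1)
    b_y = sigmoid(t_y) + c_y      # formula (2)
    b_w = p_w * math.exp(t_w)     # formula (3)
    b_h = p_h * math.exp(t_h)     # formula (4)
    return b_x, b_y, b_w, b_h
```

Because the sigmoid output lies between 0 and 1, the predicted center stays inside the grid cell whose offset is (c_x, c_y), while the exponential keeps the predicted width and height positive.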

Further, the performing target recognition and semantic segmentation includes:

performing target recognition by using a network convolutional layer for feature extraction: an antenna image of 416*416 pixels with 3 channels is input; there are 32 convolution kernels, each kernel has a size of 3*3, and the 32 convolution kernels are used to learn 32 feature maps; to account for color differences of the target antenna, features of the target antenna are learned by using different convolution kernels; convolutional layer up-sampling is performed during feature extraction, and a prediction formula for object classes is as follows:

$\begin{matrix} \Pr (Class_{i} | object) * \Pr (object) * IOU_{pred}^{truth} = \Pr (Class_{i}) * IOU_{pred}^{truth} & (6) \end{matrix}$

where Pr(Class_i|object) is an object class probability;

then applying the activation function by logistic regression:

$\begin{matrix} f (x) = \frac{1}{1 + e^{- x}} & (7) \end{matrix}$

a predicted target output range is made between 0 and 1, the antenna image is processed by the activation function after feature extraction, and when the output value is greater than 0.5, the object is determined as an antenna;

then performing semantic image segmentation on the antenna image by using a deep convolutional network, and classifying the pixel points of the target image and the background:

after the target image is input, it first goes through feature extraction by a dilated convolutional network; and after a feature image is input, dilated convolution is calculated:

$\begin{matrix} y [i] = \sum_{k} x [i + r \cdot k] w [k] & (8) \end{matrix}$

for a two-dimensional signal, y[i] is the output corresponding to each position i, x is the input signal, w is a filter, and the dilation rate r is the step size for sampling the input signal;

after the input image is processed by the convolutional network for output, pixel points of the output target image are classified by a fully connected conditional random field, and the classification is mainly performed for the target image and the background boundary.

Further, the calculating an antenna downtilt angle includes:

obtaining the width x and the height y of the antenna box according to the border of the target image, and calculating a downtilt angle of the base station antenna according to a geometric relation, the downtilt angle of the base station antenna being an angle θ between the base station antenna and a vertical plane:

$\begin{matrix} θ = \arctan \frac{x}{y} . & (12) \end{matrix}$
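As a non-limiting illustration of formula (12), the following Python sketch computes the downtilt angle from the width x and height y of the antenna box. The function name `downtilt_angle_deg`, the conversion to degrees, and the input validation are assumptions added for this sketch.

```python
import math


def downtilt_angle_deg(box_width, box_height):
    """Return theta = arctan(x / y) in degrees for an antenna box of
    width x = box_width and height y = box_height (same length unit)."""
    if box_width < 0 or box_height <= 0:
        raise ValueError("box_width must be >= 0 and box_height must be > 0")
    # For box_height > 0, atan2(x, y) equals arctan(x / y).
    return math.degrees(math.atan2(box_width, box_height))
```

For example, a box that is 20 pixels wide and 200 pixels high gives arctan(0.1), which is about 5.71 degrees. Only the ratio of width to height matters, so the result does not depend on the pixel scale of the image.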

The present disclosure has the following beneficial effects: the present disclosure adopts an antenna downtilt angle measuring method based on a multi-scale deep semantic segmentation network. The method for measuring a downtilt angle of a mobile base station antenna by calling a target detection algorithm and a semantic segmentation algorithm and using an unmanned aerial vehicle as a carrier is highly applicable, cost-effective, and safe.

BRIEF DESCRIPTION OF THE DRAWINGS

The present disclosure is further described below with reference to the accompanying drawings and examples.

FIG. 1 is a schematic diagram of a downtilt angle of a base station antenna;

FIG. 2 is a flowchart of an antenna downtilt angle measuring method based on a multi-scale deep semantic segmentation network according to an embodiment of the present disclosure;

FIG. 3 is a schematic diagram of border prediction for an antenna downtilt angle measuring method based on a multi-scale deep semantic segmentation network according to an embodiment of the present disclosure;

FIG. 4 is a schematic diagram of a network structure for an antenna downtilt angle measuring method based on a multi-scale deep semantic segmentation network according to an embodiment of the present disclosure;

FIG. 5 is a schematic diagram of a bottleneck block for an antenna downtilt angle measuring method based on a multi-scale deep semantic segmentation network according to an embodiment of the present disclosure;

FIG. 6 is a schematic diagram of standard convolution for an antenna downtilt angle measuring method based on a multi-scale deep semantic segmentation network according to an embodiment of the present disclosure;

FIG. 7 is a schematic diagram of high-resolution feature extraction for an antenna downtilt angle measuring method based on a multi-scale deep semantic segmentation network according to an embodiment of the present disclosure;

FIG. 8 is a schematic diagram of one-dimensional low-resolution feature extraction for an antenna downtilt angle measuring method based on a multi-scale deep semantic segmentation network according to an embodiment of the present disclosure;

FIG. 9 is a schematic diagram of dilated convolution for an antenna downtilt angle measuring method based on a multi-scale deep semantic segmentation network according to an embodiment of the present disclosure; and

FIG. 10 is a view of a random field for an antenna downtilt angle measuring method based on a multi-scale deep semantic segmentation network according to an embodiment of the present disclosure.

DETAILED DESCRIPTION

Referring to FIG. 2, an antenna downtilt angle measuring method based on a multi-scale deep semantic segmentation network is provided in an embodiment of the present disclosure, including:

collecting image data: base station antenna data is collected by using an unmanned aerial vehicle and antenna images collected are taken as a data set;

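predicting a target bounding box: a target antenna in the data set is positioned, and a bounding box is predicted by logistic regression;

performing target recognition and semantic segmentation: target features of the target antenna in the data set are extracted, the target features are learned and processed by an activation function, a target image is output for semantic image segmentation, and pixel points of the target image and the background are classified; and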

calculating an antenna downtilt angle: the width and height of an antenna box are obtained according to a border of the target image to calculate the antenna downtilt angle.

In the embodiment, the method for measuring a downtilt angle of a mobile base station antenna by calling a target detection algorithm and a semantic segmentation algorithm and using an unmanned aerial vehicle as a carrier is highly applicable, cost-effective, and safe.

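Further, the step of collecting image data includes:

locating the unmanned aerial vehicle on the top of a pole of a base station antenna, and recording the longitude and latitude (L0, W0) of the pole in the vertical direction; causing the unmanned aerial vehicle to fly around a point of the base station antenna, setting a flight radius of the unmanned aerial vehicle, and the unmanned aerial vehicle moving around the pole along the radius on the same horizontal plane to acquire antenna images with different attitudes and angles of a mobile base station antenna as a data set.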

Further, the step of predicting a target bounding box includes:

positioning a target antenna in the antenna image and predicting a bounding box by logistic regression: the entire antenna image is first divided into N*N grids; after the antenna image is input, the entire antenna image is predicted by scanning one grid at a time, and prediction of the target antenna starts when the grid containing the center of the target antenna is located, wherein the 4 coordinate values predicted for each bounding box are t_x, t_y, t_w, and t_h, respectively, the upper-left offset of each target cell is (c_x, c_y), the width and height of the prior bounding boxes are p_w and p_h respectively, box prediction is as shown in FIG. 3, and the network predicts their values as:

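$\begin{matrix} b_{x} = σ (t_{x}) + c_{x} & (1) \end{matrix}$
$\begin{matrix} b_{y} = σ (t_{y}) + c_{y} & (2) \end{matrix}$
$\begin{matrix} b_{w} = p_{w} e^{t_{w}} & (3) \end{matrix}$
$\begin{matrix} b_{h} = p_{h} e^{t_{h}} & (4) \end{matrix}$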

where σ(·) denotes the activation function, which can be expressed as:

$σ (x) = \frac{1}{1 + e^{- x}}$

where p_w and p_h denote the width and height of the prior bounding boxes respectively, and e denotes the natural constant, which is approximately equal to 2.71828; where b_x, b_y, b_w, and b_h can be calculated according to the above formulas, wherein b_w and b_h denote the width and the height of the bounding boxes respectively,

where the input antenna image is divided into N*N grids, each grid includes 5 predictors (x, y, w, h, confidence) for each bounding box and C class values, and the output of the network is of a size of S*S*(5*B+C), where S=N is the number of grids per side; B is the number of the bounding boxes in each grid, C is the number of classes, which is 1 because the only class in the present disclosure is the antenna, and confidence represents that the predicted grid includes two pieces of information, i.e., confidence of the target antenna and prediction accuracy of the bounding box:

$\begin{matrix} confidence = \Pr (object) * IOU_{pred}^{truth} & (5) \end{matrix}$
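As a non-limiting illustration of formula (5), the following Python sketch computes the Intersection over Union of two axis-aligned boxes and the resulting confidence. The corner format (x_min, y_min, x_max, y_max), the function names `iou` and `confidence`, and the reading of the superscript "truth" as a labelled reference box are assumptions made for this sketch.

```python
def iou(box_a, box_b):
    """Intersection over Union of two boxes given as
    (x_min, y_min, x_max, y_max)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def confidence(pr_object, predicted_box, reference_box):
    """confidence = Pr(object) * IOU, as in formula (5)."""
    return pr_object * iou(predicted_box, reference_box)
```

With the threshold of 0.5 mentioned in the summary, a predicted box would be kept in this sketch only when `confidence(...) >= 0.5`.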

To improve the detection accuracy of the target, multi-scale prediction is used. There is no need to fix the size of an input image, so different step sizes (strides) can be used to detect feature maps of different sizes. Three different detection layers are used to detect the target antenna in the antenna image, and the different detection layers are realized by controlling the step size. The first detection layer is down-sampled with a step size of 32 to reduce the feature dimension. In order to connect with an earlier feature map of the same size, the layer is up-sampled, and a higher resolution can be obtained at this point. The second detection layer uses a step size of 16, and the remaining feature processing is consistent with that of the first layer. The step size is set to 8 in the third layer, feature prediction is performed thereon, and in this way the detection accuracy of the target antenna is improved.
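As a non-limiting illustration of the three detection scales, the following Python sketch computes the side length of the feature-map grid at each step size. The function name `detection_grid_sizes` is hypothetical, and the default input size of 416 is taken from the 416*416 input mentioned in the summary; other input sizes divisible by 32 would work in the same way.

```python
def detection_grid_sizes(input_size=416, strides=(32, 16, 8)):
    """Return the grid side length produced by each down-sampling stride."""
    sizes = []
    for stride in strides:
        if input_size % stride != 0:
            raise ValueError("input_size must be divisible by every stride")
        sizes.append(input_size // stride)
    return sizes
```

For a 416*416 input this gives grids of 13*13, 26*26 and 52*52 cells, so the layer with a step size of 8 has the finest grid and is best suited to small targets.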

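Further, the step of performing target recognition and semantic segmentation includes:

performing target recognition by using a network convolutional layer for feature extraction: an antenna image of 416*416 pixels with 3 channels is input; there are 32 convolution kernels, each kernel has a size of 3*3, and the 32 convolution kernels are used to learn 32 feature maps; to account for color differences of the target antenna, features of the target antenna are learned by using different convolution kernels; convolutional layer up-sampling is performed during feature extraction, and a prediction formula for object classes is as follows:

$\begin{matrix} \Pr (Class_{i} | object) * \Pr (object) * IOU_{pred}^{truth} = \Pr (Class_{i}) * IOU_{pred}^{truth} & (6) \end{matrix}$

wherein Pr(Class_i|object) is an object class probability;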

then applying the activation function by logistic regression:

$\begin{matrix} f (x) = \frac{1}{1 + e^{- x}} & (7) \end{matrix}$

In the network layer structure, there are 53 convolutional layers and 22 residual layers among layers 0-74; layers 75-105 are feature interaction layers of a convolutional neural network, which can be divided into three scales; local feature interaction is realized by means of convolution kernels, and the network structure is as shown in FIG. 4.

In the production of the data set, only the antenna is detected, so the number of classes is 1. Therefore, in the training, the output of the last convolutional layer is 3*(1+4+1)=18.
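As a non-limiting illustration of the arithmetic above, the following Python sketch computes the number of output channels of the last convolutional layer. The function name `output_channels` is hypothetical, and reading the factor 3 as the number of bounding boxes predicted per grid cell at each scale, with 4 coordinate values and 1 confidence value per box, is an assumption that the text does not spell out.

```python
def output_channels(boxes_per_cell, num_classes):
    """Channels = boxes * (4 coordinates + 1 confidence + class values)."""
    coordinates = 4   # t_x, t_y, t_w, t_h
    confidences = 1   # confidence of formula (5)
    return boxes_per_cell * (coordinates + confidences + num_classes)
```

With 3 boxes per cell and the single antenna class, `output_channels(3, 1)` evaluates to 18, matching the value stated above.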

Semantic Segmentation

Semantic image segmentation is performed on the antenna image by using a deep convolutional network, and the pixel points of the target image and the background are classified.

After the target image is input, it first goes through feature extraction by a dilated convolutional network. Since the measured boundary precision is not high enough, the pixel of the target image cannot be well separated from the background pixel, and the pixel classification of the image boundary can be improved by combining a fully connected conditional random field, so that the segmentation effect can be better.

The target image is first feature-extracted by using a dilated convolutional network. The feature extraction of the network convolutional layer can be divided into two cases: a low-resolution input image is feature-extracted by a standard convolutional layer, as shown in FIG. 6; dense features of a high-resolution input image are extracted by a dilated convolution at a rate of 2, as shown in FIG. 7, and its step size is set to 2 to reduce the feature dimension. In the convolutional network layer, the convolution kernel size is set to 3, the stride length is 1, and the step size is 1. FIG. 8 is a schematic diagram of one-dimensional low-resolution feature map extraction. FIG. 9 is a schematic diagram of dilated convolution.

In a network structure of a serial module and a spatial pyramid pooling layer module, the convolution with holes can effectively increase a receptive field of a filter and integrate multi-scale information. After a feature image is input, dilated convolution is calculated:

$\begin{matrix} y [i] = \sum_{k} x [i + r \cdot k] w [k] & (8) \end{matrix}$

For a two-dimensional signal, y[i] is the output corresponding to each position i, x is the input signal, w is a filter, and the dilation rate r is the step size for sampling the input signal. The convolution with holes enlarges the effective convolution kernel and thereby increases the receptive field of the filter. A residual module for multi-scale feature learning is used in the feature extraction network; the present disclosure uses the bottleneck block. In the bottleneck block, each convolution is followed by normalization and an activation function. Thus, contextual information is enriched. The bottleneck block is as shown in FIG. 5.
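As a non-limiting illustration of formula (8), the following Python sketch evaluates a one-dimensional dilated convolution and the effective receptive field of the filter. The function names `dilated_conv1d` and `receptive_field` are hypothetical, and restricting the output to positions where every filter tap falls inside the input (no padding) is a simplification made for this sketch.

```python
def dilated_conv1d(x, w, rate):
    """Compute y[i] = sum_k x[i + rate * k] * w[k] without padding."""
    span = rate * (len(w) - 1)   # distance between first and last tap
    n_out = len(x) - span
    if rate < 1 or n_out <= 0:
        raise ValueError("rate must be >= 1 and x must cover the dilated filter")
    return [sum(x[i + rate * k] * w[k] for k in range(len(w)))
            for i in range(n_out)]


def receptive_field(kernel_size, rate):
    """Number of input positions spanned by one dilated filter."""
    return rate * (kernel_size - 1) + 1
```

For x = [1, 2, 3, 4, 5, 6, 7] and w = [1, 0, -1], a rate of 1 gives [-2, -2, -2, -2, -2], while a rate of 2 gives [-4, -4, -4]: the same three weights now span five input positions instead of three, which is how the holes enlarge the receptive field without adding parameters.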

After the input image is processed by the convolutional network for output, pixel points of the output target image are classified by a fully connected conditional random field, and the classification is mainly performed for the target image and the background boundary.

A view of a random field is as shown in FIG. 10. Each circle represents a pixel point, x_i (white circle) is a labeled pixel point (node), a connection between two pixel points is an edge, y_i (black circle) is a reference value of x_i, and the classification of the labeled pixel points is determined by the reference value y_i. According to the Gibbs distribution function,

$\begin{matrix} P (Y = y | I) = \frac{1}{Z (I)} \exp (- E (y | I)) & (9) \end{matrix}$

where y is the reference value of x_i, and E(y|I) is an energy function:

$\begin{matrix} E (y | I) = \sum_{i} Ψ_{u} (y_{i}) + \sum_{i < j} Ψ_{p} (y_{i}, y_{j}) & (10) \end{matrix}$

The unary potential function Ψ_u is the image function output by the dilated convolutional network. The binary potential function is

$\begin{matrix} Ψ_{p} (y_{i}, y_{j}) = u (y_{i}, y_{j}) \sum_{m = 1}^{M} w^{(m)} k_{G}^{(m)} (f_{i}, f_{j}) & (11) \end{matrix}$

The binary potential function describes the relationship between pixels and assigns the same label to similar pixel points. The unary potential function extracts feature vectors of a node in different feature maps, and the binary potential function connects the nodes extracted by the unary potential function to learn their edges. All the nodes are connected to form a fully connected conditional random field, and the image finally output by the function is more accurate.
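As a non-limiting illustration of formulas (10) and (11), the following Python sketch evaluates the energy E(y|I) of one labeling of a small set of pixels. Several choices here are assumptions not stated in the disclosure: the unary potentials are supplied as a table of per-pixel label costs, the compatibility u(y_i, y_j) is 1 when the labels differ and 0 otherwise, and a single Gaussian kernel (M = 1) with a hypothetical `bandwidth` parameter is used. The function names are likewise hypothetical.

```python
import math


def gaussian_kernel(f_i, f_j, bandwidth):
    """k_G(f_i, f_j): close to 1 for similar feature vectors, close to 0 otherwise."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(f_i, f_j))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def crf_energy(labels, unary, features, weight=1.0, bandwidth=1.0):
    """E(y|I) = sum_i psi_u(y_i) + sum_{i<j} psi_p(y_i, y_j)."""
    n = len(labels)
    # Unary term: cost of giving pixel i the label labels[i].
    energy = sum(unary[i][labels[i]] for i in range(n))
    # Pairwise term over all pairs i < j (fully connected).
    for i in range(n):
        for j in range(i + 1, n):
            if labels[i] != labels[j]:  # assumed compatibility u(y_i, y_j)
                energy += weight * gaussian_kernel(features[i], features[j], bandwidth)
    return energy
```

Under these assumptions, giving different labels to two pixels with similar features raises the energy, so by the Gibbs distribution of formula (9) such a labeling is less probable; this is the sense in which similar pixel points tend to receive the same label.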

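Further, the step of calculating an antenna downtilt angle includes:

obtaining the width x and the height y of the antenna box according to the border of the target image, and calculating a downtilt angle of the base station antenna according to a geometric relation, the downtilt angle of the base station antenna being an angle θ between the base station antenna and a vertical plane:

$\begin{matrix} θ = \arctan \frac{x}{y} . & (12) \end{matrix}$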

The above are merely preferred embodiments of the present disclosure. The present disclosure is not limited to the above implementations. As long as the implementations can achieve the technical effect of the present disclosure with the same means, they are all encompassed in the protection scope of the present disclosure.

Claims

1. A method for measuring an antenna downtilt angle based on a multi-scale deep semantic segmentation network, comprising: collecting image data: wherein base station antenna data is collected by using an unmanned aerial vehicle, and antenna images collected are taken as a data set; predicting a target bounding box: wherein a target antenna in the data set is positioned, and a bounding box is predicted by logistic regression; performing target recognition and semantic segmentation: wherein target features of the target antenna in the data set are extracted, the target features are learned and processed by an activation function, a target image is output for semantic image segmentation, and pixel points of the target image and the background are classified; and calculating an antenna downtilt angle: wherein the width and height of an antenna box are obtained according to a border of the target image to calculate the antenna downtilt angle; wherein the step of predicting a target bounding box comprises: positioning a target antenna in the antenna image, predicting a bounding box by logistic regression, first dividing the entire antenna image into N*N grids, predicting the entire antenna image after the antenna image is input, scanning each grid at a time, and starting to predict the target antenna when the center of the grid where the target antenna is located is positioned, wherein four coordinate values predicted for each bounding box are t_x, t_y, t_w, and t_h, respectively, an upper-left offset of each target cell is (c_x, c_y), the width and height of prior bounding boxes are p_w and p_h respectively, and the network predicts their values as: $b_{x} = σ (t_{x}) + c_{x}$, $b_{y} = σ (t_{y}) + c_{y}$, $b_{w} = p_{w} e^{t_{w}}$, $b_{h} = p_{h} e^{t_{h}}$, where σ(·) denotes the activation function, which can be expressed as:
2. The method for measuring an antenna downtilt angle based on a multi-scale deep semantic segmentation network according to claim 1, wherein the step of collecting image data comprises: locating the unmanned aerial vehicle on the top of a pole of a base station antenna, and recording the longitude and latitude (L0, W0) of the pole in the vertical direction; causing the unmanned aerial vehicle to fly around a point of the base station antenna, setting a flight radius of the unmanned aerial vehicle, and the unmanned aerial vehicle moving around the pole along the radius on the same horizontal plane to acquire antenna images with different attitudes and angles of a mobile base station antenna as a data set.
3. The method for measuring an antenna downtilt angle based on a multi-scale deep semantic segmentation network according to claim 2, wherein: a threshold is set to 0.5; when Pr(Object)=1, the target antenna falls in the center of the grid, that is, the bounding box currently predicted coincides with the actual (ground-truth) box object better than before; if the predicted bounding box is not the best currently, the bounding box is not predicted when the value is smaller than the threshold of 0.5, so as to determine that the target antenna does not fall into the grid.
4. The method for measuring an antenna downtilt angle based on a multi-scale deep semantic segmentation network according to claim 3, wherein the step of performing target recognition and semantic segmentation comprises: performing target recognition by using a network convolutional layer for feature extraction: an antenna image of 416*416 pixels with 3 channels is input; there are 32 convolution kernels, each kernel has a size of 3*3, and the 32 convolution kernels are used to learn 32 feature maps; to account for color differences of the target antenna, features of the target antenna are learned by using different convolution kernels; convolutional layer up-sampling is performed during feature extraction, and a prediction formula for object classes is as follows: $\Pr (Class_{i} | object) * \Pr (object) * IOU_{pred}^{truth} = \Pr (Class_{i}) * IOU_{pred}^{truth}$, wherein Pr(Class_i|object) is an object class probability; then applying the activation function by logistic regression:
5. The method for measuring an antenna downtilt angle based on a multi-scale deep semantic segmentation network according to claim 4, wherein the step of calculating an antenna downtilt angle comprises: obtaining the width x and the height y of the antenna box according to the border of the target image, and calculating a downtilt angle of the base station antenna according to a geometric relation, the downtilt angle of the base station antenna being an angle θ between the base station antenna and a vertical plane:

Priority Claims (1)

Number	Date	Country	Kind
201811338415.4	Nov 2018	CN	national

PCT Information

Filing Document	Filing Date	Country	Kind
PCT/CN2019/076718	3/1/2019	WO

Publishing Document	Publishing Date	Country	Kind
WO2020/093630	5/14/2020	WO	A

Related Publications (1)

	Number	Date	Country
	20210215481 A1	Jul 2021	US
