METHOD FOR CONSTRUCTING SUPPORT VECTOR MACHINE OF NONPARALLEL STRUCTURE

Information

  • Publication Number
    20230351267
  • Date Filed
    September 02, 2022
  • Date Published
    November 02, 2023
  • CPC
    • G06N20/10
  • International Classifications
    • G06N20/10
Abstract
A method for constructing a support vector machine of a nonparallel structure is provided. On the basis of a traditional parallel support vector machine (SVM), a least square term of the samples is added, and an additional empirical risk minimization term and an offset constraint term are added to the original optimization problem, so as to obtain two nonparallel hyperplanes respectively and form a new nonparallel support vector machine with additional empirical risk minimization. The method includes: preprocessing data, solving a Lagrange multiplier of a positive-class hyperplane, solving a Lagrange multiplier of a negative-class hyperplane, solving parameters of positive-class and negative-class hyperplanes, and determining a class of a new data point. Through the new method, a new nonparallel support vector machine algorithm is proposed to further improve the classification accuracy of hyperspectral images at the level of the algorithm itself, so as to obtain better classification performance.
Description
CROSS REFERENCE TO THE RELATED APPLICATIONS

This application is based upon and claims priority to Chinese Patent Application No. 202210401847.5, filed on Apr. 18, 2022, the entire contents of which are incorporated herein by reference.


TECHNICAL FIELD

The present disclosure relates to the technical field of machine learning, and in particular to a method for constructing a support vector machine of a nonparallel structure.


BACKGROUND

The support vector machine (SVM) was proposed by Vapnik et al. and is suitable for pattern recognition and related fields. The SVM is characterized by considering both the empirical risk and the structural risk; that is, supervised learning is realized by finding a hyperplane that ensures classification accuracy while maximizing the interval between the two classes of data. With desirable characteristics such as kernel tricks, sparsity and global solutions, the SVM is widely used in remote sensing image classification because of its solid theoretical basis and desirable generalization. A support vector machine classification model is premised on the assumption that the boundaries of the positive and negative classes are parallel. For actual remote sensing data, however, this assumption rarely holds, which degrades the generalization ability of the model. Jayadeva et al. proposed the twin support vector machine (TWSVM) to solve this problem. The TWSVM aims to find a pair of nonparallel hyperplanes (a parallel state can be regarded as a special nonparallel state), such that the data points of each class are close to one of the two nonparallel hyperplanes and far away from the other; the class of a sample is determined by comparing its distances to the two hyperplanes. The TWSVM has been quite successful but still has an obvious shortcoming: the TWSVM model considers only the empirical risk without the structural risk, so its generalization performance suffers and its classification effect is in many cases not as good as that of a traditional support vector machine. Therefore, the TWSVM is not directly effective for hyperspectral image classification. A new nonparallel support vector machine algorithm is proposed herein to further improve the classification accuracy of hyperspectral images at the level of the algorithm itself.


In view of the above situation, a nonparallel support vector machine model, namely an additional empirical risk minimization nonparallel support vector machine (AERM-NSVM), is constructed herein on the basis of a traditional parallel support vector machine by adding a least square term of the samples and an additional empirical risk minimization term; this model is referred to as the patent method hereinafter.


SUMMARY

An objective of the present disclosure is to provide a method for constructing a support vector machine of a nonparallel structure, in which a new nonparallel support vector machine algorithm is proposed to further improve the classification accuracy of hyperspectral images at the level of the algorithm itself, so as to obtain better classification performance.


To achieve the objective, the present disclosure provides the method for constructing a support vector machine of a nonparallel structure. The method includes:

  • S1, preprocessing data;
  • S2, solving a Lagrange multiplier of a positive-class hyperplane;
  • S3, solving a Lagrange multiplier of a negative-class hyperplane;
  • S4, solving parameters of positive-class and negative-class hyperplanes; and
  • S5, determining a class of a new data point.


Preferably, the preprocessing data in S1 specifically includes the following (a non-limiting sketch follows this list):

  • (1) reading a training data set of m n-dimensional samples of two classes, conducting standardization, obtaining a training data sample matrix C of m × n, and reading the label information as a vector y;
  • (2) distinguishing the training samples according to positive and negative label information to obtain a matrix A of m+ × n and a matrix B of m_ × n; and
  • (3) converting the label vector y into a diagonal matrix Y.
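
The preprocessing of S1 amounts to standardizing the samples, splitting them by label, and lifting the label vector into a diagonal matrix. A minimal sketch in Python follows; the z-score standardization, the {+1, -1} label coding, and all identifiers are illustrative assumptions rather than part of the claimed method.

    import numpy as np

    def preprocess(X, y):
        """S1 sketch. X: (m, n) raw samples; y: (m,) labels assumed in {+1, -1}."""
        # (1) standardize: zero mean, unit variance per feature (one common choice)
        C = (X - X.mean(axis=0)) / (X.std(axis=0) + 1e-12)
        # (2) split rows by label sign: A is m+ x n, B is m- x n
        A = C[y == 1]
        B = C[y == -1]
        # (3) convert the label vector y into the diagonal matrix Y
        Y = np.diag(y.astype(float))
        return C, A, B, Y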


Preferably, the solving a Lagrange multiplier of a positive-class hyperplane in S2 specifically includes the following (a non-limiting sketch follows at the end of this section):

  • (1) constructing a unit matrix I+ of m+ dimensions, and obtaining a matrix P+ by means of a formula (1);
    P_+ = \begin{bmatrix} AA^T + \tfrac{1}{c_1} I_+ & -AC^T Y^T \\ -YCA^T & YCC^T Y^T \end{bmatrix}   (1)
  • (2) constructing an all-ones vector e of m dimensions, and obtaining a matrix Q+ by means of a formula (2);
    Q_+ = -e^T   (2)
  • (3) constructing a unit matrix I of m dimensions, and obtaining a matrix H+ by means of a formula (3);
    H_+ = \begin{bmatrix} -I \\ I \end{bmatrix}   (3)
  • (4) obtaining a matrix J+ by means of a formula (4);
    J_+ = C_3 \times e^T   (4)
  • (5) constructing an all-ones vector e+ of m+ dimensions, and obtaining a matrix K+ by means of a formula (5);
    K_+ = \left[ e_+^T \;\; -e^T Y^T \right]   (5)
  • (6) obtaining vectors α = (α1, ..., αm) and λ = (λ1, ..., λm+) of the Lagrange multiplier by means of a formula (6);
    \min \; \tfrac{1}{2}\,[\lambda^T\ \alpha^T]\, P_+ \,[\lambda^T\ \alpha^T]^T + Q_+\,[\lambda^T\ \alpha^T]^T \quad \text{s.t.} \;\; K_+ [\lambda^T\ \alpha^T]^T = 0, \;\; 0 \le [\lambda^T\ \alpha^T]^T \le J_+, \; i = 1, \ldots, n   (6)


When P+ is obtained by a formula (7):

P_+ = \begin{bmatrix} AA^T + \tfrac{1}{c_1} I_+ + E_1 & -\left(AC^T + E_2\right) Y^T \\ -Y\left(CA^T + E_3\right) & Y\left(CC^T + E_4\right) Y^T \end{bmatrix}   (7)







At this time, the vectors α = (α1, ..., αm) and λ = (λ1, ..., λm+) of the Lagrange multiplier are obtained by means of a formula (8):

\min \; \tfrac{1}{2}\,[\lambda^T\ \alpha^T]\, P_+ \,[\lambda^T\ \alpha^T]^T + Q_+\,[\lambda^T\ \alpha^T]^T \quad \text{s.t.} \;\; 0 \le [\lambda^T\ \alpha^T]^T \le J_+, \; i = 1, \ldots, n   (8)

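As a concrete reading of S2, the sketch below assembles P+, Q+, J+ and K+ from formulas (1), (2), (4) and (5) and solves the box-constrained quadratic program (6). It assumes that Q+ is zero-padded over the λ block so that dimensions match, and it uses a generic SciPy routine purely as a stand-in; the disclosure does not prescribe a particular QP solver, and c1 and C3 are user-chosen penalty parameters.

    import numpy as np
    from scipy.optimize import minimize

    def solve_positive_multipliers(A, C, Y, c1=1.0, c3=1.0):
        """S2 sketch: solve formula (6) for the stacked vector u = [lambda; alpha]."""
        m_pos, m = A.shape[0], C.shape[0]
        # formula (1): block matrix P+
        P = np.block([
            [A @ A.T + np.eye(m_pos) / c1, -A @ C.T @ Y.T],
            [-Y @ C @ A.T,                  Y @ C @ C.T @ Y.T],
        ])
        P = 0.5 * (P + P.T)  # symmetrize against round-off
        # formula (2): Q+ = -e^T on the alpha block (zero-padded over lambda: an assumption)
        q = np.concatenate([np.zeros(m_pos), -np.ones(m)])
        # formula (4): upper bound J+ = C3 * e^T
        J = c3 * np.ones(m_pos + m)
        # formula (5): K+ = [e+^T, -e^T Y^T]
        K = np.concatenate([np.ones(m_pos), -(Y @ np.ones(m))])
        # formula (6): min 0.5 u^T P+ u + Q+ u  s.t.  K+ u = 0, 0 <= u <= J+
        res = minimize(lambda u: 0.5 * u @ P @ u + q @ u,
                       np.zeros(m_pos + m),
                       jac=lambda u: P @ u + q,
                       bounds=[(0.0, ub) for ub in J],
                       constraints=[{"type": "eq", "fun": lambda u: K @ u}],
                       method="SLSQP")
        lam, alpha = res.x[:m_pos], res.x[m_pos:]
        return lam, alpha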






Preferably, the solving a Lagrange multiplier of a negative-class hyperplane in S3 specifically includes the following (a non-limiting sketch follows at the end of this section):

  • (1) constructing a unit matrix I_ of m_ dimensions, and obtaining a matrix P_ by means of a formula (9);
    P_- = \begin{bmatrix} BB^T + \tfrac{1}{c_2} I_- & -BC^T Y^T \\ -YCB^T & YCC^T Y^T \end{bmatrix}   (9)
  • (2) constructing an all-ones vector e of m dimensions, and obtaining a matrix Q_ by means of a formula (10);
    Q_- = -e^T   (10)
  • (3) obtaining a matrix H_ by means of a formula (11);
    H_- = \begin{bmatrix} -I \\ I \end{bmatrix}   (11)
  • (4) obtaining a matrix J_ by means of a formula (12);
    J_- = C_4 \times e^T   (12)
  • (5) constructing an all-ones vector e_ of m_ dimensions, and obtaining a matrix K_ by means of a formula (13);
    K_- = \left[ e_-^T \;\; e^T Y^T \right]   (13)
  • (6) obtaining vectors θ = (θ1, ..., θm) and γ = (γ1, ..., γm_) of the Lagrange multiplier by means of a formula (14);
    \min \; \tfrac{1}{2}\,[\theta^T\ \gamma^T]\, P_- \,[\theta^T\ \gamma^T]^T + Q_-\,[\theta^T\ \gamma^T]^T \quad \text{s.t.} \;\; K_- [\theta^T\ \gamma^T]^T = 0, \;\; 0 \le [\theta^T\ \gamma^T]^T \le J_-, \; i = 1, \ldots, n   (14)


When P_ is obtained by a formula (15):

P_- = \begin{bmatrix} BB^T + \tfrac{1}{c_2} I_- + F_1 & -\left(BC^T + F_2\right) Y^T \\ -Y\left(CB^T + F_3\right) & Y\left(CC^T + F_4\right) Y^T \end{bmatrix}   (15)







At this time, the vectors θ = (θ1, ..., θm) and γ = (γ1, ..., γm_) of the Lagrange multiplier are obtained by means of a formula (16):

\min \; \tfrac{1}{2}\,[\theta^T\ \gamma^T]\, P_- \,[\theta^T\ \gamma^T]^T + Q_-\,[\theta^T\ \gamma^T]^T \quad \text{s.t.} \;\; 0 \le [\theta^T\ \gamma^T]^T \le J_-, \; i = 1, \ldots, n   (16)

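S3 mirrors S2 with the roles of the classes exchanged, so the same generic routine can be reused. In the sketch below only the inputs change, following formulas (9), (10), (12) and (13): B and c2 replace A and c1, the bound uses C4, and K_ carries no minus sign on its second block. The split of the solution into θ and γ assumes θ is the B-block multiplier, mirroring λ in the S2 sketch; this is an assumption made for dimensional consistency.

    import numpy as np
    from scipy.optimize import minimize

    def solve_negative_multipliers(B, C, Y, c2=1.0, c4=1.0):
        """S3 sketch: solve formula (14) for the stacked vector u = [theta; gamma]."""
        m_neg, m = B.shape[0], C.shape[0]
        # formula (9): block matrix P-
        P = np.block([
            [B @ B.T + np.eye(m_neg) / c2, -B @ C.T @ Y.T],
            [-Y @ C @ B.T,                  Y @ C @ C.T @ Y.T],
        ])
        P = 0.5 * (P + P.T)
        q = np.concatenate([np.zeros(m_neg), -np.ones(m)])   # formula (10), zero-padded (assumption)
        J = c4 * np.ones(m_neg + m)                          # formula (12)
        K = np.concatenate([np.ones(m_neg), Y @ np.ones(m)]) # formula (13): no sign flip
        res = minimize(lambda u: 0.5 * u @ P @ u + q @ u,
                       np.zeros(m_neg + m),
                       jac=lambda u: P @ u + q,
                       bounds=[(0.0, ub) for ub in J],
                       constraints=[{"type": "eq", "fun": lambda u: K @ u}],
                       method="SLSQP")
        return res.x[:m_neg], res.x[m_neg:]  # theta, gamma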






Preferably, the solving parameters of positive-class and negative-class hyperplanes in S4 specifically includes the following (a non-limiting sketch follows at the end of this section):

  • (1) obtaining a normal vector ω+ of the positive-class hyperplane by means of a formula (17);
    \omega_+ = -A^T \lambda + C^T Y^T \alpha   (17)
  • (2) obtaining an offset b+ of the positive-class hyperplane by means of a formula (18);
    b_+ = \dfrac{e_+^T \left( -A\omega_+ + \tfrac{1}{c_1}\lambda \right)}{m_+} - 1   (18)
  • (3) obtaining a normal vector ω_ of the negative-class hyperplane by means of a formula (19);
    \omega_- = B^T \theta + C^T Y^T \gamma   (19)
  • (4) obtaining an offset b_ of the negative-class hyperplane by means of a formula (20);
    b_- = \dfrac{e_-^T \left( -B\omega_- - \tfrac{1}{c_2}\gamma \right)}{m_-} + 1   (20)


If the Lagrange multipliers are obtained by means of the formula (8) (that is, P+ is obtained by the formula (7)), the solving parameters of positive-class and negative-class hyperplanes in S4 specifically includes:

  • (1) obtaining a normal vector ω+ of the positive-class hyperplane by means of a formula (21);
    \omega_+ = -A^T \lambda + C^T Y^T \alpha   (21)
  • (2) obtaining an offset b+ of the positive-class hyperplane by means of a formula (22);
    b_+ = -e_+^T \lambda + e^T Y^T \alpha - 1   (22)
  • (3) obtaining a normal vector ω_ of the negative-class hyperplane by means of a formula (23);
    \omega_- = B^T \theta + C^T Y^T \gamma   (23)
  • (4) obtaining an offset b_ of the negative-class hyperplane by means of a formula (24);
    b_- = -e_-^T \theta + e^T Y^T \gamma + 1   (24)

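With the multipliers in hand, S4 is a direct evaluation of formulas (17) to (20). The sketch below assumes the averaged-offset reading of formulas (18) and (20) reconstructed above; note that the source writes γ in formula (20), while dimensional consistency with formula (18) suggests the m_-dimensional B-block multiplier, which is what the sketch uses.

    import numpy as np

    def hyperplane_parameters(A, B, C, Y, lam, alpha, theta, gamma, c1=1.0, c2=1.0):
        """S4 sketch: normals and offsets of the two nonparallel hyperplanes."""
        # formula (17): positive-class normal
        w_pos = -A.T @ lam + C.T @ Y.T @ alpha
        # formula (18): offset averaged over the m+ positive samples
        b_pos = np.mean(-A @ w_pos + lam / c1) - 1.0
        # formula (19): negative-class normal
        w_neg = B.T @ theta + C.T @ Y.T @ gamma
        # formula (20): offset averaged over the m- negative samples
        # (theta used here for dimensional consistency; the source text writes gamma)
        b_neg = np.mean(-B @ w_neg - theta / c2) + 1.0
        return w_pos, b_pos, w_neg, b_neg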

Preferably, the determining a class of a new data point in S5 specifically includes the following (a non-limiting sketch follows at the end of this section):

  • (1) acquiring test data x, and obtaining the Euclidean distance between x and the positive-class hyperplane by means of a formula (25);
    d_+ = \dfrac{\left| x^T \omega_+ + b_+ \right|}{\left\| \omega_+ \right\|}   (25)
  • (2) obtaining the Euclidean distance between x and the negative-class hyperplane by means of a formula (26);
    d_- = \dfrac{\left| x^T \omega_- + b_- \right|}{\left\| \omega_- \right\|}   (26)
  • (3) determining which one of d+ and d_ is smaller, where x is in the positive class in response to determining d+ < d_, and otherwise in the negative class.

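S5 then reduces to two point-to-hyperplane distances and a comparison, as in the sketch below; the normalization by the norm of each normal vector follows the standard Euclidean distance reading of formulas (25) and (26).

    import numpy as np

    def classify(x, w_pos, b_pos, w_neg, b_neg):
        """S5 sketch: assign x to the class whose hyperplane is nearer."""
        d_pos = abs(x @ w_pos + b_pos) / np.linalg.norm(w_pos)  # formula (25)
        d_neg = abs(x @ w_neg + b_neg) / np.linalg.norm(w_neg)  # formula (26)
        return 1 if d_pos < d_neg else -1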

Preferably, the construction method is described in a linear manner; under the condition that the method is used in a nonlinear case, its expansion mode is consistent with that of a parallel support vector machine (SVM); and


although described for a case of two classes, under the condition that the method is used in a multi-class case, its expansion mode is likewise consistent with that of the parallel SVM.

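The disclosure leaves the nonlinear and multi-class extensions to the standard SVM recipes. As one common reading, and purely as an assumption here, the inner products such as AA^T in the P matrices can be replaced by a kernel Gram matrix; the sketch below shows a hypothetical RBF kernel that could stand in for those products, and a one-vs-rest scheme would handle multiple classes in the usual way.

    import numpy as np

    def rbf_gram(U, V, sigma=1.0):
        """Hypothetical kernel substitute for U @ V.T:
        K[i, j] = exp(-||u_i - v_j||^2 / (2 * sigma**2))."""
        sq = np.sum(U**2, axis=1)[:, None] + np.sum(V**2, axis=1)[None, :] - 2.0 * U @ V.T
        return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma**2))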

Therefore, through the method for constructing a support vector machine of a nonparallel structure of the present disclosure, a new nonparallel support vector machine algorithm is provided to further improve the classification accuracy of hyperspectral images at the level of the algorithm itself, so as to obtain better classification performance.


The technical solution of the present disclosure will be further described in detail below with reference to the accompanying drawings and the embodiments.





BRIEF DESCRIPTION OF THE DRAWINGS


FIG. 1 shows classification hyperplanes of a method of the present disclosure, where the dots in part A indicate positive sample points, the dots in part B indicate negative sample points, lines with triangles indicate the two parallel planes obtained by solving the positive sample optimization problem, and lines with circles indicate the two parallel planes obtained by solving the negative sample optimization problem;



FIGS. 2A-2D are images of a Pavia Center hyperspectral image classification result, where FIG. 2A is ground truth, FIG. 2B is a support vector machine (SVM), FIG. 2C is a twin support vector machine (TWSVM), and FIG. 2D is the method of the present disclosure; and



FIGS. 3A-3D are images of a Pavia University hyperspectral image classification result, where FIG. 3A is ground truth, FIG. 3B is SVM, FIG. 3C is TWSVM, and FIG. 3D is the method of the present disclosure.





DETAILED DESCRIPTION OF THE EMBODIMENTS

The technical solution of the present disclosure will be further described below with reference to the accompanying drawings and the embodiments.


Embodiment 1

The present disclosure provides a method for constructing a support vector machine of a nonparallel structure. The method includes:

  • S1, data is preprocessed, specifically,
    • (1) a training data set of m n-dimensional samples of two classes is read, standardization is conducted, a training data sample matrix C of m × n is obtained, and the label information is read as a vector y;
    • (2) the training samples are distinguished according to positive and negative label information to obtain a matrix A of m+ × n and a matrix B of m_ × n; and
    • (3) the label vector y is converted into a diagonal matrix Y.
  • S2, a Lagrange multiplier of a positive-class hyperplane is solved, specifically,
    • (1) a unit matrix I+ of m+ dimensions is constructed, and a matrix P+ is obtained by means of a formula (1);
      P_+ = \begin{bmatrix} AA^T + \tfrac{1}{c_1} I_+ & -AC^T Y^T \\ -YCA^T & YCC^T Y^T \end{bmatrix}   (1)
    • (2) an all-ones vector e of m dimensions is constructed, and a matrix Q+ is obtained by means of a formula (2);
      Q_+ = -e^T   (2)
    • (3) a unit matrix I of m dimensions is constructed, and a matrix H+ is obtained by means of a formula (3);
      H_+ = \begin{bmatrix} -I \\ I \end{bmatrix}   (3)
    • (4) a matrix J+ is obtained by means of a formula (4);
      J_+ = C_3 \times e^T   (4)
    • (5) an all-ones vector e+ of m+ dimensions is constructed, and a matrix K+ is obtained by means of a formula (5);
      K_+ = \left[ e_+^T \;\; -e^T Y^T \right]   (5)
    • (6) vectors α = (α1, ..., αm) and λ = (λ1, ..., λm+) of the Lagrange multiplier are obtained by means of a formula (6);
      \min \; \tfrac{1}{2}\,[\lambda^T\ \alpha^T]\, P_+ \,[\lambda^T\ \alpha^T]^T + Q_+\,[\lambda^T\ \alpha^T]^T \quad \text{s.t.} \;\; K_+ [\lambda^T\ \alpha^T]^T = 0, \;\; 0 \le [\lambda^T\ \alpha^T]^T \le J_+, \; i = 1, \ldots, n   (6)
  • S3, a Lagrange multiplier of a negative-class hyperplane is solved, specifically,
    • (1) a unit matrix I_ of m_ dimensions is constructed, and a matrix P_ is obtained by means of a formula (7);
      P_- = \begin{bmatrix} BB^T + \tfrac{1}{c_2} I_- & -BC^T Y^T \\ -YCB^T & YCC^T Y^T \end{bmatrix}   (7)
    • (2) an all-ones vector e of m dimensions is constructed, and a matrix Q_ is obtained by means of a formula (8);
      Q_- = -e^T   (8)
    • (3) a matrix H_ is obtained by means of a formula (9);
      H_- = \begin{bmatrix} -I \\ I \end{bmatrix}   (9)
    • (4) a matrix J_ is obtained by means of a formula (10);
      J_- = C_4 \times e^T   (10)
    • (5) an all-ones vector e_ of m_ dimensions is constructed, and a matrix K_ is obtained by means of a formula (11);
      K_- = \left[ e_-^T \;\; e^T Y^T \right]   (11)
    • (6) vectors θ = (θ1, ..., θm) and γ = (γ1, ..., γm_) of the Lagrange multiplier are obtained by means of a formula (12);
      \min \; \tfrac{1}{2}\,[\theta^T\ \gamma^T]\, P_- \,[\theta^T\ \gamma^T]^T + Q_-\,[\theta^T\ \gamma^T]^T \quad \text{s.t.} \;\; K_- [\theta^T\ \gamma^T]^T = 0, \;\; 0 \le [\theta^T\ \gamma^T]^T \le J_-, \; i = 1, \ldots, n   (12)
  • S4, parameters of positive-class and negative-class hyperplanes are solved, specifically,
    • (1) a normal vector ω+ of a positive-class hyperplane is obtained by means of a formula (13);
      \omega_+ = -A^T \lambda + C^T Y^T \alpha   (13)
    • (2) an offset b+ of the positive-class hyperplane is obtained by means of a formula (14);
      b_+ = \dfrac{e_+^T \left( -A\omega_+ + \tfrac{1}{c_1}\lambda \right)}{m_+} - 1   (14)
    • (3) a normal vector ω_ of a negative-class hyperplane is obtained by means of a formula (15);
      \omega_- = B^T \theta + C^T Y^T \gamma   (15)
    • (4) an offset b_ of the negative-class hyperplane is obtained by means of a formula (16);
      b_- = \dfrac{e_-^T \left( -B\omega_- - \tfrac{1}{c_2}\gamma \right)}{m_-} + 1   (16)
  • S5, a class of a new data point is determined, specifically,
    • (1) test data x is acquired, and the Euclidean distance between x and the positive-class hyperplane is obtained by means of a formula (17);
      d_+ = \dfrac{\left| x^T \omega_+ + b_+ \right|}{\left\| \omega_+ \right\|}   (17)
    • (2) the Euclidean distance between x and the negative-class hyperplane is obtained by means of a formula (18);
      d_- = \dfrac{\left| x^T \omega_- + b_- \right|}{\left\| \omega_- \right\|}   (18)
    • (3) which one of d+ and d_ is smaller is determined, where x is in the positive class in response to determining d+ < d_, and otherwise in the negative class.


Embodiment 2

The present disclosure provides a method for constructing a support vector machine of a nonparallel structure. The method includes:

  • S1, data is preprocessed, specifically,
    • (1) a training data set of m n-dimensional samples of two classes is read, and standardization is conducted; a training data sample matrix C of m × n is obtained, and the label information is read as a vector y;
    • (2) the training samples are distinguished according to positive and negative label information to obtain a matrix A of m+ × n and a matrix B of m_ × n; and
    • (3) the label vector y is converted into a diagonal matrix Y.
  • S2, a Lagrange multiplier of a positive-class hyperplane is solved, specifically,
    • (1) a unit matrix I+ of m+ dimensions is constructed, and a matrix P+ is obtained by means of a formula (1);
      P_+ = \begin{bmatrix} AA^T + \tfrac{1}{c_1} I_+ + E_1 & -\left(AC^T + E_2\right) Y^T \\ -Y\left(CA^T + E_3\right) & Y\left(CC^T + E_4\right) Y^T \end{bmatrix}   (1)
    • (2) a matrix Q+ is obtained by means of a formula (2);
      Q_+ = -e^T   (2)
    • (3) a unit matrix I of m dimensions is constructed, and a matrix H+ is obtained by means of a formula (3);
      H_+ = \begin{bmatrix} -I \\ I \end{bmatrix}   (3)
    • (4) an all-ones vector e of m dimensions is constructed, and a matrix J+ is obtained by means of a formula (4);
      J_+ = C_3 \times e^T   (4)
    • (5) vectors α = (α1, ..., αm) and λ = (λ1, ..., λm+) of the Lagrange multiplier are obtained by means of a formula (5);
      \min \; \tfrac{1}{2}\,[\lambda^T\ \alpha^T]\, P_+ \,[\lambda^T\ \alpha^T]^T + Q_+\,[\lambda^T\ \alpha^T]^T \quad \text{s.t.} \;\; 0 \le [\lambda^T\ \alpha^T]^T \le J_+, \; i = 1, \ldots, n   (5)
  • S3, a Lagrange multiplier of a negative-class hyperplane is solved, specifically,
    • (1) a unit matrix I_ of m_ dimensions is constructed, and a matrix P_ is obtained by means of a formula (6);
      P_- = \begin{bmatrix} BB^T + \tfrac{1}{c_2} I_- + F_1 & -\left(BC^T + F_2\right) Y^T \\ -Y\left(CB^T + F_3\right) & Y\left(CC^T + F_4\right) Y^T \end{bmatrix}   (6)
    • (2) a matrix Q_ is obtained by means of a formula (7);
      Q_- = -e^T   (7)
    • (3) a matrix H_ is obtained by means of a formula (8);
      H_- = \begin{bmatrix} -I \\ I \end{bmatrix}   (8)
    • (4) an all-ones vector e of m dimensions is constructed, and a matrix J_ is obtained by means of a formula (9);
      J_- = C_4 \times e^T   (9)
    • (5) vectors θ = (θ1, ..., θm) and γ = (γ1, ..., γm_) of the Lagrange multiplier are obtained by means of a formula (10);
      \min \; \tfrac{1}{2}\,[\theta^T\ \gamma^T]\, P_- \,[\theta^T\ \gamma^T]^T + Q_-\,[\theta^T\ \gamma^T]^T \quad \text{s.t.} \;\; 0 \le [\theta^T\ \gamma^T]^T \le J_-, \; i = 1, \ldots, n   (10)
  • S4, parameters of positive-class and negative-class hyperplanes are solved, specifically,
    • (1) a normal vector ω+ of a positive-class hyperplane is obtained by means of a formula (11);
      \omega_+ = -A^T \lambda + C^T Y^T \alpha   (11)
    • (2) an offset b+ of the positive-class hyperplane is obtained by means of a formula (12);
      b_+ = -e_+^T \lambda + e^T Y^T \alpha - 1   (12)
    • (3) a normal vector ω_ of a negative-class hyperplane is obtained by means of a formula (13);
      \omega_- = B^T \theta + C^T Y^T \gamma   (13)
    • (4) an offset b_ of the negative-class hyperplane is obtained by means of a formula (14);
      b_- = -e_-^T \theta + e^T Y^T \gamma + 1   (14)
  • S5, a class of a new data point is determined, specifically,
    • (1) test data x is acquired, and the Euclidean distance between x and the positive-class hyperplane is obtained by means of a formula (15);
      d_+ = \dfrac{\left| x^T \omega_+ + b_+ \right|}{\left\| \omega_+ \right\|}   (15)
    • (2) the Euclidean distance between x and the negative-class hyperplane is obtained by means of a formula (16);
      d_- = \dfrac{\left| x^T \omega_- + b_- \right|}{\left\| \omega_- \right\|}   (16)
    • (3) which one of d+ and d_ is smaller is determined, where x is in the positive class in response to determining d+ < d_, and otherwise in the negative class.


In the present disclosure, the construction method is described in a linear manner; under the condition that the method is used in a nonlinear case, its expansion mode is consistent with that of a parallel support vector machine (SVM); and although described for a case of two classes, under the condition that the method is used in a multi-class case, its expansion mode is likewise consistent with that of the parallel SVM.


As shown in FIG. 1, each quadratic programming problem yields a pair of parallel hyperplanes, as in an SVM, each shifted toward the corresponding samples. Lines with triangles indicate the two parallel planes obtained by solving the positive sample optimization problem, and lines with circles indicate the two parallel planes obtained by solving the negative sample optimization problem. In this case, the straight line near the positive sample points and the straight line near the negative sample points are respectively taken as the pair of classification decision hyperplanes of the additional empirical risk minimization nonparallel support vector machine (AERM-NSVM).


To illustrate effectiveness of the present disclosure, the following experimental demonstration is conducted.


1. Pavia Center Data Set

The Pavia Center data set was acquired by the Reflective Optics System Imaging Spectrometer (ROSIS) sensor over Pavia, northern Italy. The Pavia Center image has 102 spectral bands and 1096×1096 pixels, and contains 9 classes. The division of samples into training and test sets is shown in Table 1:





TABLE 1

The number of samples of each class of the Pavia Center scene

#    Class                   Samples    Train    Test
1    Water                   65971      300      65671
2    Trees                   7598       300      7298
3    Asphalt                 3090       300      2790
4    Self-Blocking Bricks    2685       300      2385
5    Bitumen                 6584       300      6284
6    Tiles                   9248       300      8948
7    Shadows                 7287       300      6987
8    Meadows                 42826      300      42526
9    Bare Soil               2863       300      2563






Classification results are shown in FIGS. 2A-2D, and classification accuracy is shown in Table 2.





TABLE 2

Classification results of hyperspectral images of Pavia Center

Experimental method    SVM      TWSVM    Method of the present disclosure
Test accuracy          98.33    98.25    98.50
Kappa                  97.62    97.50    97.86






The Pavia Center data set has a large amount of data, and an equal number of samples is taken from each class for training. It may be seen from Table 2 that the twin support vector machine (TWSVM) on the Pavia Center data set has a classification accuracy slightly lower than that of SVM, and a Kappa coefficient also lower than that of SVM. The method of the present disclosure has a classification accuracy exceeding that of the standard SVM, and a Kappa coefficient higher than that of SVM.


2. Pavia University Data Set

The Pavia University data set was acquired by the ROSIS sensor over Pavia, northern Italy. The Pavia University image has 103 spectral bands and 610×610 pixels, and contains 9 classes. The division of samples into training and test sets is shown in Table 3:





TABLE 3

The number of samples of each class of Pavia University

#    Class                   Samples    Train    Test
1    Asphalt                 6631       300      6331
2    Meadows                 18649      300      18349
3    Gravel                  2099       300      1790
4    Trees                   3064       300      2764
5    Painted metal sheets    1345       300      1045
6    Bare Soil               5029       300      4729
7    Bitumen                 1330       300      1030
8    Self-Blocking Bricks    3682       300      3382
9    Shadows                 947        300      647






Classification results are shown in FIGS. 3A-3D, and classification accuracy is shown in Table 4.





TABLE 4

Classification results of hyperspectral images of Pavia University

Experimental method    SVM      TWSVM    Method of the present disclosure
Test accuracy          91.49    91.53    92.43
Kappa                  88.76    88.77    89.92






It may be seen from Table 4 that TWSVM on the Pavia University data set has a classification accuracy higher than that of SVM and a Kappa coefficient similar to that of SVM, being more suitable for the case of a nonparallel classification plane. Compared with the standard SVM and TWSVM, the method of the present disclosure has better classification accuracy: its accuracy is 0.95% higher than that of the standard SVM, and its Kappa coefficient is 1.16% higher than that of the SVM. It is indicated that the method of the present disclosure, with structural risk minimization, may achieve better results than TWSVM with only empirical risk minimization.

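For reference, the test accuracy and Kappa figures reported in Tables 2 and 4 correspond to the standard overall accuracy and Cohen's kappa computed from the test-set confusion matrix; a sketch of that computation follows (an assumption about the evaluation protocol, not the authors' evaluation code).

    import numpy as np

    def accuracy_and_kappa(conf):
        """conf[i, j]: count of class-i test samples predicted as class j."""
        n = conf.sum()
        po = np.trace(conf) / n                          # observed agreement (overall accuracy)
        pe = conf.sum(axis=0) @ conf.sum(axis=1) / n**2  # chance agreement
        return 100.0 * po, 100.0 * (po - pe) / (1.0 - pe)  # accuracy %, kappa %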

Therefore, through the method for constructing a support vector machine of a nonparallel structure of the present disclosure, a new nonparallel support vector machine algorithm is provided to further improve the classification accuracy of hyperspectral images at the level of the algorithm itself, so as to obtain better classification performance.


Finally, it should be noted that the above embodiments are merely used to describe the technical solution of the present disclosure, rather than to limit the same. Although the present disclosure has been described in detail with reference to the preferred embodiments, those of ordinary skill in the art should understand that the technical solution of the present disclosure may still be modified or equivalently replaced. However, these modifications or equivalent replacements do not make the modified technical solution deviate from the spirit and scope of the technical solution of the present disclosure.

Claims
  • 1. A method for constructing a support vector machine of a nonparallel structure, comprising: S1, preprocessing data;S2, solving a Lagrange multiplier of a positive-class hyperplane;S3, solving a Lagrange multiplier of a negative-class hyperplane;S4, solving parameters of the positive-class hyperplane and the negative-class hyperplane; andS5, determining a class of a new data point.
  • 2. The method for constructing the support vector machine of the nonparallel structure according to claim 1, wherein the step of preprocessing the data in S1 specifically comprises: (1) reading m n-dimensional training data sets of two classes, conducting standardization, obtaining a training data sample matrix C of m × n, and reading label information as a vector y;(2) distinguishing training samples according to positive and negative label information to obtain a matrix A of m+ × n and a matrix B of m_ × n; and(3) converting a label term y into a diagonal matrix Y.
  • 3. The method for constructing the support vector machine of the nonparallel structure according to claim 1, wherein the step of solving the Lagrange multiplier of the positive-class hyperplane in S2 specifically comprises: (1) constructing a unit vector I+ of m+ dimensions, and obtaining a matrix P+ by means of a formula (1); P+=AAT+1c1I+−ACTYT−YCATYCCTYT­­­(1)(2) constructing an all-ones vector e of m dimensions, and obtaining a matrix Q+ by means of a formula (2); Q+=−eT­­­(2)(3) constructing a unit matrix I of m dimensions, and obtaining a matrix H+ by means of a formula (3); H+=−1×II­­­(3)(4) obtaining a matrix J+ by means of a formula (4); J+=C3×eT­­­(4)(5) constructing an all-ones vector e+ of m+ dimensions, and obtaining a matrix K+ by means of a formula (5); K+=e+T−eTYT­­­(5)(6) obtaining vectors α = (α1, ..., αm) and λ = (λ1, ..., λm+, ) of the Lagrange multiplier by means of a formula (6); minμ12λT αTTP+λT αT+Q+λT αTs.t. K+TλT αT=00≤λT αT≤J+, i=1,…,n­­­(6)when P+ is obtained by Equation (7); P+=AAT+1c1I++E1−ACT+E2YT−YCAT+E3YCCT+E4YT­­­(7)at this time, obtaining vectors α = (α1, ..., αm) and λ = (λ1 ..., λm+ ) of the Lagrange multiplier by means of a formula (8); minμ12λT αTTP+λT αT+Q+λT αTs.t. 0≤λT αT≤J+, i=1,…,n­­­(8) .
  • 4. The method for constructing the support vector machine of the nonparallel structure according to claim 1, wherein the step of solving the Lagrange multiplier of the negative-class hyperplane in S3 specifically comprises: (1) constructing a unit matrix I_ of m_ dimensions, and obtaining a matrix P_ by means of a formula (9); P−=AAT+1c1I+−ACTYT−YCATYCCTYT­­­(9)(2) constructing an all-ones vector e of m dimensions, and obtaining a matrix Q_ by means of a formula (10); Q−=−eT­­­(10)(3) obtaining a matrix H_ by means of a formula (11); H−=−1×II­­­(11)(4) obtaining a matrix J_ by means of a formula (12); J−=C4×eT­­­(12)(5) constructing an all-ones vector e_ of m_ dimensions, and obtaining a matrix K by means of a formula (13); K−=e−TeTYT­­­(13)(6) obtaining vectors θ = (θ1, .., θm) and γ = (γ1, ..., γm_ ) of the Lagrange multiplier by means of a formula (14); minμ12θT γTTP−θT γT+Q−θT γTs.t. K−TθT γT=00≤θT γT≤J−, i=1,…,n­­­(14)when P is obtained by Equation (15); P−=BBT+1c2I−+F1−BCT+F2YT−YCBT+F3YCCT+F4YT­­­(15)at this time, obtaining vectors θ = (θ1, ..., θm) and γ = (γ1, ..., γm_) of the Lagrange multiplier by means of a formula (16); minμ12θT γTTP−θT γT+Q−θT γTs.t. 0≤θT γT≤J−, i=1,…,n­­­(16) .
  • 5. The method for constructing the support vector machine of the nonparallel structure according to claim 1, wherein the step of solving parameters of the positive-class hyperplane and the negative-class hyperplane in S4 specifically comprises: (1) obtaining a normal vector ω+ of the positive-class hyperplane by means of a formula (17); ω+=−ATλ+CTYTα­­­(17)(18);(2) obtaining an offset b+ of the positive-class hyperplane by means of a formula b+=e+T−Aω++1c1λm+−1­­­(18)(3) obtaining a normal vector ω of the negative-class hyperplane by means of a formula (19); ω−=BTθ+CTYTγ­­­(19)(4) obtaining an offset b of the negative-class hyperplane by means of a formula (20); b−=e−T−Bω−−1c2λm−+1­­­(20) .
  • 6. The method for constructing the support vector machine of the nonparallel structure according to claim 4, wherein when the Lagrange multipliers are obtained by Equation (16), the step of solving parameters of the positive-class hyperplane and the negative-class hyperplane in S4 specifically comprises: (1) obtaining a normal vector ω+ of the positive-class hyperplane by means of a formula (21); ω+=−ATλ+CTYTα­­­(21)(2) obtaining an offset b+ of the positive-class hyperplane by means of a formula (22); b+=−e+Tλ+eTYTα−1­­­(22)(3) obtaining a normal vector ω_ of the negative-class hyperplane by means of a formula (23); ω−=BTθ+CTYTγ­­­(23)(4) obtaining an offset b_ of the negative-class hyperplane by means of a formula (24); b−=−e−Tθ+eTYTγ+1­­­(24) .
  • 7. The method for constructing the support vector machine of the nonparallel structure according to claim 1, wherein the step of determining the class of the new data point in S5 specifically comprises: (1) acquiring test data x, and obtaining a Euclidean distance between x and the positive-class hyperplane by means of a formula (25); d+=xT⋅ω++b+d+­­­(25)(2) obtaining a Euclidean distance between x and the negative-class hyperplane by means of a formula (26); d−=xT⋅ω−+b−d−­­­(26)(3) determining which one of d+ and d_ is smaller, wherein x is in a positive class in response to determining d+ < d_, and otherwise in a negative class.
  • 8. The method for constructing the support vector machine of the nonparallel structure according to claim 1, wherein construction methods are conducted in a linear manner, and under the condition that the methods are used in a nonlinear case, expansion modes of the construction methods are consistent with an expansion mode of a parallel support vector machine (SVM); andaccording to description of a case of two classes, under the condition that the construction methods are used in a multi-class case, the expansion modes of the construction methods are consistent with an expansion mode of the parallel SVM.
  • 9. The method for constructing the support vector machine of the nonparallel structure according to claim 2, wherein construction methods are conducted in a linear manner, and under the condition that the methods are used in a nonlinear case, expansion modes of the construction methods are consistent with an expansion mode of a parallel support vector machine (SVM); andaccording to description of a case of two classes, under the condition that the construction methods are used in a multi-class case, the expansion modes of the construction methods are consistent with an expansion mode of the parallel SVM.
  • 10. The method for constructing the support vector machine of the nonparallel structure according to claim 3, wherein construction methods are conducted in a linear manner, and under the condition that the methods are used in a nonlinear case, expansion modes of the construction methods are consistent with an expansion mode of a parallel support vector machine (SVM); andaccording to description of a case of two classes, under the condition that the construction methods are used in a multi-class case, the expansion modes of the construction methods are consistent with an expansion mode of the parallel SVM.
  • 11. The method for constructing the support vector machine of the nonparallel structure according to claim 4, wherein construction methods are conducted in a linear manner, and under the condition that the methods are used in a nonlinear case, expansion modes of the construction methods are consistent with an expansion mode of a parallel support vector machine (SVM); andaccording to description of a case of two classes, under the condition that the construction methods are used in a multi-class case, the expansion modes of the construction methods are consistent with an expansion mode of the parallel SVM.
  • 12. The method for constructing the support vector machine of the nonparallel structure according to claim 5, wherein construction methods are conducted in a linear manner, and under the condition that the methods are used in a nonlinear case, expansion modes of the construction methods are consistent with an expansion mode of a parallel support vector machine (SVM); andaccording to description of a case of two classes, under the condition that the construction methods are used in a multi-class case, the expansion modes of the construction methods are consistent with an expansion mode of the parallel SVM.
  • 13. The method for constructing the support vector machine of the nonparallel structure according to claim 6, wherein construction methods are conducted in a linear manner, and under the condition that the methods are used in a nonlinear case, expansion modes of the construction methods are consistent with an expansion mode of a parallel support vector machine (SVM); andaccording to description of a case of two classes, under the condition that the construction methods are used in a multi-class case, the expansion modes of the construction methods are consistent with an expansion mode of the parallel SVM.
  • 14. The method for constructing the support vector machine of the nonparallel structure according to claim 7, wherein construction methods are conducted in a linear manner, and under the condition that the methods are used in a nonlinear case, expansion modes of the construction methods are consistent with an expansion mode of a parallel support vector machine (SVM); andaccording to description of a case of two classes, under the condition that the construction methods are used in a multi-class case, the expansion modes of the construction methods are consistent with an expansion mode of the parallel SVM.
Priority Claims (1)
Number Date Country Kind
202210401847.5 Apr 2022 CN national