This application claims priority to and the benefit of Korean Patent Application No. 2018-0025816, filed on Mar. 5, 2018, the disclosure of which is incorporated herein by reference in its entirety.
The present invention relates to an apparatus and method for linearly approximating a deep neural network (DNN) model which is a non-linear function.
In general, a DNN is composed of an input layer, hidden layers, and an output layer as shown in the accompanying drawing, and its computation is expressed by Equation 1 below.

z(0) = xt
yi(l+1) = Σj=1N wij(l+1)zj(l) + bi(l+1)
zi(l+1) = σ(yi(l+1)) (1)
Affine transformations are performed on an input signal xt with a weight matrix W and a bias vector b to calculate y, and then a non-linear activation function σ is applied to calculate a result value z. In affine geometry, an affine transformation is a transformation between two affine spaces that preserves collinearity. An affine transformation f: A→B corresponds to a transformation ϕ: VA→VB between the associated vector spaces (the spaces constituted of vectors connecting two points of each affine space), which maps the vector from P to Q onto the vector from f(P) to f(Q) and which satisfies linearity.
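By way of illustration only, the layer computation of Equation 1 may be sketched in Python as follows; NumPy, the tanh activation, and the layer sizes are assumptions made for this sketch rather than part of the claimed invention.

```python
# A minimal sketch of Equation 1, assuming NumPy; sizes are illustrative.
import numpy as np

def layer(z_prev, W, b):
    # One layer: affine transformation, then the non-linear activation sigma.
    y = W @ z_prev + b          # yi(l+1) = sum_j wij(l+1) zj(l) + bi(l+1)
    return np.tanh(y)           # zi(l+1) = sigma(yi(l+1))

rng = np.random.default_rng(0)
x_t = rng.normal(size=3)        # z(0) = xt, a 3-dimensional input
W1, b1 = rng.normal(size=(4, 3)), rng.normal(size=4)   # hidden layer
W2, b2 = rng.normal(size=(2, 4)), rng.normal(size=2)   # output layer
z1 = layer(x_t, W1, b1)
z2 = layer(z1, W2, b2)          # network output
print(z2)
```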
In the hidden layer, various non-linear activation functions, such as tanh, are used.
In general, a deep neural network (DNN) model shows good performance in generation and classification tasks. However, a DNN is fundamentally non-linear, and it is therefore difficult to interpret how such a black-box model derives its results from given inputs. To solve this problem, the present invention proposes linear approximation of a DNN model.
According to an aspect of the present invention, there is provided a method of linearly approximating a DNN model, the method comprising: a first operation of expanding an input to a neuron of a DNN into a polynomial; a second operation of approximating the neuron of the DNN with a Taylor series in parallel with the polynomial expansion of the input; and a third operation of classifying the polynomially expanded input and the Taylor-series approximated neuron as a polynomial of input signals and a polynomial of weights.
When a polynomial expansion is performed on an input x, an expanded input p(x) is obtained. In parallel, the neuron of the DNN is approximated with a Taylor series: for the non-linear activation function tanh, this gives y = tanh(h) ≈ h − ⅓h³, and it is also possible to obtain the polynomial y ≈ (w1x1 + w2x2) − ⅓(w1x1 + w2x2)³ therefrom.
The third operation of classifying the polynomially expanded input and the Taylor-series-approximated neuron as the polynomial of input signals and the polynomial of weights may include converting the polynomial y ≈ (w1x1 + w2x2) − ⅓(w1x1 + w2x2)³ into the form of the equation y = a·p(x). The converted equation has the form Y = A·P, and as a result, it is possible to linearly approximate a DNN model. Therefore, it is possible to handle the DNN model in the same way as calculating a solution to a general linear system.
In the equation y = a·p(x), p(x) is an nth-order polynomial of the input signals, p(x) = (1, x1, x2, x1², x1x2, x2², x1³, x1²x2, x1x2², x2³), and the weight matrix a is a polynomial of the weight matrix W, a = (0, w1, w2, 0, 0, 0, −⅓w1³, −w1²w2, −w1w2², −⅓w2³).
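This decomposition can be checked symbolically. The following sketch, assuming the SymPy library and the two-input neuron of the embodiments described below, verifies that a·p(x) reproduces the Taylor polynomial of the neuron exactly.

```python
# A symbolic check (assuming SymPy) that a . p(x) equals the Taylor
# approximation of tanh(w1*x1 + w2*x2).
import sympy as sp

x1, x2, w1, w2 = sp.symbols('x1 x2 w1 w2')
h = w1*x1 + w2*x2
taylor = sp.expand(h - h**3/3)              # Taylor form of the neuron

# Monomial basis p(x) and coefficient vector a as in the equations above.
p = [1, x1, x2, x1**2, x1*x2, x2**2, x1**3, x1**2*x2, x1*x2**2, x2**3]
a = [0, w1, w2, 0, 0, 0, -w1**3/3, -w1**2*w2, -w1*w2**2, -w2**3/3]

# a . p(x) must match the expanded Taylor polynomial exactly.
assert sp.expand(sum(ai*pi for ai, pi in zip(a, p)) - taylor) == 0
```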
According to another aspect of the present invention, there is provided an apparatus for linearly approximating a DNN model, the apparatus comprising: a first means configured to expand an input to a neuron of a DNN into a polynomial; a second means configured to approximate the neuron of the DNN with a Taylor series in parallel with the polynomial expansion of the input; and a third means configured to classify the polynomially expanded input and the Taylor-series approximated neuron as a polynomial of input signals and a polynomial of weights.
The first to third means of the apparatus may be implemented with computing hardware, such as a controller and a processor, including electronic circuits or devices designed to perform signal processing and data computation.
The aforementioned configurations and operations of the present invention will become more apparent through detailed embodiments described below with reference to drawings.
The above and other objects, features and advantages of the present invention will become more apparent to those of ordinary skill in the art by describing exemplary embodiments thereof in detail with reference to the accompanying drawings.
Advantages and features of the present invention, and methods of accomplishing the same, will become apparent with reference to embodiments described in detail below together with the accompanying drawings. However, the present invention is not limited to the embodiments disclosed below and may be implemented in various forms. Rather, the embodiments are provided so that this disclosure will be thorough and complete and will fully convey the scope of the invention to those of ordinary skill in the art. The present invention is defined only by the appended claims.
The terminology used herein is for the purpose of describing embodiments only, and is not intended to limit the present invention. As used herein, singular terms are intended to include the plural forms as well unless the context clearly indicates otherwise. It will be further understood that the terms “comprise” and “comprising”, when used herein, do not preclude the presence or addition of one or more elements, steps, operations, and/or devices other than stated elements, steps, operations, and/or devices.
Hereinafter, exemplary embodiments of the present invention will be described in detail with reference to the accompanying drawings. In the drawings, like elements are denoted by like reference numerals even when the elements are shown in different drawings. When a detailed description of known configurations or functions would obscure the gist of the present invention, such description will be omitted.
First, to describe exemplary embodiments of the present invention, it is assumed that there is a neuron having two inputs x1 and x2 and an output y, as shown in the accompanying drawing.
An original input x is expanded into a polynomial such that a polynomial p(x) is obtained (100).
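As an illustrative sketch of this expansion step (100), the following Python code builds p(x) for a two-dimensional input; NumPy, the helper name expand, and the degree-3 monomial ordering of Equation 6 below are assumptions made for this example.

```python
# A sketch of the polynomial expansion step (100), assuming NumPy and the
# monomial ordering of Equation 6; the helper name expand is hypothetical.
import numpy as np

def expand(x1, x2):
    # Expand the input (x1, x2) into monomials up to degree 3.
    return np.array([1, x1, x2,
                     x1**2, x1*x2, x2**2,
                     x1**3, x1**2*x2, x1*x2**2, x2**3])

print(expand(0.1, -0.2))  # p(x) for a sample input
```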
In the neuron of the drawing, a weighted sum h of the two inputs is calculated as shown in Equation 2.

h = w1x1 + w2x2 (2)
Referring back to the drawing, assuming that tanh is used as the non-linear activation function, tanh(h) is approximated with a Taylor series as shown in Equation 3.
y = tanh(h) ≈ h − ⅓h³ (3)
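The accuracy of this third-order approximation near h = 0 may be checked numerically; the sketch below, assuming NumPy, compares tanh(h) with h − ⅓h³ for a few small values of h.

```python
# A small numerical check (assuming NumPy) that h - (1/3)h**3 tracks
# tanh(h) for small h, where the Taylor approximation of Equation 3 holds.
import numpy as np

for h in (0.1, 0.5, 1.0):
    approx = h - h**3 / 3
    print(f"h={h}: tanh={np.tanh(h):.6f}, taylor={approx:.6f}, "
          f"error={abs(np.tanh(h) - approx):.6f}")
```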
When Equation 2 is substituted into Equation 3, the neuron of the drawing is approximated as shown in Equation 4.

y = tanh(w1x1 + w2x2) ≈ (w1x1 + w2x2) − ⅓(w1x1 + w2x2)³ (4)
When the polynomially expanded input p(x) and the Taylor-series-approximated neuron are classified into a polynomial of input signals and a polynomial of weights w, they may be arranged in the form of a linear system as shown in Equation 5.
y = a·p(x) (5)
Here, p(x) is an nth-order polynomial of the input signals, shown in Equation 6, and the weight matrix a is a polynomial of the original weight matrix W, shown in Equation 7.
p(x) = (1, x1, x2, x1², x1x2, x2², x1³, x1²x2, x1x2², x2³) (6)
a = (0, w1, w2, 0, 0, 0, −⅓w1³, −w1²w2, −w1w2², −⅓w2³) (7)
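To illustrate, the following sketch, assuming NumPy and the hypothetical expand() helper shown earlier, checks that a·p(x) from Equations 6 and 7 matches the Taylor form of Equation 4 and closely approximates the exact neuron output; the sample weights and inputs are chosen arbitrarily.

```python
# A sketch (assuming NumPy and the expand() helper above) checking that
# a . p(x) from Equations 6 and 7 equals the Taylor approximation of
# tanh(w1*x1 + w2*x2); the sample weights are hypothetical.
import numpy as np

w1, w2 = 0.4, -0.3
a = np.array([0, w1, w2, 0, 0, 0,
              -w1**3 / 3, -w1**2 * w2, -w1 * w2**2, -w2**3 / 3])

x1, x2 = 0.2, 0.5
p = expand(x1, x2)                      # p(x) of Equation 6
h = w1 * x1 + w2 * x2
print(a @ p)                            # linearized neuron, Equation 5
print(h - h**3 / 3)                     # Taylor form, Equation 4 (identical)
print(np.tanh(h))                       # exact neuron output (close)
```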
As a result of dividing the DNN model of Equation 1 into the nth-order polynomial of input signals shown in Equation 6 and the polynomial of the original weight matrix W shown in Equation 7, the DNN model can be linearly approximated into the form shown in Equation 8 below (300).
Y=A·P (8)
Therefore, it is possible to solve Equation 8 in the same way as calculating a solution to a general linear system.
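For example, given observed outputs Y and expanded inputs P, the coefficient matrix A may be recovered with ordinary least squares. The sketch below, assuming NumPy, the expand() helper shown earlier, and randomly generated sample data, illustrates this linear-system treatment for the single-neuron example.

```python
# A sketch of treating Equation 8 as an ordinary linear system: given
# expanded inputs P and observed outputs Y, recover A by least squares.
# NumPy and the sample data are assumptions for illustration only.
import numpy as np

rng = np.random.default_rng(1)
w1, w2 = 0.4, -0.3                                   # hypothetical true weights

X = rng.uniform(-0.5, 0.5, size=(200, 2))            # raw input samples
P = np.array([expand(x1, x2) for x1, x2 in X])       # expanded inputs (Equation 6)
Y = np.tanh(X @ np.array([w1, w2]))                  # exact neuron outputs

A, *_ = np.linalg.lstsq(P, Y, rcond=None)            # least-squares solution of Y = P A
print(A)  # approximately (0, w1, w2, 0, 0, 0, -w1**3/3, -w1**2*w2, -w1*w2**2, -w2**3/3)
```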
According to an exemplary embodiment of the present invention, a DNN is linearly approximated and thus can be analyzed with widely used linear-system interpretation methods. Also, an exemplary embodiment of the present invention makes it possible to obtain a single-layer system, unlike a DNN, so that training and interpretation are facilitated.
Exemplary embodiments of the present invention have been described above, but those of ordinary skill in the art to which the invention pertains would appreciate that the present invention can be implemented in modified forms without departing from the fundamental characteristics of the invention. Therefore, exemplary embodiments of the present invention should be construed as describing rather than limiting the present invention in all aspects. It should be noted that the scope of the present invention is defined by the claims rather than the description of the present invention, and the meanings and ranges of the claims and all modifications derived from the concept of equivalents thereof fall within the scope of the present invention.