The present invention relates to a signal processing device, a signal processing method, and a program.
A signal source separation technique (or sound source separation technique) estimates the source signals before mixing from an observed mixed signal, and is widely used as preprocessing for speech recognition and the like. As methods of performing signal source separation using a plurality of sensors, independent component analysis (ICA, Non Patent Literature 1) and independent vector analysis (IVA, Non Patent Literature 2) are known.
As an optimization algorithm for the ICA and the IVA, an algorithm called iterative projection (IP) has been developed. As the IP, IP1 (Non Patent Literature 2) and IP2 (Non Patent Literature 3) have been developed.
As another optimization algorithm for the ICA and the IVA, an algorithm called iterative source steering (ISS, Non Patent Literature 4) has also been developed. The ISS is referred to as ISS1 in the present specification.
The IP2, which is an extension of the IP1, converges quickly but has a large calculation amount per iteration. Conversely, the ISS1 has a small calculation amount per iteration but converges slowly.
Therefore, an object of the present invention is to provide a signal processing device that achieves both fast convergence of the IP2 and a small calculation amount of the ISS1.
The signal processing device of the present invention includes a separation signal updating unit.
The separation signal updating unit solves the minimization problem, related to a separation matrix W, of an upper bound function in a majorization-minimization algorithm for the signal source separation technique IVA (independent vector analysis). It does so by dividing a mixed matrix A into submatrices A1, . . . , AL, each having d columns (d is an integer equal to or larger than 2), and updating the sets (W, Al) (l=1, . . . , L) of the separation matrix W and the submatrices A1, . . . , AL one by one, and it updates a separation signal Y according to the updating of the separation matrix W.
According to the signal processing device of the present invention, it is possible to achieve both fast convergence of the IP2 and a small calculation amount of the ISS1.
Hereinafter, an embodiment of the present invention will be described in detail. Note that components having the same functions are denoted by the same reference numerals, and redundant description will be omitted.
A problem of signal source separation handled in the present invention is defined as follows. It is assumed that X is an observation signal and is a product of a mixed matrix A and a matrix S in which source signals are arranged (expression (1)).
It is assumed that K≥1, that m∈N is the number of sensors, that n∈N is the number of sample points, that X[k]∈Cm×n is an observation signal, that S[k]∈Cm×n contains the m source signals, and that A[k]∈GL(m) is a mixed matrix, for k=1, . . . , K.
In order to estimate the source signals S, a separation matrix W (=A−1), that is, the inverse matrix of A, may be estimated instead of the mixed matrix A. The separation result is Y=WX. The separation matrix W[k]∈GL(m), k=1, . . . , K, is defined by the following expression (2).
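The relationships X = AS (expression (1)) and Y = WX can be illustrated with a minimal NumPy sketch; the sizes and the random Laplace-distributed sources below are illustrative assumptions, not values from the specification:

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 3, 1000  # number of sensors and sample points (illustrative)

# Source signals S and a random invertible mixed matrix A (expression (1): X = A S).
S = rng.laplace(size=(m, n)) + 1j * rng.laplace(size=(m, n))
A = rng.normal(size=(m, m)) + 1j * rng.normal(size=(m, m))
X = A @ S

# With the true inverse as the separation matrix, Y = W X recovers S exactly.
W = np.linalg.inv(A)
Y = W @ X
```

In practice A is unknown and W is estimated only up to the scale and permutation ambiguities described below.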
D[k] and Π are a diagonal matrix and a permutation matrix, each of size m×m, and respectively correspond to the scale ambiguity and the permutation ambiguity of the separation signal represented as follows.
A model of the signal source separation technique IVA handled in the present invention is defined as follows. In the IVA, it is assumed that a multivariate vector which has a length K and is given by the following expression (3) follows a probability density function having a correlation of a second order or higher.
It is assumed that the random variables {yij}ij are mutually independent. In this model, as a cost function for optimizing the separation matrix represented as follows,
a negative log likelihood represented by the following expression (4) can be used.
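A sketch of the cost computation follows, assuming the standard IVA form of the negative log likelihood, Σij G(∥yij∥2) − n Σk log|det W[k]|; the function name `iva_cost` and the default contrast G(r) = r (the Laplace case) are illustrative assumptions, since the expression (4) itself is not reproduced in this text:

```python
import numpy as np

def iva_cost(W, X, G=lambda r: r):
    """Negative log likelihood of IVA (a sketch, assuming the standard form
    sum_{i,j} G(||y_ij||_2) - n * sum_k log|det W[k]|).
    W: (K, m, m) separation matrices, X: (K, m, n) observation signals."""
    K, m, n = X.shape
    Y = W @ X                          # Y[k] = W[k] X[k] for each k
    r = np.linalg.norm(Y, axis=0)      # ||y_ij||_2 across k, shape (m, n)
    _, logabsdet = np.linalg.slogdet(W)  # log|det W[k]| for each k
    return G(r).sum() - n * logabsdet.sum()
```

Minimizing this cost over W is the optimization problem that all of the algorithms discussed below address.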
The optimization of the separation matrix W is performed so as to minimize the negative log likelihood represented by the expression (4). The related-art algorithms IP1, IP2, and ISS1 and the algorithm ISS2 according to the present invention are all iterative algorithms for solving the "optimization problem of minimizing the expression (4) with respect to W".
The algorithms IP1, IP2, and ISS1 in the related art and the algorithm ISS2 belonging to the present invention are algorithms belonging to a framework called a majorization-minimization algorithm (MM algorithm). The MM algorithm for the IVA is as follows.
The MM algorithm for the ICA has been proposed in referenced Non Patent Literatures 1 to 3.
Here, assuming that p(y) is a symmetric probability density function,
is defined by the following expression.
When G′(r)/r monotonically decreases on r∈(0, ∞)=R>0, p(y) is said to be a super-Gaussian distribution. Here, G′ is the first derivative of G (refer to referenced Non Patent Literatures 1, 2, 4 (pp. 60-61), and 5).
For example, a generalized Gaussian distribution (GGD) given by the expression (5) is a super-Gaussian distribution.
The GGD with β=1 is the Laplace distribution. When G(r) corresponds to a super-Gaussian distribution, it is known (referenced Non Patent Literatures 1, 2, and 5) that there exists a function φ: R≥0→R satisfying the expression (6).
The right side of the expression (6) attains its minimum value at λ=G′(r)/r. When the expression (6) is applied to −log p(yij)=G(∥yij∥2) in the expression (4), a surrogate function (or upper bound function) L2(W, Λ) is obtained for L1(W).
Under the surrogate function (expression (8)), the MM algorithm for the IVA (referenced Non Patent Literature 3) alternately updates Λ and W based on the expression (11) and the expression (12).
Hereinafter, when discussing L3[k], the factor Λ will be omitted. From the expression (7), the expression (11) is solved as the following expression (13).
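For the Laplace case G(r) = r, G′(r) = 1 and the update of the expression (13) reduces to λij = 1/∥yij∥2. A minimal sketch of this auxiliary-variable update follows, with a small ε added to the denominator for numerical stability; the function name and default arguments are illustrative assumptions:

```python
import numpy as np

def update_lambda(Y, G_prime=lambda r: np.ones_like(r), eps=1e-10):
    """Auxiliary-variable update (a sketch of expression (13)):
    lambda_ij = G'(||y_ij||_2) / ||y_ij||_2, stabilized by a small eps.
    Y: (K, m, n) separation signal; G'(r) = 1 corresponds to the Laplace case."""
    r = np.linalg.norm(Y, axis=0)     # ||y_ij||_2, shape (m, n)
    return G_prime(r) / (r + eps)
```

Other super-Gaussian contrasts are handled by passing the corresponding G′ in place of the default.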
For the expression (12), in a case of m=2, an analytical solution is obtained (referenced Non Patent Literatures 6 and 7).
On the other hand, in a case of m≥3, no algorithm is known for obtaining a globally optimal solution of the expression (12). For this reason, the IP1, the IP2, and the ISS1 have been developed as block coordinate descent (BCD) methods for solving the expression (12). These algorithms are referred to as MM+BCD. The present invention discloses ISSd as a new MM+BCD.
Hereinafter, in order to simplify notations, the upper right index [k] will be omitted when describing the expression (12).
The difference between the related-art algorithms IP1, IP2, and ISS1 and the algorithm ISSd disclosed in the present specification lies in how the "optimization problem (expression (12)), related to the separation matrix W, of the upper bound function (expression (9))" in the MM algorithm is solved.
The ISS in the related art (referenced Non Patent Literature 8, ISS1) is an algorithm that updates A one column at a time in each iteration.
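Although this specification defines the ISS updates through the expressions (16) to (18), which are not reproduced in this text, a column-by-column rank-1 sweep in the style of the referenced ISS literature can be sketched as follows; the exact update formulas are an assumption based on that literature, not taken from this document:

```python
import numpy as np

def iss1_step(Y, Lam):
    """One ISS1-style sweep (a sketch following the referenced iterative
    source steering literature): for each source l, subtract a rank-1
    correction v y_l from Y, which updates A one column at a time.
    Y: (K, m, n) separation signal (updated in place), Lam: (m, n) weights."""
    K, m, n = Y.shape
    for l in range(m):
        yl = Y[:, l, :]                                   # (K, n)
        v = np.empty((K, m), dtype=Y.dtype)
        for i in range(m):
            num = np.einsum('j,kj->k', Lam[i], Y[:, i, :] * yl.conj())
            den = np.einsum('j,kj->k', Lam[i], np.abs(yl) ** 2)
            if i == l:
                # Rescale source l to unit weighted power.
                v[:, i] = 1.0 - 1.0 / np.sqrt(den / n)
            else:
                v[:, i] = num / den
        Y -= v[:, :, None] * yl[:, None, :]               # rank-1 update
    return Y
```

Each iteration of the outer loop is a rank-1 modification of Y (equivalently of W), which is what keeps the per-iteration calculation amount small.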
The ISS2 disclosed in this specification is an algorithm that updates A two columns at a time in each iteration. In order to extend the ISS1 to the ISS2, a unified method is disclosed for deriving, for a given number d≥1, an ISSd that updates A d columns at a time.
Here, d is a divisor of m, and A is divided into L = m/d submatrices A1, . . . , AL, each having d columns.
The ISSd is an MM+BCD method that, for each l=1, . . . , L, updates (W, Al) based on the expression (16) and updates the expression (15) based on the expression (11). In a case of d=1, this definition of the ISSd is consistent with the ISS1 in the related art (referenced Non Patent Literature 8).
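The column-wise partition of A into L submatrices (expression (15)) and the outer loop of an ISSd sweep can be sketched as follows; the sizes m = 6 and d = 2 are illustrative, and the per-block update itself (expressions (16) to (18)) is omitted:

```python
import numpy as np

m, d = 6, 2                 # d must be a divisor of m (illustrative sizes)
L = m // d                  # number of submatrices
A = np.arange(m * m, dtype=float).reshape(m, m)

# Split A column-wise into L submatrices A_1, ..., A_L with d columns each.
blocks = [A[:, l * d:(l + 1) * d] for l in range(L)]

# One ISSd sweep visits the blocks one by one.
for l, A_l in enumerate(blocks):
    pass  # update (W, A_l) per the expressions (16)-(18), not shown here
```

For d = 1 each block is a single column (ISS1); for d = 2 each block has two columns (ISS2).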
The ISSd can be described as a multiplicative updating algorithm for W (or Y=WX). When l=1, updating (W, A1) based on the expression (16) is equivalent to the following multiplicative update related to W (and A).
Here, D in the expression (17) is defined as follows.
Also for general l=1, . . . , L, the multiplicative updating expressed by the expression (17) and the expression (18) can be realized by updating (W, Al) based on the expression (16) as follows. For the multiplicative updating, a permutation matrix defined by
is prepared. By updating the separation matrix W, the mixed matrix A, the separation signal Y, and the auxiliary variable in advance according to
using the permutation matrix, updating (W, Al) based on the expression (16) is equivalent to updating (W, Al) based on the multiplicative updating expression represented by the expression (17) and the expression (18).
Since the problem (17) can be solved analytically when d=2, the ISS2 disclosed in the present invention is an algorithm described using only the following analytical updating expressions.
(The algorithm listing of the ISS2 is not reproduced here.) In the listing, the input X[k] and the output Y[k] are m×n matrices for k = 1, . . . , K, the auxiliary variables are computed as λij = G′(∥yij∥2)/(∥yij∥2 + ε), where ε = 10−10 is added to improve numerical stability, and each 2×2 subproblem is solved analytically using the expressions (28) to (30).
Example 1 described below discloses a signal processing device 1 that implements an algorithm ISSd (d is a certain natural number) for solving the optimization problem (expression (12)) related to the separation matrix W by the method described in <Definition of ISSd>. As described above, the ISSd is an extension of the related-art technique ISS1.
Specifically, as shown in the expression (15), the ISSd is an algorithm that updates the sets (W, Al) one by one according to the optimization problem (expression (16)). Since the update rule (expression (16)) is equivalent to the update rule of the expression (17) and the expression (18), (W, Al) is updated according to the expression (17) and the expression (18). Updating the separation matrix W by updating a part of the mixed matrix A is the characteristic policy of the ISS.
As described above, the algorithm ISSd with d=1 matches the ISS1 in the related art, and the algorithm ISSd with d=2 corresponds to the ISS2 disclosed in the present example.
Hereinafter, a functional configuration of a signal processing device 1 according to the present example will be described with reference to
The initial value setting unit 11 sets an appropriate initial value for the separation matrix W and calculates the initial value Y of the separation signal by Y=WX (S11).
The auxiliary variable updating unit 12 repeatedly updates the auxiliary variable Λ under the control of the control unit 14 (S12).
The separation signal updating unit 13 repeatedly updates the separation signal Y under the control of the control unit 14 (S13). Specifically, in the optimization problem (expression (12)), related to the separation matrix W, of the upper bound function in the majorization-minimization algorithm for the signal source separation technique IVA (independent vector analysis), the separation signal updating unit 13 divides the mixed matrix A into the submatrices A1, . . . , AL, each having d columns (d is an integer equal to or larger than 2) (expression (15)), and repeatedly updates the sets (W, Al) (l=1, . . . , L) of the separation matrix W and the submatrices one by one according to the minimization problem, related to W, of the upper bound function (expression (16)), thereby repeatedly updating the separation signal Y (=WX) (S13).
Since updating the separation matrix W is equivalent to updating the separation signal Y, it is sufficient to update only the separation signal Y without updating the separation matrix W.
The control unit 14 controls the auxiliary variable updating unit 12 and the separation signal updating unit 13 to alternately and repeatedly execute processing until a predetermined condition is satisfied.
As the predetermined condition, a condition that a predetermined number of repetitions is reached or a condition that an update amount of each parameter becomes equal to or lower than a predetermined threshold value may be used.
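The control flow of steps S11 to S13 can be sketched as follows; `update_lambda` and `update_Y` are hypothetical stand-ins for the processing of the auxiliary variable updating unit 12 and the separation signal updating unit 13, and the stopping criteria mirror the two predetermined conditions above:

```python
import numpy as np

def separate(X, update_lambda, update_Y, max_iter=100, tol=1e-6):
    """Alternating MM loop (a sketch of S11-S13): set an initial W,
    then alternate the auxiliary-variable update and the separation-signal
    update until an iteration limit or a small update amount is reached."""
    K, m, n = X.shape
    W = np.tile(np.eye(m, dtype=X.dtype), (K, 1, 1))  # S11: initial value
    Y = W @ X                                         # initial Y = W X
    for _ in range(max_iter):                         # repetition-count condition
        Lam = update_lambda(Y)                        # S12
        Y_prev = Y.copy()
        Y = update_Y(Y, Lam)                          # S13
        if np.linalg.norm(Y - Y_prev) <= tol:         # update-amount condition
            break
    return Y
```

Since updating W is equivalent to updating Y, the sketch tracks only Y inside the loop, as the specification notes.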
The device according to the present invention as a single hardware entity includes, for example, an input unit to which a keyboard or the like can be connected, an output unit to which a liquid crystal display or the like can be connected, a communication unit to which a communication device (for example, a communication cable) capable of communicating with the outside of the hardware entity can be connected, a central processing unit (a CPU, which may include a cache memory, registers, and the like), a RAM and a ROM as memories, an external storage device such as a hard disk, and a bus that connects the input unit, the output unit, the communication unit, the CPU, the RAM, the ROM, and the external storage device such that data can be exchanged therebetween. A device (drive) or the like that can write and read data to and from a recording medium such as a CD-ROM may be provided in the hardware entity as necessary. Examples of a physical entity including such hardware resources include a general-purpose computer.
The external storage device of the hardware entity stores a program required for implementing the above-described functions, data required for processing of the program, and the like (the program may be stored, for example, in a ROM as a read-only storage device instead of the external storage device). Further, data and the like obtained by processing of the program are appropriately stored in the RAM, the external storage device, or the like.
In the hardware entity, each program stored in the external storage device (or ROM or the like) and data required for processing of each program are read into a memory as necessary and are appropriately interpreted and processed by the CPU. Thereby, the CPU implements a predetermined function (each component represented as . . . unit, . . . means, or the like).
The present invention is not limited to the above-described embodiment and can be appropriately modified without departing from the gist of the present invention. Moreover, the processing described in the above embodiment may be executed not only in time-series according to the described order, but also in parallel or individually according to the processing capability of the device that executes the processing or as necessary.
As described above, in a case where the processing function of the hardware entity (the device according to the present invention) described in the above embodiment is implemented by a computer, processing content of the function of the hardware entity is described by a program. In addition, the computer executes the program, and thus, the processing functions of the hardware entity are implemented on the computer.
The above-described various types of processing can be performed by causing a recording unit 10020 of a computer 10000 illustrated in
The program in which the processing content is described can be recorded in a computer-readable recording medium. The computer-readable recording medium may be, for example, any recording medium such as a magnetic recording device, an optical disc, a magneto-optical recording medium, or a semiconductor memory. Specifically, for example, a hard disk device, a flexible disk, a magnetic tape, or the like can be used as the magnetic recording device; a digital versatile disc (DVD), a DVD random access memory (DVD-RAM), a compact disc read only memory (CD-ROM), a CD recordable/rewritable (CD-R/RW), or the like can be used as the optical disc; a magneto-optical disc (MO) or the like can be used as the magneto-optical recording medium; and an electrically erasable and programmable read only memory (EEP-ROM) or the like can be used as the semiconductor memory.
In addition, the program is distributed by, for example, selling, transferring, or renting a portable recording medium such as a DVD or a CD-ROM in which the program is recorded. Further, a configuration in which the program is stored in a storage device of a server computer and the program is distributed by transferring the program from the server computer to other computers via a network may also be employed.
For example, a computer that executes such a program first temporarily stores a program recorded on a portable recording medium or a program transferred from the server computer in a storage device of the computer. In addition, when executing processing, the computer reads the program stored in the recording medium of the computer and executes the processing according to the read program. Further, in other modes of execution of the program, the computer may read the program directly from a portable recording medium and execute processing according to the program, or alternatively, the computer may sequentially execute processing according to a received program every time a program is transferred from the server computer to the computer. In addition, the above-described processing may be executed by a so-called application service provider (ASP) type service that implements a processing function only by an execution instruction and result acquisition without transferring the program from the server computer to the computer. Note that the program in the present embodiment includes information that is used for processing by an electronic computer and is equivalent to the program (data or the like that is not a direct command to the computer but has property that defines processing performed by the computer).
In addition, although the hardware entity is configured by a predetermined program being executed on a computer in this mode, at least some of the processing contents may be implemented by hardware.
Filing Document | Filing Date | Country | Kind
---|---|---|---
PCT/JP2021/047856 | 12/23/2021 | WO |