The present invention relates to telecommunication systems and more particularly to Multi-User Multiple-Input-Multiple-Output (MU-MIMO) in mmWave systems.
In telecommunication, there exists the classical problem of downlink (DL) Multi-User Multiple-Input-Multiple-Output (MU-MIMO) scheduling with linear transmit precoding. Recently, MU-MIMO with linear transmit precoding has been increasingly pursued by the industry as a key technology, with a strong emphasis on efficient scheduling algorithms.
However, the intractable combinatorial nature of the problem has so far restricted algorithm design to the realm of simple greedy heuristics. Such algorithms do not exploit any underlying structure in the problem.
There is a need for an improved approach to the problem of MU-MIMO.
According to an aspect of the present invention, a computer-implemented method is provided for downlink scheduling in a Multi-User Multiple Input Multiple Output (MU-MIMO) telecommunication system. The method includes identifying, by a base station, for each of multiple virtual users which collectively form a ground set of virtual users, a respective transmit precoder and receive beamformer combination that maximizes a difference between two submodular set functions applied over the ground set of virtual users, from among a plurality of combinations formed from a respective one of a plurality of transmit precoders and a respective one of a plurality of receive beamformers. The method further includes transmitting, by the base station, data from at least some of the multiple virtual users, based on a downlink transmission schedule determined from the respective transmit precoder and receive beamformer combination identified for the at least some of the multiple virtual users. The ground set of virtual users is formed from respective combinations of multiple actual users and the plurality of receive beamformers. The two submodular set functions correspond to an achievable virtual user transmission rate.
According to another aspect of the present invention, a base station is provided for downlink scheduling in a Multi-User Multiple Input Multiple Output (MU-MIMO) telecommunication system. The base station includes a processor configured to identify, for each of multiple virtual users which collectively form a ground set of virtual users, a respective transmit precoder and receive beamformer combination that maximizes a difference between two submodular set functions applied over the ground set of virtual users, from among a plurality of combinations formed from a respective one of a plurality of transmit precoders and a respective one of a plurality of receive beamformers. The base station further includes a transmitter configured to transmit data from at least some of the multiple virtual users, based on a downlink transmission schedule determined from the respective transmit precoder and receive beamformer combination identified for the at least some of the multiple virtual users. The ground set of virtual users is formed from respective combinations of multiple actual users and the plurality of receive beamformers. The two submodular set functions correspond to an achievable virtual user transmission rate.
These and other features and advantages will become apparent from the following detailed description of illustrative embodiments thereof, which is to be read in connection with the accompanying drawings.
The disclosure will provide details in the following description of preferred embodiments with reference to the following figures wherein:
The present invention is directed to Multi-User Multiple-Input-Multiple-Output (MU-MIMO) in mmWave systems.
Herein, practical choices of linear precoding and power allocation are considered, and it is shown that the resulting problem can be expressed as one where a difference of two submodular set functions has to be maximized. This opens up a new framework for MU-MIMO scheduler design. This framework is used to design an algorithm and demonstrate that gains can be achieved over the classical greedy heuristic with a reasonable complexity. The framework can also incorporate analog receive beamforming which is deemed to be essential in mmWave MIMO systems.
A first storage device 122 and a second storage device 124 are operatively coupled to system bus 102 by the I/O adapter 120. The storage devices 122 and 124 can be any of a disk storage device (e.g., a magnetic or optical disk storage device), a solid state magnetic device, and so forth. The storage devices 122 and 124 can be the same type of storage device or different types of storage devices.
A speaker 132 is operatively coupled to system bus 102 by the sound adapter 130. A transceiver 142 is operatively coupled to system bus 102 by network adapter 140. A display device 162 is operatively coupled to system bus 102 by display adapter 160.
A first user input device 152, a second user input device 154, and a third user input device 156 are operatively coupled to system bus 102 by user interface adapter 150. The user input devices 152, 154, and 156 can be any of a keyboard, a mouse, a keypad, an image capture device, a motion sensing device, a microphone, a device incorporating the functionality of at least two of the preceding devices, and so forth. Of course, other types of input devices can also be used, while maintaining the spirit of the present principles. The user input devices 152, 154, and 156 can be the same type of user input device or different types of user input devices. The user input devices 152, 154, and 156 are used to input and output information to and from system 100.
Of course, the processing system 100 may also include other elements (not shown), as readily contemplated by one of skill in the art, as well as omit certain elements. For example, various other input devices and/or output devices can be included in processing system 100, depending upon the particular implementation of the same, as readily understood by one of ordinary skill in the art. For example, various types of wireless and/or wired input and/or output devices can be used. Moreover, additional processors, controllers, memories, and so forth, in various configurations can also be utilized as readily appreciated by one of ordinary skill in the art. These and other variations of the processing system 100 are readily contemplated by one of ordinary skill in the art given the teachings of the present principles provided herein.
Moreover, it is to be appreciated that system 200 described below with respect to
Further, it is to be appreciated that processing system 100 may perform at least part of the method described herein including, for example, at least part of method 400 of
Referring to
At step 410, identify, by a base station, for each of multiple virtual users which collectively form a ground set of virtual users, a respective transmit precoder and receive beamformer combination that maximizes a difference between two submodular set functions applied over the ground set of virtual users or over subsets of the ground set. The respective transmit precoder and receive beamformer combination for each of the multiple virtual users can be identified from among a plurality of combinations formed from a respective one of a plurality of transmit precoders and a respective one of a plurality of receive beamformers.
In an embodiment, the ground set of virtual users can be formed from respective combinations of multiple actual users and the plurality of receive beamformers. In an embodiment, a size of the ground set of virtual users can be constrained relative to a value of a user channel vector.
In an embodiment, the two submodular set functions correspond to an achievable virtual user transmission rate. For example, in the case of Maximal Ratio Transmission (MRT), the two submodular set functions can correspond to ƒψMRT (A) and gψMRT(A). In the case of Zero Forcing (ZF), the two submodular set functions can correspond to ƒψZF (A) and gψZF (A). In the case of Block Diagonalization (BD), the two submodular set functions can correspond to ƒψBD (A) and gψBD (A). In an embodiment, the achievable virtual user transmission rate can be determined relative to one or more of the virtual users in the ground set of virtual users.
In an embodiment, step 410 can include steps 410A-410E.
At step 410A, construct the plurality of transmit precoders under a constraint that each of the multiple virtual users will receive data only in a time interval corresponding to a respective user ranking from among a plurality of user rankings.
At step 410B (corresponding to MRT), construct, for a scheduled one of the multiple virtual users, a corresponding one of the plurality of transmit precoders using Maximal Ratio Transmission, based on a channel matrix and a selected one of the plurality of receive beamformers in the respective transmit precoder and receive beamformer combination for the scheduled one of the multiple virtual users.
At step 410C (corresponding to ZF), construct for a scheduled one of the multiple virtual users, a corresponding one of the plurality of transmit precoders using Zero Forcing, based on a channel matrix and a selected one of the plurality of receive beamformers in the respective transmit precoder and receive beamformer combination for the scheduled one and any co-scheduled ones of the multiple virtual users.
At step 410D (corresponding to BD), construct, for a scheduled one of the multiple virtual users, a corresponding one of the plurality of transmit precoders using Block Diagonalization, based on a constraint that accounts for noise coloring due to receive beamforming by mandating that all of the multiple virtual users that correspond to a same one of the multiple actual users have receive beamforming vectors that are orthogonal with respect to each other.
At step 410E (corresponding to BD), construct, for a scheduled one of the multiple virtual users, a corresponding one of the plurality of transmit precoders using Block Diagonalization, based on a constraint that sets a per stream power level by limiting an overall number of downlink streams used by the at least some of the multiple virtual users at a given same time.
At step 420, transmit, by the base station, data from at least some of the multiple virtual users, based on a downlink transmission schedule determined from the respective transmit precoder and receive beamformer combination identified for the at least some of the multiple virtual users.
A further detailed description will now be given regarding various aspects of the present invention, in accordance with one or more embodiments of the present invention.
In an embodiment, the classical DL MU-MIMO system is considered with Mt transmit antennas at the base station (BS) and Mr receive antennas at each user. K active users are presumed in the cell of interest, with a focus on data transmission on a resource block in each scheduling interval. Without loss of generality, in the following analysis, each resource block is presumed to be of unit size, on which each user sees a frequency non-selective channel. Then, the signal received by the kth user is modeled as follows:
$$y_k = H_k x + \eta_k, \quad k = 1, \ldots, K, \tag{1}$$
where Hk∈CM
Define A=[Ak]k∈A, where Ak=√ρ′k, ∀k∈A, as the scaled and concatenated transmit precoding matrix of size Mt×S for MU-MIMO transmission. Each user, in order to receive its data, employs an RF analog receive beamforming front-end followed by baseband linear detection. Such an architecture is significantly preferred in mmWave systems. Herein, the present invention incorporates the practically meaningful scenario in which each user uses a codebook W for analog receive beamforming. To describe the data reception, we focus on any user k. To receive data sent on each one of its rk streams, user k employs rk unit-norm beamforming vectors from W. Let Gk denote the Mr×rk matrix whose columns are these beamforming vectors. The received signal post receive beamforming is down-converted and detected at baseband. Two types of detection methods are considered at baseband. The first one is the simplest method of detection at the baseband, in which no further mitigation is carried out to suppress inter-stream residual interference. This method is referred to as the matched filter (MF) baseband detector. The resulting signal-to-interference plus noise ratio (SINR) for the ith stream (or layer) of the kth user is given by the following:
where [.]i,j is the (i,j)th element of the matrix argument. The corresponding information rate is given by the following:
ηi,k=log(1+γi,k) (3)
Hence, the information rate over all the streams of user k can be written as follows: Rk=Σi=1r
$$R_k = \log\left| I + Q_k^{-1} G_k^\dagger H_k A_k \left(G_k^\dagger H_k A_k\right)^\dagger \right| \tag{4}$$
where Qk=Gk†Gk+Σl∈A\kGk†HkAl(Gk†HkAl)† represents the covariance matrix of the additive noise plus the interference from streams intended for other users. Note that the additive noise is colored by the receive beamforming operation.
The three linear transmit precoding methods considered herein, which cover all the main practical ones, are now outlined. Consider any given user set U along with a rank vector r. In all these methods, it is presumed for precoder construction that each user k∈U that is assigned rank rk will receive data only in the span of its chosen receive beamforming vectors in Gk. Consequently, the rk×Mt matrix is defined as follows: H̃k=Gk†Hk.
The construction of the transmit precoder matrices then proceeds by using the matrices {H̃k}k∈U.
A description will now be given regarding a problem formulation, to which the present invention can be applied, in accordance with an embodiment of the present invention.
Our objective in the subsequent sections is to design efficient algorithms to optimize Σk∈UwkRk, where wk is the weight or priority assigned to user k, under certain practical constraints. Due to space constraints, in an embodiment, only the most natural pairings of precoder construction and receiver detection are considered, which are to use either MRT or ZF transmit precoding with the MF baseband detection. On the other hand, in an embodiment, BD precoding is used in conjunction with optimal baseband detection. Note that for each such combination of the aforementioned transmit precoder construction and receiver detection methods, the resulting weighted sum rate depends on the choice of user set U as well as the choice of transmit ranks and the receive beamforming vectors. Moreover, there can be a non-linear dependence (or coupling) between the choice of receive beamforming vectors and the transmit precoder construction. As a result, the optimization problem at hand appears to be intractable at first glance.
A description will now be given regarding a structure in the rate expression, to which the present invention can be applied, in accordance with an embodiment of the present invention.
Initially, both MRT and ZF transmit precoders with matched filter baseband detection are considered. Our first observation then is that we can regard each user and receive beamformer combination as a virtual user. In particular, consider any stream of any user k that is received along any beamformer w∈W, and define ψ as the corresponding virtual user with its channel given by the 1×Mt vector, zψ†=w†Hk. Then, the received statistic for this virtual user can be written as follows:
$$y_\psi = z_\psi^\dagger x + \eta_\psi \tag{5}$$
where ηψ∼CN(0, 1). Define a ground set Ψ of all virtual users ψ such that zψ≠0, so that the size of Ψ is at most K|W|. Consider any choice of co-scheduled virtual users A⊂Ψ. Suppose MRT precoding is used at the BS, so that the transmit precoding vector for virtual user ψ is given by vψ=zψ/∥zψ∥. For this choice, using (5) and (3), the rate for virtual user ψ∈A is given by the following:
On the other hand, for any ψ∈Ψ\A, set Rψ(A)=0. The following result is provided that reveals the structure in the rate expression.
Proposition 1. The rate achieved by any virtual user ψ∈Ψ under MRT precoding and set A⊂Ψ:A≠φ, can be expressed as follows:
Further, for A=φ, we define Rψ(φ)=0, where φ denotes the empty set, with ƒψMRT(φ)=gψMRT(φ)=−ln(2). Then, the set functions ƒψMRT(.), gψMRT(.) are both submodular set functions over the set Ψ.
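The virtual-user construction and the MRT precoder described above can be sketched as follows. This is a minimal illustration rather than the claimed implementation: the function names are hypothetical, and because the rate expression (6) is not reproduced here, the sketch assumes an equal power split ρ/|A| across co-scheduled virtual users and unit-variance noise, with interference treated as noise.

```python
import math
from itertools import product

def inner(a, b):
    """Return a^dagger b for two complex vectors given as lists."""
    return sum(x.conjugate() * y for x, y in zip(a, b))

def build_ground_set(channels, codebook, tol=1e-12):
    """channels: dict user -> Mr x Mt matrix H_k (list of rows).
    codebook: list of length-Mr receive beamformers w.
    Returns dict (user, beam index) -> z_psi, where z_psi^dagger = w^dagger H_k."""
    ground = {}
    for (k, H), (b, w) in product(channels.items(), enumerate(codebook)):
        # Row vector w^dagger H_k; conjugating it gives the column vector z_psi.
        row = [sum(w[m].conjugate() * H[m][t] for m in range(len(H)))
               for t in range(len(H[0]))]
        z = [x.conjugate() for x in row]
        if inner(z, z).real > tol:  # keep only virtual users with z_psi != 0
            ground[(k, b)] = z
    return ground

def mrt_rates(z, A, rho):
    """Per-virtual-user rates (in nats) under MRT, v_psi = z_psi / ||z_psi||.
    ASSUMPTIONS (not taken from the source): equal power rho/|A| per stream,
    unit-variance noise, interference treated as noise."""
    A = list(A)
    if not A:
        return {}
    p = rho / len(A)
    v = {a: [x / math.sqrt(inner(z[a], z[a]).real) for x in z[a]] for a in A}
    rates = {}
    for a in A:
        signal = p * abs(inner(z[a], v[a])) ** 2
        interference = p * sum(abs(inner(z[a], v[b])) ** 2 for b in A if b != a)
        rates[a] = math.log(1 + signal / (1 + interference))
    return rates

# Toy example: K = 2 users, a two-entry codebook, Mt = 2 transmit antennas.
channels = {1: [[1, 0], [0, 1]], 2: [[0, 0], [1, 1j]]}
codebook = [[1, 0], [0, 1]]
ground = build_ground_set(channels, codebook)
rates = mrt_rates(ground, [(1, 0), (1, 1)], rho=2.0)
```

In the toy example, the pair (user 2, beam 0) sees an all-zero effective channel and is dropped, so the ground set has three members, which respects the bound K|W| = 4.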
The more complicated case of ZF precoding is now considered. The key complication here that we need to overcome is that the transmit precoder for each user depends not only on its channel matrix and choice of receive beamformers, but also on those of other co-scheduled users. Moreover, the latter dependence is non-linear. We again use the virtual user concept and recall the model in (5) for some virtual user ψ∈Ψ. Consider any choice of co-scheduled virtual users A⊂Ψ and define the matrix ZA=[zψ]ψ∈A along with ZA\ψ=[zψ′]ψ′∈A\ψ, and suppose that the matrix ZA†ZA is invertible. The ZF matrix is given by ZA(ZA†ZA)−1D, where D is the diagonal matrix normalizing the columns of ZA(ZA†ZA)−1. The rate for virtual user ψ∈A can be expressed as follows:
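$$R_\psi(A) = \ln\left(1 + \frac{\rho \|z_\psi\|^2}{|A|} - \frac{\rho\, z_\psi^\dagger Z_{A\setminus\psi}\left(Z_{A\setminus\psi}^\dagger Z_{A\setminus\psi}\right)^{-1} Z_{A\setminus\psi}^\dagger z_\psi}{|A|}\right) \tag{8}$$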
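On the other hand, for any ψ∈Ψ\A, set Rψ(A)=0. Note that
$$\mathrm{Res}(\psi, A\setminus\psi) \triangleq \|z_\psi\|^2 - z_\psi^\dagger Z_{A\setminus\psi}\left(Z_{A\setminus\psi}^\dagger Z_{A\setminus\psi}\right)^{-1} Z_{A\setminus\psi}^\dagger z_\psi$$
is the squared norm of the component of zψ in the orthogonal complement of ZA\ψ.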
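The residual term and the ZF rate in (8) can be evaluated without forming a matrix inverse, since the residual is the squared norm left after projecting zψ away from the span of the other co-scheduled channels. The sketch below uses Gram-Schmidt orthogonalization for that projection; the function names are hypothetical, and skipping linearly dependent channels is a choice made here for robustness, whereas (8) itself presumes that ZA†ZA is invertible.

```python
import math

def inner(a, b):
    """Return a^dagger b for two complex vectors given as lists."""
    return sum(x.conjugate() * y for x, y in zip(a, b))

def residual(z_psi, others, tol=1e-12):
    """Res(psi, A \\ psi): squared norm of the part of z_psi that is
    orthogonal to the span of the vectors in `others`."""
    basis = []
    for v in others:                      # Gram-Schmidt on the other channels
        v = list(v)
        for q in basis:
            c = inner(q, v)
            v = [x - c * y for x, y in zip(v, q)]
        norm = math.sqrt(inner(v, v).real)
        if norm > tol:                    # skip (numerically) dependent vectors
            basis.append([x / norm for x in v])
    r = list(z_psi)
    for q in basis:                       # remove the projection onto the span
        c = inner(q, r)
        r = [x - c * y for x, y in zip(r, q)]
    return inner(r, r).real

def zf_rate(z, A, psi, rho):
    """Rate of virtual user psi in A per (8): ln(1 + rho * Res / |A|)."""
    others = [z[a] for a in A if a != psi]
    return math.log(1 + rho * residual(z[psi], others) / len(A))

# Toy example with two virtual users and Mt = 2.
z = {"a": [1, 1], "b": [1, 0]}
rate_a = zf_rate(z, ["a", "b"], "a", rho=2.0)
rate_b = zf_rate(z, ["a", "b"], "b", rho=2.0)
```

When a virtual user is scheduled alone, the list of other channels is empty and the residual reduces to ∥zψ∥2, which matches (8) with |A|=1.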
We now proceed to unearth the structure in this rate expression. Towards this end, let us first define the matrix
$$B = \rho Z_\Psi^\dagger Z_\Psi \tag{9}$$
with the understanding that BA, ∀ A⊂Ψ is the principal submatrix of B with row and column indices drawn from A. Note that BA=ρZA†ZA ∀ A⊂Ψ. Along similar lines, for each virtual user ψ∈Ψ and any scalar a≧0, let us define the matrix,
C(a,ψ)=aeψeψ†+ρZΨ†ZΨ (10)
where eψ is a |Ψ|×1 vector that has a one in its ψth element and zeros everywhere else. As before, let CA(a,ψ) ∀A⊂Ψ be the principal submatrix of C(a,ψ) with row and column indices drawn from A. Let us next define a family of subsets, I, of Ψ such that φ∈I and all subsets A of Ψ for which BA is invertible are members of I and conversely for any non-empty member A∈I, BA is invertible. It is readily seen that this family is downward closed and that all singleton sets {ψ}: ψ∈Ψ are members of I.
Our next result reveals that it is possible to write (8) in a more amenable form. The convention that 0 ln(0)=0 is adopted, and ln|.| returns zero whenever the input matrix is an empty or null matrix.
Proposition 2. The rate achieved by any virtual user under ZF precoding can be expressed as follows:
The functions ƒψZF(.), gψZF(.) are both submodular over the family I.
The case where the BS employs BD transmit precoding and each user employs the optimum baseband detector will now be analyzed. In this case, the rate across all virtual users that correspond to the same (real) user should be jointly considered. Furthermore, the coloring of the noise due to receive beamforming should be accounted for. To make the problem tractable, we follow an approach where we first assume that the power per stream (virtual user) is given and does not vary with the number of selected virtual users. This assumption results in no loss of optimality if we also consider all possible total numbers of streams that can be scheduled, and solve the problem at hand for each such total number. In particular, for each value, S, of the total number of streams, we fix the power per stream to be ρ̂=ρ/S and solve the weighted sum rate maximization under the constraint that no more than S streams can be scheduled. Then, suppose that we are given any value for the power per stream ρ̂. Let u:Ψ→{1, . . . , K} denote a scalar valued function which returns the actual user corresponding to any virtual user in Ψ. Similarly, let w:Ψ→W denote a vector valued function which returns the receive beamforming vector corresponding to any virtual user in Ψ. We will use the index k∈{1, . . . , K} to denote an actual user. For each user k∈{1, . . . , K}, define the matrix as follows:
$$F^{(k)} = \hat{\rho}\, Z_\Psi^\dagger Z_\Psi + L^{(k)} \tag{12}$$
where L(k) is a |Ψ|×|Ψ| matrix whose (ψ,ψ′)th entry is given by the following:
As done previously, we let FA(k) (LA(k)), ∀A⊂Ψ denote the principal submatrix of F(k) (L(k)) with row and column indices drawn from A. We offer the following result.
Proposition 3. The rate achieved by any user under BD precoding can be expressed as follows:
The functions ƒkBD(.), gkBD(.) are both submodular over the family .
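The outer loop used for BD, in which the per-stream power is fixed to ρ/S for each candidate total number of streams S and the best outcome is retained, can be sketched as follows. The callback `solve_for_power` is hypothetical: it stands in for whatever inner scheduler maximizes the weighted sum rate for a given per-stream power under the limit of at most S streams.

```python
import math

def sweep_stream_counts(rho, max_streams, solve_for_power):
    """Try every total stream count S with per-stream power rho / S.
    solve_for_power(rho_hat, S) is a hypothetical callback returning
    (weighted_sum_rate, selection) with at most S streams scheduled.
    Returns (best value, best selection, best S)."""
    best = (float("-inf"), None, None)
    for S in range(1, max_streams + 1):
        rho_hat = rho / S                      # power per stream for this S
        value, selection = solve_for_power(rho_hat, S)
        if value > best[0]:
            best = (value, selection, S)
    return best

# Toy inner solver (illustrative only): S interference-free unit-gain streams.
toy_solver = lambda rho_hat, S: (S * math.log(1 + rho_hat), list(range(S)))
best_value, best_selection, best_S = sweep_stream_counts(4.0, 3, toy_solver)
```

Sweeping S in this way is what allows the inner problem to treat the per-stream power as a constant that does not depend on the selected set.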
A description will now be given regarding an algorithm design framework, in accordance with an embodiment of the present invention.
We will illustrate the design framework that is based on optimizing the difference of submodular (DS) set functions. We proceed to explain the DS framework for ZF precoding, while noting that other precoding methods can be handled similarly. Then, the optimization problem at hand can be posed as follows:
where we use the family of sets to impose further constraints. We consider two key practical constraints:
The total number of selected virtual users should not exceed a bound, i.e., a cardinality constraint ||<St is imposed, where St is the number of transmit RF chains.
The total number of selected virtual users that correspond to the same real user k should not exceed a bound, i.e., a cardinality constraint |{ψ∈:u(ψ)=k}|≦Sr,k, ∀k is imposed, where Sr,k is the number of receive RF chains at user k.
Let be the collection of all subsets of Ψ that meet the aforementioned two constraints. Then, we have the following observation that follows upon verifying the properties stated hereinafter.
Proposition 4. The family defines a matroid over Ψ.
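The two RF-chain constraints can be checked for a candidate selection as sketched below. The names are hypothetical, and both bounds are read as "should not exceed", following the prose description of the constraints.

```python
from collections import Counter

def is_feasible(selection, user_of, S_t, S_r):
    """selection: iterable of virtual users.
    user_of: dict mapping each virtual user to its actual user, i.e. u(psi).
    S_t: number of transmit RF chains.
    S_r: dict mapping each actual user k to its number of receive RF chains."""
    selection = list(selection)
    if len(selection) > S_t:                   # total-stream constraint
        return False
    per_user = Counter(user_of[psi] for psi in selection)
    return all(count <= S_r[k] for k, count in per_user.items())

# Toy example: virtual users are (user, beam index) pairs.
user_of = {("u1", 0): "u1", ("u1", 1): "u1", ("u2", 0): "u2"}
S_r = {"u1": 1, "u2": 1}
ok = is_feasible([("u1", 0), ("u2", 0)], user_of, S_t=2, S_r=S_r)
too_many_for_u1 = is_feasible([("u1", 0), ("u1", 1)], user_of, S_t=2, S_r=S_r)
```

Any subset of a feasible selection is again feasible under this check, which is the downward-closed property that a matroid requires.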
Using (11) we can re-state (15) as follows:
The DS framework entails an iterative approach in which each iteration seeks to improve the current best solution at hand by solving a simpler maximization problem. Suppose at any iteration, the current best solution is given by . Then, let g(
/B)
g(
∪B)−g(B) define the marginal gain obtained upon adding set
to set B for any set function g(.), for any subsets
, B of a ground set such that g(B), g(
∪B) are both defined. Next, define a modular upper bound as follows:
It can be shown that
(
)≧gψZF(
),∀
∈
(19)
with equality in (19) at =
. Thus,
ψ(
)=
−
(
), ∀
∈
, satisfies
,ψ(
)≦Rψ(
), ∀
∈
with equality at
=
. With this bound in hand, we proceed to solve the following problem
Let be an obtained optimized solution. Then, if
ψ()>
(
), we can be sure that the current best solution at hand has been improved, i.e., Rψ(
)>Rψ(
). The key property of (20) is that since the objective is now a submodular set function and the constraint is a matroid, (20) can be relatively well optimized via simple methods such as the classical greedy method. An important by-product of the submodularity of the objective is that we can use the Lazy Greedy implementation to significantly lower the complexity of the greedy method. The DS procedure terminates if there is no improvement in the current best solution at hand. Otherwise, we proceed to the next iteration using
→
as the current best solution.
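The iterative DS procedure can be sketched as follows. Since the modular upper bound of (18) is not reproduced here, the sketch substitutes a standard modular upper bound for a submodular function g, namely m(Y) = g(X) − Σj∈X\Y [g(X) − g(X\j)] + Σj∈Y\X [g({j}) − g(φ)], which is tight at the current solution X; it further assumes that g is defined and submodular on all subsets of the ground set, and it uses a plain greedy step rather than Lazy Greedy. All function names are hypothetical.

```python
import math

def ds_maximize(ground, f, g, feasible, max_iters=20, tol=1e-12):
    """Heuristically maximize f(A) - g(A) over feasible subsets of `ground`.
    f, g: set functions on frozensets (g assumed submodular on all subsets).
    feasible: predicate on frozensets, assumed downward closed."""
    ground = frozenset(ground)

    def modular_upper_bound(X):
        gX = g(X)
        drop = {j: gX - g(X - {j}) for j in X}                    # g(j | X \ j)
        add = {j: g(frozenset([j])) - g(frozenset()) for j in ground - X}
        return lambda Y: (gX - sum(drop[j] for j in X - Y)
                          + sum(add[j] for j in Y - X))

    def greedy(objective):
        Y = frozenset()
        while True:
            candidates = [Y | {j} for j in ground - Y if feasible(Y | {j})]
            if not candidates:
                return Y
            best = max(candidates, key=objective)
            if objective(best) <= objective(Y):
                return Y
            Y = best

    current = frozenset()
    for _ in range(max_iters):
        m = modular_upper_bound(current)
        surrogate = lambda Y, m=m: f(Y) - m(Y)    # lower bound on f - g, tight at current
        candidate = greedy(surrogate)
        if surrogate(candidate) > surrogate(current) + tol:
            current = candidate                   # strict improvement of f - g
        else:
            break                                 # no improvement: terminate
    return current

# Toy example (illustrative only): both f and g are concave-of-modular, hence submodular.
weights = {0: 3.0, 1: 2.0, 2: 1.0}
f = lambda A: math.log(1 + sum(weights[j] for j in A))
g = lambda A: 0.5 * math.sqrt(len(A))
result = ds_maximize(weights.keys(), f, g, feasible=lambda A: len(A) <= 2)
```

Because the surrogate never exceeds the true objective and coincides with it at the current solution, any strict increase of the surrogate implies a strict increase of ƒ−g, which mirrors the improvement test described above. The procedure remains a heuristic and is not guaranteed to reach a global optimum.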
A description will now be given of various definitions, lemmas, and proposition proofs, in accordance with one or more embodiments of the present invention.
Definition 1. Let Ω be a ground set and h: 2Ω→IR be a real-valued set function defined on the subsets of Ω. The set function h(.) is a submodular set function over Ω if it satisfies,
$$h(B \cup a) - h(B) \le h(A \cup a) - h(A), \quad \forall A \subset B \subset \Omega,\ a \in \Omega \setminus B \tag{21}$$
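For small ground sets, condition (21) can be verified exhaustively. The following sketch, with hypothetical function names, enumerates every pair A⊂B and every element a outside B; its cost grows exponentially with the size of the ground set, so it is meant only as a sanity check.

```python
import math
from itertools import combinations

def subsets(ground):
    """Yield every subset of `ground` as a frozenset."""
    items = list(ground)
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            yield frozenset(combo)

def is_submodular(h, ground, tol=1e-9):
    """Check h(B + a) - h(B) <= h(A + a) - h(A) for all A within B, a not in B."""
    ground = frozenset(ground)
    all_sets = list(subsets(ground))
    for B in all_sets:
        for A in all_sets:
            if not A <= B:
                continue
            for a in ground - B:
                if h(B | {a}) - h(B) > h(A | {a}) - h(A) + tol:
                    return False
    return True

# A concave function of the cardinality is submodular; a convex one is not.
concave_ok = is_submodular(lambda S: math.sqrt(len(S)), range(4))
convex_fails = is_submodular(lambda S: len(S) ** 2, range(3))
```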
Definition 2. (Ω, I), where I is a collection of some subsets of Ω, is said to be a matroid if
Definition 3. Let be any family of subsets of Ω that is downward closed. A real-valued set function h: 2Ω→IR is submodular over
, if it satisfies (21) for each choice of A⊂B⊂Ω&a∈Ω\B such that B∪a∈
(so that A, B, a∈
). Hence, as used herein, a submodular function refers to a function wherein the reward of adding a new element to a set is larger if the set is smaller. In other words, if set B contains all the elements of set A, and possibly more, the reward of adding a new element to set B is less than the reward of adding the same element to the smaller set A.
Lemma 1. Consider any N×N positive definite matrix M and let MS, ∀ S⊂Ω={1, . . . , N}, denote the principal submatrix of M with row and column indices drawn from S. Then, the set function defined as h(S)=ln|MS|, ∀S⊂Ω is a submodular set function over Ω. Thus, for any j∈Ω, the set function defined as hj(S)=ln|MS\j|, ∀ S⊂Ω is also a submodular set function over Ω.
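Lemma 1 can be checked numerically on a small example. The sketch below is restricted to real symmetric positive definite matrices for simplicity (an assumption made here; the lemma itself is stated for any positive definite matrix), computes ln|MS| through a Cholesky factorization of the principal submatrix, and then tests condition (21) exhaustively. The function names are hypothetical.

```python
import math
from itertools import combinations

def logdet_principal(M, S):
    """ln|M_S| for a real symmetric positive definite M, via Cholesky.
    Returns 0 for the empty index set, matching the convention used here."""
    idx = sorted(S)
    n = len(idx)
    L = [[0.0] * n for _ in range(n)]
    total = 0.0
    for i in range(n):
        for j in range(i + 1):
            s = M[idx[i]][idx[j]] - sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][i] = math.sqrt(s)
                total += 2.0 * math.log(L[i][i])   # ln det = 2 * sum of ln(diagonal)
            else:
                L[i][j] = s / L[j][j]
    return total

def is_logdet_submodular(M, tol=1e-9):
    """Exhaustively test (21) for h(S) = ln|M_S| over all index subsets."""
    n = len(M)
    all_sets = [frozenset(c) for r in range(n + 1)
                for c in combinations(range(n), r)]
    h = {S: logdet_principal(M, S) for S in all_sets}
    for B in all_sets:
        for A in all_sets:
            if A <= B:
                for a in set(range(n)) - B:
                    if h[B | {a}] - h[B] > h[A | {a}] - h[A] + tol:
                        return False
    return True

M = [[2.0, 1.0, 0.0], [1.0, 2.0, 1.0], [0.0, 1.0, 2.0]]
lemma_holds = is_logdet_submodular(M)
```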
Lemma 2. Consider any choice of co-scheduled virtual users A⊂Ψ and any virtual user ψ∈A. Define the matrix ZA=[zψ]ψ∈A along with ZA\ψ=[zψ′]ψ′∈A\ψ. Further, define diagonal matrices EA=diag{eψ′}ψ′∈A and EA\ψ=diag{eψ′}ψ′∈A\ψ. Then, we have the following:
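$$\left|E_A + Z_A^\dagger Z_A\right| = \left|E_{A\setminus\psi} + Z_{A\setminus\psi}^\dagger Z_{A\setminus\psi}\right| \times \left(e_\psi + \|z_\psi\|^2 - z_\psi^\dagger Z_{A\setminus\psi}\left(E_{A\setminus\psi} + Z_{A\setminus\psi}^\dagger Z_{A\setminus\psi}\right)^{-1} Z_{A\setminus\psi}^\dagger z_\psi\right) \tag{22}$$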
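Note that when EA\ψ=0, then
$$\left|E_A + Z_A^\dagger Z_A\right| = \left|Z_{A\setminus\psi}^\dagger Z_{A\setminus\psi}\right| \left(e_\psi + \mathrm{Res}(\psi, A\setminus\psi)\right)$$
where Res(ψ,A\ψ)=∥zψ∥2−zψ†ZA\ψ(ZA\ψ†ZA\ψ)−1ZA\ψ†zψ.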
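One way to see why Lemma 2 holds, offered here as a short sketch rather than as the proof intended by the text, is through the determinant formula for a partitioned matrix (Schur complement). Ordering the virtual users so that ψ is listed last, which does not change the determinant, and writing P, q and s as shorthand symbols introduced only for this sketch:

$$
E_A + Z_A^\dagger Z_A =
\begin{bmatrix} P & q \\ q^\dagger & s \end{bmatrix},
\qquad
P = E_{A\setminus\psi} + Z_{A\setminus\psi}^\dagger Z_{A\setminus\psi},\quad
q = Z_{A\setminus\psi}^\dagger z_\psi,\quad
s = e_\psi + \|z_\psi\|^2 .
$$

Whenever P is invertible,

$$
\begin{vmatrix} P & q \\ q^\dagger & s \end{vmatrix}
= |P| \left( s - q^\dagger P^{-1} q \right),
$$

and substituting P, q and s gives the product form stated in the lemma. Setting EA\ψ=0 turns the second factor into eψ+Res(ψ,A\ψ).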
Lemma 3. A few facts are collected that follow after some algebra.
Proof of Proposition 1
Note first that the rate expression in (7) satisfies Rψ(A)=0, ∀ ψ∉A. Further, for each ψ∈A it can be readily verified that (7) follows upon expressing the RHS of (6) in a different form. Then, consider the first term ƒψMRT:2Ψ→IR in the RHS of (7). To show that ƒψMRT(.) for each ψ∈Ψ is a submodular set function over Ψ, the following property of the logarithm function is invoked:
$$\ln(c+e) - \ln(c) \le \ln(d+f) - \ln(d), \quad \forall\, 0 < d \le c,\ f \ge e \ge 0 \tag{23}$$
The above property follows from the concavity of the logarithm function. Considering any ε⊂F⊂Ψ:ε≠φ and any ψ″∈Ψ\F, the following scalars are defined:
Note that the scalars so defined satisfy d≦c and ƒ≧e so that we can invoke (23) with this choice to verify that the required condition in (21) is satisfied. Now consider the case ε=φ. Clearly, when F=φ the required condition is trivially satisfied. Hence, suppose that F≠φ and define the scalars c, e & ƒ as in (24). To prove that (21) indeed holds, the following is shown:
ln(c+e)−ln(c)≦ln(e)−ƒψMRT(φ)=ln(e)+ln(2), (25)
Note that since c≧1 and e≧1, the LHS in (25) is clearly no greater than ln(1+e). Therefore, (21) holds if it can be shown that ln(2)≧ln(1+1/e). The latter inequality is true since e≧1.
Next, to show that gψMRT(.) is a submodular set function, we consider any ε⊂F⊂Ψ:ε≠φ with any ψ″∈Ψ\F, and define the following:
where 1{.} denotes an indicator function that is one if the input argument is true and is zero otherwise. Clearly this choice also satisfies d≦c and ƒ≧e, so that (23) can be invoked with this choice to verify that the required condition in (21) is again satisfied. The case with ε=φ can be proved in a similar manner as before.
Proof of Proposition 2
First, the case A∈I with ψ∈A is considered. Here, (8) can be written as follows:
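$$R_\psi(A) = \ln\left(|A| + \rho\|z_\psi\|^2 - \rho\, z_\psi^\dagger Z_{A\setminus\psi}\left(Z_{A\setminus\psi}^\dagger Z_{A\setminus\psi}\right)^{-1} Z_{A\setminus\psi}^\dagger z_\psi\right) - \ln|A| \tag{26}$$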
Invoking Lemma 2, the RHS of (26) can be re-written to obtain the following:
$$R_\psi(A) = \ln\left|C_A(|A|,\psi)\right| - \ln|A| - \ln\left|C_{A\setminus\psi}(|A|,\psi)\right| \tag{27}$$
Then, since BA\ψ=CA\ψ(|A|,ψ) and ln|A|=|A|ln|A|−(|A|−1)ln|A|, it can be deduced that (11) holds. On the other hand, whenever ψ∉A, it can be verified that (11) yields Rψ(A)=0 which is consistent.
We proceed to prove the submodularity of gψZF(.) for each ψ∈Ψ over I first. Towards this end, we arbitrarily pick any ψ∈Ψ and consider each one of the two terms whose sum gives gψZF(.). Considering the first term, if we define h(A)=ln|BA\ψ|, ∀A⊂Ψ, then this set function can be verified to be submodular over I upon invoking Lemma 1. For the second term, we define h(A)=−|A\ψ|ln|A|, ∀A⊂Ψ. It will be shown that this set function is submodular over Ψ (and hence over I). Consider any ε⊂F⊂Ψ with any ψ″∈Ψ\F. To establish submodularity when ψ∉F (so that ψ∉ε) and ψ″≠ψ, it is shown that
$$-(|\varepsilon|+1)\ln(|\varepsilon|+1) + |\varepsilon|\ln(|\varepsilon|) \ge -(|F|+1)\ln(|F|+1) + |F|\ln(|F|) \tag{28}$$
holds due to the concavity of −x ln(x) for all x≧0 stated as the first fact in Lemma 3. Further, when ψ∉F but ψ″=ψ, it is shown that
$$-|\varepsilon|\ln(|\varepsilon|+1) + |\varepsilon|\ln(|\varepsilon|) \ge -|F|\ln(|F|+1) + |F|\ln(|F|) \tag{29}$$
follows from the third fact stated in Lemma 3. Next, when ψ∈ε (so that ψ∈F) and ψ″≠ψ, we need to show that
$$-|\varepsilon|\ln(|\varepsilon|+1) + (|\varepsilon|-1)\ln(|\varepsilon|) \ge -|F|\ln(|F|+1) + (|F|-1)\ln(|F|) \tag{30}$$
holds due to the concavity of −x ln(x+1) for all x≧0 stated as the first fact in Lemma 3. Finally, when ψ∉ε but ψ∈F and ψ″≠ψ, it is shown that
$$-(|\varepsilon|+1)\ln(|\varepsilon|+1) + |\varepsilon|\ln(|\varepsilon|) \ge -|F|\ln(|F|+1) + (|F|-1)\ln(|F|) \tag{31}$$
(31) follows by first using the concavity of −x ln(x+1) for all x≧0 to deduce
$$-|F|\ln(|F|+1) + (|F|-1)\ln(|F|) \le -(|\varepsilon|+1)\ln(|\varepsilon|+2) + |\varepsilon|\ln(|\varepsilon|+1)$$
and then using the second fact stated in Lemma 3 to confirm that
$$-(|\varepsilon|+1)\ln(|\varepsilon|+2) + |\varepsilon|\ln(|\varepsilon|+1) \le -(|\varepsilon|+1)\ln(|\varepsilon|+1) + |\varepsilon|\ln(|\varepsilon|)$$
In summary, since gψZF(.) is the sum of two terms that are each submodular over I, we can confirm that gψZF(.) is submodular over I.
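Now we embark upon the more involved part of proving the submodularity of ƒψZF(.) over I. Here, although as before ƒψZF(.) is the sum of two terms, we have to consider both the terms in ƒψZF(.) together. This is because the first term in ƒψZF(.) need not be submodular. However, as shown below, the second term in ƒψZF(.) adequately compensates and makes the sum submodular. Let us define a set function g(A)=−(|A|+1)ln(|A|+1)+|A|ln|A|, ∀ A⊂Ψ which can be verified using Lemma 3 to be a decreasing set function. Any ε⊂F∈I with any ψ″∈Ψ\F: F∪ψ″∈I can be considered. Further, it suffices to consider F: |F|=|ε|+1. Then, we systematically analyze one of the four possible cases which captures all the techniques needed to prove the other three cases as well:
Case ψ∈ε: Here, we must have ψ∈F and ψ″≠ψ. Then, ΔF,ψ″≜ƒψZF(F∪ψ″)−ƒψZF(F) can be expanded using Lemma 2 as follows:
$$\Delta_{F,\psi''} = \ln\bigl(|F|+1+\mathrm{Res}(\psi, F\setminus\psi\cup\psi'')\bigr) - \ln\bigl(|F|+\mathrm{Res}(\psi, F\setminus\psi)\bigr) + \ln\bigl(\mathrm{Res}(\psi'', F\setminus\psi)\bigr) + g(|F|) \tag{32}$$
A term is added and subtracted and ΔF,ψ″ is written as follows:
$$\Delta_{F,\psi''} = \ln\bigl(|F|+1+\mathrm{Res}(\psi, F\setminus\psi\cup\psi'')\bigr) - \ln\bigl(|F|+\mathrm{Res}(\psi, F\setminus\psi)\bigr) + \ln\bigl(\mathrm{Res}(\psi'', F\setminus\psi)\bigr) + g(|F|) + \ln\bigl(|F|+\mathrm{Res}(\psi, F\setminus\psi\cup\psi'')\bigr) - \ln\bigl(|F|+\mathrm{Res}(\psi, F\setminus\psi\cup\psi'')\bigr) \tag{33}$$
Similarly, Δε,ψ″≜ƒψZF(ε∪ψ″)−ƒψZF(ε) is expressed as follows:
$$\Delta_{\varepsilon,\psi''} = \ln\bigl(|\varepsilon|+1+\mathrm{Res}(\psi, \varepsilon\setminus\psi\cup\psi'')\bigr) - \ln\bigl(|\varepsilon|+\mathrm{Res}(\psi, \varepsilon\setminus\psi)\bigr) + \ln\bigl(\mathrm{Res}(\psi'', \varepsilon\setminus\psi)\bigr) + g(|\varepsilon|) + \ln\bigl(|F|+\mathrm{Res}(\psi, \varepsilon\setminus\psi)\bigr) - \ln\bigl(|F|+\mathrm{Res}(\psi, \varepsilon\setminus\psi)\bigr) \tag{34}$$
Now, a key observation using Lemma 1 and the fact that |F|=|ε|+1 is that
$$\ln\bigl(|\varepsilon|+1+\mathrm{Res}(\psi, \varepsilon\setminus\psi\cup\psi'')\bigr) + \ln\bigl(\mathrm{Res}(\psi'', \varepsilon\setminus\psi)\bigr) - \ln\bigl(|F|+\mathrm{Res}(\psi, \varepsilon\setminus\psi)\bigr) \ge \ln\bigl(|F|+\mathrm{Res}(\psi, F\setminus\psi\cup\psi'')\bigr) + \ln\bigl(\mathrm{Res}(\psi'', F\setminus\psi)\bigr) - \ln\bigl(|F|+\mathrm{Res}(\psi, F\setminus\psi)\bigr) \tag{35}$$
Then, to prove submodularity, i.e., ΔF,ψ″≦Δε,ψ″, it suffices to show that
$$\ln\bigl(|F|+1+\mathrm{Res}(\psi, F\setminus\psi\cup\psi'')\bigr) - \ln\bigl(|F|+\mathrm{Res}(\psi, F\setminus\psi\cup\psi'')\bigr) + g(|F|) \le \ln\bigl(|\varepsilon|+1+\mathrm{Res}(\psi, \varepsilon\setminus\psi)\bigr) - \ln\bigl(|\varepsilon|+\mathrm{Res}(\psi, \varepsilon\setminus\psi)\bigr) + g(|\varepsilon|) \tag{36}$$
Then, since ln(|ε|+1+Res(ψ,ε\ψ))−ln(|ε|+Res(ψ,ε\ψ))≧0, it suffices to show the following:
$$\ln\bigl(|F|+1+\mathrm{Res}(\psi, F\setminus\psi\cup\psi'')\bigr) - \ln\bigl(|F|+\mathrm{Res}(\psi, F\setminus\psi\cup\psi'')\bigr) + g(|F|) \le g(|\varepsilon|) \tag{37}$$
To show (37), the concavity of the logarithm function is exploited to deduce the following fact:
$$\ln\bigl(|F|+1+\mathrm{Res}(\psi, F\setminus\psi\cup\psi'')\bigr) - \ln\bigl(|F|+\mathrm{Res}(\psi, F\setminus\psi\cup\psi'')\bigr) \le \ln(|F|+1) - \ln(|F|) \tag{38}$$
Using (38) in (37) and recalling that |F|=|ε|+1, it can be seen that to establish submodularity in this case, it is enough to show that
$$-(|\varepsilon|+1)\ln(|\varepsilon|+2) + |\varepsilon|\ln(|\varepsilon|+1) \le -(|\varepsilon|+1)\ln(|\varepsilon|+1) + |\varepsilon|\ln(|\varepsilon|) \tag{39}$$
Finally, (39) holds true from the second fact stated in Lemma 3.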
Embodiments described herein may be entirely hardware, entirely software or including both hardware and software elements. In a preferred embodiment, the present invention is implemented in software, which includes but is not limited to firmware, resident software, microcode, etc.
Embodiments may include a computer program product accessible from a computer-usable or computer-readable medium providing program code for use by or in connection with a computer or any instruction execution system. A computer-usable or computer readable medium may include any apparatus that stores, communicates, propagates, or transports the program for use by or in connection with the instruction execution system, apparatus, or device. The medium can be magnetic, optical, electronic, electromagnetic, infrared, or semiconductor system (or apparatus or device) or a propagation medium. The medium may include a computer-readable medium such as a semiconductor or solid state memory, magnetic tape, a removable computer diskette, a random access memory (RAM), a read-only memory (ROM), a rigid magnetic disk and an optical disk, etc.
It is to be appreciated that the use of any of the following “/”, “and/or”, and “at least one of”, for example, in the cases of “A/B”, “A and/or B” and “at least one of A and B”, is intended to encompass the selection of the first listed option (A) only, or the selection of the second listed option (B) only, or the selection of both options (A and B). As a further example, in the cases of “A, B, and/or C” and “at least one of A, B, and C”, such phrasing is intended to encompass the selection of the first listed option (A) only, or the selection of the second listed option (B) only, or the selection of the third listed option (C) only, or the selection of the first and the second listed options (A and B) only, or the selection of the first and third listed options (A and C) only, or the selection of the second and third listed options (B and C) only, or the selection of all three options (A and B and C). This may be extended, as readily apparent by one of ordinary skill in this and related arts, for as many items listed.
Having described preferred embodiments of a system and method (which are intended to be illustrative and not limiting), it is noted that modifications and variations can be made by persons skilled in the art in light of the above teachings. It is therefore to be understood that changes may be made in the particular embodiments disclosed which are within the scope and spirit of the invention as outlined by the appended claims. Having thus described aspects of the invention, with the details and particularity required by the patent laws, what is claimed and desired protected by Letters Patent is set forth in the appended claims.
This application claims priority to U.S. Provisional Pat. App. Ser. No. 62/395,567, filed on Sep. 16, 2016, incorporated herein by reference in its entirety.
Number | Date | Country | |
---|---|---|---|
62395567 | Sep 2016 | US |