The present application claims priority of European patent application EP 11 159 618.5 filed on Mar. 24, 2011.
The present invention relates to a method and sequence generator for generating sequences using a Markov process, in particular in the framework of control constraint satisfaction. The present invention also relates to a computer program and a computer readable non-transitory medium for implementing said method.
Markov processes, also called Markov models or Markov chains, are a popular modeling tool used in content generation applications, such as, for example, text generation, music composition and interaction. Markov processes of order 1 are based on the “Markov hypothesis” which states that the future state of a sequence depends only on the last state, i.e.:
$p(s_i \mid s_1, \ldots, s_{i-1}) = p(s_i \mid s_{i-1})$.
There are systems which use Markov processes to generate finite-length sequences that imitate a given style. For example, a system, also known as the “Continuator”, is disclosed in US 2002/194984 A1. It uses a Markov model to react interactively to music input. It has the capacity to faithfully imitate arbitrary musical styles, at least for relatively short time frames. Indeed, the Markov hypothesis basically holds for most melodies played by users (from children to professionals) in many styles of tonal music (classical, jazz, pop, etc.). Furthermore, a variety of outputs can be produced for a given input. All continuations produced are stylistically convincing, thereby giving the sense that the system creates infinite, but plausible, possibilities from the user's style.
It is often desirable to enforce specific control constraints on the sequences to generate. Unfortunately, control constraints are not compatible with Markov processes, as they induce long-range dependencies that violate the Markov hypothesis of limited memory. Thus, in such interactive contexts, the problem of control constraint satisfaction is a particular issue.
It is outlined in US 2011/0010321 A1 that control constraints raise a fundamental issue since they establish relationships between items that violate the Markov hypothesis. US 2011/0010321 A1 shows that reformulating the problem as a constraint satisfaction problem makes it possible, for arbitrary sets of control constraints, to compute optimal, singular solutions, i.e., sequences that satisfy the control constraints while being optimally probable.
However, what is often needed in practice is a distribution of good sequences. For this purpose, the approach of US 2011/0010321 A1 is not suitable, as it does not produce a distribution of sequences, but only optimal solutions. Furthermore, it involves a complete search-optimization algorithm, which is not suitable for real-time use.
It is an object of the present invention to provide a method and sequence generator for generating sequences using a Markov process that is computationally less expensive, which means that many sequences can be computed at little computing cost, and which scales up well, which means that long sequences can be generated in little computing time. It is a further object of the present invention to provide a corresponding computer program and computer readable non-transitory medium for implementing said method.
According to an aspect of the present invention there is provided a method for creating a Markov process that generates sequences, each sequence having a finite length L, comprising items from a set of a specific number n of items, and satisfying one or more control constraints specifying one or more requirements on the sequence, the method comprising:
According to a further aspect of the present invention there is provided a sequence generator for creating a Markov process that generates sequences, each sequence having a finite length L, comprising items from a set of a specific number n of items, and satisfying one or more control constraints specifying one or more requirements on the sequence, the generator comprising:
According to still further aspects a computer program comprising program means for causing a computer to carry out the steps of the method according to the present invention, when said computer program is carried out on a computer, as well as a computer readable non-transitory medium having instructions stored thereon which, when carried out on a computer, cause the computer to perform the steps of the method according to the present invention are provided.
Preferred embodiments of the invention are defined in the dependent claims. It shall be understood that the claimed sequence generator, the claimed computer program and the claimed computer readable medium have similar and/or identical preferred embodiments as the claimed method and as defined in the dependent claims, and vice versa.
The present invention is based on the idea of exploiting the fruitful connection between Markov processes and constraint satisfaction, but in a random walk setting, thus without a search algorithm. The present invention enables control constraints to be compiled in the form of a non-homogeneous Markov process. This non-homogeneous Markov process can in turn be used straightforwardly with random walk to generate sequences. More complex constraints having a large control constraint scope can be handled, provided they can be filtered to obtain a backtrack-free constraint satisfaction problem (CSP), at the cost of inaccurate probabilities.
The approach generates all the sequences satisfying the constraints with no search, and is therefore suitable for real-time, interactive applications. Also, as opposed to generating one optimal solution, the approach generates a statistical distribution of suitable solutions, which is what is needed in practice in content generation applications, for example in music generation/composition applications. It is shown that a backtrack-free constraint satisfaction problem can be transformed into a non-homogeneous Markov process and vice versa. Thus, a bridge between Markov generation and constraint satisfaction is established.
In particular, when the control constraints scope does not exceed the order of the Markov process (for example, unary and binary adjacent for Markov processes of order-1) they can be “compiled” into a new non-homogeneous Markov process that maintains the probability distribution of the initial Markov process. This yields the advantage of retaining the simplicity of random walk, while ensuring that control constraints are satisfied.
Also, the present invention makes it possible to solve the so-called zero-frequency problem, which arises during random walk when an item with no continuation is chosen, in particular when another item with a continuation could have been chosen. In other words, the invention makes it possible to generate sequences with non-zero probability (if such sequences exist), rather than generating sequences with a zero probability. The invention guarantees that such sequences with non-zero probability are generated and that they also satisfy the control constraints. In particular for control constraints that remain within the Markov scope, an arc-consistency algorithm solves the zero-frequency problem.
These and other aspects of the present invention will be apparent from and explained in more detail below with reference to the embodiments described hereinafter. In the following drawings
a shows exemplary individually normalized intermediary matrices according to the first embodiment of the method;
In
The sequence generator 100 further comprises a sequence generating unit 15 which receives the final non-homogeneous Markov process {tilde over (M)} data from the second processing unit 14. The sequence generating unit 15 is adapted to generate the sequences having the finite length L by using the final non-homogeneous Markov process {tilde over (M)} data with random walk. Random walk means that the first item of a sequence is chosen randomly using the prior probabilities, and, then, a subsequent item is drawn using the Markov process, and appended to the first item. This is iterated to produce a sequence of length L.
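The random walk described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `random_walk` and the data layout (a prior vector followed by one row-stochastic matrix per transition, all indexed by item position in `items`) are assumptions made for the example and are not taken from the description.

```python
import random

def random_walk(prior, matrices, items, rng=random):
    """Draw one sequence from a non-homogeneous Markov process.

    prior:    probabilities of each item at the first position.
    matrices: list of L-1 matrices; matrices[i][j][k] is the probability
              of moving from item j at position i to item k at position i+1.
    items:    the n items, in the order used by prior and matrices.
    """
    indices = range(len(items))
    # The first item is drawn from the prior probabilities.
    state = rng.choices(indices, weights=prior)[0]
    sequence = [items[state]]
    # Each subsequent item is drawn from the row of the current item
    # in the transition matrix that belongs to the current position.
    for matrix in matrices:
        state = rng.choices(indices, weights=matrix[state])[0]
        sequence.append(items[state])
    return sequence
```

Because one matrix is consumed per step, the length of the generated sequence is always `len(matrices) + 1`, which corresponds to the finite length L.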
The generated sequences can for example be rendered to the user using an output interface.
The sequence generator 100 may also comprise an input interface 21 adapted to receive user input. The input device 20 may for example be a keyboard, mouse, graphic tablet, touch screen or other device 20a that a user operates in order to interact with a user interface, such as a graphical user interface, a conventional game controller 20b comprising user operable buttons, levers and the like, a controller 20c which uses one or more accelerometers and the like to detect gestures made by a user (such as Nintendo's Wiimote), an image capture and analysis system 20d configured to detect and interpret a user's gestures, etc. In particular, the input device 20 may be a gesture recognition device and/or a physiological sensor, in particular, a brain/neural sensor, a muscle activity sensor, a respiration/breath sensor or the like.
On the one hand, the input interface 21 can be adapted to receive one or more constraint inputs for generating the one or more control constraints data. These constraint inputs are then supplied to the control unit 12 from the input interface 21. Alternatively or cumulatively, the input interface 21 may be adapted to receive one or more input sequences for generating the initial Markov process M data. The input sequences are then supplied to the Markov process unit 11 from the input interface 21. The input sequence can be used to train or update the initial Markov process.
In a specific application, a user may play a musical phrase using a musical instrument, such as a music keyboard, in particular a MIDI keyboard. The phrase played by the user is then converted into an input sequence of symbols, representing a given dimension of music, such as pitch, duration, or velocity. The input sequence is then analyzed by the system to update the Markov process. When the phrase is finished, typically after a certain temporal threshold has passed, the system generates a new sequence, thus a new phrase, using the Markov process built so far. The user can then play another phrase, or interrupt the phrase being played, depending on the chosen interaction mode. Such incremental learning can create engaging dialogs with users, both with professional musicians and children.
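As an illustration of how input sequences might be used to train an order-1 Markov process, the following Python sketch counts items and adjacent pairs. The function name `estimate_order1` is hypothetical, and estimating the prior from overall item frequencies is an assumption of this sketch; the description does not state how the prior is obtained.

```python
from collections import Counter, defaultdict

def estimate_order1(sequences):
    """Estimate prior and transition probabilities from example sequences."""
    item_counts = Counter()
    pair_counts = defaultdict(Counter)
    for seq in sequences:
        item_counts.update(seq)
        # Count every pair of adjacent symbols (a followed by b).
        for a, b in zip(seq, seq[1:]):
            pair_counts[a][b] += 1
    total = sum(item_counts.values())
    # Assumption: the prior is the relative frequency of each item.
    prior = {a: c / total for a, c in item_counts.items()}
    # Each row is normalized so that it sums to 1.
    transitions = {
        a: {b: c / sum(nxt.values()) for b, c in nxt.items()}
        for a, nxt in pair_counts.items()
    }
    return prior, transitions
```

Calling the function again on the enlarged set of sequences after each new phrase is one simple way to realize the incremental update mentioned above.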
With reference to
An initial Markov process M, illustrated in
In this example, sequences having the finite length L of 4 shall now be generated, thus 4-note melodies. The set of items to choose from is of the specific number n of 3 items, namely the notes C, D and E. There are 60 possible such melodies with non-zero probabilities. For instance, sequence CDED has probability:
Now, a control constraint is for example that the last pitch be a D. There are only 16 such sequences, as illustrated in
It is now shown how to build the non-homogeneous Markov process {tilde over (M)} given the initial (homogeneous) Markov process M and control constraints. Fixed-length sequences of finite length L shall be generated, from a Markov process, that satisfy the control constraints. To use a random walk approach, a non-homogeneous Markov process {tilde over (M)} is needed that generates exactly the sequences satisfying the control constraints with the probability distribution defined by the initial Markov process M. In general, it is not possible to find such a Markov process because control constraints violate the Markov property, as outlined in US 2011/0010321 A1. However, when control constraints remain within the Markov scope, which means that the scope of the control constraints is equal to or smaller than the order of the Markov process, such a Markov process exists and can be created with a low complexity.
In the first embodiment of the method, the Markov process is of order 1 and the single transition matrix M(1) is a stochastic transition matrix (i.e. each row sums up to 1), shown in
In general, a Markov process M is defined over a finite state space A={a1, . . . , an}. A sequence s of length L is denoted by s=s1, . . . , sL with si∈A. S is the set of all sequences of length L generated by M with a non-zero probability:
$p_M(s) = p_M(s_1) \cdot p_M(s_2 \mid s_1) \cdots p_M(s_L \mid s_{L-1})$.
The sequences to be generated, having the finite length L, are represented by the finite-domain constrained variables {V1, . . . , VL}, each with domain A. Markov properties are defined as Markov constraints {K1, . . . , KL-1} on these variables {V1, . . . , VL}, based on the initial Markov process M and its transition probabilities. Control constraints are also represented as finite-domain constraints. Using the Markov properties/Markov constraints and the control constraints, a constraint satisfaction problem CSP is defined. The set of solutions is denoted by SC. The non-homogeneous Markov process {tilde over (M)} to be created should satisfy:
$p_{\tilde{M}}(s) = 0$ for $s \notin S_C$, (I)
$p_{\tilde{M}}(s) = p_M(s \mid s \in S_C)$ otherwise. (II)
These properties state that {tilde over (M)} generates exactly the sequences s∈SC. Most importantly, sequences in SC have the same probabilities in M and {tilde over (M)} up to a constant factor α=pM(s∈SC), i.e. ∀s∈SC, p{tilde over (M)}(s)=1/α·pM(s). In the running example, α=σ.
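Properties (I) and (II) can be checked by brute force on a small example. The Python sketch below enumerates all sequences, keeps those with non-zero probability that satisfy a constraint, and rescales them by α. The prior and transition values are hypothetical: they are assumed for illustration only and were chosen so that the only zero-probability transition is D followed by D, which gives 60 four-note melodies with non-zero probability and 16 of them ending in D.

```python
from itertools import product

ITEMS = "CDE"
# Hypothetical example values (an assumption of this sketch).
PRIOR = {"C": 1 / 2, "D": 1 / 6, "E": 1 / 3}
TRANS = {
    "C": {"C": 0.5, "D": 0.25, "E": 0.25},
    "D": {"C": 0.5, "D": 0.0, "E": 0.5},
    "E": {"C": 0.5, "D": 0.25, "E": 0.25},
}

def prob(seq):
    """p_M(s): prior of the first item times all transition probabilities."""
    p = PRIOR[seq[0]]
    for a, b in zip(seq, seq[1:]):
        p *= TRANS[a][b]
    return p

def conditional_distribution(length, constraint):
    """Return ({s: p_M(s | s in S_C)}, alpha) by exhaustive enumeration."""
    solutions = {}
    for s in product(ITEMS, repeat=length):
        p = prob(s)
        if p > 0 and constraint(s):
            solutions["".join(s)] = p
    alpha = sum(solutions.values())  # alpha = p_M(s in S_C)
    return {s: p / alpha for s, p in solutions.items()}, alpha
```

Exhaustive enumeration grows as n^L, so this is only useful as a reference against which a compiled non-homogeneous process can be compared on small instances.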
For a certain class of induced CSPs, hereafter referred to as Binary-Sequential CSPs, there exists a non-homogeneous Markov process {tilde over (M)} that satisfies (I) and (II).
The scope of a control constraint is the maximum number of consecutive items on which the constraint holds. A Binary-Sequential CSP is a CSP that contains only constraints whose scope remains within the scope of the Markov order. With a Markov order of 1, these constraints consist in 1) unary constraints and 2) binary constraints among adjacent variables.
In the following it will now be described how to build {tilde over (M)} from M and its induced Binary-Sequential CSP. Further, it will be shown that {tilde over (M)} achieves the desired properties, namely satisfies (I) and (II).
The final non-homogeneous Markov process {tilde over (M)} is obtained by applying two successive transformations to the initial Markov process M. The first transformation exploits the induced CSP to filter out state transitions that are explicitly or implicitly forbidden by the constraints. This is achieved by replacing the corresponding transition probabilities by zeros in the initial transition matrices. A side-effect is that the transition matrices are not stochastic anymore, which means that rows do not sum up to 1 any longer. The second transformation consists in renormalizing those matrices to obtain a proper non-homogeneous Markov process {tilde over (M)}. These two successive transformations will now be explained in more detail.
A Binary-Sequential CSP with unary control constraints U1, . . . , UL and binary constraints B1, . . . , BL-1 is considered. Ui defines the states that can be used at position i in the sequence. Bi defines the allowed state transitions between positions i and i+1. Markov constraints, denoted by K1, . . . , KL-1, are posted on all pairs of adjacent variables V1, . . . VL. Markov constraints K1, . . . , KL-1 represent the following relation:
$\forall i, \forall a, b \in A: K_i(a, b) = \text{true} \iff p_M(b \mid a) > 0$.
The CSP induced by the example of the first embodiment of the method, as shown with respect to
The first step is to make the induced CSP arc-consistent. An arc-consistency algorithm consists in propagating the constraints in the whole CSP, through a fixed-point algorithm that considers constraints individually. This ensures that each constraint c holding on variables Vi and Vj satisfies:
$\forall x \in D(V_i), \exists y \in D(V_j)$ such that $c(x, y) = \text{true}$.
It is important to note here that enforcing arc-consistency on a Binary-Sequential CSP is sufficient to allow the computation of the transition matrices once and for all, prior to the generation, with no additional propagation. This can be shown as follows:
Proposition: If the induced CSP is arc-consistent, then for all consistent partial sequences s1 . . . si (i.e. sequences that satisfy all the constraints between variables V1, . . . Vi), the following properties hold:
$\exists s_{i+1} \in D(V_{i+1})$ such that $s_1 \ldots s_i s_{i+1}$ is consistent. (P1)
$s_1 \ldots s_i s_{i+1}$ is consistent $\iff$ $s_i s_{i+1}$ is consistent. (P2)
Proof. The induced CSP is of width 2 as its constraint network is a tree. Arc-consistency enables a backtrack-free resolution of a CSP of width 2, therefore every partial consistent sequence can be extended to a solution, which is equivalent to P1. The condition in P2 is obviously sufficient; it is also necessary as no constraint links Vi+1 back to any other variable than Vi.
Arc-consistency of the Markov constraints Ki can be achieved efficiently with the following propagators:
On instantiation: If Vi is instantiated with a∈A, remove every b∈A such that pM(b|a)=0 from the domain of Vi+1. Conversely, if Vi+1 is instantiated with b∈A, remove all the a∈A such that pM(b|a)=0 from the domain of Vi.
On removal: If a∈A is removed from the domain of Vi, remove from the domain of Vi+1 every b∈A that no longer has a support, i.e. every b such that pM(b|c)=0 for all c≠a remaining in the domain of Vi. The same strategy is applied when a value is removed from the domain of Vi+1. This can be implemented efficiently by associating a support counter with each value in the domain of Vi+1.
Arc-consistency of binary control constraints can be implemented with a general binary arc-consistency algorithm or a specific one, depending on the nature of the constraint. Arc-consistency of the induced CSP necessitates enforcing arc-consistency of all constraints until a fixed-point is reached, i.e., no more values are removed.
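The fixed-point propagation can be illustrated with a simple Python sketch for a chain of variables. It is not the propagator scheme with support counters described above but a plain, less efficient filtering loop; the function name `make_arc_consistent` and the callback `allowed(i, a, b)`, which is assumed to combine the Markov constraint and any binary control constraint between positions i and i+1, are hypothetical.

```python
def make_arc_consistent(domains, allowed):
    """Filter the domains of a chain V1..VL until a fixed point is reached.

    domains: list of sets, one per position; updated in place.
    allowed(i, a, b): True if item a at position i may be followed by
                      item b at position i+1 (0-based positions).
    Returns False if some domain becomes empty (no solution exists).
    """
    changed = True
    while changed:
        changed = False
        for i in range(len(domains) - 1):
            # Keep a only if it has at least one allowed successor.
            left = {a for a in domains[i]
                    if any(allowed(i, a, b) for b in domains[i + 1])}
            # Keep b only if it has at least one allowed predecessor.
            right = {b for b in domains[i + 1]
                     if any(allowed(i, a, b) for a in left)}
            if left != domains[i] or right != domains[i + 1]:
                domains[i], domains[i + 1] = left, right
                changed = True
    return all(domains)
```

For instance, with three items C, D and E, a unary constraint fixing the last of four positions to D, and a single forbidden transition from D to D (an assumed example), the loop removes D from the third domain and leaves the first two domains unchanged.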
In the example shown with respect to
In general, the zero-frequency problem could arise during random walk, when an item with no continuation is chosen. However, it should be noted that arc-consistency of Markov constraints as such solves the zero-frequency problem, regardless of control constraints. No choice made during the random walk can lead to a zero-frequency prefix. More specifically, when sequences with non-zero probability exist that satisfy the control constraints, these sequences will be generated. However, when no such sequence with a non-zero probability that satisfies the control constraints exists (i.e. every sequence that satisfies the control constraints has a zero probability), this will be detected when using the arc-consistency algorithm. Then, a constraint relaxation method can be used. For example, the constraint relaxation method can be implemented such that each control constraint is associated with a numeric weight (such as a floating point number between 0 and 1). If no solution exists, the constraint with the minimum weight is discarded. If a solution to the new (simpler) problem exists, this solution is generated. If no solution exists, the constraint with the next minimum weight is discarded and so on, until a solution is found or no constraint is left in the CSP. If there is no solution even after all the constraints have been removed, then a standard Markov generation method can be applied, such as smoothing of transition probabilities.
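The weight-based relaxation loop can be sketched as follows in Python. The function name `relax_until_solvable` and the callback `has_solution`, which stands for whatever solvability test is used (for example an arc-consistency pass that reports an empty domain), are assumptions of this sketch.

```python
def relax_until_solvable(weighted_constraints, has_solution):
    """Discard the lowest-weight constraints until the problem is solvable.

    weighted_constraints: list of (weight, constraint) pairs.
    has_solution(constraints): hypothetical callback returning True if the
        CSP built from the given control constraints has a solution.
    Returns the list of constraints that were kept, or None if no solution
    exists even without any control constraint.
    """
    # Sort by increasing weight so the first entry is discarded first.
    remaining = sorted(weighted_constraints, key=lambda wc: wc[0])
    while True:
        active = [constraint for _, constraint in remaining]
        if has_solution(active):
            return active
        if not remaining:
            # Nothing left to relax; the caller may fall back to a
            # standard method such as smoothing.
            return None
        remaining.pop(0)
```

Returning `None` rather than raising an error leaves the choice of fallback, such as smoothing of transition probabilities, to the caller.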
Now, the matrices will be extracted. The goal is to generate a non-homogeneous Markov process, represented by a series of transition matrices. A series of non-homogeneous intermediary matrices Z(0), . . . , Z(L-1) are obtained by zeroing, in the initial matrix M(1), the elements that correspond to values or transitions that were removed during arc-consistency. More precisely, the algorithm is:
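One possible way to perform this zeroing is sketched below in Python. It is an illustration under stated assumptions rather than the algorithm of the description: the function name `intermediary_matrices`, the representation of the filtered domains as a list of sets, and the optional `allowed(i, a, b)` callback for binary control constraints are all hypothetical.

```python
def intermediary_matrices(prior, matrix, items, domains,
                          allowed=lambda i, a, b: True):
    """Build Z(0) and Z(1)..Z(L-1) by zeroing removed values and transitions.

    prior:   prior probabilities of the initial process, indexed like items.
    matrix:  n x n transition matrix of the initial order-1 process.
    domains: arc-consistent domains, one set of items per position.
    allowed: hypothetical callback for binary control constraints.
    """
    # Z(0): prior vector with zeros for items removed from the first domain.
    z0 = [p if a in domains[0] else 0.0 for a, p in zip(items, prior)]
    zs = []
    for i in range(len(domains) - 1):
        z = [[matrix[j][k]
              if a in domains[i] and b in domains[i + 1] and allowed(i, a, b)
              else 0.0
              for k, b in enumerate(items)]
             for j, a in enumerate(items)]
        zs.append(z)
    return z0, zs
```

The resulting rows generally no longer sum to 1, which is why a normalization step is needed afterwards.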
The intermediary matrices Z(0), . . . , Z(L-1) obtained for the illustrated example are shown in
The final transition matrices {tilde over (M)}(i) of {tilde over (M)} now need to be built from the intermediary matrices Z(i). First, the transition matrices are normalized individually, i.e. by dividing each row by its sum.
The normalization should indeed maintain the initial probability distribution. To do this, subsequently to individual normalization, the intermediary matrices Z(0), . . . , Z(L-1) are processed using a renormalization algorithm. The renormalization algorithm maintains the initial probability distribution. It turns out that a simple right-to-left process can precisely achieve that. The idea is to back propagate the perturbations in the matrices induced by individual normalization, starting from the right-most one. Thus, the renormalization algorithm comprises back-propagating normalizations in the intermediary matrices Z(0), . . . , Z(L-1), starting from the last matrix Z(L-1).
To do this, first the last matrix Z(L-1) is normalized individually. Then, the normalization is propagated from right to left, up to the prior vector Z(0). The elements of the matrices {tilde over (M)}(i) and the prior vector {tilde over (M)}(0) are given by the following recurrence relations:
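The recurrence relations are not reproduced in this text. One form that is consistent with the surrounding description (individual normalization of the last matrix, right-to-left propagation, and the coefficients αj(i) and α(0) used below) is given here as a reconstruction. The notation $z^{(i)}_{j,k}$ for the elements of Z(i) and $\tilde{m}^{(i)}_{j,k}$ for the elements of {tilde over (M)}(i) is an assumption of this sketch.

```latex
\begin{align*}
\tilde{m}^{(L-1)}_{j,k} &= \frac{z^{(L-1)}_{j,k}}{\alpha^{(L-1)}_j},
  & \alpha^{(L-1)}_j &= \sum_{k=1}^{n} z^{(L-1)}_{j,k}, \\
\tilde{m}^{(i)}_{j,k} &= \frac{\alpha^{(i+1)}_k \, z^{(i)}_{j,k}}{\alpha^{(i)}_j},
  & \alpha^{(i)}_j &= \sum_{k=1}^{n} \alpha^{(i+1)}_k \, z^{(i)}_{j,k},
  \quad 0 < i < L-1, \\
\tilde{m}^{(0)}_{k} &= \frac{\alpha^{(1)}_k \, z^{(0)}_{k}}{\alpha^{(0)}},
  & \alpha^{(0)} &= \sum_{k=1}^{n} \alpha^{(1)}_k \, z^{(0)}_{k}.
\end{align*}
```

In this form, each coefficient $\alpha^{(i)}_j$ is the total probability mass, under the initial process, of all consistent ways of completing the sequence when item $a_j$ is placed at position $i$.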
By construction, when αj(i)=0, the j-th columns of the preceding Z(i) contain only 0 as well. By convention, the division yields 0 since there is no normalization to back propagate. These coefficients can be computed in O(L×n²). In the following it is shown that this non-homogeneous Markov process satisfies the two desired properties:
Proposition: The {tilde over (M)}(i) are stochastic matrices and the non-homogeneous Markov process {tilde over (M)} defined by the {tilde over (M)}(i) matrices and the prior vector {tilde over (M)}(0) satisfies (I) and (II).
Proof. The {tilde over (M)}(i) matrices are stochastic by construction, i.e., each row sums up to 1. The probability of a sequence s=s1 . . . sL to be generated by {tilde over (M)} is:
where ki is the index of si in A. Hence, by construction of Z(i):
$p_{\tilde{M}}(s) = 0$ for $s \notin S_C$, (I)
$p_{\tilde{M}}(s) = \frac{1}{\alpha^{(0)}} \cdot p_M(s)$ otherwise. (II)
α(0) is precisely the probability for sequences in M of satisfying the control constraints, i.e. α(0)=pM(SC).
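The right-to-left renormalization can be sketched in Python as follows. This is a minimal illustration, assuming that the intermediary prior vector and matrices are given as plain lists and that at least one solution exists; the function name `renormalize` and the return layout are hypothetical.

```python
def renormalize(z0, zs):
    """Back-propagate normalization from the last matrix to the prior.

    z0: intermediary prior vector Z(0).
    zs: list of intermediary matrices Z(1)..Z(L-1).
    Returns (prior, matrices, alpha0), where alpha0 plays the role of
    the coefficient alpha(0).
    """
    n = len(z0)
    alpha = [1.0] * n  # nothing to back-propagate beyond the last matrix
    matrices = []
    for z in reversed(zs):
        # Weight each element by the coefficient of its column item.
        weighted = [[z[j][k] * alpha[k] for k in range(n)] for j in range(n)]
        alpha = [sum(row) for row in weighted]
        # By convention, a row whose coefficient is 0 stays all zeros.
        matrices.append([[w / alpha[j] if alpha[j] > 0 else 0.0
                          for w in weighted[j]] for j in range(n)])
    matrices.reverse()
    weighted0 = [z0[k] * alpha[k] for k in range(n)]
    alpha0 = sum(weighted0)
    if alpha0 == 0:
        raise ValueError("no sequence with non-zero probability satisfies the constraints")
    prior = [w / alpha0 for w in weighted0]
    return prior, matrices, alpha0
```

The double loop over an n×n matrix is executed once per position, which matches the stated cost of O(L×n²).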
The final matrices of the final non-homogeneous Markov process {tilde over (M)} for the illustrated example are shown in
It is interesting to observe that even the addition of a simple unary constraint (here, last item=D) has an impact that propagates back to the prior vector. In our example, p{tilde over (M)}(C) is slightly increased (from 0.5 to 0.506), p{tilde over (M)}(D) is decreased (from 0.1666 to 0.1558) and p{tilde over (M)}(E) is increased (from 0.333 to 0.337).
Thus it has been shown that the first embodiment of the method generates a Markov process that satisfies the desired properties. The complexity of the method is low, as it involves only performing arc-consistency once on the induced CSP, and a renormalization in O(L×nd).
With reference to
In the example shown in
The embodiments above described the method for fixed order-1 Markov processes. However, the order d of the (initial) Markov process can also be two or higher. Generalization to order d consists in first introducing order-d Markov constraints, and second applying the rest of the method to n^d×n transition matrices (where rows are d-grams and columns are the state), instead of the n×n transition matrices described previously for order 1. In practice, most d-grams have no continuation, so sparse representations (graphs, oracles) are more appropriate than matrices. When a Markov process with an order d of 2 or higher is used, the control constraints are in particular of arity d+1. The local consistency algorithm used is then a strong-(d+1)-consistency algorithm.
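A sparse representation of an order-d process can be illustrated with a dictionary keyed by d-grams, as in the Python sketch below. A dictionary is used here instead of the graphs or oracles mentioned above, purely for brevity; the function name `estimate_order_d` is hypothetical.

```python
from collections import Counter, defaultdict

def estimate_order_d(sequences, d):
    """Map each observed d-gram to the probabilities of its continuations.

    Only d-grams that actually occur with a continuation are stored, so the
    n^d x n matrix is never materialized.
    """
    counts = defaultdict(Counter)
    for seq in sequences:
        for i in range(len(seq) - d):
            gram = tuple(seq[i:i + d])
            counts[gram][seq[i + d]] += 1
    return {
        gram: {item: c / sum(nxt.values()) for item, c in nxt.items()}
        for gram, nxt in counts.items()
    }
```

A d-gram that is absent from the dictionary has no continuation, which is the sparse counterpart of an all-zero row in the full matrix.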
Unary constraints can be used to represent various musical intentions, when producing a melody from an initial Markov process M, and an input sequence in form of a melody provided in real-time. For instance, the following types of melodic output can be defined:
Continuation: input is continued to produce a sequence of the same size. A constraint is posted on the last note to ensure that it is “terminal”, i.e., occurred at the end of an input melody, to produce a coherent ending.
Variation: generated by adding two unary constraints requiring that the first and last notes be the same, respectively, as the first and last notes of the input.
Answer: like a Continuation, but the last note should be the same as the first input note. This creates a phrase that resolves to the beginning, producing a sense of closure.
Thus, in general, if a control constraint requires that the last note of each sequence is a specific note, it is called a Continuation. When a first constraint requires that the first note of each output sequence is the same as the first note of the input sequence, and a second constraint requires that the last note of the output sequence is the same as the last note of the input sequence, it is called a Variation. When a control constraint requires that the last note of each sequence is the same as the first note of the input sequence, it is called an Answer.
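The three types of melodic output listed above (Continuation, Variation, Answer) can be expressed as initial domains for the position variables, as in the following Python sketch. The function name `melodic_domains`, the string labels for the three types, and the parameter `terminal_items` (the set of notes that occurred at the end of an input melody) are assumptions made for illustration.

```python
def melodic_domains(kind, input_seq, items, terminal_items, length):
    """Return one domain (set of items) per position of the output sequence."""
    domains = [set(items) for _ in range(length)]
    if kind == "continuation":
        # The last note must be a note that ended an input melody.
        domains[-1] &= set(terminal_items)
    elif kind == "variation":
        # First and last notes are the same as those of the input.
        domains[0] &= {input_seq[0]}
        domains[-1] &= {input_seq[-1]}
    elif kind == "answer":
        # The last note is the same as the first note of the input.
        domains[-1] &= {input_seq[0]}
    else:
        raise ValueError(f"unknown kind: {kind}")
    return domains
```

The returned domains would then be filtered together with the Markov constraints before any transition matrices are derived.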
The described method can be used to implement an augmented instrument, with which the user plays bebop melodies by targeting specific notes ahead of time. These targets are selected using a gesture controller and transformed into unary constraints, e.g., on the last note. The underlying harmony is provided in real-time by an mp3 file previously analyzed, from which time-lined harmonic metadata is extracted.
The invention has been illustrated and described in detail in the drawings and foregoing description, but such illustration and description are to be considered illustrative or exemplary and not restrictive. The invention is not limited to the disclosed embodiments. Other variations to the disclosed embodiments can be understood and effected by those skilled in the art in practicing the claimed invention, from a study of the drawings, the disclosure, and the appended claims.
In the claims, the word “comprising” does not exclude other elements or steps, and the indefinite article “a” or “an” does not exclude a plurality. A single element or other unit may fulfill the functions of several items recited in the claims. The mere fact that certain measures are recited in mutually different dependent claims does not indicate that a combination of these measures cannot be used to advantage.
A computer program may be stored/distributed on a suitable non-transitory medium, such as an optical storage medium or a solid-state medium supplied together with or as part of other hardware, but may also be distributed in other forms, such as via the Internet or other wired or wireless telecommunication systems.
Any reference signs in the claims should not be construed as limiting the scope.
Number | Date | Country | Kind |
---|---|---|---|
11159618.5 | Mar 2011 | EP | regional |