The present invention relates to the field of models built using a set of constraints, and more particularly, to finding a minimal explanation for an output of a model comprising a plurality of constraints.
Models built using a set of constraints are used in many technical areas. Real-world systems can be modelled using a set of suitable constraints, and a solver can provide a solution to the model or prove the absence of a solution. An explanation for the output provided by the solver is required in some cases. An explanation may be sought for the solution itself, for a specific property of this solution, or for the absence of a solution, all of which can be considered as an output from the model.
Computing systems using a solver are able to make logical inferences from the model and such solvers are used in many different domains in computer science. Sometimes the inferences that are generated by such a solver may surprise people. A surprising inference may reveal a bug in the solver, or that a wrong model has been given to the solver, for example. A surprising inference may require significant cognitive effort to be understood. As a result, an approach is needed to reduce the cognitive effort required to understand the inference.
In one embodiment of the present invention, a computer implemented method for determining a minimal explanation comprises receiving a model comprising a plurality of constraints. The method further comprises determining an output for the model. The method additionally comprises constructing, by a processor, a subset of the constraints that provide the same output for the model. The construction comprises determining a first constraint that forms part of the subset of the constraints. The construction further comprises testing the constraint(s) that form the subset of the constraints to determine if the subset of constraints provides the same output for the model. Furthermore, the construction comprises selecting a further constraint that has a variable in common with at least one constraint in the subset of constraints, the further constraint forming part of the subset of constraints. Additionally, the construction comprises repeating the testing of the constraint(s) and the selecting of the further constraint until the testing of the constraint(s) that form the subset of the constraints determines that the subset of constraints does provide the same output for the model.
Other forms of the embodiment of the method described above are in a system and in a computer program product.
The foregoing has outlined rather generally the features and technical advantages of one or more embodiments of the present invention in order that the detailed description of the present invention that follows may be better understood. Additional features and advantages of the present invention will be described hereinafter which may form the subject of the claims of the present invention.
A better understanding of the present invention can be obtained when the following detailed description is considered in conjunction with the following drawings, in which:
A minimal explanation can be considered to be a subset of the constraints 12 that describes the result given by the model 10 under the operation of the solver. For example, the model 10 may be modelling the launch of a new product, with constraints 12 relating to resources required for marketing, legal approval and so on. The model 10 is sent to the solver, which provides an output. For example, a solver may produce an output from a model 10 that a new product cannot be launched within 3 months, but this output cannot easily be understood without knowing which constraints 12 in particular lead to the output. A minimal explanation is a subset of the constraints 12 that produces the same output using the same solver, and from which no constraint 12 can be removed without losing this property.
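Purely as an illustrative sketch (not a definition of the claimed embodiments), the two checks implied above can be written against a hypothetical solver interface; the callable name solve and the representation of constraints as plain Python values are assumptions made for illustration only.

def is_sufficient(subset, solve, reference_output):
    """True if the subset of constraints alone reproduces the output
    obtained from the full model (solve is a hypothetical callable)."""
    return solve(subset) == reference_output

def is_minimal_explanation(subset, solve, reference_output):
    """True if the subset is sufficient and dropping any single
    constraint from it makes it no longer sufficient."""
    if not is_sufficient(subset, solve, reference_output):
        return False
    return all(
        not is_sufficient([c for c in subset if c != dropped], solve, reference_output)
        for dropped in subset
    )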
The time needed to compute a minimal explanation may be large, depending on the kind of output under consideration. In particular, for combinatorial optimization problems, such as scheduling problems, where a cognitively challenging output can be, for example, that there is no solution to a given problem, it can take a long time to find a minimal explanation to this absence of solution. Therefore, a system is provided that improves the performance of finding a minimal explanation.
For example, the model 10 could relate to the building of five houses. For each house, several tasks must be planned, such as masonry, carpentry, plumbing, ceiling, roofing, painting, windows, façade, garden, etc. Some of the constraints 12 relate to precedence (masonry before roofing, plumbing before painting, etc.) and some relate to resources (a single plumber works on all houses, the roofs of houses 2 and 3 are made by the same workers, etc.). A constraint solver infers (from the model 10) that the garden of the first house cannot be finished before Christmas. An explanation for this inference is required, which is equal to a minimal subset of the constraints 12 that leads to the inference.
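Purely by way of a hedged illustration (the constraint names, variable names and data layout below are assumptions, not part of the model 10 itself), a few of the house-building constraints 12 and the neighbor relation defined by shared variables can be sketched as follows:

# Hypothetical encoding of a few constraints 12, each with the task
# variables it mentions; "h1" denotes the first house, and so on.
constraints = {
    "masonry_before_roofing_h1":  {"masonry_h1", "roofing_h1"},
    "roofing_before_garden_h1":   {"roofing_h1", "garden_h1"},
    "garden_h1_before_christmas": {"garden_h1"},
    "single_plumber":             {"plumbing_h1", "plumbing_h2", "plumbing_h3",
                                   "plumbing_h4", "plumbing_h5"},
}

def neighbors(name, constraints):
    """Constraints sharing at least one variable with the named constraint."""
    variables = constraints[name]
    return {other for other, vars_ in constraints.items()
            if other != name and variables & vars_}

print(sorted(neighbors("roofing_before_garden_h1", constraints)))
# ['garden_h1_before_christmas', 'masonry_before_roofing_h1']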
The methodology performed by the processor 22 for generating the minimal explanation 18 starts from a current partial explanation, called partialExp. By definition, this explanation is a subset of a minimal explanation 18. The current partial explanation is initialized with a non-empty set, which is called a seed. The seed is a part of a minimal explanation 18. In some cases, for some specific reasons, such an initial seed is known. For example, given a constraint model M where it is desirable to know why a variable X has been assigned a value v, the computer system 20 will find a minimal explanation of the fact that the model M′ derived from M has no solution. In this case, the derived model M′ is defined to be the union of M and the constraint {X != v}, meaning that X is different from v. It is known in advance that the constraint X != v is necessarily in the minimal explanation, so this constraint 12 can be taken as the seed without computation.
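As a short sketch of this case only (the tuple-based constraint encoding and the function name are illustrative assumptions), the derived model M′ and its known seed can be built as follows:

def derive_model(model_constraints, variable, value):
    """Return (M', seed) for explaining why variable X was assigned value v:
    M' is the union of M and the constraint {X != v}, and that negated
    assignment is known in advance to be in the minimal explanation."""
    negation = ("!=", variable, value)            # the constraint {X != v}
    derived_model = list(model_constraints) + [negation]
    return derived_model, negation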
In other cases, the computer system 20 does not know a seed without performing some computation. In such cases, the processor 22 will simply compute a seed. In order to carry this out, the processor 22 will use one of the known methods to compute the first element of an explanation. The best approaches use a binary search and call the solver O(log n) times, where n is the number of candidates that could appear in the minimal explanation. A seed can therefore be found in O(log n) calls to the solver. To compute a minimal explanation more efficiently, the processor 22 can first order the different constraints 12 by their number of neighbors, where a neighbor is defined as any other constraint 12 that has a variable in common with the constraint 12 in question. The binary search is then implemented in such a way that the processor 22 finds the constraint 12 that is the least connected amongst the constraints 12 of the minimal explanation 18.
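One possible reading of this seed computation, sketched under stated assumptions (the callable entails(subset), which asks the solver whether the subset still yields the output under explanation, and the pre-ordering of the constraints are illustrative), is the classical binary search for the last constraint of the shortest entailing prefix:

def find_seed(ordered_constraints, entails):
    """Return one constraint that belongs to some minimal explanation,
    using O(log n) solver calls.  Assumes the full list entails the
    output and the empty set does not.  If the list is ordered with the
    most connected constraints first, the returned boundary constraint
    tends to be among the least connected members of an explanation."""
    lo, hi = 1, len(ordered_constraints)    # invariant: prefix of length hi entails
    while lo < hi:
        mid = (lo + hi) // 2
        if entails(ordered_constraints[:mid]):
            hi = mid                        # a shorter prefix still suffices
        else:
            lo = mid + 1                    # something beyond rank mid is needed
    # The prefix of length hi entails the output but the prefix of length
    # hi - 1 does not, so the constraint at rank hi is in a minimal explanation.
    return ordered_constraints[hi - 1]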
The advantage of beginning with the least connected element of a minimal explanation will become clear from the description below. The processor 22 will consider as candidates to be injected in the current partial explanation only those that are connected to the constraints 12 of the current partial explanation. The fewer constraints 12 that have to be considered the better, since this reduces the time required to identify a minimal explanation 18. Hence, beginning with the least connected constraint 12 of a minimal explanation 18 is a heuristic that may lead to the consideration of fewer possible candidates.
The processor 22 provides an extension of the current partial explanation. The processor 22 is operated to address the problem of how to extend the current partial explanation, while preserving the property of the partial explanation being a subset of a minimal explanation 18. The approach used is built on the idea that it is sufficient to extend the current partial explanation by considering only the neighbors of the constraints 12 in the current partial explanation. The rationale behind this idea is that a minimal explanation 18 is necessarily composed of constraints 12 that are connected; otherwise, the explanation 18 will not be minimal.
In other words, there are different ways of constructing the minimal explanation 18. The processor 22 has a degree of freedom in choosing the next constraint 12 that is added to the current partial explanation. The processor 22 exploits the fact that the processor 22 can complete the current partial explanation by following the connections. In a sparse graph, the number of candidates to consider is strongly reduced.
The implementation executed by the processor 22 performs a focused binary search. Instead of considering the n possible candidates, the processor 22 considers only those constraints 12 that are connected to constraints 12 within the current partial explanation. This is achieved with reference to the graph of
When selecting a new constraint 12 for consideration, the processor 22 navigates the built graph from a node defining a constraint 12 in the subset 18 of constraints to a connected node defining a constraint 12 not in the subset 18 of constraints. A heuristic that improves efficiency is to order the k neighbors by their number of neighbors outside the current explanation (and inside the still possible candidates); the binary search is then implemented so that the processor 22 returns the least connected constraint 12 that extends the current explanation. The connected node selected has the lowest number of connections of all of the nodes connected to the node defining a constraint 12 in the subset 18 of constraints.
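A possible sketch of this ordering heuristic follows; the names and data layout (graph as a mapping from each constraint to the set of constraints sharing a variable with it, partial_exp and candidates as sets) are assumptions made for illustration.

def order_frontier(graph, partial_exp, candidates):
    """Order the neighbors of the current partial explanation so that the
    candidates with the fewest connections outside the explanation (and
    inside the remaining candidates) come first."""
    frontier = set()
    for constraint in partial_exp:
        frontier |= graph[constraint]
    frontier = (frontier & candidates) - partial_exp
    return sorted(frontier,
                  key=lambda n: len((graph[n] - partial_exp) & candidates))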
Referring to the example mentioned in the text above, this is an example in the context of scheduling problems. There is a need to build five houses. Each house requires tasks, such as masonry, carpentry, plumbing, ceiling, roofing, painting, windows, façade and the garden, to be completed. Precedence constraints 12 between the tasks of a particular house are given, for instance, such that masonry precedes roofing and plumbing precedes painting, and there are also constraints 12 that relate to resources that are shared between some houses: the same plumber works on the five houses, the roofs of houses 2 and 3 are made by the same workers, and so on.
The graph shown in
{the garden of the first house must be finished before Christmas,
the garden of the first house must begin after the roofing of the first house,
the roofing of the first house must begin after the masonry of the first house}.
The method performed by the processor starts from the seed "the garden of the first house must be finished before Christmas." Then the processor 22 tries to extend the seed by looking at the neighbors of the seed, of which there is only one: "the garden of the first house must begin after the roofing of the first house." A call to the solver is performed to check if this is a sufficient explanation, which is not the case. Therefore, one new neighbor is added: "the roofing of the first house must begin after the masonry of the first house." A call to the solver is performed again, and this time returns an output that the set of these three constraints 12 is a sufficient explanation. The processor 22 has arrived at the minimal explanation 18 with only two calls to the solver.
The methodology used by the processor 22 leverages the fact that a minimal explanation necessarily satisfies a connectedness property. The processor 22 exploits this property and accelerates the computation of an explanation for sparse graphs. The methodology starts from a partial explanation (which has to be seeded in some way) and tries to extend the partial explanation by testing the neighbor constraints 12 of a constraint 12 that is already present in the partial explanation. Once the partial explanation provides the same outcome as the entire model 10, then the partial explanation is the minimal explanation 18 of the outcome. At this point, the processor 22 has completed the necessary steps to determine the minimal explanation and the process can terminate.
Step S5.1 comprises determining a first constraint 12 that forms part of the subset 18 of the constraints 12. As discussed above, this can be done in a number of different ways, depending upon the preferred methodology, but at this point the processor 22 finds a first constraint 12 for the minimal explanation defined by the subset 18 of constraints 12.
At step S5.2, the method continues by testing the constraint(s) 12 that form the subset 18 of the constraints 12, in order to determine if the subset 18 of constraints 12 provides the same output 14 for the model 10 (and the process is therefore ended). At this point, the processor 22 is checking the completeness of the current explanation 18. In the first pass through the method shown in
The final step in the method is Step S5.3, which comprises selecting a further constraint 12 that has a variable in common with at least one constraint 12 in the subset 18 of constraints 12, where the further constraint 12 forms part of the subset 18 of constraints 12. The further constraint 12 is chosen as one that is connected to at least one of the constraints 12 already present within the subset 18. As can be seen from
The algorithm operated by the processor 22 repeats the testing of the constraints 12 and the selecting of a further constraint 12 until the testing of the constraint(s) 12 that form the subset 18 of the constraints 12 (in step S5.2) determines that the subset 18 of constraints 12 does indeed provide the same output 14 from the model 10. In this way, a new constraint 12 is continually sought and then added to the subset 18 until the subset 18 does provide the same output 14, thereby confirming that the current members of the subset 18 do provide the sought minimal explanation.
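A simplified sketch of this loop (steps S5.1 to S5.3) is given below under stated assumptions: entails(subset) asks the solver whether the subset yields the same output 14 as the full model 10, seed is the first constraint of step S5.1, and neighbors(c) returns the constraints 12 sharing a variable with c. The naive selection of the next constraint here stands in for the focused binary search of the detailed description, which additionally guarantees that each added constraint belongs to a minimal explanation.

def build_explanation(model, entails, seed, neighbors):
    """Sketch of the overall construction loop of steps S5.1 to S5.3."""
    subset = {seed}                                  # step S5.1
    remaining = set(model) - subset
    while not entails(subset):                       # step S5.2
        # Step S5.3: only constraints connected to the current subset are
        # considered; the connectedness of a minimal explanation is assumed
        # to keep this frontier non-empty while the subset is insufficient.
        frontier = set()
        for constraint in subset:
            frontier |= set(neighbors(constraint))
        frontier &= remaining
        # Heuristic: prefer the least connected frontier constraint.
        chosen = min(frontier, key=lambda c: len(set(neighbors(c)) & remaining))
        subset.add(chosen)
        remaining.discard(chosen)
    return subset                                    # yields the same output 14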
First, at step S6.1, the method builds a NeighborList, which is the list of all the constraints 12 in the neighborhood of a constraint 12 in the partial explanation, ranked from 1 to K, and initializes a set of candidate indices, called the candidates (numbered 30 in the Figure), to [1 . . . K]. Then, at step S6.2, the method selects an index in the candidates 30. The selection strategy may be to always take the largest index in the candidates 30, leading to a linear search, or it may be to take the median of the candidates 30, leading to a binary search. At this point, if there is only one remaining candidate constraint (all others having been eliminated), the last remaining constraint is necessarily the one that extends the current explanation; this terminates the method, returning the sole remaining constraint 12 as the output of the method. Otherwise, at step S6.3, a current constraint system 32 is built, called S(i), which is defined as the union of partialExp and currentSystem, minus the constraints occurring at ranks i+1 to K in the list NeighborList.
Then, at step S6.4, this current constraint system S(i) is sent to the constraint solver to check whether the solver can deduce the output 14 from S(i). If the answer is yes, then the method proceeds to step S6.5, where the set of candidates is reduced by removing the candidates with indices i+1 to K. In addition, the constraints occurring at ranks i+1 to K in the list NeighborList are definitely not in the explanation, and can therefore be removed from currentSystem 32 to simplify the constraint system. If the answer is no, then the solver cannot deduce the output 14 from S(i) and the method moves to step S6.6. Here, the process takes into account that at least one of the constraints that is not in S(i) is needed to produce the output 14, and thus removes all candidates with a rank less than or equal to i.
To summarize the process shown in the
partialExp:=partialExp+(constraint of rank i in NeighborList)
currentSystem:=currentSystem−(all constraints from rank i to K in NeighborList)
If this process is implemented in order to perform a binary search on NeighborList, and if K is the size of NeighborList, the method calls the constraint solver O(log K) times. A heuristic that improves efficiency is to order the K neighbors by their number of neighbors outside the current explanation (and inside currentSystem); the binary search is then implemented so that it returns the least connected element that extends the current explanation. Once the current explanation has been extended with one additional constraint 12, the method returns to the step S5.2 in
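The extension step of steps S6.1 to S6.6 can be sketched as follows, under assumptions about names and data layout (neighbor_list is the ranked NeighborList, current_system is the set of still-possible constraints 32, partial_exp is partialExp, and entails(constraint_set) asks the solver whether the output 14 can be deduced from that set); the sketch assumes that S(K) entails the output while S(0) does not, that is, that the extending constraint lies in NeighborList.

def extend_explanation(partial_exp, current_system, neighbor_list, entails):
    """One extension step: returns the updated (partial_exp, current_system)
    after adding the constraint of NeighborList that extends the explanation,
    found with O(log K) calls to the solver."""
    K = len(neighbor_list)
    lo, hi = 1, K                           # the candidates 30: ranks [lo .. hi]
    while lo < hi:
        i = (lo + hi) // 2                  # step S6.2: take the median index
        excluded = set(neighbor_list[i:])   # ranks i+1 .. K of NeighborList
        s_i = (set(partial_exp) | set(current_system)) - excluded   # step S6.3
        if entails(s_i):                    # step S6.4
            hi = i                          # step S6.5: drop candidates i+1 .. K
            current_system = set(current_system) - excluded
        else:
            lo = i + 1                      # step S6.6: drop candidates <= i
    chosen = neighbor_list[lo - 1]          # the sole remaining candidate
    partial_exp = set(partial_exp) | {chosen}
    current_system = set(current_system) - set(neighbor_list[lo - 1:])
    return partial_exp, current_system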
The present invention may be a system, a method, and/or a computer program product. The computer program product may include a computer readable storage medium (or media) having computer readable program instructions thereon for causing a processor to carry out aspects of the present invention.
The computer readable storage medium can be a tangible device that can retain and store instructions for use by an instruction execution device. The computer readable storage medium may be, for example, but is not limited to, an electronic storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the foregoing. A non-exhaustive list of more specific examples of the computer readable storage medium includes the following: a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or Flash memory), a static random access memory (SRAM), a portable compact disc read-only memory (CD-ROM), a digital versatile disk (DVD), a memory stick, a floppy disk, a mechanically encoded device such as punch-cards or raised structures in a groove having instructions recorded thereon, and any suitable combination of the foregoing. A computer readable storage medium, as used herein, is not to be construed as being transitory signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through a waveguide or other transmission media (e.g., light pulses passing through a fiber-optic cable), or electrical signals transmitted through a wire.
Computer readable program instructions described herein can be downloaded to respective computing/processing devices from a computer readable storage medium or to an external computer or external storage device via a network, for example, the Internet, a local area network, a wide area network and/or a wireless network. The network may comprise copper transmission cables, optical transmission fibers, wireless transmission, routers, firewalls, switches, gateway computers and/or edge servers. A network adapter card or network interface in each computing/processing device receives computer readable program instructions from the network and forwards the computer readable program instructions for storage in a computer readable storage medium within the respective computing/processing device.
Computer readable program instructions for carrying out operations of the present invention may be assembler instructions, instruction-set-architecture (ISA) instructions, machine instructions, machine dependent instructions, microcode, firmware instructions, state-setting data, or either source code or object code written in any combination of one or more programming languages, including an object oriented programming language such as Smalltalk, C++ or the like, and conventional procedural programming languages, such as the “C” programming language or similar programming languages. The computer readable program instructions may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the latter scenario, the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet Service Provider). In some embodiments, electronic circuitry including, for example, programmable logic circuitry, field-programmable gate arrays (FPGA), or programmable logic arrays (PLA) may execute the computer readable program instructions by utilizing state information of the computer readable program instructions to personalize the electronic circuitry, in order to perform aspects of the present invention.
Aspects of the present invention are described herein with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer readable program instructions.
These computer readable program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks. These computer readable program instructions may also be stored in a computer readable storage medium that can direct a computer, a programmable data processing apparatus, and/or other devices to function in a particular manner, such that the computer readable storage medium having instructions stored therein comprises an article of manufacture including instructions which implement aspects of the function/act specified in the flowchart and/or block diagram block or blocks.
The computer readable program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other device to cause a series of operational steps to be performed on the computer, other programmable apparatus or other device to produce a computer implemented process, such that the instructions which execute on the computer, other programmable apparatus, or other device implement the functions/acts specified in the flowchart and/or block diagram block or blocks.
The flowchart and block diagrams in the Figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of instructions, which comprises one or more executable instructions for implementing the specified logical function(s). In some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems that perform the specified functions or acts or carry out combinations of special purpose hardware and computer instructions.
The descriptions of the various embodiments of the present invention have been presented for purposes of illustration, but are not intended to be exhaustive or limited to the embodiments disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the described embodiments. The terminology used herein was chosen to best explain the principles of the embodiments, the practical application or technical improvement over technologies found in the marketplace, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein.