The present Application is related to the following co-pending applications:
U.S. patent application Ser. No. 10/671,932, filed Sep. 29, 2003, to Chen et al., entitled “Method and Structure for Monitoring Moving Objects”; and
U.S. patent application Ser. No. 10/673,651, filed Sep. 29, 2003, to Chen et al., entitled “System and Method for Indexing Queries, Rules and Subscriptions”,
both assigned to the present assignee, and both incorporated herein by reference.
1. Field of the Invention
The present invention generally relates to activity/event monitoring in various application areas such as business activity monitoring for corporate management, sensor activities monitoring for continual queries, road traffic condition monitoring for traffic control, event matching for pub/sub applications, information monitoring for selective information dissemination, and health activity monitoring for disease outbreaks or bio-attacks. More specifically, it discloses a predicate/query indexing method for monitoring activities/events against a plurality of continual range predicates/queries.
2. Description of the Related Art
Fast matching of events against a large number of predicates/queries is important for many applications, such as business activity monitoring, content-based pub/sub (publication/subscription), continual queries, health activity monitoring, and selective information dissemination services. Users simply specify their interests in the form of a conjunction of predicates. The system then automatically monitors these user interests against a continual stream of events, conditions, or activities.
Generally, an efficient predicate index is needed. Prior work for fast event monitoring has mostly focused on building predicate indexes with equality-only clauses, as in, for example:
However, many queries/predicates contain non-equality range clauses. For example, stock price, salary, and object location tend to involve non-equality range predicates.
It is difficult to construct an effective index for multidimensional range predicates. It is even more challenging if these predicates are overlapping, as they usually are because people tend to share similar interests. For instance, people tend to be interested in the current price ranges of individual stocks. Hence, the range predicates of their interests are likely to be overlapping.
Although multidimensional range predicates can be treated as spatial objects, a typical spatial index, such as an R-tree, is generally not effective for monitoring events. This is because an R-tree method is generally a disk-based indexing method and an R-tree quickly degenerates if spatial objects are highly overlapping (V. Gaede et al., “Multidimensional access methods,” ACM Computing Surveys, 30(2):170-231, 1998; A. Guttman, “R-trees: A dynamic index structure for spatial searching,” Proceedings of ACM SIGMOD, 1984).
Hence, a need is recognized for a new and effective system and method for efficient monitoring of events against range queries, some of which may overlap with one another.
In view of the foregoing problems, drawbacks, and disadvantages of the conventional systems, it is an exemplary feature of the present invention to provide a structure (and method) for building an efficient query index for monitoring continual range queries against events.
It is, therefore, an exemplary purpose of the present invention to provide a structure and method for application areas such as business activity monitoring for corporate management, sensor activities monitoring for continual queries, road traffic condition monitoring for traffic control, event matching for publication/subscription applications, information monitoring for selective information dissemination, and health activity monitoring for disease outbreaks or bio-attacks, using an efficient query index for monitoring continual range queries against events.
Hence, in a first aspect of the present invention, described herein is a method (and structure) for monitoring continual range queries against events that includes decomposing each range query into one or more predefined virtual constructs, building a query index, and using the query index to match an event with the range queries.
In a second aspect of the present invention, described herein is a method of providing a service of monitoring events or conditions, including at least one of: providing a service that monitors events against interests of a customer, the service monitoring the events by decomposing continual range queries related to the customer interests into one or more predefined virtual constructs, building a query index, and using the query index to match an event with said range queries; maintaining one or more customer interests expressed as continual range queries for a service that monitors events in the manner described; and notifying a subset of the customers whose interests match an event.
In a third aspect of the present invention, described herein is a system for monitoring continual range queries against events, including a decomposing module that decomposes each range query into one or more predefined virtual constructs, a query index construction module, and an event matching module that uses the query index to match an event with the range queries.
In a fourth aspect of the present invention, described herein is an apparatus for monitoring continual range queries against events, wherein the apparatus is one of: a query monitor that includes a decomposing module that decomposes each range query into one or more predefined virtual constructs, a query index construction module, and an event matching module that uses the query index to match an event with the range queries; a sensor to detect occurrence of events and provide the occurrence of events to the query monitor; and a client receiver to permit a client to be notified of occurrence of an event of interest to the client.
According to a fifth aspect of the present invention, described herein is a signal-bearing medium tangibly embodying a program of machine-readable instructions executable by a digital processing apparatus to perform the method for monitoring continual range queries against events, as described above.
The foregoing and other objects, aspects and advantages will be better understood from the following detailed description of a preferred embodiment of the invention with reference to the drawings, in which:
Referring now to the drawings, and more particularly to
The sensors, clients and query monitors are connected via a communication network 110, e.g., the Internet. The query monitors 121, 122 typically are computer servers. The query indexing method disclosed in the present invention is employed by the query monitors 121, 122 to efficiently identify all the range queries that match an incoming event. Those skilled in the art will appreciate that the sensors, the clients, or the query monitors may employ wireless technologies for communication.
As shown in
As shown in steps 206-209, event matching is conceptually simple. For each event point, the search results are stored in the ID lists associated with the activated VCRs that cover that point. However, it is computationally nontrivial to identify the covering VCRs, and the present invention provides a very efficient method to accomplish this task.
A covering VCR set for each point is defined. These covering VCR sets share two common properties: constant size and identical gap pattern. Based on these properties, a procedure for efficient event monitoring with a constant time complexity is disclosed.
For exposition, the two-dimensional case will be discussed. A set of virtual construct rectangles is defined so that each point in the monitoring region, which is a rectangle for the two-dimensional case, will be covered. As will be demonstrated shortly, each VCR in the monitoring region is identified by the location of its bottom-left corner and its size, and the VCRs in the monitoring region may have different sizes and shapes. Each VCR has a unique ID, and this ID can be computed with a simple formula from the location of its bottom-left corner, its width and height, and the width of the overall region being monitored.
Before a range query, which is represented as a rectangle in the two-dimensional case, is inserted into the ID list, it is decomposed into one or more VCRs (e.g., step 204). Then, in step 205, the query ID is inserted into the ID lists associated with the decomposed VCRs. There are many ways that range query decomposition could be done.
One simple way discussed shortly is to cut a strip rectangle from the bottom of the query rectangle and progressively move upwards. The height of the strip rectangle is the maximum VCR height that will fit into the range query. For each strip rectangle, the largest VCR having the same height as the strip rectangle is used to cut the strip rectangle, and each strip rectangle is decomposed with VCRs of the same height.
As shown in step 207 of
The covering VCR sets for all the points share two common properties, enabling an efficient way to enumerate all the IDs of VCRs in a covering VCR set. A distance table which stores the ID differences between a VCR in the covering VCR set and a pivot point is pre-computed. As a result, all the VCRs in a covering VCR set can be enumerated by adding the differences to the ID of the pivot point.
Finally, as shown in step 210, to delete a range query, it is first decomposed into one or more VCRs, similar to query insertion. Then, in step 211, the query ID is removed from the ID lists associated with the decomposed VCRs.
The Virtual Construct Rectangles (VCRs) used in this exemplary preferred embodiment are shown in
The query indexing method exemplarily disclosed herein predefines B virtual construct rectangles for each point (x, y).
It can be seen that, in
In accordance with the exemplary identification scheme, the VCR IDs shown in
Therefore, returning to
The formula above is based on the fact that there are (x+yRx) points prior to (x, y) in a horizontal scan 403 of the integer points in the monitoring region 400, as shown in the lower portion of
Moreover, for all the B VCRs sharing the same bottom-left corners, their IDs are assigned according to
In the exemplary embodiment, Lx and Ly are assumed to be numbers that are a power of 2. Namely, kx=log(Lx), and ky=log(Ly), or more accurately, kx=log2(Lx), and ky=log2(Ly), where kx and ky are integers. However, it should be apparent that the IDs can be assigned differently. As one alternative method, the scan could be performed vertically (e.g., holding x constant and varying y from 0 to Ry-1).
With predefined VCRs, the range queries are decomposed with one or more VCRs.
Incidentally, using the formula shown in
VCR501, ID(3,3,2^2,2^2)=9(3+3*17)+2(2+1)+2=494;
VCR502, ID(7,3,2^2,2^2)=9(7+3*17)+2(2+1)+2=530;
VCR504, ID(13,3,2^0,2^2)=9(13+3*17)+2(2+1)+0=582;
VCR505, ID(3,7,2^2,2^1)=9(3+7*17)+1(2+1)+2=1103; and
VCR508, ID(13,7,2^0,2^1)=9(13+7*17)+1(2+1)+0=1191.
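The arithmetic in these examples can be checked with a short script. The following is a minimal sketch, not the formula as drawn in the figures: the rule it implements, ID = B(x + y*Rx) + j(kx+1) + i for a VCR of width 2^i and height 2^j, is inferred from the worked expressions above, and the relation B = (kx+1)(ky+1) is an assumption that agrees with B=9 when kx=ky=2. The function name `vcr_id` and its parameter order are hypothetical.

```python
def vcr_id(x, y, i, j, rx, kx, ky):
    """ID of the VCR with bottom-left corner (x, y), width 2**i, height 2**j.

    Assumed rule, inferred from the worked examples:
    ID = B * (x + y * Rx) + j * (kx + 1) + i, with B = (kx + 1) * (ky + 1).
    """
    per_point = (kx + 1) * (ky + 1)  # B: number of VCRs sharing one corner
    scan_rank = x + y * rx           # points before (x, y) in a horizontal scan
    return per_point * scan_rank + j * (kx + 1) + i


# Parameters read off the worked examples: Rx = 17, Lx = Ly = 4 (kx = ky = 2).
RX, KX, KY = 17, 2, 2

print(vcr_id(3, 3, 2, 2, RX, KX, KY))   # VCR501: 494
print(vcr_id(7, 3, 2, 2, RX, KX, KY))   # VCR502: 530
print(vcr_id(13, 3, 0, 2, RX, KX, KY))  # VCR504: 582
print(vcr_id(3, 7, 2, 1, RX, KX, KY))   # VCR505: 1103
```

Because the scan rank is multiplied by B and the offset j(kx+1)+i is always smaller than B, two distinct VCRs never receive the same ID under this rule.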
When that happens, in step 608, the decomSet is returned, where decomSet stores all the IDs of VCRs that are used to decompose (a, b, w, h). In step 604, if the height of the working rectangle is greater than 0, a strip rectangle with a height of maxVCRh(Hw) is cut from the working rectangle. That is, the height of the strip rectangle is the maximum VCR height that is smaller than or equal to the height of the working rectangle. In step 606, the strip rectangle is decomposed from left to right: at each position, the largest VCR that fits the remaining part of the strip rectangle is found and added to the decomSet.
After each strip rectangle is decomposed, as detected in step 605, another strip rectangle is cut from the working rectangle, in step 607, and similarly decomposed.
At the end, in step 608, the decomposed VCRs are contained in decomSet. It should be apparent that there are other ways to decompose a range query. For example, some of the decomposed VCRs may overlap one another. It should also be apparent that it would be possible to expand the set of VCRs to include a larger power of 2, should an entered range query be larger than initially expected.
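The strip-based decomposition described above can be sketched as follows. This is an illustrative reading of the steps, not the flowchart itself: the function names `max_vcr_side` and `decompose` are hypothetical, VCRs are returned as (x, y, width, height) tuples rather than IDs, and the sample query rectangle (3, 3, 11, 7) is an assumption chosen to be consistent with the VCR corners that appear in the worked ID examples.

```python
def max_vcr_side(n, limit):
    """Largest power of 2 that is <= n and <= limit (limit is Lx or Ly)."""
    side = 1
    while side * 2 <= min(n, limit):
        side *= 2
    return side


def decompose(a, b, w, h, lx, ly):
    """Decompose the query rectangle (a, b, w, h) into VCRs, strip by strip."""
    decom_set = []
    y, hw = b, h                          # bottom edge and height of the working rectangle
    while hw > 0:
        hs = max_vcr_side(hw, ly)         # strip height: tallest VCR that fits
        x, ws = a, w                      # left edge and width left in this strip
        while ws > 0:
            wv = max_vcr_side(ws, lx)     # widest VCR that fits the rest of the strip
            decom_set.append((x, y, wv, hs))
            x, ws = x + wv, ws - wv
        y, hw = y + hs, hw - hs           # cut the next strip above this one
    return decom_set


for vcr in decompose(3, 3, 11, 7, lx=4, ly=4):
    print(vcr)
```

With these inputs the sketch produces three strips of four VCRs each, and no two of the resulting VCRs overlap in area. The overlapping decompositions mentioned above would need a different cutting rule.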
After decomposition, the query ID is inserted into the ID lists associated with the decomposed VCR (e.g., step 205 in
Accordingly, assume that Cov(a, b) represents the covering VCR set for point (a, b). That is, Cov(a, b) contains all the VCRs that actually cover point (a, b).
Remembering that the VCR set contains a maximum VCR having dimensions Lx, Ly, then Cov(a, b) will include the set of any activated VCRs whose bottom-left corners are in the shaded region southwest of (a, b) and whose upper-right corners are in the shaded region northeast of (a, b), as shown in
The covering VCR sets for all points in the monitoring region share two common properties: constant size and identical gap pattern. For ease of exposition, consider the region that is inside the monitoring region separated with a boundary strip region, namely, the region Lx≦a≦Rx−Lx−1, and Ly≦b≦Ry−Ly−1. For the boundary strip region, a similar method can be applied. It is noted that a and b are considered variables in this context.
The constant size property says that |Cov(a, b)|=|Cov(c, d)|, for any two different points (a, b) and (c, d). Namely, the sizes of the covering VCR set for individual points are all the same. This can be visually appreciated from
The identical gap pattern property says that, if the IDs inside a covering VCR set are sorted, the ID differences between any two VCRs are constant among all the covering VCR sets of different points. Let Vi,(a,b) denote the ID of the i-th covering VCR for (a, b) in sorted order, so that Vi+1,(a,b)>Vi,(a,b). Then, Vi+1,(a,b)−Vi,(a,b)=Vi+1,(c,d)−Vi,(c,d), for 1≦i<|Cov(a,b)| and any two points (a, b) and (c, d). This property can also be appreciated from
With these two properties, a difference table DT can be pre-computed, which table stores the ID differences between all the covering VCRs and a pivot VCR. For a point (a, b), the pivot VCR is defined as (a−Lx,b−Ly,1,1), and shown in
Then, in step 804, each value in the difference table DT is added to the ID of the pivot VCR, and the resulting ID is included in Cov(a, b). At the end, in step 803, the covering VCR set stored in Cov(a, b) is returned.
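The pivot-plus-difference-table idea can be illustrated with a small sketch. Several points here are assumptions rather than statements from the figures: the ID rule is the one inferred from the worked examples (B(x + y*Rx) + j(kx+1) + i, with B = (kx+1)(ky+1)), a VCR (x, y, w, h) is taken to cover a point (a, b) when x≦a≦x+w and y≦b≦y+h, and only points inside the boundary strip are considered. All function names are hypothetical.

```python
def vcr_id(x, y, i, j, rx, kx, ky):
    """Assumed ID rule: B*(x + y*Rx) + j*(kx+1) + i, with B = (kx+1)*(ky+1)."""
    return (kx + 1) * (ky + 1) * (x + y * rx) + j * (kx + 1) + i


def covering_ids_brute(a, b, rx, kx, ky):
    """Enumerate every VCR covering (a, b), assuming closed boundaries."""
    ids = []
    for j in range(ky + 1):                       # height 2**j
        for i in range(kx + 1):                   # width 2**i
            for y in range(b - 2**j, b + 1):      # corners south of (a, b)
                for x in range(a - 2**i, a + 1):  # corners west of (a, b)
                    ids.append(vcr_id(x, y, i, j, rx, kx, ky))
    return sorted(ids)


def pivot_id(a, b, rx, kx, ky):
    """ID of the pivot VCR (a - Lx, b - Ly, 1, 1)."""
    return vcr_id(a - 2**kx, b - 2**ky, 0, 0, rx, kx, ky)


def build_difference_table(rx, kx, ky):
    """Pre-compute ID differences to the pivot once, at one interior point."""
    a, b = 2**kx, 2**ky
    base = pivot_id(a, b, rx, kx, ky)
    return [v - base for v in covering_ids_brute(a, b, rx, kx, ky)]


def covering_ids(a, b, dt, rx, kx, ky):
    """Covering VCR IDs for (a, b): pivot ID plus each stored difference."""
    base = pivot_id(a, b, rx, kx, ky)
    return [base + d for d in dt]


dt = build_difference_table(rx=17, kx=2, ky=2)
print(len(dt))
print(covering_ids(8, 9, dt, 17, 2, 2) == covering_ids_brute(8, 9, 17, 2, 2))
```

The table is built once and reused for every event point, so the per-event work is one addition per table entry, independent of how many queries are indexed.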
With the IDs of all the covering VCRs for a point, the range queries can be found directly from the ID lists associated with all the covering VCRs.
Those skilled in the art will appreciate that the VCR indexing method can be extended to K>2 dimensions. Assuming that the K dimensions have R1, R2, . . . , RK values (all starting from 0), and also assuming that L1, L2, . . . , LK are the maximum sizes of a K-dimensional virtual construct region (VCR), B VCRs can potentially be defined for each point. Each VCR is assigned a unique ID.
To insert a range query, it is first decomposed into a set of VCRs and then the query ID is inserted into the ID lists associated with the decomposed VCRs. To find all the range queries matching an event, all the covering VCRs are first found and then the query IDs associated with the covering VCRs are found.
To delete a range query, it is first decomposed into one or more VCRs, similar to query insertion. Then, the query ID is removed from the ID lists associated with the decomposed VCRs.
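The insert, match, and delete operations can be tied together in a small two-dimensional sketch. It is illustrative only: the class name `VCRIndex` and its methods are hypothetical, the ID rule is the one inferred from the worked examples, coverage is assumed to include VCR boundaries, and matching enumerates the candidate covering VCRs directly instead of using a pre-computed difference table, which keeps points near the region border simple.

```python
from collections import defaultdict


class VCRIndex:
    """Toy 2-D query index: ID lists keyed by VCR ID (hypothetical API)."""

    def __init__(self, rx, kx, ky):
        self.rx, self.kx, self.ky = rx, kx, ky
        self.id_lists = defaultdict(set)      # VCR ID -> set of query IDs

    def _vcr_id(self, x, y, i, j):
        # Assumed ID rule: B*(x + y*Rx) + j*(kx+1) + i, B = (kx+1)*(ky+1).
        return (self.kx + 1) * (self.ky + 1) * (x + y * self.rx) + j * (self.kx + 1) + i

    def _decompose(self, a, b, w, h):
        """Strip-based decomposition; returns the IDs of the VCRs used."""
        ids = []
        y, hw = b, h
        while hw > 0:
            j = min(hw.bit_length() - 1, self.ky)      # tallest VCR that fits
            x, ws = a, w
            while ws > 0:
                i = min(ws.bit_length() - 1, self.kx)  # widest VCR that fits
                ids.append(self._vcr_id(x, y, i, j))
                x, ws = x + (1 << i), ws - (1 << i)
            y, hw = y + (1 << j), hw - (1 << j)
        return ids

    def insert(self, query_id, a, b, w, h):
        for vid in self._decompose(a, b, w, h):
            self.id_lists[vid].add(query_id)

    def delete(self, query_id, a, b, w, h):
        for vid in self._decompose(a, b, w, h):
            self.id_lists[vid].discard(query_id)
            if not self.id_lists[vid]:
                del self.id_lists[vid]                 # drop deactivated VCRs

    def match(self, a, b):
        """Collect query IDs from every activated VCR covering (a, b)."""
        found = set()
        for j in range(self.ky + 1):
            for i in range(self.kx + 1):
                for y in range(max(0, b - (1 << j)), b + 1):
                    for x in range(max(0, a - (1 << i)), a + 1):
                        found |= self.id_lists.get(self._vcr_id(x, y, i, j), set())
        return found


index = VCRIndex(rx=17, kx=2, ky=2)
index.insert("q1", 3, 3, 11, 7)
index.insert("q2", 4, 4, 2, 2)
print(sorted(index.match(5, 5)))
index.delete("q1", 3, 3, 11, 7)
print(sorted(index.match(5, 5)))
```

The cost of `match` depends only on kx and ky, not on the number of stored queries, which is the property the covering VCR set is meant to provide.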
Those skilled in the art will also appreciate that, in a single dimensional space, the virtual construct rectangles become virtual construct intervals (VCI), as further described in the second above-listed copending patent application.
Those skilled in the art will appreciate that various kinds of services can be provided based on the system and method disclosed in the current invention. For example, a service can be offered to monitor stock prices. Customers can express their interests in the form of queries, such as “send me alerts whenever IBM stock price is over 100”. The stock prices represent the events and are continually monitored against one or more queries.
Another exemplary service can be provided to monitor public health conditions. In this case, the events are the various statistics from hospitals, doctor offices, school absentee data, and others. These data are continuously collected and monitored against one or more continual range queries. Alerts can be sent to proper agencies when one or more of such range queries match an event.
Yet another exemplary service can be provided to offer subscription services to one or more publishers. The publishers publish contents, which are filtered or monitored against one or more subscriptions. The subscribers express their individual interests in the form of continual range queries. The service providers will monitor and match the published contents against one or more subscription queries. Matched publications are then forwarded to the subscribers.
However, it should be apparent that these examples above are only exemplary possible applications of the present invention and are not intended as limiting the present invention in any way. The present invention provides a computerized technique of monitoring events against queries and, returning back to the block diagram of
Thus, a consumer of the present invention could be considered as the end user represented as the one or more clients 111, 112 requesting the end result of the present invention, or as a service provider, represented by query monitors 121, 122, that receives a query from clients 111, 112 and provides the end result back to the clients 111, 112. Under certain conditions, it is possible that the owner/operator of the event monitors (e.g., shown as monitors 101, 102) or even the communication network 110 might be considered as the consumer of the present invention.
Exemplary Hardware Implementation
The CPUs 911 are interconnected via a system bus 912 to a random access memory (RAM) 914, read-only memory (ROM) 916, input/output (I/O) adapter 918 (for connecting peripheral devices such as disk units 921 and tape drives 940 to the bus 912), user interface adapter 922 (for connecting a keyboard 924, mouse 926, speaker 928, microphone 932, and/or other user interface device to the bus 912), a communication adapter 934 for connecting an information handling system to a data processing network, the Internet, an Intranet, a personal area network (PAN), etc., and a display adapter 936 for connecting the bus 912 to a display device 938 and/or printer 939 (e.g., a digital printer or the like).
In addition to the hardware/software environment described above, a different aspect of the invention includes a computer-implemented method for performing the above method. As an example, this method may be implemented in the particular environment discussed above.
Such a method may be implemented, for example, by operating a computer, as embodied by a digital data processing apparatus, to execute a sequence of machine-readable instructions. These instructions may reside in various types of signal-bearing media.
Thus, this aspect of the present invention is directed to a programmed product, comprising signal-bearing media tangibly embodying a program of machine-readable instructions executable by a digital data processor incorporating the CPU 911 and hardware above, to perform the method of the invention.
This signal-bearing media may include, for example, a RAM contained within the CPU 911, as represented by the fast-access storage for example. Alternatively, the instructions may be contained in another signal-bearing media, such as a magnetic data storage diskette 1000 (
Whether contained in the diskette 1000, the computer/CPU 911, or elsewhere, the instructions may be stored on a variety of machine-readable data storage media, such as DASD storage (e.g., a conventional “hard drive” or a RAID array), magnetic tape, electronic read-only memory (e.g., ROM, EPROM, or EEPROM), an optical storage device (e.g., CD-ROM, WORM, DVD, digital optical tape, etc.), paper “punch” cards, or other suitable signal-bearing media including transmission media such as digital and analog communication links and wireless links. In an illustrative embodiment of the invention, the machine-readable instructions may comprise software object code.
While the invention has been described in terms of a single preferred embodiment, those skilled in the art will recognize that the invention can be practiced with modification within the spirit and scope of the appended claims.
Further, it is noted that Applicants' intent is to encompass equivalents of all claim elements, even if amended later during prosecution.
Number | Date | Country | |
---|---|---|---|
20050071321 A1 | Mar 2005 | US |