These and other features and advantages will become more apparent from a detailed consideration of the invention when taken in conjunction with the drawings in which:
An access control system 10, as shown in
The card readers 14 are located at the portals (such as doorways, gates, etc.) that permit entrance to and exit from restricted areas. For example, some of the readers 14 may be positioned at the doorways leading into and out of a building, some of the readers 14 may be positioned at gates restricting access to elevators, escalators, and other appliances or areas of the building, some of the readers 14 may be positioned at the doorways leading into and out of floors of the building, some of the readers 14 may be positioned at the doorways leading into and out of offices or groups of offices of the building, etc. Other locations and arrangements of the readers 14 are, of course, possible.
The readers 14 read identification indicia (referred to herein as IDs) that are stored on the access control cards and that uniquely identify card holders who are authorized access into and/or out of the restricted areas. Such authorization may be selective. For example, some card holders may be authorized to enter some areas of a building but not others. The IDs read by the readers 14 are processed by the controller 12 to determine authenticity of the card holder and to detect unusual patterns from the log data that might indicate suspicious behavior such as fraudulent or improper attempts to enter restricted areas.
Each of the access permitting devices 16 is located at a corresponding portal protected by a corresponding one of the readers 14. The access permitting devices 16, for example, may be locks, gates, etc. that allow authorized card holders to pass through the portals once their access control card IDs have been read and authenticated. The access permitting devices 16 are controlled by the controller 12.
The controller 12 includes components that facilitate processing of the data accumulated from the readers 14 and that facilitate appropriate control of the access permitting devices 16. These components, for example, may include a central processing unit 18 that is coupled to a memory 19, various input/output devices 20, an input interface 22, and an output interface 24. The memory 19 includes a RAM 26 and a ROM 28 and may be used, for example, to store the access control card IDs associated with the card holders who are authorized access to the various restricted spaces. The memory 19 may also be used to store the programming necessary for the proper functioning of the access control system 10, to store log data based on the access control card IDs received from the readers 14, to store the programming permitting the access control system 10 to detect unusual ingress and egress patterns of the card holders, etc.
The input/output devices 20 may include, for example, a keyboard, a mouse, a printer, a display, and/or various ports for the connection of other equipment useful to the access control system 10. The input interface 22 controllably passes inputs from the readers 14, and the output interface 24 controllably passes outputs to the access permitting devices 16.
As discussed above, the controller 12 processes data to detect unusual patterns in the data that might indicate fraudulent, improper, or other suspicious behavior of the card holders. In order to detect such unusual data patterns, the controller 12 maintains in the memory 19 an initial log file of the data accumulated from the readers 14 and executes a process, which is shown as a high level process in
The initial log file stores the access control information derived from information supplied by the readers 14. The initial log file, for example, contains access control card IDs for the card holders who access restricted areas protected by the access control system 10, the times at which the access control cards are read by the readers 14, and an identification of the corresponding restricted areas accessed by the corresponding card holders. Thus, each log entry may have the form shown in
To the extent that the access control system 10 requires the use of the access control cards to exit as well as to enter restricted areas, and to the extent the readers 14 can distinguish between entering and exiting a restricted area, the time information may be broken down into entrance times (time in) and exit times (time out).
Each row in
As shown in
Also, missing IN and OUT entries in the initial log file may be supplied under certain circumstances. Missing IN and OUT entries can be extrapolated from known data. For example, if the initial log file shows that a card holder has exited a restricted area at his or her usual time in the afternoon of a given day (an OUT entry), but the initial log file shows no IN entry for that card holder on that day, the controller 12 may supply the missing IN entry. Similarly, the controller 12 may supply a missing OUT entry.
A missing IN log entry for the present day can be inserted into the initial log file for an access control card ID if the last log entry of the previous day for that access control card ID is an OUT entry and if the first log entry of the present day for that access control card ID is an OUT entry. The time of the inserted IN log entry for that access control card ID is calculated based on the average of a predetermined number of previous IN times between 7:30 A.M. and 11:30 A.M.
Care must be taken that a missing entry is not itself an indication of an unusual pattern. For example, a number of missing entries suggests unusual behavior.
A missing OUT log entry for the last OUT of the present day can be inserted into the initial log file for an access control card ID if the first entry for the next day is not an OUT entry. The time of the inserted OUT log entry for that access control card ID is calculated based on the average of a predetermined number of previous OUT times between 5:30 P.M. and 9:30 P.M. and on restricted area information based on the IN log entries of the present day.
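By way of illustration only, the insertion of a missing IN entry may be sketched as follows. The function names, the tuple layout, and the sample history are hypothetical; the sketch assumes that the caller has already selected the predetermined number of previous IN times and simply averages those that fall within the stated window.

```python
from datetime import time

def needs_in_entry(prev_day_last_kind, today_first_kind):
    """An IN entry is missing if yesterday ended with OUT and today starts with OUT."""
    return prev_day_last_kind == "OUT" and today_first_kind == "OUT"

def average_time_in_window(times, start, end):
    """Average the times of day that fall inside [start, end]; None if none do."""
    minutes = [t.hour * 60 + t.minute for t in times if start <= t <= end]
    if not minutes:
        return None
    avg = round(sum(minutes) / len(minutes))
    return time(avg // 60, avg % 60)

# Hypothetical previous IN times for one access control card ID.
previous_in = [time(8, 50), time(9, 10), time(9, 0), time(13, 5)]

imputed_in = None
if needs_in_entry("OUT", "OUT"):
    # Only IN times between 7:30 A.M. and 11:30 A.M. contribute to the average.
    imputed_in = average_time_in_window(previous_in, time(7, 30), time(11, 30))
```

A missing OUT entry could be handled in the same manner by passing previous OUT times and the 5:30 P.M. to 9:30 P.M. window.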
An IN entry relates to an entrance into a restricted area and an OUT entry relates to an exit from a restricted area. An access control card ID is an identification that is stored on an access control card and that uniquely identifies the holder of the access control card.
Moreover, it may be desirable at 44 to extract only certain fields from the initial log file when creating the new log file discussed more fully below. These fields may pertain to restricted area information, date information, time information, and access control card IDs. The new log file may have a form similar to that shown in
Further, the average break times for each access control card ID can be calculated and can be used in the detection of unusual patterns. The typical morning coffee break time can be calculated using the first OUT log entry during the time interval of 10:30 A.M. to 11:45 A.M. The typical lunch break time can be calculated using the first OUT log entry during the time interval of 12:30 P.M. to 2:30 P.M. The typical evening tea break time can be calculated using the first OUT log entry during the time interval of 3:30 P.M. to 4:30 P.M. The number and time intervals used for the calculation of these breaks will depend on local customs.
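As a minimal sketch, the first OUT entry within a break interval may be located as shown below. The entry layout (a kind and a time of day) and the sample day are assumptions made for illustration; averaging the returned times over several days would then give the typical break time.

```python
from datetime import time

def first_out_in_window(entries, start, end):
    """Return the time of the first OUT entry inside [start, end], or None."""
    outs = sorted(t for kind, t in entries if kind == "OUT" and start <= t <= end)
    return outs[0] if outs else None

# Hypothetical log entries of one card holder for a single day.
day = [
    ("IN", time(9, 0)), ("OUT", time(10, 40)), ("IN", time(10, 55)),
    ("OUT", time(12, 45)), ("IN", time(13, 30)), ("OUT", time(18, 0)),
]

coffee = first_out_in_window(day, time(10, 30), time(11, 45))
lunch = first_out_in_window(day, time(12, 30), time(14, 30))
tea = first_out_in_window(day, time(15, 30), time(16, 30))
```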
The detection of unusual patterns may be accomplished, for example, in stages. Two stages are shown in connection with the present invention. However, more or fewer stages can be used.
The first stage at 46 involves finding unusual data patterns using a probabilistic, statistical, and historical approach. In this first stage, the probability of a person visiting a restricted area, such as a floor, is considered. Each visit by a card holder to a restricted area increases the probability that the card holder will in the future visit that restricted area and decreases the probability that the same card holder will visit some other restricted area.
According to an example of an algorithm that may be employed to calculate this probability, the log data for the first x number of days (where x may, for example, be 15) may be used to calculate an initial probability for each restricted area and for each card holder. This calculation of the initial probability is based on the assumption that a card holder who visits a restricted area in an x day pattern will likely visit the same restricted area on days x+1, x+2, etc. This calculation may be made dependent on the position of the card holder in an office.
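For illustration, one plausible form of this initial estimate is the share of a card holder's visits that went to each restricted area during the first x days. This ratio is an assumption adopted for the sketch, and the function name and sample data are hypothetical.

```python
from collections import Counter

def initial_probabilities(visits):
    """Map each restricted area to its share of one card holder's visits."""
    counts = Counter(visits)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {area: n / total for area, n in counts.items()}

# Hypothetical visits by one card holder over the first 15 days.
first_days = ["floor1"] * 12 + ["floor2"] * 2 + ["floor3"]
probs = initial_probabilities(first_days)
```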
Thus, a probability PROB that a card holder will visit a restricted area is determined. The probability PROB can be determined, for example, according to the following equation:
where the numbers of visits in the numerator and denominator of equation (1) are for a corresponding card holder. These initial probabilities may be stored in the initial log file. For example, these initial probabilities may be stored in the tabular form of
Thus, as shown in
The probabilities stored in the table of
As further shown in connection with
After calculation of the initial probabilities, data for the next y number of days (where y, for example, may be one month) is used to adjust PROB based on subsequent visits to the restricted area. The increase and decrease in PROB may be calculated, for example, as follows:
where χ=Total Number of restricted areas−1.
Statistics, such as the mean, the standard deviation, and the variance, are then computed for the data in the initial log file maintained by the controller 12.
Unusual data in the initial log file is then determined from these statistics based on a probabilistic approach, and a new log file is created such that the new log file contains only entries in the initial log file which fall below the probability threshold. This threshold is the square of the variance σ. The variance σ may be calculated in accordance with the following equation:
where X is the score, μ is the mean or average of the scores, and N is the number of scores.
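A minimal sketch of this thresholding is given below. It assumes that the scores are a card holder's per-area probabilities, that the statistic is the usual population form (the sum of squared deviations from μ divided by N), and that this value is used directly as the threshold. These are assumptions for illustration, and the sample probabilities are hypothetical.

```python
from statistics import pvariance

def entries_below_threshold(probabilities, threshold):
    """Return the areas whose visit probability falls below the threshold."""
    return [area for area, p in probabilities.items() if p < threshold]

# Hypothetical per-floor probabilities for one card holder.
floor_probs = {"floor1": 0.8, "floor2": 0.1, "floor3": 0.1}

# Population variance: sum((X - mu) ** 2) / N over the scores.
threshold = pvariance(floor_probs.values())
unusual_floors = entries_below_threshold(floor_probs, threshold)
```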
For example, after the initial 15 days, a particular card holder may have a history of visits to different floors as shown in the table below.
After the initial days (e.g., 15 days), the probability is updated for each card holder's data, and the variance is re-calculated using the floor probability data.
Further, the duration of a stay in a restricted area for an unusual log entry is also calculated and entered in the new log file. The duration of a stay in the restricted area is calculated only if the next entry in the initial log file for that restricted area and for that access control card ID is an OUT entry; otherwise, this entry in the new log file is marked as NA (not available). If there are no entries in the initial log file for a particular access control card ID, then an entry for that access control card ID is marked in the new log file as ABSENT.
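The duration rule may be sketched as follows. The tuple layout (card ID, restricted area, entry kind, time stamp), the function names, and the sample log are assumptions made for illustration only.

```python
from datetime import datetime, timedelta

def stay_duration(entries, index):
    """Duration of the stay that starts at entries[index].

    Returns "NA" unless the next entry for the same card ID and the same
    restricted area is an OUT entry.
    """
    card, area, _kind, ts = entries[index]
    for c, a, k, t in entries[index + 1:]:
        if c == card and a == area:
            return (t - ts) if k == "OUT" else "NA"
    return "NA"

def absent_marker(card_id, entries):
    """Return "ABSENT" if the log holds no entry for this card ID, else None."""
    return None if any(e[0] == card_id for e in entries) else "ABSENT"

# Hypothetical initial log, ordered by time.
log = [
    ("C1", "floor3", "IN", datetime(2024, 1, 8, 9, 0)),
    ("C2", "floor3", "IN", datetime(2024, 1, 8, 9, 5)),
    ("C1", "floor3", "OUT", datetime(2024, 1, 8, 11, 0)),
    ("C2", "floor3", "IN", datetime(2024, 1, 8, 12, 0)),
]
```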
All the entries in the initial log file will be analyzed. If the entry probability of a corresponding entry in the initial log file falls below the calculated threshold, that entry will be separated, will be moved to the new log file, and will be used for the next stage.
Accordingly, the detection of unusual access patterns at 46 can be carried out based on a plurality of criteria. Such criteria can include, for example, (i) visits to less probable areas, floors, or buildings, (ii) visits to less probable areas, floors, or buildings as compared to previous visits, (iii) unusual durations of visits to areas, floors, or buildings, (iv) deviations in arrival times as compared to a mean arrival time, (v) deviations in departure times as compared to a mean departure time, (vi) deviations in break times, etc.
The second stage at 48 of
This association information is useful in identifying unusual patterns because, most of the time, co-workers move together when they work on a common team or on a common project. Thus, the data stored in the new log file might not indicate suspicious behavior if a card holder corresponding to an entry in the new log file had been moving with associates in and out of the same restricted areas during the same time periods. Even when the card holder associated with an entry in the new log file does not have IN and OUT entries that precisely match the IN and OUT entries of most or all of the group with which that card holder has moved, such entries might not involve an unusual pattern. In certain cases, the group movement can also be categorized as an unusual pattern.
If the new log file contains IN and OUT times of a card holder of interest that match the IN and OUT times for other card holders involving the same restricted areas, it can be assumed, at least with respect to time periods involving the matching IN and OUT times, that the card holder of interest and the other card holders are moving as a group (i.e., the card holder of interest is associated with a group of other card holders during these time periods).
At 50, the unusual patterns found at 46 and the group associations found at 48 are analyzed in order to detect anomalies in the behavior of card holders. Each of the unusual entries separated out in the first stage at 46 is analyzed for unusual patterns. The time stamp of each entry for each access control card ID is considered within a predetermined tolerance (such as ±15 minutes). Associations between card holders detected at 48 are searched within that time frame. Accordingly, if a card holder has moved with associates within this time frame, the unusual pattern detected at 46 will not be treated as an unusual pattern for the purpose of determining suspicious behavior. Hence, such unusual patterns detected at 46 are moved out of the new (unusual) data file. Otherwise, the unusual patterns detected at 46 will be considered suspicious.
The data in the extracted fields is pre-processed at 64 in order to remove inconsistencies from the data. As described above by way of example, duplicate data can be eliminated, successive IN and OUT entries for the same access control card ID at the same instant in time can be eliminated, and missing IN and/or OUT entries in the log file may be supplied under certain circumstances such as those described above.
The pre-processed data is then analyzed at 66, 68, and 70 for unusual patterns. The analysis performed at 66, 68, and 70 is a probabilistic, statistical, and historical analysis. Accordingly, at 66, the probability that a card holder is visiting a corresponding restricted area, such as a floor, is calculated. As explained above, each visit by a card holder to a restricted area increases the probability that the card holder will in the future visit that restricted area and decreases the probability that the same card holder will visit other restricted areas.
At 68 of
At 70, all entries in the initial log file are analyzed to detect unusual data patterns. As indicated above, the data in the log file is examined to determine whether any visits by card holders are to less probable restricted areas. Thus, if an entry of the initial log file relates to a visit by a particular card holder to a particular restricted area and if the probability of that visit as calculated above and discussed in connection with
The data in the initial log file is also examined to determine (i) whether any visits by any card holders to any restricted areas were for unusual durations as compared to past visits by the card holders to those restricted areas, (ii) whether the arrival time of a card holder deviates from the mean arrival time for that card holder, (iii) whether the departure time of a card holder deviates from the mean departure time for that card holder, (iv) whether the break time for a card holder deviates from the mean break time for that card holder, etc. Any entries in the initial log file corresponding to any such deviations are also added to the new log file.
The new log file is then passed to the second stage processing shown in
Accordingly, at 72, the time stamp and access control card ID for a first entry in the new log file are considered in detecting whether a group association exists. At 74 and 76, a group association is determined if the card holder of interest corresponding to this first entry has moved through the same restricted areas during the same times (within ±15 minutes) as other card holders. As indicated at 78, if the card holder of interest associated with the first entry in the new log file is in such a group association, then the first entry in the new log file will not be treated as an unusual pattern and will be removed from the new log file; otherwise, the first entry remains in the new log file and will be treated as an unusual pattern. At 80, the process returns to 72 to begin processing of the next entry in the new log file, and 72-80 are repeated until all entries in the new log file are so processed.
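By way of a hedged sketch, the loop over the entries of the new log file may look as follows. The entry layout (card ID, restricted area, time stamp), the `associates` mapping from a card ID to its known associates, and the function names are hypothetical; only the ±15 minute tolerance is taken from the description above.

```python
from datetime import datetime, timedelta

TOLERANCE = timedelta(minutes=15)

def moved_with_associates(entry, full_log, associates, tolerance=TOLERANCE):
    """True if a known associate used the same area within the tolerance."""
    card, area, ts = entry
    known = associates.get(card, set())
    return any(
        other in known and a == area and abs(t - ts) <= tolerance
        for other, a, t in full_log
    )

def filter_unusual(new_log, full_log, associates):
    """Keep only the unusual entries that are not explained by group movement."""
    return [e for e in new_log if not moved_with_associates(e, full_log, associates)]

# Hypothetical data: C1 and C2 are associates, C3 moved alone.
full_log = [
    ("C1", "floor7", datetime(2024, 1, 8, 14, 0)),
    ("C2", "floor7", datetime(2024, 1, 8, 14, 5)),
    ("C3", "floor9", datetime(2024, 1, 8, 16, 0)),
]
associates = {"C1": {"C2"}, "C3": {"C4"}}
new_log = [full_log[0], full_log[2]]
suspicious = filter_unusual(new_log, full_log, associates)
```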
The entries in the new log file that remain after the processing of 72-80 correspond to suspicious behavior that can be further investigated to determine if the suspicious behavior amounts to fraudulent or improper behavior.
Accordingly, at 90, the data from the pre-processed log file is input and, at 92, similar movement patterns are detected from this log data. Thus, if two or more card holders repeatedly enter and exit the same restricted areas at roughly the same times, it may be inferred that such card holders are engaged in a group work activity. At 94, these associations are grown based on the common working patterns of future days. That is, each association (i.e., each group) may be weighted. For example, when an association is first formed, it may be given some small weight. As the same association is seen in subsequent days, the weight given to that association may be incremented (grown).
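The growth of association weights over successive days may be illustrated as follows. This is a sketch under stated assumptions: associations are limited to pairs of card holders, time stamps are expressed as minutes since midnight, and the 15 minute tolerance and 0.01 increment are hypothetical default values.

```python
from collections import defaultdict
from itertools import combinations

def daily_pairs(day_entries, tolerance_min=15):
    """Pairs of card IDs seen at the same area within the tolerance on one day."""
    pairs = set()
    for (c1, a1, m1), (c2, a2, m2) in combinations(day_entries, 2):
        if c1 != c2 and a1 == a2 and abs(m1 - m2) <= tolerance_min:
            pairs.add(frozenset((c1, c2)))
    return pairs

def grow_weights(days, step=0.01):
    """Add `step` to a pair's weight for every day on which the pair is seen."""
    weights = defaultdict(float)
    for day in days:
        for pair in daily_pairs(day):
            weights[pair] += step
    return dict(weights)

# Hypothetical entries: (card ID, restricted area, minutes since midnight).
day1 = [("C1", "floor7", 540), ("C2", "floor7", 545), ("C3", "floor7", 700)]
day2 = [("C1", "floor7", 542), ("C2", "floor7", 550), ("C3", "floor2", 548)]
weights = grow_weights([day1, day2])
```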
At 96, the final groupings are identified with their strengths after the passage of a sufficient amount of time. For example, the final groupings and their corresponding weights may be identified and calculated at the end of each day.
Strengths show how strong the groups are. If a group has a very low weight, then this group has less significance and is not a strong association. However, a group having a high weight value has more significance and is a stronger association. An unusual pattern associated with a strong group might be ignored because this pattern might be a usual pattern of the movement. However, an unusual pattern associated with a weak group might not be ignored.
At 110, the strength (weight) of the smallest group found at 106 is increased if this smallest group is found to be part of bigger groups also found at 106, the strength of the next larger group found at 106 is increased if this next larger group is found to be part of a still bigger group, and so on until all groups are processed. Accordingly, only groups larger than the group currently being processed are examined in order to determine whether to increase the strength of the group currently being processed. The size of the group is determined by the number of members in the group. Each time the strength of a group is increased (incremented), it is increased by a predetermined amount, such as 0.01.
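A minimal sketch of this strengthening step is shown below. It assumes that groups are represented as sets of card IDs and that a group is incremented once per pass if it is contained in at least one larger group; the function name and sample groups are hypothetical.

```python
def boost_subgroups(groups, strengths, step=0.01):
    """Increase the strength of each group that is part of a larger group."""
    ordered = sorted(groups, key=len)  # process from smallest to largest
    updated = dict(strengths)
    for i, group in enumerate(ordered):
        # Only groups with more members are examined; `<` tests proper subset.
        larger = (g for g in ordered[i + 1:] if len(g) > len(group))
        if any(group < g for g in larger):
            updated[group] = updated.get(group, 0.0) + step
    return updated

pair = frozenset({"C1", "C2"})
trio = frozenset({"C1", "C2", "C3"})
other = frozenset({"C4", "C5"})
strengths = boost_subgroups([trio, pair, other], {pair: 0.05, trio: 0.02, other: 0.01})
```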
At 112, the grouping data and corresponding strengths are updated to the main grouping table. At 114, a determination is made as to whether all days covered by the log entries have been processed at 102-114. If not, the first and next days are incremented by one, and flow returns to 102. If all days covered by the log entries have been processed at 102-114, the group associations and their corresponding strengths resulting from the processing at 102-114 are made available to 78 of
Alternatively, group associations may be obtained using association rules in the process described below. An association rule has two parts, a left hand side and a right hand side. The left hand side and the right hand side are sets of one or more card holders. Each association rule gives the confidence of finding right hand side card holders given left hand side card holders.
Association rules are used to discover patterns and correlations that may be buried deep inside a database. The entire process comprises preprocessing, preparation of transactions, finding frequent sets, and finding association rules.
The preprocessing involves the separation of entry and exit data by restricted area, such as by floor, and, if by floor, the separation of the data with respect to each entry and exit point in a floor, and the removal of multiple entries that are closely spaced together in time.
The preparation of transactions (group associations) involves generating transactions or groups using a difference time threshold between a current entry and a previous entry, thus transactionalizing the data in the log file. The procedure for preparing transactions is given as follows:
The frequent sets are found using the FP-Growth algorithm, where FP stands for Frequent Pattern. The objective here is to generate all combinations of items such that Support(item set)>min_sup.
The FP-Growth algorithm is a known algorithm and generally comprises the following steps:
1. Scan the transactions database once, and find frequent 1-itemset (single item pattern);
2. Order the frequent items or transaction in frequency descending order;
3. Scan the transactions database again, construct FP-tree;
4. Mine for frequent patterns according to the order of items in FP-tree;
5. Generate candidate frequent patterns using set intersection operations;
6. Based on the candidate-frequent patterns set, construct conditional pattern bases for each node in the FP-tree;
7. Recursively mine the conditional FP-trees and grow frequent patterns obtained so far;
8. If the conditional FP-tree contains a single path, simply enumerate all the patterns.
Each card holder moves at least two times a day (arriving at and leaving a restricted area) so that 2 can be used as the minimum support for a one day database per point.
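The notion of a frequent set may be illustrated with the brute-force enumeration below. It is not FP-Growth: it simply counts every subset of every transaction, which is practical only for small transactions, but it yields the same kind of result that FP-Growth produces more efficiently. The transactions are hypothetical, and the comparison uses "at least min_sup" as an assumption so that 2 acts as a minimum support.

```python
from collections import Counter
from itertools import combinations

def frequent_sets(transactions, min_sup):
    """Return each item set whose support (transaction count) is >= min_sup."""
    counts = Counter()
    for transaction in transactions:
        items = sorted(transaction)
        for size in range(1, len(items) + 1):
            for combo in combinations(items, size):
                counts[frozenset(combo)] += 1
    return {s: n for s, n in counts.items() if n >= min_sup}

# Hypothetical transactions: card holders who passed one point close in time.
transactions = [{"A", "B", "C"}, {"A", "B"}, {"A", "C"}, {"B"}]
frequent = frequent_sets(transactions, min_sup=2)
```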
In finding the association rules, the frequent sets are used to generate the desired rules. A priori algorithms are used for generating the association rules. For example, if ABCD and AB are frequent sets, then one association rule can be generated by posing the rule that AB⇒CD. In order to test this rule, the following ratio is computed:
conf=support(ABCD)/support(AB).
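As a brief worked example, the ratio may be computed as follows. The support counts are hypothetical, and the `support` mapping from item sets to counts is an assumed representation.

```python
def confidence(support, lhs, rhs):
    """Confidence of the rule lhs => rhs: support(lhs and rhs) / support(lhs)."""
    return support[lhs | rhs] / support[lhs]

# Hypothetical supports: AB was seen 4 times, ABCD 3 times.
support = {frozenset("AB"): 4, frozenset("ABCD"): 3}
conf = confidence(support, frozenset("AB"), frozenset("CD"))
```

With these counts, card holders C and D were present in three of the four transactions that contained A and B, giving a confidence of 0.75 for the rule.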
Certain modifications of the present invention have been discussed above. Other modifications of the present invention will occur to those practicing in the art of the present invention. For example, the description above implies that the controller 12 controls access to an entire building. Instead, a building may be divided into zones with each zone having its own controller 12. Alternatively, there may be a master controller for the entire building and a separate zone controller for each of one or more zones of the building. As another alternative, the controller 12 may be arranged to control access to a group of buildings. Still other alternatives are possible.
Also, as described above, unusual data in an initial log file is moved during a first stage from the initial log file to the new log file, and the data in the new log file is processed in the second stage so as to remove any entries corresponding to group associations. Alternatively, instead of separating the unusual data from the initial log file and moving the unusual data to the new log file, the data in the initial log file can simply be given a tag identifying it as unusual data. If so, the tagged data can be considered to be a new log file even though that data is still stored in the initial log file.
Accordingly, the description of the present invention is to be construed as illustrative only and is for the purpose of teaching those skilled in the art the best mode of carrying out the invention. The details may be varied substantially without departing from the spirit of the invention, and the exclusive use of all modifications which are within the scope of the appended claims is reserved.