An organization may generate various types of reports related to operations of the organization. For example, the organization may generate expense reports, time reports, revenue reports, and/or the like. A report may be associated with an individual (e.g., that submitted the report, that is associated with the content of the report, and/or the like), a location (e.g., of a subject matter of the report, an individual associated with the report, and/or the like), an amount of value (e.g., of an expense for an expense report, of time for a time report, and/or the like), and/or the like.
According to some possible implementations, a method may comprise receiving, by a device, data that is related to: historical reports associated with an organization, historical audits of the historical reports, and individuals associated with the historical reports; determining, by the device, a multi-entity profile for the data after receiving the data, wherein the multi-entity profile includes a set of groupings of the data by a set of attributes included in the data; determining, by the device and using the multi-entity profile, a set of supervised model features for the historical reports based on the historical audits, wherein the set of supervised model features is associated with training a model to process a report in a context of the historical audits; determining, by the device and using the multi-entity profile, a set of unsupervised model features for the historical reports independent of the historical audits, wherein the set of unsupervised model features is associated with training the model to process the report independent of the context of the historical audits; determining, by the device and utilizing the model, a score for the report after the model is trained using the set of supervised model features and the set of unsupervised model features, wherein the score indicates a likelihood of an issue related to the report; and performing, by the device, one or more actions based on the score.
According to some possible implementations, a device may comprise one or more memories; and one or more processors, communicatively coupled to the one or more memories, to: receive data that is related to training a model to identify an issue included in a report; determine a multi-entity profile for the data after receiving the data, wherein the multi-entity profile includes a set of groupings of the data by a set of attributes included in the data; determine, using the multi-entity profile, a set of supervised model features for historical reports based on historical audits, wherein the set of supervised model features is associated with training the model to process the report in a context of the historical audits; determine, using the multi-entity profile, a set of unsupervised model features for the historical reports independent of the historical audits, wherein the set of unsupervised model features is associated with training the model to process the report independent of the context of the historical audits; process, utilizing the model, the report to identify a score indicating whether the issue is included in the report after the model is trained using the set of supervised model features and the set of unsupervised model features; and flag the report as including the issue or not including the issue based on the score.
According to some possible implementations, a non-transitory computer-readable medium storing instructions, the instructions comprising: one or more instructions that, when executed by one or more processors, cause the one or more processors to: receive data that is to be used to train a model to identify an issue included in a report, wherein the data is related to: historical reports associated with an organization, historical audits of the historical reports, and individuals associated with the historical reports; determine a multi-entity profile for the data after receiving the data, wherein the multi-entity profile includes a set of groupings of the data by a set of attributes included in the data; determine, using the multi-entity profile, a set of supervised model features for the historical reports based on the historical audits, wherein the set of supervised model features is associated with training the model to process the report in a context of the historical audits; determine, using the multi-entity profile, a set of unsupervised model features for the historical reports independent of the historical audits, wherein the set of unsupervised model features is associated with training the model to process the report independent of the context of the historical audits; train the model based on the set of supervised model features and the set of unsupervised model features after determining the set of supervised model features and the set of unsupervised model features; determine, utilizing the model, a score for the report after training the model, wherein the score indicates a likelihood of the issue related to the report; and perform one or more actions based on the score.
The following detailed description of example implementations refers to the accompanying drawings. The same reference numbers in different drawings may identify the same or similar elements.
An organization may generate various types of reports related to operations of the organization. The organization may want to audit reports to determine whether the reports generated by the organization include an issue (e.g., are fraudulent, are inaccurate, fail to conform to formatting rules, and/or the like). One technique for auditing the reports may include identifying a sample of the reports (e.g., a random sample, a sample based on a schedule, and/or the like) and auditing the sample of the reports. While this technique may identify some issues included in the reports, this technique may have a low accuracy for identifying issues in the reports and/or may be time consuming. In addition, as a quantity of reports generated by the organization increases, this technique may have difficulty scaling with the increase in the quantity. This can result in a significant majority (e.g., 90 percent or more) of the reports, generated by the organization during a time period, never being audited. This significantly reduces the organization's capability to identify and/or fix reports that include an issue, thereby consuming significant resources of the organization (e.g., monetary resources that are consumed based on issue-containing reports, time resources that are consumed using issue-containing reports, computing resources that are consumed processing issue-containing reports, and/or the like).
Some implementations described herein provide a report analysis platform that is capable of processing reports (e.g., thousands, millions, or more reports) associated with an organization utilizing a machine learning model and detecting issues in the reports. In this way, the report analysis platform can process a significant majority (e.g., 90 percent or more), or all, of the reports generated by the organization in a quick and efficient manner. This improves an accuracy of processing reports to identify an issue relative to other techniques. In addition, this increases a throughput of an organization's capability to process reports associated with the organization, thereby reducing or eliminating a risk of missed reports that include an issue. Further, this conserves resources of the organization (e.g., monetary resources, time resources, computing resources, and/or the like) that would otherwise be consumed as a result of using other techniques for processing reports.
Further, in this way, several different stages of the process for detecting an issue related to a report are automated, which may remove human subjectivity and waste from the process, and which may improve speed and efficiency of the process and conserve computing resources (e.g., processor resources, memory resources, and/or the like). Furthermore, implementations described herein use a rigorous, computerized process to perform tasks or roles that were not previously performed or were previously performed using subjective human intuition or input. Further, automating the process for detecting an issue related to a report conserves computing resources (e.g., processor resources, memory resources, and/or the like) of a device that would otherwise be wasted in attempting to use another technique to process a report generated by the organization and/or in using, processing, and/or the like, issue-containing reports.
As shown by reference number 105, the report analysis platform 230 may receive data related to processing expense reports associated with an organization. For example, the data may be related to historical audits of expense reports (e.g., data that identifies audit outcomes of historical expense reports), historical expense reports, employees (or other individuals) associated with the organization (e.g., data that identifies a job title, a location, a tenure, and/or the like), exchange rates between various currencies, and/or the like. In some implementations, the report analysis platform 230 may receive the data from the server device 220, the client device 210, and/or the user device. In some implementations, the report analysis platform 230 may receive the data based on requesting the data, according to a schedule, periodically, and/or the like.
In some implementations, the report analysis platform 230 may receive the data in various forms. For example, the report analysis platform 230 may receive the data in the form of an image (e.g., an image of a receipt associated with an expense report, a historical audit in the form of an image, and/or the like), as text (e.g., text of a historical expense report input to an expense reporting system, text of a historical audit report, and/or the like), as application data from an application hosted on, executed on, and/or the like the server device 220, the client device 210, and/or the user device, as input to the report analysis platform 230 (e.g., via a user interface associated with the report analysis platform 230), as transactional data from the server device 220, the client device 210, the user device, and/or the like generated in association with completing a transaction associated with a historical expense report, and/or the like. In some implementations, when receiving the data, the report analysis platform 230 may receive thousands, millions, or more data elements for thousands, millions, or more historical audits, historical expense reports, employees (or other individuals), and/or the like. In this way, the report analysis platform 230 may receive a data set that cannot be processed manually or objectively by a human actor.
Turning to
In some implementations, the report analysis platform may organize the data for the multi-entity profile based on unique identifiers included in the data (e.g., unique identifiers that uniquely identify an individual associated with the data, a location associated with the data, a vendor associated with the data, and/or the like). In some implementations, the unique identifiers may be included in the data as an attribute of the data (e.g., as a field with a unique value, such as a name, an identification number, and/or the like), and the report analysis platform may organize the data based on the unique identifiers included as the attribute in the data.
Additionally, or alternatively, the report analysis platform may process the data to identify the unique identifiers. For example, the report analysis platform may process images using an image processing technique, such as a computer vision technique, a feature detection technique, an optical character recognition (OCR) technique, and/or the like to identify an alphanumeric string, a symbol, a code (e.g., a barcode, a matrix barcode, and/or the like) in the image (e.g., that identify the presence of a unique identifier, that are a unique identifier, and/or the like). Continuing with the previous example, the report analysis platform may compare the alphanumeric string, the symbol, the code, and/or the like to information stored in a data structure and/or in memory resources of the report analysis platform to determine which unique identifiers are included in the image.
Additionally, or alternatively, and as another example, the report analysis platform may process the data using a text processing technique, such as a natural language processing technique, a text analysis technique, and/or the like. Continuing with the previous example, the report analysis platform may process the text to identify an alphanumeric string, a symbol, a code, and/or the like included in the data (e.g., that indicate a presence of a unique identifier, that are a unique identifier, and/or the like), and may identify the unique identifiers included in the text in a manner similar to that described above.
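By way of a non-limiting illustration, the following Python sketch shows one way candidate unique identifiers might be pulled from OCR output or plain text and checked against identifiers the platform already stores; the identifier format, the lookup table, and the sample text are hypothetical placeholders rather than the platform's actual implementation.

```python
import re

# Hypothetical identifier format and lookup table; a real deployment would use
# the organization's own identifier scheme and stored records.
EMPLOYEE_ID_PATTERN = re.compile(r"\bEMP-\d{6}\b")
KNOWN_EMPLOYEE_IDS = {"EMP-000123", "EMP-004567"}

def extract_employee_ids(text: str) -> set:
    """Find candidate unique identifiers in OCR'd or plain text and keep only
    those that match identifiers already known to the platform."""
    candidates = set(EMPLOYEE_ID_PATTERN.findall(text))
    return candidates & KNOWN_EMPLOYEE_IDS

# Text such as might be produced by an OCR pass over a receipt image.
receipt_text = "Submitted by EMP-000123 for vendor invoice 99-1002"
print(extract_employee_ids(receipt_text))  # {'EMP-000123'}
```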
Additionally, or alternatively, and as another example, the report analysis platform may process the data using a model (e.g., a machine learning model, an artificial intelligence model, and/or the like) to identify a unique identifier included in the data. For example, the report analysis platform may use the model to process an image and/or text to identify an alphanumeric string, a symbol, a code, and/or the like included in the data, to identify an area of the data (e.g., an area of an image and/or text) that likely includes a unique identifier, and/or the like (e.g., based on having been trained to identify unique identifiers in the data, a likely area in the data that may include a unique identifier, and/or the like). In some implementations, the model and/or training of the model may be similar to that described elsewhere herein.
Reference number 115 shows example multi-entity profiles that the report analysis platform may generate. As shown, a multi-entity profile may organize the data that the report analysis platform received by employee, by vendor, and/or the like. In this way, a multi-entity profile facilitates quick and easy access to data in an organized manner. This conserves processing resources of the report analysis platform relative to not using a multi-entity profile, and facilitates training of a model to identify issues in a report based on attributes included in the data (e.g., the report analysis platform may train the model on a particular employee or employees generally, on a particular vendor or vendors generally, and/or the like), thereby improving an accuracy of the model with regard to identifying issues in reports.
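As a non-limiting sketch of the grouping step, the following Python example builds a small multi-entity profile keyed by employee and by vendor; the column names, rows, and values are assumptions for illustration, not the organization's actual schema.

```python
import pandas as pd

# Placeholder expense-report rows; the column names are assumptions.
reports = pd.DataFrame([
    {"employee_id": "E1", "vendor": "Acme Travel", "amount": 120.0, "location": "NYC"},
    {"employee_id": "E1", "vendor": "Hotel Co", "amount": 340.0, "location": "NYC"},
    {"employee_id": "E2", "vendor": "Acme Travel", "amount": 95.0, "location": "SFO"},
])

# One grouping per attribute of interest; together the groupings form a simple
# multi-entity profile that can be queried by employee or by vendor.
multi_entity_profile = {
    "by_employee": {emp: grp for emp, grp in reports.groupby("employee_id")},
    "by_vendor": {vnd: grp for vnd, grp in reports.groupby("vendor")},
}

print(multi_entity_profile["by_employee"]["E1"])
```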
Turning to
As shown by reference number 125, the report analysis platform may input data related to historical audits and the historical expense reports into a machine learning model to determine the set of supervised model features. For example, the report analysis platform may input the data related to the historical audits and to the historical expense reports, and the machine learning model may output the set of supervised model features (e.g., based on the training of the machine learning model).
In some implementations, when processing the data related to the historical audits and the historical expense reports, the machine learning model may group the historical expense reports by outcome of the historical audits. For example, the machine learning model may group, utilizing the data related to the historical audits, the data related to the historical expense reports into groups, such as historical expense reports that failed a historical audit, that passed a historical audit, that failed or passed an initial historical audit but the outcome was subsequently reversed, that were not audited, and/or the like.
In some implementations, the report analysis platform may use the multi-entity profile for the data as the input to the machine learning model. This facilitates identification of the set of supervised model features for different attributes included in the data, which can make the supervised model features more dynamic, can improve an accuracy of the set of supervised model features, and/or the like.
In some implementations, prior to inputting the data related to the historical audits and/or the historical expense reports, the report analysis platform may prepare and/or pre-process the data. For example, the report analysis platform may identify keywords included in the data, such as unique identifiers that are common to both the data related to the historical audits and to the data related to the historical expense reports, terms that identify historical audits that resulted in a pass, terms that identify historical audits that resulted in a fail, amounts associated with a historical expense report, locations of the historical expense reports, and/or the like. Additionally, or alternatively, the report analysis platform may remove leading and/or trailing spaces from text included in the data related to the historical audits and the historical expense reports, may remove non-American Standard Code for Information Interchange (ASCII) characters, and/or the like. This facilitates quick and/or easy processing of the data related to the historical audits and the historical expense reports by making the data more uniform, thereby facilitating fast determination of the supervised model features, more accurate determination of the supervised model features, and/or the like.
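A minimal sketch of such pre-processing, assuming keyword lists and field contents that are purely illustrative, might look like the following Python:

```python
def clean_field(value: str) -> str:
    """Normalize a text field from a historical audit or expense report by
    stripping leading/trailing spaces and dropping non-ASCII characters."""
    return value.strip().encode("ascii", errors="ignore").decode("ascii")

# Assumed keyword lists; the terms an organization's audits actually use would differ.
FAIL_TERMS = {"rejected", "exception", "non-compliant"}
PASS_TERMS = {"approved", "no exception", "compliant"}

def audit_outcome(audit_text: str) -> str:
    """Rough keyword-based labeling of a historical audit record."""
    text = clean_field(audit_text).lower()
    if any(term in text for term in FAIL_TERMS):
        return "fail"
    if any(term in text for term in PASS_TERMS):
        return "pass"
    return "unknown"

print(audit_outcome("  Rejected: duplicate receipt\u00a0"))  # fail
```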
In some implementations, the report analysis platform may generate the machine learning model. For example, the report analysis platform may have trained the machine learning model to identify the set of supervised model features from the data related to the historical audits and the historical expense reports.
In some implementations, the report analysis platform may have trained the machine learning model on a training set of data. For example, the training set of data may include data related to historical audits and historical expense reports and data that identifies supervised model features from the data related to the historical audits and the historical expense reports. Additionally, or alternatively, when the report analysis platform inputs the data related to the historical audits and the historical expense reports into the machine learning model, the report analysis platform may input a first portion of the data as a training set of data, a second portion of the data as a validation set of data, and a third portion of the data as a test set of data (e.g., to be used to determine the set of supervised model features). In some implementations, the report analysis platform may perform multiple iterations of training of the machine learning model, depending on an outcome of testing of the machine learning model (e.g., by submitting different portions of the data as the training set of data, the validation set of data, and the test set of data).
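One way to carve the data into training, validation, and test portions is sketched below with scikit-learn on synthetic stand-in data; the proportions, random seeds, and labels are illustrative assumptions only.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

# Synthetic stand-in for a feature matrix derived from historical expense
# reports and binary audit-outcome labels (1 = failed audit, 0 = passed).
X, y = make_classification(n_samples=1000, n_features=12, random_state=0)

# 60% training, then the remaining 40% split evenly into validation and test.
X_train, X_hold, y_train, y_hold = train_test_split(X, y, test_size=0.4, random_state=0)
X_val, X_test, y_val, y_test = train_test_split(X_hold, y_hold, test_size=0.5, random_state=0)

# Repeating the split with different random seeds supports multiple training
# iterations, as described above.
```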
In some implementations, when generating the machine learning model, the report analysis platform may utilize a random forest classifier technique to generate the machine learning model. For example, the report analysis platform may utilize a random forest classifier technique to construct multiple decision trees during training and may output a classification of data. Additionally, or alternatively, when generating the machine learning model, the report analysis platform may utilize a gradient boost tree classifier technique to generate the machine learning model. For example, the report analysis platform may utilize a gradient boost tree classifier technique to generate a prediction model from a set of weak prediction models (e.g., by generating the machine learning model in a stage-wise manner, by optimizing an arbitrary differentiable loss function, and/or the like).
In some implementations, when generating the machine learning model, the report analysis platform may utilize logistic regression to generate the machine learning model. For example, the report analysis platform may utilize a binary classification of the data related to the historical audits and the historical expense reports (e.g., a pass classification or a fail classification) to train the machine learning model to identify the set of supervised model features based on the classification of the data. Additionally, or alternatively, when generating the machine learning model, the report analysis platform may utilize a Naive Bayes classifier and/or a decision tree technique to train the machine learning model. For example, when using a decision tree technique, the report analysis platform may utilize binary recursive partitioning to divide the data related to the historical audits and the historical expense reports into various binary categories (e.g., starting with a pass or fail binary category for a historical audit). Based on using recursive partitioning, the report analysis platform may reduce utilization of computing resources relative to manual, linear sorting and analysis of data points, thereby enabling use of thousands, millions, or billions of data points to train a machine learning model, which may result in a more accurate machine learning model than using fewer data points.
Additionally, or alternatively, when generating the machine learning model, the report analysis platform may utilize a support vector machine (SVM) classifier. For example, the report analysis platform may utilize a linear model to implement non-linear class boundaries, such as via a max margin hyperplane. Additionally, or alternatively, when utilizing the SVM classifier, the report analysis platform may utilize a binary classifier to perform a multi-class classification. Use of an SVM classifier may reduce or eliminate overfitting, may increase a robustness of the machine learning model to noise, and/or the like.
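The classifier techniques named above could be prototyped and compared as in the following scikit-learn sketch; the synthetic data and hyperparameters are placeholders rather than the platform's actual configuration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC

# Synthetic stand-in for historical-report features and pass/fail audit labels.
X, y = make_classification(n_samples=1000, n_features=12, random_state=0)

candidates = {
    "random_forest": RandomForestClassifier(n_estimators=200, random_state=0),
    "gradient_boosted_trees": GradientBoostingClassifier(random_state=0),
    "logistic_regression": LogisticRegression(max_iter=1000),
    "naive_bayes": GaussianNB(),
    "svm_rbf": SVC(kernel="rbf"),  # non-linear class boundaries via the kernel trick
}

for name, model in candidates.items():
    scores = cross_val_score(model, X, y, cv=5)
    print(f"{name}: mean accuracy {scores.mean():.3f}")
```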
In some implementations, the report analysis platform may train the machine learning model to determine the set of supervised model features using a supervised training procedure that includes receiving input to the machine learning model from a subject matter expert. In some implementations, the report analysis platform may use one or more other model training techniques, such as a neural network technique, a latent semantic indexing technique, and/or the like. For example, the report analysis platform may perform an artificial neural network processing technique (e.g., using a two-layer feedforward neural network architecture, a three-layer feedforward neural network architecture, and/or the like) to perform pattern recognition with regard to patterns of supervised model features, patterns of supervised model features based on an outcome of a historical audit, and/or the like. In this case, using the artificial neural network processing technique may improve an accuracy of a model generated by the report analysis platform by being more robust to noisy, imprecise, or incomplete data, and by enabling the report analysis platform to detect patterns and/or trends undetectable to human analysts or systems using less complex techniques.
As an example, the report analysis platform may use a supervised multi-label classification technique to train the machine learning model. For example, as a first step, the report analysis platform may map data associated with the historical expense reports to a set of previously generated supervised model features after labeling the historical expense reports. In this case, the historical expense reports may be characterized as having passed a historical audit, having failed a historical audit, as including an issue, as not including an issue, and/or the like (e.g., by a technician, thereby reducing processing relative to the report analysis platform being required to analyze each historical expense report and/or historical audit). As a second step, the report analysis platform may determine classifier chains, whereby labels of target variables may be correlated (e.g., in this example, labels may be a result of a historical audit and correlation may refer to supervised model features common to the different labels, and/or the like). In this case, report analysis platform 230 may use an output of a first label as an input for a second label (as well as one or more input features, which may be other data relating to the historical expense reports and/or the historical audits), and may determine a likelihood that a particular historical expense report includes an issue and/or is associated with a set of supervised model features based on a similarity to other historical expense reports that include similar data. In this way, the report analysis platform transforms classification from a multilabel-classification problem to multiple single-classification problems, thereby reducing processing utilization. As a third step, the report analysis platform may determine a Hamming Loss Metric relating to an accuracy of a label in performing a classification by using the validation set of the data (e.g., an accuracy with which a weighting is applied to each historical report and whether each historical report includes an issue and/or a set of supervised model features, results in a correct prediction of whether a historical expense report includes an issue and/or a set of supervised model features, and/or the like, thereby accounting for variations among historical expense reports). As a fourth step, the report analysis platform may finalize the machine learning model based on labels that satisfy a threshold accuracy associated with the Hamming Loss Metric, and may use the machine learning model for subsequent prediction of whether an expense report includes an issue, includes a set of supervised model features, would pass or fail an audit, and/or the like.
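A minimal sketch of the classifier-chain and Hamming-loss steps, using scikit-learn on synthetic multi-label data, is shown below; the two labels stand in for, e.g., "failed a historical audit" and "includes an issue," and the base estimator and split sizes are assumptions.

```python
from sklearn.datasets import make_multilabel_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import hamming_loss
from sklearn.model_selection import train_test_split
from sklearn.multioutput import ClassifierChain

# Synthetic reports with two labels each, e.g., [failed_audit, contains_issue].
X, Y = make_multilabel_classification(n_samples=500, n_features=10, n_classes=2, random_state=0)
X_train, X_val, Y_train, Y_val = train_test_split(X, Y, test_size=0.3, random_state=0)

# The chain feeds the prediction for the first label in as an extra input when
# predicting the second label, turning one multi-label problem into a sequence
# of single-label problems.
chain = ClassifierChain(LogisticRegression(max_iter=1000), order=[0, 1], random_state=0)
chain.fit(X_train, Y_train)

# Hamming loss on the validation set measures labeling accuracy across labels.
print("Hamming loss:", hamming_loss(Y_val, chain.predict(X_val)))
```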
Turning to
Additionally, or alternatively, and as another example, the report analysis platform may perform a linear discriminant analysis of the historical expense reports, the historical audits, the set of supervised model features, and/or the like. For example, the report analysis platform may determine particular supervised model features that are associated with historical expense reports that pass an audit, that fail an audit, that include an issue, that do not include an issue, that match a pattern of previous compliant or non-compliant historical expense reports, and/or the like. Additionally, or alternatively, and as another example, the report analysis platform may perform a text analysis of the historical expense reports, the historical audits, and/or the set of supervised model features (e.g., of information that identifies the set of supervised model features) to identify terms, phrases, patterns of terms and/or phrases, and/or the like that are common to historical expense reports that passed a historical audit, that failed a historical audit, that include an issue, that do not include an issue, and/or the like.
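As a non-limiting illustration of the linear discriminant analysis step, the following sketch fits a discriminant model to synthetic pass/fail data and inspects which features most strongly separate the two classes; the data and feature names are placeholders.

```python
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

# Synthetic features for historical reports with pass/fail audit labels.
X, y = make_classification(n_samples=500, n_features=8, n_informative=3, random_state=0)

lda = LinearDiscriminantAnalysis()
lda.fit(X, y)

# Larger-magnitude coefficients suggest features that better separate reports
# that passed an audit from reports that failed.
for idx, weight in enumerate(lda.coef_[0]):
    print(f"feature_{idx}: {weight:+.3f}")
```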
Turning to
As shown by reference number 140, the report analysis platform may determine the set of unsupervised model features by processing data related to historical expense reports using a machine learning model. In some implementations, the machine learning model may be similar to that described elsewhere herein. For example, the machine learning model may output a set of unsupervised model features determined from the data related to the historical expense reports. In some implementations, the report analysis platform may input the multi-entity profile for the data related to the historical expense reports to the machine learning model to determine the set of unsupervised model features, in a manner that is the same as or similar to that described elsewhere herein with regard to the set of supervised model features.
Turning to
Additionally, or alternatively, and as another example, the report analysis platform may perform an inter-individual analysis of the set of unsupervised model features. Continuing with the previous example, the report analysis platform may identify a pattern of unsupervised model features across multiple individuals (e.g., associated with a same location, that submitted historical expense reports during a time period, associated with a same type of expense, and/or the like). This facilitates analysis of a new expense report for an individual in the context of the individual's peer group, thereby improving performance of the report analysis platform with regard to identifying issues included in expense reports.
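A simple way to sketch the intra-individual and inter-individual comparisons is with grouped z-scores, as below; the columns, peer grouping, and values are illustrative assumptions only.

```python
import pandas as pd

# Illustrative expense amounts; column names are assumptions.
df = pd.DataFrame({
    "employee_id": ["E1", "E1", "E1", "E2", "E2", "E2"],
    "location": ["NYC", "NYC", "NYC", "NYC", "NYC", "NYC"],
    "amount": [50.0, 55.0, 400.0, 60.0, 58.0, 52.0],
})

# Intra-individual: how unusual is an amount relative to the same employee's history?
by_employee = df.groupby("employee_id")["amount"]
df["intra_z"] = (df["amount"] - by_employee.transform("mean")) / by_employee.transform("std")

# Inter-individual: how unusual is it relative to the peer group at the same location?
by_location = df.groupby("location")["amount"]
df["inter_z"] = (df["amount"] - by_location.transform("mean")) / by_location.transform("std")

print(df)  # the 400.0 row stands out on both measures
```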
Additionally, or alternatively, the report analysis platform may perform a kernel density estimation (KDE) anomaly detection analysis of the historical expense reports, the set of unsupervised model features, and/or the like. For example, the report analysis platform may perform the KDE anomaly detection analysis to detect anomalies in the historical expense reports (e.g., anomalous locations, amounts, and/or the like).
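One way to sketch the KDE anomaly detection step with scikit-learn, using synthetic expense amounts and an assumed 1st-percentile density threshold, is shown below.

```python
import numpy as np
from sklearn.neighbors import KernelDensity

# Synthetic historical expense amounts; log-scaling keeps the long tail manageable.
rng = np.random.default_rng(0)
historical = np.log(rng.lognormal(mean=4.0, sigma=0.5, size=1000)).reshape(-1, 1)

kde = KernelDensity(kernel="gaussian", bandwidth=0.2).fit(historical)

# Treat anything below the 1st percentile of training log-density as anomalous.
threshold = np.quantile(kde.score_samples(historical), 0.01)

new_amounts = np.log(np.array([[55.0], [20000.0]]))  # a typical amount vs. an outlier
print(kde.score_samples(new_amounts) < threshold)    # [False  True]
```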
Turning to
In some implementations, the super model may include a gradient boosting tree. For example, the report analysis platform may combine the set of supervised model features and the set of unsupervised model features into a gradient boosting tree that can be used to determine a score for a report (e.g., by combining patterns extracted from the data into a single model).
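By way of illustration, combining the two feature sets into a single gradient boosting model and scoring a report might be prototyped as follows; the feature columns and data are synthetic stand-ins, not the platform's actual features.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier

# Stand-ins: supervised features derived in the context of historical audits,
# and unsupervised features (e.g., anomaly scores, peer-group statistics).
supervised_features, audit_outcome = make_classification(n_samples=800, n_features=6, random_state=0)
unsupervised_features = np.random.default_rng(1).normal(size=(800, 4))

# Combine both feature sets into one matrix and fit a single boosted-tree model.
combined = np.hstack([supervised_features, unsupervised_features])
super_model = GradientBoostingClassifier(random_state=0)
super_model.fit(combined, audit_outcome)

# Score a new report: the predicted probability of the "issue" class serves as
# the likelihood that the report includes an issue.
new_report = combined[:1]  # placeholder feature row for an incoming report
score = super_model.predict_proba(new_report)[0, 1]
print(f"report score: {score:.3f}")
```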
As shown by reference number 155, the report analysis platform may receive an expense report to be processed. For example, the report analysis platform may receive the expense report after generating the super model and/or utilizing the super model to train a machine learning model. In some implementations, the expense report may include data that identifies a set of expenses associated with the expense report, an individual that submitted the expense report, a set of individuals that incurred the expenses, an amount of the expenses, a location of the expenses, a type of the expenses (e.g., food, fuel, lodging, and/or the like), a vendor associated with the expenses, and/or the like. In some implementations, the expense report may be similar to a historical expense report described elsewhere herein.
In some implementations, the report analysis platform 230 may receive the expense report from the server device 220, the client device 210, the user device, and/or the like. In some implementations, the report analysis platform 230 may receive the expense report when the expense report is generated, may receive a batch of expense reports at a particular time or after a threshold quantity of expense reports have been submitted, and/or the like. In some implementations, the report analysis platform 230 may receive thousands, millions, or more expense reports associated with thousands, millions, or more individuals, vendors, locations, and/or the like. In this way, the report analysis platform 230 may receive a quantity of expense reports that cannot be processed manually or objectively (e.g., in a consistent manner) by a human actor.
As shown by reference number 160, the report analysis platform may determine a score for the expense report. For example, the report analysis platform may determine the score after receiving the expense report. In some implementations, the score may indicate a likelihood of the expense report including an issue. For example, the score may indicate a likelihood of the expense report including a fraudulent expense, a likelihood of the expense report failing an audit, a likelihood of the expense report including data that does not match the features of the super model, and/or the like.
As shown by reference number 165, the report analysis platform may input the expense report (or data extracted from the expense report using a text processing technique, an image processing technique, and/or the like similar to that described elsewhere herein) into the super model. For example, the report analysis platform may input the expense report in association with determining the score for the expense report. In some implementations, the report analysis platform may use the super model to process the expense report (e.g., data from the expense report) to determine whether the expense report matches the set of supervised model features, the set of unsupervised model features, and/or the like. Continuing with the previous example, the report analysis platform may determine whether a combination of location, amount, individual, and/or the like associated with the expense report matches a pattern of supervised model features and/or unsupervised model features included in the super model.
In some implementations, the super model may be a machine learning model that has been trained on a set of supervised model features and/or a set of unsupervised model features. For example, the super model may process the expense report after having been trained. Continuing with the previous example, the super model may be similar to other machine learning models described elsewhere herein and/or may be trained to output a score (e.g., based on data associated with the expense report).
As shown by reference number 170, the super model may output the score after the report analysis platform has processed the expense report using the super model. Additionally, or alternatively, when the report analysis platform processes the expense report using a machine learning model, the machine learning model may output the score. In some implementations, the score may be an average score, a range of scores, and/or the like. For example, the report analysis platform may perform multiple iterations of processing the expense report and may generate the score based on the scores associated with the multiple iterations.
As shown by reference number 175, the report analysis platform 230 may perform an action based on the score. For example, the report analysis platform 230 may perform an action after determining the score for the expense report. In some implementations, the report analysis platform 230 may trigger an alarm based on the score (e.g., based on whether the score satisfies a threshold). Additionally, or alternatively, the report analysis platform 230 may send a message to the client device 210, the user device, and/or the server device 220 when the score satisfies a threshold (e.g., that includes information that identifies the expense report, the score, and/or the like). Additionally, or alternatively, the report analysis platform 230 may generate a report that identifies a result of processing a batch of expense reports (e.g., that includes information that identifies scores for each of the reports, whether the scores satisfy a threshold, a trend of the scores over time, and/or the like), and may store the report in the server device 220 and/or may output the report via the client device 210 and/or the user device. Additionally, or alternatively, the report analysis platform 230 may store the expense report in the server device 220 based on the score (e.g., based on whether the score satisfies a threshold), and may populate a user interface with information that identifies the expense report, with a link (e.g., a file path, a uniform resource locator (URL), a clickable icon, and/or the like) to a storage location of the expense report, and/or the like.
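A hedged sketch of threshold-based action selection follows; the threshold values and action descriptions are hypothetical, and a deployment would wire the branches to the messaging, alarm, storage, and user-interface behaviors described above.

```python
# Hypothetical thresholds; a deployment would tune these values.
ALARM_THRESHOLD = 0.9
REVIEW_THRESHOLD = 0.6

def act_on_score(report_id: str, score: float) -> str:
    """Map a report score to an action, mirroring the threshold checks above."""
    if score >= ALARM_THRESHOLD:
        return f"report {report_id}: trigger alarm and start automated investigation"
    if score >= REVIEW_THRESHOLD:
        return f"report {report_id}: flag for manual review and notify a client device"
    return f"report {report_id}: store score; no further action"

print(act_on_score("R-1001", 0.95))
print(act_on_score("R-1002", 0.35))
```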
Additionally, or alternatively, the report analysis platform 230 may update one or more of the models described herein. Additionally, or alternatively, the report analysis platform 230 may trigger an automated investigation of an expense report (e.g., may trigger a more rigorous analysis of the expense report, such as by requesting input of an explanation of a discrepancy, requesting upload of an itemized receipt, requesting transaction records from a vendor's server device 220, and/or the like). Additionally, or alternatively, the report analysis platform 230 may trigger a manual investigation of the expense report (e.g., by sending a message to a user device associated with an investigator). Additionally, or alternatively, the report analysis platform 230 may freeze a credit card, an account, and/or the like. Additionally, or alternatively, the report analysis platform 230 may remove, add, or modify a requirement to an expense and/or expense report approval process, such as a requirement related to an individual that needs to authorize an expense and/or expense report, a timing of that authorization (e.g., before or after that expense and/or expense report), what needs to be pre-authorized, and/or the like.
Additionally, or alternatively, the report analysis platform may flag the expense report based on the score. For example, the report analysis platform may flag the expense report as possibly including an error and/or for further review based on the score satisfying a threshold. Additionally, or alternatively, the report analysis platform may flag attributes associated with the expense report when the score satisfies a threshold. For example, the report analysis platform may flag an individual, a location, a vendor, and/or the like associated with the expense report when the score satisfies a threshold. In some implementations, and continuing with the previous example, the report analysis platform may process old expense reports associated with the flagged attributes, may process any new expense reports associated with the flagged attributes, and/or the like.
In this way, the report analysis platform is capable of quickly and efficiently processing thousands, millions, or more expense reports generated by an organization in real-time or near real-time. This reduces an amount of time needed to process expense reports associated with the organization, thereby improving an efficiency of processing the expense reports. In addition, this increases a throughput of the organization's capability to process expense reports, thereby reducing or eliminating a risk of a missed expense report that includes an issue. Further, this provides an objective and verifiable tool that can be used to process expense reports, thereby providing the organization with a new way of processing expense reports and/or reducing or eliminating waste associated with a subjective analysis of expense reports.
As indicated above,
Client device 210 includes one or more devices capable of receiving, generating, storing, processing, and/or providing information associated with a report. For example, client device 210 may include a mobile phone (e.g., a smart phone, a radiotelephone, etc.), a laptop computer, a tablet computer, a handheld computer, a gaming device, a wearable communication device (e.g., a smart wristwatch, a pair of smart eyeglasses, etc.), a desktop computer, or a similar type of device. In some implementations, client device 210 may provide, to report analysis platform 230, a report to be processed by report analysis platform 230, as described elsewhere herein. In some implementations, a user device, as described elsewhere herein, may be the same as or similar to client device 210.
Server device 220 includes one or more devices capable of receiving, generating, storing, processing, and/or providing information associated with a report. For example, server device 220 may include a server (e.g., in a data center or a cloud computing environment), a data center (e.g., a multi-server micro datacenter), a workstation computer, a virtual machine (VM) provided in a cloud computing environment, or a similar type of device. In some implementations, server device 220 may include a communication interface that allows server device 220 to receive information from and/or transmit information to other devices in environment 200. In some implementations, server device 220 may be a physical device implemented within a housing, such as a chassis. In some implementations, server device 220 may be a virtual device implemented by one or more computer devices of a cloud computing environment or a data center. In some implementations, server device 220 may provide, to report analysis platform 230, a report to be processed by report analysis platform 230, as described elsewhere herein.
Report analysis platform 230 includes one or more devices capable of receiving, generating, storing, processing, and/or providing information related to reports. For example, report analysis platform 230 may include a cloud server or a group of cloud servers. In some implementations, report analysis platform 230 may be designed to be modular such that certain software components can be swapped in or out depending on a particular need. As such, report analysis platform 230 may be easily and/or quickly reconfigured for different uses.
In some implementations, as shown in
Cloud computing environment 232 includes an environment that hosts report analysis platform 230. Cloud computing environment 232 may provide computation, software, data access, storage, and/or other services that do not require end-user knowledge of a physical location and configuration of a system and/or a device that hosts report analysis platform 230. As shown, cloud computing environment 232 may include a group of computing resources 234 (referred to collectively as “computing resources 234” and individually as “computing resource 234”).
Computing resource 234 includes one or more personal computers, workstation computers, server devices, or another type of computation and/or communication device. In some implementations, computing resource 234 may host report analysis platform 230. The cloud resources may include compute instances executing in computing resource 234, storage devices provided in computing resource 234, data transfer devices provided by computing resource 234, etc. In some implementations, computing resource 234 may communicate with other computing resources 234 via wired connections, wireless connections, or a combination of wired and wireless connections.
As further shown in
Application 234-1 includes one or more software applications that may be provided to or accessed by one or more devices of environment 200. Application 234-1 may eliminate a need to install and execute the software applications on devices of environment 200. For example, application 234-1 may include software associated with report analysis platform 230 and/or any other software capable of being provided via cloud computing environment 232. In some implementations, one application 234-1 may send/receive information to/from one or more other applications 234-1, via virtual machine 234-2. In some implementations, application 234-1 may include a software application associated with one or more databases and/or operating systems. For example, application 234-1 may include an enterprise application, a functional application, an analytics application, and/or the like.
Virtual machine 234-2 includes a software implementation of a machine (e.g., a computer) that executes programs like a physical machine. Virtual machine 234-2 may be either a system virtual machine or a process virtual machine, depending upon use and degree of correspondence to any real machine by virtual machine 234-2. A system virtual machine may provide a complete system platform that supports execution of a complete operating system (“OS”). A process virtual machine may execute a single program, and may support a single process. In some implementations, virtual machine 234-2 may execute on behalf of a user (e.g., a user of client device 210), and may manage infrastructure of cloud computing environment 232, such as data management, synchronization, or long-duration data transfers.
Virtualized storage 234-3 includes one or more storage systems and/or one or more devices that use virtualization techniques within the storage systems or devices of computing resource 234. In some implementations, within the context of a storage system, types of virtualizations may include block virtualization and file virtualization. Block virtualization may refer to abstraction (or separation) of logical storage from physical storage so that the storage system may be accessed without regard to physical storage or heterogeneous structure. The separation may permit administrators of the storage system flexibility in how the administrators manage storage for end users. File virtualization may eliminate dependencies between data accessed at a file level and a location where files are physically stored. This may enable optimization of storage use, server consolidation, and/or performance of non-disruptive file migrations.
Hypervisor 234-4 provides hardware virtualization techniques that allow multiple operating systems (e.g., “guest operating systems”) to execute concurrently on a host computer, such as computing resource 234. Hypervisor 234-4 may present a virtual operating platform to the guest operating systems, and may manage the execution of the guest operating systems. Multiple instances of a variety of operating systems may share virtualized hardware resources.
Network 240 includes one or more wired and/or wireless networks. For example, network 240 may include a cellular network (e.g., a long-term evolution (LTE) network, a code division multiple access (CDMA) network, a 3G network, a 4G network, a 5G network, another type of next generation network, etc.), a public land mobile network (PLMN), a local area network (LAN), a wide area network (WAN), a metropolitan area network (MAN), a telephone network (e.g., the Public Switched Telephone Network (PSTN)), a private network, an ad hoc network, an intranet, the Internet, a fiber optic-based network, a cloud computing network, or the like, and/or a combination of these or other types of networks.
The number and arrangement of devices and networks shown in
Bus 310 includes a component that permits communication among the components of device 300. Processor 320 is implemented in hardware, firmware, or a combination of hardware and software. Processor 320 is a central processing unit (CPU), a graphics processing unit (GPU), an accelerated processing unit (APU), a microprocessor, a microcontroller, a digital signal processor (DSP), a field-programmable gate array (FPGA), an application-specific integrated circuit (ASIC), or another type of processing component. In some implementations, processor 320 includes one or more processors capable of being programmed to perform a function. Memory 330 includes a random access memory (RAM), a read only memory (ROM), and/or another type of dynamic or static storage device (e.g., a flash memory, a magnetic memory, and/or an optical memory) that stores information and/or instructions for use by processor 320.
Storage component 340 stores information and/or software related to the operation and use of device 300. For example, storage component 340 may include a hard disk (e.g., a magnetic disk, an optical disk, a magneto-optic disk, and/or a solid state disk), a compact disc (CD), a digital versatile disc (DVD), a floppy disk, a cartridge, a magnetic tape, and/or another type of non-transitory computer-readable medium, along with a corresponding drive.
Input component 350 includes a component that permits device 300 to receive information, such as via user input (e.g., a touch screen display, a keyboard, a keypad, a mouse, a button, a switch, and/or a microphone). Additionally, or alternatively, input component 350 may include a sensor for sensing information (e.g., a global positioning system (GPS) component, an accelerometer, a gyroscope, and/or an actuator). Output component 360 includes a component that provides output information from device 300 (e.g., a display, a speaker, and/or one or more light-emitting diodes (LEDs)).
Communication interface 370 includes a transceiver-like component (e.g., a transceiver and/or a separate receiver and transmitter) that enables device 300 to communicate with other devices, such as via a wired connection, a wireless connection, or a combination of wired and wireless connections. Communication interface 370 may permit device 300 to receive information from another device and/or provide information to another device. For example, communication interface 370 may include an Ethernet interface, an optical interface, a coaxial interface, an infrared interface, a radio frequency (RF) interface, a universal serial bus (USB) interface, a Wi-Fi interface, a cellular network interface, or the like.
Device 300 may perform one or more processes described herein. Device 300 may perform these processes based on processor 320 executing software instructions stored by a non-transitory computer-readable medium, such as memory 330 and/or storage component 340. A computer-readable medium is defined herein as a non-transitory memory device. A memory device includes memory space within a single physical storage device or memory space spread across multiple physical storage devices.
Software instructions may be read into memory 330 and/or storage component 340 from another computer-readable medium or from another device via communication interface 370. When executed, software instructions stored in memory 330 and/or storage component 340 may cause processor 320 to perform one or more processes described herein. Additionally, or alternatively, hardwired circuitry may be used in place of or in combination with software instructions to perform one or more processes described herein. Thus, implementations described herein are not limited to any specific combination of hardware circuitry and software.
The number and arrangement of components shown in
As shown in
As further shown in
As further shown in
As further shown in
As further shown in
As further shown in
Process 400 may include additional implementations, such as any single implementation or any combination of implementations described below and/or in connection with one or more other processes described elsewhere herein.
In some implementations, the report analysis platform may determine the set of supervised model features based on at least one of: a pattern of non-compliant reports included in the historical reports and identified in the historical audits, a linear discriminant analysis of the historical reports and the historical audits, or a text analysis of the historical reports and the historical audits. In some implementations, the report analysis platform may determine the set of unsupervised model features based on at least one of: an intra-individual analysis of the individuals, an inter-individual analysis of the individuals, or a kernel density estimation (KDE) anomaly detection analysis of the historical reports.
In some implementations, the report analysis platform may determine the multi-entity profile based on at least one of: the individuals, vendors associated with the historical reports, or locations associated with the individuals, the vendors, or the organization. In some implementations, the set of supervised model features identifies features of compliant historical reports and non-compliant historical reports included in the historical reports, wherein the non-compliant historical reports include the issue and the compliant historical reports do not include the issue. In some implementations, the set of unsupervised model features identifies features of the historical reports that are indicative of a pattern of the data related to the historical reports. In some implementations, the report analysis platform may flag the report as including the issue or not including the issue based on the score, and may store, in a data structure, information that identifies the report and an identifier that identifies whether the report includes the issue or does not include the issue.
Although
As shown in
As further shown in
As further shown in
As further shown in
As further shown in
As further shown in
Process 500 may include additional implementations, such as any single implementation or any combination of implementations described below and/or in connection with one or more other processes described elsewhere herein.
In some implementations, the report analysis platform may train the model, utilizing the set of supervised model features and the set of unsupervised model features, to identify a likelihood of the issue being included in the report. In some implementations, the report analysis platform may process the report to identify the issue included in the report after training the model.
In some implementations, the report analysis platform may determine, utilizing the model, the score for the report after training the model, wherein the score indicates the likelihood of the issue being included in the report, and may flag the report as including the issue after determining the score, wherein the score satisfies a threshold, or flag the report as not including the issue after determining the score, wherein the score fails to satisfy the threshold. In some implementations, the report analysis platform may determine the set of supervised model features based on at least one of: a pattern of non-compliant reports included in the historical reports and identified in the historical audits, a linear discriminant analysis of the historical reports and the historical audits, or a text analysis of the historical reports and the historical audits, and may determine the set of unsupervised model features based on at least one of: an intra-individual analysis of individuals, an inter-individual analysis of the individuals, or a kernel density estimation (KDE) anomaly detection analysis of the historical reports.
In some implementations, the report analysis platform may trigger an alarm after flagging the report as including the issue. In some implementations, the report analysis platform may store a log, related to processing the report, after processing the report, wherein the log identifies whether the report includes the issue or does not include the issue.
Although
As shown in
As further shown in
As further shown in
As further shown in
As further shown in
As further shown in
As further shown in
Process 600 may include additional implementations, such as any single implementation or any combination of implementations described below and/or in connection with one or more other processes described elsewhere herein.
In some implementations, the report analysis platform may combine the set of supervised model features and the set of unsupervised model features into a super model after determining the set of supervised model features and the set of unsupervised model features, and may train the model utilizing the super model after combining the set of supervised model features and the set of unsupervised model features into the super model. In some implementations, the report analysis platform may combine the set of supervised model features and the set of unsupervised model features into a gradient boosting tree after determining the set of supervised model features and the set of unsupervised model features, wherein the gradient boosting tree is the super model, and may train the model based on the gradient boosting tree after combining the set of supervised model features and the set of unsupervised model features into the gradient boosting tree.
In some implementations, the report analysis platform 230 may send a message to a client device 210 after determining the score, wherein the message includes information that identifies the report, the score, or whether the report includes the issue. In some implementations, the report analysis platform 230 may determine, after determining the score, that the report includes the issue based on the score satisfying a threshold, may identify a type of the issue included in the report after determining that the report includes the issue, and may perform the one or more actions based on the type of the issue after identifying the type of the issue. In some implementations, the report analysis platform 230 may populate a user interface, provided for display via a display associated with a device, with information that identifies the report after identifying the type of the issue, wherein the user interface is associated with the type of the issue and is associated with displaying information related to a set of reports that includes the type of the issue.
Although
In this way, report analysis platform 230 provides a tool that can process reports in a technical and/or objective manner to determine whether the reports include an issue. This improves an accuracy of processing the reports and/or removes waste due to subjectivity of a manual review of the reports. In addition, report analysis platform 230 provides a tool that can be used to process reports as a quantity of reports generated by an organization increases, thereby providing a scalable tool that an organization can use. Further, this reduces or eliminates a need for a manual review of the reports, thereby increasing a flexibility of when the reports are reviewed (e.g., the reports can be reviewed 24 hours a day). Further, this increases a consistency of review of the reports (e.g., which may reduce costs associated with quality control of review of the reports).
The foregoing disclosure provides illustration and description, but is not intended to be exhaustive or to limit the implementations to the precise form disclosed. Modifications and variations are possible in light of the above disclosure or may be acquired from practice of the implementations.
As used herein, the term component is intended to be broadly construed as hardware, firmware, and/or a combination of hardware and software.
Some implementations are described herein in connection with thresholds. As used herein, satisfying a threshold may refer to a value being greater than the threshold, more than the threshold, higher than the threshold, greater than or equal to the threshold, less than the threshold, fewer than the threshold, lower than the threshold, less than or equal to the threshold, equal to the threshold, or the like.
It will be apparent that systems and/or methods, described herein, may be implemented in different forms of hardware, firmware, or a combination of hardware and software. The actual specialized control hardware or software code used to implement these systems and/or methods is not limiting of the implementations. Thus, the operation and behavior of the systems and/or methods were described herein without reference to specific software code—it being understood that software and hardware can be designed to implement the systems and/or methods based on the description herein.
Even though particular combinations of features are recited in the claims and/or disclosed in the specification, these combinations are not intended to limit the disclosure of possible implementations. In fact, many of these features may be combined in ways not specifically recited in the claims and/or disclosed in the specification. Although each dependent claim listed below may directly depend on only one claim, the disclosure of possible implementations includes each dependent claim in combination with every other claim in the claim set.
No element, act, or instruction used herein should be construed as critical or essential unless explicitly described as such. Also, as used herein, the articles “a” and “an” are intended to include one or more items, and may be used interchangeably with “one or more.” Furthermore, as used herein, the term “set” is intended to include one or more items (e.g., related items, unrelated items, a combination of related and unrelated items, etc.), and may be used interchangeably with “one or more.” Where only one item is intended, the term “one” or similar language is used. Also, as used herein, the terms “has,” “have,” “having,” or the like are intended to be open-ended terms. Further, the phrase “based on” is intended to mean “based, at least in part, on” unless explicitly stated otherwise.