The subject matter described herein relates to enhanced techniques for automated data source join proposals based on process relations within metadata.
Analysis of various data sources can be used to gain an understanding of a particular computer-implemented process, such as its performance or efficiency. Relevant information can be stored in hundreds of different data sources. The data sources can be document-oriented data tables which include technical keys and technical reference information. Joining of the data sources can link information across multiple data sources to provide an end-to-end analysis of a particular computer-implemented process. In joining the data sources, however, a user may need to have technical knowledge of the metadata within each data source. For example, a user may need to understand the header information of the documents in order to properly join them. Such joins can require a technical analysis of the contents of a particular document along with generation of technical keys to understand how the data sources are related, for example, how invoices relate to deliveries and/or how deliveries relate to sales orders.
In one aspect, data derived by user input via a graphical user interface is received. The data encapsulates a first data source and one or more data fields within the first data source. A data object model of the first data source is generated based on the one or more fields. A plurality of data sources are searched for the one or more data fields to identify a second data source having one or more related fields. A technical relationship between the first data source and the second data source is defined based on the one or more data fields. A join condition between the first data source and the second data source is determined based on the defined technical relationship. The join condition is provided to a user via the graphical user interface for confirmation of joining the one or more fields of the first data source and the second data source.
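As a minimal sketch of the last two steps of this aspect, a defined technical relationship can be turned into a join condition that is then shown to the user for confirmation. The class name, the function name, the field names, and the SQL-like string form of the condition are all hypothetical and chosen only for illustration; they are not taken from the described subject matter.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class TechnicalRelationship:
    """Hypothetical container for a relationship between two data sources."""
    first_source: str
    second_source: str
    # Each pair is (field in first source, related field in second source).
    field_pairs: Tuple[Tuple[str, str], ...]


def propose_join_condition(rel: TechnicalRelationship) -> str:
    """Build a human-readable join condition from the related field pairs."""
    clauses = [
        f"{rel.first_source}.{a} = {rel.second_source}.{b}"
        for a, b in rel.field_pairs
    ]
    return " AND ".join(clauses)


relationship = TechnicalRelationship(
    first_source="customer_invoice",
    second_source="sales_order",
    field_pairs=(
        ("sales_order_id", "id"),
        ("sales_order_item_id", "item_id"),
    ),
)

# The proposal would be displayed in the graphical user interface for confirmation.
proposal = propose_join_condition(relationship)
print(proposal)
```

Keeping the relationship as plain field pairs means the same structure could drive either a displayed proposal or an executed join.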
In some variations, via user input via the graphical user interface, a confirmation of the join condition can be received. A joined data source can be generated. The joined data source can include the one or more fields from the first data source and one or more related fields from the second data source. The joined data source can be interactively analyzed via the graphical user interface.
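One way to picture the generation of a joined data source after confirmation is an inner join over in-memory rows, sketched below. The row layout, the field names, and the `join_rows` helper are assumptions made for illustration, and a real system would more likely delegate the join to a database.

```python
def join_rows(left, right, pairs, left_fields, right_fields):
    """Inner-join two lists of dict rows on the given (left, right) field pairs.

    Only the selected fields of each side are carried into the joined rows.
    """
    # Index the right-hand rows by their join key for fast lookup.
    index = {}
    for r in right:
        key = tuple(r[rf] for _, rf in pairs)
        index.setdefault(key, []).append(r)

    joined = []
    for l in left:
        key = tuple(l[lf] for lf, _ in pairs)
        for r in index.get(key, []):
            row = {f: l[f] for f in left_fields}
            row.update({f: r[f] for f in right_fields})
            joined.append(row)
    return joined


# Hypothetical sample rows for two data sources.
invoices = [
    {"invoice_id": "INV-1", "invoice_date": "2020-01-15",
     "sales_order_id": "SO-1", "sales_order_item_id": 10},
    {"invoice_id": "INV-2", "invoice_date": "2020-01-20",
     "sales_order_id": "SO-9", "sales_order_item_id": 10},
]
orders = [
    {"order_id": "SO-1", "item_id": 10, "net_value": 250.0},
    {"order_id": "SO-2", "item_id": 10, "net_value": 99.0},
]

# The confirmed join condition, expressed as pairs of related fields.
confirmed_pairs = [("sales_order_id", "order_id"), ("sales_order_item_id", "item_id")]

joined_source = join_rows(
    invoices, orders, confirmed_pairs,
    left_fields=["invoice_id", "invoice_date"],
    right_fields=["order_id", "net_value"],
)
print(joined_source)
```

The resulting rows combine the selected fields of the first data source with the related fields of the second, which is the form an interactive analysis could then operate on.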
In other variations, based on user input via the graphical user interface, the first data source can be removed from the join condition. A third data source can be added based on user input via the graphical user interface. A selection, based on user input via the graphical user interface, of one or more fields of the third data source can be received. The generating, determining, and providing can be repeated using the third data source in place of the first data source.
In some variations, a second join condition to a fourth data source can be generated based on a number of fields of the technical reference. The technical relationship can contain the one or more fields within the first data source and a reference field to a type of the second data source. The technical relationship can also be based on metadata stored within the first data source and the second data source.
In other variations, the first data source can contain document types including a sales order, a customer invoice, a clearing document, a payment order, or a bank payment order.
Non-transitory computer program products (i.e., physically embodied computer program products) are also described that store instructions, which when executed by one or more data processors of one or more computing systems, cause at least one data processor to perform operations herein. Similarly, computer systems are also described that may include one or more data processors and memory coupled to the one or more data processors. The memory may temporarily or permanently store instructions that cause at least one processor to perform one or more of the operations described herein. In addition, methods can be implemented by one or more data processors either within a single computing system or distributed among two or more computing systems. Such computing systems can be connected and can exchange data and/or commands or other instructions or the like via one or more connections, including but not limited to a connection over a network (e.g., the Internet, a wireless wide area network, a local area network, a wide area network, a wired network, or the like), via a direct connection between one or more of the multiple computing systems, etc.
The subject matter described herein provides many technical advantages. For example, the current subject matter can provide data object models that consistently define various documents in a generic language to provide uniformity across the data sources. This uniformity allows the various data object models to be interpreted and a proposal for how to join various data sources to be created automatically.
The details of one or more variations of the subject matter described herein are set forth in the accompanying drawings and the description below. Other features and advantages of the subject matter described herein will be apparent from the description and drawings, and from the claims.
Like reference symbols in the various drawings indicate like elements.
Transparency into an end-to-end computer-implemented process can be important to understand how to evaluate performance of a particular process. For example, a user may be interested in process efficiencies of a particular aspect of a computer-implemented process, reporting on process lead times, or traceability of an end-to-end logistics platform. A user can, using the user interface (UI) capabilities described herein, be provided with guidance on how to define joined data sources by leveraging references of document chains. Using the subject matter described herein, a user no longer needs to have technical knowledge to join multiple data sources. Rather, a UI can automatically generate join proposals for a user by generating data model objects on top of document types and technical references between multiple data sources. The join proposals can occur in a manner that is transparent to a user. An automated join proposal can be generated to guide a user within a modeled UI by providing step-by-step assistance in joining data sources, as described herein.
Each data object 250 can include a number of fields contained within the document type. Customer invoice 220 can, for example, include field 220A (e.g., invoice identification) and field 220B (e.g., invoice date). Customer invoice 220 can also include a customer invoice item 222 having a field 222A (e.g., item identification). In another example, sales order 210 can include field 210A (e.g., identification). Sales order 210 can also include sales order item 212 having field 212A (e.g., item identification) and field 212B (e.g., net value).
Data objects 250 can also store the technical references 275 between a number of document types. A technical reference between, for example, customer invoice 220 and sales order 210 can be generated to include relationship data between the two document types. For example, technical reference 275 for customer invoice 220 can indicate a relationship to document type sales order 210 via field 275A. Technical reference 275 can also include identification of a sales order item 212 via field 275B, identification of the sales order fields (e.g., 210A) via field 275C, and identification of the customer invoice item field 222A via field 275D. The data object 250 for a customer invoice can include the customer invoice document type 220, customer invoice item 222, and technical reference 275. The data object 250 for a sales order can include sales order document type 210 and sales order item 212.
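The data object structure described above can be sketched with a few plain data classes. This is only one possible reading: the class names, attribute names, and field names are hypothetical, and the comments that map attributes to reference numerals are an interpretation of the description rather than a definitive layout.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class TechnicalReference:
    """Hypothetical model of a technical reference held by a data object."""
    target_document_type: str   # cf. field 275A: the referenced document type
    target_item_type: str       # cf. field 275B: the referenced item
    target_id_field: str        # cf. field 275C: identifying field of the target
    source_item_field: str      # cf. field 275D: item field of the referencing document


@dataclass
class DataObject:
    """Hypothetical model of a data object built on a document type."""
    document_type: str
    header_fields: List[str]
    item_fields: List[str] = field(default_factory=list)
    reference: Optional[TechnicalReference] = None


sales_order = DataObject(
    document_type="sales_order",
    header_fields=["id"],
    item_fields=["item_id", "net_value"],
)

customer_invoice = DataObject(
    document_type="customer_invoice",
    header_fields=["invoice_id", "invoice_date"],
    item_fields=["item_id"],
    reference=TechnicalReference(
        target_document_type="sales_order",
        target_item_type="sales_order_item",
        target_id_field="id",
        source_item_field="item_id",
    ),
)


def referenced_type(obj: DataObject) -> Optional[str]:
    """Return the document type a data object refers to, if any."""
    return obj.reference.target_document_type if obj.reference else None


print(referenced_type(customer_invoice))
```

In this sketch only the customer invoice carries a reference, mirroring the description in which the sales order data object contains no technical reference of its own.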
Analytical data sources 260 can be modeled on top of each data object 250 such that the fields of the data source represent the fields of each data object. Each data source can contain the information (e.g., fields) contained within the data object 250. A user can select a number of fields for each data source that are relevant to a particular analysis. For example, customer invoice data source 280 can be modeled on top of the customer invoice data object. In this example, the user selected fields include field 220A (e.g., customer invoice identification), field 220B (e.g., customer invoice date), field 222A (e.g., customer item identification), field 275C (e.g., technical reference 275 field sales order identification), and field 275D (e.g., technical reference 275 field sales order item). Similarly, sales order data source 270 can be modeled on top of the data object for sales orders 210. In this example, a user selected a number of fields within sales order data source 270 including field 210A (e.g., sales order identification), field 212A (e.g., sales order item identification), and field 212B (e.g., net value). The user selection is translated into data which encapsulates the data source along with the selected fields.
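The translation of a user selection into data that encapsulates the data source and its selected fields might look like the following sketch. The `model_data_source` helper, the dictionary layout, and the field names are assumptions introduced for illustration only.

```python
def model_data_source(name, data_object_fields, selected_fields):
    """Encapsulate a data source and the user-selected fields.

    Raises ValueError if a selected field does not exist on the data object,
    since a data source can only expose fields of its underlying data object.
    """
    unknown = [f for f in selected_fields if f not in data_object_fields]
    if unknown:
        raise ValueError(f"fields not in data object: {unknown}")
    return {"data_source": name, "fields": list(selected_fields)}


# Hypothetical fields of the customer invoice data object, including the
# fields contributed by its technical reference.
customer_invoice_fields = [
    "invoice_id",            # cf. field 220A
    "invoice_date",          # cf. field 220B
    "item_id",               # cf. field 222A
    "sales_order_id",        # cf. field 275C
    "sales_order_item_id",   # cf. field 275D
]

selection = model_data_source(
    "customer_invoice_data_source",
    customer_invoice_fields,
    ["invoice_id", "invoice_date", "sales_order_id"],
)
print(selection)
```

Validating the selection against the data object keeps the data source a strict projection of the fields the data object actually defines.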
For example, if a user selects a field “delivery item ID”, the metadata of the data object can identify that the field refers to another document type. The data sources that are built on top of the field can be identified. Suggestions of these related data sources can be provided to a user based on this metadata. With the join relationships, users no longer need to understand the data types such as universally unique identifiers (UUIDs) and external IDs. The join relationship automatically proposes data sources that have matching data types such as UUIDs or external IDs. The metadata identifies the source of the sales order 210. Utilizing the existing database tables, the application extracts a field and maps that field to another data object such as an invoice 220.
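The metadata lookup described here can be sketched as follows: the metadata of a selected field names the document type it refers to, and every data source built on that document type is suggested. The metadata layout, the registry of data sources, and all names below are hypothetical.

```python
# Hypothetical field metadata: which document type a field refers to, if any.
FIELD_METADATA = {
    "delivery_item_id": {"refers_to": "delivery", "data_type": "UUID"},
    "invoice_date": {"refers_to": None, "data_type": "date"},
}

# Hypothetical registry mapping each data source to its underlying document type.
DATA_SOURCES = {
    "delivery_data_source": "delivery",
    "sales_order_data_source": "sales_order",
    "delivery_performance_data_source": "delivery",
}


def suggest_related_sources(field_name):
    """Suggest data sources built on the document type a field refers to."""
    meta = FIELD_METADATA.get(field_name)
    if not meta or meta["refers_to"] is None:
        return []  # the field carries no reference to another document type
    return sorted(
        name for name, doc_type in DATA_SOURCES.items()
        if doc_type == meta["refers_to"]
    )


print(suggest_related_sources("delivery_item_id"))
```

Because the suggestion is driven by the reference in the metadata, the user never has to inspect whether the underlying key is a UUID or an external ID.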
Based on the technical references between various data sources, a join proposal can be automatically generated to propose a join condition between related data sources. The join proposal, once accepted by a user, can generate a joined data source 270 containing a coherent data source 232. For example, in
A user can be guided through the creation of coherent data source 232 using a UI.
For example, a user can select the sales order data source 270 along with the field “buyer”. A listing of all data sources which contain the field “buyer” can be generated (e.g., join condition). The user can select one of the data sources from the listing and a coherent data source can be generated having the buyer field from the sales order data source 270 and the user selected data source. In some cases, more than one join condition can be required based on the number of related data sources. The number of join conditions proposed to the user can be based on the complexity of the related data sources.
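The “buyer” example can be sketched as a search over a catalog of data sources for those that contain the selected field. The catalog contents and the `sources_with_field` helper are assumptions for illustration; the description does not specify how the listing is computed.

```python
# Hypothetical catalog: data source name -> set of field names it exposes.
CATALOG = {
    "sales_order": {"id", "buyer", "net_value"},
    "customer_invoice": {"invoice_id", "buyer", "invoice_date"},
    "delivery": {"delivery_id", "ship_to"},
}


def sources_with_field(catalog, field_name, selected_source):
    """List data sources, other than the selected one, that contain the field."""
    return sorted(
        name for name, fields in catalog.items()
        if field_name in fields and name != selected_source
    )


# The user selected the sales order data source and the field "buyer".
candidates = sources_with_field(CATALOG, "buyer", selected_source="sales_order")
print(candidates)
```

Each entry in the resulting listing is a candidate the user could pick to form the coherent data source on the shared field.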
One or more aspects or features of the subject matter described herein can be realized in digital electronic circuitry, integrated circuitry, specially designed application specific integrated circuits (ASICs), field programmable gate arrays (FPGAs), computer hardware, firmware, software, and/or combinations thereof. These various aspects or features can include implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which can be special or general purpose, coupled to receive data and instructions from, and to transmit data and instructions to, a storage system, at least one input device, and at least one output device. The programmable system or computing system can include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other.
These computer programs, which can also be referred to as programs, software, software applications, applications, components, or code, include machine instructions for a programmable processor, and can be implemented in a high-level procedural language, an object-oriented programming language, a functional programming language, a logical programming language, and/or in assembly/machine language. As used herein, the term “computer-readable medium” refers to any computer program product, apparatus and/or device, such as for example magnetic discs, optical disks, memory, and Programmable Logic Devices (PLDs), used to provide machine instructions and/or data to a programmable processor, including a computer-readable medium that receives machine instructions as a computer-readable signal. The term “computer-readable signal” refers to any signal used to provide machine instructions and/or data to a programmable processor. The computer-readable medium can store such machine instructions non-transitorily, such as for example as would a non-transient solid-state memory or a magnetic hard drive or any equivalent storage medium. The computer-readable medium can alternatively or additionally store such machine instructions in a transient manner, for example as would a processor cache or other random access memory associated with one or more physical processor cores.
In one example, a disk controller 648 can interface one or more optional disk drives to the system bus 604. These disk drives can be external or internal floppy disk drives such as 660, external or internal CD-ROM, CD-R, CD-RW or DVD, or solid state drives such as 652, or external or internal hard drives 656. As indicated previously, these various disk drives 652, 656, 660 and disk controllers are optional devices. The system bus 604 can also include at least one communication port 620 to allow for communication with external devices either physically connected to the computing system or available externally through a wired or wireless network. In some cases, the communication port 620 includes or otherwise comprises a network interface.
To provide for interaction with a user, the subject matter described herein can be implemented on a computing device having a display device 640 (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information obtained from the bus 604 to the user and an input device 632 such as a keyboard and/or a pointing device (e.g., a mouse or a trackball) and/or a touchscreen by which the user can provide input to the computer. Other kinds of input devices 632 can be used to provide for interaction with a user as well; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback by way of a microphone 636, or tactile feedback); and input from the user can be received in any form, including acoustic, speech, or tactile input. The input device 632 and the microphone 636 can be coupled to and convey information via the bus 604 by way of an input device interface 628. Other computing devices, such as dedicated servers, can omit one or more of the display 640 and display interface 614, the input device 632, the microphone 636, and input device interface 628.
To provide for interaction with a user, the subject matter described herein can be implemented on a computer having a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to the user and a keyboard and a pointing device (e.g., a mouse or a trackball) and/or a touchscreen by which the user can provide input to the computer. Other kinds of devices can be used to provide for interaction with a user as well; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user can be received in any form, including acoustic, speech, or tactile input.
In the descriptions above and in the claims, phrases such as “at least one of” or “one or more of” may occur followed by a conjunctive list of elements or features. The term “and/or” may also occur in a list of two or more elements or features. Unless otherwise implicitly or explicitly contradicted by the context in which it is used, such a phrase is intended to mean any of the listed elements or features individually or any of the recited elements or features in combination with any of the other recited elements or features. For example, the phrases “at least one of A and B;” “one or more of A and B;” and “A and/or B” are each intended to mean “A alone, B alone, or A and B together.” A similar interpretation is also intended for lists including three or more items. For example, the phrases “at least one of A, B, and C;” “one or more of A, B, and C;” and “A, B, and/or C” are each intended to mean “A alone, B alone, C alone, A and B together, A and C together, B and C together, or A and B and C together.” In addition, use of the term “based on,” above and in the claims is intended to mean, “based at least in part on,” such that an unrecited feature or element is also permissible.
The subject matter described herein can be embodied in systems, apparatus, methods, and/or articles depending on the desired configuration. The implementations set forth in the foregoing description do not represent all implementations consistent with the subject matter described herein. Instead, they are merely some examples consistent with aspects related to the described subject matter. Although a few variations have been described in detail above, other modifications or additions are possible. In particular, further features and/or variations can be provided in addition to those set forth herein. For example, the implementations described above can be directed to various combinations and subcombinations of the disclosed features and/or combinations and subcombinations of several further features disclosed above. In addition, the logic flows depicted in the accompanying figures and/or described herein do not necessarily require the particular order shown, or sequential order, to achieve desirable results. Other implementations may be within the scope of the following claims.