This disclosure relates to managing data and, more particularly, to systems, methods, and software for implementing, utilizing, or otherwise managing archived data through a generic framework.
Businesses often generate and utilize large amounts of data during their operation and management. Often, this data may be contained in documents, e.g., spreadsheets, correspondence, invoices, purchase orders, or other business forms. Documents may only be used or needed for a finite duration of time, yet a business may not desire to discard the documents completely after they are no longer needed. In some cases, documents that are no longer needed for daily business decisions or management may be moved to an archive. Before archiving, these documents are typically stored electronically in active (or other fast access) storage, such as a database, which allows the business application to search the documents using one or more pieces of information, often termed metadata. This metadata may be stored in an index in the database, such that a document may be quickly located. Removal of metadata during such archiving may prevent the business from quickly locating the document or searching the archived documents using the business application.
This disclosure provides various embodiments of systems, methods, and software for managing archived data. For example, in some embodiments, software for archiving data may receive a request to archive an unstructured data object and archive the unstructured data object into an archive object in an offline storage media, the archive object associated with one or more metadata attributes. In some aspects, the request may be received from an exposed application programming interface (API) method embedded within a communicably coupled business application. The software may also receive identification of an archive index via the request from the exposed API, where the archive index points to the offline storage media and is based on one or more metadata attribute criteria. The software may also parse the archive object into one or more metadata attributes according to at least a subset of the attribute criteria and populate the archive index with the one or more metadata attributes indexing the archive object. In certain aspects, the request may be an invoked generic method associated with an attribute table. In some aspects, the software may receive at least one attribute identifier from the requester, identify one or more metadata attributes based on the attribute table and the attribute identifier, and populate the archive index with the one or more metadata attributes indexing the archive object. The software may also present the table to a client for customization. The software may control access to the table based on an access permission level.
In certain aspects, the software may execute a generic archive process to archive the unstructured data object in the offline storage media, parse the archive object in the offline storage media into one or more metadata attributes, and populate a generic archive index with the one or more metadata attributes indexing the archive object, the generic archive index pointing to the offline storage media and based on one or more metadata attribute criteria. The software may, if the generic archive index does not exist, generate the generic archive index. Also, the software may execute a batch process that parses a plurality of archive objects in the offline storage media into one or more respective metadata attributes.
In some implementations, the software may also drop the unstructured data object from an online storage media. Also, one or more archive indices can include at least one metadata attribute and at least one index key utilizing one of the metadata attributes. The software may access the archive object in the offline storage media utilizing the index key. Further, the archive index may be stored in a disparate storage device from the offline storage media.
In certain embodiments, a computer-implemented method for managing archived data includes receiving a first query from a first application instance utilizing an application programming interface (API), based on the first query, asynchronously searching active data and archived data using an archive index that identifies at least a portion of metadata when the archived data was active, and presenting a first results interface to the first application that displays results as they are received from the query executions. The method may also include receiving a second query from a second application instance utilizing the API, based on the second query, asynchronously searching active data and archived data using an archive index, and presenting a second results interface to the second application that displays results as they are received from the query executions. In some aspects, the method may include presenting a search criteria interface to the first application, the search criteria interface comprising a plurality of search criteria and receiving at least one search criteria from the first application, the search criteria comprising one or more of a plurality of metadata attributes. The first and second queries may include at least one of the plurality of search criteria. Also, the one or more metadata attributes may correspond to the portion of metadata identified when the archived data was active.
Each of the foregoing, as well as other disclosed example methods, may be computer implementable. Moreover, some or all of these aspects may be further included in respective systems and software for managing archived data. The details of these and other aspects and embodiments of the disclosure are set forth in the accompanying drawings and the description below. Features, objects, and advantages of the various embodiments will be apparent from the description and drawings, and from the claims.
Environment 100 may be a distributed client/server system that allows clients 104 to submit requests to, for example, archive active data objects 230 from an online data repository 145 to an offline data repository 155 or, as another example, asynchronously search archive data objects 240 utilizing an archive index 220 and active data objects 230. In some cases, active data objects 230 may be unstructured data objects, for example, documents related to or pertinent to business records, such as invoices, bills, purchase orders, or correspondence. But environment 100 may also be a standalone computing environment or any other suitable environment, such as an administrator accessing data stored on server 102, without departing from the scope of this disclosure. Turning to the illustrated embodiment, database environment 100 includes server 102 coupled to one or more clients 104 through one or more networks 112. Server 102 includes interface 117, memory 120, and processor 125 and comprises an electronic computing device operable to receive, transmit, process, and store data associated with environment 100. For example, server 102 may be any computer or processing device such as a mainframe, a blade server, a general-purpose personal computer (PC), a Macintosh, a workstation, a Unix-based computer, or any other suitable device.
Server 102 includes memory 120, which may include any memory or database module and may take the form of volatile or non-volatile memory including, without limitation, magnetic media, optical media, random access memory (RAM), read-only memory (ROM), removable media, or any other suitable local or remote memory component. In this embodiment, illustrated memory 120 includes database system 200. Database system 200 includes a database management system and an online data repository 145. Generally, illustrated database system 200 is meant to represent a local or distributed database, warehouse, or other information repository that includes or utilizes various components.
In more detail, online data repository 145 is coupled to and accessed, called, or otherwise managed by the database management system. Online data repository 145 may store one or more active data objects 230, as well as one or more active indices 210 and one or more archive indices 220. Generally, active data objects 230 may be unstructured data, e.g., documents or other attachments. However, in some aspects, active data objects 230 may be structured data, i.e., data in a relational format, thus allowing database environment 100 to provide access to such data in online data repository 145 using a structured query language (SQL).
Returning to the illustrated server 102, this server 102 includes processor 125, which executes instructions (such as the logic or software described above) and manipulates data to perform the operations of server 102. Processor 125 may be, for example, a central processing unit (CPU), a blade, an application specific integrated circuit (ASIC), or a field-programmable gate array (FPGA). In particular, processor 125 performs any suitable tasks associated with the database management system, business application 130, and archive framework 140.
Processor 125 may include archive framework 140. Generally, archive framework 140 may facilitate the archival of one or more active data objects 230 into archived data objects 240 stored in offline data repository 155, in response to a request from client 104 to archive the active data objects 230. Archive framework 140 manages the archival of the active data objects 230 through various methods. For example, in some aspects, the request by client 104 is received from an API 135 method exposed by the archive framework 140. The archive framework 140 may expose the API 135 through one of several methods internal to the API 135 or through its own method. In this example, the request may also include the identification of one or more archive indices 220. As another example, in some cases, the archive framework 140 may facilitate the archival request from the client 104 through a generic method, i.e., a non-API method, associated with attribute customization table 310. As yet another example, in some aspects, the archive framework 140 may execute a generic archival process in order to archive one or more active data objects 230 to archived data objects 240 in offline data repository 155.
Processor 125 includes business application 130, which, in certain embodiments, may request access to retrieve, modify, delete, or otherwise manage the information of one or more database systems 200 in memory 120, as well as any data contained in the offline storage media 160. Business application 130 may be considered a business software or solution that is capable of interacting or integrating with database systems 200 located, for example, in memory 120 to provide access to data for personal or business use. An example business application 130 may be a computer application for performing any suitable business process or logic by implementing or executing a plurality of steps. Business application 130 may also provide the user, such as an administrator, with computer implementable techniques that may result in the management of archived data objects 240. More specifically, business application 130 may facilitate or help facilitate the functionality of the archive framework 140 and API 135.
More specifically, business application 130 may be a composite application, or an application built on other applications, that includes an object access layer (OAL) and a service layer. In this example, application 130 may execute or provide a number of application services such as customer relationship management (CRM) systems, human resources management (HRM) systems, financial management (FM) systems, project management (PM) systems, knowledge management (KM) systems, and electronic file and mail systems. Such an object access layer is operable to exchange data with a plurality of enterprise base systems and to present the data to a composite application through a uniform interface. The example service layer is operable to provide services to the composite application. These layers may help composite application 130 to orchestrate a business process in synchronization with other existing processes (e.g., native processes of enterprise base systems) and leverage existing investments in the IT platform. Further, composite application 130 may run on a heterogeneous IT platform. In doing so, composite application 130 may be cross-functional in that it may drive business processes across different applications, technologies, and organizations. Accordingly, composite application 130 may drive end-to-end business processes across heterogeneous systems or sub-systems. Application 130 may also include or be coupled with a persistence layer and one or more application system connectors. Such application system connectors enable data exchange and integration with enterprise sub-systems and may include an Enterprise Connector (EC) interface, an Internet Communication Manager/Internet Communication Framework (ICM/ICF) interface, an Encapsulated PostScript (EPS) interface, and/or other interfaces that provide Remote Function Call (RFC) capability. 
It will be understood that while this example describes the composite application 130, it may instead be a standalone or (relatively) simple software program. Regardless, application 130 may also perform processing automatically, which may indicate that the appropriate processing is substantially performed by at least one component of environment 100. It should be understood that this disclosure further contemplates any suitable administrator or other user interaction with application 130 or other components of environment 100, without departing from its original scope.
API 135 may be embedded in business application 130.
API 135 may also facilitate the querying of archived data objects 240 and active data objects 230 based on one or more identified metadata attribute criteria. The query may be performed asynchronously, e.g., the search for particular archived data objects 240 may occur separately from, and independently of, the search for active data objects 230. The results of the query of archived data objects 240 and active data objects 230 may be presented to client 104 through GUI 116 in a seamless, scrolling (i.e., expanding) window.
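The asynchronous query described above can be illustrated with a minimal Python sketch. It is not the disclosed implementation: the in-memory lists standing in for active data objects 230 and archive index 220, the function names, and the simulated delays are all assumptions made for illustration only.

```python
import asyncio

# Hypothetical in-memory stand-ins for active data objects 230 and for the
# metadata held in an archive index 220.
ACTIVE = [{"id": "A1", "document_type": "invoice", "year": "2006"}]
ARCHIVE_INDEX = [
    {"id": "R1", "document_type": "invoice", "year": "2004"},
    {"id": "R2", "document_type": "order", "year": "2004"},
]


def matches(record, criteria):
    """True when the record carries every requested metadata attribute."""
    return all(record.get(name) == value for name, value in criteria.items())


async def search(source, records, criteria, delay):
    """Search one data source; the sleep simulates storage latency."""
    await asyncio.sleep(delay)
    return source, [r["id"] for r in records if matches(r, criteria)]


async def run_query(criteria):
    """Start both searches independently and collect results as they arrive."""
    tasks = [
        asyncio.create_task(search("active", ACTIVE, criteria, 0.01)),
        asyncio.create_task(search("archive", ARCHIVE_INDEX, criteria, 0.05)),
    ]
    results = []
    for finished in asyncio.as_completed(tasks):
        # A results interface could append each batch to the window here.
        results.append(await finished)
    return results


def query(criteria):
    """Synchronous entry point for callers outside an event loop."""
    return asyncio.run(run_query(criteria))
```

In this sketch, neither search waits for the other, so a results window could grow as each source answers, consistent with the scrolling presentation described above.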
Server 102 may also include interface 117 for communicating with other computer systems, such as client 104, over network 112 in a client-server or other distributed environment. In certain embodiments, server 102 receives queries 150, for example, requests for data access or archival of active data objects 230 from local or remote senders, through interface 117, for storage in memory 120 and/or processing by processor 125. Generally, interface 117 comprises logic encoded in software and/or hardware in a suitable combination and operable to communicate with network 112. More specifically, interface 117 may comprise software supporting one or more communications protocols associated with communications network 112 or hardware operable to communicate physical signals.
Database environment 100 also may include network 112, which facilitates wireless or wireline communication between server 102 and any other local or remote computer, such as clients 104. Indeed, while illustrated as two networks, 112a and 112b, respectively, network 112 may be a continuous network without departing from the scope of this disclosure, so long as at least a portion of network 112 may facilitate communications between senders and recipients of queries 150 and results. In other words, network 112 encompasses any internal and/or external network, networks, sub-network, or combination thereof operable to facilitate communications between various computing components in database environment 100. Network 112 may communicate, for example, Internet Protocol (IP) packets, Frame Relay frames, Asynchronous Transfer Mode (ATM) cells, voice, video, data, and other suitable information between network addresses. Network 112 may include one or more local area networks (LANs), radio access networks (RANs), metropolitan area networks (MANs), wide area networks (WANs), all or a portion of the global computer network known as the Internet, and/or any other communication system or systems at one or more locations.
Database environment 100 may also include one or more clients 104. Client 104 may be any local or remote computing device operable to receive requests from the user via a user interface 116, such as a GUI, a Command Line Interface (CLI), or any of numerous other user interfaces. Thus, where reference is made to a particular interface, it should be understood that any other user interface may be substituted in its place. In various embodiments, each client 104 includes at least GUI 116 and comprises an electronic computing device operable to receive, transmit, process and store any appropriate data associated with environment 100. It will be understood that there may be any number of clients 104 communicably coupled to server 102. Further, “client 104” and “user” may be used interchangeably as appropriate without departing from the scope of this disclosure. Moreover, for ease of illustration, each client 104 is described in terms of being used by one user. But this disclosure contemplates that many users may use one computer or that one user may use multiple computers to submit or review queries 150 via GUI 116. As used in this disclosure, client 104 is intended to encompass a personal computer, touch screen terminal, workstation, network computer, kiosk, wireless data port, wireless or wireline phone, personal digital assistant (PDA), one or more processors within these or other devices, or any other suitable processing device. For example, client 104 may comprise a computer that includes an input device, such as a keypad, touch screen, mouse, or other device that can accept information, and an output device that conveys information associated with the operation of server 102 or clients 104, including digital data, visual information, or GUI 116.
Both the input device and output device may include fixed or removable storage media such as a magnetic computer disk, CD-ROM, or other suitable media to both receive input from and provide output to users of clients 104 through the display, namely GUI 116.
GUI 116 may include a graphical user interface operable to allow the user of client 104 to interface with at least a portion of environment 100 for any suitable purpose. Generally, GUI 116 provides the user of client 104 with an efficient and user-friendly presentation of data provided by or communicated within environment 100. GUI 116 may provide access to the front-end of business application 130 executing on client 104 that is operable to add or modify data objects of data repository 145, or also to reorganize data repository 145. In some cases, GUI 116 may provide access to the front-end of business application 130 executing on client 104 that is operable to receive requests to archive active data objects 230 to archived data objects 240, as well as search active data objects 230 and/or search archived data objects 240, utilizing one or more archive indices 220. In a further example, GUI 116 may display output reports such as summary and detailed reports. GUI 116 may comprise a plurality of customizable frames or views having interactive fields, pull-down lists, and buttons operated by the user. In one embodiment, GUI 116 may present information associated with queries 150 and receive commands from the user of client 104 via one of the input devices. Moreover, it should be understood that the term graphical user interface may be used in the singular or in the plural to describe one or more graphical user interfaces and each of the displays of a particular graphical user interface. Therefore, GUI 116 contemplates any graphical user interface, such as a generic web browser or touch screen, that processes information in environment 100 and efficiently presents the results to the user.
Archive framework 140 may also be communicably coupled to the offline data repository 155 and API 135. In some aspects, archive framework 140 may receive a request to archive one or more active data objects 230 through an exposed API 135 method such as, for example, through business application 130. The request from the exposed API 135 method may also include an identification of one or more archive indices 220. In some aspects, the exposed API 135 method may identify the archive indices 220 that have been previously generated and reside on online data repository 145, or in some cases, offline data repository 155. Further, in some aspects, the exposed API 135 method may identify one or more archive indices 220 by sending the indices 220 to the archive framework 140 through the archival request. Regardless of the identification method, as described above, each archive index 220 points to the offline data repository 155 and contains one or more metadata attributes associated with the active data objects 230 to be archived.
As an example, a request from the exposed API 135 method may include a request to archive all active data objects 230 which are document-type “invoice.” The request may further identify two archive indices 220 through the identification of metadata attribute criteria document-type “invoice” and date-generated “2004” for one, and date-generated “2005” for the other. During archival of the active data objects 230 into archived data objects 240, metadata attributes identifying the active data objects 230 as an “invoice” and generated in “2004” or “2005” are stored in the archived data objects 240 in offline data repository 155. The metadata attributes 242 are further populated into the appropriate archive indices 220, which point to the archived data objects 240. Upon a search for archived data objects 240 containing these metadata attributes 242 (performed subsequently to the archival process), the archive indices 220 can normally point to the corresponding archived data objects 240 for retrieval.
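The invoice example above can be sketched in Python as follows. This is an illustrative sketch only: the `ArchiveIndex` class, the `archive` function, the dictionary layout of the data objects, and the sample documents are hypothetical and are not taken from the disclosure.

```python
from dataclasses import dataclass, field


@dataclass
class ArchiveIndex:
    """Hypothetical archive index 220: criteria plus entries pointing to objects."""
    criteria: dict
    entries: dict = field(default_factory=dict)  # object id -> metadata attributes


def archive(active, offline, indices, selection):
    """Move objects matching `selection` offline and populate matching indices."""
    selected = [
        object_id
        for object_id, obj in active.items()
        if selection.items() <= obj["metadata"].items()
    ]
    for object_id in selected:
        obj = active.pop(object_id)   # drop from the online repository
        offline[object_id] = obj      # metadata stays with the archived object
        for index in indices:
            if index.criteria.items() <= obj["metadata"].items():
                index.entries[object_id] = dict(obj["metadata"])


# Sample data (assumed): two invoices and one purchase order.
active = {
    "doc1": {"body": "...", "metadata": {"document_type": "invoice", "date_generated": "2004"}},
    "doc2": {"body": "...", "metadata": {"document_type": "invoice", "date_generated": "2005"}},
    "doc3": {"body": "...", "metadata": {"document_type": "purchase order", "date_generated": "2004"}},
}
offline = {}
index_2004 = ArchiveIndex({"document_type": "invoice", "date_generated": "2004"})
index_2005 = ArchiveIndex({"document_type": "invoice", "date_generated": "2005"})

archive(active, offline, [index_2004, index_2005], {"document_type": "invoice"})
```

After the call, each index holds only the identifiers and metadata of the invoices that satisfy its criteria, so a later search can follow an index entry to the corresponding archived object.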
Often, the offline data repository 155 may contain more than one archived data object 240 and possibly even many hundreds or thousands. In these cases, client 104 or business application 130 may choose to build the archive index 220 from the multiple archived data objects 240 at the same or substantially same time. For example, parsing module 360 may execute a batch process that parses the multiple archived data objects 240 concurrently or consecutively. The metadata attributes 242 parsed from the archived data objects 240 may then be populated into one or more archive indices 220. However, whether the archived data objects 240 are parsed one at a time or concurrently may not affect the archival process or subsequent search of the archived data objects 240.
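A batch process of this kind might look like the following Python sketch. The function names, the dictionary representation of archived data objects 240, and the choice to parse consecutively rather than concurrently are assumptions made for illustration; the disclosure does not specify them.

```python
def parse_attributes(archived_object, attribute_criteria):
    """Return only the metadata attributes named by the attribute criteria."""
    metadata = archived_object.get("metadata", {})
    return {name: metadata[name] for name in attribute_criteria if name in metadata}


def build_index(offline, attribute_criteria, index=None):
    """Parse every archived object in one pass and populate the archive index."""
    if index is None:
        index = {}  # generate the index if it does not yet exist
    for object_id, archived_object in offline.items():
        index[object_id] = parse_attributes(archived_object, attribute_criteria)
    return index
```

For instance, calling `build_index(offline, ["document_type"])` on a repository of archived invoices would yield an index keyed by object identifier that holds only the document type of each object.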
Archive framework 140 may also present the customization table 310 to client 104, so that (perhaps authenticated) client 104 may customize one or more archive indices 220. For example, client 104 may choose to add records to the customization table 310 based on one or more selected index criteria that correspond to metadata attributes 242 in archived data objects 240.
In some aspects, access to particular index criteria may be controlled by the archive framework 140. For example, a developer of business application 130 may define several levels of access permission to the customization table 310. Each level of access permission may allow a client 104 to access different metadata attributes 322 in order to define archive indices 220. Further, access permission levels may be used to control access to an archive index 220 already defined, e.g., only particular clients 104 may adjust or modify an existing archive index 220.
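One way such permission levels could gate changes to the customization table 310 is sketched below in Python. The level numbers, the attribute names, and the `add_index_record` function are hypothetical; the disclosure does not define a particular permission scheme.

```python
# Hypothetical mapping of access permission levels to the metadata
# attributes a client at that level may use when defining an archive index.
PERMITTED_ATTRIBUTES = {
    1: {"document_type"},
    2: {"document_type", "date_generated"},
}


def add_index_record(table, level, index_name, attributes):
    """Append an index definition if the level permits every attribute."""
    allowed = PERMITTED_ATTRIBUTES.get(level, set())
    denied = set(attributes) - allowed
    if denied:
        raise PermissionError(
            f"level {level} may not use attributes: {sorted(denied)}"
        )
    table.append({"index": index_name, "attributes": list(attributes)})
    return table
```

In this sketch, a client at the assumed level 2 could define an index over both document type and generation date, while a client at level 1 would be refused if it tried to use the date attribute.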
Based on these presented criteria, the archive framework 140 receives the selected search criteria from client 104 through business application 130.
The preceding flowcharts and accompanying description illustrate example methods. Database environment 100 contemplates using or implementing any suitable technique for performing these and other tasks. It will be understood that these methods are for illustration purposes only and that the described or similar techniques may be performed at any appropriate time, including concurrently, individually, or in combination. In addition, many of the steps in these flowcharts may take place simultaneously and/or in different orders than as shown. Moreover, database environment 100 may use methods with additional steps, fewer steps, and/or different steps, so long as the methods remain appropriate. In short, although this disclosure has been described in terms of certain embodiments and generally associated methods, alterations and permutations of these embodiments and methods will be apparent to those skilled in the art. Accordingly, the above description of example embodiments does not define or constrain the disclosure. Other changes, substitutions, and alterations are also possible without departing from the spirit and scope of this disclosure, and such changes, substitutions, and alterations may be included within the scope of the claims included herewith.