The disclosure relates to in-memory databases and more particularly to manipulations of data in different storage environments.
Manipulation of data stored on a hard disk is slow when compared to manipulation of data in volatile memory. However, storing data in volatile memory is risky. In-memory databases need to consistently maintain power in order to store persistent data. Additionally, volatile memory is more expensive than hard disk (including solid state) storage.
Disclosed herein is a semi-in-memory database data manipulation technique. Performing numerous calculations that repeatedly read from and write to hard disk (including solid state drive, “SSD”) storage is prohibitively slow when compared to data manipulation performed in volatile memory. Calculations that require many hours when performed in hard-disk storage may require only a fraction of a second when performed in volatile memory. Conversely, maintaining persistent data in-memory is expensive and risks data loss.
One way of reducing calculations performed in hard disk storage is to store all derivable data from a dataset, or to limit queries of the database to stored data only. This way no calculations need be performed in hard-disk storage. Only retrieval operations need be performed. However, there are significant downsides to this approach. First, storing all derivable data dramatically increases the storage space required for any given dataset. Second, limiting queries to those that may be answered with retrieval operations reduces the functionality of the database system.
For example, a database that fields queries regarding account balances as of queried dates cannot feasibly store all derivable data from a given set of account statements. The answers to potential queries regarding various account statements as of every possible arbitrarily determined date cannot feasibly be stored in hard disk space. There are too many permutations of queries that could be requested. Therefore, the database needs to perform calculations to derive the resolution to queries.
In step 104, the records are replicated in volatile memory. The volatile memory may be situated architecturally in the same machine as the hard drive, in another machine, or accessible through the Internet. A processor performs a retrieval operation on relevant portions of the hard drive and the retrieved data is replicated in volatile memory.
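By way of illustration only, the replication of step 104 may be sketched as follows. The sketch assumes the records are held in a SQLite file on the hard drive; the table name `records`, its columns, and the helper `replicate_to_memory` are hypothetical and are not taken from the disclosure. The standard-library `sqlite3.Connection.backup` call copies the on-disk database into a database held in volatile memory.

```python
import os
import sqlite3
import tempfile


def replicate_to_memory(disk_path: str) -> sqlite3.Connection:
    """Copy an on-disk SQLite database into a new in-memory database."""
    disk = sqlite3.connect(disk_path)
    memory = sqlite3.connect(":memory:")
    try:
        disk.backup(memory)  # retrieval from disk, replication into RAM
    finally:
        disk.close()
    return memory


# Build a small on-disk store to stand in for the hard drive (hypothetical data).
db_path = os.path.join(tempfile.mkdtemp(), "records.db")
disk = sqlite3.connect(db_path)
disk.execute("CREATE TABLE records (entity TEXT, day TEXT, amount_cents INTEGER)")
disk.executemany(
    "INSERT INTO records VALUES (?, ?, ?)",
    [("X", "2023-01-31", 10000), ("X", "2023-02-15", -4000), ("Y", "2023-02-28", 2500)],
)
disk.commit()
disk.close()

# Later queries run against the in-memory copy, not the file on disk.
mem = replicate_to_memory(db_path)
row_count = mem.execute("SELECT COUNT(*) FROM records").fetchone()[0]
```

In this sketch the on-disk file remains the persistent copy, and the in-memory connection may be discarded without loss of the stored records.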
In some embodiments, the timing and scope of record replication vary. The timing may vary based on user interaction. In some embodiments, the replication of records is automatic based on receipt of a query on the database from a user. In another embodiment, the replication of records is triggered automatically based on a user initiating use of a database management application (e.g., an application configured to generate queries of the database).
Examples of variations of scope relate to which records are retrieved and replicated. In some embodiments, the records retrieved pertain only to entities included in the query, or only to records that are within a date range relevant to the query. By limiting the scope of the records replicated in volatile memory, the amount of memory consumed by the system is reduced. Filtering retrieved records does call for operations on the hard drive, though these operations may be limited based on storage organization techniques such as filling the hard drive in predictable ways based on generation of new records. Where new records are created monthly (e.g., invoices), the hard drive may be allocated by entity, time, or other stored data metrics.
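As a non-limiting sketch of scope limiting, the example below assumes that storage is partitioned by entity and month, so that only the partitions relevant to a query are read and replicated. The `Record` type, the partition keys, and the function `load_scope` are hypothetical names introduced for illustration, and a Python dictionary stands in for the hard drive layout.

```python
from datetime import date
from typing import Dict, Iterator, List, NamedTuple, Tuple


class Record(NamedTuple):
    entity: str
    day: date
    amount_cents: int


# Hypothetical layout: one partition per (entity, year, month).
Partitions = Dict[Tuple[str, int, int], List[Record]]


def months_between(start: date, end: date) -> Iterator[Tuple[int, int]]:
    """Yield each (year, month) from start's month through end's month."""
    year, month = start.year, start.month
    while (year, month) <= (end.year, end.month):
        yield year, month
        month += 1
        if month == 13:
            year, month = year + 1, 1


def load_scope(store: Partitions, entity: str, start: date, end: date) -> List[Record]:
    """Read only the partitions the query can touch, then trim to exact dates."""
    loaded: List[Record] = []
    for year, month in months_between(start, end):
        for rec in store.get((entity, year, month), []):
            if start <= rec.day <= end:
                loaded.append(rec)
    return loaded


store: Partitions = {
    ("X", 2023, 1): [Record("X", date(2023, 1, 31), 10000)],
    ("X", 2023, 2): [Record("X", date(2023, 2, 15), -4000)],
    ("X", 2023, 3): [Record("X", date(2023, 3, 31), 7000)],
    ("Y", 2023, 2): [Record("Y", date(2023, 2, 28), 2500)],
}

# Only entity X's February and March partitions are visited.
in_memory = load_scope(store, "X", date(2023, 2, 1), date(2023, 3, 31))
```

Because the partition key is derived directly from the query parameters, the filtering step in this sketch performs lookups rather than a scan of every stored record.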
In step 106, the system performs calculations based on the query on the records in-memory. The calculations derive a resolution to the query from the records in-memory. The resolution includes whatever information the user was seeking in the query. For example, if the query asks how much X owed 90 days ago, the calculations sum all invoices and payments for X dated on or before the date 90 days prior to the query. Multiple calculations may be performed efficiently in volatile memory using dynamic programming.
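The "how much did X owe 90 days ago" example may be sketched as follows. The sketch assumes one possible reading of the dynamic programming mentioned above: a running total is computed once over the in-memory records, and each as-of query then reduces to a binary search. The transaction values, the sign convention (invoices positive, payments negative), and the name `balance_as_of` are illustrative assumptions.

```python
from bisect import bisect_right
from datetime import date, timedelta
from itertools import accumulate

# Hypothetical in-memory records for entity X: (day, signed amount in cents).
# Invoices are positive and payments are negative.
transactions = [
    (date(2023, 1, 31), 10000),
    (date(2023, 2, 15), -4000),
    (date(2023, 3, 31), 7000),
    (date(2023, 4, 10), -3000),
]
transactions.sort(key=lambda t: t[0])

days = [day for day, _ in transactions]
# Each running total reuses the previous one instead of re-summing from the start.
running = list(accumulate(amount for _, amount in transactions))


def balance_as_of(as_of: date) -> int:
    """Balance including every transaction dated on or before as_of."""
    idx = bisect_right(days, as_of)
    return running[idx - 1] if idx else 0


query_day = date(2023, 6, 1)
owed_90_days_ago = balance_as_of(query_day - timedelta(days=90))  # as of 2023-03-03
```

Once the running totals exist in volatile memory, further as-of queries against the same records require no additional summation, which is consistent with fielding many queries in quick succession.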
In step 108, the resolution to the query is output onto the user's graphic user interface (GUI). In step 110, the records that have been replicated into memory remain in-memory until the user has indicated that no more queries of the records will be made. The user indication may come from closing the database management application, or by navigating away from the GUI that enables queries of the database.
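A minimal sketch of the retention behavior of step 110 follows, assuming the user indication is modeled as closing a session object. The class `QuerySession`, its `disk_reads` counter, and the loader `fake_disk_load` are hypothetical and are included only to show that repeated queries reuse the replicated records until the session ends.

```python
from typing import Any, Callable, List, Optional


class QuerySession:
    """Keeps replicated records in memory until the user ends the session."""

    def __init__(self, load_from_disk: Callable[[], List[dict]]):
        self._load = load_from_disk
        self._records: Optional[List[dict]] = None
        self.disk_reads = 0  # counts retrievals from persistent storage

    def query(self, calculation: Callable[[List[dict]], Any]) -> Any:
        if self._records is None:  # first query triggers replication
            self._records = self._load()
            self.disk_reads += 1
        return calculation(self._records)

    def close(self) -> None:
        """Called when the user closes the application or leaves the GUI."""
        self._records = None

    def __enter__(self) -> "QuerySession":
        return self

    def __exit__(self, *exc_info) -> None:
        self.close()


def fake_disk_load() -> List[dict]:
    # Stand-in for retrieval from the hard drive (hypothetical data).
    return [
        {"entity": "X", "amount_cents": 10000},
        {"entity": "X", "amount_cents": -4000},
    ]


with QuerySession(fake_disk_load) as session:
    total = session.query(lambda recs: sum(r["amount_cents"] for r in recs))
    count = session.query(len)
    reads_while_open = session.disk_reads

released = session._records is None
```

In this sketch two queries are answered from a single retrieval, and the in-memory copy is released when the `with` block exits.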
The application backend server 26 communicates with an application front end 30. The application front end 30 includes a graphic user interface 32. The graphic user interface 32 receives queries from users. The queries are forwarded to the application backend server 26. The application backend server 26 retrieves the records 24 from the hard drive 22 and replicates the records 24 in the volatile memory 28. Operations or calculations used to derive a resolution to the query are performed on the records 24 in the volatile memory 28. The volatile memory 28 has a significantly faster read/write speed than the hard drive 22. In some architectures, the volatile memory 28 is physically closer to a processor of the application backend server 26 than the hard drive 22 is.
The GUI 32 includes a query configuration 40 where a user may select parameters from which to define a query.
Based on the underlying transactions 42, an arbitrarily large number of unique queries can be generated from combinations of parameters. The system may field any number of queries in quick succession, based on any combination of parameters.
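To illustrate why the number of unique queries grows quickly, the sketch below counts the combinations of three hypothetical parameters. The parameter names and value lists are assumptions made for illustration; the disclosure does not enumerate the parameters of the query configuration 40.

```python
from itertools import islice, product

# Hypothetical query parameters a user might select in the query configuration.
entities = ["X", "Y", "Z"]
metrics = ["balance", "invoiced", "paid"]
days_in_range = 365  # one selectable as-of date per day of a single year

# Every combination of one entity, one metric, and one as-of day is a distinct query.
query_space = len(entities) * len(metrics) * days_in_range

# The combinations can be enumerated lazily rather than stored.
first_queries = list(islice(product(entities, metrics, range(days_in_range)), 3))
```

Even this small configuration yields several thousand distinct queries, and each added parameter multiplies the total, which is why the sketch enumerates combinations lazily instead of precomputing and storing an answer for each.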
The computer 500 may be a standalone device or part of a distributed system that spans multiple networks, locations, machines, or combinations thereof. In some embodiments, the computer 500 operates as a server computer or a client device in a client-server network environment, or as a peer machine in a peer-to-peer system. In some embodiments, the computer 500 may perform one or more steps of the disclosed embodiments in real time, near real time, offline, by batch processing, or combinations thereof.
As shown in
The control 504 includes one or more processors 512 (e.g., central processing units (CPUs), application-specific integrated circuits (ASICs), and/or field-programmable gate arrays (FPGAs)) and memory 514 (which may include software 516). For example, the memory 514 may include volatile memory, such as random-access memory (RAM), and/or non-volatile memory, such as read-only memory (ROM). The memory 514 can be local, remote, or distributed.
A software program (e.g., software 516), when referred to as “implemented in a computer-readable storage medium,” includes computer-readable instructions stored in the memory (e.g., memory 514). A processor (e.g., processor 512) is “configured to execute a software program” when at least one value associated with the software program is stored in a register that is readable by the processor. In some embodiments, routines executed to implement the disclosed embodiments may be implemented as part of operating system (OS) software (e.g., Microsoft Windows® or Linux®) or as a specific software application, component, program, object, module, or sequence of instructions referred to as “computer programs.”
As such, the computer programs typically comprise one or more instructions set at various times in various memory devices of a computer (e.g., computer 500), which, when read and executed by at least one processor (e.g., processor 512), will cause the computer to perform operations to execute features involving the various aspects of the disclosed embodiments. In some embodiments, a carrier containing the aforementioned computer program product is provided. The carrier is one of an electronic signal, an optical signal, a radio signal, or a non-transitory computer-readable storage medium (e.g., memory 514).
The network interface 506 may include a modem or other interfaces (not shown) for coupling the computer 500 to other computers over the network 524. The I/O system 508 may operate to control various I/O devices, including peripheral devices, such as a display system 518 (e.g., a monitor or touch-sensitive display) and one or more input devices 520 (e.g., a keyboard and/or pointing device). Other I/O devices 522 may include, for example, a disk drive, printer, scanner, or the like. Lastly, the clock system 510 controls a timer for use by the disclosed embodiments.
Operation of a memory device (e.g., memory 514), such as a change in state from a binary one (1) to a binary zero (0) (or vice versa) may comprise a visually perceptible physical change or transformation. The transformation may comprise a physical transformation of an article to a different state or thing. For example, a change in state may involve accumulation and storage of charge or a release of stored charge. Likewise, a change of state may comprise a physical change or transformation in magnetic orientation or a physical change or transformation in molecular structure, such as a change from crystalline to amorphous or vice versa.
Aspects of the disclosed embodiments may be described in terms of algorithms and symbolic representations of operations on data bits stored in memory. These algorithmic descriptions and symbolic representations generally include a sequence of operations leading to a desired result. The operations require physical manipulations of physical quantities. Usually, though not necessarily, these quantities take the form of electric or magnetic signals that are capable of being stored, transferred, combined, compared, and otherwise manipulated. Customarily, and for convenience, these signals are referred to as bits, values, elements, symbols, characters, terms, numbers, or the like. These and similar terms are associated with physical quantities and are merely convenient labels applied to these quantities.
While embodiments have been described in the context of fully functioning computers, those skilled in the art will appreciate that the various embodiments are capable of being distributed as a program product in a variety of forms and that the disclosure applies equally, regardless of the particular type of machine or computer-readable media used to actually effect the embodiments.
While the disclosure has been described in terms of several embodiments, those skilled in the art will recognize that the disclosure is not limited to the embodiments described herein and can be practiced with modifications and alterations within the spirit and scope of the invention. Those skilled in the art will also recognize improvements to the embodiments of the present disclosure. All such improvements are considered within the scope of the concepts disclosed herein. Thus, the description is to be regarded as illustrative instead of limiting.
From the foregoing, it will be appreciated that specific embodiments of the invention have been described herein for purposes of illustration, but that various modifications may be made without deviating from the scope of the invention. Accordingly, the invention is not limited except as by the appended claims.