The present invention relates to a method, system and computer program product for identifying caching opportunities. In particular, the present invention relates to a method, system and computer program product for advising a user of an opportunity to cache data and to automatically cache the data when a certain threshold has been met or to cache the data when the user decides to cache the data.
In today's business environment, applications are increasingly hosted on different physical systems than the data they utilize. For instance, the applications may be hosted on an application tier, whereas the data that these applications utilize may be hosted on a data tier. Getting data from such a distributed tier is expensive. Caching is one technique used to minimize the expense of keeping data on a separate tier of the environment. Deciding what to cache is a complex task, usually left to someone who has an intimate knowledge of the application and its data access pattern, because it requires a delicate balance between performance and cost. If not enough data is cached, then performance improvement opportunities will be missed. If a cache is updated too frequently, then the cost may become prohibitive. As such, there is a need for a business and/or organization to provide a cost-effective way of caching data and improving cache accuracy.
In a first aspect of the invention, there is provided a method for identifying caching opportunities. The method comprises identifying at least one data source among a plurality of data sources utilized by an application, the plurality of data sources being stored on a computer system, establishing a pre-set respective read-update ratio threshold for respective data accessed from a respective data source among the plurality of data sources, wherein exceeding the pre-set respective read-update ratio threshold for the respective data accessed identifies a caching opportunity, defining an action to be taken when the pre-set respective read-update ratio threshold for the respective data accessed from the respective data source has been exceeded and taking the action defined when the pre-set respective read-update ratio threshold for the respective data accessed from the respective data source has been exceeded. The taking the action step further comprises checking the action defined before taking the action defined when the read-update ratio threshold for the respective data accessed from the respective data source has been exceeded. The method further comprises tracking a respective read-update ratio threshold for data accessed from each data source among the plurality of data sources stored on the computer system and determining when the pre-set respective read-update ratio threshold for the respective data accessed from the respective data source among the plurality of data sources is exceeded. In an embodiment, the action defined to be taken when the pre-set respective read-update ratio threshold for the respective data accessed from the respective data source has been exceeded comprises at least one of: advise a user of an opportunity to cache the respective data accessed from the respective data source and automatically cache the respective data accessed from the respective data source. 
In an embodiment, if the action defined is to advise the user of the opportunity to cache the respective data when the pre-set respective read-update ratio threshold for the respective data accessed from the respective data source has been exceeded, the method further comprises sending notification to the user that the respective read-update ratio threshold for the respective data accessed from the respective data source has been exceeded and inquiring whether the user wants to cache the respective data from the respective data source. Further, in an embodiment, if the user wants to cache the respective data from the respective data source, the method further comprises receiving an affirmative response from the user to cache the respective data from the respective data source and caching the respective data from the respective data source.
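The method steps above can be illustrated with a short sketch. This is a minimal Python illustration under stated assumptions, not the implementation of the invention: the names `AccessStats`, `Action` and `evaluate` are hypothetical, and the read-update ratio is assumed to be the number of reads divided by the number of updates, which the description does not spell out.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable


class Action(Enum):
    """The two defined actions named in the method (hypothetical names)."""
    ADVISE = "advise"          # notify the user and ask whether to cache
    AUTO_CACHE = "auto_cache"  # cache without asking


@dataclass
class AccessStats:
    """Read and update counts tracked for data from one data source."""
    reads: int = 0
    updates: int = 0

    def read_update_ratio(self) -> float:
        # Assumption: ratio = reads / updates. Data that has been read but
        # never updated is treated as having an unbounded ratio.
        if self.updates == 0:
            return float("inf") if self.reads > 0 else 0.0
        return self.reads / self.updates


def evaluate(
    stats: AccessStats,
    threshold: float,
    action: Action,
    ask_user: Callable[[float], bool],
    cache: Callable[[], None],
) -> bool:
    """Take the defined action if the pre-set threshold is exceeded.

    Returns True if the data was cached, False otherwise.
    """
    ratio = stats.read_update_ratio()
    if ratio <= threshold:
        return False  # threshold not exceeded: no caching opportunity
    if action is Action.AUTO_CACHE:
        cache()
        return True
    # Action.ADVISE: notify the user and cache only on an affirmative response.
    if ask_user(ratio):
        cache()
        return True
    return False
```

In this sketch, `ask_user` stands in for the notification and inquiry step and `cache` for whatever caching mechanism is used; both are supplied by the caller, so the threshold check stays independent of how the user is contacted or where the data is cached.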
In another aspect of the invention, there is provided a system for identifying caching opportunities. The system comprises an application server configured to ascertain one or more data sources utilized by an application, the application server being configured to receive a pre-set respective read-update ratio threshold for respective data accessed from a respective data source of the one or more data sources, the one or more data sources being stored in a database, a database manager configured to track a respective read-update ratio threshold for data accessed from each data source of the one or more data sources in the database and a caching prospector tool configured to take a defined action when the pre-set respective read-update ratio threshold for the respective data accessed from the respective data source has been exceeded, wherein exceeding the pre-set respective read-update ratio threshold for the respective data accessed identifies a caching opportunity. In an embodiment, the application server is further configured to register with the database manager to receive notification when the pre-set respective read-update ratio threshold for the respective data accessed from the respective data source has been exceeded. The caching prospector tool is further configured to send notification to the application server when the pre-set respective read-update ratio threshold for the respective data accessed from the respective data source has been exceeded. In an embodiment, the defined action to be taken when the respective read-update ratio threshold for the respective data accessed from the respective data source has been exceeded comprises either of advising a user of an opportunity to cache the respective data accessed from the respective data source or of automatically caching the respective data accessed from the respective data source. 
In an embodiment, if the defined action is to advise the user of the opportunity to cache the data, the caching prospector tool is further configured to inquire whether the user wants to cache the respective data from the respective data source when the respective read-update ratio threshold for the data accessed from the respective data source has been exceeded. In an embodiment, the caching prospector tool is further configured to cache the respective data from the respective data source upon receiving an affirmative response from the user to cache the respective data. In an embodiment, if the defined action is to automatically cache the respective data from the respective data source, the caching prospector tool automatically caches the respective data from the respective data source when the respective read-update ratio threshold for the respective data accessed from the respective data source has been exceeded.
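The registration and notification flow between the application server and the database manager might look like the following sketch. It is a simplified Python illustration only: the class `DatabaseManager`, its methods, and the callback signature are hypothetical, the tracking and notifying roles are folded into one class for brevity, and a data source with no updates is assumed to have a divisor of one.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Tuple

# Hypothetical callback type: receives the data source name and current ratio.
Listener = Callable[[str, float], None]


class DatabaseManager:
    """Tracks reads and updates per data source and notifies registrants."""

    def __init__(self) -> None:
        # source name -> [reads, updates]
        self._counts: Dict[str, List[int]] = defaultdict(lambda: [0, 0])
        # source name -> list of (threshold, listener) registrations
        self._registrations: Dict[str, List[Tuple[float, Listener]]] = defaultdict(list)

    def register(self, source: str, threshold: float, listener: Listener) -> None:
        """Called by an application server to be told when the threshold is exceeded."""
        self._registrations[source].append((threshold, listener))

    def record_read(self, source: str) -> None:
        self._counts[source][0] += 1
        self._check(source)

    def record_update(self, source: str) -> None:
        self._counts[source][1] += 1

    def _check(self, source: str) -> None:
        reads, updates = self._counts[source]
        # Assumption: with zero updates the ratio is computed against one.
        ratio = reads / max(updates, 1)
        for threshold, listener in self._registrations[source]:
            if ratio > threshold:
                listener(source, ratio)
```

This sketch notifies on every read while the ratio stays above the threshold; a practical system would likely suppress repeated notifications for the same data source.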
In yet another aspect of the invention, there is provided a computer program product for identifying caching opportunities. The computer program product comprises a computer readable medium and first program instructions to establish a respective read-update ratio threshold for respective data accessed from a respective data source among a plurality of data sources stored in a database, wherein exceeding the read-update ratio threshold for the respective data accessed identifies a caching opportunity. The computer program product further comprises second program instructions to define an action to be taken when the respective read-update ratio threshold for the respective data accessed from the respective data source has been exceeded and third program instructions to take the action defined when the read-update ratio threshold for the respective data accessed from the respective data source has been exceeded, wherein the first, second and third program instructions are stored on the computer readable medium. The computer program product further comprises fourth program instructions to track a respective read-update ratio threshold for data accessed from each data source among the plurality of data sources stored in the database, wherein the fourth program instructions are stored on the computer readable medium. In an embodiment, the first program instructions include instructions to identify at least one data source among the plurality of data sources utilized by an application. In an embodiment, the action defined to be taken when the respective read-update ratio threshold for the respective data accessed from the respective data source has been exceeded comprises either advising a user of an opportunity to cache the respective data accessed from the respective data source or automatically caching the respective data accessed from the respective data source.
In an embodiment, if the action defined is to advise the user of the opportunity to cache the respective data when the respective read-update ratio threshold for the respective data accessed from the respective data source has been exceeded, the third program instructions include instructions to send notification to the user that the respective read-update ratio threshold for the respective data accessed from the respective data source has been exceeded and to inquire whether the user wants to cache the respective data from the respective data source. In an embodiment, if the user sends an affirmative response to cache the respective data from the respective data source, the third program instructions include instructions to cache the respective data from the respective data source upon receiving the affirmative response from the user to cache the respective data. In an embodiment, if the action defined is to automatically cache the respective data when the respective read-update ratio threshold for the respective data accessed from the respective data source has been exceeded, the third program instructions include instructions to automatically cache the respective data from the respective data source.
The accompanying drawings, which are incorporated in and form a part of this specification, illustrate embodiments of the invention and, together with the description, serve to explain the principles of the invention:
Many of the functional units described in this specification have been labeled as modules, in order to more particularly emphasize their implementation independence. For example, a module may be implemented as a hardware circuit comprising custom VLSI circuits or gate arrays, off-the-shelf semiconductors such as logic chips, transistors, or other discrete components. A module may also be implemented in programmable hardware devices such as field programmable gate arrays, programmable array logic, programmable logic devices or the like.
Modules may also be implemented in software for execution by various types of processors. An identified module or component of executable code may, for instance, comprise one or more physical or logical blocks of computer instructions which may, for instance, be organized as an object, procedure, or function. Nevertheless, the executables of an identified module need not be physically located together, but may comprise disparate instructions stored in different locations which, when joined logically together, comprise the module and achieve the stated purpose for the module.
Further, a module of executable code could be a single instruction, or many instructions, and may even be distributed over several different code segments, among different programs, and across several memory devices. Similarly, operational data may be identified and illustrated herein within modules, and may be embodied in any suitable form and organized within any suitable type of data structure. The operational data may be collected as a single data set, or may be distributed over different locations including over different storage devices, over disparate memory devices, and may exist, at least partially, merely as electronic signals on a system or network.
Furthermore, modules may also be implemented as a combination of software and one or more hardware devices. For instance, a module may be embodied in the combination of a software executable code stored on a memory device. In a further example, a module may be the combination of a processor that operates on a set of operational data. Still further, a module may be implemented in the combination of an electronic signal communicated via transmission circuitry.
Reference throughout this specification to “one embodiment,” “an embodiment,” or similar language means that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment of the present invention. Thus, appearances of the phrases “in one embodiment,” “in an embodiment,” and similar language throughout this specification may, but do not necessarily, all refer to the same embodiment.
Moreover, the described features, structures, or characteristics of the invention may be combined in any suitable manner in one or more embodiments. It will be apparent to those skilled in the art that various modifications and variations can be made to the present invention without departing from the spirit and scope of the invention. Thus, it is intended that the present invention cover the modifications and variations of this invention provided they come within the scope of the appended claims and their equivalents. Reference will now be made in detail to the preferred embodiments of the invention.
In one embodiment, the invention provides a method for identifying caching opportunities, using a caching prospector tool, which is described herein below and, in particular, with respect to
Further, as shown in
Reference is now made to
Turning to
Accordingly, the invention provides a method of identifying caching opportunities using the caching prospector tool. Based on the information entered in the user defined settings of the caching prospector tool, the user and/or administrator can choose either to automatically or autonomically cache data when a prescribed or pre-set read-update ratio threshold or value has been exceeded, or to be advised of caching opportunities when a pre-set read-update ratio threshold has been exceeded, such that the user and/or administrator can decide whether or not to cache the data. Further, a user and/or administrator can define pre-set read-update ratio thresholds or values for data accessed frequently by one or more applications from one or more data sources, and can choose either to automatically or autonomically cache any data that exceeds the pre-set read-update ratio threshold or to be advised of caching opportunities when any data exceeds the pre-set read-update ratio threshold or value. Further yet, a user and/or administrator can choose to be advised of caching opportunities for data accessed from one set of data sources (one or more data sources) and to automatically or autonomically cache data accessed from another set of data sources.
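As an illustration of mixing the two choices across data sources, the user defined settings could be modeled as one entry per data source. The following Python sketch is hypothetical: `SourceSetting`, `plan_actions`, the data source names and the numeric values are all invented for the example and do not come from the caching prospector tool itself.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class SourceSetting:
    """One user defined setting for a data source (hypothetical structure)."""
    threshold: float  # pre-set read-update ratio threshold
    auto_cache: bool  # True: cache automatically; False: advise only


def plan_actions(
    settings: Dict[str, SourceSetting],
    observed: Dict[str, float],
) -> Tuple[List[str], List[str]]:
    """Split data sources into (cache automatically, advise the user).

    `observed` maps each data source to its currently tracked read-update
    ratio; sources with no observation or below threshold are skipped.
    """
    auto: List[str] = []
    advise: List[str] = []
    for source, setting in settings.items():
        ratio = observed.get(source)
        if ratio is None or ratio <= setting.threshold:
            continue  # threshold not exceeded for this data source
        (auto if setting.auto_cache else advise).append(source)
    return auto, advise


# Example: one set of sources is cached automatically, another is advise-only.
settings = {
    "catalog": SourceSetting(threshold=10.0, auto_cache=True),
    "orders": SourceSetting(threshold=5.0, auto_cache=False),
    "audit": SourceSetting(threshold=50.0, auto_cache=True),
}
observed = {"catalog": 40.0, "orders": 8.0, "audit": 3.0}
auto, advise = plan_actions(settings, observed)
```

Keeping the threshold and the action together in one per-source record means each data source can be tuned independently, which matches the option of being advised for one set of data sources while caching another set automatically.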
Reference is now made to
In one embodiment, as shown in
In yet another embodiment, the invention provides a computer program product for identifying caching opportunities. Preferably, the computer program product comprises a form accessible from the computer-usable or computer-readable medium, which provides program codes or instructions for use by or in connection with a computer or any instruction execution system. For the purposes of this description, a computer-usable or computer readable medium can be any apparatus that can contain, store, communicate, propagate, or transport the codes or instructions for use by or in connection with the instruction execution system, apparatus, or device. Preferably, the medium can include an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system (or apparatus or device) or a propagation medium. More preferably, the computer-readable medium can include a semiconductor or solid state memory, magnetic tape, a removable computer diskette, a random access memory (RAM), a read-only memory (ROM), a rigid magnetic disk and an optical disk. Further, examples of optical disks include compact disc-read only memory (CD-ROM), compact disc-read/write (CD-R/W) and digital versatile/video disc (DVD). The invention can take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment containing both hardware and software elements. In a preferred embodiment, the invention is implemented in software, which includes but is not limited to firmware, resident software, microcode, etc.
The computer program product further comprises first program instructions to establish a respective read-update ratio threshold for respective data accessed from a respective data source among a plurality of data sources stored in a database, wherein exceeding the read-update ratio threshold for the respective data accessed identifies a caching opportunity. The computer program product further comprises second program instructions to define an action to be taken when the respective read-update ratio threshold for the respective data accessed from the respective data source has been exceeded and third program instructions to take the action defined when the read-update ratio threshold for the respective data accessed from the respective data source has been exceeded. The computer program product further comprises fourth program instructions to track a respective read-update ratio threshold for data accessed from each data source among the plurality of data sources stored in the database. In an embodiment, the first program instructions include instructions to identify at least one data source among the plurality of data sources utilized by an application. In an embodiment, the action defined to be taken when the respective read-update ratio threshold for the respective data accessed from the respective data source has been exceeded comprises either advising a user and/or administrator of an opportunity to cache the respective data accessed from the respective data source or automatically caching the respective data accessed from the respective data source.
In an embodiment, if the action defined is to advise the user and/or administrator of the opportunity to cache the respective data when the respective read-update ratio threshold for the respective data accessed from the respective data source has been exceeded, the third program instructions include instructions to send notification to the user that the respective read-update ratio threshold for the respective data accessed from the respective data source has been exceeded and to inquire whether the user wants to cache the respective data from the respective data source. In an embodiment, if the user sends an affirmative response to cache the respective data from the respective data source, the third program instructions include instructions to cache the respective data from the respective data source upon receiving the affirmative response from the user to cache the respective data. In an embodiment, if the action defined is to automatically cache the respective data when the respective read-update ratio threshold for the respective data accessed from the respective data source has been exceeded, the third program instructions include instructions to automatically cache the respective data from the respective data source. Preferably, the first, second, third and fourth program instructions are stored on the computer readable medium.
Referring now to
In general, a user (such as, user A, reference numeral 430 through user X, reference numeral 432) may interface with infrastructure 402 for accessing the caching prospector tool 416 configured to identify caching opportunities, which is installed on computer system 404. Similarly, an administrator 446 can interface with infrastructure 402 for supporting and/or configuring the infrastructure 402, such as, upgrading the caching prospector tool 416. In general, the parties could access infrastructure 402 directly, or over a network via interfaces (e.g., client web browsers) loaded on computerized devices (e.g., personal computers, laptops, handheld devices, etc.). In the case of the latter, the network can be any type of network such as the Internet or can be any other network, such as, a local area network (LAN), a wide area network (WAN), a virtual private network (VPN), etc. In any event, communication with infrastructure 402 could occur via a direct hardwired connection (e.g., serial port), or via an addressable connection that may utilize any combination of wire line and/or wireless transmission methods. Moreover, conventional network connectivity, such as Token Ring, Ethernet, WiFi or other conventional communications standards could be used. Still yet, connectivity could be provided by conventional TCP/IP sockets-based protocol. In this instance, the parties could utilize an Internet service provider to establish connectivity to infrastructure 402. It should be understood that under the present invention, infrastructure 402 could be owned and/or operated by a party, such as, a provider 444, or by an independent entity. Regardless, use of infrastructure 402 and the teachings described herein could be offered to the parties on a subscription or fee-basis. In either scenario, an administrator 446 could support and configure infrastructure 402, as mentioned herein above.
Computer system or server 404 is shown to include a CPU (hereinafter “processing unit 406”), a memory 412, a bus 410, and input/output (I/O) interfaces 408. Further, computer system 400 is shown in communication with external I/O devices/resources 424 and storage systems 422 through 428. In an embodiment as shown, the infrastructure 402 includes a plurality of storage systems or data sources, such as storage system 422 that includes data source A, reference numeral 426, up to storage system 428 that includes data source X, reference numeral 429, so that a user A through X (reference numeral 430 through 432) accessing the data sources A through X in the respective storage systems 422 through 428 can be tracked by the caching prospector tool 416 for purposes of identifying caching opportunities. In general, processing unit 406 executes computer program codes, such as the database manager 414, which is configured to track read-update ratio thresholds for data accessed from one or more databases or storage systems, such as storage system 422 through storage system 428, and the caching prospector tool 416, which is configured to identify caching opportunities when a threshold (read-update ratio threshold) has been exceeded. While executing the database manager 414 and/or the caching prospector tool 416, the processing unit 406 can read and/or write data to/from memory 412, storage systems 422 and/or 428, and/or I/O interfaces 408. Bus 410 provides a communication link between each of the components in computer system 400. External devices 424 can include any devices (e.g., keyboard, pointing device, display, etc.) that enable a user to interact with computer system 400 and/or any devices (e.g., network card, modem, etc.) that enable computer system 400 to communicate with one or more other computing devices.
Computer infrastructure 402 is only illustrative of various types of computer infrastructures for implementing the invention. For example, in one embodiment, computer infrastructure 402 includes two or more computing devices (e.g., a server cluster) that communicate over a network to perform the various process steps of the invention. Moreover, computer system 400 is only representative of various possible computer systems that can include numerous combinations of hardware. To this extent, in other embodiments, computer system 400 can include any specific purpose computing article of manufacture comprising hardware and/or computer program code for performing specific functions, any computing article of manufacture that includes a combination of specific purpose and general purpose hardware/software, or the like. In each case, the program code and hardware can be created using standard programming and engineering techniques, respectively. Moreover, processing unit 406 may include a single processing unit, or be distributed across one or more processing units in one or more locations, e.g., on a client and server. Similarly, memory 412 and/or storage system 422 can include any combination of various types of data storage and/or transmission media that reside at one or more physical locations. Further, I/O interfaces 408 can include any system for exchanging information with one or more external devices 424. Still further, it is understood that one or more additional components (e.g., system software, math co-processing unit, etc., not shown in
Storage systems 422 and 428 can be any type of system (e.g., a database) capable of storing information or data, such as data sources A (reference numeral 426) through X (reference numeral 429). To this extent, storage system 422 could include one or more storage devices, such as a magnetic disk drive or an optical disk drive. In another embodiment, storage systems 422 and 428 include data distributed across, for example, a local area network (LAN), wide area network (WAN) or a storage area network (SAN) (not shown). Although not shown, additional components, such as cache memory, communication systems, system software, etc., may be incorporated into computer system 400.
The foregoing descriptions of specific embodiments of the present invention have been presented for the purpose of illustration and description. They are not intended to be exhaustive or to limit the invention to the precise forms disclosed, and obviously many modifications and variations are possible in light of the above teaching. The embodiments were chosen and described in order to best explain the principles of the invention and its practical application, to thereby enable others skilled in the art to best utilize the invention and various embodiments with various modifications as are suited to the particular use contemplated. It is intended that the scope of the invention be defined by the claims appended hereto and their equivalents.