The present disclosure relates to resource management systems and methods that manage data storage and computing resources.
Many existing data storage and retrieval systems are available today. For example, in a shared-disk system, all data is stored on a shared storage device that is accessible from all of the processing nodes in a data cluster. In this type of system, all data changes are written to the shared storage device to ensure that all processing nodes in the data cluster access a consistent version of the data. As the number of processing nodes increases in a shared-disk system, the shared storage device (and the communication links between the processing nodes and the shared storage device) becomes a bottleneck that slows data read and data write operations. This bottleneck is further aggravated with the addition of more processing nodes. Thus, existing shared-disk systems have limited scalability due to this bottleneck problem.
Another existing data storage and retrieval system is referred to as a “shared-nothing architecture.” In this architecture, data is distributed across multiple processing nodes such that each node stores a subset of the data in the entire database. When a processing node is added or removed, the shared-nothing architecture must rearrange data across the multiple processing nodes. This rearrangement of data can be time-consuming and disruptive to data read and write operations executed during the data rearrangement. In addition, the affinity of data to a particular node can create “hot spots” on the data cluster for popular data. Further, since each processing node also performs the storage function, this architecture requires at least one processing node to store data. Thus, the shared-nothing architecture fails to store data if all processing nodes are removed. Additionally, management of data in a shared-nothing architecture is complex due to the distribution of data across many different processing nodes.
The systems and methods described herein provide an improved approach to data storage and data retrieval that alleviates the above-identified limitations of existing systems.
Non-limiting and non-exhaustive embodiments of the present disclosure are described with reference to the following figures, wherein like reference numerals refer to like parts throughout the various figures unless otherwise specified.
Disclosed herein are methods, apparatuses, and systems for managing semi-structured data. For example, an implementation of a method for managing semi-structured data may receive semi-structured data elements from a data source, and may perform statistical analysis on collections of the semi-structured data elements as they are added to the database. Additionally, the implementation may identify common data elements from within the semi-structured data and may combine the common data elements from the data source into separate pseudo-columns stored in cache memory. The implementation may further make metadata and statistics corresponding to the pseudo-columns available to a computer-based query generator, and may store non-common data elements in an overflow serialized column in computer memory.
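By way of a non-limiting illustration, the ingest steps summarized above might be sketched as follows. The function name `split_records`, the `min_fraction` threshold, the use of JSON for the overflow column, and the restriction to flat records are all assumptions made for this sketch and are not taken from the disclosure. Missing values are marked with `None`, which in this simplified sketch is not distinguished from an explicit null.

```python
import json
from collections import Counter


def split_records(records, min_fraction=0.5):
    """Split flat records into pseudo-columns, an overflow column, and statistics.

    Hypothetical helper: a key is treated as "common" when it appears in at
    least min_fraction of the records in the collection.
    """
    if not records:
        return {}, [], {}
    # Statistical analysis: count how many records contain each key.
    counts = Counter(key for rec in records for key in rec)
    common = sorted(k for k, n in counts.items() if n / len(records) >= min_fraction)
    # One pseudo-column per common key; None marks a missing value.
    pseudo_columns = {k: [rec.get(k) for rec in records] for k in common}
    # Non-common elements are kept in a serialized overflow column.
    overflow = [
        json.dumps({k: v for k, v in rec.items() if k not in common}, sort_keys=True)
        for rec in records
    ]
    # Metadata and statistics that a query generator could consult.
    stats = {}
    for key, column in pseudo_columns.items():
        present = [v for v in column if v is not None]
        stats[key] = {
            "count": len(present),
            "min": min(present) if present else None,
            "max": max(present) if present else None,
        }
    return pseudo_columns, overflow, stats


records = [
    {"id": 1, "user": "ann", "tags": "x"},
    {"id": 2, "user": "bob"},
    {"id": 3, "user": "cy", "ref": "r9"},
]
pseudo_columns, overflow, stats = split_records(records)
```

In this example, `id` and `user` appear in every record and become pseudo-columns, while `tags` and `ref` remain in the serialized overflow column.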
In the following description, reference is made to the accompanying drawings that form a part thereof, and in which is shown by way of illustration specific exemplary embodiments in which the disclosure may be practiced. These embodiments are described in sufficient detail to enable those skilled in the art to practice the concepts disclosed herein, and it is to be understood that modifications to the various disclosed embodiments may be made, and other embodiments may be utilized, without departing from the scope of the present disclosure. The following detailed description is, therefore, not to be taken in a limiting sense.
Reference throughout this specification to “one embodiment,” “an embodiment,” “one example” or “an example” means that a particular feature, structure or characteristic described in connection with the embodiment or example is included in at least one embodiment of the present disclosure. Thus, appearances of the phrases “in one embodiment,” “in an embodiment,” “one example” or “an example” in various places throughout this specification are not necessarily all referring to the same embodiment or example. In addition, it should be appreciated that the figures provided herewith are for explanation purposes to persons ordinarily skilled in the art and that the drawings are not necessarily drawn to scale.
Embodiments in accordance with the present disclosure may be embodied as an apparatus, method or computer program product. Accordingly, the present disclosure may take the form of an entirely hardware-comprised embodiment, an entirely software-comprised embodiment (including firmware, resident software, micro-code, etc.) or an embodiment combining software and hardware aspects that may all generally be referred to herein as a “circuit,” “module” or “system.” Furthermore, embodiments of the present disclosure may take the form of a computer program product embodied in any tangible medium of expression having computer-usable program code embodied in the medium.
Any combination of one or more computer-usable or computer-readable media may be utilized. For example, a computer-readable medium may include one or more of a portable computer diskette, a hard disk, a random access memory (RAM) device, a read-only memory (ROM) device, an erasable programmable read-only memory (EPROM or Flash memory) device, a portable compact disc read-only memory (CDROM), an optical storage device, and a magnetic storage device. Computer program code for carrying out operations of the present disclosure may be written in any combination of one or more programming languages. Such code may be compiled from source code to computer-readable assembly language or machine code suitable for the device or computer on which the code will be executed.
Embodiments may also be implemented in cloud computing environments. In this description and the following claims, “cloud computing” may be defined as a model for enabling ubiquitous, convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, and services) that can be rapidly provisioned via virtualization and released with minimal management effort or service provider interaction and then scaled accordingly. A cloud model can be composed of various characteristics (e.g., on-demand self-service, broad network access, resource pooling, rapid elasticity, and measured service), service models (e.g., Software as a Service (“SaaS”), Platform as a Service (“PaaS”), and Infrastructure as a Service (“IaaS”)), and deployment models (e.g., private cloud, community cloud, public cloud, and hybrid cloud).
The flow diagrams and block diagrams in the attached figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flow diagrams or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It will also be noted that each block of the block diagrams and/or flow diagrams, and combinations of blocks in the block diagrams and/or flow diagrams, may be implemented by special purpose hardware-based systems that perform the specified functions or acts, or combinations of special purpose hardware and computer instructions. These computer program instructions may also be stored in a computer-readable medium that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable medium produce an article of manufacture including instruction means which implement the function/act specified in the flow diagram and/or block diagram block or blocks.
The systems and methods described herein provide a flexible and scalable data warehouse using a new data processing platform. In some embodiments, the described systems and methods leverage a cloud infrastructure that supports cloud-based storage resources, computing resources, and the like. Example cloud-based storage resources offer significant storage capacity available on-demand at a low cost. Further, these cloud-based storage resources may be fault-tolerant and highly scalable, which can be costly to achieve in private data storage systems. Example cloud-based computing resources are available on-demand and may be priced based on actual usage levels of the resources. Typically, the cloud infrastructure is dynamically deployed, reconfigured, and decommissioned in a rapid manner.
In the described systems and methods, a data storage system utilizes a relational database that supports semi-structured data. However, these systems and methods are applicable to any type of database using any data storage architecture and using any language to store and retrieve data within the database. As used herein, semi-structured data is meant to convey a form of structured data that does not conform to the typical formal structure of data models associated with relational databases, but nonetheless contains tags or other markers to separate semantic elements and enforce hierarchies of records and fields within the data. The systems and methods described herein further provide a multi-tenant system that supports isolation of computing resources and data between different customers/clients and between different users within the same customer/client.
Disclosed herein are methods and systems that significantly improve performance of databases and data warehouse systems handling large amounts of semi-structured data. Existing database systems are either relational (i.e., SQL databases) or key-value stores.
Relational databases can perform efficient queries due to query data access pruning (excluding portions of the database from the search based on aggregated metadata about values stored in specific columns of the tables). This, however, requires a rigid tabular format for the data, which cannot be used to represent semi-structured data.
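As a minimal sketch of the pruning idea described above, assume each portion of a table carries aggregated minimum and maximum values for a column. The `Partition` class and `prune` function are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Partition:
    # Hypothetical per-partition metadata for a single column.
    name: str
    min_value: int
    max_value: int


def prune(partitions, low, high):
    """Keep only partitions whose [min, max] range can overlap [low, high]."""
    return [p for p in partitions if p.max_value >= low and p.min_value <= high]


partitions = [
    Partition("p0", 1, 100),
    Partition("p1", 101, 200),
    Partition("p2", 201, 300),
]
# A query for values between 150 and 180 only needs to read partition p1.
survivors = prune(partitions, 150, 180)
```

Only the metadata is consulted to exclude `p0` and `p2`; their stored values are never read.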
On the other hand, key-value stores are more flexible, but introduce severe performance penalties due to the lack of pruning. There are a number of ways to add handling of semi-structured data to relational databases in existing products and research projects:
1. Serialized encoding: a semi-structured data record is stored in a column as a serialized representation. Every time a value of some field is used, it is extracted and converted to an elementary type. This method is flexible, but it prevents access to this data from being improved by pruning, and extraction from the serialized representation is costly, requiring significantly more CPU time than working with normal relational data. The entire serialized data records have to be read from persistent storage and processed even if only a tiny portion (such as a single element) of them is used in the query.
2. Conversion at ingest: the semi-structured data is converted into relational data at ingest. This makes access to this data as fast as access to any other relational data, but requires a rigid specification of the data structure at ingest, and the corresponding database schema must be fully specified beforehand. This method makes handling data with changing structure very costly because of the need to change the database schema. Data whose structure changes from record to record is impossible to handle using this method. The conversion method has to be specified a priori, and any non-trivial change will require re-ingesting the original semi-structured data.
3. Relational-like representation of structured data equivalent to object-attribute-value triplet representation stored in a conventional relational database. This method is flexible, but effectively requires join operations for access to data sub-components, which depending on data can be very slow.
4. Non-traditional extensions to relational data model, allowing columns with different cardinality to be linked in a hierarchy reflecting structure of the source data. The query generation methods for such data representation are not well-understood (and so no effective query generation is possible with the present state of the art). This method also requires input data to conform to a rigid (though non-tabular) schema, and thus is not sufficiently flexible to handle arbitrary semi-structured data.
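To illustrate why the third approach listed above effectively requires join operations, the following sketch stores records as object-attribute-value triplets and answers a simple question about two attributes of the same object. The helper names and sample data are assumptions made for this illustration.

```python
def to_triplets(object_id, record):
    """Flatten one flat record into (object, attribute, value) triplets."""
    return [(object_id, attr, value) for attr, value in record.items()]


triplets = (
    to_triplets(1, {"name": "ann", "city": "Oslo"})
    + to_triplets(2, {"name": "bob", "city": "Rome"})
)


def names_in_city(triplets, city):
    """Find names of objects located in the given city.

    This is a self-join in miniature: the "city" triplets are matched
    against the "name" triplets that share the same object identifier.
    """
    ids = {obj for obj, attr, val in triplets if attr == "city" and val == city}
    return [val for obj, attr, val in triplets if attr == "name" and obj in ids]


result = names_in_city(triplets, "Rome")
```

Each additional attribute referenced by a query adds another pass of matching on the object identifier, which is the cost the text above attributes to this representation.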
What is needed is a system and method for working with semi-structured data that is efficient, low cost, and responsive, because it will preserve the semantics of the semi-structured data while managing the data in at least pseudo columns that can be processed and queried like more traditional data structures.
In an implementation of the following disclosure, data may come in the form of files, elements of files, portions of files, and the like. A file may comprise a collection of documents, and a portion of data may comprise a file, a plurality of documents from a collection, and/or a portion of documents. Further in the implementation, metadata may be associated with files, portions of files, and portions of data.
As used herein, the term “common data elements” is intended to mean data elements belonging to the same group and collection of logically similar elements.
In the implementation, if the data element requested is not in a pseudo-column 120, it may be extracted from the “overflow” serialized data 140, and if an entire semi-structured data record 150 is requested, it may be reconstructed from the extracted data elements in pseudo-columns 120 and the “overflow” data 140 and re-serialized.
In an implementation, when extracting data, if a value of a common data element is needed, it may be obtained directly from the corresponding pseudo-column, using efficient columnar access.
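The two access paths described above might be sketched as follows, assuming that pseudo-columns are held as in-memory lists and that the overflow data is serialized as JSON. Both are assumptions made for this illustration, and `get_element` and `reconstruct` are hypothetical helper names.

```python
import json

# Hypothetical stored form of two records: common elements in pseudo-columns,
# everything else in a serialized overflow column.
pseudo_columns = {"id": [1, 2], "user": ["ann", "bob"]}
overflow = ['{"tags": "x"}', '{}']


def get_element(pseudo_columns, overflow, row, key):
    """Return one data element, preferring direct columnar access."""
    if key in pseudo_columns:
        return pseudo_columns[key][row]
    # Fall back to deserializing the overflow data for this record.
    return json.loads(overflow[row]).get(key)


def reconstruct(pseudo_columns, overflow, row):
    """Rebuild and re-serialize an entire semi-structured record."""
    record = json.loads(overflow[row])
    for key, column in pseudo_columns.items():
        if column[row] is not None:
            record[key] = column[row]
    return json.dumps(record, sort_keys=True)


user = get_element(pseudo_columns, overflow, 0, "user")  # columnar access
tags = get_element(pseudo_columns, overflow, 0, "tags")  # overflow access
full = reconstruct(pseudo_columns, overflow, 0)
```

Only the second lookup pays the cost of deserialization; the first reads a single value from the pseudo-column.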
In an implementation, a bloom filter may be employed to control resource use. Bloom filters may use identifiers of data elements within semi-structured data to filter data as it is ingested and consumed by the system and processes.
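One possible form of such a filter is sketched below. The sizing parameters, the SHA-256 based hashing scheme, and the use of dotted element paths as identifiers are assumptions made for this sketch; the disclosure does not specify them.

```python
import hashlib


class BloomFilter:
    """Minimal Bloom filter over string identifiers of data elements."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8 + 1)

    def _positions(self, item):
        # Derive num_hashes bit positions by salting the item with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item):
        # False means definitely absent; True means possibly present.
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item)
        )


seen = BloomFilter()
for element_id in ["user.id", "user.name"]:
    seen.add(element_id)
```

A negative answer from `might_contain` is definitive, so a collection whose filter rejects an identifier can be skipped without reading its data. A positive answer may be a false positive and still requires checking the data itself.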
For a user, this method may be indistinguishable from storing serialized records, and imposes no constraints on the structure of individual data records. However, because the most common data elements are stored in the same way as conventional relational data, they may be accessed without reading and extracting the entire semi-structured records, thus gaining the speed advantages of conventional relational databases.
Because the different collections of semi-structured records (from the same table) may have different sets of data elements extracted, the query generator and the pruning should be able to work with partially available metadata (i.e. parts of the table may have metadata and statistics available for a particular data element, while other parts may lack it).
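A minimal sketch of pruning with partially available metadata follows. It assumes each part of a table exposes a dictionary of per-element minimum and maximum values, and that a part lacking metadata for an element must always be scanned. The names `must_scan` and `table` and the sample values are hypothetical.

```python
def must_scan(part_stats, key, low, high):
    """Decide whether a part of the table has to be read for a range query."""
    stats = part_stats.get(key)
    if stats is None:
        # No metadata for this element in this part: it cannot be pruned safely.
        return True
    return stats["max"] >= low and stats["min"] <= high


table = {
    "part_a": {"price": {"min": 5, "max": 20}},
    "part_b": {},  # "price" was not extracted into a pseudo-column here
    "part_c": {"price": {"min": 50, "max": 90}},
}
to_scan = [name for name, stats in table.items() if must_scan(stats, "price", 10, 30)]
```

Here `part_c` is pruned using its metadata, `part_a` is kept because its range overlaps the query, and `part_b` is kept only because nothing is known about it.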
An advantage over the prior art is the ability to use a hybrid data storage representation (serialized storage of less common elements together with columnar storage of common elements). This allows users to achieve both the flexibility to store arbitrary semi-structured data, as in systems using a serialized representation, and the high query performance provided by conventional relational databases.
Additionally, semi-structured data may represent entire files, partial files, collections of files, and partial collections of files. It should be noted that a semi-structured data element may be a file or a portion of a file. In an implementation, metadata may be used to define data and to assist in its organization and use.
It will be appreciated by those in the art that any data processing platform could use this approach to handling semi-structured data. The approach need not be limited to a DBMS running SQL.
Illustrated in
Resource manager 302 is further coupled to an execution platform 312, which provides multiple computing resources that execute various data storage and data retrieval tasks, as discussed in greater detail below. Execution platform 312 is coupled to multiple data storage devices 316, 318, and 320 that are part of a storage platform 314. Although three data storage devices 316, 318, and 320 are shown in
In particular embodiments, the communication links between resource manager 302 and users 304-308, metadata 310, and execution platform 312 are implemented via one or more data communication networks. Similarly, the communication links between execution platform 312 and data storage devices 316-320 in storage platform 314 are implemented via one or more data communication networks. These data communication networks may utilize any communication protocol and any type of communication medium. In some embodiments, the data communication networks are a combination of two or more data communication networks (or sub-networks) coupled to one another. In alternate embodiments, these communication links are implemented using any type of communication medium and any communication protocol.
As shown in
Resource manager 302, metadata 310, execution platform 312, and storage platform 314 are shown in
Resource manager 302 also includes an SQL compiler 412, an SQL optimizer 414, and an SQL executor 416. SQL compiler 412 parses SQL queries and generates the execution code for the queries. SQL optimizer 414 determines the best method to execute queries based on the data that needs to be processed. SQL executor 416 executes the query code for queries received by resource manager 302. A query scheduler and coordinator 418 sends received queries to the appropriate services or systems for compilation, optimization, and dispatch to an execution platform. A virtual warehouse manager 420 manages the operation of multiple virtual warehouses implemented in an execution platform.
Additionally, resource manager 302 includes a configuration and metadata manager 422, which manages the information related to the data stored in the remote data storage devices and in the local caches. A monitor and workload analyzer 424 oversees the processes performed by resource manager 302 and manages the distribution of tasks (e.g., workload) across the virtual warehouses and execution nodes in the execution platform. Configuration and metadata manager 422 and monitor and workload analyzer 424 are coupled to a data storage device 426.
Resource manager 302 also includes a transaction management and access control module 428, which manages the various tasks and other activities associated with the processing of data storage requests and data access requests. For example, transaction management and access control module 428 provides consistent and synchronized access to data by multiple users or systems. Since multiple users/systems may access the same data simultaneously, changes to the data must be synchronized to ensure that each user/system is working with the current version of the data. Transaction management and access control module 428 provides control of various data processing activities at a single, centralized location in resource manager 302.
Each virtual warehouse 502-506 is capable of accessing any of the data storage devices 316-320 shown in
In the example of
Similar to virtual warehouse 502 discussed above, virtual warehouse 504 includes three execution nodes 526, 528, and 530. Execution node 526 includes a cache 532 and a processor 534. Execution node 528 includes a cache 536 and a processor 538. Execution node 530 includes a cache 540 and a processor 542. Additionally, virtual warehouse 506 includes three execution nodes 544, 546, and 548. Execution node 544 includes a cache 550 and a processor 552. Execution node 546 includes a cache 554 and a processor 556. Execution node 548 includes a cache 558 and a processor 560.
Although the execution nodes shown in
Further, the cache resources and computing resources may vary between different execution nodes. For example, one execution node may contain significant computing resources and minimal cache resources, making the execution node useful for tasks that require significant computing resources. Another execution node may contain significant cache resources and minimal computing resources, making this execution node useful for tasks that require caching of large amounts of data. In some embodiments, the cache resources and computing resources associated with a particular execution node are determined when the execution node is created, based on the expected tasks to be performed by the execution node.
Additionally, the cache resources and computing resources associated with a particular execution node may change over time based on changing tasks performed by the execution node. For example, a particular execution node may be assigned more processing resources if the tasks performed by the execution node become more processor intensive. Similarly, an execution node may be assigned more cache resources if the tasks performed by the execution node require a larger cache capacity.
Although virtual warehouses 502-506 are associated with the same execution platform 312 of
Additionally, each virtual warehouse is shown in
A particular execution platform 312 may include any number of virtual warehouses 502-506. Additionally, the number of virtual warehouses in a particular execution platform is dynamic, such that new virtual warehouses are created when additional processing and/or caching resources are needed. Similarly, existing virtual warehouses may be deleted when the resources associated with the virtual warehouse are no longer necessary.
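The relationship between an execution platform, its virtual warehouses, and their execution nodes might be modeled as in the following sketch. The class names, default cache and processor sizes, and method names are assumptions introduced for illustration and do not describe any particular implementation.

```python
from dataclasses import dataclass, field


@dataclass
class ExecutionNode:
    # Each execution node pairs cache resources with computing resources.
    cache_gb: int
    cpu_cores: int


@dataclass
class VirtualWarehouse:
    name: str
    nodes: list = field(default_factory=list)


@dataclass
class ExecutionPlatform:
    warehouses: dict = field(default_factory=dict)

    def create_warehouse(self, name, node_count, cache_gb=8, cpu_cores=4):
        """Create a virtual warehouse when more resources are needed."""
        nodes = [ExecutionNode(cache_gb, cpu_cores) for _ in range(node_count)]
        self.warehouses[name] = VirtualWarehouse(name, nodes)
        return self.warehouses[name]

    def delete_warehouse(self, name):
        """Delete a virtual warehouse whose resources are no longer necessary."""
        self.warehouses.pop(name, None)


platform = ExecutionPlatform()
warehouse = platform.create_warehouse("wh1", node_count=3)
platform.create_warehouse("wh2", node_count=1, cache_gb=64, cpu_cores=2)
platform.delete_warehouse("wh2")
```

Because the warehouses are held in a mutable mapping, the number of warehouses can change over the life of the platform, and each node can be sized differently at creation time.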
Computing device 600 includes one or more processor(s) 602, one or more memory device(s) 604, one or more interface(s) 606, one or more mass storage device(s) 608, and one or more Input/Output (I/O) device(s) 610, all of which are coupled to a bus 612. Processor(s) 602 include one or more processors or controllers that execute instructions stored in memory device(s) 604 and/or mass storage device(s) 608. Processor(s) 602 may also include various types of computer-readable media, such as cache memory.
Memory device(s) 604 include various computer-readable media, such as volatile memory (e.g., random access memory (RAM)) and/or nonvolatile memory (e.g., read-only memory (ROM)). Memory device(s) 604 may also include rewritable ROM, such as Flash memory.
Mass storage device(s) 608 include various computer readable media, such as magnetic tapes, magnetic disks, optical disks, solid state memory (e.g., Flash memory), and so forth. Various drives may also be included in mass storage device(s) 608 to enable reading from and/or writing to the various computer readable media. Mass storage device(s) 608 include removable media and/or non-removable media.
I/O device(s) 610 include various devices that allow data and/or other information to be input to or retrieved from computing device 600. Example I/O device(s) 610 include cursor control devices, keyboards, keypads, microphones, monitors or other display devices, speakers, printers, network interface cards, modems, lenses, CCDs or other image capture devices, and the like.
Interface(s) 606 include various interfaces that allow computing device 600 to interact with other systems, devices, or computing environments. Example interface(s) 606 include any number of different network interfaces, such as interfaces to local area networks (LANs), wide area networks (WANs), wireless networks, and the Internet.
Bus 612 allows processor(s) 602, memory device(s) 604, interface(s) 606, mass storage device(s) 608, and I/O device(s) 610 to communicate with one another, as well as other devices or components coupled to bus 612. Bus 612 represents one or more of several types of bus structures, such as a system bus, PCI bus, IEEE 1394 bus, USB bus, and so forth.
For purposes of illustration, programs and other executable program components are shown herein as discrete blocks, although it is understood that such programs and components may reside at various times in different storage components of computing device 600, and are executed by processor(s) 602. Alternatively, the systems and procedures described herein can be implemented in hardware, or a combination of hardware, software, and/or firmware. For example, one or more application specific integrated circuits (ASICs) can be programmed to carry out one or more of the systems and procedures described herein.
Although the present disclosure is described in terms of certain preferred embodiments, other embodiments will be apparent to those of ordinary skill in the art, given the benefit of this disclosure, including embodiments that do not provide all of the benefits and features set forth herein, which are also within the scope of this disclosure. It is to be understood that other embodiments may be utilized, without departing from the scope of the present disclosure.
This application claims the benefit of U.S. Provisional Application Ser. No. 61/941,986, entitled “Apparatus and method for enterprise data warehouse data processing on cloud infrastructure,” filed Feb. 19, 2014, the disclosure of which is incorporated herein by reference in its entirety.
Number | Date | Country | |
---|---|---|---|
20150234931 A1 | Aug 2015 | US |
Number | Date | Country | |
---|---|---|---|
61941986 | Feb 2014 | US |