The present invention relates to methods and systems for compressing electronic data for storage or transmission; and in particular to improved value list compression techniques for compressing data stored in database tables including column partitioned tables.
The amount of data generated, collected and saved by businesses is increasing at an unprecedented rate. Businesses are retaining enormous amounts of detailed data, such as call detail records, transaction history, and web clickstreams, and then mining it to identify business value. Regulatory and legal retention requirements add to this burden by requiring businesses to maintain years of accessible historical data.
As businesses enter an era of petabyte-scale data warehouses, advanced technologies, such as data compression, are increasingly utilized to effectively maintain enormous data volumes in the warehouse. Data compression reduces storage cost by storing more logical data per unit of physical capacity. Performance is improved because there is less physical data to retrieve during database queries.
One current technique for compressing data, known as Value List Compression, may be applied to compress column data within a database table. A set of values are identified in a dictionary and any occurrence of a dictionary value in a row is compressed in the sense it is not recorded in the row, but a pointer to the dictionary value is recorded in the row header. An occurrence of a value that is not in the dictionary is stored as an uncompressed value.
Value List compression techniques can also be applied to a column partitioned table which stores multiple column values in a row. Such rows are called container rows in this document. Two sets of structures exist in a container row for this purpose: a fixed length compression dictionary, called static compression dictionary, SCD; and a list of uncompressed values, called the uncompressed column value list.
After a container row has been auto-compressed and stored, values that arrive later for insertion into the table that are not available in the SCD are no longer added to the SCD, but are appended to the uncompressed column value list in the container row. The uncompressed value list is in the order of data arrival and therefore in the order of row-id. The uncompressed column value list is not sorted and built dynamically as values are inserted. In a container row there are some bits for each base table row value that is stored in the container row. These bits include information such as where the value is recorded in the row, whether the value exists or is NULL, whether the value exists in the dictionary or is uncompressed and so on. These bits are collectively called “auto compression bits” in this document.
The above structure and method for compressing container row data have the following drawbacks:
Described below is an improved database compression scheme that overcomes the disadvantaged discussed above.
In the following description, reference is made to the accompanying drawings that form a part hereof, and in which is shown by way of illustration specific embodiments in which the invention may be practiced. These embodiments are described in sufficient detail to enable one of ordinary skill in the art to practice the invention, and it is to be understood that other embodiments may be utilized and that structural, logical, optical, and electrical changes may be made without departing from the scope of the present invention. The following description is, therefore, not to be taken in a limited sense, and the scope of the present invention is defined by the appended claims.
The technique for coding compressible database fields disclosed herein has particular application, but is not limited, to large databases that might contain many millions or billions of records managed by a database system (“DBS”) 100, such as a Teradata Active Data Warehousing System available from Teradata Corporation.
For the case in which one or more virtual processors are running on a single physical processor, the single physical processor swaps between the set of N virtual processors.
For the case in which N virtual processors are running on an M-processor node, the node's operating system schedules the N virtual processors to run on its set of M physical processors. If there are 4 virtual processors and 4 physical processors, then typically each virtual processor would run on its own physical processor. If there are 8 virtual processors and 4 physical processors, the operating system would schedule the 8 virtual processors against the 4 physical processors, in which case swapping of the virtual processors would occur.
Each of the processing modules 1101 . . . N manages a portion of a database that is stored in a corresponding one of the data-storage facilities 1201 . . . N. Each of the data-storage facilities 1201 . . . N includes one or more disk drives. The DBS may include multiple nodes 1052 . . . p in addition to the illustrated node 1051, connected by extending the network 115.
The system stores data in one or more tables in the data-storage facilities 1201 . . . N. The rows 1251 . . . Z of the tables are stored across multiple data-storage facilities 1201 . . . N to ensure that the system workload is distributed evenly across the processing modules 1101 . . . N. A parsing engine 130 organizes the storage of data and the distribution of table rows 1251 . . . Z among the processing modules 1101 . . . N. The parsing engine 130 also coordinates the retrieval of data from the data-storage facilities 1201 . . . N in response to queries received from a user at a mainframe 135 or a client computer 140. The DBS 100 usually receives queries and commands to build tables in a standard format, such as SQL.
In one implementation, the rows 1251 . . . Z are distributed across the data-storage facilities 1201 . . . N by the parsing engine 130 in accordance with their primary index. The primary index defines the columns of the rows that are used for calculating a hash value. The function that produces the hash value from the values in the columns specified by the primary index is called the hash function. Some portion, possibly the entirety, of the hash value is designated a “hash bucket”. The hash buckets are assigned to data-storage facilities 1201 . . . N and associated processing modules 1101 . . . N by a hash bucket map. The characteristics of the columns chosen for the primary index determine how evenly the rows are distributed.
In one example system, the parsing engine 130 is made up of three components: a session control 200, a parser 205, and a dispatcher 210, as shown in
Once the session control 200 allows a session to begin, a user may submit a SQL request, which is routed to the parser 205. As illustrated in
An example of a table with a compressible field, illustrated in
In a typical existing system, an example of which is shown in
In some existing systems, a look-up table, or compression dictionary, 520 is provided to translate the code to the compressible field value. In relational databases using SQL, the compression dictionary 520 is frequently joined with the original table 505 during execution of queries that select information from the compressible field.
In table 635, field 610 has been replaced by a code field 640, wherein the values from field 610 have been replaced with compressed code values from compression dictionary 660 when available. For field 610 values not found in the compression dictionary, e.g., KALAMZOO and MIAMISBURG, the uncompressed values are appended to the ends of the rows 650 and 655, and a code, such as “000,” indicating that an uncompressed value is appended to the end of the row, is stored in field 640.
In the illustrated example table 710, the compressed code values in code field 710 are represented using a first bit to specify the dictionary from which the compressed code is obtained, such as “1” for SCD 720 and “0” for DCD 730, followed by the compressed code value from the applicable compression dictionary. As an alternative to using a first bit to specify the dictionary from which the compressed code is obtained, the index value itself can be used to differentiate between the SCD and DCD. For example, if the SCD has 100 values, then any index value in excess of 100 represents an index into the DCD.
Teradata Corporation provides a database solution, referred to as Teradata Columnar, that eliminates performance bottlenecks by storing data in columns, unlike most relational database management systems which only store data in rows. With a Teradata Columnar database system a user can mix-and-match columnar and row-based physical storage to best suit user applications, so that applications get the right data at the right time. With a Teradata Columnar database system, only the data in the columns required for a query are pulled into memory for processing, vastly reducing the time-constraining input/output (I/O) of a row-based approach that would read data from all the columns.
Within a Teradata Columnar database system, column-partitioned data can be stored in a container. A container has many data values packed into it with a single header. The physical space is allocated and managed by the Teradata file system, along with all other space in the Teradata Database. By filling a container with values from a single column and applying partition elimination logic, only the columns referenced in a query are read.
As discussed above, Value List compression techniques can also be applied to data saved in container rows. Similarly, the improved process for compressing database data utilizing both a static compression dictionary and a dynamic compression dictionary, described above for a row-based storage system, can be implemented within a columnar system.
Referring now to
Table 805 provides a simple example of sales data for a business, with each row in the table containing information for one sales order. Each row includes a row number, an order ID number, a product ID number, and product quantity. The values in the Order_ID field column 810 are compressed using a static compression dictionary 820 and a dynamic compression dictionary 830, and stored within a container row 910, shown in
Within container row 910, the static compression dictionary is saved in field 920, the dynamic compression dictionary is saved in fields 930, and the compressed data values corresponding to the Order_ID values from column 810 of table 805 are saved in field 940. Access to each dictionary is direct based on its indexed location. The dynamic compression dictionary 930 is variable in size and can grow as values are inserted.
Contents of the dynamic compression dictionary are not sorted. The DCD is unsorted for a couple of reasons: 1) it is suitable for large volume inserts; 2) ease of insertion—if the number of unique value exceeds the SCD size, then infrequently occurring values are added to DCD but can still be accessed efficiently; and 3) unlike the SCD, an unsorted DCD is amenable to modifications.
The compressed values contained in field 940 consist of a first bit to specify the compression dictionary used during value compression, and a set of bits to indicate the offset (for direct access) into the specified dictionary. An offset of zero indicates a NULL value for the value. If the index is nonzero and the dictionary bit is set to 1 then the index is a pointer into the SCD. If the index is nonzero and the dictionary bit is set to 0, then the index is a pointer to a value in DCD. This auto-compression technique could be used irrespective of the presence or absence of nulls. Additional presence bit is not required to represent the NULL; instead the index is set to zero if the value is NULL.
The improved value list process for compressing data using a dynamic compression dictionary, and updating the dynamic compression dictionary, is illustrated in the flowchart of
If the uncompressed data value is not found in SCD 1103, a search for the value in DCD 1105 is conducted in step 1150. If the uncompressed value is found in the DCD, then the associated compressed code value is retrieved from DCD 1105 and saved, together with a preceding index bit of “0” indicating the compressed code value was obtained from DSCD 1105, as shown in steps 1160, 1180 and 1190.
When the uncompressed data value is not found in either the static or dynamic compression dictionaries, 1103 and 1105, respectively, this “new” uncompressed data value and a corresponding compressed code value are added to DCD 1105, as shown in step 1170. This new compressed code value is also saved to data storage with a “0” index bit, as shown by steps 1170, 1180 and 1190.
It should be noted that the search into the unsorted DCD is linear and can become expensive for large list sizes. To address this issue, the search for a value in the DCD should only be undertaken if the list size is small. For large list sizes, the new incoming value can be simply appended to DCD. It is also possible to limit the search to a small percentage of the DCD, such as the last value added, or some small number of values.
The size of the dictionary can be ascertained from the “last used” index for the dictionary. This technique is dynamic depending upon a tradeoff that can be turned off without compromising correctness. Alternatively, the size of the DCD can be set by the user, can be a user-specified percentage of the SCD, or can be determined by the system based on a number of parameters not necessarily limited to the container row size, the number of unique values in the SCD, or the number of rows represented by the SCD, i.e., the hit ratio of the SCD.
The improved compression technique described herein increases the level of compression when compared to techniques which store new values in an uncompressed value list without dictionary support. This is because such lists are unsuitable for value list compression techniques for newly arriving (inserted) values that are not in the SCD. The compression techniques possible in such cases are limited to run-length, which requires that recurring values be adjacent to each other, and that a previous value must be located using a linear traversal of the uncompressed value list.
The improved compression technique described herein can be combined with other compression techniques, such as Run Length compression, trim length compression, and multi-value compression techniques.
The container row compression techniques described above are not limited to container rows that store a single column value, and may be applied to multi-column value container rows, such as a container row that stores Social Security Number and Employee Number fields (columns) rather than just the Social Security Number or Employee Number field (column). These compression techniques can also be combined with table level compression for a column, also known as multi-value compression or MVC
Instructions of the various software routines discussed herein, such as the method illustrated in
Data and instructions of the various software routines are stored in respective storage modules, which are implemented as one or more machine-readable storage media. The storage media include different forms of memory including semiconductor memory devices such as dynamic or static random access memories (DRAMs or SRAMs), erasable and programmable read-only memories (EPROMs), electrically erasable and programmable read-only memories (EEPROMs) and flash memories; magnetic disks such as fixed, floppy and removable disks; other magnetic media including tape; and optical media such as compact disks (CDs) or digital video disks (DVDs).
The instructions of the software routines are loaded or transported to each device or system in one of many different ways. For example, code segments including instructions stored on floppy disks, CD or DVD media, a hard disk, or transported through a network interface card, modem, or other interface device are loaded into the device or system and executed as corresponding software modules or layers.
The foregoing description of the invention has been presented for purposes of illustration and description. It is not intended to be exhaustive or to limit the invention to the precise form disclosed.
Additional alternatives, modifications, and variations will be apparent to those skilled in the art in light of the above teaching. Accordingly, this invention is intended to embrace all alternatives, modifications, equivalents, and variations that fall within the spirit and broad scope of the attached claims.
This application claims priority under 35 U.S.C. §119(e) to the following commonly-assigned patent applications, which are incorporated herein by reference: Provisional Patent Application Ser. No. 61/580,960, entitled “SYSTEM AND METHOD FOR DATA COMPRESSION USING A DYNAMIC COMPRESSION DICTIONARY,” filed on Dec. 28, 2011, by Bhashyam Ramesh, Vinupriya Selvamanee, and Jaiprakash C.
Number | Name | Date | Kind |
---|---|---|---|
5534861 | Chang et al. | Jul 1996 | A |
5592667 | Bugajski | Jan 1997 | A |
5872530 | Domyo et al. | Feb 1999 | A |
6415295 | Feinberg | Jul 2002 | B1 |
7904432 | McKay et al. | Mar 2011 | B2 |
20030084041 | Dettinger | May 2003 | A1 |
20110043387 | Abali et al. | Feb 2011 | A1 |
20110307440 | Panchenko | Dec 2011 | A1 |
20120117082 | Koperda et al. | May 2012 | A1 |
20120296983 | Boehm | Nov 2012 | A1 |
20130110766 | Promhouse et al. | May 2013 | A1 |
Number | Date | Country | |
---|---|---|---|
61580960 | Dec 2011 | US |