This invention generally relates to database management systems and more specifically to a methodology for maintaining unique indexes in a distributed database composed of data records organized into tables.
In many databases unique indexes maintain data integrity by ensuring that no two rows (or records) of data in a table have identical key values. That is, in a unique index an indexed key value can only exist in one row or record. An example of such a unique index in a credit card database is the customer's credit card number. Any index to that credit card number must assure that a given credit card number is only assigned to one individual; that is, only appears in one row or record of a corresponding logical table. So, steps must be taken to ensure that two users do not attempt to assign the same credit card number to two different individuals; that is, two users do not try to place the same or different index values in one row. Databases that maintain such a function are known as being consistent and concurrent. Several methods have been implemented to assure the consistency and concurrency of such indexes. A popular method involves quiescing operations so that while one index is being updated, any other attempt is blocked. This approach has been implemented in non-shared databases where only a single copy of the index exists. Often these methods involve quiescing the entire database.
The above-identified U.S. Pat. No. 8,224,860 discloses a distributed database management system comprising a network of transactional nodes and archival nodes. Archival nodes act as storage managers for all the data in the database. Each user connects to a transactional node to perform operations on the database by generating queries for processing at that transactional node. A given transactional node need only contain that data and metadata as required to process queries from users connected to that node. This distributed database is defined by an array of atom classes, such as an index class and atoms where each atom corresponds to a different instance of the class, such as an index atom for a specific index. Replications or copies of an atom may reside in multiple nodes as needed. The atom copy in a given node is processed in that node.
In this implementation of the above-identified U.S. Pat. No. 8,224,860 asynchronous messages transfer among the different nodes to maintain database consistency and concurrency. Specifically, each node in the database network has a unique communication path to every other node. When one node generates a message involving a specific atom, it can communicate as necessary with those other nodes that also contain replications of that specific atom. Each node generates its messages independently of other nodes. So it is possible that, at any given instant, multiple nodes contain replications, or copies, of a given atom and that those different nodes may be at various stages of processing them. Consequently, operations in different nodes are not synchronized. It is necessary to provide a means for maintaining concurrency and consistency.
More specifically, in such a database management system, it is possible for multiple nodes to generate a message requesting an insert to add specific information into an index atom for a unique index. If multiple requests occur at different nodes within a short interval, a race problem exists that can produce an erroneous entry in the index atom. Prior methods, such as those involving quiescence, are not readily applicable to a distributed database management system of the type discussed above without introducing unacceptable system performance degradation. What is needed is a method for handling requested inserts to unique indexes in a distributed database management system.
Therefore it is an object of this invention to provide a database management system for a distributed database that processes requested entries into a unique index in a consistent and concurrent fashion.
Another object of this invention is to provide a database management system for a distributed database that processes requested entries into a unique index in consistent and concurrent fashion without any significant performance degradation.
Yet another object of this invention is to provide a database management system for a distributed database that processes requested entries into a unique index that eliminates the involvement of nodes that do not include that unique index.
In accordance with this invention a unique index is maintained in a distributed database concurrently and consistently. The database is composed of data records organized into tables and is distributed over a plurality of interconnected transactional and archival nodes wherein a database management system defines a plurality of atom classes for different classes of data and metadata and one of said atom classes is an index class that produces a given index atom for a unique index in the database and wherein different nodes may include a replication of a given index atom, one copy of a replicated given index atom being designated a chairman. When another node with a replicated given index atom, a requesting node, seeks to insert a new entry into its local replicated given index atom, the requesting node initially inserts the entry into the local replicated given index atom, generates a local-only flag and transmits to the chairman a message requesting that the entry be inserted into the index atom. At the node containing the chairman, it is determined whether the requested entry is unique in the chairman's replicated given index atom. If the request is determined to be unique, the chairman accepts the entry and transmits a success message to the requesting node. The requesting node responds by clearing the local-only flag and by broadcasting its updated replicated given index atom to all other nodes containing a replicated given index atom whereby the index atom is maintained consistently and concurrently across all nodes.
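The flow summarized above can be pictured with a small in-memory sketch. It is illustrative only: the names `IndexAtomCopy`, `chairman_accepts` and `request_insert` are hypothetical, direct function calls stand in for the asynchronous messages, and the rollback performed when the chairman rejects the entry is an assumption rather than something the summary specifies.

```python
from dataclasses import dataclass, field


@dataclass
class IndexAtomCopy:
    """One node's copy of a unique index atom (hypothetical model)."""
    node_id: str
    keys: dict = field(default_factory=dict)       # key value -> record id
    local_only: set = field(default_factory=set)   # keys awaiting the chairman


def chairman_accepts(chairman: IndexAtomCopy, key, record_id) -> bool:
    """Chairman-side check: accept the entry only if the key is not yet present."""
    if key in chairman.keys:
        return False
    chairman.keys[key] = record_id
    return True


def request_insert(requester, chairman, peers, key, record_id) -> bool:
    """Requesting-node flow: tentative insert, ask the chairman, then broadcast."""
    if key in requester.keys:
        return False                        # conflict already visible locally
    requester.keys[key] = record_id         # tentative local insert
    requester.local_only.add(key)           # local-only flag for this key
    if not chairman_accepts(chairman, key, record_id):
        # Assumption: the tentative entry is rolled back on rejection.
        del requester.keys[key]
        requester.local_only.discard(key)
        return False
    requester.local_only.discard(key)       # success message clears the flag
    for peer in peers:                      # broadcast to the other copies
        peer.keys[key] = record_id
    return True
```

In this sketch a second attempt to insert the same key from any node returns `False`, either because the broadcast entry is already present locally or because the chairman's copy already holds the key.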
The appended claims particularly point out and distinctly claim the subject matter of this invention. The various objects, advantages and novel features of this invention will be more fully apparent from a reading of the following detailed description in conjunction with the accompanying drawings in which like reference numerals refer to like parts, and in which:
A specific database management system in
Each node in
In this system, the classes/objects set 42 is divided into a subset 43 of “atom classes,” a subset 44 of “message classes” and a subset 45 of “helper classes.” Additional details of certain of these classes that are relevant to this invention are described. As will become apparent, at any given time a transactional node only contains those portions of the total database that are then relevant to active user applications. Moreover, the various features of this distributed database management system enable all portions of the database in use at a given time to be resident in random access memory 38. There is no need for providing supplementary storage, such as disk storage, at a transactional node during the operation of this system.
Referring to
Each atom has certain common elements and other elements that are specific to its type.
Each time a copy of an atom is changed in any transactional node, it receives a new change number. Element 76E records that change number. Whenever a node requests an atom from another node, there is an interval during which time the requesting node will not be known to other transactional nodes. Element 76F is a list of all the nodes to which the supplying node must relay messages that contain the atom until the request is completed.
Operations of the database system are also divided into cycles. A cycle reference element 76G provides the cycle number of the last access to the atom. Element 76H is a list of all the active nodes that contain the atom. Element 76I includes several status indicators. Element 76J contains a binary tree of index nodes to provide a conventional indexing function. Element 76K contains an index level. Such index structures and operations are known to those skilled in the art.
As previously indicated, communication between any two nodes is by way of serialized messages which are transmitted asynchronously using TCP or another protocol with controls to maintain messaging sequences.
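As a rough illustration of the elements just listed, an index atom could be modeled as follows. The field names, the Python types, the unbalanced binary tree and the increment-by-one change number are all assumptions made for the sketch; only the element numbers in the comments come from the description.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class IndexNode:
    """A node of the binary tree held in element 76J (illustrative layout)."""
    key: str
    record_id: str
    left: Optional["IndexNode"] = None
    right: Optional["IndexNode"] = None


@dataclass
class IndexAtom:
    change_number: int = 0                              # element 76E
    relay_nodes: list = field(default_factory=list)     # element 76F
    cycle_reference: int = 0                            # element 76G
    active_nodes: list = field(default_factory=list)    # element 76H
    status: set = field(default_factory=set)            # element 76I
    root: Optional[IndexNode] = None                    # element 76J
    index_level: int = 0                                # element 76K

    def insert(self, key: str, record_id: str) -> bool:
        """Insert into the binary tree; reject a key that is already present."""
        if self.root is None:
            self.root = IndexNode(key, record_id)
        else:
            node = self.root
            while True:
                if key == node.key:
                    return False                        # unique index: no duplicates
                branch = "left" if key < node.key else "right"
                child = getattr(node, branch)
                if child is None:
                    setattr(node, branch, IndexNode(key, record_id))
                    break
                node = child
        self.change_number += 1    # assumption: each change bumps the number by one
        return True
```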
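One way to picture serialized messages with sequence control is sketched below. The JSON encoding, the per-sender sequence counter and the `OrderedInbox` class are assumptions for illustration; the description does not state the wire format or how sequencing is enforced.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class Message:
    kind: str        # hypothetical label, e.g. "InsertRequest"
    sender: str      # id of the sending node
    sequence: int    # per-sender counter, starting at 0 (assumption)
    payload: dict


def serialize(msg: Message) -> bytes:
    """Encode a message as UTF-8 JSON bytes."""
    return json.dumps(asdict(msg), sort_keys=True).encode("utf-8")


def deserialize(data: bytes) -> Message:
    """Decode bytes produced by serialize()."""
    return Message(**json.loads(data.decode("utf-8")))


class OrderedInbox:
    """Releases each sender's messages strictly in sequence order."""

    def __init__(self):
        self.expected = {}   # sender -> next sequence number to release
        self.held = {}       # (sender, sequence) -> Message waiting its turn

    def receive(self, data: bytes) -> list:
        msg = deserialize(data)
        self.held[(msg.sender, msg.sequence)] = msg
        ready = []
        nxt = self.expected.get(msg.sender, 0)
        while (msg.sender, nxt) in self.held:
            ready.append(self.held.pop((msg.sender, nxt)))
            nxt += 1
        self.expected[msg.sender] = nxt
        return ready
```

A message that arrives ahead of its predecessor is held back and released only once the gap is filled, so each receiver processes a given sender's messages in the order they were generated.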
With this as background,
Referring to the process 200, the chairman sets a “local-only” flag in step 202 to indicate that the insert process is underway. The “local-only” flag can be a component of the status states element 76I in
If step 204 determines that the attempt is successful, step 204 transfers control to step 207. In step 208 the chairman first clears the “local-only” flag associated with the inserted key and then broadcasts the modified index atom to all other nodes that contain a replication of that index atom. More specifically, the chairman transmits an Index Node Added message 150 in
When a non-chairman attempts to insert a new index key value in the process 201, step 211 attempts to insert the key value in the index atom and sets a local-only flag associated with the inserted key. If this attempt fails, step 212 diverts control to step 213 whereupon further processing terminates and a failure indication is generated. As will be apparent, a failure means that the modified index was in conflict with the contents of the existing index atom at the requesting node.
If, however, the insert index is entered, step 212 diverts control to step 214 whereupon the non-chairman attempts to send an Insert Request message, such as the Insert Request message 160 in
In step 220, the requesting non-chairman node processes this Insert Status message. If the Insert Status message indicates that the chairman had accepted the modification to the insert atom, step 221 transfers control to step 222 that clears the local-only flag that was set in step 211.
If the non-chairman request is not inserted by the chairman in step 203, an Insert Status message is generated with a failed state at step 224 and transmitted at step 216 whereupon step 221 diverts to step 223 that removes the local-only flag for the specific key value status of the insert in the requesting node. Then control returns to step 211 to repeat the process. Such a situation may result when the index atom has been updated by a previous request from node N2 in
Thus in the case of an insert request by either the chairman or a non-chairman, the chairman is the sole arbiter of whether an index atom is updated with a new key value. In either case, the modified index atom is also replicated to all other nodes containing that index atom. Thus, such an index atom modification occurs consistently and concurrently.
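The arbitration role of the chairman can be illustrated with a toy race between two non-chairman nodes. This is a sketch under stated assumptions: the `Node` class and `run_race` function are hypothetical, a queue stands in for the asynchronous Insert Request messages, and the losing node's tentative entry is assumed to be superseded by the replicated winning entry.

```python
from collections import deque


class Node:
    """One node's copy of the unique index atom (hypothetical model)."""

    def __init__(self, name):
        self.name = name
        self.keys = {}            # key value -> owning record
        self.local_only = set()   # keys not yet confirmed by the chairman

    def tentative_insert(self, key, record):
        """Local insert with a local-only flag, as in step 211."""
        if key in self.keys:
            return False
        self.keys[key] = record
        self.local_only.add(key)
        return True


def run_race(key="4111"):
    chairman, n2, n3 = Node("N1"), Node("N2"), Node("N3")
    inbox = deque()   # Insert Requests in arrival order at the chairman

    # Both non-chairman nodes insert locally before hearing from the chairman.
    for node, record in ((n2, "row-A"), (n3, "row-B")):
        if node.tentative_insert(key, record):
            inbox.append((node, key, record))

    outcome = {}
    while inbox:
        sender, k, record = inbox.popleft()
        accepted = k not in chairman.keys     # chairman's uniqueness check
        if accepted:
            chairman.keys[k] = record
        outcome[sender.name] = accepted       # Insert Status: success or failed
        sender.local_only.discard(k)          # flag is removed in either case
        if accepted:
            # Assumption: replication overwrites any tentative local entry.
            for peer in (n2, n3):
                peer.keys[k] = record
    return chairman, n2, n3, outcome
```

Because the chairman handles the requests one at a time, exactly one of the two competing inserts is accepted, and every copy ends up holding the same record for the key.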
With this understanding, it will be apparent that a database management system for a distributed database that processes requested entries into a unique index atom in accordance with this invention does so in an orderly fashion so that all copies of the index atom remain in a consistent and concurrent state. This method does not introduce any significant performance degradation of those nodes that contain a copy of the unique index atom. Moreover, the process operates without any involvement of nodes that do not include that unique index atom.
This invention has been disclosed in terms of certain embodiments. It will be apparent that many modifications can be made to the disclosed apparatus without departing from the invention. Therefore, it is the intent of the appended claims to cover all such variations and modifications as come within the true spirit and scope of this invention.
This application is a continuation of U.S. application Ser. No. 16/926,846, filed Jul. 13, 2020, which is a continuation of U.S. application Ser. No. 14/215,461, filed Mar. 17, 2014, which in turn claims priority from U.S. Provisional Patent Application No. 61/789,671, filed Mar. 15, 2013. Each of these applications is hereby incorporated herein by reference in its entirety. U.S. Pat. No. 8,224,860 granted Jul. 17, 2012 for a Database Management System and assigned to the same assignee as this invention is incorporated in its entirety herein by reference.
Number | Name | Date | Kind |
---|---|---|---|
4733353 | Jaswa | Mar 1988 | A |
4853843 | Ecklund | Aug 1989 | A |
5446887 | Berkowitz | Aug 1995 | A |
5524240 | Barbara et al. | Jun 1996 | A |
5555404 | Torbjornsen et al. | Sep 1996 | A |
5568638 | Hayashi et al. | Oct 1996 | A |
5625815 | Maier et al. | Apr 1997 | A |
5701467 | Freeston | Dec 1997 | A |
5764877 | Lomet et al. | Jun 1998 | A |
5806065 | Lomet | Sep 1998 | A |
5960194 | Choy et al. | Sep 1999 | A |
6216151 | Antoun | Apr 2001 | B1 |
6226650 | Mahajan et al. | May 2001 | B1 |
6275863 | Eff et al. | Aug 2001 | B1 |
6334125 | Johnson et al. | Dec 2001 | B1 |
6401096 | Zellweger | Jun 2002 | B1 |
6424967 | Johnson et al. | Jul 2002 | B1 |
6480857 | Chandler | Nov 2002 | B1 |
6499036 | Gurevich | Dec 2002 | B1 |
6523036 | Hickman et al. | Feb 2003 | B1 |
6748394 | Shah et al. | Jun 2004 | B2 |
6792432 | Kodavalla et al. | Sep 2004 | B1 |
6862589 | Grant | Mar 2005 | B2 |
7026043 | Jander | Apr 2006 | B2 |
7080083 | Kim et al. | Jul 2006 | B2 |
7096216 | Anonsen | Aug 2006 | B2 |
7184421 | Liu et al. | Feb 2007 | B1 |
7219102 | Zhou et al. | May 2007 | B2 |
7233960 | Boris et al. | Jun 2007 | B1 |
7293039 | Deshmukh et al. | Nov 2007 | B1 |
7353227 | Wu | Apr 2008 | B2 |
7395352 | Lam et al. | Jul 2008 | B1 |
7401094 | Kesler | Jul 2008 | B1 |
7403948 | Ghoneimy et al. | Jul 2008 | B2 |
7562102 | Sumner et al. | Jul 2009 | B1 |
7853624 | Friedlander et al. | Dec 2010 | B2 |
7890508 | Gerber et al. | Feb 2011 | B2 |
8108343 | Wang et al. | Jan 2012 | B2 |
8122201 | Marshak et al. | Feb 2012 | B1 |
8224860 | Starkey | Jul 2012 | B2 |
8266122 | Newcombe et al. | Sep 2012 | B1 |
8504523 | Starkey | Aug 2013 | B2 |
8756237 | Stillerman et al. | Jun 2014 | B2 |
8930312 | Rath | Jan 2015 | B1 |
9008316 | Acar | Apr 2015 | B2 |
9501363 | Ottavio | Nov 2016 | B1 |
9734021 | Sanocki et al. | Aug 2017 | B1 |
9824095 | Taylor et al. | Nov 2017 | B1 |
10740323 | Palmer et al. | Aug 2020 | B1 |
11176111 | Palmer et al. | Nov 2021 | B2 |
11561961 | Palmer et al. | Jan 2023 | B2 |
11573940 | Dashevsky | Feb 2023 | B2 |
20020112054 | Hatanaka | Aug 2002 | A1 |
20020152261 | Arkin et al. | Oct 2002 | A1 |
20020152262 | Arkin et al. | Oct 2002 | A1 |
20020178162 | Ulrich et al. | Nov 2002 | A1 |
20030051021 | Hirschfeld et al. | Mar 2003 | A1 |
20030149709 | Banks | Aug 2003 | A1 |
20030204486 | Berks et al. | Oct 2003 | A1 |
20030220935 | Vivian et al. | Nov 2003 | A1 |
20040153459 | Whitten et al. | Aug 2004 | A1 |
20040263644 | Ebi | Dec 2004 | A1 |
20050013208 | Hirabayashi et al. | Jan 2005 | A1 |
20050086384 | Ernst | Apr 2005 | A1 |
20050198062 | Shapiro | Sep 2005 | A1 |
20050216502 | Kaura et al. | Sep 2005 | A1 |
20060010130 | Leff et al. | Jan 2006 | A1 |
20060168154 | Zhang et al. | Jul 2006 | A1 |
20070067349 | Jhaveri et al. | Mar 2007 | A1 |
20070156842 | Vermeulen | Jul 2007 | A1 |
20070288526 | Mankad et al. | Dec 2007 | A1 |
20080086470 | Graefe | Apr 2008 | A1 |
20080106548 | Singer | May 2008 | A1 |
20080228795 | Lomet | Sep 2008 | A1 |
20080320038 | Liege | Dec 2008 | A1 |
20090113431 | Whyte | Apr 2009 | A1 |
20090183175 | Walker | Jul 2009 | A1 |
20100094802 | Luotojarvi et al. | Apr 2010 | A1 |
20100115246 | Seshadri et al. | May 2010 | A1 |
20100153349 | Schroth et al. | Jun 2010 | A1 |
20100191884 | Holenstein et al. | Jul 2010 | A1 |
20100235606 | Oreland et al. | Sep 2010 | A1 |
20100297565 | Waters et al. | Nov 2010 | A1 |
20110087874 | Timashev et al. | Apr 2011 | A1 |
20110231447 | Starkey | Sep 2011 | A1 |
20120096043 | Stevens, Jr. | Apr 2012 | A1 |
20120136904 | Ravi | May 2012 | A1 |
20120254175 | Horowitz et al. | Oct 2012 | A1 |
20130060922 | Koponen et al. | Mar 2013 | A1 |
20130086018 | Horii | Apr 2013 | A1 |
20130110766 | Promhouse et al. | May 2013 | A1 |
20130110774 | Shah et al. | May 2013 | A1 |
20130110781 | Golab et al. | May 2013 | A1 |
20130159265 | Peh et al. | Jun 2013 | A1 |
20130159366 | Lyle et al. | Jun 2013 | A1 |
20130232378 | Resch et al. | Sep 2013 | A1 |
20130259234 | Acar | Oct 2013 | A1 |
20130262403 | Milousheff et al. | Oct 2013 | A1 |
20130278412 | Kelly et al. | Oct 2013 | A1 |
20130297565 | Starkey | Nov 2013 | A1 |
20130311426 | Erdogan et al. | Nov 2013 | A1 |
20140108414 | Stillerman et al. | Apr 2014 | A1 |
20140258300 | Baeumges et al. | Sep 2014 | A1 |
20140279881 | Tan et al. | Sep 2014 | A1 |
20140297676 | Bhatia et al. | Oct 2014 | A1 |
20140304306 | Proctor et al. | Oct 2014 | A1 |
20150019739 | Attaluri et al. | Jan 2015 | A1 |
20150032695 | Tran et al. | Jan 2015 | A1 |
20150066858 | Sabdar et al. | Mar 2015 | A1 |
20150135255 | Theimer et al. | May 2015 | A1 |
20150370505 | Shuma et al. | Dec 2015 | A1 |
20160134490 | Balasubramanyan et al. | May 2016 | A1 |
20160350392 | Rice et al. | Dec 2016 | A1 |
20160371355 | Massari et al. | Dec 2016 | A1 |
20170039099 | Ottavio | Feb 2017 | A1 |
20170139910 | Mcalister et al. | May 2017 | A1 |
20220035786 | Palmer et al. | Feb 2022 | A1 |
Number | Date | Country |
---|---|---|
101471845 | Jul 2009 | CN |
101251843 | Jun 2010 | CN |
101268439 | Apr 2012 | CN |
002931 | Oct 2002 | EA |
1403782 | Mar 2004 | EP |
2003256256 | Sep 2003 | JP |
2006048507 | Feb 2006 | JP |
2007058275 | Mar 2007 | JP |
2315349 | Jan 2008 | RU |
2008106904 | Aug 2009 | RU |
2010034608 | Apr 2010 | WO |
Entry |
---|
“Album Closing Policy,” Background, retrieved from the Internet at URL:http://tools/wiki/display/ENG/Album+Closing+Policy (Jan. 29, 2015), 4 pp. |
“Distributed Coordination in NuoDB,” YouTube, retrieved from the Internet at URL:https://www.youtube.com/watch?feature=player_embedded&v=URoeHvflVKg on Feb. 4, 2015, 2 pp. |
“Durable Distributed Cache Architecture”, retrieved from the Internet at URL: http://www.nuodb.com/explore/newsql-cloud-database-ddc-architecture on Feb. 4, 2015, 3 pages. |
“Glossary-NuoDB 2.1 Documentation / NuoDB,” retrieved from the Internet at URL: http://doc.nuodb.com/display/doc/Glossary on Feb. 4, 2015, 1 pp. |
“How It Works,” retrieved from the Internet at URL: http://www.nuodb.com/explore/newsql-cloud-database-how-it-works?mkt_tok=3RkMMJW on Feb. 4, 2015, 4 pp. |
“How to Eliminate MySQL Performance Issues,” NuoDB Technical Whitepaper, Sep. 10, 2014, Version 1, 11 pp. |
“Hybrid Transaction and Analytical Processing with NuoDB,” NuoDB Technical Whitepaper, Nov. 5, 2014, Version 1, 13 pp. |
“No Knobs Administration,” retrieved from the Internet at URL: http://www.nuodb.com/explore/newsql-cloud-database-product/auto-administration on Feb. 4, 2015, 4 pp. |
“NuoDB at a Glance,” retrieved from the Internet at URL: http://doc.nuodb.com/display/doc/NuoDB+at+a+Glance on Feb. 4, 2015, 1 pp. |
“SnapShot Albums,” Transaction Ordering, retrieved from the Internet at URL:http://tools/wiki/display/ENG/Snapshot+Albums (Aug. 12, 2014), 4 pp. |
“Table Partitioning and Storage Groups (TPSG),” Architect's Overview, NuoDB Technical Design Document, Version 2.0 (2014), 12 pp. |
“The Architecture & Motivation for NuoDB,” NuoDB Technical Whitepaper, Oct. 5, 2014, Version 1, 27 pp. |
“Welcome to NuoDB Swifts Release 2.1 GA,” retrieved from the Internet at URL: http://dev.nuodb.com/techblog/welcome-nuodb-swifts-release-21-ga on Feb. 4, 2015, 7 pp. |
“What Is a Distributed Database? and Why Do You Need One,” NuoDB Technical Whitepaper, Jan. 23, 2014, Version 1, 9 pp. |
Amazon CloudWatch Developer Guide API, Create Alarms That or Terminate an Instance, Jan. 2013, downloaded Nov. 16, 2016 from archive.org., pp. 1-11. |
Amazon RDS FAQs, Oct. 4, 2012, 39 pages. |
Bergsten et al., “Overview of Parallel Architectures for Databases,” The Computer Journal vol. 36, No. 8, pp. 734-740 (1993). |
Connectivity Testing with Ping, Telnet, Trace Route and NSlookup (hereafter help.webcontrolcenter), Article ID:1757, Created: Jun. 17, 2013 at 10:45 a.m., https://help.webcontrolcenter.com/kb/a1757/connectivity-testing-with-ping-telnet-trace-route-and-nslookup.aspx, 6 pages. |
Dan et al., “Performance Comparisons of Buffer Coherency Policies,” Proceedings of the International Conference on Distributed Computer Systems, IEEE Comp. Soc. Press vol. 11, pp. 208-217 (1991). |
Extended European Search Report in European Patent Application No. 18845799.8 dated May 25, 2021, 8 pages. |
Final Office Action dated Nov. 24, 2017 from U.S. Appl. No. 14/215,401, 33 pages. |
Final Office Action dated Nov. 3, 2016 from U.S. Appl. No. 14/215,401, 36 pp. |
Final Office Action dated Sep. 9, 2016 from U.S. Appl. No. 14/215,461, 26 pp. |
First Examination Report issued by the Canadian Intellectual Property Office for Application No. 2,793,429, dated Feb. 14, 2017, 3 pages. |
Garding, P. “Alerting on Database Mirorring Events,” Apr. 7, 2006, downloaded Dec. 6, 2016 from technet.microsoft.com, 24 pp. |
Hull, Autoscaling MySQL on Amazon EC2, Apr. 9, 2012, 7 pages. |
International Search Report and Written Opinion in International Patent Application No. PCT/US18/00142 mailed Dec. 13, 2018. 11 pages. |
Iqbal et al., “Performance Tradeoffs in Static and Dynamic Load Balancing Strategies,” Institute for Computer Applications in Science and Engineering, 1986, pp. 1-23. |
Leverenz et al., “Oracle8i Concepts, Partitioned Tables and Indexes,” Chapter 11, pp. Nov. 12, 11/66 (1999). |
Non-Final Office Action dated Jan. 21, 2016 from U.S. Appl. No. 14/215,401, 19 pp. |
Non-Final Office Action dated May 31, 2017 from U.S. Appl. No. 14/215,401, 27 pp. |
Office Action with translation in Korean Application No. 10-2020-7006901 dated Dec. 16, 2022, 30 pages. |
Oracle Database Concepts 10g Release 2 (10.2), Oct. 2005, 14 pages. |
Rahimi, S. K et al., “Distributed Database Management Systems: A Practical Approach,” IEEE Computer Society, John Wiley & Sons, Inc. Publications (2010), 765 pp. |
Roy, N. et al., “Efficient Autoscaling in the Cloud using Predictive Models for Workload Forecasting,” IEEE 4th International Conference on Cloud Computing, 2011, pp. 500-507. |
Searchcloudapplications.techtarget.com, Autoscaling Definition, Aug. 2012, 1 page. |
Shaull, R. et al., “A Modular and Efficient Past State System for Berkeley DB,” Proceedings of USENIX ATC '14:2014 USENIX Annual Technical Conference, 13 pp. (Jun. 19-20, 2014). |
Shaull, R., “Retro: A Methodology for Retrospection Everywhere,” A Dissertation Presented to the Faculty of the Graduate School of Arts and Sciences of Brandeis University, Waltham, Massachusetts, Aug. 2013, 174 pp. |
Veerman, G. et al., “Database Load Balancing, MySQL 5.5 vs PostgreSQL 9.1,” Universiteit van Amsterdam, System & Network Engineering, Apr. 2, 2012, 51 pp. |
Yousif, M. “Shared-Storage Clusters,” Cluster Computing, Baltzer Science Publishers, Bussum, NL, vol. 2, No. 4, pp. 249-257 (1999). |
Number | Date | Country | |
---|---|---|---|
20230229655 A1 | Jul 2023 | US |
Number | Date | Country | |
---|---|---|---|
61789671 | Mar 2013 | US |
Number | Date | Country | |
---|---|---|---|
Parent | 16926846 | Jul 2020 | US |
Child | 18156806 | US | |
Parent | 14215461 | Mar 2014 | US |
Child | 16926846 | US |