The present invention relates to data management apparatuses, methods, and non-transitory tangible machine-readable media thereof. More specifically, the present invention relates to data management apparatuses, methods, and non-transitory tangible machine-readable media thereof that manage data in a sliding table.
Memory and storage of any electronic computing apparatus (e.g. server, personal computer) are of limited size and, hence, a database system installed in any electronic computing apparatus has limited storage space. Conventionally, the manager of a database system has to explicitly delete data stored therein from time to time to prevent the database system from running out of storage space. One of the key issues is to determine when and which parts of the data are to be deleted or moved to a secondary storage space. In the recent trend of big data, a special category of data, referred to as streaming data, has been identified and has become a focus in the area of analytics. The data generated from the Internet of Things (IoT), mobile apps, or large-scale Web services often falls in this category. Streaming data has the following characteristics.
1. The data stream is non-stop and generated from the real world in real time order.
2. It is immutable and collected as time-series data.
3. The more recent data is more valuable in business.
For a database that stores and manages streaming data, old data is considered obsolete or expired once it has been collected and kept in the database longer than a certain period of time. According to the third characteristic of streaming data, a natural approach is to delete the expired data, since doing so does not greatly affect the business value of the analytics results. Most database management systems require expired or old data to be explicitly deleted, and the stored data to be compressed, when the storage space is almost full. However, removing expired/old data in this way is cumbersome and tedious, and often leads to poor performance or even operation disruption. This issue is magnified and becomes a serious data management problem when continuous, real-time analytics is required for streaming data.
The recent trend of Edge Computing opens up a new area of applications that require effective database management for streaming data collected at the edge. With the vigorous development of the IoT and mobile technologies, many electronic computing apparatuses (e.g. edge devices, equipment, routers, etc.) are designed to collect, store and analyze data in the field, where the environment is harsh, resources such as memory are limited and devices are difficult to manage. An effective and efficient way for such electronic computing apparatuses to manage streaming data in a database is urgently needed.
To solve the aforesaid problem, the present invention provides a data management apparatus, method, and non-transitory tangible machine-readable medium thereof for managing data in an in-memory database with a fixed size of memory.
The data management apparatus provided by the present invention comprises an in-memory database and a processor, wherein the processor is electrically connected to the in-memory database. A memory space of the in-memory database is allocated for a sliding table, wherein the sliding table comprises a plurality of records stored in a sequential order according to a plurality of time stamps of the records. For the sliding table, the time stamp of a record reflects the time that the record is inserted into the sliding table. A least recent record and a most recent record are the record with the smallest time stamp and the record with the largest time stamp respectively. A head pointer points to a beginning address of the least recent record, while a tail pointer points to a following address of the most recent record (i.e., the first available memory address). The processor inserts at least one new record into the sliding table according to the tail pointer and moves the tail pointer according to a number of the at least one new record. Each new record comprises a time stamp as well. The processor further identifies at least one expired record in the sliding table according to a preset time bound and the time stamp of each expired record. The processor further moves the head pointer according to a number of the at least one expired record.
The data management method provided by the present invention is for use in an electronic apparatus. An in-memory database is allocated a memory space for a sliding table. The in-memory database may be in the electronic apparatus or external to the electronic apparatus. The sliding table comprises a plurality of records stored in a sequential order according to a plurality of time stamps of the records. For the sliding table, the time stamp of a record reflects the time that the record is inserted into the sliding table. A least recent record and a most recent record are the record with the smallest time stamp and the record with the largest time stamp respectively. A head pointer points to a beginning address of the least recent record, while a tail pointer points to a following address of the most recent record. The data management method comprises the following steps: (a) inserting at least one new record into the sliding table according to the tail pointer, wherein each new record comprises a time stamp, (b) moving the tail pointer according to a number of the at least one new record, (c) identifying at least one expired record in the sliding table according to a preset time bound and the time stamp of each expired record, and (d) moving the head pointer according to a number of the at least one expired record.
The non-transitory tangible machine-readable medium provided by the present invention is stored with a computer program. The computer program comprises a plurality of codes, wherein the codes are able to execute a data management method when the computer program is loaded into an electronic apparatus. An in-memory database is allocated a memory space for a sliding table. The in-memory database may be in the electronic apparatus or external to the electronic apparatus. The sliding table comprises a plurality of records stored in a sequential order according to a plurality of time stamps of the records. For the sliding table, the time stamp of a record reflects the time that the record is inserted into the sliding table. The least recent record and the most recent record are the record with the smallest time stamp and the record with the largest time stamp respectively. A head pointer points to a beginning address of the least recent record, while a tail pointer points to a following address of the most recent record. The data management method comprises the following steps: (a) inserting at least one new record into the sliding table according to the tail pointer, wherein each new record comprises a time stamp, (b) moving the tail pointer according to a number of the at least one new record, (c) identifying at least one expired record in the sliding table according to a preset time bound and the time stamp of each expired record, and (d) moving the head pointer according to a number of the at least one expired record.
According to the above descriptions, the data management technology provided by the present invention allocates a memory space in an in-memory database for sliding tables and manages data therein. Briefly speaking, a plurality of records stored in a sliding table has a sequential order, wherein the sequential order is determined according to the time stamps of the records. A head pointer points to the beginning address of the least recent record, while a tail pointer points to the following address of the most recent record. With the tail pointer, the present invention can easily locate a storing space in the sliding table for a new record. With the head pointer, the present invention can easily identify the expired record(s) according to the time stamp(s) and a preset time bound. As a result, the data management technology of the present invention can utilize the limited storage space appropriately and can be applied to electronic computing apparatus of every kind of computational ability, especially the electronic computing apparatus that processes streaming data.
The detailed technology and preferred embodiments implemented for the subject invention are described in the following paragraphs accompanying the appended drawings for people skilled in this field to well appreciate the features of the claimed invention.
In the following descriptions, data management apparatuses, methods, and non-transitory tangible machine-readable media thereof of the present invention will be explained with reference to embodiments thereof. However, these embodiments are not intended to limit the present invention to any specific environment, applications, or particular implementations described in these embodiments. Therefore, description of these embodiments is only for the purpose of illustration rather than to limit the present invention. It should be appreciated that elements unrelated to the present invention are omitted from depiction in the following embodiments and the attached drawings. In addition, dimensional relationships among individual elements in the attached drawings are illustrated only for ease of understanding but not to limit the scope of the present invention.
The present invention relates to data management apparatuses, methods, and non-transitory tangible machine-readable media thereof that allocate memory space(s) for sliding table(s) and manage data therein. A sliding table is different from a table used in any conventional database system. A table, as defined in a relational database, is a set of records, each of which consists of a fixed number of columns of attributes in a pre-defined order. Each attribute is a data element with a data type such as integer. All the attributes in the same column are of the same data type. In some cases, a subset of attributes defines the primary key of a record, which must be unique and is used to identify the record. In the present invention, a sliding table is a table with a constraint of recency. Please refer to
In the present invention, a sliding table is accommodated in an in-memory database such as the one shown in
Please refer to
Initially, the sliding table is empty, which means that the memory space 101 contains no record. In
The processor 15 may define a circular order among the record spaces 101a, . . . , 101z. By using the term “circular order,” it means that the record spaces 101a, . . . , 101z have a linear order and the first one according to the linear order is considered as the next record space of the last one according to the linear order. For example, the record space 101a is the first one, . . . , the record space 101z is the last one, and the record space 101a is the next one of the record space 101z. In some embodiments, the processor 15 defines the circular order among the record spaces 101a, . . . , 101z by defining another circular order for a plurality of memory addresses of the memory space 101.
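As a minimal sketch of the circular order, assuming the record spaces are indexed from 0 to N-1 (the indexing scheme and the function name `next_space` are illustrative assumptions, not part of the described apparatus), the record space that follows a given one can be computed with modular arithmetic:

```python
def next_space(index: int, num_spaces: int) -> int:
    """Return the index of the record space that follows `index` in circular order."""
    return (index + 1) % num_spaces


# With 26 record spaces (analogous to 101a, ..., 101z), the last one wraps to the first.
order = [next_space(i, 26) for i in (0, 24, 25)]
```

Here the successor of the last index (25) is index 0, which mirrors the record space 101a being the next one of the record space 101z.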
The processor 15 assigns a tail pointer T1 and a head pointer H1 for the sliding table. The head pointer H1 points to the beginning address of the least recent record of the sliding table, while a tail pointer T1 points to the following address of the most recent record of the sliding table (i.e. the tail pointer T1 points to the next available record space). Initially, there is no record in the sliding table, so the processor 15 assigns both the head pointer H1 and the tail pointer T1 to point to a beginning address of the memory space 101 as shown in
After the environment for the sliding table has been set up (i.e. the memory space 101 for the sliding table has been allocated and the head and tail pointers H1, T1 have pointed to the beginning address of the memory space 101), the processor 15 is able to insert records into the memory space 101 and identify which record(s) belong to the sliding table by the head pointer H1 and the tail pointer T1. The details of these operations are elaborated below.
In this embodiment, the interface 13 receives a plurality of records 12a, 12b, 12c. Please note that the records 12a, 12b, 12c may be received by another interface in other embodiments. The processor 15 inserts the records 12a, 12b, 12c into a portion of the record spaces 101a, . . . , 101z starting from the record space pointed to by the tail pointer T1 and following the circular order. Each of the records 12a, 12b, 12c is of the same size and has a time stamp. In this embodiment, the time stamp of a record is the time that the record is inserted into the sliding table. As shown in
In this embodiment, the processor 15 further identifies at least one expired record in the sliding table according to a preset time bound and the time stamp(s) of the at least one expired record, and moves the head pointer H1 according to a number of the at least one expired record. Specifically, since the records 12a, 12b . . . , 12c of the sliding table are stored in the memory space 101 sequentially (i.e. in a sequential order of the time stamps of the records 12a, 12b . . . , 12c), the processor 15 may check whether records are expired starting from the record pointed to by the head pointer H1. In
In some embodiments, the aforementioned identification of the at least one expired record and movement of the head pointer H1 may be tied to a select operator and/or a query operator of a database system. For those embodiments, when the interface 13 receives an instruction 14 including a select operator or a query operator, the processor 15 will identify at least one expired record (if any) in the sliding table and move the head pointer H1 before performing the select operator or the query operator. In some other embodiments, the aforementioned identification of the at least one expired record and movement of the head pointer H1 may be performed by the processor 15 on a regular basis. Please note that the preset time bound may be adjusted by a user according to the speed of the incoming records so that the memory space 101 always has available record space for the incoming record(s). Please also note that different utilizations of sliding tables require different preset time bounds.
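The following is a minimal sketch of tying expiry to a select operation. It uses Python's `collections.deque` as a stand-in for the head and tail pointers (removing from the left plays the role of advancing the head pointer), and the function name `select`, the record layout, and the parameter names are assumptions made for illustration only:

```python
from collections import deque
from typing import Any, Callable, Deque, List, Tuple

Record = Tuple[float, Any]  # (time_stamp, payload); layout is an assumption


def select(table: Deque[Record], predicate: Callable[[Any], bool],
           now: float, time_bound: float) -> List[Any]:
    """Drop expired records from the front, then evaluate the selection."""
    # Records are kept in time-stamp order, so expiry stops at the first live record.
    while table and now - table[0][0] > time_bound:
        table.popleft()  # stand-in for advancing the head pointer
    return [payload for _, payload in table if predicate(payload)]


# Hypothetical data: three records inserted at times 0, 4 and 9.
table: Deque[Record] = deque([(0.0, 3), (4.0, 8), (9.0, 10)])
result = select(table, lambda v: v % 2 == 0, now=12.0, time_bound=10.0)
```

With a time bound of 10 and the query issued at time 12, only the record stamped 0 is dropped before the selection runs, so the query never sees expired data.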
According to the above descriptions, the data management apparatus 1 manages a sliding table by allocating a memory space 101 for the sliding table, defining a circular order for the record spaces 101a, . . . , 101z comprised in the memory space 101, assigning the tail pointer T1 to point to the following address of the most recent record of the sliding table, and assigning the head pointer H1 to point to the beginning address of the least recent record of the sliding table. When any record(s) need to be inserted, the data management apparatus 1 stores the record(s) in the sliding table from the address pointed to by the tail pointer T1 and then moves the tail pointer T1. On some occasions (e.g. receiving an instruction comprising a select operator or a query operator), the data management apparatus 1 identifies expired record(s) starting from the record pointed to by the head pointer H1 and then moves the head pointer H1. With the head pointer H1 and the tail pointer T1, the sliding table “slides” in the memory space 101. By identifying expired records, moving the head pointer H1, and even adjusting the preset time bound, the memory space 101 will have room for the incoming record(s). Therefore, the technology of the data management apparatus 1 in the first embodiment can be applied to electronic computing apparatuses of any level of computational ability.
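The operations summarized above can be sketched as follows. This is an illustrative sketch under stated assumptions rather than the claimed implementation: record spaces are modeled as slots of a Python list, pointers as list indices, the time bound is expressed in seconds, and a record counter (`count`) is added to detect a full table. The class and method names are hypothetical.

```python
import time
from typing import Any, List, Optional, Tuple


class SlidingTable:
    """Fixed-capacity ring of (time_stamp, payload) records with head/tail indices."""

    def __init__(self, capacity: int, time_bound: float) -> None:
        self.slots: List[Optional[Tuple[float, Any]]] = [None] * capacity
        self.capacity = capacity
        self.time_bound = time_bound  # preset time bound (assumed to be in seconds)
        self.head = 0                 # index of the least recent record
        self.tail = 0                 # index of the next available record space
        self.count = 0                # live records (assumption: used to detect "full")

    def insert(self, payload: Any, now: Optional[float] = None) -> None:
        """Store a record at the tail, then advance the tail in circular order."""
        if self.count == self.capacity:
            raise OverflowError("sliding table is full")
        stamp = time.time() if now is None else now
        self.slots[self.tail] = (stamp, payload)
        self.tail = (self.tail + 1) % self.capacity
        self.count += 1

    def expire(self, now: Optional[float] = None) -> int:
        """Advance the head past records older than the time bound; return how many."""
        current = time.time() if now is None else now
        removed = 0
        # Time stamps are ordered, so the scan stops at the first unexpired record.
        while self.count > 0 and current - self.slots[self.head][0] > self.time_bound:
            self.head = (self.head + 1) % self.capacity
            self.count -= 1
            removed += 1
        return removed

    def records(self) -> List[Tuple[float, Any]]:
        """Return live records from least recent to most recent."""
        return [self.slots[(self.head + i) % self.capacity] for i in range(self.count)]


# Hypothetical usage with explicit clock values instead of wall-clock time.
table = SlidingTable(capacity=4, time_bound=10.0)
for t, name in [(0.0, "a"), (5.0, "b"), (12.0, "c")]:
    table.insert(name, now=t)
removed = table.expire(now=13.0)
```

Note that expiring a record only moves the head index; the slot contents are not erased and are simply overwritten when the tail reaches them again, which is what lets the table "slide" without explicit deletion.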
Please refer to
Please refer to
As shown in
Please refer to
According to the above descriptions, the data management apparatus 1 will expand the memory space 101 for the sliding table by allocating another memory space 102 when it is necessary (e.g. when the memory space 101 for the sliding table is full). Therefore, in addition to the advantages described in the first embodiment, this embodiment can deal with the burst of the incoming records.
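As a rough sketch of the expansion idea only, the following models each memory space as a fixed-size segment and allocates a further segment when the current one is full. How the head and tail pointers traverse the memory spaces 101 and 102 is defined by the embodiment and its figures and is not reproduced here; the class name, the segment model, and the method names are assumptions for illustration.

```python
from typing import Any, List, Tuple


class ExpandableSlidingTable:
    """Sliding table that adds a fixed-size segment whenever the last one is full."""

    def __init__(self, segment_size: int) -> None:
        self.segment_size = segment_size
        # The first segment stands in for the initially allocated memory space.
        self.segments: List[List[Tuple[float, Any]]] = [[]]

    def insert(self, stamp: float, payload: Any) -> None:
        """Append a record, allocating another segment first if necessary."""
        if len(self.segments[-1]) == self.segment_size:
            self.segments.append([])  # stand-in for allocating another memory space
        self.segments[-1].append((stamp, payload))

    def segment_count(self) -> int:
        """Return how many segments are currently allocated."""
        return len(self.segments)


# Hypothetical burst of five records into segments that hold two records each.
table = ExpandableSlidingTable(segment_size=2)
for i in range(5):
    table.insert(float(i), f"r{i}")
```

In this sketch a burst of five records causes two additional segments to be allocated, so no incoming record is rejected.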
Please refer to
In the third embodiment, the data management apparatus 1 allocates a plurality of memory spaces of the in-memory database 11 for a plurality of sliding tables individually. One sliding table serves as the fact table from a data stream, while the others serve as the dimension tables from the same or different data stream(s). The fact table may contain one or more foreign keys that refer to the records in the dimension tables. Each of the dimension tables contains a primary key used to identify a unique record.
A concrete example is given herein. The memory spaces 101, 103, 104 are respectively allocated for the sliding tables 61, 63, 64 as shown in
The data management apparatus 1 can manage the sliding tables 61, 63, 64 in the ways described in the aforementioned embodiments, except that a slightly different way is used for inserting new records into the sliding tables with primary keys (i.e., the dimension tables 63 and 64, which have the primary keys PK1, PK2 respectively). The processor 15 needs to ensure that the most recent record with a given key is always the one used when a record is referenced by key (i.e., at select or join). Any previous record with the same key becomes obsolete when the new one is inserted.
In an example, the sliding table 61 is a fact table comprising sales transaction record(s). The sliding table 61 defines a plurality of first attributes and one of them is a foreign key FK1 (e.g. customer's social security number). The sliding table 63 is a dimension table comprising customer information. The sliding table 63 defines a plurality of second attributes and one of them is the primary key PK1 (e.g. customer's social security number). Since the foreign key FK1 of the sliding table 61 and the primary key PK1 of the sliding table 63 correspond to the same type of information (e.g. customer's social security number), the sliding table 61 and the sliding table 63 are considered to have a common key. The processor 15 may apply a join operation to the sliding table 61 and the sliding table 63 according to the common key.
When a new record is inserted into a sliding table with a primary key (i.e., a dimension sliding table), the processor 15 first locates the old record with the same primary key and, if it exists, marks it invalid before adding the new record to the end of the sliding table and moving the tail pointer. When identifying expired records, the processor 15 needs to search through both valid and invalid records starting from the head pointer. The expired records are then removed by moving the head pointer. In
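One way to keep only the most recent record per key in use is to flag the earlier record as invalid at insertion time. The sketch below assumes a Python list as the table and an auxiliary dictionary from primary key to the position of the valid record; the dictionary, the class name `DimensionTable`, and the field names are illustrative assumptions, and time-based expiry is left out to keep the example short.

```python
from typing import Any, Dict, List, Optional


class DimensionTable:
    """Append-only table in which the newest record per primary key is the valid one."""

    def __init__(self) -> None:
        self.rows: List[Dict[str, Any]] = []  # stored in time-stamp order
        self.latest: Dict[Any, int] = {}      # primary key -> position of the valid row

    def insert(self, key: Any, stamp: float, value: Any) -> None:
        """Invalidate any earlier row with the same key, then append the new row."""
        old = self.latest.get(key)
        if old is not None:
            self.rows[old]["valid"] = False   # the earlier record becomes obsolete
        self.rows.append({"key": key, "stamp": stamp, "value": value, "valid": True})
        self.latest[key] = len(self.rows) - 1

    def lookup(self, key: Any) -> Optional[Any]:
        """Return the value of the most recent record for `key`, if any."""
        pos = self.latest.get(key)
        return None if pos is None else self.rows[pos]["value"]


# Hypothetical customer records; "c1" is inserted twice.
customers = DimensionTable()
customers.insert("c1", 1.0, "old address")
customers.insert("c2", 2.0, "other address")
customers.insert("c1", 3.0, "new address")
current = customers.lookup("c1")
```

The obsolete row stays in place after it is flagged, so the table remains ordered by time stamp and the row can still be reclaimed later when the head pointer passes it.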
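The join on a common key can be illustrated with a small sketch. It assumes that each table is available as a list of dictionaries in time-stamp order and performs an inner join; the function name `join_on_key`, the column names, and the sample data are hypothetical.

```python
from typing import Any, Dict, List


def join_on_key(fact_rows: List[Dict[str, Any]],
                dimension_rows: List[Dict[str, Any]],
                foreign_key: str,
                primary_key: str) -> List[Dict[str, Any]]:
    """Inner-join fact rows with the most recent dimension row per key."""
    # Later dimension rows overwrite earlier ones, so the newest record per key wins.
    by_key = {row[primary_key]: row for row in dimension_rows}
    joined = []
    for fact in fact_rows:
        match = by_key.get(fact[foreign_key])
        if match is not None:
            joined.append({**match, **fact})
    return joined


# Hypothetical sales transactions (fact) and customer information (dimension).
sales = [{"customer_id": "A", "amount": 30}, {"customer_id": "B", "amount": 12}]
customers = [{"id": "A", "name": "Ann"}, {"id": "A", "name": "Ann Lee"}]
result = join_on_key(sales, customers, "customer_id", "id")
```

Because the dimension rows are processed in time-stamp order, the second record for customer "A" replaces the first, and the transaction for "B" is dropped since no matching dimension record exists.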
Please refer to
Please refer to
A fifth embodiment of the present invention is a data management method and a flowchart of which is illustrated in
Please refer to
Please refer to
In some embodiments, the flowchart illustrated in
It is possible that the first sliding table is empty at times. When the first head pointer and the first tail pointer point to the same memory address, the first sliding table is empty.
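In a circular layout, a completely full table would also bring the tail pointer back to the head pointer, so an implementation needs some way to tell the two states apart. The sketch below shows one common convention, assumed here for illustration and not stated in the embodiment: one record space is always left unused. Another common choice is to keep a separate record count. The function names and index-based pointers are assumptions.

```python
def is_empty(head: int, tail: int) -> bool:
    """The table is empty when both pointers refer to the same record space."""
    return head == tail


def is_full(head: int, tail: int, num_spaces: int) -> bool:
    """The table is full when advancing the tail would make it meet the head."""
    # Assumption: one record space is kept unused so that a full table
    # is never confused with an empty one.
    return (tail + 1) % num_spaces == head


# Hypothetical pointer positions in a memory space with four record spaces.
states = (is_empty(0, 0), is_full(0, 3, 4), is_full(1, 0, 4), is_full(0, 1, 4))
```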
It is also possible that the first memory space for the sliding table is full at times. In this embodiment, the data management method will execute a step (not shown) for adjusting the preset time bound when the sliding table is full. In some other embodiments, the data management method will expand the memory space for the first sliding table by the procedure shown in
In some embodiments, the data management method may manage several sliding tables. For example, the data management method may further allocate a second memory space of the in-memory database for a second sliding table and have a plurality of memory addresses of the second memory space defined with a second circular order. For those embodiments, if the first sliding table and the second sliding table have a common key, the data management method may execute another step for applying a join operation to the first sliding table and the second sliding table according to the common key.
In addition to the aforesaid steps, the fifth embodiment is able to execute all the operations and steps of the data management apparatus 1 set forth in the previous embodiments, has the same functions, and delivers the same technical effects as the previous embodiments. How the fifth embodiment executes these operations and steps, has the same functions, and delivers the same technical effects as the previous embodiments will be readily appreciated by those of ordinary skill in the art based on the explanation of the previous embodiments and, thus, will not be further described herein.
The data management method described in the fifth embodiment may be implemented by a computer program having a plurality of codes. The computer program is stored in a non-transitory computer readable storage medium. When the codes are loaded into an electronic apparatus (e.g. the data management apparatus 1), the computer program executes the data management method as described in the fifth embodiment. The non-transitory computer readable storage medium may be a read only memory (ROM), a flash memory, a floppy disk, a hard disk, a compact disk (CD), a mobile disk, a magnetic tape, a database accessible to networks, or any other storage media with the same function and well known to those skilled in the art.
It shall be appreciated that, in the specification of the present invention, the terms “first” and “second” used in the first memory space and the second memory space are only intended to distinguish these memory spaces from each other. In addition, the terms “first” and “second” used in the first circular order and the second circular order are only intended to distinguish that these circular orders are different.
According to the above descriptions, the data management technology provided by the present invention allocates a memory space in an in-memory database for a sliding table and manages data therein. Briefly speaking, a plurality of records stored in the sliding table has a sequential order, wherein the sequential order is determined according to the time stamps of the records. A head pointer points to the beginning address of the least recent record among the records, while a tail pointer points to the following address of the most recent record among the records. With the tail pointer, the present invention can easily locate a storing space in the sliding table for a new record. With the head pointer, the present invention can easily identify the expired record(s) according to the time stamp(s) and a preset time bound. When the memory space for the sliding table is full, the data management technology provided by the present invention allocates another memory space for the sliding table. Therefore, there is always room for the coming records. The data management technology provided by the present invention is able to manage several sliding tables. In that case, database operations (e.g. join) can be applied to these sliding tables. As a result, the data management technology of the present invention can utilize the limited storage space appropriately and can be applied to electronic computing apparatus of every kind of computational ability, especially the electronic computing apparatus that processes streaming data.