This is a U.S. national phase application under 35 USC 371 of international application PCT/JP2018/034786, filed Sep. 20, 2018, which claims priority to Japanese patent application No. 2017-216648, filed on Nov. 9, 2017. The entire disclosures of the above applications are incorporated herein by reference.
The present invention relates to an information accumulation apparatus which can quickly read and write IoT data, a data processing system, and a program.
In recent years, in a system which is constituted by connecting various devices, such as sensors capable of acquiring various data, to a network, attention has been paid to the IoT (Internet of Things) which utilizes data generated by the devices.
In the system using the IoT, since data are continuously or periodically acquired from various devices, a database that can quickly write and read a great amount of data, which is successively generated with the passing of time, is needed.
In addition, in the fields using the IoT, in many cases, various data are acquired by using a large number of various devices. Thus, it is required that the devices to be used be inexpensive. Therefore, in many cases, it is difficult to utilize expensive devices including large-capacity memories.
On the other hand, conventionally, as a database in which data successively generated with the passing of time are written, use is made of a postscript (append-type) database such as PostgreSQL (see, for example, Non Patent Literature 1).
In some postscript databases, a process of, for example, creating an index is performed when data is written to a disk. The index in this context refers to search data which is created so as to quickly search or extract data written in a file.
In general, as the number of indices, or the number of kinds of indices, generated for data whose write is requested by an application increases, the generation of indices and the write of index data add a corresponding overhead to the entire write process.
In addition, since the size of data is not limited to a fixed length, if postscript is performed without executing a special process at a time of data write, an operation amount O(n) of linear search of data becomes necessary at a time of data read, and there arises a problem that the read speed lowers in accordance with an increase of data (see, for example, Non Patent Document 2).
A process of, for example, index generation at a time of data write increases the read speed of data, but it is desirable to decrease as much as possible a processing load at a time of data write in a database for the IoT in order to secure a write speed.
A feature of IoT data is that data sequentially generated by devices with the passing of time arrive at the database in the order of generation, unless the order of data changes while the data is transmitted through a network or the like.
In addition, in the IoT, in many cases, necessary data is read out from the database by designating a time instant. Thus, a database is required, in which the speeds of search and read processes based on time instants are high. In particular, since there are many applications which refer to latest values at most recent time instants, it is also important that the read speed of latest values is high.
In this manner, in the fields using the IoT, such a database is required that the data, which are sequentially generated by the devices with the passing of time and arrive at the database in the order of generation, are quickly written in the database, and the data can be quickly searched and read out by designating information such as date/time or the like (a specific time instant, a time range, and a most recent time instant).
As described above, in the conventional art, use is made of the method in which the read speed is increased by providing indices on memory areas of the database. However, there is a trade-off relation between the size of indices and the search performance.
Further, since the amount of data used in the IoT is enormous, the number of indices is also enormous as a matter of course. Thus, there is a problem that processing with saving of a memory is difficult.
Besides, there is a secondary problem. In the IoT, there is a case in which the data generated by the devices and media data with a greater data amount than this data, such as images and moving images, are processed by a common database. However, it is difficult to accumulate these data by a common scheme, because of differences between data sizes and characteristics.
The present invention has been made in consideration of the above problems. The object of the invention is to provide an information accumulation apparatus which, firstly, enables saving of a memory and high-speed data read/write by minimizing processing at a time of write, and which, secondly, enables, by a common database, high-speed read/write of data generated by devices and media data such as images, and to provide a data processing system, and a program.
An information accumulation apparatus according to the present invention comprises: an index recording unit configured to record an index file including index information which is a data sequence including a time instant of generation of data, and position information of the data to be recorded, the index information being arranged in an order of time instants of generation of the data, the index file including a file name corresponding to the time instant of the generated data; an actual data recording unit configured to record an actual data file including a file name having a rule which enables unique and mutual correlation by a first calculation amount with the file name of the index file, the actual data file being configured such that the data is written to the area indicated by the position information described in the index information; a write processor configured to write the data to the actual data file, and to write the position information of the data which is written in the actual data file, and the time instant of the generation of the data, to the index file as the index information; and a read processor configured to read, when data that is a read target is designated based on a time instant, the data from the actual data file, based on the index information in the index file including the file name corresponding to the time instant.
According to the present invention, firstly, saving of a memory and high-speed data read/write are enabled by minimizing processing at a time of write, and, secondly, high-speed read/write of data generated by devices and media data such as images is enabled by a common database.
Hereinafter, an information accumulation apparatus, a data processing system and a program according to embodiments of the present invention will be described with reference to the accompanying drawings.
The data processing system is configured to include the information accumulation apparatus 10, a message broker 20, devices (sensors) 30 (30a to 30c), a Web server 40, and applications 50 (50a to 50c).
The devices (sensors) 30 (30a to 30c) correspond to data generation apparatuses each having a function of generating various data d which are measured continuously or periodically, and time instants t at which the data d are measured, and transmitting the data d and time instants t to the message broker 20. Hereinafter, the device (sensor) 30 is simply referred to as “device 30”.
In the present embodiment, the unique characteristic of the IoT, i.e. the characteristic that the data d transmitted from the devices 30 are transmitted to the message broker 20 in the order of generation, is utilized, and the data d are managed by the information accumulation apparatus 10.
Thus, the device 30 is configured such that the data d and time instants t transmitted to the information accumulation apparatus 10 do not need to be accumulated in the device 30.
Thereby, in the present data processing system, the devices 30, which are inexpensive, can be adopted.
The message broker 20 has functions of receiving the data d generated by the devices 30 and the time instants t at which the data d are generated (hereinafter, also referred to as “generation time instants t”), and transmitting the data d and time instants t into the information accumulation apparatus 10.
The information accumulation apparatus 10 has a function of writing in a data recording unit 12 the data d received from the message broker 20, according to the generation time instants t by the devices 30, and a function of reading the data d from the data recording unit 12 according to a data read request with the designated time instant t from the application 50 via the Web server 40.
Upon receiving the data request from the application 50 (50a to 50c), the Web server 40 accesses the information accumulation apparatus 10, receives the data d that is read from the data recording unit 12 of the information accumulation apparatus 10, and transmits the data d to the application 50 (50a to 50c).
The information accumulation apparatus 10 includes a data processing controller 11 and the data recording unit 12.
The data processing controller 11 includes a CPU (Central Processing Unit) which controls the information accumulation apparatus 10 according to a data processing control program prestored in a storage unit (not shown) such as a ROM. The CPU, which operates according to the data processing control program, functions as a write processor 11a which executes a write process, and as a read processor 11b which executes a read process.
In addition, the write processor 11a of the data processing controller 11 has a function of writing the received data d in the actual data recording unit 12b according to a write process, which will be described in detail in the description of the operation below, and writing, in an index recording unit 12a, “time t (date/time) of generation, amount (size) of data d, and write area (read start position) of data d in the actual data recording unit 12b” as index information IJ composed of a data sequence of a fixed length relating to the data d, in the order of time instants of the generation of the data d by the devices 30.
Furthermore, the read processor 11b of the data processing controller 11 has a function of searching and acquiring, by using a specific time instant t or specific time range as a query (information request), an index file (1 to n, n: a natural number) including the time instant t from the index recording unit 12a, according to a read process, which will be described in detail in the description of the operation below, and also has a function of searching the index information IJ which is provided in the searched and acquired index file and includes the time instant t in the data sequence.
Note that when the index file 1 to index file n are not distinguished, the index file is simply referred to as “index file”.
The data recording area 14 corresponds to the data recording unit 12 which is provided on the outside of the information accumulation apparatus 10, and may be configured to be connected to the information accumulation apparatus 10 via, for example, a bus BUS.
Specifically, the data recording area 14 includes an index recording unit 14a and an actual data recording unit 14b, which have the same functions as the index recording unit 12a and actual data recording unit 12b. Hereinafter, a description will be given by using the data recording unit 12.
Hereinafter, attention is paid to the index file 1, and a description of the other index files 2 to n is omitted.
As illustrated in the Figure, index information IJ1 to index information IJm are sequentially recorded from a starting line (first line) to a last line (m-th line) in the index file 1. When index information IJ1 to index information IJm are not distinguished, the index information is simply referred to as “index information IJ”.
In each index information IJ, “information relating to data d (date/time, size, read start position)” is described. In the first line, for example, the index information IJ1 including an earliest time instant t in the index file 1 is disposed. In the last m-th line, the index information IJm including a latest time instant t in the index file 1 is disposed. Note that the date/time may be information which is created by converting the date/time.
In addition, as an example of the index information IJ, use may be made of 32-byte information composed of “time instant expressed by 64-bit binary expression”, “data size”, “data record position POS in actual data file” and “system clock at time of data write”.
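As an illustration of the 32-byte example above, the following minimal Python sketch packs one line of index information IJ into a fixed-length record. The field order, the little-endian byte order, and the choice of four unsigned 64-bit fields are assumptions made only for this sketch; the description fixes only the total of 32 bytes and the 64-bit time instant. The function names are hypothetical.

```python
import struct

# Assumed layout (not fixed by the description): four little-endian
# unsigned 64-bit fields, giving 4 x 8 = 32 bytes per index record.
INDEX_RECORD = struct.Struct("<QQQQ")


def pack_index_record(t, size, pos, clock):
    """Encode one index line: generation time, data size, position, write clock."""
    return INDEX_RECORD.pack(t, size, pos, clock)


def unpack_index_record(raw):
    """Decode a 32-byte index line back into (t, size, pos, clock)."""
    return INDEX_RECORD.unpack(raw)


# Example values are arbitrary and chosen only for illustration.
record = pack_index_record(1_510_185_600_000, 12, 0, 1_510_185_600_123)
```

Because every record has the same length, the k-th line of an index file starts at byte offset 32 × k, which is what makes the binary search over index lines possible.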
Further, if the index information up to the index information IJm is recorded in the index file 1, and the file 1 reaches an allowable capacity, a new index file 2 is generated, and index information IJ is sequentially recorded from the first line of the index file 2.
In order to enhance the data read/write speed and the efficiency of use of the storage area, the file size (capacity) of the index file can be set to a multiple of the paging size of an OS (Operating System).
A process in which a kernel of the OS executes read/write from/to an auxiliary storage device such as an HDD (Hard Disk Drive) or SSD (Solid State Drive) is executed in units of a paging size. Thus, by setting the size of the index file, which is particularly frequently read/written, to a multiple of the paging size, the read/write speed of the index file and the efficiency of use of the storage area are enhanced.
Here, the paging size refers to, for example, “a unit of a data sequence constituted by data recorded in memory cells MC of 1 row×k columns” in the memory cells MC which constitute a memory cell array MCA and are arranged, for example, in 1 row×k columns.
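The sizing rule can be sketched as follows, assuming the 32-byte index record of the earlier example and taking the page size reported by Python's `mmap.PAGESIZE` as the OS paging size. The function name and the `pages` parameter are hypothetical.

```python
import mmap

RECORD_SIZE = 32  # bytes per index line, as in the 32-byte example


def index_file_capacity(pages, page_size=mmap.PAGESIZE):
    """Return (file size in bytes, number of index lines) for a file of `pages` pages."""
    if pages <= 0:
        raise ValueError("pages must be a positive integer")
    file_size = pages * page_size          # always a multiple of the page size
    return file_size, file_size // RECORD_SIZE
```

For instance, with a 4096-byte page, an index file of four pages holds 512 index lines and never straddles a partial page.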
Note that an index file 3, . . . , an index file n are newly created when necessary, and the index information IJ is sequentially recorded from the index file 3 to the index file n.
In addition, file names, which are set for the index file 3, . . . , index file n, may be created by adopting any one of the following first to third methods.
Specifically, firstly, as an example of a predetermined criterion for selecting, for example, one of dates/times described in the index information IJ, the date/time of the data d described in the first line of the index file is selected, and this may be used as an index file name.
Secondly, as an example in which an index file name and an actual data file name are set by a simple rule which can execute mutual conversion with a small calculation amount, by using the date/time of the data d selected in the index file, a date/time expressed by 64-bit integer with no sign may be described in the index information IJ, and, besides, a file name including a character string, in which a date/time described by a 128-bit binary expression is coupled by hexadecimal ASCII characters in units of four bits, may be used as these file names.
Thirdly, as an example of the combination between the criterion and the rule, that one of the dates/times described by 64-bit binary expressions in the index information IJ, which is described in the first line of the index file, is selected, and a character string expressed by 128-bit hexadecimal notation may be used as parts of the index file and actual data file.
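One possible reading of the naming rule is sketched below: a 64-bit date/time becomes 16 hexadecimal ASCII characters (16 characters × 8 bits = 128 bits), and the index file name and the actual data file name share this character string. The suffixes `.idx` and `.dat` and all function names are hypothetical and are introduced only to distinguish the two files in this sketch.

```python
INDEX_SUFFIX = ".idx"  # hypothetical suffix for index files
DATA_SUFFIX = ".dat"   # hypothetical suffix for actual data files


def time_to_stem(t):
    """Convert a 64-bit unsigned time instant to 16 hexadecimal ASCII characters."""
    if not 0 <= t < 2 ** 64:
        raise ValueError("time instant must fit in an unsigned 64-bit integer")
    return format(t, "016x")


def stem_to_time(stem):
    """Inverse conversion: recover the time instant from the hexadecimal string."""
    return int(stem, 16)


def index_file_name(t):
    """Name of the index file whose first line has generation time t."""
    return time_to_stem(t) + INDEX_SUFFIX


def data_file_name_for(index_name):
    """Derive the actual data file name from the index file name."""
    if not index_name.endswith(INDEX_SUFFIX):
        raise ValueError("not an index file name")
    return index_name[: -len(INDEX_SUFFIX)] + DATA_SUFFIX
```

Both directions of the conversion are simple string operations, and zero-padded hexadecimal names sort lexicographically in the same order as the time instants they encode.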
Hereinafter, attention is paid to the actual data file 1, and a description of the other actual data files 2 to n is omitted.
As illustrated in the Figure, in the actual data file 1, from the beginning, for example, data d1 is recorded in an area 1 (POSAL1), data d2 is recorded in an area 2 (POSAL2), . . . , and data dm is recorded in an area m (POSALm). Note that POSAL1 to POSALm are position information in the actual data file 1 in which the data d of write targets are recorded. For example, an arbitrary sign for distinguishing the data d1 and data d2 may be inserted between the data d1 and data d2.
In addition, the data d1 to data dm correspond to the index information IJ1 to index information IJm from the first line to m-th line recorded in the index file 1.
Specifically, the data d1 to data dm corresponding to the index information IJ1 to index information IJm recorded in the index file 1 are recorded in the actual data file 1.
The same relationship is established between, for example, the index file n and actual data file n.
In addition, in order to easily and quickly enable a read process of data d from the actual data file on the basis of the index file, for example, a file name corresponding to the index file 1 is given to the actual data file 1. The same relationship is established between the index file n and actual data file n.
Even when BLOB (Binary Large Object) data of an image, a moving image and audio with large capacity, which will be described below, is recorded in, for example, the actual data file 1, the same correspondence relation is maintained between the actual data file 1 and index file 1.
Specifically, the name of an actual data file, in which BLOB data is recorded, is set based on a generation time instant of the corresponding BLOB data described in the index information IJ in the index file. Concretely, the actual data file name and the generation time instant of the BLOB data are correlated by a simple rule which enables unique and mutual conversion with a small calculation amount.
Hereinafter, attention is paid to the actual data file 1, and a description of the other actual data files 2 to n is omitted.
As illustrated in the Figure, in the actual data file 1, one BLOB data is recorded. Specifically, if an n-number of actual data files 1 to n are recorded in the actual data recording unit 12b, an n-number of BLOB data in total are recorded.
By adopting this configuration, the data d generated by the device 30 and the BLOB data of an image, a moving image or the like can be processed by a common database. For example, such a problem is solved that it is difficult to accumulate the data d and BLOB data by a common scheme because of differences in size and characteristic between the data d and BLOB data.
In the information accumulation apparatus 10 having the above configuration, the CPU in the data processing controller 11 controls the operations of the respective components according to instructions described in the data processing control program, and the software and hardware cooperate to perform the functions of various data write processes and data read processes, which will be described in the description of operations below.
Next, (write and read) operations by the information accumulation apparatus 10 will be described.
[Write Process (t, d)]
Specifically,
Note that the subroutine A(t) and subroutine B(It, t) are process flows which are used when the data d do not arrive at the information accumulation apparatus 10 in the order of generation of the data d, and the data d whose order was reversed while traveling through the communication path is written by interrupt in the index file and actual data file in which the data d is to be written.
Here, in
tL: a generation time instant t of latest data d of all written data d,
IL: an index file including index information IJ relating to the data d of the time instant tL, and
AL: an actual data file including the data d of the time instant tL.
To start with, the initialization process is executed, and an earlier time instant (e.g. 19XX, YY (month)/ZZ (day), NN (hour), MM (minute), LL (second)) than generation time instants t (e.g. 20xx, yy (month)/zz (day), nn (hour), mm (minute), ll (second)) of all data d, which will be generated by the devices 30 in the future, is substituted for the parameter tL (step P1).
Then, “NULL” is substituted for the parameter IL for the index file (step P2), and similarly “NULL” is substituted for the parameter AL for the actual data file in the actual data recording unit 12b, and thus the initialization process is finished (step P3).
By this initialization process, the index recording unit 12a and actual data recording unit 12b are set in an empty state in which neither an index file nor an actual data file is generated, and, as a matter of course, data d is not recorded.
Next, by the subroutine A(t) process illustrated in
Here, each index file name is described by binary data composed of a data sequence of a fixed length (e.g. a data sequence of “0” and “1” of 8 bytes). Thus, a file can be quickly searched by a system call of the OS, such as a binary search that is one of search methods.
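The file search can be sketched as a binary search over sorted, fixed-length file names. The sketch assumes that each index file is named by the 16-character hexadecimal form of the earliest time instant it contains, and it uses Python's `bisect` module in place of whatever OS facility an implementation would use. The function name is hypothetical.

```python
from bisect import bisect_right


def find_index_file(sorted_names, t):
    """Return the name of the index file that should contain time instant t.

    `sorted_names` is an ascending list of fixed-length hexadecimal names.
    The result is the file with the latest starting time at or before t,
    or None when t is earlier than every file.
    """
    key = format(t, "016x")
    i = bisect_right(sorted_names, key)   # number of names <= key
    return sorted_names[i - 1] if i else None
```

Because the names have a fixed length, string comparison and numeric comparison agree, so the search needs O(log n) comparisons for n index files.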
Further, by the subroutine B(It, t) illustrated in
The acquired position information POS (line) is substituted for a parameter POSIt (to be described later).
Based on the above-described initialization process, subroutine A(t) process and subroutine B(It, t) process, a main write process to be described below is executed.
Here, in
It: an index file in which index information IJ of data d generated at a time instant t is written,
POSIt: a position at which index information IJ relating to data d at the time instant t is written in It,
At: an actual data file in which data d generated at the time instant t is written,
POSAt: a position at which data d is written in At, and
POSAL: a position at which data d generated at the time instant t is written in AL.
To start with, it is determined whether the generation time instant t of data d that is a write target is later than the parameter tL (step S1). When it is later (step S1, YES), i.e. when it is determined that data d generated at a time instant t, which is later than the value stored in the parameter tL (initially, a time instant earlier than any of generation time instants t, set by the initialization process), normally arrives at the information accumulation apparatus 10 according to the generation time instant t thereof, it is then determined whether an index file has not yet been created in the index recording unit 12a (expressed as IL==NULL in the Figure), or, if an index file is already created, whether the capacity of the index file, in which the data d is to be written, reaches an upper limit (step S2).
As a result, if it is determined that an index file has not yet been created (IL==NULL) or that the capacity of the already created index file reaches the upper limit (step S2, YES), “1” (in the case where an index file has not yet been created) or, for example, “2” (in the case where the capacity of the index file reaches the upper limit) is substituted for each of the parameters IL and AL. Thereby, for example, an index file 1 or an index file 2, in which data d is to be written, is newly created, and an actual data file 1 or an actual data file 2 corresponding to the index file 1 or index file 2 is newly created (step S3).
Hereinafter, the description will be given by using, as an example, the index file 2 and the actual data file 2 corresponding to the index file 2.
Next, data d that is a write target is additionally written to an end of the newly created actual data file 2 (or to a beginning of the newly created actual data file 2 if there is no written data d), and the position of the actual data file 2, in which the data d is written, is substituted for the parameter POSAL for position information (step S4).
Then, as index information IJ, “date/time (time instant t at which data d is generated), size (amount of data d) and read start position (parameter POSAL in which position information POS of a position where data d is written in the actual data file 2 is stored)” are additionally written to a last line of the index file 2 (or a first line if there is no written data d) (step S5), and a generation time instant t of latest data d written in the actual data file 2 is substituted for the parameter tL (step S6), thus preparing for a write process of data d which is generated at the next time instant.
Specifically, at this time point, the value of the parameter tL becomes a latest data generation time instant t among written data d recorded in the actual data recording unit 12b.
On the other hand, in step S1, if the time instant t of the generation of the data d by the device 30 is earlier than the tL (step S1, NO), i.e. if it is determined that part of the data d arrives at the information accumulation apparatus 10 with a delay, the processing result of the subroutine A(t) (an index file name (data sequence composed of date/time) in which data d generated at time instant t is to be written) is substituted for the It (step S7).
Next, the processing result of the subroutine B(It, t) (the position information POS of data d to be written, which is described in the index information IJ including the time instant t, based on the index file searched and acquired in the subroutine A(t)) is substituted for the parameter POSIt (step S8).
Thereafter, the data d generated at the time instant t is additionally written to an end in the actual data file having the file name corresponding to the index file name acquired in step S7, and position information POS of this end is substituted for the parameter POSAt (step S9). Specifically, the parameter POSAt becomes position information of the end to which the data d is written in the actual data file.
In addition, “date/time (time instant t at which data d is generated), size (amount of data d) and read start position (parameter POSAt of the end)” are written by interrupt to that line in the index file, which corresponds to the parameter POSIt (step S10).
In this manner, the write process (step S1 (YES)→step S2 to step S6) in the case where the data d arrive in the order of time instants of generation of the data d can be executed by the postscript of the index information IJ to the index file and the postscript of the data d to the actual data file, or, in addition to these postscripts, by only the accumulation of generated data d by the new creation of the index file and actual data file. Thus, memory copy involved in sort, insert, etc., does not occur, and data d can be quickly written.
In addition, even when the generated data d do not arrive at the information accumulation apparatus 10 in the order of time instants of generation of the generated data d and the time instants of arrival are reversed (step S1 (NO)→step S7 to step S10), for example, the index file, in which data d is to be written, and the position information POS at which the data d is to be inserted, can be quickly specified by the binary search. This is because the system call of the OS, such as the binary search, is utilized by using the fact that each line of the index file and index information IJ1 to index information IJm has a fixed length. Thereby, high-speed move of existing data d, and high-speed write of new data d can be achieved.
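The two write paths can be sketched for a single index file and actual data file pair held in memory. This is a simplified illustration, not the claimed implementation: the creation of new files when the capacity is reached (steps S2 and S3) and the file search of the subroutine A(t) are omitted, the 32-byte record layout is the assumed one of four unsigned 64-bit fields, and the class and method names are hypothetical.

```python
import struct

RECORD = struct.Struct("<QQQQ")  # assumed: time, size, position, write clock


class FilePair:
    """In-memory stand-in for one index file and its actual data file."""

    def __init__(self):
        self.index = bytearray()
        self.data = bytearray()
        self.t_latest = -1  # tL: earlier than any generation time instant

    def _insert_line(self, t):
        """Binary search for the first index line whose time is later than t."""
        lo, hi = 0, len(self.index) // RECORD.size
        while lo < hi:
            mid = (lo + hi) // 2
            (ts,) = struct.unpack_from("<Q", self.index, mid * RECORD.size)
            if ts <= t:
                lo = mid + 1
            else:
                hi = mid
        return lo

    def write(self, t, d, clock=0):
        pos = len(self.data)
        self.data += d                       # data is always appended (S4, S9)
        record = RECORD.pack(t, len(d), pos, clock)
        if t > self.t_latest:                # S1 YES: arrival in order
            self.index += record             # S5: append the index line
            self.t_latest = t                # S6: update tL
        else:                                # S1 NO: delayed arrival
            offset = self._insert_line(t) * RECORD.size   # S8
            self.index[offset:offset] = record            # S10: insert the line

    def times(self):
        """Generation times in index order (for inspection only)."""
        return [RECORD.unpack_from(self.index, off)[0]
                for off in range(0, len(self.index), RECORD.size)]
```

In this sketch the data itself is never moved: a delayed item is appended to the end of the data area, and only its 32-byte index line is inserted at the position found by the binary search, so the index stays ordered by generation time.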
[Read Process (ts, te)]
Specifically,
Specifically, this flowchart is a flowchart in which if a time range T (start time instant ts and end time instant te) is set by a read request of data d by the user, the time range T is designated as a query, and the data d generated in the time range T is read.
In this manner, the read process of data d is executed by designating the time range T of the start time instant ts to the end time instant te as the query. Thus, in the subroutines A(t), B(It, t) and C(Is, ts) to be described below, a process, in which both the start time instant ts and the end time instant te of the read of data d are designated, or one of the time instants t (one of ts or te) is designated, is executed.
In the subroutine A(t) illustrated in
Here, in
Is: an index file including index information IJ of data d generated at a time instant ts,
POSIs: a position at which index information IJ of data d generated at the time instant ts in Is is recorded,
As: an actual data file in which data d generated at the time instant ts is recorded,
POSAs: a position at which data d generated at the time instant ts in As is recorded,
Ie: an index file including index information IJ of data d generated at a time instant te,
POSIe: a position at which index information IJ of data d generated at the time instant te in Ie is recorded,
Ae: an actual data file in which data d generated at the time instant te is recorded, and
POSAe: a position at which data d generated at the time instant te in Ae is recorded.
To begin with, if the subroutine A(ts) is executed, an index file name, in which index information IJ including a time instant that agrees with the read start time instant ts of data d or that is earlier than the start time instant ts and is latest among the earlier time instants is recorded, is searched and acquired (step T1).
A similar process is executed with respect to the end time instant te. Specifically, if the subroutine A(te) is executed, an index file name, in which index information IJ including a time instant that agrees with a read end time instant te of data d or that is earlier than the end time instant te and is latest among the earlier time instants is recorded, is searched and acquired (step T).
Accordingly, by the subroutines A(ts, te), as a search result, one index file (e.g. only index file 1) or index files (e.g. index file 1 and index file 2) are obtained as index files of read targets.
Next, in the subroutine B(It, t) illustrated in
If the subroutine B(Ie, te) is executed, from the index file searched and acquired by the subroutine A(te), position information POS of index information IJ, in which a time instant which agrees with the read end time instant te or which is earlier than the read end time instant te and is latest in the index file is described, is acquired (step T10).
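The search inside one index file can be sketched as a binary search over fixed-length lines. The sketch assumes the 32-byte record layout of four unsigned 64-bit fields with the time instant first, and it treats the index file as a byte string already read into memory. The function name is hypothetical.

```python
import struct

RECORD_SIZE = 32  # assumed fixed length of one index line


def find_line_at_or_before(index_bytes, t):
    """Return the line number of the latest index line whose time is <= t.

    Returns None when every line in the file is later than t.
    """
    lo, hi = 0, len(index_bytes) // RECORD_SIZE
    while lo < hi:
        mid = (lo + hi) // 2
        # The time instant is the first 8 bytes of each fixed-length line.
        (ts,) = struct.unpack_from("<Q", index_bytes, mid * RECORD_SIZE)
        if ts <= t:
            lo = mid + 1
        else:
            hi = mid
    return lo - 1 if lo else None
```

Since the offset of line k is simply 32 × k, each probe reads only 8 bytes, and m lines are searched in O(log m) steps without scanning the file.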
In the subroutine C(Is, ts) illustrated in
If the subroutine C(Is, ts) is executed, from the index file searched and acquired by the subroutine A(ts), position information POS of index information IJ, in which a time instant which agrees with the read start time instant ts or which is earlier than the read start time instant ts and is earliest among the earlier time instants is described, is acquired (step T20).
Based on the above-described subroutine A(ts, te) to subroutine C(Is, ts), a read process of data d illustrated in
To start with, an empty list (“list”) or an empty array (“ar”) for storing the data d to be read in the present read process is stored in a parameter ret (step U1).
Next, a search result (an index file name (data sequence composed of date/time) that is a read start target) acquired by the subroutine A(ts) is substituted for a parameter Is (step U2).
However, when an index file that is a read target is not acquired in the subroutine A(ts) and the parameter Is is “empty” (step U3, YES), information to the effect that no search result is obtained is sent to the application 50 via the Web server 40 (step U4).
On the other hand, when a search result acquired by the subroutine A(ts) is stored in the parameter Is (step U3, NO), a search result of the subroutine A(te) (an index file that is a read end target) is substituted for a parameter Ie (step U5), and then a search result of the subroutine C(Is, ts) (position information POS of index information IJ which agrees with the read start time instant ts, or position information POS in which an earliest time instant in the index file is described) is substituted for a parameter POSIs (step U6).
However, when the position information POS corresponding to the time instant ts is not acquired in the subroutine C(Is, ts) and the parameter POSIs is “empty” (step U7, YES), information to the effect that no search result is obtained is sent to the application 50 via the Web server 40 (step U4).
On the other hand, when the search result is stored in the parameter POSIs (step U7, NO), a search result of the subroutine B(Ie, te) (position information POS of index information IJ in which a time instant, which agrees with the read end time instant te or which is earlier than the read end time instant te and is latest in the index file, is described) is then substituted for a parameter POSIe (step U8). Thereby, in the parameter POSIe, the position information POS, in which data d corresponding to the end time instant te is written, is stored.
Thereafter, the value of the parameter Is (designating the index file including the start time instant ts) is substituted for a parameter x, and the value of the parameter POSIs (position information POS of index information IJ which agrees with the read start time instant ts, or position information POS in which the earliest time instant in the index file is described) is substituted for a parameter y (step U9).
Hereinafter, the value substituted for the parameter x is referred to as “parameter Ix”.
Thereafter, the size of the data d which is read according to the parameter Ix and parameter y, and the (start and end) positions of the read data d are substituted for a parameter (Size, POS) (step U10).
Further, according to the parameter (Size, POS), the data d recorded in the actual data file is read, and the read data d is substituted for a parameter D (step U11).
Thereafter, the data d substituted for the parameter D is additionally written to the end of the parameter ret (the beginning of the parameter ret in the case of first read) (step U12).
Then, it is determined whether the “index file name (data sequence of date/time) including the start time instant ts” stored in the parameter x agrees with the “index file name (data sequence of date/time) including the end time instant te” stored in the parameter Ie (i.e. the read target is complete in one index file), and whether the “read start position POSIs of index information IJ” stored in the parameter y agrees with the “read end position POSIe of index information IJ” stored in the parameter POSIe (i.e. index information IJ of one line is the read target) (step U13).
As a result of step U13, if it is determined that the respective conditions are satisfied, i.e. if the data d of the read target is complete in the index information IJ of one line in a certain single index file (step U13, YES), the data d additionally written to the parameter ret in step U12 is sent to the application 50 via the Web server 40 (step U14).
On the other hand, in step U13, if it is determined that any one of the conditions is not satisfied (step U13, NO), it is first determined whether the read start position POSIs of data d stored in the parameter y agrees with the last line, i.e. the m-th line, of the index file that is a search target (step U15).
If the start position POSIs is not the m-th line (step U15, NO), y=y+1 is set, and the index information IJ of the read target is lowered by one line (step U17). The process returns to step U10, and the read process of the data d with respect to the index information IJ in the line lowered by one line is executed.
Specifically, in the index file of the read target, the data d of a data length corresponding to the parameter Size is read from the corresponding actual data file according to the index information IJ lowered by one line, and then the read data d is stored in the parameter D, and the data d is added to the end of the parameter ret (step U10→step U11→step U12).
Thereafter, if it is determined that the “read start position POSIs of index information IJ” in the line lower by one line at the time of y=y+1 agrees with the “read end position POSIe of index information IJ” stored in the parameter POSIe (step U13, YES), the data d corresponding to the index information IJ of, e.g. two lines, which is added to the parameter ret in step U12, is sent to the application 50 via the Web server 40 (step U14).
In addition, when the start position POSIs is the m-th line (step U15, YES), the number of index files which are read targets is two or more. Thus, x=x+1 is set, and the index file that is the read target is set to be a neighboring index file, and “0 line” is substituted for the value of y, i.e. the start position POSIs (step U16). Thereby, the operation of step U10 onwards is executed from the beginning line (first line) of the index file that is the read target.
By executing the above read process, the data d generated in the time range T designated by the user can be read.
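The flow of steps U1 to U17 can be sketched as follows. This is only an illustrative sketch under stated assumptions, not the implementation of the embodiment: the index files and actual data files are modeled in memory (INDEX_FILES and DATA are hypothetical names), the parameter x is held as the ordinal number of an index file, and the helper functions sub_a, sub_b and sub_c are assumed readings of the subroutines A, B and C. The check that the start line does not lie after the end line is a guard added for this sketch.

```python
from bisect import bisect_left, bisect_right

# Hypothetical in-memory model (an assumption of this sketch): each index file
# is (name, lines), where name is the generation time of its first record and
# lines is a time-sorted list of (t, size, pos). DATA maps the same name to the
# bytes of the corresponding actual data file.
INDEX_FILES = [
    (100, [(100, 2, 0), (105, 3, 2), (110, 2, 5)]),
    (200, [(200, 4, 0), (207, 1, 4)]),
]
DATA = {100: b"aabbbcc", 200: b"ddddE"}


def sub_a(t):
    """Assumed A(t): number of the index file with the latest name <= t."""
    names = [name for name, _ in INDEX_FILES]
    i = bisect_right(names, t) - 1
    return i if i >= 0 else None


def sub_c(x, ts):
    """Assumed C(Is, ts): first line whose time is >= ts, or None."""
    times = [t for t, _, _ in INDEX_FILES[x][1]]
    i = bisect_left(times, ts)
    return i if i < len(times) else None


def sub_b(x, te):
    """Assumed B(Ie, te): last line whose time is <= te, or None."""
    times = [t for t, _, _ in INDEX_FILES[x][1]]
    i = bisect_right(times, te) - 1
    return i if i >= 0 else None


def read_range(ts, te):
    ret = []                                   # U1: empty list
    i_s = sub_a(ts)                            # U2
    if i_s is None:                            # U3, YES
        return None                            # U4: no search result
    i_e = sub_a(te)                            # U5
    pos_s = sub_c(i_s, ts)                     # U6
    if pos_s is None or i_e is None:           # U7 (i_e check added here)
        return None                            # U4
    pos_e = sub_b(i_e, te)                     # U8
    if pos_e is None or (i_s, pos_s) > (i_e, pos_e):
        return None                            # guard added in this sketch
    x, y = i_s, pos_s                          # U9
    while True:
        name, lines = INDEX_FILES[x]
        _, size, pos = lines[y]                # U10: (Size, POS)
        d = DATA[name][pos:pos + size]         # U11: read actual data
        ret.append(d)                          # U12: append to ret
        if x == i_e and y == pos_e:            # U13
            return ret                         # U14: send result
        if y == len(lines) - 1:                # U15: last (m-th) line?
            x, y = x + 1, 0                    # U16: neighboring index file
        else:
            y += 1                             # U17: next line
```

With the sample data above, read_range(105, 200) walks two lines of the first index file and one line of the second, which corresponds to the case where step U16 is taken once.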
[Read Process (treq)]
Specifically,
Here, unlike
Here, in
Ireq: an index file including index information IJ of data d generated at a time instant treq,
POSIreq: a position at which index information IJ of data d generated at the time instant treq in Ireq is recorded,
Areq: an actual data file in which data d generated at the time instant treq is recorded, and
POSAreq: a position at which data d generated at the time instant treq in Areq is recorded.
To begin with, if the subroutine A(treq) illustrated in
In addition, if the subroutine D(Ireq, treq) illustrated in
Based on the above-described subroutine A(treq) and subroutine D(Ireq, treq), a read process illustrated in
To start with, an empty list (list) or an empty array (ar) for storing the data d to be read out in the read process is stored in a parameter ret (step X1).
Next, a search result (an index file name (data sequence composed of date/time) that is a read target) acquired by the subroutine A(treq) is substituted for a parameter Ireq (step X2).
However, when an index file name that is a read target is not acquired in the subroutine A(treq) and the parameter Ireq is “empty” (step X3, YES), information to the effect that no search result is obtained is sent to the application 50 via the Web server 40 (step X4).
On the other hand, when the search result is stored in the parameter Ireq (step X3, NO), a search result of the subroutine D(Ireq, treq) (position information POS described in the index information IJ including the time instant treq) is substituted for a parameter POSIreq (step X5).
As a result of step X5, if the position information POS corresponding to the index information IJ including the time instant treq is not acquired in the subroutine D(Ireq, treq) and the parameter POSIreq is “empty” (step X6, YES), information to the effect that no search result is obtained is sent to the application 50 via the Web server 40 (step X4).
On the other hand, when the search result (the position information POS corresponding to the index information IJ including the time instant treq) is stored in the parameter POSIreq (step X6, NO), the “size” and “read start position” described in the index information including the time instant treq are substituted for a parameter (Size, POS) (step X7).
Thereafter, according to the parameter (Size, POS), the data d of the data length corresponding to the parameter Size, which is recorded in the actual data file having the file name corresponding to the index file name, is read, and the read data d is substituted for the parameter ret (step X8).
Finally, the value stored in the parameter ret (the data d of the read target) is sent to the application 50 via the Web server 40 (step X9).
In this manner, since the file names that are set for the index file 1 to index file n are the time instants of generation of data d, the index file having the target index information IJ can easily and quickly be specified by performing the binary search based on the time instant treq of the read target. Thus, the recording position of the data d of the read target can easily and quickly be determined.
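As an illustration of steps X2 to X9, the following sketch performs the binary search directly on files. It is not the implementation of the embodiment. The 16-byte record layout REC, the ".idx" and ".dat" suffixes, the zero-padded decimal file names, and the helper names sub_a, sub_d, write_sample and demo are all assumptions made for this example.

```python
import os
import struct
import tempfile
from bisect import bisect_right

# Assumed fixed-length index line: time (8 bytes), size (4), position (4).
REC = struct.Struct("<QII")


def write_sample(root):
    """Create one hypothetical index file and its actual data file."""
    records = [(1000, 3, 0), (1010, 2, 3), (1025, 4, 5)]
    with open(os.path.join(root, "0000001000.idx"), "wb") as f:
        for r in records:
            f.write(REC.pack(*r))
    with open(os.path.join(root, "0000001000.dat"), "wb") as f:
        f.write(b"abcdeWXYZ")


def sub_a(root, treq):
    """Assumed A(treq): index file whose name is the latest one <= treq."""
    names = sorted(n[:-4] for n in os.listdir(root) if n.endswith(".idx"))
    i = bisect_right(names, f"{treq:010d}") - 1
    return names[i] if i >= 0 else None


def sub_d(root, ireq, treq):
    """Assumed D(Ireq, treq): line number whose time equals treq, or None."""
    path = os.path.join(root, ireq + ".idx")
    with open(path, "rb") as f:
        lo, hi = 0, os.path.getsize(path) // REC.size
        while lo < hi:
            mid = (lo + hi) // 2
            f.seek(mid * REC.size)          # fixed length: line -> byte offset
            t = REC.unpack(f.read(REC.size))[0]
            if t == treq:
                return mid
            if t < treq:
                lo = mid + 1
            else:
                hi = mid
    return None


def read_at(root, treq):
    ireq = sub_a(root, treq)                            # X2
    if ireq is None:                                    # X3, YES
        return None                                     # X4
    line = sub_d(root, ireq, treq)                      # X5
    if line is None:                                    # X6, YES
        return None                                     # X4
    with open(os.path.join(root, ireq + ".idx"), "rb") as f:
        f.seek(line * REC.size)
        _, size, pos = REC.unpack(f.read(REC.size))     # X7: (Size, POS)
    with open(os.path.join(root, ireq + ".dat"), "rb") as f:
        f.seek(pos)
        return f.read(size)                             # X8, X9


def demo(treq):
    """Run read_at against the sample files in a temporary directory."""
    with tempfile.TemporaryDirectory() as root:
        write_sample(root)
        return read_at(root, treq)
```

Because every index line has the same length, the byte offset of a line is obtained by one multiplication, so each probe of the binary search costs one seek and one short read.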
Thus, according to the information accumulation apparatus 10 with the above configuration, when the data d generated by the devices 30 arrive in the order of their generation time instants t, the data d are appended (additionally written) to the actual data file 12 in the order of arrival. In addition, the generation time instant t of each appended data d, its data size, and its append position in the actual data file are additionally written, as index information IJ, in the index file having the file name corresponding to the actual data file in which the data d are recorded.
Thereby, it should suffice if the index information IJ of the data d is simply generated in the order of arrival of the data d, and the data d are additionally written to the actual data file. In addition, when the capacity of the index file or actual data file reaches the limit, a new index file and a new actual data file are created and the generated data d are simply accumulated in these files. Thus, memory copy due to sort, insert or the like does not occur, and high-speed write of data d is enabled.
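The append-only write described above can be sketched as follows. This is a minimal sketch under assumptions, not the embodiment itself: the class name Appender, the record layout REC, the file suffixes, and the use of a line count (MAX_LINES) as the capacity limit are all hypothetical choices made for illustration.

```python
import os
import struct
import tempfile

REC = struct.Struct("<QII")  # assumed index line: time, size, position
MAX_LINES = 2                # hypothetical capacity limit per index file


class Appender:
    """Appends data d and its index information in order of arrival."""

    def __init__(self, root):
        self.root = root
        self.name = None   # current file name, derived from a generation time
        self.lines = 0     # index lines written to the current index file
        self.offset = 0    # next append position in the actual data file

    def write(self, t, d):
        # When the limit is reached, start a new pair of files named after
        # the generation time of the first record they will hold.
        if self.name is None or self.lines >= MAX_LINES:
            self.name = f"{t:010d}"
            self.lines = 0
            self.offset = 0
        with open(os.path.join(self.root, self.name + ".dat"), "ab") as f:
            f.write(d)                                   # append the data d
        with open(os.path.join(self.root, self.name + ".idx"), "ab") as f:
            f.write(REC.pack(t, len(d), self.offset))    # append index line
        self.lines += 1
        self.offset += len(d)


def demo():
    """Write three records and list the files that were created."""
    with tempfile.TemporaryDirectory() as root:
        w = Appender(root)
        for t, d in [(100, b"aa"), (110, b"bbb"), (120, b"c")]:
            w.write(t, d)
        first_idx = os.path.getsize(os.path.join(root, "0000000100.idx"))
        return sorted(os.listdir(root)), first_idx
```

Both files are only ever opened in append mode, so no existing bytes are sorted or moved during a write in this sketch.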
In addition, the data recording unit 12 includes the actual data recording unit 12b in which one piece of BLOB data is recorded in one actual data file.
By adopting this configuration, the data d generated by the devices 30 and the BLOB data, such as images and moving images, can be processed by a common database. It is thus possible to solve, for example, such a problem that it is difficult to accumulate the data d and BLOB data by a common scheme because of differences in size and characteristic.
In addition, even when the arrival time instants of the generated data d to the information accumulation apparatus 10 are interchanged, since each index file name is composed of the binary data of the fixed length, the file in which the data d to be written is inserted and the position of insertion can quickly be specified by the binary search. In this manner, by making use of the fact that the index file name and each line of index information IJ1 to index information IJm have fixed lengths, high-speed move of existing data d and high-speed write of new data d can be performed by the system call of the OS, such as the binary search.
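One possible reading of the out-of-order case is sketched below for the index side only. It assumes a hypothetical 16-byte record layout REC and holds the index file contents in a bytearray; how the embodiment moves the corresponding data d in the actual data file is not shown here, and the function names are introduced only for this example.

```python
import struct

REC = struct.Struct("<QII")  # assumed fixed-length index line: time, size, position


def insertion_line(index, t):
    """Binary search for the first line whose time is greater than t."""
    lo, hi = 0, len(index) // REC.size
    while lo < hi:
        mid = (lo + hi) // 2
        mid_t = REC.unpack_from(index, mid * REC.size)[0]
        if mid_t <= t:
            lo = mid + 1
        else:
            hi = mid
    return lo


def insert_line(index, t, size, pos):
    """Insert one index line at its time-ordered position; return the line."""
    line = insertion_line(index, t)
    at = line * REC.size
    index[at:at] = REC.pack(t, size, pos)  # later lines shift by one record
    return line
```

Since each line has a fixed length, the insertion point is a multiple of REC.size, and the lines that follow it can be moved as one contiguous block.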
Further, according to the information accumulation apparatus 10 with the above configuration, the file name of the index file and the file name of the actual data file are set based on the time instant t of generation of data d. Thus, the index file including the time instant t and the index information IJ in this index file can easily be searched by designating the time instant t. In addition, since the position POS at which the data d is written is described in the index information IJ, the data d can easily be read out, based on the searched index information IJ.
Aside from the case where the time instant treq for the read target is designated, the same applies to the case where the start time instant ts and the end time instant te are designated and the data d written in the time range T therebetween are read.
For example, when there are a plurality of actual data files corresponding to the index file, these actual data files may be gathered into one directory, and the name of the directory may be set to be a name which is correlated with the corresponding index file name by a rule which enables unique conversion with a small calculation amount. Thereby, the actual data file can uniquely be determined, and the data d can quickly be read.
In addition, the area (POSAL) in the actual data file, where the data d is recorded, may be indicated by an area represented by a combination of the “size” and “read start position” described in the index information IJ.
The size of the index information IJ, which is a data sequence of a fixed length, may be set to be a page unit, or set to be a divisor of the page unit.
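The effect of choosing a record size that divides the page unit can be checked with simple arithmetic. In the sketch below, the page size of 4096 bytes and the function name pages_touched are assumptions made for illustration; the actual page unit depends on the system.

```python
PAGE = 4096  # assumed page size in bytes


def pages_touched(line, rec_size, page=PAGE):
    """Number of pages spanned by the index line with the given number."""
    start = line * rec_size          # first byte of the line
    end = start + rec_size - 1       # last byte of the line
    return end // page - start // page + 1
```

With a 16-byte line, which divides 4096, every line lies within a single page. With a 24-byte line, which does not, line 170 occupies bytes 4080 to 4103 and therefore spans two pages.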
Besides, for example, the index file 1, which is recorded in the index recording unit 12a, may be divided. In this case, such a configuration may be adopted that two index files, which are generated by dividing the index file 1, correspond to the actual data file 1.
The present invention is not limited to the above-described embodiments, and can be variously modified without departing from the scope of the present invention in practical stages. Furthermore, the above-described embodiments incorporate inventions of various stages, and various inventions can be extracted by proper combinations of the disclosed constituent elements. For example, even if some constituent elements are omitted from all the constituent elements disclosed in the embodiments or some constituent elements are combined in other modes, a configuration from which some constituent elements are omitted or in which some constituent elements are combined can be extracted as an invention if the above-described problems to be solved by the invention can be solved and the above-described advantages of the invention can be obtained.
Besides, the methods described in the above embodiments can be stored as computer-executable programs (software means) in a storage medium, such as a magnetic disk (floppy (trademark) disk, hard disk, etc.), an optical disc (CD-ROM, DVD, MO, etc.), or a semiconductor memory (ROM, RAM, flash memory, etc.), and can be distributed by transmission via a communication medium. Note that the programs stored in the medium side include a setup program which constitutes in the computer software means (including not only an execution program but also a table structure and a data structure) that the computer is caused to execute. The computer that realizes the present apparatus reads in the program which is stored in the storage medium, and, in some cases, constructs the software means by the setup program. The operation of the computer is controlled by the software means, thereby executing the above-described processes. The storage medium described in the present specification includes not only a storage medium for distribution, but also a storage medium, such as a magnetic disk or a semiconductor memory, which is provided in the computer or in a device connected via a network.
Number | Date | Country | Kind |
---|---|---|---|
JP2017-216648 | Nov 2017 | JP | national |
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/JP2018/034786 | 9/20/2018 | WO | 00 |
Publishing Document | Publishing Date | Country | Kind |
---|---|---|---|
WO2019/092990 | 5/16/2019 | WO | A |
Number | Name | Date | Kind |
---|---|---|---|
6993246 | Pan | Jan 2006 | B1 |
8156110 | Luo | Apr 2012 | B1 |
9317537 | Fakeih | Apr 2016 | B2 |
20040168084 | Owen | Aug 2004 | A1 |
20060195876 | Calisa | Aug 2006 | A1 |
20110078141 | Fakeih | Mar 2011 | A1 |
20110246603 | Lee | Oct 2011 | A1 |
20160217013 | Song | Jul 2016 | A1 |
20160286254 | Hirose | Sep 2016 | A1 |
20170139968 | Baum et al. | May 2017 | A1 |
Number | Date | Country |
---|---|---|
104394380 | Mar 2015 | CN |
2014164382 | Sep 2014 | JP |
2016187122 | Oct 2016 | JP |
Entry |
---|
International Preliminary Report on Patentability from counterpart PCT/JP2018/034786, including the English translation of the Written Opinion, dated May 12, 2020. |
Hiroki Akama, et al., “Design and Evaluation of a Data Management System for WORM Data Processing”, The IPSJ Journal 49.2(2008): pp. 749-764. |
Seiichi Ohkoma, et al., “Information Storage and Retrieval”, The IPSJ Magazine 7.6 (1966) pp. 318-326. |
International Search Report (English and Japanese) issued in PCT/JP2018/034786, dated Nov. 20, 2018; ISA/JP. |
Toshihiko Okazaki, “Norton Utilties 3 Usage Guide For Windows”, Ohmsha, First Edition, pp. 122-124 (Apr. 14, 1998). |
Number | Date | Country | |
---|---|---|---|
20210173813 A1 | Jun 2021 | US |