This application is based upon and claims the benefit of priority of the prior Japanese Patent Application No. 2017-097666, filed on May 16, 2017, the entire contents of which are incorporated herein by reference.
The embodiments discussed herein are related to an analysis system, an analysis method, and a computer-readable recording medium.
There has been conventionally known a system that identifies interests and tastes of a user who uses a mobile terminal, by collecting and analyzing logs of application operations and the like in the mobile terminal, in a server, and creates and delivers recommendation information specialized for the user. Such recommendation information is frequently updated.
For example, when a user refers to various products in a web shopping service on a personal computer, recommendation information including related software, peripheral devices, and the like is displayed based on a selection result of a main body of a desktop or laptop personal computer, or the like.
In recent years, a technology in which a mobile terminal transmits a compressed log to a server for reducing transfer load, and the server performs analysis based on the compressed log is expected. For example, a technology of partially decompressing compressed files, and a technology of efficiently searching a log are disclosed.
In addition, for searching a log when analyzing the log, the server creates an inverted index indicating a word included in the log, a document ID including each word, and a position of the word, using a programming model such as MapReduce. Specifically, creation of an inverted index using the MapReduce will be described with reference to
Patent Literature 1: Japanese Laid-open Patent Publication No. 2012-141830
Patent Literature 2: International Publication Pamphlet No. WO 2013/136418
According to an aspect of an embodiment, an analysis system includes a terminal and a server, wherein the terminal includes a first processor configured to: collect a log of an operation of the terminal or a log of sensing information that is acquirable in the terminal; and create coded information obtained by encoding the log, or index information indicating an appearance position in the log of a word included in the log that uses the coded information, and wherein the server includes a second processor configured to: acquire the coded information or the index information from the terminal, and when the coded information is acquired, create the index information; and analyze information related to the terminal, using the index information.
The object and advantages of the invention will be realized and attained by means of the elements and combinations particularly pointed out in the claims.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory and are not restrictive of the invention.
Nevertheless, in the case of compressing collected logs in a mobile terminal, in a conventional compression format such as ZIP, it is impossible to stably increase a compression rate unless a large amount of logs are compressed at a time. In addition, if a large amount of compression target logs are compressed at a time in a mobile terminal, a memory and a storage resource of the mobile terminal become scarce. In addition, for creating an inverted index needed for search and analysis of logs received by the server, logs compressed in a conventional format need to be expanded, and processing cost increases.
Preferred embodiments will be explained with reference to accompanying drawings. In addition, the embodiments are not intended to limit the disclosed technology. In addition, the embodiments can be appropriately combined without causing contradiction in processing content.
Description of Analysis Processing
System Configuration
The terminal 10 is implemented by a mobile terminal wirelessly connected to the network 30, for example. A plurality of terminals 10, which are not illustrated in the drawing, exist, and the terminals 10 are used by respective different users. Through analysis processing to be described later, the terminal 10 receives operation input of an application or the like that is performed by the user, collects and accumulates logs of operations, compresses the accumulated logs, and transmits the compressed logs to the server 20. In addition, the terminal 10 presents recommendation information created by the server 20, to the user. In addition, the terminal 10 may directly perform communication with the server 20 without going through the network 30, by wireless communication such as Bluetooth (registered trademark). In addition, the terminal 10 is not limited to a mobile terminal, and may be connected to the network 30 in a wired manner.
The server 20 is a server device that acquires logs of operations or the like of the terminal 10 that are performed by the user, analyzes the logs, creates recommendation information for the user of the logs, and delivers the recommendation information to the corresponding terminal 10. Through the analysis processing to be described later, the server 20 of the present embodiment acquires compressed logs from the terminal 10, and creates an inverted index used for search in analyzing logs.
Configuration of Terminal
As an embodiment, the terminal 10 can be implemented by installing, onto a desired computer, an analysis program that executes the above-described analysis processing and is provided as packaged software or online software. For example, the terminal 10 can be implemented by installing the above-described analysis program onto an information processing apparatus used by the user. In addition to this, the terminal 10 can be implemented by installing the above-described analysis program onto a server device that accommodates an information processing apparatus used by the above-described user, or the like, as a client terminal. In this case, the terminal 10 may be implemented as part of a business system such as sales amount management, or may be implemented as a cloud that provides a service realized by the above-described analysis processing, by outsourcing.
Functional units corresponding to signs 11 to 15 are illustrated in
As illustrated in
The communication unit 11 is a processing unit that controls, via the network 30, data communication between an external device such as the server 20, and the control unit 15. The communication unit 11 corresponds to a communication device such as a network interface card (NIC), for example.
The input unit 12 is an input device for inputting various types of information to the terminal 10. For example, the input unit 12 corresponds to a mouse, a keyboard, a touch panel, an input button, or the like.
The output unit 13 is a display device that displays various types of information. For example, the output unit 13 corresponds to a display device such as a liquid crystal display (LCD) and a cathode ray tube (CRT).
The storage unit 14 is a device that stores data used in an operating system (OS) executed by the control unit 15, or various programs such as an application program. For example, the storage unit 14 is implemented as a main storage device in the terminal 10. For example, as the storage unit 14, various semiconductor memory elements such as, for example, a random access memory (RAM) and a flash memory can be employed. In addition, the storage unit 14 can also be implemented as an auxiliary storage device. In this case, a hard disk drive (HDD), an optical disk, a solid state drive (SSD), or the like can be employed.
As an example of data used in programs executed by the control unit 15, the storage unit 14 stores the word code allocation unit 14a, the compressed log 14b, and the inverted index 14c, which will be described later. In addition to the data, other types of electrical data can also be stored together in the storage unit 14.
The control unit 15 includes an internal memory that stores various programs and control data, and executes various types of processing using these.
For example, the control unit 15 is implemented as a central processor, that is to say, a so-called central processing unit (CPU). The control unit 15 need not always be implemented as a central processor, and may be implemented as a micro processing unit (MPU) or a digital signal processor (DSP). In this manner, the control unit 15 is implemented as a processor, and a type thereof is not especially limited to a general-purpose type or a specialized type. In addition, the control unit 15 can also be realized by a hard-wired logic such as an application specific integrated circuit (ASIC) and a field programmable gate array (FPGA).
By executing various programs, the control unit 15 virtually realizes the following processing units. For example, the control unit 15 includes a collection unit 15a and an encoding unit 15b as illustrated in
The collection unit 15a is a processing unit that collects logs of operations of the terminal 10, or logs of sensing information that is acquirable in the terminal 10. For example, the collection unit 15a collects, for each user, logs of product purchase, site browse, and the like.
The encoding unit 15b is a processing unit that creates coded information obtained by encoding logs collected by the collection unit 15a. For example, as coded information, the encoding unit 15b compresses logs using the word code allocation unit 14a to be described later. In addition, the encoding unit 15b creates an inverted index using the word code allocation unit 14a and the compressed logs. In addition, the encoding unit 15b stores the compressed log into the compressed log 14b of the storage unit 14. In addition, the encoding unit 15b stores the created inverted index into the inverted index 14c of the storage unit 14.
In addition, in the present embodiment, coded information includes an inverted index indicating an appearance position in a log of a word included in the log. In other words, as exemplified in
In this manner, in the present embodiment, simultaneously with processing of performing compression of logs, the encoding unit 15b can create an inverted index through one-pass processing. In addition, when a log is added, the encoding unit 15b need not re-create the inverted index. Thus, the inverted index can be easily extended.
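As a minimal sketch of the one-pass behavior described here, the following Python example encodes each word to a code and records its position in the same loop, so that appending a later log only extends the existing index. The class name `LogEncoder`, the dictionary-based word code table, and the use of position lists (rather than a bitmap) are assumptions made for illustration and are not taken from the embodiment.

```python
from collections import defaultdict


class LogEncoder:
    """Hypothetical one-pass encoder: compresses words to codes and indexes them."""

    def __init__(self):
        self.word_to_code = {}                   # assumed layout of the word code allocation
        self.compressed_log = []                 # the log as a sequence of word codes
        self.inverted_index = defaultdict(list)  # code -> positions in compressed_log

    def append(self, log_line):
        # A single loop both encodes the word and records where it appears,
        # so no second pass over the log is needed to build the index.
        for word in log_line.split():
            code = self.word_to_code.setdefault(word, len(self.word_to_code))
            self.inverted_index[code].append(len(self.compressed_log))
            self.compressed_log.append(code)


encoder = LogEncoder()
encoder.append("open shop view laptop")
# Adding a later log extends the index; earlier entries are left untouched.
encoder.append("view mouse buy laptop")
```

Whitespace tokenization is used only to keep the sketch short; a real log format would need its own tokenizer.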
Configuration of Server
As an embodiment, the server 20 can be implemented by installing, onto a desired computer, an analysis program that executes the above-described analysis processing and is provided as packaged software or online software. For example, the server 20 can be implemented by installing the above-described analysis program onto an information processing apparatus used by an administrator of an internet shopping site. In addition to this, the server 20 can be implemented by installing the above-described analysis program onto a server device that accommodates an information processing apparatus used by the above-described administrator of the internet shopping site, or the like, as a client terminal. In this case, the server 20 may be implemented as part of a business system such as sales amount management, or may be implemented as a cloud that provides a service realized by the above-described analysis processing, by outsourcing.
The description will return to the description of
As illustrated in
The communication unit 21 is a processing unit that controls, via the network 30, data communication between an external device such as the terminal 10, and the control unit 25. The communication unit 21 corresponds to a communication device such as an NIC, for example.
The input unit 22 is an input device for inputting various types of information to the server 20. For example, the input unit 22 corresponds to a mouse, a keyboard, a touch panel, an input button, or the like.
The output unit 23 is a display device that displays various types of information. For example, the output unit 23 corresponds to a display device such as an LCD and a CRT.
The storage unit 24 is a device that stores data used in an OS executed by the control unit 25, or various programs such as an application program. For example, the storage unit 24 is implemented as a main storage device in the server 20. For example, as the storage unit 24, various semiconductor memory elements such as, for example, a RAM and a flash memory can be employed. In addition, the storage unit 24 can also be implemented as an auxiliary storage device. In this case, an HDD, an optical disk, an SSD, or the like can be employed.
As an example of data used in programs executed by the control unit 25, the storage unit 24 stores the word code allocation unit 24a, the compressed log 24b, and the inverted index 24c as described later. In addition to the data, other types of electrical data can also be stored together in the storage unit 24.
The control unit 25 includes an internal memory that stores various programs and control data, and executes various types of processing using these.
For example, the control unit 25 is implemented as a central processor, that is to say, a so-called CPU. The control unit 25 need not always be implemented as a central processor, and may be implemented as an MPU or a DSP. In this manner, the control unit 25 is implemented as a processor, and a type thereof is not especially limited to a general-purpose type or a specialized type. In addition, the control unit 25 can also be realized by a hard-wired logic such as an ASIC and an FPGA.
By executing various programs, the control unit 25 virtually realizes the following processing units. For example, as illustrated in
The acquisition unit 25a is a processing unit that acquires coded information from the terminal 10. For example, the acquisition unit 25a acquires, as coded information, compressed logs, a word code allocation unit, and a created inverted index, from the terminal 10, and stores these into the storage unit 24. For example, the acquisition unit 25a accumulates the acquired word code allocation unit into the word code allocation unit 24a of the storage unit 24, accumulates the compressed logs into the compressed log 24b, and accumulates the inverted index into the inverted index 24c. In addition, the acquisition unit 25a updates the inverted index 24c accumulated in the storage unit 24, using the acquired inverted index. Here, the word code allocation unit 24a and the inverted index 24c that correspond to the compressed log 24b are associated with each other by a method such as a method of granting respective pieces of identification information.
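One way the update of the accumulated inverted index could work is sketched below in Python. It assumes, purely for illustration, that the index maps a word code to a list of positions, that positions in an acquired index are relative to the newly acquired compressed log, and that word codes are consistent across transfers. The function name `merge_index` and the data layout are hypothetical.

```python
def merge_index(accumulated, acquired, offset):
    """Merge acquired position lists into the accumulated index.

    `offset` is the length of the compressed log already stored, so that
    positions from the new batch point into the concatenated log.
    """
    for code, positions in acquired.items():
        accumulated.setdefault(code, []).extend(p + offset for p in positions)
    return accumulated


# Previously accumulated data (hypothetical values).
stored_log = [0, 1, 2]
stored_index = {0: [0], 1: [1], 2: [2]}

# Newly acquired compressed log and its inverted index.
new_log = [2, 3, 0]
new_index = {2: [0], 3: [1], 0: [2]}

merge_index(stored_index, new_index, offset=len(stored_log))
stored_log.extend(new_log)
```

Because only the new entries are appended, the index built from earlier transfers does not need to be rebuilt.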
The analysis unit 25b is a processing unit that analyzes information related to a terminal, using acquired coded information. For example, the analysis unit 25b analyzes tastes such as interests, customs, properties, behaviors, ways of using a terminal, and a living environment that are related to the user who operates the terminal. In this case, the analysis unit 25b calculates, using collaborative filtering, the number of times or frequency of product reference, the number of purchases, and the like, from logs of operations that indicate behaviors of the user, and calculates degrees of similarity between behaviors of the user and behaviors of other users. In addition, the analysis unit 25b creates information of products purchased by other users with high degrees of similarity in behaviors, as recommendation information, and presents the recommendation information to the terminal 10 of the analysis target user.
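The similarity calculation and recommendation step can be illustrated with a small user-based collaborative filtering sketch in Python. Cosine similarity over per-product reference counts is an assumption chosen for the example; the embodiment does not specify the similarity measure. The user names, product names, and counts are made up.

```python
import math


def cosine(a, b):
    """Cosine similarity between two sparse count vectors (dicts)."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recommend(target, others, top_n=1):
    """Recommend products that the most similar users have and the target lacks."""
    ranked = sorted(others, key=lambda u: cosine(target, others[u]), reverse=True)
    recs = []
    for user in ranked[:top_n]:
        for product in others[user]:
            if product not in target and product not in recs:
                recs.append(product)
    return recs


# Hypothetical reference counts per user, as might be derived from operation logs.
counts = {
    "user_a": {"laptop": 3, "mouse": 1},
    "user_b": {"laptop": 2, "mouse": 1, "keyboard": 1},
    "user_c": {"camera": 4},
}
target = counts["user_a"]
others = {user: c for user, c in counts.items() if user != "user_a"}
recommendations = recommend(target, others)
```

Here `user_b` is the closest match to `user_a`, so the product that only `user_b` has referenced becomes the recommendation.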
Flow of Processing
In the server 20, the acquisition unit 25a acquires the compressed log, the inverted index, and the word code allocation unit that have been transferred from the terminal 10, and accumulates these into the storage unit 24 (Step S4). In addition, the analysis unit 25b searches the compressed log accumulated in the compressed log 24b, using the corresponding inverted index and word code allocation unit, and analyzes tastes of the user who operates the terminal 10 (Step S5). In addition, the analysis unit 25b creates recommendation information for the user based on the analysis (Step S6), and delivers the recommendation information to the terminal 10 (Step S7). The terminal 10 that has received the recommendation information causes the output unit 13 to display the recommendation information. Through the above-described steps, a series of analysis processes end.
One Aspect of Effect
As described above, the analysis system 1 according to the present embodiment includes the terminal 10 and the server 20. The terminal 10 includes the collection unit 15a and the encoding unit 15b. The collection unit 15a collects logs of operations of the terminal 10, or logs of sensing information that is acquirable in the terminal 10. The encoding unit 15b creates coded information obtained by encoding logs. In the present embodiment, coded information includes an inverted index indicating an appearance position in a log of a word included in the log. The server 20 includes the acquisition unit 25a and the analysis unit 25b. The acquisition unit 25a acquires coded information from the terminal 10. The analysis unit 25b analyzes tastes related to the user who operates the terminal 10, using the acquired coded information.
Here,
In contrast to this, the analysis system 1 of the present embodiment can create an inverted index through one-pass processing. In addition, the analysis system 1 can search the compressed log acquired from the terminal 10, in its original format, without expanding the compressed log, and analyze tastes of the user. Thus, according to the analysis system 1 of the present embodiment, logs collected by the mobile terminal 10 can be analyzed in the server 20 with suppressed load.
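A search that never expands the compressed log can be sketched as follows in Python. The query word is encoded once with the word code table, and all further work is done on codes and positions. The helper names, the position-list layout, and the sample data are assumptions for illustration only.

```python
def find_positions(word, word_to_code, inverted_index):
    """Look up where a word appears, using only its code and the index."""
    code = word_to_code.get(word)
    if code is None:
        return []  # the word never appeared in the log
    return inverted_index.get(code, [])


def count_sequence(first, second, word_to_code, inverted_index):
    """Count places where `second` immediately follows `first`."""
    following = set(find_positions(second, word_to_code, inverted_index))
    return sum(
        1
        for p in find_positions(first, word_to_code, inverted_index)
        if p + 1 in following
    )


# Hypothetical word code table and index for the encoded log
# "open shop view laptop view mouse buy laptop".
word_to_code = {"open": 0, "shop": 1, "view": 2, "laptop": 3, "mouse": 4, "buy": 5}
inverted_index = {0: [0], 1: [1], 2: [2, 4], 3: [3, 7], 4: [5], 5: [6]}

laptop_views = count_sequence("view", "laptop", word_to_code, inverted_index)
laptop_buys = count_sequence("buy", "laptop", word_to_code, inverted_index)
```

Counts of this kind could feed the frequency figures used in the analysis, without the log text ever being restored.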
The embodiment related to the disclosed device has been described so far. The present invention may be implemented in various different forms aside from the above-described embodiment. Thus, other embodiments included in the present invention will be described below. In addition, parts different from the above-described first embodiment will be described below.
In the above-described first embodiment, when compressing a log, the encoding unit 15b of the terminal 10 simultaneously creates an inverted index, and transfers coded information including the compressed log and the inverted index, to the server 20. Nevertheless, the present invention is not limited to this. For example, the encoding unit 15b may create only a compressed log as coded information.
In this manner, by the server 20 creating an inverted index, load of information transfer from the terminal 10 to the server 20 can be reduced. If an inverted index is not needed when the analysis unit 25b of the server 20 performs analysis of a log, the present embodiment is preferable.
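As a rough illustration of server-side index creation, the Python sketch below builds a bit-mapped index directly from a sequence of word codes, without decoding the codes back into words. Representing each bitmap as a Python integer, with bit i set when position i holds the code, is an assumption for the example; the function names are hypothetical.

```python
def build_bitmap_index(compressed_log):
    """Build code -> bitmap, where bit i is set if position i holds that code."""
    index = {}
    for position, code in enumerate(compressed_log):
        index[code] = index.get(code, 0) | (1 << position)
    return index


def positions(bitmap):
    """List the positions whose bits are set in a bitmap."""
    return [i for i in range(bitmap.bit_length()) if (bitmap >> i) & 1]


# Hypothetical compressed log received from a terminal (word codes only).
received_log = [0, 1, 2, 3, 2, 4, 5, 3]
server_index = build_bitmap_index(received_log)
```

The loop touches each code once, so the server-side cost grows linearly with the length of the received log.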
Aside from the above-described first and second embodiments, the encoding unit 15b of the terminal 10 may transfer an inverted index to the server 20 as coded information.
In this manner, the terminal 10 transfers the word code allocation unit corresponding to the inverted index, to the server 20, as coded information, and does not transfer a compressed log, whereby load of information transfer from the terminal 10 to the server 20 can be drastically reduced. If the server 20 performs analysis using only an inverted index without using a log itself, the present embodiment is preferable.
The encoding unit 15b of the terminal 10 may switch whether to create an inverted index, according to a state of connection to a network that depends on a position of the terminal 10 or the like, a storage remaining amount, or a condition of resources such as batteries. For example, when a network environment is good, the encoding unit 15b transfers a compressed log and a word code allocation unit to the server 20 without creating an inverted index, whereby load on the terminal 10 can be reduced. In addition, when a network environment is bad, the encoding unit 15b creates an inverted index, and transfers the inverted index to the server 20, whereby a decline in update frequency of the inverted index can be prevented.
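The switching logic might be expressed as a small policy function, sketched below in Python. The `TerminalState` fields, the threshold values, and the choice to defer transfer when resources are low are all assumptions added for illustration; the embodiment only states that the decision depends on the connection state, remaining storage, and resources such as batteries.

```python
from dataclasses import dataclass


@dataclass
class TerminalState:
    network_good: bool      # assumed to be derived from the connection state
    battery_percent: int
    free_storage_mb: int


def plan_transfer(state, min_battery=20, min_storage_mb=50):
    """Decide whether the terminal creates an inverted index and what it sends."""
    if state.network_good:
        # Good network: skip index creation on the terminal to reduce its load.
        return {"create_index": False, "send": ["compressed_log", "word_codes"]}
    has_resources = (
        state.battery_percent >= min_battery
        and state.free_storage_mb >= min_storage_mb
    )
    if has_resources:
        # Bad network: send the index so the server-side index stays up to date.
        return {"create_index": True, "send": ["inverted_index", "word_codes"]}
    # Assumption: with low battery or storage, defer everything until later.
    return {"create_index": False, "send": []}


good = plan_transfer(TerminalState(network_good=True, battery_percent=80, free_storage_mb=500))
bad = plan_transfer(TerminalState(network_good=False, battery_percent=80, free_storage_mb=500))
low = plan_transfer(TerminalState(network_good=False, battery_percent=5, free_storage_mb=500))
```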
The bit-mapped inverted index created in the above-described first to third embodiments may be further compressed.
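The compression method is not specified here. As one possible approach, a sparse bitmap can be run-length encoded, as in the following Python sketch. Run-length encoding and the function names are assumptions for illustration, not the method of the embodiment.

```python
def rle_encode(bits):
    """Encode a list of 0/1 values as run lengths, starting with a run of zeros."""
    runs = []
    current, length = 0, 0
    for bit in bits:
        if bit == current:
            length += 1
        else:
            runs.append(length)
            current, length = bit, 1
    runs.append(length)
    return runs


def rle_decode(runs):
    """Rebuild the bit list from alternating zero-run and one-run lengths."""
    bits, value = [], 0
    for length in runs:
        bits.extend([value] * length)
        value ^= 1
    return bits


# Hypothetical bitmap row for one word: the word appears at positions 3, 8 and 9.
row = [0, 0, 0, 1, 0, 0, 0, 0, 1, 1]
encoded = rle_encode(row)
```

Bitmap rows for rare words consist mostly of zeros, which is the case where run lengths are much shorter than the raw bits.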
Analysis Program
Various types of processing described in the above-described embodiments can be realized by executing programs prepared in advance, on a computer such as a personal computer and a work station. Thus, an example of a computer that executes an analysis program having functions similar to the above-described embodiments will be described below using
As illustrated in
Under such an environment, the CPU 150 reads the analysis program 170a from the HDD 170, and loads the analysis program 170a onto the RAM 180. As a result, the analysis program 170a functions as an analysis process 180a as illustrated in
In addition, the above-described analysis program 170a need not always be stored in the HDD 170 or the ROM 160 from the beginning. For example, each program is stored into a "portable physical medium" such as a flexible disk, a so-called floppy disk (FD), a CD-ROM, a DVD disc, a magneto-optical disk, and an IC card that are inserted into the computer 100. Then, the computer 100 may acquire each program from these portable physical media, and execute each program. In addition, each program may be stored into another computer or a server device that is connected to the computer 100 via a public line, the internet, a LAN, a wide area network (WAN), or the like, and the computer 100 may acquire each program from these devices, and execute each program.
According to the embodiments, analysis of each terminal can be performed while suppressing analysis load in a server.
All examples and conditional language recited herein are intended for pedagogical purposes of aiding the reader in understanding the invention and the concepts contributed by the inventor to further the art, and are not to be construed as limitations to such specifically recited examples and conditions, nor does the organization of such examples in the specification relate to a showing of the superiority and inferiority of the invention. Although the embodiments of the present invention have been described in detail, it should be understood that the various changes, substitutions, and alterations could be made hereto without departing from the spirit and scope of the invention.
Number | Date | Country | Kind |
---|---|---|---|
2017-097666 | May 2017 | JP | national |