The present invention relates to a data processing device that performs database creation and the like.
In the related art, the data processing device disclosed in Japanese Patent Laid-Open No. 2011-48527 is known. In the data processing device, a search target database is created by extracting a sensitivity expression from Japanese text information and associating sensitivity information and side information with a search target using a created sensitivity expression database.
Next, when a user inputs the sensitivity expression as a search condition, the sensitivity information and the side information corresponding to the sensitivity expression are acquired from the sensitivity expression database, the search target database is searched for the sensitivity information according to the side information, and a distance between the sensitivity information acquired from the search target database and the sensitivity information acquired from the sensitivity expression database is calculated. Then, various information items such as a search target ID are displayed side by side on a screen in order from the closest distance.
According to the data processing device disclosed in Japanese Patent Laid-Open No. 2011-48527, since the search target database is merely created from Japanese text information and a data collection range is restricted, there is a problem that the search target database is low in terms of usefulness. In addition, since noise, which is unnecessary information having no value in use, is not considered, the search target database may be created with noise. In this case, the creation efficiency of the search target database is reduced, and the usefulness of the search target database is further reduced.
The present invention has been made to solve the above problems, and an object thereof is to provide a data processing device capable of improving the creation efficiency and the usefulness of a database at the time of creating the database.
In order to achieve the above object, according to a first aspect of the present invention, a data processing device includes: an output interface; an input interface configured to be operated by a user; a text information acquisition unit configured to acquire a plurality of text information items from information published on a predetermined media under a predetermined acquisition condition; a text information display unit configured to display the plurality of text information items on the output interface; a noise-removed information creation unit configured to, when at least a part of each of the plurality of text information items displayed on the output interface is designated as noise by an operation of the input interface from the user, create a noise-removed information item which is text information obtained by removing text information including the part designated as the noise from the plurality of text information items; and a database creation unit configured to create a database by performing predetermined processing on the noise-removed information item.
According to the data processing device, the plurality of text information items are acquired from the information published on the predetermined media under the predetermined acquisition condition, and the plurality of text information items are displayed on the output interface. Then, when at least a part of each of the plurality of text information items displayed on the output interface is designated as noise by the operation of the input interface from the user, the noise-removed information item is created which is text information obtained by removing text information including the part designated as the noise from the plurality of text information items. As described above, the user can easily and appropriately remove the text information including the data regarded as the noise from the plurality of text information items simply by selecting the noise with the operation of the input interface, and the noise-removed information item is created as a result of the removal.
Further, since the noise-removed information item created in such a manner is subjected to the predetermined processing and thus the database is created, it is possible to create the database in a state where the text information regarded as the noise by the user is excluded. Thereby, the creation efficiency and database usefulness at the time of creating a database can be improved.
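The removal described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the device: the function name `remove_noise`, the use of substring matching, and the sample strings are all hypothetical.

```python
def remove_noise(text_items, noise_terms):
    """Drop every text item that contains at least one designated noise term.

    Substring matching is an assumption; the description only states that
    text information including the part designated as noise is removed.
    """
    terms = [n for n in noise_terms if n]  # ignore empty designations
    return [t for t in text_items if not any(n in t for n in terms)]


# Hypothetical sample data
items = [
    "new model review: smooth engine",
    "win a free prize, click here",
    "the seats are comfortable",
]
noise_removed = remove_noise(items, ["free prize"])
print(noise_removed)
```

Only the items that survive this filter would be passed on to the database creation step.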
According to a second aspect of the present invention, in the data processing device according to the first aspect, the data processing device further includes: a noise storage unit configured to store the noise; and a noise display unit configured to display the noise stored in the noise storage unit on the output interface when a display operation of the noise is executed by the operation of the input interface from the user.
According to the data processing device, when the display operation of the noise is executed by the operation of the input interface from the user, the noise stored in the noise storage unit is displayed on the output interface, so that the user can visually recognize the noise selected up to the present time by the user. Thereby, convenience can be improved.
According to a third aspect of the present invention, in the data processing device according to the first aspect, the text information acquisition unit extracts sensitivity information from the information published on the predetermined media, and acquires the plurality of text information items as information in which the sensitivity information is associated with the information published on the predetermined media, the data processing device further includes a noise-removed information display unit configured to display the noise-removed information item on the output interface together with the sensitivity information associated with the noise-removed information item, and the predetermined processing of the database creation unit includes sensitivity information correction processing of correcting the sensitivity information in the one or more noise-removed information items displayed on the output interface, the sensitivity information correction processing being executed by the operation of the input interface from the user.
According to the data processing device, the sensitivity information is extracted from the information published on the predetermined media, the plurality of text information items are acquired as the information in which the sensitivity information is associated with the information published on the predetermined media, and the noise-removed information item is displayed on the output interface together with the sensitivity information. Then, since the sensitivity information correction processing is executed by the operation of the input interface from the user at the time of creating the database to correct the sensitivity information in the noise-removed information item displayed on the output interface, the user can visually recognize and easily correct the sensitivity information in the noise-removed information item. Thereby, the creation efficiency and database usefulness at the time of creating a database can be improved.
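One possible way to model the correction step is sketched below. The `Item` class, the `correct_sensitivity` function, the log structure, and the sample text are assumptions made for illustration; the category names follow the "Positive", "Neutral", and "Negative" categories used in the embodiment.

```python
from dataclasses import dataclass


@dataclass
class Item:
    text: str
    sensitivity: str  # "Positive", "Neutral", or "Negative"


def correct_sensitivity(item, corrected, log):
    """Overwrite an item's sensitivity and record the before/after values."""
    if corrected != item.sensitivity:
        log.append(
            {"text": item.text, "before": item.sensitivity, "after": corrected}
        )
        item.sensitivity = corrected
    return item


# Hypothetical example: the user judges the extracted value to be wrong.
log = []
item = Item("not bad at all", "Negative")
correct_sensitivity(item, "Positive", log)
print(item.sensitivity, log)
```

Keeping a before/after log is a design choice of this sketch; it makes the corrections easy to review before they are committed.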
According to a fourth aspect of the present invention, in the data processing device according to the first aspect, the data processing device further includes a tag information storage unit configured to store tag information defined by the user, and the predetermined processing of the database creation unit includes association processing of associating the noise-removed information item with the tag information stored in the tag information storage unit.
According to the data processing device, since the association processing of associating the noise-removed information item with the tag information stored in the tag information storage unit is executed at the time of creating the database, a database search can be executed based on the tag information and the usefulness of the database can be further improved.
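The association and the tag-based search can be illustrated as follows. This is a sketch only: the description does not state how a tag is matched to an item, so the keyword-based rule, the names `Record`, `associate_tags`, and `search_by_tag`, and the sample data are assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    text: str
    tags: set = field(default_factory=set)


def associate_tags(records, tag_keywords):
    """Attach each user-defined tag whose keywords appear in a record's text."""
    for record in records:
        for tag, keywords in tag_keywords.items():
            if any(k in record.text for k in keywords):
                record.tags.add(tag)
    return records


def search_by_tag(records, tag):
    """Return the records associated with the given tag."""
    return [r for r in records if tag in r.tags]


# Hypothetical user-defined tags and noise-removed items
records = [Record("the engine is quiet"), Record("the seats are comfortable")]
associate_tags(records, {"powertrain": ["engine"], "interior": ["seats"]})
hits = search_by_tag(records, "interior")
print([r.text for r in hits])
```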
According to a fifth aspect of the present invention, in the data processing device according to the first aspect, the text information display unit displays sets of text information on the output interface in order from a largest set, the sets of information each including identical information or identical and similar information when the plurality of text information items are sorted according to meaning of information included in the plurality of text information items.
According to the data processing device, since the sets of text information including the identical information or the identical and similar information when the plurality of text information items are sorted according to the meaning of the information included in the plurality of text information items are displayed on the output interface in order from the largest set, the user can designate the noise in order from the largest text information set. Thereby, the text information including the noise can be efficiently removed from the plurality of text information items. Thus, the creation efficiency at the time of creating a database can be further improved.
According to a sixth aspect of the present invention, in the data processing device according to the third aspect, the database creation unit creates the database in a state where the sensitivity information is sorted into a plurality of categories, and the data processing device includes a sensitivity information display unit configured to display the sensitivity information on the output interface in different colors, the sensitivity information being sorted into the plurality of categories and included in the database.
According to the data processing device, since the sensitivity information sorted into the plurality of categories and included in the database is displayed on the output interface in different colors, the user can easily identify and visually recognize the plurality of categories of sensitivity information.
According to a seventh aspect of the present invention, in the data processing device according to the first aspect, the predetermined acquisition condition is a condition that the information published on the predetermined media includes predetermined information and does not include predetermined confusion information which is confusable with the predetermined information.
According to the data processing device, since the plurality of text information items are acquired from the information published on the predetermined media under the condition that the information published on the predetermined media includes the predetermined information and does not include the predetermined confusion information which is confusable with the predetermined information, the plurality of text information items can be acquired as information including the predetermined information with accuracy. Thereby, the creation efficiency at the time of creating a database can be further improved.
In order to achieve the above object, according to an eighth aspect of the present invention, a data processing method includes: acquiring a plurality of text information items from information published on a predetermined media under a predetermined acquisition condition; displaying the plurality of text information items on an output interface; creating a noise-removed information item which is text information obtained by removing text information including a part designated as noise from the plurality of text information items when at least a part of each of the plurality of text information items displayed on the output interface is designated as the noise by an operation of an input interface from a user; and creating a database by performing predetermined processing on the noise-removed information item.
A data processing device according to an embodiment of the present invention will be described below with reference to the drawings.
The main server 2 includes a storage, a processor, a memory (for example, RAM, E2PROM, or ROM), and an I/O interface. A large number of external servers 4 (only three are illustrated) are connected to the main server 2 via a network 3 (for example, the Internet).
In this case, various SNS servers, servers of predetermined media (for example, newspaper companies), and servers of search sites correspond to the external servers 4. The data processing device 1 acquires text data (text information) from such external servers 4 via the main server 2 as will be described below.
In addition, the data processing device 1 is of a PC type, and includes a display 1a, a device body 1b, and an input interface 1c. The device body 1b includes a storage such as an HDD, a processor, and a memory (RAM, E2PROM, or ROM) (none are illustrated), and application software for data acquisition (hereinafter, referred to as “data acquisition software”) is installed in the storage of the device body 1b.
Further, the input interface 1c includes a keyboard and a mouse configured to operate the data processing device 1. In the present embodiment, the display 1a corresponds to an output interface, and the device body 1b corresponds to a text information acquisition unit, a text information display unit, a noise-removed information creation unit, a database creation unit, a noise storage unit, a noise display unit, a noise-removed information display unit, a tag information storage unit, and a sensitivity information display unit.
In the data processing device 1, database creation processing is executed as will be described below. Specifically, when the data acquisition software starts up with an operation of the input interface 1c from a user, a screen as illustrated in
In the case of the GUI, a data acquisition button 10, a data cleansing button 20, a sensitivity correction button 30, a tagging button 40, and a visualization button 50 are displayed vertically in a row on a left side of the display 1a. Then, the user presses these buttons via the input interface 1c, whereby the database creation processing is executed as will be described below. In the following description, the operation of the input interface 1c from the user is referred to as “user operation”.
The above-described database creation processing will be described below with reference to
Note that any data acquired or created during the execution of the database creation processing is stored in the storage of the device body 1b of the data processing device 1. Further, such data may be configured to be stored in the memory of the device body 1b, the storage externally attached to the device body 1b, or the main server 2.
As illustrated in
Next, data cleansing processing is executed (STEP 2 in
Subsequently, sensitivity information correction processing is executed (STEP 3 in
Subsequent to the sensitivity information correction processing, user-definition tagging processing is executed (STEP 4 in
Next, data visualization processing is executed (STEP 5 in
The contents of the above-described data acquisition processing will be described below with reference to
On the other hand, when such determination is affirmative (YES in STEP 10 in
In the media selection screen, the data acquisition button 10 is configured such that an outer frame is displayed with a thick line and an inside is displayed in a shaded state to indicate that the data acquisition button 10 is pressed as described above.
On an upper side of the media selection screen, a media selection icon 11, a period input icon 12, a language selection icon 13, a keyword input icon 14, an additional information selection icon 15, and a final confirmation icon 16 are displayed in this order from left to right. In addition, a Next button 17 is displayed on a lower right side of the media selection screen.
In order to indicate that the media selection processing is being executed, the media selection icon 11 is inversely displayed and characters “Select Media” are displayed below the icon. In
Further, during the execution of the media selection processing, a plurality of check boxes are displayed in a center of the media selection screen to select media. In the example illustrated in
In this case, the check boxes 11a to 11c are used to select “TWITTER (registered trademark)”, “FACEBOOK (registered trademark)”, and “YOUTUBE (registered trademark)” as media, respectively, and the check boxes 11d to 11f are used to select the other three media, respectively.
In order to indicate that any of the media is selected by the user operation in the state where the check boxes 11a to 11f are displayed as described above, the check box corresponding to the selected media is checked and the check box is inversely displayed at the same time. In the example illustrated in
Next, it is determined whether the media selection processing is completed (STEP 12 in
When the determination is negative (NO in STEP 12 in
The period input processing is to input a period for which the text data is acquired from the media selected as described above, and during the execution of the period input processing, a period input screen is displayed on the display 1a as illustrated in
In the period input screen, the period input icon 12 is inversely displayed to indicate that the period input processing is being executed. In a center of the period input screen, an input field 12a is displayed to input a search start date which is a start point of a data acquisition period, and an input field 12b is displayed to input a search end date which is an end point of the data acquisition period.
Further, a Back button 18 is displayed on a lower left side of the period input screen. The Back button 18 is used to return to the screen of the processing (that is, the media selection processing) before the period input processing, and this shall be applied to various screens for processing to be described below. In the period input processing, the search start date and the search end date are input to the input fields 12a and 12b by the user operation. The period input processing is executed as described above.
Next, it is determined whether the period input processing is completed (STEP 14 in
When the determination is negative (NO in STEP 14 in
The language selection processing is to select a language for acquiring the text data from the media selected as described above, and during the execution of the language selection processing, a language selection screen is displayed on the display 1a as illustrated in
Further, three check boxes 13a to 13c are vertically displayed side by side on a left side of the language selection screen. The check box 13a is used to select both Japanese and English as the language for acquiring the text data, and characters “Japanese/English” are displayed on a right side of the check box 13a to indicate such usage.
In addition, the check box 13b is used to select Japanese as the language for acquiring the text data, and characters “Japanese” are displayed on a right side of the check box 13b to indicate such usage. Further, the check box 13c is used to select English as the language for acquiring the text data, and characters “English” are displayed on a right side of the check box 13c to indicate such usage.
In order to indicate that any of the languages is selected by the user operation in the state where the check boxes 13a to 13c are displayed as described above, the check box corresponding to the selected language is checked and the check box is inversely displayed at the same time. In the example illustrated in
Next, it is determined whether the language selection processing is completed (STEP 16 in
When the determination is negative (NO in STEP 16 in
The keyword input processing is to input a search keyword and an exclusion keyword during acquisition of the text data from the external server 4, and during execution of the keyword input processing, a keyword input screen is displayed on the display 1a as illustrated in
In the keyword input screen, the keyword input icon 14 is inversely displayed and characters “Keyword Definition” are displayed on a lower side of the keyword input icon 14 to indicate that the keyword input processing is being executed.
Further, two input fields 14a and 14b and an Add button 14c are displayed in a center of the keyword input screen. The input field 14a is used to input a search keyword, and characters “Search Keyword” are displayed above the input field 14a to indicate such usage. Further, the Add button 14c is used to add the input field 14a.
In addition, the input field 14b is used to input an exclusion keyword, and characters “Exclusion Keyword” are displayed above the input field 14b to indicate such usage. The reason for using the exclusion keyword is as follows.
Specifically, when the text data is acquired from the external server 4, if the text data in the external server 4 contains a keyword that is unrelated to the search keyword but is identical or similar to it, it is highly likely that such text data will be acquired in a state of being confused with the intended text data. Therefore, the exclusion keyword is used to avoid acquisition of such unnecessary text data.
In the keyword input processing, the search keyword and the exclusion keyword are input by the user operation in a state where the keyword input screen is displayed.
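The role of the two keyword types can be sketched as a simple filter. The function name `matches_condition`, the rule that any one search keyword suffices, the substring matching, and the sample keywords are assumptions introduced for illustration only.

```python
def matches_condition(text, search_keywords, exclusion_keywords):
    """Return True if text contains a search keyword and no exclusion keyword."""
    has_search = any(k in text for k in search_keywords)
    has_exclusion = any(k in text for k in exclusion_keywords)
    return has_search and not has_exclusion


# Hypothetical example: "jaguar" as a car name, excluding posts about the animal
posts = [
    "test drove the new jaguar today",
    "saw a jaguar at the zoo",
    "my bicycle has a flat tire",
]
acquired = [p for p in posts if matches_condition(p, ["jaguar"], ["zoo"])]
print(acquired)
```

The second post contains the search keyword but is dropped because it also contains the exclusion keyword, which is the confusion the exclusion keyword is meant to prevent.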
Next, it is determined whether the keyword input processing is completed (STEP 18 in
When the determination is negative (NO in STEP 18 in
The additional information selection processing is to select information to be added to the text data when the text data is acquired from the media selected as described above, and during execution of the additional information selection processing, an additional information selection screen is displayed on the display 1a as illustrated in
In the additional information selection screen, the additional information selection icon 15 is inversely displayed and characters “Additional Info” are displayed below the icon to indicate that the additional information selection processing is being executed. In addition, three check boxes 15a to 15c are displayed on a left side of the additional information selection screen. The check box 15a is used to add sensitivity information to be described below to the acquired data, and characters “sensitivity information” are displayed on a right side of the check box 15a to indicate such usage.
In addition, the check box 15b is used to add information related to the keyword to the acquired data, and characters “Keyword Information” are displayed on a right side of the check box 15b to indicate such usage. Further, the check box 15c is used to improve the accuracy of the sensitivity information for long sentences, and characters “Improvement in accuracy of sensitivity information for long sentences” are displayed on a right side of the check box 15c to indicate such usage.
In order to indicate that any of the check boxes 15a to 15c is selected by the user operation in the state where the check boxes 15a to 15c are displayed as described above, the selected check box is checked and the check box is inversely displayed at the same time. In the example illustrated in
Next, it is determined whether the additional information selection processing is completed (STEP 20 in
When the determination is negative (NO in STEP 20 in
The final confirmation processing is to finally confirm the result selected and input by the user as described above, and during execution of the final confirmation processing, a final confirmation screen is displayed on the display 1a as illustrated in
In the final confirmation screen, the final confirmation icon 16 is inversely displayed and characters “Confirmation” are displayed below the icon to indicate that the final confirmation processing is being executed. In addition, various items set as described above and setting values of such items are displayed in a center of the final confirmation screen, and a Finish button 19 is displayed on a lower right side of the screen. The final confirmation processing is executed as described above.
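The settings collected through the media selection, period input, language selection, keyword input, and additional information selection could be held in a single structure such as the one sketched below. The class name `AcquisitionSettings`, its field names, the `problems` check, and the sample values (including the dates) are hypothetical; the description does not specify any validation.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class AcquisitionSettings:
    media: list
    start: date   # search start date
    end: date     # search end date
    language: str  # "Japanese/English", "Japanese", or "English"
    search_keywords: list
    exclusion_keywords: list = field(default_factory=list)
    additional_info: list = field(default_factory=list)

    def problems(self):
        """Return human-readable problems to resolve before acquisition."""
        found = []
        if not self.media:
            found.append("no media selected")
        if self.start > self.end:
            found.append("search start date is after search end date")
        if not self.search_keywords:
            found.append("no search keyword entered")
        return found


# Hypothetical settings as they might appear on a confirmation screen
settings = AcquisitionSettings(
    media=["TWITTER"],
    start=date(2020, 1, 1),
    end=date(2020, 3, 31),
    language="Japanese/English",
    search_keywords=["example"],
    additional_info=["sensitivity information"],
)
print(settings.problems())
```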
Next, it is determined whether the final confirmation processing is completed (STEP 22 in
When the determination is negative (NO in STEP 22 in
Specifically, the text data is acquired from the external server 4 of the media selected as described above via the main server 2, under the various conditions set by the user as described above. In this case, when both Japanese and English are selected as the language for acquiring the text data, a mixture of machine-translated English text data and Japanese text data is acquired as the text data. Alternatively, the text data may be acquired from the external server 4 by the data processing device 1 without using the main server 2.
Subsequently, sensitivity information extraction processing is executed (STEP 24 in
Next, preservation data is created (STEP 25 in
Next, the preservation data created as described above is stored in the storage of the device body 1b as a part of the database (STEP 26 in
Contents of the data cleansing processing (STEP 2 in
On the other hand, when the determination is affirmative (YES in STEP 40 in
In the data selection processing, a data selection screen is displayed on the display 1a as illustrated in
In order to indicate that the data selection processing is being executed, the data file selection icon 21 is inversely displayed, and characters “Select Data File” are displayed below the icon. At the same time, a display window 24a and a selection button 25a are displayed in a center of the data selection screen.
When the selection button 25a is pressed by the user operation, a menu screen (not illustrated) is displayed, and folders and data in the storage of the device body 1b are displayed (neither is illustrated). In such a state, when a data file to be subjected to the data cleansing processing is selected by the user operation, a path name of the folder in which the data file is stored and a data file name are displayed on the display window 24a. In the example illustrated in
In this case, when the respective processes of STEPs 1 to 4 illustrated in
Next, it is determined whether the data selection processing is completed (STEP 42 in
When the determination is negative (NO in STEP 42 in
The cleansing keyword processing is to exclude unnecessary data from the data file selected as described above, and during execution of the cleansing keyword processing, a cleansing keyword screen is displayed on the display 1a as illustrated in
In the cleansing keyword screen, the cleansing keyword icon 22 is inversely displayed and characters “Cleansingkeyword” are displayed on a lower side of the icon to indicate that the cleansing keyword processing is being executed.
Further, in a center of the cleansing keyword screen, text data in the data file are displayed from top to bottom in descending order of the number of overlapping times. In other words, when sets of completely matching text data exist in the data file, the sets are displayed in order from the largest set. Further, in each data, a ranking (No.) of the number of overlapping times, text data (TEXT), and the number of overlapping times (COUNT) are displayed from the left to the right.
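The ordering by number of overlapping times can be sketched with a frequency count over completely matching text data. The function name `rank_by_overlap` and the sample strings are assumptions; the (No., TEXT, COUNT) layout follows the columns described above.

```python
from collections import Counter


def rank_by_overlap(text_items):
    """Return (No., TEXT, COUNT) rows in descending order of exact-match count."""
    counts = Counter(text_items)
    return [
        (no, text, count)
        for no, (text, count) in enumerate(counts.most_common(), start=1)
    ]


# Hypothetical data file contents
data = ["buy now", "great ride", "buy now", "nice color", "great ride", "buy now"]
rows = rank_by_overlap(data)
for row in rows:
    print(row)
```

Showing the largest sets first lets a single noise designation remove the greatest number of items, which is why this order suits the cleansing step.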
On a left side of the text data, an operation button 24, a cleansing button 25, a keyword preservation button 26, and a keyword read button 27 are displayed in order from top to bottom. Further, on a lower right side of the text data, a large number of buttons 28a indicating the number of pages of the text data and buttons 28b and 28b configured to turn the pages of the text data are displayed.
When the user visually recognizes the text data displayed on the cleansing keyword screen and finds unnecessary text data, the user presses the operation button 24 via the input interface 1c, and then selects an exclusion keyword (noise) included in the unnecessary text data with a pointer. Then, when the exclusion keyword is selected in such a way, the selected exclusion keyword (“Kini speed” (in Japanese “”) in
When the cleansing button 25 is pressed by the user operation on the cleansing keyword screen, as illustrated in
In addition, when the cleansing button 25 is pressed by the user operation in the screen display state illustrated in
Next, it is determined whether the cleansing keyword processing is completed (STEP 44 in
When the determination is negative (NO in STEP 44 in
The final confirmation processing is to finally confirm the exclusion keyword selected by the user as described above, and during execution of the final confirmation processing, a final confirmation screen is displayed on the display 1a as illustrated in
In the final confirmation screen, the final confirmation icon 23 is inversely displayed and a character “Confirmation” is displayed below the icon to indicate that the final confirmation processing is being executed. Further, the search keyword and the exclusion keyword input in the cleansing keyword processing are displayed in a center of the final confirmation screen. In the example illustrated in
Next, it is determined whether the final confirmation processing is completed (STEP 46 in
When the determination is negative (NO in STEP 46 in
Contents of the above-described sensitivity information correction processing (STEP 3 in
On the other hand, when such determination is affirmative (YES in STEP 50 in
In the data selection processing, a data selection screen is displayed on the display 1a as illustrated in
In order to indicate that the data selection processing is being executed, the data file selection icon 31 is inversely displayed and characters “Select Data File” are displayed below the icon. At the same time, a display window 34 and a selection button 35 are displayed in a center of the data selection screen.
When the selection button 35 is pressed by the user operation, a menu screen (not illustrated) is displayed, and folders and data in the storage of the device body 1b are displayed (neither is illustrated). In such a state, when a data file to be subjected to sensitivity correction is selected by the user operation, a path name of the folder in which the data file is stored and a data file name are displayed on the display window 34.
Also in the data selection processing, when the preservation data, the cleansed data, the sensitivity-corrected data, and the database are stored in the storage of the device body 1b, the user can arbitrarily select any of these four types of data files. The data selection processing is executed as described above.
Next, it is determined whether the data selection processing is completed (STEP 52 in
When the determination is negative (NO in STEP 52 in
The sensitivity correction processing is to correct erroneous sensitivity information associated with the data file selected as described above, and during execution of the sensitivity correction processing, a sensitivity correction screen is displayed on the display 1a as illustrated in
In the sensitivity correction screen, a sensitivity correction icon 32 is inversely displayed and characters “SenseCheck” are displayed below the icon to indicate that the sensitivity correction processing is being executed.
Further, on the sensitivity correction screen, tabs 36a to 36c of three major categories “Positive”, “Neutral”, and “Negative” are displayed from left to right. Then, when any of these tabs 36a to 36c is selected by the user operation, sensitivity information and text information are displayed.
For example, as illustrated in
When each data is displayed in this way, the user can determine whether the sensitivity information is correct with reference to the contents of the sensitivity information, the sensitivity expression and the text data which are displayed. For example, in the example illustrated in
Then, in the case of correcting the sensitivity information in this way, the user operates the input interface 1c to press a pull-down menu button 37 located on a right side of the display window of the sensitivity information of the No. 1 data. In response, as illustrated in
Next, it is determined whether the sensitivity correction processing is completed (STEP 54 in
When the determination is negative (NO in STEP 54 in
The final confirmation processing is to finally confirm the sensitivity information corrected by the user as described above, and during execution of the final confirmation processing, a final confirmation screen is displayed on the display 1a as illustrated in
In the final confirmation screen, the final confirmation icon 33 is inversely displayed and the characters “Confirmation” are displayed below the icon to indicate that the final confirmation processing is being executed. Further, in a center of the final confirmation screen, text data (TEXT), sensitivity expression (EXPRESSION), sensitivity information before correction (BEFORE), and sensitivity information after correction (AFTER) are displayed from left to right. In the example illustrated in
Next, it is determined whether the final confirmation processing is completed (STEP 56 in
When the determination is negative (NO in STEP 56 in
The contents of the above-described user-definition tagging processing (STEP 4 in
On the other hand, when such determination is affirmative (YES in STEP 60 in
The data selection processing is to select a data file to which a user-definition tag to be described below is added, and during execution of the data selection processing, a data selection screen is displayed on the display 1a as illustrated in
In order to indicate that the data selection processing is being executed, the data file selection icon 41 is inversely displayed and characters “Select Data File” are displayed below the icon. At the same time, a display window 43 and a selection button 44 are displayed in a center of the data selection screen.
When the selection button 44 is pressed by the user operation, a menu screen (not illustrated) is displayed, and folders and data in the storage of the device body 1b are displayed (neither are illustrated). In such a state, when a data file is selected by the user operation, a path name of the folder in which the data file is stored and a data file name are displayed on the display window 43.
Also in the data selection processing, when the preservation data, the cleansed data, the sensitivity-corrected data, and the database are stored in the storage of the device body 1b, the user can arbitrarily select any of these four types of data files. The data selection processing is executed as described above.
Next, it is determined whether the data selection processing is completed (STEP 62 in
When the determination is negative (NO in STEP 62 in
The user-definition tag selection processing is to select the user-definition tag associated with the data file selected as described above, and during execution of the user-definition tag selection processing, a user-definition tag selection screen is displayed on the display 1a as illustrated in
In the user-definition tag selection screen, the user-definition tag selection icon 42 is inversely displayed and characters “Tag Definition” are displayed below the icon to indicate that the user-definition tag selection processing is being executed. At the same time, a display window 45 and a selection button 46 are displayed in a center of the user-definition tag selection screen, and a preview button 47 is displayed below the selection button 46.
When the selection button 46 is pressed by the user operation, a menu screen (not illustrated) is displayed, and folders and data in the storage of the device body 1b are displayed (neither are illustrated). In such a state, when a user-definition tag file tagged with the text data is selected by the user operation, a path name of the folder in which the user-definition tag file is stored and a user-definition tag file name are displayed on the display window 45.
As described above, when the preview button 47 is pressed by the user operation in the state where the user-definition tag file name is displayed on the display window 45, a user-definition tag screen is displayed on the display 1a as illustrated in
In the example illustrated in
The user can confirm the contents of the user-definition tag file selected by himself/herself with reference to the tag list 48. Further, the user can return to the screen display illustrated in
Next, it is determined whether the user-definition tag selection processing is completed (STEP 64 in
When the determination is negative (NO in STEP 64 in
Next, the tagged data is stored in the storage of the device body 1b as a part of the database (STEP 66 in
The contents of the above-described data visualization processing (STEP 5 in
On the other hand, when such determination is affirmative (YES in STEP 70 in
The data selection processing is to select a data file of the database to be displayed as a graph, and during execution of the data selection processing, a data selection screen is displayed on the display 1a as illustrated in
On an upper side of the data selection screen, a data file selection icon 51 is displayed. In order to indicate that the data selection processing is being executed, the data file selection icon 51 is inversely displayed and characters “Select Data File” are displayed below the icon. At the same time, a display window 52 and a selection button 53 are displayed in a center of the data selection screen.
When the selection button 53 is pressed by the user operation, a menu screen (not illustrated) is displayed, and folders and data in the storage of the device body 1b are displayed (neither are illustrated). In such a state, when a data file of the database is selected by the user operation, a path name of the folder in which the data file is stored and a data file name are displayed on the display window 52.
Also in the data selection processing, when the preservation data, the cleansed data, the sensitivity-corrected data, and the database are stored in the storage of the device body 1b, the user can arbitrarily select any of these four types of data files. The data selection processing is executed as described above.
Next, it is determined whether the data selection processing is completed (STEP 72 in
When the determination is negative (NO in STEP 72 in
The data display processing is to display various data items in the data file selected as described above in a graph so that the user can visually recognize them. A description will be given with respect to an example of displaying a data file in which the text data file acquired in the above-described data acquisition processing is subjected to all the data cleansing processing, the sensitivity information correction processing, and the user-definition tagging processing.
During execution of the data display processing, an initial display screen is displayed on the display 1a as illustrated in
On a right side of the annular graph, a large number of minor categories (for example, “question”, “inquiry”, and “request”) subordinate to the sensitivity information “Neutral” are displayed in the form of a bar graph. In the case of the bar graph, a horizontal axis indicates the number of hits, and this also applies to bar graphs below.
Further, below the annular graph showing the proportions of the three major categories, a large number of minor categories (for example, “good”, “want to buy”, and “thank you”) subordinate to the sensitivity information “Positive” are displayed in the form of a bar graph. Below the bar graph of the sensitivity information “Neutral”, a large number of minor categories (for example, “bad”, “discontent”, and “being in trouble”) subordinate to the sensitivity information “Negative” are displayed in the form of a bar graph.
In addition, below the bar graph of the sensitivity information “Positive”, a large number of minor categories (for example, “N BOX (registered trademark)”, “FIT (registered trademark)”, and “FREED (registered trademark)”) subordinate to the major category of the user-definition tag “4 wheels” are displayed in the form of a bar graph. Further, below the bar graph of the sensitivity information “Negative”, a large number of minor categories (for example, “CUB”, “BIO”, and “GOLD WING (registered trademark)”) subordinate to the major category of the user-definition tag “2 wheels” are displayed in the form of a bar graph.
In the bar graph of the sensitivity information “Neutral” on the initial display screen illustrated in
On the other hand, a return button 62 is displayed above a center of the inquiry related screen. When the return button 62 is pressed by the user operation, the screen displayed on the display 1a returns to the initial display screen from the inquiry related screen. In the bar graph of the sensitivity information “Neutral” on the initial display screen illustrated in
In the bar graph of the major category “2 wheels” of the user definition on the initial display screen illustrated in
A return button 62 is displayed above a center of the CUB related screen illustrated in
Next, it is determined whether the data display processing is completed (STEP 74 in
When the determination is negative (NO in STEP 74 in
As described above, according to the data processing device 1 of the present embodiment, after the conditions of the media, the search period, the language, the search keyword, and the exclusion keyword are determined as the predetermined acquisition conditions by the user operation in the data acquisition processing, the text data is acquired from the external server 4. Then, the acquired text data is stored as preservation data in the storage of the device body 1b.
In this case, even when text data including a keyword that is equal to or similar to the search keyword but is not related to the search keyword is present in the external server 4, the user can input, as the exclusion keyword, a keyword that avoids the acquisition of such text data. Therefore, the text data related to the search keyword can be accurately acquired.
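The acquisition conditions described above can be illustrated by the following minimal sketch. It is not the implementation of the embodiment: the `Post` class, the `matches_conditions` function, the case-insensitive substring matching, and the sample posts are all assumptions introduced for illustration, and the media condition is omitted for brevity.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Post:
    """Hypothetical container for one text information item."""
    text: str
    posted: date
    language: str


def matches_conditions(post, start, end, language, search_keyword, exclusion_keywords):
    """Return True when the post satisfies every acquisition condition."""
    if not (start <= post.posted <= end):      # search period
        return False
    if post.language != language:              # search language
        return False
    lowered = post.text.lower()
    if search_keyword.lower() not in lowered:  # search keyword must appear
        return False
    # Reject posts that contain any exclusion keyword.
    return not any(k.lower() in lowered for k in exclusion_keywords)


# Invented sample posts (not taken from the embodiment).
posts = [
    Post("The new FIT drives well", date(2019, 8, 1), "en"),
    Post("Keep fit with this workout", date(2019, 8, 2), "en"),
    Post("FIT review from last year", date(2018, 8, 1), "en"),
]

acquired = [
    p for p in posts
    if matches_conditions(p, date(2019, 1, 1), date(2019, 12, 31),
                          "en", "fit", ["workout"])
]
```

In this sketch, the second post contains the search keyword only in an unrelated sense and is rejected by the exclusion keyword, while the third post falls outside the search period.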
In the data cleansing processing, when the user finds unnecessary text data on the cleansing keyword screen, the user can delete all text data including the exclusion keyword and create the cleansed data by selecting the exclusion keyword included in the unnecessary text data and pressing the cleansing button 25.
At this time, since the text data in the data file is displayed from top to bottom in order from the largest number of overlapping times on the cleansing keyword screen, the user can select the exclusion keyword in order from the largest number of overlapping times of the text information. Therefore, the text information including the exclusion keyword as noise can be efficiently removed from the plurality of text information items.
Since the exclusion keyword input by the user is displayed on the cleansing keyword screen, the user can visually recognize the exclusion keyword selected up to the present time by the user. Thereby, convenience can be improved.
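The ordering by the number of overlapping times and the removal of text data including the exclusion keyword can be sketched as follows. This is only an illustrative sketch under assumptions: the function names `rank_by_duplicates` and `cleanse`, the simple substring match, and the sample texts are hypothetical and are not taken from the embodiment.

```python
from collections import Counter


def rank_by_duplicates(texts):
    """Return (text, count) pairs, most frequently repeated text first."""
    return Counter(texts).most_common()


def cleanse(texts, exclusion_keywords):
    """Drop every text that contains any of the exclusion keywords."""
    return [t for t in texts if not any(k in t for k in exclusion_keywords)]


# Invented sample data; "Kini speed" is used here as the exclusion keyword.
texts = [
    "Kini speed campaign now on",
    "Kini speed campaign now on",
    "Kini speed campaign now on",
    "I like the new model",
    "I like the new model",
    "Where can I buy it?",
]

ranked = rank_by_duplicates(texts)        # order shown on the cleansing keyword screen
cleansed = cleanse(texts, ["Kini speed"])  # cleansed data with the noise removed
```

Because the most frequently repeated text appears first, a single exclusion keyword chosen from the top of the list removes the largest share of noise in this sketch.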
Further, since the sensitivity information and the text data are displayed on the sensitivity correction screen in the sensitivity information correction processing, the user can easily correct the sensitivity information while visually recognizing the displayed contents.
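One possible way to hold the sensitivity information before and after correction, so that the final confirmation screen can list TEXT, EXPRESSION, BEFORE, and AFTER, is sketched below. The `Record` class, the `correct` and `confirmation_rows` functions, and the sample record are hypothetical, and restricting the correction to the three major categories is an assumption made for this sketch.

```python
from dataclasses import dataclass
from typing import Optional

MAJOR_CATEGORIES = ("Positive", "Neutral", "Negative")


@dataclass
class Record:
    """Hypothetical record: text data, sensitivity expression, and labels."""
    text: str
    expression: str
    sensitivity: str                 # sensitivity information before correction
    corrected: Optional[str] = None  # sensitivity information after correction


def correct(record, new_sensitivity):
    """Store the user's correction without overwriting the original label."""
    if new_sensitivity not in MAJOR_CATEGORIES:
        raise ValueError(f"unknown sensitivity: {new_sensitivity}")
    record.corrected = new_sensitivity


def confirmation_rows(records):
    """Return (TEXT, EXPRESSION, BEFORE, AFTER) rows for changed records only."""
    return [
        (r.text, r.expression, r.sensitivity, r.corrected)
        for r in records
        if r.corrected is not None and r.corrected != r.sensitivity
    ]


# Invented sample records.
records = [
    Record("I want to buy this one", "want to buy", "Negative"),
    Record("Where can I buy it?", "question", "Neutral"),
]
correct(records[0], "Positive")
rows = confirmation_rows(records)
```

Keeping the original label alongside the corrected one lets the user review each change before it is finally confirmed.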
In addition, since the database is created by associating the user-definition tag with the text data in the user-definition tagging processing, the database search can be executed based on the user-definition tag information, and the usefulness of the database can be further improved.
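The association of user-definition tags with text data, and a search based on those tags, can be sketched as follows. The matching rule (a case-insensitive substring match on the minor category name), the names `TAG_DEFINITIONS`, `tag_text`, and `search_by_tag`, and the sample texts are assumptions made for illustration; the embodiment does not specify how the tag file is matched against the text data.

```python
# Hypothetical contents of a user-definition tag file:
# major category -> minor categories.
TAG_DEFINITIONS = {
    "4 wheels": ["N BOX", "FIT", "FREED"],
    "2 wheels": ["CUB", "BIO", "GOLD WING"],
}


def tag_text(text, definitions):
    """Return (major, minor) tags whose minor category name appears in the text."""
    upper = text.upper()
    return [
        (major, minor)
        for major, minors in definitions.items()
        for minor in minors
        if minor in upper
    ]


def search_by_tag(tagged, major=None, minor=None):
    """Return texts that carry a tag matching the given major and/or minor category."""
    return [
        text
        for text, tags in tagged
        if any((major is None or m == major) and (minor is None or n == minor)
               for m, n in tags)
    ]


# Invented sample texts.
texts = ["My Cub and my Gold Wing", "The FREED has a lot of room"]
tagged = [(t, tag_text(t, TAG_DEFINITIONS)) for t in texts]
two_wheel_texts = search_by_tag(tagged, major="2 wheels")
```

A plain substring match is deliberately crude (for example, “FIT” would also match “benefit”), so a practical implementation would likely need word-boundary handling.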
Since the sensitivity information of the three major categories included in the database is displayed on the display 1a in the data visualization processing such that the colors are different from each other and the proportions thereof are known, the user can easily and visually recognize the proportions of the sensitivity information of the three major categories.
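The proportions shown in the annular graph can be computed from the sensitivity information of each data item, for example as in the following sketch. The function name `major_category_proportions` and the sample labels are hypothetical; the embodiment does not describe how the proportions are calculated, so a simple share of the total number of hits is assumed here.

```python
from collections import Counter

MAJOR_CATEGORIES = ("Positive", "Neutral", "Negative")


def major_category_proportions(labels):
    """Return each major category's share of the labeled items (0.0 to 1.0)."""
    counts = Counter(labels)
    total = sum(counts[c] for c in MAJOR_CATEGORIES)
    if total == 0:
        # No labeled data: report zero for every category instead of dividing by zero.
        return {c: 0.0 for c in MAJOR_CATEGORIES}
    return {c: counts[c] / total for c in MAJOR_CATEGORIES}


# Invented sample: 5 Positive, 3 Neutral, 2 Negative items.
labels = ["Positive"] * 5 + ["Neutral"] * 3 + ["Negative"] * 2
proportions = major_category_proportions(labels)
```

The same counts, grouped by minor category instead of major category, would give the numbers of hits plotted on the horizontal axis of the bar graphs.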
Although the embodiment is an example in which the personal computer-type data processing device 1 is used as the data processing device, the data processing device of the present invention may include the output interface, the input interface, the text information acquisition unit, the noise-removed information creation unit, and the database creation unit without being limited thereto. For example, a configuration in which the personal computer-type data processing device 1 and the main server 2 are combined may be used as the data processing device. In addition, a tablet terminal may be used as the data processing device, and a configuration in which the tablet terminal and the main server 2 are combined may be used as the data processing device.
Further, although the embodiment is an example in which the display 1a is used as the output interface, the output interface of the present invention may be any one capable of displaying a plurality of types of text information without being limited thereto. For example, one monitor or one touch panel-type monitor may be used as the output interface. In addition, a 3D hologram device or a head-mounted VR device may be used as the output interface.
Further, although the embodiment is an example in which the input interface 1c including the keyboard and the mouse is used as the input interface, the input interface of the present invention may be any one in which various operations are executed by the user without being limited thereto. For example, an optical pointing device such as a laser pointer may be used as the input interface, or contact-type devices such as a touch panel and a touch pen may be used as the input interface. Further, a contactless device capable of converting voice into various operations may be used as the input interface.
On the other hand, although the embodiment is an example in which conditions obtained by combinations of the search period, the search language, the search keyword, and the exclusion keyword, and the additional information are used as the predetermined acquisition conditions, the predetermined acquisition conditions of the present invention may use other conditions without being limited thereto. For example, as the predetermined acquisition conditions, conditions in which the search keyword and the exclusion keyword are further added to the above-described acquisition condition may be used.
In the embodiment, when the text data is displayed on the cleansing keyword screen as illustrated in
Further, although the embodiment is an example in which the exclusion keyword (Kini speed) is used as the noise, the noise of the present invention may be at least a part of each of the plurality of text information items without being limited thereto. For example, a combination of a plurality of words may be used as the noise.
On the other hand, although the embodiment is an example in which SNS media configured by the external server 4 are used as the predetermined media, the predetermined media of the present invention are not limited thereto and may be hardware such as TV and radio, or mass media whose information is published on paper, such as a newspaper. When mass media such as TV, radio, and newspaper are used as the predetermined media, information (moving picture information, voice information, and character information) published on TV, radio, and newspaper may be input as text data via an input interface such as a personal computer.
In addition, although the embodiment is an example in which the sensitivity information is classified into two levels, that is, a major category and a minor category, the sensitivity information of the present invention may be classified into a plurality of levels from the highest level to the lowest level without being limited thereto. For example, the sensitivity information may be classified into three or more levels.
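A classification into three or more levels can be represented, for example, as a nested mapping from each category to its subordinate categories. The following is a minimal sketch: the function `category_paths` is hypothetical, and the third-level names (“price inquiry” and “delivery inquiry”) are invented for illustration and do not appear in the embodiment.

```python
def category_paths(tree, prefix=()):
    """Return the path from the highest level to each lowest-level category."""
    paths = []
    for name, children in tree.items():
        path = prefix + (name,)
        if children:
            # Descend into subordinate categories.
            paths.extend(category_paths(children, path))
        else:
            # No subordinate categories: this is a lowest-level category.
            paths.append(path)
    return paths


# Two levels under "Positive", three levels under "Neutral" (third level invented).
sensitivity_tree = {
    "Positive": {"good": {}, "want to buy": {}},
    "Neutral": {"inquiry": {"price inquiry": {}, "delivery inquiry": {}}},
}

paths = category_paths(sensitivity_tree)
max_depth = max(len(p) for p in paths)
```

Because the mapping is recursive, the same structure covers the two-level classification of the embodiment and any deeper classification without change.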
Number | Date | Country | Kind |
---|---|---|---|
2019-161263 | Sep 2019 | JP | national |