The present disclosure generally relates to a method for editing an electronic document, and more particularly, to a method, an electronic apparatus, and a non-transitory computer readable medium for masking data on an electronic document.
In current business practices, using the Internet to transmit electronic documents has become a daily routine for office workers. However, if electronic documents containing confidential business information or personal information are received by unauthorized individuals, the data may be misused for illegal activities, causing harm to the interests of individuals or companies.
In order to solve the above problems, it is necessary to provide a method for masking data on an electronic document to prevent confidential business information or personal information from being illegally used.
Accordingly, the present disclosure provides a method for masking data on an electronic document. The method is performed by an electronic apparatus and includes the following steps: displaying the electronic document on a user interface; causing at least one analysis module to perform at least one analysis on the electronic document and a plurality of strings of the electronic document and output a first string among the plurality of strings and first position information associated with the first string according to at least one result of the at least one analysis; obtaining the first string and the first position information from the at least one analysis module; and generating, based on the obtained first position information and the obtained first string, a first masking object to mask at least one portion of the first string on the electronic document.
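As an illustration only, the last step above (generating a masking object from a string and its position information) can be sketched as follows. The names `PositionInfo`, `MaskingObject`, and `make_masking_object`, and the assumption that the characters are evenly spaced on a single line, are hypothetical and are not taken from the disclosure.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PositionInfo:
    """Where a string sits on the document (hypothetical layout)."""
    page: int
    x: float
    y: float
    width: float
    height: float


@dataclass(frozen=True)
class MaskingObject:
    """An opaque rectangle covering all or part of a string."""
    page: int
    x: float
    y: float
    width: float
    height: float
    covered_text: str


def make_masking_object(string: str, pos: PositionInfo,
                        start: int = 0,
                        end: Optional[int] = None) -> MaskingObject:
    """Build a mask covering characters [start, end) of `string`.

    Assumption: characters are evenly spaced on one line. Real layouts
    rarely guarantee this; the function only shows the data flow.
    """
    if end is None:
        end = len(string)
    if not string or not (0 <= start < end <= len(string)):
        raise ValueError("invalid character range")
    char_width = pos.width / len(string)
    return MaskingObject(
        page=pos.page,
        x=pos.x + start * char_width,
        y=pos.y,
        width=(end - start) * char_width,
        height=pos.height,
        covered_text=string[start:end],
    )
```

Passing `start` and `end` corresponds to masking only a portion of the string; omitting them masks the whole string.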
In an embodiment of the present disclosure, the method further includes: causing the at least one analysis module to output a second string among the plurality of strings and second position information associated with the second string according to the at least one result of the at least one analysis; obtaining the second string and the second position information from the at least one analysis module; and generating, based on the obtained second position information and the obtained second string, a second masking object to mask at least one portion of the second string on the electronic document.
In an embodiment of the present disclosure, the first position information includes at least one first position coordinate to indicate a first position where the first string is located on the electronic document. In addition, the second position information includes at least one second position coordinate to indicate a second position where the second string is located on the electronic document.
In an embodiment of the present disclosure, the first position information further includes a first width value and a first height value for respectively indicating a first width and a first height of the first string on the electronic document. In addition, the second position information further includes a second width value and a second height value for respectively indicating a second width and a second height of the second string on the electronic document.
In an embodiment of the present disclosure, the method further includes: displaying a data type menu on the user interface, wherein the data type menu includes a first option and a second option, and the first option and the second option are respectively associated with a first data type and a second data type; detecting whether at least one of the first option and the second option is selected; and causing the at least one analysis module to perform the at least one analysis based on the first data type and the second data type after detecting that the first option and the second option are selected. In addition, the at least one result of the at least one analysis includes that the first string is determined as the first data type and that the second string is determined as the second data type.
In an embodiment of the present disclosure, the method further includes: displaying a data type list on the user interface after obtaining the first string, the second string, the first position information and the second position information from the at least one analysis module, wherein the data type list includes a first region associated with the first data type and a second region associated with the second data type; and listing, based on the first data type and the second data type, the first string and the second string, respectively, on the first region and the second region.
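One possible way to arrange detected strings into the regions of such a data type list is sketched below. The function name `build_data_type_list` and the representation of a detection as a `(data_type, string)` pair are assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def build_data_type_list(
    detections: Iterable[Tuple[str, str]]
) -> Dict[str, List[str]]:
    """Group detected strings by data type, one region per data type.

    Each detection is a hypothetical (data_type, string) pair; the order
    of strings inside a region follows the order of the input.
    """
    regions: Dict[str, List[str]] = defaultdict(list)
    for data_type, string in detections:
        regions[data_type].append(string)
    return dict(regions)
```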
In an embodiment of the present disclosure, the method further includes: detecting whether the first string listed on the first region is selected; generating, based on the obtained first position information, a first marking object to mark at least one portion of the first string on the electronic document after detecting that the first string listed on the first region is selected; and generating, in response to a first operation event, the first masking object to replace the first marking object.
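The transition from a marking object to a masking object may be modeled, under simplifying assumptions, as a small state table keyed by string. The class `DocumentOverlay` and its method names are hypothetical; the sketch tracks only which kind of object covers each string, not how it is drawn.

```python
class DocumentOverlay:
    """Tracks which overlay object, if any, sits on each detected string."""

    def __init__(self) -> None:
        # Maps a string to "marking" or "masking".
        self.objects = {}

    def select(self, string: str) -> None:
        """A listed string was selected: mark it unless it is already masked."""
        if self.objects.get(string) != "masking":
            self.objects[string] = "marking"

    def on_first_operation_event(self) -> None:
        """Replace every marking object with a masking object."""
        for string, kind in self.objects.items():
            if kind == "marking":
                self.objects[string] = "masking"
```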
In an embodiment of the present disclosure, the method further includes recording, in response to a second operation event, the first position information and the second position information in an information file and storing the information file in a database. In addition, the information file is independent from the electronic document.
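A minimal sketch of keeping position information in an information file that is separate from the electronic document is given below. It assumes a JSON file on disk standing in for the database; the function names and the record layout are hypothetical.

```python
import json
from pathlib import Path
from typing import Any, Dict, List


def save_information_file(records: List[Dict[str, Any]], path: str) -> None:
    """Write position records to a JSON file separate from the document."""
    payload = {"records": records}
    Path(path).write_text(json.dumps(payload, indent=2), encoding="utf-8")


def load_information_file(path: str) -> List[Dict[str, Any]]:
    """Read the position records back from the information file."""
    return json.loads(Path(path).read_text(encoding="utf-8"))["records"]
```

Because the file holds only positions, the document itself is left unmodified, and the masks can be regenerated from the stored records when needed.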
In an embodiment of the present disclosure, the method further includes: generating, based on the first position information, a first marking object to mark at least one portion of the first string on the electronic document after obtaining the first string, the second string, the first position information and the second position information from the at least one analysis module; detecting whether the first string listed in the first region is selected; and removing the first marking object after the first string listed in the first region is selected.
In an embodiment of the present disclosure, the at least one analysis module is provided in at least one server, and the at least one analysis includes a first analysis and a second analysis.
In an embodiment of the present disclosure, the method further includes: transmitting an analysis request and the electronic document to the at least one server; causing the at least one analysis module to perform, in response to the analysis request, the first analysis on the plurality of strings of the electronic document and output the first string and the second string according to a first result of the first analysis; causing the at least one analysis module to perform, in response to the analysis request, the second analysis on positions where the first string and the second string are located on the electronic document and output the first position information and the second position information according to a second result of the second analysis; and receiving the first string, the second string, the first position information and the second position information from the at least one server to obtain the first string, the second string, the first position information and the second position information.
The disclosure also provides an electronic apparatus for masking data on an electronic document. The electronic apparatus includes a processor, a display and a storage. The display is electrically coupled to the processor. The storage is electrically coupled to the processor and configured to store a plurality of computer-executable instructions. The plurality of computer-executable instructions, when executed by the processor, cause the electronic apparatus to perform the following steps: causing the display to display the electronic document on a user interface; causing at least one analysis module to perform at least one analysis on the electronic document and a plurality of strings of the electronic document and output a first string among the plurality of strings and first position information associated with the first string according to at least one result of the at least one analysis; obtaining the first string and the first position information from the at least one analysis module; and generating, based on the obtained first position information and the obtained first string, a first masking object to mask at least one portion of the first string on the electronic document.
In an embodiment of the present disclosure, the plurality of computer-executable instructions further cause the electronic apparatus to perform the following steps: causing the at least one analysis module to output a second string among the plurality of strings and second position information associated with the second string according to the at least one result of the at least one analysis; obtaining the second string and the second position information from the at least one analysis module; and generating, based on the obtained second position information and the obtained second string, a second masking object to mask at least one portion of the second string on the electronic document.
In an embodiment of the present disclosure, the plurality of computer-executable instructions further cause the electronic apparatus to perform the following steps: causing the display to display a data type menu on the user interface, wherein the data type menu includes a first option and a second option, and the first option and the second option are respectively associated with a first data type and a second data type; detecting whether at least one of the first option and the second option is selected; and causing the at least one analysis module to perform the at least one analysis based on the first data type and the second data type after detecting that the first option and the second option are selected. In addition, the at least one result of the at least one analysis includes that the first string is determined as the first data type and that the second string is determined as the second data type.
In an embodiment of the present disclosure, the plurality of computer-executable instructions further cause the electronic apparatus to perform the following steps: causing the display to display a data type list on the user interface after obtaining the first string, the second string, the first position information and the second position information from the at least one analysis module, wherein the data type list includes a first region associated with the first data type and a second region associated with the second data type; and listing, based on the first data type and the second data type, the first string and the second string, respectively, on the first region and the second region.
In an embodiment of the present disclosure, the plurality of computer-executable instructions further cause the electronic apparatus to perform the following steps: detecting whether the first string listed on the first region is selected; generating, based on the obtained first position information, a first marking object to mark at least one portion of the first string on the electronic document after detecting that the first string listed on the first region is selected; and generating, in response to a first operation event, the first masking object to replace the first marking object.
In an embodiment of the present disclosure, the plurality of computer-executable instructions further cause the electronic apparatus to perform the following steps: generating, based on the first position information, a first marking object to mark at least one portion of the first string on the electronic document after obtaining the first string, the second string, the first position information and the second position information from the at least one analysis module; detecting whether the first string listed in the first region is selected; and removing the first marking object after the first string listed in the first region is selected.
In an embodiment of the present disclosure, the plurality of computer-executable instructions further cause the electronic apparatus to perform the following steps: transmitting an analysis request and the electronic document to the at least one server; causing the at least one analysis module to perform, in response to the analysis request, the first analysis on the plurality of strings of the electronic document and output the first string and the second string according to a first result of the first analysis; causing the at least one analysis module to perform, in response to the analysis request, the second analysis on positions where the first string and the second string are located on the electronic document and output the first position information and the second position information according to a second result of the second analysis; and receiving the first string, the second string, the first position information and the second position information from the at least one server to obtain the first string, the second string, the first position information and the second position information.
The present disclosure also provides a non-transitory computer readable medium that stores a plurality of computer-executable instructions. The plurality of computer-executable instructions, when executed by one or more processors, cause an electronic apparatus to perform a method for masking data on an electronic document, and the method includes the following steps: displaying the electronic document on a user interface; causing at least one analysis module to perform at least one analysis on the electronic document and a plurality of strings of the electronic document and output a first string among the plurality of strings and first position information associated with the first string according to at least one result of the at least one analysis; obtaining the first string and the first position information from the at least one analysis module; and generating, based on the obtained first position information and the obtained first string, a first masking object to mask at least one portion of the first string on the electronic document.
In order to make the above and other objects, features, advantages and embodiments of the present disclosure more readily understood, the accompanying drawings are described as follows:
In the present disclosure, “a,” “an,” and “the” may refer to a singular form or a plural form, unless an article is specifically restricted to be a singular form in the context.
In addition, as used herein, the terms “comprise/comprising,” “include/including,” “have/having,” and the like are open-ended terms that imply the inclusion of the disclosed features, elements and/or components, but do not preclude the presence or addition of one or more other features, elements, components, and/or groups thereof.
The term “coupled” used in the present disclosure may indicate that two or more elements or devices are in direct physical contact with each other or in indirect physical contact with each other and may also indicate that two or more elements or devices cooperate or interact with each other.
Furthermore, the ordinal terms (such as “first,” “second,” “third,” and the like) used in the present disclosure and claims are used to modify an element itself. They do not imply any priority or order of one element over another element, nor a chronological order of the steps of a method performed, but are used only as labels to distinguish a claimed element having a particular name from another element having the same name.
The spirit of the present disclosure will be clearly illustrated with drawings and detailed descriptions below. After understanding the embodiments of the present disclosure, those skilled in the art with ordinary knowledge can make modifications and variations based on the technologies taught in the present disclosure without departing from the spirit and scope of the present disclosure.
In the present embodiment, the first service server 140 includes a first analysis module 140a configured to receive a first command from the management server 120, perform a first analysis on an electronic document and/or a text content of the electronic document according to the first command, and then transmit a first analysis result generated by the first analysis back to the management server 120 in response to the first command received from the management server 120. In addition, the second service server 150 includes a second analysis module 150a configured to receive a second command from the management server 120, perform a second analysis on a position where the confidential or sensitive information is located on an electronic document according to the second command, and then transmit a second analysis result generated by the second analysis back to the management server 120 in response to the second command received from the management server 120. The services provided by the first service server 140 and the second service server 150 may be, for example, machine learning-based artificial intelligence (AI) analysis services for analyzing a document to automatically extract texts, keywords, and/or tables and convert the electronic document into useful information. In addition, the first analysis module 140a and the second analysis module 150a may be different AI models generated by analyzing a large volume of electronic documents through a machine learning technology or a deep learning technology. In an embodiment, the first service server 140 may be, for example, a ChatGPT server, and the first command may be, for example, a ChatGPT command (e.g., a prompt command). In addition, the second service server 150 may be a different type of AI server, such as an Azure server provided by Microsoft, and the service provided by the second service server 150 may be an Azure AI document intelligence service.
Further, the management server 120 may have a first application programming interface (API) and a second API installed thereon and transmit analysis requests and input data (e.g., an electronic document and/or text data of the electronic document) to the first service server 140 and the second service server 150 via the first API and the second API, respectively. In addition, the management server 120 may receive analysis results (e.g., strings or data generated from analyses or position information related to the strings or data) from the first service server 140 and the second service server 150 via the first API and the second API, respectively.
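The division of work between the two analyses may be sketched as follows. The two callables `first_api` and `second_api` are hypothetical stand-ins for the first API and the second API; they make no call to any real service, and their signatures and return shapes are assumptions chosen for illustration.

```python
from typing import Any, Callable, Dict, List, Tuple

# Hypothetical signatures: the first analysis returns (data_type, string)
# pairs; the second analysis maps each string to its position information.
FirstApi = Callable[[str, List[str]], List[Tuple[str, str]]]
SecondApi = Callable[[str, List[str]], Dict[str, Dict[str, Any]]]


def run_analyses(document_text: str, data_types: List[str],
                 first_api: FirstApi,
                 second_api: SecondApi) -> List[Dict[str, Any]]:
    """Run the string analysis, then the position analysis, and merge them."""
    detections = first_api(document_text, data_types)
    positions = second_api(document_text, [s for _, s in detections])
    records = []
    for number, (data_type, string) in enumerate(detections, start=1):
        records.append({
            "number": number,
            "data_type": data_type,
            "string": string,
            # None when the second analysis found no position for the string.
            "position": positions.get(string),
        })
    return records
```

Keeping the two services behind plain callables means either one can be swapped for a different provider without changing the merge step.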
In the step S12, the processor 102 may select the function option E “Mask Data” according to the control of the first input device 108 or the second input device 110 and cause the display 104 to display a document type menu on the item selection region 208 of the user interface 200 after the function option E “Mask Data” is selected. In the present embodiment, the document type menu includes a plurality of document type options, such as a “General Document” option 208a, a “Contract” option 208b, a “Personal Information” option 208c, and a “Resume” option 208d. When a user would like to mask confidential information or sensitive information in the electronic document 210, the user can use the first input device 108 or the second input device 110 to select, according to the document type of the electronic document 210, the “Contract” option 208b and a “Next” button 212, thus causing the processor 102 to select the “Contract” option 208b and the “Next” button 212 according to the control of the first input device 108 or the second input device 110. In the present embodiment, when the “Contract” option 208b is selected, a “V” will appear in a check box of the “Contract” option 208b to indicate that the “Contract” option 208b has been selected. After selecting the “Next” button 212, the processor 102 will perform the step S14.
Now referring to
In the step S16, the processor 102 detects whether a data type option is selected. In the present embodiment, when the confidential or sensitive information that the user would like to mask in the electronic document 210 includes names, phone numbers, and ID numbers, the user can use the first input device 108 or the second input device 110 to select the “Name” option 214a, the “Phone No.” option 214b, and the “ID No.” option 214c, thus causing the processor 102 to detect that the “Name” option 214a, the “Phone No.” option 214b, and the “ID No.” option 214c are selected. When these three options are selected, a “V” will appear in the check box of each of them to indicate that it has been selected. In addition, when the processor 102 does not detect that any of the options 214a to 214f is selected, the processor 102 will repeat the step S16. Next, the user can use the first input device 108 or the second input device 110 to select a “Start Detection” button 216, thus causing the processor 102 to select, according to the control of the first input device 108 or the second input device 110, the “Start Detection” button 216 to start detecting the strings associated with “Name,” “Phone No.,” and “ID No.” in the electronic document 210. After selecting the “Start Detection” button 216, the processor 102 will perform the step S18.
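The check performed in the step S16 can be illustrated with a short sketch. The function names and the representation of the menu as a mapping from option label to a selected flag are assumptions; only the three option labels named above are used.

```python
from typing import Dict, List


def selected_data_types(options: Dict[str, bool]) -> List[str]:
    """Return the labels of the data type options that are checked."""
    return [label for label, checked in options.items() if checked]


def can_start_detection(options: Dict[str, bool]) -> bool:
    """Detection may proceed only when at least one option is selected."""
    return len(selected_data_types(options)) > 0
```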
Referring to
In the present embodiment, the position information associated with each string output by the second analysis module 150a may include at least one coordinate, a width value and a height value. The at least one coordinate is used to indicate a position where a corresponding string is located on the electronic document 210 and can be represented, for example, by (x, y) coordinates. The width value and the height value are used to respectively indicate a width and a height of the corresponding string in the electronic document 210, and the length unit of the width value and the length unit of the height value may be defined according to different requirements. For example, the length unit may be the number of characters, the number of letters, the ratio of the width of the electronic document to the width of the string, or the ratio of the height of the electronic document to the height of the string, but is not limited thereto. In another embodiment, the position information may further include a page number indicating the page of the electronic document 210 on which a corresponding string is located. In one embodiment of the present disclosure, before transmitting the strings “David Chen,” “John Huang,” “0983001333,” “0977456078,” “S381456745,” and “N123456789” and the position information associated with these strings to the electronic apparatus 100, the management server 120 organizes and arranges these strings and the position information. For example, the management server 120 may arrange these strings and the position information into multiple data records shown in Table 1 below according to numbers and transmit each data record to the electronic apparatus 100 according to the numbering sequence shown in Table 1. In another embodiment of the present disclosure, the management server 120 may store the multiple data records shown in Table 1 below into an information file and transmit the information file to the electronic apparatus 100.
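Where the width value and the height value are expressed as ratios of the document dimensions to the string dimensions, one possible conversion to absolute lengths is sketched below. The function name and the choice of page units are assumptions.

```python
from typing import Tuple


def string_extent_from_ratios(page_width: float, page_height: float,
                              width_ratio: float,
                              height_ratio: float) -> Tuple[float, float]:
    """Convert document-to-string ratios into an absolute width and height.

    width_ratio  = page width  / string width
    height_ratio = page height / string height
    The result is in the same unit as the page dimensions.
    """
    if width_ratio <= 0 or height_ratio <= 0:
        raise ValueError("ratios must be positive")
    return page_width / width_ratio, page_height / height_ratio
```

For instance, on a page 612 units wide and 792 units high, a width ratio of 6 and a height ratio of 66 correspond to a string 102 units wide and 12 units high.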
In the step S20, the processor 102 receives the eight data records as listed in Table 1 from the management server 120 to obtain the strings associated with “Name,” “Phone No.,” and “ID No.” output by the first analysis module 140a and the position information of the strings output by the second analysis module 150a. In the present embodiment, each data record may include a data number, a data type, a string content, a page number, position coordinates, a string height and/or a string width, but is not limited thereto. In other embodiments, each data record may further include a data creation date, a data modification date, and/or the name of a person who modified the data. After obtaining the strings associated with “Name,” “Phone No.” and “ID No.” and the position information of the strings, the processor 102 will perform the step S22.
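A data record with the fields listed above may be represented, for illustration, as follows. The class name `DataRecord`, the field names, and the helper `order_records` are hypothetical, and no value from Table 1 is reproduced.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class DataRecord:
    """One detected string together with its position information."""
    number: int
    data_type: str
    content: str
    page: int
    coordinates: Tuple[float, float]
    width: Optional[float] = None   # the string width may be omitted
    height: Optional[float] = None  # the string height may be omitted


def order_records(records: List[DataRecord]) -> List[DataRecord]:
    """Return the records in their numbering sequence."""
    return sorted(records, key=lambda record: record.number)
```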
Referring to
Now referring to
In the step S26, after detecting that the string content listed in the selectable regions 223, 224, 226, and 227 is selected, the processor 102 generates, based on the position information obtained in the step S20, marking objects 230, 232, 234, and 236 to mark the strings “0983001333,” “0977456078,” “S381456745,” and “N123456789” on the electronic document 210 as shown in
In an alternative embodiment of the present disclosure, after detecting that the string content listed in the selectable regions 223, 224, 226, 227 is selected, the processor 102 may not generate the marking objects 230, 232, 234, and 236, as shown in
Now referring to
In the above embodiments, the electronic document 210 may be, for example, an electronic document in PDF format. In addition, after the masking objects 240, 242, 244, and 246 mask the strings “0983001333,” “0977456078,” “S381456745,” and “N123456789” on the electronic document 210, a PDF flattening technology can be used to flatten the masking objects 240, 242, 244, and 246 and the strings “0983001333,” “0977456078,” “S381456745,” and “N123456789” into the same layer, thus making the strings “0983001333,” “0977456078,” “S381456745,” and “N123456789” non-editable or non-searchable thereafter to enhance the masking and confidentiality effect.
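The effect described above, in which masked strings are no longer searchable, can be illustrated with a toy model. This sketch does not use a PDF library and does not perform actual PDF flattening; it merely drops text runs whose bounding boxes overlap a mask rectangle, and all names in it (`Rect`, `TextRun`, `flatten`, `searchable_text`) are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Rect:
    x: float
    y: float
    width: float
    height: float

    def overlaps(self, other: "Rect") -> bool:
        """True when the two rectangles share a region of positive area."""
        return (self.x < other.x + other.width
                and other.x < self.x + self.width
                and self.y < other.y + other.height
                and other.y < self.y + self.height)


@dataclass(frozen=True)
class TextRun:
    text: str
    box: Rect


def flatten(runs: List[TextRun], masks: List[Rect]) -> List[TextRun]:
    """Keep only the text runs that no mask rectangle overlaps."""
    return [run for run in runs
            if not any(run.box.overlaps(mask) for mask in masks)]


def searchable_text(runs: List[TextRun]) -> str:
    """Join the remaining runs into the text a search would see."""
    return " ".join(run.text for run in runs)
```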
Now referring to
It should be noted that, in the above embodiments, the masking objects 240, 242, 244, and 246 shown in
The present disclosure also provides a non-transitory computer readable medium that stores a plurality of computer-executable instructions. The plurality of computer-executable instructions, when executed by one or more processors, cause an electronic apparatus (e.g., the electronic apparatus 100 shown in
Although the present disclosure has been disclosed by way of the above embodiments, the embodiments are not intended to limit the present disclosure, and those skilled in the art will appreciate that changes and modifications may be made therein as long as those changes and modifications do not deviate from the spirit and the scope of the present disclosure. Therefore, the scope of the present disclosure should be construed according to the definitions in the appended claims.
The present disclosure claims the benefit of and priority to U.S. Provisional Patent Application Ser. No. 63/522,245, filed on Jun. 21, 2023, entitled “METHODS AND SYSTEMS FOR ARTIFICIAL INTELLIGENCE-ASSISTED DOCUMENT ANNOTATION AND DATA COLLECTION” (hereinafter referred to as “the '245 provisional”). The disclosure of the '245 provisional is hereby incorporated fully by reference into the present disclosure.
Number | Date | Country
---|---|---
63522245 | Jun 2023 | US