This application claims priority from Indian Patent Application No. 201611035958, filed on Oct. 20, 2016 in the Indian Patent Office, the entire disclosure of which is hereby incorporated by reference.
Example embodiments of the present disclosure relate to a device and method for providing at least one functionality to a user with respect to at least one of a plurality of webpages.
Many news websites, product review websites, e-commerce websites, blogs, etc. provide information related to events, products, etc. Most of the time, the problem a user faces in browsing multiple websites presenting information on the same topic is that 70-80% of the information presented on each website is duplicated across different webpages. This forces the user to go through the same information multiple times just to find what extra information may be present on a given website. If the content present on the webpages is mostly text-based, it is even more inconvenient for the user to identify what new information is presented in this content. Also, if the user finds the answer to the user's query on the first search result page, the user has to manually close the rest of the search result tabs. Related art methods have not been able to address this problem in an effective manner.
For example, a page comparison tool may compare the text across two different pages. The results may display information including the Page Title, Meta Tag info, and common phrases occurring on the pages in a side by side analysis, as illustrated in
Yet another example may involve a computerized method for monitoring the content of documents. A set of documents may be stored in memories of server computers. The first and second abstracts are compared to identify documents that have changed between the time the set of documents were indexed and the time the result set is generated. A webpage may be updated at certain time intervals by comparing content of a previous version of a webpage and the latest version of the webpage so that if any content is updated on that particular webpage, the change will be detected. However, such a method fails to provide a contextual comparison of two different webpages based upon information requested by the user.
In still another example, a synchronized comparison and presentation system may, when a page in a website is presented, simultaneously and automatically present a similar webpage from a different site based on a search keyword automatically obtained from the page, and may also control a display mode of the similar page to be synchronous with a display mode of the original webpage. This method, however, does not identify what is contextually different between the webpages and what new information might be present on a webpage.
In still yet another example, a system may include a module for receiving an identifier of a base site that becomes the basis of a displayed presentation and receives an identifier of a compared website displayed in a language different from that of the base site, a module for specifying a base page from the base site, a module for consolidating words of different languages into a single language, a module for producing information for comparing the base page, a module for specifying a related page similar to the base page from among each compared page based on the word information of the base page and the word information of each compared page, and a module for presenting the related page together with the base page on the same display screen. This system aims to provide related webpages based upon content of current webpage. However, it does not provide any mechanism to highlight or identify what new content is present on a related webpage.
In light of the foregoing, with conventional methods and devices, it is difficult to identify new content on a website when a user switches from a previous website to another website related to a common topic. Also, the user activity pattern is not considered when switching between multiple websites related to the same topic. Further, content that has already been browsed and/or un-browsed on a webpage is not identified.
Thus, there is an unmet need for a device and method that overcomes the disadvantages of the conventional devices in a simple and efficient manner.
One or more example embodiments provide a device and a method for providing functionalities based on webpage similarity analysis.
According to an aspect of an example embodiment, there is provided a method for providing at least one functionality using an electronic device, the method including: receiving a first user input for selecting a plurality of first webpages through a web browser; identifying, based on a predetermined criterion, a plurality of second webpages from the plurality of first webpages, the plurality of second webpages being related to each other; extracting key points corresponding to each of the plurality of second webpages; comparing the key points corresponding to each of the plurality of second webpages; and providing, based on a result of the comparing of the key points, at least one functionality to a user with respect to the plurality of second webpages. According to an aspect of an example embodiment, a device may include a display, an input device, and a processor electrically connected to the display and the input device. The processor may be configured to: receive, through the input device, a first user input for selecting a plurality of first webpages through a web browser; identify, based on a predetermined criterion, a plurality of second webpages from the plurality of first webpages, the plurality of second webpages being related to each other; extract key points corresponding to each of the plurality of second webpages; compare the key points corresponding to each of the plurality of second webpages with each other; and provide, based on a result of the comparing of the key points, at least one functionality to a user with respect to the plurality of second webpages.
To further clarify advantages and aspects of the disclosure, a more particular description of the disclosure will be rendered with reference to specific embodiments thereof, which are illustrated in the appended drawings. It is appreciated that these drawings depict only example embodiments of the disclosure and are therefore not to be considered limiting of its scope. The disclosure will be described and explained with additional specificity and detail with the accompanying drawings in accordance with various embodiments of the disclosure, wherein:
It should be understood at the outset that although illustrative implementations of the example embodiments of the present disclosure are illustrated below, the present disclosure may be implemented using any number of techniques, whether currently known or in existence. The present disclosure should in no way be limited to the illustrative implementations, drawings, and techniques illustrated below, including the exemplary design and implementation illustrated and described herein, but may be modified within the scope of the appended claims along with their full scope of equivalents. The term “some” as used herein is defined as “none, or one, or more than one, or all.” Accordingly, the terms “none,” “one,” “more than one,” “more than one, but not all” or “all” would all fall under the definition of “some.” The term “some embodiments” may refer to no embodiments or to one embodiment or to several embodiments or to all embodiments. Accordingly, the term “some embodiments” is defined as meaning “no embodiment, or one embodiment, or more than one embodiment, or all embodiments.” The terminology and structure employed herein is for describing, teaching and illuminating some embodiments and their specific features and elements and does not limit, restrict or reduce the spirit and scope of the claims or their equivalents. More specifically, any terms used herein such as but not limited to “includes,” “comprises,” “has,” “consists,” and grammatical variants thereof do not specify an absolute limitation or restriction and certainly do not exclude the possible addition of one or more features or elements, unless otherwise stated, and furthermore must not be taken to exclude the possible removal of one or more of the listed features and elements, unless otherwise stated with the limiting language such as “must comprise” or “needs to include,” etc.
Whether or not a certain feature or element was limited to being used only once, either way it may still be referred to as “one or more features” or “one or more elements” or “at least one feature” or “at least one element.” Furthermore, the use of the terms “one or more” or “at least one” feature or element does not preclude there being none of that feature or element, unless otherwise specified by limiting language such as “there needs to be one or more . . . ” or “one or more element is required.” Unless otherwise defined, all terms, and especially any technical and/or scientific terms, used herein may be taken to have the same meaning as commonly understood by one having an ordinary skill in the art. Reference is made herein to some “embodiments.” It should be understood that an embodiment is an example of a possible implementation of any features and/or elements presented in the attached claims. Some embodiments have been described for the purpose of illuminating one or more of the potential ways in which the specific features and/or elements of the attached claims fulfill the requirements of uniqueness, utility and non-obviousness. Use of the phrases and/or terms such as but not limited to “a first embodiment,” “a further embodiment,” “an alternate embodiment,” “one embodiment,” “an embodiment,” “multiple embodiments,” “some embodiments,” “other embodiments,” “further embodiment”, “furthermore embodiment”, “additional embodiment” or variants thereof does not necessarily refer to the same embodiments. Unless otherwise specified, one or more particular features and/or elements described in connection with one or more embodiments may be found in one embodiment, or may be found in more than one embodiment, or may be found in all embodiments, or may be found in no embodiments.
Although one or more features and/or elements may be described herein in the context of only a single embodiment, or alternatively in the context of more than one embodiment, or further alternatively in the context of all embodiments, the features and/or elements may instead be provided separately or in any appropriate combination or not at all. Conversely, any features and/or elements described in the context of separate embodiments may alternatively be realized as existing together in the context of a single embodiment. Any particular and all details set forth herein are used in the context of some embodiments and therefore should not be necessarily taken as limiting factors to the attached claims. The attached claims and their legal equivalents can be realized in the context of embodiments other than the ones used as illustrative examples in the description below.
Those of ordinary skill in the art will appreciate that elements in the drawings are illustrated for simplicity and may not have been necessarily drawn to scale. For example, the dimensions of some of the elements in the drawings may be exaggerated relative to other elements to help to improve understanding of aspects of the disclosure. Furthermore, the one or more elements may have been represented in the drawings by conventional symbols, and the drawings may show only those specific details that are pertinent to understanding the embodiments of the disclosure so as not to obscure the drawings with details that will be readily apparent to those of ordinary skill in the art having benefit of the description herein.
At operation 301, at least one key point (e.g., a keyword) corresponding to each of a plurality of webpages is extracted from that webpage. Content of each webpage may be analyzed to extract the key points corresponding to the webpage. A user may select webpages through a web browser to open the webpages in multiple tabs of the web browser. For example, each of the plurality of webpages may correspond to a respective one of a plurality of tabs that are open on a web browser.
At operation 302, the key points corresponding to the respective webpages are compared with each other.
At operation 303, based on a result of comparing the at least one key point at operation 302, at least one functionality is provided to the user with respect to at least one of the plurality of webpages.
The method 300 may further include, based on a predetermined criterion, identifying whether the plurality of webpages are related to each other. Related webpages may share common topics, keywords, formats, content, etc. The relatedness of webpages may be determined based on a number of shared topics, keywords, etc. and a threshold value.
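By way of a non-limiting illustration, the threshold-based relatedness check described above may be sketched as follows. The function name `are_related`, the default threshold of two shared keywords and the sample keyword sets are assumptions made for this sketch only and are not taken from the disclosure.

```python
def are_related(keywords_a, keywords_b, threshold=2):
    """Treat two webpages as related when they share at least `threshold` keywords.

    The threshold value is a hypothetical default chosen for illustration.
    """
    shared = set(keywords_a) & set(keywords_b)
    return len(shared) >= threshold


# Hypothetical keyword sets for three open tabs.
page_a = {"samsung", "history", "electronics"}
page_b = {"samsung", "history", "founder"}
page_c = {"cricket", "score"}

print(are_related(page_a, page_b))  # True: two keywords are shared
print(are_related(page_a, page_c))  # False: nothing is shared
```

A higher threshold makes the grouping stricter, and a lower one admits more loosely related pages.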
The method 300 may further include dynamically updating the at least one functionality based on one or more of: (a) interaction with a currently opened webpage, and (b) detection of a new related webpage.
The predetermined criterion for identifying a plurality of webpages, as depicted at operation 301, includes presence of at least one matching or similar text in one or more of: (a) URL of the webpages, (b) title of the webpages, the title extracted from a DOM tree structure created for each webpage, (c) heading of the webpages, the heading extracted from the DOM tree structure created for each webpage, and (d) metadata within the webpages.
The presence of at least one matching or similar text in the URL of the webpages can be explained as follows. In an example embodiment, if multiple webpages are open through links which are present on the parent page, there is a high chance some of the opened webpages (i.e. child pages) are related. Also, in such cases, the URLs of the parent page and the child webpages which are related to each other may include matching or similar text. For example, the table below shows content searched by the user, a parent page and the child pages. As can be seen, the URLs of some of the child pages have matching/similar text as that of parent page and therefore such pages may be treated as related webpages.
In an example embodiment, if URLs of multiple webpages contain similar/matching text, there is a high chance that the webpages relate to similar content. For example, the user opens three webpages having the following URLs: As can be seen, all the uniform resource locators (URL 1, URL 2 and URL 3) include text like GANDHI and INDEPENDENCE and therefore can be considered as related webpages.
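As one possible sketch of the URL-based check, the text of each URL may be split into word tokens and the tokens common to all URLs collected. The URLs below are hypothetical placeholders on reserved example domains, not the URLs of the disclosure. Ignoring host names and using the small stop list are likewise assumptions of this sketch.

```python
import re
from urllib.parse import urlparse

# Hypothetical stop list of path tokens that carry no topical meaning.
GENERIC_TOKENS = {"html", "htm", "php", "index"}


def url_tokens(url):
    """Split the path of a URL into lowercase word tokens longer than two characters."""
    path = urlparse(url).path.lower()
    words = re.split(r"[^a-z0-9]+", path)
    return {w for w in words if len(w) > 2 and w not in GENERIC_TOKENS}


def shared_url_text(urls):
    """Return the tokens that occur in the path of every URL in the list."""
    token_sets = [url_tokens(u) for u in urls]
    return set.intersection(*token_sets) if token_sets else set()


# Placeholder URLs for illustration only.
urls = [
    "https://example.com/gandhi-and-indian-independence",
    "https://example.org/history/gandhi_independence_movement.html",
    "https://example.net/articles/independence/gandhi",
]
print(sorted(shared_url_text(urls)))  # ['gandhi', 'independence']
```

A non-empty result may then be taken as one indication that the webpages are related.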
The presence of at least one matching or similar text in the title and/or heading extracted from a DOM tree structure can be explained with the following example. In this example, the user opens four webpages having the following uniform resource locators
In the above example, the search for the matching text is conducted in the title/heading extracted from the DOM tree. As seen, the title/heading extracted from the DOM tree of each of URL 1, URL 2 and URL 3 includes the same text, such as HISTORY and SAMSUNG. Hence, the webpages having URL 1, URL 2 and URL 3 may be identified as related webpages. With respect to URL 4, the title/heading extracted from the DOM tree includes only the text SAMSUNG and not HISTORY. However, the content under that title/heading may include the text HISTORY, in which case the webpage corresponding to URL 4 is also a related webpage. If the search for matching/similar text were conducted only in the URLs of the webpages and/or the titles/headings of the DOM tree, URL 4 would not be identified as a related webpage. In order to cover such cases, the present disclosure provides for further parsing of the DOM tree at the next level, i.e., conducting a search for the matching/similar text in the content under the titles/headings and including such text while identifying related webpages. By conducting such a search, URL 4 may also be categorized under the same group as URL 1, URL 2 and URL 3 (i.e., as related webpages).
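The two-level search described above may be sketched, purely for illustration, with the HTML parser of the Python standard library standing in for the browser's DOM tree. The class and function names, the set of tags treated as titles/headings, and the sample markup are all assumptions of this sketch.

```python
from html.parser import HTMLParser

# Tags treated as "title/heading" for this sketch (an assumption).
HEADING_TAGS = {"title", "h1", "h2", "h3"}


class TitleHeadingParser(HTMLParser):
    """Collect text inside title/heading elements separately from other text."""

    def __init__(self):
        super().__init__()
        self._stack = []        # currently open tags
        self.heading_text = []
        self.body_text = []

    def handle_starttag(self, tag, attrs):
        self._stack.append(tag)

    def handle_endtag(self, tag):
        if tag in self._stack:
            # Pop up to and including the matching open tag.
            while self._stack.pop() != tag:
                pass

    def handle_data(self, data):
        text = data.strip().lower()
        if not text:
            return
        if any(t in HEADING_TAGS for t in self._stack):
            self.heading_text.append(text)
        else:
            self.body_text.append(text)


def match_level(html, terms):
    """Return 'heading', 'content' or None, depending on where all terms are found."""
    parser = TitleHeadingParser()
    parser.feed(html)
    headings = " ".join(parser.heading_text)
    if all(t in headings for t in terms):
        return "heading"
    everything = headings + " " + " ".join(parser.body_text)
    if all(t in everything for t in terms):
        return "content"
    return None


terms = ["samsung", "history"]
page_1 = "<html><head><title>History of Samsung</title></head><body><p>Timeline.</p></body></html>"
page_4 = "<html><head><title>Samsung</title></head><body><h1>Samsung</h1><p>The history of the company.</p></body></html>"

print(match_level(page_1, terms))  # heading
print(match_level(page_4, terms))  # content
```

In this sketch a page matched only at the 'content' level corresponds to the URL 4 case, where the matching text is found one level below the title/heading.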
The extraction of one or more key points (e.g., keywords) from webpages, as illustrated at operation 302, can be performed through one or more methods and/or algorithms already known in the art. Such methods include natural language processing (NLP), web scraping, DOM parsing, summarization and semantic analysis. It is to be understood that methods other than those mentioned above may also be used for extraction of one or more key points from the webpages.
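As a deliberately simple stand-in for the extraction methods listed above, key points may be approximated by the most frequent non-trivial words of a page. This word-frequency heuristic, the stop-word list and the function name are assumptions of this sketch and do not represent the NLP or semantic analysis an implementation would likely use.

```python
import re
from collections import Counter

# Small illustrative stop-word list (an assumption, not exhaustive).
STOP_WORDS = {"the", "and", "has", "was", "for", "with", "its", "are", "that", "this"}


def extract_key_points(text, top_n=5):
    """Return up to `top_n` of the most frequent words that are not stop words."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(w for w in words if len(w) > 2 and w not in STOP_WORDS)
    return [word for word, _ in counts.most_common(top_n)]


sample = (
    "The phone has a large battery. The battery charges quickly, "
    "and the phone camera is sharp."
)
print(extract_key_points(sample, top_n=2))
```

For the sample text, the two most frequent remaining words are "phone" and "battery", each occurring twice.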
The comparison of one or more key points, as illustrated at operation 303, can be performed through one or more methods and/or algorithms already known in the art. Such methods include point-to-point comparison, textual paragraph/summary comparison and multimedia comparison. It is to be understood that methods other than those mentioned above may also be used for comparison of key points extracted from one or more webpages.
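A point-to-point comparison may, for instance, be reduced to set operations on the key points of two webpages. The following minimal sketch assumes that key points are plain strings compared for exact equality. A practical implementation might instead use fuzzy or semantic matching.

```python
def compare_key_points(points_a, points_b):
    """Split the key points of two pages into common and page-specific groups."""
    a, b = set(points_a), set(points_b)
    return {"common": a & b, "only_a": a - b, "only_b": b - a}


# Hypothetical key points for two related pages.
result = compare_key_points(
    ["battery", "camera", "price"],
    ["camera", "price", "display"],
)
print(sorted(result["common"]))   # ['camera', 'price']
print(sorted(result["only_b"]))   # ['display']
```

The page-specific groups are what a later operation could highlight as new content.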
The user input pattern as illustrated at operation 304 may include: (a) closing the one or more webpages, (b) switching between webpages, (c) opening one or more webpages, (d) navigating within the webpage, and (e) switching between web browser windows.
The functionality provided to the user on closing one or more webpages may include one or more of: (a) closing all the webpages, (b) saving all the webpages, (c) printing all the webpages, (d) emailing all the webpages, (e) copying all the webpages, (f) saving highlighted key points in the webpages, (g) printing highlighted key points in the webpages, (h) emailing highlighted key points in the webpages, (i) copying highlighted key points in the webpages, (j) saving the webpages having highlighted key points, (k) printing the webpages having highlighted key points, (l) copying the webpages having highlighted key points, (m) emailing the webpages having highlighted key points, (n) saving the webpages having the maximum number of key points, (o) printing the webpages having the maximum number of key points, (p) copying the webpages having the maximum number of key points, (q) emailing the webpages having the maximum number of key points, (r) saving a first set of webpages having all the key points, (s) emailing a first set of webpages having all the key points, (t) copying a first set of webpages having all the key points, and (u) printing a first set of webpages having all the key points.
The functionality provided to the user on switching between webpages, opening one or more webpages, navigating within the webpage or switching between web browser windows may include one or more of: (a) highlighting one or more key points on a current webpage, the one or more key points absent in one or more previously accessed and related webpages, (b) displaying, on a current webpage, only the one or more key points absent in one or more previously accessed webpages, (c) auto-scrolling to one or more key points on a current webpage, the one or more key points absent in one or more previously accessed and related webpages, (d) navigating to one or more key points on a current webpage, the one or more key points absent in one or more previously accessed and related webpages, (e) highlighting one or more key points on a current webpage, the one or more key points present in one or more previously accessed and related webpages, (f) displaying, on a current webpage, only the one or more key points present in one or more previously accessed webpages, (g) auto-scrolling to one or more key points on a current webpage, the one or more key points present in one or more previously accessed and related webpages, (h) navigating to one or more key points on a current webpage, the one or more key points present in one or more previously accessed and related webpages, (i) highlighting one or more key points on a current webpage, the one or more key points browsed in one or more previously accessed and related webpages, (j) displaying, on a current webpage, only the one or more key points browsed in one or more previously accessed webpages, (k) auto-scrolling to one or more key points on a current webpage, the one or more key points browsed in one or more previously accessed and related webpages, (l) navigating to one or more key points on a current webpage, the one or more key points browsed in one or more previously accessed and related webpages, (m) highlighting one or more key points on a current webpage, the one or more key points un-browsed in one or more previously accessed and related webpages, (n) displaying, on a current webpage, only the one or more key points un-browsed in one or more previously accessed webpages, (o) auto-scrolling to one or more key points on a current webpage, the one or more key points un-browsed in one or more previously accessed and related webpages, (p) navigating to one or more key points on a current webpage, the one or more key points un-browsed in one or more previously accessed and related webpages, and (q) creating a virtual page corresponding to a current webpage, the virtual page displaying one or more key points absent in one or more previously accessed and related webpages.
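The distinction drawn above between key points that are absent, browsed and un-browsed may be illustrated by the following sketch. It assumes, hypothetically, that each previously accessed webpage is recorded as a mapping from key point to a flag indicating whether the user actually browsed that key point. The data layout and function name are not taken from the disclosure.

```python
def classify_key_points(current_points, history):
    """Classify key points of the current page against previously accessed pages.

    `history` is a list of dicts mapping key point -> True if the user browsed it.
    """
    seen_before = set()
    browsed_before = set()
    for page in history:
        for point, was_browsed in page.items():
            seen_before.add(point)
            if was_browsed:
                browsed_before.add(point)

    result = {"absent": [], "browsed": [], "unbrowsed": []}
    for point in current_points:
        if point not in seen_before:
            result["absent"].append(point)      # new on the current page
        elif point in browsed_before:
            result["browsed"].append(point)     # already read elsewhere
        else:
            result["unbrowsed"].append(point)   # seen elsewhere but never read
    return result


# Hypothetical browsing history for two earlier tabs.
history = [
    {"battery": True, "camera": False},
    {"camera": False, "price": True},
]
groups = classify_key_points(["camera", "display", "price"], history)
print(groups)
```

Each group could then drive a different option, for example highlighting the "absent" entries or auto-scrolling to the first "unbrowsed" entry.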
At operation 401, at least one key point (e.g., a keyword) corresponding to each of a plurality of webpages is extracted from that webpage. Content of each webpage may be analyzed to extract the key points corresponding to the webpage. A user may select webpages through a web browser to open the webpages in multiple tabs of the web browser. For example, each of the plurality of webpages may correspond to a respective one of a plurality of tabs that are open on a web browser.
At operation 402, the at least one key point corresponding to the plurality of webpages may be compared to each other.
At operation 403, a list of the at least one key point extracted from all the webpages may be generated.
At operation 404, based on a result of comparing the key points, at least one functionality may be provided to a user with respect to at least one of the plurality of webpages.
The method 400 may further include, based on a predetermined criterion, identifying whether the plurality of webpages are related to each other. Related webpages may share common topics, keywords, formats, content, etc. The relatedness of webpages may be determined based on a number of shared topics, keywords, etc. and a threshold value.
The method 400 may further include dynamically updating the at least one functionality based on one or more of: (a) interaction with a currently opened webpage, and (b) detection of a new related webpage.
The predetermined criterion for identifying a plurality of webpages as depicted at operation 401 includes presence of at least one matching or similar text in one or more of: (a) URLs of the webpages, (b) titles of the webpages and/or content thereof, the titles extracted from a DOM tree structure created for each webpage, (c) headings of the webpages and/or content thereof, the headings extracted from the DOM tree structure created for each webpage, and (d) metadata within the webpages. The same has been explained in detail while explaining the predetermined criterion for identifying a plurality of webpages at operation 302.
The extraction of one or more key points from webpages, as illustrated at operation 402, can be performed through one or more methods and/or algorithms already known in the art. Such methods include natural language processing (NLP), web scraping, DOM parsing, summarization and semantic analysis. It is to be understood that methods other than those mentioned above may also be used for extraction of one or more key points from the webpages.
The comparison of one or more key points, as illustrated at operation 403, can be performed through one or more methods and/or algorithms already known in the art. Such methods include point-to-point comparison, textual paragraph/summary comparison and multimedia comparison. It is to be understood that methods other than those mentioned above may also be used for comparison of key points extracted from one or more webpages.
The operation of providing at least one functionality to the user may include one or more of: (a) highlighting one or more webpages having a maximum number of the unique key points from the list of unique key points, (b) sorting webpages in a descending order, a first webpage in the order having a maximum number of the unique key points and a last webpage in the order having a minimum number of unique key points, (c) sorting webpages in an ascending order, a first webpage in the order having a minimum number of unique key points and a last webpage in the order having a maximum number of unique key points, (d) automatically switching to a webpage having a maximum number of the unique key points from the list of unique key points, (e) auto-navigating from a webpage having a maximum number of unique key points to one or more webpages having a remaining number of key points, (f) indicating one or more minimum navigational paths, the minimum navigational path indicating a webpage having a maximum number of unique key points from the list of unique key points and at least one webpage having a remaining number of unique key points, (g) auto-closing one or more webpages not indicated in a minimum navigational path, the minimum navigational path indicating a webpage having a maximum number of unique key points from the list of unique key points and at least one webpage having a remaining number of unique key points, (h) dynamically sorting the webpages other than a current webpage in a descending order, a first webpage in the order having a maximum number of remaining key points and a last webpage having a minimum number of remaining key points, (i) dynamically sorting the webpages other than a current webpage in an ascending order, a first webpage in the order having a minimum number of remaining key points and a last webpage having a maximum number of remaining key points, and (j) grouping at least one set of webpages from amongst the plurality of webpages such that the first set of webpages includes all the unique key points. It is to be understood that the set is not a null set and may include at least two webpages. Further, if more than one set occurs, the sets can be visually distinguished from each other. The options for visually distinguishing them include one or more of: font changes, effects, color change and the like. The operation of providing at least one functionality to the user may also include (k) grouping at least one set of webpages from amongst the plurality of webpages such that the first set of webpages includes all the key points and is sorted in a predetermined order. It is to be understood that the predetermined order includes an ascending order or a descending order. Further, if more than one set occurs, the sets can be visually distinguished from each other. The options for visually distinguishing the sets may include one or more of: font changes, effects, color change and the like.
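Several of the options above (the list of unique key points, the descending sort and the minimum navigational path) may be sketched as follows. The sketch assumes that "unique key points" means the distinct key points across all webpages, and it approximates the minimum navigational path with a greedy set-cover heuristic, which is not guaranteed to yield the shortest possible path. The tab names and key points are hypothetical.

```python
def unique_key_points(pages):
    """Return every distinct key point across all pages, in first-seen order."""
    ordered = []
    for points in pages.values():
        for point in points:
            if point not in ordered:
                ordered.append(point)
    return ordered


def sort_pages_descending(pages):
    """Order page names from most to fewest distinct key points."""
    return sorted(pages, key=lambda name: len(set(pages[name])), reverse=True)


def minimum_navigational_path(pages):
    """Greedy approximation: keep picking the page that covers the most
    still-uncovered key points until every key point is covered."""
    remaining = set(unique_key_points(pages))
    path = []
    while remaining:
        best = max(pages, key=lambda name: len(remaining & set(pages[name])))
        gained = remaining & set(pages[best])
        if not gained:
            break
        path.append(best)
        remaining -= gained
    return path


# Hypothetical open tabs and their extracted key points.
pages = {
    "tab1": ["battery", "camera", "price"],
    "tab2": ["camera", "price"],
    "tab3": ["display", "battery"],
    "tab4": ["price"],
}
path = minimum_navigational_path(pages)
closable = [name for name in pages if name not in path]
print(unique_key_points(pages))   # ['battery', 'camera', 'price', 'display']
print(path)                       # ['tab1', 'tab3']
print(closable)                   # ['tab2', 'tab4']
```

In this sketch the pages outside the path are the candidates for an auto-closing option, and the path itself is one set of webpages that together includes all the key points.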
When a user is done browsing content on a topic, the user may manually close all the tabs/links opened in relation to that topic. This is currently a manual task in which the user needs to close each individual link/tab. The present disclosure identifies this pattern and provides an option for the user to automatically perform one or more functions.
A web browser 2001 is a software application for retrieving, presenting, and exchanging information resources on the World Wide Web (WWW). The web browser includes a user interface (UI) process 2002, a web browser engine 2003, a Java core 2004 (also known as a JavaScript interpreter), a web process 2005 and a web core 2006. The web core 2006 is a layout, rendering, and document object model (DOM) library for Hypertext Markup Language (HTML). The web core 2006 may include an HTML parser module 2007. An HTML document may contain text and/or refer to images. The nodes of every HTML document are organized in a tree structure called the DOM tree, which is generated by the HTML parser module 2007.
Further, the web browser engine 2003, also known as a layout engine, is a software component that takes marked-up content, such as HTML, Extensible Markup Language (XML) and image files, together with formatting information, such as cascading style sheets (CSS) and Extensible Stylesheet Language (XSL), and displays the formatted content on the screen. A platform for the web browser can be one of HTML, CSS, DOM and JavaScript. Examples of web browsers include, but are not limited to, Internet Explorer, Mozilla, Firefox, Safari, Google Chrome, Konqueror, Arora, Dillo, Lynx, Amaya, NetPositive, Planetweb and Netscape.
The device 2000 of the present disclosure, apart from the above-mentioned components, which are already known in the art, may include a first identification module 2009, an analyzing module 2010, a comparison module 2011, a second identification module 2012 and a control module 2013.
The first identification module 2009 is configured for identifying, based on a predetermined criterion, a plurality of related webpages from a plurality of webpages opened in at least one tab. The first identification module 2009 is in communication with the HTML parser module 2007. The predetermined criterion includes presence of at least one matching/similar text in one or more of: the uniform resource locator of the related webpage; the title of the related webpage, the title extracted from a DOM tree structure created for each webpage; the heading of the related webpage, the heading extracted from the DOM tree structure created for each webpage; and metadata within the related webpage.
The analyzing module 2010 is configured for analyzing content of each related webpage and extracting one or more key points corresponding to each related webpage. The analyzing module 2010 extracts one or more key points using one or more of: (a) natural language processing (NLP), (b) web scraping, (c) DOM parsing, (d) summarization and (e) semantic analysis. The analyzing module is in communication with the HTML Parser Module and the first identification module.
The comparison module 2011 is configured for comparing the key points present in the related webpages. The comparison module 2011 compares one or more key points using one or more of: point-to-point comparison, textual paragraph/summary comparison, and multimedia comparison.
The second identification module 2012 is configured for identifying a user input pattern indicative of performing an action on at least one of the plurality of related webpages. The control module 2013 is configured for providing, based on the user input pattern identified by the second identification module 2012, at least one functionality to the user with respect to the plurality of related webpages. One of the user input patterns identified by the second identification module 2012 is closing of one or more related webpages. The functionality provided by the control module 2013 on closing one or more related webpages is: (a) closing all the related webpages, (b) saving all the related webpages, (c) printing all the related webpages, (d) emailing all the related webpages, (e) copying all the related webpages, (f) saving highlighted key points in the related webpages, (g) printing highlighted key points in the related webpages, (h) emailing highlighted key points in the related webpages, (i) copying highlighted key points in the related webpages, (j) saving the related webpages having highlighted key points, (k) printing the related webpages having highlighted key points, (l) copying the related webpages having highlighted key points, (m) emailing the related webpages having highlighted key points, (n) saving the related webpages having the maximum number of key points, (o) printing the related webpages having the maximum number of key points, (p) copying the related webpages having the maximum number of key points, (q) emailing the related webpages having the maximum number of key points, (r) saving a first set of related webpages having all the key points, (s) emailing a first set of related webpages having all the key points, (t) copying a first set of related webpages having all the key points, and (u) printing a first set of related webpages having all the key points.
Another user input pattern identified by the second identification module 2012 may be one of: switching between related webpages, opening one or more related webpages, navigating within a related webpage, and switching web browser windows. The functionality provided by the control module 2013 on receiving one of these user input patterns may be: (a) highlighting one or more key points on a current related webpage, the one or more key points absent in one or more previously accessed and related webpages, (b) auto-scrolling to one or more key points on a current related webpage, the one or more key points absent in one or more previously accessed and related webpages, (c) highlighting one or more key points on a current related webpage, the one or more key points present in one or more previously accessed and related webpages, (d) auto-scrolling to one or more key points on a current related webpage, the one or more key points present in one or more previously accessed and related webpages, (e) highlighting one or more key points on a current related webpage, the one or more key points browsed in one or more previously accessed and related webpages, (f) auto-scrolling to one or more key points on a current related webpage, the one or more key points browsed in one or more previously accessed and related webpages, (g) highlighting one or more key points on a current related webpage, the one or more key points un-browsed in one or more previously accessed and related webpages, (h) auto-scrolling to one or more key points on a current related webpage, the one or more key points un-browsed in one or more previously accessed and related webpages, and (i) creating a virtual page corresponding to a current webpage, the virtual page displaying one or more key points absent in one or more previously accessed and related webpages.
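By way of a non-limiting illustration, the selection of key points that are absent from, or already present in, previously accessed related webpages may be sketched as a simple set membership test. The function names `normalize` and `split_key_points`, and the assumption that two key points match when they are equal after lowercasing and whitespace collapsing, are hypothetical and are not taken from the disclosure.

```python
from typing import Iterable, List, Tuple


def normalize(point: str) -> str:
    """Lowercase a key point and collapse runs of whitespace (assumed matching rule)."""
    return " ".join(point.lower().split())


def split_key_points(
    current: List[str], previous_pages: Iterable[Iterable[str]]
) -> Tuple[List[str], List[str]]:
    """Split the current page's key points into (absent, present) relative to
    the key points of previously accessed related pages."""
    seen = {normalize(p) for page in previous_pages for p in page}
    absent = [p for p in current if normalize(p) not in seen]
    present = [p for p in current if normalize(p) in seen]
    return absent, present
```

In such a sketch, the first returned list would be a candidate for highlighting or auto-scrolling under options (a) and (b), and the second list under options (c) and (d).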
The device 2000 may further include a dynamic update module, the module configured for dynamically updating the at least one functionality discussed above based on one or more of: interaction with a currently opened webpage; and detection of a new related webpage.
A web browser 2101 is a software application for retrieving, presenting, and traversing information resources on the World Wide Web (WWW). The web browser includes a user interface (UI) process 2102, a web browser engine 2103, a Java core 2104 (also known as a JavaScript interpreter), a web process 2105 and a web core 2106. The web core 2106 is a layout, rendering, and document object model (DOM) library for Hypertext Markup Language (HTML). The web core 2106 may include an HTML parser module 2107. As already known, an HTML document may contain text and/or refer to images. The nodes of every HTML document are organized in a tree structure called the DOM tree, which is generated by the HTML parser module 2107.
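As a non-limiting sketch of how an HTML parser may organize the nodes of a document into a tree, the following example builds a simplified tree using the `html.parser` module of the Python standard library. It is not the HTML parser module 2107 of a browser; the `Node`, `TreeBuilder`, `build_tree` and `find_all` names are hypothetical, and the sketch does not insert implied elements or recover from malformed markup the way a browser does.

```python
from html.parser import HTMLParser
from typing import List, Optional

# Elements that never have a closing tag, so the builder must not descend into them.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class Node:
    """One node of the simplified tree (hypothetical structure)."""

    def __init__(self, tag: str, parent: Optional["Node"] = None) -> None:
        self.tag = tag
        self.parent = parent
        self.children: List["Node"] = []
        self.text: List[str] = []  # non-empty text fragments directly inside this node


class TreeBuilder(HTMLParser):
    def __init__(self) -> None:
        super().__init__()
        self.root = Node("#document")
        self.current = self.root

    def handle_starttag(self, tag, attrs):
        node = Node(tag, self.current)
        self.current.children.append(node)
        if tag not in VOID_TAGS:
            self.current = node  # descend into the newly opened element

    def handle_endtag(self, tag):
        # Climb to the nearest open ancestor with this tag; ignore stray end tags.
        node = self.current
        while node is not self.root and node.tag != tag:
            node = node.parent
        if node is not self.root:
            self.current = node.parent

    def handle_data(self, data):
        if data.strip():
            self.current.text.append(data.strip())


def build_tree(html: str) -> Node:
    builder = TreeBuilder()
    builder.feed(html)
    builder.close()  # flush any buffered text
    return builder.root


def find_all(node: Node, tag: str) -> List[Node]:
    """Depth-first search for all nodes with the given tag name."""
    found = [node] if node.tag == tag else []
    for child in node.children:
        found.extend(find_all(child, tag))
    return found
```

A title or heading of a webpage could then be read from such a tree with, for example, `find_all(tree, "title")` or `find_all(tree, "h1")`.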
Further, the web browser engine 2103, also known as a layout engine, is a software component that takes marked-up content, such as HTML, Extensible Markup Language (XML), and image files, together with formatting information, such as cascading style sheets (CSS) and Extensible Stylesheet Language (XSL), and displays the formatted content on the screen. A platform 2108 for the web browser may be one of HTML, CSS, DOM and JavaScript. Examples of web browsers include but are not limited to Internet Explorer, Mozilla, Firefox, Safari, Google Chrome, Konqueror, Arora, Dillo, Lynx, Amaya, Netpositive, Planetweb and Netscape.
The device 2100 of the present disclosure, apart from the above-mentioned components, which are already known in the art, may include an identification module 2109, an analyzing module 2110, a comparison module 2111, a preparation module 2112 and a control module 2113.
The identification module 2109 is configured for identifying, based on a predetermined criterion, a plurality of related webpages from a plurality of webpages opened in at least one tab. The identification module 2109 is in communication with the HTML parser module 2107. The predetermined criterion includes presence of at least one matching/similar text in one or more of: (a) uniform resource locator of the related webpage, (b) title of the related webpage, the title extracted from a DOM tree structure created for each webpage, (c) heading of the related webpages, the heading extracted from the DOM tree structure created for each webpage, and (d) meta data within the related webpage.
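For illustration only, the predetermined criterion described above may be sketched as a token overlap test applied field by field. The `Page` structure, the `STOP_WORDS` list, the `min_shared` threshold, and the choice to compare only the path portion of the uniform resource locator are assumptions made for this sketch; the disclosure does not specify how matching or similar text is determined.

```python
import re
from dataclasses import dataclass, field
from typing import List, Set
from urllib.parse import urlparse

# Words assumed too common to indicate relatedness (hypothetical list).
STOP_WORDS = {"a", "an", "and", "the", "of", "on", "in", "to", "with", "for",
              "html", "index"}


@dataclass
class Page:
    url: str
    title: str = ""
    headings: List[str] = field(default_factory=list)
    meta: str = ""


def tokens(text: str) -> Set[str]:
    """Lowercased alphanumeric tokens of the text, excluding stop words."""
    return {t for t in re.findall(r"[a-z0-9]+", text.lower()) if t not in STOP_WORDS}


def are_related(a: Page, b: Page, min_shared: int = 1) -> bool:
    """Return True if any of the four fields shares at least min_shared tokens."""
    field_pairs = [
        (urlparse(a.url).path, urlparse(b.url).path),     # (a) URL, path only
        (a.title, b.title),                               # (b) title
        (" ".join(a.headings), " ".join(b.headings)),     # (c) headings
        (a.meta, b.meta),                                 # (d) meta data
    ]
    return any(len(tokens(x) & tokens(y)) >= min_shared for x, y in field_pairs)
```

Comparing only the URL path avoids treating two pages as related merely because their host names share a common fragment; a stricter or looser rule could equally be used.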
The analyzing module 2110 is configured for analyzing content of each related webpage and extracting one or more key points corresponding to each related webpage. The analyzing module 2110 extracts one or more key points using one or more of: (a) natural language processing (NLP), (b) web scraping, (c) DOM parsing, (d) summarization and (e) semantic analysis. The analyzing module 2110 is in communication with the HTML parser module 2107 and the identification module 2109.
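As one hedged illustration of the DOM parsing option, the sketch below treats each sentence found inside a paragraph or list item as a candidate key point. This is an assumption made for the example: the disclosure lists the techniques but does not define what constitutes a key point, and the `KeyPointExtractor` class and `extract_key_points` function are hypothetical names. Natural language processing, summarization or semantic analysis would replace the simple sentence split used here.

```python
import re
from html.parser import HTMLParser
from typing import List


class KeyPointExtractor(HTMLParser):
    """Collects sentences found inside <p> and <li> elements."""

    BLOCK_TAGS = {"p", "li"}

    def __init__(self) -> None:
        super().__init__()
        self._depth = 0              # how many block elements are currently open
        self._buffer: List[str] = []
        self.key_points: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in self.BLOCK_TAGS:
            self._flush()
            self._depth += 1

    def handle_endtag(self, tag):
        if tag in self.BLOCK_TAGS and self._depth:
            self._flush()
            self._depth -= 1

    def handle_data(self, data):
        if self._depth:              # ignore text outside paragraphs and list items
            self._buffer.append(data)

    def _flush(self) -> None:
        text = " ".join("".join(self._buffer).split())
        self._buffer = []
        # Split after sentence-ending punctuation followed by whitespace.
        for sentence in re.split(r"(?<=[.!?])\s+", text):
            if sentence:
                self.key_points.append(sentence)


def extract_key_points(html: str) -> List[str]:
    extractor = KeyPointExtractor()
    extractor.feed(html)
    extractor.close()
    return extractor.key_points
```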
The comparison module 2111 is configured for comparing the key points present in the related webpages. The comparison module 2111 compares one or more key points using one or more of: point-to-point comparison, textual paragraph/summary comparison, and multimedia comparison.
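A point-to-point comparison may, as a non-limiting sketch, pair each key point of one webpage with its most similar key point on another webpage. The Jaccard token similarity and the `threshold` value of 0.5 used below are assumptions chosen for illustration and are not specified by the disclosure; `compare_key_points` is a hypothetical name.

```python
import re
from typing import List, Set, Tuple


def tokens(text: str) -> Set[str]:
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a: str, b: str) -> float:
    """Share of tokens common to both texts, between 0.0 and 1.0."""
    ta, tb = tokens(a), tokens(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def compare_key_points(
    points_a: List[str], points_b: List[str], threshold: float = 0.5
) -> Tuple[List[Tuple[str, str]], List[str], List[str]]:
    """Return (matched pairs, points only on page A, points only on page B)."""
    matched: List[Tuple[str, str]] = []
    only_a: List[str] = []
    unmatched_b = list(points_b)
    for point in points_a:
        # Pick the most similar key point of page B that is not yet paired.
        best = max(unmatched_b, key=lambda other: jaccard(point, other), default=None)
        if best is not None and jaccard(point, best) >= threshold:
            matched.append((point, best))
            unmatched_b.remove(best)
        else:
            only_a.append(point)
    return matched, only_a, unmatched_b
```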
The preparation module 2112 is configured for preparing a list of unique key points present in all the related webpages and the control module 2113 is configured for providing at least one functionality to a user with respect to the plurality of related webpages, the functionality includes one or more of: (a) highlighting one or more related webpages having a maximum number of the unique key points from the list of unique key points, (b) sorting related webpages in a descending order, a first related webpage in the order having a maximum number of the unique key points and a last related webpage in the order having a minimum number of unique key points, (c) automatically switching to a webpage having a maximum number of the unique key points from the list of unique key points, (d) indicating one or more minimum navigational paths, the minimum navigational path displaying a related webpage having maximum number of unique key points from the list of unique key points and at least one related webpage having a remaining number of unique key points, and (e) grouping a first set of related webpages from amongst the plurality of related webpages such that the first set of webpages includes all the unique key points.
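The preparation of the list of unique key points and options (b) and (e) above may be sketched as follows. The sketch assumes that key points have already been normalized so that equal key points are equal strings, and it uses a greedy selection for the grouping, which yields a set of webpages covering all unique key points but not necessarily the smallest such set. The function names are hypothetical.

```python
from typing import Dict, List, Set


def unique_key_points(pages: Dict[str, Set[str]]) -> Set[str]:
    """Union of the key points of all related pages, without duplicates."""
    return set().union(*pages.values())


def sort_pages(pages: Dict[str, Set[str]]) -> List[str]:
    """Page names in descending order of the number of unique key points held."""
    return sorted(pages, key=lambda name: len(pages[name]), reverse=True)


def covering_set(pages: Dict[str, Set[str]]) -> List[str]:
    """Greedily pick pages until every unique key point is covered."""
    remaining = unique_key_points(pages)
    chosen: List[str] = []
    while remaining:
        # Pick the page contributing the most key points not yet covered.
        best = max(pages, key=lambda name: len(pages[name] & remaining))
        chosen.append(best)
        remaining -= pages[best]
    return chosen
```

The first entry returned by `sort_pages` corresponds to the webpage having the maximum number of unique key points, and the order of `covering_set` could serve as one candidate navigational path through the remaining key points.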
The device 2100 may further include a dynamic update module, the module configured for dynamically updating the at least one functionality discussed above based on one or more of: interaction with a currently opened webpage, and detection of a new related webpage.
In a networked deployment, the computer system 2200 may operate in the capacity of a server or as a client computer in a server-client user network environment, or as a peer computer system in a peer-to-peer (or distributed) network environment. The computer system 2200 can also be implemented as or incorporated into various devices, such as a personal computer (PC), a tablet PC, a set-top box (STB), a personal digital assistant (PDA), a mobile device, a palmtop computer, a laptop computer, a desktop computer, a communications device, a wireless telephone, a wearable computing device, a land-line telephone, a control system, a camera, a scanner, a facsimile machine, a printer, a pager, a personal trusted device, a web appliance, a network router, switch or bridge, or any other machine capable of executing a set of instructions (sequential or otherwise) that specify actions to be taken by that machine. Further, while a single computer system 2200 is illustrated, the term “device” shall also be taken to include any collection of systems or sub-systems that individually or jointly execute a set, or multiple sets, of instructions to perform one or more computer functions.
The computer system 2200 may include a processor 2201, e.g., a central processing unit (CPU), a graphics processing unit (GPU), or both. The processor 2201 may be a component in a variety of systems. For example, the processor 2201 may be part of a standard personal computer or a workstation. The processor 2201 may be one or more general processors, digital signal processors, application specific integrated circuits, field programmable gate arrays, servers, networks, digital circuits, analog circuits, combinations thereof, or other now known or later developed devices for analyzing and processing data. The processor 2201 may implement a software program, such as code generated manually (i.e., programmed).
The computer system 2200 may include a memory 2202 that can communicate via a bus 2203. The memory 2202 may be a main memory, a static memory, or a dynamic memory. The memory 2202 may include, but is not limited to, computer readable storage media such as various types of volatile and non-volatile storage media, including but not limited to random access memory (RAM), read-only memory (ROM), programmable read-only memory (PROM), electrically programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), flash memory, magnetic tape or disk, optical media and the like. In one example, the memory 2202 includes a cache or random access memory for the processor 2201. In alternative examples, the memory 2202 is separate from the processor 2201, such as a cache memory of a processor, the system memory, or other memory. The memory 2202 may be an external storage device or database for storing data. Examples include a hard drive, a compact disc (CD), a digital video disc (DVD), a memory card, a memory stick, a floppy disc, a universal serial bus (USB) memory device, or any other device operative to store data. The memory 2202 is operable to store instructions executable by the processor 2201. The functions, acts or tasks illustrated in the figures or described may be performed by the programmed processor 2201 executing the instructions stored in the memory 2202. The functions, acts or tasks are independent of the particular type of instruction set, storage media, processor or processing strategy and may be performed by software, hardware, integrated circuits, firmware, microcode and the like, operating alone or in combination. Likewise, processing strategies may include multiprocessing, multitasking, parallel processing and the like.
As shown, the computer system 2200 may further include a display 2205, such as a liquid crystal display (LCD), an organic light emitting diode (OLED), a flat panel display, a solid state display, a cathode ray tube (CRT), a projector, a printer or other now known or later developed display device for outputting determined information. The display 2205 may act as an interface for the user to see the functioning of the processor 2201, or specifically as an interface with the software stored in the memory 2202 or in a drive unit 2207.
The computer system 2200 may also include a disk or optical drive unit 2207. The disk drive unit 2207 may include a computer-readable medium 2208 in which one or more sets of instructions 2209, e.g. software, can be embedded. Further, the instructions 2209 may embody one or more of the methods or logic as described. In a particular example, the instructions 2209 may reside completely, or at least partially, within the memory 2202 or within the processor 2201 during execution by the computer system 2200. The processor 2201 and the memory 2202 may also include computer-readable media as discussed above.
One or more example embodiments may be implemented by a computer-readable medium that includes instructions 2209 or receives and executes instructions 2209 responsive to a propagated signal so that a device connected to a network 2210 can communicate voice, video, audio, images or any other data over the network 2210. Further, the instructions 2209 may be transmitted or received over the network 2210 via a communication port or interface 2211 or using the bus 2203. The communication port or interface 2211 may be a part of the processor 2201 or may be a separate component. The communication interface 2211 may be created in software or may be a physical connection in hardware. The communication interface 2211 may be configured to connect with a network 2210, external media, the display 2205, or any other components in system 2200, or combinations thereof. The communication interface 2211 may include a transmitter for transmitting data and/or a receiver for receiving data. The connection with the network 2210 may be a physical connection, such as a wired Ethernet connection, or may be established wirelessly as discussed later. Likewise, the additional connections with other components of the system 2200 may be physical connections or may be established wirelessly. The network 2210 may alternatively be directly connected to the bus 2203.
The network 2210 may include wired networks, wireless networks, Ethernet Audio Video Bridging (AVB) networks, or combinations thereof. The wireless network may be a cellular telephone network, an 802.11, 802.16, 802.20, 802.1Q or WiMax network. Further, the network 2210 may be a public network, such as the Internet, a private network, such as an intranet, or combinations thereof, and may utilize a variety of networking protocols now available or later developed including, but not limited to Transmission Control Protocol/Internet Protocol (TCP/IP)-based networking protocols.
Additionally, the computer system 2200 may include an input device 2206 configured to allow a user to interact with any of the components of system 2200. The input device 2206 may be a number pad, a keyboard, or a cursor control device, such as a mouse, or a joystick, touch screen display, remote control or any other device operative to interact with the computer system 2200.
In an alternative example, dedicated hardware implementations, such as application specific integrated circuits, programmable logic arrays and other hardware devices, can be constructed to implement various parts of the system 2200. Applications that may include the systems can broadly include a variety of electronic and computer systems. One or more examples described may implement functions using two or more specific interconnected hardware modules or devices with related control and data signals that can be communicated between and through the modules, or as portions of an application-specific integrated circuit. Accordingly, the present system encompasses software, firmware, and hardware implementations.
The system described may be implemented by software programs executable by a computer system. Further, in a non-limiting example, implementations can include distributed processing, component/object distributed processing, and parallel processing. Alternatively, virtual computer system processing can be constructed to implement various parts of the system.
The system is not limited to operation with any particular standards and protocols. For example, standards for Internet and other packet switched network transmission (e.g., TCP/IP, User Datagram Protocol/Internet Protocol (UDP/IP), HTML, Hypertext Transfer Protocol (HTTP)) may be used. Such standards are periodically superseded by faster or more efficient equivalents having essentially the same functions. Accordingly, replacement standards and protocols having the same or similar functions as those disclosed are considered equivalents thereof. It may be noted that the method as described in the present disclosure can be implemented in a wide variety of electronic devices including but not limited to desktop computers, laptop computers, palmtop computers, tablets, mobile phones, televisions, etc. Also, the user input can be received by the system using a wide variety of techniques including but not limited to using a mouse, a gesture input, a touch input, a stylus input, a joystick input, a pointer input, etc.
While certain presently preferred embodiments of the disclosure have been illustrated and described herein, it is to be understood that the disclosure is not limited thereto. Clearly, the disclosure may be otherwise variously embodied and practiced within the scope of the following claims.
Number | Date | Country | Kind |
---|---|---|---|
201611035958 | Oct 2016 | IN | national |