This invention pertains generally to login management utilities, and more specifically to using positional analysis to identify login credentials on a web page.
It is useful for login management utilities to be able to identify fields on web pages used for inputting login information, such as user name and password entry fields. Password entry fields can be identified by analyzing the underlying Hypertext Markup Language (HTML) describing a web page. HTML uses a specific type of field to represent a corresponding password entry field on a web page. However, there is no specific field type in HTML to be used for the entry of other types of login information. Some web pages use generic text entry field types for this purpose, whereas some web pages use additional password entry fields to prompt for the input of other login information.
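By way of illustration only, the identification of password entry fields from the underlying HTML can be sketched as follows. This minimal Python example uses the standard library `html.parser` module. The class name `EntryFieldCollector`, the helper `collect_entry_fields`, and the set of input types treated as generic text entry fields are assumptions made for this sketch and do not come from the specification.

```python
from html.parser import HTMLParser

# Input types treated as generic text entry fields (an assumption for this sketch).
TEXT_TYPES = {"text", "email", "tel"}


class EntryFieldCollector(HTMLParser):
    """Collects text and password <input> fields in document order."""

    def __init__(self):
        super().__init__()
        self.fields = []  # list of (kind, name) tuples

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attr = dict(attrs)
        # HTML treats a missing type attribute as type="text".
        input_type = (attr.get("type") or "text").lower()
        if input_type == "password":
            self.fields.append(("password", attr.get("name")))
        elif input_type in TEXT_TYPES:
            self.fields.append(("text", attr.get("name")))
        # Other input types (submit, checkbox, hidden, ...) are ignored.


def collect_entry_fields(html):
    """Return the (kind, name) pairs of entry fields found in the markup."""
    parser = EntryFieldCollector()
    parser.feed(html)
    parser.close()
    return parser.fields
```

Because the parser reports tags in the order they appear in the markup, the resulting list preserves document order, which is only an approximation of the rendered on-page position.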
For these reasons, login management utilities typically analyze the text on a web page proximate to given entry fields to attempt to identify login information input fields. For example, if the text “Enter User Name” appears next to a generic text entry field, a login management utility might conclude that the generic text entry field comprises a user name entry field. However, web pages containing login forms are written in many different languages, and use many different terms and criteria to identify their login information entry fields. This makes a proximate text based identification of specific login entry fields difficult and potentially inaccurate.
A text based analysis typically requires maintaining a database of keywords associated with different login entry fields in different languages. Such a database requires entries and updates for every language and all known words/phrases used to prompt a user to enter login information. In practice, no such database can ever be complete or current, and attempting to keep it so is very labor intensive.
It would be desirable to address these issues.
A login credentials identification component uses analysis of the relative positions of text entry fields to identify login credentials on a web page. The login credentials identification component identifies both a password entry field on a web page, and the text entry field immediately preceding the identified password entry field. The login credentials identification component uses the positional relationship between the identified password entry field and the immediately preceding text entry field, as well as other supplemental factors, to determine that the identified text entry field immediately preceding the password entry field comprises a user name entry field.
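By way of illustration only, the positional analysis summarized above can be sketched in Python as follows. The function name `identify_login_fields` and the representation of the page as a list of `(kind, name)` pairs in document order are assumptions made for this sketch. The supplemental factors mentioned above are not modeled here.

```python
def identify_login_fields(fields):
    """Apply the positional rule to entry fields given in document order.

    `fields` is a list of (kind, name) pairs, where kind is "text" or
    "password" (a hypothetical representation chosen for this sketch).
    Returns (user_name_field, password_field), or None if the rule does
    not apply.
    """
    for index, (kind, name) in enumerate(fields):
        if kind != "password":
            continue
        if index == 0:
            return None  # nothing precedes the password entry field
        previous_kind, previous_name = fields[index - 1]
        if previous_kind == "text":
            # The text entry field immediately preceding the password
            # entry field is taken to be the user name entry field.
            return previous_name, name
        return None
    return None  # no password entry field on the page


# Example: a search box, then a login form.
print(identify_login_fields([("text", "q"), ("text", "user"), ("password", "pw")]))
```

Note that no keyword lookup is involved: only the order of the fields is consulted, so the same logic applies regardless of the language of the page.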
The features and advantages described in this summary and in the following detailed description are not all-inclusive, and particularly, many additional features and advantages will be apparent to one of ordinary skill in the relevant art in view of the drawings, specification, and claims hereof. Moreover, it should be noted that the language used in the specification has been principally selected for readability and instructional purposes, and may not have been selected to delineate or circumscribe the inventive subject matter, resort to the claims being necessary to determine such inventive subject matter.
The Figures depict embodiments of the present invention for purposes of illustration only. One skilled in the art will readily recognize from the following discussion that alternative embodiments of the structures and methods illustrated herein may be employed without departing from the principles of the invention described herein.
As illustrated in
To identify the user name entry field 103 once the password entry field 107 has been identified, the login credentials identification component 101 leverages the fact that common textual and logical flows dictate that a user name entry field 103 typically immediately precedes the password entry field 107. When looking at almost any web page 105 with login entry fields, it is almost universally true that the user name entry field 103 immediately precedes the password entry field 107. In other words, in languages that are read from left-to-right and top-to-bottom, the user name entry field 103 appears directly above and/or to the left of the password entry field 107 on the web page 105 (appropriate directional adjustments can be made for languages read from right-to-left and/or bottom-to-top). Thus, once the password entry field 107 has been identified, the login credentials identification component 101 identifies the text entry field 106 immediately preceding the password entry field 107 as the user name entry field 103. An example of a web page 105 for which the login credentials identification component 101 identifies a user name entry field 103 and a password entry field 107 with such a positional relationship is illustrated in
As noted above, in some instances a password entry field 107 (as opposed to a generic text entry field 106) is used for the user name entry field 103. Where this is the case, the login credentials identification component 101 identifies both password entry fields 107. Responsive to identifying two password entry fields 107 appearing one after the other on a single web page 105, the login credentials identification component 101 can determine which is the true password entry field 107 and which is the user name entry field 103, based on their relative positions. In other words, the login credentials identification component 101 can identify the password entry field 107 that immediately precedes the other one as the user name entry field 103, and the second one as the true password entry field 107. An example of a web page 105 for which the login credentials identification component 101 identifies a user name entry field 103 and a password entry field 107 in this manner is illustrated in
Where more than two password entry fields 107 appear on a single web page 105 (or in some embodiments even where only two appear), the login credentials identification component 101 can use keyword analysis of adjacent text 117 to supplement the analysis of the relative positions of the fields. For example, if the position of one of the multiple password entry fields 107 does not indicate definitively whether or not it is in fact a user name entry field 103, the login credentials identification component 101 could identify adjacent text 117, and check the identified text 117 against a keyword database 113, looking for words or phrases indicating the nature of the field. Note that this type of supplemental keyword analysis can also be utilized in scenarios in which only a single password entry field 107 appears on a web page 105, in order to identify given entry fields with a greater degree of certainty. Such supplemental keyword analysis involves maintaining a keyword database 113 (or similar keyword storage mechanism), but because the keyword analysis is only used to supplement the positional analysis, the database 113 would not need to be as extensive or require as much upkeep as a more generic keyword database.
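By way of illustration only, supplementing the positional analysis with a keyword check can be sketched in Python as follows. The keyword set `USER_NAME_KEYWORDS`, the function names, and the representation of each password entry field as a `(name, adjacent_text)` pair are assumptions made for this sketch. In particular, the small in-memory set merely stands in for the keyword database 113 and is not a proposed vocabulary.

```python
# Hypothetical stand-in for the keyword database; deliberately tiny.
USER_NAME_KEYWORDS = {"user", "username", "login", "email", "id"}


def looks_like_user_name(adjacent_text):
    """Return True if any word of the adjacent text matches a keyword."""
    words = adjacent_text.lower().replace(":", " ").split()
    return any(word in USER_NAME_KEYWORDS for word in words)


def classify_password_run(run):
    """Pick the user name and password fields from consecutive password fields.

    `run` is a list of (name, adjacent_text) pairs in document order
    (a hypothetical representation chosen for this sketch).
    """
    if len(run) < 2:
        raise ValueError("expected at least two consecutive password fields")
    # Positional default: the first field is the user name entry field
    # and the field right after it is the true password entry field.
    user_index = 0
    # With more than two fields, position alone may be ambiguous, so the
    # adjacent text is consulted as a supplemental signal.
    if len(run) > 2:
        for index, (_, text) in enumerate(run[:-1]):
            if looks_like_user_name(text):
                user_index = index
                break
    return run[user_index][0], run[user_index + 1][0]
```

In this sketch the keyword check only adjusts the positional default, so a missing keyword degrades to the purely positional result rather than to a failure.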
In addition to or instead of supplemental keyword analysis, the login credentials identification component 101 can also augment the positional analysis by referring to a database 115 of exceptions and/or clarifying information. For example, such information could identify and clarify known instances of text 117 or text entry field 106 combinations that the login credentials identification component 101 would otherwise be likely to misinterpret. This clarification database 115 would also require less maintenance than a generic keyword database.
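By way of illustration only, consulting exceptions and clarifying information can be sketched in Python as follows. The dictionary `CLARIFICATIONS`, its keying by the tuple of field names in document order, the sample entries, and the function name `apply_clarifications` are all assumptions made for this sketch. The specification does not prescribe how the database 115 is organized.

```python
# Hypothetical stand-in for the clarification database: maps a known
# combination of field names (in document order) to the correct
# (user_name_field, password_field) result. None means "no such field".
CLARIFICATIONS = {
    ("search", "member_no", "pin"): ("member_no", "pin"),
    ("captcha", "secret"): (None, "secret"),
}


def apply_clarifications(field_names, positional_guess, clarifications=CLARIFICATIONS):
    """Return the recorded result for a known field combination,
    otherwise fall back to the result of the positional analysis."""
    return clarifications.get(tuple(field_names), positional_guess)


# A known combination overrides the positional guess.
print(apply_clarifications(["captcha", "secret"], ("captcha", "secret")))
# An unknown combination keeps the positional guess unchanged.
print(apply_clarifications(["user", "pw"], ("user", "pw")))
```

Because only combinations known to be misinterpreted need an entry, such a table can remain small compared with a general keyword database.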
As illustrated in
Turning now to
Returning to
As will be understood by those familiar with the art, the invention may be embodied in other specific forms without departing from the spirit or essential characteristics thereof. Likewise, the particular naming and division of the portions, modules, agents, managers, components, functions, procedures, actions, layers, features, attributes, methodologies, data structures and other aspects are not mandatory or significant, and the mechanisms that implement the invention or its features may have different names, divisions and/or formats. Additionally, software components of the present invention are in no way limited to implementation in any specific programming language, or for any specific operating system or environment. Furthermore, it will be readily apparent to those of ordinary skill in the relevant art that where components of the present invention are implemented in whole or in part in software, the software components thereof can be stored on computer readable storage media as computer program products. Any form of tangible computer readable storage medium can be used in this context, such as magnetic or optical storage media. As used herein, the term “computer readable storage medium” does not mean an electrical signal separate from an underlying physical medium. Additionally, software components of the present invention can be instantiated (for example as object code or executable images) within the memory of any computing device, such that the software component(s) cause(s) the computing device to perform corresponding functionality. As used herein, the terms “computer” and “computing device” mean one or more computers configured and/or programmed to execute the described functionality. Accordingly, the disclosure of the present invention is intended to be illustrative, but not limiting, of the scope of the invention, which is set forth in the following claims.