1. The Field of the Invention
This application relates to the field of word processing software, and more particularly, to a method and system for generating and displaying citations in word processing software.
2. The Relevant Technology
Most scholarly writing is done with word processing software such as Microsoft® Word from Microsoft® Corporation, OpenOffice.org™ from Sun Microsystems® Inc., Google Docs™ from Google® Inc. and Zoho® Writer from Zoho® Inc. Authors of scholarly works often refer to the works of others in their writings to acknowledge the source of a quotation or a paraphrase of a source's ideas. Such references to the works of others are generally known as citations. The most commonly cited works are published journal articles. Research articles published in reputable journals are often peer reviewed by others who are experts in the same field. There are several other categories of item types that are cited in scholarly writings, including theses, books, web pages, and newspaper articles. The intent of citations is to correctly attribute other authors as the source of the information. The citation normally contains sufficient detail so that the reader is able to uniquely identify and retrieve the referenced article.
Publishers often require authors to format citations in accordance with specific guidelines generally known as citation styles. For each citation style, the guidelines specify how the citation for each item type is to be formatted. There are many official guidelines on citation styles put forth by various organizations and publishers. MLA (Modern Language Association) style is most commonly used to cite sources for the liberal arts and the humanities. The US National Library of Medicine (NLM) provides citation styles that are commonly used by journals in life sciences and medicine. Organizations such as the American Psychological Association (APA) and the Institute of Electrical and Electronics Engineers (IEEE) have developed their own citation styles, and these citation styles have been adopted by many scientific and technical journals. Some journals use their own citation style derived from a well-known existing citation style such as the Harvard style or the Vancouver style. If an article submitted to a journal is rejected for publication, the author may have to reformat the citations on re-submission to a different journal, which may require a different citation style.
Most modern reference management software such as EndNote® (Thomson Reuters®) provides automatic generation of citations in accordance with a selected citation style. EndNote's “Cite While You Write” feature inserts EndNote® commands into Microsoft® Word's Tools menu to give users direct access to their references inside EndNote® while writing in Microsoft® Word. The “Cite While You Write” commands enable EndNote® to do citation formatting inside Microsoft® Word. EndNote® uses a type of embedded field as disclosed in U.S. Pat. No. 5,552,982 for insertion and formatting of citations.
Research and scientific writing often involve multiple collaborators. Not all collaborators necessarily use the same word processing software. Software companies recognize this issue and have made the documents produced by one word processing program interoperable with other word processing software. A Microsoft® Word document contains, in addition to the raw text created by the author, certain proprietary encoding that defines the formatting of the text. Such a document file created using Microsoft® Word will have a file extension of doc or docx. Word processing software from OpenOffice.org™ (Oracle Corporation) is capable of opening a Microsoft® Word document file. After extracting the text information and translating Microsoft® Word's formatting information to the OpenOffice.org™ encoding format, the document will be displayed by the OpenOffice.org™ software in the same format as was originally created in Microsoft® Word. The user can edit this document inside the OpenOffice.org™ word processing software and then save it either in native OpenOffice.org™ word processing software format with an odt file extension or in Microsoft® Word format with a doc file extension.
Many word processing programs ensure that, during import or export of a document file, the encoding for text formatting is faithfully translated for interoperability amongst the major word processing software. However, citations inserted into a document (e.g., created by Microsoft® Word using EndNote's “Cite While You Write” command) will not be correctly converted into corresponding citations when opened in OpenOffice.org™ word processing software. This lack of interoperability is due to the use of proprietary “fields” that are specific to Microsoft® Word.
Fields in a document-processing environment as disclosed in U.S. Pat. No. 5,552,982 are placeholders embedded within a stream of text. Each field is delineated by unique beginning and ending field characters, enclosing a field keyword, one or more field arguments, a separator field character (“|”) and a field result. When a field is encountered in a document, the keyword and arguments are parsed and field-specific functions are invoked through a look-up table that is indexed by the keyword. Field results are generated based on field arguments. The field results are displayed in the position occupied by the placeholders. U.S. Pat. No. 5,552,982 discloses means whereby fields may be displayed in a result mode or a field code mode. U.S. Pat. No. 5,552,982 also discloses a single mechanism for handling all fields, wherein a field-specific algorithm that is invoked by the generic field mechanism defines the specific behavior of a field. Because there is no common field-specific function between different word processing software, the fields cannot be ported directly from one software to another to produce the same field results and be displayed correctly.
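The keyword look-up described above can be illustrated with a minimal sketch. The delimiter characters, the keywords UPPER and COUNT, and the function names are assumptions made only for illustration; they are not taken from U.S. Pat. No. 5,552,982 or from any word processor.

```python
from typing import Callable, Dict, List

# Hypothetical delimiter characters; the text above only says they are unique.
FIELD_BEGIN = "\x13"
FIELD_END = "\x15"
FIELD_SEPARATOR = "|"


def _upper(args: List[str]) -> str:
    """Illustrative field function: upper-case the arguments."""
    return " ".join(args).upper()


def _count(args: List[str]) -> str:
    """Illustrative field function: count the arguments."""
    return str(len(args))


# Look-up table indexed by field keyword (keywords are made up).
FIELD_HANDLERS: Dict[str, Callable[[List[str]], str]] = {
    "UPPER": _upper,
    "COUNT": _count,
}


def evaluate_field(raw: str) -> str:
    """Recompute the result of one delimited field, if a handler is known."""
    body = raw[len(FIELD_BEGIN):-len(FIELD_END)]
    code, _, stored_result = body.partition(FIELD_SEPARATOR)
    parts = code.split()
    handler = FIELD_HANDLERS.get(parts[0]) if parts else None
    if handler is None:
        # No field-specific function is registered for this keyword, so the
        # result cannot be regenerated; only the stored result is available.
        return stored_result
    return handler(parts[1:])
```

In this sketch a field whose keyword is missing from the table falls back to its stored result, which mirrors the portability problem: a program without the matching field-specific function cannot regenerate the field.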
Online word processors such as Google Docs™ and Zoho® Writer are based on HTML documents. HTML documents do not support the implementation of fields as taught by U.S. Pat. No. 5,552,982. Moreover, there is no common set of fields that would work for all word processors.
In general, a citation comprises one or more in-text citations and a corresponding bibliographic citation containing a bibliographic item for each in-text citation. The bibliographic citation is usually appended at the end of the main body of the writing as a bibliographic section with section titles such as “References”, “Cited Work” or “Bibliography.” The bibliographic citation starts with a bibliographic section title followed by one or more bibliographic items. Each bibliographic item contains sufficient information such as the authors' names, title of the article, the publisher, the page number and the date of publication so that the readers of the article are able to locate the cited work.
Referring to
Citation style dictates how the in-text citations and the bibliographic citations are formatted and displayed. The citation style includes rules for punctuation, separators between texts, prefixes and suffixes, and may also include font style for text. The citation style also includes rules as to what bibliographic parts are to be displayed and the sequence of appearance of the bibliographic parts. In many scholarly publications, in particular peer-reviewed publications, journal articles are the most commonly cited items. Other item types include but are not limited to books, theses, proceedings of scholarly conferences, technical reports and web pages. The citation style rules are specific for each item type. For example, in most citation styles the page number part is required for journal articles but is not required when a web page is cited, because page numbers are not relevant for web pages.
The World Wide Web Consortium (W3C) is the main international standards organization for the World Wide Web. The intent of this organization is to develop technical specifications and guidelines by consensus to facilitate compatibility in the implementation of technology for the World Wide Web. HTML is the most successful document markup language due in part to compliance with the W3C's HTML specifications by software developers, webmasters and users in general. The descriptions and definitions of HTML terms used here generally conform to the meaning given in the official W3C HTML specification Version 4.01.
An HTML document is a type of document that can be displayed as web pages on a computer screen by software applications, such as web browsers. Modern word processing software such as Microsoft® Word and OpenOffice.org™ can also open HTML documents for editing. Online applications like Google Docs™ and Zoho® Writer also have capabilities to format and edit HTML documents online. Web browsers on different operating systems such as Windows OS, Mac OS and Linux can all open HTML documents and display them as web pages.
HTML documents are made up of HTML elements.
One form of element content 34 is the text to be rendered on the computer screen. Thus when a web browser application encounters an element such as, <FONT SIZE=“1”>This is an example of the smallest font.</FONT>
in an HTML document, the text “This is an example of the smallest font.” will be rendered on the computer screen in font size 1. In addition to text, element content can also include images, videos, sound and other elements.
In 1994, Tim Berners-Lee saw the possibility of retrieving information over the Internet by defining in a standard fashion the physical addresses of resources which are retrievable using protocols already deployed on the Internet. For the purpose of this invention, resources comprise electronic files, electronic documents, computer programs, data, sources of information, the operators and operands of a mathematical equation, types of relationships, or numeric values. A Uniform Resource Locator (URL) specifies where an identified resource is available and the protocol to retrieve the identified resource. Tim Berners-Lee disclosed an early draft specification of a URL in a document in March 1994.
Within the context of computing, the term “path” is meant to specify a unique location in a file system in the host computer of the desired resource. The path segment 43a in the context of the Internet is the path that points to the location of the desired resource on the host computer.
Query segment 44a, also known as “query string” to those skilled in the art, is made up of a series of field-value pairs. By standard convention, each field is separated from its associated value by an equals sign “=”. The field-value pairs in a series are separated from each other by the ampersand “&” sign or by a semicolon. When a host computer receives a request from a client computer for a resource such as a query to a database, the information contained in the query string in the form of field-value pairs is passed as a query to the database. For example in
Hyperlinks are integral to the functioning of the World Wide Web. In the context of the Internet, hyperlinks are HTML elements generally known to those skilled in the art as “anchor elements” with the starting tag <a>. Hyperlinks can point to any resources on the Internet using the attribute href=“URL”. In general a hyperlink consists of three components, a URL, an element content and an element title with the following syntax: <a href=“URL” title=“Title”>Label</a>.
In an HTML document the text “Label” will usually be displayed as underlined text with a color different from the other text in the document. When the cursor hovers over the text “Label”, the cursor changes into a hand icon, and when the cursor hovers over the text “Label” for a predetermined length of time the element title appears as the text “Title.” Element title as described here is also known as “tool tip” or “screen tip” to those skilled in the art. In the context of hyperlinks, the term hyperlink label will be used interchangeably with the term element content, as the element content for hyperlinks is generally text. The text rendered on the computer screen in accordance with the hyperlink label will be termed the displayed label. Many types of word processing software including Microsoft® Word and OpenOffice.org™ will translate the hyperlinks in HTML documents into the corresponding encoding for hyperlinks in the respective word processing software. Likewise, hyperlinks in the native word processing format are translated correctly as HTML hyperlinks when exported as an HTML document. Developers of word processing applications have taken great care to ensure interoperability of hyperlinks because of the ubiquitous use of hyperlinks in HTML documents and in word processing documents.
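The three components of a hyperlink (URL, element content and element title) can be assembled as in the following minimal sketch. The function name make_hyperlink and the example.org URL are assumptions used only for illustration; the escaping relies on Python's standard html module.

```python
from html import escape


def make_hyperlink(url: str, label: str, title: str = "") -> str:
    """Build an anchor element from a URL, an element content and a title."""
    href = escape(url, quote=True)      # "&" in the query string becomes "&amp;"
    tip = escape(title, quote=True)     # shown as the tool tip on hover
    return f'<a href="{href}" title="{tip}">{escape(label)}</a>'
```

Escaping the attribute values keeps the element well formed even when the URL carries a query string with several field-value pairs.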
Hypertext Transfer Protocol (HTTP) is a fast stateless information retrieval protocol for the Internet most commonly using the TCP/IP platform. Resources to be accessed are located using URLs. A typical HTTP session is a request/response transaction in which the client computer initiates a request for the desired action to be taken on the identified resource, eventually receiving the result of the request from the server. When HTTP is combined with secure sockets layer (SSL) and/or transport layer security (TLS) protocols to provide encryption and secure identification of the server, it is known as Hypertext Transfer Protocol Secure (HTTPS). In the implementation of this invention, HTTP or HTTPS can be used interchangeably depending on the level of security desired.
Add-ons are optional computer programs that, if combined with the host program, will supplement or enhance the functionalities of the host program. Third party software developers usually provide such add-ons. Add-ons include plug-ins and extensions, well understood by those skilled in the art. The add-ons by themselves are usually non-functional. The host program provides means for the add-ons to register with the host program.
Most computer software, for example a generic word processing software of which the user interface 5 is shown in
Various embodiments of the present invention will now be discussed with reference to the appended drawings. It is appreciated that these drawings depict only typical embodiments of the invention and are therefore not to be considered limiting of its scope.
Embodiments of the present invention relate to methods, computer program products and/or systems that manage and generate citations for many different types of writings including scholarly writings. The drawings and accompanying description are merely exemplary. Accordingly, the scope of the present invention is not intended to be limited by the examples discussed herein.
Embodiments of the present invention provide an interoperable method of managing and generating citations amongst different types of word processing software.
The following discussion now refers to a number of methods and method acts that may be performed. It should be noted, that although the method acts may be discussed in a certain order or illustrated in a flow chart as occurring in a particular order, no particular ordering is necessarily required unless specifically stated, or required because an act is dependent on another act being completed prior to the act being performed.
Embodiments of the present invention may comprise or utilize a special purpose or general-purpose computer including computer hardware, such as, for example, one or more processors and system memory, as discussed in greater detail below. Embodiments within the scope of the present invention also include physical and other computer-readable media for carrying or storing computer-executable instructions and/or data structures. Such computer-readable media can be any available media that can be accessed by a general purpose or special purpose computer system. Computer-readable media that store computer-executable instructions are computer storage media. Computer-readable media that carry computer-executable instructions are transmission media. Thus, by way of example, and not limitation, embodiments of the invention can comprise at least two distinctly different kinds of computer-readable media: computer storage media and transmission media.
Computer storage media includes RAM, ROM, EEPROM, CD-ROM or other optical disk storage, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store desired program code means in the form of computer-executable instructions or data structures and which can be accessed by a general purpose or special purpose computer.
A “network” is defined as one or more data links that enable the transport of electronic data between computer systems and/or modules and/or other electronic devices. When information is transferred or provided over a network or another communications connection (either hardwired, wireless, or a combination of hardwired or wireless) to a computer, the computer properly views the connection as a transmission medium. Transmission media can include a network and/or data links which can be used to carry data or desired program code means in the form of computer-executable instructions or data structures and which can be accessed by a general purpose or special purpose computer. Combinations of the above should also be included within the scope of computer-readable media.
Further, upon reaching various computer system components, program code means in the form of computer-executable instructions or data structures can be transferred automatically from transmission media to computer storage media (or vice versa). For example, computer-executable instructions or data structures received over a network or data link can be buffered in RAM within a network interface module (e.g., a “NIC”), and then eventually transferred to computer system RAM and/or to less volatile computer storage media at a computer system. Thus, it should be understood that computer storage media can be included in computer system components that also (or even primarily) utilize transmission media.
Computer-executable instructions comprise, for example, instructions and data which cause a general purpose computer, special purpose computer, or special purpose processing device to perform a certain function or group of functions. The computer executable instructions may be, for example, binaries, intermediate format instructions such as assembly language, or even source code. Although the subject matter has been described in language specific to structural features and/or methodological acts, it is to be understood that the subject matter defined in the appended claims is not necessarily limited to the described features or acts described above. Rather, the described features and acts are disclosed as example forms of implementing the claims.
Those skilled in the art will appreciate that the invention may be practiced in network computing environments with many types of computer system configurations, including, personal computers, desktop computers, laptop computers, message processors, hand-held devices, multi-processor systems, microprocessor-based or programmable consumer electronics, network PCs, minicomputers, mainframe computers, mobile telephones, PDAs, pagers, routers, switches, and the like. The invention may also be practiced in distributed system environments where local and remote computer systems, which are linked (either by hardwired data links, wireless data links, or by a combination of hardwired and wireless data links) through a network, both perform tasks (e.g. cloud computing, cloud services and the like). In a distributed system environment, program modules may be located in both local and remote memory storage devices.
A computer network is a collection of computers and devices interconnected by communications channels that facilitate communications and sharing of resources and information among the interconnected computers and devices. Computer networks are identified by their scale and scope, such as Local Area Network (LAN) and Wide Area Network (WAN). The Internet is a global system of interconnected computer networks most commonly using the TCP/IP communication protocol. These terms are well known to those skilled in the art. The terms network, computer network and the Internet are used interchangeably and the present invention is applicable to all of these configurations.
For the purpose of describing one embodiment of this invention, the term “citation add-ons” refers to add-ons provided for by this invention for the purpose of managing and generating citations. In one embodiment, the invention provides citation add-ons to be installed in existing word processing software such as Microsoft® Word, OpenOffice.org™, Google Docs™ and Zoho® Writer for the purpose of managing and generating citations. This invention is applicable to any rich text documents including but not limited to HTML documents. A user interface 5 of a generic word processing software as shown in
In one embodiment, the HTML elements are hyperlinks. The element content of the hyperlinks is displayed as in-text citations or as bibliographic citations. An in-text hyperlink is a hyperlink in which the element content is the appropriately formatted text to represent in-text citation in accordance with a desired citation style. In the context of in-text hyperlink, an in-text label is the element content containing the formatted text representing the in-text citation. An in-text URL is the URL of an in-text hyperlink. An in-text title is an element title of an in-text hyperlink.
Similarly, a bibliographic hyperlink is a hyperlink in which the element content contains the text for a bibliographic citation that is appropriately formatted in accordance with a desired citation style. In general there is only one bibliographic hyperlink per document. In the case of books there can be multiple bibliographic hyperlinks, with one bibliographic hyperlink for each chapter of the book. A bibliographic URL is the URL for a bibliographic hyperlink. In the context of a bibliographic hyperlink, a bibliographic label is the element content containing the formatted text representing the bibliographic citation. A bibliographic label includes a section title and a list of bibliographic items derived from the list of in-text hyperlinks, formatted in accordance with the desired citation style. The bibliographic hyperlink is inserted at a defined point in the document. The location of the bibliographic hyperlink in the document may be dictated by the citation style, by the requirement of the publisher or by generally accepted convention. A bibliographic title is an element title of a bibliographic hyperlink pertaining to the selected item. In an alternative embodiment, the bibliographic title can be configured to display any desired message (e.g. an advertisement) or, alternatively, it could be left blank.
The installation of a citation add-on in the host program creates a series of command buttons 531, 532, 533, 534 in the tool bar area 52 of the host program as illustrated in
The following is a description of one embodiment of the citation add-on. Referring to
An example of an in-text URL in response to the HTTP request is as follows.
https://www.wizfolio.com/?flag=1&type=1&ver=3&ItemID=2E171Y&UserID=53X49&AccessCode=F799BFC25&OptionalSuff=
In the above example the field-value pair “flag=1” indicates that this URL relates to an in-text citation. The field-value pair “type=1” indicates that the type of item this in-text citation represents is a journal article. A different number, for example “type=2”, may indicate that the type of item is a book. The field-value pair “ver=3” indicates the version of the citation add-on that was used. The field-value pairs “UserID=53X49” and “ItemID=2E171Y” are identifiers relating to the user and the specific item for this in-text citation, respectively. These identifiers are useful for retrieving data (e.g., bibliographic data from a database). The in-text URL can provide links to the corresponding bibliographic data for the selected stored reference items. Additionally, bibliographic data of the selected item can also be embedded in the in-text URL (e.g., as field-value pairs in the query segment of the in-text URL). The name-value pair “AccessCode=F799BFC25” is an access code for giving permission to the user to retrieve bibliographic data for a specific item in a specific user's collection. The access code is generated at the time the item is created in the database of a user's account or, alternatively, it can be created by the Selection Function 91 when an item is selected by the user. The role of the access code in this invention is described later. To those skilled in the art, it will be apparent that additional name-value pairs can also be included, for example an optional attribute OptionalSuff. The in-text label and in-text title resulting from the Insert Call Routine 81 are in the form of a temporary in-text label and a temporary title, respectively. The Insert Call Routine 81 then generates an in-text hyperlink for each of the selected items. The in-text hyperlinks are generated in accordance with the encoding for hyperlinks of the host software using values from the in-text triplets.
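As a minimal sketch of how the field-value pairs of such an in-text URL could be read back, the query segment can be split with Python's standard urllib.parse module. The function name parse_in_text_url and the ITEM_TYPES table are assumptions for illustration; only the two type codes named above are included.

```python
from typing import Dict
from urllib.parse import parse_qs, urlsplit

# Item type codes mentioned in the description; any other codes are unknown here.
ITEM_TYPES = {"1": "journal article", "2": "book"}


def parse_in_text_url(url: str) -> Dict[str, str]:
    """Return the field-value pairs in the query segment of an in-text URL."""
    query = urlsplit(url).query
    # keep_blank_values preserves empty fields such as "OptionalSuff=".
    pairs = parse_qs(query, keep_blank_values=True)
    return {field: values[0] for field, values in pairs.items()}


example = ("https://www.wizfolio.com/?flag=1&type=1&ver=3&ItemID=2E171Y"
           "&UserID=53X49&AccessCode=F799BFC25&OptionalSuff=")
fields = parse_in_text_url(example)
print(ITEM_TYPES.get(fields["type"], "unknown"), fields["ItemID"])
```

Returning a plain dictionary keeps the sketch independent of how a particular host program stores its hyperlinks.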
The Scan Element Routine 82 is next invoked to collect information on all the in-text hyperlinks in the document. The Scan Element Routine 82 scans the whole document for in-text hyperlinks to collect the information contained within them. Hyperlinks within the document are detected in accordance with the encoding for hyperlinks of the host program. The Scan Element Routine 82 identifies a hyperlink as an in-text hyperlink inserted by the Insert Call Routine 81 by recognizing certain predetermined characteristics of the URL of the in-text hyperlink, such as the hostname within the in-text URL or one or more predetermined field-value pair(s) in the query segment of the in-text URL. As an alternative, a name-value pair within the opening tag of the hyperlink can be used as a flag for identifying a hyperlink as an in-text hyperlink. The Scan Element Routine 82 identifies each of the in-text hyperlinks, the sequence of appearance of the in-text hyperlinks and whether there is intervening text between sequential in-text hyperlinks. If there is no text intervening between two sequential in-text hyperlinks, then the two in-text hyperlinks are considered contiguous in-text hyperlinks. For the purpose of determining contiguous in-text hyperlinks, the term “text” here means any character including punctuation and spaces. Each citation style has its own rules with respect to how two or more contiguous in-text citations are formatted. For example, in APA format, three contiguous in-text citations are formatted as (Lee, 2007; Johnson, 1997; Rock, 1994) whereas in one form of NLM citation style, the three in-text citations are represented as (1-3). A group of contiguous in-text hyperlinks forms an in-text citation block. The information pertaining to in-text citation blocks is used for the proper formatting of citations in accordance with a specific citation style.
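The scanning and grouping described above can be sketched for a document stored as HTML. This is a simplified illustration, not the routine itself: it assumes in-text hyperlinks are recognized by the hostname and by “flag=1”, it treats any character data outside an in-text hyperlink as intervening text, and the names scan_citation_blocks and format_block are hypothetical.

```python
from html.parser import HTMLParser
from typing import List, Tuple
from urllib.parse import parse_qs, urlsplit


class _InTextScanner(HTMLParser):
    """Collects in-text hyperlink URLs, grouped into contiguous blocks."""

    def __init__(self, hostname: str) -> None:
        super().__init__(convert_charrefs=True)
        self.hostname = hostname
        self.blocks: List[List[str]] = []
        self._inside_citation = False
        self._text_since_last = True  # forces a new block for the first link

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href") or ""
        parts = urlsplit(href)
        flag = parse_qs(parts.query).get("flag", [""])[0]
        if parts.hostname == self.hostname and flag == "1":
            if self._text_since_last:
                self.blocks.append([])
            self.blocks[-1].append(href)
            self._inside_citation = True
            self._text_since_last = False

    def handle_endtag(self, tag):
        if tag == "a":
            self._inside_citation = False

    def handle_data(self, data):
        # Any character outside an in-text hyperlink, even a space, breaks a block.
        if data and not self._inside_citation:
            self._text_since_last = True


def scan_citation_blocks(html: str, hostname: str = "www.wizfolio.com") -> List[List[str]]:
    scanner = _InTextScanner(hostname)
    scanner.feed(html)
    scanner.close()
    return scanner.blocks


def format_block(entries: List[Tuple[str, int, int]], style: str) -> str:
    """Format one block; entries are (author, year, reference number)."""
    if style == "author-date":
        return "(" + "; ".join(f"{a}, {y}" for a, y, _ in entries) + ")"
    numbers = [n for _, _, n in entries]
    consecutive = all(b - a == 1 for a, b in zip(numbers, numbers[1:]))
    if len(numbers) > 1 and consecutive:
        return f"({numbers[0]}-{numbers[-1]})"
    return "(" + ",".join(str(n) for n in numbers) + ")"
```

The two branches of format_block reproduce only the two examples given above, the author-date form and the numeric range form; a real citation style has many more rules.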
The Scan Element Routine 82 also identifies a particular hyperlink as a bibliographic hyperlink by recognizing a predetermined element attribute flagging that hyperlink as a bibliographic hyperlink. Additional element attributes include name-value pair for version of citation add-on, name-value pair for user identification and name-value pair for the selected citation style. An example of a bibliographic URL is as follows.
https://www.wizfolio.com/?flag=2&ver=3&UserID=5S$4AA&StyleName=APA
In the above example, the name-value pair “flag=2” indicates that this hyperlink is a bibliographic hyperlink. The name-value pair “ver=3” is the version of the citation add-on. The name-value pair “UserID=5S$4AA” is used to identify the user. The name of the citation style selected is APA, as given by the name-value pair “StyleName=APA”. In addition to a default citation style, means can be incorporated to make the citation style selectable by the user. Further means can also be incorporated to make the citation style modifiable by the user.
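A short sketch of how a scanner might tell the two kinds of hyperlink apart using the flag value follows. The function name classify_hyperlink and the KINDS table are assumptions for illustration; the flag values 1 and 2 are the ones used in the example URLs.

```python
from typing import Dict, Optional, Tuple
from urllib.parse import parse_qs, urlsplit

# Flag values taken from the example URLs; the labels are illustrative.
KINDS = {"1": "in-text", "2": "bibliographic"}


def classify_hyperlink(url: str, hostname: str = "www.wizfolio.com") -> Tuple[Optional[str], Dict[str, str]]:
    """Return (kind, fields); kind is None for hyperlinks that are not citations."""
    parts = urlsplit(url)
    if parts.hostname != hostname:
        return None, {}
    pairs = parse_qs(parts.query, keep_blank_values=True)
    fields = {name: values[0] for name, values in pairs.items()}
    return KINDS.get(fields.get("flag", "")), fields


kind, fields = classify_hyperlink(
    "https://www.wizfolio.com/?flag=2&ver=3&UserID=5S$4AA&StyleName=APA")
print(kind, fields.get("StyleName"))
```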
A citation packet is a packet of information comprising the information embedded in all the in-text hyperlinks, the information embedded in the bibliographic hyperlink, the sequence of the hyperlinks and the in-text citation blocks. Bibliographic Call Routine 83 initiates an HTTP request to a host computer on the Internet by sending the citation packet to the Citation Engine Function 92 (
The response to the HTTP call of Bibliographic Call Routine 83 includes the updated element contents of all the in-text hyperlinks and the bibliographic hyperlink. The Update Routine 84 then refreshes the document with the updated element contents for all the in-text hyperlinks and the bibliographic hyperlink.
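The packet-and-refresh cycle can be sketched as follows. The JSON layout, the Hyperlink class and the function names are assumptions for illustration only; the actual packet format and the HTTP exchange are not specified here, so the response is represented simply as a list of labels in document order.

```python
import json
from dataclasses import dataclass
from typing import List


@dataclass
class Hyperlink:
    url: str
    label: str       # element content shown in the document
    title: str = ""  # element title (tool tip)


def build_citation_packet(blocks: List[List[Hyperlink]], bibliographic: Hyperlink) -> str:
    """Serialize hyperlink URLs, their order and the citation blocks (assumed layout)."""
    packet = {
        "in_text_blocks": [[link.url for link in block] for block in blocks],
        "bibliographic": bibliographic.url,
    }
    return json.dumps(packet)


def apply_update(links: List[Hyperlink], updated_labels: List[str]) -> None:
    """Replace each element content with the label returned for that position."""
    if len(links) != len(updated_labels):
        raise ValueError("response does not match the hyperlinks in the document")
    for link, label in zip(links, updated_labels):
        link.label = label
```

Matching labels to hyperlinks by position rather than by URL is a design choice of this sketch: it lets the same item be cited more than once without ambiguity.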
To insert a citation into the document, the user positions the cursor 60 in the user interface to a desired position in the document as depicted in
For the purpose of explaining the function of the access code, we use the example of two users collaboratively working on a single document, where each user cites from their own database of bibliographic data. When an item is selected by a first user using the Selection Function 91, the right to access this item is determined by the unique access code of the selected item belonging to the first user. As previously described, the access code is embedded in the in-text URL of the in-text triplet. The in-text URL as previously described also contains the identification of the first user and the item identification of the selected item whose bibliographic data resides in a database controlled by the first user.
One embodiment of this invention comprises the use of hyperlinks to represent in-text citations and bibliographic citations, in which the hyperlinks provide the means to locate the appropriate bibliographic data and citation style for generating in-text citations and bibliographic citations. The hyperlinks themselves may not contain the full set of bibliographic data used to generate in-text citations and bibliographic citations for all citation styles. The in-text labels and bibliographic label contain only a subset of the bibliographic data sufficient for the prevailing citation style. After the first user has inserted certain citations into the document, he sends it to the second user. The second user opens the document created by the first user to edit it. When he chooses a citation style different from that of the first user, the Citation Engine Function 92 will retrieve bibliographic data for each of the citations cited by the first user from the database controlled by the first user. The access codes give permission for the second user to access specific items in the first user's database. The access codes are item-specific and, therefore, do not give permission for the second user to have unlimited access to all data in a database of the first user.
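The item-specific nature of the access code can be illustrated with a minimal sketch of a server-side check. The in-memory store, the second item, its access code and the sample bibliographic values are all made up for illustration; how the Citation Engine Function 92 actually stores and verifies access codes is not specified here.

```python
import hmac
from typing import Dict, Optional, Tuple

# Hypothetical store keyed by (UserID, ItemID); the bibliographic values are made up.
STORE: Dict[Tuple[str, str], dict] = {
    ("53X49", "2E171Y"): {"access_code": "F799BFC25",
                          "data": {"author": "Lee", "date": "2007"}},
    ("53X49", "9ZZ000"): {"access_code": "A1B2C3D4E",
                          "data": {"author": "Rock", "date": "1994"}},
}


def fetch_bibliographic_data(user_id: str, item_id: str, access_code: str) -> Optional[dict]:
    """Return an item's data only if the access code matches that specific item."""
    record = STORE.get((user_id, item_id))
    if record is None:
        return None
    # Constant-time comparison avoids leaking how much of the code matched.
    if not hmac.compare_digest(record["access_code"], access_code):
        return None
    return record["data"]
```

Because the code is checked against a single item record, holding the access code for one item grants nothing for the other items in the same user's collection.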
This invention provides a citation style button 532 which when clicked will open a choose citation style window 7 as shown in
This invention also provides a list bibliography button 534. Clicking on this button will invoke the Scan Element Routine 82 followed by the Bibliographic Call Routine 83. However, in this instance, the Bibliographic Call Routine 83 will make an HTTP request for the List Citation Function, a resource residing on a target host computer on the Internet. The List Citation Function processes the information contained in the citation packet transferred by the HTTP request. The List Citation Function extracts all the information from the in-text hyperlinks and generates a bibliographic list to be displayed in a bibliographic list window. The bibliographic list window allows the user to select the desired item to be saved into a desired location controlled by the user, such as a bibliographic database.
In
In one embodiment of the invention, hyperlinks are used to represent in-text citation and bibliographic citation. In an alternative embodiment any HTML element with attributes and element content can be used to represent in-text citations and bibliographic citations.
In yet another alternative embodiment of the invention, the information pertaining to the bibliographic parts such as the author, title, volume, page number, date of publication and publisher can be directly embedded into an HTML element as name-value pairs or any other data encoding format well known to those skilled in the art such as XML and JSON. The element content to represent in-text citation is generated using the information from the embedded data in the HTML element.
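A minimal sketch of this alternative embodiment, assuming JSON as the encoding, follows. The span element, the attribute name data-citation, the author-date label format and the function names are assumptions made for illustration and are not prescribed by the description.

```python
import json
from html import escape
from html.parser import HTMLParser
from typing import Dict, List


def make_citation_element(item: Dict[str, str]) -> str:
    """Embed bibliographic parts as JSON and derive the displayed label from them."""
    payload = escape(json.dumps(item, sort_keys=True), quote=True)
    label = f"({item['author']}, {item['date']})"  # assumed author-date form
    return f'<span data-citation="{payload}">{escape(label)}</span>'


class _EmbeddedReader(HTMLParser):
    """Collects the JSON payloads embedded in citation elements."""

    def __init__(self) -> None:
        super().__init__()
        self.items: List[Dict[str, str]] = []

    def handle_starttag(self, tag, attrs):
        raw = dict(attrs).get("data-citation")
        if raw is not None:
            # The parser has already unescaped the attribute value.
            self.items.append(json.loads(raw))


def read_embedded_items(html: str) -> List[Dict[str, str]]:
    reader = _EmbeddedReader()
    reader.feed(html)
    reader.close()
    return reader.items
```

Since the bibliographic parts travel inside the element, the label can be regenerated in another citation style without a database lookup, at the cost of a larger document.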
This application is based upon and claims priority to U.S. Provisional Patent Application No. 61/313,803, filed on Mar. 15, 2010, entitled “Methods for Managing and Generating Citations in Scholarly Work” which is incorporated herein by reference.
Number | Date | Country | |
---|---|---|---|
61313803 | Mar 2010 | US |