The foregoing aspects and many of the attendant advantages of this invention will become more readily appreciated as the same become better understood by reference to the following detailed description, when taken in conjunction with the accompanying drawings, wherein:
Generally described, aspects of the present invention are directed toward providing available source data and localized information for utilization in applications or documents. Applications can include software applications, multimedia applications, Internet applications such as a Web browser, scripts, Web pages, libraries, word processing programs, applications used on portable devices, or any type of application which processes data. More specifically, a source string is obtained and used to search for corresponding source data. The source data can include one or more available resources or one or more content elements. Further, the resources can correspond to strings. For example, the source string can be a string processed by an application and the source data can correspond to strings which have been or will be used in other applications. The resources can also correspond to linking information. For example, an author can link to a resource which has not yet been finalized. One of the resources or content elements included in the source data can be selected to obtain data associated with the selected resource. The associated data can include localized information corresponding to the selection of source data. For example, the associated data can include translations corresponding to the selection of source data. Thus, a user can be provided with available source data and localized information which can be used in an application or document.
Validation processes can be performed on the source string. For example, the source string can be validated as it is entered by a user. Validation processes can also be performed on the source data and the associated data. For example, source data and associated data may be validated before it is used in a new application. Several different types of validation processes can be performed on the source string, source data, and associated data. Examples of the types of validation processes which can be performed include functional validation, linguistic validation, validation according to a black-list or white-list, stylistic validation, terminology validation, and any other type of validation which helps to ensure that data conforms to the validation procedures. The various types of validation processes are described in more detail below.
With reference now to
A data provider 102 can store and provide data from a third-party data provider, from a machine-translation engine, and/or from data providers designated by a system administrator. A data manager 104 can obtain data from the one or more data providers 102 and consolidate the various data sets. Consolidation of data can involve prioritizing data such that it is displayed to the user or persisted in a certain order. For example, the data manager 104 can learn from user feedback, either explicit or implicit, how to prioritize the results. If a user consistently discards items from a certain data provider, the data can be consolidated such that the items retrieved from that data provider are ranked lower on the data stack over time. There are no inherent restrictions on the data sources. However, a user of the system may choose to limit and/or prioritize the sources of data. For example, a user may choose to exclude third-party data providers or to display or persist data from third-party providers last. Alternatively, the data manager 104 can use other methods to sort the data within the consolidated data set such that data is displayed or persisted in an order relative to the priority of the data. The one or more data providers 102 can provide information directly to the client device 108. In the case where multiple data providers 102 provide information directly to the client device 108, a data manager component 104 located on the client device 108 can consolidate the data sets received from each data provider 102. For example, the data sets can be consolidated such that redundant data is removed from the consolidated set of data. Further, within the consolidated data set, the data can be sorted and displayed in different ways showing higher priority data first.
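By way of illustration only, the consolidation described above can be sketched as follows. The sketch assumes that each item carries the name of its data provider and that priority is expressed as a numeric rank per provider. The names `Item`, `consolidate`, and `demote` are hypothetical and are not part of the described system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    text: str      # the resource string
    provider: str  # name of the data provider that supplied it (assumed field)


def consolidate(data_sets, provider_priority):
    """Merge data sets, remove redundant items, and sort by provider priority.

    provider_priority maps a provider name to a rank; lower ranks come first.
    Providers missing from the map are placed last.
    """
    best = {}
    for data_set in data_sets:
        for item in data_set:
            rank = provider_priority.get(item.provider, float("inf"))
            current = best.get(item.text)
            # Keep only the copy that comes from the highest-priority provider.
            if current is None or rank < current[0]:
                best[item.text] = (rank, item)
    ordered = sorted(best.values(), key=lambda pair: (pair[0], pair[1].text))
    return [item for _, item in ordered]


def demote(provider_priority, provider, step=1):
    """Return a new priority map in which `provider` is ranked lower.

    This stands in for learning from feedback, e.g. after a user discards
    an item retrieved from that provider.
    """
    updated = dict(provider_priority)
    updated[provider] = updated.get(provider, 0) + step
    return updated
```

Calling `demote` each time a user discards an item would cause items from that provider to sink in the consolidated list over time, which is one possible reading of the feedback-based ranking described above.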
In an illustrative embodiment, a data-provider interface component 112 can obtain data from one or more data providers 102. The data provider interface component 112 can obtain data from one or more local data providers 102 or from one or more remote data providers 102 over a network 105. A local data provider 102 can be distributed via CD, USB device, or other type of removable storage device. Local data providers 102 can be used in markets where there is insufficient communication structure to use remote data providers 102. Data from the data providers can be consolidated by a data manager component 104 before being provided to the data-provider interface component 112. Locally retrieved data can be consolidated by a data manager component local to the client device 108. The one or more data providers 102 can store source data, associated data, or both. A user-interface component 114 can obtain data from the data-provider interface component 112 for display to a user. In an illustrative embodiment, the user-interface component 114 represents the front-end of a stand-alone application. Alternatively, an application-interface component 110 can integrate the user-interface component 114 with one or more software applications 116. For example, the user-interface component 114 can be integrated into a software development application such that authors have access to validated resources and localized information in the typical development environment. The user-interface component can be integrated with any type of authoring program, such as, for example, a content authoring program. In the case of content authoring, instead of reusing available resources, available content would be re-used. Content authoring is discussed in more detail below.
One skilled in the relevant art will appreciate that the data and/or components may be stored on a computer-readable medium and loaded into memory of the client device 108 using a drive mechanism associated with the computer-readable medium, such as a floppy, CD-ROM, DVD-ROM drive, or network interface. Further, the components can be included on a single device or distributed in any manner. For example, all the components could be located on the client device 108. Furthermore, the components can be integrated in any manner. For example, the user-interface component 114, data-provider interface component 112, and application interface component 110 could be integrated into a single component. Furthermore, the components shown in
With reference now to
With continued reference to
With reference now to
With continued reference to
With reference now to
With continued reference to
The data provider queries can also include queries across data types. For example, a user can search for a source string which matches an audio file, another string, a bitmap, a video file, or any other data type. Furthermore, searches can be further restricted using metadata which specifies attributes of corresponding source data and associated data. For example, metadata can specify the author of the data, the language, the size of the corresponding data, in what context the data was previously used (e.g., on a button, in a script, etc.), or other attributes. If the corresponding data is a bitmap, the metadata can specify the color depth of the bitmap. Utilizing the metadata, the data queries can be restricted such that only data from a specific author or of a specific size is returned. For example, if an author creates a string for use as a button label, when running the search, the author can specify that only data which was previously used as a button label should be returned.
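By way of illustration only, a metadata-restricted query can be sketched as follows. The sketch assumes that each resource is a dictionary with a `text` entry and an optional `metadata` dictionary of name-value pairs. The function name `query` and the metadata names `context` and `author` are hypothetical.

```python
def query(resources, key_terms, **required_metadata):
    """Return resources whose text contains every key term and whose
    metadata matches every required name-value pair."""
    terms = [term.lower() for term in key_terms]
    results = []
    for resource in resources:
        text = resource["text"].lower()
        metadata = resource.get("metadata", {})
        matches_terms = all(term in text for term in terms)
        matches_metadata = all(
            metadata.get(name) == value
            for name, value in required_metadata.items()
        )
        if matches_terms and matches_metadata:
            results.append(resource)
    return results
```

For example, `query(resources, ["open"], context="button")` would return only those resources containing "open" that were previously used as a button label.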
With reference now to
The source string 404 can be validated. In an illustrative embodiment, the source string is validated as the user types it into the source string display portion 406. Different types of validation can be performed on the source string 404. For example, a functional validation can be performed to detect localizability issues. Further, functional validation can be used to verify that portions of a source string which have functional aspects will continue to function correctly in a target string. For example, variables within a source string, sometimes referred to as “placeholders”, are replaced with a value before the string is displayed to a user. Functional validation can be used to verify that placeholders will continue to function correctly in localized versions of the source string. In an illustrative embodiment, functional validation can be directly integrated into the software development environment. For example, functional validation can be directly integrated such that validation can be performed while the author is entering source text. This allows the author to be notified immediately of any potential issues. Because validation can be performed during development of an application, an author or developer can decide whether to change the source string to be consistent with constraints imposed by the application or whether to modify any constraints imposed by the application such that the source string validates against those constraints. Performing validation at design time allows the designer to modify the application when it is most efficient to do so.
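By way of illustration only, one form of functional validation of placeholders can be sketched as follows. The sketch assumes placeholders are written in a brace style such as `{0}` or `{name}`, which is only one of many possible placeholder syntaxes. The function name `placeholder_issues` is hypothetical.

```python
import re
from collections import Counter

# Assumed placeholder syntax: a word enclosed in braces, e.g. {0} or {name}.
PLACEHOLDER = re.compile(r"\{(\w+)\}")


def placeholder_issues(source, target):
    """Report placeholders whose counts differ between source and target."""
    source_counts = Counter(PLACEHOLDER.findall(source))
    target_counts = Counter(PLACEHOLDER.findall(target))
    issues = []
    for name in sorted(source_counts.keys() | target_counts.keys()):
        if source_counts[name] != target_counts[name]:
            issues.append(
                f"placeholder {{{name}}}: {source_counts[name]} in source, "
                f"{target_counts[name]} in target"
            )
    return issues
```

An empty result indicates that every placeholder in the source string also appears, the same number of times, in the target string. A check of this kind is cheap enough to run on each change to the text as the author enters it.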
A linguistic validation can also be provided to help users create source strings that are free of grammar and spelling errors. Linguistic validation can also be performed as the source string is input by a user. White-list and black-list validation can be performed against terms in a string. As discussed above, a white-list validation validates that terms on a white list appear in the string whereas a black-list validation validates that terms on a black list do not appear in the string. Geopolitical validation is a type of terminology validation and can be performed against a black-list or white-list to verify that a source string conforms to geopolitical requirements. For example, country A may not recognize the existence of country B. As such, using the name of country B in products targeted to country A can cause the product to be ill-received. Geopolitical validation can verify that source strings do not contain such geopolitically sensitive terms. Likewise, geopolitical validation can be used to verify that products shipped to country B include country B's name where appropriate. Geopolitical validation can be performed as the source string is input by a user. Geopolitical validation can also be performed on target strings. As demonstrated by geopolitical validation, the validation techniques used are market-dependent. That is, the validation techniques used can change based on the target market. A stylistic validation can verify that the source string 404 is understandable to humans and/or machine translation engines. Stylistic validation can be performed as the source string is input by a user. The different types of validation can be provided within a stand-alone tool or integrated into other applications.
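By way of illustration only, white-list and black-list validation can be sketched as follows. The sketch assumes case-insensitive, whole-word matching of terms, which is a simplification. The function name `term_issues` is hypothetical.

```python
import re


def term_issues(text, black_list=(), white_list=()):
    """Report black-listed terms that appear and white-listed terms that do not.

    Matching is case-insensitive and on word boundaries, so this sketch
    suits terms that begin and end with a letter or digit.
    """

    def contains(term):
        pattern = r"\b" + re.escape(term) + r"\b"
        return re.search(pattern, text, re.IGNORECASE) is not None

    issues = [f"forbidden term present: {term}" for term in black_list if contains(term)]
    issues += [f"required term missing: {term}" for term in white_list if not contains(term)]
    return issues
```

Because the validation is market-dependent, a caller could keep one pair of lists per target market and pass the pair that corresponds to the market being validated.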
With continued reference to
Illustrative user interface 400 can also include a status display control 408 for hiding or displaying a status display portion. The status display portion can be used to provide status information. Search display control 410 can be used to display or hide a source data display portion 412. The source data display portion 412 can display source data corresponding to the source string 404. The source data can be retrieved by a user by clicking button 426. In an illustrative embodiment, the source data is retrieved from one or more data providers 102. The source data can include available resources. Alternatively, the source data can include available content. The source data can also include tags which link to resources or content which have not yet been finalized. For example, an author may create a new resource or content and make the item available. Other authors can choose to link to the tag even though it has not been finalized or validated. Once the resource or content is finalized, the link allows the finalized version to be obtained automatically. In an illustrative embodiment, the source data includes user-interface strings which have been previously used in other applications. Consistency within an application and across applications can be achieved by re-using existing strings. A key term or terms can be entered into the source string display portion 406 and used to form the query for source data 412. The user can also attempt to type the string exactly as they would like it to appear. Metadata corresponding to the resources in the source data 412 can also be retrieved from the one or more data providers 102, either simultaneously or any time at the user's request. The metadata can provide information about the resources and be used to validate the resources. User-interface 400 can also include an indicator 416 of the number of resources included in the source data set. Metadata display control 418 can hide or display a metadata display portion. The metadata display portion will be described below. An MT (machine translation) check display control 424 can hide or display an MT check display portion. Running a machine-translation check provides the user with an indicator of the quality of a potential machine translation of the source string. Copy button 428 can be used to copy source strings or source data to other portions of the display or to other applications.
With reference now to
Each time an available resource or content is selected, the selected item can include a link back to the element of source data such that if the element is modified at a later date, the modification can be obtained automatically. For example, a user may select a resource such as “File not found in root directory”. At a later time, the resource may be modified to “File not found in base directory”. Because of the linking, all users who have selected this item can automatically retrieve the updated item. In this way, terminology consistency across applications can be provided. Likewise, links can be provided for elements within the associated data. As discussed above, the associated data can include localized information corresponding to the selected element of source data. If the localization changes, the modified localization can be obtained automatically due to the provided linking. Furthermore, any new localization can be automatically retrieved as a result of the provided linking. Thus, the present invention provides for consistent source and associated data across applications. Furthermore, the validation techniques can provide for centralized validation of the source and associated data.
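By way of illustration only, the linking described above can be sketched as follows. The sketch assumes that resources are stored centrally under an identifier and that a link resolves the current text each time it is read rather than keeping a copy. The class names `ResourceStore` and `ResourceLink` are hypothetical.

```python
class ResourceStore:
    """Central store of source resources and their localizations."""

    def __init__(self):
        self._source = {}     # resource id -> source text
        self._localized = {}  # resource id -> {language: localized text}

    def put(self, resource_id, text):
        """Create or modify the source text of a resource."""
        self._source[resource_id] = text

    def put_localization(self, resource_id, language, text):
        """Create or modify a localization of a resource."""
        self._localized.setdefault(resource_id, {})[language] = text

    def link(self, resource_id):
        """Return a link that always resolves to the current data."""
        return ResourceLink(self, resource_id)


class ResourceLink:
    """A reference to a resource; holds no copy of the text itself."""

    def __init__(self, store, resource_id):
        self._store = store
        self._resource_id = resource_id

    def text(self, language=None):
        """Return the current source text, or a localization if one exists."""
        if language is None:
            return self._store._source[self._resource_id]
        return self._store._localized.get(self._resource_id, {}).get(language)
```

Because the link stores only the identifier, a later change from “File not found in root directory” to “File not found in base directory” is seen by every holder of the link, and a localization added afterwards becomes available through the same link.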
In an illustrative embodiment, the associated data set can include one or more localized versions of the selected source data 414. For example, the associated data set can include one or more translations. In an illustrative embodiment, the associated data set can include a mapping of the selected source data 414 into a sound with equivalent meaning in the same language or another language. Other types of localized information well-known in the art can also be provided. The other types of localized information are sometimes referred to as equivalent re-interpretations. The associated data can be validated to verify that it was localized correctly. Further, the associated data could have been used previously in another application. Thus, by choosing to use a localization included in the associated data set corresponding to the selected source data, a user can ensure consistency both within an application and across applications. The associated data set display portion 500 can include a language column 502 to provide an indication of the language corresponding to a translation and a text column 504 for displaying localized versions of the selected source data. The user can select one or more of the localizations to include in a localized application. The source string 404 and source data 412 can be in any language. Thus, not only can translations from English to another language be provided, but translations from any language into any other language can be provided. Furthermore, by seeing which localizations are available for a particular element of source data, the user can choose to use an element of source data which has corresponding localizations which the user will eventually need.
Metadata 430 can also be obtained from the user. Metadata can include a name-value pair 420. Further, metadata can be used to validate a resource, source string, or associated data. For example, a user can place a maximum length constraint on a string using metadata. If the user desires to constrain a string to a maximum length of 40, the user can select or enter “MaxLength” as the name of the constraint and enter “40” into the value column. The “MaxLength=40” constraint 422 can be used to verify that the source string 432 does not exceed a length of 40. Metadata entered by a user can be used in addition to or in lieu of the validation techniques discussed above. The capturing of metadata can be integrated into the software development environment. Terminology metadata can also be entered by a user. For example, terms in a source string can be ambiguous. Terminology metadata entered by a user can clarify the ambiguity. Terminology metadata can be used by localizers to produce accurate localizations.
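By way of illustration only, validation against a “MaxLength” name-value pair can be sketched as follows. The sketch assumes that metadata values arrive as text, as they would when typed into a value column, and that length is measured in characters. The function name `check_constraints` is hypothetical.

```python
def check_constraints(text, metadata):
    """Validate a string against name-value metadata constraints.

    Only the "MaxLength" name is understood in this sketch; other
    names are ignored.
    """
    issues = []
    for name, value in metadata.items():
        if name == "MaxLength":
            limit = int(value)  # the value is entered as text, e.g. "40"
            if len(text) > limit:
                issues.append(
                    f"MaxLength={limit} exceeded: length is {len(text)}"
                )
    return issues
```

For example, `check_constraints(source_string, {"MaxLength": "40"})` returns an empty list when the string is 40 characters or shorter.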
Although illustrative embodiments have been generally discussed in the context of developing applications, it will be appreciated by one skilled in the art that the embodiments discussed herein are extensible to include content-authoring environments. For example, content authors could utilize the embodiments discussed herein to ensure that content is consistent across various platforms. Web page authors could utilize linking to ensure that the look and feel of Web pages is consistent across a Web site. Likewise, document authors could utilize linking to create document templates which ensure that standard content within a document is consistent. Furthermore, by linking content, not only is consistency assured in the source data, but valid, verified, and consistent localizations can also be efficiently made available. For example, once linked content in language A is translated into language B, the translation will be available to all the linkers of the content in language A.
Although illustrative embodiments have been generally discussed in the context of a user interface, it will be appreciated by one skilled in the art that the embodiments discussed herein are extensible to include a batch processing system. For example, a batch processing system with batch processing components can process several source strings at once. The source strings can be read from data storage, with corresponding source data, associated data, and any errors persisted to storage. In this manner, several source strings could be processed with the results persisted to data storage.
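By way of illustration only, such a batch processing system can be sketched as follows. The sketch assumes that the search and validation steps are supplied as callables and that results are persisted as a JSON file. The function names `process_batch` and `persist` and the record field names are hypothetical.

```python
import json


def process_batch(source_strings, search, validate):
    """Process several source strings and collect one record per string.

    search(s) returns the source data found for s; validate(s) returns
    a list of error messages for s.
    """
    records = []
    for source_string in source_strings:
        records.append(
            {
                "source_string": source_string,
                "source_data": search(source_string),
                "errors": validate(source_string),
            }
        )
    return records


def persist(records, path):
    """Write the batch results to data storage as JSON."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(records, handle, ensure_ascii=False, indent=2)
```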
While illustrative embodiments have been illustrated and described, it will be appreciated that various changes can be made therein without departing from the spirit and scope of the invention.