Extensible Markup Language (XML) is a common data language employed for various applications such as website development and other applications typically designed for the Internet. Generally, XML is considered a markup language for documents containing structured information. Structured information includes both content (words, pictures, and so forth) and some indication of what role that content plays (for example, content in a section heading has a different meaning from content in a footnote, which means something different than content in a figure caption or content in a database table, and so forth). Almost all documents have some structure. Thus, a markup language such as XML provides a mechanism to identify structures in a document, where the XML specification defines a standard way to add markup to documents. Another aspect of XML is XML Schema Definition (XSD), an XML-based language in which XSD statements are themselves written in XML files. One important function of XSD is that it defines validation rules for XML files, meaning that XSD can be utilized to replace Document Type Definitions (DTD), which is another language for defining XML validation rules.
Since the structure of XML files and XSD definitions is defined by textual data and statements, tools for manipulating such languages have not developed along the same path as tools for traditional code-based models for developing source code. For instance, code-based models typically operate with object classes where tools have developed over time to create desired software functionality. Although XML and XSD type declarations may have some similarity to previous code-based models and class structures, the differences with code-based models are such that XML/XSD tools over the last several years have developed along a different path, offering different types of functionality than code-based tools. One area where this difference is stark is in how files are operated upon in the XML/XSD development environment, where files are processed according to a “one-file-at-a-time” format, which presents substantial challenges to developers.
In one area where such challenges are encountered, a large number of XML schemas likely contain multiple XSD files. A collection of XSD files that define a single XML schema is referred to as a schema set, where the larger the domain described by the schema, the larger its schema set. For example, an HL7 schema includes multiple schema sets, which can have hundreds or thousands of XSD files. As noted above, tools that developers employ to work with schemas only work with one file at a time. This makes schema set operations either impossible or very difficult to achieve.
To illustrate the single file operation and processing problem, consider searching for a string in a schema set containing a large number of files. First, the user needs to know all the files in the set. To achieve this, the user would generally start with the top file in the set and then recursively traverse down its “include” files and import statements. Then, the user would have to either search each file individually or perform a bulk “find in files” operation. Searching files individually is very time consuming, especially for large schemas such as HL7. Performing a bulk “find in files” operation is also not trivial, since the files can be located in multiple folders, on multiple machines or in multiple Internet locations.
The following presents a simplified summary in order to provide a basic understanding of some aspects described herein. This summary is not an extensive overview nor is intended to identify key/critical elements or to delineate the scope of the various aspects described herein. Its sole purpose is to present some concepts in a simplified form as a prelude to the more detailed description that is presented later.
Set-based tools and methods are provided that enable manipulation, viewing, and development across schema sets in contrast to conventional single file schema operations. In one aspect, a file or subset of files is designated for development operations such as in the context of a software development environment. From the file (or file subset) designated for operations, links to other related files are automatically determined and located from schema directories, Internet locations, local directories and so forth. A schema set is automatically constructed which is then employed for further operations. By automatically building a set of files and later performing operations on the set, much time is saved over conventional single file manipulation operations. For example, after building the schema set, search operations can be performed across the set as opposed to individually trying to locate files and then individually searching the files to potentially find relevant data. Set-based operations can include semantic queries in another example where a developer can locate data items that may be associated with one or more other data items related to a query. Other features include editing and developing across the schema set. Thus, if a change were made in one portion of the schema set, the change can be propagated to other related members of the set. This mitigates having to manually search for related files and then manually performing desired operations on the files as is the case with conventional single file development systems.
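By way of illustration and not limitation, a search across an already constructed schema set can be sketched as follows. The function name `search_schema_set` and the in-memory representation (a dictionary mapping file names to file text) are assumptions made for this example and are not taken from the described system.

```python
from typing import Dict, List, Tuple


def search_schema_set(schema_set: Dict[str, str], needle: str) -> List[Tuple[str, int, str]]:
    """Search every member of a schema set for a string.

    schema_set maps a file name to the text of that file (an assumed,
    hypothetical representation). Returns one tuple per matching line:
    (file name, 1-based line number, stripped line text).
    """
    hits: List[Tuple[str, int, str]] = []
    for name in sorted(schema_set):  # stable, predictable result order
        for line_no, line in enumerate(schema_set[name].splitlines(), start=1):
            if needle in line:
                hits.append((name, line_no, line.strip()))
    return hits
```

Because the search runs over the whole set, a single call reports hits in files the user never opened, which is the contrast with single file searching drawn above.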
To the accomplishment of the foregoing and related ends, certain illustrative aspects are described herein in connection with the following description and the annexed drawings. These aspects are indicative of various ways which can be practiced, all of which are intended to be covered herein. Other advantages and novel features may become apparent from the following detailed description when considered in conjunction with the drawings.
A schema development system is provided. The system includes a location component to automatically determine members of a schema set. A processor component performs software development operations across the schema set. The members of the schema set can be implicitly determined from at least one development file, where the development file can be an XML file or an XSD file, for example. The software development operations can be associated with at least one of a search operation or a semantic query, for example.
As used in this application, the terms “component,” “query,” “schema,” and the like are intended to refer to a computer-related entity, either hardware, a combination of hardware and software, software, or software in execution. For example, a component may be, but is not limited to being, a process running on a processor, a processor, an object, an executable, a thread of execution, a program, and/or a computer. By way of illustration, both an application running on a server and the server can be a component. One or more components may reside within a process and/or thread of execution and a component may be localized on one computer and/or distributed between two or more computers. Also, these components can execute from various computer readable media having various data structures stored thereon. The components may communicate via local and/or remote processes such as in accordance with a signal having one or more data packets (e.g., data from one component interacting with another component in a local system, distributed system, and/or across a network such as the Internet with other systems via the signal).
Referring initially to
The system 100 facilitates automated operations and in particular building the schema sets 130. For example, when a user points to an XSD file at 120 (or other XML type), the location component 110 can recursively traverse a schema for includes and imports, create a schema set at 130 and perform all user operations (such as search, go to definition, show all references, and so forth) on the schema set via the processor component 150. The system 100 provides tools to automatically create the schema set 130 for the user, performing user operations on the schema set as opposed to a single file, and displaying to the user a structure of the entire set and not just a single file such as illustrated in
The common paradigm used by existing tools is that “users work with files that they opened.” The system 100 changes this model to include “working with schema sets 130.” Schema sets 130 can include XSD files that may or may not be opened by users. When users open an XSD file at 120 for instance, the location component 110 finds other XSD files explicitly and/or implicitly referenced by the original file and creates the schema set 130. This resulting set 130 can be displayed to the users and employed for other operations. Some conventional systems can build schema sets; however, they build them using only explicit references and use the sets only for validation of a single XSD file (the one that was opened by the user). All other operations in existing tools are performed against that same single file. Thus, for example, when searching for a string, existing tools display results from the single opened file but not from related files.
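A minimal sketch of how a schema set could be assembled from one starting file is given below. It is an illustration under stated assumptions rather than the described location component: it follows only explicit `schemaLocation` attributes on `xs:include`, `xs:import`, and `xs:redefine`, it handles only files on the local file system, and the function name `build_schema_set` is hypothetical. Implicit references and Internet locations are skipped.

```python
import os
import xml.etree.ElementTree as ET
from typing import Set

XS = "{http://www.w3.org/2001/XMLSchema}"
REFERENCE_TAGS = (XS + "include", XS + "import", XS + "redefine")


def build_schema_set(start_file: str) -> Set[str]:
    """Collect start_file plus every local file reachable through
    schemaLocation attributes (assumption: explicit, local references only)."""
    members: Set[str] = set()
    pending = [os.path.abspath(start_file)]
    while pending:
        path = pending.pop()
        if path in members or not os.path.isfile(path):
            continue  # already visited (handles cycles) or not available locally
        members.add(path)
        root = ET.parse(path).getroot()
        for child in root:  # include/import/redefine are children of xs:schema
            if child.tag in REFERENCE_TAGS:
                location = child.get("schemaLocation")
                if location:
                    # Resolve the reference relative to the referencing file.
                    base_dir = os.path.dirname(path)
                    pending.append(os.path.normpath(os.path.join(base_dir, location)))
    return members
```

The visited-set check lets the traversal terminate even when two files include each other, which can occur in large schema sets.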
Projects generally have a file that describes what files and resources are in the project, so that the tools know how to create it. In some cases, XSD files have explicit references to other XSD files (listing namespace/URI pairs or just URI location). In other cases, XSD files have references to namespaces without specifying a URI. In addition, all files in the project are usually under the developers' control. With schemas, a lot of files included in schema sets 130 can be outside of the developers' control (for example, all W3C standards and industry schema references). These files outside the developers' control can be automatically located and included in the schema sets 130. In another aspect, a schema development system is provided via the system 100. This includes means for identifying components of a schema (location component 110) in order to automatically determine members of the schema set 140. The system 100 also includes means for processing the schema set 140 (processor component 150) to facilitate software development across the schema set.
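One possible way to handle a reference that names a namespace without a URI is to consult a table of known schema locations. The sketch below assumes such a table exists; the `catalog` dictionary, the function name `resolve_import`, and the example paths are hypothetical and serve only to illustrate the distinction between explicit and implicit references.

```python
from typing import Dict, Optional


def resolve_import(namespace: Optional[str],
                   schema_location: Optional[str],
                   catalog: Dict[str, str]) -> Optional[str]:
    """Pick a location for one import statement.

    An explicit schemaLocation wins. Otherwise the namespace is looked up
    in catalog, an assumed mapping from namespace to a known location
    (for example, a local schema directory). Returns None if unresolved.
    """
    if schema_location:
        return schema_location          # explicit reference
    if namespace is not None:
        return catalog.get(namespace)   # implicit reference, resolved by lookup
    return None
```

For example, with `catalog = {"urn:example:common": "cache/common.xsd"}`, an import that names only `urn:example:common` resolves to `cache/common.xsd`, while an import with its own `schemaLocation` keeps that location.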
Proceeding to 210 of
Referring now to
Referring to
A Schema Set may appear as follows:
As illustrated in
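any
<xs:any> schema element can appear under the following parents: choice; and sequence. It is represented in the interface by the following items: an icon; and the text “any”.
anyAttribute
<xs:anyAttribute> schema element can appear under the following parents: complexType; restriction; extension; and attributeGroup. It is represented in the interface by the following items: an icon; and the text “anyAttribute”.
attribute
<xs:attribute> schema element can appear under the following parents: schema; complexType; restriction; extension; and attributeGroup. It is represented in the interface by the following items: an icon; name of the attribute; and type of the attribute (if the type is global).
attributeGroup
<xs:attributeGroup> schema element can appear under the following parents: schema; redefine; attributeGroup; complexType; restriction; and extension. It is represented in the interface by the following items: an icon; and name of the attributeGroup.
complexType
<xs:complexType> schema element can appear under the following parents: schema; redefine; and element. The following applies to <xs:complexType> schema elements when they appear under schema or redefine. ComplexType nodes are not displayed when their parent is <xs:element>. <xs:complexType> nodes are represented in the Schema Explorer (SE) by the following items: an icon; name of the complex type; derivation section (for derived type); an icon to represent derivation method (restriction or extension); name of the base type; and, if this type is redefined somewhere else in the schema, a glyph to indicate this fact.
element
<xs:element> schema element can appear under the following parents: schema; all; choice; and sequence. It is represented in the interface by the following items: an icon; name of the element; and type of the element.
group
<xs:group> schema element can appear under the following parents: schema; redefine; choice; complexType; extension; restriction; and sequence. It is represented in the interface by the following items: an icon; name of the group; and, if this group is redefined somewhere else in the schema, a glyph to indicate this fact.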
Import
<xs:import> schema element can appear only under <xs:schema> node. In the interface, it can be shown as a child of a file node and is represented in the interface by the following items:
An icon
Namespace (string value of the namespace attribute). If <xs:import> statement does not have a namespace attribute (it is optional according to the spec), a string “Empty Namespace” should be displayed.
include
<xs:include> schema element can appear only under <xs:schema> node. In the interface, it is shown as a child of a file node and is represented in the interface by the following items:
An icon
Value of the “schemaLocation” attribute
redefine
<xs:redefine> schema element can appear only under <xs:schema> node. In the interface it is shown as a child of a file node and is represented in the interface by the following items:
An icon
Value of the “schemaLocation” attribute
A list of child nodes
When a schema component is redefined, the old one and the new schema are shown in the interface. The old component can have an indicator (via an icon/glyph) that it has been redefined. The new component can appear under redefine parent if files are not filtered in the interface tree. If files are filtered, the new node can appear on the same level as other globals. The new component can have an option on the context menu to “go to original definition”. The old component can have an option on the context menu to “go to redefinition”. If there is more than one redefinition of a schema component, the redefinitions are shown and an error is reported.
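The rule that more than one redefinition of a schema component is reported as an error can be sketched as follows. The input shape (a list of component name and redefining file pairs) and the function name `find_multiple_redefinitions` are assumptions chosen for this illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def find_multiple_redefinitions(
        redefinitions: Iterable[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Group redefinitions by component name and return only the components
    redefined more than once, each with the files that redefine it.

    redefinitions is an assumed list of (component name, redefining file).
    """
    by_component: Dict[str, List[str]] = defaultdict(list)
    for component, source_file in redefinitions:
        by_component[component].append(source_file)
    return {name: files for name, files in by_component.items() if len(files) > 1}
```

Each entry in the result corresponds to one error to report, while the listed files are the redefinitions that would still be shown in the interface.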
simpleType
<xs:simpleType> schema element can appear under the following parents: schema; redefine; element; attribute; list; union; and restriction. The following applies to <xs:simpleType> schema elements when they appear under schema or redefine. They are represented in the interface by the following items: an icon; name of the simple type; derivation section (for derived type); an icon to represent derivation method (restriction, list or union); name of the base type; and, if this type is redefined somewhere else in the schema, a glyph to indicate this fact.
targetNamespace
targetNamespace appears as an attribute of the <xs:schema> node. It is shown as a child of the SchemaSet node and is a parent to file nodes that define the same target namespace. It is represented in the interface by the following items:
an icon
name of the targetNamespace
If the targetNamespace attribute is absent, string “Empty Namespace”
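The grouping of file nodes under target namespace nodes can be sketched as below. This is an illustration only: the function name `group_files_by_namespace` and the in-memory input (file name mapped to file text) are assumptions, and an empty `targetNamespace` value is treated the same as an absent one.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict
from typing import Dict, List

EMPTY_NAMESPACE_LABEL = "Empty Namespace"


def group_files_by_namespace(files: Dict[str, str]) -> Dict[str, List[str]]:
    """Map each target namespace label to the files that declare it.

    files maps a file name to XSD text (an assumed representation).
    Files without a targetNamespace attribute are grouped under
    the "Empty Namespace" label.
    """
    groups: Dict[str, List[str]] = defaultdict(list)
    for name in sorted(files):
        root = ET.fromstring(files[name])  # root is the xs:schema element
        label = root.get("targetNamespace") or EMPTY_NAMESPACE_LABEL
        groups[label].append(name)
    return dict(groups)
```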
Referring now to
Selecting a node in the Schema Explorer will highlight the node and display its properties in a Property Window (not shown). A tick mark 530 can be displayed on the scroll bar to indicate the position of the currently selected node in the tree relative to the rest of the schema set. Activating a node (double-clicking a node or pressing “Enter” when a node is selected) will highlight the node and display its properties in the Property Window. Activating a Schema Set node displays the File & Namespace View. Activating an element, type, attribute, group, or attribute group can display its content model in the Content Model View, for example.
Referring to
When the sort by name (default) option is selected, global nodes can be sorted in the following order:
1. includes (in alphabetical order of schemaLocation attributes)
2. imports (in alphabetical order of namespaces)
3. redefines (in alphabetical order of schemaLocation attributes)
4. Other globals in alphabetical order
When the “Show Schema Files” option is enabled, users can have an option to sort global nodes in document order. When this option is enabled, global nodes can be displayed in the order they appear in the XSD files. When the “Show Schema Files” option is disabled, sorting in document order is also disabled.
Users will also have an option to group globals by type. When this option is enabled, global nodes can be sorted in the following order (note that within each group nodes are sorted alphabetically): 1 includes; 2 imports; 3 redefines; 4 attributes; 5 attribute groups; 6 complex types; 7 simple types; 8 elements; and 9 groups, for example. As can be appreciated, other arrangements are possible.
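The sort by name and group by type orderings described above can both be expressed as a sort key made of a rank and a label. The sketch below is one hypothetical way to do so. The node representation (a pair of kind and label, where the label is the schemaLocation for includes and redefines, the namespace for imports, and the name otherwise), the kind strings, and the case-insensitive comparison are all assumptions made for the example.

```python
from typing import List, Tuple

Node = Tuple[str, str]  # (kind, label), an assumed representation

# Rank tables: a lower rank is displayed first.
NAME_ORDER = {"include": 0, "import": 1, "redefine": 2}
TYPE_ORDER = {kind: rank for rank, kind in enumerate([
    "include", "import", "redefine", "attribute", "attributeGroup",
    "complexType", "simpleType", "element", "group"])}


def order_globals(nodes: List[Node], group_by_type: bool = False) -> List[Node]:
    """Order global nodes by name (default) or grouped by type."""
    table = TYPE_ORDER if group_by_type else NAME_ORDER
    fallback = len(table)  # kinds not in the table sort after all ranked kinds

    def key(node: Node) -> Tuple[int, str]:
        kind, label = node
        return (table.get(kind, fallback), label.lower())

    return sorted(nodes, key=key)
```

In the default mode every kind other than include, import, and redefine shares the same rank, so those nodes fall into a single alphabetical run, while the group by type mode gives each kind its own rank and sorts alphabetically within it.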
Turning to
A number of schema-specific queries are available for various selected nodes. These include:
SchemaSet
Namespaces
Files
Elements
Types (note: we only show global types)
Attributes
Attribute Groups
Named groups
Referring to
Users can also navigate search results in the following manner: Clicking on a tick mark to get to a specific search result; Using keyboard navigation—F3 to go to the next hit, Alt-F3 to go to the previous hit; and Clicking Next/Previous Search Result buttons. When the next/previous result is visible on the screen, navigating to it via one of the above methods can select that node. As can be appreciated, the interfaces and interface options shown in
In order to provide a context for the various aspects of the disclosed subject matter,
With reference to
The system bus 918 can be any of several types of bus structure(s) including the memory bus or memory controller, a peripheral bus or external bus, and/or a local bus using any variety of available bus architectures including, but not limited to, 11-bit bus, Industry Standard Architecture (ISA), Micro Channel Architecture (MCA), Extended ISA (EISA), Integrated Drive Electronics (IDE), VESA Local Bus (VLB), Peripheral Component Interconnect (PCI), Universal Serial Bus (USB), Accelerated Graphics Port (AGP), Personal Computer Memory Card International Association bus (PCMCIA), and Small Computer Systems Interface (SCSI).
The system memory 916 includes volatile memory 920 and nonvolatile memory 922. The basic input/output system (BIOS), containing the basic routines to transfer information between elements within the computer 912, such as during start-up, is stored in nonvolatile memory 922. By way of illustration, and not limitation, nonvolatile memory 922 can include read only memory (ROM), programmable ROM (PROM), erasable programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory 920 includes random access memory (RAM), which acts as external cache memory. By way of illustration and not limitation, RAM is available in many forms such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), and direct Rambus RAM (DRRAM).
Computer 912 also includes removable/non-removable, volatile/non-volatile computer storage media.
It is to be appreciated that
A user enters commands or information into the computer 912 through input device(s) 936. Input devices 936 include, but are not limited to, a pointing device such as a mouse, trackball, stylus, touch pad, keyboard, microphone, joystick, game pad, satellite dish, scanner, TV tuner card, digital camera, digital video camera, web camera, and the like. These and other input devices connect to the processing unit 914 through the system bus 918 via interface port(s) 938. Interface port(s) 938 include, for example, a serial port, a parallel port, a game port, and a universal serial bus (USB). Output device(s) 940 use some of the same type of ports as input device(s) 936. Thus, for example, a USB port may be used to provide input to computer 912 and to output information from computer 912 to an output device 940. Output adapter 942 is provided to illustrate that there are some output devices 940 like monitors, speakers, and printers, among other output devices 940 that require special adapters. The output adapters 942 include, by way of illustration and not limitation, video and sound cards that provide a means of connection between the output device 940 and the system bus 918. It should be noted that other devices and/or systems of devices provide both input and output capabilities such as remote computer(s) 944.
Computer 912 can operate in a networked environment using logical connections to one or more remote computers, such as remote computer(s) 944. The remote computer(s) 944 can be a personal computer, a server, a router, a network PC, a workstation, a microprocessor based appliance, a peer device or other common network node and the like, and typically includes many or all of the elements described relative to computer 912. For purposes of brevity, only a memory storage device 946 is illustrated with remote computer(s) 944. Remote computer(s) 944 is logically connected to computer 912 through a network interface 948 and then physically connected via communication connection 950. Network interface 948 encompasses communication networks such as local-area networks (LAN) and wide-area networks (WAN). LAN technologies include Fiber Distributed Data Interface (FDDI), Copper Distributed Data Interface (CDDI), Ethernet/IEEE 802.3, Token Ring/IEEE 802.5 and the like. WAN technologies include, but are not limited to, point-to-point links, circuit switching networks like Integrated Services Digital Networks (ISDN) and variations thereon, packet switching networks, and Digital Subscriber Lines (DSL).
Communication connection(s) 950 refers to the hardware/software employed to connect the network interface 948 to the bus 918. While communication connection 950 is shown for illustrative clarity inside computer 912, it can also be external to computer 912. The hardware/software necessary for connection to the network interface 948 includes, for exemplary purposes only, internal and external technologies such as, modems including regular telephone grade modems, cable modems and DSL modems, ISDN adapters, and Ethernet cards.
What has been described above includes various exemplary aspects. It is, of course, not possible to describe every conceivable combination of components or methodologies for purposes of describing these aspects, but one of ordinary skill in the art may recognize that many further combinations and permutations are possible. Accordingly, the aspects described herein are intended to embrace all such alterations, modifications and variations that fall within the spirit and scope of the appended claims. Furthermore, to the extent that the term “includes” is used in either the detailed description or the claims, such term is intended to be inclusive in a manner similar to the term “comprising” as “comprising” is interpreted when employed as a transitional word in a claim.
This application claims the benefit of U.S. Provisional Patent Application Ser. No. 60/868,485 filed on Dec. 4, 2006, entitled “BUILDING, VIEWING, AND MANIPULATING SCHEMA SETS” the entirety of which is incorporated herein by reference.
Number | Date | Country | |
---|---|---|---|
20080133553 A1 | Jun 2008 | US |
Number | Date | Country | |
---|---|---|---|
60868485 | Dec 2006 | US |