Claims
- 1. A method for markup language schema validation, comprising the steps of:
(a) loading a markup language document into a runtime validation engine, wherein the runtime validation engine comprises a markup language schema validation parser; (b) loading an annotated automaton encoding for a markup language schema definition into the markup language schema validation parser; and (c) validating the markup language document against the markup language schema definition by the markup language schema validation parser utilizing the annotated automaton encoding.
- 2. The method of claim 1, wherein the markup language comprises an Extensible Markup Language (XML).
- 3. The method of claim 1, wherein the annotated automaton encoding comprises at least one element node, wherein one or more attributes can be associated with the element node, and wherein one or more data type constraints can be associated with the element node or the attribute.
- 4. The method of claim 1, wherein the annotated automaton encoding comprises at least one element annotation record for the at least one element node, wherein the at least one element annotation record comprises one or more of a group consisting of:
a scanner ID for an element content and arguments; a start tag token; an end tag token; an attribute list; and a candidate sub-element map, capable of comprising a pointer to a sub-element name.
- 5. The method of claim 1, further comprising, prior to the loading step (a):
(a1) receiving an Extensible Markup Language (XML) schema definition; (a2) generating an element structure hierarchy for the XML schema definition and representing the hierarchy in an annotated tree; (a3) encoding the annotated tree and generating the annotated automaton encoding; (a4) serializing the annotated automaton encoding; and (a5) storing the serialized annotated automaton encoding.
- 6. The method of claim 1, wherein the validating step (c) comprises:
(c1) obtaining at least one token for an Extensible Markup Language (XML) document; (c2) performing a low level validation of the at least one token by a generic XML parser; and (c3) performing a high level validation of the at least one token by an XML schema validation parser if the token is an element token or an attribute token.
- 7. The method of claim 6, wherein the validating step (c) further comprises:
(c4) outputting a validation pass if the validations by the generic XML parser and the XML schema validation parser are successful; and (c5) outputting a validation fail if the validation by the generic XML parser or the XML schema validation parser is not successful.
- 8. The method of claim 6, wherein the element token comprises one or more of a group consisting of:
a start tag name; and an end tag name.
- 9. The method of claim 6, wherein the attribute token comprises an attribute name.
- 10. The method of claim 6, wherein if the element token is a start tag name, then the performing step (c3) comprises:
(c3i) finding a current annotation record based upon a previous annotation record and the start tag name; (c3ii) pushing the current annotation record onto a stack; (c3iii) obtaining a start tag token for the start tag name from the current annotation record; (c3iv) inputting the start tag token into an element validation module of the XML schema validation parser; and (c3v) determining if a validation of the start tag token is successful.
- 11. The method of claim 6, wherein if the attribute token is an attribute name, then the performing step (c3) comprises:
(c3i) passing a current annotation record and the attribute name to an attribute validation module of the XML schema validation parser; (c3ii) searching an attribute list of the current annotation record for the attribute name, wherein the validation of the XML document fails if the attribute name is not found in the current annotation record; (c3iii) obtaining the attribute token if the attribute name is found in the current annotation record; and (c3iv) determining if the validation of the attribute token is successful.
- 12. The method of claim 6, wherein if the element token is an end tag name, then the performing step (c3) comprises:
(c3i) removing a current annotation record from a stack; (c3ii) obtaining an end tag token from the current annotation record; (c3iii) inputting the end tag token into an element validation module of the XML schema validation parser; (c3iv) determining if the validation of the end tag token is successful; and (c3v) determining if all attributes of the current annotation record have been validated or if the attribute list of the current annotation record is empty, wherein the validation of the end tag token is not successful if fewer than all of the attributes of the current annotation record have been validated and the attribute list of the current annotation record is not empty.
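The stack-based handling of start tag names, attribute names, and end tag names recited in claims 10 through 12 can be illustrated with a minimal sketch. It is not the claimed implementation: the class and field names (`AnnotationRecord`, `SchemaValidator`, `children`, `tokens`) are hypothetical, the token strings are placeholders, and the element and attribute validation modules are stood in for by a list that merely records the tokens handed to them.

```python
from dataclasses import dataclass, field


@dataclass
class AnnotationRecord:
    """Hypothetical element annotation record; all field names are assumptions."""
    name: str
    scanner_id: int                                   # scanner for the element content
    start_tag_token: str
    end_tag_token: str
    attributes: dict = field(default_factory=dict)    # attribute name -> attribute token
    children: dict = field(default_factory=dict)      # candidate sub-element map


class SchemaValidator:
    def __init__(self, root):
        # Synthetic record above the document root, so that the first start
        # tag is looked up the same way as every later one.
        self.top = AnnotationRecord("#document", 0, "", "", children={root.name: root})
        self.stack = []    # entries: (record, attribute names validated so far)
        self.tokens = []   # stand-in for the element/attribute validation modules

    def start_tag(self, name):
        previous = self.stack[-1][0] if self.stack else self.top
        current = previous.children.get(name)          # find the current record
        if current is None:
            return False
        self.stack.append((current, set()))            # push it onto the stack
        self.tokens.append(current.start_tag_token)    # hand over the start tag token
        return True

    def attribute(self, name):
        if not self.stack:
            return False
        current, seen = self.stack[-1]
        token = current.attributes.get(name)           # search the attribute list
        if token is None:
            return False                               # name not found: validation fails
        seen.add(name)
        self.tokens.append(token)
        return True

    def end_tag(self, name):
        if not self.stack:
            return False
        current, seen = self.stack.pop()               # remove the current record
        if current.name != name:
            return False
        self.tokens.append(current.end_tag_token)
        # Succeeds if the attribute list is empty or every attribute was validated.
        return not current.attributes or seen == set(current.attributes)


# Usage with a made-up two-element schema.
item = AnnotationRecord("item", 1, "S_ITEM", "E_ITEM", attributes={"id": "A_ID"})
order = AnnotationRecord("order", 0, "S_ORDER", "E_ORDER", children={"item": item})
validator = SchemaValidator(order)
ok = (validator.start_tag("order") and validator.start_tag("item")
      and validator.attribute("id") and validator.end_tag("item")
      and validator.end_tag("order"))
```

In this sketch every attribute in a record's attribute list is treated as required, which follows the wording of step (c3v); a schema with optional attributes would need an additional flag per attribute.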
- 13. A system, comprising:
a markup language schema compilation for generating at least one annotated automaton encoding for at least one markup language schema definition; and a runtime validation engine comprising a runtime schema validation parser, wherein the runtime schema validation parser receives a markup language document and the at least one annotated automaton encoding as input, wherein the runtime schema validation parser validates the markup language document against the at least one markup language schema definition utilizing the at least one annotated automaton encoding.
- 14. The system of claim 13, wherein the markup language comprises an Extensible Markup Language (XML).
- 15. The system of claim 13, wherein the annotated automaton encoding comprises at least one element node, wherein one or more attributes can be associated with the element node, and wherein one or more data type constraints can be associated with the element node or the attribute.
- 16. The system of claim 13, wherein the annotated automaton encoding comprises at least one element annotation record for the at least one element node, wherein the at least one element annotation record comprises one or more of a group consisting of:
a scanner ID for an element content and arguments; a start tag token; an end tag token; an attribute list; and a candidate sub-element map, capable of comprising a pointer to a sub-element name.
- 17. The system of claim 13, wherein the markup language schema compilation comprises:
an Extensible Markup Language (XML) schema compiler front-end; and an XML schema compiler back-end.
- 18. The system of claim 17, wherein the XML schema compiler front-end receives the at least one XML schema definition, generates an element structure hierarchy for the XML schema definition, and represents the hierarchy in an annotated tree.
- 19. The system of claim 18, wherein the XML schema compiler back-end encodes the annotated tree, generates the at least one annotated automaton encoding from the encoded annotated tree, and serializes the at least one annotated automaton encoding.
- 20. The system of claim 13, further comprising:
a storage medium for storing the at least one annotated automaton encoding.
- 21. The system of claim 13, wherein the runtime validation engine further comprises:
a generic Extensible Markup Language (XML) parser, wherein the generic XML parser performs a low level validation of an XML document, wherein the runtime schema validation parser performs a high level validation of the XML document.
- 22. The system of claim 13, wherein the runtime schema validation parser comprises:
an Extensible Markup Language (XML) schema loading module for loading the at least one annotated automaton encoding; and an XML schema validation module, comprising:
an element validation module for validating element tokens, and an attribute validation module for validating attribute tokens.
- 23. The system of claim 22, wherein the element token comprises one or more of a group consisting of:
a start tag name; and an end tag name.
- 24. The system of claim 22, wherein the attribute token comprises an attribute name.
- 25. The system of claim 13, wherein the runtime validation engine further comprises an Extensible Markup Language (XML) scanner pool, wherein the XML scanner pool comprises a generic scanner and at least one type specific scanner.
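The compilation path recited in claims 17 through 20 (front-end builds an annotated tree, back-end encodes and serializes it, a storage medium holds the result) might be sketched as follows. Everything here is an assumption made for illustration: the schema definition is a plain nested dictionary rather than a real XML schema, the function names are hypothetical, the token strings are placeholders, and JSON is used only as a convenient serialization format.

```python
import json
import os
import tempfile


def build_annotated_tree(schema):
    """Front-end: turn a schema definition into an annotated tree of element nodes."""
    return {
        "name": schema["name"],
        "type": schema.get("type", "string"),          # data type constraint
        "attributes": sorted(schema.get("attributes", [])),
        "children": [build_annotated_tree(c) for c in schema.get("children", [])],
    }


def encode_automaton(tree):
    """Back-end: flatten the tree into a table of records linked by index."""
    records = []

    def visit(node):
        index = len(records)
        record = {
            "name": node["name"],
            "scanner": node["type"],
            "start_token": f"S:{node['name']}",
            "end_token": f"E:{node['name']}",
            "attributes": node["attributes"],
            "children": {},                            # sub-element name -> record index
        }
        records.append(record)
        for child in node["children"]:
            record["children"][child["name"]] = visit(child)
        return index

    visit(tree)
    return records


def compile_schema(schema, path):
    """Encode the schema, serialize the encoding, and store it at `path`."""
    encoding = encode_automaton(build_annotated_tree(schema))
    with open(path, "w", encoding="utf-8") as out:
        json.dump(encoding, out)
    return encoding


def load_encoding(path):
    """Runtime side: load a previously stored encoding."""
    with open(path, encoding="utf-8") as src:
        return json.load(src)


# Usage with a made-up schema definition.
schema = {"name": "order",
          "children": [{"name": "item", "type": "integer", "attributes": ["id"]}]}
path = os.path.join(tempfile.mkdtemp(), "order.automaton.json")
encoding = compile_schema(schema, path)
```

Because the encoding is produced once and stored, the runtime validation engine in this sketch only needs `load_encoding` and never touches the schema definition itself.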
- 26. A computer readable medium with program instructions for a markup language schema validation, comprising the instructions for:
(a) loading a markup language document into a runtime validation engine, wherein the runtime validation engine comprises a markup language schema validation parser; (b) loading an annotated automaton encoding for a markup language schema definition into the markup language schema validation parser; and (c) validating the markup language document against the markup language schema definition by the markup language schema validation parser utilizing the annotated automaton encoding.
- 27. The medium of claim 26, wherein the markup language comprises an Extensible Markup Language (XML).
- 28. The medium of claim 26, wherein the annotated automaton encoding comprises at least one element node, wherein one or more attributes can be associated with the element node, and wherein one or more data type constraints can be associated with the element node or the attribute.
- 29. The medium of claim 26, wherein the annotated automaton encoding comprises at least one element annotation record for the at least one element node, wherein the at least one element annotation record comprises one or more of a group consisting of:
a scanner ID for an element content and arguments; a start tag token; an end tag token; an attribute list; and a candidate sub-element map, capable of comprising a pointer to a sub-element name.
- 30. The medium of claim 26, further comprising, prior to the loading instruction (a), instructions for:
(a1) receiving an Extensible Markup Language (XML) schema definition; (a2) generating an element structure hierarchy for the XML schema definition and representing the hierarchy in an annotated tree; (a3) encoding the annotated tree and generating the annotated automaton encoding; (a4) serializing the annotated automaton encoding; and (a5) storing the serialized annotated automaton encoding.
- 31. The medium of claim 26, wherein the validating instruction (c) comprises instructions for:
(c1) obtaining at least one token for an Extensible Markup Language (XML) document; (c2) performing a low level validation of the at least one token by a generic XML parser; and (c3) performing a high level validation of the at least one token by an XML schema validation parser if the token is an element token or an attribute token.
- 32. The medium of claim 31, wherein the validating instruction (c) further comprises instructions for:
(c4) outputting a validation pass if the validations by the generic XML parser and the XML schema validation parser are successful; and (c5) outputting a validation fail if the validation by the generic XML parser or the XML schema validation parser is not successful.
- 33. The medium of claim 31, wherein the element token comprises one or more of a group consisting of:
a start tag name; and an end tag name.
- 34. The medium of claim 31, wherein the attribute token comprises an attribute name.
- 35. The medium of claim 31, wherein if the element token is a start tag name, then the performing instruction (c3) comprises instructions for:
(c3i) finding a current annotation record based upon a previous annotation record and the start tag name; (c3ii) pushing the current annotation record onto a stack; (c3iii) obtaining a start tag token for the start tag name from the current annotation record; (c3iv) inputting the start tag token into an element validation module of the XML schema validation parser; and (c3v) determining if a validation of the start tag token is successful.
- 36. The medium of claim 31, wherein if the attribute token is an attribute name, then the performing instruction (c3) comprises instructions for:
(c3i) passing a current annotation record and the attribute name to an attribute validation module of the XML schema validation parser; (c3ii) searching an attribute list of the current annotation record for the attribute name, wherein the validation of the XML document fails if the attribute name is not found in the current annotation record; (c3iii) obtaining the attribute token if the attribute name is found in the current annotation record; and (c3iv) determining if the validation of the attribute token is successful.
- 37. The medium of claim 31, wherein if the element token is an end tag name, then the performing instruction (c3) comprises instructions for:
(c3i) removing a current annotation record from a stack; (c3ii) obtaining an end tag token from the current annotation record; (c3iii) inputting the end tag token into an element validation module of the XML schema validation parser; (c3iv) determining if the validation of the end tag token is successful; and (c3v) determining if all attributes of the current annotation record have been validated or if the attribute list of the current annotation record is empty, wherein the validation of the end tag token is not successful if fewer than all of the attributes of the current annotation record have been validated and the attribute list of the current annotation record is not empty.
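The split between low level validation by a generic XML parser and high level validation by a schema validation parser, with a single pass or fail output, can be illustrated with Python's standard `xml.parsers.expat` module playing the role of the generic parser. This is a sketch under stated assumptions, not the claimed engine: the `schema` dictionary is a hypothetical stand-in for the annotated automaton encoding, and `validate` and `SchemaError` are names invented for the example.

```python
import xml.parsers.expat


class SchemaError(Exception):
    """Raised when the high level (schema) check rejects a token."""


def validate(document, schema, root):
    """Return "pass" or "fail" for an XML document given as a string.

    `schema` maps an element name to its permitted child names and attribute
    names; it is a simplified stand-in for an annotated automaton encoding.
    """
    stack = []

    def on_start(name, attrs):
        # High level check of an element token (start tag name).
        permitted = schema[stack[-1]]["children"] if stack else {root}
        rules = schema.get(name)
        if name not in permitted or rules is None:
            raise SchemaError(f"unexpected element {name!r}")
        # High level check of each attribute token (attribute name).
        for attr in attrs:
            if attr not in rules["attributes"]:
                raise SchemaError(f"unexpected attribute {attr!r} on {name!r}")
        stack.append(name)

    def on_end(name):
        stack.pop()

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = on_start
    parser.EndElementHandler = on_end
    try:
        # expat performs the low level (well-formedness) checks and calls the
        # handlers above for the high level checks.
        parser.Parse(document, True)
    except (xml.parsers.expat.ExpatError, SchemaError):
        return "fail"
    return "pass"


# Usage with a made-up schema.
schema = {
    "order": {"children": {"item"}, "attributes": set()},
    "item": {"children": set(), "attributes": {"id"}},
}
result = validate('<order><item id="1"/></order>', schema, "order")
```

A document that is not well formed fails in the generic parser before any schema check runs, while a well-formed document with an undeclared element or attribute fails in the handlers; either failure yields the same "fail" output.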
CROSS REFERENCE TO RELATED APPLICATION
[0001] This application claims the benefit under 35 U.S.C. 119(e) of provisional patent application Ser. No. 60/418,673, filed on Oct. 15, 2002.
Provisional Applications (1)
| Number | Date | Country |
| --- | --- | --- |
| 60418673 | Oct 2002 | US |