Claims
- 1. A method for compressing character-based markup language files, said markup language files including a text having a plurality of tags, and said tags including a plurality of attributes and arguments having standard and non-standard characters, the method comprising:
converting said tags and said attributes into a single case format;
placing said attributes in an order within said tags, said order enabling larger strings of common text to be found;
determining and using a shortest text string representation of a plurality of text string representations for any non-standard characters in the tags; and
eliminating a plurality of spaces from within said tags.
- 2. The method of claim 1, further defined by using a compression algorithm to compress a web document that includes the markup language files.
- 3. The method of claim 2, wherein the compression algorithm is GZIP.
- 4. The method of claim 1, wherein the plurality of spaces includes extra white spaces.
- 5. The method of claim 1, wherein the plurality of spaces includes end-of-line characters.
- 6. The method of claim 1, wherein the step of placing said attributes in an order includes placing the attributes in an alphabetical order.
- 7. The method of claim 1, wherein the markup language is HTML language.
- 8. The method of claim 1, wherein the markup language is XML language.
- 9. The method of claim 8, further comprising:
rewriting the tags to include fewer characters; and
changing the tags to have all of the tags begin with a same character.
- 10. The method of claim 1, wherein the markup language is SGML language.
- 11. The method of claim 1, wherein the single case format consists of uppercase text.
- 12. The method of claim 1, wherein the single case format consists of lowercase text.
- 13. The method of claim 1, wherein the plurality of text string representations of the non-standard characters includes a character name representation and a character number representation.
- 14. The method of claim 13, wherein the character number representation is chosen when the character name representation and the character number representation have a same length.
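The steps recited in claims 1 to 6 and 11 to 14 can be illustrated with a short Python sketch. It is only one possible reading of the claims, not the applicant's implementation. The function names (`shortest_reference`, `normalize_tag`, `normalize_markup`, `compress`) and the regular expressions are hypothetical and chosen for illustration. The sketch assumes lowercase as the single case format, alphabetical attribute order, decimal numbers as the character number representation, and well-formed tags whose attribute values contain no `<` or `>`. It does not skip comments or script content, and it leaves attribute values otherwise untouched.

```python
import gzip
import html
import re
from html.entities import codepoint2name

# Simplified patterns (assumptions): they do not handle '<' or '>' inside
# attribute values, and they do not skip comments or script blocks.
TAG_RE = re.compile(r"<(/?)([A-Za-z][\w:-]*)([^<>]*?)(/?)>")
ATTR_RE = re.compile(r"""([^\s=/]+)(?:\s*=\s*("[^"]*"|'[^']*'|[^\s"'>]+))?""")
ENTITY_RE = re.compile(r"&#?\w+;")


def shortest_reference(match):
    """Return the shorter of the name and number forms of a character reference."""
    text = match.group(0)
    ch = html.unescape(text)
    if len(ch) != 1:
        return text  # unknown or multi-character reference: leave unchanged
    number = f"&#{ord(ch)};"
    name = codepoint2name.get(ord(ch))
    if name is not None and len(name) + 2 < len(number):
        return f"&{name};"
    return number  # the number form also wins ties, as in claim 14


def normalize_tag(match):
    """Lowercase names, sort attributes, shorten references, drop extra spaces."""
    closing, name, body, self_closing = match.groups()
    attrs = []
    for attr_name, value in ATTR_RE.findall(body):
        if value:
            value = ENTITY_RE.sub(shortest_reference, value)
        attrs.append((attr_name.lower(), value))
    attrs.sort(key=lambda pair: pair[0])  # alphabetical order, as in claim 6
    parts = [name.lower()]
    for attr_name, value in attrs:
        parts.append(f"{attr_name}={value}" if value else attr_name)
    return f"<{closing}{' '.join(parts)}{self_closing}>"


def normalize_markup(text):
    """Apply the tag normalization to every tag in the document."""
    return TAG_RE.sub(normalize_tag, text)


def compress(text):
    """Normalize, then compress with GZIP, as in claims 2 and 3."""
    return gzip.compress(normalize_markup(text).encode("utf-8"))


if __name__ == "__main__":
    sample = '<A  HREF = "x.html"\n   CLASS="nav" >Caf&eacute;</A>'
    print(normalize_markup(sample))
```

Because every tag is written with the same case, the same attribute order, and the same spacing, equivalent tags become byte-identical strings, which is what allows a dictionary-based compressor such as GZIP to find longer repeated matches.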
CROSS REFERENCE TO RELATED APPLICATION
[0001] This application is a continuation-in-part of U.S. patent application Ser. No. 09/777,401, filed Feb. 6, 2001.
Continuation in Parts (1)
| | Number | Date | Country |
|---|---|---|---|
| Parent | 09777401 | Feb 2001 | US |
| Child | 09800846 | Mar 2001 | US |