Claims
- 1. A method for aligning similarity of two biological sequences, the biological sequences consisting of bases, the method comprising the steps of:
(a) selecting a seed pair of the two biological sequences;
(b) respectively extending two fragments adjacent to the seed pair by a predetermined number of successive bases;
(c) determining if the extended fragments satisfy an extension condition, if yes, going to step (d), if no, going to step (e);
(d) extending respectively two fragments adjacent to the extended fragments by the predetermined number of successive bases and returning to step (c);
(e) respectively selecting two identical sub-fragments from the extended fragments not satisfying the extension condition;
(f) determining either one of the sub-fragments closer to the seed pair;
(g) matching the extended fragments by inserting at least one gap in front of the one of the sub-fragments determined in step (f);
(h) determining if the matched fragments satisfy the extension condition, if yes, going to step (i), if no, going to step (j);
(i) respectively extending two fragments adjacent to the matched fragments by the predetermined number of successive bases and returning to step (c); and
(j) stopping extension and obtaining resulted fragments.
- 2. The method of claim 1, wherein the predetermined number of successive bases is from 4 to 400.
- 3. The method of claim 1, wherein the extension condition comprises having 40% to 100% similarity of fragments.
- 4. The method of claim 1, wherein a base number of the two identical sub-fragments is at least 2.
- 5. The method of claim 4, wherein a base number of the two identical sub-fragments is from 3 to 400.
- 6. The method of claim 5, wherein a base number of the two identical sub-fragments is from 3 to 50.
- 7. The method of claim 1, wherein step (j) further comprises:
(k) intercepting preceding substantially identical bases of the matched fragments; and (l) combining all of the extended fragments satisfying the extension condition and the intercepted bases into the resulted fragments.
- 8. The method of claim 1, wherein step (j) further comprises:
(m) waiving the matched fragments; and (n) combining all of the extended fragments satisfying the extension condition into the resulted fragments.
- 9. The method of claim 1, wherein step (j) further comprises:
(o) remaining the matched fragments; and (p) combining all of the extended fragments satisfying the extension condition and the matched fragments into the resulted fragments.
- 10. A computer program product for aligning similarity of two biological sequences, the biological sequences consisting of bases, the computer program product comprising:
a computer readable storage medium having code segments embodied therein, the code segments comprising:
a first code segment configured to select a seed pair of the two biological sequences;
a second code segment configured to respectively extend two fragments by a predetermined number of successive bases;
a third code segment configured to determine whether the extended fragments satisfy an extension condition;
a fourth code segment configured to respectively select two identical sub-fragments from the extended fragments;
a fifth code segment configured to determine either one of the sub-fragments closer to the seed pair;
a sixth code segment configured to match the extended fragments by inserting at least one gap in front of the one of the sub-fragments; and
a seventh code segment configured to obtain resulted fragments.
- 11. The computer program product of claim 10, wherein the predetermined number of successive bases is from 4 to 400.
- 12. The computer program product of claim 10, wherein the extension condition comprises having 40% to 100% similarity of fragments.
- 13. The computer program product of claim 10, wherein a base number of the two identical sub-fragments is at least 2.
- 14. The computer program product of claim 13, wherein a base number of the two identical sub-fragments is from 3 to 400.
- 15. The computer program product of claim 14, wherein a base number of the two identical sub-fragments is from 3 to 50.
- 16. The computer program product of claim 10, wherein the seventh code segment further comprises:
an eighth code segment configured to intercept preceding substantially identical bases of the matched fragments; and a ninth code segment configured to combine all of the extended fragments satisfying the extension condition and the intercepted bases into the resulted fragments.
- 17. The computer program product of claim 10, wherein the seventh code segment further comprises:
a tenth code segment configured to waive the matched fragments; and an eleventh code segment configured to combine all of the extended fragments satisfying the extension condition into the resulted fragments.
- 18. The computer program product of claim 10, wherein the seventh code segment further comprises:
a twelfth code segment configured to remain the matched fragments; and a thirteenth code segment configured to combine all of the extended fragments satisfying the extension condition and the matched fragments into the resulted fragments.
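The extension loop recited in steps (b) through (j) of claim 1 can be illustrated with a short Python sketch. This is only one possible reading under stated assumptions, not the claimed implementation: the function names (`similarity`, `find_identical_subfragments`, `insert_gap`, `extend_right`) are hypothetical, similarity is assumed to be the fraction of aligned columns holding identical bases, extension is shown to the right of the seed pair only, the seed pair is assumed to be already selected, and a matched fragment that fails the extension condition is simply discarded in the manner of claim 8.

```python
def similarity(x, y):
    """Fraction of aligned columns with identical bases; a gap never matches."""
    if not x:
        return 0.0
    same = sum(1 for p, q in zip(x, y) if p == q and p != "-")
    return same / len(x)


def find_identical_subfragments(fa, fb, sub_len):
    """Step (e): offsets (oa, ob) of the first identical sub-fragment pair, or None."""
    for oa in range(len(fa) - sub_len + 1):
        ob = fb.find(fa[oa:oa + sub_len])
        if ob != -1:
            return oa, ob
    return None


def insert_gap(fa, fb, oa, ob):
    """Steps (f)-(g): put gaps in front of the sub-fragment closer to the seed."""
    gaps = abs(oa - ob)
    if oa < ob:
        fa = fa[:oa] + "-" * gaps + fa[oa:]
    else:
        fb = fb[:ob] + "-" * gaps + fb[ob:]
    width = min(len(fa), len(fb))  # keep both matched fragments the same width
    return fa[:width], fb[:width]


def extend_right(a, b, pa, pb, step=8, threshold=0.75, sub_len=3):
    """Extend rightwards from pa and pb, the positions just past the seed pair."""
    out_a, out_b = [], []
    while pa + step <= len(a) and pb + step <= len(b):
        fa, fb = a[pa:pa + step], b[pb:pb + step]      # steps (b), (d), (i)
        if similarity(fa, fb) < threshold:             # step (c) fails
            hit = find_identical_subfragments(fa, fb, sub_len)
            if hit is None or hit[0] == hit[1]:
                break                                  # no gap can help: step (j)
            fa, fb = insert_gap(fa, fb, *hit)
            if similarity(fa, fb) < threshold:         # step (h) fails
                break                                  # step (j), fragment discarded
        out_a.append(fa)
        out_b.append(fb)
        pa += len(fa) - fa.count("-")                  # advance by bases consumed
        pb += len(fb) - fb.count("-")
    return "".join(out_a), "".join(out_b)


# Hypothetical example: the second sequence carries one extra base after the seed.
seq_a = "GATTACA" + "ACGTTGCA" + "CCGGAATT"
seq_b = "GATTACA" + "ACGTTGCA" + "CTCGGAATT"
aligned_a, aligned_b = extend_right(seq_a, seq_b, 7, 7)
print(aligned_a)
print(aligned_b)
```

The defaults chosen here (8 successive bases, a 75% similarity threshold, identical sub-fragments of 3 bases) fall inside the ranges recited in claims 2, 3 and 5, but they are illustrative values only. Swapping the second `break` for logic that keeps the failing matched fragment, or only its leading identical bases, would correspond to the alternatives of claims 9 and 7 respectively.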
Parent Case Info
[0001] This application is a continuation-in-part of, and claims priority to, U.S. patent application Ser. No. 09/741,078, filed on Dec. 21, 2000.
Continuation in Parts (1)
| | Number | Date | Country |
| Parent | 09741078 | Dec 2000 | US |
| Child | 10609657 | Jul 2003 | US |