Claims
- 1. Software for controlling message traffic and migration of a process, the process including an execution state, a memory state and a communication state; from a first computer to a second computer in a distributed computing virtual machine, the virtual machine including a scheduler, peer processes in addition to the process intended for migration, and communication links between processes, the software comprising:
a) communications protocols and process migration protocols allowing point-to-point communication between processes; and
b) the software being an independent layer over standard communication protocols and operating systems of the virtual machine.
- 2. The software according to claim 1 wherein:
a) all processes are allotted a received-message-list memory location in which messages for their processes can be stored;
b) when a migration request is granted for the process in the first computer to migrate to the second computer the process in the first computer becomes a migrating process, the migrating process then sends out marker messages to each connected peer process with which it is in communication indicating the migrating process has sent its last message;
c) the connected peer processes acknowledge the migrating process marker message with their own marker message, whereafter the communication link between the acknowledging connected peer processes and the migrating process is closed;
d) when all connected peer process communication links with the migrating process are closed, migration operations begin to transfer the migrating process from the first computer to the second computer;
e) migration operations include the virtual machine verifying that the second computer has an initialized process as an executable copy of the migrating process, whereupon the migrating process sends at least its received-message-list as the communication state of the process to the initialized process, and the execution state and memory state of the migrating process are gathered from the migrating process and sent to the second computer for use by the initialized process;
f) peer process connection requests for a communications link to the migrating process are refused during migration operations, a peer process upon receiving a refusal to its connection request will request a new location of the initialized process at the second computer, whereby all message traffic during migration is routed to the initialized process in the second computer and stored in order of receipt in the received-message-list of the initialized process;
g) all process execution, memory, and communication states are restored in the second computer to create a migrated process; and
h) migrated process location information is updated in each peer process which has been informed of the migrated process location.
- 3. The software according to claim 2 wherein the migrating process is identified by a non-negative integer rank number at the application level.
- 4. The software according to claim 2 wherein the migrating process is identified by a rank number and computer location at the virtual machine level.
- 5. The software according to claim 2 wherein the migration operations include the virtual machine loading an executable copy of the process programming on the second computer.
- 6. The software according to claim 1 wherein the communication links include a connection-oriented communication protocol that delivers messages in FIFO fashion between processes.
- 7. The software of claim 1, further comprising:
a) the communication protocols having algorithms including:
i) a Connect algorithm for managing connection requests, denials and rerouting requests between sender and receiver computers,
ii) a Send algorithm for allowing a sender computer to send messages to a receiver computer by verifying that a sender computer connection is established with a receiver computer process before messages are exchanged, and
iii) a Receive algorithm for managing transfer of messages between peer processes whether said processes have migrated or not; managing communication link maintenance or cancellation, and storing of messages in the correct order in a received-message-list of a particular process; and
b) the process migration protocol having algorithms including:
i) a Migration algorithm for managing acceptance of communications link between processes during migration operations, and
ii) an Initialization algorithm for restoring the migrating process as necessary to restart the migrating process at the second computer.
- 8. The software of claim 7 wherein the Send algorithm is coded as:
send (m, dest)
    if (dest ∉ Connected) then
        cc[dest] = connect(dest);
    end if
    send m along the cc[dest] communication channel.
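The Send steps above can be illustrated with a minimal Python sketch. It is not the claimed implementation: the `Sender` class, the injected `connect` callable, and the use of a Python list as a stand-in for a FIFO channel are assumptions made for this example. Only `cc`, the per-destination channel table, is taken from the claim.

```python
from typing import Callable, Dict, List


class Sender:
    """Toy model of the Send step: connect on first use, then send."""

    def __init__(self, connect: Callable[[int], List[str]]) -> None:
        self.connect = connect               # hypothetical connect(dest) -> channel
        self.cc: Dict[int, List[str]] = {}   # cc[dest]: open channel per destination rank

    def send(self, m: str, dest: int) -> None:
        if dest not in self.cc:              # dest is not in the Connected set
            self.cc[dest] = self.connect(dest)
        self.cc[dest].append(m)              # a list stands in for a FIFO channel
```

In this sketch the keys of `cc` double as the Connected set, so a second `send` to the same destination reuses the existing channel instead of connecting again.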
- 9. The software of claim 7 wherein the Connect algorithm is coded as:
connect (dest)
    while (dest ∉ Connected) do
        send conn_req to pl[dest];
        if (receive conn_ack from pl[dest]) then
            cid := make_connection_with(pl[dest]);
            Connected := {dest} ∪ Connected;
        else if (receive conn_req from any process p) then
            grant_connection_to(p);
        else if (receive conn_nack from pl[dest]) then
            consult scheduler for exe status and new_vmid of P_dest;
            if (status = migrate) then
                pl[dest] := new_vmid;
            else
                report “error: destination terminated”;
                return error;
            end if
        end if
    end while
    return cid.
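The retry-after-refusal behaviour of the Connect steps above can be sketched in Python as follows. This is a simplified illustration under stated assumptions: the `Scheduler` stub, the `accepting` map (host name to the set of ranks currently accepting connections there), the string channel id, and the `max_tries` bound are all hypothetical. The branch that grants an incoming conn_req while waiting is omitted.

```python
from typing import Dict, Set, Tuple


class Scheduler:
    """Hypothetical scheduler stub: rank -> (execution status, current vmid)."""

    def __init__(self, table: Dict[int, Tuple[str, str]]) -> None:
        self.table = table

    def lookup(self, rank: int) -> Tuple[str, str]:
        return self.table[rank]


def connect(dest: int, pl: Dict[int, str], accepting: Dict[str, Set[int]],
            sched: Scheduler, max_tries: int = 3) -> str:
    """Return a channel id for dest, following it to a new host after a refusal."""
    for _ in range(max_tries):
        vmid = pl[dest]                        # last known location of dest
        if dest in accepting.get(vmid, set()):
            return f"{vmid}:{dest}"            # conn_ack: connection made
        status, new_vmid = sched.lookup(dest)  # conn_nack: ask the scheduler
        if status != "migrate":
            raise ConnectionError("destination terminated")
        pl[dest] = new_vmid                    # update the process-location table
    raise ConnectionError("no connection after max_tries attempts")
```

The loop in the claim is unbounded; the bound here only keeps the toy example from spinning forever if the scheduler keeps reporting a stale location.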
- 10. The software of claim 7 wherein the Receive algorithm is coded as:
recv (src, m, tag)
    while (m is not found) do
        if (m is found in received_message_list) then
            remove m from the list and return it to the caller function;
        end if
        get a new data or control message, n;
        if (n is data message) then
            append n to received_message_list;
        else (handle control messages)
            if n is con_req then
                grant_connection_to(sender of n);
            else if n is peer_migrating then
                close down the connection with the sender of peer_migrating;
            end if
        end if
    end while.
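A minimal Python sketch of the Receive steps above, assuming messages are tuples of `(kind, src, tag, payload)` and that pending traffic is available in an in-memory queue. The `Receiver` class, the tuple layout, and the `LookupError` raised when the queue runs dry are assumptions for illustration; a real receiver would block waiting for the next message.

```python
from collections import deque
from typing import Deque, Iterable, List, Set, Tuple

Wire = Tuple[str, int, int, str]   # (kind, src, tag, payload), hypothetical layout


class Receiver:
    def __init__(self, incoming: Iterable[Wire]) -> None:
        self.incoming: Deque[Wire] = deque(incoming)
        self.received_message_list: List[Tuple[int, int, str]] = []
        self.connected: Set[int] = set()

    def recv(self, src: int, tag: int) -> str:
        while True:
            # First look for a match among messages already buffered.
            for i, (s, t, payload) in enumerate(self.received_message_list):
                if s == src and t == tag:
                    del self.received_message_list[i]
                    return payload
            if not self.incoming:
                raise LookupError("no matching message (a real receiver would block)")
            kind, s, t, payload = self.incoming.popleft()
            if kind == "data":
                self.received_message_list.append((s, t, payload))  # keep arrival order
            elif kind == "con_req":
                self.connected.add(s)        # grant the connection
            elif kind == "peer_migrating":
                self.connected.discard(s)    # close the link to the migrating peer
```

Unmatched data messages stay in `received_message_list` in arrival order, which is what later allows the list to be shipped as the communication state.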
- 11. The software of claim 7 wherein the Migration algorithm is coded as:
migrate( )
    if (migrate_request is received) then
        inform the scheduler migration_start;
        get new_vmid of P_i from scheduler;
        All con_req arrived beyond this point will be rejected;
        Send disconnection signal and peer_migrating;
        Receive incoming messages to received-message-list until getting end-of-messages;
        close all existing connections;
        Send received-message-list to the new process;
        perform exe and memory state collection;
        Send the exe and memory state to the new process;
        wait for migration_commit msg from scheduler;
        cooperate with the virtual machine daemon to make sure that no more con_req control message left to reject;
        terminate;
    end if.
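The "receive until end-of-messages, then close" portion of the Migration steps above can be sketched in Python. This is an illustration only: the `drain_channels` helper, the `("data", payload)` / `("marker", None)` item layout, and the assumption that every channel already holds its closing marker are hypothetical. The sketch drains peers one at a time, which preserves per-sender FIFO order but not global arrival order.

```python
from collections import deque
from typing import Deque, Dict, List, Optional, Tuple

Item = Tuple[str, Optional[str]]   # ("data", payload) or ("marker", None)


def drain_channels(channels: Dict[int, Deque[Item]],
                   received_message_list: List[Tuple[int, str]]) -> None:
    """Buffer in-flight data from each peer until its marker, then close all links."""
    for peer, chan in channels.items():
        while True:
            kind, payload = chan.popleft()   # assumes a marker is always present
            if kind == "marker":
                break                        # peer has sent its last message
            received_message_list.append((peer, payload))
    channels.clear()                         # all existing connections closed
```

After this step the list holds every message that was in transit, so it can be sent to the new process ahead of the execution and memory state.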
- 12. The software of claim 7 wherein the Initialization algorithm is coded as:
initialize( )
    All con_req messages are accepted beyond this point;
    Receive received-message-list of the migrating process;
    insert received-message-list to the front of the original received-message-list;
    Receive “exe and mem state” of the migrating process;
    Restore process state;
    inform the scheduler restore_complete;
    wait for contents of the PL table and old_vmid from the scheduler;
    inform the scheduler migration_commit.
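The list-merging step of the Initialization steps above (inserting the forwarded list at the front of the list already accumulated on the second computer) can be sketched in Python. The function name and argument names are assumptions for this example.

```python
from collections import deque
from typing import List, Sequence


def restore_received_message_list(forwarded: Sequence[str],
                                  arrived_at_new_host: Sequence[str]) -> List[str]:
    """Place messages forwarded by the migrating process ahead of newer arrivals."""
    merged = deque(arrived_at_new_host)
    # extendleft pushes items one by one onto the left, which reverses them,
    # so feed it the forwarded list reversed to keep its original order.
    merged.extendleft(reversed(forwarded))
    return list(merged)
```

For example, `restore_received_message_list(["m1", "m2"], ["m3"])` returns `["m1", "m2", "m3"]`: messages received before migration are consumed before messages that were rerouted to the new host during migration.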
- 13. The software of claim 1 wherein the combined data communications protocols and process migration protocols function to provide maintenance of correct message order and updating of process location information during process migration.
- 14. A method for controlling message traffic and transfer of a migrating process from a first computer to a second computer in a distributed computing virtual machine environment, the environment including peer processes on computers in the virtual machine in addition to the first computer and the second computer, the software comprising communications protocols and process migration protocols allowing point-to-point communication between computers; the software being an independent layer over standard communication protocols and operating systems of the computers in the virtual machine; in which all processes have a memory allocation for their message storage, the method comprising the steps of:
a) granting a communication connection to enable migration of the migrating process in the first computer to migrate to the second computer;
b) sending out marker messages from the migrating process to each connected peer process with which the migrating process is in communication, the marker messages indicating the migrating process has sent its last message;
c) loading an executable copy of the migrating process programming on the second computer;
d) having peer processes respond to the marker message with an acknowledgement message;
e) having the migrating process close communication with a peer process when that peer process acknowledgement message is received by the migrating process;
f) having the first computer perform migration operations including sending the Execution state, Memory state, and Communications state of the migrating process to an initialized process on the second computer when all peer process communications are closed;
g) refusing peer process connection requests to the migrating process during migration operations;
h) having the peer processes, upon receiving a refusal to a connection request, request new location information of the migrating process;
i) rerouting all refused peer process messages during migration operations to the initialized process and compiling the rerouted messages in order in a memory location of the initialized process;
j) restoring all migrating process memory and communication states in the second computer to create a migrated process;
k) resuming execution of the migrated process on the second computer; and
l) updating the location of the migrated process for each peer process requesting a connection to the migrated process.
- 15. Software for controlling message traffic and migration of a process, the process including an execution state, a memory state and a communication state; from a first computer to a second computer in a distributed computing virtual machine, the virtual machine including a scheduler, peer processes in addition to the process intended for migration, and FIFO communication links between processes, the software comprising:
a) communications protocols and process migration protocols allowing point-to-point communication between processes;
b) the software being an independent layer over standard communication protocols and operating systems of the virtual machine;
c) all processes are allotted a received-message-list memory location in which messages for their processes can be stored;
d) when a migration request is granted for the process in the first computer to migrate to the second computer the process in the first computer becomes a migrating process, the migrating process then sends out marker messages to each connected peer process with which it is in communication indicating the migrating process has sent its last message;
e) the connected peer processes receive messages from the FIFO communication link, store them in order into the received-message-list, and acknowledge the migrating process marker message with their own marker message;
f) the migrating process receives messages from the FIFO communication link and stores them in order into the received-message-list until all the acknowledgement marker messages from the connected peers are received, whereafter the communication link between the acknowledging connected peer processes and the migrating process is closed;
g) when all connected peer process communication links with the migrating process are closed, migration operations begin to transfer the migrating process from the first computer to the second computer;
h) migration operations include the virtual machine verifying that the second computer has an initialized process as an executable copy of the migrating process, whereupon the migrating process sends at least its received-message-list as the communication state of the process to the initialized process, and the execution state and memory state of the migrating process are gathered from the migrating process and sent to the second computer for use by the initialized process;
i) peer process connection requests for a communications link to the migrating process are refused during migration operations, a peer process upon receiving a refusal to its connection request will request from the scheduler a new location of the initialized process at the second computer, whereby all message traffic during migration is routed to the initialized process in the second computer and stored in order of receipt in the received-message-list of the initialized process;
j) all process execution, memory, and communication states are restored in the second computer to create a migrated process;
k) the reconstruction of the communication state of the process on the second computer is accomplished by inserting the content of the received-message-list sent from the migrating process on the first computer to the front of the received-message-list of the initialized process on the second computer; and
l) migrated process location information is updated in each peer process which has been informed of the migrated process location.
- 16. Software for controlling message traffic and migration of a process, the process including an execution state, a memory state and a communication state; from a first computer to a second computer in a distributed computing virtual machine, the virtual machine including a scheduler, peer processes in addition to the process intended for migration, and FIFO communication links between processes, the software comprising:
a) communications protocols and process migration protocols allowing point-to-point communication between processes;
b) the software being an independent layer over standard communication protocols and operating systems of the virtual machine;
c) the communication protocols having algorithms including:
i) a Connect algorithm for managing connection requests, denials and rerouting requests between sender and receiver computers,
ii) a Send algorithm for allowing a sender computer to send messages to a receiver computer by verifying that a sender computer connection is established with a receiver computer process before messages are exchanged, and
iii) a Receive algorithm for managing transfer of messages between peer processes whether said processes have migrated or not; managing communication link maintenance or cancellation, and storing of messages in the correct order in a received-message-list of a particular process; and
d) the process migration protocol having algorithms including:
i) a Migration algorithm for informing the scheduler of a process migration; managing acceptance of communications link between processes during migration operations; managing the gathering of the execution, memory, and communication state of the migrating process, and
ii) an Initialization algorithm for restoring the execution, memory, and communication state of the migrating process as necessary to restart the migrating process at the second computer.
- 17. The software of claim 16 wherein the Send algorithm is coded as:
send (m, dest)
    if (dest ∉ Connected) then
        cc[dest] = connect(dest);
    end if
    send m along the cc[dest] communication channel.
- 18. The software of claim 17 wherein the Connect algorithm is coded as:
connect (dest)
    while (dest ∉ Connected) do
        send conn_req to pl[dest];
        if (receive conn_ack from pl[dest]) then
            cid := make_connection_with(pl[dest]);
            Connected := {dest} ∪ Connected;
        else if (receive conn_req from any process p) then
            grant_connection_to(p);
        else if (receive conn_nack from pl[dest]) then
            consult scheduler for exe status and new_vmid of P_dest;
            if (status = migrate) then
                pl[dest] := new_vmid;
            else
                report “error: destination terminated”;
                return error;
            end if
        end if
    end while
    return cid.
- 19. The software of claim 18 wherein the Receive algorithm is coded as:
recv (src, m, tag)
    while (m is not found) do
        if (m is found in received_message_list) then
            remove m from the list and return it to the caller function;
        end if
        get a new data or control message, n;
        if (n is data message) then
            append n to received_message_list;
        else (handle control messages)
            if n is con_req then
                grant_connection_to(sender of n);
            else if n is peer_migrating then
                close down the connection with the sender of peer_migrating;
            end if
        end if
    end while.
- 20. The software of claim 19 wherein the Migration algorithm is coded as:
migrate( )
    if (migrate_request is received) then
        inform the scheduler migration_start;
        get new_vmid of P_i from scheduler;
        All con_req arrived beyond this point will be rejected;
        Send disconnection signal and peer_migrating;
        Receive incoming messages to received-message-list until getting end-of-messages;
        close all existing connections;
        Send received-message-list to the new process;
        perform exe and memory state collection;
        Send the exe and memory state to the new process;
        wait for migration_commit msg from scheduler;
        cooperate with the virtual machine daemon to make sure that no more con_req control message left to reject;
        terminate;
    end if.
- 21. The software of claim 20 wherein the Initialization algorithm is coded as:
initialize( )
    All con_req messages are accepted beyond this point;
    Receive received-message-list of the migrating process;
    insert received-message-list to the front of the original received-message-list;
    Receive “exe and mem state” of the migrating process;
    Restore process state;
    inform the scheduler restore_complete;
    wait for contents of the PL table and old_vmid from the scheduler;
    inform the scheduler migration_commit.
- 22. A method for controlling message traffic and transfer of a migrating process from a first computer to a second computer in a distributed computing virtual machine environment, the environment including peer processes on computers in the virtual machine in addition to the first computer and the second computer, the software comprising communications protocols and process migration protocols allowing point-to-point communication between computers; the software being an independent layer over standard communication protocols and operating systems of the computers in the virtual machine; in which all processes have a memory allocation for their message storage, the method comprising the steps of:
a) granting a migration of the migrating process in the first computer to migrate to the second computer;
b) loading an executable copy of the migrating process programming on the second computer and waiting for the execution, memory, and communication state information of the migrating process;
c) sending out marker messages from the migrating process to each connected peer process with which the migrating process is in communication, the marker messages indicating the migrating process has sent its last message;
d) having peer processes receive incoming messages from the migrating process into its received-message-list until the marker message from the migrating process is received and respond to the marker message with an acknowledgement message;
e) having the migrating process receive incoming messages from the peer processes into its received-message-list until the marker messages from all the connected peer processes are received and close communication with a peer process when that peer process acknowledgement message is received by the migrating process;
f) having the first computer perform migration operations including sending the Execution state, Memory state, and Communications state of the migrating process to an initialized process on the second computer when all peer process communications are closed;
g) refusing peer process connection requests to the migrating process during migration operations;
h) having the peer processes, upon receiving a refusal to a connection request, request new location information of the migrating process from the scheduler;
i) updating the location of the migrated process for each peer process requesting a connection to the migrated process;
j) rerouting all refused peer process messages during migration operations to the initialized process and compiling the rerouted messages in order in a memory location of the initialized process;
k) restoring all migrating process execution, memory, and communication states in the second computer to create a migrated process; and
l) resuming execution of the migrated process on the second computer.
Parent Case Info
[0001] This application claims priority from U.S. Provisional Application No. 60/369,025 filed Mar. 29, 2002.
Government Interests
[0002] The development of this invention was partially funded by the Government under grant numbers ASC-9720215 and CCR-9972251 awarded by the National Science Foundation. The Government has certain rights in this invention.
Provisional Applications (1)
| Number | Date | Country |
| 60369025 | Mar 2002 | US |