The present invention provides a system and method for executing jobs in a distributed computing architecture that uses worker clients. The system is characterized by: a checkpointing mechanism component, assigned to at least one worker client, for generating checkpointing information; at least one failover system assigned to the worker client; and a failover system selection component for automatically assigning at least one existing or newly created failover system to the worker client in case said worker client fails. The assigned failover system provides all function components needed to take over execution of the job when said assigned worker client fails, and further includes at least a failover monitor component for detecting failover situations of said assigned worker client.
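
The interplay of the three components named in the abstract (checkpointing mechanism, failover system selection, failover monitor) can be illustrated with a minimal in-memory sketch. This is not the patented implementation: the names `Checkpoint`, `Worker`, `select_failover`, and `execute_with_failover` are hypothetical, the job is reduced to summing a list of integers, a crash is simulated with a `fail_at` parameter, and the monitor is assumed to be a simple liveness check after the run.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Checkpoint:
    next_step: int   # index of the first step not yet completed
    total: int       # partial result accumulated so far


class Worker:
    """Hypothetical worker client; the same class serves as a failover system."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.alive = True

    def run(self, job_id: str, steps: List[int],
            store: Dict[str, Checkpoint],
            fail_at: Optional[int] = None) -> Optional[int]:
        # Resume from the last checkpoint if one exists, else start fresh.
        cp = store.get(job_id, Checkpoint(next_step=0, total=0))
        for i in range(cp.next_step, len(steps)):
            if fail_at is not None and i == fail_at:
                self.alive = False      # simulated crash before step i
                return None
            cp = Checkpoint(next_step=i + 1, total=cp.total + steps[i])
            store[job_id] = cp          # checkpoint after every completed step
        return cp.total


def select_failover(pool: List[Worker]) -> Worker:
    """Pick an existing live system, or create a new one if none is available."""
    for candidate in pool:
        if candidate.alive:
            return candidate
    created = Worker(f"failover-{len(pool)}")
    pool.append(created)
    return created


def execute_with_failover(job_id: str, steps: List[int], worker: Worker,
                          pool: List[Worker], store: Dict[str, Checkpoint],
                          fail_at: Optional[int] = None) -> Optional[int]:
    result = worker.run(job_id, steps, store, fail_at=fail_at)
    # Failover monitor (assumed here to be a liveness check after the run).
    if result is None and not worker.alive:
        backup = select_failover(pool)
        result = backup.run(job_id, steps, store)  # resumes from checkpoint
    return result
```

In this sketch the failover system does not restart the job: it reads the last stored checkpoint and continues from the first step that had not completed when the worker client failed.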

 