I don't know it, and it is also quite new, but it looks nice. The authors have written a detailed intro paper that includes results (so it does seem to work): http://arxiv.org/abs/cs/0701037
Cheers,
Mischa
On Thu, Jan 08, 2009 at 09:02:17PM +0100, Sven Gabriel wrote:
don't know this one ...
Cheers,
Sven
On Thursday 08 January 2009 19:39:17 Sander Klous wrote:
Hi,
Has anyone ever used this? If so, does it work?
DMTCP is a distributed checkpointing system that can checkpoint not only sequential programs, but also threaded programs (with pthreads), families of processes (made with fork), and distributed processes across machines (like MPI). You can find a reference for it here:
Would it be worthwhile to investigate DMTCP on the grid and see how well it works when wrapped around the job execution boundaries?
Thanks,
Sander