A Checkpoint/Restart Scheme for CUDA Applications with Complex Memory Hierarchy