SymptomTM: Symptom-based error detection and recovery using Hardware Transactional Memory


Yalcin G., Unsal O. S., Cristal A., Hur I., Valero M.

20th International Conference on Parallel Architectures and Compilation Techniques, PACT 2011, Galveston, TX, United States of America, 10 - 14 October 2011, pp.199-200

  • Publication Type: Conference Paper / Full Text
  • DOI Number: 10.1109/pact.2011.39
  • City: Galveston, TX
  • Country: United States of America
  • Page Numbers: pp.199-200
  • Ankara Yıldırım Beyazıt University Affiliated: Yes

Abstract

Fault tolerance has become an essential concern for processor designers due to increasing transient and permanent fault rates. In this study, we propose SymptomTM, a symptom-based error detection technique that recovers from errors by leveraging the abort mechanism of Transactional Memory (TM). To the best of our knowledge, this is the first architectural fault-tolerance proposal using Hardware Transactional Memory (HTM). SymptomTM can recover from 86% and 65% of catastrophic failures caused by transient and permanent errors, respectively, with no performance overhead in error-free executions. © 2011 IEEE.