Multi-threaded code is becoming very common, both on the server side and, more recently, on personal computers as well. Consequently, the problem of finding intermittent bugs is receiving more and more attention. As there is no silver bullet, research focuses on a variety of partial solutions. We outline a road map for combining the research within the different disciplines of testing multi-threaded programs and for evaluating the quality of this research. We have three main goals. First, to create a benchmark that can be used to evaluate different solutions. Second, to create a framework with open APIs that enables the combination of techniques in the multi-threading domain. Third, to create a focus for the research in this area around which a community of people who try to solve similar problems with different techniques can congregate. We have started creating such a benchmark and describe the lessons learned in the process. The framework will enable technology developers, for example, developers of race detection algorithms, to concentrate on their components and use other ready-made components (e.g., an instrumentor) to create a testing solution.