A processor capable of running multiple threads runs a program in one
thread (called the "main" thread) and at least a portion of the same
program in another thread (called the "pre-execution" thread). The
program in the main thread includes instructions that cause the processor
to start and stop pre-execution threads and that specify which part of
the program is to be run in the pre-execution threads.
Preferably, such instructions cause the pre-execution thread to run ahead
of the main thread in program order. In that way, any cache miss
conditions encountered by the pre-execution thread are resolved, and the
missing data is brought into the cache, before the main thread requires
that data. Therefore, the main thread should encounter few or no cache
miss conditions.
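
The effect described above can be illustrated with a toy timing model. The sketch below is not the processor mechanism itself. It is a single-threaded simulation in which the function name run_main, the constants MISS_LATENCY and WORK_CYCLES, and the one-load-per-cycle behavior of the pre-execution thread are all assumptions chosen for illustration. It compares a main thread that fetches every cache line itself with one whose loads have already been issued by a pre-execution thread running ahead of it.

```python
MISS_LATENCY = 10  # assumed cycles to bring a line in from memory
WORK_CYCLES = 4    # assumed compute cycles per main-thread iteration


def run_main(addresses, pre_execute):
    """Simulate the main thread; return (misses, total_cycles).

    A "miss" here means the main thread itself had to start the fetch.
    """
    # Hypothetical pre-execution thread: started at cycle 0, it runs only
    # the load portion of the loop and issues one load per cycle.
    prefetch_issue = {}
    if pre_execute:
        for i, addr in enumerate(addresses):
            prefetch_issue.setdefault(addr, i + 1)

    now = 0
    misses = 0
    fetched_by_main = set()
    for addr in addresses:
        issued = prefetch_issue.get(addr)
        if addr in fetched_by_main:
            pass  # line already cached by an earlier main-thread miss
        elif issued is not None and issued <= now:
            # The pre-execution thread already requested this line, so the
            # main thread waits only for whatever latency remains.
            now = max(now, issued + MISS_LATENCY)
        else:
            # The main thread got here first and pays the full latency.
            misses += 1
            now += MISS_LATENCY
            fetched_by_main.add(addr)
        now += WORK_CYCLES
    return misses, now


# Eight loads, each to a different 64-byte cache line.
addrs = list(range(0, 64 * 8, 64))
baseline = run_main(addrs, pre_execute=False)
with_pre_execution = run_main(addrs, pre_execute=True)
```

Under these assumed latencies, the baseline run misses on all eight lines, while the run with pre-execution misses only on the first line, which the main thread reaches before the pre-execution thread has issued its load. This is consistent with the "few or no" cache miss conditions described above.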