In speculative multithreading, successive sections of sequential code are assigned to hardware threads to run simultaneously.
Each thread has the illusion of executing its task in program order. It sees its own writes and writes that occurred earlier in the program.
It does not see writes that take place later in program order, even if concurrent execution means that these writes have actually taken place earlier in time.
To sustain the illusion, the L2 gives threads private storage as needed. It lets threads read their own writes and writes from threads earlier in program order, but isolates their reads from threads later in program order.
Thus, the L2 might have several different data values for a single address. Each occupies an L2 way, and the L2 directory records, in addition to the usual directory information, a history of which threads have read or written the line.
A speculative write is never allowed to be written out to main memory.
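The visibility rule can be illustrated with a small Python model. This is only a sketch under stated assumptions, not the hardware mechanism: each thread is identified by an integer giving its position in program order (smaller means earlier), and the class VersionedLine and its methods are hypothetical names introduced for this example.

```python
class VersionedLine:
    """Toy model of one address that may hold several speculative values.

    Assumption: each thread is named by an integer `order`, its position in
    program order (smaller means earlier). The dictionary below only stands
    in for the bookkeeping that the L2 directory performs.
    """

    def __init__(self, memory_value):
        self.memory_value = memory_value  # value currently in main memory
        self.versions = {}                # order -> value written by that thread

    def write(self, order, value):
        # In this model every write is kept as a private version;
        # nothing is written to main memory.
        self.versions[order] = value

    def read(self, order):
        # A thread sees its own write and writes from earlier threads only.
        visible = [o for o in self.versions if o <= order]
        if not visible:
            return self.memory_value
        # Among visible writers, the latest in program order supplies the value.
        return self.versions[max(visible)]


line = VersionedLine(memory_value=10)
line.write(2, 20)   # thread 2 writes first in time
line.write(0, 5)    # thread 0 writes later in time, but is earlier in program order

print(line.read(1))  # 5: sees thread 0's write, not thread 2's
print(line.read(2))  # 20: sees its own write
print(line.read(3))  # 20: sees the write from the earlier thread 2
```

Keeping one value per writing thread mirrors the statement that a single address can occupy several L2 ways at once.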
Only one situation will break the program-order illusion: a thread earlier in program order writes to an address that a thread later in program order has already read.
The later thread should have read the newly written data, but read a stale value instead. The solution is to kill the later thread and invalidate all the lines it has written in L2, and to repeat this for all younger threads.
If no such conflict occurs, the thread completes successfully, and its writes can move to DDR when the line is cast out or flushed.
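The conflict rule can be sketched in Python as follows. The function names, the integer thread orders, and the dictionary layout are all assumptions made for illustration. The check treats any recorded read by a later thread as a conflict, as the text describes, and does not attempt to model the directory's actual history format.

```python
def find_victims(readers, writer, running):
    """Return the threads to kill when `writer` writes an address.

    readers: set of orders of threads that have already read the address
    writer:  order of the thread performing the write
    running: set of orders of all threads currently running
    """
    # Conflict: a thread later in program order has already read the address.
    later_readers = [r for r in readers if r > writer]
    if not later_readers:
        return []
    # Kill the earliest conflicting reader and every thread younger than it.
    oldest_victim = min(later_readers)
    return sorted(t for t in running if t >= oldest_victim)


def invalidate(versions, victims):
    """Drop every version written by a killed thread.

    versions: dict mapping address -> {order: value}
    """
    return {
        addr: {o: v for o, v in per_thread.items() if o not in victims}
        for addr, per_thread in versions.items()
    }


running = {0, 1, 2, 3}
versions = {0x100: {3: 99}, 0x200: {1: 7}}

# Thread 2 has read some address; thread 1 now writes it.
victims = find_victims(readers={2}, writer=1, running=running)
print(victims)                        # [2, 3]
print(invalidate(versions, victims))  # {256: {}, 512: {1: 7}}
```

Thread 3 is killed even though it never read the address, because it is younger than the thread that did.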
Not all threads are speculative. The running thread earliest in program order is non-speculative, or committed, and runs conventionally; in particular its writes can go to DDR. The threads later in program order are speculative and are subject to being killed. When the committed thread completes, the next-oldest thread becomes the committed thread.
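The hand-over of committed status can be sketched in the same style. The helper names below are hypothetical, and the model assumes, as before, that threads are identified by integers in program order.

```python
def committed_thread(running):
    """The running thread earliest in program order is the committed one."""
    return min(running) if running else None


def may_write_to_memory(order, running):
    # Only the committed thread's writes are allowed to reach DDR.
    return order == committed_thread(running)


def complete_committed(running):
    """Retire the committed thread; the next-oldest thread takes its place."""
    remaining = set(running)
    remaining.discard(committed_thread(running))
    return remaining


running = {4, 5, 6}
print(committed_thread(running))        # 4
print(may_write_to_memory(5, running))  # False

running = complete_committed(running)
print(committed_thread(running))        # 5
print(may_write_to_memory(5, running))  # True
```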