A method, system, and processor chip design for reducing the latency
between completion of a LARX operation and receipt of the associated STCX
operation that completes the update to the cache line. Each entry of the
store queue of the issuing processor is provided with an additional tracking
bit (the priority bit). The priority bit is set whenever a STCX operation is
placed within the entry. When selecting an entry for dispatch, the
arbitration logic scans the priority bit of each eligible entry. An entry
whose priority bit is set is given precedence in the selection process,
within architectural rules, and is dispatched as early as those rules allow.
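
The selection behavior described above can be illustrated with a small software model. This is only a sketch under stated assumptions, not the hardware design itself: the names `StoreQueueEntry`, `enqueue`, and `select_for_dispatch` are hypothetical, the `eligible` flag stands in for whatever architectural rules gate dispatch, and the fallback to the oldest eligible entry when no priority bit is set is assumed for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class StoreQueueEntry:
    op: str                  # operation held in the entry, e.g. "STORE" or "STCX"
    eligible: bool = True    # assumption: stands in for the architectural rules
    priority: bool = False   # the additional tracking bit (priority bit)


def enqueue(queue: List[StoreQueueEntry], op: str, eligible: bool = True) -> StoreQueueEntry:
    """Place an operation in a new entry; set the priority bit for a STCX."""
    entry = StoreQueueEntry(op=op, eligible=eligible, priority=(op == "STCX"))
    queue.append(entry)
    return entry


def select_for_dispatch(queue: List[StoreQueueEntry]) -> Optional[int]:
    """Return the index of the entry to dispatch, or None if none is eligible."""
    eligible = [i for i, entry in enumerate(queue) if entry.eligible]
    if not eligible:
        return None
    # Scan the priority bit of each eligible entry; a set bit wins.
    for i in eligible:
        if queue[i].priority:
            return i
    # Assumed fallback: with no priority bit set, take the oldest eligible entry.
    return eligible[0]
```

In this model, a queue holding two ordinary stores followed by a STCX selects the STCX entry first, provided it is eligible; if the STCX entry is not yet eligible, selection falls back to the ordinary entries.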