Low-latency memory system access is provided in association with a
weakly-ordered multiprocessor system. The processors in the
multiprocessor share resources, and each shared resource has an
associated lock within a locking device that provides support for
synchronization between the multiple processors in the multiprocessor and
the orderly sharing of the resources. A processor has permission to
access a resource only when it owns the lock associated with that resource.
An attempt by a processor to own a lock requires only a single load
operation, rather than a traditional atomic load followed by a store: the
processor performs only the read operation, and the subsequent write
operation is performed by the hardware locking device rather than by the
processor. Simple prefetching for non-contiguous data structures is also
disclosed. A memory line is redefined so that, in addition to the normal
physical memory data, every line includes a pointer that is large enough
to point to any other line in the memory. These pointers, rather than some
other predictive algorithm, are used to determine which memory line to
prefetch. This enables hardware to effectively prefetch memory access
patterns that are non-contiguous but repetitive.
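
The two mechanisms summarized above can be illustrated with a small software model. This is only a sketch under stated assumptions, not the disclosed hardware: the class names LockDevice and PointerPrefetchMemory, the release method, the one-line prefetch buffer, and the rule that a line's pointer is set to whichever line was accessed next are all hypothetical choices made for illustration.

```python
class LockDevice:
    """Toy model of a locking device: a single load attempts to take a lock."""

    def __init__(self, num_locks):
        # owner[i] is the id of the processor holding lock i, or None if free.
        self.owner = [None] * num_locks

    def load(self, lock_id, cpu_id):
        # The processor issues only this read. When the lock is free, the
        # device itself performs the follow-up write that records the owner.
        if self.owner[lock_id] is None:
            self.owner[lock_id] = cpu_id
            return True
        # Otherwise the load reports whether the caller is already the owner.
        return self.owner[lock_id] == cpu_id

    def release(self, lock_id, cpu_id):
        # Assumed release path (not described in the text above).
        if self.owner[lock_id] == cpu_id:
            self.owner[lock_id] = None


class PointerPrefetchMemory:
    """Toy memory in which every line carries a pointer to another line."""

    def __init__(self, num_lines):
        self.data = [0] * num_lines
        self.next_ptr = [None] * num_lines  # per-line prefetch pointer
        self.prefetched = set()             # assumed one-line prefetch buffer
        self.last_line = None
        self.hits = 0
        self.misses = 0

    def access(self, line):
        # Count whether the previous access's prefetch covered this line.
        if line in self.prefetched:
            self.hits += 1
        else:
            self.misses += 1
        # Assumed update rule: the previously accessed line now points here.
        if self.last_line is not None:
            self.next_ptr[self.last_line] = line
        self.last_line = line
        # Prefetch the line named by this line's pointer, if it has one.
        target = self.next_ptr[line]
        self.prefetched = set() if target is None else {target}
        return self.data[line]


if __name__ == "__main__":
    dev = LockDevice(num_locks=2)
    print("cpu 1 acquires lock 0:", dev.load(0, cpu_id=1))
    print("cpu 2 acquires lock 0:", dev.load(0, cpu_id=2))

    mem = PointerPrefetchMemory(num_lines=64)
    for _ in range(2):                 # repeat a non-contiguous pattern
        for line in [3, 17, 5, 42]:
            mem.access(line)
    print("prefetch hits:", mem.hits, "misses:", mem.misses)
```

In this model the first pass over the pattern only records the pointers, and later passes can be served from the prefetch buffer because each line names its successor, which is the sense in which a non-contiguous but repetitive pattern becomes prefetchable.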