Transactional Memory Everywhere: Summary
If you actually read my full set of
Transactional Memory Everywhere blog postings,
I offer my enthusiastic congratulations and heartfelt condolences.
In short, although TM offers much promise for small changes to memory-only
data structures, there are a number of issues with large-scale
transactions in real software systems:
- I/O operations, especially RPCs,
which cannot in general be rolled back.
- Memory-mapping operations,
particularly when you unmap some of a given transaction's
variables from outside of that transaction.
- Multi-threaded transactions
do not seem to be supported by most TM implementations, which means
that TM does not generally support things like pthread_create().
- Extra-transactional accesses,
whose behavior varies greatly from one TM implementation to another.
- Time delays,
which can interact with the notion of atomicity in interesting
and ill-defined ways.
- Interactions with locking, particularly reader-writer locking.
These interactions are clearly critically important if TM is
to be introduced into large existing software programs that
use locking.
Of course, the interactions between TM and RCU
are a special interest of mine.
- Persistent transactions are an interesting possibility.
There are forms of locking that can span address spaces,
and that can survive reboots and even operating-system upgrades.
Should TM offer similar persistence?
- Dynamic linking and loading of functions invoked from
within transactions.
- Debugging transactions, especially setting breakpoints within transactions
for hardware TM implementations.
- Transactions containing the exec() system call
have interesting implications.
So, what can we conclude from the above list of issues?
- One interesting property of TM is the fact that transactions
are subject to rollback and retry.
This property underlies TM's difficulties with irreversible operations,
including unbuffered I/O, RPCs, memory-mapping operations, time delays,
and the exec() system call.
- Another interesting property of TM, noted by Shpeisman et al.,
is that TM intertwines the synchronization with the data it protects.
This property underlies TM's issues with I/O, memory-mapping operations,
extra-transactional accesses, and debugging breakpoints.
In contrast, conventional synchronization primitives, including
locking and RCU, maintain a clear separation between the synchronization
primitives and the data that they protect.
- One of the stated goals of many workers in the TM area is to
ease parallelization of large sequential programs.
As such, individual transactions are commonly expected to execute
serially, which might do much to explain TM's issues with
multithreaded transactions.
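To make the first point above concrete, here is a minimal sketch of
rollback and retry, written in Python purely for illustration.
It is a toy model, not any real TM implementation: the names ToyStore,
run_transaction, and sent_messages are hypothetical, and the "conflicting
commit from another thread" is simulated inline so that the example is
deterministic.
The buffered memory update can be discarded on conflict, but the
message that stands in for an RPC cannot.

```python
# Toy model only: a single versioned value with optimistic retry.
# ToyStore and run_transaction are hypothetical names for illustration.

class ToyStore:
    def __init__(self, value=0):
        self.value = value
        self.version = 0  # bumped on every commit


def run_transaction(store, body, max_retries=10):
    """Run body(current_value, attempt) optimistically.

    Returns the number of attempts needed to commit.
    """
    for attempt in range(1, max_retries + 1):
        start_version = store.version
        new_value = body(store.value, attempt)  # speculative work
        if store.version == start_version:      # validate: no conflict seen
            store.value = new_value              # commit the buffered write
            store.version += 1
            return attempt
        # Conflict: new_value is simply dropped (rollback), then retry.
    raise RuntimeError("transaction did not commit")


store = ToyStore(0)
sent_messages = []  # stand-in for an RPC or unbuffered write


def body(current, attempt):
    # Irreversible side effect: dropping new_value does not undo this.
    sent_messages.append(f"attempt {attempt} saw {current}")
    if attempt == 1:
        # Simulate a conflicting commit from another thread.
        store.value += 100
        store.version += 1
    return current + 1


attempts = run_transaction(store, body)
print(attempts, store.value, sent_messages)
```

In this sketch the increment is applied exactly once (the store ends at
101), yet two messages are "sent", one of them based on a value that the
committed transaction never saw.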
What should TM researchers and developers do about all of this?
One approach is to focus on TM in the small, concentrating on situations
where hardware assist potentially provides substantial advantages
over other synchronization primitives.
This is in fact the approach Sun took with its Rock research CPU.
Some TM researchers seem to agree with this approach, while others
have much higher hopes for TM.
Of course, it is quite possible that TM will be able to take on
larger problems, and this series of blog posts
lists a few of the issues that must be resolved if TM is to achieve
this lofty goal.
My personal hope is that everyone involved treats this as a learning
experience.
It appears to me that TM researchers have a great deal to learn from
practitioners who have successfully built large software systems using
traditional synchronization primitives.
And vice versa.