Haven't found anything new yet. I'll try a no-draw run in a few mins, and tell you how it goes.
The BltFast method performs a source copy blit or a transparent blit using a source or destination color key. It always attempts an asynchronous blit if the hardware supports one. It works only on video memory surfaces and cannot clip, so the call fails on a surface that has a clipper attached. The software implementation of BltFast is 10% faster than that of Blt; when the video hardware does the blit, there is no speed difference.
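In case it helps to see what "source copy" versus "source color key" means in practice, here's a rough software model in plain C++. This is not DirectDraw code and not how the SDK implements it: `Surface`, `Rect`, `BlitMode` and `blit_fast_model` are all names I made up for illustration. The two modes roughly mirror what the `DDBLTFAST_NOCOLORKEY` and `DDBLTFAST_SRCCOLORKEY` flags select, and the bounds check stands in for the "cannot clip" rule: a rectangle that would need clipping is rejected instead of trimmed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a surface: a width x height grid of 32-bit pixels.
struct Surface {
    int width;
    int height;
    std::vector<std::uint32_t> pixels;

    Surface(int w, int h, std::uint32_t fill = 0)
        : width(w), height(h),
          pixels(static_cast<std::size_t>(w) * static_cast<std::size_t>(h), fill) {
        assert(w > 0 && h > 0);
    }

    std::uint32_t& at(int x, int y) {
        return pixels[static_cast<std::size_t>(y) * width + x];
    }
    std::uint32_t at(int x, int y) const {
        return pixels[static_cast<std::size_t>(y) * width + x];
    }
};

// Source rectangle; right and bottom are exclusive.
struct Rect { int left, top, right, bottom; };

// Illustrative modes: plain copy, or skip pixels equal to the color key.
enum class BlitMode { SourceCopy, SourceColorKey };

// Copies src[r] to dst at (dx, dy). Returns false and leaves dst untouched
// if either rectangle falls outside its surface, because this model, like
// BltFast, does no clipping.
bool blit_fast_model(Surface& dst, int dx, int dy,
                     const Surface& src, const Rect& r,
                     BlitMode mode, std::uint32_t colorKey = 0) {
    const int w = r.right - r.left;
    const int h = r.bottom - r.top;
    if (w <= 0 || h <= 0) return false;
    if (r.left < 0 || r.top < 0 || r.right > src.width || r.bottom > src.height)
        return false;
    if (dx < 0 || dy < 0 || dx + w > dst.width || dy + h > dst.height)
        return false;

    for (int y = 0; y < h; ++y) {
        for (int x = 0; x < w; ++x) {
            const std::uint32_t p = src.at(r.left + x, r.top + y);
            // With a source color key, matching pixels are treated as transparent.
            if (mode == BlitMode::SourceColorKey && p == colorKey) continue;
            dst.at(dx + x, dy + y) = p;
        }
    }
    return true;
}
```

With a 2x2 source whose top-left pixel equals the key, a color-keyed blit leaves that one destination pixel alone and overwrites the other three, while a plain source copy overwrites all four. A destination position that would run off the edge just returns `false`, which is why you have to do your own clipping before calling the real thing.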