The time it takes for the HHT shot isn't all shooting time - the actual frames are taken at approx. 10fps, so the 6 frames can come off in as little as around 1/2 second. If the shutter speed is slower, such as in very low light, you might be shooting the 6 frames at 1/30 or 1/60 or so, in which case it might run you 3/4 second.
The extra second or two is processing time, while the camera aligns and stacks the frames before you're ready to take your next shot. You don't have to keep still during that part - the actual frames have already been snapped.
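To put rough numbers on the capture part, here's a quick back-of-envelope sketch. It assumes each frame takes whichever is longer, the 10fps frame interval or the exposure itself, and it ignores any per-frame overhead and the processing time afterwards. The function name and defaults are just mine for illustration, not anything from Sony:

```python
def burst_capture_time(frames=6, fps=10.0, shutter_s=1 / 250):
    """Rough time (seconds) to capture the burst, excluding processing.

    Assumption: each frame costs the longer of the frame interval (1/fps)
    or the exposure time. Real cameras add some overhead on top of this.
    """
    per_frame = max(1.0 / fps, shutter_s)
    return frames * per_frame


# Fast shutter: the 10fps burst rate is the limit.
print(round(burst_capture_time(shutter_s=1 / 250), 2))
# Slow shutter (1/8 s): the exposure time is the limit.
print(round(burst_capture_time(shutter_s=1 / 8), 2))
```

With a fast shutter the burst rate sets the pace, and once the exposure gets longer than 1/10 second the shutter speed takes over, which is why the hold-still time stretches in very dark scenes.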
The mode is very good in low light and at high ISO - the results can be astounding for a compact sensor. I snapped this in my living room (the walls are painted gold, that's not a yellow cast!) at ISO3200 in HHT mode with my TX1, handheld:
Here, I did a 3-shot comparison: standard P mode at ISO3200, standard P mode at ISO1600, and Hand-held Twilight, which chose ISO2500. This is the scene resized:
HHT mode, ISO2500:
Here are 100% crops from the above shots, straight from camera:
HHT mode, ISO2500:
The detail increase, the lower noise, the cleaner edges, the extra shadow definition, etc. all speak to how useful the mode can be. I would agree that Sony's noise reduction in the standard modes is heavier than on other cameras - in some ways the high ISO performance on these Sony cams is still better than some rivals, but in other ways, not so much. But throw HHT mode into the mix, and they can absolutely crush any small-sensor competition, and in fact most other compact cameras.
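For anyone wondering why stacking helps so much, here's a little simulation of the general principle. To be clear, this is plain frame averaging with perfectly aligned frames and independent random noise, which is my simplifying assumption - it is not Sony's actual algorithm, which isn't described here. Averaging N frames like this should cut random noise by about the square root of N, so roughly 2.4x for 6 frames:

```python
import math
import random
import statistics


def stacked_noise(frames=6, sigma=10.0, pixels=20000, seed=1):
    """Compare noise in one frame against the average of several frames.

    Assumption: noise is independent Gaussian per frame and the frames
    are perfectly aligned (no subject or camera movement).
    """
    rng = random.Random(seed)
    # Noise in a single exposure, one value per pixel.
    single = [rng.gauss(0, sigma) for _ in range(pixels)]
    # Noise left after averaging `frames` exposures of each pixel.
    stacked = [
        sum(rng.gauss(0, sigma) for _ in range(frames)) / frames
        for _ in range(pixels)
    ]
    return statistics.pstdev(single), statistics.pstdev(stacked)


single, stacked = stacked_noise()
ratio = single / stacked
print(f"noise reduced by about {ratio:.2f}x (theory: {math.sqrt(6):.2f}x)")
```

In a real scene the gain will be smaller wherever the frames don't line up, which is where movement in the shot starts to matter.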
Stacking doesn't completely rule out using the mode when there's movement in the frame - it just takes some logical analysis of the situation to determine if the movement is small enough for the camera to overcome when stacking without too much blur. I use HHT on moving subjects too - like my cat, or scenes with people moving around in them at night. It still allows these cameras to be much more usable in low light environments than almost any other compact camera. Here's my cat at ISO3200 in HHT mode. She was sitting mostly still, but cats are always fidgeting, so this still gives an idea of how the camera can handle it - there's actually some half-decent fur detail, at ISO3200, and extremely low noise:
This is a quickie handheld snap in HHT mode, walking around a shopping area at night, with lots of people moving around in the shot: ISO500 (in P mode, this would have required ISO800-1600) and 1/8 shutter speed (x6 frames). There's some motion blur in the people if you look closely, but low noise in a very dark scene and good detail:
That's just to give some idea. Of course, I am shooting on a TX1, not an HX5, so there could be some subtle differences. From the review here though, the HHT and AMB modes appear to produce similar results and are equally usable. As for the yellow cast, that may be unique to the HX5, but the easiest cure for me would be to set the WB manually when you really want to make sure the shot comes out true - the fact that the camera has manual WB is excellent, and means perfect color with no cast is possible every time if you're willing to get in there and control it yourself.