ARM64: Various optimizations for symbol decode

These changes stem from a redesign of the reduction stage of the multisymbol decode function:
- No longer use adapt4 for 5 possible symbol values
- Specialize reduction for 4/8/16 decode functions
- Modify control flow