x86: Add high bit-depth ipred z2 SSSE3 asm

                               x86-64 (clang)     x86-32 (gcc)
intra_pred_z2_w4_16bpc_c:       225.0 ( 1.00x)     250.7 ( 1.00x)
intra_pred_z2_w4_16bpc_ssse3:    70.0 ( 3.21x)      78.5 ( 3.20x)
intra_pred_z2_w4_16bpc_avx2:     51.1 ( 4.40x)
intra_pred_z2_w8_16bpc_c:       629.3 ( 1.00x)     660.7 ( 1.00x)
intra_pred_z2_w8_16bpc_ssse3:   121.9 ( 5.16x)     134.4 ( 4.91x)
intra_pred_z2_w8_16bpc_avx2:    102.7 ( 6.13x)
intra_pred_z2_w16_16bpc_c:     1779.6 ( 1.00x)    1861.0 ( 1.00x)
intra_pred_z2_w16_16bpc_ssse3:  270.3 ( 6.58x)     286.7 ( 6.49x)
intra_pred_z2_w16_16bpc_avx2:   202.4 ( 8.79x)
intra_pred_z2_w32_16bpc_c:     4234.7 ( 1.00x)    4341.4 ( 1.00x)
intra_pred_z2_w32_16bpc_ssse3:  562.5 ( 7.53x)     585.8 ( 7.41x)
intra_pred_z2_w32_16bpc_avx2:   399.0 (10.61x)
intra_pred_z2_w64_16bpc_c:     9255.7 ( 1.00x)    9284.7 ( 1.00x)
intra_pred_z2_w64_16bpc_ssse3: 1284.1 ( 7.21x)    1358.1 ( 6.84x)
intra_pred_z2_w64_16bpc_avx2:   811.4 (11.41x)