x86: Add high bitdepth cdef_filter SSSE3 asm
cdef_filter_4x4_16bpc_c: 949.6
cdef_filter_4x4_16bpc_ssse3: 95.5
cdef_filter_4x4_16bpc_avx2: 110.6
cdef_filter_4x8_16bpc_c: 1799.6
cdef_filter_4x8_16bpc_ssse3: 155.7
cdef_filter_8x8_16bpc_c: 1471.2
cdef_filter_8x8_16bpc_ssse3: 259.4
cdef_filter_8x8_16bpc_avx2: 242.5
This includes optimized code paths for pri-only and sec-only filter strengths, which the high bitdepth AVX2 code currently lacks. That is why the SSSE3 code performs so well relative to AVX2, and is even faster for 4x4. Those optimizations will be added to AVX2 in a future MR.
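
For illustration only, the idea behind the pri-only and sec-only paths can be sketched in scalar C. This is a simplified model, not the asm in this commit: the names select_path, tap_sum and filter_px are hypothetical, the tap positions and weights are supplied by the caller rather than derived from the direction, and the min/max clamping of the output is omitted. The constrain function and the final rounding follow the usual CDEF formulation.

```c
#include <stdlib.h>

/* CDEF constrain: limit a tap difference according to the filter strength. */
static int constrain(int diff, int strength, int shift) {
    if (!strength) return 0;
    int mag = abs(diff);
    int lim = strength - (mag >> shift);
    if (lim < 0) lim = 0;
    if (mag > lim) mag = lim;
    return diff < 0 ? -mag : mag;
}

/* Hypothetical path selector: one specialized path per strength combination. */
enum cdef_path {
    CDEF_PATH_NONE,     /* both strengths zero, nothing to filter */
    CDEF_PATH_PRI_ONLY, /* secondary taps are never loaded */
    CDEF_PATH_SEC_ONLY, /* primary taps are never loaded */
    CDEF_PATH_PRI_SEC   /* general case */
};

static enum cdef_path select_path(int pri_strength, int sec_strength) {
    if (pri_strength && sec_strength) return CDEF_PATH_PRI_SEC;
    if (pri_strength) return CDEF_PATH_PRI_ONLY;
    if (sec_strength) return CDEF_PATH_SEC_ONLY;
    return CDEF_PATH_NONE;
}

/* Hypothetical helper: weighted sum of constrained differences for one pixel. */
static int tap_sum(int px, const int *taps, const int *weights, int n,
                   int strength, int shift) {
    int sum = 0;
    for (int i = 0; i < n; i++)
        sum += weights[i] * constrain(taps[i] - px, strength, shift);
    return sum;
}

/* Filter one pixel, skipping the tap set whose strength is zero. */
static int filter_px(int px,
                     const int *pri, const int *pri_w, int n_pri,
                     int pri_strength, int pri_shift,
                     const int *sec, const int *sec_w, int n_sec,
                     int sec_strength, int sec_shift) {
    int sum = 0;
    switch (select_path(pri_strength, sec_strength)) {
    case CDEF_PATH_NONE:
        return px;
    case CDEF_PATH_PRI_ONLY:
        sum = tap_sum(px, pri, pri_w, n_pri, pri_strength, pri_shift);
        break;
    case CDEF_PATH_SEC_ONLY:
        sum = tap_sum(px, sec, sec_w, n_sec, sec_strength, sec_shift);
        break;
    case CDEF_PATH_PRI_SEC:
        sum = tap_sum(px, pri, pri_w, n_pri, pri_strength, pri_shift)
            + tap_sum(px, sec, sec_w, n_sec, sec_strength, sec_shift);
        break;
    }
    /* Round the sum to the nearest integer in units of 1/16 (assumes an
     * arithmetic right shift for negative values). */
    return px + ((8 + sum - (sum < 0)) >> 4);
}
```

In this model a zero strength makes constrain() return 0 for every tap, so a specialized path gives the same result as the general one while avoiding the loads and arithmetic for the unused tap set.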