Implement chroma-from-luma for AVX2

Benchmark results (C reference vs. AVX2):
cfl_pred_cfl_128_w4_8bpc_c: 189.2
cfl_pred_cfl_128_w4_8bpc_avx2: 18.6
cfl_pred_cfl_128_w8_8bpc_c: 522.6
cfl_pred_cfl_128_w8_8bpc_avx2: 28.3
cfl_pred_cfl_128_w16_8bpc_c: 963.9
cfl_pred_cfl_128_w16_8bpc_avx2: 43.9
cfl_pred_cfl_128_w32_8bpc_c: 1593.2
cfl_pred_cfl_128_w32_8bpc_avx2: 72.2
cfl_pred_cfl_left_w4_8bpc_c: 189.5
cfl_pred_cfl_left_w4_8bpc_avx2: 22.3
cfl_pred_cfl_left_w8_8bpc_c: 525.4
cfl_pred_cfl_left_w8_8bpc_avx2: 32.1
cfl_pred_cfl_left_w16_8bpc_c: 977.7
cfl_pred_cfl_left_w16_8bpc_avx2: 49.4
cfl_pred_cfl_left_w32_8bpc_c: 1541.3
cfl_pred_cfl_left_w32_8bpc_avx2: 74.8
cfl_pred_cfl_top_w4_8bpc_c: 187.7
cfl_pred_cfl_top_w4_8bpc_avx2: 17.3
cfl_pred_cfl_top_w8_8bpc_c: 514.7
cfl_pred_cfl_top_w8_8bpc_avx2: 32.2
cfl_pred_cfl_top_w16_8bpc_c: 966.5
cfl_pred_cfl_top_w16_8bpc_avx2: 46.6
cfl_pred_cfl_top_w32_8bpc_c: 1497.4
cfl_pred_cfl_top_w32_8bpc_avx2: 76.4
cfl_pred_cfl_w4_8bpc_c: 197.0
cfl_pred_cfl_w4_8bpc_avx2: 25.7
cfl_pred_cfl_w8_8bpc_c: 545.0
cfl_pred_cfl_w8_8bpc_avx2: 35.5
cfl_pred_cfl_w16_8bpc_c: 994.8
cfl_pred_cfl_w16_8bpc_avx2: 49.3
cfl_pred_cfl_w32_8bpc_c: 1517.0
cfl_pred_cfl_w32_8bpc_avx2: 78.6
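
For context, a minimal scalar sketch of the kind of per-pixel work the `_c` rows above measure. This is not the code from this commit. The names `cfl_pred_sketch` and `clip_pixel8` are hypothetical, and the sketch assumes 8-bit pixels, a precomputed DC value, zero-mean luma AC coefficients, and the usual AV1-style rounding of the scaled AC term (add 32, shift right by 6, restore the sign).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: clamp a value to the 8-bit pixel range. */
static uint8_t clip_pixel8(int v) {
    return (uint8_t)(v < 0 ? 0 : v > 255 ? 255 : v);
}

/*
 * Hypothetical scalar chroma-from-luma predictor (assumed signature).
 * dst:    output chroma block, `stride` bytes between rows
 * dc:     DC prediction for the chroma block
 * ac:     luma AC coefficients, `width` entries per row
 * alpha:  signed scaling factor applied to the AC term
 */
static void cfl_pred_sketch(uint8_t *dst, ptrdiff_t stride,
                            int width, int height,
                            int dc, const int16_t *ac, int alpha)
{
    assert(width > 0 && height > 0);
    for (int y = 0; y < height; y++) {
        for (int x = 0; x < width; x++) {
            const int diff = alpha * ac[x];
            /* Round the magnitude, then restore the sign (assumed rounding). */
            const int mag = (abs(diff) + 32) >> 6;
            dst[x] = clip_pixel8(dc + (diff < 0 ? -mag : mag));
        }
        ac += width;
        dst += stride;
    }
}
```

Each output pixel depends only on its own AC coefficient plus two block-wide scalars, so the inner loop has no cross-pixel dependencies, which is the kind of loop that typically maps well onto wide SIMD registers.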
Edited by David Michael Barr