arm64: ipred: 8 bpc NEON implementation of the Z2 function

Relative speedup over C code:
                             Cortex A53        A55        A72        A73        A76   Apple M1
intra_pred_z2_w4_8bpc_neon:        3.91       3.55       3.31       3.94       3.46       8.50
intra_pred_z2_w8_8bpc_neon:        5.68       5.67       4.31       5.31       4.34       5.83
intra_pred_z2_w16_8bpc_neon:       8.39       9.28       5.53       7.04       7.01       9.45
intra_pred_z2_w32_8bpc_neon:       7.01       8.01       5.04       6.32       5.48       7.48
intra_pred_z2_w64_8bpc_neon:       8.73      10.25       5.92       7.61       6.63      10.05