arm64: mc: NEON implementation of w_mask for 16 bpc
Checkasm numbers:            Cortex A53        A72        A73
w_mask_420_w4_16bpc_neon:         173.6      123.5      120.3
w_mask_420_w8_16bpc_neon:         484.2      344.1      329.5
w_mask_420_w16_16bpc_neon:       1411.2     1027.4     1035.1
w_mask_420_w32_16bpc_neon:       5561.5     4093.2     3980.1
w_mask_420_w64_16bpc_neon:      13809.6     9856.5     9581.0
w_mask_420_w128_16bpc_neon:     35614.7    25553.8    24284.4
w_mask_422_w4_16bpc_neon:         159.4      112.2      114.2
w_mask_422_w8_16bpc_neon:         453.4      326.1      326.7
w_mask_422_w16_16bpc_neon:       1394.6     1062.3     1050.2
w_mask_422_w32_16bpc_neon:       5485.8     4219.6     4027.3
w_mask_422_w64_16bpc_neon:      13701.2    10079.6     9692.6
w_mask_422_w128_16bpc_neon:     35455.3    25892.5    24625.9
w_mask_444_w4_16bpc_neon:         153.0      112.3      112.7
w_mask_444_w8_16bpc_neon:         437.2      331.8      325.8
w_mask_444_w16_16bpc_neon:       1395.1     1069.1     1041.7
w_mask_444_w32_16bpc_neon:       5370.1     4213.5     4138.1
w_mask_444_w64_16bpc_neon:      13482.6    10190.5    10004.6
w_mask_444_w128_16bpc_neon:     35583.7    26911.2    25638.8

Corresponding numbers for 8 bpc for comparison:
w_mask_420_w4_8bpc_neon:          126.6       79.1       87.7
w_mask_420_w8_8bpc_neon:          343.9      195.0      211.5
w_mask_420_w16_8bpc_neon:         886.3      540.3      577.7
w_mask_420_w32_8bpc_neon:        3558.6     2152.4     2216.7
w_mask_420_w64_8bpc_neon:        8894.9     5161.2     5297.0
w_mask_420_w128_8bpc_neon:      22520.1    13514.5    13887.2
w_mask_422_w4_8bpc_neon:          112.9       68.2       77.0
w_mask_422_w8_8bpc_neon:          314.4      175.5      208.7
w_mask_422_w16_8bpc_neon:         835.5      565.0      608.3
w_mask_422_w32_8bpc_neon:        3381.3     2231.8     2287.6
w_mask_422_w64_8bpc_neon:        8499.4     5343.6     5460.8
w_mask_422_w128_8bpc_neon:      21823.3    14206.5    14249.1
w_mask_444_w4_8bpc_neon:          104.6       65.8       72.7
w_mask_444_w8_8bpc_neon:          290.4      173.7      196.6
w_mask_444_w16_8bpc_neon:         831.4      586.7      591.7
w_mask_444_w32_8bpc_neon:        3320.8     2300.6     2251.0
w_mask_444_w64_8bpc_neon:        8300.0     5480.5     5346.8
w_mask_444_w128_8bpc_neon:      21633.8    15981.3    14384.8