blend, blend_h, blend_v pwr9 implementation
Function | Time | Speedup |
---|---|---|
blend_w4_8bpc_pwr9 | 14.4 | 1.90x |
blend_w8_8bpc_pwr9 | 19.9 | 3.62x |
blend_w16_8bpc_pwr9 | 50.6 | 5.17x |
blend_w32_8bpc_pwr9 | 125.8 | 5.33x |
blend_h_w2_8bpc_pwr9 | 18.4 | 1.20x |
blend_h_w4_8bpc_pwr9 | 27.2 | 1.26x |
blend_h_w8_8bpc_pwr9 | 27.9 | 2.22x |
blend_h_w16_8bpc_pwr9 | 35.1 | 3.28x |
blend_h_w32_8bpc_pwr9 | 57.4 | 3.88x |
blend_h_w64_8bpc_pwr9 | 97.9 | 4.70x |
blend_h_w128_8bpc_pwr9 | 207.6 | 5.18x |
blend_v_w2_8bpc_pwr9 | 25.0 | 1.12x |
blend_v_w4_8bpc_pwr9 | 79.3 | 1.35x |
blend_v_w8_8bpc_pwr9 | 79.5 | 2.43x |
blend_v_w16_8bpc_pwr9 | 108.0 | 3.58x |
blend_v_w32_8bpc_pwr9 | 153.5 | 4.69x |
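
For context on what these functions compute, here is a minimal scalar sketch of an 8bpc blend. It assumes the usual 6-bit weighting, where each mask value lies in [0, 64] and the result is rounded before shifting. The name `blend_ref` and its exact signature are illustrative only and are not taken from this change; the pwr9 versions would process several pixels per vector instruction instead of one at a time.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Weighted average of two 8-bit pixels.
 * m is the weight given to b, in the range [0, 64] (assumed 6-bit scale). */
static inline uint8_t blend_px(uint8_t a, uint8_t b, uint8_t m)
{
    assert(m <= 64);
    return (uint8_t)((a * (64 - m) + b * m + 32) >> 6);
}

/* Hypothetical scalar reference: blend tmp into dst using a per-pixel mask.
 * tmp and mask are assumed to be tightly packed (w bytes per row),
 * while dst uses its own stride in bytes. */
static void blend_ref(uint8_t *dst, ptrdiff_t dst_stride,
                      const uint8_t *tmp, const uint8_t *mask,
                      int w, int h)
{
    for (int y = 0; y < h; y++) {
        for (int x = 0; x < w; x++)
            dst[x] = blend_px(dst[x], tmp[x], mask[x]);
        dst  += dst_stride;
        tmp  += w;
        mask += w;
    }
}
```

In this sketch, a mask value of 0 keeps the destination pixel and 64 replaces it with the `tmp` pixel. The narrow widths in the table (w2, w4) give a vector unit little work per row, which is consistent with their smaller speedups.
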
Edited by Luca Barbato