riscv64/cdef: filter functions

Bogdan Gligorijević requested to merge BogdanW3/dav1d:cdef_filter into master

Benchmarks:

Kendryte K230 (first block below), then Banana PI BPI-F3 (second block):
cdef_filter_4x4_01_8bpc_c:       1339.4 ( 1.00x)
cdef_filter_4x4_01_8bpc_rvv:      836.2 ( 1.60x)
cdef_filter_4x4_01_16bpc_c:      1369.1 ( 1.00x)
cdef_filter_4x4_01_16bpc_rvv:     824.7 ( 1.66x)
cdef_filter_4x4_10_8bpc_c:        872.8 ( 1.00x)
cdef_filter_4x4_10_8bpc_rvv:      523.9 ( 1.67x)
cdef_filter_4x4_10_16bpc_c:       938.2 ( 1.00x)
cdef_filter_4x4_10_16bpc_rvv:     517.1 ( 1.81x)
cdef_filter_4x4_11_8bpc_c:       2668.3 ( 1.00x)
cdef_filter_4x4_11_8bpc_rvv:     1285.0 ( 2.08x)
cdef_filter_4x4_11_16bpc_c:      2922.1 ( 1.00x)
cdef_filter_4x4_11_16bpc_rvv:    1291.0 ( 2.26x)
cdef_filter_4x8_01_8bpc_c:       2489.1 ( 1.00x)
cdef_filter_4x8_01_8bpc_rvv:     1594.3 ( 1.56x)
cdef_filter_4x8_01_16bpc_c:      2528.1 ( 1.00x)
cdef_filter_4x8_01_16bpc_rvv:    1566.6 ( 1.61x)
cdef_filter_4x8_10_8bpc_c:       1576.9 ( 1.00x)
cdef_filter_4x8_10_8bpc_rvv:      967.1 ( 1.63x)
cdef_filter_4x8_10_16bpc_c:      1641.3 ( 1.00x)
cdef_filter_4x8_10_16bpc_rvv:     947.1 ( 1.73x)
cdef_filter_4x8_11_8bpc_c:       5164.0 ( 1.00x)
cdef_filter_4x8_11_8bpc_rvv:     2490.7 ( 2.07x)
cdef_filter_4x8_11_16bpc_c:      5732.3 ( 1.00x)
cdef_filter_4x8_11_16bpc_rvv:    2499.2 ( 2.29x)
cdef_filter_8x8_01_8bpc_c:       4742.3 ( 1.00x)
cdef_filter_8x8_01_8bpc_rvv:     1628.6 ( 2.91x)
cdef_filter_8x8_01_16bpc_c:      4785.0 ( 1.00x)
cdef_filter_8x8_01_16bpc_rvv:    1595.5 ( 3.00x)
cdef_filter_8x8_10_8bpc_c:       2962.4 ( 1.00x)
cdef_filter_8x8_10_8bpc_rvv:     1000.8 ( 2.96x)
cdef_filter_8x8_10_16bpc_c:      3022.4 ( 1.00x)
cdef_filter_8x8_10_16bpc_rvv:     975.7 ( 3.10x)
cdef_filter_8x8_11_8bpc_c:      12623.9 ( 1.00x)
cdef_filter_8x8_11_8bpc_rvv:     2525.4 ( 5.00x)
cdef_filter_8x8_11_16bpc_c:     12470.7 ( 1.00x)
cdef_filter_8x8_11_16bpc_rvv:    2528.2 ( 4.93x)
cdef_filter_4x4_01_8bpc_c:       1281.2 ( 1.00x)
cdef_filter_4x4_01_8bpc_rvv:      813.0 ( 1.58x)
cdef_filter_4x4_01_16bpc_c:      1300.8 ( 1.00x)
cdef_filter_4x4_01_16bpc_rvv:     808.9 ( 1.61x)
cdef_filter_4x4_10_8bpc_c:        843.0 ( 1.00x)
cdef_filter_4x4_10_8bpc_rvv:      498.4 ( 1.69x)
cdef_filter_4x4_10_16bpc_c:       903.6 ( 1.00x)
cdef_filter_4x4_10_16bpc_rvv:     497.9 ( 1.81x)
cdef_filter_4x4_11_8bpc_c:       2614.1 ( 1.00x)
cdef_filter_4x4_11_8bpc_rvv:     1219.6 ( 2.14x)
cdef_filter_4x4_11_16bpc_c:      2795.6 ( 1.00x)
cdef_filter_4x4_11_16bpc_rvv:    1243.1 ( 2.25x)
cdef_filter_4x8_01_8bpc_c:       2405.4 ( 1.00x)
cdef_filter_4x8_01_8bpc_rvv:     1548.5 ( 1.55x)
cdef_filter_4x8_01_16bpc_c:      2402.7 ( 1.00x)
cdef_filter_4x8_01_16bpc_rvv:    1542.7 ( 1.56x)
cdef_filter_4x8_10_8bpc_c:       1522.0 ( 1.00x)
cdef_filter_4x8_10_8bpc_rvv:      917.4 ( 1.66x)
cdef_filter_4x8_10_16bpc_c:      1589.2 ( 1.00x)
cdef_filter_4x8_10_16bpc_rvv:     915.9 ( 1.74x)
cdef_filter_4x8_11_8bpc_c:       5050.7 ( 1.00x)
cdef_filter_4x8_11_8bpc_rvv:     2358.7 ( 2.14x)
cdef_filter_4x8_11_16bpc_c:      5510.5 ( 1.00x)
cdef_filter_4x8_11_16bpc_rvv:    2411.6 ( 2.28x)
cdef_filter_8x8_01_8bpc_c:       4558.3 ( 1.00x)
cdef_filter_8x8_01_8bpc_rvv:     1579.7 ( 2.89x)
cdef_filter_8x8_01_16bpc_c:      4551.1 ( 1.00x)
cdef_filter_8x8_01_16bpc_rvv:    1571.1 ( 2.90x)
cdef_filter_8x8_10_8bpc_c:       2869.3 ( 1.00x)
cdef_filter_8x8_10_8bpc_rvv:      948.4 ( 3.03x)
cdef_filter_8x8_10_16bpc_c:      2928.6 ( 1.00x)
cdef_filter_8x8_10_16bpc_rvv:     944.2 ( 3.10x)
cdef_filter_8x8_11_8bpc_c:      12317.5 ( 1.00x)
cdef_filter_8x8_11_8bpc_rvv:     2389.7 ( 5.15x)
cdef_filter_8x8_11_16bpc_c:     11950.6 ( 1.00x)
cdef_filter_8x8_11_16bpc_rvv:    2440.1 ( 4.90x)
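
The ratios in parentheses can be re-derived from the raw numbers. The following is a minimal sketch, not part of the merge request, that parses lines in the format shown above and recomputes the speedup of each `_rvv` entry over its `_c` counterpart. It assumes the first number is a timing where lower is better and the parenthesized value is the speedup relative to the C version; `parse_bench`, `speedups`, and `SAMPLE` are hypothetical names chosen for illustration.

```python
import re

# Matches lines such as "cdef_filter_4x4_01_8bpc_c:       1339.4 ( 1.00x)".
# Group 1 is the test name, group 2 the implementation suffix ("c" or "rvv"),
# group 3 the measured value, group 4 the reported ratio.
LINE_RE = re.compile(r"^(\w+)_(c|rvv):\s+([\d.]+)\s+\(\s*([\d.]+)x\)$")


def parse_bench(text):
    """Return {test_name: {"c": value, "rvv": value}} from benchmark lines."""
    results = {}
    for line in text.splitlines():
        m = LINE_RE.match(line.strip())
        if m:
            name, impl, value, _reported = m.groups()
            results.setdefault(name, {})[impl] = float(value)
    return results


def speedups(results):
    """Recompute C / RVV ratios, rounded to two decimals (assumes lower is better)."""
    return {
        name: round(t["c"] / t["rvv"], 2)
        for name, t in results.items()
        if "c" in t and "rvv" in t
    }


# Two pairs copied from the first block of results above.
SAMPLE = """\
cdef_filter_4x4_01_8bpc_c:       1339.4 ( 1.00x)
cdef_filter_4x4_01_8bpc_rvv:      836.2 ( 1.60x)
cdef_filter_8x8_11_8bpc_c:      12623.9 ( 1.00x)
cdef_filter_8x8_11_8bpc_rvv:     2525.4 ( 5.00x)
"""

if __name__ == "__main__":
    for name, ratio in speedups(parse_bench(SAMPLE)).items():
        print(f"{name}: {ratio:.2f}x")
```

Because the same test names appear once per board, each block should be parsed separately; feeding both blocks to `parse_bench` in one call would let the second board's values overwrite the first.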