  1. Jul 08, 2019
  2. Jul 07, 2019
  3. Jul 06, 2019
  4. Jul 05, 2019
  5. Jul 02, 2019
  6. Jun 30, 2019
  7. Jun 29, 2019
  8. Jun 27, 2019
  9. Jun 26, 2019
    • arm64: itx: Add NEON optimized inverse transforms · ef1ea008
      Martin Storsjö authored and Jean-Baptiste Kempf committed
      The speedup for most non-dc-only dct functions is around 9-12x
      over the C code generated by GCC 7.3.
      
      Relative speedups vs C for a few functions:
      
                                                    Cortex A53    A72    A73
      inv_txfm_add_4x4_dct_dct_0_8bpc_neon:               3.90   4.16   5.65
      inv_txfm_add_4x4_dct_dct_1_8bpc_neon:               7.20   8.05  11.19
      inv_txfm_add_8x8_dct_dct_0_8bpc_neon:               5.09   6.73   6.45
      inv_txfm_add_8x8_dct_dct_1_8bpc_neon:              12.18  10.80  13.05
      inv_txfm_add_16x16_dct_dct_0_8bpc_neon:             7.31   9.35  11.17
      inv_txfm_add_16x16_dct_dct_1_8bpc_neon:            14.36  13.06  15.93
      inv_txfm_add_16x16_dct_dct_2_8bpc_neon:            11.00  10.09  12.05
      inv_txfm_add_32x32_dct_dct_0_8bpc_neon:             4.41   5.40   5.77
      inv_txfm_add_32x32_dct_dct_1_8bpc_neon:            13.84  13.81  18.04
      inv_txfm_add_32x32_dct_dct_2_8bpc_neon:            11.75  11.87  15.22
      inv_txfm_add_32x32_dct_dct_3_8bpc_neon:            10.20  10.40  13.13
      inv_txfm_add_32x32_dct_dct_4_8bpc_neon:             9.01   9.21  11.56
      inv_txfm_add_64x64_dct_dct_0_8bpc_neon:             3.84   4.82   5.28
      inv_txfm_add_64x64_dct_dct_1_8bpc_neon:            14.40  12.69  16.71
      inv_txfm_add_64x64_dct_dct_4_8bpc_neon:            10.91   9.63  12.67
      
      Some of the special-cased identity_identity transforms for 32x32
      give insane speedups over the generic C code:
      
      inv_txfm_add_32x32_identity_identity_0_8bpc_neon: 225.26 238.11 247.07
      inv_txfm_add_32x32_identity_identity_1_8bpc_neon: 225.33 238.53 247.69
      inv_txfm_add_32x32_identity_identity_2_8bpc_neon:  59.60  61.94  64.63
      inv_txfm_add_32x32_identity_identity_3_8bpc_neon:  26.98  27.99  29.21
      inv_txfm_add_32x32_identity_identity_4_8bpc_neon:  15.08  15.93  16.56
    • tools: Use DAV1D_ERR for strerror calls · e0346114
      Marvin Scholz authored and Jean-Baptiste Kempf committed
    • include: Consistently use DAV1D_ERR in docs · 04dc8a4d
      Marvin Scholz authored and Jean-Baptiste Kempf committed
  10. Jun 24, 2019
  11. Jun 21, 2019
  12. Jun 20, 2019
  13. Jun 19, 2019
  14. Jun 14, 2019
    • arm:mc: NEON implementation of blend, blend_h and blend_v function · a1e3f358
      B Krishnan Iyer authored and B Krishnan Iyer committed
                                   A73       A53
      blend_h_w2_8bpc_c:         149.3     246.8
      blend_h_w2_8bpc_neon:       74.6       137
      blend_h_w4_8bpc_c:         251.6     409.8
      blend_h_w4_8bpc_neon:         66     146.6
      blend_h_w8_8bpc_c:         446.6     844.1
      blend_h_w8_8bpc_neon:       68.6     131.2
      blend_h_w16_8bpc_c:          830      1513
      blend_h_w16_8bpc_neon:      85.9       192
      blend_h_w32_8bpc_c:       1605.2    2847.8
      blend_h_w32_8bpc_neon:     149.8     357.6
      blend_h_w64_8bpc_c:       3304.8    5515.5
      blend_h_w64_8bpc_neon:     262.8     629.5
      blend_h_w128_8bpc_c:      7895.1   13260.6
      blend_h_w128_8bpc_neon:      577      1402
      blend_v_w2_8bpc_c:         241.2     410.8
      blend_v_w2_8bpc_neon:      122.1     196.8
      blend_v_w4_8bpc_c:         874.4    1418.2
      blend_v_w4_8bpc_neon:      248.5     375.9
      blend_v_w8_8bpc_c:        1550.5    2514.7
      blend_v_w8_8bpc_neon:      210.8       376
      blend_v_w16_8bpc_c:       2925.3      5086
      blend_v_w16_8bpc_neon:     253.4     608.3
      blend_v_w32_8bpc_c:       5686.7    9470.5
      blend_v_w32_8bpc_neon:     348.2     994.8
      blend_w4_8bpc_c:           201.5     309.3
      blend_w4_8bpc_neon:         38.6      99.2
      blend_w8_8bpc_c:           531.3     944.8
      blend_w8_8bpc_neon:         55.1     125.8
      blend_w16_8bpc_c:         1992.8    3349.8
      blend_w16_8bpc_neon:       150.1       344
      blend_w32_8bpc_c:           4982    8165.9
      blend_w32_8bpc_neon:       360.4     910.9
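The blend commit reports raw timings rather than ratios. As a rough illustration, the sketch below derives NEON-over-C speedups from a few rows copied from that table. It assumes that lower numbers mean faster code, which the C/NEON pairing suggests but the commit message does not state. The `timings` dictionary and the `speedup` and `speedups` helpers are hypothetical names introduced here and are not part of dav1d.

```python
# A few rows copied from the blend timing table, as (A73, A53) pairs.
# Assumption: lower values mean faster code.
timings = {
    "blend_h_w2":   {"c": (149.3, 246.8),    "neon": (74.6, 137.0)},
    "blend_h_w128": {"c": (7895.1, 13260.6), "neon": (577.0, 1402.0)},
    "blend_v_w32":  {"c": (5686.7, 9470.5),  "neon": (348.2, 994.8)},
    "blend_w32":    {"c": (4982.0, 8165.9),  "neon": (360.4, 910.9)},
}


def speedup(c_time, neon_time):
    """Return how many times faster the NEON version is than the C version."""
    return c_time / neon_time


def speedups(table):
    """Map each function name to its (A73, A53) speedups, rounded to 2 places."""
    return {
        name: tuple(
            round(speedup(c, n), 2) for c, n in zip(row["c"], row["neon"])
        )
        for name, row in table.items()
    }


if __name__ == "__main__":
    for name, (a73, a53) in speedups(timings).items():
        print(f"{name}: A73 {a73:.2f}x, A53 {a53:.2f}x")
```

On these rows the ratio grows with block width: the narrowest blend_h case gains roughly 2x, while the widest gains well over 9x on both cores.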
  15. Jun 10, 2019
  16. Jun 09, 2019
  17. Jun 07, 2019
  18. Jun 06, 2019
  19. Jun 05, 2019
  20. Jun 04, 2019
    • meson: Fix nasm detection · 098a565c
      Marvin Scholz authored
      nasm -v can fail, for example on macOS, where nasm may be a stub
      executable that forwards commands to the real nasm and fails if the
      real nasm is not installed.
      This led to a confusing error message caused by an out-of-bounds
      array access. To avoid that, explicitly check the exit code.
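The actual fix lives in the project's Meson build files, which are not shown here. The Python sketch below only illustrates the same pattern: check the exit code of the version command before parsing its output, so that a failing stub yields a clear "not found" result instead of an indexing error. `detect_nasm` and `parse_nasm_version` are hypothetical helpers, and the version-string format they expect is an assumption.

```python
import re
import subprocess


def parse_nasm_version(output):
    """Extract a dotted version from text such as 'NASM version 2.14.02 ...'.

    Returns None when no version is found, rather than indexing blindly
    into split output.
    """
    match = re.search(r"version\s+(\d+(?:\.\d+)+)", output)
    return match.group(1) if match else None


def detect_nasm(command=("nasm", "-v")):
    """Return the detected version string, or None if detection fails."""
    try:
        result = subprocess.run(
            command, capture_output=True, text=True, check=False
        )
    except OSError:
        # The executable does not exist or cannot be started.
        return None
    if result.returncode != 0:
        # For example, a stub that cannot find the real nasm.
        return None
    return parse_nasm_version(result.stdout)


if __name__ == "__main__":
    version = detect_nasm()
    print(f"nasm {version}" if version else "nasm not found or not usable")
```

Checking `returncode` first means the parser never sees the empty or unrelated output of a failed command, which is the situation the commit message describes.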
  21. Jun 01, 2019
  22. May 31, 2019