  1. Jun 12, 2020
  2. Jun 11, 2020
    • Remove redundant memset in itx DSP initialization · d606dd24
      Henrik Gramner authored
      The struct is already zero-initialized when the function is called
      except for the checkasm test, so move the zeroing there instead.
    • meson: Make docs generation subproject-safe · bc008834
      Matthias Dressel authored
      meson.source_root() returns the root of a parent project if dav1d is
      embedded as a subproject.
    • x86: Adapt SSSE3 prep_8tap to SSE2 · 22fb8a42
      Victorien Le Couviour--Tuffet authored
      ---------------------
      x86_64:
      ------------------------------------------
      mct_8tap_regular_w4_h_8bpc_c: 302.3
      mct_8tap_regular_w4_h_8bpc_sse2: 47.3
      mct_8tap_regular_w4_h_8bpc_ssse3: 19.5
      ---------------------
      mct_8tap_regular_w8_h_8bpc_c: 745.5
      mct_8tap_regular_w8_h_8bpc_sse2: 235.2
      mct_8tap_regular_w8_h_8bpc_ssse3: 70.4
      ---------------------
      mct_8tap_regular_w16_h_8bpc_c: 1844.3
      mct_8tap_regular_w16_h_8bpc_sse2: 755.6
      mct_8tap_regular_w16_h_8bpc_ssse3: 225.9
      ---------------------
      mct_8tap_regular_w32_h_8bpc_c: 6685.5
      mct_8tap_regular_w32_h_8bpc_sse2: 2954.4
      mct_8tap_regular_w32_h_8bpc_ssse3: 795.8
      ---------------------
      mct_8tap_regular_w64_h_8bpc_c: 15633.5
      mct_8tap_regular_w64_h_8bpc_sse2: 7120.4
      mct_8tap_regular_w64_h_8bpc_ssse3: 1900.4
      ---------------------
      mct_8tap_regular_w128_h_8bpc_c: 37772.1
      mct_8tap_regular_w128_h_8bpc_sse2: 17698.1
      mct_8tap_regular_w128_h_8bpc_ssse3: 4665.5
      ------------------------------------------
      mct_8tap_regular_w4_v_8bpc_c: 306.5
      mct_8tap_regular_w4_v_8bpc_sse2: 71.7
      mct_8tap_regular_w4_v_8bpc_ssse3: 37.9
      ---------------------
      mct_8tap_regular_w8_v_8bpc_c: 923.3
      mct_8tap_regular_w8_v_8bpc_sse2: 168.7
      mct_8tap_regular_w8_v_8bpc_ssse3: 71.3
      ---------------------
      mct_8tap_regular_w16_v_8bpc_c: 3040.1
      mct_8tap_regular_w16_v_8bpc_sse2: 505.1
      mct_8tap_regular_w16_v_8bpc_ssse3: 199.7
      ---------------------
      mct_8tap_regular_w32_v_8bpc_c: 12354.8
      mct_8tap_regular_w32_v_8bpc_sse2: 1942.0
      mct_8tap_regular_w32_v_8bpc_ssse3: 714.2
      ---------------------
      mct_8tap_regular_w64_v_8bpc_c: 29427.9
      mct_8tap_regular_w64_v_8bpc_sse2: 4637.4
      mct_8tap_regular_w64_v_8bpc_ssse3: 1829.2
      ---------------------
      mct_8tap_regular_w128_v_8bpc_c: 72756.9
      mct_8tap_regular_w128_v_8bpc_sse2: 11301.0
      mct_8tap_regular_w128_v_8bpc_ssse3: 5020.6
      ------------------------------------------
      mct_8tap_regular_w4_hv_8bpc_c: 876.9
      mct_8tap_regular_w4_hv_8bpc_sse2: 171.7
      mct_8tap_regular_w4_hv_8bpc_ssse3: 112.2
      ---------------------
      mct_8tap_regular_w8_hv_8bpc_c: 2215.1
      mct_8tap_regular_w8_hv_8bpc_sse2: 730.2
      mct_8tap_regular_w8_hv_8bpc_ssse3: 330.9
      ---------------------
      mct_8tap_regular_w16_hv_8bpc_c: 6075.5
      mct_8tap_regular_w16_hv_8bpc_sse2: 2252.1
      mct_8tap_regular_w16_hv_8bpc_ssse3: 973.4
      ---------------------
      mct_8tap_regular_w32_hv_8bpc_c: 22182.7
      mct_8tap_regular_w32_hv_8bpc_sse2: 7692.6
      mct_8tap_regular_w32_hv_8bpc_ssse3: 3599.8
      ---------------------
      mct_8tap_regular_w64_hv_8bpc_c: 50876.8
      mct_8tap_regular_w64_hv_8bpc_sse2: 18499.6
      mct_8tap_regular_w64_hv_8bpc_ssse3: 8815.6
      ---------------------
      mct_8tap_regular_w128_hv_8bpc_c: 122926.3
      mct_8tap_regular_w128_hv_8bpc_sse2: 45120.0
      mct_8tap_regular_w128_hv_8bpc_ssse3: 22085.7
      ------------------------------------------
    • x86: Adapt SSSE3 prep_bilin to SSE2 · 83956bf1
      Victorien Le Couviour--Tuffet authored
      ---------------------
      x86_64:
      ------------------------------------------
      mct_bilinear_w4_h_8bpc_c: 98.9
      mct_bilinear_w4_h_8bpc_sse2: 30.2
      mct_bilinear_w4_h_8bpc_ssse3: 11.5
      ---------------------
      mct_bilinear_w8_h_8bpc_c: 175.3
      mct_bilinear_w8_h_8bpc_sse2: 57.0
      mct_bilinear_w8_h_8bpc_ssse3: 19.7
      ---------------------
      mct_bilinear_w16_h_8bpc_c: 396.2
      mct_bilinear_w16_h_8bpc_sse2: 179.3
      mct_bilinear_w16_h_8bpc_ssse3: 50.9
      ---------------------
      mct_bilinear_w32_h_8bpc_c: 1311.2
      mct_bilinear_w32_h_8bpc_sse2: 718.8
      mct_bilinear_w32_h_8bpc_ssse3: 243.9
      ---------------------
      mct_bilinear_w64_h_8bpc_c: 2892.7
      mct_bilinear_w64_h_8bpc_sse2: 1746.0
      mct_bilinear_w64_h_8bpc_ssse3: 568.0
      ---------------------
      mct_bilinear_w128_h_8bpc_c: 7192.6
      mct_bilinear_w128_h_8bpc_sse2: 4339.8
      mct_bilinear_w128_h_8bpc_ssse3: 1619.2
      ------------------------------------------
      mct_bilinear_w4_v_8bpc_c: 129.7
      mct_bilinear_w4_v_8bpc_sse2: 26.6
      mct_bilinear_w4_v_8bpc_ssse3: 16.7
      ---------------------
      mct_bilinear_w8_v_8bpc_c: 233.3
      mct_bilinear_w8_v_8bpc_sse2: 55.0
      mct_bilinear_w8_v_8bpc_ssse3: 24.7
      ---------------------
      mct_bilinear_w16_v_8bpc_c: 498.9
      mct_bilinear_w16_v_8bpc_sse2: 146.0
      mct_bilinear_w16_v_8bpc_ssse3: 54.2
      ---------------------
      mct_bilinear_w32_v_8bpc_c: 1562.2
      mct_bilinear_w32_v_8bpc_sse2: 560.6
      mct_bilinear_w32_v_8bpc_ssse3: 201.0
      ---------------------
      mct_bilinear_w64_v_8bpc_c: 3221.3
      mct_bilinear_w64_v_8bpc_sse2: 1380.6
      mct_bilinear_w64_v_8bpc_ssse3: 499.3
      ---------------------
      mct_bilinear_w128_v_8bpc_c: 7357.7
      mct_bilinear_w128_v_8bpc_sse2: 3439.0
      mct_bilinear_w128_v_8bpc_ssse3: 1489.1
      ------------------------------------------
      mct_bilinear_w4_hv_8bpc_c: 185.0
      mct_bilinear_w4_hv_8bpc_sse2: 54.5
      mct_bilinear_w4_hv_8bpc_ssse3: 22.1
      ---------------------
      mct_bilinear_w8_hv_8bpc_c: 377.8
      mct_bilinear_w8_hv_8bpc_sse2: 104.3
      mct_bilinear_w8_hv_8bpc_ssse3: 35.8
      ---------------------
      mct_bilinear_w16_hv_8bpc_c: 1159.4
      mct_bilinear_w16_hv_8bpc_sse2: 311.0
      mct_bilinear_w16_hv_8bpc_ssse3: 106.3
      ---------------------
      mct_bilinear_w32_hv_8bpc_c: 4436.2
      mct_bilinear_w32_hv_8bpc_sse2: 1230.7
      mct_bilinear_w32_hv_8bpc_ssse3: 400.7
      ---------------------
      mct_bilinear_w64_hv_8bpc_c: 10627.7
      mct_bilinear_w64_hv_8bpc_sse2: 2934.2
      mct_bilinear_w64_hv_8bpc_ssse3: 957.2
      ---------------------
      mct_bilinear_w128_hv_8bpc_c: 26048.9
      mct_bilinear_w128_hv_8bpc_sse2: 7590.3
      mct_bilinear_w128_hv_8bpc_ssse3: 2947.0
      ------------------------------------------
  3. Jun 10, 2020
  4. Jun 09, 2020
  5. Jun 07, 2020
    • CI: Enable coverage reports · 2b98fd28
      Niklas Haas authored and Jean-Baptiste Kempf committed
      Blacklisted some files not directly relevant to the codebase (such as
      tests, tools and debugging functions).
      
      The coverage HTML report gets attached as a build artifact, although
      unfortunately we can't link directly to the `index.html`. We also attach
      the coverage XML as a cobertura report, although I'm not sure if it does
      anything.
  6. Jun 04, 2020
  7. Jun 01, 2020
    • x86: Add put_8tap_scaled AVX2 asm · a755541f
      Victorien Le Couviour--Tuffet authored
      mc_scaled_8tap_regular_w2_8bpc_c: 764.4
      mc_scaled_8tap_regular_w2_8bpc_avx2: 191.3
      mc_scaled_8tap_regular_w2_dy1_8bpc_c: 705.8
      mc_scaled_8tap_regular_w2_dy1_8bpc_avx2: 89.5
      mc_scaled_8tap_regular_w2_dy2_8bpc_c: 964.0
      mc_scaled_8tap_regular_w2_dy2_8bpc_avx2: 120.3
      
      mc_scaled_8tap_regular_w4_8bpc_c: 1355.7
      mc_scaled_8tap_regular_w4_8bpc_avx2: 180.9
      mc_scaled_8tap_regular_w4_dy1_8bpc_c: 1233.2
      mc_scaled_8tap_regular_w4_dy1_8bpc_avx2: 115.3
      mc_scaled_8tap_regular_w4_dy2_8bpc_c: 1707.6
      mc_scaled_8tap_regular_w4_dy2_8bpc_avx2: 117.9
      
      mc_scaled_8tap_regular_w8_8bpc_c: 2483.2
      mc_scaled_8tap_regular_w8_8bpc_avx2: 294.8
      mc_scaled_8tap_regular_w8_dy1_8bpc_c: 2166.4
      mc_scaled_8tap_regular_w8_dy1_8bpc_avx2: 222.0
      mc_scaled_8tap_regular_w8_dy2_8bpc_c: 3133.7
      mc_scaled_8tap_regular_w8_dy2_8bpc_avx2: 292.6
      
      mc_scaled_8tap_regular_w16_8bpc_c: 5239.2
      mc_scaled_8tap_regular_w16_8bpc_avx2: 729.9
      mc_scaled_8tap_regular_w16_dy1_8bpc_c: 5156.5
      mc_scaled_8tap_regular_w16_dy1_8bpc_avx2: 602.2
      mc_scaled_8tap_regular_w16_dy2_8bpc_c: 8018.4
      mc_scaled_8tap_regular_w16_dy2_8bpc_avx2: 783.1
      
      mc_scaled_8tap_regular_w32_8bpc_c: 14745.0
      mc_scaled_8tap_regular_w32_8bpc_avx2: 2205.0
      mc_scaled_8tap_regular_w32_dy1_8bpc_c: 14862.3
      mc_scaled_8tap_regular_w32_dy1_8bpc_avx2: 1721.3
      mc_scaled_8tap_regular_w32_dy2_8bpc_c: 23607.6
      mc_scaled_8tap_regular_w32_dy2_8bpc_avx2: 2325.7
      
      mc_scaled_8tap_regular_w64_8bpc_c: 54891.7
      mc_scaled_8tap_regular_w64_8bpc_avx2: 8351.4
      mc_scaled_8tap_regular_w64_dy1_8bpc_c: 50249.0
      mc_scaled_8tap_regular_w64_dy1_8bpc_avx2: 5864.4
      mc_scaled_8tap_regular_w64_dy2_8bpc_c: 79400.1
      mc_scaled_8tap_regular_w64_dy2_8bpc_avx2: 8295.7
      
      mc_scaled_8tap_regular_w128_8bpc_c: 121046.8
      mc_scaled_8tap_regular_w128_8bpc_avx2: 21809.1
      mc_scaled_8tap_regular_w128_dy1_8bpc_c: 133720.4
      mc_scaled_8tap_regular_w128_dy1_8bpc_avx2: 16197.8
      mc_scaled_8tap_regular_w128_dy2_8bpc_c: 218774.8
      mc_scaled_8tap_regular_w128_dy2_8bpc_avx2: 22993.1
  8. May 28, 2020
    • meson: favor _aligned_malloc over posix_memalign · ed39e8fb
      Steve Lhomme authored
      posix_memalign is defined as a built-in in gcc in msys2 but it's not available
      when linking with the Universal C Runtime. _aligned_malloc is available in the
      UCRT.
      
      That should only affect builds targeting Windows since _aligned_malloc is a MS
      thing.
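
The choice between the two allocators can be sketched in C as below. This is only an illustration under stated assumptions: the helper names `example_alloc_aligned` and `example_free_aligned` are hypothetical, and the sketch switches on `_WIN32` for brevity, whereas the commit itself changes which function the meson build checks pick.

```c
#if !defined(_WIN32) && !defined(_POSIX_C_SOURCE)
#define _POSIX_C_SOURCE 200112L /* expose posix_memalign under strict -std= modes */
#endif
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#ifdef _WIN32
#include <malloc.h> /* _aligned_malloc, _aligned_free */
#endif

/* Hypothetical helper, not dav1d's actual API. */
static void *example_alloc_aligned(size_t size, size_t align) {
    /* Both allocators require a power-of-two alignment. */
    assert(align != 0 && (align & (align - 1)) == 0);
#ifdef _WIN32
    /* Provided by the Microsoft C runtime, including the UCRT. */
    return _aligned_malloc(size, align);
#else
    void *ptr = NULL;
    /* posix_memalign also needs align to be a multiple of sizeof(void *). */
    if (align < sizeof(void *)) align = sizeof(void *);
    if (posix_memalign(&ptr, align, size) != 0) return NULL;
    return ptr;
#endif
}

/* Memory from _aligned_malloc must be released with _aligned_free. */
static void example_free_aligned(void *ptr) {
#ifdef _WIN32
    _aligned_free(ptr);
#else
    free(ptr);
#endif
}
```

A caller would pair the two helpers, for example `void *p = example_alloc_aligned(100, 64);` followed by `example_free_aligned(p);`.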
  9. May 26, 2020
  10. May 25, 2020
    • dav1dplay: allow resizing the window · a1e7a329
      Niklas Haas authored
      libplacebo v66 got helper functions that make preserving the aspect
      ratio in this case trivial. But we still need to make sure to clear the
      FBO to black if the image doesn't cover it fully.
  11. May 20, 2020
    • dav1dplay: don't freeze on render errors · df40d36d
      Niklas Haas authored
      Returning out of this function when pl_render_image() fails is the wrong
      thing to do, since that leaves the swapchain frame acquired but never
      submitted. Instead, just clear the target FBO to blank red (to make it
      clear that something went wrong) and continue on with presentation.
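
The control flow described above might look roughly like the following toy model. Every name here is a hypothetical stand-in (none of these are libplacebo or dav1dplay functions). The swapchain is reduced to counters so that the invariant "every acquired frame is submitted" can be checked.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for swapchain state. */
struct example_swapchain {
    int acquired;     /* frames acquired so far */
    int submitted;    /* frames submitted so far */
    bool cleared_red; /* set when the error fallback ran */
};

static bool example_start_frame(struct example_swapchain *sw) {
    sw->acquired++;
    return true;
}

/* Stand-in for the render call; fails on request. */
static bool example_render(bool should_fail) {
    return !should_fail;
}

static void example_clear_red(struct example_swapchain *sw) {
    sw->cleared_red = true;
}

static void example_submit_frame(struct example_swapchain *sw) {
    sw->submitted++;
}

static void example_present(struct example_swapchain *sw, bool render_fails) {
    if (!example_start_frame(sw))
        return; /* nothing acquired, so nothing to submit */

    /* On failure, do not return early: fall back to a solid red frame. */
    if (!example_render(render_fails))
        example_clear_red(sw);

    /* Always submit, so the acquired frame is never left dangling. */
    example_submit_frame(sw);
    assert(sw->acquired == sw->submitted);
}
```

An early `return` in place of `example_clear_red(sw)` would leave `acquired` one ahead of `submitted`, which is the freeze the commit message describes.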
      0.7.0
  12. May 19, 2020
  13. May 18, 2020
    • dav1dplay: support on-GPU film grain synthesis · cbe05cf4
      Niklas Haas authored
      Annoying minor differences in this struct layout mean we can't just
      memcpy the entire thing. Oh well.
      
      Note: technically, PL_API_VER 33 added this API, but PL_API_VER 63 is
      the minimum version of libplacebo that doesn't have glaring bugs when
      generating chroma grain, so we require that as a minimum instead.
      
      (I tested this version on some 4:2:2 and 4:2:0, 8-bit and 10-bit grain
      samples I had lying around and made sure the output was identical up to
      differences in rounding / dithering.)
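
To illustrate why a plain memcpy is not an option when two structs carry the same data in slightly different layouts, here is a minimal sketch. Both struct definitions are hypothetical, heavily simplified stand-ins, not the real dav1d or libplacebo film grain types.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical decoder-side layout. */
struct example_src_grain {
    unsigned seed;
    int num_y_points;
    unsigned char y_points[14][2];
    int scaling_shift;
};

/* Hypothetical renderer-side layout: same data, different order and names. */
struct example_dst_grain {
    int num_points_y;
    unsigned char points_y[14][2];
    int scaling_shift;
    unsigned seed;
};

/* Copy field by field; memcpy is only safe per array, never for the whole struct. */
static void example_convert_grain(struct example_dst_grain *dst,
                                  const struct example_src_grain *src)
{
    /* The per-array memcpy relies on both arrays having the same size. */
    assert(sizeof(dst->points_y) == sizeof(src->y_points));

    dst->seed = src->seed;
    dst->num_points_y = src->num_y_points;
    memcpy(dst->points_y, src->y_points, sizeof(dst->points_y));
    dst->scaling_shift = src->scaling_shift;
}
```

Copying the whole struct in one go would put `seed` where the destination expects `num_points_y`, so the explicit per-field assignment is the price of the layout mismatch.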
    • dav1dplay: handle all supported csps/reprs/bitdepths · 7bbebdb4
      Niklas Haas authored
      Generalize the code to set the right pl_image metadata based on the
      values signaled in the Dav1dPictureParameters / Dav1dSequenceHeader.
      
      Some values are not mapped, in which case stdout will be spammed.
      Whatever. Hopefully somebody sees that error spam and opens a bug report
      for libplacebo to implement it.
    • dav1dplay: move and simplify pl_image generation · f01fd0f1
      Niklas Haas authored
      Having the pl_image generation live in upload_planes() rather than
      render() will make it easier to set the correct pl_image metadata based
      on the Dav1dPicture headers moving forwards. Rename the function to make
      more sense, semantically.
      
      Reduce some code duplication by turning per-plane fields into arrays
      wherever appropriate.
      
      As an aside, also apply the correct chroma location rather than
      hard-coding it as PL_CHROMA_LEFT.
    • dav1dplay: don't write directly to iparams.extensions · 3bb0aed1
      Niklas Haas authored
      This is turned into a const array in upstream libplacebo, which
      generates warnings due to the implicit cast. Rewrite the code to have
      the mutable array live inside a separate variable `extensions` and only
      set `iparams.extensions` to this, rather than directly manipulating it.
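
The pattern described above can be sketched as follows. The struct `example_inst_params`, the limit `EXAMPLE_MAX_EXTS` and the function name are hypothetical. Only the idea is taken from the commit message: a mutable array is filled first, and the const-qualified field is then pointed at it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a params struct whose array field is const. */
struct example_inst_params {
    const char * const *extensions;
    int num_extensions;
};

#define EXAMPLE_MAX_EXTS 8

/* Fill the caller-owned mutable array, then publish it through the const field.
 * `extensions` must hold at least EXAMPLE_MAX_EXTS entries and must outlive
 * every use of `iparams`. */
static int example_fill_params(struct example_inst_params *iparams,
                               const char **extensions,
                               const char * const *wanted, int num_wanted)
{
    assert(iparams != NULL && extensions != NULL);

    int num = 0;
    for (int i = 0; i < num_wanted && num < EXAMPLE_MAX_EXTS; i++)
        extensions[num++] = wanted[i]; /* writes go to the mutable array only */

    /* Adding const at this level is an implicit, warning-free conversion. */
    iparams->extensions = extensions;
    iparams->num_extensions = num;
    return num;
}
```

Writing through `iparams->extensions[i]` directly would not compile with the const-qualified field, which is why the writes go through the separate array.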
  14. May 16, 2020
  15. May 15, 2020
  16. May 14, 2020