  1. Jan 22, 2023
  2. Jan 12, 2023
  3. Dec 14, 2022
  4. Dec 13, 2022
  5. Dec 09, 2022
  6. Dec 04, 2022
  7. Nov 21, 2022
  8. Nov 10, 2022
  9. Oct 30, 2022
  10. Oct 27, 2022
  11. Oct 26, 2022
  12. Oct 20, 2022
    • threading: Fix a race around frame completion (frame-mt) · 3e7886db
      Victorien Le Couviour--Tuffet authored
      The completion of the first frame to decode while an async reset
      request on that same frame is pending will render it stale. The
      processing of such a stale request is likely to result in a hang.
      
      One reason this happens is the skip condition at the beginning of
      reset_task_cur().
      => Consume the async request before that check.
      
      Another reason is several threads producing async reset requests in
      parallel: an async request for the first frame could cascade through the
      other threads (other frames) during completion of that frame, and thus
      not be caught by the last synchronous reset_task_cur() call made after
      signaling the main thread and before releasing the lock.
      => To solve this, add protections at the racy locations: after we
      increase `first`, before returning from reset_task_cur_async(), and
      after consuming the async request.
      3e7886db
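The first fix above (consume the async request before the skip check) can be illustrated with a minimal single-file sketch. Everything here is hypothetical and heavily simplified: `struct sched`, its fields and `reset_cursor()` are illustrative names rather than dav1d's actual code, and frame indices are assumed to be monotonically increasing counters.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>

/* Hypothetical, simplified scheduler state; not dav1d's real layout. */
struct sched {
    atomic_uint pending_reset; /* frame index of a pending async reset, UINT_MAX = none */
    atomic_uint first;         /* index of the oldest frame still being decoded */
    int have_work;             /* non-zero if there is a task cursor worth resetting */
};

/* Returns 1 if the task cursor would be reset, 0 if there was nothing to do. */
static int reset_cursor(struct sched *s)
{
    /* Consume the request first, so it cannot outlive an early return. */
    const unsigned req = atomic_exchange(&s->pending_reset, UINT_MAX);
    const unsigned first = atomic_load(&s->first);

    /* Skip condition: evaluated only after the request has been consumed. */
    if (!s->have_work)
        return 0;

    /* No request, or a request for a frame that already completed (stale). */
    if (req == UINT_MAX || req < first)
        return 0;

    /* ... a real implementation would rewind its task cursor here ... */
    return 1;
}
```

Because the exchange happens unconditionally, a request that arrives for a frame that then completes is dropped rather than left pending. The additional protections for requests produced in parallel by several threads are not modelled in this sketch.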
  13. Oct 10, 2022
    • Handle host_machine.system() 'ios' and 'tvos' the same way as 'darwin' · 5b07b425
      Sebastian Dröge authored
      Despite not being documented in Meson's list of canonical system names,
      Meson does accept 'ios', mostly as a synonym for 'darwin'.
      
      Using 'ios' instead of 'darwin' allows distinguishing between the
      two in the cases where that is necessary. Therefore, within dav1d, allow
      using the 'ios' name as an alias for 'darwin' for the system name, to
      allow using cross files that make this distinction.
      
      Meson itself also allows 'tvos' in addition to 'ios' in the internal
      `is_darwin()` function; as such, all 3 are handled the same here.
      5b07b425
  14. Sep 30, 2022
  15. Sep 28, 2022
  16. Sep 26, 2022
  17. Sep 19, 2022
    • arm: itx: Add clipping to row_clip_min/max in the 10 bpc codepaths · 345127a7
      Martin Storsjö authored
      This fixes conformance with the argon test samples, in particular
      with these samples:
          profile0_core/streams/test10100_579_8614.obu
          profile0_core/streams/test10218_6914.obu
      
      This gives a pretty notable slowdown to these transforms - some
      examples:
      
      Before:                                 Cortex A53       A72       A73    Apple M1
      inv_txfm_add_8x8_dct_dct_1_10bpc_neon:       365.7     290.2     299.8    0.3
      inv_txfm_add_16x16_dct_dct_2_10bpc_neon:    1865.2    1384.1    1457.5    2.6
      inv_txfm_add_64x64_dct_dct_4_10bpc_neon:   33976.3   26817.0   24864.2   40.4
      After:
      inv_txfm_add_8x8_dct_dct_1_10bpc_neon:       397.7     322.2     335.1    0.4
      inv_txfm_add_16x16_dct_dct_2_10bpc_neon:    2121.9    1336.7    1664.6    2.6
      inv_txfm_add_64x64_dct_dct_4_10bpc_neon:   38569.4   27622.6   28176.0   51.0
      
      Thus, for the transforms alone, it makes them around 10-13% slower
      (the Apple M1 measurements are too noisy to be conclusive here).
      
      Measured on actual full decoding, it makes decoding of 10 bpc
      Chimera roughly 1% slower on an Apple M1, which is close to
      measurement noise anyway.
      345127a7
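As a rough illustration of the kind of clipping this commit adds, the sketch below clamps an intermediate value to a signed range of (bit depth + 8) bits. That range is an assumption based on the row clamp range of the AV1 specification's 2D inverse transform process, and the helper names `row_clip_min()`, `row_clip_max()` and `clip_row()` are hypothetical, not taken from the dav1d sources.

```c
#include <assert.h>

/* Assumed row clamp range: signed integers of (bitdepth + 8) bits. */
static int row_clip_min(int bitdepth) { return -(1 << (bitdepth + 7)); }
static int row_clip_max(int bitdepth) { return (1 << (bitdepth + 7)) - 1; }

/* Clamp one intermediate coefficient of the row transform. */
static int clip_row(int v, int bitdepth)
{
    const int lo = row_clip_min(bitdepth);
    const int hi = row_clip_max(bitdepth);
    return v < lo ? lo : v > hi ? hi : v;
}
```

Under this assumption, 10 bpc content is clamped to [-131072, 131071]. The extra clamp in each transform is consistent with the per-function slowdown reported above.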
    • Henrik Gramner's avatar
      9c74a9b0
    • x86: Fix overflows in 12bpc AVX2 DC-only IDCT · 49b1c3c5
      Henrik Gramner authored
      Using smaller immediates also results in a small code size reduction in
      some cases, so apply those changes to the (10bpc-only) SSE code as well.
      49b1c3c5
    • x86: Fix clipping in high bit-depth AVX2 4x16 IDCT · 0c8a3461
      Henrik Gramner authored
      Certain clips were incorrectly performed on negated values, which
      caused things to be off-by-one in both directions. Correct this by
      negating such values prior to clipping instead of afterwards.
      0c8a3461
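Why clipping a negated value is off by one in both directions can be seen with a small scalar sketch. The range [-8, 7] is an illustrative asymmetric two's-complement range rather than the one used by the IDCT, and the function names are hypothetical.

```c
#include <assert.h>

static int clip(int v, int lo, int hi) { return v < lo ? lo : v > hi ? hi : v; }

/* Buggy order: clip the negated value, negate afterwards.
 * The result lands in [-hi, -lo] instead of [lo, hi]. */
static int clip_then_negate(int neg_x, int lo, int hi)
{
    return -clip(neg_x, lo, hi);
}

/* Fixed order: negate first, then clip the value that is actually stored. */
static int negate_then_clip(int neg_x, int lo, int hi)
{
    return clip(-neg_x, lo, hi);
}
```

With lo = -8 and hi = 7, an input of -9 (representing x = 9) yields 8 in the buggy order, one above the maximum, while an input of 8 (representing x = -8) yields -7, one short of the minimum. The fixed order gives 7 and -8 respectively.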
  18. Sep 15, 2022
    • Don't use gas-preprocessor with clang-cl for arm targets · cc9651f5
      Martin Storsjö authored
      Since meson 0.58.0 (released in May 2021), meson accepts adding '.S'
      assembly files as source files to the clang-cl compiler.
      
      If using an older version of meson, keep using gas-preprocessor
      just like for MSVC builds.
      cc9651f5
    • Fix checking the reference dimensions for the projection process · d4a2b75d
      David Conrad authored
      Section 7.9.2 returns 0 "If RefMiRows[ srcIdx ] is not equal to MiRows,
      RefMiCols[ srcIdx ] is not equal to MiCols"
      
      dav1d was comparing pixel width/height, not block width/height.
      Compare the block dimensions instead, to conform with the spec.
      d4a2b75d
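The difference between the two comparisons shows up whenever two frames differ in pixel size but round to the same number of 4x4 block units. The sketch below assumes the AV1 specification's definition MiCols = 2 * ((FrameWidth + 7) >> 3), and likewise for MiRows; the function names are hypothetical and not taken from dav1d.

```c
#include <assert.h>

/* Frame dimension in 4x4 block units, rounded up to a multiple of 8 pixels. */
static int mi_dim(int pixels) { return 2 * ((pixels + 7) >> 3); }

/* Comparison on pixel dimensions (what the commit says dav1d used to do). */
static int same_pixel_dims(int w0, int h0, int w1, int h1)
{
    return w0 == w1 && h0 == h1;
}

/* Comparison on MiCols/MiRows (what the quoted spec text requires). */
static int same_mi_dims(int w0, int h0, int w1, int h1)
{
    return mi_dim(w0) == mi_dim(w1) && mi_dim(h0) == mi_dim(h1);
}
```

For example, a 1913x1080 reference and a 1920x1080 current frame differ in pixel width but both have MiCols = 480, so only the block-based comparison treats them as matching.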
    • Fix calculation of OBMC lap dimensions · eb25f00c
      David Conrad authored
      Individual OBMC lapped predictions have a max width of 64 pixels
      for the top laps and a max height of 64 pixels for the left laps.
      
      This is 7.11.3.9. Overlapped motion compensation process:
      step4 = Clip3( 2, 16, Num_4x4_Blocks_Wide[ candSz ] )
      
      dav1d wasn't clipping this as needed, which means that with scaled MC,
      the interpolation of the 2nd half of a 128 block was incorrect, since
      mx/my for subpel filter selection need to be reset at the 64 pixel
      boundary.
      eb25f00c
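The effect of the quoted Clip3 can be sketched as follows. `clip3()` mirrors the spec's Clip3( lo, hi, x ) argument order, while `obmc_step_px()` is a hypothetical helper that converts the clipped 4x4-unit count back to pixels for a candidate block of a given width.

```c
#include <assert.h>

/* Clip3( lo, hi, x ) as written in the spec formula above. */
static int clip3(int lo, int hi, int x) { return x < lo ? lo : x > hi ? hi : x; }

/* Width in pixels of one top-lap prediction for a candidate block that is
 * cand_w_px pixels wide: step4 is counted in 4-pixel units. */
static int obmc_step_px(int cand_w_px)
{
    const int step4 = clip3(2, 16, cand_w_px / 4);
    return 4 * step4;
}
```

A 128-pixel-wide candidate therefore yields 64-pixel steps, so it is covered by two separate 64-pixel predictions rather than a single 128-pixel one, while a 4-pixel-wide candidate is raised to the minimum of 8 pixels.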