  1. Jan 19, 2025
  2. Oct 18, 2024
  3. Sep 17, 2024
  4. Sep 12, 2024
  5. Sep 10, 2024
  6. Sep 05, 2024
  7. Aug 29, 2024
  8. May 25, 2024
  9. Apr 02, 2024
    • checkasm: Add support for the private macOS kperf API for benchmarking · 5e31720b
      Martin Storsjö authored
      
      On AArch64, the performance counter registers usually are
      restricted and not accessible from user space.
      
      On macOS, we currently use mach_absolute_time() as timer on
      aarch64. This measures wallclock time but with a very coarse
      resolution.
      
      There is a private API, kperf, that one can use for getting
      high precision timers though. Unfortunately, it requires running
      the checkasm binary as root (e.g. with sudo).
      
      Also, as it is a private, undocumented API, it can potentially
      change at any time.
      
      This is handled by adding a new meson build option, for switching
      to this timer. If the timer source in checkasm could be changed
      at runtime with an option, this wouldn't need to be a build time
      option.
      
      This allows getting benchmarks like this:
      
      mc_8tap_regular_w16_hv_8bpc_c:              1522.1 ( 1.00x)
      mc_8tap_regular_w16_hv_8bpc_neon:            331.8 ( 4.59x)
      
      Instead of this:
      
      mc_8tap_regular_w16_hv_8bpc_c:                 9.0 ( 1.00x)
      mc_8tap_regular_w16_hv_8bpc_neon:              1.9 ( 4.76x)
      
      Co-authored-by: J. Dekker <jdek@itanimul.li>
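The two benchmark listings above can be compared with a little arithmetic. The sketch below is not part of checkasm; the function names are hypothetical and the only inputs are the printed figures. It shows how the speedup column relates to the timing column, and how much the ratio can move when the timings are only known to one printed decimal.

```c
#include <assert.h>

/* Speedup of an optimized function relative to the C reference.
 * Both arguments are in the same (arbitrary) timer units. */
double speedup(double ref_units, double opt_units) {
    assert(opt_units > 0.0);
    return ref_units / opt_units;
}

/* Width of the interval the ratio can fall in when each printed value
 * is only known to +/- half_step (0.05 for one printed decimal).
 * This is an illustration of rounding sensitivity, not a checkasm metric. */
double speedup_spread(double ref_units, double opt_units, double half_step) {
    assert(opt_units > half_step);
    double highest = (ref_units + half_step) / (opt_units - half_step);
    double lowest  = (ref_units - half_step) / (opt_units + half_step);
    return highest - lowest;
}
```

With the kperf figures, `speedup(1522.1, 331.8)` is about 4.59, matching the printed column, and the rounding spread is below 0.01. With the mach_absolute_time() figures, `speedup(9.0, 1.9)` is about 4.74 while the listing prints 4.76x, and the rounding spread is about 0.30. The small printed values therefore carry too few significant digits to reproduce the ratio.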
  10. Mar 08, 2024
  11. Mar 04, 2024
    • aarch64: Check for assembler support for various aarch64 extensions · e1f80dec
      Martin Storsjö authored
      First check if the assembler supports the ".arch" directive, and
      what architecture levels are supported.
      
      In principle, we'd only need to check for support for ".arch armv8.2-a",
      since that's enough for enabling the i8mm and sve2 extensions.
      
      However, recent Clang versions (before version 17) weren't able to
      enable the dotprod and i8mm extensions via the ".arch_extension"
      directives, so check for support for armv8.4-a and armv8.6-a as well,
      which enable dotprod and i8mm implicitly.
      
      This allows assembling these instructions on most commonly available
      GCC and Clang based toolchains, while still allowing toggling support
      for the instruction sets on and off within the source files.
      
      Within assembly, we disable these extensions by default, so that
      instructions enabled within these extension sets can't be used
      by accident in unintended functions. Code meaning to use these
      extensions can be assembled like this:
      
          #if HAVE_SVE
          ENABLE_SVE
          // code
          DISABLE_SVE
          #endif
  12. Feb 20, 2024
  13. Feb 13, 2024
  14. Jan 31, 2024
  15. Jan 30, 2024
  16. Jan 21, 2024
  17. Oct 03, 2023
  18. Jun 06, 2023
  19. Jun 01, 2023
  20. May 02, 2023
  21. Apr 18, 2023
    • picture: allow storing an array of Dav1dITUTT35 entries · feeeccb6
      James Almer authored
      Nothing in the spec prevents a Temporal Unit from having more than one Metadata
      OBU of type ITU-T T.35, so export them as an array instead of only exporting
      the last one we parse.
      This is backwards compatible with the previous implementation, as users unaware
      of this change can ignore the n_itut_t35 field and still access the first (or
      only) entry in the array as they have been doing until now.
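The backwards-compatibility argument can be sketched in C. The types below are simplified, hypothetical stand-ins and not the real dav1d headers. Only the `n_itut_t35` field name comes from the commit message; the other field names are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one ITU-T T.35 metadata entry. */
typedef struct {
    uint8_t country_code;
    size_t payload_size;
    const uint8_t *payload;
} ExampleITUTT35;

/* Hypothetical stand-in for the picture struct: the pointer that used to
 * reference a single entry now references the first element of an array. */
typedef struct {
    ExampleITUTT35 *itut_t35; /* first entry, or NULL if there is none */
    size_t n_itut_t35;        /* number of entries in the array */
} ExamplePicture;

/* Old-style access: ignores n_itut_t35 and reads only the first entry. */
size_t first_payload_size(const ExamplePicture *pic) {
    assert(pic != NULL);
    return pic->itut_t35 ? pic->itut_t35->payload_size : 0;
}

/* New-style access: walks every entry exported for the picture. */
size_t total_payload_size(const ExamplePicture *pic) {
    assert(pic != NULL);
    size_t total = 0;
    for (size_t i = 0; i < pic->n_itut_t35; i++)
        total += pic->itut_t35[i].payload_size;
    return total;
}
```

Because a pointer to an array is also a pointer to its first element, code written against the single-entry layout keeps compiling and keeps reading a valid entry. It just does not see any further entries.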
  22. Apr 12, 2023
  23. Apr 08, 2023
    • picture: allow storing an array of Dav1dITUTT35 entries · 62f8b887
      James Almer authored and Jean-Baptiste Kempf committed
      
      Nothing in the spec prevents a Temporal Unit from having more than one Metadata
      OBU of type ITU-T T.35, so export them as an array instead of only exporting
      the last one we parse.
      This is backwards compatible with the previous implementation, as users unaware
      of this change can ignore the n_itut_t35 field and still access the first (or
      only) entry in the array as they have been doing until now.
      
      Signed-off-by: James Almer <jamrial@gmail.com>
  24. Mar 03, 2023
  25. Feb 14, 2023
  26. Jan 31, 2023
    • checkasm: Add an --affinity= option for selecting a CPU core · 77b39555
      Martin Storsjö authored
      Add an option for selecting the core where the single thread of
      checkasm runs. This allows benchmarking on specific CPU cores on
      heterogeneous CPUs, like ARM big.LITTLE configurations.
      
      On Linux, one can easily wrap an invocation of checkasm with
      "taskset -c <n> [...]" - so this option isn't very essential
      there - however it is quite useful on Windows.
      
      On Windows, it is somewhat possible to do the same by launching
      the tool with "start /B /affinity <hexmask> [...]", but that
      doesn't work well with scripting ("start" returns before the
      command has finished running, and it's not obvious how to
      invoke "start" from within WSL).
      
      Using "taskset" to launch processes on specific cores within WSL
      on Windows doesn't work - regardless of the Linux level affinity,
      the process ends up running on the performance cores anyway.
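The message mentions two ways of naming a core: a zero-based index for "taskset -c <n>" and a hexadecimal mask for "start /B /affinity <hexmask>". The following is a minimal sketch of how the two relate, assuming a single core and at most 64 cores. The helper names are hypothetical and this is not how checkasm itself implements the option.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Bit mask with only the bit for the given zero-based core index set. */
uint64_t affinity_mask_for_core(unsigned core) {
    assert(core < 64); /* assumption: at most 64 cores fit in one mask */
    return (uint64_t)1 << core;
}

/* Write the mask for one core as a hexadecimal string without a prefix,
 * the form a hex mask argument would take. Returns the number of
 * characters written, or a negative value if buf is too small. */
int format_affinity_hexmask(unsigned core, char *buf, size_t len) {
    unsigned long long mask = affinity_mask_for_core(core);
    int n = snprintf(buf, len, "%llx", mask);
    return (n >= 0 && (size_t)n < len) ? n : -1;
}
```

For example, core index 4 corresponds to the mask with bit 4 set, which is 10 in hexadecimal.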
  27. Jan 12, 2023
  28. Dec 14, 2022
  29. Oct 30, 2022
  30. Oct 10, 2022
    • Handle host_machine.system() 'ios' and 'tvos' the same way as 'darwin' · 5b07b425
      Sebastian Dröge authored
      Despite not being documented in Meson's list of canonical system names,
      Meson does accept 'ios', mostly as a synonym for 'darwin'.
      
      Using 'ios' instead of 'darwin' allows distinguishing between the
      two in the cases where that is necessary. Therefore, within dav1d, allow
      using the 'ios' name as an alias for 'darwin' as the system name, to allow
      using cross files that make this distinction.
      
      Meson itself also allows 'tvos' in addition to 'ios' in the internal
      `is_darwin()` function; as such, all 3 are handled the same here.
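The rule described in this commit lives in dav1d's Meson build files. As an illustration only, the same check is rendered here in C with a hypothetical helper name. It treats the three system names as one family.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* True when the given system name should be handled like 'darwin'.
 * The three accepted names are the ones listed in the commit message. */
bool is_darwin_like(const char *system) {
    assert(system != NULL);
    return strcmp(system, "darwin") == 0 ||
           strcmp(system, "ios") == 0 ||
           strcmp(system, "tvos") == 0;
}
```

A cross file can then say 'ios' or 'tvos' where it needs to make the distinction, while every darwin-specific build decision still applies.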
  31. Sep 28, 2022
  32. Sep 15, 2022
  33. Sep 10, 2022