  1. Feb 02, 2022
  2. Jan 31, 2022
  3. Jan 30, 2022
  4. Jan 25, 2022
  5. Jan 24, 2022
    • x86/itx: Add 16x16 12bpc AVX2 transforms · 8e8148c1
      Matthias Dressel authored
      inv_txfm_add_16x16_adst_adst_0_12bpc_c: 8990.0
      inv_txfm_add_16x16_adst_adst_0_12bpc_avx2: 646.1
      inv_txfm_add_16x16_adst_adst_1_12bpc_c: 8965.3
      inv_txfm_add_16x16_adst_adst_1_12bpc_avx2: 646.9
      inv_txfm_add_16x16_adst_adst_2_12bpc_c: 8983.2
      inv_txfm_add_16x16_adst_adst_2_12bpc_avx2: 870.1
      inv_txfm_add_16x16_adst_dct_0_12bpc_c: 9058.2
      inv_txfm_add_16x16_adst_dct_0_12bpc_avx2: 548.8
      inv_txfm_add_16x16_adst_dct_1_12bpc_c: 9092.7
      inv_txfm_add_16x16_adst_dct_1_12bpc_avx2: 549.3
      inv_txfm_add_16x16_adst_dct_2_12bpc_c: 9086.7
      inv_txfm_add_16x16_adst_dct_2_12bpc_avx2: 775.5
      inv_txfm_add_16x16_adst_flipadst_0_12bpc_c: 9083.4
      inv_txfm_add_16x16_adst_flipadst_0_12bpc_avx2: 645.6
      inv_txfm_add_16x16_adst_flipadst_1_12bpc_c: 8998.3
      inv_txfm_add_16x16_adst_flipadst_1_12bpc_avx2: 646.2
      inv_txfm_add_16x16_adst_flipadst_2_12bpc_c: 9014.7
      inv_txfm_add_16x16_adst_flipadst_2_12bpc_avx2: 873.8
      inv_txfm_add_16x16_dct_adst_0_12bpc_c: 9080.1
      inv_txfm_add_16x16_dct_adst_0_12bpc_avx2: 598.2
      inv_txfm_add_16x16_dct_adst_1_12bpc_c: 9103.3
      inv_txfm_add_16x16_dct_adst_1_12bpc_avx2: 598.1
      inv_txfm_add_16x16_dct_adst_2_12bpc_c: 9089.5
      inv_txfm_add_16x16_dct_adst_2_12bpc_avx2: 764.4
      inv_txfm_add_16x16_dct_dct_0_12bpc_c: 1042.1
      inv_txfm_add_16x16_dct_dct_0_12bpc_avx2: 28.6
      inv_txfm_add_16x16_dct_dct_1_12bpc_c: 9164.6
      inv_txfm_add_16x16_dct_dct_1_12bpc_avx2: 500.8
      inv_txfm_add_16x16_dct_dct_2_12bpc_c: 9161.9
      inv_txfm_add_16x16_dct_dct_2_12bpc_avx2: 678.2
      inv_txfm_add_16x16_dct_flipadst_0_12bpc_c: 9104.9
      inv_txfm_add_16x16_dct_flipadst_0_12bpc_avx2: 601.8
      inv_txfm_add_16x16_dct_flipadst_1_12bpc_c: 9248.6
      inv_txfm_add_16x16_dct_flipadst_1_12bpc_avx2: 599.2
      inv_txfm_add_16x16_dct_flipadst_2_12bpc_c: 9087.4
      inv_txfm_add_16x16_dct_flipadst_2_12bpc_avx2: 770.1
      inv_txfm_add_16x16_dct_identity_0_12bpc_c: 6570.4
      inv_txfm_add_16x16_dct_identity_0_12bpc_avx2: 243.9
      inv_txfm_add_16x16_dct_identity_1_12bpc_c: 6615.4
      inv_txfm_add_16x16_dct_identity_1_12bpc_avx2: 246.0
      inv_txfm_add_16x16_dct_identity_2_12bpc_c: 6553.4
      inv_txfm_add_16x16_dct_identity_2_12bpc_avx2: 435.0
      inv_txfm_add_16x16_flipadst_adst_0_12bpc_c: 8982.1
      inv_txfm_add_16x16_flipadst_adst_0_12bpc_avx2: 647.2
      inv_txfm_add_16x16_flipadst_adst_1_12bpc_c: 8978.9
      inv_txfm_add_16x16_flipadst_adst_1_12bpc_avx2: 647.2
      inv_txfm_add_16x16_flipadst_adst_2_12bpc_c: 8964.0
      inv_txfm_add_16x16_flipadst_adst_2_12bpc_avx2: 868.4
      inv_txfm_add_16x16_flipadst_dct_0_12bpc_c: 9083.5
      inv_txfm_add_16x16_flipadst_dct_0_12bpc_avx2: 550.0
      inv_txfm_add_16x16_flipadst_dct_1_12bpc_c: 9070.4
      inv_txfm_add_16x16_flipadst_dct_1_12bpc_avx2: 550.2
      inv_txfm_add_16x16_flipadst_dct_2_12bpc_c: 9085.8
      inv_txfm_add_16x16_flipadst_dct_2_12bpc_avx2: 779.7
      inv_txfm_add_16x16_flipadst_flipadst_0_12bpc_c: 8977.1
      inv_txfm_add_16x16_flipadst_flipadst_0_12bpc_avx2: 657.3
      inv_txfm_add_16x16_flipadst_flipadst_1_12bpc_c: 9002.0
      inv_txfm_add_16x16_flipadst_flipadst_1_12bpc_avx2: 657.3
      inv_txfm_add_16x16_flipadst_flipadst_2_12bpc_c: 9008.4
      inv_txfm_add_16x16_flipadst_flipadst_2_12bpc_avx2: 872.0
      inv_txfm_add_16x16_identity_dct_0_12bpc_c: 6504.7
      inv_txfm_add_16x16_identity_dct_0_12bpc_avx2: 387.5
      inv_txfm_add_16x16_identity_dct_1_12bpc_c: 6548.3
      inv_txfm_add_16x16_identity_dct_1_12bpc_avx2: 387.5
      inv_txfm_add_16x16_identity_dct_2_12bpc_c: 6512.4
      inv_txfm_add_16x16_identity_dct_2_12bpc_avx2: 387.5
      inv_txfm_add_16x16_identity_identity_0_12bpc_c: 3926.2
      inv_txfm_add_16x16_identity_identity_0_12bpc_avx2: 135.0
      inv_txfm_add_16x16_identity_identity_1_12bpc_c: 3896.7
      inv_txfm_add_16x16_identity_identity_1_12bpc_avx2: 134.5
      inv_txfm_add_16x16_identity_identity_2_12bpc_c: 3888.0
      inv_txfm_add_16x16_identity_identity_2_12bpc_avx2: 230.3
      8e8148c1
    • x86: Add mc.resize AVX-512 (Ice Lake) asm · 4a52aa47
      Victorien Le Couviour--Tuffet authored
      resize_8bpc_c: 542599.0
      resize_8bpc_ssse3: 87635.4
      resize_8bpc_avx2: 67401.1
      resize_8bpc_avx512icl: 50263.6
      
      resize_16bpc_c: 573438.9
      resize_16bpc_ssse3: 121505.2
      resize_16bpc_avx2: 83293.4
      resize_16bpc_avx512icl: 77974.8
      4a52aa47
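The timings above are easier to compare as ratios against the C reference. The following is a minimal sketch, not part of the commit: it assumes each line has the form `name_impl: value`, that a lower value is faster, and that the implementation suffix is one of the four seen in this log. The helper names `parse_checkasm` and `speedups` are hypothetical.

```python
from collections import defaultdict

# Implementation suffixes that appear in this log (an assumption of this sketch).
KNOWN_IMPLS = ("c", "ssse3", "avx2", "avx512icl")

SAMPLE = """\
resize_8bpc_c: 542599.0
resize_8bpc_ssse3: 87635.4
resize_8bpc_avx2: 67401.1
resize_8bpc_avx512icl: 50263.6
"""

def parse_checkasm(text):
    """Group timings by function name, keyed by implementation suffix."""
    results = defaultdict(dict)
    for line in text.splitlines():
        line = line.strip()
        if ":" not in line:
            continue  # skip blank lines and anything that is not "name: value"
        name, value = line.rsplit(":", 1)
        base, _, impl = name.strip().rpartition("_")
        if impl not in KNOWN_IMPLS:
            continue  # not a benchmark line
        results[base][impl] = float(value)
    return dict(results)

def speedups(timings):
    """Return the C-reference timing divided by each other timing."""
    reference = timings["c"]
    return {impl: reference / t for impl, t in timings.items() if impl != "c"}

if __name__ == "__main__":
    for function, timings in parse_checkasm(SAMPLE).items():
        for impl, ratio in speedups(timings).items():
            print(f"{function} {impl}: {ratio:.2f}x vs. C")
```

The same two helpers can be applied to any of the other benchmark blocks in this log by pasting the lines into `SAMPLE`.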
    • Split the frame init task · a8f3124a
      Victorien Le Couviour--Tuffet authored
      Allows running most of dav1d_decode_frame_init unconditionally by putting the
      CDF and subsequent initializations in a separate task.
      a8f3124a
  6. Jan 19, 2022
    • 1e3f0bea
    • Fix current frame selector wrapping condition · 6aaeeea6
      Victorien Le Couviour--Tuffet authored
      This could cause a desync between first and cur, which results in
      skipping a frame and halting the decoding.
      In the current state of the project, this desync typically doesn't last
      "long enough" to trigger the bug, as some frames would restore cur.
      To trigger it, one needs to call reset_task_cur() on the last
      frame; this would be the call made after insertion of the INIT task (during
      dav1d_task_frame_init).
      This doesn't happen in practice, as we would normally pick a task from a
      previous frame already in the queue.
      6aaeeea6
  7. Jan 18, 2022
  8. Jan 17, 2022
  9. Jan 14, 2022
  10. Jan 13, 2022
  11. Jan 12, 2022
    • x86: Add high bitdepth mc(t)_scaled SSSE3 asm · 5919517f
      Victorien Le Couviour--Tuffet authored
      mc_scaled_8tap_regular_w2_16bpc_c: 737.7
      mc_scaled_8tap_regular_w2_16bpc_ssse3: 151.7
      mc_scaled_8tap_regular_w2_16bpc_avx2: 141.2
      mc_scaled_8tap_regular_w2_dy1_16bpc_c: 660.3
      mc_scaled_8tap_regular_w2_dy1_16bpc_ssse3: 80.8
      mc_scaled_8tap_regular_w2_dy1_16bpc_avx2: 73.2
      mc_scaled_8tap_regular_w2_dy2_16bpc_c: 884.9
      mc_scaled_8tap_regular_w2_dy2_16bpc_ssse3: 101.6
      mc_scaled_8tap_regular_w2_dy2_16bpc_avx2: 87.2
      mc_scaled_8tap_regular_w4_16bpc_c: 1356.3
      mc_scaled_8tap_regular_w4_16bpc_ssse3: 172.3
      mc_scaled_8tap_regular_w4_16bpc_avx2: 172.5
      mc_scaled_8tap_regular_w4_dy1_16bpc_c: 1244.9
      mc_scaled_8tap_regular_w4_dy1_16bpc_ssse3: 125.7
      mc_scaled_8tap_regular_w4_dy1_16bpc_avx2: 96.1
      mc_scaled_8tap_regular_w4_dy2_16bpc_c: 1665.6
      mc_scaled_8tap_regular_w4_dy2_16bpc_ssse3: 150.2
      mc_scaled_8tap_regular_w4_dy2_16bpc_avx2: 112.8
      mc_scaled_8tap_regular_w8_16bpc_c: 2536.5
      mc_scaled_8tap_regular_w8_16bpc_ssse3: 383.4
      mc_scaled_8tap_regular_w8_16bpc_avx2: 256.2
      mc_scaled_8tap_regular_w8_dy1_16bpc_c: 2331.8
      mc_scaled_8tap_regular_w8_dy1_16bpc_ssse3: 350.0
      mc_scaled_8tap_regular_w8_dy1_16bpc_avx2: 214.0
      mc_scaled_8tap_regular_w8_dy2_16bpc_c: 3169.6
      mc_scaled_8tap_regular_w8_dy2_16bpc_ssse3: 395.7
      mc_scaled_8tap_regular_w8_dy2_16bpc_avx2: 265.7
      mc_scaled_8tap_regular_w16_16bpc_c: 6384.6
      mc_scaled_8tap_regular_w16_16bpc_ssse3: 1004.4
      mc_scaled_8tap_regular_w16_16bpc_avx2: 665.0
      mc_scaled_8tap_regular_w16_dy1_16bpc_c: 6103.4
      mc_scaled_8tap_regular_w16_dy1_16bpc_ssse3: 896.3
      mc_scaled_8tap_regular_w16_dy1_16bpc_avx2: 544.2
      mc_scaled_8tap_regular_w16_dy2_16bpc_c: 8584.5
      mc_scaled_8tap_regular_w16_dy2_16bpc_ssse3: 1049.0
      mc_scaled_8tap_regular_w16_dy2_16bpc_avx2: 695.1
      mc_scaled_8tap_regular_w32_16bpc_c: 19672.8
      mc_scaled_8tap_regular_w32_16bpc_ssse3: 3204.3
      mc_scaled_8tap_regular_w32_16bpc_avx2: 2109.6
      mc_scaled_8tap_regular_w32_dy1_16bpc_c: 15964.6
      mc_scaled_8tap_regular_w32_dy1_16bpc_ssse3: 2634.5
      mc_scaled_8tap_regular_w32_dy1_16bpc_avx2: 1555.8
      mc_scaled_8tap_regular_w32_dy2_16bpc_c: 24156.9
      mc_scaled_8tap_regular_w32_dy2_16bpc_ssse3: 3217.3
      mc_scaled_8tap_regular_w32_dy2_16bpc_avx2: 2088.8
      mc_scaled_8tap_regular_w64_16bpc_c: 74356.3
      mc_scaled_8tap_regular_w64_16bpc_ssse3: 11225.9
      mc_scaled_8tap_regular_w64_16bpc_avx2: 7434.7
      mc_scaled_8tap_regular_w64_dy1_16bpc_c: 60080.9
      mc_scaled_8tap_regular_w64_dy1_16bpc_ssse3: 8912.8
      mc_scaled_8tap_regular_w64_dy1_16bpc_avx2: 5222.2
      mc_scaled_8tap_regular_w64_dy2_16bpc_c: 88891.4
      mc_scaled_8tap_regular_w64_dy2_16bpc_ssse3: 10824.8
      mc_scaled_8tap_regular_w64_dy2_16bpc_avx2: 7086.3
      mc_scaled_8tap_regular_w128_16bpc_c: 171633.3
      mc_scaled_8tap_regular_w128_16bpc_ssse3: 27089.3
      mc_scaled_8tap_regular_w128_16bpc_avx2: 17998.2
      mc_scaled_8tap_regular_w128_dy1_16bpc_c: 164399.9
      mc_scaled_8tap_regular_w128_dy1_16bpc_ssse3: 24694.1
      mc_scaled_8tap_regular_w128_dy1_16bpc_avx2: 14711.2
      mc_scaled_8tap_regular_w128_dy2_16bpc_c: 244865.3
      mc_scaled_8tap_regular_w128_dy2_16bpc_ssse3: 30599.1
      mc_scaled_8tap_regular_w128_dy2_16bpc_avx2: 20341.1
      
      mct_scaled_8tap_regular_w4_16bpc_c: 946.2
      mct_scaled_8tap_regular_w4_16bpc_ssse3: 117.5
      mct_scaled_8tap_regular_w4_16bpc_avx2: 112.5
      mct_scaled_8tap_regular_w4_dy1_16bpc_c: 886.1
      mct_scaled_8tap_regular_w4_dy1_16bpc_ssse3: 100.5
      mct_scaled_8tap_regular_w4_dy1_16bpc_avx2: 76.8
      mct_scaled_8tap_regular_w4_dy2_16bpc_c: 1170.1
      mct_scaled_8tap_regular_w4_dy2_16bpc_ssse3: 117.6
      mct_scaled_8tap_regular_w4_dy2_16bpc_avx2: 87.9
      mct_scaled_8tap_regular_w8_16bpc_c: 2784.2
      mct_scaled_8tap_regular_w8_16bpc_ssse3: 408.5
      mct_scaled_8tap_regular_w8_16bpc_avx2: 280.3
      mct_scaled_8tap_regular_w8_dy1_16bpc_c: 2530.5
      mct_scaled_8tap_regular_w8_dy1_16bpc_ssse3: 358.2
      mct_scaled_8tap_regular_w8_dy1_16bpc_avx2: 227.1
      mct_scaled_8tap_regular_w8_dy2_16bpc_c: 3525.0
      mct_scaled_8tap_regular_w8_dy2_16bpc_ssse3: 425.6
      mct_scaled_8tap_regular_w8_dy2_16bpc_avx2: 283.6
      mct_scaled_8tap_regular_w16_16bpc_c: 6773.8
      mct_scaled_8tap_regular_w16_16bpc_ssse3: 1054.6
      mct_scaled_8tap_regular_w16_16bpc_avx2: 696.4
      mct_scaled_8tap_regular_w16_dy1_16bpc_c: 6418.0
      mct_scaled_8tap_regular_w16_dy1_16bpc_ssse3: 938.7
      mct_scaled_8tap_regular_w16_dy1_16bpc_avx2: 584.5
      mct_scaled_8tap_regular_w16_dy2_16bpc_c: 9432.4
      mct_scaled_8tap_regular_w16_dy2_16bpc_ssse3: 1125.3
      mct_scaled_8tap_regular_w16_dy2_16bpc_avx2: 753.1
      mct_scaled_8tap_regular_w32_16bpc_c: 26028.8
      mct_scaled_8tap_regular_w32_16bpc_ssse3: 4128.4
      mct_scaled_8tap_regular_w32_16bpc_avx2: 2748.4
      mct_scaled_8tap_regular_w32_dy1_16bpc_c: 21604.3
      mct_scaled_8tap_regular_w32_dy1_16bpc_ssse3: 3312.4
      mct_scaled_8tap_regular_w32_dy1_16bpc_avx2: 2051.1
      mct_scaled_8tap_regular_w32_dy2_16bpc_c: 32844.3
      mct_scaled_8tap_regular_w32_dy2_16bpc_ssse3: 4102.9
      mct_scaled_8tap_regular_w32_dy2_16bpc_avx2: 2741.6
      mct_scaled_8tap_regular_w64_16bpc_c: 49101.8
      mct_scaled_8tap_regular_w64_16bpc_ssse3: 8758.9
      mct_scaled_8tap_regular_w64_16bpc_avx2: 5822.2
      mct_scaled_8tap_regular_w64_dy1_16bpc_c: 53557.7
      mct_scaled_8tap_regular_w64_dy1_16bpc_ssse3: 8469.7
      mct_scaled_8tap_regular_w64_dy1_16bpc_avx2: 5264.3
      mct_scaled_8tap_regular_w64_dy2_16bpc_c: 83379.7
      mct_scaled_8tap_regular_w64_dy2_16bpc_ssse3: 10623.7
      mct_scaled_8tap_regular_w64_dy2_16bpc_avx2: 7164.0
      mct_scaled_8tap_regular_w128_16bpc_c: 163182.2
      mct_scaled_8tap_regular_w128_16bpc_ssse3: 26452.9
      mct_scaled_8tap_regular_w128_16bpc_avx2: 18402.2
      mct_scaled_8tap_regular_w128_dy1_16bpc_c: 148199.8
      mct_scaled_8tap_regular_w128_dy1_16bpc_ssse3: 23584.9
      mct_scaled_8tap_regular_w128_dy1_16bpc_avx2: 14808.1
      mct_scaled_8tap_regular_w128_dy2_16bpc_c: 234702.2
      mct_scaled_8tap_regular_w128_dy2_16bpc_ssse3: 29653.8
      mct_scaled_8tap_regular_w128_dy2_16bpc_avx2: 20042.4
      5919517f
  12. Jan 11, 2022
  13. Jan 10, 2022
  14. Jan 09, 2022
  15. Jan 07, 2022
  16. Jan 06, 2022
    • Add option to write each frame to separate output file · 36beb818
      Ronald S. Bultje authored
      For per-file yuv/y4m writes, this can be specified automatically
      using e.g. -o file_%w_%h_%5n.yuv/y4m. Using --muxer=framemd5 -o - --quiet
      accomplishes the same for per-frame md5sums.
      
      Addresses part of #310.
      36beb818
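To illustrate how a pattern such as file_%w_%h_%5n.yuv could map to one filename per frame, here is a minimal sketch. It is not dav1d's implementation: it assumes %w and %h stand for frame width and height, %n for the frame number, and that a digit between % and the letter means zero-padding to that many characters. The function name `expand_pattern` is hypothetical.

```python
import re

def expand_pattern(pattern, width, height, frame_number):
    """Expand %w, %h and %n (optionally %<digits>n) in an output filename.

    Assumption of this sketch: a numeric prefix zero-pads the value.
    """
    values = {"w": width, "h": height, "n": frame_number}

    def substitute(match):
        pad = int(match.group(1)) if match.group(1) else 0
        return str(values[match.group(2)]).zfill(pad)

    return re.sub(r"%(\d*)([whn])", substitute, pattern)

if __name__ == "__main__":
    # One output name per decoded frame of a hypothetical 1920x1080 stream.
    for n in range(3):
        print(expand_pattern("file_%w_%h_%5n.yuv", 1920, 1080, n))
```

Under these assumptions each frame gets a distinct, sortable filename, which is what makes per-frame output possible from a single -o argument.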
    • DAV1D_MC_IDENTITY requires DAV1D_PIXEL_LAYOUT_I444 · f9bddfff
      Wan-Teh Chang authored and Ronald S. Bultje committed
      Section 6.4.2 (Color config semantics) of the AV1 spec says:
        If matrix_coefficients is equal to MC_IDENTITY, it is a requirement of
        bitstream conformance that subsampling_x is equal to 0 and
        subsampling_y is equal to 0.
      Add a Dav1dSettings.strict_std_compliance flag which, when set, allows
      aborting decoding when such standard-compliance violations are detected,
      even though they don't affect decoding. In the CLI, this flag can be
      accessed using -strict.
      f9bddfff
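The rule quoted from the spec can be expressed as a small check. The sketch below is only an illustration of the logic described in the commit message, not the decoder's code: `check_color_config` and `ConformanceError` are hypothetical names, and the only value taken from the AV1 specification is MC_IDENTITY being 0.

```python
MC_IDENTITY = 0  # matrix_coefficients value for identity in the AV1 spec

class ConformanceError(ValueError):
    """Raised when strict mode rejects a non-conformant bitstream."""

def check_color_config(matrix_coefficients, subsampling_x, subsampling_y,
                       strict_std_compliance=False):
    """Return True if the color config satisfies the MC_IDENTITY rule.

    With strict_std_compliance set, a violation aborts (raises) instead of
    being tolerated, mirroring the behaviour described in the commit message.
    """
    violates = (matrix_coefficients == MC_IDENTITY
                and (subsampling_x != 0 or subsampling_y != 0))
    if violates and strict_std_compliance:
        raise ConformanceError(
            "MC_IDENTITY requires subsampling_x == 0 and subsampling_y == 0")
    return not violates

if __name__ == "__main__":
    print(check_color_config(MC_IDENTITY, 0, 0, strict_std_compliance=True))
    print(check_color_config(MC_IDENTITY, 1, 1))  # tolerated when not strict
```

Keeping the violation tolerated by default matches the note that such streams still decode; only callers that opt in to strictness see an error.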
  17. Jan 04, 2022
  18. Jan 03, 2022