  1. Feb 07, 2022
  2. Feb 05, 2022
  3. Feb 03, 2022
  4. Feb 02, 2022
  5. Jan 31, 2022
  6. Jan 30, 2022
  7. Jan 25, 2022
  8. Jan 24, 2022
    • x86/itx: Add 16x16 12bpc AVX2 transforms · 8e8148c1
      Matthias Dressel authored
      inv_txfm_add_16x16_adst_adst_0_12bpc_c: 8990.0
      inv_txfm_add_16x16_adst_adst_0_12bpc_avx2: 646.1
      inv_txfm_add_16x16_adst_adst_1_12bpc_c: 8965.3
      inv_txfm_add_16x16_adst_adst_1_12bpc_avx2: 646.9
      inv_txfm_add_16x16_adst_adst_2_12bpc_c: 8983.2
      inv_txfm_add_16x16_adst_adst_2_12bpc_avx2: 870.1
      inv_txfm_add_16x16_adst_dct_0_12bpc_c: 9058.2
      inv_txfm_add_16x16_adst_dct_0_12bpc_avx2: 548.8
      inv_txfm_add_16x16_adst_dct_1_12bpc_c: 9092.7
      inv_txfm_add_16x16_adst_dct_1_12bpc_avx2: 549.3
      inv_txfm_add_16x16_adst_dct_2_12bpc_c: 9086.7
      inv_txfm_add_16x16_adst_dct_2_12bpc_avx2: 775.5
      inv_txfm_add_16x16_adst_flipadst_0_12bpc_c: 9083.4
      inv_txfm_add_16x16_adst_flipadst_0_12bpc_avx2: 645.6
      inv_txfm_add_16x16_adst_flipadst_1_12bpc_c: 8998.3
      inv_txfm_add_16x16_adst_flipadst_1_12bpc_avx2: 646.2
      inv_txfm_add_16x16_adst_flipadst_2_12bpc_c: 9014.7
      inv_txfm_add_16x16_adst_flipadst_2_12bpc_avx2: 873.8
      inv_txfm_add_16x16_dct_adst_0_12bpc_c: 9080.1
      inv_txfm_add_16x16_dct_adst_0_12bpc_avx2: 598.2
      inv_txfm_add_16x16_dct_adst_1_12bpc_c: 9103.3
      inv_txfm_add_16x16_dct_adst_1_12bpc_avx2: 598.1
      inv_txfm_add_16x16_dct_adst_2_12bpc_c: 9089.5
      inv_txfm_add_16x16_dct_adst_2_12bpc_avx2: 764.4
      inv_txfm_add_16x16_dct_dct_0_12bpc_c: 1042.1
      inv_txfm_add_16x16_dct_dct_0_12bpc_avx2: 28.6
      inv_txfm_add_16x16_dct_dct_1_12bpc_c: 9164.6
      inv_txfm_add_16x16_dct_dct_1_12bpc_avx2: 500.8
      inv_txfm_add_16x16_dct_dct_2_12bpc_c: 9161.9
      inv_txfm_add_16x16_dct_dct_2_12bpc_avx2: 678.2
      inv_txfm_add_16x16_dct_flipadst_0_12bpc_c: 9104.9
      inv_txfm_add_16x16_dct_flipadst_0_12bpc_avx2: 601.8
      inv_txfm_add_16x16_dct_flipadst_1_12bpc_c: 9248.6
      inv_txfm_add_16x16_dct_flipadst_1_12bpc_avx2: 599.2
      inv_txfm_add_16x16_dct_flipadst_2_12bpc_c: 9087.4
      inv_txfm_add_16x16_dct_flipadst_2_12bpc_avx2: 770.1
      inv_txfm_add_16x16_dct_identity_0_12bpc_c: 6570.4
      inv_txfm_add_16x16_dct_identity_0_12bpc_avx2: 243.9
      inv_txfm_add_16x16_dct_identity_1_12bpc_c: 6615.4
      inv_txfm_add_16x16_dct_identity_1_12bpc_avx2: 246.0
      inv_txfm_add_16x16_dct_identity_2_12bpc_c: 6553.4
      inv_txfm_add_16x16_dct_identity_2_12bpc_avx2: 435.0
      inv_txfm_add_16x16_flipadst_adst_0_12bpc_c: 8982.1
      inv_txfm_add_16x16_flipadst_adst_0_12bpc_avx2: 647.2
      inv_txfm_add_16x16_flipadst_adst_1_12bpc_c: 8978.9
      inv_txfm_add_16x16_flipadst_adst_1_12bpc_avx2: 647.2
      inv_txfm_add_16x16_flipadst_adst_2_12bpc_c: 8964.0
      inv_txfm_add_16x16_flipadst_adst_2_12bpc_avx2: 868.4
      inv_txfm_add_16x16_flipadst_dct_0_12bpc_c: 9083.5
      inv_txfm_add_16x16_flipadst_dct_0_12bpc_avx2: 550.0
      inv_txfm_add_16x16_flipadst_dct_1_12bpc_c: 9070.4
      inv_txfm_add_16x16_flipadst_dct_1_12bpc_avx2: 550.2
      inv_txfm_add_16x16_flipadst_dct_2_12bpc_c: 9085.8
      inv_txfm_add_16x16_flipadst_dct_2_12bpc_avx2: 779.7
      inv_txfm_add_16x16_flipadst_flipadst_0_12bpc_c: 8977.1
      inv_txfm_add_16x16_flipadst_flipadst_0_12bpc_avx2: 657.3
      inv_txfm_add_16x16_flipadst_flipadst_1_12bpc_c: 9002.0
      inv_txfm_add_16x16_flipadst_flipadst_1_12bpc_avx2: 657.3
      inv_txfm_add_16x16_flipadst_flipadst_2_12bpc_c: 9008.4
      inv_txfm_add_16x16_flipadst_flipadst_2_12bpc_avx2: 872.0
      inv_txfm_add_16x16_identity_dct_0_12bpc_c: 6504.7
      inv_txfm_add_16x16_identity_dct_0_12bpc_avx2: 387.5
      inv_txfm_add_16x16_identity_dct_1_12bpc_c: 6548.3
      inv_txfm_add_16x16_identity_dct_1_12bpc_avx2: 387.5
      inv_txfm_add_16x16_identity_dct_2_12bpc_c: 6512.4
      inv_txfm_add_16x16_identity_dct_2_12bpc_avx2: 387.5
      inv_txfm_add_16x16_identity_identity_0_12bpc_c: 3926.2
      inv_txfm_add_16x16_identity_identity_0_12bpc_avx2: 135.0
      inv_txfm_add_16x16_identity_identity_1_12bpc_c: 3896.7
      inv_txfm_add_16x16_identity_identity_1_12bpc_avx2: 134.5
      inv_txfm_add_16x16_identity_identity_2_12bpc_c: 3888.0
      inv_txfm_add_16x16_identity_identity_2_12bpc_avx2: 230.3
    • x86: Add mc.resize AVX-512 (Ice Lake) asm · 4a52aa47
      Victorien Le Couviour--Tuffet authored
      resize_8bpc_c: 542599.0
      resize_8bpc_ssse3: 87635.4
      resize_8bpc_avx2: 67401.1
      resize_8bpc_avx512icl: 50263.6
      
      resize_16bpc_c: 573438.9
      resize_16bpc_ssse3: 121505.2
      resize_16bpc_avx2: 83293.4
      resize_16bpc_avx512icl: 77974.8
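The resize figures above are easier to compare as speedups over the C reference. The following is a minimal sketch that parses the lines exactly as they appear in the commit message and divides the C timing by each SIMD timing. The commit message does not state the unit; the sketch assumes these are benchmark timings where lower is better. The helper names `parse` and `speedups` are hypothetical and not part of dav1d.

```python
from collections import defaultdict

# Figures copied verbatim from the commit message above.
# Assumption: timings, lower is better (the unit is not stated in the source).
RAW = """\
resize_8bpc_c: 542599.0
resize_8bpc_ssse3: 87635.4
resize_8bpc_avx2: 67401.1
resize_8bpc_avx512icl: 50263.6

resize_16bpc_c: 573438.9
resize_16bpc_ssse3: 121505.2
resize_16bpc_avx2: 83293.4
resize_16bpc_avx512icl: 77974.8
"""


def parse(text):
    """Return {function: {implementation: timing}} from 'name_impl: value' lines."""
    results = defaultdict(dict)
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # blank separator between bit depths
        name, value = line.split(":")
        # The implementation is the last underscore-separated token,
        # e.g. "resize_8bpc_avx2" -> ("resize_8bpc", "avx2").
        func, impl = name.rsplit("_", 1)
        results[func][impl] = float(value)
    return dict(results)


def speedups(results, baseline="c"):
    """Return baseline timing divided by each other implementation's timing."""
    return {
        func: {
            impl: timings[baseline] / value
            for impl, value in timings.items()
            if impl != baseline
        }
        for func, timings in results.items()
    }


if __name__ == "__main__":
    for func, ratios in speedups(parse(RAW)).items():
        for impl, ratio in ratios.items():
            print(f"{func:14s} {impl:10s} {ratio:5.2f}x")
```

Under that assumption, the AVX-512 path comes out roughly 10.8x faster than C for 8bpc and roughly 7.4x for 16bpc. The same parser works on the other benchmark lists on this page, since they follow the same `name_impl: value` pattern.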
    • Split the frame init task · a8f3124a
      Victorien Le Couviour--Tuffet authored
      Allows running most of dav1d_decode_frame_init unconditionally by putting
      the CDF and subsequent initializations in a separate task.
  9. Jan 19, 2022
    • Victorien Le Couviour--Tuffet's avatar
      1e3f0bea
    • Fix current frame selector wrapping condition · 6aaeeea6
      Victorien Le Couviour--Tuffet authored
      This could cause a desync between first and cur, resulting in a skipped
      frame and halting the decoding.
      In the current state of the project, the desync typically doesn't last
      "long enough" to trigger the bug, as some frames would set cur back to a
      correct value.
      To trigger it, one needs to call reset_task_cur() on the last frame; this
      would be the call made after inserting the INIT task (during
      dav1d_task_frame_init).
      This doesn't happen, as we would normally pick a task from a previous frame
      already in the queue.
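As a generic illustration of the kind of wrapping condition this commit message describes, here is a minimal sketch of computing a frame's offset from `first` in a ring of frame slots. It does not reproduce the actual dav1d code. The function names and the `n_slots` parameter are hypothetical, and the sketch assumes that `cur` is stored as an offset relative to `first`, which the commit message suggests but does not spell out.

```python
def offset_from_first(frame_idx: int, first: int, n_slots: int) -> int:
    """Distance of frame_idx ahead of first in a ring of n_slots slots.

    Hypothetical helper: a slot index smaller than first is treated as
    having wrapped past the end of the ring.
    """
    if not (0 <= frame_idx < n_slots and 0 <= first < n_slots):
        raise ValueError("indices must lie inside the ring")
    if frame_idx < first:
        frame_idx += n_slots  # undo the wrap before subtracting
    return frame_idx - first


def naive_offset(frame_idx: int, first: int) -> int:
    """Same subtraction without the wrap check, shown for contrast."""
    return frame_idx - first


if __name__ == "__main__":
    # Ring of 4 slots whose oldest frame sits in slot 3, so the frames
    # are ordered 3, 0, 1, 2 and the last frame has wrapped to slot 2.
    first, n_slots = 3, 4
    for idx in (3, 0, 1, 2):
        print(idx, offset_from_first(idx, first, n_slots), naive_offset(idx, first))
```

In this toy setup the last frame (slot 2) is three positions ahead of `first`, while the unwrapped subtraction yields -1. A selector that ends up with such a wrong offset no longer points at the frame it was meant to, which is one way a desync of the sort described above could arise.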
  10. Jan 18, 2022
  11. Jan 17, 2022
  12. Jan 14, 2022
  13. Jan 13, 2022
  14. Jan 12, 2022
    • x86: Add high bitdepth mc(t)_scaled SSSE3 asm · 5919517f
      Victorien Le Couviour--Tuffet authored
      mc_scaled_8tap_regular_w2_16bpc_c: 737.7
      mc_scaled_8tap_regular_w2_16bpc_ssse3: 151.7
      mc_scaled_8tap_regular_w2_16bpc_avx2: 141.2
      mc_scaled_8tap_regular_w2_dy1_16bpc_c: 660.3
      mc_scaled_8tap_regular_w2_dy1_16bpc_ssse3: 80.8
      mc_scaled_8tap_regular_w2_dy1_16bpc_avx2: 73.2
      mc_scaled_8tap_regular_w2_dy2_16bpc_c: 884.9
      mc_scaled_8tap_regular_w2_dy2_16bpc_ssse3: 101.6
      mc_scaled_8tap_regular_w2_dy2_16bpc_avx2: 87.2
      mc_scaled_8tap_regular_w4_16bpc_c: 1356.3
      mc_scaled_8tap_regular_w4_16bpc_ssse3: 172.3
      mc_scaled_8tap_regular_w4_16bpc_avx2: 172.5
      mc_scaled_8tap_regular_w4_dy1_16bpc_c: 1244.9
      mc_scaled_8tap_regular_w4_dy1_16bpc_ssse3: 125.7
      mc_scaled_8tap_regular_w4_dy1_16bpc_avx2: 96.1
      mc_scaled_8tap_regular_w4_dy2_16bpc_c: 1665.6
      mc_scaled_8tap_regular_w4_dy2_16bpc_ssse3: 150.2
      mc_scaled_8tap_regular_w4_dy2_16bpc_avx2: 112.8
      mc_scaled_8tap_regular_w8_16bpc_c: 2536.5
      mc_scaled_8tap_regular_w8_16bpc_ssse3: 383.4
      mc_scaled_8tap_regular_w8_16bpc_avx2: 256.2
      mc_scaled_8tap_regular_w8_dy1_16bpc_c: 2331.8
      mc_scaled_8tap_regular_w8_dy1_16bpc_ssse3: 350.0
      mc_scaled_8tap_regular_w8_dy1_16bpc_avx2: 214.0
      mc_scaled_8tap_regular_w8_dy2_16bpc_c: 3169.6
      mc_scaled_8tap_regular_w8_dy2_16bpc_ssse3: 395.7
      mc_scaled_8tap_regular_w8_dy2_16bpc_avx2: 265.7
      mc_scaled_8tap_regular_w16_16bpc_c: 6384.6
      mc_scaled_8tap_regular_w16_16bpc_ssse3: 1004.4
      mc_scaled_8tap_regular_w16_16bpc_avx2: 665.0
      mc_scaled_8tap_regular_w16_dy1_16bpc_c: 6103.4
      mc_scaled_8tap_regular_w16_dy1_16bpc_ssse3: 896.3
      mc_scaled_8tap_regular_w16_dy1_16bpc_avx2: 544.2
      mc_scaled_8tap_regular_w16_dy2_16bpc_c: 8584.5
      mc_scaled_8tap_regular_w16_dy2_16bpc_ssse3: 1049.0
      mc_scaled_8tap_regular_w16_dy2_16bpc_avx2: 695.1
      mc_scaled_8tap_regular_w32_16bpc_c: 19672.8
      mc_scaled_8tap_regular_w32_16bpc_ssse3: 3204.3
      mc_scaled_8tap_regular_w32_16bpc_avx2: 2109.6
      mc_scaled_8tap_regular_w32_dy1_16bpc_c: 15964.6
      mc_scaled_8tap_regular_w32_dy1_16bpc_ssse3: 2634.5
      mc_scaled_8tap_regular_w32_dy1_16bpc_avx2: 1555.8
      mc_scaled_8tap_regular_w32_dy2_16bpc_c: 24156.9
      mc_scaled_8tap_regular_w32_dy2_16bpc_ssse3: 3217.3
      mc_scaled_8tap_regular_w32_dy2_16bpc_avx2: 2088.8
      mc_scaled_8tap_regular_w64_16bpc_c: 74356.3
      mc_scaled_8tap_regular_w64_16bpc_ssse3: 11225.9
      mc_scaled_8tap_regular_w64_16bpc_avx2: 7434.7
      mc_scaled_8tap_regular_w64_dy1_16bpc_c: 60080.9
      mc_scaled_8tap_regular_w64_dy1_16bpc_ssse3: 8912.8
      mc_scaled_8tap_regular_w64_dy1_16bpc_avx2: 5222.2
      mc_scaled_8tap_regular_w64_dy2_16bpc_c: 88891.4
      mc_scaled_8tap_regular_w64_dy2_16bpc_ssse3: 10824.8
      mc_scaled_8tap_regular_w64_dy2_16bpc_avx2: 7086.3
      mc_scaled_8tap_regular_w128_16bpc_c: 171633.3
      mc_scaled_8tap_regular_w128_16bpc_ssse3: 27089.3
      mc_scaled_8tap_regular_w128_16bpc_avx2: 17998.2
      mc_scaled_8tap_regular_w128_dy1_16bpc_c: 164399.9
      mc_scaled_8tap_regular_w128_dy1_16bpc_ssse3: 24694.1
      mc_scaled_8tap_regular_w128_dy1_16bpc_avx2: 14711.2
      mc_scaled_8tap_regular_w128_dy2_16bpc_c: 244865.3
      mc_scaled_8tap_regular_w128_dy2_16bpc_ssse3: 30599.1
      mc_scaled_8tap_regular_w128_dy2_16bpc_avx2: 20341.1
      
      mct_scaled_8tap_regular_w4_16bpc_c: 946.2
      mct_scaled_8tap_regular_w4_16bpc_ssse3: 117.5
      mct_scaled_8tap_regular_w4_16bpc_avx2: 112.5
      mct_scaled_8tap_regular_w4_dy1_16bpc_c: 886.1
      mct_scaled_8tap_regular_w4_dy1_16bpc_ssse3: 100.5
      mct_scaled_8tap_regular_w4_dy1_16bpc_avx2: 76.8
      mct_scaled_8tap_regular_w4_dy2_16bpc_c: 1170.1
      mct_scaled_8tap_regular_w4_dy2_16bpc_ssse3: 117.6
      mct_scaled_8tap_regular_w4_dy2_16bpc_avx2: 87.9
      mct_scaled_8tap_regular_w8_16bpc_c: 2784.2
      mct_scaled_8tap_regular_w8_16bpc_ssse3: 408.5
      mct_scaled_8tap_regular_w8_16bpc_avx2: 280.3
      mct_scaled_8tap_regular_w8_dy1_16bpc_c: 2530.5
      mct_scaled_8tap_regular_w8_dy1_16bpc_ssse3: 358.2
      mct_scaled_8tap_regular_w8_dy1_16bpc_avx2: 227.1
      mct_scaled_8tap_regular_w8_dy2_16bpc_c: 3525.0
      mct_scaled_8tap_regular_w8_dy2_16bpc_ssse3: 425.6
      mct_scaled_8tap_regular_w8_dy2_16bpc_avx2: 283.6
      mct_scaled_8tap_regular_w16_16bpc_c: 6773.8
      mct_scaled_8tap_regular_w16_16bpc_ssse3: 1054.6
      mct_scaled_8tap_regular_w16_16bpc_avx2: 696.4
      mct_scaled_8tap_regular_w16_dy1_16bpc_c: 6418.0
      mct_scaled_8tap_regular_w16_dy1_16bpc_ssse3: 938.7
      mct_scaled_8tap_regular_w16_dy1_16bpc_avx2: 584.5
      mct_scaled_8tap_regular_w16_dy2_16bpc_c: 9432.4
      mct_scaled_8tap_regular_w16_dy2_16bpc_ssse3: 1125.3
      mct_scaled_8tap_regular_w16_dy2_16bpc_avx2: 753.1
      mct_scaled_8tap_regular_w32_16bpc_c: 26028.8
      mct_scaled_8tap_regular_w32_16bpc_ssse3: 4128.4
      mct_scaled_8tap_regular_w32_16bpc_avx2: 2748.4
      mct_scaled_8tap_regular_w32_dy1_16bpc_c: 21604.3
      mct_scaled_8tap_regular_w32_dy1_16bpc_ssse3: 3312.4
      mct_scaled_8tap_regular_w32_dy1_16bpc_avx2: 2051.1
      mct_scaled_8tap_regular_w32_dy2_16bpc_c: 32844.3
      mct_scaled_8tap_regular_w32_dy2_16bpc_ssse3: 4102.9
      mct_scaled_8tap_regular_w32_dy2_16bpc_avx2: 2741.6
      mct_scaled_8tap_regular_w64_16bpc_c: 49101.8
      mct_scaled_8tap_regular_w64_16bpc_ssse3: 8758.9
      mct_scaled_8tap_regular_w64_16bpc_avx2: 5822.2
      mct_scaled_8tap_regular_w64_dy1_16bpc_c: 53557.7
      mct_scaled_8tap_regular_w64_dy1_16bpc_ssse3: 8469.7
      mct_scaled_8tap_regular_w64_dy1_16bpc_avx2: 5264.3
      mct_scaled_8tap_regular_w64_dy2_16bpc_c: 83379.7
      mct_scaled_8tap_regular_w64_dy2_16bpc_ssse3: 10623.7
      mct_scaled_8tap_regular_w64_dy2_16bpc_avx2: 7164.0
      mct_scaled_8tap_regular_w128_16bpc_c: 163182.2
      mct_scaled_8tap_regular_w128_16bpc_ssse3: 26452.9
      mct_scaled_8tap_regular_w128_16bpc_avx2: 18402.2
      mct_scaled_8tap_regular_w128_dy1_16bpc_c: 148199.8
      mct_scaled_8tap_regular_w128_dy1_16bpc_ssse3: 23584.9
      mct_scaled_8tap_regular_w128_dy1_16bpc_avx2: 14808.1
      mct_scaled_8tap_regular_w128_dy2_16bpc_c: 234702.2
      mct_scaled_8tap_regular_w128_dy2_16bpc_ssse3: 29653.8
      mct_scaled_8tap_regular_w128_dy2_16bpc_avx2: 20042.4
  15. Jan 11, 2022
  16. Jan 10, 2022
  17. Jan 09, 2022
  18. Jan 07, 2022
  19. Jan 06, 2022