Update `subWxH_dct` kernels for AArch64 NEON. This also makes the SVE implementation of `sub4x4_dct` redundant.

Open Matthias Langer requested to merge nekobasu/x264:sub_dct_kernels into master
BEFORE                    =>   AFTER                     = IMPROVEMENT
--------------------------------------------------------------------------
sub4x4_dct_c: 67          =>   sub4x4_dct_c: 66          =
sub4x4_dct_neon: 51       =>   sub4x4_dct_neon: 13       = 51/13 = 3.92x
sub4x4_dct_sve: 19        =>   sub4x4_dct_sve: 19        = now redundant
sub8x8_dct_c: 321         =>   sub8x8_dct_c: 317         =
sub8x8_dct_neon: 69       =>   sub8x8_dct_neon: 63       = 69/63 = 1.10x
sub8x8_dct8_c: 540        =>   sub8x8_dct8_c: 534        =
sub8x8_dct8_neon: 110     =>   sub8x8_dct8_neon: 105     = 110/105 = 1.05x
sub8x8_dct_dc_c: 130      =>   sub8x8_dct_dc_c: 130      =
sub8x8_dct_dc_neon: 22    =>   sub8x8_dct_dc_neon: 18    = 22/18 = 1.22x
sub8x16_dct_dc_c: 283     =>   sub8x16_dct_dc_c: 280     =
sub8x16_dct_dc_neon: 51   =>   sub8x16_dct_dc_neon: 48   = 51/48 = 1.06x
sub16x16_dct_c: 1352      =>   sub16x16_dct_c: 1345      =
sub16x16_dct_neon: 318    =>   sub16x16_dct_neon: 297    = 318/297 = 1.07x
sub16x16_dct8_c: 2273     =>   sub16x16_dct8_c: 2279     =
sub16x16_dct8_neon: 499   =>   sub16x16_dct8_neon: 478   = 499/478 = 1.04x
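
For readers unfamiliar with the kernels being timed, here is a minimal scalar sketch of what a `sub4x4_dct`-style routine computes: the residual between a source block and a prediction block, followed by the H.264 4x4 forward core transform. This is not the code from this merge request. The name `sub4x4_dct_ref`, the explicit stride parameters, 8-bit pixels, and 16-bit coefficients are all assumptions made for illustration.

```c
#include <stdint.h>

/* Hypothetical scalar reference (not x264's implementation).
 * Assumes 8-bit pixels and 16-bit coefficients; strides are passed
 * explicitly instead of using any project-specific constants. */
void sub4x4_dct_ref(int16_t dct[16],
                    const uint8_t *src, int src_stride,
                    const uint8_t *pred, int pred_stride)
{
    int16_t d[16], tmp[16];

    /* Residual: source block minus prediction block. */
    for (int y = 0; y < 4; y++)
        for (int x = 0; x < 4; x++)
            d[y * 4 + x] = (int16_t)(src[y * src_stride + x] -
                                     pred[y * pred_stride + x]);

    /* First 1-D pass over each row of d; results are stored transposed. */
    for (int i = 0; i < 4; i++) {
        int s03 = d[i * 4 + 0] + d[i * 4 + 3];
        int s12 = d[i * 4 + 1] + d[i * 4 + 2];
        int d03 = d[i * 4 + 0] - d[i * 4 + 3];
        int d12 = d[i * 4 + 1] - d[i * 4 + 2];

        tmp[0 * 4 + i] = (int16_t)(s03 + s12);
        tmp[1 * 4 + i] = (int16_t)(2 * d03 + d12);
        tmp[2 * 4 + i] = (int16_t)(s03 - s12);
        tmp[3 * 4 + i] = (int16_t)(d03 - 2 * d12);
    }

    /* Second 1-D pass over each row of the transposed intermediate. */
    for (int i = 0; i < 4; i++) {
        int s03 = tmp[i * 4 + 0] + tmp[i * 4 + 3];
        int s12 = tmp[i * 4 + 1] + tmp[i * 4 + 2];
        int d03 = tmp[i * 4 + 0] - tmp[i * 4 + 3];
        int d12 = tmp[i * 4 + 1] - tmp[i * 4 + 2];

        dct[i * 4 + 0] = (int16_t)(s03 + s12);
        dct[i * 4 + 1] = (int16_t)(2 * d03 + d12);
        dct[i * 4 + 2] = (int16_t)(s03 - s12);
        dct[i * 4 + 3] = (int16_t)(d03 - 2 * d12);
    }
}
```

The larger `sub8x8_dct` and `sub16x16_dct` variants can be thought of as applying this 4x4 step to each 4x4 sub-block. A SIMD version typically keeps all sixteen residuals in vector registers and performs the butterflies and transposes there, which is where a hand-written NEON kernel can save work over the scalar loops.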
