Commits · master · luo Bei / dav1d

Sep 13, 2022
- Reverse the change for lib.c, Sep 13 2022 · 2c1fca1c
  luo Bei authored 2 years ago
  
  2c1fca1c
Sep 12, 2022
- dav1d statistic extraction report, Sep 13 2022 · 67e07449
  luo Bei authored 2 years ago
  
  67e07449
- dav1d RaceHorses_416x240_30 decode reconstruction yuv. · a198defa
  luo Bei authored 2 years ago
  
  a198defa
- Class D av1 bitstream, Sep 13 2022 · b87b0581
  luo Bei authored 2 years ago
  
  b87b0581
- upload test data and report · f5bb5783
  luo Bei authored 2 years ago
  
  f5bb5783
- Sep 9 2022, · 8706ad04
  luo Bei authored 2 years ago
```
dav1d statistics extraction:
allow dav1d to report statistics in real time
```
  8706ad04
- x86: Fix clipping in 10bpc SSE4.1 IDCT asm · 128a0d89
  Henrik Gramner authored 2 years ago
  
  128a0d89
Sep 10, 2022
- build: Improve Windows linking options · 178681e5
  Henrik Gramner authored 2 years ago
  
  178681e5
Sep 09, 2022

tools: Improve demuxer probing · 52473197

Henrik Gramner authored 2 years ago

Increase the probing size, and change the logic to assume a stream is
valid even if no conclusive decision could be made within the probing
window as long as a sequence header was detected.

52473197

CI: Disable trimming on some tests · 934713e4
Matthias Dressel authored 2 years ago
```
Allow checkasm to run.
```
934713e4
CI: Remove git 'safe.directory' config · 3920bd9d
Matthias Dressel authored 2 years ago
```
It is now handled by the gitlab runner.

Ref: 7d859f9c
```
3920bd9d
gcovr: Ignore parsing errors · ddb3189c
Matthias Dressel authored 2 years ago

ddb3189c

crossfiles: Update Android toolchains · aa3fda78

Matthias Dressel authored 2 years ago

* Android armv7: target API 19 since it's the lowest directly provided
  by the new NDK.
* Newer NDK has generic tools for ar, strip, etc.
* Remove windres as it's only relevant for Windows targets.

aa3fda78

CI: Update images · d92594bd
Matthias Dressel authored 2 years ago
```
Remove experimental since gcc12, clang14, mold are now in unstable.
```
d92594bd

Sep 08, 2022

threading: Limit the progress bitfields to the used size · 6680d26f

Victorien Le Couviour--Tuffet authored 2 years ago

Store the used size instead of the allocated size.

The used size can be smaller than the allocated size, which results in
a wrong computation of the linear progress from the frame_progress
bitfield.

6680d26f
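A minimal sketch of why the size matters, assuming a simplified layout where each bit of a 32-bit word marks one finished row and the linear progress is the count of consecutive set bits from bit 0. The function name and the stale-data scenario are illustrative assumptions, not dav1d's actual code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: count the consecutive finished rows, starting at
 * row 0, in a bitfield of n_words 32-bit words (bit set = row finished). */
static int linear_progress(const uint32_t *bits, int n_words) {
    assert(bits != NULL && n_words >= 0);
    int done = 0;
    for (int i = 0; i < n_words; i++) {
        uint32_t pending = ~bits[i]; /* set bits now mark unfinished rows */
        if (pending == 0) {          /* whole word finished, keep scanning */
            done += 32;
            continue;
        }
        while ((pending & 1u) == 0) { /* count finished rows in this word */
            pending >>= 1;
            done++;
        }
        break;                        /* first unfinished row found */
    }
    return done;
}
```

If the buffer was allocated for a larger frame and only the first word is in use, scanning with the allocated word count can run into leftover bits beyond the used range and report too many rows; scanning with the used word count stops where the current frame ends.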

x86: Fix rare crash in chroma film grain asm · fab6427e

Henrik Gramner authored 2 years ago

The width parameter is used directly as a pointer offset, so ensure
that it has an appropriately sized data type.

This has been done previously for luma, but chroma was overlooked.

fab6427e

Sep 07, 2022
- x86: Fix overflows in 12bpc AVX2 identity itx asm · 677129c2
  Henrik Gramner authored 2 years ago and Henrik Gramner committed 2 years ago
  
  677129c2
- x86: Fix an alignment issue in 8-bit AVX-512 loop restoration · 58b15237
  Henrik Gramner authored 2 years ago and Henrik Gramner committed 2 years ago
```
We don't have a separate 8-bit AVX-512 5-tap Wiener filter so the 7-tap
function is used for chroma as well, and in some esoteric edge cases
chroma dst pointers may only have a 32-byte alignment despite having
a width larger than 32, so use an unaligned store as a workaround.
```
  58b15237
Sep 02, 2022
- checkasm: Add short options · 895fed08
  Victorien Le Couviour--Tuffet authored 2 years ago
  
  895fed08
- checkasm: Add pattern matching to --test · 713a4f4e
  Victorien Le Couviour--Tuffet authored 2 years ago
  
  713a4f4e
- checkasm: Remove pattern matching from --bench · a63a7c96
  Victorien Le Couviour--Tuffet authored 2 years ago
```
The pattern matching feature has been improved and is now performed
under the new --function parameter, rendering this one obsolete.
```
  a63a7c96
- checkasm: Add a --function option · d5d37926
  Victorien Le Couviour--Tuffet authored 2 years ago
```
Allows running checkasm only for functions matching a given pattern.
```
  d5d37926
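The commit messages above do not spell out the pattern syntax. As a rough sketch, assuming a glob-like syntax where `*` matches any run of characters and `?` matches a single character, a matcher could look like the following. The function name `wild_match` and the example function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical glob-style matcher: returns 1 if str matches pat.
 * '*' matches any (possibly empty) run of characters, '?' exactly one. */
static int wild_match(const char *pat, const char *str) {
    assert(pat != NULL && str != NULL);
    const char *star = NULL;   /* position of the most recent '*' in pat */
    const char *resume = NULL; /* where in str that '*' started matching */
    while (*str) {
        if (*pat == '*') {
            star = pat++;
            resume = str;
        } else if (*pat == '?' || *pat == *str) {
            pat++;
            str++;
        } else if (star) {
            /* Mismatch: let the last '*' absorb one more character. */
            pat = star + 1;
            str = ++resume;
        } else {
            return 0;
        }
    }
    while (*pat == '*') pat++; /* trailing stars match the empty string */
    return *pat == '\0';
}
```

With this sketch, a pattern such as `mc_*` would select every function whose name starts with `mc_`, while a pattern without wildcards only matches the exact name.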
Aug 30, 2022
- threading: Fix copy_lpf_progress initialization · a3a55b18
  Victorien Le Couviour--Tuffet authored 2 years ago
```
The copy_lpf_progress bitfield might not be fully cleared when size goes
down.

Credit to Oss-Fuzz.
```
  a3a55b18
Aug 19, 2022
- data: don't overwrite the Dav1dDataProps size value · cd5e4152
  James Almer authored 2 years ago
```
Fixes a regression since commit 3d3c51a0.
```
  cd5e4152
Jul 25, 2022

Adjust inlining attributes on some functions · a029d689

Henrik Gramner authored 2 years ago

The code size increase of inlining every call to certain functions
isn't a worthwhile trade-off, and most compilers actually end up
overriding those particular inlining hints anyway.

In some cases it's also better to split the function into separate
luma and chroma functions.

a029d689

Jul 19, 2022
- x86: Remove leftover instruction in loopfilter AVX2 asm · 0b7a0a2e
  Henrik Gramner authored 2 years ago and Henrik Gramner committed 2 years ago
```
In 0aca76c3 sequences of pand/pandn/por were replaced by pblendvb, but
one instruction (which now acts as a no-op) was accidentally left in.
```
  0b7a0a2e
Jul 13, 2022
- Enable pointer authentication in assembly when building arm64e · 6dc03eee
  David Conrad authored 2 years ago and Martin Storsjö committed 2 years ago
  
  6dc03eee
Jul 11, 2022

Don't trash the return stack buffer in the NEON loop filter · d503bb0c

David Conrad authored 2 years ago

The NEON loop filter's innermost asm function can return to a different
location than the address that called it. This messes up the return stack
predictor, causing returns to be mispredicted.

Instead, rework the function to always return to the address that calls it,
and pass back the information the caller needs to short-circuit
storing pixels.

d503bb0c

Jul 06, 2022

CI: Removed snap package generation · 79bc755d

Konstantin Pavlov authored 2 years ago and Henrik Gramner committed 2 years ago

The snapcraft version we use is no longer compatible with the
authentication schemes the snap store uses. This could be fixed by
updating snapcraft inside the docker image, but Ubuntu no longer ships an
up-to-date snapcraft version in its own repositories. The other way to
install snapcraft is to manually fetch the project and core snaps just like we
do in https://code.videolan.org/videolan/docker-images/-/blob/master/vlc-ubuntu-focal/Dockerfile,
but that currently fails on Jammy due to a conflict between the Python
versions shipped in Jammy and inside the snapcraft project.

All in all, snapcraft seems to be abandoned for our CI use-case, and
the usefulness of the dav1d snap is disputable, so just drop it
altogether. Packaging is still available in package/snap/ for the
brave souls who want to build it on their own.

79bc755d

Eliminate unused C DSP functions at compile time · bd046635

Henrik Gramner authored 2 years ago and Henrik Gramner committed 2 years ago

When compiling with asm enabled there's no point in compiling
C versions of DSP functions that have asm implementations using
instruction sets that the compiler can unconditionally use.

E.g. when compiling with -mssse3 we can remove the C version
of all functions with SSSE3 implementations.

This is accomplished using the compiler's dead code elimination
functionality.

Can be configured using the new 'trim_dsp' meson option, which
by default is enabled when compiling in release mode.

bd046635
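A minimal sketch of the dead code elimination idea, assuming a hypothetical `TRIM_DSP` macro and a toy `sum` function in place of a real DSP function. All names are stand-ins, and the "SSSE3" version here is plain C in place of an asm implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical switch, standing in for a build option such as trim_dsp. */
#ifndef TRIM_DSP
#define TRIM_DSP 1
#endif

/* __SSSE3__ is predefined by GCC/Clang when building with -mssse3. */
#if TRIM_DSP && defined(__SSSE3__)
#define BASELINE_SSSE3 1
#else
#define BASELINE_SSSE3 0
#endif

typedef int (*sum_fn)(const int *values, int count);

/* Portable C version. */
static int sum_c(const int *values, int count) {
    int total = 0;
    for (int i = 0; i < count; i++) total += values[i];
    return total;
}

/* Stand-in for an asm implementation that requires SSSE3. */
static int sum_ssse3(const int *values, int count) {
    int total = 0;
    for (int i = count - 1; i >= 0; i--) total += values[i];
    return total;
}

/* Pick an implementation. When BASELINE_SSSE3 is 1 the first branch is a
 * compile-time constant false, so nothing references sum_c any more and
 * the compiler is free to drop it from the binary. */
static sum_fn pick_sum(int cpu_has_ssse3) {
    sum_fn fn = NULL;
    if (!BASELINE_SSSE3) fn = sum_c;
    if (BASELINE_SSSE3 || cpu_has_ssse3) fn = sum_ssse3;
    assert(fn != NULL);
    return fn;
}
```

The selection logic stays ordinary C; only the constant in the condition changes with the compiler flags, which is what lets the optimizer do the trimming without any per-function `#ifdef`.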

cpu: Inline dav1d_get_cpu_flags() · 820bf515
Henrik Gramner authored 2 years ago and Henrik Gramner committed 2 years ago

820bf515

Jun 22, 2022
- x86: Add minor loopfilter asm improvements · 233737c9
  Henrik Gramner authored 2 years ago and Henrik Gramner committed 2 years ago
  
  233737c9
Jun 20, 2022

checkasm: Speed up signal handling · 0421f787

Henrik Gramner authored 2 years ago

Enabling/disabling signal handlers is very slow and requires a syscall.

A better approach is to keep the signal handlers enabled all the time,
and use a simple flag variable to determine if a given signal should
be handled or passed on to the default signal handler.

0421f787
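The flag-based approach can be sketched roughly as follows, assuming a POSIX system. The names (`catch_signals`, `run_guarded`, and so on) are hypothetical and this is not the actual checkasm code; it only illustrates installing the handlers once and toggling a flag around the code under test.

```c
#define _POSIX_C_SOURCE 200809L /* for sigaction() and sigsetjmp() */
#include <assert.h>
#include <setjmp.h>
#include <signal.h>
#include <stddef.h>

/* Set while a function under test is running; toggling it is a plain
 * store, with no syscall involved. */
static volatile sig_atomic_t catch_signals = 0;
static sigjmp_buf recover_point;

static void on_signal(int sig) {
    if (catch_signals) {
        catch_signals = 0;
        siglongjmp(recover_point, 1); /* back into run_guarded() */
    }
    /* Not ours to handle: restore the default action and re-raise. */
    signal(sig, SIG_DFL);
    raise(sig);
}

/* Called once at startup; the handlers then stay installed. */
static void install_handlers_once(void) {
    struct sigaction sa;
    sa.sa_handler = on_signal;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    sigaction(SIGFPE, &sa, NULL);
    sigaction(SIGILL, &sa, NULL);
    sigaction(SIGSEGV, &sa, NULL);
}

/* Runs fn and returns 1 if it raised a handled signal, 0 otherwise. */
static int run_guarded(void (*fn)(void)) {
    assert(fn != NULL);
    int caught = 0;
    if (sigsetjmp(recover_point, 1) == 0) {
        catch_signals = 1;
        fn();
    } else {
        caught = 1;
    }
    catch_signals = 0;
    return caught;
}

/* Stand-ins for functions under test. */
static void do_nothing(void) {}
static void raise_fpe(void) { raise(SIGFPE); }
```

Saving the signal mask in `sigsetjmp` (second argument 1) matters here: jumping out of the handler restores the mask, so the same signal can be caught again on a later call.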

checkasm: Improve seed generation on Windows · fa68b036

Henrik Gramner authored 2 years ago

GetTickCount() increases at a very low frequency, >10ms per tick.
When running multiple loops of checkasm instances in parallel
different instances regularly end up using identical seeds.

Prefer the use of QueryPerformanceCounter() instead, which ticks at
a significantly higher rate, which in turn increases randomness.

fa68b036
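A rough sketch of seeding from a high-resolution counter, not the actual checkasm code. `QueryPerformanceCounter()` is the Windows API named in the commit message; the function name `get_seed` and the `clock_gettime()` fallback for other platforms are assumptions added so the example builds elsewhere.

```c
#define _POSIX_C_SOURCE 200809L /* for clock_gettime() on the non-Windows path */
#include <assert.h>
#include <stdint.h>
#ifdef _WIN32
#include <windows.h>
#else
#include <time.h>
#endif

/* Hypothetical helper: derive a 32-bit seed from a fast-ticking counter.
 * Truncating to 32 bits keeps the low bits, which change the fastest. */
static uint32_t get_seed(void) {
#ifdef _WIN32
    LARGE_INTEGER counter;
    QueryPerformanceCounter(&counter);
    return (uint32_t) counter.QuadPart;
#else
    struct timespec ts;
    const int rc = clock_gettime(CLOCK_MONOTONIC, &ts);
    assert(rc == 0);
    (void) rc; /* silence unused-variable warnings when NDEBUG is set */
    return (uint32_t) (1000000000ULL * (uint64_t) ts.tv_sec +
                       (uint64_t) ts.tv_nsec);
#endif
}
```

Two processes started within the same 10 ms window would likely read the same tick count from a coarse timer, whereas a counter that ticks far more often makes identical seeds much less likely.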

ci: Don't specify a specific macOS version · 0c590fc7
Henrik Gramner authored 2 years ago and Henrik Gramner committed 2 years ago

0c590fc7

Jun 14, 2022
- x86: Add high bit-depth loopfilter AVX-512 (Ice Lake) asm · b0907cf9
  Henrik Gramner authored 2 years ago and Henrik Gramner committed 2 years ago
  
  b0907cf9
Jun 13, 2022
- checkasm/lpf: Use operating dimensions · 9717802d
  Victorien Le Couviour--Tuffet authored 2 years ago
```
Fixes use of uninitialized value.
```
  9717802d
Jun 03, 2022
- checkasm: Print the cpu model and cpuid signature on x86 · 7576cd57
  Henrik Gramner authored 2 years ago and Henrik Gramner committed 2 years ago
  
  7576cd57
- checkasm: Add a vzeroupper check on x86 · 0aa04fd3
  Henrik Gramner authored 2 years ago and Henrik Gramner committed 2 years ago
```
Verifying that the YMM state is clean when returning from assembly
functions helps catch potential issues with AVX/SSE transitions.
```
  0aa04fd3
Jun 02, 2022

x86: Add a workaround for quirky AVX-512 hardware behavior · 0cfb03cd

Henrik Gramner authored 2 years ago and Henrik Gramner committed 2 years ago

On Intel CPUs certain AVX-512 shuffle instructions incorrectly
flag the upper halves of YMM registers as in use when writing
to XMM registers, which may cause AVX/SSE state transitions.

This behavior is not documented and only occurs on physical
hardware, not when using the Intel SDE, so as far as I can tell
it appears to be a hardware bug.

Work around the issue by using EVEX-only registers. This avoids
the problem at the cost of a slightly larger code size.

0cfb03cd