- Sep 13, 2022
-
-
luo Bei authored
-
- Sep 12, 2022
- Sep 10, 2022
-
-
Henrik Gramner authored
-
- Sep 09, 2022
-
-
Henrik Gramner authored
Increase the probing size, and change the logic to assume a stream is valid even if no conclusive decision could be made within the probing window as long as a sequence header was detected.
-
Matthias Dressel authored
Allow checkasm to run.
-
Matthias Dressel authored
It is now handled by the gitlab runner. Ref: 7d859f9c
-
Matthias Dressel authored
-
Matthias Dressel authored
* Android armv7: target API 19 since it's the lowest directly provided by the new NDK.
* Newer NDK has generic tools for ar, strip, etc.
* Remove windres as it's only relevant for Windows targets.
-
Matthias Dressel authored
Remove experimental since gcc12, clang14, mold are now in unstable.
-
- Sep 08, 2022
-
-
Victorien Le Couviour--Tuffet authored
Store the used size instead of the allocated size. The used size can be smaller than the allocated size, which results in a wrong computation of the linear progress from the frame_progress bitfield.
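A minimal sketch of why the size matters, assuming the linear progress is the number of consecutive set bits counted from the start of the bitfield. The function name, the 32-bit word size, and the stale-bit scenario are illustrative assumptions, not dav1d's actual code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: counts consecutive set bits starting at bit 0 of
 * word 0, scanning at most n_words 32-bit words. */
int linear_progress(const uint32_t *bits, int n_words) {
    assert(bits != NULL && n_words >= 0);
    int progress = 0;
    for (int i = 0; i < n_words; i++) {
        uint32_t w = bits[i];
        if (w == UINT32_MAX) {   /* whole word complete, keep scanning */
            progress += 32;
            continue;
        }
        while (w & 1) {          /* count the trailing ones of a partial word */
            progress++;
            w >>= 1;
        }
        break;                   /* the first unset bit ends the linear run */
    }
    return progress;
}
```

If the words beyond the used size still hold bits from an earlier, larger frame, scanning the allocated size instead of the used size lets those stale bits extend the run and overstate the progress.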
-
Henrik Gramner authored
The width parameter is used directly as a pointer offset, so ensure that it has an appropriately sized data type. This has been done previously for luma, but chroma was overlooked.
-
- Sep 07, 2022
-
-
-
We don't have a separate 8-bit AVX-512 5-tap Wiener filter so the 7-tap function is used for chroma as well, and in some esoteric edge cases chroma dst pointers may only have a 32-byte alignment despite having a width larger than 32, so use an unaligned store as a workaround.
-
- Sep 02, 2022
-
-
Victorien Le Couviour--Tuffet authored
-
Victorien Le Couviour--Tuffet authored
-
Victorien Le Couviour--Tuffet authored
The pattern matching feature has been improved and is now performed under the new --function parameter, rendering this one obsolete.
-
Victorien Le Couviour--Tuffet authored
Allows running checkasm only for functions matching a given pattern.
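As a rough illustration of this kind of filtering, here is a sketch of a wildcard matcher. It assumes the pattern is a function name in which `*` matches any run of characters; that syntax, the function name `pattern_match`, and the example function names are assumptions made for the sketch and are not taken from checkasm.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical wildcard matcher: '*' matches any run of characters
 * (including none); every other character must match literally.
 * Returns 1 on a match and 0 otherwise. */
int pattern_match(const char *pattern, const char *name) {
    assert(pattern != NULL && name != NULL);
    const char *star = NULL;   /* pattern position after the most recent '*' */
    const char *retry = NULL;  /* name position where that '*' currently ends */
    while (*name) {
        if (*pattern == '*') {
            star = ++pattern;
            retry = name;
        } else if (*pattern == *name) {
            pattern++;
            name++;
        } else if (star) {
            pattern = star;    /* let the last '*' absorb one more character */
            name = ++retry;
        } else {
            return 0;
        }
    }
    while (*pattern == '*') pattern++;  /* trailing stars match the empty rest */
    return *pattern == '\0';
}
```

A test runner could call such a matcher once per registered function name and skip every function for which it returns 0.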
-
- Aug 30, 2022
-
-
Victorien Le Couviour--Tuffet authored
The copy_lpf_progress bitfield might not be fully cleared when size goes down. Credit to Oss-Fuzz.
-
- Aug 19, 2022
-
-
James Almer authored
Fixes a regression since commit 3d3c51a0.
-
- Jul 25, 2022
-
-
Henrik Gramner authored
The code size increase of inlining every call to certain functions isn't a worthwhile trade-off, and most compilers actually end up overriding those particular inlining hints anyway. In some cases it's also better to split the function into separate luma and chroma functions.
-
- Jul 19, 2022
-
-
In 0aca76c3 sequences of pand/pandn/por were replaced by pblendvb, but one instruction (which now acts as a no-op) was accidentally left in.
-
- Jul 13, 2022
-
-
- Jul 11, 2022
-
-
David Conrad authored
The NEON loop filter's innermost asm function can return to a different location than the address that called it. This messes up the return stack predictor, causing returns to be mispredicted. Instead, rework the function to always return to the address that called it, and return the information the caller needs to short-circuit storing pixels.
-
- Jul 06, 2022
-
-
The snapcraft version we use is no longer compatible with the authentication schemes the snap store uses. This could be fixed by updating snapcraft inside the docker image, but Ubuntu no longer ships an up-to-date snapcraft version in its own repositories. The other way to install snapcraft is to manually fetch the project and core snaps just like we do in https://code.videolan.org/videolan/docker-images/-/blob/master/vlc-ubuntu-focal/Dockerfile, but that currently fails on Jammy due to a conflict in Python versions between what is shipped in Jammy and inside the snapcraft project. All in all, snapcraft seems to be abandoned for our CI use-case, and the usefulness of the dav1d snap is disputable, so just drop it altogether. Packaging is still available in package/snap/ for the brave souls who want to build it on their own.
-
When compiling with asm enabled there's no point in compiling C versions of DSP functions that have asm implementations using instruction sets that the compiler can unconditionally use. E.g. when compiling with -mssse3 we can remove the C version of all functions with SSSE3 implementations. This is accomplished using the compiler's dead code elimination functionality. Can be configured using the new 'trim_dsp' meson option, which by default is enabled when compiling in release mode.
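The idea can be sketched as follows, assuming a function-pointer table that is filled in at init time. All names here (`CPU_FLAG_SSSE3`, `COMPILE_TIME_FLAGS`, `sum_c`, `sum_ssse3`, `dsp_init`) are hypothetical, and the "asm" version is plain C so that the sketch builds on any target; only the `__SSSE3__` macro, which GCC and Clang predefine under `-mssse3`, is real.

```c
#include <assert.h>

/* Hypothetical CPU flag; the name is illustrative only. */
enum { CPU_FLAG_SSSE3 = 1 << 0 };

/* Instruction sets the compiler may use unconditionally. GCC and Clang
 * predefine __SSSE3__ when building with -mssse3. */
#ifdef __SSSE3__
#define COMPILE_TIME_FLAGS CPU_FLAG_SSSE3
#else
#define COMPILE_TIME_FLAGS 0
#endif

typedef int (*sum_fn)(const int *src, int n);

/* Portable C version of a made-up DSP function. */
static int sum_c(const int *src, int n) {
    int s = 0;
    for (int i = 0; i < n; i++) s += src[i];
    return s;
}

/* Stand-in for a hand-written SSSE3 version (plain C in this sketch). */
static int sum_ssse3(const int *src, int n) {
    int s = 0;
    for (int i = n - 1; i >= 0; i--) s += src[i];
    return s;
}

void dsp_init(sum_fn *fn, unsigned runtime_flags) {
    assert(fn);
    const unsigned flags = runtime_flags | COMPILE_TIME_FLAGS;
    if (flags & CPU_FLAG_SSSE3) {
        *fn = sum_ssse3;
    } else {
        /* Unreachable when built with -mssse3, so an optimizing compiler
         * can discard this branch and, with it, the static sum_c(). */
        *fn = sum_c;
    }
}
```

Because `sum_c` is static and its only reference sits in a branch that becomes constant-false, no explicit `#ifdef` around the C function is needed in this sketch; the removal is left to the optimizer.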
-
-
- Jun 22, 2022
-
-
- Jun 20, 2022
-
-
Henrik Gramner authored
Enabling/disabling signal handlers is very slow and requires a syscall. A better approach is to keep the signal handlers enabled all the time, and use a simple flag variable to determine if a given signal should be handled or passed on to the default signal handler.
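A minimal sketch of the flag-based approach, assuming a handler that is installed once and consults a flag on every delivery. The names `handle_signals`, `caught_signal`, `on_signal`, and `install_handler` are hypothetical, and the recovery path a real test harness needs after a hardware fault is omitted.

```c
#include <assert.h>
#include <signal.h>

/* Set to 1 only while running code whose signals we want to intercept. */
volatile sig_atomic_t handle_signals = 0;
/* Last signal intercepted while handle_signals was set (0 if none). */
volatile sig_atomic_t caught_signal = 0;

static void on_signal(int sig) {
    if (handle_signals) {
        caught_signal = sig;
        signal(sig, on_signal);  /* re-arm: some platforms reset to SIG_DFL */
    } else {
        signal(sig, SIG_DFL);    /* not ours: pass on to the default handler */
        raise(sig);              /* async-signal-safe under POSIX */
    }
}

/* Called once at startup. Afterwards, enabling or disabling handling is
 * a plain store to handle_signals, with no further syscalls. */
void install_handler(int sig) {
    void (*prev)(int) = signal(sig, on_signal);
    assert(prev != SIG_ERR);
    (void)prev;
}
```

Toggling is then `handle_signals = 1;` before the code under test and `handle_signals = 0;` after it.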
-
Henrik Gramner authored
GetTickCount() increases at a very low frequency, >10ms per tick. When running multiple loops of checkasm instances in parallel, different instances regularly end up using identical seeds. Prefer the use of QueryPerformanceCounter() instead, which ticks at a significantly higher rate, which in turn increases randomness.
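A hedged sketch of seeding from a high-resolution counter. The helper names `fold_seed` and `get_seed`, the folding step, and the C11 `timespec_get()` fallback for non-Windows builds are assumptions made for illustration; only the use of `QueryPerformanceCounter()` on Windows comes from the description above.

```c
#include <assert.h>
#include <stdint.h>
#ifdef _WIN32
#include <windows.h>
#else
#include <time.h>
#endif

/* Fold a 64-bit counter into 32 bits so that both halves contribute. */
uint32_t fold_seed(uint64_t ticks) {
    return (uint32_t)(ticks ^ (ticks >> 32));
}

/* Derive a seed from a counter that advances much faster than a
 * millisecond-granularity tick count. */
uint32_t get_seed(void) {
#ifdef _WIN32
    LARGE_INTEGER i;
    QueryPerformanceCounter(&i);
    return fold_seed((uint64_t)i.QuadPart);
#else
    struct timespec ts;
    int ok = timespec_get(&ts, TIME_UTC);  /* C11; nanosecond fields */
    assert(ok == TIME_UTC);
    (void)ok;
    return fold_seed((uint64_t)ts.tv_sec * 1000000000u + (uint64_t)ts.tv_nsec);
#endif
}
```

With a tick of 10 ms or more, processes launched within the same tick read the same counter value and therefore derive the same seed; a faster counter makes such a collision far less likely.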
-
-
- Jun 14, 2022
-
-
- Jun 13, 2022
-
-
Victorien Le Couviour--Tuffet authored
Fixes use of uninitialized value.
-
- Jun 03, 2022
-
-
-
Verifying that the YMM state is clean when returning from assembly functions helps catch potential issues with AVX/SSE transitions.
-
- Jun 02, 2022
-
-
On Intel CPUs certain AVX-512 shuffle instructions incorrectly flag the upper halves of YMM registers as in use when writing to XMM registers, which may cause AVX/SSE state transitions. This behavior is not documented and only occurs on physical hardware, not when using the Intel SDE, so as far as I can tell it appears to be a hardware bug. Work around the issue by using EVEX-only registers. This avoids the problem at the cost of a slightly larger code size.
-