Forked from
VideoLAN / dav1d
1528 commits behind the upstream repository.
-
Jean-Baptiste Kempf authored39667c75
To find the state of this project's repository at the time of any of these versions, check out the tags.
NEWS 4.18 KiB
Changes for 0.5.2 'Asiatic Cheetah':
------------------------------------
0.5.2 is a small release improving speed for ARM32 and adding minor features:
- ARM32 optimizations for loopfilter, ipred_dc|h|v
- Add section-5 raw OBU demuxer
- Improve the speed by reducing the L2 cache collisions
- Fix minor issues
Changes for 0.5.1 'Asiatic Cheetah':
------------------------------------
0.5.1 is a small release improving speeds and fixing minor issues
compared to 0.5.0:
- SSE2 optimizations for CDEF, wiener and warp_affine
- NEON optimizations for SGR on ARM32
- Fix mismatch issue in x86 asm in inverse identity transforms
- Fix build issue in ARM64 assembly if debug info was enabled
- Add a workaround for Xcode 11 -fstack-check bug
Changes for 0.5.0 'Asiatic Cheetah':
------------------------------------
0.5.0 is a medium release fixing regressions and minor issues,
and improving speed significantly:
- Export ITU T.35 metadata
- Speed improvements on blend_ on ARM
- Speed improvements on decode_coef and MSAC
- NEON optimizations for blend*, w_mask_, ipred functions for ARM64
- NEON optimizations for CDEF and warp on ARM32
- SSE2 optimizations for MSAC hi_tok decoding
- SSSE3 optimizations for deblocking loopfilters and warp_affine
- AVX-2 optimizations for film grain and ipred_z2
- SSE4 optimizations for warp_affine
- VSX optimizations for wiener
- Fix inverse transform overflows in x86 and NEON asm
- Fix integer overflows with large frames
- Improve film grain generation to match reference code
- Improve compatibility with older binutils for ARM
- More advanced Player example in tools
Changes for 0.4.0 'Cheetah':
----------------------------
- Fix playback with unknown OBUs
- Add an option to limit the maximum frame size
- SSE2 and ARM64 optimizations for MSAC
- Improve speed on 32bits systems
- Optimization in obmc blend
- Reduce RAM usage significantly
- The initial PPC SIMD code, cdef_filter
- NEON optimizations for blend functions on ARM
- NEON optimizations for w_mask functions on ARM
- NEON optimizations for inverse transforms on ARM64
- VSX optimizations for CDEF filter
- Improve handling of malloc failures
- Simple Player example in tools
Changes for 0.3.1 'Sailfish':
------------------------------
- Fix a buffer overflow in frame-threading mode on SSSE3 CPUs
- Reduce binary size, notably on Windows
- SSSE3 optimizations for ipred_filter
- ARM optimizations for MSAC
Changes for 0.3.0 'Sailfish':
------------------------------
This is the final release for the numerous speed improvements of 0.3.0-rc.
It mostly:
- Fixes an annoying crash on SSSE3 that happened in the itx functions
Changes for 0.2.2 (0.3.0-rc) 'Antelope':
-----------------------------
- Large improvement on MSAC decoding with SSE, bringing 4-6% speed increase
The impact is important on SSSE3, SSE4 and AVX-2 cpus
- SSSE3 optimizations for all blocks size in itx
- SSSE3 optimizations for ipred_paeth and ipred_cfl (420, 422 and 444)
- Speed improvements on CDEF for SSE4 CPUs
- NEON optimizations for SGR and loop filter
- Minor crashes, improvements and build changes
Changes for 0.2.1 'Antelope':
----------------------------
- SSSE3 optimization for cdef_dir
- AVX-2 improvements of the existing CDEF optimizations
- NEON improvements of the existing CDEF and wiener optimizations
- Clarification about the numbering/versionning scheme
Changes for 0.2.0 'Antelope':
----------------------------
- ARM64 and ARM optimizations using NEON instructions
- SSSE3 optimizations for both 32 and 64bits
- More AVX-2 assembly, reaching almost completion
- Fix installation of includes
- Rewrite inverse transforms to avoid overflows
- Snap packaging for Linux
- Updated API (ABI and API break)
- Fixes for un-decodable samples
Changes for 0.1.0 'Gazelle':
----------------------------
Initial release of dav1d, the fast and small AV1 decoder.
- Support for all features of the AV1 bitstream
- Support for all bitdepth, 8, 10 and 12bits
- Support for all chroma subsamplings 4:2:0, 4:2:2, 4:4:4 *and* grayscale
- Full acceleration for AVX-2 64bits processors, making it the fastest decoder
- Partial acceleration for SSSE3 processors
- Partial acceleration for NEON processors