Commits · master · Link Mauve / dav1d

Jul 08, 2019
- dav1d_fuzzer: use Dav1dSettings.frame_size_limit instead of a custom picture allocator · 6ef9a030
  James Almer authored 5 years ago
```
Limit frame size in pixels to about 16MP, while allowing the fuzzer to test
frame widths and heights above 4096.
```
  6ef9a030
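
The pixel-count limit described in this commit can be sketched roughly as follows. This is an illustrative check, not dav1d's code: `frame_fits_limit` is a hypothetical helper, and treating a limit of 0 as "no limit" is an assumption made for the sketch. The point is that a limit on total pixels still admits frames wider or taller than 4096.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, not part of dav1d: accept a frame when its pixel
 * count fits within the limit. A limit of 0 is treated as "no limit"
 * here, which is an assumption of this sketch. */
static bool frame_fits_limit(uint32_t width, uint32_t height, uint64_t limit)
{
    if (limit == 0)
        return true;
    /* Widen before multiplying so large dimensions cannot overflow. */
    return (uint64_t)width * height <= limit;
}
```

With a limit of 16777216 pixels (about 16MP), an 8192x2048 frame passes even though its width exceeds 4096, while an 8192x4096 frame is rejected.
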
Jul 07, 2019
- Fix memory leak in dav1d_submit_frame() · ee31bb85
  Henrik Gramner authored 5 years ago
  
  ee31bb85
Jul 06, 2019
- obu: also check frame_size_limit with Frame Header OBUs · 1681028f
  James Almer authored 5 years ago
  
  1681028f
Jul 05, 2019

Improve robustness of handling malloc failures · e2e56ab9

Henrik Gramner authored 5 years ago and Henrik Gramner committed 5 years ago

Calling dav1d_get_picture() again after it has already returned with
an error due to a memory allocation failure could result in crashes.

Although doing so is not proper API usage and the outcome is
unpredictable, we should at least try to avoid crashing.

e2e56ab9

Correctly return an error on malloc failure · c1a28d0e

Henrik Gramner authored 5 years ago and Henrik Gramner committed 5 years ago

dav1d_submit_frame() could erroneously return 0 when tile data memory
allocation failed.

Fixes an assertion failure in dav1d_parse_obus().

c1a28d0e

Fix potential memory leak · 0435ec9c

Henrik Gramner authored 5 years ago and Henrik Gramner committed 5 years ago

In the (very unlikely) scenario of a pthread mutex/cond init failure
in the tile state reallocation code, some newly allocated mutexes/conds
could leak.

0435ec9c

Jul 02, 2019

arm: mc: neon: Improvement in blend_v function · 632b4876

B Krishnan Iyer authored 5 years ago and Martin Storsjö committed 5 years ago

                        A73               A53
                        Earlier  Now      Earlier  Now

blend_v_w2_8bpc_neon:   122.1    121.3    195.5    195.5
blend_v_w4_8bpc_neon:   248.2    247.5    375.6    358.5
blend_v_w8_8bpc_neon:   210.3    205.2    375.6    358.5
blend_v_w16_8bpc_neon:  252.7    237.1    579.2    590.5
blend_v_w32_8bpc_neon:  347      345.8    997.4    994.1

632b4876

Reduce the size of frame threading buffers · 65ba279b
Henrik Gramner authored 5 years ago and Henrik Gramner committed 5 years ago
```
Avoid allocating significantly more memory than what is actually used.
```
65ba279b

Consolidate scratch buffers · 0276455d

Henrik Gramner authored 5 years ago and Henrik Gramner committed 5 years ago

Also eliminate some pointer chasing by allocating tile context buffers
as part of the struct instead of having the struct contain pointers to
separately allocated buffers.

0276455d
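
The pointer-chasing point can be illustrated with a small sketch. The struct names, the buffer length, and the helper functions below are hypothetical and are not taken from dav1d; they only contrast a context that points at a separately allocated buffer with one that embeds the buffer directly.

```c
#include <stdint.h>
#include <stdlib.h>

enum { SCRATCH_LEN = 64 }; /* hypothetical buffer length */

/* Before: the context holds a pointer to a separately allocated buffer,
 * so each access loads the pointer first and setup needs two allocations. */
typedef struct {
    int16_t *scratch;
} TileCtxIndirect;

/* After: the buffer is part of the struct, so one allocation covers both
 * and the buffer address is a fixed offset from the context pointer. */
typedef struct {
    int16_t scratch[SCRATCH_LEN];
} TileCtxEmbedded;

static TileCtxIndirect *tile_ctx_indirect_new(void)
{
    TileCtxIndirect *t = malloc(sizeof(*t));
    if (!t)
        return NULL;
    t->scratch = calloc(SCRATCH_LEN, sizeof(*t->scratch));
    if (!t->scratch) { /* second failure path that needs cleanup */
        free(t);
        return NULL;
    }
    return t;
}

static void tile_ctx_indirect_free(TileCtxIndirect *t)
{
    if (t) {
        free(t->scratch);
        free(t);
    }
}

static TileCtxEmbedded *tile_ctx_embedded_new(void)
{
    /* Single zeroed allocation; release it with free(). */
    return calloc(1, sizeof(TileCtxEmbedded));
}
```

The embedded variant has one allocation, one failure path, and no extra pointer load per access, which is the kind of saving the commit message describes.
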

build: fix meson deprecation warning · beda6e0d

Victorien Le Couviour--Tuffet authored 5 years ago

The 'build_' prefix is reserved by meson; using it will become an error
in the future, as indicated by a warning when configuring the build dir.

Closes #285.

beda6e0d

Jun 30, 2019

checkasm: msac: Add verbose printouts on failures · c9f19b1f
Martin Storsjö authored 5 years ago

c9f19b1f

checkasm: cdef: Add verbose prints for output data (and relevant input) · 13a7d786

Martin Storsjö authored 5 years ago

For the cdef_filter tests, one could also extend the buffer to
contain 16*11 pixels, to simplify printing it as one rectangular
section.

Extend the common hex_dump function to allow dumping to an arbitrary
FILE* pointer, to reuse it for printing the source pixel buffer in
case of errors.

13a7d786
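
A hex dump routine that writes to an arbitrary `FILE*` might look roughly like the sketch below. The name `hex_dump_to`, its signature, and the output format are assumptions for illustration and do not reproduce checkasm's actual `hex_dump` function.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Illustrative sketch: print a w x h block of 8-bit pixels as hex,
 * one row per line, to any stream. stride is the distance in bytes
 * between the starts of consecutive rows. */
static void hex_dump_to(FILE *out, const uint8_t *buf, ptrdiff_t stride,
                        int w, int h)
{
    for (int y = 0; y < h; y++) {
        for (int x = 0; x < w; x++)
            fprintf(out, " %02x", buf[x]);
        fputc('\n', out);
        buf += stride;
    }
}
```

Taking the stream as a parameter lets the same routine print to stdout, stderr, or a log file, which is what makes it reusable for dumping source buffers on errors.
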

Jun 29, 2019
- checkasm: looprestoration: Use checkasm_check* · 7107c2f1
  Martin Storsjö authored 5 years ago
  
  7107c2f1
- checkasm: loopfilter: Use checkasm_check* · 578489df
  Martin Storsjö authored 5 years ago
  
  578489df
- checkasm: ipred: Use checkasm_check* · 764e8ea1
  Martin Storsjö authored 5 years ago
  
  764e8ea1
Jun 27, 2019
- ci: add test stage for clang armv7a build · fcb6a6da
  Janne Grunau authored 5 years ago and Jean-Baptiste Kempf committed 5 years ago
  
  fcb6a6da
- checkasm: mc: Use checkasm_check_* for better debuggability · 18df7139
  Martin Storsjö authored 5 years ago and Jean-Baptiste Kempf committed 5 years ago
  
  18df7139
Jun 26, 2019

arm64: itx: Add NEON optimized inverse transforms · ef1ea008

Martin Storsjö authored 5 years ago and Jean-Baptiste Kempf committed 5 years ago

The speedup for most non-dc-only dct functions is around 9-12x
over the C code generated by GCC 7.3.

Relative speedups vs C for a few functions:

Cortex                                   A53    A72    A73
inv_txfm_add_4x4_dct_dct_0_8bpc_neon:    3.90   4.16   5.65
inv_txfm_add_4x4_dct_dct_1_8bpc_neon:    7.20   8.05   11.19
inv_txfm_add_8x8_dct_dct_0_8bpc_neon:    5.09   6.73   6.45
inv_txfm_add_8x8_dct_dct_1_8bpc_neon:    12.18  10.80  13.05
inv_txfm_add_16x16_dct_dct_0_8bpc_neon:  7.31   9.35   11.17
inv_txfm_add_16x16_dct_dct_1_8bpc_neon:  14.36  13.06  15.93
inv_txfm_add_16x16_dct_dct_2_8bpc_neon:  11.00  10.09  12.05
inv_txfm_add_32x32_dct_dct_0_8bpc_neon:  4.41   5.40   5.77
inv_txfm_add_32x32_dct_dct_1_8bpc_neon:  13.84  13.81  18.04
inv_txfm_add_32x32_dct_dct_2_8bpc_neon:  11.75  11.87  15.22
inv_txfm_add_32x32_dct_dct_3_8bpc_neon:  10.20  10.40  13.13
inv_txfm_add_32x32_dct_dct_4_8bpc_neon:  9.01   9.21   11.56
inv_txfm_add_64x64_dct_dct_0_8bpc_neon:  3.84   4.82   5.28
inv_txfm_add_64x64_dct_dct_1_8bpc_neon:  14.40  12.69  16.71
inv_txfm_add_64x64_dct_dct_4_8bpc_neon:  10.91  9.63   12.67

Some of the special-cased identity_identity transforms for 32x32
give insane speedups over the generic C code:

Cortex                                             A53     A72     A73
inv_txfm_add_32x32_identity_identity_0_8bpc_neon:  225.26  238.11  247.07
inv_txfm_add_32x32_identity_identity_1_8bpc_neon:  225.33  238.53  247.69
inv_txfm_add_32x32_identity_identity_2_8bpc_neon:  59.60   61.94   64.63
inv_txfm_add_32x32_identity_identity_3_8bpc_neon:  26.98   27.99   29.21
inv_txfm_add_32x32_identity_identity_4_8bpc_neon:  15.08   15.93   16.56

ef1ea008

tools: Use DAV1D_ERR for strerror calls · e0346114
Marvin Scholz authored 5 years ago and Jean-Baptiste Kempf committed 5 years ago

e0346114

include: Consistently use DAV1D_ERR in docs · 04dc8a4d
Marvin Scholz authored 5 years ago and Jean-Baptiste Kempf committed 5 years ago

04dc8a4d

Jun 24, 2019
- checkasm: itx: Add verbose printouts for the pixel differences · c1b3e1a9
  Martin Storsjö authored 5 years ago
  
  c1b3e1a9
- checkasm: Add functions for printing pixel buffers · c950e710
  Martin Storsjö authored 5 years ago
  
  c950e710
Jun 21, 2019
- arm: mc: Move the blend functions up above put/prep · 46980237
  Martin Storsjö authored 5 years ago
```
This keeps the put/prep functions close to the 8tap/bilin functions
that use them.
```
  46980237
Jun 20, 2019
- arm64: Consistently name macro arguments tX for temporaries in transposes · 4a2ea99d
  Martin Storsjö authored 5 years ago
  
  4a2ea99d
Jun 19, 2019
- cli: use mach_absolute_time as fallback for clock_gettime on darwin. Fixes #283 · 79e4a5f7
  Janne Grunau authored 5 years ago
```
clock_gettime() has only been available since macOS 10.12 (Sierra).
```
  79e4a5f7
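
A timer that avoids `clock_gettime()` on Darwin could be sketched along these lines. The helper name `now_ns` is hypothetical, and this simplified version always takes the `mach_absolute_time()` path on Apple targets rather than using it only as a fallback, so it should be read as an illustration of the idea and not as the code from this commit.

```c
#define _POSIX_C_SOURCE 199309L /* for clock_gettime() on strict compilers */
#include <stdint.h>
#ifdef __APPLE__
#include <mach/mach_time.h>
#else
#include <time.h>
#endif

/* Hypothetical helper: monotonic timestamp in nanoseconds. */
static uint64_t now_ns(void)
{
#ifdef __APPLE__
    /* mach_absolute_time() counts in timebase units; numer/denom converts
     * them to nanoseconds. The multiplication may overflow after very long
     * uptimes, which this sketch does not guard against. */
    mach_timebase_info_data_t tb;
    mach_timebase_info(&tb);
    return mach_absolute_time() * tb.numer / tb.denom;
#else
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000u + (uint64_t)ts.tv_nsec;
#endif
}
```

Elapsed time is then the difference between two `now_ns()` calls, regardless of which branch was compiled in.
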
Jun 14, 2019

arm:mc: NEON implementation of blend, blend_h and blend_v function · a1e3f358

B Krishnan Iyer authored 6 years ago and B Krishnan Iyer committed 5 years ago

                         A73      A53

blend_h_w2_8bpc_c:       149.3    246.8
blend_h_w2_8bpc_neon:    74.6     137
blend_h_w4_8bpc_c:       251.6    409.8
blend_h_w4_8bpc_neon:    66       146.6
blend_h_w8_8bpc_c:       446.6    844.1
blend_h_w8_8bpc_neon:    68.6     131.2
blend_h_w16_8bpc_c:      830      1513
blend_h_w16_8bpc_neon:   85.9     192
blend_h_w32_8bpc_c:      1605.2   2847.8
blend_h_w32_8bpc_neon:   149.8    357.6
blend_h_w64_8bpc_c:      3304.8   5515.5
blend_h_w64_8bpc_neon:   262.8    629.5
blend_h_w128_8bpc_c:     7895.1   13260.6
blend_h_w128_8bpc_neon:  577      1402
blend_v_w2_8bpc_c:       241.2    410.8
blend_v_w2_8bpc_neon:    122.1    196.8
blend_v_w4_8bpc_c:       874.4    1418.2
blend_v_w4_8bpc_neon:    248.5    375.9
blend_v_w8_8bpc_c:       1550.5   2514.7
blend_v_w8_8bpc_neon:    210.8    376
blend_v_w16_8bpc_c:      2925.3   5086
blend_v_w16_8bpc_neon:   253.4    608.3
blend_v_w32_8bpc_c:      5686.7   9470.5
blend_v_w32_8bpc_neon:   348.2    994.8
blend_w4_8bpc_c:         201.5    309.3
blend_w4_8bpc_neon:      38.6     99.2
blend_w8_8bpc_c:         531.3    944.8
blend_w8_8bpc_neon:      55.1     125.8
blend_w16_8bpc_c:        1992.8   3349.8
blend_w16_8bpc_neon:     150.1    344
blend_w32_8bpc_c:        4982     8165.9
blend_w32_8bpc_neon:     360.4    910.9

a1e3f358

Jun 10, 2019
- checkasm: Add an option to benchmark the C code as well · efd852af
  Luca Barbato authored 5 years ago and Luca Barbato committed 5 years ago
  
  efd852af
- checkasm: Add a --help option to checkasm · f6024104
  Luca Barbato authored 5 years ago and Luca Barbato committed 5 years ago
  
  f6024104
- checkasm: Add a readtime impl for ppc · 2073ea11
  Luca Barbato authored 5 years ago and Luca Barbato committed 5 years ago
  
  2073ea11
- Initial PowerPC support · 197032bd
  Luca Barbato authored 5 years ago and Luca Barbato committed 5 years ago
```
Limited to PowerPC64 LE for now.
```
  197032bd
- meson: Look for librt if clock_gettime isn't found without it · 39dba4cd
  Martin Storsjö authored 5 years ago
```
On older versions of glibc, clock_gettime isn't available in the main
libc, but is part of a separate librt.

Only look for librt if clock_gettime isn't available otherwise.
```
  39dba4cd
Jun 09, 2019
- meson: simplify a few checks for x86 targets · e0623286
  James Almer authored 5 years ago
  
  e0623286
- x86: include config.asm in x86inc instead of every asm file · 1df18164
  James Almer authored 5 years ago
  
  1df18164
Jun 07, 2019

checkasm: Check for __ARM_ARCH >= 7 for the arm cpu timer inline assembly · 13067916

Martin Storsjö authored 5 years ago and Janne Grunau committed 5 years ago

This fixes building with Raspbian compilers, which default to armv6.
The isb instruction is unavailable on armv6, and the cycle counter
register is accessed differently there as well.

This fixes issue #282.

13067916
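
The compile-time guard described above might be sketched as follows. `HAVE_CYCLE_COUNTER` and `read_cycles` are illustrative names and not checkasm's own identifiers; the fallback branch simply returns 0 so that the sketch compiles on ARMv6 and on non-ARM targets.

```c
#include <stdint.h>

/* Sketch only: choose a timer implementation at compile time. */
#if defined(__arm__) && defined(__ARM_ARCH) && __ARM_ARCH >= 7
#define HAVE_CYCLE_COUNTER 1
static inline uint64_t read_cycles(void)
{
    uint32_t cycles;
    /* isb is available from ARMv7 on; the mrc reads the cycle counter.
     * User-mode access to the counter must have been enabled by the
     * kernel, otherwise this instruction traps. */
    __asm__ __volatile__("isb\nmrc p15, 0, %0, c9, c13, 0"
                         : "=r"(cycles)
                         :
                         : "memory");
    return cycles;
}
#else
/* ARMv6 and non-ARM targets: no inline-assembly timer in this sketch. */
#define HAVE_CYCLE_COUNTER 0
static inline uint64_t read_cycles(void)
{
    return 0;
}
#endif
```

Callers can test `HAVE_CYCLE_COUNTER` to decide whether benchmark numbers are meaningful on the current target.
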

Jun 06, 2019
- CI: Added ppc64le build and test jobs · 6c90f005
  Konstantin Pavlov authored 5 years ago
  
  6c90f005
Jun 05, 2019
- Update NEWS for 0.4.0 · 3e3855bf
  Jean-Baptiste Kempf authored 5 years ago
  
  3e3855bf
- output: automatically use null muxer for /dev/null · 75c3f4a4
  Tristan Matthews authored 5 years ago
  
  75c3f4a4
Jun 04, 2019

meson: Fix nasm detection · 098a565c

Marvin Scholz authored 5 years ago

nasm -v can actually fail, for example on macOS, where nasm could be a
stub executable that forwards commands to the real nasm and fails if the
real nasm is not installed.
This would lead to a confusing error message due to an out-of-bounds
array access. To avoid that, explicitly check the exit code.

098a565c

Jun 01, 2019
- checkasm: Fix out-of-bounds read in warp8x8 tests · 0040d92b
  Henrik Gramner authored 5 years ago and Henrik Gramner committed 5 years ago
  
  0040d92b
May 31, 2019
- x86: Optimize warp8x8 AVX2 asm · 5bc43169
  Henrik Gramner authored 5 years ago and Henrik Gramner committed 5 years ago
  
  5bc43169