Commits · master · Konstantin Pavlov / x264

Mar 14, 2019
- Fix warning in autocomplete.c when compiled with lavf · 5493be84
  Henrik Gramner authored 6 years ago
  
  5493be84
Mar 06, 2019
- Remove compatibility workarounds · d4099dd4
  Anton Mitrofanov authored 7 years ago
```
This will break decoding with older versions of FFmpeg/Libav.
```
  d4099dd4
- Remove h->rc dereferencing where possible · 120ed3af
  Anton Mitrofanov authored 6 years ago
  
  120ed3af
- x86inc: Add support for GFNI instructions · 3e5aed95
  Henrik Gramner authored 6 years ago and Anton Mitrofanov committed 6 years ago
  
  3e5aed95
- x86inc: Improve warnings for use of unsupported instructions · d3fa8b97
  Henrik Gramner authored 6 years ago and Anton Mitrofanov committed 6 years ago
```
Warn when the following are used without the appropriate cpuflag:
 * YMM and ZMM registers
 * 'pextrw' with a memory operand
 * GPR instruction set extensions
```
  d3fa8b97
- x86inc: Support N_PEXT bit on Mach-O · 101bd27d
  Henrik Gramner authored 6 years ago and Anton Mitrofanov committed 6 years ago
```
Allows for marking symbols as having limited global scope, similar to
using 'hidden' symbol visibility on ELF.
```
  101bd27d
- x86inc: Make 'non-adjacent' default in the TAIL_CALL macro · 6f85b3c4
  Henrik Gramner authored 6 years ago and Anton Mitrofanov committed 6 years ago
  
  6f85b3c4
- x86inc: Add x86-32 PIC support macros · 82721eae
  Henrik Gramner authored 6 years ago and Anton Mitrofanov committed 6 years ago
  
  82721eae
- x86inc: Turn 'movsxd' into 'movifnidn' on x86-32 · b7e9935c
  Henrik Gramner authored 6 years ago and Anton Mitrofanov committed 6 years ago
  
  b7e9935c
- Bump dates to 2019 · ec1d3230
  Henrik Gramner authored 6 years ago and Anton Mitrofanov committed 6 years ago
  
  ec1d3230
- cli: Bash autocomplete support · 74c051f2
  Henrik Gramner authored 6 years ago and Anton Mitrofanov committed 6 years ago
```
Allows for automatic command line completion for both options and values.

Options such as --input-csp and --input-fmt will dynamically retrieve
supported values from libavformat when compiled with lavf support.

Execute 'source tools/bash-autocomplete.sh' in bash to enable.
```
  74c051f2
- Signal Progressive and Constrained profiles · 92d36908
  Yusuke Nakamura authored 7 years ago and Anton Mitrofanov committed 6 years ago
```
Progressive High, Constrained High, and Progressive High 10.

Even in Main profile, constraint_set4_flag is now set to 1 if progressive,
and constraint_set5_flag is set to 1 if no B-slices are present.
```
  92d36908
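
The flag rules in the commit message above can be sketched roughly as follows. This is an illustrative reading, not x264's actual code: the struct, the enum, and both function names are hypothetical, and the profile mapping assumes the usual H.264 `profile_idc` values (77 for Main, 100 for High, 110 for High 10).

```c
#include <stdbool.h>

/* Hypothetical container for the two flags named in the commit message. */
typedef struct {
    bool constraint_set4_flag; /* set when the stream is progressive */
    bool constraint_set5_flag; /* set when no B-slices are present */
} constraint_flags_t;

/* Hypothetical labels for the profiles mentioned in the commit message. */
typedef enum {
    PROFILE_OTHER,
    PROFILE_MAIN,
    PROFILE_HIGH,
    PROFILE_PROGRESSIVE_HIGH,
    PROFILE_CONSTRAINED_HIGH,
    PROFILE_HIGH10,
    PROFILE_PROGRESSIVE_HIGH10
} profile_label_t;

/* Derive both flags from two stream properties (assumed to be the only inputs). */
static constraint_flags_t derive_constraint_flags(bool progressive, bool has_b_slices)
{
    constraint_flags_t f;
    f.constraint_set4_flag = progressive;
    f.constraint_set5_flag = !has_b_slices;
    return f;
}

/* Map profile_idc plus the two flags to the profile a decoder would see signalled. */
static profile_label_t classify_profile(int profile_idc, bool set4, bool set5)
{
    switch (profile_idc) {
    case 77:  /* Main: the flags are still set, but the profile name is unchanged */
        return PROFILE_MAIN;
    case 100: /* High family */
        if (set4 && set5) return PROFILE_CONSTRAINED_HIGH;
        if (set4)         return PROFILE_PROGRESSIVE_HIGH;
        return PROFILE_HIGH;
    case 110: /* High 10 family */
        return set4 ? PROFILE_PROGRESSIVE_HIGH10 : PROFILE_HIGH10;
    default:
        return PROFILE_OTHER;
    }
}
```

Under these assumptions, a progressive High stream without B-slices classifies as Constrained High, while the same stream with B-slices classifies as Progressive High.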
- ppc: Use xxpermdi in sad_x3/x4 and use macros to avoid redundant code · 57baac4e
  Alexandra Hájková authored 6 years ago and Anton Mitrofanov committed 6 years ago
  
  57baac4e
- ppc: Use the vec_xst_len for partial stores in mc · de380f4a
  Luca Barbato authored 6 years ago and Anton Mitrofanov committed 6 years ago
```
Roughly a 1% overall encoding speedup for --slow.
```
  de380f4a
- ppc: Use vec_splats in mc · 69dfb289
  Luca Barbato authored 6 years ago and Anton Mitrofanov committed 6 years ago
```
No overall speedup, just tidier code.
```
  69dfb289
- ppc: Use the vec_xst_len for partial stores · 40688108
  Luca Barbato authored 6 years ago and Anton Mitrofanov committed 6 years ago
```
Seems to give about a 1-2% overall speedup on --slow.
```
  40688108
- ppc: Use xxpermdi in VEC_STORE8 · 0d111333
  Luca Barbato authored 6 years ago and Anton Mitrofanov committed 6 years ago
```
Roughly a 2% overall encoding speedup for --slow.
```
  0d111333
- ppc: Use a single store to write the scores for sad_x4_8x8 · 18262ee3
  Luca Barbato authored 6 years ago and Anton Mitrofanov committed 6 years ago
```
Yet another use of xxpermdi, another 10% gain.
```
  18262ee3
- ppc: Use xxpermdi to halve the computation in sad_x4_8x8 · 28fb2661
  Luca Barbato authored 6 years ago and Anton Mitrofanov committed 6 years ago
```
About 20% faster.
```
  28fb2661
- ppc: Rework satd_4* likewise · 83acefef
  Luca Barbato authored 6 years ago and Anton Mitrofanov committed 6 years ago
```
Now 4x4 is as slow as C and 4x8 is 2% faster than before.
```
  83acefef
- ppc: Factor out the sum of absolute · e0d846a6
  Luca Barbato authored 6 years ago and Anton Mitrofanov committed 6 years ago
```
And use it on the other satd > 8.

5-10% faster depending on the size.
```
  e0d846a6
- ppc: Rework the adds in satd8x8 · 6e74eb5a
  Luca Barbato authored 6 years ago and Anton Mitrofanov committed 6 years ago
```
10% faster.
```
  6e74eb5a
- ppc: Add quant_4x4x4 · 4dd83955
  Luca Barbato authored 6 years ago and Anton Mitrofanov committed 6 years ago
```
4x faster than C.
```
  4dd83955
- ppc: Cleanup quant · 8f6ac77f
  Luca Barbato authored 6 years ago and Anton Mitrofanov committed 6 years ago
  
  8f6ac77f
- x86: Always use PIC in x86-64 asm · 275ef533
  Henrik Gramner authored 6 years ago and Anton Mitrofanov committed 6 years ago
```
Most x86-64 operating systems nowadays don't even allow .text relocations
in object files any more, and there is no measurable overall performance
difference from using RIP-relative addressing in x264 asm.

Enforcing PIC reduces complexity and simplifies testing.
```
  275ef533
Mar 03, 2019
- x86: Fix integer overflow in intra_sa8d_x3_8x8_sse2 · 72db4377
  Henrik Gramner authored 6 years ago
  
  72db4377
- Check that mbtree settings are consistent between passes · 88943afa
  Anton Mitrofanov authored 6 years ago and Henrik Gramner committed 6 years ago
```
Also check that CQP mode is not used with 2-pass.
```
  88943afa
- Mark frame_size_estimated as volatile · 6d8af5f0
  Anton Mitrofanov authored 6 years ago and Henrik Gramner committed 6 years ago
```
Ensures that access is atomic and that other threads see the actual
value of the variable.
```
  6d8af5f0
- Fix data race detected by ThreadSanitizer · a6327f8a
  Anton Mitrofanov authored 6 years ago and Henrik Gramner committed 6 years ago
```
Bug report by Daniel Deptford.
```
  a6327f8a
- Fix XAVC with sliced-threads · 6172da4d
  Anton Mitrofanov authored 6 years ago and Henrik Gramner committed 6 years ago
  
  6172da4d
- Fix XAVC slice pattern · c7ec24cf
  Anton Mitrofanov authored 6 years ago and Henrik Gramner committed 6 years ago
  
  c7ec24cf
- Eliminate the use of strtok() · 6aa4b592
  Henrik Gramner authored 6 years ago
```
Also fix the string parsing in param_apply_tune() to correctly compare
the entire string, not just the first N characters.
```
  6aa4b592
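
As a minimal sketch of the idea described above (not the code from this commit), a delimiter-separated list can be walked with `strspn()`/`strcspn()` instead of `strtok()`, and each token compared against a name over its whole length rather than a prefix. The helper names `token_equals` and `count_matching_tokens` are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns 1 only if the token [tok, tok + len) equals
 * `name` over its entire length, so "filmgrain" does not match "film". */
static int token_equals(const char *tok, size_t len, const char *name)
{
    return strlen(name) == len && memcmp(tok, name, len) == 0;
}

/* Hypothetical helper: counts the tokens in `list` that exactly match `name`.
 * Unlike strtok(), this neither modifies `list` nor keeps hidden static state,
 * so it is safe to call on string literals and from multiple threads. */
static int count_matching_tokens(const char *list, const char *delims, const char *name)
{
    int matches = 0;
    while (*list) {
        list += strspn(list, delims);        /* skip any leading delimiters */
        size_t len = strcspn(list, delims);  /* length of the next token */
        if (len && token_equals(list, len, name))
            matches++;
        list += len;                         /* advance past the token */
    }
    return matches;
}
```

For comparison, a prefix check such as `strncmp("filmgrain", "film", 4) == 0` succeeds, which is the kind of false match a whole-token comparison avoids.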
Dec 23, 2018
- configure: Fix log2f misdetection on some systems · d6af8239
  Anton Mitrofanov authored 6 years ago and Henrik Gramner committed 6 years ago
```
Bug report by Dirk Fieldhouse.
```
  d6af8239
- Fix ultrafast preset speed regression · b763e338
  Anton Mitrofanov authored 6 years ago and Henrik Gramner committed 6 years ago
```
--trellis 0 was missed for it during 8-bit and 10-bit unification.
Bug report by Aleksey Vasenev.
```
  b763e338
- Fix --crop-rect top offset with --interlaced or --fake-interlaced · b048e265
  Anton Mitrofanov authored 6 years ago and Henrik Gramner committed 6 years ago
```
Bug report by Koby Shina.
```
  b048e265
Sep 25, 2018
- Fix possible double transpose of custom CQM if --level is not set · 545de2ff
  Anton Mitrofanov authored 6 years ago and Henrik Gramner committed 6 years ago
```
Bug report by Nicolas Gaullier.
```
  545de2ff
Aug 22, 2018
- cli: Fix linking with --system-libx264 on x86 · b63c73dc
  Henrik Gramner authored 6 years ago
  
  b63c73dc
- Fix CAVLC+RDO in 4:4:4 · fb17a6b5
  Anton Mitrofanov authored 6 years ago and Henrik Gramner committed 6 years ago
  
  fb17a6b5
Aug 06, 2018

- ppc: Optimize quant functions · 303c484e
  Alexandra Hájková authored 6 years ago and Henrik Gramner committed 6 years ago

1) using xxpermdi + merge instead of 2 merges improves quant_8x8
performance by 5%

2) use vec_splats instead of vec_splat

checkasm timings when compiled with gcc:
                  C:            AltiVec:
                                before: after:
quant_2x2_dc:      57            163      46
quant_4x4_dc:     141            162      57

dequant_4x4_cmp:  104            101      45
dequant_4x4_flat: 104            106      46
dequant_8x8_cmp:  412            208     147
dequant_8x8_flat: 414            212     149

  303c484e

- ppc: Add support for Power9-only vec_absd · 44f16713
  Alexandra Hajkova authored 6 years ago and Henrik Gramner committed 6 years ago
```
Increases overall encoding speed on POWER9 by 8%.
```
  44f16713