inline assembly: missing clobbers and "memory"
Dear developers,
we are academic researchers working in automated program analysis. We are currently interested in checking compliance of inline asm chunks as found in C programs. While benchmarking our tool and technique, we found a number of issues in x264.
We report them to you, as well as adequate patches 0001-inline-assembly-add-mmx-and-xmm-clobber-information.patch.
Actually, we found that your assembly chunks do not declare the SIMD (mmx/xmm) registers in the clobber list when they use them. Such compliance issues may trigger a bug at some point in time with compiler optimizations becoming more and more aggressive (e.g. inlining with Link Time Optimizations).
We want also to point out that 4 chunks in common/x86/predict-c.c
and
additional 4 in common/x86/util.h
miss proper memory information.
For instance, we found in the function x264_predictor_difference_mmx2
that there is
a phantom entry "m"(M64( mvc ))
to inform that memory pointed by mvc
is
read. However, given the definition of M64
, the compiler is only aware
that 8 bytes are read from mvc
. This is incorrect as the actual number of read bytes depends of the
variable i_mvc
.
We have not included the fixes in our patch since there is no perfect way to solve the issue.
It is possible to:
- remove the phantom entry and add the "memory" keyword to the clobber. It is the safest and the most portable (across the compilers) solution, but it may have an impact on performance;
- cast
mvc
to a dynamically sized array, ie"m"(*(uint16_t [i_mvc][2])mvc)
. This will work with newer version of GCC. Clang seems to accept it as well. However, ICC will not work correctly due to an internal bug; - cast
mvc
to a dynamically sized struct, ie"m"(*(union{uint16_t t[i_mvc][2];}*)mvc)
. This will work great with GCC and ICC, but Clang does not support variable sized array in struct.
Let us know if you are interested by one of them and we will send you the appropriate patches.
We would be very interested to hear your opinion on these matters. Are you interested in such errors and patches? Also, besides the patches, we are currently working on a code analyzer prototype designed to check asm compliance and to propose patches when the chunk is not compliant. This is still work in progress and we are finalizing it. The errors and patches reported here to you came from our prototype. In case such a prototype would be made available, would you consider using it?
Best regards,