Add AVX2 SIMD for cdef_filter (4x4 and 8x8, 8bpc)
cdef_filter_4x4_8bpc_c: 2273.6
cdef_filter_4x4_8bpc_avx2: 113.6
cdef_filter_8x8_8bpc_c: 7913.0
cdef_filter_8x8_8bpc_avx2: 309.9
Decoding time for the first 1000 frames of Chimera 1080p drops to 15.51s, from 23.1s before any cdef_filter SIMD and 17.86s with only the 8x8 cdef_filter SIMD.
Also add unit tests and rewrite the C code to remove the last remnants of libaom code in cdef.c.