- Nov 29, 2021
-
-
Henrik Gramner authored
-
Matthias Dressel authored
inv_txfm_add_16x4_<transform>_12bpc_{c,avx2}:
transform                 c   avx2
adst_adst_0          1756.6  182.4
adst_adst_1          1756.0  182.5
adst_adst_2          1763.2  182.4
adst_dct_0           1863.6  176.0
adst_dct_1           1864.1  176.0
adst_dct_2           1861.3  176.0
adst_flipadst_0      1768.6  184.1
adst_flipadst_1      1768.8  184.5
adst_flipadst_2      1769.3  184.7
adst_identity_0      1686.6  145.4
adst_identity_1      1685.8  145.8
adst_identity_2      1681.7  145.8
dct_adst_0           1783.4  167.7
dct_adst_1           1789.1  167.9
dct_adst_2           1788.0  169.8
dct_dct_0             209.5   21.6
dct_dct_1            1894.3  156.8
dct_dct_2            1892.0  156.8
dct_flipadst_0       1784.7  167.2
dct_flipadst_1       1796.7  168.6
dct_flipadst_2       1788.9  168.9
dct_identity_0       1712.7  128.8
dct_identity_1       1714.8  128.8
dct_identity_2       1710.2  128.8
flipadst_adst_0      1763.6  186.6
flipadst_adst_1      1761.1  185.6
flipadst_adst_2      1761.8  187.0
flipadst_dct_0       1864.4  176.8
flipadst_dct_1       1862.7  176.8
flipadst_dct_2       1860.2  176.8
flipadst_flipadst_0  1760.4  185.3
flipadst_flipadst_1  1761.8  185.3
flipadst_flipadst_2  1766.5  184.9
flipadst_identity_0  1673.0  143.1
flipadst_identity_1  1673.2  143.1
flipadst_identity_2  1681.6  143.2
identity_adst_0      1128.7  102.8
identity_adst_1      1131.3  101.3
identity_adst_2      1127.5   99.1
identity_dct_0       1228.3   88.3
identity_dct_1       1220.5   88.0
identity_dct_2       1227.3   88.1
identity_flipadst_0  1142.4  100.3
identity_flipadst_1  1134.1  100.3
identity_flipadst_2  1136.4  100.3
identity_identity_0  1056.1   61.6
identity_identity_1  1064.6   62.9
identity_identity_2  1067.5   63.5
-
Matthias Dressel authored
inv_txfm_add_4x16_<transform>_12bpc_{c,avx2}:
transform                 c   avx2
adst_adst_0          1799.1  178.8
adst_adst_1          1795.0  179.1
adst_adst_2          1806.6  179.3
adst_dct_0           1824.8  166.8
adst_dct_1           1828.2  166.7
adst_dct_2           1830.9  165.6
adst_flipadst_0      1797.9  179.6
adst_flipadst_1      1795.9  180.6
adst_flipadst_2      1791.6  180.1
adst_identity_0      1163.7   78.6
adst_identity_1      1163.4   78.9
adst_identity_2      1164.3   78.8
dct_adst_0           1914.8  177.0
dct_adst_1           1904.8  177.3
dct_adst_2           1905.4  176.4
dct_dct_0             217.1   26.6
dct_dct_1            1955.1  162.3
dct_dct_2            1948.9  162.2
dct_flipadst_0       1922.8  180.6
dct_flipadst_1       1919.7  180.1
dct_flipadst_2       1912.0  180.1
dct_identity_0       1276.4   75.4
dct_identity_1       1277.5   75.4
dct_identity_2       1270.1   75.3
flipadst_adst_0      1802.8  180.8
flipadst_adst_1      1804.8  180.7
flipadst_adst_2      1800.6  181.2
flipadst_dct_0       1842.5  165.1
flipadst_dct_1       1837.8  164.4
flipadst_dct_2       1841.6  166.1
flipadst_flipadst_0  1812.4  182.0
flipadst_flipadst_1  1803.9  181.2
flipadst_flipadst_2  1809.9  183.2
flipadst_identity_0  1170.5   78.4
flipadst_identity_1  1172.1   80.0
flipadst_identity_2  1170.9   78.6
identity_adst_0      1705.4  162.6
identity_adst_1      1714.5  162.6
identity_adst_2      1703.1  162.5
identity_dct_0       1775.0  150.5
identity_dct_1       1753.0  150.6
identity_dct_2       1759.6  149.8
identity_flipadst_0  1727.5  160.3
identity_flipadst_1  1739.8  160.9
identity_flipadst_2  1728.3  159.9
identity_identity_0  1098.6   60.4
identity_identity_1  1095.4   61.3
identity_identity_2  1111.6   60.6
-
Matthias Dressel authored
WHT uses no SSSE3 instructions. The 16bpc variant is already SSE2.
-
- Nov 18, 2021
-
-
The previous code could cause padded pixels along the right edge to be slightly off in some obscure cases.
-
- Nov 15, 2021
-
-
Henrik Gramner authored
-
Henrik Gramner authored
-
Henrik Gramner authored
-
Henrik Gramner authored
-
- Nov 13, 2021
-
-
Matthias Dressel authored
inv_txfm_add_8x8_<transform>_12bpc_{c,avx2}:
transform                 c   avx2
adst_adst_0          1997.9  185.7
adst_adst_1          2009.8  185.7
adst_dct_0           1991.0  161.3
adst_dct_1           1977.0  161.4
adst_flipadst_0      2017.6  184.2
adst_flipadst_1      2018.9  184.2
adst_identity_0      1407.2   95.7
adst_identity_1      1405.9   95.8
dct_adst_0           2024.2  156.9
dct_adst_1           2018.8  160.1
dct_dct_0             213.0   24.8
dct_dct_1            2008.6  139.0
dct_flipadst_0       2012.3  159.2
dct_flipadst_1       2005.1  158.7
dct_identity_0       1470.4   71.7
dct_identity_1       1477.8   70.7
flipadst_adst_0      2006.1  183.6
flipadst_adst_1      1987.6  183.6
flipadst_dct_0       1986.6  163.0
flipadst_dct_1       1979.3  163.1
flipadst_flipadst_0  2004.0  184.3
flipadst_flipadst_1  2003.9  184.3
flipadst_identity_0  1433.5   95.3
flipadst_identity_1  1425.4   96.3
identity_adst_0      1456.5  115.8
identity_adst_1      1453.5  115.8
identity_dct_0       1450.0   93.5
identity_dct_1       1447.5   94.3
identity_flipadst_0  1451.7  114.0
identity_flipadst_1  1456.4  114.0
identity_identity_0   892.3   33.7
identity_identity_1   897.2   33.1
-
Matthias Dressel authored
inv_txfm_add_8x4_<transform>_12bpc_{c,avx2}:
transform                 c   avx2
adst_adst_0           882.1  113.7
adst_adst_1           882.5  113.8
adst_dct_0            928.0  109.2
adst_dct_1            924.9  109.2
adst_flipadst_0       889.9  114.3
adst_flipadst_1       886.0  114.8
adst_identity_0       832.2   88.8
adst_identity_1       834.6   89.0
dct_adst_0            870.3   96.3
dct_adst_1            884.6   96.3
dct_dct_0             116.1   24.5
dct_dct_1             925.1   92.3
dct_flipadst_0        882.7   97.0
dct_flipadst_1        882.1   97.0
dct_identity_0        827.5   72.4
dct_identity_1        827.8   73.8
flipadst_adst_0       899.5  113.2
flipadst_adst_1       898.8  113.3
flipadst_dct_0        945.7  108.3
flipadst_dct_1        945.6  108.3
flipadst_flipadst_0   903.6  113.9
flipadst_flipadst_1   902.8  114.2
flipadst_identity_0   856.6   88.3
flipadst_identity_1   848.8   87.4
identity_adst_0       583.2   69.6
identity_adst_1       584.3   69.6
identity_dct_0        632.9   65.3
identity_dct_1        629.6   65.8
identity_flipadst_0   587.0   71.0
identity_flipadst_1   586.9   71.0
identity_identity_0   533.0   45.3
identity_identity_1   539.7   45.9
-
Matthias Dressel authored
inv_txfm_add_4x8_<transform>_12bpc_{c,avx2}:
transform                 c   avx2
adst_adst_0           900.8  118.8
adst_adst_1           893.7  118.8
adst_dct_0            890.2  104.8
adst_dct_1            887.4  104.8
adst_flipadst_0       919.6  116.6
adst_flipadst_1       912.1  116.6
adst_identity_0       613.5   42.8
adst_identity_1       608.7   43.3
dct_adst_0            951.7  113.8
dct_adst_1            949.0  113.1
dct_dct_0             118.6   24.5
dct_dct_1             942.4   99.2
dct_flipadst_0        959.3  113.9
dct_flipadst_1        964.1  114.3
dct_identity_0        659.9   41.9
dct_identity_1        658.6   41.6
flipadst_adst_0       906.6  117.3
flipadst_adst_1       907.7  117.3
flipadst_dct_0        890.3  104.6
flipadst_dct_1        895.6  104.6
flipadst_flipadst_0   902.9  116.5
flipadst_flipadst_1   915.0  116.4
flipadst_identity_0   618.6   45.3
flipadst_identity_1   618.1   44.0
identity_adst_0       829.7  107.4
identity_adst_1       831.7  107.8
identity_dct_0        823.2   90.7
identity_dct_1        824.1   90.7
identity_flipadst_0   853.4  106.8
identity_flipadst_1   852.2  106.8
identity_identity_0   543.2   36.4
identity_identity_1   544.8   36.6
-
- Nov 12, 2021
-
-
Ronald S. Bultje authored
Credit to OSS-Fuzz.
-
- Nov 11, 2021
-
-
Ronald S. Bultje authored
Credit to OSS-Fuzz.
-
- Nov 10, 2021
-
-
Henrik Gramner authored
Also fix some incorrect comments.
-
- Nov 05, 2021
-
-
Matthias Dressel authored
Bidirectional control and invisible characters can be used to hide malicious code. Ref: CVE-2021-42574, CVE-2021-42694
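As a rough illustration of the kind of check this implies, the sketch below scans a byte buffer for UTF-8 encoded bidirectional control and zero-width characters. The function name `has_hidden_unicode` and the exact set of rejected code points are assumptions made for this example, not the check the project itself uses, and the sketch does not attempt to detect homoglyphs.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: returns true if buf contains a UTF-8 encoded
 * bidirectional control character (U+202A..U+202E, U+2066..U+2069) or
 * one of the invisible characters U+200B..U+200F. Which code points a
 * project rejects is a policy choice; this list is an assumption. */
static bool has_hidden_unicode(const unsigned char *buf, size_t len)
{
    for (size_t i = 0; i + 2 < len; i++) {
        if (buf[i] != 0xE2) /* all targeted code points start with 0xE2 */
            continue;
        const unsigned char b1 = buf[i + 1], b2 = buf[i + 2];
        if (b1 == 0x80 && ((b2 >= 0x8B && b2 <= 0x8F) ||  /* U+200B..U+200F */
                           (b2 >= 0xAA && b2 <= 0xAE)))   /* U+202A..U+202E */
            return true;
        if (b1 == 0x81 && b2 >= 0xA6 && b2 <= 0xA9)       /* U+2066..U+2069 */
            return true;
    }
    return false;
}
```

A CI job could read each tracked source file into memory, call such a function on its contents, and fail when it returns true.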
-
- Nov 02, 2021
-
-
Matthias Dressel authored
Values need to be clipped after Hadamard rotations.
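To make the idea concrete, here is a minimal scalar sketch of an add/subtract butterfly whose outputs are clamped to a signed intermediate range. The names `clip_signed` and `hadamard_clip` and the `bits` parameter are hypothetical; the actual fix is in assembly, and the intermediate range it clips to is not stated in this log.

```c
#include <stdint.h>

/* Clamp v to the range of a signed integer with `bits` bits (2..32). */
static int32_t clip_signed(int64_t v, int bits)
{
    const int64_t max = (INT64_C(1) << (bits - 1)) - 1;
    const int64_t min = -max - 1;
    return (int32_t)(v < min ? min : v > max ? max : v);
}

/* One add/sub butterfly with both outputs clipped.
 * The sum and difference are formed in 64 bits so that the
 * illustration itself cannot overflow before clipping. */
static void hadamard_clip(int32_t *a, int32_t *b, int bits)
{
    const int64_t sum  = (int64_t)*a + *b;
    const int64_t diff = (int64_t)*a - *b;
    *a = clip_signed(sum, bits);
    *b = clip_signed(diff, bits);
}
```

Without the clamp, a sum such as 30000 + 10000 would no longer fit a 16-bit intermediate, which is the kind of overflow the clipping guards against.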
-
- Nov 01, 2021
-
-
Victorien Le Couviour--Tuffet authored
Credit to OSS-Fuzz.
-
- Oct 31, 2021
-
-
Niklas Haas authored
The signature of pl_allocate/release_dav1dpic takes a void *cookie, which the compiler warns about if we don't explicitly cast.
-
- Oct 29, 2021
-
-
Victorien Le Couviour--Tuffet authored
-
Victorien Le Couviour--Tuffet authored
-
Victorien Le Couviour--Tuffet authored
-
- Oct 28, 2021
-
-
Martin Storsjö authored
Use the check result instead of hardcoding which OSes have the function. This also requires checking for the pthread_np.h header and including it when testing for functions in meson, but it allows getting rid of the hardcoded OS conditions in the source. This fixes building for Android if _GNU_SOURCE happens to be defined. (It gets defined when building with a slightly nonstandard cross file that sets "system = 'linux'", but it could also have been set by the caller.)
-
- Oct 27, 2021
-
-
Salome Thirot authored
Add Branch Target Identifiers (BTIs) to all functions defined in AArch64 assembly files. BTI support is turned on or off at compile time based on the presence of the __ARM_FEATURE_BTI_DEFAULT feature macro. A binary compiled with BTI support can be executed on an Armv8-A processor without BTI support because the instructions are defined in NOP space.
Signed-off-by: Jonathan Wright <jonathan.wright@arm.com>
Signed-off-by: Matthew Dalzell <matthew.dalzell@arm.com>
Signed-off-by: Salome Thirot <salome.thirot@arm.com>
-
Salome Thirot authored
Using ret x<n> instead of br x<n> removes the need for a BTI landing pad at the target address in x<n>. Using 'ret' instead of 'br' does not have any performance implications.
Signed-off-by: Jonathan Wright <jonathan.wright@arm.com>
Signed-off-by: Matthew Dalzell <matthew.dalzell@arm.com>
Signed-off-by: Salome Thirot <salome.thirot@arm.com>
-
- Oct 18, 2021
-
-
Matthias Dressel authored
Refactors itx into separate 10, 12 bit functions to prevent conditional jumps.
inv_txfm_add_4x4_<transform>_12bpc_{c,avx2}:
transform                 c   avx2
adst_adst_0           370.9   68.6
adst_adst_1           371.0   68.7
adst_dct_0            413.1   69.2
adst_dct_1            412.7   68.8
adst_flipadst_0       378.5   74.9
adst_flipadst_1       378.1   74.6
adst_identity_0       347.8   48.8
adst_identity_1       342.7   49.0
dct_adst_0            399.2   73.1
dct_adst_1            398.7   72.2
dct_dct_0              69.6   32.9
dct_dct_1             420.5   72.2
dct_flipadst_0        405.5   75.9
dct_flipadst_1        404.2   75.6
dct_identity_0        374.1   51.6
dct_identity_1        368.0   51.8
flipadst_adst_0       368.0   69.2
flipadst_adst_1       370.7   70.4
flipadst_dct_0        393.7   70.1
flipadst_dct_1        392.9   69.6
flipadst_flipadst_0   382.2   74.6
flipadst_flipadst_1   381.3   74.9
flipadst_identity_0   346.7   48.2
flipadst_identity_1   347.9   48.7
identity_adst_0       344.7   59.8
identity_adst_1       340.5   59.2
identity_dct_0        369.8   59.3
identity_dct_1        369.5   59.2
identity_flipadst_0   353.4   65.6
identity_flipadst_1   350.9   65.9
identity_identity_0   326.1   39.5
identity_identity_1   321.6   39.5
-
Matthias Dressel authored
Use numerical GPR references everywhere for consistency.
-
Matthias Dressel authored
Give some constants a more explicit name to avoid confusion when 12bpc support is added.
-
Henrik Gramner authored
-
-
Henrik Gramner authored
-
Henrik Gramner authored
-
Henrik Gramner authored
-
Henrik Gramner authored
-
Henrik Gramner authored
-
Henrik Gramner authored
Realign the buffer if necessary to maintain 64-byte alignment.
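One common way to realign a pointer is to round it up to the next multiple of the alignment. The sketch below shows that technique in isolation; the helper name `align_up` is hypothetical, and how the commit itself decides when and where to realign is not described in this log.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Round ptr up to the next multiple of `align`, which must be a power
 * of two. The caller is assumed to have reserved at least align - 1
 * spare bytes so that the adjusted pointer stays inside the buffer. */
static void *align_up(void *ptr, size_t align)
{
    assert(align && (align & (align - 1)) == 0);
    const uintptr_t p = (uintptr_t)ptr;
    return (void *)((p + align - 1) & ~(uintptr_t)(align - 1));
}
```

For example, `align_up(buf, 64)` returns `buf` unchanged when it is already 64-byte aligned and otherwise advances it by at most 63 bytes.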
-
Henrik Gramner authored
Also make some minor optimizations to the AVX2 asm.
-
Henrik Gramner authored
-
Henrik Gramner authored
-
Henrik Gramner authored
-