shaders/sampling: don't unroll whole ortho sampling
Fully unrolling brings no performance gain and makes the shaders huge and slow to compile.
This also avoids recompiling the shader when the kernel size changes, because the size is now a dynamic variable.
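
To illustrate the trade-off, here is a minimal sketch in C of the two code generation strategies. It is not the actual code from this change: the helper names (`emit_unrolled`, `emit_loop`) and the GLSL identifiers (`weights`, `src`, `pos`, `dir`, `kernel_taps`) are hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: emit a fully unrolled 1D filter, one GLSL statement
 * per tap. The generated text grows linearly with `taps`, and any change in
 * `taps` yields different source, which forces a recompile. */
static size_t emit_unrolled(char *buf, size_t cap, int taps)
{
    assert(buf && cap > 0 && taps > 0);
    size_t len = 0;
    buf[0] = '\0';
    for (int i = 0; i < taps; i++) {
        int n = snprintf(buf + len, cap - len,
                         "color += weights[%d] * texture(src, pos + %d.0 * dir);\n",
                         i, i);
        if (n < 0 || (size_t) n >= cap - len) {
            buf[len] = '\0'; /* out of space: drop the partial statement */
            break;
        }
        len += (size_t) n;
    }
    return len;
}

/* Hypothetical helper: emit a loop whose bound is read at run time from a
 * variable named `kernel_taps` (assumed to be supplied by the caller, e.g. as
 * a uniform). The generated text is identical for every kernel size. */
static size_t emit_loop(char *buf, size_t cap)
{
    assert(buf && cap > 0);
    int n = snprintf(buf, cap,
                     "for (int i = 0; i < kernel_taps; i++)\n"
                     "    color += weights[i] * texture(src, pos + float(i) * dir);\n");
    if (n < 0 || (size_t) n >= cap) {
        buf[0] = '\0';
        return 0;
    }
    return (size_t) n;
}
```

In this sketch, `emit_unrolled` produces four times as many statements for 32 taps as for 8, while `emit_loop` produces the same two lines regardless, so a cache keyed on the shader text would keep hitting when only the kernel size changes.
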
Comparison on a Ryzen 7950X.
Before:
Spent 27.292 ms translating SPIR-V
Spent 1317.319 ms translating SPIR-V (slow!)
Spent 0.052 ms compiling shader
Spent 67.641 ms creating pipeline
After:
Spent 28.963 ms translating SPIR-V
Spent 77.494 ms translating SPIR-V
Spent 0.043 ms compiling shader
Spent 0.199 ms creating pipeline
Note that this is done twice, once for the vertical and once for the horizontal kernel. On top of that, some platforms need an additional step to translate SPIR-V to another format such as DXBC.
This is an attempt to mitigate the excessive compilation times that some users were seeing for no good reason.