gpu: completely refactor texture transfer strides
Replace the stride_w/h fields (specified in texels) with the more
common convention of row_pitch and depth_pitch, specified in bytes,
matching the convention of e.g. DMAbufs.
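For callers porting existing code, the conversion from the old texel strides to the new byte pitches can be sketched roughly as follows. This is only an illustration: the struct and helper names are hypothetical and not part of the actual API, and it assumes stride_w counts texels per row and stride_h counts rows per 2D slice.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical container for the two byte pitches (illustration only). */
struct byte_pitches {
    size_t row_pitch;   /* bytes between the starts of consecutive rows */
    size_t depth_pitch; /* bytes between the starts of consecutive 2D slices */
};

/* Convert texel-based strides to byte-based pitches.
 * Assumption: stride_w is in texels per row, stride_h is in rows per slice,
 * and texel_size is the size of one texel in bytes. */
static struct byte_pitches pitches_from_texel_strides(size_t stride_w,
                                                      size_t stride_h,
                                                      size_t texel_size)
{
    assert(texel_size > 0);

    struct byte_pitches p;
    p.row_pitch   = stride_w * texel_size;
    p.depth_pitch = stride_h * p.row_pitch;
    return p;
}
```

For instance, a buffer with stride_w = 256 and stride_h = 64 holding 4-byte texels would map to a row_pitch of 1024 bytes and a depth_pitch of 65536 bytes.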
Also fixes several implicit and explicit bugs, such as the texture transfer offset alignment accidentally being applied in units of texels instead of bytes (even though bytes are what e.g. the Vulkan spec actually specifies for optimalBufferCopyRowPitchAlignment).
OpenGL has no alignment requirements, because it always falls back to (horrifically inefficient) strided upload paths. For Vulkan, in the emulated upload paths (which are effectively required for all NPOT textures), we only require alignment to individual components (e.g. 2 bytes for 16-bit textures). This should allow things like uploading png24 to just transparently work, fixing several issues with edge cases like these.
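To illustrate why component-level alignment matters for tightly packed data, here is a minimal sketch of a byte-based alignment check. The helper name is hypothetical and this is not the actual implementation; it only assumes that alignment is expressed in bytes, as described above.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical check: is a byte pitch a multiple of the required
 * alignment (also in bytes)? An alignment of 0 is treated as invalid. */
static bool pitch_is_aligned(size_t row_pitch, size_t align)
{
    return align != 0 && row_pitch % align == 0;
}
```

A tightly packed 5-texel-wide png24 row is 15 bytes long. With 8-bit components the required alignment is 1 byte, so pitch_is_aligned(15, 1) holds, whereas a 4-byte requirement would reject the same row and force a repack.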
Closes #126