Possible race condition in Vulkan swapchain recreation
There is a long-standing bug in the mpv player where the player hangs when the window is resized, emitting `VK_ERROR_DEVICE_LOST`.
Recently I decided to look into the problem and noticed some Vulkan validation failures in the GPU debug log.
Attached is the log produced with `mpv --no-config --log-file=gpu-debug.txt --gpu-debug --vo=gpu --gpu-api=vulkan ~/Japan\ in\ 8K-m1jY2VLCRmY.mkv`:
gpu-debug.txt
The backtrace suggests that `pl_swapchain_resize` is at fault. Looking into it, I found that adding a `vkDeviceWaitIdle` call during swapchain recreation makes the problem go away. (The workaround appears to introduce a deadlock when the mpv option `vulkan-queue-count` is set to anything larger than 1, but that is to be expected with `vkDeviceWaitIdle` anyway.)
Is this a potential bug in libplacebo's Vulkan code or in NVIDIA's Vulkan driver?
My operating system: Gentoo Linux
My mpv player: upstream `master` branch, 0.34.0-301-gdefb02daa4
My libplacebo version: `master` branch, commit b4867541
My GPU driver: NVIDIA proprietary 515.43.04
The video I used to reproduce the problem can be downloaded from https://www.youtube.com/watch?v=m1jY2VLCRmY with the `yt-dlp` tool. On my machine, the problem reproduces quite consistently when resizing the mpv window with the mouse.
The `vkDeviceWaitIdle` workaround mentioned above:
diff --git a/src/vulkan/common.h b/src/vulkan/common.h
index 339da6a..be99a45 100644
--- a/src/vulkan/common.h
+++ b/src/vulkan/common.h
@@ -226,4 +226,5 @@ struct vk_ctx {
PL_VK_FUN(GetMemoryWin32HandleKHR);
PL_VK_FUN(GetSemaphoreWin32HandleKHR);
#endif
+ PL_VK_FUN(DeviceWaitIdle);
};
diff --git a/src/vulkan/context.c b/src/vulkan/context.c
index 5be9bd7..72d3057 100644
--- a/src/vulkan/context.c
+++ b/src/vulkan/context.c
@@ -336,6 +336,7 @@ static const struct vk_fun vk_dev_funs[] = {
PL_VK_DEV_FUN(SetDebugUtilsObjectNameEXT),
PL_VK_DEV_FUN(UpdateDescriptorSets),
PL_VK_DEV_FUN(WaitForFences),
+ PL_VK_DEV_FUN(DeviceWaitIdle),
};
static void load_vk_fun(struct vk_ctx *vk, const struct vk_fun *fun)
diff --git a/src/vulkan/swapchain.c b/src/vulkan/swapchain.c
index 3b66ee6..8ff443f 100644
--- a/src/vulkan/swapchain.c
+++ b/src/vulkan/swapchain.c
@@ -570,6 +570,8 @@ static bool vk_sw_recreate(pl_swapchain sw, int w, int h)
while (p->old_swapchain)
vk_poll_commands(vk, UINT64_MAX);
+ vk->DeviceWaitIdle(vk->dev);
+
VkSwapchainCreateInfoKHR sinfo = p->protoInfo;
sinfo.oldSwapchain = p->swapchain;