
[viogpu3d] Virtio GPU 3D acceleration for windows #943

Open
wants to merge 1 commit into base: master

Conversation


@max8rr8 max8rr8 commented Jul 18, 2023

Hello! This series of changes, spanning multiple repositories, introduces support for 3D-accelerated VirtIO GPU Windows guests.

Demo image

The wglgears window is rendered with WGL on virgl, and the window below it is a cube rendered with the d3d10umd driver on virgl.

How to test

NOTE: This driver still has some rendering glitches and might crash. Try it at your own risk.
0. Create a QEMU Windows VM with a VirtIO GPU that has 3D acceleration enabled (see the example command line after this list). It is highly recommended to use a "disposable" virtual machine for testing, as data loss might occur.

  1. Use the patched version of virglrenderer from this repo, branch viogpu_win
  2. Compile from source OR download the pre-built drivers.
  3. Install the drivers in the target VM. Note: if the drivers are not signed, you need to select them manually in Device Manager.
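For reference, a minimal sketch of a matching QEMU invocation, pieced together from the command lines reported later in this thread (the accelerator, memory size, and disk image are placeholder assumptions; the relevant parts are the virtio-vga-gl device and a GL-enabled display):

qemu-system-x86_64 \
  -accel kvm -m 4G \
  -drive file=win10.qcow2,format=qcow2 \
  -device virtio-vga-gl \
  -display gtk,gl=on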

Known issues

  • FIXED: Frames displayed on screen lag behind.

  • FIXED: D3D10 color clearing is not supported.

  • FIXED: D3D10 applications using DXGI_SWAP_EFFECT_DISCARD and DXGI_SWAP_EFFECT_SEQUENTIAL are not displayed.

  • Rendering glitches in WinUI 3 apps.

    There are some rendering glitches in apps based on WinUI 3 (possibly other apps too); the easiest way to see them is to install the WinUI 3 Gallery from the Microsoft Store and navigate around it. I haven't investigated these yet.

  • VS Code (and possibly other Electron apps) does not render.

    Only a black window is shown. Fixing this requires implementing PIPE_QUERY_TIMESTAMP_DISJOINT in virglrenderer.

  • No preemption.

    The kernel-mode driver does not implement preemption, and I am quite confused about how to implement it in WDDM. VioGpu3D disables preemption system-wide to work around the missing implementation, but this is not ideal. I would appreciate some help.

Siblings

@vrozenfe
Collaborator

@max8rr8
Hi Max,

Thanks for the very impressive work you've done.
Please give us some time to go through your code and
see how to integrate it into our upstream repository.

Nice work!
All the best,
Vadim.

5. Build and install mesa binaries: `ninja install`
6. Go to `viogpu` directory of this repository
7. Run: `.\build_AllNoSdv.bat`
8. Compiled drivers will be available in `viogpu\viogpu3d\objfre_win10_amd64\amd64\viogpu3d`
Member

Hi @max8rr8,

These instructions are not clear to me. Can you please describe them in more detail, with directory examples? What dependencies should be installed, and how should they be configured? (Example: https://github.com/virtio-win/kvm-guest-drivers-windows/wiki/Building-the-drivers-using-Windows-11-21H2-EWDK).
I looked into the mesa compilation guide and it looks completely different.

meson setup builddir/
meson compile -C builddir/
sudo meson install -C builddir/

Several questions:

  1. What is %MESA_PREFIX% and where is it defined?
  2. Can we use any precompiled Mesa for Windows (for example, https://fdossena.com/?p=mesa/index.frag)?

Author

What is %MESA_PREFIX% and where is it defined?

The %MESA_PREFIX% environment variable is set during the build process; it points to the directory where Mesa installs its files (a small example is sketched below).

Can we use any precompiled Mesa for Windows

I don't think so. As pointed out in the Mesa MR, when building the user-mode driver we have to use specific Mesa flags so that only the virgl driver is built (to avoid conflicts).
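For illustration, a minimal PowerShell sketch of how the variable ends up defined before the driver build (the mesa_prefix directory name is an example taken from the build instructions below):

# assumed example: point MESA_PREFIX at the directory Mesa was installed into
$env:MESA_PREFIX = "$PWD\mesa_prefix"
# the viogpu build checks this variable when deciding whether to generate the viogpu3d INF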

Author

@max8rr8 max8rr8 Jul 20, 2023

Building instructions

NOTE 1: these instructions apply for now, until the changes land in the upstream repositories.
NOTE 2: this assumes that all build dependencies (meson, WDK, ninja, etc.) are installed.

Part 1: Virglrenderer

On the host machine, a patched version of virglrenderer is required; the individual steps below are consolidated in the sketch after the list.

  1. Acquire the source code: git clone --branch viogpu_win https://gitlab.freedesktop.org/max8rr8/virglrenderer && cd virglrenderer
  2. Create an install directory (mkdir install) and a build directory: mkdir build && cd build
  3. Configure the build: meson --prefix=$(pwd)/../install (the prefix is set so that libvirglrenderer is installed into the previously created install directory rather than globally)
  4. Compile and install: ninja install
    Now ensure that QEMU loads libvirglrenderer from the install directory; this can be done by setting LD_LIBRARY_PATH to something like /some/path/to/starter_dir/virglrenderer/install
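Put together, the host-side commands look roughly like this (a sketch assuming a bash shell and a recent meson; the exact lib subdirectory under install may vary by distribution):

# Host (Linux): build the patched virglrenderer into a local prefix
git clone --branch viogpu_win https://gitlab.freedesktop.org/max8rr8/virglrenderer
cd virglrenderer
mkdir install build && cd build
meson setup .. --prefix="$(pwd)/../install"
ninja install
# make QEMU pick up this build instead of the system libvirglrenderer
export LD_LIBRARY_PATH="$(pwd)/../install/lib:$LD_LIBRARY_PATH"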

Part 2: Build mesa

Now, inside a virtual machine with the build tools installed, create a working directory, then run the following inside it (this assumes the use of PowerShell; the steps are consolidated in the sketch after the list):

  1. Create the Mesa prefix directory (mkdir mesa_prefix) and set the MESA_PREFIX environment variable to its path: $env:MESA_PREFIX="$PWD\mesa_prefix"
  2. Get the patched Mesa source code: git clone --depth 10 --branch viogpu_win https://gitlab.freedesktop.org/max8rr8/mesa, then cd into it: cd mesa
  3. Create a build directory: mkdir build && cd build
  4. Configure the build: meson .. --prefix=$env:MESA_PREFIX -Dgallium-drivers=virgl -Dgallium-d3d10umd=true -Dgallium-wgl-dll-name=viogpu_wgl -Dgallium-d3d10-dll-name=viogpu_d3d10 -Db_vscrt=mt. Build options explained:
  • --prefix=$env:MESA_PREFIX: set the installation path to the directory created in step 1
  • -Dgallium-drivers=virgl: build only the virgl driver
  • -Dgallium-d3d10umd=true: build the Direct3D 10 user-mode driver (the OpenGL one is built by default)
  • -Dgallium-d3d10-dll-name=viogpu_d3d10: name the generated D3D10 DLL viogpu_d3d10.dll
  • -Dgallium-wgl-dll-name=viogpu_wgl: name the generated WGL DLL viogpu_wgl.dll
  • -Db_vscrt=mt: use the static C runtime (see this comment)
  5. Build and install (into the Mesa prefix): ninja install
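For convenience, the same guest-side steps as one PowerShell sketch (the working directory and mesa_prefix path are examples):

# Guest (Windows, PowerShell): build the patched Mesa into MESA_PREFIX
mkdir mesa_prefix
$env:MESA_PREFIX = "$PWD\mesa_prefix"
git clone --depth 10 --branch viogpu_win https://gitlab.freedesktop.org/max8rr8/mesa
cd mesa
mkdir build
cd build
meson .. --prefix=$env:MESA_PREFIX -Dgallium-drivers=virgl -Dgallium-d3d10umd=true `
  -Dgallium-wgl-dll-name=viogpu_wgl -Dgallium-d3d10-dll-name=viogpu_d3d10 -Db_vscrt=mt
ninja install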

Part 3: Build driver

Now that Mesa is built and installed into %MESA_PREFIX%, viogpu3d can be built (if %MESA_PREFIX% is not set, viogpu3d INF generation is skipped). A consolidated sketch follows the list.

  1. Acquire the source code: git clone --branch viogpu_win https://github.com/max8rr8/kvm-guest-drivers-windows and cd into it: cd kvm-guest-drivers-windows
  2. Go to the viogpu directory: cd viogpu
  3. (Optional, but very useful) set up test code signing from Visual Studio
  4. Run the build: .\build_AllNoSdv.bat
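A compact PowerShell sketch of these steps (assuming MESA_PREFIX is still set from Part 2; test signing is left out):

git clone --branch viogpu_win https://github.com/max8rr8/kvm-guest-drivers-windows
cd kvm-guest-drivers-windows\viogpu
.\build_AllNoSdv.bat
# output lands in viogpu3d\objfre_win10_amd64\amd64\viogpu3d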

Part 4: Installation

Now copy kvm-guest-drivers-windows\viogpu\viogpu3d\objfre_win10_amd64\amd64\viogpu3d to the target VM and install it.

EDIT: Added gallium-windows-dll-name to the Mesa parameters.
EDIT 2: More changes related to DLL naming in the Mesa parameters.

@max8rr8
Author

max8rr8 commented Jul 21, 2023

Changes:

  • Fixed a crash caused by using PAGED_CODE at DISPATCH_LEVEL in the flip timer. Instead of an ExTimer, a separate flipping thread is now used.
  • The build system now checks specifically for the viogpu-related DLLs when deciding whether to compile viogpu3d.
  • Renamed the DLLs and updated the build instructions due to changes requested by Mesa.

@max8rr8
Author

max8rr8 commented Jul 24, 2023

Changes:

  • Re-implemented RotateResourceIdentities. This fixed the biggest issue of frames lagging behind; using the driver is now a pretty smooth experience.
  • Implemented clear_render_target, similarly to i915.
  • Implemented support for staging resources and a kernel-mode Present BLT, which fixes window display in applications using DXGI_SWAP_EFFECT_DISCARD and DXGI_SWAP_EFFECT_SEQUENTIAL.

The pre-built driver provided in the description has been updated with these changes.

@mincore

mincore commented Aug 1, 2023

I've tested the driver on Win10 and got a black screen. But if I change the guest OS to Ubuntu (same QEMU command line), glxgears performs well. (glxinfo shows that the renderer is the NVIDIA GPU.)

Environment

Guest OS:
Windows 10 Enterprise LTSC + viogpu3d

Host OS:
Fedora 38, QEMU 8.0.3 with virglrenderer

qemu command line

-display egl-headless,rendernode=/dev/dri/card1 -device virtio-vga-gl -trace enable="virtio_gpu*" -D qemu.log

virglrenderer.log

gl_version 46 - core profile enabled
GLSL feature level 460
vrend_check_no_error: context error reported 3 "" Unknown 1282
context 3 failed to dispatch DRAW_VBO: 22
vrend_decode_ctx_submit_cmd: context error reported 3 "" Illegal command buffer 786440
GLSL feature level 460
vrend_check_no_error: context error reported 5 "" Unknown 1282
context 5 failed to dispatch DRAW_VBO: 22
vrend_decode_ctx_submit_cmd: context error reported 5 "" Illegal command buffer 786440
GLSL feature level 460
GLSL feature level 460
GLSL feature level 460
GLSL feature level 460
GLSL feature level 460
vrend_check_no_error: context error reported 9 "" Unknown 1282
context 9 failed to dispatch DRAW_VBO: 22

qemu_strace.log

virtio_gpu_features virgl 0
virtio_gpu_features virgl 0
virtio_gpu_cmd_get_edid scanout 0
virtio_gpu_cmd_get_display_info
virtio_gpu_cmd_ctx_create ctx 0x1, name
virtio_gpu_cmd_ctx_create ctx 0x2, name
virtio_gpu_cmd_res_create_3d res 0x1, fmt 0x1, w 1280, h 1024, d 1
virtio_gpu_cmd_ctx_res_attach ctx 0x2, res 0x1
virtio_gpu_cmd_res_back_attach res 0x1
virtio_gpu_cmd_res_create_3d res 0x2, fmt 0x1, w 1280, h 1024, d 1
virtio_gpu_cmd_ctx_res_attach ctx 0x2, res 0x2
virtio_gpu_cmd_res_create_2d res 0x3, fmt 0x2, w 1280, h 1024
virtio_gpu_cmd_res_back_attach res 0x3
virtio_gpu_cmd_set_scanout id 0, res 0x1, w 1280, h 1024, x 0, y 0
virtio_gpu_cmd_res_flush res 0x1, w 1280, h 1024, x 0, y 0
virtio_gpu_cmd_set_scanout id 0, res 0x1, w 1280, h 1024, x 0, y 0
virtio_gpu_cmd_res_flush res 0x1, w 1280, h 1024, x 0, y 0
virtio_gpu_cmd_ctx_create ctx 0x3, name
virtio_gpu_cmd_ctx_destroy ctx 0x3
virtio_gpu_cmd_ctx_create ctx 0x3, name
virtio_gpu_cmd_res_create_3d res 0x4, fmt 0xb1, w 48, h 1, d 1
virtio_gpu_cmd_ctx_res_attach ctx 0x3, res 0x4
virtio_gpu_cmd_res_create_3d res 0x5, fmt 0xb1, w 4000, h 1, d 1
virtio_gpu_cmd_ctx_res_attach ctx 0x3, res 0x5
virtio_gpu_cmd_res_create_3d res 0x6, fmt 0xb1, w 16, h 1, d 1
virtio_gpu_cmd_ctx_res_attach ctx 0x3, res 0x6
virtio_gpu_cmd_res_create_3d res 0x7, fmt 0xb1, w 48, h 1, d 1
virtio_gpu_cmd_ctx_res_attach ctx 0x3, res 0x7
virtio_gpu_cmd_res_create_3d res 0x8, fmt 0xb1, w 240012, h 1, d 1
virtio_gpu_cmd_ctx_res_attach ctx 0x3, res 0x8
virtio_gpu_cmd_res_create_3d res 0x9, fmt 0xb1, w 102400, h 1, d 1
virtio_gpu_cmd_ctx_res_attach ctx 0x3, res 0x9
virtio_gpu_cmd_res_create_3d res 0xa, fmt 0xb1, w 144, h 1, d 1
virtio_gpu_cmd_ctx_res_attach ctx 0x3, res 0xa
virtio_gpu_cmd_res_create_3d res 0xb, fmt 0xb1, w 160000, h 1, d 1
virtio_gpu_cmd_ctx_res_attach ctx 0x3, res 0xb
virtio_gpu_cmd_res_create_3d res 0xc, fmt 0xb1, w 16000, h 1, d 1
virtio_gpu_cmd_ctx_res_attach ctx 0x3, res 0xc
virtio_gpu_cmd_res_create_3d res 0xd, fmt 0xb1, w 240000, h 1, d 1
virtio_gpu_cmd_ctx_res_attach ctx 0x3, res 0xd
virtio_gpu_cmd_res_create_3d res 0xe, fmt 0xb1, w 192, h 1, d 1
virtio_gpu_cmd_ctx_res_attach ctx 0x3, res 0xe
virtio_gpu_cmd_res_create_3d res 0xf, fmt 0xb1, w 16, h 1, d 1
virtio_gpu_cmd_ctx_res_attach ctx 0x3, res 0xf
virtio_gpu_cmd_res_create_3d res 0x10, fmt 0xb1, w 272, h 1, d 1
virtio_gpu_cmd_ctx_res_attach ctx 0x3, res 0x10
virtio_gpu_cmd_res_create_3d res 0x11, fmt 0xb1, w 240, h 1, d 1
virtio_gpu_cmd_ctx_res_attach ctx 0x3, res 0x11
virtio_gpu_cmd_res_create_3d res 0x12, fmt 0xb1, w 272, h 1, d 1
virtio_gpu_cmd_ctx_res_attach ctx 0x3, res 0x12
virtio_gpu_cmd_res_create_3d res 0x13, fmt 0x1, w 50, h 50, d 1
virtio_gpu_cmd_ctx_res_attach ctx 0x3, res 0x13
virtio_gpu_cmd_ctx_create ctx 0x4, name
virtio_gpu_cmd_res_back_attach res 0x13
virtio_gpu_cmd_res_back_attach res 0x4
virtio_gpu_cmd_res_back_attach res 0x5
virtio_gpu_cmd_res_back_attach res 0x6
virtio_gpu_cmd_res_back_attach res 0xa
virtio_gpu_cmd_res_back_attach res 0xe
virtio_gpu_cmd_res_back_attach res 0xf
virtio_gpu_cmd_res_back_attach res 0x10
virtio_gpu_cmd_res_back_attach res 0x11
virtio_gpu_cmd_res_back_attach res 0x12
virtio_gpu_cmd_ctx_submit ctx 0x3, size 74220

@max8rr8
Author

max8rr8 commented Aug 2, 2023

I've tested the driver on Win10 and got a black screen. But if I change the guest OS to Ubuntu (same QEMU command line), glxgears performs well. (glxinfo shows that the renderer is the NVIDIA GPU.)

It seems to be a bug in virglrenderer on NVIDIA related to GL_PRIMITIVE_RESTART_NV. You can try applying this diff to fix it, but it's a bit of a hacky solution.

@CE1CECL

CE1CECL commented Aug 4, 2023

What versions of Windows does it go down to?
In my case, I am experimenting with https://github.com/CE1CECL/qemu-vmvga; it works with Vista (no 3D yet), and I'm trying to get 3D to work right now.

@foxlet

foxlet commented Aug 6, 2023

I tried the pre-built driver on Windows 10 22H2 and it only showed a black screen before hard-locking and resetting. Also tested the same config with Ubuntu which had working virgl.

Using -display gtk,gl=on -device virtio-vga-gl on an AMD R9 6900HS + RX 6700S host.
virglrenderer was compiled from git (considering that it's already been merged in).
qemu is version 7.2.4

qemu.log

@Conan-Kudo

Can we validate that this works with Windows 7 too? It's a fairly common virtualization guest for playing older games, and it's still WDDM class.

@max8rr8
Author

max8rr8 commented Aug 13, 2023

What versions of Windows does it go down to?

The current driver theoretically supports Windows 8.1, but it is built for and tested only on Windows 10 22H2. It is important to note that support for blob resources, which are required to improve performance and to support Vulkan, will require WDDM 2, which lifts the minimum Windows version to 10.

Can we validate that this works with Windows 7 too?

I doubt it will work with the current code. It might be possible to adapt it for Windows 7, but I am not interested in doing that (though I wouldn't mind if someone else adapted the code). Plus, at some point the driver will have to use WDDM 2, which will mean either splitting the codebase or, more likely, dropping support for older OS versions.

I tried the pre-built driver on Windows 10 22H2 and it only showed a black screen before hard-locking and resetting. Also tested the same config with Ubuntu which had working virgl.

Nothing seems wrong in the attached qemu.log. I don't see any lines like GLSL feature level 460 that should be printed to stdout; can you also attach the stdout of QEMU?
It could also be an issue in the kernel driver; in that case the only way to know what's wrong is to attach a kernel debugger to the VM.

@Torinde

Torinde commented Aug 14, 2023

That's great @max8rr8!
It seems your focus is on modern software (Win 8.1/10) and you plan to go upwards in support (D3D12/Vulkan/Venus). In that regard, do you plan on adding:

  • video encoding/decoding acceleration
  • lvp/llvmpipe/softpipe: a fallback for when no compatible GPU is present on the host, or to run graphics on a high-core-count CPU instead of its weak integrated GPU? Or, if possible, a combination of the two (augmenting the performance of a weak GPU with llvmpipe running on the CPU cores)?

For the retro direction, Win7 was mentioned above, and in addition I think such a virgl/venus GPU would also be very useful for:

@RedGreenBlue09

RedGreenBlue09 commented Aug 19, 2023

I compiled and tried it, but I got the BSOD DRIVER_CORRUPTED_EXPOOL.
Host: Windows 11 Pro (10.0.22621.2070), NVIDIA driver 536.67
Guest: Windows 10 Enterprise (10.0.16299.15)

First, I compiled QEMU in MSYS2 like this:

cp /c/Program\ Files\ \(x86\)/Windows\ Kits/10/Include/10.0.22621.0/um/WinHv*\
 /ucrt64/x86_64-w64-mingw32/include/

./configure \
--target-list=x86_64-softmmu,i386-softmmu,arm-softmmu \
--cpu=x86_64 \
--enable-lto \
--enable-malloc=jemalloc \
--enable-avx2 \
--enable-dsound \
--enable-hax \
--enable-iconv \
--enable-lzo \
--enable-opengl \
--enable-png \
--enable-sdl \
--enable-sdl-image \
--enable-spice \
--enable-spice-protocol \
--enable-tcg \
--enable-whpx \
--enable-virglrenderer \
--disable-docs 

I compiled virglrenderer (upstream) as described in your comment above, then replaced the DLL in the QEMU directory with the newly built one.

Then I compiled mesa with VS (MinGW GCC prints a bunch of errors related to the WDK headers, so I gave up):

meson setup build/ -Dgallium-drivers=virgl -Dgallium-d3d10umd=true -Dgallium-wgl-dll-name=viogpu_wgl -Dgallium-d3d10-dll-name=viogpu_d3d10 --backend=vs

It outputs to my root folder (C:\), so I copied bin to a directory named mesa_prefix.
I opened cmd, set MESA_PREFIX to the mesa_prefix directory, then compiled viogpu.sln with msbuild (Configuration="Win10 Release", Platform=x64).

I signed the output in viogpu\viogpu3d\objfre_win10_amd64\amd64\viogpu3d:

for /R . %a in (*.exe, *.sys, *.dll) do signtool sign /f "Surface.pfx" /t "http://timestamp.sectigo.com" /fd certHash "%a"

Then I booted up the VM, enabled test signing, and installed the driver. The screen instantly went black and Windows crashed.

WinDbg and QEMU logs and the VM command line are attached. I tried using serial kernel debugging, but it just hangs forever at boot, so I analyzed the dump file instead.

windbg.txt
qemu.log
cmdline.txt

@RedGreenBlue09

Additionally, I tried booting a Linux Mint 20.2 live CD, but QEMU itself crashed on some heap-corruption issue. I replaced the original MSYS2 virglrenderer DLL, which fixed it. The Windows 10 guest BSOD remains.

@max8rr8
Author

max8rr8 commented Aug 20, 2023

Hi @RedGreenBlue09, it seems that the kernel crash happened during driver unloading (VioGpu3DRemoveDevice is in the backtrace). While that is a problem, it is not the root cause of the driver not working on your system; for some unknown reason Windows triggers the driver unload.

You should try to attach WinDbg to the running VM (try network debugging instead of serial; it works for me). Additionally, when connecting to the VM from WinDbg, enable break-on-connection and run the following command:

bp  watchdog!WdLogEvent5_WdError "k; g"

This WinDbg command adds backtrace logging for all errors reported in dxgkrnl. After you run it, continue kernel execution with the WinDbg command g (or the corresponding button in the GUI). Then capture the full log and post it here, so we can analyze what actually went wrong (why the unload happens).
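As a minimal sketch, the whole debugger-side sequence described above looks like this (kd commands; the .reload is an assumption for the case, seen later in this thread, where the breakpoint symbol cannot be resolved):

$$ re-resolve symbols if the breakpoint is reported as unresolved
.reload
$$ log a backtrace and continue on every dxgkrnl watchdog error
bp watchdog!WdLogEvent5_WdError "k; g"
$$ resume the target
g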

@RedGreenBlue09

Hi @RedGreenBlue09, it seems that the kernel crash happened during driver unloading (VioGpu3DRemoveDevice is in the backtrace). While that is a problem, it is not the root cause of the driver not working on your system; for some unknown reason Windows triggers the driver unload.

You should try to attach WinDbg to the running VM (try network debugging instead of serial; it works for me). Additionally, when connecting to the VM from WinDbg, enable break-on-connection and run the following command:

bp  watchdog!WdLogEvent5_WdError "k; g"

This WinDbg command adds backtrace logging for all errors reported in dxgkrnl. After you run it, continue kernel execution with the WinDbg command g (or the corresponding button in the GUI). Then capture the full log and post it here, so we can analyze what actually went wrong (why the unload happens).

I will test again later. Also, if I use a VGA + virtio-gpu-gl setup, the driver doesn't load and shows error 49 in Device Manager. The error message is something like "Windows unloaded the driver because it has reported problems".

@RedGreenBlue09

... Network debugging also hangs, like serial. Interestingly, that doesn't happen with ReactOS.

@RedGreenBlue09

RedGreenBlue09 commented Aug 20, 2023

I have had no luck with live kernel debugging. I tried using -accel tcg and official QEMU builds; none of these helps. If I enable boot debugging, the Windows Boot Manager itself freezes. Absolutely no idea.

Edit: Even after ditching OVMF for SeaBIOS, it still freezes.

@RedGreenBlue09

@max8rr8 Okay, I'm sorry for the rant. It doesn't hang forever, just 30 minutes :((. I tried your command and got an error:

0: kd> bp  watchdog!WdLogEvent5_WdError "k; g"
Bp expression 'watchdog!WdLogEvent5_WdError ' could not be resolved, adding deferred bp

@RedGreenBlue09

RedGreenBlue09 commented Aug 21, 2023

After reloading symbols, the error is gone. Here is the new log.

windbg2.txt

@andrelung

Thank you very much, @max8rr8! I (and possibly many others in this thread) would love to help test and validate your updates. Can you give instructions on how to use your implementation?
Your instructions in the initial post seem to be outdated:

  1. Use the patched version of virglrenderer from this repo, branch viogpu_win
  2. Compile from source OR download the pre-built drivers.

If anyone is able to build the drivers, this might be very helpful for anyone else following this issue/thread.

Again, thank you for all the work, and please let us non-driver-devs know how to help. Documentation, formatting, testing, etc. is surely something we'd be willing to take off your shoulders.

@max8rr8
Author

max8rr8 commented Jan 26, 2025

Here is my take on DXVK vs. d3d10umd:

First of all, DXVK is absolutely, undeniably a better implementation of the DirectX 10 (and 11) APIs than the current state of d3d10umd. BUT it is an implementation of the DirectX API, not of the UMD interface required by d3d11.dll and dxgi.dll. Basically, when an app wants to draw something 3D-accelerated with DirectX, it calls functions such as CreateRenderTargetView. In the case of Wine + DXVK, these functions are implemented in a DLL provided by DXVK, so the flow looks like this: VulkanDriver <==(Vulkan)==> DXVK <==(DirectX 10)==> App. On Windows, the DirectX API is implemented by d3d11.dll (called the Direct3D runtime in the WDDM docs); d3d11.dll then does some additional work and calls the device-specific implementation (unofficially, the D3DDDI API) provided by the user-mode driver (in this project, viogpu_d3d10.dll), which in turn HAS to call back into the DirectX runtime through functions usually named pfnD3DDICB*, which then contact the kernel driver. So the flow in the WDDM case looks like this: KMD <==(WDDM API)==> d3d11.dll <==(D3DDICB)==> UMD <==(D3DDI)==> d3d11.dll <==(DirectX 10)==> App. IMHO, the WDDM design is really not great, but it is what it is. And because DXVK implements DirectX 10 itself (not D3DDDI), it is impossible to use it in a pure WDDM model.

Now, not all hope for DXVK is lost. It definitely is possible (and much easier) to implement a WDF driver that provides an API to userspace, which in turn can provide Vulkan (Venus), OpenGL (virgl/Zink), and DirectX (through DXVK). This would work quite well for GPU-heavy applications. In the case of DXVK, it would require replacing/injecting the system's d3d11 implementation with DXVK's. But replacement won't work everywhere; mainly, it won't work in DWM.exe (the Windows compositor, think kwin/mutter but with flavor). DWM.exe is a really important part of the Windows display stack, and it is very entangled (unnecessarily so, IMHO) with dxgi.dll, which in turn is entangled with d3d11.dll. By entangled I mean that they use each other's private, undocumented APIs, which are quite hard to understand and even harder to emulate in DXVK. Due to these limitations around DWM and the design of the Windows display stack, it would be impossible (or close to it) to accelerate compositing and display, and performance in desktop usage (web browsing, MS Office, Photoshop) would be rather low because CPU copies would be required to display a frame.

Luckily, d3d10umd is actually quite good at supporting DWM.exe. So a possible future solution that provides both composition/display acceleration and DXVK's performance is to use d3d10umd for DWM (and possibly other places that rely on WDDM) and inject DXVK in place of d3d11.dll in other apps (this can be done when the UMD is loaded).

@max8rr8
Author

max8rr8 commented Jan 26, 2025

Rebased against master. Thanks @kostyanf14 for merging the preparations commit. If there is any way I can help, either in upstreaming this or in aiding the design/development of a new solution, please feel free to contact me on Telegram or via email.

@FlysoftBeta

I wonder if it is possible to develop something like UMD2VK? Although it seems quite complicated.

@YanVugenfirer
Collaborator

Rebased against master. Thanks @kostyanf14 for merging the preparations commit. If there is any way I can help, either in upstreaming this or in aiding the design/development of a new solution, please feel free to contact me on Telegram or via email.

Thanks!

@max8rr8
Author

max8rr8 commented Jan 27, 2025

I wonder if it is possible to develop something like UMD2VK? Although it seems quite complicated.

It is likely not possible without hacks. The real d3d11.dll relies on the fact that the UMD will call its functions for access to the kernel driver. For example, this is used for resource sharing: when an app requests resource creation, d3d11.dll calls PFND3D10DDI_CREATERESOURCE, passing hRTResource as one of the parameters. Then, when the UMD wants to allocate memory, it MUST pass this handle to pfnAllocateCb so that d3d11.dll can associate the kernel resource with the user one. This would be complicated to do if there were also a Vulkan driver in the middle.

Even if that can be dealt with, it would likely be very complicated to do from scratch. It might be possible to base it on top of DXVK (but I am not sure how keen upstream DXVK would be to accept that) or to use d3d10umd + Zink (theoretically possible, because Gallium). But, considering the solution outlined in my take on DXVK/d3d10umd above, it will likely be much easier to fix d3d10umd, or to figure out injecting full DXVK, than to create UMD2VK.

@omove

omove commented Jan 27, 2025

VirtualBox v7.0 and later apparently use DXVK for 3D acceleration on non-Windows hosts; looking into how VirtualBox handles this in its drivers may help with making a driver that uses DXVK.

Using VKD3D to get DX12 might be a better option than using DXVK for DX8-11, though, since Microsoft already has D3D9on12 and D3D11on12 to support DX1-11 on 12. I know Intel also uses 9on12 for their newer GPUs, and IIRC Qualcomm basically only has a DX12 driver and uses all the Microsoft DX1-11/Vulkan/OpenGL/OpenCL-to-DX12 compatibility layers.

@DemiMarie

@omove I think Qualcomm has a native Vulkan driver now but am not sure.
@max8rr8 I think a UMD2VK on top of DXVK is something that would make sense in DXVK. I don't think that Direct3D 10 is enough in the long term, and I would not be surprised if DWM eventually requires later versions of DirectX and therefore falls back to software rendering anyway. That said, something like this PR (but made to use native contexts instead of virGL) would make a lot of sense for Qubes OS. Qubes OS can work around some (but not all) of virGL's security problems by using the Xen stubdomain feature, but that would both be extra work on the Qubes OS side and still be slower and less secure than supporting native contexts directly.

@FlysoftBeta

@max8rr8 AFAIK, there is a kind of UMD called a SW rasterizer which can potentially avoid a KMD.
Another option could be just creating a stub KMD, which I think wouldn't be too complicated.

@DemiMarie

@FlysoftBeta Can one implement a “SW rasterizer” that is backed by real HW?

I will add that there are cases where bypassing DWM.exe is desirable. Seamless virtualization environments such as Qubes OS want to extract the buffers from individual windows, bypassing DWM if possible. The buffers will then be submitted to a host Wayland compositor. (Qubes OS will support Wayland by then.)

@FlysoftBeta

Can one implement a “SW rasterizer” that is backed by real HW?

Sure, but I doubt whether DWM-related stuff really supports a SW rasterizer.
Another, more compatible, alternative solution could be implementing a stub KMD.

I will add that there are cases where bypassing DWM.exe is desirable. Seamless virtualization environments such as Qubes OS want to extract the buffers from individual windows, bypassing DWM if possible. The buffers will then be submitted to a host Wayland compositor. (Qubes OS will support Wayland by then.)

But still, there are a lot of apps that rely on DWM APIs, like WinUI 3 apps, and simply bypassing DWM seems too intrusive and not desirable for those who need a full desktop experience.

@DemiMarie

Can one implement a “SW rasterizer” that is backed by real HW?

Sure, but I doubt whether DWM-related stuff really supports a SW rasterizer. Another, more compatible, alternative solution could be implementing a stub KMD.

Stub KMD?

I will add that there are cases where bypassing DWM.exe is desirable. Seamless virtualization environments such as Qubes OS want to extract the buffers from individual windows, bypassing DWM if possible. The buffers will then be submitted to a host Wayland compositor. (Qubes OS will support Wayland by then.)

But still, there are a lot of apps that rely on DWM APIs, like WinUI 3 apps, and simply bypassing DWM seems too intrusive and not desirable for those who need a full desktop experience.

Is it possible to support DWM APIs and still extract individual window buffers?

@FlysoftBeta

Is it possible to support DWM APIs and still extract individual window buffers?

Sounds good, but I don't think it is an easy job.

@DemiMarie

Is it possible to support DWM APIs and still extract individual window buffers?

Sounds good, but I don't think it is an easy job.

Can it be done using only documented Windows APIs, or would reverse-engineering be required?

@max8rr8
Author

max8rr8 commented Jan 28, 2025

VirtualBox v7.0 and later apparently use DXVK for 3D acceleration on non-Windows hosts; looking into how VirtualBox handles this in its drivers may help with making a driver that uses DXVK.

I am not 100% sure, but as far as I understand, their stack looks like this: VulkanDriver <==> DXVK <==> Hypervisor <==||VMBOUNDARY||==> VirtualBox KMD <==> WDDM/d3d11.dll <==> VirtualBox UMD <==> d3d11.dll <==> App. They pass DirectX commands over the VM boundary to the host, where they are executed on real D3D11 on Windows and through DXVK on Linux.

Using VKD3D to get DX12 might be a better option than using DXVK for DX8-11, though, since Microsoft already has D3D9on12 and D3D11on12 to support DX1-11 on 12. I know Intel also uses 9on12 for their newer GPUs, and IIRC Qualcomm basically only has a DX12 driver and uses all the Microsoft DX1-11/Vulkan/OpenGL/OpenCL-to-DX12 compatibility layers.

I think you are correct. D3D11on12 might actually be a solution that both doesn't require injection and provides quite good performance on top of Vulkan. I think it might actually work with DWM without many problems, because D3D11on12 is a UMD, so all of d3d11.dll's private interfaces are kept in place. And I believe there are accommodations in both dxgi.dll and d3d11.dll that help it work even though it sits on top of D3D12 rather than a real KMD (e.g. mentions of DXGIOn12 in the codebase, ignoring of hRTResource).

@max8rr8
Author

max8rr8 commented Jan 28, 2025

@max8rr8 AFAIK, there is a kind of UMD called a SW rasterizer which can potentially avoid a KMD.

If by SW rasterizer you mean d3d10warp, then it still uses a special KMD.

Another option could be just creating a stub KMD, which I think wouldn't be too complicated.

Well, yes, it is possible, but why? I don't think it is a great idea to have a real driver and then another one providing stubs. Yes, it might be easier to work around some WDDM limitations, but you can also do that by using D3DKMTEscape for anything that doesn't fit the WDDM model. Plus, some 3D support will be required in the KMD anyway to provide an accelerated path from DWM to the display.

Is it possible to support DWM APIs and still extract individual window buffers?

Sounds good, but I don't think it is an easy job.

Can it be done using only documented Windows APIs, or would reverse-engineering be required?

I think there are two ways to achieve seamless integration with a Windows guest:

  1. A custom daemon that captures windows with Windows.Graphics.Capture (which returns DX11 on-GPU surfaces), passes them to a virtio-wl driver that converts them to raw virtio-gpu handles, and passes them out through virtio-wl to be handled on the host.

  2. RDP: in WSL, Microsoft uses RDP with an additional trick to pass application surfaces over a Hyper-V socket, which allows them to avoid any CPU copies. I believe Windows itself can also pass these surfaces over to the host, where FreeRDP could handle and display them (although seamless-mode support in FreeRDP is a bit lacking).

@AlfCraft07

Can one implement a “SW rasterizer” that is backed by real HW?

Sure, but I doubt whether DWM-related stuff really supports a SW rasterizer. Another, more compatible, alternative solution could be implementing a stub KMD.

DWM supports software rendering starting from Windows 8 build 7880, so yes (if you’re talking about Windows 10).

@max8rr8
Author

max8rr8 commented Jan 28, 2025

Qualcomm basically only has a DX12 driver and uses all the Microsoft DX1-11/Vulkan/OpenGL/OpenCL-to-DX12 compatibility layers.

I am not sure how true this is. I downloaded the latest Adreno drivers from Lenovo and the WOA-project, and in their INFs they both specify qcdx11arm64xum.dll as the D3D10 and D3D11 UMD implementation. I ran the strings utility on this DLL and it yielded source paths that are nothing like D3D11on12 and look more like an actual implementation of a D3D11 UMD.

@gurchetansingh

Great work on this, @max8rr8. This really moves the ball forward for Windows guest support.

Just FYI, we would like to support {RADV, ANV} guest + D3DKMT/virtgpu guest + {Windows/Linux} host one day. Here's the high-level RFC for this:

https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33190

I think the custom daemon would be the approach that fits most nicely in terms of display integration. Happy to collaborate with the FOSS community on how to make this happen.

@v-fox

v-fox commented Jan 29, 2025

https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33190

You've lost me at:

The library is similar to libdrm, but with a more modern and unified interface…
The library will be written in Rust… 
**Q: Does the introduction of more Rust dependencies mean we can rewrite Mesa in Rust?**
You said it, not me. Maybe if we ever need a new compiler this could be handy..
**Q: Why not do this in C or C++?**
Is this 1987?

Rewriting core OS libraries in the worst LLVM fork for no reason is the last thing we need. Just write your own "Resa" for your toy OS in 20 years without the rest of us, please.

@gurchetansingh

If you are calling NIR the worst LLVM fork, then clearly you haven't seen LLVM forks, my friend!

Otherwise, mesa3d.rs is exactly what I'm aiming for. No need for Rust games to depend on C Khronos APIs!

@max8rr8
Author

max8rr8 commented Jan 29, 2025

Just FYI, we would like to support {RADV, ANV} guest + D3DKMT/virtgpu guest + {Windows/Linux} host one day. Here's the high-level RFC for this:

@gurchetansingh

This would indeed be quite useful for 3D acceleration in Windows guests, especially for Vulkan/DXVK, because right now AFAIK Venus is quite closely tied to DRM (no winsys separation like in Gallium). I would like to ask a few questions, though:

  1. What are the requirements on the guest's kernel driver? I suspect context init, blob resources, and host memory. Am I missing anything else?
  2. Are there any plans to support gfxstream on top of Magma? Or will gfxstream require its own implementation of the calls to the kernel virtio-gpu driver?
  3. User-mode drivers for the D3D11 runtime have a slightly different way of contacting the kernel side (through D3DDDICB calls) compared to normal drivers (D3DKMT drivers). In my Mesa patches I have abstracted such details through gdikmt; perhaps something similar would be worth considering for Magma.
  4. More out of curiosity, can gfxstream theoretically be adapted to also support DirectX 11/10?

Overall I think this is a quite nice direction. If support for Magma lands in RADV/ANV, I will try to implement the missing parts in this kernel driver and in virtgpu D3DKMT.

@gurchetansingh

  1. You are correct. Hopefully someone here can help implement these new virtio-gpu features :-)

  2. Magma proposes a new VIRTIO_GPU_CAPSET_MAGMA, while gfxstream currently relies on VIRTIO_GPU_CAPSET_GFXSTREAM. I think implementing "virtio" aware drivers {virgl, gfxstream, venus} on magma -- while theoretically possible -- could take the inception too far. For gfxstream, the recommended place for a Windows backend is:

https://gitlab.freedesktop.org/mesa/mesa/-/tree/main/src/gfxstream/guest/platform/windows

  3. I'm not actually too well-versed in the nuances of d3d11 versus d3d10 versus d3d12. Do you think this distinction is pertinent to Vulkan on D3DKMT drivers? It might be good to offer your insights here:

https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29945

I think that MR will set the standard for Windows Vulkan support in Mesa. RADV + D3DKMT/virtgpu guest + Linux host will most definitely come after that MR lands.

  4. If DirectX 11/10 can be layered on top of Vulkan (and the answer to that is "yes"?), gfxstream can work.

@DemiMarie

@max8rr8 I don’t think either Venus or gfxstream are particularly relevant in the long term. Native contexts are the future for both performance and security reasons.

@SyntheticBird45

SyntheticBird45 commented Jan 30, 2025

Stop shilling your methods, ladies and gentlemen. We're all doomed anyway unless freedesktop.org finds a new sponsor for its infra.

@Entropy512

Entropy512 commented Jan 30, 2025

@max8rr8 I don’t think either Venus or gfxstream are particularly relevant in the long term. Native contexts are the future for both performance and security reasons.

That's EXTREMELY long-term though: basically not until NVK comes close to the capabilities of NVIDIA's proprietary drivers. There are going to be a lot of use cases where virtio would be the preferred approach if possible, but an NVIDIA card will be specified with SR-IOV as the fallback risk mitigation because that's the known, tested configuration. It looks like native context for Intel and AMD was implemented in each case by employees of those companies, so I wouldn't expect to see NVIDIA support any time soon.

Also, in Google's use case (gfxstream/ChromeOS/Android), while Adreno was the first mobile GPU supported by native context (and in fact pioneered the concept), work on any other mobile GPU (Mali, VideoCore) has not even started.

@max8rr8
Author

max8rr8 commented Jan 30, 2025

@gurchetansingh, thank you for the answers.

You are correct. Hopefully someone here can help implement these new virtio-gpu features :-)

I see. Context init would just require handling pPrivateData from D3DKMTCreateContext; host memory requires exposing the PCI BAR to WDDM and calling VIRTIO_GPU_CMD_RESOURCE_MAP_BLOB when needed. The biggest obstacle is blob resources, which require data from the context that is not available during resource creation (the resource's context is only known at AttachResource); this would require deferred initialization and abstracting the different types of resources (though that would also help with command-buffer submission performance and help the migration to WDDM v2). I will try to do that when I get a larger chunk of free time.

I'm not actually too well-versed in the nuances of d3d11 versus d3d10 versus d3d12. Do you think this distinction is pertinent to Vulkan on D3DKMT drivers? It might be good to offer your insights here:

I was under the misconception that Magma could also be used in other drivers (e.g. Gallium). But if it is Vulkan-only (which is still very cool and needed), then only D3DKMT should be used. Thank you for clarifying.

@max8rr8 I don’t think either Venus or gfxstream are particularly relevant in the long term. Native contexts are the future for both performance and security reasons.

@DemiMarie While I agree that native contexts (and Magma) are definitely better, there will likely be complications with guest driver support (VirGL, for example, couldn't compile on Windows at all; I suspect other drivers might have similar problems) and with unsupported hardware. That's why I initially went with a solution on top of just VirGL. But considering the Magma development, for Vulkan support in this driver I will likely try to do it with RADV in the guest to start.

@DemiMarie

@max8rr8 I don’t think either Venus or gfxstream are particularly relevant in the long term. Native contexts are the future for both performance and security reasons.

That's EXTREMELY long-term though: basically not until NVK comes close to the capabilities of NVIDIA's proprietary drivers. There are going to be a lot of use cases where virtio would be the preferred approach if possible, but an NVIDIA card will be specified with SR-IOV as the fallback risk mitigation because that's the known, tested configuration.

Are you thinking of compute or graphics use-cases? Nvidia limits SR-IOV to enterprise GPUs, which severely limits its applicability.

@Entropy512

Entropy512 commented Jan 30, 2025

Are you thinking of compute or graphics use-cases? Nvidia limits SR-IOV to enterprise GPUs, which severely limits its applicability.

Both. The cost delta between an RTX 5000 Ada (supports vGPU) and an RTX 4000 Ada (does not) is about one engineer-week's worth of labor, so it was specced as a risk mitigation in one project I am working on now, in case it was needed. AMD was a complete non-starter due to a long history of engineers here and in our sister business units running into driver-quality issues with their products.

Venus (or native context) would allow for more flexibility (not having to partition the GPU in advance; also, PCI passthrough of any form kills live migration), but it's a nice-to-have. For the foreseeable future, AMD cards aren't going to be specced by most enterprise customers, at least not until AMD's Windows driver quality comes up to par with Mesa on Linux (or AMD replaces their proprietary Windows Vulkan driver with RADV).

Actually, in our use case video-encode acceleration (to push pixels to thin clients) is going to be far more useful than 3D, although being able to offload graphics would free up some cores, and one internal product is starting to migrate from pure CPU to GPU compute to accelerate some operations, so hardware acceleration for OpenCL is likely to be beneficial down the line as well. Both Venus and native context provide that; the current architecture here (using virgl) does not. The theoretical security issues with Venus aren't a significant concern either: all VMs are trusted, all users are trusted, and in production the system is not connected to the outside world.

@DemiMarie

@Entropy512 Do you need shared virtual memory (SVM)? Implementing that with virtio-GPU is very difficult for technical reasons, whether one is using native contexts or not.
