Did you run this test on the latest nightly and with the updated models?
For me, the RAW denoise task on the CUDA EP on Windows used to be very slow too, but with the recent updates it is lightning fast.
RAW denoise timings for the same image on Windows 11:
- DirectML – 1.33 s
- CUDA – 1.30 s
The actual numbers depend on hardware and image size, but in general RAW denoise on CUDA on Windows is no longer slow.
Please see: