[Request] Remove CUDA dependency #439

Open
mia-0 opened this issue Jul 9, 2018 · 182 comments
Labels
CUDA wip work in progress

Comments

@mia-0
Contributor

mia-0 commented Jul 9, 2018

I realize this is a lot to just ask without the time and ability to implement and maintain the necessary changes myself. Hence, I’ll spare you the long-winded political speech. The gist of it is: You can’t call something free and open source software when it depends on and endorses proprietary components whose only purpose is vendor lock-in.

AliceVision should be able to work without CUDA, no matter how glacially slowly. I prefer inefficient CPU-only computation that spills registers and caches all over the place over a requirement for GPUs with inferior Linux support and very unstable drivers (my GTX 960 keeps freezing my computer whenever NVIDIA’s driver decides it doesn’t want to do memory management anymore, and it is IMPOSSIBLE to report this problem to NVIDIA unless you’re a big corp and there’s money involved). I simply don’t have the patience to deal with this garbage, and I desperately want to move to a different GPU vendor so I get proper support for my platform.

Ideally, the CUDA parts should be ported to an open platform such as Vulkan or the older OpenCL.

@fabiencastan
Member

Currently, we have neither the interest nor the resources to do another implementation of the CUDA code to another GPU framework.
If someone is willing to make this contribution, we will support it and help with integration.

@cody-code-wy

cody-code-wy commented Jul 14, 2018

I looked into whether any tools could make such a transition easier and found a project called swan that is meant to make it very simple to effectively 'translate' CUDA kernels and code into OpenCL equivalents. It has not been updated in some time, though, so it may not help very much.

I feel like it's worth pointing out also that OpenCL works on most embedded GPUs and integrated GPUs, and many FPGAs include drop-in modules to allow OpenCL functionality. All of this means that if a change like this were made to AliceVision, there would be many new potential uses, such as microcomputer clusters or use directly on mobile devices.

@fabiencastan
Member

It's difficult to find a good solution in this technology war, with Apple's deprecation of OpenGL and OpenCL:
https://developer.apple.com/macos/whats-new#deprecationofopenglandopencl

Another interesting project on this topic is HIP:
https://gpuopen.com/compute-product/hip-convert-cuda-to-portable-c-code

@zvrba
Member

zvrba commented Jul 14, 2018

That one is easy though: ditch OSX support. OSX is IME by far the worst and most buggy implementation of POSIX APIs that I've had to work with.

@cody-code-wy

I agree that there's no particularly good solution currently.

I agree that Apple's deprecation of OpenCL could be somewhat problematic, but I feel like it's worth pointing out that relatively few of Apple's systems have any support for Nvidia cards, so CUDA is not much better for supporting macOS.

Also, HIP looks like a pretty nice option. There seem to have been a few interesting similar projects in the past, like gpuocelot, which is sadly now defunct.

Apparently Vulkan can be used for GPGPU, and that's supported on Windows and Linux on both AMD and Nvidia, and with MoltenVK on anything supporting Apple's Metal APIs. But Vulkan is still pretty new, so there's not much info out there about using it for GPGPU...

@fabiencastan
Member

I would be interested in trying Halide, as it lets you write high-level algorithms while still allowing fine-tuning of the scheduling. It then generates code for each target.

HPG2017_FastImageProcessing
halide-inria-march2017

@zvrba
Member

zvrba commented Jul 15, 2018

ISPC (https://ispc.github.io/) could be another option. It also has an (experimental) PTX backend.

@cody-code-wy

Halide looks like a pretty good option. While there is no Metal backend yet, it looks like (from issues on their GitHub) a few people may be working on one, and OS X obviously still has OpenCL for now.

And with support for ARMv7/NEON it could be used on Raspberry Pis (2 and later) and the like, and even Android devices. That could seriously open up what AliceVision could be used for in the future.

@zvrba
Member

zvrba commented Jul 25, 2018

I'm skeptical about using something not backed by a major industry vendor. Halide is an academic project, they may get tired of developing it (when they've exhausted publishable stuff), they probably don't care about breaking changes (from the homepage: "These academic publications describe the ideas behind Halide and its scheduling model. Halide syntax changes over time, so don't rely on them for correct syntax."), etc.

Tools from major industry vendors (nvidia, intel) aren't open-source. So what?

If there's a viable alternative to CUDA, it's SYCL (Khronos standard; opencl using modern c++, i.e., something resembling CUDA), but the downside is that there are no free (as in beer) quality compilers that I'm aware of.

OpenCL seems to be the most future-oriented as it can support FPGAs as well. Intel has acquired Altera and another FPGA manufacturer, and OpenCL tooling will probably follow.

@AndreaMonzini

AndreaMonzini commented Jul 27, 2018

We are trying to compile Meshroom and AliceVision on Linux, but it's sad to discover that it will only work with a proprietary solution that I do not have (I use an AMD GPU with the Mesa driver).

@mia-0
Contributor Author

mia-0 commented Jul 27, 2018

To me the issue is: I do have the hardware, but it is just unstable as hell, requiring a lot of power cycles (since even the reset buttons stop working). Have been able to reproduce this with multiple kernel versions, driver versions, motherboards, GPUs, PSUs… It’s safe to say that it’s not a hardware issue, other than potential firmware bugs.

Anyway, my suggestion is to take a step back from all the frameworks and try to get just a basic C implementation done, with no drastic optimization whatsoever. My belief is that this will make future native ports (Vulkan, etc.) and SIMD optimization much easier, especially for outside contributors, because C is much more accessible. Also, before deciding on frameworks in an attempt to cover all potential use cases, it’s probably best to understand the challenges and requirements by doing a clean implementation with minimal external dependencies first.
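
To make the "basic C implementation" idea concrete, here is a minimal, unoptimized sketch of the kind of inner loop a CPU-only depth estimation step tends to be built on: a window-based matching cost and a brute-force search over candidate shifts. This is not AliceVision's algorithm. The names `patch_sad` and `best_disparity`, the sum-of-absolute-differences cost, and the purely horizontal search are all assumptions chosen for illustration.

```c
#include <stddef.h>

/* Hypothetical helper: clamp an index into [0, n - 1] so that windows near
 * the image border reuse edge pixels instead of reading out of bounds. */
static int clamp_index(int i, int n) {
    if (i < 0) return 0;
    if (i >= n) return n - 1;
    return i;
}

/* Sum of absolute differences (SAD) between two square windows of half-size
 * wsh: one centred at (xa, ya) in image a, the other at (xb, yb) in image b.
 * Both images are row-major, single-channel floats of size width x height. */
float patch_sad(const float *a, const float *b, int width, int height,
                int xa, int ya, int xb, int yb, int wsh) {
    float sum = 0.0f;
    for (int dy = -wsh; dy <= wsh; ++dy) {
        for (int dx = -wsh; dx <= wsh; ++dx) {
            size_t ia = (size_t)clamp_index(ya + dy, height) * (size_t)width
                      + (size_t)clamp_index(xa + dx, width);
            size_t ib = (size_t)clamp_index(yb + dy, height) * (size_t)width
                      + (size_t)clamp_index(xb + dx, width);
            float d = a[ia] - b[ib];
            sum += (d < 0.0f) ? -d : d;
        }
    }
    return sum;
}

/* Brute-force search (an illustrative simplification of depth estimation):
 * returns the horizontal shift d in [0, max_disp] that minimises the SAD
 * between left(x, y) and right(x - d, y). Ties keep the smallest d. */
int best_disparity(const float *left, const float *right,
                   int width, int height, int x, int y,
                   int wsh, int max_disp) {
    int best = 0;
    float best_cost = patch_sad(left, right, width, height, x, y, x, y, wsh);
    for (int d = 1; d <= max_disp; ++d) {
        float cost = patch_sad(left, right, width, height,
                               x, y, x - d, y, wsh);
        if (cost < best_cost) {
            best_cost = cost;
            best = d;
        }
    }
    return best;
}
```

Plain loops like these are slow, but they have no dependencies beyond the C standard library, which is the point of the suggestion: they give a readable reference that SIMD, Vulkan, or other ports could later be checked against.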

@AndreaMonzini

AndreaMonzini commented Jul 30, 2018

hi @fabiencastan is there a way to support a solution like HIP or Halide ?
Maybe an open-source bounty?

I think that supporting only one GPU vendor with a proprietary GPGPU solution is limiting for a very promising free and open source project.

I could find and buy a proprietary alternative for photogrammetry, but I prefer to support free and open source software, and I use an AMD GPU for its free and open source drivers.

https://github.com/ROCm-Developer-Tools/HIP

Anyway thank you for sharing your work :)

@AndreaMonzini

example of HIP porting:

https://gpuopen.com/ported-caffe-hip-heres-happened/

@Ashtreighlia

Hi everyone,

I read through the comments and it seems like the ditching of OpenCL/GL in the new OSX versions gives the developers a tiny headache on what computing language to use for this program.
I am a Mac user, and from following the "development" of new Macs (with Metal 1 & 2), it seems to me like they are ditching every other computing environment. The last Nvidia GPUs used in any model were from around 2013, and with the upcoming and already existing empire of Metal, this probably won't change soon.
Just want to give my view on the OSX "issue" ^^

Have a good one

@AndreaMonzini

AndreaMonzini commented Aug 28, 2018

Hello, from what I understand, HIP uses C++, so it should be compatible without OpenCL.

https://github.com/ROCm-Developer-Tools/HIP/blob/master/docs/markdown/hip_faq.md#how-does-hip-compare-with-opencl

@kwahoo2

kwahoo2 commented Aug 30, 2018

Here is output after running hip (rocm) converter:

adi@adi-ryzen7:~/kompilacje/AliceVision$ /opt/rocm/hip/bin/hipconvertinplace-perl.sh src
...
info: TOTAL-converted 713 CUDA->HIP refs( dev:153 mem:74 kern:150 coord_func:0 math_func:0 special_func:3 stream:0 event:0 err:7 def:3 tex:323 extern_shared:0 other:0 ) warn:39 LOC:665119
  warning: unconverted cudaReadModeNormalizedFloat : 9
  warning: unconverted cudaArraySurfaceLoadStore : 6
  warning: unconverted cudaExtent : 5
  warning: unconverted cudaMemcpy3DParms : 4
  warning: unconverted cudaMemcpy3D : 4
  warning: unconverted cudaMalloc3DArray : 3
  warning: unconverted cudaMalloc3D : 2
  warning: unconverted cudaMemcpy2DFromArray : 2
  warning: unconverted cudaMemcpyFromArray : 2
  warning: unconverted cudaPitchedPtr : 2
  kernels (2 total) :   nearestKernel(1)  pushPull_Pull_kernel(1)

@MrMinimal

@Storagraph The deprecation of OpenGL is not too much of a problem, since there are multiple translation libraries which can convert to multiple graphics backends. Khronos have succeeded in getting Vulkan to run everywhere regardless of graphics API, thanks to the portability initiative.
MoltenVK enables vendors to target macOS as well when using Vulkan's compute shaders. So Vulkan is the most portable option out there.

If anyone is intimidated by the Vulkan API, there is a project which reduces its complexity: V-EZ

So the CUDA dependency could be removed if Vulkan compute shaders were used.

@Ashtreighlia

@MrMinimal I just mentioned OpenGL for completeness.
Vulkan/OpenGL/DirectX/D3D (graphics APIs) are used for rasterization of 3D objects and are generally not used for computing tasks; OpenCL (Open Computing Language) is for computing.
There is a workaround using SPIR-V to access OpenCL via the front-end in Vulkan, but doesn't this also need support for OpenCL on OSX in the first place?
Just to mention it, Apple announced in a press release that they will ditch both OpenCL & GL.

Sorry for the confusion ^^

@AndreaMonzini

@kwahoo2 thank you for the conversion with HIP. I think it could be the right solution with additional work.

@PolarNick239

Currently HIP doesn't support Windows and doesn't support amdgpu-pro driver under Linux (in fact only rocm platform under Linux is supported).

@AndreaMonzini

As a supporter of free and open source software under Linux, I prefer the AMDGPU Mesa FOSS driver.
I would like to support AliceVision also because it's a FOSS project, and an open GPU compute option such as OpenCL, Vulkan, HIP or alternatives would be the best solution from a FOSS perspective.

@AndreaMonzini

AndreaMonzini commented Oct 8, 2018

Hello,
just to inform about a new interesting project based on Vulkan that could be useful:

https://github.com/jgbit/vuda

@beta-tester

beta-tester commented Oct 17, 2018

any chance to run AliceVision/Meshroom - CPU only - without any specialized hardware, without nVidia, ... ?
most of the discussion i see here is about nVidia, CUDA, AMD, Vulkan, macOS, Metal, ... (voodoo :P)

i have only an older intel CPU (i7-3xxx) with a "built-in" intel GPU (HD-4000) - i don't need more GPU power than the GPU on CPU.
to me, time doesn't matter...

@Freebase394

Freebase394 commented Jul 31, 2021

Hi all,

Same here:
With NITRO+ RX 5700 XT 8G

Same problems here:
Program called with the following parameters:

  • downscale = 4
  • exportIntermediateResults = 0
  • imagesFolder = "C:/Users/USERxxxx/AppData/Local/Temp/MeshroomCache/PrepareDenseScene/b5506dfdae0783ed4f2af93ed5e456c597410b68"
  • input = "C:/Users/USERxxxx/AppData/Local/Temp/MeshroomCache/StructureFromMotion/a18323503797a7ca24cbdebf412d6b86d0ad0932/sfm.abc"
  • maxViewAngle = 70
  • minViewAngle = 2
  • nbGPUs = 0
  • output = "C:/Users/USERxxxx/AppData/Local/Temp/MeshroomCache/DepthMap/36f66f7ffefbecb47504f7cbd92b7f8e44e4c9af"
  • rangeSize = 3
  • rangeStart = 0
  • refineGammaC = 15.5
  • refineGammaP = 8
  • refineMaxTCams = 3
  • refineNDepthsToRefine = 31
  • refineNSamplesHalf = 150
  • refineNiters = 100
  • refineSigma = 15
  • refineUseTcOrRcPixSize = 0
  • refineWSH = 3
  • sgmGammaC = 5.5
  • sgmGammaP = 8
  • sgmMaxTCams = 3
  • sgmWSH = 4
  • verboseLevel = "error"

[11:15:01.772700][error] cudaGetDeviceCount failed: CUDA driver version is insufficient for CUDA runtime version
[11:15:01.772700][error] This program needs a CUDA-Enabled GPU (with at least compute capability 2.0).

@zicklag

zicklag commented Jul 31, 2021

Hey folks, forgive me if this is out of place now, as I haven't been able to keep up with exactly what the current status on this thread is, but as far as technology for implementing a new version of the CUDA components, I might have an idea that would work on CPU, OpenGL compute, Vulkan compute, DirectX 12 compute, and Metal compute, without having to write multiple implementations.

If we used the Rust GPU project, we could implement the core algorithm in portable no_std Rust and then compile it for CPU and, using Rust GPU, to SPIR-V, which can be translated using naga or spirv-cross to HLSL, GLSL, and Metal to target each platform's natively supported graphics API.

Rust GPU is a relatively new project with lots of rough-edges, but the portability that we could achieve with it, being able to support CPU and GPU, when available, with one codebase, is incredible.

Again, maybe this is ill-timed, or not useful to the thread currently, but I thought I'd throw it out there just in case. 🙂

@tamara-schmitz

tamara-schmitz commented Aug 4, 2021

> Again, maybe this is ill-timed, or not useful to the thread currently, but I thought I'd throw it out there just in case. 🙂

I think the main issue is that there is no one up to trying to implement any solution, not a lack of ideas... At least not from the maintainers.
I'll experiment with this next week. If we can get a prototype going of any solution I'm sure we can convince the maintainers to help us.

EDIT: oh right. the maintainers had CI issues that led them to stop pursuing this further.

@zicklag

zicklag commented Aug 4, 2021

Maybe the AliceVision project could support dynamically loading a DLL with a simple C interface that could be used to perform the Depth Map operation. The community could then independently work on different implementations that provide the same C interface, using whatever technology they want: CPU-only C, OpenCL, Rust GPU, etc.

That might at least reduce the need for a super-official alternative implementation and allow the community to experiment on their own, without having to merge changes into AliceVision to get working prototypes and let other people try out alternative implementations. It might also help with any CI issues, because the alternate implementations would be purely community-provided at that point, for now anyway, and no responsibility is put on the existing maintainers.
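
As a rough illustration of what such a "simple C interface" could look like, here is a sketch of a backend table plus a selection function. Everything in it is hypothetical: the `depthmap_backend` struct, its fields, and the `compute` signature are invented for this example and are not an existing AliceVision API. The CPU `compute` function is only a stand-in that writes per-pixel absolute differences, not a real depth map. In practice each backend would live in its own shared library and be resolved with `dlopen`/`dlsym` or `LoadLibrary`/`GetProcAddress`; the sketch uses statically defined backends so that it stays self-contained.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical plugin interface: every backend exposes the same three
 * entry points, regardless of the technology behind it. */
typedef struct depthmap_backend {
    const char *name;
    /* Returns non-zero if the backend can run on this machine. */
    int (*is_available)(void);
    /* Stand-in for the real work: fills out[0..n-1], returns 0 on success. */
    int (*compute)(const float *ref, const float *target, size_t n,
                   float *out);
} depthmap_backend;

/* Stub for a GPU backend whose driver or runtime is missing. */
static int gpu_stub_is_available(void) { return 0; }
static int gpu_stub_compute(const float *ref, const float *target, size_t n,
                            float *out) {
    (void)ref; (void)target; (void)n; (void)out;
    return -1; /* cannot run */
}

/* CPU reference backend: always available. The "computation" here is a
 * placeholder (per-pixel absolute difference), not a depth map. */
static int cpu_is_available(void) { return 1; }
static int cpu_compute(const float *ref, const float *target, size_t n,
                       float *out) {
    for (size_t i = 0; i < n; ++i) {
        float d = ref[i] - target[i];
        out[i] = (d < 0.0f) ? -d : d;
    }
    return 0;
}

const depthmap_backend gpu_stub_backend = {
    "gpu-stub", gpu_stub_is_available, gpu_stub_compute
};
const depthmap_backend cpu_backend = {
    "cpu-reference", cpu_is_available, cpu_compute
};

/* Returns the first candidate that reports itself available, or NULL.
 * Candidates are listed in order of preference, fastest first. */
const depthmap_backend *pick_backend(const depthmap_backend *const *candidates,
                                     size_t count) {
    for (size_t i = 0; i < count; ++i) {
        const depthmap_backend *b = candidates[i];
        if (b != NULL && b->is_available != NULL && b->is_available()) {
            return b;
        }
    }
    return NULL;
}
```

With a candidate list of `{ &gpu_stub_backend, &cpu_backend }`, `pick_backend` falls through to the CPU reference when the GPU backend reports itself unavailable, which is the behaviour several commenters in this thread are asking for.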

@fabiencastan
Member

@zicklag Someone can create their own depthmap node, load the same input files and generate the same output exr files, and then use the full pipeline.
There are several open-source implementations of depth map estimation that could be integrated that way. If someone with C++ expertise is interested in doing that, I would be happy to support such an initiative. It would be interesting for comparison and evaluation.

@revisionarian

Our group is developing OpenCL photogrammetry software that uses the Meshroom GUI, named "MeshroomCL". Windows binaries are available here:

https://github.com/openphotogrammetry/meshroomcl/releases

Several users have enjoyed success with our software on AMD, NVIDIA, and Intel platforms, using the familiar Meshroom workflow.

@tamara-schmitz

> Our group is developing OpenCL photogrammetry software that uses the Meshroom GUI, named "MeshroomCL".

That's awesome! Any plans to upstream?

@ochafik

ochafik commented Feb 2, 2022

> > Our group is developing OpenCL photogrammetry software that uses the Meshroom GUI, named "MeshroomCL".
>
> That's awesome! Any plans to upstream?

@tamara-schmitz FWIW, opened another bug (their third) asking for the MPL2-licensed modified sources of that project to be made available alongside their binary releases 🤞🤞🤞

@natowi
Member

natowi commented Feb 2, 2022

@ochafik MeshroomCL is not actually using a ported version of the AliceVision library. It utilizes Colmap (CL) and MVE. The interesting bit here is the set of changes made to the Colmap CL and MVE executables to support the file formats and parameters used by the AliceVision/Meshroom pipeline.

The idea of using other libraries within Meshroom is not new and roughly outlined here.

@ochafik

ochafik commented Feb 2, 2022

@natowi thanks for the explanation! Ugh, given that Colmap is BSD-licensed, that probably means the chances of seeing colmap-cl changes upstreamed are slim. I've started renting CUDA instances in the cloud to test things out ¯\_(ツ)_/¯

@acxz

acxz commented Apr 23, 2022

@AndreaMonzini

https://github.com/oneapi-src/SYCLomatic

@philipturner

philipturner commented Nov 14, 2022

Working on a Metal backend for hipSYCL. SYCL seems like the most coherent effort to date to replace CUDA with another single-source API. Only downside - it takes nonzero time to replace CUDA with SYCL, and the maintainers may see no need to do so.

  • Intel: DPC++
  • NVIDIA: hipSYCL + CUDA
  • AMD: hipSYCL + HIP (HIP works on Windows for Blender)
  • Apple: hipSYCL + Metal

@fabiencastan
Member

fabiencastan commented Nov 14, 2022

Interesting, thanks for the links and the feedback.

@philipturner

As a practical limitation, what besides the CUDA language does this depend on? For example, if this repo depends on cuDNN or cuBLAS, those might be hard to replace with SYCL. I'm interested in the practical feasibility of this move. If it just requires lots of effort, perhaps some interested users might pitch in to help.

This may be helpful - an analysis of maturity of SYCL (2021, 13 pages): https://dl.acm.org/doi/10.1145/3456669.3456701

@balaclava9

Good News for hipSYCL, I think. I have no ability to implement something like this though.
https://hipsycl.github.io/hipsycl/sscp/compiler/generic-sscp/

@michal2229

In my opinion even a basic CPU implementation would suffice for a lot of people (including me). Any pointers on how to begin such work?

@seifertm

> Good News for hipSYCL, I think. I have no ability to implement something like this though. https://hipsycl.github.io/hipsycl/sscp/compiler/generic-sscp/

Apparently, hipSYCL has been renamed to OpenSYCL, so the link is now broken.

The referenced article can now be found here:
https://opensycl.github.io/hipsycl/sscp/compiler/generic-sscp/

@jahav

jahav commented Jul 27, 2023

Note that OpenSYCL will be renamed to AdaptiveCpp in the near future (see AdaptiveCpp/AdaptiveCpp#999 (comment) for background).

@stendarr

stendarr commented Mar 6, 2024

Since I think this is highly relevant I'd like to mention Zluda again. The maintainer has switched from supporting Intel GPUs only to AMD GPUs only due to a change in employment. Support for AliceVision (Meshroom) is not quite there yet but I'm sure the project would benefit from some of the knowledgeable eyes around here.

@Nprod

Nprod commented Mar 8, 2024

We're approaching 6 years after this issue was opened and nothing has been done. I hate Nvidia so much. Shame on any developer who uses this cursed trojan horse of a library.

> Since I think this is highly relevant I'd like to mention Zluda again. The maintainer has switched from supporting Intel GPUs only to AMD GPUs only due to a change in employment. Support for AliceVision (Meshroom) is not quite there yet but I'm sure the project would benefit from some of the knowledgeable eyes around here.

They have already started to make their move on it by changing their EULA to prohibit specifically the use of translation layers. Considering they've oriented themselves as an AI company now, they will certainly fight to prevent things like ZLUDA from working to the bitter end.

@stendarr

stendarr commented Mar 8, 2024

> They have already started to make their move on it by changing their EULA to prohibit specifically the use of translation layers.

Yes and no. Their EULA has prohibited reverse engineering and translating for a very long time. See this licence file (line 117) for example.

What has changed is that this warning comes with the installation process. See this tomshardware article for more information.

@Nprod

Nprod commented Mar 8, 2024

> > They have already started to make their move on it by changing their EULA to prohibit specifically the use of translation layers.
>
> Yes and no. Their EULA has prohibited reverse engineering and translating for a very long time. See this licence file (line 117) for example.
>
> What has changed is that this warning comes with the installation process. See this tomshardware article for more information.

I'm not familiar with the specifics, but what is stopping them from altering their compiler just enough to make ZLUDA unusable for any future versions? Autodesk does this move all the time when they change the spec for the FBX file format a little every version so that other software packages can't use it and have to put some effort into reverse-engineering it again every time.

@1kaiser

1kaiser commented Mar 8, 2024

Could we put a wasm compiler in the software itself, so that before it runs it can check whether a compiled version is available for the compatible hardware? Some alternatives that have been helpful in the past are worth mentioning:

https://github.com/openphotogrammetry

https://github.com/nerfstudio-project/nerfstudio

https://github.com/google-research/google-research/tree/master/jaxnerf

@NeoIsrafil

For what it's worth (not much I know ^_^), I too would very much like it if I could use my renderbox that was built for graphics rendering (and some gaming) to run Meshroom, but she's a multiple AMD Vega based machine. I'm lucky in that my fiance's computer has an older Nvidia graphics card that will run CUDA-based stuff, but it's like cruising around in your Prius when you've got a Lamborghini in the driveway. Feelsbadman. lol. I'll make do with what I can while I have to, but it would be amazing if there were a version running on Vulkan, since basically everything modern-ish can use Vulkan. I wish I had the knowledge to help or take on such a project, but programming isn't exactly my strong suit. I know just enough to get myself in trouble and write some basic stuff, but I went to school for 3D art and game design, so everything I know of programming is self-taught, and it isn't a lot.

@natowi
Member

natowi commented May 25, 2024

@NeoIsrafil AliceVision is now supported by ZLUDA "ZLUDA lets you run unmodified CUDA applications with near-native performance on AMD GPUs."

However, some testing is still needed. You can find early binaries here alicevision/Meshroom#595 (comment) - I put it together blind, since I don't have an AMD GPU... Or put it together on your own: https://github.com/vosen/ZLUDA/releases/tag/v3

@Jakdaw

Jakdaw commented Jul 18, 2024

https://docs.scale-lang.com/ might be another path for AMD GPU on Linux (which fails-fast with Zluda at present)

@cg9999

cg9999 commented Jul 18, 2024

> https://docs.scale-lang.com/ might be another path for AMD GPU on Linux (which fails-fast with Zluda at present)

sadly seems to be closed source
