[MadNLPGPU] Upgrade CUDSS -- support iterative refinement and hybrid mode #329

Merged 1 commit on Jul 15, 2024


amontoison
Contributor

No description provided.

@amontoison amontoison changed the title [MadNLPGPU] Upgrade CUDSS and support iterative refinement [MadNLPGPU] Upgrade CUDSS -- support iterative refinement and hybrid mode Jul 5, 2024
@amontoison
Contributor Author

cc @frapac

@amontoison
Contributor Author

I tested on my cluster and a few tests fail.
The failures are unrelated to this PR.

Error During Test at /home/montalex/.julia/packages/MadNLPTests/kA3ek/src/MadNLPTests.jl:139
  Got exception outside of a @test
  type CuSparseMatrixCSC has no field nzval
  Stacktrace:
    [1] getproperty
      @ ./Base.jl:37 [inlined]
    [2] initialize!(kkt::MadNLP.SparseCondensedKKTSystem{Float64, CuArray{Float64, 1, CUDA.DeviceMemory}, CUDA.CUSPARSE.CuSparseMatrixCSC{Float64, Int32}, MadNLP.ExactHessian{Float64, CuArray{Float64, 1, CUDA.DeviceMemory}}, LapackGPUSolver{Float64}, CuArray{Int64, 1, CUDA.DeviceMemory}, CuArray{Int32, 1, CUDA.DeviceMemory}, CuArray{Tuple{Int32, Int32}, 1, CUDA.DeviceMemory}, CuArray{Tuple{Int32, Tuple{Int64, Int64, Int64}}, 1, CUDA.DeviceMemory}, @NamedTuple{jptrptr::CuArray{Int64, 1, CUDA.DeviceMemory}, hess_com_ptr::CuArray{Tuple{Int64, Int64}, 1, CUDA.DeviceMemory}, hess_com_ptrptr::CuArray{Int64, 1, CUDA.DeviceMemory}, jt_csc_ptr::CuArray{Tuple{Int64, Int64}, 1, CUDA.DeviceMemory}, jt_csc_ptrptr::CuArray{Int64, 1, CUDA.DeviceMemory}, diag_map_to::CuArray{Int32, 1, CUDA.DeviceMemory}, diag_map_fr::CuArray{Int32, 1, CUDA.DeviceMemory}}})
      @ MadNLP ~/.julia/packages/MadNLP/u0fX5/src/KKT/sparse.jl:426
    [3] initialize!(solver::MadNLPSolver{Float64, CuArray{Float64, 1, CUDA.DeviceMemory}, CuArray{Int64, 1, CUDA.DeviceMemory}, MadNLP.SparseCondensedKKTSystem{Float64, CuArray{Float64, 1, CUDA.DeviceMemory}, CUDA.CUSPARSE.CuSparseMatrixCSC{Float64, Int32}, MadNLP.ExactHessian{Float64, CuArray{Float64, 1, CUDA.DeviceMemory}}, LapackGPUSolver{Float64}, CuArray{Int64, 1, CUDA.DeviceMemory}, CuArray{Int32, 1, CUDA.DeviceMemory}, CuArray{Tuple{Int32, Int32}, 1, CUDA.DeviceMemory}, CuArray{Tuple{Int32, Tuple{Int64, Int64, Int64}}, 1, CUDA.DeviceMemory}, @NamedTuple{jptrptr::CuArray{Int64, 1, CUDA.DeviceMemory}, hess_com_ptr::CuArray{Tuple{Int64, Int64}, 1, CUDA.DeviceMemory}, hess_com_ptrptr::CuArray{Int64, 1, CUDA.DeviceMemory}, jt_csc_ptr::CuArray{Tuple{Int64, Int64}, 1, CUDA.DeviceMemory}, jt_csc_ptrptr::CuArray{Int64, 1, CUDA.DeviceMemory}, diag_map_to::CuArray{Int32, 1, CUDA.DeviceMemory}, diag_map_fr::CuArray{Int32, 1, CUDA.DeviceMemory}}}, MadNLPTests.SparseWrapperModel{Float64, CuArray{Float64, 1, CUDA.DeviceMemory}, Float64, Vector{Int64}, Vector{Float64}, NLPModelsJuMP.MathOptNLPModel}, MadNLP.SparseCallback{Float64, CuArray{Float64, 1, CUDA.DeviceMemory}, CuArray{Int64, 1, CUDA.DeviceMemory}, MadNLPTests.SparseWrapperModel{Float64, CuArray{Float64, 1, CUDA.DeviceMemory}, Float64, Vector{Int64}, Vector{Float64}, NLPModelsJuMP.MathOptNLPModel}, MadNLP.RelaxBound, MadNLP.RelaxEquality}, MadNLP.RichardsonIterator{Float64, MadNLP.SparseCondensedKKTSystem{Float64, CuArray{Float64, 1, CUDA.DeviceMemory}, CUDA.CUSPARSE.CuSparseMatrixCSC{Float64, Int32}, MadNLP.ExactHessian{Float64, CuArray{Float64, 1, CUDA.DeviceMemory}}, LapackGPUSolver{Float64}, CuArray{Int64, 1, CUDA.DeviceMemory}, CuArray{Int32, 1, CUDA.DeviceMemory}, CuArray{Tuple{Int32, Int32}, 1, CUDA.DeviceMemory}, CuArray{Tuple{Int32, Tuple{Int64, Int64, Int64}}, 1, CUDA.DeviceMemory}, @NamedTuple{jptrptr::CuArray{Int64, 1, CUDA.DeviceMemory}, hess_com_ptr::CuArray{Tuple{Int64, Int64}, 1, CUDA.DeviceMemory}, 
hess_com_ptrptr::CuArray{Int64, 1, CUDA.DeviceMemory}, jt_csc_ptr::CuArray{Tuple{Int64, Int64}, 1, CUDA.DeviceMemory}, jt_csc_ptrptr::CuArray{Int64, 1, CUDA.DeviceMemory}, diag_map_to::CuArray{Int32, 1, CUDA.DeviceMemory}, diag_map_fr::CuArray{Int32, 1, CUDA.DeviceMemory}}}}, MadNLP.InertiaBased, MadNLP.UnreducedKKTVector{Float64, CuArray{Float64, 1, CUDA.DeviceMemory}, CuArray{Int64, 1, CUDA.DeviceMemory}}})
      @ MadNLP ~/.julia/packages/MadNLP/u0fX5/src/IPM/solver.jl:60
    [4] solve!(nlp::MadNLPTests.SparseWrapperModel{Float64, CuArray{Float64, 1, CUDA.DeviceMemory}, Float64, Vector{Int64}, Vector{Float64}, NLPModelsJuMP.MathOptNLPModel}, solver::MadNLPSolver{Float64, CuArray{Float64, 1, CUDA.DeviceMemory}, CuArray{Int64, 1, CUDA.DeviceMemory}, MadNLP.SparseCondensedKKTSystem{Float64, CuArray{Float64, 1, CUDA.DeviceMemory}, CUDA.CUSPARSE.CuSparseMatrixCSC{Float64, Int32}, MadNLP.ExactHessian{Float64, CuArray{Float64, 1, CUDA.DeviceMemory}}, LapackGPUSolver{Float64}, CuArray{Int64, 1, CUDA.DeviceMemory}, CuArray{Int32, 1, CUDA.DeviceMemory}, CuArray{Tuple{Int32, Int32}, 1, CUDA.DeviceMemory}, CuArray{Tuple{Int32, Tuple{Int64, Int64, Int64}}, 1, CUDA.DeviceMemory}, @NamedTuple{jptrptr::CuArray{Int64, 1, CUDA.DeviceMemory}, hess_com_ptr::CuArray{Tuple{Int64, Int64}, 1, CUDA.DeviceMemory}, hess_com_ptrptr::CuArray{Int64, 1, CUDA.DeviceMemory}, jt_csc_ptr::CuArray{Tuple{Int64, Int64}, 1, CUDA.DeviceMemory}, jt_csc_ptrptr::CuArray{Int64, 1, CUDA.DeviceMemory}, diag_map_to::CuArray{Int32, 1, CUDA.DeviceMemory}, diag_map_fr::CuArray{Int32, 1, CUDA.DeviceMemory}}}, MadNLPTests.SparseWrapperModel{Float64, CuArray{Float64, 1, CUDA.DeviceMemory}, Float64, Vector{Int64}, Vector{Float64}, NLPModelsJuMP.MathOptNLPModel}, MadNLP.SparseCallback{Float64, CuArray{Float64, 1, CUDA.DeviceMemory}, CuArray{Int64, 1, CUDA.DeviceMemory}, MadNLPTests.SparseWrapperModel{Float64, CuArray{Float64, 1, CUDA.DeviceMemory}, Float64, Vector{Int64}, Vector{Float64}, NLPModelsJuMP.MathOptNLPModel}, MadNLP.RelaxBound, MadNLP.RelaxEquality}, MadNLP.RichardsonIterator{Float64, MadNLP.SparseCondensedKKTSystem{Float64, CuArray{Float64, 1, CUDA.DeviceMemory}, CUDA.CUSPARSE.CuSparseMatrixCSC{Float64, Int32}, MadNLP.ExactHessian{Float64, CuArray{Float64, 1, CUDA.DeviceMemory}}, LapackGPUSolver{Float64}, CuArray{Int64, 1, CUDA.DeviceMemory}, CuArray{Int32, 1, CUDA.DeviceMemory}, CuArray{Tuple{Int32, Int32}, 1, CUDA.DeviceMemory}, CuArray{Tuple{Int32, Tuple{Int64, Int64, 
Int64}}, 1, CUDA.DeviceMemory}, @NamedTuple{jptrptr::CuArray{Int64, 1, CUDA.DeviceMemory}, hess_com_ptr::CuArray{Tuple{Int64, Int64}, 1, CUDA.DeviceMemory}, hess_com_ptrptr::CuArray{Int64, 1, CUDA.DeviceMemory}, jt_csc_ptr::CuArray{Tuple{Int64, Int64}, 1, CUDA.DeviceMemory}, jt_csc_ptrptr::CuArray{Int64, 1, CUDA.DeviceMemory}, diag_map_to::CuArray{Int32, 1, CUDA.DeviceMemory}, diag_map_fr::CuArray{Int32, 1, CUDA.DeviceMemory}}}}, MadNLP.InertiaBased, MadNLP.UnreducedKKTVector{Float64, CuArray{Float64, 1, CUDA.DeviceMemory}, CuArray{Int64, 1, CUDA.DeviceMemory}}}, stats::MadNLP.MadNLPExecutionStats{Float64, CuArray{Float64, 1, CUDA.DeviceMemory}}; x::Nothing, y::Nothing, zl::Nothing, zu::Nothing, kwargs::@Kwargs{})
      @ MadNLP ~/.julia/packages/MadNLP/u0fX5/src/IPM/solver.jl:159
    [5] solve!
      @ ~/.julia/packages/MadNLP/u0fX5/src/IPM/solver.jl:128 [inlined]
    [6] solve!
      @ ~/.julia/packages/MadNLP/u0fX5/src/IPM/solver.jl:14 [inlined]
    [7] solve!(solver::MadNLPSolver{Float64, CuArray{Float64, 1, CUDA.DeviceMemory}, CuArray{Int64, 1, CUDA.DeviceMemory}, MadNLP.SparseCondensedKKTSystem{Float64, CuArray{Float64, 1, CUDA.DeviceMemory}, CUDA.CUSPARSE.CuSparseMatrixCSC{Float64, Int32}, MadNLP.ExactHessian{Float64, CuArray{Float64, 1, CUDA.DeviceMemory}}, LapackGPUSolver{Float64}, CuArray{Int64, 1, CUDA.DeviceMemory}, CuArray{Int32, 1, CUDA.DeviceMemory}, CuArray{Tuple{Int32, Int32}, 1, CUDA.DeviceMemory}, CuArray{Tuple{Int32, Tuple{Int64, Int64, Int64}}, 1, CUDA.DeviceMemory}, @NamedTuple{jptrptr::CuArray{Int64, 1, CUDA.DeviceMemory}, hess_com_ptr::CuArray{Tuple{Int64, Int64}, 1, CUDA.DeviceMemory}, hess_com_ptrptr::CuArray{Int64, 1, CUDA.DeviceMemory}, jt_csc_ptr::CuArray{Tuple{Int64, Int64}, 1, CUDA.DeviceMemory}, jt_csc_ptrptr::CuArray{Int64, 1, CUDA.DeviceMemory}, diag_map_to::CuArray{Int32, 1, CUDA.DeviceMemory}, diag_map_fr::CuArray{Int32, 1, CUDA.DeviceMemory}}}, MadNLPTests.SparseWrapperModel{Float64, CuArray{Float64, 1, CUDA.DeviceMemory}, Float64, Vector{Int64}, Vector{Float64}, NLPModelsJuMP.MathOptNLPModel}, MadNLP.SparseCallback{Float64, CuArray{Float64, 1, CUDA.DeviceMemory}, CuArray{Int64, 1, CUDA.DeviceMemory}, MadNLPTests.SparseWrapperModel{Float64, CuArray{Float64, 1, CUDA.DeviceMemory}, Float64, Vector{Int64}, Vector{Float64}, NLPModelsJuMP.MathOptNLPModel}, MadNLP.RelaxBound, MadNLP.RelaxEquality}, MadNLP.RichardsonIterator{Float64, MadNLP.SparseCondensedKKTSystem{Float64, CuArray{Float64, 1, CUDA.DeviceMemory}, CUDA.CUSPARSE.CuSparseMatrixCSC{Float64, Int32}, MadNLP.ExactHessian{Float64, CuArray{Float64, 1, CUDA.DeviceMemory}}, LapackGPUSolver{Float64}, CuArray{Int64, 1, CUDA.DeviceMemory}, CuArray{Int32, 1, CUDA.DeviceMemory}, CuArray{Tuple{Int32, Int32}, 1, CUDA.DeviceMemory}, CuArray{Tuple{Int32, Tuple{Int64, Int64, Int64}}, 1, CUDA.DeviceMemory}, @NamedTuple{jptrptr::CuArray{Int64, 1, CUDA.DeviceMemory}, hess_com_ptr::CuArray{Tuple{Int64, Int64}, 1, CUDA.DeviceMemory}, 
hess_com_ptrptr::CuArray{Int64, 1, CUDA.DeviceMemory}, jt_csc_ptr::CuArray{Tuple{Int64, Int64}, 1, CUDA.DeviceMemory}, jt_csc_ptrptr::CuArray{Int64, 1, CUDA.DeviceMemory}, diag_map_to::CuArray{Int32, 1, CUDA.DeviceMemory}, diag_map_fr::CuArray{Int32, 1, CUDA.DeviceMemory}}}}, MadNLP.InertiaBased, MadNLP.UnreducedKKTVector{Float64, CuArray{Float64, 1, CUDA.DeviceMemory}, CuArray{Int64, 1, CUDA.DeviceMemory}}})
      @ MadNLP ~/.julia/packages/MadNLP/u0fX5/src/IPM/solver.jl:17
    [8] madnlp(model::MadNLPTests.SparseWrapperModel{Float64, CuArray{Float64, 1, CUDA.DeviceMemory}, Float64, Vector{Int64}, Vector{Float64}, NLPModelsJuMP.MathOptNLPModel}; kwargs::@Kwargs{print_level::MadNLP.LogLevels, linear_solver::UnionAll, lapack_algorithm::MadNLP.LinearFactorization})
      @ MadNLP ~/.julia/packages/MadNLP/u0fX5/src/IPM/solver.jl:11
    [9] macro expansion
      @ ~/.julia/packages/MadNLPTests/kA3ek/src/MadNLPTests.jl:149 [inlined]
   [10] macro expansion
      @ ~/Applications/julia/julia-1.10.3/share/julia/stdlib/v1.10/Test/src/Test.jl:1577 [inlined]
   [11] unbounded(optimizer_constructor::var"#11#22"; Arr::Type)
      @ MadNLPTests ~/.julia/packages/MadNLPTests/kA3ek/src/MadNLPTests.jl:140
   [12] macro expansion
      @ ~/.julia/packages/MadNLPTests/kA3ek/src/MadNLPTests.jl:115 [inlined]
   [13] macro expansion
      @ ~/Applications/julia/julia-1.10.3/share/julia/stdlib/v1.10/Test/src/Test.jl:1577 [inlined]
   [14] test_madnlp(name::String, optimizer_constructor::Function, exclude::Vector{String}; Arr::Type)
      @ MadNLPTests ~/.julia/packages/MadNLPTests/kA3ek/src/MadNLPTests.jl:114
   [15] macro expansion
      @ ~/Argonne/MadNLP.jl/lib/MadNLPGPU/test/madnlpgpu_test.jl:106 [inlined]
   [16] macro expansion
      @ ~/Applications/julia/julia-1.10.3/share/julia/stdlib/v1.10/Test/src/Test.jl:1577 [inlined]
   [17] top-level scope
      @ ~/Argonne/MadNLP.jl/lib/MadNLPGPU/test/madnlpgpu_test.jl:102
   [18] include(fname::String)
      @ Base.MainInclude ./client.jl:489
   [19] macro expansion
      @ ~/Argonne/MadNLP.jl/lib/MadNLPGPU/test/runtests.jl:7 [inlined]
   [20] macro expansion
      @ ~/Applications/julia/julia-1.10.3/share/julia/stdlib/v1.10/Test/src/Test.jl:1577 [inlined]
   [21] top-level scope
      @ ~/Argonne/MadNLP.jl/lib/MadNLPGPU/test/runtests.jl:7
   [22] include(fname::String)
      @ Base.MainInclude ./client.jl:489
   [23] top-level scope
      @ none:6
   [24] eval
      @ ./boot.jl:385 [inlined]
   [25] exec_options(opts::Base.JLOptions)
      @ Base ./client.jl:291
   [26] _start()
      @ Base ./client.jl:552
Test Summary:                                                | Pass  Error  Total     Time
MadNLPGPU test                                               |  170     51    221  5m20.0s
  MadNLPGPU test                                             |    2     51     53  2m42.3s
    CUDSS                                                    |           5      5  1m19.8s
      infeasible                                             |           1      1  1m08.2s
      unbounded                                              |           1      1     0.6s
      lootsma                                                |           1      1     2.1s
      eigmina                                                |           1      1     2.1s
      lp_examodels_issue75                                   |           1      1     0.4s
    CUDSS-AMD                                                |           5      5     5.5s
      infeasible                                             |           1      1     0.5s
      unbounded                                              |           1      1     0.7s
      lootsma                                                |           1      1     0.3s
      eigmina                                                |           1      1     0.4s
      lp_examodels_issue75                                   |           1      1     0.3s
    CUDSS-METIS                                              |           5      5     5.3s
      infeasible                                             |           1      1     0.4s
      unbounded                                              |           1      1     0.3s
      lootsma                                                |           1      1     0.3s
      eigmina                                                |           1      1     0.4s
      lp_examodels_issue75                                   |           1      1     0.3s
    CUDSS-HYBRID                                             |           5      5     5.3s
      infeasible                                             |           1      1     0.6s
      unbounded                                              |           1      1     0.3s
      lootsma                                                |           1      1     0.4s
      eigmina                                                |           1      1     0.4s
      lp_examodels_issue75                                   |           1      1     0.3s
    CUSOLVERRF                                               |           5      5     9.7s
      infeasible                                             |           1      1     4.9s
      unbounded                                              |           1      1     0.4s
      lootsma                                                |           1      1     0.3s
      eigmina                                                |           1      1     0.4s
      lp_examodels_issue75                                   |           1      1     0.3s
    CUSOLVER-CHOLESKY                                        |           5      5    10.1s
      infeasible                                             |           1      1     5.5s
      unbounded                                              |           1      1     0.3s
      lootsma                                                |           1      1     0.3s
      eigmina                                                |           1      1     0.4s
      lp_examodels_issue75                                   |           1      1     0.3s
    GLU                                                      |           5      5     8.6s
      infeasible                                             |           1      1     3.9s
      unbounded                                              |           1      1     0.3s
      lootsma                                                |           1      1     0.3s
      eigmina                                                |           1      1     0.4s
      lp_examodels_issue75                                   |           1      1     0.3s
    LapackGPU-BUNCHKAUFMAN                                   |           5      5     8.8s
      infeasible                                             |           1      1     4.0s
      unbounded                                              |           1      1     0.3s
      lootsma                                                |           1      1     0.3s
      eigmina                                                |           1      1     0.4s
      lp_examodels_issue75                                   |           1      1     0.3s
    LapackGPU-LU                                             |           5      5     5.0s
      infeasible                                             |           1      1     0.3s
      unbounded                                              |           1      1     0.3s
      lootsma                                                |           1      1     0.4s
      eigmina                                                |           1      1     0.4s
      lp_examodels_issue75                                   |           1      1     0.3s
    LapackGPU-QR                                             |           5      5     5.0s
      infeasible                                             |           1      1     0.3s
      unbounded                                              |           1      1     0.4s
      lootsma                                                |           1      1     0.3s
      eigmina                                                |           1      1     0.4s
      lp_examodels_issue75                                   |           1      1     0.3s
    LapackGPU-CHOLESKY                                       |           1      1     2.8s
      unbounded                                              |           1      1     2.8s
  MadNLPGPU (MadNLP.DenseKKTSystem)                          |   60            60  1m44.8s
  MadNLPGPU (MadNLP.DenseCondensedKKTSystem)                 |   60            60    33.8s
  MadNLP: MadNLP.BFGS + MadNLP.DenseKKTSystem                |   12            12     4.1s
  MadNLP: MadNLP.BFGS + MadNLP.DenseCondensedKKTSystem       |   12            12     3.8s
  MadNLP: MadNLP.DampedBFGS + MadNLP.DenseKKTSystem          |   12            12     3.5s
  MadNLP: MadNLP.DampedBFGS + MadNLP.DenseCondensedKKTSystem |   12            12     3.8s
ERROR: LoadError: Some tests did not pass: 170 passed, 0 failed, 51 errored, 0 broken.
in expression starting at /home/montalex/Argonne/MadNLP.jl/lib/MadNLPGPU/test/runtests.jl:6
ERROR: Package MadNLPGPU errored during testing

@frapac
Collaborator

frapac commented Jul 15, 2024

@amontoison Tests for MadNLPGPU are passing locally on my machine. I don't understand why moonshot cannot find CUDSS.jl 0.3.1 there.

@amontoison amontoison closed this Jul 15, 2024
@amontoison amontoison reopened this Jul 15, 2024
@amontoison
Contributor Author

@amontoison Tests for MadNLPGPU are passing locally on my machine. I don't understand why moonshot cannot find CUDSS.jl 0.3.1 there.

I restarted the tests.

@codecov-commenter

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 70.17%. Comparing base (2e301da) to head (65e334a).

Additional details and impacted files
@@           Coverage Diff           @@
##           master     #329   +/-   ##
=======================================
  Coverage   70.17%   70.17%           
=======================================
  Files          45       45           
  Lines        3943     3943           
=======================================
  Hits         2767     2767           
  Misses       1176     1176           


@frapac
Collaborator

frapac commented Jul 15, 2024

Thank you Alexis!

@frapac frapac merged commit dbfc0fc into MadNLP:master Jul 15, 2024
11 of 12 checks passed
@amontoison amontoison deleted the cudss_update branch July 15, 2024 19:30