Update set method for sparse to only add non-zero elements #145

Lundez · 2021-07-27T15:18:19Z

I think it makes sense to disregard irrelevant values for the sparse matrices. I first noticed this issue when using another library (kmath) which has a buildMatrix method they yet have to specialize for sparse structures.
This buildMatrix would be much better off if the set method never applied "0" elements. They risk of doubling the array-size way larger than required would still exist, but at least we wouldn't make sure that it happens 😉

The best approach forward is for kmath to better support sparse structures, but this step in the right direction would be helpful for not only kmath-user but also ejml ones 😄

Before submitting this PR, please make sure:

[-] Your code is covered by tests (something about memory breaking, not sure what...)
You ran ./gradlew spotlessJavaApply

lessthanoptimal · 2021-07-28T13:03:33Z

@Lundez thanks for the PR! Going to need to put some thought into this and make sure there are no unintended consequences. This might have also come up once before but I can't seem to find the PR or discussion. For example a potential negative is that, maybe someone wants to pre-initialize the sparse structure and now they need to jump through hoops and this is an unexpected behavior. In my personal real-world applications this isn't useful since I'm not assigning values unless it's non-zero (e.g Jacobian of a large sparse system). As an example of untended consequences, libraries which mark matrices as transposed instead of transposing them to save time/memory basically just end up with much more complex slower code and dubious memory saving.

Are there any more established sparse linear algebra libraries which have this as their default behavior? E.g. SuiteSparse

@FlorentinD @szarnyasg Any thoughts on this as well?

szarnyasg · 2021-07-28T13:06:58Z

@lessthanoptimal I think this is related to an earlier discussion in the SuiteSparse:GraphBLAS repository: GraphBLAS/LAGraph#28

It's quite a lengthy one, so the issue of explicit zeros is not a trivial one. SuiteSparse:GraphBLAS can yield explicit zeros in some cases and there are also ways to drop them (e.g. with GxB_select).

lessthanoptimal

I don't think this is going to work as intended. Once a non-zero element is already in the graph structure you want to assign it the new value no matter what. If the element is not in the graph structure and the value being assigned is zero you can safely not add it.

lessthanoptimal · 2021-07-28T13:09:16Z

main/ejml-core/src/org/ejml/data/DMatrixSparseTriplet.java

@@ -169,7 +171,7 @@ public void unsafe_set( int row, int col, double value ) {
 int index = nz_index(row, col);
 if (index < 0)
 addItem(row, col, value);
- else {
+ else if (value != 0.0) {


Already in the graph structure so you want to update the value.

lessthanoptimal · 2021-07-28T13:09:42Z

main/ejml-core/src/org/ejml/data/DMatrixSparseTriplet.java

@@ -169,7 +171,7 @@ public void unsafe_set( int row, int col, double value ) {
 int index = nz_index(row, col);
 if (index < 0)
 addItem(row, col, value);


not in the graph structure so you could check to see if it's zero here and not add it.

lessthanoptimal · 2021-07-28T13:10:00Z

main/ejml-core/src/org/ejml/data/DMatrixSparseCSC.java

@@ -231,7 +231,9 @@ public void set( int row, int col, double val ) {
 if (row < 0 || row >= numRows || col < 0 || col >= numCols)
 throw new IllegalArgumentException("Outside of matrix bounds");

- unsafe_set(row, col, val);


see comments for triplet below.

lessthanoptimal · 2021-07-28T13:23:13Z

@szarnyasg Thanks for linking to that discussion!

Thanks to @FlorentinD you can remove zero elements too using CommonOps_DSCC.select()

FlorentinD · 2021-07-29T07:37:48Z

Greetings to Lund @Lundez

Personally, for a graph context, I favor having the possibility of 0 weights.

Maybe we could provide another method that does not set zero elements, instead of changing existing behavior?

Lundez · 2021-07-29T08:01:03Z

Greetings to Lund @Lundez

Personally, for a graph context, I favor having the possibility of 0 weights.

Maybe we could provide another method that does not set zero elements, instead of changing existing behavior?

Could make sense, or doing it the other way around.
@lessthanoptimal any preference? If you still think this idea makes sense that is

Thanks for everyones input, as soon as I know which direction I should take I'll fix the comments 👍

ennerf · 2021-07-29T11:16:57Z

I'll chime in here as well. I've tried this optimization a few times, but for my use cases the extra branching always ended up costing more than I got from any subsequent benefits. There may be cases where it's worth it, but I'd vote against making it the default behavior.

lessthanoptimal · 2021-07-29T17:02:47Z

@ennerf thanks for that insight! What matrix size are you dealing with again? Curious if this could be replicated in a benchmark.

lessthanoptimal · 2021-07-29T17:10:10Z

@Lundez Right now I'm leaning towards having a distinctive setter that filters out zeros. That will also have a lower bar for being added as it can't potentially break a ton of code. Also forgot to mention in the PR, unit tests are missing.

Unrelated to this PR are you a contributor to kmath? I might have some performance suggestions that could be discussed outside.

lessthanoptimal · 2021-07-29T17:27:33Z

For what it's worth, it looks like Matlab and Octave filter out and remove 0 when assigning.

a=sparse(10,5)
a(3,4) = 10
a(4,5) = 0
a(3,4) = 0

Octave Outputs:

Compressed Column Sparse (rows = 10, cols = 5, nnz = 0 [0%])

Compressed Column Sparse (rows = 10, cols = 5, nnz = 1 [2%])
(3, 4) ->  10

Compressed Column Sparse (rows = 10, cols = 8, nnz = 1 [1.2%])
(3, 4) ->  10

Compressed Column Sparse (rows = 10, cols = 8, nnz = 0 [0%])

Lundez · 2021-07-29T17:52:25Z

@lessthanoptimal makes sense 👍

I'm not a contributor, but I guess I could get a PR up.
Currently I've added my own methods on top of their lib to optimize the usage of sparse matrices (such as defining size & only non-zero elements).
The thing I like about kmath is that it'll simplify a transition from pure JVM libraries to native if I need to make that change later for my NLP-library which will include Deep Learning at some time where I need the GPU optimizations (like they got support for ND4J & PyTorch Tensors already).

ennerf · 2021-07-29T22:53:49Z

@lessthanoptimal I think the largest CSC I used to benchmark in the polynomial spline trajectory project was 9k squared with 15k entries (1k segments), but I assume that the ratio of zeros probably matters more than the raw size. As far as I remember there were very few actual zeros, but unfortunately I'm not using CSC in that project anymore and can't easily reproduce the benchmark data.

For a filtering setter it may be useful to have a minimum threshold for values that can be considered effectively zero, e.g., if( abs(value) < threshold) ignore.

FlorentinD · 2021-08-02T20:45:56Z

For more intuitive filtering based on the threshold idea by @ennerf , we could provide a 'filter(matrix, threshold)' method which delegates to select with a partly fixed lambda ('x-> |x| > threshold')

Lundez added 2 commits July 27, 2021 17:04

Update set method for sparse to only add non-zero elements

647119b

Fix issue

6808d3a

Lundez marked this pull request as ready for review July 27, 2021 15:25

Lundez requested a review from lessthanoptimal as a code owner July 27, 2021 15:25

lessthanoptimal requested changes Jul 28, 2021

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Update set method for sparse to only add non-zero elements #145

Update set method for sparse to only add non-zero elements #145

Lundez commented Jul 27, 2021 •

edited

lessthanoptimal commented Jul 28, 2021

szarnyasg commented Jul 28, 2021

lessthanoptimal left a comment

lessthanoptimal Jul 28, 2021

lessthanoptimal Jul 28, 2021

lessthanoptimal Jul 28, 2021

lessthanoptimal commented Jul 28, 2021

FlorentinD commented Jul 29, 2021

Lundez commented Jul 29, 2021

ennerf commented Jul 29, 2021

lessthanoptimal commented Jul 29, 2021

lessthanoptimal commented Jul 29, 2021

lessthanoptimal commented Jul 29, 2021

Lundez commented Jul 29, 2021 •

edited

ennerf commented Jul 29, 2021 •

edited

FlorentinD commented Aug 2, 2021

Update set method for sparse to only add non-zero elements #145

Are you sure you want to change the base?

Update set method for sparse to only add non-zero elements #145

Conversation

Lundez commented Jul 27, 2021 • edited

lessthanoptimal commented Jul 28, 2021

szarnyasg commented Jul 28, 2021

lessthanoptimal left a comment

Choose a reason for hiding this comment

lessthanoptimal Jul 28, 2021

Choose a reason for hiding this comment

lessthanoptimal Jul 28, 2021

Choose a reason for hiding this comment

lessthanoptimal Jul 28, 2021

Choose a reason for hiding this comment

lessthanoptimal commented Jul 28, 2021

FlorentinD commented Jul 29, 2021

Lundez commented Jul 29, 2021

ennerf commented Jul 29, 2021

lessthanoptimal commented Jul 29, 2021

lessthanoptimal commented Jul 29, 2021

lessthanoptimal commented Jul 29, 2021

Lundez commented Jul 29, 2021 • edited

ennerf commented Jul 29, 2021 • edited

FlorentinD commented Aug 2, 2021

Lundez commented Jul 27, 2021 •

edited

Lundez commented Jul 29, 2021 •

edited

ennerf commented Jul 29, 2021 •

edited