High Content Screening #630

Draft
wants to merge 12 commits into
base: main

s-n-i
Contributor

@s-n-i s-n-i commented Aug 19, 2022

Background

I am opening a draft PR to see if there is interest in having viv support loading multiple datasets for high content screening. This is accomplished by passing in an array of loaders and information on how to position the images next to each other into the PictureInPictureViewer. This change is backwards-compatible, so Avivator works as before. I marked the PR as draft because it is not yet ready to be merged, and I am looking for feedback. It would be great to have this functionality merged into viv, rather than maintaining a fork of viv.

image

Change List

  • Support for multiple loaders and position data

Checklist

  • [✓] Update JSdoc types if there is any API change.
  • [✓] Make sure Avivator works as expected with your change.

s-n-i changed the title from "Loaders" to "High Content Screening" on Aug 22, 2022
@ilan-gold
Collaborator

ilan-gold commented Aug 23, 2022

@s-n-i As a start, I would want a layer that does this, not a viewer. @manzt has expressed opposition to having this in the core of viv, and I completely understand his point, so I too would lean towards "no" (see #287). EDIT: maybe he hasn't? I seem to remember agreeing that the core of Viv was not meant for this but maybe not. Laying out MultiscaleImageLayers how one would like probably should not constitute a core contribution.

That being said, how long does the above screenshot take to load? If you had some contribution that allowed super fast loading or interaction (both of which have been challenges in the past from what I remember), I think then we might be more interested in that since others (i.e vizarr) would benefit from this too and so having this committed would be broadly helpful.

@s-n-i
Contributor Author

s-n-i commented Aug 24, 2022

@ilan-gold I tested how long loading 16 plates takes. The application starts with only one plate in the viewport. It takes about 3 seconds to load it. It takes 2 seconds to zoom out and load the remaining 15 plates. Only plates that are within the viewport get loaded. I am not serving the datasets locally, so network speed has an effect.

With this implementation, when loading 16 plates, I am not noticing any rendering performance problems. When displaying around 100 plates, the performance is noticeably slower, but still usable.

Our approach to fast loading and interaction with hundreds of plates involves generating small top levels for these pyramids, with the highest level being 1 pixel in size, discussed in #620.
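
To make the pyramid idea concrete, here is a minimal sketch that lists the level sizes of a pyramid that is repeatedly halved until it reaches a single pixel. The function name `pyramidLevelSizes` and the factor-of-two downsampling with rounding up are assumptions for illustration; the pyramids discussed in #620 may be generated differently.

```javascript
// Hypothetical helper: list level dimensions, halving (rounding up)
// each side until the top level is 1x1.
function pyramidLevelSizes(width, height) {
  const levels = [{ width, height }];
  let w = width;
  let h = height;
  while (w > 1 || h > 1) {
    w = Math.max(1, Math.ceil(w / 2));
    h = Math.max(1, Math.ceil(h / 2));
    levels.push({ width: w, height: h });
  }
  return levels;
}

// Example with a made-up 4096 x 2048 base image.
const levels = pyramidLevelSizes(4096, 2048);
console.log(levels.length, levels[levels.length - 1]);
```

With small top levels like these, a zoomed-out view of many plates only needs to fetch a handful of pixels per plate.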

@ilan-gold
Collaborator

ilan-gold commented Aug 30, 2022

Relatedly @s-n-i - why are you maintaining a fork? Just curious, are you using Avivator as your viewer? @manzt maybe it would be worth releasing Avivator as a package too i.e a giant React component? And allow people to pass in a high level prop for layers or something? Maybe we need more use-cases for this...

@manzt
Member

manzt commented Aug 30, 2022

maybe it would be worth releasing Avivator as a package too i.e a giant React component?

I think this is beyond the scope of Viv. The React components are already intended to be extensible, and this would open up lots of additional development burden with having to maintain and document an additional API.

@ilan-gold
Collaborator

ilan-gold commented Aug 31, 2022

So @s-n-i I am not sure how @manzt feels, but a very generic GridLayer could be a nice addition. I think generally, the goal would be to pare this down to the minimum of what is needed to get your feature working. So there's a few things:

  1. How exactly are you providing the loader URL? Is it the OME-Zarr HCS spec?
  2. How are you using/deploying your version of Avivator?
  3. Is it possible to do what you are asking by just creating a layer that orchestrates all of this? One issue is that standardizing the relationship between loaders, layers, and layout may be too complex for Viv. That being said, we do have this wonderful monorepo now, so maybe we could make an experimental-layers package @manzt? Something where community members could iterate on things until they become stable, so perhaps moving vizarr's implementation of a GridLayer there too?
  4. Relatedly, @s-n-i, why not use vizarr for showing HCS?

Thanks @s-n-i !

@s-n-i
Contributor Author

s-n-i commented Sep 6, 2022

I am maintaining a viv fork because it appeared to be the most efficient way to implement the functionality to load multiple plates. Please let me know if an alternative approach could be better.

I am using a modified version of Avivator as the viewer, so I would have a simpler code base if Avivator were to be added to the viv library on npm. Actually, it does get packaged into the viv library locally if I run pnpm run build && pnpm pack.

  1. I have an array of URLs and I call Avivator's createLoader function for each of them. This creates an array of loaders, which I pass into the modified PictureInPictureViewer.
  2. I have a custom modification of Avivator that consumes our fork of the viv library.
  3. Yes, creating a new layer would work as well. This might even improve performance, if the layer has logic to only load datasets that are within the viewport.
  4. I am not very familiar with vizarr. My understanding is that it runs inside of a Jupyter Notebook and we would like to have a web application.
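
The viewport check mentioned in point 3 could be sketched roughly as follows. This is only an illustration: the helper names `intersects` and `visibleDatasetIndices`, and the `{ left, top, right, bottom }` rectangle shape, are assumptions and not part of viv or the fork.

```javascript
// Hypothetical helper: axis-aligned overlap test between two rectangles,
// each given as { left, top, right, bottom } in the same coordinate space.
function intersects(a, b) {
  return a.left < b.right && a.right > b.left && a.top < b.bottom && a.bottom > b.top;
}

// Return the indices of the datasets whose bounds overlap the viewport,
// i.e. the only ones a layer would need to request data for.
function visibleDatasetIndices(viewport, datasetBounds) {
  return datasetBounds
    .map((bounds, index) => (intersects(viewport, bounds) ? index : -1))
    .filter((index) => index !== -1);
}

// Three made-up datasets; the viewport overlaps the first two only.
const datasetBounds = [
  { left: 0, top: 0, right: 100, bottom: 100 },
  { left: 150, top: 0, right: 250, bottom: 100 },
  { left: 0, top: 150, right: 100, bottom: 250 },
];
const viewport = { left: 50, top: 50, right: 200, bottom: 120 };
console.log(visibleDatasetIndices(viewport, datasetBounds));
```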

@ilan-gold
Collaborator

@s-n-i Let's see what @manzt has to say about a layer. I'm not opposed since it seems like a common use-case, even outside of HCS. There are also loads of nice performance improvements, I imagine, to be made that everyone would benefit from.

That being said, here is vizarr in a web app. This previous example is just one image but here are some HCS examples

If vizarr does not work for you, I'm not sure what we can do. I don't think we're in a position right now to maintain an Avivator API one could customize (nor am I sure this is really something we want to do given the complexity of it being a full page web application).

What parts of Avivator have you changed? If it is just a few lines, perhaps exporting just the controller would be a nice middle ground (if that is not changing for you), although this too is a bit fraught because of its shared state with the actual viewer.

@s-n-i
Contributor Author

s-n-i commented Sep 6, 2022

Thank you for sharing the vizarr web app, I can see it potentially being suitable for our goals. We would just need to evaluate it in more detail.

The motivation behind our approach is allowing the user to perform high-content screening by visualizing a grid of multiple datasets, which are not in the HCS format. This simplifies the data pipeline because data scientists would not need to convert existing datasets into HCS format.

In Avivator I have changed Controller.jsx, hooks.js, and Viewer.jsx. My approach does not require any changes to Avivator. Exporting it "as is" from the viv library would be sufficient.

Also, I am not 100% sure about this, but it looks like creating a custom layer would also require modifying the PictureInPictureViewer, as well as other viewers, so that they can load this new layer.

@ilan-gold
Collaborator

ilan-gold commented Sep 10, 2022

@s-n-i If you have altered those files, what would an Avivator component API look like should we release it? I guess one route I could see here that might make everyone happy:

  1. Implement a general purpose grid layer and update upstream APIs to allow for it
  2. Allow users to pass in comma-separated lists of URLS to Avivator
  3. Optionally: If the images have different channel lists, make the Avivator controller flexible enough to handle this.

The main thing I am worried about is how the controller would work - if all these images have different channels for example, you would need different controllers for each? How does this work? All that being said, I would feel comfortable with the above three changes. It will be a bit of work and we would want to get the API just right here, but this seems feasible.

@s-n-i
Contributor Author

s-n-i commented Sep 10, 2022

Here is how I have it currently implemented.

<PictureInPictureViewer
  gridLoaders={{
    loaders,
    spacingX: 12345,
    spacingY: 12345,
    numberOfColumns: 12
  }}

The loaders are set like this:

Promise.all(urls.map((url) => createLoader(url)))
  .then((values) => setLoaders(values.map((value) => value.data)));
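
As a rough illustration of how the `spacingX`, `spacingY`, and `numberOfColumns` values could translate into positions, the sketch below maps a loader's index to an offset. The helper `gridOffset` is hypothetical, and it assumes row-major ordering with spacing measured between the origins of neighbouring images; the fork may lay things out differently.

```javascript
// Hypothetical helper: map a loader's index to an (x, y) offset, assuming
// row-major order and spacing measured between neighbouring image origins.
function gridOffset(index, { spacingX, spacingY, numberOfColumns }) {
  const column = index % numberOfColumns;
  const row = Math.floor(index / numberOfColumns);
  return { x: column * spacingX, y: row * spacingY };
}

// Same values as in the gridLoaders prop; 16 loaders as in the earlier test.
const grid = { spacingX: 12345, spacingY: 12345, numberOfColumns: 12 };
const offsets = Array.from({ length: 16 }, (_, i) => gridOffset(i, grid));
console.log(offsets[13]); // second row, second column
```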

For the channels we have a few options:

  1. Only allow datasets with the same channels and display an error when loading datasets with different channels.
  2. Show a slider for each unique channel. Moving this slider affects all datasets that have this channel.
  3. Show sliders for the currently selected dataset.
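
Options 1 and 2 could be sketched along these lines. Both helper names and the channel names in the example are made up for illustration; channel lists are assumed to be plain arrays of strings, one array per dataset.

```javascript
// Hypothetical helper for option 1: true when every dataset has the same
// channel names in the same order.
function haveSameChannels(channelLists) {
  const [first, ...rest] = channelLists;
  return rest.every(
    (channels) =>
      channels.length === first.length &&
      channels.every((name, i) => name === first[i])
  );
}

// Hypothetical helper for option 2: map each unique channel name to the
// indices of the datasets that contain it, so one slider can drive them all.
function channelsToDatasets(channelLists) {
  const byChannel = new Map();
  channelLists.forEach((channels, datasetIndex) => {
    for (const name of channels) {
      if (!byChannel.has(name)) byChannel.set(name, []);
      byChannel.get(name).push(datasetIndex);
    }
  });
  return byChannel;
}

// Made-up channel lists for three datasets.
const channelLists = [['DAPI', 'GFP'], ['DAPI', 'RFP'], ['DAPI', 'GFP']];
console.log(haveSameChannels(channelLists));
console.log(channelsToDatasets(channelLists));
```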

@manzt
Member

manzt commented Sep 12, 2022

Catching up on this thread.

I think I would be supportive of iterating on a generalized gridlayer. We could use the vizarr implementation as a reference point as well as what you have been working on @s-n-i.

I am not very familiar with vizarr. My understanding is that it runs inside of a Jupyter Notebook and we would like to have a web application.

To clarify, vizarr is a general web-based image viewer for zarr-based images. It is intended to be used as a standalone web-app like Avivator (see embedded use in the OME Blog), and additionally has optional features for running within Jupyter Notebooks (and loading multiple image layers). Vizarr's key feature is support for OME-NGFF metadata. For example, rather than defining non-standard patterns for loading multiple images (i.e., comma separated URLs), plate layouts are expressed within the metadata. This allows Vizarr to be compatible with other NGFF-compatible viewers.

Allow users to pass in comma-separated lists of URLS to Avivator

hmm, with the multi-tiff loader comma-separated lists of URLS already have "special" meaning in Avivator. Why not have a new route for Avivator (and actually make use of the BrowserRouter) (i.e., https://avivator.gehlenborglab.org/grid?image_urls=)
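
A sketch of how such a route's query string might be read, using the standard `URL` and `URLSearchParams` APIs. The `/grid` route does not exist yet, and the convention of repeating the `image_urls` key once per image is an assumption made here so that commas keep their existing meaning; the helper name `parseGridImageUrls` and the example.com URLs are likewise hypothetical.

```javascript
// Hypothetical helper: read image URLs from a grid route's query string.
// Assumes one `image_urls` entry per image (repeated key).
function parseGridImageUrls(href) {
  const { pathname, searchParams } = new URL(href);
  if (pathname !== '/grid') return [];
  return searchParams.getAll('image_urls').filter((value) => value.length > 0);
}

const href =
  'https://avivator.gehlenborglab.org/grid' +
  '?image_urls=https%3A%2F%2Fexample.com%2Fa.ome.tif' +
  '&image_urls=https%3A%2F%2Fexample.com%2Fb.ome.tif';
console.log(parseGridImageUrls(href));
```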

@s-n-i
Contributor Author

s-n-i commented Sep 12, 2022

I looked up the length limits on URLs in different browsers:

https://www.geeksforgeeks.org/maximum-length-of-a-url-in-different-browsers/

Looks like Microsoft Edge might not be able to fit URLs for hundreds of different datasets in the address bar.
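
To get a feel for the sizes involved, the sketch below builds a query string for a few hundred made-up dataset URLs and measures the result, which can then be compared against whatever limit applies to the target browser. The helper `gridUrlLength`, the repeated `image_urls` key, and the example.com URLs are assumptions for illustration.

```javascript
// Hypothetical helper: build a grid URL with one `image_urls` entry per image
// and report its total length in characters (after percent-encoding).
function gridUrlLength(baseUrl, imageUrls) {
  const url = new URL(baseUrl);
  for (const imageUrl of imageUrls) {
    url.searchParams.append('image_urls', imageUrl);
  }
  return url.href.length;
}

// 300 made-up dataset URLs.
const imageUrls = Array.from(
  { length: 300 },
  (_, i) => `https://example.com/data/plate-${i}.ome.tif`
);
console.log(gridUrlLength('https://avivator.gehlenborglab.org/grid', imageUrls));
```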

@ilan-gold
Collaborator

ilan-gold commented Sep 13, 2022

hmm, with the multi-tiff loader comma-separated lists of URLS already have "special" meaning in Avivator. Why not have a new route for Avivator (and actually make use of the BrowserRouter) (i.e., https://avivator.gehlenborglab.org/grid?image_urls=)

This is what I meant, not a(nother) file (format).

As for the "hundreds of data sets" issue, I think we'll need to think about this a bit...one option if you all have the capacity would be a URL shortener.

@ilan-gold
Collaborator

I also think I may have misunderstood - you are considering non-HCS datasets that you wish to compare that were acquired separately and have some value when looked at side-by-side? Or you have non-HCS format HCS-acquired datasets?

@manzt
Member

manzt commented Sep 13, 2022

As for the "hundreds of data sets" issue, I think we'll need to think about this a bit...one option if you all have the capacity would be a URL shortener.

At some point, pointing to many image URLs in the query parameters is just a poor choice. URL shortening means that the URLs are no longer human readable, and there are likely better alternatives for expressing this information in a structured manner. Hundreds of URLs is unwieldy and the "comma separated list" is essentially a new format of its own. Also there is so much implicit information with a comma separated list of URLs.

How many rows? How many columns? Are the images all the same size? In this case some type of manifest JSON file is probably most appropriate, containing all this metadata as well as links to the individual images, but this is essentially the OME-NGFF plate specification and I'd rather not create any sort of manifest that is Avivator-specific.
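
For readers unfamiliar with it, the sketch below generates plate metadata in the style of the OME-NGFF 0.4 "plate" object, where rows, columns, and well paths are stated explicitly rather than implied. The field names are reproduced from memory of that version of the spec and should be checked against it; the helper `plateMetadata`, the plate name, and the A..Z row naming are assumptions for illustration.

```javascript
// Sketch of plate metadata in the style of the OME-NGFF 0.4 "plate" object.
// Assumes at most 26 rows, named A..Z; columns are named 1..n.
function plateMetadata(name, rowCount, columnCount) {
  const rows = Array.from({ length: rowCount }, (_, i) => ({
    name: String.fromCharCode(65 + i),
  }));
  const columns = Array.from({ length: columnCount }, (_, i) => ({
    name: String(i + 1),
  }));
  // One well per (row, column) pair, listed in row-major order.
  const wells = [];
  rows.forEach((row, rowIndex) => {
    columns.forEach((column, columnIndex) => {
      wells.push({ path: `${row.name}/${column.name}`, rowIndex, columnIndex });
    });
  });
  return { plate: { name, version: '0.4', rows, columns, wells } };
}

const metadata = plateMetadata('example-plate', 2, 3);
console.log(JSON.stringify(metadata.plate.wells[4]));
```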

Something like kerchunk could be used to create a "virtual" OME-NGFF plate from many TIFFs. Vizarr supports these kerchunk-reference-based stores. Here is an example of reading an OME-TIFF as Zarr with a chunk reference: https://observablehq.com/@manzt/ome-tiff-as-filesystemreference

@s-n-i
Contributor Author

s-n-i commented Sep 13, 2022

you are considering non-HCS datasets that you wish to compare that were acquired separately and have some value when looked at side-by-side?

Yes.

I don't fully understand the distinction between datasets "acquired separately" and "HCS-acquired".

@s-n-i
Contributor Author

s-n-i commented Sep 14, 2022

More on this from a colleague:

I think what we would like to accomplish is to either:

  1. Never use the ngff hcs spec
  2. Use the ngff hcs spec, but allow the visualization to be configurable

I think us looking at vizarr would be a good idea, because they are likely doing something similar to what we want to accomplish
In the ngff hcs spec, there should be many pyramids built into the same zarr file, meaning many pyramids are being visualized side by side. This is effectively what we want to accomplish

@ilan-gold
Collaborator

So it sounds like @s-n-i you would be content with vizarr? If so, I think we can close this unless you would like to spearhead development of a new GridLayer? My only point about "acquired separately" and "HCS-acquired" was that I thought you were talking about different datasets that were acquired completely apart, not together as part of a joint experiment of some sort for comparison. So I was thinking that if they were separate, there would be fewer - for example, just looking at 4 or 8 or 16 different tumor slides or something.

@s-n-i
Contributor Author

s-n-i commented Sep 19, 2022

I will check with the team on whether to switch to vizarr or to spearhead the development of a new GridLayer.

For now, our viv fork works for loading over 100 datasets. The performance does drop relative to rendering a single dataset.

@manzt manzt force-pushed the main branch 6 times, most recently from 6477c65 to 6c43fa4 Compare November 17, 2023 00:20

3 participants