
Add example script to export webknossos volume annotations #7

Draft · wants to merge 2 commits into base: main

Conversation

@kabilar (Member) commented Oct 9, 2024

Hi @jingjingwu1225, here is an example script to export Webknossos annotations, based on their documentation. I have tested it on the LINC JupyterHub at magnification level 8-8-1 and all lower resolutions. Please let me know if it crashes at higher resolution levels.

In order to run:

  1. Install the following Python packages: webknossos, tifffile, and numpy==1.26.1
  2. The WK_TOKEN can be found in the main menu under Auth Token (screenshot omitted).
  3. Use the command:
    WK_URL="https://webknossos.lincbrain.org" WK_TOKEN="<add_token>" python export_annotation.py

Please update the script to include the following changes:

  1. Export to a single Zarr instead of TIFF files
  2. Export all z slices to the same Zarr
  3. Export all segment_ids to the same Zarr
  4. Export all resolution levels to the same Zarr
  5. Review the annotations to ensure a 1-1 mapping between the annotations in the exported Zarrs and the annotations on webknossos.lincbrain.org.

cc @aaronkanzer @balbasty

@kabilar (Member Author) commented Oct 9, 2024

Upon a quick review of some of the generated TIFFs, the annotations are visible in the images.

@kabilar (Member Author) commented Oct 9, 2024

I added points 4 and 5 above.

@kabilar (Member Author) commented Oct 9, 2024

Converting this pull request to a draft since it shouldn't be merged as is, but it is here for reference.

@kabilar kabilar marked this pull request as draft October 9, 2024 21:36
@kabilar (Member Author) commented Oct 9, 2024

Perhaps using a smaller buffer_size for the get_buffered_slice_reader() method would reduce the risk of filling up the JupyterHub node memory at higher resolution levels.

Also, get_buffered_slice_writer() seems to write data to disk as soon as the buffer is full. Will have to explore more to see what formats can be written to disk.

@kabilar (Member Author) commented Oct 16, 2024

Hi @jingjingwu1225 @balbasty @aaronkanzer, I am still testing, but the following seems to work for me without putting stress on the Webknossos server or the JupyterHub instance. Please let me know what you think.

import zarr

annotation_zarr_link = 'https://webknossos.lincbrain.org/data/annotations/zarr/v2yszt4hvDxpIXKK/Volume/'
annotation_name = 'JW_MR243_20240927'
local_path = '/home/jovyan'

# Open the remote annotation group read-only, then copy it into a local directory store
source_group = zarr.convenience.open(store=annotation_zarr_link, mode='r')
dest_group = zarr.hierarchy.group(store=local_path)
zarr.convenience.copy(source=source_group, dest=dest_group, name=f'{annotation_name}.zarr')

@balbasty (Collaborator) commented:

That's very neat! Do you know if it's possible to specify the chunking options for the output array in the copy operation?

@kabilar (Member Author) commented Oct 16, 2024

It does look like we can pass any keyword arguments to zarr.convenience.copy that would then get passed to create_dataset when copying the array. But we may only be able to use a single chunk size for all resolution levels. I am still exploring but would we want different chunk sizes for the different resolution levels?

@balbasty (Collaborator) commented:

We've used the same chunk size across levels so far, so I don't see this as too much of a problem.

@kabilar (Member Author) commented Oct 16, 2024

Great. I am now testing with chunks=(1, 128, 128, 1). Will let you know how it goes.

If we do need different chunk sizes, we could loop through the resolution levels (see code snippet below) and write them individually to the destination group.

Input

for array_name, array in source_group.arrays():
    print(array_name, array)

Output

1 <zarr.core.Array '/1' (1, 126976, 99630, 73) uint32>
128-128-1 <zarr.core.Array '/128-128-1' (1, 992, 778, 73) uint32>
16-16-1 <zarr.core.Array '/16-16-1' (1, 7936, 6226, 73) uint32>
2-2-1 <zarr.core.Array '/2-2-1' (1, 63488, 49815, 73) uint32>
256-256-1 <zarr.core.Array '/256-256-1' (1, 496, 389, 73) uint32>
32-32-1 <zarr.core.Array '/32-32-1' (1, 3968, 3113, 73) uint32>
4-4-1 <zarr.core.Array '/4-4-1' (1, 31744, 24907, 73) uint32>
64-64-1 <zarr.core.Array '/64-64-1' (1, 1984, 1556, 73) uint32>
8-8-1 <zarr.core.Array '/8-8-1' (1, 15872, 12453, 73) uint32>

@kabilar (Member Author) commented Oct 17, 2024

Hi team, it looks like we also need to pass the arguments fill_value=0 and write_empty_chunks=False to ensure that only chunks with non-zero values are stored. See Zarr API docs.

Webknossos is offline right now so will finish up testing tomorrow.

@kabilar (Member Author) commented Oct 17, 2024

Also, I switched to a chunk size of (1, 4096, 4096, 1) since that was previously decided upon.
