diff --git a/examples/pannuke_cellpose_benchmark.ipynb b/examples/pannuke_cellpose_benchmark.ipynb
index f993c45..6ee93e9 100644
--- a/examples/pannuke_cellpose_benchmark.ipynb
+++ b/examples/pannuke_cellpose_benchmark.ipynb
@@ -7,7 +7,7 @@
    "source": [
     "# In this notebook, we will demo how to benchmark a trained model\n",
     "\n",
-    "We will be using [fvcore](https://github.com/facebookresearch/fvcore) package for computing the FLOPS and [pandas](https://pandas.pydata.org/docs/) for to display the results nicely."
+    "We will be using the [fvcore](https://github.com/facebookresearch/fvcore) package for computing the FLOPS and [pandas](https://pandas.pydata.org/docs/) to display the results nicely."
    ]
   },
   {
@@ -391,7 +391,7 @@
     }
    ],
    "source": [
-    "# This is the mPQ over all pre-class PQs.\n",
+    "# This is the mPQ over all per-class PQs.\n",
     "res_df[res_df[[\"pq\", \"dq\", \"sq\"]] >= 0].mean(axis=0)"
    ]
   },
   {
@@ -434,7 +434,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "We can also look at the mean PQ per cell-type, to get a clearer picture of the hard to detect classes. Turns out that the dead cell class is very hard to detect. Reason for this is that there are very few examples in the training set.. Anyway, the results are far from being SOTA, so there is still some proper training to do. But on the bright side, the model learned to a descent mapping with only a handful of training epochs."
+    "We can also look at the mean PQ per cell-type to get a clearer picture of the hard-to-detect classes. It turns out that the dead cell class is very hard to detect. The reason for this is that there are very few examples of dead cells in the training set. All in all, the results are far from SOTA, so there is still a lot of proper training to do to reach matching performance. On the bright side, the model learned a decent mapping with only a handful of training epochs."
    ]
   },
   {
@@ -535,7 +535,7 @@
    "source": [
     "## Latency and throughput Benchmarks\n",
     "\n",
-    "Next we will benchmark the model latency. These models typically have an NN backbone which outputs need to be post-processed. We will benchmark the latencies and throughputs for the post-processing and model forward pass separately. The benchmarking was done on a laptop with:\n",
+    "Next, we will benchmark the model latency. These models typically have an NN backbone that spits out intermediate outputs that need to be post-processed. Thus, we will benchmark the latencies and throughputs for the post-processing and the NN backbone forward pass separately. This benchmarking was done on a laptop with:\n",
     "\n",
     "- a worn-out NVIDIA GeForce RTX 2080 Mobile (8Gb VRAM)\n",
     "- Intel i7-9750H 6 x 6 cores @ 2.60GHz (32 GiB RAM)\n",
@@ -563,7 +563,7 @@
    "source": [
     "First, let's compute the model forward pass mean latency over 100 samples (in seconds).\n",
     "\n",
-    "We also report the input size and batch size since those have an effect to the result."
+    "We also report the input size and batch size since those have an effect on the result."
    ]
   },
   {
@@ -637,9 +637,9 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "Then, let's get the post-processing latencies. This cellpose model post-processing pipeline contains instance segmentation post-processing and cell type segmentation post-processing.\n",
+    "Then, let's get the post-processing latencies. This cellpose model's post-processing pipeline contains instance segmentation post-processing and cell type segmentation post-processing.\n",
     "\n",
-    "Note that, the post-processing latencies depend on the number of cells in each image, thus we are going to compute the latencies over the whole pannuke fold3. We are going to run the post-processing 10 times per image and report the mean for each image.\n",
+    "Note that the post-processing latencies depend on the number of cells in each image; thus, we are going to compute the latencies over the whole pannuke fold3. We are going to run the post-processing 10 times per image and report the mean for each image.\n",
     "\n",
     "**NOTE**: This takes a while to finsh."
    ]
   },
   {
@@ -672,9 +672,9 @@
    "source": [
     "### Sneak peak to latency benchmark tables\n",
     "\n",
-    "The Pannuke images are so small (256x256px) that the latencies very quick.. For larger images, the post-procesing methods can become a bottleneck. \n",
+    "The Pannuke images are so small (256x256px) that the latencies are small as well. However, for larger images, the post-processing methods can become a bottleneck. \n",
     "\n",
-    "Note that the number of cells and number of pixels in these cells are reported as well since they have an effect on the latency."
+    "Note that the number of cells and the number of non-zero pixels in these cells are reported as well since they have an effect on the latency."
    ]
   },
   {
@@ -1124,7 +1124,7 @@
     "\n",
     "In practice, the model (NN-backbone + post-processing + intermediate data-wrangilng) throughput is a lot more valuable metric than the latencies since it directly tells how scalable the model is. The throughput is important information especially when segmenting a large number of huge gigapixel images like WSIs.\n",
     "\n",
-    "Next up, we will compute the whole model inference + post-processing throughput (img/s)."
+    "Next, we will compute the whole model inference + post-processing throughput (img/s)."
    ]
   },
   {
@@ -1206,7 +1206,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "It took 2 min 45 sec to run the inference pipeline on this laptop on the Pannuke fold3 containing 2656 (256x256) images. This averages to ~0.06 seconds per image."
+    "It took 2 min 45 sec to run the inference pipeline on this laptop on the Pannuke fold3 containing 2656 (256x256) images. This averages to ~0.06 seconds per image. Note that the std is zero since we ran the whole pipeline only once."
    ]
   },
   {
@@ -1216,7 +1216,7 @@
    "source": [
     "### Compute the model parameters and FLOPS\n",
     "\n",
-    "Finally we will compute the NN backbone FLOPS with the [fvcore](https://github.com/facebookresearch/fvcore) library. FLOPS are typically reported in research papers, thus, it's good to report FLOPS here as well."
+    "Finally, we will compute the NN backbone FLOPS and the number of learnable params with the [fvcore](https://github.com/facebookresearch/fvcore) library. FLOPS are typically reported in research papers, so why not here as well."
    ]
   },
   {
@@ -1303,10 +1303,15 @@
     "- The model has 15.9M parameters (Very small)\n",
     "- One forward pass takes 153 GFlops (not too much)\n",
     "\n",
-    "Overall, the model is very lightweight. Adding some width and depth to the layers would probably do some magic (in terms of the segmentation results) without bloating the model too much.\n",
+    "Overall, this cellpose model is very lightweight. Adding some width and depth to the layers would probably do some magic (in terms of the segmentation results) without bloating the model too much. In conclusion, there is a lot of room for improvement on both the architecture and the training side of the model. However, lightweight models can be more suitable for segmenting a large number of huge images when time and compute resources are limited. \n",
     "\n",
     "This concludes this notebook, thanks!"
    ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": []
   }
  ],
  "metadata": {
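A note on the mPQ cell touched in the second hunk: `res_df[res_df[["pq", "dq", "sq"]] >= 0].mean(axis=0)` works because indexing a DataFrame with a boolean frame replaces the `False` positions with NaN, which `mean` then skips. A minimal pandas sketch of that behavior; the column names follow the notebook, but `res_df` here is made-up illustrative data, not the notebook's results:

```python
import pandas as pd

# Made-up per-image metrics; -1.0 marks a class absent from an image,
# mirroring the sentinel convention the notebook's res_df appears to use.
res_df = pd.DataFrame({
    "pq": [0.60, -1.0, 0.40],
    "dq": [0.70, 0.50, -1.0],
    "sq": [0.80, 0.60, 0.90],
})

# Boolean masking turns the sentinel entries into NaN, and mean() skips
# NaN by default, so absent classes do not drag the average down.
mpq = res_df[res_df[["pq", "dq", "sq"]] >= 0].mean(axis=0)
print(mpq["pq"])  # 0.5: the -1.0 entry is excluded, not averaged in as -1
```

Without the mask, the -1 sentinels would be averaged in as real scores and systematically underestimate the per-class PQs.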
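The latency and throughput hunks describe the measurement protocol (mean latency over repeated runs, throughput as images per second) without spelling it out. A stdlib-only sketch of that protocol under stated assumptions: `mean_latency` and `fake_forward` are hypothetical stand-ins, not functions from the notebook:

```python
import statistics
import time

def mean_latency(fn, n_iters=100, warmup=5):
    """Mean wall-clock latency of fn() in seconds over n_iters runs."""
    for _ in range(warmup):  # warm up caches before timing
        fn()
    times = []
    for _ in range(n_iters):
        start = time.perf_counter()
        fn()
        times.append(time.perf_counter() - start)
    return statistics.mean(times)

# Stand-in for a model forward pass or a post-processing call.
def fake_forward():
    sum(i * i for i in range(10_000))

latency = mean_latency(fake_forward)   # seconds per image (batch size 1)
throughput = 1.0 / latency             # images per second
print(f"{latency:.6f} s/img, {throughput:.1f} img/s")
```

For a real CUDA model, a synchronization point (e.g. `torch.cuda.synchronize()`) is needed before each clock read; otherwise asynchronous kernel launches make the forward pass look faster than it is.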