hatzakislab/REPLOM-analysis-tool

Analysis tool for REPLOM

This repository contains a pipeline for analysis of data obtained through REal-time kinetics via binding and Photobleaching LOcalization Microscopy (REPLOM). The methodology is described in https://doi.org/10.1101/2021.08.20.457097 and made available here for convenience. The tool relies on clustering with a Euclidean minimum spanning tree, as implemented in the AstroML package (see here for an example use of the clustering methodology). It is therefore released under the BSD license.

Introduction

The input data is a csv file of STORM localizations from blinking fluorophores, each with a timestamp (the frame column), for example:

"frame","x [nm]","y [nm]","intensity [photon]"
1.0,1031.048741153928,38711.99299011621,771.0507218978817
1.0,1209.1570168683265,15066.011054392755,1469.2293350289435
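As a minimal sketch of how a file in this format can be read, the snippet below parses the two example rows above with Python's standard csv module. The names SAMPLE and read_localizations are hypothetical and are not part of this repository; the scripts here may load the data differently.

```python
import csv
import io

# Two rows copied from the example above; a real run would open the csv file instead.
SAMPLE = '''"frame","x [nm]","y [nm]","intensity [photon]"
1.0,1031.048741153928,38711.99299011621,771.0507218978817
1.0,1209.1570168683265,15066.011054392755,1469.2293350289435
'''


def read_localizations(handle):
    """Return one dict per localization, keyed by column name, with float values.

    Hypothetical helper for illustration only.
    """
    reader = csv.DictReader(handle)
    return [{key: float(value) for key, value in row.items()} for row in reader]


localizations = read_localizations(io.StringIO(SAMPLE))
for loc in localizations:
    print(int(loc["frame"]), loc["x [nm]"], loc["y [nm]"])
```

To read a file on disk, pass an open file handle instead of the io.StringIO object.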

The pipeline first uses the Hierarchical Clustering method from the AstroML package to segment the observed data into molecular clusters.

Overview

The individual clusters are then analyzed in parallel to estimate growth curves of their area.
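To illustrate what a growth curve of cluster area can look like, here is a rough sketch that assumes the area of a cluster at a given frame is the convex hull area of all localizations seen up to that frame. This area definition is an assumption made for illustration; the repository's scripts may estimate area differently. The function names convex_hull_area and growth_curve are hypothetical.

```python
def convex_hull_area(points):
    """Area of the convex hull of 2D points (monotone chain + shoelace formula)."""
    pts = sorted(set(points))
    if len(pts) < 3:
        return 0.0

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    def half_hull(sequence):
        hull = []
        for p in sequence:
            # drop points that would make a clockwise or straight turn
            while len(hull) >= 2 and cross(hull[-2], hull[-1], p) <= 0:
                hull.pop()
            hull.append(p)
        return hull[:-1]

    hull = half_hull(pts) + half_hull(reversed(pts))
    shifted = hull[1:] + hull[:1]
    return 0.5 * abs(sum(x1 * y2 - x2 * y1 for (x1, y1), (x2, y2) in zip(hull, shifted)))


def growth_curve(localizations):
    """localizations: iterable of (frame, x, y); returns [(frame, cumulative hull area)].

    ASSUMPTION: area is the convex hull of everything observed so far.
    """
    by_frame = {}
    for frame, x, y in localizations:
        by_frame.setdefault(frame, []).append((x, y))
    seen, curve = [], []
    for frame in sorted(by_frame):
        seen.extend(by_frame[frame])
        curve.append((frame, convex_hull_area(seen)))
    return curve


# Toy cluster that grows over three frames (made-up coordinates).
toy = [(1, 0, 0), (1, 1, 0), (1, 0, 1), (2, 1, 1), (3, 2, 2)]
print(growth_curve(toy))
```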

Finally, the repository includes a fitting functionality which allows estimation of growth parameters for the segmented clusters.

Example use

The steps below guide you through an example of how to use this tool for analysis of REPLOM data.

Installation

The scripts rely on a set of libraries. As of now, the individual libraries must be installed by the user using either pip or conda. For most users this will include the following (along with their subdependencies):

  1. matplotlib
  2. iminuit
  3. multiprocess
  4. sklearn (installed as scikit-learn)
  5. tqdm

The animations are built using 'FuncAnimation' from 'matplotlib', which requires ffmpeg. If you do not have ffmpeg on your system, you can install it by running

conda install -c conda-forge ffmpeg

or

pip install ffmpeg
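Before running the pipeline, it can help to check which of the requirements above are missing. The snippet below is a small sketch using only the standard library; the helper name missing_requirements is hypothetical and not part of this repository. It only checks that the modules can be found and that an ffmpeg executable is on the PATH, not that their versions are compatible.

```python
import importlib.util
import shutil

# Import names of the libraries listed above.
REQUIRED_MODULES = ["matplotlib", "iminuit", "multiprocess", "sklearn", "tqdm"]


def missing_requirements(modules=REQUIRED_MODULES):
    """Return names of modules that cannot be found, plus 'ffmpeg' if it is not on PATH."""
    missing = [name for name in modules if importlib.util.find_spec(name) is None]
    if shutil.which("ffmpeg") is None:
        missing.append("ffmpeg")
    return missing


print("Missing:", missing_requirements() or "nothing")
```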

Segment clusters

The repository contains an example data set example_raw data.csv. After having installed the requirements, you may navigate to the cloned repository (if not already there) and run

python Automated_aggregate_analysis.py example_raw\ data.csv 0.92

This command calls the script Automated_aggregate_analysis.py with the file example_raw\ data.csv as input. The only input parameter is the distance cutoff percentile used to segment the Euclidean minimum spanning tree (see here for an example). A higher cutoff yields fewer but larger clusters; a lower cutoff yields more but smaller ones.
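The effect of the cutoff can be sketched with a small, self-contained example. The code below is not the AstroML-based implementation in mst_clustering.py; it is a simplified pure-Python illustration that builds a Euclidean minimum spanning tree with Prim's algorithm, removes edges longer than the given quantile of edge lengths, and labels the remaining connected components. The function names and the nearest-rank quantile rule are assumptions made for this sketch.

```python
import math


def mst_edges(points):
    """Prim's algorithm on the complete Euclidean graph; returns (length, i, j) edges."""
    n = len(points)
    if n < 2:
        return []
    in_tree = [False] * n
    best = [math.inf] * n  # shortest known distance from the tree to each point
    parent = [-1] * n
    best[0] = 0.0
    edges = []
    for _ in range(n):
        # pick the closest point that is not yet in the tree
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] >= 0:
            edges.append((best[u], parent[u], u))
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v], parent[v] = d, u
    return edges


def cluster_by_cutoff(points, cutoff):
    """Cut MST edges longer than the `cutoff` quantile of edge lengths; label components."""
    edges = mst_edges(points)
    lengths = sorted(length for length, _, _ in edges)
    # ASSUMPTION: simple nearest-rank quantile of the MST edge lengths
    threshold = lengths[min(len(lengths) - 1, int(cutoff * len(lengths)))] if lengths else 0.0
    root = list(range(len(points)))

    def find(i):
        while root[i] != i:
            root[i] = root[root[i]]  # path halving
            i = root[i]
        return i

    for length, i, j in edges:
        if length <= threshold:
            root[find(i)] = find(j)
    labels = {}
    return [labels.setdefault(find(i), len(labels)) for i in range(len(points))]


# Two tight groups far apart (made-up coordinates).
points = [(0, 0), (1, 0), (0, 1), (100, 100), (101, 100), (100, 101)]
for cutoff in (0.5, 0.92):
    print(cutoff, cluster_by_cutoff(points, cutoff))
```

In this toy example, the lower cutoff removes the single long edge between the two groups and yields two clusters, while the higher cutoff keeps it and yields one.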

After having run this command, a folder called example_raw data should have appeared, containing an overview plot called plot overview.png of the clusters along with coordinates in csv files named for example Group 0.csv. The script will then ask if the clustering is reasonable. If you agree, you type y and press enter. If not, you type in a new cutoff and press enter after which it will recompute the clustering.

Upon accepting the computed clusters, the script automatically begins analyzing the clusters and computing growth curves and videos. Although this runs in parallel, the process still takes some time (it can take up to hours, but movies should be produced continuously). The end result is a movie for each file, called for example Clustered Group 0.mp4, and a growth curve, called for example Group 8 Growth curve, which may be fitted to extract growth parameters.

Fit growth curves

Upon completion of clustering, the individual growth curves may be fitted. The script fitter-rate calculation.py contains functions to do so. In the current implementation of the repository, the script runs an example which relies on cluster segmentation having been completed on the example data with a cutoff of 0.92. Once that is done, running the command

python fitter-rate\ calculation.py

will fit two of the clusters: a symmetric cluster with a single growth mode and an asymmetric cluster with a switching growth mode.
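The difference between a single growth mode and a switching one can be illustrated with a deliberately simplified sketch. The code below does not reproduce the fit functions in fitter-rate calculation.py, which are not shown here; it assumes, purely for illustration, that a growth curve is a straight line with one rate, or two straight-line segments with different rates. The function names, the linear model, and the toy data are all hypothetical.

```python
def fit_line(t, y):
    """Ordinary least squares for y = a + b*t; returns (a, b, sum of squared residuals)."""
    n = len(t)
    mean_t, mean_y = sum(t) / n, sum(y) / n
    s_tt = sum((ti - mean_t) ** 2 for ti in t)
    b = sum((ti - mean_t) * (yi - mean_y) for ti, yi in zip(t, y)) / s_tt
    a = mean_y - b * mean_t
    sse = sum((yi - (a + b * ti)) ** 2 for ti, yi in zip(t, y))
    return a, b, sse


def fit_switching(t, y, min_points=3):
    """Try every split index, fit one line before and one after, keep the best split.

    Returns (sse, switch time, rate before, rate after).
    ASSUMPTION: piecewise-linear growth, chosen only to illustrate a rate switch.
    """
    best = None
    for k in range(min_points, len(t) - min_points + 1):
        _, rate_before, sse_before = fit_line(t[:k], y[:k])
        _, rate_after, sse_after = fit_line(t[k:], y[k:])
        total = sse_before + sse_after
        if best is None or total < best[0]:
            best = (total, t[k], rate_before, rate_after)
    return best


# Toy growth curve: rate 1 for t < 5, then rate 3 (made-up numbers).
t = list(range(10))
y = [ti if ti < 5 else 5 + 3 * (ti - 5) for ti in t]

_, single_rate, single_sse = fit_line(t, y)
switch_sse, switch_time, rate_before, rate_after = fit_switching(t, y)
print("single rate:", single_rate, "residual:", single_sse)
print("switching rates:", rate_before, rate_after, "residual:", switch_sse)
```

Comparing the residuals of the two fits gives a rough indication of whether a single rate is sufficient for a given cluster.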

Repository overview

To sum up, here is a description of each of the files in this repository (apart from files related to the README):

  1. Automated_aggregate_analysis.py
    1. Runs clustering analysis on an input csv file of REPLOM observations. Upon completion, it computes growth curves and generates movies for all clusters.
  2. fitter-rate calculation.py
    1. Contains fit functions to fit individual clusters
  3. example_raw data.csv
    1. Example data set for instructional use
  4. mst_clustering.py
    1. Auxiliary script to compute Euclidean minimum spanning tree clustering, modified from [AstroML](https://www.astroml.org/index.html)
