FlowSOM fail on new normalized FCS files #20

Open
EAC-T opened this issue Jan 27, 2021 · 5 comments

EAC-T commented Jan 27, 2021

Hi everyone,

After normalizing my files, I upload them to Cytobank for FlowSOM clustering, but the run keeps failing. Any idea why?
Also, the normalized FCS files are about 3x smaller than the files before normalization. Is there a reason for that?

Thank you a lot

SamGG commented Jan 27, 2021

Hi,
Failure: try loading the files in other software, e.g. FlowJo or OMIQ, to see whether the error can be reproduced. Also check FCS compliance with http://bioinformin.cesnet.cz/flowIO/.
Size: if the files before normalization come directly from a CyTOF, such FCS files are known to carry an overhead of at least a factor of 2. If not, no idea.
Best.

EAC-T (Author) commented Jan 27, 2021

Hi @SamGG,
I checked my FCS files as you suggested. One potential cause is that a few values are extremely large, around 1.07e+30. Does that indicate that the normalization failed? Have you encountered such a problem before?
Thank you a lot
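
A quick way to locate such values before uploading is to scan each channel for non-finite or implausibly large entries. The following is only a minimal sketch: it assumes the events have already been read into memory as plain lists of floats, and the function name `find_extreme_values`, the channel names, and the cutoff `threshold=1e6` are all hypothetical choices for illustration, not CytoNorm or Cytobank settings.

```python
import math


def find_extreme_values(channels, threshold=1e6):
    """Return {channel: [(row_index, value), ...]} for suspicious values.

    `channels` maps a channel name to a list of floats. `threshold` is an
    arbitrary illustrative cutoff, not a CytoNorm or Cytobank setting.
    """
    suspicious = {}
    for name, values in channels.items():
        hits = [
            (i, v)
            for i, v in enumerate(values)
            if not math.isfinite(v) or abs(v) > threshold
        ]
        if hits:
            suspicious[name] = hits
    return suspicious


# Made-up example data: one channel holds a value like the one reported above.
example = {
    "CD3": [0.5, 2.1, 4.8, 1.07e30],
    "CD4": [0.0, 1.3, 3.9, 5.2],
}
report = find_extreme_values(example)
print(report)
```

Knowing which channels and how many events are affected can help narrow down whether the problem is confined to a few markers or clusters.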

SamGG commented Jan 27, 2021

I didn't encounter such a problem. Sofie will probably answer you soon.

SofieVG (Member) commented Feb 15, 2021

Hi,

This is indeed an issue I have also encountered once in a while. Typically it is related to the training data not being sufficiently representative of the data you are trying to normalize, so that the splines end up extrapolating.

One option to minimize this effect is the limits parameter, which takes values that are introduced into the spline as identity points. If you place them at values you expect to lie at the borders of the range (e.g. for CyTOF data after transformation, typically around 0 and 8), this can help encourage the spline to stay closer to the identity function outside this range.

Alternatively, it might be a good idea to double-check the FlowSOM model and see whether the clusters make sense. You may be overclustering the data, which leaves some small clusters without enough data to estimate the spline appropriately. Plotting the splines (e.g. using the plot = TRUE parameter) can help pinpoint the exact issue.

All the best,
Sofie
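
The extrapolation effect described above can be illustrated with a toy example. This is a sketch under stated assumptions, not CytoNorm's implementation: it swaps the spline for a plain piecewise-linear map whose end segments are extended linearly, and the knot values, the helper `linear_map`, and the name `identity_anchors` are all made up for illustration.

```python
def linear_map(knots, x):
    """Piecewise-linear map through (source, target) knots.

    Outside the knot range, the first or last segment is extended
    linearly, which is where extrapolation can run away.
    Requires at least two knots with distinct source values.
    """
    knots = sorted(knots)
    # Pick the segment containing x, or the nearest end segment.
    for (x0, y0), (x1, y1) in zip(knots, knots[1:]):
        if x <= x1:
            break
    slope = (y1 - y0) / (x1 - x0)
    return y0 + slope * (x - x0)


# Made-up quantile pairs (batch value -> goal value) from sparse training data.
# The last two knots are close together, so the final segment is steep.
trained_knots = [(1.0, 1.0), (2.0, 2.1), (3.0, 3.0), (3.2, 4.0)]

# Hypothetical identity points at the expected borders of the range.
identity_anchors = [(0.0, 0.0), (8.0, 8.0)]

new_value = 7.0  # lies beyond anything seen during training
without_anchors = linear_map(trained_knots, new_value)
with_anchors = linear_map(trained_knots + identity_anchors, new_value)
print(without_anchors, with_anchors)
```

In this toy case the unanchored map sends 7.0 to roughly 23, while the anchored map keeps it at about 7.2. A real spline can overshoot far more dramatically than a linear extension, but the mechanism is the same: a steep fit at the edge of poorly covered training data gets projected onto values outside that range.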

@tomashhurst

@EAC-T you can also try using fewer metaclusters to generate a slightly more accurate model. This can often help prevent those extremely high values from appearing in the resulting data.
