Can I use ANANSE to compare TF networks between disease and normal conditions, such as aged vs. young or fibrotic disease vs. healthy control? #209

Open
socialtree-yt opened this issue Dec 4, 2023 · 11 comments

Comments

@socialtree-yt

Can I use ANANSE to compare TF networks between disease and normal conditions, such as aged vs. young or fibrotic disease vs. healthy control, rather than differentiation?

@socialtree-yt
Author

And if I use a narrowPeak file with the ananse binding -r parameter, will it automatically calculate binding in enhancer regions, or only in the merged regions of the narrowPeak files?

@siebrenf
Member

siebrenf commented Dec 4, 2023

Hi socialtree,

Can I use ANANSE to compare TF networks between disease and normal conditions, such as aged vs. young or fibrotic disease vs. healthy control

ANANSE network outputs a network per condition and ANANSE influence outputs a differential network. However, these networks are very tricky to analyse. The top influential TFs output by ANANSE influence are much more robust.

rather than differentiation

I'm not sure I understand this part. In the final step, ANANSE influence, we require a Differential Gene Expression analysis as input. This data is combined with the networks from the previous step, ANANSE network, to find the top influential TFs, which are something more than "just" differentiation.

if I use a narrowPeak file with the ananse binding -r parameter, will it automatically calculate binding in enhancer regions, or only in the merged regions of the narrowPeak files?

With the -r parameter, you tell ANANSE in which regions to look. The enhancer activity within those regions is automatically calculated from the enhancer data, which you give with the -A, -H, or -C parameters, depending on your data type (ATAC-seq, H3K27ac ChIP-seq, or CAGE-seq, respectively).
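As a rough illustration of how the flags in this reply fit together, the sketch below assembles an `ananse binding` argument list in Python without running it. Only the `-r`, `-A`, `-H`, and `-C` flags come from the reply; the file names, the `ENHANCER_FLAGS` mapping, and the `binding_command` helper are hypothetical, and other arguments (such as output location and genome) are left out, so check `ananse binding --help` before using it.

```python
import shlex

# Flags as named in the reply above; the dictionary keys are hypothetical labels.
ENHANCER_FLAGS = {"atac": "-A", "h3k27ac": "-H", "cage": "-C"}


def binding_command(region_files, enhancer_data):
    """Build an `ananse binding` argument list (it is not executed here)."""
    cmd = ["ananse", "binding", "-r", *region_files]
    for data_type, files in enhancer_data.items():
        cmd += [ENHANCER_FLAGS[data_type], *files]
    return cmd


# Hypothetical input files: regions from a narrowPeak file, activity from two BAMs.
cmd = binding_command(
    ["peaks.narrowPeak"],
    {"atac": ["atac.bam"], "h3k27ac": ["h3k27ac.bam"]},
)
print(shlex.join(cmd))
```

The printed command can be copied into a shell once the remaining required options are added.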

@socialtree-yt
Author

OK, thank you! By "rather than differentiation" I was referring to the introduction of ANANSE at https://anansepy.readthedocs.io/en/master/, which says: "You can use it to study transcription regulation during development and differentiation, or to generate a shortlist of transcription factors for trans-differentiation experiments." So I wonder whether ANANSE is suitable mainly for differentiation-related studies, and not for general comparisons such as comparing TF networks between disease and healthy conditions.

@socialtree-yt
Author

"With the -r parameter, you tell ANANSE in which regions to look"
I also want to know the BED format for this parameter. Can I use BED6, or must it be BED12? Also, suppose I have a predefined enhancer BED file containing typical enhancers and super-enhancers. Regardless of how ANANSE identifies the genomic locations of these regions, would it calculate the activity in my original regions, without splitting my super-enhancers into multiple regions or discarding them? Is that right?

@siebrenf
Member

siebrenf commented Dec 4, 2023

So I wonder whether ANANSE is suitable mainly for differentiation-related studies, and not for general comparisons such as comparing TF networks between disease and healthy conditions

Ah OK, now I understand you! I believe ANANSE is suitable for healthy-disease comparisons. (I use the word "believe" because it has not been benchmarked for this scenario, but it should work just fine)

I also want to know the BED format for this parameter

We accept various formats. BED3 (and higher) works.

ananse would calculate the activity in my primary region but not split my super enhancer into multiple regions or discard these regions.

All regions are kept, but they are scaled to the same width. For ATAC-seq, this width is 200 bp; for ChIP-seq, it is 2000 bp. The reason for this is that we calculate the activity as the number of reads under each peak, and wider peaks naturally contain more reads.
If this does not work for your analysis, you could supply a pfmscorefile to ANANSE binding. See ananse binding --help and gimme scan --help for more info.
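To make the width argument concrete, here is a toy sketch (not ANANSE code) that counts reads in two windows centred on the same position. The uniform read spacing and the `count_reads` and `fixed_width` helpers are assumptions made purely for illustration.

```python
def count_reads(read_positions, start, end):
    """Count reads whose position falls in the half-open interval [start, end)."""
    return sum(start <= pos < end for pos in read_positions)


def fixed_width(center, width):
    """Return a (start, end) window of the given width centred on `center`."""
    start = max(0, center - width // 2)
    return start, start + width


# Toy data: one read every 10 bp, so the count depends on region width only.
reads = range(0, 10_000, 10)

narrow = fixed_width(5_000, 200)   # ATAC-seq-style width from the reply
wide = fixed_width(5_000, 2_000)   # ChIP-seq-style width from the reply

narrow_count = count_reads(reads, *narrow)
wide_count = count_reads(reads, *wide)
print(narrow_count, wide_count)  # 20 200
```

With identical coverage, the tenfold wider window collects ten times as many reads, which is why regions of mixed width would not be directly comparable.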

@socialtree-yt
Author

If this does not work for your analysis, you could supply a pfmscorefile to ANANSE binding. See ananse binding --help and gimme scan --help for more info.
I think this method helps to predict motif binding using different databases, but it does not solve the same-width scaling if I need to keep my regions at their original, varying widths?

@socialtree-yt
Author

And I see: "H3K27ac signal, for instance, would not work well, as peaks from a H3K27ac ChIP-seq experiment are too broad to provide a precise region for the motif analysis. You can also provide one or more narrowPeak files, for instance from MACS2." Is it accurate to use ChIP-seq narrowPeak files as regions for ananse binding, or do ATAC-seq narrowPeak files work better? Can I use both ATAC-seq and ChIP-seq narrowPeak files as regions? Does that outperform using only ATAC-seq narrowPeak files?
Thank you for your help!

@socialtree-yt
Author

And, for example, if I choose a wide range A with the ananse network -r parameter, will all regions in the "binding.h5" file from ananse binding be selected if they fall within range A? Or must I select exactly the same regions as in the "binding.h5" file?

@siebrenf
Member

siebrenf commented Dec 5, 2023

Is it accurate to use ChIP-seq narrowPeak files as regions for ananse binding, or do ATAC-seq narrowPeak files work better? Can I use both ATAC-seq and ChIP-seq narrowPeak files as regions? Does that outperform using only ATAC-seq narrowPeak files?

Yes, yes and yes! NarrowPeak files work well for indicating regions with both ChIP-seq and ATAC-seq, and combining H3K27ac ChIP-seq BAMs with ATAC-seq BAMs is better than using either one alone.

if I choose a wide range A with the ananse network -r parameter, will all regions in the "binding.h5" file from ananse binding be selected if they fall within range A? Or must I select exactly the same regions as in the "binding.h5" file?

The latter. ananse network -r is used to filter your regions. So you can only select regions from the "binding.h5" file.
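The behaviour described in this reply can be pictured with a toy sketch. This is an illustration of exact-match selection only, not ANANSE's implementation; the region strings and the `select_regions` helper are hypothetical.

```python
def select_regions(available, requested):
    """Keep only requested regions that exactly match an available region."""
    available_set = set(available)
    return [region for region in requested if region in available_set]


# Hypothetical regions standing in for the contents of a "binding.h5" file.
binding_regions = ["chr1:1150-1350", "chr1:5000-5200"]

exact = select_regions(binding_regions, ["chr1:1150-1350"])
wide = select_regions(binding_regions, ["chr1:1000-6000"])
print(exact)  # ['chr1:1150-1350']
print(wide)   # []
```

Under this reading, a wide range that merely contains the stored regions selects nothing, because it is not itself one of them.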

I think this method [...] does not solve the same-width scaling if I need to keep my regions at their original, varying widths?

I believe ANANSE gives the best results if all input regions have the same width.

You could do this by taking the start + peak values (columns 2 and 10, which together give the summit position) from all your narrowPeak files, and extending each region around this coordinate. Which width you use is up to you (for reference: our default is 200 bp).
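A minimal sketch of that suggestion, assuming standard 10-column, tab-separated narrowPeak input in which column 10 is the summit offset from the start (or -1 when no summit was called). The `summit_regions` helper and the example peaks are hypothetical; the sketch does not clip at chromosome ends or merge overlapping regions.

```python
def summit_regions(narrowpeak_lines, width=200):
    """Return (chrom, start, end) regions of fixed width centred on each summit."""
    half = width // 2
    regions = []
    for line in narrowpeak_lines:
        # Skip blank lines and optional header lines.
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        fields = line.rstrip("\n").split("\t")
        chrom, start, end = fields[0], int(fields[1]), int(fields[2])
        offset = int(fields[9])
        # Fall back to the region midpoint when no summit was called (-1).
        summit = start + offset if offset >= 0 else (start + end) // 2
        new_start = max(0, summit - half)  # avoid negative coordinates
        regions.append((chrom, new_start, new_start + width))
    return regions


# Made-up example peaks, for illustration only.
example = [
    "chr1\t1000\t1600\tpeak1\t500\t.\t12.3\t20.1\t18.0\t250",
    "chr2\t50\t400\tpeak2\t300\t.\t8.1\t10.5\t9.2\t20",
]
for chrom, start, end in summit_regions(example):
    print(f"{chrom}\t{start}\t{end}")
```

The output is BED3, which the thread above confirms is accepted as a region format.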

My advice would be to try this option first, and then see what the effect of using variable region widths is (I expect that scores become biased towards wider regions).

I hope this is clearer :)

@socialtree-yt
Author

You could do this by taking the start + peak values from all your narrowPeak format files, and extending each region around this coordinate. Which width you use is up to you
OK, thank you! That is very clear, but I have one more question: will ANANSE automatically scale regions to 200 bp if I use narrowPeak files as input regions, given that narrowPeak regions usually have different lengths? And if I extend my regions as you suggest, will they still be scaled to 200 bp?

@siebrenf
Member

If you give ANANSE narrowPeak and BAM files, the regions are scaled.
If you give ANANSE a pfmscorefile, the regions are not scaled.

I think that the easiest way to make this pfmscorefile is to use coverage_table (from gimmemotifs, which is included in the ANANSE environment). You would have to run this command for each region width, then combine the results. (This will also show you how the values change depending on the region width.)
