Thanks for making such a wonderful tool publicly available for use.
I am trying to reproduce the results from the tutorial, but the model does not seem to train well. I noticed that the 1.2.0 release fixed a reproducibility issue, and I was using version 1.1. Therefore, I upgraded CREsted to the latest version (v1.2.1) and am currently retraining the model.
Before delving into the model specifics, I wanted to ask if there were any particular preprocessing steps applied, such as dividing adata.X by a factor of 100 after normalization.
I ask because, in the notebook titled "Introduction to CREsted with Peak Regression", there is a bar plot showing the ground truth values for the region chr18:3892771-3894885. The y-axis ranges from 0 to 14 in the plot, whereas the tutorial data after normalization ranges from 0 to 1400. There are also minor differences in the bar heights beyond a simple scale factor of 100. I wonder if preprocessing steps account for this difference and could be the main reason why the model did not train well.
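To make the comparison concrete, here is a minimal sketch of how one could check whether two sets of bar heights differ only by a constant scale factor. The numbers are made-up toy values, not the actual tutorial data, and `scale_check` is a hypothetical helper, not part of CREsted.

```python
import numpy as np

def scale_check(plotted, data, rtol=0.05):
    """Return the median data/plot ratio and whether all ratios agree with it."""
    plotted = np.asarray(plotted, dtype=float)
    data = np.asarray(data, dtype=float)
    mask = plotted > 0                      # avoid dividing by zero-height bars
    ratios = data[mask] / plotted[mask]
    factor = float(np.median(ratios))
    consistent = bool(np.allclose(ratios, factor, rtol=rtol))
    return factor, consistent

# Toy values only (hypothetical): bar heights read off a plot vs. values in adata.X
plot_values = np.array([14.0, 7.0, 2.0, 0.5])
data_values = np.array([1400.0, 650.0, 230.0, 50.0])

factor, consistent = scale_check(plot_values, data_values)
print(f"median ratio: {factor:.1f}, constant scale factor: {consistent}")
```

If the ratios are not constant within tolerance, the difference is more than a rescaling, which would point to different underlying data or a different target statistic.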
Thank you in advance for any guidance!
Dan
Version information
No response
Sorry to hear that you're having issues training. What kind of performance are you getting on your test set?
I think the y-axis range in the plot might be an error in the tutorial. The target value used to be 'mean' but seems to have changed to 'count' in the latest tutorial version, so the tutorial plot may be an artifact from when it was still the mean. I'll check with Niklas, who wrote the tutorial, when he gets back next week.
Thanks for noticing. I found that the bigwigs used in the tutorial were no longer up to date with the ones I use myself. Originally, we used coverage bigwigs (which assign a count to every base pair between two cut sites), but I recently switched to cut-site bigwigs, which only contain count values at the actual cut sites. This gives sparser data, which is why we switched to the 'count' scalar: mean values would be too low.
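As a rough illustration of why the mean becomes too low with cut-site tracks, here is a small toy sketch. The fragment coordinates are invented, and it assumes that 'count' corresponds to summing the signal over the region, which is my reading rather than something confirmed here.

```python
import numpy as np

region_len = 1000
# Hypothetical fragments as (start, end) positions inside one region
fragments = [(100, 300), (250, 450), (400, 520)]

coverage = np.zeros(region_len)   # coverage-style track
cut_sites = np.zeros(region_len)  # cut-site-style track

for start, end in fragments:
    coverage[start:end] += 1      # every base between the two cut sites gets a count
    cut_sites[start] += 1         # only the two fragment ends get a count
    cut_sites[end - 1] += 1

coverage_mean = coverage.mean()
cut_mean = cut_sites.mean()
cut_count = cut_sites.sum()       # assumed meaning of the 'count' scalar

print(f"coverage mean: {coverage_mean:.3f}")
print(f"cut-site mean: {cut_mean:.3f}")
print(f"cut-site count: {cut_count:.0f}")
```

With the same fragments, the cut-site mean is far smaller than the coverage mean, while the summed count stays on a usable scale.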
We are working on fixing the bigwigs on our server to the correct ones asap.