Nanostring update 20181010 #50

mollie-barnard · 2018-10-10T19:13:07Z

The following pull request adds, fixes, and/or modifies:

Insert text here referencing the exact scripts that update
If possible, I will make sure that the pull request only updates one feature at a time
Add to this list if necessary

I performed the following prior to filing this pull request:

Tested that my change does not break the analysis pipeline
Ran a linter through my code
Update environment.yml if my code introduces new packages.

Lastly, I will assign individuals to review my code and be proud of making the pull request!


gwaybio

Congrats on the first pull request! 🎉 🎉 Looks great, just a couple of comments/questions.

gwaybio · 2018-10-10T19:31:26Z

7.Nanostring/scripts/A.get_correlation_output.R

-  dplyr::arrange(desc(rfFreq)) %>%
-  dplyr::top_n(n = top_n_genes) %>%
-  dplyr::mutate(genes = toupper(genes))
+  dplyr::filter(genes == "BOP1" 


Let's do it this way instead. I think it is cleaner:

capture_genes <- c("BOP1", "DNAI1", "HSF1", "LRRC50", "MS4A3", "NTN2L", "SHARPIN", "SLC12A3", "SOX10", "TSNAXIP1")
classifier_df <- readr::read_csv(file) %>%
  dplyr::filter(genes %in% capture_genes)
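For anyone following along, here is a minimal sketch of what the `%in%` filter does, using a toy data frame in place of `overallFreqs.csv`. The toy rows and `rfFreq` values are made up for illustration; only the `genes` and `rfFreq` column names come from the script.

```r
suppressPackageStartupMessages(library(dplyr))

# Toy stand-in for overallFreqs.csv (assumption: values are invented)
toy_df <- data.frame(
  genes = c("BOP1", "TP53", "SOX10", "BRCA1"),
  rfFreq = c(0.9, 0.8, 0.7, 0.6),
  stringsAsFactors = FALSE
)

# Shortened gene list for the example
capture_genes <- c("BOP1", "SOX10")

# Keep only rows whose gene name appears in capture_genes;
# this replaces a chain of `genes == "..." | genes == "..."` comparisons
filtered_df <- toy_df %>%
  dplyr::filter(genes %in% capture_genes)

print(filtered_df)
```

Adding or removing a gene then only means editing the `capture_genes` vector, not the filter expression.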

gwaybio · 2018-10-10T19:32:32Z

7.Nanostring/scripts/A.get_correlation_output.R

@@ -14,12 +14,40 @@ library(dplyr)

 file <- file.path("7.Nanostring", "data", "overallFreqs.csv")

+# OPTION 1: Classifier genes


What is the plan currently with this? Are we going to go with the different options and run independently?

mollie-barnard · 2018-10-10T19:51:55Z

Yay! Glad everything made sense. I like the filtering change. For your second question, my "quick fix" plan was to have the different options available to comment in/out as needed. I made this choice because I was unsure of what steps in parts B-F I would need to change if I had all of the gene sets lead to separate outputs (e.g., would I need to have 4 separate pipelines or somehow further streamline the output file naming process?). With this "quick fix" all I have to do is rename the "figures" and "results" folders at the completion of each full-pipeline run and run again with the next gene set. I'm open to more elegant solutions.


gwaybio · 2018-10-11T00:22:08Z

> I made this choice because I was unsure of what steps in parts B-F I would need to change if I had all of the gene sets lead to separate outputs (e.g., would I need to have 4 separate pipelines or somehow further streamline the output file naming process?). With this "quick fix" all I have to do is rename the "figures" and "results" folders at the completion of each full-pipeline run and run again with the next gene set. I'm open to more elegant solutions.

This sounds good if you're looking to get results and iterate quickly. If this is the case, I am on board. However, it is not a good long-term solution from a maintainability perspective.

I think we should run the pipeline through with an alternative set (like the extra 10 genes) and see what other issues we run into and in which scripts. When I wrote this I originally didn't intend for it to be run using different genes, and we're paying the price now! We should probably address any issues we run into.

For this particular solution, we may want to consider switching to an optparse solution. See here for an example. With this functionality we can also programmatically change the results and figures directories.
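A rough sketch of what the optparse approach could look like, not a final design. The flag names (`--geneset`, `--outdir`), the default values, and the `results/<geneset>` and `figures/<geneset>` layout are all assumptions for illustration; the actual scripts may need different names and paths.

```r
suppressPackageStartupMessages(library(optparse))

# Hypothetical helper: parse command line flags and derive output folders.
# `args` defaults to the real command line but can be passed in for testing.
parse_pipeline_args <- function(args = commandArgs(trailingOnly = TRUE)) {
  option_list <- list(
    make_option("--geneset", type = "character", default = "classifier",
                help = "Label of the gene set to run [default %default]"),
    make_option("--outdir", type = "character", default = "7.Nanostring",
                help = "Base folder for outputs [default %default]")
  )
  opt <- parse_args(OptionParser(option_list = option_list), args = args)

  # Assumed layout: one results and one figures subfolder per gene set,
  # so runs no longer overwrite each other or need manual renaming
  opt$results_dir <- file.path(opt$outdir, "results", opt$geneset)
  opt$figures_dir <- file.path(opt$outdir, "figures", opt$geneset)
  opt
}
```

Each script would call `opt <- parse_pipeline_args()` at the top, create the folders with `dir.create(opt$results_dir, recursive = TRUE, showWarnings = FALSE)`, and then be run as, for example, `Rscript 7.Nanostring/scripts/A.get_correlation_output.R --geneset capture10` (the `capture10` label is made up here).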

mollie.barnard added 2 commits October 10, 2018 12:16

add options for 10, 454 and 513 gene sets. create results folder accessed in part B

f062f2a

create folders nested within figures folder

422f40e
