[REVIEW]: ProtoSyn.jl: a novel platform for computational molecular manipulation and simulation with a focus on protein design #124

Open
30 of 42 tasks
whedon opened this issue Mar 6, 2023 · 16 comments

Comments

@whedon
Collaborator

whedon commented Mar 6, 2023

Submitting author: @https://github.com/JosePereiraUA (José Pereira)
Repository: https://github.com/sergio-santos-group/ProtoSyn.jl
Branch with paper.md (empty if default branch):
Version:
Editor: @odow
Reviewers: @jgreener64, @mfherbst
Archive: Pending

Status


Status badge code:

HTML: <a href="https://proceedings.juliacon.org/papers/1a84b4d891d449916c18f81079109cb0"><img src="https://proceedings.juliacon.org/papers/1a84b4d891d449916c18f81079109cb0/status.svg"></a>
Markdown: [![status](https://proceedings.juliacon.org/papers/1a84b4d891d449916c18f81079109cb0/status.svg)](https://proceedings.juliacon.org/papers/1a84b4d891d449916c18f81079109cb0)

Reviewers and authors:

Please avoid lengthy details of difficulties in the review thread. Instead, please create a new issue in the target repository and link to those issues (especially acceptance-blockers) by leaving comments in the review thread below. (For completists: if the target issue tracker is also on GitHub, linking the review thread in the issue or vice versa will create corresponding breadcrumb trails in the link target.)

Reviewer instructions & questions

@jgreener64 & @mfherbst, please carry out your review in this issue by updating the checklist below. If you cannot edit the checklist please:

  1. Make sure you're logged in to your GitHub account
  2. Be sure to accept the invite at this URL: https://github.com/openjournals/joss-reviews/invitations

The reviewer guidelines are available here: https://joss.readthedocs.io/en/latest/reviewer_guidelines.html. Any questions/concerns please let @odow know.

Please start on your review when you are able, and be sure to complete your review in the next six weeks, at the very latest

Review checklist for @jgreener64

Conflict of interest

Code of Conduct

General checks

  • Repository: Is the source code for this software available at the repository url?
  • License: Does the repository contain a plain-text LICENSE file with the contents of an OSI approved software license?
  • Authorship: Has the submitting author (@https://github.com/JosePereiraUA) made major contributions to the software? Does the full list of paper authors seem appropriate and complete?

Functionality

  • Installation: Does installation proceed as outlined in the documentation?
  • Functionality: Have the functional claims of the software been confirmed?
  • Performance: If there are any performance claims of the software, have they been confirmed? (If there are no claims, please check off this item.)

Documentation

  • A statement of need: Do the authors clearly state what problems the software is designed to solve and who the target audience is?
  • Installation instructions: Is there a clearly-stated list of dependencies? Ideally these should be handled with an automated package management solution.
  • Example usage: Do the authors include examples of how to use the software (ideally to solve real-world analysis problems).
  • Functionality documentation: Is the core functionality of the software documented to a satisfactory level (e.g., API method documentation)?
  • Automated tests: Are there automated tests or manual steps described so that the function of the software can be verified?
  • Community guidelines: Are there clear guidelines for third parties wishing to 1) Contribute to the software 2) Report issues or problems with the software 3) Seek support

Paper format

  • Authors: Does the paper.tex file include a list of authors with their affiliations?
  • A statement of need: Do the authors clearly state what problems the software is designed to solve and who the target audience is?
  • References: Do all archival references that should have a DOI list one (e.g., papers, datasets, software)?
  • Page limit: Is the page limit for full papers respected by the submitted document?

Content

  • Context: is the scientific context motivating the work correctly presented?
  • Methodology: is the approach taken in the work justified, presented with enough details and reference to reproduce it?
  • Results: are the results presented and compared to approaches with similar goals?

Review checklist for @mfherbst

Conflict of interest

Code of Conduct

General checks

  • Repository: Is the source code for this software available at the repository url?
  • License: Does the repository contain a plain-text LICENSE file with the contents of an OSI approved software license?
  • Authorship: Has the submitting author (@https://github.com/JosePereiraUA) made major contributions to the software? Does the full list of paper authors seem appropriate and complete?

Functionality

  • Installation: Does installation proceed as outlined in the documentation?
  • Functionality: Have the functional claims of the software been confirmed?
  • Performance: If there are any performance claims of the software, have they been confirmed? (If there are no claims, please check off this item.)

Documentation

  • A statement of need: Do the authors clearly state what problems the software is designed to solve and who the target audience is?
  • Installation instructions: Is there a clearly-stated list of dependencies? Ideally these should be handled with an automated package management solution.
  • Example usage: Do the authors include examples of how to use the software (ideally to solve real-world analysis problems).
  • Functionality documentation: Is the core functionality of the software documented to a satisfactory level (e.g., API method documentation)?
  • Automated tests: Are there automated tests or manual steps described so that the function of the software can be verified?
  • Community guidelines: Are there clear guidelines for third parties wishing to 1) Contribute to the software 2) Report issues or problems with the software 3) Seek support

Paper format

  • Authors: Does the paper.tex file include a list of authors with their affiliations?
  • A statement of need: Do the authors clearly state what problems the software is designed to solve and who the target audience is?
  • References: Do all archival references that should have a DOI list one (e.g., papers, datasets, software)?
  • Page limit: Is the page limit for full papers respected by the submitted document?

Content

  • Context: is the scientific context motivating the work correctly presented?
  • Methodology: is the approach taken in the work justified, presented with enough details and reference to reproduce it?
  • Results: are the results presented and compared to approaches with similar goals?
@whedon
Collaborator Author

whedon commented Mar 6, 2023

Hello human, I'm @whedon, a robot that can help you with some common editorial tasks. @jgreener64, @mfherbst it looks like you're currently assigned to review this paper 🎉.

⚠️ JOSS reduced service mode ⚠️

Due to the challenges of the COVID-19 pandemic, JOSS is currently operating in a "reduced service mode". You can read more about what that means in our blog post.

⭐ Important ⭐

If you haven't already, you should seriously consider unsubscribing from GitHub notifications for this (https://github.com/JuliaCon/proceedings-review) repository. As a reviewer, you're probably currently watching this repository which means for GitHub's default behaviour you will receive notifications (emails) for all reviews 😿

To fix this do the following two things:

  1. Set yourself as 'Not watching' https://github.com/JuliaCon/proceedings-review:


  2. You may also like to change your default settings for watching repositories in your GitHub profile here: https://github.com/settings/notifications


For a list of things I can do to help you, just type:

@whedon commands

For example, to regenerate the paper pdf after making changes in the paper's md or bib files, type:

@whedon generate pdf

@whedon
Collaborator Author

whedon commented Mar 6, 2023

Failed to discover a Statement of need section in paper

@whedon
Collaborator Author

whedon commented Mar 6, 2023

Reference check summary (note 'MISSING' DOIs are suggestions that need verification):

OK DOIs

- 10.1038/nbt0798-617 is OK
- 10.1146/annurev.biophys.37.032807.125832 is OK
- 10.1038/nature19946 is OK
- 10.1002/pro.4098 is OK
- 10.1126/science.1089427 is OK
- 10.1093/bioinformatics/btq007 is OK
- 10.1038/s41592-020-0848-2 is OK
- 10.1038/s41586-019-1923-7 is OK
- 10.1038/s41586-021-03828-1 is OK
- 10.1038/s41586-021-03819-2 is OK
- 10.1073/pnas.1914677117 is OK
- 10.1002/pro.3235 is OK
- 10.1007/978-1-4939-6637-0_2 is OK
- 10.26682/sjuod.2019.22.1.11 is OK
- 10.1371/journal.pone.0020161 is OK
- 10.1002/jcc.20727 is OK
- 10.1137/141000671 is OK
- 10.1145/3276490 is OK
- 10.1007/978-1-4613-8476-2_1 is OK
- 10.1002/CBIC.202000437 is OK
- 10.1093/BIOINFORMATICS/BTU106 is OK
- 10.1145/3276483 is OK
- 10.1073/pnas.96.10.5486 is OK

MISSING DOIs

- 10.1002/(sici)1097-0134(1999)37:3+<171::aid-prot21>3.0.co;2-z may be a valid DOI for title: Ab initio protein structure prediction of CASP III targets using ROSETTA

INVALID DOIs

- None

@whedon
Collaborator Author

whedon commented Mar 6, 2023

Wordcount for paper.tex is 3356

@whedon
Collaborator Author

whedon commented Mar 6, 2023

👉📄 Download article proof 📄 View article proof on GitHub 📄 👈

@whedon
Collaborator Author

whedon commented Mar 6, 2023

Software report (experimental):

github.com/AlDanial/cloc v 1.88  T=0.78 s (416.6 files/s, 92191.7 lines/s)
-------------------------------------------------------------------------------
Language                     files          blank        comment           code
-------------------------------------------------------------------------------
Julia                          162           4905           2050          21193
YAML                            54             16             28           2757
TeX                              8            263            177           2558
Markdown                        74            857              0           2015
TOML                             2            227              1           1039
Jupyter Notebook                17              0          32062            712
Lisp                             2             74              0            326
Python                           2             36             13            105
Ruby                             1              8              4             45
JSON                             1              0              0              1
-------------------------------------------------------------------------------
SUM:                           323           6386          34335          30751
-------------------------------------------------------------------------------


Statistical information for the repository '9b056a74136b7f1027ca5c55' was
gathered on 2023/03/06.
The following historical commit information, by author, was found:

Author                     Commits    Insertions      Deletions    % of changes
JosePereiraUA                    8           954           3262           34.81
José Pereira                     1            57              0            0.47
Sergio M. Santos                 6          3973           3864           64.72

Below are the number of rows from each author that have survived and are still
intact in the current revision:

Author                     Rows      Stability          Age       % in comments
JosePereiraUA               102           10.7          7.9                6.86
Sergio M. Santos            109            2.7          0.0               10.09

@whedon
Collaborator Author

whedon commented Mar 20, 2023

👋 @jgreener64, please update us on how your review is going (this is an automated reminder).

@whedon
Collaborator Author

whedon commented Mar 20, 2023

👋 @mfherbst, please update us on how your review is going (this is an automated reminder).

@mfherbst
Member

Oh sorry. I somehow missed that I was actually made a reviewer here. Sorry. I'll try to get this done within the next week.

@jgreener64
Collaborator

Protein design is an important and useful scientific problem and the authors are correct to point out the lack of variety of usable software in this area. ProtoSyn.jl can therefore be a useful tool to the community, especially in combination with the steadily growing Julia ecosystem for biomolecular and atomic modelling.

The paper is fairly well written and adequately describes the software. My concerns are mainly with the packaging and testing of the software. With the below issues addressed I think this is a strong contribution and will be of interest to the protein design community in the JuliaCon Proceedings.

Major

  • I can install the software, but both importing it (using ProtoSyn) and running the tests fail. I have opened this as an issue (Failure to import ProtoSyn sergio-santos-group/ProtoSyn.jl#52).
  • The build badge on the readme does not point to the actual build (https://app.travis-ci.com/github/sergio-santos-group/ProtoSyn.jl), which fails with the above error. Running CI every week, e.g. on GitHub Actions, would help catch errors as they appear.
  • There are no compat bounds in Project.toml; without these, users may install incompatible versions of dependencies.
  • There is a mismatch between the version number in Project.toml (0.4) and the version number on the GitHub release (1.1).
  • Whilst it is not essential, it is conventional to register the package in the Julia registry so it can be installed with add ProtoSyn.
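
To illustrate the compat point above, here is a rough sketch of what a [compat] section in Project.toml could look like. The package names and version numbers are placeholders chosen for illustration; they are not ProtoSyn.jl's actual dependencies or supported versions.

```toml
# Hypothetical excerpt of a Project.toml; all names and versions are placeholders.
[compat]
# "1.6" is shorthand for the caret specifier ^1.6, i.e. any version in [1.6.0, 2.0.0)
julia = "1.6"
# For 0.x releases the caret is stricter: "0.3" allows [0.3.0, 0.4.0)
SomeDependency = "0.3"
# Several ranges can be combined with commas
OtherDependency = "1.2, 2"
```

With entries like these, the package manager only selects dependency versions inside the declared ranges, so a breaking upstream release is not picked up silently.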

Minor

  • Briefly introduce protein design and its potential applications in the introduction of the paper.
  • Protein design is currently being revolutionised by deep learning. It is worth mentioning this at the end of the paper along with any ways that ProtoSyn.jl can be applied here.
  • The paper says that "ProtoSyn.jl is completely written in Julia" but it seems it can call out to other libraries such as TorchANI; I would rephrase this.
  • It might be worth citing https://doi.org/10.1016/j.cpc.2022.108452 as software that could be combined with ProtoSyn.jl.
  • There are some gaps in paragraphs in the PDF of the paper, see for example the start of page 4.
  • The GNU GPL 3 license of the software can be mentioned in the paper.

@mfherbst
Member

The authors have described a new Julia-based software platform for computational protein modelling. I resonate strongly with the authors' intent to bring the promising features of Julia to this domain, which without any doubt features noteworthy computational challenges and a desperate need for more modern software. Similarly, the domain of atomistic and molecular modelling is to date under-represented in the Julia ecosystem. ProtoSyn thus provides a valuable new tool for both the Julia and the biomolecular communities, and I overall recommend publication in JuliaCon proceedings.

Given that molecular and atomistic modelling is a huge field with many challenges, the authors make some rather bold statements in the manuscript, which come across as unbelievable. Moreover, the language in the manuscript is in parts sloppy and unclear. I strongly advise revising the manuscript, keeping these aspects in mind.

In contrast the code is well-documented and written clearly. I don't agree with the strong split-up in so many submodules, but that is personal taste. On top of the issues @jgreener64 has flagged, I have nothing to add.

Major points

  • The paper is inconsistent with respect to what knowledge is assumed. More basic aspects such as internal coordinates are explained, while "carbohydrates", "glycoproteins" and "TM-score" are neither explained nor put into context. I'd suggest keeping in mind that JuliaCon proceedings target computational science as a whole and (a) adding a more lay-term introduction to protein design and folding and (b) explaining terms such as the above or putting them in context (e.g. is a TM-score of 0.5927 good or bad?). This would also make it easier to appreciate why protein folding is such a tough problem and why performance is thus crucial.
  • Abstract: There are a number of open-source scientific tools for molecular simulation, that do an excellent job with respect to documentation, modularisation, software development practices etc. Psi4 is such an example. In light of that the statement "ProtoSyn.jl has the potential to modernize the way the scientific community uses simulation tools." is hardly believable, considering scientific computing (or even atomistic modelling) as a whole domain.
  • Performance bullet point: The explanation of the "two-language problem" the authors provide is unclear. To me the two-language problem is not that some languages are fast and some are slow, but more about the consequences that arise when one needs to combine two languages in a project, i.e. what is explained after the statement "This is known as the two-language problem".
  • Modular bullet point: It is not self-evident that "modular" goes hand in hand with everything being written in one programming language. In fact I would even argue one can have a perfectly modular code where the API of each C++ module is exposed and linked via Python (again Psi4 is a prime example). Clearly this suffers from other issues (performance, composability, ...) due to the two-language problem though ...
  • Conclusion: The authors claim that "arbitrarily complex protocols and simulations can be easily constructed". This statement deserves further explanation to be plausible.
  • Conclusion: The authors claim "ProtoSyn.jl's development constitutes a first attempt at a Julia-based molecular manipulation and simulation software." I don't think this is true. There are a number of other molecular simulation codes in Julia out there.
  • Because of Failure to import ProtoSyn sergio-santos-group/ProtoSyn.jl#52 I have so far not been able to run the ProtoSyn examples.

Minor points

  • Introduction: I suggest to change "namely Github" -> "such as Github" (there are other such platforms)
  • Under the hood: The capitalisation is unusual here (Residues, Graph, State, ...)
  • Under the hood: The frequent parentheses make the text hard to read.
  • Conclusion: I find it weird to find the computational examples in the conclusion. Maybe have a dedicated "Examples" section. Also provide context for the examples to the uneducated reader: Are these problems hard? How do other packages compare in terms of runtime?
  • The language is at times very sloppy, e.g. "new and better ways to do stuff". The authors should revise the manuscript appropriately.
  • It would be nice to cite Molly, AtomsBase, DFTK and other referenced software.
  • Some of the links to ipynb files on https://sergio-santos-group.github.io/ProtoSyn.jl/stable/getting-started/examples/ are broken.

@JosePereiraUA

Thank you so much for the constructive feedback.
I can certainly revise the manuscript if that is necessary, smoothing out the claims. I would like to take the opportunity to explain that ProtoSyn.jl was born almost 4 years ago, when some (if not most) of the claims made "more" sense. Fortunately, the field has exploded in the right direction, and a revision of how "revolutionary" ProtoSyn.jl is today is indeed probably advisable.

I would also like to add I've fixed the above-mentioned issue (see sergio-santos-group/ProtoSyn.jl#52 (comment)). Any tips on the next steps?

@odow
Member

odow commented Apr 20, 2023

Any tips on the next steps?

Update the paper with the suggestions. Then we can rebuild it and re-review.

@odow
Member

odow commented Sep 11, 2023

@JosePereiraUA what's the status of this?

@odow
Member

odow commented Nov 10, 2023

Hi @JosePereiraUA just checking in on this.

@odow
Member

odow commented May 16, 2024

Hi @JosePereiraUA just checking in on this again. Let me know if you need anything
