Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Pre-commit Hooks for Code Formatting and Linting - Resolves #188 #323

Closed
wants to merge 4 commits into from

Conversation

smb-h
Copy link

@smb-h smb-h commented Sep 28, 2023

Description

This pull request addresses issue #188 by adding pre-commit hooks to enhance code quality and consistency in the cookiecutter-data-science project. The following tools have been integrated as pre-commit hooks:

  • pre-commit-hooks: A collection of helpful pre-commit hooks for code linting and formatting.
  • mirrors-prettier: Ensures consistent code formatting using Prettier.
  • black: Automatically formats Python code according to PEP 8 standards.
  • isort: Sorts Python imports to comply with PEP 8 and PEP 257.
  • flake8: Performs code linting to identify and report style issues and potential errors.

Additional Notes

The project documentation has been updated to include information on how to set up and use these pre-commit hooks.
The changes have been tested locally and do not introduce any regressions or errors.
This PR is ready for review and can be merged into the main branch.

This PR is a part of my ongoing effort to improve code quality and maintainability in the cookiecutter-data-science project. I look forward to your feedback and suggestions for further improvements.

@smb-h
Copy link
Author

smb-h commented Oct 10, 2023

Hi @pjbull,

I'd appreciate your review on this PR addressing issue #188. It adds pre-commit hooks for code quality and consistency in cookiecutter-data-science. I've updated the docs and thoroughly tested it locally. Your input would be valuable. Thanks in advance for your time!

Best,
SMBH

@pjbull
Copy link
Member

pjbull commented Oct 10, 2023

Thanks @smb-h, I appreciate your work and thinking here, and the focus on code quality is great.

That said, as I noted in #188, I'm still not sold on pre-commit being the right tool by default for all projects.

This is for a few reasons:

  • I think the walk before you run philosophy of including these tools in your dependencies and running them normally is a better initial practice for most people and teams. pre-commit adds an extra layer of abstraction.
  • Since we don't initialize a .git repo by default, it doesn't do anything by default, which is potentially confusing or frustrating.
  • I don't like pre-commit's independent dependency management process. I think the tools should be installed and configured as part of a project's standard dev environment way.
  • Once that's done, it's easy to add a pre-commit hook that just runs make lint so the value-add of pre-commit is primarily in formatting that output.

We're currently working on shipping v2 with the key things we need to switch over to a more flexible templating CLI. This will allow us to have more extensibility and allow more options during the project configuration flow.

At the same time, we're moving things into a roadmap for v2.1 where those are added options that use the new functionality. I'm open to discussing pre-commit for 2.1, but at the moment, it's my opinion it's not the right project default.

@jamesmyatt
Copy link
Contributor

For what it's worth, as a lurker, I can see why you might want to exclude it from the default (i.e. walk-before-you-can-run), but I'm generally in favour of pre-commit for most projects. This is for a few reasons:

  • I do like pre-commit's independent dependency management process. It simplifies your dev environment and reduces the risk of conflicts between your dev tools and their dependencies and your project code and its dependencies.
  • It makes easy to use other pre-commit hooks than just black, isort and flake8, such as checking line endings, JSON/YAML validation and formatting, etc. They don't have to be Python-based and even the Python ones don't have to use the same version of Python as your project code.
  • Doesn't use make, which is a pain on non-Linux platforms.

@pjbull
Copy link
Member

pjbull commented Oct 10, 2023

Thanks, appreciate the comments @jamesmyatt. The last two points are well taken.

It simplifies your dev environment and reduces the risk of conflicts between your dev tools and their dependencies and your project code and its dependencies.

On this that I have often seen things go wrong when a dev runs black or flake8 at the CLI and pre-commit has a different version so the check fails. I've never seen a substantive dependency conflict between these tools and necessary package versions that can't be easily resolved, e.g. through pinning a tool version for the dev environment.

@smb-h
Copy link
Author

smb-h commented Oct 10, 2023

I'd like to express my gratitude for your input, @jamesmyatt. Totally agree with you.

I appreciate @pjbull perspective on dependency conflicts. While conflicts between pre-commit and other tool versions can occasionally occur, they can often be resolved by pinning the appropriate tool version for the development environment. It's important to strike a balance between the advantages of independent dependency management and ensuring a consistent experience for all contributors.

I'd like to emphasize that, especially for growing projects like ours, maintaining a structured and standardized code and environment is crucial. Pre-commit hooks help us achieve this by enforcing code quality and formatting standards, ultimately contributing to the project's long-term maintainability and scalability.

I understand that, in your opinion, this addition might not be deemed necessary at this moment. However, I'd like to emphasize that, from a software development perspective (acknowledging my limited experience), the inclusion of pre-commit hooks does not significantly add complexity to the project.

In my interactions with various teams and contributors, I've observed that such tools can be invaluable for maintaining code quality and consistency, especially in collaborative environments.

That said, I respect your decision, and if the consensus leans towards not incorporating this change, I fully understand. I initially added it manually because I found it helpful for my own use of cookiecutter-data-science, and I noticed similar requests in our issues. My aim was to make the development experience smoother for others as well.

Once again, I appreciate your engagement in this discussion, and I'm open to any outcome that aligns with our project's best interests.

Thank you both for your valuable input and insights.

@pjbull
Copy link
Member

pjbull commented Oct 10, 2023

maintaining a structured and standardized code and environment is crucial. Pre-commit hooks help us achieve this by enforcing code quality and formatting standards, ultimately contributing to the project's long-term maintainability and scalability.

Yes, agreed. We are on the same page here about the value of linting, formatting, and code standards. V2 has linting and formatting commands with flake8/black/isort for exactly this purpose. These commands will be able to be used both in git-native pre commit flows locally and also for CI/CD. I lean towards not adding the specific pre-commit tool as an additional layer on top of that.

I will, however, start a discussion on this topic once we ship v2 so we can have other voices chime in.

Appreciate the discussion!

@pjbull pjbull deleted the branch drivendataorg:v2 May 22, 2024 16:53
@pjbull pjbull closed this May 22, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants