Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

src to project_name #6

Closed
chritter opened this issue Mar 28, 2022 · 14 comments · Fixed by #23
Closed

src to project_name #6

chritter opened this issue Mar 28, 2022 · 14 comments · Fixed by #23
Assignees
Labels
documentation Improvements or additions to documentation

Comments

@chritter
Copy link

Hello,

I would suggest to have the src directory named according to the project name as discussed here to follow common package development standards.

@goatsweater
Copy link
Contributor

The src directory is meant to represent the spot where people should be putting their code (vs docs, for example). The reason to keep the src directory is to ensure things are actually behaving how you expect. For example, a lot of people likely don't realize that when running unit tests their code is using the fact that python implicitly puts the current directory (.) on your path.

To your point though, encouraging a package over single script layout is something that is reasonable to lay out. What about this structure?

.
└── my_project/
    └── src/
        └── my_project/
            └── __init__.py

@chritter
Copy link
Author

chritter commented Apr 8, 2022

@goatsweater Yes, that might work. As long as the package can be installed directly via my-project.

Then we could have multiple packages under source, which can be installed separately, also from gitlab via pip..?

There might be a couple of spots where to make changes to adapt to this design including setup.py, e.g. with

setup(
    ...
    packages=find_packages(include=[[cookie_cutter_name]/my_project"]),
   ....
)

For sphinx we can let the user set the paths to the packages in the conf.py.

@goatsweater
Copy link
Contributor

I've push a pyproject.toml and setup.cfg file that reflect having packages under the src/ directory and to default build using setuptools. At a minimum it should make working with any recent pip version work well. In theory it supports having many packages, although I don't know how common that is in practice.

A lot of the metadata and options available through setup.cfg aren't defined and are left to the person creating the project (if they want to include things like author, etc).

@chritter
Copy link
Author

I would add to the docs to suggest people use the setup.cfg instead of setup.py for settings.

@chritter
Copy link
Author

@ToucheSir could test out the install of subpackages via pip.

@ToucheSir
Copy link
Contributor

I'm planning on setting up a new project repo today and will test out whether this allows for editable installs. Will loop back once I have results.

@chritter
Copy link
Author

@ToucheSir ^ bump

@ToucheSir
Copy link
Contributor

Sorry, I didn't get to touch the project code at all since setup (no coding yet), but I tried this out today. Can confirm that pip install -e . worked and I can import a package defined under src/!

@goatsweater
Copy link
Contributor

With confirmation that the approach works in principle I think the only remaining work on this is to document the use of setup.cfg over setup.py. I'll try to get that done soon.

@goatsweater goatsweater added the documentation Improvements or additions to documentation label May 13, 2022
goatsweater added a commit that referenced this issue May 19, 2022
@asolis
Copy link
Contributor

asolis commented May 26, 2022

I'm not sure if you guys actioned on this item but it seems to be open.

I suggest to follow the structure proposed by @goatsweater , more generally :

.
└──{{ coockiecutter.project_name }}/
    └── src/
        └── {{ coockiecutter.project_name }}/
            └── __init__.py

The current setup.cfg points to find: packages inside src. It will not included python modules files at the src level, and I believe this should be the intended way. This create a proper namespace for your package tied to your project name. It's currently not usable if you have python files at the src level.
Example:
With the current configuration, using src as your top level package:

.
└──{{ coockiecutter.project_name }}/
    └── src/
        └── __init.__.py  # Won't be included 
        └── file1.py         # Won't be included 
        └── conf/
            └── __init__.py
            └── app.py

Installing the package pip install . at the project root (not src) won't include file1.py even when the src/ folder contains an __init__.py inside . Only conf will make the cut. Invalidating all disregarding at src level. (pip show -f project_name won't list the files unless you manually add them)

Also creates a potentially name collision problem because without the top level package/project name using the installed package would be imported like this:
from conf import app #too generic

In the case of multiple top level packages , I think the cookiecutter should only create the template with the project name by default but the user could include more top level packages manually or rename them.

@asolis
Copy link
Contributor

asolis commented May 26, 2022

I'm not sure if you guys actioned on this item but it seems to be open.

I suggest to follow the structure proposed by @goatsweater , more generally :

.
└──{{ coockiecutter.project_name }}/
    └── src/
        └── {{ coockiecutter.project_name }}/
            └── __init__.py

The current setup.cfg points to find: packages inside src. It will not included python modules files at the src level, and I believe this should be the intended way. This create a proper namespace for your package tied to your project name. It's currently not usable if you have python files at the src level. Example: With the current configuration, using src as your top level package:

.
└──{{ coockiecutter.project_name }}/
    └── src/
        └── __init.__.py  # Won't be included 
        └── file1.py         # Won't be included 
        └── conf/
            └── __init__.py
            └── app.py

Installing the package pip install . at the project root (not src) won't include file1.py even when the src/ folder contains an __init__.py inside . Only conf will make the cut. Invalidating all disregarding at src level. (pip show -f project_name won't list the files unless you manually add them)

Also creates a potentially name collision problem because without the top level package/project name using the installed package would be imported like this: from conf import app #too generic

Another import detail to take under consideration is that if we use {{ cookiecutter.project_name }} as the package name and the project name contains dash ("-") (i.e., my-package). It's not a good identifier for a python package and it's not possible to import normally using import my-package. It's not impossible to import :

python import importlib  
my_package = importlib.import_module("my-package")

but I highly discourage this practice. Maybe a name convention of replacing non valid package identifier for "_" or just simple not accept it as a name.

@goatsweater
Copy link
Contributor

The current setup is going to create a top level project folder that consists of the English project name with any spaces the user enters replaced by a single dash (-). Otherwise, if people type _ or any other character it would survive. This should really be changed to implement <english_project>-<french_project> as the top level directory to reflect the bilingual aspects of it.

This means it is possible that the name will not be desirable (or even legal) for a Python package, and I think it would be better not to attempt to do string parsing to develop a suitable name. I remember there being discussions of having some sample code in the templates that would provide a cli.py, api.py, and associated argo template stubbed out (I may be getting those file names wrong).

A neutral approach may be to pick a directory name like mypackage that we fully expect everyone to change and have a bit of sample code in there for people to either use as a starting point or delete. Hopefully this would force a bit of thought into the package name so that things are more intuitive to find when pushed to artifactory (for things that will go that way).

@asolis
Copy link
Contributor

asolis commented Jun 3, 2022

I took a simple approach to this by introducing another cookiecutter variable :package_name to specify my versioning package or top level package (in your case). By default is rendering the value cookiecutter.project_name_en|lower|replace(‘ ‘,’_’)| replace(‘-‘,’_’) but it also allows the user to redefine it at creation.
For the datascience cookiecutter this could specify the top level package where all modules, version, cli entry point etc… stay.
In my use case I took the package_name to complete a full namespace package (i.e., dsd.hydra.module.{{ cookiecutter.package_name }}). The package_name variable is also quite useful to specify and point to ___version___, cli entry point ___main__.py , etc.

You can have a look here (These are private, waiting to do a fork from this project )

  1. cookiecutter.json
  2. hydra-module-template

@goatsweater
Copy link
Contributor

Good point about the templated value just being a default. I forgot the user had a chance to change it.

Per the PEP 8 guidelines the naming is actually a lot simpler than I thought:

Modules should have short, all-lowercase names. Underscores can be used in the module name if it improves readability. Python packages should also have short, all-lowercase names, although the use of underscores is discouraged.

This actually makes the string parsing fairly simple as every non-alpha character should be discarded except for creating an underscore.

goatsweater added a commit that referenced this issue Jun 3, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
documentation Improvements or additions to documentation
Projects
None yet
4 participants