Notebooks from environments cannot find src #168

Closed
jraviotta opened this issue May 23, 2019 · 17 comments

Comments

@jraviotta
Contributor

Related to #143 #76

Configuration WSL

Name Version Build Channel
alabaster 0.7.12 pypi_0 pypi
arrow 0.13.2 py36_0 conda-forge
asn1crypto 0.24.0 py36_1003 conda-forge
attrs 19.1.0 py_0 conda-forge
awscli 1.16.164 pypi_0 pypi
babel 2.6.0 pypi_0 pypi
backcall 0.1.0 py_0 conda-forge
binaryornot 0.4.4 py_1 conda-forge
bleach 3.1.0 py_0 conda-forge
botocore 1.12.154 pypi_0 pypi
bzip2 1.0.6 h14c3975_1002 conda-forge
ca-certificates 2019.3.9 hecc5488_0 conda-forge
certifi 2019.3.9 py36_0 conda-forge
cffi 1.12.3 py36h8022711_0 conda-forge
chardet 3.0.4 py36_1003 conda-forge
click 7.0 py_0 conda-forge
colorama 0.3.9 pypi_0 pypi
conda 4.6.14 py36_0 conda-forge
conda-env 2.6.0 1 conda-forge
cookiecutter 1.6.0 py36_1000 conda-forge
coverage 4.5.3 pypi_0 pypi
cryptography 2.6.1 py36h72c5cf5_0 conda-forge
cryptography-vectors 2.6.1 py_0 conda-forge
cycler 0.10.0 pypi_0 pypi
dbus 1.13.6 he372182_0 conda-forge
decorator 4.4.0 py_0 conda-forge
defusedxml 0.5.0 py_1 conda-forge
docutils 0.14 pypi_0 pypi
entrypoints 0.3 py36_1000 conda-forge
expat 2.2.5 hf484d3e_1002 conda-forge
flake8 3.7.7 pypi_0 pypi
fontconfig 2.13.1 he4413a7_1000 conda-forge
freetype 2.10.0 he983fc9_0 conda-forge
future 0.17.1 py36_1000 conda-forge
gettext 0.19.8.1 hc5be6a0_1002 conda-forge
glib 2.58.3 hf63aee3_1001 conda-forge
gmp 6.1.2 hf484d3e_1000 conda-forge
gst-plugins-base 1.14.4 hdf3bae2_1001 conda-forge
gstreamer 1.14.4 h66beb1c_1001 conda-forge
icu 58.2 hf484d3e_1000 conda-forge
idna 2.8 py36_1000 conda-forge
imagesize 1.1.0 pypi_0 pypi
ipykernel 5.1.1 py36h24bf2e0_0 conda-forge
ipython 7.5.0 py36h24bf2e0_0 conda-forge
ipython_genutils 0.2.0 py_1 conda-forge
ipywidgets 7.4.2 py_0 conda-forge
isort 4.3.20 py36_0 conda-forge
jedi 0.13.3 py36_0 conda-forge
jinja2 2.10.1 py_0 conda-forge
jinja2-time 0.2.0 py_2 conda-forge
jmespath 0.9.4 pypi_0 pypi
jpeg 9c h14c3975_1001 conda-forge
jsonschema 3.0.1 py36_0 conda-forge
jupyter 1.0.0 py_2 conda-forge
jupyter-contrib-core 0.3.3 pypi_0 pypi
jupyter-contrib-nbextensions 0.5.1 pypi_0 pypi
jupyter-highlight-selected-word 0.2.0 pypi_0 pypi
jupyter-latex-envs 1.4.6 pypi_0 pypi
jupyter-nbextensions-configurator 0.4.1 pypi_0 pypi
jupyter_client 5.2.4 py_3 conda-forge
jupyter_console 6.0.0 py_0 conda-forge
jupyter_contrib_core 0.3.3 py_2 conda-forge
jupyter_core 4.4.0 py_0 conda-forge
jupyter_highlight_selected_word 0.2.0 py36_1000 conda-forge
jupyter_latex_envs 1.4.4 py36_1000 conda-forge
jupyterlab 0.35.6 py36_0 conda-forge
jupyterlab_server 0.2.0 py_0 conda-forge
jupyterthemes 0.20.0 pypi_0 pypi
kiwisolver 1.0.1 pypi_0 pypi
lesscpy 0.13.0 pypi_0 pypi
libedit 3.1.20170329 hf8c457e_1001 conda-forge
libffi 3.2.1 he1b5a44_1006 conda-forge
libgcc-ng 8.2.0 hdf63c60_1
libiconv 1.15 h516909a_1005 conda-forge
libpng 1.6.37 hed695b0_0 conda-forge
libsodium 1.0.16 h14c3975_1001 conda-forge
libstdcxx-ng 8.2.0 hdf63c60_1
libuuid 2.32.1 h14c3975_1000 conda-forge
libxcb 1.13 h14c3975_1002 conda-forge
libxml2 2.9.9 h13577e0_0 conda-forge
libxslt 1.1.32 h4785a14_1002 conda-forge
lxml 4.3.0 pypi_0 pypi
markupsafe 1.1.1 py36h14c3975_0 conda-forge
matplotlib 3.0.2 pypi_0 pypi
mccabe 0.6.1 pypi_0 pypi
mistune 0.8.4 py36h14c3975_1000 conda-forge
nb_conda 2.2.1 py36_2 conda-forge
nb_conda_kernels 2.2.2 py36_0 conda-forge
nbconvert 5.5.0 py_0 conda-forge
nbformat 4.4.0 py_1 conda-forge
nbstripout 0.3.5 py_0 conda-forge
ncurses 6.1 hf484d3e_1002 conda-forge
nodejs 11.14.0 he1b5a44_1 conda-forge
notebook 5.7.8 py36_0 conda-forge
numpy 1.16.0 pypi_0 pypi
openssl 1.1.1b h14c3975_1 conda-forge
packaging 19.0 pypi_0 pypi
pandoc 2.7.2 0 conda-forge
pandocfilters 1.4.2 py_1 conda-forge
parso 0.4.0 py_0 conda-forge
pcre 8.41 hf484d3e_1003 conda-forge
pexpect 4.7.0 py36_0 conda-forge
pickleshare 0.7.5 py36_1000 conda-forge
pip 19.1.1 pypi_0 pypi
ply 3.11 pypi_0 pypi
poyo 0.4.2 py_0 conda-forge
prometheus_client 0.6.0 py_0 conda-forge
prompt_toolkit 2.0.9 py_0 conda-forge
pthread-stubs 0.4 h14c3975_1001 conda-forge
ptyprocess 0.6.0 py_1001 conda-forge
pyasn1 0.4.5 pypi_0 pypi
pycodestyle 2.5.0 pypi_0 pypi
pycosat 0.6.3 py36h14c3975_1001 conda-forge
pycparser 2.19 py36_1 conda-forge
pyflakes 2.1.1 pypi_0 pypi
pygments 2.4.0 py_0 conda-forge
pyopenssl 19.0.0 py36_0 conda-forge
pyparsing 2.3.1 pypi_0 pypi
pyqt 5.9.2 py36hcca6a23_0 conda-forge
pyrsistent 0.15.2 py36h516909a_0 conda-forge
pysocks 1.7.0 py36_0 conda-forge
python 3.6.7 h381d211_1004 conda-forge
python-dateutil 2.8.0 py_0 conda-forge
python-dotenv 0.10.2 pypi_0 pypi
pytz 2019.1 pypi_0 pypi
pyyaml 3.13 pypi_0 pypi
pyzmq 18.0.1 py36hc4ba49a_1 conda-forge
qt 5.9.7 h52cfd70_1 conda-forge
qtconsole 4.4.4 py_0 conda-forge
readline 7.0 hf8c457e_1001 conda-forge
requests 2.22.0 py36_0 conda-forge
rsa 3.4.2 pypi_0 pypi
ruamel_yaml 0.15.71 py36h14c3975_1000 conda-forge
s3transfer 0.2.0 pypi_0 pypi
send2trash 1.5.0 py_0 conda-forge
setuptools 41.0.1 py36_0 conda-forge
simplegeneric 0.8.1 py_1 conda-forge
sip 4.19.8 py36hf484d3e_1000 conda-forge
six 1.12.0 py36_1000 conda-forge
snowballstemmer 1.2.1 pypi_0 pypi
sphinx 2.0.1 pypi_0 pypi
sphinxcontrib-applehelp 1.0.1 pypi_0 pypi
sphinxcontrib-devhelp 1.0.1 pypi_0 pypi
sphinxcontrib-htmlhelp 1.0.2 pypi_0 pypi
sphinxcontrib-jsmath 1.0.1 pypi_0 pypi
sphinxcontrib-qthelp 1.0.2 pypi_0 pypi
sphinxcontrib-serializinghtml 1.1.3 pypi_0 pypi
sqlite 3.28.0 h8b20d00_0 conda-forge
src 0.1.0 dev_0
terminado 0.8.2 py36_0 conda-forge
testpath 0.4.2 py_1001 conda-forge
tk 8.6.9 h84994c4_1001 conda-forge
tornado 6.0.2 py36h516909a_0 conda-forge
tqdm 4.31.1 pypi_0 pypi
traitlets 4.3.2 py36_1000 conda-forge
urllib3 1.24.3 py36_0 conda-forge
wcwidth 0.1.7 py_1 conda-forge
webencodings 0.5.1 py_1 conda-forge
wheel 0.33.4 py36_0 conda-forge
whichcraft 0.5.2 py_1 conda-forge
widgetsnbextension 3.4.2 py36_1000 conda-forge
xorg-libxau 1.0.9 h14c3975_0 conda-forge
xorg-libxdmcp 1.1.3 h516909a_0 conda-forge
xz 5.2.4 h14c3975_1001 conda-forge
yaml 0.1.7 h14c3975_1001 conda-forge
yapf 0.27.0 py_0 conda-forge
zeromq 4.3.1 hf484d3e_1000 conda-forge
zlib 1.2.11 h14c3975_1004 conda-forge

Steps to reproduce

  • Install Miniconda
  • Install Jupyter lab
  • Install cookiecutter
  • Install nb_conda_kernels
  • Create conda environment with conda create -n yourenvname python=3
  • Configure environment with
conda activate yourenvname
conda install ...
  • Return to base environment with conda deactivate
  • Create new project with cookiecutter https://github.com/drivendata/cookiecutter-data-science
  • Execute make data
  • Confirm success with
make test_environment  
python3 test_environment.py  
Development environment passes all tests!  
  • Execute jupyter lab
  • In Jupyter Lab, navigate to myproject>notebooks
  • From jupyter lab launcher create a new notebook using the conda environment yourenvname
  • Execute first cell with
# OPTIONAL: Load the "autoreload" extension so that code can change
%load_ext autoreload

# OPTIONAL: always reload modules so that as you change code in src, it gets loaded
%autoreload 2

from src.data import make_dataset

Error

---------------------------------------------------------------------------
ModuleNotFoundError                       Traceback (most recent call last)
<ipython-input-1-f47036c946b3> in <module>
      5 get_ipython().run_line_magic('autoreload', '2')
      6 
----> 7 from src.data import make_dataset

ModuleNotFoundError: No module named 'src'

The error does not occur when using a python3 notebook instead of a notebook with the environment kernel.

It appears that the project package is not recognized by the notebook because it is not installed in the environment kernel.
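
A quick way to test this diagnosis from inside a notebook is to ask the kernel which interpreter it is running and whether `src` resolves there. The sketch below uses only the standard library; `describe_import` is a hypothetical helper name chosen for illustration, not something the template provides.

```python
import importlib.util
import sys


def describe_import(module_name):
    """Report the running interpreter and whether a top-level module resolves."""
    spec = importlib.util.find_spec(module_name)
    return {
        "interpreter": sys.executable,  # the Python binary behind this kernel
        "environment": sys.prefix,      # the environment that binary belongs to
        "found": spec is not None,
        "location": getattr(spec, "origin", None),
    }


# Run this cell once per kernel and compare the output.
for key, value in describe_import("src").items():
    print(f"{key}: {value}")
```

If the two kernels report different interpreters and only one of them finds `src`, the package is installed in one environment but not the other.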

I saw the recommendation to install jupyter in the environment, but that seems to be contrary to the design of jupyter lab/notebook and environments. One wants to have one jupyter install in the base environment so one can traverse all projects but still isolate notebooks within kernels.

This issue, together with the conversations in #164, #118, and #83, suggests that environments are a source of confusion and complexity. It would be nice to let the user choose the package manager and then keep all environment-specific steps encapsulated behind that choice. That would make maintenance easier than either writing complex conditional logic in the Makefile or forcing users to debug incorrect assumptions about their starting environment.

In the meantime, perhaps someone could suggest a command to import src into an existing environment.

@pjbull
Member

pjbull commented May 24, 2019

Thanks for the thoughtful overview, @jraviotta!

The real problem with this particular issue is that you never do something like pip install -e . in the local directory to make src available. This currently happens in requirements.txt when that is used, but not in environments where you only use conda. This will change with #162 where we take advantage of the updates to conda that let us install packages with pip.
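
One way to make sure the editable install lands in the environment that actually needs it is to run pip through that environment's own interpreter. This is a minimal sketch, assuming the project root contains `setup.py`; `editable_install_command` and `install_editable` are illustrative names, not part of the template.

```python
import subprocess
import sys
from pathlib import Path


def editable_install_command(project_root):
    """Build a pip command that targets the interpreter running this code."""
    return [sys.executable, "-m", "pip", "install", "-e", str(Path(project_root))]


def install_editable(project_root):
    """Run the install; call this from the environment that needs `src`."""
    subprocess.check_call(editable_install_command(project_root))


# Only prints the command. From a notebook in notebooks/, the project root is "..".
print(" ".join(editable_install_command("..")))
```

Using `sys.executable -m pip` rather than a bare `pip` avoids accidentally installing into whichever environment happens to be first on `PATH`.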

The jupyter kernel issue is basically not related because if the package was installed in the kernel environment, that workflow should be supported.

I think that what is in #162 should support this workflow once it is ready.

@jraviotta
Contributor Author

@pjbull I agree with your assessment. I took a look at the feature branch for #162 and also agree that the local module should be available using the new workflow. I really like the modifications. It should make maintaining the current features easier, and also smooth the addition of new ones.

I'm not going to try to rebase #170 on top of the new branch because the new branch achieves the same objective in a different way. I'll see if there is anything on that checklist I can help to move forward.

@fhaust

fhaust commented Jul 25, 2019

So ... what would be the "upgrade path" for somebody with an old conda environment based project in this case? I feel like I have tried every possible combination and still can't use the local files in the jupyter notebooks.

@pjbull
Member

pjbull commented Jul 25, 2019

@fhaust What do your directory structure and pip freeze look like?

pip install -e . should work in a conda environment without any changes. If your project was created before March 2018, you may be missing setup.py. You would need to copy this setup.py to the project root and fill in the variables manually.

If you are using environment.yml you can add the following to the file:

name: ....
channels:
 - channel 
dependencies:
 - conda_package
 - pip:     # these two lines are
   - -e .   # what you need in environment.yml

@jraviotta
Contributor Author

So ... what would be the "upgrade path" for somebody with an old conda environment based project in this case? I feel like I have tried every possible combination and still can't use the local files in the jupyter notebooks.

I create a conda base environment as described here. Then I create a conda environment for each project and include an environment.yml with the following, plus any other packages.

channels:
  - conda-forge
  - defaults
dependencies:
  - ipykernel
  - setuptools
  - wheel
  - pip
  - pip:
     - -e .

That installs the local code into the new environment and provides a kernel for the environment in the jupyter notebook.

@fhaust

fhaust commented Jul 25, 2019

@pjbull

So I do have the setup.py ... that apparently was already a thing when I created the project.

I have an environment.yml that contains the pip and -e . lines. I put them in there by hand this morning; do I have to reload it somehow?

The thing is, my environment.yml wasn't called that in the beginning. I think I created it manually and named it after the folder the project resides in. I have just renamed it, and it still doesn't recognize the local folders.

What has changed is that which python and which jupyter now point back to the global installations. Does that pose a problem?

@pjbull
Member

pjbull commented Jul 25, 2019

Have you read the docs on creating and managing conda environments? If not, that's the best place to start:
https://docs.conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html

Once you have a conda environment activated, you can update it from the latest environment.yml like so:
conda env update --file environment.yml

@fhaust

fhaust commented Jul 25, 2019

I tried the env update before and just repeated it to no avail. It reported some inconsistencies, but I don't think that should be a problem?

I should probably mention that I have had to specify the current conda executable via CONDA_EXE for a while now. No idea if that is a problem?

Edit: Also conda env export does not include the '-e .' line afterwards.
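
Independently of what `conda env export` prints, the environment itself can be asked whether the local package is installed. This is a minimal sketch, assuming Python 3.8 or newer (for `importlib.metadata`) and that the distribution is named `src`, as in the package list at the top of this issue; `installed_version` is a hypothetical helper.

```python
from importlib import metadata


def installed_version(distribution_name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(distribution_name)
    except metadata.PackageNotFoundError:
        return None


# Run this with the interpreter (or kernel) of the environment being checked.
print("src:", installed_version("src"))
```

A result of `None` would suggest that the `-e .` line was never applied to that particular environment.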

@pjbull
Member

pjbull commented Jul 25, 2019

@fhaust At this point, it's pretty clear that this isn't a problem with cookiecutter-data-science but instead with your conda environment configuration. We can't debug that for you remotely.

My advice is to (1) uninstall and reinstall conda, (2) create a brand new environment for your project, (3) install the local package for access in notebooks as described above.

Good luck!

@fhaust

fhaust commented Jul 29, 2019

I've tried reinstalling conda (there was indeed a problem with my installation), created a new environment and reinstalled the local package. But Jupyter does not recognize it.

Let's try the other way around, can you point me to a branch and/or commit of this project that should work with conda environments?

@pjbull
Member

pjbull commented Jul 29, 2019

All of the branches/commits including the current one will work with conda environments. Again, this isn't a CCDS issue since we don't do anything special with conda environments (or any environment for that matter).

Run which jupyter at your terminal. I suspect you are running the root environment jupyter in conda rather than in your conda environment. If the result of that does not have the environment name, you should:
(1) Activate the conda environment if it is not activated
(2) Run pip install jupyter or conda install jupyter inside the environment.
(3) Run which jupyter to ensure you are using the version in the environment

Now when you run jupyter you should be able to load the package that is installed in that environment. This is a byproduct of how conda sets up environments and Jupyter kernels and not a cookiecutter issue.
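
The same check can be scripted. The sketch below looks up the `jupyter` executable on `PATH` and tests whether it sits under the prefix of the Python that runs the script; `is_inside` is an illustrative helper name, and the comparison is a plain path check, so treat the result as a hint rather than proof.

```python
import shutil
import sys
from pathlib import Path


def is_inside(path, prefix):
    """Return True if `path` lies under the directory `prefix`."""
    try:
        Path(path).relative_to(prefix)
        return True
    except ValueError:
        return False


jupyter_path = shutil.which("jupyter")  # None if no jupyter is on PATH
if jupyter_path is None:
    print("no jupyter executable found on PATH")
else:
    print(f"jupyter: {jupyter_path}")
    print(f"same environment as this Python: {is_inside(jupyter_path, sys.prefix)}")
```

Run it with the environment activated. In a typical conda layout, environments live under the base prefix, so the check is only meaningful when the running Python is the environment's interpreter and not the base one.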

In exchange for the support on this issue, would you submit a PR adding a note to the docs that explains how to confirm that your running version of Jupyter and your local install of the package are in the same environment? Just a couple of lines at the end of this section will help other people who encounter this issue.

@fhaust

fhaust commented Aug 6, 2019

I basically gave up on this ...

But after just reinstalling Jupyter Lab I now get the option to choose a Python [conda env:.conda-xxx] kernel. No idea what changed, but with that kernel I can now include the local folders. 🤷‍♂️

@antimora

antimora commented Oct 19, 2019

Another option is to specify PYTHONPATH before launching jupyter. For my own convenience I added a new make target to the Makefile:

## Run Jupyter
jupyter:
	PYTHONPATH=$(abspath ./src) jupyter notebook --ip=0.0.0.0

I prefer this approach since I use Docker for my projects.
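
Which directory goes into PYTHONPATH determines the import names that resolve. The sketch below builds a throwaway project in a temporary directory (the layout is an assumption mirroring the template's `src/data`), starts a fresh interpreter with PYTHONPATH set two different ways, and reports whether `src` and `data` can be found; `probe_imports` is a hypothetical helper.

```python
import os
import subprocess
import sys
import tempfile
from pathlib import Path

# One-liner run in the child interpreter: can it locate `src` and `data`?
PROBE = (
    "import importlib.util as u; "
    "print(u.find_spec('src') is not None, u.find_spec('data') is not None)"
)


def probe_imports(pythonpath, cwd):
    """Start a fresh interpreter with PYTHONPATH set and report what resolves."""
    env = {**os.environ, "PYTHONPATH": str(pythonpath)}
    result = subprocess.run(
        [sys.executable, "-c", PROBE],
        env=env, cwd=str(cwd), capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "project"
    (root / "src" / "data").mkdir(parents=True)
    (root / "src" / "__init__.py").touch()
    (root / "src" / "data" / "__init__.py").touch()
    elsewhere = Path(tmp) / "elsewhere"  # neutral working directory
    elsewhere.mkdir()

    root_result = probe_imports(root, elsewhere)         # project root on the path
    src_result = probe_imports(root / "src", elsewhere)  # ./src itself on the path
    print("PYTHONPATH=project     (src found, data found):", root_result)
    print("PYTHONPATH=project/src (src found, data found):", src_result)
```

With `./src` itself on the path, subpackages are importable as `data`, while `from src.data import ...` needs the project root on the path instead.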

@dioptx

dioptx commented Dec 29, 2019

I always use this snippet in order to enable relative imports to my notebooks:

import os
import sys
module_path = os.path.abspath(os.path.join('..'))
if module_path not in sys.path:
    sys.path.append(module_path)

This makes sure that imports relative to the project root work just fine. Hope this helps.
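
A variant of the same idea that does not depend on how deep the notebook sits: walk upward until a directory containing `src` is found, then add that directory to `sys.path`. This is a sketch under the assumption that the project root is the nearest ancestor holding a `src` folder; `find_project_root` and `add_project_root_to_path` are hypothetical names.

```python
import sys
import tempfile
from pathlib import Path


def find_project_root(start, marker="src"):
    """Walk upward from `start` until a directory containing `marker` is found."""
    start = Path(start).resolve()
    for candidate in (start, *start.parents):
        if (candidate / marker).is_dir():
            return candidate
    return None


def add_project_root_to_path(start="."):
    """Put the project root on sys.path (once) and return it, or None if absent."""
    root = find_project_root(start)
    if root is not None and str(root) not in sys.path:
        sys.path.append(str(root))
    return root


# Demonstration on a throwaway layout: project/src and project/notebooks/exploratory.
with tempfile.TemporaryDirectory() as tmp:
    project = Path(tmp).resolve() / "project"
    (project / "src").mkdir(parents=True)
    (project / "notebooks" / "exploratory").mkdir(parents=True)
    found = find_project_root(project / "notebooks" / "exploratory")
    print(found == project)
```

In a notebook, calling `add_project_root_to_path()` in the first cell would replace the hard-coded `'..'`.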

@jgrooviest

@pjbull attempting to run notebooks from the notebooks folder still results in a "ModuleNotFoundError". Interestingly, opening up python from Anaconda prompt does allow an import from the main package folder, and from inside the notebooks folder...

What sequence of commands is required, after running cookiecutter https://github.com/drivendata/cookiecutter-data-science (plus the automatic inputs) followed by pip install -e . from the parent directory, to be able to import scripts from the src directory from inside the notebooks directory? (I am trying to avoid appending to my path.)

@jgrooviest

Never mind, using setuptools and find_packages() appears to solve the issue: https://setuptools.readthedocs.io/en/latest/userguide/package_discovery.html

@pjbull
Member

pjbull commented May 28, 2021

Yep, that should work and is baked into the default template already.
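
Since find_packages() only discovers directories that contain an `__init__.py`, a missing one is a plausible reason for a subpackage not being picked up. The sketch below lists such directories using only the standard library; `dirs_missing_init` is a hypothetical helper, and the temporary layout is made up for the demonstration.

```python
import tempfile
from pathlib import Path


def dirs_missing_init(package_dir):
    """List directories (package_dir included) holding .py files but no __init__.py."""
    package_dir = Path(package_dir)
    subdirs = sorted(p for p in package_dir.rglob("*") if p.is_dir())
    missing = []
    for directory in [package_dir, *subdirs]:
        has_python = any(f.suffix == ".py" for f in directory.iterdir() if f.is_file())
        if has_python and not (directory / "__init__.py").exists():
            # Report paths relative to the project root, e.g. "src/data".
            missing.append(directory.relative_to(package_dir.parent).as_posix())
    return missing


with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "src"
    (src / "data").mkdir(parents=True)
    (src / "__init__.py").touch()
    (src / "data" / "make_dataset.py").touch()  # note: no src/data/__init__.py
    report = dirs_missing_init(src)
    print(report)
```

An empty list for the real `src` directory would rule this cause out.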
