Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Python Implementation #267

Open
5 tasks
jayqi opened this issue Feb 20, 2020 · 18 comments
Open
5 tasks

Python Implementation #267

jayqi opened this issue Feb 20, 2020 · 18 comments
Labels

Comments

@jayqi
Copy link
Collaborator

jayqi commented Feb 20, 2020

I've begun working on a Python implementation of pkgnet. 🎉

Branch is here: https://github.com/jayqi/pkgnet/tree/py-pkg/py-pkg

This issue will track progress and discussion on design decisions.

Some code standards decisions to make:

  • Linter: Currently using flake8, link to config
  • Formatter: Currently using black, link to config
  • Line length: Currently using black's default of 88 characters
  • Minimum Python version: Currently 3.6, because I'm using f-strings.
  • Docstring style: Currently using Google
@jayqi
Copy link
Collaborator Author

jayqi commented Feb 20, 2020

What I have implemented as of now:

  • Working node and edge extraction for DependencyReporter
  • Working node and edge extraction for InheritanceReporter
  • Working node and edge extraction for a new reporter, ModuleReporter, since Python has the concept of modules (R does not)
    • This currently gets the file tree network. Is this useful? Is imports a better network to extract?
  • Working implementation of DirectedGraph that uses networkx backend
    • Question: do we want to stick with networkx (which is more popular) or switch to igraph (which is what R package uses)?

Some open areas that either seem hard or I don't know much about:

  • How to parse Python code for symbolic references for FunctionReporter
    • <obj>.__code__.co_names gives strings for symbolic references, but split on dots. Looking up the object will be hard.
    • Use ast library?
  • How to implement test packages for unit tests
  • How do we want to generate a report? Some potential options:
    • Simple HTML template, use jinja2
    • Use Jupyter notebooks and nbconvert
  • Which network viz library? Static image or interactive javascript? Some options I was looking at:
    • pyvis uses vis.js which is what visNetwork in R is using
    • plotly example doesn't let you drag nodes, may need to dig into options further
    • py2cytoscape uses cytoscape.js

@jameslamb
Copy link
Collaborator

How do we want to generate a report? Some potential options:
Simple HTML template, use jinja2
Use Jupyter notebooks and nbconvert

I personally prefer HTML + jinja2. jupyter and nbconvert are ubiquitous, but also kind of heavy for what we want to do.

@jameslamb
Copy link
Collaborator

I'm ok with all of the "standards to discuss" things. I personally prefer line length 100 but not something I'm going to fight hard on.

I think I could do "How to implement test packages for unit tests". I had to do that for doppel-cli (https://github.com/jameslamb/doppel-cli/tree/master/integration_tests/test-packages/python)

@jameslamb
Copy link
Collaborator

I like pyvis for consistency with the R package. Then we can live or die with vis.js, and maybe even discover things in it that we can fix or open issues for.

@bburns632
Copy link
Collaborator

bburns632 commented Feb 21, 2020 via email

@jayqi
Copy link
Collaborator Author

jayqi commented Feb 21, 2020

I'm ok with all of the "standards to discuss" things. I personally prefer line length 100 but not something I'm going to fight hard on.

100 length sounds good to me.

I personally prefer HTML + jinja2. jupyter and nbconvert are ubiquitous, but also kind of heavy for what we want to do.

I agree. I think this is right level to make that tradeoff for what we need.

I like pyvis for consistency with the R package. Then we can live or die with vis.js, and maybe even discover things in it that we can fix or open issues for.

I'm thinking about putting the viz backend in a separate class to make it easier to replace it later. (This was also something I was thinking about doing for R.) I'll start with pyvis.

On networkx vs igraph, does igraph have a very similar API in python and
R? We don't do all too much with graph metrics at the moment, but if we
ever do, maybe a similar API would make parity easier.

Sort of. It looks like the Python version has these enormous graph classes with a bajillion methods. The R implementation was completely functional. In that way the APIs are pretty different. The parameters are similar though, probably because everything uses the same C++ underneath. Here is an example:

Something to consider:

@jayqi
Copy link
Collaborator Author

jayqi commented Feb 24, 2020

So the really annoying thing, and maybe deal-breaker, with pyvis is that they don't actually output code you can easily embed in an HTML doc. They have their own HTML document template where they use jinja2 to inject data into the raw javascript code to draw the graph.

https://github.com/WestHealth/pyvis/blob/master/pyvis/templates/template.html

In order to use this, we'd have to copy-paste their javascript code block into our templates.

@jayqi
Copy link
Collaborator Author

jayqi commented Feb 24, 2020

I'm thinking now that it may be most straightforward to just write the visualization code directly in javascript, and just inject the network data as json. This might actually be the most flexible, because we don't have to worry about existence or support of a Python interface. Which don't seen to exist in general anyways. Tt doesn't seem like there are as many good options to have Python generate the javascript code -- there isn't the same htmlwidget ecosystem that R has, as far as I can tell.

@jameslamb
Copy link
Collaborator

I'm thinking now that it may be most straightforward to just write the visualization code directly in javascript, and just inject the network data as json. This might actually be the most flexible, because we don't have to worry about existence or support of a Python interface. Which don't seen to exist in general anyways. Tt doesn't seem like there are as many good options to have Python generate the javascript code -- there isn't the same htmlwidget ecosystem that R has, as far as I can tell.

ok I like this! I think it makes a lot of sense, gives us a clearer path for "language 3", and (selfishly) is an opportunity for us to practice a little JS

@jayqi
Copy link
Collaborator Author

jayqi commented Feb 24, 2020

Making some progress!

screencapture-file-Users-Jay-Desktop-pkgnet-tmp-html-2020-02-23-22_21_52

@jayqi
Copy link
Collaborator Author

jayqi commented Feb 24, 2020

Not too much work to implement a different library. This is sigma.js.
screencapture-file-Users-Jay-Desktop-pkgnet-tmp-html-2020-02-24-07_12_57

@jayqi
Copy link
Collaborator Author

jayqi commented Mar 9, 2020

So I've got a working implementation in place for the most part. Wondering if I should open a pull request. At some point it might be easier to start making issues for smaller pieces of stuff and tackling them individually.

@jameslamb
Copy link
Collaborator

jameslamb commented Mar 10, 2020

So I've got a working implementation in place for the most part. Wondering if I should open a pull request. At some point it might be easier to start making issues for smaller pieces of stuff and tackling them individually.

@jayqi I definitely think you should open a PR! I propose that you open the PR and we only focus on resolving comments related to the API, then worry about which viz library, how to handle the weird "package name you install is different from the one you import" thing, and other issues later.

@nicolasfguillaume
Copy link

nicolasfguillaume commented Sep 2, 2020

Hi @jayqi, thanks for your great work!

After successfully installing the package, I am trying to replicate your example but this does not work for me (Python 3.7.5 on Windows 10):

Python 3.7.5 (tags/v3.7.5:5c02a39a0b, Oct 15 2019, 00:11:34) [MSC v.1916 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> from pkgnet import create_package_report
>>> create_package_report("jinja2")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Users\Nicolas\Documents\home\workspace\pkgnet\py-pkg\pkgnet\package_report.py", line 172, in create_package_report
    pkg_reporters = default_reporters()
  File "C:\Users\Nicolas\Documents\home\workspace\pkgnet\py-pkg\pkgnet\package_report.py", line 33, in default_reporters
    DependencyReporter(),
TypeError: Can't instantiate abstract class DependencyReporter with abstract methods network_measures

Any help would be greatly appreciated.

Thank you so much,

Nicolas

@jayqi
Copy link
Collaborator Author

jayqi commented Sep 3, 2020

@nicolasfguillaume Thanks for trying it out! Sorry that you had an error—it looks like I had a bug sneak in there. #281 should fix it.

@bburns632 bburns632 reopened this Apr 19, 2024
@bburns632
Copy link
Collaborator

bburns632 commented Apr 19, 2024

Now that CI is lower cost (in terms of time), I'd like to test what appetite there is to update the docs and publish to PyPy. I feel guilty we're let this sit for years.

LOE:

  • Name it!
  • New docs
  • New CI
  • Handful of new issues after release

How about it @jayqi @jameslamb?

@jayqi
Copy link
Collaborator Author

jayqi commented May 3, 2024

I haven't touched my implementation nearly 4 years. I don't know if it's really ready for users (I vaguely recall some stuff was still kind of janky), and I'm not sure I want to commit to developing/maintaining it.

@jameslamb
Copy link
Collaborator

jameslamb commented May 3, 2024

+1 I can't to that, I can't commit to supporting the Python version.

In my opinion, you or @jayqi should do one of the following:

  • just delete that branch
  • publish the code to a separate public repo with permissive license then immediately archive that repo

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

4 participants