Add a cheatsheet #723

Open
cigrainger opened this issue Oct 23, 2023 · 8 comments

@cigrainger
Member

I'm excited about cheatsheets and something like this would beat "Ten minutes to Explorer", especially for those coming from dplyr or pandas who just need an easy reference.

dplyr: https://nyu-cdsc.github.io/learningr/assets/data-transformation.pdf
also dplyr: https://www.rstudio.com/wp-content/uploads/2015/02/data-wrangling-cheatsheet.pdf
pandas: https://pandas.pydata.org/Pandas_Cheat_Sheet.pdf

I started playing over the weekend but I think the diagrams are really powerful and got myself stuck trying to figure out how to replicate them in vega lite.

@billylanchantin
Contributor

billylanchantin commented Oct 23, 2023

I'd love to see cheatsheets too! Personally, I think it would've saved me some documentation hunting.

For example, early on I saw there was a DataFrame.filter/2 but I couldn't find a corresponding Series.filter/2. I eventually found Series.mask, but it wasn't obvious. Something like a task-oriented cheatsheet would've made that much easier to find.

> I think the diagrams are really powerful and got myself stuck trying to figure out how to replicate them in vega lite.

I also feel like I've lost a good bit of time trying to get vegalite to output specific visualizations. It's great for a lot of cases. But when you start doing non-data-driven things like annotating your visualization, I found that it gets tricky.

Were you thinking of doing a .cheatmd? Or a .pdf?

@cigrainger
Member Author

Glad to hear it would be helpful! I was thinking of doing a .cheatmd so we can easily put it into the docs. I feel like it should be possible to get close with heatmaps based on specific categorical values. And I also agree that it's confusing we have Series.mask/2 instead of Series.filter/2.

@philss or @josevalim I'm sure there was a convo about this but I can't remember what the reasoning was for this anymore. #326 (comment)

@josevalim
Member

We changed the implementation and renamed at the same time but I am fine with reverting the name back to filter. :) It should be a quick change and we can add:

    @deprecated "Use Explorer.Series.filter/2 instead"
    def mask(s1, s2), do: filter(s1, s2)

It will certainly be much easier to find.

@billylanchantin
Contributor

Oh I didn't mean to pick up a stray issue! I was just using it as an example.

> We changed the implementation and renamed at the same time but I am fine with reverting the name back to filter.

It may be worth having both since mask/2 and filter/2 accept different datatypes: mask/2 takes a boolean series, while filter/2 and filter_with/2 take a query or a function. If you have the boolean series on hand, you'd want mask/2. But if you find you need to build the boolean series (e.g. with transform/2) only to pass it right into mask/2, filter/2 would be convenient.
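
To make that distinction concrete, here is a minimal sketch. It assumes Explorer `~> 0.7` pulled in through `Mix.install`, and the data and the column name `n` are made up for illustration; the exact API may differ in other versions.

```elixir
# Assumption: Explorer ~> 0.7 is available from Hex.
Mix.install([{:explorer, "~> 0.7"}])

# DataFrame.filter/2 is a macro, so the module has to be required.
require Explorer.DataFrame, as: DF
alias Explorer.Series

# Illustrative data only.
s = Series.from_list([1, 2, 3, 4])

# mask/2: a boolean series of the same length is already on hand.
keep = Series.greater(s, 2)
masked = Series.mask(s, keep)

# DataFrame.filter/2: the query builds the boolean series for you.
df = DF.new(n: [1, 2, 3, 4])
filtered = DF.filter(df, n > 2)

IO.inspect(Series.to_list(masked))        # => [3, 4]
IO.inspect(Series.to_list(filtered["n"])) # => [3, 4]
```

Both calls keep the same values here; mask/2 is the shorter path only when the boolean series already exists.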

@billylanchantin
Contributor

> I feel like it should be possible to get close with heatmaps based on specific categorical values.

I agree! I was more worried about the arrows:

[Image: screenshot, 2023-10-23]

Though if you could embed the diagrams in a table, you could probably achieve a similar effect w/o the need for the arrow annotations.

@josevalim
Member

> It may be worth having both since mask/2 and filter/2 accept different datatypes: mask/2 takes a boolean series, while filter/2 and filter_with/2 take a query or a function.

The issue is that doing it with a function is horribly expensive and should be generally avoided.

@billylanchantin
Contributor

Hey I wrote this a little after the earlier discussion. If it's not helpful just ignore me :)

https://vega.github.io/editor/#/gist/e0675e1408ba1944deb1a747f03a060d/spec.json

[Images: two screenshots side by side, labelled "DPLYR" and "VegaLite", 2023-10-26]

Note that it does appear to be possible to add margins to the rectangles:

https://vega.github.io/vega-lite/examples/rect_mosaic_labelled_with_offset.html

But my cursory reading of that example makes it seem a bit complex:

    {
      "calculate": "datum.y + (datum.rank_Cylinders - 1) * datum.distinct_Cylinders * 0.01 / 3",
      "as": "ny"
    },

@cigrainger
Member Author

cigrainger commented Oct 27, 2023

Super helpful! Thank you @billylanchantin! I also don't think the margins are too important :).
