Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Time-to-first-plot improvements #1278

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

non-Jedi
Copy link
Contributor

@non-Jedi non-Jedi commented Apr 19, 2019

These two changes give close to a 20% speedup in time-to-first-plot in
my informal bench-marking. Unfortunately, this produces different
output on some of the regression suite (thus the WIP); things aren't
getting drawn in the same order as before (see below). A total of 17
images are different, and more svgs than that are structured
differently without actually having visible differences. Don't worry;
I'll figure it out.

density2d

density2d

layer_order

layer_order

  • auto-enumerate
  • colored_smoot_lm
  • density2d
  • hair
  • histogram_errorbar
  • issue975
  • layer_leak
  • layer_order
  • multi_geom_layer
  • nan_skipping
  • smooth_lm
  • stat_binmean
  • subplot_grid
  • subplot_grid_free_axis
  • subplot_grid_free_y_2
  • subplot_layers
  • unitful_basic

The changes here are just my first fumbling attempts to address the
profiling in #1275. I honestly expected the first commit in this pack
to have a bigger effect than it did, and I expected the nothing
comparisons to be negligible when in reality they were more important
than getting rid of the splatting. Pushing this now as WIP since I
don't know when I'll be able to get back to this: could be tomorrow,
or it could be weeks.

Contributor checklist:

  • I've updated the documentation to reflect these changes
  • I've added an entry to NEWS.md
  • I've added and/or updated the unit tests
  • I've run the regression tests
  • I've squash'ed or fixup'ed junk commits with git-rebase
  • I've built the docs and confirmed these changes don't cause new errors

Proposed changes

@non-Jedi
Copy link
Contributor Author

non-Jedi commented Apr 19, 2019

Just noticed some other miscellaneous refactoring accidentally made it into the first commit other than its stated purpose of eliminating splatting. I think they're all changes for the good, but I can rebase them out if you'd prefer a cleaner review here.

@codecov-io
Copy link

codecov-io commented Apr 19, 2019

Codecov Report

Merging #1278 into master will increase coverage by 0.03%.
The diff coverage is 96.35%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #1278      +/-   ##
==========================================
+ Coverage   85.83%   85.86%   +0.03%     
==========================================
  Files          36       36              
  Lines        4158     4161       +3     
==========================================
+ Hits         3569     3573       +4     
+ Misses        589      588       -1
Impacted Files Coverage Δ
src/geometry.jl 75% <ø> (ø) ⬆️
src/data.jl 0% <0%> (ø) ⬆️
src/misc.jl 62.58% <0%> (+0.16%) ⬆️
src/varset.jl 97.77% <100%> (ø) ⬆️
src/geom/point.jl 100% <100%> (ø) ⬆️
src/dataframes.jl 85.71% <100%> (ø) ⬆️
src/mapping.jl 83.01% <100%> (ø) ⬆️
src/geom/bar.jl 91.85% <100%> (ø) ⬆️
src/coord.jl 84.53% <100%> (ø) ⬆️
src/geom/hvabline.jl 93.84% <100%> (ø) ⬆️
... and 13 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 9f5f621...bf5bd2e. Read the comment docs.

@@ -67,7 +67,7 @@ function render(geom::PointGeometry, theme::Gadfly.Theme, aes::Gadfly.Aesthetics
for (x, y, color, size, shape, alpha) in Compose.cyclezip(aes.x, aes.y, aes.color, aes.size, aes.shape, aes_alpha)
shapefun = typeof(shape) <: Function ? shape : theme.point_shapes[shape]
sizeval = typeof(size) <: Int ? interpolate_size(size) : size
strokecolor = aes.color_key_continuous != nothing && aes.color_key_continuous ?
strokecolor = issomething(aes.color_key_continuous) && aes.color_key_continuous ?
Copy link
Member

@Mattriks Mattriks Apr 19, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note PR #1258 which rewrites render for PointGeometry.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How close is that PR to merging? I can either rebase this on top of it, you can rebase on top of this, or we can wait and see whichever gets merged first. I did notice you used != nothing a few times in your rewrite; whatever we do, we should make sure that those are changed to issomething before they make it to master.

@bjarthur
Copy link
Member

thanks for you work on this dire problem! i'd been assuming it was hopeless from a pkg dev point of view and that only changes to julia base could fix it.

@non-Jedi
Copy link
Contributor Author

non-Jedi commented Apr 19, 2019

Well, I'm down to 7 that are different now (all but one entirely new from the previous list):

  • colorful_hist
  • hexbin
  • hexbin_dark
  • stacked_continuous_histogram
  • stacked_discrete_histogram
  • stacked_discrete_histogram_horizontal
  • subplot_grid_free_y_2

and to be honest for some of them I'm not seeing the difference. Any thoughts? I'll keep working. Examples:

colorful_hist (fixed)

No idea what's going on here.

master

colorful_hist

devel

colorful_hist

diff

colorful_hist png

hexbin_dark (fixed)

something different about the colors maybe?

master

hexbin_dark

devel

hexbin_dark

diff

hexbin_dark png

stacked_continuous_histogram (fixed)

no idea

master

stacked_continuous_histogram

devel

stacked_continuous_histogram

diff

stacked_continuous_histogram png

subplot_grid_free_y_2

Scales are off on this one. Weird that it's the only one that seems to have this problem.

master

subplot_grid_free_y_2

devel

subplot_grid_free_y_2

diff

subplot_grid_free_y_2 png

@non-Jedi non-Jedi force-pushed the benchmark2 branch 3 times, most recently from da4a328 to c5a6136 Compare April 20, 2019 01:39
@non-Jedi
Copy link
Contributor Author

All regression tests now pass. I'm gonna run some stuff to make pretty graphs showing the improvement as is only proper when working on Gadfly.

@non-Jedi non-Jedi changed the title WIP: Time-to-first-plot improvements Time-to-first-plot improvements Apr 20, 2019
@non-Jedi
Copy link
Contributor Author

non-Jedi commented Apr 20, 2019

So the results weren't as dramatic for these particular changes running on my desktop versus running on my laptop. Still a nice little ~4% bump though. I threw the data from running through all the testscripts up on Github if anyone is curious enough to take a closer look.

drawtime

drawallocs

Or perhaps more telling (I think the cluster around 10 seconds is because I had forgotten to make CSV or some other test dependency available):

c5a61361_drawtime

So to get below 10 seconds or so for most plots, I only need to make similar improvements another (0.95^28 * 40 seconds = 9.5 seconds)... 27 times... Hopefully some of my other ideas will be a bit more fruitful.

@bjarthur
Copy link
Member

lgtm. 4% is better than nothing.

where is issomething defined? i can't find it!

you might consider a PR to the "performance tips" of the julia docs. there is no mention of isnothing being faster and splats being slow.

@Mattriks you ready to merge #1258 ?

@non-Jedi
Copy link
Contributor Author

Both isnothing and issomething are actually defined in Gadfly's misc.jl which is necessary to support 1.0 since isnothing wasn't introduced to Base until 1.1. Which reminds me: the Base implementation uses type dispatch instead of ===. I should change misc.jl to match to see if we can get a tiny bit of additional speedup there.

@non-Jedi
Copy link
Contributor Author

Force pushed with changes mentioned above to isnothing and a rebase on master to resolve conflicts with #1276.

src/misc.jl Outdated Show resolved Hide resolved
This gives a marginal improvement on speed and memory use during first
plot, but the biggest benefit is that it makes the code easier to read
and the profiling significantly easier to grok.
@non-Jedi non-Jedi force-pushed the benchmark2 branch 6 times, most recently from 45f7b14 to 3470acc Compare April 22, 2019 22:14
@non-Jedi
Copy link
Contributor Author

CI failure is unrelated. I'll try to PR a version bump of Documenter in the next few days to fix doc building failure.

@bjarthur
Copy link
Member

bjarthur commented Apr 23, 2019

perhaps CI failure for docs is due to forgetting to fold docs/Project.toml into ./Project.toml in #1277 ?

src/geom/rectbin.jl Outdated Show resolved Hide resolved
Comparing nothing by value (==, =!) rather than by identity (===, !==)
or by type (isnothing) can have negative consequences on code
speed. This commit changes all nothing comparisons to use the
isnothing utility added in Julia 1.1 and Compat 2.1.
@non-Jedi
Copy link
Contributor Author

docs/Project.toml doesn't have a way to be folded into ./Project.toml unfortunately. The problem is that Documenter 0.20 has a hard version cap on DocumentStringExtensions of 0.2 which conflicts with current requirements in ./Project.toml. Should be fixable by bumping Documenter to 0.22.

src/coord.jl Outdated Show resolved Hide resolved
src/Gadfly.jl Outdated
@@ -388,14 +388,14 @@ function render_prepare(plot::Plot)
# they are missing.
datas = Array{Data}(undef, length(plot.layers))
for (i, layer) in enumerate(plot.layers)
if layer.data_source === nothing && isempty(layer.mapping)
if isnothing(layer.data_source) && isempty(layer.mapping)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd be willing to bet === nothing is more efficient. The compiler loves ===.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Simply changing occurrences of == nothing to use === (and likewise for the negative) should provide a nice improvement.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For reference, this issue applies: JuliaLang/julia#27681

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good to know. I'll update with straight === and see if it makes a difference.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That would also resolve the isnothing name qualification discussion.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've pushed a commit changing from isnothing to === nothing. Still need to get back to the computer I've benchmarked on before to see if it makes a difference.

Per comment from @ararslan, the Julia compiler may prefer === versus
isnothing; improvements to be confirmed with benchmarking.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants