Integration with TensorBoard and other logging utilities #277

Open · wants to merge 37 commits into master
Conversation

@MilesCranmer (Owner) commented Jan 6, 2024

cc @eelregit @paulomontero

@eelregit made a request for a tensorboard logging capability. Turns out this was a fantastic idea! This is super useful, and also extremely easy to integrate. Gotta love Julia.

This is a work-in-progress PR to add TensorBoard logging capabilities. In fact, any logger that is <:Logging.AbstractLogger (from the Julia standard library) will work. That includes Wandb as well, via Wandb.jl.
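For instance, anything from the Logging stdlib should slot into the same logger keyword used in the example below; here is a minimal sketch (illustrative only, using the stdlib SimpleLogger instead of TensorBoard):

using Logging

# Any AbstractLogger should work; SimpleLogger just prints log records to a stream.
simple_logger = SimpleLogger(stderr, Logging.Info)

# model = SRRegressor(...; logger=simple_logger)  # same keyword as in the example below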

It seems to work nicely thus far. Here's a code example. First, run julia -e 'using Pkg; pkg"add TensorBoardLogger MLJBase https://github.com/MilesCranmer/SymbolicRegression.jl#tb-logging"' to install everything.

Then, you can run this with (copy-paste):

julia> using SymbolicRegression, TensorBoardLogger, Logging, MLJBase

julia> logger = TBLogger("tensorboard_logs/run", min_level=Logging.Info);

julia> model = SRRegressor(
           niterations=50,
           binary_operators=[+, -, *, mod],
           unary_operators=[],
           maxsize=40,
           logger=logger,
       );

julia> X = (a = rand(500), b = rand(500));

julia> y = @. 2 * cos(X.a * 23.5) - X.b ^ 2;

julia> mach = machine(model, X, y);

julia> fit!(mach);

which results in the following (you can run "Python: Launch TensorBoard" in VSCode via the Python extension):

[screenshot of the resulting TensorBoard dashboard]


TODO:

  • Move the with_logger part into a separate function to clean things up. Probably in SearchUtils.jl.
  • Figure out what other stats we want to log. @eelregit what would be useful?
    • TensorBoardLogger actually lets you log entire plots. We could log the whole pareto front maybe?
  • Figure out if we want to let the user pass a method which computes statistics they would find useful from the pareto front.
    • Added as logging_callback which is passed a variety of relevant variables.
  • Add a unit test with SimpleLogger.
  • Document both logger and logging_callback in both equation_search and SRRegressor
  • Add example to documentation
  • Allow user to control frequency of logging
  • Make plots at a different frequency from scalar logs

@MilesCranmer changed the title from "[WIP] Search state logging / TensorBoardLogger integration" to "[WIP] TensorBoard integration / Generic search state logging" on Jan 6, 2024
@MilesCranmer changed the title from "[WIP] TensorBoard integration / Generic search state logging" to "[WIP] TensorBoard integration / Generic search state logging (via AbstractLogger)" on Jan 6, 2024


@MilesCranmer changed the title from "[WIP] TensorBoard integration / Generic search state logging (via AbstractLogger)" to "[WIP] TensorBoard/Wandb integration / Generic search state logging (via AbstractLogger)" on Jan 6, 2024
@MilesCranmer changed the title from "[WIP] TensorBoard/Wandb integration / Generic search state logging (via AbstractLogger)" to "[WIP] TensorBoard/Wandb integration – Generic search state logging via AbstractLogger" on Jan 6, 2024
@MilesCranmer

FYI Wandb works too! https://github.com/avik-pal/Wandb.jl


eelregit commented Jan 7, 2024

Thanks @MilesCranmer! The TensorBoard monitoring will complement the progress bar nicely for jobs longer than a few hours.

Here are some ideas on which evolution histories could be useful:

  • loss based on model selection, which already looks great in the above figure
  • complexity fraction quantiles, to monitor the balance of equations across complexity:
    something like fraction(<=0.8*maxsize), fraction(<=0.6*maxsize), ... (see the sketch after this list)
  • Pareto front figure (see my simple Python functions below)
  • performance stats scalars: the head worker load and expr/sec (already in the progress monitoring)
  • equations. I think it is more useful to track one at each complexity, and only when it changes.
    Two approaches are to log
    * text: smaller log on disk
    * Latexify -> render as image -> TensorBoardLogger: prettier equations
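A minimal sketch of the complexity-fraction idea (the complexities vector and maxsize here are made-up stand-ins for whatever the logging hook would actually receive):

# Fraction of hall-of-fame / population members at or below a complexity threshold.
complexity_fraction(complexities, maxsize, q) = count(<=(q * maxsize), complexities) / length(complexities)

# Hypothetical inputs, just for illustration:
complexities = [3, 5, 7, 7, 11, 15, 19, 24]
maxsize = 30
for q in (0.2, 0.4, 0.6, 0.8)
    @info "complexity fraction" q fraction = complexity_fraction(complexities, maxsize, q)
end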

I guess we only need to log once per iteration, from the head worker, right?

import os
import sys

import numpy as np
import scipy
import pysr
import matplotlib.pyplot as plt


def _get_lower(polygon):
    """Lower convex hull: https://stackoverflow.com/questions/76838415/lower-convex-hull"""
    minx = np.argmin(polygon[:, 0])
    maxx = np.argmax(polygon[:, 0]) + 1
    if minx >= maxx:
        lower_points = np.concatenate([polygon[minx:], polygon[:maxx]])
    else:
        lower_points = polygon[minx:maxx]
    return lower_points


def pareto_plot(equation_file, savefig=True, lower_convex_hull=True):
    """Plot Pareto front."""
    model = pysr.PySRRegressor.from_file(equation_file)
    hof = model.get_hof()

    hof.plot(x='complexity', y='loss', loglog=True, xlim=(1, None), ylabel='loss',
             drawstyle='steps-post')

    if lower_convex_hull:
        points = hof[['complexity', 'loss']].to_numpy()
        points = points[np.isfinite(points.sum(axis=1))]  # remove inf

        hull = scipy.spatial.ConvexHull(np.log(points))
        lower_points = _get_lower(points[hull.vertices])

        plt.plot(lower_points[:, 0], lower_points[:, 1], ls=':', label='convex hull')
        plt.legend()

    if savefig:
        fig_file = os.path.splitext(equation_file)[0] + '.pdf'
        plt.savefig(fig_file)
    else:
        return plt.gcf()


if __name__ == '__main__':
    for equation_file in sys.argv[1:]:
        pareto_plot(equation_file)

I need to remove Inf because the low-complexity models all return infinity with my hacked objective.

[example Pareto front figure produced by the script above]


eelregit commented Jan 7, 2024

There seem to be two definitions of score, and it's a bit confusing which one is used in each context.

  1. Linear combination of loss and complexity, with the latter weighted by parsimony, in LossFunctions.jl.
    If I understand correctly, the order of magnitude of the loss should be accounted for when setting parsimony.
    (I don't understand how the frecency is implemented yet.)
  2. Negative derivative of log(loss) wrt linear complexity.
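Written out (just my reading; the code may use finite differences along the Pareto front rather than true derivatives):

$$\text{score}_1 = L + p\,C, \qquad \text{score}_2 = -\frac{d\log L}{dC},$$

where $L$ is the loss, $C$ the complexity, and $p$ the parsimony coefficient.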

They both trade loss for complexity, the first on linear scales and the second on log-linear scales.
This, and the log-log Pareto front, inspire yet another definition:
the linear combination of log(loss) and log(complexity) (again with the latter weighted by parsimony).
Optimizing this score is equivalent to finding the best model with -d log(loss) / d log(complexity) >= parsimony.
The optimal models under this criterion correspond to the convex hull vertices in the above figure.
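Concretely, something like

$$\text{score}_3 = \log L + p\,\log C, \qquad \frac{d\,\text{score}_3}{d\log C} \le 0 \iff -\frac{d\log L}{d\log C} \ge p.$$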

With this score, the Pareto front (on log-log scales) is naturally divided into stages.
The log-linear Pareto front also shows similar "bouncing" behavior, though less pronounced, when it nearly "converges".
So maybe it's worth logging the loss / adding model selection at the first few vertices?

Also, what do you think about using the log-log score in fitting?
The choice of a default parsimony could be more robust and wouldn't need to account for the order of magnitude of the loss any more.
Adaptivity could be something like -d log(loss) / d log(complexity) >= parsimony + adaptive_parsimony / complexity.

@MilesCranmer

Is there a way to create plots in TensorBoard or do you need to upload the entire image?

@MilesCranmer

Yes, the internal score is poorly named; it should really be regularized_loss. I can make a PR to adjust that.

@MilesCranmer

MilesCranmer commented Jan 8, 2024

@eelregit I added plotting utilities. Do you want to try it out?

Here's a demo:

using SymbolicRegression
using Plots
using MLJBase

model = SRRegressor(; binary_operators=[+, -, *, /], niterations=500, maxsize=80)

X = rand(1000, 2) .* 10
y = X[:, 1] + X[:, 2] .^ 2.5

mach = machine(model, X, y)

fit!(mach)

plot(mach.fitresult; dpi=300, fontfamily="serif")

which creates:

[plot generated by the code above]

I think the convex hull is nice, but I am worried it's a bit too specific (it seems like it wouldn't generalize outside of mean-squared-error-like losses?). I always try to make things general so users can customize them. So I'm leaning towards taking it out and simply letting people write their own plotting method, which they could pass in as a callback function (see the sketch below). What do you think?

But maybe we could still have a decent default for the Pareto front plot itself.
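For example, a plotting callback could look roughly like this (the signature and the data handed to it are hypothetical; whatever logging_callback ends up receiving may differ):

using Plots

# Hypothetical user callback: takes (complexity, loss) pairs for the current
# Pareto front and returns a plot to be logged.
function my_pareto_plot(complexities::AbstractVector, losses::AbstractVector)
    return plot(complexities, losses;
                xscale=:log10, yscale=:log10,
                seriestype=:steppost,
                xlabel="complexity", ylabel="loss", label="pareto front")
end

# Example usage with made-up data:
my_pareto_plot([1, 3, 5, 9, 13], [2.0, 0.9, 0.3, 0.05, 0.01])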


eelregit commented Jan 8, 2024

That's an interesting example! I tried it and saw that the convex hull actually evolved from two stages into one, and probably started growing back into two.

After 100 iterations:
[figure]

After 300 iterations:
[figure]

After 500 iterations, a "phase transition": PySR finds that complexity 20+ is enough for this simple problem,
and more complex expressions have disappeared from the hall of fame:
[figure]

After 1000 iterations, a new stage is starting to form above complexity 20:
[figure]

I think the convex hull could be a generic feature, in the sense that we expect the loss-complexity trade-off
to get worse at large complexity (hence convex) for typical problems where an exact fit cannot be found,
so it shouldn't depend on the specific loss function.
If there is an exact fit, then we should see a single-stage convex hull, stopping at the corresponding complexity.
Otherwise, it would be too good to be true for the loss-complexity trade-off to stay concave and keep improving with increasing complexity.
There might be very rare cases where the log-log trade-off stays a power law (straight line), in which case the convex hull is useless.

I think it's a great idea to have plot callbacks, and the convex hull can be a good example of one :)


MilesCranmer commented Mar 21, 2024

@eelregit sorry for the late follow-up. Teaching was a bit of a black hole – just finished!

I added a bunch of new things. Now logged:

  1. The area of the log-log convex hull, called the "pareto volume". This seems like a really useful overall metric; thanks for sharing the idea!! (A rough sketch of the computation follows this list.)
  2. The histogram of population complexities. Also a great idea. It seems the various tricks to diversify the populations are working pretty well here. It would be interesting to see how this scales with many populations.
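For reference, here's a rough standalone sketch of what the pareto volume computes, i.e. the area of the convex hull of the log-log (complexity, loss) points, using a monotone-chain hull plus the shoelace formula (illustrative only; not the exact implementation in this PR):

# Andrew's monotone chain: returns the convex hull vertices in counter-clockwise order.
function convex_hull(points::Vector{Tuple{Float64,Float64}})
    pts = sort(points)
    length(pts) <= 2 && return pts
    cross2d(o, a, b) = (a[1] - o[1]) * (b[2] - o[2]) - (a[2] - o[2]) * (b[1] - o[1])
    function half_hull(ps)
        hull = Tuple{Float64,Float64}[]
        for p in ps
            while length(hull) >= 2 && cross2d(hull[end-1], hull[end], p) <= 0
                pop!(hull)
            end
            push!(hull, p)
        end
        return hull
    end
    lower = half_hull(pts)
    upper = half_hull(reverse(pts))
    return vcat(lower[1:end-1], upper[1:end-1])
end

# Area of the convex hull of log(complexity) vs log(loss), as a stand-in for "pareto volume".
function pareto_volume(complexities, losses)
    pts = [(log(float(c)), log(float(l))) for (c, l) in zip(complexities, losses) if l > 0 && isfinite(l)]
    hull = convex_hull(pts)
    length(hull) < 3 && return 0.0
    # Shoelace formula for the polygon area
    area = 0.0
    for i in eachindex(hull)
        x1, y1 = hull[i]
        x2, y2 = hull[mod1(i + 1, length(hull))]
        area += x1 * y2 - x2 * y1
    end
    return abs(area) / 2
end

pareto_volume([1, 3, 5, 9, 13], [2.0, 0.9, 0.3, 0.05, 0.01])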

Here's what the output looks like:

[screenshot of the TensorBoard output]

And here's an example complexity distribution over time (time is represented by depth):

[screenshot of the complexity histogram over time]

Here are the saved plots:

[screenshot of the saved plots]

Finally, for posterity, here's the code I'm using:

using SymbolicRegression
using Plots
using MLJBase
using TensorBoardLogger: TBLogger

X = randn(Float32, 100, 5)
y = 2 * cos.(X[:, 4]) + X[:, 1] .^ 2 .- 2

model = SRRegressor(;
    binary_operators=[+, *, /, -],
    # unary_operators=[cos, exp],
    populations=20,
    maxsize=30,
    niterations=400,
    parallelism=:multithreading,
    logger=TBLogger("logs"),
    log_every_n=(scalars=1, plots=10),
)

mach = machine(model, X, y)
fit!(mach)

@MilesCranmer changed the title from "[WIP] TensorBoard/Wandb integration – Generic search state logging via AbstractLogger" to "Integration with TensorBoard and other logging utilities" on Mar 21, 2024
@eelregit

Thanks Miles! The Pareto volume is a great metric idea.

Sorry for not testing things out earlier. I will do it this weekend.
To confirm (again), I can follow https://astroautomata.com/PySR/backend/
to use this branch, right?

@MilesCranmer

> Sorry for not testing things out earlier. I will do it this weekend. To confirm (again), I can follow https://astroautomata.com/PySR/backend/ to use this branch, right?

Yep! Just note that it has changed from earlier this year, as PySR now uses a different method for interconnecting Python and Julia. The docs have been updated accordingly, so if you follow them you are good.
