Integration with TensorBoard and other logging utilities #277

Open · wants to merge 37 commits into master
Conversation

@MilesCranmer (Owner) commented Jan 6, 2024

cc @eelregit @paulomontero

@eelregit made a request for a tensorboard logging capability. Turns out this was a fantastic idea! This is super useful, and also extremely easy to integrate. Gotta love Julia.

This is a work-in-progress PR to add TensorBoard logging capabilities. In fact, any logger that is <:Logging.AbstractLogger (from the Julia standard library) will work. That includes Wandb as well, via Wandb.jl.
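For instance, anything from the Logging stdlib should slot into the same logger keyword used in the example below; here is a minimal sketch (illustrative only, using the stdlib SimpleLogger instead of TensorBoard):

using Logging

# Any AbstractLogger should work; SimpleLogger just prints log records to a stream.
simple_logger = SimpleLogger(stderr, Logging.Info)

# model = SRRegressor(...; logger=simple_logger)  # same keyword as in the example below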

It seems to work nicely thus far. Here's a code example. First, run julia -e 'using Pkg; pkg"add TensorBoardLogger MLJBase https://github.com/MilesCranmer/SymbolicRegression.jl#tb-logging"' to install everything.

Then, you can run this with (copy-paste):

julia> using SymbolicRegression, TensorBoardLogger, Logging, MLJBase

julia> logger = TBLogger("tensorboard_logs/run", min_level=Logging.Info);

julia> model = SRRegressor(
           niterations=50,
           binary_operators=[+, -, *, mod],
           unary_operators=[],
           maxsize=40,
           logger=logger,
       );

julia> X = (a = rand(500), b = rand(500));

julia> y = @. 2 * cos(X.a * 23.5) - X.b ^ 2;

julia> mach = machine(model, X, y);

julia> fit!(mach);

which results in the following (you can run "Python: Launch TensorBoard" in VSCode via the Python extension):

[screenshot of the resulting TensorBoard dashboard]


TODO:

  • Move the with_logger part into a separate function to clean things up. Probably in SearchUtils.jl.
  • Figure out what other stats we want to log. @eelregit what would be useful?
    • TensorBoardLogger actually lets you log entire plots. We could log the whole pareto front maybe?
  • Figure out if we want to let the user pass a method which computes statistics they would find useful from the pareto front.
    • Added as logging_callback which is passed a variety of relevant variables.
  • Add a unit test with SimpleLogger.
  • Document both logger and logging_callback in both equation_search and SRRegressor
  • Add example to documentation
  • Allow user to control frequency of logging
  • Make plots at a different frequency from scalar logs

@MilesCranmer changed the title from "[WIP] Search state logging / TensorBoardLogger integration" to "[WIP] TensorBoard integration / Generic search state logging" on Jan 6, 2024
@MilesCranmer changed the title from "[WIP] TensorBoard integration / Generic search state logging" to "[WIP] TensorBoard integration / Generic search state logging (via AbstractLogger)" on Jan 6, 2024


@MilesCranmer changed the title from "[WIP] TensorBoard integration / Generic search state logging (via AbstractLogger)" to "[WIP] TensorBoard/Wandb integration / Generic search state logging (via AbstractLogger)" on Jan 6, 2024
@MilesCranmer changed the title from "[WIP] TensorBoard/Wandb integration / Generic search state logging (via AbstractLogger)" to "[WIP] TensorBoard/Wandb integration – Generic search state logging via AbstractLogger" on Jan 6, 2024
@MilesCranmer

FYI Wandb works too! https://github.com/avik-pal/Wandb.jl


eelregit commented Jan 7, 2024

Thanks @MilesCranmer! The TensorBoard monitoring will complement the progress bar nicely for jobs longer than a few hours.

Here are some ideas on which evolution histories could be useful:

  • loss based on model selection, which already looks great in the above figure
  • complexity fraction quantiles, to monitor the balance of equations across complexity:
    something like fraction(<=0.8*maxsize), fraction(<=0.6*maxsize), ... (see the sketch after this list)
  • Pareto front figure (see my simple Python functions below)
  • performance stats scalars: the head worker load and expr/sec (already in the progress monitoring)
  • equations. I think it is more useful to track one at each complexity, and only when it changes.
    Two approaches are to log
    * text: smaller log on disk
    * Latexify -> render as image -> TensorBoardLogger: prettier equations
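A minimal sketch of the complexity-fraction idea (the complexities vector and maxsize here are made-up stand-ins for whatever the logging hook would actually receive):

# Fraction of hall-of-fame / population members at or below a complexity threshold.
complexity_fraction(complexities, maxsize, q) = count(<=(q * maxsize), complexities) / length(complexities)

# Hypothetical inputs, just for illustration:
complexities = [3, 5, 7, 7, 11, 15, 19, 24]
maxsize = 30
for q in (0.2, 0.4, 0.6, 0.8)
    @info "complexity fraction" q fraction = complexity_fraction(complexities, maxsize, q)
end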

I guess we only need to log once per iteration, from the head worker, right?

import os
import sys

import numpy as np
import scipy
import pysr
import matplotlib.pyplot as plt


def _get_lower(polygon):
    """Lower convex hull: https://stackoverflow.com/questions/76838415/lower-convex-hull"""
    minx = np.argmin(polygon[:, 0])
    maxx = np.argmax(polygon[:, 0]) + 1
    if minx >= maxx:
        lower_points = np.concatenate([polygon[minx:], polygon[:maxx]])
    else:
        lower_points = polygon[minx:maxx]
    return lower_points


def pareto_plot(equation_file, savefig=True, lower_convex_hull=True):
    """Plot Pareto front."""
    model = pysr.PySRRegressor.from_file(equation_file)
    hof = model.get_hof()

    hof.plot(x='complexity', y='loss', loglog=True, xlim=(1, None), ylabel='loss',
             drawstyle='steps-post')

    if lower_convex_hull:
        points = hof[['complexity', 'loss']].to_numpy()
        points = points[np.isfinite(points.sum(axis=1))]  # remove inf

        hull = scipy.spatial.ConvexHull(np.log(points))
        lower_points = _get_lower(points[hull.vertices])

        plt.plot(lower_points[:, 0], lower_points[:, 1], ls=':', label='convex hull')
        plt.legend()

    if savefig:
        fig_file = os.path.splitext(equation_file)[0] + '.pdf'
        plt.savefig(fig_file)
    else:
        return plt.gcf()


if __name__ == '__main__':
    for equation_file in sys.argv[1:]:
        pareto_plot(equation_file)

I need to remove Inf because the low-complexity models all return infinity with my hacked objective.

[example Pareto front figure produced by the script above]


eelregit commented Jan 7, 2024

There seem to be two definitions of score, and it's a bit confusing which one is used in each context.

  1. Linear combination of loss and complexity, with the latter weighted by parsimony, in LossFunctions.jl.
    If I understand correctly, the order of magnitude of the loss should be accounted for when setting parsimony.
    (I don't understand how the frecency is implemented yet.)
  2. Negative derivative of log(loss) wrt linear complexity.
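Written out (just my reading; the code may use finite differences along the Pareto front rather than true derivatives):

$$\text{score}_1 = L + p\,C, \qquad \text{score}_2 = -\frac{d\log L}{dC},$$

where $L$ is the loss, $C$ the complexity, and $p$ the parsimony coefficient.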

They both trade loss for complexity, the first on linear scales and the second on log-linear scales.
This, and the log-log Pareto front, inspire yet another definition:
the linear combination of log(loss) and log(complexity) (again with the latter weighted by parsimony).
Optimizing this score is equivalent to finding the best model with -d log(loss) / d log(complexity) >= parsimony.
The optimal models under this criterion correspond to the convex hull vertices in the above figure.
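Concretely, something like

$$\text{score}_3 = \log L + p\,\log C, \qquad \frac{d\,\text{score}_3}{d\log C} \le 0 \iff -\frac{d\log L}{d\log C} \ge p.$$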

With this score, the Pareto front (on log-log scales) is naturally divided into stages.
The log-linear Pareto front also shows similar "bouncing" behavior, though less pronounced, when it nearly "converges".
So maybe it's worth logging the loss / adding model selection at the first few vertices?

Also, what do you think about using the log-log score in fitting?
The choice of a default parsimony could be more robust and wouldn't need to account for the order of magnitude of the loss any more.
Adaptivity could be something like -d log(loss) / d log(complexity) >= parsimony + adaptive_parsimony / complexity.

@MilesCranmer

Is there a way to create plots in TensorBoard or do you need to upload the entire image?

@MilesCranmer

Yes, the internal score is poorly named; it should really be regularized_loss. I can make a PR to adjust that.

@MilesCranmer

MilesCranmer commented Jan 8, 2024

@eelregit I added plotting utilities. Do you want to try it out?

Here's a demo:

using SymbolicRegression
using Plots
using MLJBase

model = SRRegressor(; binary_operators=[+, -, *, /], niterations=500, maxsize=80)

X = rand(1000, 2) .* 10
y = X[:, 1] + X[:, 2] .^ 2.5

mach = machine(model, X, y)

fit!(mach)

plot(mach.fitresult; dpi=300, fontfamily="serif")

which creates:

[plot generated by the code above]

I think the convex hull is nice, but I am worried it's a bit too specific (it seems like it wouldn't generalize outside of mean-squared-error-like losses?). I always try to make things general so users can customize them. So I'm leaning towards taking it out and simply letting people write their own plotting method, which they could pass in as a callback function (see the sketch below). What do you think?

But maybe we could still have a decent default for the Pareto front plot itself.
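For example, a plotting callback could look roughly like this (the signature and the data handed to it are hypothetical; whatever logging_callback ends up receiving may differ):

using Plots

# Hypothetical user callback: takes (complexity, loss) pairs for the current
# Pareto front and returns a plot to be logged.
function my_pareto_plot(complexities::AbstractVector, losses::AbstractVector)
    return plot(complexities, losses;
                xscale=:log10, yscale=:log10,
                seriestype=:steppost,
                xlabel="complexity", ylabel="loss", label="pareto front")
end

# Example usage with made-up data:
my_pareto_plot([1, 3, 5, 9, 13], [2.0, 0.9, 0.3, 0.05, 0.01])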


eelregit commented Jan 8, 2024

That's an interesting example! I tried it and saw that the convex hull actually evolved from two stages into one, and probably started growing back into two.

After 100 iterations:
[figure]

After 300 iterations:
[figure]

After 500 iterations, a "phase transition": PySR finds that complexity 20+ is enough for this simple problem,
and more complex expressions have disappeared from the hall of fame:
[figure]

After 1000 iterations, a new stage is starting to form above complexity 20:
[figure]

I think the convex hull could be a generic feature, in the sense that we expect the loss-complexity trade-off
to get worse at large complexity (hence convex) for typical problems where an exact fit cannot be found,
so it shouldn't depend on the specific loss function.
If there is an exact fit, then we should see a single-stage convex hull, stopping at the corresponding complexity.
Otherwise, it would be too good to be true for the loss-complexity trade-off to stay concave and keep improving with increasing complexity.
There might be very rare cases where the log-log trade-off stays a power law (straight line), in which case the convex hull is useless.

I think it's a great idea to have plot callbacks, and the convex hull can be a good example of one :)


MilesCranmer commented Mar 21, 2024

@eelregit sorry for the late follow-up. Teaching was a bit of a black hole – just finished!

I added a bunch of new things. Now logged:

  1. The area of the log-log convex hull, called the "pareto volume". This seems like a really useful overall metric; thanks for sharing the idea!! (A rough sketch of the computation follows this list.)
  2. The histogram of population complexities. Also a great idea. It seems the various tricks to diversify the populations are working pretty well here. It would be interesting to see how this scales with many populations.
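For reference, here's a rough standalone sketch of what the pareto volume computes, i.e. the area of the convex hull of the log-log (complexity, loss) points, using a monotone-chain hull plus the shoelace formula (illustrative only; not the exact implementation in this PR):

# Andrew's monotone chain: returns the convex hull vertices in counter-clockwise order.
function convex_hull(points::Vector{Tuple{Float64,Float64}})
    pts = sort(points)
    length(pts) <= 2 && return pts
    cross2d(o, a, b) = (a[1] - o[1]) * (b[2] - o[2]) - (a[2] - o[2]) * (b[1] - o[1])
    function half_hull(ps)
        hull = Tuple{Float64,Float64}[]
        for p in ps
            while length(hull) >= 2 && cross2d(hull[end-1], hull[end], p) <= 0
                pop!(hull)
            end
            push!(hull, p)
        end
        return hull
    end
    lower = half_hull(pts)
    upper = half_hull(reverse(pts))
    return vcat(lower[1:end-1], upper[1:end-1])
end

# Area of the convex hull of log(complexity) vs log(loss), as a stand-in for "pareto volume".
function pareto_volume(complexities, losses)
    pts = [(log(float(c)), log(float(l))) for (c, l) in zip(complexities, losses) if l > 0 && isfinite(l)]
    hull = convex_hull(pts)
    length(hull) < 3 && return 0.0
    # Shoelace formula for the polygon area
    area = 0.0
    for i in eachindex(hull)
        x1, y1 = hull[i]
        x2, y2 = hull[mod1(i + 1, length(hull))]
        area += x1 * y2 - x2 * y1
    end
    return abs(area) / 2
end

pareto_volume([1, 3, 5, 9, 13], [2.0, 0.9, 0.3, 0.05, 0.01])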

Here's what the output looks like:

[screenshot of the TensorBoard output]

And here's an example complexity distribution over time (time is represented by depth):

[screenshot of the complexity histogram over time]

Here are the saved plots:

[screenshot of the saved plots]

Finally, for posterity, here's the code I'm using:

using SymbolicRegression
using Plots
using MLJBase
using TensorBoardLogger: TBLogger

X = randn(Float32, 100, 5)
y = 2 * cos.(X[:, 4]) + X[:, 1] .^ 2 .- 2

model = SRRegressor(;
    binary_operators=[+, *, /, -],
    # unary_operators=[cos, exp],
    populations=20,
    maxsize=30,
    niterations=400,
    parallelism=:multithreading,
    logger=TBLogger("logs"),
    log_every_n=(scalars=1, plots=10),
)

mach = machine(model, X, y)
fit!(mach)

@MilesCranmer changed the title from "[WIP] TensorBoard/Wandb integration – Generic search state logging via AbstractLogger" to "Integration with TensorBoard and other logging utilities" on Mar 21, 2024
@eelregit

Thanks Miles! The Pareto volume is a great metric idea.

Sorry for not testing things out earlier. I will do it this weekend.
To confirm (again), I can follow https://astroautomata.com/PySR/backend/
to use this branch, right?

@MilesCranmer

> Sorry for not testing things out earlier. I will do it this weekend. To confirm (again), I can follow https://astroautomata.com/PySR/backend/ to use this branch, right?

Yep! Just note that it has changed from earlier this year, as PySR now uses a different method for interconnecting Python and Julia. The docs have been updated accordingly, so if you follow them you are good.
