Genie gives rise to a large number of GenieSession files via GenieSessionFileSession with enormous footprint >300GB #702

Open
zygmuntszpak opened this issue Jan 25, 2024 · 26 comments

@zygmuntszpak

Describe the bug
I'm writing a work-related app that continually ingests an MQTT stream from IoT sensors, does some analysis and plots the results dynamically using Stipple. The app is meant to run continuously, but within 24 hours the PC runs out of disk space because of an enormous number of session files serialized to a temporary folder. Multiple session files are created every minute, and each one eventually grows to about 20 MB. The sheer number of these files eventually crashed the program because no disk space was left:

**Error stacktrace**
┌ Error: 2024-01-24 12:33:51 SQLite.SQLiteException("database or disk is full")
└ @ SearchLight C:\Users\zygmuntszpak\.julia\packages\SearchLight\Ps2Js\src\SearchLight.jl:213
┌ Error: 2024-01-24 12:33:51 SQLite.SQLiteException("database or disk is full")
└ @ SearchLight C:\Users\zygmuntszpak\.julia\packages\SearchLight\Ps2Js\src\SearchLight.jl:213
┌ Error: 2024-01-24 12:33:51 Failed to store session data
└ @ GenieSessionFileSession C:\Users\zygmuntszpak\.julia\packages\GenieSessionFileSession\otnJC\src\GenieSessionFileSession.jl:41
┌ Error: 2024-01-24 12:33:51 Failed to store session data
└ @ GenieSessionFileSession C:\Users\zygmuntszpak\.julia\packages\GenieSessionFileSession\otnJC\src\GenieSessionFileSession.jl:41
┌ Error: 2024-01-24 12:33:51 SystemError("close", 28, nothing)
└ @ GenieSessionFileSession C:\Users\zygmuntszpak\.julia\packages\GenieSessionFileSession\otnJC\src\GenieSessionFileSession.jl:42
┌ Error: 2024-01-24 12:33:51 SystemError("close", 28, nothing)
└ @ GenieSessionFileSession C:\Users\zygmuntszpak\.julia\packages\GenieSessionFileSession\otnJC\src\GenieSessionFileSession.jl:42
┌ Error: 2024-01-24 12:33:51 Resetting session
└ @ GenieSessionFileSession C:\Users\zygmuntszpak\.julia\packages\GenieSessionFileSession\otnJC\src\GenieSessionFileSession.jl:46
┌ Error: 2024-01-24 12:33:51 Resetting session
└ @ GenieSessionFileSession C:\Users\zygmuntszpak\.julia\packages\GenieSessionFileSession\otnJC\src\GenieSessionFileSession.jl:46
┌ Error: 2024-01-24 12:33:52 SQLite.SQLiteException("database or disk is full")
└ @ SearchLight C:\Users\zygmuntszpak\.julia\packages\SearchLight\Ps2Js\src\SearchLight.jl:213
Total retrieved records is: 760817
Inside Reading Task Timestamp: 2024-01-24T12:34:37.791
┌ Error: 2024-01-24 12:37:15 Failed to store session data
└ @ GenieSessionFileSession C:\Users\zygmuntszpak\.julia\packages\GenieSessionFileSession\otnJC\src\GenieSessionFileSession.jl:41
┌ Error: 2024-01-24 12:37:15 SystemError("close", 28, nothing)
└ @ GenieSessionFileSession C:\Users\zygmuntszpak\.julia\packages\GenieSessionFileSession\otnJC\src\GenieSessionFileSession.jl:42
┌ Error: 2024-01-24 12:37:15 Resetting session
└ @ GenieSessionFileSession C:\Users\zygmuntszpak\.julia\packages\GenieSessionFileSession\otnJC\src\GenieSessionFileSession.jl:46
┌ Error: 2024-01-24 12:37:22 SQLite.SQLiteException("database or disk is full")
└ @ SearchLight C:\Users\zygmuntszpak\.julia\packages\SearchLight\Ps2Js\src\SearchLight.jl:213
┌ Error: 2024-01-24 12:37:22 SQLite.SQLiteException("database or disk is full")
└ @ SearchLight C:\Users\zygmuntszpak\.julia\packages\SearchLight\Ps2Js\src\SearchLight.jl:213
┌ Error: 2024-01-24 12:37:23 SQLite.SQLiteException("database or disk is full")
└ @ SearchLight C:\Users\zygmuntszpak\.julia\packages\SearchLight\Ps2Js\src\SearchLight.jl:213
┌ Error: 2024-01-24 12:37:23 Failed to store session data
└ @ GenieSessionFileSession C:\Users\zygmuntszpak\.julia\packages\GenieSessionFileSession\otnJC\src\GenieSessionFileSession.jl:41
┌ Error: 2024-01-24 12:37:23 SystemError("close", 28, nothing)
└ @ GenieSessionFileSession C:\Users\zygmuntszpak\.julia\packages\GenieSessionFileSession\otnJC\src\GenieSessionFileSession.jl:42
┌ Error: 2024-01-24 12:37:23 Resetting session
└ @ GenieSessionFileSession C:\Users\zygmuntszpak\.julia\packages\GenieSessionFileSession\otnJC\src\GenieSessionFileSession.jl:46
┌ Error: 2024-01-24 12:37:24 SQLite.SQLiteException("database or disk is full")
└ @ SearchLight C:\Users\zygmuntszpak\.julia\packages\SearchLight\Ps2Js\src\SearchLight.jl:213
┌ Error: 2024-01-24 12:37:25 SQLite.SQLiteException("database or disk is full")
└ @ SearchLight C:\Users\zygmuntszpak\.julia\packages\SearchLight\Ps2Js\src\SearchLight.jl:213
┌ Error: 2024-01-24 12:37:25 SQLite.SQLiteException("database or disk is full")
└ @ SearchLight C:\Users\zygmuntszpak\.julia\packages\SearchLight\Ps2Js\src\SearchLight.jl:213
┌ Error: 2024-01-24 12:37:26 SQLite.SQLiteException("database or disk is full")
└ @ SearchLight C:\Users\zygmuntszpak\.julia\packages\SearchLight\Ps2Js\src\SearchLight.jl:213
┌ Error: 2024-01-24 12:37:27 SQLite.SQLiteException("database or disk is full")
└ @ SearchLight C:\Users\zygmuntszpak\.julia\packages\SearchLight\Ps2Js\src\SearchLight.jl:213
┌ Error: 2024-01-24 12:37:27 SQLite.SQLiteException("database or disk is full")
└ @ SearchLight C:\Users\zygmuntszpak\.julia\packages\SearchLight\Ps2Js\src\SearchLight.jl:213
┌ Error: 2024-01-24 12:37:27 SQLite.SQLiteException("database or disk is full")
└ @ SearchLight C:\Users\zygmuntszpak\.julia\packages\SearchLight\Ps2Js\src\SearchLight.jl:213
┌ Error: 2024-01-24 12:37:28 SQLite.SQLiteException("database or disk is full")
└ @ SearchLight C:\Users\zygmuntszpak\.julia\packages\SearchLight\Ps2Js\src\SearchLight.jl:213
┌ Error: 2024-01-24 12:37:29 SQLite.SQLiteException("database or disk is full")
└ @ SearchLight C:\Users\zygmuntszpak\.julia\packages\SearchLight\Ps2Js\src\SearchLight.jl:213
┌ Error: 2024-01-24 12:37:30 SQLite.SQLiteException("database or disk is full")
└ @ SearchLight C:\Users\zygmuntszpak\.julia\packages\SearchLight\Ps2Js\src\SearchLight.jl:213
┌ Error: 2024-01-24 12:37:30 SQLite.SQLiteException("database or disk is full")
└ @ SearchLight C:\Users\zygmuntszpak\.julia\packages\SearchLight\Ps2Js\src\SearchLight.jl:213
┌ Error: 2024-01-24 12:37:31 SQLite.SQLiteException("database or disk is full")
└ @ SearchLight C:\Users\zygmuntszpak\.julia\packages\SearchLight\Ps2Js\src\SearchLight.jl:213
┌ Error: 2024-01-24 12:37:31 Failed to store session data
└ @ GenieSessionFileSession C:\Users\zygmuntszpak\.julia\packages\GenieSessionFileSession\otnJC\src\GenieSessionFileSession.jl:41
┌ Error: 2024-01-24 12:37:31 SystemError("close", 28, nothing)
└ @ GenieSessionFileSession C:\Users\zygmuntszpak\.julia\packages\GenieSessionFileSession\otnJC\src\GenieSessionFileSession.jl:42
Unhandled Task ERROR: SystemError: flush: No space left on device
Stacktrace:
  [1] systemerror(p::String, errno::Int32; extrainfo::Nothing)
    @ Base .\error.jl:176
  [2] #systemerror#82
    @ .\error.jl:175 [inlined]
  [3] systemerror
    @ .\error.jl:175 [inlined]
  [4] flush(s::IOStream)
    @ Base .\iostream.jl:70
  [5] handle_message(::LoggingExtras.FileLogger, ::Base.CoreLogging.LogLevel, ::Vararg{Any};
 kwargs::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
    @ LoggingExtras C:\Users\zygmuntszpak\.julia\packages\LoggingExtras\VLO3o\src\Sinks\filelogger.jl:47
  [6] handle_message(::LoggingExtras.FileLogger, ::Base.CoreLogging.LogLevel, ::String, ::Module, ::Symbol, ::Symbol, ::String, ::Int64)
    @ LoggingExtras C:\Users\zygmuntszpak\.julia\packages\LoggingExtras\VLO3o\src\Sinks\filelogger.jl:45
  [7] handle_message(::LoggingExtras.TeeLogger{Tuple{LoggingExtras.FileLogger, Logging.ConsoleLogger}}, ::Base.CoreLogging.LogLevel, ::Vararg{Any}; kwargs::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
    @ LoggingExtras C:\Users\zygmuntszpak\.julia\packages\LoggingExtras\VLO3o\src\CompositionalLoggers\tee.jl:24
  [8] handle_message(::LoggingExtras.TeeLogger{Tuple{LoggingExtras.FileLogger, Logging.ConsoleLogger}}, ::Base.CoreLogging.LogLevel, ::String, ::Module, ::Symbol, ::Symbol, ::String, ::Int64)
    @ LoggingExtras C:\Users\zygmuntszpak\.julia\packages\LoggingExtras\VLO3o\src\CompositionalLoggers\tee.jl:21
  [9] handle_message(::LoggingExtras.TeeLogger{Tuple{LoggingExtras.TeeLogger{Tuple{LoggingExtras.FileLogger, Logging.ConsoleLogger}}, Genie.Logger.GenieLogger}}, ::Base.CoreLogging.LogLevel, ::Vararg{Any}; kwargs::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
    @ LoggingExtras C:\Users\zygmuntszpak\.julia\packages\LoggingExtras\VLO3o\src\CompositionalLoggers\tee.jl:24
 [10] handle_message
    @ C:\Users\zygmuntszpak\.julia\packages\LoggingExtras\VLO3o\src\CompositionalLoggers\tee.jl:21 [inlined]
 [11] #handle_message#13
    @ C:\Users\zygmuntszpak\.julia\packages\LoggingExtras\VLO3o\src\CompositionalLoggers\transformer.jl:28 [inlined]
 [12] handle_message(::LoggingExtras.TransformerLogger{LoggingExtras.TeeLogger{Tuple{LoggingExtras.TeeLogger{Tuple{LoggingExtras.FileLogger, Logging.ConsoleLogger}}, Genie.Logger.GenieLogger}}, Genie.Logger.var"#1#2"{String}}, ::Base.CoreLogging.LogLevel, ::String, ::Module, ::Symbol, ::Symbol, ::String, ::Int64)
    @ LoggingExtras C:\Users\zygmuntszpak\.julia\packages\LoggingExtras\VLO3o\src\CompositionalLoggers\transformer.jl:20
 [13] handle_message(::LoggingExtras.MinLevelLogger{LoggingExtras.TransformerLogger{LoggingExtras.TeeLogger{Tuple{LoggingExtras.TeeLogger{Tuple{LoggingExtras.FileLogger, Logging.ConsoleLogger}}, Genie.Logger.GenieLogger}}, Genie.Logger.var"#1#2"{String}}, Base.CoreLogging.LogLevel}, ::Base.CoreLogging.LogLevel, ::Vararg{Any}; kwargs::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
    @ LoggingExtras C:\Users\zygmuntszpak\.julia\packages\LoggingExtras\VLO3o\src\CompositionalLoggers\minlevelfiltered.jl:17
 [14] handle_message(::LoggingExtras.MinLevelLogger{LoggingExtras.TransformerLogger{LoggingExtras.TeeLogger{Tuple{LoggingExtras.TeeLogger{Tuple{LoggingExtras.FileLogger, Logging.ConsoleLogger}}, Genie.Logger.GenieLogger}}, Genie.Logger.var"#1#2"{String}}, Base.CoreLogging.LogLevel}, ::Base.CoreLogging.LogLevel, ::String, ::Module, ::Symbol, ::Symbol, ::String, ::Int64)
    @ LoggingExtras C:\Users\zygmuntszpak\.julia\packages\LoggingExtras\VLO3o\src\CompositionalLoggers\minlevelfiltered.jl:15
 [15] handle_message(j::VSCodeServer.VSCodeLogger, level::Base.CoreLogging.LogLevel, message::String, _module::Module, group::Symbol, id::Symbol, file::String, line::Int64; kwargs::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
    @ VSCodeServer c:\Users\zygmuntszpak\.vscode\extensions\julialang.language-julia-1.66.2\scripts\packages\VSCodeServer\src\progress.jl:31
 [16] handle_message(j::VSCodeServer.VSCodeLogger, level::Base.CoreLogging.LogLevel, message::String, _module::Module, group::Symbol, id::Symbol, file::String, line::Int64)
    @ VSCodeServer c:\Users\zygmuntszpak\.vscode\extensions\julialang.language-julia-1.66.2\scripts\packages\VSCodeServer\src\progress.jl:7
 [17] #invokelatest#2
    @ .\essentials.jl:819 [inlined]
 [18] invokelatest
    @ .\essentials.jl:816 [inlined]
 [19] macro expansion
    @ .\logging.jl:330 [inlined]
 [20] write(session::GenieSession.Session)
    @ GenieSessionFileSession C:\Users\zygmuntszpak\.julia\packages\GenieSessionFileSession\otnJC\src\GenieSessionFileSession.jl:55
 [21] persist
    @ C:\Users\zygmuntszpak\.julia\packages\GenieSessionFileSession\otnJC\src\GenieSessionFileSession.jl:116 [inlined]
 [22] set!
    @ C:\Users\zygmuntszpak\.julia\packages\GenieSession\Kmjen\src\GenieSession.jl:182 [inlined]
 [23] set!
    @ C:\Users\zygmuntszpak\.julia\packages\GenieSession\Kmjen\src\GenieSession.jl:187 [inlined]
 [24] (::Stipple.ModelStorage.Sessions.var"#2#4"{Main.App.DataStream.var"Main.App.DataStream_ReactiveModel", Main.App.DataStream.var"##Main.App.DataStream_ReactiveModel!#331"})(#unused#::Vector{PlotlyBase.GenericTrace{Dict{Symbol, Any}}})
    @ Stipple.ModelStorage.Sessions C:\Users\zygmuntszpak\.julia\packages\Stipple\4Csa4\src\ModelStorage.jl:39
 [25] #invokelatest#2
    @ .\essentials.jl:819 [inlined]
 [26] invokelatest
    @ .\essentials.jl:816 [inlined]
 [27] notify
    @ C:\Users\zygmuntszpak\.julia\packages\Observables\YdEbO\src\Observables.jl:206 [inlined]
 [28] setindex!(observable::Observable, val::Any)
    @ Observables C:\Users\zygmuntszpak\.julia\packages\Observables\YdEbO\src\Observables.jl:123
 [29] setindex!
    @ C:\Users\zygmuntszpak\.julia\packages\Observables\YdEbO\src\Observables.jl:109 [inlined]
 [30] update_r1_fcu1_in_temp_plot!(__model__::Main.App.DataStream.var"##Main.App.DataStream_ReactiveModel!#331")
    @ Main.App.DataStream C:\Users\zygmuntszpak\.julia\dev\LineZeroApp\lib\DataStream\room_1_plots.jl:2
 [31] update_plots!(__model__::Main.App.DataStream.var"##Main.App.DataStream_ReactiveModel!#331")
    @ Main.App.DataStream C:\Users\zygmuntszpak\.julia\dev\LineZeroApp\lib\DataStream\DataStream.jl:326
 [32] (::Main.App.DataStream.var"#171#172"{Main.App.DataStream.var"##Main.App.DataStream_ReactiveModel!#331"})()
    @ Main.App.DataStream .\threadingconstructs.jl:410

To reproduce
I need to construct a separate minimal Stipple app, run it for a while and monitor the number of session files to see if it will behave the same as my current app. As far as I can see I am not doing anything unusual in my current program. I am sending a lot of datapoints for plotting to the frontend, which is probably the reason for the large file size per session. However, I don't understand why so many sessions are created since I am simply starting the server once and letting it run continuously.
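For the monitoring step, a minimal sketch along these lines could report how many files a folder holds and how much space they take. `dir_stats` is a hypothetical helper name, not part of Genie or Stipple, and the folder to pass in is whichever directory the session files land in.

```julia
# Hypothetical helper: count regular files in `dir` and sum their sizes in bytes.
function dir_stats(dir::AbstractString)
    files = filter(isfile, readdir(dir; join = true))
    return (count = length(files), bytes = sum(filesize, files; init = 0))
end

# Demo on a scratch folder; in practice pass the folder holding the session files.
demo_dir = mktempdir()
write(joinpath(demo_dir, "a.bin"), "abc")
write(joinpath(demo_dir, "b.bin"), "de")
stats = dir_stats(demo_dir)
println(stats.count, " files, ", stats.bytes, " bytes")
```

Calling it once a minute while the app runs should make it obvious whether the file count keeps climbing.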

Expected behavior
I don't fully understand what is meant to be stored in a Session (i.e. why the files become so large), or why so many temporary files are created. I obviously don't expect to generate 300GB of temporary files within 24 hours. Is persisting the session files to disk absolutely necessary? Perhaps I could periodically delete some of them (e.g. run a cleanup every 30 minutes to delete the oldest files)?
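As a rough sanity check on those figures, the arithmetic below treats every file as already being at its final 20 MB and uses decimal units. Both are simplifying assumptions, so the real file count would be at least this high.

```julia
total_bytes = 300 * 10^9      # more than 300 GB, as reported in the issue title
file_bytes  = 20 * 10^6       # about 20 MB per session file
minutes     = 24 * 60         # one day

nfiles     = total_bytes ÷ file_bytes   # 15000 files
per_minute = nfiles / minutes           # roughly 10.4 files per minute
println(nfiles, " files, ", round(per_minute; digits = 1), " per minute")
```

That is consistent with seeing several new session files appear every minute.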

Additional context
Please include the output of
julia> versioninfo()

Julia Version 1.9.3
Commit bed2cd540a (2023-08-24 14:43 UTC)
Build Info:
  Official https://julialang.org/ release
Platform Info:
  OS: Windows (x86_64-w64-mingw32)
  CPU: 8 × 11th Gen Intel(R) Core(TM) i7-1185G7 @ 3.00GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-14.0.6 (ORCJIT, tigerlake)
  Threads: 4 on 8 virtual cores
Environment:
  JULIA_EDITOR = code
  JULIA_NUM_THREADS = 4

and
pkg> st

  [6e4b80f9] BenchmarkTools v1.4.0
  [336ed68f] CSV v0.10.12
  [a93c6f00] DataFrames v1.6.1
  [c43c736e] Genie v5.23.8
  [a59fdf5c] GenieFramework v1.26.11
  [033835bb] JLD2 v0.4.45
  [0f8b85d8] JSON3 v1.14.0
  [b9914132] JSONTables v1.0.3
  [db317de6] Mosquitto v0.8.1
  [a03496cd] PlotlyBase v0.8.19
  [1bd9f7bb] RemoteREPL v0.2.17 `C:\Users\zygmuntszpak\.julia\dev\RemoteREPL`
  [0aa819cd] SQLite v1.6.0
  [340e8cb6] SearchLight v2.10.0
  [21a827c4] SearchLightSQLite v2.2.2
  [2913bbd2] StatsBase v0.34.2
  [4acbeb90] Stipple v0.27.33
  [a3c5d34a] StippleUI v0.22.18
  [f269a46b] TimeZones v1.13.0
  [9bd350c2] OpenSSH_jll v9.3.2+0
  [9e88b42a] Serialization
  [8dfed614] Test

Please answer these optional questions to help us understand, prioritise, and assign the issue

1/ Are you using Genie at work or for hobby/personal projects?
This is a work project, and I am busy running a trial data capture. I was surprised to run out of disk space within a day.

2/ Can you give us, in a few words, some details about the app you're building with Genie?
I'm writing an app (work related) which continually ingests an MQTT stream of IoT sensors, does some analysis and dynamically plots it using Stipple. The app is meant to run continually.

@zygmuntszpak
Author

I just realised that perhaps this issue should have been filed under the Stipple.jl repository.

@essenciary
Member

@zygmuntszpak the sessions are used to store the state of the model and restore it to its latest value after a page refresh, for example. So it's not strictly necessary; it's more of a UX feature.

A 20 MB serialized session file is pretty large though, so for performance it would be better not to use the session anymore.
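To give a feel for why such a file grows with the model, here is a small sketch using Julia's Serialization standard library on a payload shaped like 10,000 plotted points. It does not reproduce the actual session format used by Stipple or GenieSessionFileSession (treat any similarity as an assumption); it only shows that the file size scales with the data kept in the state.

```julia
using Serialization

# Stand-in for model state: two columns of 10_000 values, as in a plot trace.
state = Dict(:timestamp => collect(1:10_000), :value => rand(10_000))

path = tempname()
serialize(path, state)          # write the whole state to disk
nbytes = filesize(path)
restored = deserialize(path)    # what a page refresh would read back
rm(path)

println("serialized size: ", nbytes, " bytes")
```

Two columns of 8-byte values already need at least 160,000 bytes, so a model holding many long traces can plausibly reach megabytes per write.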

But the weird part is that it should definitely not write multiple files - there is one session file per user (per browser session). Can you explain a bit how the requests are being made?

@zygmuntszpak
Author

This arises out of a Stipple app that I have made. I will try to create a minimal working example ASAP for you, but the basic structure is something like this:

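module DataStream

using GenieFramework
using DataFrames
using PlotlyBase
using Base.Threads

# Handle to the background task, and the flag that tells it to keep running
const update_plots_task::Ref{Task} = Ref(Task(nothing))
const loop_retrieve_data =  Threads.Atomic{Bool}(false)
@genietools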
@app begin
    @private running = false
    @in livestream_checked = false

    @private ext_heat_in_temp::DataFrame =  filter(sensor -> sensor.point == "EXT_HEAT_IN_TEMP", list_of_sensor_messages ) |> DataFrame

    @onchange isready begin
        @show "App is loaded"      
    end

    @onchange livestream_checked begin
        if livestream_checked 

            # I pass the __model__ variable here because I want to move all the code for updating the plots outside
            # of the @app block. Otherwise, I end up with a lot of business logic in between all of these handlers
            # and it makes it all harder to read. The __model__ variable is implicitly created by the macros as far as I see.
            # Perhaps what I am doing here is not permitted, and is the cause of all the additional sessions?

            spawn_update_plots_task!(__model__)
        else
            # Stop the existing task via this global atomic variable
            DataStream.loop_retrieve_data[] = false
        end    
    end
end
  


function spawn_update_plots_task!(__model__)
    loop_retrieve_data[] = true
    DataStream.update_plots_task[] =  @spawn update_plots!(__model__)
    errormonitor(DataStream.update_plots_task[])
    return nothing
end

function update_plots!(__model__)
    println("Task started")
    while loop_retrieve_data[]
        # Grab the latest readings from the SQL database
        list_of_sensor_messages = retrieve_data()

        # Update the reactive data fields
        __model__.ext_heat_in_temp[] =  filter(sensor -> sensor.point == "EXT_HEAT_IN_TEMP", list_of_sensor_messages ) |> DataFrame |> x->pick_subset(__model__, x)


        update_ext_heat_in_temp_plot!(__model__)
 
        sleep(1)
    end
end

function update_ext_heat_in_temp_plot!(__model__)
    __model__.ext_heat_in_temp_trace[] = [scatter(
        x = __model__.ext_heat_in_temp[:, "timestamp"],
        y = __model__.ext_heat_in_temp[:, "value"],
        mode = "lines",
        marker = attr(size=10, color="rgba(255, 182, 193, .9)"),
        name = "Temperature")]
    return nothing
end

end

Then I have an app.jl file with something along these lines

module App

using GenieFramework
@genietools

# Need to do this here because setting the Logging level in SearchLight directly 
# does not work (currently broken). If we don't put this log level to warn, then every time we do a SQL query the 
# logs and stdout are flooded with info messages summarising the SQL query. 
Genie.Configuration.config!(log_level = Genie.Logging.Warn)

using SearchLight
using SearchLightSQLite
using Serialization
using Base.Threads
using Dates

include(joinpath("lib","app","resources","sensors", "Sensors.jl"))
include(joinpath("lib","app","resources","sensors", "SensorsValidator.jl"))
using .Sensors
using .SensorsValidator

export Sensor

include(joinpath("lib","DataStream", "DataStream.jl"))
using .DataStream

@page("/", joinpath("lib", "DataStream", "datastream_ui.jl"), layout = "layout.jl", model = DataStream)

end # module App

@essenciary
Member

Thanks - working on a quick patch to allow disabling the storage to session.

@essenciary
Member

essenciary commented Jan 25, 2024

Until the patch is out you can safely delete the files. You can get the path with:

Stipple.ModelStorage.Sessions.GenieSessionFileSession.SESSIONS_PATH[]
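As a stopgap, a cleanup along the following lines could remove stale files from that folder. This is only a sketch: `prune_old_files` and its `max_age_s` keyword are hypothetical names, it uses nothing but Julia's standard file functions, and it assumes that any file not modified recently is safe to drop.

```julia
# Hypothetical helper: delete regular files in `dir` that were last modified
# more than `max_age_s` seconds ago. Returns the paths that were removed.
function prune_old_files(dir::AbstractString; max_age_s::Real = 1800)
    cutoff = time() - max_age_s
    removed = String[]
    for path in readdir(dir; join = true)
        if isfile(path) && mtime(path) < cutoff
            rm(path; force = true)
            push!(removed, path)
        end
    end
    return removed
end

# Demo on a scratch folder: a freshly written file survives a 30 minute cutoff.
demo_dir = mktempdir()
demo_file = joinpath(demo_dir, "session.bin")
write(demo_file, "payload")
kept = prune_old_files(demo_dir; max_age_s = 1800)
println("removed ", length(kept), " files")
```

In the app, `dir` would be the value of the path shown above, and the call could be repeated with something like `Timer(_ -> prune_old_files(dir), 0; interval = 1800)`.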

@essenciary
Member

@zygmuntszpak OK, currently tagging a new version of Stipple.jl that allows disabling the model storage. Once it's out you can use it like this:

module App
# set up Genie development environment
using GenieFramework
Stipple.enable_model_storage(false)
@genietools

# ... rest of your code 

Commit here: GenieFramework/Stipple.jl@17bc484

@essenciary
Member

Another thing: please make sure you run your deployed app in the production env. You can pass GENIE_ENV=prod as an environment variable to the Julia process. This has an important impact on the performance of the app (and it also stores the sessions in the app's folder, under sessions/).

With the patch the sessions are still created (as a security feature) but the model is no longer stored.

If you still get a high number of sessions, let me know. There should only be one session per user/browser.

@zygmuntszpak
Author

Thank you. I'll also continue trying to produce a MWE for the large number of sessions I was experiencing.

Regarding switching to the production environment, is it not sufficient for me to do something like:

Genie.Configuration.config!(app_env= "prod")

at the start of my code?

I also have a suspicion about what might be the cause of #659, so I'll let you know if I make any significant discoveries there.

@zygmuntszpak
Author

Here is a MWE which gives rise to many sessions. I haven't tested this with your latest patch yet. I've attached the project as a standalone package GenieDebug and left only the core pieces to make a proper Genie app.
GenieDebug.zip

A preview of the main files:
app.jl

module App

using GenieFramework
@genietools

# TODO This needs to be read from a file instead (as in the SearchLight config file)
# We need to do this here because setting the Logging level in SearchLight directly 
# does not work (currently broken)
Genie.Configuration.config!(log_level = Genie.Logging.Warn)

using Base.Threads

include("DataStream/DataStream.jl")
using .DataStream

@page("/", joinpath("DataStream", "datastream_ui.jl"), layout = "layout.jl", model = DataStream)

end # module App

DataStream.jl

module DataStream
using GenieFramework
using DataFrames
using PlotlyBase
using Base.Threads

@genietools

const update_plots_task::Ref{Task} = Ref(Task(nothing))
const loop_retrieve_data =  Threads.Atomic{Bool}(false)

@app begin
    @in livestream_checked = false
    
    @private ext_heat_in_temp::DataFrame =  DataFrame(timestamp=1:10000, value=rand(10000))
 
    @out ext_heat_in_temp_trace = [scatter()]
    @out ext_heat_in_temp_layout = PlotlyBase.Layout(
        xaxis_title = "Time",
        yaxis_title = "Temperature (Celsius)",
        title = "External Heat In Temperature"
    )

 
    @onchange isready begin
        @show "App is loaded"
        
    end

    @onchange livestream_checked begin
        if livestream_checked 
            spawn_update_plots_task!(__model__)
        else
            # Stop the existing task via this global atomic variable
            DataStream.loop_retrieve_data[] = false
        end    
    end
end

function retrieve_data()
    return rand(100)
end

function update_plots!(__model__)
    println("Task started")
    while loop_retrieve_data[]
        # Update the reactive data fields
        __model__.ext_heat_in_temp[] =   DataFrame(timestamp=1:10000, value=rand(10000))
     
        update_ext_heat_in_temp_plot!(__model__)
        
        println("Inside Reading Task Timestamp: " * string(Dates.now()))

        sleep(1)
    end
end

function spawn_update_plots_task!(__model__)
    loop_retrieve_data[] = true
    DataStream.update_plots_task[] =  @spawn update_plots!(__model__)
    errormonitor(DataStream.update_plots_task[])
    return nothing
end

function update_ext_heat_in_temp_plot!(__model__)
    __model__.ext_heat_in_temp_trace[] = [scatter(
        x = __model__.ext_heat_in_temp[:, "timestamp"],
        y = __model__.ext_heat_in_temp[:, "value"],
        mode = "lines",
        marker = attr(size=10, color="rgba(255, 182, 193, .9)"),
        name = "Temperature")]
    return nothing
end

end

datastream_ui.jl

header(class="st-header q-pa-sm",
    checkbox("Live Stream", :livestream_checked)
)

cell(class = "st-module", 
    [
        Stipple.Html.div(class = "q-mt-pa", 
        [
            plot(:ext_heat_in_temp_trace, layout = :ext_heat_in_temp_layout)
        ])
    ])

layout.jl

cell(style="display: flex; justify-content: space-between; align-items: center; background-color: #112244; padding: 10px 50px; color: #ffffff; top: 0; width: 100%; box-sizing: border-box;", [
    cell(style="font-size: 1.5em; font-weight: bold;",
        "Debug App"
    ),
    Html.div(style="display: flex; gap: 20px;", [
        a(href="/", style="text-decoration: none; color: #ffffff;",
            "Data Stream"
        )
    ])
])
page(model, partial=true, [@yield])

@PGimenez
Member

PGimenez commented Jan 25, 2024

> Thank you. I'll also continue trying to produce a MWE for the large number of sessions I was experiencing.
>
> Regarding switching to the production environment, is it not sufficient for me to do something like:
>
> Genie.Configuration.config!(app_env= "prod")
>
> at the start of my code?

The flag needs to be set before Genie.loadapp() is called. Say you are launching the app from an SSH connection; you would do

GENIE_ENV=prod nohup julia --project -e "using GenieFramework; Genie.loadapp(); up(async=false);" &

Otherwise, you can switch the application to PROD by default by editing the config/env/global.jl file and adding this line:

ENV["GENIE_ENV"] = "prod"

https://learn.genieframework.com/docs/reference/server/configuration
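The same ordering constraint can also be expressed in a small launcher script. The sketch below is an assumption-laden illustration rather than an official recipe: `ensure_genie_env!` is a hypothetical helper, and the Genie calls are left commented out so the snippet runs without the framework installed. The point is only that `ENV["GENIE_ENV"]` is settled before anything loads the app.

```julia
# Hypothetical helper: keep an already exported GENIE_ENV, otherwise fall back
# to `default`. Returns the value that will be in effect.
function ensure_genie_env!(default::AbstractString = "prod")
    if !haskey(ENV, "GENIE_ENV")
        ENV["GENIE_ENV"] = default
    end
    return ENV["GENIE_ENV"]
end

active_env = ensure_genie_env!()
println("GENIE_ENV = ", active_env)

# With GenieFramework installed, the app would be loaded only after this point,
# mirroring the command above:
# using GenieFramework
# Genie.loadapp()
# up(async = false)
```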

@essenciary
Member

I just released a patch so that we can also set GENIE_ENV in the .env file. This is the easiest IMO. Add a .env file to the app saying GENIE_ENV=prod

@essenciary
Member

essenciary commented Jan 25, 2024

> Regarding switching to the production environment, is it not sufficient for me to do something like:
>
> Genie.Configuration.config!(app_env= "prod")
>
> at the start of my code?

This does change the app env, but if you put it there it runs very late in the load order, so its impact is only partial (and so are the benefits).

@essenciary
Member

essenciary commented Jan 25, 2024

@zygmuntszpak thanks for the MWE. I've run it locally and I can't reproduce the issue of excessive sessions files being created. In my tests it works as expected by creating N+1 sessions (where N is the number of clients/browsers connected). The +1 is from Stipple.init, upon starting the app. Here is what the session folder looks like after testing the app with 3 different browsers (Safari, Chrome and Firefox)
[Screenshot: contents of the sessions folder after the three-browser test]

@essenciary
Member

Oh! Now I see it. When you check "Live Stream" it creates a few sessions per second.

@essenciary
Member

@zygmuntszpak OK, I found the issue. It is caused by using @spawn inside spawn_update_plots_task!.
If you remove @spawn to look like

DataStream.update_plots_task[] = update_plots!(__model__)

then the issue goes away. Clearly it has to do with running the update and the model in different threads/processes, but I don't fully understand what exactly causes it; that would take a lot more debugging.

I suggest running without @spawn; for me the performance looks the same. Also, with @spawn the updates might not arrive in the right order (not that you can tell on this chart, but in other situations it might matter).

FYI @hhaensel interesting bug here :)

@zygmuntszpak
Author

In reality I have a lot more streams, and I run some analysis and data wrangling operations; once those are complete I take the result and plot it. The reason I put those in a different thread is to avoid holding up the server's responsiveness to other actions a user might perform.

So while in this instance not spawning a separate task on a separate thread makes no difference, in my actual use case it does, because it may take several seconds to complete processing between plot updates.

Interesting that you pointed out the updates may not arrive in the right order. Are you referring to the situation when an update for a different reactive variable is triggered in the main thread and an update for the plots is triggered in the separate thread I spawned, and the separate thread then transmits before the first? If all the updates to the plots happen in the separate thread, then surely those plot updates should happen in the right order, because I'm only spawning one separate thread which executes the plot updates in sequence.

I think that spawning a thread and triggering updates is the reason for the occasional closed connections from the browser that we are seeing in the other issue. I suspect a data race: while Genie is constructing a websocket frame to transmit to the browser after an update of a reactive variable, the reactive variable's data changes midway and a corrupted websocket frame is generated. This may explain why it is difficult to consistently reproduce the closed connection issue (because it relies on an occasional data race condition).

My current thinking is to introduce a channel for communication: the separate thread will retrieve the data, do some processing and push the result into the channel when it is ready to be plotted. The main app must then read from the channel and update the plots when results are ready.
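A minimal sketch of that idea, using only Base Julia, might look like the following. Everything here is illustrative: `process_batch`, `producer!` and `consume!` are hypothetical names, the "model" is a plain vector rather than a Stipple model, and in the real app the consumer would assign to the reactive fields instead of calling `push!`.

```julia
using Base.Threads: @spawn

# Hypothetical stand-in for the slow retrieval and analysis step.
process_batch(i::Int) = [Float64(i), Float64(i)^2]

# Worker: runs as a spawned task and only ever touches the channel.
function producer!(results::Channel{Vector{Float64}}, nbatches::Int)
    for i in 1:nbatches
        put!(results, process_batch(i))
    end
    close(results)   # tells the consumer that no more data is coming
    return nothing
end

# Consumer: the only code that mutates the "model" (here a plain vector).
function consume!(plotted::Vector{Vector{Float64}}, results::Channel{Vector{Float64}})
    for batch in results       # blocks until data arrives, stops once the channel is closed
        push!(plotted, batch)  # in the app: assign to the reactive plot field here
    end
    return plotted
end

results = Channel{Vector{Float64}}(8)   # small buffer so the worker cannot run far ahead
worker = @spawn producer!(results, 5)
plotted = consume!(Vector{Float64}[], results)
wait(worker)
println(length(plotted), " batches received")
```

Because a `Channel` is first-in, first-out and only one task writes to the model, the batches are applied in the order they were produced.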

@essenciary
Member

@zygmuntszpak Have you run the app with the latest version of Stipple and with Stipple.enable_model_storage(false)? That seems to solve the issue of the sessions just fine while still using @spawn. See this capture of the exact MWE you've sent, with model storage disabled.
[Screen capture: 2024-01-26 10 48 30]

@zygmuntszpak
Author

@essenciary I have just now tested GenieDebug.jl with your latest patch and can confirm that the issue with all of the additional spurious sessions is now resolved, even when spawning a separate thread. Thank you very much!

Out of curiosity, I added a println of Genie.WebChannels.connected_clients() inside the spawned task and in both the "broken" version and your patched version, the number of connections was two. So, somehow, additional session files were being generated in the "broken" version, even though there were only two connected clients.

> I just released a patch so that we can also set GENIE_ENV in the .env file. This is the easiest IMO. Add a .env file to the app saying GENIE_ENV=prod

I couldn't get this to work. I added a .env file in the root directory of the GenieDebug folder and obtained the following error:

julia> Genie.loadapp()


 ██████╗ ███████╗███╗   ██╗██╗███████╗    ███████╗
██╔════╝ ██╔════╝████╗  ██║██║██╔════╝    ██╔════╝
██║  ███╗█████╗  ██╔██╗ ██║██║█████╗      ███████╗
██║   ██║██╔══╝  ██║╚██╗██║██║██╔══╝      ╚════██║
╚██████╔╝███████╗██║ ╚████║██║███████╗    ███████║
 ╚═════╝ ╚══════╝╚═╝  ╚═══╝╚═╝╚══════╝    ╚══════╝

| Website  https://genieframework.com
| GitHub   https://github.com/genieframework
| Docs     https://genieframework.com/docs
| Discord  https://discord.com/invite/9zyZbD6J7H
| Twitter  https://twitter.com/essenciary

Active env: PROD

┌ Warning: 
│             No secret token is defined through `Genie.Secrets.secret_token!("token")`. Such a token
│             is needed to hash and to encrypt/decrypt sensitive data in Genie, including cookie
│             and session data.
│ 
│             If your app relies on cookies or sessions make sure you generate a valid token,
│             otherwise the encrypted data will become unreadable between app restarts.
│ 
│             You can resolve this issue by generating a valid `config/secrets.jl` file with a
│             random token, calling `Genie.Generator.write_secrets_file()`.
│ 
└ @ Genie.Secrets C:\Users\zygmu\.julia\packages\Genie\5qchC\src\Secrets.jl:27
Loading appERROR: LoadError: SystemError: opening file "C:\\Users\\zygmu\\.julia\\dev\\GenieDebug\\DataStream\\config\\env\\global.jl": No such file or directory
in expression starting at C:\Users\zygmu\.julia\dev\GenieDebug\DataStream\DataStream.jl:1
Stacktrace:
  [1] systemerror(p::String, errno::Int32; extrainfo::Nothing)
    @ Base .\error.jl:176
  [2] kwcall(::NamedTuple{(:extrainfo,), Tuple{Nothing}}, ::typeof(systemerror), p::String, errno::Int32)
    @ Base .\error.jl:176
  [3] kwcall(::NamedTuple{(:extrainfo,), Tuple{Nothing}}, ::typeof(systemerror), p::String)
    @ Base .\error.jl:176
  [4] #systemerror#82
    @ .\error.jl:175 [inlined]
  [5] systemerror
    @ .\error.jl:175 [inlined]
  [6] open(fname::String; lock::Bool, read::Nothing, write::Nothing, create::Nothing, truncate::Nothing, append::Nothing)
    @ Base .\iostream.jl:293
  [7] open
    @ .\iostream.jl:275 [inlined]
  [8] open(f::Base.var"#418#419"{String}, args::String; kwargs::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
    @ Base .\io.jl:393
  [9] open
    @ .\io.jl:392 [inlined]
 [10] read
    @ .\io.jl:473 [inlined]
 [11] _include(mapexpr::Function, mod::Module, _path::String)
    @ Base .\loading.jl:1955
 [12] include
    @ .\Base.jl:457 [inlined]
 [13] bootstrap(context::Module; show_banner::Bool)
    @ Genie.Loader C:\Users\zygmu\.julia\packages\Genie\5qchC\src\Loader.jl:78
 [14] kwcall(::NamedTuple{(:show_banner,), Tuple{Bool}}, ::typeof(Genie.Loader.bootstrap), context::Module)
    @ Genie.Loader C:\Users\zygmu\.julia\packages\Genie\5qchC\src\Loader.jl:64
 [15] top-level scope
    @ C:\Users\zygmu\.julia\packages\GenieFramework\VrbUK\src\GenieFramework.jl:113
in expression starting at C:\Users\zygmu\.julia\dev\GenieDebug\app.jl:15

Ready! 

It appears to be looking for global.jl, which presumably it shouldn't?

I ended up getting the production environment to work by creating config/env/global.jl (as per PGimenez), where the config directory is relative to the GenieDebug folder and not the DataStream folder.
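For reference, the file layout described here can be sketched in plain Julia. This is only an illustration under stated assumptions: `app_root` is a scratch directory standing in for the GenieDebug folder, and `global.jl` is left empty because the thread does not state what it should contain.

```julia
# Sketch only: create the two files mentioned above inside a scratch directory.
app_root = mktempdir()   # hypothetical stand-in for the GenieDebug folder

# .env holding the single setting quoted in the thread
env_file = joinpath(app_root, ".env")
write(env_file, "GENIE_ENV=prod\n")

# config/env/global.jl, relative to the app root (not the DataStream folder).
# Left empty here; its expected contents are an assumption.
env_dir = joinpath(app_root, "config", "env")
mkpath(env_dir)
global_file = joinpath(env_dir, "global.jl")
touch(global_file)
```

In a real app the same two paths would be created under the project root rather than a temporary directory.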

@essenciary
Member

essenciary commented Jan 26, 2024

Great to hear that the sessions issue is solved!

Connected clients:

  • there is something that is not right there. I fiddled with the app for hours but it's complicated.
  • first, you're probably aware that your app has global state (all the connected clients get the same data). I presume it's by design.
  • the fact that so many sessions are started seems to indicate that new instances of the model are created every second, but due to the global state of the app, the (only) connected clients still get updated.
  • if you look in the browser developer tools, you will probably see web socket disconnection errors, indicating that the initial websocket is gone.

.env file:
I can't reproduce - see attached screenshot. Can you share your Manifest.toml file please? Thanks

image

@zygmuntszpak
Author

> first, you're probably aware that your app has global state (all the connected clients get the same data). I presume it's by design.

Actually, I'm surprised that the current app has global state. The current documentation states:

> When a user makes an HTTP request to a route, a new ReactiveModel instance is created from the storage for that specific user session. This ensures that each user has an isolated state and can interact with the application independently, without affecting the state of other users. The model instantiated for the request can be accessed with @init when using the route function instead of @route:

route("/") do 
    model = @init
    @show model
    page(model, ui()) |> html
end

Example 1 of the Stipple repo says that one must declare a global model in the following manner:

route("/") do
  global model
  model = Name |> init
  page(model, ui()) |> html
end

I was actually interested in making a global model, and assumed that @init uses the implicit name of the ReactiveModel that's generated by @app. However, I found it difficult to reconcile the excellent tutorial given at JuliaCon (source code) with existing documentation. For instance, in GenieDebug I have a line

@page("/", joinpath("DataStream", "datastream_ui.jl"), layout = "layout.jl", model = DataStream)

The documentation explains:

> The app.jl file comes with a basic skeleton for your app. The block delimited by the @app macro will hold the reactive code making the UI interactive. The @page macro defines a route, and the file that will be returned to the browser when this route is visited.

However, when I @macroexpand I obtain

@macroexpand @page("/", joinpath("DataStream", "datastream_ui.jl"), layout = "layout.jl", model = DataStream)

:(Stipple.Pages.Page(layout = "layout.jl", model = DataStream, context = Main, "/", view = joinpath("DataStream", "datastream_ui.jl")))

which makes no reference to route("/"). This is probably happening somewhere in Stipple.Pages.Page, but then how do I inject a global model into that statement?

I noticed that there was some discussion about global models which seems to suggest some kind of alternative, but I couldn't glean enough from the discussion to understand how to proceed.

So in summary, I am actually surprised that I have a global state. I'm curious which part of my GenieDebug app sets the global state, and what one would in principle do to turn it off. In my actual use case, I have a multi-page app and so would like to understand how to toggle global versus non-global models in such a scenario.

> the fact that so many sessions are started seems to indicate that new instances of the model are created every second, but due to the global state of the app, the (only) connected clients still get updated. if you look in the browser developer tools, you will probably see web socket disconnection errors, indicating that the initial websocket is gone.

Using your latest patch with Stipple.enable_model_storage(false), I don't see multiple connections.

websocket-recording.mp4
connected-clients.mp4

I do, however, see invalid frame header errors from time to time which I suspect are due to some data race condition.

Regarding the .env file, after rebooting the machine everything seems to work correctly now and the .env file is properly recognised. However, one oddity is that a session file is still written to a temporary directory in the temp folder, even if I set the app to the production environment. Is the session folder meant to be constructed automatically? Or are the sessions simply no longer created because we have set Stipple.enable_model_storage(false) and the only session file that is created was supposed to be created in the temporary directory in both prod and dev modes?

Here is the Manifest file in case it proves useful:
manifest_as_txt.txt

@essenciary
Member

@zygmuntszpak thanks for the feedback.

The global state requires some debugging. I don't have deep knowledge of that area, as I try never to use global models. I think @PGimenez and @hhaensel might understand that better, but I'll try to take some time to dig into it myself. I suspect it has to do with explicitly passing the module into @page. I'm attaching a much simplified version of the app that does not use a global state. However, I've been having issues with the async tasks, so it's not working entirely.

GenieDebug 2.zip

Thanks for sharing the manifest file, I'll try it out. It sounds odd that it still uses the temp folder. Sessions, however, are still used. They serve two purposes:
1/ as the "secret" to map a user session to a data state (this is still kept)
2/ to store the state of the model between page reloads and automatically update the UI (this is disabled by the new directive).

@zygmuntszpak
Author

Thanks for the alternative GenieDebug structure. I see what you mean regarding the async task. The issue is that it appears to no longer be possible to end the loop, because livestream_checked continues to evaluate to true inside the while loop:

    @onchange livestream_checked begin
        @warn "Livestream checked: $livestream_checked"

        @async begin
            while livestream_checked
                @warn livestream_checked
                ext_heat_in_temp_trace =  update_stream()
                sleep(1)
                yield()
            end
        end |> errormonitor
    end

However, I upped the number of points that would need to be plotted to slow down the app and noticed something rather odd. After unchecking the livestream_checked box, I did see a log message (a @warn message) that said "Livestream checked: false", but the task kept looping anyway. So it seems that livestream_checked inside the async task refers to a different variable and never sees the update.
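As a plain-Julia illustration of how a long-running closure can keep seeing a stale value, consider the sketch below. It does not use Genie or Stipple, and the thread does not establish that this is what happens inside @onchange; `make_reader`, `reader`, and `shared` are hypothetical names chosen for the example.

```julia
# Hypothetical stand-in for a change handler: every call gets its own `checked` binding.
function make_reader(checked::Bool)
    return () -> checked          # the closure captures this call's binding
end

reader = make_reader(true)        # first "change event": the value is true
make_reader(false)                # later "change event": a new binding, the old one is untouched
println(reader())                 # prints true: the earlier closure never sees the update

# Sharing one mutable container instead makes the update visible to the closure.
shared = Ref(true)
shared_reader = () -> shared[]
shared[] = false
println(shared_reader())          # prints false
```

The first half mirrors the observed symptom (the log shows false, the loop still sees true). The second half shows why a single shared, mutable flag avoids it.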

@essenciary
Member

Yes, this, coupled with the high number of sessions, seems to indicate that the async task gets disconnected from ws updates.

@essenciary
Member

This is an interesting case that is worth diving into. I suspect that it has to do with data sync across threads/workers. I'm not an expert and need to research more, but if some data serialization is involved, then I expect that a websocket connection cannot be serialized and restored.

@PGimenez
Member

> Thanks for the alternative GenieDebug structure. I see what you mean regarding the async task. The issue is that it appears to no longer be possible to end the loop because livestream_checked continues to evaluate to true inside the while loop
>
>     @onchange livestream_checked begin
>         @warn "Livestream checked: $livestream_checked"
>
>         @async begin
>             while livestream_checked
>                 @warn livestream_checked
>                 ext_heat_in_temp_trace =  update_stream()
>                 sleep(1)
>                 yield()
>             end
>         end |> errormonitor
>     end
>
> However, I upped the number of points that would need to be plotted to slow down the app and noticed something rather odd. After unchecking the livestream_checked box, I did see a log message (a @warn message) that said "Livestream checked: false", but the task kept looping anyway. So it seems that livestream_checked inside the async task refers to a different variable and never sees the update.

Apparently the issue is that when the variable's value is changed in the browser, the new value is propagated to the backend but not to the async task running the loop. This could be a limitation of Observables.jl, which is what Stipple uses for reactivity. To make your loop work, you'd need to define another variable to control the loop, and change its value from the Julia code instead of from the browser.

Here's what I did to make it work:

    @private run_livestream = false
    @onchange livestream_checked begin
        run_livestream = livestream_checked
        @warn "Livestream checked: $livestream_checked"
        @async begin
            while run_livestream
                @warn livestream_checked
                @warn run_livestream
                ext_heat_in_temp_trace =  update_stream()
                sleep(1)
            end
        end |> errormonitor
    end
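The control-variable idea can also be sketched outside Genie, as a background loop that re-reads a shared flag on every iteration. This is a minimal plain-Julia sketch, not the Stipple mechanism: `run_flag`, `ticks`, and `start_loop` are hypothetical names, and an atomic flag is used here in place of a @private model field.

```julia
# Plain-Julia sketch (no Genie/Stipple): a background loop controlled by a shared flag.
const run_flag = Threads.Atomic{Bool}(false)
const ticks = Threads.Atomic{Int}(0)

function start_loop(flag::Threads.Atomic{Bool}, counter::Threads.Atomic{Int}; interval = 0.01)
    flag[] = true
    task = @async begin
        # The flag is re-read on every iteration, so a later `flag[] = false` is seen here.
        while flag[]
            Threads.atomic_add!(counter, 1)   # stand-in for the real work, e.g. update_stream()
            sleep(interval)
        end
    end
    return errormonitor(task)                 # surface exceptions from the task in the log
end

task = start_loop(run_flag, ticks)
sleep(0.1)            # let the loop run for a few iterations
run_flag[] = false    # stop request issued from outside the task
wait(task)            # returns once the loop has observed the flag
```

Because the loop and the code that flips the flag refer to the same object, the stop request takes effect within one `interval`.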

@PGimenez
Member

However, I think this issue is specific to your MWE @zygmuntszpak, perhaps because of the configuration changes. For example, this app works fine and I'm turning off the loop with a button in the browser:

module App
using GenieFramework
@genietools

@app begin
    @in running = false
    @in x = 0.00
    @in spawn = false
    @onbutton spawn begin
        if !running
            running = true
            @async begin
                x = 0
                while x <= 100 && running
                    x = x + 1
                    sleep(1)
                end
                x = 0
                running = false
            end
        end
    end
end

function ui()
    [btn("Spawn task", @click(:spawn)),btn("Stop task", @click("running = false")),bignumber("Counter", :x)]
end

@page("/", ui)
Server.isrunning() || Server.up()
end
