bug: Some cloud resource state info is not saved if the process exits due to OS_SIGNALS or network failures
#257
Comments
We need to increase the knowledge base
@Horiodino Can you include where you found the bug?
Maybe a global sync.WaitGroup can help with this problem, along with a context.Context to know when the stop signal is triggered. In Cleanup() we can wait until the WaitGroup has reached 0.
Or we can wait just for the data to be saved and then return the exit code. But it makes little difference whether it returns after saving or after running the entire function.
I was just testing AWS HA and for some reason my system crashed. After that I saw that the API request had successfully created the resource, but the data was not saved: it takes a few seconds to allocate resources, and to save that info we need to wait for some time.
Yes, I am planning to use sync.WaitGroup to tell us when something is not yet completed.
"OS_SIGNALS or network failures"
We should use the process context to listen for these:
import (
	"context"
	"log/slog"
	"os"
	"os/signal"
	"sync"
)

// Run starts the ksctl server and blocks until it is asked to stop.
// startServer and closeServer are assumed to be defined elsewhere.
func Run(ctx context.Context, wg *sync.WaitGroup) {
	defer wg.Done()

	sig := make(chan os.Signal, 1)
	signal.Notify(sig, os.Interrupt)
	defer signal.Stop(sig)

	// Start the server.
	go func() {
		if err := startServer(ctx); err != nil {
			// slog takes key-value pairs, not printf-style verbs.
			slog.Error("Failed to start server", "error", err)
		}
	}()

	// Wait for a signal to terminate.
	select {
	case <-sig:
		slog.Info("Received termination signal. Shutting down...")
	case <-ctx.Done():
		slog.Info("Context done. Shutting down...")
	}

	// TODO: wait for all goroutines to finish.
	// Close the server.
	if err := closeServer(); err != nil {
		slog.Error("Failed to close server", "error", err)
	}
}
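For listening on the process context, a minimal sketch could look like the following. It assumes the standard library's signal.NotifyContext is acceptable here; waitForStop and the 2-second timeout are made up for illustration and are not part of ksctl.

```go
package main

import (
	"context"
	"fmt"
	"os"
	"os/signal"
	"syscall"
	"time"
)

// waitForStop (hypothetical helper) blocks until done is closed or the
// timeout elapses, and reports which of the two happened.
func waitForStop(done <-chan struct{}, timeout time.Duration) string {
	select {
	case <-done:
		return "stopped"
	case <-time.After(timeout):
		return "timeout"
	}
}

func main() {
	// ctx is cancelled when SIGINT or SIGTERM is delivered to the process.
	ctx, stop := signal.NotifyContext(context.Background(), os.Interrupt, syscall.SIGTERM)
	defer stop()

	// Long-running work would take ctx and check ctx.Done() between steps.
	fmt.Println(waitForStop(ctx.Done(), 2*time.Second))
}
```

The same ctx can be passed down to every provider call, so each one learns about the stop request without installing its own signal handler.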
Describe 🐞
We have the ksctl dashboard and CLI, which currently use a static db (#226). What we can do is use a db for the dashboard and local storage for the CLI.
For now we can use contexts (dashboard and CLI) in the core and, based on the context, use it in the savestatehelper function. We use the context as a const, but we need to check the condition regarding the context.
You can check the code below for better context
Here you can see that if the program terminates just after resp gets the result from the Azure API, but just before it is assigned to the azureCloudState, there is potential data loss.
Provider in effect
Reproduce 💻 ➡️ 💻
The timing is hard to get right, but there is a fairly good chance that it will happen at some point.
To reproduce, wait until all NICs are created, then wait 2 or 3 seconds and exit the program.
Possible Fix & Expected behavior🔧
We could have a global WaitGroup that acts as a guard: before the program terminates, waitgroup.Wait is called so that any left-out work completes first.
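A rough sketch of that guard, under some assumptions: pending, goProtected and the timeout in Cleanup are hypothetical names and choices, and the map stands in for the real cloud state. The idea is that the API call and the state save run as one tracked unit, and Cleanup is called from wherever the stop signal is handled.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// pending counts critical sections (API call plus state save) in flight.
var pending sync.WaitGroup

// goProtected runs fn in a goroutine that Cleanup will wait for.
// Add is called before the goroutine starts, so it cannot race with Wait.
func goProtected(fn func()) {
	pending.Add(1)
	go func() {
		defer pending.Done()
		fn()
	}()
}

// Cleanup waits until every protected section has finished, or until the
// timeout expires. It reports whether everything finished in time.
func Cleanup(timeout time.Duration) bool {
	done := make(chan struct{})
	go func() {
		pending.Wait()
		close(done)
	}()
	select {
	case <-done:
		return true
	case <-time.After(timeout):
		return false
	}
}

func main() {
	state := map[string]string{} // stands in for the saved cloud state

	goProtected(func() {
		time.Sleep(100 * time.Millisecond) // hypothetical cloud API call
		state["nic"] = "created"           // hypothetical state save
	})

	// A stop request arrives here; wait for the save before exiting.
	if Cleanup(5 * time.Second) {
		fmt.Println("state saved:", state["nic"])
	} else {
		fmt.Println("timed out waiting for state to be saved")
	}
}
```

The timeout is a design choice so that a hung network call cannot block shutdown forever. This does not help with a hard crash of the system, only with signals the process can still handle.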
Screenshots 🖼️
Operating System
ksctl version
current and all previous versions
Check Contribution's guidelines