Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make module data dir and config reading logs debug level #4566

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

benjirewis
Copy link
Member

@benjirewis benjirewis commented Nov 19, 2024

No description provided.

@viambot viambot added the safe to test This pull request is marked safe to test from a trusted zone label Nov 19, 2024
@@ -73,7 +73,7 @@ func newCloudWatcher(ctx context.Context, config *Config, logger logging.Logger)
}
newConfig, err := readFromCloud(cancelCtx, config, prevCfg, false, checkForNewCert, logger)
if err != nil {
logger.Errorw("error reading cloud config", "error", err)
logger.Debugw("error reading cloud config; will try again", "error", err)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I (and apparently SEs) have found this log to be output quite frequently on machines with a poor network connection. Particularly when we're in the firstRead case, and the timeout for reading is lowered, this path is not really an "error." I would be in favor of lowering this to the debug level.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm bought into not needing to log this every 10 seconds. I do like having something loosely track whether we're online or offline, but happy to find a way to get that into FTDC.

For robot startup, do we at least log whether we've got a cloud config or are using the cached config? I would like to keep a log line for that. But happy to have it manifest as a simple "we're using a cloud config" or "we're using the cached config". And not inferring which config is getting used based on seeing "error reading cloud config".

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we at least log whether we've got a cloud config or are using the cached config?

Good question; we only log when we are relying on the cache. These logs still appear with my changes:

2024-11-19T18:44:06.644Z	WARN	rdk	config/reader.go:621	failed to read config from cloud, checking cache	{"error":"context deadline exceeded"}
2024-11-19T18:44:06.646Z	WARN	rdk	config/reader.go:637	unable to get cloud config; using cached version	{"config last updated":"2024-11-19T13:43:54.075-0500","error":"context deadline exceeded"

We do not log when we are relying on a cloud config. I could add that, but I also feel that since that's the default expected behavior, maybe we don't want a log for it?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We do not log when we are relying on a cloud config. I could add that, but I also feel that since that's the default expected behavior, maybe we don't want a log for it?

I think we can live with that. Being explicit wouldn't hurt either.

The config/reader.go:621 log line -- is there any value in that one existing? My guess is that the "error, context deadline exceeded' that's logged as part of 637 is just a copy of the error reported by 621?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The config/reader.go:621 log line -- is there any value in that one existing?

Yeah I don't think so; removed because your guess below is exactly correct and nothing too critical happens between the two log lines.

My guess is that the "error, context deadline exceeded' that's logged as part of 637 is just a copy of the error reported by 621?

module/server.go Outdated Show resolved Hide resolved
module/modmanager/manager.go Show resolved Hide resolved
module/modmanager/manager.go Show resolved Hide resolved
@benjirewis benjirewis marked this pull request as ready for review November 19, 2024 17:11
Copy link
Member

@dgottlieb dgottlieb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a clarification on the cloud config error for startup

@@ -73,7 +73,7 @@ func newCloudWatcher(ctx context.Context, config *Config, logger logging.Logger)
}
newConfig, err := readFromCloud(cancelCtx, config, prevCfg, false, checkForNewCert, logger)
if err != nil {
logger.Errorw("error reading cloud config", "error", err)
logger.Debugw("error reading cloud config; will try again", "error", err)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm bought into not needing to log this every 10 seconds. I do like having something loosely track whether we're online or offline, but happy to find a way to get that into FTDC.

For robot startup, do we at least log whether we've got a cloud config or are using the cached config? I would like to keep a log line for that. But happy to have it manifest as a simple "we're using a cloud config" or "we're using the cached config". And not inferring which config is getting used based on seeing "error reading cloud config".

module/modmanager/manager.go Show resolved Hide resolved
module/modmanager/manager.go Show resolved Hide resolved
module/server.go Outdated Show resolved Hide resolved
@benjirewis benjirewis marked this pull request as draft November 19, 2024 18:51
@benjirewis
Copy link
Member Author

Converting to draft until we find fix for data races that new goutils version brought up.

@viambot viambot added safe to test This pull request is marked safe to test from a trusted zone and removed safe to test This pull request is marked safe to test from a trusted zone labels Nov 20, 2024
@viambot viambot added safe to test This pull request is marked safe to test from a trusted zone and removed safe to test This pull request is marked safe to test from a trusted zone labels Nov 21, 2024
@benjirewis benjirewis marked this pull request as ready for review November 21, 2024 22:16
@viambot viambot added safe to test This pull request is marked safe to test from a trusted zone and removed safe to test This pull request is marked safe to test from a trusted zone labels Nov 21, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
safe to test This pull request is marked safe to test from a trusted zone
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants