Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implements system signal handling for archiving interruption / termination #22487

Open
wants to merge 10 commits into
base: 5.x-dev
Choose a base branch
from

Conversation

sgiehl
Copy link
Member

@sgiehl sgiehl commented Aug 7, 2024

Description

Provides handling of SIGINT and SIGTERM for both core:archive and scheduled-tasks:run.

SIGINT

When receiving SIGINT, the current step in the process (currently running archives, current scheduled task) should be completed and then the command should gracefully stop.

SIGTERM

When receiving SIGTERM, the archiving tries to stop immediately, based on the current step and system capabilities.

Due to the way the default CliMulti is implemented, archiving requests made with curl or shell_exec will still be awaited to completion. Archiving tasks run with Symfony/Process will be killed immediately and the invalidation of those are reset to be picked up in the next archiving run.

As scheduled tasks are stopped after the current task is finished, to avoid any timing problems with emails not being sent or being sent too often, or aborted downloads leaving the system in an unplanned state.

Tests

The behaviour of both signals should have full test coverage, documenting and asserting the way invalidations are handled.

As those tests require interaction with background shell processes, they currently use Symfony/Process, though a switch to proc_open is possible.

Every step in the tests is safeguarded by timeouts, to not leave zombie processes running if a test fails. For local debugging, depending on where a breakpoint is set (should work through the all shell execution levels), the timeouts can be changed in the implemented waitForSuccess helper methods to allow for an indefinite wait. This may require a local process cleanup after the test finishes, as some background tasks (e.g. CliMulti::executeAsyncCli) are not stopped with the associated test or core:archive process.

Refs DEV-18318

Review

@sgiehl sgiehl added the Stability For issues that make Matomo more stable and reliable to run for sys admins. label Aug 7, 2024
@sgiehl sgiehl added this to the 5.2.0 milestone Aug 7, 2024
@sgiehl sgiehl changed the title Implements system signal handling for archive interruption / termination Implements system signal handling for archiving interruption / termination Aug 7, 2024
@mneudert mneudert force-pushed the processsignals branch 2 times, most recently from 9a75dc0 to 78826a9 Compare August 16, 2024 15:23
@mneudert mneudert force-pushed the processsignals branch 4 times, most recently from 8d19658 to 138b7e8 Compare August 26, 2024 19:03
@mneudert mneudert force-pushed the processsignals branch 2 times, most recently from a53f679 to 66d7fba Compare August 28, 2024 08:11
@mneudert mneudert marked this pull request as ready for review August 28, 2024 14:34
@mneudert mneudert added Enhancement For new feature suggestions that enhance Matomo's capabilities or add a new report, new API etc. Needs Review PRs that need a code review labels Aug 28, 2024
@mneudert mneudert requested a review from a team August 28, 2024 14:34
@mneudert mneudert changed the base branch from 5.x-dev to cli-multi-symfony August 28, 2024 14:35
Base automatically changed from cli-multi-symfony to 5.x-dev September 2, 2024 15:30
caddoo
caddoo previously approved these changes Sep 5, 2024
Copy link
Contributor

@caddoo caddoo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice clean implementation, with good test coverage 👍

public function setUp(): void
{
if (!extension_loaded('pcntl') || !function_exists('pcntl_signal')) {
$this->markTestSkipped('signal test cannot run without ext-pcntl');
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand exactly when this might happen, can you elaborate?

Just worried that for some reason we skip this test without realizing and things break.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you run the tests on a system where the methods might be disabled for example. We have similar checks for other things as well. It might though be useful to start tracking skipped tests on our CI, so we are aware of changes...

@@ -121,6 +131,11 @@ public function run()

// loop through each task
foreach ($tasks as $task) {
if (in_array($this->signal, [\SIGINT, \SIGTERM], true)) {
$this->logger->info("Scheduler: Aborting due to received signal");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be helpful to log the signal that is received, SIGINT or SIGTERM

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might not hurt, but it also doesn't make a difference between the signals here.
Also this is currently only triggered through core archiving, and that one prints the signal once it receives one: https://github.com/matomo-org/matomo/pull/22487/files#diff-f41c275e883996b8b69cb765b1818e12e82cbe034e648bc3f7d03610cc7abad1R317

michalkleiner
michalkleiner previously approved these changes Sep 6, 2024
Copy link
Contributor

@michalkleiner michalkleiner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nicely done indeed! The UI test failures are unrelated.

Copy link
Contributor

This issue is in "needs review" but there has been no activity for 7 days. ping @matomo-org/core-reviewers

@github-actions github-actions bot added the Stale The label used by the Close Stale Issues action label Sep 14, 2024
Copy link
Contributor

This PR was last updated more than one month ago, maybe it's time to close it. Please check if there is anything we still can do or close this PR. ping @matomo-org/core-reviewers

@github-actions github-actions bot added the Stale for long The label used by the Close Stale Issues action label Oct 26, 2024
@github-actions github-actions bot removed the Stale The label used by the Close Stale Issues action label Nov 20, 2024
@mneudert mneudert dismissed stale reviews from michalkleiner and caddoo via f5dcce5 November 22, 2024 14:51
@mneudert mneudert added Do not close PRs with this label won't be marked as stale by the Close Stale Issues action and removed Stale for long The label used by the Close Stale Issues action labels Nov 22, 2024
@sgiehl sgiehl modified the milestones: 5.2.0, 5.3.0 Dec 9, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Do not close PRs with this label won't be marked as stale by the Close Stale Issues action Enhancement For new feature suggestions that enhance Matomo's capabilities or add a new report, new API etc. Needs Review PRs that need a code review Stability For issues that make Matomo more stable and reliable to run for sys admins.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants