Error Wal-fetch #426

Open
rroux opened this issue Oct 6, 2020 · 1 comment


rroux commented Oct 6, 2020

Hello,
I'm currently testing a restore on one of our PostgreSQL Docker containers.
The backup works fine.
Restoring the full backup works, but replaying the WAL files does not.
The restore starts, and after a few minutes I get this kind of error in the PostgreSQL logs:

[2020-10-06 06:38:30 UTC] LOG: restored log file "00000001000009000000001B" from archive
[2020-10-06 06:38:30 UTC] LOG: server process (PID 733444) exited with exit code 2
[2020-10-06 06:38:30 UTC] LOG: terminating any other active server processes
[2020-10-06 06:38:30 UTC] FATAL: could not restore file "00000001000009000000001C" from archive: child process was terminated by signal 3: Quit
[2020-10-06 06:38:30 UTC] LOG: all server processes terminated; reinitializing
[2020-10-06 06:38:30 UTC] LOG: database system was interrupted while in recovery at log time 2020-09-21 01:21:10 UTC
[2020-10-06 06:38:30 UTC] HINT: If this has occurred more than once some data might be corrupted and you might need to choose an earlier recovery target.
[2020-10-06 06:38:30 UTC] LOG: starting point-in-time recovery to 2020-10-05 03:58:55+00
wal_e.operator.backup INFO MSG: promoted prefetched wal segment

Then the restore loops over a few WAL files without stopping.
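
To see which segments the loop keeps coming back to, the "restored log file" messages in the PostgreSQL log can be counted per segment name. The following is only a sketch: the function names are made up for illustration, and it assumes the log lines look like the excerpt above.

```python
import re
from collections import Counter

# Matches the PostgreSQL message seen in the log excerpt above.
RESTORED = re.compile(r'restored log file "([0-9A-F]{24})" from archive')


def count_restored_segments(log_lines):
    """Return a Counter mapping WAL segment name to how often it was restored."""
    counts = Counter()
    for line in log_lines:
        match = RESTORED.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


def repeated_segments(log_lines):
    """Return the names of segments restored more than once, sorted."""
    counts = count_restored_segments(log_lines)
    return sorted(name for name, n in counts.items() if n > 1)
```

Feeding it the lines of the server log (for example `repeated_segments(open(path))`, where `path` is wherever the container writes its log) lists the segments that were fetched again after each restart of recovery.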

Here is my recovery.conf file:
restore_command = 'envdir /var/lib/postgresql/data/wal-e.d/env wal-e wal-fetch %f %p'
recovery_target_time = '2020-10-05 03:58:55'
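
One way to get more detail on how the fetch ends is to put a small wrapper between PostgreSQL and wal-e that records the exit status of each call. This is a hypothetical sketch, not something wal-e ships: the function name `fetch` and the wrapper itself are assumptions, and the command prefix is simply copied from the restore_command above.

```python
import subprocess
import sys

# Command prefix copied from the restore_command in recovery.conf above.
WAL_FETCH = ["envdir", "/var/lib/postgresql/data/wal-e.d/env", "wal-e", "wal-fetch"]


def fetch(segment, dest, command=WAL_FETCH, log=sys.stderr):
    """Run the fetch command for one segment and report how it ended on `log`."""
    result = subprocess.run([*command, segment, dest])
    code = result.returncode
    if code < 0:
        # On POSIX, a negative return code means the child was killed by a signal.
        print(f"wal-fetch {segment}: killed by signal {-code}", file=log)
        # Follow the shell convention of 128 + signal number for the exit status.
        return 128 - code
    print(f"wal-fetch {segment}: exit code {code}", file=log)
    return code


if __name__ == "__main__" and len(sys.argv) == 3:
    # Expected call: <this script> %f %p
    sys.exit(fetch(sys.argv[1], sys.argv[2]))
```

If this were saved as a script, restore_command would call it with `%f %p` instead of calling wal-e directly; the script location is up to you. The messages go to stderr, which should end up next to the PostgreSQL log lines if the server's stderr is captured.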


hatharom commented Feb 9, 2021

How does the backup work for you inside Docker?
Whenever I try to run wal-e backup-fetch, I get this error:
wal_e.exception.UserException: ERROR: MSG: attempting to overwrite a live data directory

DETAIL: Found a postmaster.pid lockfile, and aborting
HINT: Shut down postgres. If there is a stale lockfile, then remove it after being very sure postgres is not running.

Which is clear: Postgres shouldn't be running.
However, Postgres (at least in the official postgres image) can't be stopped inside the container, because the container itself will exit.

If I ran wal-e from another container, the postgres container itself wouldn't be able to call wal-e commands.
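
For reference, the condition the error above complains about (a postmaster.pid lockfile in the data directory) can be checked up front before calling backup-fetch. This is a rough sketch of such a check, not wal-e's own code; the function name is made up, and it relies only on the fact that the first line of postmaster.pid holds the postmaster's PID.

```python
import os


def postmaster_lockfile(data_dir):
    """Return (exists, pid) for data_dir/postmaster.pid.

    pid is None when the lockfile is missing or its first line is not a number.
    """
    path = os.path.join(data_dir, "postmaster.pid")
    try:
        with open(path) as f:
            first_line = f.readline().strip()
    except FileNotFoundError:
        return (False, None)
    pid = int(first_line) if first_line.isdigit() else None
    return (True, pid)
```

A result of `(True, pid)` only says that a lockfile is there. Whether that PID still belongs to a running Postgres has to be verified separately, and from a different container the PID may not be visible at all.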
