How to replace broken hard drive on a bare metal server? #786
Comments
Disko can run incrementally, but we don't recommend it for users who don't have good recovery options, since we have not tested all edge cases. If you are just testing, though, you can check whether it works for your configuration.
Thanks for the suggestion 🙇. And yes, this machine doesn't have anything important on it yet, so losing all of my data is okay. I tried your suggestion by moving my flake and all .nix files into the server
@onnimonni There's a typo in your command. You wrote
Ah, that's true, and thanks for the help. I guess this doesn't work because I needed to use a configurable list of drives
But I got it working by directly using the flake instead of just the disko config. This was probably because I was using the
After this the partitions were created properly but it didn't mount the
I then tried to replace the old partition with the new one, but it failed:
I got the "new" disk back into the zpool by running:
And then rebooted the machine and also the
Happy to hear feedback about this approach, but I'm glad to see this worked out 👍 I'm willing to summarize this guide and do a PR to create
Glad to hear it! Hmm, I would say that ideally, disko should be able to do this automatically. As I mentioned in #107, that would mean modelling a degraded pool and re-attaching devices when disko runs. Feel free to write a guide on this, I'll be happy to review it! Make sure that every step is extremely clear, and show a full configuration that allows readers to follow the exact steps. Ideally, go through all the steps again on your test machine and document them while you're doing it, to make sure the guide actually works.
In case someone else wants to take a stab at writing the documentation: the repo used for your test configuration (and the relevant step in its history) is https://github.com/onnimonni/hetzner-auction-nixos-example/tree/45aaf7100167f08f417224fd6a1b1dac74795fb9, right @onnimonni?
I took a stab at drafting some docs on this, but I haven't been able to test them because I don't have any unused hardware lying around. Feel free to take as much or as little inspiration from them as you would like.
We usually simulate these steps with qemu's nvme emulation: https://qemu-project.gitlab.io/qemu/system/devices/nvme.html
This is a script I had lying around:
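For readers without spare hardware, a minimal sketch of this kind of simulation might look like the following. It is not the script referred to above. The image names, sizes, serials and memory setting are made-up placeholders; the only QEMU-specific parts are the `-drive ...,if=none` and `-device nvme,serial=...,drive=...` options described on the linked page. A boot medium (for example an installer ISO) still has to be added. With `DRY_RUN=1` (the default here) it only prints the commands.

```shell
#!/bin/sh
# Sketch: start a VM with two emulated NVMe drives so a disk "failure" can be rehearsed.
# Hypothetical placeholders: disk0.img/disk1.img, 8G images, 4G of RAM, serial0/serial1.
DRY_RUN="${DRY_RUN:-1}"

run() {
  # Print the command in dry-run mode, execute it otherwise.
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

nvme_args() {
  # Emit the QEMU arguments for one emulated NVMe drive.
  # $1 = image file, $2 = drive id, $3 = serial (the nvme device requires one)
  printf '%s ' -drive "file=$1,if=none,format=raw,id=$2" \
               -device "nvme,serial=$3,drive=$2"
}

run qemu-img create -f raw disk0.img 8G
run qemu-img create -f raw disk1.img 8G

# Word splitting of the nvme_args output is intentional here.
# shellcheck disable=SC2046
run qemu-system-x86_64 -m 4G \
  $(nvme_args disk0.img nvme0 serial0) \
  $(nvme_args disk1.img nvme1 serial1)
```

Inside such a VM, one of the two drives can be wiped and the recovery steps repeated as often as needed without touching real hardware.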
Oh, that would be very useful for testing out the steps I have written. I'll give it a stab sometime later this week (likely on Friday or Saturday, when I'm stuck on a plane/layover).
@jan-leila thanks for writing the disko disk replacement docs. I tried to follow them, but I only have an Apple Silicon laptop available and my server is x86-64. I think I have a working x86-64 Linux builder available locally, but when I ran the disko format command from my own machine it failed like this:
$ nix run github:nix-community/disko -- --mode format --flake .#myHost root@my-machine
error: flake 'github:nix-community/disko' does not provide attribute 'apps.aarch64-darwin.default', 'defaultApp.aarch64-darwin', 'packages.aarch64-darwin.default' or 'defaultPackage.aarch64-darwin'
looks like the tool only supports building on:
Assuming the remote build feature you are trying to use exists (I have never used it myself and don't want to speak authoritatively on it), there should probably be a build for Macs. Or maybe this tool needs to be split into two: one part for provisioning, and one for calling the provisioner and providing it with config. Maybe open a separate issue about this?
I think you’ll want to use nixos-anywhere if you want to run disko from another machine |
True. I didn't realise this wasn't available when running remotely, so I just copied the flake onto the remote machine like I did last time.
I then bumped into the following version mismatch error:
And then updated disko on the remote server
After this I was successfully able to run:
It returned lots of errors about things already existing, but for the disk which I had wiped with Then I just needed to run:
And it started to resilver the missing partition. If it had been a new disk, I would of course have changed the disks in my disko config, and the replace command above would have worked directly without running the Thanks a lot for the guide 👍
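The re-attach step described above can be sketched roughly as follows. The exact commands used in this thread are not shown, so this is only an illustration: the pool name `zroot` and the partition paths are hypothetical placeholders, while `zpool status` and `zpool replace <pool> <old-device> <new-device>` are the standard OpenZFS commands. It defaults to a dry run that only prints what would be executed.

```shell
#!/bin/sh
# Sketch: put a re-created partition back into an existing pool and check the resilver.
# Hypothetical placeholders: pool "zroot" and the two partition paths below.
POOL="${POOL:-zroot}"
OLD_DEV="${OLD_DEV:-/dev/disk/by-partlabel/old-zfs-part}"
NEW_DEV="${NEW_DEV:-/dev/disk/by-partlabel/new-zfs-part}"
DRY_RUN="${DRY_RUN:-1}"

run() {
  # Print the command in dry-run mode, execute it otherwise.
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

replace_member() {
  # Show the degraded pool first. If the old device no longer exists,
  # zpool status lists it by a numeric GUID, which can be used as OLD_DEV.
  run zpool status "$POOL"
  # Swap the missing member for the freshly created partition.
  run zpool replace "$POOL" "$OLD_DEV" "$NEW_DEV"
  # Running status again shows the resilver progress.
  run zpool status "$POOL"
}

replace_member
```

Setting `DRY_RUN=0` together with the real pool and device names would execute the commands instead of printing them.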
Can you show me the command to just run disko format remotely? When I ran with I would only want the new disk to be formatted and to join the zpool I have running.
Actually I was mistaken and
Hey,
I'm preparing for the case where one or more of my hard drives will eventually fail. To simulate this, I put one of my machines into rescue mode and completely wiped the partitions of one drive with:
wipefs -a /dev/nvme1n1
and rebooted (nvme1n1 contained the /boot ESP partition in my case and nvme0n1 had the fallback boot).
It booted up nicely, and now I'm wondering what the recommended way is to recreate the partitions on a new drive and let the drive join the existing zfs pool?
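Before attempting any repair, it can help to confirm what the system sees after such a wipe. The following is a small sketch, assuming the device names from this post (`nvme0n1`, `nvme1n1`); `lsblk -o` and `zpool status -x` are standard commands, and the dry-run wrapper is only there so the snippet can be read and tried safely.

```shell
#!/bin/sh
# Sketch: inspect the partition layout and pool health after wiping one drive.
# Device names are taken from the post; adjust them for your machine.
DRY_RUN="${DRY_RUN:-1}"

run() {
  # Print the command in dry-run mode, execute it otherwise.
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

inspect_disks() {
  # The wiped drive should show no partitions, the healthy one its full layout.
  run lsblk -o NAME,SIZE,FSTYPE,PARTLABEL /dev/nvme0n1 /dev/nvme1n1
  # -x limits the output to pools that are degraded or have errors.
  run zpool status -x
}

inspect_disks
```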
I tried just deploying the same disko config again, and it fails because of the missing /boot partition: