Second failure

After another reboot, rpool went missing again. At this point:

#zfs list
no datasets available

An attempt to scrub rpool also failed:

#zpool scrub rpool
cannot open 'rpool': no such p

Importing rpool worked, but the pool is reported as DEGRADED, which points to a disk problem:

[root@testtfs-1-1 /var/log]
#zpool import -a

[root@testtfs-1-1 /var/log]
#zpool list
NAME    SIZE  ALLOC   FREE  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
rpool  7.25T   161G  7.09T         -     0%     2%  1.00x  DEGRADED  -

Check the pool for problems:

#zpool status -v rpool
  pool: rpool
 state: DEGRADED
status: One or more devices could not be used because the label is missing or
        invalid.  Sufficient replicas exist for the pool to continue
        functioning in a degraded state.
action: Replace the device using 'zpool replace'.
   see: http://zfsonlinux.org/msg/ZFS-8000-4J
  scan: scrub repaired 0 in 0h19m with 0 errors on Tue Jul 18 12:15:21 2017
config:

        NAME                                      STATE     READ WRITE CKSUM
        rpool                                     DEGRADED     0     0     0
          raidz1-0                                DEGRADED     0     0     0
            8555117260269250814                   UNAVAIL      0     0     0  was /dev/sdb1
            HGST_HUS724020ALA640_PN2134P6HJEZMP1  ONLINE       0     0     0
            HGST_HUS724020ALA640_PN2134P6HKPSKP1  ONLINE       0     0     0
            HGST_HUS724020ALA640_PN2134P6HLLU0X1  ONLINE       0     0     0

errors: No known data errors

This problem closely resembles the one described in "how to recover degraded zpool".

One disk in the raidz vdev has become UNAVAIL:

8555117260269250814                   UNAVAIL      0     0     0  was /dev/sdb1
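
To spot such a device without reading the whole table, the config section of the status output can be filtered. The following is a minimal sketch, not part of the original notes: `list_unhealthy_vdevs` is a hypothetical helper name, and it assumes the column layout shown above (device name first, state second).

```shell
#!/bin/sh
# Hypothetical helper: read `zpool status` output on stdin and print
# "<name> <state>" for each device whose state is UNAVAIL, FAULTED,
# OFFLINE or REMOVED. DEGRADED is deliberately skipped, because the pool
# and raidz rows report it whenever any member is unhealthy.
list_unhealthy_vdevs() {
    awk '
        # The device table starts after the NAME/STATE header row.
        $1 == "NAME" && $2 == "STATE" { in_config = 1; next }
        # The "errors:" line marks the end of the table.
        $1 == "errors:"               { in_config = 0 }
        in_config && $2 ~ /^(UNAVAIL|FAULTED|OFFLINE|REMOVED)$/ { print $1, $2 }
    '
}

# Usage on a live system (not executed here):
#   zpool status -v rpool | list_unhealthy_vdevs
```

The first field of each printed line is the identifier to pass to `zpool replace` or `zpool offline`; for a device whose label is missing, that is the numeric GUID.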

The fix is to replace the faulted disk, referring to it by the GUID shown in the status output:

zpool replace rpool 8555117260269250814 sdb1

Then scrub the pool to verify its data:

zpool scrub rpool
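
A replace starts a resilver, so it can help to wait for the resilver to finish before scrubbing. The sketch below classifies the `scan:` line of the status output. `scan_activity` is a hypothetical helper name, and the "in progress" wording it matches is an assumption based on typical ZFS on Linux output, so check it against your version.

```shell
#!/bin/sh
# Hypothetical helper: read `zpool status` output on stdin and print one of
# "resilvering", "scrubbing" or "idle", based on the "scan:" line.
# Assumption: an active operation is reported as
# "scan: resilver in progress ..." or "scan: scrub in progress ...".
scan_activity() {
    awk '
        $1 == "scan:" && $3 == "in" && $4 == "progress" {
            if ($2 == "resilver")   state = "resilvering"
            else if ($2 == "scrub") state = "scrubbing"
        }
        END { print (state == "" ? "idle" : state) }
    '
}

# Usage on a live system (not executed here):
#   while [ "$(zpool status rpool | scan_activity)" = "resilvering" ]; do
#       sleep 60
#   done
#   zpool scrub rpool
```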

A close look at the zpool status -v rpool output above already shows what to do (see the action line):

 state: DEGRADED
status: One or more devices could not be used because the label is missing or
        invalid.  Sufficient replicas exist for the pool to continue
        functioning in a degraded state.
action: Replace the device using 'zpool replace'.
   see: http://zfsonlinux.org/msg/ZFS-8000-4J
  scan: scrub repaired 0 in 0h19m with 0 errors on Tue Jul 18 12:15:21 2017
  • Hands-on record

After rebooting the server once more, rpool is imported and its datasets are mounted normally:

#zpool list
NAME    SIZE  ALLOC   FREE  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
rpool  7.25T   161G  7.09T         -     0%     2%  1.00x  DEGRADED  -
#zfs list
NAME           USED  AVAIL  REFER  MOUNTPOINT
rpool          117G  5.07T   198K  none
rpool/data    87.6G  5.07T  87.6G  /data
rpool/docker  29.7G  5.07T  29.7G  /var/lib/docker

#df -h
Filesystem      Size  Used Avail Use% Mounted on
...
rpool/data      5.2T   88G  5.1T   2% /data
rpool/docker    5.1T   30G  5.1T   1% /var/lib/docker

However, the faulted disk is still part of the pool at this point.

Attempt the replacement:

#zpool replace rpool 8555117260269250814 sdb1

invalid vdev specification
use '-f' to override the following errors:
/dev/sdb1 is part of active pool 'rpool'

#zpool replace -f rpool 8555117260269250814 sdb1

invalid vdev specification
the following errors must be manually repaired:
/dev/sdb1 is part of active pool 'rpool'
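
The notes do not say why /dev/sdb1 was rejected. One possibility, which is an assumption and not something the source confirms, is that kernel device names shifted across reboots, so /dev/sdb1 now points at a disk that is already a healthy pool member (the other three members are listed by persistent ID, not by sdX name). A minimal sketch for checking which persistent name a device node maps to; `byid_names_for` is a hypothetical helper name, and its directory argument exists only so the function can be tried on a scratch directory:

```shell
#!/bin/sh
# Hypothetical helper: print every symlink name in a by-id style directory
# that resolves to the given device node.
#   $1  device node, e.g. /dev/sdb1
#   $2  directory of symlinks (defaults to /dev/disk/by-id)
byid_names_for() {
    target=$(readlink -f "$1") || return 1
    dir=${2:-/dev/disk/by-id}
    for link in "$dir"/*; do
        # Skip anything that is not a symlink (including an unmatched glob).
        [ -L "$link" ] || continue
        if [ "$(readlink -f "$link")" = "$target" ]; then
            basename "$link"
        fi
    done
}

# Usage on a live system (not executed here):
#   byid_names_for /dev/sdb1
```

If a printed name matches one of the ONLINE members in the status output, the sdX name refers to a healthy disk and not to the failed one.
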
  • First take the faulted disk offline
zpool offline rpool 8555117260269250814

Checking the status now confirms that the faulted disk is OFFLINE:

#zpool status rpool
  pool: rpool
 state: DEGRADED
status: One or more devices has been taken offline by the administrator.
        Sufficient replicas exist for the pool to continue functioning in a
        degraded state.
action: Online the device using 'zpool online' or replace the device with
        'zpool replace'.
  scan: scrub repaired 0 in 0h18m with 0 errors on Mon Oct 30 12:23:26 2017
config:

        NAME                                      STATE     READ WRITE CKSUM
        rpool                                     DEGRADED     0     0     0
          raidz1-0                                DEGRADED     0     0     0
            8555117260269250814                   OFFLINE      0     0     0  was /dev/sdb1
            HGST_HUS724020ALA640_PN2134P6HJEZMP1  ONLINE       0     0     0
            HGST_HUS724020ALA640_PN2134P6HKPSKP1  ONLINE       0     0     0
            HGST_HUS724020ALA640_PN2134P6HLLU0X1  ONLINE       0     0     0

errors: No known data errors
  • Contact the data center to swap the disk

The new disk is again given a single partition, /dev/sdb1.

  • Attempt to replace the faulted disk
zpool replace rpool 8555117260269250814 sdb1
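
The steps in this record can be summarized as a dry run. The sketch below only prints the commands and does not execute them, so it is safe to try anywhere. `print_replace_plan` is a hypothetical helper name, and the closing `zpool status` check is an addition for illustration that the original notes do not list.

```shell
#!/bin/sh
# Hypothetical helper: print (never execute) the replacement sequence.
#   $1  pool name
#   $2  identifier of the failed device as shown by `zpool status`
#   $3  new device to put in its place
print_replace_plan() {
    pool=$1
    old_vdev=$2
    new_dev=$3
    printf 'zpool offline %s %s\n' "$pool" "$old_vdev"
    printf '# swap the physical disk, then create %s\n' "$new_dev"
    printf 'zpool replace %s %s %s\n' "$pool" "$old_vdev" "$new_dev"
    printf 'zpool status %s\n' "$pool"
}

# Usage:
#   print_replace_plan rpool 8555117260269250814 sdb1
```

Printing the plan first makes it easy to review the GUID and device name before running anything against the pool.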

References