Proxmox: replacing a disk in a ZFS pool
For once, this blog post is not about telecom, Cisco/Juniper/Nokia, or anything like that.
It is just a reminder to myself of how to replace a faulty device in a ZFS pool.
Here is what I started with:
root@pve:~# zpool status -x
root@pve:~# zpool status
pool: pve-zfs
state: DEGRADED
status: One or more devices could not be used because the label is missing or
invalid. Sufficient replicas exist for the pool to continue
functioning in a degraded state.
action: Replace the device using 'zpool replace'.
see: http://zfsonlinux.org/msg/ZFS-8000-4J
scan: resilvered 41.9M in 0 days 00:00:11 with 0 errors on Sun Jul 24 13:38:51 2022
config:
NAME STATE READ WRITE CKSUM
pve-zfs DEGRADED 0 0 0
mirror-0 DEGRADED 0 0 0
wwn-0x50014ee267b78b52 ONLINE 0 0 0
2534239155907356895 FAULTED 0 0 0 was /dev/sdb1
mirror-1 ONLINE 0 0 0
wwn-0x50014ee267b63342 ONLINE 0 0 0
wwn-0x50014ee2bd0cf6b4 ONLINE 0 0 0
errors: No known data errors
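To pick the faulty entry out of a longer status listing, a small filter can help. This is only a minimal sketch: faulted_vdevs is a name I made up, and it simply looks at the second column of the config table for the FAULTED, UNAVAIL, or REMOVED states.

```shell
# Print the names of unhealthy vdevs, reading `zpool status` output on stdin.
# faulted_vdevs is a hypothetical helper name, not part of ZFS.
faulted_vdevs() {
    # Skip the "state:" header line, then match on the STATE column.
    awk '$1 != "state:" && ($2 == "FAULTED" || $2 == "UNAVAIL" || $2 == "REMOVED") { print $1 }'
}

# Usage on a live system (pool name taken from this post):
#   zpool status pve-zfs | faulted_vdevs
```

The printed name (here the numeric GUID) is what gets passed to zpool replace as the old device.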
But how do you replace this faulty device when every how-to on the net assumes the old disk is still there, to be replaced in place or taken offline first? In my situation the disk had already been sent back under RMA, and I had not thought to take the faulty device offline beforehand.
Nevertheless, I swapped in a new 2 TB disk. But when I ran the replace:
root@pve:~# zpool replace pve-zfs 2534239155907356895 ata-WDC_WD20EFRX-68EUZN0_WD-WCC4M1EUJ8KN
invalid vdev specification
use '-f' to override the following errors:
/dev/disk/by-id/ata-WDC_WD20EFRX-68EUZN0_WD-WCC4M1EUJ8KN-part1 contains a filesystem of type 'ntfs'
After a quick apt-get install parted, I removed the NTFS partition and created a fresh GPT label:
root@pve:~# parted /dev/sda
GNU Parted 3.2
Using /dev/sda
Welcome to GNU Parted! Type 'help' to view a list of commands.
(parted) print
Model: ATA WDC WD20EFRX-68E (scsi)
Disk /dev/sda: 2000GB
Sector size (logical/physical): 512B/4096B
Partition Table: msdos
Disk Flags:
Number Start End Size Type File system Flags
1 1049kB 2000GB 2000GB primary ntfs
(parted) rm 1
(parted) print
Model: ATA WDC WD20EFRX-68E (scsi)
Disk /dev/sda: 2000GB
Sector size (logical/physical): 512B/4096B
Partition Table: msdos
Disk Flags:
Number Start End Size Type File system Flags
(parted) mklabel GPT
Warning: The existing disk label on /dev/sda will be destroyed and all data on this disk will be lost. Do you want to continue?
Yes/No? Yes
(parted) q
Information: You may need to update /etc/fstab.
root@pve:~#
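The same relabel can be done without the interactive prompts, since parted accepts --script. What follows is a sketch, assuming the device path has been double-checked; relabel_gpt and the DRY_RUN variable are hypothetical names of mine, and the helper only prints the command unless DRY_RUN=0 is set. It skips the separate "rm 1" step because mklabel replaces the whole partition table anyway.

```shell
# Relabel a disk with an empty GPT, non-interactively.
# relabel_gpt and DRY_RUN are hypothetical names used for this sketch.
# By default the command is only printed, because mklabel is destructive.
relabel_gpt() {
    dev=$1
    case $dev in
        /dev/*) ;;
        *) echo "refusing: '$dev' is not a /dev path" >&2; return 1 ;;
    esac
    if [ "${DRY_RUN:-1}" = "0" ]; then
        # Destroys the existing partition table on $dev.
        parted --script "$dev" mklabel gpt
    else
        echo "parted --script $dev mklabel gpt"
    fi
}

# Usage:
#   relabel_gpt /dev/sda            # prints the command only
#   DRY_RUN=0 relabel_gpt /dev/sda  # actually runs parted
```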
With the disk now clean, the replace goes through:
root@pve:~# zpool replace pve-zfs 2534239155907356895 ata-WDC_WD20EFRX-68EUZN0_WD-WCC4M1EUJ8KN
root@pve:~# zpool status -x
pool: pve-zfs
state: DEGRADED
status: One or more devices is currently being resilvered. The pool will
continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
scan: resilver in progress since Sun Jul 24 14:23:11 2022
10.5G scanned at 716M/s, 4.04G issued at 276M/s, 450G total
0B resilvered, 0.90% done, 0 days 00:27:37 to go
config:
NAME STATE READ WRITE CKSUM
pve-zfs DEGRADED 0 0 0
mirror-0 DEGRADED 0 0 0
wwn-0x50014ee267b78b52 ONLINE 0 0 0
replacing-1 DEGRADED 0 0 0
2534239155907356895 FAULTED 0 0 0 was /dev/sdb1
ata-WDC_WD20EFRX-68EUZN0_WD-WCC4M1EUJ8KN ONLINE 0 0 0
mirror-1 ONLINE 0 0 0
wwn-0x50014ee267b63342 ONLINE 0 0 0
wwn-0x50014ee2bd0cf6b4 ONLINE 0 0 0
errors: No known data errors
root@pve:~#
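To keep an eye on the resilver without reading the whole status block each time, the percentage can be pulled out of the "% done" line. Again just a sketch: resilver_percent is a made-up helper name, and it assumes the status text contains a "% done" line like the one above.

```shell
# Extract the resilver progress percentage from `zpool status` output on stdin.
# resilver_percent is a hypothetical helper name, not part of ZFS.
# Prints nothing if no "% done" line is found.
resilver_percent() {
    sed -n 's/.* \([0-9][0-9.]*\)% done.*/\1/p'
}

# Usage on a live system (pool name taken from this post):
#   zpool status pve-zfs | resilver_percent
```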
How I got the new device name:
root@pve:~# ls -l /dev/disk/by-id | grep J8KN
lrwxrwxrwx 1 root root 9 Jul 24 14:23 ata-WDC_WD20EFRX-68EUZN0_WD-WCC4M1EUJ8KN -> ../../sda
lrwxrwxrwx 1 root root 10 Jul 24 14:23 ata-WDC_WD20EFRX-68EUZN0_WD-WCC4M1EUJ8KN-part1 -> ../../sda1
lrwxrwxrwx 1 root root 10 Jul 24 14:23 ata-WDC_WD20EFRX-68EUZN0_WD-WCC4M1EUJ8KN-part9 -> ../../sda9
root@pve:~#
Here “J8KN” is a fragment of the serial number, which you can read on the new disk itself.
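The lookup above also returns the -partN links. A small helper can list only the whole-disk name, which is the one to give to zpool replace. This is a sketch under my own assumptions: disk_by_serial is a hypothetical name, and the optional second argument only exists so that another directory can be searched.

```shell
# List whole-disk entries in a by-id directory whose name contains a
# serial-number fragment, skipping the -partN links.
# disk_by_serial is a hypothetical helper name.
disk_by_serial() {
    pattern=$1
    dir=${2:-/dev/disk/by-id}
    for link in "$dir"/*"$pattern"*; do
        # If nothing matched, the glob stays literal and is skipped here.
        [ -e "$link" ] || [ -L "$link" ] || continue
        case $link in
            *-part[0-9]*) continue ;;
        esac
        printf '%s\n' "${link##*/}"
    done
}

# Usage:
#   disk_by_serial J8KN
```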