Proxmox: replacing a disk in a ZFS pool

For once, this blog post is not about telecom, Cisco/Juniper/Nokia or anything like that.

It is just a reminder of how to replace a faulty device in a ZFS pool.

Here is the state of my pool:

root@pve:~# zpool status -x
root@pve:~#  zpool status
  pool: pve-zfs
 state: DEGRADED
status: One or more devices could not be used because the label is missing or
        invalid.  Sufficient replicas exist for the pool to continue
        functioning in a degraded state.
action: Replace the device using 'zpool replace'.
   see: http://zfsonlinux.org/msg/ZFS-8000-4J
  scan: resilvered 41.9M in 0 days 00:00:11 with 0 errors on Sun Jul 24 13:38:51 2022
config:

        NAME                        STATE     READ WRITE CKSUM
        pve-zfs                     DEGRADED     0     0     0
          mirror-0                  DEGRADED     0     0     0
            wwn-0x50014ee267b78b52  ONLINE       0     0     0
            2534239155907356895     FAULTED      0     0     0  was /dev/sdb1
          mirror-1                  ONLINE       0     0     0
            wwn-0x50014ee267b63342  ONLINE       0     0     0
            wwn-0x50014ee2bd0cf6b4  ONLINE       0     0     0

errors: No known data errors
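
The long number on the FAULTED line is the GUID that ZFS keeps for the missing device, and it is the name to pass to 'zpool replace' later on. As a small sketch, it can be pulled out of the status output with awk. The function name faulted_vdevs is my own and not part of ZFS, and matching on UNAVAIL as well is an assumption about how a missing disk may be reported.

```shell
# Print the name (here: the GUID) of every vdev reported as FAULTED or UNAVAIL.
# Reads `zpool status` output on stdin. faulted_vdevs is a hypothetical helper name.
faulted_vdevs() {
    # Device lines have at least 5 columns: NAME STATE READ WRITE CKSUM.
    # The NF check skips the two-column "state: ..." header line.
    awk 'NF >= 5 && ($2 == "FAULTED" || $2 == "UNAVAIL") { print $1 }'
}

# Usage on a live system:
#   zpool status pve-zfs | faulted_vdevs
```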

But how do you replace this faulty device when every how-to on the net tells you to take the old disk offline first? In my case I had already returned the disk under RMA and had not thought to take the faulty device offline beforehand.

Nevertheless, I replaced my 2TB disk with a new one. But when I ran the replace command:

root@pve:~# zpool replace pve-zfs   2534239155907356895  ata-WDC_WD20EFRX-68EUZN0_WD-WCC4M1EUJ8KN
invalid vdev specification
use '-f' to override the following errors:
/dev/disk/by-id/ata-WDC_WD20EFRX-68EUZN0_WD-WCC4M1EUJ8KN-part1 contains a filesystem of type 'ntfs'

After a quick apt-get install parted, I deleted the NTFS partition and wrote a fresh GPT label:

root@pve:~# parted /dev/sda
GNU Parted 3.2
Using /dev/sda
Welcome to GNU Parted! Type 'help' to view a list of commands.
(parted) print                                                            
Model: ATA WDC WD20EFRX-68E (scsi)
Disk /dev/sda: 2000GB
Sector size (logical/physical): 512B/4096B
Partition Table: msdos
Disk Flags: 

Number  Start   End     Size    Type     File system  Flags
 1      1049kB  2000GB  2000GB  primary  ntfs

(parted) rm 1                                                             
(parted) print                                                            
Model: ATA WDC WD20EFRX-68E (scsi)
Disk /dev/sda: 2000GB
Sector size (logical/physical): 512B/4096B
Partition Table: msdos
Disk Flags: 

Number  Start  End  Size  Type  File system  Flags

(parted) mklabel GPT                                                      
Warning: The existing disk label on /dev/sda will be destroyed and all data on this disk will be lost. Do you want to continue?
Yes/No? Yes                                                               
(parted) q                                                                
Information: You may need to update /etc/fstab.

root@pve:~#
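
The same relabel can probably be done without the interactive prompts, using parted's script mode (-s). This is only a sketch: relabel_gpt and the DRY_RUN variable are names I made up, and by default the function just prints the command, because 'mklabel' destroys the partition table of whatever device you give it.

```shell
# Hypothetical helper: write an empty GPT label on a disk, non-interactively.
# DRY_RUN defaults to 1, so the command is only printed unless DRY_RUN=0 is set.
relabel_gpt() {
    dev=$1
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "parted -s $dev mklabel gpt"
    else
        # -s is parted's script mode: it never asks for confirmation.
        parted -s "$dev" mklabel gpt
    fi
}

# Usage (check the device name twice before setting DRY_RUN=0):
#   relabel_gpt /dev/sda
#   DRY_RUN=0 relabel_gpt /dev/sda
```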

With the disk clean, the replace goes through and the resilver starts:

root@pve:~# zpool replace pve-zfs   2534239155907356895  ata-WDC_WD20EFRX-68EUZN0_WD-WCC4M1EUJ8KN
root@pve:~# zpool status -x
  pool: pve-zfs
 state: DEGRADED
status: One or more devices is currently being resilvered.  The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Sun Jul 24 14:23:11 2022
        10.5G scanned at 716M/s, 4.04G issued at 276M/s, 450G total
        0B resilvered, 0.90% done, 0 days 00:27:37 to go
config:

        NAME                                            STATE     READ WRITE CKSUM
        pve-zfs                                         DEGRADED     0     0     0
          mirror-0                                      DEGRADED     0     0     0
            wwn-0x50014ee267b78b52                      ONLINE       0     0     0
            replacing-1                                 DEGRADED     0     0     0
              2534239155907356895                       FAULTED      0     0     0  was /dev/sdb1
              ata-WDC_WD20EFRX-68EUZN0_WD-WCC4M1EUJ8KN  ONLINE       0     0     0
          mirror-1                                      ONLINE       0     0     0
            wwn-0x50014ee267b63342                      ONLINE       0     0     0
            wwn-0x50014ee2bd0cf6b4                      ONLINE       0     0     0

errors: No known data errors
root@pve:~#
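
To follow the resilver without retyping the status command, a loop like the one below may help. It is a sketch that assumes the status text looks like the output above ("resilver in progress" and "N% done"). The names resilver_percent and wait_resilver are mine.

```shell
# Extract the "N% done" figure from `zpool status` output read on stdin.
resilver_percent() {
    sed -n 's/.* \([0-9.]*\)% done.*/\1/p'
}

# Poll once a minute until the pool no longer reports a resilver in progress.
wait_resilver() {
    pool=$1
    while zpool status "$pool" | grep -q 'resilver in progress'; do
        echo "resilver: $(zpool status "$pool" | resilver_percent)% done"
        sleep 60
    done
    echo "resilver finished on $pool"
}

# Usage:
#   wait_resilver pve-zfs
```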

How I got the new device name:

root@pve:~# ls -l /dev/disk/by-id | grep J8KN
lrwxrwxrwx 1 root root  9 Jul 24 14:23 ata-WDC_WD20EFRX-68EUZN0_WD-WCC4M1EUJ8KN -> ../../sda
lrwxrwxrwx 1 root root 10 Jul 24 14:23 ata-WDC_WD20EFRX-68EUZN0_WD-WCC4M1EUJ8KN-part1 -> ../../sda1
lrwxrwxrwx 1 root root 10 Jul 24 14:23 ata-WDC_WD20EFRX-68EUZN0_WD-WCC4M1EUJ8KN-part9 -> ../../sda9
root@pve:~# 

Here “J8KN” is a fragment of the serial number, which you can read off the new disk.
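
The lookup can be wrapped in a small function that returns only the whole-disk name and skips the -partN links. This is a sketch: disk_by_serial is a hypothetical name, and the optional second argument exists only so the function can be tried against a directory other than /dev/disk/by-id.

```shell
# List by-id names matching a serial-number fragment, without the -partN entries.
# disk_by_serial is a hypothetical helper. The directory defaults to /dev/disk/by-id.
disk_by_serial() {
    frag=$1
    dir=${2:-/dev/disk/by-id}
    ls "$dir" | grep -- "$frag" | grep -v -- '-part[0-9]*$'
}

# Usage:
#   disk_by_serial J8KN
```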
