Proxmox – ZFS – Dead drive on active VM, recover from replicated disk

So you run a Proxmox cluster with 3 nodes.

Your VMs are all replicated using ZFS replication, and they are managed by HA.

Each ZFS volume runs on a single drive, because we don't have too much money and it's a home setup, OR your RAID back end went nuts and you lost a full physical volume.

The issue is that your VM did not migrate to another node through HA because the PVE node itself was not down: only one of the ZFS drives died.

Then I had a major power outage, and a second one. After the automatic reboot I ended up with some VMs on one node, some on another, and this specific VM (my Nextcloud, with a 1TB virtual disk) sitting on the dead physical disk.

At this stage, I have a failed VM on an active node, without a valid disk for the VM, BUT, thanks to the cluster, "zfs list" on the other nodes shows that the replicated copies are still there!
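To double-check that the replicas really are there, something like the sketch below can be run on one of the surviving nodes. It assumes the VM ID is 106 and that the disks follow the usual vm-<vmid>-disk-<n> naming; the filter_vm_disks helper is just a name I made up for the example.

```shell
# Assumption: 106 is the ID of the broken VM, adjust to your own.
VMID=106

# Hypothetical helper: keep only the datasets and snapshots of one VM.
# Assumes Proxmox-style names such as <pool>/vm-<vmid>-disk-<n>.
filter_vm_disks() {
    grep -E "/vm-$1-disk-[0-9]+(@|[[:space:]]|\$)"
}

# Only query ZFS when the zfs command is actually available on this machine.
if command -v zfs >/dev/null 2>&1; then
    zfs list -t all -o name,used,creation | filter_vm_disks "$VMID" \
        || echo "no replica found for VM $VMID on this node"
fi
```

If the volume and at least one replication snapshot show up, that node is a good candidate to take over the VM.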

So here are the steps, which turned out easier and faster than expected:

Move the VM manually to another node of the cluster. This is very simple: SSH into any PVE node and, from a shell, run:

mv /etc/pve/nodes/pve1/qemu-server/106.conf /etc/pve/nodes/pve6/qemu-server/

Once that's done, refresh the web interface: the VM is now on the node that has a valid disk for it! ...but it won't start, because it is still in a failed state at the HA level. Not an issue: on the VM, open the "More" menu, choose "Manage HA", set the state to "disabled" and validate. Then, in the same menu, set it back to "started".

The VM should start again, using the replica of the disk!
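The same HA reset can also be done from a shell with ha-manager. Here is a minimal sketch, assuming the VM is 106 as in the mv command. The run and reset_ha_state helpers are hypothetical names of mine, and by default the script only prints the commands (DRY_RUN=1) so nothing happens by accident.

```shell
# Assumption: the VM ID is 106, as in the mv example.
VMID=106
DRY_RUN=${DRY_RUN:-1}   # 1 = only print the commands, 0 = really run them

# Hypothetical helper: print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Hypothetical helper: same two steps as in the "Manage HA" dialog.
reset_ha_state() {
    run ha-manager set "vm:$1" --state disabled   # takes the service out of the failed state
    run ha-manager set "vm:$1" --state started    # asks HA to start the VM again
}

reset_ha_state "$VMID"
```

Run it with DRY_RUN=0 to really apply the changes. It is probably wise to look at "ha-manager status" between the two steps to see that the first one has been taken into account.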

Now, on the node that had the failed disk: I have SATA drives, so I just hot-unplugged the dead unit and put in a new disk.

Initialize the new disk with GPT; we then need to add it to ZFS. The issue is that the storage already exists. So go to "Datacenter", then "Storage", double-click the relevant ZFS pool, and uncheck the node that had the failed drive.

Validate, then go to that node and add the disk in ZFS, using the same ZFS pool name.
Now you have a valid new ZFS volume.
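I did this part in the web interface, but a rough command-line counterpart could look like the sketch below. Everything specific in it is an assumption: the pool name tank, the disk path, and the ashift value are placeholders, and run and recreate_pool are hypothetical helper names. It only prints the commands unless DRY_RUN=0, because zpool create on the wrong disk is destructive.

```shell
# Assumptions: placeholder pool name and disk path, replace with your own.
POOL=tank
DISK=/dev/disk/by-id/ata-EXAMPLE
DRY_RUN=${DRY_RUN:-1}   # 1 = only print the commands, 0 = really run them

# Hypothetical helper: print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Hypothetical helper: single-disk pool with the same name as on the other
# nodes, so that replication can target it again.
recreate_pool() {
    run zpool create -o ashift=12 "$1" "$2"
    run zpool status "$1"   # check that the pool is ONLINE
}

recreate_pool "$POOL" "$DISK"
```

The pool name has to match the one used on the other nodes, since replication expects the same storage on both sides.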

Make sure you run the replication from the currently running node to this empty volume!!! That way you have the data replicated again.
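For those who prefer the shell, replication can be checked and triggered with pvesr on the node that currently runs the VM. This is only a sketch: the job ID 106-0 is an assumption (look at the output of "pvesr list" for the real one, and check that its target is the rebuilt node), and run and resync_replica are hypothetical helper names. As before, it only prints the commands unless DRY_RUN=0.

```shell
# Assumptions: VM 106 and a replication job called 106-0, check yours with "pvesr list".
VMID=106
JOB_ID=106-0
DRY_RUN=${DRY_RUN:-1}   # 1 = only print the commands, 0 = really run them

# Hypothetical helper: print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Hypothetical helper: show the jobs, trigger one now, then show its state.
resync_replica() {
    run pvesr list                     # existing replication jobs and their targets
    run pvesr schedule-now "$1"        # run this job as soon as possible
    run pvesr status --guest "$2"      # follow the state of the jobs for this VM
}

resync_replica "$JOB_ID" "$VMID"
```

With a 1TB disk the first full sync to the empty pool will take a while, so it is worth checking the status again later.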

I was a bit stressed. Even though I have a filesystem-level backup of my VM content on another machine with BackupPC, it's always much, much easier and quicker to get the existing VM back on track.

I hope this helps you. At least I realized how easy and simple it is to move a VM that is not running from one host to another.

Have a great day, and go to hell, power outages and dead drives! (Yes, I have a UPS, surge protection... still.)


Monday, June 15th, 2020 | proxmox, Technology

2 Comments on Proxmox – ZFS – Dead drive on active VM, recover from replicated disk

  • Pascal says:

    Hi, you saved my life today!
    Same issue, without HA but with scheduled replication.
    Just moving the VM to the healthy node was enough: it connected automatically to the replicated disk, which had not crashed. Thanks!
