Proxmox – ZFS – Dead drive on active VM, recover from replicated disk

So you run a Proxmox cluster with 3 nodes.

Your VMs are all replicated with ZFS replication and managed by HA.

Each ZFS pool sits on a single drive, either because money is tight and it's a home setup, or because your RAID back end went nuts and you lost a full physical volume.

The issue: your VM did not migrate to another node through HA, because the PVE node itself was never down. Only one of its ZFS drives died.

Then I had a major power outage, followed by a second one. After the automatic reboots, I ended up with some VMs on one node, some on another, and this specific VM (my Nextcloud, with a 1TB virtual disk) stuck on the dead physical disk.

At this stage, I have a failed VM on an active node, with no valid disk behind it. BUT, thanks to the cluster, "zfs list" on the other nodes shows that the replicated copies are still there!
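To check this quickly on each surviving node, something like the following can help. It is only a sketch: `vm_datasets` is a hypothetical helper name, and it assumes the default Proxmox naming for VM disks on ZFS (`vm-<vmid>-disk-<n>`).

```shell
# Hypothetical helper: keep only the dataset names that belong to one VMID.
# Reads dataset names on stdin, one per line.
vm_datasets() {
    vmid="$1"
    grep -E "/vm-${vmid}-disk-[0-9]+$"
}

# On a real node, feed it the volume list (106 is the VM from this post):
#   zfs list -H -o name -t volume | vm_datasets 106
```

If the command prints nothing on a node, that node has no replica of the VM disk, so it is not a candidate for the next step.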

So here are the steps, which turned out to be easier and faster than expected:

Move the VM manually to another node of the cluster. This is very simple: SSH into any PVE node and run:

mv /etc/pve/nodes/pve1/qemu-server/106.conf /etc/pve/nodes/pve6/qemu-server/
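If you want a slightly more defensive version of that move, here is a minimal sketch. The function name `move_vm_conf` and the `PVE_ROOT` variable are my own inventions for illustration (the variable only exists so the function can be tried against a scratch directory); the paths follow the layout used in the command above.

```shell
# Move a VM config between node directories, refusing to overwrite anything.
# PVE_ROOT defaults to /etc/pve; override it only to rehearse on a scratch dir.
move_vm_conf() {
    vmid="$1"; from="$2"; to="$3"
    root="${PVE_ROOT:-/etc/pve}"
    src="$root/nodes/$from/qemu-server/$vmid.conf"
    dst_dir="$root/nodes/$to/qemu-server"

    if [ ! -f "$src" ]; then
        echo "no such config: $src" >&2
        return 1
    fi
    if [ ! -d "$dst_dir" ]; then
        echo "no such node directory: $dst_dir" >&2
        return 1
    fi
    if [ -e "$dst_dir/$vmid.conf" ]; then
        echo "target already has $vmid.conf" >&2
        return 1
    fi
    mv "$src" "$dst_dir/"
}

# Usage, with the names from this post:
#   move_vm_conf 106 pve1 pve6
```

Only do this for a VM that is not running, as was the case here.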

Once that's done, refresh the web interface: the VM now shows up on the node that has a valid disk for it! ...but it won't start, because HA still has it in a failed state. Not an issue: on the VM, open the "More" menu, choose "Manage HA", set the state to "disabled" and validate. Then, in the same menu, set it back to "started".

The VM should start again, using the replica of the disk!
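The same HA reset can be done from a shell with `ha-manager`. This is a sketch, not what I actually typed that day: `ha_reset` and the `HA_MANAGER` override are hypothetical names, the override being there only so you can dry-run the function with `echo`.

```shell
# Clear a failed HA state by disabling the resource, then request a start.
# HA_MANAGER defaults to the real ha-manager; set it to "echo" for a dry run.
ha_reset() {
    vmid="$1"
    ha="${HA_MANAGER:-ha-manager}"
    "$ha" set "vm:$vmid" --state disabled
    "$ha" set "vm:$vmid" --state started
}

# Usage on a PVE node:
#   ha_reset 106
#   ha-manager status    # check that the resource is no longer in error
```

The HA stack processes state changes asynchronously, so it may be worth checking `ha-manager status` between the two steps rather than firing them back to back.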

Now, for the node that had the failed disk: I have SATA drives, so I just hot-unplugged the dead unit and put in a new disk.

Initialize the new disk with GPT; we then need to add it to ZFS. The issue is that the storage already exists under that name. So go to "Datacenter", then "Storage", double-click the relevant ZFS pool, and uncheck the node that had the failed drive.
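For reference, the node restriction on a storage can also be set with `pvesm set ... --nodes`. The sketch below only prints the command so you can review it first; `storage_nodes_cmd` is a hypothetical helper, and both the storage ID `zfs-vm` and the node name `pve7` are made-up examples (I only named pve1 and pve6 in this post).

```shell
# Build (but do not run) the pvesm command that limits a storage to some nodes.
storage_nodes_cmd() {
    storage="$1"; shift
    # Join the remaining arguments with commas: pve6 pve7 -> pve6,pve7
    nodes=$(IFS=,; echo "$*")
    echo "pvesm set $storage --nodes $nodes"
}

# Example: keep the storage on every node except the one with the dead drive
#   storage_nodes_cmd zfs-vm pve6 pve7
```

Printing rather than executing is deliberate: this changes cluster-wide storage configuration, so it is worth reading the command before running it.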

Validate, then on that node, create the ZFS pool on the new disk, using the same ZFS pool name.
You now have a valid, empty ZFS pool.
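I did this part in the web interface, but a rough command-line equivalent might look like the sketch below. Everything specific in it is an assumption: the helper name `rebuild_pool_plan`, the pool name `tank`, the device path, and `ashift=12` (a common choice for current drives). It prints the commands instead of running them, because they wipe whatever disk you point them at.

```shell
# Print the commands to recreate a single-disk pool on a replacement drive.
# Nothing is executed here; review the output, then run it by hand.
rebuild_pool_plan() {
    pool="$1"; disk="$2"
    echo "sgdisk --zap-all $disk"                    # wipe old partition tables
    echo "zpool create -o ashift=12 $pool $disk"     # same pool name as before
    echo "zpool status $pool"                        # confirm the pool is ONLINE
}

# Example (replace with your pool name and the new disk's real path):
#   rebuild_pool_plan tank /dev/sdX
```

Double-check the device path against the output of `lsblk` before running anything: on a node with several identical drives it is easy to pick the wrong one.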

Make sure you run the replication from the node currently running the VM to this empty pool!!! That way your data is replicated again.
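Rather than waiting for the next scheduled run, you can ask for a replication job to run as soon as possible with `pvesr schedule-now`. A minimal sketch, assuming the job ID follows the usual `<vmid>-<n>` form (for example `106-0`); `replicate_now` and the `PVESR` override are hypothetical names, the override existing only for dry runs.

```shell
# Trigger a replication job now, then show the replication state of its guest.
# PVESR defaults to the real pvesr; set it to "echo" for a dry run.
replicate_now() {
    jobid="$1"
    pvesr="${PVESR:-pvesr}"
    if ! printf '%s\n' "$jobid" | grep -Eq '^[0-9]+-[0-9]+$'; then
        echo "job ID should look like 106-0, got: $jobid" >&2
        return 1
    fi
    "$pvesr" schedule-now "$jobid"
    # Everything before the first dash is the VMID
    "$pvesr" status --guest "${jobid%%-*}"
}

# Usage, on the node currently running the VM:
#   pvesr list            # find the job ID and check its target node
#   replicate_now 106-0
```

Since the VM config was moved by hand, check in `pvesr list` that the job's target really is the rebuilt node before triggering it. With a 1TB disk, the first run is a full copy and will take a while.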

I was a bit stressed. Even though I have a filesystem-level backup of my VM content on another machine with BackupPC, it's always much, much easier and quicker to get the existing VM back on track.

I hope this helps. At the very least, I realized how easy it is to move a VM that is not running from one host to another.

Have a great day, and go to hell, power outages and dead drives! (Yes, I have a UPS and surge protection... still.)


Monday, June 15th, 2020 proxmox, Technology
