Hi, it was mostly designed as a backup tool. It could serve as a replication tool too if this were implemented: currently the backup data is stored in a special format that does not allow booting a virtual machine directly from it. If it supported storing the backup data as qcow images, then yes, it could be used as a "replication tool". At the time I started the implementation I wanted a streamable format for the backup data (which is not quite possible with qcow images).
Besides that, I think the QEMU stack has another implementation for replicating virtual machines, called COLO: https://wiki.qemu.org/Features/COLO Not sure if this can be set up via libvirt.
FWIW, using the dirty bitmap mechanism to create block-level "backups" currently also requires a filesystem freeze/thaw in the virtual machine, so that is no different from a storage-level approach to replicating the data.
I have already had thoughts about designing a replication tool around the dirty bitmap feature, but I think this would then be a completely new implementation, probably in another language (golang).
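To illustrate the dirty bitmap idea mentioned above, here is a toy model in Python. It is only a sketch of the concept (track which blocks were written, copy just those on the next sync); it is not how QEMU implements bitmaps, and `TrackedDisk`, `sync_incremental` and the tiny `BLOCK_SIZE` are made-up names and values for illustration.

```python
from typing import List, Set

BLOCK_SIZE = 4  # bytes per block; tiny on purpose, real granularity is far larger


class TrackedDisk:
    """Toy disk that remembers which blocks were written since the last sync."""

    def __init__(self, size: int) -> None:
        self.data = bytearray(size)
        self.dirty: Set[int] = set()  # stands in for the dirty bitmap

    def write(self, offset: int, payload: bytes) -> None:
        if not payload:
            return
        if offset < 0 or offset + len(payload) > len(self.data):
            raise ValueError("write outside of disk")
        self.data[offset:offset + len(payload)] = payload
        first = offset // BLOCK_SIZE
        last = (offset + len(payload) - 1) // BLOCK_SIZE
        self.dirty.update(range(first, last + 1))


def sync_incremental(source: TrackedDisk, target: bytearray) -> List[int]:
    """Copy only the dirty blocks to the target, then clear the bitmap."""
    if len(target) != len(source.data):
        raise ValueError("target must have the same size as the source")
    copied = sorted(source.dirty)
    for block in copied:
        start = block * BLOCK_SIZE
        target[start:start + BLOCK_SIZE] = source.data[start:start + BLOCK_SIZE]
    source.dirty.clear()
    return copied
```

A write that spans two blocks marks both as dirty, and a second sync with no writes in between copies nothing.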
-
I've moved this discussion to the (maybe) future repository. I'm learning as we go, so I've set up dockerized builds for all kinds of distributions now. The latest releases are available at: https://grinser.de/~abi/vmsync/releases/ The Rocky Linux builds should be binary compatible with the corresponding RHEL releases.
-
I see... Yet I am not really willing to overcomplicate my setups at this point (I already have bridges + openvswitch instances, so adding docker on top of this would require me to make firewall rules and various network exceptions). I decided to add another EL10 host, so I could just "play" without any further setup requirements. I'm connecting from hyper01p to hyper02p.
So far my ssh setup seems to be working without any major issue (hosts files are filled with IPs, I allowed root ssh connections for this setup, and the authorized_keys files and ssh host config are done):
When I try to use the same settings with vmsync I cannot get a connection yet:
I've removed my
Any chance that the Go lib for ssh doesn't like that key format?
-
I've also grabbed the Rocky Linux 9.3 release to use on my AlmaLinux 9.7 to AlmaLinux 10.1 test setup. I've tried to run the replica as below:
My primary thought is that
-
-output-dir is something that I've used for testing locally. It's of no use currently and I think I'll remove it.
-
Initial copy result:
Second run result:
Perhaps the only thing I would request is to change the wording
-
Next tests: I noticed that the VMs need to be running to be able to replicate. Of course I do understand that an MVP does not include "side quest" options, but it would be a nice feature for the future ;)
I've finally replicated a VM over a WAN link:
Second run:
So far so good, everything works ;)
-
My assumption that you would send the NBD output via ssh was wrong. I only made it because I didn't need to open a firewall port, which was normal since I had configured my firewall to accept any traffic between both hosts. I'm currently investigating two things:
Will report back once I get some answers.
-
While continuing tests, I think I found a blocker.
On the source system, the disk is 20G:
Launching incremental copy
On the target system:
Note that the source disk is 20G whereas the target disk is roughly the size of the incremental send.
I decided to run it all over again and start on a fresh replica:
On source:
Still on source:
On target, after the initial vmsync run:
Since the target file is not an exact copy, checksumming won't do, but I somehow guess that having 20G of data on both sides is "good enough(TM)" for my test. I could probably just fire up the machine to make sure.
Now when I run a second vmsync command on the source (same invocation):
On target:
Again, I lost the parent backing checkpoint. Is there something I missed?
Some context:
Source
Destination
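For comparing the two sides without checksumming, one option is to look at the image metadata. The sketch below parses the output of `qemu-img info --output=json` and compares the virtual size and backing file entry. The helper names and the sample values in the usage note are hypothetical; only the `virtual-size` and `backing-filename` keys come from qemu-img's JSON output. This is not what vmsync does internally.

```python
import json
from typing import Optional, Tuple


def summarize_image(info_json: str) -> Tuple[int, Optional[str]]:
    """Return (virtual size in bytes, backing file or None) from qemu-img JSON."""
    info = json.loads(info_json)
    return info["virtual-size"], info.get("backing-filename")


def same_virtual_size(source_json: str, target_json: str) -> bool:
    """True if both images expose the same virtual disk size to the guest."""
    source_size, _ = summarize_image(source_json)
    target_size, _ = summarize_image(target_json)
    return source_size == target_size
```

Feeding it the JSON captured on the source and on the target would show quickly whether the target still presents a 20G disk and whether a backing file is recorded, which is less than a checksum but more than comparing file sizes on disk.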
-
Hm... I had the impression it's possible to incrementally change a qcow image via NBD, but it seems I'm wrong. I think what needs to be done is to create a new, temporary qcow image on the target with a backing image pointing to the already existing one and then rebase the contents, as described in the QEMU documentation: https://qemu-project.gitlab.io/qemu/interop/bitmaps.html#example-second-incremental-backup
-
Yeah, apparently a temporary image that points to the base image via the backing file option, followed by committing the changes, is required. So my test was:
I've pushed a new release.
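For readers following along, the overlay-then-commit approach described above can be sketched with plain `qemu-img` calls. This is not the test from the comment and not vmsync's code; `overlay_commands` and the file names in the usage note are hypothetical, and the step that writes the incremental data into the overlay is left out.

```python
from typing import List


def overlay_commands(base: str, overlay: str) -> List[List[str]]:
    """Build qemu-img calls: create an overlay on top of base, later commit it."""
    return [
        # Temporary qcow2 image whose backing file is the existing replica.
        ["qemu-img", "create", "-f", "qcow2", "-b", base, "-F", "qcow2", overlay],
        # (The changed blocks would be written into the overlay at this point.)
        # Merge the overlay's contents back into its backing file.
        ["qemu-img", "commit", overlay],
    ]
```

Calling `overlay_commands("replica.qcow2", "inc.qcow2")` only returns the argument lists; passing each one to `subprocess.run(cmd, check=True)` would execute them on a host that has qemu-img installed.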
-
Currently making some tests.
I'd say it depends on what level of "correctness" you want to achieve. At least an mtime check is fairly easy to implement and covers 99% of all problems, the same way as rsync does when not running with
I totally see your point of course. It's just that in disaster recovery, being able to roll back to n-X snapshots/checkpoints is a very common and useful option. When one replicates, let's say, every 15 minutes and a disaster happened 2 hours ago or so, that's where it's a real plus to have those checkpoints/snapshots ready, without the actual need to restore from a backup.
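To put numbers on the rollback scenario: with a replication every 15 minutes and a disaster that started roughly 2 hours ago, the last usable checkpoint sits about 8 checkpoints back. A small sketch of that selection, with made-up helper names and timestamps:

```python
from datetime import datetime, timedelta
from typing import List, Optional


def checkpoint_times(start: datetime, interval: timedelta, count: int) -> List[datetime]:
    """Timestamps of `count` checkpoints taken every `interval` from `start`."""
    return [start + i * interval for i in range(count)]


def last_good_checkpoint(checkpoints: List[datetime], disaster: datetime) -> Optional[datetime]:
    """Newest checkpoint taken strictly before the disaster, if any."""
    candidates = [c for c in checkpoints if c < disaster]
    return max(candidates) if candidates else None
```

With checkpoints from 08:00 to 12:00 and a disaster at 10:05, the function returns the 10:00 checkpoint, and the eight checkpoints taken after that are the ones to discard.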
Cool, didn't properly read all the options ;)
As for the security concerns, I've set up a VPN between some test servers to secure transfers.
As for the actual tests: I have two hosts, A and B. Both hosts are set up so I can ssh from A into B as root and from B into A, using the public IPs or the VPN IPs, with the same ssh key.
Using public IP
Using VPN IP
I can use vmsync via the public IPs
Whereas the same command via the VPN asks me for the password.
On the hyper01p side, there's only one line I'm getting in the sshd logs when I use vmsync to connect via VPN:
I have checked this multiple times, and have no idea why ssh would work via my VPN but vmsync wouldn't. Is there any chance that the ssh implementation can't properly detect the TCP MSS?
[EDIT] I could probably use TCP MSS clamping, which would probably resolve the issue, but that's rather a big decision on servers, and one that shouldn't be taken just for one program.
-
The qemu URI is not part of the regular ssh communication but of the libvirt layer. I guess virsh -c qemu+ssh://10.11.12.13/system may ask for the password, too.
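Since the libvirt connection is its own layer, its authentication is driven by the URI rather than by whatever the other ssh connection uses. A minimal sketch of building such a URI follows. `libvirt_ssh_uri` is a hypothetical helper; `keyfile` is a parameter documented for libvirt's remote URIs, but whether vmsync passes anything like it is an assumption on my side.

```python
from typing import Optional
from urllib.parse import urlencode


def libvirt_ssh_uri(host: str, user: Optional[str] = None,
                    keyfile: Optional[str] = None) -> str:
    """Build a qemu+ssh URI for the system instance on a remote host."""
    authority = f"{user}@{host}" if user else host
    uri = f"qemu+ssh://{authority}/system"
    if keyfile:
        # Keep slashes readable in the query string.
        uri += "?" + urlencode({"keyfile": keyfile}, safe="/")
    return uri
```

Trying the resulting URI with `virsh -c` is a quick way to see whether the libvirt side authenticates with the key or falls back to a password prompt.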
-
Thanks for the insight, that helped me diagnose the issue. Perhaps, in order to be consistent, you should remove
This way, both implementations would use the same authentication method, as defined by the system.
Apart from the ssh stuff, I've logged into the remote host to check:
Interestingly enough, I've killed the process with
Anyway, I'm back at making vmsync experiments now ;)
-
Thanks for the version number in the logs as well as the NBD server fix.
-
So far a replica failed. I've decided to discard both replicas and start over again with vmsync 0.19. I'll report back.
-
For testing I've been using something like this.
So far I've not seen any issues.
-
First of all, sorry if this isn't the right place to ask this question, and sorry if the question is quite broad.
I'm designing various KVM/qemu disaster recovery solutions, mostly based upon storage replication (zfs send/receive, lvm thin send/receive, drbd...).
So far, all of the solutions are at block level and are quiescing agnostic (at least they must trigger an intermediary freeze/thaw script for said VMs). I tend to use them because of the low RPO/RTO, which is difficult to achieve with backups.
Reading the technical design of virtnbdbackup, I noticed that it has almost everything needed to make a top-notch disaster recovery solution, e.g. differential backup with changed block tracking, including fstrim support and quiescing support out of the box, plus "multi" snapshots (i.e. backing chains, if I understood well).
Now I am wondering how much of a gap there would be to use virtnbdbackup as a disaster recovery solution with a low RTO/RPO, i.e. replicating changes every 5 minutes, and being able to start a VM directly from backup while copying a flattened version of said VM storage in the background, then pivoting once the copy is complete.
Is that a use case virtnbdbackup can handle?
Thank you for any insight.