| 
- .. include:: ../global.rst.inc
 
- .. highlight:: none
 
- Backing up entire disk images
 
- =============================
 
- Backing up disk images can still be efficient with Borg because its `deduplication`_
 
- technique makes sure only the modified parts of the file are stored. Borg also has
 
- optional, simple sparse file support for ``borg extract`` (``--sparse``).
 
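- It is essential to identify the disk you want to back up unambiguously.
 
- Kernel device names such as ``/dev/nvme1n1`` are not guaranteed to stay the same across
 
- reboots, so look the disk up by its serial number: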
 
- .. code-block:: bash
 
-     # You can find the short disk serial by:
 
-     # udevadm info --query=property --name=nvme1n1 | grep ID_SERIAL_SHORT | cut -d '=' -f 2
 
-     DISK_SERIAL="7VS0224F"
 
-     DISK_ID=$(readlink -f /dev/disk/by-id/*"${DISK_SERIAL}") # Returns /dev/nvme1n1
 
-     mapfile -t PARTITIONS < <(lsblk -o NAME,TYPE -p -n -l "$DISK_ID" | awk '$2 == "part" {print $1}')
 
-     echo "Partitions of $DISK_ID:"
 
-     echo "${PARTITIONS[@]}"
 
-     echo "Disk Identifier: $DISK_ID"
 
-     # Use the following line to perform a borg backup for the full disk:
 
-     # borg create --read-special /path/to/repo::{now} "$DISK_ID"
 
-     # Use the following to perform a borg backup for all partitions of the disk
 
-     # borg create --read-special /path/to/repo::{now} "${PARTITIONS[@]}"
 
-     # Example output:
 
-     # Partitions of /dev/nvme1n1:
 
-     # /dev/nvme1n1p1
 
-     # /dev/nvme1n1p2
 
-     # /dev/nvme1n1p3
 
-     # Disk Identifier: /dev/nvme1n1
 
-     # borg create --read-special /path/to/repo::{now} /dev/nvme1n1
 
-     # borg create --read-special /path/to/repo::{now} /dev/nvme1n1p1 /dev/nvme1n1p2 /dev/nvme1n1p3
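
The glob in the snippet above silently assumes that the serial matches exactly one disk. The following is a minimal sketch of a stricter lookup, not part of Borg: ``resolve_disk_by_serial`` and the ``BY_ID_DIR`` override are hypothetical names introduced here, and the override exists only so the function can be tried against a scratch directory instead of real devices.

```shell
# Hypothetical helper: resolve a serial number to exactly one device node.
# BY_ID_DIR is an assumption of this sketch; it defaults to /dev/disk/by-id.
resolve_disk_by_serial() {
    serial=$1
    by_id_dir=${BY_ID_DIR:-/dev/disk/by-id}
    # Several by-id links may point at the same disk, so de-duplicate the targets.
    targets=$(
        for link in "$by_id_dir"/*"$serial"; do
            [ -e "$link" ] && readlink -f "$link"
        done | sort -u
    )
    # Count non-empty lines; an unmatched glob yields zero targets.
    count=$(printf '%s\n' "$targets" | grep -c .)
    if [ "$count" -ne 1 ]; then
        echo "expected exactly one device for serial '$serial', found $count" >&2
        return 1
    fi
    printf '%s\n' "$targets"
}

# Usage (commented out because it needs a real disk with this serial):
# DISK_ID=$(resolve_disk_by_serial "7VS0224F") || exit 1
```

Failing early when zero or several devices match avoids passing an empty or multi-line value to ``borg create``.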
 
- Decreasing the size of image backups
 
- ------------------------------------
 
- Disk images are as large as the full disk when uncompressed and might not get much
 
- smaller after deduplication once the disk has seen heavy use: virtually all file systems
 
- do not erase file data on disk when a file is deleted, they only remove the filesystem
 
- entries referencing the data. Therefore, if a disk fills up and files are then deleted,
 
- the image will barely shrink when compressed and deduplicated. Depending on the
 
- filesystem, there are several ways to decrease the size of a disk image.
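
To get a feeling for why zeroed free space matters, the sketch below compares how well 1 MiB of zeros and 1 MiB of random bytes compress. It rests on two assumptions made purely for illustration: random data stands in for leftover contents of deleted files, and ``gzip`` stands in for whichever compression the repository actually uses. ``compressed_size`` is a hypothetical helper name.

```shell
# Print the gzip-compressed size (in bytes) of the first 1 MiB read from $1.
compressed_size() {
    head -c 1048576 "$1" | gzip -c | wc -c | tr -d ' '
}

# Zeros model free space that has been cleared; random bytes model stale data
# that the filesystem left behind after files were deleted (an assumption).
zero_size=$(compressed_size /dev/zero)
random_size=$(compressed_size /dev/urandom)

echo "zeros:  $zero_size bytes after gzip"
echo "random: $random_size bytes after gzip"
```

The zeroed block shrinks to a tiny fraction of its size while the random block does not shrink at all, which is the effect the tools in the following sections exploit.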
 
- Using ntfsclone (NTFS, i.e. Windows VMs)
 
- ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
- ``ntfsclone`` can only operate on filesystems with the journal cleared (i.e. turned-off
 
- machines), which somewhat limits its utility in the case of VM snapshots. However, when
 
- it can be used, its special image format is even more efficient than just zeroing and
 
- deduplicating. For backup, save the disk header and the contents of each partition::
 
-     HEADER_SIZE=$(sfdisk -lo Start $DISK | grep -A1 -P 'Start$' | tail -n1 | xargs echo)
 
-     PARTITIONS=$(sfdisk -lo Device,Type $DISK | sed -e '1,/Device\s*Type/d')
 
-     dd if=$DISK count=$HEADER_SIZE | borg create repo::hostname-partinfo -
 
-     echo "$PARTITIONS" | grep NTFS | cut -d' ' -f1 | while read x; do
 
-         PARTNUM=$(echo $x | grep -Eo "[0-9]+$")
 
-         ntfsclone -so - $x | borg create repo::hostname-part$PARTNUM -
 
-     done
 
-     # to back up non-NTFS partitions as well:
 
-     echo "$PARTITIONS" | grep -v NTFS | cut -d' ' -f1 | while read x; do
 
-         PARTNUM=$(echo $x | grep -Eo "[0-9]+$")
 
-         borg create --read-special repo::hostname-part$PARTNUM $x
 
-     done
 
- Restoration is a similar process::
 
-     borg extract --stdout repo::hostname-partinfo | dd of=$DISK && partprobe
 
-     PARTITIONS=$(sfdisk -lo Device,Type $DISK | sed -e '1,/Device\s*Type/d')
 
-     borg list --format {archive}{NL} repo | grep 'part[0-9]*$' | while read x; do
 
-         PARTNUM=$(echo $x | grep -Eo "[0-9]+$")
 
-         # ${DISK} needs braces: "$DISKp" would expand the unset variable DISKp
 
-         PARTITION=$(echo "$PARTITIONS" | grep -E "^${DISK}p?${PARTNUM}([[:space:]]|$)" | head -n1)
 
-         if echo "$PARTITION" | cut -d' ' -f2- | grep -q NTFS; then
 
-             borg extract --stdout repo::$x | ntfsclone -rO $(echo "$PARTITION" | cut -d' ' -f1) -
 
-         else
 
-             borg extract --stdout repo::$x | dd of=$(echo "$PARTITION" | cut -d' ' -f1)
 
-         fi
 
-     done
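
The snippets above map between archive names, partition numbers and device nodes with ``grep``. As a sketch of the same mapping using only shell built-ins, the two hypothetical helpers below (``partition_number`` and ``partition_device`` are names introduced here) follow the Linux naming convention in which disks whose name ends in a digit, such as ``/dev/nvme1n1`` or ``/dev/loop0``, get a ``p`` before the partition number.

```shell
# Print the trailing number of a device or archive name,
# e.g. /dev/nvme1n1p3 -> 3, hostname-part12 -> 12.
partition_number() {
    # Remove the longest prefix that ends in a non-digit character.
    printf '%s\n' "${1##*[!0-9]}"
}

# Print the device node for partition $2 of disk $1.
partition_device() {
    disk=$1 partnum=$2
    case $disk in
        *[0-9]) printf '%sp%s\n' "$disk" "$partnum" ;;  # nvme, loop, mmcblk
        *)      printf '%s%s\n' "$disk" "$partnum" ;;   # sda, vdb, ...
    esac
}

partition_number hostname-part12
partition_device /dev/nvme1n1 3
partition_device /dev/sda 2
```

Building the device name directly avoids regular-expression matches that could pick up a longer partition number by accident.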
 
- .. note::
 
-    When backing up a disk image (as opposed to a real block device), mount it as
 
-    a loopback image to use the above snippets::
 
-        DISK=$(losetup -Pf --show /path/to/disk/image)
 
-        # do backup as shown above
 
-        losetup -d $DISK
 
- Using zerofree (ext2, ext3, ext4)
 
- ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
- ``zerofree`` works similarly to ntfsclone in that it zeros out unused chunks of the FS,
 
- except it works in place, zeroing the original partition (which must be unmounted or
 
- mounted read-only while ``zerofree`` runs). This makes the backup process a bit simpler::
 
-     sfdisk -lo Device,Type $DISK | sed -e '1,/Device\s*Type/d' | grep Linux | cut -d' ' -f1 | xargs -n1 zerofree
 
-     borg create --read-special repo::hostname-disk $DISK
 
- Because the partitions were zeroed in place, restoration is only one command::
 
-     borg extract --stdout repo::hostname-disk | dd of=$DISK
 
- .. note:: The "traditional" way to zero out space on a partition, especially one already
 
-           mounted, is to simply ``dd`` from ``/dev/zero`` to a temporary file and delete
 
-           it. This is ill-advised for the reasons mentioned in the ``zerofree`` man page:
 
-           - it is slow
 
-           - it makes the disk image (temporarily) grow to its maximal extent
 
-           - it (temporarily) uses all free space on the disk, so other concurrent write actions may fail.
 
- Virtual machines
 
- ----------------
 
- If you use non-snapshotting backup tools like Borg to back up virtual machines, then
 
- the VMs should be turned off for the duration of the backup. Backing up live VMs can
 
- (and will) result in corrupted or inconsistent backup contents: a VM image is just a
 
- regular file to Borg with the same issues as regular files when it comes to concurrent
 
- reading and writing from the same file.
 
- To back up live VMs, use filesystem snapshots on the VM host, which establish
 
- crash-consistency for the VM images. This means that with most file systems (those
 
- that are journaling) the FS will always be fine in the backup (but may need a journal
 
- replay to become accessible).
 
- Usually this does not mean that file *contents* on the VM are consistent, since file
 
- contents are normally not journaled. Notable exceptions are ext4 in data=journal mode,
 
- ZFS and btrfs (unless nodatacow is used).
 
- Applications designed with crash-consistency in mind (most relational databases like
 
- PostgreSQL, SQLite etc. but also for example Borg repositories) should always be able
 
- to recover to a consistent state from a backup created with crash-consistent snapshots
 
- (even on ext4 with data=writeback or XFS). Other applications may require a lot of work
 
- to reach application-consistency; it's a broad and complex issue that cannot be explained
 
- in its entirety here.
 
- Hypervisor snapshots capturing most of the VM's state can also be used for backups and
 
- can be a better alternative to pure file system based snapshots of the VM's disk, since
 
- no state is lost. Depending on the application this can be the easiest and most reliable
 
- way to create application-consistent backups.
 
- Borg doesn't intend to address these issues due to their huge complexity and
 
- platform/software dependency. Combining Borg with the mechanisms provided by the platform
 
- (snapshots, hypervisor features) will be the best approach to start tackling them.
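
As one possible illustration of combining Borg with a platform snapshot mechanism, the sketch below prints (but does not run) the commands for a single backup cycle on a host that keeps VM disks on LVM. Everything specific in it is an assumption: the volume group ``vg0``, the logical volume ``vmstore``, the 5G snapshot size, the snapshot name and the repository path are placeholders, and ``snapshot_backup_plan`` is a hypothetical helper. Hosts using btrfs or ZFS would use their own snapshot commands instead.

```shell
# Print the commands for one snapshot-based backup cycle (dry run only).
# Arguments: volume group, logical volume, Borg repository path.
snapshot_backup_plan() {
    vg=$1 lv=$2 repo=$3
    snap="${lv}-borg-snap"
    # 1. Take a point-in-time (crash-consistent) snapshot of the volume.
    echo "lvcreate --snapshot --size 5G --name $snap $vg/$lv"
    # 2. Read the frozen snapshot device instead of the live volume.
    echo "borg create --read-special $repo::$lv-{now} /dev/$vg/$snap"
    # 3. Drop the snapshot once the archive has been written.
    echo "lvremove --yes $vg/$snap"
}

snapshot_backup_plan vg0 vmstore /path/to/repo
```

Reviewing the printed commands before running them as root is deliberate here, since both the snapshot size and the naming scheme depend on the host.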
 
 
  |