update docs about separated compaction

Thomas Waldmann 6 years ago
Parent
Commit
e6fcf4ea42

+ 6 - 4
docs/internals/data-structures.rst

@@ -158,9 +158,11 @@ such obsolete entries is called sparse, while a segment containing no such entri
 Since writing a ``DELETE`` tag does not actually delete any data and
 thus does not free disk space any log-based data store will need a
 compaction strategy (somewhat analogous to a garbage collector).
+
 Borg uses a simple forward compacting algorithm,
 which avoids modifying existing segments.
-Compaction runs when a commit is issued (unless the :ref:`append_only_mode` is active).
+Compaction runs when a commit is issued with the ``compact=True`` parameter, e.g.
+by the ``borg compact`` command (unless the :ref:`append_only_mode` is active).
 One client transaction can manifest as multiple physical transactions,
 since compaction is transacted, too, and Borg does not distinguish between the two::
 
@@ -197,9 +199,9 @@ The 1.1.x series writes version 2 of the format and reads either version.
 When reading a version 1 hints file, Borg 1.1.x will
 read all sparse segments to determine their sparsity.
 
-This process may take some time if a repository is kept in the append-only mode,
-which causes the number of sparse segments to grow. Repositories not in append-only
-mode have no sparse segments in 1.0.x, since compaction is unconditional.
+This process may take some time if a repository has been kept in append-only mode
+or ``borg compact`` has not been run for a long time, both of which cause
+the number of sparse segments to grow.
 
 Compaction processes sparse segments from oldest to newest; sparse segments
 which don't contain enough deleted data to justify compaction are skipped. This
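
Reviewer note: the forward-compaction idea described in this hunk can be illustrated with a toy model. The sketch below is a minimal, hypothetical Python example and not Borg's repository code: the class ``ToyLogStore``, its ``segment_size`` and the ``threshold`` parameter are assumptions made for illustration. It shows that a ``DELETE`` tag only appends to the log, that compaction walks segments from oldest to newest, skips segments with too little obsolete data, copies live entries forward into fresh segments, and only then drops the old segment. As a simplification, the toy keeps its index in memory and never replays the log, so it discards ``DELETE`` tags freely; a real log-based store has to be more careful about that.

```python
from dataclasses import dataclass, field
from typing import Optional

PUT, DELETE = "PUT", "DELETE"


@dataclass
class ToyLogStore:
    """Toy log-based key/value store (hypothetical, for illustration only)."""
    segment_size: int = 4                          # entries per segment (assumption)
    segments: dict = field(default_factory=dict)   # segment id -> list of (tag, key, value)
    index: dict = field(default_factory=dict)      # key -> (segment id, offset) of the live PUT
    next_id: int = 0
    current: Optional[int] = None                  # id of the segment open for appending

    def _append(self, entry):
        # Open a fresh segment if none is open or the open one is full.
        if self.current is None or len(self.segments[self.current]) >= self.segment_size:
            self.current = self.next_id
            self.segments[self.current] = []
            self.next_id += 1
        seg = self.segments[self.current]
        seg.append(entry)
        return self.current, len(seg) - 1

    def put(self, key, value):
        self.index[key] = self._append((PUT, key, value))

    def delete(self, key):
        # Deleting only appends a tag; the old PUT entry stays on "disk".
        del self.index[key]
        self._append((DELETE, key, None))

    def get(self, key):
        seg_id, offset = self.index[key]
        return self.segments[seg_id][offset][2]

    def entry_count(self):
        return sum(len(seg) for seg in self.segments.values())

    def _live(self, seg_id):
        # A PUT is live only if the index still points at exactly this entry.
        return [entry for offset, entry in enumerate(self.segments[seg_id])
                if entry[0] == PUT and self.index.get(entry[1]) == (seg_id, offset)]

    def compact(self, threshold=0.5):
        # Never modify existing segments: force all copies into fresh ones.
        self.current = None
        for seg_id in sorted(self.segments):       # oldest to newest (snapshot of ids)
            entries = self.segments[seg_id]
            live = self._live(seg_id)
            obsolete_fraction = 1 - len(live) / len(entries)
            if obsolete_fraction < threshold:
                continue                           # not sparse enough, skip
            for _tag, key, value in live:
                self.put(key, value)               # copy forward, index is updated
            del self.segments[seg_id]              # only now is space freed


store = ToyLogStore(segment_size=4)
for i in range(8):
    store.put(f"k{i}", i)
for i in range(3):
    store.delete(f"k{i}")
print("entries before compaction:", store.entry_count())
store.compact()
print("entries after compaction:", store.entry_count())
print("remaining segments:", sorted(store.segments))
```

In this toy run the three deletes make the log longer rather than shorter, which mirrors the point of the documentation change: space is only reclaimed once compaction runs.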

+ 14 - 5
docs/quickstart.rst

@@ -59,7 +59,7 @@ Also helpful:
 - if you use LVM: use a LV + a filesystem that you can resize later and have
   some unallocated PEs you can add to the LV.
 - consider using quotas
-- use `prune` regularly
+- use `prune` and `compact` regularly
 
 .. [1] This failsafe can fail in these circumstances:
 
@@ -105,8 +105,10 @@ Some files which aren't necessarily needed in this backup are excluded. See
 :ref:`borg_patterns` on how to add more exclude options.
 
 After the backup this script also uses the :ref:`borg_prune` subcommand to keep
-only a certain number of old archives and deletes the others in order to preserve
-disk space.
+only a certain number of old archives and deletes the others.
+
+Finally, it uses the :ref:`borg_compact` subcommand to remove deleted objects
+from the segment files in the repository to preserve disk space.
 
 Before running, make sure that the repository is initialized as documented in
 :ref:`remote_repos` and that the script has the correct permissions to be executable
@@ -176,17 +178,24 @@ backed up and that the ``prune`` command is keeping and deleting the correct bac
 
     prune_exit=$?
 
+    # actually free repo disk space by compacting segments
+
+    borg compact
+
+    compact_exit=$?
+
     # use highest exit code as global exit code
     global_exit=$(( backup_exit > prune_exit ? backup_exit : prune_exit ))
+    global_exit=$(( compact_exit > global_exit ? compact_exit : global_exit ))
 
     if [ ${global_exit} -eq 1 ];
     then
-        info "Backup and/or Prune finished with a warning"
+        info "Backup, Prune and/or Compact finished with a warning"
     fi
 
     if [ ${global_exit} -gt 1 ];
     then
-        info "Backup and/or Prune finished with an error"
+        info "Backup, Prune and/or Compact finished with an error"
     fi
 
     exit ${global_exit}

+ 2 - 0
docs/usage/delete.rst

@@ -6,6 +6,8 @@ Examples
 
     # delete a single backup archive:
     $ borg delete /path/to/repo::Monday
+    # actually free disk space:
+    $ borg compact /path/to/repo
 
     # delete all archives whose names begin with the machine's hostname followed by "-"
     $ borg delete --prefix '{hostname}-' /path/to/repo

+ 42 - 7
docs/usage/notes.rst

@@ -148,16 +148,51 @@ Now, let's see how to restore some LVs from such a backup. ::
     $ borg extract --stdout /path/to/repo::arch dev/vg0/home-snapshot > /dev/vg0/home
 
 
+.. _separate_compaction:
+
+Separate compaction
+~~~~~~~~~~~~~~~~~~~
+
+Borg does not auto-compact the segment files in the repository at commit time
+(at the end of each repository-writing command) any more.
+
+This is new since borg 1.2.0 and requires borg >= 1.2.0 on client and server.
+
+As a result, the repository behaves most of the time as if it were in
+append-only mode (see below), until ``borg compact`` is invoked or an
+old client triggers auto-compaction.
+
+This has some notable consequences:
+
+- repository space is not freed immediately when deleting / pruning archives
+- commands finish quicker
+- repository is more robust and might be easier to recover after damage (as
+  it contains data in a more sequential manner, historic manifests, multiple
+  commits - until you run ``borg compact``)
+- user can choose when to run compaction (it should be done regularly, but not
+  necessarily after each single borg command)
+- user can choose from where to invoke ``borg compact`` to do the compaction
+  (from client or from server, it does not need a key)
+- less repo sync data traffic in case you create a copy of your repository by
+  using a sync tool (like rsync, rclone, ...)
+
+You can manually run compaction by invoking the ``borg compact`` command.
+
 .. _append_only_mode:
 
-Append-only mode
-~~~~~~~~~~~~~~~~
+Append-only mode (forbid compaction)
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+A repository can be made "append-only", which means that Borg will never
+overwrite or delete committed data (append-only refers to the segment files,
+but Borg will also refuse to delete the repository completely).
+
+If the ``borg compact`` command is used on a repo in append-only mode, there
+will be no warning or error, but no compaction will happen.
 
-A repository can be made "append-only", which means that Borg will never overwrite or
-delete committed data (append-only refers to the segment files, but borg will also
-reject to delete the repository completely). This is useful for scenarios where a
-backup client machine backups remotely to a backup server using ``borg serve``, since
-a hacked client machine cannot delete backups on the server permanently.
+Append-only mode is useful for scenarios where a backup client machine backs up
+remotely to a backup server using ``borg serve``, since a hacked client machine
+cannot delete backups on the server permanently.
 
 To activate append-only mode, set ``append_only`` to 1 in the repository config::
 

+ 2 - 0
docs/usage/prune.rst

@@ -23,6 +23,8 @@ first so you will see what it would do without it actually doing anything.
     # Same as above but only apply to archive names starting with the hostname
     # of the machine followed by a "-" character:
     $ borg prune -v --list --keep-daily=7 --keep-weekly=4 --prefix='{hostname}-' /path/to/repo
+    # actually free disk space:
+    $ borg compact /path/to/repo
 
     # Keep 7 end of day, 4 additional end of week archives,
     # and an end of month archive for every month:

+ 29 - 9
src/borg/archiver.py

@@ -2311,6 +2311,7 @@ class Archiver:
         # It will replace the entire :ref:`foo` verbatim.
         rst_plain_text_references = {
             'a_status_oddity': '"I am seeing ‘A’ (added) status for a unchanged file!?"',
+            'separate_compaction': '"Separate compaction"',
         }
 
         def process_epilog(epilog):
@@ -3220,9 +3221,13 @@ class Archiver:
 
         delete_epilog = process_epilog("""
         This command deletes an archive from the repository or the complete repository.
-        Disk space is reclaimed accordingly. If you delete the complete repository, the
-        local cache for it (if any) is also deleted. Alternatively, you can delete just
-        the local cache with the ``--cache-only`` option.
+
+        Important: When deleting archives, repository disk space is **not** freed until
+        you run ``borg compact``.
+
+        If you delete the complete repository, the local cache for it (if any) is
+        also deleted. Alternatively, you can delete just the local cache with the
+        ``--cache-only`` option.
 
         When using ``--stats``, you will get some statistics about how much data was
         deleted - the "Deleted data" deduplicated size there is most interesting as
@@ -3376,8 +3381,12 @@ class Archiver:
 
         prune_epilog = process_epilog("""
         The prune command prunes a repository by deleting all archives not matching
-        any of the specified retention options. This command is normally used by
-        automated backup scripts wanting to keep a certain number of historic backups.
+        any of the specified retention options.
+
+        Important: Repository disk space is **not** freed until you run ``borg compact``.
+
+        This command is normally used by automated backup scripts wanting to keep a
+        certain number of historic backups.
 
         Also, prune automatically removes checkpoint archives (incomplete archives left
         behind by interrupted backup runs) except if the checkpoint is the latest
@@ -3564,6 +3573,8 @@ class Archiver:
 
         This is an *experimental* feature. Do *not* use this on your only backup.
 
+        Important: Repository disk space is **not** freed until you run ``borg compact``.
+
         ``--exclude``, ``--exclude-from``, ``--exclude-if-present``, ``--keep-exclude-tags``, and PATH
         have the exact same semantics as in "borg create". If PATHs are specified the
         resulting archive will only contain files from these PATHs.
@@ -3592,10 +3603,9 @@ class Archiver:
 
         With ``--target`` the original archive is not replaced, instead a new archive is created.
 
-        When rechunking space usage can be substantial, expect at least the entire
-        deduplicated size of the archives using the previous chunker params.
-        When recompressing expect approx. (throughput / checkpoint-interval) in space usage,
-        assuming all chunks are recompressed.
+        When rechunking (or recompressing), space usage can be substantial - expect
+        at least the entire deduplicated size of the archives using the previous
+        chunker (or compression) params.
 
         If you recently ran borg check --repair and it had to fix lost chunks with all-zero
         replacement chunks, please first run another backup for the same data and re-run
@@ -3697,6 +3707,16 @@ class Archiver:
 
         compact_epilog = process_epilog("""
         This command frees repository space by compacting segments.
+
+        Use this regularly to avoid running out of space - you do not need to use this
+        after each borg command though.
+
+        ``borg compact`` does not need a key, so it can be invoked from either the
+        client or the server.
+
+        Depending on the number of segments that need compaction, it may take a while.
+
+        See :ref:`separate_compaction` in Additional Notes for more details.
         """)
         subparser = subparsers.add_parser('compact', parents=[common_parser], add_help=False,
                                           description=self.do_compact.__doc__,