update docs about separated compaction

Thomas Waldmann 7 years ago
parent
commit
e6fcf4ea42
6 changed files with 95 additions and 25 deletions
  1. docs/internals/data-structures.rst (+6 -4)
  2. docs/quickstart.rst (+14 -5)
  3. docs/usage/delete.rst (+2 -0)
  4. docs/usage/notes.rst (+42 -7)
  5. docs/usage/prune.rst (+2 -0)
  6. src/borg/archiver.py (+29 -9)

+ 6 - 4
docs/internals/data-structures.rst

@@ -158,9 +158,11 @@ such obsolete entries is called sparse, while a segment containing no such entri
 Since writing a ``DELETE`` tag does not actually delete any data and
 thus does not free disk space any log-based data store will need a
 compaction strategy (somewhat analogous to a garbage collector).
+
 Borg uses a simple forward compacting algorithm,
 which avoids modifying existing segments.
-Compaction runs when a commit is issued (unless the :ref:`append_only_mode` is active).
+Compaction runs when a commit is issued with ``compact=True`` parameter, e.g.
+by the ``borg compact`` command (unless the :ref:`append_only_mode` is active).
 One client transaction can manifest as multiple physical transactions,
 since compaction is transacted, too, and Borg does not distinguish between the two::
 
@@ -197,9 +199,9 @@ The 1.1.x series writes version 2 of the format and reads either version.
 When reading a version 1 hints file, Borg 1.1.x will
 read all sparse segments to determine their sparsity.
 
-This process may take some time if a repository is kept in the append-only mode,
-which causes the number of sparse segments to grow. Repositories not in append-only
-mode have no sparse segments in 1.0.x, since compaction is unconditional.
+This process may take some time if a repository has been kept in append-only mode
+or ``borg compact`` has not been used for a longer time, which both has caused
+the number of sparse segments to grow.
 
 Compaction processes sparse segments from oldest to newest; sparse segments
 which don't contain enough deleted data to justify compaction are skipped. This
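
The skip rule in the context lines above (oldest to newest, skipping sparse segments without enough deleted data) could be sketched roughly as follows. This is only an illustration under stated assumptions, not Borg's implementation: `Segment`, `segments_to_compact`, and the `threshold` fraction are hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    segment_id: int  # assumed to grow in write order, so lower means older
    size: int        # total bytes in the segment file
    freeable: int    # bytes held by deleted or superseded entries


def segments_to_compact(segments, threshold=0.1):
    """Return ids of sparse segments worth compacting, oldest first.

    `threshold` is a hypothetical minimum fraction of freeable bytes;
    segments below it are skipped because rewriting them would free
    too little space to justify the work.
    """
    selected = []
    for seg in sorted(segments, key=lambda s: s.segment_id):
        if seg.size <= 0 or seg.freeable == 0:
            continue  # not sparse: nothing to reclaim
        if seg.freeable / seg.size < threshold:
            continue  # sparse, but not enough deleted data
        selected.append(seg.segment_id)
    return selected
```

With this rule, a segment that is 5% deleted data is left alone while one that is 20% deleted data is selected, and the returned ids come out in oldest-first order regardless of input order.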

+ 14 - 5
docs/quickstart.rst

@@ -59,7 +59,7 @@ Also helpful:
 - if you use LVM: use a LV + a filesystem that you can resize later and have
   some unallocated PEs you can add to the LV.
 - consider using quotas
-- use `prune` regularly
+- use `prune` and `compact` regularly
 
 .. [1] This failsafe can fail in these circumstances:
 
@@ -105,8 +105,10 @@ Some files which aren't necessarily needed in this backup are excluded. See
 :ref:`borg_patterns` on how to add more exclude options.
 
 After the backup this script also uses the :ref:`borg_prune` subcommand to keep
-only a certain number of old archives and deletes the others in order to preserve
-disk space.
+only a certain number of old archives and deletes the others.
+
+Finally, it uses the :ref:`borg_compact` subcommand to remove deleted objects
+from the segment files in the repository to preserve disk space.
 
 Before running, make sure that the repository is initialized as documented in
 :ref:`remote_repos` and that the script has the correct permissions to be executable
@@ -176,17 +178,24 @@ backed up and that the ``prune`` command is keeping and deleting the correct bac
 
     prune_exit=$?
 
+    # actually free repo disk space by compacting segments
+
+    borg compact
+
+    compact_exit=$?
+
     # use highest exit code as global exit code
     global_exit=$(( backup_exit > prune_exit ? backup_exit : prune_exit ))
+    global_exit=$(( compact_exit > global_exit ? compact_exit : global_exit ))
 
     if [ ${global_exit} -eq 1 ];
     then
-        info "Backup and/or Prune finished with a warning"
+        info "Backup, Prune and/or Compact finished with a warning"
     fi
 
     if [ ${global_exit} -gt 1 ];
     then
-        info "Backup and/or Prune finished with an error"
+        info "Backup, Prune and/or Compact finished with an error"
     fi
 
     exit ${global_exit}
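
The two chained comparisons in the script amount to taking the largest of the three exit codes. As a sketch of that aggregation in isolation (the function name `summarize_exit_codes` is hypothetical, and the message strings simply mirror the script's `info` lines):

```python
def summarize_exit_codes(backup_exit, prune_exit, compact_exit):
    """Combine three exit codes the way the script's arithmetic does.

    Returns (global_exit, message). The message is None for exit code 0
    because the hunk shown has no message for that case.
    """
    # Equivalent to the two nested "a > b ? a : b" expressions in the script.
    global_exit = max(backup_exit, prune_exit, compact_exit)

    if global_exit == 1:
        message = "Backup, Prune and/or Compact finished with a warning"
    elif global_exit > 1:
        message = "Backup, Prune and/or Compact finished with an error"
    else:
        message = None
    return global_exit, message
```

Taking the maximum means a warning from any one step is never masked by a later step that succeeds, and an error always takes precedence over a warning.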

+ 2 - 0
docs/usage/delete.rst

@@ -6,6 +6,8 @@ Examples
 
     # delete a single backup archive:
     $ borg delete /path/to/repo::Monday
+    # actually free disk space:
+    $ borg compact /path/to/repo
 
     # delete all archives whose names begin with the machine's hostname followed by "-"
     $ borg delete --prefix '{hostname}-' /path/to/repo

+ 42 - 7
docs/usage/notes.rst

@@ -148,16 +148,51 @@ Now, let's see how to restore some LVs from such a backup. ::
     $ borg extract --stdout /path/to/repo::arch dev/vg0/home-snapshot > /dev/vg0/home
 
 
+.. _separate_compaction:
+
+Separate compaction
+~~~~~~~~~~~~~~~~~~~
+
+Borg does not auto-compact the segment files in the repository at commit time
+(at the end of each repository-writing command) any more.
+
+This is new since borg 1.2.0 and requires borg >= 1.2.0 on client and server.
+
+This causes a similar behaviour of the repository as if it was in append-only
+mode (see below) most of the time (until ``borg compact`` is invoked or an
+old client triggers auto-compaction).
+
+This has some notable consequences:
+
+- repository space is not freed immediately when deleting / pruning archives
+- commands finish quicker
+- repository is more robust and might be easier to recover after damages (as
+  it contains data in a more sequential manner, historic manifests, multiple
+  commits - until you run ``borg compact``)
+- user can choose when to run compaction (it should be done regularly, but not
+  necessarily after each single borg command)
+- user can choose from where to invoke ``borg compact`` to do the compaction
+  (from client or from server, it does not need a key)
+- less repo sync data traffic in case you create a copy of your repository by
+  using a sync tool (like rsync, rclone, ...)
+
+You can manually run compaction by invoking the ``borg compact`` command.
+
 .. _append_only_mode:
 
-Append-only mode
-~~~~~~~~~~~~~~~~
+Append-only mode (forbid compaction)
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+A repository can be made "append-only", which means that Borg will never
+overwrite or delete committed data (append-only refers to the segment files,
+but borg will also reject to delete the repository completely).
+
+If ``borg compact`` command is used on a repo in append-only mode, there
+will be no warning or error, but no compaction will happen.
 
-A repository can be made "append-only", which means that Borg will never overwrite or
-delete committed data (append-only refers to the segment files, but borg will also
-reject to delete the repository completely). This is useful for scenarios where a
-backup client machine backups remotely to a backup server using ``borg serve``, since
-a hacked client machine cannot delete backups on the server permanently.
+append-only is useful for scenarios where a backup client machine backups
+remotely to a backup server using ``borg serve``, since a hacked client machine
+cannot delete backups on the server permanently.
 
 To activate append-only mode, set ``append_only`` to 1 in the repository config::
 

+ 2 - 0
docs/usage/prune.rst

@@ -23,6 +23,8 @@ first so you will see what it would do without it actually doing anything.
     # Same as above but only apply to archive names starting with the hostname
     # of the machine followed by a "-" character:
     $ borg prune -v --list --keep-daily=7 --keep-weekly=4 --prefix='{hostname}-' /path/to/repo
+    # actually free disk space:
+    $ borg compact /path/to/repo
 
     # Keep 7 end of day, 4 additional end of week archives,
     # and an end of month archive for every month:

+ 29 - 9
src/borg/archiver.py

@@ -2311,6 +2311,7 @@ class Archiver:
         # It will replace the entire :ref:`foo` verbatim.
         rst_plain_text_references = {
             'a_status_oddity': '"I am seeing ‘A’ (added) status for a unchanged file!?"',
+            'separate_compaction': '"Separate compaction"',
         }
 
         def process_epilog(epilog):
@@ -3220,9 +3221,13 @@ class Archiver:
 
         delete_epilog = process_epilog("""
         This command deletes an archive from the repository or the complete repository.
-        Disk space is reclaimed accordingly. If you delete the complete repository, the
-        local cache for it (if any) is also deleted. Alternatively, you can delete just
-        the local cache with the ``--cache-only`` option.
+
+        Important: When deleting archives, repository disk space is **not** freed until
+        you run ``borg compact``.
+
+        If you delete the complete repository, the local cache for it (if any) is
+        also deleted. Alternatively, you can delete just the local cache with the
+        ``--cache-only`` option.
 
         When using ``--stats``, you will get some statistics about how much data was
         deleted - the "Deleted data" deduplicated size there is most interesting as
@@ -3376,8 +3381,12 @@ class Archiver:
 
         prune_epilog = process_epilog("""
         The prune command prunes a repository by deleting all archives not matching
-        any of the specified retention options. This command is normally used by
-        automated backup scripts wanting to keep a certain number of historic backups.
+        any of the specified retention options.
+
+        Important: Repository disk space is **not** freed until you run ``borg compact``.
+
+        This command is normally used by automated backup scripts wanting to keep a
+        certain number of historic backups.
 
         Also, prune automatically removes checkpoint archives (incomplete archives left
         behind by interrupted backup runs) except if the checkpoint is the latest
@@ -3564,6 +3573,8 @@ class Archiver:
 
         This is an *experimental* feature. Do *not* use this on your only backup.
 
+        Important: Repository disk space is **not** freed until you run ``borg compact``.
+
         ``--exclude``, ``--exclude-from``, ``--exclude-if-present``, ``--keep-exclude-tags``, and PATH
         have the exact same semantics as in "borg create". If PATHs are specified the
         resulting archive will only contain files from these PATHs.
@@ -3592,10 +3603,9 @@ class Archiver:
 
         With ``--target`` the original archive is not replaced, instead a new archive is created.
 
-        When rechunking space usage can be substantial, expect at least the entire
-        deduplicated size of the archives using the previous chunker params.
-        When recompressing expect approx. (throughput / checkpoint-interval) in space usage,
-        assuming all chunks are recompressed.
+        When rechunking (or recompressing), space usage can be substantial - expect
+        at least the entire deduplicated size of the archives using the previous
+        chunker (or compression) params.
 
         If you recently ran borg check --repair and it had to fix lost chunks with all-zero
         replacement chunks, please first run another backup for the same data and re-run
@@ -3697,6 +3707,16 @@ class Archiver:
 
         compact_epilog = process_epilog("""
         This command frees repository space by compacting segments.
+
+        Use this regularly to avoid running out of space - you do not need to use this
+        after each borg command though.
+
+        borg compact does not need a key, so it is possible to invoke it from the
+        client or also from the server.
+
+        Depending on the amount of segments that need compaction, it may take a while.
+
+        See :ref:`separate_compaction` in Additional Notes for more details.
         """)
         subparser = subparsers.add_parser('compact', parents=[common_parser], add_help=False,
                                           description=self.do_compact.__doc__,