
Merge pull request #3970 from ThomasWaldmann/compact-commits2

compact commits / separate compaction
TW 7 years ago
Parent
Current commit
9115ad58d9

+ 18 - 0
docs/changes.rst

@@ -139,6 +139,7 @@ Compatibility notes:
 - dropped support / testing for Python 3.4 and 3.5, minimum requirement is 3.6.
   In case your OS does not provide Python >= 3.6, consider using our binary,
   which does not need an external Python interpreter.
+- freeing repository space only happens when "borg compact" is invoked.
 - list: corrected mix-up of "isomtime" and "mtime" formats. Previously,
   "isomtime" was the default but produced a verbose human format,
   while "mtime" produced an ISO-8601-like format.
@@ -148,15 +149,32 @@ Compatibility notes:
 
 New features:
 
+- compact: "borg compact" needs to be used to free repository space by
+  compacting the segments (reading sparse segments, rewriting still needed
+  data to new segments, deleting the sparse segments).
+  Borg < 1.2 invoked compaction automatically at the end of each repository
+  writing command.
+  Borg >= 1.2 does not do that any more to give better speed, more control,
+  more segment file stability (== less stuff moving to newer segments) and
+  more robustness.
+  See the docs about "borg compact" for more details.
+- "borg compact --cleanup-commits" cleans up the many 17-byte
+  commit-only segment files caused by borg 1.1.x issue #2850.
+  Invoke this once after upgrading (the server side) borg to 1.2.
+  Compaction now automatically removes unneeded commit-only segment files.
 - prune: Show which rule was applied to keep archive, #2886
 
 Fixes:
 
+- repository compaction now automatically removes unneeded 17byte commit-only
+  segments, #2850
 - avoid stale filehandle issues, #3265
 - make swidth available on all posix platforms, #2667
 
 Other changes:
 
+- repository: better speed and less stuff moving around by using separate
+  segment files for manifest DELETEs and PUTs, #3947
 - use pyinstaller v3.3.1 to build binaries
 - msgpack: switch to recent "msgpack" pypi pkg name, #3890
 - llfuse: modernize / simplify llfuse version requirements

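Review note: before running the one-time `--cleanup-commits` pass, it can be useful to see how many tiny files a repository has accumulated. The following is a minimal sketch, not part of borg: `find_commit_only_candidates` is a hypothetical helper, and it assumes only that the segment files live somewhere under the directory you pass in. It matches on size alone, so any unrelated 17-byte file would also be reported.

```python
import os

COMMIT_ONLY_SIZE = 17  # bytes; the size quoted in the changelog entry above


def find_commit_only_candidates(data_dir):
    """Return the sorted paths of all files under data_dir that are exactly 17 bytes.

    This is a size-based heuristic only; it does not parse the files.
    """
    hits = []
    for root, _dirs, files in os.walk(data_dir):
        for name in files:
            path = os.path.join(root, name)
            if os.path.getsize(path) == COMMIT_ONLY_SIZE:
                hits.append(path)
    return sorted(hits)
```

Calling `len(find_commit_only_candidates("/path/to/repo"))` before and after the cleanup gives a rough idea of what was removed.
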
+ 6 - 4
docs/internals/data-structures.rst

@@ -158,9 +158,11 @@ such obsolete entries is called sparse, while a segment containing no such entri
 Since writing a ``DELETE`` tag does not actually delete any data and
 thus does not free disk space any log-based data store will need a
 compaction strategy (somewhat analogous to a garbage collector).
+
 Borg uses a simple forward compacting algorithm,
 which avoids modifying existing segments.
-Compaction runs when a commit is issued (unless the :ref:`append_only_mode` is active).
+Compaction runs when a commit is issued with the ``compact=True`` parameter, e.g.
+by the ``borg compact`` command (unless the :ref:`append_only_mode` is active).
 One client transaction can manifest as multiple physical transactions,
 since compaction is transacted, too, and Borg does not distinguish between the two::
 
@@ -197,9 +199,9 @@ The 1.1.x series writes version 2 of the format and reads either version.
 When reading a version 1 hints file, Borg 1.1.x will
 read all sparse segments to determine their sparsity.
 
-This process may take some time if a repository is kept in the append-only mode,
-which causes the number of sparse segments to grow. Repositories not in append-only
-mode have no sparse segments in 1.0.x, since compaction is unconditional.
+This process may take some time if a repository has been kept in append-only mode
+or ``borg compact`` has not been run for a long time, since both cause
+the number of sparse segments to grow.
 
 Compaction processes sparse segments from oldest to newest; sparse segments
 which don't contain enough deleted data to justify compaction are skipped. This

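Review note: the forward compaction described in this file (oldest to newest, skipping segments that are not sparse enough, never modifying an existing segment) can be sketched with a toy model. This is an illustration under stated assumptions, not borg's implementation: segments are plain lists of `(key, size)` PUT entries, each key appears at most once per segment, DELETE tags are not modelled, and the `threshold` default of 0.1 is an arbitrary value chosen for the example.

```python
def compact(segments, index, threshold=0.1):
    """Forward-compact a toy segment store in place and return the bytes freed.

    segments: dict mapping segment id -> list of (key, size) PUT entries
    index:    dict mapping key -> id of the segment holding its live PUT
    threshold: minimum fraction of dead bytes that justifies rewriting a segment
    """
    freed = 0
    # sorted() takes a snapshot, so segments created below are not revisited.
    for seg_id in sorted(segments):  # oldest to newest
        entries = segments[seg_id]
        total = sum(size for _key, size in entries)
        live = [(key, size) for key, size in entries if index.get(key) == seg_id]
        dead = total - sum(size for _key, size in live)
        if total == 0 or dead / total < threshold:
            continue  # not sparse enough to justify compaction
        if live:
            # Still-needed data moves forward into a brand-new segment.
            new_id = max(segments) + 1
            segments[new_id] = live
            for key, _size in live:
                index[key] = new_id
        # The old segment is dropped as a whole, never edited.
        del segments[seg_id]
        freed += dead
    return freed
```

Because old segments are only ever deleted, not rewritten, an interrupted run in this model leaves at worst a duplicate copy of live data rather than a half-modified file.
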
+ 87 - 0
docs/man/borg-compact.1

@@ -0,0 +1,87 @@
+.\" Man page generated from reStructuredText.
+.
+.TH BORG-COMPACT 1 "2018-07-14" "" "borg backup tool"
+.SH NAME
+borg-compact \- compact segment files in the repository
+.
+.nr rst2man-indent-level 0
+.
+.de1 rstReportMargin
+\\$1 \\n[an-margin]
+level \\n[rst2man-indent-level]
+level margin: \\n[rst2man-indent\\n[rst2man-indent-level]]
+-
+\\n[rst2man-indent0]
+\\n[rst2man-indent1]
+\\n[rst2man-indent2]
+..
+.de1 INDENT
+.\" .rstReportMargin pre:
+. RS \\$1
+. nr rst2man-indent\\n[rst2man-indent-level] \\n[an-margin]
+. nr rst2man-indent-level +1
+.\" .rstReportMargin post:
+..
+.de UNINDENT
+. RE
+.\" indent \\n[an-margin]
+.\" old: \\n[rst2man-indent\\n[rst2man-indent-level]]
+.nr rst2man-indent-level -1
+.\" new: \\n[rst2man-indent\\n[rst2man-indent-level]]
+.in \\n[rst2man-indent\\n[rst2man-indent-level]]u
+..
+.SH SYNOPSIS
+.sp
+borg [common options] compact [options] REPOSITORY
+.SH DESCRIPTION
+.sp
+This command frees repository space by compacting segments.
+.sp
+Use this regularly to avoid running out of space \- you do not need to use this
+after each borg command though.
+.sp
+borg compact does not need a key, so it is possible to invoke it from the
+client or also from the server.
+.sp
+Depending on the number of segments that need compaction, it may take a while.
+.sp
+See \fIseparate_compaction\fP in Additional Notes for more details.
+.SH OPTIONS
+.sp
+See \fIborg\-common(1)\fP for common options of Borg commands.
+.SS arguments
+.INDENT 0.0
+.TP
+.B REPOSITORY
+repository to compact
+.UNINDENT
+.SS optional arguments
+.INDENT 0.0
+.TP
+.B \-\-cleanup\-commits
+cleanup commit\-only 17\-byte segment files
+.UNINDENT
+.SH EXAMPLES
+.INDENT 0.0
+.INDENT 3.5
+.sp
+.nf
+.ft C
+# compact segments and free repo disk space
+$ borg compact /path/to/repo
+
+# same as above plus clean up 17byte commit\-only segments,
+# use this one time after upgrading borg (server) to 1.2+
+# to clean up the tiny segment files created by borg 1.1:
+$ borg compact \-\-cleanup\-commits /path/to/repo
+.ft P
+.fi
+.UNINDENT
+.UNINDENT
+.SH SEE ALSO
+.sp
+\fIborg\-common(1)\fP
+.SH AUTHOR
+The Borg Collective
+.\" Generated by docutils manpage writer.
+.

+ 33 - 4
docs/man/borg-delete.1

@@ -1,6 +1,6 @@
 .\" Man page generated from reStructuredText.
 .
-.TH BORG-DELETE 1 "2017-11-25" "" "borg backup tool"
+.TH BORG-DELETE 1 "2018-07-14" "" "borg backup tool"
 .SH NAME
 borg-delete \- Delete an existing repository or archives
 .
@@ -36,13 +36,28 @@ borg [common options] delete [options] [TARGET] [ARCHIVE...]
 .SH DESCRIPTION
 .sp
 This command deletes an archive from the repository or the complete repository.
-Disk space is reclaimed accordingly. If you delete the complete repository, the
-local cache for it (if any) is also deleted.
+.sp
+Important: When deleting archives, repository disk space is \fBnot\fP freed until
+you run \fBborg compact\fP\&.
+.sp
+If you delete the complete repository, the local cache for it (if any) is
+also deleted. Alternatively, you can delete just the local cache with the
+\fB\-\-cache\-only\fP option.
 .sp
 When using \fB\-\-stats\fP, you will get some statistics about how much data was
 deleted \- the "Deleted data" deduplicated size there is most interesting as
 that is how much your repository will shrink.
 Please note that the "All archives" stats refer to the state after deletion.
+.sp
+You can delete multiple archives by specifying their common prefix, if they
+have one, using the \fB\-\-prefix PREFIX\fP option. You can also specify a shell
+pattern to match multiple archives using the \fB\-\-glob\-archives GLOB\fP option
+(for more info on these patterns, see \fBborg help patterns\fP). Note that these
+two options are mutually exclusive.
+.sp
+To avoid accidentally deleting archives, especially when using glob patterns,
+it might be helpful to use the \fB\-\-dry\-run\fP option to test the command without
+actually making any changes to the repository.
 .SH OPTIONS
 .sp
 See \fIborg\-common(1)\fP for common options of Borg commands.
@@ -58,6 +73,9 @@ archives to delete
 .SS optional arguments
 .INDENT 0.0
 .TP
+.B \-n\fP,\fB  \-\-dry\-run
+do not change repository
+.TP
 .B \-s\fP,\fB  \-\-stats
 print statistics for the deleted archive
 .TP
@@ -96,6 +114,17 @@ consider last N archives after other filters were applied
 .ft C
 # delete a single backup archive:
 $ borg delete /path/to/repo::Monday
+# actually free disk space:
+$ borg compact /path/to/repo
+
+# delete all archives whose names begin with the machine\(aqs hostname followed by "\-"
+$ borg delete \-\-prefix \(aq{hostname}\-\(aq /path/to/repo
+
+# delete all archives whose names contain "\-2012\-"
+$ borg delete \-\-glob\-archives \(aq*\-2012\-*\(aq /path/to/repo
+
+# see what would be deleted if delete was run without \-\-dry\-run
+$ borg delete \-v \-\-dry\-run \-a \(aq*\-May\-*\(aq /path/to/repo
 
 # delete the whole repository and the related local cache:
 $ borg delete /path/to/repo
@@ -110,7 +139,7 @@ Type \(aqYES\(aq if you understand this and want to continue: YES
 .UNINDENT
 .SH SEE ALSO
 .sp
-\fIborg\-common(1)\fP
+\fIborg\-common(1)\fP, \fIborg\-compact(1)\fP
 .SH AUTHOR
 The Borg Collective
 .\" Generated by docutils manpage writer.

+ 11 - 18
docs/man/borg-prune.1

@@ -1,6 +1,6 @@
 .\" Man page generated from reStructuredText.
 .
-.TH BORG-PRUNE 1 "2017-11-25" "" "borg backup tool"
+.TH BORG-PRUNE 1 "2018-07-14" "" "borg backup tool"
 .SH NAME
 borg-prune \- Prune repository archives according to specified rules
 .
@@ -36,8 +36,12 @@ borg [common options] prune [options] [REPOSITORY]
 .SH DESCRIPTION
 .sp
 The prune command prunes a repository by deleting all archives not matching
-any of the specified retention options. This command is normally used by
-automated backup scripts wanting to keep a certain number of historic backups.
+any of the specified retention options.
+.sp
+Important: Repository disk space is \fBnot\fP freed until you run \fBborg compact\fP\&.
+.sp
+This command is normally used by automated backup scripts wanting to keep a
+certain number of historic backups.
 .sp
 Also, prune automatically removes checkpoint archives (incomplete archives left
 behind by interrupted backup runs) except if the checkpoint is the latest
@@ -162,6 +166,8 @@ $ borg prune \-v \-\-list \-\-dry\-run \-\-keep\-daily=7 \-\-keep\-weekly=4 /pat
 # Same as above but only apply to archive names starting with the hostname
 # of the machine followed by a "\-" character:
 $ borg prune \-v \-\-list \-\-keep\-daily=7 \-\-keep\-weekly=4 \-\-prefix=\(aq{hostname}\-\(aq /path/to/repo
+# actually free disk space:
+$ borg compact /path/to/repo
 
 # Keep 7 end of day, 4 additional end of week archives,
 # and an end of month archive for every month:
@@ -175,23 +181,10 @@ $ borg prune \-v \-\-list \-\-keep\-within=10d \-\-keep\-weekly=4 \-\-keep\-mont
 .UNINDENT
 .UNINDENT
 .sp
-There is also a visualized prune example in \fBdocs/misc/prune\-example.txt\fP:
-.IP "System Message: ERROR/3 (docs/virtmanpage.rst:, line 145)"
-Unknown directive type "highlight".
-.INDENT 0.0
-.INDENT 3.5
-.sp
-.nf
-.ft C
-\&.. highlight:: none
-
-.ft P
-.fi
-.UNINDENT
-.UNINDENT
+There is also a visualized prune example in \fBdocs/misc/prune\-example.txt\fP\&.
 .SH SEE ALSO
 .sp
-\fIborg\-common(1)\fP
+\fIborg\-common(1)\fP, \fIborg\-compact(1)\fP
 .SH AUTHOR
 The Borg Collective
 .\" Generated by docutils manpage writer.

+ 8 - 7
docs/man/borg-recreate.1

@@ -1,6 +1,6 @@
 .\" Man page generated from reStructuredText.
 .
-.TH BORG-RECREATE 1 "2017-11-25" "" "borg backup tool"
+.TH BORG-RECREATE 1 "2018-07-14" "" "borg backup tool"
 .SH NAME
 borg-recreate \- Re-create archives
 .
@@ -39,6 +39,8 @@ Recreate the contents of existing archives.
 .sp
 This is an \fIexperimental\fP feature. Do \fInot\fP use this on your only backup.
 .sp
+Important: Repository disk space is \fBnot\fP freed until you run \fBborg compact\fP\&.
+.sp
 \fB\-\-exclude\fP, \fB\-\-exclude\-from\fP, \fB\-\-exclude\-if\-present\fP, \fB\-\-keep\-exclude\-tags\fP, and PATH
 have the exact same semantics as in "borg create". If PATHs are specified the
 resulting archive will only contain files from these PATHs.
@@ -67,10 +69,9 @@ archive that is built during the operation exists at the same time at
 .sp
 With \fB\-\-target\fP the original archive is not replaced, instead a new archive is created.
 .sp
-When rechunking space usage can be substantial, expect at least the entire
-deduplicated size of the archives using the previous chunker params.
-When recompressing expect approx. (throughput / checkpoint\-interval) in space usage,
-assuming all chunks are recompressed.
+When rechunking (or recompressing), space usage can be substantial \- expect
+at least the entire deduplicated size of the archives using the previous
+chunker (or compression) params.
 .sp
 If you recently ran borg check \-\-repair and it had to fix lost chunks with all\-zero
 replacement chunks, please first run another backup for the same data and re\-run
@@ -151,8 +152,8 @@ manually specify the archive creation date/time (UTC, yyyy\-mm\-ddThh:mm:ss form
 .BI \-C \ COMPRESSION\fP,\fB \ \-\-compression \ COMPRESSION
 select compression algorithm, see the output of the "borg help compression" command for details.
 .TP
-.B \-\-recompress
-recompress data chunks according to \fB\-\-compression\fP if \fIif\-different\fP\&. When \fIalways\fP, chunks that are already compressed that way are not skipped, but compressed again. Only the algorithm is considered for \fIif\-different\fP, not the compression level (if any).
+.BI \-\-recompress \ MODE
+recompress data chunks according to \fB\-\-compression\fP\&. MODE \fIif\-different\fP: recompress if current compression is with a different compression algorithm (the level is not considered). MODE \fIalways\fP: recompress even if current compression is with the same compression algorithm (use this to change the compression level). MODE \fInever\fP (default): do not recompress.
 .TP
 .BI \-\-chunker\-params \ PARAMS
 specify the chunker parameters (CHUNK_MIN_EXP, CHUNK_MAX_EXP, HASH_MASK_BITS, HASH_WINDOW_SIZE) or \fIdefault\fP to use the current defaults. default: 19,23,21,4095

+ 14 - 5
docs/quickstart.rst

@@ -59,7 +59,7 @@ Also helpful:
 - if you use LVM: use a LV + a filesystem that you can resize later and have
   some unallocated PEs you can add to the LV.
 - consider using quotas
-- use `prune` regularly
+- use `prune` and `compact` regularly
 
 .. [1] This failsafe can fail in these circumstances:
 
@@ -105,8 +105,10 @@ Some files which aren't necessarily needed in this backup are excluded. See
 :ref:`borg_patterns` on how to add more exclude options.
 
 After the backup this script also uses the :ref:`borg_prune` subcommand to keep
-only a certain number of old archives and deletes the others in order to preserve
-disk space.
+only a certain number of old archives and deletes the others.
+
+Finally, it uses the :ref:`borg_compact` subcommand to remove deleted objects
+from the segment files in the repository to preserve disk space.
 
 Before running, make sure that the repository is initialized as documented in
 :ref:`remote_repos` and that the script has the correct permissions to be executable
@@ -176,17 +178,24 @@ backed up and that the ``prune`` command is keeping and deleting the correct bac
 
     prune_exit=$?
 
+    # actually free repo disk space by compacting segments
+
+    borg compact
+
+    compact_exit=$?
+
     # use highest exit code as global exit code
     global_exit=$(( backup_exit > prune_exit ? backup_exit : prune_exit ))
+    global_exit=$(( compact_exit > global_exit ? compact_exit : global_exit ))
 
     if [ ${global_exit} -eq 1 ];
     then
-        info "Backup and/or Prune finished with a warning"
+        info "Backup, Prune and/or Compact finished with a warning"
     fi
 
     if [ ${global_exit} -gt 1 ];
     then
-        info "Backup and/or Prune finished with an error"
+        info "Backup, Prune and/or Compact finished with an error"
     fi
 
     exit ${global_exit}

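Review note: the script now folds three exit codes into one by taking the highest. The same rule written as a small Python sketch, for readers who drive borg from Python instead of shell. `run_steps` and `describe` are hypothetical helpers; the warning/error split (1 versus greater than 1) is taken from the script's own `-eq 1` and `-gt 1` checks.

```python
import subprocess


def run_steps(steps):
    """Run every command in order, even after a failure, and return the highest exit code."""
    worst = 0
    for argv in steps:
        rc = subprocess.run(argv).returncode
        if rc < 0:
            # Killed by a signal: map to 128 + signal number, as a shell would report it.
            rc = 128 - rc
        worst = max(worst, rc)
    return worst


def describe(code):
    """Translate the combined exit code the same way the script's messages do."""
    if code == 0:
        return "success"
    if code == 1:
        return "warning"
    return "error"
```

Each element of `steps` is an argument list such as `["borg", "compact"]`, which, like the bare `borg compact` in the script, assumes the repository is supplied through the environment.
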
+ 1 - 0
docs/usage.rst

@@ -42,6 +42,7 @@ Usage
    usage/diff
    usage/delete
    usage/prune
+   usage/compact
    usage/info
    usage/mount
    usage/key

+ 15 - 0
docs/usage/compact.rst

@@ -0,0 +1,15 @@
+.. include:: compact.rst.inc
+
+Examples
+~~~~~~~~
+::
+
+    # compact segments and free repo disk space
+    $ borg compact /path/to/repo
+
+    # same as above plus clean up 17byte commit-only segments,
+    # use this one time after upgrading borg (server) to 1.2+
+    # to clean up the tiny segment files created by borg 1.1:
+    $ borg compact --cleanup-commits /path/to/repo
+
+

+ 63 - 0
docs/usage/compact.rst.inc

@@ -0,0 +1,63 @@
+.. IMPORTANT: this file is auto-generated from borg's built-in help, do not edit!
+
+.. _borg_compact:
+
+borg compact
+------------
+.. code-block:: none
+
+    borg [common options] compact [options] REPOSITORY
+
+.. only:: html
+
+    .. class:: borg-options-table
+
+    +-------------------------------------------------------+-----------------------+-------------------------------------------+
+    | **positional arguments**                                                                                                  |
+    +-------------------------------------------------------+-----------------------+-------------------------------------------+
+    |                                                       | ``REPOSITORY``        | repository to compact                     |
+    +-------------------------------------------------------+-----------------------+-------------------------------------------+
+    | **optional arguments**                                                                                                    |
+    +-------------------------------------------------------+-----------------------+-------------------------------------------+
+    |                                                       | ``--cleanup-commits`` | cleanup commit-only 17-byte segment files |
+    +-------------------------------------------------------+-----------------------+-------------------------------------------+
+    | .. class:: borg-common-opt-ref                                                                                            |
+    |                                                                                                                           |
+    | :ref:`common_options`                                                                                                     |
+    +-------------------------------------------------------+-----------------------+-------------------------------------------+
+
+    .. raw:: html
+
+        <script type='text/javascript'>
+        $(document).ready(function () {
+            $('.borg-options-table colgroup').remove();
+        })
+        </script>
+
+.. only:: latex
+
+    REPOSITORY
+        repository to compact
+
+
+    optional arguments
+        --cleanup-commits     cleanup commit-only 17-byte segment files
+
+
+    :ref:`common_options`
+        |
+
+Description
+~~~~~~~~~~~
+
+This command frees repository space by compacting segments.
+
+Use this regularly to avoid running out of space - you do not need to use this
+after each borg command though.
+
+borg compact does not need a key, so it is possible to invoke it from the
+client or also from the server.
+
+Depending on the number of segments that need compaction, it may take a while.
+
+See :ref:`separate_compaction` in Additional Notes for more details.

+ 2 - 0
docs/usage/delete.rst

@@ -6,6 +6,8 @@ Examples
 
     # delete a single backup archive:
     $ borg delete /path/to/repo::Monday
+    # actually free disk space:
+    $ borg compact /path/to/repo
 
     # delete all archives whose names begin with the machine's hostname followed by "-"
     $ borg delete --prefix '{hostname}-' /path/to/repo

+ 21 - 3
docs/usage/delete.rst.inc

@@ -21,6 +21,8 @@ borg delete
     +-----------------------------------------------------------------------------+---------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------+
     | **optional arguments**                                                                                                                                                                                                                                                       |
     +-----------------------------------------------------------------------------+---------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------+
+    |                                                                             | ``-n``, ``--dry-run``                 | do not change repository                                                                                                                               |
+    +-----------------------------------------------------------------------------+---------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------+
     |                                                                             | ``-s``, ``--stats``                   | print statistics for the deleted archive                                                                                                               |
     +-----------------------------------------------------------------------------+---------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------+
     |                                                                             | ``--cache-only``                      | delete only the local cache for the given repository                                                                                                   |
@@ -63,6 +65,7 @@ borg delete
 
 
     optional arguments
+        -n, --dry-run   do not change repository
         -s, --stats     print statistics for the deleted archive
         --cache-only    delete only the local cache for the given repository
         --force         force deletion of corrupted archives, use ``--force --force`` in case ``--force`` does not work.
@@ -84,10 +87,25 @@ Description
 ~~~~~~~~~~~
 
 This command deletes an archive from the repository or the complete repository.
-Disk space is reclaimed accordingly. If you delete the complete repository, the
-local cache for it (if any) is also deleted.
+
+Important: When deleting archives, repository disk space is **not** freed until
+you run ``borg compact``.
+
+If you delete the complete repository, the local cache for it (if any) is
+also deleted. Alternatively, you can delete just the local cache with the
+``--cache-only`` option.
 
 When using ``--stats``, you will get some statistics about how much data was
 deleted - the "Deleted data" deduplicated size there is most interesting as
 that is how much your repository will shrink.
-Please note that the "All archives" stats refer to the state after deletion.
+Please note that the "All archives" stats refer to the state after deletion.
+
+You can delete multiple archives by specifying their common prefix, if they
+have one, using the ``--prefix PREFIX`` option. You can also specify a shell
+pattern to match multiple archives using the ``--glob-archives GLOB`` option
+(for more info on these patterns, see ``borg help patterns``). Note that these
+two options are mutually exclusive.
+
+To avoid accidentally deleting archives, especially when using glob patterns,
+it might be helpful to use the ``--dry-run`` option to test the command without
+actually making any changes to the repository.

+ 42 - 7
docs/usage/notes.rst

@@ -148,16 +148,51 @@ Now, let's see how to restore some LVs from such a backup. ::
     $ borg extract --stdout /path/to/repo::arch dev/vg0/home-snapshot > /dev/vg0/home
 
 
+.. _separate_compaction:
+
+Separate compaction
+~~~~~~~~~~~~~~~~~~~
+
+Borg does not auto-compact the segment files in the repository at commit time
+(at the end of each repository-writing command) any more.
+
+This is new since borg 1.2.0 and requires borg >= 1.2.0 on client and server.
+
+As a result, the repository behaves much as if it were in append-only
+mode (see below) most of the time (until ``borg compact`` is invoked or an
+old client triggers auto-compaction).
+
+This has some notable consequences:
+
+- repository space is not freed immediately when deleting / pruning archives
+- commands finish quicker
+- repository is more robust and might be easier to recover after damage (as
+  it contains data in a more sequential manner, historic manifests, multiple
+  commits - until you run ``borg compact``)
+- user can choose when to run compaction (it should be done regularly, but not
+  necessarily after each single borg command)
+- user can choose from where to invoke ``borg compact`` to do the compaction
+  (from client or from server, it does not need a key)
+- less repo sync data traffic in case you create a copy of your repository by
+  using a sync tool (like rsync, rclone, ...)
+
+You can manually run compaction by invoking the ``borg compact`` command.
+
 .. _append_only_mode:
 
-Append-only mode
-~~~~~~~~~~~~~~~~
+Append-only mode (forbid compaction)
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+A repository can be made "append-only", which means that Borg will never
+overwrite or delete committed data (append-only refers to the segment files,
+but borg will also refuse to delete the repository completely).
+
+If the ``borg compact`` command is used on a repo in append-only mode, there
+will be no warning or error, but no compaction will happen.
 
-A repository can be made "append-only", which means that Borg will never overwrite or
-delete committed data (append-only refers to the segment files, but borg will also
-reject to delete the repository completely). This is useful for scenarios where a
-backup client machine backups remotely to a backup server using ``borg serve``, since
-a hacked client machine cannot delete backups on the server permanently.
+Append-only mode is useful for scenarios where a backup client machine backs up
+remotely to a backup server using ``borg serve``, since a hacked client machine
+cannot delete backups on the server permanently.
 
 To activate append-only mode, set ``append_only`` to 1 in the repository config::
 

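Review note: the two behaviours documented in this file (deleting does not free space until compaction runs, and compaction silently does nothing in append-only mode) can be illustrated with a toy log-structured store. `ToyRepo` is a hypothetical class written for this note; it models sizes only and is not how borg stores data.

```python
class ToyRepo:
    """Log of PUT/DELETE entries; space is reclaimed only by compact()."""

    def __init__(self, append_only=False):
        self.append_only = append_only
        self.log = []   # list of (tag, key, size) entries, in write order
        self.live = {}  # key -> size of the current value

    def put(self, key, size):
        self.log.append(("PUT", key, size))
        self.live[key] = size

    def delete(self, key):
        # Deleting only appends a marker; the old PUT stays in the log.
        self.log.append(("DELETE", key, 0))
        self.live.pop(key, None)

    def disk_usage(self):
        return sum(size for _tag, _key, size in self.log)

    def compact(self):
        """Rewrite the log with live entries only and return the bytes freed."""
        if self.append_only:
            return 0  # no warning, no error, no compaction
        before = self.disk_usage()
        self.log = [("PUT", key, size) for key, size in self.live.items()]
        return before - self.disk_usage()
```

In this model, `delete` leaves `disk_usage()` unchanged and only a later `compact()` shrinks it, which mirrors the "Important" notes added to the delete, prune and recreate docs in this commit.
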
+ 2 - 0
docs/usage/prune.rst

@@ -23,6 +23,8 @@ first so you will see what it would do without it actually doing anything.
     # Same as above but only apply to archive names starting with the hostname
     # of the machine followed by a "-" character:
     $ borg prune -v --list --keep-daily=7 --keep-weekly=4 --prefix='{hostname}-' /path/to/repo
+    # actually free disk space:
+    $ borg compact /path/to/repo
 
     # Keep 7 end of day, 4 additional end of week archives,
     # and an end of month archive for every month:

+ 6 - 2
docs/usage/prune.rst.inc

@@ -98,8 +98,12 @@ Description
 ~~~~~~~~~~~
 
 The prune command prunes a repository by deleting all archives not matching
-any of the specified retention options. This command is normally used by
-automated backup scripts wanting to keep a certain number of historic backups.
+any of the specified retention options.
+
+Important: Repository disk space is **not** freed until you run ``borg compact``.
+
+This command is normally used by automated backup scripts wanting to keep a
+certain number of historic backups.
 
 Also, prune automatically removes checkpoint archives (incomplete archives left
 behind by interrupted backup runs) except if the checkpoint is the latest

+ 59 - 58
docs/usage/recreate.rst.inc

@@ -12,59 +12,59 @@ borg recreate
 
     .. class:: borg-options-table
 
-    +-------------------------------------------------------+---------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
-    | **positional arguments**                                                                                                                                                                                                                                                                                                                                                        |
-    +-------------------------------------------------------+---------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
-    |                                                       | ``REPOSITORY_OR_ARCHIVE``                         | repository/archive to recreate                                                                                                                                                                                                                                      |
-    +-------------------------------------------------------+---------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
-    |                                                       | ``PATH``                                          | paths to recreate; patterns are supported                                                                                                                                                                                                                           |
-    +-------------------------------------------------------+---------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
-    | **optional arguments**                                                                                                                                                                                                                                                                                                                                                          |
-    +-------------------------------------------------------+---------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
-    |                                                       | ``--list``                                        | output verbose list of items (files, dirs, ...)                                                                                                                                                                                                                     |
-    +-------------------------------------------------------+---------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
-    |                                                       | ``--filter STATUSCHARS``                          | only display items with the given status characters (listed in borg create --help)                                                                                                                                                                                  |
-    +-------------------------------------------------------+---------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
-    |                                                       | ``-n``, ``--dry-run``                             | do not change anything                                                                                                                                                                                                                                              |
-    +-------------------------------------------------------+---------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
-    |                                                       | ``-s``, ``--stats``                               | print statistics at end                                                                                                                                                                                                                                             |
-    +-------------------------------------------------------+---------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
-    | .. class:: borg-common-opt-ref                                                                                                                                                                                                                                                                                                                                                  |
-    |                                                                                                                                                                                                                                                                                                                                                                                 |
-    | :ref:`common_options`                                                                                                                                                                                                                                                                                                                                                           |
-    +-------------------------------------------------------+---------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
-    | **Exclusion options**                                                                                                                                                                                                                                                                                                                                                           |
-    +-------------------------------------------------------+---------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
-    |                                                       | ``-e PATTERN``, ``--exclude PATTERN``             | exclude paths matching PATTERN                                                                                                                                                                                                                                      |
-    +-------------------------------------------------------+---------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
-    |                                                       | ``--exclude-from EXCLUDEFILE``                    | read exclude patterns from EXCLUDEFILE, one per line                                                                                                                                                                                                                |
-    +-------------------------------------------------------+---------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
-    |                                                       | ``--pattern PATTERN``                             | experimental: include/exclude paths matching PATTERN                                                                                                                                                                                                                |
-    +-------------------------------------------------------+---------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
-    |                                                       | ``--patterns-from PATTERNFILE``                   | experimental: read include/exclude patterns from PATTERNFILE, one per line                                                                                                                                                                                          |
-    +-------------------------------------------------------+---------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
-    |                                                       | ``--exclude-caches``                              | exclude directories that contain a CACHEDIR.TAG file (http://www.brynosaurus.com/cachedir/spec.html)                                                                                                                                                                |
-    +-------------------------------------------------------+---------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
-    |                                                       | ``--exclude-if-present NAME``                     | exclude directories that are tagged by containing a filesystem object with the given NAME                                                                                                                                                                           |
-    +-------------------------------------------------------+---------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
-    |                                                       | ``--keep-exclude-tags``, ``--keep-tag-files``     | if tag objects are specified with ``--exclude-if-present``, don't omit the tag objects themselves from the backup archive                                                                                                                                           |
-    +-------------------------------------------------------+---------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
-    | **Archive options**                                                                                                                                                                                                                                                                                                                                                             |
-    +-------------------------------------------------------+---------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
-    |                                                       | ``--target TARGET``                               | create a new archive with the name ARCHIVE, do not replace existing archive (only applies for a single archive)                                                                                                                                                     |
-    +-------------------------------------------------------+---------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
-    |                                                       | ``-c SECONDS``, ``--checkpoint-interval SECONDS`` | write checkpoint every SECONDS seconds (Default: 1800)                                                                                                                                                                                                              |
-    +-------------------------------------------------------+---------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
-    |                                                       | ``--comment COMMENT``                             | add a comment text to the archive                                                                                                                                                                                                                                   |
-    +-------------------------------------------------------+---------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
-    |                                                       | ``--timestamp TIMESTAMP``                         | manually specify the archive creation date/time (UTC, yyyy-mm-ddThh:mm:ss format). alternatively, give a reference file/directory.                                                                                                                                  |
-    +-------------------------------------------------------+---------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
-    |                                                       | ``-C COMPRESSION``, ``--compression COMPRESSION`` | select compression algorithm, see the output of the "borg help compression" command for details.                                                                                                                                                                    |
-    +-------------------------------------------------------+---------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
-    |                                                       | ``--recompress``                                  | recompress data chunks according to ``--compression`` if `if-different`. When `always`, chunks that are already compressed that way are not skipped, but compressed again. Only the algorithm is considered for `if-different`, not the compression level (if any). |
-    +-------------------------------------------------------+---------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
-    |                                                       | ``--chunker-params PARAMS``                       | specify the chunker parameters (CHUNK_MIN_EXP, CHUNK_MAX_EXP, HASH_MASK_BITS, HASH_WINDOW_SIZE) or `default` to use the current defaults. default: 19,23,21,4095                                                                                                    |
-    +-------------------------------------------------------+---------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
+    +-------------------------------------------------------+---------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
+    | **positional arguments**                                                                                                                                                                                                                                                                                                                                                                                                                                                               |
+    +-------------------------------------------------------+---------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
+    |                                                       | ``REPOSITORY_OR_ARCHIVE``                         | repository/archive to recreate                                                                                                                                                                                                                                                                                                                                             |
+    +-------------------------------------------------------+---------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
+    |                                                       | ``PATH``                                          | paths to recreate; patterns are supported                                                                                                                                                                                                                                                                                                                                  |
+    +-------------------------------------------------------+---------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
+    | **optional arguments**                                                                                                                                                                                                                                                                                                                                                                                                                                                                 |
+    +-------------------------------------------------------+---------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
+    |                                                       | ``--list``                                        | output verbose list of items (files, dirs, ...)                                                                                                                                                                                                                                                                                                                            |
+    +-------------------------------------------------------+---------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
+    |                                                       | ``--filter STATUSCHARS``                          | only display items with the given status characters (listed in borg create --help)                                                                                                                                                                                                                                                                                         |
+    +-------------------------------------------------------+---------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
+    |                                                       | ``-n``, ``--dry-run``                             | do not change anything                                                                                                                                                                                                                                                                                                                                                     |
+    +-------------------------------------------------------+---------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
+    |                                                       | ``-s``, ``--stats``                               | print statistics at end                                                                                                                                                                                                                                                                                                                                                    |
+    +-------------------------------------------------------+---------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
+    | .. class:: borg-common-opt-ref                                                                                                                                                                                                                                                                                                                                                                                                                                                         |
+    |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        |
+    | :ref:`common_options`                                                                                                                                                                                                                                                                                                                                                                                                                                                                  |
+    +-------------------------------------------------------+---------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
+    | **Exclusion options**                                                                                                                                                                                                                                                                                                                                                                                                                                                                  |
+    +-------------------------------------------------------+---------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
+    |                                                       | ``-e PATTERN``, ``--exclude PATTERN``             | exclude paths matching PATTERN                                                                                                                                                                                                                                                                                                                                             |
+    +-------------------------------------------------------+---------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
+    |                                                       | ``--exclude-from EXCLUDEFILE``                    | read exclude patterns from EXCLUDEFILE, one per line                                                                                                                                                                                                                                                                                                                       |
+    +-------------------------------------------------------+---------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
+    |                                                       | ``--pattern PATTERN``                             | experimental: include/exclude paths matching PATTERN                                                                                                                                                                                                                                                                                                                       |
+    +-------------------------------------------------------+---------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
+    |                                                       | ``--patterns-from PATTERNFILE``                   | experimental: read include/exclude patterns from PATTERNFILE, one per line                                                                                                                                                                                                                                                                                                 |
+    +-------------------------------------------------------+---------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
+    |                                                       | ``--exclude-caches``                              | exclude directories that contain a CACHEDIR.TAG file (http://www.brynosaurus.com/cachedir/spec.html)                                                                                                                                                                                                                                                                       |
+    +-------------------------------------------------------+---------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
+    |                                                       | ``--exclude-if-present NAME``                     | exclude directories that are tagged by containing a filesystem object with the given NAME                                                                                                                                                                                                                                                                                  |
+    +-------------------------------------------------------+---------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
+    |                                                       | ``--keep-exclude-tags``, ``--keep-tag-files``     | if tag objects are specified with ``--exclude-if-present``, don't omit the tag objects themselves from the backup archive                                                                                                                                                                                                                                                  |
+    +-------------------------------------------------------+---------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
+    | **Archive options**                                                                                                                                                                                                                                                                                                                                                                                                                                                                    |
+    +-------------------------------------------------------+---------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
+    |                                                       | ``--target TARGET``                               | create a new archive with the name ARCHIVE, do not replace existing archive (only applies for a single archive)                                                                                                                                                                                                                                                            |
+    +-------------------------------------------------------+---------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
+    |                                                       | ``-c SECONDS``, ``--checkpoint-interval SECONDS`` | write checkpoint every SECONDS seconds (Default: 1800)                                                                                                                                                                                                                                                                                                                     |
+    +-------------------------------------------------------+---------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
+    |                                                       | ``--comment COMMENT``                             | add a comment text to the archive                                                                                                                                                                                                                                                                                                                                          |
+    +-------------------------------------------------------+---------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
+    |                                                       | ``--timestamp TIMESTAMP``                         | manually specify the archive creation date/time (UTC, yyyy-mm-ddThh:mm:ss format). alternatively, give a reference file/directory.                                                                                                                                                                                                                                         |
+    +-------------------------------------------------------+---------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
+    |                                                       | ``-C COMPRESSION``, ``--compression COMPRESSION`` | select compression algorithm, see the output of the "borg help compression" command for details.                                                                                                                                                                                                                                                                           |
+    +-------------------------------------------------------+---------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
+    |                                                       | ``--recompress MODE``                             | recompress data chunks according to ``--compression``. MODE `if-different`: recompress if current compression is with a different compression algorithm (the level is not considered). MODE `always`: recompress even if current compression is with the same compression algorithm (use this to change the compression level). MODE `never` (default): do not recompress. |
+    +-------------------------------------------------------+---------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
+    |                                                       | ``--chunker-params PARAMS``                       | specify the chunker parameters (CHUNK_MIN_EXP, CHUNK_MAX_EXP, HASH_MASK_BITS, HASH_WINDOW_SIZE) or `default` to use the current defaults. default: 19,23,21,4095                                                                                                                                                                                                           |
+    +-------------------------------------------------------+---------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
 
     .. raw:: html
 
@@ -108,7 +108,7 @@ borg recreate
         --comment COMMENT                             add a comment text to the archive
         --timestamp TIMESTAMP                         manually specify the archive creation date/time (UTC, yyyy-mm-ddThh:mm:ss format). alternatively, give a reference file/directory.
         -C COMPRESSION, --compression COMPRESSION     select compression algorithm, see the output of the "borg help compression" command for details.
-        --recompress                                  recompress data chunks according to ``--compression`` if `if-different`. When `always`, chunks that are already compressed that way are not skipped, but compressed again. Only the algorithm is considered for `if-different`, not the compression level (if any).
+        --recompress MODE                             recompress data chunks according to ``--compression``. MODE `if-different`: recompress if current compression is with a different compression algorithm (the level is not considered). MODE `always`: recompress even if current compression is with the same compression algorithm (use this to change the compression level). MODE `never` (default): do not recompress.
         --chunker-params PARAMS                       specify the chunker parameters (CHUNK_MIN_EXP, CHUNK_MAX_EXP, HASH_MASK_BITS, HASH_WINDOW_SIZE) or `default` to use the current defaults. default: 19,23,21,4095
 
 
@@ -119,6 +119,8 @@ Recreate the contents of existing archives.
 
 This is an *experimental* feature. Do *not* use this on your only backup.
 
+Important: Repository disk space is **not** freed until you run ``borg compact``.
+
 ``--exclude``, ``--exclude-from``, ``--exclude-if-present``, ``--keep-exclude-tags``, and PATH
 have the exact same semantics as in "borg create". If PATHs are specified the
 resulting archive will only contain files from these PATHs.
@@ -147,10 +149,9 @@ archive that is built during the operation exists at the same time at
 
 With ``--target`` the original archive is not replaced, instead a new archive is created.
 
-When rechunking space usage can be substantial, expect at least the entire
-deduplicated size of the archives using the previous chunker params.
-When recompressing expect approx. (throughput / checkpoint-interval) in space usage,
-assuming all chunks are recompressed.
+When rechunking (or recompressing), space usage can be substantial - expect
+at least the entire deduplicated size of the archives using the previous
+chunker (or compression) params.
 
 If you recently ran borg check --repair and it had to fix lost chunks with all-zero
 replacement chunks, please first run another backup for the same data and re-run

+ 2 - 0
setup_docs.py

@@ -273,6 +273,8 @@ class build_man(Command):
         'mount': ('umount', 'extract'),  # Would be cooler if these two were on the same page
         'umount': ('mount', ),
         'extract': ('mount', ),
+        'delete': ('compact', ),
+        'prune': ('compact', ),
     }
 
     rst_prelude = textwrap.dedent("""

+ 3 - 4
src/borg/archive.py

@@ -493,7 +493,7 @@ Utilization of max. archive size: {csize_max:.0%}
             pass
         self.manifest.archives[name] = (self.id, metadata.time)
         self.manifest.write()
-        self.repository.commit()
+        self.repository.commit(compact=False)
         self.cache.commit()
 
     def calc_stats(self, cache):
@@ -1722,9 +1722,8 @@ class ArchiveChecker:
         if self.repair:
             logger.info('Writing Manifest.')
             self.manifest.write()
-            logger.info('Committing repo (may take a while, due to compact_segments)...')
-            self.repository.commit(save_space=save_space)
-            logger.info('Finished committing repo.')
+            logger.info('Committing repo.')
+            self.repository.commit(compact=False, save_space=save_space)
 
 
 class ArchiveRecreater:

+ 64 - 20
src/borg/archiver.py

@@ -243,7 +243,7 @@ class Archiver:
         manifest = Manifest(key, repository)
         manifest.key = key
         manifest.write()
-        repository.commit()
+        repository.commit(compact=False)
         with Cache(repository, key, manifest, warn_if_unencrypted=False):
             pass
         if key.tam_required:
@@ -1012,7 +1012,7 @@ class Archiver:
         name = replace_placeholders(args.name)
         archive.rename(name)
         manifest.write()
-        repository.commit()
+        repository.commit(compact=False)
         cache.commit()
         return self.exit_code
 
@@ -1062,7 +1062,7 @@ class Archiver:
             elif deleted:
                 manifest.write()
                 # note: might crash in compact() after committing the repo
-                repository.commit()
+                repository.commit(compact=False)
                 logger.info('Done. Run "borg check --repair" to clean up the mess.')
             else:
                 logger.warning('Aborted.')
@@ -1078,7 +1078,7 @@ class Archiver:
                         stats, progress=args.progress, forced=args.forced)
             if not dry_run:
                 manifest.write()
-                repository.commit(save_space=args.save_space)
+                repository.commit(compact=False, save_space=args.save_space)
                 cache.commit()
             if args.stats:
                 log_multi(DASHES,
@@ -1387,7 +1387,7 @@ class Archiver:
             pi.finish()
             if to_delete and not args.dry_run:
                 manifest.write()
-                repository.commit(save_space=args.save_space)
+                repository.commit(compact=False, save_space=args.save_space)
                 cache.commit()
             if args.stats:
                 log_multi(DASHES,
@@ -1414,7 +1414,7 @@ class Archiver:
                     print(format_archive(archive_info), '[%s]' % bin_to_hex(archive_info.id))
                 manifest.config[b'tam_required'] = True
                 manifest.write()
-                repository.commit()
+                repository.commit(compact=False)
             if not key.tam_required:
                 key.tam_required = True
                 key.change_passphrase(key._passphrase)
@@ -1437,7 +1437,7 @@ class Archiver:
                     print('Key location:', key.find_key())
             manifest.config[b'tam_required'] = False
             manifest.write()
-            repository.commit()
+            repository.commit(compact=False)
         else:
             # mainly for upgrades from Attic repositories,
             # but also supports borg 0.xx -> 1.0 upgrade.
@@ -1500,7 +1500,7 @@ class Archiver:
                     logger.info('Skipped archive %s: Nothing to do. Archive was not processed.', name)
         if not args.dry_run:
             manifest.write()
-            repository.commit()
+            repository.commit(compact=False)
             cache.commit()
         return self.exit_code
 
@@ -1532,7 +1532,16 @@ class Archiver:
             # that would be bad if somebody uses rsync with ignore-existing (or
             # any other mechanism relying on existing segment data not changing).
             # see issue #1867.
-            repository.commit()
+            repository.commit(compact=False)
+
+    @with_repository(manifest=False, exclusive=True)
+    def do_compact(self, args, repository):
+        """compact segment files in the repository"""
+        # see the comment in do_with_lock about why we do it like this:
+        data = repository.get(Manifest.MANIFEST_ID)
+        repository.put(Manifest.MANIFEST_ID, data)
+        repository.commit(compact=True, cleanup_commits=args.cleanup_commits)
+        return EXIT_SUCCESS
 
     @with_repository(exclusive=True, manifest=False)
     def do_config(self, args, repository):
@@ -1788,7 +1797,7 @@ class Archiver:
             h = hashlib.sha256(data)  # XXX hardcoded
             repository.put(h.digest(), data)
             print("object %s put." % h.hexdigest())
-        repository.commit()
+        repository.commit(compact=False)
         return EXIT_SUCCESS
 
     @with_repository(manifest=False, exclusive=True)
@@ -1808,7 +1817,7 @@ class Archiver:
                 except Repository.ObjectNotFound:
                     print("object %s not found." % hex_id)
         if modified:
-            repository.commit()
+            repository.commit(compact=False)
         print('Done.')
         return EXIT_SUCCESS
 
@@ -2302,6 +2311,7 @@ class Archiver:
         # It will replace the entire :ref:`foo` verbatim.
         rst_plain_text_references = {
             'a_status_oddity': '"I am seeing ‘A’ (added) status for a unchanged file!?"',
+            'separate_compaction': '"Separate compaction"',
         }
 
         def process_epilog(epilog):
@@ -3211,9 +3221,13 @@ class Archiver:
 
         delete_epilog = process_epilog("""
         This command deletes an archive from the repository or the complete repository.
-        Disk space is reclaimed accordingly. If you delete the complete repository, the
-        local cache for it (if any) is also deleted. Alternatively, you can delete just
-        the local cache with the ``--cache-only`` option.
+
+        Important: When deleting archives, repository disk space is **not** freed until
+        you run ``borg compact``.
+
+        If you delete the complete repository, the local cache for it (if any) is
+        also deleted. Alternatively, you can delete just the local cache with the
+        ``--cache-only`` option.
 
         When using ``--stats``, you will get some statistics about how much data was
         deleted - the "Deleted data" deduplicated size there is most interesting as
@@ -3367,8 +3381,12 @@ class Archiver:
 
         prune_epilog = process_epilog("""
         The prune command prunes a repository by deleting all archives not matching
-        any of the specified retention options. This command is normally used by
-        automated backup scripts wanting to keep a certain number of historic backups.
+        any of the specified retention options.
+
+        Important: Repository disk space is **not** freed until you run ``borg compact``.
+
+        This command is normally used by automated backup scripts wanting to keep a
+        certain number of historic backups.
 
         Also, prune automatically removes checkpoint archives (incomplete archives left
         behind by interrupted backup runs) except if the checkpoint is the latest
@@ -3555,6 +3573,8 @@ class Archiver:
 
         This is an *experimental* feature. Do *not* use this on your only backup.
 
+        Important: Repository disk space is **not** freed until you run ``borg compact``.
+
         ``--exclude``, ``--exclude-from``, ``--exclude-if-present``, ``--keep-exclude-tags``, and PATH
         have the exact same semantics as in "borg create". If PATHs are specified the
         resulting archive will only contain files from these PATHs.
@@ -3583,10 +3603,9 @@ class Archiver:
 
         With ``--target`` the original archive is not replaced, instead a new archive is created.
 
-        When rechunking space usage can be substantial, expect at least the entire
-        deduplicated size of the archives using the previous chunker params.
-        When recompressing expect approx. (throughput / checkpoint-interval) in space usage,
-        assuming all chunks are recompressed.
+        When rechunking (or recompressing), space usage can be substantial - expect
+        at least the entire deduplicated size of the archives using the previous
+        chunker (or compression) params.
 
         If you recently ran borg check --repair and it had to fix lost chunks with all-zero
         replacement chunks, please first run another backup for the same data and re-run
@@ -3686,6 +3705,31 @@ class Archiver:
         subparser.add_argument('args', metavar='ARGS', nargs=argparse.REMAINDER,
                                help='command arguments')
 
+        compact_epilog = process_epilog("""
+        This command frees repository space by compacting segments.
+
+        Use this regularly to avoid running out of space - you do not need to use this
+        after each borg command though.
+
+        borg compact does not need a key, so it can be invoked from either the
+        client or the server.
+
+        Depending on the number of segments that need compaction, it may take a while.
+
+        See :ref:`separate_compaction` in Additional Notes for more details.
+        """)
+        subparser = subparsers.add_parser('compact', parents=[common_parser], add_help=False,
+                                          description=self.do_compact.__doc__,
+                                          epilog=compact_epilog,
+                                          formatter_class=argparse.RawDescriptionHelpFormatter,
+                                          help='compact segment files / free space in repo')
+        subparser.set_defaults(func=self.do_compact)
+        subparser.add_argument('location', metavar='REPOSITORY',
+                               type=location_validator(archive=False),
+                               help='repository to compact')
+        subparser.add_argument('--cleanup-commits', dest='cleanup_commits', action='store_true',
+                               help='clean up commit-only 17-byte segment files')
+
         config_epilog = process_epilog("""
         This command gets and sets options in a local repository or cache config file.
         For security reasons, this command only works on local repositories.
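
The help texts above say that delete, prune and recreate no longer free space by themselves and that "borg compact" has to be run separately. A minimal sketch of how a backup script might order the two steps follows. The helper name maintenance_commands and the retention value are hypothetical; the code only builds argument lists and does not run borg.

```python
def maintenance_commands(repo, keep_daily=7, cleanup_commits=False):
    """Return the argv lists for a prune followed by a compact.

    Hypothetical helper for illustration; it does not invoke borg itself.
    """
    prune = ['borg', 'prune', '--keep-daily', str(keep_daily), repo]
    compact = ['borg', 'compact']
    if cleanup_commits:
        # one-time cleanup of the commit-only segment files (issue #2850)
        compact.append('--cleanup-commits')
    compact.append(repo)
    # prune only marks data as unused; compact is what frees the space
    return [prune, compact]


for argv in maintenance_commands('/path/to/repo'):
    print(' '.join(argv))
```

Each returned list could be passed to subprocess.run(argv, check=True). Since compaction does not have to follow every command, a script might also run the second step less often than the first.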

+ 6 - 2
src/borg/remote.py

@@ -462,6 +462,8 @@ def api(*, since, **kwargs_decorator):
                     continue
                 if 'previously' in restriction and named[name] == restriction['previously']:
                     continue
+                if restriction.get('dontcare', False):
+                    continue
 
                 raise self.RPCServerOutdated("{0} {1}={2!s}".format(f.__name__, name, named[name]),
                                              format_version(restriction['since']))
@@ -889,8 +891,10 @@ This problem will go away as soon as the server has been upgraded to 1.0.7+.
     def check(self, repair=False, save_space=False):
         """actual remoting is done via self.call in the @api decorator"""
 
-    @api(since=parse_version('1.0.0'))
-    def commit(self, save_space=False):
+    @api(since=parse_version('1.0.0'),
+         compact={'since': parse_version('1.2.0a0'), 'previously': True, 'dontcare': True},
+         cleanup_commits={'since': parse_version('1.2.0a0'), 'previously': False, 'dontcare': True})
+    def commit(self, save_space=False, compact=True, cleanup_commits=False):
         """actual remoting is done via self.call in the @api decorator"""
 
     @api(since=parse_version('1.0.0'))
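
The 'dontcare' key added above lets a new client pass compact= and cleanup_commits= to an older server without triggering RPCServerOutdated. The following is a simplified, standalone sketch of that decision, not the decorator itself: check_restrictions is a hypothetical name, versions are plain tuples, and the first check (server new enough) is an assumption because that line is not part of the hunk.

```python
class RPCServerOutdated(Exception):
    """Stand-in for the exception raised in the hunk above."""


def check_restrictions(server_version, named, restrictions):
    """Raise RPCServerOutdated if an argument cannot be honoured by the server.

    restrictions maps argument names to dicts with 'since' and, optionally,
    'previously' and 'dontcare', like the keyword arguments given to @api.
    """
    for name, restriction in restrictions.items():
        if name not in named:
            continue
        # Assumption: a server at or above 'since' understands the argument.
        if server_version >= restriction['since']:
            continue
        # The requested value is what old servers did anyway.
        if 'previously' in restriction and named[name] == restriction['previously']:
            continue
        # Added by this change: the caller accepts the old server's behaviour.
        if restriction.get('dontcare', False):
            continue
        raise RPCServerOutdated('{0}={1!s}'.format(name, named[name]))


# An old server will still compact on commit; with 'dontcare' that is tolerated.
check_restrictions((1, 1, 0), {'compact': False},
                   {'compact': {'since': (1, 2, 0), 'previously': True, 'dontcare': True}})
```

Without 'dontcare', the same call with compact=False against a 1.1 server would raise, because the value differs from what the old server did.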

+ 23 - 15
src/borg/repository.py

@@ -21,6 +21,7 @@ from .helpers import ProgressIndicatorPercent
 from .helpers import bin_to_hex
 from .helpers import hostname_is_unique
 from .helpers import secure_erase, truncate_and_unlink
+from .helpers import Manifest
 from .locking import Lock, LockError, LockErrorT
 from .logger import create_logger
 from .lrucache import LRUCache
@@ -416,7 +417,7 @@ class Repository:
             self.lock.release()
             self.lock = None
 
-    def commit(self, save_space=False):
+    def commit(self, save_space=False, compact=True, cleanup_commits=False):
         """Commit transaction
         """
         # save_space is not used anymore, but stays for RPC/API compatibility.
@@ -426,8 +427,17 @@ class Repository:
             raise exception
         self.check_free_space()
         self.log_storage_quota()
-        self.io.write_commit()
-        if not self.append_only:
+        segment = self.io.write_commit()
+        self.segments.setdefault(segment, 0)
+        self.compact[segment] += LoggedIO.header_fmt.size
+        if compact and not self.append_only:
+            if cleanup_commits:
+                # due to bug #2850, there might be a lot of commit-only segment files.
+                # this is a one-time cleanup of these 17-byte files.
+                for segment, filename in self.io.segment_iterator():
+                    if os.path.getsize(filename) == 17:
+                        self.segments[segment] = 0
+                        self.compact[segment] = LoggedIO.header_fmt.size
             self.compact_segments()
         self.write_index()
         self.rollback()
@@ -460,7 +470,7 @@ class Repository:
                 raise
             self.prepare_txn(self.get_transaction_id())
             # don't leave an open transaction around
-            self.commit()
+            self.commit(compact=False)
             return self.open_index(self.get_transaction_id())
 
     def prepare_txn(self, transaction_id, do_cleanup=True):
@@ -676,6 +686,8 @@ class Repository:
             nonlocal unused
             # commit the new, compact, used segments
             segment = self.io.write_commit(intermediate=intermediate)
+            self.segments.setdefault(segment, 0)
+            self.compact[segment] += LoggedIO.header_fmt.size
             logger.debug('complete_xfer: wrote %scommit at segment %d', 'intermediate ' if intermediate else '', segment)
             # get rid of the old, sparse, unused segments. free space.
             for segment in unused:
@@ -951,7 +963,6 @@ class Repository:
                     if current_index.get(key, (-1, -1)) != value:
                         report_error('Index mismatch for key {}. {} != {}'.format(key, value, current_index.get(key, (-1, -1))))
         if repair:
-            self.compact_segments()
             self.write_index()
         self.rollback()
         if error_found:
@@ -1240,8 +1251,8 @@ class LoggedIO:
     def segment_filename(self, segment):
         return os.path.join(self.path, 'data', str(segment // self.segments_per_dir), str(segment))
 
-    def get_write_fd(self, no_new=False, raise_full=False):
-        if not no_new and self.offset and self.offset > self.limit:
+    def get_write_fd(self, no_new=False, want_new=False, raise_full=False):
+        if not no_new and (want_new or self.offset and self.offset > self.limit):
             if raise_full:
                 raise self.SegmentFull
             self.close_segment()
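
The condition in the changed get_write_fd line mixes "or" and "and", so its grouping is easy to misread. The sketch below restates only that condition as a standalone function; needs_new_segment is a hypothetical name and the real method works on instance attributes instead of parameters.

```python
def needs_new_segment(offset, limit, no_new=False, want_new=False):
    """Mirror of: not no_new and (want_new or offset and offset > limit).

    'and' binds tighter than 'or', so want_new alone is enough to start a
    new segment, while no_new vetoes everything.
    """
    return bool(not no_new and (want_new or (offset and offset > limit)))


# want_new forces a new segment even when the current one is not full ...
print(needs_new_segment(offset=100, limit=1000, want_new=True))
# ... unless the caller passed no_new.
print(needs_new_segment(offset=100, limit=1000, want_new=True, no_new=True))
```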
@@ -1453,7 +1464,7 @@ class LoggedIO:
         if data_size > MAX_DATA_SIZE:
             # this would push the segment entry size beyond MAX_OBJECT_SIZE.
             raise IntegrityError('More than allowed put data [{} > {}]'.format(data_size, MAX_DATA_SIZE))
-        fd = self.get_write_fd(raise_full=raise_full)
+        fd = self.get_write_fd(want_new=(id == Manifest.MANIFEST_ID), raise_full=raise_full)
         size = data_size + self.put_header_fmt.size
         offset = self.offset
         header = self.header_no_crc_fmt.pack(size, TAG_PUT)
@@ -1463,7 +1474,7 @@ class LoggedIO:
         return self.segment, offset
 
     def write_delete(self, id, raise_full=False):
-        fd = self.get_write_fd(raise_full=raise_full)
+        fd = self.get_write_fd(want_new=(id == Manifest.MANIFEST_ID), raise_full=raise_full)
         header = self.header_no_crc_fmt.pack(self.put_header_fmt.size, TAG_DELETE)
         crc = self.crc_fmt.pack(crc32(id, crc32(header)) & 0xffffffff)
         fd.write(b''.join((crc, header, id)))
@@ -1471,14 +1482,11 @@ class LoggedIO:
         return self.segment, self.put_header_fmt.size
 
     def write_commit(self, intermediate=False):
+        # Intermediate commits go directly into the current segment - this makes checking their validity more
+        # expensive, but is faster and reduces clobber. Final commits go into a new segment.
+        fd = self.get_write_fd(want_new=not intermediate)
         if intermediate:
-            # Intermediate commits go directly into the current segment - this makes checking their validity more
-            # expensive, but is faster and reduces clobber.
-            fd = self.get_write_fd()
             fd.sync()
-        else:
-            self.close_segment()
-            fd = self.get_write_fd()
         header = self.header_no_crc_fmt.pack(self.header_fmt.size, TAG_COMMIT)
         crc = self.crc_fmt.pack(crc32(header) & 0xffffffff)
         fd.write(b''.join((crc, header)))
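
The size of 17 bytes checked in the cleanup code follows from the on-disk layout of a segment that holds nothing but a commit. The sketch below rebuilds such a file in memory using the same packing steps as write_commit. The struct layouts, the magic value b'BORG_SEG' and TAG_COMMIT = 2 are assumptions, since the diff does not show them.

```python
import struct
import zlib

# Assumed constants, not shown in the diff above.
MAGIC = b'BORG_SEG'   # written at the start of every segment file
TAG_COMMIT = 2

crc_fmt = struct.Struct('<I')             # 4-byte CRC32
header_no_crc_fmt = struct.Struct('<IB')  # 4-byte entry size + 1-byte tag
header_fmt = struct.Struct('<IIB')        # crc + size + tag


def commit_entry():
    """Build a commit entry with the same steps as write_commit()."""
    header = header_no_crc_fmt.pack(header_fmt.size, TAG_COMMIT)
    crc = crc_fmt.pack(zlib.crc32(header) & 0xffffffff)
    return b''.join((crc, header))


def commit_only_segment():
    """Contents of a segment file holding only the magic and one commit."""
    return MAGIC + commit_entry()


print(len(commit_entry()), len(commit_only_segment()))  # prints: 9 17
```

Under these assumptions the entry is 9 bytes and the magic adds 8, which gives the 17 bytes that the cleanup code compares against os.path.getsize(filename).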

+ 17 - 17
src/borg/testsuite/archiver.py

@@ -1463,7 +1463,7 @@ class ArchiverTestCase(ArchiverTestCaseBase):
                 if 'chunks' in item:
                     first_chunk_id = item.chunks[0].id
                     repository.delete(first_chunk_id)
-                    repository.commit()
+                    repository.commit(compact=False)
                     break
         output = self.cmd('delete', '--force', self.repository_location + '::test')
         self.assert_in('deleted archive was corrupted', output)
@@ -1479,7 +1479,7 @@ class ArchiverTestCase(ArchiverTestCaseBase):
             archive = Archive(repository, key, manifest, 'test')
             id = archive.metadata.items[0]
             repository.put(id, b'corrupted items metadata stream chunk')
-            repository.commit()
+            repository.commit(compact=False)
         self.cmd('delete', '--force', '--force', self.repository_location + '::test')
         self.cmd('check', '--repair', self.repository_location)
         output = self.cmd('list', self.repository_location)
@@ -1533,7 +1533,7 @@ class ArchiverTestCase(ArchiverTestCaseBase):
             manifest, key = Manifest.load(repository, Manifest.NO_OPERATION_CHECK)
             manifest.config[b'feature_flags'] = {operation.value.encode(): {b'mandatory': [b'unknown-feature']}}
             manifest.write()
-            repository.commit()
+            repository.commit(compact=False)
 
     def cmd_raises_unknown_feature(self, args):
         if self.FORK_DEFAULT:
@@ -2249,7 +2249,7 @@ class ArchiverTestCase(ArchiverTestCaseBase):
                     break
             else:
                 assert False  # missed the file
-            repository.commit()
+            repository.commit(compact=False)
         self.cmd('check', '--repair', self.repository_location, exit_code=0)
 
         mountpoint = os.path.join(self.tmpdir, 'mountpoint')
@@ -2970,7 +2970,7 @@ class ArchiverCheckTestCase(ArchiverTestCaseBase):
                     break
             else:
                 self.fail('should not happen')
-            repository.commit()
+            repository.commit(compact=False)
         self.cmd('check', self.repository_location, exit_code=1)
         output = self.cmd('check', '--repair', self.repository_location, exit_code=0)
         self.assert_in('New missing file chunk detected', output)
@@ -3013,7 +3013,7 @@ class ArchiverCheckTestCase(ArchiverTestCaseBase):
         archive, repository = self.open_archive('archive1')
         with repository:
             repository.delete(archive.metadata.items[0])
-            repository.commit()
+            repository.commit(compact=False)
         self.cmd('check', self.repository_location, exit_code=1)
         self.cmd('check', '--repair', self.repository_location, exit_code=0)
         self.cmd('check', self.repository_location, exit_code=0)
@@ -3022,7 +3022,7 @@ class ArchiverCheckTestCase(ArchiverTestCaseBase):
         archive, repository = self.open_archive('archive1')
         with repository:
             repository.delete(archive.id)
-            repository.commit()
+            repository.commit(compact=False)
         self.cmd('check', self.repository_location, exit_code=1)
         self.cmd('check', '--repair', self.repository_location, exit_code=0)
         self.cmd('check', self.repository_location, exit_code=0)
@@ -3031,7 +3031,7 @@ class ArchiverCheckTestCase(ArchiverTestCaseBase):
         archive, repository = self.open_archive('archive1')
         with repository:
             repository.delete(Manifest.MANIFEST_ID)
-            repository.commit()
+            repository.commit(compact=False)
         self.cmd('check', self.repository_location, exit_code=1)
         output = self.cmd('check', '-v', '--repair', self.repository_location, exit_code=0)
         self.assert_in('archive1', output)
@@ -3044,7 +3044,7 @@ class ArchiverCheckTestCase(ArchiverTestCaseBase):
             manifest = repository.get(Manifest.MANIFEST_ID)
             corrupted_manifest = manifest + b'corrupted!'
             repository.put(Manifest.MANIFEST_ID, corrupted_manifest)
-            repository.commit()
+            repository.commit(compact=False)
         self.cmd('check', self.repository_location, exit_code=1)
         output = self.cmd('check', '-v', '--repair', self.repository_location, exit_code=0)
         self.assert_in('archive1', output)
@@ -3061,7 +3061,7 @@ class ArchiverCheckTestCase(ArchiverTestCaseBase):
             chunk = repository.get(archive.id)
             corrupted_chunk = chunk + b'corrupted!'
             repository.put(archive.id, corrupted_chunk)
-            repository.commit()
+            repository.commit(compact=False)
         self.cmd('check', self.repository_location, exit_code=1)
         output = self.cmd('check', '-v', '--repair', self.repository_location, exit_code=0)
         self.assert_in('archive2', output)
@@ -3086,7 +3086,7 @@ class ArchiverCheckTestCase(ArchiverTestCaseBase):
             })
             archive_id = key.id_hash(archive)
             repository.put(archive_id, key.encrypt(archive))
-            repository.commit()
+            repository.commit(compact=False)
         self.cmd('check', self.repository_location, exit_code=1)
         self.cmd('check', '--repair', self.repository_location, exit_code=0)
         output = self.cmd('list', self.repository_location)
@@ -3098,7 +3098,7 @@ class ArchiverCheckTestCase(ArchiverTestCaseBase):
         self.cmd('check', self.repository_location, exit_code=0)
         with Repository(self.repository_location, exclusive=True) as repository:
             repository.put(b'01234567890123456789012345678901', b'xxxx')
-            repository.commit()
+            repository.commit(compact=False)
         self.cmd('check', self.repository_location, exit_code=1)
         self.cmd('check', self.repository_location, exit_code=1)
         self.cmd('check', '--repair', self.repository_location, exit_code=0)
@@ -3117,7 +3117,7 @@ class ArchiverCheckTestCase(ArchiverTestCaseBase):
                     data = repository.get(chunk.id) + b'1234'
                     repository.put(chunk.id, data)
                     break
-            repository.commit()
+            repository.commit(compact=False)
         self.cmd('check', self.repository_location, exit_code=0)
         output = self.cmd('check', '--verify-data', self.repository_location, exit_code=1)
         assert bin_to_hex(chunk.id) + ', integrity error' in output
@@ -3136,7 +3136,7 @@ class ArchiverCheckTestCase(ArchiverTestCaseBase):
         with Repository(self.repository_location, exclusive=True) as repository:
             for id_ in repository.list():
                 repository.delete(id_)
-            repository.commit()
+            repository.commit(compact=False)
         self.cmd('check', self.repository_location, exit_code=1)
 
     def test_attic013_acl_bug(self):
@@ -3179,7 +3179,7 @@ class ManifestAuthenticationTest(ArchiverTestCaseBase):
                 'config': {},
                 'timestamp': (datetime.utcnow() + timedelta(days=1)).strftime(ISO_FORMAT),
             })))
-            repository.commit()
+            repository.commit(compact=False)
 
     def test_fresh_init_tam_required(self):
         self.cmd('init', '--encryption=repokey', self.repository_location)
@@ -3191,7 +3191,7 @@ class ManifestAuthenticationTest(ArchiverTestCaseBase):
                 'archives': {},
                 'timestamp': (datetime.utcnow() + timedelta(days=1)).strftime(ISO_FORMAT),
             })))
-            repository.commit()
+            repository.commit(compact=False)
 
         with pytest.raises(TAMRequiredError):
             self.cmd('list', self.repository_location)
@@ -3209,7 +3209,7 @@ class ManifestAuthenticationTest(ArchiverTestCaseBase):
             manifest = msgpack.unpackb(key.decrypt(None, repository.get(Manifest.MANIFEST_ID)))
             del manifest[b'tam']
             repository.put(Manifest.MANIFEST_ID, key.encrypt(msgpack.packb(manifest)))
-            repository.commit()
+            repository.commit(compact=False)
         output = self.cmd('list', '--debug', self.repository_location)
         assert 'archive1234' in output
         assert 'TAM not found and not required' in output
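
Reviewer's note (illustrative, not part of the patch): the hunks above switch the test commits to `commit(compact=False)`, so a commit no longer implies compaction. The toy model below is a minimal sketch of that split under stated assumptions. `ToySegmentStore` and all of its methods are hypothetical and are not borg's `Repository` API; segments are plain lists of log entries, and a segment counts as sparse as soon as one of its PUTs is superseded or it holds a DELETE.

```python
class ToySegmentStore:
    """Hypothetical append-only log store. NOT borg's Repository class."""

    def __init__(self):
        self.segments = {}     # segment number -> list of (tag, key, value)
        self.index = {}        # key -> segment number holding the live PUT
        self.sparse = set()    # segments that contain dead entries
        self.next_segment = 0
        self._open = None      # segment currently being written, if any

    def _segment(self):
        # Lazily start a new segment for the first write after a commit.
        if self._open is None:
            self._open = self.next_segment
            self.next_segment += 1
            self.segments[self._open] = []
        return self._open

    def put(self, key, value):
        if key in self.index:
            self.sparse.add(self.index[key])  # the old PUT is now dead weight
        seg = self._segment()
        self.segments[seg].append(('PUT', key, value))
        self.index[key] = seg

    def delete(self, key):
        # Raises KeyError for unknown keys; good enough for a sketch.
        self.sparse.add(self.index.pop(key))
        seg = self._segment()
        self.segments[seg].append(('DELETE', key, None))
        self.sparse.add(seg)  # the DELETE itself can go once compaction runs

    def get(self, key):
        seg = self.index[key]
        for tag, k, value in reversed(self.segments[seg]):
            if tag == 'PUT' and k == key:
                return value

    def commit(self, *, compact):
        self._open = None  # close the segment being written
        if compact:
            self.compact_segments()

    def compact_segments(self):
        # Collect still-needed PUTs from all sparse segments, drop those
        # segments, then rewrite the live data into one new segment.
        live = {}
        for seg in sorted(self.sparse):
            for tag, key, value in self.segments.pop(seg):
                if tag == 'PUT' and self.index.get(key) == seg:
                    live[key] = value
        self.sparse.clear()
        if live:
            new_seg = self.next_segment
            self.next_segment += 1
            self.segments[new_seg] = [('PUT', k, v) for k, v in live.items()]
            for k in live:
                self.index[k] = new_seg


store = ToySegmentStore()
store.put('a', b'foo')
store.put('b', b'123456789')
store.commit(compact=False)
store.put('b', b'bar')                     # supersedes the PUT in segment 0
store.commit(compact=False)                # nothing is reclaimed yet
segments_before = sorted(store.segments)
store.commit(compact=True)                 # explicit compaction run
segments_after = sorted(store.segments)
```

In this sketch the sparse segment 0 survives both `compact=False` commits and is only rewritten by the final `compact=True` commit, which mirrors why the tests that check compaction behaviour now request it explicitly.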

+ 50 - 48
src/borg/testsuite/repository.py

@@ -50,7 +50,7 @@ class RepositoryTestCaseBase(BaseTestCase):
         self.repository.put(H(0), b'foo')
         self.repository.put(H(1), b'bar')
         self.repository.put(H(3), b'bar')
-        self.repository.commit()
+        self.repository.commit(compact=False)
         self.repository.put(H(1), b'bar2')
         self.repository.put(H(2), b'boo')
         self.repository.delete(H(3))
@@ -65,7 +65,7 @@ class RepositoryTestCase(RepositoryTestCaseBase):
         self.assert_equal(self.repository.get(key50), b'SOMEDATA')
         self.repository.delete(key50)
         self.assert_raises(Repository.ObjectNotFound, lambda: self.repository.get(key50))
-        self.repository.commit()
+        self.repository.commit(compact=False)
         self.repository.close()
         with self.open() as repository2:
             self.assert_raises(Repository.ObjectNotFound, lambda: repository2.get(key50))
@@ -79,10 +79,10 @@ class RepositoryTestCase(RepositoryTestCaseBase):
         """
         self.repository.put(H(0), b'foo')
         self.repository.put(H(1), b'foo')
-        self.repository.commit()
+        self.repository.commit(compact=False)
         self.repository.delete(H(0))
         self.repository.put(H(1), b'bar')
-        self.repository.commit()
+        self.repository.commit(compact=False)
         self.assert_equal(self.repository.get(H(1)), b'bar')
 
     def test_consistency(self):
@@ -102,7 +102,7 @@ class RepositoryTestCase(RepositoryTestCaseBase):
         """
         self.repository.put(H(0), b'foo')
         self.assert_equal(self.repository.get(H(0)), b'foo')
-        self.repository.commit()
+        self.repository.commit(compact=False)
         self.repository.put(H(0), b'foo2')
         self.assert_equal(self.repository.get(H(0)), b'foo2')
         self.repository.rollback()
@@ -113,29 +113,29 @@ class RepositoryTestCase(RepositoryTestCaseBase):
         """
         self.repository.put(H(0), b'foo')
         self.repository.put(H(0), b'foo2')
-        self.repository.commit()
+        self.repository.commit(compact=False)
         self.assert_equal(self.repository.get(H(0)), b'foo2')
 
     def test_single_kind_transactions(self):
         # put
         self.repository.put(H(0), b'foo')
-        self.repository.commit()
+        self.repository.commit(compact=False)
         self.repository.close()
         # replace
         self.repository = self.open()
         with self.repository:
             self.repository.put(H(0), b'bar')
-            self.repository.commit()
+            self.repository.commit(compact=False)
         # delete
         self.repository = self.open()
         with self.repository:
             self.repository.delete(H(0))
-            self.repository.commit()
+            self.repository.commit(compact=False)
 
     def test_list(self):
         for x in range(100):
             self.repository.put(H(x), b'SOMEDATA')
-        self.repository.commit()
+        self.repository.commit(compact=False)
         all = self.repository.list()
         self.assert_equal(len(all), 100)
         first_half = self.repository.list(limit=50)
@@ -149,7 +149,7 @@ class RepositoryTestCase(RepositoryTestCaseBase):
     def test_scan(self):
         for x in range(100):
             self.repository.put(H(x), b'SOMEDATA')
-        self.repository.commit()
+        self.repository.commit(compact=False)
         all = self.repository.scan()
         assert len(all) == 100
         first_half = self.repository.scan(limit=50)
@@ -177,6 +177,8 @@ class LocalRepositoryTestCase(RepositoryTestCaseBase):
     def _assert_sparse(self):
         # The superseded 123456... PUT
         assert self.repository.compact[0] == 41 + 9
+        # a COMMIT
+        assert self.repository.compact[1] == 9
         # The DELETE issued by the superseding PUT (or issued directly)
         assert self.repository.compact[2] == 41
         self.repository._rebuild_sparse(0)
@@ -185,14 +187,14 @@ class LocalRepositoryTestCase(RepositoryTestCaseBase):
     def test_sparse1(self):
         self.repository.put(H(0), b'foo')
         self.repository.put(H(1), b'123456789')
-        self.repository.commit()
+        self.repository.commit(compact=False)
         self.repository.put(H(1), b'bar')
         self._assert_sparse()
 
     def test_sparse2(self):
         self.repository.put(H(0), b'foo')
         self.repository.put(H(1), b'123456789')
-        self.repository.commit()
+        self.repository.commit(compact=False)
         self.repository.delete(H(1))
         self._assert_sparse()
 
@@ -207,14 +209,14 @@ class LocalRepositoryTestCase(RepositoryTestCaseBase):
         # ...while _rebuild_sparse can mark whole segments as completely sparse (which then includes the segment magic)
         assert self.repository.compact[0] == 41 + 41 + 4 + len(MAGIC)
 
-        self.repository.commit()
+        self.repository.commit(compact=True)
         assert 0 not in [segment for segment, _ in self.repository.io.segment_iterator()]
 
     def test_uncommitted_garbage(self):
         # uncommitted garbage should be no problem, it is cleaned up automatically.
         # we just have to be careful with invalidation of cached FDs in LoggedIO.
         self.repository.put(H(0), b'foo')
-        self.repository.commit()
+        self.repository.commit(compact=False)
         # write some crap to a uncommitted segment file
         last_segment = self.repository.io.get_latest_segment()
         with open(self.repository.io.segment_filename(last_segment + 1), 'wb') as f:
@@ -224,7 +226,7 @@ class LocalRepositoryTestCase(RepositoryTestCaseBase):
         self.repository = self.open()
         with self.repository:
             self.repository.put(H(0), b'bar')  # this may trigger compact_segments()
-            self.repository.commit()
+            self.repository.commit(compact=True)
         # the point here is that nothing blows up with an exception.
 
 
@@ -244,7 +246,7 @@ class RepositoryCommitTestCase(RepositoryTestCaseBase):
         self.add_keys()
         self.repository.compact_segments = None
         try:
-            self.repository.commit()
+            self.repository.commit(compact=True)
         except TypeError:
             pass
         self.reopen()
@@ -256,7 +258,7 @@ class RepositoryCommitTestCase(RepositoryTestCaseBase):
         self.add_keys()
         self.repository.write_index = None
         try:
-            self.repository.commit()
+            self.repository.commit(compact=False)
         except TypeError:
             pass
         self.reopen()
@@ -294,7 +296,7 @@ class RepositoryCommitTestCase(RepositoryTestCaseBase):
         self.add_keys()
         self.repository.io.delete_segment = None
         try:
-            self.repository.commit()
+            self.repository.commit(compact=False)
         except TypeError:
             pass
         self.reopen()
@@ -313,9 +315,9 @@ class RepositoryCommitTestCase(RepositoryTestCaseBase):
     def test_moved_deletes_are_tracked(self):
         self.repository.put(H(1), b'1')
         self.repository.put(H(2), b'2')
-        self.repository.commit()
+        self.repository.commit(compact=False)
         self.repository.delete(H(1))
-        self.repository.commit()
+        self.repository.commit(compact=True)
         last_segment = self.repository.io.get_latest_segment() - 1
         num_deletes = 0
         for tag, key, offset, size in self.repository.io.iter_objects(last_segment):
@@ -325,7 +327,7 @@ class RepositoryCommitTestCase(RepositoryTestCaseBase):
         assert num_deletes == 1
         assert last_segment in self.repository.compact
         self.repository.put(H(3), b'3')
-        self.repository.commit()
+        self.repository.commit(compact=True)
         assert last_segment not in self.repository.compact
         assert not self.repository.io.segment_exists(last_segment)
         for segment, _ in self.repository.io.segment_iterator():
@@ -337,7 +339,7 @@ class RepositoryCommitTestCase(RepositoryTestCaseBase):
         self.repository.put(H(1), b'1')
         # This is the segment with our original PUT of interest
         put_segment = get_latest_segment()
-        self.repository.commit()
+        self.repository.commit(compact=False)
 
         # We now delete H(1), and force this segment to not be compacted, which can happen
         # if it's not sparse enough (symbolized by H(2) here).
@@ -349,12 +351,12 @@ class RepositoryCommitTestCase(RepositoryTestCaseBase):
         del self.repository.compact[put_segment]
         del self.repository.compact[delete_segment]
 
-        self.repository.commit()
+        self.repository.commit(compact=True)
 
         # Now we perform an unrelated operation on the segment containing the DELETE,
         # causing it to be compacted.
         self.repository.delete(H(2))
-        self.repository.commit()
+        self.repository.commit(compact=True)
 
         assert self.repository.io.segment_exists(put_segment)
         assert not self.repository.io.segment_exists(delete_segment)
@@ -370,7 +372,7 @@ class RepositoryCommitTestCase(RepositoryTestCaseBase):
         self.repository.put(H(1), b'1')
         self.repository.delete(H(1))
         assert self.repository.shadow_index[H(1)] == [0]
-        self.repository.commit()
+        self.repository.commit(compact=True)
         # note how an empty list means that nothing is shadowed for sure
         assert self.repository.shadow_index[H(1)] == []
         self.repository.put(H(1), b'1')
@@ -397,21 +399,21 @@ class RepositoryAppendOnlyTestCase(RepositoryTestCaseBase):
         def segments_in_repository():
             return len(list(self.repository.io.segment_iterator()))
         self.repository.put(H(0), b'foo')
-        self.repository.commit()
+        self.repository.commit(compact=False)
 
         self.repository.append_only = False
         assert segments_in_repository() == 2
         self.repository.put(H(0), b'foo')
-        self.repository.commit()
+        self.repository.commit(compact=True)
         # normal: compact squashes the data together, only one segment
-        assert segments_in_repository() == 4
+        assert segments_in_repository() == 2
 
         self.repository.append_only = True
-        assert segments_in_repository() == 4
+        assert segments_in_repository() == 2
         self.repository.put(H(0), b'foo')
-        self.repository.commit()
+        self.repository.commit(compact=False)
         # append only: does not compact, only new segments written
-        assert segments_in_repository() == 6
+        assert segments_in_repository() == 4
 
 
 class RepositoryFreeSpaceTestCase(RepositoryTestCaseBase):
@@ -424,7 +426,7 @@ class RepositoryFreeSpaceTestCase(RepositoryTestCaseBase):
         with self.repository:
             self.repository.put(H(0), b'foobar')
             with pytest.raises(Repository.InsufficientFreeSpaceError):
-                self.repository.commit()
+                self.repository.commit(compact=False)
         assert os.path.exists(self.repository.path)
 
     def test_create_free_space(self):
@@ -443,7 +445,7 @@ class QuotaTestCase(RepositoryTestCaseBase):
         assert self.repository.storage_quota_use == 1234 + 5678 + 2 * 41
         self.repository.delete(H(1))
         assert self.repository.storage_quota_use == 5678 + 41
-        self.repository.commit()
+        self.repository.commit(compact=False)
         self.reopen()
         with self.repository:
             # Open new transaction; hints and thus quota data is not loaded unless needed.
@@ -456,12 +458,12 @@ class QuotaTestCase(RepositoryTestCaseBase):
         self.repository.storage_quota = 50
         self.repository.put(H(1), b'')
         assert self.repository.storage_quota_use == 41
-        self.repository.commit()
+        self.repository.commit(compact=False)
         with pytest.raises(Repository.StorageQuotaExceeded):
             self.repository.put(H(2), b'')
         assert self.repository.storage_quota_use == 82
         with pytest.raises(Repository.StorageQuotaExceeded):
-            self.repository.commit()
+            self.repository.commit(compact=False)
         assert self.repository.storage_quota_use == 82
         self.reopen()
         with self.repository:
@@ -517,13 +519,13 @@ class RepositoryAuxiliaryCorruptionTestCase(RepositoryTestCaseBase):
     def setUp(self):
         super().setUp()
         self.repository.put(H(0), b'foo')
-        self.repository.commit()
+        self.repository.commit(compact=False)
         self.repository.close()
 
     def do_commit(self):
         with self.repository:
             self.repository.put(H(0), b'fox')
-            self.repository.commit()
+            self.repository.commit(compact=False)
 
     def test_corrupted_hints(self):
         with open(os.path.join(self.repository.path, 'hints.1'), 'ab') as fd:
@@ -620,9 +622,9 @@ class RepositoryAuxiliaryCorruptionTestCase(RepositoryTestCaseBase):
             assert self.repository.get(H(0)) == b'foo'
             self.repository.put(H(1), b'bar')
             self.repository.put(H(2), b'baz')
-            self.repository.commit()
+            self.repository.commit(compact=False)
             self.repository.put(H(2), b'bazz')
-            self.repository.commit()
+            self.repository.commit(compact=False)
 
         hints_path = os.path.join(self.repository.path, 'hints.5')
         with open(hints_path, 'r+b') as fd:
@@ -640,7 +642,7 @@ class RepositoryAuxiliaryCorruptionTestCase(RepositoryTestCaseBase):
             self.repository.append_only = False
             self.repository.put(H(3), b'1234')
             # Do a compaction run. Succeeds, since the failed checksum prompted a rebuild of the index+hints.
-            self.repository.commit()
+            self.repository.commit(compact=True)
 
             assert len(self.repository) == 4
             assert self.repository.get(H(0)) == b'foo'
@@ -656,7 +658,7 @@ class RepositoryAuxiliaryCorruptionTestCase(RepositoryTestCaseBase):
             self.repository.put(H(3), b'1234')
             # Do a compaction run. Fails, since the corrupted refcount was not detected and leads to an assertion failure.
             with pytest.raises(AssertionError) as exc_info:
-                self.repository.commit()
+                self.repository.commit(compact=True)
             assert 'Corrupted segment reference count' in str(exc_info.value)
 
 
@@ -678,7 +680,7 @@ class RepositoryCheckTestCase(RepositoryTestCaseBase):
         for ids in segments:
             for id_ in ids:
                 self.repository.put(H(id_), b'data')
-            self.repository.commit()
+            self.repository.commit(compact=False)
 
     def get_head(self):
         return sorted(int(n) for n in os.listdir(os.path.join(self.tmppath, 'repository', 'data', '0')) if n.isdigit())[-1]
@@ -757,7 +759,7 @@ class RepositoryCheckTestCase(RepositoryTestCaseBase):
         self.check(status=False)
         self.assert_equal(self.list_indices(), ['index.1'])
         self.check(repair=True, status=True)
-        self.assert_equal(self.list_indices(), ['index.3'])
+        self.assert_equal(self.list_indices(), ['index.2'])
         self.check(status=True)
         self.get_objects(3)
         self.assert_equal(set([1, 2, 3]), self.list_objects())
@@ -783,7 +785,7 @@ class RepositoryCheckTestCase(RepositoryTestCaseBase):
         self.repository.put(H(0), b'data2')
         # Simulate a crash before compact
         with patch.object(Repository, 'compact_segments') as compact:
-            self.repository.commit()
+            self.repository.commit(compact=True)
             compact.assert_called_once_with()
         self.reopen()
         with self.repository:
@@ -903,18 +905,18 @@ class RemoteLegacyFree(RepositoryTestCaseBase):
     def test_legacy_free(self):
         # put
         self.repository.put(H(0), b'foo')
-        self.repository.commit()
+        self.repository.commit(compact=False)
         self.repository.close()
         # replace
         self.repository = self.open()
         with self.repository:
             self.repository.put(H(0), b'bar')
-            self.repository.commit()
+            self.repository.commit(compact=False)
         # delete
         self.repository = self.open()
         with self.repository:
             self.repository.delete(H(0))
-            self.repository.commit()
+            self.repository.commit(compact=False)
 
 
 class RemoteRepositoryCheckTestCase(RepositoryCheckTestCase):