
docs: updates / removing outdated stuff

Thomas Waldmann 1 year ago
parent
commit
ace97fadec

+ 1 - 1
README.rst

@@ -69,7 +69,7 @@ Main features
 **Speed**
   * performance-critical code (chunking, compression, encryption) is
     implemented in C/Cython
-  * local caching of files/chunks index data
+  * local caching
   * quick detection of unmodified files
 
 **Data encryption**

+ 0 - 2
docs/deployment/hosting-repositories.rst

@@ -68,8 +68,6 @@ can be filled to the specified quota.
 If storage quotas are used, ensure that all deployed Borg releases
 support storage quotas.
 
-Refer to :ref:`internals_storage_quota` for more details on storage quotas.
-
 **Specificities: Append-only repositories**
 
 Running ``borg init`` via a ``borg serve --append-only`` server will **not**

+ 19 - 71
docs/faq.rst

@@ -14,7 +14,7 @@ What is the difference between a repo on an external hard drive vs. repo on a se
 If Borg is running in client/server mode, the client uses SSH as a transport to
 talk to the remote agent, which is another Borg process (Borg is installed on
 the server, too) started automatically by the client. The Borg server is doing
-storage-related low-level repo operations (get, put, commit, check, compact),
+storage-related low-level repo operations (list, load and store objects),
 while the Borg client does the high-level stuff: deduplication, encryption,
 compression, dealing with archives, backups, restores, etc., which reduces the
 amount of data that goes over the network.
@@ -27,17 +27,7 @@ which is slower.
 Can I back up from multiple servers into a single repository?
 -------------------------------------------------------------
 
-In order for the deduplication used by Borg to work, it
-needs to keep a local cache containing checksums of all file
-chunks already stored in the repository. This cache is stored in
-``~/.cache/borg/``.  If Borg detects that a repository has been
-modified since the local cache was updated it will need to rebuild
-the cache. This rebuild can be quite time consuming.
-
-So, yes it's possible. But it will be most efficient if a single
-repository is only modified from one place. Also keep in mind that
-Borg will keep an exclusive lock on the repository while creating
-or deleting archives, which may make *simultaneous* backups fail.
+Yes, you can! Even simultaneously.
 
 Can I back up to multiple, swapped backup targets?
 --------------------------------------------------
@@ -131,13 +121,13 @@ If a backup stops mid-way, does the already-backed-up data stay there?
 
 Yes, the data transferred into the repo stays there - just avoid running
 ``borg compact`` before you completed the backup, because that would remove
-unused chunks.
+chunks that were already transferred to the repo, but not (yet) referenced
+by an archive.
 
 If a backup was interrupted, you normally do not need to do anything special,
-just invoke ``borg create`` as you always do. If the repository is still locked,
-you may need to run ``borg break-lock`` before the next backup. You may use the
-same archive name as in previous attempt or a different one (e.g. if you always
-include the current datetime), it does not matter.
+just invoke ``borg create`` as you always do. You may use the same archive name
+as in the previous attempt or a different one (e.g. if you always include the
+current datetime); it does not matter.
 
 Borg always does full single-pass backups, so it will start again
 from the beginning - but it will be much faster, because some of the data was
@@ -201,23 +191,6 @@ Yes, if you want to detect accidental data damage (like bit rot), use the
 If you want to be able to detect malicious tampering also, use an encrypted
 repo. It will then be able to check using CRCs and HMACs.
 
-Can I use Borg on SMR hard drives?
-----------------------------------
-
-SMR (shingled magnetic recording) hard drives are very different from
-regular hard drives. Applications have to behave in certain ways or
-performance will be heavily degraded.
-
-Borg ships with default settings suitable for SMR drives,
-and has been successfully tested on *Seagate Archive v2* drives
-using the ext4 file system.
-
-Some Linux kernel versions between 3.19 and 4.5 had various bugs
-handling device-managed SMR drives, leading to IO errors, unresponsive
-drives and unreliable operation in general.
-
-For more details, refer to :issue:`2252`.
-
 .. _faq-integrityerror:
 
 I get an IntegrityError or similar - what now?
@@ -336,7 +309,7 @@ Why is the time elapsed in the archive stats different from wall clock time?
 ----------------------------------------------------------------------------
 
 Borg needs to write the time elapsed into the archive metadata before finalizing
-the archive and committing the repo & cache.
+the archive and saving the files cache.
 This means when Borg is run with e.g. the ``time`` command, the duration shown
 in the archive stats may be shorter than the full time the command runs for.
 
@@ -372,8 +345,7 @@ will of course delete everything in the archive, not only some files.
 :ref:`borg_recreate` command to rewrite all archives with a different
 ``--exclude`` pattern. See the examples in the manpage for more information.
 
-Finally, run :ref:`borg_compact` with the ``--threshold 0`` option to delete the
-data chunks from the repository.
+Finally, run :ref:`borg_compact` to delete the data chunks from the repository.
 
 Can I safely change the compression level or algorithm?
 --------------------------------------------------------
@@ -383,6 +355,7 @@ are calculated *before* compression. New compression settings
 will only be applied to new chunks, not existing chunks. So it's safe
 to change them.
 
+Use ``borg rcompress`` to efficiently recompress a complete repository.
 
 Security
 ########
@@ -728,7 +701,7 @@ This can make creation of the first archive slower, but saves time
 and disk space on subsequent runs. Here is what Borg does when you run ``borg create``:
 
 - Borg chunks the file (using the relatively expensive buzhash algorithm)
-- It then computes the "id" of the chunk (hmac-sha256 (often slow, except
+- It then computes the "id" of the chunk (hmac-sha256 (slow, except
   if your CPU has sha256 acceleration) or blake2b (fast, in software))
 - Then it checks whether this chunk is already in the repo (local hashtable lookup,
   fast). If so, the processing of the chunk is completed here. Otherwise it needs to
@@ -739,9 +712,8 @@ and disk space on subsequent runs. Here what Borg does when you run ``borg creat
 - Transmits to repo. If the repo is remote, this usually involves an SSH connection
   (does its own encryption / authentication).
 - Stores the chunk into a key/value store (the key is the chunk id, the value
-  is the data). While doing that, it computes CRC32 / XXH64 of the data (repo low-level
-  checksum, used by borg check --repository) and also updates the repo index
-  (another hashtable).
+  is the data). While doing that, it computes XXH64 of the data (repo low-level
+  checksum, used by borg check --repository).
 
 Subsequent backups are usually very fast if most files are unchanged and only
 a few are new or modified. The high performance on unchanged files primarily depends
@@ -969,6 +941,12 @@ To achieve this, run ``borg create`` within the mountpoint/snapshot directory:
     cd /mnt/rootfs
     borg create rootfs_backup .
 
+Another way (without changing the directory) is to use the slashdot hack:
+
+::
+
+    borg create rootfs_backup /mnt/rootfs/./
+
 
 I am having troubles with some network/FUSE/special filesystem, why?
 --------------------------------------------------------------------
@@ -1048,16 +1026,6 @@ to make it behave correctly::
 .. _workaround: https://unix.stackexchange.com/a/123236
 
 
-Can I disable checking for free disk space?
--------------------------------------------
-
-In some cases, the free disk space of the target volume is reported incorrectly.
-This can happen for CIFS- or FUSE shares. If you are sure that your target volume
-will always have enough disk space, you can use the following workaround to disable
-checking for free disk space::
-
-    borg config -- additional_free_space -2T
-
 How do I rename a repository?
 -----------------------------
 
@@ -1074,26 +1042,6 @@ It may be useful to set ``BORG_RELOCATED_REPO_ACCESS_IS_OK=yes`` to avoid the
 prompts when renaming multiple repositories or in a non-interactive context
 such as a script. See :doc:`deployment` for an example.
 
-The repository quota size is reached, what can I do?
-----------------------------------------------------
-
-The simplest solution is to increase or disable the quota and resume the backup:
-
-::
-
-    borg config /path/to/repo storage_quota 0
-
-If you are bound to the quota, you have to free repository space. The first to
-try is running :ref:`borg_compact` to free unused backup space (see also
-:ref:`separate_compaction`):
-
-::
-
-    borg compact /path/to/repo
-
-If your repository is already compacted, run :ref:`borg_prune` or
-:ref:`borg_delete` to delete archives that you do not need anymore, and then run
-``borg compact`` again.
 
 My backup disk is full, what can I do?
 --------------------------------------

BIN
docs/internals/compaction.odg


BIN
docs/internals/compaction.png


+ 140 - 361
docs/internals/data-structures.rst

@@ -19,63 +19,51 @@ discussion about internals`_ and also on static code analysis.
 Repository
 ----------
 
-.. Some parts of this description were taken from the Repository docstring
-
-Borg stores its data in a `Repository`, which is a file system based
-transactional key-value store. Thus the repository does not know about
-the concept of archives or items.
-
-Each repository has the following file structure:
-
-README
-  simple text file telling that this is a Borg repository
-
-config
-  repository configuration
+Borg stores its data in a `Repository`, which is a key-value store and has
+the following structure:
+
+config/
+  readme
+    simple text object telling that this is a Borg repository
+  id
+    the unique repository ID encoded as hexadecimal number text
+  version
+    the repository version encoded as decimal number text
+  manifest
+    some data about the repository, binary
+  last-key-checked
+    repository check progress (partial checks, full checks' checkpointing),
+    path of last object checked as text
+  space-reserve.N
+    purely random binary data to reserve space, e.g. for disk-full emergencies
+
+There is a list of pointers to archive objects in this directory:
+
+archives/
+  0000... .. ffff...
+
+The actual data is stored in a nested directory structure, using the full
+object ID as name. Each (encrypted and compressed) object is stored separately.
 
 data/
-  directory where the actual data is stored
-
-hints.%d
-  hints for repository compaction
-
-index.%d
-  repository index
-
-lock.roster and lock.exclusive/*
-  used by the locking system to manage shared and exclusive locks
-
-Transactionality is achieved by using a log (aka journal) to record changes. The log is a series of numbered files
-called segments_. Each segment is a series of log entries. The segment number together with the offset of each
-entry relative to its segment start establishes an ordering of the log entries. This is the "definition" of
-time for the purposes of the log.
-
-.. _config-file:
-
-Config file
-~~~~~~~~~~~
+  00/ .. ff/
+    00/ .. ff/
+      0000... .. ffff...
 
-Each repository has a ``config`` file which is a ``INI``-style file
-and looks like this::
+keys/
+  repokey
+    When using encryption in repokey mode, the encrypted, passphrase-protected
+    key is stored here as base64-encoded text.
 
-    [repository]
-    version = 2
-    segments_per_dir = 1000
-    max_segment_size = 524288000
-    id = 57d6c1d52ce76a836b532b0e42e677dec6af9fca3673db511279358828a21ed6
+locks/
+  used by the locking system to manage shared and exclusive locks.
 
-This is where the ``repository.id`` is stored. It is a unique
-identifier for repositories. It will not change if you move the
-repository around so you can make a local transfer then decide to move
-the repository to another (even remote) location at a later time.
 
 Keys
 ~~~~
 
-Repository keys are byte-strings of fixed length (32 bytes), they
-don't have a particular meaning (except for the Manifest_).
-
-Normally the keys are computed like this::
+Repository object IDs (which are used as keys in the key-value store) are
+byte-strings of fixed length (256 bits, 32 bytes), computed like this::
 
   key = id = id_hash(plaintext_data)  # plain = not encrypted, not compressed, not obfuscated
 
@@ -84,247 +72,68 @@ The id_hash function depends on the :ref:`encryption mode <borg_rcreate>`.
 As the id / key is used for deduplication, id_hash must be a cryptographically
 strong hash or MAC.
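
A minimal sketch of the two id_hash flavours mentioned in the FAQ hunk (hmac-sha256 and blake2b), using only the Python standard library. The function names and the ``id_key`` parameter are hypothetical; how Borg derives and applies its id key is not shown in this diff, so treat this as an illustration of "a keyed, cryptographically strong 32-byte hash", not as Borg's implementation.

```python
import hashlib
import hmac

ID_SIZE = 32  # 256 bits, the fixed object ID length stated above


def id_hash_hmac_sha256(id_key: bytes, plaintext: bytes) -> bytes:
    """Keyed ID via HMAC-SHA256 (hypothetical helper, not Borg's API)."""
    return hmac.new(id_key, plaintext, hashlib.sha256).digest()


def id_hash_blake2b(id_key: bytes, plaintext: bytes) -> bytes:
    """Keyed ID via BLAKE2b truncated to 256 bits (hypothetical helper)."""
    # hashlib.blake2b accepts a key of up to 64 bytes.
    return hashlib.blake2b(plaintext, key=id_key, digest_size=ID_SIZE).digest()


if __name__ == "__main__":
    key = bytes(32)  # all-zero demo key; a real key would be random
    # Identical plaintext yields an identical ID, which is what
    # makes deduplication by ID possible.
    print(id_hash_hmac_sha256(key, b"chunk").hex())
    print(id_hash_blake2b(key, b"chunk").hex())
```

Because the ID is computed over the plaintext (before compression and encryption), two identical chunks map to the same key regardless of the compression settings in use.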
 
-Segments
-~~~~~~~~
-
-Objects referenced by a key are stored inline in files (`segments`) of approx.
-500 MB size in numbered subdirectories of ``repo/data``. The number of segments
-per directory is controlled by the value of ``segments_per_dir``. If you change
-this value in a non-empty repository, you may also need to relocate the segment
-files manually.
-
-A segment starts with a magic number (``BORG_SEG`` as an eight byte ASCII string),
-followed by a number of log entries. Each log entry consists of (in this order):
-
-* crc32 checksum (uint32):
-  - for PUT2: CRC32(size + tag + key + digest)
-  - for PUT: CRC32(size + tag + key + payload)
-  - for DELETE: CRC32(size + tag + key)
-  - for COMMIT: CRC32(size + tag)
-* size (uint32) of the entry (including the whole header)
-* tag (uint8): PUT(0), DELETE(1), COMMIT(2) or PUT2(3)
-* key (256 bit) - only for PUT/PUT2/DELETE
-* payload (size - 41 bytes) - only for PUT
-* xxh64 digest (64 bit) = XXH64(size + tag + key + payload) - only for PUT2
-* payload (size - 41 - 8 bytes) - only for PUT2
-
-PUT2 is new since repository version 2. For new log entries PUT2 is used.
-PUT is still supported to read version 1 repositories, but not generated any more.
-If we talk about ``PUT`` in general, it shall usually mean PUT2 for repository
-version 2+.
+Repository objects
+~~~~~~~~~~~~~~~~~~
 
-Those files are strictly append-only and modified only once.
+Each repository object is stored separately, under its ID, in data/xx/yy/xxyy...
 
-When an object is written to the repository a ``PUT`` entry is written
-to the file containing the object id and payload. If an object is deleted
-a ``DELETE`` entry is appended with the object id.
+A repo object has a structure like this:
 
-A ``COMMIT`` tag is written when a repository transaction is
-committed. The segment number of the segment containing
-a commit is the **transaction ID**.
+* 32bit meta size
+* 32bit data size
+* 64bit xxh64(meta)
+* 64bit xxh64(data)
+* meta
+* data
 
-When a repository is opened any ``PUT`` or ``DELETE`` operations not
-followed by a ``COMMIT`` tag are discarded since they are part of a
-partial/uncommitted transaction.
+The size and xxh64 hashes can be used for server-side corruption checks without
+needing to decrypt anything (which would require the borg key).
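
The layout listed above can be sketched with ``struct``. Several details are assumptions not stated in this diff: little-endian byte order, the hashes being stored as raw 8-byte digests, and the helper names. The ``xxh64`` argument is a caller-supplied function (bytes to 8-byte digest), since xxHash is not part of the Python standard library.

```python
import struct

# meta size, data size, xxh64(meta), xxh64(data); byte order is assumed.
HEADER = struct.Struct("<II8s8s")


def object_path(object_id_hex: str) -> str:
    """data/xx/yy/<full id>, following the directory listing above."""
    return f"data/{object_id_hex[0:2]}/{object_id_hex[2:4]}/{object_id_hex}"


def pack_repo_object(meta: bytes, data: bytes, xxh64) -> bytes:
    """Build header + meta + data (hypothetical helper)."""
    return HEADER.pack(len(meta), len(data), xxh64(meta), xxh64(data)) + meta + data


def split_repo_object(blob: bytes, xxh64) -> tuple:
    """Check sizes and hashes without decrypting, then return (meta, data)."""
    meta_size, data_size, meta_hash, data_hash = HEADER.unpack_from(blob, 0)
    if len(blob) != HEADER.size + meta_size + data_size:
        raise ValueError("size mismatch")
    meta = blob[HEADER.size:HEADER.size + meta_size]
    data = blob[HEADER.size + meta_size:]
    if xxh64(meta) != meta_hash or xxh64(data) != data_hash:
        raise ValueError("checksum mismatch")
    return meta, data
```

Since the check only touches sizes and checksums of the stored bytes, it can run on the server side without access to the borg key.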
 
-The size of individual segments is limited to 4 GiB, since the offset of entries
-within segments is stored in a 32-bit unsigned integer in the repository index.
+The overall size of repository objects varies from very small (a small source
+file will be stored as a single repo object) to medium (big source files will
+be cut into medium sized chunks of some MB).
 
-Objects / Payload structure
-~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Metadata and data are separately encrypted and authenticated (depending on
+the user's choices).
 
-All data (the manifest, archives, archive item stream chunks and file data
-chunks) is compressed, optionally obfuscated and encrypted. This produces some
-additional metadata (size and compression information), which is separately
-serialized and also encrypted.
+See :ref:`data-encryption` for a graphic outlining the anatomy of the
+encryption.
 
-See :ref:`data-encryption` for a graphic outlining the anatomy of the encryption in Borg.
-What you see at the bottom there is done twice: once for the data and once for the metadata.
+Repo object metadata
+~~~~~~~~~~~~~~~~~~~~
 
-An object (the payload part of a segment file log entry) must be like:
+Metadata is a msgpacked (and encrypted/authenticated) dict with:
 
-- length of encrypted metadata (16bit unsigned int)
-- encrypted metadata (incl. encryption header), when decrypted:
+- ctype (compression type 0..255)
+- clevel (compression level 0..255)
+- csize (overall compressed (and maybe obfuscated) data size)
+- psize (only when obfuscated: payload size without the obfuscation trailer)
+- size (uncompressed size of the data)
 
-  - msgpacked dict with:
-
-    - ctype (compression type 0..255)
-    - clevel (compression level 0..255)
-    - csize (overall compressed (and maybe obfuscated) data size)
-    - psize (only when obfuscated: payload size without the obfuscation trailer)
-    - size (uncompressed size of the data)
-- encrypted data (incl. encryption header), when decrypted:
-
-  - compressed data (with an optional all-zero-bytes obfuscation trailer)
-
-This new, more complex repo v2 object format was implemented to be able to query the
-metadata efficiently without having to read, transfer and decrypt the (usually much bigger)
-data part.
-
-The metadata is encrypted not to disclose potentially sensitive information that could be
-used for e.g. fingerprinting attacks.
+Having this separately encrypted metadata makes it more efficient to query
+the metadata without having to read, transfer and decrypt the (usually much
+bigger) data part.
 
 The compression `ctype` and `clevel` is explained in :ref:`data-compression`.
 
 
-Index, hints and integrity
-~~~~~~~~~~~~~~~~~~~~~~~~~~
-
-The **repository index** is stored in ``index.<TRANSACTION_ID>`` and is used to
-determine an object's location in the repository. It is a HashIndex_,
-a hash table using open addressing.
-
-It maps object keys_ to:
-
-* segment number (unit32)
-* offset of the object's entry within the segment (uint32)
-* size of the payload, not including the entry header (uint32)
-* flags (uint32)
-
-The **hints file** is a msgpacked file named ``hints.<TRANSACTION_ID>``.
-It contains:
-
-* version
-* list of segments
-* compact
-* shadow_index
-* storage_quota_use
-
-The **integrity file** is a msgpacked file named ``integrity.<TRANSACTION_ID>``.
-It contains checksums of the index and hints files and is described in the
-:ref:`Checksumming data structures <integrity_repo>` section below.
-
-If the index or hints are corrupted, they are re-generated automatically.
-If they are outdated, segments are replayed from the index state to the currently
-committed transaction.
-
 Compaction
 ~~~~~~~~~~
 
-For a given key only the last entry regarding the key, which is called current (all other entries are called
-superseded), is relevant: If there is no entry or the last entry is a DELETE then the key does not exist.
-Otherwise the last PUT defines the value of the key.
-
-By superseding a PUT (with either another PUT or a DELETE) the log entry becomes obsolete. A segment containing
-such obsolete entries is called sparse, while a segment containing no such entries is called compact.
-
-Since writing a ``DELETE`` tag does not actually delete any data and
-thus does not free disk space any log-based data store will need a
-compaction strategy (somewhat analogous to a garbage collector).
-
-Borg uses a simple forward compacting algorithm, which avoids modifying existing segments.
-Compaction runs when a commit is issued with ``compact=True`` parameter, e.g.
-by the ``borg compact`` command (unless the :ref:`append_only_mode` is active).
-
-The compaction algorithm requires two inputs in addition to the segments themselves:
+``borg compact`` is used to free repository space. It will:
 
-(i) Which segments are sparse, to avoid scanning all segments (impractical).
-    Further, Borg uses a conditional compaction strategy: Only those
-    segments that exceed a threshold sparsity are compacted.
-
-    To implement the threshold condition efficiently, the sparsity has
-    to be stored as well. Therefore, Borg stores a mapping ``(segment
-    id,) -> (number of sparse bytes,)``.
-
-(ii) Each segment's reference count, which indicates how many live objects are in a segment.
-     This is not strictly required to perform the algorithm. Rather, it is used to validate
-     that a segment is unused before deleting it. If the algorithm is incorrect, or the reference
-     count was not accounted correctly, then an assertion failure occurs.
-
-These two pieces of information are stored in the hints file (`hints.N`)
-next to the index (`index.N`).
-
-Compaction may take some time if a repository has been kept in append-only mode
-or ``borg compact`` has not been used for a longer time, which both has caused
-the number of sparse segments to grow.
-
-Compaction processes sparse segments from oldest to newest; sparse segments
-which don't contain enough deleted data to justify compaction are skipped. This
-avoids doing e.g. 500 MB of writing current data to a new segment when only
-a couple kB were deleted in a segment.
-
-Segments that are compacted are read in entirety. Current entries are written to
-a new segment, while superseded entries are omitted. After each segment an intermediary
-commit is written to the new segment. Then, the old segment is deleted
-(asserting that the reference count diminished to zero), freeing disk space.
-
-A simplified example (excluding conditional compaction and with simpler
-commit logic) showing the principal operation of compaction:
-
-.. figure:: compaction.png
-    :figwidth: 100%
-    :width: 100%
+- list all object IDs present in the repository
+- read all archives and determine which object IDs are in use
+- remove all unused objects from the repository
+- inform / warn about anything remarkable it found:
 
-(The actual algorithm is more complex to avoid various consistency issues, refer to
-the ``borg.repository`` module for more comments and documentation on these issues.)
+  - warn about IDs used, but not present (data loss!)
+  - inform about IDs that reappeared that were previously lost
+- compute statistics about:
 
-.. _internals_storage_quota:
+  - compression and deduplication factors
+  - repository space usage and space freed
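
The first steps of that list amount to set arithmetic. A minimal sketch, assuming the object IDs present in the repository and the IDs referenced by each archive are already available as Python collections (``plan_compaction`` and both parameter names are hypothetical, not part of Borg):

```python
def plan_compaction(present_ids, archives):
    """Return which object IDs are unused and which are missing.

    present_ids: iterable of object IDs found in the repository.
    archives: mapping of archive name -> iterable of object IDs it references.
    """
    present = set(present_ids)
    used = set()
    for referenced in archives.values():
        used.update(referenced)
    return {
        "unused": present - used,   # safe to remove, frees space
        "missing": used - present,  # referenced but not present: data loss
    }


if __name__ == "__main__":
    plan = plan_compaction(
        present_ids=["a", "b", "c", "d"],
        archives={"monday": ["a", "b"], "tuesday": ["b", "e"]},
    )
    print(sorted(plan["unused"]), sorted(plan["missing"]))
```

This also illustrates why an interrupted backup should not be followed by ``borg compact``: chunks already uploaded but not yet referenced by any archive fall into the "unused" set.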
 
-Storage quotas
-~~~~~~~~~~~~~~
-
-Quotas are implemented at the Repository level. The active quota of a repository
-is determined by the ``storage_quota`` `config` entry or a run-time override (via :ref:`borg_serve`).
-The currently used quota is stored in the hints file. Operations (PUT and DELETE) during
-a transaction modify the currently used quota:
-
-- A PUT adds the size of the *log entry* to the quota,
-  i.e. the length of the data plus the 41 byte header.
-- A DELETE subtracts the size of the deleted log entry from the quota,
-  which includes the header.
-
-Thus, PUT and DELETE are symmetric and cancel each other out precisely.
-
-The quota does not track on-disk size overheads (due to conditional compaction
-or append-only mode). In normal operation the inclusion of the log entry headers
-in the quota act as a faithful proxy for index and hints overheads.
-
-By tracking effective content size, the client can *always* recover from a full quota
-by deleting archives. This would not be possible if the quota tracked on-disk size,
-since journaling DELETEs requires extra disk space before space is freed.
-Tracking effective size on the other hand accounts DELETEs immediately as freeing quota.
-
-.. rubric:: Enforcing the quota
-
-The storage quota is meant as a robust mechanism for service providers, therefore
-:ref:`borg_serve` has to enforce it without loopholes (e.g. modified clients).
-The following sections refer to using quotas on remotely accessed repositories.
-For local access, consider *client* and *serve* the same.
-Accordingly, quotas cannot be enforced with local access,
-since the quota can be changed in the repository config.
-
-The quota is enforcible only if *all* :ref:`borg_serve` versions
-accessible to clients support quotas (see next section). Further, quota is
-per repository. Therefore, ensure clients can only access a defined set of repositories
-with their quotas set, using ``--restrict-to-repository``.
-
-If the client exceeds the storage quota the ``StorageQuotaExceeded`` exception is
-raised. Normally a client could ignore such an exception and just send a ``commit()``
-command anyway, circumventing the quota. However, when ``StorageQuotaExceeded`` is raised,
-it is stored in the ``transaction_doomed`` attribute of the repository.
-If the transaction is doomed, then commit will re-raise this exception, aborting the commit.
-
-The transaction_doomed indicator is reset on a rollback (which erases the quota-exceeding
-state).
-
-.. rubric:: Compatibility with older servers and enabling quota after-the-fact
-
-If no quota data is stored in the hints file, Borg assumes zero quota is used.
-Thus, if a repository with an enabled quota is written to with an older ``borg serve``
-version that does not understand quotas, then the quota usage will be erased.
-
-The client version is irrelevant to the storage quota and has no part in it.
-The form of error messages due to exceeding quota varies with client versions.
-
-A similar situation arises when upgrading from a Borg release that did not have quotas.
-Borg will start tracking quota use from the time of the upgrade, starting at zero.
-
-If the quota shall be enforced accurately in these cases, either
-
-- delete the ``index.N`` and ``hints.N`` files, forcing Borg to rebuild both,
-  re-acquiring quota data in the process, or
-- edit the msgpacked ``hints.N`` file (not recommended and thus not
-  documented further).
 
 The object graph
 ----------------
@@ -344,10 +153,10 @@ More on how this helps security in :ref:`security_structural_auth`.
 The manifest
 ~~~~~~~~~~~~
 
-The manifest is the root of the object hierarchy. It references
-all archives in a repository, and thus all data in it.
-Since no object references it, it cannot be stored under its ID key.
-Instead, the manifest has a fixed all-zero key.
+Compared to borg 1.x:
+
+- the manifest moved from object ID 0 to config/manifest
+- the archives list has been moved from the manifest to archives/*
 
 The manifest is rewritten each time an archive is created, deleted,
 or modified. It looks like this:
@@ -523,17 +332,18 @@ these may/may not be implemented and purely serve as examples.
 Archives
 ~~~~~~~~
 
-Each archive is an object referenced by the manifest. The archive object
-itself does not store any of the data contained in the archive it describes.
+Each archive is an object referenced by an entry below archives/.
+The archive object itself does not store any of the data contained in the
+archive it describes.
 
 Instead, it contains a list of chunks which form a msgpacked stream of items_.
 The archive object itself further contains some metadata:
 
 * *version*
-* *name*, which might differ from the name set in the manifest.
+* *name*, which might differ from the name set in the archives/* object.
   When :ref:`borg_check` rebuilds the manifest (e.g. if it was corrupted) and finds
   more than one archive object with the same name, it adds a counter to the name
-  in the manifest, but leaves the *name* field of the archives as it was.
+  in archives/*, but leaves the *name* field of the archives as it was.
 * *item_ptrs*, a list of "pointer chunk" IDs.
   Each "pointer chunk" contains a list of chunk IDs of item metadata.
 * *command_line*, the command line which was used to create the archive
@@ -676,7 +486,7 @@ In memory, the files cache is a key -> value mapping (a Python *dict*) and conta
   - file size
   - file ctime_ns (or mtime_ns)
   - age (0 [newest], 1, 2, 3, ..., BORG_FILES_CACHE_TTL - 1)
-  - list of chunk ids representing the file's contents
+  - list of chunk (id, size) tuples representing the file's contents
 
 To determine whether a file has not changed, cached values are looked up via
 the key in the mapping and compared to the current file attribute values.
@@ -717,9 +527,9 @@ The on-disk format of the files cache is a stream of msgpacked tuples (key, valu
 Loading the files cache involves reading the file, one msgpack object at a time,
 unpacking it, and msgpacking the value (in an effort to save memory).
 
-The **chunks cache** is stored in ``cache/chunks`` and is used to determine
-whether we already have a specific chunk, to count references to it and also
-for statistics.
+The **chunks cache** is not persisted to disk, but dynamically built in memory
+by querying the existing object IDs from the repository.
+It is used to determine whether we already have a specific chunk.
 
 The chunks cache is a key -> value mapping and contains:
 
@@ -728,14 +538,10 @@ The chunks cache is a key -> value mapping and contains:
   - chunk id_hash
 * value:
 
-  - reference count
-  - size
+  - reference count (always MAX_VALUE as we do not refcount anymore)
+  - size (0 for previously existing objects, as we can't query their plaintext size)
 
-The chunks cache is a HashIndex_. Due to some restrictions of HashIndex,
-the reference count of each given chunk is limited to a constant, MAX_VALUE
-(introduced below in HashIndex_), approximately 2**32.
-If a reference count hits MAX_VALUE, decrementing it yields MAX_VALUE again,
-i.e. the reference count is pinned to MAX_VALUE.
+The chunks cache is a HashIndex_.
 
 .. _cache-memory-usage:
 
@@ -747,14 +553,12 @@ Here is the estimated memory usage of Borg - it's complicated::
   chunk_size ~= 2 ^ HASH_MASK_BITS  (for buzhash chunker, BLOCK_SIZE for fixed chunker)
   chunk_count ~= total_file_size / chunk_size
 
-  repo_index_usage = chunk_count * 48
-
   chunks_cache_usage = chunk_count * 40
 
-  files_cache_usage = total_file_count * 240 + chunk_count * 80
+  files_cache_usage = total_file_count * 240 + chunk_count * 165
 
-  mem_usage ~= repo_index_usage + chunks_cache_usage + files_cache_usage
-             = chunk_count * 164 + total_file_count * 240
+  mem_usage ~= chunks_cache_usage + files_cache_usage
+             = chunk_count * 205 + total_file_count * 240
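
As a worked instance of the updated formula, here is a small sketch. The function name is hypothetical, and the default of 21 for ``hash_mask_bits`` is an assumption taken from the default ``buzhash,19,23,21,4095`` chunker parameters quoted further down in this file.

```python
def estimate_mem_usage(total_file_size: int, total_file_count: int,
                       hash_mask_bits: int = 21) -> int:
    """Estimated bytes of memory, following the formula above."""
    chunk_size = 2 ** hash_mask_bits
    chunk_count = total_file_size // chunk_size
    chunks_cache_usage = chunk_count * 40
    files_cache_usage = total_file_count * 240 + chunk_count * 165
    return chunks_cache_usage + files_cache_usage


if __name__ == "__main__":
    # 1 TiB of data in 1,000,000 files:
    # chunk_count = 2**40 // 2**21 = 524288
    # 524288 * 205 + 1000000 * 240 = 347479040 bytes (about 331 MiB)
    print(estimate_mem_usage(2 ** 40, 1_000_000))
```

The result is a rough estimate only; with many files smaller than the typical chunk size, the real chunk count (and thus memory use) will be higher.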
 
 Due to the hashtables, the best/usual/worst cases for memory allocation can
 be estimated like that::
@@ -772,11 +576,9 @@ It is also assuming that typical chunk size is 2^HASH_MASK_BITS (if you have
 a lot of files smaller than this statistical medium chunk size, you will have
 more chunks than estimated above, because 1 file is at least 1 chunk).
 
-If a remote repository is used the repo index will be allocated on the remote side.
-
-The chunks cache, files cache and the repo index are all implemented as hash
-tables. A hash table must have a significant amount of unused entries to be
-fast - the so-called load factor gives the used/unused elements ratio.
+The chunks cache and files cache are both implemented as hash tables.
+A hash table must have a significant amount of unused entries to be fast -
+the so-called load factor gives the used/unused elements ratio.
 
 When a hash table gets full (load factor getting too high), it needs to be
 grown (allocate new, bigger hash table, copy all elements over to it, free old
@@ -802,7 +604,7 @@ b) with ``create --chunker-params buzhash,19,23,21,4095`` (default):
 HashIndex
 ---------
 
-The chunks cache and the repository index are stored as hash tables, with
+The chunks cache is implemented as a hash table, with
 only one slot per bucket, spreading hash collisions to the following
 buckets. As a consequence the hash is just a start position for a linear
 search. If a key is looked up that is not in the table, then the hash table
@@ -905,7 +707,7 @@ Both modes
 ~~~~~~~~~~
 
 Encryption keys (and other secrets) are kept either in a key file on the client
-('keyfile' mode) or in the repository config on the server ('repokey' mode).
+('keyfile' mode) or in the repository under keys/repokey ('repokey' mode).
 In both cases, the secrets are generated from random and then encrypted by a
 key derived from your passphrase (this happens on the client before the key
 is stored into the keyfile or as repokey).
@@ -923,8 +725,7 @@ Key files
 When initializing a repository with one of the "keyfile" encryption modes,
 Borg creates an associated key file in ``$HOME/.config/borg/keys``.
 
-The same key is also used in the "repokey" modes, which store it in the repository
-in the configuration file.
+The same key is also used in the "repokey" modes, which store it in the repository.
 
 The internal data structure is as follows:
 
@@ -1016,11 +817,10 @@ methods in one repo does not influence deduplication.
 
 See ``borg create --help`` about how to specify the compression level and its default.
 
-Lock files
-----------
+Lock files (fslocking)
+----------------------
 
-Borg uses locks to get (exclusive or shared) access to the cache and
-the repository.
+Borg uses filesystem locks to get (exclusive or shared) access to the cache.
 
 The locking system is based on renaming a temporary directory
 to `lock.exclusive` (for
@@ -1037,24 +837,46 @@ to `lock.exclusive`, it has the lock for it. If renaming fails
 denotes a thread on the host which is still alive), lock acquisition fails.
 
 The cache lock is usually in `~/.cache/borg/REPOID/lock.*`.
-The repository lock is in `repository/lock.*`.
+
+Locks (storelocking)
+--------------------
+
+To implement locking based on ``borgstore``, borg stores objects below locks/.
+
+The objects contain:
+
+- a timestamp of when the lock was created (or refreshed)
+- host / process / thread information about the lock owner
+- lock type: exclusive or shared
+
+Using that information, borg implements:
+
+- lock auto-expiry: if a lock is old and has not been refreshed in time,
+  it will be automatically ignored and deleted. The primary purpose of this
+  is to get rid of stale locks by borg processes on other machines.
+- lock auto-removal if the owner process is dead. The primary purpose of this
+  is to quickly get rid of stale locks by borg processes on the same machine.
+
+Breaking the locks
+------------------
 
 In case you run into troubles with the locks, you can use the ``borg break-lock``
 command after you first have made sure that no Borg process is
 running on any machine that accesses this resource. Be very careful, the cache
 or repository might get damaged if multiple processes use it at the same time.
 
+If there is an issue just with the repository lock, it will usually resolve
+automatically (see above); just retry later.
+
+
 Checksumming data structures
 ----------------------------
 
 As detailed in the previous sections, Borg generates and stores various files
-containing important meta data, such as the repository index, repository hints,
-chunks caches and files cache.
+containing important meta data, such as the files cache.
 
-Data corruption in these files can damage the archive data in a repository,
-e.g. due to wrong reference counts in the chunks cache. Only some parts of Borg
-were designed to handle corrupted data structures, so a corrupted files cache
-may cause crashes or write incorrect archives.
+Data corruption in the files cache could create incorrect archives, e.g. due
+to wrong object IDs or sizes in the files cache.
 
 Therefore, Borg calculates checksums when writing these files and tests checksums
 when reading them. Checksums are generally 64-bit XXH64 hashes.
@@ -1086,11 +908,11 @@ xxHash was expressly designed for data blocks of these sizes.
 Lower layer — file_integrity
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
-To accommodate the different transaction models used for the cache and repository,
-there is a lower layer (borg.crypto.file_integrity.IntegrityCheckedFile)
-wrapping a file-like object, performing streaming calculation and comparison of checksums.
-Checksum errors are signalled by raising an exception (borg.crypto.file_integrity.FileIntegrityError)
-at the earliest possible moment.
+There is a lower layer (borg.crypto.file_integrity.IntegrityCheckedFile)
+wrapping a file-like object, performing streaming calculation and comparison
+of checksums.
+Checksum errors are signalled by raising an exception at the earliest possible
+moment (borg.crypto.file_integrity.FileIntegrityError).
 
 .. rubric:: Calculating checksums
 
@@ -1138,14 +960,9 @@ a ".integrity" file next to the data file.
 Upper layer
 ~~~~~~~~~~~
 
-Storage of integrity data depends on the component using it, since they have
-different transaction mechanisms, and integrity data needs to be
-transacted with the data it is supposed to protect.
-
 .. rubric:: Main cache files: chunks and files cache
 
-The integrity data of the ``chunks`` and ``files`` caches is stored in the
-cache ``config``, since all three are transacted together.
+The integrity data of the ``files`` cache is stored in the cache ``config``.
 
 The ``[integrity]`` section is used:
 
@@ -1161,7 +978,7 @@ The ``[integrity]`` section is used:
 
     [integrity]
     manifest = 10e...21c
-    chunks = {"algorithm": "XXH64", "digests": {"HashHeader": "eab...39e3", "final": "e2a...b24"}}
+    files = {"algorithm": "XXH64", "digests": {"HashHeader": "eab...39e3", "final": "e2a...b24"}}
 
 The manifest ID is duplicated in the integrity section due to the way all Borg
 versions handle the config file. Instead of creating a "new" config file from
@@ -1181,44 +998,6 @@ easy to tell whether the checksums concern the current state of the cache.
 Integrity errors are fatal in these files, terminating the program,
 and are not automatically corrected at this time.
 
-.. _integrity_repo:
-
-.. rubric:: Repository index and hints
-
-The repository associates index and hints files with a transaction by including the
-transaction ID in the file names. Integrity data is stored in a third file
-("integrity.<TRANSACTION_ID>"). Like the hints file, it is msgpacked:
-
-.. code-block:: python
-
-    {
-        'version': 2,
-        'hints': '{"algorithm": "XXH64", "digests": {"final": "411208db2aa13f1a"}}',
-        'index': '{"algorithm": "XXH64", "digests": {"HashHeader": "846b7315f91b8e48", "final": "cb3e26cadc173e40"}}'
-    }
-
-The *version* key started at 2, the same version used for the hints. Since Borg has
-many versioned file formats, this keeps the number of different versions in use
-a bit lower.
-
-The other keys map an auxiliary file, like *index* or *hints* to their integrity data.
-Note that the JSON is stored as-is, and not as part of the msgpack structure.
-
-Integrity errors result in deleting the affected file(s) (index/hints) and rebuilding the index,
-which is the same action taken when corruption is noticed in other ways (e.g. HashIndex can
-detect most corrupted headers, but not data corruption). A warning is logged as well.
-The exit code is not influenced, since remote repositories cannot perform that action.
-Raising the exit code would be possible for local repositories, but is not implemented.
-
-Unlike the cache design this mechanism can have false positives whenever an older version
-*rewrites* the auxiliary files for a transaction created by a newer version,
-since that might result in a different index (due to hash-table resizing) or hints file
-(hash ordering, or the older version 1 format), while not invalidating the integrity file.
-
-For example, using 1.1 on a repository, noticing corruption or similar issues and then running
-``borg-1.0 check --repair``, which rewrites the index and hints, results in this situation.
-Borg 1.1 would erroneously report checksum errors in the hints and/or index files and trigger
-an automatic rebuild of these files.
 
 HardLinkManager and the hlid concept
 ------------------------------------

BIN
docs/internals/object-graph.odg
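
The updated memory estimate in the internals hunk above (repo index term dropped, files cache now ``chunk_count * 165``) can be checked with a small worked example. This is only a sketch of the arithmetic from the documented formula; the function name and the sample counts are made up for illustration, and the result is assumed to be in bytes.

```python
def estimate_borg_memory(chunk_count: int, total_file_count: int) -> dict:
    """Evaluate the documented estimate (illustrative helper, not part of Borg)."""
    # Per-entry constants are copied from the updated formula in the diff.
    chunks_cache_usage = chunk_count * 40
    files_cache_usage = total_file_count * 240 + chunk_count * 165
    return {
        "chunks_cache_usage": chunks_cache_usage,
        "files_cache_usage": files_cache_usage,
        # Equals chunk_count * 205 + total_file_count * 240.
        "mem_usage": chunks_cache_usage + files_cache_usage,
    }


# Hypothetical repository: 1,000,000 chunks and 100,000 files.
est = estimate_borg_memory(chunk_count=1_000_000, total_file_count=100_000)
print(est["mem_usage"])  # 229000000
```

Note that this is the base estimate only; the best/usual/worst-case factors caused by hash table load are applied on top of it, as the documentation describes.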


BIN
docs/internals/object-graph.png
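
The two stale-lock rules from the new "Locks (storelocking)" section (auto-expiry by age, auto-removal when the owner process on the same machine is dead) could be sketched roughly as follows. This is not Borg's or borgstore's actual code: the ``LockInfo`` fields, the ``STALE_AFTER`` threshold and the helper names are all assumptions made for illustration.

```python
import os
from dataclasses import dataclass


@dataclass
class LockInfo:
    # Hypothetical stand-in for the information stored in a lock object.
    timestamp: float   # when the lock was created or last refreshed
    hostname: str      # host of the lock owner
    pid: int           # process id of the lock owner
    exclusive: bool    # lock type: exclusive or shared


STALE_AFTER = 30 * 60  # seconds; an assumed threshold, not Borg's real value


def process_alive(pid: int) -> bool:
    """POSIX-only liveness probe: signal 0 checks existence without killing."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # process exists but belongs to another user
    return True


def is_stale(lock: LockInfo, now: float, local_host: str,
             stale_after: float = STALE_AFTER, alive=process_alive) -> bool:
    # Auto-expiry: an old, unrefreshed lock is ignored (covers other machines).
    if now - lock.timestamp > stale_after:
        return True
    # Auto-removal: only for owners on this machine can we check the process.
    if lock.hostname == local_host and not alive(lock.pid):
        return True
    return False
```

The liveness check is passed in as a parameter so the age rule and the dead-owner rule can be exercised separately, for example with ``alive=lambda pid: False``.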


+ 21 - 9
docs/internals/security.rst

@@ -31,14 +31,14 @@ deleted between attacks).
 Under these circumstances Borg guarantees that the attacker cannot
 
 1. modify the data of any archive without the client detecting the change
-2. rename, remove or add an archive without the client detecting the change
+2. rename or add an archive without the client detecting the change
 3. recover plain-text data
 4. recover definite (heuristics based on access patterns are possible)
    structural information such as the object graph (which archives
    refer to what chunks)
 
 The attacker can always impose a denial of service per definition (he could
-forbid connections to the repository, or delete it entirely).
+forbid connections to the repository, or delete it partly or entirely).
 
 
 .. _security_structural_auth:
@@ -47,12 +47,12 @@ Structural Authentication
 -------------------------
 
 Borg is fundamentally based on an object graph structure (see :ref:`internals`),
-where the root object is called the manifest.
+where the root objects are the archives.
 
 Borg follows the `Horton principle`_, which states that
 not only the message must be authenticated, but also its meaning (often
 expressed through context), because every object used is referenced by a
-parent object through its object ID up to the manifest. The object ID in
+parent object through its object ID up to the archive list entry. The object ID in
 Borg is a MAC of the object's plaintext, therefore this ensures that
 an attacker cannot change the context of an object without forging the MAC.
 
@@ -64,8 +64,8 @@ represent packed file metadata. On their own, it's not clear that these objects
 would represent what they do, but by the archive item referring to them
 in a particular part of its own data structure assigns this meaning.
 
-This results in a directed acyclic graph of authentication from the manifest
-to the data chunks of individual files.
+This results in a directed acyclic graph of authentication from the archive
+list entry to the data chunks of individual files.
 
 Above used to be all for borg 1.x and was the reason why it needed the
 tertiary authentication mechanism (TAM) for manifest and archives.
@@ -80,11 +80,23 @@ the object ID (via giving the ID as AAD), there is no way an attacker (without
 access to the borg key) could change the type of the object or move content
 to a different object ID.
 
-This effectively 'anchors' the manifest (and also other metadata, like archives)
-to the key, which is controlled by the client, thereby anchoring the entire DAG,
-making it impossible for an attacker to add, remove or modify any part of the
+This effectively 'anchors' each archive to the key, which is controlled by the
+client, thereby anchoring the DAG starting from the archives list entry,
+making it impossible for an attacker to add or modify any part of the
 DAG without Borg being able to detect the tampering.
 
+Please note that removing an archive by removing an entry from archives/*
+is possible and is done by ``borg delete`` and ``borg prune`` within their
+normal operation. An attacker could also remove some entries there, but, due to
+encryption, would not know what exactly they are removing. An attacker with
+repository access could also remove other parts of the repository or the whole
+repository, so there is not much point in protecting against archive removal.
+
+The borg 1.x way of having the archives list within the manifest chunk was
+problematic as it required a read-modify-write operation on the manifest,
+requiring a lock on the repository. We want to try less locking and more
+parallelism in the future.
+
 Passphrase notes
 ----------------
 

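The structural authentication described in the security hunk above rests on the object ID being a MAC of the object's plaintext. A minimal sketch of that idea follows, assuming HMAC-SHA256 as the MAC purely for illustration; the MAC Borg actually uses depends on the encryption mode, and the function and key names here are hypothetical.

```python
import hashlib
import hmac


def object_id(id_key: bytes, plaintext: bytes) -> bytes:
    """Derive an object ID as a keyed MAC over the plaintext (illustrative)."""
    return hmac.new(id_key, plaintext, hashlib.sha256).digest()


def verify_object(id_key: bytes, expected_id: bytes, plaintext: bytes) -> bool:
    """Check that decrypted content matches the ID its parent referenced."""
    # compare_digest avoids leaking where the first mismatching byte is.
    return hmac.compare_digest(object_id(id_key, plaintext), expected_id)


id_key = b"k" * 32                       # stand-in for the client-held key
chunk_id = object_id(id_key, b"file data")
ok = verify_object(id_key, chunk_id, b"file data")
tampered = verify_object(id_key, chunk_id, b"other data")
```

Because a parent object stores the IDs of its children, content swapped in under an existing ID fails this check unless the attacker can forge the MAC.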
+ 2 - 19
docs/quickstart.rst

@@ -35,18 +35,6 @@ of free space on the destination filesystem that has your backup repository
 (and also on ~/.cache). A few GB should suffice for most hard-drive sized
 repositories. See also :ref:`cache-memory-usage`.
 
-Borg doesn't use space reserved for root on repository disks (even when run as root).
-On file systems which do not support this mechanism (e.g. XFS) we recommend to reserve
-some space in Borg itself just to be safe by adjusting the ``additional_free_space``
-setting (a good starting point is ``2G``)::
-
-    borg config additional_free_space 2G
-
-If Borg runs out of disk space, it tries to free as much space as it
-can while aborting the current operation safely, which allows the user to free more space
-by deleting/pruning archives. This mechanism is not bullet-proof in some
-circumstances [1]_.
-
 If you do run out of disk space, it can be hard or impossible to free space,
 because Borg needs free space to operate - even to delete backup archives.
 
@@ -55,18 +43,13 @@ in your backup log files (you check them regularly anyway, right?).
 
 Also helpful:
 
-- create a big file as a "space reserve", that you can delete to free space
+- use `borg rspace` to reserve some disk space that can be freed when the
+  filesystem runs out of free space.
 - if you use LVM: use a LV + a filesystem that you can resize later and have
   some unallocated PEs you can add to the LV.
 - consider using quotas
 - use `prune` and `compact` regularly
 
-.. [1] This failsafe can fail in these circumstances:
-
-    - The underlying file system doesn't support statvfs(2), or returns incorrect
-      data, or the repository doesn't reside on a single file system
-    - Other tasks fill the disk simultaneously
-    - Hard quotas (which may not be reflected in statvfs(2))
 
 Important note about permissions
 --------------------------------