|
@@ -19,63 +19,51 @@ discussion about internals`_ and also on static code analysis.
|
|
|
Repository
|
|
|
----------
|
|
|
|
|
|
-.. Some parts of this description were taken from the Repository docstring
|
|
|
-
|
|
|
-Borg stores its data in a `Repository`, which is a file system based
|
|
|
-transactional key-value store. Thus the repository does not know about
|
|
|
-the concept of archives or items.
|
|
|
-
|
|
|
-Each repository has the following file structure:
|
|
|
-
|
|
|
-README
|
|
|
- simple text file telling that this is a Borg repository
|
|
|
-
|
|
|
-config
|
|
|
- repository configuration
|
|
|
+Borg stores its data in a `Repository`, which is a key-value store and has
|
|
|
+the following structure:
|
|
|
+
|
|
|
+config/
|
|
|
+ readme
|
|
|
+ simple text object telling that this is a Borg repository
|
|
|
+ id
|
|
|
+ the unique repository ID encoded as hexadecimal number text
|
|
|
+ version
|
|
|
+ the repository version encoded as decimal number text
|
|
|
+ manifest
|
|
|
+ some data about the repository, binary
|
|
|
+ last-key-checked
|
|
|
+ repository check progress (partial checks, full checks' checkpointing),
|
|
|
+ path of last object checked as text
|
|
|
+ space-reserve.N
|
|
|
+ purely random binary data to reserve space, e.g. for disk-full emergencies
|
|
|
+
|
|
|
+There is a list of pointers to archive objects in this directory:
|
|
|
+
|
|
|
+archives/
|
|
|
+ 0000... .. ffff...
|
|
|
+
|
|
|
+The actual data is stored into a nested directory structure, using the full
|
|
|
+object ID as name. Each (encrypted and compressed) object is stored separately.
|
|
|
|
|
|
data/
|
|
|
- directory where the actual data is stored
|
|
|
-
|
|
|
-hints.%d
|
|
|
- hints for repository compaction
|
|
|
-
|
|
|
-index.%d
|
|
|
- repository index
|
|
|
-
|
|
|
-lock.roster and lock.exclusive/*
|
|
|
- used by the locking system to manage shared and exclusive locks
|
|
|
-
|
|
|
-Transactionality is achieved by using a log (aka journal) to record changes. The log is a series of numbered files
|
|
|
-called segments_. Each segment is a series of log entries. The segment number together with the offset of each
|
|
|
-entry relative to its segment start establishes an ordering of the log entries. This is the "definition" of
|
|
|
-time for the purposes of the log.
|
|
|
-
|
|
|
-.. _config-file:
|
|
|
-
|
|
|
-Config file
|
|
|
-~~~~~~~~~~~
|
|
|
+ 00/ .. ff/
|
|
|
+ 00/ .. ff/
|
|
|
+ 0000... .. ffff...
|
|
|
|
|
|
-Each repository has a ``config`` file which is a ``INI``-style file
|
|
|
-and looks like this::
|
|
|
+keys/
|
|
|
+ repokey
|
|
|
+ When using encryption in repokey mode, the encrypted, passphrase protected
|
|
|
+ key is stored here as a base64 encoded text.
|
|
|
|
|
|
- [repository]
|
|
|
- version = 2
|
|
|
- segments_per_dir = 1000
|
|
|
- max_segment_size = 524288000
|
|
|
- id = 57d6c1d52ce76a836b532b0e42e677dec6af9fca3673db511279358828a21ed6
|
|
|
+locks/
|
|
|
+ used by the locking system to manage shared and exclusive locks.
|
|
|
|
|
|
-This is where the ``repository.id`` is stored. It is a unique
|
|
|
-identifier for repositories. It will not change if you move the
|
|
|
-repository around so you can make a local transfer then decide to move
|
|
|
-the repository to another (even remote) location at a later time.
|
|
|
|
|
|
Keys
|
|
|
~~~~
|
|
|
|
|
|
-Repository keys are byte-strings of fixed length (32 bytes), they
|
|
|
-don't have a particular meaning (except for the Manifest_).
|
|
|
-
|
|
|
-Normally the keys are computed like this::
|
|
|
+Repository object IDs (which are used as key into the key-value store) are
|
|
|
+byte-strings of fixed length (256bit, 32 bytes), computed like this::
|
|
|
|
|
|
key = id = id_hash(plaintext_data) # plain = not encrypted, not compressed, not obfuscated
|
|
|
|
|
@@ -84,247 +72,68 @@ The id_hash function depends on the :ref:`encryption mode <borg_rcreate>`.
|
|
|
As the id / key is used for deduplication, id_hash must be a cryptographically
|
|
|
strong hash or MAC.
|
|
|
|
|
|
-Segments
|
|
|
-~~~~~~~~
|
|
|
-
|
|
|
-Objects referenced by a key are stored inline in files (`segments`) of approx.
|
|
|
-500 MB size in numbered subdirectories of ``repo/data``. The number of segments
|
|
|
-per directory is controlled by the value of ``segments_per_dir``. If you change
|
|
|
-this value in a non-empty repository, you may also need to relocate the segment
|
|
|
-files manually.
|
|
|
-
|
|
|
-A segment starts with a magic number (``BORG_SEG`` as an eight byte ASCII string),
|
|
|
-followed by a number of log entries. Each log entry consists of (in this order):
|
|
|
-
|
|
|
-* crc32 checksum (uint32):
|
|
|
- - for PUT2: CRC32(size + tag + key + digest)
|
|
|
- - for PUT: CRC32(size + tag + key + payload)
|
|
|
- - for DELETE: CRC32(size + tag + key)
|
|
|
- - for COMMIT: CRC32(size + tag)
|
|
|
-* size (uint32) of the entry (including the whole header)
|
|
|
-* tag (uint8): PUT(0), DELETE(1), COMMIT(2) or PUT2(3)
|
|
|
-* key (256 bit) - only for PUT/PUT2/DELETE
|
|
|
-* payload (size - 41 bytes) - only for PUT
|
|
|
-* xxh64 digest (64 bit) = XXH64(size + tag + key + payload) - only for PUT2
|
|
|
-* payload (size - 41 - 8 bytes) - only for PUT2
|
|
|
-
|
|
|
-PUT2 is new since repository version 2. For new log entries PUT2 is used.
|
|
|
-PUT is still supported to read version 1 repositories, but not generated any more.
|
|
|
-If we talk about ``PUT`` in general, it shall usually mean PUT2 for repository
|
|
|
-version 2+.
|
|
|
+Repository objects
|
|
|
+~~~~~~~~~~~~~~~~~~
|
|
|
|
|
|
-Those files are strictly append-only and modified only once.
|
|
|
+Each repository object is stored separately, under its ID into data/xx/yy/xxyy...
|
|
|
|
|
|
-When an object is written to the repository a ``PUT`` entry is written
|
|
|
-to the file containing the object id and payload. If an object is deleted
|
|
|
-a ``DELETE`` entry is appended with the object id.
|
|
|
+A repo object has a structure like this:
|
|
|
|
|
|
-A ``COMMIT`` tag is written when a repository transaction is
|
|
|
-committed. The segment number of the segment containing
|
|
|
-a commit is the **transaction ID**.
|
|
|
+* 32bit meta size
|
|
|
+* 32bit data size
|
|
|
+* 64bit xxh64(meta)
|
|
|
+* 64bit xxh64(data)
|
|
|
+* meta
|
|
|
+* data
|
|
|
|
|
|
-When a repository is opened any ``PUT`` or ``DELETE`` operations not
|
|
|
-followed by a ``COMMIT`` tag are discarded since they are part of a
|
|
|
-partial/uncommitted transaction.
|
|
|
+The size and xxh64 hashes can be used for server-side corruption checks without
|
|
|
+needing to decrypt anything (which would require the borg key).
|
|
|
|
|
|
-The size of individual segments is limited to 4 GiB, since the offset of entries
|
|
|
-within segments is stored in a 32-bit unsigned integer in the repository index.
|
|
|
+The overall size of repository objects varies from very small (a small source
|
|
|
+file will be stored as a single repo object) to medium (big source files will
|
|
|
+be cut into medium sized chunks of some MB).
|
|
|
|
|
|
-Objects / Payload structure
|
|
|
-~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
|
|
+Metadata and data are separately encrypted and authenticated (depending on
|
|
|
+the user's choices).
|
|
|
|
|
|
-All data (the manifest, archives, archive item stream chunks and file data
|
|
|
-chunks) is compressed, optionally obfuscated and encrypted. This produces some
|
|
|
-additional metadata (size and compression information), which is separately
|
|
|
-serialized and also encrypted.
|
|
|
+See :ref:`data-encryption` for a graphic outlining the anatomy of the
|
|
|
+encryption.
|
|
|
|
|
|
-See :ref:`data-encryption` for a graphic outlining the anatomy of the encryption in Borg.
|
|
|
-What you see at the bottom there is done twice: once for the data and once for the metadata.
|
|
|
+Repo object metadata
|
|
|
+~~~~~~~~~~~~~~~~~~~~
|
|
|
|
|
|
-An object (the payload part of a segment file log entry) must be like:
|
|
|
+Metadata is a msgpacked (and encrypted/authenticated) dict with:
|
|
|
|
|
|
-- length of encrypted metadata (16bit unsigned int)
|
|
|
-- encrypted metadata (incl. encryption header), when decrypted:
|
|
|
+- ctype (compression type 0..255)
|
|
|
+- clevel (compression level 0..255)
|
|
|
+- csize (overall compressed (and maybe obfuscated) data size)
|
|
|
+- psize (only when obfuscated: payload size without the obfuscation trailer)
|
|
|
+- size (uncompressed size of the data)
|
|
|
|
|
|
- - msgpacked dict with:
|
|
|
-
|
|
|
- - ctype (compression type 0..255)
|
|
|
- - clevel (compression level 0..255)
|
|
|
- - csize (overall compressed (and maybe obfuscated) data size)
|
|
|
- - psize (only when obfuscated: payload size without the obfuscation trailer)
|
|
|
- - size (uncompressed size of the data)
|
|
|
-- encrypted data (incl. encryption header), when decrypted:
|
|
|
-
|
|
|
- - compressed data (with an optional all-zero-bytes obfuscation trailer)
|
|
|
-
|
|
|
-This new, more complex repo v2 object format was implemented to be able to query the
|
|
|
-metadata efficiently without having to read, transfer and decrypt the (usually much bigger)
|
|
|
-data part.
|
|
|
-
|
|
|
-The metadata is encrypted not to disclose potentially sensitive information that could be
|
|
|
-used for e.g. fingerprinting attacks.
|
|
|
+Having this separately encrypted metadata makes it more efficient to query
|
|
|
+the metadata without having to read, transfer and decrypt the (usually much
|
|
|
+bigger) data part.
|
|
|
|
|
|
The compression `ctype` and `clevel` is explained in :ref:`data-compression`.
|
|
|
|
|
|
|
|
|
-Index, hints and integrity
|
|
|
-~~~~~~~~~~~~~~~~~~~~~~~~~~
|
|
|
-
|
|
|
-The **repository index** is stored in ``index.<TRANSACTION_ID>`` and is used to
|
|
|
-determine an object's location in the repository. It is a HashIndex_,
|
|
|
-a hash table using open addressing.
|
|
|
-
|
|
|
-It maps object keys_ to:
|
|
|
-
|
|
|
-* segment number (unit32)
|
|
|
-* offset of the object's entry within the segment (uint32)
|
|
|
-* size of the payload, not including the entry header (uint32)
|
|
|
-* flags (uint32)
|
|
|
-
|
|
|
-The **hints file** is a msgpacked file named ``hints.<TRANSACTION_ID>``.
|
|
|
-It contains:
|
|
|
-
|
|
|
-* version
|
|
|
-* list of segments
|
|
|
-* compact
|
|
|
-* shadow_index
|
|
|
-* storage_quota_use
|
|
|
-
|
|
|
-The **integrity file** is a msgpacked file named ``integrity.<TRANSACTION_ID>``.
|
|
|
-It contains checksums of the index and hints files and is described in the
|
|
|
-:ref:`Checksumming data structures <integrity_repo>` section below.
|
|
|
-
|
|
|
-If the index or hints are corrupted, they are re-generated automatically.
|
|
|
-If they are outdated, segments are replayed from the index state to the currently
|
|
|
-committed transaction.
|
|
|
-
|
|
|
Compaction
|
|
|
~~~~~~~~~~
|
|
|
|
|
|
-For a given key only the last entry regarding the key, which is called current (all other entries are called
|
|
|
-superseded), is relevant: If there is no entry or the last entry is a DELETE then the key does not exist.
|
|
|
-Otherwise the last PUT defines the value of the key.
|
|
|
-
|
|
|
-By superseding a PUT (with either another PUT or a DELETE) the log entry becomes obsolete. A segment containing
|
|
|
-such obsolete entries is called sparse, while a segment containing no such entries is called compact.
|
|
|
-
|
|
|
-Since writing a ``DELETE`` tag does not actually delete any data and
|
|
|
-thus does not free disk space any log-based data store will need a
|
|
|
-compaction strategy (somewhat analogous to a garbage collector).
|
|
|
-
|
|
|
-Borg uses a simple forward compacting algorithm, which avoids modifying existing segments.
|
|
|
-Compaction runs when a commit is issued with ``compact=True`` parameter, e.g.
|
|
|
-by the ``borg compact`` command (unless the :ref:`append_only_mode` is active).
|
|
|
-
|
|
|
-The compaction algorithm requires two inputs in addition to the segments themselves:
|
|
|
-
|
|
|
-(i) Which segments are sparse, to avoid scanning all segments (impractical).
|
|
|
- Further, Borg uses a conditional compaction strategy: Only those
|
|
|
- segments that exceed a threshold sparsity are compacted.
|
|
|
-
|
|
|
- To implement the threshold condition efficiently, the sparsity has
|
|
|
- to be stored as well. Therefore, Borg stores a mapping ``(segment
|
|
|
- id,) -> (number of sparse bytes,)``.
|
|
|
-
|
|
|
-(ii) Each segment's reference count, which indicates how many live objects are in a segment.
|
|
|
- This is not strictly required to perform the algorithm. Rather, it is used to validate
|
|
|
- that a segment is unused before deleting it. If the algorithm is incorrect, or the reference
|
|
|
- count was not accounted correctly, then an assertion failure occurs.
|
|
|
-
|
|
|
-These two pieces of information are stored in the hints file (`hints.N`)
|
|
|
-next to the index (`index.N`).
|
|
|
-
|
|
|
-Compaction may take some time if a repository has been kept in append-only mode
|
|
|
-or ``borg compact`` has not been used for a longer time, which both has caused
|
|
|
-the number of sparse segments to grow.
|
|
|
-
|
|
|
-Compaction processes sparse segments from oldest to newest; sparse segments
|
|
|
-which don't contain enough deleted data to justify compaction are skipped. This
|
|
|
-avoids doing e.g. 500 MB of writing current data to a new segment when only
|
|
|
-a couple kB were deleted in a segment.
|
|
|
-
|
|
|
-Segments that are compacted are read in entirety. Current entries are written to
|
|
|
-a new segment, while superseded entries are omitted. After each segment an intermediary
|
|
|
-commit is written to the new segment. Then, the old segment is deleted
|
|
|
-(asserting that the reference count diminished to zero), freeing disk space.
|
|
|
-
|
|
|
-A simplified example (excluding conditional compaction and with simpler
|
|
|
-commit logic) showing the principal operation of compaction:
|
|
|
-
|
|
|
-.. figure:: compaction.png
|
|
|
- :figwidth: 100%
|
|
|
- :width: 100%
|
|
|
-
|
|
|
-(The actual algorithm is more complex to avoid various consistency issues, refer to
|
|
|
-the ``borg.repository`` module for more comments and documentation on these issues.)
|
|
|
-
|
|
|
-.. _internals_storage_quota:
|
|
|
-
|
|
|
-Storage quotas
|
|
|
-~~~~~~~~~~~~~~
|
|
|
-
|
|
|
-Quotas are implemented at the Repository level. The active quota of a repository
|
|
|
-is determined by the ``storage_quota`` `config` entry or a run-time override (via :ref:`borg_serve`).
|
|
|
-The currently used quota is stored in the hints file. Operations (PUT and DELETE) during
|
|
|
-a transaction modify the currently used quota:
|
|
|
-
|
|
|
-- A PUT adds the size of the *log entry* to the quota,
|
|
|
- i.e. the length of the data plus the 41 byte header.
|
|
|
-- A DELETE subtracts the size of the deleted log entry from the quota,
|
|
|
- which includes the header.
|
|
|
-
|
|
|
-Thus, PUT and DELETE are symmetric and cancel each other out precisely.
|
|
|
-
|
|
|
-The quota does not track on-disk size overheads (due to conditional compaction
|
|
|
-or append-only mode). In normal operation the inclusion of the log entry headers
|
|
|
-in the quota act as a faithful proxy for index and hints overheads.
|
|
|
-
|
|
|
-By tracking effective content size, the client can *always* recover from a full quota
|
|
|
-by deleting archives. This would not be possible if the quota tracked on-disk size,
|
|
|
-since journaling DELETEs requires extra disk space before space is freed.
|
|
|
-Tracking effective size on the other hand accounts DELETEs immediately as freeing quota.
|
|
|
-
|
|
|
-.. rubric:: Enforcing the quota
|
|
|
-
|
|
|
-The storage quota is meant as a robust mechanism for service providers, therefore
|
|
|
-:ref:`borg_serve` has to enforce it without loopholes (e.g. modified clients).
|
|
|
-The following sections refer to using quotas on remotely accessed repositories.
|
|
|
-For local access, consider *client* and *serve* the same.
|
|
|
-Accordingly, quotas cannot be enforced with local access,
|
|
|
-since the quota can be changed in the repository config.
|
|
|
-
|
|
|
-The quota is enforcible only if *all* :ref:`borg_serve` versions
|
|
|
-accessible to clients support quotas (see next section). Further, quota is
|
|
|
-per repository. Therefore, ensure clients can only access a defined set of repositories
|
|
|
-with their quotas set, using ``--restrict-to-repository``.
|
|
|
+``borg compact`` is used to free repository space. It will:
|
|
|
|
|
|
-If the client exceeds the storage quota the ``StorageQuotaExceeded`` exception is
|
|
|
-raised. Normally a client could ignore such an exception and just send a ``commit()``
|
|
|
-command anyway, circumventing the quota. However, when ``StorageQuotaExceeded`` is raised,
|
|
|
-it is stored in the ``transaction_doomed`` attribute of the repository.
|
|
|
-If the transaction is doomed, then commit will re-raise this exception, aborting the commit.
|
|
|
+- list all object IDs present in the repository
|
|
|
+- read all archives and determine which object IDs are in use
|
|
|
+- remove all unused objects from the repository
|
|
|
+- inform / warn about anything remarkable it found:
|
|
|
|
|
|
-The transaction_doomed indicator is reset on a rollback (which erases the quota-exceeding
|
|
|
-state).
|
|
|
+ - warn about IDs used, but not present (data loss!)
|
|
|
+ - inform about IDs that reappeared that were previously lost
|
|
|
+- compute statistics about:
|
|
|
|
|
|
-.. rubric:: Compatibility with older servers and enabling quota after-the-fact
|
|
|
+ - compression and deduplication factors
|
|
|
+ - repository space usage and space freed
|
|
|
|
|
|
-If no quota data is stored in the hints file, Borg assumes zero quota is used.
|
|
|
-Thus, if a repository with an enabled quota is written to with an older ``borg serve``
|
|
|
-version that does not understand quotas, then the quota usage will be erased.
|
|
|
-
|
|
|
-The client version is irrelevant to the storage quota and has no part in it.
|
|
|
-The form of error messages due to exceeding quota varies with client versions.
|
|
|
-
|
|
|
-A similar situation arises when upgrading from a Borg release that did not have quotas.
|
|
|
-Borg will start tracking quota use from the time of the upgrade, starting at zero.
|
|
|
-
|
|
|
-If the quota shall be enforced accurately in these cases, either
|
|
|
-
|
|
|
-- delete the ``index.N`` and ``hints.N`` files, forcing Borg to rebuild both,
|
|
|
- re-acquiring quota data in the process, or
|
|
|
-- edit the msgpacked ``hints.N`` file (not recommended and thus not
|
|
|
- documented further).
|
|
|
|
|
|
The object graph
|
|
|
----------------
|
|
@@ -344,10 +153,10 @@ More on how this helps security in :ref:`security_structural_auth`.
|
|
|
The manifest
|
|
|
~~~~~~~~~~~~
|
|
|
|
|
|
-The manifest is the root of the object hierarchy. It references
|
|
|
-all archives in a repository, and thus all data in it.
|
|
|
-Since no object references it, it cannot be stored under its ID key.
|
|
|
-Instead, the manifest has a fixed all-zero key.
|
|
|
+Compared to borg 1.x:
|
|
|
+
|
|
|
+- the manifest moved from object ID 0 to config/manifest
|
|
|
+- the archives list has been moved from the manifest to archives/*
|
|
|
|
|
|
The manifest is rewritten each time an archive is created, deleted,
|
|
|
or modified. It looks like this:
|
|
@@ -523,17 +332,18 @@ these may/may not be implemented and purely serve as examples.
|
|
|
Archives
|
|
|
~~~~~~~~
|
|
|
|
|
|
-Each archive is an object referenced by the manifest. The archive object
|
|
|
-itself does not store any of the data contained in the archive it describes.
|
|
|
+Each archive is an object referenced by an entry below archives/.
|
|
|
+The archive object itself does not store any of the data contained in the
|
|
|
+archive it describes.
|
|
|
|
|
|
Instead, it contains a list of chunks which form a msgpacked stream of items_.
|
|
|
The archive object itself further contains some metadata:
|
|
|
|
|
|
* *version*
|
|
|
-* *name*, which might differ from the name set in the manifest.
|
|
|
+* *name*, which might differ from the name set in the archives/* object.
|
|
|
When :ref:`borg_check` rebuilds the manifest (e.g. if it was corrupted) and finds
|
|
|
more than one archive object with the same name, it adds a counter to the name
|
|
|
- in the manifest, but leaves the *name* field of the archives as it was.
|
|
|
+ in archives/*, but leaves the *name* field of the archives as they were.
|
|
|
* *item_ptrs*, a list of "pointer chunk" IDs.
|
|
|
Each "pointer chunk" contains a list of chunk IDs of item metadata.
|
|
|
* *command_line*, the command line which was used to create the archive
|
|
@@ -676,7 +486,7 @@ In memory, the files cache is a key -> value mapping (a Python *dict*) and conta
|
|
|
- file size
|
|
|
- file ctime_ns (or mtime_ns)
|
|
|
- age (0 [newest], 1, 2, 3, ..., BORG_FILES_CACHE_TTL - 1)
|
|
|
- - list of chunk ids representing the file's contents
|
|
|
+ - list of chunk (id, size) tuples representing the file's contents
|
|
|
|
|
|
To determine whether a file has not changed, cached values are looked up via
|
|
|
the key in the mapping and compared to the current file attribute values.
|
|
@@ -717,9 +527,9 @@ The on-disk format of the files cache is a stream of msgpacked tuples (key, valu
|
|
|
Loading the files cache involves reading the file, one msgpack object at a time,
|
|
|
unpacking it, and msgpacking the value (in an effort to save memory).
|
|
|
|
|
|
-The **chunks cache** is stored in ``cache/chunks`` and is used to determine
|
|
|
-whether we already have a specific chunk, to count references to it and also
|
|
|
-for statistics.
|
|
|
+The **chunks cache** is not persisted to disk, but dynamically built in memory
|
|
|
+by querying the existing object IDs from the repository.
|
|
|
+It is used to determine whether we already have a specific chunk.
|
|
|
|
|
|
The chunks cache is a key -> value mapping and contains:
|
|
|
|
|
@@ -728,14 +538,10 @@ The chunks cache is a key -> value mapping and contains:
|
|
|
- chunk id_hash
|
|
|
* value:
|
|
|
|
|
|
- - reference count
|
|
|
- - size
|
|
|
+ - reference count (always MAX_VALUE as we do not refcount anymore)
|
|
|
+ - size (0 for prev. existing objects, we can't query their plaintext size)
|
|
|
|
|
|
-The chunks cache is a HashIndex_. Due to some restrictions of HashIndex,
|
|
|
-the reference count of each given chunk is limited to a constant, MAX_VALUE
|
|
|
-(introduced below in HashIndex_), approximately 2**32.
|
|
|
-If a reference count hits MAX_VALUE, decrementing it yields MAX_VALUE again,
|
|
|
-i.e. the reference count is pinned to MAX_VALUE.
|
|
|
+The chunks cache is a HashIndex_.
|
|
|
|
|
|
.. _cache-memory-usage:
|
|
|
|
|
@@ -747,14 +553,12 @@ Here is the estimated memory usage of Borg - it's complicated::
|
|
|
chunk_size ~= 2 ^ HASH_MASK_BITS (for buzhash chunker, BLOCK_SIZE for fixed chunker)
|
|
|
chunk_count ~= total_file_size / chunk_size
|
|
|
|
|
|
- repo_index_usage = chunk_count * 48
|
|
|
-
|
|
|
chunks_cache_usage = chunk_count * 40
|
|
|
|
|
|
- files_cache_usage = total_file_count * 240 + chunk_count * 80
|
|
|
+ files_cache_usage = total_file_count * 240 + chunk_count * 165
|
|
|
|
|
|
- mem_usage ~= repo_index_usage + chunks_cache_usage + files_cache_usage
|
|
|
- = chunk_count * 164 + total_file_count * 240
|
|
|
+ mem_usage ~= chunks_cache_usage + files_cache_usage
|
|
|
+ = chunk_count * 205 + total_file_count * 240
|
|
|
|
|
|
Due to the hashtables, the best/usual/worst cases for memory allocation can
|
|
|
be estimated like that::
|
|
@@ -772,11 +576,9 @@ It is also assuming that typical chunk size is 2^HASH_MASK_BITS (if you have
|
|
|
a lot of files smaller than this statistical medium chunk size, you will have
|
|
|
more chunks than estimated above, because 1 file is at least 1 chunk).
|
|
|
|
|
|
-If a remote repository is used the repo index will be allocated on the remote side.
|
|
|
-
|
|
|
-The chunks cache, files cache and the repo index are all implemented as hash
|
|
|
-tables. A hash table must have a significant amount of unused entries to be
|
|
|
-fast - the so-called load factor gives the used/unused elements ratio.
|
|
|
+The chunks cache and files cache are all implemented as hash tables.
|
|
|
+A hash table must have a significant amount of unused entries to be fast -
|
|
|
+the so-called load factor gives the used/unused elements ratio.
|
|
|
|
|
|
When a hash table gets full (load factor getting too high), it needs to be
|
|
|
grown (allocate new, bigger hash table, copy all elements over to it, free old
|
|
@@ -802,7 +604,7 @@ b) with ``create --chunker-params buzhash,19,23,21,4095`` (default):
|
|
|
HashIndex
|
|
|
---------
|
|
|
|
|
|
-The chunks cache and the repository index are stored as hash tables, with
|
|
|
+The chunks cache is implemented as a hash table, with
|
|
|
only one slot per bucket, spreading hash collisions to the following
|
|
|
buckets. As a consequence the hash is just a start position for a linear
|
|
|
search. If a key is looked up that is not in the table, then the hash table
|
|
@@ -905,7 +707,7 @@ Both modes
|
|
|
~~~~~~~~~~
|
|
|
|
|
|
Encryption keys (and other secrets) are kept either in a key file on the client
|
|
|
-('keyfile' mode) or in the repository config on the server ('repokey' mode).
|
|
|
+('keyfile' mode) or in the repository under keys/repokey ('repokey' mode).
|
|
|
In both cases, the secrets are generated from random and then encrypted by a
|
|
|
key derived from your passphrase (this happens on the client before the key
|
|
|
is stored into the keyfile or as repokey).
|
|
@@ -923,8 +725,7 @@ Key files
|
|
|
When initializing a repository with one of the "keyfile" encryption modes,
|
|
|
Borg creates an associated key file in ``$HOME/.config/borg/keys``.
|
|
|
|
|
|
-The same key is also used in the "repokey" modes, which store it in the repository
|
|
|
-in the configuration file.
|
|
|
+The same key is also used in the "repokey" modes, which store it in the repository.
|
|
|
|
|
|
The internal data structure is as follows:
|
|
|
|
|
@@ -1016,11 +817,10 @@ methods in one repo does not influence deduplication.
|
|
|
|
|
|
See ``borg create --help`` about how to specify the compression level and its default.
|
|
|
|
|
|
-Lock files
|
|
|
-----------
|
|
|
+Lock files (fslocking)
|
|
|
+----------------------
|
|
|
|
|
|
-Borg uses locks to get (exclusive or shared) access to the cache and
|
|
|
-the repository.
|
|
|
+Borg uses filesystem locks to get (exclusive or shared) access to the cache.
|
|
|
|
|
|
The locking system is based on renaming a temporary directory
|
|
|
to `lock.exclusive` (for
|
|
@@ -1037,24 +837,46 @@ to `lock.exclusive`, it has the lock for it. If renaming fails
|
|
|
denotes a thread on the host which is still alive), lock acquisition fails.
|
|
|
|
|
|
The cache lock is usually in `~/.cache/borg/REPOID/lock.*`.
|
|
|
-The repository lock is in `repository/lock.*`.
|
|
|
+
|
|
|
+Locks (storelocking)
|
|
|
+--------------------
|
|
|
+
|
|
|
+To implement locking based on ``borgstore``, borg stores objects below locks/.
|
|
|
+
|
|
|
+The objects contain:
|
|
|
+
|
|
|
+- a timestamp when lock was created (or refreshed)
|
|
|
+- host / process / thread information about lock owner
|
|
|
+- lock type: exclusive or shared
|
|
|
+
|
|
|
+Using that information, borg implements:
|
|
|
+
|
|
|
+- lock auto-expiry: if a lock is old and has not been refreshed in time,
|
|
|
+ it will be automatically ignored and deleted. the primary purpose of this
|
|
|
+ is to get rid of stale locks by borg processes on other machines.
|
|
|
+- lock auto-removal if the owner process is dead. the primary purpose of this
|
|
|
+ is to quickly get rid of stale locks by borg processes on the same machine.
|
|
|
+
|
|
|
+Breaking the locks
|
|
|
+------------------
|
|
|
|
|
|
In case you run into troubles with the locks, you can use the ``borg break-lock``
|
|
|
command after you first have made sure that no Borg process is
|
|
|
running on any machine that accesses this resource. Be very careful, the cache
|
|
|
or repository might get damaged if multiple processes use it at the same time.
|
|
|
|
|
|
+If there is an issue just with the repository lock, it will usually resolve
|
|
|
+automatically (see above), just retry later.
|
|
|
+
|
|
|
+
|
|
|
Checksumming data structures
|
|
|
----------------------------
|
|
|
|
|
|
As detailed in the previous sections, Borg generates and stores various files
|
|
|
-containing important meta data, such as the repository index, repository hints,
|
|
|
-chunks caches and files cache.
|
|
|
+containing important meta data, such as the files cache.
|
|
|
|
|
|
-Data corruption in these files can damage the archive data in a repository,
|
|
|
-e.g. due to wrong reference counts in the chunks cache. Only some parts of Borg
|
|
|
-were designed to handle corrupted data structures, so a corrupted files cache
|
|
|
-may cause crashes or write incorrect archives.
|
|
|
+Data corruption in the files cache could create incorrect archives, e.g. due
|
|
|
+to wrong object IDs or sizes in the files cache.
|
|
|
|
|
|
Therefore, Borg calculates checksums when writing these files and tests checksums
|
|
|
when reading them. Checksums are generally 64-bit XXH64 hashes.
|
|
@@ -1086,11 +908,11 @@ xxHash was expressly designed for data blocks of these sizes.
|
|
|
Lower layer — file_integrity
|
|
|
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
|
|
|
|
|
-To accommodate the different transaction models used for the cache and repository,
|
|
|
-there is a lower layer (borg.crypto.file_integrity.IntegrityCheckedFile)
|
|
|
-wrapping a file-like object, performing streaming calculation and comparison of checksums.
|
|
|
-Checksum errors are signalled by raising an exception (borg.crypto.file_integrity.FileIntegrityError)
|
|
|
-at the earliest possible moment.
|
|
|
+There is a lower layer (borg.crypto.file_integrity.IntegrityCheckedFile)
|
|
|
+wrapping a file-like object, performing streaming calculation and comparison
|
|
|
+of checksums.
|
|
|
+Checksum errors are signalled by raising an exception at the earliest possible
|
|
|
+moment (borg.crypto.file_integrity.FileIntegrityError).
|
|
|
|
|
|
.. rubric:: Calculating checksums
|
|
|
|
|
@@ -1134,19 +956,13 @@ The *digests* key contains a mapping of part names to their digests.
|
|
|
Integrity data is generally stored by the upper layers, introduced below. An exception
|
|
|
is the DetachedIntegrityCheckedFile, which automatically writes and reads it from
|
|
|
a ".integrity" file next to the data file.
|
|
|
-It is used for archive chunks indexes in chunks.archive.d.
|
|
|
|
|
|
Upper layer
|
|
|
~~~~~~~~~~~
|
|
|
|
|
|
-Storage of integrity data depends on the component using it, since they have
|
|
|
-different transaction mechanisms, and integrity data needs to be
|
|
|
-transacted with the data it is supposed to protect.
|
|
|
-
|
|
|
.. rubric:: Main cache files: chunks and files cache
|
|
|
|
|
|
-The integrity data of the ``chunks`` and ``files`` caches is stored in the
|
|
|
-cache ``config``, since all three are transacted together.
|
|
|
+The integrity data of the ``files`` cache is stored in the cache ``config``.
|
|
|
|
|
|
The ``[integrity]`` section is used:
|
|
|
|
|
@@ -1162,7 +978,7 @@ The ``[integrity]`` section is used:
|
|
|
|
|
|
[integrity]
|
|
|
manifest = 10e...21c
|
|
|
- chunks = {"algorithm": "XXH64", "digests": {"HashHeader": "eab...39e3", "final": "e2a...b24"}}
|
|
|
+ files = {"algorithm": "XXH64", "digests": {"HashHeader": "eab...39e3", "final": "e2a...b24"}}
|
|
|
|
|
|
The manifest ID is duplicated in the integrity section due to the way all Borg
|
|
|
versions handle the config file. Instead of creating a "new" config file from
|
|
@@ -1182,52 +998,6 @@ easy to tell whether the checksums concern the current state of the cache.
|
|
|
Integrity errors are fatal in these files, terminating the program,
|
|
|
and are not automatically corrected at this time.
|
|
|
|
|
|
-.. rubric:: chunks.archive.d
|
|
|
-
|
|
|
-Indices in chunks.archive.d are not transacted and use DetachedIntegrityCheckedFile,
|
|
|
-which writes the integrity data to a separate ".integrity" file.
|
|
|
-
|
|
|
-Integrity errors result in deleting the affected index and rebuilding it.
|
|
|
-This logs a warning and increases the exit code to WARNING (1).
|
|
|
-
|
|
|
-.. _integrity_repo:
|
|
|
-
|
|
|
-.. rubric:: Repository index and hints
|
|
|
-
|
|
|
-The repository associates index and hints files with a transaction by including the
|
|
|
-transaction ID in the file names. Integrity data is stored in a third file
|
|
|
-("integrity.<TRANSACTION_ID>"). Like the hints file, it is msgpacked:
|
|
|
-
|
|
|
-.. code-block:: python
|
|
|
-
|
|
|
- {
|
|
|
- 'version': 2,
|
|
|
- 'hints': '{"algorithm": "XXH64", "digests": {"final": "411208db2aa13f1a"}}',
|
|
|
- 'index': '{"algorithm": "XXH64", "digests": {"HashHeader": "846b7315f91b8e48", "final": "cb3e26cadc173e40"}}'
|
|
|
- }
|
|
|
-
|
|
|
-The *version* key started at 2, the same version used for the hints. Since Borg has
|
|
|
-many versioned file formats, this keeps the number of different versions in use
|
|
|
-a bit lower.
|
|
|
-
|
|
|
-The other keys map an auxiliary file, like *index* or *hints* to their integrity data.
|
|
|
-Note that the JSON is stored as-is, and not as part of the msgpack structure.
|
|
|
-
|
|
|
-Integrity errors result in deleting the affected file(s) (index/hints) and rebuilding the index,
|
|
|
-which is the same action taken when corruption is noticed in other ways (e.g. HashIndex can
|
|
|
-detect most corrupted headers, but not data corruption). A warning is logged as well.
|
|
|
-The exit code is not influenced, since remote repositories cannot perform that action.
|
|
|
-Raising the exit code would be possible for local repositories, but is not implemented.
|
|
|
-
|
|
|
-Unlike the cache design this mechanism can have false positives whenever an older version
|
|
|
-*rewrites* the auxiliary files for a transaction created by a newer version,
|
|
|
-since that might result in a different index (due to hash-table resizing) or hints file
|
|
|
-(hash ordering, or the older version 1 format), while not invalidating the integrity file.
|
|
|
-
|
|
|
-For example, using 1.1 on a repository, noticing corruption or similar issues and then running
|
|
|
-``borg-1.0 check --repair``, which rewrites the index and hints, results in this situation.
|
|
|
-Borg 1.1 would erroneously report checksum errors in the hints and/or index files and trigger
|
|
|
-an automatic rebuild of these files.
|
|
|
|
|
|
HardLinkManager and the hlid concept
|
|
|
------------------------------------
|