
docs: provide more details on object layout

While writing my own out-of-band decoder, I had a hard time figuring out
how to unpack the manifest. The description only told me that the
manifest is msgpack'd; I could not tell that it also goes through the
same encryption+compression logic as all other objects.

This change should make that a little clearer and provide the
information needed to understand how the compression works.
Jonas Schäfer, 2 years ago
Parent
Current commit
bd89fb717e
1 file changed, 21 insertions and 6 deletions
  docs/internals/data-structures.rst | +21 -6

@@ -121,6 +121,14 @@ partial/uncommitted transaction.
 The size of individual segments is limited to 4 GiB, since the offset of entries
 within segments is stored in a 32-bit unsigned integer in the repository index.
 
+Objects
+~~~~~~~
+
+All objects (the manifest, archives, archive item stream chunks and file data
+chunks) are encrypted and/or compressed. See :ref:`data-encryption` for a
+graphic outlining the anatomy of an object in Borg. The ``type`` for compression
+is explained in :ref:`data-compression`.
+
 Index, hints and integrity
 Index, hints and integrity
 ~~~~~~~~~~~~~~~~~~~~~~~~~~
 ~~~~~~~~~~~~~~~~~~~~~~~~~~
 
 
@@ -860,6 +868,8 @@ HashIndex is implemented in C and wrapped with Cython in a class-based interface
 The Cython wrapper checks every passed value against these reserved values and
 raises an AssertionError if they are used.
 
+.. _data-encryption:
+
 Encryption
 ----------
 
@@ -961,18 +971,23 @@ key file, wrapped using the standard ``textwrap`` module with a header.
 The header is a single line with a MAGIC string, a space and a hexadecimal
 representation of the repository id.
 
+.. _data-compression:
+
 Compression
 -----------
 
-Borg supports the following compression methods:
+Borg supports the following compression methods, each identified by two bytes:
 
-- none (no compression, pass through data 1:1)
-- lz4 (low compression, but super fast)
+- none (no compression, pass through data 1:1), identified by ``\x00\x00``
+- lz4 (low compression, but super fast), identified by ``\x01\x00``
 - zstd (level 1-22 offering a wide range: level 1 is lower compression and high
-  speed, level 22 is higher compression and lower speed) - since borg 1.1.4
+  speed, level 22 is higher compression and lower speed) - since borg 1.1.4,
+  identified by ``\x03\x00``
 - zlib (level 0-9, level 0 is no compression [but still adding zlib overhead],
-  level 1 is low, level 9 is high compression)
-- lzma (level 0-9, level 0 is low, level 9 is high compression).
+  level 1 is low, level 9 is high compression), identified by a zlib header
+  (``\x.8\x..``)
+- lzma (level 0-9, level 0 is low, level 9 is high compression), identified
+  by ``\x02\x00``.
 
 Speed:  none > lz4 > zlib > lzma, lz4 > zstd
 Compression: lzma > zlib > lz4 > none, zstd > lz4
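
For anyone writing a similar out-of-band decoder, the two-byte identifiers added in the Compression hunk could be checked roughly as follows. This is only a sketch, not Borg's own code: the names ``FIXED_IDS``, ``looks_like_zlib_header`` and ``identify_compression`` are made up for illustration, and the extra zlib check (the two header bytes, read as a big-endian number, are a multiple of 31) is taken from the zlib format specification (RFC 1950) rather than from this diff. The sketch assumes it is handed an object payload that has already been decrypted.

```python
from typing import Optional

# Two-byte identifiers as listed in the documentation diff above.
FIXED_IDS = {
    b"\x00\x00": "none",
    b"\x01\x00": "lz4",
    b"\x02\x00": "lzma",
    b"\x03\x00": "zstd",
}


def looks_like_zlib_header(header: bytes) -> bool:
    """Return True if the two bytes form a plausible zlib stream header.

    The diff describes the header as ``\\x.8\\x..``: the low nibble of the
    first byte is 8 (deflate). RFC 1950 additionally requires that
    CMF * 256 + FLG is a multiple of 31, which is checked here as well.
    """
    cmf, flg = header[0], header[1]
    return (cmf & 0x0F) == 8 and (cmf * 256 + flg) % 31 == 0


def identify_compression(data: bytes) -> Optional[str]:
    """Guess the compression method from the first two bytes of `data`.

    Returns the method name, or None if the bytes match nothing listed.
    """
    if len(data) < 2:
        return None
    header = bytes(data[:2])
    name = FIXED_IDS.get(header)
    if name is not None:
        return name
    if looks_like_zlib_header(header):
        return "zlib"
    return None
```

Note that for zlib the identifying bytes are the zlib stream's own header, so they stay part of the data passed to the decompressor. How the two bytes are handled for the other methods is not stated in this diff.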