Thomas Waldmann, 1 week ago
parent
commit
d23704e112
3 changed files with 23 additions and 6 deletions
  1. docs/internals.rst (+2 -2)
  2. docs/internals/data-structures.rst (+11 -0)
  3. docs/internals/security.rst (+10 -4)

+ 2 - 2
docs/internals.rst

@@ -19,8 +19,8 @@ specified when the backup was performed.
 Deduplication is performed globally across all data in the repository
 (multiple backups and even multiple hosts), both on data and file
 metadata, using :ref:`chunks` created by the chunker using the
-Buzhash_ algorithm ("buzhash" chunker) or a simpler fixed blocksize
-algorithm ("fixed" chunker).
+Buzhash_ algorithm ("buzhash" and "buzhash64" chunker) or a simpler
+fixed blocksize algorithm ("fixed" chunker).
 
 To perform the repository-wide deduplication, a hash of each
 chunk is checked against the :ref:`chunks cache <cache>`, which is a

+ 11 - 0
docs/internals/data-structures.rst

@@ -399,6 +399,7 @@ Borg has these chunkers:
   supporting a header block of different size.
 - "buzhash": variable, content-defined blocksize, uses a rolling hash
   computed by the Buzhash_ algorithm.
+- "buzhash64": similar to "buzhash", but improved 64bit implementation
 
 For some more general usage hints see also ``--chunker-params``.
 
@@ -469,6 +470,16 @@ for the repository, and stored encrypted in the keyfile. This is to prevent
 chunk size based fingerprinting attacks on your encrypted repo contents (to
 guess what files you have based on a specific set of chunk sizes).
 
+"buzhash64" chunker
++++++++++++++++++++
+
+Similar to "buzhash", but using 64bit wide hash values.
+
+The buzhash table is cryptographically derived from secret key material.
+
+These changes should improve resistance against attacks and also solve
+some of the issues of the original (32bit / XORed table) implementation.
+
 .. _cache:
 
 The cache

+ 10 - 4
docs/internals/security.rst

@@ -361,13 +361,19 @@ The chunks stored in the repo are the (compressed, encrypted and authenticated)
 output of the chunker. The sizes of these stored chunks are influenced by the
 compression, encryption and authentication.
 
-buzhash chunker
-~~~~~~~~~~~~~~~
+buzhash and buzhash64 chunker
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
-The buzhash chunker chunks according to the input data, the chunker's
-parameters and the secret chunker seed (which all influence the chunk boundary
+The buzhash chunkers chunk according to the input data, the chunker's
+parameters and secret key material (which all influence the chunk boundary
 positions).
 
+Secret key material:
+
+- "buzhash": chunker seed (32bits), used for XORing the hardcoded buzhash table
+- "buzhash64": bh64_key (256bits) is derived from ID key, used to cryptographically
+  generate the table.
+
 Small files below some specific threshold (default: 512 KiB) result in only one
 chunk (identical content / size as the original file), bigger files result in
 multiple chunks.
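
Reviewer note: the "Secret key material" list in the security.rst hunk contrasts two ways of keying the buzhash table. The sketch below only illustrates that contrast and is not Borg's implementation. ``BASE_TABLE32`` is a placeholder rather than the hardcoded table, and the function names and the use of HMAC-SHA256 are assumptions, since the diff does not show how ``bh64_key`` is turned into a table.

```python
import hashlib
import hmac

# Placeholder 256-entry table of 32-bit values; NOT Borg's hardcoded buzhash table.
BASE_TABLE32 = [(i * 0x9E3779B1) & 0xFFFFFFFF for i in range(256)]


def xor_seeded_table(seed):
    """'buzhash'-style keying: XOR every fixed table entry with a 32-bit seed."""
    seed &= 0xFFFFFFFF
    return [entry ^ seed for entry in BASE_TABLE32]


def derived_table64(key):
    """'buzhash64'-style keying (assumed construction): derive 256 entries of
    64 bits each from a 256-bit key, here with HMAC-SHA256 over the index."""
    if len(key) != 32:
        raise ValueError("expected a 256-bit (32-byte) key")
    table = []
    for index in range(256):
        digest = hmac.new(key, bytes([index]), hashlib.sha256).digest()
        # Keep the first 8 bytes of the digest as one 64-bit table entry.
        table.append(int.from_bytes(digest[:8], "little"))
    return table


if __name__ == "__main__":
    a = xor_seeded_table(0x12345678)
    b = xor_seeded_table(0x9ABCDEF0)
    # The seed cancels out when two entries are XORed together, so this
    # relation between entries does not depend on the seed.
    print((a[1] ^ a[2]) == (b[1] ^ b[2]))
    print(len(derived_table64(bytes(32))))
```

In the XOR-seeded variant the relation between any two table entries is independent of the seed, whereas in the derived variant every entry depends on the whole key. Whether this is one of the "issues of the original implementation" the commit refers to is not stated in the diff.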