
Change documentation inaccuracy on chunk size.

We now use only "target chunk size" when speaking of the chunk size
that is expected to happen most of the time. This removes statistical
and mathematical inaccuracies that could be troublesome for mathematically
minded readers.

Fixes #5336
Guinness, 4 years ago
Commit 61c92110e6
1 file changed, with 3 insertions and 3 deletions
  docs/internals/data-structures.rst (+3, -3)


@@ -608,8 +608,8 @@ default is not to have a differently sized header chunk).
 "buzhash" chunker
 +++++++++++++++++
 
-The buzhash chunker triggers (chunks) when the last HASH_MASK_BITS bits of
-the hash are zero, producing chunks of 2^HASH_MASK_BITS Bytes on average.
+The buzhash chunker triggers (chunks) when the last HASH_MASK_BITS bits of the
+hash are zero, producing chunks with a target size of 2^HASH_MASK_BITS Bytes.
 
 Buzhash is **only** used for cutting the chunks at places defined by the
 content, the buzhash value is **not** used as the deduplication criteria (we
@@ -621,7 +621,7 @@ can be used to tune the chunker parameters, the default is:
 
 - CHUNK_MIN_EXP = 19 (minimum chunk size = 2^19 B = 512 kiB)
 - CHUNK_MAX_EXP = 23 (maximum chunk size = 2^23 B = 8 MiB)
-- HASH_MASK_BITS = 21 (statistical medium chunk size ~= 2^21 B = 2 MiB)
+- HASH_MASK_BITS = 21 (target chunk size ~= 2^21 B = 2 MiB)
 - HASH_WINDOW_SIZE = 4095 [B] (`0xFFF`)
 
 The buzhash table is altered by XORing it with a seed randomly generated once
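
Note on the wording change: the trigger rule in the hunk above (cut when the low HASH_MASK_BITS bits of the rolling hash are zero, but never below the minimum or above the maximum chunk size) can be tried out with a toy chunker. The sketch below is not Borg's implementation. The function names `rotl32` and `chunk_sizes`, the random table, and the scaled-down parameters (min 2^6, max 2^10, mask 8 bits, window 31) are assumptions chosen so that it runs quickly. In an idealized model the minimum-size skip and the maximum-size cap both shift the mean away from 2^HASH_MASK_BITS, which is one reason "target" is a safer word than "average".

```python
import random


def rotl32(x, n):
    """Rotate a 32-bit value left by n bits (n is reduced modulo 32)."""
    n %= 32
    return ((x << n) | (x >> (32 - n))) & 0xFFFFFFFF


def chunk_sizes(data, table, min_exp, max_exp, mask_bits, window):
    """Return the chunk lengths produced by a toy buzhash-style chunker."""
    mask = (1 << mask_bits) - 1
    min_size, max_size = 1 << min_exp, 1 << max_exp
    sizes = []
    start = 0
    while start < len(data):
        end = min(start + max_size, len(data))
        cut = end  # fallback: forced cut at the maximum size or end of data
        h = 0
        for i in range(start, end):
            # Roll the new byte into the hash.
            h = rotl32(h, 1) ^ table[data[i]]
            if i - start >= window:
                # Roll out the byte that just left the window.
                h ^= rotl32(table[data[i - window]], window)
            length = i - start + 1
            # Trigger: low mask_bits bits are zero and the minimum is reached.
            if length >= min_size and (h & mask) == 0:
                cut = i + 1
                break
        sizes.append(cut - start)
        start = cut
    return sizes


# Hypothetical, scaled-down demo parameters (not Borg's defaults).
rng = random.Random(0)  # fixed seed so the demo is repeatable
table = [rng.getrandbits(32) for _ in range(256)]
data = bytes(rng.getrandbits(8) for _ in range(200_000))
sizes = chunk_sizes(data, table, min_exp=6, max_exp=10, mask_bits=8, window=31)

print("target size:", 1 << 8)
print("mean size:  ", sum(sizes) / len(sizes))
print("min / max:  ", min(sizes[:-1]), "/", max(sizes))
```

Running it prints the target size next to the observed mean for the random test data, so the gap between the two can be inspected directly. Only the last chunk may be shorter than the minimum, because the data simply ends there.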