document archive limitation, #1452

Thomas Waldmann 9 years ago
Parent
commit
c834b2969c
2 changed files with 41 additions and 2 deletions
  1. 11 0
      docs/faq.rst
  2. 30 2
      docs/internals.rst

+ 11 - 0
docs/faq.rst

@@ -62,6 +62,17 @@ Which file types, attributes, etc. are *not* preserved?
       holes in a sparse file.
     * filesystem specific attributes, like ext4 immutable bit, see :issue:`618`.
 
+Are there other known limitations?
+----------------------------------
+
+- A single archive can only reference a limited volume of file/dir metadata,
+  usually corresponding to tens or hundreds of millions of files/dirs.
+  When trying to go beyond that limit, you will get a fatal IntegrityError
+  exception stating that the (archive) object is too big.
+  An easy workaround is to create multiple archives with fewer items each.
+  See also :ref:`archive_limitation` and :issue:`1452`.
+
+
 Why is my backup bigger than with attic? Why doesn't |project_name| do compression by default?
 ----------------------------------------------------------------------------------------------
 

+ 30 - 2
docs/internals.rst

@@ -160,12 +160,40 @@ object that contains:
 
 * version
 * name
-* list of chunks containing item metadata
+* list of chunks containing item metadata (size: count * ~40B)
 * cmdline
 * hostname
 * username
 * time
 
+.. _archive_limitation:
+
+Note about archive limitations
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The archive is currently stored as a single object in the repository
+and is thus limited in size to MAX_OBJECT_SIZE (20 MiB).
+
+As one chunk list entry is ~40 B, that means we can reference ~500,000 item
+metadata stream chunks per archive.
+
+Each item metadata stream chunk is ~128 KiB (see hardcoded ITEMS_CHUNKER_PARAMS).
+
+This means the whole item metadata stream is limited to ~64 GiB of chunks.
+If compression is used, the amount of storable metadata is larger, by the
+compression factor.
+
+If the average size of an item entry is 100 B (small files, no ACLs/xattrs),
+that means a limit of ~640 million files/directories per archive.
+
+If the average size of an item entry is 2 kB (~100 MB files, or more
+ACLs/xattrs), the limit will be ~32 million files/directories per archive.
+
+If one tries to create an archive object bigger than MAX_OBJECT_SIZE, a fatal
+IntegrityError will be raised.
+
+A workaround is to create multiple archives with fewer items each, see
+also :issue:`1452`.
 
 The Item
 --------
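
The limits quoted in the note added to docs/internals.rst above follow from simple arithmetic. The following is a minimal sketch in Python, assuming the approximate figures given in the note (20 MiB object limit, ~40 B per chunk list entry, ~128 KiB per item metadata chunk). Apart from MAX_OBJECT_SIZE, the constant and function names are illustrative and are not taken from the project's source code.

```python
# Rough estimate of how many items one archive can reference.
# All sizes are the approximate values from the documentation note.

MAX_OBJECT_SIZE = 20 * 1024 * 1024  # 20 MiB limit for the archive object
CHUNK_LIST_ENTRY_SIZE = 40          # ~40 B per chunk list entry (hypothetical name)
ITEMS_CHUNK_SIZE = 128 * 1024       # ~128 KiB per item metadata chunk (hypothetical name)


def max_items_per_archive(avg_item_size, compression_factor=1.0):
    """Return the approximate number of items that fit into one archive.

    avg_item_size: average size of one item entry in bytes.
    compression_factor: how much the item metadata shrinks when compressed
    (1.0 means no compression).
    """
    # Number of metadata chunks the archive object can reference.
    max_chunks = MAX_OBJECT_SIZE // CHUNK_LIST_ENTRY_SIZE
    # Total (uncompressed) item metadata those chunks can hold.
    max_metadata_bytes = max_chunks * ITEMS_CHUNK_SIZE * compression_factor
    return int(max_metadata_bytes // avg_item_size)


if __name__ == "__main__":
    print("chunks per archive:", MAX_OBJECT_SIZE // CHUNK_LIST_ENTRY_SIZE)
    print("items at 100 B each:", max_items_per_archive(100))
    print("items at 2 kB each:", max_items_per_archive(2000))
```

With exact binary units this gives somewhat higher figures (roughly 687 million and 34 million items) than the rounded ~640 million and ~32 million in the note, presumably because the note treats ~64 GiB as 64 * 10^9 bytes. Either way, the order of magnitude is the same.
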
@@ -174,7 +202,7 @@ Each item represents a file, directory or other fs item and is stored as an
 ``item`` dictionary that contains:
 
 * path
-* list of data chunks
+* list of data chunks (size: count * ~40B)
 * user
 * group
 * uid