docs: add new AEAD modes to security docs

Thomas Waldmann 3 years ago
parent
commit
f4a6ad080b
2 changed files with 113 additions and 3 deletions
  1. 28 0
      docs/internals/data-structures.rst
  2. 85 3
      docs/internals/security.rst

+ 28 - 0
docs/internals/data-structures.rst

@@ -865,6 +865,31 @@ Encryption
 
 
 .. seealso:: The :ref:`borgcrypto` section for an in-depth review.
 
 
+AEAD modes
+~~~~~~~~~~
+
+These modes use modern AEAD ciphers: AES-OCB or CHACHA20-POLY1305.
+For each borg invocation, a new session key is derived from the borg key material
+and the 48-bit IV starts from 0 again (both ciphers internally add a 32-bit counter
+to our IV, so we just count up by 1 per chunk).
+
+The chunk layout is best seen at the bottom of this diagram:
+
+.. figure:: encryption-aead.png
+    :figwidth: 100%
+    :width: 100%
+
+No special IV/counter management is needed here due to the use of session keys.
+
+A 48-bit IV is far more than needed: if you only backed up 4 KiB chunks (2^12 B),
+the IV would "limit" the data encrypted in one session to 2^(12+48) B = 2^60 B
+(1 EiB, about 1.15 exabytes), meaning you would run into other limitations
+(RAM, storage, time) long before that.
+In practice, chunks are usually bigger, and for big files much bigger, giving an
+even higher limit.
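
The bound on data per session can be checked with a few lines of Python. This is only a sketch of the arithmetic: the 4 KiB chunk size is the deliberately pessimistic figure from the text, not a borg setting, and the constant names are made up for illustration.

```python
# Upper bound on data encrypted in one session with a 48-bit per-chunk IV.
IV_BITS = 48            # IV width described in the text
CHUNK_SIZE = 2 ** 12    # 4 KiB, the pessimistic chunk size assumed in the text

max_chunks = 2 ** IV_BITS              # one IV value is consumed per chunk
max_bytes = max_chunks * CHUNK_SIZE    # 2**60 bytes

print(max_bytes == 2 ** 60)            # True
print(max_bytes // 2 ** 60)            # 1 (EiB)
print(round(max_bytes / 10 ** 18, 2))  # 1.15 (decimal exabytes)
```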
+
+Legacy modes
+~~~~~~~~~~~~
+
 AES_-256 is used in CTR mode (so no need for padding). A 64 bit initialization
 vector is used, a MAC is computed on the encrypted chunk
 and both are stored in the chunk. Encryption and MAC use two different keys.
@@ -884,6 +909,9 @@ To reduce payload size, only 8 bytes of the 16 bytes nonce is saved in the
 payload, the first 8 bytes are always zeros. This does not affect security but
 limits the maximum repository capacity to only 295 exabytes (2**64 * 16 bytes).
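
The legacy capacity figure follows from the stored 64-bit counter and the 16-byte AES block size. A minimal check of that arithmetic (constant names are illustrative only):

```python
# Legacy AES-CTR: 8 stored nonce/counter bytes, one counter value per 16-byte AES block.
COUNTER_BITS = 64
AES_BLOCK_SIZE = 16  # bytes

legacy_max_bytes = 2 ** COUNTER_BITS * AES_BLOCK_SIZE  # 2**68 bytes
print(round(legacy_max_bytes / 10 ** 18))              # 295 (decimal exabytes)
```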
 
 
+Both modes
+~~~~~~~~~~
+
 Encryption keys (and other secrets) are kept either in a key file on the client
 ('keyfile' mode) or in the repository config on the server ('repokey' mode).
 In both cases, the secrets are generated from random and then encrypted by a

+ 85 - 3
docs/internals/security.rst

@@ -124,7 +124,88 @@ prompt is a set BORG_PASSPHRASE. See issue :issue:`2169` for details.
 Encryption
 ----------
 
 
-Encryption is currently based on the Encrypt-then-MAC construction,
+AEAD modes
+~~~~~~~~~~
+
+Modes: --encryption (repokey|keyfile)-[blake2-](aes-ocb|chacha20-poly1305)
+
+Supported: borg 1.3+
+
+Encryption with these modes is based on AEAD ciphers (authenticated encryption
+with associated data) and session keys.
+
+Depending on the chosen mode (see :ref:`borg_init`) different AEAD ciphers are used:
+
+- AES-256-OCB - super fast, single-pass algorithm IF you have hw accelerated AES.
+- chacha20-poly1305 - very fast, purely software based AEAD cipher.
+
+The chunk ID is derived via a MAC over the plaintext (mac key taken from borg key):
+
+- HMAC-SHA256 - super fast IF you have hw accelerated SHA256.
+- Blake2b - very fast, purely software based algorithm.
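
As a rough illustration of the two chunk ID options, the following sketch computes a 256-bit ID with Python's standard library. The function names are hypothetical, and the keyed-BLAKE2b variant uses `hashlib`'s built-in `key` parameter, which is an assumption here: borg's own way of keying BLAKE2b may differ.

```python
import hashlib
import hmac


def chunk_id_hmac_sha256(id_key: bytes, data: bytes) -> bytes:
    """MAC over the plaintext chunk using HMAC-SHA256 (32-byte result)."""
    return hmac.new(id_key, data, hashlib.sha256).digest()


def chunk_id_blake2b(id_key: bytes, data: bytes) -> bytes:
    """MAC over the plaintext chunk using keyed BLAKE2b, truncated to 256 bits.

    Assumption: hashlib's keyed mode (key of at most 64 bytes) stands in for
    whatever keying scheme borg actually uses.
    """
    return hashlib.blake2b(data, key=id_key, digest_size=32).digest()


id_key = bytes(32)  # placeholder key, for illustration only
print(len(chunk_id_hmac_sha256(id_key, b"chunk data")))  # 32
print(len(chunk_id_blake2b(id_key, b"chunk data")))      # 32
```

Because the ID depends on a secret key, the same plaintext yields different IDs in repositories with different keys.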
+
+For each borg invocation, a new session id is generated by `os.urandom`_.
+
+From that session id, the initial key material (ikm, taken from the borg key)
+and an application and cipher specific salt, borg derives a session key via HKDF.
+
+For each session key, IVs (nonces) are generated by a counter which increments for
+each encrypted message.
+
+Session::
+
+    sessionid = os.urandom(24)
+    ikm = enc_key || enc_hmac_key
+    salt = "borg-session-key-CIPHERNAME"
+    session_key = HKDF(ikm, sessionid, salt)
+    message_iv = 0
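
The session setup can be sketched in Python using only the standard library. This is not borg's code: the `hkdf` helper is a plain RFC 5869 implementation, and the choice of SHA-512, the 32-byte output length, and the mapping of the session id to HKDF's `info` parameter (with the cipher-specific string as `salt`, as the pseudocode names it) are all assumptions made for illustration.

```python
import hashlib
import hmac
import os


def hkdf(ikm: bytes, salt: bytes, info: bytes, length: int = 32,
         hash_name: str = "sha512") -> bytes:
    """HKDF (RFC 5869): extract a PRK from ikm, then expand it to `length` bytes."""
    hash_len = hashlib.new(hash_name).digest_size
    if length > 255 * hash_len:
        raise ValueError("requested output too long")
    # Extract step: an empty salt defaults to hash_len zero bytes.
    prk = hmac.new(salt or bytes(hash_len), ikm, hash_name).digest()
    # Expand step: chain HMAC blocks until enough output is available.
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hash_name).digest()
        okm += block
        counter += 1
    return okm[:length]


def derive_session_key(ikm: bytes, sessionid: bytes, cipher_name: str) -> bytes:
    """Hypothetical helper: derive a 256-bit session key for one borg invocation."""
    salt = b"borg-session-key-" + cipher_name.encode()
    return hkdf(ikm=ikm, salt=salt, info=sessionid, length=32)


sessionid = os.urandom(24)   # fresh per invocation
ikm = os.urandom(64)         # stand-in for enc_key || enc_hmac_key
session_key = derive_session_key(ikm, sessionid, "CIPHERNAME")
message_iv = 0               # per-session counter restarts at zero
```

Since the session id is random and 192 bits long, two invocations are overwhelmingly unlikely to derive the same session key, which is why the IV can restart at zero.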
+
+Encryption::
+
+    id = MAC(id_key, data)
+    compressed = compress(data)
+
+    header = type-byte || 00h || message_iv || sessionid
+    aad = id || header
+    encrypted, auth_tag = AEAD_encrypt(session_key, message_iv, compressed, aad)
+    authenticated = header || auth_tag || encrypted
+    message_iv++
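
To make the header layout concrete, here is a small sketch that packs and unpacks `type-byte || 00h || message_iv || sessionid`. The field widths follow the text (48-bit IV, 24-byte session id); the big-endian byte order, the function names, and the example type byte value are assumptions for illustration only.

```python
HEADER_LEN = 1 + 1 + 6 + 24  # type byte, reserved 00h, 48-bit IV, session id


def build_header(type_byte: int, message_iv: int, sessionid: bytes) -> bytes:
    """Pack the per-chunk header that is authenticated as part of the AAD."""
    if not 0 <= message_iv < 2 ** 48:
        raise ValueError("message_iv must fit into 48 bits")
    if len(sessionid) != 24:
        raise ValueError("sessionid must be 24 bytes")
    # Assumption: the IV is serialized big-endian.
    return bytes([type_byte, 0x00]) + message_iv.to_bytes(6, "big") + sessionid


def split_header(header: bytes):
    """Inverse of build_header: return (type_byte, message_iv, sessionid)."""
    if len(header) != HEADER_LEN:
        raise ValueError("unexpected header length")
    return header[0], int.from_bytes(header[2:8], "big"), header[8:]


header = build_header(0x05, 1, bytes(24))  # 0x05 is a made-up type byte
print(len(header))                         # 32
```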
+
+Decryption::
+
+    # Given: input *authenticated* data and a *chunk-id* to assert
+    header, auth_tag, encrypted = SPLIT(authenticated)
+    type-byte, 00h, past_message_iv, past_sessionid = SPLIT(header)
+
+    ASSERT(type-byte is correct)
+
+    past_key = HKDF(ikm, past_sessionid, salt)
+    aad = chunk-id || header
+    decrypted = AEAD_decrypt(past_key, past_message_iv, encrypted, auth_tag, aad)
+
+    decompressed = decompress(decrypted)
+
+    ASSERT( CONSTANT-TIME-COMPARISON( chunk-id, MAC(id_key, decompressed) ) )
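
The final assertion of the decryption path can be sketched as follows. The sketch assumes HMAC-SHA256 as the ID MAC and uses `zlib` as a stand-in compressor; the function name is hypothetical and the AEAD decryption step itself is left out.

```python
import hashlib
import hmac
import zlib


def verify_chunk(id_key: bytes, chunk_id: bytes, decrypted: bytes) -> bytes:
    """Decompress the decrypted payload and check it against the expected chunk id."""
    data = zlib.decompress(decrypted)  # stand-in for borg's decompressor
    expected = hmac.new(id_key, data, hashlib.sha256).digest()
    # compare_digest runs in constant time with respect to the contents.
    if not hmac.compare_digest(chunk_id, expected):
        raise ValueError("chunk id does not match decrypted data")
    return data


id_key = bytes(32)  # placeholder key, for illustration only
plaintext = b"example chunk"
chunk_id = hmac.new(id_key, plaintext, hashlib.sha256).digest()
print(verify_chunk(id_key, chunk_id, zlib.compress(plaintext)) == plaintext)  # True
```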
+
+Notable:
+
+- More modern and often faster AEAD ciphers instead of a self-assembled construction.
+- Due to the usage of session keys, IVs (nonces) do not need special care here as
+  they did for the legacy encryption modes.
+- The id is now also input into the authentication tag computation.
+  This strongly associates the id with the written data (== associates the key with
+  the value). When later reading the data for some id, authentication will only
+  succeed if what we get was really written by us for that id.
+
+
+Legacy modes
+~~~~~~~~~~~~
+
+Modes: --encryption (repokey|keyfile)[-blake2]
+
+Supported: all borg versions, blake2 since 1.1
+
+DEPRECATED. We strongly suggest you use the safer AEAD modes, see above.
+
+Encryption with these modes is based on the Encrypt-then-MAC construction,
 which is generally seen as the most robust way to create an authenticated
 encryption scheme from encryption and message authentication primitives.
 
 
@@ -253,7 +334,7 @@ Implementations used
 We do not implement cryptographic primitives ourselves, but rely
 on widely used libraries providing them:
 
 
-- AES-CTR and HMAC-SHA-256 from OpenSSL 1.0 / 1.1 are used,
+- AES-CTR, AES-OCB, CHACHA20-POLY1305 and HMAC-SHA-256 from OpenSSL 1.1 are used,
   which is also linked into the static binaries we provide.
   We think this is not an additional risk, since we don't ever
   use OpenSSL's networking, TLS or X.509 code, but only their
@@ -268,7 +349,8 @@ on widely used libraries providing them:
 
 
 Implemented cryptographic constructions are:
 
 
-- Encrypt-then-MAC based on AES-256-CTR and either HMAC-SHA-256
+- AEAD modes: AES-OCB and CHACHA20-POLY1305 are straight from OpenSSL.
+- Legacy modes: Encrypt-then-MAC based on AES-256-CTR and either HMAC-SHA-256
   or keyed BLAKE2b256 as described above under Encryption_.
 - Encrypt-and-MAC based on AES-256-CTR and HMAC-SHA-256
   as described above under `Offline key security`_.