Переглянути джерело

Merge pull request #6425 from m3nu/issue/6305/performance-faq

Add discussion and note on file attrs.
TW 3 роки тому
батько
коміт
68d5e33b06
1 змінених файлів з 50 додано та 10 видалено
  1. 50 10
      docs/faq.rst

+ 50 - 10
docs/faq.rst

@@ -880,13 +880,30 @@ If you run into that, try this:
 What's the expected backup performance?
 What's the expected backup performance?
 ---------------------------------------
 ---------------------------------------
 
 
-A first backup will usually be somehow "slow" because there is a lot of data
-to process. Performance here depends on a lot of factors, so it is hard to
-give specific numbers.
+Compared to simply copying files (e.g. with ``rsync``), Borg has more work to do.
+This can make creation of the first archive slower, but saves time 
+and disk space on subsequent runs. Here what Borg does when you run ``borg create``:
+
+- Borg chunks the file (using the relatively expensive buzhash algorithm)
+- It then computes the "id" of the chunk (hmac-sha256 (often slow, except 
+  if your CPU has sha256 acceleration) or blake2b (fast, in software))
+- Then it checks whether this chunk is already in the repo (local hashtable lookup, 
+  fast). If so, the processing of the chunk is completed here. Otherwise it needs to 
+  process the chunk:
+- Compresses (the default lz4 is super fast)
+- Encrypts (AES, usually fast if your CPU has AES acceleration as usual
+  since about 10y)
+- Authenticates ("signs") using hmac-sha256 or blake2b (see above),
+- Transmits to repo. If the repo is remote, this usually involves an SSH connection
+  (does its own encryption / authentication).
+- Stores the chunk into a key/value store (the key is the chunk id, the value 
+  is the data). While doing that, it computes a CRC32 of the data (repo low-level
+  checksum, used by borg check --repository) and also updates the repo index 
+  (another hashtable).
 
 
 Subsequent backups are usually very fast if most files are unchanged and only
 Subsequent backups are usually very fast if most files are unchanged and only
 a few are new or modified. The high performance on unchanged files primarily depends
 a few are new or modified. The high performance on unchanged files primarily depends
-only on a few factors (like fs recursion + metadata reading performance and the
+only on a few factors (like FS recursion + metadata reading performance and the
 files cache working as expected) and much less on other factors.
 files cache working as expected) and much less on other factors.
 
 
 E.g., for this setup:
 E.g., for this setup:
@@ -904,14 +921,37 @@ few FAQ entries below.
 
 
 .. _slow_backup:
 .. _slow_backup:
 
 
-Why is backup slow for me?
+Why is my backup so slow?
 --------------------------
 --------------------------
 
 
-So, if you feel your Borg backup is too slow somehow, you should find out why.
-
-The usual way to approach this is to add ``--list --filter=AME --stats`` to your
-``borg create`` call to produce more log output, including a file list (with file status
-characters) and also some statistics at the end of the backup.
+If you feel your Borg backup is too slow somehow, here is what you can do:
+
+- Make sure Borg has enough RAM (depends on how big your repo is / how many
+  files you have)
+- Use one of the blake2 modes for --encryption except if you positively know 
+  your CPU (and openssl) accelerates sha256 (then stay with hmac-sha256).
+- Don't use any expensive compression. The default is lz4 and super fast.
+  Uncompressed is often slower than lz4.
+- Just wait. You can also interrupt it and start it again as often as you like,
+  it will converge against a valid "completed" state (see ``--checkpoint-interval``,
+  maybe use the default, but in any case don't make it too short). It is starting
+  from the beginning each time, but it is still faster then as it does not store 
+  data into the repo which it already has there from last checkpoint.
+- If you don’t need additional file attributes, you can disable them with ``--noflags``,
+  ``--noacls``, ``--noxattrs``. This can lead to noticable performance improvements
+  when your backup consists of many small files.
+
+If you feel that Borg "freezes" on a file, it could be in the middle of processing a
+large file (like ISOs or VM images). Borg < 1.2 announces file names *after* finishing
+with the file. This can lead to displaying the name of a small file, while processing the
+next (larger) file. For very big files this can lead to the progress display show some
+previous short file for a long time while it processes the big one. With Borg 1.2 this 
+was changed to announcing the filename before starting to process it.
+
+To see what files have changed and take more time processing, you can also add
+``--list --filter=AME --stats`` to your ``borg create`` call to produce more log output, 
+including a file list (with file status characters) and also some statistics at 
+the end of the backup.
 
 
 Then you do the backup and look at the log output:
 Then you do the backup and look at the log output: