
Resource Usage
~~~~~~~~~~~~~~

Borg might use a lot of resources depending on the size of the data set it is
dealing with.

If one uses Borg in a client/server way (with an ssh: repository), the
resource usage occurs partly on the client and partly on the server.

If one uses Borg as a single process (with a filesystem repo), all the
resource usage occurs in that one process, so just add up client + server
to get the approximate resource usage.
CPU client:
    - **borg create:** does chunking, hashing, compression, crypto (high CPU usage)
    - **chunks cache sync:** quite heavy on CPU, doing lots of hashtable operations.
    - **borg extract:** crypto, decompression (medium to high CPU usage)
    - **borg check:** similar to extract, but depends on options given.
    - **borg prune / borg delete archive:** low to medium CPU usage
    - **borg delete repo:** done on the server

    It won't go beyond 100% of 1 core as the code is currently single-threaded.
    Especially higher zlib and lzma compression levels use significant amounts
    of CPU cycles. Crypto might be cheap on the CPU (if hardware accelerated) or
    expensive (if not).
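The CPU/size tradeoff of compression levels can be seen with Python's stdlib
``zlib`` and ``lzma`` modules (the sample data and levels below are
illustrative only; Borg's own compression is selected via its create options):

```python
import lzma
import zlib

# Highly compressible sample data, standing in for typical backup input.
data = b"backup payload " * 10_000

# Higher zlib levels trade CPU time for a smaller result.
fast = zlib.compress(data, 1)    # cheap on CPU, larger output
small = zlib.compress(data, 9)   # more CPU, smaller output

# lzma (xz) at a high preset compresses harder still, at an even higher
# CPU (and RAM) cost.
smallest = lzma.compress(data, preset=9)

print(len(data), len(fast), len(small), len(smallest))
```

The same pattern holds for Borg: picking a higher compression level mainly
shifts cost from network/storage to client CPU.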
CPU server:
    It usually doesn't need much CPU, it just deals with the key/value store
    (repository) and uses the repository index for that.

    - **borg check:** the repository check computes the checksums of all chunks
      (medium CPU usage)
    - **borg delete repo:** low CPU usage
CPU (only for client/server operation):
    When using borg in a client/server way with an ssh:-type repo, the ssh
    processes used for the transport layer will need some CPU on the client and
    on the server due to the crypto they are doing - esp. if you are pumping
    big amounts of data.
Memory (RAM) client:
    The chunks index and the files index are read into memory for performance
    reasons. Might need big amounts of memory (see below).

    Compression, esp. lzma compression with high levels might need substantial
    amounts of memory.

Memory (RAM) server:
    The server process will load the repository index into memory. Might need
    considerable amounts of memory, but less than on the client (see below).
Chunks index (client only):
    Proportional to the number of data chunks in your repo. Lots of chunks
    in your repo imply a big chunks index.
    It is possible to tweak the chunker params (see create options).
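As a rough sketch of that proportionality (the per-entry byte cost below is an
assumed illustrative number, not taken from the Borg docs; the ~2 MiB average
chunk size follows from the default chunker params, min 2**19 / max 2**23 /
mask 21):

```python
TARGET_CHUNK_SIZE = 2 ** 21   # ~2 MiB average chunk with default chunker params
ENTRY_BYTES = 40              # assumed per-entry index cost, illustration only

def estimated_index_bytes(total_data_bytes, target_chunk_size=TARGET_CHUNK_SIZE):
    """Back-of-the-envelope size of an index proportional to chunk count."""
    num_chunks = max(1, total_data_bytes // target_chunk_size)
    return num_chunks * ENTRY_BYTES

# 1 TiB of unique data -> ~512 Ki chunks -> ~20 MiB of index (under the
# assumptions above).
print(estimated_index_bytes(2 ** 40))
```

Lowering the target chunk size via the chunker params multiplies the chunk
count, and with it the index size - which is why tweaking them matters here.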
Files index (client only):
    Proportional to the number of files in your last backups. Can be switched
    off (see create options), but next backup might be much slower if you do.
    The speed benefit of using the files cache is proportional to file size.
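A conceptual sketch of how such a files cache avoids re-reading unchanged
files (this is not Borg's actual implementation; the names and the exact stat
fields cached are illustrative):

```python
import os

# Maps path -> (mtime_ns, size, inode, chunk_ids) as seen at the last backup.
files_cache = {}

def needs_rechunk(path):
    """True if the file must be read, chunked and hashed again."""
    st = os.stat(path)
    entry = files_cache.get(path)
    if entry and entry[:3] == (st.st_mtime_ns, st.st_size, st.st_ino):
        return False  # unchanged: reuse the cached chunk ids
    return True       # new or changed file

def remember(path, chunk_ids):
    """Record the file's stat data and resulting chunk ids after backup."""
    st = os.stat(path)
    files_cache[path] = (st.st_mtime_ns, st.st_size, st.st_ino, chunk_ids)
```

Skipping the read entirely is why the saving grows with file size - and why
disabling the cache forces every file to be re-read on the next backup.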
Repository index (server only):
    Proportional to the number of data chunks in your repo. Lots of chunks
    in your repo imply a big repository index.
    It is possible to tweak the chunker params (see create options) to
    influence the number of chunks being created.
Temporary files (client):
    Reading data and metadata from a FUSE mounted repository will consume up to
    the size of all deduplicated, small chunks in the repository. Big chunks
    won't be locally cached.
Temporary files (server):
    A non-trivial amount of data will be stored in the remote temp directory
    for each client that connects to it. For some remotes, this can fill the
    default temporary directory at /tmp. This can be remedied by ensuring the
    $TMPDIR, $TEMP, or $TMP environment variable is properly set for the sshd
    process.
    For some OSes, this can be done just by setting the correct value in the
    .bashrc (or equivalent login config file for other shells). In other
    cases, it may be necessary to first enable ``PermitUserEnvironment yes``
    in your ``sshd_config`` file, then add ``environment="TMPDIR=/my/big/tmpdir"``
    at the start of the public key to be used in the ``authorized_keys`` file.
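Sketched as configuration fragments (the tmpdir path and key material are
placeholders):

```
# /etc/ssh/sshd_config -- allow authorized_keys entries to set
# environment variables for their sessions:
PermitUserEnvironment yes

# ~/.ssh/authorized_keys on the server -- the environment option is
# prefixed to the client's public key (key material shortened):
environment="TMPDIR=/my/big/tmpdir" ssh-ed25519 AAAA... client@example
```

Remember to restart/reload sshd after changing ``sshd_config``.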
Cache files (client only):
    Contains the chunks index and files index (plus a collection of
    single-archive chunk indexes which might need huge amounts of disk space,
    depending on archive count and size - see FAQ about how to reduce).
Network (only for client/server operation):
    If your repository is remote, all deduplicated (and optionally compressed/
    encrypted) data of course has to go over the connection (``ssh://`` repo
    url). If you use a locally mounted network filesystem, additionally some
    copy operations used for transaction support also go over the connection.
    If you back up multiple sources to one target repository, additional
    traffic happens for cache resynchronization.