| 123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186187188189190191192193194195196197198199200201202203204205206207208209210211212213214215216217218219220221222223224225226227228229230231232233234235236237238239240241242243244245246247248249250251252253254255256257258259260261262263264265266267268269270271272273274275276277278279280281282283284285286287288289290291292293294295296297298299300301302303304305306307308309310311312313314315316317318319320321322323324325326327328329330331332333334335336337338339340341342343344345346347348349350351352353354355356357358359360361362363364365366367368369370371372373374375376377378379380381382383384385386387388389390391392393394395396397398399400401402403404405406407408409410411412413414415416417418419420421422423424425426427428429430431432433434435436437438439440441442443444445446447448449450451452453454455456457458459460461462463464465466467468469470471472473474475476477478479480481482483484485486487488489490491492493494495496497498499500501502503504505506507508509510511512513514 | 
							- .. include:: global.rst.inc
 
- .. highlight:: none
 
- .. _faq:
 
- Frequently asked questions
 
- ==========================
 
- Can I backup VM disk images?
 
- ----------------------------
 
- Yes, the `deduplication`_ technique used by
 
- |project_name| makes sure only the modified parts of the file are stored.
 
- Also, we have optional simple sparse file support for extract.
 
- If you use non-snapshotting backup tools like Borg to back up virtual machines,
 
- then these should be turned off for doing so. Backing up live VMs this way can (and will)
 
- result in corrupted or inconsistent backup contents: a VM image is just a regular file to
 
- Borg with the same issues as regular files when it comes to concurrent reading and writing from
 
- the same file.
 
- For backing up live VMs use file system snapshots on the VM host, which establishes
 
- crash-consistency for the VM images. This means that with most file systems
 
- (that are journaling) the FS will always be fine in the backup (but may need a
 
- journal replay to become accessible).
 
- Usually this does not mean that file *contents* on the VM are consistent, since file
 
- contents are normally not journaled. Notable exceptions are ext4 in data=journal mode,
 
- ZFS and btrfs (unless nodatacow is used).
 
- Applications designed with crash-consistency in mind (most relational databases
 
- like PostgreSQL, SQLite etc. but also for example Borg repositories) should always
 
- be able to recover to a consistent state from a backup created with
 
- crash-consistent snapshots (even on ext4 with data=writeback or XFS).
 
- Hypervisor snapshots capturing most of the VM's state can also be used for backups
 
- and can be a better alternative to pure file system based snapshots of the VM's disk,
 
- since no state is lost. Depending on the application this can be the easiest and most
 
- reliable way to create application-consistent backups.
 
- Other applications may require a lot of work to reach application-consistency:
 
- It's a broad and complex issue that cannot be explained in entirety here.
 
- Borg doesn't intend to address these issues due to their huge complexity
 
- and platform/software dependency. Combining Borg with the mechanisms provided
 
- by the platform (snapshots, hypervisor features) will be the best approach
 
- to start tackling them.
 
- Can I backup from multiple servers into a single repository?
 
- ------------------------------------------------------------
 
- Yes, but in order for the deduplication used by |project_name| to work, it
 
- needs to keep a local cache containing checksums of all file
 
- chunks already stored in the repository. This cache is stored in
 
- ``~/.cache/borg/``.  If |project_name| detects that a repository has been
 
- modified since the local cache was updated it will need to rebuild
 
- the cache. This rebuild can be quite time consuming.
 
- So, yes it's possible. But it will be most efficient if a single
 
- repository is only modified from one place. Also keep in mind that
 
- |project_name| will keep an exclusive lock on the repository while creating
 
- or deleting archives, which may make *simultaneous* backups fail.
 
- Can I copy or synchronize my repo to another location?
 
- ------------------------------------------------------
 
- Yes, you could just copy all the files. Make sure you do that while no
 
- backup is running. So what you get here is this:
 
- - client machine ---borg create---> repo1
 
- - repo1 ---copy---> repo2
 
- There is no special borg command to do the copying, just use cp or rsync if
 
- you want to do that.
 
- But think about whether that is really what you want. If something goes
 
- wrong in repo1, you will have the same issue in repo2 after the copy.
 
- If you want to have 2 independent backups, it is better to do it like this:
 
- - client machine ---borg create---> repo1
 
- - client machine ---borg create---> repo2
 
- Which file types, attributes, etc. are *not* preserved?
 
- -------------------------------------------------------
 
-     * UNIX domain sockets (because it does not make sense - they are
 
-       meaningless without the running process that created them and the process
 
-       needs to recreate them in any case). So, don't panic if your backup
 
-       misses a UDS!
 
-     * The precise on-disk (or rather: not-on-disk) representation of the holes
 
-       in a sparse file.
 
-       Archive creation has no special support for sparse files, holes are
 
-       backed up as (deduplicated and compressed) runs of zero bytes.
 
-       Archive extraction has optional support to extract all-zero chunks as
 
-       holes in a sparse file.
 
-     * filesystem specific attributes, like ext4 immutable bit, see :issue:`618`.
 
- Are there other known limitations?
 
- ----------------------------------
 
- - A single archive can only reference a limited volume of file/dir metadata,
 
-   usually corresponding to tens or hundreds of millions of files/dirs.
 
-   When trying to go beyond that limit, you will get a fatal IntegrityError
 
-   exception telling that the (archive) object is too big.
 
-   An easy workaround is to create multiple archives with less items each.
 
-   See also the :ref:`archive_limitation` and :issue:`1452`.
 
- Why is my backup bigger than with attic? Why doesn't |project_name| do compression by default?
 
- ----------------------------------------------------------------------------------------------
 
- Attic was rather unflexible when it comes to compression, it always
 
- compressed using zlib level 6 (no way to switch compression off or
 
- adjust the level or algorithm).
 
- |project_name| offers a lot of different compression algorithms and
 
- levels. Which of them is the best for you pretty much depends on your
 
- use case, your data, your hardware -- so you need to do an informed
 
- decision about whether you want to use compression, which algorithm
 
- and which level you want to use. This is why compression defaults to
 
- none.
 
- How can I specify the encryption passphrase programmatically?
 
- -------------------------------------------------------------
 
- The encryption passphrase can be specified programmatically using the
 
- `BORG_PASSPHRASE` environment variable. This is convenient when setting up
 
- automated encrypted backups. Another option is to use
 
- key file based encryption with a blank passphrase. See
 
- :ref:`encrypted_repos` for more details.
 
- .. _password_env:
 
- .. note:: Be careful how you set the environment; using the ``env``
 
-           command, a ``system()`` call or using inline shell scripts
 
-           might expose the credentials in the process list directly
 
-           and they will be readable to all users on a system. Using
 
-           ``export`` in a shell script file should be safe, however, as
 
-           the environment of a process is `accessible only to that
 
-           user
 
-           <https://security.stackexchange.com/questions/14000/environment-variable-accessibility-in-linux/14009#14009>`_.
 
- When backing up to remote encrypted repos, is encryption done locally?
 
- ----------------------------------------------------------------------
 
- Yes, file and directory metadata and data is locally encrypted, before
 
- leaving the local machine. We do not mean the transport layer encryption
 
- by that, but the data/metadata itself. Transport layer encryption (e.g.
 
- when ssh is used as a transport) applies additionally.
 
- When backing up to remote servers, do I have to trust the remote server?
 
- ------------------------------------------------------------------------
 
- Yes and No.
 
- No, as far as data confidentiality is concerned - if you use encryption,
 
- all your files/dirs data and metadata are stored in their encrypted form
 
- into the repository.
 
- Yes, as an attacker with access to the remote server could delete (or
 
- otherwise make unavailable) all your backups.
 
- How can I protect against a hacked backup client?
 
- -------------------------------------------------
 
- Assume you backup your backup client machine C to the backup server S and
 
- C gets hacked. In a simple push setup, the attacker could then use borg on
 
- C to delete all backups residing on S.
 
- These are your options to protect against that:
 
- - Do not allow to permanently delete data from the repo, see :ref:`append_only_mode`.
 
- - Use a pull-mode setup using ``ssh -R``, see :issue:`900`.
 
- - Mount C's filesystem on another machine and then create a backup of it.
 
- - Do not give C filesystem-level access to S.
 
- How can I protect against a hacked backup server?
 
- -------------------------------------------------
 
- Just in case you got the impression that pull-mode backups are way more safe
 
- than push-mode, you also need to consider the case that your backup server S
 
- gets hacked. In case S has access to a lot of clients C, that might bring you
 
- into even bigger trouble than a hacked backup client in the previous FAQ entry.
 
- These are your options to protect against that:
 
- - Use the standard push-mode setup (see also previous FAQ entry).
 
- - Mount (the repo part of) S's filesystem on C.
 
- - Do not give S file-system level access to C.
 
- - Have your backup server at a well protected place (maybe not reachable from
 
-   the internet), configure it safely, apply security updates, monitor it, ...
 
- How can I protect against theft, sabotage, lightning, fire, ...?
 
- ----------------------------------------------------------------
 
- In general: if your only backup medium is nearby the backupped machine and
 
- always connected, you can easily get into trouble: they likely share the same
 
- fate if something goes really wrong.
 
- Thus:
 
- - have multiple backup media
 
- - have media disconnected from network, power, computer
 
- - have media at another place
 
- - have a relatively recent backup on your media
 
- Why do I get "connection closed by remote" after a while?
 
- ---------------------------------------------------------
 
- When doing a backup to a remote server (using a ssh: repo URL), it sometimes
 
- stops after a while (some minutes, hours, ... - not immediately) with
 
- "connection closed by remote" error message. Why?
 
- That's a good question and we are trying to find a good answer in :issue:`636`.
 
- Why am I seeing idle borg serve processes on the repo server?
 
- -------------------------------------------------------------
 
- Maybe the ssh connection between client and server broke down and that was not
 
- yet noticed on the server. Try these settings:
 
- ::
 
-     # /etc/ssh/sshd_config on borg repo server - kill connection to client
 
-     # after ClientAliveCountMax * ClientAliveInterval seconds with no response
 
-     ClientAliveInterval 20
 
-     ClientAliveCountMax 3
 
- If you have multiple borg create ... ; borg create ... commands in a already
 
- serialized way in a single script, you need to give them --lock-wait N (with N
 
- being a bit more than the time the server needs to terminate broken down
 
- connections and release the lock).
 
- The borg cache eats way too much disk space, what can I do?
 
- -----------------------------------------------------------
 
- There is a temporary (but maybe long lived) hack to avoid using lots of disk
 
- space for chunks.archive.d (see :issue:`235` for details):
 
- ::
 
-     # this assumes you are working with the same user as the backup.
 
-     # you can get the REPOID from the "config" file inside the repository.
 
-     cd ~/.cache/borg/<REPOID>
 
-     rm -rf chunks.archive.d ; touch chunks.archive.d
 
- This deletes all the cached archive chunk indexes and replaces the directory
 
- that kept them with a file, so borg won't be able to store anything "in" there
 
- in future.
 
- This has some pros and cons, though:
 
- - much less disk space needs for ~/.cache/borg.
 
- - chunk cache resyncs will be slower as it will have to transfer chunk usage
 
-   metadata for all archives from the repository (which might be slow if your
 
-   repo connection is slow) and it will also have to build the hashtables from
 
-   that data.
 
-   chunk cache resyncs happen e.g. if your repo was written to by another
 
-   machine (if you share same backup repo between multiple machines) or if
 
-   your local chunks cache was lost somehow.
 
- The long term plan to improve this is called "borgception", see :issue:`474`.
 
- If a backup stops mid-way, does the already-backed-up data stay there?
 
- ----------------------------------------------------------------------
 
- Yes, |project_name| supports resuming backups.
 
- During a backup a special checkpoint archive named ``<archive-name>.checkpoint``
 
- is saved every checkpoint interval (the default value for this is 5
 
- minutes) containing all the data backed-up until that point.
 
- Checkpoints only happen between files (so they don't help for interruptions
 
- happening while a very large file is being processed).
 
- This checkpoint archive is a valid archive (all files in it are valid and complete),
 
- but it is only a partial backup (not all files that you wanted to backup are
 
- contained in it). Having it in the repo until a successful, full backup is
 
- completed is useful because it references all the transmitted chunks up
 
- to the checkpoint. This means that in case of an interruption, you only need to
 
- retransfer the data since the last checkpoint.
 
- If a backup was interrupted, you do not need to do any special considerations,
 
- just invoke ``borg create`` as you always do. You may use the same archive name
 
- as in previous attempt or a different one (e.g. if you always include the current
 
- datetime), it does not matter.
 
- |project_name| always does full single-pass backups, so it will start again
 
- from the beginning - but it will be much faster, because some of the data was
 
- already stored into the repo (and is still referenced by the checkpoint
 
- archive), so it does not need to get transmitted and stored again.
 
- Once your backup has finished successfully, you can delete all
 
- ``<archive-name>.checkpoint`` archives.
 
- How can I backup huge file(s) over a unstable connection?
 
- ---------------------------------------------------------
 
- You can use this "split trick" as a workaround for the in-between-files-only
 
- checkpoints (see above), huge files and a instable connection to the repository:
 
- Split the huge file(s) into parts of manageable size (e.g. 100MB) and create
 
- a temporary archive of them. Borg will create checkpoints now more frequently
 
- than if you try to backup the files in their original form (e.g. 100GB).
 
- After that, you can remove the parts again and backup the huge file(s) in
 
- their original form. This will now work a lot faster as a lot of content chunks
 
- are already in the repository.
 
- After you have successfully backed up the huge original file(s), you can remove
 
- the temporary archive you made from the parts.
 
- We realize that this is just a better-than-nothing workaround, see :issue:`1198`
 
- for a potential solution.
 
- Please note that this workaround only helps you for backup, not for restore.
 
- If it crashes with a UnicodeError, what can I do?
 
- -------------------------------------------------
 
- Check if your encoding is set correctly. For most POSIX-like systems, try::
 
-   export LANG=en_US.UTF-8  # or similar, important is correct charset
 
- I can't extract non-ascii filenames by giving them on the commandline!?
 
- -----------------------------------------------------------------------
 
- This might be due to different ways to represent some characters in unicode
 
- or due to other non-ascii encoding issues.
 
- If you run into that, try this:
 
- - avoid the non-ascii characters on the commandline by e.g. extracting
 
-   the parent directory (or even everything)
 
- - mount the repo using FUSE and use some file manager
 
- Can |project_name| add redundancy to the backup data to deal with hardware malfunction?
 
- ---------------------------------------------------------------------------------------
 
- No, it can't. While that at first sounds like a good idea to defend against
 
- some defect HDD sectors or SSD flash blocks, dealing with this in a
 
- reliable way needs a lot of low-level storage layout information and
 
- control which we do not have (and also can't get, even if we wanted).
 
- So, if you need that, consider RAID or a filesystem that offers redundant
 
- storage or just make backups to different locations / different hardware.
 
- See also :issue:`225`.
 
- Can |project_name| verify data integrity of a backup archive?
 
- -------------------------------------------------------------
 
- Yes, if you want to detect accidental data damage (like bit rot), use the
 
- ``check`` operation. It will notice corruption using CRCs and hashes.
 
- If you want to be able to detect malicious tampering also, use an encrypted
 
- repo. It will then be able to check using CRCs and HMACs.
 
- .. _a_status_oddity:
 
- I am seeing 'A' (added) status for a unchanged file!?
 
- -----------------------------------------------------
 
- The files cache is used to determine whether |project_name| already
 
- "knows" / has backed up a file and if so, to skip the file from
 
- chunking. It does intentionally *not* contain files that have a modification
 
- time (mtime) same as the newest mtime in the created archive.
 
- So, if you see an 'A' status for unchanged file(s), they are likely the files
 
- with the most recent mtime in that archive.
 
- This is expected: it is to avoid data loss with files that are backed up from
 
- a snapshot and that are immediately changed after the snapshot (but within
 
- mtime granularity time, so the mtime would not change). Without the code that
 
- removes these files from the files cache, the change that happened right after
 
- the snapshot would not be contained in the next backup as |project_name| would
 
- think the file is unchanged.
 
- This does not affect deduplication, the file will be chunked, but as the chunks
 
- will often be the same and already stored in the repo (except in the above
 
- mentioned rare condition), it will just re-use them as usual and not store new
 
- data chunks.
 
- If you want to avoid unnecessary chunking, just create or touch a small or
 
- empty file in your backup source file set (so that one has the latest mtime,
 
- not your 50GB VM disk image) and, if you do snapshots, do the snapshot after
 
- that.
 
- Since only the files cache is used in the display of files status,
 
- those files are reported as being added when, really, chunks are
 
- already used.
 
- .. _always_chunking:
 
- It always chunks all my files, even unchanged ones!
 
- ---------------------------------------------------
 
- |project_name| maintains a files cache where it remembers the mtime, size and
 
- inode of files. When |project_name| does a new backup and starts processing a
 
- file, it first looks whether the file has changed (compared to the values
 
- stored in the files cache). If the values are the same, the file is assumed
 
- unchanged and thus its contents won't get chunked (again).
 
- |project_name| can't keep an infinite history of files of course, thus entries
 
- in the files cache have a "maximum time to live" which is set via the
 
- environment variable BORG_FILES_CACHE_TTL (and defaults to 20).
 
- Every time you do a backup (on the same machine, using the same user), the
 
- cache entries' ttl values of files that were not "seen" are incremented by 1
 
- and if they reach BORG_FILES_CACHE_TTL, the entry is removed from the cache.
 
- So, for example, if you do daily backups of 26 different data sets A, B,
 
- C, ..., Z on one machine (using the default TTL), the files from A will be
 
- already forgotten when you repeat the same backups on the next day and it
 
- will be slow because it would chunk all the files each time. If you set
 
- BORG_FILES_CACHE_TTL to at least 26 (or maybe even a small multiple of that),
 
- it would be much faster.
 
- Another possible reason is that files don't always have the same path, for
 
- example if you mount a filesystem without stable mount points for each backup.
 
- If the directory where you mount a filesystem is different every time,
 
- |project_name| assume they are different files.
 
- Is there a way to limit bandwidth with |project_name|?
 
- ------------------------------------------------------
 
- There is no command line option to limit bandwidth with |project_name|, but
 
- bandwidth limiting can be accomplished with pipeviewer_:
 
- Create a wrapper script:  /usr/local/bin/pv-wrapper  ::
 
-     #!/bin/bash
 
-         ## -q, --quiet              do not output any transfer information at all
 
-         ## -L, --rate-limit RATE    limit transfer to RATE bytes per second
 
-     export RATE=307200
 
-     pv -q -L $RATE  | "$@"
 
- Add BORG_RSH environment variable to use pipeviewer wrapper script with ssh. ::
 
-     export BORG_RSH='/usr/local/bin/pv-wrapper ssh'
 
- Now |project_name| will be bandwidth limited. Nice thing about pv is that you can change rate-limit on the fly: ::
 
-     pv -R $(pidof pv) -L 102400
 
- .. _pipeviewer: http://www.ivarch.com/programs/pv.shtml
 
- I am having troubles with some network/FUSE/special filesystem, why?
 
- --------------------------------------------------------------------
 
- |project_name| is doing nothing special in the filesystem, it only uses very
 
- common and compatible operations (even the locking is just "mkdir").
 
- So, if you are encountering issues like slowness, corruption or malfunction
 
- when using a specific filesystem, please try if you can reproduce the issues
 
- with a local (non-network) and proven filesystem (like ext4 on Linux).
 
- If you can't reproduce the issue then, you maybe have found an issue within
 
- the filesystem code you used (not with |project_name|). For this case, it is
 
- recommended that you talk to the developers / support of the network fs and
 
- maybe open an issue in their issue tracker. Do not file an issue in the
 
- |project_name| issue tracker.
 
- If you can reproduce the issue with the proven filesystem, please file an
 
- issue in the |project_name| issue tracker about that.
 
- Requirements for the borg single-file binary, esp. (g)libc?
 
- -----------------------------------------------------------
 
- We try to build the binary on old, but still supported systems - to keep the
 
- minimum requirement for the (g)libc low. The (g)libc can't be bundled into
 
- the binary as it needs to fit your kernel and OS, but Python and all other
 
- required libraries will be bundled into the binary.
 
- If your system fulfills the minimum (g)libc requirement (see the README that
 
- is released with the binary), there should be no problem. If you are slightly
 
- below the required version, maybe just try. Due to the dynamic loading (or not
 
- loading) of some shared libraries, it might still work depending on what
 
- libraries are actually loaded and used.
 
- In the borg git repository, there is scripts/glibc_check.py that can determine
 
- (based on the symbols' versions they want to link to) whether a set of given
 
- (Linux) binaries works with a given glibc version.
 
- Why was Borg forked from Attic?
 
- -------------------------------
 
- Borg was created in May 2015 in response to the difficulty of getting new
 
- code or larger changes incorporated into Attic and establishing a bigger
 
- developer community / more open development.
 
- More details can be found in `ticket 217
 
- <https://github.com/jborg/attic/issues/217>`_ that led to the fork.
 
- Borg intends to be:
 
- * simple:
 
-   * as simple as possible, but no simpler
 
-   * do the right thing by default, but offer options
 
- * open:
 
-   * welcome feature requests
 
-   * accept pull requests of good quality and coding style
 
-   * give feedback on PRs that can't be accepted "as is"
 
-   * discuss openly, don't work in the dark
 
- * changing:
 
-   * Borg is not compatible with Attic
 
-   * do not break compatibility accidentally, without a good reason
 
-     or without warning. allow compatibility breaking for other cases.
 
-   * if major version number changes, it may have incompatible changes
 
 
  |