|
@@ -89,55 +89,56 @@ The check command verifies the consistency of a repository and the corresponding
|
|
|
|
|
|
First, the underlying repository data files are checked:
|
|
|
|
|
|
-- For all segments the segment magic (header) is checked
|
|
|
-- For all objects stored in the segments, all metadata (e.g. crc and size) and
|
|
|
+- For all segments, the segment magic header is checked.
|
|
|
+- For all objects stored in the segments, all metadata (e.g. CRC and size) and
|
|
|
all data is read. The read data is checked by size and CRC. Bit rot and other
|
|
|
types of accidental damage can be detected this way.
|
|
|
-- If we are in repair mode and a integrity error is detected for a segment,
|
|
|
- we try to recover as many objects from the segment as possible.
|
|
|
-- In repair mode, it makes sure that the index is consistent with the data
|
|
|
- stored in the segments.
|
|
|
-- If you use a remote repo server via ssh:, the repo check is executed on the
|
|
|
- repo server without causing significant network traffic.
|
|
|
+- In repair mode, if an integrity error is detected in a segment, try to recover
|
|
|
+ as many objects from the segment as possible.
|
|
|
+- In repair mode, make sure that the index is consistent with the data stored in
|
|
|
+ the segments.
|
|
|
+- If checking a remote repo via ``ssh:``, the repo check is executed on the server
|
|
|
+ without causing significant network traffic.
|
|
|
- The repository check can be skipped using the ``--archives-only`` option.
|
|
|
-- A repository check can be time consuming. Partial checks are possible with the ``--max-duration`` option.
|
|
|
+- A repository check can be time-consuming. Partial checks are possible with the
|
|
|
+ ``--max-duration`` option.
|
|
|
|
|
|
Second, the consistency and correctness of the archive metadata is verified:
|
|
|
|
|
|
- Is the repo manifest present? If not, it is rebuilt from archive metadata
|
|
|
chunks (this requires reading and decrypting of all metadata and data).
|
|
|
-- Check if archive metadata chunk is present. if not, remove archive from
|
|
|
- manifest.
|
|
|
+- Check if archive metadata chunk is present; if not, remove archive from manifest.
|
|
|
- For all files (items) in the archive, for all chunks referenced by these
|
|
|
- files, check if chunk is present.
|
|
|
- If a chunk is not present and we are in repair mode, replace it with a same-size
|
|
|
- replacement chunk of zeros.
|
|
|
- If a previously lost chunk reappears (e.g. via a later backup) and we are in
|
|
|
- repair mode, the all-zero replacement chunk will be replaced by the correct chunk.
|
|
|
- This requires reading of archive and file metadata, but not data.
|
|
|
-- If we are in repair mode and we checked all the archives: delete orphaned
|
|
|
- chunks from the repo.
|
|
|
-- if you use a remote repo server via ssh:, the archive check is executed on
|
|
|
- the client machine (because if encryption is enabled, the checks will require
|
|
|
- decryption and this is always done client-side, because key access will be
|
|
|
- required).
|
|
|
-- The archive checks can be time consuming, they can be skipped using the
|
|
|
+ files, check if chunk is present. In repair mode, if a chunk is not present,
|
|
|
+ replace it with a same-size replacement chunk of zeroes. If a previously lost
|
|
|
+ chunk reappears (e.g. via a later backup), in repair mode the all-zero replacement
|
|
|
+ chunk will be replaced by the correct chunk. This requires reading of archive and
|
|
|
+ file metadata, but not data.
|
|
|
+- In repair mode, when all the archives were checked, orphaned chunks are deleted
|
|
|
+ from the repo. One cause of orphaned chunks is input file related errors (like
|
|
|
+ read errors) in the archive creation process.
|
|
|
+- If checking a remote repo via ``ssh:``, the archive check is executed on the
|
|
|
+ client machine because it requires decryption, and this is always done client-side
|
|
|
+ as key access is needed.
|
|
|
+- The archive checks can be time-consuming; they can be skipped using the
|
|
|
``--repository-only`` option.
|
|
|
|
|
|
-The ``--max-duration`` option can be used to split a long-running repository check into multiple partial checks.
|
|
|
-After the given number of seconds the check is interrupted. The next partial check will continue where the
|
|
|
-previous one stopped, until the complete repository has been checked. Example: Assuming a full check took 7
|
|
|
-hours, then running a daily check with --max-duration=3600 (1 hour) would result in one full check per week.
|
|
|
+The ``--max-duration`` option can be used to split a long-running repository check
|
|
|
+into multiple partial checks. After the given number of seconds the check is
|
|
|
+interrupted. The next partial check will continue where the previous one stopped,
|
|
|
+until the complete repository has been checked. Example: Assuming a full check took 7
|
|
|
+hours, then running a daily check with ``--max-duration=3600`` (1 hour) would result in one
|
|
|
+full check per week.
|
|
|
|
|
|
-Attention: Partial checks can only do way less checks than a full check (only the CRC32 checks on segment file
|
|
|
-entries are done) and cannot be combined with ``--repair``. Partial checks may therefore be useful only with very
|
|
|
-large repositories where a full check would take too long. Doing a full repository check aborts a partial check;
|
|
|
-the next partial check will start from the beginning.
|
|
|
+Attention: Partial checks do far less checking than a full check (only the
|
|
|
+CRC32 checks on segment file entries are done), and cannot be combined with the
|
|
|
+``--repair`` option. Partial checks may therefore be useful only with very large
|
|
|
+repositories where a full check would take too long. Doing a full repository check aborts a
|
|
|
+partial check; the next partial check will restart from the beginning.
|
|
|
|
|
|
The ``--verify-data`` option will perform a full integrity verification (as opposed to
|
|
|
checking the CRC32 of the segment) of data, which means reading the data from the
|
|
|
repository, decrypting and decompressing it. This is a cryptographic verification,
|
|
|
which will detect (accidental) corruption. For encrypted repositories it is
|
|
|
-tamper-resistant as well, unless the attacker has access to the keys.
|
|
|
-
|
|
|
-It is also very slow.
|
|
|
+tamper-resistant as well, unless the attacker has access to the keys. It is also very
|
|
|
+slow.
|
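The ``--max-duration`` example in the text above (a 7-hour full check split into daily 1-hour runs) can be worked through as a small calculation. This is only a sketch of the arithmetic, not part of Borg: the function name ``partial_runs_needed`` is hypothetical, and it assumes the check progresses at a constant rate, that each partial run uses its full time budget, and that no full check resets the progress in between.

```python
import math


def partial_runs_needed(full_check_seconds: int, max_duration_seconds: int) -> int:
    """Estimate how many partial runs cover one complete repository check.

    Hypothetical helper for illustration only. Assumes a constant check
    rate and that every partial run resumes exactly where the last stopped.
    """
    if full_check_seconds <= 0 or max_duration_seconds <= 0:
        raise ValueError("durations must be positive")
    # Round up: a final, shorter run is still needed for any remainder.
    return math.ceil(full_check_seconds / max_duration_seconds)


if __name__ == "__main__":
    full_check = 7 * 3600   # full check estimated at 7 hours
    per_run = 3600          # value passed as --max-duration (1 hour)
    runs = partial_runs_needed(full_check, per_run)
    print(f"{runs} daily runs for one complete check")
```

With one run per day, the result is the number of days until the whole repository has been covered once, which matches the "one full check per week" figure in the text under the stated assumptions.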