
Merge pull request #6741 from fantasya-pbem/docs/5310_overhaul-help-patterns

docs: overhaul borg help patterns, fixes #5310
TW, 3 years ago
Commit: 4ff0a29209
1 file changed, 115 insertions and 102 deletions

+ 115 - 102
src/borg/archiver.py

@@ -2485,40 +2485,35 @@ class Archiver:
 
 
     helptext = collections.OrderedDict()
     helptext['patterns'] = textwrap.dedent('''
-        The path/filenames used as input for the pattern matching start from the
-        currently active recursion root. You usually give the recursion root(s)
-        when invoking borg and these can be either relative or absolute paths.
-
-        If you give `/absolute/` as root, the paths going into the matcher will
-        look relative like `absolute/.../file.ext`, because file paths in Borg
-        archives are always stored normalized and relative. This means that e.g.
-        ``borg create /path/to/repo ../some/path`` will store all files as
-        `some/path/.../file.ext` and ``borg create /path/to/repo /home/user``
-        will store all files as `home/user/.../file.ext`.
-
-        A directory exclusion pattern can end either with or without a slash ('/').
-        If it ends with a slash, such as `some/path/`, the directory will be
-        included but not its content. If it does not end with a slash, such as
-        `some/path`, both the directory and content will be excluded.
-
-        File patterns support these styles: fnmatch, shell, regular expressions,
-        path prefixes and path full-matches. By default, fnmatch is used for
-        ``--exclude`` patterns and shell-style is used for the ``--pattern``
-        option. For commands that support patterns in their ``PATH`` argument
-        like (``borg list``), the default pattern is path prefix.
-
-        Starting with Borg 1.2, discovered fs paths are normalised, have leading
-        slashes removed and then are matched against your patterns.
-        Note: You need to review your include / exclude patterns and make
-        sure they do not expect leading slashes. Borg can only deal with this
-        for some very simple patterns by removing leading slashes there also.
-
-        If followed by a colon (':') the first two characters of a pattern are
-        used as a style selector. Explicit style selection is necessary when a
-        non-default style is desired or when the desired pattern starts with
-        two alphanumeric characters followed by a colon (i.e. `aa:something/*`).
-
-        `Fnmatch <https://docs.python.org/3/library/fnmatch.html>`_, selector `fm:`
+        When specifying one or more file paths in a Borg command that supports
+        patterns for the respective option or argument, you can apply the
+        patterns described here to include only desired files and/or exclude
+        unwanted ones. Patterns can be used
+
+        - for ``--exclude`` option,
+        - in the file given with ``--exclude-from`` option,
+        - for ``--pattern`` option,
+        - in the file given with ``--patterns-from`` option and
+        - for ``PATH`` arguments that explicitly support them.
+
+        Borg always stores all file paths normalized and relative to the
+        current recursion root. The recursion root is also named ``PATH`` in
+        Borg commands like `borg create` that do a file discovery, so do not
+        confuse the root with the ``PATH`` argument of e.g. `borg extract`.
+
+        Starting with Borg 1.2, paths that are matched against patterns always
+        appear relative. If you give ``/absolute/`` as root, the paths going
+        into the matcher will look relative like ``absolute/.../file.ext``.
+        If you give ``../some/path`` as root, the paths will look like
+        ``some/path/.../file.ext``.
+
+        File patterns support five different styles. If followed by a colon ':',
+        the first two characters of a pattern are used as a style selector.
+        Explicit style selection is necessary if a non-default style is desired
+        or when the desired pattern starts with two alphanumeric characters
+        followed by a colon (e.g. ``aa:something/*``).
+
+        `Fnmatch <https://docs.python.org/3/library/fnmatch.html>`_, selector ``fm:``
             This is the default style for ``--exclude`` and ``--exclude-from``.
             These patterns use a variant of shell pattern syntax, with '\\*' matching
             any number of characters, '?' matching any single character, '[...]'
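
Annotation (not part of the commit): the style-selector rule in the new help text can be sketched in a few lines of Python. The names `split_style` and `KNOWN_STYLES` are hypothetical and chosen only for illustration; this is not Borg's implementation. The sketch assumes that any two alphanumeric characters followed by a colon are read as a selector, and that an unknown selector is an error.

```python
KNOWN_STYLES = {"fm", "sh", "re", "pp", "pf"}  # the five selectors named in the help text


def split_style(pattern, default_style):
    """Return (style, pattern_body) for a pattern with an optional 'xx:' selector."""
    if len(pattern) > 2 and pattern[2] == ":" and pattern[:2].isalnum():
        style, body = pattern[:2], pattern[3:]
        if style not in KNOWN_STYLES:
            # Assumption: an unknown selector is rejected rather than ignored.
            raise ValueError(f"unknown pattern style: {style!r}")
        return style, body
    # No selector present: fall back to the option's default style.
    return default_style, pattern


print(split_style("re:^home/[^/]+\\.tmp/", "fm"))
print(split_style("home/*/junk", "fm"))
# A pattern that itself starts with 'aa:' needs an explicit selector in front.
print(split_style("fm:aa:something/*", "sh"))
```

Under these assumptions a bare `aa:something/*` would be rejected, which is why the help text asks for an explicit selector in that case.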
@@ -2526,7 +2521,7 @@ class Archiver:
             matching any character not specified. For the purpose of these patterns,
             the path separator (backslash for Windows and '/' on other systems) is not
             treated specially. Wrap meta-characters in brackets for a literal
-            match (i.e. `[?]` to match the literal character `?`). For a path
+            match (e.g. ``[?]`` to match the literal character '?'). For a path
             to match a pattern, the full path must match, or it must match
             from the start of the full path to just before a path separator. Except
             for the root path, paths will never end in the path separator when
@@ -2534,33 +2529,31 @@ class Archiver:
             separator, a '\\*' is appended before matching is attempted. A leading
             path separator is always removed.
 
-        Shell-style patterns, selector `sh:`
+        Shell-style patterns, selector ``sh:``
             This is the default style for ``--pattern`` and ``--patterns-from``.
             Like fnmatch patterns these are similar to shell patterns. The difference
-            is that the pattern may include `**/` for matching zero or more directory
-            levels, `*` for matching zero or more arbitrary characters with the
+            is that the pattern may include ``**/`` for matching zero or more directory
+            levels, ``*`` for matching zero or more arbitrary characters with the
             exception of any path separator. A leading path separator is always removed.
 
-        Regular expressions, selector `re:`
-            Regular expressions similar to those found in Perl are supported. Unlike
-            shell patterns regular expressions are not required to match the full
+        `Regular expressions <https://docs.python.org/3/library/re.html>`_, selector ``re:``
+            Unlike shell patterns, regular expressions are not required to match the full
             path and any substring match is sufficient. It is strongly recommended to
             anchor patterns to the start ('^'), to the end ('$') or both. Path
             separators (backslash for Windows and '/' on other systems) in paths are
-            always normalized to a forward slash ('/') before applying a pattern. The
-            regular expression syntax is described in the `Python documentation for
-            the re module <https://docs.python.org/3/library/re.html>`_.
+            always normalized to a forward slash '/' before applying a pattern.
 
 
-        Path prefix, selector `pp:`
+        Path prefix, selector ``pp:``
             This pattern style is useful to match whole sub-directories. The pattern
-            `pp:root/somedir` matches `root/somedir` and everything therein. A leading
-            path separator is always removed.
+            ``pp:root/somedir`` matches ``root/somedir`` and everything therein.
+            A leading path separator is always removed.
 
 
-        Path full-match, selector `pf:`
+        Path full-match, selector ``pf:``
             This pattern style is (only) useful to match full paths.
             This is kind of a pseudo pattern as it can not have any variable or
-            unspecified parts - the full path must be given. `pf:root/file.ext` matches
-            `root/file.ext` only. A leading path separator is always removed.
+            unspecified parts - the full path must be given. ``pf:root/file.ext``
+            matches ``root/file.ext`` only. A leading path separator is always
+            removed.
 
 
             Implementation note: this is implemented via very time-efficient O(1)
             hashtable lookups (this means you can have huge amounts of such patterns
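
Annotation (not part of the commit): the advice in this hunk to anchor ``re:`` patterns can be checked with Python's `re` module, which the help text links to. The sketch assumes that an ``re:`` pattern is applied as a substring search on the relative, slash-normalized path; `re_style_matches` is a hypothetical helper and the sample paths are made up.

```python
import re


def re_style_matches(pattern, path):
    """Assumed semantics of an 're:' pattern: a substring search on the path."""
    return re.search(pattern, path) is not None


unanchored = r"home/[^/]+\.tmp/"
anchored = r"^home/[^/]+\.tmp/"

for path in ["home/user.tmp/cache", "srv/home/a.tmp/x", "home/user/file.tmp"]:
    # Print whether each variant of the pattern matches this path.
    print(path, re_style_matches(unanchored, path), re_style_matches(anchored, path))
```

Only the unanchored pattern matches `srv/home/a.tmp/x`, because the substring `home/a.tmp/` appears in the middle of that path.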
@@ -2573,20 +2566,20 @@ class Archiver:
 
 
         .. note::
 
-            `re:`, `sh:` and `fm:` patterns are all implemented on top of the Python SRE
-            engine. It is very easy to formulate patterns for each of these types which
-            requires an inordinate amount of time to match paths. If untrusted users
-            are able to supply patterns, ensure they cannot supply `re:` patterns.
-            Further, ensure that `sh:` and `fm:` patterns only contain a handful of
-            wildcards at most.
+            ``re:``, ``sh:`` and ``fm:`` patterns are all implemented on top of
+            the Python SRE engine. It is very easy to formulate patterns for each
+            of these types which requires an inordinate amount of time to match
+            paths. If untrusted users are able to supply patterns, ensure they
+            cannot supply ``re:`` patterns. Further, ensure that ``sh:`` and
+            ``fm:`` patterns only contain a handful of wildcards at most.
 
 
         Exclusions can be passed via the command line option ``--exclude``. When used
         from within a shell, the patterns should be quoted to protect them from
         expansion.
 
         The ``--exclude-from`` option permits loading exclusion patterns from a text
-        file with one pattern per line. Lines empty or starting with the number sign
-        ('#') after removing whitespace on both ends are ignored. The optional style
+        file with one pattern per line. Lines empty or starting with the hash sign
+        '#' after removing whitespace on both ends are ignored. The optional style
         selector prefix is also supported for patterns loaded from a file. Due to
         whitespace removal, paths with whitespace at the beginning or end can only be
         excluded using regular expressions.
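
Annotation (not part of the commit): the line-handling rules for ``--exclude-from`` files described in this hunk can be sketched as below. `read_patterns` is a hypothetical helper, not Borg's parser; it only shows the stripping and comment rules stated in the help text.

```python
def read_patterns(lines):
    """Yield patterns, skipping lines that are empty or start with '#' after stripping."""
    for line in lines:
        line = line.strip()  # whitespace on both ends is removed first
        if not line or line.startswith("#"):
            continue
        yield line


sample = [
    "# comment line",
    "home/*/junk",
    "   *.tmp   ",
    "",
    "   # indented comment",
    "some file with spaces.txt",
]
print(list(read_patterns(sample)))
```

Because surrounding whitespace is stripped before the pattern is used, `"   *.tmp   "` becomes `*.tmp`, which is why paths that really begin or end with whitespace need a regular expression instead.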
@@ -2597,21 +2590,21 @@ class Archiver:
         Examples::
 
             # Exclude '/home/user/file.o' but not '/home/user/file.odt':
-            $ borg create -e '*.o' backup /
+            $ borg create -e '*.o' /path/to/repo::archive /
 
 
             # Exclude '/home/user/junk' and '/home/user/subdir/junk' but
             # not '/home/user/importantjunk' or '/etc/junk':
-            $ borg create -e 'home/*/junk' backup /
+            $ borg create -e 'home/*/junk' /path/to/repo::archive /
 
 
             # Exclude the contents of '/home/user/cache' but not the directory itself:
-            $ borg create -e home/user/cache/ backup /
+            $ borg create -e home/user/cache/ /path/to/repo::archive /
 
 
             # The file '/home/user/cache/important' is *not* backed up:
-            $ borg create -e home/user/cache/ backup / /home/user/cache/important
+            $ borg create -e home/user/cache/ /path/to/repo::archive / /home/user/cache/important
 
 
             # The contents of directories in '/home' are not backed up when their name
             # ends in '.tmp'
-            $ borg create --exclude 're:^home/[^/]+\\.tmp/' backup /
+            $ borg create --exclude 're:^home/[^/]+\\.tmp/' /path/to/repo::archive /
 
 
             # Load exclusions from file
             $ cat >exclude.txt <<EOF
@@ -2624,36 +2617,56 @@ class Archiver:
             # Example with spaces, no need to escape as it is processed by borg
             some file with spaces.txt
             EOF
-            $ borg create --exclude-from exclude.txt backup /
+            $ borg create --exclude-from exclude.txt /path/to/repo::archive /
 
 
-        A more general and easier to use way to define filename matching patterns exists
-        with the ``--pattern`` and ``--patterns-from`` options. Using these, you may
-        specify the backup roots (starting points) and patterns for inclusion/exclusion.
-        A root path starts with the prefix `R`, followed by a path (a plain path, not a
-        file pattern). An include rule starts with the prefix +, an exclude rule starts
-        with the prefix -, an exclude-norecurse rule starts with !, all followed by a pattern.
+        A more general and easier to use way to define filename matching patterns
+        exists with the ``--pattern`` and ``--patterns-from`` options. Using
+        these, you may specify the backup roots, default pattern styles and
+        patterns for inclusion and exclusion.
 
 
-        .. note::
+        Root path prefix ``R``
+            A recursion root path starts with the prefix ``R``, followed by a path
+            (a plain path, not a file pattern). Use this prefix to have the root
+            paths in the patterns file rather than as command line arguments.
 
 
-            Via ``--pattern`` or ``--patterns-from`` you can define BOTH inclusion and exclusion
-            of files using pattern prefixes ``+`` and ``-``. With ``--exclude`` and
-            ``--exclude-from`` ONLY excludes are defined.
+        Pattern style prefix ``P``
+            To change the default pattern style, use the ``P`` prefix, followed by
+            the pattern style abbreviation (``fm``, ``pf``, ``pp``, ``re``, ``sh``).
+            All patterns following this line will use this style until another style
+            is specified.
 
 
-        Inclusion patterns are useful to include paths that are contained in an excluded
-        path. The first matching pattern is used so if an include pattern matches before
-        an exclude pattern, the file is backed up. If an exclude-norecurse pattern matches
-        a directory, it won't recurse into it and won't discover any potential matches for
-        include rules below that directory.
+        Exclude pattern prefix ``-``
+            Use the prefix ``-``, followed by a pattern, to define an exclusion.
+            This has the same effect as the ``--exclude`` option.
 
 
-        .. note::
+        Exclude no-recurse pattern prefix ``!``
+            Use the prefix ``!``, followed by a pattern, to define an exclusion
+            that does not recurse into subdirectories. This saves time, but
+            prevents include patterns from matching any files in subdirectories.
+
+        Include pattern prefix ``+``
+            Use the prefix ``+``, followed by a pattern, to define inclusions.
+            This is useful to include paths that are covered in an exclude
+            pattern and would otherwise not be backed up.
+
+        The first matching pattern is used, so if an include pattern matches
+        before an exclude pattern, the file is backed up. Note that a no-recurse
+        exclude stops examination of subdirectories so that potential includes
+        will not match - use normal excludes for such use cases.
+
+        **Tip: You can easily test your patterns with --dry-run and --list**::
 
 
-            It's possible that a sub-directory/file is matched while parent directories are not.
-            In that case, parent directories are not backed up thus their user, group, permission,
-            etc. can not be restored.
+            $ borg create --dry-run --list --patterns-from patterns.txt /path/to/repo::archive
 
 
-        Note that the default pattern style for ``--pattern`` and ``--patterns-from`` is
-        shell style (`sh:`), so those patterns behave similar to rsync include/exclude
-        patterns. The pattern style can be set via the `P` prefix.
+        This will list the considered files one per line, prefixed with a
+        character that indicates the action (e.g. 'x' for excluding, see
+        **Item flags** in `borg create` usage docs).
+
+        .. note::
+
+            It's possible that a sub-directory/file is matched while parent
+            directories are not. In that case, parent directories are not backed
+            up and thus their user, group, permission, etc. cannot be restored.
 
 
         Patterns (``--pattern``) and excludes (``--exclude``) from the command line are
         considered first (in the order of appearance). Then patterns from ``--patterns-from``
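
Annotation (not part of the commit): the "first matching pattern is used" rule from this hunk can be sketched as follows. To keep the example short, plain path-prefix comparison stands in for Borg's pattern styles, and paths that match no rule are assumed to be included; `decide` is a hypothetical helper, not Borg code.

```python
def decide(path, rules, default=True):
    """Return True if the path is included; the first matching rule wins."""
    for action, prefix in rules:
        # Simplified matching: the rule matches the path itself or anything below it.
        if path == prefix or path.startswith(prefix + "/"):
            return action == "+"
    # Assumption: paths that match no rule are included.
    return default


rules = [("+", "pics/2018/good"), ("-", "pics/2018")]
for path in ["pics/2017/a.jpg", "pics/2018/b.jpg", "pics/2018/good/c.jpg"]:
    print(path, decide(path, rules))

# With the rules reversed, the exclude matches first and the include never applies.
print(decide("pics/2018/good/c.jpg", list(reversed(rules))))
```

The order of the rules is what keeps `pics/2018/good` in the backup: the include has to appear before the broader exclude.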
@@ -2663,44 +2676,44 @@ class Archiver:
 
 
             # backup pics, but not the ones from 2018, except the good ones:
             # note: using = is essential to avoid cmdline argument parsing issues.
-            borg create --pattern=+pics/2018/good --pattern=-pics/2018 repo::arch pics
+            borg create --pattern=+pics/2018/good --pattern=-pics/2018 /path/to/repo::archive pics
 
 
-            # use a file with patterns:
-            borg create --patterns-from patterns.lst repo::arch
+            # backup only JPG/JPEG files (case insensitive) in all home directories:
+            borg create --pattern '+ re:(?i)\\.jpe?g$' /path/to/repo::archive /home
+
+            # backup homes, but exclude big downloads (like .ISO files) or hidden files:
+            borg create --exclude 're:(?i)\\.iso$' --exclude 'sh:home/**/.*' /path/to/repo::archive /home
+
+            # use a file with patterns (recursion root '/' via command line):
+            borg create --patterns-from patterns.lst /path/to/repo::archive /
 
 
         The patterns.lst file could look like that::
 
-            # "sh:" pattern style is the default, so the following line is not needed:
-            P sh
-            R /
-            # can be rebuild
+            # "sh:" pattern style is the default
+            # exclude caches
             - home/*/.cache
-            # they're downloads for a reason
-            - home/*/Downloads
-            # susan is a nice person
             # include susans home
             + home/susan
             # also back up this exact file
             + pf:home/bobby/specialfile.txt
             # don't backup the other home directories
             - home/*
-            # don't even look in /proc
-            ! proc
+            # don't even look in /dev, /proc, /run, /sys, /tmp (note: would exclude files like /device, too)
+            ! re:^(dev|proc|run|sys|tmp)
 
 
         You can specify recursion roots either on the command line or in a patternfile::
 
             # these two commands do the same thing
-            borg create --exclude home/bobby/junk repo::arch /home/bobby /home/susan
-            borg create --patterns-from patternfile.lst repo::arch
+            borg create --exclude home/bobby/junk /path/to/repo::archive /home/bobby /home/susan
+            borg create --patterns-from patternfile.lst /path/to/repo::archive
 
 
-        The patternfile::
+        patternfile.lst::
 
 
             # note that excludes use fm: by default and patternfiles use sh: by default.
             # therefore, we need to specify fm: to have the same exact behavior.
             P fm
             R /home/bobby
             R /home/susan
-
             - home/bobby/junk
 
         This allows you to share the same patterns between multiple repositories