@builder.io/dev-tools-windows-x64 1.19.9 → 1.19.10

@@ -0,0 +1,1022 @@
1
+ ## User Guide
2
+
3
+ This guide is intended to give an elementary description of ripgrep and an
4
+ overview of its capabilities. This guide assumes that ripgrep is
5
+ [installed](README.md#installation)
6
+ and that readers have passing familiarity with using command line tools. This
7
+ also assumes a Unix-like system, although most commands are probably easily
8
+ translatable to any command line shell environment.
9
+
10
+
11
+ ### Table of Contents
12
+
13
+ * [Basics](#basics)
14
+ * [Recursive search](#recursive-search)
15
+ * [Automatic filtering](#automatic-filtering)
16
+ * [Manual filtering: globs](#manual-filtering-globs)
17
+ * [Manual filtering: file types](#manual-filtering-file-types)
18
+ * [Replacements](#replacements)
19
+ * [Configuration file](#configuration-file)
20
+ * [File encoding](#file-encoding)
21
+ * [Binary data](#binary-data)
22
+ * [Preprocessor](#preprocessor)
23
+ * [Common options](#common-options)
24
+
25
+
26
+ ### Basics
27
+
28
+ ripgrep is a command line tool that searches your files for patterns that
29
+ you give it. ripgrep behaves as if reading each file line by line. If a line
30
+ matches the pattern provided to ripgrep, then that line will be printed. If a
31
+ line does not match the pattern, then the line is not printed.
32
+
33
+ The best way to see how this works is with an example. To show an example, we
34
+ need something to search. Let's try searching ripgrep's source code. First
35
+ grab a ripgrep source archive from
36
+ https://github.com/BurntSushi/ripgrep/archive/0.7.1.zip
37
+ and extract it:
38
+
39
+ ```
40
+ $ curl -LO https://github.com/BurntSushi/ripgrep/archive/0.7.1.zip
41
+ $ unzip 0.7.1.zip
42
+ $ cd ripgrep-0.7.1
43
+ $ ls
44
+ benchsuite grep tests Cargo.toml LICENSE-MIT
45
+ ci ignore wincolor CHANGELOG.md README.md
46
+ complete pkg appveyor.yml compile snapcraft.yaml
47
+ doc src build.rs COPYING UNLICENSE
48
+ globset termcolor Cargo.lock HomebrewFormula
49
+ ```
50
+
51
+ Let's try our first search by looking for all occurrences of the word `fast`
52
+ in `README.md`:
53
+
54
+ ```
55
+ $ rg fast README.md
56
+ 75: faster than both. (N.B. It is not, strictly speaking, a "drop-in" replacement
57
+ 88: color and full Unicode support. Unlike GNU grep, `ripgrep` stays fast while
58
+ 119:### Is it really faster than everything else?
59
+ 124:Summarizing, `ripgrep` is fast because:
60
+ 129: optimizations to make searching very fast.
61
+ ```
62
+
63
+ (**Note:** If you see an error message from ripgrep saying that it didn't
64
+ search any files, then re-run ripgrep with the `--debug` flag. One likely cause
65
+ of this is that you have a `*` rule in a `$HOME/.gitignore` file.)
66
+
67
+ So what happened here? ripgrep read the contents of `README.md`, and for each
68
+ line that contained `fast`, ripgrep printed it to your terminal. ripgrep also
69
+ included the line number for each line by default. If your terminal supports
70
+ colors, then your output might actually look something like this screenshot:
71
+
72
+ [![A screenshot of a sample ripgrep search](https://burntsushi.net/stuff/ripgrep-guide-sample.png)](https://burntsushi.net/stuff/ripgrep-guide-sample.png)
73
+
74
+ In this example, we searched for something called a "literal" string. This
75
+ means that our pattern was just some normal text that we asked ripgrep to
76
+ find. But ripgrep supports the ability to specify patterns via [regular
77
+ expressions](https://en.wikipedia.org/wiki/Regular_expression). As an example,
78
+ what if we wanted to find all lines that have a word that contains `fast` followed
79
+ by some number of other letters?
80
+
81
+ ```
82
+ $ rg 'fast\w+' README.md
83
+ 75: faster than both. (N.B. It is not, strictly speaking, a "drop-in" replacement
84
+ 119:### Is it really faster than everything else?
85
+ ```
86
+
87
+ In this example, we used the pattern `fast\w+`. This pattern tells ripgrep to
88
+ look for any lines containing the letters `fast` followed by *one or more*
89
+ word-like characters. Namely, `\w` matches characters that compose words (like
90
+ `a` and `L` but unlike `.` and ` `). The `+` after the `\w` means, "match the
91
+ previous pattern one or more times." This means that the word `fast` won't
92
+ match because there are no word characters following the final `t`. But a word
93
+ like `faster` will. `faste` would also match!
94
+
95
+ Here's a different variation on this same theme:
96
+
97
+ ```
98
+ $ rg 'fast\w*' README.md
99
+ 75: faster than both. (N.B. It is not, strictly speaking, a "drop-in" replacement
100
+ 88: color and full Unicode support. Unlike GNU grep, `ripgrep` stays fast while
101
+ 119:### Is it really faster than everything else?
102
+ 124:Summarizing, `ripgrep` is fast because:
103
+ 129: optimizations to make searching very fast.
104
+ ```
105
+
106
+ In this case, we used `fast\w*` for our pattern instead of `fast\w+`. The `*`
107
+ means that it should match *zero* or more times. In this case, ripgrep will
108
+ print the same lines as the pattern `fast`, but if your terminal supports
109
+ colors, you'll notice that `faster` will be highlighted instead of just the
110
+ `fast` prefix.
111
+
112
+ It is beyond the scope of this guide to provide a full tutorial on regular
113
+ expressions, but ripgrep's specific syntax is documented here:
114
+ https://docs.rs/regex/*/regex/#syntax
115
+
116
+
117
+ ### Recursive search
118
+
119
+ In the previous section, we showed how to use ripgrep to search a single file.
120
+ In this section, we'll show how to use ripgrep to search an entire directory
121
+ of files. In fact, *recursively* searching your current working directory is
122
+ the default mode of operation for ripgrep, which means doing this is very
123
+ simple.
124
+
125
+ Using our unzipped archive of ripgrep source code, here's how to find all
126
+ function definitions whose name is `write`:
127
+
128
+ ```
129
+ $ rg 'fn write\('
130
+ src/printer.rs
131
+ 469: fn write(&mut self, buf: &[u8]) {
132
+
133
+ termcolor/src/lib.rs
134
+ 227: fn write(&mut self, b: &[u8]) -> io::Result<usize> {
135
+ 250: fn write(&mut self, b: &[u8]) -> io::Result<usize> {
136
+ 428: fn write(&mut self, b: &[u8]) -> io::Result<usize> { self.wtr.write(b) }
137
+ 441: fn write(&mut self, b: &[u8]) -> io::Result<usize> { self.wtr.write(b) }
138
+ 454: fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
139
+ 511: fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
140
+ 848: fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
141
+ 915: fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
142
+ 949: fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
143
+ 1114: fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
144
+ 1348: fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
145
+ 1353: fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
146
+ ```
147
+
148
+ (**Note:** We escape the `(` here because `(` has special significance inside
149
+ regular expressions. You could also use `rg -F 'fn write('` to achieve the
150
+ same thing, where `-F` interprets your pattern as a literal string instead of
151
+ a regular expression.)
152
+
153
+ In this example, we didn't specify a file at all. Instead, ripgrep defaulted
154
+ to searching your current directory in the absence of a path. In general,
155
+ `rg foo` is equivalent to `rg foo ./`.
156
+
157
+ This particular search showed us results in both the `src` and `termcolor`
158
+ directories. The `src` directory is the core ripgrep code whereas `termcolor`
159
+ is a dependency of ripgrep (and is used by other tools). What if we only wanted
160
+ to search core ripgrep code? Well, that's easy, just specify the directory you
161
+ want:
162
+
163
+ ```
164
+ $ rg 'fn write\(' src
165
+ src/printer.rs
166
+ 469: fn write(&mut self, buf: &[u8]) {
167
+ ```
168
+
169
+ Here, ripgrep limited its search to the `src` directory. Another way of doing
170
+ this search would be to `cd` into the `src` directory and simply use `rg 'fn
171
+ write\('` again.
172
+
173
+
174
+ ### Automatic filtering
175
+
176
+ After recursive search, ripgrep's most important feature is what it *doesn't*
177
+ search. By default, when you search a directory, ripgrep will ignore all of
178
+ the following:
179
+
180
+ 1. Files and directories that match glob patterns in these three categories:
181
+ 1. `.gitignore` globs (including global and repo-specific globs). This
182
+ includes `.gitignore` files in parent directories that are part of the
183
+ same `git` repository. (Unless the `--no-require-git` flag is given.)
184
+ 2. `.ignore` globs, which take precedence over all gitignore globs
185
+ when there's a conflict. This includes `.ignore` files in parent
186
+ directories.
187
+ 3. `.rgignore` globs, which take precedence over all `.ignore` globs
188
+ when there's a conflict. This includes `.rgignore` files in parent
189
+ directories.
190
+ 2. Hidden files and directories.
191
+ 3. Binary files. (ripgrep considers any file with a `NUL` byte to be binary.)
192
+ 4. Symbolic links aren't followed.
193
+
194
+ All of these things can be toggled using various flags provided by ripgrep:
195
+
196
+ 1. You can disable all ignore-related filtering with the `--no-ignore` flag.
197
+ 2. Hidden files and directories can be searched with the `--hidden` (`-.` for
198
+ short) flag.
199
+ 3. Binary files can be searched via the `--text` (`-a` for short) flag.
200
+ Be careful with this flag! Binary files may emit control characters to your
201
+ terminal, which might cause strange behavior.
202
+ 4. ripgrep can follow symlinks with the `--follow` (`-L` for short) flag.
203
+
204
+ As a special convenience, ripgrep also provides a flag called `--unrestricted`
205
+ (`-u` for short). Repeated uses of this flag will cause ripgrep to disable
206
+ more and more of its filtering. That is, `-u` will disable `.gitignore`
207
+ handling, `-uu` will search hidden files and directories and `-uuu` will search
208
+ binary files. This is useful when you're using ripgrep and you aren't sure
209
+ whether its filtering is hiding results from you. Tacking on a couple `-u`
210
+ flags is a quick way to find out. (Use the `--debug` flag if you're still
211
+ perplexed, and if that doesn't help,
212
+ [file an issue](https://github.com/BurntSushi/ripgrep/issues/new).)
213
+
214
+ ripgrep's `.gitignore` handling actually goes a bit beyond just `.gitignore`
215
+ files. ripgrep will also respect repository specific rules found in
216
+ `$GIT_DIR/info/exclude`, as well as any global ignore rules in your
217
+ `core.excludesFile` (which is usually `$XDG_CONFIG_HOME/git/ignore` on
218
+ Unix-like systems).
219
+
220
+ Sometimes you want to search files that are in your `.gitignore`, so it is
221
+ possible to specify additional ignore rules or overrides in a `.ignore`
222
+ (application agnostic) or `.rgignore` (ripgrep specific) file.
223
+
224
+ For example, let's say you have a `.gitignore` file that looks like this:
225
+
226
+ ```
227
+ log/
228
+ ```
229
+
230
+ This generally means that any `log` directory won't be tracked by `git`.
231
+ However, perhaps it contains useful output that you'd like to include in your
232
+ searches, but you still don't want to track it in `git`. You can achieve this
233
+ by creating a `.ignore` file in the same directory as the `.gitignore` file
234
+ with the following contents:
235
+
236
+ ```
237
+ !log/
238
+ ```
239
+
240
+ ripgrep treats `.ignore` files with higher precedence than `.gitignore` files
241
+ (and treats `.rgignore` files with higher precedence than `.ignore` files).
242
+ This means ripgrep will see the `!log/` whitelist rule first and search that
243
+ directory.
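
The precedence rule above can be tried in a throwaway directory. This is a minimal sketch, assuming `rg` is on your `PATH`; the scratch directory, the file `log/out.txt` and the search term `needle` are made up for illustration. It uses `--no-require-git` so that the `.gitignore` file is honored even though the scratch directory is not a `git` repository.

```shell
# Hypothetical scratch project: log/ is git-ignored but holds text we want to search.
proj=$(mktemp -d)
mkdir "$proj/log"
echo 'needle in the log' > "$proj/log/out.txt"
printf 'log/\n' > "$proj/.gitignore"

# Search the scratch project. --no-require-git makes ripgrep honor .gitignore
# even though the scratch directory is not a git repository.
search() {
    (cd "$proj" && rg --no-require-git needle)
}

if command -v rg >/dev/null 2>&1; then
    search || echo "no match: log/ is ignored"
    printf '!log/\n' > "$proj/.ignore"    # whitelist the directory for ripgrep only
    search
fi
```

The first search should come up empty, and the second should report the line in `log/out.txt`, while `git` itself continues to ignore the directory.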
244
+
245
+ Like `.gitignore`, a `.ignore` file can be placed in any directory. Its rules
246
+ will be processed with respect to the directory it resides in, just like
247
+ `.gitignore`.
248
+
249
+ To process `.gitignore` and `.ignore` files case insensitively, use the flag
250
+ `--ignore-file-case-insensitive`. This is especially useful on case insensitive
251
+ file systems like those on Windows and macOS. Note though that this can come
252
+ with a significant performance penalty, and is therefore disabled by default.
253
+
254
+ For a more in-depth description of how glob patterns in a `.gitignore` file
255
+ are interpreted, please see `man gitignore`.
256
+
257
+
258
+ ### Manual filtering: globs
259
+
260
+ In the previous section, we talked about ripgrep's filtering that it does by
261
+ default. It is "automatic" because it reacts to your environment. That is, it
262
+ uses already existing `.gitignore` files to produce more relevant search
263
+ results.
264
+
265
+ In addition to automatic filtering, ripgrep also provides more manual or ad hoc
266
+ filtering. This comes in two varieties: additional glob patterns specified in
267
+ your ripgrep commands and file type filtering. This section covers glob
268
+ patterns while the next section covers file type filtering.
269
+
270
+ In our ripgrep source code (see [Basics](#basics) for instructions on how to
271
+ get a source archive to search), let's say we wanted to see which things depend
272
+ on `clap`, our argument parser.
273
+
274
+ We could do this:
275
+
276
+ ```
277
+ $ rg clap
278
+ [lots of results]
279
+ ```
280
+
281
+ But this shows us many things, and we're only interested in where we wrote
282
+ `clap` as a dependency. Instead, we could limit ourselves to TOML files, which
283
+ is how dependencies are communicated to Rust's build tool, Cargo:
284
+
285
+ ```
286
+ $ rg clap -g '*.toml'
287
+ Cargo.toml
288
+ 35:clap = "2.26"
289
+ 51:clap = "2.26"
290
+ ```
291
+
292
+ The `-g '*.toml'` syntax says, "make sure every file searched matches this
293
+ glob pattern." Note that we put `'*.toml'` in single quotes to prevent our
294
+ shell from expanding the `*`.
295
+
296
+ If we wanted, we could tell ripgrep to search anything *but* `*.toml` files:
297
+
298
+ ```
299
+ $ rg clap -g '!*.toml'
300
+ [lots of results]
301
+ ```
302
+
303
+ This will give you a lot of results again as above, but they won't include
304
+ files ending with `.toml`. Note that the use of a `!` here to mean "negation"
305
+ is a bit non-standard, but it was chosen to be consistent with how globs in
306
+ `.gitignore` files are written. (Although, the meaning is reversed. In
307
+ `.gitignore` files, a `!` prefix means whitelist, and on the command line, a
308
+ `!` means blacklist.)
309
+
310
+ Globs are interpreted in exactly the same way as `.gitignore` patterns. That
311
+ is, later globs will override earlier globs. For example, the following command
312
+ will search only `*.toml` files:
313
+
314
+ ```
315
+ $ rg clap -g '!*.toml' -g '*.toml'
316
+ ```
317
+
318
+ Interestingly, reversing the order of the globs in this case will match
319
+ nothing, since the presence of at least one non-blacklist glob will institute a
320
+ requirement that every file searched must match at least one glob. In this
321
+ case, the blacklist glob takes precedence over the previous glob and prevents
322
+ any file from being searched at all!
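
The effect of glob order can be checked without searching any content, by asking ripgrep to list the files it would search. This is a minimal sketch, assuming `rg` is on your `PATH`; the scratch directory and the file names `a.toml` and `b.rs` are made up for illustration.

```shell
# Hypothetical scratch directory with one TOML file and one Rust file.
demo=$(mktemp -d)
touch "$demo/a.toml" "$demo/b.rs"

# List the files ripgrep would search under a given pair of globs.
# --files prints candidate paths instead of searching them.
list_files() {
    (cd "$demo" && rg --files -g "$1" -g "$2" | sort)
}

if command -v rg >/dev/null 2>&1; then
    echo "blacklist first, whitelist last:"
    list_files '!*.toml' '*.toml'
    echo "whitelist first, blacklist last:"
    list_files '*.toml' '!*.toml'
fi
```

The first listing should contain only the TOML file, and the second should be empty.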
323
+
324
+
325
+ ### Manual filtering: file types
326
+
327
+ Over time, you might notice that you use the same glob patterns over and over.
328
+ For example, you might find yourself doing a lot of searches where you only
329
+ want to see results for Rust files:
330
+
331
+ ```
332
+ $ rg 'fn run' -g '*.rs'
333
+ ```
334
+
335
+ Instead of writing out the glob every time, you can use ripgrep's support for
336
+ file types:
337
+
338
+ ```
339
+ $ rg 'fn run' --type rust
340
+ ```
341
+
342
+ or, more succinctly,
343
+
344
+ ```
345
+ $ rg 'fn run' -trust
346
+ ```
347
+
348
+ The way the `--type` flag functions is simple. It acts as a name that is
349
+ assigned to one or more globs that match the relevant files. This lets you
350
+ write a single type that might encompass a broad range of file extensions. For
351
+ example, if you wanted to search C files, you'd have to check both C source
352
+ files and C header files:
353
+
354
+ ```
355
+ $ rg 'int main' -g '*.{c,h}'
356
+ ```
357
+
358
+ or you could just use the C file type:
359
+
360
+ ```
361
+ $ rg 'int main' -tc
362
+ ```
363
+
364
+ Just as you can write blacklist globs, you can blacklist file types too:
365
+
366
+ ```
367
+ $ rg clap --type-not rust
368
+ ```
369
+
370
+ or, more succinctly,
371
+
372
+ ```
373
+ $ rg clap -Trust
374
+ ```
375
+
376
+ That is, `-t` means "include files of this type" whereas `-T` means "exclude
377
+ files of this type."
378
+
379
+ To see the globs that make up a type, run `rg --type-list`:
380
+
381
+ ```
382
+ $ rg --type-list | rg '^make:'
383
+ make: *.mak, *.mk, GNUmakefile, Gnumakefile, Makefile, gnumakefile, makefile
384
+ ```
385
+
386
+ By default, ripgrep comes with a bunch of pre-defined types. Generally, these
387
+ types correspond to well known public formats. But you can define your own
388
+ types as well. For example, perhaps you frequently search "web" files, which
389
+ consist of JavaScript, HTML and CSS:
390
+
391
+ ```
392
+ $ rg --type-add 'web:*.html' --type-add 'web:*.css' --type-add 'web:*.js' -tweb title
393
+ ```
394
+
395
+ or, more succinctly,
396
+
397
+ ```
398
+ $ rg --type-add 'web:*.{html,css,js}' -tweb title
399
+ ```
400
+
401
+ The above command defines a new type, `web`, corresponding to the glob
402
+ `*.{html,css,js}`. It then applies the new filter with `-tweb` and searches for
403
+ the pattern `title`. If you ran
404
+
405
+ ```
406
+ $ rg --type-add 'web:*.{html,css,js}' --type-list
407
+ ```
408
+
409
+ Then you would see your `web` type show up in the list, even though it is not
410
+ part of ripgrep's built-in types.
411
+
412
+ It is important to stress here that the `--type-add` flag only applies to the
413
+ current command. It does not add a new file type and save it somewhere in a
414
+ persistent form. If you want a type to be available in every ripgrep command,
415
+ then you should either create a shell alias:
416
+
417
+ ```
418
+ alias rg="rg --type-add 'web:*.{html,css,js}'"
419
+ ```
420
+
421
+ or add `--type-add=web:*.{html,css,js}` to your ripgrep configuration file.
422
+ ([Configuration files](#configuration-file) are covered in more detail later.)
423
+
424
+ #### The special `all` file type
425
+
426
+ A special option supported by the `--type` flag is `all`. `--type all` looks
427
+ for a match in any of the supported file types listed by `--type-list`,
428
+ including those added on the command line using `--type-add`. It's equivalent
429
+ to the command `rg --type agda --type asciidoc --type asm ...`, where `...`
430
+ stands for a list of `--type` flags for the rest of the types in `--type-list`.
431
+
432
+ As an example, let's suppose you have a shell script in your current directory,
433
+ `my-shell-script`, which includes a shell library, `my-shell-library.bash`.
434
+ Both `rg --type sh` and `rg --type all` would only search for matches in
435
+ `my-shell-library.bash`, not `my-shell-script`, because the globs matched
436
+ by the `sh` file type don't include files without an extension. On the
437
+ other hand, `rg --type-not all` would search `my-shell-script` but not
438
+ `my-shell-library.bash`.
439
+
440
+ ### Replacements
441
+
442
+ ripgrep provides a limited ability to modify its output by replacing matched
443
+ text with some other text. This is easiest to explain with an example. Remember
444
+ when we searched for the word `fast` in ripgrep's README?
445
+
446
+ ```
447
+ $ rg fast README.md
448
+ 75: faster than both. (N.B. It is not, strictly speaking, a "drop-in" replacement
449
+ 88: color and full Unicode support. Unlike GNU grep, `ripgrep` stays fast while
450
+ 119:### Is it really faster than everything else?
451
+ 124:Summarizing, `ripgrep` is fast because:
452
+ 129: optimizations to make searching very fast.
453
+ ```
454
+
455
+ What if we wanted to *replace* all occurrences of `fast` with `FAST`? That's
456
+ easy with ripgrep's `--replace` flag:
457
+
458
+ ```
459
+ $ rg fast README.md --replace FAST
460
+ 75: FASTer than both. (N.B. It is not, strictly speaking, a "drop-in" replacement
461
+ 88: color and full Unicode support. Unlike GNU grep, `ripgrep` stays FAST while
462
+ 119:### Is it really FASTer than everything else?
463
+ 124:Summarizing, `ripgrep` is FAST because:
464
+ 129: optimizations to make searching very FAST.
465
+ ```
466
+
467
+ or, more succinctly,
468
+
469
+ ```
470
+ $ rg fast README.md -r FAST
471
+ [snip]
472
+ ```
473
+
474
+ In essence, the `--replace` flag applies *only* to the matching portion of text
475
+ in the output. If you instead wanted to replace an entire line of text, then
476
+ you need to include the entire line in your match. For example:
477
+
478
+ ```
479
+ $ rg '^.*fast.*$' README.md -r FAST
480
+ 75:FAST
481
+ 88:FAST
482
+ 119:FAST
483
+ 124:FAST
484
+ 129:FAST
485
+ ```
486
+
487
+ Alternatively, you can combine the `--only-matching` flag (or `-o` for short) with
488
+ the `--replace` flag to achieve the same result:
489
+
490
+ ```
491
+ $ rg fast README.md --only-matching --replace FAST
492
+ 75:FAST
493
+ 88:FAST
494
+ 119:FAST
495
+ 124:FAST
496
+ 129:FAST
497
+ ```
498
+
499
+ or, more succinctly,
500
+
501
+ ```
502
+ $ rg fast README.md -or FAST
503
+ [snip]
504
+ ```
505
+
506
+ Finally, replacements can include capturing groups. For example, let's say
507
+ we wanted to find all occurrences of `fast` followed by another word and
508
+ join them together with a dash. The pattern we might use for that is
509
+ `fast\s+(\w+)`, which matches `fast`, followed by one or more whitespace characters,
510
+ followed by one or more "word" characters. We put the `\w+` in a "capturing
511
+ group" (indicated by parentheses) so that we can reference it later in our
512
+ replacement string. For example:
513
+
514
+ ```
515
+ $ rg 'fast\s+(\w+)' README.md -r 'fast-$1'
516
+ 88: color and full Unicode support. Unlike GNU grep, `ripgrep` stays fast-while
517
+ 124:Summarizing, `ripgrep` is fast-because:
518
+ ```
519
+
520
+ Our replacement string here, `fast-$1`, consists of `fast-` followed by the
521
+ contents of the capturing group at index `1`. (Capturing groups actually start
522
+ at index 0, but the `0`th capturing group always corresponds to the entire
523
+ match. The capturing group at index `1` always corresponds to the first
524
+ explicit capturing group found in the regex pattern.)
525
+
526
+ Capturing groups can also be named, which is sometimes more convenient than
527
+ using the indices. For example, the following command is equivalent to the
528
+ above command:
529
+
530
+ ```
531
+ $ rg 'fast\s+(?P<word>\w+)' README.md -r 'fast-$word'
532
+ 88: color and full Unicode support. Unlike GNU grep, `ripgrep` stays fast-while
533
+ 124:Summarizing, `ripgrep` is fast-because:
534
+ ```
535
+
536
+ It is important to note that ripgrep **will never modify your files**. The
537
+ `--replace` flag only controls ripgrep's output. (And there is no flag to let
538
+ you do a replacement in a file.)
539
+
540
+
541
+ ### Configuration file
542
+
543
+ It is possible that ripgrep's default options aren't suitable in every case.
544
+ For that reason, and because shell aliases aren't always convenient, ripgrep
545
+ supports configuration files.
546
+
547
+ Setting up a configuration file is simple. ripgrep will not look in any
548
+ predetermined directory for a config file automatically. Instead, you need to
549
+ set the `RIPGREP_CONFIG_PATH` environment variable to the file path of your
550
+ config file. Once the environment variable is set, open the file and just type
551
+ in the flags you want set automatically. There are only two rules for
552
+ describing the format of the config file:
553
+
554
+ 1. Every line is a shell argument, after trimming whitespace.
555
+ 2. Lines starting with `#` (optionally preceded by any amount of whitespace)
556
+ are ignored.
557
+
558
+ In particular, there is no escaping. Each line is given to ripgrep as a single
559
+ command line argument verbatim.
560
+
561
+ Here's an example of a configuration file, which demonstrates some of the
562
+ formatting peculiarities:
563
+
564
+ ```
565
+ $ cat $HOME/.ripgreprc
566
+ # Don't let ripgrep vomit really long lines to my terminal, and show a preview.
567
+ --max-columns=150
568
+ --max-columns-preview
569
+
570
+ # Add my 'web' type.
571
+ --type-add
572
+ web:*.{html,css,js}*
573
+
574
+ # Search hidden files / directories (e.g. dotfiles) by default
575
+ --hidden
576
+
577
+ # Using glob patterns to include/exclude files or folders
578
+ --glob=!.git/*
579
+
580
+ # or
581
+ --glob
582
+ !.git/*
583
+
584
+ # Set the colors.
585
+ --colors=line:none
586
+ --colors=line:style:bold
587
+
588
+ # Because who cares about case!?
589
+ --smart-case
590
+ ```
591
+
592
+ When we use a flag that has a value, we either put the flag and the value on
593
+ the same line but delimited by an `=` sign (e.g., `--max-columns=150`), or we
594
+ put the flag and the value on two different lines. This is because ripgrep's
595
+ argument parser knows to treat the single argument `--max-columns=150` as a
596
+ flag with a value, but if we had written `--max-columns 150` in our
597
+ configuration file, then ripgrep's argument parser wouldn't know what to do
598
+ with it.
599
+
600
+ Putting the flag and value on different lines is exactly equivalent and is a
601
+ matter of style.
602
+
603
+ Comments are encouraged so that you remember what the config is doing. Empty
604
+ lines are OK too.
605
+
606
+ So let's say you're using the above configuration file, but while you're at a
607
+ terminal, you really want to be able to see lines longer than 150 columns. What
608
+ do you do? Thankfully, all you need to do is pass `--max-columns 0` (or `-M0`
609
+ for short) on the command line, which will override your configuration file's
610
+ setting. This works because ripgrep's configuration file is *prepended* to the
611
+ explicit arguments you give it on the command line. Since flags given later
612
+ override flags given earlier, everything works as expected. This works for most
613
+ other flags as well, and each flag's documentation states which other flags
614
+ override it.
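
The override behavior can be seen with a throwaway configuration file. This is a minimal sketch, assuming `rg` is on your `PATH`; the file names `ripgreprc` and `input.txt`, the 20-column limit and the search term `needle` are made up for illustration.

```shell
# Hypothetical throwaway config and input file, used only for this demonstration.
work=$(mktemp -d)
printf '%s\n' '# Hide lines longer than 20 columns.' '--max-columns=20' > "$work/ripgreprc"
echo 'needle followed by a fairly long tail of text' > "$work/input.txt"

# Run ripgrep with the throwaway config in effect; extra flags come after it.
rg_with_config() {
    RIPGREP_CONFIG_PATH="$work/ripgreprc" rg "$@" needle "$work/input.txt"
}

if command -v rg >/dev/null 2>&1; then
    rg_with_config          # config applies: the long line is replaced by a placeholder
    rg_with_config -M0      # command line wins: the full line is printed
fi
```

Because the variable is set only for the single `rg` invocation inside the function, the sketch leaves your real configuration untouched.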
615
+
616
+ If you're confused about what configuration file ripgrep is reading arguments
617
+ from, then running ripgrep with the `--debug` flag should help clarify things.
618
+ The debug output should note what config file is being loaded and the arguments
619
+ that have been read from the configuration.
620
+
621
+ Finally, if you want to make absolutely sure that ripgrep *isn't* reading a
622
+ configuration file, then you can pass the `--no-config` flag, which will always
623
+ prevent ripgrep from reading extraneous configuration from the environment,
624
+ regardless of what other methods of configuration are added to ripgrep in the
625
+ future.
626
+
627
+
628
+ ### File encoding
629
+
630
+ [Text encoding](https://en.wikipedia.org/wiki/Character_encoding) is a complex
631
+ topic, but we can try to summarize its relevance to ripgrep:
632
+
633
+ * Files are generally just a bundle of bytes. There is no reliable way to know
634
+ their encoding.
635
+ * Either the encoding of the pattern must match the encoding of the files being
636
+ searched, or a form of transcoding must be performed that converts either the
637
+ pattern or the file to the same encoding as the other.
638
+ * ripgrep tends to work best on plain text files, and among plain text files,
639
+ the most popular encodings likely consist of ASCII, latin1 or UTF-8. As
640
+ a special exception, UTF-16 is prevalent in Windows environments.
641
+
642
+ In light of the above, here is how ripgrep behaves when `--encoding auto` is
643
+ given, which is the default:
644
+
645
+ * All input is assumed to be ASCII compatible (which means every byte that
646
+ corresponds to an ASCII codepoint actually is an ASCII codepoint). This
647
+ includes ASCII itself, latin1 and UTF-8.
648
+ * ripgrep works best with UTF-8. For example, ripgrep's regular expression
649
+ engine supports Unicode features. Namely, character classes like `\w` will
650
+ match all word characters by Unicode's definition and `.` will match any
651
+ Unicode codepoint instead of any byte. These constructions assume UTF-8,
652
+ so they simply won't match when they come across bytes in a file that aren't
653
+ UTF-8.
654
+ * To handle the UTF-16 case, ripgrep will do something called "BOM sniffing"
655
+ by default. That is, the first three bytes of a file will be read, and if
656
+ they correspond to a UTF-16 BOM, then ripgrep will transcode the contents of
657
+ the file from UTF-16 to UTF-8, and then execute the search on the transcoded
658
+ version of the file. (This incurs a performance penalty since transcoding
659
+ is needed in addition to regex searching.) If the file contains invalid
660
+ UTF-16, then the Unicode replacement codepoint is substituted in place of
661
+ invalid code units.
662
+ * To handle other cases, ripgrep provides a `-E/--encoding` flag, which permits
663
+ you to specify an encoding from the
664
+ [Encoding Standard](https://encoding.spec.whatwg.org/#concept-encoding-get).
665
+ ripgrep will assume *all* files searched are the encoding specified (unless
666
+ the file has a BOM) and will perform a transcoding step just like in the
667
+ UTF-16 case described above.
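
To make the idea of BOM sniffing concrete, here is a minimal sketch that classifies a file by its leading bytes. It illustrates the technique only and is not ripgrep's implementation; the function name `sniff_bom` and the sample file are made up. It relies on `head -c`, which is widely available but not required by POSIX, and it does not try to distinguish a UTF-32 BOM.

```shell
# Classify a file by its leading bytes. This is an illustrative sketch of BOM
# sniffing, not ripgrep's implementation.
sniff_bom() {
    # Hex-dump the first three bytes as one compact string, e.g. "fffe41".
    lead=$(head -c 3 "$1" | od -An -tx1 | tr -d ' \t\n')
    case "$lead" in
        efbbbf*) echo utf-8 ;;
        fffe*)   echo utf-16le ;;
        feff*)   echo utf-16be ;;
        *)       echo none ;;
    esac
}

# Hypothetical sample: a UTF-16LE BOM followed by the letter "A".
sample=$(mktemp)
printf '\377\376A\000' > "$sample"
sniff_bom "$sample"
```

A tool that sniffs a UTF-16 BOM this way can then transcode the rest of the file before searching it, which is the step described above.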
668
+
669
+ By default, ripgrep will not require its input be valid UTF-8. That is, ripgrep
670
+ can and will search arbitrary bytes. The key here is that if you're searching
671
+ content that isn't UTF-8, then the usefulness of your pattern will degrade. If
672
+ you're searching bytes that aren't ASCII compatible, then it's likely the
673
+ pattern won't find anything. With all that said, this mode of operation is
674
+ important, because it lets you find ASCII or UTF-8 *within* files that are
675
+ otherwise arbitrary bytes.
676
+
677
+ As a special case, the `-E/--encoding` flag supports the value `none`, which
678
+ will completely disable all encoding related logic, including BOM sniffing.
679
+ When `-E/--encoding` is set to `none`, ripgrep will search the raw bytes of
680
+ the underlying file with no transcoding step. For example, here's how you might
681
+ search the raw UTF-16 encoding of the string `Шерлок`:
682
+
683
+ ```
684
+ $ rg '(?-u)\(\x045\x04@\x04;\x04>\x04:\x04' -E none -a some-utf16-file
685
+ ```
686
+
687
+ Of course, that's just an example meant to show how one can drop down into
688
+ raw bytes. Namely, the simpler command works as you might expect automatically:
689
+
690
+ ```
691
+ $ rg 'Шерлок' some-utf16-file
692
+ ```
693
+
694
+ Finally, it is possible to disable ripgrep's Unicode support from within the
695
+ regular expression. For example, let's say you wanted `.` to match any byte
696
+ rather than any Unicode codepoint. (You might want this while searching a
697
+ binary file, since `.` by default will not match invalid UTF-8.) You could do
698
+ this by disabling Unicode via a regular expression flag:
699
+
700
+ ```
701
+ $ rg '(?-u:.)'
702
+ ```
703
+
704
+ This works for any part of the pattern. For example, the following will find
705
+ any Unicode word character followed by any ASCII word character followed by
706
+ another Unicode word character:
707
+
708
+ ```
709
+ $ rg '\w(?-u:\w)\w'
710
+ ```
711
+
712
+
713
+ ### Binary data
714
+
715
+ In addition to skipping hidden files and files in your `.gitignore` by default,
716
+ ripgrep also attempts to skip binary files. ripgrep does this by default
717
+ because binary files (like PDFs or images) are typically not things you want to
718
+ search when searching for regex matches. Moreover, if content in a binary file
719
+ did match, then it's possible for undesirable binary data to be printed to your
720
+ terminal and wreak havoc.
721
+
722
+ Unfortunately, unlike skipping hidden files and respecting your `.gitignore`
723
+ rules, a file cannot as easily be classified as binary. In order to figure out
724
+ whether a file is binary, the most effective heuristic that balances
725
+ correctness with performance is to simply look for `NUL` bytes. At that point,
726
+ the determination is simple: a file is considered "binary" if and only if it
727
+ contains a `NUL` byte somewhere in its contents.
728
+
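The heuristic described above can be approximated from the shell. This is only a sketch of the idea, not ripgrep's implementation: `has_nul` is a hypothetical helper, and it reads the whole file, whereas ripgrep's detection depends on its search strategy.

```shell
#!/bin/sh
# has_nul FILE: succeed if FILE contains at least one NUL byte.
# Hypothetical helper that approximates the heuristic; not ripgrep's code.
has_nul() {
    total=$(( $(wc -c < "$1") ))
    # tr -d '\000' deletes every NUL byte, so a smaller count means
    # at least one NUL byte was present.
    without_nul=$(( $(tr -d '\000' < "$1" | wc -c) ))
    [ "$without_nul" -lt "$total" ]
}

tmpdir=$(mktemp -d)
printf 'plain text\n' > "$tmpdir/text-file"
printf 'mostly text\000\n' > "$tmpdir/binary-file"

for f in "$tmpdir/text-file" "$tmpdir/binary-file"; do
    if has_nul "$f"; then
        echo "${f##*/}: binary"
    else
        echo "${f##*/}: text"
    fi
done
rm -rf "$tmpdir"
```

The second file is classified as binary even though it is almost entirely text, which is the kind of case the `-a/--text` flag exists for.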
729
+ The issue is that while most binary files will have a `NUL` byte toward the
730
+ beginning of their contents, this is not necessarily true. The `NUL` byte might
731
+ be the very last byte in a large file, but that file is still considered
732
+ binary. While this leads to a fair amount of complexity inside ripgrep's
733
+ implementation, it also results in some unintuitive user experiences.
734
+
735
+ At a high level, ripgrep operates in three different modes with respect to
736
+ binary files:
737
+
738
+ 1. The default mode is to attempt to remove binary files from a search
739
+ completely. This is meant to mirror how ripgrep removes hidden files and
740
+ files in your `.gitignore` automatically. That is, as soon as a file is
741
+ detected as binary, searching stops. If a match was already printed (because
742
+ it was detected long before a `NUL` byte), then ripgrep will print a warning
743
+ message indicating that the search stopped prematurely. This default mode
744
+ **only applies to files searched by ripgrep as a result of recursive
745
+ directory traversal**, which is consistent with ripgrep's other automatic
746
+ filtering. For example, `rg foo .file` will search `.file` even though it
747
+ is hidden. Similarly, `rg foo binary-file` will search `binary-file` in
748
+ "binary" mode automatically.
749
+ 2. Binary mode is similar to the default mode, except it will not always
750
+ stop searching after it sees a `NUL` byte. Namely, in this mode, ripgrep
751
+ will continue searching a file that is known to be binary until the first
752
+ of two conditions is met: 1) the end of the file has been reached or 2) a
753
+ match is or has been seen. This means that in binary mode, if ripgrep
754
+ reports no matches, then there are no matches in the file. When a match does
755
+ occur, ripgrep prints a message similar to one it prints when in its default
756
+ mode indicating that the search has stopped prematurely. This mode can be
757
+ forcefully enabled for all files with the `--binary` flag. The purpose of
758
+ binary mode is to provide a way to discover matches in all files, but to
759
+ avoid having binary data dumped into your terminal.
760
+ 3. Text mode completely disables all binary detection and searches all files
761
+ as if they were text. This is useful when searching a file that is
762
+ predominantly text but contains a `NUL` byte, or if you are specifically
763
+ trying to search binary data. This mode can be enabled with the `-a/--text`
764
+ flag. Note that when using this mode on very large binary files, it is
765
+ possible for ripgrep to use a lot of memory.
766
+
767
+ Unfortunately, there is one additional complexity in ripgrep that can make it
768
+ difficult to reason about binary files. That is, the way binary detection works
769
+ depends on the way that ripgrep searches your files. Specifically:
770
+
771
+ * When ripgrep uses memory maps, then binary detection is only performed on the
772
+ first few kilobytes of the file in addition to every matching line.
773
+ * When ripgrep doesn't use memory maps, then binary detection is performed on
774
+ all bytes searched.
775
+
776
+ This means that whether a file is detected as binary or not can change based
777
+ on the internal search strategy used by ripgrep. If you prefer to keep
778
+ ripgrep's binary file detection consistent, then you can disable memory maps
779
+ via the `--no-mmap` flag. (The cost will be a small performance regression when
780
+ searching very large files on some platforms.)
781
+
782
+
783
+ ### Preprocessor
784
+
785
+ In ripgrep, a preprocessor is any type of command that can be run to transform
786
+ the input of every file before ripgrep searches it. This makes it possible to
787
+ search virtually any kind of content that can be automatically converted to
788
+ text without having to teach ripgrep how to read said content.
789
+
790
+ One common example is searching PDFs. PDFs are first and foremost meant to be
791
+ displayed to users. But PDFs often have text streams in them that can be useful
792
+ to search. In our case, we want to search Bruce Watson's excellent
793
+ dissertation,
794
+ [Taxonomies and Toolkits of Regular Language Algorithms](https://burntsushi.net/stuff/1995-watson.pdf).
795
+ After downloading it, let's try searching it:
796
+
797
+ ```
798
+ $ rg 'The Commentz-Walter algorithm' 1995-watson.pdf
799
+ $
800
+ ```
801
+
802
+ Surely, a dissertation on regular language algorithms would mention
803
+ Commentz-Walter. Indeed it does, but our search isn't picking it up because
804
+ PDFs are a binary format, and the text shown in the PDF may not be encoded as
805
+ simple contiguous UTF-8. Namely, even passing the `-a/--text` flag to ripgrep
806
+ will not make our search work.
807
+
808
+ One way to fix this is to convert the PDF to plain text first. This won't work
809
+ well for all PDFs, but does great in a lot of cases. (Note that the tool we
810
+ use, `pdftotext`, is part of the [poppler](https://poppler.freedesktop.org)
811
+ PDF rendering library.)
812
+
813
+ ```
814
+ $ pdftotext 1995-watson.pdf > 1995-watson.txt
815
+ $ rg 'The Commentz-Walter algorithm' 1995-watson.txt
816
+ 316:The Commentz-Walter algorithms : : : : : : : : : : : : : : :
817
+ 7165:4.4 The Commentz-Walter algorithms
818
+ 10062:in input string S , we obtain the Boyer-Moore algorithm. The Commentz-Walter algorithm
819
+ 17218:The Commentz-Walter algorithm (and its variants) displayed more interesting behaviour,
820
+ 17249:Aho-Corasick algorithms are used extensively. The Commentz-Walter algorithms are used
821
+ 17297: The Commentz-Walter algorithms (CW). In all versions of the CW algorithms, a common program skeleton is used with di erent shift functions. The CW algorithms are
822
+ ```
823
+
824
+ But having to explicitly convert every file can be a pain, especially when you
825
+ have a directory full of PDF files. Instead, we can use ripgrep's preprocessor
826
+ feature to search the PDF. ripgrep's `--pre` flag works by taking a single
827
+ command name and then executing that command for every file that it searches.
828
+ ripgrep passes the file path as the first and only argument to the command and
829
+ also sends the contents of the file to stdin. So let's write a simple shell
830
+ script that wraps `pdftotext` in a way that conforms to this interface:
831
+
832
+ ```
833
+ $ cat preprocess
834
+ #!/bin/sh
835
+
836
+ exec pdftotext - -
837
+ ```
838
+
839
+ With `preprocess` in the same directory as `1995-watson.pdf`, we can now use it
840
+ to search the PDF:
841
+
842
+ ```
843
+ $ rg --pre ./preprocess 'The Commentz-Walter algorithm' 1995-watson.pdf
844
+ 316:The Commentz-Walter algorithms : : : : : : : : : : : : : : :
845
+ 7165:4.4 The Commentz-Walter algorithms
846
+ 10062:in input string S , we obtain the Boyer-Moore algorithm. The Commentz-Walter algorithm
847
+ 17218:The Commentz-Walter algorithm (and its variants) displayed more interesting behaviour,
848
+ 17249:Aho-Corasick algorithms are used extensively. The Commentz-Walter algorithms are used
849
+ 17297: The Commentz-Walter algorithms (CW). In all versions of the CW algorithms, a common program skeleton is used with di erent shift functions. The CW algorithms are
850
+ ```
851
+
852
+ Note that `preprocess` must be resolvable to a command that ripgrep can execute.
853
+ The simplest way to do this is to put your preprocessor command in a directory
854
+ that is in your `PATH` (or equivalent), or otherwise use an absolute path.
855
+
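When a preprocessor misbehaves, it can be easier to debug it outside of ripgrep. The sketch below mimics the interface described above (file path as the only argument, file contents on stdin). `run_pre` is a hypothetical helper, not a ripgrep feature, and the stand-in `upcase` script exists only so the example runs without `pdftotext` installed.

```shell
#!/bin/sh
# run_pre PREPROCESSOR FILE: invoke PREPROCESSOR with FILE's path as its
# only argument and FILE's contents on stdin.
# Hypothetical debugging helper for illustration only.
run_pre() {
    "$1" "$2" < "$2"
}

# Stand-in preprocessor that upper-cases whatever it reads from stdin.
tmpdir=$(mktemp -d)
cat > "$tmpdir/upcase" <<'EOF'
#!/bin/sh
exec tr '[:lower:]' '[:upper:]'
EOF
chmod +x "$tmpdir/upcase"
printf 'commentz-walter\n' > "$tmpdir/sample.txt"

run_pre "$tmpdir/upcase" "$tmpdir/sample.txt"   # prints COMMENTZ-WALTER
rm -rf "$tmpdir"
```

Once the script produces the expected text when run this way, the same path can be passed to `--pre`.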
856
+ As a bonus, this turns out to be quite a bit faster than other specialized PDF
857
+ grepping tools:
858
+
859
+ ```
860
+ $ time rg --pre ./preprocess 'The Commentz-Walter algorithm' 1995-watson.pdf -c
861
+ 6
862
+
863
+ real 0.697
864
+ user 0.684
865
+ sys 0.007
866
+ maxmem 16 MB
867
+ faults 0
868
+
869
+ $ time pdfgrep 'The Commentz-Walter algorithm' 1995-watson.pdf -c
870
+ 6
871
+
872
+ real 1.336
873
+ user 1.310
874
+ sys 0.023
875
+ maxmem 16 MB
876
+ faults 0
877
+ ```
878
+
879
+ If you wind up needing to search a lot of PDFs, then ripgrep's parallelism can
880
+ make the speed difference even greater.
881
+
882
+ #### A more robust preprocessor
883
+
884
+ One of the problems with the aforementioned preprocessor is that it will fail
885
+ if you try to search a file that isn't a PDF:
886
+
887
+ ```
888
+ $ echo foo > not-a-pdf
889
+ $ rg --pre ./preprocess 'The Commentz-Walter algorithm' not-a-pdf
890
+ not-a-pdf: preprocessor command failed: '"./preprocess" "not-a-pdf"':
891
+ -------------------------------------------------------------------------------
892
+ Syntax Warning: May not be a PDF file (continuing anyway)
893
+ Syntax Error: Couldn't find trailer dictionary
894
+ Syntax Error: Couldn't find trailer dictionary
895
+ Syntax Error: Couldn't read xref table
896
+ ```
897
+
898
+ To fix this, we can make our preprocessor script a bit more robust by only
899
+ running `pdftotext` when we think the input is a non-empty PDF:
900
+
901
+ ```
902
+ $ cat preprocess
903
+ #!/bin/sh
904
+
905
+ case "$1" in
906
+ *.pdf)
907
+   # The -s flag ensures that the file is non-empty.
908
+   if [ -s "$1" ]; then
909
+     exec pdftotext - -
910
+   else
911
+     exec cat
912
+   fi
913
+   ;;
914
+ *)
915
+   exec cat
916
+   ;;
917
+ esac
918
+ ```
919
+
920
+ We can even extend our preprocessor to search other kinds of files. The file
921
+ name doesn't always reveal the file type, so we can use the `file`
922
+ utility to "sniff" the type of the file based on its contents:
923
+
924
+ ```
925
+ $ cat preprocess
926
+ #!/bin/sh
927
+
928
+ case "$1" in
929
+ *.pdf)
930
+   # The -s flag ensures that the file is non-empty.
931
+   if [ -s "$1" ]; then
932
+     exec pdftotext - -
933
+   else
934
+     exec cat
935
+   fi
936
+   ;;
937
+ *)
938
+   case $(file "$1") in
939
+   *Zstandard*)
940
+     exec pzstd -cdq
941
+     ;;
942
+   *)
943
+     exec cat
944
+     ;;
945
+   esac
946
+   ;;
947
+ esac
948
+ ```
949
+
950
+ #### Reducing preprocessor overhead
951
+
952
+ There is one more problem with the above approach: it requires running a
953
+ preprocessor for every single file that ripgrep searches. If every file needs
954
+ a preprocessor, then this is OK. But if most don't, then this can substantially
955
+ slow down searches because of the overhead of launching new processes. You
956
+ can avoid this by telling ripgrep to only invoke the preprocessor when the file
957
+ path matches a glob. For example, consider the performance difference even when
958
+ searching a repository as small as ripgrep's:
959
+
960
+ ```
961
+ $ time rg --pre pre-rg 'fn is_empty' -c
962
+ crates/globset/src/lib.rs:1
963
+ crates/matcher/src/lib.rs:2
964
+ crates/ignore/src/overrides.rs:1
965
+ crates/ignore/src/gitignore.rs:1
966
+ crates/ignore/src/types.rs:1
967
+
968
+ real 0.138
969
+ user 0.485
970
+ sys 0.209
971
+ maxmem 7 MB
972
+ faults 0
973
+
974
+ $ time rg --pre pre-rg --pre-glob '*.pdf' 'fn is_empty' -c
975
+ crates/globset/src/lib.rs:1
976
+ crates/ignore/src/types.rs:1
977
+ crates/ignore/src/gitignore.rs:1
978
+ crates/ignore/src/overrides.rs:1
979
+ crates/matcher/src/lib.rs:2
980
+
981
+ real 0.008
982
+ user 0.010
983
+ sys 0.002
984
+ maxmem 7 MB
985
+ faults 0
986
+ ```
987
+
988
+
989
+ ### Common options
990
+
991
+ ripgrep has a lot of flags. Too many to keep in your head at once. This section
992
+ is intended to give you a sampling of some of the most important and frequently
993
+ used options that will likely impact how you use ripgrep on a regular basis.
994
+
995
+ * `-h`: Show ripgrep's condensed help output.
996
+ * `--help`: Show ripgrep's longer form help output. (Nearly what you'd find in
997
+ ripgrep's man page, so pipe it into a pager!)
998
+ * `-i/--ignore-case`: When searching for a pattern, ignore case differences.
999
+ That is, `rg -i fast` matches `fast`, `fASt`, `FAST`, etc.
1000
+ * `-S/--smart-case`: This is similar to `--ignore-case`, but disables itself
1001
+ if the pattern contains any uppercase letters. Usually this flag is put into
1002
+ an alias or a config file.
1003
+ * `-F/--fixed-strings`: Disable regular expression matching and treat the pattern
1004
+ as a literal string.
1005
+ * `-w/--word-regexp`: Require that all matches of the pattern be surrounded
1006
+ by word boundaries. That is, given `pattern`, the `--word-regexp` flag will
1007
+ cause ripgrep to behave as if `pattern` were actually `\b(?:pattern)\b`.
1008
+ * `-c/--count`: Report a count of total matched lines.
1009
+ * `--files`: Print the files that ripgrep *would* search, but don't actually
1010
+ search them.
1011
+ * `-a/--text`: Search binary files as if they were plain text.
1012
+ * `-U/--multiline`: Permit matches to span multiple lines.
1013
+ * `-z/--search-zip`: Search compressed files (gzip, bzip2, lzma, xz, lz4,
1014
+ brotli, zstd). This is disabled by default.
1015
+ * `-C/--context`: Show the lines surrounding a match.
1016
+ * `--sort path`: Force ripgrep to sort its output by file name. (This disables
1017
+ parallelism, so it might be slower.)
1018
+ * `-L/--follow`: Follow symbolic links while recursively searching.
1019
+ * `-M/--max-columns`: Limit the length of lines printed by ripgrep.
1020
+ * `--debug`: Shows ripgrep's debug output. This is useful for understanding
1021
+ why a particular file might be ignored from search, or what kinds of
1022
+ configuration ripgrep is loading from the environment.