Xah Lee, 2007-08.
Emacs's regex syntax is not based on Perl's or Python's, but it is mostly the same. The main exception is that in emacs regex the parenthesis characters “()” are literal. If you want to capture a pattern, you need to escape the parens like this: “\(myPattern\)”.
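Here is a small sketch of the paren rule, using “string-match” and “match-string”. The function names starting with “my-” are made up for illustration. The backslashes are doubled only because the regex sits inside a lisp string.

```elisp
;; Hypothetical helper names; only string-match and match-string are built in.

(defun my-literal-paren-position (text)
  "Return the index in TEXT where the literal string \"(abc)\" starts, or nil."
  ;; Unescaped parens match the characters ( and ) themselves.
  (string-match "(abc)" text))

(defun my-capture-key-value (text)
  "Return (KEY VALUE) captured from a \"key=value\" substring of TEXT, or nil."
  ;; Escaped parens \( \) create capture groups 1 and 2.
  (when (string-match "\\([a-z]+\\)=\\([a-z]+\\)" text)
    (list (match-string 1 text) (match-string 2 text))))
```

Evaluating “(my-literal-paren-position "x(abc)y")” gives 1, and “(my-capture-key-value "key=value")” gives the list ("key" "value").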
Here are some common patterns:
Pattern          Matches
-----------------------------------
.                any single character
\.               one period
[0-9]+           digit sequence
[A-Za-z]+        sequence of letters
[_A-Za-z0-9]+    sequence of alphanumeric char and underscore
[-A-Za-z0-9]+    sequence of alphanumeric char and hyphen
[[:blank:]]+     sequence of tabs and spaces
"\([^"]+\)"      capture text between double quotes

+ means match previous pattern 1 or more times
* means match previous pattern 0 or more times
? means match previous pattern 0 or 1 time
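A quick way to try these patterns from lisp is sketched below. The helper “my-match-group” is a made-up name, not a built-in function. The sample strings are invented too.

```elisp
;; Hypothetical helper for trying a pattern against a string.
(defun my-match-group (regex text &optional group)
  "Return the part of TEXT matched by REGEX, or nil if there is no match.
GROUP selects a capture group and defaults to 0, the whole match."
  (when (string-match regex text)
    (match-string (or group 0) text)))

(my-match-group "[0-9]+" "order 4521 shipped")        ; digit sequence
(my-match-group "[A-Za-z]+" "42 apples")              ; sequence of letters
(my-match-group "[_A-Za-z0-9]+" "  my_var2 = 1")      ; alphanumerics and underscore
(my-match-group "[-A-Za-z0-9]+" "!well-known!")       ; alphanumerics and hyphen
(my-match-group "[[:blank:]]+" "a \t b")              ; tabs and spaces
;; Inside a lisp string, each double quote in the regex is written as \"
;; and each regex backslash is written as \\.
(my-match-group "\"\\([^\"]+\\)\"" "say \"hello world\" now" 1)
```

The six calls return "4521", "apples", "my_var2", "well-known", the three characters space, tab, space, and "hello world".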
The above patterns will cover the vast majority of regex uses.
If you are familiar with Perl's regex, here are some major practical differences.
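As a rough illustration, here are two differences that Perl users commonly run into. These are picked by way of example and are not necessarily the ones the author had in mind. There is no “\d” shorthand for digits, so use “[0-9]” or “[[:digit:]]”. Alternation is written “\|” because a bare “|” is literal. The function names are made up.

```elisp
;; Hypothetical helper names; the regex syntax shown is standard emacs regex.

(defun my-digit-run (text)
  "Return the first run of digits in TEXT, or nil."
  ;; Perl would allow \d+ here; emacs regex needs [0-9]+ or [[:digit:]]+.
  (when (string-match "[[:digit:]]+" text)
    (match-string 0 text)))

(defun my-cat-or-dog (text)
  "Return \"cat\" or \"dog\", whichever occurs first in TEXT, or nil."
  ;; Alternation is \| in emacs regex, written \\| inside a lisp string.
  (when (string-match "cat\\|dog" text)
    (match-string 0 text)))
```

For example, “(my-digit-run "abc123")” returns "123" and “(my-cat-or-dog "hot dog")” returns "dog".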
Emacs has an interactive regex mode, so that you can type a pattern and immediately see what it matches as you type. To enter the mode, type “Alt+x regexp-builder”.
Alternatively, you can type “Alt+x query-replace-regexp” to test your pattern. Its keyboard shortcut is “Ctrl+Alt+%”.
To test a regex in lisp code, you can open an empty file and place this code “(search-forward-regexp "yourRegex")” in it, then place the text you want to match below it. Then, move your cursor to right after the closing parenthesis and type “Ctrl+x Ctrl+e” (or “Alt+x eval-last-sexp”). If your regex matches, the cursor will move to the end of the matched text. If you get a lisp error saying search-failed, then you know your regex didn't match. If you get a lisp syntax error, then you probably made a mistake with the backslashes.
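The same test can be run without touching a file. The sketch below puts the text in a temporary buffer and catches the search-failed error. “my-try-regex” is a made-up name, and returning the symbol “no-match” is just a choice made for this example.

```elisp
;; Hypothetical helper: try a regex against some text in a scratch buffer.
(defun my-try-regex (regex text)
  "Search for REGEX in TEXT from the start of a temporary buffer.
Return the matched text, or the symbol `no-match' if the search fails."
  (with-temp-buffer
    (insert text)
    (goto-char (point-min))
    (condition-case nil
        (progn
          ;; Signals a search-failed error when nothing matches.
          (search-forward-regexp regex)
          (match-string 0))
      (search-failed 'no-match))))
```

With this, “(my-try-regex "[0-9]+" "abc 123 def")” returns "123", while “(my-try-regex "[0-9]+" "abc")” returns the symbol no-match.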
In a lisp function that takes a regex string (for example, “search-forward-regexp”), you need to use double backslashes. This is because in an elisp string a backslash must itself be prefixed with a backslash. The lisp reader turns each pair into a single backslash, and the resulting string is then passed to emacs's regex engine.
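For example, suppose you have this text “Sin[x] + Sin[y]”, and you need to capture the x or y. You can use “(search-backward-regexp "\\(\\[[a-z]\\]\\)")”, which captures the bracketed text “[x]” or “[y]”. The regex engine itself receives just “\(\[[a-z]\]\)”. To capture only the letter, move the group inside the brackets: “\\[\\([a-z]\\)\\]”.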
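To see the doubling at work, the sketch below stores the regex from the example in a constant and collects every match from a string. The names “my-bracket-regex” and “my-bracket-captures” are made up. It uses “string-match” on a string rather than searching a buffer, which is a substitution made to keep the example self-contained.

```elisp
;; Hypothetical names. The source text has 17 characters between the
;; quotes, but the string the regex engine receives has only 13,
;; because each \\ is read as one backslash.
(defconst my-bracket-regex "\\(\\[[a-z]\\]\\)")

(defun my-bracket-captures (text)
  "Return a list of group 1 from every match of `my-bracket-regex' in TEXT."
  (let ((start 0)
        (found '()))
    (while (string-match my-bracket-regex text start)
      (push (match-string 1 text) found)
      ;; Continue searching just past the end of this match.
      (setq start (match-end 0)))
    (nreverse found)))
```

Evaluating “(length my-bracket-regex)” gives 13, and “(my-bracket-captures "Sin[x] + Sin[y]")” gives the list ("[x]" "[y]").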
If you want to match a newline, write “\n” with a single backslash, because “\n” already stands for the newline character inside a lisp string.
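A small sketch that mixes both cases: the capture groups need double backslashes, while the newline is written with a single one. “my-split-at-newline” is a made-up name.

```elisp
;; Hypothetical helper showing \\( \\) next to a plain \n.
(defun my-split-at-newline (text)
  "Return (BEFORE AFTER) for the lines around the first newline in TEXT, or nil."
  ;; "." does not match a newline, so each group stays on its own line.
  (when (string-match "\\(.*\\)\n\\(.*\\)" text)
    (list (match-string 1 text) (match-string 2 text))))
```

For example, “(my-split-at-newline "first line\nsecond line")” returns ("first line" "second line"). Note that “(length "\n")” is 1, since the lisp reader has already turned it into a single newline character.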
Reference: (info "(emacs)Regexps").
Reference: Elisp Manual: Regular-Expressions.
Page created: 2007-08. © 2007 by Xah Lee.