Xah Lee, 2008-06
This page collects some basic emacs lisp programming patterns related to text processing.
This page is grouped into 2 sections: Interactive and Batch. The Interactive section focuses on idioms for writing the type of commands you call when actively editing. For example: google search the word under cursor, replace certain words in the current region, insert an XML template, rename a given function in the current programming project, etc. The Batch section focuses on batch style text processing, typically the type of tasks one would do with unix shell tools or Perl. For example: find and replace on a list of given files or a dir, run XML validation on a bunch of files.
The idioms on this page are very basic ones, covering the most frequently needed cases, suitable for the beginning elisp programmer.
You should first be familiar with the basic emacs functions that get cursor position, move cursor, search text, and insert and delete text. See Emacs Lisp Basic Text-editing Functions.
This is the typical template for a user-defined emacs command.
    (defun my-function ()
      "... do this returns that ..."
      (interactive)
      (let (localVar1 localVar2 ...)
        (setq localVar1 ...)
        (setq localVar2 ...)
        ...
        ;; do something ...
        ))

    ; get the string from buffer
    (setq myStr (buffer-substring myStartPos myEndPos))
    (setq myStr (buffer-substring-no-properties myStartPos myEndPos))
Emacs's strings can have properties for the purposes of syntax coloring, active buttons, hypertext, etc. The “buffer-substring-no-properties” function just returns a plain string without these properties. However, most functions that take a string as argument can also accept a string that has properties.
Reference: Elisp Manual: Buffer-Contents.
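To see the difference between the two functions, here is a small sketch that works in a temp buffer. The function name “my-substring-demo” and the sample text are made up for illustration.

```elisp
(defun my-substring-demo ()
  "Return a list of 2 strings grabbed from a temp buffer.
The first is from `buffer-substring', the second from
`buffer-substring-no-properties'."
  (with-temp-buffer
    ;; insert text where the word hello carries a `face' property
    (insert (propertize "hello" 'face 'bold) " world")
    (list (buffer-substring 1 6)
          (buffer-substring-no-properties 1 6))))

;; inspect the properties on each result
(let ((result (my-substring-demo)))
  (message "with properties: %S" (text-properties-at 0 (car result)))
  (message "without properties: %S" (text-properties-at 0 (cadr result))))
```

Both strings have the same characters. Only the first one still carries the face property.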
Grabbing the current word, line, sentence, url, file name etc.
    ; grab a thing at point. The “thing” is a semantic unit. It can be a
    ; word, symbol, line, sentence, filename, url and others.

    ; grab the current word
    (setq myStr (thing-at-point 'word))

    ; grab the current word with hyphens or underscore
    (setq myStr (thing-at-point 'symbol))

    ; grab the current line
    (setq myStr (thing-at-point 'line))

    ; grab the start and end positions of a word (or any other thing)
    (setq myBoundaries (bounds-of-thing-at-point 'word))
Note that, when the thing is a “symbol”, it usually means any alphanumeric sequence with dash “-” or underscore “_” characters. For example, if you are writing a php reference lookup command, and the cursor is on p in “print_r($y);”, you want to grab the whole “print_r” not just “print”. The exact meaning of symbol depends on the mode's Syntax Table.
Reference: Elisp Manual: Syntax-Tables.
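The word versus symbol difference can be checked with a small sketch like the following. It runs in a temp buffer, so it uses the default syntax table, not php-mode's. The function name “my-word-vs-symbol-demo” is made up for illustration.

```elisp
(require 'thingatpt)

(defun my-word-vs-symbol-demo ()
  "Return a list (WORD SYMBOL) grabbed at the start of a sample php line."
  (with-temp-buffer
    (insert "print_r($y);")
    (goto-char (point-min))             ; put cursor on the p
    (list (thing-at-point 'word)
          (thing-at-point 'symbol))))

(message "%S" (my-word-vs-symbol-demo))
```

With the default syntax table, the word should come out as “print” and the symbol as “print_r”, because underscore counts as a symbol character but not a word character there.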
Here's an example of a php reference lookup command that grabs by “symbol” if there's no active region.
    (defun php-lookup ()
      "Look up current word in PHP ref site in a browser.\n
    If a region is active (a phrase), lookup that phrase."
      (interactive)
      (let (myword myurl)
        (setq myword
              (if (and transient-mark-mode mark-active)
                  (buffer-substring-no-properties (region-beginning) (region-end))
                (thing-at-point 'symbol)))
        (setq myurl (concat "http://us.php.net/" myword))
        (browse-url myurl)))
Reference: Elisp Manual: Buffer-Contents.
Grab the current text between delimiters such as between angle brackets “<>”, parens “()”, double quotes “""”, etc.
The trick is to use skip-chars-backward and skip-chars-forward. In the following example, p1 is set to the position just after the double quote to the left of the cursor, and p2 is set to the position just before the double quote to the right of the cursor.
    (defun select-inside-quotes ()
      "Select text between double straight quotes on each side of cursor."
      (interactive)
      (let (p1 p2)
        (skip-chars-backward "^\"")
        (setq p1 (point))
        (skip-chars-forward "^\"")
        (setq p2 (point))
        (goto-char p1)
        (push-mark p2)
        (setq mark-active t)))
If you want to grab text inside parens, you can change the “"^\""” to “"^("” and “"^)"”. Similarly for matching brackets “<>”. However, note that this code does not consider nested matching pairs.
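As a sketch of the paren variation, the following function returns the text as a string instead of selecting it. The name “my-text-inside-parens” is made up for illustration, and like the code above it does not handle nested parens.

```elisp
(defun my-text-inside-parens ()
  "Return the text between the nearest ( to the left of cursor
and the nearest ) to the right. Does not handle nested parens."
  (let (p1 p2)
    (save-excursion                     ; leave the cursor where it was
      (skip-chars-backward "^(")
      (setq p1 (point))
      (skip-chars-forward "^)")
      (setq p2 (point)))
    (buffer-substring-no-properties p1 p2)))

;; try it on sample text in a temp buffer
(with-temp-buffer
  (insert "foo(bar baz)qux")
  (goto-char 6)                         ; cursor inside the parens
  (message "%s" (my-text-inside-parens)))
```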
Idiom for a command that works on the current region's text.
Let your function have 2 parameters, conventionally named “start” and “end”, and use “(interactive "r")”. The parameters will then be filled with the beginning and ending positions of the region. Example:
    (defun remove-hard-wrap-region (start end)
      "Replace newline chars in region by single spaces."
      (interactive "r")
      (let ((fill-column 90002000))
        (fill-region start end)))
Idiom for acting on the region, if there's one, else, on the current word or thing.
    (defun down-case-word-or-region ()
      "Make current word or region into lower case."
      (interactive)
      (let (pos1 pos2)
        (if (and transient-mark-mode mark-active)
            (setq pos1 (region-beginning)
                  pos2 (region-end))
          (setq pos1 (car (bounds-of-thing-at-point 'word))
                pos2 (cdr (bounds-of-thing-at-point 'word))))
        (downcase-region pos1 pos2)))
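If you write several commands of this kind, the region-or-word test can be pulled into a helper. The following is a sketch, not a standard function: the names “my-region-or-word-bounds” and “my-upcase-word-or-region” are made up, and it assumes an emacs version that has “use-region-p”.

```elisp
(require 'thingatpt)

(defun my-region-or-word-bounds ()
  "Return a cons (START . END) of the active region if there is one,
else of the word under cursor, else nil."
  (if (use-region-p)
      (cons (region-beginning) (region-end))
    (bounds-of-thing-at-point 'word)))

(defun my-upcase-word-or-region ()
  "Make current word or region into upper case."
  (interactive)
  (let ((bounds (my-region-or-word-bounds)))
    ;; do nothing if there is no region and no word under cursor
    (when bounds
      (upcase-region (car bounds) (cdr bounds)))))
```

The check for nil avoids an error when the cursor is not on any word.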
Idiom for prompting user for input as the argument to your command.
Use this code “(interactive "‹code›‹prompt string›")”. Example:
    (defun query-friends-phone (name)
      "..."
      (interactive "sEnter friend's name: ")
      (message "Name: %s" name))
What the “(interactive "sEnter friend's name: ")” does is ask the user to input something, which is taken as a string and becomes the value of your command's parameter.
The “interactive” can be used to get other types of input. Here are some basic examples of using “interactive”.
The prompt text can follow the single-letter code string.
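A few commonly used codes can be sketched as follows. The function names here are made up for illustration. The codes shown are “s” (string), “n” (number), “f” (existing file name), “D” (directory name) and “r” (region start and end positions).

```elisp
(defun my-ask-string (str)
  "Prompt for a string."
  (interactive "sEnter a string: ")
  (message "String is: %s" str))

(defun my-ask-number (num)
  "Prompt for a number."
  (interactive "nEnter a number: ")
  (message "Number is: %s" num))

(defun my-ask-file (fpath)
  "Prompt for the name of an existing file."
  (interactive "fEnter a file name: ")
  (message "File is: %s" fpath))

(defun my-ask-dir (dir)
  "Prompt for a directory name."
  (interactive "DEnter a directory: ")
  (message "Directory is: %s" dir))

(defun my-show-region (start end)
  "No prompt. START and END are filled from the current region."
  (interactive "r")
  (message "Region is from %d to %d" start end))
```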
If your function takes multiple inputs, you can prompt the user multiple times with a single call to “interactive”, by joining the prompt code strings with “\n” in between, like this:
    (defun query-friends-phone (name age)
      "..."
      (interactive "sEnter friend's name: \nnEnter friend's age: ")
      (message "Name: %s, Age: %d" name age))
Reference: Elisp Manual: Defining-Commands.
The “interactive” function is useful for getting arguments for your command. But sometimes you need to prompt the user in the middle of a program. For example, “Make change to this file?”. You can use the “y-or-n-p” function. Like this:
    (if (y-or-n-p "Do it?")
        (progn
          ;; code to do something here
          )
      (progn
        ;; code if user answered no.
        ))
The y-or-n-p will ask the user to type a “y” or “n” character. You can also use “yes-or-no-p”, which forces the user to type the full “yes” or “no” to answer. This can be used, for example, when you want to confirm deleting files.
Reference: Elisp Manual: Yes-or-No-Queries.
If you need a more general mechanism for getting user input, you'll need to use “read-from-minibuffer”. This can be useful, for example, when you want to use features like input history. (For input with completion, there is “completing-read”.)
Reference: Elisp Manual: Text-from-Minibuffer.
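Here is a sketch of “read-from-minibuffer” with its own input history. The names “my-greet” and “my-greet-history” are made up for illustration. The history variable is passed as the 5th argument.

```elisp
(defvar my-greet-history nil
  "Input history for `my-greet'.")

(defun my-greet ()
  "Prompt for a name in the minibuffer, then show a greeting.
Previous inputs can be recalled with the usual history keys."
  (interactive)
  (let ((name (read-from-minibuffer
               "Your name: "            ; prompt
               nil                      ; no initial contents
               nil                      ; default keymap
               nil                      ; return input as a string
               'my-greet-history)))     ; history list to use
    (message "Hello, %s" name)))
```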
In a scripting language, such as perl, there are tens of functions that act on strings. In elisp, there are only a handful, because elisp provides a buffer datatype that's far more powerful, and you have literally thousands of functions that act on text in a buffer. When you have a string or text, and you need to do more than just getting a substring or the number of chars, put it in a temp buffer. Here's an example:
    ;; suppose mystr is a var whose value is a string
    (setq mystr "some string here you need to process")
    (setq mystr
          (with-temp-buffer
            (insert mystr)
            ;; manipulate the string here
            (buffer-string) ; get result
            ))
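As a concrete instance of the temp buffer idiom, the following sketch collapses each run of whitespace in a string into a single space. The function name “my-collapse-whitespace” is made up for illustration.

```elisp
(defun my-collapse-whitespace (str)
  "Return STR with each run of spaces, tabs and newlines
replaced by a single space."
  (with-temp-buffer
    (insert str)
    (goto-char (point-min))
    ;; re-search-forward is the same function as search-forward-regexp
    (while (re-search-forward "[ \t\n]+" nil t)
      (replace-match " "))
    (buffer-string)))

(message "%s" (my-collapse-whitespace "some   string\n\there"))
```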
Find and Replace string is the hallmark of text processing. Here's how you do it.
    ; idiom for string replacement in current buffer;
    ; use search-forward-regexp if you need regexp
    (goto-char (point-min))
    (while (search-forward "myStr1" nil t) (replace-match "myReplaceStr1"))
    (goto-char (point-min))
    (while (search-forward "myStr2" nil t) (replace-match "myReplaceStr2"))
    ;; repeat for other string pairs
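When there are many string pairs, the repetition can be put in a loop. The following is a sketch with a made-up function name “my-replace-pairs-in-buffer”. It passes t for the 2nd and 3rd arguments of “replace-match”, so that the replacement is inserted literally, with no letter case conversion and no special meaning for backslash.

```elisp
(defun my-replace-pairs-in-buffer (pairs)
  "Replace strings in the current buffer.
PAIRS is a list of (FROM . TO) string pairs. Each FROM is
searched from the beginning of buffer and replaced by TO."
  (save-excursion                       ; restore cursor position afterwards
    (dolist (pair pairs)
      (goto-char (point-min))
      (while (search-forward (car pair) nil t)
        (replace-match (cdr pair) t t)))))

;; example use, on sample text in a temp buffer
(with-temp-buffer
  (insert "cat and dog")
  (my-replace-pairs-in-buffer '(("cat" . "bird") ("dog" . "fish")))
  (message "%s" (buffer-string)))
```

Note that the pairs are applied one after another, so a later pair can also match text inserted by an earlier pair.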
To apply a function to marked files in dired, use “dired-get-marked-files”, like this:
    ;; idiom for processing a list of files in dired's marked files
    ;; suppose myProcessFile is your function that takes a file path
    ;; and do some processing on the file
    (defun dired-myProcessFile ()
      "apply myProcessFile function to marked files in dired."
      (interactive)
      (require 'dired)
      (mapc 'myProcessFile (dired-get-marked-files)))
Open a file, process it, save, close it.
    ; open a file, process it, save, close it
    (defun my-process-file (fpath)
      "process the file at fullpath fpath ..."
      (let (mybuffer)
        (setq mybuffer (find-file fpath))
        (buffer-disable-undo) ;; no need undo
        (goto-char (point-min)) ;; in case buffer already open
        ;; do something
        (save-buffer)
        (kill-buffer mybuffer)))
For processing hundreds of files, you don't need emacs to keep undo info or fontification. This is more efficient:
    (defun my-process-file (fpath)
      "process the file at fullpath fpath ..."
      (let ()
        ;; create temp buffer without undo record. first space is necessary
        (set-buffer (get-buffer-create " myTemp"))
        (insert-file-contents fpath nil nil nil t)
        ;; process it ...
        (kill-buffer " myTemp")))
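The temp buffer version above reads the file but has no step that writes the result back. One possible way to complete it, as a sketch, is “with-temp-buffer” together with “write-region”. The function name “my-replace-in-file” and its parameters are made up for illustration.

```elisp
(defun my-replace-in-file (fpath from to)
  "In the file at FPATH, replace every occurrence of string FROM
by string TO, then write the result back to FPATH."
  ;; with-temp-buffer also uses a buffer whose name starts with a space,
  ;; so no undo info is kept
  (with-temp-buffer
    (insert-file-contents fpath)
    (goto-char (point-min))
    (while (search-forward from nil t)
      (replace-match to t t))
    ;; write the whole buffer back to the file
    (write-region (point-min) (point-max) fpath)))
```

Example call: “(my-replace-in-file "~/web/emacs/x.html" "myStr1" "myReplaceStr1")”, where the file path is only a placeholder.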
    ; idiom for calling a shell command
    (shell-command "cp /somepath/myfile.txt /somepath")

    ; idiom for calling a shell command and get its output
    (shell-command-to-string "ls")
Both shell-command and shell-command-to-string will wait for the shell process to finish before continuing. To not wait, use start-process or start-process-shell-command.
Reference: Elisp Manual: Asynchronous-Processes.
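Here is a sketch of the non-waiting style using “start-process-shell-command”. The function name, process name, buffer name and the shell command are all made up for illustration. A sentinel function is attached so that you get a message when the process changes state, such as when it finishes.

```elisp
(defun my-run-async-demo ()
  "Start a shell command without waiting for it to finish.
Return the process object."
  (let ((proc (start-process-shell-command
               "my-demo"                ; name for the process
               "*my-demo-output*"       ; buffer that collects the output
               "echo hello")))          ; the shell command to run
    ;; the sentinel is called when the process changes state
    (set-process-sentinel
     proc
     (lambda (process event)
       (message "Process %s: %s" (process-name process) event)))
    proc))
```

The call returns right away. The output shows up in the named buffer when the shell command produces it.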
In the following, my-process-file is a function that takes a file full path as input. The find-lisp-find-files will generate a list of full paths, using a regex on file name. The “mapc” will apply the function to elements in a list.
    ; idiom for traversing a directory
    (require 'find-lisp)
    (mapc 'my-process-file (find-lisp-find-files "~/web/emacs/" "\\.html$"))
You can run an elisp program in the Operating System's command line interface (shell), using the “--script” option. For example:
emacs --script process_log.el
Emacs has a few other options and variations to control how you run an elisp script. Here's a table of the main options:
| full option name | meaning |
|---|---|
| --no-site-file | Do not load the site wide “site-start.el” |
| --no-init-file | Do not load your init files “~/.emacs” or “default.el”. |
| --batch | Run emacs in batch mode, use it together with “--load” to specify a lisp file. This implies “--no-init-file” but not “--no-site-file”. |
| --script ‹file path› | Run emacs like “--batch” with “--load” set to “‹file path›”. |
| --load="‹elisp file path›" | Execute the elisp file at “‹elisp file path›”. |
| --user=‹user name› | Load user ‹user name›'s emacs init file (the “.emacs”). |
When you write an elisp script to run in batch, make sure your elisp file is self-contained, just like you would with a Perl or Python script: it doesn't call functions defined in your emacs init file, it loads all the libraries it needs (using “require” or “load”), and it sets any necessary load path in the script (e.g. “(add-to-list 'load-path ‹lib path›)”).
If you've done a clean job in your elisp script, then, all you need to use is “emacs --script ‹elisp file path›”.
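As an illustration of a self-contained script, here is a sketch that prints the number of lines of each file given on the command line. The script name “count_lines.el” and the function name are made up. It assumes that the arguments after the script name are available in the variable “command-line-args-left”, and it skips any argument that is not an existing file.

```elisp
;; count_lines.el (hypothetical script name)
;; run it like this: emacs --script count_lines.el file1.txt file2.txt

(defun my-count-lines-in-file (fpath)
  "Return the number of lines in the file at FPATH."
  (with-temp-buffer
    (insert-file-contents fpath)
    (count-lines (point-min) (point-max))))

;; go thru the remaining command line arguments
(dolist (fpath command-line-args-left)
  ;; skip anything that is not an existing regular file
  (when (file-regular-p fpath)
    ;; in batch mode, princ prints to the standard output
    (princ (format "%s: %d\n" fpath (my-count-lines-in-file fpath)))))
```

The script needs no init file and no extra libraries, so “--script” alone is enough to run it.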
If your elisp program requires functions that you've defined in your emacs init file (the “.emacs”), then you should explicitly load it in your script by “(load ‹emacs init file path›)”, or, you can add the option to load it, like this: “--user=xah”. (best to actually pull out the function you need)
If you are on a Mac with Carbon Emacs or Aquamacs, your emacs program will be like this: “/Applications/Emacs.app/Contents/MacOS/Emacs” instead of just “emacs”. The following is an example of a variation:
/Applications/Emacs.app/Contents/MacOS/Emacs --no-site-file --batch --load=process_log.el
Reference: (info "(emacs)Option Index").