Here are several functions for parsing and scanning balanced expressions, also known as sexps. Basically, a sexp is either a balanced parenthetical grouping, or a symbol name (a sequence of characters whose syntax is either word constituent or symbol constituent). However, characters whose syntax is expression prefix are treated as part of the sexp if they appear next to it.
The syntax table controls the interpretation of characters, so these functions can be used for Lisp expressions when in Lisp mode and for C expressions when in C mode. See List Motion, for convenient higher-level functions for moving over balanced expressions.
A syntax table only describes how each character changes the state of the parser, rather than describing the state itself. For example, a string delimiter character toggles the parser state between “in-string” and “in-code”, but the characters inside the string do not have any particular syntax to identify them as such. Consider the following call (note that 15 is the syntax code for generic string delimiters):
(put-text-property 1 9 'syntax-table '(15 . nil))
This call does not tell Emacs that the first eight characters of the current buffer are a string, but rather that they are all string delimiters. As a result, Emacs treats them as four consecutive empty string constants.
Every time you use the parser, you specify a starting state as
well as a starting position. If you omit the starting state, the
default is “top level in parenthesis structure,” as it would be at
the beginning of a function definition. (This is the case for
forward-sexp, which blindly assumes that the starting point is
in such a state.)
-- Function: parse-partial-sexp start limit &optional target-depth stop-before state stop-comment
This function parses a sexp in the current buffer starting at start, not scanning past limit. It stops at position limit or when certain criteria described below are met, and sets point to the location where parsing stops. It returns a value describing the status of the parse at the point where it stops.
If state is nil, start is assumed to be at the top level of parenthesis structure, such as the beginning of a function definition. Alternatively, you might wish to resume parsing in the middle of the structure. To do this, you must provide a state argument that describes the initial status of parsing.
If the third argument target-depth is non-nil, parsing stops if the depth in parentheses becomes equal to target-depth. The depth starts at 0, or at whatever is given in state.
If the fourth argument stop-before is non-nil, parsing stops when it comes to any character that starts a sexp. If stop-comment is non-nil, parsing stops when it comes to the start of a comment. If stop-comment is the symbol syntax-table, parsing stops after the start of a comment or a string, or the end of a comment or a string, whichever comes first.
The fifth argument state is a ten-element list of the same form as the value of this function, described below. The return value of one call may be used to initialize the state of the parse on another call to parse-partial-sexp.
The result is a list of ten elements describing the final state of the parse:
- The depth in parentheses, counting from 0. Warning: this can be negative if there are more close parens than open parens between the start of the defun and point.
- The character position of the start of the innermost parenthetical grouping containing the stopping point; nil if none.
- The character position of the start of the last complete subexpression terminated; nil if none.
- Non-nil if inside a string. More precisely, this is the character that will terminate the string, or t if a generic string delimiter character should terminate it.
- t if inside a comment (of either style), or the comment nesting level if inside a kind of comment that can be nested.
- t if point is just after a quote character.
- The minimum parenthesis depth encountered during this scan.
- What kind of comment is active: nil for a comment of style “a” or when not inside a comment, t for a comment of style “b,” and syntax-table for a comment that should be ended by a generic comment delimiter character.
- The string or comment start position. While inside a comment, this is the position where the comment began; while inside a string, this is the position where the string began. When outside of strings and comments, this element is nil.
- Internal data for continuing the parsing. The meaning of this data is subject to change; it is used if you pass this list as the state argument to another call.
Elements 1, 2, and 6 (numbering the elements from 0) are ignored in the argument state. Element 8 is used only to set the corresponding element of the return value, in certain simple cases. Element 9 is used only to set element 1 of the return value, in trivial cases where parsing starts and stops within the same pair of parentheses.
This function is most often used to compute indentation for languages that have nested parentheses.
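As a rough illustration of the return value, the following sketch parses a string in a temporary buffer and extracts elements 0, 3, and 4 of the state. The helper names my-demo-parse-state and my-demo-resume-parse are hypothetical, and the use of emacs-lisp-mode-syntax-table is an assumption made only so that the example has a concrete syntax table.

```elisp
;; Hypothetical helpers for illustration; not part of Emacs.
(defun my-demo-parse-state (text)
  "Parse TEXT from its start and return (DEPTH IN-STRING IN-COMMENT).
The parse starts with a nil state, that is, at top level."
  (with-temp-buffer
    ;; Assumption: Emacs Lisp syntax, chosen only for the demo.
    (set-syntax-table emacs-lisp-mode-syntax-table)
    (insert text)
    (let ((state (parse-partial-sexp (point-min) (point-max))))
      (list (nth 0 state)             ; depth in parentheses
            (and (nth 3 state) t)     ; non-nil if inside a string
            (and (nth 4 state) t))))) ; non-nil if inside a comment

(defun my-demo-resume-parse (text split)
  "Parse TEXT in two steps, resuming at buffer position SPLIT.
Return the final depth in parentheses."
  (with-temp-buffer
    (set-syntax-table emacs-lisp-mode-syntax-table)
    (insert text)
    (let* ((state-1 (parse-partial-sexp (point-min) split))
           ;; Pass the first result as the STATE argument.
           (state-2 (parse-partial-sexp split (point-max)
                                        nil nil state-1)))
      (nth 0 state-2))))
```

For instance, (my-demo-parse-state "(foo \"bar") should report a depth of 1 and a non-nil in-string flag, since the string is never closed.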
-- Function: syntax-ppss &optional pos
This function returns the state that the parser would have at position pos, if it were started with a default start state at the beginning of the buffer. Thus, it is equivalent to (parse-partial-sexp (point-min) pos), except that syntax-ppss uses a cache to speed up the computation. Also, element 2 (previous complete subexpression) and element 6 (minimum parenthesis depth) of the returned state, numbering from 0, are not meaningful.
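A common use of this state is to ask whether a position lies inside a string or a comment. The sketch below does that in a temporary buffer; the helper name my-demo-context-at is hypothetical, and the Emacs Lisp syntax table is assumed purely for the demonstration.

```elisp
;; Hypothetical helper for illustration; not part of Emacs.
(defun my-demo-context-at (text pos)
  "Insert TEXT in a temporary buffer and classify position POS.
Return the symbol `string', the symbol `comment', or nil."
  (with-temp-buffer
    ;; Assumption: Emacs Lisp syntax, chosen only for the demo.
    (set-syntax-table emacs-lisp-mode-syntax-table)
    (insert text)
    (let ((ppss (syntax-ppss pos)))
      (cond ((nth 3 ppss) 'string)    ; element 3: inside a string
            ((nth 4 ppss) 'comment)   ; element 4: inside a comment
            (t nil)))))
```

With the text (setq x "hi") ; done, position 11 falls between the double quotes and position 18 falls after the semicolon, so the helper should classify them as string and comment respectively.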
-- Function: syntax-ppss-flush-cache beg
This function flushes the cache used by syntax-ppss, starting at position beg.
When syntax-ppss is called, it automatically hooks itself to before-change-functions to keep its cache consistent. But this can fail if syntax-ppss is called while before-change-functions is temporarily let-bound, or if the buffer is modified without obeying the hook, such as when using inhibit-modification-hooks. For this reason, it is sometimes necessary to flush the cache manually.
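The following sketch shows one situation where a manual flush is appropriate: the buffer is edited while inhibit-modification-hooks is bound to t, so the change hooks do not run. The function name my-demo-depth-after-silent-edit is hypothetical, and the sketch does not try to show what a stale cache would return; it only shows the flush being performed before the next query.

```elisp
;; Hypothetical helper for illustration; not part of Emacs.
(defun my-demo-depth-after-silent-edit ()
  "Edit a buffer with change hooks inhibited, flush, and reparse.
Return the paren depth at the end of the buffer."
  (with-temp-buffer
    ;; Assumption: Emacs Lisp syntax, chosen only for the demo.
    (set-syntax-table emacs-lisp-mode-syntax-table)
    (insert "(a b c)")
    (syntax-ppss (point-max))          ; prime the cache
    (let ((inhibit-modification-hooks t))
      ;; This edit bypasses before-change-functions.
      (goto-char (point-min))
      (insert "(("))
    ;; Discard cached data from the start of the buffer onward.
    (syntax-ppss-flush-cache (point-min))
    (nth 0 (syntax-ppss (point-max)))))
```

After the edit the buffer contains three open parentheses and one close parenthesis, so the depth at the end should be 2.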
-- Variable: syntax-begin-function
If this is non-nil, it should be a function that moves to an earlier buffer position where the parser state is equivalent to nil; in other words, a position outside of any comment, string, or parenthesis. syntax-ppss uses it to supplement its cache.
-- Function: scan-lists from count depth
This function scans forward count balanced parenthetical groupings from position from. It returns the position where the scan stops. If count is negative, the scan moves backwards.
If depth is nonzero, parenthesis depth counting begins from that value. The only candidates for stopping are places where the depth in parentheses becomes zero; scan-lists counts count such places and then stops. Thus, a positive value for depth means go out depth levels of parenthesis.
Scanning ignores comments if parse-sexp-ignore-comments is non-nil.
If the scan reaches the beginning or end of the buffer (or its accessible portion), and the depth is not zero, an error is signaled. If the depth is zero but the count is not used up, nil is returned.
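To make the role of depth concrete, here is a small sketch that runs scan-lists over a string in a temporary buffer. The wrapper name my-demo-scan-lists is hypothetical, and the Emacs Lisp syntax table is assumed only for the demonstration.

```elisp
;; Hypothetical helper for illustration; not part of Emacs.
(defun my-demo-scan-lists (text from count depth)
  "Insert TEXT in a temporary buffer and call `scan-lists' on it.
FROM, COUNT and DEPTH are passed through unchanged."
  (with-temp-buffer
    ;; Assumption: Emacs Lisp syntax, chosen only for the demo.
    (set-syntax-table emacs-lisp-mode-syntax-table)
    (insert text)
    (scan-lists from count depth)))

;; Sample calls on the text "(a (b c)) (d)":
;;   from 1, count 1,  depth 0   moves over the first list
;;   from 5, count 1,  depth 1   moves out of the inner list
;;   from 1, count 1,  depth -1  moves down into the first list
;;   from 14, count -1, depth 0  moves backward over the last list
```

In this text the first list occupies positions 1 through 9, so scanning one list forward from position 1 should stop at position 10, and asking for three lists should return nil because only two exist at top level.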
-- Function: scan-sexps from count
This function scans forward count sexps from position from. It returns the position where the scan stops. If count is negative, the scan moves backwards.
Scanning ignores comments if parse-sexp-ignore-comments is non-nil.
If the scan reaches the beginning or end of (the accessible part of) the buffer while in the middle of a parenthetical grouping, an error is signaled. If it reaches the beginning or end between groupings but before count is used up, nil is returned.
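The three possible outcomes (a position, nil, or an error) can be seen with a small wrapper. The name my-demo-scan-sexps is hypothetical; the wrapper catches the scan-error condition and returns the symbol scan-error instead, which is a choice made for this sketch rather than anything scan-sexps does itself.

```elisp
;; Hypothetical helper for illustration; not part of Emacs.
(defun my-demo-scan-sexps (text from count)
  "Insert TEXT in a temporary buffer and scan COUNT sexps from FROM.
Return the stopping position, nil, or the symbol `scan-error'."
  (with-temp-buffer
    ;; Assumption: Emacs Lisp syntax, chosen only for the demo.
    (set-syntax-table emacs-lisp-mode-syntax-table)
    (insert text)
    (condition-case nil
        (scan-sexps from count)
      ;; Signaled when the buffer ends inside a grouping.
      (scan-error 'scan-error))))
```

On the text foo (bar baz) qux, scanning two sexps from position 1 should stop just after the closing parenthesis, while scanning four should return nil because only three top-level sexps are present. On the unbalanced text (foo bar, the scan reaches the end of the buffer inside a grouping, so the wrapper returns scan-error.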
-- Variable: multibyte-syntax-as-symbol
If this variable is non-nil, scan-sexps treats all non-ASCII characters as symbol constituents regardless of what the syntax table says about them. (However, text properties can still override the syntax.)
-- User Option: parse-sexp-ignore-comments
If the value is non-nil, then comments are treated as whitespace by the functions in this section and by forward-sexp.
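The effect of this option can be observed by running the same scan twice with different bindings. In the sketch below, my-demo-scan-with-comments is a hypothetical helper, and the sample text and the Emacs Lisp syntax table are assumptions chosen for the demonstration.

```elisp
;; Hypothetical helper for illustration; not part of Emacs.
(defun my-demo-scan-with-comments (ignore)
  "Scan two sexps over sample text containing a comment.
IGNORE is the value bound to `parse-sexp-ignore-comments'."
  (with-temp-buffer
    ;; Assumption: Emacs Lisp syntax, chosen only for the demo.
    (set-syntax-table emacs-lisp-mode-syntax-table)
    ;; Sample text: a symbol, a comment holding "b", then "c".
    (insert "a ; b\nc")
    (let ((parse-sexp-ignore-comments ignore))
      (scan-sexps (point-min) 2))))
```

When the option is non-nil, the comment is skipped and the second sexp is c on the following line. When it is nil, the word b inside the comment is scanned as an ordinary symbol, so the scan stops earlier.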
The behavior of parse-partial-sexp is also affected by
parse-sexp-lookup-properties (see Syntax Properties).
You can use forward-comment to move forward or backward over
one comment or several comments.
-- Function: forward-comment count
This function moves point forward across count complete comments (that is, including the starting delimiter and the terminating delimiter if any), plus any whitespace encountered on the way. It moves backward if count is negative. If it encounters anything other than a comment or whitespace, it stops, leaving point at the place where it stopped. This includes (for instance) finding the end of a comment when moving forward and expecting the beginning of one. The function also stops immediately after moving over the specified number of complete comments. If count comments are found as expected, with nothing except whitespace between them, it returns t; otherwise it returns nil.
This function cannot tell whether the “comments” it traverses are embedded within a string. If they look like comments, it treats them as comments.
To move forward over all comments and whitespace following point, use
(forward-comment (buffer-size)). (buffer-size) is a good
argument to use, because the number of comments in the buffer cannot
exceed that many.
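The idiom can be tried out with a small wrapper that reports both the return value and where point ends up. The name my-demo-forward-comment is hypothetical, and the Emacs Lisp syntax table is assumed only so that semicolons start comments in the sample text.

```elisp
;; Hypothetical helper for illustration; not part of Emacs.
(defun my-demo-forward-comment (text count)
  "Insert TEXT, call `forward-comment' with COUNT from the start.
Return a list (RESULT POINT) describing the outcome."
  (with-temp-buffer
    ;; Assumption: Emacs Lisp syntax, chosen only for the demo.
    (set-syntax-table emacs-lisp-mode-syntax-table)
    (insert text)
    (goto-char (point-min))
    (let ((result (forward-comment count)))
      (list result (point)))))

(defun my-demo-skip-all-comments (text)
  "Return the position of the first non-comment text in TEXT."
  (with-temp-buffer
    (set-syntax-table emacs-lisp-mode-syntax-table)
    (insert text)
    (goto-char (point-min))
    ;; The idiom from the text: the count can never be exhausted.
    (forward-comment (buffer-size))
    (point)))
```

Given two comment lines followed by indented code, a count of 1 should return t with point at the start of the second line, whereas the (buffer-size) form returns nil (the count was not used up) and leaves point at the first character of the code.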
