Chapter 19: Putting It All Together: Notes on the Structure of Lojban Texts

13. Erasure: SI, SA, SU

The following cmavo are discussed in this section:

si  SI  erase word
sa  SA  erase phrase
su  SU  erase discourse

The cmavo “si” (of selma'o SI) is a metalinguistic operator that erases the preceding word, as if it had never been spoken:

✥13.1    ti gerku si mlatu
This is-a-dog, er, is-a-cat.

means the same thing as “ti mlatu”. Multiple “si” cmavo in succession erase the appropriate number of words:

✥13.2    ta blanu zdani si si xekri zdani
That is-a-blue house, er, er, is-a-black house.
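
The word-by-word behavior of “si” in ✥13.1 and ✥13.2 can be sketched as a simple stack. This is only an illustration under simplifying assumptions: the text is taken to be already split into words, quotation with “zo”, “zoi”, and “lo'u” is ignored, and the function name apply_si is hypothetical rather than part of any existing parser.

```python
def apply_si(words):
    """Erase one preceding word for each "si" (quotation is ignored here)."""
    kept = []
    for word in words:
        if word == "si":
            if kept:          # a "si" with nothing before it erases nothing
                kept.pop()
        else:
            kept.append(word)
    return kept


print(apply_si("ti gerku si mlatu".split()))
# ['ti', 'mlatu']
print(apply_si("ta blanu zdani si si xekri zdani".split()))
# ['ta', 'xekri', 'zdani']
```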

In order to erase the word “zo”, it is necessary to use three “si” cmavo in a row:

✥13.3    zo .bab. se cmene zo si si si la bab.
The-word “Bob” is-the-name-of the word “si”, er, er, Bob.

The first use of “si” does not erase anything, but completes the “zo” quotation. Two more “si” cmavo are then necessary to erase the first “si” and the “zo”.
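
The interaction of “si” with “zo” can be sketched by remembering whether the next word is being quoted. The sketch below assumes pre-split words and handles only “zo”; the name apply_si_with_zo and the (word, is_quoted) pair representation are hypothetical choices made for illustration.

```python
def apply_si_with_zo(words):
    """Return the (word, is_quoted) pairs that survive "si" erasure."""
    kept = []
    quote_next = False  # True immediately after an unquoted "zo"
    for word in words:
        if quote_next:
            # Whatever follows "zo" is quoted, even if it is "si".
            kept.append((word, True))
            quote_next = False
        elif word == "si":
            if kept:
                kept.pop()
        else:
            kept.append((word, False))
            quote_next = (word == "zo")
    return kept


example = "zo .bab. se cmene zo si si si la bab.".split()
print([word for word, _ in apply_si_with_zo(example)])
# ['zo', '.bab.', 'se', 'cmene', 'la', 'bab.']
```

In this sketch, erasing only the quoted word leaves the “zo” behind with nothing quoted after it, and the following word is treated as an ordinary word rather than as a quotation.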

Incorrect names can likewise cause trouble with “si”:

✥13.4    mi tavla fo la .esperanto
    si si .esperanton.
I talk in-language that-named “and” “speranto”,
    er, er, Esperanto.

The Lojbanized spelling “.esperanto” breaks up, as a consequence of the Lojban morphology rules (see Chapter 4), into two Lojban words: the cmavo “.e” and the undefined fu'ivla “speranto”. Therefore, two “si” cmavo are needed to erase them. Of course, “.e speranto” is not grammatical after “la”, but recognition of “si” is done before grammatical analysis.

Even more messy is the result of an incorrect “zoi”:

✥13.5    mi cusku zoi fy. gy. .fy.
    si si si si zo .djan
I express [foreign] [quote] “gy” [unquote],
    er, er, er, er, “John”.

In ✥13.5, the first “fy.” is taken to be the delimiting word. The next word must be different from the delimiting word, and “gy.”, the Lojban name for the letter “g”, was chosen arbitrarily. Then the delimiting word must be repeated. For purposes of “si” erasure, the entire quoted text is taken to be a word, so four words have been uttered, and four “si” cmavo are needed to erase them all. Similarly, a stray “lo'u” quotation mark must be erased with “fy. le'u si si si”, by completing the quotation and then erasing it all with three “si” cmavo.
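
The counting rule for “zoi” (the cmavo, the delimiter, the quoted text as one unit, and the closing delimiter) can be sketched as follows. The assumptions are considerable: words are pre-split on spaces, delimiters are compared after removing the pause marks (periods), “zo” and “lo'u” are not handled, and the names strip_pauses and apply_si_with_zoi are hypothetical.

```python
def strip_pauses(word):
    """Remove leading and trailing pause marks so "fy." matches ".fy."."""
    return word.strip(".")


def apply_si_with_zoi(words):
    """Apply "si" erasure, counting a zoi quotation as four erasable units."""
    kept = []
    i = 0
    while i < len(words):
        word = words[i]
        if word == "zoi" and i + 1 < len(words):
            delimiter = strip_pauses(words[i + 1])
            j = i + 2
            while j < len(words) and strip_pauses(words[j]) != delimiter:
                j += 1
            if j == len(words):
                raise ValueError("zoi quotation is never closed")
            quoted_text = " ".join(words[i + 2:j])  # one unit, however long
            kept.extend(["zoi", words[i + 1], quoted_text, words[j]])
            i = j + 1
        elif word == "si":
            if kept:
                kept.pop()
            i += 1
        else:
            kept.append(word)
            i += 1
    return kept


print(apply_si_with_zoi("mi cusku zoi fy. gy. .fy. si si si si zo .djan".split()))
# ['mi', 'cusku', 'zo', '.djan']
```

Note that a “si” appearing inside the quoted text is part of the quotation in this sketch and erases nothing.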

What if less than the entire “zo” or “zoi” construct is erased? The result is something which has a loose “zo” or “zoi” in it, without its expected sequels, and which is incurably ungrammatical. Thus, to erase just the word quoted by “zo”, it turns out to be necessary to erase the “zo” as well:

✥13.6    mi se cmene zo .djan.
    si si zo .djordj.
I am-named-by the-word “John”,
    er, er, the-word “George”.

The parser will reject “zo .djan. si .djordj.”, because in that context “.djordj.” is a name (of selma'o CMENE) rather than a quoted word.

Note: The current machine parser does not implement “si” erasure.

As the above examples plainly show, precise erasures with “si” can be extremely hard to get right. Therefore, the cmavo “sa” (of selma'o SA) is provided for erasing more than one word. The cmavo following “sa” should be the starting marker of some grammatical construct. The effect of the “sa” is to erase back to and including the last starting marker of the same kind. For example:

✥13.7    mi viska le sa .i mi cusku zo .djan.
I see the  …  I say the-word “John”.

Since the word following “sa” is “.i”, the sentence separator, its effect is to erase the preceding sentence. So ✥13.7 is equivalent to:

✥13.8    mi cusku zo .djan.

Another example, erasing a partial description rather than a partial sentence:

✥13.9    mi viska le blanu zdan. sa le xekri zdani
I see the blue hou …  the black house.

In ✥13.9, “le blanu zdan.” is ungrammatical, but clearly reflects the speaker's original intention to say “le blanu zdani”. However, the “zdani” was cut off before the end and changed into a name. The entire ungrammatical “le” construct is erased and replaced by “le xekri zdani”.
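
The effect of “sa” in ✥13.7 and ✥13.9 can be approximated with a backward search for the starting marker. This sketch makes two loud assumptions: “the same kind” is reduced to “the identical word”, and when no earlier marker is found the erasure runs back to the start of the text (as the “.i” case in ✥13.7 requires). The name apply_sa is hypothetical.

```python
def apply_sa(words):
    """Erase back to and including the last earlier copy of the word after "sa"."""
    kept = []
    i = 0
    while i < len(words):
        word = words[i]
        if word == "sa" and i + 1 < len(words):
            marker = words[i + 1]
            cut = 0  # assumption: no earlier marker means erase everything
            for k in range(len(kept) - 1, -1, -1):
                if kept[k] == marker:
                    cut = k
                    break
            del kept[cut:]
            i += 1  # the marker itself is kept as an ordinary word next
        else:
            kept.append(word)
            i += 1
    return kept


print(apply_sa("mi viska le sa .i mi cusku zo .djan.".split()))
# ['.i', 'mi', 'cusku', 'zo', '.djan.']
print(apply_sa("mi viska le blanu zdan. sa le xekri zdani".split()))
# ['mi', 'viska', 'le', 'xekri', 'zdani']
```

The first result differs from ✥13.8 only in retaining the “.i” that followed “sa”.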

Note: The current machine parser does not implement “sa” erasure. Getting “sa” right is even more difficult (for a computer) than getting “si” right: the behavior of “si” is defined in terms of words, which are conceptually simpler entities, whereas the behavior of “sa” is defined in terms of grammatical constructs (possibly incorrect ones). On the other hand, “sa” is generally easier for human beings, because the rules for using it correctly are less finicky.

The cmavo “su” (of selma'o SU) is yet another metalinguistic operator; it erases the entire text. However, if the text involves multiple speakers, then “su” erases only the remarks made by the one who said it, unless that speaker has said nothing. Therefore “susu” is needed to eradicate a whole discussion in conversation: the first “su” erases the speaker's own remarks, and the second, finding that the speaker has now said nothing, erases everything else.
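
The rule for “su” in a conversation can be sketched by tagging each word with its speaker. The sketch assumes that “susu” is simply two “su” cmavo in a row, and that “has said nothing” means the speaker has no remarks left unerased; the name apply_su and the (speaker, word) pairs are hypothetical.

```python
def apply_su(turns):
    """turns is a list of (speaker, word) pairs; return what survives "su"."""
    kept = []
    for speaker, word in turns:
        if word == "su":
            if any(who == speaker for who, _ in kept):
                # Erase only this speaker's own remarks.
                kept = [(who, w) for who, w in kept if who != speaker]
            else:
                # This speaker has said nothing, so the whole text goes.
                kept = []
        else:
            kept.append((speaker, word))
    return kept


talk = [("A", "mi"), ("A", "klama"), ("B", "do"), ("B", "klama")]
print(apply_su(talk + [("B", "su")]))
# [('A', 'mi'), ('A', 'klama')]
print(apply_su(talk + [("B", "su"), ("B", "su")]))
# []
```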

Note: The current machine parser does not implement either “su” or “susu” erasure.