5. Alien alphabets

As stated in Section 1, Lojban’s goal of cultural neutrality demands a standard set of lerfu words for the lerfu of as many other writing systems as possible. When we meet these lerfu in written text (particularly, though not exclusively, mathematical text), we need a standard Lojbanic way to pronounce them.

There are certainly hundreds of alphabets and other writing systems in use around the world, and it is probably an unachievable goal to create a single system which can express all of them, but if perfection is not demanded, a usable system can be created from the raw material which Lojban provides.

One possibility would be to use the lerfu word associated with the language itself, Lojbanized and with “bu” added. Indeed, an isolated Greek “alpha” in running Lojban text is probably most easily handled by calling it “.alfas. bu”. Here the Greek lerfu word has been made into a Lojbanized name by adding “s” and then into a Lojban lerfu word by adding “bu”. Note that the pause after “.alfas.” is still needed.

Likewise, the easiest way to handle the Latin letters “h”, “q”, and “w”, which are not used in Lojban, is to attach “bu” to a consonant lerfu word. The following assignments have been made:

        .y'y.bu     h
        ky.bu       q
        vy.bu       w

As an example, the English word “quack” would be spelled in Lojban thus:

5.1)   ky.bu  .ubu  .abu  cy.  ky.
       “q”    “u”   “a”   “c”  “k”

It does not matter that the letter “c” in this word has nothing to do with the sound of the Lojban letter “c”: we are spelling an English word, so English rules control the choice of letters, but we are speaking Lojban, so Lojban rules control the pronunciation of those letters.

A few more possibilities for Latin-alphabet letters used in languages other than English:

        ty.bu       þ (thorn)
        dy.bu       ð (edh)
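
The “bu” assignments above can be sketched as a small lookup in Python. This is only an illustration, not part of the language definition: the function names lerfu_word and spell are hypothetical, and the sketch assumes lower-case input, the regular Lojban pattern of consonant plus “y.” for consonants, and vowel plus “bu” for vowels.

```python
# Letters that Lojban itself uses (assumed lower-case only).
LOJBAN_VOWELS = "aeiouy"
LOJBAN_CONSONANTS = "bcdfgjklmnprstvxz"

# "bu" forms assigned in the text for Latin letters Lojban lacks.
EXTRA_LATIN = {
    "h": ".y'y.bu",
    "q": "ky.bu",
    "w": "vy.bu",
    "þ": "ty.bu",  # thorn
    "ð": "dy.bu",  # edh
}


def lerfu_word(letter):
    """Return the lerfu word for one lower-case Latin letter."""
    if letter in EXTRA_LATIN:
        return EXTRA_LATIN[letter]
    if len(letter) == 1 and letter in LOJBAN_VOWELS:
        return "." + letter + "bu"
    if len(letter) == 1 and letter in LOJBAN_CONSONANTS:
        return letter + "y."
    raise ValueError("no lerfu word known for %r" % letter)


def spell(word):
    """Spell a word letter by letter, separating lerfu words with spaces."""
    return " ".join(lerfu_word(ch) for ch in word)


print(spell("quack"))
```

Calling spell("quack") reproduces the spelling given in Example 5.1. Any letter outside the tables raises an error rather than guessing a “bu” form.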

However, this system is not ideal for all purposes. For one thing, it is verbose. The native lerfu words are often quite long, and with “bu” added they become even longer: the worst-case Greek lerfu word would be “.Omikron. bu”, with four syllables and two mandatory pauses. In addition, an alphabet that is used by many languages has a separate set of lerfu words in each of them, and which set is Lojban to choose?

The alternative plan, therefore, is to use a shift word similar to those introduced in Section 3. After the appearance of such a shift word, the regular lerfu words are re-interpreted to represent the lerfu of the alphabet now in use. After a shift to the Greek alphabet, for example, the lerfu word “ty.” would represent not Latin “t” but Greek “tau”. Why “tau”? Because it is, in some sense, the closest counterpart of “t” within the Greek lerfu system. In principle it would be all right to map “ty.” to “phi” or even “omega”, but such an arbitrary relationship would be extremely hard to remember.

Where no obvious closest counterpart exists, some more or less arbitrary choice must be made. Some alien lerfu may simply not have any shifted equivalent, forcing the speaker to fall back on a “bu” form. Since a “bu” form may mean different things in different alphabets, it is safest to employ a shift word even when “bu” forms are in use.

Shifts for several alphabets have been assigned cmavo of selma'o BY:

    lo'a    Latin/Roman/Lojban alphabet
    ge'o    Greek alphabet
    je'o    Hebrew alphabet
    jo'o    Arabic alphabet
    ru'o    Cyrillic alphabet
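
The way a shift word changes the reading of the lerfu words that follow it can be sketched in Python. This is a minimal illustration under stated assumptions: the function name tag_lerfu is hypothetical, the input is assumed to be already split into words, and the sketch only records which alphabet each lerfu word belongs to, since the actual letter assignments for each alphabet are not given in this section.

```python
# Alphabet shift words of selma'o BY, as listed in the table above.
ALPHABET_SHIFTS = {
    "lo'a": "Latin",
    "ge'o": "Greek",
    "je'o": "Hebrew",
    "jo'o": "Arabic",
    "ru'o": "Cyrillic",
}


def tag_lerfu(words, default="Latin"):
    """Pair each lerfu word with the alphabet in force when it is spoken.

    Shift words update the current alphabet and are not themselves
    included in the result.
    """
    current = default
    tagged = []
    for word in words:
        if word in ALPHABET_SHIFTS:
            current = ALPHABET_SHIFTS[word]
        else:
            tagged.append((current, word))
    return tagged


for alphabet, word in tag_lerfu(["ty.", "ge'o", "ty.", "lo'a", "ty."]):
    print(alphabet, word)
```

Here the same lerfu word “ty.” is tagged first as Latin, then as Greek after “ge'o” (where it would stand for “tau”), and as Latin again after “lo'a”.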

The cmavo “zai” (of selma'o LAU) is used to create shift words to still other alphabets. The BY word which must follow any LAU cmavo would typically be a name representing the alphabet with “bu” suffixed:

5.2)   zai .devanagar. bu
       Devanagari (Hindi) alphabet

5.3)   zai .katakan. bu
       Japanese katakana syllabary

5.4)   zai .xiragan. bu
       Japanese hiragana syllabary

Unlike the cmavo above, these shift words have not been standardized and probably will not be until someone actually has a need for them. (Note the “.” characters marking leading and following pauses.)

In addition, there may be multiple visible representations within a single alphabet for a given letter: roman vs. italics, handwriting vs. print, Bodoni vs. Helvetica. These traditional “font and face” distinctions are also represented by shift words, indicated with the cmavo “ce'a” (of selma'o LAU) and a following BY word:

5.5)   ce'a .xelveticas. bu
       Helvetica font

5.6)   ce'a .xancisk. bu
       handwriting

5.7)   ce'a .pavrel. bu
       12-point font size

The cmavo “na'a” (of selma'o BY) is a universal shift-word cancel: it returns the interpretation of lerfu words to the default of lower-case Lojban with no specific font. It is more general than “lo'a”, which changes the alphabet only, potentially leaving font and case shifts in place.
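
The difference between “na'a” and “lo'a” can be sketched as a small state tracker in Python. This is an illustration only: the names ShiftState and apply_shifts are hypothetical, the input is assumed to be pre-split so that a name plus “bu” counts as one item, and case shifts (covered in Section 3) are left out for brevity.

```python
from dataclasses import dataclass
from typing import Optional

# Alphabet shift words of selma'o BY listed earlier in this section.
ALPHABET_SHIFTS = {"lo'a", "ge'o", "je'o", "jo'o", "ru'o"}


@dataclass
class ShiftState:
    alphabet: str = "lo'a"      # default: the Latin/Lojban alphabet
    font: Optional[str] = None  # default: no specific font


def apply_shifts(words):
    """Return the shift state in force after the given words."""
    state = ShiftState()
    i = 0
    while i < len(words):
        word = words[i]
        if word == "na'a":
            state = ShiftState()   # universal cancel: reset everything
        elif word in ALPHABET_SHIFTS:
            state.alphabet = word  # changes the alphabet only
        elif word in ("zai", "ce'a"):
            if i + 1 >= len(words):
                raise ValueError(word + " must be followed by a BY word")
            operand = words[i + 1]
            if word == "zai":
                state.alphabet = operand
            else:
                state.font = operand
            i += 1                 # skip the BY word just consumed
        # Ordinary lerfu words do not change the shift state.
        i += 1
    return state


print(apply_shifts(["ge'o", "ce'a", ".xelveticas. bu", "lo'a"]))
print(apply_shifts(["ge'o", "ce'a", ".xelveticas. bu", "na'a"]))
```

In the first call the font shift survives the return to the Latin alphabet; in the second, “na'a” clears both the alphabet and the font.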

Several sections at the end of this chapter contain tables of proposed lerfu word assignments for various languages.