Using Chinese characters to write Taiwanese

The majority of Taiwanese words have a clear Chinese origin; that is, there is little disagreement that the word or its morphemes descend from Old Chinese or a later Chinese language, and there are clear cognates in Literary Chinese. These words are usually consistently written with the same characters that would be used for the Literary Chinese cognate.

Many words and morphemes, however, don’t have such clear origins, and are often found only in Hokkien or in neighboring languages. Usually, for these words, a character with a similar meaning or sound is borrowed from Literary Chinese to write each Taiwanese syllable. In rare instances, new characters are created to write the Taiwanese word. Some of these words have been written with the same characters since the earliest traces of written Taiwanese; others have been written in many different ways.

From a practical point of view, I see very little reason to write these words with Chinese characters at all. Many people share this perspective, using Chinese characters for Chinese words and romanizations for everything else, or just abandoning characters altogether. Chinese characters are highly impractical even for writing Chinese, and borrowing the characters for other words further convolutes the system: phonetic loans lose the connection between form and meaning, and semantic loans lose the connection between form and sound—connections that were already tenuous in the first place. But there are still cultural and aesthetic reasons to use characters. When we do so, we have to choose which characters to use, even though none of the choices may be optimal.

Tâi-jī-chhân recommendations

The best approach that I’ve seen for choosing Chinese characters is provided by Tâi-jī-chhân 台字田 (TJC), a dictionary edited by an anonymous group that calls itself the Chāi-lâi-jī Siā 在來字社. (It is hosted by FHL (信望愛), a volunteer-run Christian resources website, so the two may be associated.) A Facebook page, Súi Tâi-bûn 水台文, seems to give the same recommendations.

These recommendations are very thoroughly researched and explained. For the most part, they recommend characters with a long history of use in Taiwanese opera and other Hokkien texts from Taiwan, China, and Southeast Asia. They explain these historical choices to the best of their ability, and give good reasons when they recommend a character that strays from historical use.

I have read many of their entries, and agree with their general approach. This is not because I think historically used characters are unequivocally better. Due to colonial suppression, the pre-ROC Taiwanese literary tradition has all but disappeared, and very few people read historical Taiwanese texts. There is thus little reason to stick blindly to historical precedent. But TJC reveals that there are consistent underlying principles behind historically used characters, however imperfect those choices may still be.

Ministry of Education recommendations

Meanwhile, the Taiwanese Ministry of Education (MOE) has also recommended characters for a large number of Taiwanese words. These recommendations are highly controversial, primarily because they include many rare characters that have no history of use in Taiwan. The MOE has published explanations (臺灣閩南語按呢寫) for many of its choices, and some of the reasoning is not only disagreeable but factually incorrect. Nevertheless, most people who write in Taiwanese today use the MOE recommendations, because they are the standard taught in schools and there is no other widely known standard. The general attitude seems to be to follow the government standard until there is widespread consensus to change parts of it, but given how stubborn government entities are, I see very little likelihood of major change. I find this unfortunate, as I disagree not only with the Ministry’s specific missteps but with its general approach.

Comparing the two approaches

Historically, literate Taiwanese people were educated primarily in Literary Chinese. When writing Taiwanese, they would thus mostly borrow characters that are commonly used in Literary Chinese, using semantic or phonetic loans for most words with no clear Chinese origin. There were some exceptions, where certain words came to be widely written with newly invented characters (such as 𫔘 for tú and 𨑨迌 for chhit-thô, along with invented characters for chūn and se̍h), but such cases were rare. Some of these inventions are essentially phonetic loans with an added radical.

As mentioned, TJC generally follows these traditional choices, and when they differ from them or when historical precedent is scant, they give thorough, well-researched explanations. They also recognize the flaws of using Chinese characters, and explicitly recommend using romanizations for a number of non-Chinese words.

The MOE, meanwhile, strays from historical usage much more than TJC does. When they do, it is generally for one of the following reasons:

  • For a large number of words, the MOE seems to have ignored or neglected the traditionally used characters. Such characters are often listed by TJC as the most commonly used ones for a given word, but aren’t even mentioned in the MOE explanations. It looks like the MOE simply failed to do the research. This accounts for many discrepancies between the two sets of recommendations.
  • The MOE is very eager to claim a character as a word’s ancestral character, that is, as the character traditionally used to write a word from which a Taiwanese word is descended. This leads to recommendations such as 𠢕 (⿱敖力), which it claims is the ancestral character of gâu, even though the character was never commonly used and the corresponding word possibly never existed in spoken language. Sometimes, the MOE doesn’t explicitly claim ancestry, but chooses an uncommon character because both meaning and pronunciation somewhat match the Taiwanese word. Two examples of this are 踅 for se̍h and 囥 for khǹg, which were both invented to write other varieties of Chinese and have never been used for Taiwanese.
  • The MOE is much more likely to deem a traditional usage ambiguous and recommend different characters as a result. An example is the recommendation 毋 for m̄, which has always been written with the semantic loan 不. 不 is pronounced put in a small number of Taiwanese words and when reading Classical Chinese, so the MOE recommended 毋 for m̄ to disambiguate. But any fluent Taiwanese speaker would be able to differentiate the two pronunciations. Another example is the adjective súi (“beautiful”), for which the MOE recommends the uncommon character 媠, but which has always been written with the phonetic loan 水 (súi), which means “water”. These are radically different meanings, and are unlikely to be confused. Most such “disambiguating” choices are like this: they ignore that writing is used by literate people who can understand context.

I don’t think any of these are convincing reasons to depart from the semantic or phonetic loans that were historically used. A large number of the newly recommended characters are highly uncommon, which increases the number of characters one has to know without any real benefits. It also makes for text that looks and feels uglier, because Literary Chinese characters with fewer strokes and richer literary associations have been replaced by new or uncommon characters.

September 2020