The Part You're Excited and Worried About: Chinese Characters

Alright, let’s get right to the point! At first exposure, Chinese characters can be intriguing, mysterious, and daunting all at once. There are over 50,000 Chinese characters! Fortunately, the average educated person only knows about 8,000 characters, and you could read a typical newspaper knowing somewhere between 2,000 and 3,000. Does this still sound like a daunting amount to learn? Don’t stress: in the next post we’ll distill this number down quite a bit further. But first, what exactly are Chinese characters anyway?

Many people have the misconception that all Chinese characters were formed as pictures of things. However, this isn’t true for the most part. Only about 600 Chinese characters were actually created in this way! We will get into how the rest are made, but we have to discuss a few more things first.

Chinese characters are individually formed by a series of strokes. A stroke is each individual mark you make on the page when handwriting a character. Strokes are the building blocks of all characters. Each character even has a specific stroke order: its strokes are always written in the same sequence, which makes the way of writing each character easier to remember. The details of the stroke order rules are beyond the scope of this guide, so you can refer to the following link for more information: Chinese Stroke Order Rules.

Sidebar: For the most part, I believe that learning to handwrite Chinese is particularly inefficient, given the ease of using input methods on our phones or computers (to be discussed later). Unless you see a specific need for handwriting skills, the time investment required to learn handwriting will significantly slow your progress. This is because recognizing a character is far easier than remembering its intricate details well enough to write it correctly.

These strokes typically combine to form what are called radicals. Radicals are a defined set of stroke groupings that are used across many different characters to signify general meanings. Complete characters are then formed by putting these radicals together in different combinations. Theoretically, if you know the meaning behind all of the radicals, you can make a guess at the general meaning of a character that uses a combination of them. In practice though, I find that it’s pretty difficult to determine the exact meaning in this way, although it definitely offers a hint in the right direction. For the most part, this method helps you understand in retrospect how the radicals convey a character’s meaning after you’ve already learned it.

There are 214 different Chinese radicals and they’re shown below roughly in order of the number of strokes. Note that the list below includes all 214 radicals in addition to their variants used in some characters. When you learn characters, you will quickly start to recognize these radicals as they will appear again and again. I found that the idea of learning individual characters seemed much simpler once I recognized that these common radicals form the basis for pretty much all characters.
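If you’d like to print the radicals yourself, Unicode has a dedicated Kangxi Radicals block (U+2F00 to U+2FD5) that holds the 214 base forms. Here is a minimal Python sketch, assuming your terminal font can display the glyphs. The function name kangxi_radicals is just my own label, and this block does not cover the variant forms.

```python
import unicodedata

# Unicode's Kangxi Radicals block runs from U+2F00 to U+2FD5 inclusive.
KANGXI_START, KANGXI_END = 0x2F00, 0x2FD5


def kangxi_radicals():
    """Return (number, glyph, Unicode name) for every radical in the block."""
    return [
        (index + 1, chr(code_point), unicodedata.name(chr(code_point)))
        for index, code_point in enumerate(range(KANGXI_START, KANGXI_END + 1))
    ]


radicals = kangxi_radicals()
print(len(radicals), "radicals in the block")

# Show the first few entries; loop over the whole list to see them all.
for number, glyph, name in radicals[:5]:
    print(number, glyph, name)
```

The Unicode names are English glosses of each radical, which makes them a handy cheat sheet while you are still getting familiar with the shapes.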

I’m sure you know that Chinese culture has been around for a very long time, as have Chinese characters. They began as what is called oracle bone script, the earliest form of Chinese characters, which was carved onto animal bones. These eventually evolved into what are now called Traditional Chinese Characters. This traditional character set was used throughout China until the 1950s, when the government promoted the simplification of the character system into what are called Simplified Chinese Characters. Simplifying the characters was intended to boost literacy throughout China by removing complexity from the written language. Through this process, many strokes were removed from common characters with a high number of strokes. Below are a few examples of traditional characters and their simplified equivalents.

  • country: (guó) 國→国
  • to speak/talk/say: (shuō) 說→说
  • to learn: (xué) 學→学
  • factory: (chǎng) 廠→厂

Today, mainland China uses simplified characters almost exclusively, while Hong Kong, Macau, and Taiwan all use traditional characters. If you are learning Mandarin specifically because you plan to live in one of these places, it would be wise to start by learning traditional rather than simplified characters. Not to mention, if you are planning to live in Hong Kong or Macau, you might want to consider learning Cantonese instead of Mandarin anyway.

While learning the radicals can be helpful for working out character meanings, you should also be aware that some characters’ fundamental radicals were lost in the transition from traditional to simplified characters. Take the example of the word tīng, meaning to listen. The traditional and simplified forms are shown below.

聽 → 听

The simplified version of this character has lost the ear, eye, and heart radicals, which have been replaced with a mouth radical. This doesn’t fully capture the changes to this character, but I think we can see that the idea of listening, illustrated by an abstract combination of hearing (ear – 耳), seeing (eye – 目), and feeling (heart – 心), has been somewhat lost in the simplified version. This obviously makes things more difficult if you expect to guess the meaning based on the radicals alone.

Okay, now we know how characters are created with common radicals. Let’s discuss more about the types of characters themselves. There are actually six different categories of Chinese characters, called 六书 (liùshū) or The Six Writings. I should preface this by saying that the following classifications are good context to have when learning characters, particularly because they make characters a bit easier to learn. But try not to get too bogged down in the details of the categories, because I think that being aware they exist is more important here than knowing which characters fall into each category. These six categories are:

1. Pictographs (象形) – these are characters that look like things, as we discussed earlier. Some commonly shown examples are:

  • 人 (rén – person)
  • 手 (shǒu – hand)
  • 山 (shān – mountain)
  • 木 (mù – tree or wood)
  • 馬 → 马 (mǎ – horse) (You can see how the resemblance to a picture of a horse has been lost a bit from the traditional to the simplified form)

2. Ideographs (指事) – these characters represent more abstract ideas but in a simple form that is just a more abstract version of a pictograph. Several common examples of this concept are:

  • 上 (shàng – on), 下 (xià – under)
  • 一 (yī – one), 二 (èr – two), 三 (sān – three)

3. Compound Ideographs (会意) – these combine two or more characters from the above two categories to create a new meaning out of the individual meanings.

  • 好 (hǎo – good) = 女 (nǚ – woman) + 子 (zǐ – child)
  • 男 (nán – male) = 田 (tián – field) + 力 (lì – power)
  • 林 (lín – woods / forest) = 2 x 木 (mù – tree or wood)
  • 森 (sēn – forest) = 3 x 木 (mù – tree or wood)

Sidebar: These first three are probably fairly consistent with what most people initially think about how Chinese characters work. But the interesting thing is that they actually compose a pretty small part of the number of Chinese characters in existence. The vast majority (over 90%) of characters are actually phono-semantic compounds (type #4).

4. Phono-semantic compounds (形声) – these characters are composed of a radical (usually on the left) and a phonetic component (usually on the right). The radical gives the character some general meaning, based on the meaning attributed to that radical. The phonetic component contributes to the pronunciation of the character, but typically not the meaning. For example, all of the following characters are pronounced dié.

碟 蝶 谍 鲽 堞 蹀 喋 牒

The great thing about this is we can use this to our advantage in learning Chinese. That is, once you’ve learned the majority of characters by frequency (hang on for the next post for a game-changer on this!) you will start to notice a lot of new characters for which you can accurately guess the pronunciation just based on the phonetic components they contain from characters you already know.

But we should note that there is one reason you might sometimes guess wrong: phonetic drift. Characters composed in this way often show slight differences in pronunciation, particularly between similar unaspirated and aspirated sounds (don’t worry, we’ll discuss this in just a bit in the pinyin pronunciation post). My guess is that this is due to drift toward other similar pronunciations over time, since the characters themselves do not indicate pronunciation exactly. Basically, it’s like the result of a centuries-long game of telephone. For example, take a look at the two sets of characters below.

波 (bō), 玻 (bō), 跛 (bǒ)
坡 (pō), 破 (pò), 婆 (pó)
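To make the guessing idea and its limits concrete, here is a small Python sketch. It assumes a hand-typed table built only from the example characters above (not a real dictionary), and the names GROUPS, strip_tone, and syllables are my own. It strips the tone marks from each pinyin reading and collects the distinct syllables that share a phonetic component.

```python
import unicodedata

# Combining marks used for the four pinyin tones (macron, acute, caron, grave).
TONE_MARKS = {"\u0304", "\u0301", "\u030c", "\u0300"}

# Hand-entered from the examples in this post, keyed by the shared component.
# Assumption: 婆 is grouped under 皮 because it contains it (by way of 波).
GROUPS = {
    "枼": {"碟": "dié", "蝶": "dié", "谍": "dié", "鲽": "dié",
           "堞": "dié", "蹀": "dié", "喋": "dié", "牒": "dié"},
    "皮": {"波": "bō", "玻": "bō", "跛": "bǒ",
           "坡": "pō", "破": "pò", "婆": "pó"},
}


def strip_tone(pinyin):
    """Remove the tone mark from a pinyin syllable, e.g. 'dié' -> 'die'."""
    decomposed = unicodedata.normalize("NFD", pinyin)
    kept = "".join(ch for ch in decomposed if ch not in TONE_MARKS)
    return unicodedata.normalize("NFC", kept)


def syllables(group):
    """Return the set of toneless syllables used by characters in a group."""
    return {strip_tone(reading) for reading in group.values()}


for component, group in GROUPS.items():
    found = sorted(syllables(group))
    verdict = "consistent" if len(found) == 1 else "drifted"
    print(component, found, verdict)
```

A group with a single syllable is a safe bet for guessing a new character’s sound. A group with more than one, like the b/p split here, tells you the guess will be close but may not be exact.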

In my experience, #5 and #6 are fairly rare, so I will skip the examples. If you’re interested, you can research these categories a bit for yourself later.

5. Phonetic loan (假借) – these characters were existing characters that were basically borrowed (and sometimes altered) to be used for a completely different meaning.

6. Derivative cognates (转注) – these are characters that may have originated from the same etymology and were split into separate characters over time.

That was probably a lot of detail to go into, but the main take-home here is pretty simple: some characters look like physical things or convey abstract things, but actually the vast majority of characters are phono-semantic compounds.

Now that we have learned about the characters themselves, it’s time to crush most of your concerns about learning Chinese characters. Let’s discuss the secret to rapid success in learning characters…the Pareto Principle.

