Let’s make a language – Part 1a: Phonology (Intro)

The sound of a language is, in a sense, its first impression. And first impressions matter. How a language sounds, the spoken noises that it uses, can certainly influence the opinion of a listener (or reader). In the real world, for example, Westerners often perceive Arabic as a “harsh” language because of its series of “guttural” sounds. We might also talk about Chinese as a “musical” language, since it makes use of tone, a quality we’ll come to later. For conlangs, things are no different. The Elvish languages of Lord of the Rings are praised as melodious, while the Klingons of Star Trek speak a tongue that, like them, comes across as abrasive, violent. (Of course, in the case of conlangs, we have to look at things from the other direction sometimes. Elves have “enchanting” words because they’re supposed to. Klingons are a warrior race, and their language reflects this.)

All this is to say that the sound of your language is important. Even if you’re making a purely written language (like for a book), you might need to pronounce it at some point, and many readers will certainly try. After all, Dothraki began as a few words and phrases scattered almost haphazardly throughout the books of A Song of Ice and Fire. Once those books were turned into the Game of Thrones TV series, Dothraki (and Valyrian, which is barely found in the books at all, apart from a couple of fixed phrases like valar morghulis) had to become something more “real”.

To make a language, we need to understand a little about how languages work, and this post covers one piece of that. Specifically, we’re looking at what’s called phonology, i.e., the sounds that make up a language. Obviously, if your language isn’t spoken, like a sign language, then this post won’t be of much use. Honestly, though, I have no idea of how to even begin to make a sign language, so that’s the last I’ll say about them. (I can’t think of too many signed conlangs, unless you consider ASL a conlang. The closest thing I can come up with is the elaborate gesturing or “posing” of Daniel Abraham’s Long Price Quartet series, which is more of an addition to speech than a language of its own.) Also, if you’re making a language for aliens that don’t speak the way we do, then you’ve probably got bigger problems than I can solve.

(Digression: Okay, I had this whole thing planned out where I’d go over all the phonology stuff. But I scrapped it. Why? A few reasons. First, it was about 2,000 words just for the section on consonants. That was way too long for a post. Second, plenty of other people have already done the same thing. So, instead, I’ll leave you with a link to Wikipedia’s page on the International Phonetic Alphabet, which has clickable links for just about every possible sound found in human languages, and I’ll turn this post into something more general and useful for a beginning conlanger.)

The Sounds We Make

Every language in the world has a number of phonemes, which are basic units of sound. Think of them as letters, except we’re not necessarily talking about the ones in the alphabet. English, for instance, has 26 letters, but 40 or so phonemes, depending on dialect. Many of these phonemes, however, can surface as slightly different sounds, or allophones. The P sounds in pot and top are good examples of this. They don’t sound exactly the same, but they’re close enough that English speakers call them the same thing. A language like Hindi, on the other hand, does say they are different sounds: /p/ and /pʰ/.

Which (and how many) sounds you use in your language is largely a matter of style, and that directly relates to what kind of conlang you’re making. For languages intended for international communication (auxlangs), you definitely want to use the most common sounds, most of which have the IPA values of basic English letters: /p/, /t/, /k/, and so on. Adding in fancy things like retroflex consonants (despite being common in the very populous Indian subcontinent) or palatalization (found in Slavic languages and Irish, but not many other places) will only make things harder for speakers, who then have to learn not only a new language, but new sounds to go with it.

For every other type of conlang, you might think you can just go wild with phonemes. Obviously, you can. I’m not stopping you. But something intended to sound natural should fit the patterns of natural languages. Otherwise, you end up with what I’ve heard called “shotgun phonology”. You may as well throw darts at an IPA chart. So, instead, let’s take a look at what linguistic evolution has come up with, and see if we can make something to match it.


We’ll start with consonants, both because there are more of them and because that’s where some of the most interesting possibilities lie. English has about two dozen, which is pretty much average in the world, according to Chapter 1 of the World Atlas of Language Structures. (By the way, bookmark that site; we’ll be going back to it a lot. I’ll usually refer to it as WALS from here on out.) The minimum is about 6 or so, found in a few Pacific and Amazon languages like Rotokas and Pirahã. The high end goes up to around 80 in the Caucasian language Ubykh, and the click languages of Africa can have even more if you count the combination of click and stop as a single phoneme.

So, anywhere from 6 to 80. That’s quite a range, but we can narrow it down once we start looking for patterns. That’s the key to making a conlang seem natural in its phonemic inventory. Take English as an example, since we’re already using it. English has a set of labial consonants (/p b m f v/), a set of dental and alveolar consonants (/t d n s z θ ð l r/), some post-alveolars or palatals (/ʃ ʒ tʃ dʒ j/), and a few velars (/k g ŋ w/). /h/ is the odd one out, but it’s like that in a lot of languages, so that’s okay. Looking at it from the other dimension, English has stops (/p b t d k g/), nasals (/m n ŋ/), fricatives (/f v θ ð s z ʃ ʒ h/), affricates (/tʃ dʒ/), and approximants (/r l j w/). Any way you look at it, essentially every consonant is related to another. There’s not, say, a uvular stop out by itself.
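
If you want to check an inventory for loners like that, one rough way is to tag every consonant with its place and manner and then look for any sound that has a place of articulation all to itself. Here's a minimal sketch in Python, using the English groupings from the paragraph above. The labels, the `find_outliers` name, and the choice to give /h/ a "glottal" place of its own are all assumptions for illustration, not a rigorous phonetic analysis.

```python
from collections import Counter

# Each consonant maps to (place, manner), following the rough grouping in the text.
# "dental-alveolar" stands for the /t d n s z θ ð l r/ set; the labels are illustrative.
ENGLISH = {
    "p": ("labial", "stop"), "b": ("labial", "stop"),
    "m": ("labial", "nasal"),
    "f": ("labial", "fricative"), "v": ("labial", "fricative"),
    "t": ("dental-alveolar", "stop"), "d": ("dental-alveolar", "stop"),
    "n": ("dental-alveolar", "nasal"),
    "s": ("dental-alveolar", "fricative"), "z": ("dental-alveolar", "fricative"),
    "θ": ("dental-alveolar", "fricative"), "ð": ("dental-alveolar", "fricative"),
    "l": ("dental-alveolar", "approximant"), "r": ("dental-alveolar", "approximant"),
    "ʃ": ("palatal", "fricative"), "ʒ": ("palatal", "fricative"),
    "tʃ": ("palatal", "affricate"), "dʒ": ("palatal", "affricate"),
    "j": ("palatal", "approximant"),
    "k": ("velar", "stop"), "g": ("velar", "stop"),
    "ŋ": ("velar", "nasal"),
    "w": ("velar", "approximant"),
    "h": ("glottal", "fricative"),  # assumption: /h/ gets its own place
}


def find_outliers(inventory):
    """Return the consonants that are the only member of their place of articulation."""
    place_counts = Counter(place for place, _manner in inventory.values())
    return sorted(
        symbol
        for symbol, (place, _manner) in inventory.items()
        if place_counts[place] == 1
    )


print(find_outliers(ENGLISH))  # ['h']

# Bolt a lone uvular stop onto the same inventory and it gets flagged too.
with_q = dict(ENGLISH, q=("uvular", "stop"))
print(find_outliers(with_q))  # ['h', 'q']
```

A flagged sound isn't wrong, of course. It's just one you should be able to explain.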

Any language you can think of works the same way. Spanish has a palatal series (/tʃ ɲ ʎ j/), Hindi has a set of retroflex consonants. The languages with smaller consonant inventories have broader distinctions. Rotokas, with its half a dozen consonants, divides them up in two dimensions: voiced or voiceless, and labial, alveolar, or velar. The enormous systems of the Caucasus come about similarly, but making finer distinctions. The 58 consonants of Abkhaz illustrate this. Labialized and non-labialized consonants are different in that language, and there is a set of ejective stops. Both of these combine to increase the inventory while avoiding outliers.
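
To see how adding a dimension multiplies an inventory, you can treat each distinction as a list of values and cross them. The sketch below is only that, a sketch. The first system mirrors the Rotokas description above (two voicing values, three places). The second is a made-up example of my own, not the actual Abkhaz inventory, and the `build_inventory` name is hypothetical.

```python
from itertools import product


def build_inventory(*dimensions):
    """Cross every value of every dimension; each combination is one potential phoneme."""
    return list(product(*dimensions))


# Rotokas-style: voicing times place.
small = build_inventory(
    ["voiceless", "voiced"],
    ["labial", "alveolar", "velar"],
)
print(len(small))  # 6

# Hypothetical larger stop system: add ejectives, a uvular place,
# and a plain/labialized contrast. This is an invented example.
large = build_inventory(
    ["voiceless", "voiced", "ejective"],
    ["labial", "alveolar", "velar", "uvular"],
    ["plain", "labialized"],
)
print(len(large))  # 24
```

Real languages rarely fill every cell of the grid, so think of the product as an upper bound that you then prune.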

That’s not to say you can’t have outliers. You just need a good reason for them. If you’ve got /p/, /b/, and /t/ already, you’ll probably have /d/, too, but that doesn’t always have to be the case. Especially as you go “down” the phonetic chart, from stops to fricatives to approximants, there are a lot more opportunities to add wrinkles to the system. You can have /s/ and /k/ without having /x/, like English. Or /r/ without /l/, like in Japanese.

The same is true for “rare” sounds. Conlangers tend to over-represent two of these in particular: the English “th” sounds /θ ð/. (I’m guilty of it myself, with my language Suvile.) These sounds are comparatively rare (about 1 in 10 languages have them), but they’re far more common in conlangs. The same is true for some of the more outlandish distinctions, and the reason why is simple. A conlanger sees a sound he likes, and he builds the language specifically to have it, whether it fits or not. Again, if that’s what you like, go for it, but the result might feel “fake”.


Vowels have a bit less in the way of possibilities, and vowel systems tend to fall into a few basic categories. Here, English is on the large end of the scale, with up to 20 or more vowel sounds, depending on dialect. A few languages have only two vowel phonemes (Ubykh, mentioned above, is one of these), though these may take on different qualities at different points in a word. Five is the most common, though, according to WALS Chapter 2, and those five are usually the cardinal vowels /a e i o u/. Six is also common, with the addition usually being a schwa (/ə/) or a high central vowel like /ɨ/, though something like /æ/ isn’t out of the question. Systems with four vowels drop one of the cardinal quintet, usually /o/ or /u/. Three-vowel systems are almost always /a i u/, as these are maximally distinct.

Like with consonants, the key here is regularity, at least at the start. The common five vowels can be split into high (/i u/), middle (/e o/), and low (/a/). Or you could divide them into front (/i e/), central (/a/), and back (/o u/). Larger vowel systems become that way because they add dimensions. If you have the front vowels /i/ and /e/ and the rounded vowels /o/ and /u/, it’s not that much of a stretch to add in the front and rounded /y/ and /ø/. Similarly, a quality like length or nasalization tends to “spread” through the vowel system, multiplying the number of phonemes.
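
The same idea can be written down for vowels. This is a small sketch, assuming a simple three-feature description (height, backness, rounding) that I've picked for illustration. It starts from the common five, fills in the front rounded pair, then lets length spread through the whole system. The function names are hypothetical.

```python
# (height, backness, rounded) for the common five vowels.
FIVE = {
    "i": ("high", "front", False),
    "e": ("mid", "front", False),
    "a": ("low", "central", False),
    "o": ("mid", "back", True),
    "u": ("high", "back", True),
}

# IPA symbols for the front rounded vowels mentioned in the text.
FRONT_ROUNDED = {
    ("high", "front", True): "y",
    ("mid", "front", True): "ø",
}


def add_front_rounded(vowels):
    """Give each front unrounded vowel a rounded partner, where a symbol is listed."""
    result = dict(vowels)
    for height, backness, rounded in vowels.values():
        partner = (height, backness, True)
        if backness == "front" and not rounded and partner in FRONT_ROUNDED:
            result[FRONT_ROUNDED[partner]] = partner
    return result


def add_length(vowels):
    """Return every vowel in a short and a long form (long is marked with ː)."""
    return [symbol + mark for symbol in vowels for mark in ("", "ː")]


seven = add_front_rounded(FIVE)
print(len(seven))              # 7
print(len(add_length(seven)))  # 14
```

Two small additions take five vowels to fourteen, and every one of them still has neighbors in the system.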

Vowel harmony is another of those ideas that conlangers get carried away with. The canonical example is Turkish, with its eight vowels /i y ɯ u e ø o a/. This makes a kind of 3D grid, where each vowel is either front or back, either high or low, and either rounded or unrounded. Turkish grammatical suffixes come in different forms, depending on which type of vowel they need, and a word must have its vowels all front or all back. This has an appealing symmetry of the kind that conlangers tend to love. Like the consonantal rarities above, though, there needs to be a reason, even if that reason boils down to “because it sounds cool”.
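
Here's a rough sketch of the front/back half of that system in Python. It works on ordinary Turkish spelling, where ö, ü, and ı stand for /ø/, /y/, and /ɯ/, and it assumes lowercase input. The function names are my own, and the plural suffix is the only one handled, so treat it as a toy rather than a model of Turkish.

```python
# Turkish vowels in standard orthography, split by backness.
FRONT = set("eiöü")
BACK = set("aıou")


def is_harmonic(word):
    """True if the word's vowels are all front or all back (lowercase input assumed)."""
    has_front = any(ch in FRONT for ch in word)
    has_back = any(ch in BACK for ch in word)
    return not (has_front and has_back)


def pluralize(word):
    """Attach the plural suffix, choosing -ler or -lar from the word's last vowel."""
    for ch in reversed(word):
        if ch in FRONT:
            return word + "ler"
        if ch in BACK:
            return word + "lar"
    raise ValueError("word has no vowel to harmonize with")


print(is_harmonic("evler"))  # True
print(is_harmonic("kitap"))  # False
print(pluralize("ev"))       # evler
print(pluralize("göz"))      # gözler
print(pluralize("kitap"))    # kitaplar
```

Note that kitap, a loanword, fails the check but still takes a suffix that agrees with its last vowel. Even the canonical example has exceptions.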

In my opinion, if you have no other pressing needs (like fitting in with names you’ve already made, for instance) then you should probably start with the basic five vowels. If you’re making an auxiliary language, then I’d strongly suggest stopping there. (Volapük used front rounded vowels, because its creator was German. Esperanto went with the basic set instead. Which one’s more popular?)

Everybody else probably needs more, though. Still, start with the basics. If you add vowels, make sure they fit. More than consonants, vowels have a tendency to shift around in speech, almost like they’re floating. They like to be as distinct as possible. Sure, it might sound fun to have a language whose vowels are /i y e ø ɨ ʉ ɛ ɔ ɜ ɑ/, but it wouldn’t stay that way for long. A couple of generations of real language evolution would turn it into something like /i ɪ e ə æ u ʊ o/.


Besides consonants and vowels, we have one more thing to add to our study of phonology: tone. It’s probably the most popular “extra” feature in conlangs, simply because it isn’t found in many languages Westerners would be familiar with, making it seem exotic. (And the one tonal language group most Westerners have heard of is Chinese, further reinforcing that stereotype.) But tone is actually quite common in the world’s languages, especially in places of high linguistic concentration like Africa and the Amazon.

Tone itself can be divided into two varieties. Mandarin Chinese is an example of the first, which uses relative changes in pitch: level (called “high” in studies of the language), rising, dipping (falling from a low pitch to an even lower one, then sometimes rising again), falling, and a fifth, neutral tone found in weak syllables. Other languages have more or less complicated systems, but the idea remains the same: it’s the change in pitch that is important.

The alternative is a system where the tones themselves are steady, but at different levels. This is found, e.g., in Bantu languages of Africa. These are usually languages with two tones, a high and a low, or three, adding a middle tone. Four or more tones of this kind are rare, and it’s easy to see why. I mean, you could make a language with seven tones, each corresponding to a note on the major scale, and such a thing has indeed been done, but it would be awfully hard to speak. For speakers of such a language, singing lessons might be an integral part of grammar classes!

Obviously, an international auxlang likely won’t have tone, although one intended solely for communication in places where most languages are already tonal wouldn’t be out of the ordinary. For the more artistic conlangs, do whatever you want! In terms of numbers of languages, about half are tonal, though this is skewed by the large concentrations of tonality I mentioned above. (On a personal note, I’ve made one serious attempt at a tonal language, Lyssai. It’s for a race of elf-like forest dwellers in a story I’ll eventually write.)


Note: If you’re making an auxiliary language, you can probably skip this section.

A lot of the flavor of a language comes from its sound, and that sound comes largely from the phonemes used in the language. (Some of it comes from the syllable structure and stress patterns, which we’ll get into next time.) Guttural sounds from the back of the throat grate on American ears, while the liquid sounds of approximants and trills feel soft. Palatalized sounds have a “slurring” quality, while dentals make us think of a lisp.

For fictitious cultures, this stereotyping becomes useful. Tolkien puts into the mouths of his elves words full of fricatives and approximants and voiceless stops, all phonemes perceived as soft. In sharp contrast, orc speech is full of aspirated or voiced stops, both “uglier” types of sounds, a subtle way of confirming their status as the enemy.

Of course, if you’re making a language meant to be spoken by actors, you need to take that into account, too. That’s why Dothraki, for example, has such a relatively simple phonology. (The exception is the lone uvular stop [q], which goes against what I said earlier about phoneme sets, but its creator is getting paid, and I’m not. Oh well.)

So, the lessons we can learn here are many:

  1. If you’re making an auxiliary language, choose sounds and sound distinctions that are fairly common. Esperanto arguably screwed up by including a palatal series. Volapük did the same with front rounded vowels. Of course, French was once the lingua franca (it’s right there in the name), and it has a pretty complex phonology, so there are always exceptions.

  2. Artistic languages can have whatever sounds you can pronounce. But remember your audience. Americans probably aren’t going to be able to pronounce pharyngeals. Japanese speakers might not be able to manage [θ] and [ð].

  3. Phonemes, especially stops, tend to be connected. A distinction made on only one phoneme feels unnatural. It’s not impossible, mind you, just less likely.

  4. Vowels are like a gas. They expand to fill their space, and they spread out. The fewer you have, the more guises they can take. A language with only /a i u/, for example, can still have [e o] as allophones.

  5. Tone is nice, and it can be interesting, but you need to study up on how it’s used. (Actually, this can go for anything else in this gigantic post.)

  6. There are more things in heaven and earth than are dreamt of in your language. The conlang community has a saying known as ANADEW: a natlang (natural language) already did, except worse. Almost every concept that a conlanger thinks he came up with already exists in some real language spoken somewhere.

That’s it for now. (Finally!) Next time, we’ll get into the sound systems of our two languages, Isian and Ardari.
