On writing systems

Most conlangers work with text: text files, wordlists, and the like. It’s very much a visual process, quite the opposite of “real” languages. Yes, we think about the sound of a language while we’re making it, but the bulk of the creation is concerned with the written word. It’s just easier to work with, especially on a computer.

Writing, of course, has a long history in the real world, and many cultures have invented their own ways of recording the spoken word. For a conlang, however, the usual form of writing is a transcription into our own alphabet. Few go to the trouble of creating their own system of writing, their own script. Tolkien did, to great effect, but he was certainly an outlier. That makes sense. After all, creating a language is hard enough. Giving it its own script is much more effort for comparatively little payoff.

But some are willing to try. For those who are, let’s see what it takes to create writing. In this post, specifically, we’ll look at the different kinds of scripts out there.

Alphabet

The alphabet is probably the simplest form of script, from the point of view of making one. You don’t really need an example of an alphabet—unless this post was translated into Chinese while I wasn’t looking, you’re reading one! Still, our familiar letters aren’t the only possibility. There’s the Greek alphabet, for example, as well as Cyrillic and a few others.

Alphabets generally have a small inventory of symbols, each used (more or less) for a single phoneme. Obviously, English is far from perfect on that front, but that’s okay. It doesn’t have to be perfect. The principle stands, even if it’s stretched a bit. None of our 26 letters stands for a full syllable, right?

That’s why alphabets are so easy to make, and why they’re (probably) the most common form of writing for conlangs. You only need a few symbols—and there’s nothing saying you can’t borrow a few—and you’re all but done. Writing in the script you make can be as simple as exchanging letters for glyphs.
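To make the letter-for-glyph idea concrete, here is a minimal Python sketch. The glyph table is purely a placeholder: it borrows a few Greek letters to stand in for invented symbols, and the `to_script` name is made up for this example.

```python
# Hypothetical glyph table: Greek letters stand in for invented glyphs.
GLYPHS = str.maketrans({
    "a": "α", "b": "β", "d": "δ", "i": "ι", "s": "σ",
})

def to_script(text: str) -> str:
    """Swap each romanized letter for its glyph; unmapped characters pass through."""
    return text.lower().translate(GLYPHS)

print(to_script("bid"))  # βιδ
```

A real script would need a full table, and probably a font, but the principle is no more than this one-to-one lookup.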

Abjad and abugida

These two foreign terms name two related variations on the alphabet. The abjad is a script where only consonants are directly written; vowels are represented by diacritics, if at all. That’s the basic system used by Arabic and many of its cousins, as in “كتابة” (kitāba). Note that Arabic isn’t a “pure” abjad, though. The third letter (reading right-to-left) stands for the long a, while the final a has its own letter. As with English, that’s fine. Nobody’s perfect.

The abugida is similar to the abjad, but it does mark vowels. Unlike an alphabet, this is usually with some form of diacritic or as an “inherent” vowel, but it’s always there. Many of the various languages of India use this type of script, such as the Devanagari used by Hindi: लेखन (lekhan). This particular word has three “letters”, roughly standing for l, kh, and n. The vowel a (actually a schwa) is implicit, and it’s omitted at the end of words in Hindi, so only the first letter needs a diacritic to change its vowel. Once more, the scheme isn’t perfect, but it works for a few hundred million people, so there you go.
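If you want to see that structure for yourself, Python’s standard `unicodedata` module can list what the Hindi word is made of. This is only a sketch; the `describe` helper is a name invented here, and the "combining mark" label covers every Unicode mark category, not just vowel signs.

```python
import unicodedata

def describe(word: str) -> list[tuple[str, str]]:
    """Pair each code point's Unicode name with a rough 'letter' or 'combining mark' label."""
    rows = []
    for ch in word:
        is_mark = unicodedata.category(ch).startswith("M")
        rows.append((unicodedata.name(ch), "combining mark" if is_mark else "letter"))
    return rows

for name, kind in describe("लेखन"):
    print(f"{name} ({kind})")
```

The output shows three letters and a single vowel sign attached to the first of them, which matches the description above: the other two consonants rely on the inherent vowel.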

Syllabary

Alphabets, abjads, and abugidas all have one thing in common: they work on the level of phonemes. That makes intuitive sense, particularly in languages with complex phonotactics. When there are hundreds of thousands of possible syllables, but only a few dozen individual phonemes, the choice is clear. (That hasn’t stopped some crazy people from trying to make a syllabary for English, but I digress.)

The syllabary, by contrast, gives each syllable its own symbol. Realistically, to use a “pure” syllabary, a language almost has to have a very simple syllabic structure. It works best with the CV or CVC languages common to Asia and Oceania, and that’s probably why the best-known syllabary, the Japanese kana, comes from that region: てがき (tegaki).

A syllabary will always have more symbols than an alphabet (about 50 for Hiragana, plus diacritics for voicing), but not an overwhelming number of them. Syllabaries made for more complicated structures usually have to make a few sacrifices; look at the contortions required in Japanese to convert foreign words into Katakana. But with the right language, they can be a compact way of representing speech.
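The voicing diacritic is easy to see in Unicode, which can store a voiced kana as a base sign plus a combining mark. The sketch below uses the standard `unicodedata` module; `split_voicing` is just a name chosen for this example.

```python
import unicodedata

def split_voicing(text: str) -> list[str]:
    """Decompose each kana into its base sign plus any voicing mark (NFD form)."""
    return [unicodedata.normalize("NFD", ch) for ch in text]

for kana, parts in zip("てがき", split_voicing("てがき")):
    print(kana, [f"U+{ord(p):04X}" for p in parts])
```

Only the middle sign splits in two: が comes apart as か (ka) plus the combining voiced sound mark U+3099, so ga is literally "ka, but voiced".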

Featural

A featural alphabet is another possibility, sitting somewhere between an alphabet and a syllabary. In this type of script, the letter forms are phonemic, but they are constructed to illustrate their phonetic qualities. Korean is the typical example of a featural script: 필적 (piljeog). As you can see (hopefully; I don’t seem to have the right font installed on this computer), each character does encode a syllable, but it’s obviously made up of parts that represent the portions of that syllable.
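If the font lets you down, you can still pull the blocks apart in code. This sketch uses Unicode’s canonical decomposition through Python’s standard `unicodedata` module; the `jamo` function name is invented here. It shows the letters inside each syllable block, not the featural design of the letter shapes themselves.

```python
import unicodedata

def jamo(syllables: str) -> list[str]:
    """Return the Unicode names of the letters inside each Hangul block."""
    decomposed = unicodedata.normalize("NFD", syllables)
    return [unicodedata.name(ch) for ch in decomposed]

for name in jamo("필적"):
    print(name)
```

Each of the two blocks yields three letters: an initial consonant (CHOSEONG), a vowel (JUNGSEONG), and a final consonant (JONGSEONG).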

Featural alphabets might be overrepresented in conlanging, because they appeal to our natural rationality. Like agglutinative languages, they’re almost mechanical in their elegance. They only require the creation of an alphabet’s worth of symbols, but they give the “look” of a more complex script. If you like them, go for it, but they’re probably rare in the world for a reason.

Logographic

Finally, we come to the logographic script. In this system, each glyph stands for a morpheme or word, with the usual caveat that no real-world system is perfectly pure. Chinese is far and away the most popular logographic script these days: 写作 (xiězuò). Chinese characters have also been borrowed into Korean, Japanese, and other neighboring languages, but they aren’t the only logograms around. Cuneiform, hieroglyphs (Egyptian, Mayan, or whatever), and a few other ancient scripts are logographic in nature.

It should be blatantly obvious what the pros and cons are. The biggest downside to logograms is the sheer number of them you need. More than 40 percent of Unicode’s Basic Multilingual Plane is set aside for Chinese characters, and that’s still not enough. Everything about them is harder, whether writing, inputting, or even learning them. In exchange, you get the most compressed, most unambiguous script possible. But the task might be too daunting for a conlanger.
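A quick tally gives a sense of the scale. The sketch below adds up the three main ideograph blocks in the Basic Multilingual Plane, using the block ranges from the Unicode Standard. Counting whole blocks (including any unassigned slots) and leaving out radicals, strokes, and the like are simplifications made for this example.

```python
# Inclusive block ranges from the Unicode Standard.
CJK_BLOCKS = {
    "CJK Unified Ideographs": (0x4E00, 0x9FFF),
    "CJK Unified Ideographs Extension A": (0x3400, 0x4DBF),
    "CJK Compatibility Ideographs": (0xF900, 0xFAFF),
}
BMP_SIZE = 0x10000  # U+0000 through U+FFFF

cjk_total = sum(end - start + 1 for start, end in CJK_BLOCKS.values())
share = cjk_total / BMP_SIZE
print(f"{cjk_total} of {BMP_SIZE} code points ({share:.0%})")  # 28096 of 65536 code points (43%)
```

The further extensions live outside the BMP entirely, which is the "still not enough" part.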

The mix

In truth, no language falls neatly into one of the above categories. English is written in an alphabet, yes, but we also have quite a few logograms, such as those symbols on the top row of your keyboard. And with the advent of emoji, the logographic repertoire has grown exponentially. Similarly, Arabic has alphabetic properties, Japanese uses Chinese logograms and Latin letters in addition to its syllabic kana, and the phonetic diacritics used by languages such as German are essentially featural.

For your conlang, the style you choose is just that: a style. It’s an artistic choice. Alphabets (including abjads and abugidas) are far easier. Syllabaries can work if you have the right language, or are willing to play around. Logograms require an enormous effort, but they’re so rare that they might be interesting in their own right. And featural systems have the same “logical” appeal as conlangs like Lojban. Which you choose is up to you, but a natural script won’t be limited to one of them. It will borrow parts from the others.

Creating a script for a conlang can be a rewarding task. It’s not the type of thing to undertake lightly, however. It’s a lot of work, and it takes a bit of artistic vision. But you wouldn’t be making a language if you weren’t something of an artist, right?

Let’s make a language – Part 15b: Color terms (Conlangs)

So we’ve seen how real-world languages (or cultures, to be more precise) treat color. Now let’s take a look at what Isian and Ardari have to say about it.

Isian

Isian has a fairly short list of basic color terms. It’s got the primary six common to most “developed” languages, as follows:

Color Word
white bid
black ocom
red ray
green tich
yellow majil
blue sush

We’ve actually seen these before, in the big vocabulary list a few parts back, but now you know why those colors were picked.

There are also three other “secondary” terms. Mesan is the Isian word for “gray”, and it runs the gamut from black to white. Sun covers browns and oranges, with an ochre or tawny being close to the “default”. In the same way, loca is the general term for purple, pink, magenta, fuchsia, and similar colors. Finally, mays and gar are “relative” terms for light and dark, respectively; gar sush is “dark blue”, which could be, say, a navy or royal blue.

All these words are adjectives, so we can say e sush lash “the blue dress” or ta ocom bis “a black eye”. Making them into nouns takes the same effort as any other adjective, using the suffix -os. Thus, rayos refers to the color red; we could instead say rayechil “red-color”.

Derivation is also at the heart of most other Isian color names. Compounds of two adjectives aren’t too common in the language, but they are used for colors. In all cases, the “primary” color is taken as the head of the compound. Some examples include:

  • raysun, a reddish-brown or red-orange; some hair colors, like auburn, might also fit under this term.
  • majiltich, a yellow-green close to chartreuse.
  • tichmajil, similar to majiltich, but more yellow, like lime.
  • locasush, a mix of blue and purple, a bit like indigo.
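For anyone keeping their lexicon in code, the patterns so far are simple enough to sketch in a few lines of Python. The function names are invented for this example, and `compound` just joins the two words in the order given; it makes no claim about which part is the head.

```python
# Isian color vocabulary taken from the text above.
BASIC = {"white": "bid", "black": "ocom", "red": "ray",
         "green": "tich", "yellow": "majil", "blue": "sush"}
SECONDARY = {"gray": "mesan", "brown/orange": "sun", "purple/pink": "loca"}
SHADE = {"light": "mays", "dark": "gar"}

def noun(adjective: str) -> str:
    """Turn a color adjective into a noun with the suffix -os."""
    return adjective + "os"

def shade(which: str, color: str) -> str:
    """Build a two-word phrase such as gar sush 'dark blue'."""
    return f"{SHADE[which]} {color}"

def compound(first: str, second: str) -> str:
    """Join two color adjectives into one word, in the order given."""
    return first + second

print(noun(BASIC["red"]))                                 # rayos
print(shade("dark", BASIC["blue"]))                       # gar sush
print(compound(BASIC["red"], SECONDARY["brown/orange"]))  # raysun
```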

Most other colors are named after those things that have them. “Blood red”, for instance, is mirokel (using the adjectival form of miroc “blood”). Halakel is “sky blue”, and so on. As with English, many of the names come from flowers, fruits, woods, and other botanical origins. We’ll look at those in a later post, though.

Ardari

To look at Ardari’s color terminology, we’ll need to work in stages, as this uncovers a bit of the language’s history. First, it seems that Ardari went a long time with four basic colors:

Color Word
white ayzh
black zar
red jor
green rhiz

Yellow (mingall) and blue (uswall) got added later, likely beginning as derivations from some now-lost roots. (The sun and the sky are good bets, based on what we know about real-world cultures.)

Next came a few more unanalyzable roots:

Color Word
brown dir
orange nòrs
purple plom
pink pyèt
gray rhuk

That gives the full array of eleven that many languages get before moving on to finer distinctions. Add in wich “light” and nyn “dark”, and you’re on your way to about 30 total colors.
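The arithmetic behind that figure is easy to check. This sketch pairs each of the eleven terms with both modifiers, assuming the modifier comes first and that every combination is allowed. In practice a few, like a "light white", probably wouldn’t see much use.

```python
BASIC = ["ayzh", "zar", "jor", "rhiz", "mingall", "uswall",
         "dir", "nòrs", "plom", "pyèt", "rhuk"]
MODIFIERS = ["wich", "nyn"]  # "light", "dark"

# Plain terms, plus every modifier + color phrase.
terms = list(BASIC)
terms += [f"{m} {c}" for m in MODIFIERS for c in BASIC]
print(len(terms))  # 33
```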

Ardari doesn’t use compounds very often, so most of the other color terms are derived in some fashion. Two good examples are the similar-sounding wènyät “gold” and welyät “sky blue”. These started out as nothing more than adjectival forms of owènyi “gold” and weli “sky”, turned into adjectives by the -rät suffix we met not too long ago, and worn down a bit over time.

Another color word, josall, is an example of a more abstract or general term. It covers very light colors like beige and the pastels. It’s lighter even than wich nòrs or wich jor would be, but with more color than pure white. The word itself probably derives from josta “shell”, so you could describe it as a seashell color.

Grammatically, Ardari color terms are adjectives, so they inflect for gender just like any other. They can be used directly as nouns. And you can add the suffix -it to make something like English “-ish”: jorit “reddish”. That’s really all there is to it.

Moving on

Both our conlangs could easily have a hundred more words for various colors, but these are enough for now. You get the idea, after all. So it’s time to head to the next topic. I still haven’t thought of what that will be. At some point (probably by the time I write Part 16), I’ll have to make some tough decisions about the world around Isian and Ardari, because we’re fast approaching the point where that will matter. So the series might go on a hiatus of a few weeks while I brainstorm. We’ll see.

Let’s make a language – Part 15a: Color terms (Intro)

Once you have the grammar parts figured out, most of the rest of the conlanging process is making words. We began to see that in Part 14, when we discussed deriving new words from existing roots. This time around, we’re going back to the roots (pun intended) and looking at a very specific set of words: the color terms.

Color terms are, well, terms for colors. They’re the names you see on crayons or paint swatches. As anyone who has been to a hardware store knows, there are thousands of these, but we’ll focus on the absolute basics. Most colors are named after things that are that color, like “violet” or “salmon”. A few, however, are truly basic: “red”, “yellow”, “black”, and so on. These are the ones that most interest us here.

More importantly, which color terms are considered “basic” turns out to be a way in which languages differ. That makes this subject an excellent illustration of how a language can divide up the “semantic space”. Not every language is the same in this respect, and realizing that is a good step towards creating a more naturalistic conlang, rather than a simple cipher of your native tongue.

The color hierarchy

Every language has at least two basic colors. That seems to be a linguistic and cultural universal. But according to a study by Berlin and Kay (1969), what comes next follows a fairly regular trajectory. To be sure, there are outliers, but the past few decades have only reinforced the notion of a developmental hierarchy of color terms, making it a useful model for conlangs.

The first distinction in color is near-universal: light and dark. This can also be black and white or warm and cool; the specifics won’t matter too much. Mostly, yellow and red fall in with white in this scheme, while blue and green are dark. Other colors, like purple, brown, or orange, fall in somewhere along this spectrum. Exactly where is different for each language. It’s easy to see pink as “light” and purple as “dark”, but what about a soft lavender or a deep ruby?

At some point, probably fairly early in a culture’s history, a new color term comes about, splitting “light” into white and red. This seems obvious, as blood is red, and it’s a very important part of humanity. Yellow also tends to get lumped in with red in this scheme, meaning that most oranges do, too.

The next two colors to “break off” are green and yellow, in either order. Green can come first or yellow can, but they both need to be present before the next stage can begin. Once a language has these five color terms—black, white, red, green, yellow—then it’s on to the sixth and final major color: blue.

These six are the main group, then, and there’s a very good reason why. Human vision, as anybody who took biology knows, has two key parts: rods and cones. The rods are monochromatic, distinguishing only light and dark; in other words, just like a two-color-term language. The cones, however, are how we see color. They come in three flavors, roughly corresponding to red, green, and blue.

So that’s probably a good explanation for the first six basic color terms. Red has the longest wavelength, so it’s the easiest to see, hence why stop signs and a car’s brake lights are red. It stands to reason that it would be singled out first. The eye’s green cones tend to be the most sensitive, but green and yellow are pretty close together, spectrally speaking, so they’re the next two, but their similarity leads to the flip-flop in which comes first. And then that leaves blue.

What about the others, though? Well, there it gets murky. Brown is usually the seventh basic color, distinguished from red and yellow. After that, there’s no real set order among the next four: orange, pink, purple, and gray. But those eleven, possibly accompanied by one or more lighter or darker shades (cyan, magenta, azure, etc.), make up the core color terminology of the majority of languages.
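For a conlang, the hierarchy works nicely as a generator: pick how many basic terms your language has, and read off a plausible inventory. The Python sketch below is one way to encode the order described above. The function name is invented here, and "white" and "black" stand in for light and dark at the two-term stage.

```python
import random

def color_inventory(n_terms, rng=None):
    """Return a plausible list of basic color terms for a language with n_terms of them."""
    if not 2 <= n_terms <= 11:
        raise ValueError("the model described here covers 2 to 11 basic terms")
    rng = rng or random.Random()

    middle = ["green", "yellow"]
    rng.shuffle(middle)  # either one may come first
    late = ["orange", "pink", "purple", "gray"]
    rng.shuffle(late)    # no set order among these four

    order = ["white", "black", "red", *middle, "blue", "brown", *late]
    return order[:n_terms]

print(color_inventory(3))  # ['white', 'black', 'red']
print(color_inventory(8))  # the eighth term varies from run to run
```

Pass in a seeded `random.Random` if you want the same inventory every time. Remember that real languages have outliers, so treat the result as a starting point rather than a rule.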

The rest of the box

All the other colors’ names will be derived in some way, and that can include some from the above list, if a language doesn’t have a full complement of basic terms. One way of doing this is with adjectives that specify a particular shade of a color. English has lots of these: dark, light, pale, deep, etc. The new color names produced with them aren’t single words, but phrases like “dark blue” or “pale pink”; other languages might have ways of compounding them, though.

Compounds give us another way of making new color words. By combining two basic colors, we can get new ones. That’s how we have “red-orange” or “blue-green”, to name but two. They’re in-between colors, and they tend to be composed of two colors adjacent on the spectrum. It’s hard to imagine a “yellow-blue” that isn’t green, for instance.

Another possibility is the abstract color word. These aren’t basic terms; instead, they tend to come about as finer distinctions of shade. They may have started off with some other meaning, but they now refer almost exclusively to a specific range of colors. Maroon and cyan are a couple of English examples.

By far, though, the best way of making names for colors is through description. Something that has a certain color becomes a descriptor for that particular color (“navy blue”, for instance), then, eventually, the color’s name. That’s how it worked for salmon, coral, violet, lavender, and hundreds of others. It may have even been the case for orange, as the fruit’s name seems to be older than the color term. And if the original reason for one of these names is lost, then it may come to be considered an abstract term; indigo is one color that has gone through this process.

Using all these, a language can easily fill up even the biggest box of crayons. But the more color terms you have, the less of the color space each one covers. There will be overlap, of course, and the general terms will always cover more area than the more specific ones. And every language makes its own distinctions. The border between, say, red and yellow isn’t set in stone.

Even weirder

A few conlangers like making languages for speakers that aren’t ordinary humans. Since we’re moving into more culture-specific parts of language, this is a good opportunity to look at what needs to be done for that sort of conlang.

If the prevailing theory is accurate, basic color terms come about in the order they do because of human vision, as we saw above. A race that doesn’t follow normal human rules, however, will have a different color hierarchy. Some people, for example, have a fourth set of cone cells, purportedly letting them see otherwise “impossible” colors. Tetrachromats, as they’re called, effectively have a fourth primary color at their disposal.

An entire race (in the literary sense) of tetrachromats would have a language that reflects this. Where their fourth color fits into the hierarchy would depend on the specifics of how that fourth cone cell works, but it would certainly be in that first group alongside red, green, yellow, and blue.

Similarly, red-green colorblindness could be the norm for a race. In that case, red and green wouldn’t differentiate, obviously, but the rest of the diminished color space would also be changed. In fact, it’s easy to imagine such a race never getting past the light/dark stage.

And no discussion of color vision would be complete without including the neighboring portions of the spectrum. The human lens blocks ultraviolet, but some people report being able to see it. Vision reaching into the infrared is a little more plausible for our species. Aliens, though, could have their equivalent to cones reach their peak sensitivity at different points of the spectrum, allowing them to see into the deeper or higher ranges. Their color terms would likely reflect this, and an alien race could have a whole collection of words for color combinations that we simply cannot see.

Next up

Next time, we’ll look at our two conlangs and their color words. Then, it’ll be off to another part of the semantic realm, but I don’t yet know exactly which one. Stay tuned.

The future of auxlangs

Auxlangs are auxiliary languages: conlangs specifically created to be a medium of communication, rather than for artistic purposes. In other words, auxlangs are made to be used. And two auxlangs have become relatively popular in the world. Esperanto is actually spoken by a couple million people, and it has, at times, been considered a possibility for adoption by large groups of people. Lojban, though constructed on different principles, is likewise an example of an auxlang being used to communicate.

The promise of auxlangs, of course, is the end of mistranslation. Different languages have different meanings, different grammars, different ways of looking at the world. That results in some pretty awful failures to communicate; a quick Internet search should net you hundreds of “translation fails”. But if we had a language designed to be a go-between for speakers of, say, English and Spanish, then things wouldn’t be so bad, right?

That’s the idea, anyway. Esperanto, despite its numerous flaws, does accomplish this to a degree. Lojban is…less useful for speaking, but it has a few benefits that we’ll call “philosophical”. And plenty of conlangers think they will make the one true international auxiliary language.

So let’s fast-forward a few centuries. Esperanto was invented on the very edge of living memory, as we know, and Lojban is even younger than that, but Rome wasn’t built in a day. Once auxlangs have a bit of history behind them, will any of them achieve that Holy Grail?

The obvious contender

They’d have to get past English, first. Right now, the one thing holding back auxlang adoption is English. Sure, less than a quarter of the world’s population speaks it, but it’s the language for global communication right now. Nothing in the near future looks likely to take its place, but let’s look at the next best options.

Chinese, particularly Mandarin, may have a slight edge in sheer numbers, but it’s, well, Chinese. It’s spoken by Chinese, written by Chinese, and it’s almost completely confined to China. Sure, Japan, Korea, and much of Southeast Asia took parts of its writing system and borrowed tons of words, but that was a thousand years ago. Today, Chinese is for China. No matter how many manufacturing jobs move there (and they’re starting to leave), it won’t be the world language. That’s not to say we won’t pick a few items from it, though.

On the surface, Arabic looks like another candidate. It’s got a few hundred million speakers right now, and they’re growing. It has a serious written history, the support of multiple nations…it’s almost the perfect setup. But that’s Classical Arabic, the kind used in the Koran. Real-life “street” Arabic is a horrible mess of dialects, some mutually unintelligible. But let’s take the classical tongue. Can it gain some purchase as an auxlang?

Probably not. Again, Arabic is associated with a particular cultural “style”. It’s not only used by Muslims or even Arabs, mind you, but that’s the common belief. There’s a growing backlash against Muslims in certain parts of the world, and some groups are taking advantage of this to further fan the flames. (I write this a few hours after the Brussels bombings on March 22.) But Arabic’s problems aren’t entirely political. It’s an awful language to try to speak, at least by European standards. Chinese has tones, yes, but you can almost learn those; pharyngeal and emphatic consonants are even worse for us. Now imagine the trouble someone from Japan would have.

Okay, so the next two biggest language blocks are out. What’s left? Spanish is a common language for most of two continents, although it has its own dialect troubles. Hindi is phonologically complex, and it’s not even a majority language in its own country. Latin is dead, as much as academics hate to acknowledge that fact. Almost nothing else has the clout of English, Chinese, and Arabic. It would take a serious upheaval to make any of the others the world’s lingua franca.

Outliving usefulness

It’s entirely possible that we’ll never need an international auxiliary language at all, if automatic translation becomes good enough for real-time, everyday use. Some groups are making great headway on this right now, and it’s only expected to get better.

If that’s the case, auxlangs are then obsolete. There’s no other way of putting it. If computers can translate between any two languages at will, then why do you need yet another one to communicate with people? It seems likely that computing will only become more ubiquitous. Wearables look silly to me, but I’ll admit that I’m not the best judge of such things. Perhaps they’ll go mainstream within the next decade.

Whatever computer you have on your person, whether in the form of a smartphone or headgear, likely won’t be powerful enough to do the instantaneous translation needed for conversation, but it’ll be connected to the Internet (sorry, the cloud), with all the access that entails. Speech and text could both be handled by such a system, probably using cameras for the latter.

For auxlang designers, that’s very much a dystopian future. Auxiliary languages effectively become a subset of artlangs. But never fear. Not everyone will have a connection. Not everyone will have the equipment. It’ll take time for the algorithms to learn how to translate the thousands of spoken languages in the world, even if half of them are supposed to go extinct in the coming decades.

The middle road

Auxlangs, then, have a tough road ahead. They have to displace English as the world language, then hold off the other natural contenders. They need real-time translation to be a much more intractable problem than Google and Microsoft are making it out to be. But there’s a sliver of a chance.

Not all auxlangs are appropriate as an international standard of communication. Lojban is nice in a logical, even mathematical way, but it’s too complicated for most people. A truly worldwide auxlang won’t look like that. So what would it look like?

It’ll be simple, that’s for sure. Think something closer to pidgins and creoles than lambda calculus. Something like Toki Pona might be too far down the scale of simplicity, but it’s a good contrast. The optimum is probably nearer to it than to Lojban. Esperanto and other simplified Latins can work, but you need to strip out a lot of filler. Remember, everyone has to speak this, from Europeans to Inuits to Zulus to Aborigines, and everywhere in between. You can’t please everybody, but you can limit the damage.

Phonology would also tend to err on the side of simplicity. No tones, no guttural sounds half the world would need to learn, no complex consonant clusters (but English gets a pass with that one, strangely enough). The auxiliary language of the future won’t be Hawaiian, but it won’t be Georgian, either. Again, on the lower side of medium seems to be the sweet spot.

The vocabulary for a hypothetical world language will be the biggest problem. There’s no way around it that I can see, short of doing some serious linguistic analysis or using the shortcut of “take the same term in a few big languages and find the average”. Because of this, I can seriously see a world auxlang as being a pidgin English. Think a much simplified grammar, with most of the extraneous bits thrown out. Smooth out the phonology (get rid of “wh”, drop the dental fricatives, regularize the vowels, etc.) and make the whole thing either more isolating or more agglutinative—I’m not sure which works best for this. The end result is a leaner language that is easier to pick up.

Or just wait for the computers to take care of things for us.

Let’s make a language – Part 14c: Derivation (Ardari)

Ardari takes a different approach for its word derivation. Instead of compounding, like Isian does, Ardari likes stacking derivational affixes. That doesn’t mean it totally lacks compounds, just that they take a bit of a back seat to affixes. Therefore, we should start with the latter.

Ardari’s three main parts of speech—noun, verb, and adjective—are mostly separate. Sure, you can use adjectives directly as nouns, and we’ve got ky to create infinitives, but there are usually insurmountable boundaries surrounding these three. The most regular and productive derivation affixes, then, are the ones that let us pass through those boundaries.

Making nouns

To make new nouns from other types of words, we’ve got a few choices:

  • -önda creates abstract nouns from verbs (luchönda “feeling”)
  • -kön makes agent nouns, like English “-er” (kwarkön “hunter”)
  • -nyn creates patient nouns from verbs, a bit like a better “-ee” (chudnyn “one who is guarded”)
  • -ymat takes an adjective and makes an abstract noun from it (agrisymat “richness”)

All of these are perfectly regular and widely used in the language. The nouns they create are, by default, neuter. -kön and -nyn, however, can be gendered: kwarköna denotes a male hunter, kwarköni a huntress.
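Because these four suffixes are so regular, they amount to a lookup table. Here is a rough Python sketch of that idea. The stems are read off the examples by removing the suffix, and treating -a and -i as general gender endings for -kön and -nyn is an extrapolation from kwarköna and kwarköni, so take both as assumptions.

```python
NOUN_SUFFIXES = {
    "abstract": "önda",  # from verbs
    "agent": "kön",      # from verbs, like English "-er"
    "patient": "nyn",    # from verbs, like English "-ee"
    "quality": "ymat",   # from adjectives
}
# Assumed from kwarköna / kwarköni; derived nouns are neuter by default.
GENDER_ENDINGS = {"neuter": "", "masculine": "a", "feminine": "i"}

def derive_noun(stem: str, kind: str, gender: str = "neuter") -> str:
    """Attach a class-changing noun suffix, plus a gender ending where allowed."""
    if gender != "neuter" and kind not in ("agent", "patient"):
        raise ValueError("only agent and patient nouns take a gender ending")
    return stem + NOUN_SUFFIXES[kind] + GENDER_ENDINGS[gender]

print(derive_noun("luch", "abstract"))           # luchönda
print(derive_noun("kwar", "agent"))              # kwarkön
print(derive_noun("kwar", "agent", "feminine"))  # kwarköni
```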

Two other important nominal suffixes are -sö and -ölad. The first switches an abstract or mass noun to a concrete or count noun, while the second does the opposite. Thus, we have ichurisö “a time of peace”, oblasö “a drop of water”, sèdölad “childhood”, or kujdölad “kingship”. (Note that a final vowel disappears when -ölad is added.)

Ardari also has both a diminutive -imi and an augmentative -oza. These work on nouns about like you’d expect: rhasimi “puppy”, oskoza “ocean”. However, there is a bit of a sticking point. Diminutive nouns are always feminine, and augmentatives always masculine, no matter the original noun’s gender. This can cause oddities, especially with kinship terms: emönimi “little brother” is grammatically feminine!

The other main nominal derivation is po- (p- before vowels). This forms antonyms or opposites, like English “un-” or “non-”. Examples include poban “non-human” and polagri “gibberish”.

Most other derived nouns are, in fact, adjectives used as nouns, as we’ll see below.

Making adjectives

First of all, adjectives can be made by one of three class-changing suffixes:

  • -ösat makes an adjective from an abstract noun (idyazösat “warlike”)
  • -rät makes an adjective from a concrete noun (emirät “motherly”)
  • -ròs creates a “possibility” adjective from a verb (dervaròs “livable”)

Diminutives and augmentatives work as for nouns, but they take the forms -it and -ab, and they don’t alter gender, as Ardari adjectives must agree with head nouns in gender. Some examples: pòdit “oldish”, nejab “very wrong”.

We’ve already seen the general adjective negator ur- in the Babel Text. It works very similarly to English un-, except that it can be used anywhere. (The blended form u- from the Babel Text’s ulokyn is a special, nonproductive stem change.)

Most of the other adjective derivations are essentially postpositional phrases with the order reversed. Here are some of the most common:

  • nèch-, after (nèchidyaz “postwar”)
  • jögh-, before (jötulyan “pre-day”)
  • olon-, middle, centrally (olongoz “midnight”)
  • är-, above or over (ärdaböl “overland”, from dabla)
  • khow-, below or under (khowdyev “underground”)

Many of these are quickly turned into abstract nouns. For instance, olongoz is perfectly usable as a noun meaning “midnight”. Like any other adjective-turned-noun, it would be neuter: olongoze äl “at midnight”.

Making verbs

There are only two main class-changing suffixes to make verbs. We can add -ara to create a verb roughly meaning “to make X”, as khèvara “to dry”. The suffix -èlo works on nouns, and its meaning is often more nuanced. For example, pämèlo “to plant”, from pämi “plant”.

Repetition, like English “re-”, is a suffix in Ardari. For verb stems ending in a consonant, it’s -eg: prèlleg- “to relearn”. Vowel-stems instead use -vo, as in bejëvo- “to rethink”.
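Two of the affixes we’ve met change shape depending on their neighbors: po- shortens to p- before a vowel, and the repetition suffix is -eg or -vo depending on how the stem ends. A minimal Python sketch of both rules follows. The function names are invented, and the set of letters counted as vowels is a guess from the romanization, not a full statement of Ardari phonology.

```python
import unicodedata

# Assumption: these romanized letters count as vowels for the two rules below.
VOWELS = set(unicodedata.normalize("NFC", "aeiouyäëöèò"))

def negate(noun: str) -> str:
    """Prefix po-, shortened to p- before a vowel."""
    noun = unicodedata.normalize("NFC", noun)
    return ("p" if noun[0] in VOWELS else "po") + noun

def repeat(stem: str) -> str:
    """Suffix -eg after a consonant-final verb stem, -vo after a vowel-final one."""
    stem = unicodedata.normalize("NFC", stem)
    return stem + ("vo" if stem[-1] in VOWELS else "eg")

print(negate("ban"))    # poban
print(negate("lagri"))  # polagri
print(repeat("prèll"))  # prèlleg
print(repeat("bejë"))   # bejëvo
```

The normalization step is there only so that accented letters compare correctly no matter how they were typed.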

Ardari also has a number of prefixes that can be added for subtle connotations. The following table shows some of these, along with their English equivalents.

Prefix | Meaning | English | Example
ej- | for, in favor of | pro- | ejsim “to speak for”
èk- | against | anti- | èksim “to speak against”
jès- | with | co- | jèzgrät “to co-create”
nich- | wrongly, badly | mis- | nichablon “to mishear”
ob- | after | post-/re- | opsim “to reply”
sèt- | before | pre- | sètokön “to precut”
wa- | into | in- | wamykhes “to inquire”
zha- | out of | ex- | zhalo “to expire”

Making compounds

Compounds aren’t as common in Ardari as they are in Isian, but they’re still around. Any noun can be combined with any other noun or adjective, with the head component coming last, as in the rest of the language.

Adjective-noun combinations are the most regular, like chelban “youth, young person”. Noun-agent is another productive combination: byzrivirdökön “bookseller”. Noun-noun compounds tend to be idiosyncratic: lagribyzri “dictionary”, from lagri “word” and byzri “book”.

Reduplicated adjectives are sometimes used for colloquial superlatives: khajkhaj “topmost”, slisli “most beautiful”.

A few words derived from nouns or verbs sit somewhere between compounds and derivational morphemes. An example is -allonda, from allèlönda “naming”. This one works a bit like English “-onomy”: palallonda “astronomy”. Another is -prèllönda, more like “-ology”: ondaprèllönda “audiology”. Finally, -benda and -bekön, from bejë-, work like “-ism” and “-ist”: potsorbekön “atheist” (po- + tsor + -bekön).

Make some words

As before, these aren’t all of the available derivations for Ardari. They’re enough to get started though, and they’re enough to accomplish our stated goal: creating lots of words!

Let’s make a language – Part 14b: Derivation (Isian)

Both of our conlangs have a wide variety of ways to construct new words without having to resort to full-on coinages. We’ll start with Isian, as always, since it tends to be the simpler of the two.

Isian compounds

Isian is a bit more like German or Swedish than English, in that it prefers compounds of whole words rather than tacking on bound affixes. That’s not to say the language doesn’t have a sizable collection of those, but they’re more situational. Compounding is the preferred way of making new terms.

Isian compounds are mostly head-final, and the most common by far are combinations of two or more nouns:

  • hu “dog” + talar “house” → hutalar “doghouse”
  • acros “war” + sam “man” → acrosam “soldier” (“war-man”)
  • tor “land” + domo “lord” → tordomo “landlord”

Note that acrosam shows a loss of one s. This is a common occurrence in Isian compounds. Anytime two of the same letter would meet, they merge into one. (In writing, they might remain separate.) Two sounds that “can’t” go together are instead linked by -r- or -e-, whichever fits better.
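The merging rule is mechanical enough to sketch in Python. The function name is made up for this example, and it only handles the identical-letter merger. The linking -r- and -e- are left out, because which sound pairs “can’t” go together hasn't been spelled out.

```python
def compound(*parts: str) -> str:
    """Join words in order (head last), merging identical letters at each seam."""
    result = parts[0]
    for part in parts[1:]:
        if result[-1] == part[0]:
            part = part[1:]              # two of the same letter merge into one
        result += part
    return result

print(compound("hu", "talar"))     # hutalar
print(compound("acros", "sam"))    # acrosam
print(compound("tor", "domo"))     # tordomo
```

Because the function takes any number of parts, it also covers compounds of three or more nouns.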

Adjectives can combine with nouns, too. The noun always goes last. Only the stress patterns and the occasional added or deleted sound tell you that you’re dealing with a compound rather than a noun phrase:

  • sush “blue” + firin “bird” → sufirin “bluebird”
  • bid “white” + ficha “river” → bificha “rapids” (“white river”)

In the latter example, which shows elision, the noun phrase “a white river” would be ta bid ficha, with bid receiving its own stress. The compound “some rapids” is instead ta bificha, with only one stress.

Most verbs can’t combine directly with anything else; they have to be changed to adjectives first. A few “dynamic” verbs, however, can be derived from wasa “to go” plus another verb. An example might be wasotasi “to grab”, from otasi “to hold”.

Changing class

Isian does have ways of deriving, say, a noun from an adjective. The language has a total of eight of these class-changing morphemes that are fairly regular and productive. All of them are suffixes, and the table below shows their meaning, an example, and their closest English equivalent.

Suffix  Function                   English  Example
-do     State adjective from verb  -ly      ligado “lovely”
-(t)e   Verb from noun             -fy      safe “to snow”
-el     Adjective from noun        -y, -al  lakhel “royal”
-m      Agent noun from verb       -er      ostanim “hunter”
-mer    Adjective from verb        -able    cheremer “visible”
-nas    Abstract noun from verb    -ance    gonas “speech”
-(r)os  Noun from adjective        -ness    yaliros “happiness”
-(a)ti  Verb from adjective        en-      haykati “to anger”

For the most part, these can’t be combined. Instead, compounds are formed. As an example, “visibility” can be translated as cheremered “visible one”, compounding cheremer with the generic pronoun ed.

-do is very commonly used to make compounds of verbs (in the form of gerund-like adjectives) and nouns. An example might be sipedototac “woodcutting”, from which we could also derive sipedototakem “woodcutter”.

More derivation

The other productive derivational affixes don’t change a word’s part of speech, but slightly alter some other aspect. While the class-changers are all suffixes, this small set contains suffixes, prefixes, and even a couple of circumfixes. (We already met one of those in the Babel Text, as you’ll recall.)

  • -chi and -go are diminutive and augmentative suffixes for nouns. Most nouns can take these, although the meanings are often idiosyncratic. For example, jedechi, from jed “boy”, means “little boy”, and secago “greatsword” derives from seca “sword”.

  • -cat, as we saw in the Babel Text, turns a noun into a “mass” noun, one that represents a material or some other uncountable. One instance there was gadocat “brick”, meaning the material of brick, not an individual block.

  • a-an was also in the Babel Text. It’s a circumfix: the a- part is a prefix, the -an a suffix. Thus, we can make ayalian “unhappy” from yali “happy”.

  • Two other productive circumfixes are i-se and o-ca, the diminutive and augmentative for adjectives, respectively. With these, we can make triplets like hul “cold”, ihulse “cool”, and ohulca “frigid”.

  • The prefix et- works almost exactly like English re-, except that you can put it on just about any verb: roco “to write”, eteroco or etroco “to rewrite”.

  • ha-, another verbal prefix, makes “inverse” forms of verbs. For example, hachere might mean “to not see” or “to miss”. It’s different from the modal adverb an.

  • mo- is similar in meaning, but it’s a “reverse”: mochere “to unsee”.
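Circumfixes sound exotic, but in a word-generation script they're just a prefix and a suffix applied together. A small Python sketch, with helper names that are my own invention:

```python
def circumfix(root: str, prefix: str, suffix: str) -> str:
    """Wrap a root in both halves of a circumfix."""
    return prefix + root + suffix

# The three circumfixes described in the list above.
def negative(adj: str) -> str:
    return circumfix(adj, "a", "an")     # a-an

def diminutive(adj: str) -> str:
    return circumfix(adj, "i", "se")     # i-se

def augmentative(adj: str) -> str:
    return circumfix(adj, "o", "ca")     # o-ca

print(negative("yali"))                          # ayalian
print(diminutive("hul"), augmentative("hul"))    # ihulse ohulca
```

This is plain concatenation. Any sound adjustments at the seams (like the extra e in eteroco) would need their own rules.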

That’s not all

Isian has a few other derivation affixes, but they’re mostly “legacy”. They aren’t productive, and some of them are quite irregular. We’ll meet them as we go on, though. For now, it’s time to switch to Ardari.

Let’s make a language – Part 14a: Derivation (Intro)

By this point in the series, we’ve made quite a few words, but a “real” language has far more. English, for instance, is variously quoted as having anywhere from 100,000 to over a million different words. How do they do it? Up to now, we’ve been creating words in our conlangs in a pretty direct manner. Here’s a concept, so there’s a word, and then it’s on to the next. But that only takes you a very short way into a vocabulary. What we need is a faster method.

Our words so far (with a few exceptions) have been roots. These are the basic stock of a language’s lexicon, but not its entirety. Most languages can take those roots and construct from them a multitude of new, related words. This process is called derivation, and it might be seen as one of the most powerful weapons in the conlanger’s arsenal.

How to build a word

Derivation is different from inflection. Where inflection is the way we make roots into grammatically correct words, derivation is more concerned with making roots into bigger roots. These can then be inflected like any other, but that’s for after they’re derived.

The processes of derivation and inflection, however, work in similar ways. We’ve got quite a few choices for ways to build words. Here are some of the most common, with English examples where possible.

  • Prefixes: morphemes added to the beginning of a root; “un-” or “anti-”.
  • Suffixes: morphemes added to the end of a root; “-ize” and “-ly”.
  • Compounding: putting two or more roots together to make a new one; “football” or “cellphone”.
  • Reduplication: repeating part or all of a root; “no-no”, “chit-chat”.
  • Stress: changing the stress of a root; noun “PERmit” and verb “perMIT”.

Stem changes (where some part of the root itself changes) are another possibility, but these are more common as inflections in English, as in singular “mouse” versus plural “mice”. Tone can be used in derivation in languages that have it, though this seems to be a little rarer.

Also, although I only listed prefixes and suffixes above, there are a few other types of affixes that sometimes pop up in derivation. Infixes are inserted inside the root; English doesn’t do this, except in the case of expletives. Circumfixes combine prefixes and suffixes, like German’s inflectional ge-t. The only English circumfix I can think of is en-en, used to make a few verbs like “enlighten” and the humorous “embiggen”. Finally, many languages’ compounds contain a linking element. German has the ubiquitous -s-, and English has words like “speedometer”.

Derivations of any kind can be classified based on how productive they are. A productive derivation is one which can be used on many words with predictable results. Unproductive derivations might be limited to a few idiosyncratic uses. These categories aren’t fixed, though. Over time, some productive affixes can fall out of fashion, while unproductive ones become more useful due to analogy. “Trans-” is undergoing the latter transformation (ha!) as we speak, and some are pushing for wider use of the near-forgotten “cis-”.

Isolating languages are a special case that deserves a footnote. Since the whole point of such a language is that words are usually simple, you might wonder how they can have derivation. Sometimes, they will allow a more “traditional” derivation process, typically compounding or some sort of affix. An alternative is to create phrases with the desired meaning. These periphrastic compounds might be fixed and regular enough in form to be considered derivations, in which case they’ll follow the same rules.

What it means

So we have a lot of ways to build new words (or phrases, for the isolating fans out there) out of smaller parts. That’s great, but now we need those parts. For compounds, it’s pretty easy, so we’ll start with those.

Compounding is the art of taking two smaller words and creating a larger one from them. (And it is indeed an art; look at German if you don’t believe me.) This new word is somehow related to its parts, but how depends a lot on the language. It can be nothing more than the sum of its parts, as in “input-output”. Or the compound may express a subset of one part, like “cellphone”.

Which words can be compounded also changes from language to language. Putting two nouns together (“railroad”) is very common; which one goes first depends, and it’s not always as simple as head-first or head-final. Combinations of two verbs are rarer in Western languages, though colloquial English has phrasal compounds like “go get” and “come see”. Adjective-noun compounds are everywhere in English: “redbird”, “loudspeaker”, and so on.

Verbs and nouns can fit together, too, as they often do in English and related languages. “Breakfast” and “touchscreen” are good examples. Usually, these words combine a verb and an object into a new noun, but not always. Instrumental compounds can also be formed, where the noun is the cause or means of the action. In English, these are distinguished by being noun-verb compounds: “finger-pointing”, “screen-looking”. They start out as gerunds (hence the -ing), but it’s trivially easy to turn them into verbs.

Really, any words can be compounded. “Livestreaming” is an adjective-verb compound. “Aboveboard” combines a preposition and a noun. The possibilities are endless, and linguistic prescription can’t stop the creative spirit. You don’t even have to use the whole word these days. “Simulcast”, “blog”, and the hideous “staycation” are all examples of “blended” compounds.

All the rest

Compounds are all made from words or, more technically, free morphemes. Most of the other derivational processes work by attaching bound morphemes to a root. Some of these are highly productive, able to make a new word out of just about anything. Others are more restricted, like the rare examples of English reduplication.

Changing class

Most derivations of this type change some part of a word’s nature, shifting it from one category to another. English, as we know, is full of these, and its collection makes a good, exhaustive list for a conlanger. We’ve got -ness (adjective to noun), -al (noun to adjective), -fy (noun to verb), -ize (adjective to verb), -able (verb to adjective), and -ly (adjective to adverb), just to name a few. Two special ones of note are -er, which changes a verb to an agent noun, and its patient counterpart -ee.

In general, a language with a heavy focus on derivation (especially agglutinative languages) will have lots of these. One for each possible pair isn’t out of the question. Sometimes, you’ll be able to stack them, as in words like “vilification” (adjective to verb to noun) or “internationalization” (noun to adjective to verb to noun!).

Changing meaning

Those derivations that don’t alter a lexical category will instead change the meaning of the root. We’ve got a lot of options here, and English seems happy to use every single one of them. But we’ll look at just a few of them here. Most, it must be said, were borrowed from Latin or Greek, starting a couple hundred years ago; these classical languages placed a much heavier emphasis on agglutination than English at the time.

Negation is common, particularly for verbs and adjectives. In English, for example, we’ve got un-, non-, in-, dis-, de-, and a-, among others. For nouns, it’s usually more of an antonym than a negation: anti-.

Diminutives show up in a lot of languages, where they indicate “smallness” or “closeness” of some sort. Spanish, for instance, has the diminutive suffix -ito (feminine form -ita). English, on the other hand, doesn’t have a good “general” diminutive. We’ve got -ish for adjectives (“largish”) and -y for some nouns (“daddy”), but nothing totally regular. By a kind of linguistic analogy, diminutives often have high, front vowels in them.

Augmentatives are the opposite: they connote greatness in size or stature. Prefixes like over-, mega-, and super- might be considered augmentatives, and they’re starting to become more productive in modern English. By the same logic as above, augmentatives tend to use back, low vowels.

Most of the others are concerned with verbal aspect, noun location, and the like. In a sense, they replace adverbs or prepositions. Re-, for example, stands in for “again”, as pre- does for “before”. And then there are the outliers, mostly borrowed from classical languages. -ology and -onomy are good examples of this.

Non-English

We’ve heavily focused on English so far, and that’s for good reason: I know English, you know English, and it has a rich tradition of derivation. Other languages work their own ways. The Germanic family likes its compounding. Greek and Latin had tons of affixes you could attach to a word. Many languages of Asia, Africa, and the Pacific have very productive reduplication. Although I used English examples above, that’s no reason to slavishly follow that particular language when constructing your own.

In the next two posts, we’ll see how Isian and Ardari make new words. Each will have its own “style” of derivation, but the results will be the same: near-infinite possibilities.

Sound changes: everything else

Not every sound change works on just consonants or just vowels. Some can transmute one into the other. Others affect entire syllables or words. A few work on a different level entirely. So, we’ll finish this series by looking at these “miscellaneous” types of evolution.

Tones

Tones have to come from somewhere. One of the ways they can appear (tonogenesis) is through the loss of consonants preceding or following a vowel. A voiced consonant, for instance, can cause the vowel after it to be spoken at a lower pitch. If those consonants go away, the change in pitch can remain: a low tone. As another example, a number of consonant elisions led to the tonal system of Chinese, along with its restrictive syllable structure.

Once a language has tone, it becomes a target for evolution. Tones can change, merge, split, and disappear, exactly as phonemes. Unstressed syllables may develop a neutral tone, which might get reanalyzed as one of the existing tones. Sequences of tones can affect each other, as well, a complex process called tone sandhi.

Like any other part of language, tone is subject to the same forces that drive all sound change, which can be summed up as human laziness. More on that later.

Sandhi

The term sandhi comes from Sanskrit; roughly speaking, it means “joining”. In modern linguistics, it’s a catch-all term used for any kind of sound change that crosses the boundary between morphemes. The “linking” R in some English dialects is a kind of sandhi, and so is the use of the article “an” before vowels. Romance languages show a couple more instances of the process: Spanish de el → del; Italian della → dell’; the heavy use of liaison in French.

When sandhi becomes systematic, it can create new words, like Spanish del and al. These, of course, can then be changed by any other sound change. And it’s not limited to vowels. Consonants can also be affected by sandhi. The most common expression of this is anticipatory voicing across word boundaries, but other types of assimilation are equally valid.

Epenthesis

Epenthesis is the adding of a sound, the opposite of elision. It’s another way of breaking up a cluster that violates a language’s phonological rules or aesthetic sensibilities. Some epenthesis is a kind of sandhi, like English “an”, and the diaeresis discussed last week is another form. Those aren’t the only possibilities, though.

An epenthetic vowel can be inserted between two consonants, and this will usually be a neutral vowel, whatever the language considers that to be. Schwa (/ə/) is a common choice, but /e/, /a/, and /o/ also pop up. /i/ and /u/, however, are usually too strong.

Similarly, strings of vowels may be broken up by epenthetic consonants. Again, something weak and unassuming is needed, something like /r/, /n/, /l/, /h/, or /ʔ/. /w/ and /j/ can be used as glides, as we have seen, but they’ll tend to be used only when they can relate to one of the vowels.

Another option for consonant clusters is an epenthetic consonant, one that bridges the gap between the two. Greek, for example, shows a sound change /mr/ → /mbr/, as seen in words like “ambrosia”. Many speakers of English insert epenthetic consonants like this all over, without even knowing it, like the [p] in “something”. (If this became phonemic, it would be essentially the same thing that happened to Greek.)
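For conlangers who script their sound changes, a bridging consonant is a one-line substitution. The Python sketch below handles /mr/ → /mbr/. The parallel /nr/ → /ndr/ rule is an extension I've assumed for illustration, and the function name and input spellings are made up.

```python
import re

# The inserted stop shares its place of articulation with the nasal.
# Only the m -> b pairing comes from the text; n -> d is an assumed extension.
HOMORGANIC_STOP = {"m": "b", "n": "d"}

def bridge_nasal_r(word: str) -> str:
    """Insert a stop between a nasal and a following r: mr -> mbr, nr -> ndr."""
    return re.sub(
        r"([mn])r",
        lambda m: m.group(1) + HOMORGANIC_STOP[m.group(1)] + "r",
        word,
    )

print(bridge_nasal_r("amrosia"))   # ambrosia
print(bridge_nasal_r("mara"))      # mara (no cluster, so nothing changes)
```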

Haplology

Two syllables that are fairly close in sound may not stay together for very long. Haplology is a sound change that involves the deletion of one syllable of such a pair. It can be either one, and there’s no standard for how “close” two syllables need to be to trigger the change. English examples include the common pronunciations of “probably” and “February”, and others aren’t hard to find. (In another one of those linguistic oddities, “haplology” itself can fall victim to this, becoming “haplogy”.)
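A very rough way to model this in Python is to look for a sequence that is immediately repeated and delete one copy. This sketch works on spelling, not sound, and it only catches identical sequences, so it won't handle merely “close” pairs like the one in “probably”. The function name is an assumption of mine.

```python
import re

def haplology(word: str) -> str:
    """Delete one copy of the first sequence (2+ letters) that is immediately repeated."""
    return re.sub(r"(\w{2,}?)\1", r"\1", word, count=1)

print(haplology("haplology"))   # haplogy
print(haplology("probably"))    # probably (no identical pair to collapse)
```

A more realistic version would compare syllables by phonetic similarity, but that needs a definition of “close” that each language has to supply for itself.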

Applying the rules

Although there are plenty of other sound changes out there—again, I refer you to Index Diachronica for more—we have gathered enough over the last three posts to start looking at how to apply them to a conlang. There are plenty of programs out there that can do this for you, but it helps to know the rules. These aren’t set in stone, mind you, but you should have a good reason for breaking them. (That reason would probably lead to more conlanging, so I’m not complaining.)

First, evolutionary sound changes are regular. They’ll almost always happen when the right conditions are met. If you’ve got devoicing of final stops, as in German, then essentially every final stop is going to get devoiced. Sure, there may be exceptions, but those exceptions can be explained. Maybe those words appeared in their current forms after the sound change.

Second, remember that sound changes don’t care. This is a subset of regularity, but it bears repeating. A sound change will affect a word no matter what that word’s history. A particular evolutionary condition may be met because of an earlier sound change, but later changes won’t know that. They’ll only “see” a word ripe for alteration.

Third, sound changes operate on a lower level. They’re “below” grammar and, as such, aren’t affected by it. But this means that grammatical ambiguity can arise, as when sounds of case endings are merged or dropped. (This one happened in both English and the Romance languages.) Speakers will then need to find ways of clearing things up, leading to innovations on the grammar side of things.

Fourth, sound change stems from laziness, a desire to minimize the effort required in speaking and conveying our thoughts. Weak sounds disappear, dissimilar sounds merge, and it’s all because we, as a whole, know we can get away with it. As long as there’s enough left to get the message across, all else is simply extraneous baggage. And that’s what’s most likely to change.

Finally, evolution is unceasing. When it comes to language, the only constant is change. Even our best efforts at writing and education and language academies can’t stop sound change. There will always be differences in speech. Those will form dialects, and then those may split into new, mutually unintelligible languages.
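To make the first two rules concrete, here's a small Python sketch of a sound change applier. Everything in it is hypothetical: the two changes, the five-vowel inventory, and the sample words are made up for the demonstration. Each change is applied to every word, in order, and later changes only see the output of earlier ones.

```python
import re

DEVOICE = str.maketrans("bdg", "ptk")

def drop_final_vowel(word: str) -> str:
    """Apocope: lose a word-final vowel that follows a consonant."""
    return re.sub(r"(?<=[^aeiou])[aeiou]$", "", word)

def devoice_final_stop(word: str) -> str:
    """Final devoicing: b, d, g become p, t, k at the end of a word."""
    return re.sub(r"[bdg]$", lambda m: m.group().translate(DEVOICE), word)

# Changes in historical order; the order is part of the language's history.
CHANGES = [drop_final_vowel, devoice_final_stop]

def evolve(word: str, changes=CHANGES) -> str:
    for change in changes:
        word = change(word)      # each change sees only the current form
    return word

for w in ["tabe", "tap", "lodi"]:
    print(w, "->", evolve(w))
```

Here “tabe” and “tap” both come out as “tap”: the devoicing rule doesn't know that one final stop was created by the earlier vowel loss. Swapping the order of the two changes gives “tab” for “tabe” instead, which is why rule ordering matters.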

Sound changes: vowels

In this part of our little series, we’ll look at some of the sound changes that can affect vowels. Since there tend to be far fewer vowels than consonants in a language’s phonemic inventory, there aren’t as many places for these sound changes to go. For the same reason, however, the vowel system of a language is more prone to change, with new phonemes coming into use and old ones disappearing.

Vowels

Before we begin, think of (or look at) the IPA vowel chart. It’s usually depicted as something like a trapezoid, but it’s just as easy to imagine it as a triangle with vertices at /i/, /a/, and /u/. All the other vowel sounds—/e/, /y/, /ø/, and so on—are along the sides or in the middle. This conception will make a few of the sound changes described below seem more obvious.

Umlaut

The process of umlaut, as found in German, is an example of a larger phenomenon usually referred to as fronting. Either term is fine for amateur conlangers, because everyone will know what you mean. Whatever you call it, it’s a change that causes vowels to move towards the front of the mouth.

Most commonly, fronting occurs under the influence of an /i/ sound. (In that, it’s almost like a kind of vowel harmony, or the vowel version of assimilation.) Sometimes, the /i/ later disappears, leaving behind the affected vowel as its only trace.

The Germanic languages embraced fronting to varying degrees, and they’re the best example around. German itself, of course, has the front rounded vowels ü and ö; the diacritic is often called an umlaut for just this reason. In Old English, meanwhile, back /ɑ/ was fronted to /æ/. Swedish brought its /uː/ frontward to become /ʉː/. And the list goes on.

Fronting doesn’t always happen, so the back vowels aren’t totally lost. Instead, it can become a way to add in more front vowels; overall, languages tend to have more in the front than the back. Or it can cause mergers, as [y] becomes reinterpreted as /i/. This very thing happened in Greek, for instance.

Raising and lowering

Instead of bringing a vowel to the front, raising brings it up. Usually, this moves a sound one “step” up on the vowel chart: /a/ → /e/ → /i/. Intermediate steps like /ɛ/ can come into play, as well. An example of this process happening right now is in my own dialect of US Southern English, where some vowels are raised before nasal sounds. Thus, “pin” and “pen” sound alike.

The environment usually causes raising, but it’s not any specific sound that triggers it. Nasals can, as they do for me, but raised vowels later in the word can do it, too. So can other consonants. In general, it works out to yet another form of assimilation—vowels will tend to be raised by proximity to other “high” sounds. The reason it works so well for nasals is because they’re the highest in the mouth that you can get: in the nose.

Unlike fronting, raising seems to be more “effective”. But this makes it possible for other sound changes to come into play, sweeping into the vocalic void left behind. If raising gets rid of most instances of /a/, for example, some other sound will likely change to fill that gap.

The opposite of raising, lowering, is one such way of accomplishing this. It’s the same thing as raising, but in reverse: /u/ → /o/ → /a/ is a common trend. Front vowels appear to be harder to lower, likely from the massive influence of /i/, but it’s possible to do, say, /e/ → /ɛ/.
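Both directions are easy to sketch in Python as one-step moves along the chart. The five-vowel spelling, the function names, and the sample words are assumptions for this example. The raising rule is conditioned on a following nasal, loosely modeled on the “pin”/“pen” case, while the lowering rule applies everywhere.

```python
import re

RAISE = {"a": "e", "e": "i"}    # one step up: /a/ -> /e/ -> /i/
LOWER = {"u": "o", "o": "a"}    # one step down: /u/ -> /o/ -> /a/

def raise_before_nasal(word: str) -> str:
    """Raise a or e by one step when the next sound is m or n."""
    return re.sub(r"[ae](?=[mn])", lambda m: RAISE[m.group()], word)

def lower_everywhere(word: str) -> str:
    """Lower every u or o by one step, with no conditioning environment."""
    return word.translate(str.maketrans(LOWER))

print(raise_before_nasal("pen"))   # pin
print(raise_before_nasal("pet"))   # pet (no nasal, no raising)
print(lower_everywhere("tuko"))    # toka
```

Each vowel moves only one step per pass, so an original /a/ ends up as /e/ rather than sliding all the way to /i/.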

Nasalization

Vowels near nasal sounds might assimilate to them, in a change called nasalization. If the change is thorough enough, it can even result in the loss of the nasal consonant, leaving only a nasal vowel. That was the case in French and Portuguese, both of which have a set of nasalized vowels.

Any of the nasal sounds work for this, from /m/ to /ɴ/, but the “big three” of /n/, /m/, and /ŋ/ are the most common in languages, in that order. They’ll be the likely suspects. If nasalization occurs, then it will probably be on those vowels that precede these sounds; vowels following nasals are less susceptible to the change. Nasals at the end of a word or right before another consonant are the best candidates for the total nasalization that results in their disappearance.
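The “total” version of this change can be sketched in Python: a vowel followed by a nasal that sits at the end of a word or before another consonant becomes a nasal vowel, and the nasal drops out. The vowel and nasal inventories and the sample words are assumed for the example. The nasal vowel is written with the combining tilde (U+0303).

```python
import re

TILDE = "\u0303"    # combining tilde, marks the vowel as nasalized

def nasalize(word: str) -> str:
    """Vowel + nasal becomes a nasal vowel before a consonant or at word end."""
    return re.sub(
        r"([aeiou])[mnŋ](?=[^aeiou]|$)",
        lambda m: m.group(1) + TILDE,
        word,
    )

print(nasalize("bon"))     # final nasal is absorbed into the vowel
print(nasalize("kanta"))   # nasal before a consonant is absorbed
print(nasalize("bona"))    # nasal before a vowel survives unchanged
```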

A similar change can occur with /r/-like (rhotic) sounds, but this is much less common. It is a way to get a series of rhotic vowels like those in American English, and it’s conceivable that the difference between “regular” and “rhoticized” could become phonemic.

Lengthening and shortening

Solitary vowel phonemes can, in some cases, become long vowels or diphthongs. On the other hand, it’s easy for those to revert to short vowels. (And those can be shortened further, dropping out altogether, but we’ll get to that in a moment.)

These changes are very connected to the stress pattern of a word. Stressed vowels are more likely to be lengthened or broken into diphthongs. Unstressed vowels, by contrast, get the opposite treatment: reduction and shortening. That’s not the only reason these processes can happen, but it is the primary one.

The total elision of unstressed vowels is also quite possible. This can happen between consonants (syncope), at the beginning of a word (apheresis) or at its end (apocope). All of these are historically attested, both in natural language evolution and in borrowed words. Syncope, for example, occurs in British English pronunciations of words like secretary, while apocope turns American “going” to “goin'”.

Combining and breaking

Two vowels that end up beside each other (probably because of consonant changes) can create an unstable situation. Like the case of consonant clusters, vowel clusters “want” to simplify. They can go about this in a couple of different ways.

The easiest way is for the two to combine into a diphthong or long vowel. Where this isn’t possible, one of the vowels may assimilate to the other, much like consonants. Alternatively, the two might “average out”, fusing into a sort of compromise sound, like /au/ → /o/ (or /oː/, if that’s possible in the language).

Another potential outcome is a separation into two syllables by adding a glide. For example, one form of this diaeresis is /ie/ → /i.je/. Once the vowel cluster is broken apart, other sound changes can then alter the new structure, potentially even re-merging the cluster.
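Here's a quick Python sketch of the two main outcomes, fusing and breaking. The function names and the sample words are invented, and the /u/ plus /w/ pairing is my assumption by analogy with /i/ and /j/.

```python
import re

GLIDE = {"i": "j", "u": "w"}    # each glide matches the high vowel before it

def fuse(word: str) -> str:
    """Average out a vowel cluster: /au/ -> /o/."""
    return word.replace("au", "o")

def break_with_glide(word: str) -> str:
    """Split a cluster into two syllables: /ie/ -> /ije/, /ua/ -> /uwa/."""
    return re.sub(r"([iu])(?=[aeo])", lambda m: m.group(1) + GLIDE[m.group(1)], word)

print(fuse("kausa"))              # kosa
print(break_with_glide("pie"))    # pije
print(break_with_glide("tua"))    # tuwa
```

A given language would normally pick one of these strategies for a given cluster, so in a real rule list only one of them would apply to any particular vowel pair.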

Onward

Plenty of other vowel changes exist, but these are the most common and most defining. Next time, we’ll wrap up the series with a look at some of the sound changes that sit outside of the usual consonant/vowel dichotomy, as well as those that can affect a whole word. Also, we’ll conclude with a few rules of thumb to help you get the most out of your conlang’s evolution.

Sound changes: consonants

Languages change all the time. Words, of course, are the most obvious illustration of this, especially when we look at slang and such. Grammar, by contrast, tends to be a bit more static, but not wholly so; English used to have noun case, but it no longer does.

The sounds of a language fall into a middle ground. New words are invented all the time, while old ones fall out of fashion, but the phonemes that make up those words take a longer time to change. This does, however, occur more often than wholesale grammatical alterations. (In fact, sound change can lead to changes in grammar, but it’s hard to see how the opposite can happen.)

This brief miniseries will detail some of the main ways sounds can change in a language. The idea is to give you, the conlanger, a new tool for making naturalistic languages. I won’t be covering everything here—I don’t have time for that, nor do you. Examples will be necessarily brief. The Index Diachronica is a massive catalog of sound changes that have occurred in real-world languages, and it’s a good resource for conlangers looking for this sort of thing.

Consonants

We’ll start by looking at some of the main sound changes that can happen to consonants. Yes, some effects are equally valid for consonants and vowels, but I had to divide this up somehow.

Lenition

Lenition is one of the most common sound changes. Basically, it’s a kind of “weakening” of a consonant into another. Stops can weaken into affricates or fricatives, for instance; German did this after English and its relatives broke away, hence “white” versus weiß. Another word is “father”, which shows two examples of this—compare it to Latin pater, which isn’t too far off from the ancestral form. (Interestingly, you can even say that “lenition” itself is a victim.)

Fricatives can weaken further into approximants (or even flaps or taps): one such change, of /s/ to /h/, happened early on in Greek, hence “heptagon”, using the Greek-derived root “hepta-”. Latin didn’t take this particular route, giving us “September” from Latin septem “seven”.

Approximants don’t really have anywhere to go. They’re already weak enough as it is. The only place for them to go is away, and that sometimes happens, a process called elision. Other sounds can be elided, but the approximants are the most prone to it. In English, for instance, we’ve lost /h/ (and older /x/) in a lot of places. (“im” for “him” is just the same process continuing in the present day.)

Lenition and elision tend to happen in two main places: between vowels and at the end of a word. Those aren’t the only places, however.
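Those two environments translate directly into pattern matches. The Python sketch below weakens a few consonants between vowels and drops a final /h/. The particular mappings, the five-vowel inventory, and the sample words are assumptions picked for illustration, not a claim about any real language.

```python
import re

V = "aeiou"
# Assumed weakening steps: stops to fricatives, plus /s/ -> /h/.
LENITE = {"p": "f", "t": "θ", "k": "x", "s": "h"}

def lenite_between_vowels(word: str) -> str:
    """Weaken p, t, k, s when they stand between two vowels."""
    return re.sub(rf"(?<=[{V}])[ptks](?=[{V}])", lambda m: LENITE[m.group()], word)

def elide_final_h(word: str) -> str:
    """Drop a word-final h entirely."""
    return re.sub(r"h$", "", word)

print(lenite_between_vowels("pata"))   # paθa (the initial p is untouched)
print(lenite_between_vowels("asa"))    # aha
print(elide_final_h("pah"))            # pa
```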

Assimilation

Assimilation is when a sound becomes more like another. This can happen with any pair of phonemes, but consonants are more susceptible, if only because they’re more likely to be adjacent.

Most assimilation involves voicing or the point of articulation. In other words, an unvoiced sound next to a voiced one is an unstable situation, as is a cluster like /kf/. Humans are lazy, it seems, and they want to talk with the least effort possible. Thus, disparate sequences of sounds like /bs/ or /mg/ tend to become more homogenized. (Good examples in English are all those Latin borrowings where ad- shows up as “al-” or “as-”, like “assimilation”.)

Obviously, there are a few ways this can play out. Either sound can be the one to change—/bs/ can end up as /ps/ or /bz/—but it tends to be the leading phoneme that gets altered. How it changes is another factor, and this depends on the language. If the two sounds are different in voicing, then that’ll likely shift first. If they’re at different parts of the vocal tract, then the one that changes will slide towards the other. Thus, /bs/ will probably come out as /ps/, while /mg/ ends up as /ŋg/.
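As a sketch of that “leading phoneme changes” pattern, here is a small Python function that takes a two-consonant cluster and adjusts the first sound to match the second. The consonant inventory and the tables are assumptions for the example, and it only covers nasal place and obstruent voicing.

```python
VOICED_TO_VOICELESS = {"b": "p", "d": "t", "g": "k", "z": "s", "v": "f"}
VOICELESS_TO_VOICED = {v: k for k, v in VOICED_TO_VOICELESS.items()}
NASALS = "mnŋ"
# Which nasal matches the place of articulation of each stop.
NASAL_BY_PLACE = {"p": "m", "b": "m", "t": "n", "d": "n", "k": "ŋ", "g": "ŋ"}

def assimilate_pair(first: str, second: str) -> str:
    """Return the cluster after the leading sound assimilates to the next one."""
    if first in NASALS and second in NASAL_BY_PLACE:
        return NASAL_BY_PLACE[second] + second          # place: mg -> ŋg
    if first in VOICED_TO_VOICELESS and second in VOICELESS_TO_VOICED:
        return VOICED_TO_VOICELESS[first] + second      # devoicing: bs -> ps
    if first in VOICELESS_TO_VOICED and second in VOICED_TO_VOICELESS:
        return VOICELESS_TO_VOICED[first] + second      # voicing: sd -> zd
    return first + second                               # nothing to assimilate

print(assimilate_pair("b", "s"))   # ps
print(assimilate_pair("m", "g"))   # ŋg
```

A language that prefers to change the second sound instead would flip the roles of the two arguments.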

Assimilation is also one way to get rid of consonant clusters. Some of the consonants will assimilate, then they’ll disappear. Or maybe they won’t, and they’ll create geminates, as in Italian.

Metathesis

Anyone who’s ever heard the word “ask” pronounced as “ax” can identify metathesis, the rearranging of sounds. This can happen just about anywhere, but it often seems to occur with sound sequences that are relatively uncommon in a language, like the /sk/ cluster in English.

This one isn’t quite as systematic in English, but other languages do have regular metathesis sound changes. Spanish often swapped /l/ and /r/, for example, sometimes in different syllables. One common thread that crosses linguistic barriers involves the sonority hierarchy. A cluster like /dn/ is more likely to turn into /nd/ than the other way around.
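
A regular metathesis rule is about the simplest thing you can write as a substitution. The sketch below assumes a hypothetical language where any stop followed by a nasal swaps places with it after a vowel, so /dn/ comes out as /nd/. The inventory and the function name are made up for the example.

```python
import re

VOWELS = "aeiou"

def metathesize(word):
    """Swap a stop + nasal cluster into nasal + stop when it follows a vowel."""
    # Group 1 is the stop, group 2 the nasal; the replacement reverses them.
    return re.sub(rf"(?<=[{VOWELS}])([ptkbdg])([mn])", r"\2\1", word)

for w in ["adna", "akme", "anda"]:
    print(w, "->", metathesize(w))
```

A word that already has the nasal first, like the last one, passes through unchanged.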

Palatalization, etc.

Any of the “secondary” characteristics of a consonant can be changed. Consonants can be palatalized, labialized, velarized, glottalized, and so on. This usually happens because they’re next to a sound that displays one of those properties. It’s like assimilation, in a way.

Palatalization appears to be the most common of these, often affecting consonants adjacent to a front vowel. (/i/ is the likely culprit, but /e/ and /y/ work, too.) Labialization sometimes happens around back rounded vowels like /u/. Glottal stops, naturally, tend to cause glottalization, etc. Often, the affecting sound will disappear after it does its work.
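
Here’s a rough sketch of palatalization followed by loss of the trigger. It assumes an invented rule where /k/ becomes /tʃ/ before /i/ or /e/, and where an /i/ that has done its work drops out before another vowel. Both rules and the function name are illustrative assumptions, not a claim about how any particular language did it.

```python
import re

def palatalize(word):
    """Palatalize /k/ before a front vowel, then drop a used-up /i/."""
    # Step 1: /k/ becomes /tʃ/ when a front vowel follows.
    word = re.sub(r"k(?=[ie])", "tʃ", word)
    # Step 2: the triggering /i/ disappears if another vowel comes right after.
    return re.sub(r"(?<=tʃ)i(?=[aou])", "", word)

for w in ["kia", "kita", "kata"]:
    print(w, "->", palatalize(w))
```

The order matters: run the second step first and there is no /tʃ/ yet for it to look back at, so the /i/ would never be dropped.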

Dissimilation

Dissimilation is the opposite of assimilation: it makes sounds more different. This can occur in response to a kind of phonological confusion, but it doesn’t seem to be very common as a regular process. Words like “colonel” (pronounced as “kernel”) show dissimilation in English, and examples can be found in many other languages.

Even more…

There are a lot of possible sound changes we haven’t covered, and that’s just in the consonants! Most of the other ways consonants can evolve are much rarer, however. Fortition, for example, is the opposite of lenition, but instances of it are vastly outnumbered by instances of lenition.

Vowels present yet more opportunities to change up the sound of a language, and we’ll see them next week. Then, we’ll wrap up the series by looking at all the other ways the sound of a word can change over time.