Let’s make a language – Part 5a: Verbs (Intro)

Last time around, we talked about nouns, the words of people, places, and things. This post will be the counterpoint to that one, because we’re going to look at verbs.

Verbs are words of action. They tell us what is happening. We might walk to the bathroom or drive to the grocery store, and verbs are the words that get us there. But they can also help describe what we are (“to be”), what we possess (“to have”), and what we do (“to do”), along with many other possibilities. Where a noun is an object or an idea, a verb is an action or a state of being.

Just like nouns, every conlang is going to have verbs (except those specifically designed to avoid them, and they do exist). And just like nouns, they have a lot of grammatical baggage. In inflectional languages, verbs will likely have a variety of forms (think of Latin’s verb conjugations). Isolating languages, by contrast, might have verbs that are constant, but they may be able to string them together in such a way that they can create the same shades of meaning. As before, the type of conlang you want to make will influence your verbal structure, but the basic idea of “verb” will remain the same.

Parts of a verb

Where the different categories for nouns are largely concerned with identifying a specific instance of something, verbal categories are more focused on the circumstances of the action in question. The most widely recognized of these include transitivity, tense, aspect, mood, and voice. Below, we’ll look at each of these in turn.

First, though, we need to decide what kind of word the verb will be. This will depend on your conlang, and it will follow the same general pattern as the noun. Isolating languages won’t have a lot of verbal morphology, relying instead on a lot of adverbs, adjectives, and preposition-like phrases, or just more than one verb in a phrase (“serial” verbs). More polysynthetic languages, on the other hand, will tend to concentrate a lot of information in the verbal word itself; agglutinative conlangs will likely have a series of affixes, leading to long words, while inflectional types will instead have fewer affixes each with more permutations.

Second, we need to know a little bit about verbs in relation to nouns. A typical sentence in most languages will have a single verb that acts as the “head”. For our running example, we’ll use the ridiculously simple English sentence the man drives a car. Here, drives is the verb, and you can see why it’s considered the head. Change the verb, and the whole meaning of the action changes as a result. If we say pushes instead, then the man probably ran out of gas. Say steals, and now he’s a thief.

Verbs, like people, have arguments. Here, the term “argument” just means a phrase that’s directly connected to the verb in some way. Our example has two arguments: a subject (the man) and a direct object (a car). If you remember when we were talking about noun case, well, that’s what some of the cases are for. The nominative and accusative (or ergative and absolutive, if you swing that way) basically represent the two main arguments of a verb, subject and object, while the dative indicates the indirect object. (Other cases, like the ablative, aren’t for verbal arguments, so we’ll mostly ignore them here.)

Transitivity

The idea of transitivity isn’t one that most people think about after high school English classes, but it’s central to the construction of a verb. A transitive verb has two arguments (subject and direct object), while an intransitive verb has only one. That would be simple enough, except for the exceptions.

Few languages directly mark transitivity. Some, like English, almost ignore it. Mostly, though, there might be a special verb form to temporarily change transitive to intransitive, or vice versa. Something like this can be seen in Spanish, where a number of intransitive-looking verbs actually have a direct object, typically a reflexive pronoun like se.

If that wasn’t bad enough, there are a few verbs that don’t really fit in the transitive dichotomy. The most important of these is give, which (in many languages) takes not two but three arguments. (This is where the dative comes into play, if the language has one.)

And then there are the “impersonal” verbs, which effectively have zero arguments. Weather verbs are the most common of these. Where English uses a dummy subject (it’s raining), Romance languages can just say the verb itself (Spanish llueve).
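If you keep your conlang’s lexicon in code, transitivity comes down to counting arguments. Here is a minimal Python sketch of that idea. The VALENCY table and the check_arguments function are hypothetical names made up for illustration, and the four entries simply follow the English examples used in this post.

```python
# Hypothetical valency table: how many arguments each verb expects.
# The entries follow the English examples in this post; a real lexicon
# would need its own values.
VALENCY = {
    "rain": 0,   # impersonal: no real arguments (English adds a dummy "it")
    "walk": 1,   # intransitive: subject only
    "drive": 2,  # transitive: subject and direct object
    "give": 3,   # subject, direct object, and indirect object
}


def check_arguments(verb, arguments):
    """Return True if the number of arguments matches the verb's valency."""
    return len(arguments) == VALENCY[verb]


print(check_arguments("drive", ["the man", "a car"]))  # True
print(check_arguments("give", ["the man", "a car"]))   # False
```

A table like this is only a starting point. Verbs that switch between transitive and intransitive use would need more than a single number.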

Tense

Tense describes when an action takes place in relation to outside events. Obviously, there are three main possibilities: past, present, and future. Not all languages use these, though. English, technically speaking, only has a grammatical distinction between past and present; the future tense is just a present-tense verb preceded by the auxiliary will. And this is a fairly common arrangement. Others prefer having three explicit tenses, while a few (such as Chinese) don’t really mark tense at all on the verb.

So, when we have tense at all, past and present are usually in, and future slides in there occasionally. What else is possible? Well, a few languages make the opposite distinction from English, marking the past and present the same, but the future differently. Another option is to add tenses, splitting either the past or future into more than one. Plenty of real-life languages do this, although probably not any you’ve ever heard of:

  • Cubeo (an Amazonian language) is one that has a “historical” past tense used for events long ago.
  • The Bantu language Mwera has a tense specifically for “today”.
  • The language of the Western Torres Strait Islanders, known as Kala Lagaw Ya, is said to have six tenses, with a present, “near” and “far” versions of past and future, and a “today” past tense.
  • A few languages, mostly in Africa, have special verbal forms for “yesterday” and “tomorrow”.

In our example, we’re talking in the present tense, but we can change it to the past by saying the man drove a car. That doesn’t tell us when he drove it, only that he did at some point before now.

Aspect

Where tense is concerned with an absolute fixing in time of an event, aspect tells us more about the “internal” structure. Is the action complete? Is it still ongoing? Did it just start? These are the questions aspect answers, and it turns out that there can be a lot more of them than you might think.

The first distinction, the most basic and most common, is between events that are complete or ongoing. In linguistic terms, these are the perfective and imperfective, respectively. Taking our example sentence (we’ll need to switch it to the past tense for this, but bear with me), we have the perfective the man drove a car versus the imperfective the man was driving a car. As you can see, the latter fixes the “reference point” of the sentence inside the action, while the perfective version looks at the act of driving from the outside.

There are dozens of aspects, but most languages don’t directly mark more than a handful. Perfective and imperfective are common, but they’re sometimes mixed with tense, too. That’s the source of the English perfect and pluperfect, which are kind of like crossing the past tense and perfective aspect, but the result can be treated as any tense: the man has driven/had driven/will have driven a car.

Wikipedia has a long list of aspects seen in various languages, but remember that many of these are restricted to just a very few languages.

Mood

Mood (or “modality”, a more technically nuanced term) talks about how a speaker feels towards the event he’s talking about. Is it a statement of fact? A command? A wish?

Moods probably aren’t marked quite as much as tense and aspect, but a few of them cross paths with those two in some languages. The subjunctive mood (which can be used for hypotheticals, opinions, desires, etc.) shows up in English, although it’s starting to disappear in the spoken language. In Romance languages, though, it’s still going strong. Imperatives, marking commands, are found in most languages, and they often have their own morphology.

The other moods don’t show up on verbs quite as often. Some languages have an optative mood specifically for hopes and dreams, wishes and desires. Arabic has the jussive, which is a kind of catch-all mood like the subjunctive. A few languages have a special mood marker for questions, for conditions, and for events that the speaker thinks are likely to occur.

As English doesn’t really have morphology for moods, our only change to the example sentence is the subjunctive that the man drive a car, which sounds overly formal, maybe even archaic.

Voice

Voice is a way to describe the relation between the verb and its arguments. The active voice is the main one, and it means that the subject is the main “doer” or agent, while the direct object (if there is one) is the “target” or patient.

The passive voice is a common alteration. Here, the subject and object switch places. The object becomes the subject, but it’s still the patient. The former subject is demoted to a prepositional phrase (or the language’s equivalent), or it’s dropped altogether. In our English example, we would have a car was driven. (Passives in English, incidentally, have an air of formality to them. They’re popular in business specifically because they de-emphasize the subject, which minimizes liability.)

Some languages have a middle voice, where the subject is a little bit of both agent and patient. English doesn’t have this, but it can almost emulate it: the car drove. Obviously, in that sentence, the car isn’t driving something. In a sense, we’re saying that it’s driving itself, but that’s not exactly the middle voice, either. That would be the reflexive, which appears in a few languages.

Other voices include the antipassive (where it’s the object that gets dropped, instead of the subject), the applicative, and the causative. None of these are really present in the languages we’re most familiar with, but they pop up all over the world.

Odds and ends

All this, and we still haven’t touched on things like the infinitive, the gerund, and other miscellany. Well, this post is already getting pretty long, so we’ll look at those as they come up. They’re mostly concerned with larger phrases, anyway, and we haven’t even started on those.

Next time, we’ll look at how Isian and Ardari make their verbs. Along the way, we’ll cover some of the bits left out of this post, like grammatical concord. After that, our next topic will be word order, which means we can finally make a sentence in each of our conlangs.

Irregularity in language

No natural language in the world is completely and totally regular. We think of English as an extreme of irregularity, and it really is, but all languages have at least some part of their grammar where things don’t always go as planned. And there’s nothing wrong with that. That’s a natural part of a language’s evolution.

Conlangs, on the other hand, are often far too regular. For an auxlang, intended for clear communication, that’s actually a good thing. There, you want regularity, predictability. You want the “clockwork morphology” of Esperanto or Lojban. The problem comes with the artistic conlangs. These, especially those made by novices, can be too predictable. It’s not exactly a big deal—every plural ending in -i isn’t going to break the immersion of a story for the vast majority of people—but it’s a little wart that you might want to do away with.

Count the ways

Irregularity comes in a few different varieties. Mostly, though, they’re all the same: a place where the normal rules of grammar don’t quite work. English is full of these, as everyone knows. Plurals are marked by -s, except when they’re not: geese, oxen, deer, people. Past tense is -ed, except that it sometimes isn’t: go and went. (“Strong” verbs like “get” that change vowels don’t really count, because they are regular, but in their own way.) And let’s not even get started on English orthography.
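One way to picture this kind of irregularity is as a regular rule with a list of exceptions checked first. The Python sketch below does that for the English plurals just mentioned. The names IRREGULAR_PLURALS and english_plural are made up for illustration, and the “regular” rule is deliberately oversimplified (it ignores -es and other spelling adjustments).

```python
# Exceptions are looked up first; everything else gets the regular suffix.
# Only the irregular plurals named in this post are listed here.
IRREGULAR_PLURALS = {
    "goose": "geese",
    "ox": "oxen",
    "deer": "deer",
    "person": "people",
}


def english_plural(noun):
    """Return the plural, preferring a listed irregular form over the -s rule."""
    if noun in IRREGULAR_PLURALS:
        return IRREGULAR_PLURALS[noun]
    return noun + "s"  # simplified regular rule: no -es, no y -> ies


print(english_plural("dog"))    # dogs
print(english_plural("goose"))  # geese
```

The same shape works for a conlang: write the regular rule once, then add the occasional odd form to the exception table.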

Some other languages aren’t much better. French has a spelling system that matches its pronunciation in theory only, and Irish looks like a keyboard malfunction. Inflectional grammars are full of oddities; ask any Latin student. Arabic’s broken plurals are just that: broken. Chinese tone patterns change in complex and unpredictable ways, despite tone supposedly being an integral part of a morpheme.

On the other hand, there are a few languages out there that seem to strive for regularity. Turkish is always cited as an example here, the joke being that there’s one irregular verb, and it’s only there so that students will know what to expect when they study other languages.

Conlangs are a sharp contrast. Esperanto’s plurals are always -j. There’s no small class of words marked by -m or anything like that. Again, for the purposes of clarity, that’s a good thing. But it’s not natural.

Phonological irregularity

Irregularity in a language’s phonology happens for a few different reasons. However, because phonology is so central to the character of a language, it can be hard to spot. Here are a few places where it can show up:

  • Borrowing: Especially as English (American English in particular) suffuses every corner of the planet, languages can pick up new words and bring new sounds with them. This did happen in English’s history, as it brought the /ʒ/ sound (“pleasure”, etc.) from French, but a more extreme example is the number of Bantu languages that borrowed click sounds from their Khoisan neighbors.

  • Onomatopoeia: The sounds of nature can be emulated by speech, but there’s not always a perfect correspondence between the two. The “meow” of a cat, for instance, contains a sequence of sounds rare in the rest of English.

  • Register: Slang and colloquialism can create phonological irregularities, although this isn’t all that common. English has “yeah” and “nah”, both with a final /æ/, which appears in no other word.

Grammatical irregularity

This is what most people think of when they consider irregularity in a language. Examples include:

  • Irregular marking: We’ve already seen examples of English plurals and past tense. Pretty much every other natural language has something else to throw in here.

  • Gender differences: I’m not just talking about the weirdness of having the word for “girl” in the neuter gender. The Romance languages also have a curious oddity where some masculine-looking words take a feminine article, as in Spanish la mano.

  • Number differences: This includes all those English words where the plural is the same as the singular, like deer and fish, as well as plural-only nouns like scissors.

  • Borrowing: Loanwords can bring their own grammar with them. What’s the plural of manga or even rendezvous?

Lexical irregularity

Sometimes words just don’t fit. Look at the English verb to be. In the present, it’s am, is, or are; in the past, it’s was or were; and so on. Totally unpredictable. This can happen in any language, and one way is a drift in a word’s meaning.

  • Substitution: One word form can be swapped out for another. This is the case with to be and its varied forms.

  • Meaning changes: Most common in slang, like using “bad” to mean “good”.

  • Useless affixes: “Inflammable means flammable?” The same thing is presently ongoing as “irregardless” becomes more widespread.

  • Archaisms: Old forms can be kept around in fixed phrases. In English, this is most commonly the case with the Bible and Shakespeare, but “to and fro” is still around, too.

Orthographic irregularity

There are spelling bees for English. How many other languages can say that? How many would want to? As a language evolves, its orthography doesn’t necessarily follow, especially in languages where the standard spelling was fixed long ago. Here are a few ways that spelling can drift from pronunciation:

  • Silent letters: English is full of these, French more so. And then there are all those extra silent letters added to make words look more like Latin. Case in point, debt didn’t always have the b; it was added to remind people of debitum. (Silent letters can even be dialectal in nature. I pronounce wh and w differently, but few other Americans do.)

  • Missing letters: Nowhere in English can you have dg followed by a consonant except in the American spelling of words like judgment, where the e that would soften the g is implied. (I lost a spelling bee on this very word, in fact, but that was a long time ago.)

  • Sound changes: These can come from evolution or what seems like sheer perversity. (English gh is a case of the latter, I think.)

  • Borrowing: As phonological understanding has grown, we’ve adopted a kind of “standard” orthography for loanwords, roughly equivalent to Latin, Spanish, or Italian. Problem is, this is nothing at all like the standard orthography already present in English. And don’t even get me started on the attempts at rendering Arabic words into English letters.

In closing

All this is not to say that you should run off and add hundreds of irregular forms to your conlang. Again, if it’s an auxlang, you don’t want that. Even conlangs made for a story should use irregular words only sparingly. But artistic conlangs can gain a lot of flavor and “realism” from having a weird word here and there. It makes things harder to learn, obviously, but it’s the natural thing to do.

Let’s make a language – Part 4c: Nouns (Ardari)

For nouns in Ardari, we can afford to be a little more daring. As we’ve decided, Ardari is an agglutinative language with fusional (or inflectional) aspects, and now we’ll get to see a bit of what that entails.

Three types of nouns

Ardari has three genders of nouns: masculine, feminine, and neuter. Like languages such as Spanish or German, these don’t necessarily correspond to the notions of “male”, “female”, and “everything else”. Instead, they’re a little bit arbitrary, but we won’t make the same mistakes as natural languages when it comes to assigning nouns to genders. (Actually, we will make the same mistakes, but on purpose, not through the vagaries of linguistic evolution.)

Each noun is inflected not only for gender, but also for number and case. Number can be either singular or plural, just like with Isian. As for case, well, we have five of them:

  • Nominative, used mostly for subjects of sentences,
  • Accusative, used mainly for the direct objects,
  • Dative, occasionally seen for indirect objects, but mostly used for the Ardari equivalent of prepositional phrases,
  • Genitive, indicating possession, composition, and most places where English uses “of”,
  • Vocative, only used when addressing someone; as a result, it only makes sense with names and certain nouns.

So we have three genders, two numbers, and five cases. Multiply those together, and you get 30 possibilities for declension. (If you took Latin in school, that word might have made you shudder. Sorry.) It’s not quite that bad, since some of these will overlap, but it’s still a lot to take in. That’s the difficulty—and the beauty, for some—of fusional languages.
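To see where the 30 comes from, you can simply enumerate every combination. This is a small Python sketch; the variable names are just for illustration.

```python
from itertools import product

GENDERS = ("masculine", "feminine", "neuter")
NUMBERS = ("singular", "plural")
CASES = ("nominative", "accusative", "dative", "genitive", "vocative")

# Every (gender, number, case) combination is one cell of the full paradigm.
cells = list(product(GENDERS, NUMBERS, CASES))

print(len(cells))  # 30
print(cells[0])    # ('masculine', 'singular', 'nominative')
```

Each of those cells needs a form, although several of them end up sharing one.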

Masculine

Masculine nouns in Ardari all have stems that end in -a. One example is kona “man”, and this table shows its declensions:

kona        Singular  Plural
Nominative  kona      kono
Accusative  konan     konon
Genitive    kone      konoj
Dative      konak     konon
Vocative    konaj     konaj

Roughly speaking, you can translate kono as “men”, kone as “of a man”, etc. We run into a bit of a problem with konon, since it could be either accusative or dative. That’s okay; things like this happen often in fusional languages. We’ll say it was caused by sound changes. We just have to remember that translating will need a bit more context.

Also, many of these declensions will change the stress of a word to the final syllable, following our phonological rules from Part 1.

Feminine

Feminine noun stems end in -i, and they have these declensions (using chi “sun” as our example):

chi         Singular  Plural
Nominative  chi       chir
Accusative  chis      chell
Genitive    chini     chisèn
Dative      chise     chiti
Vocative    chi       chi

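The same translation guides apply here, and this time the accusative and dative never collide, so we avoid that kind of “syncretism”, where two cases share the same form. (The vocative does match the nominative singular, though.)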

Neuter

Neuter nouns have stems that can end in any consonant. Using the example of tyèk “house”, we have:

tyèk        Singular  Plural
Nominative  tyèk      tyèkar
Accusative  tyèke     tyèkòn
Genitive    tyèkin    tyèkoj
Dative      tyèkèt    tyèkoda
Vocative    tyèkaj    tyèkaj

A couple of these (genitive plural, vocative) are recycled from the masculine table. Again, that’s fairly common in languages of this type, so I added it for naturalism.
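The three tables can be folded into one lookup, which is handy if you want to generate forms instead of copying them by hand. The Python sketch below is one possible arrangement, not a definitive one. The ENDINGS table and the decline function are hypothetical names, and the split between “base” and “ending” is an assumption read off the single example word in each table: masculine and feminine nouns drop their final stem vowel before the ending, and neuter nouns keep the whole stem.

```python
# Endings as (singular, plural) pairs, read off the kona, chi, and tyèk tables.
# Assumption: masculine and feminine endings replace the final stem vowel,
# while neuter endings attach to the bare stem.
ENDINGS = {
    "masculine": {
        "nominative": ("a", "o"),
        "accusative": ("an", "on"),
        "genitive": ("e", "oj"),
        "dative": ("ak", "on"),
        "vocative": ("aj", "aj"),
    },
    "feminine": {
        "nominative": ("i", "ir"),
        "accusative": ("is", "ell"),
        "genitive": ("ini", "isèn"),
        "dative": ("ise", "iti"),
        "vocative": ("i", "i"),
    },
    "neuter": {
        "nominative": ("", "ar"),
        "accusative": ("e", "òn"),
        "genitive": ("in", "oj"),
        "dative": ("èt", "oda"),
        "vocative": ("aj", "aj"),
    },
}


def decline(noun, gender, case, number):
    """Build one declined form from the citation form of a noun."""
    # Neuter stems end in a consonant and stay whole; the others lose -a or -i.
    base = noun if gender == "neuter" else noun[:-1]
    singular, plural = ENDINGS[gender][case]
    return base + (singular if number == "singular" else plural)


print(decline("kona", "masculine", "genitive", "plural"))   # konoj
print(decline("chi", "feminine", "accusative", "plural"))   # chell
print(decline("tyèk", "neuter", "dative", "singular"))      # tyèkèt
```

With only one example noun per gender, this is a guess at the pattern; other nouns may well need their own adjustments.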

Definiteness

Unlike Isian, Ardari doesn’t use separate words for its articles. Instead, it has a “definiteness” marker that can be added to the end of a noun. It changes form based on the gender and number of the noun you’re attaching it to, coming in one of a few forms:

  • -tö is the general singular marker, used on all three genders in all cases except the neuter dative.
  • -dys is used on masculine and most neuter plurals (except, again, the dative).
  • -tös is for feminine plurals.
  • Neuter nouns in the dative use -ö for the singular and -s for the plural.

The neuter dative is weird, partly because of a phonological process called “haplology”, where consecutive sounds or syllables that are very close in sound merge into one. Take our example above of tyèk. You’d expect the datives to be tyèkèttö and tyèkodadys. For the singular, the case marker already ends in -t, so it’s just a matter of dropping that sound from the “article” suffix, which leaves -ö. The plural would have two syllables da and dys next to each other. Speakers of languages are lazy, so they’d likely combine those into something a bit less time-consuming; thus we have tyèkodas “to the houses”.
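The choice of definiteness marker can be written as a short decision procedure. Here is a hedged Python sketch of the rules above. The function names are made up for illustration, and the neuter dative singular marker is assumed to be -ö, that is, -tö with its t dropped, following the haplology explanation.

```python
def definite_suffix(gender, number, case):
    """Pick the definiteness marker for an already declined noun."""
    # The neuter dative is the special case created by haplology.
    if gender == "neuter" and case == "dative":
        return "ö" if number == "singular" else "s"  # -ö is an assumption
    if number == "singular":
        return "tö"
    # Plurals: feminine takes -tös, masculine and neuter take -dys.
    return "tös" if gender == "feminine" else "dys"


def make_definite(form, gender, number, case):
    """Attach the marker to a declined form such as tyèkoda."""
    return form + definite_suffix(gender, number, case)


print(make_definite("tyèkoda", "neuter", "plural", "dative"))      # tyèkodas
print(make_definite("kona", "masculine", "singular", "nominative"))  # konatö
```

Note that the function expects a form that has already been declined for case and number; it only decides which marker goes on the end.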

New words

Even though I didn’t actually introduce any new vocabulary in this post, here’s the same word list from last week’s Isian post, now with Ardari equivalents. Two words are a little different. “Child” appears in three gendered forms (masculine, feminine, and a neuter version for “unknown” or “unimportant”). “Friend”, on the other hand, is a simple substitution of stem vowels for masculine or feminine, but you have to pick one, although a word like ast (a “neutered” formation) might be common in some dialects of spoken Ardari.

  • sword: èngla
  • cup: kykad
  • mother: emi
  • father: aba
  • woman: näli
  • child: pwa (boy) / gli (girl) / sèd (any or unknown)
  • friend: asta (male) / asti (female)
  • head: chäf
  • eye: agya
  • mouth: mim
  • hand: kyur
  • foot: allga
  • cat: avbi
  • flower: afli
  • shirt: tèwar

Let’s make a language – Part 4b: Nouns (Isian)

Keeping in our pattern of making Isian a fairly simple language, there’s not going to be a lot here about the conlang’s simple nouns. Of course, when we start constructing longer phrases (with adjectives and the like), things will get a little hairier.

Noun roots

Isian nouns can look like just about anything. They don’t have a set form, much like their English counterparts. But we can divide them into two broad classes based on the last letter of their root morphemes: vowel-stems and consonant-stems. There’s no difference in meaning between the two, and they really only differ in how plural forms are constructed, as we shall see.

Cases

For all intents and purposes, Isian nouns don’t mark case. We’ll get to pronouns in a later post, and they will have different case forms (again, similar to English), but the basic nouns themselves don’t change when they take different roles in a sentence.

The plural (with added gender)

The plural is where most of Isian’s noun morphology comes in. For consonant-stems, it’s pretty simple: the plural is always -i. From last week, we have the nouns sam “man” and talar “house”. The plurals, then, are sami “men” and talari “houses”. Not much else to it.

For vowel-stems, I’ve added a little complexity and “naturalism”. We have three different choices for a plural suffix. (This shouldn’t be too strange for English speakers, as we’ve got “-s”, “-es”, and oddities like “-en” in “oxen”.) So the possibilities are:

  • -t: This will be the most common marker. If all else fails, we’ll use it. An example might be seca “sword”; plural secat.

  • -s: For vowel-stems whose last consonant is a t or d, the plural becomes -s. (We’ll say it’s from some historical sound change.) Example: deta “cup”; plural detas.

  • -r: This one is almost totally irregular. Mostly, it’ll be on “feminine” nouns; we’ll justify this by saying it’s the remnant of a proper gender distinction in Ancient Isian. An example: madi “mother”; madir “mothers”.

As we go along, I’ll point out any nouns that deviate from the usual -i or -t.
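Put together, the plural rules amount to a short decision procedure, sketched here in Python. The names isian_plural, last_consonant, and R_PLURALS are made up for illustration. Two assumptions are worth flagging: a final y is treated as a vowel (because this post’s word list gives tay with the plural tays), and the -r plurals are simply listed by hand, since they can’t be predicted from a noun’s shape. The one listed noun, madi “mother”, is taken from that same word list.

```python
# Assumption: y counts as a vowel, since the word list has tay -> tays.
VOWELS = set("aeiouy")

# The irregular -r plurals cannot be predicted, so they are listed by hand.
R_PLURALS = {"madi"}


def last_consonant(noun):
    """Return the last consonant letter in the noun, or None if there is none."""
    for letter in reversed(noun):
        if letter not in VOWELS:
            return letter
    return None


def isian_plural(noun):
    """Form the plural of an Isian noun root."""
    if noun in R_PLURALS:
        return noun + "r"   # irregular, mostly "feminine" nouns
    if noun[-1] not in VOWELS:
        return noun + "i"   # consonant-stems
    if last_consonant(noun) in ("t", "d"):
        return noun + "s"   # vowel-stems whose last consonant is t or d
    return noun + "t"       # default for vowel-stems


for noun in ("sam", "talar", "seca", "deta", "madi"):
    print(noun, "->", isian_plural(noun))
```

The order of the checks matters: the hand-listed exceptions come first, so a noun like madi never reaches the t/d rule.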

Articles

Like English, Isian has an indefinite article, similar to “a/an”, that appears before a noun. Unlike the one in English, Isian’s is always the same: ta. It’s never stressed, so the vowel isn’t really distinct; it would sound more like “tuh”.

We can use the indefinite when we’re talking about one or more of a noun, but not any specific instances: ta sam “a man”; ta hut “some dogs”. (Note that we can also use it with plurals, which is something “a/an” can’t do.)

The counterpart is the definite article, like English the. Isian has not one but two of these, a singular and a plural. The singular form is e, and the plural is es; both are always stressed.

These are used when we’re talking about specific, identifiable nouns: e sam “the man”; es sami “the men”.
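The three articles fit in a few lines of code. This Python sketch uses a hypothetical with_article function; it takes a noun form that is already singular or plural and only chooses the article.

```python
def with_article(noun_form, definite, plural):
    """Prefix an Isian noun form with the matching article."""
    if not definite:
        article = "ta"  # the indefinite is the same for singular and plural
    else:
        article = "es" if plural else "e"
    return f"{article} {noun_form}"


print(with_article("sam", definite=False, plural=False))  # ta sam
print(with_article("sam", definite=True, plural=False))   # e sam
print(with_article("sami", definite=True, plural=True))   # es sami
```

Stress isn’t modeled here, so the difference between unstressed ta and stressed e or es is left to the speaker.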

More words

That’s all there really is to it, at least as far as the basic noun structure. Sure, it’ll get a lot more complicated once we throw in adjectives and relative clauses and such, but we’ve got a good start here. So, here are a few more nouns, all of which follow the rules set out in this post:

  • madi “mother” (pl. madir)
  • pado “father” (pl. pados)
  • shes “woman”
  • tay “child” (pl. tays)
  • chaley “friend”
  • gol “head”
  • bis “eye”
  • ula “mouth”
  • fesh “hand”
  • pusca “foot”
  • her “cat”
  • atul “flower”
  • seca “sword”
  • deta “cup” (pl. detas)
  • jeda “shirt” (pl. jedas)

Let’s make a language – Part 4a: Nouns (Intro)

A noun, as we learned in school, is a person, place, or thing. Of course, there’s more to it than that. Later in our education, ideas and abstract concepts get added in, but the general notion of “noun” remains the same. All natural languages have nouns, and they almost always use them for the same thing. How they use them is where things get interesting.

The Noun Itself

Nouns are going to be words. In fact, they’re probably going to be the biggest set of words in a language, owing to the vast array of people and objects and ideas in the world. The most basic nouns (i.e., the ones we’re discussing today) are represented by a single morpheme, like “dog” or “car”. Later on, we’ll get into more complicated nouns that are built up (derived) from other words, but we’ll keep it simple this time.

So we have a morpheme, which we’ll call the root. This root is the core bit of meaning; if we change it completely, we change the whole noun. We can modify the root a little, however, and some languages require us to do this. In English, for example, a noun like dog refers to a single dog. If we want to talk about four of them, we have to write dogs. Similarly, the Latin word aqua, meaning “water”, becomes aquam if it’s used as the object of a sentence.

Most languages that mark these shades of meaning (subject vs. object, one vs. many) do so via suffixes, like the English plural -s. A few work more with prefixes; these are mostly lesser-known languages in Africa, Asia, and the Americas. English is a little weird in having yet another way of marking the distinction of number: sound change, as in words like goose and geese. (It inherits this from its Germanic roots.) Semitic languages, particularly Arabic, take this a step further, but Semitic morphology is a vastly overused element of conlangs, so I won’t discuss it much here.

Isolating languages, on the other hand, don’t really go in for this kind of thing. Their nouns mostly stay in the same form, but they can still represent the same ideas in different ways. If you’re working with a language like this, then the grammatical categories we’ll see in the rest of this post will likely be formed by additional words rather than suffixes or prefixes.

Number

Probably the most basic (and most common) distinction made for nouns is that of number. Not every language has it—aficionados of Japanese know that the correct plural of “manga” is still “manga”—and that’s certainly a valid possibility for a conlang.

Besides an absence of number, what possibilities are there? First, there’s a division between one and many, singular and plural, with the singular taken as the default. That’s very common, and it’s familiar from English and most other European languages. But it’s not the only way. Some other number markings include:

  • A dual number, representing two of something. Arabic and Sanskrit have this, and there are remnants of it in English, with words like “both” and “either”.

  • Marking both singular and plural, each differently, as in Swahili mtoto “child” vs. watoto “children”. In this case, the singular prefix isn’t part of the root.

  • A distinction between “mass” and “count” (or “uncountable” and “countable”) nouns. Mass nouns like English “water”, logically enough, don’t appear in the plural.

  • A category of number specifically referring to “a few” or “some”. This is called the paucal, and it pops up here and there. Usually, it means anywhere from two to ten or so, probably because people have ten fingers.

Some languages mark for two of a noun, and some mark for a few. Three is an obvious next choice, and there are indeed a handful of languages with a “trial” number, but they only use it in pronouns (which are the subject of a later post), not the nouns themselves. Four is right out.

Gender

Gender in language has almost nothing to do with gender in anything else. For many languages, it’s almost completely arbitrary. Sure, the word for “man” might be in the masculine gender, and “woman” in the feminine, but just about anything else is possible. German Mädchen “girl” is neuter, as is Old English wīf “woman, wife”. Irish has cailín “girl” as a masculine noun, while Spanish gente “people” is feminine, no matter what kind of people it’s talking about. Of course, things don’t have to be this confused. A lot of the gender oddities are caused by historical sound changes. Conlangs don’t generally have this problem, although some authors like to add the semblance of such things.

For those languages that have gender, having two of them is common. Usually, that’s masculine and feminine. Some languages instead distinguish between animate and inanimate nouns, though there aren’t too many of these left around. Swedish managed to merge masculine and feminine at some point, resulting in the dichotomy of “common” and “neuter”.

Neuter is a popular third gender; it might be analyzed as an absence of gender, except that some nouns that do have a sex are classified under it, like those examples above. With a neuter gender, sexless items such as inanimate objects often end up there, but they can also fit into one of the others.

Languages can also make more than two or three distinctions of gender. You could have, for example, a language that has four, where every noun is either masculine or feminine, and either animate or inanimate. Some languages (notably the Bantu languages, including Swahili) have a wide variety of categories that might be called gender, though they’re more of a noun “class”.

Case

Anybody who ever took Latin in school knows about case. And they probably hate it. Case is a way of marking the role a noun has in a sentence, such as subject or object. It can also be used to show finer points of meaning, such as those marked in English by prepositions like “in” or “with”.

A lot of languages don’t have case, or only use it in certain places. English doesn’t for its nouns, but does for pronouns (“he”, “him”, “his”), and that’s actually not that rare. Other languages seem to love cases; Finnish has a dozen or so, depending on who’s counting. Generally speaking, it seems that inflectional languages are especially fond of large case systems. Isolating languages make do with something like prepositions. Conlangs can be absolutely anywhere on the spectrum, from caseless languages to the monstrosity of Ithkuil, which has 96. (Granted, Ithkuil is intended to be unrealistic.)

Closing Thoughts

There’s more to nouns than meets the eye, and I’ve only covered about half of it. Wikipedia’s page on grammatical category has a wealth of knowledge about everything above, plus all the stuff I didn’t cover.

What it can’t tell you, though, is which of these categories nouns in your conlang should have. The answer to that depends on a number of factors. For an auxiliary language, you’ll want to keep things pretty simple. Alien conlangs can (should, even) break the Western mold.

Number is a fairly easy choice, but there’s a hidden complexity in there. (Just look at all the plural exceptions in English!) Gender has its problems, some of them even political, but it also has the potential to make things truly interesting. A matriarchal culture, for instance, might take offense at the idea that “masculine” is the default gender in a language. Cases make a language harder to learn, I would say, but they do feel like they add a “precision” to meaning. It’s possible to go overboard, though. (Actually, studying Finnish grammar isn’t the worst idea for a budding conlanger. It worked for Tolkien.)

The next two posts are going to cover basic nouns in Isian and Ardari, along with a bunch of added vocabulary. Those, combined with the pointers in this post, should be enough to stimulate your own imagination. After that, we’ll move on to verbs, so that we can make our nouns do things.

Let’s make a language – Part 3b: Language Types (Conlangs)

On the 2D “grid” of languages we saw last week, where do our two conlangs fall? We’ll take each of them in turn.

Isian

Since Isian is intended to be simple and familiar, I’ve decided to make it similar to English in this respect. Isian will have a lot of isolating features, but compound words can be made through agglutination. However, there will be a few fusional bits here and there. We might consider these “legacy” aspects of the language, something like how English still distinguishes subject and object, but only in pronouns.

Most of the morphemes in Isian will be free. Bound morphemes will be a fairly restricted set of affixes, mostly grammatical in nature, but with a few “learned” compounding affixes, analogous to English’s Latin borrowings: pre-, inter-, etc. Owing to Isian’s smaller phonology, a lot of morphemes will be two or even three syllables, but the most common are the most likely to be short.

Ardari

With Ardari, we can be more ambitious. We’ll make it a more polysynthetic language, leaning agglutinative, but with some fusional aspects, too. In other words, Ardari will have a lot of word-making suffixes and prefixes, and plenty of grammatical attachments. Some of those will have a single meaning, while others will come in a fusional set.

Like Isian, though, those bits will tend to be older, even antiquated. It’s a common theme in natural languages: fusional aspects tend to disappear over time. Look at Latin and its daughter languages. Sure, Spanish (and Italian, and French, and…) kept the verbal conjugations. But noun case is all but gone, and French shows us that spoken verbs aren’t exactly untouchable. The same thing happened with English, but long ago. (If you don’t believe me, look up some Old English. We lost our cases, too, but our cousin, German, still has them.)

Since we have more sounds to work with, Ardari will have quite a few more morphemes of a single syllable, but two will still be common, and three won’t be entirely unheard of. On the whole, though, an Ardari text will tend to be shorter than its Isian equivalent, if harder to pronounce and translate.

The Words

Now for the moment you’ve all been waiting for. Here’s the first basic vocabulary list for both of our conlangs, including an even dozen words. Obviously, these are going to be loose translations, but we’ll say that they cover the same ground as their English glosses. Also, these are simple nouns and verbs. No pronouns or adjectives yet, because we don’t really know what form they’ll take. (If you’re wondering, the Ardari verbs end in dashes because those are only the roots. We haven’t yet seen how to make the inflected forms.)

English Isian Ardari
man sam kona
house talar tyèk
dog hu rhasa
sun sida chi
water shos obla
fire cay aghli
food tema fès
walk coto brin-
see chere ivit-
eat hama tum-
live liga derva-
build oste moll-

Next Time

In the next post, we’ll take a break from our methodical, studious approach and digress into the wonderful world of nouns. We’ve already got seven of them up there, but we’ll come out with plenty more. After that, we’ll do the same for verbs, and then we’ll start to look at how we can take both of them and combine them into sentences.

Let’s make a language – Part 3a: Language Types (Intro)

The sounds a language contains can go a long way toward giving that language a specific “feel”. But the very structure of the words themselves creates another kind of feel. Think of German, with its immensely long words full of consonants. Compare that to Chinese words, short and to the point, but combined in numerous ways to make new phrases. Latin has tables of declensions, as any student knows, while English gets by with only a few variations in its word forms.

All of this comes under the field of morphology, which is, in essence, a parallel to phonology. Where phonology is concerned with a language’s sound inventory, morphology goes up to the next step: the building blocks of words. Not necessarily the words themselves, as we shall see. But first, we need to meet the morpheme.

The Morpheme

A phoneme, as we know, is the most basic unit of sound distinguished in a language. By analogy, then, a morpheme is the basic unit of grammar. This may surprise some people. After all, aren’t words the smallest part of grammar?

Well, sometimes. Words can be made of a single morpheme, and English has plenty of examples: dog, walk, I. These are called free morphemes, because they can stand alone as words in their own right. In contrast, the English plural ending -s and the past tense suffix -ed can’t be alone. They have to be attached to other morphemes to create a legitimate word, so we call them bound morphemes. Thus, the English sentence I walked the dogs has four words, but a total of six morphemes.

Languages can divide up their morphemes, free and bound, in numerous ways, but they can all be defined in two dimensions. First, how many morphemes are there in a word? Or, to put it another way, what’s the ratio of free to bound?

Isolating vs. Polysynthetic

This distinction is an easy one to think about. Look at English words like predestination or internationalization. They’re big words, and they have a lot of morphemes. “Internationalization”, as an example, has the free (“root”) morpheme nation surrounded by the bound morphemes inter-, -al, -ize, and -ation, for a total of five.

Not every language is like English, though. Many, instead, only really allow one or two morphemes per word, preferring to build their larger “words” as phrases constructed from multiple free roots. The Chinese languages are well-known examples of this style. They, and those like them, are called isolating languages, since their words are “isolated”, or able to stand alone.

The other extreme is exemplified by languages such as those of the Eskimo and Inuit peoples. Here, words can be constructed to mean entire sentences, and they are full of bound morphemes. Not only is the marker for tense stuck to the verb, but verbs and nouns themselves are welded together, and the whole thing becomes a single word. To demonstrate, I’ll copy Wikipedia’s example, the Yupik word tuntussuqatarniksaitengqiggtuq, meaning “He had not yet said again that he was going to hunt reindeer.” Wow. (By the way, this is one reason for the linguistic urban legend that the Eskimos have a hundred words for snow. Sure they do, if you count something that means “it’s going to snow tomorrow morning” as a word. But they certainly don’t have that many free morphemes that convey the meaning of “snow”.) Languages like these, where there are often many morphemes in a word, most of them bound, not allowed to stand by themselves, are called polysynthetic languages.

Of course, a language can be in the middle of this spectrum. Isolating versus polysynthetic isn’t a binary choice. English, after all, has plenty of cases of both isolation and (mild) polysynthesis. Indeed, most of the more common languages of the world fall near the muddy center of the continuum. Chinese, of course, is very isolating. English is kind of right in the middle. Turkish and Finnish are quite polysynthetic, though more of a type that we’ll see below. French manages to put one foot in either world, with a highly isolating written language that’s often spoken like it’s polysynthetic.

Conlangs tend to follow their authors’ leanings. Some like the exotic allure of polysynthetic languages, while others choose the stark simplicity of the isolating. Most, though, are somewhere in between, like the native tongues of their creators. Certainly, an auxiliary language shouldn’t be nearly as polysynthetic as Inuktitut. But that same style can definitely give an alien vibe to an otherwise simple language. An isolating style, on the other hand, could conjure up images of the East, or of Pacific pidgins and creoles.

Agglutinating vs. Fusional

For those languages that have them (purely isolating languages need not apply), bound morphemes are often used to indicate grammatical relationships. Again, we can look at English: plural -s, past tense -ed, etc. Most of these have a specific meaning, but not all. On verbs, -s marks the third person, but only the singular version: compare “he walks” and “they walk”. This is the second “dimension” of a language, and it asks, “How much meaning does a bound morpheme have?”

Like above, there are two paths we can choose. With a few exceptions (like verbal -s), English takes the “one morpheme, one meaning” approach. Thus, it’s fair to say that English is an agglutinating language. Turkish is a popular example of taking this to the extreme, as Turkish verbs can have a string of suffixes: one for person, one for tense, and so on. German’s interminable compounds are much the same, but with more “meaning” for each morpheme beyond mere grammatical marking.

At the other end of the spectrum, you have the fusional languages including, for instance, the Romance family. Take the Spanish word amó, which we can translate as “she loved”. We’ve got a root am- (amar in its dictionary form) and a suffix -ó, and that’s it. But we know that it’s in the third person, past tense, and singular. (Spanish doesn’t distinguish gender in verb conjugation, though, so it could equally mean “he loved”.) Three separate meanings “fused” into a single suffix. And we know this by looking at a Spanish conjugation table. Change the person to first, and the word must become amé. Plural instead of singular? You have to say amaron. Want it to be in the future, rather than the past? It’s now amará. Alter any one part, and you need a whole new morpheme.
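To make the “fused” idea concrete, here’s a rough Python sketch. It covers only the four forms quoted above, and it follows this post’s simplified analysis of treating everything after am- as a single suffix. The names FUSED_SUFFIXES and conjugate are made up for illustration.

```python
# Illustrative only: the four forms of Spanish "amar" quoted in the text.
# Each suffix is looked up by the whole bundle (person, number, tense);
# there is no separate piece for "third person" or "past".
FUSED_SUFFIXES = {
    ("third", "singular", "past"): "ó",
    ("first", "singular", "past"): "é",
    ("third", "plural", "past"): "aron",
    ("third", "singular", "future"): "ará",
}


def conjugate(root, person, number, tense):
    """Attach the one fused suffix for this person/number/tense bundle."""
    return root + FUSED_SUFFIXES[(person, number, tense)]


print(conjugate("am", "third", "singular", "past"))    # amó
print(conjugate("am", "first", "singular", "past"))    # amé
print(conjugate("am", "third", "plural", "past"))      # amaron
print(conjugate("am", "third", "singular", "future"))  # amará
```

Notice that the key is the full bundle. An agglutinating language could instead get away with three small tables, one per category, and just concatenate the results.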

Like in the first case above, few languages fall on the absolute extremes of the agglutinative/fusional spectrum. English is mostly agglutinative, Spanish mostly fusional, but both have exceptions. The fusional type, though, seems a bit more popular in Europe (as you can see from the number of languages with declensions and inflections and make it stop), meaning that it’s better represented at the top of the chart. But even Europe has its agglutinative sect: English and Finnish, among others. Elsewhere, it really depends.

For conlangs, it still depends. Westerners are familiar with fusional languages, but agglutinating has a mechanical appeal, and it’s definitely a lot easier to work with. Auxiliary languages might be best served by a hybrid approach, where there are mostly agglutinative elements, but a few fusional aspects added where they can simplify things (like English’s verbal -s). (And if you’re making a purely isolating language, you can completely ignore the whole thing!)

Next Time

In the next post, we’ll look at Isian and Ardari and how they fit into the two-dimensional world of isolating and fusional and agglutinating and polysynthetic. The results may shock you! Oh, and we’ll also start making actual words in our two conlangs. Yes, finally.

Let’s make a language – Part 2b: Syllables and Stress (Conlangs)

Okay, last time we ran a bit long. This one should be fairly short. Today, we’ll look at the syllable structure and stress patterns of our two conlangs, Isian and Ardari. There’s no sense wasting time; let’s get right to it!

Isian

Isian, remember, is going to be the simpler of the two, so we’ll start with it. Isian syllables, for the sake of simplicity, will be of the form CVC. In other words, we can have a consonant on either side of a vowel. We don’t have to, of course. Syllables like an or de are just fine. CVC is the “maximum” complexity we can have.

Obviously, the V can stand for any vowel. (It’d be kind of silly to have a vowel you couldn’t use, wouldn’t it?) Similarly, the first C stands for any consonant. For the second C, the coda, that’s where things get a little more complicated. Two rules come into effect here. First, h isn’t allowed as a final consonant. That makes a lot of sense for English speakers, who find it hard to pronounce a final /h/, although it might upset speakers of other languages. Again, simple is the name of the game.

The second rule concerns diphthongs. If you’ll recall from Part 1, Isian has six of them. Here, we’ll say that /w/ and /j/ (written w and y) can only be the final consonant if they follow a, e, or o. This matches our phonology, where /ij/, /iw/, /uj/, and /uw/ aren’t allowed. Thus, diphthongs can be neatly analyzed as nothing more than a combination of vowel and consonant.
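If you like to see rules as code, the two coda restrictions can be sketched as a small Python checker. This is only a sketch under a couple of assumptions: the vowel letters are taken to be a, e, i, o, u, anything that isn’t a vowel counts as a consonant, and the function name is invented here.

```python
VOWELS = {"a", "e", "i", "o", "u"}  # assumed set of Isian vowel letters
GLIDES = {"w", "y"}                 # spellings of /w/ and /j/


def is_valid_isian_syllable(phonemes):
    """Check a list of phoneme spellings against the (C)V(C) rules."""
    # Strip the optional onset consonant.
    if phonemes and phonemes[0] not in VOWELS:
        phonemes = phonemes[1:]
    # What is left must be a vowel plus at most one coda consonant.
    if not phonemes or phonemes[0] not in VOWELS or len(phonemes) > 2:
        return False
    if len(phonemes) == 1:
        return True
    vowel, coda = phonemes
    if coda in VOWELS or coda == "h":
        return False  # rule 1: no final h (and a coda must be a consonant)
    if coda in GLIDES and vowel not in {"a", "e", "o"}:
        return False  # rule 2: w and y only after a, e, or o
    return True


print(is_valid_isian_syllable(["a", "n"]))       # True
print(is_valid_isian_syllable(["p", "a", "y"]))  # True
print(is_valid_isian_syllable(["t", "a", "h"]))  # False
print(is_valid_isian_syllable(["k", "i", "y"]))  # False
```

The input is a list rather than a string so that digraphs like sh or ch can be passed as single consonants.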

Moving on, we’ll give Isian a fixed stress: always on the penultimate syllable. So a word like baro will always be pronounced /ˈbaro/, never /baˈro/. Words with three syllables follow the same pattern: lamani is /laˈmani/. Since diphthongs are just a vowel plus a consonant, they don’t affect stress at all: paylow will be /ˈpajlow/.

In longer words, we’ll extend this stress in the same way. Or, to put it another way, every other syllable will get some sort of stress. A hypothetical word like solantafayan would have a secondary stress: /soˌlantaˈfajan/.
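Because the pattern is purely mechanical, it fits in a few lines of Python. The sketch below takes a word already split into syllables and adds IPA stress marks. The function name is made up, and stressing the only syllable of a one-syllable word is my own assumption, since the rule above doesn’t cover that case.

```python
def isian_stress(syllables):
    """Mark primary stress on the penultimate syllable and secondary
    stress on every second syllable before it."""
    n = len(syllables)
    # Assumption: a monosyllable simply stresses its only syllable.
    primary = max(n - 2, 0)
    marked = []
    for i, syllable in enumerate(syllables):
        if i == primary:
            marked.append("ˈ" + syllable)
        elif i < primary and (primary - i) % 2 == 0:
            marked.append("ˌ" + syllable)
        else:
            marked.append(syllable)
    return "".join(marked)


print(isian_stress(["ba", "ro"]))                       # ˈbaro
print(isian_stress(["la", "ma", "ni"]))                 # laˈmani
print(isian_stress(["so", "lan", "ta", "fa", "jan"]))   # soˌlantaˈfajan
```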

There won’t be any vowel reduction in Isian. Like Spanish, every vowel will be sounded in full, and each syllable will take up about the same time. This, combined with the regular stress, will probably give the conlang a distinct rhythm. (The dominant form of poetic meter, for example, will definitely be trochaic, and Isian musicians would probably find Western 4/4 rhythms very appealing.)

Ardari

As usual, Ardari is a bit more complicated. For this language, syllables will have the structure CCVCC, and each of the four C’s will have a different set of possibilities:

  1. The first C can be any stop consonant, /m/, or /n/.
  2. The second C can be any fricative or any liquid other than /ɫ/, with one restriction: a fricative can’t follow a nasal.
  3. V, of course, stands for any of the ten Ardari vowels.
  4. The third C is restricted to four liquid sounds: /w j ɫ ɾ/
  5. Finally, the fourth C can be any consonant except those four liquids.

Now, in addition to these definitions, Ardari syllables have a few rules about which clusters of consonants are available. In the onset, there are three broad categories: stop + fricative, stop + liquid, and nasal + liquid. The last is the smallest, so we’ll deal with it first. For that combination, there are eight possibilities: /mw mj ml mɾ nw nj nl nɾ/. Of these, we’ll say that Ardari doesn’t allow a nasal followed by /l/. Also, /nj/ isn’t that much different from /ɲ/, so we’ll say that those two sounds merge, allowing a syllable that starts with /ɲ/, but nothing else. The remaining five clusters can go in as they are.
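The arithmetic for the nasal + liquid onsets is easy to check by brute force. Here’s a small Python sketch that builds all eight candidates and then applies the two restrictions; the function name is invented for illustration.

```python
from itertools import product

NASALS = ["m", "n"]
LIQUIDS = ["w", "j", "l", "ɾ"]


def nasal_liquid_onsets():
    """List the nasal + liquid onsets that survive the two restrictions."""
    candidates = [nasal + liquid for nasal, liquid in product(NASALS, LIQUIDS)]
    allowed = []
    for cluster in candidates:
        if cluster.endswith("l"):
            continue  # no nasal followed by /l/
        if cluster == "nj":
            continue  # /nj/ merges with /ɲ/, so it isn't a cluster
        allowed.append(cluster)
    return allowed


print(nasal_liquid_onsets())  # ['mw', 'mj', 'mɾ', 'nw', 'nɾ']
```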

For the combination of stop and fricative, things get trickier, because of Ardari’s rules about voicing and palatalization. Rather than a system, it might be best to show precisely which clusters are allowed:

  • Bilabial + fricative: /pɸ bβ pʁ bʁ/
  • Alveolar + fricative: /ts dz tɬ tʲs dʲz tʁ dʁ/
  • Velar + fricative: /kʁ gʁ kʲɕ gʲʑ/

For stops and liquids, we’ll do the same thing:

  • Bilabial + liquid: /pl pɾ pʲʎ pw bl bɾ bʲʎ bw/
  • Alveolar + liquid: /tw tɾ tʲɾ dw dɾ dʲɾ/
  • Velar + liquid: /kw kl kɾ kʲɾ kʲʎ gw gl gʲɾ gʲʎ/

At the end of a syllable, the clusters /ɫʁ ɾʁ ɫl ɫʎ wʎ ɾʎ ɫɲ ɾɲ/ aren’t allowed, but any others that fit the syllable structure are. (This is mainly because I find them too hard to pronounce.)

Ardari stress is free, but predictable. Syllables that have coda consonants other than just /w/ or /j/ are considered heavy, while all others are light. For most words, the stress will be on the last heavy syllable. (Secondary stress will fall on any heavy syllable not adjacent to another one.) Words with only light syllables are stressed on the penultimate, as are all words with exactly two syllables. For all of these rules, there is an overriding exception: /ɨ/ and /ə/ can never be stressed. If they would be, then the stress is moved to the next syllable. So, examples of all of these, using hypothetical words:

  • Basic stress pattern: sembina /ˈsembina/, karosti /kaˈɾosti/, dyëfar /dʲəˈfaɾ/.
  • Secondary stress in long words: andanyeskaro /ˌandaˈɲeskaɾo/.
  • Two syllables: meto /ˈmeto/, kyasayn /ˈkʲasajn/.
  • All light syllables: taralèko /taɾaˈlɛko/.
  • Stress moved because of vowel: lysmo /lɨsˈmo/, mönchado /mənˈɕado/.
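The primary stress rules are predictable enough to write down as a function. The Python sketch below is only one reading of them: it handles primary stress (not secondary), expects a non-empty word already split into (onset, vowel, coda) tuples, and keeps moving right if the next syllable also has /ɨ/ or /ə/. That last bit, the syllable divisions in the examples, and all the names are my own assumptions.

```python
UNSTRESSABLE = {"ɨ", "ə"}


def is_heavy(syllable):
    """Heavy means a coda other than nothing, just /w/, or just /j/."""
    _onset, _vowel, coda = syllable
    return coda not in ("", "w", "j")


def ardari_primary_stress(syllables):
    """Return the index of the syllable carrying primary stress."""
    n = len(syllables)
    heavy = [i for i, s in enumerate(syllables) if is_heavy(s)]
    if n <= 2 or not heavy:
        index = max(n - 2, 0)  # two-syllable or all-light words: penultimate
    else:
        index = heavy[-1]      # otherwise the last heavy syllable
    # /ɨ/ and /ə/ can't be stressed, so shift to the next syllable.
    while syllables[index][1] in UNSTRESSABLE and index < n - 1:
        index += 1
    return index


karosti = [("k", "a", ""), ("ɾ", "o", "s"), ("t", "i", "")]
monchado = [("m", "ə", "n"), ("ɕ", "a", ""), ("d", "o", "")]
print(ardari_primary_stress(karosti))   # 1
print(ardari_primary_stress(monchado))  # 1
```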

Because of the vowel reduction, Ardari will likely be a more free-form language than Isian, poetically speaking. Indeed, it will probably sound a lot more like English.

Next Time

With this post, we now have enough information to start making words in both our conlangs. That may even be enough for some people. If all you need is a “naming” language, you don’t have to worry too much about grammar. That said, stick around, because there’s plenty more to see. Next up is a theory post where we begin to give our words meaning, and we find out just how many words the Eskimos have for snow. See you then!

Let’s make a language – Part 2a: Syllables and Stress (Intro)

The syllable is the next logical unit of speech after the phoneme. It’s one or more sounds that follow a pattern, usually (but not always) centered around a vowel. These syllables can then be strung together into words, which we’ll cover in the next part. For now, we’ll see what we can do with these intermediate building blocks.

The Syllable

Most linguistic discussions divide a syllable into two parts: the onset and the rhyme. That’s as good a place as any to start, so that’s what we’ll do. The rhyme part is further subdivided into a nucleus and a coda, again a useful distinction for us to work with. As the rhyme is often the more important, we’ll look at it first.

The nucleus is the center of the syllable, and it’s usually a vowel sound. Some languages, however, permit consonants here, too, and these are known as syllabic consonants. In English, these are the sounds at the ends of words like better, bottle, bottom, and button. A few languages (e.g., Bella Coola, some Berber languages) go even farther, to the point where the dividing line between syllables becomes so blurred as to be useless. By and large, though, vowels and the occasional syllabic consonant are the rule for the nucleus.

The coda is everything that follows the nucleus, and it’s a part that is, strictly speaking, optional. Languages like Hawaiian don’t have syllable codas at all, while Japanese only allows its “n” sound, as in onsen. A slightly more complex scheme allows most (if not all) of the consonants in the language to appear in the coda. Beyond that are languages that allow clusters of two, three, or even four consonants, with English a primary example of the last category, as in the words texts and strengths. (We’ll come back to that one later.) An important distinction we can draw is between open syllables without a coda and closed syllables with one. That will come into play later on, when we discuss stress.

Moving to the onset, we see another opportunity for consonants. This can range from nothing at all (though languages such as Arabic do require an onset) to a single consonant to a cluster of two or three. Again, English is ridiculously complex in this regard, at the far end of the scale in allowing three: split and (once again) strengths. Of course, this complexity is tempered by the fact that the first of those three must be /s/, which brings us to the topic of phonotactics.

Loosely speaking, phonotactics is a set of constraints on which sounds can appear in a syllable. It’s a different system for each language. They all have a few things in common, though. First, there’s a distinction between consonant and vowel. The simplest systems allow only syllables of CV, where C stands for any consonant, V for any vowel. An alternative is (C)V, where the parentheses around C mean that it’s optional.

The next step up in complexity comes with a coda or an onset cluster: CVC or CCV. (We’ll assume that the parentheses indicating an optional consonant are implied.) These two are the most common, according to WALS Chapter 12, but they’re also where phonotactics becomes important. Which consonants can end a syllable? Which clusters are allowed? Although the first question has no universal answer, the second does have a trend that we can (or should) use.

Most languages that allow consonant clusters follow what’s called the sonority hierarchy. For consonants, it’s kind of a ranking of how “vowel-like” a sound is. Semivowels such as /w/ or /j/ are high on the list, usually followed by approximants like /l/ or /r/, then nasals, then fricatives, then stops. The rule, then, is that the allowed syllables have sonority that falls outward from the nucleus. In other words, it’s incredibly common for a language to allow a syllable onset like /kɾ/, but rare for /ɾk/ to be permitted. In the coda, that’s reversed, as the sounds with higher sonority come first. English bears this out: trust is the sequence stop – approximant – vowel – fricative – stop. /s/ (and /z/, for that matter) is special, though. Many languages allow either sound to appear in a place where the hierarchy says it shouldn’t go, like in stop or tops. That’s also how English gets three consonants in an onset: /s/ is always the first.
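For the programmers in the audience, the hierarchy boils down to a ranking and a comparison. The Python sketch below uses a toy phoneme-to-class table (just enough sounds for the examples in this post) and assumes strict rises and falls, so two sounds of equal rank are rejected. All the names are invented for illustration.

```python
# Higher number = more vowel-like, following the ranking described above.
SONORITY = {"stop": 1, "fricative": 2, "nasal": 3, "approximant": 4, "semivowel": 5}

# Toy class table; a real language would need its full inventory here.
CLASS_OF = {
    "p": "stop", "t": "stop", "k": "stop",
    "s": "fricative",
    "m": "nasal", "n": "nasal",
    "l": "approximant", "r": "approximant", "ɾ": "approximant",
    "w": "semivowel", "j": "semivowel",
}


def ranks(cluster):
    return [SONORITY[CLASS_OF[sound]] for sound in cluster]


def onset_ok(cluster):
    """Sonority must rise toward the nucleus."""
    r = ranks(cluster)
    return all(a < b for a, b in zip(r, r[1:]))


def coda_ok(cluster):
    """Sonority must fall away from the nucleus."""
    r = ranks(cluster)
    return all(a > b for a, b in zip(r, r[1:]))


print(onset_ok("kɾ"))  # True
print(onset_ok("ɾk"))  # False
print(onset_ok("st"))  # False: the special /s/ case breaks the hierarchy
```

A checker like this flags stop and tops as violations, which is exactly the point: /s/ needs its own exception on top of the general rule.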

And, of course, there are the combinations that aren’t allowed by a language despite the sonority hierarchy saying they’re fine. In English, these are mostly combinations of stops and nasals. We don’t pronounce the k in knight or the p in pneumonia, but other languages do. Conversely, those other languages have their own rules about what’s forbidden.

For a conlang, there’s really no best option for syllable structure. CV is simple, true, but it’s also limiting, and it creates its own problems. Generally, less complex syllables mean longer words, since there aren’t that many permutations that fit the rules. On the other hand, something too complicated can devolve into a mess of rules about which phoneme is allowed where.

Auxiliary languages, then, should probably stick with something in the middle of the spectrum, like CVC or a very restricted form of CCVC. Conlang artisans can go with something a bit more bizarre, especially if they’re never intending their languages to be spoken by mere mortals. And, of course, an alien race might have a different sonority hierarchy altogether, and the idea of “syllable” might make as little sense as it does for the Nuxalk of British Columbia.

Stress and Accent

However we choose to make syllables, whether CV or CVC or CCCVCCCC, we can now put them together to form words. Some words need just one. (Like every word in that sentence!) Many will need more, though, and some people find joy in hunting down the longest possible words in different languages.

Once we have more than one syllable in a word, there can be a battle for supremacy. Stress is a way of marking a syllable so that it stands out from those around it. Stressed syllables are typically spoken louder or with more emphasis. (An alternative is pitch accent, where the emphasized syllable is spoken with a different tone. This can happen even in languages that don’t actually have phonemic tone, including Japanese and Swedish.)

There doesn’t have to be any special meaning attached to stress. Many languages fix the position of the stressed syllable, so it’s always the last, the next to last (penultimate), or the third to last (antepenultimate). Others go in the opposite direction, stressing the first (initial), second, or third syllable from the beginning. In any of these languages, the stress falls in a specified place that doesn’t change, no matter what the word is. Examples (according to [WALS Chapter 14](http://wals.info/chapter/14)) include:

  • Final stress: Persian, Modern Hebrew
  • Penultimate: Swahili, Tagalog
  • Antepenultimate: Modern Greek, Georgian
  • Initial: Finnish, Czech
  • Second syllable: mostly smaller languages such as Dakota and Paiute
  • Third syllable: almost no languages (the only example in WALS is Winnebago)

Conversely, a language can also have stress that doesn’t seem to follow any rules at all. This free stress occurs in languages like English, where (as usual) it is weirder than it looks. In fact, English stress is phonemic, as it can be used to tell words apart. The canonical example is permit, which is a noun if you stress the first syllable, but a verb when you stress the second. In languages with free stress, it must often be learned, and it can be indicated in the orthography by diacritics, as in Spanish or Italian. Free stress can even vary by dialect, as in English laboratory.

It’s rare that a language has completely unpredictable stress. Usually, it’s determined by the kind of syllables in a word. This is where the distinction between open and closed syllables comes into play. Closed syllables tend to be more likely to take stress (i.e., they’re “heavy”), while open (“light”) syllables are stressed only when they are the only option. (Some languages consider long vowels and diphthongs to be heavy, too, but this isn’t universal.) It’s entirely possible, for example, for a language to normally have penultimate stress, but force the stress to move “back” to the antepenultimate if the final two syllables are light.

Stress in conlangs might be entirely unpredictable. All types are represented, in similar proportions to the real world, although pitch accent is one of those things that conlangers find fascinating. Auxiliary languages tend to have stress that’s either fixed or easily predictable; Esperanto’s fixed penultimate is a good example. Artistic languages are more likely to have free stress, though some of this might be due to laziness on the part of their creators. Fixed is easier, of course, since it’s mechanical, but free stress has its advantages. (An interesting experiment would be to create a language with free, unmarked stress, then come back to it a few years later and try to read it.)

Rhythm and Timing

Rhythm is kind of a forgotten part of conlanging. (I’m guilty of it, too.) It’s most closely tied to poetry, obviously, but the same concept creeps into spoken language, as well. For this post, the main point of rhythm is secondary stress. This kind of stress is lighter than the main, primary stress we discussed above, and it mostly occurs in long words of at least four syllables. Now, some languages don’t need (or have) a rhythmic pattern, but it can make a conlang feel more natural.

Generally, a heavy syllable is going to be more likely to get secondary stress, especially if there is a single, light syllable between it and the main stress. (In which direction? Whichever one you use to find the primary stress.) Languages without heavy syllables (such as pure CV languages) will probably have a pattern of stressing alternate syllables; in a penultimate-stress language, this would be the second to last, fourth to last, and so on.

Somewhat related to rhythm is timing, another under-appreciated aspect of a language. In languages such as Spanish or Italian, unstressed syllables are treated essentially the same as those that are stressed, and each syllable sounds like it takes the same amount of time. In others, including English, an unstressed syllable is spoken more quickly, and its vowel is reduced; here, it seems to be the amount of time between stressed syllables that stays constant.

For the most part, conlangers don’t need to worry much about rhythm and timing. However, if you’re writing poetry (or song) in your language, it will certainly come into play. Any post I do about that is a long way off.

The Mora

Some languages don’t use the syllable as the basis for stress and rhythm. Instead, these languages (including Japanese and Ancient Greek, to name but two) use the mora (plural morae). This is, in essence, another way of looking at light and heavy syllables. Basically, a short vowel in a syllable nucleus counts for one mora, while long vowels or diphthongs are two. A coda consonant then adds another mora, giving a range of one to three. Thus, a syllable that has one mora is light, and two morae make a heavy syllable. Three morae can make a “superheavy” syllable, though some languages don’t have these, and four seems to be impossible.
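The counting scheme just described is simple enough to sketch in Python. This follows the simplified version in this post (a coda adds exactly one mora, however many consonants it has), and the function names are made up.

```python
def count_morae(long_nucleus, has_coda):
    """One mora for a short vowel, two for a long vowel or diphthong,
    plus one more if the syllable has a coda."""
    return (2 if long_nucleus else 1) + (1 if has_coda else 0)


def weight(morae):
    """Name the syllable weight for a mora count of one to three."""
    return {1: "light", 2: "heavy", 3: "superheavy"}[morae]


print(weight(count_morae(long_nucleus=False, has_coda=False)))  # light
print(weight(count_morae(long_nucleus=False, has_coda=True)))   # heavy
print(weight(count_morae(long_nucleus=True, has_coda=True)))    # superheavy
```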

In a moraic system, stress (or pitch, if using pitch accent) can then be assigned to heavier syllables. Rhythm, too, would be based on the mora, not the syllable. The distinction can even be shown in writing, as in the Japanese kana. The end result, though, can be explained in the same terms either way. It’s just another option you can look into.

Conclusion

That was a lot to cover, and I only scratched the surface of syllables. But we can now make words, and that was worth a long post. Next up is a combination post for both Isian and Ardari. Since the theory’s out of the way, the implementation won’t take much explanation, so I’ve decided to cover both languages at the same time. After that, we’ll actually start diving into grammar. See you next week!

Let’s make a language – Part 1c: Ardari Phonology

Okay, the last time wasn’t so bad. But Isian is supposed to be simple. Ardari, on the other hand, will be a little bit different. Again, I’m going to try to explain some of the reasoning behind my choices as we go.

Ardari Consonants

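| | Bilabial | Alveolar | Palatal | Velar | Uvular |
|---|---|---|---|---|---|
| Nasal | m | n | ɲ | ŋ | |
| Stop | p pʲ b bʲ | t tʲ d dʲ | | k kʲ g gʲ | q |
| Fricative | ɸ β | s z ɬ | ɕ ʑ | x ɣ | ʁ |
| Approximant | w | l | j ʎ | ɫ | |
| Tap | | ɾ | | | |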

Instead of the relatively few 19 consonants of Isian, Ardari has a total of 33, slightly above the world average. And some of them are…well, you can see the table. The main features of Ardari’s consonant system are as follows:

  • A set of palatalized stops (all the ones with a ʲ). Note that there aren’t any actual palatal stops or affricates. Maybe they merged with the alveolar or velar stops at some point in the language’s history.

  • The uvular stop /q/ and fricative /ʁ/. These don’t quite fit in, but we can say they developed from earlier glottal stops or something. /q/ doesn’t have a voiced counterpart (nor does /ʁ/ have a voiceless one), but allophonic alteration will likely fill in the gaps. (By the way, WALS Chapter 6 has info on uvular consonants.)

  • A full set of fricatives, including bilabials (instead of the labiodentals of English), alveolars (the familiar /s/ and /z/), palatals (technically alveolo-palatals, as found in Polish, for example), and velars (voiceless and voiced).

  • More lateral consonants. We have the basic /l/, the “dark” velar /ɫ/, the palatal /ʎ/ (like ll in some Spanish dialects), and the voiceless fricative /ɬ/. The last is rare in Europe, with the exception of Welsh, where it is written ll. (WALS Chapter 8 is all about laterals.)

  • Two different kinds of “r” sound: the /ɾ/ from Spanish pero and /ʁ/, which is more like the French sound.

To add to this, some of the consonants will change at times. The most important point here is that palatalization and voicing change consonants in clusters. In pairs of consonants, the first takes on the voice quality of the second, while the second takes on the palatalization of the first. As an example, the cluster /sgʲ/ (assuming it’s possible) would be pronounced as if it were [zg], while /dʲs/ would come out as [tʲsʲ]. This only happens for stops and fricatives, though, since they’re the only ones where voicing and palatalization really matter.
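The rule is easier to trust once you can poke at it, so here is a rough Python sketch. The function name and the data layout are mine, not part of Ardari. One assumption to flag: /q/, /ɬ/, and /ʁ/ have no voicing partner in the table, so this sketch leaves them unchanged instead of guessing at an allophone.

```python
# Cluster assimilation sketch: in a pair of obstruents, the first takes the
# voicing of the second, and the second takes the palatalization of the first.

PAL = "\u02b2"  # the palatalization mark (superscript j)

# Voiceless/voiced pairs read off the consonant table.
PAIRS = [("p", "b"), ("t", "d"), ("k", "g"), ("ɸ", "β"),
         ("s", "z"), ("ɕ", "ʑ"), ("x", "ɣ")]
TO_VOICED = dict(PAIRS)
TO_VOICELESS = {voiced: voiceless for voiceless, voiced in PAIRS}

# Assumption: obstruents without a partner keep their own voicing.
UNPAIRED_VOICELESS = {"q", "ɬ"}
UNPAIRED_VOICED = {"ʁ"}


def split(consonant):
    """Separate a consonant into (base symbol, is_palatalized)."""
    if consonant.endswith(PAL):
        return consonant[:-1], True
    return consonant, False


def is_obstruent(base):
    return (base in TO_VOICED or base in TO_VOICELESS
            or base in UNPAIRED_VOICELESS or base in UNPAIRED_VOICED)


def is_voiced(base):
    return base in TO_VOICELESS or base in UNPAIRED_VOICED


def assimilate(first, second):
    """Return the pronounced forms of a two-consonant cluster."""
    base1, pal1 = split(first)
    base2, _ = split(second)
    # Only stops and fricatives take part; anything else passes through.
    if not (is_obstruent(base1) and is_obstruent(base2)):
        return first, second
    if is_voiced(base2):
        base1 = TO_VOICED.get(base1, base1)
    else:
        base1 = TO_VOICELESS.get(base1, base1)
    mark = PAL if pal1 else ""
    return base1 + mark, base2 + mark


if __name__ == "__main__":
    print("".join(assimilate("s", "g" + PAL)))
    print("".join(assimilate("d" + PAL, "s")))
```

Running it on the two clusters from the paragraph above gives the same results as the prose: the first loses its palatalization and gains voicing, the second does the opposite.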

As you can see, Ardari’s consonants are quite different from Isian’s. Even though some of them might be hard for you to pronounce, they still aren’t quite as outrageous as what some real-world languages have. Be glad I didn’t add in implosives or clicks or something else completely weird.

Ardari Vowels

| | Front | Central | Back |
|---|---|---|---|
| High | i | ɨ | u |
| Mid-High | e | | o |
| Mid | | ə | |
| Mid-Low | ɛ | | ɔ |
| Low | æ | | ɑ |

The vowel system is more complex, but it’s still a system. Ardari has 10 vowel phonemes, and we can divide them into three groups: front (/i e ɛ æ/), central (/ɨ ə/), and back (/u o ɔ ɑ/). The two central vowels are most likely reduction vowels that gained full phonemic status at some point. /ɛ/ and /ɔ/, on the other hand, probably represent a lost length distinction.

The Ardari vowels, since there are so many of them, don’t show too much variation. In unstressed syllables, some vowels might be pronounced as [ɨ] or [ə]. There is one rule that will stick out, though: /i/ and /e/ are never found after a non-palatal stop. /ɨ/, conversely, can’t follow any palatal or palatalized consonant. (A similar constraint can be found in Russian, for example.)
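Those two restrictions are simple enough to turn into a quick check. This is only a sketch: the function name is made up, and I’m assuming “palatal or palatalized” means anything in the Palatal column of the consonant table plus anything carrying the ʲ mark.

```python
# Checker for the two consonant-vowel restrictions described above.

PAL = "\u02b2"  # the palatalization mark (superscript j)

NON_PALATAL_STOPS = {"p", "b", "t", "d", "k", "g", "q"}
# Assumption: the Palatal column of the consonant table.
PALATAL_CONSONANTS = {"ɲ", "ɕ", "ʑ", "j", "ʎ"}


def is_palatal(consonant):
    """True for palatal consonants and for anything palatalized."""
    return consonant.endswith(PAL) or consonant in PALATAL_CONSONANTS


def vowel_allowed(consonant, vowel):
    """Return False if the vowel may not directly follow the consonant."""
    if vowel in ("i", "e") and consonant in NON_PALATAL_STOPS:
        return False
    if vowel == "ɨ" and is_palatal(consonant):
        return False
    return True


if __name__ == "__main__":
    for consonant, vowel in [("t", "i"), ("t" + PAL, "i"), ("ɲ", "ɨ"), ("s", "e")]:
        print(consonant, vowel, vowel_allowed(consonant, vowel))
```

A check like this is handy later on, when you start generating words and want to throw out the impossible ones automatically.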

There will still be diphthongs in Ardari, though we’ll postulate that most of them have been converted into pure vowels over time. The four that remain visible are /aj æw ej ow/ (phonetic [aɪ æʊ ɛi ɔu]), corresponding to English lie, how, say, and low. Most other combinations of vowels followed by glide consonants (/j/ and /w/) will end up being pronounced as one of these. For instance, the sequence /ew/ would become [æʊ], and /oj/ would turn into [aɪ].

Although the table looks ripe for it, Ardari doesn’t have vowel harmony. Sure, it’d be easy to add it in, and I’ve done just that with a conlang that has these exact phonemes. But not this time. We’ll keep it simple for now, saving the complications for the grammar, which will come soon.

Orthography

With a total of 43 phonemes (not counting diphthongs), it’s clear that fitting Ardari into the English alphabet is going to be a challenge. We have two options. We can opt for digraphs, which are strings of multiple letters standing for one phoneme (like English and Isian sh), or we can use diacritics, those funny little squiggles above letters in foreign languages. For Ardari, a combination of both might be our best bet.

Some of the phonemes can take their letter values, just like we did with Isian. Here, we’ll let the consonant phonemes /m p b w n t d s z l k g q/ and the cardinal vowels /e i o u/ all be written as they are in the IPA (/ɑ/ is close enough to a that we can say they’re the same). But that doesn’t even get us halfway!

If you look at the chart above, you can see that the palatalized stops are a big component. Let’s write them as the regular stops followed by y. That’ll take care of six more. Then, we can do the same for the palatal nasal and lateral: ny and ly. Now we’re getting somewhere. We’ll write /j/ itself as j, though, and you’ll see why in a moment. For the palatal fricatives, we’ll use the digraphs ch and zh. (We could also use Slavic diacritics and type them as š and ž. We can call that an alternate standard.)

The bilabial fricatives are pretty close in sound to their labiodental counterparts, so we’ll use f and v for them. The velar nasal is almost everywhere written as ng, so we’ll do that, except when it comes before another velar sound, when it will be n. Since nasals will assimilate, that’s okay.

We have two “rhotic” sounds, /ɾ/ and /ʁ/. Either one could lay claim to r, but I’m going with /ɾ/ for that. For /ʁ/, we’ll use rh. That helps signify its “rougher” quality, don’t you think?

That leaves two laterals, two velar fricatives, and five vowels. For the velars, we can use the digraphs kh for /x/ and gh for /ɣ/. The laterals are a little tougher to figure out, but I’ll choose lh for /ɬ/ and ll for /ɫ/. It’s an arbitrary choice, to be sure, but I’m open to suggestions.

For the vowels, the best bet is usually diacritics, because the English alphabet simply doesn’t have enough vowel letters. Sure, you can use clever digraphs and trigraphs, but that way lies madness and Irish orthography, which are pretty much the same thing. Squiggles it is, then. We’ll use familiar European standards where we can, like a German-style ä for /æ/. French gives us è for /ɛ/, and we can extend this by analogy to ò for /ɔ/. That takes care of all but the two central vowels, which turn out to be surprisingly difficult. For /ɨ/, we can use y, since we already said it can’t appear after palatal consonants. (In other words, there’s no way to get yy.) For the schwa, we’ll go with ë or ö. Which to use depends on the previous consonant: ë after palatals, ö otherwise.

Whew. There we go. Let’s look at all this in a format that’s easier to read.

Written Phoneme Description
a /ɑ/ a as in father
ä /æ/ a as in cat
b /b/ b as in bad
by /bʲ/ palatalized b
ch /ɕ/ something like sh in show; more like Polish ś
d /d/ d as in dig
dy /dʲ/ palatalized d
e /e/ e as in Spanish queso
è /ɛ/ e as in bet
ë /ə/ a as in about; only after palatals
f /ɸ/ f as in Japanese fugu
g /g/ g as in got
gh /ɣ/ g as in Spanish amigo or Swedish jag
gy /gʲ/ palatalized g
i /i/ i as in German Sie
j /j/ y as in yet
k /k/ k as in key
kh /x/ ch like in German acht
ky /kʲ/ palatalized k
l /l/ l as in let
lh /ɬ/ ll as in Welsh llan
ll /ɫ/ l as in feel
ly /ʎ/ ll as in million (American English)
m /m/ m as in may
n /n/ n as in no
ng /ŋ/ ng as in sing
ny /ɲ/ ñ as in Spanish año
o /o/ au as in French haut
ò /ɔ/ o as in hot
ö /ə/ a as in about; only after non-palatals
p /p/ p as in pack
py /pʲ/ palatalized p
q /q/ q as in Arabic Qatar
r /ɾ/ r as in Spanish toro
rh /ʁ/ r as in French rue
s /s/ s as in sit
t /t/ t as in tent
ty /tʲ/ palatalized t
u /u/ ou as in French sous
v /β/ b as in Spanish bebe
w /w/ w as in wet
y /ɨ/ like i in bit; closer to Polish or Russian y
z /z/ z as in zebra
zh /ʑ/ like z in azure; closer to Polish ź
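For anyone who wants to play with the spelling rules, the table and the two context-dependent cases (ng becoming n before a velar, and ë versus ö for the schwa) can be sketched in Python. The function name and the sample phoneme strings are hypothetical, not real Ardari words. I’m also assuming “velar sound” means the velar stops and fricatives, and that a schwa with nothing before it gets ö.

```python
# Spell a list of Ardari phonemes using the orthography table above.

PAL = "\u02b2"  # the palatalization mark (superscript j)

LETTERS = {
    "ɑ": "a", "æ": "ä", "e": "e", "ɛ": "è", "i": "i",
    "o": "o", "ɔ": "ò", "u": "u", "ɨ": "y",
    "m": "m", "n": "n", "ɲ": "ny", "ŋ": "ng",
    "p": "p", "b": "b", "t": "t", "d": "d", "k": "k", "g": "g", "q": "q",
    "ɸ": "f", "β": "v", "s": "s", "z": "z", "ɬ": "lh",
    "ɕ": "ch", "ʑ": "zh", "x": "kh", "ɣ": "gh", "ʁ": "rh",
    "w": "w", "l": "l", "j": "j", "ʎ": "ly", "ɫ": "ll", "ɾ": "r",
}
PALATALS = {"ɲ", "ɕ", "ʑ", "j", "ʎ"}
VELARS = {"k", "g", "x", "ɣ"}  # assumption: stops and fricatives only


def is_palatal(phoneme):
    return phoneme.endswith(PAL) or phoneme in PALATALS


def spell(phonemes):
    """Turn a list of phoneme strings into written Ardari."""
    out = []
    for i, p in enumerate(phonemes):
        prev = phonemes[i - 1] if i > 0 else None
        nxt = phonemes[i + 1] if i + 1 < len(phonemes) else None
        if p == "ə":
            # Schwa: ë after palatals, ö otherwise.
            out.append("ë" if prev is not None and is_palatal(prev) else "ö")
        elif p == "ŋ" and nxt is not None and nxt.rstrip(PAL) in VELARS:
            # The velar nasal is plain n before another velar.
            out.append("n")
        elif p.endswith(PAL):
            # Palatalized consonants: base letter(s) plus y.
            out.append(LETTERS[p[:-1]] + "y")
        else:
            out.append(LETTERS[p])
    return "".join(out)


if __name__ == "__main__":
    # Made-up phoneme strings, purely for illustration.
    print(spell(["k" + PAL, "ə", "ŋ", "g", "ɑ"]))
    print(spell(["t", "ə", "ɾ", "ɨ"]))
    print(spell(["ɕ", "ɛ", "ŋ"]))
```

Input is a list of phonemes, not a raw string, which sidesteps the problem of deciding where one IPA symbol ends and the next begins.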

Wow, that’s a lot of letters! Next time, it’s back to the theory, where we’ll discuss all the things that we can use to make these sounds into words.