Random generation for conlangs

Making a language is hard work. As anyone who knows programmers knows, hard work is not something we like. Not hard work that we have to do ourselves, that is. If we can find a way to get a computer to do it for us, well, that’s entirely different.

On the surface, it seems pretty simple. Language creation is a lot of time-consuming, repetitive work. We have to make hundreds or thousands of words. We have to work out grammatical rules. The list goes on, and it’s almost a sure thing that, at some point, the computer-savvy conlang creator is going to ask, “Can I automate this?” And thus is random generation born.

The pros

It’s not that bad an idea, if you do it right. Computers are a lot faster than the human mind when it comes to such things. They can generate a million random words in the time it takes us to think of one. So shifting some of the burden to the uncaring machines seems almost natural. After all, they come up with random numbers all the time, don’t they? So what’s the difference?

For words, it’s not even that big a deal. If you’re working with an artistic conlang, or something otherwise not bound by the sequences of sounds used by natural languages, there’s almost no reason not to randomly generate at least some roots. Not whole words, mind you, because they won’t necessarily fit the grammatical and morphological patterns of your language. No, it’s better to generate the basic roots, then inflect them however your conlang does that—if it does at all.
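As a rough sketch of that roots-then-inflection workflow, here is a small Python example. Everything specific in it is invented for illustration: the suffix table, the linking-vowel rule, and the sample roots belong to no real conlang, and `inflect` and `paradigm` are just names I picked.

```python
VOWELS = "aeiou"

# Hypothetical noun endings for an imaginary conlang; substitute your own.
NOUN_SUFFIXES = {"singular": "", "plural": "ri", "accusative": "n"}


def inflect(root, form):
    """Attach the suffix for `form`, repairing consonant clusters."""
    suffix = NOUN_SUFFIXES[form]
    # Invented repair rule: a linking "a" separates two consonants.
    if suffix and root[-1] not in VOWELS and suffix[0] not in VOWELS:
        return root + "a" + suffix
    return root + suffix


def paradigm(root):
    """Return every inflected form of one root."""
    return {form: inflect(root, form) for form in NOUN_SUFFIXES}


# Roots as they might come out of a generator (these are made up).
for root in ["kata", "tak", "miso"]:
    print(root, paradigm(root))
```

The point of splitting it this way is that the random part only ever produces bare roots; the morphology stays under your control and is applied afterward.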

Word generators are easy enough to find, and they aren’t all that hard to make. (I’ve done it no fewer than four times.) Most of the good ones (not mine) give you a lot of nice options. They let you set frequency distributions, so some sounds are more common than others. A few even provide variable substitution, syllabic constraints, allophony, and so on. I won’t say that random word generation is a solved problem, but it’s definitely not new. If the circumstances of your conlang allow it, and you think you can get good results, go for it.
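To give a feel for what frequency distributions and syllabic constraints mean in practice, here is a minimal Python sketch of a weighted generator. It isn't modeled on any particular tool. The phoneme inventory, the weights, and the syllable templates are all assumptions I made up for the example.

```python
import random

# Hypothetical inventory: the weights are invented for illustration.
CONSONANTS = {"t": 5, "k": 4, "n": 4, "s": 3, "m": 3, "l": 2, "r": 2, "p": 1}
VOWELS = {"a": 5, "i": 3, "u": 2, "e": 2, "o": 1}
# Allowed syllable shapes and their relative weights (C = consonant, V = vowel).
TEMPLATES = {"CV": 6, "CVC": 3, "V": 1}


def weighted_choice(rng, table):
    """Pick one key from an {item: weight} table, favoring heavier weights."""
    items = list(table)
    weights = [table[item] for item in items]
    return rng.choices(items, weights=weights, k=1)[0]


def make_syllable(rng):
    """Fill a randomly chosen template with weighted sounds."""
    template = weighted_choice(rng, TEMPLATES)
    return "".join(
        weighted_choice(rng, CONSONANTS if slot == "C" else VOWELS)
        for slot in template
    )


def make_root(rng, min_syllables=1, max_syllables=3):
    """Build one root from a random number of syllables."""
    count = rng.randint(min_syllables, max_syllables)
    return "".join(make_syllable(rng) for _ in range(count))


def make_roots(n, seed=None):
    """Generate n roots; pass a seed to get the same list every time."""
    rng = random.Random(seed)
    return [make_root(rng) for _ in range(n)]


print(make_roots(10, seed=1))
```

With these weights, "t" and "a" turn up far more often than "p" and "o", and no root can contain a shape the templates don't allow. Allophony and the fancier options would be extra passes over the output.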

At other levels, random generation is a bit more iffy. Gleb is a part of an (abandoned?) attempt at randomly creating every part of a language; the link goes to a phonology generator, which…works. It’s a start, though, not an end result. I used a few of its outputs as the initial seeds for the conlangs of my “Otherworld” setting, but all of those required a lot more polish before they were anything approaching usable.

It’s also possible to generate random grammar rules and the like, but the field is so vast, and the different parts of a language so interwoven, that I’m not sure you could ever write a program that could give you something plausible. If you could, you’d probably be halfway to strong AI already. I’d like to see it, though.

The cons

On the flip side, random conlang generation has plenty of downsides that make it unsuitable in many cases.

First and foremost, any conlang intended to be an auxiliary language almost certainly can’t use randomness at all. It simply doesn’t fit the criteria. These languages are supposed to be either familiar to a broad population, simple enough for anyone to learn, or engineered based on linguistic principles. None of that really meshes well with random words, much less any other part of a conlang.

Second, unless you’re willing to go through a lot of trouble, the output of a generator isn’t always that great. Yes, the numbers it uses internally will likely be random enough (pseudorandom, strictly speaking), but mapping those numbers to letters, phonemes, words, etc., in a way that doesn’t look, well, computer-generated, is an exceedingly difficult task. Take any of those word generators I mentioned above. Sure, you can get a good list out of them. More likely, however, you’re going to be looking at thousands of nonsense letter sequences that do nothing but waste your time.

Next, as I said above, most of the grammatical portion of a conlang isn’t really amenable to randomizing. Language structure is so full of causation and correlation, of universals and implications, that it’s just not that random. It’s more like a chain of logic, but with a few forks in the road that give the creator a bit of leeway. You don’t have to, say, have a past tense just because you have a future one, but it’s more common if you do, and random generators would have to account for that. In the end, there are so many variables, so many special cases, that the programmer effectively has to make a language just for the generator.
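The past/future example can be sketched in Python as a table of features whose probabilities shift depending on what has already been chosen. This is only a toy: the feature names, the `FEATURES` table, and every probability in it are invented, not drawn from typological data. Even four features already need hand-written dependencies.

```python
import random

# Each entry: feature name, base probability, and overrides keyed by an
# earlier feature that changes the probability when that feature is present.
# All numbers are invented for illustration.
FEATURES = [
    ("future_tense", 0.5, {}),
    ("past_tense", 0.6, {"future_tense": 0.9}),    # likelier with a future
    ("dual_number", 0.1, {}),
    ("plural_number", 0.8, {"dual_number": 1.0}),  # treated as a hard rule here
]


def generate_grammar(seed=None):
    """Decide each feature in order, adjusting odds by earlier choices."""
    rng = random.Random(seed)
    chosen = {}
    for name, base, overrides in FEATURES:
        probability = base
        for condition, adjusted in overrides.items():
            if chosen.get(condition):
                probability = adjusted
        # rng.random() is always below 1.0, so probability 1.0 never fails.
        chosen[name] = rng.random() < probability
    return chosen


print(generate_grammar(seed=3))
```

Scaling this up means writing a row, and the right overrides, for every feature a language could have, which is the "language just for the generator" problem in miniature.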

I could go on, but I’ll leave with one final argument against randomness, one I don’t entirely support. Nonetheless, it is popular, so I’d be remiss if I didn’t at least mention it. That reason is art. If you’re of the belief that conlangs are art, you might balk at the very notion of using computers to create them from random numbers. That would be no different from trying to use an RNG to make artistic textures, or images, or worlds, and we see how often those go awry. (Minecraft and the like use noise functions, which are entirely different. In many cases, they’re actually predictable; the same world seed will give you the same world.)
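To make that parenthetical a little more concrete, here is a toy one-dimensional value noise function in Python. It's a simplified stand-in, not the algorithm Minecraft or any other game actually uses, and the names `lattice_value` and `value_noise` are my own. It shows the two properties that matter: nearby inputs give nearby outputs, and the same seed always gives the same result.

```python
import math
import random


def lattice_value(seed, i):
    """Deterministic pseudo-random value in [0, 1) for integer point i."""
    return random.Random(f"{seed}:{i}").random()


def value_noise(seed, x):
    """Smoothly interpolate between the lattice values around x."""
    left = math.floor(x)
    t = x - left
    t = t * t * (3 - 2 * t)  # smoothstep easing between lattice points
    a = lattice_value(seed, left)
    b = lattice_value(seed, left + 1)
    return a + (b - a) * t


# A strip of "terrain heights": rerunning with the same seed repeats it.
terrain = [round(value_noise("my world", x / 4), 3) for x in range(12)]
print(terrain)
```

Independent draws from an RNG would jump around from one sample to the next; here each height drifts gradually toward the next lattice value, which is why noise tends to look crafted where raw random numbers look like static.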

If you consider a created language to be an artistic work, then part of its allure is in the way it is crafted. We, the makers, choose words based not on algorithms, but aesthetics. It’s a more…philosophical argument, in my opinion, but I can see the reasoning.


In summary, my thoughts on the subject are as follows. Use random word generators if your conlang supports them. Don’t use them for anything where roots have to be derived from some principle other than your own mind. A phonology generator can be a good starting point, but not a finished product. Generators claiming to create grammar and the like for you probably aren’t going to give you something sensible, much less usable.

That’s not to say you can’t have fun with random generation. It is fun, and I can’t count the number of names I’ve used that ultimately derive from a list created by a computer program. (Some have even been published! Well, they will be in the coming weeks.) It’s not yet a substitute for the hard work of our minds, however, and it may never be. By the time a computer can create a language from nothing more than a random sequence of numbers, we may have bigger problems than conlangs becoming a commodity.
