On procedural generation

One of my favorite uses of code is to create things. It always has been. When I was young, I was fascinated by fractals and terrain generators and the like. The whole idea of making the code to make something else always appealed to me. Now, as it turns out, the rest of the world has come to the same conclusion.

Procedural generation is all the rage in games these days. Minecraft, of course, has made a killing off of creating worlds from nothing. No Man’s Sky may have flopped, but you can’t fault its ambition: not only was it supposed to have procedurally-generated worlds, but a whole galaxy full of aliens, quests, and, well, content. That last part didn’t happen, but not because of impossibility. The list goes on—and back, as Elite, with its eight galaxies full of procedural star systems, is about as old as I am.

Terrain

Procedural terrain is probably the most widely known form of generation. Even if you’ve never played with TerraGen or something like that, you’ve probably played or seen a game that used procedural heightmaps. (Or voxels, like Minecraft.) Making terrain from code is embarrassingly easy, and I intend to do a post in the near future about it.

From the initial generation, you can add in lots of little extras. Multiple passes, possibly using different algorithms or parameters, give a more lifelike world. Tweaking, say, the sea level changes your jagged mountain range into an archipelago. You can go even further, adding in simulated plate tectonics or volcanic deposition or coastline erosion. There really are no boundaries, but realism takes some work.
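To make the multi-pass idea a little more concrete, here’s a minimal Python sketch. It assumes a one-dimensional heightmap built by summing a few passes of interpolated random values (a simple value-noise approach, not the method of any particular game), and the names `value_noise`, `heightmap`, and `land_fraction` are invented for this example.

```python
import random

def value_noise(length, step, rng):
    """One pass: random control points every `step` cells, linearly interpolated."""
    points = [rng.random() for _ in range(length // step + 2)]
    row = []
    for i in range(length):
        left, t = divmod(i, step)
        t /= step
        row.append(points[left] * (1 - t) + points[left + 1] * t)
    return row

def heightmap(length=64, passes=4, seed=42):
    """Sum several passes, each with finer detail and half the amplitude."""
    rng = random.Random(seed)
    heights = [0.0] * length
    step, amplitude, total = 16, 1.0, 0.0
    for _ in range(passes):
        layer = value_noise(length, step, rng)
        heights = [h + amplitude * v for h, v in zip(heights, layer)]
        total += amplitude
        step = max(1, step // 2)
        amplitude /= 2
    return [h / total for h in heights]  # normalize to the 0..1 range

def land_fraction(heights, sea_level):
    """Share of cells that stay above water at a given sea level."""
    return sum(h > sea_level for h in heights) / len(heights)

terrain = heightmap()
for sea_level in (0.3, 0.5, 0.7):
    # "^" is land, "~" is water; only the sea level changes between rows.
    row = "".join("^" if h > sea_level else "~" for h in terrain)
    print(f"sea level {sea_level}: {row}")
```

The same seed always produces the same terrain, and raising `sea_level` can only shrink the land, which is the mountain-range-to-archipelago effect in miniature.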

Textures and models

Most 3D modeling software will give you an option to make “procedural” textures. These can be cool and trippy, especially those based on noise functions, but it’s very difficult to use them to make something realistic. That doesn’t stop them from being useful for other things; a noise bump map might be more interesting than a noise texture, but the principle is the same.

Going one step up—to actual procedural models—is…not trivial. The “creature generators” as in No Man’s Sky or Spore are severely limited in what they can do. That’s because making models is hard work already. Leaving the job in the hands of an algorithm is asking for disaster. You’re usually better off doing as they do, taking a “base” and altering it algorithmically, but in known ways.

Sound

Procedural sound effects and music interest me a lot. I like music, I like code. It seems only natural to want to combine the two. And there are procedural audio packages out there. Making them sound melodic is like getting a procedural model to work, but for your ears instead of your eyes. It’s far from easy. And most procedural music tends to sound either very loopy and repetitive, or utterly listless. The generating algorithms we use aren’t really suited for musical structure.

Story

Now here’s an intriguing notion: what if algorithms could write a story for us? Creating something coherent is at the high end of the difficulty curve, but that hasn’t stopped some from trying. There’s even a NaNoWriMo-like contest for it.

On a smaller scale, games have been making side quests and algorithmic NPCs for years. That part isn’t solved, but it isn’t hard. (For some reason, Fallout 4 got a lot of press for its “radiant quests” in 2015, like it was something new. Though those are, I think, more random than procedural…) Like modeling, the easiest method is to fill in parts of a pre-arranged skeleton, a bit like Mad Libs.
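The fill-in-the-skeleton approach might look something like this in Python. It’s only a sketch: the template, the word lists, and the `make_quest` function are all hypothetical, made up to show the shape of the technique rather than how any real game does it.

```python
import random

# Hypothetical quest skeleton and word lists, invented for illustration.
TEMPLATE = "{giver} asks you to {task} the {target} near {place}."
SLOTS = {
    "giver": ["The innkeeper", "A nervous farmer", "The town guard"],
    "task": ["recover", "destroy", "deliver"],
    "target": ["stolen ledger", "wolf den", "sealed letter"],
    "place": ["the old mill", "the river ford", "the ruined tower"],
}

def make_quest(seed):
    """Fill every slot in the skeleton; the same seed always gives the same quest."""
    rng = random.Random(seed)
    choices = {slot: rng.choice(options) for slot, options in SLOTS.items()}
    return TEMPLATE.format(**choices)

for seed in range(3):
    print(make_quest(seed))
```

Nothing here stops the generator from asking you to “deliver the wolf den”, which is a fair picture of why this kind of thing feels more random than procedural. Getting coherent results means adding rules about which pieces can go together.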

Anything else

Really, there’s no limit to what can be made through procedural generation. That’s probably why I like it so much. From a small starting seed and a collection of algorithms, amazing things can be brought forth. In an age of multi-gigabyte games full of motion-captured animation, professional voice talent, and real-world scenery, it’s a refreshing change to make something beautiful out of a few letters and numbers.

The alternative

No form of government is perfect. If one were, every nation-state would eventually gravitate towards it. Nor will I say that I have developed the perfect form of rule. In fact, I’m not sure such a thing is possible. However, I can present an alternative to the deeply flawed systems of the present day.

Guided by the principles of good government we have previously seen, and aided by logic, reason, and the wisdom of ages, we can derive a new method, a better method. It is not a fully-formed system. Rather, it is a framework with which we can tinker and adjust. It is a doctrine, what I call the Doctrine of Social Liberty.

I cannot accept the strictures of current political movements. In my eyes, they all fail at some point. That is the reason for stating my principles. Those are the core beliefs I hold, generalized into something that can apply to any nation, any state. A government that does not follow those principles is one that fails to represent me. I am a realist; as I said above, nothing is perfect. Yet we should strive for perfection, despite it being ever unattainable. The Doctrine of Social Liberty is my step in that direction.

More than ever, we need a sensible, rational government based on sound fundamentals. The answer does not lie in slavishly following the dogmatic manifestos of radical movements. It does not lie in constant partisan bickering. It can only be found by taking a step back, by asking ourselves what it is that we want from that which governs us.

Over the coming weeks, I hope to detail what I want from a government. I don’t normally post on Tuesdays, but the coming election presents a suitable reason to do so. In four posts, I will describe my doctrine in its broadest strokes, and I will show how our current ruling class violates the principles I have laid out. Afterward, following the start of next year, I want to go into greater detail, because I think these things will become even more relevant.

Let’s make a language, part 19b: Plants (Isian)

We’ve already established that Isian is a language of our world. We’ve also set it somewhere in the Old World, in a place relatively untouched by the passage of time. By definition, that means it won’t have much contact with the Americas, so the most common plant terms will be those from Eurasia, with a few popular items coming from Africa. On the other hand, Isian has native words for all the different parts of the plant, as well as what to do with them. Again, this comes from our worldbuilding: Isian is spoken in an agrarian society, so it’s only natural that its speakers would name such an integral part of their world.

Word list

General terms

These are parts of plants, mainly the important (i.e., edible) parts, as well as a few terms for the broad types of plants. Note that all of these are native Isian words, and almost all are also “fundamental” words, not derived from anything.

  • berry: eli
  • flower: atul
  • fruit: chil
  • grain: kashel
  • grass: tisen
  • leaf: eta
  • nut: con
  • plant: dires
  • root: balit
  • seed: som
  • stem (stalk): acut
  • to harvest: sepa
  • to plant: destera
  • tree: taw

Plant types

This set of words names specific types of plants. These fall into three main categories. First, there are the native terms, like pur “apple”, which are wholly Isian in nature. Next are the full-on loanwords, taken from the “common” names used in many parts of Europe; these are usually the New World plants where Isian has no history of association. Finally, there are a few compounds, like cosom, “pepper”, formed from ocom “black” and som “seed”.

  • apple: pur
  • banana: banan (loan)
  • bean: fowra
  • carrot: cate(s)
  • cherry: shuda(s)
  • corn (maize): meyse (loan)
  • cotton: churon
  • fig: dem
  • flax (linen): wod
  • grape: ged
  • mint: ninu
  • oak: sukh
  • olive: fili(r)
  • onion: dun
  • orange: sitru(s) (loan, “citrus”)
  • pea: bi (note: not a loan)
  • pepper: cosom (compound: “black seed”)
  • pine: ticho (from a compound “green tree”)
  • potato: pota (loan)
  • rice: manom
  • rose: rale(r)
  • wheat: loch

Coming up

These are far from the only words in the Isian language regarding plants, but they’re a good start, covering a lot of bases while also illustrating how we can combine worldbuilding and conlanging to make something better. Next week, we’ll see things from the Ardari side of the fence. Spoiler alert: it’s not exactly the same.

On visual programming

With the recent release of Godot 2.1, the engine’s developers broke the news of a project coming to a future version: a visual scripting system. That’s nothing new, even in the world of game development. Unreal came out with Blueprints a while back, and even Godot already used something similar for its shaders. Elsewhere, “visual programming” has its proponents and detractors, and I’m going to throw in my two cents.

Seeing is believing

In theory, visual programming is the greatest revolution in coding since functions. It’s supposed to deliver all the benefits of writing programs to those who don’t want to, well, write. The practice has its origins in flowcharts, which are graphical representations of algorithms. Computer scientists got to thinking, “Okay, so we can make algorithms like this, so why not whole applications?” It’s only been fairly recently that computers have been able to fulfill this desire, but visual programming languages are now springing up everywhere.

The idea is deceptive in its simplicity. Instead of laboriously typing out function declarations and loop bodies, you drag and drop boxes (or other shapes), connecting them to show the flow of logic through your program. The output of one function can be “wired” to another’s input, for example. The benefits are obvious. Forget about syntax errors. Never worry about type mismatches again. Code can truly become art. With the name of this site, you’d think I could get behind that.
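Underneath the boxes and wires, a visual program is usually some kind of graph that gets evaluated. Here’s a small Python sketch of that idea, assuming the simplest possible model: each box is a function, and each wire feeds one box’s output into another’s input. The `Node` class and `constant` helper are hypothetical names for this example, not part of any real visual scripting system.

```python
class Node:
    """A 'box': a function plus the nodes wired to its inputs."""
    def __init__(self, func, *inputs):
        self.func = func
        self.inputs = inputs

    def evaluate(self):
        # Pull a value from each upstream wire, then run this box.
        return self.func(*(node.evaluate() for node in self.inputs))

def constant(value):
    """A source box with no inputs."""
    return Node(lambda: value)

# Wire up the graph for (3 + 4) * 2.
three, four, two = constant(3), constant(4), constant(2)
add = Node(lambda a, b: a + b, three, four)
multiply = Node(lambda a, b: a * b, add, two)

print(multiply.evaluate())  # prints 14
```

A graphical editor is, in effect, a friendlier way of writing those last few wiring lines.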

It does work, I’ll give you that. MIT’s Scratch is used by millions of programmers, young and old alike. Through its ingenious “building block” system, where the “boxes” are shaped like pieces in a jigsaw puzzle, you never have to worry about putting a square peg into a round hole. For children who barely understand how a computer works (or, in extreme cases, may not know how to read yet), it’s a great help. There’s a reason why it’s been copied and cloned to no end. It even makes coding fun.

What can’t be unseen

The downsides to visual programming, however, are not fun at all. Ask anyone who’s ever suffered through LabVIEW (I’ve read enough horror stories to keep me away from it forever). Yes, the boxes and blocks are cute, but complex logic is bad enough when it’s in linear, written form. Converted to a more visual format, you see a literal interpretation of the term “spaghetti code”. Imagine how hard it was to write. Now imagine how hard it would be to maintain.

Second, visual programming interfaces have often suffered from a lack of operations. If you wanted to do something that there isn’t a block for, you were most likely out of luck. Today, it’s not so bad. Unreal’s Blueprints, for example, gives you about 95% of the functionality of C++, and Godot’s variation purports to do the same.

But some things just don’t fit. Composition, an important part of programming, is really, really hard to get right visually. Functional styles look like tangled messes. Numerics and statistics are better served in a linear form, where they’re close to the mathematics they’re implementing.

The verdict

I’m not saying visual programming is useless. It’s not. There are cases where it’s a wonderful thing. It’s great for education, and it’s suited to illustrating the way an algorithm works, something that often gets lost in the noise of code. But it needs to be used sparingly. I wouldn’t write an operating system or device driver in Scratch, even if I could. (You can’t, by the way.)

In truth, the visual “style” doesn’t appeal to me. That’s a personal opinion. Your mileage may vary. When I’m learning something new, it’s certainly a big help, but we have to take the training wheels off eventually. Right now, that’s how I see a visual scripting system. For a new programmer, sure, give it a try. You might love it.

But you probably won’t. After a while, you’ll start bumping into the limitations. You’ll have webs of logic that make spiders jealous, or customized blocks that look like something by Escher. Stay simple, and you might keep your sanity—as much as any programmer has, anyway. But we’re not artists. Not in that way, at least. Code can be beautiful without being graphical.

Magic and tech: safety and security

Despite what you may hear from TV and other sources of news, the world we live in today is the safest there’s ever been. Those of us living in the modern, industrialized West enjoy a level of personal, private, and public safety that would make earlier ages green with envy. Some of that comes from philosophy, from political science and enlightened ideas about the responsibilities of good government. With the representative democracies that make up most of Europe and North America, we’re all invested in the safety of everyone. An attack on one of us is an attack on all of us.

But technology also plays an important role in keeping us protected, in allowing us to live our lives free of the fears of random violence or other threats. Say what you will about them, but guns are a sufficient deterrent in many instances. They aren’t the only form of technological security, though. Look at crash helmets, airbags, or even knee pads: all inventions created to keep us safe from incidental harm.

Science of safety

Today, we’re seeing a lot of talk about safety and security. Before we can look at them, though, we need to distinguish these two terms. Security, as I see it, is active protection from external threats, looking out for the things that might hurt you and dealing with them. Safety is more like not having those threats to begin with, or mitigating their causes in such a way that they never have the chance to harm you. Both of these aspects are intertwined, however.

Most technology deals with both ends of this spectrum at the same time. Take, for instance, collision avoidance. It’s a safety feature, in that its whole point is to steer you away from the possibility of a crash. But it can also be an active security system: if another car cuts you off, it can avoid that potential crash, too. Some of the more advanced systems can also stop you from causing an accident, by creating a negative feedback in steering or simply ignoring your movements of the wheel completely.

Safety and security aren’t limited to electronic assistance. They go back to the beginning of time. Any non-hunting weapon (or hunting weapon used for self-defense) is an implement of security. So are bodyguards and even standing armies. Public policies dating back to the age of Rome and before instituted measures of safety, from sanitation standards to traffic ordinances to weapons bans. (Whether these worked, of course, is a matter of debate.)

Socially speaking, there are also two ways we can look at safety. First, we can take it into our own hands. Anyone who owns a gun, has an alarm system, or even wears a seatbelt is doing exactly this. By following what we perceive to be “best practices”, we can make ourselves as safe as we wish. If X will harm you, then you try to put yourself in a position where X can’t get to you.

The alternative (not that they are mutually exclusive) is to put your trust in another. We also do that all the time. The whole point of a society based on the rule of law is that someone, somewhere, is responsible for the safety of the public. Whether that’s a king, president, or whatever you like, it doesn’t matter. Someone is looking out for you. We can’t protect against every threat, so we delegate to them.

Safety in magic

Most of our best safety and security comes from technology, whether that’s guns, cameras, anti-virus programs, or just a combination lock. Since we’ve established that magic can replace an awful lot of tech, we have to wonder: can magic make people safer?

Well, we’ve already seen a couple of realms where it does: medicine and self-defense. That’s proof enough of the merit of magical security. But how much further can we take this?

If your magic system allows shields of force (for this series, ours doesn’t, but bear with me), then that right there is a great example. Something like that would become extremely popular, especially if it’s not that hard to make. A single charm or enchantment that makes you all but immune to weapons, blunt trauma, falling, and the elements? You’d be crazy not to get one. But let’s say you’re working with something a little more low-key, like we are. We don’t have the luxury of an easy illustration of the power of magical security, so we’ll have to look at a few other possibilities.

We have an amplifying spell. A crafty mage can take this and turn it around. Instead of a speaker making his voice louder, a wary person can make ambient sounds louder. Sounds like, say, someone creeping through the bushes. It’s a primitive, but useful, security microphone. From the same earlier entry in this series, we also see a ventriloquist effect that can serve as a helpful bit of misdirection. If they think you’re over there, but you’re really here, those dangerous enemies will be out of position, giving you time to strike or run away.

Magical power, whether electrical or motive, gives us the opportunity to create such things as self-locking doors and electrified fences. Metallurgy, improved by the arcane arts, makes it easier to forge heavy, secure locks, but also the delicate keys needed to open them. A mage’s invisible markings can be used as fingerprinting or watermarking: a secure method of verifying the identity of a message’s sender. On the safety side, we have, of course, medicine and sanitation as the big winners, but they’re not the only ones.

Magic, and the scientific, empirical mindset it’s bringing to our fictional realm, will make many areas safer. From the grand (weather forecasting) to the mundane (washing hands), as our magical society becomes more advanced, it will seek out ways to keep its populace safe and secure. Sometimes, this may go too far—the seemingly inexorable slide of our own world into a surveillance state is an example—but one can hope the mages are smarter.

Safe and sound

If you’ll recall, our magical kingdom is, technologically speaking, still in the late medieval era. The added magic, however, is bringing it up to near-modern levels. Part of that advancement is in making people safer. If you do that, they live longer, healthier, better lives. They become more productive, and you eventually get that positive reinforcement that can explode into modernity. All you have to do is take some of the danger out of the world. Once the existential threats are gone, people can begin to make themselves better.

Let’s make a language, part 19a: Plants (Intro)

Plants are everywhere around us. Grass, trees, flowers…you can’t get away from them. We eat them, wear them, and write on them. Growing them for our own use is one of the markers of civilization. Which plants a culture uses is as defining as its architectural style…or its language.

Every language used by humans will have an extensive list of plant terms. It’ll have names for individual plants, names for collections of them, names for parts of them. How many? Well, that depends. To answer that question, we’ll need to do a little worldbuilding.

The easy method

If you’re creating a modern (or future) language intended to be spoken by everyday humans, your task is fairly easy. All you have to do is borrow plant terms from one of the major languages of the day: English, Spanish, etc. Or you can use the combination of Latin and Greek that has served the West so well for centuries. Either way, an auxlang almost doesn’t need to make its own plant words.

Even naturalistic languages set in modern times can get away with this a bit. Maybe some plant terms have “native” words, but most of the rest are imported, just like the plants themselves. You could have native/loanword pairs, where the common folk use one word, but educated or formal contexts require a different one.

Harder but better

The further you get from modern Earth, the harder, but ultimately more rewarding, your task will be. Here’s where you need to consider the context of your language. Where is it spoken? By whom? And when? How much of the world do its speakers know?

Let’s take a few examples. The grapefruit is a popular fruit, but its history only extends back to the 1700s. A “lost” language in medieval Europe wouldn’t know of it, so they wouldn’t have a word for it. (Which is probably close to why it received the rather generic name of “grapefruit” in the first place.) Coffee, though grown in Colombia today, is native to the Old World, so ancient Amazonians would have never seen it. It wouldn’t be part of their world, so it wouldn’t get a name. Conversely, potatoes and tomatoes are American-born; you’d have to have a really good reason why your hidden-in-the-Caucasus ancient language has words for them.

For alien planets, it’s even worse. Here, you don’t even have the luxury of borrowing Earth names. But that also gives you the ultimate freedom in creating words. And that leads us to the next decision: which plants get which names.

Making your own

Remember this one general principle: common things will have common names. The more “outlandish” something is, the more likely it will be represented by a loanword. Also, the sheer number of different plants means that only a specific subset will have individual words. Most will instead be derived. In English, for example, we have the generic berry, describing (not always correctly) a particular type of fruit. We also have a number of derived terms: strawberry, blueberry, raspberry, huckleberry, and so on. Certain varieties of plants can even get compound names that are descriptive, such as black cherries; locative, like Vidalia onions; or (semi-)historical, such as Indian corn.

Plants often grow over a wide area, so it stands to reason that there will be dialectal differences. This provides an element of depth, in that you can create multiple words for the same plant, justifying them by saying that they’re used by different sets of speakers. Something of an English example is corn itself. In England, “corn” is a general term referring to a grain. For Americans, it’s specifically the staple crop of the New World, scientific name Zea mays. Back across the pond, that crop is instead called maize, but the American dialect’s “maize” tends to connote less-cultivated forms, such as the multicolored “Indian corn” associated with Thanksgiving. Confusing, I know, but it shows one way the same plant can get two names in the same language.

The early European explorers of America had the same problem a budding conlanger will have, so we can draw some conclusions from the way they did it. Some plants kept their native names, albeit in horribly mangled forms; examples include cocoa and potato. Some, such as tomatillo (Spanish for “little tomato”), are derived from indigenous terms. A few, like cotton, were named because they were identical or very close to Old World plants; the Europeans just used the old name for the new thing. Still others got the descriptive treatment, where they were close enough to a familiar plant to earn its name, but with a modifier to let people know it wasn’t the same as what they were used to.

The other side

In the next two entries, we’ll see what words Isian and Ardari use for their flora, and then it’s on to the other side of the coin, the other half of the couple. Animals. Fauna. Whatever you call them, they’re coming up soon.

Let’s end threads

Today’s computers almost all have multiple cores. I don’t even think you can buy a single-core processor anymore, at least not one meant for a desktop. More cores means more things can run at once, as we know, and there are two ways we can take advantage of that. One is easy: run more programs. In a multi-core system, you can have one core running the program you’re using right now, another running a background task, and two more ready in case you need them. And that’s great, although you do have the problem of only one set of memory, one hard drive, etc. You can’t really parallelize those.

The second option uses the additional cores to run different parts of a single application. Threads are the usual way of doing this, though they’ve been around far longer than multi-core processors. They’re more of a general concurrency framework. Put a long-running section of code in its own thread, and the OS will make sure it runs. It won’t block the rest of the program. And that’s great, too. Anything that lets us fully utilize this amazing hardware we have is a good thing, right?

Well, no. Threads are undoubtedly useful. We really couldn’t make modern software without them. But I would argue that we, as higher-level programmers, don’t need them. For an application developer, the very existence of threads should be an implementation detail in the same vein as small string optimizations or reference counting. Here’s why I think this.

Threads are low-level

The thread is a low-level construct. That cannot be denied. It’s closer to the operating system layer than the application layer. If you’re working at those lower levels, then that’s what you want, but most developers aren’t doing that. In a general desktop program or mobile app, threads aren’t abstract enough.

To put it another way: C++ bears the brunt of coders’ ire at its “manual” memory management. Unlike a C# or Java programmer, a C++ programmer does need to understand the lifecycle of data: when it is constructed and destructed, and what happens when it changes hands. But few complain about keeping track of a thread’s lifecycle, which is essentially the same problem.
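For a sense of what that bookkeeping looks like, here’s a minimal Python sketch of a hand-managed thread. The `slow_sum` function and the `results` dictionary are stand-ins invented for this example; the point is only that creating the thread, waiting for it, and fetching its answer are all the programmer’s job.

```python
import threading

results = {}

def slow_sum(limit):
    """Stand-in for a long-running job; it writes its answer into shared state."""
    results["total"] = sum(range(limit))

worker = threading.Thread(target=slow_sum, args=(1_000_000,))
worker.start()   # we create the thread...
# ...the main program is free to do other work here...
worker.join()    # ...and we must remember to wait before reading the result
print(results["total"])
```

Forget the `join()` call and the final line may run before the answer exists. Nothing in the thread itself warns you about that.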

Manual threading is error-prone

This comes from threads being low-level because, as any programmer will tell you, the lower you are in the “stack”, the more likely you’ll unwittingly create bugs. And there might not be any bigger source of bugs in multithreaded applications than the threading itself.

Especially in C++, but by no means unheard of in higher-level languages, threading leads to all sorts of undefined or ill-defined behavior, race conditions, and seemingly random bugginess. Because threads are scheduled by the OS, they’re out of your control. You don’t know what’s executing when. End a thread too early, for example, and your main program could try reading data that’s no longer there. And that can be awfully hard to detect with a debugger, since the very act of running something in debug mode will change the timing, the scheduling.

Sharing state sucks

In an ideal situation, one that the FP types tell us we should all strive for, one section of code won’t depend on any state from any other. If that’s the case, then you’ll never have a problem with memory accesses between threads, because there won’t be any.

We code in the real world, however, and the pure-functional approach simply does not work everywhere. But the alternative—accessing data living in one thread from another—is a minefield full of semaphores and mutexes and the like. It’s so bad that processors have implemented “atomic” memory access instructions just for this purpose, but they’re no magic bullet.
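Here’s about the smallest example of that minefield I can come up with, sketched in Python: four threads bumping one shared counter, with a mutex (`threading.Lock`) guarding the update. The `Counter` class and `work` function are hypothetical names for this illustration.

```python
import threading

class Counter:
    """Shared state guarded by a mutex."""
    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment(self):
        # Only one thread at a time may run the read-modify-write below.
        with self._lock:
            self.value += 1

def work(counter, times):
    for _ in range(times):
        counter.increment()

counter = Counter()
threads = [threading.Thread(target=work, args=(counter, 10_000)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter.value)  # 40000
```

Take away the lock, and the total is no longer guaranteed, because two threads can read the same old value and both write back the same new one. Every piece of shared state needs this kind of care.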

Once again, this is a function of threads being “primitive”. They’re manually operated, with all the baggage that entails. In fact, just about every problem with threads boils down to that same thing. So then the question becomes: can we fix it?

A better way

Absolutely. Some programming languages are already doing this, offering a full set of async utilities. Generally, these are higher-level functions, objects, and libraries that hide the workhorse threads behind abstractions. That is, of course, a very good thing for those of us using that higher level, those who don’t want to be bogged down in the minutiae of threading.

The details differ between languages, but the usual idea is that a program will want to run some sort of a “task” in the background, possibly providing data to it at initialization, or perhaps later, and receiving other data as a result. In other words, an async task is little more than a function that just happens to run on a different thread, but we don’t have to care about that last part. And that is the key. We don’t want to worry about how a thread is run, when it returns, and so on. All we care about is that it does what we ask it to, or else it lets us know that it can’t.

This async style can cover most of the other thread-based problems, too. Async tasks only run what they’re asked, they end when they’re done, and they give us ways (e.g., futures) to wait for their results before we try to use them. They take care of the entire lifecycle of threading, which we can then treat as a black box. Sharing memory is a bit harder, as we still need to guard against race conditions, but it can mostly be automated with atomic access controls.
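As one example of the style, here’s a short sketch using Python’s standard `concurrent.futures` module. The two task functions, `word_count` and `broken_task`, are made up for the demonstration; the pool, `submit`, and `result` calls are the library’s own.

```python
from concurrent.futures import ThreadPoolExecutor

def word_count(text):
    """Stand-in for a background task: takes data in, hands a result back."""
    return len(text.split())

def broken_task():
    raise ValueError("could not finish")

with ThreadPoolExecutor(max_workers=2) as pool:
    good = pool.submit(word_count, "threads are an implementation detail")
    bad = pool.submit(broken_task)

    print(good.result())        # waits for the task, then gives 5
    try:
        bad.result()            # the task's exception is re-raised here
    except ValueError as err:
        print(f"task failed: {err}")
```

There are threads in there somewhere, but the code never starts, joins, or names one. It asks for work to be done and gets back either an answer or an error.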

A message-passing system like the ones Scala and Erlang use can even go beyond this, effectively isolating the different tasks to an extent resembling that of the pure-FP style. But even in, say, C++, we can get rid of most direct uses of threads, just like C++11 removed most of the need for raw pointers.
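The message-passing idea can be approximated even without an actor library. This Python sketch uses the standard `queue.Queue` as a mailbox, which is only a rough stand-in for what Erlang or Scala actually provide; the `doubler` worker is a hypothetical example.

```python
import queue
import threading

def doubler(inbox, outbox):
    """Worker that owns no shared state; it only reads and writes messages."""
    while True:
        message = inbox.get()
        if message is None:      # sentinel: no more work
            break
        outbox.put(message * 2)

inbox, outbox = queue.Queue(), queue.Queue()
worker = threading.Thread(target=doubler, args=(inbox, outbox))
worker.start()

for number in (1, 2, 3):
    inbox.put(number)
inbox.put(None)                  # ask the worker to stop
worker.join()

replies = [outbox.get() for _ in range(3)]
print(replies)  # [2, 4, 6]
```

The worker and the main program never touch the same variable directly, so there’s nothing to lock. All the synchronization lives inside the queues.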

On the application level, in 2016, there’s no reason a programmer should have to worry about manual memory allocation, so why should we worry about manual thread allocation? There are better ways for both. Let’s use them.

Building aliens: environment

Everything that lives lives somewhere. All organisms exist in an environment of some sort. It may not necessarily be what we think of when we hear the word “environment”, but that’s merely our human bias creeping in. Animals live in a specific environment. So do plants. So do extremophile bacteria, though theirs and ours have essentially nothing in common. Aliens, too, will be found in a certain environment, but which one is very dependent on their evolution.

The nature of Nature

For a long time, scientists and philosophers wrestled with the question of how much an organism’s environment affects its life, the so-called “Nature vs. Nurture” debate. We know now that there is no debate, that both have an impact, but let’s focus on the Nature half for now.

We, as humans, live mostly in temperate and tropical climates with moderate to heavy rainfall. We’re adapted to a fairly narrow band of temperatures, but our technology—clothes, air conditioning, etc.—augments our ability to survive and thrive in more hostile environments. Indeed, technology has let us travel to nearly lifeless regions, such as deep, dry deserts like the Atacama, the frozen wastes of interior Antarctica, and that most deadly environment of all: space.

But puny little us can’t live in such places. Not by ourselves. Other organisms are the same way, and they don’t have the benefit of advanced life-support machinery. So most of them are stuck where they are. Look through history, and you’ll see numerous accounts of wild animals (and indigenous people!) being captured and returned to an explorer’s homeland, where they promptly die.

Now, evolution’s very premise, natural selection, says that the most successful organisms are those best adapted to their environment. Thus, for an alien species, you want to know where it lives, because that will play a role in determining how viable your alien is. An aquatic animal isn’t going to survive very long in a rain-shadow desert. Jungle trees won’t grow at 60° latitude. And the list goes on.

Components of an environment

A few factors go into describing the kind of life that can exist in a specific environment, or biome. Most of these boil down to getting the things life needs to perform its ultimate goals: survival and reproduction. For instance, all kinds of life require some form of energy. Plants get it from sunlight and photosynthesis, while animals instead eat things. The environment serves as a kind of backdrop, but it’s also an integral part of an organism’s survival, which is why life’s goals are better served by becoming more adapted.

On a more useful level, however, we can look at a biome as an area having the following characteristics in about the same quantities:

  1. Temperature: Most species can only live effectively at a certain temperature. Too low, and things start to freeze; too high, and they boil. On Earth, of course, water is the primary limiting factor for temperature, though truly alien (i.e., not water-based) life will be constricted to somewhere near the range of its preferred chemical. (Not to say that freezing temperatures are an absolute barrier to life; penguins live just fine in subzero temps, for example.)

  2. Sunlight: This is the “energy” component I mentioned earlier. Assuming we’re dealing with a surface-dweller, sunlight is likely going to be the main type of incoming energy. That’s especially true for plants or other autotrophs, organisms which produce their own food. As any horticulturist knows, most plants are also highly adapted to a certain amount of sunlight. They’ll bloom only when the day is long enough, for example, or they’ll die if the nights grow too long, even if the temperature stays just fine.

  3. Proximity to water: I was going to label this as “precipitation”, but that turns out to be too specific. Water (or whatever your aliens use) is a vital substance. Every species requires it, and many absolutely must have a certain amount of it. If they, like plants, can’t move, then they must rely on water coming to them. That can fall from the sky as precipitation, or it can come across land in the form of tidal pools, or just about any other way you can think of.

  4. Predators and prey: If you remember old science classes, you know about the food chain. Well, that’s something all life has to worry about, if you’ll pardon the anthropomorphizing. Predators adapt to the presence of certain kinds of prey, and vice versa. Take one away, and things go out of whack. Species can overrun the land or go extinct.

Humans get away with a lot in this. Once again, that’s because of our intelligence and technology, and it’s reasonable to assume that a sapient alien race would overcome their own obstacles in much the same way. But everything else has to limp along without the benefits of higher thinking, so other species must adapt to their environment, rather than, essentially, bringing their own with them.

Great upheaval

All environments are constantly in flux. Climate changes, from season to season or millennium to millennium. Rainfall patterns shift, oceanic currents move, and that’s before you get into anything that may be caused by humanity. Then there are “transient” changes in environment, from wildfires to hurricanes to asteroid impacts. These can outright destroy entire habitats, entire biomes, but so can the slower, more gradual shifts. Those just give more warning.

When the environment changes beyond the bounds of a species, one of two things can happen. That species can adapt, or it will die. History and prehistory are littered with examples of the latter, from dodos to dire wolves. Adaptation, on the other hand, can often give rise to entirely new species, distinct from the old. (For an example, take any extant organism, because that’s how evolution works.)

An alien race will have its own history of environmental upheaval, entirely different from anything on Earth. A different series of major impacts, larger tidal effects from a bigger moon, massive solar flares…and that’s just the astronomical effects. Aliens will be the result of their own Mother Nature.

That’s where they become different. Even if they’re your standard, boring carbon-based lifeforms, even if their “animal” kingdom looks suspiciously like an alternate-color version of ours, they can still be inhuman. On Earth, one branch of the mammalian tree gave rise to primates, some of which got bigger brains. On another world, it could have been the equivalent of reptiles instead. Or birds. Or plants, but I’m not exactly sure how that’d work. One thing’s for sure, though: they’ll live somewhere.

On sign languages

Conlangs are, well, constructed languages. For the vast majority of people, a language is written and spoken, so creators of conlangs naturally gravitate towards those forms. But there is another kind of language that does not involve the written word, does not require speech. I’m talking, of course, about sign language.

Signing is not limited to the deaf. We all use body language and gesture every day, from waving goodbye to blowing kisses to holding out a thumbs-up or peace sign. Some of these signs are codified as part of a culture, and a few can be quite specific to a subculture, such as the “hook ’em horns” gesture that is a common symbol of the University of Texas.

Another example of non-deaf signing is the hand signals used by, for example, military and police units. These can be so complex that they become their own little jargon. They’re not full sign languages, but they work a bit like a pidgin, taking in only the minimum amount of vocabulary required for communication.

It’s only within the community of the hearing-impaired that sign language comes into its own, because we’re talking about a large subset of the population with few other options for the communication that human civilization depends on. But what a job they have done. American Sign Language is a complex, fully-formed language, one that is taught in schools, one learned by children the same as any spoken tongue.

Conlangs come in

So speaking with one’s body is not only entirely possible, but it’s also an integral part of speaking for many people. (The whole part, for some.) Where does the art of constructing languages come in? Can we make a sign conlang?

Of course we can. ASL was constructed, as are many of the other world sign languages. All of them have a relatively short history, in fact, especially when compared to the antiquity of some natural languages. But there are a few major caveats.

First, sign languages are much more difficult to comprehend, at least for those of us who have never used one. Imagine trying to develop a conlang when you can’t speak any natlang. You won’t get very far. It’s the same way for a non-signer who would want to create a sign language. Only by knowing at least one language (preferably more) can you begin to understand what’s possible, what’s likely, and what’s common.

Second, sign languages are extremely hard to describe in print. ASL has transcription schemes, but they’re…not exactly optimal. Your best bet for detailing a sign conlang might actually be videos.

Finally, a non-spoken, non-written language will necessarily have a much smaller audience. Few Americans know ASL on even the most rudimentary level. I certainly don’t, despite decades of alphabet handouts from local charities and a vain attempt by my Spanish teacher in high school to use signing as a mnemonic device. Fewer still would want to learn a sign language with even less use. (Conlangers in general, on the other hand, would probably be as intrigued as for any new project.)

Limits

If you do want to try your hand at a sign conlang, I’m not sure how helpful I can be. I’ll try to give a few pointers, but remember what I said above: I’m not the best person to ask.

One thing to keep in mind is that the human body has limits. Also, the eye might be the most important organ for signing. A sign that can’t be seen is no better than a word you don’t speak. Similarly, it’s visual perception that will determine how subtle a signing movement can be. This is broadly analogous to the difference in phonemes, but it’s not exactly the same.

Something else to think about: signing involves more than the hands. Yes, the position and orientation of the hands and fingers are a fair bit of information, but sign languages use much more than that. They also involve, among other things, facial expressions and posture. A wink or a nod could as easily be a “phoneme” of sign as an outstretched thumb.

Grammar is another area where signing can feel strange to those used to spoken language. ASL, for example, doesn’t follow normal English grammar to the letter. And there’s no real reason why it should. It’s a different language, after all. And the “3D” nature of sign opens up many possibilities that wouldn’t work in linear speech. Again, it’s really hard to give more specific advice here, but if you do learn a sign language, you’ll see what I’m saying. (Ugh. Sorry about the pun.)

Celebrating differences

Conlanging is art. Just as some artists work with paint and canvas, others sculpture or verse, not all conlangers have to be tied to the spoken and written varieties of language. It’s okay to be different, and sign languages are certainly different.

Software internals: associative arrays and hashing

We’ve already seen a couple of options for storing sequences of data in this series. Early on, we looked at your basic array, for example, and then we later studied linked lists. Both of those are good, solid data structures. Linear arrays are fast and cheap; linked lists are great for storing an arbitrary amount of data for easy retrieval. There’s a reason these two form the backbone of programming.

They do have their downsides, however. Linear arrays and lists both suffer from a problem with indexing. See, a piece of data in an array is accessed by its index, a number you have to keep track of if you ever want to see that data again. With linked lists, thanks to their dynamic structure, you don’t even get that. Finding a value in a list, if you don’t already know where it is, means walking the whole thing node by node.

By association

What we need is a structure that gives us the flexibility of a list but the indexing capabilities of an array. Oh, and can we have an array whose indexes are keys we decide, instead of just numbers? Sure, and I bet you want a pony, too. But seriously, we can do this. In fact, your programming language of choice most likely already does it, and you just never knew. JavaScript calls them “objects”, Python “dictionaries”, C++ “maps”, but computer science generally refers to this structure as the associative array. “Associative”, because it associates one value with another. “Array” tells you that it functions more like a linear array, in terms of accessing data, than a list. But instead of computer-assigned numbers, the indexes for an associative array can be just about anything: integers (of your choosing), strings, even whole objects.
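As a quick illustration of the built-in version, here is a small sketch using JavaScript’s Map, which arrived in the same ES6 update used elsewhere in this post. The variable names are made up for the example; only the Map methods themselves are part of the language.

```javascript
// A built-in associative array: keys can be strings,
// numbers of your choosing, or even whole objects.
const table = new Map();

const point = {x: 1, y: 2};   // hypothetical object used as a key

table.set("foo", 42);
table.set(1000, "a number key");
table.set(point, "an object key");

const foo = table.get("foo");           // 42
const missing = table.get("bar");       // undefined: no such key
const hasPoint = table.has(point);      // true

table.delete("foo");
const sizeAfterDelete = table.size;     // 2
```

Note that object keys are matched by identity, so a different object with the same contents would not find the entry stored under point.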

The main component of an associative array is the “mapping” between keys (the indices used to access the data) and the values (the actual data you’re storing in the array). One simple way to create this mapping is by making a list of key-value pairs, like so:

class AssociativeArray {
    constructor() {
        // Pairs are objects with 2 attributes
        // k: key, v: value
        this.pairs = [];
    }

    get(key) {
        let value = null;
        for (let p of this.pairs) {
            if (p.k === key) {
                value = p.v;
                break;
            }
        }

        return value;
    }

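    insert(key, value) {
        // Check for the key itself; testing the stored value
        // would treat falsy values (0, "", false) as missing
        if (!this.pairs.some(p => p.k === key)) {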
            // Add value if key not already present
            this.pairs.push({k: key, v: value});
        } else {
            // *Update* value if key present
            this.pairs = this.pairs.map(
                p => (p.k === key) ?
                    {k: key, v: value} :
                    p
            );
        }
    }

    remove(key) {
        this.pairs = this.pairs.filter(p => (p.k !== key));
    }
}

(Note that I’m being all fancy and using ES6, or ES2015 if you prefer. I’ve also cheated a bit and used builtin Array functions for some operations. That also has the dubious benefit of making the whole thing look more FP-friendly.)

This does work. We could create a new instance of our AssociativeArray class, call it a, and add values to it like this: a.insert("foo", 42). Retrieving them is as easy as a.get("foo"), with a null result meaning no key was found. (If you want to store null values, you’d have to change this to throw an exception or something, which is probably better in the long run, anyway.)

The problem is, it’s not very fast. Our simple associative array is nothing more than a glorified linked list that just so happens to store two values at each node instead of one. For small amounts of data, that’s perfectly fine. If we need to store a lot of information and look it up later, then this naive approach won’t work. We can do better, but it’ll take some thought.

Making a hash of things

What if, instead of directly using the key value, we computed another value from it and used that instead? That might sound silly at first, but think about it. The problem with arbitrary keys is that they’re, well, arbitrary. We have no control over what crazy ideas users will come up with. But if we could create an easy-to-use number from a key, that effectively transforms our associative array into something closer to the familiar linear array.

That’s the thinking behind the hash table, one of the more popular “back ends” for associative arrays. Using a special function (the hash function), the table computes a simple number (the hash value or just hash) from the key. The hash function is chosen so that its output isn’t entirely predictable—though, of course, the same input will always return the same hash—which distributes the possible hash values in a random-looking fashion.

Hashes are usually the size of an integer, such as 32 or 64 bits. Obviously, using these directly is not an option for any realistic array, so some sacrifices have to be made. Most hash tables take a hybrid array/list approach. The hash “space” is divided into a number of buckets, chosen based on needs of space and processor time. Each bucket is a list (possibly even something like what we wrote above), and which bucket a value is placed in is determined by its hash. One easy option here is a simple modulus operation: a hash table with 256 buckets, for instance, would put a key-value pair into the bucket corresponding to the last byte (hash % 256) of its hash.
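To make the modulus step concrete, here’s a minimal sketch. The function name bucketIndex is my own invention for this example, and it assumes the hash is a 32-bit integer that might come back negative, which is why it gets forced to unsigned first.

```javascript
// Map a 32-bit hash value onto one of numBuckets buckets.
function bucketIndex(hash, numBuckets) {
    // >>> 0 reinterprets the hash as an unsigned 32-bit integer,
    // so the remainder can never be negative
    return (hash >>> 0) % numBuckets;
}

const last = bucketIndex(0x12345678, 256);   // 0x78, the last byte
const wrapped = bucketIndex(-1, 256);        // 255, not -1
```

With 256 buckets, the result is exactly the last byte of the hash. Any other bucket count works the same way; it just won’t line up with byte boundaries.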

In code, we might end up with something like this:

class HashTable {
    constructor() {
        this._numBuckets = 256;
        // Give every bucket its own array; fill([]) would
        // make all 256 slots share a single one
        this._buckets = Array.from(
            {length: this._numBuckets}, () => []
        );
    }

    _findInBucket(key, bucket) {
        // Returns element index of pair with specified key;
        // bucket is an index into this._buckets
        return this._buckets[bucket].findIndex(p => (p.k === key));
    }

    _update(key, value, bucket) {
        for (let i = 0; i < this._buckets[bucket].length; i++) {
            if (this._buckets[bucket][i].k === key) {
                this._buckets[bucket][i].v = value;
                break;
            }
        }
    }

    set(key, value) {
        // Assumes hashCode() returns a non-negative integer
        let h = hashCode(key);
        let bucket = h % this._numBuckets;
        let posInBucket = this._findInBucket(key, bucket);

        if (posInBucket === -1) {
            // Not in table, insert
            this._buckets[bucket].push({k: key, v: value});
        } else {
            // Already in table, update
            this._update(key, value, bucket);
        }
    }

    get(key) {
        let h = hashCode(key);
        let bucket = h % this._numBuckets;
        let index = this._findInBucket(key, bucket);

        if (index > -1) {
            return this._buckets[bucket][index].v;
        } else {
            // Key not found
            return null;
        }
    }
}

The function

This code won’t run. That’s not because it’s badly-written. (No, it’s totally because of that, but that’s not the point.) We’ve got an undeclared function stopping us: hashCode(). I’ve saved it for last, as it’s both the hardest and the most important part of a hash table.

A good hash function needs to give us a wide range of values, distributed with little correlation to its input values so as to reduce collisions, or inputs leading to the same hash. For a specific input, it also has to return the same value every time. That means no randomness, but the output needs to “look” random, in that it’s uniformly distributed. With a hash function that does this, the buckets will remain fairly empty; optimally, they’d each only have one entry. The worst-case scenario, on the other hand, puts everything into a single bucket, creating an overly complex linked list with lots of useless overhead.
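Here’s a rough sketch of the difference between those two scenarios. Both hash functions are toys I made up for the demonstration (neither is fit for real use), and bucketCounts is a hypothetical helper that just tallies how many keys land in each bucket.

```javascript
// Count how many keys fall into each bucket for a given hash function.
function bucketCounts(keys, hashFn, numBuckets) {
    const counts = new Array(numBuckets).fill(0);
    for (const key of keys) {
        counts[hashFn(key) % numBuckets] += 1;
    }
    return counts;
}

// Toy hash #1: the string's length. Terrible, since all
// same-length keys collide.
const lengthHash = key => key.length;

// Toy hash #2: sum of character codes. Better here, but
// still weak (anagrams always collide).
const sumHash = key => {
    let sum = 0;
    for (let i = 0; i < key.length; i++) {
        sum += key.charCodeAt(i);
    }
    return sum;
};

const keys = ["cat", "dog", "cow", "pig", "hen"];

const clumped = bucketCounts(keys, lengthHash, 8);
// [0, 0, 0, 5, 0, 0, 0, 0]: everything in one bucket

const spread = bucketCounts(keys, sumHash, 8);
// [2, 1, 1, 1, 0, 0, 0, 0]: one collision, otherwise spread out
```

With the length-based hash, every lookup degenerates into a scan of one five-element list. The second hash isn’t good either, but even it shows how spreading keys out keeps each bucket short.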

There are a lot of pre-written hash functions out there, each with its own advantages and disadvantages. Some are general-purpose, while others are specialized for a particular type of data. Rather than walk you through making your own (which is probably a bad idea, for the same reason that making your own RNG is), I’ll let you find one that works for you. Your programming language may already have one: C++, for one, has std::hash.
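If you just want something to plug in for experimenting, one well-known general-purpose choice is 32-bit FNV-1a. The sketch below is only a stand-in, not a recommendation: it assumes keys can be sensibly converted to strings, hashes their UTF-8 bytes, and relies on TextEncoder being available (it is in browsers and recent versions of Node). I’ve called it hashCode purely so it matches the name used in the table code.

```javascript
// 32-bit FNV-1a over the UTF-8 bytes of a key's string form.
// Assumption: String(key) is a meaningful identity for the key.
// Plain objects all stringify to "[object Object]", so they
// would need a different approach.
function hashCode(key) {
    const bytes = new TextEncoder().encode(String(key));
    let hash = 0x811c9dc5;                    // FNV offset basis
    for (const byte of bytes) {
        hash ^= byte;                         // mix in one byte
        hash = Math.imul(hash, 0x01000193);   // multiply by the FNV prime, mod 2^32
    }
    return hash >>> 0;                        // non-negative 32-bit result
}

const emptyHash = hashCode("");    // 0x811c9dc5, the offset basis itself
const aHash = hashCode("a");       // 0xe40c292c
```

Because the result is always a non-negative integer, it can go straight into a modulus to pick a bucket.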

Incidentally, you may have already seen hash functions “in the wild”. They’re fairly common even outside the world of associative arrays. MD5 and SHA-256, among others, are used as quick checksums for file downloads, as the uniform distribution principle of hashing causes small changes in a key to radically alter the final hash value. It’s also big (and bad) news when collisions are found with a new hash algorithm. Very recently (I can’t find the article, or I’d link it here), developers started warning about trusting the “short” hashes used by Git to mark commits, tags, and authors. These are only 32 bits long, instead of the 160 bits of the full SHA-1 hash Git uses by default, so collisions are a lot more likely, and it’s not too hard to pad out a malicious file with enough garbage to give it the same short hash as, say, the latest Linux kernel.
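To see that “small change, radically different hash” effect for yourself, here’s a short sketch using the crypto module that ships with Node. It assumes you’re running under Node rather than in a browser, and sha256Hex is just a name I picked for the helper.

```javascript
const crypto = require("crypto");

// Hex-encoded SHA-256 digest of a string.
function sha256Hex(text) {
    return crypto.createHash("sha256").update(text).digest("hex");
}

// Two inputs that differ by a single character
const first = sha256Hex("hello world");
const second = sha256Hex("hello worle");

// A "short" hash: the first 8 hex digits, or 32 bits
const shortFirst = first.slice(0, 8);
const shortSecond = second.slice(0, 8);

console.log(first);
console.log(second);
console.log(shortFirst, shortSecond);
```

The two full digests share no obvious pattern despite the nearly identical inputs. The short versions keep only 32 of the 256 bits, which is why they are so much easier to collide on purpose.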

Summing up

Hash tables aren’t the only way to implement associative arrays. For some cases, they’re nowhere near the best. But they do fill the general role of being good enough most of the time. Computing hash codes is the most time-consuming part, with handling collisions and overfull buckets following after that. Unfortunately for higher-level programmers, you don’t have a lot of control over any of these aspects, so optimizing for them is difficult. However, you rarely have to, as the library writers have done most of that work for you. But that’s what this series is about. You may never need to peer into the inner workings of an associative array, but now you’ve got an idea of what you’d find down there.