2023 Projects

I’m constantly dreaming up new ideas for side gigs and hobby projects. Anyone who read my posts before April 2021 knows that all too well. Lately, as my current job has begun to wind down and my relationship seems to be nearing a plateau, my brain has decided to kick back into high gear on this front. So here are some of the things I’m thinking about with my spare mental cycles. Some of them I’ll get to eventually. Some I’m already planning out. A few will likely never see the light of day.

Borealic

I haven’t done much with conlangs in the past couple of years. A few months back, I had another aborted start on an "engineered" language, this one based on a ternary number system. (The idea was to make something philosophical but also easily representable without words. I’m weird.)

Now, I’m doing serious work on what is my first real attempt at an auxiliary language. There are plenty of auxlangs already out there, of course: Esperanto, Lojban, and so on. Mine is slightly different, however. Instead of drawing on Latin as the primary source of vocabulary—or being some sort of amalgam of the world’s major languages—I’m developing a conlang intended as a pan-Germanic interlingua.

The core vocabulary is derived from actual Proto-Germanic roots, most of which are shared by at least two of the six major Germanic languages spoken today. Those are English, German, Dutch, Danish, Norwegian, and Swedish, for those of you keeping score at home. Icelandic, Frisian, and the other "minor" Germanic tongues also get their due, mostly as additional confirmation of a meaning that has drifted over the past 2500 years or so. (Gothic has been extinct basically forever, so I exclude it from consideration.)

In terms of grammar, "Borealic" (the external name; it calls itself "Altidisk") mostly follows the general pattern of West Germanic and North Germanic languages. Where these differ, I look for common ground, and I try going back to a common ancestor for inspiration. The basic word order, for example, is V2: verbs always try to fill the second slot in a sentence if possible. That’s a common theme throughout the Germanic world. So is a two-way tense distinction between past and non-past, with the future tense instead being indicated by an auxiliary verb.

My goal isn’t necessarily to create a conlang for everybody to use. No, this one is explicitly intended for purposes best described as nationalistic. Borealic is for the Germanic peoples of the world. It’s a way to connect with our shared culture, a culture that is increasingly under attack these days.

Borealic is what I’m working on as I write this post, so it’s the one I’ll probably be sharing soonest.

Word games

I still want to be a game developer, and I’m still working towards that goal. I have two concepts I’ve been fleshing out in my head, and I’m getting ready to start making something more concrete out of them.

First is "Fourwords". At its core, this is going to be a simple little fill-in word puzzle. Instead of a crossword, however, you get a chain of four different words. The last letter of one word is the first letter of the next, and all the words in a chain are connected by a theme which the player will see while working the puzzle. You get points based on the length of each word (lengths aren’t fixed; they vary from 4 to 12 letters) and the perceived difficulty of the chain: more generic categories are considered harder, as are those for very specific niches.
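A minimal Python sketch of the chain rules described above. The names `is_valid_chain` and `chain_score` are placeholders, and the multiplier-based scoring is an assumption for illustration; the post only says that points depend on word length and perceived difficulty.

```python
def is_valid_chain(words, min_len=4, max_len=12, chain_len=4):
    """Check one candidate chain against the Fourwords rules."""
    if len(words) != chain_len:
        return False
    words = [w.lower() for w in words]
    # Every word must fall inside the allowed length range.
    if any(not (min_len <= len(w) <= max_len) for w in words):
        return False
    # The last letter of each word must be the first letter of the next.
    return all(a[-1] == b[0] for a, b in zip(words, words[1:]))


def chain_score(words, difficulty=1.0):
    """Hypothetical scoring: total letters scaled by a difficulty multiplier."""
    return round(sum(len(w) for w in words) * difficulty)
```

For an "animals" theme, `["tiger", "rabbit", "turtle", "eagle"]` passes the check, since each word starts with the letter the previous one ends on. Theme matching is left out, because that part needs a curated word list rather than code.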

I envision Fourwords as a mobile-first game. In other words (no pun intended), there will be sets of puzzles that unlock as the player progresses. I’ll have plenty of gamification elements thrown in there, and, as much as I hate it, probably some kind of built-in ad or IAP support. I’ll build it using the new 4.x version of the Godot Engine, which will be my first real foray into its new features. I imagine also needing a server to store player data and all that. Lucky for me, my "real" job requires me to learn AWS.

The second word game is much simpler, yet also much more complex. This one doesn’t have a name yet, and it’s little more than a Wordle clone at heart. It’s a Mastermind-like game using words of five or six letters; I haven’t decided which would work best. You have a secret word, and you have to try to guess what it is. If you’re right, you win! If you’re wrong, you get to see which letters are correct, and which ones are in the wrong places. Scoring is based on how many guesses you make and how long it takes you to get to the right word.
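The guess feedback described above can be sketched in Python as follows. This is only an illustration: the function name `score_guess` and the three mark labels are made up here, and the handling of repeated letters (each letter of the secret can justify at most one mark) is an assumption borrowed from how Mastermind-style games usually work. The real thing would be 6502 assembly, but the logic would be the same.

```python
from collections import Counter


def score_guess(secret, guess):
    """Return one mark per guessed letter: 'correct', 'misplaced', or 'absent'."""
    if len(secret) != len(guess):
        raise ValueError("guess and secret must be the same length")
    marks = ["absent"] * len(guess)
    # Secret letters that were not matched exactly remain available
    # to justify a 'misplaced' mark later on.
    remaining = Counter(s for s, g in zip(secret, guess) if s != g)
    # First pass: right letter in the right place.
    for i, (s, g) in enumerate(zip(secret, guess)):
        if s == g:
            marks[i] = "correct"
    # Second pass: right letter in the wrong place, consuming each
    # unmatched secret letter at most once.
    for i, g in enumerate(guess):
        if marks[i] != "correct" and remaining[g] > 0:
            marks[i] = "misplaced"
            remaining[g] -= 1
    return marks
```

Doing the exact matches first matters: otherwise a repeated letter early in the guess could use up a letter that a later, correctly placed copy should have claimed.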

Since there are only so many words in the English language, this one necessarily has a well-defined endpoint. But I figure I can add in a timed mode with randomization to keep things a little fresh. Beyond that, the format doesn’t have much else going for it.

But here’s the kicker. This one isn’t going to come out on mobile. It’s not going to be on desktop, either. No, I want to make this game for a console. And not just any console, but a retro one. I must be getting crazy in my old age, because I am seriously considering making a game for the NES. That means 6502 assembly, low-res tile graphics, music that is more code than notes, and all those arcane incantations that game devs used to do. It’ll be a monumental undertaking, but what if I can pull it off?

Adventure

I’ve started writing again in recent weeks. Time is short, but I’ve been able to find an hour here and there to get back to On the Stellar Sea. Those poor kids have had to stay on that planet too long!

Writing on Orphans of the Stars has made me want to go back to the project I had originally imagined would accompany it. This one is almost another game dev project, but of a different sort. The Anitra Incident is technically a prequel to the novel series, but it’s one I plan to write as interactive fiction. In other words, you are the protagonist. The setting is about 200 years in the future, when humanity’s lunar and Mars colonies are up and running, and we now turn our eyes outward. A strange Main Belt asteroid catches our eye, and a manned mission is sent to explore it. What they—you—find will shock everyone.

That’s the gist of it. It’s kind of a CYOA game, kind of an exercise in descriptive writing, and hopefully a lot of fun. And the books have already referenced this particular era of the setting’s history, so part of me feels I have to write it. I’ll need to relearn SugarCube, I suppose. Graphics should be a lot easier now, thanks to Stable Diffusion. I may even be able to do character portraits, something I never imagined I would be capable of. (That’s no joke. I’ve had great success generating portraits of some of the Innocence kids, and they make good writing references.)

Never enough

There are plenty of other things my brain has decided to focus on. Pixeme, my community-based language learning web platform idea, is starting to take shape. Concerto is another one I want to play around with some more; it’s a microkernel OS written in Nim, a language I’ve found that I really enjoy. Another one I just named yesterday is Stave: the goal with this one is to create a long-term stable virtual machine. As in really long term. I want to make a VM that will stand the test of time.

But I’ll get to that later. Right now, there’s so much to do, and nowhere near enough time to do it all.

Twospeech: An experiment in English diglossia

The English language is, like so much else in today’s world, in a state of conflict. Especially in America, our language serves two purposes which are distinct and even, in some cases, diametrically opposed. Not only must it serve as a native tongue for the vast majority of inhabitants of numerous countries (the US, the UK, Canada, Australia, South Africa, and so on), but it has also been adopted, in the so-called World English form, as a modern-day lingua franca for most international communications.

Those two purposes, however, work against one another. By serving as an international language, English is diminished as a literary language, for we native speakers lose the connection that every other language allows. Conversely, the “Anglo” cultural connotations present in the language can be seen as a relic of colonialism. Why must speakers of, say, Japanese or Arabic care about how a particular offshoot of the Saxons lived a thousand years ago? On the other hand, why shouldn’t Americans or Canadians have the opportunity to forge a closer cultural bond with each other than they would have with nonnative speakers?

In other words, we have a clash between a culture wanting their own language and a world needing a language without strings attached. But there is an answer.

The lingua franca

In days gone by, people—Europeans, rather—would turn to Latin. The Romans ruled a large swath of Europe, along with parts of North Africa and Asia Minor, and they spread their tongue throughout their realm. Thus, even centuries after their decline and fall, their speech was still seen as a model. It helped, of course, that many of the languages spoken in those regions were descended from Latin: the Romance tongues of French, Spanish, Italian, and so on.

Latin, of course, suffers from numerous problems of its own. It’s a complex, baroque language, and the “New Latin” movement that started shortly after the Renaissance only made the situation worse. On top of that, it is still a human language, associated with a culture.

As that culture is now extinct, we can counter most of the anti-colonialist arguments. Using Latin as a lingua franca doesn’t spread Roman culture any more than using modified Arabic numerals in mathematics spreads Islam. Time and evolution have detached the Latin language from its roots.

To a lesser extent, we can say the same thing for Classical Greek. Here, the situation is murkier. Greek is a living language, spoken (obviously) in Greece. But there are significant differences in phonology, grammar, and lexicon between the writings of Homer or Plato and what’s spoken on the streets of Athens today. In that sense, we can make a lesser argument that Classical Greek is sufficiently acultural to serve as the basis for a global language.

Contenders

One might also consider other possibilities. Chinese script, for instance, spread throughout East Asia, penetrating Korea, Japan, and Vietnam, among other places. Sanskrit is the ancestor of languages spoken by over a billion people, and has a rich literary tradition of its own.

These do have their own problems. Chinese might have a unified script, but this hides a wide range of variation in the spoken form, so much that what Westerners call dialects should, in fact, be treated as languages in their own right. Thus, for a spoken global language, we would have to choose one, and that disadvantages speakers of the others. Mandarin might be the most prominent, but why pick it over Cantonese?

Sanskrit’s daughter languages are even more distinct, much the same as the Romance languages of Europe, so cultural favoritism isn’t as much trouble. Rather, the problem here is one of connotation. In the West, Sanskrit is often considered to be the tongue of mystics and monks at best, New Age pseudoscience at worst. In a quirk of history, its vocabulary didn’t penetrate far enough outside its initial borders to gain global recognition. Thus, we should call it a more distant third choice after Latin and Greek.

Two other contenders, Classical Arabic and Old Church Slavonic, we must also reject due to connotations. In this case, the factors are religious, as they are inextricably linked to Islam and Orthodox Christianity, respectively. As we want to create a world language that respects diverse cultures while promoting none of its own, those best known as liturgical or scriptural won’t work.

English as spoken

Fortunately for our purposes, English already has numerous loanwords and coinages in Latin and Greek. (Most of those coming from Sanskrit and its children are cultural loans such as yoga.) By some estimates, as much as 50% of English text derives from these two languages, and that percentage is even higher in technical and scientific contexts. Modern terms often combine the two, creating forms such as television or hexadecimal, further diluting any connections to the native tongues.

This extensive vocabulary can be the beginning of our world language. Indeed, it already is. Scientific terms built from Latin and Greek roots have been borrowed into languages all over the planet, no matter whether those places and peoples were ever even conceived by the Romans.

Thus, we see one fairly simple path to removing the appropriation and colonialism of English: using and creating new “classical” terms wherever possible. English is a more isolating language, though, meaning that it uses a lot of purely grammatical words. Articles such as the, linking verbs like be and do, and many more have no lexical content at all, so there’s no harm in keeping them. It’s only the “content” words we need to worry about.

Conversely, the “native” form of English should favor native-built content words rather than classical borrowings and neologisms. English-speaking nations and peoples share a culture with a long and storied history, the same as any other on earth. We should maintain it, add to it, without forcing it upon the rest of the world or leaning on others as a crutch.

In time, we would have two different varieties of English. One is the “internal” native tongue, respecting its history and culture without attempting to spread them. The “external” language, by contrast, serves as a truly cosmopolitan manner of speaking, accepting all but favoring none. Rather than a distinction of station, what linguists call register, we would see a dichotomy of inner and outer, effectively two languages, although they would remain very, very close in many ways.

This state is called diglossia, following the “classical” tradition of Latin and Greek neologism. Using a more native approach, we might call it twospeech.

What it’s not

Let’s get this out of the way first. Twospeech is most emphatically not another attempt at linguistic purity, whatever that may be. We’re not trying to remove all traces of foreign influence from English. Instead, the goal is to create a more solid cultural boundary between speakers of Native English and those of World English. On one side, we have the tongue of the common citizen of the United States, England, and other countries where English is the primary language. On the other, we have the citizens of Earth itself, humans of all stripes, who should transcend barriers of race and ethnicity.

What it is, or could be

The “world” form would, most likely, become the educated variant, in much the same way that European university students throughout the Renaissance and Enlightenment periods learned Latin, then used it in scientific publications. We’ll call this style epiglossia (the “over” language) or worldspeech, depending on which form we’re speaking.

“Native” English, however, would be the word on the street. Perhaps it would receive “rural” connotations, but the interconnectivity of today’s world will act as a brake on such tendencies. The key is that this form is only intended for English speakers. Yes, it sometimes brings back old words, including some designed by linguistic purists in the past. It also adds naturalized loans, particularly those from Anglo-Norman or the Viking invasions, and it necessarily must turn to its higher-class sibling for scientific talk. But it retains its own character despite that. We’ll call it demoglossia, the “people’s” language, or kinspeech, to emphasize the shared bond between English speakers.

Epiglossia

Epiglossia, then, has the following characteristics:

  • The lexicon is built mostly from Latin and (Classical) Greek roots, with borrowings from other languages allowed when appropriate, but only if they retain their cultural context. In all cases, phonological considerations should be taken into account, as well as the limitations of our script.

  • Grammar is formal English, as would be used in a research paper, professional speech, or government memorandum. In particular, colloquial phrases should be avoided.

  • Slang, being specific to a subculture, is best omitted, but common abbreviations are allowed. For example, “chem” for chemistry is acceptable.

  • Speakers should take care to consider the audience, using forms such as singular “they” if appropriate.

Demoglossia

Demoglossia charts its own course, with these guiding principles:

  • English grammar remains unchanged, but colloquialisms are allowed for all but the most formal situations, encouraged whenever speakers feel comfortable.

  • Although lexical items can be borrowed from Latin and Greek, as in epiglossia, prefer native constructions and coinages, using roots from Anglo-Saxon, Anglo-Norman (and its related dialects of Old French), and words imported into English-speaking areas historically.

  • Deriving shortened forms from classical roots (including epiglossia itself) is acceptable. Illustrative examples include phone and TV.

  • Creativity is key, as the goal of demoglossia is to embrace the Englishness of our language.

Conclusion

This is just a sketch, but it’s an idea with merit. Many have tried throughout the years to create a “world” language, whether a fixed form of English or an auxiliary language such as Esperanto. Twospeech leans more toward the former, but attacks the problem in a different way, by accepting the inherent conflict between the two ideas of what English should be.

Today, the language is under assault from multiple directions, and these pressures will eventually cause a split or perhaps even the fall of English as the international standard for communication. Embracing the notion that the language we speak to outsiders doesn’t necessarily have to match the one we use among friends is only doing what every other culture has to do on a daily basis. In that regard, Twospeech fosters linguistic equality.

It is true that this proposal doesn’t remove all the parochialism from World English, but it’s a significant step forward. In the coming weeks, I’ll expand upon this initial sketch, because I believe it to be educational, even enlightening.

🖼🗣: the emoji conlang, part 7

Welcome to another chapter in the story of the emoji conlang 🖼🗣. This time around, we’ll cover most of the more complex clauses you’d find in a language, including some that are traditionally considered the hardest to pin down. So let’s get right into it, shall we?

Comparisons

Comparing two things is both easy and common. In English, of course, you use “comparative” forms of adjectives: bigger, stronger, more interesting. 🖼🗣 does things a little differently, however.

First off, there are no special adjective suffixes for comparisons. That fits with the general idea of the conlang as being very isolating. Instead, we use the verb ⬜▫. Normally, it has the meaning “to exceed”, but we can prefix it with an adjective (effectively functioning as an adverb) to create a comparison: 👇 👨 ↕〰 ⬜▫ 👆 👩 “this man is taller than that woman”.

The form, then, is fairly simple. First comes the thing that is being compared. Next is an adjective for the quality being compared. Third on the list is the verb ⬜▫ (which can take suffixes if needed). Last comes the “standard”, the yardstick being measured against.
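As a rough illustration, the four-slot pattern above can be written as a tiny Python helper. The names `EXCEED` and `comparison` are invented for this sketch, and treating each slot as a ready-made string is a simplification.

```python
EXCEED = "⬜▫"  # the verb "to exceed", as given in the text


def comparison(subject, quality, standard, verb_suffix=""):
    """Build a comparison: subject, quality, ⬜▫ (plus any suffix), standard."""
    return " ".join([subject, quality, EXCEED + verb_suffix, standard])


# The example sentence from above, "this man is taller than that woman":
sentence = comparison("👇 👨", "↕〰", "👆 👩")
```

Here `sentence` comes out as the same string as the example in the text. The `verb_suffix` argument is where a tense marker would go if the verb needs one.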

Note that this construction is for actual comparisons only. If you just want an adjective meaning “more of X”, you can simply use the superlative suffix ⬛. It works for both English “more” and “most”.

Causation and purpose

In 🖼🗣, these two concepts are closely related. Something can cause something else to perform an action, or something can perform that action for a reason. Either way, the form is similar, so we’ll treat these two types of clauses together.

First, the simpler purpose clause is just a string of verbs or verb phrases, with objects and the like inserted where they would normally go. So “I went to the store to buy food” becomes 🤳 🛫◀ 🏬 🛍📨 🥘.

Note here that the subject of the second clause is implied. That’s normal. Just having multiple verbs strung together is enough to indicate what we’re talking about. But we can add a subject, too: 🤳 🛫◀ 🏬 🤲 💁 👉▶ 🥘. (That strange 💁 in there will make sense in a minute.) Roughly, this sentence translates as “I went to the store so we’ll have food.”

Now, building off this, we can use the verb ↘ “cause” to create, well, causatives. For instance, ♀ ↘◀ 🤳 🛫 🏬 would mean something like “she made me go to the store”; here, we explicitly indicate the subject in the second clause, showing that it is not the same as in the first.

Finally, two special words work with the purpose clause to add to it. Between the verb phrases, we can add ⤵⌛ or ⤵↘ to express times or reasons, respectively. Here’s an example of each:

  • 🤳 ❔➡ 🚫 🛫◀ 🏫 ⤵⌛ ➡ 🤢. “I couldn’t go to school while I was sick.”
  • 🤳 ❔➡ 🚫 🛫◀ 🏫 ⤵↘ ➡ 🤢. “I couldn’t go to school because I was sick.”
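The clause patterns in this section boil down to "main clause, optional linking word, second clause". Here is a small Python sketch of that idea; the `LINKERS` table and the `link_clauses` function are hypothetical names, and the clauses are treated as plain strings.

```python
# Linking words from the text: one for time, one for reason.
LINKERS = {"time": "⤵⌛", "reason": "⤵↘"}


def link_clauses(main, sub, relation=None):
    """Join two clauses; with no relation, the result is a plain purpose clause."""
    parts = [main]
    if relation is not None:
        parts.append(LINKERS[relation])
    parts.append(sub)
    return " ".join(parts)


main = "🤳 ❔➡ 🚫 🛫◀ 🏫"  # "I couldn't go to school"
sub = "➡ 🤢"               # "(I) was sick"
while_sick = link_clauses(main, sub, "time")
because_sick = link_clauses(main, sub, "reason")
```

Leaving out `relation` simply strings the two verb phrases together, which is how the purpose clause works.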

The topic particle

I promised I’d explain 💁, so here goes. In linguistic terms, it’s a topic particle, sometimes called a topicalizer. If you know Japanese, it should feel familiar, as it functions much like the particle wa (は). If not, read on.

The topic of a sentence is often the same as the subject. In cases where it isn’t, however, or when we want to emphasize it for some reason, we use the topic particle to draw attention to it. Notably, 🖼🗣 uses this in possessive predicates. The formula here is (owner) 💁 ➡ (possession), and we could translate it loosely as “with (owner), there is a (possession)”. Complicated, I know, but you’ll get the hang of it.

Indeed, possessives like this are one of the few cases where the language gives two similar concepts wildly different forms. Compare 🤳 💁 ➡ 🐈 “I have a cat” versus 🤳’🐈 “my cat”. Not nearly the same.

Back to the topic particle, though, because it’s got another use: subjects. Not the grammatical sort, but the discussion sort. If I wanted to say in 🖼🗣 that my favorite food is chicken, for instance, I might type 🥘 💁 🤳 🔘👍 🐔. You can follow the same pattern to express preferences, opinions, ideas, and much more.

Relative clauses

Last, we’ll look at what is traditionally considered one of the most difficult phrases to describe, the relative clause. Fortunately for us, 🖼🗣 makes those fairly easy to start.

Relative clauses always begin with 👈, so if you see that, you know what you’re dealing with. In some cases, you don’t even have to worry about anything else. When the head noun is the same as the subject of the relative clause, you’re done: 👩 👈 🏡➡ 📍 👵 ⬜▫ 🤳, “the woman who lives here is older than me”.

When it’s not the subject, the only thing that changes is an extra pronoun that we add into the relative clause, kind of a placeholder for what we took out. 👨 👈 🤳 👀◀ ♂ means “the man that I saw”, but a more literal translation would be the grammatically incorrect (in English) “the man that I saw him”. If you’ve ever lost yourself in relative clauses, you’ll recognize this one!

That extra pronoun functions exactly as the noun it’s replacing, even in possessive constructions. And pedants will either love or hate the way 🖼🗣 deals with relative nouns in prepositional phrases. Because of this “placeholder”, we have no reason to end a sentence with a preposition: 👇 👈 🤳 ➡ ⬅⬅ ◻ “this is where I’m from” (or, if you must be formal, “this is the place from where I come”).

Conclusion

That’s all for now, but we really have all that we need. Well, except for words. Those are, after all, the meat of a language, so the next part of the series is going to go back to making them. Keep watching, because it’s about to get even more fun!

🖼🗣: the emoji conlang, part 6

It’s time for some more 🖼🗣. Last time around, you may remember that we looked at the vast collection of emoji in the Unicode standard (as of version 12). Most of them, not counting the numerous variations allowed by gender, skin tone, and hair modifiers, have some sort of meaning in our script.

Now it’s time to put them to use in making not just words, but phrases, sentences. We’ve been doing that already, of course; parts 2 and 3 were dedicated to that. Here, though, we’ll delve deeper into the nuances. And we’ll take it one step at a time.

Noun phrases

Conveniently enough, most words in 🖼🗣 are nouns, so we’ve got a lot to work with here. (Since emoji are icons, and it’s a little difficult to have an icon that represents something abstract, it’s only natural.)

To start off, remember that our script doesn’t have articles. There’s no “a” or “the” in 🖼🗣. They’re not needed. (Plenty of languages around the world get by without them, after all.) The meaning is implied; if you really need to specify something definite, then the demonstrative pronoun 👇 can provide a similar function. It’s not exactly the same, as it actually means something closer to “this” than “the”, but you get the idea.

Numerals are another important part of noun phrases. For us, they’re pretty simple: just use them. For “one”, you write 1. Ordinals, as we saw last time, instead use the “keycap” emoji such as 1️⃣. For ordinals greater than 10, you can compound them: “fourteenth” (the day I’m writing this) is 1️⃣4️⃣.
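A short Python sketch of the numeral rule, assuming that "compounding" means one keycap per digit. The function names are made up for illustration. The one fixed fact is the encoding: a keycap emoji is a digit followed by U+FE0F (variation selector-16) and U+20E3 (combining enclosing keycap).

```python
# A keycap emoji is a digit followed by these two code points.
KEYCAP_SUFFIX = "\ufe0f\u20e3"


def cardinal(n):
    """Cardinal numbers are written as ordinary digits."""
    return str(n)


def ordinal(n):
    """Ordinals use one keycap emoji per digit (assumed compounding rule)."""
    if n < 0:
        raise ValueError("ordinals must be non-negative")
    return "".join(digit + KEYCAP_SUFFIX for digit in str(n))
```

With this, `ordinal(14)` gives the keycap for 1 followed by the keycap for 4, while `cardinal(14)` is just the plain digits.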

Everything else is fairly straightforward. Adjectives occur before their head nouns: ⬜ 👨 “a big man”; 🔵 🚗 “the blue car”. Possessives use the apostrophe notation we saw in Part 2, always attaching to the head noun: 3 🤳’🧒 “my three children”; 👇 👴 🤲’🏠 “this old house of ours”. The last traditional component of a noun phrase is a relative clause, which we’ll deal with later.

Before we move on, though, a couple of little extra rules. First, adjectives can’t appear as subjects without a head noun. (They’re fine as predicates, by the way.) Thus, you can use the determiner word ⚪ as a kind of “empty” noun in these cases: 2️⃣ ⚪ “the second one”. This is not the same as converting an adjective to a noun; that’s why ⚪ is a separate word here.

Second, you’re allowed to use a verb as a head noun in a very specific circumstance. Linguists call it an action nominal, but you can think of it as something like the English gerund phrase. It must be as part of a possession construction: 🤳’📖 “my reading”; ♂’🛑 🚗 “his stopping (of) the car”. Somewhat obscure, I’ll admit, but it might come in handy.

Verb phrases

Verbs have quite a bit of variance, as we saw in Part 3. But that’s all inflection. At the phrase level there’s not a lot to them. 🖼🗣 doesn’t do much in terms of verbal grammar, because we’re trying to keep things simple.

That said, we do have a handful of auxiliary verbal words. 🙆 and 🙅 indicate permission and prohibition, respectively; they’re equivalent to English may and may not: 💮 🙅 🛫 “you may not go”. Much to the dismay of students, there’s a different can counterpart, ❔➡. That one is only for ability.

Simple negation uses 🚫, so we might say 👁️‍🗨 🚫 🍴⏯ “I haven’t eaten”.

The imperative is what linguists call direct commands, and we mark it with the suffix ❕, as in 🛑❕ “stop!” Using the appropriate pronouns, we can do a few more tricks with this: 🤲 🛫❕ “let’s go”, 👥 👀❕ “let them see”.

The special compound pronoun 👐↔ means roughly “each other” when used as the object to a verb. We might use it like this: 👨 ➕ 👩 💕➡ 👐↔ “the man and woman love each other”.

Finally, you may be wondering where all the adverbs are. Well, 🖼🗣 doesn’t have a separate class of them. Instead, it just uses adjectives that modify verbs. That’s pretty much what a lot of English speakers do in colloquial language, so it shouldn’t be any problem. ✔ ✍❕ ◻ “write it correctly”, ♀ 👍 🗣 “she speaks well”.

Prepositions

Now we’re only missing one major part of language, and that’s the preposition. Grammatically speaking, those in 🖼🗣 function as adjectives, with the special rule that they always appear at the beginning of a noun phrase; this phrase can then appear after another one or at the end of the sentence, depending on the situation. (It’s not quite free variation, in case you’re wondering. Sentence-final phrases tend to be those that modify a verb.)

Here are some of the most common single-symbol prepositions in our script:

  • ⏩ – “after”
  • 🆚 – “against”
  • ⏪ – “before”
  • ➗ – “between”
  • ⤵ – “in” or “into”
  • ⤴ – “out of”

A lot more are compounds, often using the adjectival suffix 〰:

  • ⬆〰 – “above”
  • ⬇〰 – “below”
  • ⬅⬅ – “from”
  • ➡➡ – “to” or “for”
  • ➕↗ – “with”

Last, and simplest, is the way 🖼🗣 says at: @. You can use this as a normal preposition: 🤳 👉 @ 🎦 “I’m at the theater”. But it also has a secondary use as a kind of attention-getter for speech, in which case it works as a prefix on a head noun: 🤳 💬◀ @♂… “I said to him…” (The intent here is to emulate @-mentions, as on Twitter and Mastodon.)

To be continued

This post covered the most basic sorts of phrases you’ll find in 🖼🗣, but not the only ones. In the next installment, we’ll look into the more complex clauses: relative, purposive, subordinate, and so on. Sounds hard, I know, but never fear. There’s really not that much to them.

🖼🗣: the emoji conlang, part 5

As promised, this edition of our series on the emoji conlang 🖼🗣 (aka Pictalk) is going to be focused primarily on building our vocabulary. You saw last time the ways we can combine symbols to create new words, but we’re first going to look at roots, individual symbols that can be used as words in their own right.

The inventory

As of the recently released Version 12 of the Unicode standard, we have a total of 3,019 emoji at our disposal. That sounds like a lot, for sure, but…it’s not that simple, at least as far as our script is concerned. Gender and skin tone modifiers don’t come into play for us, because their meanings aren’t exactly lexical. (Okay, gender is linguistic, but I’ve decided that it plays no role in 🖼🗣 grammar.) Take those out, take out the various “family” permutations, and do some shuffling, and my best calculation is a total of 1,581.

That’s still a large number, but we’re using quite a lot of them, such as ◻ or ➡, as grammatical particles, suffixes, or other “content-less” morphemes. Also, we’ve got plenty of duplicates, and some, such as the annoying “cat face” emoji, that we just don’t use. What’s left comes out to 1,200 or so symbols, plenty for a vast and diverse vocabulary even before you start compounding.

The roots

We can divide the roots into a number of categories. We’ll look at each of those groups in turn, because they tend to show some similarities. While I won’t describe every emoji in much detail, I hope this overview, along with the examples I give, suffices until I can create a real list.

Faces

Most of the faces (the emoticons, as we old-timers call them) stand for the emotion or state they express:

  • 😄 – happy
  • 😕 – confused
  • 😠 – angry
  • 😫 – tired
  • 😷 – sick

Not all are like this, though. The “basic” face 😀 instead translates as the noun face itself. 😆, 🙃, and 😤 represent verbs laugh, invert, and defeat, respectively. But symbols like these are the exception, and the class-changing suffixes we saw last time work to convert them into something more like their fellows.

Emotions

Unicode is for lovers, apparently, because there’s an awful lot of different hearts. But we’ve got other emotions, too. And most of the hearts turn out to be just color variations; in 🖼🗣, colored versions of an emoji always represent those colors.

The rest tend to be either adjectives describing the emotion or verbs that define an action, although some get more idiosyncratic meanings instead:

  • 💋 – to kiss
  • 💌 – romance
  • 💖 – emotional
  • ❣ – to compliment
  • 💨 – fast
  • 💤 – sleep (note that this is a noun first)

The standard includes a few others in the “emotion” section, namely speech bubbles. These are important as communication words in our script:

  • 💬 – to say
  • 👁️‍🗨 – the 1st-person pronoun “I” (where needed)
  • 🗨 – to reply
  • 🗯 – to shout
  • 💭 – to think

Body parts

Mostly, body part emoji stand for that part of the body, or else the sense it provides:

  • 🧠 – intelligence
  • 👂 – ear
  • 🦴 – bone (this is new, so not all fonts support it)
  • 👁 – eye
  • 👀 – to see
  • 👄 – mouth

The various hand and finger symbols, by contrast, often take on more abstract meanings:

  • 👋 – hello
  • 🖐 – fingers
  • 🎌 – to hope
  • 👉 – to be
  • 👈 – a marker for relative clauses (which we’ll see in a future post)
  • 👆 – that
  • 👇 – this
  • 👍 – good
  • 👎 – bad
  • 🙏 – to pray
  • 🤲 – the 1st-person pronoun “we”

And I think you can guess what 🖕 means.

People

As stated above, 🖼🗣 doesn’t bother with the gender or skin tone modifiers of Unicode. Instead, people are just…people. With very few exceptions, the “person” emoji stand for the specific person represented:

  • 👨 – man
  • 👩 – woman
  • 👶 – baby
  • 🧒 – child
  • 👨‍🎓 or 👩‍🎓 – student
  • 👨‍🎤 or 👩‍🎤 – singer

Some of the exceptions include 🙍, for the verb frown, and 🙅, to indicate prohibition (“may not”, in English).

Also, any of the numerous family permutations is allowed as a substitute for 👪 family. The generic is considered the default, but more specific variants can show a degree of politeness or respect.

Activities

Technically, Unicode classes these as a subset of the “person” group, but they’re very different in our script. For most of these, the meaning is verbal, rather than nominal. Again, gender doesn’t matter, although it can be considered polite to use it where it matters. (Where available, the generic “person” forms are to be preferred as default.)

  • 🚶 – to walk
  • 🏌 – to play golf
  • 🏊 – to swim
  • 🛀 – to wash/bathe
  • 🛌 – to rest

Animals

Unicode has a bunch of animal emoji symbols, and we use almost all of them to represent those animals by themselves. Reduplicated forms (doubling the symbol) form a “pack”, “flock”, or any other collective noun, while the adjective and verb class-changing suffixes form words concerning the nature and actions of each individual animal.

  • 🐕 – dog
  • 🐈 – cat
  • 🐴 – horse
  • 🐁 – mouse
  • 🐔 – chicken
  • 🐳 – whale
  • 🐜 – ant

One of the few exceptions in this class is 🐽, which instead stands for the verb smell.

Plants

Plants aren’t as numerous as animals in the Unicode emoji set, and 🖼🗣 tends to use many of them for more abstract meanings. Still, the specific types of plant, such as 🌷 and 🌵, stand for their individual kinds.

Examples of the abstract set include:

  • 🌱 – plant
  • 🍀 – luck
  • 🍂 – autumn

Food and drink

People love to eat, and Unicode definitely has them covered there. As with plants and animals, most of these are specific foods or beverages, so their basic meanings encode those:

  • 🍔 – hamburger
  • 🍕 – pizza
  • 🍓 – strawberry
  • 🍪 – cookie
  • 🍺 – beer

A couple of abstract symbols include:

  • 🍳 – to cook (specifically fry, but any kind of cooking is a valid translation)
  • 🥘 – food

Also, the 🍴 and 🍽 symbols translate as eat and meal, respectively.

Places

Once more, we have a large set of emoji symbols whose meanings are fairly transparent. The numerous places, whether geographic or constructed, tend to represent in language what they look like:

  • ⛰ – mountain
  • 🏠 – house
  • 🏥 – hospital
  • 🏫 – school

Transportation

Unicode gives us a lot of vehicles, and we use them about how you’d expect. I know this is sounding like a tired refrain by now, but it’s just how it is.

  • 🚕 – taxi
  • 🚓 – police
  • 🚃 – train

A little wrinkle here is that 🛣 is the abstract road rather than something more specific; if you want something more concrete (sorry about the pun), you can use compounding.

Clocks

Clocks representing half-hour intervals should be self-explanatory. The ⌛ emoji represents time in the abstract, while the verb measure (specifically for time) can be translated as ⏱.

Sky and weather

Most of these are fairly obvious. Cloudy and sunny skies represent just that. The various kinds of weather emoji mostly encode that sort of state. 💧 is abstract water, however, and 🌊 is ocean rather than something specifically to do with waves.

Recreation

Games, sports, and activities mostly function the same as any other “this is what it looks like” emoji:

  • ⚾ – baseball
  • ⛷ – to ski

Some are different, though: 🕹 is control, 🃏 simply joke.

Clothing

Once more, it’s the same general idea: 👕 is shirt, etc. Some of the oddities here include:

  • 🎓 – to graduate
  • 🛍 – to shop
  • 🎒 – student

Technology

Many of the technology-oriented emoji are used for grammatical purposes. Most of the rest tend to be of the “object” sort we’ve seen so many times already:

  • 💿 – CD
  • 🎥 – film
  • 📸 – to take a picture

Tools

Most of the tools are of the “object” sort, representing the objects they appear to be. An important exception is 🔫, which always translates as a real gun, not a toy, when used alone. (Unicode quite clearly defines the symbol as “pistol”, but PC-crazed tech companies try to pass it off as a harmless water gun instead.)

A few other interesting symbols in this group include:

  • ⚖ – law
  • ⚙ – machine
  • 🗜 – to compress
  • ⛓ – to hold back

Household

These are more “object” type emoji, and they tend to fall under the same rules as above.

Keycaps

I’m skipping most of the symbols in this post for a very good reason: they’re symbolic. They don’t have well-defined meanings to begin with, so I felt no shame in recycling them for grammatical use. That includes things like audio controls, punctuation, and the multitude of arrows.

But one set of exceptions should be pointed out here, I think. The Unicode standard has a kind of generic method of constructing keycaps (boxed numerals that look like they’re on buttons), and it defines about a dozen of them. The numerical ones, such as 1️⃣, are ordinals: first, second, etc. The others are:

  • #️⃣ – number
  • *️⃣ – any
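
Since keycaps are built by a generic rule, they are easy to generate. Here is a minimal Python sketch of that rule as I understand it from the standard: a base character, then the emoji variation selector, then the combining keycap mark. The function name `keycap` is just something I made up for illustration.

```python
VARIATION_SELECTOR_16 = "\uFE0F"  # requests emoji-style presentation
COMBINING_KEYCAP = "\u20E3"       # draws the "button" around the base


def keycap(base: str) -> str:
    """Build a keycap emoji from one of the twelve allowed base characters."""
    if len(base) != 1 or base not in "0123456789#*":
        raise ValueError("keycaps exist only for 0-9, # and *")
    return base + VARIATION_SELECTOR_16 + COMBINING_KEYCAP


# The first three ordinals, plus the "number" and "any" keycaps.
ordinals = [keycap(str(n)) for n in range(1, 4)]
print(ordinals, keycap("#"), keycap("*"))
```

Each result is three code points long, which is one reason some fonts and input methods stumble over these symbols.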

Flags

Lastly, about 300 of the available emoji are national or regional flags. These are a little special in 🖼🗣, for they can function as both nouns and adjectives without needing class-changing suffixes. The role they fill is implied based on position, defaulting to nominal:

  • 🇺🇸 – USA, American
  • 🇪🇺 – Europe, European
  • 🏴󠁧󠁢󠁥󠁮󠁧󠁿 – England, English (note: not the same as 🇬🇧)
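
For the curious, here is how these flag glyphs are put together under the hood. This is a small Python sketch of the standard Unicode encodings, not anything specific to 🖼🗣, and the function names `country_flag` and `subdivision_flag` are my own inventions.

```python
# Country flags are pairs of "regional indicator" letters (U+1F1E6..U+1F1FF),
# one for each letter of a two-letter region code.
REGIONAL_INDICATOR_A = 0x1F1E6


def country_flag(code: str) -> str:
    """Build a flag emoji from a two-letter region code such as 'US'."""
    code = code.upper()
    if len(code) != 2 or not all("A" <= c <= "Z" for c in code):
        raise ValueError("expected a two-letter region code")
    return "".join(chr(REGIONAL_INDICATOR_A + ord(c) - ord("A")) for c in code)


# Subdivision flags such as England use a different scheme: a black flag,
# then invisible "tag" copies of the code letters, then a cancel tag.
BLACK_FLAG = "\U0001F3F4"
TAG_OFFSET = 0xE0000       # tag characters mirror ASCII at this offset
CANCEL_TAG = "\U000E007F"


def subdivision_flag(code: str) -> str:
    """Build a subdivision flag from a code such as 'gbeng' (England)."""
    tags = "".join(chr(TAG_OFFSET + ord(c)) for c in code.lower())
    return BLACK_FLAG + tags + CANCEL_TAG


print(country_flag("US"), country_flag("EU"), subdivision_flag("gbeng"))
```

This is also why the English flag is a separate symbol from 🇬🇧: the two are entirely different sequences, not variants of one another.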

Conclusion

Whew. That’s a lot to take in, and I didn’t even cover everything. Fortunately, it’s a lot smoother sailing from here on out. I’ll illustrate new words when they come up, and I’ll point out non-obvious compounds or derivations. Other than that, the next post will get back to grammar. Fun, isn’t it?

🖼🗣 : the emoji conlang, part 4

🖼🗣 is becoming quite the little language. In the first three parts, you saw the basic outline of how we can take the wide array of emoji characters available in Unicode and contort them into a hieroglyphic script for modern times. Now, we’ll take another step by looking into the many ways in which we can construct new words from the building blocks we’ve been given.

Derivation

First of all, we need to make a distinction between the two different types of combining we can do. Derivation is mostly a grammatical process; it turns nouns into verbs, for example. Almost all languages have at least some derivational processes, and they tend to fall into a few major categories. 🖼🗣 is no exception, so we’ll look at these now. Later, we’ll turn to compounding, where we take individual words and combine them to create something new.

All of the script’s derivations are suffixes. We’ve already met a few, but here’s a complete list. (Note that tense markers, the plural and singular markers, and others like those are considered inflectional, so they’re not listed here.)

  • 〰 – This sign converts a word into an adjective. Usually, it’s a “quality” adjective: a 🧒 (child) is young, so 🧒〰 means “young”.

  • ▪ – This sign forms diminutives. These are “small” forms of words (typically nouns or adjectives) that indicate a lesser degree or amount: 🏙 “city” becomes 🏙▪ “town”, and ❄ “cold” turns into ❄▪ “chilly”.

  • ◼ – This sign changes an adjective or verb into a noun representing something to do with them. So we might turn 🍴 “to eat” into 🍴◼ “meal”, because a meal is something you eat.

  • ⬛ – The opposite of ▪, this sign creates superlative or augmentative forms. Linguistically, those are two different things, but they both pertain to an increase of a quality. With adjectives, ⬛ forms a superlative: 💪 “strong” becomes 💪⬛ “strongest”; this is really an inflection rather than a derivation. When used on a noun, however, the connotation is slightly different: 🌧 “rain” can become 🌧⬛ “torrent, flood”.

  • 🔻 – This marks a negative or inverse connotation. Usually, there’s already another word available, but using this suffix means you’re focusing on what something is not. An example might be 👍 “good” becoming 👍🔻 “not good”. It’s not quite the same as 👎 “bad”, but it’s close.

  • 🔺 – This is the counterpart to 🔻. It marks a positive connotation, which you may think has little use, but it can also function as an intensifier, a bit like “definitely” or (in colloquial speech) “literally” in English.

  • ➡ – As we have seen in previous parts, this forms verbs from other words. No examples needed here, because you should already get the gist.

These are the main derivations in 🖼🗣. Others do exist, but they have more specialized meanings, and they’re probably better analyzed as compounds, which we’ll get to right now.
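
Before moving on, here is the suffix list above as a quick Python sketch. The English labels used as dictionary keys and the helper name `derive` are mine, purely for illustration; the glosses in the comments come from the examples in this series.

```python
# Derivational suffixes, keyed by a plain-English label (labels are my own).
SUFFIXES = {
    "adjective": "〰",
    "diminutive": "▪",
    "noun": "◼",
    "augmentative": "⬛",  # reads as a superlative on adjectives
    "negative": "🔻",
    "positive": "🔺",
    "verb": "➡",
}


def derive(root: str, *kinds: str) -> str:
    """Append one or more derivational suffixes to a root, in order."""
    return root + "".join(SUFFIXES[kind] for kind in kinds)


print(derive("🧒", "adjective"))           # glossed as "young"
print(derive("🏙", "diminutive"))          # glossed as "town"
print(derive("👍", "negative"))            # glossed as "not good"
print(derive("🔳", "adjective", "noun"))   # stacked suffixes, as in "somehow"
```

Because every derivation is a suffix, stacking them is nothing more than appending in order.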

Compounding

Most vocabulary in the script is formed by compounding. This process, much more general (yet also a bit more idiosyncratic) than derivation, allows us to express essentially any concept through a combination of 🖼🗣 symbols. The rules are a little involved, so pay close attention.

General compounding rules

  1. Any lexical symbol can be used in a compound. Those with a purely grammatical function (such as the derivational affixes above) aren’t allowed, except in very specific circumstances. (These form what’s called a closed class of words, and they don’t really concern us here.)

  2. The minimum number of symbols is 2, but the only upper limit is imagination. Realistically, however, most compounds will have at most 4 symbols.

  3. One element of the compound is the head, while the rest are considered modifiers. (Linguists note that the head element isn’t necessarily the semantic head, but it usually is.)

  4. The head determines the part of speech of the compound. Thus, compounds with heads that are nouns will be nouns themselves.

  5. Verb compounds are head-initial, while all others are head-final.
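
If you think better in code, rules 2, 4, and 5 can be sketched in a few lines of Python. The `Word` class and the `compound` function are hypothetical helpers of mine, and the part-of-speech tags are plain strings I chose for the example.

```python
from dataclasses import dataclass


@dataclass
class Word:
    glyphs: str
    pos: str  # "noun", "adjective", or "verb"


def compound(head: Word, *modifiers: Word) -> Word:
    """Combine a head with its modifiers, following rules 2, 4, and 5."""
    if not modifiers:
        # Rule 2: a compound has at least two symbols.
        raise ValueError("a compound needs at least two symbols")
    mods = "".join(m.glyphs for m in modifiers)
    # Rule 5: verb heads come first; every other head comes last.
    glyphs = head.glyphs + mods if head.pos == "verb" else mods + head.glyphs
    # Rule 4: the compound inherits the head's part of speech.
    return Word(glyphs, head.pos)


doghouse = compound(Word("🏠", "noun"), Word("🐕", "noun"))
sprint = compound(Word("🏃", "verb"), Word("💨", "adjective"))
print(doghouse, sprint)
```

Note that the head is always passed first to the function, whatever its final position in the written word turns out to be.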

Noun-noun compounds

Compounds of multiple nouns are probably the easiest to understand. Almost all of them tend to denote specificity. In other words, the modifiers define a specific type of the noun represented by the head. We’ve already seen 🐕🏠 “doghouse”, for instance, but here are a few more:

  • 🐦🛁, “birdbath”
  • 🚲🛣, “bike path”
  • ✋🔫, “handgun”
  • 📰📄, “newspaper”

Simple enough, right? These are mostly English-oriented, but the same principles are common across many languages.

Adjective-noun compounds

These are almost the same as the noun-noun compounds above, but the modifier is an adjective instead:

  • 💨🛣, “fast lane”
  • ♨🛁, “hot tub”
  • 🤓☎, “smartphone”

Again, there’s not much to it.

Adjective-headed compounds

When an adjective is the head, the modifiers shift the base meaning toward their own. It’s a little hard to explain in prose, so we’ll try a few examples instead:

  • 🌹🔴, “rose red”
  • 🏛👴, “ancient”
  • 👿🖤, “devilish”

Unlike nominal compounds, these are often less transparent, but that’s okay.

Verb-headed compounds

Verbal compounds are the hardest. For one thing, they’re “inverted”, with the head coming first. For another, pinning down their meaning isn’t easy. In general, more active verbs tend to form compounds whose meanings are related to the head, while “static” verbs function a lot more like adjectives.

  • 🏃💨, “sprint”
  • 🤝💬, “introduce”
  • 👐🆓, “donate”

Moving on

Part 5 of this series will be a chance to pause and take stock. Instead of grammar and word-building, I’ll provide a lot more vocabulary, roots and compounds alike. I hope to see you then!

🖼🗣 : the emoji conlang, part 3

As we have seen, 🖼🗣 is perfectly capable of writing simple sentences using nothing but emoji along with standard English punctuation. In this post, we’ll delve a little deeper into the script, focusing first on verbs.

Preliminaries

Before we get started, let’s add in a few more simple words. All of these are verbs that represent actions, and I’ve tried to choose those best suited to “dynamic” phrasing.

  • 🛬 – to come
  • 💃 – to dance
  • ✈ – to fly
  • 🛫 – to go
  • 👊 – to hit
  • 🤗 – to hug
  • 😆 – to laugh
  • ⛹ – to play
  • 🏃 – to run
  • 💺 – to sit
  • 🏊 – to swim
  • 💼 – to work
  • ✍ – to write
  • 🔥➡ – to burn
  • 🚗➡ – to drive
  • 💕➡ – to love

Of course, most nouns can be “verbalized” by adding the ➡ suffix. In most cases, the resulting word can be interpreted in one of two ways. It’s either an action that uses the root noun (e.g., you drive a car), or one that is caused by the root (fire burns).

Adjectives can also take ➡, but their meaning is a lot simpler. Most of the time, the verb created is one that represents being in a specific state. Thus, 😃➡ translates as “to be happy”. Those are far more regular than verbs derived from nouns, but not nearly as flashy, so we don’t really need a list yet.

The usual suspects

🖼🗣 verbs express actions. As in English, they can also express when an action occurs. In other words, verbs have tense.

You’re probably expecting a table showing the different tenses in the script, because that’s what most language-learning texts offer right about now. But hold on just a minute. This is a little different. We don’t just have a simple three-way distinction, because 🖼🗣 combines tense and the linguistic notion of aspect into a single marker. So let’s slow down and take these things one at a time, since they can trip you up if you’re not careful.

First off, the present tense is simply the lack of any other marker. It’s the default. (Linguists can argue the point that non-finite verbs are also unmarked, but we’ll ignore them.) Also, the present tense here is best interpreted as the “imperfective” or “progressive” kind. That’s more in line with typical English speech, which is what 🖼🗣 tries to follow. Thus, a phrase like ♂ ✍ should read as “he is writing” rather than “he writes”. Obviously, that’s not set in stone, but it’s a good rule of thumb.

Next up, we have your basic past and future tenses. These are ◀ and ▶, respectively, and it’s not hard to get the symbolism. As opposed to the present, you can assume both of these are “perfective” by default: ♂ ✍◀ “he wrote”, ♂ ✍▶ “he will write”.

Here’s where it gets tricky. If you’re familiar with both English and Romance languages such as French or Spanish, you’ll also know the perfect tenses. They’re the “have” forms of an English verb, or the separate conjugations in Latin, or however you like to look at them. They’re hard to explain without resorting to linguistic nastiness, so I’ll keep it simple. A verb in a perfect tense refers to an action that took place before the time it’s talking about. It talks about something that, from the point of view of the verb, has already finished.

For 🖼🗣, we’ve got a trio of perfect markers. Each simple tense has a corresponding perfect form, and they look similar enough to the basics that you can almost imagine they fit. ⏯ marks the present perfect (“he has written”), ⏮ the past perfect (“he had written”), and ⏭ the future perfect (“he will have written”).

Last are two linguistic aspects that English doesn’t mark in a simple way. Other languages do, however, and two emoji perfectly fit the bill for them. Rather than use the technical terms, I’ll describe them more informally. ⏺ marks an action that is beginning (“he is starting to write”), while ⏹ says that an action is ending (“he isn’t writing anymore”). As you can see, these aren’t easy to translate, but the concepts aren’t too hard to grasp.

Combining the tense and aspect markers works just fine. You can use one of each, and the order doesn’t matter. Add in the adverb 🚫 “not”, and you can express some fairly complex ideas in just a few symbols. 🤳 🚫 ✍⏭⏹ “I will not have stopped writing”. Wow.
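
To make the "one of each, in any order" rule concrete, here is a rough Python sketch that peels the markers off the end of a verb. I am assuming that the five tense and perfect markers form one group and the two begin/end markers form the other, which is my reading of the example above. The name `analyze` is made up, and the sketch drops the invisible variation selector that some systems attach to these glyphs.

```python
VS16 = "\uFE0F"  # invisible "emoji style" selector that may trail a glyph

TENSE = {
    "\u25C0": "past",             # ◀
    "\u25B6": "future",           # ▶
    "\u23EF": "present perfect",  # ⏯
    "\u23EE": "past perfect",     # ⏮
    "\u23ED": "future perfect",   # ⏭
}
ASPECT = {
    "\u23FA": "beginning",        # ⏺
    "\u23F9": "ending",           # ⏹
}


def analyze(verb: str):
    """Split a verb into (root, tense, aspect). No tense marker means present."""
    glyphs = [ch for ch in verb if ch != VS16]
    tense = aspect = None
    # Markers are suffixes, so peel them off the end in whatever order they come.
    while glyphs and (glyphs[-1] in TENSE or glyphs[-1] in ASPECT):
        mark = glyphs.pop()
        if mark in TENSE:
            if tense is not None:
                raise ValueError("only one tense marker per verb")
            tense = TENSE[mark]
        else:
            if aspect is not None:
                raise ValueError("only one aspect marker per verb")
            aspect = ASPECT[mark]
    return "".join(glyphs), tense or "present", aspect


print(analyze("\u270D\u23ED\u23F9"))  # the verb from the example above
```

Swapping the two markers gives the same analysis, which is all that "the order doesn’t matter" really means.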

Back to pronouns

About that last example, though. You’ll notice I used 🤳 as the first-person pronoun. Last time, as you may recall, I mentioned that this is an acceptable substitute, and now it’s time to explain why.

Any 🖼🗣 pronoun can take 🤳 as a suffix meaning “-self”. (Get it? Because it’s a selfie.) Most of the time, you’d use this as the object of a phrase: ♀ 👀 ♀🤳 “she sees herself”. But that’s a little repetitive, so there are other ways. Placing this “reflexive” pronoun as the subject lets you say the same thing in a more concise way: ♀🤳 👀.

So far, so good. But the other meaning for 🤳 is as an “intensifier”. Those of you who know Spanish may recall that subject pronouns are optional in that language. Indeed, using them regularly is one of the hallmarks of beginner speakers. But when they do appear, it’s usually to indicate emphasis. In effect, they say that I’m doing this, not somebody else.

While our emoji script doesn’t allow omitting most pronouns, the emphatic use of 🤳 works just fine. And that’s our loophole to let us use 🤳 as an acceptable alternative to 👁️‍🗨. Language geeks rejoice.

We’ve also got a few other pronouns to cover before we go. These are the indefinite sort, and they’re all formed as compounds using 🔳 “some”. Thus, you have the following:

  • 🔳◻ – something
  • 🔳👤 – someone/somebody
  • 🔳📍 – somewhere
  • 🔳⌛ – sometime
  • 🔳〰◼ – somehow

Moving forward

So that about wraps it up for this one. Next time around, I promise we’ll get more into making actual sentences. We’ll also go a little deeper into what makes up words, including some of the more regular compounding constructions. 👀▶ 💮 🔜!

🖼🗣 : the emoji conlang, part 2

In the previous article, I showed that it is possible to create a kind of modern-day hieroglyphic script using the ~1200 emoji characters available in Unicode. Now, let’s expand on that.

Rather than go through a formal grammar, we’ll work our way up from a few simple phrases and sentences, much in the same way as a student learning a new language. 🖼🗣 is, after all, a bit like its own language.

Preliminaries

First off, let’s define a few very simple words. These are all “content” words, as you’ll see; grammatical particles (what few we truly need in 🖼🗣) can come later.

  • 👨 – man
  • 👩 – woman
  • 👤 – person
  • 🧒 – child
  • 🐕 – dog
  • 🐈 – cat
  • 👁 – eye
  • 👄 – mouth
  • ✋ – hand
  • 👣 – foot
  • 🍴 – to eat
  • 🥤➡ – to drink
  • 👀 – to see
  • 👂➡ – to hear
  • 🧠➡ – to know
  • 🚶 – to walk
  • 💧 – water
  • 🌬 – air
  • 🔥 – fire
  • 🌐 – earth
  • 🌞 – sun
  • 🌝 – moon
  • ⛅ – sky
  • 🔴 – red
  • 💚 – green
  • 🔷 – blue
  • ◻🌈 – white
  • ◼🌈 – black
  • ♨ – hot
  • ❄ – cold
  • 😃 – happy
  • 😢 – sad
  • 💪 – strong
  • ⬜ – big
  • 🧠〰 – smart

For most of these, the meanings should be fairly obvious. Some, however, are compounds. As an example, the color terms for white and black, ◻🌈 and ◼🌈, combine their first glyphs (ordinarily simple nominal particles) with 🌈, a regular derivation that makes color terms. Similarly, the numerous verbs with ➡ are derived from nouns; the second symbol here acts as a verbalizing suffix. And for adjectives, you can often use 〰, as we did with “smart”: 🧠〰.

Simplicity

The simplest sentences are those with nothing more than a subject, verb, and object. And, to make things even simpler, we’ll start with the most basic verb of all: “to be”. In 🖼🗣, that’s 👉. No need to worry about agreement suffixes or anything like that, though. For our purposes here, 👉 is all we need. (We’ll get to tenses in a later part.)

Here are a few examples to show what I mean:

  • 🧒 👉 😃. – The child is happy.
  • 👩 👉 🧠〰. – The woman is smart.
  • 👨 👉 💪. – The man is strong.
  • 🔥 👉 ♨. – Fire is hot.

Note that we don’t need any special word for “the”, either. It’s understood. (If you really think you need it, you can use 👇, though its meaning is closer to “this”.)

Our other verbs aren’t quite as easy to work with, but we can manage. The principle’s the same, after all:

  • 🧒 👀 🐕. – The child sees a dog.
  • 👨➿ 🚶. – The men are walking.
  • 🐈 🥤➡ 💧. – The cat drinks water.

Once again, don’t worry about the difference between English simple and progressive forms. 🖼🗣 doesn’t bother distinguishing the two.
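
Reading these sentences is mostly a matter of looking up each word. As a rough illustration, here is a tiny Python glosser that handles only the handful of words in the examples above. The lexicon subset, the `-PL` tag, and the function name `gloss` are all my own choices, and unknown words simply come back as `?`.

```python
VS16 = "\uFE0F"    # invisible variation selector some systems attach
PLURAL = "\u27BF"  # ➿, the optional plural marker

# A small slice of the word list, with selectors stripped from the keys.
LEXICON = {
    key.replace(VS16, ""): meaning
    for key, meaning in {
        "🧒": "child", "👨": "man", "🐕": "dog", "🐈": "cat",
        "💧": "water", "👀": "see", "🚶": "walk", "🥤➡": "drink",
    }.items()
}


def gloss(sentence: str) -> list:
    """Give a word-for-word gloss of a space-separated sentence."""
    words = []
    for token in sentence.rstrip(".!?").split():
        token = token.replace(VS16, "")
        is_plural = token.endswith(PLURAL)
        root = token[:-1] if is_plural else token
        word = LEXICON.get(root, "?")
        words.append(word + "-PL" if is_plural else word)
    return words


print(gloss("🧒 👀 🐕."))
print(gloss("👨➿ 🚶."))
print(gloss("🐈 🥤➡ 💧."))
```

A gloss like this ignores articles and the simple/progressive distinction, which matches how the script itself treats them.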

You, me, and all the rest

Today, everybody’s worried about pronouns. Well, I’ve got you covered there, because 🖼🗣 has plenty of them.

Most languages make a distinction between persons: first, second, and third. To some extent, that’s what we’ll do here, but modern communication, especially on the Internet, is more geared towards a distinction between speaker, audience, and others. (Technically, that’s all the three degrees of person represent, but bear with me.)

A speaker’s solo pronoun is 👁️‍🗨. If they’re including others (whether inside or outside their audience), then this becomes 🤲. These are like “I” and “we”, respectively:

  • 👁️‍🗨 👉 🧠〰. – I am smart.
  • 🤲 👉 😃. – We’re happy.
  • 👁️‍🗨️ 👀 🔷 ⛅. – I see the blue sky.

(Important note: Some systems are not able to display or input the “compound” emoji 👁️‍🗨️. If yours is one of them, don’t despair. You can use 🤳 instead. It doesn’t mean exactly the same thing, as we’ll see in the next part, but it’s close enough.)
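
If you are wondering why that one symbol causes trouble, it is because it is really five code points glued together. The Python sketch below spells them out. The `can_show_zwj` flag is hypothetical: I know of no reliable way to detect rendering support from code, so treat it as a setting you would supply yourself.

```python
import unicodedata

# Eye, variation selector, zero width joiner, speech bubble, variation selector.
EYE_IN_SPEECH_BUBBLE = "\U0001F441\uFE0F\u200D\U0001F5E8\uFE0F"
SELFIE = "\U0001F933"

for ch in EYE_IN_SPEECH_BUBBLE:
    print(f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"))


def first_person(can_show_zwj: bool) -> str:
    """Pick the speaker pronoun, falling back to the selfie glyph."""
    return EYE_IN_SPEECH_BUBBLE if can_show_zwj else SELFIE


print(first_person(True), first_person(False))
```

A system that doesn’t know the joined form will usually show the eye and the speech bubble as two separate pictures.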

But here’s where it gets interesting. If you’re only speaking of yourself, there’s really no reason to need that cumbersome pronoun. It’s implied, because you’re the one talking. So that first sentence can become “👉 🧠〰.” instead, and it’ll mean the same thing.

Only the singular speaker pronoun can be dropped like this, which is quite different from most spoken languages that allow pronoun dropping.

The listener pronouns are much simpler. In fact, they’re not even pronouns at all, because there’s only one of them: 💮. Example:

  • 👁️‍🗨️️ 👀 💮. – I see you.

As with English “you”, this works for both singular and plural.

Last are what most languages call the third-person pronouns. Here, 🖼🗣 has a wide variety to choose from, so let’s take a look.

  • For talking about people in general: singular 👤, plural 👥
  • For talking about anything not human: singular ◻, plural ◻◻
  • For talking about only men: singular ♂, plural 👥♂
  • For talking about only women: singular ♀, plural 👥♀

Mostly, the first two pairs should be preferred, and the “general” form is required when you’re referring to mixed groups. And, of course, using the “non-human” pronouns when you want to talk about people is just wrong.

Some examples using these pronouns:

  • 👥 👉 😃. – They are happy.
  • ♂ 👉 ⬜ 👨. – He is a big man.
  • 👀 ◻. – I see it.
  • 👥♀ 👂➡ 💮. – They (i.e., those women) can hear you.

Possessed

Last in this little lesson, we’ll discuss the possessive form. As with many parts of 🖼🗣, that’s a little different from what you might expect. In fact, it’s one of the few cases where the script recycles English punctuation.

Our key here is the apostrophe, or single quote mark: ’. When put between two nouns (pronouns included), it indicates that the first possesses the second. So we might say 🧒’🐈 for “the child’s cat” or ♂’✋ for “his hand”.

These aren’t exactly compound nouns, but they can function much like them, fitting into sentences with ease.

  • 👁️‍🗨️’👁 👉 🔷. – My eyes are blue.
  • ♀’🧒➿ 👉 😃. – Her children are happy.
  • 👀 ⬜ 👨’🐕➿. – I see the man’s big dogs.

In the last example above, you can see a difference between the script and English, as far as word order is concerned. The possessive “attaches” to the head noun, even if there are modifying adjectives before it.

Also, you can “chain” possessives, as in ♂’🧒’👁➿ “his child’s eyes”.
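
Chaining works the way you would expect, as this small Python sketch shows. The helper names `possessive` and `owners` are mine, and I am assuming the curly apostrophe used in the examples is the only separator.

```python
APOSTROPHE = "\u2019"  # the curly apostrophe used in the examples


def possessive(*nouns: str) -> str:
    """Chain nouns so that each one possesses the next."""
    if len(nouns) < 2:
        raise ValueError("a possessive needs an owner and something owned")
    return APOSTROPHE.join(nouns)


def owners(phrase: str):
    """Split a possessive phrase into (chain of owners, possessed noun)."""
    *chain, owned = phrase.split(APOSTROPHE)
    return chain, owned


phrase = possessive("♂", "🧒", "👁➿")  # "his child's eyes"
print(phrase, owners(phrase))
```

The last element is always the thing possessed, and everything before it is read left to right as a chain of owners.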

Moving forward

Now that you’ve seen a little bit more of this experiment, does it still seem so outlandish? Stay tuned, as this series will delve even deeper into the weird world of emoji, and the strange things we can accomplish when our language is allowed to use nothing else. 👀▶ 💮 🔜!

🖼🗣 : the emoji conlang, part 1

I talked about this a while back, but now it’s for real. Today, I introduce to you a new conlang: 🖼🗣. Or, to put the name in something pronounceable, Pictalk. Yes, the glyphs making up the name are emoji. Yes, so are all the characters used in the entire language.

Strictly speaking, Pictalk isn’t a full-fledged conlang. It’s written-only, first of all. There is no true spoken form. Instead, it should be considered something closer to a conscript, an artificial writing system, modeled after hieroglyphic and ideographic scripts. But that’s enough to encode ideas, thoughts, sayings, and anything that might need to be written in this modern, digital age.

Glyph inventory

The hardest part about making Pictalk is the very restricted set of available glyphs. True, there are over 1200 emoji characters available, and they cover a wide variety of concepts, from animals to emotions to transportation and more. But I don’t have control over which symbols the Unicode Consortium adds to the list. While that list will grow (they add more each year, it seems), there’s little rhyme or reason to which new characters come in.

But that’s okay. We can do this. English only needs 26 letters, right?

Even with the wide array we have, it’s safe to discard quite a few right off the bat. First, I’ll drop the “cat face” group, such as 😸, because they really only repeat the normal human smileys. Next, toss out the handful of CJK ideographs in circles or squares, like 🈹—I’m an English speaker, and even Unicode gives up on giving them reasonable names. The skin tone modifiers (🏻 and friends) don’t make sense in the context of language; Pictalk thus won’t give them meanings, but will allow them to modify other symbols as a kind of synonym.

Likewise (and here’s where we start getting into the grammar bits), gendered forms like 👩‍🏫 or 👨‍⚕️ are synonymous with their “base” forms. With many languages, particularly in the West, where there is no neuter form, masculine is considered the default. Pictalk, however, is gender-neutral. That’s not out of some misguided idea of social justice or diversity, but simple expedience. Unicode has neuter forms for most of what we might call agentive glyphs. Where it doesn’t, we can use either, and that’s fine.

Last, flags. These take up a good chunk of the emoji list (about 15%, all told), and they’re mostly country flags. Well, for Pictalk, those flags represent their countries, and that’s that. Unlike most other characters, they don’t really participate in the construction processes we’ll see later on.
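
The skin tone rule above is simple enough to sketch in Python. The five modifiers sit in one small block of code points, so treating a toned symbol as a synonym of its base just means ignoring them. The function names `strip_skin_tone` and `same_word` are hypothetical, and this sketch deliberately leaves the gendered forms alone, since those are built differently.

```python
# The skin tone (Fitzpatrick) modifiers occupy U+1F3FB through U+1F3FF.
SKIN_TONES = {chr(cp) for cp in range(0x1F3FB, 0x1F3FF + 1)}


def strip_skin_tone(glyph: str) -> str:
    """Return the glyph with any skin tone modifiers removed."""
    return "".join(ch for ch in glyph if ch not in SKIN_TONES)


def same_word(a: str, b: str) -> bool:
    """Treat two glyphs as the same word if they differ only in skin tone."""
    return strip_skin_tone(a) == strip_skin_tone(b)


plain_wave = "\U0001F44B"             # waving hand
toned_wave = "\U0001F44B\U0001F3FD"   # the same hand plus a medium tone
print(same_word(plain_wave, toned_wave))
```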

Non-emoji characters

Before we get to that, let’s go over the rest of Unicode. Obviously, since the whole point of Pictalk is to create a hieroglyphic script using the emoji characters, they’re the focus. But we’ve got a few other options available. One I won’t use is Latin letters. Or, for that matter, any other alphabetic script. In earlier versions of the language, I did utilize them for derivation and some small grammatical particles, but I’ve since removed the need for them. Only proper names use alphabetic characters; these are written as they would be in either the speaker’s or the audience’s preferred language.

Numbers, on the other hand, are perfectly usable. They’re already a little bit ideographic, after all, so it wouldn’t destroy the purity of Pictalk to include them. So 0-9 work exactly as they would in English: as the numerals zero through nine. And you can build on that as you do in English. (Pictalk is base-10, by the way.)

Punctuation works the same, as it’s very difficult to design a conlang that doesn’t need it. So sentences can end with a period, question mark, or exclamation point. Quote marks work for, well, quotes. Commas aren’t as necessary, but you can still use them to mark off clauses. Colons, besides having their normal English function, are used as attention-getters, in a sense, following the intended recipient of a statement or question. And we’ll see the other “special” characters as they come up.

Building words

Quite a few emoji work as words by themselves. Think of 🐕, 😄, or ✈, for instance. In Pictalk, that’s the most basic sort of word, and most symbols can function alone. Some are considered nouns, others adjectives or verbs, but there’s always a way to convert them.

Other symbols are “bound”, in that they can only occur fixed to others. An example here would be the (optional) plural marker ➿. By itself, it has no meaning. Suffixed to a root, whether a single symbol or a string of them, it gains meaning: 🐕➿ “dogs”.

More complex are the compound symbols that make up the bulk of the lexicon. In general, nominal compounds are head-final, as in 🐕🏠 “doghouse”, while verbal compounds are often head-initial, as with 📖🏫 “study”, from 📖 “read”. I’ve tried to refrain from being cute with meanings, striving instead for transparency, but some compounds remain idiosyncratic in meaning.

Last, a form of word-building that English doesn’t often employ comes into its own in Pictalk. Reduplication is productive for many basic words. For nouns, it can create a kind of collective sense: 🏠🏠 “neighborhood”. Verbs instead use reduplication as an intensifier: 💭💭 “to contemplate” (or possibly “to overthink”).

Moving on

All in all, I think this just might work. We can make words using only emoji characters. Next up, we’ll see how far we can go in making a language.

A mad experiment

Today, most of the world uses alphabetic scripts, or something fairly close to them. With the major exception of Chinese (and the writing systems derived from it, such as those in Japan and Korea), alphabets, consonantal scripts, and the like reign supreme. They’re easier to learn, obviously, and far more suited to computers, so it’s only natural. Simple scripts, in the vast majority of cases, work just fine, so that’s what we use.

But it wasn’t always this way.

If you look back at the history of writing, you see that alphabets were not the original form of script. Indeed, assuming current theories are correct, writing developed first as pictorial representations of people, animals, etc. Abstractions came in later, as did the practice of using glyphs to represent spoken language, rather than as something closer to an aide mémoire.

The oldest evidence of writing we have all points in the same direction. Egyptian hieroglyphs, Sumerian cuneiform, and ancient Chinese symbols share the common feature of being, at least in some part, logographic scripts. The same may be true of other, mostly undeciphered writing, such as the Proto-Elamite script or that of the Indus Valley; given their age, it doesn’t seem out of the realm of possibility. While China kept its style of writing through the millennia, occasionally simplifying but never throwing away, the rest have mostly died out, replaced by Latin, Greek, Cyrillic, Arabic, the various scripts of India and Southeast Asia, and so on.

Enter madness

But wait. Anyone with a cellphone (which is to say, well, anybody) has at their disposal a vast and growing collection of bona fide ideograms: emoji. Can we use those as the basis for a modern-day hieroglyphic script?

I know what you’re thinking. “Michael, you’ve gone completely crazy!” you probably shouted at your computer screen.

You’d be right, but hear me out. I am being totally serious. Think about it. As of 2018, there are over 1000 emoji symbols in the Unicode standard, and they’re adding more with every update. Granted, most of the new ones are gender-specific versions of older ones, but you still see a genuine emoji every now and then. (“Lobster” was in the newest batch, I think.)

Most emoji fall into one of two categories. One is clearly nominal in nature: animals, vehicles, people, and so on. The other is the emotional set: grinning faces, smilies, and the like. Those can be considered adjectives, if you look at it the right way. Verbs, now, those are harder, but not impossible.

So here’s what I propose. Take the emoji, minus a few that aren’t really all that useful to English speakers (think the “cat faces”, or the numerous symbols containing Japanese writing), and construct a script. Or, if you will, a written-only conlang. Technically speaking, it would be something more akin to a pidgin. It would have no vocabulary of its own, and the grammar would necessarily be very stripped-down.

The limitations are severe, but operating under limiting conditions is the time-honored path of the hacker (in the original sense of the word). Here, we have no control over the inventory of symbols, no convenient way of even typing them, much less pronouncing them. And there’s no real payoff, either. If I did this, it would be for fun, not for glory.

Yet none of that ever stopped me before, so why should it now?

If you’re interested, stick around. I’ll post something more about this mad scheme in the coming weeks.