The other day, Twitter announced it would retain its 140-character limit for posts. A hundred and forty letters, spaces, and numerals isn’t all that much, if you think about it. In fact, this paragraph is already far too long to fit in a single tweet.
Speakers of other languages which use other scripts sometimes have it easier. Twitter’s limit applies to characters, not bytes, and all of Unicode is available for use. If you’re writing in Chinese, for example, you can fit a lot more information into 140 characters. That’s because Chinese is a logographic writing system, not an alphabet like ours. There are thousands of symbols, most of them representing part, if not all, of a word. But Chinese writing is hard to learn, especially for Westerners. To make matters worse, we Americans don’t have an easy way to input characters of foreign languages. (A lot of Chinese-made phones do, but that doesn’t help unless you know how to do it.)
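To make the characters-versus-bytes distinction concrete, here is a minimal Python sketch. It assumes that "characters" means Unicode code points, which is what Python's `len` counts; Twitter's real counting rules may differ in the details. The helper name `tweet_cost` and both sample strings are my own, for illustration only.

```python
def tweet_cost(text):
    """Return (code points, UTF-8 bytes) for a string."""
    return len(text), len(text.encode("utf-8"))

english = "Hello, world"
chinese = "你好，世界"  # the same greeting, written in Chinese

for sample in (english, chinese):
    chars, size = tweet_cost(sample)
    print(f"{sample!r}: {chars} characters, {size} bytes")
```

The Chinese greeting takes more bytes but fewer than half the characters, and characters are what count against the limit.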
What many Americans do have, however, is another input method: emoji. All told, there are about a thousand of the little symbols, representing everything from emotions to animals to food. Why not use those for a script?
Enter E-zi, the emoji syllabary. E-zi is a set of rules and conventions for turning English text into a string of symbols (emoji, but also letters, numbers, and the graphical symbols available on all keyboards) that are easy to input. Due to the nature of the English language, we can’t quite achieve the efficiency of Chinese or Japanese writing, but we can do better than plain text.
To get a taste of how E-zi works, here’s a sample. The text is the first article of the Universal Declaration of Human Rights:
All human beings are born free and equal in dignity and rights. They are endowed with reason and conscience and should act towards one another in a spirit of brotherhood.
In E-zi, this comes out as:
◻ 👤 Bns R 🐗📯 🆓 & = N 💫🔫🔼🍵 & 👉s. 👫 R N🔽d + ♻🌞 & 🚧🔀ns & 👞🔽 🎬 ▶ 1 A🔘* N A 👻 O 🔆🔘*❓d.
By my count, we’ve dropped from about 170 characters to 81. Many of those are spaces, however, and some can be removed without any loss of understanding. A lot of people do that already in cramped conditions like Twitter, with abbreviations like “txt”. This is just taking it a few steps further.
The core of the E-zi system is in the way emoji are read. By and large, this must be learned, unfortunately, but the basic ideas are pretty simple.
First, every non-alphanumeric symbol can be read in two ways: as a whole word (its lexical reading) or as a sound (its syllabic reading). Take ◻, for example, the first icon we used. When written as its own word, it should be read as “all” or “always”. Another symbol, 🔴, similarly indicates the word “red”, and 🔎 is “detail”.
When combined with other symbols (including letters and numbers), these are instead considered to be syllables or sounds. A given symbol can have lots of these syllabic or phonetic readings, but most are easily derived from the lexical form. For example, ◻ has a syllabic reading AHL (I’ll explain this notation later); 🔴 can be R, RE, or RED, with the last two being most common; and 🔎 reads as DEE or the rarer DEET. Put them together as the E-zi “word” ◻🔴🔎: AHL-RE(D)-DEE, or “already”.
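The two kinds of reading can be sketched as a pair of lookup tables. This is only an illustration, not an official E-zi tool: the tables hold just the three symbols discussed above, the function name `read_word` is made up, and the sketch always takes the first syllabic reading listed, where a human reader would pick whichever fits.

```python
# Readings taken from the examples in the text; the real chart is far larger.
LEXICAL = {"◻": "all", "🔴": "red", "🔎": "detail"}
SYLLABIC = {"◻": ["AHL"], "🔴": ["RE", "RED", "R"], "🔎": ["DEE", "DEET"]}

def read_word(word):
    """Read a lone symbol lexically and a combined word syllabically."""
    # Drop emoji variation selectors so each symbol is one code point.
    symbols = [ch for ch in word if ch != "\ufe0f"]
    if len(symbols) == 1 and symbols[0] in LEXICAL:
        return LEXICAL[symbols[0]]
    # Assumption: use the first listed reading; unknown symbols pass through.
    return "-".join(SYLLABIC.get(s, [s])[0] for s in symbols)

print(read_word("◻"))       # a lone symbol gets its lexical reading
print(read_word("◻🔴🔎"))   # combined symbols get syllabic readings
```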
This rebus principle is the heart of E-zi. Words are built as groups of sounds, with a large amount of leeway in which sounds can combine. A good illustration of this is the string I used to translate “dignity”: 💫🔫🔼🍵. This consists of four symbols, as you can see. The first, 💫, is read DI. 🔫, obviously, is “gun” or, in our phonetic notation, GUHN. The little 🔼 indicates a schwa sound, noted as UH. Finally, 🍵 is, well, “tea”. It’s not a perfect match for the English pronunciation of “dignity”, but it comes close enough that we understand what’s meant.
If we would like a lexical reading where we’d normally have a syllabic one, or vice versa, we can place an asterisk (*) directly after the symbol in question. We did that in the sample above, with A🔘* for “another”.
In the cases where there isn’t an appropriate symbol, we can resort to letters. But it’s a waste to have two sets of the same thing, upper and lower case, so E-zi repurposes the capital letters. Each one stands for a simple word, usually grammatical in nature:
| Letter | Reading |
|---|---|
| B | be (or “is”, etc.) |
| L | will (future tense) |
| Q | question, cue, or queue |
| X | ex-, or the word “ex” |
Lower-case letters, on the other hand, are absolutely necessary. They can be placed anywhere, and they mostly sound like their English equivalents. You’d also use them to spell out names and other words where the actual letters used are important. A few, however, also have special meaning when placed at the end of a symbolic word. These are based on a very simplified English grammar:
- “s” creates a plural form of any word: 🚗 “car”; 🚗s “cars”. This is always regular, so no need to worry about convoluted English plurals like “oxen”.
- “d” forms the past tense of a verb: one reading of ⚔ is “fight”, so ⚔d can be read “fought”. Again, totally regular.
- “n” makes a verb into a participle form. Using the same example, ⚔n should be read “fighting”. Similarly, Gn is “going”.
These can all be combined. In the larger text, we had Bns for “beings”. It might look odd in the middle of all the symbols, but it works just the same.
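Here is a rough Python sketch of how those endings could be peeled off a word. The function name `split_suffixes` and the labels are my own inventions. One caveat: the sketch can't tell a grammatical ending from a lower-case letter that is merely spelling out a sound, so a human reader still has to judge from context.

```python
# The three grammatical endings described above.
SUFFIXES = {"s": "plural", "d": "past", "n": "participle"}

def split_suffixes(word):
    """Peel grammatical suffix letters off the end of an E-zi word."""
    found = []
    # Always leave at least one symbol behind as the base of the word.
    while len(word) > 1 and word[-1] in SUFFIXES:
        found.append(SUFFIXES[word[-1]])
        word = word[:-1]
    # Suffixes were collected right to left; report them in reading order.
    return word, list(reversed(found))

print(split_suffixes("🚗s"))   # base 🚗 with a plural ending
print(split_suffixes("Bns"))   # base B, then participle, then plural
```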
E-zi also makes use of the other keys on your keyboard. Numbers work a bit like capital letters, in that they stand for full words—namely, their values. In other words, 1 is to be read as “one”, and so on. You can use the rebus principle to play with these (4m for “form”, ↩8 for “late”), but numbers have no syllabic or phonetic readings.
Most punctuation marks are used as they are in English. These include the period, comma, question mark, exclamation mark, semicolon, and colon. Quotation marks are, of course, used to mark quoted speech. The apostrophe, on the other hand, is reused as a general possessive suffix: M’ is “my”, T👨’ is “the man’s”, etc. Like “s” and the others, this ignores traditional English sensibilities in favor of logical regularity.
The other symbols are mainly those on the top row of your computer’s keyboard. The number sign # is reserved for use in hashtags, and the caret ^ indicates a signature. The others can be used as symbols:
| Symbol | Name | Reading |
|---|---|---|
| $ | dollar sign | dollar, money |
| + | plus sign | and, with |
| = | equals sign | is, equals |
| \| | vertical bar | or (choices) |
Most of these are lexical-only, but &, +, @, |, and ~ work as syllables, too.
Finally, the grave accent ` marks off text as “regular”, i.e., not E-zi script. You use it in pairs, like quotation marks. The other “grouping” symbols, like brackets and parentheses, are reserved for future use.
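Because grave accents come in pairs, separating E-zi from regular text is a matter of splitting on them and alternating. The sketch below assumes the accents are balanced; with a stray one, everything after it would be treated as regular text. The function name `split_segments` and the sample string are hypothetical.

```python
def split_segments(text):
    """Split text into ("ezi", ...) and ("regular", ...) pieces."""
    parts = text.split("`")
    # Even-numbered pieces sit outside the accents, odd-numbered ones inside.
    return [
        ("regular" if i % 2 else "ezi", part)
        for i, part in enumerate(parts)
        if part  # skip the empty pieces produced by adjacent accents
    ]

for kind, piece in split_segments("👤 `Twitter` 🆓"):
    print(kind, repr(piece))
```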
E-zi gives special meaning to two groups of Unicode symbols. The clock faces are a set of 24 symbols displaying times in half-hour increments. In E-zi, these simply indicate those times, with no other reading possible. For instance, 🕥 means the time 10:30 and nothing else.
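Since the clock faces sit in two consecutive runs of code points, the time can be worked out with simple arithmetic. This sketch relies on the Unicode layout, where the twelve on-the-hour faces start at U+1F550 and the twelve half-hour faces start at U+1F55C. The function name `clock_time` is mine.

```python
FIRST_HOUR = 0x1F550       # 🕐 clock face one o'clock
FIRST_HALF_HOUR = 0x1F55C  # 🕜 clock face one-thirty

def clock_time(symbol):
    """Return the time shown by a clock-face emoji, or None for anything else."""
    cp = ord(symbol)
    if FIRST_HOUR <= cp < FIRST_HOUR + 12:
        return f"{cp - FIRST_HOUR + 1}:00"
    if FIRST_HALF_HOUR <= cp < FIRST_HALF_HOUR + 12:
        return f"{cp - FIRST_HALF_HOUR + 1}:30"
    return None

print(clock_time("🕥"))  # the 10:30 face from the example above
```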
Another group is called the “regional indicator symbols” by Unicode. These are the letters A-Z in boxes, but they are considered the preferred way of marking countries. E-zi uses them in this way, always in pairs representing a country code. (In some places, this may cause them to be displayed as a national flag instead. E-zi takes that as a feature.) Thus, since the country code for the United States is US, the E-zi sequence 🇺🇸 can be read as “United States”. In addition, the emoji 👤 can be suffixed to create an abbreviated form representing a nationality: 🇺🇸👤 should be understood as “American”.
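Turning a pair of regional indicators back into a two-letter country code is just an offset from U+1F1E6, the boxed letter A. The sketch below stops at the code; going from "US" to "United States" would need a lookup table that I'm leaving out. The function names `country_code` and `describe` are made up for this example.

```python
REGIONAL_A = 0x1F1E6    # regional indicator symbol letter A
PERSON = "\U0001F464"   # 👤, the nationality suffix described above

def country_code(pair):
    """Convert two regional indicator symbols to letters, e.g. 🇺🇸 to "US"."""
    if len(pair) != 2:
        return None
    if not all(REGIONAL_A <= ord(c) < REGIONAL_A + 26 for c in pair):
        return None
    return "".join(chr(ord(c) - REGIONAL_A + ord("A")) for c in pair)

def describe(token):
    """Classify a token as a country or a nationality, with its code."""
    if token.endswith(PERSON):
        code = country_code(token[:-1])
        return ("nationality", code) if code else None
    code = country_code(token)
    return ("country", code) if code else None

print(describe("🇺🇸"))
print(describe("🇺🇸👤"))
```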
All other symbols are unused by E-zi. The “cat faces” such as 😸 are reserved for expansion, while the skin color modifiers (“Fitzpatrick modifiers”) are ignored.
English is a notoriously difficult language to pronounce, even before you worry about dialectal differences. Thus, I’m using a phonetic notation to record E-zi syllabic readings. Nothing’s perfect, but as long as we’re in the ballpark, communication will work.
Most consonants in the notation should be read as their IPA values. This means that K is preferred to C, KW instead of QU, and so on, so some words will look odd. A few, though, have to be changed:
Fortunately, I’ve taken the liberty of dismissing a few of the more common dialectal problems. Final “r” is assumed to always be there; it’s an America-centric view, but it makes things easier. It doesn’t mean you have to pronounce it, though. The “wh” sound, on the other hand, is merged with plain “w”, meaning “what” and “watt” are both WAHT. That was hard for me to do, as I still have a separate “wh” sound, but I realize I’m in the minority. Other dialectal sounds are also ignored; you’ll have to find their closest “standard” equivalents.
Vowels are even more problematic. I’ve taken the Wells lexical set and assigned a sequence of letters to each vowel sound in it. That gives us something reasonable to work with, although it can look a little strange at first. Don’t worry if some of these vowels sound the same in your dialect; that just means you have more opportunities for word play.
The big list
By my count, there are 970 emoji that have both lexical and syllabic readings. That’s way too many to list here, so I’ll just link a PDF chart that you can view at your leisure. Note that it’s sorted primarily by the least complex syllabic reading. My organization skills do leave something to be desired, however.
I hope E-zi is a fun and interesting way to make your short-form text more descriptive, more efficient, and more alive. Sure, I could have gone and made something with no relation to English whatsoever, but this way is much easier. Not only that, but it’s a method that stands the test of time. The basic principles of E-zi are the same used in Chinese script, Egyptian hieroglyphs, and Sumerian cuneiform, the oldest writing systems we know. In our modern world, however, we have new reasons to want the succinctness of syllabic and logographic script. Thus, unlike these others, E-zi is made with today’s needs in mind. It’s a twenty-first century take on a millennia-old idea.
All of E-zi is free for you to use however you see fit. No charge, no strings attached. So, have fun with it, and let me know if you use it!