Elan – Unicode and Emoji in your code

Everyone knows about Unicode. A massive set of thousands of characters, all ready to be used everywhere. Except in code. Sure, many programming languages allow accented letters (for example) in their names, and most support Unicode in comments or strings. But there are very few that harness the full power of the Universal Character Set. Most text editors are Unicode-aware, but we’re still forced to write <= instead of ≤.

On the other side of the coin, the younger generation has fully embraced Unicode, whether they realize it or not. The latest revision of the Unicode standard added a number of “emoji”, emoticons and pictographs that are popular in messaging apps and elsewhere. With just a few symbols, emoji can convey a large amount of information, such as moods and activities. It’s even possible to write a story using only emoji. But you can’t write code with them.

Until now.

Introducing Elan

Elan (Emoji LANguage) is a new programming language I have created to merge these two disparate worlds. In Elan, almost every Unicode character, from emoji to Greek letters to Chinese ideographs to musical notes, is usable. Every operation in the language can be shown with symbols, in addition to text. This functionality can give our code the same expressive power available to our smartphones. No longer do we have to be limited to ASCII. With Elan, the whole world is open to you.

In the rest of this post, I’ll give a brief description and tutorial on Elan. It’s always in development, though, and new features are on the horizon. The latest code is available at this Github repo, and I’ve also put up a live compiler.

A word about character support

I understand that not all systems, keyboards, and editors have support for entering the full range of Unicode characters. To help those unfortunate programmers that must use such hardware and software, Elan has an alternative input facility. Each symbol that can’t be typed on a US keyboard has an alternate form consisting of either a valid symbol or a few letters surrounded by colons. For example, the assignment operator ⬅ can also be written as :=:. The live compiler page lists all the symbols currently used in Elan with their corresponding alternates.
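One way to picture the alternate forms is as a plain text substitution that runs before parsing. The JavaScript below is only a sketch of that idea: the `ALIASES` table and the `expandAliases` function are hypothetical names, the table holds just a few of the aliases mentioned in this post, and the real compiler may handle alternates quite differently.

```javascript
// Hypothetical alias table: a handful of the aliases named in this post.
const ALIASES = {
  ":def:": "✏",
  ":end:": "◀",
  ":return:": "↩",
  ":with:": "✉",
  ":null:": "🚫",
};

// Replace every known :alias: with its symbol; unknown ones are left alone.
// Note: this naive version would also rewrite aliases inside string literals.
function expandAliases(source) {
  return source.replace(/:[a-z]+:/g, (alias) => ALIASES[alias] ?? alias);
}

console.log(expandAliases("{n} :def: n*2 :return: :end:"));
```

Anything the table doesn't recognize passes through untouched, so a typo in an alias would surface later as a parse error instead of being silently dropped.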

Variables

Variables are just names attached to values. In Elan, a name can be a string of letters and numbers from any language, as long as you start with a letter. But Elan also allows single symbols as variable identifiers. (Specifically, they have to be symbols from outside the Basic Multilingual Plane of Unicode, but this is a current limitation of the compiler.)

This means that you can have variables named any of the following:
xyz ALongVariableNameWithMixedCase élan π こんにちわ 🌈
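The "outside the Basic Multilingual Plane" rule boils down to a code point check. Here is a minimal JavaScript sketch of such a check; `isAstralSymbol` is a made-up helper name, and the compiler's actual test may differ.

```javascript
// True when the string is exactly one character whose code point lies
// above U+FFFF, i.e. outside the Basic Multilingual Plane.
function isAstralSymbol(name) {
  const chars = Array.from(name); // splits by code point, not UTF-16 unit
  return chars.length === 1 && chars[0].codePointAt(0) > 0xffff;
}

console.log(isAstralSymbol("\u{1F308}")); // true: the rainbow emoji, U+1F308
console.log(isAstralSymbol("\u03C0"));    // false: π is in the BMP (valid as a letter instead)
```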

Variables can be assigned values. In many languages, this uses the equals sign (=), which causes confusion. With Elan’s support for Unicode, we can use ⬅ instead:

x ⬅ 1; twox ⬅ x * 2; こんにちわ ⬅ "hello"; TheAnswer ⬅ 42

As you can see, semicolons can be used to separate statements, but newlines work just as well.

Operators

Every language needs a set of operators. Elan includes the following:

  • The assignment operator ⬅, which you just saw.
  • The usual arithmetic operators: + - * / for addition, subtraction, multiplication, and division. The minus sign can also be used for negative numbers.
  • The modulo operator %, similar to many other languages. (This may get an alternate symbol in a later version.)
  • The power operator ^, which raises one value to the power of another. (e.g., 3^3 = 27)
  • Comparison operators < > = ≤ ≥ ≠. Note that, because we don’t use the equals sign for assignment, we can use it for comparison, just like in math class. (If you can’t type the “or equal” symbols, you can also use <= >= /=.)
  • Logical AND and OR, as & and |. (Logical NOT is an oversight that I’ll fix in a future version.)
  • Increment and decrement operators, ⬆ (:++:) and ⬇ (:--:), that take a variable and either add 1 or subtract 1.
  • A lot more that I’ll show you in the sections below.

Expressions can be as complicated as you need them to be. The order of operations should be familiar if you’ve used any other programming language, with multiplication coming before addition, etc. You can also use parentheses to make your intent clear.
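If you know JavaScript, a few of these operators look familiar but mean something different: there, ^ is bitwise XOR and = is assignment. The lines below are a hand-written comparison, not compiler output, and they assume strict comparison (the post doesn't say whether Elan's = is strict or loose).

```javascript
// Elan: 3^3            (power)
console.log(3 ** 3);      // 27

// Elan: x = 10         (comparison, not assignment)
const x = 10;
console.log(x === 10);    // true

// Elan: x ≠ 3  or  x /= 3
console.log(x !== 3);     // true

// Elan: 2 + 3 * 4      (multiplication binds tighter than addition)
console.log(2 + 3 * 4);   // 14
```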

Types of Values

Elan doesn’t have a static type system like Java or C++. But it does have the notion of different kinds of values: numbers, strings, booleans, lists, functions, objects, and null. Most of these should be obvious: numbers are numbers, strings are text inside quotes. Booleans can be either true (✔ or :t:) or false (✖ or :f:). Lists are like arrays in other languages, with multiple values accessed by number, while objects have values that are accessed by name. Functions are blocks of code that you can call. Null (🚫 or :null:) is simply the absence of a value.

Examples of values:

  • Numbers: 1 42 0.5 -17
  • Strings: "c" "string with spaces" "512"
  • Lists: {1,2,3} {"a","b"} {}
  • Booleans: :f: ✔
  • Null: 🚫

Functions and objects will be discussed below.
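For comparison, here are the same kinds of values written as JavaScript literals. This is just a side-by-side for orientation; the `samples` object is an illustrative container, not something Elan defines.

```javascript
const samples = {
  number: -17,
  string: "512",     // a string, even though it looks like a number
  list: [1, 2, 3],   // Elan: {1,2,3}
  emptyList: [],     // Elan: {}
  yes: true,         // Elan: ✔ or :t:
  no: false,         // Elan: ✖ or :f:
  nothing: null,     // Elan: 🚫 or :null:
};

console.log(typeof samples.string, Array.isArray(samples.list), samples.nothing === null);
// string true true
```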

Conditional Expressions and Statements

The conditional expression is similar to other languages’ ternary operators. Given a condition and two options, it returns the first option if the condition is true, the second if it is false. The general format of a conditional is:

condition ? true-expression ! false-expression !

For example, x < 10 ? "less" ! "more" ! produces the string "less" if x is less than 10, otherwise it is "more". (Of course, this doesn't handle the case where x equals 10, but that's for another time.)

If you don't need the "else" portion of the conditional, it still has to be present, but you can leave it blank: x < 0 ? x ⬅ 0!!.
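In JavaScript terms, the conditional expression corresponds to the ternary operator. Below is a rough, hand-written equivalent of the two examples above. The function names `describe` and `clampToZero` are placeholders, and rendering the blank "else" as a plain `if` is an assumption, since the post doesn't say how the compiler translates it.

```javascript
// "less" when x is below 10, otherwise "more" (so 10 itself gives "more").
function describe(x) {
  return x < 10 ? "less" : "more";
}

// Elan: x < 0 ? x ⬅ 0!!   (the "else" branch is left blank)
function clampToZero(x) {
  if (x < 0) {
    x = 0;
  }
  return x;
}

console.log(describe(3), describe(10), clampToZero(-5)); // less more 0
```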

A conditional statement is slightly different. It doesn't produce a value, so you can't assign it to a variable, but it can include multiple statements, including any of those shown below, such as loops or function definitions.

Functions

Functions allow us to do things. As such, they are an important part of Elan. To define a function, you use the ✏ (:def:) symbol. To its left are the parameters of the function, written as a list. To the right is a block of statements ended by the ◀ (:end:) symbol. You can have an empty parameter list with {}, but the body of the function must have at least one statement.

Inside the function, you can return a value by writing that value followed by the ↩ (:return:) symbol. (To return a boolean value, you can instead use the shortcut symbols 👍 (:yes:) and 👎 (:no:) for true and false.)

How you call a function depends on whether it requires parameters. For a function that doesn't, you can use the :call: symbol, which you can write as either 📞 or 📱. To pass parameters to a function, you first use the ✉ (:with:) symbol with a list of parameters, then call the function as before. For example, 📞 f and ✉ {a,b} 📱 g.

Examples of functions include:

  • {n} ✏ n*2 ↩ ◀, which doubles any number passed to it.
  • {s} ✏ s.length ≥ 1 ? 👍 ! 👎 ! ◀, which returns true if a string is not empty.
  • {x,y} ✏ ✉ {(x^2 + y^2)} 📞 Math.sqrt ↩ ◀, the Pythagorean function.
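For readers more at home in JavaScript, the three examples above read roughly as follows. This is a hand-written translation, not actual compiler output, and the names `double`, `notEmpty`, and `hypotenuse` are invented here for illustration.

```javascript
// Elan: {n} ✏ n*2 ↩ ◀
const double = function (n) { return n * 2; };

// Elan: {s} ✏ s.length ≥ 1 ? 👍 ! 👎 ! ◀
const notEmpty = function (s) {
  if (s.length >= 1) { return true; } else { return false; }
};

// Elan: {x,y} ✏ ✉ {(x^2 + y^2)} 📞 Math.sqrt ↩ ◀
const hypotenuse = function (x, y) { return Math.sqrt(x ** 2 + y ** 2); };

console.log(double(21), notEmpty(""), hypotenuse(3, 4)); // 42 false 5
```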

By themselves, functions you define have no names, but they can be assigned to variables to give them names. Doing this obviously allows you to call your own functions, and it also enables recursion, as in the classic Fibonacci function:

fib ⬅ {n} ✏ n < 2 ? 1 ↩ ! ✉ {n-1} 📞 fib + ✉ {n-2} 📞 fib ↩ ! ◀
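Read as JavaScript, the definition above would look something like the sketch below. It assumes the two recursive calls are added together before the return, which is the natural reading but isn't spelled out in the post. Note that this version starts the sequence at 1 for both n = 0 and n = 1.

```javascript
const fib = function (n) {
  if (n < 2) {
    return 1;
  }
  return fib(n - 1) + fib(n - 2);
};

console.log([0, 1, 2, 3, 4, 5].map((n) => fib(n)).join(", ")); // 1, 1, 2, 3, 5, 8
```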

Iteration and Loops

Elan has two kinds of looping. The first is iteration, which goes through a list and executes a block of code for each value in the list. (This is like a for loop in other languages.) For this, we can use the 🔁 (:iter:) symbol. On the left goes the list, which can also be a variable holding a list, while the code block goes to the right. Like with functions, the code block ends at the ◀ symbol. Inside the block, you'll most likely need the current value, and this can be obtained with the :i: symbol, which can be written as any of ☝, 👆, or 👇.

Examples:

  • {1,2,3,4,5} 🔁 ✉ {☝^2} 📞 console.log ◀ prints the squares of the numbers 1 to 5 in order.
  • values 🔁 ✉ {☝} 📞 someFunction ◀ calls someFunction with each value in the values list. (This pattern is so common that it might get its own shortcut symbol in a future version.)
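In JavaScript, the closest match for iteration is a `for...of` loop, with the current-value symbol turning into the loop variable. The sketch below is hand-written rather than compiler output; `current`, the sample `values` array, and the stand-in `someFunction` are all names chosen for illustration.

```javascript
// Squares example: the list becomes an array, ☝ becomes the loop variable.
for (const current of [1, 2, 3, 4, 5]) {
  console.log(current ** 2); // 1, 4, 9, 16, 25, one per line
}

// The "call someFunction with each value" pattern.
const values = ["a", "b", "c"];                        // sample data
const seen = [];
const someFunction = (value) => { seen.push(value); }; // stand-in function
for (const current of values) {
  someFunction(current);
}
console.log(seen.join("")); // abc
```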

Instead of iterating through a list, you can also set up a loop that runs for as long as a condition holds, or until you specifically tell it to stop. These loops (called while loops in many languages) are defined by the :loop: symbol, either 🔀 or 🔃. This symbol is followed by the block to execute each time through the loop, and it can be preceded by a condition; the loop ends once that condition becomes false.

Inside the loop, you can use the ◼ (:stop:) symbol to "break out of" the loop, ending it immediately. (This is the only way to stop a loop without a condition.) Also, the ⏩ (:ff: or :continue:) symbol skips the rest of the block, returning to the beginning of the loop.

For example, x ⬅ 1; x < 1000 🔀 ✉ {x} 📞 f; x ⬅ x*2 ◀ passes successive powers of 2 to the function f, stopping once x reaches 1000 or more.
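The same loop written by hand as a JavaScript `while` makes it easy to see exactly which values reach f. Here `f` is a stand-in that simply records its argument, and `received` is a name invented for this sketch.

```javascript
// f is a stand-in: it records each argument so the calls can be inspected.
const received = [];
const f = (value) => { received.push(value); };

let x = 1;
while (x < 1000) {
  f(x);
  x = x * 2;
}

console.log(received.length, received[received.length - 1]); // 10 512
```

The last value passed is 512; the next doubling gives 1024, which fails the condition and ends the loop.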

Choices

For choosing one path of code from among many, Elan has the choice statement, similar to other languages' switch. The general format of the choice statement is:

test-expression :switch: default-block :case: choice-1 :do: code-block :case: choice-2 :do: code-block etc.

The :switch: symbol can be written as 💡, while :case: can be either ☑ or 🔘. The :do: symbol is ➡, as in the example below.

Example: {x} ✏ x 💡 "many" ↩ ◀ ☑ 1 ➡ "one" ↩ ◀ ☑ 2 ➡ "two" ↩ ◀ ◀ defines a function that returns a word based on its parameter. The number 1 becomes "one", 2 becomes "two", and any other number returns the value "many".
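A JavaScript `switch` expresses the same thing, with the default block moved to the end. This is a hand-written equivalent for comparison; `numberWord` is a made-up name, and the compiler's real output may be shaped differently.

```javascript
function numberWord(x) {
  switch (x) {
    case 1:
      return "one";
    case 2:
      return "two";
    default:
      return "many";
  }
}

console.log(numberWord(1), numberWord(2), numberWord(7)); // one two many
```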

Objects and Lists

Objects in the current version of Elan function in the same way as their Javascript equivalents. You create an object by using the :object: symbol, written as either ☺ or 📦. This is followed by a block containing assignments. Each assignment defines a property of the object.

Example: o ⬅ ☺ x ⬅ 42 ◀ creates a variable o which is an object. This object has a single property called x, which is set to 42.

(Note: In the current version of Elan, lists are essentially just objects whose properties are consecutive integers starting from 0. In other words, a list like {"a","b","c"} can be considered an object with properties 0, 1, and 2.)

You can access the properties of an object in one of two ways. First, the dot (.) works just like in Javascript, meaning that o.x on the above object will give us the value 42. The underscore (_) does exactly the same thing, but it is set to a much lower precedence, allowing you to use an expression like a_b-1.

New objects can be created with the 🔨 (:new:) symbol, which, at the moment, works just like the Javascript new. Example: a ⬅ 🔨 Array.
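Since all of this mirrors JavaScript, the examples in this section can be sketched there directly. The reading of a_b-1 as "index a with b-1" is an assumption based on the precedence remark above, and the variable names `letters` and `fresh` are only for illustration.

```javascript
// Elan: o ⬅ ☺ x ⬅ 42 ◀
const o = { x: 42 };
console.log(o.x); // 42

// A list behaves like an object whose properties are 0, 1, 2, ...
const letters = ["a", "b", "c"];
console.log(letters[0], Object.keys(letters).join(",")); // a 0,1,2

// Assumed reading of a_b-1: index a with the value of b-1.
const a = [10, 20, 30];
const b = 3;
console.log(a[b - 1]); // 30

// Elan: a ⬅ 🔨 Array
const fresh = new Array();
console.log(fresh.length); // 0
```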

Error Handling

Currently, errors are handled using something like Javascript's exception handlers.

The general form for handling errors is:
:try: attempted-statements :catch: error-variable error-handler

The :try: symbol can be 🔍 or 🔎, while :catch: is ✋. If an error occurs in the "try" block, then the "catch" block will be executed. The error variable is set when the error happens, and it is only used for the catch block.

You can throw errors from your own code with the :throw: symbol, written as ⚾, 💣, or 💩. It is followed by an expression that will be passed as the error variable to any block that catches it. Example: 💩 "Something went wrong".
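For reference, this is the JavaScript pattern that the try, catch, and throw symbols resemble. It's a plain hand-written example, with `riskyStep` and `message` as invented names, and says nothing about the code the compiler actually generates.

```javascript
// riskyStep is a made-up function that always fails, for illustration.
function riskyStep() {
  throw "Something went wrong"; // like: 💩 "Something went wrong"
}

let message = "no error";
try {
  riskyStep();
} catch (err) {
  // err holds the thrown expression and exists only inside this block.
  message = "caught: " + err;
}

console.log(message); // caught: Something went wrong
```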

Conclusion

There are a lot more things that a language probably should do, but Elan is complete enough that you can play with it, tinker with it. I have plenty more ideas, but it's best to start small.

I hope you like what you've seen. If you have any comments, bug reports, or feature requests, leave them here or at the Github repo.

Have fun! 😎

2 thoughts on “Elan – Unicode and Emoji in your code”

  1. Addendum:

    First off, happy April Fools’ Day! I hope you liked the joke.

    Elan is my attempt at joining the long line of programming languages with more humor value than practical value, like INTERCAL, Malbolge, LOLCODE, and Perl 6. That said, the best jokes have a message, and Elan is no different.

    Although most of the language is in jest, I do think that non-ASCII characters have their place in programming. Unicode is a fact of life at this point, something we really couldn’t say a decade ago. Most of the more modern programming languages understand this. Javascript, for example, allows its identifiers to contain any “letter” with a codepoint under 0xFFFF (but not the higher, “astral”, characters, which was an actual problem when I was writing Elan’s code generator). Perl 6 actually does allow ≤ and ≥ as synonyms for <= and >=. But even in these cases more could be done, and "legacy" languages like C++ don't even have that option.

    Emoji, OTOH, I could do without. But I like to think of myself as an amateur linguist, and I have to admit that the idea is interesting. After all, China has gotten by with "writing in pictures" for a few thousand years. The emoji characters really go back to the most ancient style of writing, and I have more ideas on that front that will have to wait.

    Of the Elan language itself, there isn't really much notable, IMO. It compiles in a fairly straightforward way into Javascript. The only real translation problems I encountered were in the handling of conditionals and for-loops, but even then I made things easier on myself by making the most common uses map to the simplest Javascript. The whole thing was really a learning experience for me, since I'd never seriously used Jison or Gulp before. Plus, I don't have the "fear of rejection" anxiety that I usually get when I post code publicly, because I know that the whole thing is silly. Anybody that takes it seriously is missing the joke.

    That said, I don't mind constructive criticism. I can't guarantee that I'll support Elan, and it certainly isn't going to be the next CoffeeScript or Typescript, but I don't mind taking requests. Besides, there's always next year...
