Assembly: the first steps


As I’ve been saying throughout this little series, assembly is the closest we programmers can get to bare metal. On older systems, it was all but necessary to forgo the benefits of a higher-level language, because the speed gains from using assembly outweighed the extra developer time needed to write it. Nowadays, of course, the pendulum has swung quite far in the opposite direction, and assembly is usually only used in those few places where it can produce massive speedups.

But we’re looking at the 6502, a processor that is ancient compared to those of today. And it didn’t have the luxury of high-level languages, except for BASIC, which wasn’t much better than a prettified assembly language. The 6502, before you add in the code stored in a particular system’s ROM, couldn’t even multiply two numbers, much less perform complex string manipulation or operate on data structures.

This post has two code samples, written by myself, that demonstrate two things. First, they show you what assembly looks like, in something more than the little excerpts from last time. Second, they illustrate just how far we’ve come. These aren’t all that great, I’ll admit, and they’re probably not the fastest or smallest subroutines around. But they work for our purposes.

A debug printer

Debugging is a very important part of coding, as any programmer can (or should) agree. Assembly doesn’t give us too much in the way of debugging tools, however. Some assemblers do, and you might get something on your particular machine, but the lowest level doesn’t even have that. So this first snippet prints a byte to the screen in text form.

; Prints the byte in A to the address ($10),Y
; as 2 characters, then a space
printb:
    tax          ; save for later
    ; Some assemblers prefer these as "lsr a" instead
    lsr          ; shift A right 4 bits
    lsr          ; this moves the high bits to the bottom
    lsr
    lsr
    jsr outb     ; we use a subroutine for each character
    txa          ; reload A
    and #$0F     ; mask out the top 4 bits
    jsr outb     ; now print the bottom 4 bits
    lda #$20     ; $20 = ASCII space
    sta ($10),Y
    iny
    rts
outb:
    clc
    adc #$30     ; ASCII codes for digits are $30-$39
    cmp #$3A     ; anything past "9" is a letter, not a digit
    bmi digit    ; below $3A: it's a digit 0-9
    clc
; Comment out this next line if you're using 6502asm.com
    adc #$07     ; ASCII codes for A-F are $41-$46
digit:           ; either way, we end up here
    sta ($10),Y
    iny          ; move the "cursor" forward
    rts

You can call this with JSR printb, and it will do just what the comments say: print the byte in the accumulator. You’d probably want to set $10 and $11 to point to video memory. (On many 6502-based systems, that starts at $0400.)
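For example, a minimal setup and call might look like this (just a sketch; I'm assuming screen memory at $0400, and $C7 is an arbitrary byte to display):

    lda #$00     ; point $10-$11 at $0400
    sta $10      ; low byte first
    lda #$04
    sta $11
    ldy #$00     ; Y is the offset into the line
    lda #$C7     ; the byte we want to see
    jsr printb   ; displays "C7 " and advances Y by 3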

Now, how does it work? The comments should help you—assembly programming requires good commenting—but here’s the gist. Hexadecimal is the preferred way of writing numbers when using assembly, and each hex digit corresponds to four bits. Thus, our subroutine takes the higher four bits (sometimes called a nibble, and occasionally spelled as nybble) and converts them to their ASCII text representation. Then it does the same thing with the lower four bits.

How does it do that part, though? Well, that’s the mini-subroutine at the end, starting at the label outb. I use the fact that ASCII represents the digits 0-9 as hexadecimal $30-$39. In other words, all you have to do is add $30. For hex A-F, this doesn’t work, because the next ASCII characters are punctuation. That’s what the CMP #$3A...BMI digit check is for: anything below $3A is a digit, so the code branches past the correction; otherwise, it adds a further $07 to land in the letters. The nibble $C, for instance, becomes $C + $30 = $3C, and adding $07 gives $43, the ASCII code for “C”. (Since the online assembler doesn’t support true text output, we can comment out this adjustment; we’re only printing pixels, and those don’t need to be changed.)

This isn’t much, granted. It’s certainly not going to replace printf anytime soon. Then again, printf takes a lot more than 34 bytes. Yes, that’s all the space this whole subroutine needs, although it’s still about 1/2000 of the total memory of a 6502-based computer.

If you’re using the online assembler, you’ll probably want to hold on to this subroutine. Coders using a real machine (or emulation thereof) can use the available ROM routines. On a Commodore 64, for example, you might be able to use JSR $FFD2 instead.
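Printing a single character that way might look like this (a sketch, assuming the C64 KERNAL, where $FFD2 is the CHROUT routine that prints the character in A):

    lda #$41     ; the character code for "A"
    jsr $FFD2    ; KERNAL CHROUT prints it at the cursor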

Filling a gap

As I stated above, the 6502 processor can’t multiply. All it can do, as far as arithmetic is concerned, is add and subtract. Let’s fix that.

; Multiplies two 8-bit numbers at $20 and $21
; Result is a 16-bit number stored at $22-$23
; Uses $F0-$F2 as scratch memory
multi:
    ldx #$08    ; X holds our counter
    lda #$00    ; clear our result and scratch memory
    sta $22     ; these start at 0
    sta $23
    sta $F1

    lda $20     ; these can be copied
    sta $F0
    lda $21
    sta $F2

nxbit:
    lsr $F2
    bcc next    ; if no carry, skip the addition
    clc
    lda $22     ; 16-bit addition
    adc $F0
    sta $22
    lda $23
    adc $F1
    sta $23

next:
    asl $F0     ; 2-byte shift
    rol $F1
    dex         ; if our counter is > 0, repeat
    bne nxbit
    rts

This one will be harder to adapt to a true machine, since we use a few bytes of the zero page for “scratch” space. When you only have a single arithmetic register, sacrifices have to be made. On more modern machines, we’d be able to use extra registers to hold our temporary results. (We’d also be more likely to have a built-in multiply instruction, but that’s beside the point.)

The subroutine uses a well-known algorithm, sometimes called peasant multiplication, that actually dates back thousands of years. I’ll let Wikipedia explain the details of the method itself, while I focus on the assembly-specific bits.

Basically, our routine is only useful for multiplying a byte by another byte. The result of this is a 16-bit number, which shouldn’t be too surprising. Of course, we only have an 8-bit register to use, so we need to do some contortions to get things to work, one of the problems of using the 6502. (This is almost like a manual version of what compilers call register spilling.)

What’s most important for illustrative purposes isn’t the algorithm itself, though, but the way we call it. We have to set things up in just the right way, with our values at the precise memory locations; we must adhere to a calling convention. When you use a higher-level language, the compiler takes care of this for you. And when you use assembly to interface with higher-level code (the most common use for it today), it’s something you need to watch.
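Here’s what a call looks like under our little convention (a sketch; the factors 13 and 10 are arbitrary):

    lda #$0D     ; first factor: 13
    sta $20      ; our convention: operands at $20 and $21
    lda #$0A     ; second factor: 10
    sta $21
    jsr multi    ; result: $22 = $82, $23 = $00 (130, low byte first)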

As an example, take an x86 system using the GCC compiler and its traditional 32-bit calling convention. When you call a C function, the compiler emits a series of instructions to account for the function’s arguments and return value. Arguments are pushed to the stack in a call frame, then the function is called. It accesses those arguments by something like the 6502’s indexed addressing mode, then it does whatever it’s supposed to do, and returns a result either in a register (or two) or at a caller-specified memory location. Then, the caller manipulates the stack pointer—much faster than repeatedly popping from the stack—to remove the call frame, and continues execution.

No matter how it’s done, assembly code that’s intended to connect to higher-level libraries—whether in C or some other language—has to respect that language’s calling conventions. Other languages have to do the same: that’s what extern "C" is for in C++, and it’s also why many other languages have a foreign function interface, or FFI. In our case, however, we’re writing those libraries, and the 6502 is such a small and simple system, so we can make our own calling conventions. And that’s another reason we need good documentation when coding assembly.

Coming up

We’ll keep going through this wonderful, primitive world a while longer. I’ll touch on data structures, because they have a few interesting implications when working at this low level, but we won’t spend too much time on them. After that, who knows?

Alternate histories

For a lot of people, especially writers and other dreamers, one of the great questions, one that provokes endless thought, debate, and even argument, is “What if?” What if one single part of history were changed? What would be the result? These alternate histories are fairly popular, as fictional sub-genres go, and they aren’t limited to the written word. It’s a staple of Star Trek series, for example, to travel into the past or visit the “mirror universe”, either of which involves a specific change that can completely alter the present (their present, mind you, which would be our future).

What-if scenarios are also found in nonfiction works. Look at the history section of your favorite bookstore, digital or physical. You’ll find numerous examples asking things like “What if the D-Day invasion failed?” or (much earlier in the timeline) “What if Alexander had gone west to conquer, instead of east?” Some books focus on a single one of these questions, concocting an elaborate alternative to our known history. Others stuff a number of possibilities in a single work, necessarily giving each of them a less-detailed look.

And altering the course of history is a fun diversion, too. Not only that, but it can make a great story seed. You don’t have to write a novel of historical fiction to use “real” history and change things around a little bit. Plenty of fantasy is little more than a retelling of one part of the Middle Ages, with only the names changed to protect the innocent. Sci-fi also benefits, simply because history, in the broadest strokes, does repeat itself. The actors are different, but the play remains the same.

Divergence

So, let’s say you do want to construct an alternate timeline. That could easily fill an entire book—there’s an idea—but we’ll stick to the basics in this post. First and foremost, believability is key. Sure, it’s easy to say that the Nazis and Japanese turned the tide in World War II, eventually invading the US and splitting it between them. (World War II, by the way, is a favorite for speculators. I don’t know why.) But there’s more to it than that.

The Butterfly Effect is a well-known idea that can help us think about how changing history can work. As in the case of the butterfly flapping its wings and causing a hurricane, small differences in the initial conditions can grow into much larger repercussions. And the longer the time since the breakaway point, the bigger the changes will be.

I’m writing this on September 21, and some of the recent headlines include the Emmy Awards, the Greek elections, and the Federal Reserve’s decision to hold interest rates, rather than raising them. Change any bit of any of these, and the world today isn’t going to be much different. Go back a few years, however, and divergences grow more numerous, and they have more impact. Obviously, one of the biggest events of the current generation is the World Trade Center attacks in 2001. Get rid of those (as Family Guy did in one of their time-travel episodes), and most of the people alive today would still be here, but the whole world would change around them.

It’s not hard to see how this gets worse as you move the breakaway back in time. Plenty of people—including some that might be reading this—have ancestors that fought in World War II. And plenty of those would be wiped out if a single battle went differently, if a single unit’s fortunes were changed. World War I, the American Civil War (or your local equivalent), and so on: the further back the turning point, the greater the difference in the final outcome. Go back in time to assassinate Genghis Khan before he began his conquests, for instance, and millions of people in the present never would have been born.

Building a history

It’s not just the ways that things would change, or the people that wouldn’t have lived. Those are important parts of an alternate history, but they aren’t the only parts. History is fractal. The deeper you go, the more detail you find. You could spend a lifetime working out the ramifications of a single change, or you could shrug it off and focus on only the highest levels. Either way is acceptable, but they fit different styles.

The rest of this post is going to look at a few different examples of altering history, of changing a single event and watching the ripples in time that it creates. They go in reverse chronological order, and they’re nothing more than the briefest glances. Deeper delving will have to wait for later posts, unless you want to take up the mantle.

Worked example 1: The Nazi nuke

Both ways of looking at alternate timelines, however, require us to follow logical pathways. Let’s look at the tired, old scenario of Germany getting The Bomb in WWII. However it happens, it happens. It’s plausible—the Axis powers produced a wealth of scientific talent, even if much of it, including Albert Einstein and Enrico Fermi, fled or defected around that time (Wernher von Braun, by contrast, worked for Germany through the war). It’s not that great a leap to say that the atomic bomb could be pushed up a couple of years.

But what does that do to the world? Well, it obviously gives the Axis an edge in the war; given their leaders’ tendencies, it’s not too much of a stretch to say that such a weapon would have been used, possibly on a large city like London. (In the direst scenario, it’s used on Berlin, to stop the Red Army.) Nuclear weapons would still have the same production problems they had in our 1940s, so we wouldn’t have a Cold War-era “hundreds of nukes ready to launch” situation. At most, we’d have a handful of blasts, most likely on big cities. That would certainly be horrible, but it wouldn’t really affect the outcome of the war that much, only the scale of destruction. The Allies would likely end up with The Bomb, too, whether through parallel development, defections, or espionage. In this case, the Soviets might get it earlier, as well, which might lead to a longer, darker Cold War.

There’s not really a logical path from an earlier, more widespread nuclear weapon to a Nazi invasion of America, though. Russia, yes, although their army would have something to say about that. But invading the US would require a severe increase in manpower and a series of major victories in Europe. (The Japanese, on the other hand, wouldn’t have nearly as much trouble, especially if they could wrap up their problems with China.) The Man in the High Castle is a good story, but we need more than one change to make it happen.

Worked example 2: The South shall rise

Another what-if that’s popular with American authors involves the Civil War. Specifically, what if the South, the Confederacy, had fought the Union to a stalemate, or even won? On the surface, this one doesn’t have as much military impact, although we’d need to tweak the manpower and supply numbers in favor of our new victors. (Maybe France offered their help or something.) Economically and socially, however, there’s a lot of fertile ground for change.

Clearly, the first and most obvious difference would be that, in 1865 Dixie, slavery would still exist. That was, after all, the main reason for the war in the first place. So we can accept that as a given, but that doesn’t necessarily mean it would be the case 150 years later. Slavery started out as an economic measure as much as a racial one. Plantations, especially those growing cotton, needed a vast amount of labor. Slaves were seen as the cheapest and simplest way of filling that need. The racial aspects only came later.

Even by the end of the Civil War, however, the Industrial Revolution was coming into full force. Steam engines were already there, and railroads were growing all around. It’s not too far-fetched to see the South investing into machinery, especially if it turns out to be a better, more efficient, less rebellious method of harvesting. It’s natural—for a Yankee, anyway—to think of Southerners as backwards rednecks, but an independent Confederacy could conceivably be quite advanced in this specific area. (There are problems with this line of reasoning, I’ll admit. One of those is that the kind of cotton grown in the South isn’t as amenable to machine harvesting as others. Still, any automation would cut down on the number of slaves needed.)

The states of the Confederacy depended on agriculture, and that wouldn’t change much. Landowners would be reluctant to give up their slaves—Southerners, as I know from personal experience, tend to be conservative—but it’s possible that they could be wooed by the economic factors. The more farming can be automated, the less sense it makes for servile labor. Remember, even though slaves didn’t have to be paid, they did have costs: housing, for example. (Conversely, slavery can still exist if the economic factors don’t add up in favor of automation. We can see the same thing today, with low-wage, illegal immigrant labor, a common “problem” in the South.)

Socially, of course, the ramifications of a Confederate victory would be much more important. It’s very easy to imagine the racism of slavery coming to the fore, even if automation ends the practice itself. That part might not change much from our own history, except in the timing. Persecuted, separated, or disfavored minorities are easy to find in the modern world, and their experiences can be a good guide here. Not just the obvious examples—the Palestinians, the Kurds, and the natives of America and Australia—but those less noteworthy, like the Chechens or even the Ainu. Revolt and rebellion might become common, even to the point of developing autonomous regions.

This might even be more likely, given the way the Confederacy was made. It was intended to be a weak national government with strong member states, more like the EU than the US. That setup, as anyone familiar with modern Europe will attest, almost nurtures the idea of secession. It’s definitely within the realm of possibility that the Confederate states would break up even further, maybe even to the point of individual nations, and a “black” state might splinter off from this. If you look closely, you can see that the US became much more centralized after the Civil War, giving more and more power to the federal government. The Confederates might have to do that, too, which would smack of betrayal.

Worked example 3: Gibbon’s nightmare

One of the other big “change the course of history” events is the fall of the Roman Empire, and that will be our last example today. How we prevent such a collapse isn’t obvious. Stopping the barbarian hordes from sacking Rome really only buys time; the whole system was hopelessly corrupt already. For the sake of argument, let’s say that we’ve found the single turning point that will stop the whole house of cards from falling. What does this do to history?

Well, put simply, it wrecks it. The Western world of the last fifteen hundred years is a direct result of the Romans and their fall. Now, we can salvage a lot by deciding that the ultimate event merely shifted power away from Rome, into the Eastern (Byzantine) Empire centered on Constantinople. That helps a lot, since the Goths and Vandals and Franks and whatnot mostly respected the authority of the Byzantines, at least in the beginning. Doing it like this might delay the inevitable, but it’s not the fun choice. Instead, let’s see what happens if the Roman Empire as a whole remains intact. Decadent, perhaps, and corrupt at every level, but whole. What happens next?

If we can presume some way of keeping it together over centuries, down to the present day, then we have a years-long project for a team of writers, because almost every aspect of life would be different. The Romans had a slave economy (see above for how that plays out), a republican government (in form, at least), and some pretty advanced technology, especially compared to their immediate successors. We can’t assume that all of this would carry down through the centuries, though. Even the Empire went through its regressive times. The modern world might be 400 years more advanced, but it’s just as likely that development would be set back by a hundred years or more. The Romans liked war, and war is a great driver of technology, but you eventually run out of people to fight, and a successful empire requires empire-building. And a Pax Romana can lead to stagnation.

But the Dark Ages wouldn’t have happened, not like they really did. The spread of Islam might have been stopped early on, or simply contained in Arabia, but that would have also prevented their own advances in mathematics and other sciences. The Mongol invasions could have been stopped by imperial armies, or they could have been the ruin of Rome on a millennium-long delay. Exploration might not have happened at the same pace, although expeditions to the Orient would be an eventual necessity. (It gets really fun if you posit that China becomes a superpower in the same timeline. You could even have a medieval-era Cold War.)

Today’s world, in this scenario, would be different in every way, especially in the West. Medieval Europe was held together by the Christian Church. Our hypothetical Romans would have that, sure, but also the threat of empire to go with it. Instead of the patchwork of nation-states that marked the Middle Ages, you would have a hegemony. There might be no need for the Crusades, but also no need for the great spiritual works iconic of the Renaissance. And how would political theory grow in an eternal empire? It likely wouldn’t; it’s only when people can see different states with different systems of government that such things come about. If everybody is part of The One Empire, what use is there in imagining another way of doing things?

I could go on, but I won’t. This is a well without a bottom, and it only gets deeper as you fall further. It’s the Abyss, and it can and will stare back at you. One of my current writing projects involves something like an alternate timeline—basically, it’s a planet where Native Americans were allowed to develop without European influence—and it has taken me down roads I’ve never dreamed of traveling. Even after spending hundreds of hours thinking about it, I still don’t feel like I’ve done more than scratch the surface. But that’s worldbuilding for you.

Let’s make a language – Part 6b: Word order (Conlangs)

After the rather long post last time, you’ll be happy to know that describing the word order for our two conlangs is actually quite simple. Of course, a real grammar for a language would need to go into excruciating detail, but we’re just sketching things out at this stage. We can fill in exceptions to the rules as they come. And, if you’re making a natural-looking conlang, then they will come.

Sentences

The sentence level is where Isian and Ardari diverge the most. Isian is an SVO language, like English; subjects go before the verb, while objects go after. So we might have e sam cheres ta hu “the man saw a dog”. (By the way, this is a complete sentence, but we’ll ignore punctuation and capitalization for the time being.) For intransitive sentences, the order is simply SV: es tays ade eya “the children are laughing”. Oblique arguments, when we eventually see them, will immediately follow the verb.

Ardari is a little different. Instead of SVO, this language is SOV, although it’s not quite as attached to its ordering as Isian. Most sentences, however, will end with a verb; those that don’t will generally have a good reason not to. Using the same example above, we have konatö rhasan ivitad “the man saw a dog”. Intransitives are usually the same SV as Isian: sèdar jejses “the children are laughing”. We can change things around a little, though. An Ardari speaker would understand you if you said rhasan konatö ivitad, although he might wonder what was so important about the dog.

Verb phrases

There’s not too much to verb phrases in either of our conlangs, mostly because we haven’t talked much about them. Still, I’ll assume you know enough about English grammar to follow along.

For Isian, calling it “order” might be too much. Adverbs and auxiliary verbs will come before the head verb, but oblique clauses will follow it. This is pretty familiar to English speakers, and—with a few exceptions that will pop up later—Isian verb phrases are going to look a lot like their English counterparts.

Ardari might seem a little bit more complicated, but it’s really just unusual compared to what you know. The general rule for Ardari verb phrases (and the other types of phrases, for the most part) is simple: the head goes last. This is basically an extension to the SOV sentence order, carried throughout the language, and it’s common in SOV languages. (Look at Japanese for a good example.) So adverbs and oblique clauses and all the rest will all come before the main verb.

Noun phrases

Because of all the different possibilities, there’s no easy way of describing noun phrase order. For Isian, it’s actually quite complex, and almost entirely fixed, again like English. The basic order is this:

  • Determiners come first. These can be articles, numerals, or demonstratives. (We’ll meet these last two in a later post.)
  • Next are adjectives, which can also be phrases in their own right.
  • Complement clauses come next. These are hard to explain, so it’s best to wait until later.
  • Attributive words are next. This type of noun is what creates English compounds like “boat house”.
  • After these comes the head noun, buried in the middle of things.
  • After the head, some nouns can have an infinitive or subjunctive phrase added in here.
  • Prepositional phrases are next.
  • Lastly, we have the relative clauses.

That’s a lot, but few noun phrases are going to have all of these. Most will get by with a noun, maybe an adjective or two, and possibly a relative or prepositional phrase.

Ardari isn’t nearly as bad. Once again, the head is final, and this means the noun. Everything else comes before it, in this order:

  • Demonstratives and numerals come first. (Ardari doesn’t have articles, remember.)
  • Attributive adjectives and nouns are next, along with a few types of oblique phrases that we’ll mention as they come up.
  • Relative, complement, postpositional, adjectival, and other complex clauses go after these.
  • The head noun goes here, and this is technically the end of the noun phrase.
  • Some adverb clauses that modify nouns can appear after the head, but these are rare.

For the most part, the order doesn’t matter so much in Ardari, as long as each phrase is self-contained. Since it’s easy to tell when a phrase ends (when it gets to the head noun/verb/adjective/whatever), we can mix things up without worry. The above is the most “natural” order, the one that our fictitious Ardari speakers will use by default.

Prepositions

Isian has prepositions, and they work just like those in English. Ardari, on the other hand, uses postpositions, which follow their noun phrases, yet another example of its head-final nature. (The “head” of a prepositional phrase is the preposition itself, not the head noun.) We’ll definitely see a lot of both of these in the coming weeks.

Everything else

All the other possible types of phrase will be dealt with in time. For Ardari, the general rule of “head goes at the end” carries through most of them. Isian is more varied, but it will usually stick to something approximating English norms.

Looking ahead

Next up is adjectives, which will give us a way to make much more interesting sentences in both our fledgling conlangs. We’ll also get quite a bit more vocabulary, and possibly our first full translations. (We’ll see about that one. They may be left as exercises for the reader.)

Beyond that, things will start to become less structured. With the linguistic trinity of noun-verb-adjective out of the way, the whole world of language opens up. Think of everything so far as the tutorial mission. Soon, we’ll enter the open sandbox.

Assembly: the building blocks


So here we are. One of the reasons I chose the 6502 for this little series is because it has such a simple assembly language. It could fit in one post, even covering the intricacies of addressing, the stack, and other bits we’ll get into. Compare that to, say, 16-bit x86, with its fairly complex addressing modes, its segmented memory model, and a completely different I/O system. Add to that the requirement to have an OS, even one such as FreeDOS, and you have quite the task just getting started. The 6502, by contrast, is easy, at least as far as any assembly language can be called easy.

The idea of assembly

In most modern programming languages, things are a bit abstract. You have functions, flow control statements (if, while, and for in C, for example), variables, maybe even objects and exceptions and other neat stuff like that. You’ve got a big standard library full of pre-built routines so you don’t have to reinvent the wheel. In some styles of programming, you aren’t even supposed to care about the low-level details of just how your code runs.

With assembly, all that is gone. It’s just you and the machine. You don’t write assembly code on the level of functions or even statements. You write instructions. You’re telling the computer exactly what to do at each step, and you have to tell it in its own language.

That leads us to a couple of basic concepts regarding assembly. First, each processor has an instruction set. This, obviously, is the set of instructions that it understands. Typically, these all have a prose description like “load accumulator” or “logical shift right”. This notation is a convenience for those studying the instruction set, not for those actually using it. The processor itself doesn’t understand them; it works in numbers like (hexadecimal) $A9 and $4A, which are often called opcodes (short for “operation codes”). Assembly programmers get a compromise between these extremes: a set of mnemonics, one for each kind of instruction that the processor understands. These are abbreviations, usually only a few letters—the 6502, for example, always uses 3-letter mnemonics. In this form, the two instructions above would be written as LDA and LSR. (Most assemblers nowadays are case-insensitive, so you can write lda and lsr if you want, and I will in assembly language program listings. For describing the instructions themselves, however, I’ll stick to all-caps.)

The second thing to know about assembly also regards its lack of abstractions, but concerning the computer’s memory. Especially on early microprocessors like the 6502, the assembly programmer needs intimate knowledge of the memory layout and how the CPU can access it. Remember, we can’t call a function like in C (f(x,y)). We have to convert even that to a language that the computer understands. How we do that depends very much on the specific system we’re using, so now it’s time to look at the 6502 in particular.

Addressing the 6502

Before we get to the meat of 6502 assembly, we need to look at how a programmer can communicate with the processor. Obviously, most of the work will be done by the registers we saw last time, namely A, X, and Y. Of course, three bytes of usable data isn’t that much, so we’ll be accessing memory almost constantly. And the 6502 offers a few ways to do that—called addressing modes—although some only work with certain instructions.

The first way we can access data not in a register is by putting it in the instruction itself, as an immediate value. On 6502 assemblers, this is usually indicated by a #. For example, LDA #$10 places the value 16 (or $10 in hexadecimal) into the accumulator.

If we want to work with a known location of memory, we might be able to give that location to the instruction using absolute addressing. For example, the Commodore 64’s screen memory is set up so that the upper-left character on the screen is at address $0400. To store the value in the A register there, we could use STA $0400. When using zero-page addresses ($00xx), we can omit the top byte: LDA $FE. This actually saves a byte of memory, which is a lot more important on a system with only 64K than on today’s multi-gig computers.

Most of the other addressing modes of the 6502 are in some way indirect, using a value in memory or a register like a pointer (for those of you that know a language like C). These include:

  • Absolute indirect. Only one instruction actually uses this one. JMP ($FFFE) jumps to the address stored at memory location $FFFE. Since the 6502 has a 16-bit address space, this actually uses the address you give and the one right after it—in this case, $FFFE and $FFFF. (The 6502, like the x86, is little-endian, meaning that the first byte is the low one.)

  • Relative. This mode is only used by the branching instructions, and it indicates a relative “displacement”. BEQ $08, for example, would jump forward 8 bytes “if equal”. Negative values are allowed, but they’re encoded as two’s-complement numbers (basically, think of N bytes back as $100 - N ahead): BNE $FE jumps back 2 bytes, which makes an awfully effective infinite loop.

  • Indexed. This is where the X and Y registers start coming into their own. With indexed addressing, one of these is added to the address you give (either 2-byte absolute or 1-byte zero-page). An example would be STA $0400,X, which stores the accumulator value in the address $0400 + X. So, if the X register contains $10, this writes to $0410. Note that some instructions can only use X, some can only use Y, and a few are limited to zero-page addresses.

  • Indexed indirect and indirect indexed. Don’t worry about mixing the names up on these; they don’t matter. What does matter is how they work. These both use a memory location as a pointer and a register value as an index. The difference is where they add the index in. Indexed indirect adds the X register to the address you give, and creates a pointer from that. Indirect indexed, on the other hand, adds the Y register to the stored value, then uses that as the pointer.

As an example, let’s say that memory locations $30 and $31 each contain the value $80, while $40 and $41 both have $20. Also, both X and Y are set to $10. In this setup, indexed indirect (LDA ($30,X)) takes the memory location $30 + X (i.e., $40) and loads whatever is at the address stored there, essentially as if you’d written LDA $2020. Indirect indexed (LDA ($30),Y) instead takes what is stored at the location you give ($8080, in our example), then adds Y ($10) to that to get the final pointer: $8080 + $10 = $8090. In this case, the effect is the same as LDA $8090.
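Put side by side, the data-access modes look something like this (a sketch; the addresses here are arbitrary):

    lda #$10     ; immediate: A = the value $10 itself
    lda $FE      ; zero page: A = the byte at $00FE
    lda $0400    ; absolute: A = the byte at $0400
    lda $0400,X  ; indexed: A = the byte at $0400 + X
    lda ($30,X)  ; indexed indirect: pointer found at $30 + X
    lda ($30),Y  ; indirect indexed: pointer at $30, then + Y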

Finally, assemblers allow the use of labels, so you don’t have to worry about exact addresses. These are the closest you’re going to get to something like named functions. In assembly source code, they’re defined like this: label:. Later on, you can refer to them like you would any memory address, e.g., LDA label or BEQ label. One of the assembler’s jobs is to replace the labels with the “real” addresses, and it’s pretty good at that.

The instructions

After all that, the actual instruction set of the 6502 is refreshingly uncomplicated. All told, there are only a few dozen possible instructions, all of them performing only the most basic of actions. Yet this small arsenal was enough for a generation of 8-bit computers.

Many assembly language references put the instructions in alphabetical order by mnemonics. But the 6502’s set is so small that we can get away with ordering them by what they do. As it turns out, there aren’t too many categories, only about a dozen or so. Also, I’ll have examples for some of the categories, but not all. In the code samples, a ; marks a comment; the rest of the line is ignored, just like // or #, depending on your favorite language.

Load and store

Especially on older, less capable systems like the 6502, moving data around is one of the most important tasks. And there are two ways that data can go: from memory into a register or the other way around. For our CPU, moving a byte from memory to a register is a load, while sending it from a register to a memory location is a store. (The x86, to illustrate a different way of doing things, uses a single instruction, MOV, to do both of these.)

There are three “user” registers on the 6502, and each one has a load and a store instruction. To load a value into a register, you use LDA, LDX, or LDY. To store a value from one of them into memory, it’s STA, STX, and STY. (I think you can figure out which one uses which register.)

In terms of addressing, these instructions are the most well-rounded. The accumulator gives you the most flexibility, offering immediate, absolute, and all the indirect modes. With X and Y, you can’t use indirect addressing, and you can only use the other of the two registers as an index. So you can write LDX $30,Y, but not LDX $30,X.

This code example doesn’t do too much. It sets up the first two memory locations as a pointer to $0400, then writes the byte $36 to that location. For the online assembler I’m using, that makes a blue dot on the left side of the screen, in the middle. On a real C64 or Apple II, that address is the top-left corner of the screen, so it will display whatever the system thinks $36 should be, probably the number 6.


start:
    lda #$00    ; we need 2 loads & 2 stores
    sta $00     ; to set up a 16-bit address
    lda #$04
    sta $01

    lda #$36
    ldy #$00    ; clear Y to use as an index
    sta ($00),Y ; stores our byte at $0400

Arithmetic

Besides shuffling data around, computers mainly do math. It’s what they’re best at. As an older microprocessor, the 6502 had to cut corners; by itself, it can only add and subtract, and then only using the accumulator. These two instructions, ADC and SBC, are a little finicky, and they’re our first introduction to the processor status or “flags” register, P. So we’ll take a quick diversion to look at it.

The P register on the 6502, like all its other registers, is a single byte. But we usually don’t care about its byte value as a whole. Instead, we want to look at the individual bits. Since there are eight bits in a byte, there are eight possible flags. The 6502 uses seven of these, although the online assembler doesn’t support two of those, and a third was rarely used even back in the day. So that leaves four that are important enough to mention here:

  • Bit 7, the Negative (N) flag, is changed after most instructions that affect the A register. It’s a copy of the accumulator’s high bit, which marks a negative number if you’re treating the byte as signed.
  • Bit 6, Overflow (V), is set whenever arithmetic changes the “sign” of the accumulator.
  • Bit 1 is the Zero (Z) flag, which is set whenever the last load, arithmetic, or logical operation (among others) produced a result of 0. Stores, notably, don’t change any flags.
  • Bit 0, the Carry (C) flag, is the important one. It’s set when an addition or subtraction produces a result that can’t fit into a byte, as well as by some of the bitwise instructions.

Now, the two arithmetic instructions are ADC and SBC, which stand for “add with carry” and “subtract with carry”. The 6502 doesn’t have a way to add or subtract without involving the carry flag! So, if we don’t want it messing with us, we need to clear it (CLC, which we’ll see again below) before we start doing our addition. Conversely, before subtracting, we must set it with the SEC instruction. (For subtraction, the carry works as an inverted “borrow”, so it has to start out set. The payoff for this design is that additions and subtractions chain naturally across multiple bytes.)
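That chaining is easiest to see in a 16-bit addition. Here’s a quick sketch (assuming one 16-bit value at $20-$21 and another at $22-$23, both stored low byte first):

    clc          ; start with no carry
    lda $22
    adc $20      ; add the low bytes; overflow sets C
    sta $22
    lda $23
    adc $21      ; add the high bytes, plus the carry
    sta $23      ; $22-$23 now holds the 16-bit sum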

Also, these instructions only work with the accumulator and a memory address or immediate value. You can’t directly add to X or Y with them, but that’s okay. In the next section, we’ll see instructions that can help us.

The code example here builds on the last one. In the online assembler, it displays a brown pixel next to the blue one. On real computers, it should put a 9 to the right of the 6, because 8-bit coders have dirty minds.


start:
    lda #$00    ; we need 2 loads & 2 stores
    sta $00     ; to set up a 16-bit address
    lda #$04
    sta $01

    lda #$36
    ldy #$00    ; clear Y to use as an index
    sta ($00),Y ; stores our byte at $0400

    clc         ; always clear carry first
    adc #$03    ; A += 3
    iny         ; move the position right 1
    sta ($00),Y ; store the new value

Increment and decrement

The INY (“increment Y register”) instruction I just used is one of a group of six: INC, DEC, INX, DEX, INY, DEY.

All these instructions do is add or subtract 1, an operation so common that just about every processor in existence has dedicated instructions for it, which is also why C has the ++ and -- operators. For the 6502, these can work on either of our index registers or a memory location. (If you’re lucky enough to have a later model, you also have INA and DEA, which work on the accumulator.)

Our code example this time is an altered version of the last one. This time, instead of incrementing the Y register, we increment the memory location $00 directly. The effect, though, is the same.


start:
    lda #$00    ; we need 2 loads & 2 stores
    sta $00     ; to set up a 16-bit address
    lda #$04
    sta $01

    lda #$36
    ldy #$00    ; clear Y to use as an index
    sta ($00),Y ; stores our byte at $0400

    clc         ; always clear carry first
    adc #$03    ; A += 3
    inc $00     ; move the position right 1
    sta ($00),Y ; store the new value

Flags

We’ve already seen CLC and SEC. Those are part of a group of instructions that manipulate the flags register. Since we don’t care about all the flags, there’s only one more of these that is important: CLV. All it does is clear the overflow flag, which can come in handy sometimes.

By the way, the other four are two pairs. CLI and SEI work on the interrupt flag, which the online assembler doesn’t support. CLD and SED manipulate the decimal flag, which doesn’t seem to get much use.

There’s no real code example this time, since we’ve already used CLC. SEC works the same way, and I can’t think of a quick use of the overflow flag.

Comparison

Sometimes, it’s useful to just compare numbers, without adding or subtracting. For this, the 6502 offers a trio of arithmetic comparison instructions and one bitwise one.

CMP, CPX, and CPY each compare a value in memory to the one in a register (CMP uses A, the others are obvious). If the register value is less than the memory one, the N flag is set. If it’s greater than or equal, the C flag gets set instead. And if the two are equal, the Z flag is set as well.

BIT works a little differently. It sets the N and V flags to the top two bits of the memory location (no indirection or indexing allowed). Then, it sets the Z flag if the bitwise-AND of the memory byte and the accumulator is zero, i.e., if they have no 1 bits in common.

Comparison instructions are most useful in branching, so I’ll hold off on the example until then.

Branching

Branching is how we simulate higher-level control structures like conditionals and loops. On the 6502, we have the option of hopping around our code by using any of nine different instructions. Eight of these are conditional, and they come in pairs, each pair based on one of the four main flags.

  • BCC and BCS branch if the C flag is clear (0) or set (1), respectively.
  • BNE (“branch not equal”) and BEQ (“branch equal”) do the same for the Z flag.
  • BVC and BVS branch based on the state of the V flag.
  • BPL (“branch plus”) and BMI (“branch minus”) work on the N flag.

All of these use the “relative” addressing mode, limiting them to short jumps.

The ninth instruction is JMP, and it can go anywhere. You can use it with a direct address (JMP $FFFE) or an indirect one (JMP ($0055)), and it always jumps. Put simply, it’s GOTO. But that’s not as bad as it sounds. Remember, we don’t have the luxury of while or for. JMP is how we make those.

This code sample, still building on our earlier attempts, draws ten dots (or the digits 0-9) on the screen.


start:
    lda #$00
    sta $00
    lda #$04
    sta $01

    lda #$30
    ldy #$00

loop:
    sta ($00),Y ; write the byte to the screen
    clc
    adc #$01    ; add 1 to A for next character
    iny         ; move 1 character to the right
    cpy #$0a    ; have we reached 10 yet?
    bne loop    ; if not, go again

For comparison, a pseudo-C version of the same thing:

char* screen = (char*)0x0400;
char value = 0x30;
for (int i = 0; i < 10; i++) {
    screen[i] = value;
    value++;
}

The stack

The stack, on the 6502 processor, is the second page of memory, starting at address $0100. It can be used to store temporary values, addresses, and other data, but it’s all accessed through the stack pointer (SP). You push a value onto the stack, then pop (or pull, to use 6502 terminology) it back off when you need it back.

We’ve got an even half dozen instructions to control the stack. We can push the accumulator value onto it with PHA, and we can do the same with the flags by using PHP. (Not the programming language with that name, thankfully.) Popping—or pulling, to use the 6502’s term—the value pointed to by the SP uses PLA and PLP. The other two instructions, TSX and TXS, let us copy the stack pointer to the X register, or vice versa.
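One common pattern is using the stack to preserve a value across code that would otherwise clobber it. A sketch (somesub here is a hypothetical routine that changes A):

    pha          ; push A onto the stack
    jsr somesub  ; free to change A however it likes
    pla          ; pull the old value back into A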

Subroutines

Branches give us flow control, an important part of any high-level programming. For functions, the assembly programmer uses subroutines, and the 6502 has a pair of instructions that help us implement them. JSR (“jump to subroutine”) is an unconditional jump like JMP, except that it pushes the address of the next instruction to the stack before jumping. (Since we only have a page of stack space, this limits how “deep” you can go.) When the subroutine is done, the RTS instruction sends you back to where you started, just after the JSR.

The code sample here shows a little subroutine. See if you can figure out what it does.


start:
    lda #$00
    sta $00
    lda #$04
    sta $01

    lda #$31
    ldy #$09
    jsr show    ; call our subroutine
    jmp end     ; jump past when we're done

show:
    sta ($00),Y ; write the byte to screen mem
    clc
    adc #$01    ; add 1 to accumulator
    dey
    bne show    ; loop until Y = 0
    rts         ; return when we're done

end:
    ; label so we can skip the subroutine

Bitwise

We’ve got a total of seven bitwise instructions (not counting BIT, which is different). Three of these correspond to the usual AND, OR, and XOR operations, and they work on a memory location and the accumulator. AND has an obvious name, ORA stands for “OR with accumulator”, and EOR is “exclusive OR”. (Why they used “EOR” instead of “XOR”, I don’t know.) If you’ve ever used the bit-twiddling parts of C or just about any other language, you know how these work. These three instructions also change the Z and N flags: Z if the result is 0, N if the highest bit of the result is set.
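Here’s a quick sketch of the usual idioms (the values are arbitrary):

    lda $10
    and #$0F     ; mask: keep only the low nibble
    ora #$30     ; set bits: here, making an ASCII digit
    eor #$FF     ; toggle: flip every bit of the result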

The other four manipulate the bits of memory or the accumulator themselves. ASL is “arithmetic shift left”, identical to the C << operator, except that it only works one bit at a time. The high bit is shifted into the C flag, while Z and N are altered like you’d expect. LSR (“logical shift right”) works mostly in reverse: every bit is shifted down, a 0 is moved into the high bit, and the low bit goes into C.

ROL and ROR (“rotate left/right”) are the oddballs, as few higher-level languages have a counterpart to them. Really, though, they’re simple. ROL works just like ASL, except that it shifts whatever was in the C flag into the low bit instead of always a 0. ROR is the same, but the other way around, putting the C flag’s value into the high bit.
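The rotates are what make multi-byte shifts possible, as in our multiplication routine from before. Doubling a 16-bit value stored at $F0-$F1, for instance, takes just two instructions:

    asl $F0      ; low byte: the high bit falls into C
    rol $F1      ; high byte: the carry comes in at the bottom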

Transfer

We could move bytes between the A, X, and Y registers by copying them to memory or using the stack instructions. That’s time-consuming, though. Instead, we’ve got the TAX, TAY, TXA, and TYA instructions. These transfer a value from one register to another, with the second letter of the mnemonic as the source and the third as the destination. (TAX copies A to X, etc.) The flags are set how you’d expect.

The other guys

There are two other 6502 assembly instructions that don’t do too much. BRK causes an interrupt, which the online assembler can’t handle and isn’t that important for user-level coding. NOP does nothing at all. It’s used to fill space, basically.

Next time

Whew. Maybe I was wrong about fitting all this in one post. Still, that’s essentially everything you need to know about the 6502 instruction set. The web has a ton of tutorials, all of them better than mine. But this is the beginning. In the next part, we’ll look at actually doing things with assembly. That one will be full of code, too.

Dragons in fantasy

If there is one thing, one creature, one being that we can point to as the symbol of the fantasy genre, it has to be the dragon. They’re everywhere in fantasy literature. The Hobbit, of course, is an old fantasy story that has come back into vogue in the last few years. More recent books involve dragons as major characters (Steven Erikson’s Malazan series) or as plot points (Daniel Abraham’s appropriately-titled The Dragon’s Path). Movies go through cycles, and dragons are sometimes the “in” subject (the movies based on The Hobbit, but also less recent films like Reign of Fire). Television likes dragons, too, when it has the budget to do them (Game of Thrones, of course). And we can also find these magnificent creatures represented in video games (Drakengard, Skyrim), tabletop RPGs (Dungeons & Dragons—it’s even in the name!), and music (DragonForce).

So what makes dragons so…interesting? It’s not a recent phenomenon; dragon legends go back centuries. They feature in Arthurian legend, Chinese mythology, and Greek epics. They’re everywhere, all throughout history. Something about them fires the imagination, so what is it?

The birth of the dragon

Every ancient culture, it seems, has a mythology involving giant beasts of a kind unknown to modern science. We think of the Greek myths of the Hydra, of course, but it’s only one of many. Even in the Bible, monsters appear: the leviathan and behemoth of the book of Job, for example. But something like a dragon seems to be found in almost every mythos.

How did this happen? For things like this, there are usually a few possible explanations. One, it could be a borrowing, something that arose in one culture, then spread to its neighbors. That seems plausible, except that New World peoples also have dragon-like supernatural beings, and they had them before Columbus. Another possibility is that the first idea of the dragon was invented in the deep past, before humanity spread to every corner of the globe. But that’s a bit far-fetched. You’d then have to explain how something like that stuck around for 30,000 or so years with so little change, using only art and oral transmission for most of that time.

The third option is, in my opinion, the most reasonable: the idea of dragons arose in a few different places independently, in something like convergent evolution. Each “region” would have its own dragon mythology, where the concept of “dragon” is about the same, while different regions might have wildly different ideas of what they should be.

I would also say that the same should be true for other fantastical creatures—giants, for instance—that pop up around the world. And, in my mind, there’s a perfectly good reason why these same tropes appear everywhere: fossils. We know that there used to be huge animals roaming the earth. Dinosaurs could be enormous, and you could imagine a Bronze Age hunter stumbling upon the fossilized bones of one of them and jumping to conclusions.

Even in recent geological time, it was only the Ice Age that wiped out the mammoths and so many other “megafauna”. (Today’s environmental movement tends to want to blame humans for everything bad, including this, but the evidence can be twisted just about any way you like.) In these cases, we can see the possibility that early human bands did meet these true giants, and they would have told stories about them. In time, those stories, as such stories tend to do, could have become legendary. For dragons, this one doesn’t matter too much, but it’s a point in favor of the idea that ancient peoples saw giant creatures—or their remains—and mythologized them into dragons and giants and everything else.

The nature of the beast

Moving far forward in time, we can see that the modern era’s literature has taken the time-honored myth of the dragon and given it new direction. At some point in the last few decades, authors seem to have decided that dragons must make sense. Sure, that’s completely silly from a mythological point of view, but that’s how it is.

Even in older stories, though, dragons had a purpose. That purpose was different for different stories, as it is today. For many of them, the dragon is a nemesis, an enemy. Sometimes, it’s essentially a force of nature, if not a god in its own right. In a few, dragons are good guys, protectors. Christian cultures in medieval times liked to use the slaying of a dragon as a symbol for the defeat of paganism. But it’s only relatively recently that the idea of dragons as “people” has become popular. Nowadays, we can find fiction where dragons are represented as magicians, sages, and oracles. A few settings even turn them into another sapient race, with their own civilization, culture, religion, and so on.

The form of dragons also depends a lot on which mythos we’re talking about. The modern perception of a dragon as a winged, bipedal serpent who breathes fire and hoards gold (in other words, more like the wyvern) is just one possibility. Plenty of cultures have wingless dragons, and most of the “true” dragons have no legs; they’re more like giant snakes. Still, there’s an awful lot of variation, and there’s no single, definitive version of a dragon.

Your own dragon

Dragons in a work of fiction, whether novel or film or game, need to be there for a reason, if you want a coherent story. You don’t have to work out a whole ecological treatise on them, showing their diets, sleep patterns, and reproductive habits—Tolkien’s dragons, for example, were supernatural creations, so they didn’t have to make scientific sense—but you should know why a dragon appears.

If there’s only one of them, there’s probably a reason why. Maybe it’s a demon, or a creation of the gods, or an avatar of chaos. Maybe it’s the sole survivor of its kind, frozen in time for millennia (that’s a big spoiler, but I’m not going to tell you for what). Whatever you come up with, you should be able to justify it with something more than “because it’s there”. The more dragons you have, the more this problem can grow. In the extreme, if they’re everywhere, why aren’t they running things?

More than their reason for existing in the first place, you need to think about their story role. Are they enemies? Are they good or evil? Can they talk? What are they like? Smaug was greedy and haughty, for instance, and it’s a conceit of D&D that dragons are complex beings that are completely misunderstood by us lesser mortals simply because we can’t understand their true motives.

Are there different kinds of dragons? Again we can look at D&D, which has a bewildering assortment even before we include wyverns, lesser drakes, and the like. Of course, a game will need a different notion of role than a novel, and gamers like variation in their enemies, but only the most jaded player would think of a dragon as anything less than a major boss character.

Another thing that’s popular is the idea that dragons can change their form to look human. This might be derived from RPGs, or the games might have taken it from an earlier source. However it worked out, a lot of people like the idea of a shapeshifting dragon. (Half the characters in the aforementioned Malazan series seem to be like this, and that’s not the only example in fantasy.) Shapechanging, of course, is an important part of a lot of fantasy, and I might do a post on it later on. It is another interesting possibility, though, if you can get it right.

In a very big way, dragons-as-people is a similar problem as other fantasy races, as well as sci-fi aliens. The challenge here is to make something that feels different, something that isn’t quite human, while still making it believable for the story at hand. If dragons live for 500 years, for example, they will have a different outlook on life and history than we would. If they lay eggs—and who doesn’t like dragon eggs?—they won’t understand the pain and danger of live childbirth, among other things. The ways in which a dragon isn’t like a human are breeding grounds for conflict, both internal and external. All you have to do is follow the notion towards its logical conclusion. You know, just like everything else.

In conclusion, I’d like to say that I do like dragons, when they’re done right. They can be these imposing, alien presences beyond reason or understanding, and that is something I find interesting. But in the wrong hands, they turn into little more than pets or mounts, giant versions of dogs and horses that happen to have scales. Dragons don’t need to be noble or evil, but they should have an impact when you meet one. I mean, you’d feel amazed if you met one in real life, wouldn’t you?

Let’s make a language – Part 6a: Word order (Intro)

We’ve looked at nouns and verbs in isolation, and even in a few simple phrases. Now it’s time to start putting things together, using these small parts as building blocks to create larger, more complex utterances. To do that, though, we need to set a few ground rules, because there’s a big difference between a jumble of words and a grammatically correct sentence. We must have order.

You have a few different options for going about this. Personally, I like to take a “top-down” approach, starting at the level of sentences and working my way down. Others prefer the “bottom-up” approach, where you work out the rules for noun phrases, verb phrases, and so on before putting them all into a sentence. Either way is fine, but the bottom-up fans will have to wait to apply the lessons of this part, since we haven’t even begun to cover adjectives, adverbs, prepositions, and all the other little bits of a language. (We’ll get to them soon, I promise.)

The sentence

Obviously, the biggest unit of speech where grammar rules actually come into play is the sentence. And sentences can be divided into a few different parts. Pretty much every one of them, for example, has a verb or verb phrase, which we’ll label V. Transitive sentences also have a subject (S) and an object (O); both of these are typically noun phrases. Intransitives, as you’ll recall, only have one argument, which we’ll also call the subject. There are also oblique phrases, which are sort of like adverbs; these will come into play a bit later, where we’ll label them X, following the convention in WALS Chapter 84. Some other kinds of phrases, like prepositions, quoted speech, and conjunctions, don’t really factor into the main word order, so we’ll look at them as they come up.

Given a basic transitive sentence, then, we have three main parts: S, V, and O. A simple count should show you that there are six ways of ordering them, and every one of those six is attested by some natural language in the world. The SVO order (subject-verb-object) is certainly familiar, as it’s the one used in English. SOV shows up in a number of European languages, and it’s also the main order in Japanese. The others will likely sound “off” to you; OSV and VOS, for example, are utterly foreign to Western ears, which is why they were used to make Yoda sound alien.
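
Just to make those six orders concrete, here’s a quick illustration with an English toy sentence, where S = the dog, V = bit, and O = the man:

  • SVO: the dog bit the man
  • SOV: the dog the man bit
  • VSO: bit the dog the man
  • VOS: bit the man the dog
  • OVS: the man bit the dog
  • OSV: the man the dog bit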

In terms of statistics, SVO and SOV are about even around the world, SOV having a slight edge. The two of them together account for somewhere around 80% of all natural languages. VSO is a distant third, at about 10-15%, but you’ll no doubt recognize some of those: Arabic, Welsh, Irish, and Tagalog, among many others. These three, a total of over 90% of the world’s languages, all have one thing in common: the subject comes before the object.

The rest of the possibilities, where the object comes first, are much rarer, and many of those languages also allow a more common subject-first ordering. Of the three, VOS is the most common in the WALS survey, with such examples as Kiribati and Malagasy. OVS, the mirror image of English, is listed as the main form in eleven languages, including such notables as Hixkaryana and Tuvaluan. OSV, in their survey of over 1,300 languages (about a quarter of the world’s total), only shows up as dominant in Kxoe, Nadëb, Tobati, and Wik Ngathana, and I couldn’t tell you a single thing about any of them.

Conlangs have a slightly different distribution, owing to the artistic differences of their authors. According to CALS, the conlang counterpart to WALS, SVO has a narrow edge over SOV, but VSO is much more common than in the real world. The object-first trio also makes up a bigger percentage, but it’s still vastly outnumbered by the subject-first languages.

It’s certainly possible for a language to have no main word order for its sentences. This tends to be the case (pardon the pun) in languages that have case systems, but it’s also possible in caseless languages. There are even a few languages where there are two major word orders. German is an example of this; it’s normally SVO, but sentences with more complex verb phrases push the main verb to the end, effectively becoming SOV: Ich habe den Hund gesehen, literally “I have the dog seen”.

Now, in intransitive sentences, things can change a little bit. Since there’s no real object, you only have two possibilities: SV and VS. SV, as you might expect, is vastly more popular (about 6:1). But the distinct minority of VS languages also includes many of the ergative languages, which are normally SOV or SVO. Ergative languages often treat the subject of an intransitive verb like the direct object of a transitive one, so a VS order almost makes sense.

Noun phrases

Moving on, we’ll go down a level and look at those subject and object phrases. Since we haven’t quite made it to adjectives and the like, this will necessarily be a bit abstract. In general, though, noun phrases aren’t exactly like sentences. They have a head noun, the main part of the phrase, and a bunch of potential modifiers to that head. These modifiers can go either before or after the head, and their order (relative to each other) is often fixed. For example, English allows a noun phrase like the three big men, with an article, a numeral, and an adjective all preceding the noun. No other permutation of these four elements is grammatically correct, though. We can’t say the big three men; the big three is okay, but then three becomes the head noun.

So we’ll have to do things a little differently for this section. Instead of showing all the possible orderings of all the different parts of a noun phrase, we’ll look at each one individually.

  • Articles: Articles are a little weird. If they’re separate words, they’re often the first part of a noun phrase. If they’re suffixes or similar, then they’re last. And then you have something like Arabic, where the article is a prefix that attaches to both the nouns and adjectives in a phrase.

  • Adjectives: The topic of the next part of this series, adjectives are the main modifier words. English is actually in a minority by having its adjectives precede nouns, but it’s a sizable minority: about 25%. Noun-adjective languages make up about 60%, and there’s also a group that allows either possibility. But this tends to run in families. All the Germanic languages like adjectives first, but the Romance ones are the other way around.

  • Demonstratives: These are words like this in this man. Here, it’s too close to call. (Seriously, WALS has it as 561-542 in favor of demonstratives after nouns.) Again, though, it’s very much a familial trait. The only following-demonstrative languages in Europe are a few Celtic languages and Basque, which is always the outlier. Most of Southeast Asia, on the other hand, likes demonstratives last.

  • Numerals: The “number” words are another close split, but not quite even. Call it 55-45, with following numerals having the lead. However, you could say this is due to politics. Africa, Asia, and New Guinea, with their vast numbers of languages, tip the scales. Europe, with its large, united, national languages, is universally numerals-first.

  • Genitives: This means any kind of possession, ownership, kinship, and a few other categories, not necessarily the genitive case. Genitives tend to come before nouns, again around 55%. English is among the rarities by having two different versions, one on either side of the divide: Jack’s house, the home of the brave. This one is actually somewhat related to sentence order; VO languages tend to have noun-genitive ordering, while OV languages are more likely to be genitive-first.

  • Relative clauses: We won’t be covering these for a long time, but we can already see where they’ll go. Overwhelmingly, it turns out, they go after the noun. It’s possible to have them before the noun, though, and there’s one example in Europe. (Guess which one.) It’s more common in Asia, except the Middle East. Some Native American languages even do a weird thing where they put the head noun inside the relative clause. You’ll have to look that one up yourself, or wait until the big “relative clauses” post in about three months.

Other phrases

There aren’t too many options for other phrases. Verb phrases have the option of putting adverbs before or after the head verb, the same as adjectives and nouns. Adjectives themselves can be modified, and they then become the head of their own adjective phrase, with its own order.

One case that is interesting is that of prepositions. These are the little words like in or short phrases like in front of, and we’ll see a lot more of them soon. They’re actually the heads of their own type of phrase, known in English as the prepositional phrase. And in English, they precede the rest of that phrase: at the house, in front of the car.

Well, that’s not the only option. You can also put the preposition at the end of its phrase, and this is more common in the world’s languages. Of course, then the name “preposition” doesn’t make much sense, so these are called postpositions. They’re not common in Europe, except in the non-Indo-European parts (Finnish, Hungarian, Estonian, and—naturally—Basque). Most of India likes them, though, as do Iran, Georgia, and Armenia. They’re also popular among the many languages of South America.

Conclusion

Basically, any time you have more than one word, you have word order. Some languages don’t make much of a fuss about it. Cases let you be free in your wording, because it doesn’t matter where an object goes if it always has an accusative suffix on it. French and Spanish allow some adjectives before the noun (e.g., grand prix), even though most of them have to follow it. And poets have made a living breaking the rules. Conlangers, really, aren’t much different.

But rules can be helpful, too. If every sentence ends with a verb, then you always know when you’ve reached the end. (There’s a joke about a German professor in here, but I don’t remember all of it.) For conlangs, word order rules become a kind of template. I know my language is VSO, for example, so I can look at real-world VSO languages for inspiration. Those tend to have prepositions, so my language will, too, because I want it to feel natural. Auxiliary languages are even more in need of hard and fast rules about word order, and they will certainly want to follow the observed connections.

In the next post, we’ll look at how Isian and Ardari put their sentences and phrases together. Then, it’s on to adjectives, the third jewel in the linguistic Triple Crown.

Assembly: welcome to the machine

Well, we have a winner, and it’s the 6502. Sure, it’s simple and limited, and it’s actually the oldest (this year, it turned 40) of our four finalists. But I chose it for a few reasons. First, the availability of an online assembler and emulator at 6502asm.com. Second, its wide use in 8-bit computers, especially those I’ve used (you might call this nostalgia, but it’s a practical reason, too). And finally, its simplicity. Yes, it’s limited, but those limitations make it easier to understand the system as a whole.

Before we get started

A couple of things to point out before I get into the details of our learning system. I’m using the online assembler at the link above. You can use that if you like. If you’d rather use something approximating a real 6502, there are plenty of emulators out there. There’s a problem, though. The 6502 was widely used as a base for the home computers of the 70s and 80s, but each system changed things a little. And the derivative processors, such as the Commodore 64’s MOS 6510 or the Ricoh 2A03 used in the NES, each had their own quirks. So, this series will focus on an “ideal” 6502 wherever possible, only entering the real world when necessary.

A second thing to note is the use of hexadecimal (base-16) numbers. They’re extremely common in assembly programming; they might even be used more than decimal numbers. But writing them down poses a problem. The mathematically correct way is to use a subscript: 0400₁₆. That’s awfully hard to do, especially on a computer, so programmers developed alternatives. In modern times, we mostly use the prefix “0x”: 0x0400. In the 8-bit days, however, the convention was a prefixed dollar sign: $0400. Since that’s the tradition for 6502-based systems, that’s what we’ll use here.
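
(If you ever need to check a conversion by hand, each hex digit is a power of 16, just as each decimal digit is a power of 10. So $0400 works out to (0 × 4096) + (4 × 256) + (0 × 16) + 0, or 1024 in decimal.)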

The processor

Officially, we’re discussing the MOS Technology 6502 microprocessor. Informally, everybody just calls it the 6502. It’s an 8-bit microprocessor that was originally released in 1975, a full eight years before I was born, in what might as well be the Stone Age in terms of computers. Back then, it was cheap, easy to use, and developer-friendly, exactly what most other processors weren’t. And those qualities gave it its popularity among hobbyists and smaller manufacturers, at a time when most people couldn’t even imagine wanting a computer at home.

Electronically, the 6502 is a microprocessor. Basically, all that really means is that it isn’t meant to do much by itself, but it’s intended to drive other chips. It’s the core of the system, not the whole thing. In the Commodore 64, for example, the 6502 (actually 6510, but close enough) was accompanied by the VIC-II graphics chip, the famous SID chip for sound, and a pair of 6526 interface adapters for input and output (I/O). Other home computers had their own families of companion chips, and it was almost expected that you’d add peripherals for increased functionality.

Internally, there’s not too much to tell. The 6502 is 8-bit, which means that it works with byte-sized machine words. It’s little-endian, so larger numbers are stored from the lowest byte up. (Example: the decimal number 1024, hexadecimal $0400, would be listed in a memory readout as 00 04.) Most 6502s run at somewhere around 1 MHz, but some were clocked faster, up to 2 MHz.
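
To see that byte order in action, here’s a minimal sketch of storing a 16-bit pointer to $0400 in memory (the zero-page locations $10 and $11 are just an arbitrary choice for this example):

    lda #$00     ; low byte of $0400 goes in first...
    sta $10
    lda #$04     ; ...followed by the high byte
    sta $11

This low-byte-first layout is exactly what the processor expects whenever it reads a two-byte address back out.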

For an assembly programmer, the processor is very much bare-bones. It can access a 16-bit memory space, which means a total of 65,536 bytes, or 64K. (We’ll ignore the silly “binary” units here.) You’ve got a mere handful of registers, including:

  • The accumulator (A), which is your only real “working” register,
  • Two index registers (X and Y), used for indirect memory access,
  • A stack pointer (SP), which is only 8 bits wide; the stack’s 16-bit address always has its upper byte hardwired to $01,
  • The processor status register (P), a set of “flag” bits that are used to determine certain conditions,
  • A program counter (PC) that keeps track of the address of the currently-executing assembly instruction.

That’s…not a lot. By contrast, the 8086 had 14 registers (4 general purpose, 2 index, 2 stack pointer, 4 segment registers, an instruction pointer, and the processor flags). Today’s x86-64 processors add quite a few more (e.g., 8 more general purpose, 2 more segment registers that are completely useless in 64-bit mode, 8 floating-point registers, and 8 SIMD registers). But, in 1975, it wasn’t easy to make a microprocessor that sold for $25. Or $100, for that matter, so that’s what you had to work with. The whole early history of personal computers, in fact, is an epic tale of cutting corners. (The ZX81, for example, had a “slow” and a “fast” mode. In fast mode, it disabled video output, because that was the only way the processor could run code at full speed!)
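
Still, to give you a feel for how the few registers we do have get used, here’s a tiny (and admittedly pointless) snippet that touches most of them:

    lda #$05     ; put 5 in the accumulator
    tax          ; copy it into the X index register
    iny          ; bump the Y index register
    pha          ; push A onto the stack (SP moves down)
    pla          ; and pull it right back off

Nearly everything on the 6502 is some variation on this shuffle: load a value into A, move it around, store it somewhere.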

Memory

Because of the general lack of registers inside the processor, memory becomes of paramount importance on the 6502. Now, we don’t think of it much today, but there are two main kinds of memory: read-only (ROM) and writable (“random access”, hence RAM). How ROM and RAM were set up was a detail for the computer manufacturer or dedicated hobbyist; the processor itself didn’t really care. The Apple IIe, for example, had 64K of RAM and 16K of ROM; the Commodore 64 gave you the same amount of RAM, but had a total of 24K of ROM.

Astute readers—anyone who can add—will note that I’ve already said the 6502 had a 16-bit memory address space, which caps at 64K. That’s true. However much memory you really had (e.g., 80K total on the Apple IIe), assembly code could only access 64K of it at a time. Different systems had different ways of coping with this, mostly by using bank switching, where a part of address space could be switched to show a different “window” of the larger memory.

One quirk of the 6502’s memory handling needs to be mentioned, because it forms a very important part of assembly programming on the processor. Since the 6502 is an 8-bit processor, it’s natural to divide memory into pages, each page being 256 bytes. In hexadecimal terms, pages start at $xx00 (where xx can be anything) and run to $xxFF. The key thing to notice is that the higher byte stays the same for every address in the same page. Since the computer only works with bytes, the less we have to cross a page “boundary” (from $03FF to $0400, for instance), the better.
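
Here’s one concrete cost, using an indexed load (a mode we’ll meet properly later); the cycle counts are the standard ones for the 6502:

    lda $03F0,X  ; 4 cycles if the final address stays in page $03
                 ; 5 cycles if X pushes it past $03FF into page $04

That extra cycle is the processor fixing up the high byte of the address.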

The processor itself even acknowledges this. The zero page, the memory located from $0000 to $00FF, can be used in 6502 assembly as one-byte addresses. And because the 6502 wasn’t that fast to begin with, and memory wasn’t that much slower, it’s almost like having an extra 256 registers! (Of course, much of this precious memory space is reserved on actual home computers, meaning that it’s unavailable for us. Even 6502asm uses two bytes of the zero page, $FE and $FF, for its own purposes.)
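
To see the saving, compare the same load done both ways:

    lda $10      ; zero page: a 2-byte instruction, 3 cycles
    lda $0410    ; absolute: a 3-byte instruction, 4 cycles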

Video and everything else

Video display depended almost completely on the additional hardware installed alongside a 6502 processor. The online assembler I’ll be using has a very simplified video system: 32×32 pixels, each pixel taking up one byte, running from address $0200 to $05FF, with 16 possible colors. Typically, actual computers gave you much more. Most of them had a text mode (40×24, 80×25, or something like that) that may or may not have offered colors, along with a high-res mode that was either monochrome or very restricted in colors.
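
On the online assembler, then, drawing is nothing but storing bytes. A minimal sketch (if I’m remembering its palette correctly, color $01 is white):

    lda #$01     ; color 1
    sta $0200    ; the top-left pixel of the 32×32 display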

Almost any other function you can think of is also dependent on the system involved, rather than being a part of the 6502 itself. Our online version doesn’t have any extra bells and whistles, so I won’t be covering them in the near future. If interest is high enough, however, I might go back and delve deeper into one of the many emulators available.

Coming up

So that’s pretty much it for the basics of the 6502. A bare handful of registers, not even enough memory to hold the stylesheet for this post, and a bunch of peripherals that were wholly dependent upon the manufacturer of the specific computer you used. And it was still the workhorse of a generation. After four decades, it looks primitive to us, because it is. But every journey starts somewhere, and sometimes we need to go back to a simpler time because it was simpler.

That’s the case here. While I could certainly demonstrate assembly by making something in modern, 64-bit x86 code, it wouldn’t have the same impact. Modern assembly, on a larger scale, is often not much more than a waste of developer resources. But older systems didn’t have the luxury of optimizing compilers and high-level functional programming. Most people in the 80s used BASIC to learn how to program, then switched to assembly when they wanted to make something useful. That was okay, because everybody else had the same limitations, and the system itself was so small that you could be productive with assembly.

In the next post of this series, we’ll actually start looking at the 6502 assembly language. We’ll even take a peek under that hood, going down to the ultimate in low-level, machine code. I hope you’ll enjoy reading about it as much as I am writing it.

Character alignment

If you’ve ever played or even read about Dungeons & Dragons or similar role-playing games (including derivative RPGs like Pathfinder or even computer games like Nethack), you might have heard of the concept of alignment. It’s a component of a character that, in some cases, can play an important role in defining that character. Depending on the Game Master (GM), alignment can be one more thing to note on a character sheet before forgetting it altogether, or it can be a role-playing straitjacket, a constant presence that urges you towards a particular outcome. Good games, of course, place it somewhere between these two extremes.

The concept also has its uses outside of the particulars of RPGs. Specifically, in the realm of fiction, the notion of alignment can be made to work as an extra “label” for a character. Rather than totally defining the character, pigeonholing him into one of a few boxes, I find that it works better as a starting point. In a couple of words, we can neatly capture a bit of a character’s essence. It doesn’t always work, and it’s far too coarse for much more than a rough draft, but it can convey the core of a character, giving us a foundation.

First, though, we need to know what alignment actually is. In the “traditional” system, it’s a measure of a character’s nature on two different scales. These each have three possible values; elementary multiplication should tell you that we have nine possibilities. Clearly, this isn’t an exact science, but we don’t need it to be. It’s the first step.

One of the two axes in our alignment graph is the time-honored spectrum of good and evil. A character can be Good, Evil, or Neutral. In a game, these would be quite important, as some magic spells detect Evil or only affect Good characters. Also, some GMs refuse to allow players to play Evil characters. For writing, this distinction by itself matters only in certain kinds of fiction, where “good versus evil” morality is a major theme. Mythic fantasy, for example, is one of these.

The second axis is a little harder to define, even among gamers. The possibilities, again, are threefold: Lawful, Chaotic, or Neutral. Broadly, this is a reflection of a character’s willingness to follow laws, customs, and traditions. In RPGs, it tends to have more severe implications than morality (e.g., D&D barbarians can’t be Lawful), but less severe consequences (few spells, for example, only affect Chaotic characters). In non-gaming fiction, I find the Lawful–Chaotic continuum to be more interesting than the Good–Evil one, but that’s just me.

As I said before, there are nine different alignments. Really, all you do is pick one value from each axis: Lawful Good, Neutral Evil, etc. Each of these affects gameplay and character development, at least if the GM wants it to. And, as it happens, each one covers a nice segment of possible characters in fiction. So, let’s take a look at them.

Lawful Good

We’ll start with Lawful Good (LG). In D&D, paladins must be of this alignment, and “paladin” is a pretty good descriptor of it. Lawful Good is the paragon, the chivalrous knight, the holy saint. It’s Superman. LG characters will be Good with a capital G. They’ll fight evil, then turn the Bad Guys over to the authorities, safe in the knowledge that truth and justice will prevail.

The nicey-niceness of Lawful Good can make for some interesting character dynamics, but they’re almost all centered on situations that force the LG character to make a choice between what is legal and what is morally right. A cop or a knight isn’t supposed to kill innocents, but what happens when inaction causes him to? Is war just, even that waged against evil? Is a mass murderer worth saving? LG, at first, seems one-dimensional; in a way, it is. But there’s definitely a story in there. Something like Isaac Asimov’s “Three Laws of Robotics” works here, as does anything with a strict code of morality and honor.

Some LG characters include Superman, obviously, and Eddard Stark of A Song of Ice and Fire (and look where that got him). Real-world examples are harder to come by; a lot of people think they’re Lawful Good (or they aspire to it), but few can actually uphold the ideal.

Neutral Good

You can be good without being Good, and that’s what this alignment is. Neutral Good (NG) is for those that try their best to do the right thing legally, but who aren’t afraid to take matters into their own hands if necessary (but only then). You’re still a Good Guy, but you don’t keep to the same high standards as Lawful Good, nor do you hold others to those standards.

Neutral Good fits any general “good guys” situation, but it can also be more specific. It’s not the perfect paragon that Lawful Good is. NG characters have flaws. They have suspicions. That makes them feel more “real” than LG white knights. The stories for an NG protagonist are easier to write than those for LG, because there are more possibilities. Any good-and-evil story works, for starters. The old “cop gets fired/taken off the case” also fits Neutral Good.

Truly NG characters are hard to find, but good guys that aren’t obviously Lawful or Chaotic fit right in. Obi-Wan Kenobi is a nice example, as Star Wars places a heavy emphasis on morality. The “everyday heroes” we see on the news are usually NG, too, and that’s a whole class that can work in short stories or a serial drama.

Chaotic Good

I’ll admit, I’m biased. I like Chaotic Good (CG) characters, so I can say the most about them, but I’ll try to restrain myself. CG characters are still good guys. They still fight evil. But they do it alone, following their own moral compass that often—but not always—points towards freedom. If laws get in the way of doing good, then a CG hero ignores them, and he worries about the consequences later.

Chaotic Good is the (supposed) alignment of the vigilante, the friendly rogue, the honorable thief, the freedom fighter working against a tyrannical, oppressive government. It’s the guys that want to do what they believe is right, not what they’re told is right. In fiction, especially modern fantasy and sci-fi, when there are characters that can be described as good, they’re usually Chaotic Good. They’re popular for quite a few reasons: everybody likes the underdog, everyone has an inner rebel, and so on. You have a good guy fighting evil, but also fighting the corruption of The System. The stories practically write themselves.

CG characters are everywhere, especially in movies and TV: Batman is one of the most prominent examples from popular culture of the last decade. But Robin Hood is CG, too. In the real world, CG fairly accurately fits most of the heroes of history, those who chose to do the right thing even knowing what it would cost. (If you’re of a religious bent, you could even make the claim that Jesus was CG. I wouldn’t argue.)

Lawful Neutral

Moving away from the good guys, we come to Lawful Neutral (LN). The best way to describe this alignment, I think, is “order above all”. Following the law (or your code of honor, promises, contracts, etc.) is the most important thing. If others come to harm because of it, that’s not your concern. It’s kind of a cold, calculating style, if you ask me, but there’s good to be had in it, and “the needs of the many outweigh the needs of the few” is completely Lawful Neutral in its sentiment.

LN, in my opinion, is hard to write as a protagonist. Maybe that’s my own Chaotic inclination talking. Still, there are plenty of possibilities. A judge is a perfect example of Lawful Neutral, as are beat cops. (More…experienced cops, as well as most lawyers, probably fall under Lawful Evil.) Political and religious leaders both fall under Lawful Neutral, and offer lots of potential. But I think LN works best as the secondary characters. Not the direct protagonist, but not the antagonists, either.

Lawful Neutral, as I said above, best describes anybody whose purpose is upholding the law without judging it. Those people aren’t likely to be called heroes, but they won’t be villains, either, except in the eyes of anarchists.

True Neutral

The intersection of the two alignment axes is the “Neutral Neutral” point, which is most commonly called True Neutral or simply Neutral (N). Most people, by default, go here. Every child is born Neutral. Every animal incapable of comprehending morality or legality is also True Neutral. But some people are there by choice. Whether they’re amoral, or they strive for total balance, or they’re simply too wishy-washy to take a stand, they stay Neutral.

Neutrality, in and of itself, isn’t that exciting. A double dose can be downright boring. But it works great as a starting point. For an origin story, we can have the protagonist begin as True Neutral, only coming to his final alignment as the story progresses. Characters that choose to be Neutral, on the other hand, are harder to justify. They need a reason, although that itself can be cause for a tale. They can make good “third parties”, too, the alternative to the extremes of Good and Evil. In a particularly dark story, even the best characters might never be more “good” than N.

True Neutral people are everywhere, as the people that have no clear leanings in either direction on either axis. Chosen Neutrals, on the other hand, are a little rarer. It tends to be more common as a quality of a group rather than an individual: Zen Buddhism, Switzerland.

Chaotic Neutral

Seasoned gamers are often wary of Chaotic Neutral (CN), if only because it’s often used as the ultimate “get out of jail free” card of alignment. Some people take CN as saying, “I can do whatever I want.” But that’s not it at all. It’s individualism, freedom above all. Egalitarianism, even anarchy. For Chaotic Neutral, the self rules all. That doesn’t mean you have a license to ignore consequences; on the contrary, CN characters will often run right into them. But they’ll chalk that up as another case of The Man holding them back.

If you don’t consider Chaotic Neutral to be synonymous with Chaotic Stupid, then you have a world of character possibilities. Rebels of all kinds fall under CN. Survivalists fit here, too. Stories with a CN protagonist might be full of reflection, or of fights for freedom. Chaotic Neutral antagonists, by contrast, might stray more into the “do what I want” category. In fiction, the alignment tends to show up more in stories where there isn’t a strong sense of morality, where there are no definite good or bad guys. A dystopic sci-fi novel could easily star a CN protagonist, but a socialist utopia would see them as the villains.

Most of the less…savory sorts of rogues are CN, at least those that aren’t outright evil. Stoners and hippies, anarchists and doomsday preppers, all of these also fit into Chaotic Neutral. As for fictional characters, just about any “anti-hero” works here. The Punisher might be one example.

Lawful Evil

Evil, it might be said, is relative. Lawful Evil (LE) might even be described as contentious. I would personally describe it as tyranny, oppression. The police state in fiction is Lawful Evil, as are the police who uphold it and the politicians who created it. For the LE character, the law is the perfect way to exploit people.

All evil works best for the bad guys, and it takes an amazing writer to pull off an Evil protagonist. LE villains, however, are perfect, especially when the hero is Chaotic Good. Greedy corporations, rogue states, and the Machiavellian schemer are all Lawful Evil, and they all make great bad guys. Like CG, Lawful Evil baddies are downright easy to write, although they’re certainly susceptible to overuse.

LE characters abound, nearly always as antagonists. Almost any “evil empire” of fiction is Lawful Evil. The corrupted churches popular in medieval fantasy fall under this alignment, as well. In reality, too, we can find plenty of LE examples: Hitler, the Inquisition, Dick Cheney, the list goes on.

Neutral Evil

Like Neutral Good, Neutral Evil (NE) fits best into stories where morality is key. But it’s also the best alignment to describe the kind of self-serving evil that marks the sociopath. A character who is NE is probably selfish, certainly not above manipulating others for personal gain, but definitely not insane or destructive. Vindictive, maybe.

Neutral Evil characters tend to fall into a couple of major roles. One is the counterpart to NG: the Bad Guy. This is the type you’ll see in stories of pure good and evil. The second is the true villain, the kind of person who sees everyone around him as a tool to be used and—when no longer required—discarded. It’s an amoral sort of evil, more nuanced than either Lawful or Chaotic, and thus more real. It’s easy to truly hate a Neutral Evil character.

Some of the best antagonists in fiction are NE, but so are some of the most clichéd. The superhero’s nemesis tends to be Neutral Evil, unless he’s a madman or a tyrant; the same is true of the bad guys of action movies. Real-life examples also include many corporate executives (studies claim that as many as 90% of the highest-paid CEOs are sociopaths), quite a few hacking groups (those that are doing it for the money, especially), and likely many of the current Republican presidential candidates (the Democrats tend to be Lawful Evil).

Chaotic Evil

The last of our nine alignments, Chaotic Evil (CE) embraces chaos and madness. It’s the alignment of D&D demons, true, but also psychopaths and terrorists. Pathfinder’s “Strategy Guide” describes CE as “Just wants to watch the world burn”, and that’s a pretty good way of putting it.

For a writer, though, Chaotic Evil is almost a trap: it’s too easy. CE characters don’t need motivations, or organization, or even coherent plans. They can act on impulse, which is certainly interesting, but maybe not the best for characterization. It’s absolutely possible to write a Chaotic Evil villain (though probably impossible to write a believably CE anti-hero), but you have to be careful not to give in to him. You can’t let him take over, because he could do anything. Chaos is inherently unpredictable.

Chaotic Evil is easy to find in fiction. Just look at the Joker, or Jason Voorhees, or every summoned demon and Mad King in fantasy literature. And, unfortunately, it’s far too easy to find CE people in our world’s history: Osama bin Laden, Charles Manson, the Unabomber, and a thousand others along the same lines.

In closing

As I stated above, alignment isn’t the whole of a character. It’s not even a part, really. It’s a guideline, a template to quickly find where a character stands. Saying that a protagonist is Chaotic Good, for instance, is a shorthand way of specifying a number of his qualities. It tells a little about him, his goals, his motivations. It even gives us a hint as to his enemies: Lawful and/or Evil characters and groups, those most distant on either alignment axis.

In some RPGs, acting “out of alignment” is a cardinal sin. It certainly is for player characters like D&D paladins, who have to adhere to a strict moral code. (How strict that code is depends on the GM.) For a fictional character in a story, it’s not so bad, but it can be jarring if it happens suddenly. Given time to develop, on the other hand, it’s a way to show the growth of a character’s morality. Good guys turn bad, lawmen go rogue, but not on a whim.

Again, alignment is not a straitjacket to constrain you, but it can be a writing aid. Sure, it doesn’t fit all sizes. As a lot of gamers will tell you, it’s not even necessary for an RPG. But it’s one more tool at our disposal. This simple three-by-three system lets us visualize, at a glance, a complex web of relationships, and that can be invaluable.

Let’s make a language – Part 5c: Verbs (Ardari)

The nouns of our conlang Ardari, you might recall, were quite complex. So you’ll be happy to know that the verbal morphology, by contrast, is actually fairly simple. That doesn’t mean it’s less capable of expressing the full range of description, nor does it mean that everything is entirely straightforward. It’s just a little easier to figure out than the nouns, that’s all.

The shape of a verb

Where Ardari nouns had three main classes, verbs effectively have two. A verbal stem can end in either a consonant or a vowel, and the vowel stems are typically verbs with an intransitive meaning. This isn’t always true, of course, but it’s a fairly effective rule of thumb. There aren’t any genders or cases to worry about, no definite markers or plurals or anything like that. Just two main classes, and they share the same basic conjugation pattern.

We’ll use the same two example words from last time, but they’ll be the Ardari stem forms brin- “walk” and tum- “eat”. See the hyphens? That means that these aren’t words in their own right. They can’t stand alone, but we’ll see how to turn them into proper words.

Concord

Like Isian and many natural languages, Ardari requires agreement markers on its verbs. They’re a bit odd, though, mainly because there are two sets of them, and they don’t exactly mean what you think. First, let’s take a look at them.

Concord      Agent Sing.  Agent Pl.  Pat. Sing.  Pat. Pl.
1st Person   -o           -on        -ma         -mi
2nd Person                           -tya        -tyi
3rd Person   -a           -e         -da         -dyi

As you can see, the forms change for person and number, but also for “agent” and “patient”. These are more technical terms than the usual “subject” and “object”, and for good reason. They don’t quite match up. For transitive verbs, it’s simple: the subject is the agent and the object is the patient. So we can say konatö fèse tumada “the man eats food”. (Verbs usually come at the end of a sentence in Ardari, by the way.)
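
(Breaking that last word down: tumada is the stem tum- plus -a, the third-person singular agent marker, plus -da, the third-person singular patient marker. In other words, a transitive verb carries a concord marker for each of its arguments.)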

Intransitive verbs are a little different. For many of them, the same rule applies: the subject is the agent. This is true for our example: brino “I walk”. But some are different. This class of irregular verbs consists mainly of those with less “active” meanings, like minla- “stand”. For these, the subject takes the patient concord markers: minlama “I stand”. Most of these are verbs just like “stand”, in the sense that they’re kind of “static”. I’ll point out those few that act like this as we meet them, but it’s one more thing to watch out for. Fortunately, they’re pretty easy to spot, as they’re mostly the stems that end in vowels.

Also, there’s a special concord marker -y. This is used in two main places. First, Ardari uses this for “weather” verbs, where English would have a dummy “it” as subject, as in luvy “it’s raining”. Second, any transitive verb can take it to make a passive-like construction: fèsetö tumyd “the food was eaten”, using the preterite tense marker we’ll see in a second.

Tense and aspect

Ardari has a total of seven classes that are effectively combinations of tense and aspect. Each of them (except the present, which is considered the default) has its own suffix, and that suffix goes after the concord markers above. The choices are:

  • -s: A present “progressive” that indicates an ongoing action: fèse tumodas “I am eating food”.

  • -d: The preterite, which is effectively a past tense, but always implies a completed action: fèse tumodad “I ate food”.

  • -dyt: Usually a past tense, referring to actions that were ongoing at the moment in question: fèse tumodadyt “I was eating food”.

  • -jan: Implies that an event began in the past: fèsetö tumodajan “I began to eat the food”. (The technical term is inceptive.)

  • -ll: A basic future tense: fèse tumodall “I will eat food”.

  • -lyët: Used for speaking of events that will end in the future: fèsetö tumodalyët “I’m about to finish eating the food”. (Technically known as a cessative.)

Mood markers

In Ardari, a change in mood is handled by a separate set of suffixes that follow the tense markers. These include:

  • -u (-ru when following a vowel): A simple negation marker: brinaru “he doesn’t walk” or brinasu “he isn’t walking”. This can replace the final vowel of most of the other mood markers.

  • -ka (-ga when following voiced consonants): A subjunctive, mostly used for various types of phrases we’ll see later.

  • -afi (-rafi after a vowel): A conditional mood that states that another action depends on this one: brinarafi “if he walks”.

  • -je: An imperative, for giving commands or orders, but also used to express a desire, hope, or even a call to action: tumje “let’s eat”. (When combined with the negative marker, it becomes -ju, and it’s technically called the prohibitive.)

  • -rha: This one’s a little hard to explain, but it implies that the speaker assumes or otherwise doesn’t know for sure that the action has taken place: fèsetö tumadadrha “he ate the food, as far as I know”. (This is very much a rough translation.)

Most of these can be combined. The negative marker works with pretty much all the others, and the “indirect” -rha goes with anything but the imperative. The conditional and subjunctive are mutually exclusive, though, and the imperative doesn’t make sense with anything else. In total, there are 14 sensible combinations:

  • (no suffix): indicative
  • -ka: subjunctive
  • -afi: conditional
  • -je: imperative
  • -u: indicative negative
  • -ku: subjunctive negative
  • -afu: conditional negative
  • -ju: imperative negative (prohibitive)
  • -rha: indicative indirect
  • -karha: subjunctive indirect
  • -afirha: conditional indirect
  • -rhu: indicative indirect negative
  • -karhu: subjunctive indirect negative
  • -afirhu: conditional indirect negative

These 14 moods, combined with the seven tense suffixes and the 31 possibilities for concord give Ardari just over three thousand forms for each verb, but they’re all so regular and predictable that we don’t have to worry about ever memorizing anything like that. Instead, we can just build up a verb piece by piece. That’s the power of the agglutinative style of language.
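
As a quick demonstration, using only the affixes above: start from the stem tum- “eat”, add the first-person singular agent -o and the third-person singular patient -da for tumoda “I eat it”, tack on the preterite -d for tumodad “I ate it”, and finish with the negative -u for tumodadu “I didn’t eat it”. Every form, no matter how long, is built the same way.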

Vocabulary

That’s pretty much it for the basics of Ardari verbs. There’s a lot more to them, but we’ll cover everything else in a later post. For now, here are some new words, including all the new verbs I’ve used so far. With the exceptions of minla- and luz-, these are all perfectly regular, even the one for “to be”.

  • to be: èll-
  • to become: onyir-
  • to seem: ègr-
  • to stand: minla-
  • to have: per-
  • to come: ton-
  • to go: shin-
  • to drink: kabus-
  • to laugh: jejs-
  • to hold: yfily-
  • to hear: ablon-
  • to wash: oznèr-
  • to cook: lòsty-
  • to speak: sim-
  • to call: qon-
  • to read: proz-
  • to write: farn-
  • to want: majtas-
  • to rain: luz-

Next time

The next post will be about word order, so that we can finally start constructing sentences in our constructed languages. After that will be the third part of the trinity of word categories, the adjective. We’re really starting to flesh out both our conlangs. Pretty soon, we’ll be able to write a whole story in them.

Assembly: narrowing the field

Okay, I guess I told a little fib. But I’ve got a good reason. I’ve been thinking this over for a while, and I still haven’t come to a decision. I want to do a few posts introducing an assembly language and the “style” of programming it, but I don’t know which assembly language to use. So, in the interest of better informing you (and giving myself an extra week to figure out an answer), I’ve decided to put some of my thought processes into words.

The contenders

I know a handful of assembly variants, and I know of quite a few more, especially if you count different versions of architectures as separate languages. Some of these are more useful for introductory purposes than others, while some are much more useful in actual work. These two subsets, unfortunately, don’t overlap much. But I’ve come up with four finalists for this coveted crown, and here they are, complete with my own justifications.

6502

If you’re older than about 25, then you’ve probably used a 6502-based system before, whether you knew it or not. It was a mainstay for 80s “home computers”, including the Apple II and Commodore 64, and a customized version was used in the NES. It’s still a hobbyist favorite, mostly for nostalgia reasons rather than any technical superiority, and there’s no sign that it will ever become truly extinct. (Bender, after all, has one in his head, which explains a lot.)

Pros:

  • 8-bit processors are about as simple as you can get while still being useful.
  • There’s a lot of work out there already, in terms of tutorials, programming guides, etc.
  • An online assembler exists, which makes things much easier.
  • Plenty of emulators are available, although these are usually for specific 6502 computers.

Cons:

  • 8-bit can be very limiting, mostly because it is so simple.
  • There aren’t many registers, which slows down a lot of work and means a lot of memory trickery.
  • Despite what its followers might think, 6502 is pretty much a dead end for development, as it has been for about 20 years.

Early x86

By “early”, I specifically mean the 16-bit x86 processors, the 8086 and 80286. These are the CPUs of the first IBM-compatible personal computers, and the ancestors of the i5 and A10 we use today. You would think that would give it direct relevance, but you’d actually be wrong. Today’s x86 processors, when running in 64-bit mode, actually can’t directly run 16-bit code. But we’d be using an emulator of some sort, anyway, so that’s not a problem that would concern us.

Pros:

  • Very familiar and widespread, as (a descendant of) x86 is still used in just about every PC today.
  • 16-bit isn’t that much more complicated than 8-bit.
  • All those old DOS assembly tutorials are out there, somewhere, and most of them contain useful information.
  • Even though current processors can’t execute 16-bit code directly, you can still use one of the dozens of emulators out there, including DOSBox.

Cons:

  • The segmented memory architecture is weird and hard to explain, though there’s a quick illustration after this list.
  • It’s easy to fall into the trap of carrying skills and tricks that were relevant here over to more modern applications, where they simply don’t apply.
  • A lot of people just don’t like x86 and think it’s horrible; I don’t understand this, but I respect it.
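
For the curious, that segmentation works like this: a “full” address is really a segment:offset pair, and the physical address is the segment shifted left four bits (multiplied by 16) plus the offset. To take an arbitrary example:

    $1234:$5678 → ($1234 × $10) + $5678 = $179B8

One byte can have many different names, which is exactly the sort of weirdness I mean.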

AVR

Atmel’s AVR line of microcontrollers is pretty popular in the embedded world. One of them powers the Arduino, for example. Thus, there’s actually a built-in market for AVR assembly, although most programmers now use C. Of course, AVR has its own problems, not the least of which is its odd way of segregating program and data memory.

Pros:

  • Very relevant today as an embedded platform.
  • A ton of support online, including tutorials, forums, etc.
  • The assembly language and architecture, despite a few warts, are actually nice, in my opinion.
  • Lots of good tools, including a port of GCC.

Cons:

  • Emulator quality is hit or miss. (AVR Studio was okay when I used it 4 years ago, but it’s non-free and Windows only, and the free options are mostly beta quality.)
  • The Harvard architecture (totally separate memory spaces for code and data) is used by almost nothing else today, and it’s cumbersome at best.
  • AVR is very much a microcontroller platform, not a microprocessor one. It’s intended for use in embedded systems, not PCs.

MIPS

MIPS is a bit of an oddball. Sure, it’s used in a few places out there. There’s a port of Android to it, and MIPS CPUs were used in most of the 90s consoles, like the N64 and PlayStation. There’s not much modern development on it, though, except in the Chinese Loongson, which started out as a knock-off, but then became a true MIPS implementation. But its true value seems to be in its assembly language, which is often recommended as a good way to learn the ropes.

Pros:

  • Fairly simple assembly language and a sensible architecture.
  • Tools are widespread, including cross-compilers, emulators, and even native versions of Linux.
  • RISC, if you like that. After all, “RISC is good”, to quote Hackers.

Cons:

  • Not quite as relevant as it once was. (If I had written this 20 years ago, the choice would probably be between this and x86.)
  • A bit more complex than the others, which actually removes some of the need for assembly.
  • I don’t really know it that well, so I would have to learn it first.

The also-rans

Those aren’t the only assembly language platforms out there. There are quite a few that are popular enough that they could fit here, but I didn’t pick them for whatever reason. Some of them include:

  • PowerPC: Used in Macs from about a decade ago, as well as the consoles of the 2000s (GameCube, Wii, PS3, Xbox 360), but I don’t know much about it, and it’s mostly on servers now.

  • 68000: The 16-bit CPU from the Sega Genesis and the original Macintosh, later developed into a true 32-bit processor. It has its supporters, and it’s not that bad, but again, I don’t know it like I do the others.

  • Z80: This one was used by a few home computers, like the ZX80, Kaypro, and (my favorite) the TRS-80. It’s a transparent remake of the 8080 (forerunner to the 8086) with just enough detail changed to avoid lawsuits. But I know x86 better.

  • PIC: One of the most popular architectures in the 8-bit embedded world. I’ve read a little bit about it, and I don’t exactly like what I see. Its assembly requires a few contortions that I think distract from the task at hand.

  • ARM: The elephant in the room. Technically, ARM is a whole family of architectures, each slightly different, and that is the main problem. The tools are great, the assembly language isn’t half bad (but increasingly less necessary). It’s just that there’s too much choice.

  • MIX: Donald Knuth invented this fictional assembly language for his series The Art of Computer Programming. Then, after 30 years of work, he scrapped it for the next edition, replacing it with the “modern” MMIX. Neither one of them works here, in my opinion. MMIX has a lack of tool support (since it’s not a “real” machine language) and the best tutorial is Knuth’s book. MIX is even worse, since it’s based on the horrible architectures of the 60s and early 70s.

Better times ahead

Hopefully, I’ll come to a decision soon. Then I can start the hard part: actually making assembly language interesting. Wish me luck.