Novel Month 2015 – Day 5, morning

I started writing late last night, so I figured I’d roll it into a single post.

Chapter 2 is in full swing. I still haven’t gotten out of the exposition phase, but things are developing. It’s a different POV, but one I can empathize with, which makes the whole thing go faster. I’d say I’m about a third of the way done with it. And that means I’m running out of time to figure out where the story goes from here.

On the off chance that I write some more this morning, I’ll throw in a “late morning” post or something. Otherwise, see you tonight or early tomorrow.

This session’s word count: 2,177
Total word count: 8,891

Novel Month 2015 – Day 4, morning

I’m ready to go to bed.

I finished Chapter 1. I won’t say it’s the best thing I’ve ever written, but it’s not supposed to be. It’s pretty much all worldbuilding, setting the stage for the rest of the story. I’m still not sure what that story is, but I’ve got time to work that out.

This session’s word count: 1,518
Total word count: 6,714

Novel Month 2015 – Day 3, continued

So I’m feeling a bit better, and I’m definitely more rested. (I slept till 4:30 PM!)

I finished the penultimate scene of Chapter 1, and I plan on completing the chapter before I go to bed. It’s all still exposition and worldbuilding at this point, but the story is coming together.

I’m ahead of the pace for now, and I hope to keep it that way. My weird sleeping schedule means I’ll lose a day somewhere before the end of November, so it’s important that I keep moving.

This session’s word count: 828 (Seems silly to call it “today’s” count at this point.)
Total word count: 5,196

Novel Month 2015 – Day 2, continued

I told you I’d be back.

I slept really late today, and that’s probably going to be the case for the rest of this week, so I’ll likely be doing these “split” updates the next few days. I’ve been writing some more on Chapter 1, and I’m close to the 3/4 mark. Maybe I can finish it late tonight. If not, it’ll definitely be tomorrow.

Today’s word count: 1,086
Total word count: 3,873

Novel Month 2015 – Day 2

It’s after midnight, so this counts as a new day, right?

I got bored, and I’m not sleepy, so I wrote a bit more. It’s not much, just finishing the scene I was in yesterday evening. If nothing else, it’s that much less I’d have to write later. This all but finishes the first half of Chapter 1, and it gives me time to think, so I’m calling it a win.

Whatever I write after I wake up will probably go in a continuation post tonight.

Today’s word count: 950
Total word count: 2,787

Novel Month 2015 – Day 1

Let’s get this started.

The first day could’ve gone better, but I beat the average. Most of the writing was setup work, building the bridge from the first part of the overarching story. I’d say I’m about a third of the way through Chapter 1, maybe a little less. I had planned for more, but I’ll take it.

I’m using the built-in word count in Vim, so this may not reflect the actual count, but it should be close enough. If we get to the end of the month and I’m that close, then I didn’t do a good enough job.

Today’s word count: 1,837
Total word count: 1,837

At the starting gate

When this post goes up, it’ll be Halloween, even though I’m writing it a couple of days ahead of time. Tomorrow, then, will be November 1st, and that means it’s time to write a novel. Officially, this isn’t NaNoWriMo, because I’m not following their rules to the letter. But I am going by what I feel is the original spirit of the challenge.

So here’s the goal: 50,000 words or a complete novel, whichever comes first. The deadline? Midnight on the 30th. Each day, I’ll try to post a little update about my progress. This certainly won’t be some kind of live blog, though, so don’t expect up-to-the-minute results. After all, I can only write so much. Regular posts (writing stuff on Mondays, code on Wednesdays, conlangs on Fridays) will resume December 2. Until then, I’ll be in hardcore writing mode.

I already have the basic idea for the story I’ll be writing. It’s a continuation of the one I did in 2013. To be honest, I have written parts 2 and 3, along with about half of part 4, but I’ve decided to scrap that work, because I have a better understanding of the setting now, and the old parts don’t fit into it anymore. (Technically, NaNoWriMo requires an original story, and you’re not supposed to start thinking about it until October. Yet another reason why I’m not following the letter of the rules.)

Now, my sleeping schedule is a bit…odd, and my writing schedule is even worse, so I’m not going to schedule these daily updates like I have been with everything else on the site. They’ll go up when I feel I’m “done” writing for the day. That may be at 2 PM or 2 AM. There’s not much I can do about that, short of forcing myself to stay on a schedule, and…let’s just say that circumstances tend to conspire against that.

If you want to play along at home, that’s great! Whether you stick to the NaNoWriMo rules or follow my lead and take it easy, just go for it. If you can’t do it, there’s always next year.

Let’s make a language – Part 8b: Pronouns (Conlangs)

We’ve gotten away with neglecting pronouns in our budding conlangs of Isian and Ardari so far, but now the time has come to fill the gap. We’ll give both of them a nice set of pronouns to use, checking off all the boxes from the last theory post.

Isian

Isian will have a fair amount of complexity in its pronominal system, and it will contain more than one irregularity. In that sense, we’re making it much like the languages common in the West.

If you’ll recall, Isian doesn’t use case on its nouns, much like English. But we will have personal pronouns that change depending on their role in a sentence. Specifically, most Isian personal pronouns have distinct subject, object, and possessive forms. Here’s the full list:

Pronoun        Subject  Object  Possessive
1st Singular   em       men     mi
1st Plural     mit      mida    mich
2nd Person     so       tas     ti
3rd M. Sing.   i        im      ey
3rd F. Sing.   sha      shim    shi
3rd M. Pl.     is       sim     si
3rd F. Pl.     shas     sham    shay

In the third person, there are separate pronouns for masculine and feminine; unlike English, the plural also changes for gender. (Masculine is the default in “formal” Isian, but we’ll see a way to change that in a moment.)

We can use the subject and object pronouns in sentences anywhere a noun would go: sha fusas men “she kissed me”; em hame tas “I love you”. The possessive pronouns, however, function more like articles, and they always go at the beginning of a noun phrase: mi doyan “my brother”; ey wa talar “his big house”.

We also have a “generic” third-person pronoun, which doesn’t change for case. In the singular, it’s ed, while the plural form is des. This can be used like the English generic “you” or “one”: ed las an yoweni “you can’t enter”. In informal speech, we can also use these as genderless personal pronouns, more like English singular “they”: ed an daliga e talar “they don’t live in the house”.

Finally, we have the reflexive or intensive pronoun lan. This covers the functions of all of English’s “-self” pronouns all by, well, itself: e sam sipes lan “the man cut himself”; e esher hishis lan “the girls washed themselves”; em ocata lan “I asked myself”.

Beyond the personal pronouns, we have a couple more classes. We’ll start with Isian’s demonstratives, which come in distinct singular and plural forms. For near things, we have the singular ne and plural nes. Far things are denoted by to and tos. These four words are close in meaning and scope to English “this”, “these”, “that”, and “those”, respectively, and they can be used in much the same way, either as independent pronouns or like adjectives: nes “these”, nes jedi “these boys”.

Next are the interrogatives, or question words. Isian has two of these. For people, we use con, while things take cal. All the other possible questions (where, when, etc.) can be made from compounds or phrases based on one of these, which we’ll see in a later post, when we look at forming questions.

More relevant to today’s subject are the indefinite pronouns, which are derived from the question words. We have four pairs of these, each of them created by means of a prefix:

  • je- “some”: jecon “someone”, jecal “something”.
  • es- “any”: escon “anyone”, escal “anything”.
  • licha- “every”: lichacon “everybody”, lichacal “everything”.
  • ano- “none”: anocon “nobody” or “no one”, anocal “nothing”.

Finally, “standard” Isian (assuming a culture that has such a thing) doesn’t normally allow pronoun omission, or pro-drop. We’ve been using it so far, but that’s because we didn’t have any pronouns up to this point. Our hypothetical speakers of Isian would find it a little informal, though.

Ardari

Ardari has quite a few more pronouns than Isian, but the idea is still the same. First, let’s take a look at the personal pronouns:

Pronoun            Subject  Object  Possessive
1st Singular       my       myne    mynin
1st Excl. Plural   nyr      nyran   nyri
1st Incl. Plural   sinyr    sinran  sinri
2nd Informal       sy       syne    synin
2nd Form. Sing.    tro      trone   tronin
2nd Form. Pl.      trowar   trone   tronin
3rd Masc. Sing.    a        anön    ani
3rd Masc. Pl.      ajo      ajon    oj
3rd Fem. Sing.     ti       tise    tini
3rd Fem. Pl.       tir      ti      tisin
3rd Neuter Sing.   ys       yse     ysin
3rd Neuter Pl.     ysar     ysar    ysoj
Impersonal         mantö    manetö  manintö

That looks like a lot, but it’s really not too much. The cases are largely the same as they were in the simpler conlang; the complexity lies in the left-hand column, in the extra distinctions Ardari draws.

For the first person, the singular should be obvious. But we have two plurals, labeled “exclusive” and “inclusive”. Which one to use is determined by whether you want to include the listener in the action. If you do, you use the inclusive; otherwise, you need the exclusive.

The second person again has a distinction unfamiliar to speakers of English, but this one shows up in plenty of other languages. The informal is used, surprisingly enough, in informal situations, such as among friends, and it works for singular and plural. The formal is for people you don’t know as well, when you need to show deference, or similar situations. It does change for the plural, but only if it’s the subject.

The third person shouldn’t be that hard to figure out. Remember that Ardari has masculine, feminine, and neuter. Here, we can use the neuter for the case of the unknown or of mixed gender; it doesn’t carry the same connotations of inhumanity as English “it”.

The impersonal form can be used for generic instances and cases where you’re not sure which person is right; it’s transparently derived from man “one”, with the definite article attached.

Reflexive pronouns can be made by adding the regular suffix -das to any object pronoun: mynedas “myself”; anöndas “himself”. Attach it to a subject pronoun, and you get an intensive meaning: mydas “I myself”.

And then we have a special, irregular pronoun lataj. This one roughly means “each other”, and it’s used anywhere you’d need a “reciprocal” meaning: ysar lataj salmedi “they love each other”.

Finally, to add flavor and that hint of verisimilitude, Ardari has vocative forms of a few pronouns. These are: second-person formal troda and plural trodavar; third-person masculine anaj and aja; third-person feminine tija (singular and plural); and third-person neuter singular ys.

Of course, few of these are really needed in Ardari, because the language employs pro-drop liberally, thanks to the concord marking on verbs. Wherever they can get away without a subject or even an object pronoun, our hypothetical Ardari speakers will drop it, except in the most formal situations.

For demonstratives, we have a threefold division. The table below shows the “determiner” form; separate pronouns can be made by adding the suffix -man. (Literally, zaman translates to “this one”, and so on for the rest.)

              Near   Middle  Far
Masc. Sing.   za     pro     gyon
Fem. Sing.    zi     pri     gyen
Neut. Sing.   zall   prall   alyör
Plural        zej    prej    ejn

“Near” covers things that are near to or known only by the speaker, or something specifically referred to recently in conversation, so that both speaker and hearer know it. “Middle” is used for things closer to the listener, or something that is well-known to both parties but absent. The “Far” demonstratives are used for things that are far away from both speaker and listener, are not known to the listener at all, or are speculative in some way.

A few examples of these, since there are so many, and they don’t fit the same pattern as English:

  • ablonyje zallman “listen to this”; uses the “near” form because the speaker knows it, but the listener doesn’t.

  • sinyr prallman virdondall “we’ll sell that one”; takes the middle form, indicating something nearby and known to both parties.

  • mynin tyeri ejnman majtasa “my daughter wants some of those”; the far form connotes something that neither the speaker nor the listener has.

After all that, the interrogatives are easy. In fact, they’re all derived from a single word, qom “what”. From this, we get qomban “who”, qomren “where”, qomlajch “when”, and qoman “which (one)”. These inflect like any other neuter noun, but they can’t take an article suffix.

Indefinite pronouns can be formed from these just like in Isian. (Call it linguistic borrowing or author laziness, the effect is the same.) We have four possibilities here: ta- “some”, za- “every”, du- “no”, and manö- “any”. Making whatever you need is as simple as slapping these in front of an interrogative: taqomban “someone”, zaqom “everything”, and so on.

Pausing the game

After this post, the series is going on temporary hiatus. You’ll see why tomorrow, but I’ll be back with more conlanging action on December 4. In the meantime, have fun playing with Isian, Ardari, or your own language.

When I come back, we’ll work on prepositional phrases, relative clauses, and whatever else I can think of. Then, for the start of the new year, you’ll get to see the first significant writing in both languages.

Assembly: optimization in the past and present

In this post, I won’t be discussing assembly language in any depth. Rather, I want to focus on one of the main reasons to use assembly: optimization. Actually, it might be the main reason today, because there’s not much need for assembly coding these days; it’s only when we want something to be as fast as possible that it comes into play.

Also, I’m moving away from the 6502 for this one, instead using the x86 architecture for my examples. Why? Because x86 is still the leading processor family for desktops, and so much has been written about it over the decades. There’s a lot of optimization info out there, far more than for just about any other architecture. Yes, ARM is growing, especially in the lower-end part of the market where assembly can still be very useful, but ARM—due to its very nature—is so fragmented that it’s hard to give concrete examples. Also, because x86 is so long-lived, we can trace the development of various processor features through its evolution. For that, though, we’ll need a bit of a history lesson.

Starting gates

The first microprocessors, from a bird’s-eye view, weren’t exactly complicated in function. They took an instruction from memory, decoded it, executed it, then moved on, sometimes jumping around the executable code space. Decoding each instruction and performing it were the only real hard parts. That’s one reason why RISC was so highly touted, as the smaller, more fundamental instruction set required less chip space for decoding and execution. (Look up something like VAX assembly for the opposite—CISC—end of the spectrum.)

Fetching the instruction was a simple load from memory, something every processor does as a matter of course. Decoding required a major portion of the circuit (the 6502 used a programmable array a bit like a modern FPGA, except that its function was fixed in the factory) but a comparatively small portion of processor time. Executing could require more memory accesses for the instruction’s operands, and it could take a lot of time, especially for something complex like multiplication—an ability the 6502, among others, lacks.

The birth of parallelism

But memory has always been slower than the processor itself. On all but the most complicated instructions, memory access takes the most time of any phase of execution. Thus, the prefetch queue was born. In a sense, this was the forerunner of today’s cache. Basically, it tried to predict the future by fetching the next few bytes from memory. That way, the long latency of a RAM access could be amortized over several instructions.

The problem with the prefetch queue, as with all cache, comes with branching. Branches, as we saw in earlier posts, are the key to making decisions in assembly language. But they force the processor to jump around, instead of following a linear path through memory. A branch, then, negates the advantage of the prefetch queue.

Processor designers (and assembly programmers) tried a few methods of working around the cost of branching. That’s why, at a certain time long ago, loop unrolling was considered a very important optimization technique. If you need to run a particular group of instructions, say, ten times, then it was a bit faster to “copy and paste” the assembly instructions than it was to set up a branching loop. It used more space, but the speed gains made up for that.
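
To make that concrete, here’s a rough sketch of the difference in NASM-style x86. (The do_work routine is just a hypothetical stand-in for the loop body.)

; Looped: one conditional branch per iteration.
        mov cx, 10          ; iteration counter
loop_top:
        call do_work        ; the body we want to repeat
        dec cx              ; count down; sets the zero flag at 0
        jnz loop_top        ; branch back until cx hits 0

; Unrolled: no branches at all, at the cost of ten times the code.
        call do_work
        call do_work
        call do_work
        ; ...and so on, ten calls in total

Both versions pay for the ten calls; the only difference is the loop overhead, which is exactly what unrolling eliminates.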

Another optimization trick was rearranging the branch instructions so that they would fail more often than not. For example, the x86 has a pair of instructions, JZ and JNZ, that branch if the zero flag is set or clear, respectively. (This is equivalent to the 6502’s BEQ and BNE, except that the x86 has more registers and instructions that can change the zero flag.) If you have a section of code that is run only when an argument is 0, and 0 rarely shows up, the naive way of writing it would be to skip over that section with a JNZ. But it might be faster (on these earlier processors, at least) to put the “only if 0” code at the end of the subroutine (or some other place that’s out of the way) and use JZ to branch to it when you need it.
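
Here’s a hedged sketch of that rearrangement, again in NASM-style x86. (The labels and the handle_zero routine are hypothetical.)

; Naive: the rare case sits inline, so the common case jumps over it.
        test ax, ax         ; sets the zero flag if ax is 0
        jnz common          ; taken almost every time
        call handle_zero    ; rare: ax was 0
common:
        ; ...rest of the routine

; Rearranged: the common case falls through; only the rare case jumps.
        test ax, ax
        jz  zero_case       ; almost never taken
        ; ...rest of the routine continues here

zero_case:
        call handle_zero
        ; ...then rejoin or return as appropriate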

In the pipeline

Eventually, the interests of speed caused a fundamental shift in the way processors were made. This was the birth of the pipeline, which opened a whole new world of possibilities, but also brought new optimization problems. The prefetch queue described above was one of the first visible effects of pipelining, but not the last.

The idea of a pipeline is that the processor’s main purpose, executing code, is broken into smaller tasks, each given over to a dedicated circuit. These can then work on their own, like stations on an assembly line. The instruction fetcher gets the next instruction, passes it on to the decoder, and so on. A well-oiled machine, in theory. In practice, it’s hard to get all the parts to work as one, and sometimes the pipeline would be stalled, waiting on one part to finish its work.

The beauty of the pipeline is that each stage is distinctly ordered. Once an instruction has been retrieved, the fetcher isn’t needed, so it can do something else. Specifically, it can fetch the next instruction. If the timing works out, it can fill up the prefetch queue and keep it topped off when it has the free time.

Fortune-telling

But branches are the wrenches in the works. Since they break the linear flow of instructions, they force the pipeline to stall. This is where the processor designers had to get smart. They had to find a way of predicting the future, and thus branch prediction was popularized.

When it works, branch prediction can completely negate the cost of a conditional jump. (Of course, when it fails, it stalls the whole thing, but that’s no worse than not predicting at all.) From an assembly language point of view, it means that we could mostly ditch the clever tricks like loop unrolling and condition negation. They would still have their uses, but they wouldn’t need to be quite so common. That’s a good thing, because the extra code size brought by loop unrolling affected another part of these newfangled processors: the cache.

Cache really came about as another way to make memory access faster. The method was a little roundabout, but it worked, and cache has stuck with us through today. It’s only getting bigger, too; most of the physical space on today’s processors is, in fact, cache memory. Many chips actually have more memory on the die than the 4 MB my first PC had in total.

The trick to cache comes from looking at how code accesses memory. As it turns out, there’s a pattern, and it’s called the principle of locality. Put simply, reading one memory location is a pretty good indicator that you’re going to be reading the ones right next to it. If we could just load all of those at once, then we’d save a ton of time. So that’s what they did. Instead of loading memory a byte or a word at a time, they started pulling them in 16 or more at once. And it was faster, but only while you stayed in the region of memory loaded into the cache.
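
A quick sketch shows why this matters. (NASM-style, 32-bit; the buffer label and the 64-byte line size are my assumptions, not universal constants.)

; Sequential: each cache line is fetched once and fully used.
        xor esi, esi
next_byte:
        mov al, [buffer + esi]  ; neighbors arrive in the same line
        inc esi
        cmp esi, 4096
        jb  next_byte

; Strided: one byte per line, so nearly every load is a fresh fill.
        xor esi, esi
next_line:
        mov al, [buffer + esi]
        add esi, 64             ; skip to the next cache line
        cmp esi, 4096
        jb  next_line

Both loops walk the same 4 KB region and trigger the same 64 line fills, but the first makes use of every byte it pulls in, while the second throws almost all of them away.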

Soon, cache was neither fast enough nor big enough, and designers had to find ways to fix both problems. And that’s where we are today. Modern x86 chips have three levels of cache. The first, L1, is the smallest, but also the fastest. L2 cache is a bit slower, but there’s more of it. And L3 is the slowest (though still faster than RAM), but big enough to hold the entirety of, say, Windows 95.

The present state of affairs

So now the optimization strategy once again focuses on space. Speed is mostly a non-factor, as the desktop x86 processors can execute most of their instructions in a single clock cycle, branch prediction saves us from the cost of jumps, and huge amounts of cache mean fewer of the horrifically slow memory accesses. But cache is limited, especially the ultra-fast L1. Every instruction counts, and we want them to all be as close together as possible. (Data is the same way, but we’ll ignore it for now.) Unrolling loops, for example, is a waste of valuable cache.

A few other optimizations have been lost along the way, made obsolete by the march of progress. One of my favorite examples is that of clearing a register. It’s a common need in assembly language, and the favored method of doing it early in the x86 days was by using the XOR instruction, e.g., XOR AX, AX. Exclusive-OR, when given the same value twice, always produces 0, and this method was quicker (and shorter) than loading an immediate 0 value (MOV AX, 0).
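
For the curious, the two encodings compare like this in 16-bit code:

xor ax, ax      ; 31 C0: 2 bytes, no immediate value to fetch
mov ax, 0       ; B8 00 00: 3 bytes for the same result

One byte doesn’t sound like much, but multiply it by every register clear in a program and it adds up.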

The self-XOR trick was undone by a combination of factors. The first was register renaming, which essentially gave the processor a bunch of “virtual” registers that it could use as internal scratch space, moving data to and from the “real” ones as needed. The second was out-of-order execution, and that takes a little more explaining.

If you’ve ever looked at the optimized assembly output of your favorite high-level compiler (and you should, at least once), then you might have noticed that things aren’t always where you put them. Every language allows some form of rearranging, as long as the compiler can prove that it won’t affect the outcome of the program. For example, take a look at this C snippet:

void f(int);   /* declared so the snippet stands on its own */

int a, d;
int b = 4;
int c = 5;
for (a = 0; a < b; a++) {
    f(a);
}

The final statement, assigning the value 20 to d, can be moved before the loop, since there’s no way that loop can change the value of b or c; the only thing the loop changes, a, has nothing to do with any other variable. (We’re assuming more code than this, otherwise the compiler would replace b, c, and d all with the simple constant 20.)

A processor can do this on the assembly level, too. But it has the same restriction: it can only rearrange instructions if there are no dependencies between them. And that’s where our little XOR broke. Because it uses the same register for source and destination, it created a choke point. If the next few instructions read from the AX register, they had to wait. (I don’t know for sure, but I’ve heard that modern processors have a special case just for self-XOR, so it lives again.)
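
Here’s a hedged sketch of that choke point. (NASM-style; the exact stall behavior varied from one processor to the next.)

xor ax, ax      ; reads and writes ax: tied to ax's previous value
add ax, bx      ; depends on the xor above, so it has to wait
mov dx, 7       ; independent: free to execute out of order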

Ending the beginning

This has been a bit of a rambling post, I’ll admit. But the key point is this: optimization is a moving target. What worked long ago might not work today. (There’s a reason Gentoo is so fast: everything gets compiled for your specific processor.) As processors advance, assembly programmers need new tricks. And we’re not the only ones. Compilers produce virtually all assembly that runs these days, but somebody has to give them the knowledge of new optimization techniques.