The JavaScript package problem

One of the first lessons every budding programmer learns is a very simple, very logical one: don’t reinvent the wheel. Chances are, most of the internal work of whatever it is you’re coding has already been done before, most likely by someone better at it than you. So, instead of writing a complex math function like, say, an FFT, you should probably hunt down a library that’s already made, and use that. Indeed, that was one of the first advances in computer science, the notion of reusable code.

Of course, there are good reasons to reinvent the wheel. One, it’s a learning experience. You’ll never truly understand the steps of an algorithm until you implement them yourself. That doesn’t mean you should go and use your own string library instead of the standard one, but creating one for fun and education can be very rewarding.

Another reason is that you really might be able to improve on the “standard”. A custom version of a standard function or algorithm might be a better fit for the data you’ll be working with. Or, to take this in another direction, the existing libraries might all have some fatal flaw. Maybe they use the wrong license, or they’re too hard to integrate, or you’d have to write a wrapper that’s no less complicated than the library itself.

Last of all, there might not be a wheel already invented, at least not in the language you’re using. And that brings us to JavaScript.

Bare bones

JavaScript has three main incarnations these days. First, we have the “classic” JS in the browser, where it’s now used for pretty much everything. Next is the server-side style exemplified by Node, which is basically the same thing, but without the DOM. (Whether that’s a good or bad thing is debatable.) Finally, there’s embedded JavaScript, which uses something like Google’s V8 to make JS work as a general scripting language.

Each of these has its own quirks, but they all share one thing in common. JavaScript doesn’t have much of a standard library. It really doesn’t. I mean, we’re talking about a language that, in its standardized form, has no general I/O. (console.log isn’t required to exist, while alert and document.write only work in the browser.) It’s not like Python, where you get builtin functions for everything from creating ZIP files to parsing XML to sending email. No, JS is bare-bones.

Well, that’s not necessarily a problem. Every Perl coder knows about CPAN, a vast collection of modules that contains everything you want, most things you don’t, and a lot that make you question the sanity of their creators. (Question no longer. They’re Perl programmers. They’ve long since lost their sanity.) Other languages have created similar constructs, such as Python’s PyPi (or whatever they’re using these days), Ruby’s gems, the TeX CTAN collection, and so on. Whatever you use, chances are you’ve got a pretty good set of not-quite-standard libraries, modules, and the like just waiting to be used.

So what about JavaScript? What does it have? That would be npm, which quite transparently started out as the Node Package Manager. Thanks to the increase in JS tooling in recent years, it’s grown to become a kind of general JavaScript repository manager, and the site backing it contains far more than just Node packages today. It’s a lot more…democratic than some other languages, and JavaScript’s status as the hipster language du jour has given it a quality that sometimes seems a bit questionable, but there’s no denying that it covers almost everything a JS programmer could need. And therein lies the problem.

The little things

The UNIX philosophy is often stated as, “Do one thing, and do it well.” JavaScript programmers have taken that to heart, and they’ve taken it to the extreme, and that has caused a serious problem. See, because JS has such a small standard library, there are a lot of little utility functions, functions that pop up in almost any sizable codebase, that aren’t going to be there.

For most languages, this would be no trouble. With C++, for instance, you’d link in Boost, and the compiler would only add in the parts you actually use. Java or C#? If you don’t import it, it won’t go in. And so on down the line, with one glaring exception.

Because JavaScript was originally made for the browser—because it was never really intended for application development—it has no capability for importing or even basic encapsulation. Node and recent versions of ECMAScript are working on this, but support is far from universal at this point. Worse, since JavaScript comes as plain text, rather than some intermediate or native binary format, even unused code wastes space and bandwidth. There’s no compilation step between server and client, and there’s no way to take only the parts of a library that you need, so evolutionary pressure has caused the JavaScript ecosystem to create a somewhat surprising solution.

That is the NPM solution: lots of tiny packages with myriad interdependencies, and a package manager that integrates with the build system to put everything together in an optimized bundle. JavaScript, of course, has no end of build systems, which come in and out of style like seasonal fashions. I haven’t really looked into this space in about eight months, and my knowledge is already obsolete! (Who uses Bower anymore? It’s all Webpack…I think. Unless something else has replaced by the time this post goes up.)

This is a prime example of the UNIX philosophy in action, and it can work. Linux package managers do it all the time: for reference, my desktop runs Debian, and it has about 2000 packages installed, most of which are simple C or C++ libraries, or else “data” packages used by actual applications. But I’m not so sure it works for JavaScript.

Picking up the pieces

From that one design decision—JavaScript sent as plain text—comes the necessity of small packages, but some developers have taken that a bit too far. In what other language would you need to import a new module to test if a number is positive? Or to pad the left side of a string? The JS standard library provides neither function, so coders have created npm packages for both, and those are only two of the most egregious examples. (Some might even be jokes, like the one package that does nothing but return the number 5, but it’s often hard to tell what’s serious and what isn’t. Think Poe’s Law for programmers.)

These wouldn’t be so bad, but they’re one more thing to remember to import, one more step to add into the build system. And the JavaScript philosophy, along with the bandwidth requirements its design enforces, combine to make “utility” libraries a nonstarter. Almost nobody uses bigger libraries like Underscore or Lodash these days; why bother adding in all that extra code you don’t need? People have to download that! The same even goes for old standbys like jQuery.

The push, then, is for ever more tiny libraries, each with only one use, one function, one export. Which wouldn’t be so bad, except that larger packages—you know, applications—can depend on all these little pieces. Or they can depend on each other. Or both, with the end result a spaghetti tangle of interdependent parts. And what happens if one of those parts disappears?

You might think that’s crazy, but it did happen. That “left pad” function I mentioned earlier? That one actually did vanish, thanks to a rogue developer, and it broke everything. So many applications and app libraries depended on little old leftpad, often indirectly and without even noticing, that its disappearance left them unable to update, unable to even install. For a few brief moments, half the JavaScript world was paralyzed because a package providing a one-line function, something that, in a language with simple string concatenation, comes essentially for free, was removed from the main code-sharing repository.

Solutions?

Is there a middle ground? Can we balance the need for small, space-optimized codebases with the robustness necessary for building serious applications? Or is NPM destined to be nothing more than a pale imitation of CPAN crossed with an enthusiast’s idea of the perfect Linux distro? I wish I knew, because then I’d be rich. But I’ll give a few thoughts on the matter.

First off, the space requirement isn’t going away anytime soon. As long as we have mobile and home data caps, bandwidth will remain important, and wasting it on superfluous code is literally too expensive. In backwards Third World countries without net neutrality, like perhaps the US by the time this post goes up, it’ll be even worse. Bite-size packages work better for the Internet we have.

On the other hand, a lot of the more ridiculous JS packages wouldn’t be necessary if the language had a halfway decent standard library. I know the standards crew is giving it their best shot on this one, but compatibility is going to remain an issue for the foreseeable future. Yes, we can use polyfills, but then we’re back to our first problem, because that’s just more code that has to be sent down the wire, code that might not be needed in the first place.

The way the DOM is set up doesn’t really help us here, but there might be a solution hiding in there. Speculative loading, where a small shim comes first, checking for the existence of the needed functions. If they’re found, then the rest of the app can come along. Otherwise, send out a request for the right polyfills first. That would take some pretty heavy event hacking, but it might be possible to make it work. (The question is, can we make it work better than what we’ve got?)

As for the general problem of ever-multiplying dependencies, there might not be a good fix. But there also might not be a need to keep the old maxim in mind. Do we really need to import a whole new package to put padding at the beginning of a string? Yes, JS has a wacky type system, but " " + string is what the package would do anyway. (Probably with a lot of extra type checking, but you’re going to do that, too.) If you only need it once, why bother going to all the trouble of importing, adding in dependencies, and all that?

Ultimately, what has happened is that, as JavaScript lacks even the most basic systems for code reuse, its developers have reinvented them, but poorly. As it has a stunted standard library, the third party has had to fill those gaps. Again, they have done so poorly, at least in comparison to more mature languages. That’s not to say that it doesn’t work, because everything we use on the Internet today is proof that it does. But it could be better. There has to be a better way, so let’s find it.

Leave a Reply

Your email address will not be published. Required fields are marked *