You Don't Know JS: Types & Grammar

Chapter 1: Types

Most developers would say that a dynamic language (like JS) does not have types.

Now, if you're a fan of strongly typed (statically typed) languages, you may object to this usage of the word "type." In those languages, "type" means a whole lot more than it does here in JS.

Some people say JS shouldn't claim to have "types," and they should instead be called "tags" or perhaps "subtypes".

Bah! We're going to use this rough definition: a type is an intrinsic, built-in set of characteristics that uniquely identifies the behavior of a particular value and distinguishes it from other values, both to the engine and to the developer.

In other words, if both the engine and the developer treat value 42 (the number) differently than they treat value "42" (the string), then those two values have different types -- number and string, respectively. When you use 42, you are intending to do something numeric, like math. But when you use "42", you are intending to do something string'ish, like outputting to the page, etc. These two values have different types.
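
For instance (a quick illustration, added here for concreteness):

42 + 1;   // 43 -- numeric addition
"42" + 1; // "421" -- string concatenation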

That's by no means a perfect definition. But it's good enough for this discussion. And it's consistent with how JS describes itself.

A Type By Any Other Name...

Having a proper understanding of each type and its intrinsic behavior is absolutely essential to understanding how to properly and accurately convert values to different types. Nearly every JS program ever written will need to handle value coercion in some shape or form, so it's important you do so responsibly and with confidence.

If you have the number value 42, but you want to treat it like a string, such as pulling out the "2" as a character in position 1, you obviously must first convert (coerce) the value from number to string.
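
A minimal sketch of that conversion (illustrative, not from the original text):

var a = 42;

String( a )[1]; // "2" -- coerce to a string, then grab the character at position 1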

But there are many different ways that such coercion can happen. Some of these ways are explicit, easy to reason about, and reliable. But if you're not careful, coercion can happen in very strange and surprising ways.

Coercion confusion is perhaps one of the most profound frustrations for JS developers. Armed with a full understanding of JavaScript types, we're aiming to illustrate why coercion's bad reputation is largely overhyped and somewhat undeserved -- to flip your perspective, to seeing coercion's power and usefulness. But first, we have to get a much better grip on values and types.

Built-in Types

JavaScript defines seven built-in types:

  • null
  • undefined
  • boolean
  • number
  • string
  • object
  • symbol -- added in ES6!

Note: All of these types except object are called "primitives".

The typeof operator inspects the type of the given value, and always returns one of seven string values -- surprisingly, there's not an exact 1-to-1 match with the seven built-in types we just listed.

typeof undefined === 'undefined'; // true
typeof true === 'boolean'; // true
typeof 42 === 'number'; // true
typeof '42' === 'string'; // true
typeof { life: 42 } === 'object'; // true
typeof Symbol() === 'symbol'; // true

These six listed types have values of the corresponding type and return a string value of the same name, as shown.

As you may have noticed, I excluded null from the above listing. It's special -- special in the sense that it's buggy when combined with the typeof operator:

typeof null === 'object'; // true

It would have been nice (and correct!) if it returned "null", but this original bug in JS has persisted for nearly two decades, and will likely never be fixed: too much existing web content relies on the buggy behavior, so "fixing" the bug would create more "bugs" and break a lot of web software.

If you want to test for a null value using its type, you need a compound condition:

var a = null;

!a && typeof a === 'object'; // true

null is the only primitive value that is "falsy" (aka false-like) but that also returns "object" from the typeof check.

So what's the seventh string value that typeof can return?

typeof function a() {
  /* .. */
} === 'function'; // true

It's easy to think that function would be a top-level built-in type in JS, especially given this behavior of the typeof operator. However, it's actually a "subtype" of object. Specifically, a function is referred to as a "callable object" -- an object that has an internal [[Call]] property that allows it to be invoked.

The fact that functions are actually objects is quite useful. Most importantly, they can have properties. For example:

function a(b, c) {
  /* .. */
}

The function object has a length property set to the number of formal parameters it is declared with.

a.length; // 2

Since you declared the function with two formal named parameters (b and c), the "length of the function" is 2.

What about arrays? They're native to JS, so are they a special type?

typeof [1, 2, 3] === 'object'; // true

Nope, just objects. It's most appropriate to think of them also as a "subtype" of object, in this case with the additional characteristics of being numerically indexed (as opposed to just being string-keyed like plain objects) and maintaining an automatically updated .length property.

Values as Types

In JavaScript, variables don't have types -- values have types. Variables can hold any value, at any time.

Another way to think about JS types is that JS doesn't have "type enforcement," in that the engine doesn't insist that a variable always holds values of the same initial type that it starts out with. A variable can, in one assignment statement, hold a string, and in the next hold a number, and so on.

The value 42 has an intrinsic type of number, and its type cannot be changed. Another value, like "42" with the string type, can be created from the number value 42 through a process called coercion.

If you use typeof against a variable, it's not asking "what's the type of the variable?" as it may seem, since JS variables have no types. Instead, it's asking "what's the type of the value in the variable?"

var a = 42;
typeof a; // "number"

a = true;
typeof a; // "boolean"

The typeof operator always returns a string. So:

typeof typeof 42; // "string"

The first typeof 42 returns "number", and typeof "number" is "string".

undefined vs "undeclared"

Variables that have no value currently, actually have the undefined value. Calling typeof against such variables will return "undefined":

var a;

typeof a; // "undefined"

var b = 42;
var c;

// later
b = c;

typeof b; // "undefined"
typeof c; // "undefined"

It's tempting for most developers to think of the word "undefined" and think of it as a synonym for "undeclared." However, in JS, these two concepts are quite different.

An "undefined" variable is one that has been declared in the accessible scope, but at the moment has no other value in it. By contrast, an "undeclared" variable is one that has not been formally declared in the accessible scope.

var a;

a; // undefined
b; // ReferenceError: b is not defined

An annoying confusion is the error message that browsers assign to this condition. As you can see, the message is "b is not defined," which is of course very easy and reasonable to confuse with "b is undefined." Yet again, "undefined" and "is not defined" are very different things.

There's also a special behavior associated with typeof as it relates to undeclared variables that even further reinforces the confusion. Consider:

var a;

typeof a; // "undefined"
typeof b; // "undefined"

The typeof operator returns "undefined" even for "undeclared" (or "not defined") variables. Notice that there was no error thrown when we executed typeof b, even though b is an undeclared variable. This is a special safety guard in the behavior of typeof.

Similar to above, it would have been nice if typeof used with an undeclared variable returned "undeclared" instead of conflating the result value with the different "undefined" case.

typeof Undeclared

Nevertheless, this safety guard is a useful feature when dealing with JavaScript in the browser, where multiple script files can load variables into the shared global namespace.

Note: Many developers believe there should never be any variables in the global namespace, and that everything should be contained in modules and private/separate namespaces. This is great in theory but nearly impossible in practicality; still it's a good goal to strive toward!

As a simple example, imagine having a "debug mode" in your program that is controlled by a global variable (flag) called DEBUG. You'd want to check if that variable was declared before performing a debug task like logging a message to the console. A top-level global var DEBUG = true declaration would only be included in a "debug.js" file, which you only load into the browser when you're in development/testing, but not in production.

// oops, this would throw an error!
if (DEBUG) {
  console.log('Debugging is starting');
}

// this is a safe existence check
if (typeof DEBUG !== 'undefined') {
  console.log('Debugging is starting');
}

This sort of check is useful even if you're not dealing with user-defined variables (like DEBUG). If you are doing a feature check for a built-in API, you may also find it helpful to check without throwing an error:

if (typeof atob === 'undefined') {
  atob = function() {
    /*..*/
  };
}

Note: If you're defining a "polyfill" for a feature if it doesn't already exist, you probably want to avoid using var to make the atob declaration. If you declare var atob inside the if statement, this declaration is hoisted to the top of the scope, even if the if condition doesn't pass (because the global atob already exists!). In some browsers and for some special types of global built-in variables, this duplicate declaration may throw an error. Omitting the var prevents this hoisted declaration.

Another way of doing these checks against global variables but without the safety guard feature of typeof is to observe that all global variables are also properties of the global object, which in the browser is basically the window object. So, the above checks could have been done as:

if (window.DEBUG) {
  // ..
}

if (!window.atob) {
  // ..
}

Unlike referencing undeclared variables, there is no ReferenceError thrown if you try to access an object property that doesn't exist.
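
For example (an illustrative browser-environment check; DEBUG is the same hypothetical flag as before):

window.DEBUG; // undefined -- no ReferenceError, even though `DEBUG` was never declared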

On the other hand, manually referencing the global variable with a window reference is something some developers prefer to avoid, especially if your code needs to run in multiple JS environments (not just browsers, but server-side node.js, for instance), where the global object may not always be called window.

Technically, this safety guard on typeof is useful even if you're not using global variables, though these circumstances are less common, and some developers may find this design approach less desirable. Imagine a utility function that you want others to copy-and-paste into their programs or modules, in which you want to check to see if the including program has defined a certain variable or not:

function doSomethingCool() {
  var helper =
    typeof FeatureXYZ !== 'undefined'
      ? FeatureXYZ
      : function() {
          /*.. default feature ..*/
        };

  var val = helper();
  // ..
}

doSomethingCool() tests for a variable called FeatureXYZ, and if found, uses it, but if not, uses its own. Now, if someone includes this utility in their module/program, it safely checks if they've defined FeatureXYZ or not:

(function() {
  function FeatureXYZ() {
    /*.. my XYZ feature ..*/
  }

  // include `doSomethingCool(..)`
  function doSomethingCool() {
    var helper =
      typeof FeatureXYZ !== 'undefined'
        ? FeatureXYZ
        : function() {
            /*.. default feature ..*/
          };

    var val = helper();
    // ..
  }

  doSomethingCool();
})();

Here, FeatureXYZ is not at all a global variable, but we're still using the safety guard of typeof to make it safe to check for. And importantly, here there is no object we can use (like we did for global variables with window.___) to make the check, so typeof is quite helpful.

Other developers would prefer a design pattern called "dependency injection," where instead of doSomethingCool() inspecting implicitly for FeatureXYZ to be defined outside/around it, it would need to have the dependency explicitly passed in, like:

function doSomethingCool(FeatureXYZ) {
  var helper =
    FeatureXYZ ||
    function() {
      /*.. default feature ..*/
    };

  var val = helper();
  // ..
}

There are lots of options when designing such functionality. No one pattern here is "correct" or "wrong" -- there are various tradeoffs to each approach. But overall, it's nice that the typeof undeclared safety guard gives us more options.

Review

JavaScript has seven built-in types: null, undefined, boolean, number, string, object, symbol. They can be identified by the typeof operator.

Variables don't have types, but the values in them do. These types define intrinsic behavior of the values.

Many developers will assume "undefined" and "undeclared" are roughly the same thing, but in JavaScript, they're quite different. undefined is a value that a declared variable can hold. "Undeclared" means a variable has never been declared.

JavaScript unfortunately kind of conflates these two terms, not only in its error messages ("ReferenceError: a is not defined") but also in the return values of typeof, which is "undefined" for both cases.

However, the safety guard (preventing an error) on typeof when used against an undeclared variable can be helpful in certain cases.


Chapter 2: Values

arrays, strings, and numbers are the most basic building-blocks of any program, but JavaScript has some unique characteristics with these types that may either delight or confound you.

Arrays

As compared to other type-enforced languages, JavaScript arrays are just containers for any type of value, from string to number to object to even another array.

var a = [ 1, "2", [3] ];

a.length;		// 3
a[0] === 1;		// true
a[2][0] === 3;	// true

You don't need to presize your arrays, you can just declare them and add values as you see fit:

var a = [ ];

a.length;	// 0

a[0] = 1;
a[1] = "2";
a[2] = [ 3 ];

a.length;	// 3

Warning: Using delete on an array value will remove that slot from the array, but even if you remove the final element, it does not update the length property, so be careful! We'll cover the delete operator itself in more detail in Chapter 5.
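
A quick sketch of that gotcha:

var a = [ 1, 2, 3 ];

delete a[2]; // removes the final slot's value...

a.length; // 3 -- ...but `length` was not updated
a[2];     // undefined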

Be careful about creating "sparse" arrays (leaving or creating empty/missing slots):

var a = [ ];

a[0] = 1;
// no `a[1]` slot set here
a[2] = [ 3 ];

a[1];		// undefined

a.length;	// 3

While that works, it can lead to some confusing behavior with the "empty slots" you leave in between. While the slot appears to have the undefined value in it, it will not behave the same as if the slot is explicitly set (a[1] = undefined).
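
One illustrative way to observe the difference (not from the original text) is with the in operator, which checks whether a slot actually exists, and with iteration utilities like forEach(..), which skip empty slots but do visit explicitly set undefined values:

var a = [ ];
a[0] = 1;
a[2] = 3; // leaves an empty slot at `a[1]`

var b = [ 1, undefined, 3 ];

1 in a; // false -- the slot doesn't exist
1 in b; // true -- the slot exists, holding `undefined`

a.forEach( function(v){ console.log( v ); } ); // 1 3
b.forEach( function(v){ console.log( v ); } ); // 1 undefined 3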

arrays are numerically indexed, but the tricky thing is that they also are objects that can have string keys/properties added to them (but which don't count toward the length of the array):

var a = [ ];

a[0] = 1;
a["foobar"] = 2;

a.length;		// 1
a["foobar"];	// 2
a.foobar;		// 2

However, a gotcha to be aware of is that if a string value intended as a key can be coerced to a standard base-10 number, then it is assumed that you wanted to use it as a number index rather than as a string key!

var a = [ ];

a["13"] = 42;

a.length; // 14

Generally, it's not a great idea to add string keys/properties to arrays. Use objects for holding values in keys/properties, and save arrays for strictly numerically indexed values.

Array-Likes

There will be occasions where you need to convert an array-like value (a numerically indexed collection of values) into a true array, usually so you can call array utilities (like indexOf(..), concat(..), forEach(..), etc.) against the collection of values.

For example, various DOM query operations return lists of DOM elements that are not true arrays but are array-like enough for our conversion purposes. Another common example is when functions expose the arguments (array-like) object to access the arguments as a list.

One very common way to make such a conversion is to borrow the slice(..) utility against the value:

function foo() {
	var arr = Array.prototype.slice.call( arguments );
	arr.push( "bam" );
	console.log( arr );
}

foo( "bar", "baz" ); // ["bar","baz","bam"]

If slice() is called without any other parameters, as it effectively is in the above snippet, the default values for its parameters have the effect of duplicating the array (or, in this case, array-like).

As of ES6, there's also a built-in utility called Array.from(..) that can do the same task:

...
var arr = Array.from( arguments );
...

Strings

It's a very common belief that strings are essentially just arrays of characters. While the implementation under the covers may or may not use arrays, it's important to realize that JavaScript strings are really not the same as arrays of characters.

var a = "foo";
var b = ["f","o","o"];

Both of them have a length property, an indexOf(..) method, and a concat(..) method:

a.length;							// 3
b.length;							// 3

a.indexOf( "o" );					// 1
b.indexOf( "o" );					// 1

var c = a.concat( "bar" );			// "foobar"
var d = b.concat( ["b","a","r"] );	// ["f","o","o","b","a","r"]

a === c;							// false
b === d;							// false

a;									// "foo"
b;									// ["f","o","o"]

So, they're both basically just "arrays of characters", right? Not exactly:

a[1] = "O";
b[1] = "O";

a; // "foo"
b; // ["f","O","o"]

JavaScript strings are immutable, while arrays are quite mutable.

A consequence of immutable strings is that none of the string methods that alter its contents can modify in-place, but rather must create and return new strings. By contrast, many of the methods that change array contents actually do modify in-place.

c = a.toUpperCase();
a === c;	// false
a;			// "foo"
c;			// "FOO"

b.push( "!" );
b;			// ["f","O","o","!"]

Also, many of the array methods that could be helpful when dealing with strings are not actually available for them, but we can "borrow" non-mutation array methods against our string:

a.join;			// undefined
a.map;			// undefined

var c = Array.prototype.join.call( a, "-" );
var d = Array.prototype.map.call( a, function(v){
	return v.toUpperCase() + ".";
} ).join( "" );

c;				// "f-o-o"
d;				// "F.O.O."

Let's take another example: reversing a string. arrays have a reverse() in-place mutator method, but strings do not:

a.reverse;		// undefined

b.reverse();	// ["!","o","O","f"]
b;				// ["!","o","O","f"]

Unfortunately, this "borrowing" doesn't work with array mutators, because strings are immutable and thus can't be modified in place:

Array.prototype.reverse.call( a );
// still returns a String object wrapper (see Chapter 3)
// for "foo" :(

Another workaround is to convert the string into an array, perform the desired operation, then convert it back to a string.

var c = a
	// split `a` into an array of characters
	.split( "" )
	// reverse the array of characters
	.reverse()
	// join the array of characters back to a string
	.join( "" );

c; // "oof"

If that feels ugly, it is. Nevertheless, it works for simple strings, so if you need something quick-n-dirty, often such an approach gets the job done.

Warning: Be careful! This approach doesn't work for strings with complex (unicode) characters in them. You need more sophisticated library utilities that are unicode-aware for such operations to be handled accurately.

The other way to look at this is: if you are more commonly doing tasks on your "strings" that treat them as basically arrays of characters, perhaps it's better to just actually store them as arrays rather than as strings. You'll probably save yourself a lot of hassle of converting from string to array each time. You can always call join("") on the array of characters whenever you actually need the string representation.

Numbers

JavaScript has just one numeric type: number. This type includes both "integer" values and fractional decimal numbers. I say "integer" in quotes because it's long been a criticism of JS that there are not true integers, as there are in other languages. For now, we just have numbers for everything.

So, in JS, an "integer" is just a value that has no fractional decimal value. That is, 42.0 is as much an "integer" as 42.

Like most modern languages, the implementation of JavaScript's numbers is based on the "IEEE 754" standard, often called "floating-point." JavaScript specifically uses the "double precision" format (aka "64-bit binary") of the standard.

Numeric Syntax

Number literals are expressed in JavaScript generally as base-10 decimal literals. For example:

var a = 42;
var b = 42.3;

The leading portion of a decimal value, if 0, is optional:

var a = 0.42;
var b = .42;

Similarly, the trailing portion (the fractional) of a decimal value after the ., if 0, is optional:

var a = 42.0;
var b = 42.;

By default, most numbers will be outputted as base-10 decimals, with trailing fractional 0s removed. So:

var a = 42.300;
var b = 42.0;

a; // 42.3
b; // 42

Very large or very small numbers will by default be outputted in exponent form, the same as the output of the toExponential() method, like:

var a = 5E10;
a;					// 50000000000
a.toExponential();	// "5e+10"

var b = a * a;
b;					// 2.5e+21

var c = 1 / a;
c;					// 2e-11

Because number values can be boxed with the Number object wrapper, number values can access methods that are built into the Number.prototype. For example, the toFixed(..) method allows you to specify how many fractional decimal places you'd like the value to be represented with:

var a = 42.59;

a.toFixed( 0 ); // "43"
a.toFixed( 1 ); // "42.6"
a.toFixed( 2 ); // "42.59"
a.toFixed( 3 ); // "42.590"
a.toFixed( 4 ); // "42.5900"

Notice that the output is actually a string representation of the number, and that the value is 0-padded on the right-hand side if you ask for more decimals than the value holds.

toPrecision(..) is similar, but specifies how many significant digits should be used to represent the value:

var a = 42.59;

a.toPrecision( 1 ); // "4e+1"
a.toPrecision( 2 ); // "43"
a.toPrecision( 3 ); // "42.6"
a.toPrecision( 4 ); // "42.59"
a.toPrecision( 5 ); // "42.590"
a.toPrecision( 6 ); // "42.5900"

You don't have to use a variable with the value in it to access these methods; you can access these methods directly on number literals. But you have to be careful with the . operator. Since . is a valid numeric character, it will first be interpreted as part of the number literal, if possible, instead of being interpreted as a property accessor.

// invalid syntax:
42.toFixed( 3 );	// SyntaxError

// these are all valid:
(42).toFixed( 3 );	// "42.000"
0.42.toFixed( 3 );	// "0.420"
42..toFixed( 3 );	// "42.000"

42.toFixed(3) is invalid syntax, because the . is swallowed up as part of the 42. literal, and so then there's no . property operator present to make the .toFixed access.

42..toFixed(3) works because the first . is part of the number and the second . is the property operator. But it probably looks strange, and indeed it's very rare to see something like that in actual JavaScript code. In fact, it's pretty uncommon to access methods directly on any of the primitive values. Uncommon doesn't mean bad or wrong.

Note: There are libraries that extend the built-in Number.prototype to provide extra operations on/with numbers, and so in those cases, it's perfectly valid to use something like 10..makeItRain() to set off a 10-second money raining animation, or something else silly like that.

This is also technically valid (notice the space):

42 .toFixed(3); // "42.000"

However, with the number literal specifically, this is particularly confusing coding style and will serve no other purpose but to confuse other developers (and your future self). Avoid it.

numbers can also be specified in exponent form, which is common when representing larger numbers, such as:

var onethousand = 1E3;						// means 1 * 10^3
var onemilliononehundredthousand = 1.1E6;	// means 1.1 * 10^6

number literals can also be expressed in other bases, like binary, octal, and hexadecimal.

These formats work in current versions of JavaScript:

0xf3; // hexadecimal for: 243
0Xf3; // ditto

0363; // octal for: 243

Note: Starting with ES6 + strict mode, the 0363 form of octal literals is no longer allowed. The 0363 form is still allowed in non-strict mode, but you should stop using it anyway, to be future-friendly (and because you should be using strict mode by now!).

As of ES6, the following new forms are also valid:

0o363;		// octal for: 243
0O363;		// ditto

0b11110011;	// binary for: 243
0B11110011; // ditto

Please do your fellow developers a favor: never use the 0O363 form. 0 next to capital O is just asking for confusion. Always use the lowercase prefixes 0x, 0b, and 0o.

Small Decimal Values

The most (in)famous side effect of using binary floating-point numbers (which, remember, is true of all languages that use IEEE 754 -- not just JavaScript as many assume/pretend) is:

0.1 + 0.2 === 0.3; // false

Mathematically, we know that statement should be true. Why is it false?

Simply put, the representations for 0.1 and 0.2 in binary floating-point are not exact, so when they are added, the result is not exactly 0.3. It's really close: 0.30000000000000004, but if your comparison fails, "close" is irrelevant.

Now, the question is, if some numbers can't be trusted to be exact, does that mean we can't use numbers at all? Of course not.

There are some applications where you need to be more careful, especially when dealing with fractional decimal values. There are also plenty of applications that only deal with whole numbers ("integers"), and moreover, only deal with numbers in the millions or trillions at maximum. These applications have been, and always will be, perfectly safe to use numeric operations in JS.

What if we did need to compare two numbers, like 0.1 + 0.2 to 0.3, knowing that the simple equality test fails?

The most commonly accepted practice is to use a tiny "rounding error" value as the tolerance for comparison. This tiny value is often called "machine epsilon," which is commonly 2^-52 (2.220446049250313e-16) for the kind of numbers in JavaScript.

As of ES6, Number.EPSILON is predefined with this tolerance value, so you'd want to use it, but you can safely polyfill the definition for pre-ES6:

if (!Number.EPSILON) {
	Number.EPSILON = Math.pow(2,-52);
}

We can use this Number.EPSILON to compare two numbers for "equality" (within the rounding error tolerance):

function numbersCloseEnoughToEqual(n1,n2) {
	return Math.abs( n1 - n2 ) < Number.EPSILON;
}

var a = 0.1 + 0.2;
var b = 0.3;

numbersCloseEnoughToEqual( a, b );					// true
numbersCloseEnoughToEqual( 0.0000001, 0.0000002 );	// false

The maximum floating-point value that can be represented is roughly 1.798e+308 (which is really, really, really huge!), predefined for you as Number.MAX_VALUE. On the small end, Number.MIN_VALUE is roughly 5e-324, which isn't negative but is really close to zero!
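
Both are predefined and easy to inspect directly:

Number.MAX_VALUE; // 1.7976931348623157e+308
Number.MIN_VALUE; // 5e-324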

Safe Integer Ranges

Because of how numbers are represented, there is a range of "safe" values for the whole number "integers", and it's significantly less than Number.MAX_VALUE.

The maximum integer that can "safely" be represented (that is, there's a guarantee that the requested value is actually representable unambiguously) is 2^53 - 1, which is 9007199254740991. If you insert your commas, you'll see that this is just over 9 quadrillion. So that's pretty darn big for numbers to range up to.

This value is actually automatically predefined in ES6, as Number.MAX_SAFE_INTEGER. Unsurprisingly, there's a minimum value, -9007199254740991, and it's defined in ES6 as Number.MIN_SAFE_INTEGER.

The main way JS programs are confronted with such large numbers is when dealing with 64-bit IDs from databases, etc. 64-bit numbers cannot be represented accurately with the number type, so they must be stored in (and transmitted to/from) JavaScript using string representation.

Numeric operations on such large ID number values (besides comparison, which will be fine with strings) aren't all that common, thankfully. But if you do need to perform math on these very large values, for now you'll need to use a big number utility. Big numbers may get official support in a future version of JavaScript.

Testing for Integers

To test if a value is an integer, you can use the ES6-specified Number.isInteger(..):

Number.isInteger( 42 );		// true
Number.isInteger( 42.000 );	// true
Number.isInteger( 42.3 );	// false

To polyfill Number.isInteger(..) for pre-ES6:

if (!Number.isInteger) {
	Number.isInteger = function(num) {
		return typeof num == "number" && num % 1 == 0;
	};
}

To test if a value is a safe integer, use the ES6-specified Number.isSafeInteger(..):

Number.isSafeInteger( Number.MAX_SAFE_INTEGER );	// true
Number.isSafeInteger( Math.pow( 2, 53 ) );			// false
Number.isSafeInteger( Math.pow( 2, 53 ) - 1 );		// true

To polyfill Number.isSafeInteger(..) in pre-ES6 browsers:

if (!Number.isSafeInteger) {
	Number.isSafeInteger = function(num) {
		return Number.isInteger( num ) &&
			Math.abs( num ) <= Number.MAX_SAFE_INTEGER;
	};
}

32-bit (Signed) Integers

While integers can range up to roughly 9 quadrillion safely (53 bits), there are some numeric operations (like the bitwise operators) that are only defined for 32-bit numbers, so the "safe range" for numbers used in that way must be much smaller.

The range then is Math.pow(-2,31) (-2147483648, about -2.1 billion) up to Math.pow(2,31)-1 (2147483647, about +2.1 billion).

To force a number value in a to a 32-bit signed integer value, use a | 0. This works because the | bitwise operator only works for 32-bit integer values (meaning it can only pay attention to 32 bits and any other bits will be lost). Then, "or'ing" with zero is essentially a no-op bitwise speaking.

Note: Certain special values such as NaN and Infinity are not "32-bit safe," in that those values when passed to a bitwise operator will pass through the abstract operation ToInt32 (see Chapter 4) and become simply the +0 value for the purpose of that bitwise operation.
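
A brief sketch of these behaviors:

var a = 40.5;
a | 0; // 40 -- ToInt32 discards the fraction

2147483648 | 0; // -2147483648 -- 2^31 wraps around the 32-bit boundary

NaN | 0;      // 0
Infinity | 0; // 0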

Special Values

There are several special values spread across the various types that the alert JS developer needs to be aware of, and use properly.

The Non-value Values

For the undefined type, there is one and only one value: undefined. For the null type, there is one and only one value: null. So for both of them, the label is both its type and its value.

Both undefined and null are often taken to be interchangeable as either "empty" values or "non" values. Other developers prefer to distinguish between them with nuance. For example:

  • null is an empty value
  • undefined is a missing value

Or:

  • undefined hasn't had a value yet
  • null had a value and doesn't anymore

Regardless of how you choose to "define" and use these two values, null is a special keyword, not an identifier, and thus you cannot treat it as a variable to assign to (why would you!?). However, undefined is (unfortunately) an identifier.

Undefined

In non-strict mode, it's actually possible (though incredibly ill-advised!) to assign a value to the globally provided undefined identifier:

function foo() {
	undefined = 2; // really bad idea!
}

foo();

In strict mode, however, the same assignment throws an error:

function foo() {
	"use strict";
	undefined = 2; // TypeError!
}

foo();

In both non-strict mode and strict mode, however, you can create a local variable of the name undefined. But again, this is a terrible idea!

function foo() {
	"use strict";
	var undefined = 2;
	console.log( undefined ); // 2
}

foo();

Friends don't let friends override undefined. Ever.

void Operator

While undefined is a built-in identifier that holds the built-in undefined value, another way to get this value is the void operator.

The expression void ___ "voids" out any value, so that the result of the expression is always the undefined value. It doesn't modify the existing value; it just ensures that no value comes back from the operator expression.

var a = 42;

console.log( void a, a ); // undefined 42

By convention, to represent the undefined value stand-alone by using void, you'd use void 0 (though clearly even void true or any other void expression does the same thing). There's no practical difference between void 0, void 1, and undefined.
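
Trivially verified:

void 0 === undefined; // true
void 1 === undefined; // true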

But the void operator can be useful in a few other circumstances, if you need to ensure that an expression has no result value.

function doSomething() {
	// note: `APP.ready` is provided by our application
	if (!APP.ready) {
		// try again later
		return void setTimeout( doSomething, 100 );
	}

	var result;

	// do some other stuff
	return result;
}

// were we able to do it right away?
if (doSomething()) {
	// handle next tasks right away
}

Here, the setTimeout(..) function returns a numeric value (the unique identifier of the timer interval, if you wanted to cancel it), but we want to void that out so that the return value of our function doesn't give a false-positive with the if statement.

Many devs prefer to just do these actions separately, which works the same but doesn't use the void operator:

if (!APP.ready) {
	// try again later
	setTimeout( doSomething, 100 );
	return;
}

In general, if there's ever a place where a value exists (from some expression) and you'd find it useful for the value to be undefined instead, use the void operator. That probably won't be terribly common in your programs, but in the rare cases you do need it, it can be quite helpful.

Special Numbers

The number type includes several special values. We'll take a look at each in detail.

The Not Number, Number

Any mathematic operation you perform without both operands being numbers (or values that can be interpreted as regular numbers in base 10 or base 16) will result in the operation failing to produce a valid number, in which case you will get the NaN value.

NaN literally stands for "not a number", though this label/description is very poor and misleading. It would be much more accurate to think of NaN as being "invalid number," "failed number," or even "bad number," than to think of it as "not a number."

var a = 2 / "foo";		// NaN

typeof a === "number";	// true

In other words: "the type of not-a-number is 'number'!" Hooray for confusing names and semantics.

NaN is a kind of "sentinel value" (an otherwise normal value that's assigned a special meaning) that represents a special kind of error condition within the number set. The error condition is, in essence: "I tried to perform a mathematic operation but failed, so here's the failed number result instead."

So, if you have a value in some variable and want to test to see if it's this special failed-number NaN, you might think you could directly compare to NaN itself, as you can with any other value, like null or undefined. Nope.

var a = 2 / "foo";

a == NaN;	// false
a === NaN;	// false

NaN is a very special value in that it's never equal to another NaN value (i.e., it's never equal to itself). It's the only value, in fact, that is not reflexive (without the Identity characteristic x === x). So, NaN !== NaN. A bit strange, huh?

So how do we test for it, if we can't compare to NaN?

var a = 2 / "foo";

isNaN( a ); // true

Easy enough, right? We use the built-in global utility called isNaN(..) and it tells us if the value is NaN or not. Problem solved!

Not so fast.

The isNaN(..) utility has a fatal flaw. It appears it tried to take the meaning of NaN ("Not a Number") too literally -- that its job is basically: "test if the thing passed in is either not a number or is a number." But that's not quite accurate.

var a = 2 / "foo";
var b = "foo";

a; // NaN
b; // "foo"

window.isNaN( a ); // true
window.isNaN( b ); // true -- ouch!

Clearly, "foo" is literally not a number, but it's definitely not the NaN value either!

As of ES6, finally a replacement utility has been provided: Number.isNaN(..). A simple polyfill for it so that you can safely check NaN values now even in pre-ES6 browsers is:

if (!Number.isNaN) {
	Number.isNaN = function(n) {
		return (
			typeof n === "number" &&
			window.isNaN( n )
		);
	};
}

var a = 2 / "foo";
var b = "foo";

Number.isNaN( a ); // true
Number.isNaN( b ); // false -- phew!

Actually, we can implement a Number.isNaN(..) polyfill even easier, by taking advantage of that peculiar fact that NaN isn't equal to itself. NaN is the only value in the whole language where that's true; every other value is always equal to itself.

So:

if (!Number.isNaN) {
	Number.isNaN = function(n) {
		return n !== n;
	};
}

NaNs are probably a reality in a lot of real-world JS programs, either on purpose or by accident. It's a really good idea to use a reliable test, like Number.isNaN(..) as provided (or polyfilled), to recognize them properly.

If you're currently using just isNaN(..) in a program, the sad reality is your program has a bug, even if you haven't been bitten by it yet!

Infinities

Developers from traditional compiled languages like C are probably used to seeing either a compiler error or runtime exception, like "Divide by zero," for an operation like:

var a = 1 / 0;

However, in JS, this operation is well-defined and results in the value Infinity (aka Number.POSITIVE_INFINITY). Unsurprisingly:

var a = 1 / 0;	// Infinity
var b = -1 / 0;	// -Infinity

As you can see, -Infinity (aka Number.NEGATIVE_INFINITY) results from a divide-by-zero where either (but not both!) of the divide operands is negative.

JS uses finite numeric representations (IEEE 754 floating-point), so contrary to pure mathematics, it is possible to overflow even with an operation like addition or subtraction, in which case you'd get Infinity or -Infinity.

For example:

var a = Number.MAX_VALUE;	// 1.7976931348623157e+308
a + a;						// Infinity
a + Math.pow( 2, 970 );		// Infinity
a + Math.pow( 2, 969 );		// 1.7976931348623157e+308

If an operation like addition results in a value that's too big to represent, the IEEE 754 "round-to-nearest" mode specifies what the result should be. So, in a crude sense, Number.MAX_VALUE + Math.pow( 2, 969 ) is closer to Number.MAX_VALUE than to Infinity, so it "rounds down," whereas Number.MAX_VALUE + Math.pow( 2, 970 ) is closer to Infinity so it "rounds up".

Once you overflow to either one of the infinities, however, there's no going back. In other words, in an almost poetic sense, you can go from finite to infinite but not from infinite back to finite.

It's almost philosophical to ask: "What is infinity divided by infinity". Our naive brains would likely say "1" or maybe "infinity." Turns out neither is true. Both mathematically and in JavaScript, Infinity / Infinity is not a defined operation. In JS, this results in NaN.
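
In code (easy to confirm in any console):

Infinity / Infinity; // NaN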

But what about any positive finite number divided by Infinity? That's easy! 0. And what about a negative finite number divided by Infinity? Keep reading!

Zeros

While it may confuse the mathematics-minded reader, JavaScript has both a normal zero 0 (known as a positive zero +0) and a negative zero -0. Before we explain why the -0 exists, we should examine how JS handles it, because it can be quite confusing.

Besides being specified literally as -0, negative zero also results from certain mathematic operations. For example:

var a = 0 / -3; // -0
var b = 0 * -3; // -0

Addition and subtraction cannot result in a negative zero.

A negative zero when examined in the developer console will usually reveal -0, though that was not the common case until fairly recently, so some older browsers you encounter may still report it as 0.

However, if you try to stringify a negative zero value, it will always be reported as "0", according to the spec.

var a = 0 / -3;

a;							// -0
a.toString();				// "0"
a + "";						// "0"
String( a );				// "0"

// strangely, even JSON gets in on the deception
JSON.stringify( a );		// "0"

Interestingly, the reverse operations (going from string to number) don't lie:

+"-0";				// -0
Number( "-0" );		// -0
JSON.parse( "-0" );	// -0

Warning: The JSON.stringify( -0 ) behavior of "0" is particularly strange when you observe that it's inconsistent with the reverse: JSON.parse( "-0" ) reports -0 as you'd correctly expect.

In addition to stringification of negative zero being deceptive to hide its true value, the comparison operators are also (intentionally) configured to lie.

var a = 0;
var b = 0 / -3;

a == b;		// true
-0 == 0;	// true

a === b;	// true
-0 === 0;	// true

0 > -0;		// false
a > b;		// false

Clearly, if you want to distinguish a -0 from a 0 in your code, you can't just rely on what the developer console outputs, so you're going to have to be a bit more clever:

function isNegZero(n) {
	n = Number( n );
	return (n === 0) && (1 / n === -Infinity);
}

isNegZero( -0 );		// true
isNegZero( 0 / -3 );	// true
isNegZero( 0 );			// false

Now, why do we need a negative zero, besides academic trivia?

There are certain applications where developers use the magnitude of a value to represent one piece of information (like speed of movement per animation frame) and the sign of that number to represent another piece of information (like the direction of that movement).

In those applications, as one example, if a variable arrives at zero and it loses its sign, then you would lose the information of what direction it was moving in before it arrived at zero. Preserving the sign of the zero prevents potentially unwanted information loss.
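
As a hypothetical sketch of that idea (the variable name and direction labels are invented for illustration, reusing the isNegZero(..) helper from above):

var speed = -0; // came to rest while moving in the negative direction

isNegZero( speed ) ? "facing left" : "facing right"; // "facing left"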

Special Equality

As we saw above, the NaN value and the -0 value have special behavior when it comes to equality comparison. NaN is never equal to itself, so you have to use ES6's Number.isNaN(..) (or a polyfill). Similarly, -0 lies and pretends that it's equal (even === strict equal) to regular positive 0, so you have to use the somewhat hackish isNegZero(..) utility we suggested above.

As of ES6, there's a new utility that can be used to test two values for absolute equality, without any of these exceptions. It's called Object.is(..):

var a = 2 / "foo";
var b = -3 * 0;

Object.is( a, NaN );	// true
Object.is( b, -0 );		// true

Object.is( b, 0 );		// false

There's a pretty simple polyfill for Object.is(..) for pre-ES6 environments:

if (!Object.is) {
	Object.is = function(v1, v2) {
		// test for `-0`
		if (v1 === 0 && v2 === 0) {
			return 1 / v1 === 1 / v2;
		}
		// test for `NaN`
		if (v1 !== v1) {
			return v2 !== v2;
		}
		// everything else
		return v1 === v2;
	};
}

Object.is(..) probably shouldn't be used in cases where == or === are known to be safe, as the operators are likely much more efficient and certainly are more idiomatic/common. Object.is(..) is mostly for these special cases of equality.

Value vs. Reference

In many other languages, values can either be assigned/passed by value-copy or by reference-copy depending on the syntax you use.

In JavaScript, there are no pointers, and references work a bit differently. You cannot have a reference from one JS variable to another variable. That's just not possible.

A reference in JS points at a (shared) value, so if you have 10 different references, they are all always distinct references to a single shared value; none of them are references/pointers to each other.

Moreover, in JavaScript, there are no syntactic hints that control value vs. reference assignment/passing. Instead, the type of the value solely controls whether that value will be assigned by value-copy or by reference-copy.

Let's illustrate:

var a = 2;
var b = a; // `b` is always a copy of the value in `a`
b++;
a; // 2
b; // 3

var c = [1,2,3];
var d = c; // `d` is a reference to the shared `[1,2,3]` value
d.push( 4 );
c; // [1,2,3,4]
d; // [1,2,3,4]

Simple values (aka primitives) are always assigned/passed by value-copy: null, undefined, string, number, boolean, and symbol.

Compound values -- objects (including arrays, and all boxed object wrappers) and functions -- always create a copy of the reference on assignment or passing.

In the above snippet, because 2 is a scalar primitive, a holds one initial copy of that value, and b is assigned another copy of the value. When changing b, you are in no way changing the value in a.

But both c and d are separate references to the same shared value [1,2,3], which is a compound value. It's important to note that neither c nor d more "owns" the [1,2,3] value -- both are just equal peer references to the value. So, when using either reference to modify (.push(4)) the actual shared array value itself, it's affecting just the one shared value, and both references will reference the newly modified value [1,2,3,4].

Since references point to the values themselves and not to the variables, you cannot use one reference to change where another reference is pointed:

var a = [1,2,3];
var b = a;
a; // [1,2,3]
b; // [1,2,3]

b = [4,5,6];
a; // [1,2,3]
b; // [4,5,6]

When we make the assignment b = [4,5,6], we are doing absolutely nothing to affect where a is still referencing ([1,2,3]). To do that, b would have to be a pointer to a rather than a reference to the array -- but no such capability exists in JS!

The most common way such confusion happens is with function parameters:

function foo(x) {
	x.push( 4 );
	x; // [1,2,3,4]

	x = [4,5,6];
	x.push( 7 );
	x; // [4,5,6,7]
}

var a = [1,2,3];

foo( a );

a; // [1,2,3,4]  not  [4,5,6,7]

When we pass in the argument a, it assigns a copy of the a reference to x. x and a are separate references pointing at the same [1,2,3] value. Now, inside the function, we can use that reference to mutate the value itself (push(4)). But when we make the assignment x = [4,5,6], this is in no way affecting where the initial reference a is pointing -- still points at the (now modified) [1,2,3,4] value.

There is no way to use the x reference to change where a is pointing. We could only modify the contents of the shared value that both a and x are pointing to.

To accomplish changing a to have the [4,5,6,7] value contents, you can't create a new array and assign -- you must modify the existing array value:

function foo(x) {
	x.push( 4 );
	x; // [1,2,3,4]

	x.length = 0; // empty existing array in-place
	x.push( 4, 5, 6, 7 );
	x; // [4,5,6,7]
}

var a = [1,2,3];

foo( a );

a; // [4,5,6,7]  not  [1,2,3,4]

As you can see, x.length = 0 and x.push(4,5,6,7) were not creating a new array, but modifying the existing shared array. So of course, a references the new [4,5,6,7] contents.

Remember: you cannot directly control/override value-copy vs. reference -- those semantics are controlled entirely by the type of the underlying value.

To effectively pass a compound value (like an array) by value-copy, you need to manually make a copy of it, so that the reference passed doesn't still point to the original. For example:

foo( a.slice() );

slice(..) with no parameters by default makes an entirely new (shallow) copy of the array. So, we pass in a reference only to the copied array, and thus foo(..) cannot affect the contents of a.
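
Putting that together with the earlier foo(..) (which mutates and then repopulates its argument):

var a = [1,2,3];

foo( a.slice() ); // `foo(..)` affects only the shallow copy

a; // [1,2,3] -- unchanged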

To do the reverse -- pass a scalar primitive value in a way where its value updates can be seen, kinda like a reference -- you have to wrap the value in another compound value (object, array, etc) that can be passed by reference-copy:

function foo(wrapper) {
	wrapper.a = 42;
}

var obj = {
	a: 2
};

foo( obj );

obj.a; // 42

Here, obj acts as a wrapper for the scalar primitive property a. When passed to foo(..), a copy of the obj reference is passed in and set to the wrapper parameter. We now can use the wrapper reference to access the shared object, and update its property. After the function finishes, obj.a will see the updated value 42.

It may occur to you that if you wanted to pass in a reference to a scalar primitive value like 2, you could just box the value in its Number object wrapper (see Chapter 3).

It is true a copy of the reference to this Number object will be passed to the function, but unfortunately, having a reference to the shared object is not going to give you the ability to modify the shared primitive value, like you may expect:

function foo(x) {
	x = x + 1;
	x; // 3
}

var a = 2;
var b = new Number( a ); // or equivalently `Object(a)`

foo( b );
console.log( b ); // 2, not 3

The problem is that the underlying scalar primitive value is not mutable (same goes for String and Boolean). If a Number object holds the scalar primitive value 2, that exact Number object can never be changed to hold another value; you can only create a whole new Number object with a different value.

When x is used in the expression x + 1, the underlying scalar primitive value 2 is unboxed (extracted) from the Number object automatically, so the line x = x + 1 very subtly changes x from being a shared reference to the Number object, to just holding the scalar primitive value 3 as a result of the addition operation 2 + 1. Therefore, b on the outside still references the original unmodified/immutable Number object holding the value 2.

You can add properties on top of the Number object (just not change its inner primitive value), so you could exchange information indirectly via those additional properties.

This is not all that common, however; it probably would not be considered a good practice by most developers.

Instead of using the wrapper object Number in this way, it's probably much better to use the manual object wrapper (obj) approach in the earlier snippet. That's not to say that there's no clever uses for the boxed object wrappers like Number -- just that you should probably prefer the scalar primitive value form in most cases.

References are quite powerful, but sometimes they get in your way, and sometimes you need them where they don't exist. The only control you have over reference vs. value-copy behavior is the type of the value itself, so you must indirectly influence the assignment/passing behavior by which value types you choose to use.

Review

In JavaScript, arrays are simply numerically indexed collections of any value-type. strings are somewhat "array-like", but they have distinct behaviors and care must be taken if you want to treat them as arrays. Numbers in JavaScript include both "integers" and floating-point values.

Several special values are defined within the primitive types.

The null type has just one value: null, and likewise the undefined type has just the undefined value. undefined is basically the default value in any variable or property if no other value is present. The void operator lets you create the undefined value from any other value.

numbers include several special values, like NaN (supposedly "Not a Number", but really more appropriately "invalid number"); +Infinity and -Infinity; and -0.

Simple scalar primitives (strings, numbers, etc.) are assigned/passed by value-copy, but compound values (objects, etc.) are assigned/passed by reference-copy. References are not like references/pointers in other languages -- they're never pointed at other variables/references, only at the underlying values.


Chapter 3: Natives

Here's a list of the most commonly used natives:

  • String()
  • Number()
  • Boolean()
  • Array()
  • Object()
  • Function()
  • RegExp()
  • Date()
  • Error()
  • Symbol() -- added in ES6!

As you can see, these natives are actually built-in functions.

If you're coming to JS from a language like Java, JavaScript's String() will look like the String(..) constructor you're used to for creating string values. So, you'll quickly observe that you can do things like:

var s = new String( "Hello World!" );

console.log( s.toString() ); // "Hello World!"

It is true that each of these natives can be used as a native constructor. But what's being constructed may be different than you think.

var a = new String( "abc" );

typeof a; // "object" ... not "String"

a instanceof String; // true

Object.prototype.toString.call( a ); // "[object String]"

The result of the constructor form of value creation (new String("abc")) is an object wrapper around the primitive ("abc") value.

Importantly, typeof shows that these objects are not their own special types, but more appropriately they are subtypes of the object type.

This object wrapper can further be observed with:

console.log( a );

The output of that statement varies depending on your browser, as developer consoles are free to choose however they feel it's appropriate to serialize the object for developer inspection.

Note: At the time of writing, the latest Chrome prints something like this: String {0: "a", 1: "b", 2: "c", length: 3, [[PrimitiveValue]]: "abc"}. But older versions of Chrome used to just print this: String {0: "a", 1: "b", 2: "c"}. Of course, these results are subject to rapid change and your experience may vary.

The point is, new String("abc") creates a string wrapper object around "abc", not just the primitive "abc" value itself.

Internal [[Class]]

Values that are typeof "object" (such as an array) are additionally tagged with an internal [[Class]] property (think of this more as an internal classification rather than related to classes from traditional class-oriented coding). This property cannot be accessed directly, but can generally be revealed indirectly by borrowing the default Object.prototype.toString(..) method called against the value.

Object.prototype.toString.call( [1,2,3] );			// "[object Array]"

Object.prototype.toString.call( /regex-literal/i );	// "[object RegExp]"

So, for the array in this example, the internal [[Class]] value is "Array", and for the regular expression, it's "RegExp". In most cases, this internal [[Class]] value corresponds to the built-in native constructor that's related to the value, but that's not always the case.

What about primitive values? First, null and undefined:

Object.prototype.toString.call( null );			// "[object Null]"
Object.prototype.toString.call( undefined );	// "[object Undefined]"

You'll note that there are no Null() or Undefined() native constructors, but nevertheless the "Null" and "Undefined" are the internal [[Class]] values exposed.

But for the other simple primitives like string, number, and boolean, another behavior actually kicks in, which is usually called "boxing":

Object.prototype.toString.call( "abc" );	// "[object String]"
Object.prototype.toString.call( 42 );		// "[object Number]"
Object.prototype.toString.call( true );		// "[object Boolean]"

Each of the simple primitives are automatically boxed by their respective object wrappers, which is why "String", "Number", and "Boolean" are revealed as the respective internal [[Class]] values.

Note: The behavior of toString() and [[Class]] as illustrated here has changed a bit from ES5 to ES6, but we cover those details in the ES6 & Beyond title of this series.

Boxing Wrappers

These object wrappers serve a very important purpose. Primitive values don't have properties or methods, so to access .length or .toString() you need an object wrapper around the value. Thankfully, JS will automatically box (aka wrap) the primitive value to fulfill such accesses.

var a = "abc";

a.length; // 3
a.toUpperCase(); // "ABC"

So, if you're going to be accessing these properties/methods on your string values regularly, like an i < a.length condition in a for loop for instance, it might seem to make sense to just have the object form of the value from the start, so the JS engine doesn't need to implicitly create it for you.

But it turns out that's a bad idea. Browsers long ago performance-optimized the common cases like .length, which means your program will actually go slower if you try to "preoptimize" by directly using the object form (which isn't on the optimized path).

In general, there's basically no reason to use the object form directly. It's better to just let the boxing happen implicitly where necessary. In other words, never do things like new String("abc"), new Number(42), etc -- always prefer using the literal primitive values "abc" and 42.

Object Wrapper Gotchas

There are some gotchas with using the object wrappers directly that you should be aware of if you do choose to ever use them.

For example, consider Boolean wrapped values:

var a = new Boolean( false );

if (!a) {
	console.log( "Oops" ); // never runs
}

The problem is that you've created an object wrapper around the false value, but objects themselves are "truthy", so using the object behaves oppositely to using the underlying false value itself, which is quite contrary to normal expectation.

If you want to manually box a primitive value, you can use the Object(..) function (no new keyword):

var a = "abc";
var b = new String( a );
var c = Object( a );

typeof a; // "string"
typeof b; // "object"
typeof c; // "object"

b instanceof String; // true
c instanceof String; // true

Object.prototype.toString.call( b ); // "[object String]"
Object.prototype.toString.call( c ); // "[object String]"

Again, using the boxed object wrapper directly (like b and c above) is usually discouraged, but there may be some rare occasions you'll run into where they may be useful.

Unboxing

If you have an object wrapper and you want to get the underlying primitive value out, you can use the valueOf() method:

var a = new String( "abc" );
var b = new Number( 42 );
var c = new Boolean( true );

a.valueOf(); // "abc"
b.valueOf(); // 42
c.valueOf(); // true

Unboxing can also happen implicitly, when using an object wrapper value in a way that requires the primitive value:

var a = new String( "abc" );
var b = a + ""; // `b` has the unboxed primitive value "abc"

typeof a; // "object"
typeof b; // "string"

Natives as Constructors

For array, object, function, and regular-expression values, it's almost universally preferred that you use the literal form for creating the values, but the literal form creates the same sort of object as the constructor form does (that is, there is no nonwrapped value).

Just as we've seen above with the other natives, these constructor forms should generally be avoided, unless you really know you need them, mostly because they introduce exceptions and gotchas that you probably don't really want to deal with.

Array(..)

var a = new Array( 1, 2, 3 );
a; // [1, 2, 3]

var b = [1, 2, 3];
b; // [1, 2, 3]

Note: The Array(..) constructor does not require the new keyword in front of it. If you omit it, it will behave as if you have used it anyway. So Array(1,2,3) is the same outcome as new Array(1,2,3).
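For example, both forms produce an equivalent array:

Array( 1, 2, 3 );		// [1, 2, 3]
new Array( 1, 2, 3 );	// [1, 2, 3]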

The Array constructor has a special form where if only one number argument is passed, instead of providing that value as contents of the array, it's taken as a length to "presize the array" (well, sorta).

This is a terrible idea. Firstly, you can trip over that form accidentally, as it's easy to forget.

But more importantly, there's no such thing as actually presizing the array. Instead, what you're creating is an otherwise empty array, but setting the length property of the array to the numeric value specified.

An array that has no explicit values in its slots, but has a length property that implies the slots exist, is a weird exotic type of data structure in JS with some very strange and confusing behavior. The capability to create such a value comes purely from old, deprecated, historical functionalities ("array-like objects" like the arguments object).

Note: An array with at least one "empty slot" in it is often called a "sparse array."

var a = new Array( 3 );

a.length; // 3
a;

The serialization of a in Chrome is (at the time of writing): [ undefined x 3 ]. This is really unfortunate. It implies that there are three undefined values in the slots of this array, when in fact the slots do not exist (so-called "empty slots" -- also a bad name!).

var a = new Array( 3 );
var b = [ undefined, undefined, undefined ];
var c = [];
c.length = 3;

a;	// [ undefined x 3 ] (in Chrome, at the time of writing)
b;	// [ undefined, undefined, undefined ]
c;	// [ undefined x 3 ]

Note: As you can see with c in this example, empty slots in an array can happen after creation of the array. By changing the length of an array to go beyond its number of actually-defined slot values, you implicitly introduce empty slots. In fact, you could even call delete b[1] in the above snippet, and it would introduce an empty slot into the middle of b.

For b (in Chrome, currently), you'll find [ undefined, undefined, undefined ] as the serialization, as opposed to [ undefined x 3 ] for a and c. Confused? Yeah, so is everyone else.

Worse than that, at the time of writing, Firefox reports [ , , , ] for a and c. Did you catch why that's so confusing? Look closely. Three commas implies four slots, not three slots like we'd expect.

What!? Firefox puts an extra , on the end of their serialization here because as of ES5, trailing commas in lists (array values, property lists, etc.) are allowed (and thus dropped and ignored). So if you were to type in a [ , , , ] value into your program or the console, you'd actually get the underlying value that's like [ , , ] (that is, an array with three empty slots). This choice, while confusing if reading the developer console, is defended as instead making copy-n-paste behavior accurate.

Unfortunately, it gets worse. More than just confusing console output, a and b from the above code snippet actually behave the same in some cases but differently in others:

a.join( "-" ); // "--"
b.join( "-" ); // "--"

a.map(function(v,i){ return i; }); // [ undefined x 3 ]
b.map(function(v,i){ return i; }); // [ 0, 1, 2 ]

Ugh.

The a.map(..) call fails because the slots don't actually exist, so map(..) has nothing to iterate over. join(..) works differently. Basically, we can think of it implemented sort of like this:

function fakeJoin(arr,connector) {
	var str = "";
	for (var i = 0; i < arr.length; i++) {
		if (i > 0) {
			str += connector;
		}
		if (arr[i] !== undefined) {
			str += arr[i];
		}
	}
	return str;
}

var a = new Array( 3 );
fakeJoin( a, "-" ); // "--"

As you can see, join(..) works by just assuming the slots exist and looping up to the length value. Whatever map(..) does internally, it (apparently) doesn't make such an assumption, so the result from the strange "empty slots" array is unexpected and likely to cause failure.

So, if you wanted to actually create an array of actual undefined values (not just "empty slots"), how could you do it (besides manually)?

var a = Array.apply( null, { length: 3 } );
a; // [ undefined, undefined, undefined ]

apply(..) is a utility available to all functions, which calls the function it's used with but in a special way.

The first argument is a this object binding, so we set it to null. The second argument is supposed to be an array (or something like an "array-like object"). The contents of this "array" are "spread" out as arguments to the function in question.

So, Array.apply(..) is calling the Array(..) function and spreading out the values (of the { length: 3 } object value) as its arguments.

Inside of apply(..), we can envision there's another for loop (kinda like join(..) from above) that goes from 0 up to, but not including, length (3 in our case).

For each index, it retrieves that key from the object. So if the array-object parameter was named arr internally inside of the apply(..) function, the property access would effectively be arr[0], arr[1], and arr[2]. Of course, none of those properties exist on the { length: 3 } object value, so all three of those property accesses would return the value undefined.

In other words, it ends up calling Array(..) basically like this: Array(undefined,undefined,undefined), which is how we end up with an array filled with undefined values, and not just those empty slots.

While Array.apply( null, { length: 3 } ) is a strange and verbose way to create an array filled with undefined values, it's vastly better and more reliable than what you get with the footgun'ish Array(3) empty slots.

Bottom line: never ever, under any circumstances, should you intentionally create and use these exotic empty-slot arrays. Just don't do it. They're nuts.

Object(..), Function(..), and RegExp(..)

The Object(..)/Function(..)/RegExp(..) constructors are also generally optional (and thus should usually be avoided unless specifically called for):

var c = new Object();
c.foo = "bar";
c; // { foo: "bar" }

var d = { foo: "bar" };
d; // { foo: "bar" }

var e = new Function( "a", "return a * 2;" );
var f = function(a) { return a * 2; };
function g(a) { return a * 2; }

var h = new RegExp( "^a*b+", "g" );
var i = /^a*b+/g;

There's practically no reason to ever use the new Object() constructor form, especially since it forces you to add properties one-by-one instead of many at once in the object literal form.

The Function constructor is helpful only in the rarest of cases, where you need to dynamically define a function's parameters and/or its function body. Do not just treat Function(..) as an alternate form of eval(..). You will almost never need to dynamically define a function in this way.

Regular expressions defined in the literal form (/^a*b+/g) are strongly preferred, not just for ease of syntax but for performance reasons -- the JS engine precompiles and caches them before code execution. Unlike the other constructor forms we've seen so far, RegExp(..) has some reasonable utility: to dynamically define the pattern for a regular expression.

var name = "Kyle";
var namePattern = new RegExp( "\\b(?:" + name + ")+\\b", "ig" );

var matches = someText.match( namePattern );

This kind of scenario legitimately occurs in JS programs from time to time, so you'd need to use the new RegExp("pattern","flags") form.

Date(..) and Error(..)

The Date(..) and Error(..) native constructors are much more useful than the other natives, because there is no literal form for either.

To create a date object value, you must use new Date(). The Date(..) constructor accepts optional arguments to specify the date/time to use, but if omitted, the current date/time is assumed.

By far the most common reason you construct a date object is to get the current timestamp value (a signed integer number of milliseconds since Jan 1, 1970). You can do this by calling getTime() on a date object instance.

But an even easier way is to just call the static helper function defined as of ES5: Date.now().
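And if you need to support pre-ES5 environments, Date.now() is easily polyfilled -- a minimal sketch:

if (!Date.now) {
	Date.now = function(){
		return new Date().getTime();
	};
}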

Note: If you call Date() without new, you'll get back a string representation of the date/time at that moment. The exact form of this representation is not specified in the language spec, though browsers tend to agree on something close to: "Fri Jul 18 2014 00:31:02 GMT-0500 (CDT)".

The Error(..) constructor (much like Array() above) behaves the same with the new keyword present or omitted.

The main reason you'd want to create an error object is that it captures the current execution stack context into the object (in most JS engines, revealed as a read-only .stack property once constructed). This stack context includes the function call-stack and the line-number where the error object was created, which makes debugging that error much easier.

You would typically use such an error object with the throw operator:

function foo(x) {
	if (!x) {
		throw new Error( "x wasn't provided" );
	}
	// ..
}

Error object instances generally have at least a message property, and sometimes other properties (which you should treat as read-only), like type. However, other than inspecting the above-mentioned stack property, it's usually best to just call toString() on the error object (either explicitly, or implicitly through coercion -- see Chapter 4) to get a friendly-formatted error message.
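On the catching side, you might inspect such an error like this (a sketch; the exact stack text is implementation-dependent):

try {
	foo();	// throws, since no `x` was provided
}
catch (err) {
	err.message;	// "x wasn't provided"
	err.toString();	// "Error: x wasn't provided"
	err.stack;		// nonstandard but widely supported call-stack snapshot
}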

Tip: Technically, in addition to the general Error(..) native, there are several other specific-error-type natives: EvalError(..), RangeError(..), ReferenceError(..), SyntaxError(..), TypeError(..), and URIError(..). But it's very rare to manually use these specific error natives. They are automatically used if your program actually suffers from a real exception (such as referencing an undeclared variable and getting a ReferenceError error).

Symbol(..)

New as of ES6, an additional primitive value type has been added, called "Symbol". Symbols are special "unique" (not strictly guaranteed!) values that can be used as properties on objects with little fear of any collision. They're primarily designed for special built-in behaviors of ES6 constructs, but you can also define your own symbols.

Symbols can be used as property names, but you cannot see or access the actual value of a symbol from your program, nor from the developer console. If you evaluate a symbol in the developer console, what's shown looks like Symbol(Symbol.iterator), for example.

There are several predefined symbols in ES6, accessed as static properties of the Symbol function object, like Symbol.iterator, Symbol.toStringTag, etc. To use them, do something like:

obj[Symbol.iterator] = function(){ /*..*/ };

To define your own custom symbols, use the Symbol(..) native. The Symbol(..) native "constructor" is unique in that you're not allowed to use new with it, as doing so will throw an error.

var mysym = Symbol( "my own symbol" );
mysym;				// Symbol(my own symbol)
mysym.toString();	// "Symbol(my own symbol)"
typeof mysym; 		// "symbol"

var a = { };
a[mysym] = "foobar";

Object.getOwnPropertySymbols( a );
// [ Symbol(my own symbol) ]

While symbols are not actually private (Object.getOwnPropertySymbols(..) reflects on the object and reveals the symbols quite publicly), using them for private or special properties is likely their primary use-case. For most developers, they may take the place of property names with _ underscore prefixes, which are almost always by convention signals to say, "hey, this is a private/special/internal property, so leave it alone!"

Note: Symbols are not objects, they are simple scalar primitives.

Native Prototypes

Each of the built-in native constructors has its own .prototype object -- Array.prototype, String.prototype, etc.

These objects contain behavior unique to their particular object subtype.

For example, all string objects, and by extension (via boxing) string primitives, have access to default behavior as methods defined on the String.prototype object.

Note: By documentation convention, String.prototype.XYZ is shortened to String#XYZ, and likewise for all the other .prototypes.

  • String#indexOf(..): find the position in the string of another substring
  • String#charAt(..): access the character at a position in the string
  • String#substr(..), String#substring(..), and String#slice(..): extract a portion of the string as a new string
  • String#toUpperCase() and String#toLowerCase(): create a new string that's converted to either uppercase or lowercase
  • String#trim(): create a new string that's stripped of any trailing or leading whitespace

None of the methods modify the string in place. Modifications create a new value from the existing value.

By virtue of prototype delegation, any string value can access these methods:

var a = " abc ";

a.indexOf( "c" ); // 3
a.toUpperCase(); // " ABC "
a.trim(); // "abc"

The other constructor prototypes contain behaviors appropriate to their types, such as Number#toFixed(..) (stringifying a number with a fixed number of decimal digits) and Array#concat(..) (merging arrays). All functions have access to apply(..), call(..), and bind(..) because Function.prototype defines them.
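For example:

(42.567).toFixed( 2 );		// "42.57"
[1,2].concat( [3,4], 5 );	// [1, 2, 3, 4, 5]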

But, some of the native prototypes aren't just plain objects:

typeof Function.prototype;			// "function"
Function.prototype();				// it's an empty function!

RegExp.prototype.toString();		// "/(?:)/" -- empty regex
"abc".match( RegExp.prototype );	// [""]

You can even modify these native prototypes (not just adding properties, as you're probably familiar with) -- though that's a particularly bad idea:

Array.isArray( Array.prototype );	// true
Array.prototype.push( 1, 2, 3 );	// 3
Array.prototype;					// [1,2,3]

// don't leave it that way, though, or expect weirdness!
// reset the `Array.prototype` to empty
Array.prototype.length = 0;

As you can see, Function.prototype is a function, RegExp.prototype is a regular expression, and Array.prototype is an array.

Prototypes As Defaults

Function.prototype being an empty function, RegExp.prototype being an "empty" (e.g., non-matching) regex, and Array.prototype being an empty array, make them all nice "default" values to assign to variables if those variables wouldn't already have had a value of the proper type.

For example:

function isThisCool(vals,fn,rx) {
	vals = vals || Array.prototype;
	fn = fn || Function.prototype;
	rx = rx || RegExp.prototype;

	return rx.test(
		vals.map( fn ).join( "" )
	);
}

isThisCool();		// true

isThisCool(
	["a","b","c"],
	function(v){ return v.toUpperCase(); },
	/D/
);					// false

Note: As of ES6, we don't need to use the vals = vals || .. default value syntax anymore, because default values can be set for parameters via native syntax in the function declaration.
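For example, here's a sketch of the same function using ES6 default parameters. One caveat: as of ES6, RegExp.prototype is an ordinary object rather than a regex, so literal defaults are the safer choice here, at the cost of recreating those values per call:

function isThisCool(vals = [], fn = function(v){ return v; }, rx = /(?:)/) {
	return rx.test(
		vals.map( fn ).join( "" )
	);
}

isThisCool();	// true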

One minor side-benefit of this approach is that the .prototypes are already created and built-in, thus created only once. By contrast, using [], function(){}, and /(?:)/ values themselves for those defaults would be recreating those values (and probably garbage-collecting them later) for each call of isThisCool(..). That could be memory/CPU wasteful.

Also, be very careful not to use Array.prototype as a default value that will subsequently be modified. In this example, vals is used read-only, but if you were to instead make in-place changes to vals, you would actually be modifying Array.prototype itself, which would lead to the gotchas mentioned earlier!

Note: While we're pointing out these native prototypes and some usefulness, be cautious of relying on them and even more wary of modifying them in any way. See Appendix A "Native Prototypes" for more discussion.

Review

JavaScript provides object wrappers around primitive values, known as natives (String, Number, Boolean, etc). These object wrappers give the values access to behaviors appropriate for each object subtype (String#trim() and Array#concat(..)).

If you have a simple scalar primitive value like "abc" and you access its length property or some String.prototype method, JS automatically "boxes" the value (wraps it in its respective object wrapper) so that the property/method accesses can be fulfilled.


Chapter 4: Coercion

Now that we much more fully understand JavaScript's types and values, we turn our attention to a very controversial topic: coercion.

As we mentioned in Chapter 1, the debates over whether coercion is a useful feature or a flaw in the design of the language have raged since day one. If you've read other popular books on JS, you know that the overwhelmingly prevalent message out there is that coercion is magical, evil, confusing, and just downright a bad idea.

Rather than running away from coercion because everyone else does, or because you get bitten by some quirk, I think you should run toward that which you don't understand and seek to understand it more fully.

Our goal is to fully explore the pros and cons of coercion, so that you can make an informed decision on its appropriateness in your program.

Converting Values

Converting a value from one type to another is often called "type casting," when done explicitly, and "coercion" when done implicitly (forced by the rules of how a value is used).

Note: It may not be obvious, but JavaScript coercions always result in one of the scalar primitive values, like string, number, or boolean. There is no coercion that results in a complex value like object or function. Chapter 3 covers "boxing," which wraps scalar primitive values in their object counterparts, but this is not really coercion in an accurate sense.

Another way these terms are often distinguished is as follows: "type casting" (or "type conversion") occurs in statically typed languages at compile time, while "type coercion" is a runtime conversion for dynamically typed languages.

However, in JavaScript, most people refer to all these types of conversions as coercion, so the way I prefer to distinguish is to say "implicit coercion" vs. "explicit coercion."

The difference should be obvious: "explicit coercion" is when it is obvious from looking at the code that a type conversion is intentionally occurring, whereas "implicit coercion" is when the type conversion will occur as a less obvious side effect of some other intentional operation.

var a = 42;

var b = a + "";			// implicit coercion

var c = String( a );	// explicit coercion

For b, the coercion that occurs happens implicitly, because the + operator combined with one of the operands being a string value ("") will insist on the operation being a string concatenation (adding two strings together), which as a (hidden) side effect will force the 42 value in a to be coerced to its string equivalent: "42".

By contrast, the String(..) function makes it pretty obvious that it's explicitly taking the value in a and coercing it to a string representation.

Both approaches accomplish the same effect: "42" comes from 42. But it's the how that is at the heart of the heated debates over JavaScript coercion.

The terms "explicit" and "implicit," or "obvious" and "hidden side effect," are relative.

If you know exactly what a + "" is doing and you're intentionally doing that to coerce to a string, you might feel the operation is sufficiently "explicit." Conversely, if you've never seen the String(..) function used for string coercion, its behavior might seem hidden enough as to feel "implicit" to you.

But we're having this discussion of "explicit" vs. "implicit" based on the likely opinions of an average, reasonably informed, but not expert or JS specification devotee developer. To whatever extent you do or do not find yourself fitting neatly in that bucket, you will need to adjust your perspective on our observations here accordingly.

Just remember: it's rare that we write our code and are the only ones who ever read it. Even if you're an expert on all the ins and outs of JS, consider how a less experienced teammate of yours will feel when they read your code. Will it be "explicit" or "implicit" to them in the same way it is for you?

Abstract Value Operations

Before we can explore explicit vs implicit coercion, we need to learn the basic rules that govern how values become either a string, number, or boolean. We will specifically pay attention to: ToString, ToNumber, and ToBoolean, and to a lesser extent, ToPrimitive.

ToString

When any non-string value is coerced to a string representation, the conversion is handled by the ToString abstract operation.

Built-in primitive values have natural stringification: null becomes "null", undefined becomes "undefined", and true becomes "true". Numbers are generally expressed in the natural way you'd expect, but very small or very large numbers are represented in exponent form:

// multiplying `1.07` by `1000`, seven times over
var a = 1.07 * 1000 * 1000 * 1000 * 1000 * 1000 * 1000 * 1000;

// seven times three digits => 21 digits
a.toString(); // "1.07e21"

For regular objects, unless you specify your own, the default toString() will return the internal [[Class]], like for instance "[object Object]".

But if an object has its own toString() method on it, and you use that object in a string-like way, its toString() will automatically be called, and the string result of that call will be used instead.
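For example (a small sketch of an object supplying its own toString()):

var a = {
	num: 42,
	toString: function(){ return String( this.num ); }
};

a + "";			// "42"
String( a );	// "42"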

Note: The way an object is coerced to a string technically goes through the ToPrimitive abstract operation, but those nuanced details are covered in more detail in the ToNumber section later in this chapter, so we will skip over them here.

Arrays have an overridden default toString() that stringifies as the (string) concatenation of all its values (each stringified themselves), with "," in between each value:

var a = [1,2,3];

a.toString(); // "1,2,3"

Again, toString() can either be called explicitly, or it will automatically be called if a non-string is used in a string context.

JSON Stringification

Another task that seems awfully related to ToString is when you use the JSON.stringify(..) utility to serialize a value to a JSON-compatible string value.

It's important to note that this stringification is not exactly the same thing as coercion. But since it's related to the ToString rules above, we'll take a slight diversion to cover JSON stringification behaviors here.

For most simple values, JSON stringification behaves basically the same as toString() conversions, except that the serialization result is always a string:

JSON.stringify( 42 );	// "42"
JSON.stringify( "42" );	// ""42"" (a string with a quoted string value in it)
JSON.stringify( null );	// "null"
JSON.stringify( true );	// "true"

Any JSON-safe value can be stringified by JSON.stringify(..). But what is JSON-safe? Any value that can be represented validly in a JSON representation.

It may be easier to consider values that are not JSON-safe. Some examples: undefineds, functions, symbols, and objects with circular references (where property references in an object structure create a never-ending cycle through each other). These are all illegal values for a standard JSON structure, mostly because they aren't portable to other languages that consume JSON values.

The JSON.stringify(..) utility will automatically omit undefined, function, and symbol values when it comes across them. If such a value is found in an array, that value is replaced by null (so that the array position information isn't altered). If found as a property of an object, that property will simply be excluded.

JSON.stringify( undefined );					// undefined
JSON.stringify( function(){} );					// undefined

JSON.stringify( [1,undefined,function(){},4] );	// "[1,null,null,4]"
JSON.stringify( { a:2, b:function(){} } );		// "{"a":2}"

But if you try to JSON.stringify(..) an object with circular reference(s) in it, an error will be thrown.

JSON stringification has the special behavior that if an object value has a toJSON() method defined, this method will be called first to get a value to use for serialization.

If you intend to JSON stringify an object that may contain illegal JSON value(s), or if you just have values in the object that aren't appropriate for the serialization, you should define a toJSON() method for it that returns a JSON-safe version of the object.

var o = { };

var a = {
	b: 42,
	c: o,
	d: function(){}
};

// create a circular reference inside `a`
o.e = a;

// would throw an error on the circular reference
// JSON.stringify( a );

// define a custom JSON value serialization
a.toJSON = function() {
	// only include the `b` property for serialization
	return { b: this.b };
};

JSON.stringify( a ); // "{"b":42}"

It's a very common misconception that toJSON() should return a JSON stringification representation. That's probably incorrect, unless you're wanting to actually stringify the string itself (usually not!). toJSON() should return the actual regular value (of whatever type) that's appropriate, and JSON.stringify(..) itself will handle the stringification.

In other words, toJSON() should be interpreted as "to a JSON-safe value suitable for stringification," not "to a JSON string" as many developers mistakenly assume.

var a = {
	val: [1,2,3],

	// probably correct!
	toJSON: function(){
		return this.val.slice( 1 );
	}
};

var b = {
	val: [1,2,3],

	// probably incorrect!
	toJSON: function(){
		return "[" +
			this.val.slice( 1 ).join() +
		"]";
	}
};

JSON.stringify( a ); // "[2,3]"
JSON.stringify( b ); // ""[2,3]""

In the second call, we stringified the returned string rather than the array itself, which was probably not what we wanted to do.

While we're talking about JSON.stringify(..), let's discuss some lesser-known functionalities that can still be very useful.

An optional second argument can be passed to JSON.stringify(..) that is called replacer. This argument can either be an array or a function. It's used to customize the recursive serialization of an object by providing a filtering mechanism for which properties should and should not be included, in a similar way to how toJSON() can prepare a value for serialization.

If replacer is an array, it should be an array of strings, each of which will specify a property name that is allowed to be included in the serialization of the object. If a property exists that isn't in this list, it will be skipped.

If replacer is a function, it will be called once for the object itself, and then once for each property in the object, and each time is passed two arguments, key and value. To skip a key in the serialization, return undefined. Otherwise, return the value provided.

var a = {
	b: 42,
	c: "42",
	d: [1,2,3]
};

JSON.stringify( a, ["b","c"] ); // "{"b":42,"c":"42"}"

JSON.stringify( a, function(k,v){
	if (k !== "c") return v;
} );
// "{"b":42,"d":[1,2,3]}"

Note: In the function replacer case, the key argument k is undefined for the first call (where the a object itself is being passed in). The if statement filters out the property named "c". Stringification is recursive, so the [1,2,3] array has each of its values (1, 2, and 3) passed as v to replacer, with indexes (0, 1, and 2) as k.

A third optional argument can also be passed to JSON.stringify(..), called space, which is used as indentation for prettier human-friendly output. space can be a positive integer to indicate how many space characters should be used at each indentation level. Or, space can be a string, in which case up to the first ten characters of its value will be used for each indentation level.

var a = {
	b: 42,
	c: "42",
	d: [1,2,3]
};

JSON.stringify( a, null, 3 );
// "{
//    "b": 42,
//    "c": "42",
//    "d": [
//       1,
//       2,
//       3
//    ]
// }"

JSON.stringify( a, null, "-----" );
// "{
// -----"b": 42,
// -----"c": "42",
// -----"d": [
// ----------1,
// ----------2,
// ----------3
// -----]
// }"

Remember, JSON.stringify(..) is not directly a form of coercion. We covered it here, however, for two reasons that relate its behavior to ToString coercion:

  1. string, number, boolean, and null values all stringify for JSON basically the same as how they coerce to string values via the rules of the ToString abstract operation.
  2. If you pass an object value to JSON.stringify(..), and that object has a toJSON() method on it, toJSON() is automatically called to (sort of) "coerce" the value to be JSON-safe before stringification.

ToNumber

If any non-number value is used in a way that requires it to be a number, such as a mathematical operation, the ES5 spec defines the ToNumber abstract operation.

For example, true becomes 1 and false becomes 0. undefined becomes NaN, but (curiously) null becomes 0.

ToNumber for a string value essentially works for the most part like the rules/syntax for numeric literals. If it fails, the result is NaN (instead of a syntax error as with number literals). One example difference is that 0-prefixed octal numbers are not handled as octals (just as normal base-10 decimals) in this operation, though such octals are valid as number literals.

Note: The differences between number literal grammar and ToNumber on a string value are subtle and highly nuanced, and thus will not be covered further here.
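You can see those rules in action with Number(..), which invokes ToNumber directly (as we cover later in this chapter):

Number( true );			// 1
Number( false );		// 0
Number( undefined );	// NaN
Number( null );			// 0
Number( "010" );		// 10 -- the 0-prefix is not treated as octal here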

Objects (and arrays) will first be converted to their primitive value equivalent, and the resulting value (if a primitive but not already a number) is coerced to a number according to the ToNumber rules just mentioned.

To convert to this primitive value equivalent, the ToPrimitive abstract operation will consult the value (using the internal DefaultValue operation) in question to see if it has a valueOf() method. If valueOf() is available and it returns a primitive value, that value is used for the coercion. If not, but toString() is available, it will provide the value for the coercion.

If neither operation can provide a primitive value, a TypeError is thrown.

As of ES5, you can create such a noncoercible object -- one without valueOf() and toString() -- if it has a null value for its [[Prototype]], typically created with Object.create(null). See the this & Object Prototypes title of this series for more information on [[Prototype]]s.
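A quick sketch of such a noncoercible object:

var a = Object.create( null );

// would throw a TypeError (exact message varies by engine)
// Number( a );

By contrast, objects that can provide a primitive value coerce just fine: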

var a = {
	valueOf: function(){
		return "42";
	}
};

var b = {
	toString: function(){
		return "42";
	}
};

var c = [4,2];
c.toString = function(){
	return this.join( "" );	// "42"
};

Number( a );			// 42
Number( b );			// 42
Number( c );			// 42
Number( "" );			// 0
Number( [] );			// 0
Number( [ "abc" ] );	// NaN

ToBoolean

First and foremost, JS has actual keywords true and false, and they behave exactly as you'd expect of boolean values. It's a common misconception that the values 1 and 0 are identical to true/false. In JS, the numbers are numbers and the booleans are booleans. You can coerce 1 to true (and vice versa) or 0 to false (and vice versa). But they're not the same.
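A quick check makes the distinction clear:

1 === true;		// false
0 === false;	// false

Boolean( 1 );	// true
Number( true );	// 1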

Falsy Values

But that's not the end of the story. We need to discuss how values other than the two booleans behave whenever you coerce to their boolean equivalent.

All of JavaScript's values can be divided into two categories:

  1. values that will become false if coerced to boolean
  2. everything else (which will obviously become true)

How do we know what the list of values is? In the ES5 spec, section 9.2 defines a ToBoolean abstract operation, which says exactly what happens for all the possible values when you try to coerce them "to boolean."

From that table, we get the following as the so-called "falsy" values list:

  • undefined
  • null
  • false
  • +0, -0, and NaN
  • ""

That's it. If a value is on that list, it's a "falsy" value, and it will coerce to false if you force a boolean coercion on it.

By logical conclusion, if a value is not on that list, it must be on another list, which we call the "truthy" values list. But JS doesn't really define a "truthy" list per se. It gives some examples, such as saying explicitly that all objects are truthy, but mostly the spec just implies: anything not explicitly on the falsy list is therefore truthy.

Falsy Objects

Wait a minute, that section title even sounds contradictory. I literally just said the spec calls all objects truthy, right? There should be no such thing as a "falsy object."

You might be tempted to think it means an object wrapper around a falsy value (such as "", 0 or false). But don't fall into that trap.

var a = new Boolean( false );
var b = new Number( 0 );
var c = new String( "" );

We know all three values here are objects wrapped around obviously falsy values. But do these objects behave as true or as false? That's easy to answer:

var d = Boolean( a && b && c );

d; // true

So, all three behave as true, as that's the only way d could end up as true.

Tip: Notice the Boolean( .. ) wrapped around the a && b && c expression -- you might wonder why that's there. For a sneak-peek (trivia-wise), try for yourself what d will be if you just do d = a && b && c without the Boolean( .. ) call!

So, if "falsy objects" are not just objects wrapped around falsy values, what the heck are they?

The tricky part is that they can show up in your JS program, but they're not actually part of JavaScript itself.

What!?

There are certain cases where browsers have created their own sort of exotic values behavior, namely this idea of "falsy objects," on top of regular JS semantics.

A "falsy object" is a value that looks and acts like a normal object (properties, etc.), but when you coerce it to a boolean, it coerces to a false value.

Why!?

The most well-known case is document.all: an array-like (object) provided to your JS program by the DOM (not the JS engine itself), which exposes elements in your page to your JS program. It used to behave like a normal object -- it would act truthy. But not anymore.

document.all itself was never really "standard" and has long since been deprecated/abandoned. But there are far too many legacy JS code bases out there that rely on using it.

So, why make it act falsy? Because coercions of document.all to boolean were almost always used as a means of detecting old, nonstandard IE.

IE has long since come up to standards compliance, and in many cases is pushing the web forward as much or more than any other browser. But all that old if (document.all) { /* it's IE */ } code is still out there, and much of it is probably never going away. All this legacy code is still assuming it's running in decade-old IE, which just leads to bad browsing experience for IE users.

So, we can't remove document.all completely, but IE doesn't want if (document.all) { .. } code to work anymore, so that users in modern IE get new, standards-compliant code logic.

"What should we do?" **"I've got it! Let's bastardize the JS type system and pretend that document.all is falsy!" It's a crazy gotcha that most JS developers don't understand. But the alternative (doing nothing about the above no-win problems) sucks just a little bit more.

So... that's what we've got: crazy, nonstandard "falsy objects" added to JavaScript by the browsers.
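If you're curious, you can observe this exotic behavior in a modern browser console (a sketch; remember, document.all comes from the DOM/host environment, not the JS engine):

typeof document.all;		// "undefined" -- despite it still acting object-like!
Boolean( document.all );	// false

if (document.all) {
	// never runs in modern, standards-compliant browsers
}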

Truthy Values

Back to the truthy list. What exactly are the truthy values? Remember: a value is truthy if it's not on the falsy list.

Consider:

var a = "false";
var b = "0";
var c = "''";

var d = Boolean( a && b && c );

d;

What value do you expect d to have here? It's gotta be either true or false.

It's true. Why? Because despite the contents of those string values looking like falsy values, the string values themselves are all truthy, because "" is the only string value on the falsy list.

var a = [];				// empty array -- truthy or falsy?
var b = {};				// empty object -- truthy or falsy?
var c = function(){};	// empty function -- truthy or falsy?

var d = Boolean( a && b && c );

d;

Yep, you guessed it, d is still true here. Why? Same reason as before. Despite what it may seem like, [], {}, and function(){} are not on the falsy list, and thus are truthy values. In other words, the truthy list is infinitely long.

Take five minutes, write the falsy list on a post-it note for your computer monitor, or memorize it if you prefer. Either way, you'll easily be able to construct a virtual truthy list whenever you need it by simply asking if it's on the falsy list or not.

The importance of truthy and falsy is in understanding how a value will behave if you coerce it to a boolean value. Now that you have those two lists in mind, we can dive into coercion examples themselves.

Explicit Coercion

Explicit coercion refers to type conversions that are obvious and explicit. There's a wide range of type conversion usage that clearly falls under the explicit coercion category for most developers.

The goal here is to identify patterns in our code where we can make it clear and obvious that we're converting a value from one type to another, so as to not leave potholes for future developers to trip into. The more explicit we are, the more likely someone later will be able to read our code and understand without undue effort what our intent was.

Explicitly: Strings <--> Numbers

We'll start with the simplest and perhaps most common coercion operation: coercing values between string and number representation.

To coerce between strings and numbers, we use the built-in String(..) and Number(..) functions, but very importantly, we do not use the new keyword in front of them. As such, we're not creating object wrappers.

var a = 42;
var b = String( a );

var c = "3.14";
var d = Number( c );

b; // "42"
d; // 3.14

String(..) coerces from any other value to a primitive string value, using the rules of the ToString operation discussed earlier. Number(..) coerces from any other value to a primitive number value, using the rules of the ToNumber operation discussed earlier.

I call this explicit coercion because in general, it's pretty obvious to most developers that the end result of these operations is the applicable type conversion.

In fact, this usage actually looks a lot like it does in some other statically typed languages.

Besides String(..) and Number(..), there are other ways to "explicitly" convert these values between string and number:

var a = 42;
var b = a.toString();

var c = "3.14";
var d = +c;

b; // "42"
d; // 3.14

Calling a.toString() is explicit (pretty clear that "toString" means "to a string"), but there's some hidden implicitness here. toString() cannot be called on a primitive value like 42. So JS automatically "boxes" 42 in an object wrapper, so that toString() can be called against the object. In other words, you might call it "explicitly implicit."

+c here is showing the unary operator form of the + operator. Instead of performing mathematic addition (or string concatenation), the unary + explicitly coerces its operand (c) to a number value.

Is +c explicit coercion? Depends on your experience and perspective. If you know that unary + is explicitly intended for number coercion, then it's pretty explicit and obvious. However, if you've never seen it before, it can seem awfully confusing, implicit, with hidden side effects, etc.

Note: The generally accepted perspective in the open-source JS community is that unary + is an accepted form of explicit coercion.

Even if you really like the +c form, there are definitely places where it can look awfully confusing. Consider:

var c = "3.14";
var d = 5+ +c;

d; // 8.14

The unary - operator also coerces like + does, but it also flips the sign of the number. However, you cannot put two -- next to each other to unflip the sign, as that's parsed as the decrement operator. Instead, you would need to do: - -"3.14" with a space in between, and that would result in coercion to 3.14.
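To illustrate:

var c = "3.14";

- -c;		// 3.14

// --c;		// would be parsed as the decrement operator, not double negation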

You should strongly consider avoiding unary + (or -) coercion when it's immediately adjacent to other operators. While the above works, it would almost universally be considered a bad idea. Even d = +c (or d =+ c for that matter!) can far too easily be confused for d += c, which is entirely different!

Note: Another extremely confusing place for unary + to be used adjacent to another operator would be the ++ increment operator and -- decrement operator. For example: a +++b, a + ++b, and a + + +b.

Remember, we're trying to be explicit and reduce confusion, not make it much worse!

Date To number

Another common usage of the unary + operator is to coerce a Date object into a number, because the result is the unix timestamp (milliseconds elapsed since 1 January 1970 00:00:00 UTC) representation:

var d = new Date( "Mon, 18 Aug 2014 08:53:06 CDT" );

+d; // 1408369986000

The most common usage of this idiom is to get the current now moment as a timestamp, such as:

var timestamp = +new Date();

Note: Some developers are aware of a peculiar syntactic "trick" in JavaScript, which is that the () set on a constructor call (a function called with new) is optional if there are no arguments to pass. So you may run across the var timestamp = +new Date; form. However, not all developers agree that omitting the () improves readability, as it's an uncommon syntax exception that only applies to the new fn() call form and not the regular fn() call form.

But coercion is not the only way to get the timestamp out of a Date object. A noncoercion approach is perhaps even preferable, as it's even more explicit:

var timestamp = new Date().getTime();
// var timestamp = (new Date()).getTime();
// var timestamp = (new Date).getTime();

But an even more preferable noncoercion option is to use the ES5 added Date.now() static function:

var timestamp = Date.now();

I'd recommend skipping the coercion forms related to dates. Use Date.now() for current now timestamps, and new Date( .. ).getTime() for getting a timestamp of a specific non-now date/time that you need to specify.

The Curious Case of the ~

One coercive JS operator that is often overlooked and frequently misunderstood is the tilde ~ operator (aka "bitwise NOT"). Many of those who even understand what it does will oftentimes still want to avoid it. But sticking to the spirit of our approach in this book and series, let's dig into it to find out if ~ has anything useful to give us.

In the "32-bit (Signed) Integers" section of Chapter 2, we covered how bitwise operators in JS are defined only for 32-bit operations, which means they force their operands to conform to 32-bit value representations. The rules for how this happens are controlled by the ToInt32 abstract operation.

ToInt32 first does a ToNumber coercion, which means if the value is "123", it's going to first become 123 before the ToInt32 rules are applied.

While not technically coercion itself (since the type doesn't change!), using bitwise operators (like | or ~) with certain special number values produces a coercive effect that results in a different number value.

For example, let's first consider the | "bitwise OR" operator used in the otherwise no-op idiom 0 | x, which essentially only does the ToInt32 conversion:

0 | -0;			// 0
0 | NaN;		// 0
0 | Infinity;	// 0
0 | -Infinity;	// 0

These special numbers aren't 32-bit representable (since they come from the 64-bit IEEE 754 standard), so ToInt32 just specifies 0 as the result from these values.

It's debatable if 0 | __ is an explicit form of this coercive ToInt32 operation or if it's more implicit. From the spec perspective, it's unquestionably explicit, but if you don't understand bitwise operations at this level, it can seem a bit more implicitly magical. Nevertheless, consistent with other assertions in this chapter, we will call it explicit.

So, let's turn our attention back to ~. The ~ operator first "coerces" to a 32-bit number value, and then performs a bitwise negation (flipping each bit's parity).

Note: This is very similar to how ! not only coerces its value to boolean but also flips its parity.

Another way of thinking about the definition of ~ comes from old-school computer science/discrete mathematics: ~ performs two's complement. Great, thanks, that's totally clearer!

Let's try again: ~x is roughly the same as -(x+1). That's weird, but slightly easier to reason about. So:

~42;	// -(42+1) ==> -43

You're probably still wondering what the heck all this ~ stuff is about, or why it really matters for a coercion discussion. Let's quickly get to the point.

Consider -(x+1). What's the only value that you can perform that operation on that will produce a 0 (or -0 technically!) result? -1. In other words, ~ used with a range of number values will produce a falsy (easily coercible to false) 0 value for the -1 input value, and any other truthy number otherwise.

-1 is commonly called a "sentinel value," which basically means a value that's given an arbitrary semantic meaning within the greater set of values of its same type (numbers).

JavaScript adopted this precedent when defining the string operation indexOf(..), which searches for a substring and if found returns its zero-based index position, or -1 if not found.

It's pretty common to try to use indexOf(..) not just as an operation to get the position, but as a boolean check of presence/absence of a substring in another string. Here's how developers usually perform such checks:

var a = "Hello World";

if (a.indexOf( "lo" ) >= 0) {	// true
	// found it!
}
if (a.indexOf( "lo" ) != -1) {	// true
	// found it
}

if (a.indexOf( "ol" ) < 0) {	// true
	// not found!
}
if (a.indexOf( "ol" ) == -1) {	// true
	// not found!
}

I find it kind of gross to look at >= 0 or == -1. It's basically a "leaky abstraction," in that it's leaking underlying implementation behavior -- the usage of sentinel -1 for "failure" -- into my code. I would prefer to hide such a detail.

And now, finally, we see why ~ could help us! Using ~ with indexOf() "coerces" (actually just transforms) the value to be appropriately boolean-coercible:

var a = "Hello World";

~a.indexOf( "lo" );			// -4   <-- truthy!

if (~a.indexOf( "lo" )) {	// true
	// found it!
}

~a.indexOf( "ol" );			// 0    <-- falsy!
!~a.indexOf( "ol" );		// true

if (!~a.indexOf( "ol" )) {	// true
	// not found!
}

~ takes the return value of indexOf(..) and transforms it: for the "failure" -1 we get the falsy 0, and every other value is truthy.

Note: The -(x+1) pseudo-algorithm for ~ would imply that ~-1 is -0, but actually it produces 0 because the underlying operation is actually bitwise, not mathematic.

Technically, if (~a.indexOf(..)) is still relying on implicit coercion of its resultant 0 to false or nonzero to true. But overall, ~ still feels to me more like an explicit coercion mechanism, as long as you know what it's intended to do in this idiom.

I find this to be cleaner code than the previous >= 0 / == -1 clutter.

Truncating Bits

There's one more place ~ may show up in code you run across: some developers use the double tilde ~~ to truncate the decimal part of a number (i.e., "coerce" it to a whole number "integer"). It's commonly (though mistakenly) said this is the same result as calling Math.floor(..).

How ~~ works is that the first ~ applies the ToInt32 "coercion" and does the bitwise flip, and then the second ~ does another bitwise flip, flipping all the bits back to the original state. The end result is just the ToInt32 "coercion" (aka truncation).

Note: The bitwise double-flip of ~~ is very similar to the parity double-negate !! behavior, explained in the "Explicitly: * --> Boolean" section later.

However, ~~ needs some caution/clarification. First, it only works reliably on 32-bit values. But more importantly, it doesn't work the same on negative numbers as Math.floor(..) does!

Math.floor( -49.6 );	// -50
~~-49.6;				// -49

Setting the Math.floor(..) difference aside, ~~x can truncate to a (32-bit) integer. But so does x | 0, and seemingly with (slightly) less effort.

So, why might you choose ~~x over x | 0, then? Operator precedence (see Chapter 5):

~~1E20 / 10;		// 166199296

1E20 | 0 / 10;		// 1661992960
(1E20 | 0) / 10;	// 166199296

Just as with all other advice here, use ~ and ~~ as explicit mechanisms for "coercion" and value transformation only if everyone who reads/writes such code is properly aware of how these operators work!

Explicitly: Parsing Numeric Strings

A similar outcome to coercing a string to a number can be achieved by parsing a number out of a string's character contents. There are, however, distinct differences between this parsing and the type conversion we examined above.

var a = "42";
var b = "42px";

Number( a );	// 42
parseInt( a );	// 42

Number( b );	// NaN
parseInt( b );	// 42

Parsing a numeric value out of a string is tolerant of non-numeric characters -- it just stops parsing left-to-right when encountered -- whereas coercion is not tolerant and fails resulting in the NaN value.

Parsing should not be seen as a substitute for coercion. These two tasks, while similar, have different purposes. Parse a string as a number when you don't know/care what other non-numeric characters there may be on the right-hand side. Coerce a string (to a number) when the only acceptable values are numeric and something like "42px" should be rejected as a number.

Tip: parseInt(..) has a twin, parseFloat(..), which pulls out a floating-point number from a string.

Don't forget that parseInt(..) operates on string values. It makes absolutely no sense to pass a number value to parseInt(..). Nor would it make sense to pass any other type of value, like true, function(){..} or [1,2,3].

If you pass a non-string, the value you pass will automatically be coerced to a string first, which would clearly be a kind of hidden implicit coercion. It's a really bad idea to rely upon such a behavior in your program, so never use parseInt(..) with a non-string value.

As of ES5, parseInt(..) no longer guesses octal. Unless you say otherwise, it assumes base-10 (or base-16 for "0x" prefixes). That's much nicer. Just be careful if your code has to run in pre-ES5 environments, in which case you still need to pass 10 for the radix.
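A quick sketch of the radix behavior:

parseInt( "42", 10 );	// 42
parseInt( "08" );		// 8 as of ES5
parseInt( "08", 10 );	// 8 -- explicit radix, safe even in pre-ES5 engines (which may have guessed octal and returned 0)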

Parsing Non-Strings

One somewhat infamous example of parseInt(..)'s behavior was highlighted in a sarcastic joke post a few years ago, poking fun at this JS behavior:

parseInt( 1/0, 19 ); // 18

The assumptive (but totally invalid) assertion was, "If I pass in Infinity, and parse an integer out of that, I should get Infinity back, not 18." Surely, JS must be crazy for this outcome, right?

Though this example is obviously contrived and unreal, let's indulge the madness for a moment and examine whether JS really is that crazy.

First off, the most obvious sin committed here is to pass a non-string to parseInt(..). That's a no-no. Do it and you're asking for trouble. But even if you do, JS politely coerces what you pass in into a string that it can try to parse.

Some would argue that this is unreasonable behavior, and that parseInt(..) should refuse to operate on a non-string value. Should it perhaps throw an error? That would be very Java-like, frankly. I shudder at thinking JS should start throwing errors all over the place so that try..catch is needed around almost every line.

Should it return NaN? Maybe. But... what about:

parseInt( new String( "42") );

Should that fail, too? It's a non-string value. If you want that String object wrapper to be unboxed to "42", then is it really so unusual for 42 to first become "42" so that 42 can be parsed back out?

I would argue that this half-explicit, half-implicit coercion that can occur can often be a very helpful thing. For example:

var a = {
	num: 21,
	toString: function() { return String( this.num * 2 ); }
};

parseInt( a ); // 42

The fact that parseInt(..) forcibly coerces its value to a string to perform the parse on is quite sensible. If you pass in garbage, and you get garbage back out, don't blame the trash can -- it just did its job faithfully.

So, if you pass in a value like Infinity (the result of 1 / 0 obviously), what sort of string representation would make the most sense for its coercion? Only two reasonable choices come to mind: "Infinity" and "∞". JS chose "Infinity". I'm glad it did.

I think it's a good thing that all values in JS have some sort of default string representation, so that they aren't mysterious black boxes that we can't debug and reason about.

Now, what about base-19? Obviously, completely bogus and contrived. No real JS programs use base-19. It's absurd. In base-19, the valid numeric characters are 0 - 9 and a - i (case insensitive).

So, back to our parseInt( 1/0, 19 ) example. It's essentially parseInt( "Infinity", 19 ). How does it parse? The first character is "I", which is value 18 in the silly base-19. The second character "n" is not in the valid set of numeric characters, and as such the parsing simply politely stops, just like when it ran across "p" in "42px".

The result? 18. Exactly like it sensibly should be. The behaviors involved to get us there, and not to an error or to Infinity itself, are very important to JS, and should not be so easily discarded.

Other examples of this behavior with parseInt(..) that may be surprising but are quite sensible include:

parseInt( 0.000008 );		// 0   ("0" from "0.000008")
parseInt( 0.0000008 );		// 8   ("8" from "8e-7")
parseInt( false, 16 );		// 250 ("fa" from "false")
parseInt( parseInt, 16 );	// 15  ("f" from "function..")

parseInt( "0x10" );			// 16
parseInt( "103", 2 );		// 2

parseInt(..) is actually pretty predictable and consistent in its behavior. If you use it correctly, you'll get sensible results. If you use it incorrectly, the crazy results you get are not the fault of JavaScript.

Explicitly: * --> Boolean

Now, let's examine coercing from any non-boolean value to a boolean.

Just like with String(..) and Number(..) above, Boolean(..) (without the new, of course!) is an explicit way of forcing the ToBoolean coercion:

var a = "0";
var b = [];
var c = {};

var d = "";
var e = 0;
var f = null;
var g;

Boolean( a ); // true
Boolean( b ); // true
Boolean( c ); // true

Boolean( d ); // false
Boolean( e ); // false
Boolean( f ); // false
Boolean( g ); // false

While Boolean(..) is clearly explicit, it's not at all common or idiomatic.

Just like the unary + operator coerces a value to a number, the unary ! negate operator explicitly coerces a value to a boolean. The problem is that it also flips the value from truthy to falsy or vice versa. So, the most common way JS developers explicitly coerce to boolean is to use the !! double-negate operator, because the second ! will flip the parity back to the original:

var a = "0";
var b = [];
var c = {};

var d = "";
var e = 0;
var f = null;
var g;

!!a;	// true
!!b;	// true
!!c;	// true

!!d;	// false
!!e;	// false
!!f;	// false
!!g;	// false

Any of these ToBoolean coercions would happen implicitly without the Boolean(..) or !!, if used in a boolean context such as an if (..) .. statement. But the goal here is to explicitly force the value to a boolean to make it clearer that the ToBoolean coercion is intended.

Another example use-case for explicit ToBoolean coercion is if you want to force a true/false value coercion in the JSON serialization of a data structure:

var a = [
	1,
	function(){ /*..*/ },
	2,
	function(){ /*..*/ }
];

JSON.stringify( a ); // "[1,null,2,null]"

JSON.stringify( a, function(key,val){
	if (typeof val == "function") {
		// force `ToBoolean` coercion of the function
		return !!val;
	}
	else {
		return val;
	}
} );
// "[1,true,2,true]"

If you come to JavaScript from Java, you may recognize this idiom:

var a = 42;

var b = a ? true : false;

The ? : ternary operator will test a for truthiness, and based on that test will either assign true or false to b, accordingly.

On its surface, this idiom looks like a form of explicit ToBoolean-type coercion, since it's obvious that only either true or false come out of the operation.

However, there's a hidden implicit coercion, in that the a expression has to first be coerced to boolean to perform the truthiness test. I'd call this idiom "explicitly implicit." Furthermore, I'd suggest you should avoid this idiom completely in JavaScript. It offers no real benefit, and worse, masquerades as something it's not.

Boolean(a) and !!a are far better as explicit coercion options.

Implicit Coercion

Implicit coercion refers to type conversions that are hidden, with non-obvious side-effects that implicitly occur from other actions. In other words, implicit coercions are any type conversions that aren't obvious (to you).

While it's clear what the goal of explicit coercion is (making code explicit and more understandable), it might be too obvious that implicit coercion has the opposite goal: making code harder to understand.

Taken at face value, I believe that's where much of the ire towards coercion comes from. The majority of complaints about "JavaScript coercion" are actually aimed at implicit coercion.

Note: Douglas Crockford, author of "JavaScript: The Good Parts", has claimed in many conference talks and writings that JavaScript coercion should be avoided. But what he seems to mean is that implicit coercion is bad (in his opinion). However, if you read his own code, you'll find plenty of examples of coercion, both implicit and explicit! In truth, his angst seems to primarily be directed at the == operation, but as you'll see in this chapter, that's only part of the coercion mechanism.

Let's take a different perspective on what implicit coercion is, and can be, than just that it's "the opposite of the good explicit kind of coercion." That's far too narrow and misses an important nuance.

Let's define the goal of implicit coercion as: to reduce verbosity, boilerplate, and/or unnecessary implementation detail that clutters up our code with noise that distracts from the more important intent.

Simplifying Implicitly

Before we even get to JavaScript, let me suggest something pseudo-code'ish from some theoretical strongly typed language to illustrate:

SomeType x = SomeType( AnotherType( y ) )

In this example, I have some arbitrary type of value in y that I want to convert to the SomeType type. The problem is, this language can't go directly from whatever y currently is to SomeType. It needs an intermediate step, where it first converts to AnotherType, and then from AnotherType to SomeType.

Now, what if that language did just let you say:

SomeType x = SomeType( y )

Wouldn't you generally agree that we simplified the type conversion here to reduce the unnecessary "noise" of the intermediate conversion step? I mean, is it really all that important to see and deal with the fact that y goes to AnotherType first before then going to SomeType?

Some would argue, at least in some circumstances, yes. But I think an equally strong argument can be made that in many other circumstances, the simplification actually aids the readability of the code by abstracting or hiding away such details, either in the language itself or in our own abstractions.

Undoubtedly, behind the scenes, somewhere, the intermediate conversion step is still happening. But if that detail is hidden from view here, we can just reason about getting y to type SomeType as a generic operation and hide the messy details.

While not a perfect analogy, what I'm going to argue throughout the rest of this chapter is that JS implicit coercion can be thought of as providing a similar aid to your code.

But, and this is very important, that is not an unbounded, absolute statement. There are definitely plenty of evils lurking around implicit coercion, that will harm your code much more than any potential readability improvements. Clearly, we have to learn how to avoid such constructs so we don't poison our code with all manner of bugs.

Many developers believe that if a mechanism can do some useful thing A but can also be abused or misused to do some awful thing Z, then we should throw out that mechanism altogether, just to be safe.

My encouragement to you is: don't settle for that. Don't "throw the baby out with the bathwater." Don't assume implicit coercion is all bad because all you think you've ever seen is its "bad parts." I think there are "good parts" here, and I want to help and inspire more of you to find and embrace them!

Implicitly: Strings <--> Numbers

Earlier we explored explicitly coercing between string and number values. Now, let's explore the same task but with implicit coercion approaches. But before we do, we have to examine some nuances of operations that will implicitly force coercion.

The + operator is overloaded to serve the purposes of both number addition and string concatenation. So how does JS know which type of operation you want to use? Consider:

var a = "42";
var b = "0";
a + b; // "420"

var c = 42;
var d = 0;
c + d; // 42

What's different that causes "420" vs 42? It's a common misconception that the difference is whether one or both of the operands is a string, as that means + will assume string concatenation. While that's partially true, it's more complicated than that.

var a = [1,2];
var b = [3,4];

a + b; // "1,23,4"

Neither of these operands is a string, but clearly they were both coerced to strings and then the string concatenation kicked in. So what's really going on?


The + algorithm (when an object value is an operand) will concatenate if either operand is already a string, or if the following steps produce a string representation. So, when + receives an object (including array) for either operand, it first calls the ToPrimitive abstract operation on the value, which then calls the [[DefaultValue]] algorithm with a context hint of number.

If you're paying close attention, you'll notice that this operation is now identical to how the ToNumber abstract operation handles objects. The valueOf() operation on the array will fail to produce a simple primitive, so it then falls to a toString() representation. The two arrays thus become "1,2" and "3,4", respectively. Now, + concatenates the two strings as you'd normally expect: "1,23,4".


Let's set aside those messy details and go back to an earlier, simplified explanation: if either operand to + is a string (or becomes one with the above steps!), the operation will be string concatenation. Otherwise, it's always numeric addition.

Note: A commonly cited coercion gotcha is [] + {} vs. {} + [], as those two expressions result, respectively, in "[object Object]" and 0. There's more to it, though, and we cover those details in "Blocks" in Chapter 5.
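
Concretely:

[] + {};	// "[object Object]"
{} + [];	// 0 -- here {} is parsed as an empty block, not an object (details in Chapter 5)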

What's that mean for implicit coercion?

You can coerce a number to a string simply by "adding" the number and the "" empty string:

var a = 42;
var b = a + "";

b; // "42"

Tip: Numeric addition with the + operator is commutative, which means 2 + 3 is the same as 3 + 2. String concatenation with + is obviously not generally commutative, but with the specific case of "", it's effectively commutative, as a + "" and "" + a will produce the same result.

It's extremely common/idiomatic to (implicitly) coerce number to string with a + "" operation. In fact, interestingly, even some of the most vocal critics of implicit coercion still use that approach in their own code, instead of one of its explicit alternatives.

I think this is a great example of a useful form of implicit coercion, despite how frequently the mechanism gets criticized!

Comparing this implicit coercion of a + "" to our earlier example of String(a) explicit coercion, there's one additional quirk to be aware of. Because of how the ToPrimitive abstract operation works, a + "" invokes valueOf() on the a value, whose return value is then finally converted to a string via the internal ToString abstract operation. But String(a) just invokes toString() directly.

Both approaches ultimately result in a string, but if you're using an object instead of a regular primitive number value, you may not necessarily get the same string value!

Consider:

var a = {
	valueOf: function() { return 42; },
	toString: function() { return 4; }
};

a + "";			// "42"

String( a );	// "4"

Generally, this sort of gotcha won't bite you unless you're really trying to create confusing data structures and operations, but you should be careful if you're defining both your own valueOf() and toString() methods for some object, as how you coerce the value could affect the outcome.

What about the other direction? How can we implicitly coerce from string to number?

var a = "3.14";
var b = a - 0;

b; // 3.14

The - operator is defined only for numeric subtraction, so a - 0 forces a's value to be coerced to a number. While far less common, a * 1 or a / 1 would accomplish the same result, as those operators are also only defined for numeric operations.

What about object values with the - operator? Similar story as for + above:

var a = [3];
var b = [1];

a - b; // 2

Both array values have to become numbers, but they end up first being coerced to strings (using the expected toString() serialization), and then coerced to numbers, so that the - subtraction can be performed.

So, is implicit coercion of string and number values the ugly evil you've always heard horror stories about? I don't personally think so.

Compare b = String(a) (explicit) to b = a + "" (implicit). I think cases can be made for both approaches being useful in your code. Certainly b = a + "" is quite a bit more common in JS programs, proving its own utility regardless of feelings about the merits or hazards of implicit coercion in general.

Implicitly: Booleans --> Numbers

I think a case where implicit coercion can really shine is in simplifying certain types of complicated boolean logic into simple numeric addition. Of course, this is not a general-purpose technique, but a specific solution for specific cases.

function onlyOne(a,b,c) {
	return !!((a && !b && !c) ||
		(!a && b && !c) || (!a && !b && c));
}

var a = true;
var b = false;

onlyOne( a, b, b );	// true
onlyOne( b, a, b );	// true

onlyOne( a, b, a );	// false

This onlyOne(..) utility should only return true if exactly one of the arguments is true / truthy. It's using implicit coercion on the truthy checks and explicit coercion on the others, including the final return value.

But what if we needed that utility to be able to handle four, five, or twenty flags in the same way? It's pretty difficult to imagine implementing code that would handle all those permutations of comparisons.

But here's where coercing the boolean values to numbers (0 or 1, obviously) can greatly help:

function onlyOne() {
	var sum = 0;
	for (var i=0; i < arguments.length; i++) {
		// skip falsy values. same as treating
		// them as 0's, but avoids NaN's.
		if (arguments[i]) {
			sum += arguments[i];
		}
	}
	return sum == 1;
}

var a = true;
var b = false;

onlyOne( b, a );		// true
onlyOne( b, a, b, b, b );	// true

onlyOne( b, b );		// false
onlyOne( b, a, b, b, b, a );	// false

What we're doing here is relying on the 1 for true/truthy coercions, and numerically adding them all up. sum += arguments[i] uses implicit coercion to make that happen. If one and only one value in the arguments list is true, then the numeric sum will be 1, otherwise the sum will not be 1 and thus the desired condition is not met.

We could of course do this with explicit coercion instead:

function onlyOne() {
	var sum = 0;
	for (var i=0; i < arguments.length; i++) {
		sum += Number( !!arguments[i] );
	}
	return sum === 1;
}

We first use !!arguments[i] to force the coercion of the value to true or false. That's so you could pass non-boolean values in, like onlyOne( "42", 0 ), and it would still work as expected (otherwise you'd end up with string concatenation and the logic would be incorrect).

Once we're sure it's a boolean, we do another explicit coercion with Number(..) to make sure the value is 0 or 1.

Is the explicit coercion form of this utility "better"? It does avoid the NaN trap, as explained in the code comments. But, ultimately, it depends on your needs. I personally think the former version, relying on implicit coercion, is more elegant (if you won't be passing undefined or NaN), and the explicit version is needlessly more verbose.

Note: Regardless of implicit or explicit approaches, you could easily make onlyTwo(..) or onlyFive(..) variations by simply changing the final comparison from 1, to 2 or 5, respectively. That's drastically easier than adding a bunch of && and || expressions. So, generally, coercion is very helpful in this case.
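
For instance, here's a sketch of a hypothetical onlyN(..) generalization (not one of the earlier utilities) that takes the expected count as its first argument, relying on the same boolean-to-number coercion:

function onlyN(n) {
	var sum = 0;
	// start at 1 to skip the `n` argument itself
	for (var i=1; i < arguments.length; i++) {
		// skip falsy values, as before
		if (arguments[i]) {
			sum += arguments[i];
		}
	}
	return sum == n;
}

onlyN( 2, true, true, false );	// true
onlyN( 2, true, true, true );	// false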

Implicitly: * --> Boolean

Now, let's turn our attention to implicit coercion to boolean values, as it's by far the most common and also by far the most potentially troublesome.

Remember, implicit coercion is what kicks in when you use a value in such a way that it forces the value to be converted. For numeric and string operations, it's fairly easy to see how the coercions can occur.

But, what sort of expression operations require/force (implicitly) a boolean coercion?

  1. The test expression in an if (..) statement.
  2. The test expression in a for ( .. ; .. ; .. ) header.
  3. The test expression in while (..) and do..while(..) loops.
  4. The test expression in ? : ternary expressions.
  5. The left-hand operand (which serves as a test expression) to the || ("logical or") and && ("logical and") operators.

Any value used in these contexts that is not already a boolean will be implicitly coerced to a boolean using the rules of the ToBoolean abstract operation covered earlier in this chapter.

var a = 42;
var b = "abc";
var c;
var d = null;

if (a) {
	console.log( "yep" );		// yep
}

while (c) {
	console.log( "nope, never runs" );
}

c = d ? a : b;
c;					// "abc"

if ((a && d) || c) {
	console.log( "yep" );		// yep
}

In all these contexts, the non-boolean values are implicitly coerced to their boolean equivalents to make the test decisions.

Operators || and &&

It's quite likely that you have seen the || ("logical or") and && ("logical and") operators in most or all other languages you've used.

In fact, I would argue these operators shouldn't even be called "logical ___ operators", as that name is incomplete in describing what they do. If I were to give them a more accurate name, I'd call them "selector operators," or more completely, "operand selector operators."

Why? Because they don't actually result in a logic value (aka boolean) in JavaScript, as they do in some other languages.

They result in the value of one (and only one) of their two operands. In other words, they select one of the two operand's values.

Quoting the ES5 spec from section 11.11:

The value produced by a && or || operator is not necessarily of type Boolean. The value produced will always be the value of one of the two operand expressions.

var a = 42;
var b = "abc";
var c = null;

a || b;		// 42
a && b;		// "abc"

c || b;		// "abc"
c && b;		// null

Wait, what!? Think about that. In languages like C and PHP, those expressions result in true or false, but in JS (and Python and Ruby), the result comes from the values themselves.

Both || and && operators perform a boolean test on the first operand (a or c). If the operand is not already boolean, a normal ToBoolean coercion occurs, so that the test can be performed.

For the || operator, if the test is true, the || expression results in the value of the first operand (a or c). If the test is false, the || expression results in the value of the second operand (b).

Inversely, for the && operator, if the test is true, the && expression results in the value of the second operand (b). If the test is false, the && expression results in the value of the first operand (a or c).

The result of a || or && expression is always the underlying value of one of the operands, not the (possibly coerced) result of the test. In c && b, c is null, and thus falsy. But the && expression itself results in null (the value in c), not in the coerced false used in the test.

Do you see how these operators act as "operand selectors", now?

Another way of thinking about these operators:

a || b;
// roughly equivalent to:
a ? a : b;

a && b;
// roughly equivalent to:
a ? b : a;

Note: I call a || b "roughly equivalent" to a ? a : b because the outcome is identical, but there's a nuanced difference. In a ? a : b, if a was a more complex expression (like for instance one that might have side effects like calling a function, etc.), then the a expression would possibly be evaluated twice (if the first evaluation was truthy). By contrast, for a || b, the a expression is evaluated only once, and that value is used both for the coercive test as well as the result value (if appropriate). The same nuance applies to the a && b and a ? b : a expressions.
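
To see that nuance in action, consider a sketch where the first operand is a (made-up) function call with a side effect:

var count = 0;

function pick() {
	count++;
	return 42;
}

pick() || "other";			// 42 -- `pick()` evaluated once
count;						// 1

count = 0;
pick() ? pick() : "other";	// 42 -- `pick()` evaluated twice
count;						// 2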

An extremely common and helpful usage of this behavior, which there's a good chance you may have used before and not fully understood, is:

function foo(a,b) {
	a = a || "hello";
	b = b || "world";

	console.log( a + " " + b );
}

foo();					// "hello world"
foo( "yeah", "yeah!" );	// "yeah yeah!"

The a = a || "hello" idiom acts to test a and if it has no value (or only an undesired falsy value), provides a backup default value ("hello").

Be careful, though!

foo( "That's it!", "" ); // "That's it! world" <-- Oops!

See the problem? "" as the second argument is a falsy value, so the b = b || "world" test fails, and the "world" default value is substituted, even though the intent probably was to have the explicitly passed "" be the value assigned to b.

This || idiom is extremely common, and quite helpful, but you have to use it only in cases where all falsy values should be skipped. Otherwise, you'll need to be more explicit in your test, and probably use a ? : ternary instead.
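
One such more explicit alternative (just a sketch) applies the default only when the argument is actually undefined, letting an intentional falsy value like "" through:

function foo(a,b) {
	a = (a !== undefined) ? a : "hello";
	b = (b !== undefined) ? b : "world";

	console.log( a + " " + b );
}

foo( "That's it!", "" );	// "That's it! " -- the "" is kept this time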

This default value assignment idiom is so common (and useful!) that even those who publicly and vehemently decry JavaScript coercion often use it in their own code!

What about &&?

There's another idiom that is quite a bit less commonly authored manually, but which is used by JS minifiers frequently. The && operator "selects" the second operand if and only if the first operand tests as truthy, and this usage is sometimes called the "guard operator" -- the first expression test "guards" the second expression:

function foo() {
	console.log( a );
}

var a = 42;

a && foo(); // 42

foo() gets called only because a tests as truthy. If that test failed, this a && foo() expression statement would just silently stop -- this is known as "short circuiting" -- and never call foo().

Again, it's not nearly as common for people to author such things. Usually, they'd do if (a) { foo(); } instead. But JS minifiers choose a && foo() because it's much shorter. So, now, if you ever have to decipher such code, you'll know what it's doing and why.

OK, so || and && have some neat tricks up their sleeve, as long as you're willing to allow the implicit coercion into the mix.

Note: Both the a = b || "something" and a && b() idioms rely on short circuiting behavior, which we cover in more detail in Chapter 5.

The fact that these operators don't actually result in true and false is possibly messing with your head a little bit by now. You're probably wondering how all your if statements and for loops have been working, if they've included compound logical expressions like a && (b || c).

Don't worry! It's just that you probably never realized before that there was an implicit coercion to boolean going on after the compound expression was evaluated.

var a = 42;
var b = null;
var c = "foo";

if (a && (b || c)) {
	console.log( "yep" );
}

This code still works the way you always thought it did, except for one subtle extra detail. The a && (b || c) expression actually results in "foo", not true. So, the if statement then forces the "foo" value to coerce to a boolean, which of course will be true.

And now you also realize that such code is using implicit coercion. If you're in the "avoid (implicit) coercion camp" still, you're going to need to go back and make all of those tests explicit:

if (!!a && (!!b || !!c)) {
	console.log( "yep" );
}

Good luck with that! ... Sorry, just teasing.

Symbol Coercion

Up to this point, there's been almost no observable outcome difference between explicit and implicit coercion -- only the readability of code has been at stake.

But ES6 Symbols introduce a gotcha into the coercion system that we need to discuss briefly. Explicit coercion of a symbol to a string is allowed, but implicit coercion of the same is disallowed and throws an error.

var s1 = Symbol( "cool" );
String( s1 );					// "Symbol(cool)"

var s2 = Symbol( "not cool" );
s2 + "";						// TypeError

Symbol values cannot coerce to number at all (it throws an error either way), but strangely they can both explicitly and implicitly coerce to boolean (always true).
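
For example:

var s = Symbol( "cool" );

Boolean( s );	// true
!!s;			// true

Number( s );	// TypeError
s * 2;			// TypeError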

Consistency is always easier to learn, and exceptions are never fun to deal with, but we just need to be careful around the new ES6 symbol values and how we coerce them.

Loose Equals vs. Strict Equals

Loose equals is the == operator, and strict equals is the === operator. Both operators are used for comparing two values for "equality," but the "loose" vs. "strict" indicates a very important difference in behavior between the two, specifically in how they decide "equality."

A very common misconception about these two operators is: "== checks values for equality and === checks both values and types for equality." While that sounds nice and reasonable, it's inaccurate. Countless well-respected JavaScript books and blogs have said exactly that, but unfortunately they're all wrong.

The correct description is: "== allows coercion in the equality comparison and === disallows coercion."

Equality Performance

Stop and think about the difference between the first (inaccurate) explanation and this second (accurate) one.

In the first explanation, it seems obvious that === is doing more work than ==, because it has to also check the type. In the second explanation, == is the one doing more work because it has to follow through the steps of coercion if the types are different.

Don't fall into the trap, as many have, of thinking this has anything to do with performance, though, as if == is going to be slower than === in any relevant way. While it's measurable that coercion does take a little bit of processing time, it's mere microseconds.

If you're comparing two values of the same types, == and === use the identical algorithm, and so other than minor differences in engine implementation, they should do the same work.

If you're comparing two values of different types, the performance isn't the important factor. What you should be asking yourself is: when comparing these two values, do I want coercion or not?

If you want coercion, use == loose equality, but if you don't want coercion, use === strict equality.

Note: The implication here then is that both == and === check the types of their operands. The difference is in how they respond if the types don't match.

Abstract Equality

The == operator's behavior is defined as "The Abstract Equality Comparison Algorithm" in the ES5 spec. What's listed there is a comprehensive but simple algorithm that explicitly states every possible combination of types, and how the coercions (if necessary) should happen for each combination.

Warning: When (implicit) coercion is maligned as being too complicated and too flawed to be a useful good part, it is these rules of "abstract equality" that are being condemned. Generally, they are said to be too complex and too unintuitive for developers to practically learn and use, and that they are prone more to causing bugs in JS programs than to enabling greater code readability. I believe this premise is flawed -- you readers are competent developers who write (and read and understand!) algorithms (aka code) all day long. So, what follows is a plain exposition of the "abstract equality" in simple terms. But I implore you to also read the ES5 spec section 11.9.3. I think you'll be surprised at just how reasonable it is.

Basically, the first clause (11.9.3.1) says, if the two values being compared are of the same type, they are simply and naturally compared via Identity as you'd expect. For example, 42 is only equal to 42, and "abc" is only equal to "abc".

Some minor exceptions to normal expectation to be aware of:

  • NaN is never equal to itself
  • +0 and -0 are equal to each other
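
For example:

NaN == NaN;		// false
+0 == -0;		// true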

The final provision in clause 11.9.3.1 is for == loose equality comparison with objects (including functions and arrays). Two such values are only equal if they are both references to the exact same value. No coercion occurs here.

Note: The === strict equality comparison is defined identically to 11.9.3.1, including the provision about two object values. It's a little-known fact that == and === behave identically in the case where two objects are being compared!
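
For example:

var x = { a: 1 };
var y = x;
var z = { a: 1 };

x == y;		// true -- same object reference
x === y;	// true

x == z;		// false -- different objects, no coercion occurs
x === z;	// false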

The rest of the algorithm in 11.9.3 specifies that if you use == loose equality to compare two values of different types, one or both of the values will need to be implicitly coerced. This coercion happens so that both values eventually end up as the same type, which can then directly be compared for equality using simple value Identity.

Note: The != loose not-equality operation is defined exactly as you'd expect, in that it's literally the == operation comparison performed in its entirety, then the negation of the result. The same goes for the !== strict not-equality operation.

Comparing: strings to numbers

To illustrate == coercion, let's first build off the string and number examples earlier in this chapter:

var a = 42;
var b = "42";

a === b;	// false
a == b;		// true

As we'd expect, a === b fails, because no coercion is allowed, and indeed the 42 and "42" values are different.

However, the second comparison a == b uses loose equality, which means that if the types happen to be different, the comparison algorithm will perform implicit coercion on one or both values.

But exactly what kind of coercion happens here? Does the a value of 42 become a string, or does the b value of "42" become a number?

In the ES5 spec, clauses 11.9.3.4-5 say:

  1. If Type(x) is Number and Type(y) is String, return the result of the comparison x == ToNumber(y).
  2. If Type(x) is String and Type(y) is Number, return the result of the comparison ToNumber(x) == y.

Warning: The spec uses Number and String as the formal names for the types, while this book prefers number and string for the primitive types. Do not let the capitalization of Number in the spec confuse you for the Number() native function.

Clearly, the spec says the "42" value is coerced to a number for the comparison. The how of that coercion has already been covered earlier, specifically with the ToNumber abstract operation. In this case, it's quite obvious then that the resulting two 42 values are equal.

Comparing: anything to boolean

One of the biggest gotchas with the implicit coercion of == loose equality pops up when you try to compare a value directly to true or false.

Consider:

var a = "42";
var b = true;

a == b;	// false

Wait, what happened here!? We know that "42" is a truthy value. So, how come it's not == loose equal to true?

The reason is both simple and tricky. It's so easy to misunderstand, many JS developers never pay close enough attention to fully grasp it.

Let's again quote the spec, clauses 11.9.3.6-7:

  1. If Type(x) is Boolean, return the result of the comparison ToNumber(x) == y.
  2. If Type(y) is Boolean, return the result of the comparison x == ToNumber(y).

Let's break that down. First:

var x = true;
var y = "42";

x == y; // false

The Type(x) is indeed Boolean, so it performs ToNumber(x), which coerces true to 1. Now, 1 == "42" is evaluated. The types are still different, so we reconsult the algorithm, which just as above will coerce "42" to 42, and 1 == 42 is clearly false.

var x = "42";
var y = false;

x == y; // false

The Type(y) is Boolean this time, so ToNumber(y) yields 0. "42" == 0 recursively becomes 42 == 0, which is of course false.

In other words, the value "42" is neither == true nor == false. At first, that statement might seem crazy. How can a value be neither truthy nor falsy?

But that's the problem! You're asking the wrong question, entirely. It's not your fault, really. Your brain is tricking you.

"42" is indeed truthy, but "42" == true is not performing a boolean test/coercion at all, no matter what your brain says. "42" is not being coerced to a boolean (true), but instead true is being coerced to a 1, and then "42" is being coerced to 42.

Whether we like it or not, ToBoolean is not even involved here, so the truthiness or falsiness of "42" is irrelevant to the == operation!

What is relevant is to understand how the == comparison algorithm behaves with all the different type combinations. When a boolean value appears on either side of the ==, it always coerces to a number first.

If that seems strange to you, you're not alone. I personally would recommend to never, ever, under any circumstances, use == true or == false. Ever.

But remember, I'm only talking about == here. === true and === false wouldn't allow the coercion, so they're safe from this hidden ToNumber coercion.

var a = "42";

// bad (will fail!):
if (a == true) {
	// ..
}

// also bad (will fail!):
if (a === true) {
	// ..
}

// good enough (works implicitly):
if (a) {
	// ..
}

// better (works explicitly):
if (!!a) {
	// ..
}

// also great (works explicitly):
if (Boolean( a )) {
	// ..
}

If you avoid ever using == true or == false (aka loose equality with booleans) in your code, you'll never have to worry about this truthiness/falsiness mental gotcha.

Comparing: nulls to undefineds

Another example of implicit coercion can be seen with == loose equality between null and undefined values. Yet again quoting the ES5 spec, clauses 11.9.3.2-3:

  1. If x is null and y is undefined, return true.
  2. If x is undefined and y is null, return true.

null and undefined, when compared with == loose equality, equate to (aka coerce to) each other, and no other values in the entire language.

What this means is that null and undefined can be treated as indistinguishable for comparison purposes, if you use the == loose equality operator to allow their mutual implicit coercion.

var a = null;
var b;

a == b;		// true
a == null;	// true
b == null;	// true

a == false;	// false
b == false;	// false
a == "";	// false
b == "";	// false
a == 0;		// false
b == 0;		// false

The coercion between null and undefined is safe and predictable, and no other values can give false positives in such a check. I recommend using this coercion to allow null and undefined to be indistinguishable and thus treated as the same value.

var a = doSomething();

if (a == null) {
	// ..
}

The a == null check will pass only if doSomething() returns either null or undefined, and will fail with any other value, even other falsy values like 0, false, and "".

The explicit form of the check, which disallows any such coercion, is unnecessarily much uglier (and perhaps a tiny bit less performant!):

var a = doSomething();

if (a === undefined || a === null) {
	// ..
}

In my opinion, the form a == null is yet another example where implicit coercion improves code readability, but does so in a reliably safe way.

Comparing: objects to non-objects

If an object/function/array is compared to a simple scalar primitive (string, number, or boolean), the ES5 spec says in clauses 11.9.3.8-9:

  1. If Type(x) is either String or Number and Type(y) is Object, return the result of the comparison x == ToPrimitive(y).
  2. If Type(x) is Object and Type(y) is either String or Number, return the result of the comparison ToPrimitive(x) == y.

Note: You may notice that these clauses only mention String and Number, but not Boolean. That's because clauses 11.9.3.6-7 take care of coercing any Boolean operand presented to a Number first.

var a = 42;
var b = [ 42 ];

a == b;	// true

The [ 42 ] value has its ToPrimitive abstract operation called, which results in the "42" value. From there, it's just 42 == "42", which as we've already covered becomes 42 == 42, so a and b are found to be coercively equal.

Tip: All the quirks of the ToPrimitive abstract operation that we discussed earlier in this chapter (toString(), valueOf()) apply here as you'd expect. This can be quite useful if you have a complex data structure that you want to define a custom valueOf() method on, to provide a simple value for equality comparison purposes.
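
Here's a sketch of that tip, with a made-up Money wrapper whose custom valueOf() lets == comparisons against plain numbers work naturally:

function Money(cents) {
	this.cents = cents;
}

Money.prototype.valueOf = function() {
	return this.cents;
};

var price = new Money( 4200 );

price == 4200;	// true -- ToPrimitive invokes valueOf()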

In Chapter 3, we covered "unboxing," where an object wrapper around a primitive value (like from new String("abc"), for instance) is unwrapped, and the underlying primitive value ("abc") is returned. This behavior is related to the ToPrimitive coercion in the == algorithm:

var a = "abc";
var b = Object( a );	// same as `new String( a )`

a === b;				// false
a == b;					// true

a == b is true because b is coerced (aka "unboxed," unwrapped) via ToPrimitive to its underlying "abc" simple scalar primitive value, which is the same as the value in a.

There are some values where this is not the case, though, because of other overriding rules in the == algorithm.

var a = null;
var b = Object( a );	// same as `Object()`
a == b;					// false

var c = undefined;
var d = Object( c );	// same as `Object()`
c == d;					// false

var e = NaN;
var f = Object( e );	// same as `new Number( e )`
e == f;					// false

The null and undefined values cannot be boxed -- they have no object wrapper equivalent -- so Object(null) is just like Object() in that both just produce a normal object.

NaN can be boxed to its Number object wrapper equivalent, but when == causes an unboxing, the NaN == NaN comparison fails because NaN is never equal to itself.

Edge Cases

Now that we've thoroughly examined how the implicit coercion of == loose equality works, let's try to call out the worst, craziest corner cases so we can see what we need to avoid to not get bitten with coercion bugs.

First, let's examine how modifying the built-in native prototypes can produce crazy results:

A Number By Any Other Value Would...

Number.prototype.valueOf = function() {
	return 3;
};

new Number( 2 ) == 3;	// true

Warning: 2 == 3 would not have fallen into this trap, because neither 2 nor 3 would have invoked the built-in Number.prototype.valueOf() method because both are already primitive number values and can be compared directly. However, new Number(2) must go through the ToPrimitive coercion, and thus invoke valueOf().

Evil, huh? Of course it is. No one should ever do such a thing. The fact that you can do this is sometimes used as a criticism of coercion and ==. But that's misdirected frustration. JavaScript is not bad because you can do such things, a developer is bad if they do such things. Don't fall into the "my programming language should protect me from myself" fallacy.

Next, let's consider another tricky example, which takes the evil from the previous example to another level:

if (a == 2 && a == 3) {
	// ..
}

You might think this would be impossible, because a could never be equal to both 2 and 3 at the same time. But "at the same time" is inaccurate, since the first expression a == 2 happens strictly before a == 3.

So, what if we make a.valueOf() have side effects each time it's called, such that the first time it returns 2 and the second time it's called it returns 3? Pretty easy:

var i = 2;

Number.prototype.valueOf = function() {
	return i++;
};

var a = new Number( 42 );

if (a == 2 && a == 3) {
	console.log( "Yep, this happened." );
}

Again, these are evil tricks. Don't do them. But also don't use them as complaints against coercion. Potential abuses of a mechanism are not sufficient evidence to condemn the mechanism. Just avoid these crazy tricks, and stick only with valid and proper usage of coercion.

False-y Comparisons

The most common complaint against implicit coercion in == comparisons comes from how falsy values behave surprisingly when compared to each other.

To illustrate, let's look at a list of the corner-cases around falsy value comparisons, to see which ones are reasonable and which are troublesome:

"0" == null;			// false
"0" == undefined;		// false
"0" == false;			// true -- UH OH!
"0" == NaN;				// false
"0" == 0;				// true
"0" == "";				// false

false == null;			// false
false == undefined;		// false
false == NaN;			// false
false == 0;				// true -- UH OH!
false == "";			// true -- UH OH!
false == [];			// true -- UH OH!
false == {};			// false

"" == null;				// false
"" == undefined;		// false
"" == NaN;				// false
"" == 0;				// true -- UH OH!
"" == [];				// true -- UH OH!
"" == {};				// false

0 == null;				// false
0 == undefined;			// false
0 == NaN;				// false
0 == [];				// true -- UH OH!
0 == {};				// false

In this list of 24 comparisons, 17 of them are quite reasonable and predictable. For example, we know that "" and NaN are not at all equatable values, and indeed they don't coerce to be loose equals, whereas "0" and 0 are reasonably equatable and do coerce as loose equals.

However, seven of the comparisons are marked with "UH OH!" because as false positives, they are much more likely gotchas that could trip you up. "" and 0 are definitely distinctly different values, and it's rare you'd want to treat them as equatable, so their mutual coercion is troublesome. Note that there aren't any false negatives here.

The Crazy Ones

We don't have to stop there, though. We can keep looking for even more troublesome coercions:

[] == ![];		// true

Oooo, that seems at a higher level of crazy, right!? Your brain may well trick you into thinking you're comparing a truthy to a falsy value, so the true result is surprising, as we know a value can never be truthy and falsy at the same time!

But that's not what's actually happening. Let's break it down. What do we know about the ! unary operator? It explicitly coerces to a boolean using the ToBoolean rules (and it also flips the parity). So before [] == ![] is even processed, it's actually already translated to [] == false. We already saw that form in our above list (false == []), so its surprise result is not new to us.

How about other corner cases?

2 == [2];		// true
"" == [null];	// true

As we said earlier in our ToNumber discussion, the right-hand side [2] and [null] values will go through a ToPrimitive coercion so they can be more readily compared to the simple primitives (2 and "", respectively) on the left-hand side. Since the valueOf() for array values just returns the array itself, coercion falls to stringifying the array.

[2] will become "2", which then is ToNumber coerced to 2 for the right-hand side value in the first comparison. [null] just straight becomes "".

So, 2 == 2 and "" == "" are completely understandable.

If your instinct is to still dislike these results, your frustration is not actually with coercion like you probably think it is. It's actually a complaint against the default array values' ToPrimitive behavior of coercing to a string value. More likely, you'd just wish that [2].toString() didn't return "2", or that [null].toString() didn't return "".

But what exactly should these string coercions result in? I can't really think of any other appropriate string coercion of [2] than "2", except perhaps "[2]" -- but that could be very strange in other contexts!

You could rightly make the case that since String(null) becomes "null", then String([null]) should also become "null". That's a reasonable assertion. So, that's the real culprit.

Implicit coercion itself isn't the evil here. Even an explicit coercion of [null] to a string results in "". What's at odds is whether it's sensible at all for array values to stringify to the equivalent of their contents, and exactly how that happens. So, direct your frustration at the rules for String( [..] ), because that's where the craziness stems from. Perhaps there should be no stringification coercion of arrays at all? But that would have lots of other downsides in other parts of the language.

Another famously cited gotcha:

0 == "\n";		// true

As we discussed earlier with empty "", "\n" (or " " or any other whitespace combination) is coerced via ToNumber, and the result is 0. What other number value would you expect whitespace to coerce to? Does it bother you that explicit Number(" ") yields 0?

Really, the only other reasonable number value that empty strings or whitespace strings could coerce to is NaN. But would that really be better? The comparison " " == NaN would of course fail, but it's unclear that we'd have really fixed any of the underlying concerns.

The chances that a real-world JS program fails because 0 == "\n" are awfully rare, and such corner cases are easy to avoid.

Type conversions always have corner cases, in any language -- nothing specific to coercion. The issues here are about second-guessing a certain set of corner cases (and perhaps rightly so!?), but that's not a salient argument against the overall coercion mechanism.

Bottom line: almost any crazy coercion between normal values that you're likely to run into will boil down to the short seven-item list of gotcha coercions we've identified above.

To contrast against these 24 likely suspects for coercion gotchas, consider another list like this:

42 == "43";							// false
"foo" == 42;						// false
"true" == true;						// false

42 == "42";							// true
"foo" == [ "foo" ];					// true

In these nonfalsy, noncorner cases, the coercion results are totally safe, reasonable, and explainable.

Sanity Check

OK, we've definitely found some crazy stuff when we've looked deeply into implicit coercion. No wonder that most developers claim coercion is evil and should be avoided, right!?

But let's take a step back and do a sanity check.

By way of magnitude comparison, we have a list of seven troublesome gotcha coercions, but we have another list of (at least 17, but actually infinite) coercions that are totally sane and explainable.

If you're looking for a textbook example of "throwing the baby out with the bathwater," this is it: discarding the entirety of coercion (the infinitely large list of safe and useful behaviors) because of a list of literally just seven gotchas.

The more prudent reaction would be to ask, "how can I use the countless good parts of coercion, but avoid the few bad parts?"

Let's look again at the bad list:

"0" == false;			// true -- UH OH!
false == 0;				// true -- UH OH!
false == "";			// true -- UH OH!
false == [];			// true -- UH OH!
"" == 0;				// true -- UH OH!
"" == [];				// true -- UH OH!
0 == [];				// true -- UH OH!

Four of the seven items on this list involve == false comparison, which we said earlier you should always, always avoid. That's a pretty easy rule to remember.

Now the list is down to three.

"" == 0;				// true -- UH OH!
"" == [];				// true -- UH OH!
0 == [];				// true -- UH OH!

Are these reasonable coercions you'd do in a normal JavaScript program? Under what conditions would they really happen?

I don't think it's terribly likely that you'd literally use == [] in a boolean test in your program, at least not if you know what you're doing. You'd probably instead be doing == "" or == 0, like:

function doSomething(a) {
	if (a == "") {
		// ..
	}
}

You'd have an oops if you accidentally called doSomething(0) or doSomething([]). Another scenario:

function doSomething(a,b) {
	if (a == b) {
		// ..
	}
}

Again, this could break if you did something like doSomething("",0) or doSomething([],"").

So, while the situations can exist where these coercions will bite you, and you'll want to be careful around them, they're probably not super common on the whole of your code base.

Safely Using Implicit Coercion

The most important advice I can give you: examine your program and reason about what values can show up on either side of an == comparison. To effectively avoid issues with such comparisons, here are some heuristic rules to follow:

  1. If either side of the comparison can have true or false values, don't ever, EVER use ==.
  2. If either side of the comparison can have [], "", or 0 values, seriously consider not using ==.

In these scenarios, it's almost certainly better to use === instead of ==, to avoid unwanted coercion. Follow those two simple rules and pretty much all the coercion gotchas that could reasonably hurt you will effectively be avoided.

Being more explicit/verbose in these cases will save you from a lot of headaches.
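
Applying those heuristics to a quick sketch:

function isEmptyStr(a) {
	// `a` could plausibly be "", 0, or [] here, so per rule #2,
	// use === to sidestep the "" == 0 and "" == [] gotchas
	return a === "";
}

isEmptyStr( "" );	// true
isEmptyStr( 0 );	// false
isEmptyStr( [] );	// false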

The question of == vs. === is really appropriately framed as: should you allow coercion for a comparison or not?

There's lots of cases where such coercion can be helpful, allowing you to more tersely express some comparison logic (like with null and undefined, for example).

In the overall scheme of things, there are relatively few cases where implicit coercion is truly dangerous. But in those places, for safety's sake, definitely use ===.

Tip: Another place where coercion is guaranteed not to bite you is with the typeof operator. typeof is always going to return you one of seven strings, and none of them are the empty "" string. As such, there's no case where checking the type of some value is going to run afoul of implicit coercion. typeof x == "function" is 100% as safe and reliable as typeof x === "function". Literally, the spec says the algorithm will be identical in this situation. So, don't just blindly use === everywhere simply because that's what your code tools tell you to do, or (worst of all) because you've been told in some book to not think about it. You own the quality of your code.

Is implicit coercion evil and dangerous? In a few cases, yes, but overwhelmingly, no.

Be a responsible and mature developer. Learn how to use the power of coercion (both explicit and implicit) effectively and safely. And teach those around you to do the same.

Here's a handy table made by Alex Dorey (@dorey on GitHub) to visualize a variety of comparisons:

Source: https://github.com/dorey/JavaScript-Equality-Table

Abstract Relational Comparison

While this part of implicit coercion often gets a lot less attention, it's important nonetheless to think about what happens with a < b comparisons.

The "Abstract Relational Comparison" algorithm in ES5 section 11.8.5 essentially divides itself into two parts: what to do if the comparison involves both string values (second half), or anything else (first half).

Note: The algorithm is only defined for a < b. So, a > b is handled as b < a.

The algorithm first calls ToPrimitive coercion on both values, and if the return result of either call is not a string, then both values are coerced to number values using the ToNumber operation rules, and compared numerically.

var a = [ 42 ];
var b = [ "43" ];

a < b;	// true
b < a;	// false

Note: Similar caveats for -0 and NaN apply here as they did in the == algorithm discussed earlier.

However, if both values are strings for the < comparison, simple lexicographic (natural alphabetic) comparison on the characters is performed:

var a = [ "42" ];
var b = [ "043" ];

a < b;	// false

a and b are not coerced to numbers, because both of them end up as strings after the ToPrimitive coercion on the two arrays. So, "42" is compared character by character to "043", starting with the first characters "4" and "0", respectively. Since "0" is lexicographically less than "4", the comparison returns false.

The exact same behavior and reasoning goes for:

var a = [ 4, 2 ];
var b = [ 0, 4, 3 ];

a < b;	// false

Here, a becomes "4,2" and b becomes "0,4,3", and those lexicographically compare identically to the previous snippet.

var a = { b: 42 };
var b = { b: 43 };

a < b;	// ??

a < b is also false, because a becomes "[object Object]" and b becomes "[object Object]", and so clearly a is not lexicographically less than b.

var a = { b: 42 };
var b = { b: 43 };

a < b;	// false
a == b;	// false
a > b;	// false

a <= b;	// true
a >= b;	// true

Why is a == b not true? They're the same string value ("[object Object]"), so it seems they should be equal, right? Nope. Recall the previous discussion about how == works with object references.

But then how are a <= b and a >= b resulting in true, if a < b and a == b and a > b are all false?

Because the spec says for a <= b, it will actually evaluate b < a first, and then negate that result. Since b < a is also false, the result of a <= b is true.

That's probably awfully contrary to how you might have explained what <= does up to now, which would likely have been the literal: "less than or equal to." JS more accurately considers <= as "not greater than" (!(a > b), which JS treats as !(b < a)). Moreover, a >= b is explained by first considering it as b <= a, and then applying the same reasoning.

Unfortunately, there is no "strict relational comparison" as there is for equality. In other words, there's no way to prevent implicit coercion from occurring with relational comparisons like a < b, other than to ensure that a and b are of the same type explicitly before making the comparison.

Use the same reasoning from our earlier == vs. === sanity check discussion. If coercion is helpful and reasonably safe, like in a 42 < "43" comparison, use it. On the other hand, if you need to be safe about a relational comparison, explicitly coerce the values first, before using < (or its counterparts).

var a = [ 42 ];
var b = "043";

a < b;						// false -- string comparison!
Number( a ) < Number( b );	// true -- number comparison!

Review

In this chapter, we turned our attention to how JavaScript type conversions happen, called coercion, which can be characterized as either explicit or implicit.

Coercion gets a bad rap, but it's actually quite useful in many cases. An important task for the responsible JS developer is to take the time to learn all the ins and outs of coercion to decide which parts will help improve their code, and which parts they really should avoid.

Explicit coercion is code which is obvious that the intent is to convert a value from one type to another. The benefit is improvement in readability and maintainability of code by reducing confusion.

Implicit coercion is coercion that is "hidden" as a side-effect of some other operation, where it's not as obvious that the type conversion will occur. While it may seem that implicit coercion is the opposite of explicit and is thus bad, actually implicit coercion is also about improving the readability of code.

Especially for implicit, coercion must be used responsibly and consciously. Know why you're writing the code you're writing, and how it works. Strive to write code that others will easily be able to learn from and understand as well.


Chapter 5: Grammar

The last major topic we want to tackle is how JavaScript's language syntax works (aka its grammar). You may think you know how to write JS, but there's an awful lot of nuance to various parts of the language grammar that lead to confusion and misconception, so we want to dive into those parts and clear some things up.

Note: The term "grammar" may be a little less familiar to readers than the term "syntax." In many ways, they are similar terms, describing the rules for how the language works. There are nuanced differences, but they mostly don't matter for our discussion here. The grammar for JavaScript is a structured way to describe how the syntax (operators, keywords, etc.) fits together into well-formed, valid programs. In other words, discussing syntax without grammar would leave out a lot of the important details. So our focus here in this chapter is most accurately described as grammar, even though the raw syntax of the language is what developers directly interact with.

Statements & Expressions

It's fairly common for developers to assume that the terms "statement" and "expression" are roughly equivalent. But here we need to distinguish between the two, because there are some very important differences in our JS programs.

To draw the distinction, let's borrow from terminology you may be more familiar with: the English language.

A "sentence" is one complete formation of words that expresses a thought. It's comprised of one or more "phrases," each of which can be connected with punctuation marks or conjunction words ("and," "or," etc). A phrase can itself be made up of smaller phrases. Some phrases are incomplete and don't accomplish much by themselves, while other phrases can stand on their own. These rules are collectively called the grammar of the English language.

And so it goes with JavaScript grammar. Statements are sentences, expressions are phrases, and operators are conjunctions/punctuation.

Every expression in JS can be evaluated down to a single, specific value result.

var a = 3 * 6;
var b = a;
b;

In this snippet, 3 * 6 is an expression (evaluates to the value 18). But a on the second line is also an expression, as is b on the third line. The a and b expressions both evaluate to the values stored in those variables at that moment, which also happens to be 18.

Moreover, each of the three lines is a statement containing expressions. var a = 3 * 6 and var b = a are called "declaration statements" because they each declare a variable (and optionally assign a value to it). The a = 3 * 6 and b = a assignments (minus the vars) are called assignment expressions.

The third line contains just the expression b, but it's also a statement all by itself. This is generally referred to as an "expression statement."

Statement Completion Values

It's a fairly little-known fact that statements all have completion values (even if that value is just undefined).

How would you even go about seeing the completion value of a statement?

The most obvious answer is to type the statement into your browser's developer console, because when you execute it, the console by default reports the completion value of the most recent statement it executed.

Let's consider var b = a. What's the completion value of that statement?

The b = a assignment expression results in the value that was assigned (18 above), but the var statement itself results in undefined. Why? Because var statements are defined that way in the spec. If you put var a = 42; into your console, you'll see undefined reported back instead of 42.

Note: Technically, it's a little more complex than that. In the ES5 spec, section 12.2 "Variable Statement," the VariableDeclaration algorithm actually does return a value (a string containing the name of the variable declared), but that value is basically swallowed up by the VariableStatement algorithm, which forces an empty (aka undefined) completion value.

In fact, if you've done much code experimenting in your console (or in a JavaScript environment REPL), you've probably seen undefined reported after many different statements, and perhaps never realized why or what that was. Put simply, the console is just reporting the statement's completion value.

But what the console prints out for the completion value isn't something we can use inside our program. So how can we capture the completion value?

Before we explain how, let's explore why you would want to do that.

We need to consider other types of statement completion values. For example, any regular { .. } block has a completion value of the completion value of its last contained statement/expression.

var b;

if (true) {
	b = 4 + 38;
}

If you typed that into your console, you'd probably see 42 reported, since 42 is the completion value of the if block, which took on the completion value of its last assignment expression statement b = 4 + 38.

In other words, the completion value of a block is like an implicit return of the last statement value in the block.

But there's an obvious problem. This kind of code doesn't work:

var a, b;

a = if (true) {
	b = 4 + 38;
};

We can't capture the completion value of a statement and assign it into another variable in any easy syntactic/grammatical way.

We could use the much maligned eval(..) function to capture this completion value.

var a, b;

a = eval( "if (true) { b = 4 + 38; }" );

a;	// 42

Yeeeaaahhhh. That's terribly ugly. But it works! And it illustrates the point that statement completion values are a real thing that can be captured not just in our console but in our programs.

There's a proposal for ES7 called "do expression." Here's how it might work:

var a, b;

a = do {
	if (true) {
		b = 4 + 38;
	}
};

a;	// 42

The do { .. } expression executes a block, and the final statement completion value inside the block becomes the completion value of the do expression, which can then be assigned to a as shown.

The general idea is to be able to treat statements as expressions -- they can show up inside other statements -- without needing to wrap them in an inline function expression and perform an explicit return ...
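
For comparison, here's a sketch of that inline function workaround -- wrapping the statement in a function expression that's immediately invoked, with an explicit return:

var a, b;

a = (function(){
	if (true) {
		b = 4 + 38;
	}
	return b;	// explicit `return` required
})();

a;	// 42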

For now, statement completion values are not much more than trivia. But they're probably going to take on more significance as JS evolves, and hopefully do { .. } expressions will reduce the temptation to use stuff like eval(..).

Warning: Repeating my earlier admonition: avoid eval(..). Seriously. See the Scope & Closures title of this series for more explanation.

Expression Side Effects

Most expressions don't have side effects. For example:

var a = 2;
var b = a + 3;

The expression a + 3 did not itself have a side effect, like for instance changing a. It had a result, which is 5, and that result was assigned to b in the statement b = a + 3.

The most common example of an expression with (possible) side effects is a function call expression:

function foo() {
	a = a + 1;
}

var a = 1;
foo();		// result: `undefined`, side effect: changed `a`

There are other side-effecting expressions, though.

var a = 42;
var b = a++;

The expression a++ has two separate behaviors. First, it returns the current value of a, which is 42 (which then gets assigned to b). But next, it changes the value of a itself, incrementing it by one.

var a = 42;
var b = a++;

a;	// 43
b;	// 42

Many developers would mistakenly believe that b has value 43 just like a does. But the confusion comes from not fully considering the when of the side effects of the ++ operator.

The ++ increment operator and the -- decrement operator are both unary operators, which can be used in either a postfix ("after") position or prefix ("before") position.

var a = 42;

a++;	// 42
a;		// 43

++a;	// 44
a;		// 44

When ++ is used in the prefix position as ++a, its side effect happens before the value is returned from the expression, rather than after as with a++.

Note: Would you think ++a++ was legal syntax? If you try it, you'll get a ReferenceError, but why? Because side-effecting operators require a variable reference to apply their side effect to. For ++a++, the a++ part is evaluated first (because of operator precedence), which gives back the value of a before the increment. But then it tries to evaluate ++42, which (if you try it) throws the same ReferenceError, since ++ can't have a side effect directly on a value like 42.

It is sometimes mistakenly thought that you can encapsulate the after side effect of a++ by wrapping it in a ( ) pair, like:

var a = 42;
var b = (a++);

a;	// 43
b;	// 42

Unfortunately, ( ) itself doesn't define a new wrapped expression that would be evaluated after the after side effect of the a++ expression, as we might have hoped. In fact, even if it did, a++ returns 42 first, and unless you have another expression that reevaluates a after the side effect of ++, you're not going to get 43 from that expression, so b will not be assigned 43.

There's an option, though: the , statement-series comma operator. This operator allows you to string together multiple standalone expression statements into a single statement:

var a = 42, b;
b = ( a++, a );

a;	// 43
b;	// 43

Note: The ( .. ) around a++, a is required here. The reason is operator precedence.

The expression a++, a means that the second a statement expression gets evaluated after the after side effects of the first a++ statement expression, which means it returns the 43 value for assignment to b.

Another example of a side-effecting operator is delete, which removes a property from an object or a slot from an array. It's usually just called as a standalone statement:

var obj = {
	a: 42
};

obj.a;			// 42
delete obj.a;	// true
obj.a;			// undefined

The result value of the delete operator is true if the requested operation is valid/allowable, or false otherwise. But the side effect of the operator is that it removes the property (or array slot).

Note: What do we mean by valid/allowable? Nonexistent properties, and properties that exist and are configurable, will return true from the delete operator. Otherwise, the result will be false or an error.

One last example of a side-effecting operator, which may at once be both obvious and nonobvious, is the = assignment operator.

var a;

a = 42;		// 42
a;			// 42

It may not seem like = in a = 42 is a side-effecting operator for the expression. But if we examine the result value of the a = 42 statement, it's the value that was just assigned (42), so the assignment of that same value into a is essentially a side effect.

Tip: The same reasoning about side effects goes for the compound-assignment operators like +=, -=, etc. For example, a = b += 2 is processed first as b += 2 (which is b = b + 2), and the result of that = assignment is then assigned to a.
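
A quick sketch of that processing:

var a, b = 8;

a = b += 2;

b;	// 10
a;	// 10 -- the result of `b += 2` was then assigned to `a`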

This behavior that an assignment expression (or statement) results in the assigned value is primarily useful for chained assignments, such as:

var a, b, c;

a = b = c = 42;

Here, c = 42 is evaluated to 42 (with the side effect of assigning 42 to c), then b = 42 is evaluated to 42 (with the side effect of assigning 42 to b), and finally a = 42 is evaluated (with the side effect of assigning 42 to a).

Warning: A common mistake developers make with chained assignments is like var a = b = 42. While this looks like the same thing, it's not. If that statement were to happen without there also being a separate var b (somewhere in the scope) to formally declare b, then var a = b = 42 would not declare b directly. Depending on strict mode, that would either throw an error or create an accidental global.
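
To illustrate the hazard, a sketch (assuming non-strict mode):

function foo() {
	var a = b = 42;	// declares `a`, but NOT `b`!
}

foo();

typeof a;	// "undefined" -- `a` stayed local to `foo()`
typeof b;	// "number" -- oops, `b` leaked out as a global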

Another scenario to consider:

function vowels(str) {
	var matches;

	if (str) {
		// pull out all the vowels
		matches = str.match( /[aeiou]/g );

		if (matches) {
			return matches;
		}
	}
}

vowels( "Hello World" ); // ["e","o","o"]

This works, and many developers prefer such. But using an idiom where we take advantage of the assignment side effect, we can simplify by combining the two if statements into one:

function vowels(str) {
	var matches;

	// pull out all the vowels
	if (str && (matches = str.match( /[aeiou]/g ))) {
		return matches;
	}
}

vowels( "Hello World" ); // ["e","o","o"]

Note: The ( .. ) around matches = str.match.. is required. The reason is operator precedence, which we'll cover in the "Operator Precedence" section later in this chapter.

I prefer this shorter style, as I think it makes it clearer that the two conditionals are in fact related rather than separate. But as with most stylistic choices in JS, it's purely opinion which one is better.

Contextual Rules

There are quite a few places in the JavaScript grammar rules where the same syntax means different things depending on where/how it's used. This kind of thing can, in isolation, cause quite a bit of confusion.

{ .. } Curly Braces

There are two main places that a pair of { .. } curly braces will show up in your code.

Object Literals

First, as an object literal:

// assume there's a `bar()` function defined

var a = {
	foo: bar()
};

How do we know this is an object literal? Because the { .. } pair is a value that's getting assigned to a.

Note: The a reference is called an "l-value" (aka left-hand value) since it's the target of an assignment. The { .. } pair is an "r-value" (aka right-hand value) since it's used just as a value (in this case as the source of an assignment).

Labels

What happens if we remove the var a = part of the above snippet?

// assume there's a `bar()` function defined

{
	foo: bar()
}

A lot of developers assume that the { .. } pair is just a standalone object literal that doesn't get assigned anywhere. But it's actually entirely different.

Here, { .. } is just a regular code block. It's not very idiomatic in JavaScript to have a standalone { .. } block like that, but it's perfectly valid JS grammar. It can be especially helpful when combined with let block-scoping declarations.
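
For example, a minimal sketch of such a standalone block used with let:

{
	let x = 10;
	console.log( x );	// 10
}

typeof x;	// "undefined" -- `x` didn't escape the block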

The { .. } code block here is functionally pretty much identical to the code block being attached to some statement, like a for/while loop, if conditional, etc.

But if it's a normal block of code, what's that bizarre looking foo: bar() syntax, and how is that legal?

It's because of a little-known feature in JavaScript called "labeled statements." foo is a label for the statement bar() (which has omitted its trailing ;). But what's the point of a labeled statement?

If JavaScript had a goto statement, you'd theoretically be able to say goto foo and have execution jump to that location in code. gotos are usually considered terrible coding idioms as they make code much harder to understand, so it's a very good thing that JavaScript doesn't have a general goto.

However, JS does support a limited, special form of goto: labeled jumps. Both the continue and break statements can optionally accept a specified label, in which case the program flow "jumps" kind of like a goto. Consider:

// `foo` labeled-loop
foo: for (var i=0; i<4; i++) {
	for (var j=0; j<4; j++) {
		// whenever the loops meet, continue outer loop
		if (j == i) {
			// jump to the next iteration of
			// the `foo` labeled-loop
			continue foo;
		}

		// skip odd multiples
		if ((j * i) % 2 == 1) {
			// normal (non-labeled) `continue` of inner loop
			continue;
		}

		console.log( i, j );
	}
}
// 1 0
// 2 0
// 2 1
// 3 0
// 3 2

Note: continue foo does not mean "go to the 'foo' labeled position to continue", but rather, "continue the loop that is labeled 'foo' with its next iteration." So, it's not really an arbitrary goto.

As you can see, we skipped over the odd-multiple 3 1 iteration, but the labeled-loop jump also skipped iterations 1 1 and 2 2.

Perhaps a slightly more useful form of the labeled jump is with break __ from inside an inner loop where you want to break out of the outer loop. Without a labeled break, this same logic could sometimes be rather awkward to write:

// `foo` labeled-loop
foo: for (var i=0; i<4; i++) {
	for (var j=0; j<4; j++) {
		if ((i * j) >= 3) {
			console.log( "stopping!", i, j );
			// break out of the `foo` labeled loop
			break foo;
		}

		console.log( i, j );
	}
}
// 0 0
// 0 1
// 0 2
// 0 3
// 1 0
// 1 1
// 1 2
// stopping! 1 3

Note: break foo does not mean "go to the 'foo' labeled position to continue," but rather, "break out of the loop/block that is labeled 'foo' and continue after it." Not exactly a goto in the traditional sense, huh?

The nonlabeled break alternative to the above would probably need to involve one or more functions, shared scope variable access, etc. It would quite likely be more confusing than labeled break, so here using a labeled break is perhaps the better option.

A label can apply to a non-loop block, but only break can reference such a non-loop label. You can do a labeled break ___ out of any labeled block, but you cannot continue ___ a non-loop label, nor can you do a non-labeled break out of a block.

function foo() {
	// `bar` labeled-block
	bar: {
		console.log( "Hello" );
		break bar;
		console.log( "never runs" );
	}
	console.log( "World" );
}

foo();
// Hello
// World

Labeled loops/blocks are extremely uncommon, and often frowned upon. It's best to avoid them if possible; for example using function calls instead of the loop jumps. But there are perhaps some limited cases where they might be useful. If you're going to use a labeled jump, make sure to document what you're doing with plenty of comments!
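
For instance, here's a sketch of the function-call alternative to the earlier labeled break (the function name is illustrative):

function logPairs() {
	for (var i=0; i<4; i++) {
		for (var j=0; j<4; j++) {
			if ((i * j) >= 3) {
				console.log( "stopping!", i, j );
				// `return` exits both loops, like `break foo` did
				return;
			}
			console.log( i, j );
		}
	}
}

logPairs();	// same output as the `break foo` example above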

It's a very common belief that JSON is a proper subset of JS, so a string of JSON (like {"a":42} -- notice the quotes around the property name as JSON requires!) is thought to be a valid JavaScript program. Not true! Try putting {"a":42} into your JS console, and you'll get an error.

That's because statement labels cannot have quotes around them, so "a" is not a valid label, and thus : can't come right after it.
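
A quick sketch of the contrast, treating each line as a standalone statement:

{ a: 42 }	// 42 -- a block with label `a` and expression statement `42`
{ "a": 42 }	// SyntaxError! -- a string can't be a label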

So, JSON is truly a subset of JS syntax, but only in expression position; a string of JSON standing alone is not a valid JS program.

One extremely common misconception along these lines is that if you were to load a JS file into a <script src=..> tag that only has JSON content in it, the data would be read as valid JavaScript but just be inaccessible to the program. JSON-P (the practice of wrapping the JSON data in a function call, like foo({"a":42})) is usually said to solve this inaccessibility by sending the value to one of your program's functions.

Not true! The totally valid JSON value {"a":42} by itself would actually throw a JS error because it'd be interpreted as a statement block with an invalid label. But foo({"a":42}) is valid JS because in it, {"a":42} is an object literal value being passed to foo(..). So, properly said, JSON-P makes JSON into valid JS grammar!
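
A minimal sketch of that JSON-P pattern (the foo function and inline "response" here are illustrative):

// defined in your program:
function foo(data) {
	console.log( data.a );	// 42
}

// the JSON-P "response," e.g. loaded via a <script> tag:
foo({"a":42});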

Blocks

Another commonly cited JS gotcha (related to coercion) is:

[] + {}; // "[object Object]"
{} + []; // 0

This seems to imply the + operator gives different results depending on whether the first operand is the [] or the {}. But that actually has nothing to do with it!

On the first line, {} appears in the + operator's expression, and is therefore interpreted as an actual value (an empty object). [] is coerced to "" and thus {} is coerced to a string value as well: "[object Object]".

But on the second line, {} is interpreted as a standalone {} empty block. Blocks don't need semicolons to terminate them, so the lack of one here isn't a problem. Finally, + [] is an expression that explicitly coerces the [] to a number, which is the 0 value.
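
You can verify that interpretation by forcing the { .. } into expression position with ( .. ) -- a quick sketch:

({} + []);	// "[object Object]" -- `{}` is now a value
{} + [];	// 0 -- `{}` is still a standalone block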

Object Destructuring

Starting with ES6, another place that you'll see { .. } pairs showing up is with "destructuring assignments", specifically object destructuring.

function getData() {
	// ..
	return {
		a: 42,
		b: "foo"
	};
}

var { a, b } = getData();

console.log( a, b ); // 42 "foo"

As you can probably tell, var { a , b } = .. is a form of ES6 destructuring assignment, which is roughly equivalent to:

var res = getData();
var a = res.a;
var b = res.b;

Note: { a, b } is actually ES6 destructuring shorthand for { a: a, b: b }, so either will work, but it's expected that the shorter { a, b } will become the preferred form.

Object destructuring with a { .. } pair can also be used for named function arguments, which is sugar for this same sort of implicit object property assignment:

function foo({ a, b, c }) {
	// no need for:
	// var a = obj.a, b = obj.b, c = obj.c
	console.log( a, b, c );
}

foo( {
	c: [1,2,3],
	a: 42,
	b: "foo"
} );	// 42 "foo" [1, 2, 3]

So, the context in which we use { .. } pairs entirely determines what they mean, which illustrates the difference between syntax and grammar. It's very important to understand these nuances to avoid unexpected interpretations by the JS engine.

else if And Optional Blocks

It's a common misconception that JavaScript has an else if clause, because you can do:

if (a) {
	// ..
}
else if (b) {
	// ..
}
else {
	// ..
}

But there's a hidden characteristic of the JS grammar here: there is no else if. But if and else statements are allowed to omit the { } around their attached block if they only contain a single statement. You've seen this many times before, undoubtedly:

if (a) doSomething( a );

Many JS style guides will insist that you always use { } around a single statement block, like:

if (a) { doSomething( a ); }

However, the exact same grammar rule applies to the else clause, so the else if form you've likely always coded is actually parsed as:

if (a) {
	// ..
}
else {
	if (b) {
		// ..
	}
	else {
		// ..
	}
}

The if (b) { .. } else { .. } is a single statement that follows the else, so you can either put the surrounding { } in or not. In other words, when you use else if, you're technically breaking that common style guide rule and just defining your else with a single if statement.

Of course, the else if idiom is extremely common and results in one less level of indentation, so it's attractive. Whichever way you do it, just call it out explicitly in your own style guide/rules, and don't assume things like else if are direct grammar rules.

Operator Precedence

JavaScript's version of && and || are interesting in that they select and return one of their operands, rather than just resulting in true or false. That's easy to reason about if there are only two operands and one operator.

var a = 42;
var b = "foo";

a && b;	// "foo"
a || b;	// 42

But what about when there's two operators involved, and three operands?

var a = 42;
var b = "foo";
var c = [1,2,3];

a && b || c; // ???
a || b && c; // ???

To understand what those expressions result in, we're going to need to understand what rules govern how the operators are processed when there's more than one present in an expression.

These rules are called "operator precedence."

I bet most readers feel they have a decent grasp on operator precedence. But as with everything else we've covered in this book series, we're going to poke and prod at that understanding to see just how solid it really is, and hopefully learn a few new things along the way.

Recall the example from above:

var a = 42, b;
b = ( a++, a );

a;	// 43
b;	// 43

But what would happen if we remove the ( )?

var a = 42, b;
b = a++, a;

a;	// 43
b;	// 42

Wait! Why did that change the value assigned to b?

Because the , operator has a lower precedence than the = operator. So, b = a++, a is interpreted as (b = a++), a. Because a++ has after side effects, the assigned value to b is the value 42 before the ++ changes a.

This is just a simple matter of needing to understand operator precedence. If you're going to use , as a statement-series operator, it's important to know that it actually has the lowest precedence. Every other operator will more tightly bind than , will.

if (str && (matches = str.match( /[aeiou]/g ))) {
	// ..
}

We said the ( ) around the assignment is required, but why? Because && has higher precedence than =, so without the ( ) to force the binding, the expression would instead be treated as (str && matches) = str.match... But this would be an error, because the result of (str && matches) isn't going to be a variable, but instead a value (in this case undefined), and so it can't be the left-hand side of an = assignment!

Let's move on to a more complex example to really test your understanding:

var a = 42;
var b = "foo";
var c = false;

var d = a && b || c ? c || b ? a : c && b : a;

d;		// ??

OK, evil, I admit it. No one would write a string of expressions like that, right? Probably not, but we're going to use it to examine various issues around chaining multiple operators together, which is a very common task.

The result above is 42. But that's not nearly as interesting as how we can figure out that answer without just plugging it into a JS program to let JavaScript sort it out.

The first question -- it may not have even occurred to you to ask -- is, does the first part (a && b || c) behave like (a && b) || c or like a && (b || c)? Do you know for certain? Can you even convince yourself they are actually different?

(false && true) || true;	// true
false && (true || true);	// false

So, there's proof they're different. But still, how does false && true || true behave? The answer:

false && true || true;		// true
(false && true) || true;	// true

So we have our answer. The && operator is evaluated first and the || operator is evaluated second.

But is that just because of left-to-right processing? Let's reverse the order of operators:

true || false && false;		// true

(true || false) && false;	// false -- nope
true || (false && false);	// true -- winner, winner!

Now we've proved that && is evaluated first and then ||, and in this case that was actually counter to generally expected left-to-right processing.

So what caused the behavior? Operator precedence.

Every language defines its own operator precedence list.

If you knew it well, the above examples wouldn't have tripped you up in the slightest, because you'd already know that && is more precedent than ||.

Note: Unfortunately, the JS spec doesn't really have its operator precedence list in a convenient, single location. You have to parse through and understand all the grammar rules. So we'll try to lay out the more common and useful bits here in a more convenient format. For a complete list of operator precedence, see "Operator Precedence" on the MDN site (* https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Operators/Operator_Precedence).

Short Circuited

For both && and || operators, the right-hand operand will not be evaluated if the left-hand operand is sufficient to determine the outcome of the operation. Hence, the name "short circuited".

For example, with a && b, b is not evaluated if a is falsy, because the result of the && expression is already certain, so there's no point in bothering to check b. Likewise, with a || b, if a is truthy, the result of the expression is already certain, so there's no reason to check b.

function doSomething(opts) {
	if (opts && opts.cool) {
		// ..
	}
}

The opts part of the opts && opts.cool test acts as sort of a guard, because if opts is unset (or is not an object), the expression opts.cool would throw an error. The opts test failing plus the short circuiting means that opts.cool won't even be evaluated, thus no error!

function doSomething(opts) {
	if (opts.cache || primeCache()) {
		// ..
	}
}

Here, we're checking for opts.cache first, and if it's present, we don't call the primeCache() function, thus avoiding potentially unnecessary work.

Tighter Binding

But let's turn our attention back to that earlier complex statement example with all the chained operators, specifically the ? : ternary operator parts. Does the ? : operator have more or less precedence than the && and || operators?

a && b || c ? c || b ? a : c && b : a

Is that more like this:

a && b || (c ? c || (b ? a : c) && b : a)

or this?

(a && b || c) ? (c || b) ? a : (c && b) : a

The answer is the second one. But why?

Because && is more precedent than ||, and || is more precedent than ? :.

So, the expression (a && b || c) is evaluated first before the ? : it participates in. Another way this is commonly explained is that && and || "bind more tightly" than ? :. If the reverse were true, then c ? c... would bind more tightly, and it would behave like a && b || (c ? c..).

Associativity

So, the && and || operators bind first, then the ? : operator. But what about multiple operators of the same precedence? Do they always process left-to-right or right-to-left?

In general, operators are either left-associative or right-associative, referring to whether grouping happens from the left or from the right.

It's important to note that associativity is not the same thing as left-to-right or right-to-left processing.

But why does it matter whether processing is left-to-right or right-to-left? Because expressions can have side effects, like for instance with function calls:

var a = foo() && bar();

Here, foo() is evaluated first, and then possibly bar() depending on the result of the foo() expression. That definitely could result in different program behavior than if bar() was called before foo().

But this behavior is just left-to-right processing (the default behavior in JavaScript!) -- it has nothing to do with the associativity of &&. In that example, since there's only one && and thus no relevant grouping here, associativity doesn't even come into play.

But with an expression like a && b && c, grouping will happen implicitly, meaning that either a && b or b && c will be evaluated first.

Technically, a && b && c will be handled as (a && b) && c, because && is left-associative (so is ||, by the way). However, the right-associative alternative a && (b && c) behaves observably the same way. For the same values, the same expressions are evaluated in the same order.

Note: If hypothetically && was right-associative, it would be processed the same as if you manually used ( ) to create grouping like a && (b && c). But that still doesn't mean that c would be processed before b. Right-associativity does not mean right-to-left evaluation, it means right-to-left grouping. Either way, regardless of the grouping/associativity, the strict ordering of evaluation will be a, then b, then c (aka left-to-right).

So it doesn't really matter that much that && and || are left-associative, other than to be accurate in how we discuss their definitions.

But that's not always the case. Some operators would behave very differently depending on left-associativity vs. right-associativity.

Consider the ? : ("ternary" or "conditional") operator:

a ? b : c ? d : e;

? : is right-associative, so which grouping represents how it will be processed?

- a ? b : (c ? d : e)
- (a ? b : c) ? d : e

The answer is a ? b : (c ? d : e). Unlike with && and || above, the right-associativity here actually matters, as (a ? b : c) ? d : e will behave differently for some combinations of values.

true ? false : true ? true : true;		// false

true ? false : (true ? true : true);	// false
(true ? false : true) ? true : true;	// true

Even more nuanced differences lurk with other value combinations, even if the end result is the same.

true ? false : true ? true : false;		// false

true ? false : (true ? true : false);	// false
(true ? false : true) ? true : false;	// false

From that scenario, the same end result implies that the grouping is moot. However:

var a = true, b = false, c = true, d = true, e = false;

a ? b : (c ? d : e); // false, evaluates only `a` and `b`
(a ? b : c) ? d : e; // false, evaluates `a`, `b` AND `e`

So, we've clearly proved that ? : is right-associative, and that it actually matters with respect to how the operator behaves if chained with itself.

Another example of right-associativity (grouping) is the = operator. Recall the chained assignment example from earlier in the chapter:

var a, b, c;

a = b = c = 42;

We asserted earlier that a = b = c = 42 is processed by first evaluating the c = 42 assignment, then b = .., and finally a = ... Why? Because of the right-associativity, which actually treats the statement like this: a = (b = (c = 42)).

Remember our running complex assignment expression example from earlier in the chapter?

var a = 42;
var b = "foo";
var c = false;

var d = a && b || c ? c || b ? a : c && b : a;

d;		// 42

Armed with our knowledge of precedence and associativity, we should now be able to break down the code into its grouping behavior like this:

((a && b) || c) ? ((c || b) ? a : (c && b)) : a

Or, to present it indented if that's easier to understand:

(
  (a && b)
    ||
  c
)
  ?
(
  (c || b)
    ?
  a
    :
  (c && b)
)
  :
a

Let's solve it now:

  1. (a && b) is "foo".
  2. "foo" || c is "foo".
  3. For the first ? test, "foo" is truthy.
  4. (c || b) is "foo".
  5. For the second ? test, "foo" is truthy.
  6. a is 42.

That's it, we're done! The answer is 42, just as we saw earlier. That actually wasn't so hard, was it?

Disambiguation

You should now have a much better grasp on operator precedence (and associativity) and feel much more comfortable understanding how code with multiple chained operators will behave.

But an important question remains: should we all write code understanding and perfectly relying on all the rules of operator precedence/associativity? Should we only use ( ) manual grouping when it's necessary to force a different processing binding/order?

Or, on the other hand, should we recognize that even though such rules are in fact learnable, there's enough gotchas to warrant ignoring automatic precedence/associativity? If so, should we thus always use ( ) manual grouping and remove all reliance on these automatic behaviors?

This debate is highly subjective, and heavily symmetrical to the debate in Chapter 4 over implicit coercion. Most developers feel the same way about both debates: either they accept both behaviors and code expecting them, or they discard both behaviors and stick to manual/explicit idioms.

I've presented you the pros and cons, and hopefully encouraged enough deeper understanding that you can make informed rather than hype-driven decisions.

In my opinion, there's an important middle ground. We should mix both operator precedence/associativity and ( ) manual grouping into our programs -- I argue the same way in Chapter 4 for healthy/safe usage of implicit coercion, but certainly don't endorse it exclusively without bounds.

For example, if (a && b && c) .. is perfectly OK to me, and I wouldn't do if ((a && b) && c) .. just to explicitly call out the associativity, because I think it's overly verbose.

On the other hand, if I needed to chain two ? : conditional operators together, I'd certainly use ( ) manual grouping to make it absolutely clear what my intended logic is.

Thus, my advice here is similar to that of Chapter 4: use operator precedence/associativity where it leads to shorter and cleaner code, but use ( ) manual grouping in places where it helps create clarity and reduce confusion.

Automatic Semicolons

ASI (Automatic Semicolon Insertion) is when JavaScript assumes a ; in certain places in your JS program even if you didn't put one there.

Why would it do that? Because if you omit even a single required ;, your program would fail. Not very forgiving. ASI allows JS to be tolerant of certain places where a ; isn't commonly thought to be necessary.

It's important to note that ASI will only take effect in the presence of a newline (aka line break). Semicolons are not inserted in the middle of a line.

Basically, if the JS parser parses a line where a parser error would occur (a missing expected ;), and it can reasonably insert one, it does so. What's reasonable for insertion? Only if there's nothing but whitespace and/or comments between the end of some statement and that line's newline/line break.

var a = 42, b
c;

Should JS treat the c on the next line as part of the var statement? It certainly would if a , had come anywhere (even another line) between b and c. But since there isn't one, JS assumes instead that there's an implied ; (at the newline) after b. Thus, c; is left as a standalone expression statement.
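
In other words, the snippet is treated as if it had been written:

var a = 42, b;
c;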

var a = 42, b = "foo";

a
b	// "foo"

That's still a valid program without error, because expression statements also accept ASI.

There's certain places where ASI is helpful, like for instance:

var a = 42;

do {
	// ..
} while (a)	// <-- ; expected here!
a;

The grammar requires a ; after a do..while loop, but not after while or for loops. But most developers don't remember that! So, ASI helpfully steps in and inserts one.

As we said earlier in the chapter, statement blocks do not require ; termination, so ASI isn't necessary:

var a = 42;

while (a) {
	// ..
} // <-- no ; expected here
a;

The other major case where ASI kicks in is with the break, continue, return, and yield keywords:

function foo(a) {
	if (!a) return
	a *= 2;
	// ..
}

The return statement doesn't carry across the newline to the a *= 2 expression, as ASI assumes the ; terminating the return statement. Of course, return statements can easily break across multiple lines, just not when there's nothing after return but the newline/line break.

function foo(a) {
	return (
		a * 2 + 3 / 12
	);
}
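
The flip side is a classic gotcha (sketched here): if nothing but the newline follows return, ASI terminates the statement right there, and whatever comes on the next line is never returned:

function foo(a) {
	return		// <-- ASI inserts `;` here!
	{
		value: a * 2
	};
}

foo( 21 );	// undefined -- not an object!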

Error Correction

One of the most hotly contested religious wars in the JS community is whether to rely heavily/exclusively on ASI or not.

Most, but not all, semicolons are optional, but the two ;s in the for ( .. ) .. loop header are required.

On the pro side of this debate, many developers believe that ASI is a useful mechanism that allows them to write more terse (and more "beautiful") code by omitting all but the strictly required ;s (which are very few). It is often asserted that ASI makes many ;s optional, so a correctly written program without them is no different than a correctly written program with them.

On the con side of the debate, many other developers will assert that there are too many places that can be accidental gotchas, where unintended ;s being magically inserted change the meaning. Similarly, some developers will argue that if they omit a semicolon, it's a flat-out mistake, and they want their tools (linters, etc.) to catch it before the JS engine corrects the mistake under the covers.

Let me just share my perspective. A strict reading of the spec implies that ASI is an "error correction" routine. What kind of error, you may ask? Specifically, a parser error. In other words, in an attempt to have the parser fail less, ASI lets it be more tolerant.

But tolerant of what? In my view, the only way a parser error occurs is if it's given an incorrect/errored program to parse. So, while ASI is strictly correcting parser errors, the only way it can get such errors is if there were first program authoring errors -- omitting semicolons where the grammar rules require them.

So, to put it more bluntly, when I hear someone claim that they want to omit "optional semicolons," my brain translates that claim to "I want to write the most parser-broken program I can that will still work."

I find that to be a ludicrous position to take and the arguments of saving keystrokes and having more "beautiful code" to be weak at best.

Furthermore, I don't agree that this is the same thing as the spaces vs tabs debate -- that it's purely cosmetic -- but rather I believe it's a fundamental question of writing code that adheres to grammar requirements vs. code that relies on grammar exceptions to just barely skate through.

Another way of looking at it is that relying on ASI is essentially considering newlines to be significant "whitespace." Other languages like Python have true significant whitespace. But is it really appropriate to think of JavaScript as having significant newlines as it stands today?

My take: use semicolons wherever you know they are "required," and limit your assumptions about ASI to a minimum.

Errors

Not only does JavaScript have different subtypes of errors (TypeError, ReferenceError, SyntaxError, etc.), but also the grammar defines certain errors to be enforced at compile time, as compared to all other errors that happen during runtime.

In particular, there have long been a number of specific conditions that should be caught and reported as "early errors" (during compilation). Any straight-up syntax error is an early error (e.g., a = ,), but also the grammar defines things that are syntactically valid but disallowed nonetheless.

Since execution of your code has not begun yet, these errors are not catchable with try..catch; they will just fail the parsing/compilation of your program.

Tip: There's no requirement in the spec about exactly how browsers (and developer tools) should report errors. So you may see variations across browsers in the following error examples, in what specific subtype of error is reported or what the included error message text will be.

One simple example is with syntax inside a regular expression literal. There's nothing wrong with the JS syntax here, but the invalid regex will throw an early error:

var a = /+foo/;		// Error!

The target of an assignment must be an identifier (or an ES6 destructuring expression that produces one or more identifiers), so a value like 42 in that position is illegal and can be reported right away:

var a;
42 = a;		// Error!

ES5's strict mode defines even more early errors. For example, in strict mode, function parameter names cannot be duplicated:

function foo(a,b,a) { }					// just fine

function bar(a,b,a) { "use strict"; }	// Error!

Another strict mode early error is an object literal having more than one property of the same name:

(function(){
	"use strict";

	var a = {
		b: 42,
		b: 43
	};			// Error!
})();

Note: Semantically speaking, such errors aren't technically syntax errors but more grammar errors -- the above snippets are syntactically valid. But since there is no GrammarError type, some browsers use SyntaxError instead.

Using Variables Too Early

ES6 defines a (frankly confusingly named) new concept called the TDZ ("Temporal Dead Zone").

The TDZ refers to places in code where a variable reference cannot yet be made, because it hasn't reached its required initialization.

The most clear example of this is with ES6 let block-scoping:

{
	a = 2;		// ReferenceError!
	let a;
}

The assignment a = 2 is accessing the a variable (which is indeed block-scoped to the { .. } block) before it's been initialized by the let a declaration, so it's in the TDZ for a and throws an error.

Interestingly, while typeof has an exception to be safe for undeclared variables, no such safety exception is made for TDZ references:

{
	typeof a;	// "undefined"
	typeof b;	// ReferenceError! (TDZ)
	let b;
}

Function Arguments

Another example of a TDZ violation can be seen with ES6 default parameter values:

var b = 3;

function foo( a = 42, b = a + b + 5 ) {
	// ..
}

The b reference in the assignment would happen in the TDZ for the parameter b (not pull in the outer b reference), so it will throw an error. However, the a in the assignment is fine since by that time it's past the TDZ for parameter a.

When using ES6's default parameter values, the default value is applied to the parameter if you either omit an argument, or you pass an undefined value in its place:

function foo( a = 42, b = a + 1 ) {
	console.log( a, b );
}

foo();					// 42 43
foo( undefined );		// 42 43
foo( 5 );				// 5 6
foo( void 0, 7 );		// 42 7
foo( null );			// null 1

Note: null is coerced to a 0 value in the a + 1 expression.

From the ES6 default parameter values perspective, there's no difference between omitting an argument and passing an undefined value. However, there is a way to detect the difference in some cases:

function foo( a = 42, b = a + 1 ) {
	console.log(
		arguments.length, a, b,
		arguments[0], arguments[1]
	);
}

foo();					// 0 42 43 undefined undefined
foo( 10 );				// 1 10 11 10 undefined
foo( 10, undefined );	// 2 10 11 10 undefined
foo( 10, null );		// 2 10 null 10 null

Even though the default parameter values are applied to the a and b parameters, if no arguments were passed in those slots, the arguments array will not have entries.

Conversely, if you pass an undefined argument explicitly, an entry will exist in the arguments array for that argument, but it will be undefined and not (necessarily) the same as the default value that was applied to the named parameter for that same slot.

While ES6 default parameter values can create divergence between the arguments array slot and the corresponding named parameter variable, this same disjointedness can also occur in tricky ways in ES5:

function foo(a) {
	a = 42;
	console.log( arguments[0] );
}

foo( 2 );	// 42 (linked)
foo();		// undefined (not linked)

If you pass an argument, the arguments slot and the named parameter are linked to always have the same value. If you omit the argument, no such linkage occurs.

But in strict mode, the linkage doesn't exist regardless:

function foo(a) {
	"use strict";
	a = 42;
	console.log( arguments[0] );
}

foo( 2 );	// 2 (not linked)
foo();		// undefined (not linked)

It's almost certainly a bad idea to ever rely on any such linkage, and in fact the linkage itself is a leaky abstraction that's exposing an underlying implementation detail of the engine, rather than a properly designed feature.

Use of the arguments array has been deprecated (especially in favor of ES6 ... rest parameters), but that doesn't mean that it's all bad.

Prior to ES6, arguments is the only way to get an array of all passed arguments to pass along to other functions, which turns out to be quite useful. You can also mix named parameters with the arguments array and be safe, as long as you follow one simple rule: never refer to a named parameter and its corresponding arguments slot at the same time. If you avoid that bad practice, you'll never expose the leaky linkage behavior.

function foo(a) {
	console.log( a + arguments[1] ); // safe!
}

foo( 10, 32 );	// 42

try..finally

You're probably familiar with how the try..catch block works. But have you ever stopped to consider the finally clause that can be paired with it? In fact, were you aware that try only requires either catch or finally, though both can be present if needed?

The code in the finally clause always runs, and it always runs right after the try (and catch if present) finish, before any other code runs. In one sense, you can kind of think of the code in a finally clause as being in a callback function that will always be called regardless of how the rest of the block behaves.

So what happens if there's a return statement inside a try clause? It obviously will return a value, right? But does the calling code that receives that value run before or after the finally?

function foo() {
	try {
		return 42;
	}
	finally {
		console.log( "Hello" );
	}

	console.log( "never runs" );
}

console.log( foo() );
// Hello
// 42

The return 42 runs right away, which sets up the completion value from the foo() call. This action completes the try clause and the finally clause immediately runs next. Only then is the foo() function complete, so that its completion value is returned back for the console.log(..) statement to use.

The exact same behavior is true of a throw inside try:

 function foo() {
	try {
		throw 42;
	}
	finally {
		console.log( "Hello" );
	}

	console.log( "never runs" );
}

console.log( foo() );
// Hello
// Uncaught Exception: 42

Now, if an exception is thrown inside a finally clause, it becomes the primary completion of that function, overriding everything else. If a previous return in the try block had set a completion value for the function, that value will be abandoned.

function foo() {
	try {
		return 42;
	}
	finally {
		throw "Oops!";
	}

	console.log( "never runs" );
}

console.log( foo() );
// Uncaught Exception: Oops!

It shouldn't be surprising that other nonlinear control statements like continue and break exhibit similar behavior to return and throw:

for (var i=0; i<10; i++) {
	try {
		continue;
	}
	finally {
		console.log( i );
	}
}
// 0 1 2 3 4 5 6 7 8 9

The console.log(i) statement runs at the end of the loop iteration, which is caused by the continue statement. However, it still runs before the i++ iteration update statement, which is why the values printed are 0..9 instead of 1..10.

Note: ES6 adds a yield statement in generators, which in some ways can be seen as an intermediate return statement. However, unlike a return, a yield isn't complete until the generator is resumed, which means a try { .. yield .. } has not completed. So an attached finally clause will not run right after the yield like it does with return.

A return inside a finally has the special ability to override a previous return from the try or catch clause, but only if return is explicitly called:

function foo() {
	try {
		return 42;
	}
	finally {
		// no `return ..` here, so no override
	}
}

function bar() {
	try {
		return 42;
	}
	finally {
		// override previous `return 42`
		return;
	}
}

function baz() {
	try {
		return 42;
	}
	finally {
		// override previous `return 42`
		return "Hello";
	}
}

foo();	// 42
bar();	// undefined
baz();	// "Hello"

Normally, the omission of return in a function is the same as return; or even return undefined;, but inside a finally block the omission of return does not act like an overriding return undefined; it just lets the previous return stand.

In fact, we can really up the craziness if we combine finally with labeled break:

function foo() {
	bar: {
		try {
			return 42;
		}
		finally {
			// break out of `bar` labeled block
			break bar;
		}
	}

	console.log( "Crazy" );

	return "Hello";
}

console.log( foo() );
// Crazy
// Hello

But... don't do this. Seriously. Using a finally + labeled break to effectively cancel a return is doing your best to create the most confusing code possible. I'd wager no amount of comments will redeem this code.

switch

Let's briefly explore the switch statement, a sort-of syntactic shorthand for an if..else if..else.. statement chain.

switch (a) {
	case 2:
		// do something
		break;
	case 42:
		// do another thing
		break;
	default:
		// fallback to here
}

As you can see, it evaluates a once, then matches the resulting value to each case expression. If a match is found, execution will begin in that matched case, and will either go until a break is encountered or until the end of the switch block is found.

That much may not surprise you, but there are several quirks about switch you may not have noticed before.

First, the matching that occurs between the a expression and each case expression is identical to the === algorithm. Oftentimes switches are used with absolute values in case statements, as shown above, so strict matching is appropriate.
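
For instance, a quick sketch of where that strict matching can surprise you:

var a = "42";

switch (a) {
	case 42:
		// never gets here: `"42" !== 42`
		break;
	case "42":
		console.log( "strict match" );
		break;
}
// strict match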

However, you may wish to allow coercive equality and to do so you'll need to sort of "hack" the switch statement a bit:

var a = "42";

switch (true) {
	case a == 10:
		console.log( "10 or '10'" );
		break;
	case a == 42:
		console.log( "42 or '42'" );
		break;
	default:
		// never gets here
}
// 42 or '42'

This works because the case clause can have any expression, which means it will strictly match that expression's result to the test expression (true). Since a == 42 results in true here, the match is made.

Despite the == in the case expression, the switch matching itself is still strict, between true and true here. If the case expression resulted in something that was truthy but not strictly true, it wouldn't work. This can bite you if, for instance, you're using a "logical operator" like || or && in your expression:

var a = "hello world";
var b = 10;

switch (true) {
	case (a || b == 10):
		// never gets here
		break;
	default:
		console.log( "Oops" );
}
// Oops

Since the result of (a || b == 10) is "hello world" and not true, the strict match fails. In this case, the fix is to force the expression explicitly to be a true or false, such as case !!(a || b == 10):.
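
Applying that fix, a sketch:

var a = "hello world";
var b = 10;

switch (true) {
	case !!(a || b == 10):
		console.log( "matched!" );
		break;
	default:
		// never gets here
}
// matched!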

Lastly, the default clause is optional, and it doesn't necessarily have to come at the end. Even in the default clause, the same rules apply about encountering a break or not:

var a = 10;

switch (a) {
	case 1:
	case 2:
		// never gets here
	default:
		console.log( "default" );
	case 3:
		console.log( "3" );
		break;
	case 4:
		console.log( "4" );
}
// default
// 3

Note: As discussed previously about labeled breaks, the break inside a case clause can also be labeled.

The way this snippet processes is that it passes through all the case clause matching first, finds no match, then goes back up to the default clause and starts executing. Since there's no break there, it continues executing in the already skipped over case 3 block, before stopping once it hits that break.

While this sort of round-about logic is clearly possible in JavaScript, there's almost no chance that it's going to make for reasonable or understandable code. Be very skeptical if you find yourself wanting to create such circular logic flow, and if you really do, make sure you include plenty of code comments to explain what you're up to!

Review

JavaScript grammar has plenty of nuance that we as developers should spend a little more time paying closer attention to than we typically do. A little bit of effort goes a long way to solidifying your deeper knowledge of the language.

Statements and expressions have analogs in English language -- statements are like sentences and expressions are like phrases. Expressions can be pure/self-contained, or they can have side effects.

The JavaScript grammar layers semantic usage rules (aka context) on top of the pure syntax. For example, { } pairs used in various places in your program can mean statement blocks, object literals, destructuring assignments, or named function arguments.

JavaScript operators all have well-defined rules for precedence and associativity (how multiple operator expressions are implicitly grouped). Once you learn these rules, it's up to you to decide if precedence/associativity are too implicit for their own good, or if they will aid in writing shorter, clearer code.

ASI is a parser-error-correction mechanism built into the JS engine, which allows it under certain circumstances to insert an assumed ; in places where it is required, was omitted, and where insertion fixes the parser error. The debate rages over whether this behavior implies that most ; are optional or whether it means that omitting them is making mistakes that the JS engine merely cleans up for you.

JavaScript has several types of errors, but it's less known that it has two classifications for errors: "early" (compiler thrown, uncatchable) and "runtime" (try..catchable). All syntax errors are obviously early errors that stop the program before it runs, but there are others, too.

Function arguments have an interesting relationship to their formal declared named parameters. Specifically, the arguments array has a number of gotchas of leaky abstraction behavior if you're not careful. Avoid arguments if you can, but if you must use it, by all means avoid using the positional slot in arguments at the same time as using a named parameter for that same argument.

The finally clause attached to a try (or try..catch) offers some very interesting quirks in terms of execution processing order. Some of these quirks can be helpful, but it's possible to create lots of confusion, especially if combined with labeled blocks. As always, use finally to make code better and clearer, not more clever or confusing.

The switch offers some nice shorthand for if..else if.. statements, but beware of many common simplifying assumptions about its behavior. There are several quirks that can trip you up if you're not careful, but there's also some neat hidden tricks that switch has up its sleeve!