While Javascript has a simple syntax, what happens behind the scene is far from simple. For example, consider this innocent-looking hypot
function below:
1 | var hypot = function(a, b) { |
It is clear what Math.sqrt
does: it performs a square root. However, to actually execute Math.sqrt(...)
, a lot of steps are needed:
- First, get the global object, where all global variables reside in.
- Then, get the
Math
property of the global object. Normally theMath
property exists (since it is predefined), but we can’t know for sure: if someone had indeed rundelete Math;
before, we must promptly throw out an error. - Next, get the
sqrt
property ofMath
. Note that we cannot even be certain thatMath
is an object (as someone could have doneMath = 123;
). So as in (2), we must not omit any check for error. - Finally, similarly, what the
sqrt
property contains can be anything. Even if it is a function, it could be any function. So as before, we must not omit any check, and ifsqrt
is indeed a Javascript function, we perform the Javascript function call.
So, in order to correctly (as every Javascript engine needs to be) execute this innocently looking Math.sqrt
, a ton of stuffs must be done.
How can we make this faster?
The crucial observation is that while the programmer is technically allowed to do anything, including insane things like delete Math;
or Math = 123
, most sane programs will not do it. So for practical purposes, it is enough if we can make sane programs both correct and fast, while running insane programs only correctly.
In JSC (WebKit’s Javascript engine), this is achieved by Watchpoint.
Conceptually, a WatchpointSet
represents a condition that one expects to be true, or simply put, a watchable condition. For example, we may expect the global object to contain property Math
, and its value being equal to the predefined Math
object.
One may attach Watchpoint
s to the WatchpointSet
. A Watchpoint
is essentially a callback: after attaching to a WatchpointSet
, when the condition represented by the WatchpointSet
becomes false, the callback is invoked (“fired”), so the owner who created the Watchpoint
can react correspondingly.
While the watchpoint mechanism isn’t necessarily binded to JIT Compilation (for example, LLIntPrototypeLoadAdaptiveStructureWatchpoint works without JIT), it is most powerful when combined with JIT Compilation. We generate code that is optimized assuming the watchpoint condition holds, so inside the generated code, we don’t check for the condition at all. If the condition no longer holds, we must jettison the code – this is expensive, because all the work we did to generate the code is wasted, but the whole point of watchpoint is that such bad cases should happen only rarely.
A Motivating Example
Let’s go back to the Math.sqrt
example: we want to get notified when a property of an object changes value. Therefore, all logic that writes value into object properties must cooperate with us. For simplicity, let’s assume the object Math
has a Structure, say S
. Then, there are two kinds of logic that may write to object properties:
- The C++ code that implements object property writes (the slow paths).
- The JIT’ed code that writes to a specific property of a specific structure (the fast paths).
The fast paths are known as inline caches. Inline caching is probably the most important optimization in JSC, but I will leave its details to another post. For the purpose of this post, it’s sufficient to think of each inline cache fast-path as a JIT-compiled piece of code that is specialized for a certain structure S
and a certain property name prop
. Given a value
and an object with structure S
, it writes value
to property prop
of the object.
The slow path case is easy to handle: whenever one writes to a property of an object, one checks whether there are Watchpoints watching the condition, and fire them. Of course, we are doing one extra check for every object property write. However, those code are already slow paths, so it doesn’t hurt too much to make them a bit more slower.
The fast path case is trickier. A naive solution is to add a watchpoint check, as how we handled slow-path. However, this is unsatisfactory: now, every fast-path write is doing one extra check! We can afford slowing down the slow-path, but we want to keep the fast-path fast.
So, the fast-path must not check for watchpoint conditions it violates at runtime. Instead, we permenantly invalidate any and all WatchpointSet
it could violate as soon as the fast-path code is JIT’ed, no matter if there are watchers or not. As another consequence, since the fast path works on a fixed property (e.g. sqrt
) of a fixed Structure (e.g. S
), but not on fixed objects, our watchpoints have to be in the form of <Structure, property>
: they work on Structure-level but not object-level (they are called ValueReplacementWatchpointSet in JSC). For example,when a fast-path writing the sqrt
property of Structure S
is built, we have to be conservative and permanently invalidate WatchpointSet <S, sqrt>
, since we have no way to know if that fast-path is going to run on our Math
object in the future.
The Design
This leads to the following design in JSC. A WatchpointSet
has three possible states[1]:
DoesNotExist
: TheWatchpointSet
object does not physically exist (and is implicitlyValid
). This is needed because there is an infinite number of watchable conditions, and also that we want to save memory. In this state, there exists no fast-path that rely on or violate the watchpoint. Slowpath executions that violate the watchpoint are not recorded (but doing so wouldn’t break the scheme).Valid
[2]: The watchpoint is valid: no fast-path that may violate the watched condition has been built, and one may build fast-path relying on the watchpoint condition as long as it adds itself into the watcher list.Invalidated
: The watchpoint is permaently invalidated.
As one can see from the example in the previous section, the Watchpoint
system needs to handle interactions with three components:
- Slow-path (C++ code) that may violate the watched condition.
- Fast-path (JIT’ed code) that may violate the watched condition.
- Code (C++ or JIT’ed) that is optimized assuming the watched condition is true.
For (1), the slow-path must check in the code any watchable condition it violated, and if the corresponding WatchpointSet
exists, fire all watchers. However, in such case, the slow-path have the choice between invalidate the WatchpointSet
, or to keep it valid[3].
For (2), the fast-path code does not check the watchable condition it violates, but we must transit all WatchpointSet
s it may violate when executed to Invalidated
when such a fast-path is JIT’ed (and we must create such WatchpointSet
object if it does not exist yet).
For (3), we must disable the code when the watcher callback is invoked. If the code is C++ code, then disabling the codepath is as easy as flipping a flag. If the code is JIT’ed code, we must jettison the code[4].
Back to Our Example, and Adaptive Watchpoints
Unfortunately, in our example, it turns out that only watching on <Structure, Property>
is not enough. While this handles writes to existing properties correctly, one may create new properties in the object, thus transitioning its Structure
. Say, one did a Math.abc = 123;
. Since it adds a property to Math
, the object Math
gets a different structure S2
, but our watchpoint is watching on <S, sqrt>
, and we are screwed. To fix this issue, we must get notified when our object changes structure as well. However, as before, since an object-property-write fast-path works on a fixed Structure
but not a fixed object, we have to put our watchpoint at Structure
level. That is, we will have a WachpointSet
on each Structure S
, asserting that it never makes further transitions to other Structures (this is called a StructureTransitionWatchpointSet in JSC).
The last interesting piece is what to do when a StructureTransitionWatchpointSet turns to Invalidated
state. If the transition happened on another object with the same Structure S
, even though our Math
object is not modified, we have no choice but to invalidate our code, as the StructureTransitionWatchpointSet for S
has been invalidated, so we have no way to get notified if our Math
object gets transitioned in the future.
However, if the transition happened on object Math
(i.e. Math
itself gets a new Structure), then it’s possible to keep our optimized code valid: we just need to start watching <S2, sqrt>
instead. So we will move our ValueReplacementWatchpoint to watch <S2, sqrt>
and our StructureTransitionWatchpoint to watch S2
, and keep our code valid[5]. In JSC, such watchpoints whose action on fire is to move themselves to new places have a terminology AdaptiveWatchpoints
.
Ending Thoughts
This way, by watching that the Math
property of the global object never changes value, and that the sqrt
property of the Math
object never changes value, the code Math.sqrt
is reduced from two object property lookups with a ton of error checks to a constant (not even a branch!) in the JIT’ed code.
The watchpoint mechanism also helps other optimizations to generate better code. For example, the call opcode (which calls whatever is stored in Math.sqrt
) has its own inline caching that records which functions it has called. For sane programs that does not mess up with the predefined objects, there will be only one callee recorded: the sqrt
intrinsic function. Normally this would allow the compiler to emit a check (that the result of expression Math.sqrt
equals the sqrt
intrinsic function) and speculatively inline sqrt
. However, since the watchpoint already tells us that Math.sqrt
must evaluates to the sqrt
intrinsic function, the compiler can do better: it may omit the check and inline sqrt
directly. Now, for sane programs, all the terrible stuffs listed at the beginning of this post are gone, so the JIT’ed code to evaluate the Math.sqrt
part is as efficient as if it were directly written in C++!
Finally, a couple of side notes:
- If we want to avoid the case that the transition of another object results in invalidation of our code, we can give our object its own unique Structure, though the downside is that we might blow up the inline cache if we do it for too many objects.
- The slow-path does not fire the watchpoint if the watchpoint is in
DoesNotExist
orClear
state. This not only saves memory, but is also an advantage for the use case above: while it’s plausible to assume that sane programs will not changeMath.sqrt
frequently, it’s also plausible for them to change it at program start (e.g., to log a warning if the input tosqrt
is negative). Since such code will execute in slow-path and before any fast-path relying on theWatchpointSet
is built, they will not invalidate theWatchpointSet
, as desired.
Acknowledgements
I thank Saam Barati from JSC team for teaching me all of these (and more) using his precious spare time, and for his valuable comments on this post. Of course, any mistakes in this post are mine.
Footnotes
Note that the
DoesNotExist
state is not listed in the enum, since in this state the object doesn’t exist at all. ↩︎In fact, JSC further distinguishes
Valid
state intoClear
andWatched
, to determine the behavior when a slow-path violation happened (see Footnote 3). However, this is only a design detail, so we put it in footnote for ease of understanding. ↩︎When the
WatchpointSet
is inClear
state, the slow-path will keep it inClear
state. However, if it is inWatched
state, even if there are no watchers, it will be transitioned toInvalidated
state. ↩︎Things get trickier if the code is already running (e.g., the code being jettisoned is the current function being executed, a function in the call stack, or even a function inlined by the current function), in which case we must OSR Exit to the baseline tier, but we will ignore such complexities in this post. ↩︎
Of course, if the ValueReplacementWatchpointSet of
<S2, sqrt>
or the StructureTransitionWatchpointSet ofS2
is alreadyInvalidated
, we will still have to invalidate our code. ↩︎