「こんなきれいな星も、やっぱりここまで来てから、見れたのだと思うから。だから・・もっと遠くへ・・」

The Watchpoint Mechanism in JSC

While Javascript has a simple syntax, what happens behind the scene is far from simple. For example, consider this innocent-looking hypot function below:

1
2
3
var hypot = function(a, b) {
return Math.sqrt(a * a + b * b);
}

It is clear what Math.sqrt does: it performs a square root. However, to actually execute Math.sqrt(...), a lot of steps are needed:

  1. First, get the global object, where all global variables reside in.
  2. Then, get the Math property of the global object. Normally the Math property exists (since it is predefined), but we can’t know for sure: if someone had indeed run delete Math; before, we must promptly throw out an error.
  3. Next, get the sqrt property of Math. Note that we cannot even be certain that Math is an object (as someone could have done Math = 123;). So as in (2), we must not omit any check for error.
  4. Finally, similarly, what the sqrt property contains can be anything. Even if it is a function, it could be any function. So as before, we must not omit any check, and if sqrt is indeed a Javascript function, we perform the Javascript function call.

So, in order to correctly (as every Javascript engine needs to be) execute this innocently looking Math.sqrt, a ton of stuffs must be done.

How can we make this faster?

The crucial observation is that while the programmer is technically allowed to do anything, including insane things like delete Math; or Math = 123, most sane programs will not do it. So for practical purposes, it is enough if we can make sane programs both correct and fast, while running insane programs only correctly.

In JSC (WebKit’s Javascript engine), this is achieved by Watchpoint.

Conceptually, a WatchpointSet represents a condition that one expects to be true, or simply put, a watchable condition. For example, we may expect the global object to contain property Math, and its value being equal to the predefined Math object.

One may attach Watchpoints to the WatchpointSet. A Watchpoint is essentially a callback: after attaching to a WatchpointSet, when the condition represented by the WatchpointSet becomes false, the callback is invoked (“fired”), so the owner who created the Watchpoint can react correspondingly.

While the watchpoint mechanism isn’t necessarily binded to JIT Compilation (for example, LLIntPrototypeLoadAdaptiveStructureWatchpoint works without JIT), it is most powerful when combined with JIT Compilation. We generate code that is optimized assuming the watchpoint condition holds, so inside the generated code, we don’t check for the condition at all. If the condition no longer holds, we must jettison the code – this is expensive, because all the work we did to generate the code is wasted, but the whole point of watchpoint is that such bad cases should happen only rarely.

A Motivating Example

Let’s go back to the Math.sqrt example: we want to get notified when a property of an object changes value. Therefore, all logic that writes value into object properties must cooperate with us. For simplicity, let’s assume the object Math has a Structure, say S. Then, there are two kinds of logic that may write to object properties:

  1. The C++ code that implements object property writes (the slow paths).
  2. The JIT’ed code that writes to a specific property of a specific structure (the fast paths).

The fast paths are known as inline caches. Inline caching is probably the most important optimization in JSC, but I will leave its details to another post. For the purpose of this post, it’s sufficient to think of each inline cache fast-path as a JIT-compiled piece of code that is specialized for a certain structure S and a certain property name prop. Given a value and an object with structure S, it writes value to property prop of the object.

The slow path case is easy to handle: whenever one writes to a property of an object, one checks whether there are Watchpoints watching the condition, and fire them. Of course, we are doing one extra check for every object property write. However, those code are already slow paths, so it doesn’t hurt too much to make them a bit more slower.

The fast path case is trickier. A naive solution is to add a watchpoint check, as how we handled slow-path. However, this is unsatisfactory: now, every fast-path write is doing one extra check! We can afford slowing down the slow-path, but we want to keep the fast-path fast.

So, the fast-path must not check for watchpoint conditions it violates at runtime. Instead, we permenantly invalidate any and all WatchpointSet it could violate as soon as the fast-path code is JIT’ed, no matter if there are watchers or not. As another consequence, since the fast path works on a fixed property (e.g. sqrt) of a fixed Structure (e.g. S), but not on fixed objects, our watchpoints have to be in the form of <Structure, property>: they work on Structure-level but not object-level (they are called ValueReplacementWatchpointSet in JSC). For example,when a fast-path writing the sqrt property of Structure S is built, we have to be conservative and permanently invalidate WatchpointSet <S, sqrt>, since we have no way to know if that fast-path is going to run on our Math object in the future.

The Design

This leads to the following design in JSC. A WatchpointSet has three possible states[1]:

  1. DoesNotExist: The WatchpointSet object does not physically exist (and is implicitly Valid). This is needed because there is an infinite number of watchable conditions, and also that we want to save memory. In this state, there exists no fast-path that rely on or violate the watchpoint. Slowpath executions that violate the watchpoint are not recorded (but doing so wouldn’t break the scheme).
  2. Valid[2]: The watchpoint is valid: no fast-path that may violate the watched condition has been built, and one may build fast-path relying on the watchpoint condition as long as it adds itself into the watcher list.
  3. Invalidated: The watchpoint is permaently invalidated.

As one can see from the example in the previous section, the Watchpoint system needs to handle interactions with three components:

  1. Slow-path (C++ code) that may violate the watched condition.
  2. Fast-path (JIT’ed code) that may violate the watched condition.
  3. Code (C++ or JIT’ed) that is optimized assuming the watched condition is true.

For (1), the slow-path must check in the code any watchable condition it violated, and if the corresponding WatchpointSet exists, fire all watchers. However, in such case, the slow-path have the choice between invalidate the WatchpointSet, or to keep it valid[3].

For (2), the fast-path code does not check the watchable condition it violates, but we must transit all WatchpointSets it may violate when executed to Invalidated when such a fast-path is JIT’ed (and we must create such WatchpointSet object if it does not exist yet).

For (3), we must disable the code when the watcher callback is invoked. If the code is C++ code, then disabling the codepath is as easy as flipping a flag. If the code is JIT’ed code, we must jettison the code[4].

Back to Our Example, and Adaptive Watchpoints

Unfortunately, in our example, it turns out that only watching on <Structure, Property> is not enough. While this handles writes to existing properties correctly, one may create new properties in the object, thus transitioning its Structure. Say, one did a Math.abc = 123;. Since it adds a property to Math, the object Math gets a different structure S2, but our watchpoint is watching on <S, sqrt>, and we are screwed. To fix this issue, we must get notified when our object changes structure as well. However, as before, since an object-property-write fast-path works on a fixed Structure but not a fixed object, we have to put our watchpoint at Structure level. That is, we will have a WachpointSet on each Structure S, asserting that it never makes further transitions to other Structures (this is called a StructureTransitionWatchpointSet in JSC).

The last interesting piece is what to do when a StructureTransitionWatchpointSet turns to Invalidated state. If the transition happened on another object with the same Structure S, even though our Math object is not modified, we have no choice but to invalidate our code, as the StructureTransitionWatchpointSet for S has been invalidated, so we have no way to get notified if our Math object gets transitioned in the future.

However, if the transition happened on object Math (i.e. Math itself gets a new Structure), then it’s possible to keep our optimized code valid: we just need to start watching <S2, sqrt> instead. So we will move our ValueReplacementWatchpoint to watch <S2, sqrt> and our StructureTransitionWatchpoint to watch S2, and keep our code valid[5]. In JSC, such watchpoints whose action on fire is to move themselves to new places have a terminology AdaptiveWatchpoints.

Ending Thoughts

This way, by watching that the Math property of the global object never changes value, and that the sqrt property of the Math object never changes value, the code Math.sqrt is reduced from two object property lookups with a ton of error checks to a constant (not even a branch!) in the JIT’ed code.

The watchpoint mechanism also helps other optimizations to generate better code. For example, the call opcode (which calls whatever is stored in Math.sqrt) has its own inline caching that records which functions it has called. For sane programs that does not mess up with the predefined objects, there will be only one callee recorded: the sqrt intrinsic function. Normally this would allow the compiler to emit a check (that the result of expression Math.sqrt equals the sqrt intrinsic function) and speculatively inline sqrt. However, since the watchpoint already tells us that Math.sqrt must evaluates to the sqrt intrinsic function, the compiler can do better: it may omit the check and inline sqrt directly. Now, for sane programs, all the terrible stuffs listed at the beginning of this post are gone, so the JIT’ed code to evaluate the Math.sqrt part is as efficient as if it were directly written in C++!

Finally, a couple of side notes:

  1. If we want to avoid the case that the transition of another object results in invalidation of our code, we can give our object its own unique Structure, though the downside is that we might blow up the inline cache if we do it for too many objects.
  2. The slow-path does not fire the watchpoint if the watchpoint is in DoesNotExist or Clear state. This not only saves memory, but is also an advantage for the use case above: while it’s plausible to assume that sane programs will not change Math.sqrt frequently, it’s also plausible for them to change it at program start (e.g., to log a warning if the input to sqrt is negative). Since such code will execute in slow-path and before any fast-path relying on the WatchpointSet is built, they will not invalidate the WatchpointSet, as desired.

Acknowledgements

I thank Saam Barati from JSC team for teaching me all of these (and more) using his precious spare time, and for his valuable comments on this post. Of course, any mistakes in this post are mine.


Footnotes


  1. Note that the DoesNotExist state is not listed in the enum, since in this state the object doesn’t exist at all. ↩︎

  2. In fact, JSC further distinguishes Valid state into Clear and Watched, to determine the behavior when a slow-path violation happened (see Footnote 3). However, this is only a design detail, so we put it in footnote for ease of understanding. ↩︎

  3. When the WatchpointSet is in Clear state, the slow-path will keep it in Clear state. However, if it is in Watched state, even if there are no watchers, it will be transitioned to Invalidated state. ↩︎

  4. Things get trickier if the code is already running (e.g., the code being jettisoned is the current function being executed, a function in the call stack, or even a function inlined by the current function), in which case we must OSR Exit to the baseline tier, but we will ignore such complexities in this post. ↩︎

  5. Of course, if the ValueReplacementWatchpointSet of <S2, sqrt> or the StructureTransitionWatchpointSet of S2 is already Invalidated, we will still have to invalidate our code. ↩︎