Why do the script engines not cache dispatch identifiers?

You COM programmers out there are intimately familiar with the IDispatch interface, I’m sure. To briefly sum up for the rest of you, the point of IDispatch is that it allows a caller to call a function without actually knowing at compile time any details about the name or signature of that function. The caller passes a method name to IDispatch::GetIdsOfNames to get a dispatch identifier — an integer — which identifies the method, and then calls IDispatch::Invoke with the dispid and the arguments. The implementation of Invoke is then responsible for analyzing the arguments and calling the actual function on the object’s virtual function table.

This is how VBScript and JScript work. When you say

Set Frob = CreateObject("BitBucket.Froboznicator") 
Frob.Gnusto 123, "skidoo"

what VBScript actually does is pass the string "Gnusto" to GetIdsOfNames, and then passes the dispid and arguments to Invoke. The object then does the work of actually translating that into a real function call. This is how we get away with not having to write a down-to-the-machine-code compiler for VBScript and JScript.

One of the fundamental rules of OLE Automation is that for any object, any two calls to GetIdsOfNames with the same string must return the same number. This is so that callers can fetch the dispid once and reuse it, rather than having to fetch it anew every time they try to invoke the dispatch object.

VBScript and JScript do not cache the dispid. If you say

Set Frob = CreateObject("BitBucket.Froboznicator") 
Frob.Gnusto 123, "skidoo"
Frob.Gnusto 10, "lords a leaping"

then the script engine will call GetIdsOfNames twice, and get the same value both times.

Surely this is wasteful. Can’t we cache that thing? I mean, it is a small optimization; that call to Invoke is going to dwarf the expense of the GetIdsOfNames. It just seems like there ought to be something we could do here.

Appearances can be deceiving.

You might first think that every time we see Frob.Gnusto we can use the dispid we grabbed the first time and reuse it. For example:

Frob.Gnusto 123, "skidoo" 
' OK, Gnusto is dispid 0x1111.  Add to cache "Frob.Gnusto", 0x1111
Frob.Gnusto 10, "lords a leaping"  
' Look up "Frob.Gnusto" in cache, aha, it is 0x1111

This doesn’t work. Consider this scenario:

Set Frob = CreateObject("BitBucket.Froboznicator") 
Frob.Gnusto 123, "skidoo"
Set Frob = CreateObject("MushySoft.Gronker") 
Frob.Gnusto 10, "lords a leaping"  

Two objects sharing the same variable but with different types may have the same method name but different dispids. We would have to invalidate our cache every time the variable was changed, which is a lot of work to save very little time.

There are other reasons why this doesn’t work. Consider this JScript code:

var frob = new ActiveXObject("BitBucket.Froboznicator"); 
var frab = new ActiveXObject("BitBucket.Flouncer"); 
frob.gnusto(123, "skidoo"); // Cache "frob.gnusto", 0x1111 
with(frab) 
{ 
  frob.gnusto(10, "lords a leaping"); 
} 

Is the call to frab.frob.gnusto or frob.gnusto? The object in frab might have a method frob which has a method gnusto which has a different dispid.

Similarly, there are problems with multiple scopes where local variables may shadow globals. It’s a huge mess. The net result is that we can’t cache dispids against variable names. The variable names are just too ephemeral.

What about the object pointers? Every object has a unique 32 bit pointer associated with it, right? Why not cache the dispids against the pointer-and-method-name pair?

Unfortunately, that doesn’t work either. Suppose object Frob is at address 0x12345000 in memory.

Frob.Gnusto 123, "skidoo" 
' Gnusto is dispid 0x1111. Add to cache "0x12340000.Gnusto", 0x1111
Frob.Gnusto 10, "lords a leaping"  
' Look up "0x12340000.Gnusto" in cache, aha, it is 0x1111

This might look fine, but again, it isn’t:

Set Frob = CreateObject("BitBucket.Froboznicator") 
Frob.Gnusto 123, "skidoo" 
Set Frob = Nothing
Set Frob = CreateObject("MushySoft.Gronker") 
Frob.Gnusto 10, "lords a leaping"  

What happened on that fourth line of code there? We previously threw away the only reference to the current value of Frob, freeing the memory. Then the operating system went and created a new object. The operating system could be very smart about memory re-use. It knows that it has a perfectly good free pointer at 0x12340000, so it might re-use it. Now our cache needs to be invalidated every time an object is freed! We must keep track of when every single object is freed — basically, we have to write a garbage collector for arbitrary COM objects!

To cache dispids you need to ensure that the lifetime of the object is greater than the lifetime of the cache. But object lifetimes are so unpredictable that it is very hard to know when the cache is invalid. We once considered adding dispid caching to those objects where we knew that the objects would live longer than the script — the window object in IE, or the Response object in IIS, but rejected the proposal as too much complication for very little performance gain.

Advertisements

3 thoughts on “Why do the script engines not cache dispatch identifiers?

  1. Pingback: Porting old posts, part 1 | Fabulous adventures in coding

  2. > What happened on that third line of code there?

    It’s first a new MushySoft.Gronker is allocated/assigned *then* the previous instance is released/deallocated not the other way around and this misconception is very popular :-)) But your whole point still holds about matching addresses, probably by first explicitly setting Frob to Nothing.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s