The JScript Type System, Part Five: More On Arrays In JScript .NET

As I was saying the other day, CLR arrays and JScript arrays are totally different beasts. It is hard to imagine two things being so different and yet both called the same thing. Why did the CLR designers and the JScript designers start with the same desire, to create an array system, and come up with completely different implementations?

Well, the CLR implementers knew that dense, nonassociative, hard-typed arrays are easy to make fast and efficient. Furthermore, such arrays encourage the programmer to keep homogeneous data in strictly bounded tables. That makes large programs that do lots of data manipulation easier to understand. Thus, languages such as C++, C# and Visual Basic have arrays like this, and thus they are the basic built-in array type in the CLR.

Sparse, associative, soft-typed arrays are not particularly fast, but they are far more dynamic and flexible than Visual Basic-style arrays. They make it easy to store heterogeneous data in any table without worrying about picky details like exactly how big that table is. In other words, they are scripty. Languages such as JScript and Perl have arrays like this.
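To make "scripty" concrete, here is a minimal sketch of a sparse, soft-typed, associative array. It is written in plain JavaScript syntax that should also be acceptable to JScript; the variable names are invented for illustration.

```javascript
// A scripty array: no declared size, no declared element type.
var scripty = [];
scripty[0] = "hello";          // a string...
scripty[1] = 42;               // ...next to a number: heterogeneous data
scripty[1000] = new Date(0);   // grows on demand; indices 2..999 are holes

// length is the highest index plus one, not the number of stored elements.
var reportedLength = scripty.length;          // 1001

// Count the slots that actually exist.
var storedCount = 0;
for (var key in scripty) {
    if (scripty.hasOwnProperty(key)) storedCount++;
}
// storedCount is 3

// Associative: an array is also an object, so named keys work too,
// and they do not affect length.
scripty["greeting"] = "hi";
var lengthAfterNamedKey = scripty.length;     // still 1001
```

None of this is possible with a dense CLR array, whose size and element type are fixed when it is allocated.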

 

JScript .NET has both very dynamic, scripty arrays and more strict CLR arrays, making it suitable for both rapid development of scripts and programming in the large. But like I said, making these two very different kinds of arrays work well together is not trivial.

JScript .NET supports the creation of multidimensional hard-typed arrays. As with single-dimensional arrays, the array size is not part of the type. To annotate a variable as containing a hard-typed multidimensional array, follow the type with brackets containing commas. For example, to annotate a variable as containing a two-dimensional array of Strings you would say:

var multiarr : String[,];

The number of commas between the brackets plus one is equal to the rank of the array. (By this definition, if there are no commas between the brackets then it is a rank-one array, as we have already seen.)

A multidimensional array is allocated with the new keyword, as you might expect:

multiarr = new String[4,5];

multiarr[0,0] = "hello";

Notice that hard-typed array elements are always accessed with a comma-separated list of integer indices. There must always be exactly one index for each dimension in the array. You can’t use the ragged array syntax [0][0].
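For contrast, a scripty JScript array has no rank at all, so a table is usually built as an array of arrays and indexed with chained brackets. The sketch below is plain JavaScript syntax; makeTable is a hypothetical helper invented for this illustration, not a built-in.

```javascript
// makeTable is a hypothetical helper: it builds an array of row arrays.
function makeTable(rows, cols, fill) {
    var table = [];
    for (var r = 0; r < rows; r++) {
        table[r] = [];
        for (var c = 0; c < cols; c++) {
            table[r][c] = fill;
        }
    }
    return table;
}

var table = makeTable(4, 5, "");
table[0][0] = "hello";     // chained indexing: one bracket pair per level
table[3].push("extra");    // rows are independent, so the table can go ragged

var firstRowLength = table[0].length;   // 5
var lastRowLength = table[3].length;    // 6
```

Because each row is its own array, nothing keeps the table rectangular, which is exactly what a hard-typed rank-two array rules out.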

 

There are certain situations in which you know that a variable or function argument will refer to a hard-typed CLR array but you do not actually know the element type or the rank, just that it is an array. Should you find yourself in one of these (rather rare) situations, there is a special annotation for a CLR array of unknown type and rank:

var sysarr : System.Array;

sysarr = new String[4,5];

sysarr = new double[10];

As you can see, a variable of type System.Array may hold any CLR array of any type and rank. However, there is a drawback. Variables of type System.Array may not be indexed directly because the rank is not known. This is illegal:

var sysarr : System.Array;

sysarr = new String[4,5];

sysarr[1,2] = "hello";  // ILLEGAL, System.Arrays are not indexable

Rather, to index a System.Array you must call the GetValue and SetValue methods with an array of indices:

var sysarr : System.Array;

sysarr = new String[4,5];

sysarr.SetValue("hello", [1,2]);

The rank and size of a System.Array can be determined with the Rank, GetLowerBound and GetUpperBound members.

Thinking about this a bit now, I suppose that we could have detected at compile time that a System.Array was being indexed, and constructed the call to the getter/setter appropriately for you, behind the scenes. But apparently we didn’t. Oh well.

Next time: mixing and matching JScript and CLR arrays.

Tags JScript JScript .NET Scripting

Comments (5)


Dan Shappir (November 13, 2003 at 3:55 am)

So far the only potential downside I see in this design is that you cannot write a generic function that works for both JScript arrays and CLR arrays if the arrays have a rank higher than 1. I assume the following function will work on both types of arrays:

function sum(arr, length) {
    var result = 0;
    for (var i = 0; i < length; ++i) result += arr[i];
    return result;
}

Anyway, you could not write such a function if arr was, say, two dimensional.

Would have been even nicer if CLR arrays in JScript were somehow made to support the length property if they were one dimensional.

BTW, I can’t resist, using the BeyondJS JavaScript library you could write the above code as:

var sum = arr.fold("+");

Eric Lippert (November 13, 2003 at 11:14 am)

> I assume the following function will work on both types of arrays:

Indeed.

> you cannot write a generic function that works for both JScript arrays and CLR arrays if the arrays have a rank higher than 1

Yep, but there are no JScript arrays with rank higher than one, so basically this is saying that you can’t write a generic function that handles arrays of different ranks. But wait a minute, that is what System.Array is for! That is, those rare cases where you don’t know the rank at compile time.

> Would have been even nicer if CLR arrays in JScript were somehow made to support the length property if they were one dimensional.

Dude, wait for it. I said I’d discuss interoperability in my NEXT blog! 🙂

> var sum = arr.fold("+");

I assume that your fold operator calls eval if the thing passed in is not a function object?

Dan Shappir (November 13, 2003 at 12:23 pm)

> Yep, but there are no JScript arrays with rank higher than one

Technically you are correct, but practically you simply create an array of arrays. And the resulting syntax looks just like C++ or Java. That was my point actually, that for a 2D JScript array you write a[1][2] while for a CLR array you write a[1,2].

> that is what System.Array is for

You misunderstood me. I wasn’t looking to write a function that would work for any rank. I was looking for a function that would work for, say, a 2D JScript array and a 2D CLR array. While I fully understand the reasons you chose the indexing syntax used for multi-dimensional CLR arrays, I simply pointed out that as a result they are not polymorphic with multi-dimensional JScript arrays.

> Dude, wait for it.

You caught me, I’m the impatient type 😉

> I assume that your fold operator calls eval if the thing passed in is not a function object?

BeyondJS implements a mechanism of converting strings to functions: "+".toFunction() will generate a binary function, and "!".toFunction() will generate a unary function. You can also do "-".toFunctionUnary() or "-".toFunctionBinary() to control which version is generated. Here is the implementation:

String.prototype.toFunctionUnary = function() {
    eval("function __unary__(op) { return " + this + " op; }");
    __unary__.op = this.valueOf();
    return __unary__;
};
String.prototype.toFunctionBinary = function() {
    eval("function __binary__(op1, op2) { return op1 " + this + " op2; }");
    __binary__.op = this.valueOf();
    return __binary__;
};
String.prototype.toFunction = function() {
    return ",!,~,++,--,new,delete,typeof,void,".indexOf("," + this + ",") > -1 ?
        this.toFunctionUnary() : this.toFunctionBinary();
};

Eric Lippert (November 13, 2003 at 12:51 pm)

Yeah, there’s no interoperation between ragged arrays and two-d arrays. But there is no interoperation between ragged CLR arrays and two-d CLR arrays either! Ragged arrays and rectangular arrays are pretty much separate concepts. In fact, the whole notion of rank of a ragged array is ill-defined: you can have a ragged array that is 3-d in some axes, 2-d in others, 1-d in still others, etc. There is no sensible notion of “rank”, so making them interoperate is more trouble than it’s worth.

Your implementation is pretty slick. (A less functional but perhaps more performant approach would be to generate all the unary and binary operator functions once and put them in a lookup table, rather than searching that string every single time and reconstructing the function object every single time.)

Dan Shappir (November 13, 2003 at 3:24 pm)

You are quite correct about both points.

With regard to ragged arrays: I always found it amusing that C++ employs the same exact syntax for accessing ragged and contiguous arrays. So a[1][2] would generate wildly different code based on the definition of a. OTOH it did buy you that polymorphic behavior I mentioned before.

With regard to BeyondJS, our motivation was always functionality, with performance a consideration but not more. Anyway, fold generates the function once, and then applies it iteratively to all the members. So the performance hit of generating a new function every time is relatively minor when compared to the cost of the loop.
