Change SURF/TURF syntax for sets.


I've always been slightly uncomfortable with the use of (…) for sets, e.g. (1, 2, 3), introduced by SURF. At the time it seemed natural, but mathematics uses the {…} notation for sets. That notation is already taken for maps, though, and this is almost impossible to change because we want to be able to read JSON files, which use {…} for maps (or associative arrays, that is, JSON "objects").

One problem is that we may want to use parentheses for some sort of functional notation in the future, to mimic constructor invocation, e.g. FooBar(foo, bar). Granted, maybe we could overload this, because the two uses would appear in different contexts in the grammar.

But the bigger issue dawned on me yesterday: when we introduce an expression language (which will probably be necessary to do interesting data processing/analysis), we'll have hardly any choice except to use parentheses to mean what they mean in math: simple grouping. This would conflict with their use for sets, or put another way, we couldn't use sets in expressions, such as:

The set notation would conflict with grouping parentheses; it would be hard to tell one from the other.
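As an illustration of this kind of ambiguity (an analogy only, not SURF/TURF syntax), Python uses (…) for both grouping and tuples, and has to rely on a trailing comma to tell a one-element tuple from a grouped value. The helper name describe is made up for this sketch.

```python
# Python overloads (...) for grouping and for tuples.
grouped = (1)    # grouping: this is just the integer 1
single = (1,)    # a one-element tuple needs a trailing comma
empty = ()       # the empty tuple


def describe(value):
    """Return the name of the value's type (hypothetical helper)."""
    return type(value).__name__


print(describe(grouped))  # int
print(describe(single))   # tuple
print(describe(empty))    # tuple
```

A set notation based on parentheses would presumably need a similar special rule for one-element sets inside expressions.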

So after much thought, I think we should change the set notation to {:…}, e.g. {: 1, 2, 3}. Note the colon : introducing the set.

  • Now we're using the same delimiters that math uses.

  • There is no conflict with maps, because a map can never begin with a colon.

  • The parsing might be a tiny bit more complicated: we'd have to wait until we're inside the {…} to know if it's a set or a map.

  • Logically a map is a set—a set of map entries!

  • Python also overloads {…} for both set and dict, I think looking for the association colon to differentiate the two. But it runs into problems knowing which is which if the construct is empty: {} is an empty dict, so an empty set has to be written set(). Plus we don't want to parse the entire first "thing" before we find out if the construct is a set or a map. So the : gets us out of the predicament Python got itself into.
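The parsing point in the list above can be sketched in a few lines of Python. This is only a rough illustration, not the actual SURF/TURF parser: the function name classify_braced is hypothetical, and it assumes that the colon must immediately follow the opening brace and that {:} would denote the empty set.

```python
def classify_braced(text: str, start: int = 0) -> str:
    """Return "set" or "map" for the braced construct at text[start].

    Hypothetical helper: assumes the set marker ':' comes directly
    after '{', with no whitespace in between.
    """
    if start >= len(text) or text[start] != "{":
        raise ValueError(f"expected '{{' at position {start}")
    # One character of lookahead is enough; there is no need to parse
    # the first entry before deciding between set and map.
    next_index = start + 1
    if next_index < len(text) and text[next_index] == ":":
        return "set"
    return "map"


print(classify_braced("{: 1, 2, 3}"))  # set
print(classify_braced('{"a": 1}'))     # map
print(classify_braced("{:}"))          # set (assumed empty-set form)
print(classify_braced("{}"))           # map (empty map, as in JSON)
```

Because the decision is made on the character right after the brace, the empty case is never ambiguous, which is the problem Python has with {}.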

So the hypothetical expression language example above would become:

And if it seems that : is a bit arbitrary, consider that in math, set builder/comprehension notation is {…|…} or {…:…}. So we can consider {:…} to be a non-comprehension version of a set. But it gets better: either in TURF in the future, or in some new expression language, we can actually add set comprehensions!

This should not conflict with SURF/TURF maps (JSON objects).

We could then add list comprehensions, too!

Luckily frees up the # symbol to be used for bound variables in comprehensions.
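For comparison, Python already has both set and list comprehensions. The sketch below shows only the Python analogue of the math set-builder form; it says nothing about what the eventual TURF or expression-language syntax would look like, and the variable names are made up.

```python
# Set-builder {x*x | x in source, x even} as a Python set comprehension.
source = {0, 1, 2, 3}
even_squares = {x * x for x in source if x % 2 == 0}

# The list analogue keeps order and duplicates.
lengths = [len(word) for word in ["map", "set", "list"]]

print(even_squares == {0, 4})  # True
print(lengths)                 # [3, 3, 4]
```

In both forms x and word play the role of the bound variable of the comprehension.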




Garret Wilson

