Quoting Without Confusion

Many Lisp beginners find quoting to be a foreign concept. My own confusions around quote and friends began with my first ventures into Scheme, and continue today with the occasional crazy Clojure macro. So I'd like to take some time to look deeper into quoting, one of the most Lispy of the Lisp features. It's a crucial area for budding macro writers to master, and a bit of investment here can really pay off.

The examples are all in Clojure, but many aspects apply to other Lisps with minor syntax tweaks.

quote

First and most importantly, the quote special form is one of the most basic in Lisp. [1] Let's start with an example:

user=> (quote (+ 1 2 3))
(+ 1 2 3)

So wrapping quote around a form will return that form, without evaluating it. That form could be a symbol:

user=> (quote hi)
hi

or a list of symbols [2]:

user=> (quote (hi there))
(hi there)

or any other arbitrary expression:

user=> (quote (hi there (my friends)))
(hi there (my friends))

Keep in mind that with all these lists, with all the nesting, any symbols you see are literally symbols. Evaluation, and therefore resolution of those symbols to functions or other values, is totally out of the picture.
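To make the "literally symbols" point concrete, here is a small sketch that pokes at a quoted form as ordinary data. The name form is just an illustrative choice, not anything special to Clojure.

```clojure
;; A quoted form is plain data: a list holding a symbol and three numbers.
(def form (quote (+ 1 2 3)))

(list? form)            ; true
(symbol? (first form))  ; true: + is a symbol here, not the addition function
(count form)            ; 4

;; Nothing gets evaluated until we explicitly ask for it.
(eval form)             ; 6
```

Only the explicit call to eval treats the + symbol as a reference to a function.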

I don't see a lot of explicit use of quote in actual code, though. More often, people use a shorthand:

user=> '(1 2 3)
(1 2 3)
user=> 'hi
hi

As you can see, using the single-quote character works in exactly the same way. Just remember that with both of these syntax variants, when you see an opening parenthesis, the quote applies to everything up to and including the matching closing parenthesis. [3] Once you've done this a few times, you'll get the idea.
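One way to convince yourself that the two syntaxes are the same thing is to ask the reader directly. This sketch uses read-string, which only reads text into a form and does not evaluate it.

```clojure
;; The reader turns 'x into (quote x) before evaluation ever happens.
(read-string "'hi")                 ; (quote hi)

(= (read-string "'(1 2 3)")
   (read-string "(quote (1 2 3))")) ; true

;; The quote covers the whole nested form, including vectors and maps.
'[a {:k b}]                         ; [a {:k b}]
(symbol? (first '[a {:k b}]))       ; true
```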

syntax-quote and unquote

Syntax-quote behaves much like quote, but a number of differences make it far more powerful, and potentially far more confusing. Assume for a brief moment that things work exactly the same as quote, and we'll talk about where that's not the case. First, there's no longhand analog to quote for syntax-quote. [4] You'll need to use the backtick character (`):

user=> `(1 2 3)
(1 2 3)

When you write a symbol inside a syntax-quoted form, the reader will actually resolve the symbol rather than just taking it at face value. The result is that namespace-qualified symbols are the norm in a syntax-quoted form:

user=> `(foo bar)
(user/foo user/bar)

This has important implications for macro writing that are outside the scope of this article, so this namespace qualification is worth remembering.
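A quick sketch of the difference between the two kinds of quote when applied to a single symbol. The exact namespace you see for an unresolvable symbol like foo depends on which namespace the code is read in; user is assumed here because that's where a fresh REPL starts.

```clojure
;; A symbol that resolves to an existing var picks up that var's namespace.
`map               ; clojure.core/map
(namespace `map)   ; "clojure.core"

;; A symbol that doesn't resolve gets the current namespace instead.
`foo               ; user/foo, when read in the user namespace
(name `foo)        ; "foo"

;; Plain quote leaves the symbol exactly as written.
(namespace 'map)   ; nil
```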

Probably better known is the fact that a syntax-quoted form allows for unquoting inside it:

user=> `(this ~(symbol (str "i" "s" \- "cool")))
(user/this is-cool)

We can think of the tilde (~) as saying that we bounce outside the surrounding syntax-quoted form and evaluate the following form in that context, inserting the result back where the tilde was. Note that here our result is a symbol, but it's not namespace-qualified. The tilde, just like the two quotes, applies to the form immediately following it.
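Unquote isn't limited to expressions that produce symbols. As a small sketch, here it is pulling in a local value and the result of a computation; the local n is purely illustrative.

```clojure
(let [n 5]
  ;; ~n evaluates n and drops the value into the quoted list.
  `(+ 1 ~n))          ; (clojure.core/+ 1 5)

(let [n 5]
  ;; Any expression can follow the tilde, not just a bare name.
  `(+ 1 ~(* n 2)))    ; (clojure.core/+ 1 10)
```

In both cases the + is still namespace-qualified, because it sits in the syntax-quoted part of the form rather than the unquoted part.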

The splicing unquote (~@) is similar to unquote, except that it allows multiple forms to be inserted in place of a single form:

user=> `(max ~@(shuffle (range 10)))
(clojure.core/max 4 8 5 2 9 0 6 1 3 7)

There's a bit of an analogy between splicing-unquote and apply, in that they both appear to "unroll" a collection into multiple expressions.
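Here's a sketch of that analogy side by side, using a fixed vector (the name xs is made up for the example) so the output is predictable.

```clojure
(def xs [3 9 4])

`(max ~xs)     ; (clojure.core/max [3 9 4])  one form: the vector itself
`(max ~@xs)    ; (clojure.core/max 3 9 4)    three forms, spliced in place

;; Evaluating the spliced form gives the same answer as apply.
(= (eval `(max ~@xs))
   (apply max xs))   ; true, both are 9
```

The difference is in timing: the splice builds a form with the elements written out as arguments, while apply hands the collection to the function at call time.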

Both of these unquote operations are meaningless outside of syntax-quote forms:

user=> ~(str 1 2 3)
IllegalStateException Attempting to call unbound fn: #'clojure.core/unquote  clojure.lang.Var$Unbound.throwArity (Var.java:43)
user=> ~@(list 1 2 3)
IllegalStateException Attempting to call unbound fn: #'clojure.core/unquote-splicing  clojure.lang.Var$Unbound.throwArity (Var.java:43)

Another special bit of syntax only available inside the syntax-quote is the automatic gensym, or "generated symbol" syntax:

user=> `(let [foo# 1] (+ foo# 2))
(clojure.core/let [foo__1310__auto__ 1] (clojure.core/+ foo__1310__auto__ 2))

The generated symbol corresponds to the output of clojure.core/gensym, prefixed by the name before the # sign, and multiple occurrences of the symbol within the syntax-quoted form are made to be the same. These properties can be really handy for macro writing to avoid what are known as variable capture problems. [5]
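As a rough illustration of the capture problem, consider a two-argument or written as a macro. The macro name my-or is hypothetical, and this is only a sketch of the idea, not how clojure.core/or is actually defined.

```clojure
;; v# becomes a fresh symbol that user code can't accidentally refer to.
(defmacro my-or [a b]
  `(let [v# ~a]
     (if v# v# ~b)))

(my-or 1 2)        ; 1
(my-or nil 2)      ; 2

;; The caller's own v is untouched by the macro's internal binding.
(let [v :mine]
  (my-or nil v))   ; :mine
```

Had the macro bound a plain v instead, the caller's v in the last example would be shadowed by the macro's binding of nil, and the result would be nil instead of :mine.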

mixing and matching

If things haven't been clear up until this point in the article, you may want to either re-read the preceding sections or skip ahead, because mixing up different combinations of unquoting and quoting is where things really get crazy.

What if we want a non-namespaced symbol inside a form, but we really want the power of unquoting in other places in the form? For example, we have this:

user=> `[:a ~(+ 1 1) c]
[:a 2 user/c]

We want the non-namespaced symbol c in the form instead of user/c. Incidentally, notice that symbols and parentheses aren't the only things quotes and unquotes can apply to: they work on any expression. Keeping in mind that if the result of an unquote operation is a symbol, that symbol won't be namespaced by the surrounding quoted form, the solution makes sense:

user=> `[:a ~(+ 1 1) ~'c]
[:a 2 c]

For a while, I memorized this ~' syntax by rote, and that works fine, but it's not necessary when the reasoning is obvious. Armed with that knowledge, you can probably also guess the [unimpressive] result if you were to use ~` instead:

user=> `[:a ~(+ 1 1) ~`c]
[:a 2 user/c]
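Where the ~' combination tends to earn its keep is in macros that deliberately introduce a name for the caller to use. The following is a minimal sketch; the macro with-it and the name it are invented for illustration.

```clojure
;; ~'it puts the bare symbol it into the expansion, so the body can see it.
(defmacro with-it [expr & body]
  `(let [~'it ~expr]
     ~@body))

(with-it (+ 1 2)
  (* it 10))   ; 30
```

Writing a plain it in the binding vector would not work here: syntax-quote would namespace-qualify it, and let refuses to bind a qualified symbol.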

Generally it's fine to think of the normal quote form as something like a literal, but there's more complexity underneath. Consider what happens when a quote form is nested inside a syntax-quote form:

user=> `{:a 1 :b '~(+ 1 2)}
{:a 1, :b (quote 3)}

The unquote form will still be evaluated as usual, but the result will be quoted. So if we want a plain, non-namespaced, quoted symbol inside our generated form, we need look no further:

user=> `[:a ~(+ 1 1) '~'c]
[:a 2 (quote c)]
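The '~ pairing is handy in macros that want an argument both as code and as data. This is a small sketch with a made-up macro name, expr-and-value.

```clojure
;; '~expr keeps the argument as an unevaluated form; ~expr lets it run.
(defmacro expr-and-value [expr]
  `['~expr ~expr])

(expr-and-value (+ 1 2))   ; [(+ 1 2) 3]
```

The first element is the list (+ 1 2) itself, because the unquoted form ends up wrapped in quote in the expansion, while the second is evaluated normally.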

An edge case this brings to mind that's fun to think about is quoting the result of the splicing unquote:

user=> `{:a 1 :b '~@(list 1 2)}
{:a 1, :b (quote 1 2)}

There's no good practical use I'm aware of for this, because quote only uses its first argument, but it did help me to understand the way ' forms expand to quote forms.

If you have the misfortune to deal with code that uses nested syntax-quotes, I'm sorry (and I hope I didn't write it). The expansion is confusing to read:

user=> `(1 `(2 3) 4)
(1 (clojure.core/seq (clojure.core/concat (clojure.core/list 2) (clojure.core/list 3))) 4)

And unquoting is a little tricky:

user=> `(list 1 `(2 ~(- 9 6)) 4)
(clojure.core/list 1 (clojure.core/seq (clojure.core/concat (clojure.core/list 2) (clojure.core/list (clojure.core/- 9 6)))) 4)

Since we have two levels of syntax-quoting, we'd need to bounce out two levels of unquoting in order to actually evaluate (- 9 6).

user=> `(list 1 `(2 ~~(- 9 6)) 4)
(clojure.core/list 1 (clojure.core/seq (clojure.core/concat (clojure.core/list 2) (clojure.core/list 3))) 4)

So, what the heck is going on with all the list/seq/concat madness? Don't worry too much about it, but if you work out the result, or just eval it, you'll see that the result makes sense:

user=> (eval `(list 1 `(2 ~~(- 9 6)) 4))
(1 (2 3) 4)
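One way to see where the seq/concat/list calls come from is to look at a single syntax-quoted form through read-string, which reads without evaluating. As far as I can tell, the reader rewrites the backtick form into code that builds the list, and nesting a second backtick means that code itself gets syntax-quoted. The symbol x below is just a placeholder.

```clojure
;; Reading only: this is the code the reader produces for the backtick form.
(read-string "`(2 ~x)")
;; (clojure.core/seq (clojure.core/concat (clojure.core/list 2) (clojure.core/list x)))

;; Evaluating that code, with x bound, builds the list we expected.
(let [x 3]
  `(2 ~x))   ; (2 3)
```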

Whew!

There are, of course, plenty of other combinations you could dream up to amaze and confuse your friends. Hopefully at this point you've built up enough background to solve most of the quoting confusion you'll encounter. And if not, fire up a REPL and experiment!

[1] It's one of the original seven operators required for eval. See McCarthy's original Lisp paper for more detail.

[2] Symbols are most often used to refer to functions or data in code. They have a sort of dual meaning, depending on their quoted-ness. When quoted, symbols refer to themselves, nothing more or less. When not quoted, their values are determined by local, dynamic, and namespace bindings.

[3] Same deal for square brackets and curly braces in Clojure.

[4] Although Kevin Downey has written a version of syntax-quote as a macro.

[5] Let Over Lambda and On Lisp are two great resources to learn more. Suffice it to say, the automatic gensyms and namespace qualification (coupled with the illegality of binding a namespace-qualified symbol in a let) inside syntax-quote forms are a huge help in avoiding accidental variable capture.

Colin Jones, Director of Software Services

Colin Jones is particularly interested in web security and functional programming, using languages like Clojure.