Lisp Tidbits: How the reader treats package-prefixed symbols

Here is a tidbit I came across recently about how the Lisp reader treats single and double package markers differently, i.e. MY-PACKAGE:FOO vs. MY-PACKAGE::FOO.

I guess most lispers know that a single-colon package marker allows you to access the external symbols of a package, whereas a double-colon marker additionally allows access to internal symbols of the package.

(defpackage #:my-package (:use #:cl) (:export #:*external*))
(in-package #:my-package)
(defvar *external* t)
(defvar *internal* nil)
(in-package #:cl-user)

my-package:*external*  ; ok
my-package:*internal*  ; error
my-package::*internal* ; code smell, but not an error

But that is not the point of this tidbit. The point of this tidbit is: how does the reader actually map package-prefixed strings onto symbols? Of course, it usually doesn't matter. You can know how to use package markers without ever worrying about the details of how the reader handles them, but that's what makes this (in my opinion) a Fun Lisp Tidbit™.

The juicy part of the tidbit

The gist of it is this:

  1. when the reader encounters MY-PACKAGE:FOO, it tries to look up FOO among the external symbols of MY-PACKAGE and signals an error if FOO is not external in MY-PACKAGE.
  2. when the reader encounters MY-PACKAGE::FOO, it interns FOO in MY-PACKAGE and returns the resulting interned symbol.

(I'll give you a minute to gather your socks).
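
The two rules can be checked without going through the REPL's evaluator by calling the reader directly. What follows is a minimal sketch, not a definitive test: the package TIDBIT-DEMO and the helper TRY-READ are hypothetical names made up for this illustration, and it assumes the implementation signals an error for a failed single-colon lookup, as SBCL does.

```lisp
;; Hypothetical scratch package, used only for this demonstration.
(defpackage #:tidbit-demo (:use #:cl))

(defun try-read (string)
  "Read one object from STRING; return :READ-ERROR if the reader signals an error."
  (handler-case (values (read-from-string string))
    (error () :read-error)))

;; Before anything is read, BAR is not present in TIDBIT-DEMO.
(find-symbol "BAR" "TIDBIT-DEMO")   ; NIL, NIL

;; Single colon: lookup only. The read fails and nothing is interned.
(try-read "tidbit-demo:bar")        ; :READ-ERROR
(find-symbol "BAR" "TIDBIT-DEMO")   ; still NIL, NIL

;; Double colon: the read succeeds and interns BAR as a side effect.
(try-read "tidbit-demo::bar")
(find-symbol "BAR" "TIDBIT-DEMO")   ; TIDBIT-DEMO::BAR, :INTERNAL
```

Passing the text as a string matters here: written as bare symbols in the source, the qualified names would be handled by the reader before any of the surrounding code had a chance to run.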

As a result, if FOO is not accessible in MY-PACKAGE, you get the following distinct errors out of SBCL.

CL-USER> my-package:foo
; Evaluation aborted on #<SB-INT:SIMPLE-READER-PACKAGE-ERROR "Symbol ~S not found in the ~A package." {1008E446B3}>.
CL-USER> my-package::foo
; Evaluation aborted on #<UNBOUND-VARIABLE FOO {1008FE77E3}>.
CL-USER> my-package:foo
; Evaluation aborted on #<SB-INT:SIMPLE-READER-PACKAGE-ERROR "The symbol ~S is not external in the ~A package." {100917AF93}>.

Note that the sequence of errors is

  1. MY-PACKAGE:FOO → "Symbol FOO not found in the MY-PACKAGE package"
  2. MY-PACKAGE::FOO → "UNBOUND-VARIABLE FOO"
  3. MY-PACKAGE:FOO → "The symbol FOO is not external in the MY-PACKAGE package."

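In other words, reading MY-PACKAGE::FOO interned FOO in MY-PACKAGE. The UNBOUND-VARIABLE error in step 2 comes from the evaluator, not the reader: the read succeeded, and the freshly interned symbol simply has no value.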

CL-USER> (find-symbol "FOO" "MY-PACKAGE")
MY-PACKAGE::FOO
:INTERNAL

You can think of MY-PACKAGE::FOO as being roughly equivalent to

(let ((*package* (find-package "MY-PACKAGE")))
  foo)

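except that the rebinding of *PACKAGE* happens at read time. The LET form is only a mnemonic: evaluated as written, FOO would already have been read in the current package before the binding took effect.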
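
To make the read-time version of that analogy concrete, here is a small sketch that binds *PACKAGE* around an explicit call to the reader. The package TIDBIT-DEMO-2 and the function READ-IN-PACKAGE are hypothetical names invented for the example.

```lisp
;; Hypothetical scratch package, used only for this demonstration.
(defpackage #:tidbit-demo-2 (:use #:cl))

(defun read-in-package (string package-name)
  "Read one object from STRING with *PACKAGE* bound to the package named PACKAGE-NAME."
  (let ((*package* (find-package package-name)))
    (values (read-from-string string))))

;; Reading the unqualified name inside the package and reading the
;; double-colon form from outside it yield the very same symbol.
(eq (read-in-package "quux" "TIDBIT-DEMO-2")
    (read-from-string "tidbit-demo-2::quux"))   ; T
```

The equivalence is still only rough: binding *PACKAGE* affects every symbol in the object being read, whereas in standard Common Lisp syntax the double-colon prefix applies to a single symbol token.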

Pourquoi?

Why does the reader treat single and double package markers differently? Why doesn't MY-PACKAGE::FOO cause the reader to signal an error if FOO isn't present (or accessible) in MY-PACKAGE?

Dunno. Maybe because it's occasionally useful to be able to intern new symbols in another package? Maybe it simplifies the implementation? Maybe it has something to do with print/read consistency, e.g. so that you can print a symbol and guarantee it can be read back later, even if the symbol is uninterned from the package in the meantime? Bit of a stretch, I know. Maybe it's just less annoying?

Maybe (like a lot of decisions in Common Lisp) the double-colon-interns rule is a compromise for partial compatibility with some pre-Common-Lisp implementation? CLTL2 mentions that Common Lisp's package system is "derived from an earlier package system developed for Lisp Machine Lisp". According to the Lisp Machine Manual:

The colon character (`:') has a special meaning to the Lisp reader. When the reader sees a colon preceded by the name of a package, it reads the next Lisp object with *package* bound to that package.

So Lisp Machine single-colon package prefixes behaved like Common Lisp's double-colon prefixes. The Lisp Machine manual goes on to say:

In Common Lisp programs, simple colon prefixes are supposed to be used only for referring to external symbols. To refer to other symbols, one is supposed to use two colons, as in chaos::lose-it-later. The Lisp machine tradition is to allow reference to any symbol with a single colon. Since this is upward compatible with what is allowed in Common Lisp, single-colon references are always allowed. However, double-colon prefixes are printed for internal symbols when Common Lisp syntax is in use, so that data printed on a Lisp Machine can be read by other Common Lisp implementations.

This concludes the pre-Common-Lisp history portion of today's tidbit.

Common Lisp ≠ C++

A final morsel. A tidbit of a tidbit. A tiny tid (just a bit). When reading a package-prefixed symbol, the reader does not special-case the current package or symbols that are already otherwise accessible [1]. If you type MY-PACKAGE:FOO, then FOO must be external in MY-PACKAGE even if MY-PACKAGE is the current package. In other words, despite the syntactic similarity, Common Lisp's package markers and external/internal symbols are only tangentially related to (say) C++'s scope resolution operator and public/private access modifiers. But you already knew that.
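
A quick way to see this for yourself is to read a qualified name while *PACKAGE* is bound to the very package being named. This is a sketch under the same assumption as before (the implementation signals an error for a non-external single-colon reference, as SBCL does); TIDBIT-DEMO-3 and READ-INSIDE-DEMO are hypothetical names for illustration only.

```lisp
;; Hypothetical scratch package with one internal, unexported symbol.
(defpackage #:tidbit-demo-3 (:use #:cl))
(intern "LOCAL" "TIDBIT-DEMO-3")

(defun read-inside-demo (string)
  "Read STRING as if TIDBIT-DEMO-3 were the current package.
Return :READ-ERROR if the reader signals an error."
  (let ((*package* (find-package "TIDBIT-DEMO-3")))
    (handler-case (values (read-from-string string))
      (error () :read-error))))

(read-inside-demo "local")                 ; the internal symbol LOCAL
(read-inside-demo "tidbit-demo-3::local")  ; the same symbol
(read-inside-demo "tidbit-demo-3:local")   ; :READ-ERROR, despite being "at home"
```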

References

This Lisp tidbit was brought to you by section 11.3 of CLtL2 and the corresponding section 2.3.5 of the Common Lisp HyperSpec.

Footnotes:

[1] For example, if you import an internal symbol from another package.


Created: 2020-07-16

Last modified: 2021-02-20