Lisp Tidbits: How the reader treats package-prefixed symbols
Here is a tidbit I came across recently relating to differences in the way the Lisp reader handles single and double package markers, i.e. the difference between MY-PACKAGE:FOO and MY-PACKAGE::FOO.
I guess most lispers know that a single-colon package marker allows you to access the external symbols of a package, whereas a double-colon marker additionally allows access to internal symbols of the package.
(defpackage #:my-package (:use #:cl) (:export #:*external*))
(in-package #:my-package)
(defvar *external* t)
(defvar *internal* nil)
(in-package #:cl-user)
my-package:*external* ; ok
my-package:*internal* ; error
my-package::*internal* ; code smell, but not an error
But that is not the point of this tidbit. The point of this tidbit is: how does the reader actually map package-prefixed strings onto symbols? Of course, it usually doesn't matter. You can know how to use package markers without ever worrying about the details of how the reader handles them, but that's what makes this (in my opinion) a Fun Lisp Tidbit™.
The juicy part of the tidbit
The gist of it is this:
- When the reader encounters MY-PACKAGE:FOO, it tries to look up FOO in the external symbols of MY-PACKAGE and signals an error if FOO is not external in MY-PACKAGE.
- When the reader encounters MY-PACKAGE::FOO, it interns FOO in MY-PACKAGE and returns the resulting interned symbol.
(I'll give you a minute to gather your socks).
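A quick way to poke at both rules without evaluating anything is to call READ-FROM-STRING directly. This is only a sketch: the scratch package TIDBIT-DEMO and the helper TRY-READ are names I made up for illustration, and the single-colon behaviour in the comments is what I'd expect from SBCL. Other implementations may signal a different condition type, which is why the helper catches any ERROR.

```lisp
;; Hypothetical scratch package with no symbols in it (the name is an assumption).
(defpackage #:tidbit-demo (:use))

(defun try-read (string)
  "Read one object from STRING. Return it, or :READER-ERROR if reading signals an error."
  (handler-case (values (read-from-string string))
    (error () :reader-error)))

;; Single colon: BAR is not external in TIDBIT-DEMO, so the reader complains
;; (on SBCL) and nothing gets interned.
(try-read "tidbit-demo:bar")
(find-symbol "BAR" "TIDBIT-DEMO")

;; Double colon: the reader interns BAR and hands back the new symbol.
(try-read "tidbit-demo::bar")
(find-symbol "BAR" "TIDBIT-DEMO")   ; second value is now :INTERNAL
```

Because the package-prefixed names only ever appear inside strings, nothing here is evaluated as a variable, so the UNBOUND-VARIABLE noise from the REPL never shows up.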
As a result, if FOO is not accessible in MY-PACKAGE, you get the following distinct errors out of SBCL.
CL-USER> my-package:foo
; Evaluation aborted on #<SB-INT:SIMPLE-READER-PACKAGE-ERROR "Symbol ~S not found in the ~A package." {1008E446B3}>.
CL-USER> my-package::foo
; Evaluation aborted on #<UNBOUND-VARIABLE FOO {1008FE77E3}>.
CL-USER> my-package:foo
; Evaluation aborted on #<SB-INT:SIMPLE-READER-PACKAGE-ERROR "The symbol ~S is not external in the ~A package." {100917AF93}>.
Note that the sequence of errors is:
- MY-PACKAGE:FOO → "Symbol FOO not found in the MY-PACKAGE package."
- MY-PACKAGE::FOO → "UNBOUND-VARIABLE FOO"
- MY-PACKAGE:FOO → "The symbol FOO is not external in the MY-PACKAGE package."
In other words, reading MY-PACKAGE::FOO interned FOO in MY-PACKAGE.
CL-USER> (find-symbol "FOO" "MY-PACKAGE")
MY-PACKAGE::FOO
:INTERNAL
You can think of MY-PACKAGE::FOO as being roughly equivalent to
(let ((*package* (find-package "MY-PACKAGE")))
foo)
except that the rebinding of *PACKAGE* happens at read time.
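To convince yourself that the interning really happens at read time, you can read a form and never evaluate it. The sketch below assumes a made-up scratch package TIDBIT-READ-TIME and a made-up variable *UNEVALUATED-FORM*; neither comes from anywhere but my imagination.

```lisp
;; Hypothetical scratch package (the name is an assumption).
(defpackage #:tidbit-read-time (:use))

;; READ-FROM-STRING only reads. The resulting list is stored, never evaluated.
(defparameter *unevaluated-form*
  (read-from-string "(when nil tidbit-read-time::never-run)"))

;; The symbol exists anyway: the second value is :INTERNAL.
(find-symbol "NEVER-RUN" "TIDBIT-READ-TIME")

;; Doing by hand what the reader did: INTERN returns the very same symbol
;; that sits in the third position of the form we read.
(eq (intern "NEVER-RUN" "TIDBIT-READ-TIME")
    (third *unevaluated-form*))
```

So the double-colon prefix behaves much like a call to INTERN that the reader makes on your behalf, long before the evaluator sees anything.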
Pourquoi?
Why does the reader treat single and double package markers differently? Why doesn't MY-PACKAGE::FOO cause the reader to signal an error if FOO isn't present (or accessible) in MY-PACKAGE?
Dunno. Maybe because it's occasionally useful to be able to bind new symbols in another package? Maybe it simplifies the implementation? Maybe it has something to do with print/read consistency, e.g. so that you can print a symbol and guarantee it can be read back later, even if the symbol is uninterned from the package in the meantime? Bit of a stretch, I know. Maybe it's just less annoying?
Maybe (like a lot of decisions in Common Lisp) the double-colon-interns rule is a compromise for partial compatibility with some pre-Common-Lisp implementation? CLTL2 mentions that Common Lisp's package system is "derived from an earlier package system developed for Lisp Machine Lisp". According to the Lisp Machine Manual:
The colon character (`:') has a special meaning to the Lisp reader. When the reader sees a colon preceded by the name of a package, it reads the next Lisp object with *package* bound to that package.
So Lisp Machine single-colon package prefixes behaved like Common Lisp's double-colon prefixes. The Lisp Machine manual goes on to say:
In Common Lisp programs, simple colon prefixes are supposed to be used only for referring to external symbols. To refer to other symbols, one is supposed to use two colons, as in chaos::lose-it-later. The Lisp machine tradition is to allow reference to any symbol with a single colon. Since this is upward compatible with what is allowed in Common Lisp, single-colon references are always allowed. However, double-colon prefixes are printed for internal symbols when Common Lisp syntax is in use, so that data printed on a Lisp Machine can be read by other Common Lisp implementations.
This concludes the pre-Common-Lisp history portion of today's tidbit.
Common Lisp ≠ C++
A final morsel. A tidbit of a tidbit. A tiny tid (just a bit). When reading a package-prefixed symbol, the reader does not special-case the current package or symbols that are already otherwise accessible [1]. If you type MY-PACKAGE:FOO, then FOO must be external in MY-PACKAGE even if MY-PACKAGE is the current package. In other words, despite the syntactic similarity, Common Lisp's package markers and external/internal symbols are only tangentially related to (say) C++'s scope resolution operator and public/private access modifiers. But you already knew that.
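If you'd like to see the "no special case for the current package" rule in action, here is a small sketch. The package TIDBIT-CURRENT, its symbol HIDDEN and the helper READ-IN-OWN-PACKAGE are all hypothetical names, and the last result is what I'd expect on SBCL rather than something I'd swear to for every implementation.

```lisp
;; Hypothetical scratch package with one internal symbol (names are assumptions).
(defpackage #:tidbit-current (:use))
(intern "HIDDEN" "TIDBIT-CURRENT")

(defun read-in-own-package (string)
  "Read STRING with TIDBIT-CURRENT as the current package.
Return the object read, or :READER-ERROR if reading signals an error."
  (let ((*package* (find-package "TIDBIT-CURRENT")))
    (handler-case (values (read-from-string string))
      (error () :reader-error))))

(read-in-own-package "hidden")                  ; unqualified: finds the symbol
(read-in-own-package "tidbit-current::hidden")  ; double colon: same symbol
(read-in-own-package "tidbit-current:hidden")   ; single colon: reader error on SBCL
```

Being "inside" the package buys you nothing once you write the prefix: the single colon still insists on an external symbol.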
References
This lisp tidbit was brought to you by section 11.3 of CLTL2 and the corresponding section 2.3.5 of the Common Lisp HyperSpec.
Footnotes:
[1] For example, if you import an internal symbol from another package.