Namespaces

Let's bring back our new regex-based tokenizer. We'll put our loop in a function called read-tokens as well:


(defn read-next-token [text]
  (or ; spaces
      (re-find #"^ +" text)
      ; float
      (re-find #"^\d+\.\d+" text)
      ; ratio
      (re-find #"^\d+/\d+" text)
      ; integer
      (re-find #"^\d+" text)
      ; symbol
      (re-find #"^[^\d ]+" text)))

(defn read-tokens [text]
  (loop [result []
         index 0]
    (let [sub-text (subs text index)
          token (read-next-token sub-text)]
      (if token
        (recur (conj result token) (+ index (count token)))
        result))))

And now we can call it like this:


(def text "42 0.5 1/2 foo-bar")

(read-tokens text)

While this is a great start, a real reader needs to convert those strings into the values they represent. In other words, "42" should be a number object, and "foo-bar" should be a symbol object. Right now, both are still just strings.
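To make the difference concrete, here is a small sketch that checks values with predicates from clojure.core. The name tokens is just an illustrative stand-in for two of the strings our tokenizer returns.

```clojure
;; Two tokens as the tokenizer returns them: plain strings.
;; (`tokens` is a hypothetical name used only for this illustration.)
(def tokens ["42" "foo-bar"])

(every? string? tokens)   ; true, every token is a string
(number? "42")            ; false, a string of digits is not a number
(number? 42)              ; true
(symbol? "foo-bar")       ; false
(symbol? 'foo-bar)        ; true, quoting gives us the symbol itself
```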

We can fix this, but first we need to confront something we haven't dealt with explicitly yet, even though it is a very important part of Clojure: namespaces.

All vars live in a namespace, whose name is a symbol that usually contains one or more periods. The primary namespace of Clojure, which is available by default, is clojure.core, which contains all the functions we have used so far.
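As a rough illustration, assuming Clojure on the JVM, we can refer to a function we already know by its full name, with the namespace in front of a slash:

```clojure
;; `map` and `clojure.core/map` refer to the same function.
(= (map inc [1 2 3])
   (clojure.core/map inc [1 2 3]))   ; true

;; A qualified symbol has a namespace part and a name part.
(namespace 'clojure.core/map)        ; "clojure.core"
(name 'clojure.core/map)             ; "map"

;; #' asks for the var itself, which prints with its namespace.
#'map                                ; #'clojure.core/map
```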

To successfully convert a string to a number in Clojure, the easiest approach is to use a function called read-string. While there is one in clojure.core, it is not particularly safe to use because it can execute code. There is a more restricted version in the clojure.edn namespace that we want to use.

This illustrates an important point: namespaces allow functions with the same name to exist side by side without clashing. If we want to use functions from clojure.edn, we first need to load the namespace with the require function:

(require 'clojure.edn)

Notice the quoted symbol, which we've seen before. Now we can call the read-string function from there like this:

(clojure.edn/read-string "42")

This can be annoying to type, so it's possible to assign a shorter name to the namespace, called an alias, like this:

(require '[clojure.edn :as edn])

(edn/read-string "42")
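To see why the restricted version matters, here is a minimal sketch, assuming Clojure on the JVM with its default reader settings. The :rejected keyword is only a marker chosen for this example.

```clojure
(require '[clojure.edn :as edn])

;; Both functions are named read-string but live in different namespaces.
(read-string "42")          ; clojure.core version, returns 42
(edn/read-string "42")      ; clojure.edn version, returns 42

;; By default, the core reader evaluates a form prefixed with #=.
(read-string "#=(+ 1 2)")   ; returns 3, so code ran while reading

;; The edn reader has no such feature and throws instead.
(try
  (edn/read-string "#=(+ 1 2)")
  (catch Exception e
    :rejected))             ; returns :rejected
```

This is why untrusted text is better handed to the clojure.edn version.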

These particular examples can't run in a browser, because ClojureScript doesn't have that namespace. It does, however, have a similar one, called cljs.reader, that we can use:


(cljs.reader/read-string "42")

Now we want to run this function on everything that read-tokens returned. We can do that with map:


(map cljs.reader/read-string (read-tokens text))

Notice that the result contains several nils, because read-string returns nil when you give it only whitespace. We just want to remove those. One way is to use the remove function we saw previously:


(remove nil? (map cljs.reader/read-string (read-tokens text)))

However, there's an even easier way to do this using keep, which works like map but leaves out any nil results:


(keep cljs.reader/read-string (read-tokens text))
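For comparison, here is a sketch of the same three steps for Clojure on the JVM, using clojure.edn in place of cljs.reader. To keep it self-contained, the vector tokens is written out by hand as what read-tokens produces for our text.

```clojure
(require '[clojure.edn :as edn])

;; What read-tokens returns for "42 0.5 1/2 foo-bar", written out by hand.
(def tokens ["42" " " "0.5" " " "1/2" " " "foo-bar"])

(map edn/read-string tokens)                 ; (42 nil 0.5 nil 1/2 nil foo-bar)
(remove nil? (map edn/read-string tokens))   ; (42 0.5 1/2 foo-bar)
(keep edn/read-string tokens)                ; (42 0.5 1/2 foo-bar)

;; The results are real values now, not strings.
(map type (keep edn/read-string tokens))
;; (java.lang.Long java.lang.Double clojure.lang.Ratio clojure.lang.Symbol)
```

One detail worth knowing: keep only drops nil, so a result of false would stay in the sequence.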
