It's time to bring back the char-type function we wrote:
(def chars (vec "42 0.5 1/2 foo-bar")) (def digits #{\1 \2 \3 \4 \5 \6 \7 \8 \9 \0}) (def char-names {\space :space \. :period \/ :forward-slash \" :double-quote}) (defn char-type [ch] (cond (contains? digits ch) :digit (contains? char-names ch) (get char-names ch) true :other))
One really neat thing we can do is use the partition-by function. Let's look at what it does:
(partition-by char-type chars)
If you look at the return value, you'll find a list of lists; partition-by has split up chars into smaller groups. You'll notice that any adjacent characters that have the same type are grouped together. For example, the first list is (\4 \2) because they are both digits.
It might be easier to understand with another example. Here, we partition a vector of numbers according to whether or not they are even:
(partition-by even? [1 1 3 2 4 5 7 3])
So when we do (partition-by char-type chars) it looks at each item in chars and runs the provided function, char-type, on it. If it returns the same thing that the last item did, they are grouped together; otherwise, it starts a new group.
Let's see if we can replicate what partition-by is doing ourselves. Since we generally prefer vectors over lists, let's create a vector of vectors like this: [[\4 \2] [\space] [\0] ...]. To do that, we need to make a function that takes two arguments: the result vector that we are producing, and the individual character that we're currently looking at.
We can do this by creating a loop. It needs to store two things: the overall result vector that we want to create, and the index of the character that we're currently looking at. We'll put ,,, in the body of the loop as a placeholder for now:
(loop [result [] index 0] ,,,)
It looks quite a bit like a let, except it can run multiple times, updating the result and index each time. For example, if we just want to fill result with each character and then return it, we could write this:
(loop [result [] index 0] (let [ch (get chars index) new-index (+ index 1)] (if ch (let [new-result (conj result ch)] (recur new-result new-index)) result)))
First, we get the current char. If we are past the end of chars, it will return nil. If it isn't nil, we create a let where we add it to the result vector using conj, and increment the index by one. Finally, we call recur, which will cause the loop to run again with the new values. When index is finally at the end, we just return result.
So how do we mimic the behavior of partition-by? What we want to do is store a few more things in the bindings of the loop. First, let's store the type of the last character we saw. We do this by initializing it as nil (meaning no value), and then we pass the current char's type into recur.
(loop [result [] index 0 last-type nil] (let [ch (get chars index) new-type (char-type ch) new-index (+ index 1)] (if ch (let [new-result (conj result ch)] (recur new-result new-index new-type)) result)))
Notice that nothing changed in the output; all we did is make the loop store the last character's type while it ran. Now let's make it only conj a character if its type is different than the last type:
(loop [result [] index 0 last-type nil] (let [ch (get chars index) new-type (char-type ch) new-index (+ index 1)] (if ch (if (= last-type new-type) (recur result new-index last-type) (let [new-result (conj result ch)] (recur new-result new-index new-type))) result)))
To put them into smaller groups like partition-by, we create a group vector that is initially nil. If the type changed, we make a new group with [ch]; otherwise, we add it with (conj group ch). Then, we add group to result instead of ch, so the result should now be a vector of vectors:
(loop [result [] index 0 last-type nil group nil] (let [ch (get chars index) new-type (char-type ch) new-index (+ index 1)] (if ch (if (= last-type new-type) (recur result new-index last-type (conj group ch)) (let [new-result (conj result group)] (recur new-result new-index new-type [ch]))) result)))
There are a few bugs. We aren't adding the last group, so the final \" character isn't there. We can do that by adding it to the result using conj:
(loop [result [] index 0 last-type nil group nil] (let [ch (get chars index) new-type (char-type ch) new-index (+ index 1)] (if ch (if (= last-type new-type) (recur result new-index last-type (conj group ch)) (let [new-result (conj result group)] (recur new-result new-index new-type [ch]))) (conj result group))))
Lastly, can you guess why that little nil is there in the beginning? Think about what happens when it first runs. last-type is nil, and the first character's type is :digit, so it thinks the group has changed and dutifully adds it to result. We can fix that by making sure we don't add nil groups to the result:
Previous: Local BindingsNext: Loops Continued(loop [result [] index 0 last-type nil group nil] (let [ch (get chars index) new-type (char-type ch) new-index (+ index 1)] (if ch (if (= last-type new-type) (recur result new-index last-type (conj group ch)) (let [new-result (if group (conj result group) result)] (recur new-result new-index new-type [ch]))) (conj result group))))