Creating a parser combinator library to parse JSON
| Prev: Better error reporting | Contents | Next: JSON objects |

How to match that?
After the first value, any comma has to be followed by another value. I don’t care about the return value of matching the comma, so I’ll use seq2 to get just the return value of the JSON value:
(seq2 (skipwhite:match-is #\,)
      json-value)
There can be many (zero or more) of those “comma followed by a value” pairs:
(many (seq2 (skipwhite:match-is #\,)
            json-value))
There does have to be a value before the first comma:
(seq json-value
     (many (seq2 (skipwhite:match-is #\,)
                 json-value)))
And the whole thing is optional, because the JSON array might be empty:
(optional (seq json-value
               (many (seq2 (skipwhite:match-is #\,)
                           json-value))))
So this is pretty close for matching a JSON array:
(= json-array
   (seq2 (match-is #\[)
         (optional (seq json-value
                        (many (seq2 (skipwhite:match-is #\,)
                                    json-value))))
         (skipwhite:match-is #\])))
Just a couple of problems. JSON arrays contain JSON values, which can in turn be, or contain, JSON arrays...
(= json-value
   (skipwhite:alt json-true
                  json-false
                  json-null
                  json-number
                  json-string
                  json-array))
But when I’m defining json-array, I haven’t defined json-value yet...
arc> (= json-array
        (seq2 (match-is #\[)
              (optional (seq json-value
                             (many (seq2 (skipwhite:match-is #\,)
                                         json-value))))
              (skipwhite:match-is #\])))
reference to undefined identifier: _json-value
Putting json-value first doesn’t help of course, since then it will be json-array that isn’t defined yet. So, I’ll need to wrap the reference to json-value in a function:
(= json-array
   (seq2 (match-is #\[)
         (optional (seq (fn (p) (json-value p))
                        (many (seq2 (skipwhite:match-is #\,)
                                    (fn (p) (json-value p))))))
         (skipwhite:match-is #\])))
Which I can make shorter with a macro:
(mac forward (parser)
  (w/uniq p
    `(fn (,p) (,parser ,p))))
Now I can easily have forward references:
(= json-array
   (seq2 (match-is #\[)
         (optional (seq forward.json-value
                        (many (seq2 (skipwhite:match-is #\,)
                                    forward.json-value))))
         (skipwhite:match-is #\])))
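To see why the wrapper helps, here is a minimal sketch that can be pasted into a fresh Arc session. It repeats the forward macro so that it stands alone; wrapper and target are hypothetical names used only for illustration and are not part of the library. The point is that the wrapped name is looked up when the wrapper is called, not when it is created.

```arc
; The forward macro, repeated from above so the snippet stands alone.
(mac forward (parser)
  (w/uniq p
    `(fn (,p) (,parser ,p))))

; target is not defined yet, but creating the wrapper is fine:
; the body of the fn is not evaluated until the fn is called.
(= wrapper (forward target))

; Now define target (a hypothetical stand-in for a parser).
(def target (p)
  (list 'called-with p))

; Calling the wrapper looks up target at this point.
(wrapper '(#\a #\b))   ; returns (called-with (#\a #\b))
```

The same late lookup is what lets json-array mention json-value before json-value has been assigned.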
Next I need to fix the return value:
arc> (show-parse json-value "[1,2,3]")
returning: ((1 (2 3)))
remaining: nil
Back when I wrote optional, if the parser matched, I put its return value in a list. Now that I’m actually using optional for the first time, it turns out I don’t want that; I want just the value. An easy fix:
(def optional (parser)
  (fn (p)
    (iflet (p2 r) (parser p)
      (return p2 r)
      (return p nil))))
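A small sketch of this optional in isolation, for a fresh Arc session. The definitions of return and match-is below are stand-ins, assumed to behave like the library’s versions from earlier chapters (a parser takes a list of characters and returns a list of the remaining input and a value, or nil on failure); they are not copied from the library. opt-a is a hypothetical name.

```arc
; Stand-ins, assumed to mirror the library's conventions.
(def return (p r) (list p r))

(def match-is (c)
  (fn (p)
    (if (is (car p) c)
        (return (cdr p) c))))

; optional as defined above: pass the parser's value through on a
; match, otherwise succeed with nil and consume nothing.
(def optional (parser)
  (fn (p)
    (iflet (p2 r) (parser p)
      (return p2 r)
      (return p nil))))

(= opt-a (optional (match-is #\a)))

(opt-a (coerce "ab" 'cons))  ; match:    ((#\b) #\a)
(opt-a (coerce "b" 'cons))   ; no match: ((#\b) nil)
```

In both cases the result is a successful parse; only the value and the remaining input differ.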
But now optional is just returning what the parser returns, so I could write it as:
(def optional (parser)
  (alt parser
       (fn (p)
         (return p nil))))
Now I get:
arc> (show-parse json-value "[1,2,3]")
returning: (1 (2 3))
remaining: nil
This is the same pattern I had before with many1: a sequence of A followed by B, and I want to cons the single item returned by A together with the list of items returned by B. I can extract a cons-seq function for that:
(def cons-seq (a b)
  (with-seq (r  a
             rs b)
    (cons r rs)))
Now many1 is:
(def many1 (parser)
  (cons-seq parser
            (many parser)))
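For readers who want to see the pattern without with-seq, here is a hand-written sketch of the same idea. cons-seq-explicit and a-then-bs are hypothetical names, and return, match-is, and many are stand-ins assumed to follow the library’s conventions (a parser returns a list of the remaining input and a value, or nil on failure); the library’s own definitions may differ.

```arc
; Stand-ins, assumed to mirror the library's conventions.
(def return (p r) (list p r))

(def match-is (c)
  (fn (p)
    (if (is (car p) c)
        (return (cdr p) c))))

; Apply parser repeatedly, collecting values until it fails.
(def many (parser)
  (fn (p)
    ((afn (p acc)
       (iflet (p2 r) (parser p)
         (self p2 (cons r acc))
         (return p (rev acc))))
     p nil)))

; Hypothetical hand-written equivalent of cons-seq: run a, then b on
; what is left, and cons a's value onto b's list of values.
(def cons-seq-explicit (a b)
  (fn (p)
    (iflet (p2 r) (a p)
      (iflet (p3 rs) (b p2)
        (return p3 (cons r rs))))))

(= a-then-bs (cons-seq-explicit (match-is #\a)
                                (many (match-is #\b))))

(a-then-bs (coerce "abbc" 'cons))  ; ((#\c) (#\a #\b #\b))
(a-then-bs (coerce "bbc" 'cons))   ; nil, since a must match first
```

The value comes back as one flat list rather than an item paired with a list, which is the shape a JSON array needs.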
And I get the right return value from a JSON array:
(= json-array
   (seq2 (match-is #\[)
         (optional (cons-seq forward.json-value
                             (many (seq2 (skipwhite:match-is #\,)
                                         forward.json-value))))
         (skipwhite:match-is #\])))
arc> (show-parse json-value "[1,2,3]")
returning: (1 2 3)
remaining: nil
arc> (fromjson «[1, ["apple", true], 3.14159]»)
(1 ("apple" t) 3.14159)
Finally, the error messages can be improved.
arc> (fromjson «[»)
not a JSON value: [
arc> (fromjson «[1,]»)
not a JSON value: [1,]
Once we see the opening bracket, we know there has to be a closing bracket, and when we see a comma, we know it has to be followed by a value:
(= json-array
   (seq2 (match-is #\[)
         (optional (cons-seq forward.json-value
                             (many (seq2 (skipwhite:match-is #\,)
                                         (must "a comma must be followed by a value"
                                               forward.json-value)))))
         (must "a JSON array must be terminated with a closing ]"
               (skipwhite:match-is #\]))))
arc> (fromjson «[»)
a JSON array must be terminated with a closing ]
arc> (fromjson «[1,]»)
a comma must be followed by a value
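The sketch below shows why wrapping a parser in must changes the message. The definition of must here is an assumption about how the version from the previous chapter behaves (raise an error with the given message when the wrapped parser fails); return and match-is are likewise stand-ins, and close-bracket is a hypothetical name.

```arc
; Stand-ins, assumed to mirror the library's conventions.
(def return (p r) (list p r))

(def match-is (c)
  (fn (p)
    (if (is (car p) c)
        (return (cdr p) c))))

; Assumed shape of must: if the parser fails, raise an error with the
; given message instead of returning nil.
(def must (message parser)
  (fn (p)
    (or (parser p)
        (err message))))

(= close-bracket
   (must "a JSON array must be terminated with a closing ]"
         (match-is #\])))

; Success passes the parser's result through unchanged.
(close-bracket (coerce "]" 'cons))

; Failure raises; on-err captures the error so that its message text
; can be inspected with details.
(on-err (fn (c) (details c))
        (fn () (close-bracket (coerce "1" 'cons))))
```

Without must, a failed match simply returns nil, so an enclosing alt moves on to its next alternative and the caller only learns that nothing matched. With must, the failure is reported at the point where the input is known to be wrong.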
Questions? Comments? Email me andrew.wilcox [at] gmail.com