awwx.ws

urlencode1

A version of urlencode that encodes Unicode characters as UTF-8

http://awwx.ws/urlencode1.arc:

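; Print byte value i as two uppercase hex digits, zero-padded.
(def bytehex (i)
  (if (< i 16) (writec #\0))
  (pr (upcase:coerce i 'string 16)))

; True for characters that may appear unescaped: alphanumerics
; (per alphadig) plus - _ . ~
(def urlsafe (c)
  (or alphadig.c (in c #\- #\_ #\. #\~)))

; Return the UTF-8 encoding of character c as a list of byte values.
(def charutf8 (c)
  (scheme (ac-niltree (bytes->list (string->bytes/utf-8 (string c))))))

; Encode string s: a space becomes +, safe characters pass through,
; and any other character becomes %XX for each of its UTF-8 bytes.
(def urlencode (s)
  (tostring
    (each c s
      (if (is c #\space)
           (pr #\+)
          urlsafe.c
           (pr c)
          (each i charutf8.c
            (writec #\%)
            (bytehex i))))))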
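
The helpers can also be tried on their own at the REPL. The transcript below is a small sketch with inputs chosen only for illustration; the #\é literal assumes the REPL reads its input as UTF-8:

arc> (tostring (bytehex 10))
"0A"
arc> (urlsafe #\~)
t
arc> (urlsafe #\&)
nil
arc> (charutf8 #\é)
(195 169)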

This version of urlencode handles Unicode: any character outside the safe set is converted to its UTF-8 bytes, and each byte is percent-encoded. For example, the following code displays an input form and, when the form is submitted, redirects the browser to Google with the input as the search term:

(defop example req
  (arform (fn (req)
            (string "http://www.google.com/search?q="
                    (urlencode (arg req "q"))))
    (input "q")
    (submit)))

With this version of urlencode, any non-ASCII characters you type into the input are percent-encoded as UTF-8 in the Google URL, so your search term arrives unaltered.
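
As a rough illustration of what ends up after "q=" in the URL, here is how a few sample inputs encode. The inputs are made up for this example, and the "café" literal assumes the REPL reads its input as UTF-8:

arc> (urlencode "hello world")
"hello+world"
arc> (urlencode "a&b=c")
"a%26b%3Dc"
arc> (urlencode "café")
"caf%C3%A9"

The é becomes two escapes because its UTF-8 encoding is the two bytes C3 A9.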

Acknowledgments

I originally saw a form of the safe URL character check in Anarki.

Prerequisites

This hack depends on arc3.1 and scheme0.

License

Same as Arc.

Contact me

Twitter: awwx
Email: andrew.wilcox [at] gmail.com