awwx.ws

parseurl0

Parse a URL into its component parts: scheme, host and port, path, query, and fragment.

http://awwx.ws/parseurl0.arc:

; regex from http://search.cpan.org/~gaas/URI-1.51/URI.pm#PARSING_URIs_WITH_REGEXP

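(def parse-url (url)
  ; The regex groups capture, in order: scheme, authority ("host:port"),
  ; path, query, and fragment.
  (withs ((scheme hostport path query frag)
          (re-match
           "(?:([^:/?#]+):)?(?://([^/?#]*))?([^?#]*)(?:\\?([^#]*))?(?:#(.*))?"
           url)
          ; Split the authority on ":" into host and optional port.
          (host port) (tokens hostport #\:))
    (obj scheme (sym scheme)
         hostport hostport
         host host
         ; int fails on a missing or non-numeric port; errsafe turns that into nil.
         port (errsafe:int port)
         path path
         query query
         frag frag)))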

Examples

arc> (parse-url "http://example.com:8080/")
#hash((path . "/") (port . 8080) (host . "example.com") (hostport . "example.com:8080") (scheme . http))

Note that a default port is not provided: if the URL has no explicit port, the result has no port entry, as in the next example.
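
If a caller wants a default port, one possible approach is sketched below. url-port is a hypothetical helper, not part of parseurl0, and the numbers are just the conventional defaults for http and https. It assumes parseurl0.arc has been loaded so that parse-url is defined.

```arc
; Hypothetical helper (not part of parseurl0): return the explicit port
; if the URL had one, otherwise a conventional default for the scheme.
; Assumes parseurl0.arc is loaded so that parse-url is defined.
(def url-port (u)
  (or u!port
      (case u!scheme
        http  80
        https 443)))

; (url-port (parse-url "http://example.com/"))       returns 80
; (url-port (parse-url "http://example.com:8080/"))  returns 8080
```

For any other scheme the sketch returns nil, which keeps "no port known" distinguishable from a real port number.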

arc> (parse-url "http://example.com/foo/bar?a=1&b=2#here")
#hash((frag . "here") (query . "a=1&b=2") (path . "/foo/bar") (host . "example.com") (hostport . "example.com") (scheme . http))
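
The query comes back as a single string. A minimal sketch of breaking it into name/value pairs follows, using only tokens and map from arc.arc. query-args is a hypothetical name, not part of parseurl0, and the sketch does no percent-decoding.

```arc
; Hypothetical helper (not part of parseurl0): split the query string
; into a list of (name value) pairs. No percent-decoding is attempted.
; Returns nil when the parsed URL has no query.
(def query-args (u)
  (when u!query
    (map [tokens _ #\=] (tokens u!query #\&))))

; (query-args (parse-url "http://example.com/foo/bar?a=1&b=2#here"))
; returns (("a" "1") ("b" "2"))
```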

Prerequisites

This hack depends on arc3.1 and re3.

License

Same as Arc.

Contact me

Twitter: awwx
Email: andrew.wilcox [at] gmail.com