Just An Application

June 24, 2013

Programming With Rust — Part Fifteen: Representing An HTTP Request

Now that we can read a raw HTTP request, we need to turn it into something useful.

1.0 HTTP Request: The RFC 2616 Definition

1.1 Request

An HTTP Request is defined like this

    Request    = Request-Line              ; Section 5.1
                 *(( general-header        ; Section 4.5
                  | request-header         ; Section 5.3
                  | entity-header ) CRLF)  ; Section 7.1
                 CRLF
                 [ message-body ]          ; Section 4.3

1.2 Request-Line

Request-Line is defined like this

    Request-Line    = Method SP Request-URI SP HTTP-Version CRLF

1.2.1 Method

Method is defined like this

    Method    = "OPTIONS"                ; Section 9.2
              | "GET"                    ; Section 9.3
              | "HEAD"                   ; Section 9.4
              | "POST"                   ; Section 9.5
              | "PUT"                    ; Section 9.6
              | "DELETE"                 ; Section 9.7
              | "TRACE"                  ; Section 9.8
              | "CONNECT"                ; Section 9.9
              | extension-method

1.2.2 Request-URI

Request-URI is defined like this

    Request-URI    = "*" | absoluteURI | abs_path | authority

where absoluteURI, abs_path and authority are as defined in RFC 2396.

1.3 Headers

general-header, request-header and entity-header
are all forms of message-header which is defined like this

    message-header = field-name ":" [ field-value ]

1.4 The Message Body

message-body is defined like this.

    message-body = entity-body
                   | <entity-body encoded as per Transfer-Encoding>
    entity-body    = *OCTET

2.0 Representing A Request

2.1 The Components

2.1.1 The Request Method

We can represent the Method using an enum like so

    enum Method
    {
        CONNECT,
        DELETE,
        GET,
        HEAD,
        OPTIONS,
        POST,
        PUT,
        TRACE
    }

For the moment we are not going to support extension methods.

2.1.2 The Request URI

We can represent the Request URI in terms of its four possible variants by using an enum like so

    enum RequestURI
    {
        AbsolutePath(~str),
        AbsoluteURI(url::Url),
        Authority(~str),
        Wildcard
    }

The AbsoluteURI variant uses the type Url from the std library.

2.1.3 Headers

We are going to represent the set of headers using a new type Headers which is defined like this

    pub struct Headers
    {
        priv headers: ~[Header]
    }

and Header is defined like this

    struct Header
    {
        priv key:   ~str,
        priv value: ~str
    }

2.2 The Request

A Request simply holds the method, the URI and the headers.

    pub struct Request
    {
        priv method:  Method,
        priv uri:     RequestURI,
        priv headers: Headers
    }

For the moment we are going to ignore the message-body.

3.0 The Request Methods

3.1 Reading A Request

We define a static method read on the Request type which takes a RequestBuffer and returns a Request

    pub fn read(buffer: &mut RequestBuffer) -> Request
    {
        let (method, uri) = readRequestLine(buffer);
        let headers       = Headers::read(buffer);
        
        Request { method: method, uri: uri, headers: headers}
    }

The function readRequestLine reads the request line and splits it into its constituent parts and returns them as a tuple.

    fn readRequestLine(buffer: &mut RequestBuffer) -> (Method, RequestURI)
    {
        let     line           = buffer.readLine();
        let mut parts: ~[&str] = ~[];
        
        str::each_word(line, |part| { parts.push(part); true });
    
        if (vec::len(parts) != 3)
        {
            fail!(fmt!("Invalid request line: %s", line));
        }
    
        let method = Method::fromString(parts[0]);
        let uri    = RequestURI::fromString(method, parts[1]);
    
        (method, uri)
    }
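
The split-and-count step can also be written so that a malformed line is reported to the caller rather than failing the task. What follows is only a sketch in present-day Rust, not the 0.6 dialect used above, and the name split_request_line is made up for illustration. Like each_word it accepts any run of whitespace between the parts, which is more lenient than the single SP the RFC specifies.

```rust
/// Splits a request line into (method, request-URI, HTTP-version).
/// Returns None unless the line has exactly three whitespace-separated parts.
/// Hypothetical helper: the name and the Option return type are assumptions.
fn split_request_line(line: &str) -> Option<(&str, &str, &str)> {
    let mut parts = line.split_whitespace();

    let method = parts.next()?;
    let uri = parts.next()?;
    let version = parts.next()?;

    if parts.next().is_some() {
        // More than three parts, so this is not a valid request line.
        return None;
    }
    Some((method, uri, version))
}
```

The three slices borrow from the original line, so nothing is copied until the caller decides what to keep.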

3.2 Accessors

3.2.1 getMethod

This method is very simple. It just returns the Method held by the Request.

    pub fn getMethod(&self) -> Method
    {
         self.method
    }

3.2.2 getURI

This method is a little less straightforward than getMethod.

3.2.2.1 Take One

We cannot simply return a shallow copy of the RequestURI like this

    pub fn getURI(&self) -> RequestURI
    {
        self.uri
    }

because to quote the compiler

    request.rs:41:8: 41:16 error: moving out of immutable field
    request.rs:41         self.uri
                          ^~~~~~~~
    error: aborting due to previous error

As we know, returning the RequestURI by value like this makes a shallow copy.

A RequestURI value may contain an owned pointer, which would be moved into the copy being returned, leaving the original in the Request invalid.

3.2.2.2 Take Two

We could return a deep copy with a pointer to a copy of the original string in the owned heap as well, but this seems a tad extravagant.

An alternative would be to return a borrowed pointer like so

    pub fn getURI(&self) -> &RequestURI
    {
        &self.uri
    }

but the compiler is now very unhappy indeed

    request.rs:41:8: 41:17 error: cannot infer an appropriate lifetime due to conflicting requirements
    request.rs:41         &self.uri
                      ^~~~~~~~~
    request.rs:40:4: 42:5 note: first, the lifetime cannot outlive the lifetime &'self  as defined on the block at 40:4...
    request.rs:40     {
    request.rs:41         &self.uri
    request.rs:42     }
    request.rs:41:8: 41:17 note: ...due to the following expression
    request.rs:41         &self.uri
                      ^~~~~~~~~
    request.rs:40:4: 42:5 note: but, the lifetime must be valid for the anonymous lifetime #1 defined on the block at 40:4...
    request.rs:40     {
    request.rs:41         &self.uri
    request.rs:42     }
    request.rs:41:8: 41:17 note: ...due to the following expression
    request.rs:41         &self.uri
                      ^~~~~~~~~
    error: aborting due to previous error

but it does have a point.

We have not so much attempted to borrow a pointer, as tried to walk off with it without any indication of when we might be planning to give it back, if ever.

3.2.2.3 Take Three

To placate the compiler and assure it of our good intentions we need to explicitly specify the intended lifetime of the borrowed pointer like this

    pub fn getURI(&self) -> &'self RequestURI
    {
        &self.uri
    }

The named lifetime

    'self

specifies that the lifetime of the borrowed pointer is the same as that of the thing from which it was borrowed, which in this case is the Request.

This makes sense because as long as the Request exists the RequestURI exists, so as long as the Request exists the borrowed pointer to the RequestURI is valid.
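
For comparison, here is roughly how the same accessor might look in present-day Rust, where 'self is no longer a special lifetime name. This is a sketch with a cut-down RequestUri and an explicitly named lifetime 'a, both of which are assumptions made for illustration. In practice the annotation could be left out, as the compiler would infer exactly this relationship.

```rust
// Cut-down stand-in for the RequestURI enum, for illustration only.
#[derive(Debug, PartialEq)]
enum RequestUri {
    AbsolutePath(String),
    Wildcard,
}

struct Request {
    uri: RequestUri,
}

impl Request {
    // The returned reference is tied to the borrow of self by the
    // shared lifetime 'a: it is valid for as long as that borrow lasts
    // and no longer.
    fn get_uri<'a>(&'a self) -> &'a RequestUri {
        &self.uri
    }
}
```

Nothing is moved or copied. The caller simply gets a view of the value that the Request continues to own.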

3.2.3 getHeader

This method delegates to the Headers get method and returns its result.

    pub fn getHeader(&self, key: &str) -> Option<&'self str>
    {
        self.headers.get(key)
    }

4.0 The Headers Methods

4.1 Reading The Headers

The static method read reads the successive header lines until an empty line is read.

Each header line is converted to a Header value.

The method returns a Headers value constructed using the collected Header values.

    pub fn read(buffer: &mut RequestBuffer) -> Headers
    {
        let mut headers: ~[Header] = ~[];
        
        loop
        {
            let line = buffer.readLine();
        
            if (line.len() == 0)
            {
                break;
            }
            headers.push(toHeader(line));
        }
        Headers { headers: headers }
    }
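
The read-until-empty-line loop does not depend on where the lines come from. As a sketch in present-day Rust, and assuming the lines arrive through an iterator rather than a RequestBuffer, it might look like this. The name read_header_lines is hypothetical.

```rust
/// Collects lines up to, but not including, the first empty line.
/// The empty line itself is consumed, so the iterator is left
/// positioned at whatever follows the headers.
fn read_header_lines<I: Iterator<Item = String>>(lines: &mut I) -> Vec<String> {
    let mut headers = Vec::new();

    for line in lines {
        if line.is_empty() {
            break;
        }
        headers.push(line);
    }
    headers
}
```

Taking the iterator by mutable reference means the caller can carry on reading from it afterwards, which is what we would want once we stop ignoring the message-body.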

The function toHeader splits the header line into a key and a value and returns the result as a Header value.

    fn toHeader(line: &str) -> Header
    {
        let mut parts: ~[&str] = ~[];
    
        str::each_splitn_char(line, ':', 1, |s| { parts.push(s); true });
        if (vec::len(parts) != 2)
        {
            fail!(fmt!("Bad header: %s", line));
        }
    
        Header { key: parts[0].to_str(), value: parts[1].trim().to_str() }
    }
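
Splitting at the first colon only is what keeps a value such as 127.0.0.1:3534 in one piece. A minimal sketch of the same idea in present-day Rust follows. The name parse_header and the use of a plain tuple instead of the Header struct are assumptions made to keep the example short.

```rust
/// Splits a header line into (key, value) at the first ':' only.
/// The value is trimmed of surrounding whitespace; the key is left as-is.
/// Returns None if the line contains no ':' at all.
fn parse_header(line: &str) -> Option<(String, String)> {
    let (key, value) = line.split_once(':')?;

    Some((key.to_string(), value.trim().to_string()))
}
```

A header with nothing after the colon produces an empty value rather than an error, which matches the optional field-value in the RFC definition.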

4.2 Getting A Header Value

The get method returns an Option as it is possible that the specified header was not present in the original HTTP Request.

If the header is present a borrowed pointer to the value is returned via the Option Some variant.

The lifetime of the borrowed pointer is explicitly specified to be 'self. (See the Request getURI method.)

    pub fn get(&self, key: &str) -> Option<&'self str>
    {
        let lcKey = str::to_lower(key);
        
        match self.headers.position(|h| h.key.to_lower() == lcKey)
        {
            Some(index) => 
            {
                let value: &str = self.headers[index].value;
                
                Some(value)
            },
            
            None        => None
        }
    }
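
The lookup is case-insensitive because header field names are. As a sketch only, the same search in present-day Rust could avoid building lower-cased copies of the keys by comparing them in place. The struct definitions here are simplified stand-ins for the ones above.

```rust
// Simplified stand-ins for the Header and Headers types.
struct Header {
    key: String,
    value: String,
}

struct Headers {
    headers: Vec<Header>,
}

impl Headers {
    /// Returns a borrowed view of the first matching header value,
    /// comparing keys without regard to ASCII case.
    fn get(&self, key: &str) -> Option<&str> {
        self.headers
            .iter()
            .find(|h| h.key.eq_ignore_ascii_case(key))
            .map(|h| h.value.as_str())
    }
}
```

As with getURI, the returned slice borrows from the Headers value, so it cannot outlive it.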

5.0 Method and RequestURI Methods

5.1 Method.fromString

The Method fromString static method is very simple. If the argument matches one of the defined methods it returns the corresponding enum variant, otherwise it fails.

    impl Method
    {
        pub fn fromString(s: &str) -> Method
        {
            match s
            {
                "CONNECT" => CONNECT,
                "DELETE"  => DELETE,
                "GET"     => GET,
                "HEAD"    => HEAD,
                "OPTIONS" => OPTIONS,
                "POST"    => POST,
                "PUT"     => PUT,
                "TRACE"   => TRACE,
       
                _ => 
                {   
                    fail!(fmt!("Unrecognized method %s", s));
                }
            }
        }
    }
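
Failing the whole task on an unrecognised method is a bit drastic for a server. One alternative, sketched here in present-day Rust, is to hand the decision back to the caller by returning an Option. The variant names and the name from_string are assumptions that follow current naming conventions rather than the code above.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum Method {
    Connect,
    Delete,
    Get,
    Head,
    Options,
    Post,
    Put,
    Trace,
}

impl Method {
    /// Returns the matching variant, or None for an unrecognised method.
    /// The comparison is case-sensitive, as in the match above.
    fn from_string(s: &str) -> Option<Method> {
        match s {
            "CONNECT" => Some(Method::Connect),
            "DELETE" => Some(Method::Delete),
            "GET" => Some(Method::Get),
            "HEAD" => Some(Method::Head),
            "OPTIONS" => Some(Method::Options),
            "POST" => Some(Method::Post),
            "PUT" => Some(Method::Put),
            "TRACE" => Some(Method::Trace),
            _ => None,
        }
    }
}
```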

5.2 RequestURI.fromString

The RequestURI fromString static method converts the string specifying the URI contained in the HTTP Request into one of the variants of the enum type RequestURI.

It takes the Method of the Request as an argument so that it can identify the CONNECT case, where the specified Request-URI is actually an authority which may otherwise be indistinguishable from an absoluteURI in certain circumstances.

    impl RequestURI
    {    
        pub fn fromString(method: Method, s: &str) -> RequestURI
        {
            match method
            {
                CONNECT =>
                {
                    Authority(s.to_str())
                },
            
                _ =>
                {
                    let length = s.len();
        
                    if (length == 1)
                    {
                        match s.char_at(0)
                        {
                            '*' =>
                            {
                                Wildcard
                            },
                
                            '/' =>
                            {
                                AbsolutePath(~"/")
                            },
                
                            _ =>
                            {
                                fail!(fmt!("Invalid URI: %s", s));
                            }
                        }
                    }
                    else
                    {
                        match s.char_at(0)
                        {
                            '/' =>
                            {
                                AbsolutePath(s.to_str())
                            }
                        
                            _ =>
                            {                            
                                match url::from_str(s)
                                {
                                    Ok(url)  => AbsoluteURI(url),
                                
                                    Err(err) => fail!(err)
                                }
                            }
                        }
                    }
                }
            }
        }
    }
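
The order of the tests is what matters here: CONNECT first, then the wildcard, then a leading slash, and only then anything else. The following sketch in present-day Rust shows just that ordering. It is not a URL parser: the absolute URI is kept as an unparsed string, the presence of a ':' is a crude stand-in for the call to url::from_str, and the names classify and is_connect are invented for the example.

```rust
#[derive(Debug, PartialEq)]
enum RequestUri {
    AbsolutePath(String),
    AbsoluteUri(String), // unparsed in this sketch
    Authority(String),
    Wildcard,
}

/// Classifies the Request-URI string. Returns None for anything
/// that does not fit one of the four forms.
fn classify(is_connect: bool, s: &str) -> Option<RequestUri> {
    if is_connect {
        // For CONNECT the Request-URI is always an authority.
        return Some(RequestUri::Authority(s.to_string()));
    }
    match s {
        "" => None,
        "*" => Some(RequestUri::Wildcard),
        _ if s.starts_with('/') => Some(RequestUri::AbsolutePath(s.to_string())),
        // Assumption: a ':' suggests a scheme, so treat it as an absolute URI.
        _ if s.contains(':') => Some(RequestUri::AbsoluteUri(s.to_string())),
        _ => None,
    }
}
```

Treating the single-character cases as ordinary string patterns removes the need for the separate length check.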

6.0 Source Files

6.1 headers.rs

// headers.rs

// part of httpd v0.5

use buffer::RequestBuffer;

pub struct Headers
{
    priv headers: ~[Header]
}

struct Header
{
    priv key:   ~str,
    priv value: ~str
}

impl Headers
{
    pub fn read(buffer: &mut RequestBuffer) -> Headers
    {
        let mut headers: ~[Header] = ~[];
        
        loop
        {
            let line = buffer.readLine();
        
            if (line.len() == 0)
            {
                break;
            }
            headers.push(toHeader(line));
        }
        Headers { headers: headers }
    }
    
    pub fn get(&self, key: &str) -> Option<&'self str>
    {
        let lcKey = str::to_lower(key);
        
        match self.headers.position(|h| h.key.to_lower() == lcKey)
        {
            Some(index) => 
            {
                let value: &str = self.headers[index].value;
                
                Some(value)
            },
            
            None        => None
        }
    }
}


fn toHeader(line: &str) -> Header
{
    let mut parts: ~[&str] = ~[];
    
    str::each_splitn_char(line, ':', 1, |s| { parts.push(s); true });
    if (vec::len(parts) != 2)
    {
        fail!(fmt!("Bad header: %s", line));
    }
    
    Header { key: parts[0].to_str(), value: parts[1].trim().to_str() }
}

6.2 request.rs

// request.rs

// part of httpd v0.5

use HTTP::Method;
use HTTP::RequestURI;

use buffer::RequestBuffer;

use headers::Headers;

//

pub struct Request
{
     priv method:  Method,
     priv uri:     RequestURI,
     priv headers: Headers
}

impl Request
{
    pub fn read(buffer: &mut RequestBuffer) -> Request
    {
        let (method, uri) = readRequestLine(buffer);
        let headers       = Headers::read(buffer);
        
        Request { method: method, uri: uri, headers: headers}
    }
    
    //
    
    pub fn getMethod(&self) -> Method
    {
         self.method
    }
    
    pub fn getURI(&self) -> &'self RequestURI
    {
        &self.uri
    }
    
    pub fn getHeader(&self, key: &str) -> Option<&'self str>
    {
        self.headers.get(key)
    }
}

fn readRequestLine(buffer: &mut RequestBuffer) -> (Method, RequestURI)
{
    let     line           = buffer.readLine();
    let mut parts: ~[&str] = ~[];
        
    str::each_word(line, |part| { parts.push(part); true });
    
    if (vec::len(parts) != 3)
    {
        fail!(fmt!("Invalid request line: %s", line));
    }
    
    let method = Method::fromString(parts[0]);
    let uri    = RequestURI::fromString(method, parts[1]);
    
    (method, uri)
}


Copyright (c) 2013 By Simon Lewis. All Rights Reserved.

Unauthorized use and/or duplication of this material without express and written permission from this blog’s author and owner Simon Lewis is strictly prohibited.

Excerpts and links may be used, provided that full and clear credit is given to Simon Lewis and justanapplication.wordpress.com with appropriate and specific direction to the original content.

June 20, 2013

Programming With Rust — Part Thirteen: Reading An HTTP Request

Now we have a connection we can read the incoming HTTP request.

1.0 An HTTP Request

An HTTP request arrives over a connection as

  • a request line

followed by

  • zero or more header lines

followed by

  • an empty line

followed by

  • a message body

which is optional.

A line is terminated by a carriage-return (CR == ASCII 13) immediately followed by a line-feed (LF == ASCII 10).

Ideally we would like to read the entire request from the connection “in one go” but that is not possible because there is no way of knowing how big a given HTTP request is before we read it.

To abstract out the unfortunately indeterminate nature of an incoming HTTP request, we will start by defining a RequestBuffer type which will deal with the vagaries of reading the necessary bytes from the connection and converting them into lines.

2.0 RequestBuffer: Take One

The original idea was that RequestBuffer would look something like this

    struct RequestBuffer
    {
        priv socketBuf:     TcpSocketBuf,
        priv bytes:         ~[u8],
        priv size:          uint,
        priv available:     uint,
        priv position:      uint,
        priv lastLineEnd:   uint
    }

and there would be a readLine method which would look something like this

    impl RequestBuffer
    {
        ...
        
        fn readLine(&mut self) -> ~str
        {
            let mut state = 0;
		
            loop
            {
                if (self.position == self.available)
                {
                    // read all the bytes currently available from the connection
            
                    ...
                }
            
                let b = self.bytes[self.position];
			
                self.position += 1;
            
                match state
                {
                    0 => 
                    {
                        if (b == 13) // CR
                        {
                            state = 1;
                        }
                    },
						
                    1 => 
                    {
                        if (b == 10) // LF
                        {
                            // make string representing line from buffered bytes
                    
                            let line = ...
                        
                            self.lastLineEnd = self.position;
                            return line;
                        }
                        else
                        {
                            state = 0;
                        }
                    },
						
                    _ => 
                    {
                        fail!(fmt!("state == %u !", state));
                    }
                }
            }
        }
        
        ...
        
    }

but at the moment there does not seem to be any way of simply reading all the bytes currently available on the connection in one go that actually works.

2.1 RequestBuffer: Take Two

This is a version that works but there really isn’t a whole lot of buffering going on.

2.1.1 RequestBuffer

    struct RequestBuffer
    {
        priv socketBuf: TcpSocketBuf,
        priv bytes:     ~[u8],
    }

2.1.2 The readLine Method

    ...

    static CR: u8 = 13;

    static LF: u8 = 10;

    ...

    impl RequestBuffer
    {
    
        ...
        
        fn readLine(&mut self) -> ~str
        {
            self.bytes.clear();
        
            let mut state = 0;
		
            loop
            {
                let i = self.socketBuf.read_byte();
			
                if (i < 0)
                {
                    fail!(~"EOF");
                }
            
                let b = i as u8;
            
                match state
                {
                    0 => 
                    {
                        if (b == CR)
                        {
                            state = 1;
                        }
                    },
						
                    1 => 
                    {
                        if (b == LF)
                        {
                            return str::from_bytes(vec::const_slice(self.bytes, 0, self.bytes.len() - 1));
                        }
                        else
                        {
                            state = 0;
                        }
                    },
						
                    _ => 
                    {
                        fail!(fmt!("state == %u !", state));
                    }
                }
                self.bytes.push(b);
            }
        }

        ...
        
    }
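
For readers following along with a current compiler, the same byte-at-a-time approach can be sketched against anything that implements std::io::Read. This is not the code above translated line for line: the name read_line is an assumption, a boolean replaces the numeric state, and EOF is returned as an error rather than failing the task.

```rust
use std::io::{self, Read};

const CR: u8 = 13;
const LF: u8 = 10;

/// Reads one byte at a time until CR LF is seen and returns the line
/// without its terminator.
fn read_line<R: Read>(reader: &mut R) -> io::Result<String> {
    let mut bytes: Vec<u8> = Vec::new();
    let mut seen_cr = false;
    let mut byte = [0u8; 1];

    loop {
        if reader.read(&mut byte)? == 0 {
            return Err(io::Error::new(io::ErrorKind::UnexpectedEof, "EOF"));
        }
        let b = byte[0];

        if seen_cr && b == LF {
            // Drop the CR that was pushed on the previous iteration.
            bytes.pop();
            return String::from_utf8(bytes)
                .map_err(|e| io::Error::new(io::ErrorKind::InvalidData, e));
        }
        // Remember whether this byte was a CR, then buffer it.
        seen_cr = b == CR;
        bytes.push(b);
    }
}
```

A byte slice implements Read, so the function can be exercised without a socket by passing it something like b"GET / HTTP/1.1\r\n\r\n".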

2.1.3 The new Method

The new method is a static method which can be used to create a RequestBuffer.

    fn new(socketBuf: TcpSocketBuf) -> RequestBuffer
    {
        RequestBuffer { socketBuf: socketBuf, bytes: ~[0u8, ..SIZE] }
    }

3.0 The handleConnection Function

If the accept function is successful we now call the handleConnection function which is defined like this

    fn handleConnection(socket: TcpSocket)
    {	
        let mut buffer      = RequestBuffer::new(socket_buf(socket));
        let     requestLine = buffer.readLine();
	
        io::println(requestLine);
        loop
        {
            let line = buffer.readLine();
		
            io::println(line);
            if (str::len(line) == 0)
            {
                break;
            }
        }
        io::stdout().flush();
        fail!(~"Now what ?");
    }

4.0 Running The Code

Running the code and pointing a web browser at 127.0.0.1:3534 produces this

 
    ./httpd
    on_establish_callback({x: {data: (0x1007094f0 as *())}})
    new_connection_callback(NewTcpConn((0x10200b000 as *())), {x: {data: (0x1007094f0 as *())}})
    accept succeeded
    GET / HTTP/1.1
    Host: 127.0.0.1:3534
    User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv:21.0) Gecko/20100101 Firefox/21.0
    Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
    Accept-Language: en-US,en;q=0.5
    Accept-Encoding: gzip, deflate
    DNT: 1
    Connection: keep-alive

    rust: task failed at 'Now what ?', httpd.rc:172
    rust: domain main @0x10201ee10 root task failed
    rust: task failed at 'killed', /Users/simon/Src/lang/rust-0.6/src/libcore/pipes.rs:314

5.0 The Source Code For httpd v0.3


// httpd.rc

// v0.3

extern mod std;

use core::comm::SharedChan;

use core::option::Option;


use core::task;

use std::net::ip;
use std::net::tcp;
use std::net::tcp::TcpErrData;
use std::net::tcp::TcpNewConnection;
use std::net::tcp::TcpSocket;
use std::net::tcp::TcpSocketBuf;

use std::net::tcp::socket_buf;

use std::sync::Mutex;

use std::uv_iotask;

// RequestBuffer

struct RequestBuffer
{
    priv socketBuf: TcpSocketBuf,
    priv bytes:     ~[u8],
}

//

static SIZE: uint = 4096;

// 

static CR: u8 = 13;

static LF: u8 = 10;

impl RequestBuffer
{    
    fn new(socketBuf: TcpSocketBuf) -> RequestBuffer
    {
        RequestBuffer { socketBuf: socketBuf, bytes: ~[0u8, ..SIZE] }
    }

    fn readLine(&mut self) -> ~str
    {
        self.bytes.clear();
        
        let mut state = 0;
		
        loop
        {
            let i = self.socketBuf.read_byte();
			
            if (i < 0)
            {
                fail!(~"EOF");
            }
            
            let b = i as u8;
            
            match state
            {
                0 => 
                {
                    if (b == CR)
                    {
                        state = 1;
                    }
                },
						
                1 => 
                {
                    if (b == LF)
                    {
                        return str::from_bytes(vec::const_slice(self.bytes, 0, self.bytes.len() - 1));
                    }
                    else
                    {
                        state = 0;
                    }
                },
						
                _ => 
                {
                    fail!(fmt!("state == %u !", state));
                }
            }
            self.bytes.push(b);
        }
    }
}


static BACKLOG: uint = 5;
static PORT:    uint = 3534;

static IPV4_LOOPBACK: &'static str = "127.0.0.1";


fn on_establish_callback(chan: SharedChan<Option<TcpErrData>>)
{
    io::println(fmt!("on_establish_callback(%?)", chan));
}

fn new_connection_callback(newConn :TcpNewConnection, chan: SharedChan<Option<TcpErrData>>)
{
    io::println(fmt!("new_connection_callback(%?, %?)", newConn, chan));
	
    let mx = Mutex();
	
    do mx.lock_cond
        |cv|
        {
            let mxc = ~mx.clone();

            do task::spawn 
            {
                match tcp::accept(newConn)
                {
                    Ok(socket) => 
                    {
                        io::println("accept succeeded");
                        do mxc.lock_cond
                            |cv|
                            {
                                cv.signal();
                            }
                        handleConnection(socket);
				                
                    },
                    Err(error) => 
                    {
                        io::println(fmt!("accept failed: %?", error));
                        do mxc.lock_cond
                            |cv|
                            {
                                cv.signal();
                            }
                    }
                }
            }
            cv.wait();
        }
}

fn handleConnection(socket: TcpSocket)
{	
    let mut buffer      = RequestBuffer::new(socket_buf(socket));
    let     requestLine = buffer.readLine();
	
    io::println(requestLine);
    loop
    {
        let line = buffer.readLine();
		
        io::println(line);
        if (str::len(line) == 0)
        {
            break;
        }
    }
    io::stdout().flush();
    fail!(~"Now what ?");
}

fn main()
{	
    tcp::listen(
        ip::v4::parse_addr(IPV4_LOOPBACK),
        PORT,
        BACKLOG,
        &uv_iotask::spawn_iotask(task::task()),
        on_establish_callback,
        new_connection_callback);
}


