Just An Application

June 20, 2013

Programming With Rust — Part Thirteen: Reading An HTTP Request

Now that we have a connection, we can read the incoming HTTP request.

1.0 An HTTP Request

An HTTP request arrives over a connection as

  • a request line

followed by

  • zero or more header lines

followed by

  • an empty line

followed by

  • a message body

which is optional.

A line is terminated by a carriage-return (CR == ASCII 13) immediately followed by a line-feed (LF == ASCII 10).
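As a rough illustration of that layout, here is a minimal sketch in current Rust (not the Rust 0.6 used in the rest of this post). It assumes the whole request is already sitting in memory as a string, which is exactly what we cannot assume when reading from a connection. The names RawRequest and split_request are made up for the example.

```rust
/// The parts of an HTTP request as described above.
/// (Illustrative type, not part of httpd.)
#[derive(Debug, PartialEq)]
struct RawRequest<'a> {
    request_line: &'a str,
    headers: Vec<&'a str>,
    body: &'a str,
}

/// Splits a complete request held in memory into its parts.
/// Returns None if the empty line that ends the headers is missing.
fn split_request(raw: &str) -> Option<RawRequest<'_>> {
    // The end of the last header line followed by the empty line
    // shows up as two consecutive CR LF pairs.
    let (head, body) = raw.split_once("\r\n\r\n")?;

    // Everything before that is the request line plus zero or more headers.
    let mut lines = head.split("\r\n");
    let request_line = lines.next()?;
    let headers = lines.collect();

    Some(RawRequest { request_line, headers, body })
}
```

Calling split_request on a request with no headers gives an empty headers vector, and an absent message body shows up as an empty string.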

Ideally we would like to read the entire request from the connection “in one go”, but that is not possible because there is no way of knowing how big a given HTTP request is before we read it.

To abstract out the unfortunately indeterminate nature of an incoming HTTP request, we will start by defining a RequestBuffer type which will deal with the vagaries of reading the necessary bytes from the connection and converting them into lines.

2.0 RequestBuffer: Take One

The original idea was that RequestBuffer would look something like this

    struct RequestBuffer
    {
        priv socketBuf:     TcpSocketBuf,
        priv bytes:         ~[u8],
        priv size:          uint,
        priv available:     uint,
        priv position:      uint,
        priv lastLineEnd:   uint
    }

and there would be a readLine method which would look something like this

    impl RequestBuffer
    {
        ...
        
        fn readLine(&mut self) -> ~str
        {
            let mut state = 0;
		
            loop
            {
                if (self.position == self.available)
                {
                    // read all the bytes currently available from the connection
            
                    ...
                }
            
                let b = self.bytes[self.position];
			
                self.position += 1;
            
                match state
                {
                    0 => 
                    {
                        if (b == 13) // CR
                        {
                            state = 1;
                        }
                    },
						
                    1 => 
                    {
                        if (b == 10) // LF
                        {
                            // make string representing line from buffered bytes
                    
                            let line = ...
                        
                            self.lastLineEnd = self.position;
                            return line;
                        }
                        else
                        {
                            state = 0;
                        }
                    },
						
                    _ => 
                    {
                        fail!(fmt!("state == %u !", state));
                    }
                }
            }
        }
        
        ...
        
    }

but at the moment there does not seem to be any way of simply reading all the bytes currently available on the connection in one go that actually works.
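For comparison, the “take one” idea can be sketched in current Rust, where the std::io::Read trait does provide a way of reading whatever bytes are available. This is not the Rust 0.6 API and not the code used by httpd. The type name ChunkedLineReader, the 4096 byte chunk size, and the choice to report EOF as an error are all assumptions made for the sketch.

```rust
use std::io::{self, Read};

/// A hypothetical modern counterpart of the "take one" RequestBuffer.
struct ChunkedLineReader<R: Read> {
    source: R,
    bytes: Vec<u8>,       // every byte read from the source so far
    position: usize,      // next byte to examine
    last_line_end: usize, // index just past the previous CR LF
}

impl<R: Read> ChunkedLineReader<R> {
    fn new(source: R) -> Self {
        ChunkedLineReader { source, bytes: Vec::new(), position: 0, last_line_end: 0 }
    }

    /// Returns the next CR LF terminated line without its terminator,
    /// or an UnexpectedEof error if the source ends first.
    fn read_line(&mut self) -> io::Result<String> {
        loop {
            // Scan the unexamined bytes for CR immediately followed by LF.
            while self.position + 1 < self.bytes.len() {
                if self.bytes[self.position] == b'\r' && self.bytes[self.position + 1] == b'\n' {
                    let line = &self.bytes[self.last_line_end..self.position];
                    let line = String::from_utf8_lossy(line).into_owned();
                    self.position += 2;
                    self.last_line_end = self.position;
                    return Ok(line);
                }
                self.position += 1;
            }

            // No complete line yet: read whatever the source hands over.
            // For simplicity the buffer only grows; it is never compacted.
            let mut chunk = [0u8; 4096];
            let count = self.source.read(&mut chunk)?;
            if count == 0 {
                return Err(io::Error::new(io::ErrorKind::UnexpectedEof, "EOF before end of line"));
            }
            self.bytes.extend_from_slice(&chunk[..count]);
        }
    }
}
```

Because the scan stops one byte short of the end of the buffer, a CR that arrives at the end of one read and its LF at the start of the next are still recognised as a single line terminator.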

2.1 RequestBuffer: Take Two

This is a version that works but there really isn’t a whole lot of buffering going on.

2.1.1 RequestBuffer

    struct RequestBuffer
    {
        priv socketBuf: TcpSocketBuf,
        priv bytes:     ~[u8],
    }

2.1.2 The readLine Method

    ...

    static CR: u8 = 13;

    static LF: u8 = 10;

    ...

    impl RequestBuffer
    {
    
        ...
        
        fn readLine(&mut self) -> ~str
        {
            self.bytes.clear();
        
            let mut state = 0;
		
            loop
            {
                let i = self.socketBuf.read_byte();
			
                if (i < 0)
                {
                    fail!(~"EOF");
                }
            
                let b = i as u8;
            
                match state
                {
                    0 => 
                    {
                        if (b == CR)
                        {
                            state = 1;
                        }
                    },
						
                    1 => 
                    {
                        if (b == LF)
                        {
                            return str::from_bytes(vec::const_slice(self.bytes, 0, self.bytes.len() - 1));
                        }
                        else
                        {
                            state = 0;
                        }
                    },
						
                    _ => 
                    {
                        fail!(fmt!("state == %u !", state));
                    }
                }
                self.bytes.push(b);
            }
        }

        ...
        
    }
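The same byte-at-a-time approach can be sketched in current Rust as follows. This is an illustration only, not the Rust 0.6 code above: the free function read_line and its use of std::io::Read are assumptions, and EOF is reported as an error value rather than by failing the task. One deliberate difference is that a CR followed by another CR keeps waiting for the LF, whereas the version above drops back to state 0.

```rust
use std::io::{self, Read};

const CR: u8 = 13;
const LF: u8 = 10;

/// Reads one CR LF terminated line, a byte at a time, and returns it
/// without the terminator. (Hypothetical helper, not part of httpd.)
fn read_line<R: Read>(source: &mut R) -> io::Result<String> {
    let mut bytes = Vec::new();
    let mut seen_cr = false;

    for byte in source.bytes() {
        let b = byte?;

        if seen_cr && b == LF {
            // Drop the CR that was pushed on the previous pass.
            bytes.pop();
            return Ok(String::from_utf8_lossy(&bytes).into_owned());
        }

        // Remember whether this byte is a CR. In the sequence CR CR LF the
        // first CR ends up as part of the returned line.
        seen_cr = b == CR;
        bytes.push(b);
    }

    Err(io::Error::new(io::ErrorKind::UnexpectedEof, "EOF before end of line"))
}
```

As with the original there is not much buffering going on: each byte is requested from the source individually.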

2.1.3 The new Method

The new method is a static method which can be used to create a RequestBuffer.

    fn new(socketBuf: TcpSocketBuf) -> RequestBuffer
    {
        RequestBuffer { socketBuf: socketBuf, bytes: ~[0u8, ..SIZE] }
    }

3.0 The handleConnection Function

If the accept function is successful we now call the handleConnection function which is defined like this

    fn handleConnection(socket: TcpSocket)
    {	
        let mut buffer      = RequestBuffer::new(socket_buf(socket));
        let     requestLine = buffer.readLine();
	
        io::println(requestLine);
        loop
        {
            let line = buffer.readLine();
		
            io::println(line);
            if (str::len(line) == 0)
            {
                break;
            }
        }
        io::stdout().flush();
        fail!(~"Now what ?");
    }
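The read-until-the-empty-line loop can also be sketched in current Rust using the standard std::io::BufRead trait. The function names read_request_head and read_crlf_line are invented for this sketch, and it collects the lines instead of printing them. Note that BufRead::read_line splits on LF alone, so this version is more lenient than the strict CR LF rule, and it returns an error if the bytes are not valid UTF-8.

```rust
use std::io::{self, BufRead};

/// Reads the request line and the header lines, stopping after the
/// empty line. Returns (request line, header lines).
fn read_request_head<R: BufRead>(reader: &mut R) -> io::Result<(String, Vec<String>)> {
    let request_line = read_crlf_line(reader)?;
    let mut headers = Vec::new();

    loop {
        let line = read_crlf_line(reader)?;
        if line.is_empty() {
            break; // the empty line marks the end of the headers
        }
        headers.push(line);
    }
    Ok((request_line, headers))
}

/// Reads up to and including the next LF and strips the trailing CR LF.
fn read_crlf_line<R: BufRead>(reader: &mut R) -> io::Result<String> {
    let mut line = String::new();

    if reader.read_line(&mut line)? == 0 {
        return Err(io::Error::new(io::ErrorKind::UnexpectedEof, "EOF"));
    }
    // read_line keeps the terminator, so remove it by hand.
    let trimmed = line.trim_end_matches(|c: char| c == '\r' || c == '\n');
    Ok(trimmed.to_string())
}
```

Anything after the empty line, that is, the optional message body, is left unread in the reader.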

4.0 Running The Code

Running the code and pointing a web browser at 127.0.0.1:3534 produces this

 
    ./httpd
    on_establish_callback({x: {data: (0x1007094f0 as *())}})
    new_connection_callback(NewTcpConn((0x10200b000 as *())), {x: {data: (0x1007094f0 as *())}})
    accept succeeded
    GET / HTTP/1.1
    Host: 127.0.0.1:3534
    User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv:21.0) Gecko/20100101 Firefox/21.0
    Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
    Accept-Language: en-US,en;q=0.5
    Accept-Encoding: gzip, deflate
    DNT: 1
    Connection: keep-alive

    rust: task failed at 'Now what ?', httpd.rc:172
    rust: domain main @0x10201ee10 root task failed
    rust: task failed at 'killed', /Users/simon/Src/lang/rust-0.6/src/libcore/pipes.rs:314

5.0 The Source Code For httpd v0.3


// httpd.rc

// v0.3

extern mod std;

use core::comm::SharedChan;

use core::option::Option;


use core::task;

use std::net::ip;
use std::net::tcp;
use std::net::tcp::TcpErrData;
use std::net::tcp::TcpNewConnection;
use std::net::tcp::TcpSocket;
use std::net::tcp::TcpSocketBuf;

use std::net::tcp::socket_buf;

use std::sync::Mutex;

use std::uv_iotask;

// RequestBuffer

struct RequestBuffer
{
    priv socketBuf: TcpSocketBuf,
    priv bytes:     ~[u8],
}

//

static SIZE: uint = 4096;

// 

static CR: u8 = 13;

static LF: u8 = 10;

impl RequestBuffer
{    
    fn new(socketBuf: TcpSocketBuf) -> RequestBuffer
    {
        RequestBuffer { socketBuf: socketBuf, bytes: ~[0u8, ..SIZE] }
    }

    fn readLine(&mut self) -> ~str
    {
        self.bytes.clear();
        
        let mut state = 0;
		
        loop
        {
            let i = self.socketBuf.read_byte();
			
            if (i < 0)
            {
                fail!(~"EOF");
            }
            
            let b = i as u8;
            
            match state
            {
                0 => 
                {
                    if (b == CR)
                    {
                        state = 1;
                    }
                },
						
                1 => 
                {
                    if (b == LF)
                    {
                        return str::from_bytes(vec::const_slice(self.bytes, 0, self.bytes.len() - 1));
                    }
                    else
                    {
                        state = 0;
                    }
                },
						
                _ => 
                {
                    fail!(fmt!("state == %u !", state));
                }
            }
            self.bytes.push(b);
        }
    }
}


static BACKLOG: uint = 5;
static PORT:    uint = 3534;

static IPV4_LOOPBACK: &'static str = "127.0.0.1";


fn on_establish_callback(chan: SharedChan<Option<TcpErrData>>)
{
    io::println(fmt!("on_establish_callback(%?)", chan));
}

fn new_connection_callback(newConn :TcpNewConnection, chan: SharedChan<Option<TcpErrData>>)
{
    io::println(fmt!("new_connection_callback(%?, %?)", newConn, chan));
	
    let mx = Mutex();
	
    do mx.lock_cond
        |cv|
        {
            let mxc = ~mx.clone();

            do task::spawn 
            {
                match tcp::accept(newConn)
                {
                    Ok(socket) => 
                    {
                        io::println("accept succeeded");
                        do mxc.lock_cond
                            |cv|
                            {
                                cv.signal();
                            }
                        handleConnection(socket);
				                
                    },
                    Err(error) => 
                    {
                        io::println(fmt!("accept failed: %?", error));
                        do mxc.lock_cond
                            |cv|
                            {
                                cv.signal();
                            }
                    }
                }
            }
            cv.wait();
        }
}

fn handleConnection(socket: TcpSocket)
{	
    let mut buffer      = RequestBuffer::new(socket_buf(socket));
    let     requestLine = buffer.readLine();
	
    io::println(requestLine);
    loop
    {
        let line = buffer.readLine();
		
        io::println(line);
        if (str::len(line) == 0)
        {
            break;
        }
    }
    io::stdout().flush();
    fail!(~"Now what ?");
}

fn main()
{	
    tcp::listen(
        ip::v4::parse_addr(IPV4_LOOPBACK),
        PORT,
        BACKLOG,
        &uv_iotask::spawn_iotask(task::task()),
        on_establish_callback,
        new_connection_callback);
}


Copyright (c) 2013 By Simon Lewis. All Rights Reserved.

Unauthorized use and/or duplication of this material without express and written permission from this blog’s author and owner Simon Lewis is strictly prohibited.

Excerpts and links may be used, provided that full and clear credit is given to Simon Lewis and justanapplication.wordpress.com with appropriate and specific direction to the original content.

June 14, 2013

Programming With Rust — Part Nine: Let’s Build A Program

1.0 Defining A Main Function

We need somewhere to call the listen function from.

For now we will call it directly from the main function of our Rust program.

There are a couple of ways to define one of these.

One way is to define a function called main which takes no arguments and does not return a value, like so


    fn main()
    {	
        tcp::listen(
            ip::v4::parse_addr("127.0.0.1"),
            3534,
            5,
            &uv_iotask::spawn_iotask(task::task()),
            |chan|
            {
                io::println(fmt!("on_establish_callback(%?)", chan));
            },
            |newConn, chan|
            {
                io::println(fmt!("new_connection_callback(%?, %?)", newConn, chan));
                fail!(~"Now what ?");
            });
    }

2.0 Building A Rust Executable

In Rust terminology an executable is a crate.

The Rust compiler takes a single crate in source form and from it produces a single crate in binary form.

The source which defines a crate is contained in a single file, the crate file.

A crate in binary form can be either an executable or a library.

2.1 Linking Against Other Crates

A crate file specifies which other crates containing libraries, if any, the crate being defined should be linked against.

Every crate is automatically linked against the crate containing the core library.

In our case we also need to link against the crate containing the std library.

We specify this dependency like this

    extern mod std;

This is an example of an extern_mod_decl which is a kind of view_item.

2.2 Name Bindings

To make use of the definitions in the externally linked crates we set up some name bindings like so

    ...

    use core::task;

    use std::net::ip;
    use std::net::tcp;

    use std::uv_iotask;
    
    ...

2.3 Compiling The Program

To compile our crate file we simply do this

    rustc httpd.rc

3.0 Running The Program

The result of the compilation is an executable called httpd.

If we just run it we get this

    ./httpd
    on_establish_callback({x: {data: (0x1007098f0 as *())}})

and the program waits for a connection.

If we run it and then use a web browser to connect to 127.0.0.1:3534 we get this (output slightly re-formatted for clarity)

    ./httpd
    on_establish_callback({x: {data: (0x10070a470 as *())}})
    new_connection_callback(NewTcpConn((0x100832c00 as *())), {x: {data: (0x10070a470 as *())}})
    rust: task failed at 'Now what ?', httpd.rc:34
    Assertion failed: (false && "Rust task failed after reentering the Rust stack"), \
        function upcall_call_shim_on_rust_stack, \
        file /Users/simon/Src/lang/rust-0.6/src/rt/rust_upcall.cpp, line 92.
    Abort trap

and the program exits.

4.0 Defining Some Simple Constants

At the moment we have a couple of integer constants sitting in the middle of the code which as everyone knows is bad, bad, bad.

We can fix this by defining them as constants like so

    ...

    static BACKLOG: uint = 5;
    static PORT:    uint = 3534;
    
    ...

These are both examples of a static_item.

Like other items, static_items are processed at compile time and the values they define are stored in the program’s static memory.

We also have a string literal in the middle of the code.

We can define it as a constant like this

    static IPV4_LOOPBACK: &'static str = "127.0.0.1";

In fact the string literal is already a constant stored in static memory.

What we are defining here is a borrowed pointer.

The construct

   'static

is a named lifetime and it tells the compiler how long we want to borrow the pointer for.[1]
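For readers using a current version of Rust rather than 0.6, the equivalent declarations might look like the sketch below. The u16 type for the port and the listen_address helper are choices made for the example. In current Rust the 'static lifetime on a static or const reference can be left out and the compiler supplies it, although writing it explicitly is still accepted.

```rust
// A plain compile-time constant; `const` is the usual modern spelling
// for values like this.
const PORT: u16 = 3534;

// The 'static lifetime is implied here in current Rust.
static IPV4_LOOPBACK: &str = "127.0.0.1";

// The explicit form, as in the post, still compiles and means the same.
static IPV4_LOOPBACK_EXPLICIT: &'static str = "127.0.0.1";

/// Combines the two constants into a "host:port" string.
/// (Hypothetical helper for the example.)
fn listen_address() -> String {
    format!("{}:{}", IPV4_LOOPBACK, PORT)
}
```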

5.0 The Source Code For httpd v0.1

    // httpd.rc

    // v 0.1

    extern mod std;

    use core::task;

    use std::net::ip;
    use std::net::tcp;

    use std::uv_iotask;

    static BACKLOG: uint = 5;
    static PORT:    uint = 3534;
    
    static IPV4_LOOPBACK: &'static str = "127.0.0.1";

    fn main()
    {	
        tcp::listen(
            ip::v4::parse_addr(IPV4_LOOPBACK),
            PORT,
            BACKLOG,
            &uv_iotask::spawn_iotask(task::task()),
            |chan|
            {
                io::println(fmt!("on_establish_callback(%?)", chan));
            },
            |newConn, chan|
            {
                io::println(fmt!("new_connection_callback(%?, %?)", newConn, chan));
                fail!(~"Now what ?");
            });
    }

Notes

  1. Given the context I am not sure why the compiler cannot infer the required lifetime, but it doesn’t, so you have to tell it or it gets upset.



Programming With Rust — Part Eight: Invoking The Listen Function

After all that we should now be in a position to write some code to invoke the listen function.

1.0 The Arguments

So let’s see. As arguments to the listen function we need

1.1 host_ip

We can get an IpAddress by parsing a string representing the IPv4 loopback address using the function std::net::ip::v4::parse_addr.

The parse_addr function takes a borrowed pointer to a string, so for now we can hardwire the loopback address as a string literal.

1.2 port

The port number is easy. Just pick a number greater than 1024 and less than 65536 that nobody else is using.

1.3 backlog

The canonical value for the backlog argument to any listen function in any programming language is 5.[1]

1.4 iotask

IoTasks seem to be in fairly short supply.

The only way to obtain one seems to be to call the function std::uv_iotask::spawn_iotask.

Not sure what the implications of doing this are. The documentation for that particular function is less than forthcoming.

We will just have to do it and see what happens.

1.5 on_establish_cb And new_connect_cb

We can supply lambda expressions for the two callbacks. This will automatically result in the creation of the necessary owned closures.
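To give a feel for what passing owned closures as callbacks looks like, here is a small sketch in current Rust, where the nearest equivalent of an owned closure is a boxed closure. The function fake_listen is a made-up stand-in which simply invokes its callbacks. It is not the real listen function and does no networking.

```rust
/// Hypothetical stand-in for a listen-style function taking two owned
/// callbacks. It calls the first once and the second for two pretend
/// connections, collecting what they return.
fn fake_listen(
    on_establish: Box<dyn FnOnce(&str) -> String>,
    mut on_new_connection: Box<dyn FnMut(u32) -> String>,
) -> Vec<String> {
    let mut log = vec![on_establish("listening")];

    for id in 1..=2 {
        log.push(on_new_connection(id));
    }
    log
}

fn run_with_closures() -> Vec<String> {
    let prefix = String::from("conn");

    fake_listen(
        Box::new(|status: &str| format!("on_establish_callback({})", status)),
        // `move` transfers ownership of `prefix` into the closure.
        Box::new(move |id: u32| format!("new_connection_callback({} {})", prefix, id)),
    )
}
```

The second closure owns the captured prefix string, which is the property that makes it safe to hand the closure over to some other piece of code to call later.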

2.0 The Return Type

The return type of the listen function is defined to be

    result::Result<(), TcpListenErrData>

The Result type is defined in the module result like this,

    pub enum Result<T, U> {
        /// Contains the successful result value
        Ok(T),
        /// Contains the error value
        Err(U)
    }

so it is a generic enum type.

The value returned from a call to the listen function is going to be either

  • an instance of the Ok variant with an associated value of the unit type, that is, nothing, or

  • an instance of the Err variant with an associated value of type TcpListenErrData
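Handling the two variants can be sketched as follows in current Rust. The ListenError enum and the pretend_listen function are invented stand-ins for TcpListenErrData and listen, and the rules pretend_listen uses to decide on an error are arbitrary.

```rust
/// Hypothetical stand-in for TcpListenErrData.
#[derive(Debug, PartialEq)]
enum ListenError {
    AddressInUse,
    AccessDenied,
}

/// Pretends to listen: succeeds with the unit value or fails with an error.
fn pretend_listen(port: u32) -> Result<(), ListenError> {
    if port < 1024 {
        Err(ListenError::AccessDenied)
    } else if port == 8080 {
        Err(ListenError::AddressInUse)
    } else {
        Ok(())
    }
}

/// Turns either variant into a message by matching on it.
fn describe(result: Result<(), ListenError>) -> String {
    match result {
        Ok(()) => String::from("listening"),
        Err(error) => format!("listen failed: {:?}", error),
    }
}
```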

3.0 The Code

What all that looks like in practice is this

    ...

    tcp::listen(
        ip::v4::parse_addr("127.0.0.1"),
        3534,
        5,
        &uv_iotask::spawn_iotask(task::task()),
        |chan|
        {
            io::println(fmt!("on_establish_callback(%?)", chan));
        },
        |newConn, chan|
        {
            io::println(fmt!("new_connection_callback(%?, %?)", newConn, chan));
            fail!(~"Now what ?");
        });

    ...

The function io::println takes a borrowed pointer to a string and prints the string to stdout followed by a newline.

fmt! is like sprintf and fail! stops the Task.

For the moment we are going to ignore the return value.

Notes

  1. It's a tradition, or an old charter, or something.

Update: 27.07.2014

The socket functions as described above no longer exist in Rust.

See here for some discussion of the equivalent functionality in the current version of Rust
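As a rough guide to what that looks like, here is a minimal sketch using std::net::TcpListener from the current standard library. The function names are made up, port 0 asks the operating system to pick a free port (the post uses 3534), and there is no backlog argument because TcpListener::bind does not take one. The round trip only terminates because the client closes its connection. A web browser using keep-alive would leave read_to_end waiting.

```rust
use std::io::{self, Read, Write};
use std::net::{TcpListener, TcpStream};
use std::thread;

/// Accepts one connection and returns everything the client sent
/// before it closed the connection.
fn accept_one(listener: &TcpListener) -> io::Result<Vec<u8>> {
    let (mut stream, _peer) = listener.accept()?;
    let mut received = Vec::new();

    stream.read_to_end(&mut received)?;
    Ok(received)
}

/// Listens on the loopback address, connects to itself from a second
/// thread, and returns what arrived at the listening end.
fn loopback_round_trip(message: &'static [u8]) -> io::Result<Vec<u8>> {
    let listener = TcpListener::bind("127.0.0.1:0")?;
    let address = listener.local_addr()?;

    let client = thread::spawn(move || -> io::Result<()> {
        let mut stream = TcpStream::connect(address)?;
        stream.write_all(message)
        // The stream is dropped here, which closes the connection.
    });

    let received = accept_one(&listener)?;
    client.join().expect("client thread panicked")?;
    Ok(received)
}
```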



June 11, 2013

Programming With Rust — Part Four: Things We Need To Know – Function Definitions Revisited: Items, Expressions And Statements

1.0 Items

Technically a function definition is a function item (fn_item).

Similarly a module definition is a module item (mod_item).

A name binding is one kind of view item (view_item).

An enum type definition is an enum item (enum_item).

What all these things have in common, and what makes them items, is that it is possible to process them at compile time and if necessary store the results in a program’s read-only memory.

2.0 Function Bodies: Expressions And Statements

The body of a function is a block.

A block is a sequence of statements followed by an optional expression, delimited by braces (‘{‘ … ‘}’). [1]

An expression is something which when evaluated at run-time produces a value.

A statement is something which when evaluated at run-time does not produce a value.

Terminating an expression with a semi-colon causes the value of the expression to be ignored, that is, it becomes a statement, specifically an expression-statement. [2]

A block is itself an expression. If it ends with an expression then its value is the value of that expression. If it ends with a statement then its value is the singleton instance of the unit type, that is, nothing. [3]
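The difference between a block ending in an expression and a block ending in a statement can be seen in this small sketch, written for current Rust (where u32 stands in for uint). The function block_values is made up for the example, and the compiler will warn that the result of the second addition is unused, which is rather the point.

```rust
fn block_values() -> (u32, ()) {
    // This block ends with an expression, so its value is x + y.
    let with_value = {
        let x = 2;
        let y = 3;
        x + y
    };

    // This block ends with an expression-statement, so the sum is
    // discarded and the value of the block is the unit value ().
    let without_value = {
        let x = 2;
        let y = 3;
        x + y;
    };

    (with_value, without_value)
}
```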

3.0 Function Return Types

If a function does not return a value then the return type can be omitted from the definition.

    fn nonplussed(x: uint, y: uint)

Alternatively the return type can be explicitly declared as the unit type.

    fn nonplussed(x: uint, y: uint) -> ()

Or to put it another way, in Rust a function always returns something even if that something is nothing.

4.0 Function Return Values

Because the body of a function is a block, and because a block is an expression, then, by default, the return value of a function is the result of evaluating the block which comprises the body of the function.

For example,

    fn plus(x: uint, y: uint) -> uint
    {
        x + y
    }

Note the absence of a semi-colon (‘;’). If you inadvertently add one then the expression becomes an expression-statement and the function will no longer compile as there is no value to return.

Note that the converse is also true. You cannot return a value from a function which is declared not to return a value.

For example this will not compile

    fn nonplussed(x: uint, y: uint) 
    { 
        x + y
    }

whereas this will

    fn nonplussed(x: uint, y: uint)
    { 
        x + y;
    }

5.0 Explicitly Returning A Value From A Function

It is also possible to return a value explicitly using a return expression.

For example

    fn plus(x: uint, y: uint) -> uint
    {
        return x + y
    }

In this case the return is superfluous but there are situations in which it can be useful.
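One such situation is leaving a function early from inside a loop. The sketch below is written for current Rust (u32 and usize standing in for uint), and find_cr is a made-up example function. It shows the implicit and explicit forms side by side.

```rust
/// The value of the body block is the return value.
fn plus(x: u32, y: u32) -> u32 {
    x + y
}

/// Returns the index of the first carriage-return in `bytes`, if any.
fn find_cr(bytes: &[u8]) -> Option<usize> {
    for (index, byte) in bytes.iter().enumerate() {
        if *byte == 13 {
            // An explicit return leaves the function from inside the loop.
            return Some(index);
        }
    }
    // Reached only if no CR was found; this is the value of the body block.
    None
}
```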

Notes

[1] This is my definition of a block as currently there does not appear to be a formal definition in the Rust documentation.

[2] Empirically this does not seem to be true for all possible expressions.

[3] This is my interpretation of the semantics of a block. The Rust documentation does specify that a block is an expression but does not explicitly specify its semantics.



June 10, 2013

Programming With Rust — Part One: So Where Are The Sockets ?

To get some kind of a feel for a new programming language I like to try and use it to write some sort of ‘real-world’-ish program.

After toying with various ideas I decided to try and use Rust to write a very simple HTTP server.

Note

All the code that follows was compiled and run using the Mac OS X version of Rust 0.6

How I built Rust 0.6 for Mac OS X is described here.

1.0 So Where Are the Sockets ?

The first thing you need to implement an HTTP server is some sockets, well at least one socket.

Rummaging around in the documentation we find the following here (re-formatted for clarity)


    fn listen(
           host_ip:         ip::IpAddr, 
           port:            uint, 
           backlog:         uint, 
           iotask:          &IoTask,
           on_establish_cb: ~fn(SharedChan<Option<TcpErrData>>),
           new_connect_cb:  ~fn(TcpNewConnection, SharedChan<Option<TcpErrData>>)) 
		   
       ->  result::Result<(), TcpListenErrData>

So what does all that mean ?

To start with it helps to know how functions are defined in Rust.

2.0 Function Definitions

A function definition in Rust starts with the keyword

    fn

This is followed by the name of the function, then a list of parameters the function takes, then the return type, if any, then the body of the function.

Slightly more formally

    functiondef : "fn" ident parameters returntype? body

    parameters  : '(' parameter [',' parameter]* ')' | '()'
    
    parameter   : ident ':' type
    
    returntype  : '->' type

Note

The definition of a parameter is a bit more complicated than that but it will do for now.
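A few definitions that fit the simplified grammar above, written for current Rust (so u32 rather than uint). The functions themselves are invented purely to show the shapes.

```rust
// "fn" ident '()' returntype body
fn default_port() -> u32 {
    3534
}

// "fn" ident parameters returntype body
fn endpoint(host: &str, port: u32) -> String {
    format!("{}:{}", host, port)
}

// "fn" ident parameters body (the return type is omitted)
fn announce(log: &mut Vec<String>, port: u32) {
    log.push(format!("listening on {}", port));
}
```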

3.0 The listen Function Revisited

Given that we now know how functions are defined, what can we deduce about the listen function without having to resort to reading the documentation ?

Not a great deal.

It takes six arguments and returns a value. That is about it.

Everything else is pretty much pure conjecture.

The double colon in the type ip::IpAddr might have something to do with namespaces ?

uints might be unsigned integers ?

The ampersand in &IoTask might have something to do with references, but does Rust even have references ?

Unfortunately compilers are usually not big on conjectures, so there is nothing for it, we are going to have to find stuff out.

4.0 What Do We Need To Know About In Order To Use The listen Function ?

It turns out that to actually use the listen function it is first necessary to know something about all of the following aspects of Rust

  • some primitive types

  • enum types

  • generic types

  • modules

  • items, expressions, and statements

  • tasks

  • the Rust memory model

  • lambda expressions and closures

For me this is the benefit of trying to do something ‘real-world’ ish when learning a new programming language. Usually you really do have to read the documentation and get to grips with what it says in order to get anything at all to happen !

Update: 27.07.2014

The socket functions as described above no longer exist in Rust.

See here for some discussion of the equivalent functionality in the current version of Rust

