Just An Application

June 17, 2013

Programming With Rust — Part Ten: More Fun With Sockets

Currently our HTTP server is a little lacking in functionality although it is reasonably secure.

The next thing to do is to accept the incoming connection.

1.0 The accept Function

Given that there is a listen function you would expect that there would also be an accept function and indeed there is.

It is documented here and it looks like this

    fn accept(new_conn: TcpNewConnection) -> result::Result<TcpSocket, TcpErrData>

It takes a TcpNewConnection as an argument, which is handy because we get passed one of those, and it returns a

    result::Result<TcpSocket, TcpErrData>

which as we have already seen is a parameterized enum type.

In this case the value returned from a call to the accept function is going to be either

  • an instance of the Ok variant with an associated value of the type TcpSocket, or

  • an instance of the Err variant with an associated value of type TcpErrData

2.0 Working With Enums

To determine the outcome of our call to accept we will need to use a match expression.

A Rust match expression is akin to a C++ or Java switch statement but considerably more flexible than either of them.

A match expression can be used to match a value of an enum type against its possible variants.

In the case of the value of type result::Result<TcpSocket, TcpErrData>
returned by the accept function we can do this

    match tcp::accept(newConn)
    {
        Ok(socket) => 
        {
            // connection accepted 
            // ...
        },
			   
        Err(error) => 
        {
            // whoops ! something's gone wrong
            // ...
        }
    }

If the return value matches the Ok variant pattern then the TcpSocket value it contains is bound to the local variable socket.

Alternatively if it matches the Err variant pattern then the TcpErrData value it contains is bound to the local variable error.
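The socket API used here belongs to Rust 0.6, but matching on a Result has the same shape in current Rust. What follows is only a minimal sketch in current Rust: parse_port and describe are hypothetical functions standing in for accept and its caller, they are not part of any library.

```rust
// Hypothetical stand-in for accept: returns either the Ok variant holding
// a port number or the Err variant holding a message.
fn parse_port(text: &str) -> Result<u16, String> {
    text.parse::<u16>()
        .map_err(|e| format!("bad port '{}': {}", text, e))
}

// Match the Result against its two variants. In each arm the value the
// variant contains is bound to a local variable (port or error).
fn describe(result: Result<u16, String>) -> String {
    match result {
        Ok(port) => format!("listening on port {}", port),
        Err(error) => format!("whoops: {}", error),
    }
}

fn main() {
    println!("{}", describe(parse_port("3534")));
    println!("{}", describe(parse_port("not a port")));
}
```

The compiler insists that both variants are handled, which is what makes it difficult to ignore the failure case by accident.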

3.0 Accepting The Connection

3.1 Stage One

To make it easier to show what is going on in the code the two closures passed to the listen function are replaced by functions.

In this case there was nothing for them to ‘close over’ anyway so it does not really make much difference.

The three functions look like this,

    ...

    fn on_establish_callback(chan: SharedChan<Option<TcpErrData>>)
    {
        io::println(fmt!("on_establish_callback(%?)", chan));
    }


    fn new_connection_callback(newConn :TcpNewConnection, chan: SharedChan<Option<TcpErrData>>)
    {
        io::println(fmt!("new_connection_callback(%?, %?)", newConn, chan)); 
        fail!(~"Now what ?");
    }

    fn main()
    {	
        tcp::listen(
            ip::v4::parse_addr(IPV4_LOOPBACK),
            PORT,
            BACKLOG,
            &uv_iotask::spawn_iotask(task::task()),
            on_establish_callback,
            new_connection_callback);
    }

3.2 Stage Two

At this point it all gets a bit complicated.

Whilst calling the accept function is straightforward enough, as is getting at the result, the documentation states

It is safe to call net::tcp::accept only within the context of the new_connect_cb callback provided as the final argument to the net::tcp::listen function.

The new_conn opaque value is provided only as the first argument to the new_connect_cb provided as a part of net::tcp::listen. It can be safely sent to another task but it must be used (via net::tcp::accept) before the new_connect_cb call it was provided to returns.

and then for good measure in the Returns section it goes on to say

On success, this function will return a net::tcp::TcpSocket as the Ok variant of a Result.
The net::tcp::TcpSocket is anchored within the task that accept was called within for its lifetime. On failure, this function will return a net::tcp::TcpErrData record as the Err variant of a Result.

So

  1. The accept function must be called before the new_connect_cb callback invoked by the listen function returns

  2. The TcpNewConnection value passed to the new_connect_cb callback can be passed between Tasks

  3. The TcpSocket value returned from a successful call to the accept function cannot be passed between Tasks

There are two solutions to this little exercise in constraint programming, either

  • accept the connection in the new_connect_cb callback, which ensures the accept function completes before the callback returns but means the connection has to be handled in the callback as well

  • spawn a new task and accept the connection there but ensure that the new_connect_cb callback waits until the accept function has returned which will require some form of synchronization.

Since blocking the callback while we handle the connection presumably prevents any further connections being established, which will make for a very serial server, we will go with the second option.

4.0 Spawning A Task

We can spawn a task using the task::spawn function which is documented here and defined like this

    fn spawn(f: ~fn())

It takes an owned closure which does not take any arguments.

So to accept a connection in a new task we would need to do something like this

    ...

    task::spawn(
        ||
        {
            match tcp::accept(newConn)
            {
                Ok(socket) => 
                {
                    // ...
                },
			   
                Err(error) => 
                {
                    // ...
                }
            }
        });

    ...

5.0 Things We Need To Know Part N: The do Expression

A function which takes a closure as its last argument can also be invoked using a do expression with the closure argument appearing outside the function call.

The example of spawning a task above becomes

    ...

    do task::spawn()
        ||
        {
            match tcp::accept(newConn)
            {
                Ok(socket) => 
                {
                    // ...
                },
			   
                Err(error) => 
                {
                    // ...
                }
            }
        }

    ...

If the argument list to the function is empty it can be omitted. This is also true for the closure.

Taking advantage of this the example then becomes

    ...

    do task::spawn
        {
            match tcp::accept(newConn)
            {
                Ok(socket) => 
                {
                    // ...
                },
			   
                Err(error) => 
                {
                    // ...
                }
            }
        }

    ...

This pattern is used quite heavily so it is important to be able to recognize it when you see it.

6.0 Things We Need to Know Part N + 1: Method Invocation Very Very Briefly

Methods can be defined on most Rust types.

Methods are invoked using the dot notation, that is

    value.method()
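As a small illustration, here is a sketch in current Rust of defining a method on a type and invoking it with the dot notation. The Counter type and its incremented method are hypothetical, made up purely for this example.

```rust
// Hypothetical type used only to illustrate method invocation.
struct Counter {
    count: u32,
}

impl Counter {
    // A method takes the value it is invoked on as its first parameter.
    fn incremented(&self) -> u32 {
        self.count + 1
    }
}

fn main() {
    let counter = Counter { count: 41 };
    // value.method()
    println!("{}", counter.incremented());
}
```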

7.0 Task Synchronization

We need to be able to wait in the new_connection_callback function until the accept function has completed in the newly spawned task.

More rummaging around in the documentation turns up the std::sync module.

This module defines the useful looking Mutex struct.

The function Mutex returns a Mutex with an associated CondVar and CondVar supports signal and wait methods.

Mutex supports the lock_cond method which locks itself and then invokes a function with its associated CondVar as its argument.

To use a Mutex/CondVar for Task synchronization, in the function new_connection_callback we need to do something like this

    ...

    let mx = Mutex();
    
    do mx.lock_cond
        |cv|
        {
            // spawn task
            // ...
            cv.wait();
        }
        
    ...

and in the spawned task we need to do something like this

    ...

    match tcp::accept(newConn)
    {
        Ok(socket) => 
        {
            // call lock_cond and signal cv
                        
        },
			   
        Err(error) => 
        {
            // call lock_cond and signal cv
        }
    }

    ...

The question is how do we get hold of the Mutex in the spawned task so that we can lock it ?

The Mutex cannot be moved irrespective of where it is allocated because it is referenced by the call to lock_cond which is not going to return until the associated CondVar is signalled.

We need something which can be moved into the spawned task and enables us to act on the original Mutex/CondVar.

Some more rummaging in the documentation turns up the Mutex clone method which looks as though it is what we want.

If in new_connection_callback we create a clone of the original Mutex in the owned heap then the owned closure passed to task::spawn can take ownership of it.

So in new_connection_callback we need to do this.

    ...

    let mx = Mutex();
    
    do mx.lock_cond
        |cv|
        {
            let mxc = ~mx.clone();
            
            // spawn task
            // ...
            cv.wait();
        }
        
    ...

and in the spawned task we can do this

    ...

    match tcp::accept(newConn)
    {
        Ok(socket) => 
        {
            do mxc.lock_cond
                |cv|
                {
                    cv.signal();
                }
        },
			   
        Err(error) => 
        {
            do mxc.lock_cond
                |cv|
                {
                    cv.signal();
                }
        }
    }

    ...

8.0 Accepting The Connection Revisited

So combining all the above we end up with this

    ...
 
    fn new_connection_callback(newConn :TcpNewConnection, chan: SharedChan<Option<TcpErrData>>)
    {
        io::println(fmt!("new_connection_callback(%?, %?)", newConn, chan));
	
        let mx = Mutex();
	
        do mx.lock_cond
            |cv|
            {
                let mxc = ~mx.clone();

                do task::spawn 
                {
                    match tcp::accept(newConn)
                    {
                        Ok(socket) => 
                        {
                            io::println("accept succeeded");
                            do mxc.lock_cond
                                |cv|
                                {
                                    cv.signal();
                                }
                        },
                        Err(error) => 
                        {
                            io::println("accept failed");
                            do mxc.lock_cond
                                |cv|
                                {
                                    cv.signal();
                                }
                        }
                    }
                }
                cv.wait();
            }
        fail!(~"Now what ?");
    }
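The Mutex function and lock_cond method used above are specific to Rust 0.6. For comparison, the same wait-until-signalled pattern can be sketched in current Rust, assuming threads in place of tasks and the standard library's std::sync::Mutex, Condvar and Arc. The function wait_for_worker is hypothetical, and the boolean flag is an addition of mine which is needed because a current Condvar can wake up spuriously.

```rust
use std::sync::{Arc, Condvar, Mutex};
use std::thread;

// Hypothetical stand-in for the callback: spawn a worker, then block until
// the worker signals that its "accept" step has finished.
fn wait_for_worker() -> bool {
    // The flag records whether the worker is done; the Condvar wakes the waiter.
    let pair = Arc::new((Mutex::new(false), Condvar::new()));
    // The clone is what gets moved into the spawned thread.
    let pair_for_worker = Arc::clone(&pair);

    let worker = thread::spawn(move || {
        // ... the call to accept would go here ...
        let (lock, cv) = &*pair_for_worker;
        *lock.lock().unwrap() = true;
        cv.notify_one();
    });

    let (lock, cv) = &*pair;
    let mut done = lock.lock().unwrap();
    // The loop copes with spurious wakeups and with a signal sent before we wait.
    while !*done {
        done = cv.wait(done).unwrap();
    }
    let result = *done;
    drop(done);
    worker.join().unwrap();
    result
}

fn main() {
    println!("worker finished: {}", wait_for_worker());
}
```

As in the Rust 0.6 version, the thing moved into the spawned code is a clone of a handle to the shared Mutex, not the Mutex itself.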

9.0 Running The Code

If we run the new version of httpd and then use a web browser to connect to 127.0.0.1:3534 we get this (output slightly re-formatted for clarity)

 
    ./httpd
    on_establish_callback({x: {data: (0x10070a570 as *())}})
    new_connection_callback(NewTcpConn((0x100832c00 as *())), {x: {data: (0x10070a570 as *())}})
    accept succeeded
    rust: task failed at 'Now what ?', httpd.rc:71
    Assertion failed: (false && "Rust task failed after reentering the Rust stack"), \
        function upcall_call_shim_on_rust_stack, \
        file /Users/simon/Src/lang/rust-0.6/src/rt/rust_upcall.cpp, line 92.
    Abort trap

10.0 The Source Code For httpd v0.2

    // httpd.rc

    // v0.2

    extern mod std;

    use core::comm::SharedChan;

    use core::option::Option;

    use core::task;

    use std::net::ip;
    use std::net::tcp;
    use std::net::tcp::TcpErrData;
    use std::net::tcp::TcpNewConnection;

    use std::sync::Mutex;

    use std::uv_iotask;

    static BACKLOG: uint = 5;
    static PORT:    uint = 3534;

    static IPV4_LOOPBACK: &'static str = "127.0.0.1";


    fn on_establish_callback(chan: SharedChan<Option<TcpErrData>>)
    {
        io::println(fmt!("on_establish_callback(%?)", chan));
    }

    fn new_connection_callback(newConn :TcpNewConnection, chan: SharedChan<Option<TcpErrData>>)
    {
        io::println(fmt!("new_connection_callback(%?, %?)", newConn, chan));
	
        let mx = Mutex();
	
        do mx.lock_cond
            |cv|
            {
                let mxc = ~mx.clone();

                do task::spawn 
                {
                    match tcp::accept(newConn)
                    {
                        Ok(socket) => 
                        {
                            io::println("accept succeeded");
                            do mxc.lock_cond
                                |cv|
                                {
                                    cv.signal();
                                }
                        },
                        Err(error) => 
                        {
                            io::println("accept failed");
                            do mxc.lock_cond
                                |cv|
                                {
                                    cv.signal();
                                }
                        }
                    }
                }
                cv.wait();
            }
        fail!(~"Now what ?");
    }

    fn main()
    {	
        tcp::listen(
            ip::v4::parse_addr(IPV4_LOOPBACK),
            PORT,
            BACKLOG,
            &uv_iotask::spawn_iotask(task::task()),
            on_establish_callback,
            new_connection_callback);
    }

Update: 27.07.2014

The socket functions as described above no longer exist in Rust.

See here for some discussion of the equivalent functionality in the current version of Rust


Copyright (c) 2013 By Simon Lewis. All Rights Reserved.

Unauthorized use and/or duplication of this material without express and written permission from this blog’s author and owner Simon Lewis is strictly prohibited.

Excerpts and links may be used, provided that full and clear credit is given to Simon Lewis and justanapplication.wordpress.com with appropriate and specific direction to the original content.

June 13, 2013

Programming With Rust — Part Six: Things We Need To Know – Lambda Expressions And Closures

1.0 Lambda Expressions

Rust supports lambda expressions.

Depending on your point of view this either makes Rust dangerously fashionable, after all both C++ and Java are in the process of acquiring them, or, if like me you learnt to program in Lisp a very long time ago, reassuringly old-fashioned [1].

A lambda expression is an expression which defines an anonymous function.

The result of evaluating a lambda expression is a value which can be invoked as a function.

This value can be stored or passed as an argument to a function.

The general form of a lambda expression is a parameter list, delimited using the ‘|’ character, followed by a return type, followed by an expression.

For example,

   | x: uint, y: uint | -> uint x + y

The return type can be omitted if it is possible for the compiler to infer it. In this case it can, so we can do this.

   | x: uint, y: uint | x + y

The types of the parameters can also be omitted if it is possible for the compiler to infer them.
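For comparison, here is a sketch of the same three levels of annotation in current Rust, where uint has become other integer types (u32 is used here) and a lambda expression with an explicit return type must have its body in braces. The helper function sums is hypothetical.

```rust
// Hypothetical helper which evaluates three equivalent lambda expressions.
fn sums() -> (u32, u32, u32) {
    // Parameter and return types spelt out. Current Rust requires braces
    // around the body when the return type is given.
    let plus_full = |x: u32, y: u32| -> u32 { x + y };

    // Return type left for the compiler to infer.
    let plus_short = |x: u32, y: u32| x + y;

    // Parameter types inferred as well, from the call below.
    let plus_bare = |x, y| x + y;

    (plus_full(1, 2), plus_short(1, 2), plus_bare(1u32, 2u32))
}

fn main() {
    println!("{:?}", sums());
}
```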

2.0 Closures

The expression which comprises the body of the function defined by a lambda expression can potentially reference variables lexically visible at the point the lambda expression is defined.

Once the lambda expression has been evaluated the resulting function can be invoked at a point where its original environment is no longer directly accessible. This implies that the runtime representation cannot simply be the code of the function itself. It must also include some representation of those variables referred to by the function it represents.

This representation is called a closure, so called because it is the result of closing over the environment in which the lambda expression was evaluated.
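A minimal sketch of this in current Rust, assuming a hypothetical function make_adder: the closure it returns is invoked after make_adder has returned, so the closure has to carry its own copy of n with it.

```rust
// Returns a closure which has closed over n. The move keyword copies n
// into the closure so that it remains valid once make_adder has returned.
fn make_adder(n: u32) -> impl Fn(u32) -> u32 {
    move |x| x + n
}

fn main() {
    let add_five = make_adder(5);
    // The local variable n of make_adder no longer exists here,
    // but the closure still has access to its captured copy.
    println!("{}", add_five(1));
}
```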

As you might expect given the Rust runtime memory model there are three distinct types of closure,

  • stack

  • managed

  • owned

corresponding to the three types of read/write memory in which they can be allocated.

3.0 Closure Types

A closure type is declared by specifying where it is stored followed by the parameter list followed by the return type.

For example, this

    &fn(x: uint, y: uint) -> uint

declares a stack closure which takes two arguments of type uint and returns a value of type uint.

This,

    @fn(x: uint, y: uint) -> uint

declares a managed closure which takes two arguments of type uint and returns a value of type uint.

This,

    ~fn(x: uint, y: uint) -> uint

declares an owned closure which takes two arguments of type uint and returns a value of type uint.

It is the declared type of a closure which determines where the closure corresponding to the evaluation of a lambda expression is allocated.

This

    let plus: &fn(x: uint, y:uint) -> uint = | x: uint, y: uint | -> uint x + y ;

will result in the creation of a stack closure.

Note that the presence of the type declaration enables the compiler to infer the parameter and return types so it is possible to write the lambda expression above rather more gnomically like this

    let plus: &fn(x: uint, y:uint) -> uint = | x, y | x + y ;

which is nice.

This

    let plus: @fn(x: uint, y:uint) -> uint = | x, y | x + y ;

will result in the creation of a managed closure.

This

    let plus: ~fn(x: uint, y:uint) -> uint = | x, y | x + y ;

will result in the creation of an owned closure.

If the type of a closure is not explicitly declared then it defaults to being a stack closure.

4.0 Closure Lifetimes And Environment Capture

Each type of closure has a different lifetime determined by where it is allocated.

The lifetime of a closure type affects both what it can close over and how it does so.

The basic principle is that a closure cannot contain a reference to something that potentially has a shorter lifetime than the closure itself.

4.1 Stack Closure Environment Capture

4.1.1 Stack Closure Environment Capture In Theory

A stack closure is stored in a stack frame and hence it can only exist for the lifetime of the function in which it is created.

This would seem to imply that a stack closure resulting from the evaluation of a lambda expression should be able to access any local variable which has been defined before the lambda expression itself, because as long as the stack closure exists then the local variables exist so it is safe to access them.

If this is true then the implementation of a stack closure simply needs a list of the addresses of all the eligible local variables. They can then be accessed directly as required.

So is it true ?

Let us consider things on a case by case basis.

4.1.1.1 Local Values

For local variables holding values it is true. They are valid as long as the stack closure is valid so they can be accessed directly.

4.1.1.2 Managed Boxes

For local variables holding managed pointers it is also true.

A local variable holding a managed pointer is valid for as long as the stack closure is valid.

As long as the managed pointer held by the local variable is valid the referenced managed box is valid.

There is one caveat. If the local variable holding the managed pointer is itself mutable then the stack closure could end up accessing different managed boxes depending on when and/or how many times it is invoked, which may or may not be a problem.

4.1.1.3 Owned Boxes

It is not true for local variables holding owned pointers because there is a situation where a local variable holding an owned pointer can become invalid, specifically when the owned pointer it holds is assigned to another local variable or passed as an argument to a function.

For example, given this enum type

    enum Marlin
    {
        BlackMarlin,
        BlueMarlin,
        StripedMarlin,
        WhiteMarlin
    }

and this function definition

    fn catch(marlin: &Marlin)
    {
        ...
    }

then if you did this

    ...

    let marlin: ~Marlin = ~StripedMarlin;
    let angler: &fn()   = || { catch(marlin); };
    let caught: ~Marlin = marlin;
  
    angler();
    
    ...

then at the point the stack closure is invoked the local variable marlin is no longer valid. If we assume that the catch function accesses its argument in some way then it will be attempting to dereference whatever is now in the local variable marlin which is probably not a pointer to an owned box.
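For comparison, here is a sketch of the same example in current Rust, with Box standing in for the ~ pointer. This is my translation rather than the original code, and the land_marlin function is hypothetical. The current compiler tracks the closure's borrow of marlin and rejects the move if it comes before the last use of the closure.

```rust
#[derive(Debug)]
#[allow(dead_code)]
enum Marlin {
    BlackMarlin,
    BlueMarlin,
    StripedMarlin,
    WhiteMarlin,
}

fn catch(marlin: &Marlin) -> String {
    format!("caught a {:?}", marlin)
}

// Hypothetical wrapper around the example so that it can be run.
fn land_marlin() -> String {
    let marlin: Box<Marlin> = Box::new(Marlin::StripedMarlin);
    // The closure borrows the local variable marlin.
    let angler = || catch(&marlin);

    // Moving the box at this point, before the last call to angler, is
    // rejected by the current compiler with error E0505
    // (cannot move out of `marlin` because it is borrowed):
    // let caught: Box<Marlin> = marlin;

    let report = angler();

    // Moving it here is accepted because the closure is no longer used.
    let caught: Box<Marlin> = marlin;
    drop(caught);
    report
}

fn main() {
    println!("{}", land_marlin());
}
```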

4.1.2 Stack Closure Environment Capture In Rust 0.6

So how does stack closure environment capture work in Rust 0.6 ?

It works exactly as I have described above, up to and including the owned box problem.

The owned box example above will compile and if the catch function does something with its argument like try to print it out the resulting program will crash with a segmentation violation.

This is a known problem.

See here for the bug I raised.

4.2 Managed Closure Environment Capture

A managed closure is stored in the Managed heap and hence it can potentially exist for as long as the Task in which it was created.

4.2.1 Local Values

A managed closure can reference an immutable value in a local variable because it can safely be copied.

A managed closure cannot reference a mutable value in a local variable even if it does not modify it.

4.2.2 Managed Boxes

A managed closure can reference both immutable and mutable managed boxes because it can create additional managed pointers which keep the referenced managed boxes from being garbage collected.

4.2.3 Owned Boxes

A managed closure can reference an immutable owned box but if it does so it necessarily takes ownership of it. In addition it cannot subsequently relinquish ownership [2].

A managed closure cannot reference a mutable owned box.

4.3 Owned Closure Environment Capture

An owned closure is stored in the Owned heap and hence it can potentially exist for as long as the program in which it was created.

4.3.1 Local Values

An owned closure can reference an immutable value in a local variable because it can safely be copied.

An owned closure cannot reference a mutable value in a local variable even if it does not modify it.

4.3.2 Managed Boxes

An owned closure cannot reference immutable or mutable managed boxes both because it can potentially outlive them and because it can potentially move to another Task in which case the Managed heap of the Task in which it was created is no longer accessible.

4.3.3 Owned Boxes

As in the managed closure case, an owned closure can reference an immutable owned box but if it does so then it necessarily takes ownership of it. As in the managed closure case, it cannot subsequently relinquish ownership.

As in the managed closure case, an owned closure cannot reference a mutable owned box.

5.0 Closure Type Compatibility

Closures which differ only in their allocation type are not type compatible with one exception.

It is possible to use either a managed or owned closure in place of a stack closure if their parameters and return type are identical.

For example, if you define the function apply like this

    fn apply(f: &fn(x: uint, y: uint) -> uint, a: uint, b: uint) -> uint
    {
        f(a, b)
    }

you can do this

    fn main()
    {
        let s_plus: &fn(x: uint, y: uint) -> uint = |x, y| x + y ;
        let m_plus: @fn(x: uint, y: uint) -> uint = |x, y| x + y ;
        let o_plus: ~fn(x: uint, y: uint) -> uint = |x, y| x + y ;
    
        apply(s_plus, 1, 2);
        apply(m_plus, 3, 4);
        apply(o_plus, 5, 6);
    }
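The &fn, @fn and ~fn types no longer exist in current Rust. As a rough analogue only, and this mapping is my assumption rather than an official equivalence, a closure on the stack, one in an Rc and one in a Box can all be passed where a borrowed closure is expected. The function apply_all is hypothetical.

```rust
use std::rc::Rc;

// Accepts any borrowed closure with the right parameter and return types.
fn apply(f: &dyn Fn(u32, u32) -> u32, a: u32, b: u32) -> u32 {
    f(a, b)
}

// Hypothetical helper which passes three differently allocated closures to apply.
fn apply_all() -> (u32, u32, u32) {
    // Lives in the stack frame of apply_all.
    let s_plus = |x: u32, y: u32| x + y;
    // Reference counted, loosely comparable to a managed closure.
    let r_plus: Rc<dyn Fn(u32, u32) -> u32> = Rc::new(|x: u32, y: u32| x + y);
    // Uniquely owned heap allocation, loosely comparable to an owned closure.
    let b_plus: Box<dyn Fn(u32, u32) -> u32> = Box::new(|x: u32, y: u32| x + y);

    (
        apply(&s_plus, 1, 2),
        apply(&*r_plus, 3, 4),
        apply(&*b_plus, 5, 6),
    )
}

fn main() {
    println!("{:?}", apply_all());
}
```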

6.0 Closures And Function Compatibility

The name of a statically defined function can be used in place of any type of closure which has the same parameter and return types.

If the function plus is defined like this

    fn plus(x: uint, y: uint) -> uint
    {
        x + y
    }

then all of the following are valid.

    let s_plus: &fn(x: uint, y: uint) -> uint = plus;

    let m_plus: @fn(x: uint, y: uint) -> uint = plus;

    let o_plus: ~fn(x: uint, y: uint) -> uint = plus;
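The same is true in current Rust, sketched here with borrowed and boxed closure types in place of the Rust 0.6 ones. The function use_plus is hypothetical.

```rust
fn plus(x: u32, y: u32) -> u32 {
    x + y
}

fn apply(f: &dyn Fn(u32, u32) -> u32, a: u32, b: u32) -> u32 {
    f(a, b)
}

// Hypothetical helper which uses the named function plus where a closure is expected.
fn use_plus() -> (u32, u32) {
    // The name of the function is stored where a boxed closure is expected.
    let boxed: Box<dyn Fn(u32, u32) -> u32> = Box::new(plus);
    // A reference to the function is passed where a borrowed closure is expected.
    (apply(&plus, 1, 2), boxed(3, 4))
}

fn main() {
    println!("{:?}", use_plus());
}
```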

Notes

  1. Rust, C++ and Java all use the term lambda expression to denote an anonymous function, which may seem a little puzzling unless you know that in Lisp, which was the first language to support them, an anonymous function is identified using the symbol lambda.

    Why lambda ? It is a reference to Alonzo Church’s system of formal computation, the lambda calculus, where the Greek character λ (small letter lambda) denotes an anonymous function.

  2. An exercise for the reader. Why Not ?

