Just An Application

June 13, 2013

Programming With Rust — Part Six: Things We Need To Know – Lambda Expressions And Closures

1.0 Lambda Expressions

Rust supports lambda expressions.

Depending on your point of view this either makes Rust dangerously fashionable, after all both C++ and Java are in the process of acquiring them, or, if like me you learnt to program in Lisp a very long time ago, reassuringly old-fashioned. [1].

A lambda expression is an expression which defines an anonymous function.

The result of evaluating a lambda expression is a value which can be invoked as a function.

This value can be stored or passed as an argument to a function.

The general form of a lambda expression is a parameter list, delimited using the ‘|‘ character, followed by a return type, followed by an expression.

For example,

   | x: uint, y: uint | -> uint x + y

The return type can be omitted if it possible for the compiler to infer them. In this case it can, so we can do this.

   | x: uint, y: uint | x + y

The types of the parameters can also be omitted if it possible for the compiler to infer them.

2.0 Closures

The expression which comprises the body of the function defined by a lambda expression can potentially reference variables lexically visible at the point the lambda expression is defined.

Once the lambda expression has been evaluated the resulting function can be invoked at a point where its original environment is no longer directly accessible. This implies that the runtime representation cannot simply be the code of the function itself. It must also include some representation of those variables referred to by the function it represents.

This representation is called a closure, so called because it is the result of closing over the environment in which the lambda expression was evaluated.

As you might expect given the Rust runtime memory model there are three distinct types of closure,

  • stack

  • managed

  • owned

corresponding to the three types of read/write memory in which they can be allocated.

3.0 Closure Types

A closure type is declared by specifying where it is stored followed by the parameter list followed by the return type.

For example, this

    &fn(x: uint, y: uint) -> uint

declares a stack closure which takes two arguments of type uint and returns a value of type uint.

This,

    @fn(x: uint, y: uint) -> uint

declares a managed closure which takes two arguments of type uint and returns a value of type uint.

This,

    ~fn(x: uint, y: uint) -> uint

declares an owned closure which takes two arguments of type uint and returns a value of type uint.

It is the declared type of a closure which determines where the closure corresponding to the evaluation of a lambda expression is allocated.

This

    let plus: &fn(x: uint, y:uint) -> uint = | x: uint, y: uint | -> uint x + y ;

will result in the creation of a stack closure.

Note that the presence of the type declaration enables the compiler to infer the parameter and return types so it is possible to write the lambda expression above rather more gnomically like this

    let plus: &fn(x: uint, y:uint) -> uint = | x, y | x + y ;

which is nice.

This

    let plus: @fn(x: uint, y:uint) -> uint = | x, y | x + y ;

will result in the creation of a managed closure.

This

    let plus: ~fn(x: uint, y:uint) -> uint = | x, y | x + y ;

will result in the creation of an owned closure.

If the type of a closure is not explicitly declared then it defaults to being a stack closure.

4.0 Closure Lifetimes And Environment Capture

Each type of closure has a different lifetime determined by where it is allocated.

The lifetime of a closure type affects both what it can close over and how it does so.

The basic principle is that a closure cannot contain a reference to something that potentially has a shorter lifetime than the closure itself.

4.1 Stack Closure Environment Capture

4.1.1 Stack Closure Environment Capture In Theory

A stack closure is stored in a stack frame and hence it can only exist for the lifetime of the function in which it is created.

This would seem to imply that a stack closure resulting from the evaluation of a lambda expression should be able to access any local variable which has been defined before the lambda expression itself, because as long as the stack closure exists then the local variables exist so it is safe to access them.

If this is true then the implementation of a stack closure simply needs a list of the addresses of all the eligible local variables. They can then be accessed directly as required.

So is it true ?

Considering things on a case by case basis.

4.1.1.1 Local Values

For local variables holding values it is true. They are valid as long as the stack closure is valid so they can be accessed directly.

4.1.1.2 Managed Boxes

For local variables holding managed pointers it is also true.

A local variable holding a managed pointer is valid for as long as the stack closure is valid.

As long as the managed pointer held by the local variable is valid the referenced managed box is valid.

There is one caveat. If the local variable holding the managed pointer is itself mutable then the stack closure could end up accessing different managed boxes depending on when and/or how many times it is invoked, which may or may not be a problem.

4.1.1.3 Owned Boxes

It is not true for local variables holding owned pointers because there is a situation where a local variable holding an owned pointer can become invalid, specifically when the owned pointer it holds is assigned to another local variable or passed as an argument to a function.

For example, given this enum type

    enum Marlin
    {
        BlackMarlin,
        BlueMarlin,
        StripedMarlin,
        WhiteMarlin
    }

and this function definition

    fn catch(marlin: &Marlin)
    {
        ...
    }

then if you did this

    ...

    let marlin: ~Marlin = ~StripedMarlin;
    let angler: &fn()   = || { catch(marlin); };
    let caught: ~Marlin = marlin;
  
    angler();
    
    ...

then at the point the stack closure is invoked the local variable marlin is no longer valid. If we assume that the catch function accesses its argument in some way then it will be attempting to dereference whatever is now in the local variable marlin which is probably not a pointer to an owned box.

4.1.2 Stack Closure Environment Capture In Rust 0.6

So how does stack closure environment capture work in Rust 0.6 ?

It works exactly as I have described above, up to and including the owned box problem.

The owned box example above will compile and if the catch function does something with its argument like try to print it out the resulting program will crash with a segmentation violation.

This is a known problem.

See here for the bug I raised.

4.2 Managed Closure Environment Capture

A managed closure is stored in the Managed heap and hence it can potentially exist for as long as the Task in which it was created.

4.2.1 Local Values

A managed closure can reference an immutable value in a local variable because it can safely be copied.

A managed closure cannot reference a mutable value in a local variable even if it does not modify it.

4.2 Managed Boxes

A managed closure can reference both immutable and mutable managed boxes because it can create additional managed pointers which keep the referenced managed boxes from being garbage collected.

4.3 Owned Boxes

A managed closure can reference an immutable owned box but if it does so it necessarily takes ownership of it. In addition it cannot subsequently relinquish ownership. [2].

A managed closure cannot reference a mutable owned box.

4.3 Owned Closure Environment Capture

An owned closure is stored in the Owned heap and hence it can potentially exist for as long as the program in which it was created.

4.3.1 Local Values

An owned closure can reference an immutable value in a local variable because it can safely be copied.

An owned closure cannot reference a mutable value in a local variable even if it does not modify it.

4.3.2 Managed Boxes

An owned closure cannot reference immutable or mutable managed boxes both because it can potentially outlive them and because it can potentially move to another Task in which case the Managed heap of the Task in which it was created is no longer accessible.

4.3.3 Owned Boxes

As in the managed closure case, an owned closure can reference an immutable owned box but if it does so then it necessarily takes ownership of it. As in the managed closure case, it cannot subsequently relinquish ownership.

As in the managed closure case owned closure cannot reference a mutable owned box.

5.0 Closure Type Compatibility

Closures which differ only in their allocation type are not type compatible with one exception.

It is possible to use either a managed or owned closure in place of a stack closure if their parameters and return type are identical.

For example, if you define the function apply like this

    fn apply(f: &fn(x: uint, y: uint) -> uint, a: uint, b: uint) -> uint
    {
        f(a, b)
    }

you can do this

    fn main()
    {
        let s_plus: &fn(x: uint, y: uint) -> uint = |x, y| x + y ;
        let m_plus: @fn(x: uint, y: uint) -> uint = |x, y| x + y ;
        let o_plus: ~fn(x: uint, y: uint) -> uint = |x, y| x + y ;
    
        apply(s_plus, 1, 2);
        apply(m_plus, 3, 4);
        apply(o_plus, 5, 6);
    }

6.0 Closures And Function Compatibility

The name of a statically defined function can be used in place of any type of closure which has the same parameter and return types.

If the function plus is defined like this

    fn plus(x: uint, y: uint) -> uint
    {
        x + y
    }

then all of the following are valid.

    let s_plus: &fn(: uint, y: uint) -> uint = plus;

    let m_plus: @fn(: uint, y: uint) -> uint = plus;
    
    let o_plus: ~fn(: uint, y: uint) -> uint = plus;

Notes

  1. Rust, C++ and Java all use the term lambda expression to denote an anonymous function which may seem a little puzzling unless you know that in Lisp, which was the first language to support them, an anonymous function is identified using the symbol lambda

    Why lambda ? It is a reference to Alonzo Church’s system of formal computation, the lambda calculus, where the Greek character λ (small letter lambda) denotes an anonymous function.

  2. An exercise for the reader. Why Not ?


Copyright (c) 2013 By Simon Lewis. All Rights Reserved.

Unauthorized use and/or duplication of this material without express and written permission from this blog’s author and owner Simon Lewis is strictly prohibited.

Excerpts and links may be used, provided that full and clear credit is given to Simon Lewis and justanapplication.wordpress.com with appropriate and specific direction to the original content.

Advertisements

2 Comments »

  1. […] To make it easier to show what is going in the code the two closures passed to the listen function are replaced by functions. […]

    Pingback by Programming With Rust — Part Ten: More Fun With Sockets | Just An Application — June 17, 2013 @ 7:16 am


RSS feed for comments on this post. TrackBack URI

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Blog at WordPress.com.

%d bloggers like this: