Just An Application

June 14, 2013

Programming With Rust — Part Seven: Things We Need To Know – Strings And String Literals

1.0 The String Type

A Rust string is an immutable sequence of UTF-8 encoded Unicode characters of type str.

2.0 String Literals

A Rust string literal is a sequence of Unicode characters delimited by double-quotes.

As in other programming languages, to include a double-quote ('"') in a string literal you must escape it using a backslash ('\').

3.0 Memory Allocation

A Rust string can be allocated in the managed heap of a Task

    let cod: @str = @"cod";

or the owned heap

    let dab: ~str = ~"dab";

but not on the stack.

A string literal that is not explicitly allocated in a managed heap or in the owned heap is allocated in static memory.

You can obtain a borrowed pointer to a string literal allocated in static memory like this

    let eel: &str = "eel";
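For anyone following along with a present-day Rust toolchain, here is a rough sketch of the same three cases. It rests on assumptions that go beyond Rust 0.6: the `@` and `~` sigils no longer exist, so I have used `Rc<str>` (reference counted rather than garbage collected) to stand in for a managed string, `Box<str>` for an owned string, and `&'static str` for a borrowed pointer to a literal in static memory. The function names are hypothetical.

```rust
use std::rc::Rc;

// Takes a borrowed string slice, wherever the string itself lives.
fn describe(fish: &str) -> String {
    format!("{} ({} bytes of UTF-8)", fish, fish.len())
}

// Roughly `@"cod"`: a shared, reference-counted string on the heap.
fn managed() -> Rc<str> {
    Rc::from("cod")
}

// Roughly `~"dab"`: a uniquely owned string on the heap.
fn owned() -> Box<str> {
    Box::from("dab")
}

// A literal held in static memory, reached through a borrowed pointer.
fn literal() -> &'static str {
    "eel"
}
```

Calling `describe(&managed())`, `describe(&owned())` and `describe(literal())` all work because each pointer type can be borrowed as a `&str`.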

Copyright (c) 2013 By Simon Lewis. All Rights Reserved.

Unauthorized use and/or duplication of this material without express and written permission from this blog’s author and owner Simon Lewis is strictly prohibited.

Excerpts and links may be used, provided that full and clear credit is given to Simon Lewis and justanapplication.wordpress.com with appropriate and specific direction to the original content.


June 13, 2013

Programming With Rust — Part Six: Things We Need To Know – Lambda Expressions And Closures

1.0 Lambda Expressions

Rust supports lambda expressions.

Depending on your point of view this either makes Rust dangerously fashionable (after all, both C++ and Java are in the process of acquiring them) or, if like me you learnt to program in Lisp a very long time ago, reassuringly old-fashioned. [1]

A lambda expression is an expression which defines an anonymous function.

The result of evaluating a lambda expression is a value which can be invoked as a function.

This value can be stored or passed as an argument to a function.

The general form of a lambda expression is a parameter list, delimited using the ‘|‘ character, followed by a return type, followed by an expression.

For example,

   | x: uint, y: uint | -> uint x + y

The return type can be omitted if it is possible for the compiler to infer it. In this case it can, so we can do this.

   | x: uint, y: uint | x + y

The types of the parameters can also be omitted if it is possible for the compiler to infer them.
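As a point of comparison, here is a minimal sketch of the same two forms in current Rust syntax, which is an assumption on my part since everything above describes Rust 0.6. The `uint` type is now spelled differently, so I have used `u32`, and a lambda with an explicit return type now needs braces around its body. The function name `sums` is hypothetical.

```rust
// Evaluates one fully annotated lambda and one that relies on inference.
fn sums() -> (u32, u32) {
    // Parameter and return types written out in full.
    let annotated = |x: u32, y: u32| -> u32 { x + y };

    // The same lambda with every type left to the compiler, which works
    // them out from the call further down.
    let inferred = |x, y| x + y;

    (annotated(1, 2), inferred(3u32, 4u32))
}
```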

2.0 Closures

The expression which comprises the body of the function defined by a lambda expression can potentially reference variables lexically visible at the point the lambda expression is defined.

Once the lambda expression has been evaluated the resulting function can be invoked at a point where its original environment is no longer directly accessible. This implies that the runtime representation cannot simply be the code of the function itself. It must also include some representation of those variables referred to by the function it represents.

This representation is called a closure, so called because it is the result of closing over the environment in which the lambda expression was evaluated.

As you might expect given the Rust runtime memory model there are three distinct types of closure,

  • stack

  • managed

  • owned

corresponding to the three types of read/write memory in which they can be allocated.

3.0 Closure Types

A closure type is declared by specifying where it is stored followed by the parameter list followed by the return type.

For example, this

    &fn(x: uint, y: uint) -> uint

declares a stack closure which takes two arguments of type uint and returns a value of type uint.

This,

    @fn(x: uint, y: uint) -> uint

declares a managed closure which takes two arguments of type uint and returns a value of type uint.

This,

    ~fn(x: uint, y: uint) -> uint

declares an owned closure which takes two arguments of type uint and returns a value of type uint.

It is the declared type of a closure which determines where the closure corresponding to the evaluation of a lambda expression is allocated.

This

    let plus: &fn(x: uint, y:uint) -> uint = | x: uint, y: uint | -> uint x + y ;

will result in the creation of a stack closure.

Note that the presence of the type declaration enables the compiler to infer the parameter and return types so it is possible to write the lambda expression above rather more gnomically like this

    let plus: &fn(x: uint, y:uint) -> uint = | x, y | x + y ;

which is nice.

This

    let plus: @fn(x: uint, y:uint) -> uint = | x, y | x + y ;

will result in the creation of a managed closure.

This

    let plus: ~fn(x: uint, y:uint) -> uint = | x, y | x + y ;

will result in the creation of an owned closure.

If the type of a closure is not explicitly declared then it defaults to being a stack closure.
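The three closure types above have no direct counterpart in current Rust, so the following is only a sketch of the nearest equivalents as I understand them: a borrowed `&dyn Fn` for a stack closure, an `Rc<dyn Fn>` (reference counted, not garbage collected) for a managed closure, and a `Box<dyn Fn>` for an owned closure. The function name `call_all` and the captured `offset` are hypothetical.

```rust
use std::rc::Rc;

// Puts the same adder behind three kinds of pointer and calls each one.
fn call_all(offset: u32) -> [u32; 3] {
    // Closes over `offset`, so the value is more than just code.
    // It captures only a u32, so the closure itself can be copied.
    let add = move |x: u32, y: u32| x + y + offset;

    // Borrowed: cannot outlive the closure value it points at.
    let borrowed: &dyn Fn(u32, u32) -> u32 = &add;
    // Shared and reference counted: stands in for a managed closure.
    let shared: Rc<dyn Fn(u32, u32) -> u32> = Rc::new(add);
    // Uniquely owned: stands in for an owned closure.
    let owned: Box<dyn Fn(u32, u32) -> u32> = Box::new(add);

    [borrowed(1, 2), shared(3, 4), owned(5, 6)]
}
```

As in Rust 0.6 it is the declared type, not the lambda expression, that decides where the closure ends up.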

4.0 Closure Lifetimes And Environment Capture

Each type of closure has a different lifetime determined by where it is allocated.

The lifetime of a closure type affects both what it can close over and how it does so.

The basic principle is that a closure cannot contain a reference to something that potentially has a shorter lifetime than the closure itself.

4.1 Stack Closure Environment Capture

4.1.1 Stack Closure Environment Capture In Theory

A stack closure is stored in a stack frame and hence it can only exist for the lifetime of the function in which it is created.

This would seem to imply that a stack closure resulting from the evaluation of a lambda expression should be able to access any local variable which has been defined before the lambda expression itself, because as long as the stack closure exists then the local variables exist so it is safe to access them.

If this is true then the implementation of a stack closure simply needs a list of the addresses of all the eligible local variables. They can then be accessed directly as required.

So is it true ?

Let us consider things on a case-by-case basis.

4.1.1.1 Local Values

For local variables holding values it is true. They are valid as long as the stack closure is valid so they can be accessed directly.

4.1.1.2 Managed Boxes

For local variables holding managed pointers it is also true.

A local variable holding a managed pointer is valid for as long as the stack closure is valid.

As long as the managed pointer held by the local variable is valid the referenced managed box is valid.

There is one caveat. If the local variable holding the managed pointer is itself mutable then the stack closure could end up accessing different managed boxes depending on when and/or how many times it is invoked, which may or may not be a problem.

4.1.1.3 Owned Boxes

It is not true for local variables holding owned pointers because there is a situation where a local variable holding an owned pointer can become invalid, specifically when the owned pointer it holds is assigned to another local variable or passed as an argument to a function.

For example, given this enum type

    enum Marlin
    {
        BlackMarlin,
        BlueMarlin,
        StripedMarlin,
        WhiteMarlin
    }

and this function definition

    fn catch(marlin: &Marlin)
    {
        ...
    }

then if you did this

    ...

    let marlin: ~Marlin = ~StripedMarlin;
    let angler: &fn()   = || { catch(marlin); };
    let caught: ~Marlin = marlin;
  
    angler();
    
    ...

then at the point the stack closure is invoked the local variable marlin is no longer valid. If we assume that the catch function accesses its argument in some way then it will be attempting to dereference whatever is now in the local variable marlin which is probably not a pointer to an owned box.

4.1.2 Stack Closure Environment Capture In Rust 0.6

So how does stack closure environment capture work in Rust 0.6 ?

It works exactly as I have described above, up to and including the owned box problem.

The owned box example above will compile and if the catch function does something with its argument like try to print it out the resulting program will crash with a segmentation violation.

This is a known problem.

See here for the bug I raised.
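For contrast, here is a sketch of how the same situation plays out in current Rust, assuming `Box<Marlin>` as the modern spelling of `~Marlin`. To the best of my knowledge a current compiler rejects the move while the closure still holds its borrow, so the version below performs the move only after the closure's last use. The function `angle` and the return value of `catch` are hypothetical additions so that there is something to observe.

```rust
#[derive(Debug, PartialEq)]
enum Marlin {
    BlackMarlin,
    BlueMarlin,
    StripedMarlin,
    WhiteMarlin,
}

// Stand-in for the catch function: it simply inspects its argument.
fn catch(marlin: &Marlin) -> String {
    format!("{:?}", marlin)
}

fn angle() -> (String, Box<Marlin>) {
    let marlin: Box<Marlin> = Box::new(Marlin::StripedMarlin);
    // The closure borrows `marlin` for as long as the closure is in use.
    let angler = || catch(&marlin);

    // Moving `marlin` before this call would not compile, because the
    // closure's borrow is still live at this point.
    let report = angler();

    // After the last use of the closure the move is allowed.
    let caught: Box<Marlin> = marlin;
    (report, caught)
}
```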

4.2 Managed Closure Environment Capture

A managed closure is stored in the Managed heap and hence it can potentially exist for as long as the Task in which it was created.

4.2.1 Local Values

A managed closure can reference an immutable value in a local variable because it can safely be copied.

A managed closure cannot reference a mutable value in a local variable even if it does not modify it.

4.2.2 Managed Boxes

A managed closure can reference both immutable and mutable managed boxes because it can create additional managed pointers which keep the referenced managed boxes from being garbage collected.

4.2.3 Owned Boxes

A managed closure can reference an immutable owned box but if it does so it necessarily takes ownership of it. In addition it cannot subsequently relinquish ownership. [2]

A managed closure cannot reference a mutable owned box.

4.3 Owned Closure Environment Capture

An owned closure is stored in the Owned heap and hence it can potentially exist for as long as the program in which it was created.

4.3.1 Local Values

An owned closure can reference an immutable value in a local variable because it can safely be copied.

An owned closure cannot reference a mutable value in a local variable even if it does not modify it.

4.3.2 Managed Boxes

An owned closure cannot reference immutable or mutable managed boxes both because it can potentially outlive them and because it can potentially move to another Task in which case the Managed heap of the Task in which it was created is no longer accessible.

4.3.3 Owned Boxes

As in the managed closure case, an owned closure can reference an immutable owned box but if it does so then it necessarily takes ownership of it. As in the managed closure case, it cannot subsequently relinquish ownership.

As in the managed closure case, an owned closure cannot reference a mutable owned box.

5.0 Closure Type Compatibility

Closures which differ only in their allocation type are not type compatible with one exception.

It is possible to use either a managed or owned closure in place of a stack closure if their parameters and return type are identical.

For example, if you define the function apply like this

    fn apply(f: &fn(x: uint, y: uint) -> uint, a: uint, b: uint) -> uint
    {
        f(a, b)
    }

you can do this

    fn main()
    {
        let s_plus: &fn(x: uint, y: uint) -> uint = |x, y| x + y ;
        let m_plus: @fn(x: uint, y: uint) -> uint = |x, y| x + y ;
        let o_plus: ~fn(x: uint, y: uint) -> uint = |x, y| x + y ;
    
        apply(s_plus, 1, 2);
        apply(m_plus, 3, 4);
        apply(o_plus, 5, 6);
    }
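A rough equivalent in current Rust, with the usual caveat that the pointer types are my own mapping rather than anything from Rust 0.6: `apply` takes a borrowed `&dyn Fn`, and a closure held in an `Rc` or a `Box` is lent to it by borrowing the contents. The function name `totals` is hypothetical.

```rust
use std::rc::Rc;

// Accepts any closure through a borrowed pointer, like the &fn parameter.
fn apply(f: &dyn Fn(u32, u32) -> u32, a: u32, b: u32) -> u32 {
    f(a, b)
}

fn totals() -> (u32, u32, u32) {
    let s_plus = |x: u32, y: u32| x + y;
    let m_plus: Rc<dyn Fn(u32, u32) -> u32> = Rc::new(|x: u32, y: u32| x + y);
    let o_plus: Box<dyn Fn(u32, u32) -> u32> = Box::new(|x: u32, y: u32| x + y);

    // Each closure is only borrowed for the duration of the call.
    (
        apply(&s_plus, 1, 2),
        apply(&*m_plus, 3, 4),
        apply(&*o_plus, 5, 6),
    )
}
```

The explicit `&*` makes the borrow visible, where Rust 0.6 performed it automatically.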

6.0 Closures And Function Compatibility

The name of a statically defined function can be used in place of any type of closure which has the same parameter and return types.

If the function plus is defined like this

    fn plus(x: uint, y: uint) -> uint
    {
        x + y
    }

then all of the following are valid.

    let s_plus: &fn(x: uint, y: uint) -> uint = plus;

    let m_plus: @fn(x: uint, y: uint) -> uint = plus;

    let o_plus: ~fn(x: uint, y: uint) -> uint = plus;
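The same idea survives in current Rust, where a named function can be used wherever a closure with a matching signature is expected. The sketch below assumes `u32` in place of `uint` and uses a plain function pointer, a borrowed `&dyn Fn` and a `Box<dyn Fn>` as stand-ins; the function name `via_pointers` is hypothetical.

```rust
fn plus(x: u32, y: u32) -> u32 {
    x + y
}

// Uses the statically defined function in place of three kinds of closure.
fn via_pointers() -> (u32, u32, u32) {
    // A bare function pointer: no environment, just code.
    let as_fn_pointer: fn(u32, u32) -> u32 = plus;
    // Borrowed as if it were a closure.
    let as_borrowed: &dyn Fn(u32, u32) -> u32 = &plus;
    // Boxed as if it were an owned closure.
    let as_boxed: Box<dyn Fn(u32, u32) -> u32> = Box::new(plus);

    (as_fn_pointer(1, 2), as_borrowed(3, 4), as_boxed(5, 6))
}
```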

Notes

  1. Rust, C++ and Java all use the term lambda expression to denote an anonymous function which may seem a little puzzling unless you know that in Lisp, which was the first language to support them, an anonymous function is identified using the symbol lambda.

    Why lambda ? It is a reference to Alonzo Church’s system of formal computation, the lambda calculus, where the Greek character λ (small letter lambda) denotes an anonymous function.

  2. An exercise for the reader. Why Not ?



Programming With Rust — Part Five: Things We Need To Know – Tasks And Memory

1.0 Tasks

The unit of concurrency in Rust is the Task.

Code executing in different Tasks runs concurrently.

To this extent a Task is analogous to a Thread in a language such as Java but with one very important difference.

Rust Tasks are memory independent as well as execution independent.

Code executing in separate Tasks cannot interact via shared memory.

This turns out to have a number of interesting ramifications.

2.0 Memory Access

Rust code executing in a Task has read/write access to three distinct areas of memory,

  • the current stack frame

  • a Managed heap belonging to the current Task

  • a global heap called the Owned heap

as well as read-only access to an area of static memory which is global and immutable.

2.1 The Current Stack Frame: Local Variables

Local variables can be allocated in the current stack frame like this

    let foo: uint = 5;

This creates an immutable local variable foo with the value 5.

A mutable local variable can be allocated like this

    let mut bar: uint = 5;

Local variables can hold simple discrete values such as integers as well as more complicated aggregates of values such as enum variants

    let white: Colour = RGB(255, 255, 255);

In this respect Rust is like C++ where fixed-size arrays or instances of structs or classes as well as integers, floats, etc. can be stored directly in a stack frame, and unlike Java where arrays and object instances can only ever be stored in the heap.

2.2 The Heaps

A Note On Terminology

I am not at all convinced that I fully understand the terminology used to describe the Rust memory model when it comes to heaps, so what follows is my own attempt at a consistent and hopefully accurate terminology.

A box is a piece of memory which has been allocated in a heap.

A pointer is a reference to a box which can be used to access its contents.

More specifically

  • a managed box is a box in a Managed heap

  • a managed pointer is a pointer to a managed box

  • an owned box is a box in the Owned heap

  • an owned pointer is a pointer to an owned box

2.2.1 The Managed Heap

Each Task has its own Managed heap which can only be accessed by code executing in that Task.

Each managed box may be referenced by multiple pointers.

If at some point there are no longer any pointers to a managed box it becomes eligible to be garbage collected.

The garbage collection of a managed box will occur at some time between the point at which it becomes eligible to be garbage collected and the point at which the Task which owns the Managed heap ends.

To allocate a uint in the managed heap of the current Task you can do this

    let cod: @uint = @5;

In this example both the local variable cod and the managed box are immutable.

You can allocate a mutable managed box like this

    let dab: @mut uint = @mut 5;

The local variable dab is immutable but the managed box is mutable.

You can obviously allocate things other than unsigned integers in a Managed heap. For example,

    let eel: @Colour = @RGB(0, 0, 0);

2.2.2 The Owned Heap

There is a single Owned heap which can be accessed by code executing in any Task.

An owned box can only be referenced by a single pointer.

An owned box is garbage collected at the point, if any, that it is no longer referenced by a pointer.

To allocate a uint in the Owned heap you can do this

    let bream: ~uint = ~5;

In this example both the local variable bream and the owned box are immutable.

You can allocate a mutable owned box like this

    let mut chub: ~uint = ~5;

And an example of allocating something other than an unsigned integer in the Owned heap.

    let dace: ~Colour = ~RGB(0, 0, 0);

2.3 Static Memory

A program’s static memory holds the values of items processed at compile time.

2.4 Enforcing The Semantics Of Managed And Owned Boxes And Pointers

The semantics of managed and owned boxes and pointers are enforced at compilation time.

2.4.1 Type Safety

Managed pointers and owned pointers are distinct types and they are not interchangeable, i.e. their types include where they are pointing as well as what they are pointing at.

For example you cannot do this

    let rudd: @uint = ~5;

nor this

    let scad: ~uint = @5;

2.4.2 Assignment Of Owned Pointers

Because there can only ever be one owned pointer to an owned box, if you do this

    let hake = ~17;

and at some point you then do this

    let goby = hake;

then from that point on the local variable hake is no longer usable.
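The same rule holds in current Rust, where, as far as I can tell, `Box` is the closest thing to an owned pointer. A minimal sketch, with a hypothetical function name:

```rust
// Moves an owned box from one variable to another and reads it back.
fn move_between_variables() -> i32 {
    let hake = Box::new(17);
    let goby = hake; // ownership moves here

    // Mentioning `hake` from this point on would be a compile-time error.
    *goby
}
```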

2.4.3 Passing Owned Pointers As Arguments To Functions

If you pass an owned pointer to a function then ownership is transferred to that function.

For example, given this enum type

    enum FreshwaterFish
    {
        Loach,
        Perch,
        Roach,
        Tench,
    }

and this function definition

    fn catch(fish: ~FreshwaterFish)
    {
        ...
    }

if you do this

    let perch : ~FreshwaterFish = ~Perch;
    
    catch(perch);

then after the call to catch the local variable perch is no longer usable.

2.4.4 Returning Owned Pointers From Functions

If you return an owned pointer from a function then ownership is transferred to the caller of the function.

For example, given this enum type again

    enum FreshwaterFish
    {
        Loach,
        Perch,
        Roach,
        Tench,
    }

and this function definition

    fn catch_and_return(fish: ~FreshwaterFish) -> ~FreshwaterFish
    {
        fish
    }

then the net effect of this

    let tench: ~FreshwaterFish = ~Tench;

    ...
    
    let fish:  ~FreshwaterFish = catch_and_return(tench);

is to transfer ownership of the owned box from the local variable tench to the local variable fish.
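Both kinds of transfer can be sketched in current Rust, assuming `Box<FreshwaterFish>` in place of `~FreshwaterFish`. The body of `catch` was elided above, so the `String` it returns here is purely my own addition to make the effect visible.

```rust
#[derive(Debug, PartialEq)]
enum FreshwaterFish {
    Loach,
    Perch,
    Roach,
    Tench,
}

// Takes ownership of the box, which is freed when `fish` goes out of scope.
fn catch(fish: Box<FreshwaterFish>) -> String {
    format!("caught {:?}", fish)
}

// Takes ownership and hands it straight back to the caller.
fn catch_and_return(fish: Box<FreshwaterFish>) -> Box<FreshwaterFish> {
    fish
}
```

After `catch(perch)` the variable `perch` can no longer be used, whereas `let fish = catch_and_return(tench);` leaves the box owned by `fish`.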

3.0 Borrowed Pointers

You can borrow a pointer to a piece of memory irrespective of its location so long as the lifetime of the memory pointed to is guaranteed to be longer than the lifetime of the borrowed pointer.

This constraint is enforced by the compiler. If it cannot prove that the constraint is true then the code will not compile.

A borrowed pointer is declared using an ampersand (&).

For example,

    fn set_background(colour: &Colour)
    {
        ...
    }

The function set_background takes a borrowed pointer to a value of type Colour.

Because you can obtain a borrowed pointer to a value irrespective of its location the set_background function can be passed a Colour value that is stored in the current stack frame, or in the Managed heap of the current Task, or in the Owned heap, like so

    ...

    let s_white:  Colour = RGB(255, 255, 255);
    let m_white: @Colour = @RGB(255, 255, 255);
    let o_white: ~Colour = ~RGB(255, 255, 255);

    set_background(&s_white);
    set_background(m_white);
    set_background(o_white);

    ...

Note that to obtain a pointer to a local variable you use the & operator. The compiler will automatically create a borrowed pointer for a managed or owned box.
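Here is a sketch of the same pattern in current Rust. It assumes `Rc<Colour>` and `Box<Colour>` as stand-ins for the managed and owned boxes, uses a cut-down `Colour` with a single variant, and replaces `set_background` with a hypothetical `brightness` function so that there is a result to look at. Note that in current Rust the `&` is written explicitly in all three cases.

```rust
use std::rc::Rc;

// A cut-down version of the Colour type with a single variant.
enum Colour {
    RGB(u32, u32, u32),
}

// Takes a borrowed pointer and does not care where the value lives.
fn brightness(colour: &Colour) -> u32 {
    match colour {
        Colour::RGB(r, g, b) => (r + g + b) / 3,
    }
}

fn brightness_everywhere() -> (u32, u32, u32) {
    // In the current stack frame.
    let s_white: Colour = Colour::RGB(255, 255, 255);
    // In a shared, reference-counted heap allocation.
    let m_white: Rc<Colour> = Rc::new(Colour::RGB(255, 255, 255));
    // In a uniquely owned heap allocation.
    let o_white: Box<Colour> = Box::new(Colour::RGB(255, 255, 255));

    (
        brightness(&s_white),
        brightness(&m_white),
        brightness(&o_white),
    )
}
```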



June 11, 2013

Programming With Rust — Part Four: Things We Need To Know – Function Definitions Revisited: Items, Expressions And Statements

1.0 Items

Technically a function definition is a function-item (fn_item).

Similarly a module definition is a module item (mod_item).

A name binding is one kind of view item (view_item).

An enum type definition is an enum item (enum_item).

What all these things have in common, and what makes them items, is that it is possible to process them at compile time and if necessary store the results in a program’s read-only memory.

2.0 Function Bodies: Expressions And Statements

The body of a function is a block.

A block is a sequence of statements followed by an optional expression, delimited by braces (‘{‘ … ‘}’). [1]

An expression is something which when evaluated at run-time produces a value.

A statement is something which when evaluated at run-time does not produce a value.

Terminating an expression with a semi-colon causes the value of the expression to be ignored, that is, it becomes a statement, specifically an expression-statement. [2]

A block is itself an expression. If it ends with an expression then its value is the value of that expression. If it ends with a statement then its value is the singleton instance of the unit type, that is, nothing. [3]
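A small sketch of those two cases, written in current Rust syntax on the assumption that block semantics work as I have described them; the function name `block_values` is hypothetical.

```rust
// The value of each block is its final expression, or () if there is none.
fn block_values() -> (u32, ()) {
    let with_expression = {
        let x = 2;
        x * 21 // no semi-colon: this is the value of the block
    };

    let with_statement = {
        let x = 2;
        x * 21; // semi-colon: the value is discarded, leaving ()
    };

    (with_expression, with_statement)
}
```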

3.0 Function Return Types

If a function does not return a value then the return type can be omitted from the definition.

    fn nonplussed(x: uint, y: uint)

Alternatively the return type can be explicitly declared as the unit type.

    fn nonplussed(x: uint, y: uint) -> ()

Or to put it another way in Rust a function always returns something even if that something is nothing.

4.0 Function Return Values

Because the body of a function is a block, and because a block is an expression, then, by default, the return value of a function is the result of evaluating the block which comprises the body of the function.

For example,

    fn plus(x: uint, y: uint) -> uint
    {
        x + y
    }

Note the absence of a semi-colon (‘;’). If you inadvertently add one then the expression becomes an expression-statement and the function will no longer compile as there is no value to return.

Note that the converse is also true. You cannot return a value from a function which is declared not to return a value.

For example this will not compile

    fn nonplussed(x: uint, y: uint)
    {
        x + y
    }

whereas this will

    fn nonplussed(x: uint, y: uint)
    {
        x + y;
    }

5.0 Explicitly Returning A Value From A Function

It is also possible to return a value explicitly using a return expression.

For example

    fn plus(x: uint, y: uint) -> uint
    {
        return x + y
    }

In this case the return is superfluous but there are situations in which it can be useful.
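One such situation is leaving a function early from inside a loop. The sketch below is in current Rust syntax and the function `first_zero` is a hypothetical example of my own.

```rust
// Returns the position of the first zero, or the length if there is none.
fn first_zero(values: &[u32]) -> usize {
    for (index, value) in values.iter().enumerate() {
        if *value == 0 {
            return index; // leave the function from inside the loop
        }
    }
    values.len() // the value of the block when no early return happened
}
```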

Notes

[1] This is my definition of a block as currently there does not appear to be a formal definition in the Rust documentation.

[2] Empirically this does not seem to be true for all possible expressions.

[3] This is my interpretation of the semantics of a block. The Rust documentation does specify that a block is an expression but does not explicitly specify its semantics.



Programming With Rust — Part Three: Things We Need To Know – Modules

1.0 Modules

A Rust module defines both a namespace and an access-control boundary.

In this example

    mod fish
    {
        enum FlatFish
        {
            Dab,
            Halibut,
            Flounder,
            Sole,
            Turbot
        }
    }

the type FlatFish could be referenced from within the fish module using the identifier FlatFish.

However it is not visible outside the module.

2.0 Access Control

To make it visible it is necessary to make it public like this.

    mod fish
    {
        pub enum FlatFish
        {
            Dab,
            Halibut,
            Flounder,
            Sole,
            Turbot
        }
    }

The FlatFish type can now be referenced from other modules.

3.0 Paths

Now that the FlatFish type is public it can be referenced from outside the fish module using its path, that is, its fully qualified name, which is

    fish::FlatFish

4.0 Name Binding

Alternatively the name FlatFish can be bound locally in another module like this

    use fish::FlatFish;

Now the local name FlatFish can be used to reference the name FlatFish in the fish module.

5.0 Nested Modules

Modules can be nested.

For example

    mod underwater_creatures
    {
        mod fish
        {
            pub enum FlatFish
            {
                Dab,
                Halibut,
                Flounder,
                Sole,
                Turbot
            }
        }
    }

The path of the FlatFish type is, as you might expect

    underwater_creatures::fish::FlatFish

and to bind the name FlatFish locally you do this

    use underwater_creatures::fish::FlatFish;

6.0 Binding Module Names

You can also bind the name of a module locally, but visibility constraints apply to modules as well.

Given the example above you cannot do this

    use underwater_creatures::fish;

because the fish module is not public.

If the fish module is made public

    mod underwater_creatures
    {
        pub mod fish
        {
            pub enum FlatFish
            {
                Dab,
                Halibut,
                Flounder,
                Sole,
                Turbot
            }
        }
    }

then you can do this

    use underwater_creatures::fish;

and you can then refer to the FlatFish type using the path

    fish::FlatFish

This is useful if you want to use a prefix to qualify a name either to avoid name clashes and/or to identify where the name is from, but you do not want to use its path because of its length.
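Nested modules, `pub` and `use` still look much the same in current Rust, so the following sketch should compile with a present-day toolchain, although I have not checked it against Rust 0.6. The two functions are hypothetical and exist only to show the short and the full path side by side.

```rust
mod underwater_creatures {
    pub mod fish {
        #[derive(Debug, PartialEq)]
        pub enum FlatFish {
            Dab,
            Halibut,
            Flounder,
            Sole,
            Turbot,
        }
    }
}

// Bind the module name locally so that the short path works.
use underwater_creatures::fish;

// Refers to the type through the locally bound module name.
fn pick() -> fish::FlatFish {
    fish::FlatFish::Sole
}

// Refers to the same type through its full path.
fn pick_by_full_path() -> underwater_creatures::fish::FlatFish {
    underwater_creatures::fish::FlatFish::Sole
}
```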

7.0 Aliasing

A name can be bound to an alias like this

    use ff = fish::FlatFish;

Now the local name ff can be used to reference the name FlatFish in the fish module.

8.0 Access Control And Name Bindings

By default name bindings are not visible outside the module in which they occur, but they can be made public in the same way as enum types and modules.

The effect of doing this is that the local name is now visible outside the module and it in turn can now be referenced using its path and bound to a local name in another module and so on.

For example, if we do this

    mod underwater_creatures
    {
        pub use underwater_creatures::fish::FlatFish;
    
        mod fish
        {
            pub enum FlatFish
            {
                Dab,
                Flounder,
                Halibut,
                Sole,
                Turbot
            }
        }
    }

then you can use the path

   underwater_creatures::FlatFish

to reference the FlatFish type.

Alternatively you can bind the name locally like this

    use underwater_creatures::FlatFish;

and the local name FlatFish will reference the FlatFish type.

Using aliasing in conjunction with a public name binding it is possible to re-name things in very confusing ways if you choose to do so.

For example,

    mod underwater_creatures
    {
        pub use FlatFish = underwater_creatures::fish::OtherFish;
    
        mod fish
        {
            pub enum FlatFish
            {
                Dab,
                Flounder,
                Halibut,
                Sole,
                Turbot
            }
        
            pub enum OtherFish
            {
                Perch,
                Roach,
                Tench
            }
        }
    }

Now the path

   underwater_creatures::FlatFish

references the OtherFish type in the fish module.
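The same confusing re-naming can still be done in current Rust, although the alias syntax has changed: as far as I know it is now written `use path as name` rather than `use name = path`. A minimal sketch under that assumption, with a hypothetical function to show the effect:

```rust
mod underwater_creatures {
    // Re-export OtherFish under the public name FlatFish.
    pub use self::fish::OtherFish as FlatFish;

    mod fish {
        #[derive(Debug, PartialEq)]
        pub enum OtherFish {
            Perch,
            Roach,
            Tench,
        }
    }
}

// The path says FlatFish but the value is an OtherFish.
fn misleading() -> underwater_creatures::FlatFish {
    underwater_creatures::FlatFish::Perch
}
```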



June 10, 2013

Programming With Rust — Part Two: Things We Need To Know – Some Types

Filed under: Programming Languages, Rust, Rust 0.6, Rust On Mac OS X | Simon Lewis @ 9:15 pm

1.0 Some Primitive Types

1.1 The uint type

A value of type uint is an unsigned integer.

The size of a uint, that is the number of bits used to represent one, is machine dependent, and is specified to be "equal to the number of bits required to hold any memory address on the target machine".

No prizes for guessing what else these might be being used for in the implementation.
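In current Rust the pointer-sized unsigned integer is called `usize` rather than `uint`, and the relationship with addresses can be checked directly. A small sketch under that assumption, with hypothetical function names:

```rust
use std::mem::size_of;

// Size of the pointer-sized unsigned integer type, in bits.
fn usize_bits() -> usize {
    size_of::<usize>() * 8
}

// Size of a raw pointer on the target machine, in bits.
fn pointer_bits() -> usize {
    size_of::<*const u8>() * 8
}
```

The two functions return the same number on whatever machine the code is compiled for.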

1.2 The unit Type

The unit type is a type with a single zero-size value.

Both the type and the value are specified like this

    ()

2.0 Enums

A Rust enum type defines a union type.

For example, in graphics programming there are any number of different representations of a colour.

You can use an enum type to define a Colour type in terms of a number of different representation types like this

    enum Colour
    {
        CYMK(uint, uint, uint, uint),
        HSB (uint, uint, uint),
        RGB (uint, uint, uint),
    }

Given this definition a value of type Colour can be a value of either

  • the CYMK type, or

  • the HSB type, or

  • the RGB type

A value of a variant type defined by an enum type can be constructed by using a function with the same name as the variant type.

For example,

    CYMK(0, 0, 0, 0)

or

    HSB(39, 0, 100)

or

    RGB(255, 255, 255)

These functions are defined automatically as part of the enum type definition.
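To illustrate, here is a sketch in current Rust, where the variant names have to be qualified with the name of the enum and `u32` stands in for `uint`. The functions `white` and `describe` are hypothetical: the first uses a variant name as an ordinary function value, the second shows how a value of the union type is taken apart again.

```rust
#[derive(Debug, PartialEq)]
enum Colour {
    CYMK(u32, u32, u32, u32),
    HSB(u32, u32, u32),
    RGB(u32, u32, u32),
}

// The variant name can be used as a function that builds a Colour.
fn white() -> Colour {
    let make: fn(u32, u32, u32) -> Colour = Colour::RGB;
    make(255, 255, 255)
}

// Reports which representation a colour uses and how many components
// it carries.
fn describe(colour: &Colour) -> (&'static str, usize) {
    match colour {
        Colour::CYMK(..) => ("CYMK", 4),
        Colour::HSB(..) => ("HSB", 3),
        Colour::RGB(..) => ("RGB", 3),
    }
}
```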

When used in this way it is evident that a Rust enum type is not the same as a C++ or a Java enum type.

However when defining a Rust enum type it is possible to specify variant types which have no associated data, like this

    enum PrimaryColour
    {
        Red,
        Green,
        Blue
    }

This form of Rust enum type is equivalent to a C++ enum type.

3.0 Generic Types

Rust supports generic types using the '<...>' syntax familiar from C++ and latterly Java.

For example, you could, if you really wanted to, define a generic version of the Colour enum type above like this

    enum Colour<T>
    {
        CYMK(T, T, T, T),
        HSB (T, T, T),
        RGB (T, T, T),
    }

The instantiated type

    Colour<uint>

would then be equivalent to the original non-generic definition of the Colour type.
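A sketch of the generic version at work, in current Rust syntax. The function `components` is a hypothetical helper, and the `Copy` bound on `T` is my own choice to keep the example short.

```rust
#[derive(Debug, PartialEq)]
enum Colour<T> {
    CYMK(T, T, T, T),
    HSB(T, T, T),
    RGB(T, T, T),
}

// Works for any copyable component type, returning the components in order.
fn components<T: Copy>(colour: &Colour<T>) -> Vec<T> {
    match *colour {
        Colour::CYMK(c, y, m, k) => vec![c, y, m, k],
        Colour::HSB(h, s, b) => vec![h, s, b],
        Colour::RGB(r, g, b) => vec![r, g, b],
    }
}
```

The same function then serves `Colour<u32>`, `Colour<f32>` and so on without being written again.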



Programming With Rust — Part One: So Where Are The Sockets ?

To get some kind of a feel for a new programming language I like to try and use it to write some sort of ‘real-world’-ish program.

After toying with various ideas I decided to try and use Rust to write a very simple HTTP server.

Note

All the code that follows was compiled and run using the Mac OS X version of Rust 0.6

How I built Rust 0.6 for Mac OS X is described here.

1.0 So Where Are the Sockets ?

The first thing you need to implement an HTTP server is some sockets, well at least one socket.

Rummaging around in the documentation we find the following here (re-formatted for clarity)


    fn listen(
           host_ip:         ip::IpAddr,
           port:            uint,
           backlog:         uint,
           iotask:          &IoTask,
           on_establish_cb: ~fn(SharedChan<Option<TcpErrData>>),
           new_connect_cb:  ~fn(TcpNewConnection, SharedChan<Option<TcpErrData>>))

       ->  result::Result<(), TcpListenErrData>

So what does all that mean ?

To start with it helps to know how functions are defined in Rust.

2.0 Function Definitions

A function definition in Rust starts with the keyword

    fn

This is followed by the name of the function, then a list of the parameters the function takes, then the return type, if any, then the body of the function.

Slightly more formally

    functiondef : "fn" ident parameters returntype? body

    parameters  : '(' parameter [',' parameter]* ')' | '()'
    
    parameter   : ident ':' type
    
    returntype  : '->' type

Note

The definition of a parameter is a bit more complicated than that but it will do for now.

3.0 The listen Function Revisited

Given that we now know how functions are defined what can we deduce about the listen function without having to resort to reading the documentation ?

Not a great deal.

It takes six arguments and returns a value. That is about it.

Everything else is pretty much pure conjecture.

The double colon in the type ip::IpAddr might have something to do with namespaces ?

uints might be unsigned integers ?

The ampersand in &IoTask might have something to do with references, but does Rust even have references ?

Unfortunately compilers are usually not big on conjectures, so there is nothing for it, we are going to have to find stuff out.

4.0 What Do We Need To Know About In Order To Use The listen Function ?

It turns out that to actually use the listen function it is first necessary to know something about all of the following aspects of Rust

  • some primitive types

  • enum types

  • generic types

  • modules

  • items, expressions, and statements

  • tasks

  • the Rust memory model

  • lambda expressions and closures

For me this is the benefit of trying to do something ‘real-world’-ish when learning a new programming language. Usually you really do have to read the documentation and get to grips with what it says in order to get anything at all to happen !

Update: 27.07.2014

The socket functions as described above no longer exist in Rust.

See here for some discussion of the equivalent functionality in the current version of Rust.

