Rust - A collection of intervals

2023-01-09

Previously published

This article was previously published on len-learns-rust.com. A full index of these articles can be found here.

Now that I have a simple interval I need a collection of them to represent the available ids that we can allocate. I’m going to create a new file for this new struct, intervals.rs and move the code from last time into interval.rs. I realise that I could leave it all in lib.rs but that doesn’t feel right to me and for now I’ll live with the extra complexity produced by the modules created by the different files that I’m using.

The collection of intervals will use a std::collections::BTreeSet<> to store the intervals and we will use std::format::Display on the intervals, and interval, to allow the tests to easily display the contents of the various objects under test. Being able to do dump the collection as a string, like this:

assert_eq!(intervals.dump(), "[1,3], [5,10]");

makes it so much easier to see what’s going on in a test.

The first cut of the collection code looks a bit like this:

use std::collections::BTreeSet;
use std::fmt;

use crate::interval::Interval;

pub struct Intervals {
    intervals: BTreeSet<Interval>,
}

impl Intervals {
    pub fn new() -> Self {
        Intervals {
            intervals: BTreeSet::new(),
        }
    }

    pub fn is_empty(&self) -> bool {
        self.intervals.is_empty()
    }

    pub fn insert_interval(&mut self, lower: u8, upper: u8) -> bool {
        let interval = Interval::new(lower, upper);

        self.insert(interval)
    }

    pub fn insert_value(&mut self, value: u8) -> bool {
        let interval = Interval::new_single_value_interval(value);

        self.insert(interval)
    }

    pub fn dump(&self) -> String {
        format!("{}", self)
    }

    fn insert(&mut self, interval: Interval) -> bool {
        self.intervals.insert(interval);

        true
    }
}

impl fmt::Display for Intervals {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        let mut first = true;

        for interval in self.intervals.iter() {
            if !first {
                write!(f, ", ")?;
            }

            write!(f, "{}", interval)?;

            if first {
                first = false;
            }
        }
        Ok(())
    }
}

and to implement fmt::Display for Intervals I also need to implement it for the Interval, which is easy:

impl fmt::Display for Interval {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        if self.lower == self.upper {
            write!(f, "[{}]", self.lower)
        } else {
            write!(f, "[{},{}]", self.lower, self.upper)
        }
    }
}

Compiler says "no!"

Compiling the code at this point gives some errors. The Rust compiler helpfully suggests some solutions:

error[E0277]: the trait bound `interval::Interval: Ord` is not satisfied
   --> src\intervals.rs:38:24
    |
38  |         self.intervals.insert(interval);
    |                        ^^^^^^ the trait `Ord` is not implemented for `interval::Interval`
    |
note: required by a bound in `BTreeSet::<T, A>::insert`
   --> rustlib/src/rust\library\alloc\src\collections\btree\set.rs:912:12
    |
912 |         T: Ord,
    |            ^^^ required by this bound in `BTreeSet::<T, A>::insert`
help: consider annotating `interval::Interval` with `#[derive(Ord)]`
   --> |src\interval.rs:2:1
    |
2   | #[derive(Ord)]
    |

Not surprisingly, to be able to store the Interval in the set we need some way of comparing one interval to another and working out which comes first. We can use the ‘default version’ of Ord¹ by doing as the compiler suggests:

#[derive(Ord)]
pub struct Interval {
    lower: u8,
    upper: u8,
}

and now the compiler points out that we also need other missing functionality. After we add Ord we need PartialOrd and then Eq and then PartialEq until finally we have this

#[derive(Ord, PartialOrd, Eq, PartialEq)]
pub struct Interval {
    lower: u8,
    upper: u8,
}

and the code compiles. Unfortunately, the default implementation of Ord doesn’t quite do what we want but we’ll need some unit tests to see what’s wrong. I wont show the whole failing test here, but the issue is that, given a collection that contains the following intervals [4,9], [4,10], [5,10], [5,12], [10,12], [12,20] when we add [9] we end up with [4,9], [4,10], [5,10], [5,12], [9], [10,12], [12,20] rather than [4,9], [9], [4,10], [5,10], [5,12], [10,12], [12,20]. Of course this is a contrived test example and is only possible because of the simple implementation that allows us to insert any set of intervals without merging contiguous intervals together. However, the underlying issue here is that the default implementation of Ord produces a lexicographic ordering based on the top-to-bottom declaration order of the struct’s members. This means that we’re based on lower and then upper. Switching the order of struct variables around doesn’t help at all here and so we actually have to write our own implementation of Ord to give us what we want.

impl Ord for Interval {
    fn cmp(&self, other: &Self) -> Ordering {
        let lower_is = self.lower.cmp(&other.lower);

        let upper_is = self.upper.cmp(&other.upper);

        if lower_is == Ordering::Less {
            return Ordering::Less;
        }

        if upper_is == Ordering::Greater {
            return Ordering::Greater;
        }

        if upper_is == Ordering::Less {
            return Ordering::Less;
        }

        if lower_is == Ordering::Greater {
            return Ordering::Greater;
        }

        Ordering::Equal
    }
}

and then only derive the rest of the requirements as they will use our version of Ord.

#[derive(PartialOrd, Eq, PartialEq)]
pub struct Interval {
    lower: u8,
    upper: u8,
}

This ordering works for us and the resulting sequences of intervals is much more sane once we begin to merge contiguous intervals².

One final thing. I’ve added a new binary crate with a usage example in it. This allows me to play with the public interface to the library crate, rather than accidentally using the private interface.

Join in

The code can be found here on GitHub each step on the journey will have one or more separate directories of code, so this article’s code is here this allows for easy comparison of changes at each stage.

Of course, there may be a better way; leave comments if you’d like to help me learn.

std::cmp::Ord ↩︎
I may revisit this later ↩︎