Agda
Safe HaskellNone
LanguageHaskell2010

Agda.Utils.VarSet

Description

Manage sets of natural numbers (de Bruijn indices).

This file contains an implementation of finite sets of integers that is optimized for storing sets of free variables. Notably, de Bruijn indices/levels are not uniformly distributed across the range of Int, and are always very small (EG: less than 64).

This makes IntSet somewhat ill-suited for the task. IntSet is designed to be able to handle values distributed across the entire range of Int, at the cost of pointer indirections and poor memory locality.

Instead, VarSet stores free variables as a bitvector. When the elements in the set are all smaller than 64, we can fit the entire set into a single unboxed Word#, which makes most of our set operations a single instruction.

When we exceed this bound, we fall back to an unboxed ByteArray#. Experimentally, this happens less than 0.0001% of the time.

Experimental Results

Agda builds instrumented with traceVarSetSize give the following statistics for VarSet sizes across the following libraries:

librarycommit hashvarsets allocatednon-word sized sets
1lab11f4672ca7dcfdccb1b4ce94ca4ca42c9b732e621181985510
agda-stdlib0e97e2ed1f999ea0d2cce2a3b2395d5a9a7bd36a389600492178
cubical1f2af52701945bef003a525f78fa41aeadb7c6ae827518050
Synopsis

VarSet type

Invariants

A variable set whose maximum entry is less than than 64 is always encoded via the VS# constructor. The final word of a VB# is always non-zero.

Performance

GHC will use pointer-tagging for datatypes with less than 4 constructors on 32 bit machines (8 on 64 bit), which means that VarSet will be laid out as a tagged pointer to a word-sized piece of data.

We also pay attention to the constructor ordering here. GHC will lay out the machine code for case statements in reverse order of the datatypes constructors. Both ARM and x86 statically predict that forward branches are not taken, so putting VB# before VS# means that we can eek out a few extra drops of performance.

Finally, many functions in this module follow the following pattern: @ union :: VarSet -> VarSet -> VarSet union (VS# w1) (VS# w2) = VS# (w1 or# w2) union vs1 vs2 = unionSlow vs1 vs2 {-# INLINE union #-}

unionSlow :: VarSet -> VarSet -> VarSet unionSlow (VS# w1) (VS# w2) = VS# (w1 or# w2) unionSlow (VS# w1) (VB# bs2) = VB# (bigNatOrWord# bs2 w1) unionSlow (VB# bs1) (VS# w2) = VB# (bigNatOrWord# bs1 w2) unionSlow (VB# bs1) (VB# bs2) = VB# (bigNatOr bs1 bs2) {-# NOINLINE unionSlow #-} @

This makes the 0.0001% of slow cases notably worse, but makes the happy path approximately 3x faster for things like inRange, as GHC is able to coalesce all of the happy paths.

data VarSet Source #

A set of de Bruijn indices/levels.

Constructors

VB# ByteArray#

A variable set whose maximum entry is greater than or equal to 64.

VS# Word#

A small variable set whose maximum entry is less than than 64.

Instances

Instances details
EmbPrj VarSet Source # 
Instance details

Defined in Agda.TypeChecking.Serialise.Instances.General

Null VarSet Source # 
Instance details

Defined in Agda.Utils.Null

Binary VarSet Source # 
Instance details

Defined in Agda.Utils.VarSet

Methods

put :: VarSet -> Put #

get :: Get VarSet #

putList :: [VarSet] -> Put #

NFData VarSet Source # 
Instance details

Defined in Agda.Utils.VarSet

Methods

rnf :: VarSet -> () #

Monoid VarSet Source #

mempty is empty.

Instance details

Defined in Agda.Utils.VarSet

Semigroup VarSet Source #

(<>) is union

Instance details

Defined in Agda.Utils.VarSet

Show VarSet Source # 
Instance details

Defined in Agda.Utils.VarSet

Eq VarSet Source # 
Instance details

Defined in Agda.Utils.VarSet

Methods

(==) :: VarSet -> VarSet -> Bool #

(/=) :: VarSet -> VarSet -> Bool #

Ord VarSet Source # 
Instance details

Defined in Agda.Utils.VarSet

Hashable VarSet Source # 
Instance details

Defined in Agda.Utils.VarSet

Methods

hashWithSalt :: Int -> VarSet -> Int #

hash :: VarSet -> Int #

IsVarSet () VarSet Source # 
Instance details

Defined in Agda.TypeChecking.Free

Methods

withVarOcc :: VarOcc' () -> VarSet -> VarSet Source #

Singleton Int VarSet Source # 
Instance details

Defined in Agda.Utils.Singleton

Methods

singleton :: Int -> VarSet Source #

Construction

empty :: VarSet Source #

The empty variable set.

singleton :: Int -> VarSet Source #

A variable set with only a single entry.

fromList :: [Int] -> VarSet Source #

Construct a set of variables from a list of indices.

fromDescList :: [Int] -> VarSet Source #

Construct a set of variables from a list of indices sorted in descending order. This can be more efficient than fromList, as we only need to perform a single to determine that elements of the list are smaller than 64.

The precondition that the variables are sorted in descending order is not checked by fromDescList.

full :: Int -> VarSet Source #

Construct a variable set that contains the indices [0..n-1].

range :: Int -> Int -> VarSet Source #

Construct a variable set that contains the indices [lo+1..hi-1].

Insertion

insert :: Int -> VarSet -> VarSet Source #

Insert an index into a variable set.

inserts :: [Int] -> VarSet -> VarSet Source #

Insert a list of variables into a VarSet.

insertsDesc :: [Int] -> VarSet -> VarSet Source #

Insert a list of variables sorted in descending order into a VarSet. This can be more efficient than inserts, as we only need to perform a single to determine that elements of the list are smaller than 64.

The precondition that the variables are sorted in descending order is not checked by insertsDesc.

Deletion

delete :: Int -> VarSet -> VarSet Source #

Delete an index from a variable set.

Queries

null :: VarSet -> Bool Source #

Is this the empty set?

member :: Int -> VarSet -> Bool Source #

Is a variable a member of a var set?

lookupMin :: VarSet -> Maybe Int Source #

Find the smallest index in the variable set.

lookupMax :: VarSet -> Maybe Int Source #

Find the largest index in the variable set.

size :: VarSet -> Int Source #

The number of entries in the variable set.

disjoint :: VarSet -> VarSet -> Bool Source #

Check if two variable sets are disjoint.

isSubsetOf :: VarSet -> VarSet -> Bool Source #

isSubsetOf vs1 vs2 determines if vs1 is a subset of vs2.

inRange :: Int -> Int -> VarSet -> Bool Source #

inRange lo hi vs determines if all the variables in vs are within the range [lo+1 .. hi-1].

Combining variable sets

union :: VarSet -> VarSet -> VarSet Source #

Union two variable sets.

unions :: Foldable f => f VarSet -> VarSet Source #

Union a collection of variable sets together.

intersection :: VarSet -> VarSet -> VarSet Source #

Take the intersection of two variable sets.

difference :: VarSet -> VarSet -> VarSet Source #

Set difference of two variable sets.

(\\) :: VarSet -> VarSet -> VarSet Source #

Infix operator for difference.

complement :: Int -> VarSet -> VarSet Source #

Take a relative complement of a variable set.

Filters

split :: Int -> VarSet -> (VarSet, VarSet) Source #

Split a variable set into elements that are strictly smaller than n and elements that are strictly larger.

Note that this is strict in both components of the pair. If you only need one half, use filterLT or filterGT.

filterLT :: Int -> VarSet -> VarSet Source #

Filter a variable set to elements that are less than to an index.

filterLE :: Int -> VarSet -> VarSet Source #

Filter a variable set to elements that are less than or equal to an index.

filterGT :: Int -> VarSet -> VarSet Source #

Filter a variable set to elements that are greater than to an index.

filterGE :: Int -> VarSet -> VarSet Source #

Filter a variable set to elements that are greater than or equal to an index.

Folds

Like most unordered containers, foldr and foldl are somewhat ambigious names for folds over a VarSet. For the sake of consistency, we adopt the same convention as IntSet, where VarSet.foldr f x = foldr f x . VarSet.toAscList VarSet.foldl f x = foldl f x . VarSet.toAscList

Unfortunately, this convention the opposite of how our sets are laid out in memory.

foldr :: (Int -> a -> a) -> a -> VarSet -> a Source #

Lazily fold over the elements of a variable set in ascending order.

foldl :: (a -> Int -> a) -> a -> VarSet -> a Source #

Lazily fold over the elements of a variable set in descending order.

This does not have the same deficiencies foldl, as we can start the fold at the back of the VarSet.

foldr' :: (Int -> a -> a) -> a -> VarSet -> a Source #

Strictly fold over the elements of a variable set in ascending order.

This does not have the same deficiencies foldr', as we can start the fold at the back of the VarSet.

foldl' :: (Int -> a -> a) -> a -> VarSet -> a Source #

Strictly fold over the elements of a variable set in descending order.

Views

minView :: VarSet -> Maybe (Int, VarSet) Source #

Retrieve the smallest index in the variable set along with the set sans that element, or Nothing if the set was empty.

maxView :: VarSet -> Maybe (Int, VarSet) Source #

Retrieve the smallest index in the variable set along with the set sans that element, or Nothing if the set was empty.

Contextual operations

strengthen :: Int -> VarSet -> VarSet Source #

Remove the first n entries from the variable set, and shift the indices of all other entries down by n.

weaken :: Int -> VarSet -> VarSet Source #

Shift all indices in the variable set up by n.

Conversions

toDescList :: VarSet -> [Int] Source #

Convert a variable set to a descending list of indices.

toAscList :: VarSet -> [Int] Source #

Convert a variable set to an ascending list of indices.