## » Secret Santas in Haskell III: Lather, Rinse, Repeat #

*Paul R. Brown*@ 2006-12-22

In Part I and Part II, I introduced the problem and observed some properties of the problem. This part exploits some of those observations and describes a straightforward but possibly (probably) inefficient solution.

### No Experimenting on People

Experimenting with `Participant`

s is awkward, and I introduced a `Marked`

class in Part I in the interest of working with easier-to-manipulate data:

class Marked a where marker :: a -> String

to that end, I'd like to use the classic combinatorial experimental subject of colored balls instead:

data Ball = Red | Blue | Green | Purple | Orange deriving (Show,Enum,Bounded,Eq) instance Marked Ball where marker b = [(head.show) b]

The following convenience functions provide either a randomly colored ball or a list of randomly colored balls:

ball :: Int -> Ball ball = toEnum random_ball :: IO Ball random_ball = do { r < newStdGen ; let (x,_) = randomR (fromEnum(minBound::Ball), fromEnum(maxBound::Ball)) r ; return (toEnum x) } random_balls :: Int -> IO [Ball] random_balls n = do { r < newStdGen ; let rs = randomRs (fromEnum(minBound::Ball), fromEnum(maxBound::Ball)) r ; return (map toEnum (take n rs)) }

For example, in `ghci`

:

Main> :load secretsantas.lhs [1 of 1] Compiling Main ( secretsantas.lhs, interpreted ) Ok, modules loaded: Main. *Main> x < random_ball Green *Main> y < random_balls 10 [Blue,Purple,Red,Purple,Blue,Purple,Green,Blue,Blue,Orange] *Main> concat (map marker y) "PPBORGOPOR"

The "`<-`

" notation at the prompt is for binding a symbol to the result of an action in the `IO`

monad. (This is explained in the `ghci`

documentation.)

### Idea of the Solution

Let's start with a raw list of marked items:

The first step is to gather together the consecutive entries with the same marker, like so:

Next, take one element from each segment and put the resulting list aside. Combine the leftovers together, eliminating any gaps and combining any adjacent groups with the same marker. Note that we're working on a circular list, so the leftmost and rightmost groups are adjacent. Continue until either all of the groups are gone or there is only a single group (all one marker) remaining. Here's a pictorial example:

In steps:

- Collect one element from each group. No leftover groups are empty, so continue.
- Collect one element from each group.
- One leftover group is empty, so eliminate it. No adjacent leftover groups are the same marker, so continue.
- Collect one element from each list.
- Two leftover groups are empty, and once they are removed, the adjacent groups have the same marker (orange), so combine them.
- Collect one element from each group.
- Two leftover groups (purple and rightmost light green) are empty, so eliminate them. No adjacent groups are the same marker, so continue.
- Collect one element from each group. No leftover groups are empty, so continue.
- Collect one element from each group. The one leftover group consists of elements all with the same marker.

With the collected groups assembled, the last group of leftovers either fits or does not, and in this case, it's works out:

If it hadn't worked out, then there would have been no solution.

### The Grouping Step in Haskell

Haskell has a well-chosen selection of list operations, so this is almost a one-liner:

segments :: Marked a => [a] -> [[a]] segments = groupBy (\ x y = (marker x == marker y))

A quick note on notation: the backslash notation (it's supposed to look like a λ) is Haskell's syntax for inline function definition, and the expression defines the segments function by currying the first argument of `groupBy`

with the function defined in-line. This isn't quite right, however, as the `marker`

of the first group and the `marker`

of the last group might be the same, so what we really want is a version that simulates grouping on a circular list:

align :: Marked a => [a] -> [a] align x = let q = marker(last x) in f(g (q,[],x)) where f (q,[],y) = y f (q,y,z) = z ++ (reverse y) g (q,y,[]) = (q,y,[]) g (q,y,z@(w:ws)) | marker w == q = g (q,w:y,ws) g (q,y,z) = (q,y,z) segments :: Marked a => [a] -> [[a]] segments = (groupBy (\x y -> (marker x) == (marker y))).align

This is both a little ugly and has the desired behavior:

*Main> y < random_balls 10 [Blue,Green,Orange,Purple,Green,Blue,Blue,Orange,Blue,Blue] *Main> segments y [[Green],[Orange],[Purple],[Green],[Blue,Blue],[Orange],[Blue,Blue,Blue]]

It also behaves correctly for the two edge cases:

*Main> let y = [Blue,Blue,Blue] *Main> align y [Blue,Blue,Blue] *Main> let y = [Purple] *Main> align y [Purple]

But what about the other edge case, `align []`

? The initial definition of `q`

looks like it should cause an exception, as `last []`

is undefined. (Actually, `[]`

isn't sufficiently specific to allow Haskell to evaluate it in `ghci`

; the right thing is to qualify it as an empty list of `Marked`

items, i.e., `[]::[Marked]`

.) We could (and probably should) specify the behavior of `align []`

, but laziness means that the definition of `q`

is never evaluated in this case:

align [] = f(g (q,[],[])) = f((q,[],[])) = []

### The Extraction and Simplification Step in Haskell

Back to the problem at hand, so long as the collected list has more than one entry, one element from each of the lists would meet the secret santa constraint because no two consecutive elements have the same `marker`

, by construction. Stripping an element from each segment gives us a valid list fragment and a new list of segments, some of which may be empty. Implementing the "remove empties and combine common" step in Haskell is equivalent to flattening the leftover lists and re-executing the `segment`

operation. Here's one function that both captures the first item in each group and prepares a new set of groups:

reap :: Marked a => [[a]] -> ([a],[[a]]) reap x = ((map head) x,(segments (concat ((map tail) x))))

And this works more or less as it should (with some line breaks added for legibility):

*Main> y < random_balls 25 [Green,Green,Green,Blue,Red,Purple,Purple,Purple,Orange,Purple, Orange,Purple,Blue,Green,Blue,Purple,Orange,Purple,Red,Blue, Blue,Green,Orange,Green,Blue] *Main> reap (segments y) ([Green,Blue,Red,Purple,Orange,Purple,Orange,Purple,Blue,Green, Blue,Purple,Orange,Purple,Red,Blue,Green,Orange,Green,Blue], [[Green,Green],[Purple,Purple],[Blue]])

We're almost ready for a recursive solution, but not quite. We need to be able to collect the good lists reaped at each pass, and this uses the observation from Part II about combining orbits together. Combination of orbits in Haskell:

compatible :: Marked a => [a] -> [a] -> Bool compatible x y = ((marker (head x)) /= (marker (last y))) && ((marker (head y)) /= (marker (last x))) combine :: Marked a => [a] -> [a] -> [a] combine [] y = y combine x [] = x combine x y | compatible x y = x ++ y combine x y = (reverse x) ++ y

And now a slightly fancier, recursive version of the `reap`

function:

reap :: Marked a => ([a],[[a]]) -> ([a],[[a]]) reap (w,[]) = (w,[]) reap (w,x) | length x == 1 = (w,x) reap (w,x) = reap (w `combine` ((map head) x), (segments (concat ((map tail) x))))

*Main> y < random_balls 25 [Green,Orange,Purple,Green,Green,Purple,Orange,Blue,Blue, Green,Red,Green,Blue,Green,Purple,Orange,Blue,Blue,Orange, Purple,Green,Purple,Green,Purple,Purple] *Main> reap ([],segments y) ([Green,Orange,Purple,Green,Purple,Orange,Blue,Green,Red, Green,Blue,Green,Purple,Orange,Blue,Orange,Purple,Green, Purple,Green,Purple], [[Green],[Blue,Blue],[Purple]]) *Main> reap it ([Green,Orange,Purple,Green,Purple,Orange,Blue,Green,Red, Green,Blue,Green,Purple,Orange,Blue,Orange,Purple,Green, Purple,Green,Purple,Green,Blue,Purple], [[Blue]])

### Intermingling the Leftover Items in Haskell

Assuming that `reap`

is run until either there are no leftovers or all of the leftovers are the same color, the next piece is a function that `Maybe`

intermingles the leftover group (one `Blue`

ball in the above example) into the list in the first coordinate. The `intermingle`

function below models the circular list as two halves, and moving one place in the circular list is modeled as moving the head of the third coordinate to the head of the second coordinate; the list of leftovers is passed in the first coordinate.

intermingle :: Marked a => ([a],[a],[a]) -> ([a],[a],[a]) intermingle ([],y@(y1:y1s),y2) = ([],reverse(y) ++ y2,[]) intermingle ([],[],y) = ([],y,[]) intermingle ((x:xs),y,[]) = ((x:xs),y,[]) intermingle (x:xs,[],y@(y1:ys)) | (marker x /= marker y1) && (marker x /= marker (last y)) = (xs,[],x:y) intermingle ((x:xs),y@(y1:y1s),(y2:y2s)) | (marker x /= marker y1) && (marker x /= marker y2) = intermingle (xs,y2:(x:y),y2s) intermingle (x,y,(y2:y2s)) = intermingle (x,y2:y,y2s)

With the example from above with the one leftover `Blue`

, `intermingle`

returns the one `Blue`

at the head of the list, as the third rule would define:

*Main> intermingle ((snd it)!!0,[],fst it) ([],[Blue,Green,Orange,Purple,Green,Purple,Orange,Blue,Green, Red,Green,Blue,Green,Purple,Orange,Blue,Orange,Purple,Green, Purple,Green,Purple,Green,Blue,Purple], [])

The `intermingle`

function also does the right thing when there is no solution:

*Main> intermingle ([Red,Red,Red],[],[Green,Green]) ([Red],[Green,Red,Green,Red],[])

### Putting it All Together

Now we can pull the pieces together into a function that solves the original problem:

result :: Marked a => ([a],[a],[a]) -> Maybe [a] result ([],y,[]) = Just y result ((x:xs),_,[]) = Nothing result (_,_,z) = error "Not expecting to see data in the third coordinate." santa_solve :: Marked a => [a] -> [a] santa_solve x = case outcome of { Nothing -> error "No solution." ; (Just w) -> w } where reaped = reap ([],segments x) leftovers = concat(snd reaped) partial_solution = fst reaped outcome = result (intermingle (leftovers, [], partial_solution))

Our <50 lines of Haskell work like a charm:

*Main> santa_solve [Red,Green,Blue,Red,Red] *** Exception: No solution. *Main> y < random_balls 300 **Main> concat ((map marker) (santa_solve y)) "OROGORPBPGRGRPBGRPGOBGORPOBGOBPGPORGBORGBOPRGBPRGBGOPOPGOGP RBRGRGBOGBPRPRPRPOGBPGPGPGORBOBGRBPBPRGBRPGBROBRBGPRPORGPRG PBGPGORGBPORBRPRPGPGORORGBPGPGOBRPGROPOROGOROBPGBGBGBOGOPOB ORBPOPORPRPRPOGOPBPOGROBOGBPOGOPGPBRPGPGBROBPBRGBOBPRBPGOGP RPOPROBPRORGORPGPBRGRPRBOROBOROPBRPRBRGORGBPGRGPOPOGBPGPOGP GPOGP"

Including the data structure and parsing code for `Participant`

s from Part I (but not including the code to generate random lists of `Ball`

s), a nicely-formatted Haskell solution weighs in at under 100 lines without sending email.

This implementation is more than fast enough for most practical purposes, but it's still *far* from optimal. On my development box, running a compiled `santa_solve`

on 10^{6} balls takes 1-2 seconds, and processing 10^{7} takes ~50 seconds. If I have time before I get distracted, I'll come back to a more efficient approach. (For what it's worth, one path to a more efficient solution would be to sort the initial list and do away with the successive flattening and grouping.)