Find a permutation given its Inversion Table

15/11/2009

Here is some code I whipped up in response to an interesting problem posted on reddit.com/r/coding/. It was a really interesting algorithm to work out:



import Data.List (sortBy)

-- takes an inversion list and converts it to the list of permuted
-- elements:
decode = dec [] . sortBy lowHi . zip [1..]  where
    dec as []         = as
    dec as ((a,_):is) = let is' = sortBy lowHi (decrementGT a is)
                         in dec (a:as) is'

-- decrement every inversion number by 1, for every tuple in which
-- the element is greater than `a`:
decrementGT a = map (\(a',i) -> (a', if a'>then i-1 else i) )

-- we sort by the inversion number, low to high. For elements with
-- the same inversion number, we say that the greater element tuple
-- should go before a lesser element:
lowHi (a,i) (a',i') 
    | i == i'   = if a>a' then LT else GT
    | otherwise = compare i i'

It’s not the prettiest code I’ve written, but it’s pretty straight-forward and concise. Thanks for looking!

No Comments

The move-to-front (MTF) Transform

10/11/2009

To follow up my last post about the Burrows-Wheeler Transform, I decided to implement another simple algorithm which is often used after the BWT to help consolidate localized redundancy in the data before entropy encoding.

The idea behind the Move-to-Front algorithm is that we start with some known alphabet (like the list of ASCII characters), and encode our data elements as the index into that alphabet list of each element. The trick is that after each character is encoded, the alphabet list is modified by moving the element whose index we just looked up to the front of the alphabet.

Here are my implementations of encode and decode:


import Data.List


mtf :: (Eq a, Bounded a, Enum a)=> [a] -> Maybe [Int]
mtf = sequence . snd . mapAccumL enc [ minBound.. ]
    where enc l x = (x:delete x l, elemIndex x l)


mtfD :: (Eq a, Bounded a, Enum a)=> [Int-> [a]
mtfD = snd . mapAccumL dec [ minBound.. ]
    where dec l i = let x = l !! i 
                     in (x:delete x l, x)

The mapAccumL function is perfect for passing the state of the dictionary list. Another point of interest to mention is the use of minBound to let the function be polymorphic over any type for which there is a defined order and lowest element.

For example, we can do this:

*Main> mtf “SIX.MIXED.PIXIES.SIFT.SIXTY.PIXIE.DUST.BOXES” >>= return . mtfDec :: Maybe String
Just “SIX.MIXED.PIXIES.SIFT.SIXTY.PIXIE.DUST.BOXES”

or we can decode to Word8 bytes, if we want to get back the ASCII character in that form, just by specifying a different return type:

*Main> :m + Data.Word
*Main Data.Word> mtf “SIX.MIXED.PIXIES.SIFT.SIXTY.PIXIE.DUST.BOXES” >>= return . mtfDec :: Maybe [Word8]
Just [83,73,88,46,77,73,88,69,68,46,80,73,88,73,69,83, 46,83,73,70,84,46,83,73,88,84,89,46,80,73,88,73,69,46, 68,85,83,84,46,66,79,88,69,83]

Let me know your thoughts!

2 Comments

Polishing a Functional Pearl: The Burrows-Wheeler Transform

7/11/2009

Here is a quick post to get me back into the swing of blogging:

I was looking through an old post on StackOverflow about clever functional code, and the best answer, given by “yairchu” was a nice version of the Burrows-Wheeler Transform, which is an algorithm for permuting a string such that it can be compressed more effectively by other algorithms. The code posted was (import Data.List assumed):


bwp :: (Ord a)=> [a] -> [a]
bwp xs = map snd $ sort $ zip (rots xs) (rrot xs)
rots xs = take (length xs) (iterate lrot xs)
lrot xs = tail xs ++ [head xs]
rrot xs = last xs : init xs

I saw I could improve/shorten this in a couple of obvious ways and came up with this:


bwp :: (Ord a) => [a] -> [a]
bwp = map snd . sort . rots 
rots xs = zip (tail $ iterate lrot xs) xs
lrot (x:xs) = xs ++ [x]

Still unsatisfied and even more obsessed I came up with this final, prettiest version, before forcing myself to give it up already:


bwp :: (Ord a)=> [a] -> [a]
bwp = map snd . sort . rots 
rots xs =  zip (lrot xs) xs
lrot = tail . tails . cycle

Unfortunately, this last version will croak if your string happens to look like “111111″ or “cAbcAb” because sort will keep trying to compare infinites lists.

Update: I did a short post on the Move To Front transform as a follow-up to this post.

1 Comment