Enumerating all n-integer Sets Which Sum to an Integer x

30/07/2009

I came up with what I think is an elegant use of monadic code, while working on a program to compute solutions to the Postage Stamp Problem. The problem asks:

Consider that an envelope is able to hold a limited number of postage stamps. Then, consider the set of the values of the stamps — positive integers only. Then, what is the smallest total postage which *cannot* be put onto the envelope?

The algorithm below takes a number of stamps that can fit and a total postage to apply and returns a list of all the sets of different stamp values that can be combined to form the target postage sum:


> sumLists :: Int -> Int -> [[Int]]
> sumLists = f 0 where
>     f _ 1 n = return [n]  
>     f i c n = do a  <- [i.. div n c] 
>                  as <- f (i+a) (c-1) (n-a)      
>                  return (a : as) 

Thus the potential stamp combinations for a letter which can hold three stamps and costing 5 cents to mail would be:

*Main> sumLists 3 5
[[0,0,5],[0,1,4],[0,2,3],[1,1,3],[1,2,2]]

As you can see, the algorithm also produces ordered lists naturally!

But perhaps all we care about are unique stamp values that. If we want to adjust our goal such that the algorithm returns a list of all sets of unique stamp values sufficient to create the target postage, we need only adjust the return line:


>                  return (a : dropWhile (==a) as) 

Similarly we could drop the zeros (which mean “no stamp” in our case) from the beginnings of each set if we desired. The above code have applications to many similar problems including Subset Sum, and Knapsack as well as other partitioning problems.

No Comments

Fun with Lazy Arrays: the LZ77 Algorithm

18/06/2009

This is my third post investigating compression techniques related to the DEFLATE algorithm: the first on run-length encoding, and the second on simple Huffman Coding. This post models the LZ77 algorithm, the second of the two compression strategies used by DEFLATE, and in the process explores some interesting properties of Haskell’s basic Arrays.

IMPLEMENTATION:


> module LZ77
>     where

we will use GHC's "basic non-strict arrays" for this
experiment:

> import Data.Array


and use Ints to store the length of the entire decoded
message (needed to create our array):

> type Length = Int
> type Offset = Int

in place of the length in the standard length-offset pair
I've decided to use an Int representing the index of the
last element in the span relative to the element at
offset. Thus the pair encoding a span of only one element,
two elements back would be (0,2), rather than the more
traditional (1,2).

This simply makes more sense to me, especially since I want
to be able to encoded reversed sequences, as you will see.

> type Index = Int

and a few simple dataypes for our uncompressed and
compressed data respectively:

> type Decoded a = Array Index a
>
> data Encoded a = 
>     Enc Length [ Either a (Index,Offset) ] 


The decompress function works by traversing the encoded message,
keeping track of our array index position (since offsets are
relative to the current position), and building an Array lazily
from a list which we generate, in part by referencing elements
from the partially generated array itself.

So when we see a Right value we look up in the Array the elements
referenced by the length-offset, concat-ing that list with the
result of processing the rest of the encoded message.

If we hit [] in 'dec' we call an error because the stored value
for the length of the uncompressed message in the Encoded type
was longer than what the 'decompress' function could produce.

It's diffficult to describe, but I hope the code is clear:

> decompress :: Encoded a -> Decoded a
> decompress (Enc el es) = decoded
>     where decoded = listArray (0,el - 1$ dec 0 es
>          
>           dec  _     [] = 
>               error "message is shorter than it should be" 
>
>           dec n (Left x : xs) = 
>               x : dec (n+1) xs
>          
>           dec n (Right (iRel,off) : xs) =
>               let i1 = n  - off
>                   i2 = (if iN > i1 then succ else pred) i1 
>                   iN = i1 + iRel
>                
>                in [ decoded!| i <- [i1, i2 .. iN] ] ++ 
>                   dec (n + 1 + abs iRel) xs


Some interesting things about the code above:

    1) we create an array from a list, which we build,
       in part, by looking up elements from the array
       we are in the process of building.

    2) we can compress a sequence of symbols which were
       seen previously but in reverse order, simply by
       storing a negative relative index in the
       (relative_index,offset) tuple. So, the string...
        
            "her racecar returns to race"
      
       might compress to:
            
            {her race(-5,2)turns to(4,19)}

       I'm not sure if this is useful in real compression,
       especially when it comes down to the binary encoding.

    3) even more interesting, we can use this same decoder
       function to decompress data that matches a sequence
       in parts of the array we haven't built yet! We simply
       use a negative offset in our tuple.

EXAMPLES AND CONCLUSION:


point (3) may or may not be something like the LZ78 algorithm,
which apparently works by encoding future data, but it is
defintely a cool thing to be able to do with arrays:

> coolArray = listArray (0,4
>        [0,  coolArray!4 - 3,  2,  coolArray!1 + 2,  4]


Here's an example with great compression where the relative
index exceeds the offset:

> exceedsOffset = elems $ decompress $
>       Enc 25 [Left 'B'Left 'l'Left 'a'Left 'h'Left ' ',
>               Left 'b'Right (17,5), Left '!']


...and a code example for (2) above:

> reverseReference = elems $ decompress $
>       Enc 27 [Left 'h',Left 'e',Left 'r',Left ' ',Left 'r',
>               Left 'a',Left 'c',Left 'e',Right(-5,2),Left 't',
>               Left 'u',Left 'r',Left 'n',Left 's',Left ' ',
>               Left 't',Left 'o',Right(4,19)]

...and finally an example combining (2) and (3):

> reverseLookAhead = elems $ decompress $ 
>       Enc 5 [Left 1Right (-1,-3), Left 3Left 4]


I was surprised to discover these properties in Haskell's lazy
Arrays. hope they came as a surprise to a few others.

2 Comments

Run-length Encoding

26/05/2009

Just a quick implementation of a RLE algorithm for lists in haskell. We compress a list by converting “runs” of consecutive elements into a tuple of the form: (run_length, element).


import Data.List (group)
import Control.Arrow


runLengthEnc :: Eq a => [a] -> [(Int,a)]
runLengthEnc = map (length &&& head. group

decode :: [(Int, a)] -> [a]
decode = concatMap (uncurry replicate)

If the &&& combinator looks foreign to you, check out David R. Maciver’s very enlightening blog about Arrow functions.

I’m always curious to see how naive-looking functions like the above compare in performance to a from-scratch implementation with explicit recursion, so I came up with the following:


runLengthEnc' :: Eq a => [a] -> [(Int,a)]
runLengthEnc' [] = []
runLengthEnc' (a:as) = run 1 a as
    where run n x [] = [(n,x)]
          run n x (x2:xss) | x == x2   = run (n+1) x xss
                           | otherwise = (n,x) : run 1 x2 xss

I tested both functions on a list of 100,000 random 1s and 0s and found the explicit version to be only marginally better performing, completing the list in 49 ticks & 130Mb, compared to 54 ticks & 139 Mb for the one-liner!

2 Comments