List Grouping module released

14/08/2009

EDIT: Don’t use this package, but use instead Data.List.Split by Brent Yorgey. I didn’t see that a package like his existed! This module will hopefully be removed from hackage if they can do that.

I just finished the initial release of a simple module called list-grouping that contains functions to partition a list into sub-lists in various ways, based on some predicate or integer offset. Functions like these are a little awkward to write and I was surprised when I didn’t see anything on hackage!

Check out the package description and install it with:

$ cabal install list-grouping

Here is an example from a previous post to build a binary tree from an in-order list, which uses the above library:


module Main
    where

import Data.List.Grouping

data Tree a = Node a (Tree a) (Tree a)
            | End
             deriving Show

fromSorted :: [a] -> Tree a
fromSorted = foldl mkTree End . splitWith [2^| n<-[0..]]
    where mkTree l (n:ns) = Node n l (fromSorted ns)

I’m sure the functions can be made more efficient, to take advantage of fusion or what-not, and I hope the library will eventually contain the most efficient implementations possible.

I also am looking for suggestions for other useful list grouping functions to include. Send your suggestions along! You can get the darcs source with:

$ darcs get http://coder.bsimmons.name/code/ListGrouping/

2 Comments

Run-length Encoding

26/05/2009

Just a quick implementation of a RLE algorithm for lists in haskell. We compress a list by converting “runs” of consecutive elements into a tuple of the form: (run_length, element).


import Data.List (group)
import Control.Arrow


runLengthEnc :: Eq a => [a] -> [(Int,a)]
runLengthEnc = map (length &&& head. group

decode :: [(Int, a)] -> [a]
decode = concatMap (uncurry replicate)

If the &&& combinator looks foreign to you, check out David R. Maciver’s very enlightening blog about Arrow functions.

I’m always curious to see how naive-looking functions like the above compare in performance to a from-scratch implementation with explicit recursion, so I came up with the following:


runLengthEnc' :: Eq a => [a] -> [(Int,a)]
runLengthEnc' [] = []
runLengthEnc' (a:as) = run 1 a as
    where run n x [] = [(n,x)]
          run n x (x2:xss) | x == x2   = run (n+1) x xss
                           | otherwise = (n,x) : run 1 x2 xss

I tested both functions on a list of 100,000 random 1s and 0s and found the explicit version to be only marginally better performing, completing the list in 49 ticks & 130Mb, compared to 54 ticks & 139 Mb for the one-liner!

2 Comments

Gnome Sort

11/05/2009

In my boredom, I implemented the toy sorting algorithm called Gnome Sort. It accomplishes backtracking by using a zipper-like system of two lists treated as stacks, and is about 100X slower than GHC’s merge sort on the test I used. It’s also not especially small or simple-looking. It’s basically terrible:


gnomeSort :: (Ord a)=> [a] -> [a]
gnomeSort = sort [] 
      where sort rs     []     = rs 
            sort []     (x:xs) = sort [x] xs
            sort (r:rs) (x:xs) 
                   | x > r     = sort rs (x:r:xs)          
                   | otherwise = sort (x:r:rs) xs

The above doesn’t keep track of where it left off before it began to backtrack, an obvious optimization which turned the algorithm into an insertion sort:


import Data.List (insert)

insertSort :: (Ord a)=> [a] -> [a]
insertSort = sort [] 
       where sort rs []     = rs 
             sort rs (x:xs) = sort (insert x rs) xs

about 60X slower than Data.List.sort
.

No Comments