Proposed modification to ‘array’ function in Data.Array

22/09/2009

In Data.Array the function array :: (IArray a e, Ix i) => (i, i) -> [(i, e)] -> a i e takes the array bounds and an association list, and it produces an Array. The function allows you to build an Array by providing the elements in whatever order you choose, rather than from the first index to the last.

The function listArray is similar except that it assumes that the list is already ordered by index so there is no need to provide tuples of (index, value). Becase listArray actually uses zip internally to produce the tupled list, so the programmer can pass a list to listArray which exceeds the declared bounds of the array and the function will quietly drop the extra list tail:

Prelude Data.Array> listArray (1,3) ['a','b','c',undefined]
array (1,3) [(1,'a'),(2,'b'),(3,'c')]

But when a list that exceeds the specified bound is passed to array, it dies a horrible death:

Prelude Data.Array> array (1,3) [(1,'a'),(2,'b'),(3,'c'),(4,'d')]
array *** Exception: Error in array index

I think the array function should be modified by adding a simple take to automatically drop any excess list elements:


array (l,u) ies
    = let n = safeRangeSize (l,u)
      in unsafeArray (l,u)
                     [(safeIndex (l,u) n i, e) | (i, e) <- take n ies]

This change would bring the function into better alignment with other prelude functions which expect infinite lists and drop tails as needed. It can be argued that the programmer might want to know that he is passing too long a list to his function, in which case I think the safeRangeSize function, etc. should be eliminated.

No Comments

List Grouping module released

14/08/2009

EDIT: Don’t use this package, but use instead Data.List.Split by Brent Yorgey. I didn’t see that a package like his existed! This module will hopefully be removed from hackage if they can do that.

I just finished the initial release of a simple module called list-grouping that contains functions to partition a list into sub-lists in various ways, based on some predicate or integer offset. Functions like these are a little awkward to write and I was surprised when I didn’t see anything on hackage!

Check out the package description and install it with:

$ cabal install list-grouping

Here is an example from a previous post to build a binary tree from an in-order list, which uses the above library:


module Main
    where

import Data.List.Grouping

data Tree a = Node a (Tree a) (Tree a)
            | End
             deriving Show

fromSorted :: [a] -> Tree a
fromSorted = foldl mkTree End . splitWith [2^| n<-[0..]]
    where mkTree l (n:ns) = Node n l (fromSorted ns)

I’m sure the functions can be made more efficient, to take advantage of fusion or what-not, and I hope the library will eventually contain the most efficient implementations possible.

I also am looking for suggestions for other useful list grouping functions to include. Send your suggestions along! You can get the darcs source with:

$ darcs get http://coder.bsimmons.name/code/ListGrouping/

2 Comments

directory-tree module released

9/05/2009

I’ve released my first package, up now on hackage (haddock docs should be generated soon). The module provides a simple data structure that mirrors a directory tree, and some useful functions for doing IO on directories of files. You can read more about it here.

It’s very likely there are some bugs, especially related to cross platform issues with file names and paths. The module is also fairly bare, so please send me any requests for functionality that I haven’t thought of, as well as any bugs you might find.

You can install it with:

$ cabal install directory-tree

And get the source with:

$ darcs get http://coder.bsimmons.name/code/DirectoryTree/

I hope this is useful to someone!

UPDATE: I’ve just released v0.9.0 of this package which has a number of sneaky changes and additions, most notably the inclusion of a readDirectoryWithL function that allows for lazy directory reading IO. This will let users work with a DirTree read from disk just as if it was a pure lazy data structure, while IO is done in the background as needed. No more exploding code! (hopefully)

7 Comments