As part of my general programme of developing a distributed monadic MapReduce implementation in Haskell (as described here), I have been working slowly on the distributed infrastructure this requires, using CloudHaskell. I have now produced a very general proof of concept for distributed storage.
This introduces some interesting challenges: we want the type of the data held in the service to change between rounds of processing, which causes problems with Haskell’s static type system. I have found a very simple way around this, which is also a strong enabler for storage services backed by (say) a database engine.
So I have written a description of how this was done (together with some interesting general observations about the properties of data stores) and made the code available via Git:
- The paper Distributed Storage with CloudHaskell
- The Git repository: git://github.com/Julianporter/Distributed-Haskell.git
The paper discusses the type-related problem with message passing between distributed Haskell components at some length, to make it clear why we have to pass messages consisting of concrete types.
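To give a feel for the concrete-type constraint, here is a minimal sketch of one way a store can keep a single fixed type while the data it holds changes from round to round. It is an illustration only, not necessarily the approach taken in the paper or the repository: the names `Store`, `putAll` and `getAll` are hypothetical, there is no networking, and `String` (via `show` and `read`) stands in for whatever serialised form a real service would pass in its messages.

```haskell
module Main where

import Data.Maybe (mapMaybe)
import Text.Read (readMaybe)

-- Hypothetical store: every item is held as a String, a concrete type,
-- so the type of the store itself never changes between rounds.
newtype Store = Store [String]

-- Encode typed values at the boundary before handing them to the store.
putAll :: Show a => [a] -> Store
putAll = Store . map show

-- Decode at the boundary; items that do not parse as the requested
-- type are silently dropped (a simplification for this sketch).
getAll :: Read a => Store -> [a]
getAll (Store xs) = mapMaybe readMaybe xs

-- Round 1: the store holds plain words.
round1 :: Store
round1 = putAll ["a", "b", "a"]

-- Round 2: the same Store type now holds (word, count) pairs.
round2 :: Store
round2 = putAll [ (w, length (filter (== w) ws)) | w <- unique ws ]
  where
    ws = getAll round1 :: [String]
    -- Remove duplicates, keeping the last occurrence of each word.
    unique = foldr (\x acc -> if x `elem` acc then acc else x : acc) []

main :: IO ()
main = print (getAll round2 :: [(String, Int)])
```

Only the callers of `putAll` and `getAll` know the element type for a given round; the store sees nothing but strings. That is also why this shape suits a database-backed service, which would hold serialised values in much the same way.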