Monday, March 30, 2009

Haskell Series Kickoff

I know many multilingual people but most only speak one language... programmers usually know multiple computer languages. It's good to know more than one programming language, but I recommend picking one from a paradigm you don't already know. There is a fair chance you have some familiarity with an imperative and/or object-oriented language such as: C, Java, C++, Pascal, C#, and many more. Until a couple years ago I was in the same boat, then I read the wonderful book The Structure and Interpretation of Computer Programs (aka the wizard book).

The wizard book uses a LISP-dialect called Scheme which is in the functional language paradigm. Another popular functional language is Haskell, this is my favorite functional language. I learned Haskell by reading A Gentle Introduction to Haskell and doing (most of) the Ninety-Nine Haskell Problems along the way. Adding these two resources with a decent amount of time proved to be a winning formula for learning Haskell.

I've always enjoyed programming with Haskell and using my new Haskell-inspired problem solving skills in other languages as well. I figured others might feel the same way so this post is the kickoff for a series on Haskell. I'll take you through the previously mentioned resources and add my own perspectives here and there.

I'll try to get a post up every week or so and I'd really appreciate your feedback so please leave comments. The first post will focus on the fundamentals of the Haskell type system.

Sunday, March 15, 2009

Using Template Specialization for Multi-format IO

I get the feeling that C++ templates have a reputation for being too difficult. That is how I felt about templates at first too. I think that everything seems difficult at first, that's because you don't understand it yet. You just need to dig in and eventually a lightbulb clicks on and it feels so natural all of the sudden because you understand it. A while I deiced to learn the STL and that is when I understood templates. Later on I even developed a pretty handy technique for doing mutli-format object serialization using templates.

The main idea is that you have a type for each format you want to support and then write a template specialization for the stream extraction (>>) and stream insertion (<<) operators for each type. Check it out:


#include <iostream>
#include <fstream>
#include <string>
using namespace std;

class oxmlstream : public ofstream {};
class ojsonstream : public ofstream {};

class contact
{
public:
contact(string name, string address)
{
_name = name;
_address = address;
}

string get_name() const {return _name;}
string get_address() const {return _address;}

private:
string _name;
string _address;
};

template<typename stream_type>
stream_type& operator<<(stream_type& stream, const contact& contact)
{
stream << contact.get_name() << ' ' << contact.get_address();
return stream;
}

template<> oxmlstream& operator<<(oxmlstream& xml, const contact& contact)
{
xml << "<contact>" << endl;
xml << " <name>" << contact.get_name() << "</name>" << endl;
xml << " <address>" << contact.get_address() << "</address>" << endl;
xml << "</contact>";

return xml;
}

template<> ojsonstream& operator<<(ojsonstream& json, const contact& contact)
{
json << "{" << endl;
json << " \"name\": \"" << contact.get_name() << "\"," << endl;
json << " \"address\": \"" << contact.get_address() << '\"' << endl;
json << "}";

return json;
}

int main(int argc, char* argv[])
{
contact dave("dave", "dave's address");
cout << dave << endl;

oxmlstream xout;
xout.open("dave.xml");
xout << dave << endl;
xout.close();

ojsonstream jout;
jout.open("dave.json");
jout << dave << endl;
jout.close();

return 0;
}


I skipped the corresponding stream extraction (input) overloads here but I think this should communicate the main idea. This keeps the input/output code outside of the class definition so you can focus on the information and behavior while working on the class. You can tackle the IO via the operator overloads after you've got the class working properly, and you are able to incrementally add support for the latest and greatest format (MGraph anyone).

Sunday, March 1, 2009

Function Curry

Curry is good spicy food and it is also a handy programming technique. You can curry a function to produce a new function that has one less parameter... that is, the arity is reduced by one. Consider the following binary function:

add x y = x + y

If you call this function with two arguments, it will evaluate to the sum of the numbers as expected. What do you think should happen when you call the function with just one argument like this:

add 1

Should this be an error or is there a meaningful way to evaluate this? It doesn't make any sense to try adding 1 to nothing so we can't do that. What if we consider the type of the result, instead of a number why not return a function? This new function should be based on the add function but somehow incorporate that argument 1 into it. Why not fix the first parameter to be 1 and the resulting function would just have a single argument... essentially we create an increment function:

increment = add 1
increment 3 -- evaluates to 4

This is how you curry a function... it's pretty straightforward with Haskell, but how could we do this in C#? Consider the following extension method:

static class FuncExtensions
{
public static Func<B, R> Curry<A, B, R>(this Func<A, B, R> f, A x)
{
return y => f(x, y);
}
}

You can apply this extension to a binary function and use resulting "curried" unary function. Now we can write the previous Haskell code in C#:

class Program
{
static void Main(string[] args)
{
Func<int, int, int> add = (x, y) => x + y;
Func<int, int> increment = add.Curry(1);

var z = increment(3); // evaluates to 4
}
}

Currying is a way to build new functions that are based on existing ones, this allows for better code reuse and supports a stratified design.