I’ve known about the new dynamic keyword in C# 4 for about a year now but really haven’t thought much about it. It’s supposed to be syntactic sugar for dealing with things like COM interop and objects from other DLR languages. This can be done today with the existing reflection library, but it is tedious to deal with method name strings and the chain of calls needed to get to the method invocation. Variables declared as dynamic bypass static type checking.
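To make the trade-off concrete, here is a minimal sketch of the same late-bound call written both ways. The class and method names (DynamicDemo, UpperViaReflection, UpperViaDynamic) are made up for illustration, and calling ToUpper on a string is only a stand-in for a real COM or DLR object.

```csharp
using System;
using System.Reflection;

public static class DynamicDemo
{
    // Reflection: the method name is a string and the result comes back as object.
    public static string UpperViaReflection(object target)
    {
        MethodInfo method = target.GetType().GetMethod("ToUpper", Type.EmptyTypes);
        return (string)method.Invoke(target, null);
    }

    // dynamic: the member lookup is deferred to run time, so the call reads like normal code.
    public static string UpperViaDynamic(dynamic target)
    {
        return target.ToUpper();
    }

    public static void Main()
    {
        Console.WriteLine(UpperViaReflection("hello"));
        Console.WriteLine(UpperViaDynamic("hello"));

        try
        {
            // Compiles without complaint, but int has no ToUpper method.
            UpperViaDynamic(42);
        }
        catch (Microsoft.CSharp.RuntimeBinder.RuntimeBinderException ex)
        {
            Console.WriteLine("Run-time failure: " + ex.Message);
        }
    }
}
```

The last call is the worrying part: the compiler accepts it, and the mistake only shows up when that line actually runs.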

I came to the realization today that this new keyword might well trigger the downfall of western civilization. Not only can this keyword be used for local variables, but also for method parameters and return values. If you have experience working in Corporate America, you know that there are plenty of developers out there who will abuse this keyword to no end. Can you imagine a ginormous code base of several thousand source files littered with dynamic!!! Methods that return dynamic!!! Methods that take dynamic!!! Does this not frighten anybody? I’m so scared. Am I alone? Can we stop this?

Pipelining can be a useful operation when you need to break up code into several steps, perhaps for readability. Typically this is done to avoid a huge mess of nested functions: f(g(h(i(j(k(l(x))))))). Without pipelining you typically need to assign the various steps to local variables. You can get pipelining in C# with this generic extension method, which applies to every type:

public static TResult Pipe<T, TResult>(this T obj, Func<T, TResult> f)
{
   return f(obj);
}

Example calculating standard deviation with and without pipelining:


List<double> values = new List<double>() { 1, 7, 8, 9, 10, 100, 1000, 1001, 100000 };

double average = values.Average();
double totalVariance = 0;
foreach (double value in values)
{
   totalVariance += Math.Pow(value - average, 2);
}

//OR you could do this:
//totalVariance = values.Aggregate(0.0, (variance, val) => variance + Math.Pow(val - average, 2));

double stdDeviation = Math.Sqrt(totalVariance / values.Count);

//Now with pipe
stdDeviation = values
   .Pipe(v => v.Average())
   .Pipe(avg => values.Aggregate(0.0, (variance, val) => variance + Math.Pow(val - avg, 2)))
   .Pipe(totVariance => Math.Sqrt(totVariance / values.Count));

Even int gets Pipe():

(2)
   .Pipe(i => Math.Pow(i, 42))
   .Pipe(i42 => Math.Sin(i42));

The benefit, as far as I’m concerned, is avoiding unnecessary mutable variables in the function scope (or at least keeping them from leaking out to where they don’t need to be).

You could imagine my excitement when I read the following on the Google Closure page:

“It also checks syntax, variable references, and types, and warns about common JavaScript pitfalls. These checks and optimizations help you write apps that are less buggy and easier to maintain.”

But then I tried the compiler and it thinks this code is dandy:

// ==ClosureCompiler==
// @compilation_level ADVANCED_OPTIMIZATIONS
// ==/ClosureCompiler==

function upper(s)
{
   return s.toUpperCase();
}

function add(i)
{
   return i + 1;
}

var i = add(new Array());

var s = upper(new Array());

alert(s);
alert(i);

Dynamic languages are evil. They cannot be tamed by mankind.

If you’ve used LINQ and lambdas, I’m sure you’ve come across the occasional function that requires an implementation of IEqualityComparer or IComparer. You were hoping to write a little lambda predicate but NOOOOOO, now you have to create a new type and implement an interface. Well no longer. I’m hoping these are the last two classes, in the history of mankind, that implement IComparer and IEqualityComparer:

public class Comparer<T> : IComparer<T>
{
   private Func<T, T, int> _compareFn;

   public Comparer(Func<T, T, int> fn)
   {
      _compareFn = fn;
   }

   public int Compare(T x, T y)
   {
      return _compareFn(x, y);
   }

}

public class EqualityComparer<T> : IEqualityComparer<T>
{
   private Func<T, T, bool> _equalsFn;
   private Func<T, int> _getHashCodefn;

   public EqualityComparer(Func<T, T, bool> equalsFn, Func<T, int> getHashCodefn)
   {
      _equalsFn = equalsFn;
      _getHashCodefn = getHashCodefn;
   }

   public bool Equals(T x, T y)
   {
      return _equalsFn(x, y);
   }

   public int GetHashCode(T obj)
   {
      return _getHashCodefn(obj);
   }
}

2 Examples:

List<int> l = new List<int> { 1, 2, 5, 7, 999, 234, 4 };
l.Sort(new Comparer<int>((x, y) => x < y ? -1 : x == y ? 0 : 1));

Dictionary<int, string> d = new Dictionary<int, string>() { { 1, "a" }, { 2, "a" }, { 3, "b" } };

var d2 = d.Distinct(new EqualityComparer<KeyValuePair<int, string>>(
   (kvp1, kvp2) => kvp1.Value == kvp2.Value, kvp => kvp.Value.GetHashCode()));

It’s been one year since we decided to switch to Git from Visual Source Safe. We’ve had up to 6 developers using Git with our .NET Visual Studio solution, which consists of 31 projects and 3000+ source files (C#, F#, ascx, etc.). Git is an amazing piece of software. If you are not using it, stop what you are doing, download it, and start learning it. I’ve used Perforce, MKS, Mercurial, and CVS/SVN, but it would be very sad for me to go back to using something other than Git.

Before I get more into Git, I want to get one thing straight: Source Safe is not source control software. At best, it’s a fat random byte generation layer on top of the file system or network stack. Calling it source control is the “biggest scam perpetrated on the American public since One Hour Martinizing”.

Git does not have the notion of a central repository built into it; it is distributed. You work directly on a repository. Usually this is your own local repository. You commit changes locally. You have all your history and branches locally. Branches can be shared across repositories by pushing and pulling. Git comes with its own network protocol for doing this, but other protocols, like SSH, can be used as well. By starting with this distributed architecture, the very notion of having your own repository means you are working in your own separate branch (a local branch can track a branch in a remote repository). The result: branching is EASY. It’s built in. It’s part of everything you do.

Just because it’s distributed doesn’t mean you can’t work with a repository that is considered the “central” or master repo. This is what we do. We all cloned from the master repo. Our master branch is “tracked” to the master branch in the master repo. We work locally, commit, and push our changes to the central repo (you might have to pull before doing a push if you’ve changed files that someone else has). We create other branches and push them to the master repo so that they are available to other developers when they do a pull. Merging is done on pulls automatically. Everything just works.

I’m not going to give you a course on Git here; there are other good sources for that. But I will give you some starter tips:

  • Get the msys Git for windows. Cygwin is evil.
  • Get git extensions. It adds git functionality to file explorer and Visual Studio.
  • Git, by default, considers all the bytes within all the files in the file system tree starting at the top level repository directory part of the repository. Any bytes in that file tree are fair game. This behavior can be changed. See below.
  • Don’t think of the repositories as files and directories. Think of a big byte blob. Changes that happen are not happening in files, they are happening somewhere in this big ball of bytes.
  • Tags, branches, and commits are all the same thing: references to some point in the repository. Branches have a different workflow.
  • For visual studio projects, we use a .gitignore file with these contents:
    *.pdb
    *.dll
    *.exe
    *.suo
    *.scc
    *.vspscc
    *.user
    *.cache
    *.log
    *.rdl.data
    bin/
    obj/
  • For merging:
    1. Download p4merge.
    2. Create a batch file called p4diff.bat with the following contents and put it in the c:\windows dir:

    p4merge %2 %5

    3. Add the following to the end of your .gitconfig file under c:\Documents and Settings\Username
    [mergetool "p4merge"]
    cmd = p4merge $BASE $REMOTE $LOCAL $MERGED
    [merge]
    tool = p4merge
    [diff]
    external = p4diff.bat

    Now, when you call git mergetool p4merge will be used.

Don’t let the name scare you. A monad is just something that represents one or more computations, a workflow. The need for monads came up in pure functional programming languages like Haskell where, given a set of inputs, a function should return the same value every time. But what happens when you have a function that takes a filename and line number and returns a byte array of the data on that line in the file? Might it not return a different byte array between subsequent calls if the file changed? The solution: instead of executing the operations to read the file and return the data, the function creates a workflow (monad) that will do that. So what comes back from the function is a workflow, something that represents the operations to read from the file. In typical pragmatic form, Microsoft decided to use a better name for this in F#: Workflows. The stuff in the Workflow is called the computation expression. These names make sense.

It turns out workflows are useful for other things like building Sequence Expressions and Async Workflows. The advantage is syntactic sugar for what is usually a gobbledygook of disparate pieces of code to accomplish the same thing. You can create your own workflows by creating a type with a couple of important methods: Bind, Delay, and Return. I really recommend watching Luca Bolognese’s presentation on F#. It’s worth watching for the section where he discusses Async Workflows. Pretty amazing stuff.
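C# has no computation expression syntax, so the following is only a rough sketch of what Bind and Return do, not how F# implements workflows. Everything in it (Maybe, Nothing, ParseInt, SafeDivide, DivideStrings) is a hypothetical name invented for the example, and Delay is left out entirely.

```csharp
using System;

// Hypothetical "maybe" workflow: a computation that may or may not produce a value.
public sealed class Maybe<T>
{
    public bool HasValue { get; private set; }
    public T Value { get; private set; }

    private Maybe() { }

    // Return: wrap a plain value in the workflow.
    public static Maybe<T> Return(T value)
    {
        return new Maybe<T> { HasValue = true, Value = value };
    }

    // The "no value" case; every later step is skipped once this shows up.
    public static Maybe<T> Nothing()
    {
        return new Maybe<T> { HasValue = false };
    }

    // Bind: run the next step only if this step produced a value.
    public Maybe<TResult> Bind<TResult>(Func<T, Maybe<TResult>> next)
    {
        return HasValue ? next(Value) : Maybe<TResult>.Nothing();
    }
}

public static class MaybeDemo
{
    public static Maybe<int> ParseInt(string s)
    {
        int n;
        return int.TryParse(s, out n) ? Maybe<int>.Return(n) : Maybe<int>.Nothing();
    }

    public static Maybe<int> SafeDivide(int a, int b)
    {
        return b == 0 ? Maybe<int>.Nothing() : Maybe<int>.Return(a / b);
    }

    // Three steps chained with Bind; no step needs its own "did that fail?" check.
    public static Maybe<int> DivideStrings(string a, string b)
    {
        return ParseInt(a).Bind(x => ParseInt(b).Bind(y => SafeDivide(x, y)));
    }
}
```

The nested lambdas in DivideStrings are the gobbledygook that the F# syntax hides: inside a workflow block each Bind becomes a let! line.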

If you come from the Lisp world, F# Sequence Expressions will seem familiar to you. In Clojure and Lisp they are called List Comprehensions but are essentially a specialized Monad/Workflow.

Are you tired of doing the if(dict.ContainsKey(key))… pattern? Extend Dictionary:

public static R ValueOrSomethingElse<K, V, R>(this Dictionary<K, V> Col, K Key, Func<V, R> Transform, Func<R> SomethingElse)
{
   // TryGetValue does a single lookup instead of ContainsKey followed by the indexer.
   V value;
   if (Col.TryGetValue(Key, out value))
      return Transform(value);
   else
      return SomethingElse();
}

Example usage:

Dictionary<string, DateTime> BDays = new Dictionary<string, DateTime>();
...
BDays.ValueOrSomethingElse("Mark", d => d.ToString("MM/dd/yy"), () => "Unknown BirthDate");

The second parameter is a lambda for transforming the value if the key is found. The third parameter is a lambda to execute and return if the key is not found. I made the third parameter (the “something else”) a lambda in case it is an expensive operation.

Over the years, I had heard much about Lisp. Articles I had read, or colleagues who knew it, would espouse the benefits of learning it or using it. Promised results ranged from making you a better imperative programmer to the highest intellectual enlightenment that could be achieved. So I dove into Lisp. Except it wasn’t diving. Well it was like diving if the body of water you are diving into is covered with 18 inches of ice, and you need to use a sledge hammer to make a hole to dive into. And then your mind and body react to diving into close to freezing water. If you come from a strictly imperative programming background, learning Lisp and/or Functional Programming can be a battle. I’m pretty sure it’s easier to learn Lisp and FP if you don’t know anything about programming.

So I started reading bits and pieces about Lisp, but without actually coding in it, I really didn’t get it. I needed to code. But what were the options? Islands. Islands with libraries that promised this and that but still little islands. Then I met Rich Hickey and his language Clojure. Clojure is a much improved dialect of Lisp that compiles down to JVM or CLR byte code. Continents. Continents with lots of libraries and documentation. I could now easily connect to SQL Server and do some real world stuff. Yes, Clojure is a dynamic language, and I’ve mentioned I prefer static languages, but Clojure is filled with awesome.

Although Clojure, like other Lisps, is a functional programming language, Lisps have a unique property that other common FP languages don’t have: homoiconicity. That’s fancy talk for saying the code and data have the same shape. Data structures and code are represented the same way: s-expressions. This is what drives the powerful Lisp macro system. The macro system is what makes the language changeable. If there is a feature you want that’s not in the language, you can add it. This point was really driven home to me when I learned about the F# pipeline operator: |> . I thought, see, this is a nice feature that Clojure doesn’t have. Until I realized that someone had built a pipeline macro for Clojure. This is from core.clj:

(defmacro ->
  "Threads the expr through the forms. Inserts x as the
  second item in the first form, making a list of it if it is not a
  list already. If there are more forms, inserts the first form as the
  second item in second form, etc."
  ([x form] (if (seq? form)
              `(~(first form) ~x ~@(next form))
              (list form x)))
  ([x form & more] `(-> (-> ~x ~form) ~@more)))

If F# didn’t have |> there would be nothing you could do [Update: That is not true. As Brian pointed out in the comments, F# allows you to declare infix operators.]. Don’t get me wrong, I’m a HUGE fan of F#, it just doesn’t have a macro system. But it is statically typed :)

You need a lazy sequence of something:

public static IEnumerable<T> Range<T>(T Start, T End, Func<T, T> inc) where T : IComparable<T>
{
   T result = Start;
   while (result.CompareTo(End) <= 0)
   {
      yield return result;
      result = inc(result);
   }
}

Need a sequence of integers up to a million:

Funcs.Range(0, 1000000, (j => j + 1));

Need an infinite list of uppercase letters:

Funcs.Range('A', '[', (c => c >= 'Z' ? 'A' : (char)(c + 1)));
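Since that last sequence never ends, it is only usable because Range is lazy. Here is a small sketch of consuming it safely with Take. The Range method is repeated so the snippet compiles on its own; RangeDemo and FirstLetters are made-up names for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class Funcs
{
    // Same Range as above, repeated here so the snippet stands alone.
    public static IEnumerable<T> Range<T>(T Start, T End, Func<T, T> inc) where T : IComparable<T>
    {
        T result = Start;
        while (result.CompareTo(End) <= 0)
        {
            yield return result;
            result = inc(result);
        }
    }
}

public static class RangeDemo
{
    // Take stops pulling after count items, so the endless letter cycle is never fully enumerated.
    public static string FirstLetters(int count)
    {
        IEnumerable<char> letters = Funcs.Range('A', '[', c => c >= 'Z' ? 'A' : (char)(c + 1));
        return new string(letters.Take(count).ToArray());
    }

    public static void Main()
    {
        Console.WriteLine(FirstLetters(30)); // the alphabet, then ABCD again
    }
}
```

Calling ToList() or Count() directly on the infinite sequence would never return, so always bound it with something like Take or TakeWhile first.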