Nerdy tidbits from my life as a software engineer

Monday, July 27, 2009

The Importance Of Code Organization

If there is one thing I dislike about C#, it’s that it allows you to place definitions wherever you want.  While there is technically a similar freedom in C++, the nature of header files and visibility blocks encourages people to group, say, member variables and public methods together.  This encouragement is lost in C# because you have the freedom to scatter your member variables throughout your code files in any manner you want.

To me, this freedom promotes some bad habits which make it difficult to understand and navigate through a large program.  The reason is that if you scatter your member variables, properties, and methods around your source code in a random manner, it becomes difficult to figure out where they are.  The result is a large amount of wasted time as people hunt through your source code to find what they are looking for.

So I have one rule for anything I ever write: given the fully qualified name of a class and its members, it should be immediately obvious to anybody where in your source code that member, property or method is defined.  Nobody should need to search for it.  Just from the namespace, they should be able to deduce – within a handful of clicks – where something is defined in your file structure.

For instance, say I have an int whose fully qualified name is:

int My.Program.DataTypes.SomeData.mID

If my solution contains two projects called My.Program and My.Program.DataTypes, I should expect to see the definition for mID in the My.Program.DataTypes project in a file called SomeData.cs.  Furthermore, I should expect to see the mID member variable defined in a particular section of SomeData.cs so that there is no need to hunt through the file to find it.

Any given class file can contain large numbers of member variables, properties, methods, event handlers, interface implementations, etc.  How would a random person know where in a given file a member variable is defined if you chose to scatter these definitions randomly throughout the source file?  In order to make it immediately obvious where to go to find a given member variable, I always group them together and – optionally – surround them with #region elements.  This way, if you open SomeData.cs and want to find the ID property, you can quickly browse to it by expanding the “Public Properties” region and scrolling down until you find it.  No search box necessary – it is obvious where the property is defined.
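
As a rough sketch of what I mean, here is how SomeData.cs might be laid out.  The region names, the constructor, and the mName field with its Name property are made up for illustration; only mID, the ID property, and the My.Program.DataTypes namespace come from the example above.

```csharp
// File: SomeData.cs, in the My.Program.DataTypes project.
// Illustrative layout only: the region names and the Name/mName pair are assumptions.
namespace My.Program.DataTypes
{
    public class SomeData
    {
        #region Member Variables

        private int mID;
        private string mName; // hypothetical second field

        #endregion

        #region Constructors

        public SomeData(int id, string name)
        {
            mID = id;
            mName = name;
        }

        #endregion

        #region Public Properties

        // Read-only view of mID; lives with every other public property.
        public int ID
        {
            get { return mID; }
        }

        public string Name
        {
            get { return mName; }
            set { mName = value; }
        }

        #endregion

        #region Public Methods

        public override string ToString()
        {
            return mID + ": " + mName;
        }

        #endregion
    }
}
```

Anyone opening this file knows that a field lives in the first region and a property in the third, whatever the class happens to be.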

Why is this important?  Two reasons.  First, it makes your code more readable because things are laid out in an order that makes sense.  Second, it saves a large amount of time and overhead.  The cost of searching for something in a large program is high enough that it should be avoided.  You should simply not have to search your code in order to discover where things are defined.

This is also my principal complaint about public inner classes.  The problem with inner classes is that it is not clear where they are just from looking at their fully qualified name.  For instance, I would expect:

var myObject = new My.Program.ObjectModel.SomeObject();

to reside in a project called My.Program, within a folder called ObjectModel, in a file called SomeObject.cs.  But if SomeObject is an inner class of ObjectModel, it is not clear from looking at the name of the class where its code is defined, because it’s not obvious whether ObjectModel is a class or a namespace.  When classes are nested several levels deep, this entanglement becomes even more confusing.

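
To make the ambiguity concrete, here is a small sketch with both layouts side by side.  The LayoutA and LayoutB roots and the NestingDemo class are hypothetical names; I need two roots because a namespace and a class cannot share the same fully qualified name in one program.

```csharp
using System;

// Layout A: ObjectModel is a namespace, so SomeObject belongs in ObjectModel/SomeObject.cs.
namespace LayoutA.ObjectModel
{
    public class SomeObject { }
}

// Layout B: ObjectModel is a class, so SomeObject is buried somewhere inside ObjectModel.cs.
namespace LayoutB
{
    public class ObjectModel
    {
        public class SomeObject { }
    }
}

public static class NestingDemo
{
    public static void Main()
    {
        // At the call site the two lines have exactly the same shape.
        var a = new LayoutA.ObjectModel.SomeObject();
        var b = new LayoutB.ObjectModel.SomeObject();

        Console.WriteLine(a.GetType().IsNested); // False
        Console.WriteLine(b.GetType().IsNested); // True
        Console.WriteLine(b.GetType().FullName); // LayoutB.ObjectModel+SomeObject
    }
}
```

Only reflection (or opening the file) tells the two apart; the source text that uses them gives no hint.
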
(Inner classes also prevent you from leveraging using statements to reduce long-winded class names: a using statement for the namespace still leaves you writing the outer class name every time, which makes your code less readable because it becomes cluttered with lots of unnecessary scoping.)

It is easy to write code in an organized manner.  Simply put new source files in locations that correspond with their namespace and define your members, properties and methods in the same sections of your code files.  That’s not too hard, is it?  It is much harder to take a program that was not written this way and separate things out into locations that are logical.  And it is even harder to understand a project where everything is defined in scattered, unorganized source files.

So my advice is to do it right the first time.  Avoid future headaches from the start, make it easy for other engineers to collaborate with you, and reduce the overhead required to alter your work.  It’s a quick investment that continues paying dividends long after you’ve moved on to something else.