Struggles In Object Oriented Programming: January 2011

Friday, January 28, 2011

Writing Code That Writes Code

This blog references an executable and source code available for download. To download the referenced executable, Click Here

To download source, Click Here

The Pragmatic Programmer, which is a really nifty book that outlines principles for how a programmer should conduct himself (or herself), tells us to "write code that writes code."

There are several reasons one would want to have code that writes code:

1. We tend to write code in a consistent way.
2. If we don't write code in a consistent way, we should.
3. Because of 1 & 2, reducing the number of lines of code we have to physically type can increase our productivity, because it takes less physical work to create more functionality.

When I'm building a database application in C#, my business logic classes often look very similar. When considering how I would build an application that would write code for me, I started by looking at the consistencies in my code. I found the following:

1. I mostly write database applications, and business classes that represent (more or less) table definitions in my database. For example, for a CRM application that manages marketing campaigns, I might have a table called "Person" that represents people in my database; I may also have a table called "MarketingCampaign" that allows me to keep track of my marketing campaigns. I may have a link table called "MarketingRecipient" that links "Person" to "MarketingCampaign" (in other words, the MarketingRecipient table has a record for each person a marketing campaign targeted). To represent these business concepts in code, I would probably build *at least* 3 classes (probably more because I'm careful to avoid violating the Open-Closed principle). But these classes would probably be: Person, MarketingCampaign, and MarketingHistory (which is a collection of MarketingCampaign objects). I don't use ORM (object-relational mapping), because I don't like to surrender as much as it seems ORM asks us to surrender for the sake of convenience.

2. Most of the business layer classes I build have a Load() method and a Save() method, and they often look similar; however, they are not similar enough to justify leaning on inheritance.

3. I often have a need to represent business layer classes as a collection. These collections often look very similar - again, not similar enough to use inheritance (IMHO).

4. I make an effort to document (comment) every private member, public attribute, constructor, and method.

5. I often have a need to have a corresponding public attribute for every private member I have. Obviously this is not always the case.

Armed with these bits of information, I embarked on building a rinky-dink code generator that suits my needs.

So, this little application looks as follows:

As you can see, this app is quite simple, and the top input control is for "Class Name". This, as you might guess, asks you to enter a class name.

In the below example, I create a class name of "Person". Once I click the "Create" button, the panel below is activated.

At this point, I can start adding attributes to the class. I can manually type in the data type I want, or there is a set of primitive types in the dropdown.

Once I'm done adding attributes to the class, I click the "Generate Code" button in the lower right corner:

And voila - we have code. Granted, there's not a ton of functionality or code smarts this buys me, but it does do the following:

1. It does quite a bit of typing of members and attributes - this can be a big time saver when talking about 10-50 classes being created
2. It comments for me. Maybe not the best comments, but I like to comment all my members, attributes, and methods
3. It saves me from having to rewrite the same things over and over again, which helps me to abide by the "Don't repeat yourself" principle.
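
To make the idea concrete, here is a minimal sketch of the kind of generation such a tool might perform. It is not the source of the downloadable app; the FieldSpec and ClassGenerator names, and the exact shape of the emitted code, are assumptions made purely for illustration. It also assumes every attribute name is non-empty and already in PascalCase.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical description of one attribute entered in the UI.
public class FieldSpec
{
    public string TypeName { get; private set; }
    public string Name { get; private set; }   // e.g. "FirstName"

    public FieldSpec(string typeName, string name)
    {
        TypeName = typeName;
        Name = name;
    }
}

// Hypothetical generator: emits a class with a commented private member
// and a matching commented public property for every field.
public static class ClassGenerator
{
    public static string Generate(string className, IEnumerable<FieldSpec> fields)
    {
        var sb = new StringBuilder();
        sb.AppendLine("public class " + className);
        sb.AppendLine("{");

        foreach (FieldSpec field in fields)
        {
            // "FirstName" becomes "_firstName"
            string member = "_" + char.ToLowerInvariant(field.Name[0]) + field.Name.Substring(1);

            sb.AppendLine("    /// <summary>");
            sb.AppendLine("    /// Backing member for " + field.Name);
            sb.AppendLine("    /// </summary>");
            sb.AppendLine("    private " + field.TypeName + " " + member + ";");
            sb.AppendLine();
            sb.AppendLine("    /// <summary>");
            sb.AppendLine("    /// Gets or sets " + field.Name);
            sb.AppendLine("    /// </summary>");
            sb.AppendLine("    public " + field.TypeName + " " + field.Name);
            sb.AppendLine("    {");
            sb.AppendLine("        get { return " + member + "; }");
            sb.AppendLine("        set { " + member + " = value; }");
            sb.AppendLine("    }");
            sb.AppendLine();
        }

        sb.AppendLine("}");
        return sb.ToString();
    }
}
```

Calling ClassGenerator.Generate("Person", ...) with a FieldSpec of type "string" named "FirstName" returns the text of a Person class with a _firstName member and a FirstName property, ready to paste into a project and adjust by hand.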

Granted, I often have to change things that are produced by this little app, but as mentioned above, it has turned out to be a big time saver. I'm sure there are better code generators out there, but for now, this works for me.

Sunday, January 23, 2011

Another Layers of Programming Post

In the first post I wrote on the layers of programming, I feel like I really faltered in describing what "Business Logic" is.

Part of the reason is that the concrete definition of it wasn't as clear in my mind as it is now, and part of it is that I had more of an affinity for the data access layer at the time, so I didn't find explaining business logic as exciting.

Times have changed, and I find business logic to be much more exciting than I did then. I also find this concept of "n-tier programming" to be one of the most challenging concepts to get right in object oriented programming.

So, I'll do a quick recap of the three main tiers in n-tier programming:

Data Access Layer: The code responsible for getting data from the database and delivering it to the business logic layer.

User Interface Layer: The code responsible for rendering output to the end-user. The user interface layer is mostly composed of user interface controls that render a user interface -- these controls can be the parent form that other controls are attached to, textboxes, comboboxes, radio buttons, and checkboxes for user input, buttons for user submission, list boxes and data grids for outputting collections of data...you get the idea.

Business Logic Layer: The code responsible for solving the business problems that software is being built to solve.

There are all kinds of implications for what the business logic layer is. Because object oriented programming helps us represent reality so well, I usually have my business logic layer define the concepts of the system I'm building.

For instance, suppose we're building customer relationship management (CRM) software.

Suppose this CRM is supposed to:
1. Keep track of our customers
2. Help us manage our inventory
3. Keep sales history for our customers

Obviously we could add more features to this CRM, but I think the above is adequate for describing what Business Logic does.

There are a lot of ways (methodologies) to build software - usually software is built after requirements have been written.

So, I'll preface these definitions with the fact that I'm building this business logic without having a requirements document.

But in this CRM, we need concepts defined, such as: What is a customer? What is a product? What is a sale? How do these concepts interact with one another?

The answers to the above questions really define what business logic is. For instance, one of the classes in my business logic layer may be Customer. The customer class would look like this:


public class Customer
{
   private string _firstName;

   private string _lastName;

   private string _emailAddress;

   private string _address;

   public string FirstName
   {
     get { return _firstName; }
     set { _firstName = value; }
   }

   public string LastName
   {
     get { return _lastName; }
     set { _lastName = value; }
   }

   public string EmailAddress
   {
     get { return _emailAddress; }
     set { _emailAddress = value; }
   }

   public string Address
   {
     get { return _address; }
     set { _address = value; }
   }
}

The above is a truncated version of what the Customer class would look like, but it would also be the first code written in the business logic layer, because all of the features of the CRM revolve around the concept of "Customer."

The next concept to define in our Business Logic is the concept of an Inventory Product. I would imagine that an inventory product would look a bit like the code below:


public class InventoryProduct
{
   private string _partNumber;

   private string _productName;

   private bool _isActive;

   public string PartNumber
   {
     get { return _partNumber; }
     set { _partNumber = value; }
   }

   public string ProductName
   {
     get { return _productName; }
     set { _productName = value; }
   }

   public bool IsActive
   {
     get { return _isActive; }
     set { _isActive = value; }
   }
}

So, the next question to answer is how does a Customer interact with an InventoryProduct? I would argue that a class called "ItemPurchase" would need to get created. An ItemPurchase would basically be a link between a Customer and an InventoryProduct, with a couple other attributes, such as a PurchaseDate and a Quantity, and maybe a couple other things, depending on what kind of information is getting collected about it.

ItemPurchase would look as follows:


public class ItemPurchase
{
   private Customer _customer;
   private InventoryProduct _product;
   private int _quantity;
   private DateTime _date;

   public ItemPurchase(Customer customer, InventoryProduct product, int quantity)
   {
      _customer = customer;
      _product = product;
      _quantity = quantity;
      _date = DateTime.Now;
   }

   ///Other methods to save the purchase
}

This link between Customer and InventoryProduct is great, but we've got another problem: the problem is that the ItemPurchase object only covers one item. Obviously, a customer can purchase multiple things at one time, so we need another collection class. Maybe it could be called "ShoppingCart." I'll exclude the code for this because this is getting to be kind of a long blog entry, but I think you get the idea.
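
For readers who would like to see it anyway, one possible shape for ShoppingCart is sketched below. The stub classes, the two-argument Add() method, and the Quantity and TotalQuantity properties are assumptions made so the example compiles on its own; the real Customer and InventoryProduct classes would carry the members shown earlier.

```csharp
using System;
using System.Collections.Generic;

// Stubs standing in for the classes defined earlier in this post.
public class Customer { }
public class InventoryProduct { }

public class ItemPurchase
{
    private Customer _customer;
    private InventoryProduct _product;
    private int _quantity;
    private DateTime _date;

    public ItemPurchase(Customer customer, InventoryProduct product, int quantity)
    {
        _customer = customer;
        _product = product;
        _quantity = quantity;
        _date = DateTime.Now;
    }

    // Assumed read-only accessor so the cart can total its contents.
    public int Quantity
    {
        get { return _quantity; }
    }
}

// One possible ShoppingCart: everything one customer bought at one time.
public class ShoppingCart : List<ItemPurchase>
{
    private Customer _customer;

    public ShoppingCart(Customer customer)
    {
        _customer = customer;
    }

    public Customer Customer
    {
        get { return _customer; }
    }

    // Convenience overload so callers don't build ItemPurchase objects themselves.
    public void Add(InventoryProduct product, int quantity)
    {
        Add(new ItemPurchase(_customer, product, quantity));
    }

    public int TotalQuantity
    {
        get
        {
            int total = 0;
            foreach (ItemPurchase purchase in this)
                total += purchase.Quantity;
            return total;
        }
    }
}
```

A cart built this way always ties every ItemPurchase in it to the same Customer, which is the link the rest of this post relies on.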

The shopping cart would serve to link an entire purchase to a customer...which leads us to another collection - PurchaseHistory. A PurchaseHistory object would be a list of ShoppingCart objects, and this PurchaseHistory object would get linked to a Customer object, and the Customer class would HAVE-A PurchaseHistory object in it. This relationship in the business logic also lines up very well with how the database would be designed (not covering database design in this post).

So, since I'm creating a PurchaseHistory object, which is essentially a List of ShoppingCart objects, AND since I don't want to violate the Open-Closed principle, I would start by creating a ShoppingCartCollection object, and have PurchaseHistory inherit from that:


public class ShoppingCartCollection : List <ShoppingCart>
{

}

public class PurchaseHistory : ShoppingCartCollection 
{
   Customer _customer;

   public PurchaseHistory(Customer customer)
   {
     _customer = customer;
     this.populate();
   }

   private void populate()
   {
     ///Go to the data access layer to grab information about this customer's purchase history
   }
}

Notice that in PurchaseHistory, the only constructor takes a Customer as a parameter (there is no parameterless constructor). This makes sense to me because a PurchaseHistory, as we've defined it so far, does not seem to exist outside the context of a Customer. There's a lot of conversation that can be had around this, and how a PurchaseHistory will exist, but for this example, we'll just assume this will always be the case.

Because a customer will HAVE-A PurchaseHistory, I would modify the Customer class as follows:


public class Customer
{
     ///All of the code above

   private PurchaseHistory _purchases;

   public PurchaseHistory Purchases
   {
     get
     {
          if(_purchases == null)
          {
             _purchases = new PurchaseHistory(this);
          }
          return _purchases;
     }
   }
}

Notice in the getter in Customer, I check to see if the _purchases member is null - this is a trick called "Lazy initialization," which helps me avoid instantiating a PurchaseHistory object until I need it. This may be the right thing to do or the wrong thing to do, depending on all kinds of other factors (size and optimization of database indexes, how often a Customer's PurchaseHistory will be accessed, how often a Customer's PurchaseHistory will be changed, etc, etc, etc).
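
The same idea can also be expressed with the Lazy<T> type that ships with .NET 4 and later. The sketch below is an alternative, not the approach used above; the stripped-down PurchaseHistory stub and its LoadCount counter are assumptions that exist only so the deferred construction can be observed.

```csharp
using System;

// Stub: counts how many times a history has been constructed.
public class PurchaseHistory
{
    public static int LoadCount = 0;

    public PurchaseHistory(Customer customer)
    {
        LoadCount++;   // stands in for the trip to the data access layer
    }
}

public class Customer
{
    private readonly Lazy<PurchaseHistory> _purchases;

    public Customer()
    {
        // The factory delegate does not run until Value is first read.
        _purchases = new Lazy<PurchaseHistory>(() => new PurchaseHistory(this));
    }

    public PurchaseHistory Purchases
    {
        get { return _purchases.Value; }
    }
}
```

Creating a Customer leaves LoadCount at 0; the first read of Purchases bumps it to 1, and later reads return the same object. One difference worth knowing: with this constructor, Lazy<T> is thread-safe by default, whereas the hand-written null check is not.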

So, the flow through the business logic we've defined looks like this:

1. The business logic gets its data from the database (by way of the data access layer)
2. It delivers that data to the user interface
3. The data is (presumably) changed in the user interface
4. The request to save comes from the user interface
5. That request prompts our business logic to tell the data access layer to save the changed data to the database.
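
The round trip in that list can be sketched in a few lines. Everything here beyond the Customer name is hypothetical: the ICustomerData interface, the InMemoryCustomerData stand-in, and the idea of handing the data access object to the constructor are assumptions chosen so the sketch runs without a database, not a description of how the classes above are actually wired.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical data access contract; a real one would talk to the database.
public interface ICustomerData
{
    string LoadEmail(int customerId);
    void SaveEmail(int customerId, string emailAddress);
}

// In-memory stand-in so the sketch runs without a database.
public class InMemoryCustomerData : ICustomerData
{
    private readonly Dictionary<int, string> _emails = new Dictionary<int, string>();

    public string LoadEmail(int customerId)
    {
        string email;
        return _emails.TryGetValue(customerId, out email) ? email : null;
    }

    public void SaveEmail(int customerId, string emailAddress)
    {
        _emails[customerId] = emailAddress;
    }
}

// Business logic class: the only thing the user interface talks to.
public class Customer
{
    private readonly ICustomerData _data;
    private readonly int _id;

    public string EmailAddress { get; set; }

    public Customer(int id, ICustomerData data)
    {
        _id = id;
        _data = data;
        EmailAddress = _data.LoadEmail(_id);   // step 1: load via data access
    }

    public void Save()
    {
        _data.SaveEmail(_id, EmailAddress);    // step 5: hand changes back
    }
}
```

The user interface would read and change EmailAddress (steps 2 and 3) and call Save() when the user submits (step 4); it never touches the data access layer directly.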

That's business logic as I see it, and this is where a programmer generally earns his (or her) wage. The ability to see the business concepts as relationships, and then represent them in code, is a real skill that takes years of tweaking to get right. There are all kinds of pitfalls and obstacles that come with almost any strategy one uses when building this stuff, ranging from scalability issues, extensibility issues, code management issues because of poor design or overly complicated design, performance problems because of use (or lack thereof) of lazy initialization, and a host of other things.

But, this is the world as I see it, and I hope it helps.

Monday, January 3, 2011

The Liskov Substitution Principle

Where the Open-Closed Principle advocates inheritance to build good code, the Liskov Substitution Principle says "Not so fast!" Inheritance is a great way to create hierarchies in programming, but with great power comes great responsibility. Officially, the Liskov Substitution Principle states: Functions that use pointers or references to base classes must be able to use objects of derived classes without knowing it. There are 2 main risks that one assumes by using inheritance:

1. Inappropriate behavior when casting occurs
2. Overly broad generalizations about child class behavior

Again, Martin relies on shapes to demonstrate how inheritance can create problems, this time using a square and a rectangle as the demonstration points.

In his example, he describes a Rectangle as a parent class, and a Square as a child class of Rectangle. Below is C# code that emulates behavior Martin describes:


    public class Rectangle
    {
        protected double _width;
        protected double _height;

        /// <summary>
        /// Just to keep track of the object
        /// </summary>
        protected string _objectName;

        public Rectangle()
        {
            _objectName = "Rectangle";
        }

        public virtual void SetWidth(double w)
        {
            _width = w;
        }

        public virtual void SetHeight(double h)
        {
            _height = h;
        }

        public double GetWidth()
        {
            return _width;
        }

        public double GetHeight()
        {
            return _height;
        }

        public void Output()
        {
            Console.WriteLine(_objectName + " attributes: Height=" + _height + " | Width=" + _width);
        }
    }

    public class Square : Rectangle
    {
        public Square()
        {
            _objectName = "Square";
        }

        public override void SetWidth(double w)
        {
            _width = w;
            _height = w;
        }

        public override void SetHeight(double h)
        {
            _height = h;
            _width = h;
        }
    }

The above code looks clean, elegant, etc, but there's a problem. And this problem comes when, later on in the development process, someone builds code that takes the parent class as a parameter, expecting it to behave like a plain Rectangle:


class SomeImplementation
    {
        public void DoStuff()
        {

            Square square = new Square();
            manipulateShape(square);

        }

        private void manipulateShape(Rectangle rectangle)
        {
            rectangle.SetHeight(2);
            rectangle.SetWidth(4);

            if (rectangle.GetWidth() * rectangle.GetHeight() == 8)
                return;

            throw new Exception("Error with rectangle dimensions");
        }

    }

In the above example, if a Rectangle object were passed to manipulateShape(), everything would be fine; however, in this case, when a Square object (which also IS-A Rectangle) is passed to manipulateShape(), it throws an exception, because Width * Height equals 16, not 8. The real problem here is that we over-rely on the parent's behavior, and as the body of code grows, manageability becomes a major concern, because other users of the code ought to be able to rely on expected behavior without always having to factor in the interactions that occur in the inheritance chain.

Ultimately, the developer who wrote the manipulateShape() method is expecting a Rectangle, but not one of its children (which is a reasonable thing to do); however, the real issue in this design is the overreliance on virtual/overridden methods: a square is a rectangle based on its public interface, but not on its behavior.

Consider a modification to the example, where a Rectangle is generated from some outside method:


Rectangle someUnknownShape = getShape();
manipulateShape(someUnknownShape);

If the getShape() method returns a Square (or any other child class that overrides the setters in a similar way) instead of a top-level Rectangle, the exception gets thrown.

Because the behavior of Square is not consistent with Rectangle, the ultimate solution is to not have Square inherit from Rectangle. The safer alternative, in C#, as I see it, is to create an interface that both Square and Rectangle implement:


public interface IBoxShape
{
 void SetWidth(double width);
 void SetHeight(double height);
 double GetWidth();
 double GetHeight();
}

If both Square and Rectangle implemented the IBoxShape interface, and Square no longer inherited from Rectangle, then the above manipulateShape() method could keep its parameter as Rectangle and the application would behave as expected: the compiler would simply refuse to let a developer send a Square to manipulateShape().
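
A minimal sketch of that arrangement follows. The method bodies and the ShapeMath helper are assumptions added for illustration, and the two getters are declared to return double here so that callers can actually read the dimensions.

```csharp
using System;

public interface IBoxShape
{
    void SetWidth(double width);
    void SetHeight(double height);
    double GetWidth();
    double GetHeight();
}

public class Rectangle : IBoxShape
{
    private double _width;
    private double _height;

    public void SetWidth(double width) { _width = width; }
    public void SetHeight(double height) { _height = height; }
    public double GetWidth() { return _width; }
    public double GetHeight() { return _height; }
}

// Square no longer inherits from Rectangle, so it cannot be passed
// to a method that asks for a Rectangle.
public class Square : IBoxShape
{
    private double _side;

    public void SetWidth(double width) { _side = width; }
    public void SetHeight(double height) { _side = height; }
    public double GetWidth() { return _side; }
    public double GetHeight() { return _side; }
}

// Hypothetical helper: works for any IBoxShape and assumes nothing
// about how the setters interact with each other.
public static class ShapeMath
{
    public static double Area(IBoxShape shape)
    {
        return shape.GetWidth() * shape.GetHeight();
    }
}
```

Code that genuinely needs independent width and height keeps asking for a Rectangle; code that only needs to read dimensions, like ShapeMath.Area(), can ask for an IBoxShape and work with either class.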

Going back to the Liskov Substitution Principle: Functions that use pointers or references to base classes must be able to use objects of derived classes without knowing it.

The manipulateShape() method in its original state ought to be able to expect that the area of the rectangle, after manipulating it, will be 8 - that behavior ought to be transparent, and changing behavior in child classes by use of virtual and overridden methods should be avoided.

The Open-Closed Principle

The Open-Closed Principle states, quite concisely, that classes should be "Open for extension, but closed for modification." In other words, the attributes and behaviors of a class (ie the building blocks of a class) should be thoughtfully placed where they belong in the inheritance chain. I don't think I could really overstate how many times I've violated this principle. In fact, in looking back at code I've written in the past, I see all kinds of examples where I violate it. But this is definitely one of the principles that has strongly impacted how I write code.

The example Uncle Bob used to describe this principle was a Shape example. The implementation of the Open-Closed Principle is via inheritance and, eventually, polymorphism.

Martin starts with an antipattern to describe Open-Closed. Below is an example in C# that is similar to Martin's:

We start with some base citizens in our code base:

 
public class Shape
{
 double Area;
 ///...Other shape code
}

public class Square : Shape
{
 ///...Square code
}

public class Circle : Shape
{
 ///....Circle code
}

public class SomeImplementation
{
 public void DrawCircle()
 {
  ///...Draw a circle
 }

 public void DrawSquare()
 {
  ///...Draw a square
 }
 ///...Some code in Some Implementation

 public void DrawAllShapes(List<Shape> shapeList)
 {
  for(int i = 0; i < shapeList.Count; i++)
  {
   Shape currentShape = shapeList[i];
   if(currentShape is Square)
    DrawSquare();
   else if(currentShape is Circle)
    DrawCircle();
  }
 }
}

The above code violates the Open-Closed Principle because it cannot accommodate new shapes without being modified. Adding a new shape would entail making a change to DrawAllShapes() every time a new shape got added to the program's vernacular. In the original article, Shape, Square, and Circle were Structs, not Classes, which made the antipattern much more ugly-looking. I didn't use a struct in my example, because overusing Structs was never a real problem in my learning how to program.

The better solution, which does not violate the Open-Closed Principle, is shown below:

 
public abstract class Shape
{
 double Area;

  public abstract void Draw();

 ///...Other shape code
}

public class Square : Shape
{
 public override void Draw()
 {
  ///...Do Square drawing
 }
}

public class Circle : Shape
{
 public override void Draw()
 {
  ///...Do Circle drawing
 }

}

public class SomeImplementation
{
 ///...Some code in Some Implementation

 public void DrawAllShapes(List<Shape> shapeList)
 {
  foreach(Shape currentShape in shapeList)
   currentShape.Draw();
 }
}

In the above code, the SomeImplementation class uses polymorphism to draw any shape in the passed shape list. So, as more shapes are added to the application, the DrawAllShapes() method does not need to change (the Closed portion of the principle), and the new shape can simply be a new class, inherited from Shape, or one of its children (the Open portion of the principle).
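
To see the "Open" half in action, here is a small runnable sketch in which a new shape is added later. The Triangle class and the Console.WriteLine bodies are hypothetical stand-ins for real drawing code; the point is only that DrawAllShapes() is untouched.

```csharp
using System;
using System.Collections.Generic;

public abstract class Shape
{
    public abstract void Draw();
}

public class Square : Shape
{
    public override void Draw() { Console.WriteLine("Drawing a square"); }
}

public class Circle : Shape
{
    public override void Draw() { Console.WriteLine("Drawing a circle"); }
}

// Added later: a new class, and nothing else in the program changes.
public class Triangle : Shape
{
    public override void Draw() { Console.WriteLine("Drawing a triangle"); }
}

public class SomeImplementation
{
    // Identical to the version above; it never needs to know about Triangle.
    public void DrawAllShapes(List<Shape> shapeList)
    {
        foreach (Shape currentShape in shapeList)
            currentShape.Draw();
    }
}
```

Passing a list containing a Square and a Triangle to DrawAllShapes() draws both, even though Triangle did not exist when the method was written.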

As a personal aside, I used to violate this principle all the time, because I would create "Collection classes" that would inherit from a List object, and every time a need for a new type of the same collection occurred, I would add a new constructor to represent that type of collection; the better solution would have been to inherit from the base collection. So, the below code is what I used to do:

 
public class SomeClass
{
}

public class SomeClassCollection : List<SomeClass>
{
 Category _category;
 Person _person;

 public SomeClassCollection()
 {
  _category = null;
  _person = null;
  populate();
 }

 public SomeClassCollection(Category category)
 {
  _category = category;
  _person = null;
  populate();
 }

 public SomeClassCollection(Person person)
 {
  _person = person;
  _category = null;
  populate();
 }

 private void populate()
 {
  if(_category == null && _person == null)
  {
   ///Do type 1 collection population
  }
  else if(_category == null && _person != null)
  {
   ///Do type 2 collection population
  }
  else if(_category != null && _person == null)
  {
   ///Do type 3 collection population
  }
 }
}

The better way to build the above code to avoid violating the Open-Closed Principle would have been to extend SomeClassCollection:

 
public class SomeClassCollection : List<SomeClass>
{

 protected virtual void populate()
 {

 }
}

public class CategoryCollection : SomeClassCollection
{
 Category _category;

 public CategoryCollection(Category category)
 {
  _category = category;
  populate();
 }

 protected override void populate()
 {
  ///Populate based on category
 }
}

public class PersonCollection : SomeClassCollection
{
 Person _person;

 public PersonCollection(Person person)
 {
  _person = person;
  populate();
 }

 protected override void populate()
 {
  ///Populate based on person
 }
}

SOLID Principles

I recently stumbled onto a collection of articles written by an author named Robert Martin way back in 1996 for a now defunct magazine called "The C++ Report" (The C++ Report was evidently followed by the "Journal of Object-Oriented Programming", which doesn't seem to exist anymore either). These articles address core principles of object oriented design, and seem as true today as they were all those years ago. Of course, a few of the references Martin used in his articles may be a bit outdated, but the spirit and intent behind the articles is clear, and unlike a lot of articles I read that were written much more recently than Martin's articles, these articles are unmuddied by the complexities and interactions between all of the various technologies, subtechnologies, wrappers, APIs, etc that exist these days.

Martin (affectionately known as "Uncle Bob") wasn't really inventing any new concepts with his articles; rather, he was simply synthesizing information created by others into a paradigm that was appropriate for the time and available technology. But Robert Martin's articles were so successful and widely adopted that they are still at the core of most object oriented design in use today, including later concepts, such as design patterns. Evidently, Martin is still at it, owning a company called "Object Mentor" which does company/enterprise-level coaching on these and other object oriented design principles. He's also written a number of books on the topic. Today, Martin is considered a legend in the programming world. He even has a blog, which can be found at http://blog.objectmentor.com/articles/category/uncle-bobs-blatherings

I'm about halfway through reading this series of articles Martin wrote, and the information I've gleaned from them has drastically (or at least, significantly) impacted how I approach design and refactoring (in fact, some of the principles demonstrated in these articles have sparked several refactoring sessions). Martin's SOLID principles are made up of: the Single Responsibility Principle, the Open-Closed Principle, the Liskov Substitution Principle, the Interface Segregation Principle, and the Dependency Inversion Principle.

If nothing else, my next few posts will be a way for me to digest the information I gathered from the core OOP principles Uncle Bob put forward (hereafter referred to as SOLID principles). And perhaps someone out there will read them and be compelled to apply them in their own programming.
