Object Oriented Programming in a Nutshell
Next year I’ll likely be teaching Essential Test-Driven Development to a team that includes about 50% COBOL programmers. I told the client I’d look for a good object-oriented (OO) primer for those developers to read in advance. As you can imagine, it’s tough to “unit-test” software that doesn’t have clear “unit” boundaries. COBOL relies on a lot of global state, and this appears to be true for this client.
So I thought about my very first industrial OO training, in 1998, which was horrible. It was from the school of OO that promotes “subclass for specialization” and “design objects to model reality so they can be reused across many applications and you know that yours will be the best ever though thousands have tried and failed…” and “multiple inheritance is awesome!” [Hmmm. Funny, that last one… Ruby supports multiple inheritance, but it’s not nearly as offensive as in C++. There’s a whole ‘nother post in that, though. Not today…]
Thankfully, I had all that shaken from my brain upon my first encounter with a Smalltalk programmer mere months later.
I tried to find a popular OO primer on Amazon. It’s so hard to tell without reading them, though, whether it’s from the Smalltalkian school of OO or from the confused masses of pseudo-OO who still think there’s some magical mystery to software development.
Failing that, I decided to record a few things I used to point out when I was teaching OO classes:
OOP in a Small Nutshell, in Three Parts
First, I’ll summarize. But I must warn you that there’s a lot of subtlety hiding behind this summary, so don’t wander off. I’m summarizing because my summary is likely quite different from that of other OOP-in-a-Nutshell bloggers.
- Objects are custom types.
- Objects are loci of maintenance activities.
- Objects replace procedural compile-time structures with state that changes at run-time.
Notice that I didn’t mention polymorphism, abstraction, or any other $5 words. They’re all in there…somewhere. It’s not that any of those are bad, but that they’ve always seemed like observational descriptors, from outside actual code-slinging, looking in at results.
I’m a pragmatist. I think most programmers are pragmatists (or at least they become pragmatists after their delusions of magical wizardry are crushed by a good Automata Theory class, or an ex-Smalltalker coach).
So my list is designed to appeal to the pragmatist in you. Though, they do need some explaining (especially that third one).
Objects are custom types.
“As custom types, objects allow us to talk about our system, and change our system, using a language that we create together.”
Objects combine data with the behaviors that are most applicable to that data. And vice versa. Objects provide that behavior to other parts of the software (other objects, or an API, or some other form of UI).
I think a good counter-example is always helpful: In the early days of Java, you could not add two Integers together with the + operator. I’m not even sure Integer had an addTo(anotherInteger) method! You had to extract the int primitive from the object, add them together, then (if you wanted) create a new Integer object out of your int sum. That made Integer a poor excuse for an object. (I think it was fixed only after Microsoft implemented boxing and unboxing in .Net, which made everything an object in .Net)
As custom types, objects allow us to talk about our system, and change our system, using a language that we create together. Our object names and method names (behaviors) match the “ubiquitous language” we use when we talk to customers. In fact, conversations with customers often help us name things.
We’re not trying to model reality here. We’re actually modeling domain activities.
My very first OO product was a “Storage Area Network Policy Management Application.” Before our OO coach arrived to kick us in our complacency, we had diagrammed DiskDrive and TapeDrive and TapeRobot and a host of other nouns. Working with the coach, we discovered that we needed to be able to model a user’s need to store data on encrypted volumes, perhaps duplicated a certain number of times, and/or accessible within a range of times. For example, if I said I would need to access my gigabytes of data (in 1998, that was huge) within an hour and I wanted it to be encrypted, the system would give me a list of tape drives (capacity) that supported hardware encryption.
We threw out all our old stuff and wrote Filter (“gotta have it so don’t show me anything that doesn’t”) and Sorter (“put the best at the top and I’ll choose”) classes.
Which the coach then said were terrible names. He made us step back and listen to ourselves talk about the users, and how they would be using the software. Filters became Needs, Sorters became Wants. Yep, our system had Java class names of Need and Want. (They could be composed with the And and Or subclasses, becoming a very nice Composite Pattern! But I digress into supreme nerd-dom…)
Though I haven’t mentioned any of these, can you see how this explains the need for: abstraction and polymorphism, cohesion, intentional (vs. accidental) coupling, encapsulation?
Objects are loci of maintenance activity.
“Good objects contain just enough of the system so we can look into two or four of them and not feel overwhelmed.”
This paragraph may crush a few delusions. (I sure hope so.) Software design is fundamentally about two things: (1) Making sure the system does what we need it to do, and (2) making sure we can change (extend or fix) the system when necessary.
Wrapped up in #2 is easy readability, and for all developers. If only one developer loves to use lambdas (closures, anonymous functions, function pointers…), they need to explain them sufficiently to the rest of the team so the whole team can determine when lambdas are appropriate, and when a whole system of unnamed functions flying around like fireflies will detract from general comprehensibility and maintainability.
I’d be remiss if I didn’t mention that a suite of fast, automated, comprehensive tests (like what you’d get from BDD or TDD) supports both #1 and #2. On an “Agile” team, you need to be able to refactor diligently, and this test-suite (a.k.a. executable specification) allows you to do so with confidence.
Good objects contain just enough of the system so we can look into two or four of them and not feel overwhelmed. I’ve heard complaints about OO systems that we have to jump from class to class to figure out what’s going on. And this is partly true. Procedural code, which describes a behavioral recipe step-by-step, is very easy to understand. I would argue that OO is easier, once you get used to it, and once you trust some of the objects in your system to behave as expected. If I’m trying to understand class Foo, and class Foo delegates to class Bar, my need to go and learn class Bar depends upon how well-named Bar’s methods are, and whether I trust Bar. Again, if we’ve tested the snot outta Bar, why would I need to examine Bar? I need only read Foo’s code (better yet, Foo’s tests!) to figure out what Foo is doing.
When it comes time to add some feature to the system, we benefit greatly if changes can be understood and implemented using the classes and abstractions already available to us. If not, then that’s a great opportunity to refactor to the point where the changes are isolated to a small set of new and existing classes. In 2001 Jim Shore and I enhanced the input and output of a whole software suite to support a broader set of international characters, by changing two methods in one class.
As your system grows, objects need to become more concise and specific, not less. This does trend towards smaller objects that cooperate with each other in the aggregate, in order to get important stuff done. If one object does one thing over there, it can do the same thing over here. There, now I’ve covered “reuse.”
I’ve also set myself up nicely for the third point.
Objects replace procedural compile-time structures with state that changes at run-time.
“Often good object-oriented designs are the result of past decisions to make the addition of a new run-time variation more easily completed, tested, and understood in the code.”
It’s easy to follow procedural code. “Here an if statement, there a for loop… Ah, I see what it’s doing.” Objects, especially poorly written objects, can seem incomprehensible in comparison. So why bother?
Let’s look at a brief toy example. Here’s the procedural code:
And here’s the same code using delegation:
whatToDoNext() is a “factory method” which just means it returns a new object. And, likely, it contains an if statement. Not much advantage there.
So where is the advantage of OO over procedural? Well, one thing I notice here is that a decision is being made based on a flag, and then we’re deciding to do one of two different things.
I’d bet, early on in the development of this, all we had was the body of doSomethingQuick(). Now there’s a choice, and we need to choose between two things. What if two becomes three? Or 18? Ah, yes! We can change the if statement to a switch, and that would be fine. What if we need two switchstatements, or four? And then we need to add the 19th game strategy?
Here’s what you should watch out for: If you need to add functionality in more than one place, your code is telling you that it’s not designed optimally for the new addition. Particularly if you have more than one switch statement that switches on the same data value.
Folks have been telling you about the “Rule of Three” for decades, right? “If there are three of something, we need an abstraction.” That’s fine, but I’ve found that if there are two of something, there will be three. Also, there are advantages to making it clear that there are two, and that they are separate and slightly different. Instead of stating a “Rule of Two” which would be another external observation, I’ll tell you the rule I really follow: I try to be sensitive to the “code smells” (Fowler/Beck, Refactoring, 1999) that lead me to more readable and understandable designs.
So all those if-else or switch statements become simple delegations to an object that was selected as the current state of the delegating object. You can’t get rid of all the switch/if-else statements (unless you use a Dependency Injection Framework, but let’s not fool ourselves…that’s still an if statement [and a new!], just a really slow and awkward one). There’s one in the factory method.
Okay, note that I didn’t say “Factory Method” but “factory method.” I’m not referring to the pattern. Essentially, I’m referring to a method that makes something and returns it, rather than making it and then using it within the same set of curly-braces (that path leads to madness…well, unintentional coupling and difficulty in testing, anyway).
Often good object-oriented designs are the result of past decisions to make the addition of a new run-time variation more easily completed, tested, and understood in the code.
When we replace lots of compile-time structure with run-time state, we make the code simpler to read. There may seem to be a bit of trade-off because we have to read more of the system in order to understand what might happen in certain circumstances. And there’s another reason why I can’t ever, ever separate object-oriented software development from the numerous advantages of having a suite of well-written, readable, fast, and comprehensive behavioral tests[/specs].
You may note there’s a lot of overlap in my descriptions of OO in three nutshells: Custom types, maintainable and comprehensible, replacing lots of static, compile-time structure with instances of those types.
With tests. Did I mention you want to have tests?