SOLID Principles – Revisited

I recently rewrote an article on APIE, an article I originally wrote over 2 years ago. Now I am going to rewrite another article I wrote 2 years ago on the SOLID principles.

My views on the SOLID principles have stayed largely the same, but my knowledge of them has matured. Hopefully this means I’ll be able to explain each principle in greater detail while being more concise. I also like to think my writing style has matured, so I can express myself better with less waffling, which is always a plus.

A brief introduction

The SOLID principles are 5 basic OOP principles that together help developers build systems which are easy to maintain and extend over a long period of time. In my recent rewrite on APIE, I talked a lot about abstraction, as I believe it’s the most important of the 4 APIE concepts. That leads me on to the question: what are the SOLID principles? SOLID stands for Single responsibility, Open-closed, Liskov substitution, Interface segregation and Dependency inversion.

Last time I wrote about the SOLID principles, I wrote that the open-closed principle and interface segregation principle were less important than the others. This time I’m not saying anything of the sort. Each principle is equally important; however, I will say the open-closed principle is very easy to violate. If you could write applications without violating this principle, you would potentially have an extremely robust piece of software. I’ll cover why in the open-closed section.

As I step through each principle, you’ll begin to see how intertwined they are. When you start violating one of them, you’re almost certainly violating two or three of them. In my opinion, the most important thing to understand about the SOLID principles is that everything revolves around programming to an interface. It’s an age-old rule within OOP, but it’s just as important now as it has ever been. Be warned, it may sound like I’m repeating myself at times, and that’s because I will be. The SOLID principles are tangled together like a bowl of good chow mein, so unfortunately there will be some cross-referencing; however, I’ll do my best to limit this.

Single Responsibility

“A class should have only one reason to change”

This principle is by far the most commonly known of the five. Even if you’ve never heard of it before, you’re probably complying with single responsibility at least a little. The principle states that when building your classes, you should “make sure that each class does one thing and one thing only”. This can also be described as having only one reason to change. If your class needs updating to support more than one piece of functionality, it is breaking single responsibility. You can think of it a little like the Unix philosophy, just with classes instead of programs.

The main point I want to talk about here is the concept of knowers and doers in your application. Every class you create should be either a knower or a doer, but never both; being both would violate single responsibility. Let’s discuss what being a knower entails.

Knower Classes

A knower is simply a class which stores data but has no methods to operate on it; it just holds data for other objects to consume and use. These objects are commonly referred to as Data Transfer Objects (DTOs). The most common example of a DTO would be your models, but they could also be value objects. Models are generally populated from your database data, using some form of Data Access Object (DAO). This could be done using an Object Relational Mapper (ORM), the repository pattern or something else, whatever tickles your fancy.

Models that have no doer methods, i.e. they contain no behaviours, are called anaemic models and they’re extremely common in OOP. Despite their popularity, there is another group of people who will tell you anaemic models are an anti-pattern, the spawn of the devil even. Martin Fowler wrote a must-read article on just this. The general reason anaemic models are disapproved of is that they remove the benefit that comes with bundling data and behaviour: encapsulation. It’s a fair point; after all, it’s the core benefit of OOP over procedural code. So the argument is, should we allow some logic in our models, whether that be validation rules, business rules or something else? I don’t believe there is a right or wrong answer to this question. The advantage of keeping your models anaemic is that you don’t run the risk of always thinking “just one more method” and ending up with a monolith 6 months down the line. However, if you and your team are confident you can keep your models under control whilst adding some logic to them, I think that’s the perfect medium. There’s nothing wrong with extracting methods into their own classes later down the line, and it’s nothing to be ashamed of either. Once you feel the logic is getting too complicated for your models, it’s better to extract some of it than to let the code rot even more. As I mentioned above, I highly recommend Martin Fowler’s article on this. I don’t want to go into massive detail, as it’s not essential to single responsibility.

A brief word on frameworks. Frameworks are great; they help us build applications quickly. But they all have one thing in common: they all go completely apeshit when it comes to helper methods. As I mentioned above, anaemic models are fine in my opinion, and models with some simple behaviours are also fine. Abstract models that are 5,000 lines long, follow active record and have endless methods working on the global scope are not fine. Yet every framework I’ve ever worked with, including Symfony (PHP), Laravel (PHP), Yii (PHP), Phalcon (PHP), Django (Python) and Rails (Ruby), does it. The important thing to take away is that just because they all do it doesn’t make it right. They all have hundreds if not thousands of contributors and are somehow able to maintain their code. You and your tiny team won’t be able to, and you wouldn’t want to either.

Doer Classes

Single responsibility really lives within the realm of the doers. Knowers know stuff, but that’s it; they’re pretty simple. Doers should do only one thing, although nothing stops them from doing two, three, five, ten or a hundred things. So let’s discuss why they should only do one thing.

Let’s go back to the framework abstract model example. You might think that a single class that can perform every possible action on a user is following single responsibility, but you would be wrong. Abstract models in frameworks usually perform many, many tasks, e.g. pulling data from the database, storing data, saving records to the database, saving relationships, validating forms, performing diffs on objects, hydration, outputting results, caching, exporting and so much more. Each of these responsibilities should be performed by a completely separate set of classes to comply with single responsibility.

Great, so splitting up all these actions into separate classes will help us comply with single responsibility, but what makes that any better? Great question, Sherlock. The key thing to remember when writing software is that you’re not just writing code for machines; more importantly, you’re writing code for humans. One of my favourite quotes goes like this: “Always code as if the guy who ends up maintaining your code will be a violent psychopath who knows where you live.” We’ve all opened a class to find it has over 5,000 lines of code; hell, I’ve stumbled across PHP classes with over 30,000 lines of code. It’s overwhelming, impossible to find what you want, messy, unmaintainable and downright scary. Any developer worth their salt who is roped into working in an environment like that will soon be freshening up the old curriculum vitae and finding a role worthy of them. The point here is that by writing classes that have a single responsibility, you’re also writing code which can more easily be maintained, not just for a few weeks, but for months and years.

Human-friendly code is obviously a huge win, but there’s more. When complying with single responsibility, you’ll find it way easier to swap one implementation of a responsibility for another. For example, going back to the framework abstract model example, your application might be storing all its data in MySQL, but you want to start storing certain models in MongoDB. To do so you’re going to have to throw a bunch of horrible conditions into your models like this:

if ($useMysql) {
    // Save to MySQL
} elseif ($useMongo) {
    // Save to MongoDB
} else {
    throw new RuntimeException('Unsupported storage backend');
}

This obviously isn’t maintainable or easily extensible. In fact, as soon as you start putting if statements inside your classes to choose between implementations, it’s a pretty safe bet that you’re violating the single responsibility principle. Wouldn’t it be much cleaner if instead you had a data mapper class for MySQL and another for MongoDB, each saving models to its respective database? You could then use the strategy pattern to inject the data mapper that you want to use into the client class. This allows you to separate your saving logic for each database system, keeping your classes laser focused and easy to maintain. If you wanted to start saving records to Neo4J as well, you could just add a new data mapper class and you’re done.
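To make the data mapper idea concrete, here is a minimal sketch, assuming PHP 7.4 or later. DataMapperInterface, MysqlDataMapper, MongoDataMapper and UserService are hypothetical names invented for illustration, and the mappers only return a message describing what they would save so that the sketch runs without a database.

```php
<?php
// Hypothetical contract that every data mapper agrees to.
interface DataMapperInterface
{
    public function save(array $record): string;
}

class MysqlDataMapper implements DataMapperInterface
{
    public function save(array $record): string
    {
        // A real mapper would run an INSERT or UPDATE here.
        return 'Saved ' . $record['name'] . ' to MySQL';
    }
}

class MongoDataMapper implements DataMapperInterface
{
    public function save(array $record): string
    {
        // A real mapper would write a document here.
        return 'Saved ' . $record['name'] . ' to MongoDB';
    }
}

// The client class has no idea which database is in use.
class UserService
{
    private DataMapperInterface $mapper;

    public function __construct(DataMapperInterface $mapper)
    {
        $this->mapper = $mapper;
    }

    public function register(string $name): string
    {
        return $this->mapper->save(['name' => $name]);
    }
}

$service = new UserService(new MongoDataMapper());
echo $service->register('Ada'), PHP_EOL;
```

Supporting Neo4J in this sketch would mean writing one more class that implements DataMapperInterface; UserService and the existing mappers would stay untouched.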

On top of all of those benefits, you’ll also find testing your classes way easier. The more dependencies you hide behind abstractions, the more of them you can mock during testing, keeping your tests laser focused too.
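As a rough illustration of the testing benefit, the sketch below swaps a real data mapper for a hand-written test double, assuming PHP 7.4 or later. DataMapperInterface, UserService and InMemoryDataMapper are hypothetical names, and a mocking library could play the same role as the hand-written double.

```php
<?php
// Hypothetical contract the class under test depends on.
interface DataMapperInterface
{
    public function save(array $record): string;
}

// The class under test: it only knows about the interface.
class UserService
{
    private DataMapperInterface $mapper;

    public function __construct(DataMapperInterface $mapper)
    {
        $this->mapper = $mapper;
    }

    public function register(string $name): string
    {
        return $this->mapper->save(['name' => $name]);
    }
}

// Test double: records what it was asked to save instead of
// touching a database.
class InMemoryDataMapper implements DataMapperInterface
{
    public array $saved = [];

    public function save(array $record): string
    {
        $this->saved[] = $record;
        return 'Saved in memory';
    }
}

$mapper = new InMemoryDataMapper();
$service = new UserService($mapper);
$service->register('Ada');

// The test can now inspect exactly what the service tried to save.
var_dump($mapper->saved);
```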

Open-Closed

“Software entities … should be open for extension, but closed for modification”

The general idea behind the open-closed principle is that once an entity (class or method) has been written, it should be closed for modification, i.e. it should never be edited again. The benefit is that if you never have to update your existing classes, you never have to worry about breaking existing functionality when creating new functionality. So how do we achieve this? Programming to an interface is a great first step. By programming to an interface, we are able to add new behaviour by writing new implementations of existing interfaces. This means we can change the behaviour of a class without modifying the source code of our existing classes.

A tell-tale sign that you may be violating the open-closed principle is having to check, inside a method, which class an argument is an instance of, and then choosing a different behaviour based on that type. The best way to make sure you never get into this scenario is to always create interfaces for your abstractions and, where possible, always type hint your method arguments against an interface. If you do this throughout your application, you’ll rarely need to modify your existing entities to create new functionality. There is a catch though: what if we need to add a new method to our interfaces?
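Here is a small sketch of what type hinting against an interface buys you, assuming PHP 7.4 or later. The shapes and the AreaCalculator class are hypothetical and chosen purely for illustration. The calculator never checks instanceof, so a new shape can be added without editing it.

```php
<?php
// Hypothetical abstraction: anything that can report its own area.
interface ShapeInterface
{
    public function area(): float;
}

class Rectangle implements ShapeInterface
{
    private float $width;
    private float $height;

    public function __construct(float $width, float $height)
    {
        $this->width = $width;
        $this->height = $height;
    }

    public function area(): float
    {
        return $this->width * $this->height;
    }
}

class Circle implements ShapeInterface
{
    private float $radius;

    public function __construct(float $radius)
    {
        $this->radius = $radius;
    }

    public function area(): float
    {
        return M_PI * $this->radius ** 2;
    }
}

class AreaCalculator
{
    /** @param ShapeInterface[] $shapes */
    public function total(array $shapes): float
    {
        $total = 0.0;
        foreach ($shapes as $shape) {
            // No instanceof checks: each shape knows its own behaviour.
            $total += $shape->area();
        }
        return $total;
    }
}

$calculator = new AreaCalculator();
echo $calculator->total([new Rectangle(2, 3), new Circle(1)]), PHP_EOL;
```

Adding a Triangle class that implements ShapeInterface would extend what the calculator can handle while AreaCalculator itself stays closed for modification.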

Technically this also violates the open-closed principle, but sometimes you just have to violate it; there really is no way around it. The less you modify your existing entities the better, but sometimes it’s absolutely necessary. I don’t think anyone would argue that over-engineering is a good thing, so we’re hardly going to create a class that has every method we’ll ever need straight off the bat. The upshot is that it’s basically impossible to comply with the open-closed principle all of the time, so why not just make refactoring part of the principle? Maybe one day this will be the case, but for now comply with open-closed as best you can, always type hint using your interfaces, and don’t be afraid to refactor and add new methods where you see fit.

Liskov Substitution

“Objects in a program should be replaceable with instances of their subtypes without altering the correctness of that program”

Liskov substitution and open-closed almost go hand in hand with one another. The main thing we took away from the open-closed principle was that by programming to an interface, we can stop ourselves from modifying existing source files. Instead we can change the behaviour of a class by simply substituting in a new implementation.

Liskov substitution is very similar, but where open-closed focuses on your method arguments, Liskov substitution focuses on the return types of your methods. What the Liskov substitution principle defines is a set of rules that each implementation of an interface, or each child class, should adhere to in order to be truly substitutable. This can quite simply be described as being polymorphic.

A tell-tale sign that you’re probably violating the Liskov principle is that you’re type checking a response inside a client class. In some languages you’re able to declare the return type of your methods, and these declarations can live in your interfaces or parent classes as a contract for your concrete classes to follow, but not all languages support this. For example, you may have two classes which return user records: one set comes from Redis as a standard array or list, while another comes from MySQL as a collection object. This means that inside your client code you need to type check the response before you can operate on the data. This may not seem like too big a deal with only two types of response, but when you add another source, say Neo4J, with another return type, you’ll start to see the messiness creep in.

When your child classes return different types of response, they don’t adhere to the contract set by their parent. This is what forces us to type check the responses. If all of our concrete classes had adhered to a contract and returned the same object type, the response could be handled in exactly the same manner, regardless of the implementation. Not only that, the mismatch would also violate the open-closed principle, because we would be forced to update the client class to support each new return type, breaking the closed for modification rule.
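The sketch below shows one way the Redis and MySQL example could honour a shared contract, assuming PHP 7.0 or later for return type declarations. The repository names are hypothetical, and hard-coded data stands in for the real Redis and MySQL calls so the sketch runs on its own.

```php
<?php
// Hypothetical contract: every repository returns a plain array.
interface UserRepositoryInterface
{
    /** @return array<int, string> user names keyed by id */
    public function findAll(): array;
}

class RedisUserRepository implements UserRepositoryInterface
{
    public function findAll(): array
    {
        // Stand-in for data read from Redis.
        return [1 => 'ada', 2 => 'grace'];
    }
}

class MysqlUserRepository implements UserRepositoryInterface
{
    public function findAll(): array
    {
        // Stand-in for a query result. The collection object is
        // converted to a plain array before it leaves the repository.
        $collection = new ArrayObject([3 => 'linus']);
        return $collection->getArrayCopy();
    }
}

function countUsers(UserRepositoryInterface $repository): int
{
    // No type checking needed: every implementation returns an array.
    return count($repository->findAll());
}

echo countUsers(new RedisUserRepository()), PHP_EOL;
echo countUsers(new MysqlUserRepository()), PHP_EOL;
```

Because the conversion happens inside each repository, a Neo4J repository could be added later without countUsers changing at all.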

As I mentioned above, unfortunately not all languages support return type declarations. In these cases we have to use docblocks to document return types. Obviously this doesn’t have the same effect as a return type declaration, but some languages have static analysis tools that will read your docblocks and try to find violations for you.

On top of return type declarations, Liskov substitution defines some additional rules for staying polymorphic.

Method Signatures

I don’t know about all OO languages, but in the ones I’m familiar with, it’s possible to override a method in a child class with completely different arguments. To stay substitutable, every method in a child class should either match the signature of the same method in its parent class or become looser. Becoming looser is always a confusing concept; it seems more natural for a child class to become tighter. However, making the contract for a child class tighter than the parent’s would mean certain arguments that are valid for the parent would not be valid for the child, meaning the child does not adhere to the parent’s contract.

Let’s explain that with a simple example. You have 6 classes named Person, Woman, Man, Car, Ferrari and Mini. Man and Woman are both child classes of Person, Ferrari and Mini are both child classes of Car, and Car is a dependency of Person. Let’s put that into code with some PHP.

class Car {}

class Ferrari extends Car {}

class Mini extends Car {}

class Person {
    protected $car;

    public function __construct(Car $car) {
        $this->car = $car;
    }
}

class Man extends Person {
    // Tighter than the parent: only a Ferrari is accepted.
    public function __construct(Ferrari $car) {
        $this->car = $car;
    }
}

class Woman extends Person {}

Despite the code above lacking any real use, you can see that the Man class has broken the contract defined by its parent class, Person. The Person class states in its contract that it accepts any instance of Car, which could be the actual Car class, a Ferrari or a Mini. The Woman class doesn’t override the contract in any way, so it’s fine; however, the Man class states that only a Ferrari is a valid dependency. It has made the contract tighter, so it no longer complies with its parent’s contract. Liskov substitution has been violated.
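For contrast, here is a sketch of a child class that loosens its parent’s contract instead of tightening it, assuming PHP 7.4 or later, where a child method may widen a parameter type to a parent type. Garage and PublicGarage are hypothetical classes added for illustration. It is worth knowing that PHP does not check constructors for signature compatibility, which is why a tightened constructor goes unnoticed by the language, whereas tightening the parameter of an ordinary method like park() is rejected with a fatal error.

```php
<?php
class Car {}

class Ferrari extends Car {}

class Mini extends Car {}

// Hypothetical parent: only willing to park Ferraris.
class Garage
{
    public function park(Ferrari $car): string
    {
        return 'Parked a ' . get_class($car);
    }
}

class PublicGarage extends Garage
{
    // Looser than the parent: accepts any Car, so every call that
    // was valid for Garage is still valid here.
    public function park(Car $car): string
    {
        return 'Parked a ' . get_class($car);
    }
}

// Client code written against the parent contract.
function parkFerrari(Garage $garage): string
{
    return $garage->park(new Ferrari());
}

echo parkFerrari(new Garage()), PHP_EOL;
echo parkFerrari(new PublicGarage()), PHP_EOL;
```

The client function works with either garage because the child accepts everything the parent accepted, plus more.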

Exceptions

When throwing exceptions inside a child class, the exception thrown should always be the same as, or more specific than, the exception thrown by the parent class. That’s the other way round to method signatures, but it makes sense here. Let’s create 2 new exception classes: CarException, which extends Exception, and FerrariException, which extends CarException.

class CarException extends Exception {}

class FerrariException extends CarException {}

Let’s assume that inside the Car class we created a drive() method that throws a CarException when the gearstick is in neutral, and that inside the Ferrari class we throw a FerrariException in the same situation. This would be perfectly valid and would not violate the Liskov substitution principle. When calling the Car class’s drive() method, the client knows it might have to handle a CarException, and a catch for CarException will automatically catch all child exceptions like FerrariException. However, if the Ferrari class’s drive() method were to throw a basic Exception, this would violate the Liskov substitution principle, because the client’s catch for CarException would no longer cover it and the Car class’s contract towards exceptions would have changed.

P.S. No sexist notion that men drive Ferraris and women drive Minis was intended.
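The exception rule can be sketched as follows, assuming PHP 7.4 or later. The $inNeutral constructor flag and the tryToDrive() function are hypothetical details added so the example runs on its own.

```php
<?php
class CarException extends Exception {}

class FerrariException extends CarException {}

class Car
{
    protected bool $inNeutral;

    public function __construct(bool $inNeutral = true)
    {
        $this->inNeutral = $inNeutral;
    }

    public function drive(): string
    {
        if ($this->inNeutral) {
            throw new CarException('Car is in neutral');
        }
        return 'Driving';
    }
}

class Ferrari extends Car
{
    public function drive(): string
    {
        if ($this->inNeutral) {
            // More specific than the parent's exception, so clients
            // catching CarException still handle it.
            throw new FerrariException('Ferrari is in neutral');
        }
        return 'Driving fast';
    }
}

// Client code written against the Car contract only.
function tryToDrive(Car $car): string
{
    try {
        return $car->drive();
    } catch (CarException $e) {
        return 'Could not drive: ' . $e->getMessage();
    }
}

echo tryToDrive(new Car()), PHP_EOL;
echo tryToDrive(new Ferrari()), PHP_EOL;
```

If Ferrari threw a plain Exception instead, the catch block in tryToDrive() would not match and the exception would escape, which is exactly the broken contract described above.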

Interface Segregation

“Many client-specific interfaces are better than one general-purpose interface”

Interface segregation is an extremely simple principle and complements the single responsibility principle perfectly. Where single responsibility focuses on building small but highly cohesive software components, interface segregation focuses on building interfaces which are small, focused and highly cohesive. It simply states that no client should ever be forced to implement an interface it does not need. More specifically, this means that no class should ever be required to implement a method it doesn’t need just because of an interface the class implements. The solution is pretty simple: make your interfaces smaller and have your class implement as many interfaces as it needs. There is no harm in having a class which implements multiple interfaces, and interfaces which contain a single method are absolutely essential in many situations.

As an example, imagine you had 2 classes, Eagle and Ostrich, which both implement an interface called Bird that enforces the definition of certain methods like eat, sleep and fly. That all sounds great, until you realise that an Ostrich may be a bird, but it can’t fly. So what do you put in the Ostrich class’s fly method? You’re forced to return a null or falsey value, as it’s an invalid method call. This obviously isn’t great and violates the interface segregation principle.

The correct solution would be to define two interfaces: Bird, which contains the eat and sleep methods, and FlyableInterface, which contains the fly method. This way you can have both the Eagle and Ostrich classes implement the Bird interface, but also have the Eagle class implement the FlyableInterface. Doing this ensures that your Ostrich class is not forced to implement an interface it does not need, so it now complies with the interface segregation principle.

Here’s an example of this in PHP

interface Bird {
    public function eat();
    public function sleep();
}

interface FlyableInterface {
    public function fly();
}

class Eagle implements Bird, FlyableInterface {
    public function eat() { return 'Eagle Eating'; }
    public function sleep() { return 'Eagle Sleeping'; }
    public function fly() { return 'Eagle Flying'; }
}

class Ostrich implements Bird {
    public function eat() { return 'Ostrich Eating'; }
    public function sleep() { return 'Ostrich Sleeping'; }
}

Interfaces containing methods that not every implementation will need are called fat interfaces, and these not only violate interface segregation, they also violate single responsibility. By defining tiny reusable interfaces, we are able to type hint arguments against a very specific set of behaviours.

Let’s say, for example, we have a Mailer class which sends emails but requires an object which contains a method called getEmail; this could be a User class. We could type hint the Mailer class to only accept instances of the User class, which would ensure a getEmail method will always exist. Great. But what if we want to start sending emails to addresses stored in other classes? We might create a class called NightlyMonitor, which also contains a getEmail method. This new NightlyMonitor class fits the requirements to be used in the Mailer class, except that it’s not a User, so how do we allow the Mailer class to accept instances of both User and NightlyMonitor? The answer is a tiny interface, let’s call it MailableInterface, which defines the requirement of a single method: getEmail. After that, make sure the User and NightlyMonitor classes implement that interface. Now in the Mailer class, we can require an instance of MailableInterface instead of an instance of User. From then on, any class you create that has an email address you want to send emails to can implement MailableInterface and it’s ready for action.

Hopefully that isn’t too fuzzy, here’s that exact example in PHP.

interface MailableInterface {
    public function getEmail();
}
interface MailerInterface {
    public function send(MailableInterface $mailable);
}
class User implements MailableInterface {
    public function getEmail() { return 'top@secret.com'; }
}
class NightlyMonitor implements MailableInterface {
    public function getEmail() { return 'eventopper@secret.com'; }
}
class Mailer implements MailerInterface {
    public function send(MailableInterface $mailable) {
        // The recipient address comes from whatever mailable was passed in.
        $email = $mailable->getEmail();
    }
}

It’s a pretty basic example, but as you can see, the Mailer class now accepts instances of both the User and NightlyMonitor classes. Your Mailer class is now a lot more reusable and no longer depends upon a concrete User. All it cares about is that a getEmail method exists, so it has somewhere to send its mail to.

Dependency Inversion

“Depend upon Abstractions. Do not depend upon concretions. Dependency injection is one method of following this principle.”

Last but certainly not least, dependency inversion. A common belief is that dependency inversion and dependency injection mean the same thing, but that’s not true. A more accurate description would be that dependency inversion can be achieved using dependency injection. That’s not to say that dependency inversion cannot be achieved using other approaches though.

The golden rule to achieving dependency inversion goes a little like this.

High level modules should never depend upon low level modules, instead they should depend upon abstractions. Low level modules should also depend upon these same abstractions.

So what are high and low level modules? You can think of high level modules as the classes which say “I want to perform task xyz” but have no idea what’s involved in achieving it, so they delegate the responsibility to a low level module. Low level modules are the clever clogs classes of your application; they’re the classes that know exactly how to do what the high level classes ask of them.

You can think of a high level class like your car, which can perform behaviours like drive. Then you can think of a low level class like your car’s engine, which can perform tasks like air and petrol intake, spark plug ignition and fume releasing. Your car has no idea how to perform these tasks, so it simply delegates the responsibility to the engine, aka the clever clogs, which knows exactly what to do.

So how does the high level class (Car) consume the low level class (Engine)? The answer is dependency injection. When instantiating the Car class, the Engine class can simply be provided as a dependency. That’s pretty simple and was explained in the open-closed principle. But what is meant by high and low level modules depending on the same abstraction?

Just like all of the SOLID principles, what this is referring to, is programming to an interface. By making sure that the Engine class implements an EngineInterface and that the Car class can be passed any instance of EngineInterface not just Engine, our high and low level modules now depend upon the same abstraction.
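A minimal sketch of the Car and Engine example, assuming PHP 7.4 or later. The start() method and the ElectricEngine class are hypothetical details added for illustration; only Car, Engine and EngineInterface come from the description above.

```php
<?php
// The abstraction both the high and low level modules depend on.
interface EngineInterface
{
    public function start(): string;
}

// Low level module: knows exactly how to do the work.
class Engine implements EngineInterface
{
    public function start(): string
    {
        return 'Petrol engine started';
    }
}

// A second, hypothetical low level module.
class ElectricEngine implements EngineInterface
{
    public function start(): string
    {
        return 'Electric motor started';
    }
}

// High level module: depends on the interface, never on a
// concrete engine.
class Car
{
    private EngineInterface $engine;

    public function __construct(EngineInterface $engine)
    {
        $this->engine = $engine;
    }

    public function drive(): string
    {
        return $this->engine->start() . ', car is driving';
    }
}

echo (new Car(new Engine()))->drive(), PHP_EOL;
echo (new Car(new ElectricEngine()))->drive(), PHP_EOL;
```

Car names only EngineInterface in its constructor, so swapping the engine is a decision made by whoever builds the car, not by the Car class itself.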

That really is the crux of dependency inversion. Although it has separate goals to the open-closed principle, they’re both achieved using exactly the same approach. Just like with the open-closed principle, you reap all the same rewards: decoupled modules, changes that are easier and less spaghetti-like, and code that is easy to test.

Conclusion

That’s one of my most meaty blog posts to date, sorry about that. You probably got bored of me repeating myself so often, but it’s not without good reason. All of the SOLID principles are very tightly coupled and they really do all boil down to one thing – program to an interface.

I’d go as far as saying that’s the most important aspect of OOP, when trying to keep a healthy, maintainable, easy to extend and human friendly application.

Thanks for reading.

Simon Jakowicz

Just another blogger