Notes about Chapter 02 of Web Scalability For Startup Engineers
Based on my understanding of this chapter, the author is expressing the idea that sometimes the need to scale forces us to break good design principles. However, it’s important to first understand these principles.
Some Good Software Design Principles:
This is a list of the good design principles that a software engineer should understand:
Simplicity:
Keep it simple, but not too simple. How simple should we make things? We need to consider for whom we are designing and the delivery deadline.
Simplicity is not about how quickly we can implement a solution, but about how easy it is for another software engineer to use our solution and to understand the system as it grows larger and more complex.
To achieve this, a software engineer needs experience with different tools and languages, and also needs to revisit old solutions, review them, and try to improve them. Finding a mentor or working with people who value these principles will help us progress faster.
Below are the steps to make our solutions simpler:
- Hide complexity and build abstractions: we should aim for local simplicity, so that when we look at any single piece it is easy to understand how it works, and the same holds when we zoom out to modules and then to the entire app. Complexity is about how many dependencies a single component (class, module, …) has on other components. Once we start seeing the bigger picture, we need to worry about how our components interact rather than how they fulfil their duties. In larger systems we can add services, each responsible for a specific functionality and exposing a higher level of abstraction.
- Avoid overengineering: do not spend time building overcomplicated, imaginary designs that no one will use. We should care about simplicity and the most common scenarios, think about tradeoffs, and ask whether we will really need this.
- Try TDD: it lets us reduce the amount of useless functionality, and the tests act as documentation for our code by showing the expected results and behavior. TDD also forces a mental shift: we start by thinking about how a component is going to be used by other components, rather than by implementing its internal logic.
- Learn from models of simplicity: study other well-designed software and learn from its design.
Loose Coupling:
We should keep coupling between components as low as possible. Coupling is a measure of how much two components know about and depend on each other: the less coupled the components, the less they know about and depend on each other, and no coupling means that two components do not know about each other at all.
Keeping coupling low is important for our ability to scale:
- The higher the coupling, the higher the number of changes and bugs we might introduce into other components that depend on the internal implementation of a specific component.
- We will be able to hire more engineers, as they don’t have to know the full details of the system to work on specific parts of it.
Promoting Loose Coupling: we should carefully manage dependencies between modules, classes, and applications.
Applications are the highest level of abstraction; you might have an application for accounting, asset management, or file storage.
Each application contains a set of modules (for example: PDF rendering, credit card processing, portal document) that another team member can work on independently. If we cannot do this, the application might have some tight coupling problems.
Each module consists of classes; a class is the smallest unit of abstraction. We should keep our methods private or protected as much as possible: the less other classes know about our class, the less they depend on how the class does its job. Private methods can be refactored easily because they are called only within the class; before refactoring protected or public methods, we should search the codebase for their callers.
We should share only the minimum information and functionality that satisfies the requirements; sharing too much too early increases coupling.
How to avoid unnecessary coupling:
Hide as much as we can and avoid adding getters and setters unless we need them, as getters and setters were introduced mainly to provide good IDE support.
Avoid designs where the client needs to call the methods of a class in a certain order for the work to be done.
Do not allow circular dependencies between layers, classes, and modules; in diagrams, the relations between our components should look more like a directed acyclic graph than a graph with cycles.
Models of loose coupling:
A good example of loose coupling is Unix command-line programs, where commands like grep, sort, and awk can be combined to perform more complex tasks.
Simple Logging Facade for Java (SLF4J): it acts as a layer that hides the complexity of the logging implementation from its users.
Read books that discuss this subject.
DRY:
Things that we should avoid to ensure that we are applying the DRY (Don’t Repeat Yourself) principle:
- Following an inefficient process: we should always try to get feedback, apply continuous improvement and incremental change, and repeat. We should not have the mentality of “we always did it this way” or “this is how we do it”.
- Lack of automation: wasting time deploying manually, configuring servers, writing documentation, and testing. These tasks can be simple at the start, but they become hard and time-consuming as the software gets more complex.
- Not invented here (reinventing the wheel): building things that already exist, which wastes our time.
- Copy-paste programming: when code that does what we need already exists in another part of the system, we just copy it and use it. We will then face problems such as the same bug occurring in multiple places. We can adopt a rule that we never copy and paste code.
- “I won’t need it again, so let’s just hack it quickly”: we might rush through a piece of code or a project thinking that we will never need to go back to it, but then a problem occurs and we have to work on it again. We will find messy, inefficient, untested, and unmaintainable code waiting for us. We should practice refactoring, inheritance, composition, and design patterns.
A white paper by NASA shows that 10 to 25 percent of the code in large systems is a result of copy-paste programming. At a higher level of abstraction, we can create a common service that can be used by other services across the system.
If a library or a component is easy to use, everyone will use it; if not, they will not, and we might end up with duplication or hacks.
Coding To Contract
Coding to contract, or coding to interface, means deciding what the client is allowed to see and exposing only what the client needs.
Contract: a set of rules that the provider of a functionality agrees to fulfill and that the client depends on, without knowing how this functionality is implemented. As long as we keep the contract intact, clients and providers can be modified independently.
When designing the code we should create explicit contracts, and we should depend on the contract instead of implementation details whenever possible.
We should think of the contract as a legal document, and in a legal document we should be detail oriented: if our contract does not cover exactly what we want (in the case of software, if our contract exposes too many details), we need to renegotiate every change with our clients.
When we start building a system, we should first define what features our clients need and then expose the minimum details required to achieve what they want.
HTTP is a good example of coding to contract because it allows different applications to communicate through a specific interface (decoupling). Web browsers, cache servers (Varnish), and web servers (nginx, Apache) can communicate with each other because they depend on the same contract (an example of this can be found in Figure 2.5).
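To make the idea concrete, here is a minimal sketch of coding to contract in Python. The names DocumentStore, InMemoryStore, and archive_report are hypothetical and chosen only for illustration; they do not come from the book. The client function depends only on the abstract contract, so the provider could be swapped without touching the client.

```python
from abc import ABC, abstractmethod


class DocumentStore(ABC):
    """Hypothetical contract: the only thing clients are allowed to see."""

    @abstractmethod
    def save(self, name, content):
        ...

    @abstractmethod
    def load(self, name):
        ...


class InMemoryStore(DocumentStore):
    """One possible provider; how it stores data is hidden from clients."""

    def __init__(self):
        self._docs = {}

    def save(self, name, content):
        self._docs[name] = content

    def load(self, name):
        return self._docs[name]


def archive_report(store, text):
    # Client code: relies on the DocumentStore contract, never on InMemoryStore.
    store.save("report", text)
    return store.load("report")
```

A provider backed by files or a database could replace InMemoryStore later, and archive_report would keep working as long as the contract stays intact.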
Draw Diagrams
A diagram is worth a thousand words; even when we don’t have much time, we should take the time to design the architecture.
If it is difficult for us to draw diagrams we can follow this approach:
- Draw diagrams of what we have already built, until we get comfortable with diagrams
- Then start drawing diagrams while coding and working on certain features
- Then start trying to do up-front design (design first, code last)
For example, we might want to design a circuit breaker component. Circuit breaker is a design pattern that prevents our system from failing by first checking whether a certain system is available before performing an action.
So we can do the following to design it:
1- Create a draft of the interface (Listing 2.1)
2- Draft the client code; it can be a unit test or just some client code that does not have to compile (Listing 2.2)
3- Draw a draft of the sequence diagram
4- Draw a draft of the class diagram
With this approach we can see the design from different angles and avoid producing an unrealistic design.
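As an illustration of the first two steps, here is a rough Python sketch of a circuit breaker and some client code that uses it. This is not the book's Listing 2.1 or 2.2; the class name, method names, thresholds, and the fetch_price client are all assumptions made for this note.

```python
import time


class CircuitBreaker:
    """Hypothetical breaker: stop calling a failing dependency for a while."""

    def __init__(self, max_failures=3, reset_after=30.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.reset_after = reset_after  # seconds to wait before a trial call
        self._clock = clock
        self._failures = 0
        self._opened_at = None

    def is_available(self):
        if self._opened_at is None:
            return True
        # After the cool-down period, allow a trial call again.
        return self._clock() - self._opened_at >= self.reset_after

    def report_success(self):
        self._failures = 0
        self._opened_at = None

    def report_failure(self):
        self._failures += 1
        if self._failures >= self.max_failures:
            self._opened_at = self._clock()


def fetch_price(breaker, call_remote, fallback=None):
    # Draft client code: check the breaker before touching the remote system.
    if not breaker.is_available():
        return fallback
    try:
        result = call_remote()
    except Exception:
        breaker.report_failure()
        return fallback
    breaker.report_success()
    return result
```

Writing the client code next to the interface like this makes it easier to notice early whether the interface is awkward to use.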
We have three very important diagrams: use case diagrams, class diagrams, and module diagrams.
Use case diagrams:
They show the users of the system and the operations they perform, which captures the business requirements. They can also show interactions with other systems, such as APIs or a task scheduler.
We should keep them simple so that they stay readable and maintainable.
Class diagram:
Class diagrams are the best way to show coupling between classes: we simply look at how many dependencies a node has. They also show the module structure and the relations between its classes and interfaces.
Interfaces should always depend on interfaces, never on concrete classes.
Classes on the other hand should depend on interfaces as much as possible.
Module diagram:
A module diagram is a zoomed-out version of a class diagram that shows the interactions between modules. A module can be a package or any logical part responsible for a certain functionality.
A module diagram focuses on the modules that are relevant to the functionality we want to document. When a system grows larger, it is better to create a few separate diagrams to keep each one simple, easy to remember, and easy to recreate.
Single Responsibility
Each class should have one responsibility and no more; this reduces complexity and keeps the code simple.
Some guidelines to promote the single responsibility:
Keep classes below two to four screens of code.
Ensure that a class does not depend on more than 5 classes or interfaces.
Ensure that a class has a specific goal and purpose.
Summarize the responsibility of the class in a comment at the top of the class; if we find it hard to summarize, it means we are breaking the rule.
An example of this is adding an email validation feature to our software. We could add the validation to the code that creates the user, but this would make that code more complex and we would not be able to reuse the validation elsewhere; moving the validation logic into a separate class solves this.
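A small Python sketch of this separation could look like the following. The class names EmailValidator and UserService are hypothetical, and the regular expression is deliberately simplistic; it is only an assumption for the sake of the example, not a complete email check.

```python
import re


class EmailValidator:
    """Single responsibility: decide whether an email address looks valid."""

    # Simplistic pattern used only for illustration.
    _PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")

    def is_valid(self, email):
        return bool(self._PATTERN.match(email))


class UserService:
    """Single responsibility: create users; validation is delegated."""

    def __init__(self, validator):
        self._validator = validator
        self._users = []

    def create_user(self, email):
        if not self._validator.is_valid(email):
            raise ValueError(f"invalid email: {email}")
        self._users.append(email)
        return email
```

Because the validator lives in its own class, other parts of the system (for example a newsletter signup form) could reuse it without pulling in the user creation code.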
A good way to learn more about this is to explore design patterns (strategy, iterator, proxy, and adapter) and to learn more about domain-driven design.
Open Closed Principle:
Every time we write code with the intent of extending it later rather than modifying it, we are using this principle: classes should be open for extension and closed for modification.
The main reason is to increase flexibility and make future changes cheaper.
An example of this is when we have to implement a sorting algorithm for a feature that sorts employees, and we implement the solution inside a class called SortingEmployees with a sort method. This will cause problems if we want to do the same thing for cities, and we will be left with two dirty solutions: either extend SortingEmployees, although sorting cities should not have to know about SortingEmployees, or copy the code from the SortingEmployees class and paste it into SortingCities.
A solution here is to break the problem into smaller ones by creating a Sorter class and a Comparator interface. A new class such as EmployeeComparator implements the Comparator interface, and the Sorter receives a Comparator instance and uses it to compare items, so sorting cities only requires adding a new comparator.
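One way to sketch this split in Python is shown below. In this sketch the sorter is handed a comparator and knows nothing about what it sorts. The class names and the insertion sort are assumptions chosen to keep the example short; they are not taken from the book's listings.

```python
class Employee:
    def __init__(self, name, salary):
        self.name = name
        self.salary = salary


class Comparator:
    """Contract: return a negative, zero, or positive number."""

    def compare(self, a, b):
        raise NotImplementedError


class EmployeeBySalary(Comparator):
    def compare(self, a, b):
        return a.salary - b.salary


class Sorter:
    """Closed for modification: knows nothing about employees or cities."""

    def sort(self, items, comparator):
        result = list(items)
        # Simple insertion sort driven entirely by the comparator.
        for i in range(1, len(result)):
            current = result[i]
            j = i - 1
            while j >= 0 and comparator.compare(result[j], current) > 0:
                result[j + 1] = result[j]
                j -= 1
            result[j + 1] = current
        return result
```

Sorting cities would then mean adding one more Comparator subclass, while Sorter and EmployeeBySalary stay untouched.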
An MVC framework is a good example of this, especially the Spring framework. If a framework is well designed, you don’t have to modify the framework code to implement features; you just extend a component or create a new one based on it. In Spring, most classes do not even have to know about the existence of the Spring MVC framework.
Dependency injection
Dependency injection reduces coupling and promotes the open-closed principle.
It provides references to the objects that the class depends on, so the class does not need to know the implementation details of those objects or how they are assembled.
Dependency injection switches from letting a class create or look up its own dependencies (a pull approach) to passing the objects directly into the class (a push approach), which decouples the class from its dependencies and makes it easier to test.
To understand this better, the example of a CD player and a CD can be used; Figures 2.13 and 2.14 show examples of this.
Dependency injection reduces the responsibilities of the class, making it dumber and simpler.
Knowing only the contract of the injected object, and not how it is created or assembled, the class can focus on its own responsibility.
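A minimal Python sketch of the push approach, loosely following the CD example, might look like this. The class names CompactDisc and CdPlayer and the play method are assumptions made for this note; the point is only that the player receives its disc from outside instead of creating it.

```python
class CompactDisc:
    def __init__(self, tracks):
        self.tracks = tracks


class CdPlayer:
    """The player is handed its disc; it never builds or locates one itself."""

    def __init__(self, disc):
        self._disc = disc  # dependency pushed in from outside

    def play(self):
        return [f"playing {track}" for track in self._disc.tracks]
```

In a test we can inject any object that exposes a tracks attribute, which is what makes the class easy to test in isolation.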
Inversion of control
DI is one application of this larger principle, which can be applied everywhere, at all levels of abstraction.
IOC removes some responsibilities from the class to make it simpler and less coupled to other parts of the system.
You don’t have to know who will create or use your objects, how, or when.
It is used in a lot of frameworks: an IOC framework looks at requests and figures out which classes should be instantiated and which services and components they depend on (requests contain data like the URL, headers, and cookies that the IOC framework will use).
It is also called the “don’t call us, we will call you” principle: the classes do not have to know who is using them, when their instances are created, or how their dependencies are put together, so the classes become like plugins.
Using frameworks will reduce the local complexity of our app.
The characteristics of an IOC framework (Figure 2.16):
- You can create plugins for your framework
- Each plugin is independent and can be removed or added at any time
- The framework can auto-detect these plugins, or there is a way to configure which plugins should be used
- Your framework defines the interface for each plugin and should not be coupled to the plugins themselves.
An IOC framework is like having fish in a tank: you decide how many fish to keep there and when to feed them. The fish are the plugins and you are the IOC framework.
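The following toy example sketches the idea in Python. MiniFramework, its route decorator, and the dictionary-based request are all hypothetical and far simpler than a real web framework; the sketch only shows that the framework owns the control flow and calls the plugin, never the other way around.

```python
class MiniFramework:
    """Hypothetical framework: it owns the control flow and calls the plugins."""

    def __init__(self):
        self._handlers = {}

    def route(self, path):
        # Decorator used to register a plugin for a URL path.
        def register(handler):
            self._handlers[path] = handler
            return handler
        return register

    def handle(self, request):
        # The framework decides which plugin runs and when.
        handler = self._handlers.get(request["path"])
        if handler is None:
            return "404 not found"
        return handler(request)


app = MiniFramework()


@app.route("/hello")
def hello(request):
    # Plugin code: never calls the framework, only waits to be called.
    return f"hello {request.get('user', 'guest')}"
```

Here hello plays the role of the fish: it can be added or removed without changing MiniFramework, and it does not know when or why it will be called.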
Design for scale
This is a difficult thing to master; we should be careful to strike a balance between designing for scale and overengineering.
Most startups fail and never need to scale (90%); another 9% will never really need horizontal scalability; only the remaining 1% will.
Similar to the coupling and complexity principles discussed above, solutions to scalability problems can be categorized into:
- Adding more clones: adding indistinguishable components
- Functional partitioning: Dividing the system into smaller subsystems based on functionality
- Data partitioning: keeping a subset of the data in each machine
Adding more clones: While building a system, the easiest way to scale it is to design it so that we can add more clones. A clone is an exact copy of an existing server or component: if we send a request to a randomly chosen clone, we should get the same result (Figures 2.17 and 2.18).
We need to pay attention to where we keep state and sync it between the servers.
Scaling with clones works best for stateless servers or services (services that do not have any local state).
The problem with this scaling technique is syncing data between stateful services.
Functional partitioning: The main idea is to look for the parts that work on the same functionality and create a separate subsystem out of them.
In terms of infrastructure, it is the separation of our data center into multiple server types, for example: message queue servers, cache servers, web servers, load balancers…
It divides the system into independent services, which helps us apply the coding to contract principle. It is often used in the web services layer and is one of the foundations of service-oriented architecture. This strategy also allows us to analyze the needs of each service independently and scale the services separately.
It is also common to break the app into a database service layer and web service layer.
It is common in large companies to separate the app into smaller independent services, where each team can work on a service separately, analyze it, and try to scale it.
The drawbacks of this approach are that it requires more management and effort to start with, and that we cannot keep rewriting our system and dividing it endlessly. It also might not solve our scalability problem, since there might be other problems, like architecture or optimization issues.
Data partitioning: This is a manifestation of the share-nothing principle, where each server has its own subset of the data and has complete control of its internal state without the need to sync state with others. There is no need for locking (locking is needed when multiple servers try to access and modify the same resources, where we must handle concurrency to avoid data corruption).
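A simple way to picture data partitioning is a store that routes each key to exactly one partition. The sketch below is an assumption made for this note: the names shard_for and ShardedStore are hypothetical, and hashing the key modulo the number of shards is just one possible routing scheme, not necessarily the one the chapter describes.

```python
import hashlib


def shard_for(key, shard_count):
    """Map a key to a shard index using a stable hash of the key."""
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % shard_count


class ShardedStore:
    """Each shard owns its own subset of the data; shards share nothing."""

    def __init__(self, shard_count):
        # In a real system each dictionary would be a separate machine.
        self._shards = [{} for _ in range(shard_count)]

    def put(self, key, value):
        self._shards[shard_for(key, len(self._shards))][key] = value

    def get(self, key):
        return self._shards[shard_for(key, len(self._shards))].get(key)
```

Since a given key always lands on the same shard, no two shards ever hold the same record, so there is nothing to synchronize or lock between them. One known limitation of plain modulo routing is that changing the number of shards moves most keys to a different shard.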
Design for self healing
Because our systems might fail at any moment, we should design them with high availability and self-healing in mind.
We want to make our system always available for our users, even when experiencing partial failure or during maintenance (a system is considered available as long as it performs its functions as expected from the client’s perspective).
There is no single absolute measurement of availability, but it is commonly expressed as a number of nines. If a system is available 99% of the time, it is going to be down for 365 days * 0.01 = 3.65 days a year, while 99.999% availability means it is unavailable for only about 5 minutes a year.
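The arithmetic behind the nines can be checked with a few lines of Python. The function name downtime_per_year is made up for this note, and a 365-day year is assumed.

```python
def downtime_per_year(availability_percent, days_per_year=365):
    """Return the yearly downtime in minutes for a given availability."""
    unavailable_fraction = 1 - availability_percent / 100
    return days_per_year * 24 * 60 * unavailable_fraction
```

With this, 99% availability gives about 5,256 minutes of downtime per year (3.65 days), and 99.999% gives about 5.26 minutes per year.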
The larger the system gets, the higher the chance of failure, because we will be communicating with more services, data stores, and components. Failure needs to be considered the norm during design, not a special condition.
Netflix uses a system called Chaos Monkey, which causes some components to fail so that the team can test the availability of their software.
Crash-only: whenever a system fails and then starts operating again, it must be able to detect the failure and fix the broken data.
To ensure high availability of a system, we should remove single points of failure and ensure graceful failover, which means that our system should switch to a backup/clone component without impacting the user or causing loss or corruption of data.
We can draw a diagram of all our system components and ask ourselves what would happen if we shut down one service. We can then discuss the possibility of adding redundancy and see whether it is going to be cheap or not. We should also prepare a disaster recovery plan.
Once we have achieved a high level of availability and graceful failover, we can start thinking about designing a self-healing system, which can fix its issues without the need for human intervention. This is hard and expensive to build.
An example of a self-healing system is Cassandra. When a node fails, requests that are coming to this node stop being served (this is the only stage where users might experience some failure or downtime). Once the node is detected as failed, clients continue reading data from other nodes in the cluster that provide redundancy for the failed node. When the failed node is back to work, the system automatically provides it with the missing data.
Mean time to recovery measures how fast we can detect, repair, and recover from a failure. The lower the mean time to recovery, the higher the availability of our system, which can be calculated with this equation: availability = mean time to failure / (mean time to failure + mean time to recovery).
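The availability equation can be tried out with a small Python helper. The function name and the sample numbers below are assumptions for illustration only.

```python
def availability(mttf_hours, mttr_hours):
    """Availability = MTTF / (MTTF + MTTR), as a fraction between 0 and 1."""
    return mttf_hours / (mttf_hours + mttr_hours)
```

For example, a system that runs 999 hours between failures and takes 1 hour to recover is available 99.9% of the time; if recovery took 10 hours instead, availability would drop to roughly 99.0%, which shows why a short recovery time matters.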
Summary:
The cleanest solution is not always the best solution for the business if it costs more time and money and needs more management. We need to think and choose the best possible solution for the business, making tradeoffs in terms of scalability, flexibility, high availability, cost, and time to market.
Don’t hesitate to challenge the rules, but it’s essential to understand the tools, basics, and principles of our craft first. This way, we can make informed decisions and balance tradeoffs effectively.