Notes About Web Scalability For Startup Engineers: Components of the scalable frontend
The components of the frontend are responsible for rendering the UI and handling the connections initiated by users.
DNS
The first component the client talks to when visiting a website; DNS is used to map a domain name to an IP address.
In most cases, using a third-party DNS provider is the best solution; but if, for example, we are a web hosting company, hosting our own DNS servers is the better choice to gain flexibility and save money.
On Amazon we can use Route 53, which integrates well with other Amazon services. Route 53 can be configured to use latency-based routing to redirect each user to the closest data center; compared to GeoDNS, it relies on latency measurements that account for network congestion, outages, and network patterns.
Network congestion: a network overwhelmed by too much data traffic, which can cause delays.
Network patterns: the path a packet takes to reach its destination.
Load balancers
Once the client has been mapped to the right server by DNS, a load balancer plays a key role in keeping our service available while we change hardware, scale out, or add more servers.
In the old days, DNS itself was used to distribute traffic between servers, because load balancers were uncommon and expensive. With round-robin DNS, the client asks the DNS server for one of several IP addresses and then connects directly to one of the servers; adding or removing a server requires a DNS change. This approach has a lot of downsides: removing a server is hard because clients may have its IP address cached and keep connecting to it directly, and after adding a new server clients will keep connecting to the old ones for as long as the time-to-live policy allows. Using this strategy in production is therefore not preferred.
TTL: in DNS, how long a record is considered valid before it must be refreshed or discarded; other servers treat the record as valid until then and only afterwards ask the DNS server to refresh it. TTL helps reduce the number of requests a DNS server receives.
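The round-robin DNS setup and TTL described above can be sketched as a zone snippet (the domain, addresses, and 300-second TTL are made-up examples, not from the book):

```
; Hypothetical BIND-style zone snippet: round-robin DNS over three servers.
; Resolvers rotate between the A records; each answer may be cached for up
; to 300 seconds, which is why removing a server propagates slowly.
www.example.com.  300  IN  A  203.0.113.10
www.example.com.  300  IN  A  203.0.113.11
www.example.com.  300  IN  A  203.0.113.12
```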
DNS resolver: receives DNS queries from clients (e.g., browsers); if it does not have the IP address cached, it recursively queries authoritative DNS servers to find it, and caches IP addresses according to their TTL values.
Authoritative DNS: stores the DNS records (A, CNAME, AAAA, MX, …) for the domains it is authoritative for, managing the mapping of names to IP addresses and other DNS records.
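The resolver behavior above (serve from cache while the TTL is valid, otherwise query upstream) can be sketched in a few lines of Python; the `upstream` callable stands in for the recursive lookup against authoritative servers and is an assumption of this sketch, not a real DNS API:

```python
import time


class CachingResolver:
    """Toy sketch of a DNS resolver cache that honors TTL (illustrative only)."""

    def __init__(self, upstream):
        self.upstream = upstream   # callable: hostname -> (ip, ttl_seconds)
        self.cache = {}            # hostname -> (ip, expires_at)

    def resolve(self, hostname, now=None):
        now = time.time() if now is None else now
        entry = self.cache.get(hostname)
        if entry and now < entry[1]:
            return entry[0]                      # cache hit: record still valid
        ip, ttl = self.upstream(hostname)        # miss/expired: ask upstream
        self.cache[hostname] = (ip, now + ttl)   # cache according to TTL
        return ip
```

Until the TTL expires, repeated lookups never reach the upstream server, which is exactly how TTL reduces the load on DNS servers.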
We can put a load balancer between the clients and our servers to hide the data center's structure and responsibilities. It also brings the following benefits:
- Hidden server maintenance: we first stop the load balancer from routing requests to a server (remove it from the load balancer pool), wait for the server's active connections to finish (connection draining), and then deploy new software without affecting clients. Doing this one server at a time and returning each server to the pool afterwards is called a rolling update.
- Seamless capacity increase: new servers can be added without clients noticing any connection delay or interruption, which is better than round-robin DNS.
- Efficient failure management: if a server fails, we remove it from the load balancer pool so connections stop being routed to it.
- Automated scaling: as on cloud platforms, servers are added and removed continuously to adapt to current usage and traffic without affecting users.
- Effective resource management: for example, we can use an SSL offloading layer that performs encryption and decryption in the load balancer and use unencrypted connections internally.
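The pool operations behind these benefits (round-robin dispatch, removing a server before maintenance, adding a new one) can be sketched as a minimal Python class; real load balancers also track health checks and connection draining, which this sketch omits:

```python
class LoadBalancerPool:
    """Minimal sketch of a load balancer pool: round-robin dispatch plus
    removing/adding servers, the building blocks of a rolling update."""

    def __init__(self, servers):
        self.servers = list(servers)
        self._index = 0

    def next_server(self):
        # Round-robin over the currently active servers.
        server = self.servers[self._index % len(self.servers)]
        self._index += 1
        return server

    def remove(self, server):
        # Stop routing new requests to this server; in a real load
        # balancer, existing connections would be drained first.
        self.servers.remove(server)

    def add(self, server):
        self.servers.append(server)
```

A rolling update is then just: `remove(s)`, wait for draining, deploy on `s`, `add(s)`, repeated for each server in turn.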
There are three popular types of load balancer to choose from, depending on the complexity of the system:
Load balancer as a hosted service:
If we are hosting with a cloud provider like Amazon EC2 or Azure, it is recommended to use their load balancer. ELB is Amazon's load-balancer-as-a-service: all we need to do is point it at a bunch of EC2 instances and it does the rest. Some benefits of ELB:
- ELB is simple and cheap; we have one less component to manage and scale.
- ELB scales transparently: it can scale up and down without any manual intervention.
- ELB is highly available; if we set up our own load balancer, we need a hot-standby load balancer as backup in case the main one fails.
- With ELB we pay as we go; there is no charge for setting up an instance.
- ELB allows auto scaling by automatically replacing EC2 instances when one fails.
- ELB can perform SSL termination (offloading): it handles SSL encryption/decryption, and requests from ELB to the EC2 instances are plain HTTP, so we do not need to run an SSL web server at all.
- ELB supports graceful backend-server termination: we can remove a server from the load balancer pool, wait for all client connections to close, and then shut it down.
- ELB is supported by the Amazon SDK, so load balancer configuration can be automated.
ELB might not be suitable for our app if traffic spikes require doubling capacity in a matter of seconds; ELB needs some seconds or minutes to scale up.
Some cloud providers also let us use the load balancer internally: for example, we can host our frontend and backend servers with the same provider, put a load balancer between them, and get all the benefits mentioned above.
Self-managed software-based load balancer:
If our cloud provider's load balancer does not fit our needs, we can go for a software-based load balancer like Nginx or HAProxy (or other alternatives).
- Nginx is also a reverse proxy, so we can benefit from its caching of HTTP requests and more.
- HAProxy is purely a load balancer and can operate at layer 4 or layer 7, so it can balance not only HTTP/HTTPS but other protocols too. Used as a layer 4 load balancer, it does not inspect higher-level protocols and relies only on TCP/IP headers to distribute traffic, so we can use it to balance databases, cache servers, queues, and more. Used as a layer 7 load balancer, it supports sticky sessions and SSL termination, but needs more resources to track HTTP-specific information. HAProxy performs very well and is also designed for high availability, which makes it more resilient and simplifies failure handling and recovery.
- With either of the load balancers above, we will eventually have to scale the load balancer itself, once it receives too many concurrent connections or too many requests; but a single load balancer with a hot standby should be enough for most applications.
- Beyond that, we can use round-robin DNS to distribute traffic between multiple load balancers; this is simple as long as the load balancers are interchangeable and our servers are stateless.
- It is OK to use round-robin DNS to distribute traffic across load balancers (rather than web servers), because load balancers are not updated as frequently as web servers and are less likely to fail because of bugs.
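The layer 4 vs. layer 7 distinction above can be sketched as an HAProxy configuration; all names, ports, and addresses here are made-up examples, not from the book:

```
# Hypothetical HAProxy sketch: one daemon balancing HTTP at layer 7
# and a database at layer 4.
frontend web_in
    bind *:80
    mode http                  # layer 7: HAProxy parses the HTTP protocol
    default_backend web_servers

backend web_servers
    mode http
    balance roundrobin
    cookie SRV insert indirect # sticky sessions via an inserted cookie
    server web1 10.0.0.11:8080 check cookie web1
    server web2 10.0.0.12:8080 check cookie web2

frontend db_in
    bind *:5432
    mode tcp                   # layer 4: only TCP/IP headers are inspected
    default_backend db_servers

backend db_servers
    mode tcp
    balance roundrobin
    server db1 10.0.0.21:5432 check
    server db2 10.0.0.22:5432 check
```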
Reverse proxy: sits in front of a web server and intercepts requests from clients; a forward proxy, by contrast, sits in front of a client and ensures that no origin server communicates with the client directly. We should configure multiple reverse proxies so we do not have a single point of failure, and also to apply the principle of failover.
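A reverse proxy in front of a pool of web servers can be sketched with Nginx; the upstream name and addresses are hypothetical:

```
# Hypothetical Nginx sketch: a reverse proxy in front of two web servers.
upstream app_servers {
    server 10.0.0.11:8080;
    server 10.0.0.12:8080;
}

server {
    listen 80;
    location / {
        proxy_pass http://app_servers;
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }
}
```

Running two such proxies behind round-robin DNS avoids the single point of failure mentioned above.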
Hardware load balancer
Devices like BIG-IP from F5 or NetScaler from Citrix offer more features and much higher capacity than software-based load balancers, with higher throughput, extremely low latency, and consistent performance; they can handle hundreds of thousands to millions of clients.
This type of load balancer is very expensive (from a few thousand to over a hundred thousand dollars), and they also require experienced people with specific training to manage them.
So if we are hosting our services in our own hardware, a hardware load balancer is the way to go.
Web Servers
The frontend layer should not contain much business logic; it should be a presentation and data-fetching layer on top of the web services. It is better to use dynamic languages like PHP, JavaScript, Python, or Ruby: their simple syntax and smaller amount of code allow faster prototyping, more productivity, and flexibility, plus better AJAX handling (thanks to the asynchronous nature of JavaScript) and better SEO and internationalization handling compared to strongly typed languages.
It is good to use the same language and framework across the application, but it is also common to see different languages in different layers, as they may face different problems that benefit from different architectural decisions.
Node.js: a runtime environment with a set of tools that lets JavaScript run on the backend. It performs better than other technologies for high-throughput apps, when we want to handle thousands to hundreds of thousands of open connections, or when small packets of data are exchanged between client and server.
On a single machine, Node.js as a web server is better than technologies like Apache: it can scale to hundreds of thousands of concurrent connections, while Apache will crash at a few thousand. When it comes to horizontal scaling, however, the advantage of Node.js disappears in favor of the horizontal scalability of the entire cluster rather than a single machine.
So Node.js is great for real-time notifications, chat apps, and multiplayer games, but for other cases it is better and cheaper to develop in Python, PHP, or Ruby, as they are more stable in terms of environment and the tools built around them.
In the end, the most important factor when scaling our frontend is to keep our machines stateless.
Important note: when researching a specific task or technology, look closely and think carefully when reading benchmarks, as their results depend on who prepared them; also examine the numbers and charts carefully.
Caching
Caching is critical and one of the most important techniques for scaling our frontend: instead of making our servers faster or scaling horizontally, we try not to serve those requests at all and cache the responses instead.
A CDN can proxy all requests coming to the web server, or it can handle only static files. If we serve all of the traffic through a CDN, we can cache entire pages along with AJAX responses; this approach does not work well for most apps, because their content is personalized and dynamic. Using our own proxy server to control what content to cache and for how long is the best solution for more complex apps; examples are Nginx and Varnish.
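Controlling what to cache and for how long with our own proxy can be sketched as an Nginx configuration; the cache name, paths, and the 10-minute lifetime are made-up examples:

```
# Hypothetical Nginx sketch: caching proxied responses for 10 minutes.
proxy_cache_path /var/cache/nginx keys_zone=app_cache:10m;

server {
    listen 80;
    location / {
        proxy_cache app_cache;
        proxy_cache_valid 200 10m;   # cache successful responses for 10 min
        proxy_pass http://10.0.0.11:8080;
    }
}
```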
Another way is to store data in web storage (e.g., the browser's local storage), which leads to a smoother experience and reduces the number of requests sent to the server; this is useful in SPAs and when developing for mobile clients.
If our CDN, proxies, and browser cache are not enough, we might cache some fragments of our responses using a shared object cache like Redis or Memcached.
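The fragment-caching pattern is usually cache-aside: check the shared cache, and only render the fragment on a miss. A minimal sketch, with a plain dict standing in for Redis or Memcached (illustrative only, no expiry handling):

```python
class FragmentCache:
    """Cache-aside sketch for response fragments. A plain dict stands in
    for a shared object cache like Redis or Memcached."""

    def __init__(self):
        self.store = {}

    def get_or_render(self, key, render):
        # Serve the cached fragment if present; otherwise render and cache it.
        if key not in self.store:
            self.store[key] = render()
        return self.store[key]
```

Because the cache is shared, every stateless web server reuses the same rendered fragments instead of recomputing them per request.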
Auto scaling
Auto scaling is the ability to automate the infrastructure so that virtual servers are added or removed automatically depending on traffic volume and server load. Scalability is not just about scaling up; it is also about scaling down, removing servers from the cluster, and saving cost: around 25 to 50 percent of hosting cost can be saved. The best way to do this is to use the hosting provider's auto scaling tools (Amazon, Azure, Rackspace) instead of building our own.
An example is configuring auto scaling on Amazon EC2. Servers will be removed and added automatically, so we should not store any data on them, and shutting down a server should not cause a bad user experience. First, we create a web server image (AMI) and configure it to bootstrap itself and join a cluster: anything a new EC2 instance needs to become functional is either baked into the AMI, passed via AMI launch parameters, or fetched from a remote data store; the launch parameters tell the instance which cluster it belongs to and what its role is. Then we create an auto scaling group to define scaling rules such as "if CPU usage is higher than 90%, add a server" or "every day at 9 p.m. add a server", and we can set different thresholds based on different metrics collected by CloudWatch. If we use ELB, new EC2 instances are added to and removed from the ELB pool automatically.
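A scaling rule like the ones above boils down to a simple decision function. A sketch, assuming made-up thresholds and server limits (these are illustrative, not AWS defaults):

```python
def desired_capacity(current, cpu_percent,
                     scale_up_at=90, scale_down_at=30,
                     min_servers=2, max_servers=10):
    """Sketch of an auto scaling rule such as 'if CPU usage is higher
    than 90%, add a server', with a matching scale-down rule and
    floor/ceiling limits on cluster size."""
    if cpu_percent > scale_up_at:
        return min(current + 1, max_servers)   # add a server, capped at max
    if cpu_percent < scale_down_at:
        return max(current - 1, min_servers)   # remove one, never below min
    return current                             # within the comfort zone
```

In a real setup, CloudWatch supplies the metric and the auto scaling group applies the capacity change; the function only illustrates the threshold logic.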