Distributed Systems Explained In 10 Minutes

Why put all your eggs in one basket? That’s the philosophy behind a computing concept known as distributed systems.

But it’s definitely not that simple.

Getting a solid grasp of distributed systems architecture can take a while. Fortunately, we’re here to help you learn the basics in just 10 minutes.

What Is A Non-Distributed System?

Think old-school, dinosaur-era monolithic architecture: a single, unified system that is extremely basic in its operation. Slow and steady wins the race, right?

All parts of the system are in the same location, which is the main criterion for being deemed non-distributed. Monolithic systems tick this box, as the database and the server-side application that processes requests, executes domain-specific logic, and retrieves and updates data are in the same place. Every component that sends data to and receives data from the user sits in the same container, case, or rack, and runs within a single machine.

Even experts often start off with a monolith-first approach before moving to microservices, which are much more complex. Interestingly, there is such a thing as non-distributed microservices: you program your system as a set of microservices that run concurrently on the same machine.

This is done mostly as an interim learning step, where the programmer can get used to automating tests, monitoring service execution, and managing communication between services, all within the simple setting of a single machine. Fake microservices, if you will, to learn the ropes ahead of the real deal of distributing across multiple machines. Instead of being launched into the pool headfirst in a “sink or swim” situation, it’s more like learning to crawl before you walk.

But when you hear the term non-distributed system, it’s best to imagine something like a WordPress application running on a single machine. You have Apache serving the PHP back-end code, and a MySQL database used for storage and retrieval.

To learn more about microservices or serverless architecture check out our blog articles, “Serverless Architecture Explained in 10 minutes” or “Microservices Explained in 10 Minutes”. We keep it simple and get straight to the details so you can stay on top of the trends and continue to grow your career.

So, What’s A Distributed System Then?

In short, a distributed system consists of multiple components located on different machines that communicate over a network. These machines converse with one another to synchronise and order the actions requested by the user. The actions are carried out in such a way that they appear to the user to have been performed by a single, coherent system.
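One well-known way for machines to agree on the order of actions without sharing a wall clock is a logical counter, often called a Lamport clock. The sketch below is only an illustration of that idea, not how any particular system does it; the class and method names are made up for this article.

```python
class LamportClock:
    """A logical clock: a counter that only ever moves forward."""

    def __init__(self):
        self.time = 0

    def tick(self):
        # A local event (for example, handling a user request) advances the clock.
        self.time += 1
        return self.time

    def send(self):
        # Sending a message counts as an event; the timestamp travels with it.
        return self.tick()

    def receive(self, message_time):
        # On receipt, move past whichever is later: our clock or the sender's.
        self.time = max(self.time, message_time) + 1
        return self.time


machine_a = LamportClock()
machine_b = LamportClock()

machine_a.tick()            # machine A does some local work
stamp = machine_a.send()    # machine A sends a message carrying its timestamp
machine_b.receive(stamp)    # machine B moves its clock past that timestamp
```

Because the receiver always ends up with a later timestamp than the message it received, "sent" is guaranteed to be ordered before "received" on every machine.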

Microservices can also be an example of a distributed system. They can run multiple copies of each component on separate machines, adding redundancy to the various layers so that the failure of a single machine doesn’t result in a loss of data or function.

The components communicate with one another over a network rather than within a single process on one machine. The key takeaway here is that distributed systems are a collection of autonomous computing elements that come together to act as a whole, coherent system. Bear in mind that the machines hosting each component do not have to be in the same physical location, either.

Take the same WordPress application: it becomes a distributed system once you have multiple Apache instances running on different machines, all serving the same back-end PHP code. There would also need to be a load balancer whose sole function is to distribute the workload amongst these Apache instances. Similarly, the MySQL database is now a cluster with master/slave replication, with an in-memory cache in front of it.
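To make the load balancer's job concrete, here is a minimal sketch of the simplest strategy, round robin, where each request goes to the next server in line. The class name and the host names are invented for illustration; a real load balancer would also check server health and forward actual network traffic.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hands each incoming request to the next server in a fixed rotation."""

    def __init__(self, servers):
        if not servers:
            raise ValueError("at least one server is required")
        self._rotation = cycle(servers)

    def pick(self):
        # Return the server that should handle the next request.
        return next(self._rotation)


# Hypothetical host names standing in for three Apache machines.
balancer = RoundRobinBalancer(["apache-1", "apache-2", "apache-3"])
assignments = [balancer.pick() for _ in range(5)]
print(assignments)
```

Five requests in a row are spread across the three machines in turn, wrapping back to the first once every machine has had one.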

Distributed Systems Pros:

In the event of a machine breakdown, the legion of duplicate services can continue to serve user requests thanks to that built-in redundancy. It’s this sharing of the workload that gives the appearance that distributed systems have near-infinite resources. In reality, it’s a case of “many hands make light work”.

Distributed systems can be built on top of other distributed systems and use the existing architecture. An existing lower-level system will likely provide a sufficient base in order to build an entirely new set of processes that either do the same job but better, or have an entirely different function.

The other big plus is scalability: the ability of a system to grow to meet demand. This could mean accommodating more storage or handling a larger number of datasets, all while maintaining service to the user.

Unlike a non-distributed system, which typically needs to be shut down in order to be updated, a distributed system is, by its very nature, duplicated across multiple machines. Machines can be updated one at a time while the others keep serving. If one drops out, the rest can take up the slack, and the only hint of disruption is perhaps a slightly longer wait for the user, such as a longer load time.
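The "rest take up the slack" idea can be sketched in a few lines. This is a simplified illustration under made-up names: each replica is just a local function here, whereas in a real system it would be a separate machine reached over the network.

```python
class ReplicaDown(Exception):
    """Raised by a replica that cannot serve the request."""


def make_replica(name, healthy=True):
    # Build a stand-in for one copy of a service.
    def handle(request):
        if not healthy:
            raise ReplicaDown(name)
        return f"{name} served {request}"
    return handle


def serve_with_failover(replicas, request):
    # Try each replica in turn and return the first successful response.
    for replica in replicas:
        try:
            return replica(request)
        except ReplicaDown:
            continue  # this copy is down, so move on to the next one
    raise RuntimeError("all replicas are down")


replicas = [
    make_replica("replica-1", healthy=False),  # simulate a broken machine
    make_replica("replica-2"),
    make_replica("replica-3"),
]
result = serve_with_failover(replicas, "GET /home")
print(result)
```

The user still gets an answer; the only cost is the extra moment spent discovering that the first copy was down.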

To handle more load, a non-distributed system needs more power added to the same machine, generally in the form of a faster CPU, more RAM, more drive storage, etc. As with everything, the mainboard has a capacity limit, and after a while vertical scaling like this requires a full hardware upgrade, which gets pretty pricey.

However, distributed systems tackle the problem of too many requests and extended load times by just adding more machines to the system. You scale horizontally, as it’s usually cheaper to add more machines than to buy a single machine with a massive number of cores. Several modest machines can cost far less than scaling up one server to handle the same increased load.
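The arithmetic behind horizontal scaling is simple enough to sketch. The function name and the traffic figures below are made up purely for illustration; real capacity planning would be based on measured numbers.

```python
import math


def machines_needed(peak_requests_per_second, capacity_per_machine, spare=1):
    # Enough machines to carry the peak load, plus spares so the system
    # keeps serving if a machine fails.
    if capacity_per_machine <= 0:
        raise ValueError("capacity_per_machine must be positive")
    return math.ceil(peak_requests_per_second / capacity_per_machine) + spare


# Made-up numbers: 4,500 requests/s at peak, 1,000 requests/s per machine.
print(machines_needed(4500, 1000))
```

When traffic grows, you rerun the sum and add the difference in machines, rather than replacing one big server with an even bigger one.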

Distributed Systems Cons:

Distributed systems have fewer headline cons, but the ones they have matter. The big one is that, because the components talk to each other over a network, some messages can get lost or delayed along the way.
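A common way to cope with lost messages is to wait for an acknowledgement and resend if none arrives. The sketch below simulates that with a pretend network that randomly drops messages; the function names, the loss rate and the attempt limit are all assumptions made for the example.

```python
import random


def lossy_send(message, rng, loss_rate):
    # Pretend network: returns an acknowledgement, or None if the message
    # (or its acknowledgement) was lost along the way.
    if rng.random() < loss_rate:
        return None
    return f"ack:{message}"


def send_with_retry(message, rng, loss_rate=0.3, max_attempts=5):
    # Resend until an acknowledgement arrives or we run out of attempts.
    for attempt in range(1, max_attempts + 1):
        ack = lossy_send(message, rng, loss_rate)
        if ack is not None:
            return ack, attempt
    raise TimeoutError(f"no acknowledgement after {max_attempts} attempts")


rng = random.Random(42)  # fixed seed so the simulation is repeatable
ack, attempts = send_with_retry("update-cart", rng)
print(ack, "after", attempts, "attempt(s)")
```

Note that a resend can mean the receiver sees the same message twice, so systems that retry usually make their operations safe to repeat.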

Similarly, security becomes a bigger concern because data is shared between multiple machines. Every machine and network link is a potential point of attack, and breaches can come by way of the network or through backdoors installed with local access.

The individual machines may also become overloaded if the shared workload is too high for the combined capacity of the duplicated services. And the whole works only as well as the communication across the network.

The Current Trend:

Large-scale projects are moving towards distributed service systems spread out across multiple powerful machines that can handle extremely high volumes of requests and data processing. It’s a similar concept to a pool of crypto miners combining their computing power to mine a single block, as a lone machine would have almost no chance of doing so on its own.

The horizontal scalability and lower setup price are the main draws of the distributed system route. The ability to simply add more machines offers strong performance at a much lower price point, and for many projects that outweighs the security disadvantages associated with a system spread out across a network.

Downtime is what kills a business, especially one that deals with big chunks of data and a high workload. A distributed system greatly reduces the need for downtime because, let’s face it, time is money, and money makes the world go round.

Distributed Systems Engineer Salaries

Engineers and architects who focus on large, complex distributed systems generally have a deep, expert grasp of multiple programming languages, networking and hardware integration. For this reason, SMEs of this caliber are in high demand.

Back-end engineers can expect a base salary of between $150,000 and $200,000 per year, not including any stock, equity or bonuses. A principal or architect will typically earn north of $250,000, regardless of location.

To benefit from a confidential conversation about your career and learn of some of the best opportunities for Distributed Systems Engineers, contact Kofi Group today.