If you run traditional IT systems and infrastructure, you are no doubt familiar with designing fault tolerance into your applications. With a move to the cloud, however, you may need to revisit that design, because cloud systems and architectures approach fault tolerance and high availability differently.
Simply put, cloud systems are built differently from traditional infrastructures, and applications tend to go out of commission when failures do occur.
The Difference Between Machine Failures and Cloud Architecture Failures
What is the difference, then?
In traditional machine-based architectures, failures are related to the operating system or the hardware. For this reason, fault tolerance and high availability efforts are geared towards redundant operating systems and hardware. For instance, you have RAID disk arrays, redundant network cards, database replication between master and slave, and clustered application servers to make your system ready for high availability.
To simplify things, you have two or more nodes powering the same application. If one node fails for any reason, the other nodes can step up and run that application.
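As a rough illustration of that node-level redundancy, here is a minimal Python sketch. The function and node names (run_with_failover, failing_node, healthy_node) are made up for this example and do not come from any particular product; each "node" is simply a callable that either handles a request or raises an error.

```python
from typing import Callable, Sequence


class AllNodesFailed(Exception):
    """Raised when every redundant node fails to handle the request."""


def run_with_failover(nodes: Sequence[Callable[[str], str]], request: str) -> str:
    """Try each redundant node in order and return the first successful result."""
    errors = []
    for node in nodes:
        try:
            return node(request)
        except Exception as exc:  # a real system would catch narrower errors
            errors.append(exc)
    raise AllNodesFailed(f"{len(errors)} node(s) failed for request {request!r}")


def failing_node(request: str) -> str:
    # Hypothetical node that is down, for example because of a hardware fault.
    raise ConnectionError("node A is unreachable")


def healthy_node(request: str) -> str:
    # Hypothetical standby node running the same application.
    return f"node B handled {request}"


if __name__ == "__main__":
    print(run_with_failover([failing_node, healthy_node], "order-1"))
```

The caller never needs to know which node answered. That is the whole point of the redundant-machine model, and it is also why it stops being sufficient once the application is split into many services.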
But this is not the case when it comes to the enterprise cloud. Because cloud systems are designed for multi-tenancy and scalability, the system is much more complicated.
Instead of different machines or nodes, you have different services that together give you a running application. These services are interconnected in such a way that one node could be anywhere in the world and be composed of different application layers. For example, a cloud server in Asia might have a UI layer, a NoSQL data layer, a business logic layer, an API layer and other types of layers. In short, you can already use the application with only that server in Asia. But all of these layers are also replicated on servers in other parts of the world, so that machines in Europe, for instance, would have the same layers.
Sounds complicated? Well, think of a table.
In machine architecture, it is as simple as having columns. Each column represents a different, redundant machine.
With cloud architecture, you have those columns and you need to add rows. The rows represent the different layers.
The rows of services are what make the cloud service highly scalable, cost-effective, high-performing and able to provide services on demand.
If you think of it that way, you can easily see that redundant machines alone are no longer enough. You also need multiple zones, that is, groups of computing resources that are physically distant from each other. This ensures that your application has redundant resources even when one location fails.
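The table analogy can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the layer names are taken from the example above, the zone names ("asia", "europe") and the health values are invented, and the application is treated as available only if every layer has at least one healthy copy in some zone.

```python
from typing import Dict, List

# Rows are application layers, columns are zones; True means that copy is healthy.
Grid = Dict[str, Dict[str, bool]]


def unavailable_layers(grid: Grid) -> List[str]:
    """Return the layers that have no healthy copy in any zone."""
    return [layer for layer, zones in grid.items() if not any(zones.values())]


def application_available(grid: Grid) -> bool:
    """The application can serve requests only if every layer is healthy somewhere."""
    return not unavailable_layers(grid)


# Hypothetical snapshot of a two-zone deployment.
grid: Grid = {
    "ui": {"asia": True, "europe": True},
    "api": {"asia": False, "europe": True},
    "business_logic": {"asia": True, "europe": True},
    "nosql_data": {"asia": False, "europe": False},
}

if __name__ == "__main__":
    print(unavailable_layers(grid))
    print(application_available(grid))
```

In this snapshot the API layer survives the loss of its Asian copy because Europe still has one, while the data layer has failed in both zones and takes the whole application down with it. Checking rows as well as columns is what distinguishes this from plain machine redundancy.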
Oracle Solutions: High Cloud Availability
One option is Oracle Coherence, a distributed data management platform that gives you a low-latency, scalable and replicated data system. This way, your data is replicated automatically.
You also need to allow for graceful degradation, which means keeping the rest of the application usable even when some functions are failing. Suppose you sell DVDs online and, for some reason, the server fails to get DVD information such as reviews and blurbs. This should not affect your shopping cart: even if buyers cannot read the reviews, they can still buy the DVD.
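Here is a minimal Python sketch of that DVD store example. The names build_product_page and fetch_reviews_unavailable are hypothetical, and the review service is simulated by a function that always times out. The idea is only that a failure in an optional feature is caught and hidden instead of breaking the purchase path.

```python
from typing import Callable, Dict, Optional


def fetch_reviews_unavailable(dvd_id: str) -> str:
    # Hypothetical review service that is currently failing.
    raise TimeoutError("review service did not respond")


def build_product_page(
    dvd_id: str,
    price: float,
    fetch_reviews: Callable[[str], str],
) -> Dict[str, object]:
    """Build the product page; reviews are optional, buying is not."""
    reviews: Optional[str]
    try:
        reviews = fetch_reviews(dvd_id)
    except Exception:
        reviews = None  # degrade: hide the reviews instead of failing the page
    return {
        "dvd_id": dvd_id,
        "price": price,
        "reviews": reviews,
        "can_add_to_cart": True,  # the cart does not depend on the review service
    }


if __name__ == "__main__":
    print(build_product_page("dvd-42", 9.99, fetch_reviews_unavailable))
```

A production version would probably log the failure and catch narrower exception types, but the shape stays the same: decide which features are essential, and wrap everything else so that it can fail quietly.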
You can also use Oracle WebLogic 12c for asynchronous messaging, so that your messages are stored and still delivered if a failure happens.
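To show the general store-and-forward idea behind that kind of messaging, here is a small in-memory Python sketch. It is not the WebLogic or JMS API. StoreAndForwardQueue and FlakyReceiver are invented for illustration, and a real message broker would persist messages to disk rather than hold them in memory.

```python
from collections import deque
from typing import Callable, Deque, List


class StoreAndForwardQueue:
    """Keeps messages until the receiver accepts them."""

    def __init__(self, deliver: Callable[[str], None]) -> None:
        self._deliver = deliver
        self._pending: Deque[str] = deque()

    def send(self, message: str) -> None:
        """Store the message first, then attempt delivery."""
        self._pending.append(message)
        self.flush()

    def flush(self) -> int:
        """Deliver pending messages in order and stop at the first failure.

        Returns the number of messages delivered in this call.
        """
        delivered = 0
        while self._pending:
            try:
                self._deliver(self._pending[0])
            except Exception:
                break  # receiver is down; keep the message for a later retry
            self._pending.popleft()
            delivered += 1
        return delivered

    def pending(self) -> List[str]:
        """Return a copy of the messages still waiting for delivery."""
        return list(self._pending)


class FlakyReceiver:
    """Hypothetical receiver that can be switched off to simulate an outage."""

    def __init__(self) -> None:
        self.up = True
        self.received: List[str] = []

    def __call__(self, message: str) -> None:
        if not self.up:
            raise ConnectionError("receiver is down")
        self.received.append(message)
```

The sender calls send() and moves on. If the receiver is down, the messages simply wait, and a later flush() delivers them in their original order once the receiver is back.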
Four Cornerstone can help you ensure high availability even for cloud systems using a variety of Oracle software and solutions.
Photo courtesy of Oracle.