Microservices – Fault Tolerance and Resiliency

How does fault tolerance work?
The discovery client keeps sending heartbeats to the discovery server to signal that it is still alive.
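
The heartbeat idea can be sketched in a few lines of plain Java. This is only an illustration and not how any particular discovery server is implemented: the class name HeartbeatRegistry, the expiry window and the injected clock are all assumptions made for this example.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.LongSupplier;

/** Hypothetical registry that remembers when each instance last reported in. */
final class HeartbeatRegistry {
    private final Map<String, Long> lastHeartbeat = new ConcurrentHashMap<>();
    private final long expiryMillis;   // assumed window after which an instance counts as down
    private final LongSupplier clock;  // injected time source, in milliseconds

    HeartbeatRegistry(long expiryMillis, LongSupplier clock) {
        this.expiryMillis = expiryMillis;
        this.clock = clock;
    }

    /** Called by a client to say "I am still alive". */
    void heartbeat(String instanceId) {
        lastHeartbeat.put(instanceId, clock.getAsLong());
    }

    /** An instance is alive only if it has sent a heartbeat within the expiry window. */
    boolean isAlive(String instanceId) {
        Long last = lastHeartbeat.get(instanceId);
        return last != null && clock.getAsLong() - last < expiryMillis;
    }
}
```

An instance that stops sending heartbeats simply ages out, so the server never has to be told explicitly that it went down.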

Making Microservices Resilient and Fault Tolerant

What is Fault Tolerance?
Fault tolerance: Given an application, if there is a fault, what is the impact of that fault? In other words, how tolerant is your system?
Resiliency: How many faults a system can tolerate.

Accessing values in Java code from property files.

@Value("${api.key}")
private String apiKey;

Issues with Microservices
1. A microservice instance can go down
Solution: Run multiple instances
2. A microservice instance is slow

Insight
Let’s say you have created 3 microservices.
M1
M2
M3

M1 calls M2 and M3 to get data, consolidates the data and sends the response back to the client.
M2 also calls an external service to get some data which it depends upon.
Scenario 1: If the external service becomes slow, then M2 becomes slow and hence M1 becomes slow.
Scenario 2: M3 has nothing to do with M2, but since M1 depends on both M2 and M3, the data from M3 is held up until M1 also gets the result from M2.
Scenario 3: Let’s say we have another microservice called M4, which just calls M3 and has nothing to do with M2. It still might be slow because of the way threads work in a web server: every request that is waiting on a slow service holds on to a server thread, so fewer threads are left for other requests.
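
The thread problem in Scenario 3 can be reproduced with a tiny thread pool. The sketch below is not a real web server: the pool of 2 threads, the class name ThreadStarvationDemo and the latch that stands in for the slow service are assumptions chosen to keep the example small.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Hypothetical stand-in for a web server that has only two request threads. */
final class ThreadStarvationDemo {
    final ExecutorService requestThreads = Executors.newFixedThreadPool(2, task -> {
        Thread thread = new Thread(task);
        thread.setDaemon(true); // do not keep the JVM alive for this demo
        return thread;
    });

    /** Opened when the slow dependency finally answers. */
    final CountDownLatch slowDependency = new CountDownLatch(1);

    /** A request that needs the slow service: it holds its thread until the answer arrives. */
    CompletableFuture<String> slowRequest() {
        return CompletableFuture.supplyAsync(() -> {
            try {
                slowDependency.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            return "slow data";
        }, requestThreads);
    }

    /** A request that has nothing to do with the slow service. */
    CompletableFuture<String> fastRequest() {
        return CompletableFuture.supplyAsync(() -> "fast data", requestThreads);
    }
}
```

If two slow requests are submitted first, a fast request submitted afterwards cannot even start until slowDependency is released, although it never touches the slow service.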

Solution for slow microservices:
Timeout

Setting timeouts on Spring RestTemplate
Method 1:

@Bean
@LoadBalanced
public RestTemplate getRestTemplate() {
	HttpComponentsClientHttpRequestFactory clientHttpRequestFactory = new HttpComponentsClientHttpRequestFactory();
	// Give up if a connection cannot be established within 3000 ms (3 seconds)
	clientHttpRequestFactory.setConnectTimeout(3000);
	return new RestTemplate(clientHttpRequestFactory);
}
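
The general idea of a timeout, giving up after a fixed wait instead of waiting forever, can also be sketched without Spring. The helper below is a hypothetical illustration (TimeLimitedCall and callWithTimeout are made-up names) and is not what RestTemplate does internally.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Supplier;

final class TimeLimitedCall {
    /** Runs the call on another thread and waits at most timeoutMillis for its result. */
    static <T> T callWithTimeout(Supplier<T> call, long timeoutMillis, T fallback) {
        CompletableFuture<T> future = CompletableFuture.supplyAsync(call);
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            return fallback; // the call took too long: stop waiting for it
        } catch (ExecutionException e) {
            return fallback; // the call itself failed
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return fallback;
        }
    }
}
```

Note that in this sketch the caller gets its answer on time, but the slow call may keep running in the background, so a timeout alone does not free every resource the slow call occupies.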

Method 2: (Preferred way)
Circuit Breaker Pattern
1. Detect something is wrong.
2. Take temporary steps to avoid the situation getting worse
3. Deactivate the problem component so it doesn’t affect downstream components

Circuit Breaker Parameters
When does the circuit trip?
1. Last n requests to consider for the decision.
2. How many of those should fail?
3. Timeout duration

When does a circuit get back to normal?
1. How long after the circuit trips should we wait before trying again?

Example:
Last n requests to consider for the decision: 5
How many of those should fail? 3
Timeout duration: 2s
How long to wait (Sleep window): 10s
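
The example values above can be plugged into a small hand-written breaker to see how they interact. This is a simplified sketch under stated assumptions: SimpleCircuitBreaker is a made-up class, the clock is injected so that no real waiting is needed, and after the sleep window the circuit simply closes again with a clean history.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.LongSupplier;

final class SimpleCircuitBreaker {
    private final int windowSize;          // last n requests to consider
    private final int failureThreshold;    // how many of those must fail to trip the circuit
    private final long timeoutMillis;      // calls slower than this count as failures
    private final long sleepWindowMillis;  // how long to wait before trying again
    private final LongSupplier clock;      // injected time source, in milliseconds
    private final Deque<Boolean> recentFailures = new ArrayDeque<>(); // true = failed
    private boolean open;
    private long openedAt;

    SimpleCircuitBreaker(int windowSize, int failureThreshold, long timeoutMillis,
                         long sleepWindowMillis, LongSupplier clock) {
        this.windowSize = windowSize;
        this.failureThreshold = failureThreshold;
        this.timeoutMillis = timeoutMillis;
        this.sleepWindowMillis = sleepWindowMillis;
        this.clock = clock;
    }

    /** Callers ask this before every request; false means "use the fallback". */
    synchronized boolean allowsRequest() {
        if (!open) {
            return true;
        }
        if (clock.getAsLong() - openedAt < sleepWindowMillis) {
            return false; // still inside the sleep window
        }
        open = false; // sleep window is over: start again with a clean history
        recentFailures.clear();
        return true;
    }

    /** Callers report every finished request: whether it succeeded and how long it took. */
    synchronized void record(boolean succeeded, long elapsedMillis) {
        recentFailures.addLast(!succeeded || elapsedMillis > timeoutMillis);
        if (recentFailures.size() > windowSize) {
            recentFailures.removeFirst(); // keep only the last n outcomes
        }
        long failures = recentFailures.stream().filter(failed -> failed).count();
        if (failures >= failureThreshold) {
            open = true;
            openedAt = clock.getAsLong();
        }
    }
}
```

With the values from the example, the breaker would be created as new SimpleCircuitBreaker(5, 3, 2_000, 10_000, System::currentTimeMillis).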

OK!

So what happens when requests keep coming for a service whose circuit is broken?
We need a fallback
1. Throw an error
2. Return a fallback default response
3. Save previous responses (cache) and use when possible
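
Options 2 and 3 can be combined: remember the last good response for each key and fall back to a default only when nothing is cached. The sketch below is a hypothetical helper (CachingFallback is not part of any library) and it assumes the remote call never returns null.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

final class CachingFallback<K, V> {
    private final Map<K, V> lastGoodResponses = new ConcurrentHashMap<>();
    private final Function<K, V> remoteCall;
    private final V defaultResponse;

    CachingFallback(Function<K, V> remoteCall, V defaultResponse) {
        this.remoteCall = remoteCall;
        this.defaultResponse = defaultResponse;
    }

    V get(K key) {
        try {
            V response = remoteCall.apply(key);
            lastGoodResponses.put(key, response); // save the response for bad days
            return response;
        } catch (RuntimeException failure) {
            // Prefer the previous response for this key, otherwise use the default.
            return lastGoodResponses.getOrDefault(key, defaultResponse);
        }
    }
}
```

A cached response may be stale, so this only suits data where a slightly old answer is better than no answer.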

Why Circuit Breakers?
1. Failing fast
2. Fallback functionality
3. Automatic recovery

Circuit Breaker Pattern
1. When to break circuit
2. What to do when circuit breaks
3. When to resume requests

Hystrix
1. Open source library originally created by Netflix
2. Implements Circuit breaker pattern so you don’t have to
3. Give it the configuration params and it does the work
4. Works well with Spring Boot

Steps to Add Hystrix to Spring Boot Microservices

Step 1: Add the below starter dependency in pom.xml

<dependency>
	<groupId>org.springframework.cloud</groupId>
	<artifactId>spring-cloud-starter-netflix-hystrix</artifactId>
</dependency>

Step 2: Go to the main class of your microservice and add the @EnableCircuitBreaker annotation

package org.techlearnings;

import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.cloud.client.circuitbreaker.EnableCircuitBreaker;
import org.springframework.cloud.client.loadbalancer.LoadBalanced;
import org.springframework.cloud.netflix.eureka.EnableEurekaClient;
import org.springframework.context.annotation.Bean;
import org.springframework.web.client.RestTemplate;

@SpringBootApplication
@EnableEurekaClient
@EnableCircuitBreaker
public class UserDetailsServiceApplication {

	public static void main(String[] args) {
		SpringApplication.run(UserDetailsServiceApplication.class, args);
	}
	
	@Bean
	@LoadBalanced
	public RestTemplate getRestTemplate() {
		return new RestTemplate();
	}

}

Step 3: Add the below annotation to the methods that need a circuit breaker.
@HystrixCommand
@HystrixCommand works by wrapping the bean in a proxy. Based on the configuration (circuit breaker parameters) you provide, the proxy breaks the circuit by not calling the method in the actual class.
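
The proxy idea can be illustrated with a plain JDK dynamic proxy. This is only a sketch of the concept and not the mechanism Hystrix itself uses: the UserDetailsClient interface, the FallbackProxy class and the circuitOpen switch are hypothetical names made up for this example.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;
import java.util.function.BooleanSupplier;

interface UserDetailsClient {
    String getUserDetails();
}

final class FallbackProxy {
    /** Callers receive this proxy instead of the real object. */
    static UserDetailsClient wrap(UserDetailsClient target, BooleanSupplier circuitOpen, String fallback) {
        return (UserDetailsClient) Proxy.newProxyInstance(
                UserDetailsClient.class.getClassLoader(),
                new Class<?>[] {UserDetailsClient.class},
                (proxy, method, args) -> {
                    if (method.getDeclaringClass() == Object.class) {
                        return method.invoke(target, args); // toString, hashCode, equals
                    }
                    if (circuitOpen.getAsBoolean()) {
                        return fallback; // circuit is open: the real method is never called
                    }
                    try {
                        return method.invoke(target, args);
                    } catch (InvocationTargetException e) {
                        return fallback; // the real method threw an exception
                    }
                });
    }
}
```

The caller cannot tell the proxy from the real object, which is why adding an annotation is enough and the calling code does not change.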

Step 4: Configure Hystrix behavior
Example

@HystrixCommand(fallbackMethod = "getUserDetailsFallback")
public UserDetails getUserDetails() {
	return restTemplate.getForObject("http://user-details-service/userdetails/" + 1, UserDetails.class);
}

// Fallback used when getUserDetails fails or the circuit is open; it is declared in the same class
public UserDetails getUserDetailsFallback() {
	return new UserDetails();
}

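How does Hystrix work?
Let’s say you have 3 microservices named app1, app2 and app3.
One of the resources (r1) in app1 has a method named m1, which consumes resources available in app2 and app3. If you annotate m1 with @HystrixCommand, then Hystrix creates a proxy of r1 and the fallback applies to m1 as a whole. So if you consume 2 or more microservices from the same method, you cannot handle the fallback for each call separately. Splitting m1 into several annotated methods inside r1 does not help either, because a call from one method to another in the same class does not go through the proxy. So it is better to refactor by moving each call into its own method (annotated with @HystrixCommand) in a separate bean.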
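
Why the annotated method belongs in a separate bean can be shown with a plain JDK dynamic proxy standing in for the proxy described above. All names here (CatalogResource, RealCatalogResource, SelfInvocationDemo) are hypothetical, and the proxy only records which calls it gets to see.

```java
import java.lang.reflect.Proxy;
import java.util.List;

interface CatalogResource {
    String getCatalog();
    String getRatings();
}

final class RealCatalogResource implements CatalogResource {
    @Override
    public String getCatalog() {
        // Internal call on "this": it goes straight to the real method, not to the proxy.
        return "catalog with " + getRatings();
    }

    @Override
    public String getRatings() {
        return "ratings";
    }
}

final class SelfInvocationDemo {
    /** Returns a proxy that records the name of every call it intercepts. */
    static CatalogResource proxyOf(CatalogResource target, List<String> intercepted) {
        return (CatalogResource) Proxy.newProxyInstance(
                CatalogResource.class.getClassLoader(),
                new Class<?>[] {CatalogResource.class},
                (proxy, method, args) -> {
                    intercepted.add(method.getName());
                    return method.invoke(target, args);
                });
    }
}
```

Calling getCatalog on the proxy records only getCatalog. The inner getRatings call is invisible to the proxy, so a circuit breaker attached to it would never fire. Once getRatings lives in another bean, the call crosses a proxy again.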

Hystrix Properties

@HystrixCommand(fallbackMethod = "getUserDetailsFallback",
		commandProperties = {
				@HystrixProperty(name = "execution.isolation.thread.timeoutInMilliseconds", value = "2000"),
				@HystrixProperty(name = "circuitBreaker.requestVolumeThreshold", value = "5"),
				@HystrixProperty(name = "circuitBreaker.errorThresholdPercentage", value = "50"),
				@HystrixProperty(name = "circuitBreaker.sleepWindowInMilliseconds", value = "5000")
		})
public UserDetails getUserDetails() {
	return restTemplate.getForObject("http://user-details-service/userdetails/" + 1, UserDetails.class);
}
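
How the volume threshold and the error percentage combine can be sketched as a small function. This is a simplified reading of the two circuitBreaker properties above and not Hystrix source code: TripDecision and shouldTrip are made-up names, and the counts are assumed to come from the current statistics window.

```java
final class TripDecision {
    /** Decides whether the circuit should open for the given request counts. */
    static boolean shouldTrip(int totalRequests, int failedRequests,
                              int requestVolumeThreshold, int errorThresholdPercentage) {
        if (totalRequests == 0 || totalRequests < requestVolumeThreshold) {
            return false; // too few requests to judge the service
        }
        int errorPercentage = failedRequests * 100 / totalRequests;
        return errorPercentage >= errorThresholdPercentage;
    }
}
```

With the values above (5 and 50), 4 failures out of 4 requests do not trip the circuit because the volume is too low, while 3 failures out of 6 requests do. Calls that run past the 2000 ms timeout count as failures, and the 5000 ms sleep window decides when a request is let through again.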

Hystrix Dashboard

Steps to Configure Hystrix Dashboard
Step 1: Add below starter dependencies in your microservice

<dependency>
	<groupId>org.springframework.cloud</groupId>
	<artifactId>spring-cloud-starter-netflix-hystrix-dashboard</artifactId>
</dependency>

<dependency>
	<groupId>org.springframework.boot</groupId>
	<artifactId>spring-boot-starter-actuator</artifactId>
</dependency>

Step 2: Add the @EnableHystrixDashboard annotation to your main class

package org.techlearnings;

import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.cloud.client.circuitbreaker.EnableCircuitBreaker;
import org.springframework.cloud.client.loadbalancer.LoadBalanced;
import org.springframework.cloud.netflix.eureka.EnableEurekaClient;
import org.springframework.cloud.netflix.hystrix.dashboard.EnableHystrixDashboard;
import org.springframework.context.annotation.Bean;
import org.springframework.web.client.RestTemplate;

@SpringBootApplication
@EnableEurekaClient
@EnableCircuitBreaker
@EnableHystrixDashboard
public class UserDetailsServiceApplication {

	public static void main(String[] args) {
		SpringApplication.run(UserDetailsServiceApplication.class, args);
	}
	
	@Bean
	@LoadBalanced
	public RestTemplate getRestTemplate() {
		return new RestTemplate();
	}

}

Step 3: Add the below properties in your application.properties file

management.endpoints.web.base-path=/
management.endpoints.web.exposure.include=hystrix.stream

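Step 4: Accessing Hystrix Dashboard
To access the Hystrix dashboard, open the below URL

http(s)://<hystrix-app>:<port>/hystrix

Then enter the below stream URL on the dashboard page to start monitoring

http(s)://<hystrix-app>:<port>/hystrix.stream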

That’s it!!!

Author: Mahesh

Technical Lead with 10 plus years of experience in developing web applications using Java/J2EE and web technologies. Strong in design and integration problem solving skills. Ability to learn, unlearn and relearn with strong written and verbal communications.