Demystified Service Mesh Capabilities for Developers

Service meshes have been gaining a lot of popularity lately, especially among Spring and Java developers who want to address cross-cutting concerns. But what exactly is a service mesh? Which ones are popular? And, most importantly, what kind of problems do they actually solve? Well, look no further! This blog is here to provide you with the answers you seek.

What is a Service Mesh?

A service mesh is a dedicated infrastructure layer that helps manage communication between the various microservices within a distributed application. It acts as a transparent and decentralized network of proxies that are deployed alongside the application services. These proxies, often referred to as sidecars, handle service-to-service communication, providing essential features such as service discovery, load balancing, traffic routing, authentication, and observability.

By abstracting away the complexity of network communication, a service mesh enables developers to focus on application logic rather than dealing with the intricacies of networking code. It provides a consistent and flexible way to handle cross-service communication and allows for the implementation of advanced traffic management strategies, security policies, and observability mechanisms.

Service meshes provide a standardized approach to managing microservices communication, making it easier to monitor, secure, and control traffic within complex distributed systems.

Components of a Service Mesh

Service mesh architecture typically involves the following components and their interactions:

Data Plane: The data plane refers to a network of sidecar proxies deployed along with each service instance, so that it can communicate with the other services in the system. It acts as an intermediary between the service and the rest of the network. Sidecar proxies handle inbound and outbound traffic, intercepting communication and providing additional features.

  1. Sidecar: A proxy container (Envoy in many meshes, including Istio) that runs in the same Kubernetes Pod as the application container and takes care of all cross-cutting concerns. It follows the sidecar container design pattern.
  2. Application Traffic: Microservices connect to other microservices through their sidecar containers. Application traffic is basically communication between sidecar proxy containers.
  3. Namespace: A Kubernetes namespace is a logical partition of the cluster in which the Pods run. Within each Pod, both containers (the sidecar and the microservice app) run side by side and share the same network namespace, which is what lets the sidecar intercept the app's traffic.
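To make the sidecar idea a little more concrete, here is a minimal Python sketch. It is only an illustration under simplifying assumptions: a real sidecar such as Envoy is a separate process that intercepts network traffic, whereas this toy wraps a function call in-process. The names `SidecarProxy`, `inventory_service`, and the allow-list of callers are all hypothetical.

```python
import time
from typing import Callable, Dict, List


class SidecarProxy:
    """Toy stand-in for a sidecar: wraps a service handler and adds
    cross-cutting behavior (a caller allow-list and basic request metrics)."""

    def __init__(self, service_name: str, handler: Callable[[str], str],
                 allowed_callers: List[str]) -> None:
        self.service_name = service_name
        self.handler = handler
        self.allowed_callers = set(allowed_callers)
        self.metrics: Dict[str, float] = {
            "requests": 0, "denied": 0, "total_seconds": 0.0,
        }

    def handle(self, caller: str, payload: str) -> str:
        # Every request is counted, whether or not it is allowed through.
        self.metrics["requests"] += 1
        if caller not in self.allowed_callers:
            self.metrics["denied"] += 1
            return "403 forbidden"
        start = time.perf_counter()
        try:
            # Only now does the request reach the business logic.
            return self.handler(payload)
        finally:
            self.metrics["total_seconds"] += time.perf_counter() - start


def inventory_service(payload: str) -> str:
    # Pure business logic: no security or metrics code in here.
    return f"in stock: {payload}"


proxy = SidecarProxy("inventory", inventory_service, allowed_callers=["orders"])
print(proxy.handle("orders", "sku-42"))   # in stock: sku-42
print(proxy.handle("unknown", "sku-42"))  # 403 forbidden
```

The point of the sketch is the separation: `inventory_service` contains no access-control or metrics code, yet both are applied to every call.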

Control Plane: The control plane is the centralized management and configuration layer of the service mesh. It is responsible for controlling and coordinating the behavior of the sidecar proxies. It provides a control plane API that allows administrators to configure policies, rules, and settings for traffic management, security, and observability.

  1. API Endpoints: API endpoints are the entry points through which services within the mesh can communicate with each other.
  2. Controllers: A controller is a component responsible for managing and controlling the behavior of the mesh. It is typically a software component that monitors the state and health of services, configures traffic routing and load balancing rules, enforces security policies, and handles other aspects of service-to-service communication within the mesh.
  3. Service Discovery: Service discovery is an essential component in service mesh architecture. It enables services to dynamically locate and connect with each other without hard-coded addresses.
  4. Certificate Authority: It provides and manages root and intermediate certificates and performs certificate signing operations. 
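As a rough illustration of what service discovery does, the sketch below keeps an in-memory registry of service instances and hands them out in round-robin order. It is a simplified assumption of how such a registry might behave, not the implementation used by any particular mesh. The class name `ServiceRegistry` and the addresses are hypothetical.

```python
from typing import Dict, List


class ServiceRegistry:
    """Minimal in-memory service registry with round-robin resolution."""

    def __init__(self) -> None:
        self._instances: Dict[str, List[str]] = {}
        self._next_index: Dict[str, int] = {}

    def register(self, service: str, address: str) -> None:
        # Called when a new instance of a service starts up.
        self._instances.setdefault(service, []).append(address)

    def deregister(self, service: str, address: str) -> None:
        # Called when an instance shuts down or fails its health check.
        instances = self._instances.get(service, [])
        if address in instances:
            instances.remove(address)

    def resolve(self, service: str) -> str:
        # Callers ask for a service by name, never by hard-coded address.
        instances = self._instances.get(service)
        if not instances:
            raise LookupError(f"no instances registered for {service!r}")
        index = self._next_index.get(service, 0) % len(instances)
        self._next_index[service] = index + 1
        return instances[index]


registry = ServiceRegistry()
registry.register("payments", "10.0.0.5:8080")
registry.register("payments", "10.0.0.6:8080")
print([registry.resolve("payments") for _ in range(3)])
```

Because callers only know the name `payments`, instances can come and go without any change to the calling code.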

Application Microservices: These are the individual services or microservices that make up the application. They are responsible for handling specific functions or tasks.

Use Case: E-commerce Application

Consider an e-commerce application. A service mesh would help manage the complex network of microservices responsible for different functions, such as inventory management, order processing, payment processing, and shipping.

  • The sidecar proxies would handle load balancing, ensuring that traffic is distributed efficiently across multiple instances of each service.
  • Additionally, the service mesh would provide secure communication between services by enforcing encryption and authentication using TLS. This would help protect sensitive customer information during transmission and prevent unauthorized access to critical services.
  • Traffic management features would allow operators to control and monitor the flow of requests, enabling them to perform tasks like routing certain requests to a newer version of a service for testing purposes or limiting the rate of incoming requests to prevent overloading.
  • The observability and monitoring capabilities of the service mesh would provide operators with real-time insights into the application’s performance, enabling them to identify and resolve issues promptly.
  • They could analyze metrics, logs, and traces to optimize the application’s performance, troubleshoot problems, and ensure a smooth customer experience.

Overall, a service mesh simplifies the management and enhances the resilience, security, and observability of a distributed application, making it an essential component in modern microservices architectures.
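The idea of routing certain requests to a newer version of a service can be sketched in a few lines of Python. This is a hedged illustration only: the 90/10 weights, the version names `v1` and `v2`, and the `x-canary-version` header are assumptions made up for the example, and real meshes express such rules as configuration rather than application code.

```python
import random
from typing import Dict


def pick_version(weights: Dict[str, int], rng: random.Random) -> str:
    """Pick a service version at random, proportionally to its weight."""
    versions = list(weights)
    return rng.choices(versions, weights=[weights[v] for v in versions], k=1)[0]


def route(headers: Dict[str, str], weights: Dict[str, int],
          rng: random.Random) -> str:
    """Route a request: testers can force a version through a header,
    everyone else is split according to the configured weights."""
    forced = headers.get("x-canary-version")
    if forced in weights:
        return forced
    return pick_version(weights, rng)


weights = {"v1": 90, "v2": 10}  # send roughly 10% of traffic to the canary
rng = random.Random(7)          # seeded so the example is repeatable

sample = [route({}, weights, rng) for _ in range(10_000)]
print("share of v2:", sample.count("v2") / len(sample))
print(route({"x-canary-version": "v2"}, weights, rng))  # v2
```

Shifting more traffic to the new version is then just a matter of changing the weights, with no redeployment of the services themselves.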

What problems do Service Meshes solve?

A service mesh solves several problems in the context of modern application architectures. Here are some of the key problems that it addresses:

  1. Service-to-service communication: In a microservices architecture, applications are composed of multiple independent services that need to communicate with each other. Service mesh provides a dedicated infrastructure layer to handle service-to-service communication, making it easier to manage and secure these interactions.
  2. Service discovery and load balancing: As the number of services increases, it becomes challenging to keep track of their locations and distribute traffic efficiently. Service mesh offers service discovery and load balancing capabilities, allowing services to discover and connect to each other dynamically while automatically distributing the traffic load across multiple instances.
  3. Traffic management and routing: Service mesh enables sophisticated traffic management and routing features, such as request routing based on service version, path, headers, or other attributes. It allows for traffic shifting, canary deployments, and A/B testing, empowering teams to implement complex deployment strategies with ease.
  4. Resilience and fault tolerance: Service mesh provides mechanisms for implementing resilience and fault tolerance patterns, such as retries, timeouts, circuit breaking, and load shedding. These features help services handle failures gracefully, isolate issues, and prevent cascading failures across the system.
  5. Observability and Debugging: Service mesh provides developers with powerful observability features such as distributed tracing, metrics collection, and logging. These capabilities help developers gain insights into the behavior and performance of their services, allowing them to debug issues, trace requests across service boundaries, and optimize the performance of their applications.
  6. Security and authentication: Service mesh strengthens the security of microservices architectures by providing features like transport-level encryption (TLS), mutual authentication, and authorization policies. It allows for fine-grained access control and identity management, enhancing the overall security posture of the system.
  7. Tight coupling of source code: Cross-cutting cloud configuration is often tightly coupled with business logic source code, which makes the codebase heavy and harder to manage and debug. Adding new business features and resolving issues then becomes a cumbersome task. Adopting a service mesh architecture separates the cross-cutting concerns from the business logic source code, so that they can be configured and managed independently by DevOps platform/infrastructure teams.
  8. Testing overhead of cross-cutting configuration concerns: When cross-cutting configuration code lives in the application, integration, regression, and load testing for each feature release take additional effort, because the entire codebase, including the cross-cutting configuration code, must be tested even for minor changes in the business logic. With a service mesh, the business logic code becomes more concise and streamlined, resulting in easier and faster testing. Developers also have fewer JUnit and integration test cases to write.
  9. Application performance issues: When business logic and cross-cutting configuration code are combined, the application needs extra time to load, deploy, and run in its container. It also consumes extra CPU and RAM even for business-specific API calls, which can cause performance issues. In contrast, a service mesh runs the cross-cutting concerns in a separate sidecar container. This takes load off the main application container, which now runs only the streamlined business logic and can perform better as a result.
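Of the resilience patterns listed above, retries are the easiest to picture. The sketch below shows retries with exponential backoff in plain Python. It is an illustration under stated assumptions: the function name `call_with_retries`, the choice of `ConnectionError` as the retryable failure, and the default delays are all hypothetical, and in a service mesh this logic would live in the sidecar proxy rather than in application code.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retries(operation: Callable[[], T], max_attempts: int = 3,
                      base_delay: float = 0.1,
                      sleep: Callable[[float], None] = time.sleep) -> T:
    """Call `operation`, retrying on ConnectionError with exponential backoff.

    The delay doubles after each failed attempt: base_delay, 2x, 4x, ...
    The last failure is re-raised so the caller can still react to it.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except ConnectionError:
            if attempt == max_attempts:
                raise
            sleep(base_delay * 2 ** (attempt - 1))
    raise AssertionError("unreachable")


attempts = {"count": 0}


def flaky_payment_call() -> str:
    # Hypothetical downstream call that fails twice, then succeeds.
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise ConnectionError("payment service unavailable")
    return "payment accepted"


print(call_with_retries(flaky_payment_call, base_delay=0.01))  # payment accepted
```

Capping the number of attempts matters as much as retrying: unbounded retries against a struggling service are one way cascading failures start.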

What key features should you look for when selecting a Service Mesh?

  • Connect Kubernetes clusters: It provides connectivity between two or more Kubernetes clusters when used with hybrid cloud technologies like Google Anthos, Azure Arc, AWS Outposts, VMware Tanzu Mission Control (TMC), etc. The clusters can be spread across on-premises, private, and public cloud providers.
  • Service discovery with the Ingress Controller and Ingress resources: It provides dynamic service discovery and routing to distributed microservice REST APIs across K8s clusters on multiple clouds with different dynamic IP addresses. It exposes the service by its service name through the Ingress Controller and Ingress resources, which can be used by any client or consumer. The ingress resource provides routing details to various services, and the ingress controller routes incoming requests to the API using the ingress resource.
  • Circuit breaker resiliency: A circuit breaker stops sending requests to a dependent service once that service has failed repeatedly or has not responded within a given timeout, and lets traffic through again only after a cool-down period. Combined with retries, this makes microservices more resilient to downtime, since a service mesh can fail fast or reroute requests away from failed services instead of piling more load on them.
  • API Tracing between microservices: It provides API tracing (API-to-API interactions) for microservices, recording request and response interactions as they cross service boundaries. This tracing helps improve API performance and meet SLAs, and it helps developers debug and diagnose bugs.
  • Observability: It provides a powerful mechanism to check application health and infra resources like CPU and memory usage. It also collects application performance metrics and visualizes them on a web dashboard. These metrics can suggest ways to optimize communication in the runtime environment, and they cover both infrastructure and application monitoring.
  • Data Payload Security: It encrypts data in transit between microservice APIs by applying mutual TLS (mTLS), in which both sides of a connection authenticate each other.
  • API Rate Limiting: It provides a mechanism to restrict the number of backend API calls and to mitigate denial-of-service (DoS/DDoS) attacks, where thousands or even millions of requests hit backend APIs and can crash the entire backend software system and infrastructure.
  • Load balancing: It provides load balancing through its built-in ingress controller, which exposes microservices running on Kubernetes clusters as external services behind the ingress controller load balancer. The ingress controller maps and routes client requests to distributed microservices based on ingress resources.
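Since circuit breaking is often confused with retrying, here is a small Python sketch of the pattern on its own. It is an illustrative toy, not the algorithm of any specific mesh: the class names `CircuitBreaker` and `CircuitOpenError`, the threshold of failures, and the reset timeout are assumptions chosen for the example.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class CircuitOpenError(RuntimeError):
    """Raised when a call is rejected because the circuit is open."""


class CircuitBreaker:
    """closed -> open after N consecutive failures;
    open -> half-open after a cool-down; one success closes it again."""

    def __init__(self, failure_threshold: int = 3, reset_timeout: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self._clock = clock
        self._failures = 0
        self._opened_at: Optional[float] = None

    @property
    def state(self) -> str:
        if self._opened_at is None:
            return "closed"
        if self._clock() - self._opened_at >= self.reset_timeout:
            return "half-open"
        return "open"

    def call(self, operation: Callable[[], T]) -> T:
        state = self.state
        if state == "open":
            # Fail fast instead of sending more load to a failing service.
            raise CircuitOpenError("circuit is open; failing fast")
        try:
            result = operation()
        except Exception:
            self._failures += 1
            # A failed trial call in half-open re-opens the circuit at once.
            if state == "half-open" or self._failures >= self.failure_threshold:
                self._opened_at = self._clock()
            raise
        # Any success resets the breaker to the closed state.
        self._failures = 0
        self._opened_at = None
        return result


breaker = CircuitBreaker(failure_threshold=2, reset_timeout=5.0)
print(breaker.state)  # closed
```

While the circuit is open, callers get an immediate `CircuitOpenError` rather than waiting on a service that is known to be failing, which is what keeps one slow dependency from dragging down everything that calls it.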

Popular Service Meshes

Istio (OSS)

Istio is an open-source service mesh platform that provides a set of tools and capabilities for managing and securing microservices-based applications. It aims to address common challenges associated with service-to-service communication, observability, security, and traffic management in complex distributed systems. At its core, Istio deploys a sidecar proxy, called Envoy, alongside each microservice in the application. This sidecar proxy intercepts and manages all inbound and outbound traffic for the service, allowing Istio to control and monitor the communication between services.

Advantages:

  • Istio boasts one of the largest service mesh communities and is widely discussed online. Its GitHub contributors outnumber those of Linkerd by a significant margin.
  • Furthermore, it offers support for both Kubernetes and VM modes.

Drawbacks:

  • Although Istio is open source, adopting it is not free of cost. It demands a considerable time investment in terms of reading the documentation, setting it up, ensuring proper functionality, and ongoing maintenance.
  • The implementation and integration of Istio into production can range from several weeks to several months, depending on the complexity of the infrastructure.
  • Using Istio requires a significant amount of resource overhead. 
  • Unlike Linkerd, it lacks a built-in administrative dashboard. 
  • Additionally, Istio mandates the use of its own ingress gateway. 
  • The Istio control plane is supported only on Kubernetes. VM workloads can join the data plane, but there is no VM-only mode that runs the whole mesh without a Kubernetes cluster.

Linkerd

Linkerd is an open-source service mesh platform designed to provide observability, reliability, and security to microservices architectures. It was created by Buoyant, is hosted by the Cloud Native Computing Foundation (CNCF), and focuses on simplicity, performance, and ease of use.

Advantages:

  • Linkerd leverages the expertise of its creators, who are former Twitter engineers with experience in developing the internal tool, Finagle. They gained valuable insights from working on Linkerd v1, which contributes to the refinement of the service mesh. 
  • Being one of the pioneering service meshes, Linkerd enjoys an active and vibrant community, boasting more than 5,000 users on Slack, along with an engaged mailing list and Discord server. 
  • The availability of comprehensive documentation and tutorials further enhances its appeal.
  • Linkerd has reached a level of maturity with the release of version 2.9, which is evident from its adoption by prominent corporations such as Nordstrom, eBay, Strava, Expedia, and Subspace. 
  • Additionally, Linkerd offers paid enterprise-grade support through Buoyant, ensuring professional assistance is readily available.

Drawbacks:

  • Using Linkerd to its full potential involves a significant learning curve. It is important to note that Linkerd is exclusively supported within Kubernetes containers and does not offer a VM-based or “universal” mode.
  • Linkerd uses its own purpose-built sidecar proxy rather than Envoy, which gives Buoyant the flexibility to optimize it according to their requirements. However, this customization comes at the expense of losing the inherent extensibility offered by Envoy.
  • Consequently, as of the 2.9 release discussed here, Linkerd lacks support for essential features such as circuit breaking, delay injection, and rate limiting. Additionally, there is no straightforward API exposed for easy control of the Linkerd control plane, although a gRPC API binding can be found.

In case you wish to read more about the above service meshes comparison and what more they have to offer, you can read all about it here.

That’s not all: there are many more options on the market for you to choose from.

Conclusion

Service mesh technology is a boon for developers. It increases developer productivity by moving cross-cutting concerns out of application source code and handing them over to in-house DevSecOps teams. A service mesh also provides many more features that solve developer challenges. It’s now a de facto standard for managing cross-cutting configuration code for cloud-native microservice apps on Kubernetes.

Published by

Rajiv Srivastava

Principal Architect with Wells Fargo
