May 2019
Data Plane
Ingress Gateway
Egress Gateway
No control plane? Not a service mesh.
Egress Gateway
Control Plane
Data Plane
Ingress Gateway
Control Plane
Data Plane
You need a management plane.
Ingress Gateway
Egress Gateway
Management
Plane
Pilot
Citadel
Mixer
Control Plane
Data Plane
istio-system namespace
policy check
Foo Pod
Proxy Sidecar
Service Foo
tls certs
discovery & config
Foo Container
Bar Pod
Proxy Sidecar
Service Bar
Bar Container
Out-of-band telemetry propagation
telemetry
reports
Control flow during request processing
application traffic
Application traffic
application namespace
telemetry reports
Galley
Ingress Gateway
Egress Gateway
Control Plane
Data Plane
linkerd-system namespace
Foo Pod
Proxy Sidecar
Service Foo
Foo Container
Bar Pod
Proxy Sidecar
Service Bar
Bar Container
Out-of-band telemetry propagation
telemetry
scarping
Control flow during request processing
application traffic
Application traffic
application namespace
telemetry scraping
destination
Prometheus
Grafana
tap
web
CLI
proxy-api
public-api
proxy-injector
Client
Edge Cache
Istio Gateway
(envoy)
Cache Generator
Collection of VMs running APIs
service mesh
Istio VirtualService
Istio VirtualService
Istio ServiceEntry
Situation:
Benefits:
Out-of-band telemetry propagation
Control flow during request processing
Application traffic
Service A
Service A
Service A
linkerd
Node (server)
Service A
Service A
Service B
linkerd
Node (server)
Service A
Service A
Service C
linkerd
Node (server)
Advantages:
Less (memory) overhead.
Simpler distribution of configuration information.
primarily physical or virtual server based; good for large monolithic applications.
Disadvantages:
Coarse support for encryption of service-to-service communication, instead host-to-host encryption and authentication policies.
Blast radius of a proxy failure includes all applications on the node, which is essentially equivalent to losing the node itself.
Not a transparent entity, services must be aware of its existence.
at
Advantages:
Good starting point for building a brand-new microservices architecture or for migrating from a monolith.
Disadvantages:
When the number of services increase, it becomes difficult to manage.
Mixer
Control Plane
Data Plane
istio-system namespace
Foo Pod
Proxy sidecar
Service Foo
Foo Container
Out-of-band telemetry propagation
Control flow during request processing
application traffic
application traffic
application namespace
telemetry reports
an attribute processing engine
AppOpticsâ„¢
types: logs, metrics, access control, quota
Papertrailâ„¢
Prometheusâ„¢
Stackdriverâ„¢
Open Policy Agent
Grafanaâ„¢
Fluentd
Statsd
®
Pilot
Citadel
Mixer
istio-system namespace
Galley
Control Plane
a multi-service mesh management plane
https://layer5.io/meshery
Service Mesh Interface (SMI)
a multi-service mesh management plane
Service Mesh Interface (SMI)
@lcalcote
layer5.io/books
layer5.io/landscape
Playground
WHICH SERVICE MESH SHOULD I USE AND HOW DO I GET STARTED?
Learn about the functionality of different service meshes and visually manipulate mesh configuration.
Performance Benchmark
WHAT OVERHEAD DOES BEING ON THE SERVICE MESH INCUR?
Benchmark the performance of your application across different service meshes and compare their overhead.
layer5.io/meshery
@lcalcote
Istio
Linkerd
Octarine
NSM
App Mesh
@lcalcote
results coming forthcoming at KubeCon EU...
Consul
Up next...
(service meshes contributing adapters)
Kubernetes
(no mesh)
Demo
@lcalcote
layer5.io/meshery
Cores | Threads | Istio (2) | Linkerd |
---|---|---|---|
8 | 8 | 1 | 1 |
8 | 16 | 1.7 | 1.8 |
8 | 32 | 3.2 | 3.4 |
8 | 100 | 9.3 | 9.6 |
(2) mTLS on, tracing off
@lcalcote
layer5.io/meshery
Cores | Threads | Istio (1) | Istio (2) | Linkerd |
---|---|---|---|---|
8 | 8 | 1 | 1 | 1 |
8 | 16 | 1.4 | 1.7 | 1.8 |
8 | 32 | 18.4 | 3.2 | 3.4 |
8 | 100 | 52.2 | 9.3 | 9.6 |
(1) mTLS on, tracing on
(2) mTLS on, tracing off
Mixer
Control Plane
Data Plane
istio-system namespace
Foo Pod
Proxy sidecar
Service Foo
Foo Container
Out-of-band telemetry propagation
Control flow during request processing
application traffic
application traffic
application namespace
telemetry reports
an attribute processing engine
@lcalcote
layer5.io/meshery
@lcalcote
layer5.io/meshery
@lcalcote
layer5.io/meshery
A project and vendor-neutral specification for capturing details of:
Environment / Infrastructure
Number and size of nodes, orchestrator
Service mesh and its configuration
Service / application details
Bundled with test results.
github.com/layer5io/service-mesh-benchmark-spec
@lcalcote
layer5.io/meshery
@lcalcote
layer5.io/meshery
Service Mesh Community
Layer5.io
a dedicated layer for managing service-to-service communication
So, a microservices platform?
obviously.
Orchestrators don't bring all that you need
and neither do service meshes,
but they do get you closer.
Missing: application lifecycle management, but not by much
partially.
Missing: distributed debugging; provide nascent visibility (topology)
First few services are relatively easy
Democratization of language and technology choice
Faster delivery, service teams running independently, rolling updates
Next 10 or so may introduce pain
Language and framework-specific libraries
Distributed environments, ephemeral infrastructure, out-moded tooling
to avoid...
Bloated service code
Duplicating work to make services production-ready
Load balancing, auto scaling, rate limiting, traffic routing...
Inconsistency across services
Retry, tls, failover, deadlines, cancellation, etc., for each language, framework
Siloed implementations lead to fragmented, non-uniform policy application and difficult debugging
Diffusing responsibility of service management
• Observability
• Logging
• Metrics
• Tracing
• Traffic Control
• Resiliency
• Efficiency
• Security
• Policy
what gets people hooked on service metrics
Metrics without instrumenting apps
Consistent metrics across fleet
Trace flow of requests across services
Portable across metric back-end providers
You get a metric! You get a metric! Everyone gets a metric!
© 2018 SolarWinds Worldwide, LLC. All rights reserved.
control over chaos
Timeouts and Retries with timeout budget
Control connection pool size and request load
Circuit breakers and Health checks
content-based traffic steering
Web
Service Foo
Timeout = 600ms
Retries = 3
Timeout = 300ms
Retries = 3
Timeout = 900ms
Retries = 3
Service Bar
Database
Timeout = 500ms
Retries = 3
Timeout = 300ms
Retries = 3
Timeout = 900ms
Retries = 3
Web
Service Foo
Deadline = 600ms
Deadline = 496ms
Service Bar
Database
Deadline = 428ms
Deadline=180ms
Elapsed=104ms
Elapsed=68ms
Elapsed=248ms
where Dev and Ops meet
Problem: too much infrastructure code in services