Characterizing and Contrasting Container Orchestrators

 

Lee Calcote

LinuxCon+ContainerCon, August 2016

Lee Calcote

clouds, containers, infrastructure, applications and their management


[kuhn-tey-ner]

[awr-kuh-streyt-or]

Definition:

Fleet
Nomad
Swarm
Kubernetes
Mesos+Marathon

(Stay tuned for updates to presentation)

One size does not fit all.

A strict apples-to-apples comparison is inappropriate and not the objective, hence characterizing and contrasting.

Let's not go here today.

Container orchestrators may be intermixed.

Categorically Speaking

  • Genesis & Purpose

  • Support & Momentum

  • Host & Service Discovery

  • Scheduling

  • Modularity & Extensibility

  • Updates & Maintenance

  • Health Monitoring

  • Networking & Load-Balancing

  • High Availability & Scale

Hypervisor Manager Elements

  • Compute

  • Network

  • Storage

Container Orchestrator Elements

  • Host (Node)
  • Container
  • Service
  • Volume
  • Applications

Core Capabilities

  • Cluster Management

    • Host Discovery

    • Host Health Monitoring

  • Scheduling

  • Orchestrator Updates and Host Maintenance

  • Service Discovery

  • Networking and Load-Balancing

Additional Key Capabilities

  • Application Health Monitoring

  • Application Deployments

  • Application Performance Monitoring

Docker Swarm

Genesis & Purpose

  • Swarm is simple and easy to set up.
     

  • Swarm is responsible for the clustering and scheduling aspects of orchestration.  
     

  • Originally an imperative system, now declarative
     

  • Swarm’s architecture is not as complex as those of Kubernetes and Mesos
     

  • Written in Golang, Swarm is lightweight, modular and extensible

Docker Swarm 1.12, aka Swarmkit or Swarm mode

Docker Swarm 1.11 (Standalone)

Docker Swarm Mode 1.12

Support & Momentum

  • Contributions:

    • Standalone: ~3,000 commits, 12 core maintainers (140 contributors)

    • Swarmkit: ~2,000 commits, 12 core maintainers (40 contributors)
       

  • ~250 Docker meetups worldwide
     

  • Production-ready:

    • Standalone announced 8 months ago (Nov 2015)

    • Swarmkit announced < 1 month ago (July 2016)

Host & Service Discovery

Host Discovery

  • used by the Manager to discover Nodes (hosts) when forming a cluster
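In standalone Swarm, host discovery is typically backed by an external key/value store such as Consul; Swarm mode (1.12) builds discovery in via docker swarm join. A minimal standalone sketch, assuming a Consul server at 192.168.1.5 and placeholder node addresses:

$ # on each node: register the engine with the discovery backend
$ docker run -d swarm join --advertise=192.168.1.10:2375 consul://192.168.1.5:8500
$ # on the manager: form the cluster from the discovered nodes
$ docker run -d -p 4000:4000 swarm manage -H :4000 consul://192.168.1.5:8500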

 

Service Discovery

  • Embedded DNS and round robin load-balancing

  • Services are a new concept 

 


Scheduling

  • Swarm’s scheduler is pluggable

  • Swarm scheduling is a combination of strategies and filters/constraints: 

    • Strategies

      • Random 

      • Binpack

      • Spread*

      • Plugin?

    • Filters

      • container constraints (affinity, dependency, port) are defined as environment variables in the specification file

      • node constraints (health, constraint) must be specified when starting the docker daemon and define which nodes a container may be scheduled on.

* Swarm Mode only supports the Spread strategy
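A sketch of standalone-Swarm (circa 1.11) scheduling: the strategy is chosen when the manager starts, node constraints come from engine labels, and container constraints are passed as environment variables (addresses, labels and image names are placeholders):

$ # choose a packing strategy when starting the manager
$ docker run -d -p 4000:4000 swarm manage --strategy binpack consul://192.168.1.5:8500
$ # node constraint: label the engine when starting the daemon
$ docker daemon --label storage=ssd
$ # constraint filter: schedule only onto ssd-labeled nodes (via the manager)
$ docker -H tcp://192.168.1.2:4000 run -d -e constraint:storage==ssd --name db mysql
$ # affinity filter: co-locate with the db container
$ docker -H tcp://192.168.1.2:4000 run -d -e affinity:container==db --name db-backup mybackup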

Modularity & Extensibility

Ability to remove batteries is a strength for Swarm:

  • Pluggable scheduler

  • Pluggable network driver

  • Pluggable distributed K/V store

  • Docker container engine runtime-only
     

  • Pluggable authorization (in docker engine)*


Updates & Maintenance

Nodes

  • Node availability may be set to Active, Pause, or Drain (see the sketch below)

  • Manual swarm manager and worker updates
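A sketch of draining a Swarm-mode node for maintenance (the node name worker-1 is a placeholder):

$ # drain: running tasks are rescheduled onto other nodes
$ docker node update --availability drain worker-1
$ # return the node to service after maintenance
$ docker node update --availability active worker-1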
     

Applications

  • Rolling updates now supported

    • --update-delay

    • --update-parallelism

    • --update-failure-action
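A sketch of a rolling update in Swarm mode, assuming an existing service named web and a placeholder image tag:

$ # update two tasks at a time, waiting 10s between batches;
$ # --update-failure-action controls what happens if an update fails
$ docker service update --update-parallelism 2 --update-delay 10s \
    --image myregistry/web:2.0 web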


Health Monitoring

Nodes

  • Swarm monitors the availability and resource usage of nodes within the cluster

 

Applications

  • One health check per container may be run
    • check container health by running a command inside the container
      • --interval=DURATION (default: 30s)
      • --timeout=DURATION (default: 30s)
      • --retries=N (default: 3)
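These options map directly onto docker run flags in 1.12; a sketch using nginx and a placeholder check command:

$ docker run -d --name web \
    --health-cmd='curl -f http://localhost/ || exit 1' \
    --health-interval=30s --health-timeout=30s --health-retries=3 \
    nginx
$ # inspect the resulting health state
$ docker inspect --format '{{.State.Health.Status}}' web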

Networking & Load-Balancing

  • Swarm and Docker’s multi-host networking are simpatico

    • provides for user-defined overlay networks that are micro-segmentable

    • uses a gossip protocol for quick convergence of neighbor tables

    • facilitates container name resolution via embedded DNS server (previously via /etc/hosts)

  • You may bring your own network driver

  • Load-balancing based on IPVS

    • exposes a Service's port externally

    • L4 load-balancer; cluster-wide port publishing

  • Mesh routing

    • send a request to any one of the nodes and it will be routed automatically

    • send a request to any one of the nodes and it will be internally load-balanced
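A sketch of the routing mesh: publish a port cluster-wide, then hit any node (service name, replica count and port are placeholders):

$ docker service create --name web --replicas 3 --publish 8080:80 nginx
$ # any node answers on the published port; IPVS load-balances to a task
$ curl http://<any-node-ip>:8080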

High Availability & Scale

  • Managers may be deployed in a highly-available configuration

    • Active/Standby - only one active Leader at a time

    • Maintain an odd number of managers
       

  • Rescheduling upon node failure

    • No rebalancing upon node addition to the cluster
       

  • Does not support multiple failure isolation regions or federation
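A sketch of forming a highly-available Swarm-mode control plane with an odd number of managers (addresses are placeholders):

$ # first manager
$ docker swarm init --advertise-addr 10.0.0.1
$ # print the join command/token for additional managers
$ docker swarm join-token manager
$ # run the printed command on two more nodes for a three-manager quorum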

Scaling Swarm to 1,000 AWS nodes and 50,000 containers

  • Suitable for orchestrating a combination of infrastructure containers

    • Has only recently added capabilities falling into the application bucket

  • Swarm is a young project

    • advanced features forthcoming

    • natural expectation of caveats in functionality

  • No rebalancing, autoscaling or monitoring, yet

  • Only schedules Docker containers, not containers using other specifications.

    • Does not schedule VMs or non-containerized processes

  • Need separate load-balancer for overlapping ingress ports

  • While dependency and affinity filters are available, Swarm cannot enforce that two containers be scheduled onto the same host together or not at all.

    • Filters facilitate the sidecar pattern. No “pod” concept.

  • Swarm works. Swarm is simple and easy to deploy.

    • 1.12 eliminated the need for much third-party software

    • Facilitates earlier stages of adoption by organizations viewing containers as faster VMs

    • now with built-in functionality for applications

  • Swarm is easy to extend; if you already know the Docker APIs, you can customize Swarm

  • Highly modular:

    • Pluggable scheduler

    • Pluggable K/V store for both node and multi-host networking

Kubernetes

Genesis & Purpose

  • an opinionated framework for building distributed systems

    • or as its tagline states "an open source system for automating deployment, scaling, and operations of applications."

  • Written in Golang, Kubernetes is lightweight, modular and extensible

  • considered a third-generation container orchestrator led by Google, Red Hat and others.

    • bakes in load-balancing, scale, volumes, deployments, secret management and cross-cluster federated services among other features.

  • Declarative and opinionated, with many key features included

 

Kubernetes Architecture

Support & Momentum

  • Kubernetes is young (about two years old)

    • Announced as production-ready 13 months ago (July 2015)
       

  • Project currently has over 1,000 commits per month (~34,000 total)

    • made by about 100 contributors each month (862 in total), known as Kubernauts (Kubernetes enthusiasts)

    • ~5,000 commits made in the latest release (1.3)
       

  • Under the governance of the Cloud Native Computing Foundation
     

  • Robust set of documentation and ~90 meetups 

Host & Service Discovery

Host Discovery

  • by default, the node agent (kubelet) is configured to register itself with the master (API server)

    • automating the joining of new hosts to the cluster

Service Discovery

Two primary modes of finding a Service

  • DNS

    • SkyDNS is deployed as a cluster add-on

  • environment variables

    • environment variables are used as a simple way of providing compatibility with Docker links-style networking
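For example, for a Service named redis-master, the kubelet injects Docker-links-style variables into pods started afterward; a sketch with placeholder names and addresses:

$ kubectl exec mypod -- env | grep REDIS_MASTER
REDIS_MASTER_SERVICE_HOST=10.0.0.11
REDIS_MASTER_SERVICE_PORT=6379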


Scheduling

  • By default, scheduling is handled by kube-scheduler.

  • Pluggable

  • Selection criteria used by kube-scheduler to identify the best-fit node is defined by policy:

    • Predicates (node resources and characteristics):

      • PodFitsPorts, PodFitsResources, NoDiskConflict, MatchNodeSelector, HostName, ServiceAffinity, LabelsPresence

    • Priorities (weighted strategies used to identify “best fit” node):

      • LeastRequestedPriority, BalancedResourceAllocation, ServiceSpreadingPriority, EqualPriority
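Predicates and priorities are selected via a policy file handed to kube-scheduler; a minimal sketch of that era's format (file path and weights are placeholders):

$ cat > /etc/kubernetes/scheduler-policy.json <<'EOF'
{
  "kind": "Policy",
  "apiVersion": "v1",
  "predicates": [
    {"name": "PodFitsPorts"},
    {"name": "PodFitsResources"},
    {"name": "NoDiskConflict"},
    {"name": "MatchNodeSelector"}
  ],
  "priorities": [
    {"name": "LeastRequestedPriority", "weight": 1},
    {"name": "BalancedResourceAllocation", "weight": 1}
  ]
}
EOF
$ kube-scheduler --policy-config-file=/etc/kubernetes/scheduler-policy.json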

Modularity & Extensibility

  • One of Kubernetes’ strengths is its pluggable architecture

  • Choice of:

    • database for service discovery or network driver

    • container runtime

      • users may choose to run Docker or rkt (Rocket) containers

  • Cluster add-ons

    • optional system components that implement a cluster feature (e.g. DNS, logging, etc.)

    • shipped with the Kubernetes binaries and considered an inherent part of a Kubernetes cluster

       

Updates & Maintenance

Applications

  • Deployment objects automate deploying and rolling updates of applications.

  • Support for rolling back deployments
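A sketch of a Deployment rolling update and rollback with kubectl (deployment and image names are placeholders):

$ kubectl set image deployment/web web=myregistry/web:2.0   # triggers a rolling update
$ kubectl rollout status deployment/web
$ kubectl rollout undo deployment/web                       # roll back to the previous revision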

Kubernetes Components

  • Upgrading the Kubernetes components and hosts is done via shell script 

  • Host maintenance - mark the node as unschedulable.

    • existing pods are not vacated from the node

    • prevents new pods from being scheduled on the node
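A sketch of that maintenance flow (the node name is a placeholder):

$ kubectl cordon node-1     # mark unschedulable; existing pods keep running
$ kubectl drain node-1      # optionally evict the existing pods too
$ kubectl uncordon node-1   # after maintenance, allow scheduling again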


Health Monitoring

Nodes

  • Failures - actively monitors the health of nodes within the cluster

    • via Node Controller

  • Resources - usage monitoring leverages a combination of open source components:

    • cAdvisor, Heapster, InfluxDB, Grafana

Applications 

  • three types of user-defined application health checks, with the Kubelet agent acting as the health-check monitor

    • HTTP Health Checks, Container Exec, TCP Socket
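A minimal HTTP liveness-probe sketch, created via heredoc (pod name, port and timings are placeholders):

$ cat <<EOF | kubectl create -f -
apiVersion: v1
kind: Pod
metadata:
  name: liveness-demo
spec:
  containers:
  - name: web
    image: nginx
    livenessProbe:
      httpGet:
        path: /
        port: 80
      initialDelaySeconds: 5
      periodSeconds: 10
EOF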

Cluster-level Logging

  • collects logs that persist beyond the lifetime of a pod’s containers, the pod itself, or even the cluster

    • standard output and standard error of each container can be ingested using a Fluentd agent running on each node

Networking & Load-Balancing

…enter the Pod

  • atomic unit of scheduling

  • flat networking with each pod receiving an IP address

  • no NAT required, port conflicts localized

  • intra-pod communication via localhost

Load-Balancing

  • Services provide inherent load-balancing via kube-proxy:

    • runs on each node of a Kubernetes cluster

    • reflects services as defined in the Kubernetes API

    • supports simple TCP/UDP forwarding, with round-robin and Docker-links-based service IP:PORT mapping

High Availability & Scale

  • Each master component may be deployed in a highly-available configuration.

    • Active/Standby configuration
       

  • In terms of scale, v1.2 brings support for 1,000-node clusters and a step toward fully-federated clusters (Ubernetes)
     

  • Application-level auto-scaling is supported within Kubernetes via Replication Controllers

  • Only runs containerized applications

  • For those familiar only with Docker, Kubernetes requires an understanding of new concepts

    • Powerful frameworks with more moving pieces beget complicated cluster deployment and management.

  • Lightweight graphical user interface

  • Does not provide resource-utilization techniques as sophisticated as those of Mesos

 

 

  • Kubernetes can schedule Docker or rkt containers

  • Inherently opinionated with functionality built-in.

    • little to no third-party software needed

    • builds in many application-level concepts and services (secrets, PetSets, Jobs, DaemonSets, rolling updates, etc.)

    • advanced storage/volume management

  • Kubernetes is arguably moving the quickest

  • Relatively thorough project documentation

  • Multi-master, cross-cluster federation, robust logging & metrics aggregation

 

Mesos + Marathon

Genesis & Purpose

  • Mesos is a distributed systems kernel

    • stitches together many different machines into a logical computer

  • Mesos has been around the longest (launched in 2009)

    • and is arguably the most stable, with highest (proven) scale currently

  • Mesos is written in C++

    • with Java, Python and C++ APIs

  • Marathon as a Framework

    • Marathon is one of a number of frameworks (Chronos and Aurora are other examples) that may be run on top of Mesos

    • Frameworks have a scheduler and executor. Schedulers get resource offers. Executors run tasks.

    • Marathon is written in Scala

Mesos Architecture

Support & Momentum

  • MesosCon 2015 in Seattle had 700 attendees

    • up from 262 attendees in 2014
       

  • 78 contributors in the last year

 

  • Under the governance of Apache Foundation
     

  • Used by Twitter, Airbnb, eBay, Apple, Cisco, Yodle

Host & Service Discovery

  • Mesos-DNS generates an SRV record for each Mesos task

    • including Marathon application instances

  • Marathon will ensure that all dynamically assigned service ports are unique

  • Mesos-DNS is particularly useful when:

    • apps are launched through multiple frameworks (not just Marathon)

    • you are using an IP-per-container solution like Project Calico

    • you use random host port assignments in Marathon
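A sketch of resolving a Marathon-launched app named web through Mesos-DNS (the app name and the default mesos domain are assumptions):

$ dig web.marathon.mesos +short                 # A record for the task's host
$ dig _web._tcp.marathon.mesos SRV +short       # SRV record carries the dynamic port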


Scheduling

  • Two-level scheduler

    • First-level scheduling happens at the Mesos master, based on an allocation policy that decides which frameworks get resource offers

    • Second-level scheduling happens in the framework’s scheduler, which decides which tasks to execute

  • Provides reservations, over-subscription and preemption

Modularity & Extensibility

Frameworks

  • multiple available

  • may run multiple frameworks

Modules

  • extend inner workings of Mesos by creating and using shared libraries that are loaded on demand

  • many types of Modules

    • Replacement, Isolator, Allocator, Authentication, Hook, Anonymous

Updates & Maintenance

Nodes

  • Mesos has a maintenance mode

 

Applications

  • Marathon can be instructed to deploy application containers using a blue/green strategy

    • where old and new versions co-exist for a time


Health Monitoring

Nodes

  • Master tracks a set of statistics and metrics to monitor resource usage

    • Counters and Gauges

Applications

  • support for health checks (HTTP and TCP)

  • an event stream that can be integrated with load-balancers or for analyzing metrics
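A sketch of an app definition with an HTTP health check, posted to the Marathon REST API (host, app id and timings are placeholders; $PORT0 is Marathon's dynamically assigned port):

$ cat > web.json <<'EOF'
{
  "id": "/web",
  "cmd": "python3 -m http.server $PORT0",
  "instances": 2,
  "cpus": 0.25,
  "mem": 64,
  "healthChecks": [{
    "protocol": "HTTP",
    "path": "/",
    "intervalSeconds": 30,
    "timeoutSeconds": 10,
    "maxConsecutiveFailures": 3
  }]
}
EOF
$ curl -X POST http://marathon.example.com:8080/v2/apps \
    -H 'Content-Type: application/json' -d @web.json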

Networking & Load-Balancing

Networking

  • An IP per Container

    • No longer share the node's IP

    • Helps remove port conflicts

    • Enables 3rd party network drivers

  • Container Network Interface (CNI) isolator with the MesosContainerizer

Load-Balancing

  • Marathon offers two TCP/HTTP proxies

    • A simple shell script and a more complex one called marathon-lb that has more features.

    • Pluggable (e.g. Traefik for load-balancing)

High Availability & Scale

  • A strength of Mesos’s architecture

    • requires masters to form a quorum using ZooKeeper (point of failure)

    • only one Active (Leader) master at a time in Mesos and Marathon

 

  • Scale is a strong suit for Mesos. Used at Twitter, Airbnb... TBD for Marathon

 

  • Great at asynchronous jobs. High availability built-in.

    • Referred to as the “golden standard” by Solomon Hykes, Docker CTO.

  • Universal Containerizer

    • abstracts away from docker, rkt, kurma?, runc, appc

  • Can run multiple frameworks, including Kubernetes and Swarm.

  • The only one of the container orchestrators that supports multi-tenancy

  • Good for Big Data shops and job-oriented or task-oriented workloads.

    • Good for mixed workloads and workloads with data-locality policies

  • Powerful, scalable and battle-tested

    • Good when you have multiple large workloads to run on a 10,000+ node cluster

  • Marathon UI is young, but promising

  • Still needs 3rd party tools

  • Marathon interface could be more Docker friendly (hard to get at volumes and registry)

  • May need a dedicated infrastructure IT team

    • an overly complex solution for small deployments

Summary

A high-level perspective of the container orchestrator spectrum.

Lee Calcote

Thank you. Questions?

clouds, containers, infrastructure, applications and their management
