Big Data keeps evolving. Stone Age Hadoop was a lot of Java boilerplate for defining HDFS access, Mapper & Reducer. This was superseded by Bronze Age Spark, which provided a succinct Scala unification of: ML pipelines; in-memory structured Datasets over RDDs via a SparkSession SQL API; distributed Streams. (Note: you can easily run such jobs in a dynamically scalable manner on Google Dataproc.) Technology keeps evolving - the Big-Iron Age has arrived in the form of Google Cloud Platform's SPARK KILLER ......
The Docker CLI commands actually encapsulate REST calls to the host Docker daemon: https://docs.docker.c... The daemon listens on unix:///var/run/docker.sock, which you can curl into:
curl --unix-socket /var/run/docker.sock http://localhost/containers/json
In theory, a container app could be cluster-enabled via startup calls to the host Docker Remote REST API: POST /swarm/join, POST /services/create. Here's a summary of the exposed API surface: Containers: GET /containers/json ......
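The curl call above can also be made from application code. Here is a minimal sketch, assuming only Python's standard library and the default socket path mentioned in the text. `UnixHTTPConnection` and `docker_get` are hypothetical helper names for illustration, not part of any Docker SDK.

```python
import http.client
import json
import socket


class UnixHTTPConnection(http.client.HTTPConnection):
    """HTTP connection that runs over a Unix domain socket instead of TCP."""

    def __init__(self, socket_path, timeout=10):
        # "localhost" only fills the Host header; routing is done by the socket path.
        super().__init__("localhost", timeout=timeout)
        self.socket_path = socket_path

    def connect(self):
        # Replace the default TCP connect with an AF_UNIX stream socket.
        sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        sock.settimeout(self.timeout)
        sock.connect(self.socket_path)
        self.sock = sock


def docker_get(path, socket_path="/var/run/docker.sock"):
    """Send GET <path> to the daemon socket; return (status, parsed JSON body)."""
    conn = UnixHTTPConnection(socket_path)
    try:
        conn.request("GET", path)
        resp = conn.getresponse()
        return resp.status, json.loads(resp.read())
    finally:
        conn.close()
```

With a running daemon and read access to the socket, `docker_get("/containers/json")` should return the HTTP status and the parsed container list. The POST endpoints such as /swarm/join need JSON request bodies and are left out of this sketch.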
HashiCorp Consul https://www.consul.io/ provides an easy-to-use, multi-region Service Discovery / Health-Check + distributed config key-value store. If you have ever hacked away at coding ZooKeeper to support distributed systems, you will find that Consul's agent-based architecture requires an order of magnitude less effort to roll out. Here's my Consul QuickRef: CLI options: agent - Runs a Consul agent: -dev, -server, -config-dir... -atlas=, -atlas-token=xxx, -ui ......