Etcd – the key-value store for Kubernetes

Etcd in Kubernetes: In the Kubernetes architecture, etcd is the data store. It stores the desired state of Kubernetes objects. The API server is the only client that connects to etcd (over gRPC). The cluster builder specifies the etcd endpoint as a parameter to the kube-apiserver process. Other Kubernetes components, whether in the control plane or …

Intro to Big Data Projects

Modern applications produce very large datasets, beyond what traditional data-processing applications can handle. Big data is a discipline that specializes in processing such data, for example through analysis and information extraction. These datasets grow well beyond the capacity of a single computer, which calls for the computing power of multi-node clustered systems. Intensive …

Zookeeper Summary

Distributed systems: A distributed system involves independent computing entities linked together by a network. The components communicate and coordinate with each other to achieve a common goal. In the early days, designers and developers often made certain assumptions, known as the fallacies of distributed computing. These fallacies make coordinating distributed computing entities a huge challenge, and Zookeeper is introduced …

Kafka overview

Zookeeper. General definition of a distributed system: a software system composed of independent computing entities linked together by a computer network, whose components communicate and coordinate with each other to achieve a common computational goal. Implementing coordination among the components of a distributed system is hard. For example, a designated master node becomes a single point of …