Autoscaling on Kubernetes Platform

Introduction. The concept of autoscaling on the Kubernetes platform dates from the era when virtualization first became widespread and provisioning a new server became lightweight through the use of cloud-init. On public cloud, customers are charged through usage-based billing. Autoscaling allows workloads to scale down during idle times to reduce cost, and scale up …

Intro to Big Data Projects

Modern applications produce datasets far larger than traditional data-processing applications can handle. Big data is a discipline that specializes in processing such data, for example analysis and information extraction. These datasets grow well beyond the capacity of a single computer, which calls for computing power delivered by multi-node clustered systems. Intensive …

Zookeeper Summary

Distributed systems. A distributed system involves independent computing entities linked together by a network. The components communicate and coordinate with each other to achieve a common goal. In the early days, designers and developers often made a set of assumptions, known as the fallacies of distributed computing. These fallacies make coordinating distributed computing entities a huge challenge, and Zookeeper is introduced …

Kafka Overview

Zookeeper. General definition of a distributed system: a software system composed of independent computing entities linked together by a computer network, whose components communicate and coordinate with each other to achieve a common computational goal. Implementing coordination among the components of a distributed system is hard. For example, a designated master node becomes a single point of …