Intro to Data Analytics Platform on Azure

Having worked in the transactional data world for almost my entire career, I recently had to pick up quite a few things to catch up on analytical workloads. The main purpose of a data analytics project is to build analysis services models and manage deployed databases. Later in this post I’ll discuss some useful Azure resources for …

High Performance Computing

Overview: High Performance Computing (HPC) has recently been commoditized with the advent of commodity server hardware (x86 servers), virtualization technology and the cloud delivery model. It is common in specialized industries where intensive computing tasks are required, for example: HCL (healthcare and life science): drug discovery, computer-aided diagnosis (CAD), genome engineering; CAD, CAE, CAM (computer …

Spark, Cassandra and Python

In this post we touch briefly on Apache Spark as a cluster computing framework that supports a number of drivers to pipe data in, and whose impressive performance owes much to the resilient distributed dataset (RDD) as its architectural foundation. In this hands-on guide, we expand on how to configure Spark and use Python to …

Intro to Big Data Projects

Modern applications produce datasets far larger than traditional data-processing applications can handle. Big data is a discipline that specializes in processing such data, for example through analysis and information extraction. The scale of large datasets grows well beyond the capacity of a single computer, which calls for computing power delivered by multi-node clustered systems. Intensive …