#### Page 2 of 2

You may have heard of the famous book The Signal and the Noise by Nate Silver. In predictive modeling, you can think of the signal as the true…

This tutorial shows how to install and configure version 5.7.0 of Cloudera Distribution Hadoop (CDH 5) on an Ubuntu 16.04 host using Docker. What’s CDH? CDH (Cloudera’s Distribution…

In pattern recognition, information retrieval, and binary classification, precision (also called positive predictive value) is the fraction of relevant instances among the retrieved instances, while recall (also known…

The Bias-Variance dilemma is relevant for supervised machine learning. It’s a way to diagnose an algorithm’s performance by breaking down its prediction error. There are three types of…

Standardization: Standardization (or Z-score normalization) is the process of rescaling the features so that they have the properties of a standard normal distribution, with μ=0 and σ=1, where μ…

Since the yield keyword is only used with generators, it makes sense to recall the concept of generators first, and before generators come iterables. Iterables: Everything one can…

In machine learning, dimensionality simply refers to the number of features (i.e., input variables in the dataset). When the number of features is very large relative to the number…

There are two types of models: parametric and non-parametric. Let’s start with parametric models. Parametric model: A learning model that summarizes data with a set of parameters of…