Growing and visualizing a decision tree using R

In this post, we shall be exploring decision trees in R.

Decision trees, as you might be aware, are among the most popular and intuitive machine learning models, and are used for both classification and regression. The theory behind them is simple. Say you need to classify a set of observations based on certain factors. The outcome is the response variable, which in classification is a class label. For example, we can take two broad classes, "Good" and "Bad", and classify students as one or the other depending on factors such as grades, discipline and so on. These factors are known as the predictor variables, while "Good" and "Bad" are the two possible values of the response variable.

We shall be using a data set that ships with R, in the rpart package: kyphosis.

This data set contains four columns - Kyphosis, Age, Number and Start, and it explores the presence (or absence) of the kyphosis deformation in ch…
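As a rough sketch of where this is heading, here is one way to grow and plot a classification tree on this data. It assumes the rpart package (a recommended package bundled with standard R distributions) and uses base graphics for the plot; the choice of functions and plotting options is an illustration, not necessarily what the rest of this post uses.

```r
# rpart provides both the kyphosis data frame and the tree-growing function
library(rpart)

# kyphosis has 81 rows and the columns Kyphosis, Age, Number and Start
data(kyphosis)
str(kyphosis)

# Grow a classification tree: Kyphosis is the response variable,
# Age, Number and Start are the predictor variables
fit <- rpart(Kyphosis ~ Age + Number + Start,
             data = kyphosis, method = "class")

# Text listing of the splits
print(fit)

# Simple base-graphics rendering of the tree
plot(fit, uniform = TRUE, margin = 0.1)
text(fit, use.n = TRUE, cex = 0.8)
```

Here method = "class" asks for a classification tree; a numeric response with method = "anova" would give a regression tree instead.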

Installing RMySQL in Windows to connect R with MySQL

If you're using the wonderful statistical computation tool R and need to fetch data from, or write data to, MySQL databases (or run any MySQL query from the comfort of an R environment), you'll need to download a third-party package.

One of the most popular of those packages is, no points for guessing, RMySQL, but installing it in a Windows environment isn't very easy. Unlike other R packages, which can be downloaded and installed with install.packages("packagename") from the R console, RMySQL isn't available as a precompiled .zip archive. It needs a certain Windows dynamic-link library (.dll) in order to work, so while it works out of the box on *nix OSes, it has to be compiled on Windows.
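For context, this is roughly what the end goal looks like once the package is installed: a minimal sketch that uses RMySQL through the DBI interface. The helper name run_mysql_query, the database name, the user and the table are all hypothetical placeholders, and reading the password from the MYSQL_PWD environment variable is just one possible choice.

```r
# Sketch only: every connection detail below is a placeholder.
# Calls are namespaced, so defining the function does not itself
# require DBI or RMySQL to be attached.
run_mysql_query <- function(sql,
                            dbname = "testdb",    # hypothetical database
                            host = "localhost",
                            user = "r_user",      # hypothetical account
                            password = Sys.getenv("MYSQL_PWD")) {
  con <- DBI::dbConnect(RMySQL::MySQL(),
                        dbname = dbname, host = host,
                        user = user, password = password)
  # Close the connection even if the query fails
  on.exit(DBI::dbDisconnect(con), add = TRUE)
  # Run the query and return the result set as a data frame
  DBI::dbGetQuery(con, sql)
}

# Example call (needs a reachable MySQL server, so it is left commented out):
# head(run_mysql_query("SELECT * FROM some_table"))
```

Writing a data frame back to MySQL follows the same pattern, with DBI::dbWriteTable() in place of DBI::dbGetQuery().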

Here's how to make it work. This has been tested on a 64-bit Windows 8.1 computer.

First up, you need to have R installed on your system. I'm not sure about the exact v…