Benchmarking rxNeuralNet for OCR

March 19, 2017 | ai, bigdata, machinelearning

The MicrosoftML package introduced with Microsoft R Server 9.0 added several new functions for high-performance machine learning, including rxNeuralNet. Tomaz Kastrun recently applied rxNeuralNet to the MNIST database of handwritten digits to compare its performance with two other machine learning packages, h2o and xgboost. The results are summarized in the chart below:

[Chart: RxNeuralNet]

In addition to having the best performance (in both the CPU-enabled and GPU-enabled modes), rxNeuralNet did not have to sacrifice accuracy. In fact, rxNeuralNet had the best accuracy of the three algorithms: 97.8%, compared to 95.3% for h2o and 94.9% for xgboost. The same training and validation sets were used in each case, and the R code is available here. (If you're looking for other uses of MicrosoftML, this script also applies algorithms like rxFastForest and rxFastLinear to various other datasets.)

The MicrosoftML package can be used to classify other kinds of images, too. This post from the Microsoft R Server Tiger Team demonstrates using the rxNeuralNet function to classify images from the UCI Image Segmentation Data Set. But for more on the OCR application, follow the link below.

TomaztSQL: rxNeuralNet vs. xgBoost vs. H2O

