~ Project ~


1. Handwritten Digits Classification with Kernel-SVM

The dataset is available from the MNIST database. In this project, you need to do the following:

  • SVM method: Use a kernel method to train an SVM model on MapReduce and classify the digits.
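
As a rough illustration of how a kernel SVM fits the map/reduce pattern, the sketch below evaluates the decision function f(x) = sum_i alpha_i * y_i * K(x_i, x) + b, with each "mapper" summing over its own shard of support vectors and a "reducer" adding the partial sums. It covers only the classification step for a binary problem; training needs a separate distributed solver, and the ten MNIST classes would need something like one-vs-rest. The RBF kernel, the `gamma` value, the function names, and the toy support vectors are all assumptions made for this example, not part of the assignment.

```python
import math
from functools import reduce

def rbf_kernel(x, z, gamma=0.5):
    """Gaussian (RBF) kernel: exp(-gamma * ||x - z||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)

def map_partial_score(shard, x, gamma=0.5):
    """Map step: partial sum of alpha_i * y_i * K(x_i, x) over one shard."""
    return sum(alpha * y * rbf_kernel(sv, x, gamma) for sv, y, alpha in shard)

def predict(shards, x, bias=0.0, gamma=0.5):
    """Reduce step: add the per-shard partial sums to the bias, then take the sign."""
    partials = map(lambda shard: map_partial_score(shard, x, gamma), shards)
    score = reduce(lambda a, b: a + b, partials, bias)
    return 1 if score >= 0 else -1

# Toy, made-up support vectors (vector, label, alpha) split across two shards.
shards = [
    [((0.0, 0.0), +1, 1.0)],
    [((2.0, 2.0), -1, 1.0)],
]

print(predict(shards, (0.1, 0.1)))  # point close to the positive support vector
print(predict(shards, (1.9, 1.9)))  # point close to the negative support vector
```

On a real cluster the shards would live on different nodes and the partial sums would be combined by the framework's reduce phase; here plain `map` and `functools.reduce` stand in for that.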

2. Handwritten Digits Classification with CNN

Using the same dataset as above, you need to do the following:

  • In the first step, train a convolutional neural network (CNN) on a single CPU and test it.

  • In the second step, run distributed training on at least two CPUs or GPUs and evaluate the training time.
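
One possible way to evaluate the training time from the two steps above is to record wall-clock time for each run and report speedup and parallel efficiency. The sketch below is framework-agnostic: `timed` wraps any training function, and the helper names and the example timings are hypothetical, chosen only for illustration.

```python
import time

def timed(fn, *args, **kwargs):
    """Run fn once and return (result, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start

def speedup(t_single, t_distributed):
    """How many times faster the distributed run is than the single-device run."""
    return t_single / t_distributed

def efficiency(t_single, t_distributed, n_workers):
    """Speedup divided by worker count; 1.0 would be ideal linear scaling."""
    return speedup(t_single, t_distributed) / n_workers

# Hypothetical timings in seconds, for illustration only (not measured results).
t_one_cpu, t_two_workers = 120.0, 75.0
print(f"speedup: {speedup(t_one_cpu, t_two_workers):.2f}x")
print(f"efficiency: {efficiency(t_one_cpu, t_two_workers, 2):.0%}")
```

For a fair comparison, both runs should train for the same number of epochs and reach comparable test accuracy, since distributed training often changes the effective batch size.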

3. Comparison between Ceph and HDFS

In this project, you need to install Hadoop with Ceph, another popular distributed file system. Run the TeraSort benchmark with at least 1 TB of input data to compare the performance of Ceph and HDFS. The comparison should include:

  • The overall running time of TeraSort on each file system.

  • The actual I/O throughput.
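
The two metrics above can be related with some simple arithmetic. TeraGen takes its input size as a number of 100-byte rows, so 1 TB (taken here as 10^12 bytes) corresponds to 10^10 rows. The sketch below derives an average throughput from the input size and the running time, assuming the sort reads the input once and writes an output of equal size. That is a lower-bound estimate, not a measurement: the actual I/O throughput is better taken from file system or disk counters. The running times shown are hypothetical placeholders.

```python
TERAGEN_ROW_BYTES = 100  # TeraGen writes fixed-size 100-byte rows

def teragen_rows(total_bytes):
    """Number of rows to request from TeraGen for a given input size."""
    return total_bytes // TERAGEN_ROW_BYTES

def throughput_mb_per_s(bytes_moved, seconds):
    """Average throughput in MB/s (1 MB = 10**6 bytes)."""
    return bytes_moved / seconds / 1e6

ONE_TB = 10**12
print("TeraGen rows for 1 TB:", teragen_rows(ONE_TB))

# Hypothetical wall-clock times in seconds; replace with measured values.
runs = {"HDFS": 5000.0, "Ceph": 6250.0}
for fs, secs in runs.items():
    # Assumption: input is read once and an equal-sized output is written,
    # so at least 2 * input bytes pass through the file system.
    print(fs, f"{throughput_mb_per_s(2 * ONE_TB, secs):.1f} MB/s")
```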