Description
This course aims to give students an understanding of the operating principles of mainstream Big Data computing systems, along with hands-on experience using them. It discusses open-source platforms for Big Data processing and analytics. Data mining algorithms and machine learning applications form another major stream of the course. In addition, widely adopted optimization methods and models for Big Data analytics are investigated. Topics to be covered include:
Programming models and design patterns for mainstream Big Data computational frameworks;
System architecture and resource management for data-center-scale computing;
Algorithm design for Big Data analytics, e.g., SVM models, K-means clustering, deep neural networks;
Optimization methods, e.g., convex optimization, gradient descent, online optimization.
Course Prerequisite:
This course contains substantial hands-on components that require a solid background in programming and hands-on experience with operating systems. If you have never used a command-line interface to install, configure, or manage an operating system (e.g., a Linux-based one), you will need to pick up these skills yourself, and IT CAN BE VERY TIME-CONSUMING for you to complete the homework. (Students without the required background described above may need tens of hours to finish EACH homework assignment.)
Course Information
Lecture time and venue:
- 6F503 (2:30pm - 5:00pm, Tuesday);
Instructor:
- Dr. Huanle Xu.
xhlcuhk [at] gmail [dot] com
- Office hours: Fri 4:30-5:15pm or by appointment (9A 304)
Teaching Assistant:
- Zizhao Mo
yc17461@connect.um.edu.mo
Recommended Programming References
[DataAlgorithms] Data Algorithms: Recipes for Scaling Up with Hadoop and Spark, by Mahmoud Parsian, published by O'Reilly Media, Aug 2015
[Pig] Programming Pig, by Alan Gates, published by O'Reilly Media
[Hive] Programming Hive, by Edward Capriolo, Dean Wampler, Jason Rutherglen, published by O'Reilly Media
[OpenStackOp] OpenStack Operations Guide, published by O'Reilly Media (current version available online at: http://docs.openstack.org/openstack-ops/content )
Course Assessment
Your grade will be based on the following components:
- Homework & Programming assignments (about 3-4 sets in total): 40%
- Project: 60%