Applied Machine Learning II

Track: Enterprise Data Scientist (EDS)
Machine Learning II introduces slightly more complex algorithms: ensemble methods, Kmeans, PCA and neural networks. Participants will learn how to use these algorithms with Python on real datasets.

Make Data-Based Predictions with Applied Machine Learning

In Machine Learning, or statistical learning, a computer is trained to perform a specific task. This is especially powerful when a lot of data related to the task is available. Machine Learning is a must-have skill for a data scientist who wants to go beyond data analytics.

Learning outcome:

Upon completion, participants should be able to:
  1. Reduce the number of features when it is too large
  2. Cluster similar records in an unlabelled dataset
  3. Get better performance by using multiple models instead of one

Who should attend:

Professionals who work with data

5 days of in-depth learning

Face-to-face with an experienced data scientist.

Course Methodology

This course will utilize a combination of Presentations and Workshops.

CADS Certification

Earn certification upon completion.

Prerequisite:
Applied Machine Learning I
Minimum Qualification:
Undergraduate Degree

Training Track

Enterprise Data Scientist (EDS)

Applied Machine Learning II is one of the modules under our Enterprise Data Scientist (EDS) programme. EDS is a 42-day training programme that gives participants the tools to be key leaders and contributors in a data science team and to analyze data to drive informed business decisions.

Details of Subject

  1. Supervised Machine Learning: Classification –  Classification is the sub-field of Machine Learning that consists of building a model from training data (data labelled with the correct class) in order to predict the class of new data. In classification, there is only a finite number of classes.
    • Naïve Bayes. Algorithms of this class are generative: they assume that the data was generated by some distribution, which can be binomial, Gaussian, multinomial, and so on. They are called naïve because the features are treated as independent of each other within a class. Training consists of finding the parameters of the distribution for each class. Prediction consists of finding the most likely class according to the different distributions.
    • Ensemble methods: Random forests. Decision trees (covered in Applied Machine Learning I) tend to overfit. To overcome this problem, a Random Forest builds many simple trees, each on a random sample of the data and features, and combines their predictions.
  2. Clustering –  Clustering consists of grouping similar elements without any prior information such as labels. Kmeans is a clustering algorithm based on the distance between points.
    • Kmeans
  3. Dimensionality reduction –  Dimensionality reduction consists of reducing the number of dimensions (features) in a dataset. In PCA, the goal is to find new dimensions that best explain the variance in the data.
    • PCA (Principal Component Analysis)
  4. Introduction to deep learning: neural networks –  With deep learning, researchers have achieved very good results on some machine learning tasks, such as image classification. Deep learning is based on neural networks. In this class we will teach what a neuron in a neural network is and how neural networks handle non-linear problems.
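As an illustration of the Naïve Bayes item above, here is a minimal sketch of a Gaussian variant in plain Python. It is not the course material: the function names `fit_gaussian_nb` and `predict_gaussian_nb` and the toy data are assumptions made for this example. Training estimates a prior, mean and variance per class; prediction picks the class with the highest log-probability.

```python
import math
from collections import defaultdict

def fit_gaussian_nb(X, y):
    """Estimate, for each class, its prior and per-feature mean and variance."""
    by_class = defaultdict(list)
    for row, label in zip(X, y):
        by_class[label].append(row)
    model = {}
    for label, rows in by_class.items():
        n = len(rows)
        columns = list(zip(*rows))
        means = [sum(col) / n for col in columns]
        # A small constant keeps every variance strictly positive.
        variances = [
            sum((v - m) ** 2 for v in col) / n + 1e-9
            for col, m in zip(columns, means)
        ]
        model[label] = (n / len(X), means, variances)
    return model

def predict_gaussian_nb(model, row):
    """Return the class with the highest log prior plus log likelihood."""
    def log_posterior(label):
        prior, means, variances = model[label]
        score = math.log(prior)
        for v, m, var in zip(row, means, variances):
            score += -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
        return score
    return max(model, key=log_posterior)

# Toy data (made up): two well-separated classes with two features each.
X = [[1.0, 2.1], [1.2, 1.9], [0.8, 2.0], [3.9, 5.1], [4.1, 4.8], [4.0, 5.0]]
y = ["a", "a", "a", "b", "b", "b"]
model = fit_gaussian_nb(X, y)
print(predict_gaussian_nb(model, [1.1, 2.0]))
```

Working with log-probabilities rather than raw probabilities avoids numerical underflow when many features are multiplied together.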
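The ensemble idea behind Random Forests can be sketched in a simplified form. The code below is not a full Random Forest: it bags one-split trees (decision stumps), each trained on a bootstrap sample and a randomly chosen feature, and combines them by majority vote. All names (`fit_stump`, `fit_forest`, `predict_forest`) and the toy data are assumptions for illustration.

```python
import random
from collections import Counter

def fit_stump(X, y, feature):
    """Find the threshold on one feature that misclassifies the fewest rows."""
    majority = Counter(y).most_common(1)[0][0]
    best = (len(y) + 1, None, majority, majority)  # errors, threshold, left, right
    values = sorted({row[feature] for row in X})
    for lo, hi in zip(values, values[1:]):
        threshold = (lo + hi) / 2
        left = [lab for row, lab in zip(X, y) if row[feature] <= threshold]
        right = [lab for row, lab in zip(X, y) if row[feature] > threshold]
        left_label = Counter(left).most_common(1)[0][0]
        right_label = Counter(right).most_common(1)[0][0]
        errors = sum(lab != left_label for lab in left)
        errors += sum(lab != right_label for lab in right)
        if errors < best[0]:
            best = (errors, threshold, left_label, right_label)
    return feature, best[1], best[2], best[3]

def predict_stump(stump, row):
    feature, threshold, left_label, right_label = stump
    # A stump with no usable split always predicts the majority class.
    if threshold is None or row[feature] <= threshold:
        return left_label
    return right_label

def fit_forest(X, y, n_trees=25, seed=0):
    """Train each stump on a bootstrap sample and one random feature."""
    rng = random.Random(seed)
    n_rows, n_features = len(X), len(X[0])
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n_rows) for _ in range(n_rows)]  # sample with replacement
        feature = rng.randrange(n_features)
        forest.append(fit_stump([X[i] for i in idx], [y[i] for i in idx], feature))
    return forest

def predict_forest(forest, row):
    """Combine the individual predictions by majority vote."""
    votes = Counter(predict_stump(stump, row) for stump in forest)
    return votes.most_common(1)[0][0]

# Toy data (made up): class 0 has small feature values, class 1 large ones.
X = [[1, 2], [2, 1], [3, 3], [2, 3], [1, 1], [3, 2],
     [7, 8], [8, 7], [9, 9], [8, 9], [7, 7], [9, 8]]
y = [0] * 6 + [1] * 6
forest = fit_forest(X, y)
print(predict_forest(forest, [0.5, 0.5]), predict_forest(forest, [9.5, 9.5]))
```

Because every tree sees a different sample, the errors of individual trees tend to differ, and the vote averages them out.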
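For the Kmeans item, the distance-based grouping can be sketched as two alternating steps: assign each point to its nearest centroid, then move each centroid to the mean of its points. This is a minimal sketch, not the course implementation; the `kmeans` function and the toy points are assumptions, and starting from the first `k` points is a simplification of the usual random initialisation.

```python
def kmeans(points, k, n_iter=10):
    """Group points into k clusters by alternating assignment and update steps."""
    # Simplification: start from the first k points instead of a random draw.
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: each point joins the cluster of its nearest centroid.
        labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its assigned points.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

# Toy data (made up): three points near (1, 1) and three near (8.5, 9).
points = [(1, 1), (1.5, 2), (1, 0.5), (8, 8), (9, 9.5), (8.5, 9)]
labels, centroids = kmeans(points, k=2)
print(labels)
```

No labels are used anywhere in the function, which is what makes this an unsupervised method.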
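For the PCA item, the "new dimension that best explains the data" is the direction of largest variance. The sketch below finds it with power iteration on the covariance matrix, which is one of several ways to compute it and is used here only because it needs no external library. The function name `first_principal_component` and the toy data are assumptions for illustration.

```python
import math

def first_principal_component(rows, n_iter=100):
    """Return the top direction, its share of the variance, and the 1-D projection."""
    n, d = len(rows), len(rows[0])
    means = [sum(col) / n for col in zip(*rows)]
    centered = [[v - m for v, m in zip(row, means)] for row in rows]
    # Sample covariance matrix of the centered data.
    cov = [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    # Power iteration: repeatedly apply the covariance matrix and renormalise.
    v = [1.0] + [0.0] * (d - 1)
    for _ in range(n_iter):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Variance along v, compared with the total variance of the data.
    along_v = sum(v[i] * sum(cov[i][j] * v[j] for j in range(d)) for i in range(d))
    total = sum(cov[i][i] for i in range(d))
    projected = [sum(a * b for a, b in zip(r, v)) for r in centered]
    return v, along_v / total, projected

# Toy data (made up): two features that are almost perfectly correlated.
rows = [(1, 1.1), (2, 1.9), (3, 3.2), (4, 3.8), (5, 5.0)]
component, explained_ratio, projected = first_principal_component(rows)
print(component, explained_ratio)
```

Here the two original features collapse to a single projected value per row while keeping almost all of the variance, which is the reduction described above.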
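To make the neural network item concrete: a neuron computes a weighted sum of its inputs plus a bias and passes it through an activation function. A single neuron of this kind can only separate classes with a straight line, so it cannot compute XOR, but a small network of them can. The sketch below uses a step activation and hand-picked weights purely for illustration; in practice the weights are learned from data, and the names `neuron` and `xor_network` are assumptions.

```python
def neuron(inputs, weights, bias):
    """Weighted sum of the inputs plus a bias, followed by a step activation."""
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1 if z > 0 else 0

def xor_network(x1, x2):
    """Two hidden neurons (OR and AND) feed one output neuron."""
    h_or = neuron([x1, x2], [1.0, 1.0], -0.5)   # fires if at least one input is 1
    h_and = neuron([x1, x2], [1.0, 1.0], -1.5)  # fires only if both inputs are 1
    # Output fires when OR is on and AND is off, which is exactly XOR.
    return neuron([h_or, h_and], [1.0, -1.0], -0.5)

for a in (0, 1):
    for b in (0, 1):
        print(a, b, xor_network(a, b))
```

The hidden layer re-expresses the inputs so that the final neuron faces a linearly separable problem, which is how stacking neurons handles non-linearity.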

Lead Instructor

Jan Sauer
A Data Scientist at The Center of Applied Data Science (CADS), Jan Sauer was a biostatistician in the field of deep learning and image/pattern recognition. He has a master’s degree in physics and has extensive experience as a software developer. Throughout his career, he has been involved in different areas of data science, ranging from automated data collection and data analysis, through data pipeline and database design, to advanced machine learning, where he uses TensorFlow extensively in image processing.

CADS Certification

EDS CADS Certified Enterprise Data Scientist

Certification information for this module & track will be made available soon.
