Future Technology: Machine Learning
In this course you will learn how to analyse, solve and implement Machine Learning problems in Python. Although the course covers the functions, libraries, etc. used, it does not cover the details of the language. For this reason, you should have a basic knowledge of Python.
Day 1: Installation and Introduction
Introducing ML (Machine Learning):
- What is Machine Learning?
- Applications (Financial Forecasting, Profiling, Text Mining, Image Recognition, etc.)
- Supervised Learning vs. Unsupervised Learning
- Other applications: Information Retrieval, Optimizations, Graph Analysis
- Exploratory Analysis. What is it and why is it important?
Installation (Windows, Linux)
Installation of PyCharm
An introduction to Python for ML (Machine Learning):
- Introduction to Python
- Introduction to essential libraries for ML
Overview of PyCharm:
- Basic functions
- Useful Shortcuts
- Debug Mode
- Data View
Day 2: Exploratory Analysis
Guidelines on how to proceed with an exploratory analysis:
- What problem do I have? Is there a target variable?
- What do my data look like? Which variables are correlated? Are all relationships important? Which criterion is used to determine whether the relationship found (correlation) is important or not? Are the criteria only statistical?
- What statistical methods are available to perform an exploratory analysis? Correlation tests; correlation for categorical-categorical, categorical-continuous and continuous-continuous pairs; correlation matrices; PCA; Correspondence Analysis; etc.
- Visualization as an important component. Graphs as part of exploratory analysis: How can insights be drawn from graphs? Analysis of simple (ordinary) and complex graphs.
The group will work out the answers to these questions together. At the same time, they will learn how to implement the aforementioned points in Python. With regard to the technical aspects, the course will cover the following topics:
- Loading data (CSV, SQL, JSON, etc.)
- Data processing (subsets, aggregations, etc.)
- Running statistical methods
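As a rough illustration of the three technical topics above, the following minimal sketch loads a small CSV, takes a subset, aggregates by group, and computes a Pearson correlation using only the Python standard library. The sample data, the column names, and the helper names (load_rows, mean_by_group, pearson) are invented for this example and are not course material; a library such as pandas offers equivalent functionality.

```python
import csv
import io
from collections import defaultdict
from math import sqrt

# Hypothetical sample data; in practice this would be read from a file.
RAW_CSV = """city,size_m2,price
Vienna,50,200
Vienna,80,310
Graz,60,180
Graz,90,260
"""


def load_rows(text):
    """Parse CSV text into a list of dicts, converting numeric columns to float."""
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        rows.append({
            "city": row["city"],
            "size_m2": float(row["size_m2"]),
            "price": float(row["price"]),
        })
    return rows


def mean_by_group(rows, group_key, value_key):
    """Aggregate: mean of value_key for each distinct value of group_key."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_key]].append(row[value_key])
    return {group: sum(values) / len(values) for group, values in groups.items()}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long numeric lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


rows = load_rows(RAW_CSV)                                # loading data
vienna = [row for row in rows if row["city"] == "Vienna"]  # subset
avg_price = mean_by_group(rows, "city", "price")         # aggregation
corr = pearson([row["size_m2"] for row in rows],
               [row["price"] for row in rows])           # statistical method
```

A correlation close to 1 or -1 indicates a strong linear relationship between the two continuous variables; whether that relationship matters is the kind of question the exploratory analysis is meant to answer.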
Day 3: Supervised Learning
Main concepts in Supervised Learning:
- Training, Testing and Validation
- Overfitting and Underfitting
- Generation of additional variables
Supervised learning in Python:
- Naive Bayes
- Classification Trees & Random Forests
- Neural Networks
Performance measurements:
- Mean Squared Error / Mean Absolute Error
- Confusion Matrix
- ROC and AUC
Technical aspects in Python:
- Train/test splitting
- Creating additional variables
- Training models
- Performance measurements
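The workflow named in the technical aspects above (split, train, measure) can be sketched in plain Python as follows. This is only an illustration under stated assumptions: the toy data is invented, the helper names are hypothetical, and a nearest-centroid classifier is used as a simple stand-in rather than one of the models listed above. Libraries such as scikit-learn provide ready-made versions of each step.

```python
import random


def train_test_split(samples, test_fraction=0.25, seed=0):
    """Shuffle a copy of the samples and split it into (train, test)."""
    shuffled = samples[:]
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def fit_centroids(train):
    """'Training': compute the mean feature vector (centroid) of each class."""
    sums, counts = {}, {}
    for features, label in train:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


def predict(centroids, features):
    """Predict the class whose centroid is closest (squared Euclidean distance)."""
    def sq_dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(features, centroid))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))


def confusion_matrix(y_true, y_pred, labels):
    """Nested dict: matrix[true_label][predicted_label] -> count."""
    matrix = {t: {p: 0 for p in labels} for t in labels}
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


# Hypothetical toy data: two well-separated classes in two dimensions.
data = ([([0.0 + i * 0.1, 0.0], "a") for i in range(10)]
        + [([5.0 + i * 0.1, 5.0], "b") for i in range(10)])

train, test = train_test_split(data, test_fraction=0.25, seed=42)
model = fit_centroids(train)

y_true = [label for _, label in test]
y_pred = [predict(model, features) for features, _ in test]

cm = confusion_matrix(y_true, y_pred, labels=["a", "b"])
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(test)
```

The model is evaluated only on the held-out test part, which mirrors the idea of testing trained models on new, unknown data.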
The third day will start with an overview of the main concepts and techniques of Supervised Learning. Because this course is strongly practice-oriented, the participants will then be divided into small groups in which they will practice different Supervised Learning models. Each team receives the same initial data and decides independently which additional variables should be created. The teams will share their experiences and reasoning with the others. The performance of the trained models will be tested with new, unknown data and subsequently analysed and discussed together.
The consultant is available to answer questions throughout the entire process. In this way, the participants not only work on a “real” machine learning problem but also learn how to solve it in Python.
Day 4: Unsupervised Learning + Text Mining
Unsupervised Learning:
- Hierarchical Clustering & Heatmap
- Principal Component Analysis / Correspondence Analysis
Introducing Text Mining and Information Retrieval:
- Applications (Sentiment Analysis, IR, etc.)
From words to numbers:
- Pre-processing (punctuation, lowercase, etc.)
- Porter stemming
- Numeric matrices from text
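To make the step from words to numbers concrete, here is a minimal sketch that lowercases two invented example sentences, strips punctuation, and builds a term-count matrix with the standard library only. The helper names are hypothetical, and stemming is deliberately left out; NLTK provides a PorterStemmer class for that step.

```python
import string


def preprocess(text):
    """Lowercase the text, remove ASCII punctuation, and split on whitespace."""
    table = str.maketrans("", "", string.punctuation)
    return text.lower().translate(table).split()


def build_vocabulary(token_lists):
    """Sorted list of all distinct tokens across the documents."""
    return sorted({token for tokens in token_lists for token in tokens})


def count_matrix(token_lists, vocabulary):
    """One row per document, one column per vocabulary term, cells are counts."""
    index = {term: i for i, term in enumerate(vocabulary)}
    matrix = []
    for tokens in token_lists:
        row = [0] * len(vocabulary)
        for token in tokens:
            row[index[token]] += 1
        matrix.append(row)
    return matrix


# Invented example documents.
docs = ["The movie was great, really great!", "The movie was boring."]

tokens = [preprocess(doc) for doc in docs]
vocab = build_vocabulary(tokens)
matrix = count_matrix(tokens, vocab)
```

Each row of the resulting matrix is a numeric representation of one document, which is the form that clustering or classification methods expect as input.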
On day 4, two topics will be dealt with separately (morning & afternoon). Each module is organised as follows:
- Theoretical introduction
- Examples in Python
- Analysis of the results
- Independent development of similar tasks by the participants
- Joint analysis and interpretation of the results
The Python libraries used are explained, for example:
- scikit-learn (clustering)
- NLTK (Natural Language Toolkit)
Day 5: From laboratory to production
In the morning, the participants work on deploying Machine Learning in a production environment:
- Online vs. Offline Predictions
- Data collection strategies and response times
- APIs in Machine Learning
- Batch predictions
For each topic, the corresponding Python tools are explained.
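As one illustration of the difference between online and batch predictions, the sketch below scores a single request immediately and then scores a list of stored records in chunks. Everything here is an assumption made for the example: toy_model is a hypothetical stand-in for a trained model, and the function names and batch size are invented.

```python
def toy_model(features):
    """Hypothetical stand-in for a trained model: a fixed weighted sum."""
    weights = [0.5, 2.0]
    return sum(w * x for w, x in zip(weights, features))


def predict_online(model, features):
    """Online prediction: score one incoming request right away.

    Response time matters here, because a caller is waiting for the answer.
    """
    return model(features)


def chunks(rows, size):
    """Yield consecutive slices of rows with at most `size` elements each."""
    for start in range(0, len(rows), size):
        yield rows[start:start + size]


def predict_batch(model, rows, batch_size=2):
    """Batch prediction: score stored records chunk by chunk.

    Throughput matters more than the latency of any single record.
    """
    results = []
    for chunk in chunks(rows, batch_size):
        results.extend(model(features) for features in chunk)
    return results


# Online: a single request arrives and is answered immediately.
single = predict_online(toy_model, [2.0, 1.0])

# Offline/batch: records collected earlier are scored together.
stored = [[2.0, 1.0], [0.0, 0.0], [4.0, 0.5]]
scores = predict_batch(toy_model, stored)
```

In a real deployment the online path would typically sit behind an API endpoint, while the batch path would run on a schedule against stored data.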
In the afternoon, the course concludes with a summary of the topics covered:
- Overview of topics
- Discussion: How do the topics relate to each other?
- Final Q&A