International School of Chemometrics  -  ISC 2021

INFORMATION ABOUT THE SEMINARS

NOTE: All seminars include all the material that the student might need: 

- Slides of the course (pdf).

- Exercises

- Datasets

- Toolboxes

- Refreshments during the lessons (coffee, tea, candies, cookies, and other amenities), provided that the Covid-19 pandemic situation allows it.

- Unlimited offline access to the videos of the lessons until the end of 2021.

- We do NOT provide: Matlab and lunch

MATLAB - Introduction to Matlab for Multivariate Data Analysis

Matlab is one of the main software packages that will be used in ISC 2021. Therefore, following suggestions from students of previous editions, we have decided to merge our PhD course “Introduction to Matlab for Multivariate Data Analysis” with ISC 2021. This course will cover the basics of Matlab needed to carry out data analysis.

Dates and timetable: From the 31st of May until the 4th of June, 2021. From 9 am until 4 pm (CET) with 1-hour lunch break (30 hours)

Previous knowledge needed: None

Software needed: Matlab.

Teacher: José Manuel Amigo

ECTS: 2.5

Price student: 1500 DKK (Approx. 200 euro / 245 USD)

Price industry: 3750 DKK (Approx. 505 euro / 612 USD)

 

 

BASIC - Basic introduction to Chemometrics, data types, data pre-processing, PCA, Multivariate Linear Regression and Multivariate Linear Classification

 

This seminar contains two topics:

 

- EXPLORE - Data exploration and regression: Principal Component Analysis has become the most powerful and versatile tool for exploring data tables in Analytical Sciences. Here we present a course to show the main benefits and drawbacks of PCA when it is used for different kinds of analytical data: spectroscopy, environmental assessment, sensory, experimental performance, chromatography, etc. Moreover, preprocessing of different types of data will also be addressed in the seminar, as it is a prerequisite for exploring the data in the best possible way. If PCA is the keystone of pattern recognition methods, PLS is the keystone of multivariate calibration methods. This seminar will give a general overview of different multivariate calibration strategies and will focus on Partial Least Squares regression.

- CLASS - Multivariate Classification: The course will deal with the main linear classification methods. We will initially introduce what classification is and a bit of terminology, such as the difference between class modelling and discriminant methods, classification measures and validation approaches. Then, we will move to the main classification methods, such as Discriminant Analysis, Partial Least Squares Discriminant Analysis (PLSDA) and SIMCA. We will see both theoretical aspects and practical applications: there will be practical sessions where we will apply classification methods to real data with ad-hoc toolboxes in MATLAB to learn how to handle these tasks.

 

Dates and timetable: From the 7th until the 11th of June, 2021. From 9 am until 4 pm (CET) with 1-hour lunch break (30 hours)

 

Previous knowledge needed: None

Software needed: Matlab, PLS_Toolbox and Classification toolbox working under Matlab. IMPORTANT: For the PLS_Toolbox a fully functional demo will be available for the School. The Classification toolbox can be freely downloaded from here: 

https://michem.unimib.it/download/matlab-toolboxes/classification-toolbox-for-matlab/

Teachers: José Manuel Amigo (EXPLORE) and Davide Ballabio (CLASS).

ECTS: 2.5

Price student: 1500 DKK (Approx. 200 euro / 245 USD)

Price industry: 3750 DKK (Approx. 505 euro / 612 USD)

 

INTERMEDIATE - Intermediate topics on Chemometrics. Variable selection methods, Linear Algebra and Non-linear modelling

 

This seminar contains three topics:

 

- VARSEL - Variable selection methods: This seminar revisits the most important variable selection methods for regression and classification, with the aim of improving the performance of the models. The emphasis will be on practical applications and on which methods can be applied to which problem. There will also be hints as to which methods are good and which ones to stay away from.

- LINAL - Linear Algebra: The foundation of chemometric modelling is Linear Algebra. Why do the algorithms work? Why are the models meaningful? A mathematically derived answer to these questions can be found using linear algebra. The seminar will focus on hands-on experience with some fundamental linear algebra concepts, including rank, determinant, inverse, pseudoinverse, eigenvalues, singular value decomposition, orthogonality and basis sets. We will analyze a few real-life datasets, but the purpose of the seminar is to become a proficient mechanic unravelling the black box of algorithms and models, while other courses will teach you how to drive the car.

- NONLIN - Nonlinear modelling: This module aims at providing a basic introduction to the techniques which may be used in all those situations when a linear relation is not enough to provide accurate results (e.g. due to the presence of multiple sources of variability). In this respect, the most important aspects of data modelling will be considered (exploratory analysis, classification and calibration). Topics such as kernel and dissimilarity-based approaches (including support vector machines), local modelling (kNN and locally weighted regression/classification) and artificial neural networks will be covered.

Dates and timetable: From the 14th until the 18th of June, 2021. From 9 am until 4 pm (CET) with 1-hour lunch break (30 hours)

 

Previous knowledge needed: Basic multivariate data analysis and Matlab

Software needed: Matlab and PLS_Toolbox. IMPORTANT: For the PLS_Toolbox a fully functional demo will be available for the School.

Teachers: José Manuel Amigo and Davide Ballabio (VARSEL), Morten A. Rasmussen (LINAL) and Rasmus Bro (NONLIN).

ECTS: 2.5

Price student: 1500 DKK (Approx. 200 euro / 245 USD)

Price industry: 3750 DKK (Approx. 505 euro / 612 USD)

CHALLENGES - Topics for a further understanding of advanced modelling methods

This seminar contains three topics:

- MULTIWAY - Multiway data analysis: Multi-way data is gaining popularity due to the capability of scientific devices to generate data with at least 3 dimensions (elution time – m/z channel – samples, excitation – emission – sample, etc.). Therefore, learning the basics of multi-way analysis will help to get the most out of such complex data structures. To this end, methods like parallel factor analysis (PARAFAC) and PARAFAC2 will be studied and applied to different examples.

- FUSION - Multivariate data fusion approaches: The seminar will deal with the chemometric approaches for integrating (“fusing”) data from different sources. First, the various configurations which may occur when dealing with multiple data matrices will be presented and discussed, and a hierarchy/systematization of the possible data fusion approaches will be introduced. Then, the main multi-block strategies for data exploration and predictive modelling will be discussed and compared, and a further classification of models, depending on whether the globally common, locally common and distinct information is considered or not, will be introduced. The theoretical and algorithmic description of the methods will be accompanied by worked examples of real data sets.

- ERROR - Error propagation in multivariate models: The seminar will deal with “Measurement Error (ME)”. First, we will introduce measurement errors, show the different types of noise and learn how to propagate them to the final result in the univariate case. Then, we will move to multivariate data and will show how to characterize and simulate multivariate ME, with special emphasis on error covariance matrices and how these can provide a clue about the structure of the errors of our data. Finally, we will apply the error propagation theory to the multivariate scenario and learn how to calculate the uncertainty of prediction (PLS) and classification.

Dates and timetable: From the 21st until the 25th of June, 2021. From 9 am until 4 pm (CET) with 1-hour lunch break (30 hours)

 

Previous knowledge needed: Basic multivariate data analysis and Matlab

Software needed: Matlab and PLS_Toolbox. IMPORTANT: For the PLS_Toolbox a fully functional demo will be available for the School.

Teachers: Rasmus Bro (MULTIWAY), Federico Marini (FUSION) and Ricard Boqué (ERROR).

ECTS: 2.5

Price student: 1500 DKK (Approx. 200 euro / 245 USD)

Price industry: 3750 DKK (Approx. 505 euro / 612 USD)

MCR - Multivariate Curve Resolution

 

The module will address the theoretical description and hands-on application of Multivariate Curve Resolution (MCR). MCR is a multivariate resolution (unmixing) method that can provide the description of a multicomponent data set through a bilinear model of chemically meaningful profiles. For example, when analysing an HPLC-DAD data set, MCR would provide the real elution profiles and the related UV spectra for each compound in the sample. It has applications in many diverse fields, such as process analysis, chromatographic data, hyperspectral images or environmental data, essentially in any context where a mixture analysis problem can be encountered. MCR can be applied to a single data matrix or to multiset structures formed by blocks of different information (data fusion). The module focuses mainly on the MCR-ALS algorithm (Multivariate Curve Resolution-Alternating Least Squares), and hands-on work will be done using a dedicated free GUI adapted to the MATLAB environment. Applications will cover many of the areas mentioned above.

Dates and timetable: From the 28th until the 29th of June, 2021. From 9 am until 4 pm (CET) with 1-hour lunch break (12 hours)

 

Previous knowledge needed: Basic knowledge of PCA and multivariate regression methods.

Software: Matlab and the MCR-ALS toolbox. MCR-ALS toolbox can be freely downloaded here:

https://mcrals.wordpress.com/download/mcr-als-2-0-toolbox/

Teacher: Anna de Juan

ECTS: 1

Price student: 600 DKK (Approx. 81 euro / 98 USD)

Price industry: 1500 DKK (Approx. 200 euro / 245 USD)

QSAR - Quantitative Structure-Activity Relationship

The seminar will introduce the main concepts of quantitative structure-activity relationship (QSAR) modelling, i.e., how we can leverage chemometrics to predict the physico-chemical and biological properties of molecules. In the seminar, we will explore some of the core concepts of QSAR modelling, such as molecular representation and description, and applicability domain. These concepts will then be applied to a real case study, i.e., to develop a QSAR model to predict biologically relevant properties of molecules.

Dates and timetable: From the 28th until the 29th of June, 2021. From 9 am until 4 pm (CET) with 1-hour lunch break (12 hours)

 

Previous knowledge needed: Basic knowledge of PCA and multivariate classification methods.

Software: Matlab and any spreadsheet software in general.

Teacher: Francesca Grisoni

ECTS: 1

Price student: 600 DKK (Approx. 81 euro / 98 USD)

Price industry: 1500 DKK (Approx. 200 euro / 245 USD)

MSPC - Multivariate Statistical Process Control

The seminar will deal with latent-variable-based Multivariate Statistical Process Monitoring. First, we will present the context of application and the objectives, focusing on the differences between end-quality monitoring and process monitoring and on the nature of process variables with respect to “chemical” sensors for intermediate quality monitoring. We will recall the concept of the univariate control chart and introduce measurement errors; then, we will move to multivariate control charts. Finally, we will give the basics of the main approaches used for the analysis of continuous and batch process data and the main issues encountered.

Dates and timetable: From the 30th of June until the 1st of July, 2021. From 9 am until 4 pm (CET) with 1-hour lunch break (12 hours)

 

Previous knowledge needed: Basic knowledge of multivariate exploration and regression methods

Software: Matlab

Teacher: Marina Cocchi

ECTS: 1

Price student: 600 DKK (Approx. 81 euro / 98 USD)

Price industry: 1500 DKK (Approx. 200 euro / 245 USD)

METABO - Metabolomics Data Analysis

 

The seminar will start by introducing various pre-processing techniques suitable for different -omics data. The problems of noise, baseline removal, alignment and normalization will be discussed using several examples. The seminar will continue by presenting several machine learning ensemble techniques, such as unsupervised random forest, random forest, gradient boosting trees and the principle of non-linear bi-plots. The theoretical background of these techniques will be provided, accompanied by several real-life data examples. The seminar will conclude by introducing the theory and application principle of ensemble stacking.

Dates and timetable: From the 30th of June until the 1st of July, 2021. From 9 am until 4 pm (CET) with 1-hour lunch break (12 hours)

 

Previous knowledge needed: Basic knowledge of PCA and multivariate regression and classification methods

Software: Matlab and git

Teacher: Agnieszka Smolinska

ECTS: 1

Price student: 600 DKK (Approx. 81 euro / 98 USD)

Price industry: 1500 DKK (Approx. 200 euro / 245 USD)

GLUE - How not to make Chemometrics

In this half-day seminar, we will take a very close look at the most common mistakes that even experienced people make when doing multivariate analysis. We will cover exploration, calibration, interpretation, visualization and many other subjects, always with a focus on the most common problem as well as a sounder alternative.

 

Dates and timetable: 2nd of July, 2021. From 9 am until 12 pm (CET) (3 hours)

Previous knowledge needed: Chemometrics

Software: None

Teachers: José Manuel Amigo and Rasmus Bro

ECTS: 0

Price student and industry: 0 DKK (Approx. 0 euro / 0 USD)

 

 


(c) HYPER-Tools v.3.0 is free to use, as is the rest of the material that can be downloaded from this website.

For further information, contact José Manuel Amigo (info@hypertools.org).