Data Science Course
incl. Generative AI
Our Data Science Training (IABAC™ accredited) prepares you to become an expert in data science. The program is comprehensive: we begin with foundational data science skills and progress to creating machine learning models and exploring advancements in generative AI. By the end of this training, you’ll be equipped to work independently as a data scientist within your organization.
The data science training offers a mix of theory and practice. On one hand, we delve deeply into the field of data science, and on the other, you’ll immediately get hands-on experience writing scripts in Python. This approach allows you to expand your theoretical knowledge while simultaneously developing new skills that can be directly applied in practice.
Throughout this data science training, we use Python. Prior programming knowledge is not required. Upon completion, you’ll receive an official certificate from this IABAC™ accredited program.
Reviews from participants & clients
Learning Objectives Data Science Course
Complete introduction in data science
In the first two days, we familiarize you with the possibilities of Python and data science. We explain all the fundamental skills for data scientists through assignments in our custom online environment. For example, you’ll learn how to automate analyses and create advanced visualizations. This provides you with a solid foundation to start working independently in practice.
In-depth knowledge of machine learning
In the final two days, we delve deeper into machine learning and generative AI. You’ll learn about the different models, when to apply each model, and how to evaluate them. The sessions alternate between theory and practical assignments. After completing the training, you’ll have a comprehensive understanding of the possibilities of data science and be capable of writing algorithms for your organization.
Is this Data Science training for me?
This training is for you if:
- You want to gain more insights from the data available within your work environment.
- You have a profession that involves many data sources and analyses, and you want to broaden your knowledge. For example, in finance, logistics, marketing, ICT, or HR.
Required Knowledge
No specific prior knowledge is required, but a basic understanding of mathematics and experience with tools like Excel files, text files (csv), or databases is helpful. This is a training at a challenging level, and we recommend a minimum of a bachelor's degree (HBO) level.
Content of the Data Science Training
The content of this training is accredited by the International Association of Business Analytics Certification (IABAC™).
Click on the days below to view the program in more detail.
Day 1 Data Science basics
Introduction to Data Science
On the first day, you will be introduced to the field of data science. How did it originate, and what skills does a data scientist need? We’ll explain why Python is the programming language central to every data scientist’s work. You’ll also learn about the most commonly used data science packages. During this first morning, you will write and execute your own Python script in our custom online environment.
Throughout the first day, you will acquire several essential foundational skills required for any data scientist. These skills will also be used throughout the rest of the training.
Foundational Skill #1: Variables & Data Types
When working with data, it’s important to know how to store data in variables. For example, you can store numbers, sequences, or text. In this data science training, you’ll learn which data types are relevant for data scientists and experience how different data types can enable various possibilities.
Foundational Skill #2: Lists
Data science often involves working with massive datasets. A collection of data points can be stored in a 'list.' We’ll practice using lists and discuss their practical applications in everyday scenarios.
Foundational Skill #3: Dictionaries
A dictionary in Python functions like a literal dictionary. Using 'keys,' you can retrieve specific information. For instance, you could create a dictionary linking city names to their populations. As a data scientist, you will use dictionaries daily. We’ll explore their applications and practice this foundational skill.
Foundational Skill #4: Logic, Methods, and Functions for Data Science
Every data analysis applies logic to the data. Python offers a variety of tools to achieve this. We’ll discuss when to apply logic and practice it to gain proficiency. You’ll also learn how to simplify repetitive tasks in analyses using functions. You will apply functions, methods, and logic to extract your first complex insights from data.
With these foundational skills, we’ll establish a solid base on the first day, which we will continue to build upon in the following days.
Day 2 Common data science tools
NumPy: Working Efficiently with Large Datasets
If you’ve ever worked with a large dataset in Excel, you may have experienced its limitations. With Python, you can work with much larger datasets in a more efficient way. NumPy is a Python package that allows you to perform operations on large datasets.
Moreover, the NumPy package is widely used by other data science packages, making it an essential part of any data science training. We’ll discuss how to use NumPy and practice with a real-world case study.
Pandas: The Most Important Data Science Tool
Pandas is by far the most commonly used tool by data scientists. Essentially, you can do everything in Pandas that you can do in Excel—and much more. For instance, you can import data from various sources, such as Excel files, CSVs, APIs, or databases.
Once imported, you can explore and manipulate the data in numerous ways. You can combine, transform, group, or filter datasets. The possibilities are nearly endless. We will dedicate ample time to hands-on exercises to ensure you gain a strong command of Pandas.
By this point in the training, you’ll already be able to extract insights from data that would have been impossible to achieve with Excel.
Matplotlib: Creating Stunning Visualizations
A data scientist’s role isn’t just about gathering insights; communicating these insights effectively is equally important. Since not everyone is comfortable with raw data, data scientists often present insights through graphs and visualizations.
With the Python package Matplotlib, you’ll learn to create hundreds of different types of visualizations. The image below provides a random glimpse into the variety of possibilities.
In this data science training, you’ll learn to apply Matplotlib in a practical case study, where you create complex, customized visualizations of a dataset.
Final Assignment
During the final assignment, you will apply everything you’ve learned in the first two days to solve a single problem. You’ll connect various data sources, apply logic, write functions, and work with the NumPy, Pandas, and Matplotlib packages. You’ll extract valuable insights from the data and visualize these insights in a way that makes them clear and understandable to others.
Day 3 machine learning basics
Introduction to Machine Learning
Machine learning is more popular than ever. We explain how the field originated and how it has evolved over time. We provide an insightful overview of the wide range of possibilities, including discussions on what supervised learning and unsupervised learning entail. Additionally, we explore when to apply different algorithms.
You’ll learn the steps a data scientist follows when building a machine learning model. This step-by-step process will be applied during the practical cases on days 3 and 4 of the data science training. By following this method repeatedly, it will become your standard workflow.
Classification: The Theory
A classification algorithm predicts whether an observation belongs to a specific category or group. For example, emails are assessed by a classification model that sorts them into "spam" or "not spam." We dive into the underlying statistics so you can understand how classification algorithms work. We also present various practical examples and applications. Additionally, you’ll learn which Python packages to use for these types of problems.
Classification: A Practical Case
During the classification practical case, you will build your first machine learning model from scratch. Using the skills acquired on days 1 and 2 of the data science training, you will import and explore a dataset. You’ll create graphs to uncover relationships in the data that are relevant to your machine learning model.
We address how to handle incomplete (missing values) or messy datasets. You will write your own machine learning model, train it with a portion of your data, and then validate its performance by making predictions on new data.
Regression: The Theory
A regression algorithm allows you to predict numerical values. For example, predicting life expectancy based on someone’s lifestyle. We explore the statistics and mathematics behind regression analysis so you understand what happens when applying machine learning models with these algorithms. We discuss various real-world examples and explain the options available in Python for implementing regression.
Regression: A Practical Case
As with classification, we also work on an engaging practical case for regression, where you build your own model from the ground up. This begins with importing a raw dataset into Python.
Through exploratory data analysis, you’ll investigate which factors might have significant predictive value in your regression model. Additionally, you’ll learn how to use non-numerical data in regression models. We practice with multiple regression algorithms, such as linear regression and gradient boosting, and discuss when to choose each approach.
Day 4 unsupervised learning & generative AI
Unsupervised Learning and Clustering
In this section, you’ll learn what unsupervised learning is and how it differs from supervised learning. We focus specifically on clustering, a widely used technique in unsupervised learning to identify groups within data without predefined labels. You’ll explore how clustering algorithms like k-means clustering work.
Generative AI (ChatGPT / DALL·E)
In this section, you’ll discover what generative AI is and how it differs from other AI/ML techniques. We provide insight into the underlying principles and the workings of this technology. Additionally, we cover key generative AI models like GPT and DALL·E, showcasing their impressive capabilities in generating text and images. You’ll also learn how to "prompt" AI for optimal results. Finally, we demonstrate practical applications, enabling you to use Python code to implement generative AI for text-to-text and text-to-image generation.
Evaluating and Improving a Model
There are various ways to measure a model’s performance. In this data science training, you’ll learn about these methods and when to use them. For instance, we’ll cover ROC curves, AUC, and confusion matrices. Using this knowledge, you’ll revisit your earlier models and explore the effects of improvements through feature engineering and hyperparameter tuning.
Deploying a Model to Production
A model truly adds value when it is consistently applied to new data. For example, an automatic prediction can be generated whenever new data is added to a database. Using a practical example, you’ll learn how to achieve this. New data can lead to new situations, causing the model’s performance to degrade over time. You’ll learn how to account for this and how to prevent it.
Final Assignment
In a challenging final assignment, you will apply your new skills to an intriguing real-world dataset. You’ll combine data from multiple sources, enhance it with feature engineering, and develop a model of your choice. You’ll iteratively improve the model until it meets the required expectations.
Location, dates, and times
This Data Science Training is offered exclusively as an in-company program for groups. The location, date, and times are flexible. Contact us for a customized quote tailored to your specific needs.
You will receive a custom proposal
• Dates are flexible
• Location is flexible