Last Updated: March 7, 2026
STM32 FreeRTOS CMSIS-RTOS Tutorial – Step-by-Step Guide Using STM32CubeIDE
In many STM32 projects, we start with a simple while(1) superloop. It works fine for small applications. But as soon as we add multiple tasks like sensor reading, UART communication, and display updates, things get messy.
The common issues include:
- Delays start conflicting
- UART misses data
- The system becomes hard to manage
That is where RTOS helps us. In this tutorial, we will learn how to:
- Configure FreeRTOS using CMSIS-RTOS V2
- Create a task in STM32
- Understand how the scheduler works
- Blink an LED using osDelay()
- Understand why RTOS is better than superloop
We will use STM32CubeMX and STM32CubeIDE to create and build the project. I am going to use the STM32 Nucleo-L496ZG-P for this RTOS series.

Why We Need FreeRTOS in STM32 Projects
When we start learning STM32, most of us write code inside a simple while(1) loop. This is called the superloop approach. It works well in the beginning. I also used it for many small projects.
But once the project grows, problems start appearing. Let’s understand why.
Problems with Superloop and HAL_Delay()
In a typical STM32 project, we often write something like this:
while(1) // Single infinite loop
{
    sensor_task();
    HAL_Delay(200); // BLOCKS EVERYTHING
    display_task();
    HAL_Delay(500); // BLOCKS EVERYTHING
    UART_task();
    HAL_Delay(1000); // BLOCKS EVERYTHING
}
At first glance, this looks fine. But there is a big problem here.
HAL_Delay() blocks the entire CPU. When we call:
HAL_Delay(500);
the CPU just waits there for 500 milliseconds. It does nothing else. During this time:
- UART cannot process incoming data properly
- Sensors may miss important readings
- Other time-critical operations get delayed
The CPU is busy waiting. It is not multitasking. In small projects, this seems fine. But in real embedded systems, this becomes a serious limitation.
As the number of tasks increases, maintaining proper timing becomes very difficult. We start adding counters, flags, and time checks using HAL_GetTick(). The loop becomes messy very quickly.
This is exactly the point where we should consider using RTOS in STM32 projects.
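To put a number on the problem, here is a small sketch that adds up the blocking delays of the superloop shown above. It is plain C meant to run on a PC, not on the board, and the function names are made up for this illustration. It also ignores the time the three functions themselves take.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sum the blocking delays of one superloop pass. This is roughly how long
 * each function in the loop waits before it is called again. */
uint32_t superloop_period_ms(const uint32_t *delays_ms, size_t count)
{
    assert(delays_ms != NULL);
    uint32_t total = 0;
    for (size_t i = 0; i < count; i++) {
        total += delays_ms[i];
    }
    return total;
}

/* Number of complete loop passes that fit into a time window. */
uint32_t passes_in_window(uint32_t window_ms, uint32_t period_ms)
{
    assert(period_ms > 0);
    return window_ms / period_ms;
}
```

With the delays 200, 500 and 1000 ms, one pass takes 1700 ms. So sensor_task() runs only once every 1.7 seconds, and only 5 complete passes fit into 10 seconds, even though we wanted a 200 ms sensor interval.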
Why State Machines Become Complex
Now suppose we decide not to use HAL_Delay(). Instead, we try to make everything non-blocking using state machines.
We may write something like:
switch(state)
{
    case READ_SENSOR:
        if (HAL_GetTick() - prevTime >= 200)
        {
            Read_Sensor();
            prevTime = HAL_GetTick();
            state = UPDATE_DISPLAY;
        }
        break;
    case UPDATE_DISPLAY:
        // similar logic
        break;
}
This works. But the code becomes harder to read. When we add:
- Multiple sensors
- UART communication
- Error handling
- Communication timeouts
The state machine grows very large. The code loses clarity. Maintenance becomes difficult. One small change can break the entire timing logic.
So What Is the Solution?
We need a better way to:
- Run multiple tasks independently
- Avoid blocking the CPU
- Keep timing accurate
- Maintain clean and readable code
That solution is FreeRTOS. Instead of one big loop, we create separate tasks. Each task handles one job. The scheduler decides which task runs and when.
If one task waits, others continue running. This is why modern embedded systems prefer RTOS-based design over the traditional superloop approach.
What is FreeRTOS and How It Works
Before we start writing code, we should clearly understand what FreeRTOS actually is. Many beginners think it is just a library. But it is much more than that.
FreeRTOS changes the way our STM32 application runs internally. Instead of one infinite loop controlling everything, the control moves to a scheduler.
What is a Real-Time Operating System (RTOS)?
A Real-Time Operating System (RTOS) is a lightweight operating system designed for embedded systems.
The keyword here is real-time. Real-time does not mean fast. It means predictable timing.
In embedded systems, we often need things to happen at precise intervals:
- Read a sensor every 10 ms
- Transmit data every 100 ms
- Respond to an interrupt immediately
An RTOS ensures that high-priority tasks run exactly when they should.
In a normal superloop, everything runs sequentially, delays block the CPU, and the timing becomes unpredictable.
Whereas in FreeRTOS:
- We create multiple tasks
- The scheduler decides which task runs
- Tasks can wait without blocking others
So instead of this:
while(1)
{
    Task1();
    Task2();
    Task3();
}
We create independent tasks, and the RTOS runs them efficiently in the background.
What is a Task in FreeRTOS?
A task is like a small independent program inside your main program. Each task has:
- Its own stack
- Its own local variables
- Its own execution point
- Its own priority
When the scheduler switches from one task to another, it saves everything about the current task and restores the next one.
It feels like they are running at the same time. In reality, the CPU is switching between them very fast.
A typical task looks like this:
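void StartTask(void *argument)
{
    for(;;)
    {
        // Task code
    }
}
We intentionally write an infinite loop inside a task. A FreeRTOS task function must never return. If a task really has to end, it must delete itself explicitly, for example with osThreadExit() in CMSIS-RTOS V2 or vTaskDelete(NULL) in native FreeRTOS. So every task runs forever unless we explicitly delete it. This structure keeps the system stable.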
Task States in RTOS (Running, Ready, Blocked, Terminated)
Every task in FreeRTOS can be in one of four main states.
- Running:
This task currently owns the CPU. Only one task can be in the Running state at a time (in single-core STM32).
- Ready:
The task is ready to run but waiting for the scheduler to give it CPU time. If two tasks are ready, the one with higher priority runs first.
- Blocked:
The task is waiting for something:
  - A delay (osDelay)
  - A semaphore
  - A queue
  - An event
Blocked tasks do not use CPU time. This is the biggest advantage over HAL_Delay(). When we use osDelay(1000), only that task is paused. Other tasks continue running.
- Terminated:
The task has been deleted. It no longer exists in the system. In most applications, we rarely terminate tasks. Instead, we keep them running forever.
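The rule "the highest-priority ready task runs" can be sketched in a few lines. This is only a PC-side model with made-up names (demo_task_t, pick_next_task). The real FreeRTOS scheduler keeps one ready list per priority and runs the idle task when nothing else is ready, so treat this as an illustration of the idea, not of the implementation.

```c
#include <assert.h>
#include <stddef.h>

/* The four states described above. */
typedef enum {
    TASK_RUNNING,
    TASK_READY,
    TASK_BLOCKED,
    TASK_TERMINATED
} task_state_t;

/* Hypothetical task record for this demo only. */
typedef struct {
    const char  *name;
    int          priority;   /* larger number = higher priority */
    task_state_t state;
} demo_task_t;

/* Return the index of the highest-priority task that can run
 * (Ready or already Running), or -1 if every task is waiting. */
int pick_next_task(const demo_task_t *tasks, size_t count)
{
    assert(tasks != NULL);
    int best = -1;
    for (size_t i = 0; i < count; i++) {
        if (tasks[i].state != TASK_READY && tasks[i].state != TASK_RUNNING) {
            continue;   /* Blocked and Terminated tasks are never picked */
        }
        if (best < 0 || tasks[i].priority > tasks[best].priority) {
            best = (int)i;
        }
    }
    return best;
}
```

If a high-priority UART task is Blocked, a lower-priority sensor task gets the CPU. As soon as the UART task becomes Ready again, it is picked first.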
What is a Task Control Block (TCB)?
Internally, FreeRTOS needs to store information about each task. This information is stored in something called a Task Control Block (TCB).
It stores:
- Stack pointer
- Task priority
- Current state (Running, Ready, etc.)
- Timing information
- Task name
Whenever a context switch happens, the scheduler:
- Saves the current task’s CPU registers into its stack
- Updates its TCB
- Loads the next task’s stack pointer
- Restores its registers
All this happens in microseconds. We do not see it. But this is what makes multitasking possible.
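The bookkeeping part of these steps can be modelled as below. This is a simplified sketch that runs on a PC. demo_tcb_t is not the real FreeRTOS TCB structure, and on a Cortex-M the actual register save and restore is done in assembly inside the PendSV handler. The sketch only shows how the stack pointer stored in the TCB lets a task resume where it stopped.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Heavily simplified stand-in for a Task Control Block. */
typedef struct {
    uint32_t   *stack_pointer;  /* top of this task's saved context */
    int         priority;
    const char *name;
} demo_tcb_t;

/* The task that currently owns the (simulated) CPU. */
demo_tcb_t *current_tcb = NULL;

/* Store the outgoing task's stack pointer in its TCB, make `next` the
 * current task, and return the stack pointer the CPU should load. */
uint32_t *demo_context_switch(uint32_t *outgoing_sp, demo_tcb_t *next)
{
    assert(next != NULL);
    if (current_tcb != NULL) {
        current_tcb->stack_pointer = outgoing_sp;  /* remember where it stopped */
    }
    current_tcb = next;
    return next->stack_pointer;                    /* resume point of next task */
}
```

When we switch away from task A and later switch back, the function hands back exactly the stack pointer that was saved earlier. That is why task A continues as if nothing happened.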
What is CMSIS-RTOS and Why We Use It
Now that we understand what FreeRTOS is and how tasks work, the next question is: If we are already using FreeRTOS, why do we need CMSIS-RTOS?
FreeRTOS vs RTX5 vs ThreadX
In the embedded world, FreeRTOS is not the only RTOS available. There are several RTOS kernels:
- FreeRTOS
- Keil RTX5
- Azure RTOS ThreadX
- Zephyr and others
Each RTOS has its own API. For example:
- In FreeRTOS, we create a task using xTaskCreate()
- In ThreadX, we use tx_thread_create()
- In Zephyr, we use k_thread_create()
Now imagine this situation:
- Today, your project uses FreeRTOS
- Tomorrow, your company decides to move to ThreadX
- Or your client requires RTX5
If your entire application code uses native FreeRTOS APIs, you must rewrite everything. This is where CMSIS-RTOS V2 becomes very useful.
CMSIS-RTOS as an Abstraction Layer
CMSIS stands for Common Microcontroller Software Interface Standard. CMSIS-RTOS is a standardized API defined by ARM.
- It does not replace FreeRTOS.
- It does not schedule tasks.
- It is not a kernel.
It is just an abstraction layer. Think of it like this:
Your code calls CMSIS-RTOS functions like:
osThreadNew();
osDelay();
osSemaphoreAcquire();
These functions internally call the native FreeRTOS or any other kernel’s functions. The CMSIS wrapper translates them to RTOS calls behind the scenes. This makes your application independent of the underlying RTOS.
How CMSIS-RTOS Maps to FreeRTOS APIs
Let’s look at some direct mappings to understand this clearly.
| CMSIS-RTOS V2 | Native FreeRTOS API |
|---|---|
| osThreadNew() | xTaskCreate() |
| osDelay() | vTaskDelay() |
| osKernelStart() | vTaskStartScheduler() |
| osSemaphoreNew() | xSemaphoreCreateBinary() / others |
So when we write:
osDelay(1000);
Internally, FreeRTOS executes:
vTaskDelay(1000);
This separation gives us portability. If tomorrow we switch from FreeRTOS to RTX5:
- We still call osThreadNew()
- We still call osDelay()
- We still call osKernelStart()
Only the underlying implementation changes. Our application code remains the same. That is why in this tutorial series, we will use CMSIS-RTOS V2 instead of native FreeRTOS APIs.
Creating STM32 FreeRTOS Project in STM32CubeMX
In this section, we will create a FreeRTOS project using CMSIS-RTOS V2 in STM32CubeMX. We will configure the clock, enable FreeRTOS, and set up a simple LED task.
I will use the STM32 Nucleo-L496ZG-P board for this tutorial.
Clock Configuration
We will start with the clock configuration. This Nucleo board does not use an external crystal for the main system clock. So we will use the internal oscillator.
The image below shows the clock configuration for this project.
We will use the internal oscillator (HSI) and the PLL to run the system at its maximum clock of 80 MHz.
Enabling CMSIS-RTOS V2 in Middleware
Now we will enable the CMSIS-RTOS. The configuration can be found in the Middleware section as shown in the image below.
Always choose CMSIS-RTOS V2 for new projects. It is cleaner and more modern.
Let’s understand the basic parameters.
Tick Rate
The default tick rate is 1000 Hz.
This means the SysTick interrupt runs 1000 times per second, which provides 1 ms resolution.
So when we use osDelay(1), it delays approximately 1 millisecond.
Delays are counted in whole ticks and the tick rate here tops out at 1000 Hz, so we cannot generate delays smaller than 1 millisecond with osDelay().
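One detail is easy to miss: osDelay() takes its argument in ticks, not in milliseconds. The two are the same only when the tick rate is 1000 Hz. The sketch below shows the conversion on a PC. The helper names are made up for this example. On the target, FreeRTOS provides the pdMS_TO_TICKS() macro for the same job.

```c
#include <assert.h>
#include <stdint.h>

/* Convert milliseconds to ticks for a given tick rate, rounding down. */
uint32_t demo_ms_to_ticks(uint32_t ms, uint32_t tick_rate_hz)
{
    assert(tick_rate_hz > 0);
    return (uint32_t)(((uint64_t)ms * tick_rate_hz) / 1000u);
}

/* Length of one tick in microseconds. */
uint32_t demo_tick_period_us(uint32_t tick_rate_hz)
{
    assert(tick_rate_hz > 0);
    return 1000000u / tick_rate_hz;
}
```

At 1000 Hz, 500 ms is 500 ticks. At 100 Hz, the same 500 ms is only 50 ticks, and a 5 ms delay rounds down to 0 ticks. So if we lower the tick rate, osDelay(500) no longer means 500 ms.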
Minimal Stack Size
The default value is 128 words.
Note that this is in words, not bytes. Since STM32 is a 32-bit MCU:
1 word = 4 bytes
128 words = 512 bytes
Each task has its own stack. If the stack is too small, the system may crash due to stack overflow. For now, 128 words is enough for a simple LED task.
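The word-to-byte conversion is simple, but it matters because CubeMX shows the stack size in words while the stack_size field of the CMSIS-RTOS V2 thread attributes is in bytes. That is why the generated code contains 128 * 4. A minimal sketch, with a helper name invented for this example:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Convert a stack size in 32-bit words to bytes. */
size_t demo_stack_words_to_bytes(size_t words)
{
    assert(words <= SIZE_MAX / sizeof(uint32_t));  /* guard against overflow */
    return words * sizeof(uint32_t);
}
```

For 128 words this gives 512 bytes, and 256 words gives 1024 bytes.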
Total Heap Size
The default heap size is around 3000 bytes.
This heap is used for:
- Tasks
- Queues
- Semaphores
- Timers
- Other RTOS objects
When we create a new task, memory is taken from this heap. For this simple example, 3000 bytes is more than enough. Later, when we add more RTOS features, we may need to increase it.
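We can make a rough heap budget with simple arithmetic. The sketch below runs on a PC and uses invented helper names. The TCB size of 100 bytes used in the example is only a placeholder assumption, because the real size depends on the FreeRTOS configuration. The sketch also ignores allocator overhead and the tasks that the kernel creates for itself, such as the idle task.

```c
#include <assert.h>
#include <stddef.h>

/* Approximate heap used by one dynamically created task. */
size_t demo_task_heap_cost(size_t stack_bytes, size_t tcb_bytes)
{
    return stack_bytes + tcb_bytes;
}

/* How many such tasks fit into a heap of the given size. */
size_t demo_tasks_that_fit(size_t heap_bytes, size_t stack_bytes, size_t tcb_bytes)
{
    size_t per_task = demo_task_heap_cost(stack_bytes, tcb_bytes);
    assert(per_task > 0);
    return heap_bytes / per_task;
}
```

With a 512-byte stack and the assumed 100-byte TCB, one task costs about 612 bytes, so roughly 4 such tasks fit into 3000 bytes. On the real target, xPortGetFreeHeapSize() reports how much heap is actually left.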
Configuring Default Task
Now go to Tasks and Queues. You will see one task already created: DefaultTask.
The parameters of this task are as follows:
- Task Name: DefaultTask (only for identification)
- Priority: Normal. We will keep it as Normal for now.
- Stack Size: 128 words. Minimum allowed value.
- Entry Function: StartDefaultTask. This is where we will write our code later.
- Argument: NULL. We are not passing any parameters.
- Memory Allocation: Dynamic. The stack and Task Control Block will be allocated from the heap at runtime.
Changing HAL Time Base from SysTick to TIM6
This step is very important in STM32 FreeRTOS projects. By default, STM32 uses SysTick as the HAL time base. But FreeRTOS also uses SysTick for its scheduler.
If both HAL and FreeRTOS share SysTick, it may create timing conflicts.
So we should dedicate:
- SysTick → FreeRTOS Scheduler
- TIM6 (or TIM7) → HAL Time Base
To change it in CubeMX:
- Go to System Core → SYS
- Change Timebase Source from SysTick to TIM6 (or TIM7)
TIM6 and TIM7 are basic timers. They do not support PWM or advanced features. That is why they are ideal for this purpose.
Enabling Newlib Reentrant Support
Go to Middleware → FreeRTOS → Advanced Settings and enable Use newlib reentrant.
In an RTOS environment, multiple tasks may call functions like printf(), sprintf(), and malloc(). These standard C library functions are not thread-safe by default.
If two tasks call printf() at the same time, the output may get corrupted. Enabling newlib reentrant makes these functions safe in multitasking systems. It uses slightly more memory, but it prevents difficult debugging issues later.
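The sketch below shows what "not reentrant" means in practice. It is a PC-side illustration with made-up function names, not newlib code. The first function keeps its result in one shared static buffer, so a second caller overwrites the first caller's data. The second function lets every caller bring its own buffer. Enabling newlib reentrant support follows the same principle: FreeRTOS gives each task its own copy of the library state (struct _reent).

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Not reentrant: all callers share the same static buffer. */
const char *demo_format_shared(int value)
{
    static char buf[16];
    snprintf(buf, sizeof buf, "v=%d", value);
    return buf;
}

/* Reentrant: each caller provides its own buffer. */
const char *demo_format_r(int value, char *buf, size_t len)
{
    assert(buf != NULL && len > 0);
    snprintf(buf, len, "v=%d", value);
    return buf;
}
```

If task A calls demo_format_shared(1) and task B calls demo_format_shared(2) before A has used the result, A ends up reading "v=2". With demo_format_r() both results stay intact.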
Configuring GPIO for LED (PB7)
Now we will configure the onboard LED. According to the board schematic, the blue LED is connected to PB7. Therefore, we will configure the pin PB7 in the output mode.
We do not need to change any other GPIO settings for this example. Later inside the task, we will toggle this pin using HAL_GPIO_TogglePin().
Understanding the Generated FreeRTOS Code
After generating the project from STM32CubeMX and opening it in STM32CubeIDE, you will see that most of the FreeRTOS-related code is already written for you.
Let’s understand the most important functions that control the RTOS behavior.
osKernelInitialize()
This function initializes the FreeRTOS kernel. It prepares all internal RTOS data structures such as:
- Task control blocks
- Ready lists
- Delayed lists
- Memory management structures
It simply prepares the RTOS environment so that tasks can be created safely. You will usually see it inside main() before any task creation.
osThreadNew()
This function is used to create a new task (thread). CubeMX automatically generates the required parameters, such as the task name (DefaultTask), the entry function (for example StartDefaultTask), the stack size, and the priority.
When osThreadNew() is called:
- Memory is allocated from the FreeRTOS heap
- A Task Control Block (TCB) is created
- The task is added to the ready list
You can create multiple tasks by calling osThreadNew() multiple times.
osKernelStart()
This is the most important function. When osKernelStart() is called:
- The FreeRTOS scheduler starts
- SysTick begins generating RTOS ticks
- Context switching becomes active
- The highest priority ready task starts executing
From this moment the system is fully under RTOS control. This function never returns under normal conditions. Therefore, any code written after it in main() will not execute.
If you need continuous execution, put your logic inside a task, not inside main() after osKernelStart().
Writing Our First FreeRTOS Task – LED Blink Example
Now that the FreeRTOS project is generated, let’s write our first task. We will blink the onboard LED connected to PB7 on the STM32 Nucleo-L496ZG-P.
Writing the Task Code
We have created only one task, DefaultTask, and the entry function for this task is shown below.
void StartDefaultTask(void *argument);
Basically, we need to write our code inside this function.
void StartDefaultTask(void *argument)
{
    for(;;)
    {
        HAL_GPIO_TogglePin(GPIOB, GPIO_PIN_7);
        osDelay(500);
    }
}
Below is the explanation of the function:
- The LED toggles every 500 ms
- The task sleeps for 500 ms
- FreeRTOS can run other tasks during that delay
This creates a clean 1-second blink cycle.
Why We Use osDelay Instead of HAL_Delay
In bare-metal projects, we usually write HAL_Delay(1000), but in a FreeRTOS project, we should use osDelay(1000).
This is because HAL_Delay() performs a busy wait on the HAL time base. The calling task never leaves the Running/Ready state, so it keeps consuming CPU time, and tasks with lower priority cannot run while it waits.
On the other hand:
- osDelay() is RTOS-aware.
- It puts the current task into the Blocked state.
- The scheduler switches to another ready task.
- CPU time is used efficiently.
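The effect of blocking only the calling task can be simulated on a PC. In the sketch below, every simulated task has its own wake-up time, similar to what happens when a task calls osDelay(). The structure and function names are made up for this illustration, and the model ignores priorities and execution time.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* A simulated task that does its job, then sleeps for period_ticks. */
typedef struct {
    uint32_t period_ticks;
    uint32_t wake_tick;
    uint32_t runs;
} demo_periodic_t;

/* Advance the simulation tick by tick. A task runs only when its own
 * delay has expired. Sleeping tasks do not hold up the others. */
void demo_run_ticks(demo_periodic_t *tasks, size_t count, uint32_t duration_ticks)
{
    assert(tasks != NULL);
    for (uint32_t now = 0; now < duration_ticks; now++) {
        for (size_t i = 0; i < count; i++) {
            assert(tasks[i].period_ticks > 0);
            if (now >= tasks[i].wake_tick) {
                tasks[i].runs++;
                tasks[i].wake_tick = now + tasks[i].period_ticks;
            }
        }
    }
}
```

Over 1000 ticks, a task with a 200-tick delay runs 5 times and a task with a 500-tick delay runs 2 times. Each one keeps its own rhythm. With HAL_Delay() in a single loop, both would be stuck with the sum of all delays.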
Full Combined Code Example
Below is the simplified structure of main() and the task:
int main(void)
{
    HAL_Init();
    SystemClock_Config();
    MX_GPIO_Init();
    osKernelInitialize(); // Must run before any RTOS object is created
    MX_FREERTOS_Init();
    osThreadNew(StartDefaultTask, NULL, &defaultTask_attributes);
    osKernelStart(); // Does not return under normal conditions
    while (1)
    {
    }
}
void StartDefaultTask(void *argument)
{
    for(;;)
    {
        HAL_GPIO_TogglePin(GPIOB, GPIO_PIN_7);
        osDelay(500);
    }
}
Important reminder:
- Code after osKernelStart() will not run.
- All application logic must be inside tasks.
Output of the project
After flashing the project to the board, the LED should blink once every second. The GIF below demonstrates the expected output.
You can see above:
- The LED connected to PB7 toggles every 500 ms, which gives one full blink per second.
- The blinking will continue indefinitely.
- The scheduler is now running in the background.
Video Tutorial
STM32 FreeRTOS LED Blink Video Tutorial
This STM32 FreeRTOS tutorial shows the complete project setup and execution in action. I walk you through the CubeMX configuration, moving the HAL time base to TIM6, enabling newlib reentrant support, understanding osKernelInitialize(), osThreadNew(), osKernelStart(), and finally writing the LED blink task step by step. Watch the video and follow the written guide together to clearly understand how a basic FreeRTOS project is structured and how the scheduler runs your first task.
Conclusion
In this tutorial, we built a complete FreeRTOS-based project using STM32CubeMX and STM32CubeIDE on the STM32 Nucleo-L496ZG-P. We started by configuring the clock, enabling CMSIS-RTOS V2, adjusting stack and heap sizes, and making two critical configuration changes: moving the HAL time base from SysTick to TIM6 and enabling newlib reentrant support. Then we understood the generated FreeRTOS structure, including osKernelInitialize(), osThreadNew(), and osKernelStart(), and clarified why application code must always run inside tasks.
Finally, we implemented our first FreeRTOS task to blink an LED using osDelay() instead of HAL_Delay(), compiled the project, and flashed it to the board. This foundation is extremely useful because it teaches the correct way to structure RTOS-based embedded applications. With this understanding, you can now confidently create multiple tasks, use priorities, and build scalable real-time systems instead of relying on blocking bare-metal code.
STM32 FreeRTOS FAQs
Yes. You can use TIM7 or any other basic timer that is not required elsewhere in your project. Just make sure it is not being used for PWM, input capture, or other application features.
Your project may still compile and run, but functions like printf(), malloc(), or sprintf() can behave unpredictably when called from multiple tasks. This can lead to corrupted output or hard-to-trace bugs.
Common signs include random crashes, HardFaults, or unexpected resets. You can enable stack overflow checking in FreeRTOS settings to detect this during development.
Yes. Tasks can be created even after osKernelStart() using osThreadNew(). This is useful for event-driven designs where tasks are created only when needed.
Because osDelay() depends on the RTOS tick frequency. If you reduce the tick rate, the delay resolution changes accordingly, which directly affects timing accuracy.











