
Published: March 3, 2026
Last Updated: March 7, 2026

STM32 FreeRTOS CMSIS-RTOS Tutorial – Step-by-Step Guide Using STM32CubeIDE

In many STM32 projects, we start with a simple while(1) superloop. It works fine for small applications. But as soon as we add multiple tasks like sensor reading, UART communication, and display updates, things get messy.

The common issues include:

Delays start conflicting
UART misses data
The system becomes hard to manage

That is where RTOS helps us. In this tutorial, we will learn how to:

Configure FreeRTOS using CMSIS-RTOS V2
Create a task in STM32
Understand how the scheduler works
Blink an LED using osDelay()
Understand why RTOS is better than superloop

We will use STM32CubeMX and STM32CubeIDE to create and build the project. I am going to use the STM32 Nucleo-L496ZG-P for this RTOS series.

Why We Need FreeRTOS in STM32 Projects

When we start learning STM32, most of us write code inside a simple while(1) loop. This is called the superloop approach. It works well in the beginning. I also used it for many small projects.

But once the project grows, problems start appearing. Let’s understand why.

Problems with Superloop and HAL_Delay()

In a typical STM32 project, we often write something like this:

while(1) // Single infinite loop
{  
  sensor_task();
  HAL_Delay(200);  // BLOCKS EVERYTHING
  display_task();
  HAL_Delay(500);  // BLOCKS EVERYTHING
  UART_task();
  HAL_Delay(1000); // BLOCKS EVERYTHING
}

At first glance, this looks fine. But there is a big problem here.

HAL_Delay() blocks the entire CPU. When we call:

HAL_Delay(500);

the CPU just spins in a loop for 500 milliseconds. Apart from servicing interrupts, it does nothing else. During this time:

UART cannot process incoming data properly
Sensors may miss important readings
Other time-critical operations get delayed

The CPU is busy waiting. It is not multitasking. In small projects, this seems fine. But in real embedded systems, this becomes a serious limitation.
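To put a number on this, here is a small sketch that adds up the blocking delays of one pass through the superloop. The function name superloop_period_ms is only an illustration, and the execution time of the functions themselves is ignored.

```c
#include <stddef.h>
#include <stdint.h>

/* Sum of all blocking delays in one pass of the superloop, in ms.
   The run time of the called functions is ignored in this sketch. */
uint32_t superloop_period_ms(const uint32_t *delays_ms, size_t count)
{
    uint32_t total = 0;
    for (size_t i = 0; i < count; i++) {
        total += delays_ms[i];
    }
    return total;
}
```

With the delays from the loop above (200, 500 and 1000 ms), one pass takes at least 1700 ms, so sensor_task() runs only once every 1.7 seconds.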

As the number of tasks increases, maintaining proper timing becomes very difficult. We start adding counters, flags, and time checks using HAL_GetTick(). The loop becomes messy very quickly.
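A typical time check of this kind can be sketched as below. The helper name interval_elapsed is hypothetical; on the target, now_ms would come from HAL_GetTick().

```c
#include <stdbool.h>
#include <stdint.h>

/* True when at least period_ms has passed since last_ms.
   Unsigned subtraction stays correct when the 32-bit tick counter wraps. */
bool interval_elapsed(uint32_t now_ms, uint32_t last_ms, uint32_t period_ms)
{
    return (uint32_t)(now_ms - last_ms) >= period_ms;
}
```

Every periodic job needs its own last_ms variable and its own check like this, which is how the loop fills up with bookkeeping.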

This is exactly the point where we should consider using RTOS in STM32 projects.

Why State Machines Become Complex

Now suppose we decide not to use HAL_Delay(). Instead, we try to make everything non-blocking using state machines.

We may write something like:

switch(state)
{
  case READ_SENSOR:
    if (HAL_GetTick() - prevTime >= 200)
    {
      Read_Sensor();
      prevTime = HAL_GetTick();
      state = UPDATE_DISPLAY;
    }
    break;

  case UPDATE_DISPLAY:
    // similar logic
    break;
}

This works. But the code becomes harder to read. When we add:

Multiple sensors
UART communication
Error handling
Communication timeouts

The state machine grows very large. The code loses clarity. Maintenance becomes difficult. One small change can break the entire timing logic.
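To see how quickly this grows, here is a sketch of the same idea with just three states. The names app_t, app_step and the counters are made up for this example; the counters stand in for the real sensor, display and UART work.

```c
#include <stdint.h>

typedef enum { READ_SENSOR, UPDATE_DISPLAY, SEND_UART } app_state_t;

typedef struct {
    app_state_t state;
    uint32_t    prev_ms;              /* time of the last state change */
    uint32_t    reads, draws, sends;  /* counters standing in for real work */
} app_t;

/* Run one non-blocking step; now_ms would be HAL_GetTick() on the target. */
void app_step(app_t *app, uint32_t now_ms)
{
    switch (app->state) {
    case READ_SENSOR:
        if (now_ms - app->prev_ms >= 200u) {
            app->reads++;
            app->prev_ms = now_ms;
            app->state = UPDATE_DISPLAY;
        }
        break;
    case UPDATE_DISPLAY:
        if (now_ms - app->prev_ms >= 500u) {
            app->draws++;
            app->prev_ms = now_ms;
            app->state = SEND_UART;
        }
        break;
    case SEND_UART:
        if (now_ms - app->prev_ms >= 1000u) {
            app->sends++;
            app->prev_ms = now_ms;
            app->state = READ_SENSOR;
        }
        break;
    }
}
```

All three jobs share prev_ms and depend on the order of the states, so changing one interval shifts the timing of the other two.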

So What Is the Solution?

We need a better way to:

Run multiple tasks independently
Avoid blocking the CPU
Keep timing accurate
Maintain clean and readable code

That solution is FreeRTOS. Instead of one big loop, we create separate tasks. Each task handles one job. The scheduler decides which task runs and when.

If one task waits, others continue running. This is why modern embedded systems prefer RTOS-based design over the traditional superloop approach.

What is FreeRTOS and How It Works

Before we start writing code, we should clearly understand what FreeRTOS actually is. Many beginners think it is just a library. But it is much more than that.

FreeRTOS changes the way our STM32 application runs internally. Instead of one infinite loop controlling everything, the control moves to a scheduler.

What is a Real-Time Operating System (RTOS)?

A Real-Time Operating System (RTOS) is a lightweight operating system designed for embedded systems.

The keyword here is real-time. Real-time does not mean fast. It means predictable timing.

In embedded systems, we often need things to happen at precise intervals:

Read a sensor every 10 ms
Transmit data every 100 ms
Respond to an interrupt immediately

An RTOS ensures that the highest-priority task that is ready to run gets the CPU, so time-critical work happens when it should.

In a normal superloop, everything runs sequentially, delays block the CPU, and the timing becomes unpredictable.

Whereas in FreeRTOS:

We create multiple tasks
The scheduler decides which task runs
Tasks can wait without blocking others

So instead of this:

while(1)
{
  Task1();
  Task2();
  Task3();
}

We create independent tasks, and the RTOS runs them efficiently in the background.

What is a Task in FreeRTOS?

A task is like a small independent program inside your main program. Each task has:

Its own stack
Its own local variables
Its own execution point
Its own priority

When the scheduler switches from one task to another, it saves everything about the current task and restores the next one.

It feels like they are running at the same time. In reality, the CPU is switching between them very fast.

A typical task looks like this:

void StartTask(void *argument)
{
  for(;;)
  {
    // Task code
  }
}

We intentionally write an infinite loop inside a task. A FreeRTOS task function must never return. If a task has finished its job, it has to end itself explicitly, for example with osThreadExit(). So every task must run forever unless we explicitly delete it. This structure keeps the system stable.

Task States in RTOS (Running, Ready, Blocked, Terminated)

Every task in FreeRTOS can be in one of four main states.

Running:

This task currently owns the CPU. Only one task can be in the Running state at a time (on a single-core STM32).

Ready:

The task is ready to run but waiting for the scheduler to give it CPU time. If two tasks are ready, the one with higher priority runs first.

Blocked:

The task is waiting for something:

A delay (osDelay)
A semaphore
A queue
An event

Blocked tasks do not use CPU time. This is the biggest advantage over HAL_Delay(). When we use osDelay(1000), only that task is paused. Other tasks continue running.

Terminated:

The task has been deleted. It no longer exists in the system. In most applications, we rarely terminate tasks; we keep them running forever.
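The rule "the highest-priority ready task runs" can be sketched as a simple search. This is only a model with made-up names (task_t, pick_next_task). The real kernel keeps one ready list per priority instead of scanning an array.

```c
#include <stddef.h>
#include <stdint.h>

typedef enum { TASK_READY, TASK_RUNNING, TASK_BLOCKED, TASK_TERMINATED } task_state_t;

typedef struct {
    const char  *name;
    uint8_t      priority;  /* larger number = higher priority */
    task_state_t state;
} task_t;

/* Return the index of the highest-priority task that is able to run,
   or -1 if every task is blocked or terminated. */
int pick_next_task(const task_t *tasks, size_t count)
{
    int best = -1;
    for (size_t i = 0; i < count; i++) {
        if (tasks[i].state != TASK_READY && tasks[i].state != TASK_RUNNING) {
            continue;  /* blocked and terminated tasks never get the CPU */
        }
        if (best < 0 || tasks[i].priority > tasks[(size_t)best].priority) {
            best = (int)i;
        }
    }
    return best;
}
```

In a real FreeRTOS system the -1 case does not occur, because the idle task is always ready to run.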

What is a Task Control Block (TCB)?

Internally, FreeRTOS needs to store information about each task. This information is stored in something called a Task Control Block (TCB).

It stores:

Stack pointer
Task priority
Current state (Running, Ready, etc.)
Timing information
Task name

Whenever a context switch happens, the scheduler:

Saves the current task’s CPU registers into its stack
Updates its TCB
Loads the next task’s stack pointer
Restores its registers

All this happens in microseconds. We do not see it. But this is what makes multitasking possible.
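The save and restore steps can be modelled on a PC with a very small sketch. The types demo_tcb_t and demo_cpu_t are simplified stand-ins invented for this example. The real FreeRTOS TCB has more fields, and the real switch is done in port-specific assembly.

```c
#include <stdint.h>

/* Simplified stand-in for a task control block; the real FreeRTOS
   structure has more fields and a different layout. */
typedef struct {
    uint32_t   *stack_pointer;  /* top of this task's saved context */
    uint8_t     priority;
    const char *name;
} demo_tcb_t;

/* Pretend CPU: only the stack pointer matters for this model. */
typedef struct {
    uint32_t *sp;
} demo_cpu_t;

/* Save the outgoing task's stack pointer, then load the incoming task's. */
void demo_context_switch(demo_cpu_t *cpu, demo_tcb_t *from, demo_tcb_t *to)
{
    from->stack_pointer = cpu->sp;  /* its registers are assumed to be pushed already */
    cpu->sp = to->stack_pointer;    /* the next task's registers are popped from here */
}
```

The key point is that each task resumes from its own stack, which is why every task needs its own stack memory.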

What is CMSIS-RTOS and Why We Use It

Now that we understand what FreeRTOS is and how tasks work, the next question is: If we are already using FreeRTOS, why do we need CMSIS-RTOS?

FreeRTOS vs RTX5 vs ThreadX

In the embedded world, FreeRTOS is not the only RTOS available. There are several RTOS kernels:

FreeRTOS
Keil RTX5
Azure RTOS ThreadX
Zephyr and others

Each RTOS has its own API. For example:

In FreeRTOS, we create a task using: xTaskCreate()
In ThreadX, we use tx_thread_create() for the same
In Zephyr, k_thread_create() is used to create a task.

Now imagine this situation:

Today, your project uses FreeRTOS
Tomorrow, your company decides to move to ThreadX
Or your client requires RTX5

If your entire application code uses native FreeRTOS APIs, you must rewrite everything. This is where CMSIS-RTOS V2 becomes very useful.

CMSIS-RTOS as an Abstraction Layer

CMSIS stands for Common Microcontroller Software Interface Standard. CMSIS-RTOS is a standardized API defined by ARM.

It does not replace FreeRTOS.
It does not schedule tasks.
It is not a kernel.

It is just an abstraction layer. Think of it like this:

Layered architecture diagram showing how osThreadNew() in Application Code translates to xTaskCreate() in CMSIS-RTOS API, then runs through the FreeRTOS Kernel scheduler, and finally executes on STM32 Cortex-M hardware. Each colorful block contains faded example code.

Your code calls CMSIS-RTOS functions like:

osThreadNew();
osDelay();
osSemaphoreAcquire();

These functions internally call the native functions of FreeRTOS or of whichever kernel sits underneath. The CMSIS wrapper translates them to RTOS calls behind the scenes. This makes your application independent of the underlying RTOS.

How CMSIS-RTOS Maps to FreeRTOS APIs

Let’s look at some direct mappings to understand this clearly.

CMSIS-RTOS V2	Native FreeRTOS API
`osThreadNew()`	`xTaskCreate()`
`osDelay()`	`vTaskDelay()`
`osKernelStart()`	`vTaskStartScheduler()`
`osSemaphoreNew()`	`xSemaphoreCreateBinary()` / others

So when we write:

osDelay(1000);

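Internally, the CMSIS wrapper calls the FreeRTOS function with the same argument. Both functions count in kernel ticks, which equal milliseconds at the default 1000 Hz tick rate: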

vTaskDelay(1000);

This separation gives us portability. If tomorrow we switch from FreeRTOS to RTX5:

We still call osThreadNew()
We still call osDelay()
We still call osKernelStart()

Only the underlying implementation changes. Our application code remains the same. That is why in this tutorial series, we will use CMSIS-RTOS V2 instead of native FreeRTOS APIs.
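The idea of an abstraction layer can be shown with a toy model that runs on a PC. Everything here (rtos_port_t, the two fake kernels, app_blink_once) is hypothetical. As far as I know, the real CMSIS-RTOS wrapper is selected at build time rather than through function pointers, so treat this only as an illustration of the principle.

```c
#include <stdint.h>

/* Hypothetical portable API: the application only ever sees this table. */
typedef struct {
    const char *kernel_name;
    void (*delay_ticks)(uint32_t ticks);
} rtos_port_t;

/* Counters stand in for calls into the real kernels. */
uint32_t freertos_delay_calls;
uint32_t rtx5_delay_calls;

void freertos_delay(uint32_t ticks) { (void)ticks; freertos_delay_calls++; }
void rtx5_delay(uint32_t ticks)     { (void)ticks; rtx5_delay_calls++; }

const rtos_port_t port_freertos = { "FreeRTOS", freertos_delay };
const rtos_port_t port_rtx5     = { "RTX5",     rtx5_delay };

/* Application code: identical no matter which port is plugged in. */
void app_blink_once(const rtos_port_t *port)
{
    port->delay_ticks(500u);
}
```

The application function never mentions a kernel by name, so swapping the port does not touch it.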

Creating STM32 FreeRTOS Project in STM32CubeMX

In this section, we will create a FreeRTOS project using CMSIS-RTOS V2 in STM32CubeMX. We will configure the clock, enable FreeRTOS, and set up a simple LED task.

I will use the STM32 Nucleo-L496ZG-P board for this tutorial.

Clock Configuration

We will start with the clock configuration. This Nucleo board does not use an external crystal for the main system clock, so we will use the internal oscillator.
The image below shows the clock configuration for this project.

We will use the internal oscillator (HSI) and the PLL to run the system at the maximum clock of 80 MHz.

Enabling CMSIS-RTOS V2 in Middleware

Now we will enable the CMSIS-RTOS. The configuration can be found in the Middleware section as shown in the image below.

Image shows how to enable FreeRTOS CMSIS_V2 in the middleware section of the STM32 MCU using CubeMX.

Always choose CMSIS-RTOS V2 for new projects. It is cleaner and more modern.

Let’s understand the basic parameters.

Tick Rate

The default tick rate is 1000 Hz.

This means the SysTick interrupt runs 1000 times per second, which provides 1 ms resolution.

So when we use osDelay(1), the task is delayed until the next tick, which is at most 1 millisecond away.

Since the maximum tick rate is 1000 Hz, one tick is the shortest delay osDelay() can produce, so we cannot generate delays smaller than 1 millisecond in FreeRTOS.
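Because osDelay() counts ticks, a delay given in milliseconds only matches at 1000 Hz. A small conversion helper makes this explicit. The name ms_to_ticks is an assumption for this sketch and is not part of CMSIS-RTOS.

```c
#include <stdint.h>

/* Convert milliseconds to kernel ticks, rounding up so the delay is never
   shorter than requested. tick_hz is the configured tick rate. */
uint32_t ms_to_ticks(uint32_t ms, uint32_t tick_hz)
{
    return (uint32_t)(((uint64_t)ms * tick_hz + 999u) / 1000u);
}
```

On the target, the tick rate can be read with osKernelGetTickFreq(), so a call could look like osDelay(ms_to_ticks(500, osKernelGetTickFreq())).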

Minimal Stack Size

The default value is 128 words.

Note that this is in words, not bytes. Since STM32 is a 32-bit MCU:

1 word = 4 bytes
128 words = 512 bytes

Each task has its own stack. If the stack is too small, the system may crash due to stack overflow. For now, 128 words is enough for a simple LED task.
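The word-to-byte conversion is simple, but it is easy to mix up in code. Here is a minimal sketch, where BYTES_PER_WORD and stack_words_to_bytes are names chosen for this example.

```c
#include <stdint.h>

#define BYTES_PER_WORD 4u  /* one stack word on a 32-bit Cortex-M core */

/* Convert a CubeMX stack size given in words to bytes. */
uint32_t stack_words_to_bytes(uint32_t words)
{
    return words * BYTES_PER_WORD;
}
```

This matters because the stack_size field of osThreadAttr_t is given in bytes, which is why the generated code writes 128 * 4 for a 128-word stack.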

Total Heap Size

The default heap size is around 3000 bytes.

Image shows the total heap assigned to the STM32 FreeRTOS is 3000 bytes.

This heap is used for:

Tasks
Queues
Semaphores
Timers
Other RTOS objects

When we create a new task, memory is taken from this heap. For this simple example, 3000 bytes is more than enough. Later, when we add more RTOS features, we may need to increase it.
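A rough budget helps to judge when the heap needs to grow. The sketch below assumes each dynamic task costs its stack plus a TCB. The TCB size passed in is a placeholder, since the real value depends on the FreeRTOS version and configuration, and allocator overhead is ignored.

```c
#include <stdint.h>

/* Rough heap cost of one dynamically created task: its stack plus its TCB.
   tcb_bytes is an assumption; the real size depends on the FreeRTOS
   version and configuration. Allocator overhead is ignored. */
uint32_t task_heap_bytes(uint32_t stack_words, uint32_t tcb_bytes)
{
    return stack_words * 4u + tcb_bytes;
}

/* How many such tasks fit into a heap of heap_bytes. */
uint32_t tasks_that_fit(uint32_t heap_bytes, uint32_t stack_words, uint32_t tcb_bytes)
{
    return heap_bytes / task_heap_bytes(stack_words, tcb_bytes);
}
```

With a placeholder TCB size of 100 bytes, a 128-word task costs about 612 bytes, so a 3000 byte heap holds roughly four of them. Keep in mind that the idle task also takes its memory from this heap when dynamic allocation is used.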

Configuring Default Task

Now go to Tasks and Queues. You will see one task already created: defaultTask.

The defaultTask is preconfigured in CubeMX.

The parameters of this defaultTask are as follows:

Task Name: defaultTask (only for identification)
Priority: Normal (we will keep it as Normal for now)
Stack Size: 128 words (the minimum allowed value)
Entry Function: StartDefaultTask (this is where we will write our code later)
Argument: NULL (we are not passing any parameters)
Memory Allocation: Dynamic (the stack and Task Control Block will be allocated from the heap at runtime)

Changing HAL Time Base from SysTick to TIM6

This step is very important in STM32 FreeRTOS projects. By default, STM32 uses SysTick as the HAL time base. But FreeRTOS also uses SysTick for its scheduler. If both HAL and FreeRTOS share SysTick, it may create timing conflicts.

So we should dedicate:

SysTick → FreeRTOS Scheduler
TIM6 (or TIM7) → HAL Time Base

Image shows how to change the timebase when using FreeRTOS in STM32.

Go to System Core → SYS
Change Timebase Source from SysTick to TIM6 (or TIM7)

TIM6 and TIM7 are basic timers. They do not support PWM or advanced features. That is why they are ideal for this purpose.
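The HAL time base needs one timer interrupt every millisecond. The arithmetic behind that can be sketched as below. The 80 MHz timer clock and the 1 MHz counter rate used in the example are assumptions for illustration. CubeMX generates the actual setup, so we do not have to write this ourselves.

```c
#include <stdint.h>

/* Prescaler register value that divides timer_clk_hz down to counter_hz. */
uint32_t tim_prescaler(uint32_t timer_clk_hz, uint32_t counter_hz)
{
    return timer_clk_hz / counter_hz - 1u;
}

/* Auto-reload value that makes the counter overflow at update_hz. */
uint32_t tim_period(uint32_t counter_hz, uint32_t update_hz)
{
    return counter_hz / update_hz - 1u;
}
```

For an 80 MHz timer clock counted down to 1 MHz, the prescaler is 79 and the period is 999, which gives 1000 update interrupts per second.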

Enabling Newlib Reentrant Support

Go to Middleware → FreeRTOS → Advanced Settings and enable Use newlib reentrant.

Image shows how to enable NewLib Reentrant for FreeRTOS in STM32.

In an RTOS environment, multiple tasks may call standard C library functions like printf(), sprintf(), and malloc(). With newlib, these functions keep internal state that is shared by default, so they are not safe to call from several tasks.

If two tasks call printf() at the same time, the output may get corrupted. Enabling newlib reentrant gives each task its own copy of the library's reentrancy data, which makes these functions much safer in multitasking systems. It uses slightly more memory per task, but it prevents difficult debugging issues later.
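The underlying problem is shared internal state. The sketch below is not newlib code. It uses two made-up functions, label_shared and label_reentrant, to show the difference between a function that keeps one static buffer and one that works on memory owned by the caller.

```c
#include <stdio.h>

/* Non-reentrant: every caller shares one static buffer. */
const char *label_shared(int value)
{
    static char buffer[16];
    snprintf(buffer, sizeof buffer, "val=%d", value);
    return buffer;
}

/* Reentrant: the caller owns the buffer, so callers cannot disturb each other. */
const char *label_reentrant(int value, char *buffer, size_t size)
{
    snprintf(buffer, size, "val=%d", value);
    return buffer;
}
```

If two tasks call label_shared(), the second call overwrites the result of the first. With label_reentrant(), each task keeps its own result.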

Configuring GPIO for LED (PB7)

Now we will configure the onboard LED. According to the board schematic, the blue LED is connected to PB7. Therefore, we will configure the pin PB7 in the output mode.

Image shows the GPIO configuration in STM32 to set the pin PB7 in the output mode. This pin will be used to control the LED via the FreeRTOS task.

We do not need to change any other GPIO settings for this example. Later inside the task, we will toggle this pin using HAL_GPIO_TogglePin().

Understanding the Generated FreeRTOS Code

After generating the project from STM32CubeMX and opening it in STM32CubeIDE, you will see that most of the FreeRTOS-related code is already written for you.

Let’s understand the most important functions that control the RTOS behavior.

osKernelInitialize()

This function initializes the FreeRTOS kernel. It prepares all internal RTOS data structures such as:

Task control blocks
Ready lists
Delayed lists
Memory management structures

It simply prepares the RTOS environment so that tasks can be created safely. You will usually see it inside main() before any task creation.

osThreadNew()

This function is used to create a new task (thread). CubeMX automatically generates the required parameters: the task name (defaultTask), the entry function (for example StartDefaultTask), the stack size, and the priority.

When osThreadNew() is called:

Memory is allocated from the FreeRTOS heap
A Task Control Block (TCB) is created
The task is added to the ready list

Note: The task function does not start running immediately. It will only run after the scheduler is started.

You can create multiple tasks by calling osThreadNew() multiple times.

osKernelStart()

This is the most important function. When osKernelStart() is called:

The FreeRTOS scheduler starts
SysTick begins generating RTOS ticks
Context switching becomes active
The highest priority ready task starts executing

From this moment, the system is fully under RTOS control. This function never returns under normal conditions. Therefore, any code written after it in main() will not execute.

If you need continuous execution, put your logic inside a task, not inside main() after osKernelStart().

Writing Our First FreeRTOS Task – LED Blink Example

Now that the FreeRTOS project is generated, let’s write our first task. We will blink the onboard LED connected to PB7 on the STM32 Nucleo-L496ZG-P.

Writing the Task Code

We have created only one task, DefaultTask, and the entry function for this task is shown below.

void StartDefaultTask(void *argument);

Basically, we need to write our code inside this function.

void StartDefaultTask(void *argument)
{
  for(;;)
  {
    HAL_GPIO_TogglePin(GPIOB, GPIO_PIN_7);
    osDelay(500);
  }
}

Below is the explanation of the function:

The LED toggles every 500 ms
The task sleeps for 500 ms
FreeRTOS can run other tasks during that delay

This creates a clean 1-second blink cycle.
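The relation between the toggle interval and the blink rate can be checked with a small PC simulation. The function count_led_on_events is made up for this sketch, and it ignores the few microseconds the toggle itself takes.

```c
#include <stdbool.h>
#include <stdint.h>

/* Count how often the LED switches on during duration_ms when it is
   toggled every toggle_ms, starting from the off state at t = 0. */
uint32_t count_led_on_events(uint32_t duration_ms, uint32_t toggle_ms)
{
    bool led_on = false;
    uint32_t on_events = 0;

    if (toggle_ms == 0u) {
        return 0u;  /* avoid an endless loop on a zero interval */
    }
    for (uint32_t t = 0; t < duration_ms; t += toggle_ms) {
        led_on = !led_on;  /* same effect as HAL_GPIO_TogglePin() on the pin */
        if (led_on) {
            on_events++;
        }
    }
    return on_events;
}
```

Toggling every 500 ms for 5 seconds switches the LED on 5 times, which is one blink per second.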

Why We Use osDelay Instead of HAL_Delay

In bare-metal projects, we usually write HAL_Delay(1000), but in a FreeRTOS project, we should use osDelay(1000).

This is because HAL_Delay() blocks the CPU. It uses the HAL time base and performs a busy wait. The calling task stays in the Running state for the whole delay, so lower-priority tasks get no CPU time.

On the other hand:

osDelay() is RTOS-aware.
It puts the current task into the Blocked state.
The scheduler switches to another ready task.
CPU time is used efficiently.
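The difference can be modelled on a PC with two imaginary tasks. In this sketch, a higher-priority task waits again and again, and a lower-priority worker runs whenever the CPU is free. The function worker_ticks and the one-tick cost of the waiting task's own work are assumptions of the model.

```c
#include <stdbool.h>
#include <stdint.h>

/* Model: one higher-priority task repeatedly waits delay_ticks, one
   lower-priority worker runs whenever the CPU is free. Returns how many
   of total_ticks the worker gets. The higher-priority task's own work is
   assumed to take one tick per cycle. */
uint32_t worker_ticks(uint32_t total_ticks, uint32_t delay_ticks, bool busy_wait)
{
    uint32_t worker = 0;
    uint32_t remaining = 0;  /* ticks left in the current wait */

    for (uint32_t t = 0; t < total_ticks; t++) {
        if (remaining == 0u) {
            remaining = delay_ticks;  /* do the work, then start waiting */
        } else {
            remaining--;
            if (!busy_wait) {
                worker++;  /* the waiter is Blocked, so the worker runs */
            }
            /* busy wait: the waiter keeps the CPU and the worker is starved */
        }
    }
    return worker;
}
```

With a blocking delay, the worker gets 998 of 1000 ticks in this model. With a busy wait, it gets none.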

Image shows the difference between HAL_Delay() and osDelay() when using FreeRTOS in STM32.

Full Combined Code Example

Below is the simplified structure of main() and the task:

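/* Handle and attributes for the default task, as generated by CubeMX */
osThreadId_t defaultTaskHandle;
const osThreadAttr_t defaultTask_attributes = {
  .name = "defaultTask",
  .stack_size = 128 * 4,                        /* 128 words, given in bytes */
  .priority = (osPriority_t) osPriorityNormal,
};

void StartDefaultTask(void *argument);

int main(void)
{
  HAL_Init();
  SystemClock_Config();
  MX_GPIO_Init();

  osKernelInitialize();   /* Prepare the kernel before creating any RTOS object */

  defaultTaskHandle = osThreadNew(StartDefaultTask, NULL, &defaultTask_attributes);

  osKernelStart();        /* Hands control to the scheduler; should not return */

  while (1)               /* Reached only if the scheduler failed to start */
  {
  }
}

void StartDefaultTask(void *argument)
{
  for(;;)
  {
    HAL_GPIO_TogglePin(GPIOB, GPIO_PIN_7);  /* Toggle the LED on PB7 */
    osDelay(500);                           /* Block this task for 500 ticks (500 ms at 1000 Hz) */
  }
}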

Important reminder:

Code after osKernelStart() will not run.
All application logic must be inside tasks.

Output of the project

After flashing the project to the board, the LED should blink once every second. The GIF below demonstrates the expected output.

Gif shows the LED on STM32 is blinking using the FreeRTOS task.

As the GIF shows:

The LED connected to PB7 toggles every 500 ms, so it completes one blink per second.
The blinking will continue indefinitely.
The scheduler is now running in the background.

Video Tutorial

STM32 FreeRTOS LED Blink Video Tutorial

This STM32 FreeRTOS tutorial shows the complete project setup and execution in action. I walk you through the CubeMX configuration, moving the HAL time base to TIM6, enabling newlib reentrant support, understanding osKernelInitialize(), osThreadNew(), osKernelStart(), and finally writing the LED blink task step by step. Watch the video and follow the written guide together to clearly understand how a basic FreeRTOS project is structured and how the scheduler runs your first task.


Conclusion

In this tutorial, we built a complete FreeRTOS-based project using STM32CubeMX and STM32CubeIDE on the STM32 Nucleo-L496ZG-P. We started by configuring the clock, enabling CMSIS-RTOS V2, adjusting stack and heap sizes, and making two critical configuration changes: moving the HAL time base from SysTick to TIM6 and enabling newlib reentrant support. Then we understood the generated FreeRTOS structure, including osKernelInitialize(), osThreadNew(), and osKernelStart(), and clarified why application code must always run inside tasks.

Finally, we implemented our first FreeRTOS task to blink an LED using osDelay() instead of HAL_Delay(), compiled the project, and flashed it to the board. This foundation is extremely useful because it teaches the correct way to structure RTOS-based embedded applications. With this understanding, you can now confidently create multiple tasks, use priorities, and build scalable real-time systems instead of relying on blocking bare-metal code.


STM32 FreeRTOS FAQs

Can I use a different timer instead of TIM6 for the HAL time base?

Yes. You can use TIM7 or any other basic timer that is not required elsewhere in your project. Just make sure it is not being used for PWM, input capture, or other application features.

What happens if I forget to enable newlib reentrant?

Your project may still compile and run, but functions like printf(), malloc(), or sprintf() can behave unpredictably when called from multiple tasks. This can lead to corrupted output or hard-to-trace bugs.

How do I know if my task stack size is too small?

Common signs include random crashes, HardFaults, or unexpected resets. You can enable stack overflow checking in FreeRTOS settings to detect this during development.
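One common way to measure stack usage is to fill the stack with a known pattern and later count how much of it is untouched. The sketch below models that idea on a PC. STACK_FILL, stack_paint and stack_unused_words are names invented for this example and are not FreeRTOS functions.

```c
#include <stddef.h>
#include <stdint.h>

#define STACK_FILL 0xA5A5A5A5u  /* known pattern written before the task runs */

/* Fill a stack buffer with the pattern. */
void stack_paint(uint32_t *stack, size_t words)
{
    for (size_t i = 0; i < words; i++) {
        stack[i] = STACK_FILL;
    }
}

/* Count words at the low end that still hold the pattern. On Cortex-M the
   stack grows downward, so these are the words the task never touched. */
size_t stack_unused_words(const uint32_t *stack, size_t words)
{
    size_t unused = 0;
    while (unused < words && stack[unused] == STACK_FILL) {
        unused++;
    }
    return unused;
}
```

On the target we do not need to write this ourselves. CMSIS-RTOS V2 provides osThreadGetStackSpace(), which reports the remaining stack space of a thread.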

Can I create tasks dynamically after the scheduler starts?

Yes. Tasks can be created even after osKernelStart() using osThreadNew(). This is useful for event-driven designs where tasks are created only when needed.

Why does my LED blink speed change when I modify the tick rate?

Because the argument of osDelay() is a number of RTOS ticks, not milliseconds. At 1000 Hz, osDelay(500) waits 500 ms, but at 100 Hz the same call waits 5 seconds. If you change the tick rate, scale the delay values accordingly.
