Linux Tuning For SIP Routers – Part 1 (Interrupts and IRQ Tuning)


Introduction

Caring about the performance of a SIP router (a real-time application) means caring about the performance of the operating system it runs on as well as the routing application itself. Hardware performance matters too. In this series I will give some tips on tuning Linux for real-time applications such as SIP routing. This first part covers interrupts and IRQ tuning for real-time processing.


Interrupts and IRQ Tuning

A network device usually has 1 to 5 interrupt lines for communicating with the CPU. On a multi-CPU system, interrupts are by default distributed across the CPU cores in a round-robin fashion (fair interrupt distribution). The file “/proc/interrupts” records the number of interrupts per CPU core per I/O device. Execute “cat /proc/interrupts” to list the interrupts in your system. To get the line associated with the Ethernet device, execute “grep p4p1 /proc/interrupts”, where p4p1 is the name of my Ethernet interface. The output will look like this:

30:      36650      16711      62175      18490   PCI-MSI-edge      p4p1

Where: 30 is the IRQ number. 36650 of these interrupts were handled by CPU-Core-0, 16711 by CPU-Core-1, 62175 by CPU-Core-2, and 18490 by CPU-Core-3. PCI-MSI-edge is the interrupt type. p4p1 is the name registered to receive the interrupt (this can be a comma-delimited list when several drivers share the IRQ).
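As a rough sketch, the fields described above can be split out with a small shell function. The name irq_counts is only illustrative (it is not a standard tool), and the function assumes the layout shown in the sample line: the IRQ number, then one numeric counter per core, then the interrupt type.

```shell
# Print per-core counters and their sum for /proc/interrupts lines read from stdin.
# irq_counts is a hypothetical helper name used only for this illustration.
irq_counts() {
    awk '{
        irq = $1; sub(":", "", irq)          # "30:" becomes "30"
        total = 0
        # The numeric fields right after the IRQ number are the per-core counters;
        # the loop stops at the first non-numeric field (the interrupt type).
        for (i = 2; i <= NF && $i ~ /^[0-9]+$/; i++) {
            printf "irq=%s cpu%d=%d\n", irq, i - 2, $i
            total += $i
        }
        printf "irq=%s total=%d\n", irq, total
    }'
}

# Sample line from this article. On a live system you could run:
#   grep p4p1 /proc/interrupts | irq_counts
echo " 30:      36650      16711      62175      18490   PCI-MSI-edge      p4p1" | irq_counts
```

A very uneven or very even spread across the cpuN values is a quick hint about how the IRQ is currently being distributed.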

It is recommended that all interrupts generated by a specific device be handled by the same CPU cores. Distributing an IRQ fairly across all CPU cores is not recommended, because each time the interrupt moves to a fresh CPU core, that core has to load the interrupt handler function from main memory into its cache (time overhead).

IRQs have a property called interrupt affinity, or smp_affinity. This property defines the CPU cores (the CPU set) that are allowed to handle the interrupts of a specific device. It can be used to improve the performance of a SIP router (as of any real-time application) by assigning the same CPU set to the process/thread and to the interrupt. This minimizes delays by allowing cache sharing between the process/thread and the interrupt handler.

Affinity Value for a Specific IRQ

The affinity value (CPU cores) for a specific IRQ is stored in the file “/proc/irq/IRQ_NUMBER/smp_affinity”. The value in this file is a hexadecimal bit-mask in which the lowest-order bit corresponds to the first logical CPU. For example, to set the affinity of IRQ 30 (the example above), do this as root:

  • Display the current affinity value: “cat /proc/irq/30/smp_affinity”.
  • Change the affinity of IRQ 30 to the first 2 cores (binary 0011, which is 3 in hex) by writing 3 to the file: “echo 3 > /proc/irq/30/smp_affinity”.
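Working out the hex mask by hand gets error-prone with more cores, so here is a minimal sketch that builds it from a list of core numbers. cpus_to_mask is a made-up helper name, and the sketch assumes the core numbers are small enough for the shell's integer arithmetic.

```shell
# Build the hexadecimal smp_affinity mask for a list of core numbers.
# cpus_to_mask is a hypothetical helper, not a standard command.
cpus_to_mask() {
    mask=0
    for cpu in "$@"; do
        mask=$(( $mask | (1 << $cpu) ))   # set the bit that belongs to this core
    done
    printf '%x\n' "$mask"
}

cpus_to_mask 0 1      # cores 0 and 1, binary 0011, prints 3
cpus_to_mask 2 3      # cores 2 and 3, binary 1100, prints c

# Applying the result to IRQ 30 needs root:
#   cpus_to_mask 0 1 > /proc/irq/30/smp_affinity
```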

On systems that support interrupt steering, changing smp_affinity programs the hardware so that the choice of CPU core is made at the hardware level, with no intervention from the kernel.

“irqbalance” Daemon Configuration

Most current distributions come with the “irqbalance” daemon, which distributes interrupts fairly between CPU cores. So disable irqbalance for the CPU cores that you want to bind to specific IRQs. This can be done by editing the file “/etc/sysconfig/irqbalance” and changing the parameter IRQBALANCE_BANNED_CPUS, a 64-bit mask that indicates which CPUs should be skipped when balancing IRQs.
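For illustration only, this is what the edited parameter could look like if you wanted to keep irqbalance away from cores 0 and 1. The value is an assumption chosen to match the mask 3 used for IRQ 30 earlier; pick the mask that fits your own core layout.

```shell
# Fragment of the irqbalance configuration file (shell-style assignment).
# Hex bit-mask with bits 0 and 1 set: irqbalance will skip cores 0 and 1.
# The value below is an example, not a recommended setting.
IRQBALANCE_BANNED_CPUS=00000003
```

The daemon has to be restarted before it picks up the new value.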

Affinity Value for a Specific Process/Thread

You can use the “taskset” command to set or get the affinity of a running process, or to launch a new command with a given affinity. To launch a command: “taskset 3 /…./my_process”. This binds my_process to CPU-Core-0 and CPU-Core-1; the scheduler will only move it between these two cores.

To bind a running process to specific cores, do this: “taskset -p 3 PID”, where 3 is the CPU-set hex mask and PID is the process ID.

To check which CPU cores a process is allowed to run on, execute “cat /proc/PID/status” and look at the Cpus_allowed line.
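The Cpus_allowed value in that file is a hex mask, so a small decoder can make it easier to read. The following is a sketch under two assumptions: mask_to_cpus is a hypothetical helper name, and the mask is small enough for the shell's integer arithmetic (on machines with many cores the kernel prints comma-separated groups, which this sketch simply joins). The same status file also has a Cpus_allowed_list line that shows the cores in readable form.

```shell
# List the core numbers encoded in a hexadecimal affinity mask.
# mask_to_cpus is a hypothetical helper; it assumes the mask fits in
# the shell's integer arithmetic.
mask_to_cpus() {
    hex=$(echo "$1" | tr -d ',')     # join comma-separated groups, if any
    mask=$(( 0x$hex ))
    cpu=0
    list=""
    while [ "$mask" -gt 0 ]; do
        if [ $(( $mask & 1 )) -eq 1 ]; then
            list="$list $cpu"        # bit set: this core is allowed
        fi
        mask=$(( $mask >> 1 ))
        cpu=$(( $cpu + 1 ))
    done
    echo "${list# }"
}

mask_to_cpus 3        # prints: 0 1

# On a live system, for a given PID:
#   mask_to_cpus "$(awk '/^Cpus_allowed:/ {print $2}' /proc/PID/status)"
```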

I will apply this to OpenSIPS (SIP router):

OpenSIPS 1.X is a multi-process SIP router, so when you execute “ps aux | grep opensips” you will get many lines. You can change the affinity of all or some of these processes as follows:

  • # taskset -p 3 19904 where 19904 is one of these processes
  • Check the status of the process 19904: # cat /proc/19904/status

Cpus_allowed:   3
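Repeating the taskset call for every process by hand is tedious, so a loop helps. This is a minimal sketch: pin_pids and the TASKSET override are hypothetical conveniences added for the example (setting TASKSET=echo gives a dry run that only prints the arguments), and the commented real-use line assumes the processes are named exactly "opensips".

```shell
# Apply one affinity mask to several PIDs with "taskset -p".
# pin_pids and the TASKSET variable are hypothetical, for illustration only.
pin_pids() {
    mask=$1
    shift
    for pid in "$@"; do
        # Stop at the first failure (for example a PID that no longer exists).
        ${TASKSET:-taskset} -p "$mask" "$pid" || return 1
    done
}

# Dry run with example PIDs; prints the arguments taskset would receive.
TASKSET=echo pin_pids 3 19904 19905

# Real use, as root, assuming the process name is "opensips":
#   pin_pids 3 $(pgrep -x opensips)
```

Afterwards the Cpus_allowed line of each process can be checked as shown above.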

OpenSIPS 2.X is a multi-threaded SIP router (under development).

The affinity can also be set by the process in its own code. For example:

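    #define _GNU_SOURCE             /* exposes cpu_set_t and the CPU_* macros */
    #include <sched.h>

    /* bind_to_cpu is an illustrative wrapper name; returns 0 on success, -1 on error */
    int bind_to_cpu(int cpu_nr)
    {
        cpu_set_t mask;

        CPU_ZERO(&mask);            /* start from an empty CPU set */
        CPU_SET(cpu_nr, &mask);     /* add core cpu_nr to the set */

        /* pid 0 means the calling thread */
        return sched_setaffinity(0, sizeof(mask), &mask);
    }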

 Description

  • cpu_set_t is a data structure (set of bits) which represents the set of CPU cores.
  • CPU_ZERO(…) clears the set (no CPUs).
  • CPU_SET(…) adds CPU core number cpu_nr to the CPU set.
  • sched_setaffinity(…) sets the affinity mask. The first parameter is the PID (zero here, which means the calling thread/process), the second is the size of the mask in bytes, and the third points to the mask.

Note: For further gains, choose the CPU cores for interrupt affinity and process affinity carefully, for example cores that share a cache.


Last Word

If there is a lot of reading from and writing to the hard disk (e.g. a stateful SIP router), attaching the same CPU set to the processes/threads and to the disk interrupts will also be useful:

  • Get the driver name of your hard disk:
    • Execute “udevadm info -a -n /dev/sda” and search the output for a line like DRIVERS=="Something". In my case it is DRIVERS=="ahci".
  • Execute “# grep ahci /proc/interrupts” and get the IRQ number.
  • Set the affinity value for the processes/threads and for the interrupts as discussed above.
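The IRQ lookup in the steps above can be sketched as a small filter. irqs_for is a hypothetical helper name, and the second sample line (IRQ 41 for ahci) is invented for the demonstration; only the p4p1 line comes from this article.

```shell
# Print the IRQ numbers of the /proc/interrupts lines that mention a given name.
# Reads the table from stdin. irqs_for is a hypothetical helper name.
irqs_for() {
    awk -v name="$1" 'index($0, name) && $1 ~ /^[0-9]+:$/ {
        sub(":", "", $1)             # "41:" becomes "41"
        print $1
    }'
}

# Demo on sample data (the ahci line is made up for this example).
printf '%s\n' \
    ' 30:      36650      16711      62175      18490   PCI-MSI-edge      p4p1' \
    ' 41:       1200        900        800        700   PCI-MSI-edge      ahci' \
    | irqs_for ahci

# Live use, as root, binding every ahci IRQ to cores 0 and 1:
#   for irq in $(irqs_for ahci < /proc/interrupts); do
#       echo 3 > /proc/irq/$irq/smp_affinity
#   done
```

Only lines that start with a numeric IRQ are considered, so summary rows such as NMI or LOC are ignored.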

Next

The next part will be about tuning disk access to optimize the performance of SIP routers.


 
