Friday, March 23, 2007

Disappointment with kernel

Came across a LiveJournal entry comparing FreeBSD 7 with the Linux 2.6.18 kernel on Fedora Core 6. Surprisingly, Linux scaled poorly under a MySQL server workload.

Here are my interpretation and inferences (which may well be wrong :) ):

- On an n-node system running an SMP kernel, the SQL server handles the load excellently as long as the following condition holds:

num_of_sql_query_threads <= no_of_nodes_in_the_SMP_system

This means that as long as the number of threads (assuming each query launches a thread) does not exceed the number of CPUs (or cores, if you count the dual-core processors of these days), the threading model performs great.
Each thread can be placed on its own core, which gives optimum CPU usage. The moment the number of threads grows beyond the number of cores in the SMP system, the problems start.
Possible reasons:
* The userland is unable to keep up with the CPU execution speed and thus cannot feed enough runnable threads to the kernel. (Does this mean the user-space threading model needs a look?)
* The kernel is unable to keep the CPUs busy once the number of kernel threads grows beyond a limit. Does this mean the kernel threading model has been a little overlooked?
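
To make the threshold condition from earlier concrete, here is a minimal Python sketch. The function names (oversubscription, scales_cleanly) are my own and hypothetical, not taken from the benchmark; it only counts how many threads are left without a core of their own, and says nothing about why the slowdown happens.

```python
import os


def oversubscription(num_threads, num_cores=None):
    """Return how many threads exceed the available cores (0 if none).

    num_cores defaults to what the OS reports; pass it explicitly to
    reason about a machine other than the one running this script.
    """
    if num_cores is None:
        # os.cpu_count() may return None on exotic platforms; assume 1 then.
        num_cores = os.cpu_count() or 1
    return max(0, num_threads - num_cores)


def scales_cleanly(num_threads, num_cores=None):
    """True while num_of_sql_query_threads <= no_of_nodes_in_the_SMP_system."""
    return oversubscription(num_threads, num_cores) == 0


if __name__ == "__main__":
    # Hypothetical 8-core box: 8 query threads fit, 12 leave 4 waiting.
    print(scales_cleanly(8, 8))
    print(oversubscription(12, 8))
```

Anything above zero means some threads have to share a core, which is where the trouble described above begins.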

I think reason 1 is indeed true, because replacing glibc's malloc with Google's malloc improved performance considerably. But even so, the overall picture did not improve that much. Something really bad seems to be happening inside the kernel.

Linux uses a one-to-one threading model. This means every user-level thread is backed by its own kernel-schedulable thread. My guess is that some global/local spinlocks in the SMP kernel are hindering effective migration of these kernel threads among the runqueues (or waitqueues). Whatever it is, the issue looks plainly separate from the scheduler. No blaming the scheduling algorithm here :). I suspect locking and synchronization primitives are the real culprits.
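
A quick way to see the one-to-one model in action is to ask each thread for its kernel-level thread ID. The sketch below is only an illustration under assumptions: it uses Python (3.8 or later, for threading.get_native_id) rather than MySQL, and the helper name collect_native_ids is hypothetical. It demonstrates the mapping, not the contention problem.

```python
import threading


def collect_native_ids(n):
    """Start n threads and return the kernel thread ID each one reports."""
    ids = []
    ids_lock = threading.Lock()
    # The barrier keeps every thread alive until all have recorded their ID,
    # so the kernel cannot hand a finished thread's ID to a later one.
    barrier = threading.Barrier(n)

    def worker():
        with ids_lock:
            ids.append(threading.get_native_id())
        barrier.wait()

    threads = [threading.Thread(target=worker) for _ in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return ids


if __name__ == "__main__":
    ids = collect_native_ids(4)
    # With a one-to-one model, each user thread has a distinct kernel thread ID.
    print(len(set(ids)) == len(ids))
```

Each user-level thread shows up with its own kernel ID, so the kernel schedules every one of them individually.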

An important point to note here is that the current threading implementation in Linux (NPTL) is relatively new. It became standard only with the 2.6.x kernel series.

So there is a lot of scope for improvement in the kernel on the threading front.
Any takers? :)

As a last thought, why is malloc performing poorly? No idea. Maybe Ulrich Drepper is the best person to ask... so don't ask me. :)

Random thought: don't futexes add some complexity of their own? Somebody can explain futexes to me, I hope...

Later.
