It is investigation season, and here is another one. We have known for a few days that the Pentium Extreme Edition, based on the dual-core Smithfield, will be announced soon, and that this processor will use Hyper-Threading technology. We all remember that when HT was released, Windows 2000 supported it poorly, which resulted in a performance decrease for some applications. This was due to the OS scheduler (the component in charge of dispatching threads and processes across all the CPUs). For this reason, we thought it would be interesting to check the behaviour of Windows XP on a dual-core with HT.
A dual-core CPU with HT is a known configuration for servers, but a new one for desktop systems: two physical CPUs and four logical units. To simulate this configuration, we used a dual Xeon system with HT. The results on HT efficiency are surprising and deserve to be analysed before the dual-core CPUs are released. For the following tests we used two Xeon 3.0GHz (800MHz bus) and 1GB of DDR400 on an Asus NCCH-DL, based on the E7210 chipset (i875-ES), and of course Windows XP SP2.
To show how the Windows XP scheduler behaves on such a configuration, we used video encoding programs, which generally make good use of HT technology. Here are the results of an encoding made with Windows Media Encoder 9:
As we can see, with one CPU, enabling HT results in a 20% increase in performance. Using two physical CPUs (HT disabled) even allows a 65% increase! Nevertheless, enabling HT on both CPUs results in a 3.5% decrease in the encoding score. To confirm this result, we also ran a test with the DivX 5.2.1 codec and VirtualDub version 1.6.3. The results are the same:
In conclusion, enabling HT increases performance when only one CPU is used, but decreases it as soon as HT is used on a dual CPU configuration. Why? A system with two physical CPUs and four logical units (these four units appear as CPUs in the Task Manager) looks like this:
The first thread is handled by CPU#1. But the second thread MUST be handled by one of the two units of the second physical CPU, and not by the second unit of the first CPU; otherwise only one of the two physical CPUs is used. Consequently, the second thread must not be handled by CPU#2, as it would be on a classic dual CPU system. Instead, both physical CPUs must be used first, before a second logical unit on either of them is used.
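The dispatch rule described here (one thread per physical CPU first, second logical units only afterwards) can be illustrated with a small sketch. It assumes the numbering implied by the text, with CPU#1 and CPU#2 on the first physical CPU and CPU#3 and CPU#4 on the second. The function name `placement_order` is ours, for illustration only; this is not how any real scheduler is written.

```python
def placement_order(physical_cpus, units_per_cpu):
    """Return 1-based logical unit numbers in the order threads should fill them.

    Assumption (from the article's description): the logical units of one
    physical CPU are numbered consecutively, so with 2 CPUs x 2 units,
    CPU#1 and CPU#2 share the first physical CPU.
    """
    order = []
    # Outer loop over the unit index: take the first unit of every physical
    # CPU before touching the second unit of any of them.
    for unit in range(units_per_cpu):
        for cpu in range(physical_cpus):
            order.append(cpu * units_per_cpu + unit + 1)
    return order


# Two physical CPUs with HT: the second thread goes to CPU#3, not CPU#2.
print(placement_order(2, 2))  # [1, 3, 2, 4]
```

With this order, two threads land on CPU#1 and CPU#3, so both physical CPUs are busy before any HT sibling is used.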
We used the multi-threaded application written by Franck, which computes a simple Whetstone and shows the maximum theoretical performance of an SMP system. We used the following configuration, and here are the results:
If the two threads are dispatched to CPU#1 and CPU#2, only one physical CPU is used, and the results are the ones we would obtain on a single CPU with HT. This is exactly what the Windows XP scheduler does, instead of using the two physical units first. Fortunately, if the application sets the affinity of its own threads, the problem is solved. This is the case with TMPGEnc version 1.254, which handles thread dispatching itself instead of leaving the work to the Windows scheduler. The same encoding test with this program gives the following results:
With HT enabled, we get the same performance increase for the two CPUs.
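To make the affinity approach more concrete, here is a minimal sketch of how an application could compute one affinity mask per worker thread so that physical CPUs are filled first. We do not know how TMPGEnc does it internally; the function name `affinity_masks` and the bit layout are assumptions. In particular, we assume bit 0 is CPU#1 and that the two logical units of a physical CPU occupy adjacent bits, which is not guaranteed on a real system. On Windows, each mask would then be passed to the Win32 call SetThreadAffinityMask for the corresponding thread; the sketch only computes the masks.

```python
def affinity_masks(n_threads, physical_cpus=2, units_per_cpu=2):
    """Return one single-bit affinity mask per thread, physical CPUs first.

    Assumed layout (hypothetical): bit = cpu * units_per_cpu + unit,
    so bits 0-1 are the first physical CPU and bits 2-3 the second.
    """
    total = physical_cpus * units_per_cpu
    masks = []
    for t in range(n_threads):
        slot = t % total                 # wrap around if more threads than units
        cpu = slot % physical_cpus       # alternate between physical CPUs
        unit = slot // physical_cpus     # move to the next unit only after a full round
        masks.append(1 << (cpu * units_per_cpu + unit))
    return masks


for i, mask in enumerate(affinity_masks(2)):
    print(f"thread {i}: mask 0b{mask:04b}")
```

For two threads this gives masks 0b0001 and 0b0100, one logical unit on each physical CPU, rather than the two units of the same CPU.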
To sum up, when dual-core CPUs with HT are released, the behaviour of multi-threaded applications will have to be studied carefully, given the Windows XP scheduler's trouble managing four logical CPUs efficiently. If Microsoft does not update the Windows XP scheduler to fix this (an update is very unlikely; remember that Windows 2000 was never fixed to handle HT correctly), application developers will (once again) have to handle it in their applications.