
Underutilization of CPU resources

Posted: Sat Jul 20, 2013 6:15 am
by jmosk
I am running a CPU-intensive application using version 13.2, and it appears to be running slower than it should. There is no disk I/O involved, and no other applications are running. The machine is an HP h8-1090t with an Intel Core i7 X990 CPU at 3.47GHz, 12 GB of RAM, and 6 cores. Only 4.4 GB of RAM is in use. My APL application is using only 8% of the CPU power; 90% to 92% of the CPU time is going to the Windows 7 (64-bit) System Idle Process. This just does not seem right. I was expecting the app to use most of the CPU power. In the task monitor, the most I see is one core at perhaps 40% usage.

Can anyone give me an idea of why I am seeing such low overall CPU utilization for a CPU-bound app?

[Attachment: CPU Intensive App Running.jpg]

[Attachment: Process Utilization.jpg]

Re: Underutilization of CPU resources

Posted: Sat Jul 20, 2013 7:42 am
by PhilGray
Are there any calls to Windows, e.g. GUI or other Windows calls?
I presume that the RAM modules are suitable (e.g. fast enough) for the CPU clock.
Remember that the CPU runs on the clock (GHz), and cores do not run "parallel", but sequentially.
I.e. the "power" is in the clock speed; the multiple cores provide task-swapping with more efficiency.

Re: Underutilization of CPU resources

Posted: Sun Jul 21, 2013 12:40 am
by jmosk
PhilGray wrote: Are there any calls to Windows, e.g. GUI or other Windows calls?
I presume that the RAM modules are suitable (e.g. fast enough) for the CPU clock.
Remember that the CPU runs on the clock (GHz), and cores do not run "parallel", but sequentially.
I.e. the "power" is in the clock speed; the multiple cores provide task-swapping with more efficiency.


No, this is a totally compute-bound function doing calculations and array manipulation. There are no external calls; it only executes functions written by me. I saw a prior post on this board where someone showed 40% CPU utilization using the same processor (although they said they have 4 cores on their i7, but an i7 has 6 cores). So I don't know why they were showing 40% when I am stuck at 8%. The RAM is definitely fast enough for the CPU, as I have C programs of a similar nature that use 95% of the CPU. I tried running 4 instances of the APL interpreter, all running the same application, and the task manager shows each one using 8% of the CPU. So there is plenty of available CPU power to run faster, since the 4 instances together used 32% of the CPU.

I can't explain why I am being limited to such low utilization on a relatively fast CPU/RAM combination.

Any ideas on what to look at to find out what is keeping the process from running faster? There is no memory swapping going on to slow things down: there is plenty of unused RAM, so nothing is being paged out to disk.

Re: Underutilization of CPU resources

Posted: Sun Jul 21, 2013 6:53 am
by PhilGray
You might find some useful tools here, developed by Mark Russinovich and Bryce Cogswell:

http://technet.microsoft.com/en-us/sysi ... s/bb545027

Their "Process Explorer" is far superior to the standard Windows version.
http://technet.microsoft.com/en-us/sysi ... 96653.aspx

"Process Monitor" may also be useful.
http://technet.microsoft.com/en-us/sysi ... 96645.aspx

Re: Underutilization of CPU resources

Posted: Sun Jul 21, 2013 11:05 am
by JoHo
It would be interesting to create some dumb test functions/threads to see whether it is possible to generate 100% CPU load on this multi-core machine using v13.2.

Re: Underutilization of CPU resources

Posted: Mon Jul 22, 2013 8:00 am
by MikeHughes
My reply seems to have disappeared.

The shorter version is that the Dyalog interpreter is single-threaded. Dyalog threads are effectively time-slicing within a single core.

You will only be able to use one of your cores unless you use a feature like Peach. There are exceptions: Dyalog are experimenting with running some primitives in parallel, and external DLLs can be system-threaded, but generally it is all single-core.
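
A quick arithmetic check shows that single-core execution would account for the 8% and 32% figures reported earlier in the thread. This is only a sketch: it assumes Hyper-Threading is enabled, so that Windows reports the 6 cores as 12 logical processors (the thread does not confirm this), and `single_thread_ceiling` is a made-up helper name.

```python
def single_thread_ceiling(logical_cpus: int) -> float:
    """Upper bound, in percent of the whole machine, that one fully
    busy single-threaded process can show in a task manager."""
    if logical_cpus < 1:
        raise ValueError("logical_cpus must be at least 1")
    return 100.0 / logical_cpus


# Assumption: 6 physical cores with Hyper-Threading enabled,
# which Windows would report as 12 logical processors.
LOGICAL_CPUS = 12

one_instance = single_thread_ceiling(LOGICAL_CPUS)
four_instances = 4 * one_instance

print(f"one interpreter:   {one_instance:.1f}%")    # one interpreter:   8.3%
print(f"four interpreters: {four_instances:.1f}%")  # four interpreters: 33.3%
```

Under that assumption, a single interpreter that saturates one logical processor tops out at roughly 8% of the machine, and four of them at roughly 33%, which is close to what the task manager showed.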

Re: Underutilization of CPU resources

Posted: Mon Jul 22, 2013 9:10 am
by MikeBa
I raised the same issue on Mar 15. See Morten's reply; it might help.

Re: Underutilization of CPU resources

Posted: Mon Jul 22, 2013 2:00 pm
by AndyS|Dyalog
To add to Mike Hughes' comment:

The following scalar dyadic functions (+ - × * ÷ > ≥ = ≠ ≤ < ⍟ | ! ○ ∨ ∧) execute in parallel in separate system threads, each running on a separate CPU or core, when the argument size exceeds a configurable limit, the parallel execution threshold. They execute in parallel only for floating-point (though not for + - ×), DECF and (where appropriate) complex numbers. In all other cases the administrative overhead of parallelising these primitive functions has been found to outweigh the benefits.

Two I-Beams can be used to configure the point at which the above primitives are performed in parallel system threads, and the number of threads used: 1111⌶ specifies how many system threads to use (the default is the number of CPU/cores in the machine), and 1112⌶ defines the threshold at which parallel execution takes place.
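
The threshold idea described above can be illustrated outside APL. The Python below is only a sketch of the dispatch logic, not Dyalog's implementation: `apply_dyadic`, `threshold` and `n_threads` are hypothetical names standing in for the roles the post gives to 1112⌶ and 1111⌶, and the value 1000 is arbitrary. Because of CPython's global interpreter lock, these threads would not actually speed up the arithmetic; the point is only the decision of when to split the work.

```python
from concurrent.futures import ThreadPoolExecutor
from operator import truediv


def apply_dyadic(fn, left, right, threshold, n_threads):
    """Apply fn element-wise to two equal-length lists, splitting the
    work across threads only when the length exceeds threshold."""
    if len(left) != len(right):
        raise ValueError("arguments must have the same length")
    n = len(left)

    if n <= threshold or n_threads < 2:
        # Small argument: the bookkeeping would cost more than it saves.
        return [fn(a, b) for a, b in zip(left, right)]

    # Large argument: one contiguous chunk per thread.
    chunk = -(-n // n_threads)  # ceiling division
    bounds = [(start, min(start + chunk, n)) for start in range(0, n, chunk)]

    def work(span):
        lo, hi = span
        return [fn(left[i], right[i]) for i in range(lo, hi)]

    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        # map() yields results in the same order as bounds.
        parts = list(pool.map(work, bounds))

    return [x for part in parts for x in part]


xs = [float(i) for i in range(1, 10001)]
ys = [2.0] * len(xs)

# 10 elements: below the hypothetical threshold, runs in the calling thread.
small = apply_dyadic(truediv, xs[:10], ys[:10], threshold=1000, n_threads=4)
# 10000 elements: above the threshold, split into 4 chunks.
large = apply_dyadic(truediv, xs, ys, threshold=1000, n_threads=4)

print(small[:3])  # [0.5, 1.0, 1.5]
```

Both paths return the same values; only the route taken differs, which is why the choice of threshold affects speed but not results.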

The manuals are currently (as of 2013-07-22) rather thin on this information; I've raised an issue to flesh it out!

This feature was introduced in 12.1.

MikeBa was, I believe, referring to http://www.dyalog.com/forum/viewtopic.php?f=30&t=532&p=2046.