Linux - [Solved-3 Solutions] Why does multiprocessing use only a single core after I import numpy?



Linux - Problem:

from joblib import Parallel,delayed
import numpy as np

def testfunc(data):
    # Deliberately wasteful CPU-bound work: multiply every pair of entries
    for nn in range(1000):
        for ii in data[0, :]:
            for jj in data[1, :]:
                ii * jj

def run(niter=10):
    # One random 2x100 array per job
    data = (np.random.randn(2, 100) for ii in range(niter))
    # n_jobs=-1 asks joblib to use all available cores
    pool = Parallel(n_jobs=-1, verbose=1, pre_dispatch='all')
    results = pool(delayed(testfunc)(dd) for dd in data)

if __name__ == '__main__':
    run()

Here is what htop shows while this script is running:

[Screenshot: htop output while the script is running]

Linux - Solution 1:

  • It turns out that certain Python modules (numpy, scipy, tables, pandas, skimage) mess with core affinity on import.
  • This problem seems to be specifically caused by them linking against multithreaded OpenBLAS libraries.
  • To reset the task affinity, run the following code after the imports (the mask 0xff allows CPUs 0-7):
import os
os.system("taskset -p 0xff %d" % os.getpid())

With this line placed after the module imports, the script runs on all cores:

[Screenshot: htop output with the script running on all cores]

This doesn't seem to have any negative effect on numpy's performance, although this is probably machine- and task-specific.
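The hard-coded mask 0xff in the taskset call covers only CPUs 0-7. As a minimal sketch, assuming a Linux machine with taskset on the PATH, the mask could be derived from the CPU count instead. The function names all_cores_mask and reset_affinity_with_taskset are made up for illustration and are not part of the original answer.

```python
import os
import subprocess


def all_cores_mask(n_cpus=None):
    """Return a hex affinity mask with one bit set per CPU."""
    if n_cpus is None:
        # os.cpu_count() can return None, so fall back to a single CPU
        n_cpus = os.cpu_count() or 1
    return hex((1 << n_cpus) - 1)


def reset_affinity_with_taskset():
    """Ask taskset to let the current process run on every CPU (Linux only)."""
    subprocess.run(
        ["taskset", "-p", all_cores_mask(), str(os.getpid())],
        check=True,
    )
```

Calling reset_affinity_with_taskset() once, right after the numpy import, would play the same role as the os.system line above.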

There are also two ways to disable the CPU affinity-resetting behaviour of OpenBLAS itself. At run time you can set the environment variable OPENBLAS_MAIN_FREE (or GOTOBLAS_MAIN_FREE), for example:

OPENBLAS_MAIN_FREE=1 python myscript.py
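If changing the command line is inconvenient, the variable can probably also be set from inside the script, as long as that happens before numpy (or any other OpenBLAS-linked module) is first imported. This is a sketch under that assumption; the helper name disable_openblas_affinity is hypothetical.

```python
import os


def disable_openblas_affinity(env=os.environ):
    """Set OPENBLAS_MAIN_FREE=1 unless the caller already chose a value."""
    env.setdefault("OPENBLAS_MAIN_FREE", "1")
    return env["OPENBLAS_MAIN_FREE"]


# Run this before the first import of numpy, scipy, pandas, etc.
disable_openblas_affinity()

# Only now import the OpenBLAS-linked modules, for example:
# import numpy as np
```

setdefault is used so that a value already exported in the shell is not overridden.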

Alternatively, if you're compiling OpenBLAS from source, you can permanently disable it at build time by editing Makefile.rule to contain the line:

NO_AFFINITY=1

Linux - Solution 2:

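On Linux with Python 3.3 or later, you can inspect and set the affinity directly with os.sched_getaffinity and os.sched_setaffinity. As the last call shows, requested CPUs that do not exist on the machine are ignored: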

>>> import os
>>> os.sched_getaffinity(0)
{0, 1, 2, 3}
>>> os.sched_setaffinity(0, {1, 3})
>>> os.sched_getaffinity(0)
{1, 3}
>>> x = {i for i in range(10)}
>>> x
{0, 1, 2, 3, 4, 5, 6, 7, 8, 9}
>>> os.sched_setaffinity(0, x)
>>> os.sched_getaffinity(0)
{0, 1, 2, 3}
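The interactive session above could be wrapped in a small helper that is called once after the imports. This is only a sketch: reset_affinity is a hypothetical name, and it assumes the CPUs are numbered 0 to os.cpu_count() - 1, which may not hold on every system.

```python
import os


def reset_affinity(pid=0):
    """Allow process `pid` (0 = current) to run on every CPU.

    Returns the resulting affinity set, or None where the
    sched_* functions are unavailable (e.g. macOS, Windows).
    """
    if not hasattr(os, "sched_setaffinity"):
        return None
    try:
        os.sched_setaffinity(pid, range(os.cpu_count() or 1))
    except OSError:
        # Restricted environments may refuse the change; keep the old mask
        pass
    return os.sched_getaffinity(pid)
```

Unlike the taskset approach, this needs no external program and no hand-written bit mask.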

Linux - Solution 3:

This appears to be a common problem with Python on Ubuntu, and is not specific to joblib:

  • Both multiprocessing.map and joblib use only 1 cpu after upgrade from Ubuntu 10.10 to 12.04
  • Python multiprocessing utilizes only one core
  • multiprocessing.Pool processes locked to a single core
