Saturday, April 3, 2021

Python multiprocessing with shared RawArray

I want each process to access a different row of a NumPy array in parallel, using a shared array to speed things up. However, when I run the following code, the first process to reach func raises "NameError: global name 'var' is not defined", as if var were no longer in scope. Why is this happening?

import numpy as np
import multiprocessing as mp
import time

num_procs = 16
num_points = 2500000

def init_worker(X):
    global var
    var = X

def func(proc):
    X_np = np.frombuffer(var).reshape((num_procs, num_points))
    for y in range(num_points):
        z = X_np[proc][y]

if __name__ == '__main__':
    data = np.random.randn(num_procs, num_points)
    X = mp.RawArray('d', num_procs*num_points)
    X_np = np.frombuffer(X).reshape((num_procs, num_points))
    np.copyto(X_np, data)
    print('Starting parallel...')
    start = time.time()
    pool = mp.Pool(processes=4, initializer=init_worker, initargs=(X,))
    for proc in range(num_procs):
        pool.apply_async(func(proc))
    pool.close()
    pool.join()
    end = time.time()
    print("Finished in", end-start)
https://stackoverflow.com/questions/66937630/python-multiprocessing-with-shared-rawarray April 04, 2021 at 10:53AM
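
The traceback is consistent with one detail in the loop: pool.apply_async(func(proc)) evaluates func(proc) right away in the parent process and hands its return value to the pool. init_worker only runs inside the worker processes, so var is never defined in the parent. The sketch below is one possible fix, not taken from the original post. It passes the callable and its arguments separately and collects results with .get(), which also re-raises any exception from a worker. The row-sum return value and the smaller sizes are assumptions added only to make the result checkable.

```python
import multiprocessing as mp

import numpy as np

num_procs = 4       # rows; smaller than in the question, for a quick run
num_points = 1000   # columns


def init_worker(X):
    # Runs once in every worker process; stores the shared buffer in a global.
    global var
    var = X


def func(proc):
    # View the shared buffer as a 2-D array without copying it.
    X_np = np.frombuffer(var).reshape((num_procs, num_points))
    # Returning the row sum is an illustrative addition, not part of the question.
    return proc, float(X_np[proc].sum())


if __name__ == '__main__':
    data = np.random.randn(num_procs, num_points)
    X = mp.RawArray('d', num_procs * num_points)
    np.copyto(np.frombuffer(X).reshape((num_procs, num_points)), data)

    with mp.Pool(processes=2, initializer=init_worker, initargs=(X,)) as pool:
        # Pass func and its argument tuple separately so func runs in a worker.
        pending = [pool.apply_async(func, (proc,)) for proc in range(num_procs)]
        # .get() blocks until the task finishes and re-raises worker exceptions.
        results = [p.get() for p in pending]

    for proc, total in results:
        assert np.isclose(total, data[proc].sum())
    print('Row sums match')
```

If the AsyncResult objects are discarded, as in the original loop, an exception raised inside a worker goes unnoticed, so calling .get() (or passing an error_callback) is worth keeping while debugging.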
