Parallel Computing using IPython (1)

Even the simplest programming skills are hard to grasp until we actually use them. Parallel computing seemed difficult at first, but once we had written even a few lines of code, it turned out to be quite easy.

def printtime(t):
    # import inside the function so it is available on each engine
    import time
    time.sleep(1)
    return (time.time(), t)

from ipyparallel import Client
c = Client()
print c.ids

# a load-balanced view distributes the calls over all available engines
lview = c.load_balanced_view()

res = lview.map_async(printtime, xrange(10))

import time

# give the engines some time to finish before checking the result
time.sleep(2)

# when using async, the program keeps running, so we check
# ready() before collecting the results
if res.ready():
    print res.get()
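
Instead of sleeping for a fixed two seconds, we can also wait on the AsyncResult itself; a minimal sketch, assuming the same lview and printtime as above:

res = lview.map_async(printtime, xrange(10))

# block until every call has returned, then collect the results
res.wait()
print res.get()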

I started 15 engines, more than the number of times the function would be called:

ipcluster start -n 15

So we could expect essentially no queueing delay between the calls. The result of the above code was:

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14]
[(1442806077.728, 0), (1442806077.744, 1), (1442806077.744, 2), (1442806077.775, 3), (1442806077.775, 4), (1442806077.775, 5), (1442806077.775, 6), (1442806077.791, 7), (1442806077.806, 8), (1442806077.806, 9)]
[Finished in 5.0s]

We see nearly no delay here. Now let’s look at the serial version of the same computation:

import time

def printtime(t):
    time.sleep(1)
    return (time.time(), t)

ti = []

# call the function ten times in sequence
for i in xrange(10):
    ti.append(printtime(i))
print ti

Here, in contrast, we expect a full second of latency between consecutive calls:

[(1442807505.553, 0), (1442807506.553, 1), (1442807507.569, 2), (1442807508.569, 3), (1442807509.569, 4), (1442807510.584, 5), (1442807511.584, 6), (1442807512.585, 7), (1442807513.591, 8), (1442807514.594, 9)]
[Finished in 10.4s]

Each call returned one second after the previous one because of the time.sleep(1).
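
To make the gaps explicit, we can compute the differences between consecutive timestamps in ti; each value should be close to one second:

# differences between consecutive timestamps in the serial run
deltas = [ti[i + 1][0] - ti[i][0] for i in xrange(len(ti) - 1)]
print deltas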

Several important points:

  • The module import was placed inside the function definition, because each newly started engine is a “blank” interpreter until code is sent to it. These engines are separate from the “local” one (see the first sketch after this list).
  • If we use map_async, the call does not wait for the execution to finish. The program continues and can do other work in the meantime; later we can use ready() and get() to check whether the computation is done and to collect the results.
  • If we use map_sync, the opposite is true: the call blocks until all results are available. This is useful when the results are a prerequisite for further analysis; otherwise, use map_async (see the second sketch after this list).
  • We can wrap our programs in functions and call them through the view.
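
As noted in the first point, engines start out blank. Besides putting the import inside the function, a DirectView can push imports to all engines at once; a minimal sketch, assuming the cluster started above is still running:

from ipyparallel import Client

c = Client()
dv = c[:]  # a DirectView covering every engine

# the import statements inside the block run on all engines
# (and locally as well)
with dv.sync_imports():
    import time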

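Finally, a minimal sketch contrasting the two mapping styles, assuming the same printtime and lview as above:

# map_async returns immediately with an AsyncResult;
# we collect the results later with get()
ar = lview.map_async(printtime, xrange(10))
# ... do other work here ...
print ar.get()

# map_sync blocks until every call has finished and
# returns the list of results directly
results = lview.map_sync(printtime, xrange(10))
print results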