Lab 11

Multiprocessing
Due by 6pm on Tuesday, May 5

The purpose of this lab is to:

Getting Started

Create a folder called lab11 inside your cs150 folder.

Hello, World

Once more, we will write the classic Hello, World program, but this time we will say hello from multiple processes. In this program, you should ask the user for a name and a number n of processes to create. You should then create n processes, each of which will execute a function that prints "Hello, <name>, from Process <pid>". (A pid is a process id number.) Save this program in a file called hello.py.

Here are brief reminders of how to do some of these steps:

Creating a new process

First, you will need to import the multiprocessing module, like so:

          from multiprocessing import *

You will need to define a function that does whatever you would like your process to do. For you, this will look something like:

          def HelloWorld(name):
              # print Hello to name from process id

To create a new process, p, that runs function HelloWorld:

          p = Process(target=HelloWorld, args=(name,))

The call to Process uses "named arguments": target is the function we'd like the process to run, and args is the tuple of arguments to pass to that function. To write a tuple of length 1, add a comma after the single argument, as in (name,).

Next, to start process p, write:

          p.start()

To get the process id number of the current process, use:

          current_process().pid
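
The pieces above can be combined roughly as follows. This is only a minimal sketch, not the required solution: the names greeting and say_hello and the hard-coded name "Ada" are made up for illustration, and the join() calls (which wait for each child to finish) are an addition that the steps above do not require.

```python
from multiprocessing import Process, current_process


def greeting(name, pid):
    # Build the message separately so it is easy to check on its own.
    return "Hello, {}, from Process {}".format(name, pid)


def say_hello(name):
    # Runs inside the child process, so this is the child's pid.
    print(greeting(name, current_process().pid))


if __name__ == "__main__":
    # The guard keeps child processes from re-running this block on
    # platforms that start children by re-importing the main module.
    workers = [Process(target=say_hello, args=("Ada",)) for _ in range(3)]
    for p in workers:
        p.start()
    for p in workers:
        p.join()  # wait for every child to finish
```

The order of the printed lines can differ from run to run, since the processes run independently of one another.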

Parallel Monte Carlo

Recall that in Lab 3, we approximated the value of pi by simulating dart throws. In this lab, you will create a program to do this in parallel, using Python's process pools. You will also look at how long it takes different numbers of processes to complete the task. Your final product, monte.py, will be a Python program that asks the user how many processes to use, and prints out the approximate value of pi based on 100,000,000 throws divided among those processes. Your program should also print how long it took to calculate that result. You should create a text file named monte_readme.txt in which you record how long it took your program to approximate pi with 100,000,000 throws, using 1, 2, 4, 8, 20, 40, 100 and 200 processes.

Process Pools

A process pool is a way to create a number of processes, and have them all do the same task. To create a pool of processes, where n is the number of processes you wish to create:

         pool = Pool(processes=n)

To run the processes, you will use the Pool.map function. The map function takes in a function, f, and a list of arguments, with the length of the list equal to the number of processes. (Essentially, you are handing in separate arguments for each process, in the form of a list. Note that this means your function must take a single argument, although that argument may itself be a tuple or a list.) It returns a list of the return values from the processes. This will look like:

         results = pool.map(f, args_list)
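
As a small illustration of Pool.map, here is a sketch that squares one number per process. The functions square and make_args are hypothetical stand-ins for your own function and argument list; the close() and join() calls simply shut the pool down cleanly once the work is done.

```python
from multiprocessing import Pool


def square(x):
    # The function handed to map must take a single argument.
    return x * x


def make_args(n):
    # One argument per process: [1, 2, ..., n].
    return list(range(1, n + 1))


if __name__ == "__main__":
    n = 4
    pool = Pool(processes=n)
    # map returns the results in the same order as the arguments.
    results = pool.map(square, make_args(n))
    pool.close()
    pool.join()
    print(results)  # [1, 4, 9, 16]
```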

Time

To measure how long it takes to approximate pi, you will use the time module. First:

         import time

Before you begin your calculations, get the start time:

         start = time.time()

After you are done, get the end time:

         end = time.time()

Subtract start from end to get the number of seconds that have elapsed:

         elapsed = end - start
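
One way to package the timing steps is a small helper that times any function call. This is just a sketch; the helper name timed is made up for illustration, and the call being timed (summing a range) is a placeholder for your own computation.

```python
import time


def timed(f, *args):
    # Call f(*args) and return its result along with the elapsed seconds.
    start = time.time()
    result = f(*args)
    end = time.time()
    return result, end - start


result, elapsed = timed(sum, range(1000000))
print("result:", result, "elapsed:", round(elapsed, 3), "seconds")
```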

Program Outline

The details of implementing a solution are up to you, but here is a suggested outline of how to approach the problem. As usual, think about the 6 steps of program development, and test each piece as you go.

  1. Write a function that takes in a number of trials, runs a simulation of throwing that number of darts, and returns the number of darts that landed in the circle. If you did this correctly in Lab 3, you can just modify your code from that lab.
  2. Create a process pool. Figure out how many trials each process will run. (Note: Your number of processes may not evenly divide into your number of darts, in which case some processes may have one more dart throw than others. Make sure you are throwing the correct number of darts altogether.) Use map to run the processes, and combine the results to get the total number of darts that hit the target. Dividing this total by the number of darts thrown gives the fraction that landed in the circle; as in Lab 3, convert that fraction into an approximation of pi (for a circle inscribed in a square, multiply it by 4).
  3. Time how long it takes to compute your approximation of pi. Create a text file monte_readme.txt and record in it the amount of time it took to approximate pi with 100,000,000 throws for each of the following numbers of processes: 1, 2, 4, 8, 20, 40, 100, and 200.
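
The bookkeeping in step 2 can be sketched with two small helpers. Both names (split_trials and estimate_pi) are hypothetical, and the factor of 4 assumes the darts land uniformly in a square that just encloses the circle, which may not match your Lab 3 setup exactly.

```python
def split_trials(total, n):
    # Divide total trials among n processes; the first `extra`
    # processes each get one additional trial.
    base, extra = divmod(total, n)
    return [base + 1] * extra + [base] * (n - extra)


def estimate_pi(hits, total):
    # Assumption: hits / total approximates pi / 4.
    return 4.0 * hits / total


counts = split_trials(10, 3)
print(counts)       # [4, 3, 3]
print(sum(counts))  # 10
```

A list like counts is what you would hand to map as the list of arguments, one entry per process.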

Account

It frequently makes sense for processes to be able to access the same variables, so they can work together. However, this can be very dangerous, as we will get incorrect results if two or more processes modify the same variable at the same time. To avoid this, we will use something called a Lock, which will only allow one process at a time to access a variable.

In this program, you will create an Account class, which simulates a bank account. You will create methods to add and withdraw money from the account. You will then create multiple processes to act as users of this account. Each process will withdraw a random amount of money from the account, wait a random amount of time, and then add a random amount of money to the account. It will do this 3 times. They will each use their pid for the customer number. The output of this program should look something like:

Customer 25857 removed 71 balance 29
Customer 25858 removed 29 balance 0
Customer 25859 removed 0 balance 0
Customer 25860 removed 0 balance 0
Customer 25861 removed 0 balance 0
Customer 25858 added 71 balance 71
Customer 25858 removed 38 balance 33
Customer 25857 added 57 balance 90
Customer 25857 removed 54 balance 36
Customer 25858 added 58 balance 94
Customer 25858 removed 8 balance 86
Customer 25859 added 90 balance 176
Customer 25859 removed 77 balance 99
Customer 25859 added 30 balance 129
Customer 25859 removed 7 balance 122
Customer 25861 added 29 balance 151
Customer 25861 removed 24 balance 127
Customer 25860 added 57 balance 184
Customer 25860 removed 98 balance 86
Customer 25861 added 23 balance 109
Customer 25861 removed 17 balance 92
Customer 25861 added 29 balance 121
Customer 25859 added 12 balance 133
Customer 25860 added 69 balance 133
Customer 25860 removed 59 balance 74
Customer 25857 added 72 balance 146
Customer 25857 removed 0 balance 146
Customer 25860 added 24 balance 170
Customer 25858 added 52 balance 222
Customer 25857 added 4 balance 226

In order for this to work correctly, we need two things: shared memory so that all of the processes can access the same account, and synchronization to protect the integrity of the data. The example above was run without synchronization. Notice these two lines towards the end:

         Customer 25859 added 12 balance 133
         Customer 25860 added 69 balance 133

The second balance is wrong. This happens when we switch processes before the first process is done with its calculation, resulting in both processes using the original starting balance - essentially, the first transaction is overwritten by the second. In order to fix this, we must use a lock when we are altering the balance, to ensure only one process at a time can change the balance. We will need to use processes directly for this part of the lab, rather than using a process pool, because the multiprocessing pools don't use shared data.

Shared Memory

For shared memory, we will use the RawValue type available in Python's multiprocessing module. It takes a code that indicates the type of the variable we want to share ("i" for an integer), and a value. To create a shared integer of value 0:

          v = RawValue("i",0)

To access the value of that integer:

          v.value
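
To see that a RawValue really is shared, the following sketch has a child process change the value and the parent read it back afterwards. The function name set_answer and the number 42 are made up for illustration.

```python
from multiprocessing import Process, RawValue


def set_answer(v):
    # Runs in the child; the write is visible to the parent
    # because v lives in shared memory.
    v.value = 42


if __name__ == "__main__":
    v = RawValue("i", 0)
    p = Process(target=set_answer, args=(v,))
    p.start()
    p.join()  # wait for the child before reading the value
    print(v.value)
```

An ordinary Python integer passed to the child would not behave this way, since each process would get its own copy.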

 

Locks

A lock is an object that only one process can "hold" at a time. To create a lock:

          l = Lock()

Now, if we want to use that lock so that only a single process can change the value of v, we have a process acquire the lock before it alters the value, and release it afterwards, like so:

          l.acquire()     # put the lock into its locked state.
          v.value = v.value - 10
          l.release()     # put the lock into its unlocked state.

When a process tries to acquire a lock that has already been acquired (i.e., locked) by another process, it is put on hold until the earlier process releases (i.e., unlocks) the lock. Note that it is VERY important you always release the lock after you acquire it - otherwise, no other processes will be able to acquire the lock, and your program will hang forever.
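
One way to make sure the lock is always released is Python's with statement, which acquires the lock on entry and releases it on exit, even if an error occurs in between. The sketch below is an alternative to calling acquire() and release() by hand; the function deposit and the counts used are made up for illustration.

```python
from multiprocessing import Lock, Process, RawValue


def deposit(v, l, amount, times):
    for _ in range(times):
        with l:  # same effect as l.acquire() ... l.release()
            v.value = v.value + amount


if __name__ == "__main__":
    v = RawValue("i", 0)
    l = Lock()
    workers = [Process(target=deposit, args=(v, l, 1, 1000))
               for _ in range(4)]
    for p in workers:
        p.start()
    for p in workers:
        p.join()
    # With the lock held for every update, this prints 4000.
    print(v.value)
```

Without the lock, some of the updates could overwrite one another and the final value could come out smaller.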

 

Sleeping

Recall that we want our customers to wait a random amount of time between withdrawing and adding money. To do this, we will use the time.sleep function, which causes a process to "sleep" (i.e., do nothing) for a specified amount of time. You will need to make sure to import the time module. To sleep for n seconds:

          time.sleep(n)
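
Combined with random.randint, a random pause might look like the sketch below. The helper name random_pause is hypothetical; note that randint includes both endpoints, so a pause of 0 seconds is possible.

```python
import random
import time


def random_pause(max_seconds):
    # Pick a whole number of seconds from 0 to max_seconds, inclusive.
    n = random.randint(0, max_seconds)
    time.sleep(n)
    return n


waited = random_pause(1)
print("slept for", waited, "seconds")
```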

Program Outline

  1. Start by creating an Account class, with an init method that takes in a RawValue (used as the account balance), and a Lock, and saves them.
  2. Create a method add(customer_number, amount) that adds the amount to the balance, and prints out the customer number, amount and balance. Make sure you use the RawValue for the balance, and that you use locks correctly to make your code safe for multiple processes.
  3. Create a method withdraw(customer_number, amount) that compares amount with the balance and withdraws either all of the available funds (if the balance is smaller than amount) or the amount specified (otherwise). You should update the balance (using locking), print out the customer number, the amount withdrawn, and the new balance, and return the amount successfully withdrawn. The balance should never be able to go below 0.
  4. Outside of the Account class, create a function Customer(account). In this function, withdraw a random amount between 0 and 100 from account, then sleep for a random amount between 0 and 5, then add a random amount between 0 and 100 to the account. (You will probably want to use the random.randint function.) Do these things (withdraw, sleep, add) 3 times. Use the pid of the current process as the customer number.
  5. Create a new RawValue of type "i", and set it to 100. This will serve as the account balance. Create a new Lock. Create a new account with this RawValue and Lock.
  6. Create 5 Processes with target Customer and argument of your new account. You should NOT use a process pool - instead create the processes yourself. Start the processes.
  7. Look over your output to ensure that they are outputting the correct values.
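
The withdrawal rule in step 3 can be sketched as a small helper that decides how much may be taken. The name amount_to_withdraw is hypothetical and this is not the full method: in your Account class, both the comparison and the update of the balance need to happen while the lock is held, so that no other process can change the balance in between.

```python
def amount_to_withdraw(balance, amount):
    # Never take more than is available, so the balance stays >= 0.
    return min(balance, amount)


print(amount_to_withdraw(29, 50))   # 29: only 29 is available
print(amount_to_withdraw(100, 71))  # 71: the full request is granted
```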

Handin

If you followed the Honor Code in this assignment, insert a paragraph attesting to the fact within one of your .py files.

I affirm that I have adhered to the Honor Code in this assignment.

You now just need to electronically hand in all your files. As a reminder:

 
     % cd             # changes to your home directory
     % cd cs150       # goes to your cs150 folder
     % handin         # starts the handin program
                      # class is 150
                      # assignment is 11
                      # file/directory is lab11
     % lshand         # should show that you've handed in something

You can also specify the options to handin from the command line

 
     % cd ~/cs150     # goes to your cs150 folder
     % handin -c 150 -a 11 lab11

File Checklist


You should have submitted the following files:
   hello.py
   monte.py
   monte_readme.txt
   account.py