Friday, November 22, 2002

Parallel Computing - MPI - Scanning Text

Scanning Text

1. Problem Description

The problem is to search large pieces of text for occurrences of the word “Cryptosporidium”. Since the amount of text to search is very large, parallel computation has been suggested as a possible means of accomplishing this task.

2. Method



The steps involved are as follows:

Master reads the file
Master sends the text to the Slaves
Slaves compute full or partial occurrences of the word in their blocks
Slaves send their results to the Master
Master computes the final result
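The per-block scan in the third step can be sketched as a plain C function. This is only a sketch, not the submitted code: the function name scan_block is mine; the PRESENT_* result codes are the ones described in the report.

```c
#include <string.h>

/* Result codes reported by each Slave (names as in the report). */
enum { PRESENT_NO = 0, PRESENT_YES = 1, PRESENT_END = 2, PRESENT_START = 3 };

/* Scan one block for a full occurrence of `word`, or a partial occurrence:
   a proper prefix of `word` ending the block (PRESENT_END), or a proper
   suffix of `word` starting the block (PRESENT_START). */
int scan_block(const char *block, const char *word)
{
    size_t n = strlen(block), m = strlen(word);

    if (strstr(block, word) != NULL)
        return PRESENT_YES;

    /* Longest proper prefix of `word` that ends the block? */
    for (size_t k = (m - 1 < n ? m - 1 : n); k > 0; k--)
        if (strncmp(block + n - k, word, k) == 0)
            return PRESENT_END;

    /* Longest proper suffix of `word` that starts the block? */
    for (size_t k = (m - 1 < n ? m - 1 : n); k > 0; k--)
        if (strncmp(block, word + m - k, k) == 0)
            return PRESENT_START;

    return PRESENT_NO;
}
```

A block containing both kinds of partial occurrence reports only one of them here, which matches the note below that one report per block is sufficient.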



Notes:

A Slave need not report occurrences at BOTH the start and the end of its block; one is sufficient, since in either case the Master matches the block against its neighbours, and the adjacent blocks also report their own partial occurrences.
The first and last blocks have only one adjacent block each, rather than two.
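One way the Master can resolve a reported boundary partial is to join the tail of one block with the head of its neighbour and search that seam; m−1 characters from each side (for a word of length m) are enough to catch any match straddling the boundary. A minimal sketch, where the function name and buffer size are my assumptions and not taken from the submitted program:

```c
#include <string.h>

/* If one block reported a partial occurrence at its end (or the next block
   at its start), join the last m-1 chars of the left block with the first
   m-1 chars of the right block and search the seam for `word`.
   Assumes `word` is shorter than 64 characters. */
int seam_contains(const char *left, const char *right, const char *word)
{
    size_t m = strlen(word);
    size_t ln = strlen(left), rn = strlen(right);
    size_t lk = ln < m - 1 ? ln : m - 1;   /* tail taken from left block  */
    size_t rk = rn < m - 1 ? rn : m - 1;   /* head taken from right block */
    char seam[2 * 64];

    memcpy(seam, left + ln - lk, lk);
    memcpy(seam + lk, right, rk);
    seam[lk + rk] = '\0';
    return strstr(seam, word) != NULL;
}
```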

3. Parallel Implementation Strategy

I have used MPI to implement the parallel program.

Master reads from the file “text.txt”.
Master sends the text to the Slaves. (I wanted to use MPI_Bcast; however, I could not get it working with MPI_CHAR buffers, so I reverted to MPI_Send and MPI_Recv.)
The text is split evenly among the slaves. The last process may receive extra characters if the number of processes does not evenly divide the length of the text.
Load balancing is not taken into account; however, the program is scalable, and load balancing could be added if the machines differed in speed.
Slaves compute the occurrence of the word “CRYPTOSPORIDIUM”. The possible answers are:
0  Not found (PRESENT_NO)
1  Full occurrence found (PRESENT_YES)
2  Partial occurrence found at the end of the block (PRESENT_END)
3  Partial occurrence found at the start of the block (PRESENT_START)
These results are sent to the Master using MPI_Send and MPI_Recv; the Master receives with MPI_ANY_SOURCE, so results can arrive in any order.
The Master computes the final result from the slaves' results. If a slave has indicated PRESENT_START or PRESENT_END, those block boundaries are joined with the adjacent blocks and compared again.
The final result is obtained.
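The split described above (equal blocks, with the last process taking any remainder) can be sketched as follows; the function and parameter names are illustrative, not from the submitted program:

```c
/* Offset and length of the block assigned to `rank` (0-based) among
   `nslaves` slaves: each slave gets total/nslaves characters, and the
   last slave also takes any remainder. */
void block_bounds(int total, int nslaves, int rank, int *offset, int *len)
{
    int base = total / nslaves;
    *offset = rank * base;
    *len = (rank == nslaves - 1) ? total - *offset : base;
}
```

For example, splitting 50000 characters (the program's maximum string size) over 3 slaves gives two blocks of 16666 characters and a final block of 16668.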

4. Time Complexity Analysis

tstartup: time to start up the processes

tcomm1: communication time for the Master to send the text to the Slaves and for the Slaves to receive it

tcomp = tcomp0 + tcomp1 + … + tcompN: total computation time over the N processes specified by the user, i.e. the time each process takes to scan its block of text

tcomm2: time for the Slaves to send their individual results and for the Master to receive them

tcompmaster: time taken by the Master to compute the final result

tcomm1 = tstartup + N·tdata

tcomm2 = tstartup + 4·N (each of the N slaves returns one 4-byte integer result)

tcomp is O(m), where m is the length of a block of text

The overall execution time is

ttotal = tcomm1 + tcomp + tcomm2

For example, taking the 3-process measurements from the table below, ttotal ≈ 0.0004821 + 0.00025745 + 0.0012721 ≈ 0.0020 s.

The TIME analysis shown in the program output is only an approximation and is not implemented rigorously, due to MPI constraints.

I have added the additional tcompmaster term in the implementation, though it is usually not included in the theoretical analysis.

Analysis results on sf3.comp.nus.edu.sg

Processes   tstartup    tcomm1      tcomp        tcomm2
3           0.0043851   0.0004821   0.00025745   0.0012721
4           0.0044841   0.0006681   0.0003396    0.0013311
5           0.0043731   0.0009441   0.00040575   0.0007391
6           0.0042731   0.0008021   0.0004609    0.0010651
7           0.0042291   0.0009941   0.0005905    0.0012221
8           0.0044671   0.0011061   0.0005082    0.0015261
9           0.0041731   0.0012171   0.00062735   0.0040601
10          0.0045411   0.0017761   0.0006045    0.0070911
11          0.0047231   0.0015631   0.00067765   0.0031441
12          0.0041541   0.0031841   0.0006838    0.0017441




tstartup: remains fairly constant across all process counts

tcomm1: increases gradually with the number of processes

tcomp: is very low, but increases rather than decreases with more processes. Note that I tested on sf3, a single machine, so this is to be expected: the processes are not actually running in parallel.

tcomm2: also increases

The results are fairly consistent: as more processes are created the communication time increases, and the overall time increases as well, as seen in the next chart:


From this chart it is clear that the total time increases with the number of processes (sf3).

5. MPI Source Code

6. Sample Output
rahulaga@sf3:/stu/stu99/rahulaga/3211/project[641]$ /home/course/cs3211/mpi/bin/mpirun -np 3 proj
*********************************************
Minimum 2 process, However for bigger files may give error if few processes(Buffer Limit)
Max processes currently set to 20
Max String size currently set to 50000
0 means Not Found
1 mean Found
2 means found at End of Block
3 means found at Start of Block
*********************************************
TIME: tstartup=0.00455610
TIME: tcomm1=0.00052310
Result on process id 0 is 0 TIME: tcomp0=0.00012215
TIME: tcomm2=0.00034410
Overall Final Result::Word CRYPTOSPORIDIUM found in text!
TIME: tcompmaster=0.00000210
TIME TOTAL: tstartup+tcomm1+(tcomp0+tcomp1....+tcompn)+tcomm2+tcompmaster
Result on process id 2 is 1 TIME: tcomp2=0.00006715
Result on process id 1 is 1 TIME: tcomp1=0.00005815
rahulaga@sf3:/stu/stu99/rahulaga/3211/project[644]$ /home/course/cs3211/mpi/bin/mpirun -np 7 proj
*********************************************
Minimum 2 process, However for bigger files may give error if few processes(Buffer Limit)
Max processes currently set to 20
Max String size currently set to 50000
0 means Not Found
1 mean Found
2 means found at End of Block
3 means found at Start of Block
*********************************************
TIME: tstartup=0.00426610
TIME: tcomm1=0.00090210
Result on process id 0 is 0 TIME: tcomp0=0.00007415
TIME: tcomm2=0.00082210
Overall Final Result::Word CRYPTOSPORIDIUM found in text!
TIME: tcompmaster=0.00000210
TIME TOTAL: tstartup+tcomm1+(tcomp0+tcomp1....+tcompn)+tcomm2+tcompmaster
Result on process id 2 is 0 TIME: tcomp2=0.00007315
Result on process id 6 is 1 TIME: tcomp6=0.00003915
Result on process id 3 is 1 TIME: tcomp3=0.00003315
Result on process id 4 is 0 TIME: tcomp4=0.00007815
Result on process id 5 is 0 TIME: tcomp5=0.00007015
Result on process id 1 is 0 TIME: tcomp1=0.00007415


Parallel Computing - MPI, Image Processor

Images – Sample Output

Original Image:




Sample Translation (parameter 54)



Sample Zoom r>1 (parameters 65 65 2)



Sample Zoom r<1 (parameters 65 65 0.5)



Source Code