Understanding GEM Console Output
This page explains, for those who are curious, the output shown in the Console View and the contents of the log files that GEM generates.
The casual user has no need to understand in any great detail the output displayed in the GEM Console View or the contents of the generated log files. GEM exists to provide an intuitive, visual interface for debugging and understanding MPI applications and MPI runtime behavior, and to keep the user insulated from the details of trace files. GEM provides a set of views and tools to accomplish this.
    ISP - Insitu Partial Order
    -----------------------------------------
    Command: /home/alan/mpi/MPI_AnySrc/gem/MPI_AnySrc.gem
    Number Procs: 3
    Server: Local Socket
    Blocking Sends: Enabled
    FIB: Enabled
    -----------------------------------------
    Started Process: 22044
    (0) is alive on formal
    (2) is alive on formal
    INTERLEAVING :1
    (1) is alive on formal
    -----------------------------------------
    Transition list for 0
    0 o=1 i=0 rank=0 Barrier /home/alan/mpi/MPI_AnySrcCanDeadlock/src/MPI_AnySrc.c:31 count=1{[0, 1][0, 2]} {}
    1 o=4 i=1 rank=0 Irecv /home/alan/mpi/MPI_AnySrcCanDeadlock/src/MPI_AnySrc.c:37 src=-1 rtag=0 count=1{[0, 2]} {} Matched [1,1]
    2 o=7 i=2 rank=0 Recv /home/alan/mpi/MPI_AnySrcCanDeadlock/src/MPI_AnySrc.c:39 src=1 rtag=0{} {}
    Transition list for 1
    0 o=2 i=0 rank=1 Barrier /home/alan/mpi/MPI_AnySrcCanDeadlock/src/MPI_AnySrc.c:31 count=1{[1, 1]} {}
    1 o=5 i=1 rank=1 Send /home/alan/mpi/MPI_AnySrcCanDeadlock/src/MPI_AnySrc.c:50 dest=0 stag=0{[1, 2]} {} Matched [0,1]
    2 o=8 i=2 rank=1 Recv /home/alan/mpi/MPI_AnySrcCanDeadlock/src/MPI_AnySrc.c:52 src=0 rtag=0{} {}
    Transition list for 2
    0 o=3 i=0 rank=2 Barrier /home/alan/mpi/MPI_AnySrcCanDeadlock/src/MPI_AnySrc.c:31 count=1{[2, 1]} {}
    1 o=6 i=1 rank=2 Send /home/alan/mpi/MPI_AnySrcCanDeadlock/src/MPI_AnySrc.c:58 dest=0 stag=0{} {}
    No matching MPI call found!
    Detected a DEADLOCK in interleaving 1
    -----------------------------------------
    Started Process: 22048
    (1) is alive on formal
    (0) is alive on formal
    (2) is alive on formal
    INTERLEAVING :2
    (1) Finished normally
    (0) Finished normally
    (2) Finished normally
    -----------------------------------------
    ISP detected deadlock!!!
    Total Explored Interleavings: 2
    Interleaving Exploration Mode: All Relevant Interleavings
    -----------------------------------------
    List of Irrelevant Barriers - If want to remove, remove a complete match set:
    Match Set: 1
    0: /home/alan/mpi/MPI_AnySrcCanDeadlock/src/MPI_AnySrc.c:31
    1: /home/alan/mpi/MPI_AnySrcCanDeadlock/src/MPI_AnySrc.c:31
    2: /home/alan/mpi/MPI_AnySrcCanDeadlock/src/MPI_AnySrc.c:31
    Match Set: 2
    0: /home/alan/mpi/MPI_AnySrcCanDeadlock/src/MPI_AnySrc.c:61
    1: /home/alan/mpi/MPI_AnySrcCanDeadlock/src/MPI_AnySrc.c:61
    2: /home/alan/mpi/MPI_AnySrcCanDeadlock/src/MPI_AnySrc.c:61
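For readers who do want to pull the headline results out of saved console text, the following is a minimal sketch rather than anything GEM itself provides. It assumes the console text has been captured into a string and that the two phrases it searches for appear exactly as in the sample output above; the function name `summarize_console` is hypothetical.

```python
import re


def summarize_console(text):
    """Collect the deadlocked interleavings and the explored-interleaving count.

    Assumption: the wording of both messages matches the sample console output.
    """
    deadlocks = [
        int(n)
        for n in re.findall(r"Detected a DEADLOCK in interleaving (\d+)", text)
    ]
    total = re.search(r"Total Explored Interleavings:\s*(\d+)", text)
    return {
        "deadlocked_interleavings": deadlocks,
        "total_interleavings": int(total.group(1)) if total else None,
    }


# A few phrases copied from the sample console output.
sample = (
    "No matching MPI call found! "
    "Detected a DEADLOCK in interleaving 1 "
    "ISP detected deadlock!!! "
    "Total Explored Interleavings: 2"
)
summary = summarize_console(sample)
print(summary)
```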
As with the GEM Console output, users are not expected to understand the log files. The best way to understand what a log file's contents represent is to run the Happens Before (HB) Viewer, which displays the information graphically. The first line of the log file is a single number giving how many processes were used to create the file. It is followed by a list of every MPI call the program issued, along with information about how each call interacts with other MPI calls. If a deadlock is found, progress halts for that particular interleaving, and the log contains a line giving the interleaving number and the word "DEADLOCK". A program can have more than one deadlock, but each deadlock is contained in its own interleaving (i.e., an interleaving can have at most one deadlock).
    3
    1 0 0 1 1 Barrier 0_0:1:2: { 1 2 } { [ 1 1 ] [ 2 1 ] } Match: -1 -1 File: 98 /home/alan/mpi/MPI_AnySrc/src/MPI_AnySrc.c 31
    1 0 1 4 5 Irecv -1 0 0_0:1:2: { 2 } { [ 1 2 ] } Match: 1 1 File: 98 /home/alan/mpi/MPI_AnySrc/src/MPI_AnySrc.c 37
    1 0 2 7 -1 Recv 1 0 0_0:1:2: { } { } Match: -1 -1 File: 98 /home/alan/mpi/MPI_AnySrc/src/MPI_AnySrc.c 39
    1 1 0 2 2 Barrier 0_0:1:2: { 1 } { [ 0 1 ] [ 0 2 ] [ 2 1 ] } Match: -1 -1 File: 98 /home/alan/mpi/MPI_AnySrc/src/MPI_AnySrc.c 31
    1 1 1 5 4 Send 0 0 0_0:1:2: { 2 } { [ 0 2 ] } Match: 0 1 File: 98 /home/alan/mpi/MPI_AnySrc/src/MPI_AnySrc.c 50
    1 1 2 8 -1 Recv 0 0 0_0:1:2: { } { } Match: -1 -1 File: 98 /home/alan/mpi/MPI_AnySrc/src/MPI_AnySrc.c 52
    1 2 0 3 3 Barrier 0_0:1:2: { 1 } { [ 0 1 ] [ 0 2 ] [ 1 1 ] } Match: -1 -1 File: 98 /home/alan/mpi/MPI_AnySrc/src/MPI_AnySrc.c 31
    1 2 1 6 -1 Send 0 0 0_0:1:2: { } { } Match: -1 -1 File: 98 /home/alan/mpi/MPI_AnySrc/src/MPI_AnySrc.c 58
    1 DEADLOCK
    2 0 0 1 6 Barrier 0_0:1:2: { 1 2 } { [ 1 0 ] [ 2 0 ] [ 1 1 ] [ 0 0 ] [ 2 1 ] } Match: -1 -1 File: 98 /home/alan/mpi/MPI_AnySrc/src/MPI_AnySrc.c 31
    2 0 1 4 10 Irecv -1 0 0_0:1:2: { 2 5 } { [ 2 1 ] [ 2 2 ] } Match: 2 1 File: 98 /home/alan/mpi/MPI_AnySrc/src/MPI_AnySrc.c 37
    2 0 2 7 12 Recv 1 0 0_0:1:2: { 3 } { [ 1 1 ] [ 1 2 ] [ 0 2 ] } Match: 1 1 File: 98 /home/alan/mpi/MPI_AnySrc/src/MPI_AnySrc.c 39
    2 0 3 17 13 Send 1 0 0_0:1:2: { 4 } { [ 1 2 ] [ 1 3 ] [ 0 3 ] } Match: 1 2 File: 98 /home/alan/mpi/MPI_AnySrc/src/MPI_AnySrc.c 41
    2 0 4 19 16 Recv -1 0 0_0:1:2: { 5 } { [ 1 3 ] [ 1 4 ] } Match: 1 3 File: 98 /home/alan/mpi/MPI_AnySrc/src/MPI_AnySrc.c 44
    2 0 5 21 17 Wait { 6 } { } Match: -1 -1 File: 98 /home/alan/mpi/MPI_AnySrc/src/MPI_AnySrc.c 46
    2 0 6 23 18 Barrier 0_0:1:2: { 7 } { [ 1 4 ] [ 2 2 ] [ 1 5 ] [ 0 6 ] [ 2 3 ] } Match: -1 -1 File: 98 /home/alan/mpi/MPI_AnySrc/src/MPI_AnySrc.c 61
    2 0 7 24 21 Finalize { } { [ 1 5 ] [ 2 3 ] [ 0 7 ] } Match: -1 -1 File: 98 /home/alan/mpi/MPI_AnySrc/src/MPI_AnySrc.c 63
    2 1 0 2 7 Barrier 0_0:1:2: { 1 } { [ 0 0 ] [ 2 0 ] [ 0 1 ] [ 0 2 ] [ 1 0 ] [ 2 1 ] } Match: -1 -1 File: 98 /home/alan/mpi/MPI_AnySrc/src/MPI_AnySrc.c 31
    2 1 1 5 11 Send 0 0 0_0:1:2: { 2 } { [ 0 2 ] [ 0 3 ] [ 1 1 ] } Match: 0 2 File: 98 /home/alan/mpi/MPI_AnySrc/src/MPI_AnySrc.c 50
    2 1 2 18 14 Recv 0 0 0_0:1:2: { 3 } { [ 0 3 ] [ 0 4 ] [ 1 2 ] } Match: 0 3 File: 98 /home/alan/mpi/MPI_AnySrc/src/MPI_AnySrc.c 52
    2 1 3 20 15 Send 0 0 0_0:1:2: { 4 } { [ 0 5 ] [ 1 3 ] } Match: 0 4 File: 98 /home/alan/mpi/MPI_AnySrc/src/MPI_AnySrc.c 54
    2 1 4 22 19 Barrier 0_0:1:2: { 5 } { [ 0 6 ] [ 2 2 ] [ 0 7 ] [ 1 4 ] [ 2 3 ] } Match: -1 -1 File: 98 /home/alan/mpi/MPI_AnySrc/src/MPI_AnySrc.c 61
    2 1 5 25 22 Finalize { } { [ 0 7 ] [ 2 3 ] [ 1 5 ] } Match: -1 -1 File: 98 /home/alan/mpi/MPI_AnySrc/src/MPI_AnySrc.c 63
    2 2 0 3 8 Barrier 0_0:1:2: { 1 } { [ 0 0 ] [ 1 0 ] [ 0 1 ] [ 0 2 ] [ 2 0 ] [ 1 1 ] } Match: -1 -1 File: 98 /home/alan/mpi/MPI_AnySrc/src/MPI_AnySrc.c 31
    2 2 1 6 9 Send 0 0 0_0:1:2: { 2 } { [ 0 2 ] [ 0 5 ] [ 2 1 ] } Match: 0 1 File: 98 /home/alan/mpi/MPI_AnySrc/src/MPI_AnySrc.c 58
    2 2 2 16 20 Barrier 0_0:1:2: { 3 } { [ 0 6 ] [ 1 4 ] [ 0 7 ] [ 2 2 ] [ 1 5 ] } Match: -1 -1 File: 98 /home/alan/mpi/MPI_AnySrc/src/MPI_AnySrc.c 61
    2 2 3 26 23 Finalize { } { [ 0 7 ] [ 1 5 ] [ 2 3 ] } Match: -1 -1 File: 98 /home/alan/mpi/MPI_AnySrc/src/MPI_AnySrc.c 63
    ...
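The overall layout described above (a process count, then one record per MPI call, with a "DEADLOCK" line closing any interleaving that deadlocked) can be walked with a few lines of Python. This is only an illustrative sketch: it assumes each record sits on its own line in the file, and the function name `scan_log` is hypothetical rather than part of GEM or ISP.

```python
def scan_log(lines):
    """Return (process_count, calls_per_interleaving, deadlocked_interleavings).

    Assumption: one record per line, with the interleaving number first.
    """
    rows = [ln.split() for ln in lines if ln.strip()]
    process_count = int(rows[0][0])  # first line: number of processes
    calls = {}
    deadlocked = []
    for fields in rows[1:]:
        if not fields[0].isdigit():
            continue  # skip anything that is not a record, e.g. a trailing "..."
        interleaving = int(fields[0])
        if len(fields) > 1 and fields[1] == "DEADLOCK":
            deadlocked.append(interleaving)
        else:
            calls[interleaving] = calls.get(interleaving, 0) + 1
    return process_count, calls, deadlocked


# Lines copied from the sample log: the header, one call from interleaving 1,
# its DEADLOCK marker, and one call from interleaving 2.
sample_lines = [
    "3",
    "1 2 1 6 -1 Send 0 0 0_0:1:2: { } { } Match: -1 -1 File: 98 /home/alan/mpi/MPI_AnySrc/src/MPI_AnySrc.c 58",
    "1 DEADLOCK",
    "2 0 0 1 6 Barrier 0_0:1:2: { 1 2 } { [ 1 0 ] [ 2 0 ] [ 1 1 ] [ 0 0 ] [ 2 1 ] } Match: -1 -1 File: 98 /home/alan/mpi/MPI_AnySrc/src/MPI_AnySrc.c 31",
]
result = scan_log(sample_lines)
print(result)
```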
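To understand the format, consider a single line from the generated log file, the Irecv issued by process 0 in the second interleaving:

    2 0 1 4 10 Irecv -1 0 0_0:1:2: { 2 5 } { [ 2 1 ] [ 2 2 ] } Match: 2 1 File: 98 /home/alan/mpi/MPI_AnySrc/src/MPI_AnySrc.c 37

The table below explains each portion in detail: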
Element | Title | Explanation |
---|---|---|
2 | Interleaving Number (1-based) | This MPI call was issued in the second interleaving. |
0 | Process Number (0-based) | This call was issued by process zero. |
1 | Process Call Index (0-based) | The order in which the process sends its call envelopes. This was the second call issued by this process. |
4 | Scheduler Receive Index (1-based) | This was the fourth call received by ISP's scheduler. |
10 | Scheduler Issue Index (1-based) | The order in which MPI calls are issued to the MPI runtime. This was the tenth call performed by ISP. |
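Irecv | MPI Call Name | This call was an MPI Irecv issued by process 0. |
-1 | The Source Index | The rank this receive accepts messages from. A source of -1 denotes a wildcard receive (MPI_ANY_SOURCE), which is why this Irecv matches process 1 in the first interleaving and process 2 in the second. For send calls, this field holds the destination rank instead. |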
0 | MPI Message Tag | The MPI message tag here is 0. |
0_ | MPI Communicator | The MPI communicator this MPI call is a member of. |
0:1:2 | Processes in MPI communicator | Processes 0, 1, and 2 are in this particular MPI communicator (MPI_COMM_WORLD in this example). |
{ 2 5 } | Intra-Process MPI calls blocked | Blocks the MPI calls within this process that have a Process Call Index of 2 and 5, Recv and Wait in this case. |
{ [ 2 1 ] [ 2 2 ] } | Inter-Process MPI calls blocked | Blocks the indicated calls in other processes. The calls are listed as pairs (the numbers between each [ ] form a pair): the first element of the pair is the process that made the MPI call, and the second is its Process Call Index. For example, the first pair tells us that the call from process 2 with a Process Call Index of 1 is blocked. |
Match: 2 1 | Match Rank & Index | The process and Process Call Index of the call this one was matched with. Calls such as Send and Recv list their match; calls without a match (e.g. collectives) show -1 -1. Here the match rank is process 2 and the match index is 1. |
File: 98 | File name length | The length of the file name; in this example it is 98. |
/home/.../.../MPI_AnySrc.c | The path and name of the source code file containing the MPI call | The fully qualified path to the source code file named MPI_AnySrc.c. |
37 | Line number MPI call is found on | This MPI call is found on line 37 of the source code file listed above. |
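The field layout in the table above can be turned into a small parser. The sketch below is an illustration under stated assumptions, not ISP's or GEM's own code: the dictionary keys and the name `parse_record` are hypothetical, the source path is assumed to contain no spaces, and the tokens between the call name and the first brace (source or destination, tag, communicator) are kept as an unparsed list because they vary by call type.

```python
import re

# One log record, following the column order given in the table.
RECORD = re.compile(
    r"^(?P<interleaving>\d+) (?P<process>\d+) (?P<call_index>\d+) "
    r"(?P<recv_index>-?\d+) (?P<issue_index>-?\d+) (?P<call>\w+) "
    r"(?P<args>[^{]*)\{(?P<intra>[^}]*)\} \{(?P<inter>[^}]*)\} "
    r"Match: (?P<match_rank>-?\d+) (?P<match_index>-?\d+) "
    r"File: (?P<name_len>\d+) (?P<path>\S+) (?P<line>\d+)$"
)
INT_FIELDS = (
    "interleaving", "process", "call_index", "recv_index", "issue_index",
    "match_rank", "match_index", "name_len", "line",
)


def parse_record(text):
    """Parse one call record into a dict (key names are illustrative)."""
    m = RECORD.match(" ".join(text.split()))  # normalize whitespace first
    if m is None:
        raise ValueError(f"unrecognized record: {text!r}")
    rec = {key: int(m[key]) for key in INT_FIELDS}
    rec["call"] = m["call"]
    rec["args"] = m["args"].split()  # e.g. source, tag, communicator
    rec["intra_blocked"] = [int(x) for x in m["intra"].split()]
    nums = [int(x) for x in re.findall(r"-?\d+", m["inter"])]
    rec["inter_blocked"] = list(zip(nums[0::2], nums[1::2]))  # (process, index)
    rec["path"] = m["path"]
    return rec


sample = (
    "2 0 1 4 10 Irecv -1 0 0_0:1:2: { 2 5 } { [ 2 1 ] [ 2 2 ] } "
    "Match: 2 1 File: 98 /home/alan/mpi/MPI_AnySrc/src/MPI_AnySrc.c 37"
)
record = parse_record(sample)
print(record["call"], record["intra_blocked"], record["inter_blocked"])
```

Records for calls such as Wait and Finalize carry no source, tag, or communicator, so their `args` list comes back empty.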
School of Computing * 50 S. Central Campus Dr. Rm. 3190 * Salt Lake City, UT
84112 * isp-dev@cs.utah.edu