This chapter describes the information found in the optional KAP Fortran/OpenMP listing file and the messages KAP produces. To help you understand its actions, KAP lists the optimizations it performed and provides explanations for the places where no optimization was done.
For example, if three loops could have been optimized but KAP optimized only the one it determined most profitable, the listing file will contain notes giving reasons for the choices. Similarly, a small DO loop is often left unchanged because it runs faster in that form. Such situations can produce unexpected but correct code, so KAP produces an annotated listing to explain its output. The listing may also identify places where the use of directives or assertions may improve KAP effectiveness.
Section 9.1 presents the optional information selected by the -listoptions command-line switch. Section 9.1.1 shows an annotated listing of the original and transformed program. An introduction to the diagnostic messages that KAP can generate ends the chapter. Appendix F contains the main listing of KAP diagnostic messages.
9.1 Listing Switches
The -listoptions command-line switch tells KAP what information to include in the listing and error files. The listing and error files can contain any combination of the following messages about the optimizations performed, identified by the single-letter switches listed. The following sections present examples of the output selected by these switches.
See the -cmpoptions command-line switch for the optional information in the transformed code file.
The following examples used Compaq KAP Fortran/OpenMP for Tru64 UNIX default switch values, except for -listoptions=cklnopst.
The following sections explain the format of these listings.
9.1.1 Original Program Listing (O)
The o switch requests an annotated listing of the original program, for example:
KAP/Tru64_U_F90 4.4 k340504 20010517 ATIMESB Source 01-Sep-2001 09:31:22
Page 1
Footnotes Actions DO Loops Line
1
2 C Simple Matrix Multiply example.
3
4 PROGRAM ATIMESB
5
6 PARAMETER M=500, N=400, P=500
7
8 DIMENSION A(1:M,1:N),B(1:N,1:P),C(1:M,1:P)
9
10
11 C Initialize the matrices
12
1 +---------13 DO 10 J=1,N
2 SO !+--------14 DO 10 I=1,M
SO !* 15 A(I,J) = 1.5
!*________16 10 CONTINUE
17
1 +---------18 DO 20 J=1,P
2 SO !+--------19 DO 20 I=1,N
SO !* 20 B(I,J) = 3.0
!*________21 20 CONTINUE
22
23 C Compute C = A * B
24
SO 25 CALL MATMUL(A, M, B, N, C, P)
26 END
Abbreviations Used
SO scalar optimization
Footnote List
1: not vectorized Not an inner loop.
2: scalar optimization Loop unrolled 4 times to improve scalar performance.
KAP/Tru64_U_F90 4.4 k340504 20010517 ATIMESB Source 01-Sep-2001 09:31:22
Page 2
Footnotes Actions DO Line
Loops
27
28
29 SUBROUTINE MATMUL(A, LDA, B, LDB, C, LL)
30 REAL A(LDA,LDB), B(LDB,LL), C(LDA,LL)
31 INTEGER LDA,LDB,LL
32
1 2 3 4 5 6 NO SO +----33 DO 20 J=1,LL
1 3 4 5 6 NO LR SO !+---34 DO 20 I=1,LDA
SO !* 35 C(I,J) =0.0
1 5 6 7 NO LR SO INF !*+--36 DO 20 K=1,LDB
8 DD SO !*! 37 C(I,J) = C(I,J) + ( A(I,K) * B(K,J) )
!*!__38 20 CONTINUE
39
40 RETURN
41
42 END
Abbreviations Used
NO not optimized
LR loop reordering
DD data dependence
SO scalar optimization
INF informational
Footnote List
1: not optimized Loop was asserted serial by directive.
2: not vectorized Not an inner loop.
3: scalar optimization Cleanup-loop for loop unrolling added.
4: scalar optimization Loop unrolled 3 times to improve scalar performance.
5: scalar optimization Strip loop for strip mining with block size 24.
6: scalar optimization Block loop for strip mining with block size 24.
7: informational Unrolling of this loop was not done because
heuristic says size is ok as is.
8: data dependence Data dependence involving this line due to variable C.
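The unrolling footnotes in the listing above ("Loop unrolled 4 times" and "Cleanup-loop for loop unrolling added") can be read with a hand-written sketch in mind. The following is a minimal illustration, not KAP output: it shows a loop unrolled 4 times, followed by a cleanup loop for the iterations left over when the trip count is not a multiple of 4. The routine name FILL4 and the variable NMAIN are hypothetical, and placing the cleanup loop after the main loop is an assumption; KAP may arrange its generated code differently.

```fortran
      SUBROUTINE FILL4(A, N, V)
      INTEGER N
      REAL A(N), V
      INTEGER I, NMAIN
! Largest multiple of 4 that does not exceed N.
      NMAIN = N - MOD(N, 4)
! Main loop, unrolled 4 times: four assignments per iteration.
      DO I = 1, NMAIN, 4
         A(I) = V
         A(I + 1) = V
         A(I + 2) = V
         A(I + 3) = V
      END DO
! Cleanup loop: handles the 0 to 3 elements left over.
      DO I = NMAIN + 1, N
         A(I) = V
      END DO
      RETURN
      END
```

With N = 7, the main loop runs once (elements 1 through 4) and the cleanup loop assigns elements 5 through 7.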
The calling tree is listed after all program units have been compiled. Each program unit's calling tree consists of the SUBROUTINEs and FUNCTIONs called in that program unit. A listing of variables and arrays used (both from the original source program and in code which KAP added) precedes the calling information.
After the cross-reference and calling tree information for the last program unit in the file, the calling tree information for the entire source file is summarized, for example:
CALL SUMMARY TABLE
CROSS REFERENCE TABLE
Name Type Class Storage
-----------------------------------------------------------------------------
A s.REAL Array
LDA s.INT Var
B s.REAL Array
LDB s.INT Var
C s.REAL Array
LL s.INT Var
J s.INT Var
I s.INT Var
K s.INT Var
II1 s.INT Var
II2 s.INT Var
II3 s.INT Var
II4 s.INT Var
II5 s.INT Var
.
.
.
RR1 s.REAL Var
RR2 s.REAL Var
RR3 s.REAL Var
RR4 s.REAL Var
RR5 s.REAL Var
RR6 s.REAL Var
RR7 s.REAL Var
.
.
.
Abbreviations used in Source Program References
A = used as actual argument
D = Declared or Defined
M = Contents may get modified
U = Its value is used
CALL SUMMARY TABLE
16-Feb-2002 15:02:20
Calling Tree
line# routines at nest max. aggregate nest
4 program ATIMESB
25 call MATMUL 0 0
29 subroutine MATMUL
Calling Tree
ATIMESB
MATMUL
Code Modules
ATIMESB called from
MATMUL called from ATIMESB
The KAP switches table lists the settings of the command switches related to optimization used for this program unit. Some of the values may be changed within the program unit by using directives. Not all of these switches can be changed by the user. An example of a KAP switches table for an EV6 architecture follows.
KAP/Tru64_U_F90 4.4 k340504 20010517 ATIMESB Source 01-Sep-2001 09:31:22
Page 1
Switches Used for this Program Unit
no aggressive
align_common=8
align_struct=4
arclimit=5000
assume=cel
no blank_padding
cache_prefetch_line_count=0
cacheline=64,64
cachesize=32
chunk=1
cmp=./mat.cmp.f
no cmpoptions
complex=8
no concurrentize
datasave
directives=akpv
no dlines
dpregisters=32
eiifg=20
no escape
fpregisters=32
no freeformat
no fuse
fuselevel=0
no generateh
no hdir
heaplimit=3433
hli=1
no ignoreoptions
no include
no inline
no inline_and_copy
no inline_create
inline_depth=2
no inline_from_files
no inline_from_libraries
inline_looplevel=2
no inline_manual
inline_optimize=0
input=mat.f
integer=4
interchange
interleave
intlog
no ipa
no ipa_create
ipa_depth=2
no ipa_from_files
no ipa_from_libraries
ipa_looplevel=2
no ipa_manual
ipa_optimize=0
no library_calls
limit=10
lines=55
list=mat.out
listingwidth=132
listoptions=klo
logical=4
machine=s
make_sequenced
miifg=500
minconcurrent=1000
no namepartitioning
optimize=5
no onetrip
no parallelio
no parallelrtl
pdefault=safe
processors=1
psyntax=openmp
real=4
no recursion
roundoff=3
save=manual_adjust
scalaroptimize=3
scan=72
scheduling=e
setassociativity=1
no skip
no small_loops
no srlcd
no suppress
no syntax
tablesize=24000000
tune=EV6
no type
unroll=4
unroll2=160
unroll3=1
useh
The loop table shows what KAP did for each DO loop in the program unit. If the loop could not be optimized, a reason is listed. The possible Status entries, with brief explanations, are given under Loop Table Messages. An example of a loop table follows:
KAP/Tru64_U_F90 4.4 k340504 20010517 ATIMESB Loop Summary 01-Sep-2001 09:31:22
Loop Summary
       From  To    Loop   Loop   at    Unroll  Unroll  Iteration
Loop#  line  line  label  index  nest  weight  factor  workload   Status
1      13    16    Do 10  J      1                                not inner loop
2      14    16    Do 10  I      2     3       4                  left as DO loop
3      18    21    Do 20  J      1                                not inner loop
4      19    21    Do 20  I      2     3       4                  left as DO loop
.
.
.
The program unit names, as processed, are printed to the standard error file (stderr), preceded by the source file name. If the source is read from standard input, the source file name is left blank.
9.1.6 Compilation Performance Statistics (P)
The compilation performance statistics list the number of lines in the program unit, the compilation time, the compilation rate in lines per minute, and temporary file use (for large program units or inlining). After all program units have been compiled, the cumulative totals are given, along with the final number of lines in the transformed code file. The following example shows the statistics for each routine, followed by the cumulative totals:
KAP/Tru64_U_F90 4.4 k340504 20010517 ATIMESB Compilation Statistics
01-Sep-2001 09:31:22
Compilation Statistics For the Routine ATIMESB
26 Lines in Program Unit
23 Noncomment Lines in Program Unit
0.13 CPU Time
12000 Lines Per Minute
10615 Non Comment Lines Per Minute
0 Symbol Cache File Writes
0 Symbol Cache File Reads
110 Source Saver File Reads
26 Source Saver File Writes
1 Source Saver File Opens
0 Name Table File Writes
0 Name Table File Reads
Compilation Statistics For the Routine MATMUL
16 Lines in Program Unit
16 Noncomment Lines in Program Unit
0.90 CPU Time
1066 Lines Per Minute
1066 Non Comment Lines Per Minute
0 Symbol Cache File Writes
0 Symbol Cache File Reads
165 Source Saver File Reads
16 Source Saver File Writes
0 Source Saver File Opens
0 Name Table File Writes
0 Name Table File Reads
Cumulative Compilation Statistics
42 Lines in Source File
39 Noncomment Lines in Source File
2 Program Units in Source File
1.03 CPU Time
2446 Lines Per Minute
2271 Non Comment Lines Per Minute
0 Symbol Cache File Writes
0 Symbol Cache File Reads
275 Source Saver File Reads
42 Source Saver File Writes
1 Source Saver File Opens
0 Name Table File Writes
0 Name Table File Reads
259 Lines in Compile File
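The rates in the sample output above are consistent with 60 times the line count divided by the CPU time in seconds, with the fractional part dropped. This relationship is inferred from the sample figures; it is not a documented formula. A minimal sketch of the arithmetic follows. The function name LPM and its arguments are hypothetical, and the CPU time is passed in hundredths of a second (an assumption made here so that the calculation stays in integer arithmetic).

```fortran
      INTEGER FUNCTION LPM(NLINES, NCENTI)
      INTEGER NLINES, NCENTI
! NCENTI is the CPU time in hundredths of a second (0.13 s -> 13).
! Integer division drops the fraction, as the sample figures do.
      IF (NCENTI .LE. 0) THEN
         LPM = 0
      ELSE
         LPM = (6000 * NLINES) / NCENTI
      END IF
      RETURN
      END
```

For example, LPM(26, 13) gives 12000 and LPM(16, 90) gives 1066, which match the Lines Per Minute entries for ATIMESB and MATMUL.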
The summary table shows how many loops appeared in the program unit and how many loops were optimized in different ways, for example:
KAP/Tru64_U_F90 4.4 k340504 20010517 ATIMESB Optimization Summary 01-Sep-2001 09:31:22
4 loops total
2 loops vectorized
2 with inner loop
KAP/Tru64_U_F90 4.4 k340504 20010517 ATIMESB Optimization Summary 01-Sep-2001 09:31:22
3 loops total
1 loops vectorized
2 with scalar directive
The following example shows the annotated transformed program in the listing file. Much of this information is always recorded in the transformed code file, whether or not the user specifies -listoptions=t.
KAP/Tru64_U_F90 4.4 k340504 20010517 ATIMESB Transformed 01-Sep-2001 09:31:22
Page 1
Footnotes Actions DO Line
Loops
1
2 C Simple Matrix Multiply example.
3
4 PROGRAM ATIMESB
5
6 PARAMETER M = 200, N = 300, P = 200
7
8 DIMENSION A(1:200,1:300), B(1:300,1:200), C(1:200,1:200)
I 8 INTEGER II2, II1
I 8 PARAMETER (II2 = 300, II1 = 200)
9
10
11 C Initialize the matrices
12
1 LM +------13 DO 2 J=1,300
2 LM INF !+-----14 DO 2 I=1,200
!! 15 A(I,J) = 1.5
I !!_____16 2 CONTINUE
17
1 LM +------18 DO 3 J=1,200
2 LM INF !+-----19 DO 3 I=1,300
!! 20 B(I,J) = 3.0
I !!_____21 3 CONTINUE
22
23 C Compute C = A * B
24
SO 25 CALL MATMUL (A,(II1),B,(II2),C,(II1))
26 END
Abbreviations Used
LM label modification
SO scalar optimization
I inserted
INF informational
Footnote List
1: not vectorized Not an inner loop.
2: informational Unrolling of this loop was not done because heuristic
says size is ok as is.
KAP/Tru64_U_F90 4.4 k340504 20010517 MATMUL Transformed 01-Sep-2001 09:31:22
Footnotes Actions DO Loops Line
27
28
29 SUBROUTINE MATMUL (A, LDA, B, LDB, C, LL )
30 REAL A(LDA,LDB), B(LDB,LL), C(LDA,LL)
31 INTEGER LDA, LDB, LL
I 31 INTEGER II17, II16
I 31 PARAMETER (II17 = 25, II16 = 1)
I INTEGER II1,II2,II3,II4,II5,II6,II7,II8
X ,II9,II10,II11, II12,II13,II14,II15
I REAL RR1
I 33 II1 = MOD (LL - II16, II17) + II16
I 36 II5 = MOD (LDB - II16, II17) + II16
I 34 II9 = MOD (LDA - II16, II17) + II16
32
1 LM +---------33 DO 2 J=1,LL
2 LM INF !+--------34 DO 2 I=1,LDA
!! 35 C(I,J) = 0.
I !!________38 2 CONTINUE
I 33 II3 = II16
I 33 II2 = II1
3 4 5 I NO SO +---------33 DO 7 II4=II16,LL,II17
I ! 36 II7 = II16
I ! 36 II6 = II5
I ! 36 II15 = II3 + II2 - II16
3 4 5 I NO SO !+--------36 DO 6 II8=II16,LDB,II17
I !! 34 II11 = II16
I !! 34 II10 = II9
I !! 34 II13 = II7 + II6 - II16
3 4 5 I NO SO !!+-------34 DO 5 II12=II16,LDA,II17
!!! 32
I !!! 33 II14 = II11 + II10 - II16
6 LM SO !!!+------33 DO 4 J=II3,II15,II16
6 LM LR SO !!!!+-----34 DO 4 I=II11,II14,II16
I !!!!! 34 RR1 = C(I,J)
2 6 LM LR SO INF!!!!!+----36 DO 3 K=II7,II13,II16
7 !!!!!! 37 RR1 = RR1+(A(I,K) * B(K,J))
I !!!!!!____38 3 CONTINUE
I !!!!! 38 C(I,J) = RR1
I !!!!!_____38 4 CONTINUE
I !!! 38 II11 = II11 + II10
I !!! 38 II10 = II17
I !!!_______38 5 CONTINUE
I !! 38 II7 = II7 + II6
I !! 38 II6 = II17
I !!________38 6 CONTINUE
I ! 38 II3 = II3 + II2
I ! 38 II2 = II17
I !_________38 7 CONTINUE
39
40 RETURN
41
42 END
Abbreviations Used
LM label modification
NO not optimized
LR loop reordering
SO scalar optimization
I inserted
INF informational
Footnote List
1: not vectorized Not an inner loop.
2: informational Unrolling of this loop was not done because
heuristic says size is ok as is.
3: inserted DO loop was inserted here.
4: scalar optimization Block loop for strip mining with block size 25.
5: not optimized Loop was asserted serial by directive.
6: scalar optimization Strip loop for strip mining with block size 25.
7: data dependence Data dependence involving this line due to variable C.
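Footnotes 4 and 6 above distinguish the block loop (the inserted outer loop that steps from strip to strip) from the strip loop (the original loop, restricted to one strip). The following hand-written sketch applies the same pattern to a single loop. It is modeled on the listing, where MOD (LL - II16, II17) + II16 gives the length of the first strip and every later strip has the full block size, but it is an illustration and not KAP output. The names SCALEV, NB, JS, JLEN, and JB are hypothetical; the block size 25 is taken from the listing.

```fortran
      SUBROUTINE SCALEV(X, N, S)
      INTEGER N
      REAL X(N), S
      INTEGER NB, JS, JLEN, JB, J
      PARAMETER (NB = 25)
! Length of the first, possibly short, strip (1 to NB when N > 0).
      JLEN = MOD(N - 1, NB) + 1
      JS = 1
! Block loop: one iteration per strip.
      DO JB = 1, N, NB
! Strip loop: the original loop body over at most NB elements.
         DO J = JS, JS + JLEN - 1
            X(J) = S * X(J)
         END DO
! Every strip after the first has the full block size.
         JS = JS + JLEN
         JLEN = NB
      END DO
      RETURN
      END
```

With N = 51, the first strip covers element 1 only and the next two strips cover 25 elements each, so every element is visited exactly once.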
This section presents KAP annotated listing information. Unless overridden with the -suppress command-line switch, KAP presents the following information in every KAP source (-listoptions=o) or transformed (-listoptions=t) program listing:
The following sections explain the format of these entries.
9.2.1 Line Numbers
A statement in the KAP listing labeled with a line number of 21, for
example, is either the same as line 21 from the original program, or is
derived from line 21. These line numbers are useful when inspecting the
KAP transformed program listing. KAP sometimes generates several lines
of code from a single line of the original program; in that case, each
of the new lines of code is labeled with the same number as that of the
original program. Consequently, lines of the KAP transformed program
listing may be easily related to the lines of the original program
listing. Lines from an INCLUDE file are numbered starting from 1 for
the first line in the included file.
9.2.2 DO Loop Markings
DO loops are graphically displayed in a column headed DO Loops. Brackets mark the extent of each DO loop (up to nest level 10), as shown in the following example:
DO Loops   Line
+---------  5       DO 99 I = 1,1000
*           6       A(I,1) = B(1)
*+--------  7       DO 95 J = 2,1000
*!          8       A(I,J) = B(J)*A(I,J-1)
*!________  9   95  CONTINUE
*_________ 10   99  CONTINUE
A statement that is enclosed by n DO loops has n exclamation marks (!) on that line. Loops that have been optimized in a major way are marked with asterisks (*) instead of exclamation marks in the source listing.
Compaq KAP Fortran/OpenMP for Tru64 UNIX recognizes certain operations, such as matrix multiplication, as basic entities. Frequently, the loops forming such operations will not be marked.
If the Fortran 90 INCLUDE statement is used within a program, the included code appears in the listing file with a plus sign (+) immediately after the line number on each included line. INCLUDEd lines are numbered separately from the lines of the main file. An example of this (from an -lo=o listing) follows:
2        subroutine fr (w)
3        real w
4        INCLUDE 'b.f'
1 +      a = 1.0
2 +      w = w - a
5        return
6        end
Important details about the actions taken by KAP are given in the Footnotes listing. The Footnotes are numbered and printed at the bottom of each program unit under the Footnote List heading. References to the footnotes are displayed in the listing under the column headed Footnotes. An example of a footnote follows.
In the listing the following appears:
13 DD 1790 if (b(i).le.6) ib(j*i) = i+j
At the end of the listing, under the Footnote List heading, is the following:
13: data dependence Data dependence involving this line due to variable IB
In this example, 13 is the footnote number; DD, meaning data dependence, is the class of message issued by KAP; and the if statement on line 1790 is the reference for this footnote.
9.2.5 Syntax Error/Warning Messages
When KAP detects syntax errors, it simply copies the input program unit to the transformed code file with no attempt to optimize the code.
When a program has syntax errors, messages are presented in the source (-listoptions=o) listing, interspersed with the user's code. Syntax errors and warnings are also written to stderr, regardless of whether an original listing is requested. To locate the error messages in the source listing, look for lines beginning with the symbols ###, for example:
Footnotes Actions DO Loops Line
1 SUBROUTINE Z(A, B, N)
2 REAL A(N,N), B(N,N)
+--------3 DO 20 I=1,N
!+-------4 DO 20 J=1,N
!! 5 X = A(I,J)
!! 6 Y = B(I,J)
!!_______7 20 C(I,J) = X + Y
### line(7)
### error %KAP-E-DO_NON_EXE, DO loop ends on a non-executable statement.
### error %KAP-E-STMT_FUNCTION_O, Array not declared or statement
function declared after executable statements.
8 PRINT *, X
9 RETURN
10 END
KAP also may intersperse syntax warning messages with the user's code,
but optimization proceeds. Syntax warnings are for constructs that are
not legal, but whose intent is clear.
9.2.6 Questions Generated by KAP
At times KAP needs additional information on which to base optimization decisions. In these cases KAP may ask a question to indicate what additional information is needed.
The following loop comes from an example for -listoptions=o :
1 2  q so   +--------- 24       DO 135 I = 1,N
3    DD SO  !_________ 25  135  D(IP(I)) = C(I) + 3

3: question    %KAP-I-PERMUTATION_VEC, Is "IP" a permutation vector?
If you know IP is a permutation vector (or, at least, contains no duplicate values), this information can be passed to KAP with the following assertion:
!*$* assert permutation (ip)
      DO 135 I = 1,N
  135 D(IP(I)) = C(I) + 3
This information may enable KAP to optimize the loop or the surrounding
code. See Chapter 6 for information about KAP assertions.
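The permutation assertion is safe only if the index array really contains no duplicate values. As a minimal sketch, assuming the indices are expected to lie in the range 1 through MAXV, a check such as the following could be run before relying on the assertion. The function NODUPS and its arguments are hypothetical and are not part of KAP.

```fortran
      LOGICAL FUNCTION NODUPS(IP, N, MAXV)
      INTEGER N, MAXV
      INTEGER IP(N)
      LOGICAL SEEN(MAXV)
      INTEGER I
      NODUPS = .FALSE.
      DO I = 1, MAXV
         SEEN(I) = .FALSE.
      END DO
      DO I = 1, N
! An out-of-range or repeated index means the assertion is unsafe.
         IF (IP(I) .LT. 1 .OR. IP(I) .GT. MAXV) RETURN
         IF (SEEN(IP(I))) RETURN
         SEEN(IP(I)) = .TRUE.
      END DO
      NODUPS = .TRUE.
      RETURN
      END
```

The function returns .TRUE. only when every element of IP is in range and no value occurs twice, which is the condition the assertion describes.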
9.2.7 Action Summary
Statements that are translated or modified by KAP are identified by abbreviations in the Action Summary listing field. The abbreviations tell which class or classes of messages were issued for each line or statement. KAP lists the meaning of each abbreviation it used at the bottom of the listing. For the DIR class, the class itself usually serves as the message; no detailed message follows. Every other class abbreviation indicates that a detailed message of that class follows.
The abbreviations and meanings of these classes are as follows:
The Loop Table listing ( -listoptions=l ) includes an entry for each loop indicating whether it was optimized, or why it was not. This section lists the possible messages and gives a brief explanation for each. The two most common reasons for a loop to be left serial are that the iterations were not independent (the listing should give a Data Dependence message) and that the loop contained I/O statements.
Appendix F provides a complete list of the diagnostic messages that can appear in the program listing (original or transformed).
In addition to the listing file, some messages (such as for command-line switch errors or missing files) are written to the error file. These are intended to be self-explanatory.