author | Tobias Grosser <tobias@grosser.es> | 2017-06-12 15:21:47 +0300 |
---|---|---|
committer | Tobias Grosser <tobias@grosser.es> | 2017-06-12 15:21:47 +0300 |
commit | 2531a5d8274dbe9bb46e71aa02aca7525515cff1 | |
tree | c9fa6e4294af8b4954e105fcba0de373f2f291ae /polly/www | |
parent | bccaea57c00d5227830e3c296636d2b54c080a92 | |
[www] Remove outdated documentation
Remove examples 'load_Polly_into_clang' and 'manual_matmul'. This information is
now available in our SPHINX docs (*).
(*) Thanks to Singapuram Sanjay Srivallabh <singapuram.sanjay@gmail.com> who
contributed the SPHINX docs update!
llvm-svn: 305186
Diffstat (limited to 'polly/www')
-rw-r--r-- | polly/www/example_load_Polly_into_clang.html | 143 |
-rw-r--r-- | polly/www/example_manual_matmul.html | 452 |
-rw-r--r-- | polly/www/examples.html | 22 |
3 files changed, 0 insertions, 617 deletions
diff --git a/polly/www/example_load_Polly_into_clang.html b/polly/www/example_load_Polly_into_clang.html deleted file mode 100644 index fdf3132a01cf..000000000000 --- a/polly/www/example_load_Polly_into_clang.html +++ /dev/null @@ -1,143 +0,0 @@ -<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" - "http://www.w3.org/TR/html4/strict.dtd"> -<!-- Material used from: HTML 4.01 specs: http://www.w3.org/TR/html401/ --> -<html> -<head> - <META http-equiv="Content-Type" content="text/html; charset=ISO-8859-1"> - <title>Polly - Load Polly into clang</title> - <link type="text/css" rel="stylesheet" href="menu.css"> - <link type="text/css" rel="stylesheet" href="content.css"> -</head> -<body> -<div id="box"> -<!--#include virtual="menu.html.incl"--> -<div id="content"> -<!--=====================================================================--> -<h1>Load Polly into clang and automatically run it at -O3</h1> -<!--=====================================================================--> - -<p><b>Warning:</b> Even though this example makes it very easy to use Polly, -you should be aware that Polly is a young research project. It is expected -to crash, produce invalid code or to hang in complex calculations even for -simple examples. In case you see such a problem, please check the <a -href="bugs.html">Bug database</a> and consider reporting a bug. -<p> -<b>Warning II:</b> clang/LLVM/Polly need to be in sync. This means - you need to compile them yourself from a recent svn/git checkout</b> -<h2>Load Polly into clang</h2> - -By default Polly is configured as a shared library plugin that is loaded in -tools like clang, opt, and bugpoint when they start their execution. - -By loading Polly into clang (or opt) the Polly options become automatically -available. You can load Polly either by adding the relevant commands to -the CPPFLAGS or by creating an alias. 
- -<pre class="code"> -$ export CPPFLAGS="-Xclang -load -Xclang ${POLLY_BUILD_DIR}/lib/LLVMPolly.so" -</pre> - -or -<pre class="code"> -$ alias pollycc clang -Xclang -load -Xclang ${POLLY_BUILD_DIR}/lib/LLVMPolly.so -</pre> - -To avoid having to load Polly in the tools, Polly can optionally be configured -with cmake to be statically linked in the tools: - -<pre class="code"> -$ cmake -D LINK_POLLY_INTO_TOOLS:Bool=ON -</pre> - -<h2>Optimizing with Polly</h2> - -Optimizing with Polly is as easy as adding <b>-O3 -mllvm -polly</b> to your -compiler flags (Polly is only available at -O3). - -<pre class="code">pollycc -O3 -mllvm -polly file.c</pre> - -<h2>Automatic OpenMP code generation</h2> - -To automatically detect parallel loops and generate OpenMP code for them you -also need to add <b>-mllvm -polly-parallel -lgomp</b> to your CFLAGS. - -<pre class="code">pollycc -O3 -mllvm -polly -mllvm -polly-parallel -lgomp file.c</pre> - -<h2>Automatic Vector code generation</h2> - -Automatic vector code generation can be enabled by adding <b>-mllvm --polly-vectorizer=stripmine</b> to your CFLAGS. - -<pre class="code">pollycc -O3 -mllvm -polly -mllvm -polly-vectorizer=stripmine file.c</pre> - -<h2>Extract a preoptimized LLVM-IR file</h2> - -Often it is useful to derive from a C-file the LLVM-IR code that is actually -optimized by Polly. Normally the LLVM-IR is automatically generated from -the C code by first lowering C to LLVM-IR (clang) and by subsequently applying a -set of preparing transformations on the LLVM-IR. To get the LLVM-IR after the -preparing transformations have been applied run Polly with '-O0'. - -<pre class="code">pollycc -O0 -mllvm -polly -S -emit-llvm file.c</pre> - -<h2>Further options</h2> - -Polly supports further options that are mainly useful for the development or -the -analysis of Polly. The relevant options can be added to clang by appending -<b>-mllvm -option-name</b> to the CFLAGS or the clang -command line. 
- -<h3>Limit Polly to a single function</h3> -To limit the execution of Polly to a single function, use the option -<b>-polly-only-func=functionname</b>. - -<h3>Disable LLVM-IR generation</h3> -Polly normally regenerates LLVM-IR from the Polyhedral representation. To only -see the effects of the preparing transformation, but to disable Polly code -generation add the option <b>polly-no-codegen</b>. - -<h3>Graphical view of the SCoPs</h3> - -Polly can use graphviz to show the SCoPs it detects in a program. The relevant -options are <b>-polly-show</b>, <b>-polly-show-only</b>, <b>-polly-dot</b> and -<b>-polly-dot-only</b>. The 'show' options automatically run dotty or another -graphviz viewer to show the scops graphically. The 'dot' options store for each -function a dot file that highlights the detected SCoPs. If 'only' is appended at -the end of the option, the basic blocks are shown without the statements the -contain. - -<h3>Change/Disable the Optimizer</h3> -Polly uses by default the isl scheduling optimizer. The isl optimizer optimizes -for data-locality and parallelism using the <a -href="http://pluto-compiler.sf.net">Pluto</a> algorithm. For research it is also -possible to run <a -href="http://www-rocq.inria.fr/~pouchet/software/pocc/">PoCC</a> as external -optimizer. PoCC provides access to the original Pluto implementation. To use -PoCC add <b>-polly-optimizer=pocc</b> to the command line (only available if -Polly was compiled with scoplib support) [removed after <a href="http://llvm.org/releases/download.html#3.4.2">LLVM 3.4.2</a>]. -To disable the optimizer entirely use the option <b>-polly-optimizer=none</b>. - -<h3>Disable tiling in the optimizer</h3> -By default both optimizers perform tiling, if possible. In case this is not -wanted the option <b>-polly-tiling=false</b> can be used to disable it. (This -option disables tiling for both optimizers). 
- -<h3>Ignore possible aliasing</h3> -By default we only detect scops, if we can prove that the different array bases -can not alias. This is correct do if we optimize automatically. However, -without special user annotations like 'restrict' we can often not prove that -no aliasing is possible. In case the user knows no aliasing can happen in the -code the <b>-polly-ignore-aliasing</b> can be used to disable the check for -possible aliasing. - -<h3>Import / Export</h3> -The flags <b>-polly-import</b> and <b>-polly-export</b> allow the export and -reimport of the polyhedral representation. By exporting, modifying and -reimporting the polyhedral representation externally calculated transformations -can be applied. This enables external optimizers or the manual optimization of -specific SCoPs. -</div> -</div> -</body> -</html> diff --git a/polly/www/example_manual_matmul.html b/polly/www/example_manual_matmul.html deleted file mode 100644 index adac73167b33..000000000000 --- a/polly/www/example_manual_matmul.html +++ /dev/null @@ -1,452 +0,0 @@ -<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" - "http://www.w3.org/TR/html4/strict.dtd"> -<!-- Material used from: HTML 4.01 specs: http://www.w3.org/TR/html401/ --> -<html> -<head> - <META http-equiv="Content-Type" content="text/html; charset=ISO-8859-1"> - <title>Polly - Examples</title> - <link type="text/css" rel="stylesheet" href="menu.css"> - <link type="text/css" rel="stylesheet" href="content.css"> -</head> -<body> -<div id="box"> -<!--#include virtual="menu.html.incl"--> -<div id="content"> -<!--=====================================================================--> -<h1>Execute the individual Polly passes manually</h1> -<!--=====================================================================--> - -<p> -This example presents the individual passes that are involved when optimizing -code with Polly. 
We show how to execute them individually and explain for each -which analysis is performed or what transformation is applied. In this example -the polyhedral transformation is user-provided to show how much performance -improvement can be expected by an optimal automatic optimizer.</p> - -The files used and created in this example are available in the Polly checkout -in the folder <em>www/experiments/matmul</em>. They can be created automatically -by running the <em>www/experiments/matmul/runall.sh</em> script. - -<ol> -<li><h4>Create LLVM-IR from the C code</h4> - -Polly works on LLVM-IR. Hence it is necessary to translate the source files into -LLVM-IR. If more than on file should be optimized the files can be combined into -a single file with llvm-link. - -<pre class="code">clang -S -emit-llvm matmul.c -o matmul.s</pre> -</li> - - -<li><h4>Load Polly automatically when calling the 'opt' tool</h4> - -Polly is not built into opt or bugpoint, but it is a shared library that needs -to be loaded into these tools explicitally. The Polly library is called -LVMPolly.so. It is available in the build/lib/ directory. For convenience we create -an alias that automatically loads Polly if 'opt' is called. -<pre class="code"> -export PATH_TO_POLLY_LIB="~/polly/build/lib/" -alias opt="opt -load ${PATH_TO_POLLY_LIB}/LLVMPolly.so"</pre> -</li> - -<li><h4>Prepare the LLVM-IR for Polly</h4> - -Polly is only able to work with code that matches a canonical form. To translate -the LLVM-IR into this form we use a set of canonicalication passes. They are -scheduled by using '-polly-canonicalize'. -<pre class="code">opt -S -polly-canonicalize matmul.s > matmul.preopt.ll</pre></li> - -<li><h4>Show the SCoPs detected by Polly (optional)</h4> - -To understand if Polly was able to detect SCoPs, we print the -structure of the detected SCoPs. In our example two SCoPs were detected. One in -'init_array' the other in 'main'. 
- -<pre class="code">opt -basicaa -polly-ast -analyze -q matmul.preopt.ll</pre> - -<pre> -init_array(): -for (c2=0;c2<=1023;c2++) { - for (c4=0;c4<=1023;c4++) { - Stmt_5(c2,c4); - } -} - -main(): -for (c2=0;c2<=1023;c2++) { - for (c4=0;c4<=1023;c4++) { - Stmt_4(c2,c4); - for (c6=0;c6<=1023;c6++) { - Stmt_6(c2,c4,c6); - } - } -} -</pre> -</li> -<li><h4>Highlight the detected SCoPs in the CFGs of the program (requires graphviz/dotty)</h4> - -Polly can use graphviz to graphically show a CFG in which the detected SCoPs are -highlighted. It can also create '.dot' files that can be translated by -the 'dot' utility into various graphic formats. - -<pre class="code">opt -basicaa -view-scops -disable-output matmul.preopt.ll -opt -basicaa -view-scops-only -disable-output matmul.preopt.ll</pre> -The output for the different functions<br /> -view-scops: -<a href="experiments/matmul/scops.main.dot.png">main</a>, -<a href="experiments/matmul/scops.init_array.dot.png">init_array</a>, -<a href="experiments/matmul/scops.print_array.dot.png">print_array</a><br /> -view-scops-only: -<a href="experiments/matmul/scopsonly.main.dot.png">main</a>, -<a href="experiments/matmul/scopsonly.init_array.dot.png">init_array</a>, -<a href="experiments/matmul/scopsonly.print_array.dot.png">print_array</a> -</li> - -<li><h4>View the polyhedral representation of the SCoPs</h4> -<pre class="code">opt -basicaa -polly-scops -analyze matmul.preopt.ll</pre> -<pre> -[...] 
-Printing analysis 'Polly - Create polyhedral description of Scops' for region: -'for.cond => for.end19' in function 'init_array': - Context: - { [] } - Statements { - Stmt_5 - Domain := - { Stmt_5[i0, i1] : i0 >= 0 and i0 <= 1023 and i1 >= 0 and i1 <= 1023 }; - Schedule := - { Stmt_5[i0, i1] -> schedule[0, i0, 0, i1, 0] }; - WriteAccess := - { Stmt_5[i0, i1] -> MemRef_A[1037i0 + i1] }; - WriteAccess := - { Stmt_5[i0, i1] -> MemRef_B[1047i0 + i1] }; - FinalRead - Domain := - { FinalRead[0] }; - Schedule := - { FinalRead[i0] -> schedule[200000000, o1, o2, o3, o4] }; - ReadAccess := - { FinalRead[i0] -> MemRef_A[o0] }; - ReadAccess := - { FinalRead[i0] -> MemRef_B[o0] }; - } -[...] -Printing analysis 'Polly - Create polyhedral description of Scops' for region: -'for.cond => for.end30' in function 'main': - Context: - { [] } - Statements { - Stmt_4 - Domain := - { Stmt_4[i0, i1] : i0 >= 0 and i0 <= 1023 and i1 >= 0 and i1 <= 1023 }; - Schedule := - { Stmt_4[i0, i1] -> schedule[0, i0, 0, i1, 0, 0, 0] }; - WriteAccess := - { Stmt_4[i0, i1] -> MemRef_C[1067i0 + i1] }; - Stmt_6 - Domain := - { Stmt_6[i0, i1, i2] : i0 >= 0 and i0 <= 1023 and i1 >= 0 and i1 <= 1023 and i2 >= 0 and i2 <= 1023 }; - Schedule := - { Stmt_6[i0, i1, i2] -> schedule[0, i0, 0, i1, 1, i2, 0] }; - ReadAccess := - { Stmt_6[i0, i1, i2] -> MemRef_C[1067i0 + i1] }; - ReadAccess := - { Stmt_6[i0, i1, i2] -> MemRef_A[1037i0 + i2] }; - ReadAccess := - { Stmt_6[i0, i1, i2] -> MemRef_B[i1 + 1047i2] }; - WriteAccess := - { Stmt_6[i0, i1, i2] -> MemRef_C[1067i0 + i1] }; - FinalRead - Domain := - { FinalRead[0] }; - Schedule := - { FinalRead[i0] -> schedule[200000000, o1, o2, o3, o4, o5, o6] }; - ReadAccess := - { FinalRead[i0] -> MemRef_C[o0] }; - ReadAccess := - { FinalRead[i0] -> MemRef_A[o0] }; - ReadAccess := - { FinalRead[i0] -> MemRef_B[o0] }; - } -[...] 
-</pre> -</li> - -<li><h4>Show the dependences for the SCoPs</h4> -<pre class="code">opt -basicaa -polly-dependences -analyze matmul.preopt.ll</pre> -<pre>Printing analysis 'Polly - Calculate dependences for SCoP' for region: -'for.cond => for.end19' in function 'init_array': - Must dependences: - { } - May dependences: - { } - Must no source: - { } - May no source: - { } -Printing analysis 'Polly - Calculate dependences for SCoP' for region: -'for.cond => for.end30' in function 'main': - Must dependences: - { Stmt_4[i0, i1] -> Stmt_6[i0, i1, 0] : - i0 >= 0 and i0 <= 1023 and i1 >= 0 and i1 <= 1023; - Stmt_6[i0, i1, i2] -> Stmt_6[i0, i1, 1 + i2] : - i0 >= 0 and i0 <= 1023 and i1 >= 0 and i1 <= 1023 and i2 >= 0 and i2 <= 1022; - Stmt_6[i0, i1, 1023] -> FinalRead[0] : - i1 <= 1091540 - 1067i0 and i1 >= -1067i0 and i1 >= 0 and i1 <= 1023; - Stmt_6[1023, i1, 1023] -> FinalRead[0] : - i1 >= 0 and i1 <= 1023 - } - May dependences: - { } - Must no source: - { Stmt_6[i0, i1, i2] -> MemRef_A[1037i0 + i2] : - i0 >= 0 and i0 <= 1023 and i1 >= 0 and i1 <= 1023 and i2 >= 0 and i2 <= 1023; - Stmt_6[i0, i1, i2] -> MemRef_B[i1 + 1047i2] : - i0 >= 0 and i0 <= 1023 and i1 >= 0 and i1 <= 1023 and i2 >= 0 and i2 <= 1023; - FinalRead[0] -> MemRef_A[o0]; - FinalRead[0] -> MemRef_B[o0] - FinalRead[0] -> MemRef_C[o0] : - o0 >= 1092565 or (exists (e0 = [(o0)/1067]: o0 <= 1091540 and o0 >= 0 - and 1067e0 <= -1024 + o0 and 1067e0 >= -1066 + o0)) or o0 <= -1; - } - May no source: - { } -</pre></li> - -<li><h4>Export jscop files</h4> - -Polly can export the polyhedral representation in so called jscop files. Jscop -files contain the polyhedral representation stored in a JSON file. -<pre class="code">opt -basicaa -polly-export-jscop matmul.preopt.ll</pre> -<pre>Writing SCoP 'for.cond => for.end19' in function 'init_array' to './init_array___%for.cond---%for.end19.jscop'. -Writing SCoP 'for.cond => for.end30' in function 'main' to './main___%for.cond---%for.end30.jscop'. 
-</pre></li> - -<li><h4>Import the changed jscop files and print the updated SCoP structure -(optional)</h4> -<p>Polly can reimport jscop files, in which the schedules of the statements are -changed. These changed schedules are used to descripe transformations. -It is possible to import different jscop files by providing the postfix -of the jscop file that is imported.</p> -<p> We apply three different transformations on the SCoP in the main function. -The jscop files describing these transformations are hand written (and available -in <em>www/experiments/matmul</em>). - -<h5>No Polly</h5> - -<p>As a baseline we do not call any Polly code generation, but only apply the -normal -O3 optimizations.</p> - -<pre class="code"> -opt matmul.preopt.ll -basicaa \ - -polly-import-jscop \ - -polly-ast -analyze -</pre> -<pre> -[...] -main(): -for (c2=0;c2<g;=1535;c2++) { - for (c4=0;c4<g;=1535;c4++) { - Stmt_4(c2,c4); - for (c6=0;c6<g;=1535;c6++) { - Stmt_6(c2,c4,c6); - } - } -} -[...] -</pre> -<h5>Interchange (and Fission to allow the interchange)</h5> -<p>We split the loops and can now apply an interchange of the loop dimensions that -enumerate Stmt_6.</p> -<pre class="code"> -opt matmul.preopt.ll -basicaa \ - -polly-import-jscop -polly-import-jscop-postfix=interchanged \ - -polly-ast -analyze -</pre> -<pre> -[...] -Reading JScop 'for.cond => for.end30' in function 'main' from './main___%for.cond---%for.end30.jscop.interchanged+tiled'. -[...] -main(): -for (c2=0;c2<=1535;c2++) { - for (c4=0;c4<=1535;c4++) { - Stmt_4(c2,c4); - } -} -for (c2=0;c2<=1535;c2++) { - for (c4=0;c4<=1535;c4++) { - for (c6=0;c6<=1535;c6++) { - Stmt_6(c2,c6,c4); - } - } -} -[...] -</pre> -<h5>Interchange + Tiling</h5> -<p>In addition to the interchange we tile now the second loop nest.</p> - -<pre class="code"> -opt matmul.preopt.ll -basicaa \ - -polly-import-jscop -polly-import-jscop-postfix=interchanged+tiled \ - -polly-ast -analyze -</pre> -<pre> -[...] 
-Reading JScop 'for.cond => for.end30' in function 'main' from './main___%for.cond---%for.end30.jscop.interchanged+tiled'. -[...] -main(): -for (c2=0;c2<=1535;c2++) { - for (c4=0;c4<=1535;c4++) { - Stmt_4(c2,c4); - } -} -for (c2=0;c2<=1535;c2+=64) { - for (c3=0;c3<=1535;c3+=64) { - for (c4=0;c4<=1535;c4+=64) { - for (c5=c2;c5<=c2+63;c5++) { - for (c6=c4;c6<=c4+63;c6++) { - for (c7=c3;c7<=c3+63;c7++) { - Stmt_6(c5,c7,c6); - } - } - } - } - } -} -[...] -</pre> -<h5>Interchange + Tiling + Strip-mining to prepare vectorization</h5> -To later allow vectorization we create a so called trivially parallelizable -loop. It is innermost, parallel and has only four iterations. It can be -replaced by 4-element SIMD instructions. -<pre class="code"> -opt matmul.preopt.ll -basicaa \ - -polly-import-jscop -polly-import-jscop-postfix=interchanged+tiled+vector \ - -polly-ast -analyze </pre> - -<pre> -[...] -Reading JScop 'for.cond => for.end30' in function 'main' from './main___%for.cond---%for.end30.jscop.interchanged+tiled+vector'. -[...] -main(): -for (c2=0;c2<=1535;c2++) { - for (c4=0;c4<=1535;c4++) { - Stmt_4(c2,c4); - } -} -for (c2=0;c2<=1535;c2+=64) { - for (c3=0;c3<=1535;c3+=64) { - for (c4=0;c4<=1535;c4+=64) { - for (c5=c2;c5<=c2+63;c5++) { - for (c6=c4;c6<=c4+63;c6++) { - for (c7=c3;c7<=c3+63;c7+=4) { - for (c8=c7;c8<=c7+3;c8++) { - Stmt_6(c5,c8,c6); - } - } - } - } - } - } -} -[...] -</pre> - -</li> - -<li><h4>Codegenerate the SCoPs</h4> -<p> -This generates new code for the SCoPs detected by polly. 
-If -polly-import-jscop is present, transformations specified in the imported -jscop files will be applied.</p> -<pre class="code">opt matmul.preopt.ll | opt -O3 > matmul.normalopt.ll</pre> -<pre class="code"> -opt -basicaa \ - -polly-import-jscop -polly-import-jscop-postfix=interchanged \ - -polly-codegen matmul.preopt.ll \ - | opt -O3 > matmul.polly.interchanged.ll</pre> -<pre> -Reading JScop 'for.cond => for.end19' in function 'init_array' from - './init_array___%for.cond---%for.end19.jscop.interchanged'. -File could not be read: No such file or directory -Reading JScop 'for.cond => for.end30' in function 'main' from - './main___%for.cond---%for.end30.jscop.interchanged'. -</pre> -<pre class="code"> -opt -basicaa \ - -polly-import-jscop -polly-import-jscop-postfix=interchanged+tiled \ - -polly-codegen matmul.preopt.ll \ - | opt -O3 > matmul.polly.interchanged+tiled.ll</pre> -<pre> -Reading JScop 'for.cond => for.end19' in function 'init_array' from - './init_array___%for.cond---%for.end19.jscop.interchanged+tiled'. -File could not be read: No such file or directory -Reading JScop 'for.cond => for.end30' in function 'main' from - './main___%for.cond---%for.end30.jscop.interchanged+tiled'. -</pre> -<pre class="code"> -opt -basicaa \ - -polly-import-jscop -polly-import-jscop-postfix=interchanged+tiled+vector \ - -polly-codegen -polly-vectorizer=polly matmul.preopt.ll \ - | opt -O3 > matmul.polly.interchanged+tiled+vector.ll</pre> -<pre> -Reading JScop 'for.cond => for.end19' in function 'init_array' from - './init_array___%for.cond---%for.end19.jscop.interchanged+tiled+vector'. -File could not be read: No such file or directory -Reading JScop 'for.cond => for.end30' in function 'main' from - './main___%for.cond---%for.end30.jscop.interchanged+tiled+vector'. 
-</pre> -<pre class="code"> -opt -basicaa \ - -polly-import-jscop -polly-import-jscop-postfix=interchanged+tiled+vector \ - -polly-codegen -polly-vectorizer=polly -polly-parallel matmul.preopt.ll \ - | opt -O3 > matmul.polly.interchanged+tiled+openmp.ll</pre> -<pre> -Reading JScop 'for.cond => for.end19' in function 'init_array' from - './init_array___%for.cond---%for.end19.jscop.interchanged+tiled+vector'. -File could not be read: No such file or directory -Reading JScop 'for.cond => for.end30' in function 'main' from - './main___%for.cond---%for.end30.jscop.interchanged+tiled+vector'. -</pre> - -<li><h4>Create the executables</h4> - -Create one executable optimized with plain -O3 as well as a set of executables -optimized in different ways with Polly. One changes only the loop structure, the -other adds tiling, the next adds vectorization and finally we use OpenMP -parallelism. -<pre class="code"> -llc matmul.normalopt.ll -o matmul.normalopt.s && \ - gcc matmul.normalopt.s -o matmul.normalopt.exe -llc matmul.polly.interchanged.ll -o matmul.polly.interchanged.s && \ - gcc matmul.polly.interchanged.s -o matmul.polly.interchanged.exe -llc matmul.polly.interchanged+tiled.ll -o matmul.polly.interchanged+tiled.s && \ - gcc matmul.polly.interchanged+tiled.s -o matmul.polly.interchanged+tiled.exe -llc matmul.polly.interchanged+tiled+vector.ll -o matmul.polly.interchanged+tiled+vector.s && \ - gcc matmul.polly.interchanged+tiled+vector.s -o matmul.polly.interchanged+tiled+vector.exe -llc matmul.polly.interchanged+tiled+vector+openmp.ll -o matmul.polly.interchanged+tiled+vector+openmp.s && \ - gcc -lgomp matmul.polly.interchanged+tiled+vector+openmp.s -o matmul.polly.interchanged+tiled+vector+openmp.exe </pre> - -<li><h4>Compare the runtime of the executables</h4> - -By comparing the runtimes of the different code snippets we see that a simple -loop interchange gives here the largest performance boost. 
However by adding -vectorization and by using OpenMP we can further improve the performance -significantly. -<pre class="code">time ./matmul.normalopt.exe</pre> -<pre>42.68 real, 42.55 user, 0.00 sys</pre> -<pre class="code">time ./matmul.polly.interchanged.exe</pre> -<pre>04.33 real, 4.30 user, 0.01 sys</pre> -<pre class="code">time ./matmul.polly.interchanged+tiled.exe</pre> -<pre>04.11 real, 4.10 user, 0.00 sys</pre> -<pre class="code">time ./matmul.polly.interchanged+tiled+vector.exe</pre> -<pre>01.39 real, 1.36 user, 0.01 sys</pre> -<pre class="code">time ./matmul.polly.interchanged+tiled+vector+openmp.exe</pre> -<pre>00.66 real, 2.58 user, 0.02 sys</pre> -</li> -</ol> - -</div> -</div> -</body> -</html> diff --git a/polly/www/examples.html b/polly/www/examples.html deleted file mode 100644 index c8ceb659f96b..000000000000 --- a/polly/www/examples.html +++ /dev/null @@ -1,22 +0,0 @@ -<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" - "http://www.w3.org/TR/html4/strict.dtd"> -<!-- Material used from: HTML 4.01 specs: http://www.w3.org/TR/html401/ --> -<html> -<head> - <META http-equiv="Content-Type" content="text/html; charset=ISO-8859-1"> - <title>Polly - Examples</title> - <link type="text/css" rel="stylesheet" href="menu.css"> - <link type="text/css" rel="stylesheet" href="content.css"> - <meta http-equiv="REFRESH" - content="0;url=documentation.html"></HEAD> -</head> -<body> -<!--#include virtual="menu.html.incl"--> -<div id="content"> -<!--=====================================================================--> -<h1>Polly: Examples</h1> -<!--=====================================================================--> - -</div> -</body> -</html> |