ProtoFlex   Carnegie Mellon

Differences

documentation:iiswc2010_tutorial_flexus [2010/11/28 14:15]
mferdman
documentation:iiswc2010_tutorial_flexus [2010/12/03 22:25] (current)
mferdman
Line 88: Line 88:
  * Hit ESC and type **:wq** to save the file and exit.   * Hit ESC and type **:wq** to save the file and exit.
  * Type **mkdir /host**   * Type **mkdir /host**
-  * This is usually a good time to save out a checkpoint right before you mount the host file system.  At the Simics console, type **CTRL-C** followed by something like **write-configuration <ckpt_dir>/<your_checkpoint_name_b4_sfsmount>**+  * This is usually a good time to save out a checkpoint right before you mount the host file system.  At the Simics console, type **CTRL-C** followed by something like **write-configuration ~/images/b4_sfsmount**
  * Type **run** at the Simics console to resume.   * Type **run** at the Simics console to resume.
  * Within the simulated console, type **mount /host**   * Within the simulated console, type **mount /host**
  * Type **ls /host** to see the underlying host machine's root directory   * Type **ls /host** to see the underlying host machine's root directory
-At this point, you should copy microbenchmark files from **~/tutorial_files/microbenchmarks** into the target machine (by copying it from **/host** to a location on the simulated disk).  Save out a NEW checkpoint called **~/images/benchloaded** and quit out of Simics. Now open the checkpoint you saved with vi by typing **vi ~/images/benchloaded** and locate and delete the following lines:+At this point, you should copy the microbenchmark files from **/host/home/pf_user/tutorial_files/microbenchmarks** into the target machine (by copying them from **/host** to a location on the simulated disk).  Save out a NEW checkpoint called **~/images/benchloaded** and quit out of Simics. Now open the checkpoint you saved in vi by typing **vi ~/images/benchloaded**, then locate and delete the following lines:
<code> <code>
Line 109: Line 109:
  - You can see how the source code inserts the magic instructions by looking at **spinlock.c**   - You can see how the source code inserts the magic instructions by looking at **spinlock.c**
-  - Create a new Simics script called break.simics and fill it in with this:+  - Create a new Simics script called break.simics (it should already be available for you under ~/simics-3.0.22/targets/serengeti) and fill it in with this:
<code> <code>
@def hap_callback(user_arg, cpu, arg): @def hap_callback(user_arg, cpu, arg):
Line 122: Line 122:
</code> </code>
-  - Launch Simics by typing **start-simics break.simics**+  - Launch Simics by typing **../../scripts/start-simics break.simics**
  - Within the simulated console, navigate to the directory where you copied over the microbenchmark files.   - Within the simulated console, navigate to the directory where you copied over the microbenchmark files.
  - Type: **./spinlock 4 1000000000 10 10 0**  (this indicates we want 4 threads and run for effectively an infinite number of iterations)   - Type: **./spinlock 4 1000000000 10 10 0**  (this indicates we want 4 threads and run for effectively an infinite number of iterations)
Line 128: Line 128:
  - Type **run** again and wait until the first thread starts to execute and triggers the magic breakpoint   - Type **run** again and wait until the first thread starts to execute and triggers the magic breakpoint
  - Save a final checkpoint by typing **write-configuration ~/images/spinlock**   - Save a final checkpoint by typing **write-configuration ~/images/spinlock**
 +  - **FINAL STEP** (to prepare the checkpoint we will be using in the ProtoFlex part of the tutorial).  This step is needed to maximize the performance of the underlying simulated I/O system. Simics is typically the initiator of DMA transactions, which occur at some bulk-sized granularity.  Default Simics checkpoints set this granularity to a very low value (64 bytes).  Since Simics is a software-based simulator, issuing many small bulk transfers imposes no simulation overhead.  In our system, large bulk transfers are far more desirable. To change this default setting, you will need to **EDIT** the checkpoint file and make one small change. Type the following commands:
 +<code>
 +write-configuration ~/checkpoints/final
 +quit
 +perl -pi -e 's/dma_block_size: 64/dma_block_size: 8192/' ~/checkpoints/final
 +</code>
\\ \\
Line 134: Line 140:
======3. Working with Flexus====== ======3. Working with Flexus======
-From the simics checkpoint you just created, you will get a chance to run some sample jobs with Flexus and create a Flexpoint library. By this point you should have a valid initial checkpoint stored as **~/images/spinlock**.+From the simics checkpoint you just created, you will get a chance to run some sample jobs with Flexus. By this point you should have a valid initial checkpoint stored as **~/images/spinlock**.
  - Before starting, you should have a few initial directories in the home (which we will explain in the next steps):<code>   - Before starting, you should have a few initial directories in the home (which we will explain in the next steps):<code>
Line 165: Line 171:
  * load the initial checkpoint in Simics (using the **start-simics** script)   * load the initial checkpoint in Simics (using the **start-simics** script)
  * simics> **read-configuration ~/images/spinlock**   * simics> **read-configuration ~/images/spinlock**
-  * simics> **run-command-file ~/tutorial_files/flexus_v4/scripts/mem_and_io_proxy.simics**+  * simics> **run-command-file ~/tutorial_files/flexus_v4/scripts/mem_io_proxy.simics**
  * simics> **write-configuration ~/ckpts/spinlock/baseline/phase_000/simics/phase_000**   * simics> **write-configuration ~/ckpts/spinlock/baseline/phase_000/simics/phase_000**
Line 177: Line 183:
Run a "spinlock" trace job with CMP.L2Shared.Trace Run a "spinlock" trace job with CMP.L2Shared.Trace
  * **run_job -run trace -cfg 4cores -local CMP.L2Shared.Trace spinlock**   * **run_job -run trace -cfg 4cores -local CMP.L2Shared.Trace spinlock**
-    * Explanation of "local": -local requests to run a batch of jobs locally.  without -local an interactive run is assumed which waits at the simics> prompt instead of running. +  * Explanation of "local": -local requests to run a batch of jobs locally.  Without -local, an interactive run is assumed, which waits at the simics> prompt instead of running.
-   * Explanation of "remote": -remote will submit jobs to a remote cluster (e.g., Condor, PBS, etc...) [not available for the tutorial].+  * Explanation of "remote": -remote will submit jobs to a remote cluster (e.g., Condor or PBS) [not available for the tutorial].
 +  * You can **run** the simulation, interrupt it with **ctrl+c**, and change the debug severity with **flexus.debug-set-severity iface**.
++++ ++++
-====Displaying statistics through the stat-manager tool====+====Displaying statistics with the stat-manager tool====
++++CLICK - Expand/Collapse| ++++CLICK - Expand/Collapse|
Find the run directory for the trace job in ~/results/ and examine the resulting statistics database: Find the run directory for the trace job in ~/results/ and examine the resulting statistics database:
  * **~/tutorial_files/flexus_v4/stat-manager/stat-manager list-measurements**   * **~/tutorial_files/flexus_v4/stat-manager/stat-manager list-measurements**
  * See the cache hit/miss statistics, branch predictor stats, and instruction mix breakdown.   * See the cache hit/miss statistics, branch predictor stats, and instruction mix breakdown.
-    * **~/tutorial_files/flexus_v4/stat-manager/stat-manager print "Region 000" | less** +    * **~/tutorial_files/flexus_v4/stat-manager/stat-manager print 'Region 000' | less** 
-    * **~/tutorial_files/flexus_v4/stat-manager/stat-manager print "Region 001" | less**+    * **~/tutorial_files/flexus_v4/stat-manager/stat-manager print 'Region 001' | less**
  * By default, stat-manager aggregates statistics across all cores.  You can override this behavior with the -per-node flag.   * By default, stat-manager aggregates statistics across all cores.  You can override this behavior with the -per-node flag.
-    * **~/tutorial_files/flexus_v4/stat-manager/stat-manager -per-node print "Region 001" | less**+    * **~/tutorial_files/flexus_v4/stat-manager/stat-manager -per-node print 'Region 001' | less**
++++ ++++
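The measurement names used above contain a space, so they have to be quoted on the command line. The sketch below shows what the shell does with and without the quotes. `count_args` is a hypothetical helper written for this illustration and is not part of stat-manager; that stat-manager expects the measurement name as a single argument is an assumption based on the commands above.

```shell
# Hypothetical helper that reports how many arguments it received.
count_args() { echo "$#"; }

quoted=$(count_args print 'Region 000')    # region name stays one argument
unquoted=$(count_args print Region 000)    # region name splits into two
echo "quoted: $quoted, unquoted: $unquoted"
```

This prints `quoted: 2, unquoted: 3`. Single and double quotes behave the same here, since the name contains nothing for the shell to expand.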
 +
 +====Running timing simulations====
 +++++CLICK - Expand/Collapse|
 +Run a "spinlock" timing job with CMP.L2SharedNUCA.OoO
 +  * **run_job -run timing -cfg 4cores -ma CMP.L2SharedNUCA.OoO spinlock**
 +  * NOTE: When running timing simulations, one must pass the **-ma** parameter to Simics.
 +  * You can **run** the simulation, interrupt it with **ctrl+c**, and change the debug severity with **flexus.debug-set-severity iface**; **run 10** will run 10 cycles on all CPUs.
 +  * Rebuild the simulator with **vverb** debug output (CMP.L2SharedNUCA.OoO-vverb) and try running the simulation with **flexus.debug-set-severity vverb** to see the detailed debug output.
 +++++
 +
 +======4. Using Statistical Sampling with Flexus======
====Creating a flexpoint library==== ====Creating a flexpoint library====
Line 225: Line 243:
Use stat-sample to combine all the stats_db.out.selected.gz files into a single statistics file. Use stat-sample to combine all the stats_db.out.selected.gz files into a single statistics file.
-  * **~/tutorial_files/flexus_v4/stat-manager stat-sample stats_db.out.gz */stats_db.out.selected.gz**+  * **~/tutorial_files/flexus_v4/stat-manager/stat-sample stats_db.out.gz */stats_db.out.selected.gz**
  * Examine the resulting stats_db.out.gz file that contains the combined results of all flexpoints.   * Examine the resulting stats_db.out.gz file that contains the combined results of all flexpoints.
  * Examine the IPCs of the various flexpoints:   * Examine the IPCs of the various flexpoints: