Differences
This shows you the differences between two versions of the page.
documentation:iiswc2010_tutorial_flexus [2010/11/28 18:30] mferdman |
documentation:iiswc2010_tutorial_flexus [2010/12/03 22:25] (current) mferdman |
||
---|---|---|---|
Line 109: | Line 109: | ||
- You can see how the source code inserts the magic instructions by looking at **spinlock.c** | - You can see how the source code inserts the magic instructions by looking at **spinlock.c** | ||
- | - Create a new Simics script called break.simics and fill it in with this: | + | - Create a new Simics script called break.simics and fill it in with this: (this should already be available for you under ~/simics-3.0.22/targets/serengeti) |
<code> | <code> | ||
@def hap_callback(user_arg, cpu, arg): | @def hap_callback(user_arg, cpu, arg): | ||
Line 128: | Line 128: | ||
- Type **run** again and wait until the first thread starts to execute and triggers the magic breakpoint | - Type **run** again and wait until the first thread starts to execute and triggers the magic breakpoint | ||
- Save a final checkpoint by typing **write-configuration ~/images/spinlock** | - Save a final checkpoint by typing **write-configuration ~/images/spinlock** | ||
+ | - **FINAL STEP** (to prepare the checkpoint we will be using in the ProtoFlex part of the tutorial). This step is needed to maximize the performance of the underlying simulated I/O system. Simics is typically the initiator of DMA transactions, which occur at some bulk-sized granularity. Default Simics checkpoints set this granularity to a very low value (64 bytes). Since Simics is a software-based simulator, issuing many small bulk transfers imposes no simulation overhead. In our system, large bulk transfers are far more desirable. To change this default setting, you will need to **EDIT** the checkpoint file and make one small change. Type the following commands: | ||
+ | <code> | ||
+ | write-configuration ~/checkpoints/final | ||
+ | quit | ||
+ | perl -pi -e 's/dma_block_size: 64/dma_block_size: 8192/' ~/checkpoints/final | ||
+ | </code> | ||
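For readers who prefer not to use the perl one-liner, the same checkpoint edit can be sketched in Python. This is only an illustrative equivalent, not part of the tutorial files: the function name `set_dma_block_size` is hypothetical, and unlike the perl command it replaces every occurrence in the file and requires a word boundary after the old value (so a value such as 640 is left alone).

```python
import re
from pathlib import Path


def set_dma_block_size(checkpoint_path, new_size=8192, old_size=64):
    """Replace 'dma_block_size: <old_size>' entries in a checkpoint file.

    Hypothetical helper: mirrors the perl substitution from the tutorial.
    Returns the number of entries that were rewritten.
    """
    path = Path(checkpoint_path).expanduser()
    # Work on raw bytes so any non-text data in the file is preserved as-is.
    data = path.read_bytes()
    pattern = re.compile(rb"dma_block_size: " + str(old_size).encode() + rb"\b")
    replacement = b"dma_block_size: " + str(new_size).encode()
    new_data, count = pattern.subn(replacement, data)
    if count:
        path.write_bytes(new_data)
    return count


if __name__ == "__main__":
    # Path taken from the tutorial step above; run after quitting Simics.
    changed = set_dma_block_size("~/checkpoints/final")
    print("entries rewritten:", changed)
```

A return value of 0 means no matching entry was found, which is worth checking before moving on to the ProtoFlex part.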
\\ | \\ | ||
Line 134: | Line 140: | ||
======3. Working with Flexus====== | ======3. Working with Flexus====== | ||
- | From the simics checkpoint you just created, you will get a chance to run some sample jobs with Flexus and create a Flexpoint library. By this point you should have a valid initial checkpoint stored as **~/images/spinlock**. | + | From the simics checkpoint you just created, you will get a chance to run some sample jobs with Flexus. By this point you should have a valid initial checkpoint stored as **~/images/spinlock**. |
- Before starting, you should have a few initial directories in the home (which we will explain in the next steps):<code> | - Before starting, you should have a few initial directories in the home (which we will explain in the next steps):<code> | ||
Line 177: | Line 183: | ||
Run a "spinlock" trace job with CMP.L2Shared.Trace | Run a "spinlock" trace job with CMP.L2Shared.Trace | ||
* **run_job -run trace -cfg 4cores -local CMP.L2Shared.Trace spinlock** | * **run_job -run trace -cfg 4cores -local CMP.L2Shared.Trace spinlock** | ||
- | * Explanation of "local": -local requests to run a batch of jobs locally. without -local an interactive run is assumed which waits at the simics> prompt instead of running. | + | * Explanation of "local": -local requests to run a batch of jobs locally. without -local an interactive run is assumed which waits at the simics> prompt instead of running. |
- | * Explanation of "remote": -remote will submit jobs to a remote cluster (e.g., Condor, PBS, etc...) [not available for the tutorial]. | + | * Explanation of "remote": -remote will submit jobs to a remote cluster (e.g., Condor, PBS, etc...) [not available for the tutorial]. |
+ | * You can **run** the simulation, interrupt it with **ctrl+c**, and change the debug severity with **flexus.debug-set-severity iface**. | ||
++++ | ++++ | ||
- | ====Displaying statistics through the stat-manager tool==== | + | ====Displaying statistics with the stat-manager tool==== |
++++CLICK - Expand/Collapse| | ++++CLICK - Expand/Collapse| | ||
Find the run directory for the trace job in ~/results/ and examine the resulting statistics database: | Find the run directory for the trace job in ~/results/ and examine the resulting statistics database: | ||
* **~/tutorial_files/flexus_v4/stat-manager/stat-manager list-measurements** | * **~/tutorial_files/flexus_v4/stat-manager/stat-manager list-measurements** | ||
* See the cache hit/miss statistics, branch predictor stats, and instruction mix breakdown. | * See the cache hit/miss statistics, branch predictor stats, and instruction mix breakdown. | ||
- | * **~/tutorial_files/flexus_v4/stat-manager/stat-manager print "Region 000" | less** | + | * **~/tutorial_files/flexus_v4/stat-manager/stat-manager print 'Region 000' | less** |
- | * **~/tutorial_files/flexus_v4/stat-manager/stat-manager print "Region 001" | less** | + | * **~/tutorial_files/flexus_v4/stat-manager/stat-manager print 'Region 001' | less** |
* By default, stat-manager aggregates statistics across all cores. You can override this behavior with the -per-node flag. | * By default, stat-manager aggregates statistics across all cores. You can override this behavior with the -per-node flag. | ||
- | * **~/tutorial_files/flexus_v4/stat-manager/stat-manager -per-node print "Region 001" | less** | + | * **~/tutorial_files/flexus_v4/stat-manager/stat-manager -per-node print 'Region 001' | less** |
++++ | ++++ | ||
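When inspecting several regions, the stat-manager invocations listed above can be generated rather than typed by hand. The sketch below only builds the command strings. The helper name `print_command` is hypothetical, and it assumes that region names always follow the three-digit "Region NNN" pattern seen in the two examples.

```python
import shlex

# Location of the tool as given in the tutorial.
STAT_MANAGER = "~/tutorial_files/flexus_v4/stat-manager/stat-manager"


def print_command(region, per_node=False):
    """Build a stat-manager 'print' command line for one region.

    Hypothetical helper; assumes regions are named 'Region NNN'.
    """
    flags = "-per-node " if per_node else ""
    # Quote only the region name so the leading '~' is still expanded by the shell.
    region_arg = shlex.quote("Region %03d" % region)
    return "%s %sprint %s | less" % (STAT_MANAGER, flags, region_arg)


if __name__ == "__main__":
    for region in (0, 1):
        print(print_command(region))
    print(print_command(1, per_node=True))
```

Each printed line can be pasted into a shell started in the run directory of the trace job.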
+ | |||
+ | ====Running timing simulations==== | ||
+ | ++++CLICK - Expand/Collapse| | ||
+ | Run a "spinlock" timing job with CMP.L2SharedNUCA.OoO | ||
+ | * **run_job -run timing -cfg 4cores -ma CMP.L2SharedNUCA.OoO spinlock** | ||
+ | * NOTE: When running timing simulations, one must pass the **-ma** parameter to Simics. | ||
+ | * You can **run** the simulation, interrupt it with **ctrl+c**, and change the debug severity with **flexus.debug-set-severity iface**; **run 10** runs 10 cycles on all CPUs. | ||
+ | * Rebuild the simulator with **vverb** debug output (CMP.L2SharedNUCA.OoO-vverb) and try running the simulation with **flexus.debug-set-severity vverb** to see the detailed debug output. | ||
+ | ++++ | ||
+ | |||
+ | ======4. Using Statistical Sampling with Flexus====== | ||
====Creating a flexpoint library==== | ====Creating a flexpoint library==== | ||
Line 225: | Line 243: | ||
Use stat-sample to combine all the stats_db.out.selected.gz files into a single statistics file. | Use stat-sample to combine all the stats_db.out.selected.gz files into a single statistics file. | ||
- | * **~/tutorial_files/flexus_v4/stat-manager stat-sample stats_db.out.gz */stats_db.out.selected.gz** | + | * **~/tutorial_files/flexus_v4/stat-manager/stat-sample stats_db.out.gz */stats_db.out.selected.gz** |
* Examine the resulting stats_db.out.gz file that contains the combined results of all flexpoints. | * Examine the resulting stats_db.out.gz file that contains the combined results of all flexpoints. | ||
* Examine the IPCs of the various flexpoints: | * Examine the IPCs of the various flexpoints: |