Commit e8b8636d by sakundu

Added screenshots and results of Ariane133-NG45 design for different Flows

Signed-off-by: sakundu <sakundu@ucsd.edu>
parent 38109225
...@@ -2,35 +2,93 @@ We implement [Ariane design with 133 macros](../../../Testcases/ariane133) on th ...@@ -2,35 +2,93 @@ We implement [Ariane design with 133 macros](../../../Testcases/ariane133) on th
## *Macro Placement Generated by Cadence Flow-1* ## *Macro Placement Generated by Cadence Flow-1*
The screenshot of the design using Cadence Flow-1 on Nangate45 enablement is shown below. The screenshot of the placed and routed Ariane133-NG45 design generated using [Flow-1](../../figures/flow-1.PNG) is shown below. Flow-1 uses Cadence Concurrent Macro Placer (CMP) to place the macros.
<img src="./screenshots/Ariane133_Innovus.png" alt="ariane133_cadence" width="400"/> <p align="center">
<img height="400" src="./screenshots/Innovus_Flow1_Placement.png" alt="Innovus Flow-1 placement">
<img height="400" src="./screenshots/Innovus_Flow1_Routing.png" alt="Innovus Flow-1 routing">
</p>
|Physical Design Stage|Core Area (um^2)|Standard Cell Area (um^2)|Macro Area (um^2)|Total Power (mW)|Wirelength(um)|WS(ns)|TNS(ns)|Congestion(H)|Congestion(V)| | Physical Design Stage | Core Area (um^2) | Standard Cell Area (um^2) | Macro Area (um^2) | Total Power (mW) | Wirelength (um) | WS (ns) | TNS (ns) | Congestion (H) | Congestion (V) |
|---------------------|----------------|-------------------------|-----------------|----------------|--------------|------|-------|-------------|-------------| |-----------------------|------------------|---------------------------|-------------------|------------------|-----------------|---------|----------|----------------|----------------|
|preCTS |2560080 |213764 |1018356 |285 |3488909 |0.009 |0.000 |0.00% |0.00% | | postSynth | 1814274 | 215401 | 1018356 | 751.379 | 3643746 | -2.976 | -12125.3 | | |
|postCTS |2560080 |214818 |1018356 |297 |3492066 |0.000 |0.000 |0.00% |0.00% | | preCTS | 1814274 | 241394 | 1018356 | 778.305 | 3731576 | -0.216 | -394.259 | 0.01% | 0.02% |
|postRoute |2560080 |214818 |1018356 |297 |3598577 |0.189 |0.000 | | | | postCTS | 1814274 | 245861 | 1018356 | 825.089 | 3779460 | -0.163 | -168.228 | 0.02% | 0.03% |
| postRoute | 1814274 | 245861 | 1018356 | 825.273 | 3917426 | -0.165 | -186.824 | | |
| postRouteOpt | 1814274 | 246221 | 1018356 | 825.631 | 3921486 | -0.165 | -167.579 | | |
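
As a sanity check on the table above, utilization can be estimated from the reported areas. The sketch below assumes utilization is defined as (standard cell area + macro area) / core area; the source does not state the formula, and the function name `utilization` is illustrative. With the postSynth row this comes to about 68%, in line with the utilization quoted for this design.

```python
def utilization(std_cell_area_um2, macro_area_um2, core_area_um2):
    """Fraction of the core area occupied by standard cells and macros.

    Assumed definition: (standard cell area + macro area) / core area.
    """
    if core_area_um2 <= 0:
        raise ValueError("core area must be positive")
    return (std_cell_area_um2 + macro_area_um2) / core_area_um2


# Values copied from the postSynth row of the Flow-1 table.
post_synth = utilization(215401, 1018356, 1814274)
print(f"postSynth utilization: {post_synth:.1%}")
```

The value grows at later stages because the standard cell area increases during optimization while the core and macro areas stay fixed.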
## *Flow-2 Result for the Macro Placement Generated by Cadence Concurrent Macro Placer*
The screenshot of the placed and routed Ariane133-NG45 design generated using [Flow-2](../../figures/flow-2.PNG) is shown below. Here the initial macro placement is generated using CMP.
<p align="center">
<img height="400" src="./screenshots/Innovus_Flow2_Placement.png" alt="Innovus Flow-2 placement">
<img height="400" src="./screenshots/Innovus_Flow2_Routing.png" alt="Innovus Flow-2 routing">
</p>
## *Macro Placement Generated by ORFS (Flow-3)* The following table shows the different metrics for the Ariane133-NG45 implementation (utilization: 68%, clock period: 1.3 ns) using [Flow-2](../../figures/flow-2.PNG).
The screenshot of the design using ORFS on Nangate45 enablement is shown below. | Physical Design Stage | Core Area (um^2) | Standard Cell Area (um^2) | Macro Area (um^2) | Total Power (mW) | Wirelength (um) | WS (ns) | TNS (ns) | Congestion (H) | Congestion (V) |
<img src="./screenshots/Ariane133_ORFS.png" alt="ariane136_orfs" width="400"/> |-----------------------|------------------|---------------------------|-------------------|------------------|----------------|--------|----------|---------------|---------------|
| preCTS | 1814274 | 251874 | 1018356 | 807.994 | 3885279 | -0.150 | -242.589 | 0.02% | 0.02% |
| postCTS | 1814274 | 254721 | 1018356 | 851.977 | 3923912 | -0.127 | -133.426 | 0.04% | 0.10% |
| postRoute | 1814274 | 254721 | 1018356 | 850.483 | 4049905 | -0.239 | -410.578 | | |
| postRouteOpt | 1814274 | 256230 | 1018356 | 851.546 | 4057140 | -0.154 | -196.527 | | |
## *Macro Placement Generated by OpenROAD Flow Scripts (ORFS) (Flow-3)*
The following screenshot shows the placed Ariane133-NG45 design generated using [Flow-3](../../figures/flow-3.PNG). Here the macro placement is generated using the RTL-MP engine of OpenROAD.
<p align="center">
<img src="./screenshots/Ariane133_ORFS.png" alt="ariane133_orfs" height="400"/>
</p>
## *Baseline Macro Placement Generated by Human* ## *Baseline Macro Placement Generated by Human*
The screenshot of the design using the Cadence Innovus tool for standard-cell placement and routing on Nangate45 enablement is shown below. ### Manual macro placement on a gridded canvas
<img src="./screenshots/manual_ariane133_Innovus.png" alt="ariane133_cadence" width="400"/> The following screenshots show the placed and routed Ariane133-NG45 design generated using [Flow-2](../../figures/flow-2.PNG), where the macros are placed manually on a gridded canvas.
The manual macro placement is provided in [manual_floorplan.def](https://github.com/TILOS-AI-Institute/MacroPlacement/blob/main/Flows/NanGate45/ariane133/def/manual_floorplan.def). <p align="center">
<img height="400" src="./screenshots/Human_Gridded_Placement.png">
<img height="400" src="./screenshots/Human_Gridded_Routing.png">
</p>
The manual macro placement is provided in [manual_floorplan.def](../ariane133/def/manual_floorplan.def).
We generate the manual macro placement in two steps: We generate the manual macro placement in two steps:
(1) we call the [gridding](https://github.com/TILOS-AI-Institute/MacroPlacement/tree/main/CodeElements/Gridding) scripts to generate grid cells (in this case, we end up with a 27 x 27 grid); and (2) we manually place macros so that their centers lie on centers of grid cells, with no overlap between macros or overflow of macros beyond the layout canvas. (1) we call the [gridding](../../../CodeElements/Gridding/) scripts to generate grid cells (in this case, we end up with a 22 columns x 30 rows grid); and (2) we manually place macros so that their centers lie on centers of grid cells, with no overlap between macros or overflow of macros beyond the layout canvas.
Note that this human-constructed macro placement can serve as a competitive baseline for [Circuit Training](https://github.com/google-research/circuit_training). Note that this human-constructed macro placement can serve as a competitive baseline for [Circuit Training](https://github.com/google-research/circuit_training).
The metrics reported by the Innovus tool after different physical design stages are shown below. The metrics reported by the Innovus tool after different physical design stages are shown below.
Note that (1) we set the activity factor to 0.2 in our flow; (2) the standard cell area does not include physical cells; (3) In order to match [Nature paper](https://www.nature.com/articles/s41586-021-03544-w), we adjust the pin positions to occupy about 60% of the left boundary; and (4) the total macro area for ariane133 (NanGate45) is 1018356um^2, and the overall utilization is 48.228%. Note that (1) we set the activity factor to 0.2 in our flow; (2) the standard cell area does not include physical cells; (3) in order to match the [Nature paper](https://www.nature.com/articles/s41586-021-03544-w), we adjust the pin positions to occupy about 60% of the left boundary; and (4) the total macro area for ariane133 (NanGate45) is 1018356um^2, and the overall utilization is ~68%.
| Physical Design Stage | Core Area (um^2) | Standard Cell Area (um^2) | Macro Area (um^2) | Total Power (mW) | Wirelength (um) | WS (ns) | TNS (ns) | Congestion (H) | Congestion (V) |
|-----------------------|------------------|---------------------------|-------------------|------------------|-----------------|---------|----------|----------------|----------------|
| postSynth | 1814274 | 242813 | 1018356 | 759.332 | 4941086 | -0.764 | -451.478 | | |
| preCTS | 1814274 | 243778 | 1018356 | 791.963 | 4821642 | -0.119 | -137.735 | 0.06% | 0.08% |
| postCTS | 1814274 | 246360 | 1018356 | 835.687 | 4863282 | -0.115 | -50.599 | 0.06% | 0.13% |
| postRoute | 1814274 | 246360 | 1018356 | 834.281 | 5013763 | -0.121 | -74.638 | | |
| postRouteOpt | 1814274 | 246583 | 1018356 | 834.660 | 5018976 | -0.117 | -55.035 | | |
### Macro placement generated by an Industry expert
We thank Dr. Jinwook Jung for sharing the human macro placement of the Ariane133 design. The following figure shows the placed and routed Ariane133-NG45 design, where the macro placement is generated by Dr. Jung. He has also shared the [place_srams.tcl](./def/place_srams.tcl) script to reproduce the macro placement.
<p align="center">
<img height="400" src="./screenshots/Human_Expert_Placement.png">
<img height="400" src="./screenshots/Human_Expert_Routing.png">
</p>
|Physical Design Stage|Core Area (um^2)|Standard Cell Area (um^2)|Macro Area (um^2)|Total Power (mW)|Wirelength(um)|WS(ns)|TNS(ns)|Congestion(H)|Congestion(V)| The following table shows the different metrics, when [Flow-2](../../figures/flow-2.PNG) is used for the above macro placement.
|---------------------|----------------|-------------------------|-----------------|----------------|--------------|------|-------|-------------|-------------| | Physical Design Stage | Core Area (um^2) | Standard Cell Area (um^2) | Macro Area (um^2) | Total Power (mW) | Wirelength (um) | WS (ns) | TNS (ns) | Congestion (H) | Congestion (V) |
|preCTS |2560080 |215189 |1018356 |286 |4470832 |-0.002|-0.005 |0.00% |0.00% | |-----------------------|------------------|---------------------------|-------------------|------------------|-----------------|---------|----------|----------------|----------------|
|postCTS |2560080 |216323 |1018356 |300 |4472866 |0.001 |0.000 |0.00% |0.00% | | postSynth | 1814274 | 244729 | 1018356 | 739.014 | 4664986 | -0.764 | -472.548 | | |
|postRoute |2560080 |216323 |1018356 |299 |4587141 |0.284 |0.000 | | | | preCTS | 1814274 | 245069 | 1018356 | 792.784 | 4549848 | -0.133 | -175.54 | 0.03% | 0.20% |
| postCTS | 1814274 | 247377 | 1018356 | 835.154 | 4567212 | -0.128 | -72.595 | 0.04% | 0.21% |
| postRoute | 1814274 | 247377 | 1018356 | 833.294 | 4705687 | -0.143 | -86.669 | | |
| postRouteOpt | 1814274 | 246722 | 1018356 | 831.745 | 4711746 | -0.107 | -69.589 | | |
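
The `place_srams.tcl` script places each SRAM group by stepping leftward from a starting x coordinate, mirroring every other column so that pin sides face each other, and adding extra spacing of four halo widths before each mirrored column. The following Python sketch reproduces only that coordinate recurrence, as used in the `i_icache/tag` loop. The function name and the numeric values are hypothetical; the real script reads the SRAM size from the Innovus database and calls `placeInstance`.

```python
def column_placements(start_x, sram_width, halo, num_columns):
    """Return (x, orientation) per column, stepping leftward from start_x.

    Mirrors the recurrence in the script's i_icache/tag loop:
      x = begin_llx - (col_idx > 0) * sram_size_x
      odd columns: x -= 4 * halo_size_x, orientation MY (mirrored)
      even columns: orientation R0
      begin_llx = x
    """
    placements = []
    begin_x = start_x
    for col_idx in range(num_columns):
        x = begin_x - (sram_width if col_idx > 0 else 0.0)
        if col_idx % 2 == 0:
            orientation = "R0"
        else:
            x -= 4 * halo          # extra spacing before a mirrored column
            orientation = "MY"     # mirror so the pin sides face each other
        placements.append((x, orientation))
        begin_x = x
    return placements


# Hypothetical dimensions, chosen only to make the arithmetic easy to follow.
for x, orientation in column_placements(1000.0, 50.0, 5.0, 4):
    print(x, orientation)
```

With these illustrative numbers, each mirrored column sits 20 um further left than plain abutment would place it.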
## *Macro Placement Generated by Circuit Training (CT)*
The following screenshot shows the placed and routed Ariane133-NG45 design generated using [Flow-2](../../figures/flow-2.PNG), where the initial macro placement is generated by CT.
<p align="center">
<img height="400" src="./screenshots/CT_Placement.png">
<img height="400" src="./screenshots/CT_Routing.png">
</p>
The following table shows the different metrics when [Flow-2](../../figures/flow-2.PNG) is used with the initial macro placement generated by CT.
| Physical Design Stage | Core Area (um^2) | Standard Cell Area (um^2) | Macro Area (um^2) | Total Power (mW) | Wirelength (um) | WS (ns) | TNS (ns) | Congestion (H) | Congestion (V) |
|-----------------------|------------------|---------------------------|-------------------|------------------|-----------------|---------|----------|----------------|----------------|
| postSynth | 1814274 | 244614 | 1018356 | 761.754 | 4884882 | -0.764 | -533.519 | | |
| preCTS | 1814274 | 244373 | 1018356 | 792.626 | 4732895 | -0.123 | -184.135 | 0.03% | 0.11% |
| postCTS | 1814274 | 247965 | 1018356 | 837.464 | 4762751 | -0.084 | -35.57 | 0.04% | 0.15% |
| postRoute | 1814274 | 247965 | 1018356 | 835.824 | 4887126 | -0.123 | -63.739 | | |
| postRouteOpt | 1814274 | 248448 | 1018356 | 836.399 | 4892431 | -0.09 | -57.448 | | |
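
To compare the flows at a glance, the postRouteOpt rows of the tables above can be collected and ranked per metric. The sketch below is one way to do this. The dictionary labels and the helper `best_by` are illustrative names, not part of the repository, and Flow-3 (ORFS) is omitted because no metrics table is given for it.

```python
# postRouteOpt rows copied from the tables above:
# (wirelength um, WS ns, TNS ns, total power mW)
POST_ROUTE_OPT = {
    "Flow-1 (CMP)": (3921486, -0.165, -167.579, 825.631),
    "Flow-2 (CMP)": (4057140, -0.154, -196.527, 851.546),
    "Human (gridded)": (5018976, -0.117, -55.035, 834.660),
    "Human (expert)": (4711746, -0.107, -69.589, 831.745),
    "CT": (4892431, -0.09, -57.448, 836.399),
}

WIRELENGTH, WS, TNS, POWER = range(4)


def best_by(metric_index, higher_is_better):
    """Return the label of the flow with the best value for one metric."""
    pick = max if higher_is_better else min
    return pick(POST_ROUTE_OPT, key=lambda name: POST_ROUTE_OPT[name][metric_index])


print("Shortest wirelength:", best_by(WIRELENGTH, higher_is_better=False))
print("Best WS (closest to zero):", best_by(WS, higher_is_better=True))
print("Best TNS (closest to zero):", best_by(TNS, higher_is_better=True))
print("Lowest power:", best_by(POWER, higher_is_better=False))
```

Slack values are negative in every row, so "best" for WS and TNS means the largest (least negative) value.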
\ No newline at end of file
# This script was written and developed by ABKGroup students at UCSD. However, the underlying commands and reports are copyrighted by Cadence.
# We thank Cadence for granting permission to share our research to help promote and foster the next generation of innovators.
# We thank Dr. Jinwook Jung for sharing the Human Macro Placement of Ariane-133 design.
# Disclaimer from Dr. Jung: I don’t have any prior knowledge about the design, and just did the placement based on the SRAM instance names as well as “aesthetic.”
unplaceAllBlocks
set sram_name fakeram45_256x16
set sram_size_x [dbGet [dbGet head.libCells.name $sram_name -p].size_x]
set sram_size_y [dbGet [dbGet head.libCells.name $sram_name -p].size_y]
set sram_size_hx [expr $sram_size_x / 2]
set sram_size_hy [expr $sram_size_y / 2]
set halo_size_x 5
set halo_size_y 5
set fp_llx [dbGet top.fPlan.coreBox_llx]
set fp_lly [dbGet top.fPlan.coreBox_lly]
set fp_urx [dbGet top.fPlan.coreBox_urx]
set fp_ury [dbGet top.fPlan.coreBox_ury]
set fp_cx [expr [dbGet top.fPlan.coreBox_urx]/2]
set fp_cy [expr [dbGet top.fPlan.coreBox_ury]/2]
set site_width [dbGet head.sites.size_x] ;# 0.19
set fp_spacing_urx [expr 260*$site_width]
set fp_spacing_ury [expr 260*$site_width]
#-------------------------------------------------------------------------------
# i_icache/tag
#-------------------------------------------------------------------------------
set begin_llx [expr $fp_urx - $fp_spacing_urx - $sram_size_x]
set begin_lly [expr $fp_cy - $sram_size_hy]
set WAY 4
set NUM_SRAMS_PER_WAY 3
for {set i 0} {$i < $WAY} {incr i} {
for {set j 0} {$j < $NUM_SRAMS_PER_WAY} {incr j} {
set sram_inst [dbGet top.insts.name *i_icache/sram_block[$i].tag*/macro_mem[$j]*]
puts $sram_inst
set row_idx [expr $i]
set col_idx [expr $NUM_SRAMS_PER_WAY*$i + $j]
set x [expr $begin_llx - ($col_idx > 0)*$sram_size_x]
set y [expr $begin_lly]
if {$col_idx % 2 == 0} {
placeInstance $sram_inst $x $y R0
} else {
# Add additional spacing
set x [expr $x - 4*$halo_size_x]
# Mirror the pin side
placeInstance $sram_inst $x $y MY
}
set begin_llx $x
}
}
#-------------------------------------------------------------------------------
# i_icache/data
# I'll place the first two ways on the top of the tag arrays.
# The other two ways below the tag arrays.
#-------------------------------------------------------------------------------
## First half
set ref_tag_sram i_cache_subsystem/i_icache/sram_block[0].tag_sram/macro_mem[0].i_ram
set begin_llx [dbGet [dbGet top.insts.name $ref_tag_sram -p].pt_x]
set begin_lly [dbGet [dbGet top.insts.name $ref_tag_sram -p].pt_y]
# Add additional spacing
set begin_lly [expr $begin_lly + 2*$sram_size_y + 4*$halo_size_y]
set WAY 2
set NUM_SRAMS_PER_WAY 8
for {set i 0} {$i < $WAY} {incr i} {
for {set j 0} {$j < $NUM_SRAMS_PER_WAY} {incr j} {
set sram_inst [dbGet top.insts.name *i_icache/sram_block[$i].data*/macro_mem[$j]*]
puts $sram_inst
set row_idx [expr $i]
set col_idx [expr $j]
set x [expr $begin_llx - ($col_idx > 0)*$sram_size_x]
set y [expr $begin_lly - $row_idx*$sram_size_y]
if {$col_idx % 2 == 0} {
placeInstance $sram_inst $x $y R0
} else {
# Add additional spacing
set x [expr $x - 4*$halo_size_x]
if {$col_idx == $NUM_SRAMS_PER_WAY - 1} {
# Don't mirror the last column.
placeInstance $sram_inst $x $y R0
} else {
# Mirror the pin side
placeInstance $sram_inst $x $y MY
}
}
set begin_llx $x
}
set begin_llx [dbGet [dbGet top.insts.name $ref_tag_sram -p].pt_x]
}
## Second half
set ref_tag_sram i_cache_subsystem/i_icache/sram_block[0].tag_sram/macro_mem[0].i_ram
set begin_llx [dbGet [dbGet top.insts.name $ref_tag_sram -p].pt_x]
set begin_lly [dbGet [dbGet top.insts.name $ref_tag_sram -p].pt_y]
# Add additional spacing
set begin_lly [expr $begin_lly - $sram_size_y - 4*$halo_size_y]
set WAY 4
set NUM_SRAMS_PER_WAY 8
for {set i 2} {$i < $WAY} {incr i} {
for {set j 0} {$j < $NUM_SRAMS_PER_WAY} {incr j} {
set sram_inst [dbGet top.insts.name *i_icache/sram_block[$i].data*/macro_mem[$j]*]
puts $sram_inst
set row_idx [expr $i - 2]
set col_idx [expr $j]
set x [expr $begin_llx - ($col_idx > 0)*$sram_size_x]
set y [expr $begin_lly - $row_idx*$sram_size_y]
if {$col_idx % 2 == 0} {
placeInstance $sram_inst $x $y R0
} else {
# Add additional spacing
set x [expr $x - 4*$halo_size_x]
if {$col_idx == $NUM_SRAMS_PER_WAY - 1} {
# Don't mirror the last column.
placeInstance $sram_inst $x $y R0
} else {
# Mirror the pin side
placeInstance $sram_inst $x $y MY
}
}
set begin_llx $x
}
set begin_llx [dbGet [dbGet top.insts.name $ref_tag_sram -p].pt_x]
}
#-------------------------------------------------------------------------------
# i_nbdcache/tag: First 4 ways near the top-left corner of the core area.
#-------------------------------------------------------------------------------
set begin_llx [expr $fp_llx]
set begin_lly [expr $fp_ury - $sram_size_y]
set num_ways 8
set num_srams_per_way 3
for {set i 0} {$i < $num_ways/2} {incr i} {
for {set j 0} {$j < $num_srams_per_way} {incr j} {
set sram_inst [dbGet top.insts.name *i_nbdcache/sram_block[$i].tag*/macro_mem[$j]*]
puts $sram_inst
set row_idx [expr $i]
set col_idx [expr $j]
set x [expr $begin_llx + ($col_idx > 0)*$sram_size_x]
set y [expr $begin_lly - $row_idx*$sram_size_y]
if {$col_idx % 2 == 0} {
placeInstance $sram_inst $x $y MY
} else {
# Add additional spacing
set x [expr $x + 4*$halo_size_x]
# Mirror
placeInstance $sram_inst $x $y R0
}
set begin_llx $x
}
set begin_llx [expr $fp_llx]
}
#-------------------------------------------------------------------------------
# i_nbdcache/tag: Second 4 ways near the bottom-left corner of the core area.
#-------------------------------------------------------------------------------
set begin_llx [expr $fp_llx]
set begin_lly [expr $fp_lly]
set num_ways 8
set num_srams_per_way 3
for {set i 4} {$i < $num_ways} {incr i} {
for {set j 0} {$j < $num_srams_per_way} {incr j} {
set sram_inst [dbGet top.insts.name *i_nbdcache/sram_block[$i].tag*/macro_mem[$j]*]
puts $sram_inst
set row_idx [expr $i - $num_ways/2]
set col_idx [expr $j]
set x [expr $begin_llx + ($col_idx > 0)*$sram_size_x]
set y [expr $begin_lly + $row_idx*$sram_size_y]
if {$col_idx % 2 == 0} {
placeInstance $sram_inst $x $y MY
} else {
# Add additional spacing
set x [expr $x + 4*$halo_size_x]
# Mirror
placeInstance $sram_inst $x $y R0
}
set begin_llx $x
}
set begin_llx [expr $fp_llx]
}
#-------------------------------------------------------------------------------
# i_nbdcache/data: first 4 ways near the top edge of the core area.
#-------------------------------------------------------------------------------
set begin_llx [expr $fp_urx - $fp_spacing_urx - $sram_size_x]
set begin_lly [expr $fp_ury - $sram_size_y]
set num_ways 8
set num_srams_per_way 8
for {set i 0} {$i < $num_ways/2} {incr i} {
for {set j 0} {$j < $num_srams_per_way} {incr j} {
set sram_inst [dbGet top.insts.name *i_nbdcache/sram_block[$i].data*/macro_mem[$j]*]
puts $sram_inst
set row_idx [expr int(floor($i/2))]
set col_idx [expr $j + ($i%2) * $num_srams_per_way]
set x [expr $begin_llx - ($col_idx > 0)*$sram_size_x]
set y [expr $begin_lly - $row_idx*$sram_size_y]
if {$col_idx % 2 == 0} {
placeInstance $sram_inst $x $y R0
} else {
# Add additional spacing
set x [expr $x - 4*$halo_size_x]
# Mirror the pin side
placeInstance $sram_inst $x $y MY
}
set begin_llx $x
}
if {$i == 1} {
set begin_llx [expr $fp_urx - $fp_spacing_urx - $sram_size_x]
}
}
#-------------------------------------------------------------------------------
# i_nbdcache/data: second 4 ways near the bottom edge of the core area.
#-------------------------------------------------------------------------------
set begin_llx [expr $fp_urx - $fp_spacing_urx - $sram_size_x]
set begin_lly [expr $fp_lly]
set num_ways 8
set num_srams_per_way 8
for {set i 4} {$i < $num_ways} {incr i} {
for {set j 0} {$j < $num_srams_per_way} {incr j} {
set sram_inst [dbGet top.insts.name *i_nbdcache/sram_block[$i].data*/macro_mem[$j]*]
puts $sram_inst
set row_idx [expr int(floor(($i - 4)/2))]
set col_idx [expr $j + ($i%2) * $num_srams_per_way]
set x [expr $begin_llx - ($col_idx > 0)*$sram_size_x]
set y [expr $begin_lly + $row_idx*$sram_size_y]
if {$col_idx % 2 == 0} {
placeInstance $sram_inst $x $y R0
} else {
# Add additional spacing
set x [expr $x - 4*$halo_size_x]
# Mirror the pin side
placeInstance $sram_inst $x $y MY
}
set begin_llx $x
}
if {$i == 5} {
set begin_llx [expr $fp_urx - $fp_spacing_urx - $sram_size_x]
}
}
##-------------------------------------------------------------------------------
## Manual refinement
##-------------------------------------------------------------------------------
placeInstance i_cache_subsystem/i_icache/sram_block[3].tag_sram/macro_mem[2].i_ram 704.7 895.42 R0
placeInstance i_cache_subsystem/i_icache/sram_block[3].tag_sram/macro_mem[1].i_ram 704.7 762.42 R0
placeInstance i_cache_subsystem/i_icache/sram_block[2].tag_sram/macro_mem[2].i_ram 704.7 456.42 R0
placeInstance i_cache_subsystem/i_icache/sram_block[3].tag_sram/macro_mem[0].i_ram 704.7 323.42 R0
placeInstance i_cache_subsystem/i_nbdcache/valid_dirty_sram/macro_mem[0].i_ram 704.7 609.42 R0
placeInstance i_cache_subsystem/i_icache/sram_block[0].data_sram/macro_mem[7].i_ram 762.27 895.42 MY
placeInstance i_cache_subsystem/i_icache/sram_block[1].data_sram/macro_mem[7].i_ram 762.27 762.42 MY
placeInstance i_cache_subsystem/i_icache/sram_block[2].data_sram/macro_mem[7].i_ram 762.27 456.42 MY
placeInstance i_cache_subsystem/i_icache/sram_block[3].data_sram/macro_mem[7].i_ram 762.27 323.42 MY
verify_drc -limit -1