<spanclass="c8">Proxy Cost Computation in Circuit Training</span>
</p>
<pclass="c0">
<spanclass="c8">
</span>
</p>
<pclass="c6">
<spanclass="c5">In Circuit Training, </span>
<spanclass="c5 c13">p</span>
<spanclass="c5 c13">roxy cost</span>
<spanclass="c7 c5"> is the weighted sum of wirelength, density, and congestion costs. It is used to determine the overall quality of the macro placement solution. </span>
</p>
<pclass="c0">
<spanclass="c7 c5">
</span>
</p>
<pclass="c6">
<imgsrc="images/image1.png">
<imgsrc="images/image2.png">
</p>
<pclass="c6">
<spanclass="c5">Where </span>
<spanclass="c5">w</span>
<spanclass="c4">wirelength</span>
<spanclass="c5">, </span>
<spanclass="c5">w</span>
<spanclass="c4">density</span>
<spanclass="c5"> and </span>
<spanclass="c5">w</span>
<spanclass="c4">congestion</span>
<spanclass="c5"> are the weights. From the </span>
<spanclass="c14 c16 c19">
<aclass="c11"href="https://www.google.com/url?q=https://github.com/google-research/circuit_training/blob/9e7097fa0c2a82030f43b298259941fc8ca6b7ae/circuit_training/environment/environment.py%23L61-L65&sa=D&source=editors&ust=1663567364363272&usg=AOvVaw0S9jYHB_cGUS8EoaQ63Wwq">Circuit Training repo</a>
</span>
<spanclass="c5">, we found that w</span>
<spanclass="c4">wirelength</span>
<spanclass="c5">=1, </span>
<spanclass="c5">w</span>
<spanclass="c4">density</span>
<spanclass="c5">= 1, and </span>
<spanclass="c5">w</span>
<spanclass="c4">congestion</span>
<spanclass="c5">= 0.5. From communication with Google engineers, we learned that in their internal flow, they use w</span>
<spanclass="c4">wirelength</span>
<spanclass="c5">=1, </span>
<spanclass="c5">w</span>
<spanclass="c4">density</span>
<spanclass="c5">= 0.5, and </span>
<spanclass="c5">w</span>
<spanclass="c4">congestion</span>
<spanclass="c7 c5">= 0.5. </span>
</p>
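<p class="c6"><span class="c1">As a minimal sketch, the weighted sum described above can be written as follows. The function name proxy_cost and the sample cost values are hypothetical; the default weights are the ones we found in the Circuit Training repo, and the second call uses the internal-flow density weight quoted above.</span></p>

```python
def proxy_cost(wirelength_cost, density_cost, congestion_cost,
               w_wirelength=1.0, w_density=1.0, w_congestion=0.5):
    """Weighted sum of the three cost terms (defaults: public-repo weights)."""
    return (w_wirelength * wirelength_cost
            + w_density * density_cost
            + w_congestion * congestion_cost)


# Illustrative (made-up) cost values, evaluated with both weight settings.
repo_cost = proxy_cost(0.05, 0.6, 0.9)
internal_cost = proxy_cost(0.05, 0.6, 0.9, w_density=0.5)
```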
<pclass="c0">
<spanclass="c7 c5">
</span>
</p>
<pclass="c6">
<spanclass="c5">The Circuit Training repo provides the plc_wrapper_main binary to compute these cost functions. No detailed description or open-source implementation of these cost functions is available. With feedback and confirmation from Google engineers, we have implemented all three cost functions; the source code is available </span>
<spanclass="c7 c5">The wirelength cost function depends on the half-perimeter wirelength (HPWL) of each net's bounding box. We therefore first describe how to compute the HPWL of a net, and then how to compute the wirelength cost.</span>
</p>
<pclass="c0">
<spanclass="c7 c5">
</span>
</p>
<pclass="c6">
<spanclass="c20 c3">Procedure to compute net HPWL: </span>
<spanclass="c5">A protobuf netlist consists of different types of nodes. The possible node types are macro, standard cell, macro pin and port. A net consists of one source node and one or more sink nodes. Only standard cells, macro pins and ports can be the source or sink nodes of a net. In the following wirelength cost computation procedure, we use the term </span>
<spanclass="c5 c13">net weight</span>
<spanclass="c5">,</span>
<spanclass="c5"> which is the weight of the </span>
<spanclass="c14 c5">source node</span>
<spanclass="c7 c5"> of the net. This weight indicates the total number of connections between the source and each sink node. </span>
</p>
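<p class="c6"><span class="c1">The following is a minimal sketch of the standard bounding-box HPWL of a net, scaled by the net weight. The function names and the (x, y) tuple representation of pin locations are assumptions for illustration; the normalization by canvas size used in the wirelength cost is not included here.</span></p>

```python
def net_hpwl(pins):
    """Half-perimeter of the bounding box enclosing all (x, y) pin locations."""
    xs = [x for x, _ in pins]
    ys = [y for _, y in pins]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))


def weighted_net_hpwl(source, sinks, net_weight=1.0):
    """HPWL over the source and all sinks, multiplied by the net weight."""
    return net_weight * net_hpwl([source] + list(sinks))
```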
<pclass="c0">
<spanclass="c7 c5">
</span>
</p>
<pclass="c6">
<spanclass="c3 c20">Procedure to compute wirelength cost:</span>
<spanclass="c7 c5">In the above procedure, canvas_height is the height of the canvas and canvas_width is the width of the canvas.</span>
</p>
<pclass="c0">
<spanclass="c7 c5">
</span>
</p>
<pclass="c6">
<spanclass="c3">Density cost computation:</span>
</p>
<pclass="c6">
<spanclass="c1">The density cost function depends on the gridcell density. We therefore first describe how to compute the gridcell density, and then how to compute the density cost.</span>
</p>
<pclass="c0">
<spanclass="c1">
</span>
</p>
<pclass="c6">
<spanclass="c1">The density of gridcell (i, j) is the ratio of the total overlap area (the common area between a node and the gridcell) of all standard cell and macro nodes with gridcell (i, j) to the area of the gridcell.</span>
</p>
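<p class="c6"><span class="c1">A minimal sketch of this ratio, assuming that nodes and gridcells are axis-aligned rectangles given as (lx, ly, ux, uy) tuples. The function names and the rectangle representation are hypothetical.</span></p>

```python
def overlap_area(a, b):
    """Overlap area of two axis-aligned rectangles (lx, ly, ux, uy)."""
    width = min(a[2], b[2]) - max(a[0], b[0])
    height = min(a[3], b[3]) - max(a[1], b[1])
    return max(width, 0.0) * max(height, 0.0)


def gridcell_density(cell, nodes):
    """Sum of node overlap areas with the gridcell, divided by the gridcell area."""
    cell_area = (cell[2] - cell[0]) * (cell[3] - cell[1])
    return sum(overlap_area(cell, node) for node in nodes) / cell_area
```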
<pclass="c0">
<spanclass="c1">
</span>
</p>
<pclass="c6">
<spanclass="c8">Procedure to compute density cost:</span>
<spanclass="c1">Density cost = (average density of top k densest gridcells) * 0.5.</span>
</li>
</ol>
<pclass="c0">
<spanclass="c8">
</span>
</p>
<pclass="c6">
<span>Note that 0.5 is not the “weight” of this cost function, but an additional factor applied on top of the weight. Google engineers informed us that “the 0.5 is there to correct the </span>
<spanclass="c14 c16">
<aclass="c11"href="https://www.google.com/url?q=https://github.com/google-research/circuit_training/blob/9e7097fa0c2a82030f43b298259941fc8ca6b7ae/circuit_training/grouping/grouping.py%23L370&sa=D&source=editors&ust=1663567364367750&usg=AOvVaw2-k4jjfubAAwD7CZGZeyft">bloating of the std cell clusters</a>
</span>
<spanclass="c1">”.</span>
</p>
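<p class="c6"><span class="c1">The density cost formula above can be sketched as follows. The value of k is passed in as a parameter because it is not specified here; the function name and the flat list of gridcell densities are assumptions for illustration.</span></p>

```python
def density_cost(densities, k):
    """0.5 times the average density of the k densest gridcells."""
    if not densities or k <= 0:
        return 0.0
    top_k = sorted(densities, reverse=True)[:k]
    # The 0.5 factor is separate from the proxy-cost weight of the density term.
    return 0.5 * sum(top_k) / len(top_k)
```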
<pclass="c0">
<spanclass="c8">
</span>
</p>
<aid="kix.ke5yvwxwz23v">
</a>
<pclass="c6">
<spanclass="c18">Congestion cost </span>
<spanclass="c18">computation</span>
<spanclass="c8">:</span>
</p>
<pclass="c6">
<spanclass="c1">We divide the congestion cost computation into six sub-stages:</span>
<aclass="c11"href="#kix.vz68r6b84tl6">Compute horizontal and vertical congestion of each grid due to net routing.</a>
</span>
</li>
<liclass="c2 li-bullet-0">
<spanclass="c14 c16">
<aclass="c11"href="#kix.2vjgmooqq2ri">Apply smoothing only to grid congestion due to net routing.</a>
</span>
</li>
<liclass="c2 li-bullet-0">
<spanclass="c14 c16">
<aclass="c11"href="#kix.cmfsmmjct5gp">Compute congestion of each grid due to macros. When a module overlaps with multiple gridcells, if any part of the module partially overlaps with the gridcell (either vertically, or horizontally), we set the top row (if vertical) or right column (if horizontal) to 0. </a>
<spanclass="c1">Congestion due to two-pin nets.</span>
</li>
<liclass="c2 li-bullet-0">
<span>Congestion due to three-pin nets.</span>
</li>
<liclass="c2 li-bullet-0">
<spanclass="c1">Congestion due to multi-pin nets where the number of pins is greater than three.</span>
</li>
</ol>
<pclass="c0">
<spanclass="c1">
</span>
</p>
<pclass="c6">
<span>A grid location (i, j) is the intersection of the i</span>
<spanclass="c22">th</span>
<span> column with the j</span>
<spanclass="c22">th</span>
<spanclass="c1"> row.</span>
</p>
<pclass="c0">
<spanclass="c1">
</span>
</p>
<pclass="c6">
<spanclass="c1">For these three problems, we consider that the horizontal routing cost due to a net segment from grid (i, j) to grid (i+1, j) applies only to grid (i, j). Similarly, the vertical routing cost due to a net segment from grid (i, j) to grid (i, j+1) applies only to grid (i, j). The direction of the net does not matter. </span>
</p>
<pclass="c0">
<spanclass="c1">
</span>
</p>
<pclass="c6">
<spanclass="c1">Now we compute the congestion due to different nets:</span>
</p>
<pclass="c0">
<spanclass="c1">
</span>
</p>
<pclass="c6">
<spanclass="c21 c14 c13">Congestion due to two-pin nets:</span>
</p>
<pclass="c6">
<spanclass="c1">Two-pin net routing depends on the source and sink node. Consider </span>
<spanclass="c1">Add vertical congestion cost (considering weight w) due to this net to grids from (i2, jmin) to (i2, jmax - 1).</span>
</li>
</ol>
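<p class="c6"><span class="c1">A sketch of two-pin routing under the edge convention stated earlier, where a segment is charged to the lower-indexed gridcell. Only the vertical step is spelled out in the text above; the horizontal step along the source row j1, the [row][column] indexing of the congestion maps, and the function name are assumptions for illustration.</span></p>

```python
def route_two_pin(h_cong, v_cong, source, sink, weight=1.0):
    """Add the congestion of one two-pin net to the maps (indexed [row j][column i])."""
    (i1, j1), (i2, j2) = source, sink
    # Assumed horizontal step: run along the source row from imin to imax - 1.
    for i in range(min(i1, i2), max(i1, i2)):
        h_cong[j1][i] += weight
    # Vertical step: run along the sink column from jmin to jmax - 1.
    for j in range(min(j1, j2), max(j1, j2)):
        v_cong[j][i2] += weight
```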
<pclass="c0">
<spanclass="c1">
</span>
</p>
<pclass="c6">
<spanclass="c1">In the following figure, P2 is the source pin and P1 is the sink pin of the net. When the arrow crosses the top edge of a gridcell, it contributes to the vertical congestion cost of that gridcell; when it crosses the right edge, it contributes to the horizontal congestion cost of that gridcell.</span>
<spanclass="c14 c13 c21">Congestion due to three-pin nets:</span>
</p>
<pclass="c6">
<spanclass="c1">Congestion cost of three-pin nets does not change when the locations of the pins are interchanged.</span>
</p>
<pclass="c0">
<spanclass="c1">
</span>
</p>
<pclass="c6">
<spanclass="c1">In the following figure, P3 is the source and P1 and P2 are the sinks. We see that interchanging the position does not change the route.</span>
<spanclass="c1">In the functions below, all congestion cost computations take the net weight into account.</span>
</p>
<pclass="c0">
<spanclass="c1">
</span>
</p>
<pclass="c6">
<spanclass="c1">We first describe these two functions, and then describe how the congestion due to three-pin nets is computed.</span>
</p>
<pclass="c6">
<spanclass="c8">Congestion cost update using L_routing:</span>
</p>
<pclass="c6">
<spanclass="c1">The inputs are the grid locations of the three pins and the net weight. We consider the pin grids to be (i1, j1), (i2, j2) and (i3, j3), where i1 < i2 < i3 and either (j1 < j2 < j3) or (j1 > j2 > j3).</span>
<spanclass="c1">Add horizontal congestion cost due to the net to grids from (i1, j1) to (i2-1, j1)</span>
</li>
<liclass="c2 li-bullet-0">
<spanclass="c1">Add horizontal congestion cost due to the net to grids from (i2, j2) to (i3-1, j2)</span>
</li>
<liclass="c2 li-bullet-0">
<spanclass="c1">Add vertical congestion cost due to the net to grids from (i2, min(j1, j2)) to (i2, max(j1, j2) - 1).</span>
</li>
<liclass="c2 li-bullet-0">
<spanclass="c1">Add vertical congestion cost due to the net to grids from (i3, min(j2, j3)) to (i3, max(j2, j3) - 1).</span>
</li>
</ol>
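<p class="c6"><span class="c1">The four L_routing steps listed above can be sketched as follows. The pins are assumed to be already sorted by column and to satisfy the precondition on the rows; the [row][column] indexing of the congestion maps and the function name are assumptions for illustration.</span></p>

```python
def l_routing(h_cong, v_cong, pins, weight=1.0):
    """Add the congestion of one three-pin net routed with two L shapes."""
    (i1, j1), (i2, j2), (i3, j3) = pins
    # Horizontal segments: (i1, j1) to (i2 - 1, j1), then (i2, j2) to (i3 - 1, j2).
    for i in range(i1, i2):
        h_cong[j1][i] += weight
    for i in range(i2, i3):
        h_cong[j2][i] += weight
    # Vertical segments in column i2 (rows between j1 and j2) and column i3 (rows between j2 and j3).
    for j in range(min(j1, j2), max(j1, j2)):
        v_cong[j][i2] += weight
    for j in range(min(j2, j3), max(j2, j3)):
        v_cong[j][i3] += weight
```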
<pclass="c0">
<spanclass="c1">
</span>
</p>
<pclass="c6">
<spanclass="c18">Congestion cost update using T_routing</span>
<spanclass="c1">:</span>
</p>
<pclass="c6">
<spanclass="c1">The inputs are the grid locations of the three pins and the net weight. We consider the pin grids to be (i1, j1), (i2, j2) and (i3, j3), where (j1 <= j2 <= j3) or (j1 >= j2 >= j3).</span>
<spanclass="c1">Sort the pins by column. After sorting, the pin locations are (i1, j1), (i2, j2) and (i3, j3), with i1 <= i2 <= i3.</span>
</li>
<liclass="c2 li-bullet-0">
<spanclass="c1"> If i1 < i2 and i2 < i3 and min(j1, j3) < j2 and max(j1, j3) > j2:</span>
<spanclass="c1">Add horizontal congestion cost due to the net to grids from (i1, j1) to (i2 -1, j1)</span>
</li>
<liclass="c6 c10 li-bullet-0">
<spanclass="c1">Add horizontal congestion cost due to the net to grids from (i2, j2) to (i3 -1, j2)</span>
</li>
<liclass="c6 c10 li-bullet-0">
<spanclass="c1">Add vertical congestion cost due to the net to grids from (i2, min(j2, j3)) to (i2, max(j2, j3) - 1).</span>
</li>
<liclass="c6 c10 li-bullet-0">
<spanclass="c1">Return</span>
</li>
</ol>
<olclass="c12 lst-kix_nurf0486bu14-0"start="5">
<liclass="c2 li-bullet-0">
<spanclass="c1">Update congestion cost using T_routing.</span>
</li>
</ol>
<pclass="c0">
<spanclass="c1">
</span>
</p>
<pclass="c0">
<spanclass="c1">
</span>
</p>
<pclass="c6">
<spanclass="c1">The following four figures represent the four cases mentioned in the above procedure from point two to point five.</span>
</p>
<pclass="c0">
<spanclass="c1">
</span>
</p>
<pclass="c6">
<spanclass="c1"> Figure corresponding to point two. Figure corresponding to point three.</span>
<spanclass="c1"> Figure corresponding to point four. Figure corresponding to point five.</span>
<spanclass="c1">If not out-of-bound, take k gridcells on each side (left/right), divide the current cell entry by the total number of gridcells taken and add the value to the corresponding gridcell.</span>
<spanclass="c1">If not out-of-bound, take k gridcells on each side (up/down), divide the current cell entry by the total number of gridcells taken and add the value to the corresponding gridcell.</span>
</li>
</ul>
<ulclass="c12 lst-kix_di30h54lh28-1">
<liclass="c6 c10 li-bullet-0">
<span>For example, suppose that </span>
<spanclass="c13">smoothing </span>
<spanclass="c1">= 2 (default value), and we apply it to horizontal grid congestion in four rows of gridcells with respect to the red gridcell highlighted in each row. Then, the blue gridcells in each row show the numbers of gridcells that we divide by (respectively from the top row to the bottom row: 3, 4, 5, 4) when smoothing congestion.</span>
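<p class="c6"><span class="c1">A minimal sketch of the horizontal smoothing step for one row of gridcells, assuming that the current gridcell itself is counted in the window, which matches the counts 3, 4, 5, 4 in the example. The function name is hypothetical; vertical smoothing would apply the same logic along a column.</span></p>

```python
def smooth_row(row, k=2):
    """Spread each entry evenly over itself and up to k in-bounds neighbors per side."""
    smoothed = [0.0] * len(row)
    for col, value in enumerate(row):
        lo = max(0, col - k)              # clip the window at the left edge
        hi = min(len(row) - 1, col + k)   # clip the window at the right edge
        share = value / (hi - lo + 1)     # divide by the number of gridcells taken
        for neighbor in range(lo, hi + 1):
            smoothed[neighbor] += share
    return smoothed
```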
<spanclass="c21 c14 c23">Computation for Macro Congestion:</span>
</p>
<ulclass="c12 lst-kix_8wtulwd7wreh-0 start">
<liclass="c2 li-bullet-0">
<spanclass="c1">For each soft macro and hard macro:</span>
</li>
</ul>
<ulclass="c12 lst-kix_8wtulwd7wreh-1 start">
<liclass="c6 c10 li-bullet-0">
<spanclass="c1">For each gridcell it overlaps with:</span>
</li>
</ul>
<ulclass="c12 lst-kix_8wtulwd7wreh-2 start">
<liclass="c6 c9 li-bullet-0">
<spanclass="c1">For both horizontal and vertical macro routing congestion map:</span>
</li>
</ul>
<ulclass="c12 lst-kix_8wtulwd7wreh-3 start">
<liclass="c6 c15 li-bullet-0">
<spanclass="c1">Find the dimension of overlap, multiply by macro routing allocation</span>
</li>
<liclass="c6 c15 li-bullet-0">
<spanclass="c1">Divide by (the grid_cell dimension multiplied by routing per micron)</span>
</li>
<liclass="c6 c15 li-bullet-0">
<spanclass="c1">Add to the corresponding gridcell</span>
</li>
</ul>
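<p class="c6"><span class="c1">The loop above can be sketched as follows. All names and the (lx, ly, ux, uy) macro representation are hypothetical. The sketch assumes that the vertical map uses the overlap width with the vertical routing allocation, and that the horizontal map uses the overlap height with the horizontal routing allocation. It does not apply the partial-overlap rule that zeroes the top row or right column.</span></p>

```python
def add_macro_congestion(h_cong, v_cong, macro, grid_w, grid_h,
                         h_alloc, v_alloc, h_routes_per_micron, v_routes_per_micron):
    """Add one macro's routing blockage to the maps (indexed [row][column])."""
    lx, ly, ux, uy = macro
    for row in range(len(h_cong)):
        for col in range(len(h_cong[0])):
            # Overlap of the macro with gridcell (col, row) in each dimension.
            x_overlap = min(ux, (col + 1) * grid_w) - max(lx, col * grid_w)
            y_overlap = min(uy, (row + 1) * grid_h) - max(ly, row * grid_h)
            if x_overlap <= 0 or y_overlap <= 0:
                continue
            # Assumption: vertical tracks are blocked in proportion to overlap width,
            # horizontal tracks in proportion to overlap height.
            v_cong[row][col] += x_overlap * v_alloc / (grid_w * v_routes_per_micron)
            h_cong[row][col] += y_overlap * h_alloc / (grid_h * h_routes_per_micron)
```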
<ulclass="c12 lst-kix_8wtulwd7wreh-0">
<liclass="c2 li-bullet-0">
<spanclass="c1">Example:</span>
</li>
</ul>
<ulclass="c12 lst-kix_8wtulwd7wreh-1 start">
<liclass="c6 c10 li-bullet-0">
<span>Given a </span>
<span>single hard macro HM_1</span>
<spanclass="c1"> (pink rectangle in the figure below), we have two pins instantiated on the top-right and bottom-left, driven by the ports at “P_1” located at the bottom-left of the canvas.</span>
<spanclass="c1">Whenever there are gridcells partially overlapped, whether in horizontal or vertical direction, we set the vertical congestion of the top gridcells to 0 (if partially overlapped vertically) and we set the horizontal congestion of the right gridcells to 0 (if partially overlapped horizontally).</span>
</li>
</ul>
<pclass="c0">
<spanclass="c1">
</span>
</p>
<aid="id.mv122dawrylu">
</a>
<pclass="c6">
<spanclass="c21 c14 c23">Computation of the final congestion cost:</span>
</p>
<ulclass="c12 lst-kix_y42njsxzs2xp-0 start">
<liclass="c2 li-bullet-0">
<spanclass="c1">Add the macro allocation congestion and the net routing congestion together for both the vertical and horizontal congestion maps.</span>
</li>
<liclass="c2 li-bullet-0">
<spanclass="c1">Concatenate the vertical and horizontal congestion maps.</span>
</li>
<liclass="c2 li-bullet-0">
<span>Take the top </span>
<spanclass="c18">5%</span>
<span> of the most congested gridcells </span>
<spanclass="c18">in the concatenation</span>
<spanclass="c1">, and average them out to get the final congestion cost. </span>
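<p class="c6"><span class="c1">The three steps above can be sketched as follows. The function name, the nested-list map representation, and the rounding of the top 5% count (truncation, with at least one gridcell) are assumptions for illustration.</span></p>

```python
def congestion_cost(h_net, v_net, h_macro, v_macro, top_fraction=0.05):
    """Average of the most congested entries over both combined congestion maps."""
    combined = []
    # Add macro and net congestion per gridcell, then concatenate H and V maps.
    for net_map, macro_map in ((h_net, h_macro), (v_net, v_macro)):
        for net_row, macro_row in zip(net_map, macro_map):
            combined.extend(n + m for n, m in zip(net_row, macro_row))
    # Assumed rounding: truncate, but always keep at least one gridcell.
    count = max(1, int(len(combined) * top_fraction))
    top = sorted(combined, reverse=True)[:count]
    return sum(top) / count
```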
**MacroPlacement** is an open, transparent effort to provide a public, baseline implementation of [Google Brain's Circuit Training](https://github.com/google-research/circuit_training) (Morpheus) deep RL-based placement method. We will provide (1) testcases in open enablements, along with multiple EDA tool flows; (2) implementations of missing or binarized elements of Circuit Training; (3) reproducible example macro placement solutions produced by our implementation; and (4) post-routing results obtained by full completion of the synthesis-place-and-route flow using both proprietary and open-source tools.
## **Important links**
- In [our progress](https://tilos-ai-institute.github.io/MacroPlacement/Docs/OurProgress/) documentation you can find the latest updates.
- The [proxy cost](https://tilos-ai-institute.github.io/MacroPlacement/Docs/ProxyCost/) documentation contains implementation details of the wirelength, density and congestion costs that [Circuit Training](https://github.com/google-research/circuit_training) uses.
## **Table of Contents**
<!-- - [Reproducible Example Solutions](#reproducible-example-solutions) -->
- [Testcases](#testcases) contains open-source designs such as Ariane, MemPool and NVDLA.
- [Enablements](#enablements) contains PDKs for open-source enablements such as NanGate45, ASAP7 and SKY130HD with FakeStack. Memories required by the designs are also included.
- [Flows](#flows) contains tool setups and runscripts for both proprietary and open-source SP&R tools such as Cadence Genus/Innovus and OpenROAD.
- [Code Elements](#code-elements) contains implementations of engines such as Clustering, Grouping, Gridding and Format translators required by the Circuit Training flow.
- [Baseline for Circuit Training](#baseline-for-circuit-training) provides a competitive baseline for [Google Brain's Circuit Training](https://github.com/google-research/circuit_training).