Commit 66b6f484 by Xing

Update theory.tex

parent bf1172ac
@@ -78,7 +78,7 @@ symbols $\hat{t}$ with given input sequence $s$, i.e, $o^{L}=P(\hat{t}|s_0,s_1)$
To remove all the handcrafted induction as well as for a more realistic
scenario, agents for this referential game are independent of each other,
-without sharing model parameters or architectural connections. As shown in
+with no sharing model parameters or architectural connections. As shown in
Algorithm~\ref{al:learning}, we train the separate Speaker $S$ and Listener $L$ with
Stochastic Policy Gradient methodology in a tick-tock manner, i.e., training one
agent while keeping the other one fixed. Roughly, when training the Speaker, the
......