\subsection{Environment setup}
\label{ssec:env}
Figure~\ref{fig:game} shows the entire environment used in this study,
i.e., a commonly used referential game. Roughly, the referential game requires the speaker and
listener to work cooperatively to accomplish a certain task.
In this paper, the task is xxxx.
\textbf{Game rules.} In our referential game, agents follow the rules below
to finish the game in a cooperative manner. In each round, once it receives an
input object $t$, Speaker $S$ speaks a symbol sequence $s$ to Listener $L$;
Listener $L$ reconstructs the predicted result $\hat{t}$ based on the received
sequence $s$; if $t=\hat{t}$, the agents win this game and receive positive rewards
...
...
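For concreteness, one round of this game can be sketched as follows. This is only an
illustration under our own naming (e.g., \texttt{play\_round}); the reward assigned to a
failed round is not specified above, so we simply use zero here.

\begin{verbatim}
# Sketch of one round of the referential game (illustrative names only).
# speak(t) returns the symbol sequence s = (s_0, s_1) for the input object t;
# listen(s) returns the reconstructed object \hat{t}.
def play_round(speak, listen, t, pos_reward=1.0, fail_reward=0.0):
    s = speak(t)                       # Speaker S emits the symbol sequence s
    t_hat = listen(s)                  # Listener L reconstructs \hat{t} from s
    win = (t_hat == t)                 # the agents win iff t = \hat{t}
    reward = pos_reward if win else fail_reward
    return s, t_hat, reward
\end{verbatim}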
\textbf{Speaker.} The Speaker $S$ is constructed as a three-layer neural
network. It processes the input object $t$ with a fully-connected
layer to obtain the hidden layer $h^s$, which is further processed with fully-connected layers to obtain the output
layer. The output layer gives the probability distribution of symbols
given the input object $t$, i.e., $o_i^{s}=P(s_i|t)$, $i\in\{0,1\}$. \note{The final
readout symbols are sampled from this probability distribution.}
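A possible instantiation of this speaker is sketched below. It is only a sketch under our own
assumptions: a one-hot object encoding, a \texttt{tanh} hidden layer, and one fully-connected
output head per symbol; the class name \texttt{Speaker}, layer sizes, and optimizer-free
interface are illustrative rather than the exact implementation.

\begin{verbatim}
import torch
import torch.nn as nn
from torch.distributions import Categorical

class Speaker(nn.Module):
    """Three-layer speaker: object t -> hidden h^s -> distributions P(s_i | t)."""
    def __init__(self, n_objects, hidden_dim, vocab_size):
        super().__init__()
        self.fc_in = nn.Linear(n_objects, hidden_dim)        # input -> hidden layer h^s
        self.heads = nn.ModuleList(                          # one output head per symbol s_0, s_1
            [nn.Linear(hidden_dim, vocab_size) for _ in range(2)])

    def forward(self, t_onehot):
        h = torch.tanh(self.fc_in(t_onehot))                 # hidden layer h^s
        symbols, log_probs = [], []
        for head in self.heads:
            dist = Categorical(logits=head(h))               # o_i^s = P(s_i | t)
            s_i = dist.sample()                              # readout symbol sampled from o_i^s
            symbols.append(s_i)
            log_probs.append(dist.log_prob(s_i))
        return torch.stack(symbols, dim=-1), torch.stack(log_probs, dim=-1)
\end{verbatim}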
\textbf{Listener.} The Listener $L$ is also constructed as a
three-layer neural network. Different from Speaker $S$, which tries to separate the input object into words, $L$ tries to combine the received words to understand their joint meaning. The output layer gives the probability distribution of
the predicted result $\hat{t}$ given the input sequence $s$, i.e., $o^{L}=P(\hat{t}|s_0,s_1)$.
\note{The final readout symbol is sampled from this probability distribution.}
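Correspondingly, the listener can be sketched as below (reusing the imports from the speaker
sketch). Again this is only an assumed instantiation: the two symbols are looked up in an
embedding table, which is equivalent to a bias-free fully-connected layer applied to one-hot
symbols, then concatenated and mapped to a distribution over candidate objects.

\begin{verbatim}
class Listener(nn.Module):
    """Three-layer listener: symbols (s_0, s_1) -> combined hidden -> P(\hat{t} | s_0, s_1)."""
    def __init__(self, vocab_size, hidden_dim, n_objects):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden_dim)       # per-symbol ("word") representation
        self.fc_hidden = nn.Linear(2 * hidden_dim, hidden_dim)  # combine the two word meanings
        self.fc_out = nn.Linear(hidden_dim, n_objects)          # o^L over candidate objects

    def forward(self, symbols):                                 # symbols: integer tensor (..., 2)
        e = self.embed(symbols)                                 # (..., 2, hidden_dim)
        h = torch.tanh(self.fc_hidden(e.flatten(start_dim=-2))) # concatenated word representations
        dist = Categorical(logits=self.fc_out(h))               # o^L = P(\hat{t} | s_0, s_1)
        t_hat = dist.sample()                                   # final readout sampled from o^L
        return t_hat, dist.log_prob(t_hat)
\end{verbatim}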
...
...
To remove all the handcrafted induction as well as to reflect a more realistic
scenario, the agents in this referential game are independent of each other,
without sharing model parameters or architectural connections. As shown in
Algorithm~\ref{al:learning}, we train the separate Speaker $S$ and Listener $L$ with
the Stochastic Policy Gradient methodology in a tick-tock manner, i.e., training one
...
...
$\theta_S$, where $\theta_S$ denotes the neural network parameters of Speaker $S$
with the learned output probability distribution $\pi_S$, and $\theta_L$ denotes the
neural network parameters of Listener $L$ with the learned probability distribution $\pi_L$.
Similarly, when training the Listener, the target is set to maximize the
expected reward $J(\theta_S, \theta_L)$ by fixing the parameter $\theta_S$ and
adjusting the parameter $\theta_L$.
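This alternating scheme can be sketched as a plain REINFORCE-style loop. The sketch reflects
our own simplifications (Adam optimizers, a fixed switching period, reward $1$ for success and
$0$ otherwise); it only illustrates the fixing/adjusting pattern, while the gradient estimator
actually used for $J(\theta_S, \theta_L)$ is the one given below.

\begin{verbatim}
import torch.optim as optim

def train_tick_tock(speaker, listener, sample_object, steps=100000,
                    lr=1e-3, switch_every=50):
    """Alternating (tick-tock) training: only one agent is updated at a time."""
    opt_s = optim.Adam(speaker.parameters(), lr=lr)
    opt_l = optim.Adam(listener.parameters(), lr=lr)
    for step in range(steps):
        update_speaker = (step // switch_every) % 2 == 0   # tick: S, tock: L
        t = sample_object()                                # one-hot input object
        symbols, logp_s = speaker(t)
        t_hat, logp_l = listener(symbols)
        reward = (t_hat == t.argmax(dim=-1)).float()       # positive reward iff t = \hat{t}
        if update_speaker:                                 # fix theta_L, adjust theta_S
            loss = -(reward * logp_s.sum(dim=-1)).mean()
            opt_s.zero_grad(); loss.backward(); opt_s.step()
        else:                                              # fix theta_S, adjust theta_L
            loss = -(reward * logp_l).mean()
            opt_l.zero_grad(); loss.backward(); opt_l.step()
\end{verbatim}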
Additionally, to avoid handcrafted induction on the emergent language, we only
use the predicted result $\hat{t}$ of the listener agent as the
evidence for whether to give positive rewards. Then, the gradients of the
expected reward $J(\theta_S, \theta_L)$ can be calculated as follows: