Commit 103e85ae by Zidong Du

~

parent ef75968e
......@@ -19,7 +19,7 @@ is around 0.8 when $h_{size}\le 20$; MIS significantly decreases to 0.75 when
$h_{size}$ increases from 20 to 40; MIS further reduces to 0.7 when $h_{size}$
increases from 40 to 100.
For different vocabulary sizes, the MIS exhibits a
similar behaviour.
similar behavior.
This is because symbols in low-compositional languages carry semantic information
about more concepts. As a result, higher capacity is required to characterize the
complex semantic information for a low-compositional language to emerge.
......@@ -41,7 +41,7 @@ We further breakdown our results to investigate the importance of agent capacity
to the compositionality of symbolic language. Figure~\ref{fig:exp2} reports the
ratio of high compositional symbolic languages among all emerged languages,
with Figure~\ref{fig:exp2} (a) and (b) for $MIS>0.99$ and $MIS>0.9$, respectively. It
cam be observed that the ratio of high compositional symbolic languages
can be observed that the ratio of high compositional symbolic languages
decreases drastically with the increase of $h_{size}$. In particular, when $h_{size}$
is large enough (e.g., $>40$), a high compositional symbolic language can hardly
emerge in a natural referential game.
......@@ -90,19 +90,19 @@ Figure~\ref{fig:bench}.
%\end{figure}
Figure~\ref{fig:exp3} reports the accurcy of Listener, i.e., correctly
predicting the symbols spoken by Speaker ($t=\hat(t)$), which varies with the
Figure~\ref{fig:exp3} reports the accuracy of Listener, i.e., the ratio of correctly
predicted symbols spoken by Speaker ($t=\hat{t}$), which varies with the
training iterations under different agent capacities.
Figure~\ref{fig:exp3} (a) shows that when $h_{size}$ equals 1, the agent capacity is
too low to handle any of the languages. Figure~\ref{fig:exp3} (b) shows that when $h_{size}$
equals 2, the agent can only learn $LA$, whose compositionality (i.e., \emph{MIS})
is the highest among all three languages. Combining these two observations, we can infer that
language with lower compositionality need higher agent capacity to ensure communicating
language with lower compositionality requires higher agent capacity to ensure communicating
successfully (i.e., larger $h_{size}$). Figure~\ref{fig:exp3} (c) to (h) show that the
higher agent capacity cause a faster training process for all three languages, but the
higher agent capacity causes a faster training process for all three languages, but the
improvement varies considerably across languages.
It is obvious that language with lower compostionality also need higher agent
capacity to training faster.
It is evident that a language with lower compositionality also requires higher agent
capacity for faster training.
%In conclude, teaching an artificial language with
......
......@@ -16,13 +16,13 @@ including the environment setup, agent architecture, and training algorithm.
\subsection{Environment setup}
\label{ssec:env}
Figure~\ref{fig:game} shows the entire environment used in this study,
i.e., a common used referential game. Roughly, the referential game requires the speaker and
i.e., a commonly used referential game. Roughly, the referential game requires the speaker and
listener to work cooperatively to accomplish a certain task.
In this paper, the task is xxxx.
\textbf{Game rules.} In our referential game, agents follow the rules below
to finish the game in a cooperatively manner. In each round, once received an
to finish the game in a cooperative manner. In each round, upon receiving an
input object $t$, Speaker $S$ speaks a symbol sequence $s$ to Listener $L$;
Listener $L$ reconstructs the predicted result $\hat{t}$ based on the heard
sequence $s$; if $t=\hat{t}$, the agents win this game and receive positive rewards
......@@ -59,18 +59,14 @@ including the Speaker $S$ and Listener $L$.
\textbf{Speaker.} Regarding the Speaker $S$, it is constructed as a three-layer neural
network. The Speaker $S$ processes the input object $t$ with a fully-connected
layer to obtain the hidden layer $h^s$, which is split into two sub-layers. Each
sub-layer is further processed with fully-connected layers to obtain the output
layer to obtain the hidden layer $h^s$, which is further processed with fully-connected layers to obtain the output
layer. The output layer results indicate the probability distribution of symbols
given the input object $t$, i.e., $o_i^{s}=P(s_i|t)$, $i\in\{0,1\}$. \note{The final
readout symbols are sampled from this probability distribution.}
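A minimal PyTorch sketch of such a Speaker is given below; the one-hot input encoding, the ReLU activation, and the layer sizes are assumptions, while the two softmax heads follow the $o_i^{s}=P(s_i|t)$ description above.

```python
# Sketch of the three-layer Speaker (layer sizes and activation are assumptions).
import torch
import torch.nn as nn

class Speaker(nn.Module):
    def __init__(self, num_objects=10, h_size=20, vocab_size=10):
        super().__init__()
        self.fc_in = nn.Linear(num_objects, h_size)            # input object -> hidden h^s
        self.heads = nn.ModuleList(
            [nn.Linear(h_size, vocab_size) for _ in range(2)]  # hidden -> o_i^s, i in {0, 1}
        )

    def forward(self, t_onehot):
        h = torch.relu(self.fc_in(t_onehot))
        # o_i^s = P(s_i | t): one categorical distribution per symbol position
        probs = [torch.softmax(head(h), dim=-1) for head in self.heads]
        # the readout symbols are sampled from these distributions
        symbols = [torch.multinomial(p, 1).squeeze(-1) for p in probs]
        return symbols, probs
```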
\textbf{Listener.} Regarding the Listener $L$, it is constructed as a
three-layer neural network, too. Different from Speaker $S$ that split the
hidden layer into two sub-layers, $L$ concatenates two sub-layers into one
output layer. The output layer results are also the probability distribution of
three-layer neural network, too. Different from Speaker $S$, which tries to separate the input object into words, $L$ tries to concatenate words to understand the combined meaning. The output layer results are also the probability distribution of
symbols $\hat{t}$ given the input sequence $s$, i.e., $o^{L}=P(\hat{t}|s_0,s_1)$.
\note{The final readout symbol is sampled from this probability distribution.}
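A matching sketch of the Listener is shown below; embedding each heard symbol and concatenating the two embeddings before the output layer is our reading of the description, and the sizes are again assumptions.

```python
# Sketch of the three-layer Listener (embedding-then-concatenate is an assumption).
import torch
import torch.nn as nn

class Listener(nn.Module):
    def __init__(self, vocab_size=10, h_size=20, num_objects=10):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, h_size)      # one sub-layer per heard symbol
        self.fc_out = nn.Linear(2 * h_size, num_objects)   # concatenation -> output layer

    def forward(self, s0, s1):
        h = torch.cat([self.embed(s0), self.embed(s1)], dim=-1)
        probs = torch.softmax(self.fc_out(torch.relu(h)), dim=-1)  # o^L = P(t_hat | s0, s1)
        t_hat = torch.multinomial(probs, 1).squeeze(-1)             # readout sampled from o^L
        return t_hat, probs
```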
......@@ -79,7 +75,7 @@ symbols $\hat{t}$ with given input sequence $s$, i.e, $o^{L}=P(\hat{t}|s_0,s_1)$
To remove all the handcrafted induction as well as for a more realistic
scenario, agents for this referential game are independent to each other,
scenario, agents for this referential game are independent of each other,
without sharing model parameters or architectural connections. As shown in
Algorithm~\ref{al:learning}, we train the separate Speaker $S$ and Listener $L$ with
Stochastic Policy Gradient methodology in a tick-tock manner, i.e., training one
......@@ -90,13 +86,13 @@ $\theta_S$, where $\theta_S$ is the neural network parameters of Speaker $S$
with learned output probability distribution $\pi_S$, and $\theta_L$ denotes the
neural network parameters of Listener with learned probability distribution $\pi_L$.
Similarly, when training the Listener, the target is set to maximize the
expected reward$ J(theta_S, theta_L)$ by fixing the parameter $\theta_S$ and
expected reward $J(\theta_S, \theta_L)$ by fixing the parameter $\theta_S$ and
adjusting the parameter $\theta_L$.
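For concreteness, a natural form of this objective, consistent with the gradient given below, is the expected reward under both policies (the explicit expectation written here is our reconstruction):
\begin{align}
J(\theta_S, \theta_L) = \mathbb{E}_{\pi^S, \pi^L} \left[ R(\hat{t}, t) \right],
\quad s_0, s_1 \sim \pi^S(\cdot \mid t), \; \hat{t} \sim \pi^L(\cdot \mid s_0, s_1).
\end{align}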
Additionally, to avoid the handcrafted induction on emergent language, we only
use the predicted result $\hat{t}$ of the listener agent as the
evidence of whether to give positive rewards. Then, the gradients of the
expected reward $ J(theta_S, theta_L)$ can be calculated as follows:
expected reward $J(\theta_S, \theta_L)$ can be calculated as follows:
\begin{align}
\nabla_{\theta^S} J &= \mathbb{E}_{\pi^S, \pi^L} \left[ R(\hat{t}, t) \cdot
\nabla_{\theta^S} \log{\pi^S(s_0, s_1 | t)} \right] \\
......
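As a rough illustration of the tick-tock training described above, the following REINFORCE-style sketch trains one agent per step while the other is held fixed; it reuses the hypothetical Speaker/Listener sketches from the architecture section, and the optimizer setup, reward values, and single-sample updates are assumptions.

```python
# Sketch of one tick-tock Stochastic Policy Gradient step
# (relies on the hypothetical Speaker/Listener sketches above).
import torch

def policy_gradient_step(speaker, listener, opt_s, opt_l, t, t_onehot, train_speaker):
    symbols, s_probs = speaker(t_onehot)               # Speaker samples (s0, s1)
    t_hat, l_probs = listener(symbols[0], symbols[1])  # Listener predicts t_hat
    reward = 1.0 if int(t_hat) == int(t) else -1.0     # reward depends only on t == t_hat

    if train_speaker:
        # grad_{theta_S} J ~= R(t_hat, t) * grad log pi_S(s0, s1 | t), with theta_L fixed
        log_pi_s = sum(torch.log(p[s]) for p, s in zip(s_probs, symbols))
        loss = -reward * log_pi_s
        opt_s.zero_grad()
        loss.backward()
        opt_s.step()
    else:
        # grad_{theta_L} J ~= R(t_hat, t) * grad log pi_L(t_hat | s0, s1), with theta_S fixed
        loss = -reward * torch.log(l_probs[t_hat])
        opt_l.zero_grad()
        loss.backward()
        opt_l.step()
    return reward
```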