Commit 228f6965 by YZhao
parents c7940929 013236e0
\section{Experiments}
\label{sec:exp}
%\section{Agent Capacity vs. Compositionality}
%\label{ssec:exp}
\begin{figure}[t]
\centering \includegraphics[width=0.99\columnwidth]{fig/Figure6_Compostionality_of_symbolic_language.pdf}
\caption{Compositionality of symbolic language under different parameters
($[\mu-\sigma,\mu+\sigma]$, where $\mu$ is the mean value and $\sigma$ is
the standard deviation).}
\label{fig:exp1}
\end{figure}
\begin{figure}[t]
\centering \includegraphics[width=0.99\columnwidth]{fig/Figure7_The_ratio_of_high_compositional_language.pdf}
\caption{The ratio of highly compositional languages. (a) $\mathit{MIS}>0.99$. (b)
$\mathit{MIS}>0.9$.}
\label{fig:exp2}
\end{figure}
\begin{figure}[t]
\centering
\includegraphics[width=0.99\columnwidth]{fig/Figure10_p_value.pdf}
\caption{The $\chi^2$ test between high compositionality and agent
capacity. (a) $\mathit{MIS}>0.99$. (b)
$\mathit{MIS}>0.9$.}
\label{fig:exp10}
\end{figure}
\begin{figure}[t]
\centering
\includegraphics[width=0.8\columnwidth]{fig/Figure8_Three_artificial_languages_with_different_MIS.pdf}
\caption{Three pre-defined languages for teaching. (a) LA: high compositionality
($\mathit{MIS}=1$). (b) LB: medium compositionality ($\mathit{MIS}=0.83$). (c) LC: low compositionality ($\mathit{MIS}=0.41$).}
\label{fig:bench}
\end{figure}
We explore the relationship between agent capacity and the compositionality of
symbolic language that emerged in our natural referential game.
For various configurations of
...@@ -27,15 +62,6 @@ emerging high compositional symbolic language.
We further break down our results to investigate the importance of agent capacity
to the compositionality of symbolic language. Figure~\ref{fig:exp2} reports the
ratio of highly compositional symbolic languages among all emerged languages,
...@@ -55,26 +81,21 @@ On other side, agents are enforced to use compositionality to express
more meanings, due to the constraint of low capacity.
Additionally, we perform a $\chi^2$ test to check the statistical
significance of the relationship between high compositionality and agent
capacity. Figure~\ref{fig:exp10} reports the $\chi^2$ test results for
$\mathit{MIS}>0.99$ and $\mathit{MIS}>0.9$ in (a) and (b), respectively. It can be observed that,
for every vocabulary size, the p-value is always less than 0.05, which means
that high compositionality is significantly related to agent
capacity.
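
As a reminder of what such a test computes, the following is a sketch with
hypothetical counts; it does not reproduce the data behind
Figure~\ref{fig:exp10}, and the symbols $O_{ij}$, $E_{ij}$, $N$, $r$ and $c$
are introduced here only for illustration. Assume an $r\times c$ contingency
table whose rows are capacity settings and whose columns record whether an
emerged language is highly compositional. With observed counts $O_{ij}$ and
total $N=\sum_{i,j}O_{ij}$, Pearson's statistic is
\[
\chi^2 = \sum_{i=1}^{r}\sum_{j=1}^{c}\frac{(O_{ij}-E_{ij})^2}{E_{ij}},
\qquad
E_{ij} = \frac{\bigl(\sum_{k}O_{ik}\bigr)\bigl(\sum_{k}O_{kj}\bigr)}{N},
\]
with $(r-1)(c-1)$ degrees of freedom. For instance, if three capacity settings
yielded $30$, $50$ and $70$ highly compositional languages out of $100$ runs
each, then every $E_{ij}=50$ and
\[
\chi^2 = \frac{20^2}{50}+\frac{20^2}{50}+0+0+\frac{20^2}{50}+\frac{20^2}{50} = 32 .
\]
With $2$ degrees of freedom the p-value is $e^{-\chi^2/2}=e^{-16}\approx
1.1\times 10^{-7}$, which is below $0.05$, so independence between capacity and
high compositionality would be rejected for these illustrative counts.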
%\subsection{Breakdown}
%\label{ssec:language}
\begin{figure*}[t]
......
\section{Related works}
\label{sec:relatedwork}
%external environmental factors
Previous works focus on the external environmental factors that impact the
compositionality of emerged symbolic language.
Some significant works studying the effect of external environmental factors on the compositionality of emergent language are summarized in Table~\ref{tab:rel}.
For example, \citet{kirby2015compression} explored how the pressures for expressivity and compressibility lead to structured language.
\citet{kottur-etal-2017-natural} constrained the vocabulary size and the listener's memory to coax compositionality in the emergent language.
\citet{lazaridou2018emergence} showed that the degree of structure found in the input data affects the emergence of the symbolic language.
\citet{li2019ease} studied how one pressure, ease of teaching, impacts the iterative language in the population regime.
\citet{evtimova2018emergent} designed a novel multi-modal scenario, in which the speaker and the listener have access to different modalities of the input object, to explore language emergence.
Such factors are deliberately designed and are too idealized to hold in the real world.
In this paper, all of the handcrafted inductions above are removed, and highly compositional language is induced only by the agent capacity. \rmk{this should be largely emphasized.}
\begin{table*}[htbp]
\centering
\small
...@@ -36,6 +20,24 @@ In this paper, these handcrafted inductions above are all removed, and the high
\end{tabular}
\end{table*}
\section{Related works}
\label{sec:relatedwork}
%external environmental factors
Previous works focus on the external environmental factors that impact the
compositionality of emerged symbolic language.
For example, \citet{kirby2015compression} explored how the pressures for expressivity and compressibility lead to structured language.
\citet{kottur-etal-2017-natural} constrained the vocabulary size and the listener's memory to coax compositionality in the emergent language.
\citet{lazaridou2018emergence} showed that the degree of structure found in the input data affects the emergence of the symbolic language.
\citet{li2019ease} studied how one pressure, ease of teaching, impacts the iterative language in the population regime.
\citet{evtimova2018emergent} designed a novel multi-modal scenario, in which the speaker and the listener have access to different modalities of the input object, to explore language emergence.
Such factors are deliberately designed and are too idealized to hold in
the real world. None of these works recognizes the importance of the model capacity of
the agent itself. \rmk{this should be largely emphasized.}
%measure
To measure the compositionality of emerged symbolic language, many metrics have been
proposed~\cite{kottur-etal-2017-natural,choi2018compositional,lazaridou2018emergence,evtimova2018emergent,chaabouni2020compositionality}.
...@@ -61,5 +63,9 @@ The topographic similarity\cite{lazaridou2018emergence} is introduced to measure
As Table~\ref{tab:rel} shows, most metrics are proposed from the perspective of the speaker. In our view, human beings developed language based on both speakers and listeners. Only one work in Table~\ref{tab:rel}, \cite{choi2018compositional}, qualitatively considered both the speaker's and the listener's perspectives. In this paper, we propose a novel quantitative metric that takes both the speaker's and the listener's perspectives into account.
In conclusion, previous works coaxed compositional language using carefully designed handcrafted inductions,
and a metric that takes the perspective of both the speaker and the listener is still lacking.
In this paper, we remove all the handcrafted inductions in Table~\ref{tab:rel},
and use a minimal induction based on theoretical analysis.
Moreover, we propose a novel quantitative metric, which is more appropriate than previous metrics that consider only the speaker's perspective.
...@@ -33,6 +33,15 @@ R\left(c_0,s_0\right) & R\left(c_0,s_0\right)
\end{equation}
Each column of $M$ corresponds to the semantic information carried by one symbol. In a perfectly compositional language, each symbol represents one specific concept exclusively. Therefore, the similarity between the columns of $M$ and a one-hot vector is aligned with the compositionality of the emergent language.
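
To make the comparison with a one-hot vector concrete, here is a small worked
case. The vectors $m$ and $e_k$ and the numbers below are illustrative
assumptions, not entries of the actual $M$. For a column
$m=(m_1,\dots,m_n)^\top$ and the one-hot vector $e_k$, the cosine similarity
reduces to a single normalized entry:
\[
\cos(m, e_k) = \frac{m^\top e_k}{\lVert m\rVert_2\,\lVert e_k\rVert_2}
             = \frac{m_k}{\lVert m\rVert_2}.
\]
For example, $m=(0,5)^\top$ gives $\cos(m,e_2)=1$, since the symbol carries
information about one concept only, whereas $m=(3,4)^\top$ gives
$\cos(m,e_2)=4/5=0.8$, since the information is spread over two concepts.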
\begin{figure}[t]
\centering
\includegraphics[width=\columnwidth]{fig/Figure5_An_emergent_language.pdf}
\caption{An emergent language whose non-compositionality the unilateral metrics cannot measure. Notice that given $s_1 = \mathrm{a}$, the listener can determine neither the shape nor the color without knowledge of $s_0$.}
\label{fig:unilateral}
\end{figure}
Finally, we define \emph{raw mutual information similarity} ($\mathit{MIS}_0$)
as the average cosine similarity of $M$ columns and one-hot vectors, as in
Equation~\ref{eq:mis2}. Furthermore, $\mathit{MIS}$ is the normalized mutual
...@@ -49,12 +58,6 @@ following formula:
\mathit{MIS} &= \frac{n\cdot \mathit{MIS}_0 - 1}{n-1}
\end{aligned}\end{equation}
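
As a quick check of this normalization, the two endpoints and one intermediate
value can be evaluated directly. The choice $n=3$ and the value
$\mathit{MIS}_0=0.8$ are hypothetical and used only for illustration:
\[
\mathit{MIS}_0 = 1 \;\Rightarrow\; \mathit{MIS} = \frac{n-1}{n-1} = 1,
\qquad
\mathit{MIS}_0 = \frac{1}{n} \;\Rightarrow\; \mathit{MIS} = \frac{1-1}{n-1} = 0,
\]
\[
n = 3,\ \mathit{MIS}_0 = 0.8 \;\Rightarrow\;
\mathit{MIS} = \frac{3\cdot 0.8 - 1}{3-1} = \frac{1.4}{2} = 0.7 .
\]
The map is affine in $\mathit{MIS}_0$, so it rescales the interval
$[1/n,\,1]$ onto $[0,\,1]$ without changing the ordering of languages.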
MIS is a bilateral metric. Unilateral metrics, e.g. \emph{topographic similarity (topo)}\cite{} and \emph{posdis}\cite{}, only take the policy of the speaker into consideration. We provide an example to illustrate the inadequacy of unilateral metrics, shown in Figure~\ref{fig:unilateral}. In this example, the speaker only uses $s_1$ to represent shape. From the perspective of the speaker, the language is perfectly compositional (i.e. both topo and posdis are 1). However, the listener cannot distinguish the shape based only on $s_1$, which shows that the language is non-compositional. The bilateral metric MIS addresses this defect by taking the policy of the listener into account, yielding $\mathit{MIS} < 1$.