\caption{An emergent language whose non-compositionality the unilateral metrics cannot measure. Note that given $s_1=\mathrm{a}$, the listener can determine neither the shape nor the color without knowing $s_0$.}
\label{fig:unilateral}
\end{figure}
Before defining MIS, we first model the agents in the referential game. As shown in Figure~\ref{fig:modeling}, the speaker and the listener are connected in tandem. The speaker agent can be regarded as a channel whose input is a concept $t =(c_0, c_1)$ and whose output is a symbol $s =(s_0, s_1)$. The listener agent can be regarded as another channel whose input is the symbol $s =(s_0, s_1)$ and whose output is a predicted result $\hat{t}=(\hat{c}_0, \hat{c}_1)$. Since the output of the listener depends only on the symbol $s$, we can model the policies of the speaker agent and the listener agent by the probability distributions $P(s =(s_0, s_1) | t =(c_0, c_1))$ and $P(\hat{t}=(\hat{c}_0, \hat{c}_1) | s_0, s_1)$, respectively.
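As an illustrative consequence of this tandem model (a sketch that follows from the assumption that $\hat{t}$ depends on $t$ only through $s$, not a formula taken from the definition of MIS), the end-to-end behaviour of the two agents is obtained by marginalizing over the transmitted symbol:
\[
P\left(\hat{t}\,|\,t\right)=\sum_{s} P\left(\hat{t}\,|\,s\right)P\left(s\,|\,t\right),
\]
where the sum runs over all symbols $s=(s_0, s_1)$. Each summand is the probability that the concept $t$ is decoded as $\hat{t}$ via one particular symbol $s$.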
Now we can analyse the information about the concepts that is preserved in the transmission process given the transmitted symbol, i.e. the conditional mutual information $I\left(t,\hat{t}|s\right)$. Whenever a stable language has emerged, the speaker and the listener consistently use a specific symbol $s$ to refer to a specific object $t$. Therefore we can safely take $I\left(t,\hat{t}|s\right)= I\left(t,\hat{t}|s_{t,\hat{t}}\right)$, where $s_{t,\hat{t}}=\arg\max_s\left\{P\left(\hat{t}|s\right)P\left(s|t\right)\right\}$ is the symbol most likely to carry $t$ to $\hat{t}$. This conditional mutual information can be obtained by Equation~\ref{eq:cmi}.
Each column of $M$ corresponds to the semantic information carried by one symbol. In a perfectly compositional language, each symbol represents exactly one concept. Therefore, the closer the columns of $M$ are to one-hot vectors, the more compositional the emergent language is.
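To make this intuition concrete, consider a small example with hypothetical numbers. We assume, for this sketch only, that a column of $M$ has one entry per concept and that similarity is measured by cosine similarity; both choices are illustrative assumptions. For a column $m=(0.9, 0.1)^{\top}$ and the one-hot vector $e_0=(1, 0)^{\top}$,
\[
\cos\left(m, e_0\right)=\frac{m^{\top}e_0}{\lVert m\rVert\,\lVert e_0\rVert}=\frac{0.9}{\sqrt{0.9^2+0.1^2}}\approx 0.994,
\]
whereas a column $m'=(0.5, 0.5)^{\top}$, whose symbol carries equal information about both concepts, gives
\[
\cos\left(m', e_0\right)=\frac{0.5}{\sqrt{0.5^2+0.5^2}}\approx 0.707 .
\]
Under these assumptions, the first symbol is nearly exclusive to one concept, while the second is entangled with both.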