Each column of $M$ corresponds to the semantic information carried by one symbol. In a perfectly compositional language, each symbol represents exactly one concept. Therefore, the similarity between the columns of $M$ and one-hot vectors reflects the compositionality of the emergent language.
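The column-wise comparison above can be sketched in a few lines of numpy. This is a minimal illustration, not the paper's implementation: it assumes $M$ has non-negative entries (as a mutual information matrix does), in which case the closest one-hot vector to a column simply selects its largest entry, and the cosine similarity reduces to $\max(m)/\lVert m \rVert_2$. The function name `raw_mis` is hypothetical.

```python
import numpy as np

def raw_mis(M):
    """Average cosine similarity between each column of M and its
    closest one-hot vector. For a non-negative column m, the best
    one-hot vector picks the largest entry, so the cosine similarity
    is max(m) / ||m||_2. (Illustrative sketch, not the paper's code.)
    """
    M = np.asarray(M, dtype=float)
    norms = np.linalg.norm(M, axis=0)          # L2 norm of each column
    return float(np.mean(M.max(axis=0) / norms))

# An identity matrix (each symbol carries exactly one concept) scores 1,
# the maximum; a uniform matrix scores lower.
print(raw_mis(np.eye(3)))      # perfectly compositional case
print(raw_mis(np.ones((3, 3))))  # maximally entangled case
```

For a uniform $n$-dimensional column the score is $1/\sqrt{n}$, which is why a further normalization into $[0,1]$ (as done for MIS below) is natural.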
Finally, we define \emph{raw mutual information similarity} ($MIS_0$)
as the average cosine similarity between the columns of $M$ and one-hot vectors, as in
Equation~\ref{eq:mis2}. Furthermore, $MIS$ normalizes the raw mutual
information similarity into the $[0,1]$ range, and can be computed with