AAAI 2021

c9f1b78e · Zidong Du · fe3a50fb · c9f1b78e
Commit c9f1b78e authored Sep 05, 2020 by Zidong Du

Showing with 66 additions and 7 deletions

AAAI2021/paper.tex
+66 -7

--- a/AAAI2021/paper.tex
+++ b/AAAI2021/paper.tex
@@ -177,17 +177,76 @@ easier-to-teach. \textcolor{red}{ZD: effects}


 \section{Introduction}
-The emergence and evolution of human language has always been an important and controversial issue. The problem covers many fields, including artificial intelligence in computer science. Computer scientists induce the emergence and evolution of languages in multi-agent systems by setting up pure communication scenarios, such as referential games and communication-action policies.
-Researchers have confirmed that agents can master a symbolic language to complete appointed tasks. Such symbolic language is a communication protocol using symbols or characters to represent concepts. Moreover, people not only care about the emergence of language, but also try to make the emergent language similar to human natural language.

-Compositionality is a widely accepted metric used to measure the hierarchical complexity of language structure, and it is also a key feature to distinguish human language from animal language. Syntactic languages with high compositionality, such as human natural language, are able to express complex meanings through the combination of symbols and to produce certain syntax. In contrast, non-syntactic languages with low compositionality, such as animal languages, are almost impossible to extract specific concepts from a single symbol. Researchers have recognized the importance of compositionality and found that various environmental pressures would affect compositionality.

-Besides environmental pressures, we suggest that the impact of internal factors from agents themselves on compositionality is equally significant. A biological hypothesis show that the cranial capacity of animals is not big enough to master languages with high compositionality. In neuron network based multi-agent systems, this hypothesis corresponds to a point of view that it’s difficult for agents with insufficient characterization capacity (i.e. number of neural nodes) to master languages with high compositionality. However, combine theoretical analysis and environmental results, we hold the complete opposite view -- within the range afforded by the need for successful communication, lower characterization capacity facilitates the emergence of symbolic language with higher compositionality.

-For theoretical analysis, we use the MSC (Markov Series Channel) to model language transmission process and the probability distribution of symbols and concepts to model agents. Our methodology has the certain generalization ability cause it does not depend on the specific structure or algorithm of agents’ model. Combine the MSC model with mutual information theory, we certify the characterization capacity’s impact on compositionality theoretically. Specifically, we prove that a symbol of emergent languages with lower compositionality need carry more complex semantic information (i.e. mutual information between original concepts received by speaker and predicted concepts outputted by listener). So agents use such languages require more neural nodes in to characterize the semantic information.

-In terms of experiments, in order to examine the relationship between capacity and compostionality in 'natural' environments, we avoid imposing any environmental pressures on agents through the following settings: a). Scenarios: a listener-speaker referential games for pure communication; b). Models: the listener and the speaker don’t share any parameters, and are not connected together to form an Auto-Encoder structure; c). Rewards: the only criterion for each of agents to receive a positive reward is whether the forecast output from the listener is correct. Under an experimental framework with such settings, the experimental results show that the effect of characterization capacity on compositionality is consistent with the theoretical analysis.
-In addition, as a by-product of theoretical analysis, we propose ‘bilateral’ metrics for measuring compositionality and the degree of alignment between symbols and concepts. For the degree of alignment between symbols and concepts, the metric should be higher only if speaker and listener ‘bilateral’ correspond a symbol to the same concept more stably. For compositionality, we hold the view that a single symbol of symbolic languages with higher composionality should be used to ground or transmit a certain concept ‘bilaterally’ and more exclusively between listener and speaker.
+
+The emergence and evolution of human language has always been an important and
+controversial issue. The problem covers many fields, including artificial
+intelligence in computer science. Computer scientists induce the emergence and
+evolution of languages in multi-agent systems by setting up pure communication
+scenarios, such as referential games and communication-action policies.
+
+Researchers have confirmed that agents can master a symbolic language to
+complete appointed tasks. Such a symbolic language is a communication protocol
+using symbols or characters to represent concepts. Moreover, people not only
+care about the emergence of language, but also try to make the emergent language
+similar to human natural language. 
+
+Compositionality is a widely accepted metric used to measure the hierarchical
+complexity of language structure, and it is also a key feature to distinguish
+human language from animal language. Syntactic languages with high
+compositionality, such as human natural language, are able to express complex
+meanings through the combination of symbols and to produce certain syntax. In
+contrast, in non-syntactic languages with low compositionality, such as animal
+languages, it is almost impossible to extract specific concepts from a single
+symbol. Researchers have recognized the importance of compositionality and found
+that various environmental pressures would affect compositionality. 
+
+Besides environmental pressures, we suggest that the impact of internal factors
+from agents themselves on compositionality is equally significant. A biological
+hypothesis holds that the cranial capacity of animals is not large enough to master
+languages with high compositionality. In neural-network-based multi-agent
+systems, this hypothesis corresponds to the view that it is difficult for
+agents with insufficient characterization capacity (i.e., number of neural nodes)
+to master languages with high compositionality. However, combining theoretical
+analysis and experimental results, we hold the opposite view: within
+the range afforded by the need for successful communication, lower
+characterization capacity facilitates the emergence of symbolic language with
+higher compositionality. 
+
+For theoretical analysis, we use the MSC (Markov Series Channel) to model the
+language transmission process and the probability distribution of symbols and
+concepts to model agents. Our methodology generalizes to some extent
+because it does not depend on the specific structure or algorithm of the agents'
+model. Combining the MSC model with mutual information theory, we establish the
+characterization capacity's impact on compositionality
+theoretically. Specifically, we prove that a symbol in an emergent language with
+lower compositionality needs to carry more complex semantic information (i.e., mutual
+information between the original concepts received by the speaker and the predicted concepts
+output by the listener). Agents that use such languages therefore require more neural nodes
+to characterize the semantic information.
+
+In terms of experiments, to examine the relationship between capacity
+and compositionality in `natural' environments, we avoid imposing any
+environmental pressures on agents through the following settings: (a) Scenarios:
+a listener-speaker referential game for pure communication; (b) Models: the
+listener and the speaker do not share any parameters and are not connected
+together to form an auto-encoder structure; (c) Rewards: the only criterion for
+each agent to receive a positive reward is whether the listener's predicted
+output is correct. Under an experimental framework with such settings, the
+experimental results show that the effect of characterization capacity on
+compositionality is consistent with the theoretical analysis.
+
+In addition, as a by-product of the theoretical analysis, we propose `bilateral'
+metrics for measuring compositionality and the degree of alignment between
+symbols and concepts. For the degree of alignment between symbols and concepts,
+the metric should be higher only if the speaker and the listener `bilaterally' map
+a symbol to the same concept more stably. For compositionality, we hold the view
+that a single symbol in a symbolic language with higher compositionality should be
+used to ground or transmit a certain concept `bilaterally' and more exclusively
+between the listener and the speaker.
 To sum up, our contributions are as follows: 
 \begin{itemize}
 \item We explore a novel factor (i.e. characterization capacity of agents) in compositionality, and show its impact both theoretically and experimentally.