AAAI21_Emergent_language / Commits

Commit 65d62699
authored Sep 09, 2020 by Zidong Du
parent 4e45bed3
Showing 3 changed files with 99 additions and 2 deletions:

    AAAI2021/paper.tex            +2   -2
    AAAI2021/tex/experiments.tex  +2   -0
    AAAI2021/tex/theory.tex       +95  -0
AAAI2021/paper.tex @ 65d62699

@@ -173,11 +173,11 @@
 inductions (e.g., small vocabulary sizes, carefully constructed distractors,
 and ease-of-teaching) in multi-agent learning, which are unnatural.
 Yet, few studies investigate the emergence of symbolic language with high
-compositionality \emph{naturally}, i.e., without any deliberately handcrafted
+compositionality \emph{naturally}, i.e., without deliberately handcrafted
 inductions.
 In this paper, we are the first to successfully achieve high compositional symbolic
-language in a purely \emph{natural} manner.
+language in a \emph{natural} manner.
 Initially, by thoroughly investigating the compositionality of emerged symbolic
 language after removing the \emph{deliberately handcrafted}
 inductions, we observe that the agent capacity plays a key role in
 ...
AAAI2021/tex/experiments.tex @ 65d62699

\section{Experiments}
\label{sec:exp}
AAAI2021/tex/theory.tex @ 65d62699

\section{Experimental Setup}
\label{sec:theory}

In this section, we introduce the experimental setup used in this paper,
including the environment setup, agent architecture, and training algorithm.
\begin{figure}[t]
\centering
\includegraphics[width=0.9\columnwidth]{fig/occupy}
\caption{\rmk{The entire environment used in this paper.}}
\label{fig:game}
\end{figure}
\subsection{Environment setup}
\label{ssec:env}

Figure~\ref{fig:game} shows the entire environment used in this study, i.e., a commonly used referential game. Roughly, the referential game requires the speaker and the listener to work cooperatively to accomplish a certain task. In this paper, the task is xxxx.
\textbf{Game rules.}
In our referential game, agents follow the rules below to finish the game in a cooperative manner. In each round, once it receives an input object $t$, Speaker $S$ speaks a symbol sequence $s$ to Listener $L$; Listener $L$ reconstructs the prediction $\hat{t}$ based on the received sequence $s$. If $t=\hat{t}$, the agents win the game and receive a positive reward ($R(t,\hat{t})=1$); otherwise, the agents lose the game and receive a negative reward ($R(t,\hat{t})=-1$).
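To make the protocol concrete, the following minimal sketch implements one round of the game as described above; the callables \texttt{speak} and \texttt{listen} stand in for the agent policies, and all names are illustrative assumptions rather than the paper's code.

\begin{verbatim}
def play_round(speak, listen, t):
    """One round of the referential game; returns the prediction and reward."""
    s = speak(t)        # Speaker S maps object t to a symbol sequence s = (s0, s1)
    t_hat = listen(s)   # Listener L reconstructs a prediction from s alone
    reward = 1 if t_hat == t else -1   # R(t, t_hat) = +1 on win, -1 on loss
    return t_hat, reward
\end{verbatim}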
Precisely, an input object $t$ is a fixed-length concept sequence, denoted $t=(c_0, c_1)$. The concepts $c_0$ (shape) and $c_1$ (color) are each indicated by a one-hot vector. The length of each one-hot vector ranges from 3 to 6. These two vectors are concatenated to denote the input object $t$.
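As an illustration of this encoding, the sketch below assumes $|M_0|=3$ shapes and $|M_1|=3$ colors; the helper name is hypothetical.

\begin{verbatim}
import numpy as np

def encode_object(c0, c1, n_shapes=3, n_colors=3):
    """Concatenate one-hot encodings of the two concepts into the vector for t."""
    v0 = np.eye(n_shapes)[c0]   # one-hot vector for the shape concept c0
    v1 = np.eye(n_colors)[c1]   # one-hot vector for the color concept c1
    return np.concatenate([v0, v1])

# encode_object(1, 2) -> [0. 1. 0. 0. 0. 1.]
\end{verbatim}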
Each symbol sequence $s$ contains two words, denoted $(s_0, s_1)$. Each word $s_i$ is chosen from the vocabulary set $V$. In this game, the cardinality $|V|$ ranges from 4 to 10, and the inequality $|V|^2 \geq |M_0||M_1|$ is satisfied to ensure that the symbol sequence $(s_0, s_1)$ can denote all input objects $t$; for instance, with $|M_0|=|M_1|=6$, a vocabulary of $|V|=6$ suffices since $6^2 = 36 \geq 36$. A one-hot vector of length $|V|$ indicates each of the words $s_0$ and $s_1$, and the two one-hot vectors are concatenated to denote the symbol sequence $s$.
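This capacity constraint is straightforward to check; a one-line sketch (function name assumed for illustration):

\begin{verbatim}
def can_name_all_objects(vocab_size, n_shapes, n_colors):
    """Check |V|^2 >= |M_0| * |M_1|, i.e., two words can index every object."""
    return vocab_size ** 2 >= n_shapes * n_colors
\end{verbatim}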
The prediction $\hat{t}$ is denoted by a one-hot vector of length $|M_0||M_1|$. Each bit of the one-hot vector denotes one input object. If the prediction satisfies $\hat{t}[i \cdot |M_1| + j] = 1$, the one-hot vectors of the predicted concepts $\hat{c}_0$ and $\hat{c}_1$ satisfy $\hat{c}_0[i]=1$ and $\hat{c}_1[j]=1$, respectively. If $(c_0, c_1)$ is equal to $(\hat{c}_0, \hat{c}_1)$, the input object and the prediction indicate the same object.
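The flat index convention above maps the concept pair $(i, j)$ to position $i \cdot |M_1| + j$; the inverse mapping is a single \texttt{divmod}, as in this hypothetical helper:

\begin{verbatim}
def decode_prediction(flat_index, n_colors):
    """Recover the predicted concept indices (i, j) from the one-hot position."""
    i, j = divmod(flat_index, n_colors)   # t_hat[i * |M_1| + j] = 1  =>  (i, j)
    return i, j

# With |M_1| = 3: decode_prediction(5, 3) -> (1, 2)
\end{verbatim}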
\subsection{Agent architecture}
\label{ssec:agent}

\begin{figure}[t]
\centering
\includegraphics[width=0.9\columnwidth]{fig/occupy}
\caption{\rmk{The architecture of agents. \emph{Left:} speaker. \emph{Right:} listener.}}
\label{fig:agents}
\end{figure}
The agents apply their own policies to play the referential game. Denote the policies of the speaker agent $S$ and the listener agent $L$ as $\pi_S$ and $\pi_L$, respectively. $\pi_S$ indicates the conditional probabilities $P(s_0|t)$ and $P(s_1|t)$; $\pi_L$ indicates the conditional probability $P(\hat{t}|s_0, s_1)$. The listener agent outputs the prediction $\hat{t}$ by randomly sampling from the conditional probability $P(\hat{t}|s_0, s_1)$. Neural networks are used to model the agent policies. The agent architecture is shown in Figure~\ref{fig:agents}.
For the speaker, the input object $t$ is first passed to an MLP to obtain a hidden vector $h^S$. Then, the hidden vector is split into two feature vectors $h_0^S$ and $h_1^S$, each of length h\_size. Through an MLP and a softmax layer, these feature vectors are transformed into the outputs $o_0$ and $o_1$, each of length $|V|$. Lastly, the symbols $s_0$ and $s_1$ are sampled from the outputs $o_0$ and $o_1$, respectively.
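A minimal PyTorch sketch of this speaker follows; the single-layer MLPs and all module names are illustrative assumptions, not the paper's implementation.

\begin{verbatim}
import torch
import torch.nn as nn

class Speaker(nn.Module):
    def __init__(self, obj_dim, h_size, vocab_size):
        super().__init__()
        self.encoder = nn.Linear(obj_dim, 2 * h_size)  # MLP producing h^S
        self.word0 = nn.Linear(h_size, vocab_size)     # h_0^S -> o_0
        self.word1 = nn.Linear(h_size, vocab_size)     # h_1^S -> o_1

    def forward(self, t):
        h = self.encoder(t)                         # hidden vector h^S
        h0, h1 = h.split(h.shape[-1] // 2, dim=-1)  # split into h_0^S, h_1^S
        o0 = torch.softmax(self.word0(h0), dim=-1)  # o_0, i.e., P(s_0 | t)
        o1 = torch.softmax(self.word1(h1), dim=-1)  # o_1, i.e., P(s_1 | t)
        s0 = torch.multinomial(o0, 1)               # sample word s_0
        s1 = torch.multinomial(o1, 1)               # sample word s_1
        return s0, s1
\end{verbatim}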
For the listener, the input symbols $s_0$ and $s_1$ are each passed into an MLP to obtain the hidden vectors $h_0$ and $h_1$. The length of each vector is h\_size. These vectors are concatenated, and the conjoined vector is passed into an MLP and a softmax layer; the output $o^L$, with length $|M_0||M_1|$, denotes $P(\hat{t}|s_0, s_1)$. Lastly, the prediction is sampled from the output $o^L$.
In the experiments, h\_size denotes the model capacity of the agents.
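A matching sketch of the listener under the same assumptions:

\begin{verbatim}
import torch
import torch.nn as nn

class Listener(nn.Module):
    def __init__(self, vocab_size, h_size, n_objects):
        super().__init__()
        self.embed0 = nn.Linear(vocab_size, h_size)   # MLP for s_0 -> h_0
        self.embed1 = nn.Linear(vocab_size, h_size)   # MLP for s_1 -> h_1
        self.head = nn.Linear(2 * h_size, n_objects)  # n_objects = |M_0|*|M_1|

    def forward(self, s0_onehot, s1_onehot):
        h0 = self.embed0(s0_onehot)                 # hidden vector h_0
        h1 = self.embed1(s1_onehot)                 # hidden vector h_1
        logits = self.head(torch.cat([h0, h1], dim=-1))
        o = torch.softmax(logits, dim=-1)           # o^L = P(t_hat | s_0, s_1)
        t_hat = torch.multinomial(o, 1)             # sample the prediction
        return t_hat
\end{verbatim}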
\subsection{Training algorithm}
\label{ssec:training}