haoyifan / AAAI21_Emergent_language / Commits

Commit 65d62699
authored Sep 09, 2020 by Zidong Du
parent 4e45bed3
Showing 3 changed files with 99 additions and 2 deletions:

AAAI2021/paper.tex            +2  -2
AAAI2021/tex/experiments.tex  +2  -0
AAAI2021/tex/theory.tex       +95 -0
AAAI2021/paper.tex

@@ -173,11 +173,11 @@
 inductions (e.g., small vocabulary sizes, carefully constructed distractors,
 and ease-of-teaching) in multi-agent learning, which are unnatural.
 Yet, few studies investigate the emergence of symbolic language with high
-compositionality \emph{naturally}, i.e., without any deliberately handcrafted
+compositionality \emph{naturally}, i.e., without deliberately handcrafted
 inductions.
 In this paper, we are the first to successfully achieve high compositional symbolic
-language in a purely \emph{natural} manner.
+language in a \emph{natural} manner.
 Initially, by thoroughly investigating the compositionality of emerged symbolic
 language after removing the \emph{deliberately handcrafted}
 inductions, we observe that the agent capacity plays a key role in
 ...
AAAI2021/tex/experiments.tex

\section{Experiments}
\label{sec:exp}
AAAI2021/tex/theory.tex

\section{Experimental Setup}
\label{sec:thory}

In this section, we introduce the experimental setup used in this paper,
including the environment setup, agent architecture, and training algorithm.

\begin{figure}[t]
  \centering
  \includegraphics[width=0.9\columnwidth]{fig/occupy}
  \caption{\rmk{The entire environment used in this paper.}}
  \label{fig:game}
\end{figure}

\subsection{Environment setup}
\label{ssec:env}

Figure~\ref{fig:game} shows the entire environment used in this study,
i.e., a commonly used referential game. Roughly, the referential game requires
the speaker and listener to work cooperatively to accomplish a certain task.
In this paper, the task is xxxx.
\textbf{Game rules.}
In our referential game, agents follow the rules below to finish the game in
a cooperative manner. In each round, upon receiving an input object $t$,
Speaker $S$ speaks a symbol sequence $s$ to Listener $L$; Listener $L$
reconstructs the predicted result $\hat{t}$ based on the received sequence
$s$; if $t=\hat{t}$, the agents win the game and receive a positive reward
($R(t,\hat{t})=1$); otherwise, the agents fail the game and receive a
negative reward ($R(t,\hat{t})=-1$).
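To make the protocol concrete, a minimal Python sketch of one round follows; the function and method names (play_round, speak, listen) are hypothetical, not from the paper.

# One round of the referential game under the rules above (hypothetical API).
def play_round(speaker, listener, t):
    s = speaker.speak(t)              # Speaker S emits a symbol sequence s for object t
    t_hat = listener.listen(s)        # Listener L reconstructs a prediction from s
    return 1 if t_hat == t else -1    # reward R(t, t_hat): +1 on success, -1 on failure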
Precisely, an input object $t$ is a concept sequence with fixed length,
denoted $t=(c_0,c_1)$. The concepts $c_0$ (shape) and $c_1$ (color) are each
indicated as a one-hot vector. The length of each one-hot vector ranges from
3 to 6. These two vectors are concatenated to denote the input object $t$.
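As an illustration of this encoding, the following numpy sketch builds the concatenated one-hot vector; the function name and example sizes are ours, not from the paper.

import numpy as np

# Sketch of the input-object encoding: t = (c_0, c_1), each concept one-hot.
def encode_object(c0, c1, m0, m1):
    v0 = np.zeros(m0); v0[c0] = 1.0   # one-hot for c_0 (shape), length in [3, 6]
    v1 = np.zeros(m1); v1[c1] = 1.0   # one-hot for c_1 (color), length in [3, 6]
    return np.concatenate([v0, v1])   # input object t, length m0 + m1

t = encode_object(2, 1, m0=3, m1=4)   # e.g. shape index 2 of 3, color index 1 of 4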
Each symbol sequence $s$ contains two words, denoted $(s_0,s_1)$. Each word
$s_i$ is chosen from the vocabulary set $V$. In this game, the cardinality
$|V|$ ranges from 4 to 10, and the inequality $|V|^2\geq|M_0||M_1|$ is
satisfied to ensure that the symbol sequence $(s_0,s_1)$ can denote every
input object $t$. A one-hot vector of length $|V|$ is used to indicate each
of the words $s_0$ and $s_1$. Then, the two one-hot vectors are concatenated
to denote the symbol sequence $s$.
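For illustration: at the largest concept sizes, $|M_0|=|M_1|=6$, there are
$36$ possible input objects, so $|V|=6$ already suffices because
$|V|^2=36\geq|M_0||M_1|=36$.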
The predicted result $\hat{t}$ is denoted as a one-hot vector of length
$|M_0||M_1|$, where each bit denotes one input object. If the predicted
result satisfies $\hat{t}[i\cdot|M_1|+j]=1$, the one-hot vectors of the
predicted concepts $\hat{c}_0$ and $\hat{c}_1$ satisfy $\hat{c}_0[i]=1$ and
$\hat{c}_1[j]=1$, respectively.
If $(c_0,c_1)$ is equal to $(\hat{c}_0,\hat{c}_1)$, the input object and the
predicted result indicate the same object.
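A small sketch of this index convention (the helper name is ours, not from the paper): bit $i\cdot|M_1|+j$ of $\hat{t}$ corresponds to the concept pair with $\hat{c}_0[i]=1$ and $\hat{c}_1[j]=1$.

# Sketch of the t_hat index convention: index = i * |M_1| + j.
def decode_prediction(index, m1):
    i, j = divmod(index, m1)          # i indexes c_hat_0, j indexes c_hat_1
    return i, j

assert decode_prediction(9, m1=4) == (2, 1)   # with |M_1| = 4, bit 9 -> (i, j) = (2, 1)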
\subsection{Agent architecture}
\label{ssec:agent}

\begin{figure}[t]
  \centering
  \includegraphics[width=0.9\columnwidth]{fig/occupy}
  \caption{\rmk{The architecture of agents. \emph{Left:} speaker.
  \emph{Right:} listener.}}
  \label{fig:agents}
\end{figure}
The agents apply their own policies to play the referential game. Denote the
policies of the speaker agent $S$ and the listener agent $L$ as $\pi_S$ and
$\pi_L$, respectively. $\pi_S$ indicates the conditional probabilities
$P(s_0|t)$ and $P(s_1|t)$; $\pi_L$ indicates the conditional probability
$P(\hat{t}|s_0,s_1)$. The listener agent outputs the predicted result
$\hat{t}$ by randomly sampling from the conditional probability
$P(\hat{t}|s_0,s_1)$. Neural networks are used to simulate the agent
policies. The agent architecture is shown in Figure~\ref{fig:agents}.
For the speaker, the input object $t$ is first passed to an MLP to get a
hidden layer vector $h^S$. Then, the hidden layer vector is split into two
feature vectors $h_0^S$ and $h_1^S$ of length h\_size. Through an MLP and a
softmax layer, these feature vectors are transformed into the outputs $o_0$
and $o_1$ of length $|V|$, respectively. Lastly, the words $s_0$ and $s_1$
are sampled from the outputs $o_0$ and $o_1$.
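To make this concrete, here is a minimal PyTorch sketch of the speaker just described; the class name, layer depths, and ReLU activations are our assumptions, since the text only fixes the split into $h_0^S$ and $h_1^S$ and the softmax outputs of length $|V|$.

import torch
import torch.nn as nn

# Sketch of the speaker: t -> h^S -> (h_0^S, h_1^S) -> softmax distributions over V.
class Speaker(nn.Module):
    def __init__(self, obj_len, h_size, vocab_size):
        super().__init__()
        # MLP from the input object t to the hidden layer vector h^S (length 2 * h_size).
        self.encoder = nn.Sequential(nn.Linear(obj_len, 2 * h_size), nn.ReLU())
        # One MLP-plus-softmax head per word, each producing a distribution over V.
        self.head0 = nn.Sequential(
            nn.Linear(h_size, h_size), nn.ReLU(), nn.Linear(h_size, vocab_size))
        self.head1 = nn.Sequential(
            nn.Linear(h_size, h_size), nn.ReLU(), nn.Linear(h_size, vocab_size))

    def forward(self, t):
        h = self.encoder(t)                         # hidden layer vector h^S
        h0, h1 = h.chunk(2, dim=-1)                 # feature vectors h_0^S, h_1^S (length h_size)
        o0 = torch.softmax(self.head0(h0), dim=-1)  # o_0, i.e. P(s_0 | t), length |V|
        o1 = torch.softmax(self.head1(h1), dim=-1)  # o_1, i.e. P(s_1 | t), length |V|
        s0 = torch.multinomial(o0, 1)               # sample word s_0 from o_0
        s1 = torch.multinomial(o1, 1)               # sample word s_1 from o_1
        return s0, s1, o0, o1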
For the listener, the input words $s_0$ and $s_1$ are each passed into an MLP
to get the hidden layer vectors $h_0$ and $h_1$. The length of each vector is
h\_size. These vectors are concatenated, and the conjunctive vector is passed
into an MLP and a softmax layer; the output $o^L$, of length $|M_0||M_1|$,
denotes $P(\hat{t}|s_0,s_1)$. Lastly, the predicted result is sampled from
the output $o^L$.
In the experiments, the symbol h\_size is used to denote the model capacity
of the agents.
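A matching PyTorch sketch of the listener, under the same caveats (layer sizes, depths, and activations are assumptions):

import torch
import torch.nn as nn

# Sketch of the listener: one-hot words -> (h_0, h_1) -> concat -> softmax over objects.
class Listener(nn.Module):
    def __init__(self, vocab_size, h_size, num_objects):
        super().__init__()
        # One MLP per received word, mapping its one-hot form to a vector of length h_size.
        self.embed0 = nn.Sequential(nn.Linear(vocab_size, h_size), nn.ReLU())
        self.embed1 = nn.Sequential(nn.Linear(vocab_size, h_size), nn.ReLU())
        # MLP + softmax over all |M_0| * |M_1| candidate objects.
        self.head = nn.Sequential(
            nn.Linear(2 * h_size, 2 * h_size), nn.ReLU(),
            nn.Linear(2 * h_size, num_objects))

    def forward(self, s0_onehot, s1_onehot):
        h0 = self.embed0(s0_onehot)                  # hidden layer vector h_0
        h1 = self.embed1(s1_onehot)                  # hidden layer vector h_1
        o = torch.softmax(self.head(torch.cat([h0, h1], dim=-1)), dim=-1)  # o^L
        t_hat = torch.multinomial(o, 1)              # sample the predicted result from o^L
        return t_hat, o

Sampling from $o_0$, $o_1$, and $o^L$ rather than taking an argmax matches the stochastic policies $\pi_S$ and $\pi_L$ described above.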
\subsection{Training algorithm}
\label{ssec:training}