haoyifan / AAAI21_Emergent_language
Commit 61b69bcc, authored Sep 10, 2020 by haoyifan
Merge branch 'master' of http://62.234.201.16/hao/AAAI21_Emergent_language
hao
Parents: 4f5ef212, 3c299656
Showing 1 changed file with 2 additions and 2 deletions
AAAI2021/tex/theory.tex (+2, -2)
...
...
@@ -122,9 +122,9 @@ use the predicted result $\hat{t}$ of the listener agent as the
 evidence of whether giving positive rewards. Then, the gradients of the
 expected reward $J(\theta_S, \theta_L)$ can be calculated as follows:
 \begin{align}
-\nabla_{\theta^S} J &= \mathbb{E}_{\pi^S_{old}, \pi^L} \left[ r(\hat{t}, t) \cdot
+\nabla_{\theta^S} J &= \mathbb{E}_{\pi^S, \pi^L} \left[ r(\hat{t}, t) \cdot \frac{\nabla_{\theta^S} \pi^S(s_0, s_1 | t)}{\pi^S_{old}(s_0, s_1 | t)} \right] \\
-\nabla_{\theta^L} J &= \mathbb{E}_{\pi^S, \pi^L_{old}} \left[ r(\hat{t}, t) \cdot
+\nabla_{\theta^L} J &= \mathbb{E}_{\pi^S, \pi^L} \left[ r(\hat{t}, t) \cdot \frac{\nabla_{\theta^L} \pi^L(\hat{t} | s_0, s_1)}{\pi^L_{old}(\hat{t} | s_0, s_1)} \right]
 \end{align}
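The ratio term in the gradients above, a policy gradient divided by the old policy's probability, can be checked numerically on a toy problem. The following is a minimal sketch, not the authors' implementation: it assumes a single softmax policy over three discrete actions (standing in for the symbol pair $(s_0, s_1)$) and a fixed reward per action (standing in for $r(\hat{t}, t)$). All function names (`softmax`, `grad_pi`, `ratio_gradient`, `exact_gradient`) are hypothetical. It enumerates the expectation under the old policy and compares the result with a finite-difference gradient of the expected reward.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    z = sum(exps)
    return [e / z for e in exps]


def grad_pi(probs, a):
    """Gradient of pi(a) w.r.t. the logits: pi(a) * (1[a == j] - pi(j))."""
    return [probs[a] * ((1.0 if a == j else 0.0) - probs[j])
            for j in range(len(probs))]


def ratio_gradient(theta, theta_old, rewards):
    """E_{a ~ pi_old}[ r(a) * grad pi(a) / pi_old(a) ], by exact enumeration."""
    pi = softmax(theta)
    pi_old = softmax(theta_old)
    grad = [0.0] * len(theta)
    for a, r in enumerate(rewards):
        g = grad_pi(pi, a)
        for j in range(len(theta)):
            # Weight by pi_old(a) (the sampling distribution), then apply
            # the ratio grad pi(a) / pi_old(a) from the formula.
            grad[j] += pi_old[a] * r * g[j] / pi_old[a]
    return grad


def exact_gradient(theta, rewards, eps=1e-6):
    """Central finite differences of J(theta) = sum_a pi(a) * r(a)."""
    def expected_reward(th):
        return sum(p * r for p, r in zip(softmax(th), rewards))

    grad = []
    for j in range(len(theta)):
        up = list(theta)
        dn = list(theta)
        up[j] += eps
        dn[j] -= eps
        grad.append((expected_reward(up) - expected_reward(dn)) / (2 * eps))
    return grad


if __name__ == "__main__":
    theta = [0.2, -0.1, 0.5]       # current policy logits (illustrative values)
    theta_old = [0.0, 0.3, -0.4]   # old policy logits (illustrative values)
    rewards = [1.0, 0.0, 0.0]      # reward 1 only for the "correct" action
    print(ratio_gradient(theta, theta_old, rewards))
    print(exact_gradient(theta, rewards))
```

In this toy setting the two printed vectors agree up to finite-difference error, because the old-policy probability used for sampling cancels against the denominator of the ratio. A sampled estimator would replace the enumeration with draws from the old policy.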
...
...