Project: haoyifan/AAAI21_Emergent_language

Commit 0de80caa
Authored Sep 15, 2020 by haoyifan
Commit message: hh
Parent: d28758bd
Showing 1 changed file with 3 additions and 4 deletions:

  code/Agent_algorithm.py (+3, -4)
--- a/code/Agent_algorithm.py
+++ b/code/Agent_algorithm.py
@@ -9,8 +9,7 @@ from metrics import update_metric, update_speaker_prob, update_listener_prob, up
 # hyperparameters
 MAX_EP = 10000000        # maximum count of training data
-A_LR = 0.002             # learning rate for actor
-C_LR = 0.002             # learning rate for critic
+LR = 0.002               # learning rate
 BATCH_SIZE = 128
 EPSILON = 0.2
 CLIP = 0.2
@@ -76,7 +75,7 @@ class Speaker(object):
         surrogate = ratio * self.tf_reward
         self.loss = -tf.reduce_mean(tf.minimum(surrogate, tf.clip_by_value(ratio, 1.-CLIP, 1.+CLIP)*self.tf_reward))
-        self.train = tf.train.AdamOptimizer(A_LR).minimize(self.loss, var_list=[speak_params])
+        self.train = tf.train.AdamOptimizer(LR).minimize(self.loss, var_list=[speak_params])
         self.sess.run(tf.global_variables_initializer())
@@ -215,7 +214,7 @@ class Listener(object):
         surrogate = ratio * self.tf_reward
         self.loss = -tf.reduce_mean(tf.minimum(surrogate, tf.clip_by_value(ratio, 1.-CLIP, 1.+CLIP)*self.tf_reward))
-        self.train = tf.train.AdamOptimizer(A_LR).minimize(self.loss, var_list=[listen_params])
+        self.train = tf.train.AdamOptimizer(LR).minimize(self.loss, var_list=[listen_params])
         self.sess.run(tf.global_variables_initializer())
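The loss lines touched in both hunks implement a PPO-style clipped surrogate objective: the probability ratio between the new and old policy is multiplied by the reward signal, the ratio is also clipped to [1 - CLIP, 1 + CLIP], and the elementwise minimum of the two terms is maximized. Below is a minimal, self-contained sketch of that pattern in the same TF1 style as the file (written against tf.compat.v1 so it also runs under TensorFlow 2). The identifiers ratio, tf_reward, CLIP, and LR mirror the diff; the toy four-action policy, the placeholder names action and old_log_prob, and the feed values are illustrative assumptions, not the repository's actual model.

import numpy as np
import tensorflow.compat.v1 as tf

tf.disable_eager_execution()

LR = 0.002     # unified learning rate introduced by this commit
CLIP = 0.2     # PPO clipping range, as in the diff

# Toy "policy": trainable logits over 4 discrete actions (assumption, for
# illustration only).
logits = tf.Variable(tf.zeros([4]), name="logits")
log_probs = tf.nn.log_softmax(logits)

# Per-sample inputs: taken action, the old policy's log-probability of that
# action, and the reward/advantage signal.
action = tf.placeholder(tf.int32, [None], name="action")
old_log_prob = tf.placeholder(tf.float32, [None], name="old_log_prob")
tf_reward = tf.placeholder(tf.float32, [None], name="reward")

# Probability ratio pi_new(a) / pi_old(a), computed in log space.
ratio = tf.exp(tf.gather(log_probs, action) - old_log_prob)

# Clipped surrogate objective in the same form as the changed lines:
# maximize min(ratio * reward, clip(ratio) * reward) by minimizing its
# negative mean.
surrogate = ratio * tf_reward
loss = -tf.reduce_mean(
    tf.minimum(surrogate,
               tf.clip_by_value(ratio, 1. - CLIP, 1. + CLIP) * tf_reward))
train = tf.train.AdamOptimizer(LR).minimize(loss)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    _, l = sess.run([train, loss],
                    feed_dict={action: [0, 1],
                               old_log_prob: np.log([0.25, 0.25]),
                               tf_reward: [1.0, -0.5]})
    print("clipped surrogate loss:", l)

Note that merging A_LR and C_LR into a single LR, as this commit does, only means the Speaker and Listener optimizers now share one step size; the objective itself is unchanged.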