Commit 4a3afd5a by haoyifan

haoyifan add code

parent af2d112a
# Environment
A speaker-listener referential game trained with a reinforcement learning algorithm
# Agents (Listener and Speaker)
Stochastic policy gradient agents that share no parameters and have no connection between their networks
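
As a rough illustration of what two stochastic policy gradient agents without parameter sharing can look like, the sketch below trains two independent tabular softmax policies with REINFORCE. It is not the code in 'Agent_algorithm.py': the tabular logits, the 0/1 reward, the learning rate, and the names `softmax`, `sample`, `reinforce_update`, and `play_round` are all assumptions made for this example.

```python
import math
import random


def softmax(logits):
    """Convert a list of logits into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample(probs, rng):
    """Draw an index from a categorical distribution."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def reinforce_update(logits, action, reward, lr=0.1):
    """One REINFORCE step for a softmax policy.

    The gradient of log pi(action) with respect to the logits is
    onehot(action) - probs, scaled here by the reward.
    """
    probs = softmax(logits)
    return [
        x + lr * reward * ((1.0 if i == action else 0.0) - p)
        for i, (x, p) in enumerate(zip(logits, probs))
    ]


def play_round(speaker, listener, n_concepts, rng, lr=0.1):
    """Play one round of a toy referential game and update both agents.

    speaker[concept] holds logits over symbols; listener[symbol] holds
    logits over concepts. The two tables are separate objects, so no
    parameters are shared and no gradient flows between the agents.
    """
    concept = rng.randrange(n_concepts)
    symbol = sample(softmax(speaker[concept]), rng)
    guess = sample(softmax(listener[symbol]), rng)
    reward = 1.0 if guess == concept else 0.0  # assumed reward scheme
    speaker[concept] = reinforce_update(speaker[concept], symbol, reward, lr)
    listener[symbol] = reinforce_update(listener[symbol], guess, reward, lr)
    return reward
```

Calling `play_round` repeatedly with `speaker = [[0.0] * n_symbols for _ in range(n_concepts)]` and a matching `listener` table gives a minimal training loop. The only signal each agent receives from the other is the shared reward.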
# Code structure
'Agent_algorithm.py': contains code for the whole referential game framework
1). class Speaker(): algorithm and structure of the speaker
2). class Listener(): algorithm and structure of the listener
3). main(): the top-level entry point, covering the settings, the training run, and the evaluation of the referential game
'metrics.py': contains code for estimating the probability distributions over symbols and concepts, and for computing MIS, the metric used in our paper to measure compositionality
1). update_speaker_prob(): computes the policy and probability distribution of the speaker
2). update_listener_prob(): computes the policy and probability distribution of the listener
3). update_R_and_MIS(): computes the metric MIS
4). update_metric(): the top-level function of 'metrics.py'
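
For readers unfamiliar with the quantity behind the mutual information matrix, here is a minimal sketch of how mutual information between a concept variable and a symbol variable can be computed from a joint probability table. It does not reproduce the MIS definition from the paper or the code in 'metrics.py'. The function name `mutual_information` and the table layout `joint[concept][symbol]` are assumptions made for this example.

```python
import math


def mutual_information(joint):
    """Return I(X;Y) in bits for a joint table joint[x][y] that sums to 1."""
    px = [sum(row) for row in joint]        # marginal over rows (concepts)
    py = [sum(col) for col in zip(*joint)]  # marginal over columns (symbols)
    mi = 0.0
    for x, row in enumerate(joint):
        for y, pxy in enumerate(row):
            if pxy > 0.0:  # zero-probability cells contribute nothing
                mi += pxy * math.log2(pxy / (px[x] * py[y]))
    return mi
```

A one-to-one mapping between two equally likely concepts and two symbols, `[[0.5, 0.0], [0.0, 0.5]]`, yields 1 bit, while a uniform table yields 0 bits.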
# Run
python Agent_algorithm.py GPU_ID
For example, to use GPUs 0, 1, and 2, run: python Agent_algorithm.py 0,1,2
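
One common way for a script to honor a comma-separated GPU argument is to export it through the `CUDA_VISIBLE_DEVICES` environment variable before any CUDA library is initialized. The sketch below shows that pattern only. Whether 'Agent_algorithm.py' handles GPU_ID this way is an assumption, and `parse_gpu_ids` and `select_gpus` are hypothetical helper names.

```python
import os


def parse_gpu_ids(arg):
    """Parse a string such as '0,1,2' into a list of integer GPU IDs."""
    ids = [int(part) for part in arg.split(",") if part.strip()]
    if not ids:
        raise ValueError("expected at least one GPU ID, e.g. '0' or '0,1,2'")
    return ids


def select_gpus(arg, environ=os.environ):
    """Restrict the visible CUDA devices to the IDs given in arg."""
    ids = parse_gpu_ids(arg)
    environ["CUDA_VISIBLE_DEVICES"] = ",".join(str(i) for i in ids)
    return ids
```

In a script this would typically be called as `select_gpus(sys.argv[1])` near the top of the entry point, before the deep learning framework is imported.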
# Logs
run_logs/log_XXX: contains the agents' policies during training and the emergent language after training
result_logs/log_XXX: contains the mutual information matrix M and the metric MIS