Friday, October 28, 2016

What is so special about thragos?

When you install the game thragos-0.0.0 you can simply walk around (control with the arrow keys), and occasionally you bump into an NPC, which pauses the game and lets you make a decision. The NPC then chooses one of the ten actions. Sometimes it attacks you, sometimes it practices or makes a weapon. Sometimes the NPC asks whether you are a wizard or whether you have a weapon. Why?

Well, that is what thragos is all about. The Wlodkowic specification executed by the game is in the unpacked tarball, in the file wlodkowic/thragos.wlodkowic. Enter the directory wlodkowic and grep the file thragos.wlodkowic for lines containing both "estimate" and "set":

grep "estimate" thragos.wlodkowic | grep "set"

It prints 1480 lines similar to this one:
set({he_is_a_wizard=>{false},
he_has_a_weapon=>{false},he_is_stronger_than_me=>{false},
reward=>{false},test_result=>{false,true,none},
i_am_a_wizard=>{false,true},i_have_a_weapon=>{false,true},i_can_see_him=>{false,true},he_has_used_a_weapon=>{false,true},he_has_used_magic=>{false,true}},{action=>{estimate_whether_he_is_a_wizard}},{he_is_a_wizard=>{false},he_has_a_weapon=>{false},he_is_stronger_than_me=>{false},reward=>{none},test_result=>{false},i_am_a_wizard=>{false},i_have_a_weapon=>{true},i_can_see_him=>{false},he_has_used_a_weapon=>{false},he_has_used_magic=>{false}},1.0);


Let's highlight the initial state query (color blue, the first braced block below) and the terminal state query (color green, the last braced block):
set({he_is_a_wizard=>{false},
he_has_a_weapon=>{false},he_is_stronger_than_me=>{false},
reward=>{false},test_result=>{false,true,none},
i_am_a_wizard=>{false,true},i_have_a_weapon=>{false,true},i_can_see_him=>{false,true},he_has_used_a_weapon=>{false,true},he_has_used_magic=>{false,true}}
,{action=>{estimate_whether_he_is_a_wizard}},
{he_is_a_wizard=>{false},he_has_a_weapon=>{false},he_is_stronger_than_me=>{false},reward=>{none},test_result=>{false},i_am_a_wizard=>{false},i_have_a_weapon=>{true},i_can_see_him=>{false},he_has_used_a_weapon=>{false},he_has_used_magic=>{false}},1.0);

I can assure you that the green part (the terminal state query) always contains reward=>{none}. What does that mean? It means that just asking the question gives the agent no direct benefit. In spite of that, the agent sometimes chooses to LEARN something about the hidden variables. Now this is what I call real Machine Learning. It is implied by the algorithm itself. The NPC will not only perform actions like making a weapon or practising; it will also query its environment about the hidden variables. Not always, of course. But sometimes.
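
If you would rather check the reward=>{none} claim than take my word for it, here is a minimal Python sketch that splits one set(...) line into its three braced blocks and the trailing number. It is not part of thragos: the function names are made up for illustration, it assumes every rule sits on a single line as in the grep output above, and it treats the trailing 1.0 simply as a number without assuming what it means.

```python
import re


def split_top_level(text):
    """Split text on commas that are not nested inside braces."""
    parts, depth, current = [], 0, []
    for ch in text:
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
        if ch == "," and depth == 0:
            parts.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    parts.append("".join(current).strip())
    return parts


def parse_query(block):
    """Turn '{a=>{x,y},b=>{z}}' into {'a': {'x', 'y'}, 'b': {'z'}}."""
    inner = block.strip()[1:-1]  # drop the outer braces
    query = {}
    for item in split_top_level(inner):
        name, values = item.split("=>", 1)
        query[name.strip()] = {v.strip() for v in values.strip()[1:-1].split(",")}
    return query


def parse_set_rule(line):
    """Parse 'set(initial, action, terminal, number);' into four parts."""
    match = re.fullmatch(r"\s*set\((.*)\);\s*", line, re.DOTALL)
    if match is None:
        raise ValueError("not a set(...) rule")
    initial, action, terminal, number = split_top_level(match.group(1))
    return parse_query(initial), parse_query(action), parse_query(terminal), float(number)


def estimate_rules_with_reward(lines):
    """Return the estimate rules whose terminal state query is NOT reward=>{none}."""
    offenders = []
    for line in lines:
        if "estimate" not in line or "set" not in line:
            continue  # same filter as the two greps
        try:
            _, _, terminal, _ = parse_set_rule(line)
        except ValueError:
            continue  # skip lines that are not in the assumed format
        if terminal.get("reward") != {"none"}:
            offenders.append(line)
    return offenders


# The example rule from this post, joined into a single line.
EXAMPLE = (
    "set({he_is_a_wizard=>{false},he_has_a_weapon=>{false},"
    "he_is_stronger_than_me=>{false},reward=>{false},"
    "test_result=>{false,true,none},i_am_a_wizard=>{false,true},"
    "i_have_a_weapon=>{false,true},i_can_see_him=>{false,true},"
    "he_has_used_a_weapon=>{false,true},he_has_used_magic=>{false,true}},"
    "{action=>{estimate_whether_he_is_a_wizard}},"
    "{he_is_a_wizard=>{false},he_has_a_weapon=>{false},"
    "he_is_stronger_than_me=>{false},reward=>{none},test_result=>{false},"
    "i_am_a_wizard=>{false},i_have_a_weapon=>{true},i_can_see_him=>{false},"
    "he_has_used_a_weapon=>{false},he_has_used_magic=>{false}},1.0);"
)

initial, action, terminal, number = parse_set_rule(EXAMPLE)
print(terminal["reward"])  # {'none'}
```

To run it over the whole specification, pass the lines of thragos.wlodkowic to estimate_rules_with_reward, for example estimate_rules_with_reward(open("thragos.wlodkowic")). An empty list means every estimate rule that the sketch could parse ends in reward=>{none}.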
