Saturday, September 3, 2016

The magic of hidden variables

Why to introduce hidden variables? There are good reasons to do so. First, they allow to improve the predictability of your model. In some cases they can even make your model deterministic. On the other hand the amount of hidden variables should be minimized. Entities should not be multiplied above the necessity (the principle known as Ockham's Razor). Second, there may be ways to reveal the hidden variables. If they are, then this gives you a chance to predict the consequences of your actions better.

It is a rather philosophical question whether the hidden variables are "real". Whatever makes your model better, is real. You could imagine a computer program (similar to Perkun/Perkun2) that introduces the hidden variables on its own.

Let us take a look at the example4 in the Perkun2 tarball (directory examples/example4). Use Perkun2 version 0.0.3 from First execute the command:

> perkun2 example4_final_code_stupid.perkun2

It expects the values of the variables "response" and "reward". "Response" does not affect the payoff function, while "reward" does affect it. The "response" is used to reveal the answer on the question asked by the computer. There are two hidden variables:
  • hidden variable secret:{false,true};
  • hidden variable computer_is_asking:{false,true};
The "secret" is something that the agent computer does not know (but the other agent, human, knows it! It is an input variable for him!). The objective of the program is to say what is the value of "secret". It may choose action "false", "true", "none" or "ask" (see the output variable action).

On prompt "Perkun2 (computer) >" please type "none none":

Perkun2 (computer) > none none

The computer chooses "action=false". Why? There are 50% chances that it is right. Not very smart of it. Why not asking the human (who knows the secret)? Let us type "none false" on prompt "Perkun2 (human)":

Perkun2 (human) > none false

This means that the "response" is "none" (like previously), but the "reward" is "false", i.e. not good (see the payoff function). On next prompt type "none false" again:

Perkun2 (computer) > none false

Now the computer realizes it got punished and changes its decision - the action is "true". But it can do better than that. Exit the session and run:

> perkun2 example4_final_code_smart.perkun2

(You may check that the only difference between example4_final_code_smart.perkun2 and example4_final_code_stupid.perkun2 is the argument of the command loop). Now we run loop with the game tree depth 3.

Type "none none" on the first prompt:

Perkun2 (computer) > none none

Now the decision is to ask human for the secret! This does not change the input though, so we can type "none none" on the next prompt:

Perkun2 (human) > none none

On the next prompt the computer is expecting a response from human. The first input variable (response) should be true or false, depending on what the actual value of secret is. Let us assume the secret is true:

Perkun2 (computer) > true none

Now the chosen action is true! I.e. in the "smart" version (differing only by the game tree depth) the computer chooses first to apply the action that reveals the value of the secret, and then, depending on the human response - to answer accordingly.

How the computer knows that the human knows the truth? Well - this is implied by the definition of the agent human (check that the secret is an input variable for this agent). Why does it assume that the human will not lie? This is implied by the model of the agent human. We could also have a model that assumes that the human always lies, in that case the computer would ask and then choose the opposite response (action=false on response=true and action=true on response=false).

Instead of asking a human it could be anything else, like performing a complex calculation or checking on Internet. You should remember that using extra actions that reveal the values of hidden variables may cost extra depth of the game tree (i.e. the argument passed to "loop" must be greater).

The computer does not know what the value of the secret is (initially). Perkun/Perkun2 are able to plan actions that do not give them directly any reward, and to plan usage of the knowledge that is supposed to be revealed. This is a very important feature of Perkun/Perkun2! They plan performing experiments and using the knowledge gained in these experiments whatever it will be.

No comments:

Post a Comment