Thursday, March 9, 2017

Is zubr better than perkun or wlodkowic?

Just to recall: perkun and wlodkowic are interpreters, while zubr is a code generator. Is it better? It has one advantage: the generated code does not enumerate all the visible states, i.e. all the situations possible in the game, up front. It works in a different way - much closer to the chess playing programs - building the game tree dynamically.

Both the perkun/wlodkowic interpreters and the code generated by zubr contain my optimization algorithm - the same algorithm that maximizes the expected value of the payoff function.

Zubr generates Java code, which I consider an advantage.

All the three tools come in the perkun package: https://sourceforge.net/projects/perkun/

If you have a C++ program that needs my optimization algorithm, it is better to link it against libperkun or libwlodkowic. I have written two small games demonstrating how to do it: https://sourceforge.net/projects/perkunwars/ (for perkun) and https://sourceforge.net/projects/thragos/ (for wlodkowic). They both spawn separate processes for the perkun/wlodkowic interpreters and communicate with the parent process via pipes. Feel free to take a look at their source code.

There are, however, some things you might consider a disadvantage of zubr. For example the model: you have to hardcode it in the getModelProbability method, since there is no zubr syntax for a model. The same holds for the payoff (method getPayoff). Wlodkowic offers an extra section for the a priori belief; in zubr this again requires an extra method.
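
To give a feeling for what this hardcoding looks like, here is a minimal sketch (the signatures come from the examples discussed below; the bodies are just placeholders):

protected float getModelProbability(VisibleState vs1, State s1, Action a,
                                    VisibleState vs2, State s2) {
    return 0.5f; // the whole world model lives in code like this
}

protected float getPayoff(VisibleState vs) {
    return vs.getVariableValue("alpha") == Value.TRUE ? 100.0f : 0.0f; // hardcoded payoff
}

protected void setAPrioriBelief(Belief target) {
    // populate the belief with states and probabilities - see example21 below
}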

Zubr also has no syntax to inform the optimizer about impossible states or illegal actions. This has to be resolved with an extra feature - the iterators. I hope to explain them later. You may also take a look at the zubr man page and the code it generates.

In recent posts I walked through the zubr examples stored in the "examples" folder of the perkun package. I tried to demonstrate that a state based on hidden variables is beneficial for the quality of prediction/optimization. I think it is time for a major example using zubr - something like Perkun Wars for perkun or Thragos for wlodkowic.


Wednesday, March 8, 2017

example22_hidden_variables_based_predictor.zubr

Do you want proof that hidden variables allow better optimization? Here you are.

Imagine an optimizer that takes two input variables instead of one. The Perkun section of the zubr specification looks as follows:

values
{
    value FALSE, TRUE;
}

variables
{
    input variable alpha:{FALSE, TRUE}, reward:{FALSE, TRUE};
    hidden variable gamma:{FALSE, TRUE};
    output variable action:{FALSE, TRUE};
}


There are two input variables now: alpha and reward. What is the semantics? Alpha follows the sequence FALSE, FALSE, TRUE, TRUE, FALSE, FALSE, TRUE, TRUE,... and so on, independently of the agent's actions, but the agent does not know where in the sequence we begin. The action is a bet - an attempt to predict the next alpha. Depending on the action the agent receives a reward - immediate information on whether the prediction was correct. Reward TRUE means the prediction was right, FALSE means no reward.
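
To make this concrete, here is a hypothetical sketch of how such a model could be expressed in getModelProbability (the shipped example file may encode it differently); I assume the hidden gamma marks whether the current alpha is the first or the second element of its pair:

protected float getModelProbability(VisibleState vs1, State s1, Action a,
                                    VisibleState vs2, State s2) {
    // phases 0,1 have alpha=FALSE, phases 2,3 have alpha=TRUE;
    // gamma=TRUE marks the second phase of each pair
    int phase1 = (vs1.getVariableValue("alpha") == Value.TRUE ? 2 : 0)
               + (s1.getVariableValue("gamma") == Value.TRUE ? 1 : 0);
    int phase2 = (vs2.getVariableValue("alpha") == Value.TRUE ? 2 : 0)
               + (s2.getVariableValue("gamma") == Value.TRUE ? 1 : 0);

    if (phase2 != (phase1 + 1) % 4)
        return 0.0f; // alpha always advances along the cycle

    // the reward reports whether the bet matched the new alpha
    boolean betCorrect = a.getVariableValue("action") == vs2.getVariableValue("alpha");
    boolean rewarded = vs2.getVariableValue("reward") == Value.TRUE;
    return (betCorrect == rewarded) ? 1.0f : 0.0f;
}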

You can execute the program directly from my server:

http://www.pawelbiernacki.net/hiddenVariableBasedPredictor.jnlp

For example, let us start with FALSE, FALSE. The program sets its initial belief to gamma=>FALSE at 50% and gamma=>TRUE at 50%. The chosen action is FALSE (he bets the next alpha will be FALSE). Let us assume he was wrong and the next alpha is TRUE. So there will be no reward: enter TRUE, FALSE.

Now he knows that gamma is FALSE (the belief reflects this). The action will be TRUE - he thinks the next alpha will be TRUE. Let's confirm his expectations: enter TRUE, TRUE. Now gamma=>TRUE and the action is FALSE.

In short: thanks to the state based on hidden variables, his prediction will always be correct after the first two signals. He will always get the reward TRUE. Only at the beginning is there uncertainty (reflected by the belief).

When you compare this optimizer (in fact, this predictor) with functions based merely on the input variables, you will see that no function can beat him. I found two functions that are pretty good:

f1(FALSE, FALSE) = FALSE
f1(FALSE, TRUE) = FALSE
f1(TRUE, FALSE) = TRUE
f1(TRUE, TRUE) = FALSE

f2(FALSE, FALSE) = FALSE
f2(FALSE, TRUE) = TRUE
f2(TRUE, FALSE) = TRUE
f2(TRUE, TRUE) = TRUE

I tested all 16 possible functions; only f1 and f2 come close, but even they keep making mistakes after the first two signals. By contrast, the predictor generated by zubr can make at most one mistake - after the first two signals it makes no more.

If you take a look at the file example22_hidden_variables_based_predictor.zubr (unpack perkun and see the "examples" folder), you will see that we use a custom dialog (extending JDialog) in the method getInput. This was necessary because we have two input variables here. You may process the example file with zubr:

zubr example22_hidden_variables_based_predictor.zubr > MyOptimizer.java

The resulting Java code can be compiled (remember to place it in the package "optimizer").
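
For instance, a minimal build might look like this (assuming the generated class is named MyOptimizer and declares the package "optimizer"):

mkdir optimizer
zubr example22_hidden_variables_based_predictor.zubr > optimizer/MyOptimizer.java
javac optimizer/MyOptimizer.java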

What is the conclusion? For the case discussed here an optimizer/predictor with a state is much better than any function based on the input variables alone. The state should be based on hidden variables (it is not the only possibility, but the most natural one). This was the problem with AI - we tried to achieve it with IF THEN, and IF THEN can only see the current state. The hidden variables are a natural way to compress our knowledge about the past. The history.

Tuesday, March 7, 2017

example21_set_apriori_belief.zubr

In the examples I assume we have a good world model (for example, we know the sequence is FALSE, FALSE, TRUE, TRUE on MOVE) but we do not know exactly where we begin. If we initially get FALSE, then MOVE could lead to another FALSE or to TRUE. This implies that the initial belief (probability distribution) must reflect this uncertainty. But even though we do not know the hidden variables initially, we may know more than nothing about them. For instance, if we are a doctor talking with a patient, we may introduce a hidden variable "patient_has_cancer". But we should not assume 50% for TRUE and 50% for FALSE, as zubr does by default. Instead we should apply the natural probability distribution of cancer in the population, i.e. use a so-called a priori belief.

This requires us to tell zubr we will define the method setAPrioriBelief:

%option setaprioribelief own

Then in the definition section we provide the implementation:

protected void setAPrioriBelief(Belief target) {
    for (StateIterator i = createNewStateIterator(target.getVisibleState()); !i.getFinished(); i.increment()) {
        State si = i.createState();
        target.addState(si);
       
        if (si.getVariableValue("gamma") == Value.FALSE)
            target.setProbability(si, 0.3f);
        else
            target.setProbability(si, 0.7f);       
    }
}


As you can see, we iterate over all possible states using a StateIterator, create the states and add them to the target (a Belief). We will talk about the iterators later, so take them for granted for now. Once we have populated the belief with states, we may query them for hidden variable values and set the probabilities. Note that we chose to set 30% for gamma=>FALSE and 70% for gamma=>TRUE.
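
For the doctor scenario mentioned earlier, the same mechanism might look as follows (a sketch - the variable name and the prevalence value are made up):

protected void setAPrioriBelief(Belief target) {
    for (StateIterator i = createNewStateIterator(target.getVisibleState()); !i.getFinished(); i.increment()) {
        State si = i.createState();
        target.addState(si);

        if (si.getVariableValue("patient_has_cancer") == Value.TRUE)
            target.setProbability(si, 0.01f); // population prevalence instead of 50%
        else
            target.setProbability(si, 0.99f);
    }
}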

Now process the example with zubr and compile the resulting Java code:

zubr example21_set_apriori_belief.zubr > MyOptimizer.java

You can also execute the program directly from my server:

http://www.pawelbiernacki.net/aprioriOptimizer.jnlp

Have you noticed the small change after the first signal? The belief is not 50%/50% any more, but 30%/70%! This can be important in more realistic, real-world examples.

Download zubr from https://sourceforge.net/projects/perkun/.

Monday, March 6, 2017

example20_hidden_variables.zubr

This is our first example with the hidden variables. The Perkun section of the zubr specification looks as follows:

values
{
    value FALSE, TRUE;
    value MOVE, DONT_MOVE;
}

variables
{
    input variable alpha:{FALSE, TRUE}; // alpha may have value FALSE or TRUE   
    hidden variable gamma:{FALSE, TRUE};
    output variable beta:{MOVE, DONT_MOVE};
}


What is it good for? Imagine an automaton that can do either MOVE or DONT_MOVE. When it constantly does MOVE, the input will be:

FALSE, FALSE, TRUE, TRUE, FALSE, FALSE, TRUE, TRUE,...

But it is not known where in the sequence we begin. So even though the automaton knows that FALSE always comes twice in a row, when it gets a FALSE it does not know whether it was the first FALSE of the pair or the second one.
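
The model behind this description might be sketched in getModelProbability like this (illustrative only, similar to the sketch for example22 above; the shipped example file may differ): MOVE advances the four-phase cycle by one step, DONT_MOVE freezes it, and the hidden gamma distinguishes the first occurrence of an alpha value from the second:

protected float getModelProbability(VisibleState vs1, State s1, Action a,
                                    VisibleState vs2, State s2) {
    // phases 0,1: alpha=FALSE; phases 2,3: alpha=TRUE; gamma=TRUE is the second of the pair
    int phase1 = (vs1.getVariableValue("alpha") == Value.TRUE ? 2 : 0)
               + (s1.getVariableValue("gamma") == Value.TRUE ? 1 : 0);
    int phase2 = (vs2.getVariableValue("alpha") == Value.TRUE ? 2 : 0)
               + (s2.getVariableValue("gamma") == Value.TRUE ? 1 : 0);

    int expected = (a.getVariableValue("beta") == Value.MOVE)
                 ? (phase1 + 1) % 4   // MOVE advances the cycle
                 : phase1;            // DONT_MOVE freezes it
    return (phase2 == expected) ? 1.0f : 0.0f;
}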

The payoff function makes the program "like" TRUE as input and "dislike" FALSE.

You may process the example with zubr to obtain the Java optimizer code:

zubr example20_hidden_variables.zubr > MyOptimizer.java

Here is the link to the program (you can run it directly from my server):

http://www.pawelbiernacki.net/hiddenVariablesOptimizer.jnlp

There are three scenarios possible:

1. TRUE -> DONT_MOVE
   TRUE -> DONT_MOVE
   TRUE -> DONT_MOVE
   ...

2. FALSE -> MOVE
   FALSE -> MOVE
   TRUE -> MOVE
   TRUE -> DONT_MOVE
   TRUE -> DONT_MOVE
   ...

3. FALSE -> MOVE
   TRUE -> MOVE
   TRUE -> DONT_MOVE
   TRUE -> DONT_MOVE
   ...

You can see that the program created by zubr behaves a little as if it were nondeterministic: sometimes it responds to TRUE with MOVE, sometimes with DONT_MOVE.

In fact it is completely deterministic, but it has a state, which is a belief (a probability distribution over the set of two possible facts: gamma => FALSE and gamma => TRUE). This belief changes depending on the performed actions and the obtained results. Because it has this additional knowledge (the belief), the optimizer can permit itself, for example, to choose MOVE when it knows that a TRUE will still follow after the MOVE. By contrast, in the first scenario after TRUE it does not know whether another TRUE would follow, therefore it chooses DONT_MOVE.

This is the important point I want to make: a state is very important for successful optimization, and hidden variables are a natural way to construct such states. Second, optimizers can be deterministic and still better than functions based on the input variables. In the case discussed here it is easy to construct a function that performs just as well as the zubr generated optimizer:

f(FALSE) = MOVE
f(TRUE) = DONT_MOVE

So in this case a function is just as good as the zubr optimizer, but in more complex cases functions simply cannot beat the optimizers. We will discuss such an example later. The hidden variable based optimizers differ from functions insofar as they have a deeper "understanding" of the outer world.

Download zubr from https://sourceforge.net/projects/perkun/.

Sunday, March 5, 2017

example19_get_payoff.zubr

What is the purpose of an optimizer? It attempts to maximize the expected value of the so-called payoff function. In this example we finally implement a method specifying the payoff function. First we have to tell zubr about it:

%option getpayoff own // method getPayoff

Then in the definition section we provide the implementation:


protected float getPayoff(VisibleState vs) {

    switch (vs.getVariableValue("alpha"))
    {
        case FALSE:
            return 0.0f;
       
        case TRUE:
            return 100.0f; // TRUE is "better" than FALSE
    }
    return 0.0f;
}


This way we make our optimizer prefer alpha=TRUE and dislike alpha=FALSE. The example can be processed as usual with zubr:

zubr example19_get_payoff.zubr > MyOptimizer.java

There are two possible decisions: MOVE and DONT_MOVE. The Perkun section in the zubr specification looks as follows:

values
{
    value FALSE, TRUE;
    value MOVE, DONT_MOVE;
}

variables
{
    input variable alpha:{FALSE, TRUE}; // alpha may have value FALSE or TRUE   
    output variable beta:{MOVE, DONT_MOVE};
}


You can execute the final program directly from my server:
http://www.pawelbiernacki.net/getPayoffOptimizer.jnlp

As one can easily anticipate, the optimizer will do MOVE after FALSE and DONT_MOVE after TRUE. We still don't have hidden variables here, but the example is sufficient to introduce the getPayoff method.

An interesting observation: if there are no hidden variables, then the optimizer can be replaced by a simple function - but only then. Optimizers with hidden variables can be much better than any function mapping input to output, as was shown previously.

The next example will be based on hidden variables, which is what makes zubr interesting.

Saturday, March 4, 2017

example18_simple_automaton.zubr

In this example we also define the method "execute", which is performed once the optimizer finds the optimal decision:

%option execute own // method execute

In the definition section we provide the implementation:

protected void execute(Action a) {
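    // display the chosen value of the output variable beta in a dialog;
    // "frame" comes from the surrounding generated class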
    JOptionPane.showMessageDialog(frame, a.getVariableValue("beta").getName(), "Action", JOptionPane.INFORMATION_MESSAGE);
}


The model in this automaton requires you to enter FALSE, TRUE, FALSE, TRUE,... and so on if the optimizer chooses MOVE, and a constant signal if it chooses DONT_MOVE.
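
Such a model could be sketched in getModelProbability as follows (a hypothetical illustration; there are no hidden variables yet, so s1 and s2 are ignored):

protected float getModelProbability(VisibleState vs1, State s1, Action a,
                                    VisibleState vs2, State s2) {
    boolean changed = vs1.getVariableValue("alpha") != vs2.getVariableValue("alpha");
    if (a.getVariableValue("beta") == Value.MOVE)
        return changed ? 1.0f : 0.0f; // MOVE toggles alpha
    else
        return changed ? 0.0f : 1.0f; // DONT_MOVE keeps alpha constant
}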

You can run the program from my server.

http://www.pawelbiernacki.net/simpleAutomaton.jnlp

The optimizer always chooses MOVE. What's wrong with that? We should define one more method, getPayoff, that allows discriminating between the input signals. Then the optimizer will choose the output signals so that the expected value of the payoff is maximal.

I would like to add some comments concerning the difference between perkun and zubr. Perkun creates all possible visible states at the very beginning, and then for each visible state it generates all possible states. A visible state contains a vector of input variable values, while a state contains a vector of hidden variable values. With a growing number of hidden variables perkun's demand for memory therefore grows exponentially (n binary hidden variables yield 2^n states per visible state). By contrast, zubr generates code that builds the game tree dynamically, like a chess playing program. The code written by zubr therefore demands much less memory than an equivalent perkun program, but is slower.

Zubr and perkun (and wlodkowic) support hidden variables. This is a killer feature: hidden variables allow us to compress our knowledge about the history and are a natural way to do it. The examples discussed so far contained no hidden variables, but that will change.

The code generated by zubr contains my optimization algorithm (just like perkun and wlodkowic). This algorithm has not been documented yet; I tried to write some documents about it, but the results did not satisfy me. If you want to understand the algorithm, please take a look at the source code (classes perkun::optimizer or wlodkowic::optimizer).

When you link libperkun or libwlodkowic into your own program you have to obey the rules of the GPL 3.0, but when you generate Java code with zubr you may use it just as you would use the output of yacc/bison (http://www.gnu.org/software/bison/manual/html_node/Conditions.html). You are free to use the code generated by zubr in proprietary programs.

Download zubr (and perkun + wlodkowic) from: https://sourceforge.net/projects/perkun/.

Friday, March 3, 2017

Hidden variable based predictor vs. functions

I have created a minimalistic example demonstrating that hidden variables are beneficial. I have written a small program that compares a predictor created by zubr with function predictors. You may take a look at my code (it is included in the JAR, license GPL 3.0):

http://www.pawelbiernacki.net/TestHiddenVariableBasedPredictors.jar

You may also run the program directly from my server:

http://www.pawelbiernacki.net/TestHiddenVariableBasedPredictors.jnlp

It calculates the scores (the number of correct guesses divided by the number of all guesses) for various lengths of the test sequence. The predictor with id = -1 is the zubr generated optimizer; the other predictors are simply functions. All 16 possible functions are tested; they have the ids 0..15.

[Chart: predictor scores for various test sequence lengths; the hidden variable based predictor (id = -1) is on top.]

As you can see, the hidden variable based predictor outperforms the functions (its score equals 0.92 for a test sequence of length 19, while the best function achieves only 0.70).

The difference between the function predictors and the hidden variable based predictor is that the functions are stateless, while the hidden variable based predictor does have a state. Its state is a belief: a probability distribution over the set of states (vectors of hidden variable values). My point is that the IF THEN construct is too weak to achieve AI, because IF THEN takes into account the current state only, ignoring the history. The hidden variables are all about history - they are a natural way to compress our knowledge about it. That is why the optimizers/predictors based on hidden variables are so much better than the stateless predictors.
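
How could such numbers be produced? Here is a small self-contained sketch (my reconstruction, not the code shipped in the JAR) that enumerates all 16 stateless predictors f(previous alpha, previous reward) and scores them against the cyclic sequence FALSE, FALSE, TRUE, TRUE,... Averaging over the four unknown starting phases is my assumption, so it will not necessarily reproduce the exact scores below:

public class FunctionPredictorScores {

    // alpha follows the cycle FALSE, FALSE, TRUE, TRUE, ... shifted by "phase"
    static boolean alphaAt(int phase, int t) {
        int p = (phase + t) % 4;
        return p == 2 || p == 3;
    }

    public static void main(String[] args) {
        final int length = 10; // length of the test sequence

        for (int id = 0; id < 16; id++) {
            int correct = 0, total = 0;
            for (int phase = 0; phase < 4; phase++) {
                boolean reward = false;
                for (int t = 1; t <= length; t++) {
                    // id encodes a truth table over (previous alpha, previous reward)
                    int index = (alphaAt(phase, t - 1) ? 2 : 0) + (reward ? 1 : 0);
                    boolean guess = ((id >> index) & 1) != 0;
                    reward = (guess == alphaAt(phase, t)); // was the bet right?
                    if (reward) correct++;
                    total++;
                }
            }
            System.out.printf("id %2d score %.2f%n", id, (float) correct / total);
        }
    }
}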

Scores (rows: predictor id, columns: test sequence length 1..10):

id    1     2     3     4     5     6     7     8     9     10
-1    0     0.25  0.5   0.63  0.7   0.75  0.79  0.81  0.83  0.85
 0    0     0.25  0.33  0.38  0.4   0.42  0.43  0.44  0.44  0.45
 1    0     0.25  0.42  0.44  0.45  0.46  0.46  0.47  0.47  0.47
 2    0     0.25  0.33  0.31  0.3   0.29  0.29  0.28  0.28  0.28
 3    0     0.25  0.33  0.38  0.4   0.42  0.43  0.44  0.44  0.45
 4    0     0.25  0.42  0.5   0.55  0.58  0.61  0.63  0.64  0.65
 5    0     0.25  0.33  0.38  0.3   0.33  0.36  0.38  0.33  0.35
 6    0     0.25  0.5   0.5   0.4   0.42  0.5   0.5   0.44  0.45
 7    0     0.25  0.33  0.31  0.3   0.29  0.29  0.28  0.28  0.28
 8    0     0.25  0.33  0.38  0.4   0.42  0.43  0.44  0.44  0.45
 9    0     0.25  0.5   0.5   0.4   0.42  0.5   0.5   0.44  0.45
10    0     0.25  0.33  0.38  0.3   0.33  0.36  0.38  0.33  0.35
11    0     0.25  0.42  0.44  0.45  0.46  0.46  0.47  0.47  0.47
12    0     0.25  0.33  0.38  0.4   0.42  0.43  0.44  0.44  0.45
13    0     0.25  0.33  0.38  0.4   0.42  0.43  0.44  0.44  0.45
14    0     0.25  0.42  0.5   0.55  0.58  0.61  0.63  0.64  0.65
15    0     0.25  0.33  0.38  0.4   0.42  0.43  0.44  0.44  0.45

Take a look at the numbers (the first row is the hidden variable based predictor).

You may wonder why I used the term "predictor" rather than "optimizer". This has to do with the nature of my example: the action performed by this optimizer is a bet, i.e. an attempt to predict the next value of one of the input variables. I will discuss the example used here later.

Zubr can be downloaded from https://sourceforge.net/projects/perkun/.

Thursday, March 2, 2017

example17_get_model_probability.zubr

In this example we tell zubr we will provide our own implementation of the method getModelProbability:

%option getmodelprobability own // method getModelProbability

Then in the definition section we provide the implementation:


protected float getModelProbability(VisibleState vs1, State s1, Action a, VisibleState vs2, State s2) {

    // here we can query the visible states for input variables
    // the states for hidden variables (there are none at present)
    // and the action for the output variables (also none at present)
   
    // vs1 and s1 represent the current visible state and state
    // a represents the action
    // vs2 and s2 represent the future visible state and state
   
    System.out.println("current alpha => " + vs1.getVariableValue("alpha").getName());
    System.out.println("future alpha => " + vs2.getVariableValue("alpha").getName());

    return 0.5f;
}


VisibleState, State and Action are classes created by zubr. We can use their method getVariableValue to query them for the input variable values, hidden variable values and output variable values, respectively. The method getVariableValue returns a Value (an enum created by zubr). Note that our implementation returns the constant 0.5f for every transition - it does not really constrain the world yet, every consequence is equally likely.

As usual, we can process our example with zubr and compile the resulting Java class:

zubr example17_get_model_probability.zubr > MyOptimizer.java

You can run the program directly from my server:
http://www.pawelbiernacki.net/getModelProbabilityOptimizer.jnlp

However, it only consumes the input; no output is shown. In order to see what the optimizer's decisions are, we will have to redefine the method "execute".

In order to cancel the execution, press the "Cancel" button in the getInput dialog box.


Wednesday, March 1, 2017

example16_on_error_in_populate_belief_for_consequence.zubr

In this example we will add an error handling method. First we have to tell zubr we will do it; in the declaration section add:

%option onerrorinpopulatebeliefforconsequence own 
// with this option we tell zubr we will provide our own implementation of the method 
// onErrorInPopulateBeliefForConsequence

Then in the definition section we provide its implementation:


// this is the implementation of the method we promised to provide:

protected void onErrorInPopulateBeliefForConsequence(Belief formerBelief, Action formerAction, VisibleState vs) {
    JOptionPane.showMessageDialog(frame, "error in populate belief for consequence", "Error", JOptionPane.ERROR_MESSAGE);
    System.exit(0);
}


As you can see, the method takes the following arguments:
  • Belief
  • Action
  • VisibleState
They are all classes defined by zubr. The error occurs when, given the current belief, we perform an action and obtain a visible state that was unexpected, i.e. one the model considers impossible. Our implementation ignores the arguments and simply displays a dialog message, then exits.
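
Exiting is not the only option. A hypothetical variant could log the surprising observation (using the getters seen in the other examples) before giving up:

protected void onErrorInPopulateBeliefForConsequence(Belief formerBelief, Action formerAction, VisibleState vs) {
    // report which visible state the model could not explain
    // (alpha is the only input variable in this example)
    System.err.println("unexpected visible state: alpha => " + vs.getVariableValue("alpha").getName());
    System.exit(1);
}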

If you download zubr (https://sourceforge.net/projects/perkun/) and process the file example16_on_error_in_populate_belief_for_consequence.zubr from the "examples" folder, you will obtain the optimizer Java code:

zubr example16_on_error_in_populate_belief_for_consequence.zubr > MyOptimizer.java

This Java code can be compiled and executed. You can run it from my server:

http://www.pawelbiernacki.net/optimizerWithErrorHandling.jnlp

This program just lets you input alpha (twice) and then exits, displaying our error message. Why? The problem is that we did not provide any model, i.e. whatever happens will be unexpected to our optimizer. In the next example we will provide the model and things will become more interesting.