P-Switcher
1.1 Switch on loss and switch on loss/tie
When pspace was first implemented on the hills, one of the first switching
algorithms was the simple 'switch on loss' or 'switch on tie or loss'.
res ldp.ab _RESULT, #0
str ldp.a _STRATEGY, str1 ;load strategy in use
sne.ab #0, res ;check result
lost add.a #1, str1 ;lost change
mod.a #X, str1 ;X = number of strategies to switch
win stp.ab str1, _STRATEGY
str1 jmp @0, strat0
dat 0, strat1
dat 0, strat2
. . .
dat 0, stratX-1
Length: 6 + 1/strategy
Cycles to completion: after Win/Tie: 6
Loss : 7
The code is short and decisions are made quickly. However it is definitely
lacking in sophistication ( which you don't always need to be effective ).
The mod instruction ensures that a valid strategy is chosen even if brainwashed.
However, since there is no active brainwash detection, the strategy
chosen is essentially random. The algorithm does average ( for a pspace
warrior ) against non-pspace opponents, but tends to do better against
more compex p-opponents that take multiple rounds to make decisions. Jack in the
box used this type of p-logic quite successfully.
The following switcher is very similar to the above one, but with two changes: firstly, and
less important, the add.a #1,pTable line has been changed to
nop.f }pTable,}-1000: this has two advantages 1) If the b-field of this
instruction happens to be hit very early in the round, the warrior will still
behave properly and 2) It gets a free bomb in. Both minor advantages, but
there are no disadvantages, so why not? The second, and more important, change
is that the pspace value is not re-stored unless it has changed (note that it
still always undergoes a mod before being used); this saves a cycle and leads
to exactly the same behaviour. Disadvantages of this switcher: same as before,
and also it can't be modified to a switch-on-loss or tie, without losing it's
speed advantage.
If the jmn.b pGood,pGo line is changed to jmn.b pTable,pGo, the switcher gets
another cycle quicker on a win or a tie, but it then has the possibility of
suicide if it is brainwashed and doesn't lose. Whether this is a good thing or
not depends on the number of brainwashers on the hill, what they're
brainwashing with, how quick your boot routine is and countless other things.
pGo ldp.ab #pResult, #0
ldp.a #pSpace, pTable
jmn.b pGood, pGo
nop.f }pTable, }-1000
stp.ab pTable, #pSpace
pGood mod.a #pNum, pTable
pTable jmp.a @0, strat1
dat.f 0, strat2
Length: 6 + 1/strategy
Cycles: Win/Tie: 5
Loss: 7
Another possible solution is shown below. It is one instruction smaller but it
needs one cycle more (two cycles more for strategy 2) than the above mentioned code:
pGold ldp.ab #0, #0 ; get results of last battle
ldp.a #pflag, pGold ; retrieve attempted strategy
add.a #1, @pGold ; if a loss, increment strategy
mod.a #2, pGold ; safeguard against brainwashing
stp.ab pGold, #pflag ; store current strategy
jmz.a strat1, pGold ; select strategy 1
jmp strat2 ; select strategy 2
Length: 5 + 1/strategy
Cycles: Win/Tie: 6 for strat1 and 7 for strat2 (6 if strategy 2 immediately follows)
Loss: 6 for strat1 and 7 for strat2 (6 if strategy 2 immediately follows)
A powerful adaptation of the routine can be made with no extra
instructions. By increasing the MOD number we have an assymetric
switcher, by which the second strategy is selected more often than
the first. This can be very helpful in pairing up a strong
all-purpose warrior with a special-purpose warrior.
The following two switcher have the same behaviour: on a loss, they will
change to the next strategy in their table, whilst they will select a specific
strategy after a tie. The first switcher is slower but smaller than the second.
pSelect equ 0 ; which to select on tie
pGo ldp.ab #pResult, #0
ldp.a #pSpace, pTable
sne.ab #0, pGo
add.a #1, pTable
sne.ab #2, pGo
mov.a #pSelect, pTable
mod.a #pNum, pTable
stp.ab pTable, #pSpace
pTable jmp.a @0, strat1
dat.f 0, strat2
Length: 8 + 1/strategy
Cycles: Win: 7
Tie/Loss: 8
Switch on loss, select on tie number 2
pGo ldp.a #pResult, pJump
ldp.a #pSpace, pTable
pJump jmp.a @0, pLoss
dat.f 0, pWin
dat.f 0, pTie
[any number of instructions you like in here]
pTie mov.a #(pSelect-1),pTable
pLoss nop.f }pTable, }-1000
stp.ab pTable, #pSpace
pWin mod.a #pNum, pTable
pTable jmp.a @0, strat1
dat.f 0, strat2
Length: 9 + 1/strategy
Cycles: First round: 4
Win: 5
Tie: 8
Loss: 7
A more complex switcher was first released in Kline's Gem of the Ocean. It
is quite effective while maintaining simplicity. It uses an array to determine its
next strategy thereby controlling the sequence of warriors. The key here is
to select the proper sequence of warriors as well as good components.
parray dat 0, strat1 ;table for pswitching
dat 0, strat3 ;- stay after win
dat 0, strat1 ;- go to ptie after tie
dat 0, strat4 ;- move up one after loss
dat 0, strat1
dat 0, strat3
ptie dat 0, strat2
dat 0, strat4
presult ldp.ab #0, #0 ;pswitch code, nice and short
pselect ldp.ab #PFLAG, #0
sne.ab #0, presult ;advance table selection after a loss
add.ab #1, pselect
sne.ab #2, presult ;go to Sweep after a tie
mov #ptie-parray,pselect
stp.ba pselect, pselect
mod.ab #presult-parray,pselect
add.ba pselect, pchoice
pchoice jmp @parray, {pselect ;jump to selection
Length: 10+1/strategy
Cycles: First round: 4
Win: 6
Tie: 5
Loss: switch: 12
no switch: 10
1.2 Switch on loss or loss/tie with brainwash detection
To equip a switcher with a decent brainwash detection system is simple. One method
is to simply load a number into pspace on the first round and check it every
remaining round to see if it's been tampered with. You could reset it and
run a special anti-wash warrior or stick with the anti-wash warrior for the
rest of the battles. This could take a lot of extra p-code. A better method
is may be this:
res ldp.ab _RESULT, #0
str ldp.a _STRATEGY, str1 ;load strategy in use
mod.a #97, str1 ;valid strategies are all mod 97
jmn.a washed, str1 ;do 'have been washed' code [then reset?]
ldp.a _STRATEGY, str1 ;got to be a way to get rid of repeating this
sne.ab #0, res ;check result
lost add.a #97, str1 ;lost change
mod.a #X, str1 ;X = number of strategies to switch
win stp.ab str1, _STRATEGY
str1 jmp @0, strat0
dat 0, strat1
dat 0, strat2
. . .
dat 0, stratX-1
Length: 9 + 1/strategy
Cycles: brainwashed: 4
Win: 9
Tie: 9
Loss: 10
This is almost exactly the same p-code mentioned at first in chapter 1.1 with
the addition of three extra lines
of instructions to ensure that all valid strategies are mod 97. A brainwash
would have to wash with a multiple of 97 (unfortunately zero is one of them)
to escape detection. Of course the special number can vary as long as the
mod.a #X instruction yields the proper series to switch your components.
1.3 Switch on two consecutive loss/tie
Quickness is especially important after the release of a new generation of
superfast quickscans (Q^2, Q^3, etc). The downside is no matter what effective
strategy you are using, a spurious loss will switch you to a less effective one.
It may take even another loss to return to the optimal strategy if you are switching more than one
component. To combat this phenomena, a switch on two consecutive
tie/loss combinations was developed. Strategies will only change on a loss loss, tie loss,
loss tie, or tie tie.
A loss or tie will advance the table pointer contained in _STRAT. If the
_STRAT pointer is 0 or 2, that strategy will get its 'second chance'. If the
index is 1 or 3, strategies will switch. A win should reset the pointer to
either 0 or 3 depending on what strategy is currently in use. This was done by
taking the index modulo 2 and subtracting the result from the current index.
_RES EQU #0
_STRAT EQU #1
result ldp.ab _RES, wlt
strat ldp.a _STRAT, choice
wlt sne.ab #1, #0
jmp win ; win has occured
add.a #1, choice ; loss or tie has occured
mod.a #4, choice
stp.a choice, _STRAT
choice jmp @0, strat1 ; _STRAT = 0, choose strat1
nop 0, strat1 ; _STRAT = 1, choose strat1
nop 0, strat2 ; _STRAT = 2, choose strat2
nop 0, strat2 ; _STRAT = 3, choose strat2
win mov.ab choice, #0 ; reset index becuase of win
mod.ab #2, win
sub.ba win, choice
jmp choice-1
Length: 11 + 2/strategy
Cycles: Win: 8
Tie: 7
Loss: 7
All in all, the brain is much bigger (8 vs. 15) and a bit slower. But it
should has the smarts necessary to defeat non-pspace programs quite handily.
However this isn't the case. When switching the components the logic performe
similarly or worse! The optimal strategy would win for awhile, but the occasional loss
tie combination would cause it to switch. The secondary strategy (being a
somewhat general purpose strategy) would not switch out as fast due to the
new switching rules. This seemed to balance the extra wins gained. It seems
that the p-logic would be much more effective if the secondary strategy
is very specialized against one type of program and losses quickly against
everything else.
1.4 Switch on n losses
If the warrior loses for pLosses
consecutive rounds, then it will switch to the next strategy in its table.
This means that a spurious loss for the optimal strategy won't kick in another
strategy. Also, after a win
or a tie, it is as quick as the above switchers, but is quite a bit slower
after a loss, and also bigger.
One potential problem with this switcher is that it is very zero brainwash
sensitive: if it is brainwashed with a zero, the first strategy in the table
will be selected, and the loss counter reset, which could produce problems if
this strategy lost reguarly to the brainwasher. If pLosses is 2, the switcher
can be modified by changing pLoss to ldp.ab #pSpace2,#1 and pThink to jmz.b
pWin,pLoss, to make it sensitive to a 1-brainwash rather than 0-brainwash,
whilst if pLosses>2, then changing pLoss to ldp.ab #pSpace2,#1 and pThink to
cmp.b pLoss,#1 will have the same effect, but increasing the running time by 1
cycle for a loss without switch.
One final modification that can be made is changing the first three lines to
pGo ldp.ab #pResult,pJump
ldp.a #pSpace1,pTable
pJump jmp.a @0,pLoss
dat.f 0,pTie
which will lead to a switch after pLosses rounds without a win. Note that this
modification is not compatible with the brainwash changes above, and changes
the stats to: Length: 12+1/strategy. Cycles: Win: 6, Tie/Loss no switch:10,
Tie/Loss switch: 12, First round: 4
pSpace1 equ 1 ; used to store the current strategy
pSpace2 equ 2 ; used to store the loss counter
pLosses equ 3 ; how many losses before we switch
pGo ldp.ab #pResult,#0
ldp.a #pSpace1,pTable
pJump jmn.b pWin,pGo
pLoss ldp.ab #pSpace2,#0
nop.f >pLoss,}-1000
mod.ab #pLosses,pLoss
pThink jmn.b pWin,pLoss
nop.f }pTable,}-2000
stp.ab pTable,#pSpace1
pWin stp.b pLoss,#pSpace2
mod.a #pNum,pTable
pTable jmp.a @0,strat1
dat.f 0,strat2
end pGo
Length: 11 + 1/strategy
Cycles: Win/Tie: 6
Loss: when no strategy switch occurs: 10
when switch occurs: 12
The following switcher is very similar to the switch on n consecutive losses switcher
above, but has different enough behaviour to warrant a separate entry.
The only difference here is that a
tie will not reset the loss count, so a sequence of loss-tie-loss-loss will
cause a switch, if pLosses is equal to 3.
pGo ldp.ab #pResult,pJump
ldp.a #pSpace1,pTable
pJump jmp.a @0,pLoss
dat.f 0,pWin
dat.f 0,pTie
[any number of instructions here]
pLoss ldp.ab #pSpace2,#0
nop.f >pLoss,}-1000
mod.ab #pLosses,pLoss
pThink jmn.b pWin,pLoss
nop.f }pTable,}-2000
stp.ab pTable,#pSpace1
pWin stp.b pLoss,#pSpace2
pTie mod.a #pNum,pTable
pTable jmp.a @0,strat1
dat.f 0,strat2
Length: 13+1/strategy
Cycles: First round: 4
Win: 6
Tie: 5
Loss: switch: 12
no switch: 10
A p-switcher with a more sophisticated scoring system is shown below:
start ldp.ab _STRATEGY, check
jmz.b switch, check ; washed or first round, just do a switch
check mod.ab #97, #0000 ; if not multiple of 97, have been washed!
jmn.b scanner, check ; washed --> do scanner (can do any plogic)
ldp.a _RESULT, keeper
ldp.ab _SCORE, adjust
keeper jmp @0, adjust+1 ; result == 0 == loss
jmp 0, adjust ; result == 1 == win
jmp 0, adjust+2 ; result == 2 == tie
adjust add.ab #6, #0 ; win
add.ab #-2, adjust ; lose
add.ab #-1, adjust ; tie
ldp.a _STRATEGY, select ; again. can't seem to save it.
slt.b adjust, #7000 ; did it go negative?
jmp switch
return stp.b adjust, _SCORE
mod.a #3, select
select jmp @0, scanner
jmp 0, stone
jmp 0, paper
switch add.a #97, select
stp.ab select, _STRATEGY
mov.ab #10, adjust
jmp return
Length: 21 + 1/strategy
Cycles: first round: 4
brainwashed: 9 (4 if brainwashed with 0)
Win: 20
Tie: 18
Loss: 19
This code creates a 'score' for each strategy starting at 10. For every
win the score increases by 3, tie--decrease by 1, loss--decrease by 3. The
strategy switches when the score goes negative. When fighting a non-pspace
opponent, the logic should quickly find the proper strategy and not deviate
from the optimum.
1.5 The P^2 switchers
Most of the above presented various switching routines have
reasonably good response times. The disadvantage of all this routines is that
they are not very flexible. A switching strategy that works well for one
component may work poorly with another. Any attempt to rectify this usually
means using a large, slow switching routine.
To solve this Anton Marsden wrote a general purpose switching routine (P^2) that is
fast no matter what strategy the 'programmer' wanted to use.
The P^2 is a fast table-based switcher which can manage any conceivable strategy
or combination of strategies using 6 cycles. However, the tables are
potentially large and for this reason it may not be the best switcher to use
in some cases, especially if you want to use a quickscan in your P-Warrior.
Below is a demonstration of a switch-on-3-consecutive-losses routine containing 2 components.
1. Draw a state transition diagram showing all possible states. Each state
should have three outward arrows for win, loss, and tie inputs. Each arrow
should identify a component to be run and point to the next state. Note that
'components' and 'states' are not equated in this logic as they are in many
previous switchers. State numbers should start from 0.
Key:
{n}=state n
W=win
L=loss
T=tie
c1=component 1
c2=component 2
The following diagram is a bit messy but you should get the idea.
+-------------------------------------------+
| T,W (c1) |
| +-------------------+ |
| | T,W (c1) | |
| v | |
+->{0}---------------->{1}---------------->{2}----->{3}
| ^ L (c1) L (c1) L (c2)
| |
+-+
T,W (c1)
+-------------------------------------------+
| T,W (c2) |
| +-------------------+ |
| | T,W (c2) | |
| v | |
+->{3}---------------->{4}---------------->{5}----->{0}
| ^ L (c2) L (c2) L (c1)
| |
+-+
T,W (c2)
2. Pick an initial state transition, eg. {2} -> {0} using c1.
3. Put all loss transitions into a data table where entry n-1 contains the
state transition information for state n-1:
loss_table
dat 1,c1 ; State 0 goes to state 1 on a loss using component 1.
dat 2,c1 ; State 1 goes to state 2 on a loss using component 1.
dat 3,c2 ; State 2 goes to state 3 on a loss using component 2.
dat 4,c2 ; State 3 goes to state 4 on a loss using component 2.
dat 5,c2 ; State 4 goes to state 5 on a loss using component 2.
dat 0,c1 ; State 5 goes to state 0 on a loss using component 1.
4. Do the same for wins and ties.
win_table
dat 0,c1 ; State 0 goes to state 0 on a win using component 1.
dat 0,c1 ; State 1 goes to state 0 on a win using component 1.
dat 0,c1 ; State 2 goes to state 0 on a win using component 1.
dat 3,c2 ; State 3 goes to state 3 on a win using component 2.
dat 3,c2 ; State 4 goes to state 3 on a win using component 2.
dat 3,c2 ; State 5 goes to state 3 on a win using component 2.
tie_table
dat 0,c1 ; State 0 goes to state 0 on a tie using component 1.
dat 0,c1 ; State 1 goes to state 0 on a tie using component 1.
dat 0,c1 ; State 2 goes to state 0 on a tie using component 1.
dat 3,c2 ; State 3 goes to state 3 on a tie using component 2.
dat 3,c2 ; State 4 goes to state 3 on a tie using component 2.
dat 3,c2 ; State 5 goes to state 3 on a tie using component 2.
5. Set up the initial state.
init_state EQU (tie_table+2) ; {2} -> {0} using c1.
6. Put the tables and components into P^2. Remember to set STATES to the
correct value (6 in this case).
7. Optimise the code - this is left as an exercise for the reader.
How it works: the result of the last fight is loaded as well as the state used
for the last fight. From this information the next state is determined using
the tables that were constructed from the state transition diagram. Then the
component associated with the transition is executed.
;redcode-94
;name P^2
;author Anton Marsden
;assert CORESIZE==8000
PSTATE EQU 666 ; pspace location containing current state
STATES EQU 6 ; maximum number of states (for brainwash protection)
;NOTE: state values go from 0 to STATES-1
dat 0,init_state-state
in dat 0,loss_table-state
dat 0,win_table-state
dat 0,tie_table-state
think ldp.a #0,in ; get input value
load ldp.a #PSTATE,state ; load old state
mod.a #STATES,state ; brainwash protection
add.ba *in,state ; select correct state table
store stp.a *state,load ; store new state
state jmp @0 ; jump to warrior code
; You could also delete the #PSTATE value
; if you wish (or the stp.a instruction).
c1 EQU stone_boot
c2 EQU paper_boot
loss_table
dat 1,c1
dat 2,c1
dat 3,c2
dat 4,c2
dat 5,c2
dat 0,c1
win_table
dat 0,c1
dat 0,c1
dat 0,c1
dat 3,c2
dat 3,c2
dat 3,c2
tie_table
dat 0,c1
dat 0,c1
init_state dat 0,c1
dat 3,c2
dat 3,c2
dat 3,c2
END think
The P^2b switcher is based upon the ideas presented already in the P^2 switcher. P^2 has a major
weakness: it needs a very large table to implement even simple state
transitions. Generally, the state transition rules we use for one component
can be used with another - but P^2 doesn't take this into account. P^2b
attempts to use this transition similarity property.
P^2b takes 7 instructions to execute (as opposed to 6 for P^2).
In order to keep the code size to a minimum, it is required that state 0 and
state 1 execute the same warrior. Apart from this restriction, it should be
possible to implement any state transition table that was possible in P^2.
However, keep in mind that the initial state is a little bit tricky to set
up properly.
;redcode-94
;name P^2b
;author Anton Marsden
;strategy P-switcher
;assert CORESIZE==8000
PSTATE EQU 666 ; pspace location containing current state
STATES EQU 9 ; number of states
T1 EQU (t1-in)
T2 EQU (t2-in)
T3 EQU (t2-in)
think ldp.a #0,in ; get input value
load ldp.a #PSTATE,state ; load old state
mod.a #STATES,state ; brainwash protection
add.ba *state,in ; select correct state table
in add.ba 0,state
store stp.a state,load ; store new state
state jmp }0,T1 ; jump to warrior code
dat w1,T2 ; if state.A == 0, the A field of this
; instruction will be jumped to.
dat w1,T3
dat w2,T1
dat w2,T2
dat w2,T3
dat w3,T1
dat w3,T2
dat w3,T3
t2 dat w1,1 ; loss - add 1 to the state
dat 0,-1 ; win - subtract 1 from the state
dat 0,0 ; tie - leave the state as it is
t1 dat 0,1 ; loss - add 1 to the state
dat 0,0 ; win - leave the state as it is
dat 0,0 ; tie - leave the state as it is
;Warrior code
w1: jmp #1,1
w2: jmp #2,2
w3: jmp #3,3
END think
T1, T2 and T3 correspond to different states a component can be in.
In this case, T2 and T3 use the same win/loss/tie table to switch states.
Another P^2 variation was first introduced in Ian Oversby's Nomolos. It uses just one
table instead of 3 as in the case of the original P^2 switcher. The advantage is that
the switching pattern is easier to modify, because it is clearly seen in the code.
init dat.f #0, #0
loss dat.f #0, #1
win dat.f #0, #-1
tie dat.f #0, #0
pcode ldp.a #0, loss
ldp.a #plc, table
add.ba *loss, table
mod.a #19, table ; (-1 mod 19) = 0 :-)
stp.ab *table, #plc
table jmp.b @0, w1 ; State 0
dat.f #1, w1 ; Stone
dat.f #2, w1
dat.f #3, w1
dat.f #4, w1
dat.f #6, w2 ; Must always waste one line :-(
dat.f #6, w2 ; One-shot
dat.f #7, w2
dat.f #8, w2
dat.f #9, w2
dat.f #10, w2
dat.f #12, w3 ; Must always waste one line :-(
dat.f #12, w3 ; Mini-HSA
dat.f #13, w3
dat.f #14, w3
dat.f #15, w3
dat.f #17, w4 ; Paper
dat.f #17, w4
dat.f #17, w4 ; Safety net
1.6 The P^3 switcher
Redcoders often design their pspace-based switching routines in the form of
a state-transition diagram as already shown in Chapter 1.5:
.->------. L = loss
| | W = win
W,T | __V______ T = tie
(w0) | | |<---.
`--<--| State 0 | \ w0 = strategy A (e.g. scissors)
`---------' `. w1 = strategy B (e.g. paper)
| L (w1) |
| |
____V____ |
.---->| | |
| | State 1 |---->--'
W,T | `---------'
(w1) | | L (w0)
`--<----'
Each round gives the program a new chance to change its mode of operation,
depending on the previous state and the result of the previous battle. In
addition, there is some action (or strategy) associated with each transition
between states:
State Input Next Action
0 L 1 w1
0 W 0 w0
0 T 0 w0
1 L 0 w0
1 W 1 w1
1 T 1 w1
Note that this pspacer would run w0 upon all transitions to state 0, and
that it would run w1 upon all transitions to state 1. Most interesting P^2
switchers execute the same program upon all transitions to the same state.
Can you see why?
If we assume that all transitions to a particular state run the same
warrior, then we no longer need to specify an action for each transition;
we simply specify an action for each state. That leads to a more compact
representation of the table:
State Action L W T
0 w0 1 0 0
1 w1 0 1 1
The corresponding diagram becomes:
.->------. L = loss
| | W = win
W,T | __V______ T = tie
| | |
| | State 0 |<---.
`--<--| (w0) | \ w0 = strategy A (e.g. scissors)
`---------' `. w1 = strategy B (e.g. paper)
| L |
| |
____V____ |
.---->| | |
| | State 1 | |
| | (w1) |---->--'
W,T | `---------'
| | L
`--<----'
The switcher for Electric Head has that property. Specifically, Electric
Head implemented the following state-transition table:
State Action L W T
0 w0 1 0 1 <---- initial state
1 w0 2 0 2
2 w0 3 0 3
3 w1 4 3 3
4 w1 5 3 4
5 w1 6 4 5
6 w2 7 6 0
7 w3 8 7 7
8 w3 9 7 8
9 w3 3 8 9
A P^3 switcher can implement the same table in less time and in less space
than Electric Head's original code:
think ldp.a #0, in
ldp.a #PSTATE, table
mod.ba *in, table
stp.b *table, #PSTATE
table jmp }0, 441 ; = (44*10) + 1 = (49*9) + 0 = (40*11) + 1 ;initial state
dat w0, 882 ; = (88*10) + 2 = (98*9) + 0 = (80*11) + 2
dat w0, 333 ; = (33*10) + 3 = (37*9) + 0 = (30*11) + 3
dat w1, 894 ; = (89*10) + 4 = (99*9) + 3 = (81*11) + 3
dat w1, 345 ; = (34*10) + 5 = (38*9) + 3 = (31*11) + 4
dat w1, 346 ; = (34*10) + 6 = (38*9) + 4 = (31*11) + 5
dat w2, 627 ; = (62*10) + 7 = (69*9) + 6 = (57*11) + 0
dat w3, 898 ; = (89*10) + 8 = (99*9) + 7 = (81*11) + 7
dat w3, 349 ; = (34*10) + 9 = (38*9) + 7 = (31*11) + 8
dat w3, 53 ; = ( 5*10) + 3 = ( 5*9) + 8 = ( 4*11) + 9
dat w3, 349 ; = (34*10) + 9 = (38*9) + 7 = (31*11) + 8 ;unreachable
in dat 0, 10 ; must have non-zero b-field in the previous cell
spl 1, 9
spl 1, 11
For example, after a transition to state 4, the pspacer will remember
the code 345. In the case of a loss, the next state will be decoded by
345 mod 10; after a win, the next state would be 345 mod 9; after a tie,
the next state would be 345 mod 11. Each of these decodings gives the
proper transition from one state (4) to the next (5, 3, or 4).
Note that we cannot specify different warriors for two different transitions
to the same state. However, as we have seen before, that is a very soft
restriction. Another restriction is that the 0th state must use the same
warrior as the 1st state.
Now we show how to construct the codes that are used in the table. This
requires some abstract algebra, but the results will be summarized below.
First, we must introduce some notation.
Operators listed from highest to lowest priority:
gcd(a,b) the greatest common divisor of a and b
a =< b a is less than or equal to b
a >= b a is greater than or equal to b
a and b logical and
a -> b a implies b (if a is true, then b is true)
"mod" will no longer refer to the MOD operation from redcode. Instead:
a mod n a modulo n
a *' b (mod n) the product of a and b modulo n
a +' b (mod n) the sum of a and b modulo n
a == b (mod n) a is congruent to b modulo n (also: n divides a-b)
I will borrow the % operation from the C language, which is similar to the
MOD from redcode:
a % b the unique c such that 0 =< c < b and c == a (mod b)
One of the theoretical results requires some more notation:
Z the set of all integers
Z mod n the set of all congruence classes modulo n
G x H the group formed by the Cartesian product of groups G and H
Modular arithmetic should look familiar to Redcoders, since all arithmetic
on the standard hills is done in Z mod 8000.
Examples:
1 == 4 (mod 3) -> 3*k = 1-4 for some integer k
2 *' 3 == 1 (mod 5)
(2 * 3) % 5 = 1
gcd(3,3) = 3
gcd(2,3) = 1
gcd(2,4) = 2
We begin with a semi-formal description of the switcher:
.------------------------------------------------------.
| P^3 switcher |
|------------------------------------------------------|
| r,s,t are positive integers |
| |
| For each state in the table, there are non-negative |
| integers p,a,b,c such that p < CoreSize and |
| |
| a < r and p == a (mod r) and |
| b < s and p == b (mod s) and |
| c < t and p == c (mod t) |
`------------------------------------------------------'
There will be different p,a,b,c for each state. However, since each state
may be considered independently, we won't bother to put subscripts on those
variables.
In the example above, r = 10, s = 9, and t = 11. Depending on the state,
p could be 441, 882, 333, ..., or 53. In the 0th state, a = 1, b = 0,
and c = 1.
To guarantee that we can compute any transition table, we need to choose
r,s,t such that for all possible a,b,c, there is a value for p such that
the above properties hold. In that case, there are r*s*t possible triplets
(a,b,c). It would be nice if every p between 0 and r*s*t-1 resulted in a
different triplet (p mod r, p mod s, p mod t). Then every possible triplet
(a,b,c) could be computed by choosing one of the first r*s*t non-negative
integers for p. That would guarantee that we can compute any transition
table if r*s*t does not exceed CoreSize.
Using some results from group theory, we can make those guarantees true
if gcd(r,s)=1 and gcd(r,t)=1 and gcd(s,t)=1. For ambitious mathematicians
who want to prove that statement, two of the relevant theorems are:
Z mod m x Z mod n is cyclic if and only if gcd(m,n)=1.
Every finite cyclic group of order n is isomorphic to Z mod n.
To express any possible transition table with n states, we must be able
to compute all triples (a,b,c) such that a<n, b<n, and c<n. Therefore, we
should choose r,s,t such that r >= n, s >= n, and t >= n. If we do that,
then we can assume that either r>n, s>n, or t>n (otherwise, either n=1,
gcd(r,s)>1, gcd(r,t)>1, or gcd(s,t)>1).
The result is that we will have some extra states which are not reachable
from the transitions specified by our n-state table. We should specify the
transitions for those unreachable states just in case a lucky opponent
brainwashes the switcher. That is why we added an extra table entry in the
example from Electric Head. We chose the same transitions as state 8 since
we hypothesized that Carbonite was the most brainwash-resistant component.
Since the size of the table will be the maximum value of r,s,t, we should
choose r,s,t to minimize max{r,s,t}. Since r,s,t are at least as large as n,
we can't achieve any result that is better than s=n, r=n+1, and t=n+2. In
fact, if n is odd, then that particular assignment does satisfy all of the
constraints. If n is even, then gcd(r,t) = gcd(n,n+2) = 2, so we are forced
to look for larger r,s, or t. In that case, we can simply add one state to
the table (to make n odd) to achieve the next best possible result.
That gives us the following rule for selecting r,s,t for an n-state table:
If n is odd, then m = n+1.
If n is even, then m = n+2.
r = m, s = m-1, t = m+1.
Now that we have fixed r,s,t, we can compute the value p for each state
given a,b,c. We use the following result:
.-----------.
| Theorem |
`-----------'
Given a,b,c,m,n,i,j,p such that
m = n+1 if n is odd
m = n+2 if n is even
0 =< a < m and
0 =< b < m-1 and
0 =< c < m+1
i = b+c-2*a
j = (i+1+m)/2 if i is odd
j = i/2 if i is even
p = (j*(m-1)+b-a)*m+a
Then
p == a (mod m)
p == b (mod m-1)
p == c (mod m+1)
.-----------.
| Proof |
`-----------'
Suppose that i is odd.
j = (i+1+m)/2
-> j = (i+1)/2 + m/2 (since m is even and i is odd)
-> 2*j = (i+1) + m
-> 2*j == i+1+m (mod m+1)
-> 2*j == i +' m+1 (mod m+1)
-> 2*j == i +' 0 (mod m+1)
-> 2*j == i (mod m+1)
Suppose that i is even.
j = i/2 -> 2*j = i -> 2*j == i (mod m+1)
In either case, 2*j == i (mod m+1).
2*j == i (mod m+1)
-> 2*j == b+c-2*a (mod m+1)
-> j *' 2 == b+c-2*a (mod m+1)
-> j *' (-2) == 2*a-c-b (mod m+1)
-> j *' (m+1-2) == 2*a-c-b (mod m+1)
-> j *' (m-1) == 2*a-c-b (mod m+1)
-> j*(m-1) == 2*a-c-b (mod m+1)
-> j*(m-1)+b-a == a-c (mod m+1)
Substitute k for j*(m-1)+b-a.
k == a-c (mod m+1)
-> k*(-1) == c-a (mod m+1)
-> k *' (-1) == c-a (mod m+1)
-> k *' (m+1-1) == c-a (mod m+1)
-> k *' m == c-a (mod m+1)
-> k*m == c-a (mod m+1)
-> k*m+a == c (mod m+1)
-> (j*(m-1)+b-a)*m+a == c (mod m+1)
-> p == c (mod m+1)
This proves 1/3 of the theorem. Furthermore,
k = j*(m-1)+b-a -> j*(m-1) = k-(b-a)
-> k == b-a (mod m-1)
-> k*1 == b-a (mod m-1)
-> k *' 1 == b-a (mod m-1)
-> k *' (m-1+1) == b-a (mod m-1)
-> k *' m == b-a (mod m-1)
-> k*m == b-a (mod m-1)
-> k*m+a == b (mod m-1)
-> (j*(m-1)+b-a)*m+a == b (mod m-1)
-> p == b (mod m-1)
For the last part, we simply observe:
p = k*m+a -> k*m = p-a -> p == a (mod m)
Thus concluding the proof of this theorem.
.-----------.
| Summary |
`-----------'
In summary, to construct a table of n states:
m = n+1 if n is odd
m = n+2 if n is even
For each state, choose integers a,b,c such that
0 =< a < m and
0 =< b < m-1 and
0 =< c < m+1
i = b+c-2*a
j = (i+1+m)/2 if i is odd
j = i/2 if i is even
p = (j*(m-1)+b-a)*m+a % m*(m-1)*(m+1)
p % m = a
p % m-1 = b
p % m+1 = c
.-----------.
| Example |
`-----------'
For Electric Head, the number of states is n = 10. Normally, m = 12 to
guarantee that all possible tables can be implemented. However, in this
case, m = 10 will work if we choose to decode with m on loss, m-1 on win,
and m+1 on tie. This implementation works because we know the particular
a,b,c values for each state and they are within the bounds:
0 =< a < 10 (a is the transition on loss)
0 =< b < 9 (b is the transition on win)
0 =< c < 11 (c is the transition on tie)
Now we must determine the p value for each state. For state 0,
a = 1
b = 0
c = 1
i = 0+1-2*1 = -1
j = (-1+1+10)/2 = 5 (since i was odd)
p = (5*(10-1)+0-1)*10+1 % 10*(10-1)*(10+1)
= (5*9-1)*10+1 % 10*9*11
= 441 % 990
So p = 441 is the value that will actually appear in the redcode. We verify
that it is indeed correct:
441 % 10 = 1
441 % 9 = 0
441 % 11 = 1
Note that m*(m-1)*(m+1) = 990 does not exceed CoreSize = 8000. Since
20*19*21 = 7980, any table with 19 states can be implemented in a CoreSize
of 8000. Of course, larger tables are possible with tricky coding.
1.7 Addaptive switcher
The first apperance of an addaptive switcher was in Neogryzor's P-Key. His entry for the Redcoders Frenzy Round 7. In this round one had to construct a
p-warrior from three given components (Cloner II, CLP and Frontwards), which cannot be modified.
The Redcoding required was *only* to create the p-brain. Each entry was fight each other
entry for 2000 rounds in a round robin tournament without self-fights.
The scoring was one point for a win and no points for a tie or loss. The scoring encourages offensive
switchers, so tieing was not an acceptable result.
The main idea was to change the switching method if it was not
successful. Notice that a single one switches always in the same way,
so it can be predictable. And all opponents in that round were p-warriors as well.
-loss -> switch to the next -> tie : The switch has not been
successful so it must be modified.
It is necessary to define three variables wich contain the change
of the strategy after every result: 'w','l' and 't'.
-on loss: strat=strat+l
-on tie: strat=strat+t
-on win: strat=strat+w
The P-brain starts working like a single switcher:
l=1 -> next strat if loss
t=2 -> previous strat if tie
w=0 -> same strat if win
But these constants will be modified if they are not effective:
result two rounds before last result action
------------------------ ----------- ------
tie tie t=t+2 (same as -1)
loss tie l=l+2
win tie w=w+2
tie loss t=t+1
loss loss l=l+1
win loss w=w+1
xxx win no changes, all right
It is recommended to use MOD #3,X after this changes to make sure
the constants are between 0 and 2, for brainwash detection can be done and
so the trace is easier, but I think this is not a real necessity.
The working order is:
1- adjust constants depending on the last result and two rounds before
2- Switch depending on the last result, using the modified constants
As you see, it is necesary to save into the p-space:
- the last strategy selected
- result two round before
- l, w and t
The variables are stored in the p-space in this order just to make
the code a little shorter:
D_STRAT EQU 249
D_PREV EQU 250
D_LOS EQU D_PREV+1
D_WIN EQU D_PREV+2
D_TIE EQU D_PREV+3
When working with the previous results we find a problem: in the
first round all variables are 0. Just checking the location 0 of the p-space
seems to be the solution. If it is = -1, the values are saved in the
p-space, but the problem doesn't ends here. The last result value is still -1. It
can be incremented using '>', but remember that 0 means a loss in the
previous fight and the switcher will change the 'l' constant, (the result two round
before is 0). The initial value of 'l' must be 0 because it will be changed
to 1 in the first round. An easier solution is to add 2 to the previous
result, so it will be = 1 and the initial values will not be changed, but it
needs one instruction more. P-key uses the first method, but it took a lot of
tweaking because some instructions needed to be modified:
RESULT: LDP.B 1,#0 ;if RESULT==-1, the JMZ line detects a
DAT 0,0
STRAT: LDP.AB #D_STRAT,#0
PREV: LDP.AB #D_PREV,#0
JMZ.F INIT,>RESULT ;if RESULT==-1 then = 0 and jumps, else it is
; decremented later
(INIT loads the default values assuming it is the first round.)
Well, the most important thing is already done. Now one have to explain
some important details:
The strategies are identified by comparison:
* 0 < strat < 2667 :Cloner II (1000 by default)
* 2667 <= strat < 5334 :CLP
* 5334 <= strat < 0 :Frontwards
The switch is made easily just adding 2667 or 2*2667,
(CORESIZE/3), to the previous strategy. Obviously CORESIZE%3 is not = 0, so after 3
switches, the strat has been incremented or decremented. The default
starting value is 1000 for Cloner, so after 3*(1000) backward switches
the strategy number gets wrong, (in the worst case). That's 3000
consecutive switches, but only 2000 rounds were fought, so there was no problem.
Though it can be done by defining the strategies as 0, 1 and 2.
The switch can be done by adding 1 or 2 and then using MOD #3 to ensure
the strategy value is not greater than 2, so one more instruction is
needed. Otherwise the selection and the strategy launch is easier and shorter.
Both methods were tested and the first one has shown to score
better. I believe this probably happens because the first one is more robust
when brainwashed with 0, but it depends on the opponent's brainwash.
P-key has partial brainwash detection capability by finding unexpected
values in a variable. If the previous result is > 3, that means that
the warrior has been brainwashed, and the default values are loaded into
the p-space like in the first round. Though when it has been brainwashed
with a number between 0 and 3, (most opponents were using 0), it is not
detected. Fortunately the p-brain seems to be not so vulnerable after a
brainwash.
JMZ.F INIT,>RESULT ;first round check
SLT.AB #3,PREV ;brainwash detection.There is a little
bug. It should be #2 not 3
JMP CHLS,<RESULT ;jumps to the switcher
INIT: STP.AB #2,#D_TIE ;loads default values into p-space
STP.AB #0,#D_WIN
STP.AB #0,#D_LOS ;notice that this value will change to 1 in the first round
MOV #1000,STRAT ;STRAT=ClonerII, but it will change to CLP in the first round
In the first round RESULT=-1. It is incremented by the check.
Then RESULT=0. It means that the last result is loss, so the strategy will
change from Cloner II to CLP.
The last detail is the selected starting strategy. Using Frontwards as
first strat is riskly because it scores 50% against itself and it is
beaten, (and brainwashed) by CLP. Cloner II beats CLP but it can be
brainwashed too, (it's a paper). Being brainwashed at the beginning is a very bad
start. The warrior could get stuck executing the default starting strategy,
and being brainwashed repeatedly. If a warrior is caught in this
situation, it is condemned. CLP looks to be the best default strat when starting
and P-key uses it. Anyway, Cloner II is not so bad, and Frontwards is not
recommended.
Finally I can say there are 4 characteristics wich make P-key a
successful switcher:
* It's adaptative switching, which allows it to beat any single
switcher.
* The default selected data.
* Its switching engine, which showed to be less vulnerable than
others when brainwashed.
* The brainwash detection.
As you can see, the switch only depends on previous results, not
strategies, so P-key is not so effective against some table based or
semi-random switchers.
;redcode-94
;name P-Key
;author G.Labarga
;assert 1
;strategy learning switcher
;strategy It should defeat any single switcher
;strategy Redcoders Frenzy, round 7 winner, Feb. 22th 2003
STEP EQU 2667 ;switching step
D_STRAT EQU 249 ;variables are placed in consecutive
p-space locations
D_PREV EQU 250
D_LOS EQU D_PREV+1 ;the 'l', 'w' and 't' variables
D_WIN EQU D_PREV+2
D_TIE EQU D_PREV+3
RESULT: LDP.B 1,#0 ;<- Last round result
STRAT: LDP.AB #D_STRAT,#0 ;<- Last strategy
PREV: LDP.AB #D_PREV,#0 ;<- result two rounds before. This
comment was wrong in the submited version
JMZ.F INIT,>RESULT ;First round check, RESULT is
incremented
SLT.AB #3,PREV ;brainwash detection. Very slight bug:
should be #2 instead of #3
JMP CHLS,<RESULT ;jumps to CHLS and tweaks RESULT if it
is not the first round
INIT: STP.AB #2,#D_TIE ;first round or brainwashed: Load the
default values
STP.AB #0,#D_WIN
STP.AB #0,#D_LOS
MOV #1000,STRAT ;CLP will be the first executed
strategy
CHLS: JMN CHWI,RESULT ;Check the last round result
MOV.A #1,AWLT ;loss-> last decision,(w/l/t) ++
CHWI: SNE #1,RESULT
JMP ACT ;win: w,l,t are not modified. jumps to
the switcher
CHTI: SNE #2,RESULT
MOV.A #2,AWLT ;tie-> last decision,(w/l/t) --
ADJ: ADD.BA PREV,1 ;Adjust w/l/t
WLT: LDP.AB #D_PREV+1,#0 ;WLT: the variable that must be
modified
AWLT: ADD.AB #0,WLT ;increment or decrement,(-1=+2)
MOD.AB #3,WLT ;reduce switch to mod3
STP.BA WLT,WLT ;and save the modified variable
ACT: ADD.BA RESULT,1 ; Now switches the strategy depending
last result (W/L/T)
ACTS: LDP.AB #D_PREV+1,#0 ;Loads the variable, (w, l or t)
MUL.AB #STEP,ACTS ;strat = strat + 2667 * w/l/t
ADD.B ACTS,STRAT
SAVE: STP.B RESULT,#D_PREV ;last result saved in the p-space
STP.B STRAT,#D_STRAT ;the selected strat is saved in the
p-space
SLCT: SLT #STEP,STRAT ;Launches the selected strat
JMP SRC ;Cloner II
SLT #2*STEP,STRAT
JMP slDodger ;CLP
JMP boot ;Frontwards
; *** components size: 61 ins; P-brain: 32
But nevertheless keep in mind that adaptive switchers work best with specialist components, which beat the opponents components by a good percentage.
|