The Corewar Info Page

How to Create a Number of Parallel Processes

For warriors that contain a multi-process paper or a vector-launched imp, it is necessary to create a number of processes running in parallel. The usual method is to use a combination of spl 1 and mov -1,0 instructions. To generate an exact number of parallel processes, convert the number required to binary (3 -> 11), subtract one (-> 10), then, reading from the most significant bit, use a spl 1 for every one and a mov -1,0 for every zero.

E.g. 5 decimal = 101 binary, take away one = 100 binary. This becomes the following code:

   spl 1     ;
   mov -1,0  ; Generate 5 processes
   mov -1,0  ;
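The conversion described above can be sketched in a few lines of Python. This is only an illustration: the function names binary_loader and processes_after are made up for this page, and the process count assumes a single process enters the loader and that nothing else disturbs it. The count works because the first process to reach a mov -1,0 overwrites it with the spl 1 above it, so every later process splits there.

```python
def binary_loader(n):
    """Return the spl/mov lines that leave n parallel processes (n >= 2)."""
    if n < 2:
        raise ValueError("a loader is only needed for two or more processes")
    bits = bin(n - 1)[2:]  # e.g. n = 5 -> n - 1 = 4 -> "100"
    # Most significant bit first: 1 -> spl 1, 0 -> mov -1, 0
    return ["spl 1" if bit == "1" else "mov -1, 0" for bit in bits]


def processes_after(lines):
    """Count the processes leaving the loader, assuming one process enters it."""
    count = 1
    for line in lines:
        if line.startswith("spl"):
            count = 2 * count          # every process splits in two
        else:
            # The first process copies the spl 1 above over this mov, so the
            # other count - 1 processes split here and the first one does not.
            count = 2 * count - 1
    return count


for line in binary_loader(5):
    print(line)
print(processes_after(binary_loader(5)), "processes")
```

Running the check for a range of values of n is a quick way to convince yourself that the binary rule always lands on the requested number.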

However, it is possible to do this slightly faster, and perhaps even gain an extra b-field or two into the bargain, which can be used for storing data, or decrementing locations in core.

There are snippets which produce 3, 5, 7, 9 and 11 processes. If required, it is possible to add a number of spl 1,<nnn instructions to the end of the snippet. Each one will double the number of processes.

N   OLD CODE         NEW CODE         COMMENTS

3   spl  1, <xxx     spl  2, <xxx     One cycle faster, and one
    mov -1,    0     spl  1, <yyy     extra b-field

5   spl  1, <xxx     spl  2, <xxx     Faster by two cycles, and
    mov -1,    0     spl  2, <yyy     two extra b-fields
    mov -1,    0     spl  1, <zzz

7   spl  1, <xxx     spl  1, <xxx     One cycle faster, with no
    spl  1, <yyy     spl  1,   }0     extra b-fields, but won't
    mov -1,    0     spl  1, <yyy     work under ICWS'88 and is
                                      self-modifying.

9   spl  1, <xxx     spl  2, <xxx     Three cycles faster, plus
    mov -1,    0     spl  2, <yyy     two extra b-fields, but
    mov -1,    0     spl  1,   }0     won't work under ICWS'88
    mov -1,    0     spl  1, <zzz     and is self-modifying.

11  spl  1, <xxx     spl  1, <xxx     Two cycles faster, plus
    mov -1,    0     spl  1,   }0     one extra b-field, but
    spl  1, <yyy     spl  2, <yyy     won't work under ICWS'88
    mov -1,    0     spl  1, <zzz     and is self-modifying.

While you will be lucky if this method gains your paper a fraction of a point on the hill, it is still clearly better. Of course, the code for 2^n processes remains identical, and cannot be improved.
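The cycle savings quoted in the table can be checked with a little arithmetic. The sketch below assumes that only the warrior's own cycles are counted, that one process enters the loader, and that the best any loader can do is n - 1 cycles, since each executed spl adds exactly one process. The function names are hypothetical.

```python
def old_loader_cycles(n):
    """Cycles the spl 1 / mov -1,0 loader spends creating n processes."""
    bits = bin(n - 1)[2:]
    cycles, count = 0, 1
    for bit in bits:
        cycles += count   # every process present executes this line once
        count = 2 * count if bit == "1" else 2 * count - 1
    return cycles


def minimum_cycles(n):
    """Each executed spl adds one process, so n processes need n - 1 splits."""
    return n - 1


for n in (3, 5, 7, 8, 9, 11):
    saved = old_loader_cycles(n) - minimum_cycles(n)
    print(n, "processes:", old_loader_cycles(n), "cycles,", saved, "wasted")
```

The wasted cycles are exactly the mov -1,0 lines, each of which is executed as a mov once. A power of two has no mov lines at all, which is why the code for 2^n processes cannot be improved.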

There is also another snippet that saves an instruction, because the following single instruction generates 3 processes:

   spl   0, }0
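To see why this one line yields three processes, it helps to trace the process queue by hand. The Python sketch below models only this instruction, sitting at address 0, and treats everything after it as code that simply moves on to the next address. It assumes that the split target is worked out before the }0 post-increment changes the A-field, which is how the ICWS'94 draft is usually read; the function name is made up.

```python
from collections import deque


def run_spl_0_postinc(cycles=3):
    """Trace `spl 0, }0` at address 0; addresses 1 and up hold ordinary code."""
    a_field = 0            # A-field of the spl, bumped by }0 on every execution
    queue = deque([0])     # process queue, next process to run on the left
    trace = [list(queue)]
    for _ in range(cycles):
        pc = queue.popleft()
        if pc == 0:
            target = pc + a_field   # target is read before the post-increment
            a_field += 1            # }0 increments the A-field of this very line
            queue.append(pc + 1)    # the original process carries on
            queue.append(target)    # the new process starts at the target
        else:
            queue.append(pc + 1)    # stand-in for any non-branching instruction
        trace.append(list(queue))
    return trace


for row in run_spl_0_postinc():
    print(row)
```

The first execution acts as spl 0 and sends one process back to the same line, which by then reads spl 1. The second execution therefore releases two more processes. Note that the first process is already one instruction ahead of the other two.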

By adding further lines you can get the following numbers of parallel processes:

N   OLD CODE         ALT 1 CODE       ALT 2 CODE        

3   spl  1, <xxx     spl  2, <xxx     spl   0,   }0
    mov -1,    0     spl  1, <yyy

5   spl  1, <xxx     spl  2, <xxx     spl   0,   }0
    mov -1,    0     spl  2, <yyy     mov   asd, 0      
    mov -1,    0     spl  1, <zzz     (asd  spl 1)

6   spl  1, <xxx          -           spl   0,   }0
    mov -1,    0                      spl   1,   <xxx
    spl  1, <yyy

7   spl  1, <xxx     spl  1, <xxx           -
    spl  1, <yyy     spl  1,   }0
    mov -1,    0     spl  1, <yyy

9   spl  1, <xxx     spl  2, <xxx     spl   0,   }0
    mov -1,    0     spl  2, <yyy     mov   -1,  0
    mov -1,    0     spl  1,   }0     spl   1,   <xxx
    mov -1,    0     spl  1, <zzz

11  spl  1, <xxx     spl  1, <xxx     spl   0,   }0
    mov -1,    0     spl  1,   }0     mov   1,   0
    spl  1, <yyy     spl  2, <yyy     spl   1,   <xxx
    mov -1,    0     spl  1, <zzz

Keep in mind that some processes get a cycle ahead with these snippets. For example, the following code

spl 2
spl 1

will get the processes out of sync if it is entered by more than one process.
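The effect is easy to reproduce with a toy process queue. In the Python sketch below each spl is reduced to its A-field offset, and every address after the loader is treated as an instruction that just advances to the next line. Both are simplifications, and the names run, spread and LOADER are invented for the example.

```python
from collections import deque


def run(loader, start_processes, cycles):
    """Simulate direct-mode spl lines; addresses past the loader act as no-ops."""
    queue = deque([0] * start_processes)   # all processes enter at the first line
    for _ in range(cycles):
        pc = queue.popleft()
        queue.append(pc + 1)                # the executing process moves on
        if pc < len(loader):
            queue.append(pc + loader[pc])   # spl: the new process goes to the target
    return list(queue)


def spread(queue):
    """Distance in instructions between the leading and the trailing process."""
    return max(queue) - min(queue)


LOADER = [2, 1]   # A-fields of: spl 2 / spl 1

one = run(LOADER, 1, 2)   # a single process enters the snippet
two = run(LOADER, 2, 5)   # two processes enter the snippet
print(one, "spread", spread(one))
print(two, "spread", spread(two))
```

With one process entering, all three processes line up on the first line after the snippet. With two entering, one process has already moved past that line while the other five are still waiting to execute it.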

Incidentally, getting them out of sync isn't always bad. One could try to take advantage of this.

spl 1
spl 1
spl 2

The code above could, for example, allow booting one piece of code of length 4 and one piece of code of length 8 with no wasted cycles and minimal boot code length.
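One way to read this is to count how many processes execute each line that follows the three spl instructions. The Python sketch below does that under the same simplifications as before: spl lines are reduced to their A-field offsets, one process enters, and every later address is a plain instruction that moves on to the next line. The helper name is hypothetical.

```python
from collections import Counter, deque


def executions_per_address(loader, cycles):
    """Count how many times each address is executed within the given cycles."""
    queue = deque([0])
    executed = Counter()
    for _ in range(cycles):
        pc = queue.popleft()
        executed[pc] += 1
        queue.append(pc + 1)
        if pc < len(loader):
            queue.append(pc + loader[pc])   # spl: the new process goes to the target
    return executed


# A-fields of: spl 1 / spl 1 / spl 2
counts = executions_per_address([1, 1, 2], 200)
for address in range(6):
    print("address", address, "executed", counts[address], "times")
```

The line directly after the spl 2 is executed by four processes and every line after it by eight, so a copy instruction placed on the first line could move a four-line block and one placed on the second line an eight-line block.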

For larger numbers of parallel processes, one could use a kind of vector launch to create them, as the following example shows:

;normal 17 parallel process loader
;4 wasted cycles

 spl  1
 mov -1, 0
 mov -1, 0
 mov -1, 0
 mov -1, 0

;no cycles wasted
;every cycle is used for splitting
;but 4 instructions longer

s0 spl 2
s1 spl s3
s2 spl @v1, }0
s3 spl *v1, }0
s4 spl 1

s5 ;code goes here

v1 dat s4, s3
   dat s4, s4
   dat s4, 0
   dat s5, 0

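The process queue during the run looks like this. Each row shows the queue before one cycle, with the next process to execute on the left, and each number n stands for a process about to execute the line labelled sn: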

; 0
; 1 2
; 2 2 3
; 2 3 3 3
; 3 3 3 3 4
; 3 3 3 4 4 4
; 3 3 4 4 4 4 4
; 3 4 4 4 4 4 4 4
; 4 4 4 4 4 4 4 4 5
; 4 4 4 4 4 4 4 5 5 5
; 4 4 4 4 4 4 5 5 5 5 5
; 4 4 4 4 4 5 5 5 5 5 5 5
; 4 4 4 4 5 5 5 5 5 5 5 5 5
; 4 4 4 5 5 5 5 5 5 5 5 5 5 5
; 4 4 5 5 5 5 5 5 5 5 5 5 5 5 5
; 4 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5
; 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5
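The queue listing above can be reproduced with a very small simulator. The Python sketch below is not a full MARS: it knows only spl, a placeholder instruction and dat, the direct, @ and * modes for the A-operand and }0 for the B-operand. It assumes the A-operand is resolved before the post-increment takes effect, ignores core wrap-around and the opponent, and places a one-line placeholder at s5 so that v1 starts one address later. All Python names are made up for the illustration.

```python
from collections import deque

S0, S1, S2, S3, S4, S5, V1 = range(7)   # addresses; s5 stands in for the warrior body


def build_core():
    """Assemble the loader into [opcode, a_mode, a, b_mode, b] rows (relative fields)."""
    core = [
        ["spl", "$", 2, "$", 0],          # s0 spl 2
        ["spl", "$", S3 - S1, "$", 0],    # s1 spl s3
        ["spl", "@", V1 - S2, "}", 0],    # s2 spl @v1, }0
        ["spl", "*", V1 - S3, "}", 0],    # s3 spl *v1, }0
        ["spl", "$", 1, "$", 0],          # s4 spl 1
        ["nop", "$", 0, "$", 0],          # s5 placeholder for the code
    ]
    # v1 table: label operands become offsets from the dat's own address,
    # literal zeros (None here) stay zero.
    rows = [(S4, S3), (S4, S4), (S4, None), (S5, None)]
    for a_label, b_label in rows:
        here = len(core)
        b = 0 if b_label is None else b_label - here
        core.append(["dat", "$", a_label - here, "$", b])
    return core


def run_vector_launch(cycles=16):
    """Return the process queue before each cycle and after the last one."""
    core = build_core()
    queue = deque([S0])
    trace = [list(queue)]
    for _ in range(cycles):
        pc = queue.popleft()
        op, a_mode, a, b_mode, b = core[pc]
        if op == "spl":
            pointer = pc + a
            if a_mode == "@":
                target = pointer + core[pointer][4]   # follow the B-field
            elif a_mode == "*":
                target = pointer + core[pointer][2]   # follow the A-field
            else:
                target = pointer                      # direct
            if b_mode == "}":
                core[pc + b][2] += 1                  # post-increment the A-field
            queue.append(pc + 1)
            queue.append(target)
        elif op == "nop":
            queue.append(pc + 1)
        # a process that executes a dat is simply dropped
        trace.append(list(queue))
    return trace


for row in run_vector_launch():
    print(";", " ".join(str(pc) for pc in row))
```

Sixteen cycles produce seventeen processes, one new process per cycle, compared with twenty cycles for the spl 1 / mov -1,0 loader.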

It's hard to say which of these two scores better. Just try both and see which one suits your warrior better.

You might also consider booting part of the spl block together with the main body, as is done, for example, in Digitalis 2002. This can be important if you need to boot away from your bulky quickscanning part as fast as possible.

Main Articles

Paper

QScan

Scanner
