NK Competition

Rules and material

We put our agents through a lot, but how would we fare on fitness landscapes ourselves?

We designed an NK Landscape Search Algorithm Competition, with the goal of finding the best search algorithm that the TOM Society can design. The game is simple: provide us with your search algorithm, and we will unleash it on many landscapes and see how it does!

As specifications and assumptions are important, the rules are included below. In short, your algorithm has 100 shots at sampling locations on an N=10, K=3 random-interaction fitness landscape, and we count the best fitness outcome of those 100 shots.

We look forward to receiving your submissions through the form on this page; happy searching!

Jerker, Maciej, and Axel

Rules

More details about the test environment and rules of the competition are described in the rules.

Material

To get you going, and to coordinate your submissions, please build on our Matlab or Python script.

Deadline

The initial submission deadline is January 10, 2025. Then we will announce the preliminary winner. After that, you can keep refining your algorithm for another round, ending before the 2025 TOM Conference. In the meantime, we will showcase approaches and their scores here.

Showcases

CWA / Reciprocal Hamming (michael christensen)

Estimates of function values and uncertainty hereof are maintained by weighing the observations by reciprocal Hamming distance.

The algorithm deterministically picks the point with highest estimate plus UCB-like bonus for uncertainty.

Python code:

# Notes: CWA algorithm by Michael Christensen 20241027

if t==1:

# Initialization of parameters and helpers, preferably put outside both loops:

ucb = 6.0 * np.sqrt((k+1)/6/n/n); # Exploration rate times local variation, ~1/2.

bad_value = np.nextafter(-np.inf,0,dtype=np.float64) # Worst performance possible.

genomeToIndex = lambda g,n : sum([g[i]*(1<<(n-i-1)) for i in range(n)])

indexToGenome = lambda i,n : np.array([(i>>p)&1 for p in range(n-1,-1,-1)])

# Initialization of estimates (by index) for each simulation:

H = np.zeros(2**n,dtype=float) # Reciprocal Hamming weights.

Y = np.zeros(2**n,dtype=float) # Hamming weighed observations.

# Initial sample deterministically:

choice_configuration = np.zeros(n,dtype=int)

else:

# Update H and Y from previous choice:

index = int(genomeToIndex(choices[t-2],n))

perf = performances[t-2]

H[index] = -1 # Minus one indicates observed, use abs for calculations.

Y[index] = perf # Actual performance when observed.

D = np.array([int.bit_count(index^j) for j in range(2**n)],dtype=float) # Hamming distances.

H = (H==-1) * (H) + (H!=-1) * (H+1/(D+(D==0))) # Add 1/D to H.

Y = (H==-1) * (Y) + (H!=-1) * (Y+perf/(D+(D==0)))# Add perf/D to Y.

# Select config deterministically as best Y/H plus variation bonus (UCB):

estimates = (H==-1) * (bad_value) + (H!=-1) * (Y/abs(H)+ucb/np.sqrt(abs(H)))

max_index = np.argmax(estimates)

choice_configuration = indexToGenome(max_index,n)

better credit (nathan gabriel)

This algorithm is inspired by the “credit economy” as it is discussed in social epistemology. Initial searches aim to produce configurations that are significantly different from prior configurations (reflecting researchers being motivated to produce original ideas for which they will be credited). This promotes transient diversity. Then, after this initial phase, local searches are performed on promising configurations. These searches start from a randomly selected configuration chosen from a set of top performers (inspired by Wu’s Better than Best model). When a local search produces a configuration already investigated, I found a measurable improvement by doing a new random selection rather than merely doing a new local search. I believe that, in part, this improvement is due to configurations that have already been frequently investigated being more likely to produce duplicates. I.e., doing a new random selection increases transient diversity in virtue of less investigated configurations being less likely to produce a duplicate and consequently more likely to yield the choice configuration.

Python code:

# fixed search parameters

unique_check = 0

difference_while = 0

difference_while_exitcondition = 0

# tuned search parameters

numrandom_initial_testsA = 17 # technically any increase in this variable should improve fitness

difference_thresholdA = 7

pick_from_top0 = 6

extra_mutation_chances = 50 # technically any increase in this variable should improve fitness

choice_configuration_matrixB = choices[:t, :]

# Initial round random choices and performance

if t==1:

choice_configuration = np.random.randint(0, 2, n)

choices[0] = choice_configuration

t = 1 # Set t to 1 for the first period

choice_configuration_matrixB = choices[:t, :]

elif t<numrandom_initial_testsA:

while difference_while == 0:

choice_configuration = np.random.randint(0, 2, n)

difference_while_exitcondition += 1

if np.max(np.sum(choice_configuration_matrixB[:] == choice_configuration, axis=1)) < difference_thresholdA:

difference_while = 1

elif difference_while_exitcondition > 10*1024: # sorry I know this exit condition is irresponsible ¯\_(ツ)_/¯. It's been a long year

# when this exit condition is met, the algorithm gives up and does a local jump

if pick_from_top0 < t:

pick_from_top = pick_from_top0

else:

pick_from_top = t

ch0 = np.random.randint(0, pick_from_top)

ind = np.argpartition(performances, -pick_from_top)[-pick_from_top:] # the top m performing configurations where m = pick_from_top

betteridx = ind[ch0]

betterch = choices[betteridx,:] # loosely resembling Jingyi Wu's ""Better than Best: Epistemic Landscapes and Diversity of Practice in Science""

# Then, change the choice on one randomly chosen dimension

ch = np.random.randint(0, n) # a random dimension (a random integer between 0 and n-1)

choice_configuration = betterch.copy() # copy the better choice vector

choice_configuration[ch] = 1 - betterch[ch] # Then flip the choice on dimension ch

for retry in range(0, extra_mutation_chances): # retry if the new choice has already been tried

if any((choice_configuration_matrixB[:]==choice_configuration).all(1)):

ch0 = np.random.randint(0, pick_from_top)

betteridx = ind[ch0]

betterch = choices[betteridx,:]

ch = np.random.randint(0, n)

choice_configuration = betterch.copy()

choice_configuration[ch] = 1 - betterch[ch]

difference_while = 1

else:

# local searches almost always

if pick_from_top0 < t:

pick_from_top = pick_from_top0

else:

pick_from_top = t

ch0 = np.random.randint(0, pick_from_top)

ind = np.argpartition(performances, -pick_from_top)[-pick_from_top:]

betteridx = ind[ch0]

betterch = choices[betteridx,:] # loosely resembling Jingyi Wu's ""Better than Best: Epistemic Landscapes and Diversity of Practice in Science""

# Then, change the choice on one randomly chosen dimension

ch = np.random.randint(0, n) # a random dimension (a random integer between 0 and n-1)

choice_configuration = betterch.copy() # copy the better choice vector

choice_configuration[ch] = 1 - betterch[ch] # Then flip the choice on dimension ch

for retry in range(0, extra_mutation_chances): # retry if the new choice has already been tried

if any((choice_configuration_matrixB[:]==choice_configuration).all(1)):

ch0 = np.random.randint(0, pick_from_top)

betteridx = ind[ch0]

betterch = choices[betteridx,:]

ch = np.random.randint(0, n)

choice_configuration = betterch.copy()

choice_configuration[ch] = 1 - betterch[ch]

# at the end you should have a choice_configuration variable that represents the new location

while unique_check == 0: #at the very least you can improve the algorithm by insuring that you dont investigate duplicates

if any((choice_configuration_matrixB[:]==choice_configuration).all(1)):

choice_configuration = np.random.randint(0, 2, n)

else:

unique_check = 1

Chilled greedy climbing (Qiufan Huang)

The algorithm is a refined version of the long-jump and is inspired by the Simulated Annealing algorithm. Its objective is to improve the connection between the local search stage and the global search stage while limiting the search range of the global stage. Instead of a full jump or a one-dimensional flip, the algorithm applies a jump of random steps. These adjustments make the search process more effective compared to the default algorithm.

The detailed steps are as follows:

(0 < t < 5): The algorithm initializes 4 random choices.

(t = 5): It identifies the choice with the highest performance and designates it as the anchor choice. A close search is then conducted around this choice with a step size of 1 or 2.

(5 < t < 45):

If the latest choice performs better than the anchor choice, the latest choice becomes the new anchor, and a close search is performed around it with a step size of 1 or 2.

If the latest choice performs worse than the anchor choice, it may still be accepted as the anchor with a tolerance probability (initially set to 0.7). A close search is then conducted around this inferior choice. Otherwise, the inferior choice is abandoned, and a global search is performed around the anchor choice with a step randomly selected from {3, 4, 5, 6}.

(t > 45): The algorithm selects the top 5 choices from the previous iterations. One of these choices is randomly chosen, and a random flip is applied to 1 or 2 dimensions.

This approach balances local refinement and global exploration, improving efficiency and effectiveness.

Python code:

import random
import pandas as pd

---

# Initial round random choices and performance

patience_init=0.7 # the initial patience degree, the tolerance for a choice with lower performance at the beginning

decrease_rate=0.05 # the decrease rate of patience once we accept a choice with lower performance

df_choi= pd.DataFrame(choices[0:t]) # record the choices before period t

choice_configuration=None #initialize choice_configuration as null to avoid misuse

if t<5:

'''initialize with 4 random choices '''

choice_configuration = np.random.randint(0, 2, n)

if t==5:

''' find the best performace among the four periods and have a close search (1 or 2 steps) '''

max_index = np.argmax(performances) # index of the highest performance

anchor_choice = choices[max_index,:] # the choice vector

j=random.choice([1,2]) # the length of step

counter = 0

while True: #produce a choice of 1 or 2 steps that is not repeated before

counter += 1

if counter>100: # to aviod infinite loop, generate a random choice

choice_configuration=np.random.randint(0, 2, n)

break

ch = np.random.randint(0, n, size=j)

choice_configuration = anchor_choice.copy() # we copy the best choice vector

choice_configuration[ch] = 1 - anchor_choice[ch]

newdf=pd.DataFrame([choice_configuration])

con=pd.concat([df_choi,newdf])

if sum(con.duplicated())>sum(df_choi.duplicated()):

continue

else:

break

if 5<t<45:

'''if the last choice has higher performance than the choice before, we do another close search with a step of 1 or 2

if the last choice has lower performance than the choice before, we may tolerate a lower value with the possibility of patience_degree(start with 0.7)

each time we tolerate, the patinece_degree will go down 0.05.

if we don't tolerate this choice, we go for a a global search with a step randomly choose from (3,4,5,6). '''

if performances[t-2]>performances[t-3]:

anchor_choice=choices[t-2]

j=random.choice([1,2])

counter = 0

while True:

counter += 1

if counter>100:

choice_configuration=np.random.randint(0, 2, n)

break

ch = np.random.randint(0, n, size=j)

choice_configuration = anchor_choice.copy() # we copy the best choice vector

choice_configuration[ch] = 1 - anchor_choice[ch]

newdf=pd.DataFrame([choice_configuration])

con=pd.concat([df_choi,newdf])

if sum(con.duplicated())>sum(df_choi.duplicated()):

continue

else:

break

if performances[t-2]<performances[t-3]:

try:

current_patience= patience_init-patience_num*decrease_rate

except:

current_patience=patience_init

if np.random.rand()<current_patience:

anchor_choice=choices[t-2]

try:

patience_num=patience_num+1

except:

patience_num=1

j=random.choice([1,2,3])

else:

anchor_choice=choices[t-3]

j=random.choice([3,4,5,6])

counter = 0

while True:

counter += 1

if counter>100:

choice_configuration=np.random.randint(0, 2, n)

break

ch = np.random.randint(0, n, size=j)

choice_configuration = anchor_choice.copy() # we copy the best choice vector

choice_configuration[ch] = 1 - anchor_choice[ch]

newdf=pd.DataFrame([choice_configuration])

con=pd.concat([df_choi,newdf])

if sum(con.duplicated())>sum(df_choi.duplicated()):

continue

else:

break

else:#a random greedy climbing process

'''1. find the top five choices with highest performance

2. randomly choose a choice and have close search of 1 or 2 steps

3. exam if this choices is repeated or not. if not, break loop

'''

random_top=np.random.choice(np.argsort(performances)[-5:])

anchor_choice = choices[random_top,:] # the choice vector

j=random.choice([1,2])

counter = 0

while True:

counter += 1

if counter>100:

choice_configuration=np.random.randint(0, 2, n)

break

ch = np.random.randint(0, n, size=j)

choice_configuration = anchor_choice.copy() # we copy the best choice vector

choice_configuration[ch] = 1 - anchor_choice[ch]

newdf=pd.DataFrame([choice_configuration])

con=pd.concat([df_choi,newdf])

if sum(con.duplicated())>sum(df_choi.duplicated()):

continue

else:

break

Genetic algorithms (Luigi marengo & Manuel Romagnoli)

GenAlg and GenAlg 2: Perform genetic algortihm search. These algorithms first generate a random population, then perform one- or two-points crossover for a few rounds and finally a few rounds of one-bit mutations.

Matlab code:

GenAlg:

% Coded by Luigi Marengo and Manuel Romagnoli

% Uses genetic alorithms. It first generates a random population,

% then performs crossover for a few rounds and finally a few rounds

% of one-bit mutations

if t==1

PopSize=50; %size of population of strings

Pop=zeros(PopSize, n);

PopPerf = zeros(PopSize,1);

n_crossover=30; %number of strings created by crossover

stop_crossover=PopSize+n_crossover;

end

if t <= PopSize % uses first iterations to build a random population

choice_configuration = randi(2,1,n)-1;

Pop(t,:)=choice_configuration;

end

if (t>1) && (t<=PopSize+1)

PopPerf(t-1) = performances(t-1);

end

if (PopSize<t) && (t<=stop_crossover) %crossover

cross_point = randi([2 n-1]); % random integer between 2 and n-1 for crossover point

[PerfSort,Indeces]=sort(PopPerf,'descend');

estrazione_casuale = rand;

if estrazione_casuale<0.5

% Here we take the first part of the string from the

% first-best string and the second part from the second-best

% string

for jx=1:cross_point

Pop(Indeces(PopSize),jx)=Pop(Indeces(1),jx);

end

for jx=cross_point+1:n

Pop(Indeces(PopSize),jx)=Pop(Indeces(2),jx);

end

else

% Here it's the other way round

for jx=1:cross_point

Pop(Indeces(PopSize),jx)=Pop(Indeces(2),jx);

end

for jx=cross_point+1:n

Pop(Indeces(PopSize),jx)=Pop(Indeces(1),jx);

end

choice_configuration = Pop(Indeces(PopSize),:);

end

if t>stop_crossover % local hill climbing for the following periods

% First, we identify the choice vector with the highest performance so far

[~, max_index]= max(performances);

bestch = choices(max_index,:);

% Then, we change the choice on one randomly chosen dimension

ch = randi(n); % a random dimension (a random integer from 1 to n)

choice_configuration = bestch; % we copy the best choice vector

choice_configuration(ch) = 1-bestch(ch); % Then we flip the choice on dimension ch

end

GenAlg 2:

% Coded by Luigi Marengo and Manuel Romagnoli

% Uses genetic alorithms. It first generates a random population,

% then performs two-points crossover for a few rounds and finally a few rounds

% of one-bit mutations

if t==1

PopSize=60; %size of population of strings

Pop=zeros(PopSize, n);

PopPerf = zeros(PopSize,1);

n_crossover=10; %number of strings created by crossover

stop_crossover=PopSize+n_crossover;

end

if t <= PopSize % uses first iterations to build a random population

choice_configuration = randi(2,1,n)-1;

Pop(t,:)=choice_configuration;

end

if (t>1) && (t<=PopSize+1)

PopPerf(t-1) = performances(t-1);

end

if (PopSize<t) && (t<=stop_crossover) %crossover

cross_point1 = randi([2 5]); % random integer between 2 and 5 for 1st crossover point

cross_point2 = randi([5 9]); % random integer between 5 and 9 for 2nd crossover point

[PerfSort,Indeces]=sort(PopPerf,'descend');

estrazione_casuale = rand;

if estrazione_casuale<0.5

% Here we take the first and last part of the string from the

% first-best string and the mid part from the second-best

% string

for jx=1:cross_point1

Pop(Indeces(PopSize),jx)=Pop(Indeces(1),jx);

end

for jx=cross_point1+1:cross_point2

Pop(Indeces(PopSize),jx)=Pop(Indeces(2),jx);

end

for jx=cross_point2+1:n

Pop(Indeces(PopSize),jx)=Pop(Indeces(1),jx);

end

else

% Here it's the other way round

for jx=1:cross_point1

Pop(Indeces(PopSize),jx)=Pop(Indeces(2),jx);

end

for jx=cross_point1+1:cross_point2

Pop(Indeces(PopSize),jx)=Pop(Indeces(1),jx);

end

for jx=cross_point2+1:n

Pop(Indeces(PopSize),jx)=Pop(Indeces(2),jx);

end

choice_configuration = Pop(Indeces(PopSize),:);

end

if t>stop_crossover % local hill climbing for the following periods

% First, we identify the choice vector with the highest performance so far

[~, max_index]= max(performances);

bestch = choices(max_index,:);

% Then, we change the choice on one randomly chosen dimension

ch = randi(n); % a random dimension (a random integer from 1 to n)

choice_configuration = bestch; % we copy the best choice vector

choice_configuration(ch) = 1-bestch(ch); % Then we flip the choice on dimension ch

end

Theory-Driven Search (madison singell)

This method uses a theory to search and select strategies. After an initial period of data collection, the DM forms a theory of strategic choice by placing the factors in order of how much their variance explains performance (A->B => Performance). The DM then uses this theory to select new strategies that might be performant given it, periodically updating her theory in response to performance feedback.

The idea I'm trying to think about with this approach is that when the world is so complex that it cannot be approximated by analyses that assume a functional form, perhaps using a non-random sampling method with a known functional form (for example a linear theory!) would allow for effective analysis of previous data in order to find new and performant solutions.

Python code:

First, import the following packages:

import random as random

import pandas as pd

import statsmodels.api as sm

The algorithm:

#Optimized Parameters

#Theory Size

m = n

#Initial Data Collection Amount

time_dat_col = 50

my_theory = list(range(0,n))

#In first two time periods, select the following configurations:

#all factors set to [0] and

if t==1:

choice_configuration = np.array([0]*n)

choices[0] = choice_configuration

t = 1 # Set t to 1 for the first period

#all factors set to [1].

elif t == 2:

choice_configuration = np.array([1]*n)

choices[1] = choice_configuration

#This is done to help generate an initial theory by ensuring there is

#at least one data point for each factor where it is set to 0 vs. set to 1.

#Initial Data Collection for 50 times periods

elif t < time_dat_col:

choice_configuration = np.random.randint(0, 2, n)

#Choose Linear Theory A->B Using Linear Regression Y ~ A + B + A*B on Data

elif t % n == 0:

#NEW THEORY

#Choose the set of M policies that together most account for performance

my_choices = choices[:-(number_of_periods-t-2)].tolist()

perfy = performances[:-(number_of_periods-t-2)]

max_index = np.argmax(performances)

bestch = choices[max_index,:]

my_choice_config = list(bestch)

my_data = {"y" : perfy}

my_m = list(range(0,n))

#Pick an order and Values

my_m_ord = []

my_m_for_rem = my_m.copy()

my_dar = {"y": perfy}

my_model = "y ~ "

for item in range(0,n):

my_dat_x = []

str_name = "x" + str(item)

if item == 0:

my_model = my_model + str_name

else:

my_model = my_model + " + "+ str_name

for chc in range(len(my_choices)):

my_dat_x.append(my_choices[chc][item])

my_dar[str_name] = my_dat_x

my_data_frame = pd.DataFrame(my_dar)

while len(my_m_ord) < len(my_m):

my_rr = []

my_coef = []

if len(my_m_ord) == 0:

my_model = "y ~ "

else:

if my_model != "y ~ ":

my_model = my_model + " + "

str_name_1 = "x" + str(my_m_ord[-1])

my_model = my_model + str_name_1

if len(my_m_ord) > 1:

str_name_2 = "x" + str(my_m_ord[-2])

my_model = my_model + " + " + str_name_1 + ":" + str_name_2

for row in range(len(my_m_for_rem)):

my_model_test = my_model

if my_model != "y ~ ":

my_model_test = my_model + " + "

str_name = "x" + str(my_m_for_rem[row])

my_model_test = my_model_test + str_name

if len(my_m_ord) > 1:

str_name_2 = "x" + str(my_m_ord[-1])

my_model_test = my_model_test + " + " + str_name + ":" + str_name_2

my_ols_pred = sm.OLS.from_formula(my_model_test, data=my_data_frame).fit()

my_rr.append(my_ols_pred.rsquared)

my_coef.append(my_ols_pred.params[str_name])

max_m = my_rr.index(max(my_rr))

my_max_coef = my_coef[max_m]

#Use Linear Regression to Select Configuration

if my_max_coef > 0:

my_choice_config[my_m_for_rem[max_m]] = 1

else:

my_choice_config[my_m_for_rem[max_m]] = 0

my_last = my_m_ord + [my_m_for_rem[max_m]]

my_m_ord.append(my_m_for_rem[max_m])

my_m_for_rem.remove(my_m_for_rem[max_m])

choice_configuration = np.array(my_choice_config)

not_in_m = list(set(list(range(n)))-set(my_m_ord))

my_theory = my_m_ord.copy()

#Use Theory to Select Next Strategy

else:

my_choices = choices.tolist()[:t-2]

perfy = performances.tolist()[:t-2]

max_index = np.argmax(performances)

bestch = choices[max_index,:]

my_choice_config = list(bestch)

#RANDOMLY CHOOSE ONE ITEM IN THEORY TO CHANGE

#ITS CONSEQUENTS BASED ON DATA

my_rando = np.random.choice(my_theory)

my_index = my_theory.index(my_rando)

to_change = my_theory[my_index:]

my_choice_config[to_change[0]] = 1-my_choice_config[to_change[0]]

val_to_filt = my_choice_config[to_change[0]]

for item in range(len(to_change)-1):

my_test = to_change[item+1]

my_filter = to_change[item]

my_model = "y ~ "

my_data = {}

str_name = "x" + str(my_test)

my_model = my_model + str_name

my_dat_x = []

my_dat_y = []

for chc in range(len(my_choices)):

if my_choices[chc][my_filter] == val_to_filt:

my_dat_x.append(my_choices[chc][my_test])

my_dat_y.append(perfy[chc])

my_data[str_name] = my_dat_x

my_data["y"] = my_dat_y

my_data_frame = pd.DataFrame(my_data)

my_ols_pred = sm.OLS.from_formula(my_model, data=my_data_frame).fit()

my_coef = list(my_ols_pred.params)[-1]

if my_coef > 0:

my_choice_config[my_test] = 1

else:

my_choice_config[my_test] = 0

val_to_filt = my_choice_config[my_test]

choice_configuration = np.array(my_choice_config)

#If choice already chosen local learn

if any((choices[:]==choice_configuration).all(1)):

#Local learn

max_index = np.argmax(performances)

bestch = choices[max_index,:]

choice_configuration = bestch.copy()

ch = random.choice(range(0,n))

choice_configuration[ch] = 1- choice_configuration[ch]

Chill Jumps and Zero Revisits (chengwei liu)

This approach starts with a 0.65 probability of random jumps that cools down to 0.05.

Each time a random jump or local bit flip leads to a previously visited configuration, the code retries to maintain uniqueness.

Matlab code:

initial_T = 0.65;

final_T = 0.05;

if number_of_periods > 1

alpha = (final_T/initial_T)^(1/(number_of_periods-1));

problongjump = initial_T * alpha^(t-1);

else

problongjump = initial_T;

end

all_choices_so_far = choices(1:t-1,:);

attempts = 100;

visited = true;

check_visited = @(cand) any(all(all_choices_so_far == cand,2));

if t == 1

candidate = randi([0,1],1,n);

if check_visited(candidate)

found_unvisited = false;

for i = 1:attempts

cand_try = randi([0,1],1,n);

if ~check_visited(cand_try)

candidate = cand_try;

found_unvisited = true;

break;

end

if ~found_unvisited

found_unvisited = false;

for x = 0:(2^n - 1)

cand = bitget(x,1:n);

cand = cand(end:-1:1);

if ~check_visited(cand)

candidate = cand;

found_unvisited = true;

break;

end

if ~found_unvisited

candidate = randi([0,1],1,n);

end

choice_configuration = candidate;

else

if rand < problongjump

candidate = randi([0,1],1,n);

if check_visited(candidate)

found_unvisited = false;

for i = 1:attempts

cand_try = randi([0,1],1,n);

if ~check_visited(cand_try)

candidate = cand_try;

found_unvisited = true;

break;

end

if ~found_unvisited

found_unvisited = false;

for x = 0:(2^n - 1)

cand = bitget(x,1:n);

cand = cand(end:-1:1);

if ~check_visited(cand)

candidate = cand;

found_unvisited = true;

break;

end

if ~found_unvisited

candidate = randi([0,1],1,n);

end

choice_configuration = candidate;

else

[~, max_index] = max(performances);

bestch = choices(max_index,:);

ch = randi(n);

candidate = bestch;

candidate(ch) = 1 - bestch(ch);

if check_visited(candidate)

found_unvisited = false;

for i = 1:attempts

cand_try = randi([0,1],1,n);

if ~check_visited(cand_try)

candidate = cand_try;

found_unvisited = true;

break;

end

if ~found_unvisited

found_unvisited = false;

for x = 0:(2^n - 1)

cand = bitget(x,1:n);

cand = cand(end:-1:1);

if ~check_visited(cand)

candidate = cand;

found_unvisited = true;

break;

end

if ~found_unvisited

candidate = randi([0,1],1,n);

end

choice_configuration = candidate;

end

search with logic rules (Stefano lipira)

The algorithm begins with an exploration phase, where random binary choices are generated and evaluated over multiple periods. During the initial period (50 periods), random configurations are selected. In subsequent periods, the algorithm either explores new random configurations or refines existing ones using a combination of logic functions (NAND). A probabilistic mechanism is used to decide between long jumps, where the best-performing configuration is modified by combining it with a random vector, or a local hill-climbing strategy, where one dimension of the best-performing configuration is randomly altered to potentially improve the overall performance.

Python code:

''' ===================================================================

THE SEARCH ALGORITHM: This is the only part you can edit

===================================================================='''

# Define logic functions

def logic_imply(a, b):

return np.logical_or(np.logical_not(a), b).astype(int)

def logic_Notimply(a, b):

return np.logical_or(a, np.logical_not(b)).astype(int)

def logic_xor(a, b):

return np.logical_xor(a, b).astype(int)

def logic_and(a, b):

return np.logical_and(a, b).astype(int)

def logic_or(a, b):

return np.logical_or(a, b).astype(int)

def logic_nand(a, b):

return np.logical_not(np.logical_and(a, b)).astype(int)

def logic_nor(a, b):

return np.logical_not(np.logical_or(a, b)).astype(int)

def logic_xnor(a, b):

return np.logical_not(np.logical_xor(a, b)).astype(int)

def logic_xor_and(a, b):

xor = np.logical_xor(a, b)

or_ = np.logical_or(a, b)

return np.logical_and(xor, or_).astype(int)

def logic_xor_nand(a, b):

xor = np.logical_xor(a, b)

nand = np.logical_not(np.logical_and(a, b))

return np

def logic_xor_nor(a, b):

xor = np.logical_xor(a, b)

nor = np.logical_not(np.logical_or(a, b))

return np

def logic_xor_xnor(a, b):

xor = np.logical_xor(a, b)

xnor = np.logical_not(np.logical_xor(a, b))

return np

def logic_imply_and_not(a, b):

imply = np.logical_or(np.logical_not(a), b)

and_ = np.logical_and(a, b)

return np.logical_and(imply, np.logical_not(and_)).astype(int)

current_logic_function1 = logic_nand

current_logic_function2 = logic_nand

# Initial round random choices and performance

exploration_time=50

if t==1:

choice_configuration = np.random.randint(0, 2, n)

choices[0] = choice_configuration

t = 1 # Set t to 1 for the first period

elif t<=exploration_time:

# # Use the best-performing choice so far from past periods

# max_index = np.argmax(performances[:t - 1])

# best_choice = choices[max_index, :]

# # Choose a random second vector for logic combination

# second_choice = np.random.randint(0, 2, n)

# # Apply the current logic function

# choice_configuration = current_logic_function1(best_choice, second_choice)

choice_configuration = np.random.randint(0, 2, n)

choices[0] = choice_configuration

else:

problongjump = 0.4

if np.random.rand()<problongjump:

# Use the best-performing choice so far from past periods

max_index = np.argmax(performances[:t - 1])

best_choice = choices[max_index, :]

# Choose a random second vector for logic combination

second_choice = np.random.randint(0, 2, n)

# Apply the current logic function

choice_configuration = current_logic_function1(best_choice, second_choice)

# choice_configuration = np.random.randint(0, 2, n)

else: # local hill climbing

# First, we identify the choice vector with the highest performance so far

max_index = np.argmax(performances) # index of the highest performance

bestch = choices[max_index,:] # the choice vector

# Then, we change the choice on one randomly chosen dimension

ch = np.random.randint(0, n) # a random dimension (a random integer between 0 and n-1)

choice_configuration = bestch.copy() # we copy the best choice vector

choice_configuration[ch] = 1 - bestch[ch] # Then we flip the choice on dimension ch

''' ===================================================================

THE SEARCH ALGORITHM ENDS HERE

===================================================================='''

bayesian optimization (jerker denrell)

Bayesian optimization:

We use "Bayesian Optimization"; a search technique for finding the optimum for complicated functions we have incomplete knowledge about.

We model the 2^10 = 1024 choice configurations as having a multivariate normal distribution with covariances that depend on the hamming distance between choices. Whenever a new performance is observed, we update the mean vector and the covariance matrix for the unobserved choices, using well-known formulas for the multivariate normal distribution. This keeps track of how promising unseen choices are and how uncertain we are about their promise.

We calculate the "expected improvement" for each unseen choice: ei_i=E[Max(p_i,ma)]−ma, where ma is the maximum performance observed so far, and the expectation is over the conditional distribution of the performance of unseen alternative i.

We choose the alternative with the highest expected improvement. Expected improvement is larger for alternatives with higher means and for alternatives we are more uncertain about.

If we ignore uncertainty and always choose the configuration with the highest conditional mean, we get a lower average score (about 0.755).

Matlab code:

if t == 1 %

choice_configuration = zeros(1,n); %choose all zeros in first period

nobin = 2^n; % Number of binary vectors

binaryVectors = dec2bin(0:nobin-1) - '0'; % Generate all binary vectors

%list of the indices of the chosen configurations

observed_indices = [1]; %1 = all zeros

unobserved_indices = 2:1:nobin;

%the covariances for hamming distances 0 to 10

cohdist = [833, 500, 278, 139, 60, 20, 4, 0, 0, 0, 0] / 100000;

%the covariance between the performances of two choices at hamming distance d from

%each other is (1/(n^2)) * (1/12) * (n - d(i)) *(nchoosek(n-1-d(i), k) / nchoosek(n-1, k))

% if n-1-d(i) >=k and 0 otherwise

% Compute pairwise Hamming distances

hd = pdist2(binaryVectors, binaryVectors, 'hamming') * size(binaryVectors, 2); % Scale by vector length

% covariances depend on hamming distances

C = cohdist(hd + 1); % Add 1 because MATLAB indexing starts at 1

% Initial mean vector (expected performance is 0.5)

mu = 0.5+zeros(nobin, 1);

% functions needed to compute EI

phi = @(z) exp(-0.5 * z.^2) / sqrt(2 * pi); % Standard normal PDF

Phi = @(z) 0.5 * (1 + erf(z / sqrt(2))); % Standard normal CDF

else %if t > 1

% Partition the covariance matrix (needed to update the mean

% and covariances)

Coo = C(observed_indices, observed_indices);

Cou = C(observed_indices, unobserved_indices);

Cuo = Cou'; % Cou and Cuo are transposes

Cuu = C(unobserved_indices, unobserved_indices);

% Compute the conditional means and covariances

y_obs = performances(1:t-1,1); %the performances so far

cm = mu(unobserved_indices) + Cuo / Coo * (y_obs - mu(observed_indices)); %updated means

Sigma_cond = Cuu - Cuo / Coo * Cou; %updated covariance matrix

cv = diag(Sigma_cond); %the variances

%Compute expected improvement

relstdev = sqrt(max(cv, 0.0001)); % computes stdev, ensures all values are positive

ym = max(y_obs); %maximum performance so far

z = (cm - ym) ./ relstdev;

ei = (cm - ym) .* Phi(z) + relstdev .* phi(z); % EI computation

%choose configuration

ve = find(ei==max(ei)); %find elements with maximal ei

id = ve(randi(length(ve))); %pick one randomly

idbest = unobserved_indices(id); %the index of the chosen configuration

observed_indices = [observed_indices, idbest]; %add to observed

unobserved_indices(unobserved_indices == idbest) = []; %delete from unobserved

choice_configuration = binaryVectors(idbest ,:);

end

BAndit (Axel zeijen)

Bandit v1.1

A simple bandit strategy (learn with some learning rate, and select with a softmax).

When it observes a fitness value, it updates beliefs about nearby locations, with some decay over hamming distance.

It uses four main parameters (and these can most likely be optimized further):

1: explore_periods: the first x periods are just exploration (from a pre-defined set of evenly spaced-out configurations)

2: learning rate alpha, here a fixed parameter

3: distancedecay (how much to discount distant observations)

4: softmax temperature, here a declining function of time.

This version iteratively updates beliefs, not sure why this should be good in theory, but it seems to work.

The code works with numbers (000 = location 1, 100 = location 2, etc.) rather than binary vectors (and converts them back at the end of each period). Just a personal preference.

Matlab code:

if t == 1

% set up things in the initial round

% --- set up helper tools, this can go to the top of the script ---

% create a matrix that captures the hamming distance between locations

bitset = dec2bin(0:1:2^n-1)-'0';

HD = zeros(2^n);

for i = 1:2^n, for j = 1:2^n, HD(i,j) = sum(abs(bitset(i,:) - bitset(j,:))); end; end

% the first tries are spaced out over the landscape. Can probably be improved.

first_tries = [1;1024;32;228;365;693;850;907;118;123;174;207;218;296;314;332;343;404;413;454;497;556;584;605;723;745;782;819;930;47;52;78;136;155;213;278;345;354;394;419;535;538;550;587;625;653;706;743;772;809;837;913;4;6;7;10;11;13;18;19;21;25;34;35;37;41;49;55;61;66;67;69;73;81;84;95;97;103;106;112;130;131;133;137;140;145;150;151;161;167;171;178;179;184;185;188;191;193;216;229];

% --- global search parameters, can also go to the top of the script

alpha = 0.625; % learning rate

distancedecay = 2.15; % hamming distance discount to new info

explore_periods = 30; % first x periods, just search

T_initial = 0.0065; % also, T declines over time, see below

% set up initial things for the simulation, this needs to stay here

beliefs = ones(2^n,1) / 2; % initial beliefs all 0.5

choices_numeric = zeros(number_of_periods,1); % because I prefer to work with numbers rather than bit strings

end

% Step 1: update beliefs (in all but the first period)

if t > 1

lastchoice = choices_numeric(t-1);

lastscore = performances(t-1);

distances = HD(lastchoice,:); % This says: how far was the last configuration from all other configurations?

distances(distances==0) = 1; % To not divide by 0. Not elegant, but we won't repeat configurations anyway.

% beliefs about every location are updated with some weight:

% general learning parameter alpha matters, but also the distance to the observed location.

weights = alpha ./ distances .^ distancedecay;

% update beliefs (and fix the array direction back to n x 1)

beliefs = (1 - weights) .* beliefs' + weights .* lastscore;

beliefs = beliefs';

end

% Step 2: choose a configuation

if t <= explore_periods

choice = first_tries(t); % pick numbers from the list above in the first X periods

else

% this line ensures we don't repeat past configurations

beliefs(choices_numeric(choices_numeric>0)) = 0;

% Then, do the softmax thing

T = T_initial / log(1+t); % temperature declines over time

choice = find(cumsum(exp(beliefs/T) / sum(exp(beliefs/T))) >= rand, 1, 'first');

end

choices_numeric(t) = choice; % to save for myself

choice_configuration = bitget(choice-1, 1:n);

Random then greedy (Jerker Denrell)

40 initial random choices and then local search based on a randomly selected top 5 alternative

Matlab code:

intper = 40;

if t <= intper % Random choice phase

choice_configuration = binornd(1, 0.5, 1, n);

end

if t > intper

% best choices so far

[sortedArray, sortedIndexes] = sort(performances(:,1), 'descend'); % Sort array in descending order

top_choices = sortedIndexes(:,1); % Get indexes of the best values

%pick a random choice among top 5

ra = randi(5);

%then do a local search from this

bestch = choices(top_choices(ra),:);

ch = randi(n); % a random dimension (a random integer from 1 to n)

choice_configuration = bestch; % we copy the best choice vector

choice_configuration(ch) = 1-bestch(ch); % Then we flip the choice on dimension ch

end

Expl46 (Luigi Marengo & Manuel Romagnoli)

It begins with an exploration phase of random choices, then it takes the best string and performs some rounds of 3-bit mutation, then some rounds of 2-bit mutation and finally of 1-bit mutations.

Matlab code:

% Coded by Luigi Marengo and Manuel Romagnoli

% It begins with an exploration phase of random choice, then it

% takes the best string and performs a round of 3-bit mutation,

% then a round of 2-bit mutation and finally 1-bit mutations

ExplorePhase=46;

Threemut=20;

Twomut=20;

if t <= ExplorePhase % make a random choice for the first ExplorePhase periods

choice_configuration = randi(2,1,n)-1;

end

if (t>ExplorePhase) && (t<=ExplorePhase+Threemut) %3-bit mutations

% First identify the choice vector with the highest performance so far

[~, max_index]= max(performances);

bestch = choices(max_index,:);

% Then change three randomly chosen bits

ch1 = randi(n);

ch2 = randi(n);

ch3 = randi(n);

choice_configuration = bestch; % copy the best choice vector

choice_configuration(ch1) = 1-bestch(ch1); % Then we flip the choice on dimension ch1

choice_configuration(ch2) = 1-bestch(ch2); % Then we flip the choice on dimension ch2

choice_configuration(ch3) = 1-bestch(ch3); % Then we flip the choice on dimension ch3

end

if (t>ExplorePhase+Threemut) && (t<=ExplorePhase+Threemut+Twomut) %2-bit mutations

% First identify the choice vector with the highest performance so far

[~, max_index]= max(performances);

bestch = choices(max_index,:);

% Then change two randomly chosen bits

ch1 = randi(n);

ch2 = randi(n);

choice_configuration = bestch; % we copy the best choice vector

choice_configuration(ch1) = 1-bestch(ch1); % Then we flip the choice on dimension ch1

choice_configuration(ch2) = 1-bestch(ch2); % Then we flip the choice on dimension ch2

end

if t>ExplorePhase+Threemut+Twomut %1-bit mutations

% First, we identify the choice vector with the highest performance so far

[~, max_index]= max(performances);

bestch = choices(max_index,:);

% Then change one randomly chosen bit

ch1 = randi(n);

choice_configuration = bestch; % copy the best choice vector

choice_configuration(ch1) = 1-bestch(ch1); % Then we flip the choice on dimension ch1

end