We put our agents through a lot, but how would we fare on fitness landscapes ourselves?
We designed an NK Landscape Search Algorithm Competition, with the goal of finding the best search algorithm that the TOM Society can design. The game is simple: provide us with your search algorithm, and we will unleash it on many landscapes and see how it does!
As specifications and assumptions are important, the rules are included below. In short, your algorithm has 100 shots at sampling locations on an N=10, K=3 random-interaction fitness landscape, and we count the best fitness outcome of those 100 shots.
We look forward to receiving your submissions through the form on this page; happy searching!
Jerker, Maciej, and Axel
More details about the test environment and rules of the competition are described in the rules.
To get you going, and to coordinate your submissions, please build on our Matlab or Python script.
The initial submission deadline is January 10, 2025. Then we will announce the preliminary winner. After that, you can keep refining your algorithm for another round, ending before the 2025 TOM Conference. In the meantime, we will showcase approaches and their scores here.
Estimates of function values and uncertainty hereof are maintained by weighing the observations by reciprocal Hamming distance.
The algorithm deterministically picks the point with highest estimate plus UCB-like bonus for uncertainty.
Python code:
# Notes: CWA algorithm by Michael Christensen 20241027
if t==1:
# Initialization of parameters and helpers, preferably put outside both loops:
ucb = 6.0 * np.sqrt((k+1)/6/n/n); # Exploration rate times local variation, ~1/2.
bad_value = np.nextafter(-np.inf,0,dtype=np.float64) # Worst performance possible.
genomeToIndex = lambda g,n : sum([g[i]*(1<<(n-i-1)) for i in range(n)])
indexToGenome = lambda i,n : np.array([(i>>p)&1 for p in range(n-1,-1,-1)])
# Initialization of estimates (by index) for each simulation:
H = np.zeros(2**n,dtype=float) # Reciprocal Hamming weights.
Y = np.zeros(2**n,dtype=float) # Hamming weighed observations.
# Initial sample deterministically:
choice_configuration = np.zeros(n,dtype=int)
else:
# Update H and Y from previous choice:
index = int(genomeToIndex(choices[t-2],n))
perf = performances[t-2]
H[index] = -1 # Minus one indicates observed, use abs for calculations.
Y[index] = perf # Actual performance when observed.
D = np.array([int.bit_count(index^j) for j in range(2**n)],dtype=float) # Hamming distances.
H = (H==-1) * (H) + (H!=-1) * (H+1/(D+(D==0))) # Add 1/D to H.
Y = (H==-1) * (Y) + (H!=-1) * (Y+perf/(D+(D==0)))# Add perf/D to Y.
# Select config deterministically as best Y/H plus variation bonus (UCB):
estimates = (H==-1) * (bad_value) + (H!=-1) * (Y/abs(H)+ucb/np.sqrt(abs(H)))
max_index = np.argmax(estimates)
choice_configuration = indexToGenome(max_index,n)
This algorithm is inspired by the “credit economy” as it is discussed in social epistemology. Initial searches aim to produce configurations that are significantly different from prior configurations (reflecting researchers being motivated to produce original ideas for which they will be credited). This promotes transient diversity. Then, after this initial phase, local searches are performed on promising configurations. These searches start from a randomly selected configuration chosen from a set of top performers (inspired by Wu’s Better than Best model). When a local search produces a configuration already investigated, I found a measurable improvement by doing a new random selection rather than merely doing a new local search. I believe that, in part, this improvement is due to configurations that have already been frequently investigated being more likely to produce duplicates. I.e., doing a new random selection increases transient diversity in virtue of less investigated configurations being less likely to produce a duplicate and consequently more likely to yield the choice configuration.
Python code:
# fixed search parameters
unique_check = 0
difference_while = 0
difference_while_exitcondition = 0
# tuned search parameters
numrandom_initial_testsA = 17 # technically any increase in this variable should improve fitness
difference_thresholdA = 7
pick_from_top0 = 6
extra_mutation_chances = 50 # technically any increase in this variable should improve fitness
choice_configuration_matrixB = choices[:t, :]
# Initial round random choices and performance
if t==1:
choice_configuration = np.random.randint(0, 2, n)
choices[0] = choice_configuration
t = 1 # Set t to 1 for the first period
choice_configuration_matrixB = choices[:t, :]
elif t<numrandom_initial_testsA:
while difference_while == 0:
choice_configuration = np.random.randint(0, 2, n)
difference_while_exitcondition += 1
if np.max(np.sum(choice_configuration_matrixB[:] == choice_configuration, axis=1)) < difference_thresholdA:
difference_while = 1
elif difference_while_exitcondition > 10*1024: # sorry I know this exit condition is irresponsible ¯\_(ツ)_/¯. It's been a long year
# when this exit condition is met, the algorithm gives up and does a local jump
if pick_from_top0 < t:
pick_from_top = pick_from_top0
else:
pick_from_top = t
ch0 = np.random.randint(0, pick_from_top)
ind = np.argpartition(performances, -pick_from_top)[-pick_from_top:] # the top m performing configurations where m = pick_from_top
betteridx = ind[ch0]
betterch = choices[betteridx,:] # loosely resembling Jingyi Wu's ""Better than Best: Epistemic Landscapes and Diversity of Practice in Science""
# Then, change the choice on one randomly chosen dimension
ch = np.random.randint(0, n) # a random dimension (a random integer between 0 and n-1)
choice_configuration = betterch.copy() # copy the better choice vector
choice_configuration[ch] = 1 - betterch[ch] # Then flip the choice on dimension ch
for retry in range(0, extra_mutation_chances): # retry if the new choice has already been tried
if any((choice_configuration_matrixB[:]==choice_configuration).all(1)):
ch0 = np.random.randint(0, pick_from_top)
betteridx = ind[ch0]
betterch = choices[betteridx,:]
ch = np.random.randint(0, n)
choice_configuration = betterch.copy()
choice_configuration[ch] = 1 - betterch[ch]
difference_while = 1
else:
# local searches almost always
if pick_from_top0 < t:
pick_from_top = pick_from_top0
else:
pick_from_top = t
ch0 = np.random.randint(0, pick_from_top)
ind = np.argpartition(performances, -pick_from_top)[-pick_from_top:]
betteridx = ind[ch0]
betterch = choices[betteridx,:] # loosely resembling Jingyi Wu's ""Better than Best: Epistemic Landscapes and Diversity of Practice in Science""
# Then, change the choice on one randomly chosen dimension
ch = np.random.randint(0, n) # a random dimension (a random integer between 0 and n-1)
choice_configuration = betterch.copy() # copy the better choice vector
choice_configuration[ch] = 1 - betterch[ch] # Then flip the choice on dimension ch
for retry in range(0, extra_mutation_chances): # retry if the new choice has already been tried
if any((choice_configuration_matrixB[:]==choice_configuration).all(1)):
ch0 = np.random.randint(0, pick_from_top)
betteridx = ind[ch0]
betterch = choices[betteridx,:]
ch = np.random.randint(0, n)
choice_configuration = betterch.copy()
choice_configuration[ch] = 1 - betterch[ch]
# at the end you should have a choice_configuration variable that represents the new location
while unique_check == 0: #at the very least you can improve the algorithm by insuring that you dont investigate duplicates
if any((choice_configuration_matrixB[:]==choice_configuration).all(1)):
choice_configuration = np.random.randint(0, 2, n)
else:
unique_check = 1
The algorithm is a refined version of the long-jump and is inspired by the Simulated Annealing algorithm. Its objective is to improve the connection between the local search stage and the global search stage while limiting the search range of the global stage. Instead of a full jump or a one-dimensional flip, the algorithm applies a jump of random steps. These adjustments make the search process more effective compared to the default algorithm.
The detailed steps are as follows:
(0 < t < 5): The algorithm initializes 4 random choices.
(t = 5): It identifies the choice with the highest performance and designates it as the anchor choice. A close search is then conducted around this choice with a step size of 1 or 2.
(5 < t < 45):
If the latest choice performs better than the anchor choice, the latest choice becomes the new anchor, and a close search is performed around it with a step size of 1 or 2.
If the latest choice performs worse than the anchor choice, it may still be accepted as the anchor with a tolerance probability (initially set to 0.7). A close search is then conducted around this inferior choice. Otherwise, the inferior choice is abandoned, and a global search is performed around the anchor choice with a step randomly selected from {3, 4, 5, 6}.
(t > 45): The algorithm selects the top 5 choices from the previous iterations. One of these choices is randomly chosen, and a random flip is applied to 1 or 2 dimensions.
This approach balances local refinement and global exploration, improving efficiency and effectiveness.
Python code:
import random
import pandas as pd
---
# Initial round random choices and performance
patience_init=0.7 # the initial patience degree, the tolerance for a choice with lower performance at the beginning
decrease_rate=0.05 # the decrease rate of patience once we accept a choice with lower performance
df_choi= pd.DataFrame(choices[0:t]) # record the choices before period t
choice_configuration=None #initialize choice_configuration as null to avoid misuse
if t<5:
'''initialize with 4 random choices '''
choice_configuration = np.random.randint(0, 2, n)
if t==5:
''' find the best performace among the four periods and have a close search (1 or 2 steps) '''
max_index = np.argmax(performances) # index of the highest performance
anchor_choice = choices[max_index,:] # the choice vector
j=random.choice([1,2]) # the length of step
counter = 0
while True: #produce a choice of 1 or 2 steps that is not repeated before
counter += 1
if counter>100: # to aviod infinite loop, generate a random choice
choice_configuration=np.random.randint(0, 2, n)
break
ch = np.random.randint(0, n, size=j)
choice_configuration = anchor_choice.copy() # we copy the best choice vector
choice_configuration[ch] = 1 - anchor_choice[ch]
newdf=pd.DataFrame([choice_configuration])
con=pd.concat([df_choi,newdf])
if sum(con.duplicated())>sum(df_choi.duplicated()):
continue
else:
break
if 5<t<45:
'''if the last choice has higher performance than the choice before, we do another close search with a step of 1 or 2
if the last choice has lower performance than the choice before, we may tolerate a lower value with the possibility of patience_degree(start with 0.7)
each time we tolerate, the patinece_degree will go down 0.05.
if we don't tolerate this choice, we go for a a global search with a step randomly choose from (3,4,5,6). '''
if performances[t-2]>performances[t-3]:
anchor_choice=choices[t-2]
j=random.choice([1,2])
counter = 0
while True:
counter += 1
if counter>100:
choice_configuration=np.random.randint(0, 2, n)
break
ch = np.random.randint(0, n, size=j)
choice_configuration = anchor_choice.copy() # we copy the best choice vector
choice_configuration[ch] = 1 - anchor_choice[ch]
newdf=pd.DataFrame([choice_configuration])
con=pd.concat([df_choi,newdf])
if sum(con.duplicated())>sum(df_choi.duplicated()):
continue
else:
break
if performances[t-2]<performances[t-3]:
try:
current_patience= patience_init-patience_num*decrease_rate
except:
current_patience=patience_init
if np.random.rand()<current_patience:
anchor_choice=choices[t-2]
try:
patience_num=patience_num+1
except:
patience_num=1
j=random.choice([1,2,3])
else:
anchor_choice=choices[t-3]
j=random.choice([3,4,5,6])
counter = 0
while True:
counter += 1
if counter>100:
choice_configuration=np.random.randint(0, 2, n)
break
ch = np.random.randint(0, n, size=j)
choice_configuration = anchor_choice.copy() # we copy the best choice vector
choice_configuration[ch] = 1 - anchor_choice[ch]
newdf=pd.DataFrame([choice_configuration])
con=pd.concat([df_choi,newdf])
if sum(con.duplicated())>sum(df_choi.duplicated()):
continue
else:
break
else:#a random greedy climbing process
'''1. find the top five choices with highest performance
2. randomly choose a choice and have close search of 1 or 2 steps
3. exam if this choices is repeated or not. if not, break loop
'''
random_top=np.random.choice(np.argsort(performances)[-5:])
anchor_choice = choices[random_top,:] # the choice vector
j=random.choice([1,2])
counter = 0
while True:
counter += 1
if counter>100:
choice_configuration=np.random.randint(0, 2, n)
break
ch = np.random.randint(0, n, size=j)
choice_configuration = anchor_choice.copy() # we copy the best choice vector
choice_configuration[ch] = 1 - anchor_choice[ch]
newdf=pd.DataFrame([choice_configuration])
con=pd.concat([df_choi,newdf])
if sum(con.duplicated())>sum(df_choi.duplicated()):
continue
else:
break
GenAlg and GenAlg 2: Perform genetic algortihm search. These algorithms first generate a random population, then perform one- or two-points crossover for a few rounds and finally a few rounds of one-bit mutations.
Matlab code:
GenAlg:
% Coded by Luigi Marengo and Manuel Romagnoli
% Uses genetic alorithms. It first generates a random population,
% then performs crossover for a few rounds and finally a few rounds
% of one-bit mutations
if t==1
PopSize=50; %size of population of strings
Pop=zeros(PopSize, n);
PopPerf = zeros(PopSize,1);
n_crossover=30; %number of strings created by crossover
stop_crossover=PopSize+n_crossover;
end
if t <= PopSize % uses first iterations to build a random population
choice_configuration = randi(2,1,n)-1;
Pop(t,:)=choice_configuration;
end
if (t>1) && (t<=PopSize+1)
PopPerf(t-1) = performances(t-1);
end
if (PopSize<t) && (t<=stop_crossover) %crossover
cross_point = randi([2 n-1]); % random integer between 2 and n-1 for crossover point
[PerfSort,Indeces]=sort(PopPerf,'descend');
estrazione_casuale = rand;
if estrazione_casuale<0.5
% Here we take the first part of the string from the
% first-best string and the second part from the second-best
% string
for jx=1:cross_point
Pop(Indeces(PopSize),jx)=Pop(Indeces(1),jx);
end
for jx=cross_point+1:n
Pop(Indeces(PopSize),jx)=Pop(Indeces(2),jx);
end
else
% Here it's the other way round
for jx=1:cross_point
Pop(Indeces(PopSize),jx)=Pop(Indeces(2),jx);
end
for jx=cross_point+1:n
Pop(Indeces(PopSize),jx)=Pop(Indeces(1),jx);
end
end
choice_configuration = Pop(Indeces(PopSize),:);
end
if t>stop_crossover % local hill climbing for the following periods
% First, we identify the choice vector with the highest performance so far
[~, max_index]= max(performances);
bestch = choices(max_index,:);
% Then, we change the choice on one randomly chosen dimension
ch = randi(n); % a random dimension (a random integer from 1 to n)
choice_configuration = bestch; % we copy the best choice vector
choice_configuration(ch) = 1-bestch(ch); % Then we flip the choice on dimension ch
end
GenAlg 2:
% Coded by Luigi Marengo and Manuel Romagnoli
% Uses genetic alorithms. It first generates a random population,
% then performs two-points crossover for a few rounds and finally a few rounds
% of one-bit mutations
if t==1
PopSize=60; %size of population of strings
Pop=zeros(PopSize, n);
PopPerf = zeros(PopSize,1);
n_crossover=10; %number of strings created by crossover
stop_crossover=PopSize+n_crossover;
end
if t <= PopSize % uses first iterations to build a random population
choice_configuration = randi(2,1,n)-1;
Pop(t,:)=choice_configuration;
end
if (t>1) && (t<=PopSize+1)
PopPerf(t-1) = performances(t-1);
end
if (PopSize<t) && (t<=stop_crossover) %crossover
cross_point1 = randi([2 5]); % random integer between 2 and 5 for 1st crossover point
cross_point2 = randi([5 9]); % random integer between 5 and 9 for 2nd crossover point
[PerfSort,Indeces]=sort(PopPerf,'descend');
estrazione_casuale = rand;
if estrazione_casuale<0.5
% Here we take the first and last part of the string from the
% first-best string and the mid part from the second-best
% string
for jx=1:cross_point1
Pop(Indeces(PopSize),jx)=Pop(Indeces(1),jx);
end
for jx=cross_point1+1:cross_point2
Pop(Indeces(PopSize),jx)=Pop(Indeces(2),jx);
end
for jx=cross_point2+1:n
Pop(Indeces(PopSize),jx)=Pop(Indeces(1),jx);
end
else
% Here it's the other way round
for jx=1:cross_point1
Pop(Indeces(PopSize),jx)=Pop(Indeces(2),jx);
end
for jx=cross_point1+1:cross_point2
Pop(Indeces(PopSize),jx)=Pop(Indeces(1),jx);
end
for jx=cross_point2+1:n
Pop(Indeces(PopSize),jx)=Pop(Indeces(2),jx);
end
end
choice_configuration = Pop(Indeces(PopSize),:);
end
if t>stop_crossover % local hill climbing for the following periods
% First, we identify the choice vector with the highest performance so far
[~, max_index]= max(performances);
bestch = choices(max_index,:);
% Then, we change the choice on one randomly chosen dimension
ch = randi(n); % a random dimension (a random integer from 1 to n)
choice_configuration = bestch; % we copy the best choice vector
choice_configuration(ch) = 1-bestch(ch); % Then we flip the choice on dimension ch
end
This method uses a theory to search and select strategies. After an initial period of data collection, the DM forms a theory of strategic choice by placing the factors in order of how much their variance explains performance (A->B => Performance). The DM then uses this theory to select new strategies that might be performant given it, periodically updating her theory in response to performance feedback.
The idea I'm trying to think about with this approach is that when the world is so complex that it cannot be approximated by analyses that assume a functional form, perhaps using a non-random sampling method with a known functional form (for example a linear theory!) would allow for effective analysis of previous data in order to find new and performant solutions.
Python code:
First, import the following packages:
import random as random
import pandas as pd
import statsmodels.api as sm
The algorithm:
#Optimized Parameters
#Theory Size
m = n
#Initial Data Collection Amount
time_dat_col = 50
my_theory = list(range(0,n))
#In first two time periods, select the following configurations:
#all factors set to [0] and
if t==1:
choice_configuration = np.array([0]*n)
choices[0] = choice_configuration
t = 1 # Set t to 1 for the first period
#all factors set to [1].
elif t == 2:
choice_configuration = np.array([1]*n)
choices[1] = choice_configuration
#This is done to help generate an initial theory by ensuring there is
#at least one data point for each factor where it is set to 0 vs. set to 1.
#Initial Data Collection for 50 times periods
elif t < time_dat_col:
choice_configuration = np.random.randint(0, 2, n)
#Choose Linear Theory A->B Using Linear Regression Y ~ A + B + A*B on Data
elif t % n == 0:
#NEW THEORY
#Choose the set of M policies that together most account for performance
my_choices = choices[:-(number_of_periods-t-2)].tolist()
perfy = performances[:-(number_of_periods-t-2)]
max_index = np.argmax(performances)
bestch = choices[max_index,:]
my_choice_config = list(bestch)
my_data = {"y" : perfy}
my_m = list(range(0,n))
#Pick an order and Values
my_m_ord = []
my_m_for_rem = my_m.copy()
my_dar = {"y": perfy}
my_model = "y ~ "
for item in range(0,n):
my_dat_x = []
str_name = "x" + str(item)
if item == 0:
my_model = my_model + str_name
else:
my_model = my_model + " + "+ str_name
for chc in range(len(my_choices)):
my_dat_x.append(my_choices[chc][item])
my_dar[str_name] = my_dat_x
my_data_frame = pd.DataFrame(my_dar)
while len(my_m_ord) < len(my_m):
my_rr = []
my_coef = []
if len(my_m_ord) == 0:
my_model = "y ~ "
else:
if my_model != "y ~ ":
my_model = my_model + " + "
str_name_1 = "x" + str(my_m_ord[-1])
my_model = my_model + str_name_1
if len(my_m_ord) > 1:
str_name_2 = "x" + str(my_m_ord[-2])
my_model = my_model + " + " + str_name_1 + ":" + str_name_2
for row in range(len(my_m_for_rem)):
my_model_test = my_model
if my_model != "y ~ ":
my_model_test = my_model + " + "
str_name = "x" + str(my_m_for_rem[row])
my_model_test = my_model_test + str_name
if len(my_m_ord) > 1:
str_name_2 = "x" + str(my_m_ord[-1])
my_model_test = my_model_test + " + " + str_name + ":" + str_name_2
my_ols_pred = sm.OLS.from_formula(my_model_test, data=my_data_frame).fit()
my_rr.append(my_ols_pred.rsquared)
my_coef.append(my_ols_pred.params[str_name])
max_m = my_rr.index(max(my_rr))
my_max_coef = my_coef[max_m]
#Use Linear Regression to Select Configuration
if my_max_coef > 0:
my_choice_config[my_m_for_rem[max_m]] = 1
else:
my_choice_config[my_m_for_rem[max_m]] = 0
my_last = my_m_ord + [my_m_for_rem[max_m]]
my_m_ord.append(my_m_for_rem[max_m])
my_m_for_rem.remove(my_m_for_rem[max_m])
choice_configuration = np.array(my_choice_config)
not_in_m = list(set(list(range(n)))-set(my_m_ord))
my_theory = my_m_ord.copy()
#Use Theory to Select Next Strategy
else:
my_choices = choices.tolist()[:t-2]
perfy = performances.tolist()[:t-2]
max_index = np.argmax(performances)
bestch = choices[max_index,:]
my_choice_config = list(bestch)
#RANDOMLY CHOOSE ONE ITEM IN THEORY TO CHANGE
#ITS CONSEQUENTS BASED ON DATA
my_rando = np.random.choice(my_theory)
my_index = my_theory.index(my_rando)
to_change = my_theory[my_index:]
my_choice_config[to_change[0]] = 1-my_choice_config[to_change[0]]
val_to_filt = my_choice_config[to_change[0]]
for item in range(len(to_change)-1):
my_test = to_change[item+1]
my_filter = to_change[item]
my_model = "y ~ "
my_data = {}
str_name = "x" + str(my_test)
my_model = my_model + str_name
my_dat_x = []
my_dat_y = []
for chc in range(len(my_choices)):
if my_choices[chc][my_filter] == val_to_filt:
my_dat_x.append(my_choices[chc][my_test])
my_dat_y.append(perfy[chc])
my_data[str_name] = my_dat_x
my_data["y"] = my_dat_y
my_data_frame = pd.DataFrame(my_data)
my_ols_pred = sm.OLS.from_formula(my_model, data=my_data_frame).fit()
my_coef = list(my_ols_pred.params)[-1]
if my_coef > 0:
my_choice_config[my_test] = 1
else:
my_choice_config[my_test] = 0
val_to_filt = my_choice_config[my_test]
choice_configuration = np.array(my_choice_config)
#If choice already chosen local learn
if any((choices[:]==choice_configuration).all(1)):
#Local learn
max_index = np.argmax(performances)
bestch = choices[max_index,:]
choice_configuration = bestch.copy()
ch = random.choice(range(0,n))
choice_configuration[ch] = 1- choice_configuration[ch]
This approach starts with a 0.65 probability of random jumps that cools down to 0.05.
Each time a random jump or local bit flip leads to a previously visited configuration, the code retries to maintain uniqueness.
Matlab code:
initial_T = 0.65;
final_T = 0.05;
if number_of_periods > 1
alpha = (final_T/initial_T)^(1/(number_of_periods-1));
problongjump = initial_T * alpha^(t-1);
else
problongjump = initial_T;
end
all_choices_so_far = choices(1:t-1,:);
attempts = 100;
visited = true;
check_visited = @(cand) any(all(all_choices_so_far == cand,2));
if t == 1
candidate = randi([0,1],1,n);
if check_visited(candidate)
found_unvisited = false;
for i = 1:attempts
cand_try = randi([0,1],1,n);
if ~check_visited(cand_try)
candidate = cand_try;
found_unvisited = true;
break;
end
end
if ~found_unvisited
found_unvisited = false;
for x = 0:(2^n - 1)
cand = bitget(x,1:n);
cand = cand(end:-1:1);
if ~check_visited(cand)
candidate = cand;
found_unvisited = true;
break;
end
end
if ~found_unvisited
candidate = randi([0,1],1,n);
end
end
end
choice_configuration = candidate;
else
if rand < problongjump
candidate = randi([0,1],1,n);
if check_visited(candidate)
found_unvisited = false;
for i = 1:attempts
cand_try = randi([0,1],1,n);
if ~check_visited(cand_try)
candidate = cand_try;
found_unvisited = true;
break;
end
end
if ~found_unvisited
found_unvisited = false;
for x = 0:(2^n - 1)
cand = bitget(x,1:n);
cand = cand(end:-1:1);
if ~check_visited(cand)
candidate = cand;
found_unvisited = true;
break;
end
end
if ~found_unvisited
candidate = randi([0,1],1,n);
end
end
end
choice_configuration = candidate;
else
[~, max_index] = max(performances);
bestch = choices(max_index,:);
ch = randi(n);
candidate = bestch;
candidate(ch) = 1 - bestch(ch);
if check_visited(candidate)
found_unvisited = false;
for i = 1:attempts
cand_try = randi([0,1],1,n);
if ~check_visited(cand_try)
candidate = cand_try;
found_unvisited = true;
break;
end
end
if ~found_unvisited
found_unvisited = false;
for x = 0:(2^n - 1)
cand = bitget(x,1:n);
cand = cand(end:-1:1);
if ~check_visited(cand)
candidate = cand;
found_unvisited = true;
break;
end
end
if ~found_unvisited
candidate = randi([0,1],1,n);
end
end
end
choice_configuration = candidate;
end
end
The algorithm begins with an exploration phase, where random binary choices are generated and evaluated over multiple periods. During the initial period (50 periods), random configurations are selected. In subsequent periods, the algorithm either explores new random configurations or refines existing ones using a combination of logic functions (NAND). A probabilistic mechanism is used to decide between long jumps, where the best-performing configuration is modified by combining it with a random vector, or a local hill-climbing strategy, where one dimension of the best-performing configuration is randomly altered to potentially improve the overall performance.
Python code:
''' ===================================================================
THE SEARCH ALGORITHM: This is the only part you can edit
===================================================================='''
# Define logic functions
def logic_imply(a, b):
return np.logical_or(np.logical_not(a), b).astype(int)
def logic_Notimply(a, b):
return np.logical_or(a, np.logical_not(b)).astype(int)
def logic_xor(a, b):
return np.logical_xor(a, b).astype(int)
def logic_and(a, b):
return np.logical_and(a, b).astype(int)
def logic_or(a, b):
return np.logical_or(a, b).astype(int)
def logic_nand(a, b):
return np.logical_not(np.logical_and(a, b)).astype(int)
def logic_nor(a, b):
return np.logical_not(np.logical_or(a, b)).astype(int)
def logic_xnor(a, b):
return np.logical_not(np.logical_xor(a, b)).astype(int)
def logic_xor_and(a, b):
xor = np.logical_xor(a, b)
or_ = np.logical_or(a, b)
return np.logical_and(xor, or_).astype(int)
def logic_xor_nand(a, b):
xor = np.logical_xor(a, b)
nand = np.logical_not(np.logical_and(a, b))
return np
def logic_xor_nor(a, b):
xor = np.logical_xor(a, b)
nor = np.logical_not(np.logical_or(a, b))
return np
def logic_xor_xnor(a, b):
xor = np.logical_xor(a, b)
xnor = np.logical_not(np.logical_xor(a, b))
return np
def logic_imply_and_not(a, b):
imply = np.logical_or(np.logical_not(a), b)
and_ = np.logical_and(a, b)
return np.logical_and(imply, np.logical_not(and_)).astype(int)
current_logic_function1 = logic_nand
current_logic_function2 = logic_nand
# Initial round random choices and performance
exploration_time=50
if t==1:
choice_configuration = np.random.randint(0, 2, n)
choices[0] = choice_configuration
t = 1 # Set t to 1 for the first period
elif t<=exploration_time:
# # Use the best-performing choice so far from past periods
# max_index = np.argmax(performances[:t - 1])
# best_choice = choices[max_index, :]
# # Choose a random second vector for logic combination
# second_choice = np.random.randint(0, 2, n)
# # Apply the current logic function
# choice_configuration = current_logic_function1(best_choice, second_choice)
choice_configuration = np.random.randint(0, 2, n)
choices[0] = choice_configuration
else:
problongjump = 0.4
if np.random.rand()<problongjump:
# Use the best-performing choice so far from past periods
max_index = np.argmax(performances[:t - 1])
best_choice = choices[max_index, :]
# Choose a random second vector for logic combination
second_choice = np.random.randint(0, 2, n)
# Apply the current logic function
choice_configuration = current_logic_function1(best_choice, second_choice)
# choice_configuration = np.random.randint(0, 2, n)
else: # local hill climbing
# First, we identify the choice vector with the highest performance so far
max_index = np.argmax(performances) # index of the highest performance
bestch = choices[max_index,:] # the choice vector
# Then, we change the choice on one randomly chosen dimension
ch = np.random.randint(0, n) # a random dimension (a random integer between 0 and n-1)
choice_configuration = bestch.copy() # we copy the best choice vector
choice_configuration[ch] = 1 - bestch[ch] # Then we flip the choice on dimension ch
''' ===================================================================
THE SEARCH ALGORITHM ENDS HERE
===================================================================='''
Bayesian optimization:
We use "Bayesian Optimization"; a search technique for finding the optimum for complicated functions we have incomplete knowledge about.
We model the 2^10 = 1024 choice configurations as having a multivariate normal distribution with covariances that depend on the hamming distance between choices. Whenever a new performance is observed, we update the mean vector and the covariance matrix for the unobserved choices, using well-known formulas for the multivariate normal distribution. This keeps track of how promising unseen choices are and how uncertain we are about their promise.
We calculate the "expected improvement" for each unseen choice: ei_i=E[Max(p_i,ma)]−ma, where ma is the maximum performance observed so far, and the expectation is over the conditional distribution of the performance of unseen alternative i.
We choose the alternative with the highest expected improvement. Expected improvement is larger for alternatives with higher means and for alternatives we are more uncertain about.
If we ignore uncertainty and always choose the configuration with the highest conditional mean, we get a lower average score (about 0.755).
Matlab code:
if t == 1 %
choice_configuration = zeros(1,n); %choose all zeros in first period
nobin = 2^n; % Number of binary vectors
binaryVectors = dec2bin(0:nobin-1) - '0'; % Generate all binary vectors
%list of the indices of the chosen configurations
observed_indices = [1]; %1 = all zeros
unobserved_indices = 2:1:nobin;
%the covariances for hamming distances 0 to 10
cohdist = [833, 500, 278, 139, 60, 20, 4, 0, 0, 0, 0] / 100000;
%the covariance between the performances of two choices at hamming distance d from
%each other is (1/(n^2)) * (1/12) * (n - d(i)) *(nchoosek(n-1-d(i), k) / nchoosek(n-1, k))
% if n-1-d(i) >=k and 0 otherwise
% Compute pairwise Hamming distances
hd = pdist2(binaryVectors, binaryVectors, 'hamming') * size(binaryVectors, 2); % Scale by vector length
% covariances depend on hamming distances
C = cohdist(hd + 1); % Add 1 because MATLAB indexing starts at 1
% Initial mean vector (expected performance is 0.5)
mu = 0.5+zeros(nobin, 1);
% functions needed to compute EI
phi = @(z) exp(-0.5 * z.^2) / sqrt(2 * pi); % Standard normal PDF
Phi = @(z) 0.5 * (1 + erf(z / sqrt(2))); % Standard normal CDF
else %if t > 1
% Partition the covariance matrix (needed to update the mean
% and covariances)
Coo = C(observed_indices, observed_indices);
Cou = C(observed_indices, unobserved_indices);
Cuo = Cou'; % Cou and Cuo are transposes
Cuu = C(unobserved_indices, unobserved_indices);
% Compute the conditional means and covariances
y_obs = performances(1:t-1,1); %the performances so far
cm = mu(unobserved_indices) + Cuo / Coo * (y_obs - mu(observed_indices)); %updated means
Sigma_cond = Cuu - Cuo / Coo * Cou; %updated covariance matrix
cv = diag(Sigma_cond); %the variances
%Compute expected improvement
relstdev = sqrt(max(cv, 0.0001)); % computes stdev, ensures all values are positive
ym = max(y_obs); %maximum performance so far
z = (cm - ym) ./ relstdev;
ei = (cm - ym) .* Phi(z) + relstdev .* phi(z); % EI computation
%choose configuration
ve = find(ei==max(ei)); %find elements with maximal ei
id = ve(randi(length(ve))); %pick one randomly
idbest = unobserved_indices(id); %the index of the chosen configuration
observed_indices = [observed_indices, idbest]; %add to observed
unobserved_indices(unobserved_indices == idbest) = []; %delete from unobserved
choice_configuration = binaryVectors(idbest ,:);
end
Bandit v1.1
A simple bandit strategy (learn with some learning rate, and select with a softmax).
When it observes a fitness value, it updates beliefs about nearby locations, with some decay over hamming distance.
It uses four main parameters (and these can most likely be optimized further):
1: explore_periods: the first x periods are just exploration (from a pre-defined set of evenly spaced-out configurations)
2: learning rate alpha, here a fixed parameter
3: distancedecay (how much to discount distant observations)
4: softmax temperature, here a declining function of time.
This version iteratively updates beliefs, not sure why this should be good in theory, but it seems to work.
The code works with numbers (000 = location 1, 100 = location 2, etc.) rather than binary vectors (and converts them back at the end of each period). Just a personal preference.
Matlab code:
if t == 1
% set up things in the initial round
% --- set up helper tools, this can go to the top of the script ---
% create a matrix that captures the hamming distance between locations
bitset = dec2bin(0:1:2^n-1)-'0';
HD = zeros(2^n);
for i = 1:2^n, for j = 1:2^n, HD(i,j) = sum(abs(bitset(i,:) - bitset(j,:))); end; end
% the first tries are spaced out over the landscape. Can probably be improved.
first_tries = [1;1024;32;228;365;693;850;907;118;123;174;207;218;296;314;332;343;404;413;454;497;556;584;605;723;745;782;819;930;47;52;78;136;155;213;278;345;354;394;419;535;538;550;587;625;653;706;743;772;809;837;913;4;6;7;10;11;13;18;19;21;25;34;35;37;41;49;55;61;66;67;69;73;81;84;95;97;103;106;112;130;131;133;137;140;145;150;151;161;167;171;178;179;184;185;188;191;193;216;229];
% --- global search parameters, can also go to the top of the script
alpha = 0.625; % learning rate
distancedecay = 2.15; % hamming distance discount to new info
explore_periods = 30; % first x periods, just search
T_initial = 0.0065; % also, T declines over time, see below
% set up initial things for the simulation, this needs to stay here
beliefs = ones(2^n,1) / 2; % initial beliefs all 0.5
choices_numeric = zeros(number_of_periods,1); % because I prefer to work with numbers rather than bit strings
end
% Step 1: update beliefs (in all but the first period)
if t > 1
lastchoice = choices_numeric(t-1);
lastscore = performances(t-1);
distances = HD(lastchoice,:); % This says: how far was the last configuration from all other configurations?
distances(distances==0) = 1; % To not divide by 0. Not elegant, but we won't repeat configurations anyway.
% beliefs about every location are updated with some weight:
% general learning parameter alpha matters, but also the distance to the observed location.
weights = alpha ./ distances .^ distancedecay;
% update beliefs (and fix the array direction back to n x 1)
beliefs = (1 - weights) .* beliefs' + weights .* lastscore;
beliefs = beliefs';
end
% Step 2: choose a configuation
if t <= explore_periods
choice = first_tries(t); % pick numbers from the list above in the first X periods
else
% this line ensures we don't repeat past configurations
beliefs(choices_numeric(choices_numeric>0)) = 0;
% Then, do the softmax thing
T = T_initial / log(1+t); % temperature declines over time
choice = find(cumsum(exp(beliefs/T) / sum(exp(beliefs/T))) >= rand, 1, 'first');
end
choices_numeric(t) = choice; % to save for myself
choice_configuration = bitget(choice-1, 1:n);
40 initial random choices and then local search based on a randomly selected top 5 alternative
Matlab code:
intper = 40;
if t <= intper % Random choice phase
choice_configuration = binornd(1, 0.5, 1, n);
end
if t > intper
% best choices so far
[sortedArray, sortedIndexes] = sort(performances(:,1), 'descend'); % Sort array in descending order
top_choices = sortedIndexes(:,1); % Get indexes of the best values
%pick a random choice among top 5
ra = randi(5);
%then do a local search from this
bestch = choices(top_choices(ra),:);
ch = randi(n); % a random dimension (a random integer from 1 to n)
choice_configuration = bestch; % we copy the best choice vector
choice_configuration(ch) = 1-bestch(ch); % Then we flip the choice on dimension ch
end
It begins with an exploration phase of random choices, then it takes the best string and performs some rounds of 3-bit mutation, then some rounds of 2-bit mutation and finally of 1-bit mutations.
Matlab code:
% Coded by Luigi Marengo and Manuel Romagnoli
% It begins with an exploration phase of random choice, then it
% takes the best string and performs a round of 3-bit mutation,
% then a round of 2-bit mutation and finally 1-bit mutations
ExplorePhase=46;
Threemut=20;
Twomut=20;
if t <= ExplorePhase % make a random choice for the first ExplorePhase periods
choice_configuration = randi(2,1,n)-1;
end
if (t>ExplorePhase) && (t<=ExplorePhase+Threemut) %3-bit mutations
% First identify the choice vector with the highest performance so far
[~, max_index]= max(performances);
bestch = choices(max_index,:);
% Then change three randomly chosen bits
ch1 = randi(n);
ch2 = randi(n);
ch3 = randi(n);
choice_configuration = bestch; % copy the best choice vector
choice_configuration(ch1) = 1-bestch(ch1); % Then we flip the choice on dimension ch1
choice_configuration(ch2) = 1-bestch(ch2); % Then we flip the choice on dimension ch2
choice_configuration(ch3) = 1-bestch(ch3); % Then we flip the choice on dimension ch3
end
if (t>ExplorePhase+Threemut) && (t<=ExplorePhase+Threemut+Twomut) %2-bit mutations
% First identify the choice vector with the highest performance so far
[~, max_index]= max(performances);
bestch = choices(max_index,:);
% Then change two randomly chosen bits
ch1 = randi(n);
ch2 = randi(n);
choice_configuration = bestch; % we copy the best choice vector
choice_configuration(ch1) = 1-bestch(ch1); % Then we flip the choice on dimension ch1
choice_configuration(ch2) = 1-bestch(ch2); % Then we flip the choice on dimension ch2
end
if t>ExplorePhase+Threemut+Twomut %1-bit mutations
% First, we identify the choice vector with the highest performance so far
[~, max_index]= max(performances);
bestch = choices(max_index,:);
% Then change one randomly chosen bit
ch1 = randi(n);
choice_configuration = bestch; % copy the best choice vector
choice_configuration(ch1) = 1-bestch(ch1); % Then we flip the choice on dimension ch1
end