Wednesday, December 23, 2020

Improving this genetic algorithm for error minimization

I've written a simple genetic algorithm designed to perform fitting. That is, given some value f(x), I'm able to solve for x without knowing f (and, really, a closed-form f(x) doesn't even have to exist). My process is as follows:

  1. I generate some initial points, uniformly distributed across the known solution interval [0, 1].

  2. I then iterate until I reach some maximum number of generations. With each iteration, I:

    a) Sort the current set of points in ascending order of error (RSS), and store them in the parents list

    b) Keep the first 500, and throw in a few randomly selected points from the initial list

    c) "Smear" roughly half the points in the parents list, by replacing each chosen point with a new point drawn from a Gaussian whose mean is the chosen point

    d) I now fill the remaining "empty slots" in the parents list by producing "children". A child is produced by randomly selecting two points from the parents list and taking their mean ((male + female) / 2).

    e) Finally, I set the initial list equal to the parents list, and jump back to a)

In the end, I sort the list one final time, and select the first element to be the solution.

See the code below.

I end up with a somewhat OK solution. It jumps surprisingly quickly to the neighborhood of the solution, but then fails to make much progress from there. I still obtain better (and faster) results using brute force. So, I'd like to improve my algorithm.
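
For context, the brute-force baseline I'm comparing against is essentially a dense uniform scan of [0, 1]. A rough sketch (the grid resolution here is an arbitrary illustration, not the exact value I use):

    # Brute-force baseline: evaluate error_func on a dense uniform grid
    # over [0, 1] and keep the best point.
    n = 1_000_000                        # arbitrary grid resolution
    best = min((i / n for i in range(n + 1)), key=error_func)
    print(best)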

A couple notes:

I'm well aware that this depends, to some extent, on the problem I'm applying the algorithm to. I'm interested in using this for a few different things. Ignoring the specific problem, what can I do with what I already have to improve the algorithm?

I'm also aware that better/easier/faster methods (probably) exist. I'm just interested in the topic.

My code:

    import random
    from random import randint

    # error_func is assumed to be defined for the problem at hand
    # (it returns the RSS error of a candidate point).
    max_generations = 100                # generation budget (example value)

    initial = []

    # Generate random initial test points
    for i in range(5000):
        initial.append(random.uniform(0, 1))

    for generation in range(max_generations):
        # Sort according to some error function
        initial.sort(key=error_func)

        # Keep the "best" 500
        parents = initial[:500]

        # Throw in a few randomly selected points from the initial list
        for _ in range(randint(10, 100)):
            parents.append(initial[randint(0, len(initial) - 1)])

        # "Mutate" roughly half the parents. (Indexing is needed here:
        # reassigning the loop variable would not change the list.)
        for j in range(len(parents)):
            if randint(0, 1):
                parents[j] = random.gauss(parents[j], 1E-5)

        children = []
        while len(children) < (5000 - len(parents)):
            # Randomly pick a male and a female
            male = parents[randint(0, len(parents) - 1)]
            female = parents[randint(0, len(parents) - 1)]

            # Produce a child
            children.append((male + female) / 2)

        parents.extend(children)
        initial = parents

    initial.sort(key=error_func)
    print(initial[0])
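
For reference, error_func above can be any callable that returns the error of a candidate point. A minimal stand-in (the function f and the target value are hypothetical placeholders, not my actual problem) might be:

    # Hypothetical stand-in problem: recover x from a known output of f.
    def f(x):
        return x ** 3 - 2 * x            # placeholder "black box"

    target = f(0.314)                    # pretend only this output is known

    def error_func(x):
        # Squared residual between f(x) and the known output
        # (with a single observation this is just the RSS).
        return (f(x) - target) ** 2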

Some points I'm considering:

  1. I know there are several different ways genetic algorithms choose the "fittest" points (individuals/members/etc.). Perhaps there's a better way than just sorting by least error? (See the tournament-selection sketch after this list.)

  2. Is it generally better to have more starting points, or more iterations/generations?
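
As an illustration of the first point, here's a minimal sketch of tournament selection, one standard alternative to sorting the whole population (the function name and the tournament size k are my own illustration, not something from the code above):

    import random

    def tournament_select(population, error_func, k=5):
        # Pick k individuals at random and return the one with the
        # lowest error. Repeated calls give tunable selection pressure
        # instead of always taking the global top slice.
        contenders = random.sample(population, k)
        return min(contenders, key=error_func)

    # Usage sketch: pick each parent via its own small tournament
    # male = tournament_select(parents, error_func)
    # female = tournament_select(parents, error_func)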

Goals

#1: Improve accuracy and precision
#2: Improve the speed with which a "nice" solution is found

https://stackoverflow.com/questions/65421016/improving-this-genetic-algorithm-for-error-minimization December 23, 2020 at 04:15PM
