TensorGP: A Data Parallel Approach to Genetic Programming

Genetic Programming (GP) is an automatic problem-solving technique inspired by nature, that optimizes solutions through the evolution of computer programs. Though a powerful tool, GP is also known to suffer from the burden of being computationally expensive by design. This is especially true during the domain evaluation phase where we have to run our program for every fitness case. Currently, due to the rise of parallel computing in recent years, data vectorization is arguably the most attractive strategy to accelerate the evolutionary process of GP applications.


The main goal of this project is to investigate the advantages of applying data vectorization and fitness caching to the fitness evaluation phase using throughput-oriented architectures such as the GPU. To accomplish this, we developed, implemented, and explored a general-purpose GP engine capable of catering to our vectorization needs – TensorGP.


The Framework


As the name implies, TensorGP works with tensors, which in the context of Machine Learning, are often defined as a multidimensional array of elements. TensorGP takes the classical approach of most other GP systems and expands on it by using TensorFlow to vectorize operations, consequently speeding up the domain evaluation process by seamlessly distributing computational efforts across different hardware architectures. Additionally, TensorFlow allows for the caching of intermediate fitness results, which accelerates the evolutionary process by avoiding the re-execution of highly fit code.


In its simplest form, each individual can be represented as a mathematical expression, as exemplified in Figure 1. TensorGP follows a tree-based approach, internally representing individuals as a tree graph. This implies a first translation phase from string to tree representation, which is only performed at the beginning of the evolutionary process.


Figure 1

Possible genotype to phenotype translation phases in TensorGP.


TensorFlow can either execute in an eager or graph-oriented mode. When it comes to graph execution, TensorFlow internally converts the tree structure into a graph before actually calculating any values. This is done in order to cache potential intermediate results from subtrees, effectively generalizing our tree graph structure to a Directed Acyclic Graph (DAG). On the other hand, the eager execution model allows for the immediate execution of vectorized operations, eliminating the overhead of explicitly generating the intermediate DAG of operations. As the individuals being evolved are constantly changing from generation to generation, it is reasonable to assume that the eager execution mode would instead be a good fit for GP.


Experimental Results


Our early experiments saw the speed comparison between different approaches for both raw domain evaluation and the evolution of a fixed initial population for the symbolic regression problem of approximating the Pagie Polynomial function. In this set of initial benchmarks, an average of 5 runs was performed for every test case. Figure 2 shows the average and standard deviation of timing values (in seconds) across domain sizes ranging from 4,096 points to over 4 million points for domain evaluation.


Figure 2

Speed comparison amongst several iterative and vectorized GP approaches for the evaluation of domains with an exponential increase in size.


These preliminary experimental results demonstrated a preference towards eager execution for our vectorized approach using TensorFlow, verifying the hypothesis that GP does not greatly benefit from graph execution for the evaluation of large domains. Besides, we show that performance gains in excess of 100 times are attainable in certain scenarios when comparing TensorGP to other iterative frameworks such as DEAP.


To furnish our results with statistical significance, a second comparative study was conducted on the same problem, but this time including a wider range of frameworks commonly used in Evolutionary Computation research and with 30 evolutionary runs for each test case.


Figure 3

Speed comparison between TensorGP and other widely used GP frameworks for the execution of a controlled evolutionary experiment across domain sizes..


The experimental results shown in Figure 3 demonstrate that TensorGP consistently achieves faster execution times within the same controlled setup when compared to other GP systems. Specifically, we verified speedups of up to two orders of magnitude when compared to well-established GP frameworks that perform iterative evaluation (such as ECJ or EO) as well as a similar vectorization boosted framework (KarooGP). These performance gains are shown to be more pronounced for the evaluation of domains containing millions of data points, with some iterative frameworks not being able to complete the same evolutionary runs in a feasible amount of time. Table 4 highlights the performance trends with the increase in size of the evaluation domain.


Figure 4

Average (AVG) and Standard (STD) execution times, measured in seconds, of the considered experimental setup across different approaches.
Cells corresponding to the fastest approach are highlighted for each test set. Values represented with DNF did not finish either due to memory limitations or time constraints


Due to the complexity of real-world problems, evaluating more fitness cases is desirable as it provides more information about the optimal solution, therefore increasing the potential for approximation and discovery of more accurate candidate solutions. Moreover, the speedups demonstrated further make the case for vectorized evaluation in GP, especially when executed on throughput-oriented architectures, where GPUs are included.




An important difference between academic exercises and real-world problems is that the latter are generally higher in complexity. When it comes to GP, this inherent complexity often translates to bigger and deeper individuals. It is therefore desirable to be able to evaluate larger individuals in a feasible amount of time.

While it is useful to determine performance gains over domains with a higher granularity, the complexity of real-world problems means that we often need to evaluate large individuals, whose trees can be both large and deep. For this reason, it is also desirable to study the impact that tree size and depth has on performance.


In Figure 5. we see the results from an average of 30 runs with the described experimental for the average execution time in seconds across different depths for TensorGP running in both GPU and CPU for both eager and graph execution modes (as well as the iterative baselines DEAP and EVAL). The results are in the form of Genetic Programming operations per second (GPOps/s).

Figure 5

Average GPOp/s results in for all approaches for the evolution of initial populations generated with the RHH method for depths ranging from 4 to 26. Results not shown in the graph were removed for not meeting the defined time threshold of 30 minutes.

As we can infer from the graph, the evaluation speed of all approaches stays relatively constant across depths, meaning that these approaches manage to scale well across trees of different depths, and therefore different sizes.

Additionally, as expected, the approaches that perform iterative domain evaluation are much slower than TensorGP running in either eager or graph execution modes, confirming that vectorization is indeed beneficial not only for dealing with high granular domains but also with larger and more complex individuals.


Evolutionary Art


TensorGP is not restricted to Symbolic Regression problems. As a matter of fact, due to the increase in tensor evaluation speed, our framework facilitates the exploration of certain parameters such as depth and domain size, which happen to be particularly desirable when applying GP to evolutionary art.


In this context, we integrated our engine with the Neural Image Assessment (NIMA) classifier, ranking images according to how visually appealing they appear to a human subject. NIMA was trained using the Aesthetic Visual Analysis (AVA) dataset, which uses images related to human activities with a fixed size of 224 by 224 pixels. The artifacts generated (i.e. images presented atop of this post) are clear indicators of TensorGP’s potential for artistic production.
Besides, regarding execution time, the evolutionary process involved in the generation of these images with TensorGP was substantially faster (40 seconds) in comparison to DEAP (5 hours and 46 minutes).





TensorGP repository on GitHub




  • F. Baeta, J. Correia, T. Martins, and P. Machado, “TensorGP – Genetic Programming Engine in TensorFlow,” in Applications of Evolutionary Computation – 24th International Conference, EvoApplications 2021, Held as Part of EvoStar 2021, Virtual Event, April 7-9, 2021, Proceedings, 2021, pp. 763-778.

  • F. Baeta, J. Correia, T. Martins, and P. Machado, “Speed benchmarking of genetic programming frameworks,” in GECCO ’21: Genetic and Evolutionary Computation Conference, Lille, France, July 10-14, 2021, 2021, pp. 768-775.