Evolving Ambiguous Images
This work explores the creation of ambiguous images, i.e., images that may induce multistable perception, by evolutionary means. Ambiguous images are created using a general purpose approach, composed of an expression-based evolutionary engine and a set of object detectors, which are trained in advance using Machine Learning techniques. Images are evolved using Genetic Programming and object detectors are used to classify them. The information gathered during classification is used to assign fitness. In a first stage, the system is used to evolve images that resemble a single object. In a second stage, the discovery of ambiguous images is promoted by combining pairs of object detectors. The analysis of the results highlights the ability of the system to evolve ambiguous images and the differences between computational and human ambiguous images.
We consider ambiguous images and multistable perception fascinating phenomena, worth studying for both scientific and artistic purposes. Some of the questions that motivate the research are: (i) Can ambiguous images be created by fully automated computational means? (ii) Can this be done from scratch (i.e. without resorting to collages or morphing of pre-existing images)? (iii) What do computational ambiguous images look like? (iv) How do they relate to human ambiguous images? (v) How can the dichotomy between human and computational ambiguity be explored for artistic purposes? (vi) Can one explore computer vs. human creativity and perception scientifically via ambiguous images?
First we evolve images containing a single object. Following in the footsteps of Machado and Correia (2012, 2013) we use an object detector to guide evolution, assigning fitness based on the internal values of the object detection process. Then, using object detectors trained to identify different types of objects, we evolve images containing two distinct objects. Finally, we focus on the evolution of ambiguous images, which is achieved by evolving images containing two distinct objects in the same window of the image.
Overview of the Approach
Figure 1 presents an overview of the framework, which is composed of two main modules, an evolutionary engine and a classifier. For the purpose of this paper, the framework was instantiated with a general-purpose GP-based image generation engine and with a cascade classifier as an object detector. To create a fitness function able to guide evolution, it is necessary to convert the binary classification output of the object detector into a graded measure that provides a suitable fitness landscape. This is achieved by accessing internal results of the classification task that indicate the degree of certainty of the classification.
The Genetic Programming engine allows the evolution of populations of images. The genotypes are expression trees where the functions include mathematical and logical operations and the terminal set is composed of two variables, x and y, and random constant values. The phenotypes are images, rendered by evaluating the expression trees for different values of x and y, which serve both as terminal values and image coordinates. In other words, the value of the pixel of coordinates (i,j) is calculated by assigning i to x and j to y and evaluating the expression tree.
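The rendering scheme described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function set, the coordinate normalization to [-1, 1], and the tuple-based tree encoding are all assumptions made for the example.

```python
import math

# Hypothetical function set; the actual engine uses a richer set of
# mathematical and logical operations.
FUNCTIONS = {
    "add": lambda a, b: a + b,
    "mul": lambda a, b: a * b,
    "sin": lambda a: math.sin(a),
}

def evaluate(node, x, y):
    """Recursively evaluate an expression tree at coordinates (x, y)."""
    if node == "x":
        return x
    if node == "y":
        return y
    if isinstance(node, (int, float)):  # random constant terminal
        return node
    op, *args = node
    return FUNCTIONS[op](*(evaluate(a, x, y) for a in args))

def render(tree, width, height):
    """Render the phenotype: evaluate the tree once per pixel,
    mapping pixel indices (i, j) to coordinates in [-1, 1]."""
    image = []
    for j in range(height):
        row = []
        for i in range(width):
            x = 2.0 * i / (width - 1) - 1.0
            y = 2.0 * j / (height - 1) - 1.0
            row.append(evaluate(tree, x, y))
        image.append(row)
    return image

# Example genotype: sin(x * y) + x
tree = ("add", ("sin", ("mul", "x", "y")), "x")
img = render(tree, 64, 64)
```

In practice the raw values would still be clamped or normalized to a pixel intensity range before display.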
The object detectors used in the framework are cascade classifiers (see Figure 2) based on the work of Viola and Jones. Two object detectors were trained, by building datasets of faces and flowers; the training was carried out using the OpenCV API. Fitness is assigned by accessing internal results of the classification task. As such, images that are immediately rejected by the classifier will have lower fitness values than those that are close to a detection of the object.
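One way to picture the fitness assignment is a mapping from how far an image window progresses through the cascade to a graded score. The formula below is an illustrative assumption, not the paper's actual fitness function; in OpenCV's Python API, comparable internals (reject levels and level weights) can be obtained via `CascadeClassifier.detectMultiScale3(..., outputRejectLevels=True)`.

```python
def cascade_fitness(stages_passed, total_stages, stage_confidence=0.0):
    """Map internal cascade-classifier results to a graded fitness.

    A cascade classifier rejects most windows at an early stage; a
    binary detected/not-detected output would give a flat fitness
    landscape. Instead, images that survive more stages score higher,
    and a (hypothetical) confidence term refines the ranking among
    images that reach the same stage.
    """
    # Fraction of the cascade survived, in [0, 1].
    base = stages_passed / total_stages
    # Illustrative weighting of the stage confidence value.
    return base + max(stage_confidence, 0.0)
```

Under this sketch, an image rejected at stage 3 of 20 scores well below one rejected at stage 18, giving evolution a gradient to follow even before any object is actually detected.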
The results obtained when evolving images containing single objects confirm previous work in this field. In all runs and for all classifiers, evolution was able to produce images where the object was detected (see Figure 3). Also in line with previous results in the area, although all runs evolved images where the object in question was detected, the visibility of these objects to a human observer is questionable in some cases (see Figure 4).
We then focused on the evolution of images containing faces and flowers simultaneously, without enforcing an overlap between the regions where these objects were identified. As can be observed in Figure 5, some of the evolved images depict the same type of optical illusion as Rubin's vase. As such, we can state that in some of the evolutionary runs the algorithm evolved images that are ambiguous from both computational and human perspectives, in the sense that both the computer and a human observer can simultaneously recognize a face and a flower in the same region.
In our third experiment, the overlap between the regions where faces and flowers are detected becomes a requirement. Figure 6 shows the evolution of the fitness of the best individual. An analysis of the resulting images reveals that although the majority of the runs evolved images where both objects were detected in the same window, which can, as such, be considered computationally ambiguous, most of the images found are not evocative of both objects (see Figure 7). Nevertheless, in some cases, images that are also ambiguous from a human perspective were evolved (see Figure 8).
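The overlap requirement of this third experiment can be sketched as a check over the detection windows returned by the two classifiers. The intersection-over-union measure and the 0.5 threshold below are assumptions for illustration; the paper does not specify the exact overlap criterion used.

```python
def window_overlap(a, b):
    """Intersection-over-union of two detection windows (x, y, w, h)."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    # Width and height of the intersection rectangle (0 if disjoint).
    ix = max(0, min(ax + aw, bx + bw) - max(ax, bx))
    iy = max(0, min(ay + ah, by + bh) - max(ay, by))
    inter = ix * iy
    union = aw * ah + bw * bh - inter
    return inter / union if union else 0.0

def is_computationally_ambiguous(face_windows, flower_windows, threshold=0.5):
    """True if some face window and some flower window overlap enough,
    i.e. both objects are detected in (roughly) the same image region."""
    return any(window_overlap(f, g) >= threshold
               for f in face_windows for g in flower_windows)
```

A requirement like this can also be folded into fitness directly, rewarding candidate images in proportion to the best overlap achieved rather than applying a hard threshold.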
The experimental results demonstrate that it is possible to evolve ambiguous images. They also highlight the differences between computational and human ambiguity. Although the evolution of computationally ambiguous images was frequent, only a portion of these images induce multistable perception in humans. The results obtained so far are not on the same level as human-designed ambiguous images. Nevertheless, we consider them inspiring. They also demonstrate the feasibility of the approach and open new avenues for research. The next steps will be the following: perform experiments considering a wider set of object classes; further explore the evolution of images with partial and total overlap of object detectors; and explore the generation and evolution of ambiguous tiling patterns.
P. Machado, A. Vinhas, J. Correia, and A. Ekárt, “Evolving Ambiguous Images,” in Proceedings of the Twenty-Fourth International Joint Conference on Artificial Intelligence, IJCAI 2015, Buenos Aires, Argentina, July 25-31, 2015, 2015, pp. 2473-2479.
J. Correia, P. Machado, J. Romero, and A. Carballal, “Evolving Figurative Images Using Expression-Based Evolutionary Art,” in Proceedings of the fourth International Conference on Computational Creativity (ICCC), 2013, pp. 24-31.
P. Machado, J. Correia, and J. Romero, “Expression-Based Evolution of Faces,” in Evolutionary and Biologically Inspired Music, Sound, Art and Design – First International Conference, EvoMUSART 2012, Málaga, Spain, April 11-13, 2012. Proceedings, 2012, pp. 187-198.
P. Machado, J. Correia, and J. Romero, “Improving Face Detection,” in Genetic Programming – 15th European Conference, EuroGP 2012, Málaga, Spain, April 11-13, 2012. Proceedings, 2012, pp. 73-84.
This research is partially funded by: Fundação para a Ciência e Tecnologia, Portugal, under the grant SFRH/BD/90968/2012; and the project ConCreTe. The project ConCreTe acknowledges the financial support of the Future and Emerging Technologies (FET) programme within the Seventh Framework Programme for Research of the European Commission, under FET grant number 611733.