September 2021 - June 2022

Conditional GANs for IACT simulation

Bachelor's thesis

Abstract

Deep learning is an area of machine learning that has experienced enormous growth in recent years and whose applications are revolutionizing very diverse domains, including science. This revolution has also reached physics and, among its branches, high-energy physics. So far, supervised learning models (for regression and classification) have shown great efficiency in massive data analysis. An example is the CTLearn project, a Python library focused on the analysis of images from imaging atmospheric Cherenkov telescopes (IACTs). However, training these models, and more generally characterizing the response of IACTs, requires a large volume of labeled data, which so far is obtained through expensive Monte Carlo simulations. To reduce the computational cost and increase the realism of the simulated data, this project studies the possibility of replacing such simulations with a generative deep learning model. In particular, the potential of so-called generative adversarial networks (GANs) for this purpose is assessed. Using an AC-GAN approach, that is, conditioning the GANs with the help of an auxiliary model, the aim is to replicate a set of simulated data from their labels (particle type, energy, and arrival direction). The level of realism of the images and their degree of correlation with their labels are evaluated using CTLearn models on a validation data set. As a result, image generation times several orders of magnitude shorter and a promising degree of realism are obtained, but an insufficient correlation with the labels, something that could be solved by better tuning of the model configuration.

Introduction

In recent decades, enormous advances in computational power and data collection have led to the current revolution in deep learning, an area of machine learning that makes use of large learning models and massive amounts of data to solve all kinds of problems in a wide variety of fields, including high-energy physics. So far, deep learning has been especially useful in regression and classification tasks. An example is the CTLearn project, a Python library focused on the analysis of IACT images.

More recently, due to their impressive results in other areas, the potential of so-called generative deep learning models, those that do not make predictions about the data but instead learn its distribution, has begun to be explored in physics.

As for the motivation of this work: characterizing the response of IACTs and training CTLearn models require large labeled data sets, which so far are obtained through Monte Carlo simulations. Two major drawbacks of these simulations are their large computational cost and their limited degree of realism. In this project, we study the possibility of employing a generative deep learning model, in particular so-called generative adversarial networks (GANs), to address either or both of these problems. This approach has been widely studied in high-energy physics, especially in the context of particle accelerators.

The goal of this project is to implement a basic model of conditional GANs, which we will introduce later, and evaluate its ability to generate labeled images of IACTs.

Theoretical framework

Machine learning is a branch of artificial intelligence whose goal is to make computers learn, that is, to improve their performance on a given task from data. Three types of machine learning are usually distinguished: supervised learning, which makes use of labeled data to train regression and classification models; unsupervised learning, which trains models on unlabeled data to discover patterns in them; and reinforcement learning, where training is based on a system of rewards for the model's actions.

In turn, deep learning is an area of machine learning characterized by the use of large learning models, based on artificial neural networks, which are trained with massive amounts of data. Neural networks, in their simplest form, are supervised learning models consisting of \(L\) layers, each of which applies a linear operation, given by a matrix \(W\) and a vector \(b\) that constitute the parameters of the model, followed by a nonlinear activation function \(g\). Training is carried out by optimizing a cost function \(J\) that measures the discrepancy between the model predictions, \(\hat{y}\), and the expected values, \(y\), given by the training data labels.
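As an illustration, the layer-by-layer forward pass described above can be sketched in a few lines of NumPy; the layer sizes, the ReLU activation, and the random parameters are arbitrary choices for the example, not those of any model in this project:

```python
import numpy as np

def relu(x):
    # Nonlinear activation g: element-wise max(0, x)
    return np.maximum(0.0, x)

def forward(x, params):
    # Each layer applies the linear map W @ a + b, then the activation g
    a = x
    for W, b in params:
        a = relu(W @ a + b)
    return a

# Hypothetical 2-layer network with sizes 4 -> 3 -> 1
rng = np.random.default_rng(0)
params = [(rng.standard_normal((3, 4)), np.zeros(3)),
          (rng.standard_normal((1, 3)), np.zeros(1))]
y_hat = forward(rng.standard_normal(4), params)

# A cost function J, here the mean squared error against a label y
y = np.array([1.0])
J = np.mean((y_hat - y) ** 2)
```

Training would then adjust each \(W\) and \(b\) to reduce \(J\), typically by gradient descent on its derivatives with respect to the parameters.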

Generative modeling is an unsupervised learning task that consists of taking a training data set that follows an unknown distribution, \(p_{\text{data}}\), and learning to represent an estimate, \(p_{\text{model}}\), of that distribution, sometimes implicitly by generating samples.

Generative adversarial networks were introduced in 2014 as an architecture, or strategy, for training a generative model. What defines this technique is that it recasts this unsupervised learning task as a supervised one by making use of two adversarial submodels (usually given by neural networks):

  • The generator \(G\): the generative model to be trained to produce new realistic samples. The generator takes as a seed a random latent vector, \(z\), drawn from a simple distribution, \(p_z\), from which it generates a new sample, \(\tilde{x}\), from a distribution, \(p_G\), that tries to approximate the distribution of the real data, \(p_{\text{data}}\).
  • The discriminator \(D\): an auxiliary model introduced to give the problem a supervised learning formulation. Its objective is to discern between real samples (from the training data set) and generated samples. The discriminator's cost function is calculated from its predictions on both the real and the generated images, while the generator's cost function is calculated only from the predictions on the generated images. How well or poorly the generator's samples manage to fool the discriminator at each training step serves as feedback for the generator's training.
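A minimal sketch of these two cost functions, using the standard binary cross-entropy and hypothetical discriminator outputs (the actual submodels in this project are neural networks; here only the loss computation is shown):

```python
import numpy as np

def bce(p, target):
    # Binary cross-entropy for predicted probabilities p in (0, 1)
    return -np.mean(target * np.log(p) + (1 - target) * np.log(1 - p))

# Hypothetical discriminator outputs on a batch
d_real = np.array([0.9, 0.8, 0.95])   # D(x), x from the training data
d_fake = np.array([0.2, 0.35, 0.1])   # D(G(z)), z drawn from p_z

# The discriminator is trained on both real and generated samples:
# it should output 1 on real images and 0 on generated ones
loss_D = bce(d_real, np.ones_like(d_real)) + bce(d_fake, np.zeros_like(d_fake))

# The generator is trained only on the predictions for generated samples,
# and is rewarded when the discriminator mistakes them for real
loss_G = bce(d_fake, np.ones_like(d_fake))
```

Each training step alternates: update \(D\) to decrease `loss_D`, then update \(G\) to decrease `loss_G`.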

A potential drawback of the GANs seen so far is that there is no control over the content of the generated data. One way to extend GANs to a conditional format is the so-called AC-GAN (auxiliary classifier GAN). This strategy consists of providing the generator, together with the latent vector, with the labels of the data to be generated (particle type, energy, and arrival direction in our case) and making use of an auxiliary model, which we will call the predictor, to validate that the generated images correspond to these labels. When the auxiliary model is pretrained, the AC-GAN training algorithm is the same, apart from adding to the generator's loss function a contribution from the predictions of the auxiliary model on the generated data. Thus, if, for example, we want to generate images of digits and, specifically, of the digit 8, the label 8 is passed as an additional input to the generator, the discriminator evaluates the realism of the image, and the predictor checks that it indeed corresponds to the label 8.
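A minimal sketch of this conditioning scheme, assuming a mean-squared-error auxiliary term and illustrative shapes for the latent vectors and labels (the names, dimensions, and loss weighting are hypothetical, not those of the actual implementation):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical batch: 8 samples, 64-dim latent space, 4-dim labels
# (particle type, energy, and two arrival-direction coordinates)
z = rng.standard_normal((8, 64))
labels = rng.standard_normal((8, 4))

# The generator receives the latent vector and the labels together
gen_input = np.concatenate([z, labels], axis=1)

def generator_loss(d_fake, pred_labels, true_labels, aux_weight=1.0):
    # Adversarial term: reward fooling the discriminator
    adv = -np.mean(np.log(d_fake))
    # Auxiliary term: penalize disagreement between the (pretrained)
    # predictor's outputs on the generated images and the conditioning labels
    aux = np.mean((pred_labels - true_labels) ** 2)
    return adv + aux_weight * aux
```

The relative weight of the auxiliary term is one of the configuration choices that has to be tuned.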

Another important aspect about GANs is their evaluation methods. In the case of GANs, the value of the cost function is not necessarily correlated with better results, and even when it seems to be, it does not allow comparing the performance of different models. The problem is that it is difficult to define a function that evaluates the quality of the results, that is, their realism, their variety, etc. Although there are many proposals in this regard, there is no total consensus. For this reason, the visual evaluation of the results is usually taken as a starting point and specific metrics are often defined for the particular domain of each problem.

Methodology

The first step in the project was to implement the proposed architecture, AC-GANs, on a small scale and train it on a small, simple dataset that is easy to evaluate visually, such as MNIST. The objective was to perform a quick first validation of the idea before applying it to the real problem. The results were highly satisfactory and can be consulted, along with the code, in its repository on GitHub.

The next step was to extrapolate this implementation to the IACT problem, adapting the code to the new dataset and integrating the CTLearn models as predictors. The model, which has its own repository on GitHub, is programmed in such a way that its configuration can be defined through an external file, which facilitates experimentation and integration with other datasets and predefined models.

Once the model itself was programmed, the next step was to select a model configuration. For this purpose, a set of simulated training data from the MAGIC telescopes was chosen. Due to the huge number of possible parameter combinations, the lack of a metric to optimize, and the high cost of training the model, a systematic sweep of all relevant model configurations was unfeasible. Instead, a semi-random search was carried out, a strategy widely used in this field. The selection method consisted primarily of periodic visual evaluation, which in the vast majority of cases was sufficient to discard the worst configurations. Finally, after a model configuration had been selected, the model had to be evaluated. The chosen evaluation method is as follows. First, a single data set is chosen and divided into 80% for training and 20% for validation. Subsequently, the training set is used to train:

  • First, GANs for gamma-induced cascades and others for proton-induced cascades, in both cases conditioned on the energy and arrival direction of the particle that generates them.
  • Secondly, a classifier that distinguishes between gamma rays and protons.
  • And lastly, an energy regression model and an arrival-direction regression model for each type of particle.
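The semi-random configuration search mentioned above can be sketched as follows; the option names and value grids are hypothetical, not the project's actual configuration keys:

```python
import random

# Hypothetical search space; the key names are illustrative only
search_space = {
    "latent_dim": [64, 128, 256],
    "learning_rate": [1e-3, 2e-4, 5e-5],
    "batch_size": [32, 64, 128],
    "aux_loss_weight": [0.1, 1.0, 10.0],
}

def sample_config(space, rng):
    # Draw one value per option, independently and uniformly
    return {key: rng.choice(values) for key, values in space.items()}

rng = random.Random(0)
candidates = [sample_config(search_space, rng) for _ in range(5)]
```

Each sampled candidate is then trained briefly and kept or discarded based on periodic visual evaluation of its generated images.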

Finally, the trained GANs are used to generate images corresponding to the same labels as the validation data set, and their quality is evaluated according to the degree of realism predicted by the discriminator and their correlation with their labels according to the classifier and the two regression models above. In addition, to check whether the GANs represent a gain in computational cost, the time spent generating the validation data from their labels is measured.
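The timing measurement amounts to dividing the wall-clock time of a batched generation call by the batch size; a minimal sketch, with a stand-in generator in place of the trained one:

```python
import time

def per_image_generation_time(generate, label_batch):
    # Wall-clock time of one batched call, divided by the batch size
    start = time.perf_counter()
    images = generate(label_batch)
    elapsed = time.perf_counter() - start
    return images, elapsed / len(label_batch)

# Usage with a dummy generator (the real one maps labels to images)
images, t_per_image = per_image_generation_time(
    lambda batch: [None] * len(batch), list(range(1000)))
```

Batching matters here: per-image times shrink as the fixed call overhead is amortized over larger batches.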

Results

Here are some examples of simulated images (on top) and the generated images (on the bottom) corresponding to the same labels, both for gamma rays (on the left) and protons (on the right).

At first glance, although there is room for improvement, the degree of realism seems acceptable. This idea is supported by the discriminator's estimate of the so-called Wasserstein-1 distance between the simulated images and those generated for the validation data.

To assess the degree of correlation with the labels, we study the evolution of three metrics evaluated on the predictions that the CTLearn models make on the validation data. Specifically, we study the fraction of images correctly identified with the particle type specified by their labels, and the mean absolute error of the CTLearn models' predictions of the energy and arrival direction of the images with respect to their labels. While there is some degree of correlation, it could likely be improved considerably with better tuning of the model configuration.
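In a minimal sketch, the two kinds of label-correlation metrics named above (classification accuracy and mean absolute error) amount to the following; the arrays stand in for CTLearn predictions and conditioning labels:

```python
import numpy as np

def conditioning_metrics(pred_class, true_class, pred_energy, true_energy):
    # Fraction of generated images identified as their conditioning particle type
    accuracy = np.mean(pred_class == true_class)
    # Mean absolute error of the energy predictions w.r.t. the conditioning labels
    mae_energy = np.mean(np.abs(pred_energy - true_energy))
    return accuracy, mae_energy

# Illustrative values only
acc, mae = conditioning_metrics(np.array([0, 1, 1, 0]), np.array([0, 1, 0, 0]),
                                np.array([1.0, 2.0]), np.array([1.5, 2.5]))
```

The arrival-direction error is computed analogously to the energy error, as a mean absolute error per angular coordinate.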

Finally, the generation time per image is on the order of microseconds. In contrast, the average simulation time of an event (consisting of one image per detector) is about one second, depending on what is being simulated and the available computing power. In other words, the two times differ by five to six orders of magnitude.

Conclusions

In conclusion, the results of this project should not be understood as proof of effectiveness, but as first empirical evidence of the potential of GANs for the conditional simulation of IACT images, in particular, for obtaining more realistic labeled images in less time than with Monte Carlo simulations.

Future lines of work include improving the results evaluation method to better understand the performance of the generator; using real training data to compare the realism of the generated data with that of the simulated data; and refining the generation process and generalizing it to telescope arrays.
