Google Brain Releases Conceptual Activation Vectors to Understand How Neural Networks Think

Produced by Big Data Digest

Compiled by Ke Li, Qiuyue Zhang, and Junhuan Liu

Interpretability remains one of the biggest challenges for modern deep learning applications. Recent advances in computational modeling and deep learning research allow us to create extremely complex models, with thousands of hidden layers and tens of millions of neurons. It is relatively simple to build cutting-edge deep neural networks that achieve impressive results, but understanding how these models create and use knowledge remains a challenge.

Recently, researchers on the Google Brain team published a paper proposing a new approach called Concept Activation Vectors (CAVs), which provides a new perspective on the interpretability of deep learning models.

Understanding the CAV technique requires understanding the nature of the interpretability problem in deep learning models. In the current generation of deep learning technology, there is a permanent tension between model accuracy and interpretability. The interpretability-accuracy trade-off sits between the ability to accomplish complex knowledge tasks and the ability to understand how those tasks were accomplished. Knowledge versus control, performance versus verifiability, efficiency versus simplicity... each of these choices is really a trade-off between accuracy and interpretability.

Do you care about getting the best results, or do you care about how the results were produced? This is a question that data scientists need to answer in every deep learning scenario. Many deep learning techniques are inherently complex, and while they are accurate in many scenarios, explaining them is very difficult. If we plot some of the most famous deep learning models in an accuracy-interpretability chart, we get the following:

Interpretability in deep learning models is not a single concept. We can understand it at multiple levels:

Achieving the interpretability defined at each of the levels in the diagram above requires several basic building blocks. In a recent paper, researchers at Google outlined what they consider to be the basic building blocks of interpretability.

Google summarizes several principles of interpretability as follows:

- Understanding the role of hidden layers: Most of the knowledge in a deep learning model is formed in the hidden layers. Understanding the function of the different hidden layers at a macro level is critical to explaining a deep learning model.

- Understanding how nodes are activated: The key to interpretability is not to understand the function of individual neurons, but to understand groups of interconnected neurons that fire together in the same spatial location. Segmenting a network into clusters of interconnected neurons lets us understand its function at a simpler level of abstraction.

- Understanding how concepts are formed: Understanding how a deep neural network forms the individual concepts that are then assembled into the final output is another key building block of interpretability.

These principles are the theoretical foundation behind Google's new CAV technology.

Following the ideas discussed above, a natural way to think about interpretability is to characterize a deep learning model's predictions in terms of its input features. Logistic regression classifiers are a prime example: their coefficient weights are usually interpreted as the importance of each feature. However, most deep learning models operate on features, such as pixel values, that do not correspond to high-level concepts easily understood by humans. In addition, a model's internal values (e.g., neuron activations) are opaque. While techniques such as saliency maps can effectively measure the importance of specific pixel regions, they cannot be related to higher-level concepts.

The core idea behind CAVs is to measure how relevant a concept is to the model's output. A CAV for a concept is a vector in the direction of the values (e.g., activations) of that concept's set of examples. In the paper, the Google research team describes a linear interpretability method called Testing with CAVs (TCAV), which uses directional derivatives to quantify how sensitive a prediction is to a high-level concept represented by a CAV. The team framed TCAV around four goals:

- Ease of understanding: Users need little to no machine learning expertise.

- Personalization: Adapts to any concept (e.g., gender) and is not limited to the concepts seen during training.

- Plug-and-play: Works without retraining or modifying the machine learning model.

- Global quantification: A single quantitative measure can interpret entire classes or sets of examples, rather than just a single data input.

To achieve the above goals, the TCAV approach is divided into three basic steps:

1) Define relevant concepts for the model.

2) Understand the sensitivity of the predictions to these concepts.

3) Infer a global quantitative explanation of the relative importance of each concept to each model prediction class.

The first step in TCAV is to define a CAV for each concept of interest. To do this, TCAV selects a set of examples that represent the concept, or uses an independent dataset labeled with that concept. The CAV is then learned by training a linear classifier to distinguish the activations produced by the concept's examples from the activations produced by counterexamples in a given layer; the CAV is the vector orthogonal to the classifier's decision boundary.
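To make the first step concrete, here is a minimal sketch of how a CAV could be learned with an off-the-shelf linear classifier, assuming the layer activations for the concept's examples and for random counterexamples have already been collected (the function and variable names here are illustrative, not the official tensorflow/tcav API):

```python
# Learn a CAV at one layer: train a linear classifier that separates
# concept activations from random activations, then take the vector
# orthogonal to its decision boundary as the CAV.
import numpy as np
from sklearn.linear_model import SGDClassifier

def learn_cav(concept_acts, random_acts):
    X = np.vstack([concept_acts, random_acts])           # (n_examples, n_units)
    y = np.concatenate([np.ones(len(concept_acts)),      # 1 = concept examples
                        np.zeros(len(random_acts))])     # 0 = random counterexamples
    clf = SGDClassifier(loss="hinge", alpha=0.01, max_iter=1000, tol=1e-3)
    clf.fit(X, y)
    cav = clf.coef_.flatten()                            # normal to the decision boundary
    return cav / np.linalg.norm(cav)                     # unit-length concept vector

# Usage (illustrative): activations collected at some layer of the model
# cav_striped = learn_cav(acts_striped_images, acts_random_images)
```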

The second step is to compute a TCAV score that quantifies the sensitivity of the prediction to a particular concept. TCAV does this with directional derivatives, measuring how sensitive the model's prediction for a class is to changes in a layer's activations in the direction of the concept.
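As a rough illustration of how such a score could be computed, the sketch below assumes we already have, for each example of a class (say, "zebra"), the gradient of that class's logit with respect to the chosen layer's activations; the TCAV score is then the fraction of examples whose directional derivative along the CAV is positive (variable names are illustrative):

```python
import numpy as np

def conceptual_sensitivity(grad_at_layer, cav):
    # Directional derivative of the class logit along the concept direction:
    # the dot product of the layer gradient with the (unit) CAV.
    return float(np.dot(grad_at_layer, cav))

def tcav_score(grads_for_class, cav):
    # Fraction of the class's examples for which nudging the activations
    # toward the concept would increase the class logit.
    sensitivities = grads_for_class @ cav                # (n_examples,)
    return float(np.mean(sensitivities > 0))

# Usage (illustrative): one gradient row per zebra image at the chosen layer
# score = tcav_score(grads_zebra, cav_striped)
```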

The final step assesses the global relevance of the learned CAVs, to avoid relying on meaningless ones. After all, one drawback of TCAV is that it can learn meaningless CAVs: a CAV can be produced from a randomly chosen set of images, and a test against such a random concept is unlikely to be meaningful. To deal with this, TCAV introduces a statistical significance test that evaluates a CAV across many training runs (typically 500). The underlying idea is that a meaningful concept should yield consistent TCAV scores across runs.
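As a sketch of that significance test (with assumed helper names, not the official API), one could collect TCAV scores from many concept-vs-random trainings, collect scores from random-vs-random trainings as a baseline, and run a two-sided t-test between the two sets; the concept is kept only if the null hypothesis of "no difference from chance" is rejected:

```python
from scipy import stats

def is_concept_significant(concept_scores, random_scores, alpha=0.05):
    # concept_scores: TCAV scores from repeated trainings of the concept's CAV
    #                 against different random counterexample sets.
    # random_scores:  TCAV scores obtained from purely random CAVs (baseline).
    _, p_value = stats.ttest_ind(concept_scores, random_scores)
    return p_value < alpha    # significant => the concept is likely meaningful

# Usage (illustrative numbers):
# is_concept_significant([0.91, 0.88, 0.93, 0.90], [0.48, 0.55, 0.51, 0.46])
```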

The team conducted several experiments to evaluate the efficiency of TCAV compared to other interpretability methods. In one of the most striking tests, the team used saliency maps to assess whether the image or its caption was more relevant to a model predicting the concept "cab". The resulting saliency maps are shown below:

Using these images as a test dataset, the Google Brain team ran an experiment with 50 workers on Amazon Mechanical Turk. Each worker performed a series of six tasks (3 object classes × 2 types of saliency maps), all against a single model, in randomized order.

In each task, workers were first shown four images and their corresponding saliency masks. They were then asked to rate how important the image was to the model (on a 10-point scale), how important the caption was to the model (on a 10-point scale), and how confident they were in their answers (on a 5-point scale). In total, the workers rated 60 unique images (120 unique saliency maps).

The ground truth of the experiment was that the image concept was more relevant than the caption concept. However, when looking at saliency maps, people perceived the caption concept as more important (for the model with 0% noise) or could not discern a difference (for the model with 100% noise). In contrast, TCAV correctly indicated that the image concept was more important.

TCAV is one of the most innovative approaches to neural network interpretability in recent years. The initial code is available on GitHub, and we are likely to see some of these ideas adopted by mainstream deep learning frameworks in the near future.

Related story:

/this-new-google-technique-help-us-understand-how-neural-networks-are-thinking-229f783300