Basic knowledge

What is a deep learning framework?

1 Introduction

People talk about deep learning frameworks all the time, and I have been experimenting with TensorFlow recently, but my understanding of what a framework actually is remains shallow. How does one work internally?

2 Notes

Deep learning frameworks such as Caffe and TensorFlow are tools for doing deep learning. Put simply, they are libraries: to use them in a program you import them, for example with import caffe or import tensorflow. As an analogy, a deep learning framework is a branded set of building blocks. Each block is a component of a model or algorithm, and you decide how to assemble the blocks into something that fits your data set. The advantage is that you do not have to reinvent the wheel: the model components are ready-made blocks that you can assemble directly. How you assemble them, which depends on your data set, is up to you.

3 Application Advantages

The emergence of deep learning frameworks lowers the barrier to entry. You do not need to code a complex neural network from scratch. You can take existing models and model parameters and train them for your own needs, add your own layers to an existing model, or pick the classifier and optimization algorithm you need on top (for example, the commonly used gradient descent method). For the same reason, no framework is perfect: just as a set of building blocks may lack the particular block you need, different frameworks do not cover exactly the same application areas. In general, a deep learning framework provides a set of deep learning components, including implementations of the common algorithms. To use a new algorithm, users define it themselves and then plug it in through the framework's function interface.
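The building-block idea can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: dense, relu, my_custom_scale and sequential are hypothetical names invented here, not the API of Caffe, TensorFlow or any other framework.

```python
import numpy as np

# Hypothetical "building blocks": every block maps an array to an array.
def dense(weights, bias):
    """Return a fully connected block computing x @ weights + bias."""
    def layer(x):
        return x @ weights + bias
    return layer

def relu(x):
    """A ready-made activation block: clamp negative values to zero."""
    return np.maximum(x, 0.0)

def my_custom_scale(x):
    """A user-defined block that the imaginary framework does not ship."""
    return 0.5 * x

def sequential(*layers):
    """Assemble blocks into a model by applying them in order."""
    def model(x):
        for layer in layers:
            x = layer(x)
        return x
    return model

# Mix ready-made blocks with the user-defined one.
model = sequential(
    dense(np.array([[1.0, -1.0], [2.0, 0.5]]), np.array([0.0, 1.0])),
    relu,
    my_custom_scale,
)
output = model(np.array([[1.0, 2.0]]))
print(output)
```

Because every block shares the same interface (array in, array out), the user-defined block slots into the model exactly like the ready-made ones.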

4 About Components

Most deep learning frameworks contain the following five core components:

1. Tensor

2. Tensor-based operations

3. Computation Graph

4. Automatic Differentiation Tool

5. Extension packages such as BLAS, cuBLAS and cuDNN

Here is a brief explanation of these core components.

4.1 Tensor

The tensor is the core component of every deep learning framework, because all subsequent operations and optimization algorithms are built on tensors. The definition of a tensor in geometric algebra is a generalization of vectors and matrices. Put simply, a scalar can be regarded as a zero-order tensor, a vector as a first-order tensor, and a matrix as a second-order tensor.
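As a small illustration, NumPy arrays can stand in for tensors here (an assumption of this sketch; each framework has its own tensor type). The ndim attribute of an array reports what the text calls the order.

```python
import numpy as np

scalar = np.array(3.14)                      # zero-order tensor
vector = np.array([1.0, 2.0, 3.0])           # first-order tensor
matrix = np.array([[1.0, 2.0], [3.0, 4.0]])  # second-order tensor

# ndim is the number of axes, i.e. the order of the tensor.
orders = [t.ndim for t in (scalar, vector, matrix)]
shapes = [t.shape for t in (scalar, vector, matrix)]
print(orders)
print(shapes)
```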

For example, any RGB color image can be represented as a third-order tensor whose three dimensions are the height, the width and the color data of the image. The picture below shows an ordinary picture of fruit. Split by the three RGB primary colors, it becomes three grayscale images, one each for red, green and blue. Written in tensor form, this representation is the table at the bottom of the picture.

[Figure: a fruit picture, its red, green and blue channel images, and the corresponding tensor table]

The table shows only the first five rows and 320 columns of the data. Each entry represents one pixel, and its value, such as [1.0, 1.0, 1.0], is the color of that pixel. Taking [1.0, 0, 0] as red, [0, 1.0, 0] as green, and [0, 0, 1.0] as blue, the value [1.0, 1.0, 1.0] is white, so the first five rows are all white, as the figure shows.
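The pixel layout described above can be reproduced with a NumPy array. This is a sketch that assumes a (height, width, color) axis order and float values in [0, 1]; the 5 by 320 size mirrors the rows and columns mentioned in the text, not the real fruit image.

```python
import numpy as np

height, width = 5, 320
# Third-order tensor: every pixel is [1.0, 1.0, 1.0], which is white.
image = np.ones((height, width, 3), dtype=np.float32)

# Paint the top-left pixel pure red for contrast.
image[0, 0] = [1.0, 0.0, 0.0]

# Split into three single-channel (grayscale) images.
red, green, blue = image[..., 0], image[..., 1], image[..., 2]

print(image.ndim, image.shape)  # order of the tensor and its dimensions
print(red.shape)                # each channel is a second-order tensor
```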

Extending this definition, a fourth-order tensor can represent a data set of many pictures. Its four dimensions are the number of pictures in the data set, the height and the width of each picture, and the color data.
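A minimal sketch of such a data-set tensor, again using NumPy arrays as stand-ins. The (count, height, width, color) axis order follows the text; some frameworks expect the color axis second instead, so the transpose at the end is shown only as one possible conversion.

```python
import numpy as np

# Four placeholder images, each a third-order (height, width, color) tensor.
images = [np.zeros((5, 320, 3), dtype=np.float32) for _ in range(4)]

# Stack along a new leading axis to get one fourth-order tensor.
batch = np.stack(images, axis=0)
print(batch.shape)  # (count, height, width, color)

# Reorder the axes to (count, color, height, width) if a library requires it.
channels_first = np.transpose(batch, (0, 3, 1, 2))
print(channels_first.shape)
```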

Abstracting all kinds of data into tensors before feeding them to a neural network model is a necessary and efficient strategy. Without this step, we would have to define separate operations for every way of organizing data, which wastes a great deal of developer effort. Just as importantly, once processing is finished, the tensor can easily be converted back to the desired format. For example, in Python, images are commonly handled as NumPy arrays: matplotlib.pyplot.imread reads an image file into an array (the tensor object in code), and matplotlib.pyplot.imsave writes an array back to an image file.
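The round trip between image data and tensors can be sketched without touching the file system. The helper names pixels_to_tensor and tensor_to_pixels are hypothetical, and the sketch assumes 8-bit pixel data of the kind an image-reading library would typically return.

```python
import numpy as np

def pixels_to_tensor(pixels):
    """Convert 8-bit pixel data (0..255) to a float32 tensor in [0, 1]."""
    return pixels.astype(np.float32) / 255.0

def tensor_to_pixels(tensor):
    """Convert a float tensor in [0, 1] back to 8-bit pixel data."""
    return np.rint(np.clip(tensor, 0.0, 1.0) * 255.0).astype(np.uint8)

# A tiny 2x2 RGB image: red, green, blue and white pixels.
pixels = np.array([[[255, 0, 0], [0, 255, 0]],
                   [[0, 0, 255], [255, 255, 255]]], dtype=np.uint8)

tensor = pixels_to_tensor(pixels)     # ready for numeric processing
restored = tensor_to_pixels(tensor)   # back to a saveable format
print(tensor.dtype, restored.dtype)
```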

4.2 Tensor-based operations

Once we have tensor objects, the next step is the set of mathematical operations and transformations applied to them.


In fact, an entire neural network can be viewed simply as a series of operations applied to an input tensor to achieve some goal. The so-called "learning" is the process of correcting the error between the network's actual output and the expected output. The range of operations is very wide: from simple matrix multiplication to somewhat more complex operations such as convolution, pooling and LSTM. The tensor operations each framework supports also differ, so consult the official documentation for details (the links below are for NumPy, Theano and TensorFlow).

NumPy: http://www.scipy-lectures.org/intro/numpy/operations.html

Theano: http://deeplearning.net/software/theano/library/tensor/basic.html

TensorFlow: https://www.tensorflow.org/api_docs/python/math_ops/
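To make the idea of "learning as error correction" concrete, here is a minimal NumPy sketch of a one-layer linear model trained with gradient descent. The data, the learning rate and the hand-derived gradient are all assumptions made for illustration; a real framework would typically compute the gradient automatically.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(64, 3))               # input tensor: 64 samples, 3 features
true_w = np.array([[2.0], [-1.0], [0.5]])  # weights we hope to recover
y = x @ true_w                             # expected output

w = np.zeros((3, 1))                       # model parameters, initially wrong
learning_rate = 0.1
losses = []

for step in range(200):
    prediction = x @ w                     # forward pass: one matrix multiplication
    error = prediction - y                 # actual output minus expected output
    losses.append(float(np.mean(error ** 2)))
    gradient = 2.0 * x.T @ error / len(x)  # gradient of the mean squared error
    w -= learning_rate * gradient          # correct the parameters

print(losses[0], losses[-1])
print(w.ravel())
```

Each pass through the loop is just a handful of tensor operations, and the loss shrinks as the parameters are corrected.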

Note that most tensor operations are implemented as classes (often abstract ones) rather than as plain functions, probably because most deep learning frameworks are built in an object-oriented style.
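A class-based operation might look like the following sketch. Operation, MatMul and Add are hypothetical names chosen for illustration and do not correspond to the classes of any particular framework.

```python
from abc import ABC, abstractmethod

import numpy as np

class Operation(ABC):
    """Hypothetical abstract base class for a tensor operation."""

    @abstractmethod
    def forward(self, *inputs):
        """Compute the output tensor from the input tensors."""

    def __call__(self, *inputs):
        # Calling the object runs the operation.
        return self.forward(*inputs)

class MatMul(Operation):
    """Matrix multiplication of two tensors."""
    def forward(self, a, b):
        return a @ b

class Add(Operation):
    """Element-wise addition of two tensors."""
    def forward(self, a, b):
        return a + b

a = np.array([[1.0, 2.0], [3.0, 4.0]])
identity = np.eye(2)
result = Add()(MatMul()(a, identity), np.ones((2, 2)))
print(result)
```

Keeping each operation as an object leaves room for extra state and methods alongside the forward computation, which a plain function would not carry as naturally.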

