Deep learning is, of course, the most closely watched and most visibly successful area right now, having made great progress in speech and image recognition. Several people have already answered this beautifully, but I couldn't resist laying out my own understanding, so let's discuss it together.
In the spirit of scholars making simple questions complicated... ah, pah, I mean in the spirit of argumentative completeness, I think we can tackle this question from the following three angles:
1. Why is deep learning suddenly on fire?
2. Why should deep learning be applied to speech recognition and image recognition?
3. Why has deep learning been applied so successfully to speech and image recognition, to the point of breakthroughs?
So that more friends interested in deep learning can follow along, I'll do my best to explain my views in plain language. (Throughout, I'll assume you already roughly know what deep learning and neural networks are and the basic principles of neural networks; I'll also assume you've browsed the other answers.)
================ I am the dividing line ================
1. Why is deep learning suddenly on fire?
If this question had been asked five or six years ago, many people would surely have credited Hinton's paper "Reducing the Dimensionality of Data with Neural Networks", published in Science.
Although neural networks "claim" to be able to fit any function and to mimic how the human brain works, all of that rests on the network being deep enough and large enough; without scale, a shallow network can do very little. In practice, however, optimizing a multilayer neural network turned out to be a highly non-convex problem: when the network has too many layers, training struggles to converge, or converges only to a poor local optimum whose performance is worse than a shallow one- or two-layer model. This serious problem directly led to the eventual decline of neural network methods.
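To make the "deep networks are hard to train" point concrete, here is a minimal NumPy sketch of one well-known symptom, the vanishing gradient (the layer count, width, and weight scale are made-up toy values): push a gradient backward through a stack of sigmoid layers and its norm shrinks roughly geometrically with depth.

```python
import numpy as np

# Toy demonstration of the vanishing gradient in a deep sigmoid network.
# All sizes and scales here are made-up illustration values.
rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

n_layers, width = 20, 64
weights = [rng.standard_normal((width, width)) * 0.1 for _ in range(n_layers)]

# Forward pass, caching each layer's output for the backward pass.
activations = [rng.standard_normal(width)]
for W in weights:
    activations.append(sigmoid(W @ activations[-1]))

# Backward pass: each layer multiplies the gradient by W^T and by
# sigmoid'(z) = a * (1 - a) <= 0.25, so the norm tends to shrink
# geometrically with depth.
grad = np.ones(width)
for W, a in zip(reversed(weights), reversed(activations[1:])):
    grad = W.T @ (grad * a * (1.0 - a))
print(f"gradient norm after {n_layers} layers: {np.linalg.norm(grad):.2e}")
```

With twenty layers the surviving gradient is already tiny, which is one concrete reason the early layers of a deep network barely learned anything under plain BP.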
In his 2006 paper, Hinton proposed pre-training with RBMs: use a specific structure to initialize the network to a "good" starting point, then fall back on the traditional training method (backpropagation, BP). Deep networks obtained this way turned out to work well, which to some extent solved the problem that networks could not go deep. Under this framework, deep learning drew people's attention again, and new techniques were invented (denoising autoencoders, Dropout, ReLU, ...), all of which gave neural networks an unprecedented chance to go "deeper".
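For intuition about what pre-training looked like in practice, here is a toy sketch of the greedy layer-wise idea. To keep it short I use tiny denoising autoencoders rather than RBMs (the paragraph above mentions both); every size, learning rate, and noise level is a made-up illustration value, not anything from Hinton's paper.

```python
import numpy as np

# Toy sketch of greedy layer-wise pre-training with denoising autoencoders.
rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def pretrain_dae(data, hidden, epochs=100, lr=0.5, noise=0.3):
    """Train one denoising autoencoder; return the learned encoder weights."""
    n, d = data.shape
    W_enc = rng.standard_normal((d, hidden)) * 0.1
    W_dec = rng.standard_normal((hidden, d)) * 0.1
    for _ in range(epochs):
        corrupted = data * (rng.random(data.shape) > noise)  # masking noise
        h = sigmoid(corrupted @ W_enc)        # encode the corrupted input
        recon = sigmoid(h @ W_dec)            # try to reconstruct the clean input
        d_out = (recon - data) * recon * (1 - recon)      # MSE + sigmoid grad
        d_hid = (d_out @ W_dec.T) * h * (1 - h)
        W_dec -= lr * (h.T @ d_out) / n
        W_enc -= lr * (corrupted.T @ d_hid) / n
    return W_enc

# Greedy stacking: each layer is pre-trained on the previous layer's codes.
X = rng.random((200, 32))                 # stand-in for real training data
weights, inp = [], X
for hidden in [24, 16, 8]:                # made-up layer sizes
    W = pretrain_dae(inp, hidden)
    weights.append(W)
    inp = sigmoid(inp @ W)
# `weights` now initializes a deep net; fine-tuning would proceed with BP.
```

The structure is the whole point: each layer is trained unsupervised on the codes produced by the layers below it, and the stacked weights become the "good" initialization that ordinary BP then fine-tunes.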
Looking back at the question today, though, we have to add two more critical ingredients: big data and high-performance computing.
In today's Internet era, the data accumulated over the past decade or so has been explosive. A few years after 2006, people discovered that with enough data, deep networks achieve very good results even without pre-training. And for convolutional neural networks (CNNs) or LSTMs, pre-training was never as straightforward as for fully connected networks in the first place. A technique that doesn't substantially improve performance, yet demands that researchers agonize over algorithms and programmers grind out code — who would use it? So today's speech and image recognition systems basically skip the pre-training step whenever large numbers of training samples are available.
High-performance computing and big data complement each other. Imagine you have huge amounts of data (millions of images, tens of thousands of hours of speech) but computation can't keep up, so training a single network takes years (anyone who does machine learning knows this is no exaggeration); such research would be pointless, right? This is why some people argue the neural network boom is entirely due to GPUs making computation fast enough. In that sense, the development of GPU parallel computing really has been a major driver of deep learning's popularity.
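The core of the GPU argument is that training time is dominated by large matrix multiplications, which parallelize extremely well. A rough sketch of the gap (this assumes PyTorch and a CUDA-capable GPU are available; the numbers are illustrative, not a benchmark):

```python
import time

import numpy as np
import torch  # assumed available, with a CUDA-capable GPU

n = 4096
a = np.random.rand(n, n).astype(np.float32)
b = np.random.rand(n, n).astype(np.float32)

# CPU: a single large matrix multiply.
t0 = time.perf_counter()
_ = a @ b
print(f"CPU matmul: {time.perf_counter() - t0:.3f}s")

# GPU: same multiply after moving the data over. Warm up first, and
# synchronize so the timing covers the actual computation.
a_gpu, b_gpu = torch.from_numpy(a).cuda(), torch.from_numpy(b).cuda()
_ = a_gpu @ b_gpu
torch.cuda.synchronize()  # warm-up done
t0 = time.perf_counter()
_ = a_gpu @ b_gpu
torch.cuda.synchronize()
print(f"GPU matmul: {time.perf_counter() - t0:.3f}s")
```

Multiply that single-matmul gap by the billions of multiplies in a full training run, and "several years" versus "several days" stops sounding like an exaggeration.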
With big data and high-performance computing as the solid foundation, human ingenuity is inexhaustible. Researchers who believe in deep learning have used all kinds of algorithmic ideas to unlock its potential, such as Microsoft's residual learning [2]; without such ideas, no amount of data could push a traditional neural network to 152 layers.
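As a taste of what residual learning [2] changes, here is a pure-NumPy toy (widths and weight scales are made up; real ResNets use convolutions, batch normalization, and so on): each block computes ReLU(F(x) + x), and the "+ x" identity shortcut lets the signal — and, symmetrically, the gradient — survive a 152-layer stack that would otherwise die out.

```python
import numpy as np

# Toy sketch of the residual idea: out = ReLU(F(x) + x) per block.
rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(0.0, x)

def residual_block(x, W1, W2):
    """Two-layer residual block: out = ReLU((ReLU(x @ W1) @ W2) + x)."""
    return relu(relu(x @ W1) @ W2 + x)   # '+ x' is the shortcut connection

width = 16
n_blocks = 76                            # ~152 weight layers, two per block
x = rng.standard_normal(width)
for _ in range(n_blocks):
    W1 = rng.standard_normal((width, width)) * 0.05
    W2 = rng.standard_normal((width, width)) * 0.05
    x = residual_block(x, W1, W2)
print("signal norm after ~152 layers:", np.linalg.norm(x))
```

Even with near-zero residual branches, the input passes through the identity shortcuts essentially intact, which is exactly why such extreme depth became trainable at all.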