Mathematical statistics: You need enough probability theory and statistical methods to have a basic grasp of questions such as how to calculate a Bayesian probability, or what a probability distribution actually is. Proficiency is not required, but you must understand the relevant background and terminology.
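As an illustration of the kind of Bayesian calculation meant here, the following is a minimal sketch in Python. The scenario and all numbers (a 1% prior, a 95% true-positive rate, a 5% false-positive rate) are made up for the example, and the function name `bayes_posterior` is hypothetical.

```python
def bayes_posterior(prior, likelihood, false_positive_rate):
    """Return P(H | E) using Bayes' theorem.

    prior               -- P(H), probability of the hypothesis before seeing evidence
    likelihood          -- P(E | H), probability of the evidence if H is true
    false_positive_rate -- P(E | not H), probability of the evidence if H is false
    """
    # Total probability of the evidence: P(E) = P(E|H)P(H) + P(E|~H)P(~H)
    evidence = likelihood * prior + false_positive_rate * (1.0 - prior)
    return likelihood * prior / evidence


# Illustrative numbers only: a rare condition and a fairly accurate test.
posterior = bayes_posterior(prior=0.01, likelihood=0.95, false_positive_rate=0.05)
print(round(posterior, 4))
```

With these assumed numbers the posterior is only about 0.16: even after a positive result, the hypothesis is still unlikely because the prior is so small. This is the sort of intuition the background knowledge is meant to provide.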
Interactive data analysis frameworks: This does not mean SQL or database queries as such, but interactive analytics frameworks such as Apache Hive or Apache Kylin. The open source community has many similar frameworks that apply traditional data analysis methods to analyzing or mining big data. I have used both Hive and Kylin. Hive, especially Hive 1, is based on MapReduce, so its performance is not particularly outstanding. Kylin combines the data cube concept with the star schema and can deliver very low-latency analysis. Kylin is also the first Apache incubator project whose main development force is a Chinese team, so it is getting more and more attention.
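The data cube idea can be sketched in a few lines of plain Python: aggregate a measure ahead of time for every combination of dimensions, so that a query becomes a dictionary lookup instead of a scan. This is a toy illustration of the concept only, not Kylin's API or storage format. The fact rows, the dimension names, and the helpers `build_cube` and `query` are all hypothetical, and the star schema is flattened into a single list of rows for brevity.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical, already-joined fact rows: three dimensions and one measure.
FACTS = [
    {"region": "east", "product": "book", "year": 2016, "sales": 10},
    {"region": "east", "product": "pen", "year": 2016, "sales": 4},
    {"region": "west", "product": "book", "year": 2016, "sales": 7},
    {"region": "west", "product": "book", "year": 2017, "sales": 5},
]
DIMENSIONS = ("region", "product", "year")


def build_cube(facts, dimensions, measure):
    """Precompute the summed measure for every subset of the dimensions."""
    cube = {}
    for size in range(len(dimensions) + 1):
        for dims in combinations(dimensions, size):
            totals = defaultdict(int)
            for row in facts:
                key = tuple(row[d] for d in dims)
                totals[key] += row[measure]
            cube[dims] = dict(totals)
    return cube


def query(cube, **filters):
    """Answer an aggregate query by looking up the precomputed result."""
    # Keep the dimension order used when the cube was built.
    dims = tuple(d for d in DIMENSIONS if d in filters)
    key = tuple(filters[d] for d in dims)
    return cube[dims].get(key, 0)


cube = build_cube(FACTS, DIMENSIONS, "sales")
print(query(cube, region="east"))               # 14
print(query(cube, product="book", year=2016))   # 17
print(query(cube))                              # 26 (grand total)
```

The trade-off is visible even in this sketch: the number of precomputed groupings grows as 2^n with the number of dimensions, which buys low query latency at the cost of build time and storage.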
Machine learning frameworks: Machine learning is extremely hot right now, and everyone is talking about machine learning and AI. I have always thought, though, that machine learning today is like cloud computing a few years ago: hot at the moment, but without real projects in production yet, and it may take a few more years to mature. Still, it never hurts to start building up machine learning knowledge now. There are many familiar machine learning frameworks, including TensorFlow, Caffe [8], Keras [9], CNTK [10], and Torch7 [11], with TensorFlow leading the way. I currently recommend picking one of these frameworks to learn. As far as I understand them, however, most of these frameworks conveniently wrap a variety of machine learning algorithms for users, so using them teaches you little about the underlying algorithms. I therefore still recommend starting from the principles of the machine learning algorithms themselves.
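To make "learning from the principles" concrete, here is a minimal sketch of one of the simplest algorithms, linear regression fitted by gradient descent, written in plain Python without any framework. The function name `fit_line`, the learning rate, the epoch count, and the toy data are all assumptions chosen for illustration.

```python
def fit_line(xs, ys, learning_rate=0.05, epochs=2000):
    """Fit y = w * x + b by gradient descent on the mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = 0.0
        grad_b = 0.0
        for x, y in zip(xs, ys):
            error = (w * x + b) - y
            # Partial derivatives of mean((w*x + b - y)^2) w.r.t. w and b
            grad_w += 2.0 * error * x / n
            grad_b += 2.0 * error / n
        # Step against the gradient to reduce the loss
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Toy data generated from y = 2x + 1, so the fit should approach w = 2, b = 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
w, b = fit_line(xs, ys)
print(w, b)
```

A framework would hide the loop, the gradient, and the update rule behind a single call. Writing them out once by hand shows what that call is actually doing.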