2. Data organization

NumPy provides many high-level numerical programming tools, such as matrix data types, vector processing, and a sophisticated arithmetic library. It was built specifically for rigorous numerical processing. It is used by many large financial firms, as well as core scientific organizations such as Lawrence Livermore and NASA, to handle some of the tasks that would otherwise be done in C++, Fortran, or Matlab.

Pandas is built on NumPy and was created to handle data analysis tasks. It draws on a number of libraries and offers standard data models that provide what is needed to manipulate large data sets efficiently. Pandas also provides a large number of functions and methods that let us work with data quickly and easily. As you'll soon discover, it is one of the key factors that make Python a robust and efficient data analysis environment.
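To make the division of labor between the two libraries concrete, here is a minimal sketch. The prices, quantities, and region names are made-up illustration data, not taken from any real data set. It shows NumPy's vector and matrix arithmetic, then wraps the result in a pandas table and summarizes it by group.

```python
import numpy as np
import pandas as pd

# Hypothetical illustration data.
prices = np.array([10.0, 12.5, 11.0, 13.5])
quantities = np.array([3, 1, 4, 2])

# Vector processing: the multiplication applies to every element at once,
# with no explicit Python loop.
revenue = prices * quantities
total_revenue = revenue.sum()

# Matrix data types: a 2x2 matrix multiplied by a vector.
matrix = np.array([[1.0, 2.0], [3.0, 4.0]])
vector = np.array([1.0, 1.0])
product = matrix @ vector

# pandas builds labeled, tabular structures on top of NumPy arrays.
sales = pd.DataFrame({
    "region": ["north", "south", "north", "south"],
    "revenue": revenue,
})
revenue_by_region = sales.groupby("region")["revenue"].sum()

print(total_revenue)
print(product)
print(revenue_by_region)
```

The NumPy part handles the raw arithmetic, while pandas adds labels (column names, group keys) that make summaries such as the per-region total a one-line operation.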
3. Modeling and analysis

Scikit-learn is a must-learn package for anyone doing data analysis and modeling. It collects and organizes the common algorithms for the standard problems in the field, such as classification, regression, clustering, dimensionality reduction, model selection, and feature engineering.
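As one example of the classification problems mentioned above, the following sketch trains a classifier on the iris data set that ships with Scikit-learn. The choice of logistic regression, the 25% test split, and the fixed random seed are assumptions made for illustration; other estimators in the library follow the same fit/predict pattern.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Load the bundled iris data set (no download needed).
X, y = load_iris(return_X_y=True)

# Hold out a quarter of the samples to evaluate the model on unseen data.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Fit a simple classifier; max_iter is raised so the solver converges.
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)

# Measure the fraction of held-out samples classified correctly.
accuracy = accuracy_score(y_test, model.predict(X_test))
print(f"test accuracy: {accuracy:.2f}")
```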
4. Data visualization

When it comes to visualization in Python, Matplotlib is probably the first package that comes to mind. Seaborn is a similar package aimed at statistical visualization.

For beginners teaching themselves Python and wondering how to get started with Python data analysis, the steps above form a basic learning route.
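For the visualization step, a first Matplotlib plot can be as small as the sketch below. The monthly figures and the output file name are invented for illustration, and the "Agg" backend is selected only so that the script runs without a display. Seaborn plots are drawn on the same Matplotlib figures, so the habits learned here carry over.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend: render to a file, not a window
import matplotlib.pyplot as plt

# Hypothetical illustration data.
months = ["Jan", "Feb", "Mar", "Apr"]
sales = [120, 135, 128, 150]

# Create a figure with one set of axes and draw a line chart on it.
fig, ax = plt.subplots()
ax.plot(months, sales, marker="o")
ax.set_xlabel("Month")
ax.set_ylabel("Sales")
ax.set_title("Monthly sales (made-up data)")

# Write the chart to an image file in the current directory.
fig.savefig("monthly_sales.png")
```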