The 2020 Zhejiang Province University Computer Grade III Data Management and Analysis Technology Exam Outline
Exam Objectives
Master the theory and basic applications of relational databases; master the basic concepts of big data, the core ideas of Hadoop and Spark, and distributed computing models; and be able to manage and analyze data using relational databases and big data technologies.
Basic Requirements
1, master the basic concepts of databases;
2, master the relational model, its integrity constraints, and normalization theory within the scope of functional dependencies;
3, be proficient in relational database design methodology, including ER model design, conversion of the ER model to the relational model and its optimization, and logical structure design of the database;
4, be proficient in basic user management, basic privilege management, and the use of basic SQL commands on the MySQL platform;
5, be proficient in designing MySQL stored procedures and triggers, and understand database transactions and concurrency control mechanisms;
6, understand the core ideas and respective characteristics of the big data technologies Hadoop and Spark, and the differences and connections in their functional positioning; understand the core technologies that give Spark its advantages over Hadoop (RDD, DAG, in-memory computing, lazy evaluation);
7, master the HDFS distributed file system and the MapReduce computing model;
8, be proficient in commonly used Linux command-line operations and Hadoop command operations;
9, be proficient in implementing common Hadoop distributed computing tasks with MapReduce;
10, be proficient in using the interactive spark-shell to write distributed machine learning tasks.
Exam Content
I, relational database applications (40%)
1, basic concepts of databases: data, databases and data processing; the composition, structure, and development history of database systems; the three-level schema structure of a database; the meaning of logical and physical data independence.
2, categories of data models in database systems: the characteristics of, and differences among, the hierarchical, network, relational, object-oriented, and NoSQL database models, and typical DBMS products for each.
3, categories of relational database integrity constraints: primary key constraints, foreign key constraints, data type constraints, (NOT) NULL constraints, CHECK constraints; master functional dependencies in relational data theory, and the definitions of and tests for 1NF, 2NF, 3NF, and BCNF.
4, basic application of the Structured Query Language (SQL): data definition language (DDL) for database objects (data types; database creation and deletion; table creation, modification, and deletion; view creation and deletion; index creation and deletion); database queries (single-table queries; self-joins on a single table; joins of two or three tables, implemented with SELECT as equi-joins, natural joins, and left, right, and full outer joins; uncorrelated and correlated nested subqueries; grouped aggregate queries; sorting of query results); data updates (inserting, deleting, and modifying table data). Uncorrelated nested subqueries require mastery of IN, ANY, and ALL; correlated nested subqueries require mastery of simple uses of the EXISTS predicate.
5, MySQL stored procedure and trigger design: design of stored procedures on the MySQL platform with no parameters or with several IN and OUT parameters, and design of triggers; methods of calling stored procedures and testing triggers; understand the definition of a transaction, its ACID properties, and the basic principles of lock-based concurrency control in multi-user database systems.
6, basic management of MySQL users and privileges: creating new users and granting privileges on table objects (SELECT, INSERT, UPDATE, DELETE).
7, database design for simple database applications: requirements description, ER diagram design, methods of converting an ER diagram to a relational model, data model optimization, view design, logical design, physical design; ER diagrams include the representation and meaning of entities, attributes, relationships (one-to-one, one-to-many, many-to-many), and participation constraints (min, max).
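To illustrate the uncorrelated and correlated nested subqueries named in item 4, here is a minimal sketch under stated assumptions. It uses Python's built-in sqlite3 module as a stand-in for MySQL (the syllabus targets MySQL; SQLite does not support ANY/ALL, so only IN and EXISTS are shown). The student and enrollment tables, their columns, and the sample rows are hypothetical.

```python
import sqlite3


def run_subquery_demo():
    """Run one uncorrelated (IN) and one correlated (NOT EXISTS) subquery."""
    conn = sqlite3.connect(":memory:")  # throwaway in-memory database
    cur = conn.cursor()
    # Hypothetical schema and data, for illustration only.
    cur.executescript("""
        CREATE TABLE student (sid INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE enrollment (
            sid INTEGER REFERENCES student(sid),
            course TEXT NOT NULL,
            PRIMARY KEY (sid, course)
        );
        INSERT INTO student VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cai');
        INSERT INTO enrollment VALUES (1, 'DB'), (1, 'Spark'), (2, 'DB');
    """)
    # Uncorrelated subquery: the inner SELECT does not depend on the outer row.
    in_rows = cur.execute("""
        SELECT name FROM student
        WHERE sid IN (SELECT sid FROM enrollment WHERE course = 'DB')
        ORDER BY name
    """).fetchall()
    # Correlated subquery: the inner SELECT references s.sid from the outer row.
    not_exists_rows = cur.execute("""
        SELECT name FROM student s
        WHERE NOT EXISTS (SELECT 1 FROM enrollment e WHERE e.sid = s.sid)
        ORDER BY name
    """).fetchall()
    conn.close()
    return [r[0] for r in in_rows], [r[0] for r in not_exists_rows]
```

The first query lists students enrolled in the hypothetical course 'DB'; the second lists students with no enrollment at all. The same SELECT statements should also be accepted by MySQL, although that is not verified here.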
II, big data management and analysis technology (60%)
1, basic concepts of big data: the 4V characteristics of big data, its types (structured and unstructured), core technologies (distributed storage and distributed processing), computing models (batch computing, stream computing, graph computing, query and analysis computing), and typical representative products of each computing model.
2, basic theory of the Hadoop framework: Hadoop's characteristics, its core modules, and their main functions (the HDFS distributed file system and the MapReduce computing model).
(1) HDFS file system basics: architecture, design goals and limitations of HDFS, and the functions and components of the NameNode and DataNode (NameNode: FsImage and EditLog; DataNode: data storage and retrieval).
(2) MapReduce computing model basics: architecture (Client, JobTracker, TaskTracker, and Task), advantages (good fault tolerance, low hardware requirements, low programming difficulty, applicability to many scenarios, etc.), and design strategies (divide and conquer, moving computation closer to the data, Master/Slave architecture).
(3) Map/Reduce input/output and workflow: Input -> Map -> Reduce -> Output.
3, commonly used Linux command-line tools and Hadoop operations:
(1) Linux common operations: cd, mkdir, rmdir, cp, mv, rm, cat, more, head, tail, touch, chown, chmod, find, tar, grep;
(2) Hadoop common operations: starting Hadoop (starting all processes, starting a single process), viewing a directory (hdfs dfs -ls), opening a file (hdfs dfs -cat), uploading local files or directories to Hadoop (hdfs dfs -put), downloading from Hadoop to a local directory (hadoop dfs -get), deleting folders or files on Hadoop (hdfs dfs -rm or -rmr), creating a new directory within a specified directory in Hadoop (hdfs dfs -mkdir), renaming a file on Hadoop (hdfs dfs -mv), merging all the contents of a specified directory on Hadoop into one file and downloading it locally (hdfs dfs -getmerge), killing a running Hadoop job (hadoop job -kill), viewing directory information for a path (hdfs dfs -count), displaying the contents of a file (hdfs dfs -text), and viewing help (hdfs dfs -help).
4, MapReduce implementations of classic statistical algorithms (deduplication, counting, sorting, Top-K sorting, finding maximum and minimum values) and relational operations (selection, projection, grouping).
5, Spark basic concepts: Spark core technologies (RDD: resilient distributed dataset; the two types of RDD operations, Transformation and Action; the directed acyclic graph (DAG); in-memory computing; lazy evaluation), Spark features (speed, rich APIs, high fault tolerance, diverse deployment modes), and Spark's architecture (driver program, SparkContext object, Cluster Manager, worker nodes).
6, Spark application runtime architecture and execution flow: the Cluster Manager, multiple Worker Nodes, the task control node of each application (Driver), and the process on each worker node responsible for executing specific tasks (Executor).
7, machine learning with the Spark MLlib library (feature extraction, statistics, classification, regression, clustering, collaborative filtering).
8, analysis of classic Spark application scenarios: SQL queries, text processing, analytics, precise recommendation for music, video, and advertising, real-time data analysis.
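The Input -> Map -> Reduce -> Output workflow and the counting, deduplication, and Top-K tasks listed above can be sketched in plain Python. This is a single-process simulation for study purposes only, not the Hadoop API: the function names (map_phase, shuffle, reduce_phase, top_k, word_count) are hypothetical, and a real job would run the map and reduce steps on separate cluster nodes.

```python
import heapq
from collections import defaultdict


def map_phase(lines):
    """Map: emit a (word, 1) pair for every word in every input line."""
    for line in lines:
        for word in line.split():
            yield word.lower(), 1


def shuffle(pairs):
    """Shuffle: group all values by key, as the framework does between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce: sum the values collected for each key."""
    return {key: sum(values) for key, values in groups.items()}


def word_count(lines):
    """Full pipeline: Input -> Map -> (shuffle) -> Reduce -> Output."""
    return reduce_phase(shuffle(map_phase(lines)))


def top_k(counts, k):
    """Top-K: the k most frequent words, with ties broken alphabetically."""
    return heapq.nsmallest(k, counts.items(), key=lambda kv: (-kv[1], kv[0]))
```

In this sketch the keys of the dictionary returned by word_count are the deduplicated words, and top_k(word_count(lines), k) gives the k most frequent words.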