James Nicholas Gray's Research and Development

Born in 1944, Gray received his Ph.D. from the Department of Computer Science at the prestigious University of California, Berkeley. His doctoral dissertation dealt with the theory of parsing for precedence grammars. After his studies he worked at Bell Labs, IBM, Tandem, and DEC, and his research shifted to databases.

During his time at IBM, he participated in and led the development of projects such as IMS, System R, SQL/DS, and DB2. Except for System R, which served only as a research prototype and never became a product, all of these became influential IBM products in the database market.

During his time at Tandem, Gray improved and extended the company's main database product, ENCOMPASS, and worked on projects such as System Dictionary, Parallel Sort, Distributed SQL, and NonStop SQL.

At DEC, he again carried primary technical responsibility for database products. By the time Gray entered the database field, the basic theory of relational databases had matured, but the major companies implementing relational database management systems (RDBMSs) and developing products had run into a series of technical problems. As databases grew larger, database structures became more complex, and more and more users shared the same database, how could one guarantee data integrity, security, and concurrency, and how could the database be recovered once a failure occurred? Unless these problems were satisfactorily solved, no company's database products could become practical, and ultimately none would be accepted by users. It was in solving these major technical problems, and thereby bringing DBMSs to maturity and smoothly to market, that Gray applied his ingenuity and played a critical role.

The main technical means and methods by which a DBMS solves these problems are as follows:

1. Operations on the database are grouped into atomic units called "transactions". A transaction is the basic execution unit of transaction processing: either all of its operations are executed, or none of them are, the so-called all-or-nothing principle. A transaction typically starts with a "begin" statement, reads some data from the database, performs the required processing, and finishes with a "commit" statement. If an exception occurs during the transaction, an "abort" or "rollback" statement undoes all the updates the transaction has made to the database (the so-called undo), restoring the database to the correct state it was in when the transaction began, thereby protecting the integrity and consistency of the data.
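The all-or-nothing behavior described above can be sketched with Python's built-in sqlite3 module; the table and amounts below are illustrative assumptions, not taken from Gray's systems.

```python
import sqlite3

# Set up a tiny hypothetical accounts table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO accounts VALUES ('alice', 100), ('bob', 50)")
conn.commit()

try:
    with conn:  # begin ... commit; rolls back automatically on an exception
        conn.execute("UPDATE accounts SET balance = balance - 80 WHERE name = 'alice'")
        conn.execute("UPDATE accounts SET balance = balance + 80 WHERE name = 'bob'")
        raise RuntimeError("simulated failure before commit")
except RuntimeError:
    pass  # the rollback has already undone both updates

balances = dict(conn.execute("SELECT name, balance FROM accounts"))
print(balances)  # both rows unchanged: {'alice': 100, 'bob': 50}
```

Because the simulated failure happens before the commit, both updates are undone together; had the block completed, both would have been committed together.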

2. When a user requests an operation on the database, the system "locks" data elements at various granularities (from a field or a record up to an entire file); locked data is temporarily off limits to other users. (This is a simplified account. In fact, how locked data treats another user's request depends on the nature of each request: if the locked data is going to be modified, other users are strictly forbidden from accessing it, whereas if it is only being read, another user's read request can still be granted. This is managed and controlled by a so-called "lock compatibility matrix".) The data is "unlocked" when the operation completes. This mechanism maintains concurrency among transactions while also ensuring the integrity of the data.
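The lock compatibility matrix mentioned above can be sketched for the two classic lock modes, shared (S, for reading) and exclusive (X, for writing); the names and table here are illustrative assumptions.

```python
# Compatibility of a requested lock mode against one already held.
COMPATIBLE = {
    ("S", "S"): True,   # two readers may share the data
    ("S", "X"): False,  # a writer must wait for readers to finish
    ("X", "S"): False,  # a reader must wait for a writer to finish
    ("X", "X"): False,  # two writers never coexist
}

def can_grant(held_modes, requested):
    """Grant the request only if it is compatible with every lock already held."""
    return all(COMPATIBLE[(held, requested)] for held in held_modes)

print(can_grant(["S", "S"], "S"))  # True: many concurrent readers
print(can_grant(["S"], "X"))       # False: writer blocked by a reader
print(can_grant([], "X"))          # True: no locks held, writer proceeds
```

Real systems add further modes (update locks, intention locks for hierarchical granularities), but the grant decision is still a lookup of exactly this kind.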

3. The system maintains a running log that records the beginning and end of each transaction, together with the state of each updated page before and after the change (its before-image and after-image). If a system failure corrupts the database, a regular or occasional backup of the database, combined with the information in the log, can restore the database to the correct state it was in before the failure, while preserving the changes made to the database since the last backup.
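Recovery from such a log can be sketched as a redo pass for committed transactions followed by an undo pass, using before-images, for uncommitted ones; the log format and page store below are illustrative assumptions, not the actual structures of Gray's systems.

```python
# Backup state of the (single-page) database, plus the log at crash time.
pages = {"P1": "old"}
log = [
    ("begin",  "T1"),
    ("update", "T1", "P1", "old", "new"),  # (txn, page, before-image, after-image)
    ("commit", "T1"),
    ("begin",  "T2"),
    ("update", "T2", "P1", "new", "bad"),
    # crash here: T2 never committed
]

committed = {rec[1] for rec in log if rec[0] == "commit"}

# Redo pass: reapply after-images of committed transactions, in log order.
for rec in log:
    if rec[0] == "update" and rec[1] in committed:
        pages[rec[2]] = rec[4]

# Undo pass: restore before-images of uncommitted transactions, newest first.
for rec in reversed(log):
    if rec[0] == "update" and rec[1] not in committed:
        pages[rec[2]] = rec[3]

print(pages)  # {'P1': 'new'}: T1's committed change kept, T2's change undone
```

T1's change survives the crash because it committed; T2's does not, which is exactly the all-or-nothing guarantee extended across failures.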

4. Updates to the database are committed in two phases (two-phase commit). This is necessary because a transaction may involve two different database systems at the same time, which is especially important in distributed systems.
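The two-phase commit protocol can be sketched as follows: a coordinator first asks every participant to prepare and vote, and only if all vote yes does it tell them to commit; otherwise everyone aborts. The Participant class is an illustrative assumption, not a real database interface.

```python
class Participant:
    def __init__(self, name, will_commit=True):
        self.name, self.will_commit, self.state = name, will_commit, "active"

    def prepare(self):  # phase 1: vote yes or no
        self.state = "prepared" if self.will_commit else "aborted"
        return self.will_commit

    def finish(self, decision):  # phase 2: obey the global decision
        if self.state != "aborted":
            self.state = decision

def two_phase_commit(participants):
    # Commit only if every participant votes yes; a single no aborts all.
    decision = "committed" if all(p.prepare() for p in participants) else "aborted"
    for p in participants:
        p.finish(decision)
    return decision

a, b = Participant("db1"), Participant("db2", will_commit=False)
print(two_phase_commit([a, b]))  # 'aborted': one no vote aborts everyone
print(a.state, b.state)          # aborted aborted
```

The key point is that no participant commits until the coordinator knows all of them can, so the two databases involved in one transaction never end up in inconsistent states.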

The above and other methods can collectively be called "transaction processing technology" (transaction processing technique). Gray's creative thinking and pioneering work on transaction processing made him a recognized authority in the field. His research results are reflected in a series of papers and research reports, and finally crystallized into a thick monograph, Transaction Processing: Concepts and Techniques (Morgan Kaufmann Publishers, 1993, co-authored with Prof. A. Reuter of the University of Stuttgart, Germany). Although transaction processing technology was born in database research, it is also important for distributed systems, for data management and communication in client/server architectures, and for fault-tolerant, highly reliable systems.

To fully realize the ideals of the three scientific giants discussed above, Gray called on the U.S. government to support long-term research in information technology, arguing that its importance was no less than that of Thomas Jefferson's decision two hundred years earlier (Jefferson, 1743-1826, drafter of the Declaration of Independence and third President of the United States, in office 1801-1809) to buy the Louisiana Territory from the French government for 15 million dollars. That vast tract of land, lying between the Mississippi River and the Rocky Mountains and stretching from Canada in the north to the Gulf of Mexico in the south, covers some 2,070,000 km²; the purchase is the famous historical event known as the Louisiana Purchase. Jefferson then sent the Corps of Discovery, headed by Captain Meriwether Lewis and William Clark, to explore the West as far as the Pacific Coast, laying the foundation for the eventual formation of the United States of today. Gray believed that a good long-term IT goal should have the following five key properties:


1. Understandable: the goal can be stated simply and be understood.

2. Challenging: it is not obvious how to reach the goal.

3. Useful: of value not just to computer scientists, but to most people.

4. Testable: so that the progress of the project can be checked and one can know when the goal has been reached.

5. Incremental: with intermediate milestones to check the progress of the project and inspire the researchers to continue.

Supported by these arguments, Gray proposed the following long-term research goals for IT:

1. Scalability.

2. Passing the Turing Test.

3. Speech to Text.

4. Text to Speech.

5. Machine vision, which recognizes objects and motion like a human.

6. A personal "Memex" that records everything a person sees and hears and retrieves it quickly when needed.

7. A world "Memex", that is, a complete corpus of text, music, images, art, and film that can answer any question about its contents, indexing and digesting it as quickly and as well as a human expert.


8. Virtual reality (Gray used the term TelePresence; see the introduction to 1969 Turing Award winner Minsky).

9. Trouble-Free Systems (TFS).

10. Secure Systems.

11. Highly available systems (AlwaysUp).

12. Automatic Programming.