The central processing unit (CPU) is one of the main components of an electronic computer. Its main function is to interpret computer instructions and process the data in computer software. The so-called programmability of a computer mainly refers to the programming of the CPU. The CPU is the core component of a computer: it is only about as big as a matchbox and as thick as a few dozen sheets of paper, yet it is both the computing core and the control core of the machine. All operations in a computer are carried out by the CPU, which is responsible for reading instructions, decoding them, and executing them. The CPU, internal memory, and input/output devices are the three core components of an electronic computer. CPU is also the English abbreviation of China Pharmaceutical University.
Regardless of its physical form, a CPU works by executing a series of instructions stored as a program. The devices discussed here all follow the common stored-program architectural design, in which programs are held in computer memory as a series of numbers. Nearly all CPUs operate in four stages: fetch, decode, execute, and writeback.
The CPU consists of an arithmetic logic unit, a set of registers, and a control unit. The CPU takes an instruction from memory or cache, places it in an instruction register, and decodes it. It breaks the instruction down into a series of micro-operations and then issues the control commands that carry them out, thus completing the execution of one instruction. An instruction is a basic command of a computer that specifies the type of operation to be performed and the operands to be used. An instruction consists of one or more bytes, including an opcode field, one or more fields for operand addresses, and some status words and feature codes that characterize the state of the machine. Some instructions also contain the operand itself directly.
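The four stages can be illustrated with a minimal sketch of a toy accumulator machine. Everything here is an assumption made for illustration: the opcodes, the 16-bit encoding (opcode in the high byte, operand address in the low byte), and the function names run and encode do not correspond to any real instruction set.

```python
# Hypothetical opcodes for a toy accumulator machine (not a real instruction set).
HALT, LOAD, ADD, STORE = 0, 1, 2, 3


def encode(opcode, address=0):
    """Pack an opcode and an operand address into one number (assumed layout)."""
    return (opcode << 8) | address


def run(memory):
    """Execute the program stored in `memory` and return the final memory."""
    pc = 0   # program counter: address of the next instruction
    acc = 0  # accumulator register
    while True:
        word = memory[pc]        # fetch the instruction word
        pc += 1
        opcode = word >> 8       # decode: high byte is the opcode
        address = word & 0xFF    # decode: low byte is the operand address
        if opcode == HALT:
            return memory
        if opcode == LOAD:       # execute, result written back to the register
            acc = memory[address]
        elif opcode == ADD:
            acc += memory[address]
        elif opcode == STORE:    # write the result back to memory
            memory[address] = acc
        else:
            raise ValueError(f"unknown opcode {opcode}")


# Program and data share one memory, stored as plain numbers.
program = [
    encode(LOAD, 4),   # address 0: acc = memory[4]
    encode(ADD, 5),    # address 1: acc += memory[5]
    encode(STORE, 6),  # address 2: memory[6] = acc
    encode(HALT),      # address 3: stop
    2, 3, 0,           # addresses 4..6: data
]

result = run(list(program))
print(result[6])  # 2 + 3 = 5
```

A real CPU overlaps these stages in a pipeline and decodes into micro-operations; the loop above only shows the order in which the stages happen for a single instruction.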
In 1981, the 8088 chip was used for the first time in IBM's PC (Personal Computer), ushering in a new era of microcomputers. It was from the 8088 that the concept of the PC began to spread worldwide. Early CPUs were usually custom-built for large, application-specific computers. However, this expensive approach of customizing CPUs for specific applications has largely given way to the development of inexpensive, standardized classes of processor suited to one or more purposes. This trend toward standardization began in the era of mainframes and minicomputers built from discrete transistors and accelerated with the advent of integrated circuits. Integrated circuits made it possible to design more complex CPUs and manufacture them at very small dimensions (on the order of micrometers).
In 1982, Intel introduced its epoch-making new product, the 80286 chip, which was a major leap forward from both the 8086 and the 8088. Although it was still a 16-bit design, it contained 134,000 transistors, and its clock frequency was gradually increased from the initial 6MHz to 20MHz. Its internal and external data buses were both 16-bit, and its address bus was 24-bit, allowing it to address 16MB of memory. Starting with the 80286, the CPU had two modes of operation: real mode and protected mode.
In 1985, Intel introduced the 80386 chip, the first 32-bit microprocessor in the 80X86 series. The manufacturing process had also made great progress: compared with the 80286, the 80386 contained 275,000 transistors, and its clock frequency started at 12.5MHz and was later raised to 20MHz, 25MHz, and 33MHz. The 80386 has 32-bit internal and external data buses and a 32-bit address bus, and can address up to 4GB of memory. In addition to real and protected modes, it added a mode of operation called virtual 86 mode, which provides multitasking by emulating multiple 8086 processors at the same time. Besides the standard 80386 chip, usually referred to as the 80386DX, Intel launched several other types of 80386 for different markets and applications: the 80386SX, 80386SL, 80386DL, and so on. The 80386SX, launched in 1988, was positioned in the market between the 80286 and the 80386DX. Unlike the 80386DX, its external data bus and address bus are the same as those of the 80286, namely 16-bit and 24-bit (that is, an addressing capacity of 16MB).
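The addressing capacities quoted above follow directly from the address bus width: an n-bit address bus can select 2^n distinct byte addresses. A minimal sketch of that arithmetic (the function name addressable_bytes is an illustrative choice, and byte-addressable memory is assumed):

```python
def addressable_bytes(address_bits):
    """Number of distinct byte addresses an address bus of this width can select."""
    return 2 ** address_bits


MB = 1024 ** 2
GB = 1024 ** 3

# Address bus widths as stated in the text.
for chip, bits in [("80286", 24), ("80386SX", 24), ("80386DX", 32)]:
    size = addressable_bytes(bits)
    if size >= GB:
        print(f"{chip}: {bits}-bit address bus -> {size // GB}GB")
    else:
        print(f"{chip}: {bits}-bit address bus -> {size // MB}MB")
```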
The Takeoff of the High-Speed CPU Era
In 1990, Intel introduced the 80386SL and 80386DL, both low-power, energy-saving chips used mainly in portable computers and energy-saving desktops. The difference between them is that the 80386SL is based on the 80386SX and the 80386DL is based on the 80386DX, but both added a new mode of operation: system management mode. On entering system management mode, the CPU automatically reduces its running speed, suspends the display, hard disk, and other components, or even stops running altogether and enters a "hibernation" state in order to save energy. In 1989, Intel launched the familiar 80486 chip. Its great achievement was breaking the barrier of one million transistors, integrating 1.2 million of them. The 80486's clock frequency was gradually increased from 25MHz to 33MHz and 50MHz. The 80486 integrated the 80386, the 80387 math coprocessor, and an 8KB cache on a single chip. It was the first in the 80X86 family to use RISC (reduced instruction set) techniques, which allowed it to execute one instruction in a single clock cycle. It also used a burst bus mode, which greatly increased the speed of data exchange with memory. As a result of these improvements, the performance of the 80486 was four times that of an 80386DX paired with an 80387 math coprocessor. Like the 80386, the 80486 came in several types. The original type described above was the 80486DX.
In 1990, Intel introduced the 80486SX, a lower-priced model in the 486 family, which differed from the 80486DX in that it had no math coprocessor. The 80486DX2 used clock-doubling technology, meaning that the chip's internal clock ran at twice the speed of the external bus. The 80486DX2 was available with internal clock frequencies of 40MHz, 50MHz, and 66MHz. The 80486DX4 is also a clock-multiplied chip, whose internal units run at 2x or 3x the speed of the external bus. To support this higher internal operating frequency, its on-chip cache was expanded to 16KB. The 80486DX4 is clocked at 100MHz and runs 40% faster than the 66MHz 80486DX2. The 80486 was also available in an SL-enhanced type with system management mode, for use in portable computers or energy-efficient desktops.
Both the standardization and the miniaturization of CPUs have made this class of digital device far more common in modern life than computers dedicated to limited applications. Modern microprocessors are found in everything from cars to cell phones to children's toys.
The Pentium Era
The Pentium microprocessor was introduced in March 1993 and integrated 3.1 million transistors. It used a number of techniques to improve CPU performance, chiefly a superscalar architecture, a built-in floating-point unit using superpipelining, a larger on-chip cache, and internal parity checking to detect internal processing errors.
The main frequency
The main frequency, also called the clock frequency, is measured in megahertz (MHz) or gigahertz (GHz) and indicates the speed at which the CPU computes and processes data. The main frequency of the CPU = external frequency × multiplier. Many people believe that the main frequency determines the operating speed of the CPU. This view is one-sided, and for servers in particular it is misleading. So far there is no definite formula relating the main frequency to actual computing speed, and even the two major processor manufacturers, Intel and AMD, disagree sharply on this point. Judging from the development of Intel's products, Intel places great emphasis on raising the main frequency. As for other processor manufacturers, a 1GHz Transmeta processor was once used for comparison and found to run about as efficiently as a 2GHz Intel processor. There is a relationship between the main frequency and actual computing speed, but it is not a simple linear one. The main frequency indicates the rate at which the digital pulse signal oscillates inside the CPU, so it is not a direct measure of the CPU's actual computing power. Examples can be found in Intel's own processor products: a 1GHz Itanium chip can perform almost as fast as a 2.66GHz Xeon/Opteron, and a 1.5GHz Itanium 2 is about as fast as a 4GHz Xeon/Opteron. CPU speed also depends on the performance of the CPU's pipeline, buses, and other components. The main frequency is related to actual computing speed, but it is only one aspect of CPU performance and does not represent the CPU's overall performance.
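The relationship main frequency = external frequency × multiplier, and the reason a higher clock does not automatically mean a faster CPU, can be sketched as follows. The instructions-per-cycle (IPC) figures are hypothetical values chosen only to illustrate the point; they are not measurements of any real processor, and the simple model throughput = clock × IPC ignores memory and bus effects.

```python
def main_frequency_mhz(external_mhz, multiplier):
    """Main (core) frequency = external frequency x multiplier."""
    return external_mhz * multiplier


def instructions_per_second(clock_mhz, ipc):
    """Rough throughput model: clock rate times average instructions per cycle."""
    return clock_mhz * 1e6 * ipc


# A clock-doubled chip on a 33MHz external clock runs internally at 66MHz.
print(main_frequency_mhz(33, 2))

# Hypothetical IPC values: a 1GHz chip that completes two instructions per
# cycle matches a 2GHz chip that completes one instruction per cycle.
slow_clock_wide = instructions_per_second(1000, ipc=2.0)
fast_clock_narrow = instructions_per_second(2000, ipc=1.0)
print(slow_clock_wide == fast_clock_narrow)
```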
The external frequency
The external frequency is the base frequency of the CPU, measured in MHz, and it determines the operating speed of the entire motherboard. On desktop computers, overclocking usually means raising the CPU's external frequency (since the CPU's multiplier is generally locked). For server CPUs, however, overclocking is absolutely not allowed. As noted above, the external frequency determines the operating speed of the motherboard, and the two run synchronously. If a server CPU is overclocked by changing the external frequency, the result is asynchronous operation (which many desktop motherboards support), and this makes the entire server system unstable. In the vast majority of current computer systems the external frequency and the motherboard's front-side bus do not run at the same speed, and the external frequency is easily confused with the front-side bus (FSB) frequency. The difference between the two is explained in the front-side bus section that follows.
Front Side Bus (FSB) frequency
The front-side bus (FSB) frequency (i.e., the bus frequency) directly affects the speed of data exchange between the CPU and memory. It can be calculated with a formula: data bandwidth = (bus frequency × data bit width) / 8. The maximum bandwidth of data transmission depends on the width of the data transmitted simultaneously and on the transmission frequency. For example, the current 64-bit Xeon Nocona has an 800MHz front-side bus, and according to the formula its maximum data transfer bandwidth is 6.4GB/s. The difference between the CPU's external frequency and the front-side bus frequency is this: the front-side bus speed refers to the speed of data transfer, whereas the external frequency is the speed at which the CPU and the motherboard run in synchrony. In other words, a 100MHz external frequency means that the digital pulse signal oscillates 100 million times per second, while a 100MHz front-side bus means that the CPU can accept 100MHz × 64bit ÷ 8bit/Byte = 800MB/s of data.
The emergence of the "HyperTransport" architecture has changed what front-side bus frequency means in practice. The IA-32 architecture requires three important building blocks: the Memory Controller Hub (MCH), the I/O Controller Hub, and the PCI Hub, as in Intel's very typical 7501 and 7505 chipsets. These chipsets are tailored for dual Xeon processors: the MCH they contain provides the CPU with a front-side bus frequency of 533MHz, and with DDR memory the front-side bus bandwidth can reach 4.3GB/s. However, continually increasing processor performance also brings many problems for the system architecture. The "HyperTransport" architecture not only solves these problems but also improves bus bandwidth more effectively. On AMD Opteron processors, for example, the flexible HyperTransport I/O bus architecture allows the memory controller to be integrated into the processor, so that the processor exchanges data directly with memory instead of passing it through the system bus to the chipset. As a result, there is no front-side bus (FSB) frequency to speak of on AMD Opteron processors.
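The bandwidth formula from this section, data bandwidth = (bus frequency × data bit width) / 8, can be checked against the three figures quoted in the text. This is a minimal sketch; the function name is an illustrative choice, a 64-bit data width is assumed for all three buses, and the bus frequency is taken as the effective transfer rate in MHz.

```python
def bus_bandwidth_mb_per_s(bus_mhz, width_bits=64):
    """Peak bandwidth in MB/s: (bus frequency in MHz x data width in bits) / 8."""
    return bus_mhz * width_bits / 8


for bus_mhz in (100, 533, 800):
    mb = bus_bandwidth_mb_per_s(bus_mhz)
    print(f"{bus_mhz}MHz x 64bit / 8 = {mb:.0f}MB/s (about {mb / 1000:.1f}GB/s)")
```

These are peak figures; the bandwidth actually achieved depends on the memory and chipset as well.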
CPU bits and word length
Bits: Digital circuits and computer technology use binary, in which the only symbols are "0" and "1". Each "0" or "1" in the CPU is one "bit". Word length: The number of binary bits that the CPU can process at one time (simultaneously) is called the word length. A CPU that can process 8 bits of data at a time is therefore usually called an 8-bit CPU, and a 32-bit CPU can process 32 bits of binary data at a time. Difference between byte and word length: Since common English characters can be represented in 8 binary bits, 8 bits are usually called a byte. The word length is not fixed and differs from one CPU to another. An 8-bit CPU can handle only one byte at a time, a 32-bit CPU can handle 4 bytes at a time, and similarly a CPU with a 64-bit word length can handle 8 bytes at a time.
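The relationship between word length and bytes can be written out as a small sketch. The function names are illustrative, and the largest-value calculation assumes the word is interpreted as an unsigned integer.

```python
BITS_PER_BYTE = 8


def bytes_per_word(word_bits):
    """How many bytes a CPU with this word length handles at one time."""
    return word_bits // BITS_PER_BYTE


def max_unsigned_value(word_bits):
    """Largest unsigned integer that fits in one word."""
    return 2 ** word_bits - 1


for word_bits in (8, 32, 64):
    print(word_bits, bytes_per_word(word_bits), max_unsigned_value(word_bits))
```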
The multiplier factor
The multiplier is the ratio between the CPU's main frequency and its external frequency. At the same external frequency, the higher the multiplier, the higher the frequency of the CPU. In practice, however, a high multiplier at a given external frequency is of limited value in itself. This is because the data transfer speed between the CPU and the rest of the system is limited, so a CPU that reaches a high main frequency through a high multiplier suffers an obvious "bottleneck" effect: the maximum rate at which the CPU can obtain data from the system cannot keep up with the rate at which the CPU computes. In general, Intel CPUs other than engineering samples have locked multipliers, although a few, such as the Core 2-based Pentium Dual-Core E6500K and some Extreme Edition CPUs, are unlocked. AMD did not lock its multipliers in the past, and it now offers Black Edition CPUs (unlocked versions on which the user is free to adjust the multiplier; overclocking by adjusting the multiplier is considerably more stable than overclocking by adjusting the external frequency).
Cache
Cache size is also one of the most important indicators of a CPU, and the structure and size of the cache have a very large impact on CPU speed. The cache inside the CPU runs at a very high frequency, generally the same frequency as the processor, and works far more efficiently than system memory or the hard disk. In practice the CPU often needs to read the same block of data repeatedly, and a larger cache significantly raises the hit rate for data read inside the CPU, so that it does not have to look in memory or on the hard disk, which improves system performance. However, because of CPU die area and cost considerations, the cache is very small.
L1 Cache (Level 1 Cache) is the first level of CPU cache and is divided into a data cache and an instruction cache. The capacity and structure of the built-in L1 cache have a considerable impact on CPU performance, but cache memory is made of static RAM, which has a relatively complex structure. Since the CPU die area cannot be too large, the L1 cache cannot be made too large either. The L1 cache of a typical server CPU is usually 32-256KB.
L2 Cache (Level 2 Cache) is the CPU's second level of cache and may be on-chip or off-chip. An on-chip L2 cache runs at the same speed as the main frequency, while an off-chip L2 cache runs at only half the main frequency. L2 cache capacity also affects CPU performance, and the principle is the bigger the better. The largest L2 cache in home CPUs used to be 512KB and can now reach 2MB in laptop computers, while the L2 cache of CPUs for servers and workstations is larger still and can exceed 8MB.
L3 Cache (Level 3 Cache) comes in two kinds: early L3 caches were external, while current ones are built in. In practice, an L3 cache further reduces memory latency and improves processor performance in calculations on large volumes of data. Reducing memory latency and increasing the ability to compute on large amounts of data are both very helpful for gaming. In the server space, adding L3 cache still provides a significant performance boost. For example, a configuration with a larger L3 cache uses physical memory more efficiently, so its slower disk I/O subsystem can handle more data requests. Processors with larger L3 caches provide more efficient file system caching behavior and shorter message and processor queue lengths. The earliest L3 cache was used with the K6-III processor released by AMD. At that time, because of limitations in the manufacturing process, the L3 cache was not integrated into the chip but placed on the motherboard. An L3 cache that could only run in step with the system bus frequency was not much different from main memory. L3 cache was later used in Intel's Itanium processors for the server market. Intel also plans to launch an Itanium 2 processor with 9MB of L3 cache, followed by a dual-core Itanium 2 processor with 24MB of L3 cache. Fundamentally, however, L3 cache is not very important to improving processor performance. For example, the Xeon MP processor equipped with 1MB of L3 cache is still no match for the Opteron, which shows that an increase in front-side bus speed brings a more effective performance improvement than an increase in cache.
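The claim that a larger cache raises the hit rate when the same data is read repeatedly can be illustrated with a small simulation. This is only a sketch under stated assumptions: it models a fully associative cache with least-recently-used (LRU) replacement, which is simpler than the set-associative caches in real CPUs, and the access trace (80% of reads go to 8 "hot" blocks, the rest to 256 "cold" blocks) is synthetic.

```python
import random
from collections import OrderedDict


def hit_rate(trace, capacity):
    """Fraction of accesses served from an LRU cache holding `capacity` blocks."""
    cache = OrderedDict()  # keys are block numbers, ordered oldest to newest
    hits = 0
    for block in trace:
        if block in cache:
            hits += 1
            cache.move_to_end(block)       # mark as most recently used
        else:
            cache[block] = True            # fetch from memory into the cache
            if len(cache) > capacity:
                cache.popitem(last=False)  # evict the least recently used block
    return hits / len(trace)


def make_trace(length=10_000, seed=0):
    """Synthetic trace: mostly repeated reads of a few hot blocks (assumed mix)."""
    rng = random.Random(seed)
    trace = []
    for _ in range(length):
        if rng.random() < 0.8:
            trace.append(rng.randrange(8))        # hot blocks 0..7
        else:
            trace.append(8 + rng.randrange(256))  # cold blocks 8..263
    return trace


trace = make_trace()
for capacity in (4, 16, 64, 256):
    print(capacity, round(hit_rate(trace, capacity), 3))
```

With LRU replacement the hit rate never decreases as capacity grows for the same trace, but the gain flattens once the frequently used data already fits, which is one reason each added cache level gives a smaller improvement than the one before it.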
CPU Extended Instruction Set
CPUs rely on instructions to perform computation and control the system, and each CPU is designed with a set of instructions that matches its hardware circuitry. The strength of the instruction set is also an important indicator of a CPU, and the instruction set is one of the most effective tools for improving microprocessor efficiency. In terms of today's mainstream architectures, instruction sets can be divided into complex instruction sets and reduced instruction sets (there are four types of instruction set in all). In terms of specific applications, Intel's MMX (usually expanded as Multi Media Extended, although Intel has never stated what the name stands for), SSE, SSE2 (Streaming Single Instruction Multiple Data Extensions 2), SSE3, and the SSE4 series, together with AMD's 3DNow!, are all CPU extended instruction sets, which enhance the CPU's ability to handle multimedia, graphics, and Internet workloads. These extended instruction sets are what is usually meant by "the CPU's instruction set". SSE3 is currently the smallest of them: MMX contains 57 instructions, SSE contains 50, SSE2 contains 144, and SSE3 contains 13. SSE4 is currently the most advanced instruction set. Intel Core series processors already support SSE4, AMD will add SSE4 support in future dual-core processors, and Transmeta processors will also support this instruction set.
CPU core and I/O voltage
Starting with 586-class CPUs, the CPU operating voltage is divided into two types, core voltage and I/O voltage, and the core voltage is usually less than or equal to the I/O voltage. The core voltage depends on the CPU's manufacturing process: generally, the smaller the process, the lower the core operating voltage. The I/O voltage is generally in the range of 1.6 to 5V. Low voltage helps solve the problems of excessive power consumption and high heat generation.
Manufacturing process
The micron (or nanometer) figure of a manufacturing process refers to the distance between circuits within the IC. The trend in manufacturing processes is toward ever higher density. A higher-density IC design means that an IC of the same area can hold denser circuits with more complex functions. The main processes today are 180nm, 130nm, 90nm, 65nm, and 45nm. Intel recently introduced the Core i3/i5 series on a 32nm manufacturing process. AMD, on the other hand, has indicated that its products will largely skip the 32nm process (a few 32nm products, such as Orochi and Llano, will be produced in the third quarter of 2010) and that it will release 28nm products (names not yet determined) in the middle of 2011.
Packaging
CPU packaging is a protective measure in which a CPU chip or CPU module is encapsulated in a specific material to prevent damage. Generally a CPU must be packaged before it can be delivered to the user. The type of packaging depends on how the CPU is mounted and on the integration design of the device. Broadly speaking, CPUs mounted in sockets are usually packaged as PGA (Pin Grid Array), while CPUs mounted in Slot x slots are all packaged in SEC (Single Edge Contact cartridge) form. There are also PLGA (Plastic Land Grid Array) and OLGA (Organic Land Grid Array) packaging technologies. Because of increasingly fierce competition in the market, the current direction of development in CPU packaging technology is mainly toward saving cost.
Multithreading
Simultaneous multithreading, or SMT for short, replicates the architectural state of the processor so that multiple threads on the same processor can execute at the same time and share the processor's execution resources. This makes the most of wide-issue, out-of-order superscalar processing, improves the utilization of the processor's computing units, and hides the memory access latency caused by data dependencies or cache misses. When multiple threads are not available, an SMT processor is almost identical to a traditional wide-issue superscalar processor. The appeal of SMT is that it requires only a small change to the processor core design and can deliver a significant increase in performance at almost no additional cost. Multithreading can keep more data ready for the high-speed computing core and so reduce its idle time. This is certainly very attractive for low-end desktop systems, and Intel supports SMT on all its processors starting with the 3.06GHz Pentium 4.
Multi-core
Multi-core is also referred to as the chip multiprocessor (CMP). CMP was proposed at Stanford University, and the idea is to integrate the SMP (symmetric multiprocessor) arrangement used in massively parallel machines onto a single chip, with each processor executing a different process in parallel. Compared with CMP, the flexibility of the SMT processor architecture is more prominent. However, once semiconductor processes moved beyond 0.18 micron, wire delay came to exceed gate delay, which requires microprocessors to be designed as many smaller basic units with better locality. In this respect the CMP structure is more promising, because it is already divided into multiple processor cores, each of which is relatively simple and easier to optimize. Currently, IBM's Power 4 chip and Sun's MAJC5200 chip both use the CMP architecture. Multi-core processors can share the cache inside the processor, which improves cache utilization and also simplifies the design of multiprocessor systems. In the second half of 2005, new processors from Intel and AMD will also adopt the CMP architecture. The new Itanium processor, developed under the code name Montecito, is a dual-core design with at least 18MB of on-chip cache, manufactured on a 90nm process, and its design poses a serious challenge to today's chip industry. Each of its cores has its own L1, L2, and L3 cache, and the chip contains approximately 1 billion transistors.
CPU manufacturers
Intel Corporation
Intel is the leading CPU manufacturer and holds more than 75% of the personal computer market. The CPUs Intel produces have become the de facto technical specification and standard for x86 CPUs. The latest Core 2 has become the CPU of choice for the personal computer platform, and the next-generation Core i5, Core i3, and Core i7 have taken a significant lead in performance over other manufacturers' products.
AMD
The CPUs in use today come from several companies. Besides Intel, the strongest challenger is AMD. Its latest Athlon II X2 and Phenom II offer very good price-performance, and with 3DNow!+ technology and support for the SSE4.0 instruction set they perform very well in 3D.
IBM and Cyrix
IBM's strength lies in CPUs for high-end laboratories, studios, and non-civilian use. The merger of National Semiconductor (NS) and Cyrix finally gave Cyrix its own chip production line, and its finished products are expected to become increasingly sophisticated and complete. The current MII also performs well, especially given its very low price.
IDT Corporation
IDT is a rising star among processor makers, but it's not quite mature yet.
VIA
VIA is a Taiwanese motherboard chipset maker that acquired the aforementioned Cyrix and the CPU division of IDT and launched its own CPUs.
Domestic Loongson
Loongson (originally known in English as Godson, and nicknamed "Gou Sheng", literally "dog's leftovers") is a general-purpose processor with domestically owned intellectual property rights. Two generations of products have been released, and they can already match the low-end CPUs from Intel and AMD currently on the market. The processor's current English name is Loongson.
ARM Ltd
ARM (known in Chinese as Anmou International Technology) is one of the few companies that only licenses its CPU designs and does not manufacture chips itself. Embedded application software is most often executed on ARM-architecture microprocessors.
Intel Pentium dual-core: the Pentium D and Pentium 4 EE with Presler cores. A Presler core can basically be thought of as two Cedar Mill cores loosely coupled together. First-generation Core uses the Yonah core architecture. [1] Second-generation Core (Core 2) uses the Conroe core, though not in every model. Core is a leading, energy-efficient new microarchitecture designed to deliver outstanding performance and energy efficiency, improving performance per watt (also known as the energy efficiency ratio). The early Core was a processor for notebooks.
Various Packaging
Bulk CPUs consist of the CPU alone, with no packaging, and usually carry a one-year store warranty. They are usually supplied by manufacturers to system builders, and units the builders cannot use up find their way onto the market. Some dealers pair a bulk CPU with a fan and package it to look like the original, making it a repackaged product. Another major source of bulk CPUs is smuggling.
Original-package CPUs, also known as boxed CPUs, are the products that manufacturers launch for the retail market, and they come with the original fan and the manufacturer's three-year warranty. There is in fact no difference in quality between a bulk CPU and a boxed CPU. The main difference lies in the sales channel, and therefore in the warranty: boxed CPUs generally carry a three-year warranty, while bulk CPUs generally carry only one year. Boxed CPUs come with the original fan, while bulk CPUs come with no fan or with one supplied by the distributor.
Black box CPUs are top-of-the-line unlocked CPUs from manufacturers, such as AMD's black box 5000+. They do not come with a fan and are retail products aimed specifically at overclocking users.
Repackaged CPUs are bulk CPUs that the distributor has packaged itself and fitted with a fan. There is no manufacturer's warranty, only a store warranty, usually of three years. Alternatively, they are CPUs smuggled in from abroad and then repackaged with a fan. These are untaxed and slightly cheaper than bulk CPUs.
Engineering sample CPUs are samples that the processor manufacturer provides to the major motherboard manufacturers and OEMs for testing before the processor is launched. They come from early production, but their quality is not necessarily lower than that of the final retail CPU. Their most notable features are unlocked multipliers and certain special functions, which makes them the first choice of skilled DIY users. These CPUs can occasionally be found on the market, and they are marked "ES" (for Engineering Sample).