The fundamental operation of most CPUs, regardless of the physical form they take, is to execute a sequence of stored instructions called a program. Discussed here are devices that conform to the common [[Von Neumann architecture]]. The program is represented by a series of numbers that are kept in some kind of [[Memory (computers)|computer memory]]. There are four steps that nearly all Von Neumann CPUs use in their operation: '''fetch''', '''decode''', '''execute''', and '''writeback'''.
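
As a rough illustration, these four steps can be sketched as a simple loop in C. This is a hypothetical model only; the memory size, the 32-bit word width and the all-zero halt convention are assumptions made for the example, not features of any particular CPU:

<syntaxhighlight lang="c">
#include <stdint.h>

#define MEM_WORDS 256

static uint32_t memory[MEM_WORDS]; /* program and data share one memory (Von Neumann) */
static uint32_t pc;                /* program counter, counting bytes */

int main(void) {
    for (;;) {
        uint32_t instr = memory[pc / 4]; /* fetch the instruction word at the PC */
        pc += 4;                         /* advance to the next word */
        if (instr == 0)                  /* assumed halt encoding for this sketch */
            break;
        /* decode, execute and writeback would follow here; each step is
           sketched in more detail in the sections below. */
    }
    return 0;
}
</syntaxhighlight>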
 
[[Image:Mips32_addi.png|left|frame|Diagram showing how one [[MIPS architecture|MIPS32]] instruction is decoded. {{Ref harvard|MIPSTech2005|MIPS Technologies 2005|a}}]]
 
The first step, '''fetch''', involves retrieving an [[instruction (computer science)|instruction]] (which is represented by a number or sequence of numbers) from program memory. The location in program memory is determined by a [[program counter]] (PC), which stores a number that identifies the current position in the program. In other words, the program counter keeps track of the CPU's place in the current program. After an instruction is fetched, the PC is incremented by the length of the instruction word in terms of memory units.<ref>Since the program counter counts ''memory addresses'' and not ''instructions,'' it is incremented by the number of memory units that the instruction word contains. In the case of simple fixed-length instruction word ISAs, this is always the same number. For example, a fixed-length 32-bit instruction word ISA that uses 8-bit memory words would always increment the PC by 4 (except in the case of jumps). ISAs that use variable-length instruction words, such as [[x86]], increment the PC by the number of memory words corresponding to the last instruction's length. Also, note that in more complex CPUs, incrementing the PC does not necessarily occur at the end of instruction execution. This is especially the case in heavily pipelined and superscalar architectures (see the relevant sections below).</ref> Often the instruction to be fetched must be retrieved from relatively slow memory, causing the CPU to stall while waiting for the instruction to be returned. This issue is largely addressed in modern processors by caches and [[Pipeline (computer)|pipeline]] architectures (see below).
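
The footnote's point about memory units can be made concrete with a small sketch. Assuming a hypothetical fixed-length 32-bit ISA over byte-addressed (8-bit word) memory, the PC is a byte address and must advance by four per instruction, not by one; the names here are illustrative only:

<syntaxhighlight lang="c">
#include <stdint.h>
#include <string.h>

static uint8_t memory[1024]; /* byte-addressed program memory */
static uint32_t pc;          /* PC counts memory units (bytes), not instructions */

static uint32_t fetch(void) {
    uint32_t instr;
    memcpy(&instr, &memory[pc], sizeof instr); /* read one 4-byte instruction word */
    pc += sizeof instr;                        /* increment the PC by 4 memory units */
    return instr;
}

int main(void) {
    uint32_t first = fetch(); /* after this call the PC holds 4 */
    (void)first;
    return 0;
}
</syntaxhighlight>
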
The instruction that the CPU fetches from memory is used to determine what the CPU is to do. In the '''decode''' step, the instruction is broken up into parts that have significance to other portions of the CPU. The way in which the numerical instruction value is interpreted is defined by the CPU's [[instruction set architecture]] ('''ISA''').<ref>Because the instruction set architecture of a CPU is fundamental to its interface and usage, it is often used as a classification of the "type" of CPU. For example, a "[[PowerPC]] CPU" uses some variant of the PowerPC ISA. Some CPUs, like the Intel [[Itanium]], can actually interpret instructions for more than one ISA; however, this is often accomplished by software means rather than by designing the hardware to directly support both interfaces. (See [[emulator]].)</ref> Often, one group of numbers in the instruction, called the [[opcode]], indicates which operation to perform. The remaining parts of the number usually provide information required for that instruction, such as operands for an [[addition]] operation. Such operands may be given as a constant value (called an immediate value), or as a place to locate a value: a [[processor register|register]] or a [[memory address]], as determined by some [[addressing mode]]. In older designs the portions of the CPU responsible for instruction decoding were unchangeable hardware devices. However, in more abstract and complicated CPUs and ISAs, a [[microprogram]] is often used to assist in translating instructions into various configuration signals for the CPU. This microprogram is sometimes rewritable so that it can be modified to change the way the CPU decodes instructions even after it has been manufactured.
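
As a concrete example, a MIPS32 I-type instruction like the one in the diagram above can be decoded with shifts and masks: the top six bits are the opcode, followed by two five-bit register fields and a sixteen-bit immediate operand. The particular encoding below was constructed for this sketch:

<syntaxhighlight lang="c">
#include <stdint.h>
#include <stdio.h>

int main(void) {
    uint32_t instr  = 0x21290040;                 /* encodes addi $t1, $t1, 64 */
    uint32_t opcode = instr >> 26;                /* bits 31..26: operation (8 = addi) */
    uint32_t rs     = (instr >> 21) & 0x1F;       /* bits 25..21: source register */
    uint32_t rt     = (instr >> 16) & 0x1F;       /* bits 20..16: target register */
    int16_t  imm    = (int16_t)(instr & 0xFFFF);  /* bits 15..0: immediate operand */
    printf("opcode=%u rs=%u rt=%u imm=%d\n", opcode, rs, rt, imm);
    return 0;  /* prints: opcode=8 rs=9 rt=9 imm=64 */
}
</syntaxhighlight>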
 
[[Image:CPU block diagram.svg|right|thumb|210px|Block diagram of a simple CPU]]
 
After the fetch and decode steps, the '''execute''' step is performed. During this step, various portions of the CPU are connected so they can perform the desired operation. If, for instance, an addition operation was requested, an [[arithmetic logic unit]] ('''ALU''') will be connected to a set of inputs and a set of outputs. The inputs provide the numbers to be added, and the outputs will contain the final sum. The ALU contains the circuitry to perform simple arithmetic and logical operations on the inputs (like addition and [[bitwise operation]]s). If the addition operation produces a result too large for the CPU to handle, an [[arithmetic overflow]] flag in a flags register may also be set (see the discussion of integer range below).
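
A minimal sketch of the execute step for an addition, assuming a 32-bit ALU and a single overflow flag: signed overflow has occurred exactly when both inputs have the same sign but the sum's sign differs. The names are assumptions for the example:

<syntaxhighlight lang="c">
#include <stdint.h>
#include <stdbool.h>
#include <stdio.h>

static bool overflow_flag; /* stands in for a bit in a flags register */

static uint32_t alu_add(uint32_t a, uint32_t b) {
    uint32_t sum = a + b; /* wraps modulo 2^32, like a hardware adder */
    /* overflow if a and b share a sign bit that differs from the sum's */
    overflow_flag = (~(a ^ b) & (a ^ sum)) >> 31;
    return sum;
}

int main(void) {
    uint32_t r = alu_add(0x7FFFFFFF, 1); /* INT32_MAX + 1 overflows */
    printf("sum=0x%08X overflow=%d\n", r, overflow_flag);
    return 0;
}
</syntaxhighlight>
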
The way a CPU represents numbers is a design choice that affects the most basic ways in which the device functions. Some early digital computers used an electrical model of the common [[decimal]] (base ten) [[numeral system]] to represent numbers internally. A few other computers have used more exotic numeral systems like [[ternary logic|ternary]] (base three). Nearly all modern CPUs represent numbers in [[Binary numeral system|binary]] form, with each digit being represented by some two-valued physical quantity such as a "high" or "low" [[volt]]age.<ref>The physical concept of [[voltage]] is analog by nature, with a practically infinite range of possible values. For the purpose of physically representing binary numbers, set ranges of voltages are defined as one or zero. These ranges are usually influenced by the operational parameters of the switching elements used to create the CPU, such as a [[transistor]]'s threshold level.</ref>
 
[[Image:MOS_6502AD_4585_top.jpg|250px|thumb|left|[[MOS Technology 6502|MOS 6502]] microprocessor in a [[dual in-line package]], an extremely popular 8-bit design.]]
 
Related to number representation is the size and precision of numbers that a CPU can represent. In the case of a binary CPU, a '''bit''' refers to one significant place in the numbers a CPU deals with. The number of bits (or numeral places) a CPU uses to represent numbers is often called "[[Word (computer science)|word size]]", "bit width", "data path width", or "integer precision" when dealing with strictly integer numbers (as opposed to floating point). This number differs between architectures, and often within different parts of the very same CPU. For example, an [[8-bit]] CPU deals with a range of numbers that can be represented by eight binary digits (each digit having two possible values), that is, 2<sup>8</sup> or 256 discrete numbers. In effect, integer size sets a hardware limit on the range of integers the software run by the CPU can utilize.<ref>While a CPU's integer size sets a limit on integer ranges, this can be (and often is) overcome using a combination of software and hardware techniques. By using additional memory, software can represent integers many orders of magnitude larger than the CPU can. Sometimes the CPU's ISA will even facilitate operations on integers larger than it can natively represent by providing instructions to make large integer arithmetic relatively quick. While this method of dealing with large integers is somewhat slower than utilizing a CPU with a higher integer size, it is a reasonable trade-off in cases where natively supporting the full integer range needed would be cost-prohibitive. See [[Arbitrary-precision arithmetic]] for more details on purely software-supported arbitrary-sized integers.</ref>
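
The software workaround described in the footnote can be sketched as follows. Assuming a hypothetical 32-bit CPU, a 64-bit integer can be held in two machine words and added with an explicit carry, which is the operation an add-with-carry instruction would accelerate; the type and function names are invented for the example:

<syntaxhighlight lang="c">
#include <stdint.h>
#include <stdio.h>

typedef struct { uint32_t lo, hi; } u64pair; /* 64 bits in two 32-bit words */

static u64pair add64(u64pair a, u64pair b) {
    u64pair r;
    r.lo = a.lo + b.lo;
    uint32_t carry = r.lo < a.lo;  /* low word wrapped, so carry into high word */
    r.hi = a.hi + b.hi + carry;
    return r;
}

int main(void) {
    u64pair a = { 0xFFFFFFFFu, 0 }; /* 2^32 - 1 */
    u64pair b = { 1, 0 };
    u64pair s = add64(a, b);        /* expect hi=1, lo=0, i.e. 2^32 */
    printf("hi=0x%08X lo=0x%08X\n", s.hi, s.lo);
    return 0;
}
</syntaxhighlight>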
 
=== Clock rate ===
[[Image:1615a_logic_analyzer.jpg|thumb|250px|right|[[Logic analyzer]] showing the timing and state of a synchronous digital system.]]
{{main|Clock rate}}
 
 
=== Parallelism ===
[[Image:Nopipeline.png|thumb|300px|right|Model of a subscalar CPU. Notice that it takes fifteen cycles to complete three instructions.]]
{{main|Parallel computing}}
 
 
==== ILP: Instruction pipelining and superscalar architecture ====
[[Image:Fivestagespipeline.png|thumb|300px|left|Basic five-stage pipeline. In the best-case scenario, this pipeline can sustain a completion rate of one instruction per cycle.]]
{{main|Instruction pipelining|Superscalar}}
 
Pipelining does, however, introduce the possibility of a situation in which the result of the previous operation is needed to complete the next operation, a condition often termed a data dependency conflict. To cope with this, additional care must be taken to check for these sorts of conditions and delay a portion of the instruction pipeline if they occur. Naturally, accomplishing this requires additional circuitry, so pipelined processors are more complex than subscalar ones (though not significantly so). A pipelined processor can become very nearly scalar, inhibited only by pipeline stalls (an instruction spending more than one clock cycle in a stage).
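
Such a dependency check can be sketched as follows: stall when the instruction currently being decoded reads a register that the older instruction still in execution will write. The structure and field names are assumptions for illustration, not any real pipeline's interlock logic:

<syntaxhighlight lang="c">
#include <stdint.h>
#include <stdbool.h>
#include <stdio.h>

typedef struct {
    uint8_t dest;        /* register the instruction writes */
    uint8_t src1, src2;  /* registers the instruction reads */
    bool    writes_reg;  /* does it write a register at all? */
} decoded_instr;

/* Read-after-write check: stall the younger instruction if it reads
   a register the older instruction has not yet written back. */
static bool must_stall(const decoded_instr *in_execute,
                       const decoded_instr *in_decode) {
    return in_execute->writes_reg &&
           (in_execute->dest == in_decode->src1 ||
            in_execute->dest == in_decode->src2);
}

int main(void) {
    decoded_instr older   = { .dest = 3, .src1 = 1, .src2 = 2, .writes_reg = true };
    decoded_instr younger = { .dest = 4, .src1 = 3, .src2 = 0, .writes_reg = true };
    printf("stall=%d\n", must_stall(&older, &younger)); /* prints 1: younger reads r3 */
    return 0;
}
</syntaxhighlight>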
 
[[Image:Superscalarpipeline.png|thumb|300px|right|Simple superscalar pipeline. By fetching and dispatching two instructions at a time, a maximum of two instructions per cycle can be completed.]]
 
Further improvement upon the idea of instruction pipelining led to the development of a method that decreases the idle time of CPU components even further. Designs that are said to be '''superscalar''' include a long instruction pipeline and multiple identical execution units. In a superscalar pipeline, multiple instructions are read and passed to a dispatcher, which decides whether the instructions can be executed in parallel (simultaneously). If so, they are dispatched to available execution units, allowing several instructions to execute at once. In general, the more instructions a superscalar CPU is able to dispatch simultaneously to waiting execution units, the more instructions will be completed in a given cycle.
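
As an illustration of the dispatcher's decision, the following sketch tests whether two decoded instructions can issue together: they must be independent (the second must not read the first's result) and enough execution units of the right kind must be free. The two-wide width, the field names and the unit classes are all assumptions for the example, and a real dispatcher would check further hazard classes:

<syntaxhighlight lang="c">
#include <stdint.h>
#include <stdbool.h>
#include <stdio.h>

typedef struct {
    uint8_t dest, src1, src2; /* register operands */
    uint8_t unit;             /* execution-unit class the instruction needs */
} instr;

static bool can_dual_issue(const instr *first, const instr *second,
                           const int *free_units /* free count per unit class */) {
    /* independence: the second instruction must not read the first's result
       (a real dispatcher would also check write-after-write conflicts) */
    bool independent = second->src1 != first->dest &&
                       second->src2 != first->dest;
    /* structural check: a suitable execution unit must be free for each */
    bool units_free  = first->unit == second->unit
                       ? free_units[first->unit] >= 2
                       : free_units[first->unit] >= 1 &&
                         free_units[second->unit] >= 1;
    return independent && units_free;
}

int main(void) {
    int free_units[2] = { 2, 1 }; /* e.g. two ALUs, one load/store unit */
    instr a = { .dest = 1, .src1 = 2, .src2 = 3, .unit = 0 };
    instr b = { .dest = 4, .src1 = 5, .src2 = 6, .unit = 0 };
    printf("dual issue: %d\n", can_dual_issue(&a, &b, free_units)); /* prints 1 */
    return 0;
}
</syntaxhighlight>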