Содержание
- 2. General Information Web Page: http://www.cs.purdue.edu/homes/cs250 Office: LWSN1185 E-mail: grr@cs.purdue.edu Textbook: Essentials of Computer Architecture D. E.
- 3. Mailing List They wil be created automatically. You don’t need to use mailer.
- 4. PSOs There is no PSO the first week. The projects will be explained in the PSOs.
- 5. Grading Grade allocation Midterm: 25% Final: 25% Projects: 50% Exams also include questions about the projects.
- 6. Course Organization 1. Basics Fundamentals of Digital Logic Data Representation 2. Processors Types of Processors Instruction
- 7. Course Organization 3. Memory Types of Memory Physical and Virtual Memory Caching 4.Input/Output Devices and Interfaces
- 8. Course organization 5. Advanced Topics Parallelism Performance Measurement Architectural Hierarchy
- 9. Approach We will cover Computer Architecture From the programmers point of view. How it influences the
- 10. II. Fundamentals of Digital Logic
- 11. Voltage and Current Voltage Measure of potential Force It is measured in Volts Current Measure of
- 12. Voltage Voltage is measured with a voltmeter across two points. Typical digital circuits work with 5
- 13. Transistor Building block of digital circuits Acts like a switch A transistor has three connections: Emitter
- 14. Boolean Logic It gives the formal basis for digital circuits It uses three basic functions AND
- 15. Boolean Logic You will find that Nand and Nor Gates are very popular. By using them,
- 16. Boolean Logic In digital circuits 0 and 1 are represented as 0 = 0 volts 1
- 17. Boolean Logic Example: A B AB’ A’B +AB’ A’ B A’ B’ B A Truth Table
- 18. Truth Tables to Boolean Expressions From a Truth table you can create a boolean expression You
- 19. Sum of Products To create a sum of products from a truth table, take the 1s
- 20. Product of Sums To create a product of sums from a truth table, take the 0s
- 21. Example: Implementing add Assume we want to add two numbers where each number will be one
- 22. Implementing Add To implement an adder for 8 bits or 32 bits, many more gates are
- 23. Boolean Algebra You can manipulate the boolean expressions like normal algebraic expressions. Properties Commutative: AB =
- 24. De Morgan’s Law Negation of expressions (A+B)’ = A’B’ (AB)’ = A’ + B’
- 25. Boolean Expression Reduction You can simplify Boolean expressions to use fewer gates: Example: z = a’b’c
- 26. Karnaugh Maps To make the simplification of boolean expressions easier, we can use Karnaugh maps. A
- 27. Karnaugh Table example 1 Given an expression r = x’yz’+xyz’+x’y’z+x’yz Build a 3 variable Karnaugh map
- 28. Karnaugh Table example 2 Given an expression r = x’yz’k’+x’yz’k+xyz’k’+xyz’k+ xyzk+xyzk’+x’y’zk’+xy’zk’ Build a 4 variable Karnaugh
- 29. Using only NAND Gates Very often you build the circuits using only NAND gates. To convert
- 30. XOR Using only NAND gates +5V +5V x y 10K 10K +5V 10K z x XOR
- 31. Examples of Gates on 7400-Series Chips
- 32. Flip Flops Basic unit of memory S(set) Q R(reset) Q’ Truth Table S R Q Q’
- 33. Flip Flops. Keep Current value S(set) Q R(reset) Q’ 0 0 0 1 S(set) Q R(reset)
- 34. Flip Flops. Reset and Set S(set) Q R(reset) Q’ 1 0 0 1 S(set) Q R(reset)
- 35. Binary Counter Counts pulses (transitions from 0 to 1) Output is a binary number Contains a
- 36. Binary Counter (4 bits) t 0 5 A B C Truth Table In A B C
- 37. Clock It is an electronic circuit that produces a sequences of 0 1 0 1 0
- 38. Demultiplexor It is a circuit used to select one output A=1 B=1 C=0 0 0 0
- 39. Example of Circuit to Execute a Sequence of Steps Position drill Start drill Drill hole Remove
- 40. Unused Gates Since a chip may contain multiple gates, it is possible to use some of
- 41. Classification of Technologies Small Scale Integration (SSI) Basic Boolean Gates Medium Scale Integration (MSI) Intermediate logic
- 42. III. Data and Program Representation
- 43. Memory of a Program A program sees memory as an array of bytes that goes from
- 44. Memory Sections The memory is organized into sections called “memory mappings”. Stack Text Data Bss Heap
- 45. Memory Sections Each section has different permissions: read/write/execute or a combination of them. Text- Instructions that
- 46. Memory Sections Dynamic libraries – They are libraries shared with other processes. Each dynamic library has
- 47. Example Program hello.c int a = 5; // Stored in data section int b[20]; // Stored
- 48. Memory Gaps Between each memory section there may be gaps that do not have any memory
- 49. What is a program? A program is a file in a special format that contains all
- 50. Executable File Formats There are different executable file formats ELF – Executable Link File It is
- 51. Building a Program The programmer writes a program hello.c The preprocessor expands #define, #include, #ifdef etc
- 52. Building a program The linker puts together all object files as well as the object files
- 53. Building a Program Programmer C Preprocessor Compiler(cc) Optimizer Assembler (as) (static) Linker (ld) Editor hello.c hello.i
- 54. Original file hello.c #include main() { printf("Hello\n"); }
- 55. After preprocessor gcc -E hello.c > hello.i (-E stops compiler after running preprocessor) hello.i: /* Expanded
- 56. After assembler gcc -S hello.c (-S stops compiler after assembling) hello.s: .align 8 .LLC0: .asciz "Hello\n"
- 57. After compiling “gcc -c hello.c” generates hello.o hello.o has undefined symbols, like the printf function call
- 58. After linking “gcc –o hello hello.c” generates the hello executable Printf does not have a value
- 59. Loading a Program The loader is a program that is used to run an executable file
- 60. Loading a Program It also writes (resolves) any values in the executable to point to the
- 61. Loading a Program Loader (runtime linker) (/usr/lib/ld.so.1) Executable File Executable in memory Shared libraries (.so, .dll)
- 62. Static and Shared Libraries Shared libraries are shared across different processes. There is only one instance
- 63. Memory and Pointers A pointer is a variable that contains an address in memory. In a
- 64. Ways to get a pointer value 1. Assign a numerical value into a pointer Char *
- 65. Ways to get a pointer value 2. Get memory address from another variable: int *p; int
- 66. Ways to get a pointer value 3. Allocate memory from the heap int *p p =
- 67. Ways to get a pointer value You can pass a pointer as a parameter to a
- 68. Common Problems with Pointers When using pointers make sure the pointer is pointing to valid memory
- 69. Printing Pointers It is useful to print pointers for debugging char*i; char buff[10]; printf("ptr=%d\n", &buff[5]) Or
- 70. sizeof() operator in Pointers The size of a pointer is always 4 bytes in a 32
- 71. String Operations A string is represented in memory as a sequence of characters in ASCII terminated
- 72. String Operations The C library (libc) provides simple string functions to manipulate strings such as: char
- 73. String Operations In general the string functions will not allocate memory. You have to allocate enough
- 74. Using Pointers to Optimize Execution Assume the following function that adds the sum of integers in
- 75. Using Pointers to Optimize Execution Now the equivalent code using pointers int sum(int* array, int n)
- 76. Using Pointers to Optimize Execution When you increment a pointer to integer it will be incremented
- 77. Array Operator Equivalence We have the following equivalences: int a[20]; a[i] - is equivalent to *(a+i)
- 78. 2D Array. 1st Implementation 1st approach Normal 2D array. int a[4][3]; a[0][0]:100: a[0][1]:104: a[0][2]:108: a[1][0]:112: a[1][1]:116:
- 79. 2D Array 2nd Implementation 2nd approach Array of pointers to rows int*(a[4]); for(int i=0; i }
- 80. 2D Array 2nd Implementation 2nd approach Array of pointers to rows (cont) a[0]:100: a[1]:104: a[2]:108: a[3]:112:
- 81. 2D Array 3rd Implementation 3rd approach. a is a pointer to an array of pointers to
- 82. 2D Array 3rd Implementation a is a pointer to an array of pointers to rows. (cont.)
- 83. Advantages of Pointer Based Arrays You don’t need to know in advance the size of the
- 84. Advantages of Pointer Based Arrays Example: Triangular matrix a[0]:100: a[1]:104: a[2]:108: a[3]:112: a[1][0] a[0][0] a[2][0] a[3][0]
- 85. Pointers to Functions Pointers to functions are often used to implement Polymorphism in “C”. Polymorphism: Being
- 86. An Array Mapper typedef void (*FuncPtr)(int a); void intArrayMapper( int *array, int n, FuncPtr func )
- 87. Using the Array Mapper int a[ ] = {3,4,7,8}; main( ){ // Print the values in
- 88. A More Generic Mapper typedef void (*GenFuncPtr)(void * a); void genericArrayMapper( void *array, int n, int
- 89. Using the Generic Mapper void sumIntGen( void *pVal ){ //pVal is pointing to an int //Get
- 90. Using the Generic Mapper int a[ ] = {3,4,7,8}; main( ) { // Print integer values
- 91. Swapping two Memory Ranges In the lab1 you will implement a sort function that will sort
- 92. String Comparison in Sort Function In lab1, in your sort function, when sorting strings, you will
- 93. Bits and Bytes Bit It stores 1 or 0 Byte It is a group of 8
- 94. Interpretation of bits Sometimes device registers are mapped to memory. This is called Memory Mapped I/O.
- 95. Interpretation of bits Combination of bits are used as integers 27 26 25 24 23 22
- 96. Hexadecimal Notation Compact form to represent binary numbers It uses base 16. 4 bits represent an
- 97. Hexadecimal Notation Example: Hexadecimal: 0xF4534004 Binary: 1111 0100 0101 0011 0100 0000 0000 0100 Hexadecimal F
- 98. Example of Character Encodings EBCDIC ASCII Unicode
- 99. EBCDIC Extended Binary Coded Decimal Interchange Format It was created by IBM in the 1960s No
- 100. ASCII American Standard Code for Information Exchange Used widely in UNIX and PCs It uses 7
- 101. ASCII Table http://www.ascii.ws/ascii-chart.html
- 102. UNICODE Each character is 16 bits long (2 bytes) It is used to represent characters from
- 103. Representation of Strings In a “C” program a string is a sequence of characters delimited by
- 104. Integer Representation in Binary Each binary integer is represented in k bits where k is 8,
- 105. Integer Representation Example 10010101 = 1*2^7 + 1*2^4+1*2^2+1*2^0 = = 128 + 16 + 4 +
- 106. Binary Integer Addition Same as decimal addition: Use S1, S2 and Carry (C) to compute R
- 107. Binary Integer Addition 100 C (Carry) 1011 S1 (11) +0110 S2 (06) 01 R Truth Table
- 108. Binary Integer Addition 1100 C (Carry) 1011 S1 (11) +0110 S2 (06) 001 R Truth Table
- 109. Binary Integer Addition 11100 C (Carry) 1011 S1 (11) +0110 S2 (06) 0001 R Truth Table
- 110. Binary Integer Addition 11100 C (Carry) 1011 S1 (11) +0110 S2 (06) 10001 R (17) Truth
- 111. Binary Integer Subtraction Same as decimal subtraction: Use S1, S2 and Carry (C) to compute R
- 112. Binary Integer Subtraction 000 C (Carry) 1011 S1 (11) -0110 S2 (06) 01 R Truth Table
- 113. Binary Integer Subtraction 1000 C (Carry) 1011 S1 (11) -0110 S2 (06) 101 R Truth Table
- 114. Binary Integer Subtraction 01000 C (Carry) 1011 S1 (11) -0110 S2 (06) 0101 R Truth Table
- 115. Binary Multiplication Same as decimal multiplication Just need to memorize multiplication table for 0 and 1
- 116. Binary Multiplication 1011 x 110 0000
- 117. Binary Multiplication 1011 x 110 0000 +1011 10110
- 118. Binary Multiplication 1011 (11) x 110 ( 6) 0000 +1011 10110 +1011 1000010 (64+2=66)
- 119. Binary Multiplication. Another example 1001 (9) x 101 (5) 1001
- 120. Binary Multiplication. Another example 1001 (9) x 101 (5) 1001 +0000 01001
- 121. Binary Multiplication. Another example 1001 (9) x 101 (5) 1001 +0000 01001 +1001 101101 (32+8+4+1=45)
- 122. Binary Division Same as decimal division Just need to memorize multiplication table for 0 and 1
- 123. Binary Division ___1__ 100 | 10110 -100 001
- 124. Binary Division ___10__ 100 | 10110 -100 0011
- 125. Binary Division ___101_ (5) (4) 100 | 10110 (16+4+2=22) -100 00110 - 100 010 (2)
- 126. Binary Representation of Negative Integer Numbers Three representations Sign and Magnitude 1-complement 2-complement
- 127. Sign and Magnitude Representation 1 bit for sign Other bits for the absolute value Example: +5
- 128. 1-Complement Negative numbers are obtained by inverting all bits. Example: +5 = 00000101 -5 = 11111010
- 129. 2-Complement Negative numbers are obtained by subtracting 1 from the positive number and inverting the result.
- 130. 2-Complement 2 complement representation is widely used because the same piece of hardware used for positive
- 131. Shift Operator and Signed ints When signed numbers are shifted right, the sign number is extended
- 132. Floating Point Representation Store both the exponent and mantissa Example: 3.5x10-16 In binary the representation uses
- 133. Floating Point Representation The most common is the IEEE-754 standard s 31 e 23 m 0
- 134. Floating Point Representation Example Double value in memory (in hex): 4024 0000 0000 0000 Binary: 0100
- 135. Byte Order There are two byte orders: Little Endian – Least significant byte of the integer
- 136. Representation of 0x05 Little Endian Big Endian 0 1 2 3 0x05 0x00 0x00 0x00 0
- 137. How to know if it is Little or Big Endian int isLittleEndian() { int i =
- 138. Structures Structures are a combination in memory of primitive types. Example: struct { int i; float
- 139. Structures and Alignment Integers, floats, and pointers have to be aligned to 4 bytes (in a
- 140. Example of Alignment in Structures Example: struct { char ch1; int r; char ch2; char *
- 141. IV. Variety of Processors
- 142. Von Neumann Architecture Modern processors follow this design Programs are stored in memory, in the same
- 143. Von Neumann Architecture A computer has an address bus and a data bus that are used
- 144. Von Newman Architecture CPU RAM ROM EthernetCard USB Controler (mouse, kbd) Hard Drive CD/DVD Drive Address
- 145. Processors Digital device that performs computation using multiple steps. Types of Processors: Fixed Logic – Least
- 146. Components of a CPU Controller ALU – Arithmetic and Logical Unit Registers - Local Data Storage
- 147. Components of the CPU ALU Controller Registers External Interface Internal Connections Address Bus Data Bus
- 148. Components of the CPU Controller Controls the execution Initiates the sequence of steps Coordinates other components
- 149. Components of the CPU Registers Holds arguments and results of the operations Internal Connections Transfers values
- 150. ALU – Arithmetic Logic Unit It is the part of the CPU that performs the Arithmetic
- 151. Processor Categories Coprocessors Operates in conjunction with other processor. Example: Floating Point Accelerator. Microcontroller Small programmable
- 152. Processor Categories Embedded System Processor It is able to run sophisticated tasks More powerful than a
- 153. Evolution of Processor Technologies Discrete Logic Use TTL Gates etc used to implement processor. It could
- 154. Fetch-Execute Cycle This is the basics for programmable processors. It allows moving through the program steps
- 155. Clock Rate and Instruction Rate Clo ck rate It is the rate at which gates and
- 156. Starting a Processor When the CPU is powered on or when reset The CPU is initialized
- 157. Stopping a Processor When the application finishes or it is waiting for an event, The program
- 158. V. Processor Types and Instruction Sets
- 159. How to Choose an Instruction Set A small set is easy to implement but inconvenient for
- 160. Parts of an Instruction Opcode Specifies the instruction to be executed Operands Specifies the registers, memory
- 161. Instruction Length Fixed Length Every instruction has the same length Reduces the complexity of the hardware
- 162. General Purpose Registers They are used to store operands and results Each register has a small
- 163. Example of Using Registers Load A from location 0x100 and B from location 0x104. Store A+B
- 164. Types of Instruction Sets CISC Complex Instruction Set Computer RISC Reduced Instruction Set Computer
- 165. CISC Instruction Set It contains many instructions, often hundreds. Some instructions take longer than others to
- 166. RISC Instruction Set It contains few instructions 32 or 64 Instructions have a fixed length Each
- 167. Execution Pipeline Hardware optimization technique Allows the execution of instructions in parallel. Used by RISC architectures
- 168. Execution Pipeline An instruction is executed by the following steps: Fetch the next instruction Examine the
- 169. Execution Pipeline Fetch Instruction Examine Opcode Fetch Operands Perform Operation Store Result Stage 1 Stage 2
- 170. Execution Pipeline Each stage operate in parallel with a different instruction. As a result, an N
- 171. Pipeline Example Clock Stage1 Stage2 Stage3 Stage4 Stage5 Inst1 Inst2 Inst1 Inst3 Inst2 Inst1 Inst4 Inst3
- 172. Pipeline Control The pipeline is executed by the processor without the programmers intervention. The programmer can
- 173. Example of a pipe stall Assume the following operations: Instruction K: C Instruction K+1: D The
- 174. Example of a pipe stall Clock Stage1 Stage2 Stage3 Stage4 Stage5 Instk instk-1 instk-2 instk-3 instk-4
- 175. Pipe Stall Some reasons of a pipe stall are: Access to RAM Call an instruction that
- 176. Avoiding Pipe Stalls A programmer can delay the use of results by reordering the instructions:
- 177. Avoiding Stalls Program must be written to accommodate instruction pipeline To minimize stalls – Avoid introducing
- 178. Avoiding Stalls Example Of Avoiding Stalls (a) (b) C ← add A B C ← add
- 179. Avoiding Stalls Although hardware that uses an instruction pipeline will not run at full speed unless
- 180. VII. CPUs Microcode Protection and Protection Modes
- 181. User and Kernel Mode, Interrupts, and System Calls
- 182. Computer Architecture Review Most modern computers use the Von Newman Architecture where both programs and data
- 183. Computer Architecture Review CPU RAM ROM EthernetCard USB Controler (mouse, kbd) Hard Drive CD/DVD Drive Address
- 184. Kernel and User Mode Kernel Mode When the CPU runs in this mode: It can run
- 185. Kernel and User Mode User Mode When the CPU runs in this mode: The CPU can
- 186. Kernel and User Mode When the OS boots, it starts in kernel mode. In kernel mode
- 187. Kernel and User Mode User programs run in user mode. The programs switch to kernel mode
- 188. Kernel and User Mode Kernel Mode User Mode
- 189. Kernel and User Mode Separation of user/kernel mode is used for: Security: The OS calls in
- 190. Interrupts An interrupt is an event that requires immediate attention. In hardware, a device sets the
- 191. Steps of Servicing an Interrupt The CPU saves the Program Counter and registers in execution stack
- 192. Running with Interrupts Interrupts allow CPU and device to run in parallel without waiting for each
- 193. Poling Alternatively, the OS may decide not use interrupts for some devices and wait in a
- 194. Synchronous vs. Asynchronous Poling is also called Synchronous Processing since the execution of the device is
- 195. Interrupt Vector It is an array of pointers that point to the different interrupt handlers of
- 196. Interrupts and Kernel Mode Interrupts run in kernel mode. Why? An interrupt handler must read device/CPU
- 197. Types of Interrupts 1. Device Interrupts generated by Devices when a request is complete or an
- 198. Types of Interrupts 2. Math exceptions generated by the CPU when there is a math error.
- 199. Types of Interrupts 4. Software Interrupt generated by software with a special assembly instruction. This is
- 200. System Calls System Calls is the way user programs request services from the OS System calls
- 201. Why do we use Software Interrupts for syscalls instead of function calls? Software Interrupts will switch
- 202. System Calls Only operations that need to be executed by the OS in kernel mode are
- 203. System Calls Libc (the C library) provides wrappers for the system calls that eventually generate the
- 204. System Calls The software interrupt handler for system calls has entries for all system calls. The
- 205. Syscall Security Enforcement For example, for the open syscall the following is checked in the syscall
- 206. Syscall details Te list of all system calls can be found in /usr/include/sys/syscall.h #define SYS_exit 1
- 207. Syscall Error reporting When an error in a system call occurrs, the OS sets a global
- 208. System Calls and Interrupts Example The user program calls the write(fd, buff, n) system call to
- 209. System Calls and Interrupts Example 4. The OS tells the hard drive to write the buffer
- 210. ARM Assembly Language
- 211. ARM Architecture ARM- Acorn RISC Machine ARM is an architecture created by “ARM Holdings” ARM Holdings
- 212. ARM CPUs Chips using ARM architecture A4, A5, A6, A7 Iphone/Ipad by Apple Qualcomm’s Snapdragon Samsung
- 213. ARM Assembly Language See: http://www.cs.purdue.edu/homes/cs250/LectureNotes/arm_inst.pdf and http://www.cs.purdue.edu/homes/cs250/LectureNotes/arm-ref.pdf
- 214. Example Assembly Program test1.s: .text .global main main: stmfd sp!, {fp, lr} ldr r0, .L2 bl
- 215. Running the Assembler pi@raspberrypi:~/cs250/lab6-src$ gcc -o test1 test1.s pi@raspberrypi:~/cs250/lab6-src$ ./test1 Hello world pi@raspberrypi:~/cs250/lab6-src$
- 216. Assembly Code in Hexadecimal pi@raspberrypi:~/cs250/lab6-src$ gcc -Xassembler -a -o test1 test1.s > out pi@raspberrypi:~/cs250/lab6-src$ vi out
- 217. Calling Conventions r0 to r3: They are used to pass arguments to a function. r0 is
- 218. Condition Code Flags This flags are stored in the PSR- Processor Status Register They are updated
- 219. Updating the Condition Code Flags CMP reg1, reg2 Performs reg1-reg2 It updates N, Z, C, V
- 220. ARM Instructions ARM assembly language reference card MOVcdS reg, arg copy argument (S = set flags)
- 221. ARM Instructions TSTcd reg, arg update flags based on bitwise AND TEQcd reg, arg update flags
- 222. ARM Instructions Add-Ons: Conditions Every instruction may be have a condition appended: Example: MOV r1, r2
- 223. List of Conditions that Can be Added to Instructions AL or omitted always EQ equal (zero)
- 224. Example: Adding two numbers Implement the following program in assembler: #include int a; int b; int
- 225. Example: Adding two numbers in Assembly using Registers /* add-reg.s Adding two numbers using registers */
- 226. Adding two numbers in Assembly using Registers (cont.) .global main /* main() { */ main: stmfd
- 227. Adding two numbers in Assembly using Registers (cont.) pi@raspberrypi:~/cs250/lab6-src$ gcc -o add-reg add-reg.s pi@raspberrypi:~/cs250/lab6-src$ ./add-reg c=5
- 228. Example: Adding Two Numbers Using Global Vars /* add-global.s: Adding two numbers using global variables */
- 229. Adding Two Numbers Using Global Vars (cont.) .text /* We need to store the addresses of
- 230. Adding two numbers using Global Vars (cont.) ldr r2, addra /* Read a and put it
- 231. Adding two numbers using Global Vars (cont.) pi@raspberrypi:~/cs250/lab6-src$ gcc -o add-global add-global.s pi@raspberrypi:~/cs250/lab6-src$ ./add-global c=5
- 232. Example: Read two numbers and add them /* readadd.s Read two numbers and add them pi@raspberrypi:~/cs250/lab6-src$
- 233. Example: Read two numbers and add them (cont.) .section .data .align 2 /* Define variable 4
- 234. Example: Read two numbers and add them (cont.) .global main /* main() { */ main: stmfd
- 235. Example: Read two numbers and add them (cont.) ldr r1, addrb /* r1 ldr r1, [r1]
- 236. Example: Read two numbers and add them (cont.) pi@raspberrypi:~/cs250/lab6-src$ gcc -o readadd readadd.s pi@raspberrypi:~/cs250/lab6-src$ ./readadd a:
- 237. Mixing C and Assembly Language. Finding max in an array. max.c: #include #include extern int maxarray(int
- 238. Mixing C and Assembly Language. Finding max in an array (cont.) maxarray.s /* Find maximum of
- 239. Mixing C and Assembly Language. Finding max in an array (cont.) while: cmp r3,r1 /* while
- 240. Mixing C and Assembly Language. Finding max in an array (cont.) mov r5,#1 /* i++; */
- 241. Mixing C and Assembly Language. Finding max in an array (cont.) pi@raspberrypi:~/cs250/lab6-src$ gcc -o max max.c
- 242. Implementing String Functions in ARM Assembly Language There are two functions to load/store bytes: ldrb reg1,[reg2]
- 243. Example: strcat function in ARM assembly /* strcat-main.c:*/ #include #include extern char * mystrcat(char * a,
- 244. Example: strcat function in ARM assembly (cont.) /* Concat two strings a, b. Result is in
- 245. Example: strcat function in ARM assembly (cont.) skip2: /* Add char by char *b to *a
- 246. Example: strcat function in ARM assembly (cont.) pi@raspberrypi:~/cs250/lab6-src$ gcc -o strcat-main strcat-main.c mystrcat.s pi@raspberrypi:~/cs250/lab6-src$ ./strcat-main s1?
- 247. Midterm Review
- 248. Midterm Review II. Fundamentals of Digital Logic Voltage and Current Boolean Logic Truth Tables Implementation using
- 249. Midterm Review III. Data and program Representation Memory of a Program Memory Sections: text, Data, Bss,
- 250. Midterm Review III. Data and program Representation (cont.) Binary Addition , Subtraction, Multiplication and Division Sign
- 251. Midterm Review IV. Variety of Processors Von Neumann Architecture Address Bus and Data Bus Components of
- 252. Midterm Review V. Processor Types and Instruction Sets CISC and RISC Execution Pipeline Pipe Stall VI.
- 253. Midterm Review VI. Operand Addressing and Instruction Representation (cont.) Addressing modes: Immediate, Direct, Indirect VII. CPUs
- 254. Midterm Review VII. CPUs Microcode Protection and Protection Modes Interrupts Steps to service an interrupt Asynchronous
- 255. Midterm Review Microcode Vertical and Horizontal Microcode VIII. Assembly Language and Programming Paradigm ARM Assembly language
- 256. Midterm Material to Study Class Slides Midterm Exam Homework Review Projects Lab1-Lab6 ARM Assembly Language Everything
- 257. X86-64 Asembly Language
- 258. History Created by AMD to extend the x86 architecture to use 64 bits X86-64 is a
- 259. Register Assignment (Bryant/O’Hallaron “x86-Machine Level Programming”)
- 260. Using registers A function can use any of the argument registers. There is no need to
- 261. X86-64 C-Types (Bryant/O’Hallaron “x86-Machine Level Programming”)
- 262. Addressing modes Immediate Value movq $0x501208,%rdi #Put in register %rdi the # constant 0x501208 Direct Register
- 263. Example: Adding two numbers .text sum: # int sum(int a, int b) { movq %rdi, %rax
- 264. Assembling and running To assemble and run program: $sslab01 ~/cs250 $ gcc -o t1 t1.s $sslab01
- 265. Using the stack The stack is used to store the return address store local variables Save
- 266. Example of Using Stack long sum(long a, long b) { long tmp1 = a; long tmp2
- 267. Stack Layout Before calling sum: main() ret address %rsp After calling sum: main() ret address %rsp
- 268. Example of Using Stack .text .globl sum .type sum, @function sum: subq $24, %rsp # Create
- 269. Using flow control To test the difference conditions use: cmpq S2, S1 # S1 – S2:
- 270. Example of if statement: Obtaining maximum of two numbers long max(long a, long b) { long
- 271. Example of “if” statement: Obtaining maximum of two numbers .text .globl max max: cmpq %rsi, %rdi
- 272. Example of “while” statement: Obtaining the maximum of an array of numbers. // Finds the max
- 273. Example of “while” statement: Obtaining the maximum of an array of numbers. maxarray.s .text .globl maxarray
- 274. Example of “while” statement: Obtaining the maximum of an array of numbers using Array Dereferencing //
- 275. Same program using array dereferencing .text .globl maxarray # // Finds the max value in an
- 276. Running the program maxarray.c: long a[] = {4, 6, 3, 7, 9 }; main() { printf("maxarray(5,a)=%d\n",
- 277. Defining Global Variables in Assembly Language To create space for a global variable in assembly language
- 278. Example Using scanf in x86-64 assembler # Define global variable a in data section .data .comm
- 279. Using gdb with assembly programs Use the following instructions to debug assembly programs: stepi – steps
- 280. Using gdb (gdb) break main Breakpoint 1 at 0x4004f4 (gdb) run Starting program: /u/u3/grr/cs250/max warning: no
- 281. Lab7: Writing a Simple Compiler In this lab you will write a compiler for “Simple C”
- 282. Simple C Subset of C Only the following types are supported: long long* char char* void
- 283. Example Simple “C” program long fact(long n) { if (n==0) return 1; return n*fact(n-1); } void
- 284. Building a Compiler To help in the development of compilers, tools such as Lex and Yacc
- 285. Lex Lex takes as input a file simple.l with the regular expressions that describe the different
- 286. Yacc Yacc Takes as input a file simple.y with the grammar that describes the language. This
- 287. Lex and Yacc Interaction simple.l Parser simple.y Scanner lex simple.l lex.yy.c (lex.yy.c) yacc simple.y y.tab.c (y.tab.c)
- 288. Lex Input file simple.l It contains the regular expressions that describe the different tokens "return" {
- 289. Yacc input file simple.y It contains the grammar that describes the language. It also contains actions
- 290. Yacc input file simple.y program : function_or_var_list; function_or_var_list: function_or_var_list function | function_or_var_list global_var | /*empty */
- 291. Code generation You will need to add more actions to generate the code. An action is
- 292. Parsing tree The parser tries to parse the inout according to the grammar fact long n
- 293. Generating Code for Expressions Since the compiler will only parse the sources once, the easiest code
- 294. Example of stack based machine Arithmetic expression: 4+3*8 Equivalent in stack based machine: push 4 push
- 295. Parsing Expressions We need the hierarchy of logical, equality, relational, additive, multiplicative expressions to take into
- 296. Parsing Expressions equality_expr: relational_expr | equality_expr EQUALEQUAL relational_expr | equality_expr NOTEQUAL relational_expr ; relational_expr: additive_expr |
- 297. Parsing Expressions additive_expr: multiplicative_expr | additive_expr PLUS multiplicative_expr { fprintf(fasm, "\t# +\n"); } | additive_expr MINUS
- 298. Parsing Expressions primary_expr: STRING_CONST { // Add string to string table. // String table will be
- 299. How expressions are parsed expression logical_or_expr logical_and_expr equality_expr relational_expr multiplicative_expr additive_expr primary_expr additive_expr PLUS multiplicative_expr {fprintf(fasm,“+”)}
- 300. Expressions Code Generation You will use a Stack Virtual Machine. The bottom elements in the stack
- 301. Stack Representation
- 302. Stack Operations Depending of the stack position, the push or pop instruction will use a different
- 303. Implementing Variables Your compiler will handle three type of variables: Global variables Local Variables Arguments
- 304. Implementing Declaration of Global Variables The declaration of global variables are parsed in the rule: global_var:
- 305. Creating Space for Global Variables Global variables are stored in the data section. Generate code that
- 306. Getting a Value from a Global Variables The parse rule that should generate the code for
- 307. Saving into a global variable Storing into a global variable is implemented in the following rule
- 308. Getting a Value from a Global Variables Example: Simple C: x = 5 + g; Assembly
- 309. Implementing Declaration of Local Variables Declaration of local variables should be done in the production local_var:
- 310. Implementing Declaration of Local Variables Local variables are stored in the stack. We need to reserve
- 311. Implementing Declaration of Local Variables Two approaches: Reserve a constant maximum stack space all the time
- 312. Implementing Declaration of Local Variables Remember that the argument registers are overwritten during a function call.
- 313. Implementing Declaration of Local Variables Example: long add(long a, long b) { int c,d; c =
- 314. Implementing Declaration of Local Variables local_var_list: WORD { // first local variable local_vars_table[nlocals]=$ ; nlocals++; }
- 315. Generating code for while() long i = 0; main() { while (i i= i + 1;
- 316. Generating code for while() .data i: # long i = 0 ; .long 0 .text .globl
- 317. Generating code for while() # Body of while movq i,%rbx # i = i+1 movq $1,%r10
- 318. Passing Arguments for Calls When parsing argument to calls let the parser push the expressions to
- 319. Parsing Arguments for Calls Simple C: printf(“compute(3,4)=%d\n”, compute(3,4)); Assembly: movq string0, %rbx # push string0 -
- 320. Parsing Arguments for Calls The problem with nested calls is that a single “nargs” variable sis
- 321. Parsing Arguments for Calls Modify call_arg_list to count the arguments. The $ $ stores a variable
- 322. Parsing Arguments for Calls call : WORD LPARENT call_arguments RPARENT { int i; char * funcName
- 323. Virtual Memory Introduction VM allows running processes that have memory requirements larger than available RAM to
- 324. Virtual Memory Introduction VM only keeps in RAM the memory that is currently in use. The
- 325. Other Uses of Virtual Memory Another goal of VM is to speed up some of the
- 326. VM Implementations Process Swapping: The entire memory of the process is swapped in and out of
- 327. Paging Implementation of VM used by modern operating systems. The unit of memory that is swapped
- 328. Paging Address in bytes 0 4096 8192 232-1=4G-1 . . . VM Address in pages (page
- 329. Paging The Virtual Memory system will keep in memory the pages that are currently in use.
- 330. Backing Store Every page in the address space is backed by a file in disk, called
- 331. Swap Space Swap space is a designated area in disk that is used by the VM
- 332. Swap Space lore 208 $ df -k Filesystem kbytes used avail capacity Mounted on /dev/dsk/c0t0d0s0 1032130
- 333. Swap Space lore 206 $ /usr/sbin/swap -s total: 971192k bytes allocated + 1851648k reserved = 2822840k
- 334. Implementation of Paging Paging adds an extra indirection to memory access. This indirection is implemented in
- 335. Paging There are two types of addresses: Virtual Memory Addresses: the address that the CPU is
- 336. The Memory Management Unit CPU Memory Cache Memory Management Unit (MMU) Translation Look-Aside Buffer (TLB) Page
- 337. The Memory Management Unit The MMU has a Page Table Register that points to the current
- 338. The Memory Management Unit The value of the page table register changes every time there is
- 339. The Memory Management Unit To prevent looking up the page table at every memory access, the
- 340. VM to Hardware Address Translation The VM address is divided into two parts: Page number (higher
- 341. VM to Hardware Address Translation Page Number Offset VM Address Page Table Page Number Offset Hardware
- 342. VM to Hardware Address Translation (one-level page table) 0x2 0x345 VM Address 0x2345 Page Table 0x345
- 343. Two-Level page tables Using a one-level page table requires too much space: 220 entries * 4
- 344. Two-Level page tables The page number is divided into two parts: first-level page number and the
- 345. VM Address Translation VM address 1st level (i) 2nd level (j) offset 31 22 21 12
- 346. VM Address Translation VM address:0x00402657 i=0x1 2nd level offset 31 22 21 12 11 0 First
- 347. Example VMaddress: 0x00402 657 Physical Memory Address: 0x2 657 1.From the VM address find i, j,
- 348. Page Bits Each entry in the page table needs only 20 bits to store the page
- 349. Page Bits If a CPU operation exceeds the permissions of a page, the MMU will generate
- 350. Types of Page Fault Page Fault Page not Resident: Page not in Physical Memory, it is
- 351. Processing a Page Fault A program tries to read/write a location in memory that is in
- 352. Processing a Page Fault 4. The CPU looks up the interrupt handler that corresponds to the
- 353. Processing a Page Fault 6. Find a free page in physical memory. If there are no
- 354. Processing a Page Fault The page fault handler retries the offending instruction at the end of
- 355. Using mmap The mmap() function establishes a mapping between a process's address space and a file
- 356. Using mmap Memory 0x00000000 0xFFFFFFFF Disk File mmap ptr= 0x00020000 ptr = mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_SHARED,
- 357. Mmap parameters void *mmap(void *addr, size_t len, int prot, int flags, int fildes, off_t off); addr
- 358. Mmap parameters flags: - Semantics of the mapping: MAP_SHARED – Changes in memory will be done
- 359. Notes on mmap Writing in memory of a memory-mapped file will also update the file in
- 360. Uses of VM The VM is not only to be able to run programs that use
- 361. 1. Mmap the text segment of an executable or a shared library initially mmap does not
- 362. 1. Mmap the text segment of an executable or a shared library Physical pages where the
- 363. 1. Mmap the text segment of an executable or a shared library Virtual Memory 0x00000000 0xFFFFFFFF
- 364. 1. Mmap the text segment of an executable or a shared library Physical Memory text text
- 365. 2. Mmap the data segment of a program During the loading of a program, the OS
- 366. 2. Mmap the data segment of a program . Physical Memory Process 1 Virtual Memory Process
- 367. 2. Mmap the data segment of a program . Physical Memory Process 1 Virtual Memory Process
- 368. 3. Use of VM during fork to copy memory of the parent into the child After
- 369. 3. Use of VM during fork to copy memory of the parent into the child The
- 370. 3. Use of VM during fork to copy memory of the parent into the child .
- 371. 3. Use of VM during fork to copy memory of the parent into the child .
- 372. 4. Allocate zero-initialized memory. It is used to allocate space for bss, stack and sbrk() When
- 373. 4. Allocate zero-initialized memory. This is implemented by making the entries in the same page table
- 374. 4. Allocate zero-initialized memory. . Physical Memory Parent’s Virtual Memory 0’s page A 0’s page B
- 375. 4. Allocate zero-initialized memory. . Physical Memory Parent’s Virtual Memory 0’s page A 0’s page B
- 376. 5. Shared Memory Processes may communicate using shared memory Both processes share the same physical pages
- 377. 5. Shared Memory Physical Memory Process 1 page A page B page C page A page
- 378. 5. Shared Memory Physical Memory Process 1 page A X page B page C page A
- 379. Cache and Caching Continue Book Class slides http://www.cs.purdue.edu/homes/cs250/LectureNotes/book-slides.pdf Chapters XII, XIII, XIV, XV, XVI, XVII (12
- 380. Final Exam Review
- 381. Final Exam Review VIII. Assembly Language and Programming X86-Assembly Language Register Assignment Addressing Modes Using the
- 382. Final Exam Review XII Caches and Caching Importance of Caching Cache hit and cache miss Locality
- 383. Final Exam Review XIII Input/Output Concepts and Terminology Parallel Interface / Serial Interface Data Multiplexing XIV
- 384. Final Exam Review XVI. A Programmers View of I/O and Buffering Upper Half and Lower Half
- 385. Final Material to Study New Slides Old slides Everything up to and including chapter XIX in
- 386. Extra Slides
- 387. PIC 18 Introduction
- 388. PIC18 In the labs you will use the PIC18 This is a 8 bit processor that
- 389. PIC18 It follows a Harvard Architecture, that is, code and data are stored in separate memory.
- 390. Data Memory RAM is 4KB or 212 Therefore, pointers are 12 bits long The memory is
- 392. Memory Addresses The instructions that access data use a reduced pointer that is 8 bits long
- 393. Special Function Registers and General Function Registers The data memory is divided into SFRs – Special
- 395. Working Register (WREG) Most arithmetic and logical operations use a register called Working Register or WREG.
- 396. Processor Status Register (PSR) This is a register that contains the status of the Arithmetic Logical
- 397. Digital Input/Output PORTA, PORTB, PORTC, PORTD They are the registers that are mapped to the inputs/outputs
- 398. Digital Input/Output When configured PORTA as output for example 0 in bit RA0 of PORTA gives
- 399. Minimum PIC18
- 400. Addressing Modes Inherent (Immediate) Used in instructions that do not need an argument such as SLEEP
- 401. Indirect Addressing It uses the FSR registers and the INDF operand. There are four registers: FSR0,
- 402. Byte Operations d = 0 means destination is WREG. d = 1 means destination is a
- 403. Byte Operations (cont.) DECF f,d,a Decrement f DECFSZ f,d,a Dec f, skip if 0 DCFSNZ f,d,a
- 404. Byte Operations (cont.) NEGF f,a Negate f RLCF f,d,a Rotate left f thru Carry (not-quite multiply
- 405. Bit Operations (cont.) BCF f,b,a Bit clear, bit is indexed 0 to 7 BSF f,b,a Bit
- 406. Control Operations (cont.) BC n Branch if Carry, n is either a relative or a direct
- 407. Control Operations (cont.) CLRWDT Clear Watchdog Timer DAW Decimal Adjust W GOTO n Go to address
- 408. Operations with Literals (constants) ADDLW kk Add literal to W ANDLW kk And literal with W
- 409. Common PIC Assembler Constructions Including the PIC18 constant defined values Add #include “P18f4550.INC” at the beginning
- 410. Defining a variable To define space for a variable use “res”. Delay1 res 2 This defines
- 411. Using registers Loading a constant into WREG MOVLW 0x40 Moving the value from a register to
- 412. Adding and Subtracting Add reg1 and reg2. Put result in reg1 MOVF reg1,0 ; WREG =
- 413. Subroutines To call a subroutine … CALL foo ; Calling subroutine foo … … To define
- 414. If/else statements If (reg1 == 0x40) {XXX} else { YYY} MOVLW 0x40; WREG = reg1 CPFSEQ
- 415. Using Arrays Arrays are implemented using Indirect Indexing Defining an array of bytes called “myArray” of
- 416. Using Arrays Indexing the Array myArray[i]. Address is stored in FSR0 and then it is dereferenced
- 417. Simple Program. LED Blink #include "P18f4550.INC" CONFIG WDT=OFF; disable watchdog timer CONFIG MCLRE = ON; MCLEAR
- 418. Another Example. Rotate Segments #include "P18f4550.INC" CONFIG WDT=OFF; disable watchdog timer CONFIG MCLRE = ON; MCLEAR
- 419. Another Example (cont.) MainLoop: RRNCF PORTD ; Rotate bits in D. This causes the segments in
- 420. Example: Driving a Full-Color LED To drive the full-color LED you will use Pulse Width Modulation
- 421. Pulse Width Modulation Short Width = Low Intensity Long Width = High Intensity
- 422. Pulse Width Modulation Example MOVFF maxColor, redCount MOVFF maxColor, greenCount MOVFF maxColor, blueCount MainLoop: ;;;;;; RED
- 423. Lab5 Driving a Full Color LED Algorithm Examples are given that shows you how to drive
- 424. Algorithm for Driving Full Color LED Start Initialize Ports and Registers Initialize colors and counters MainLoop
- 426. Скачать презентацию