Sony Emotion Engine architectural overview
2002.5.20
Kim L. Vu

Acronyms
ALU    Arithmetic Logic Unit
COP    Coprocessor
DMAC   Direct Memory Access Controller
DSP    Digital Sound Processing
EE     Emotion Engine
EFU    Elementary Functional Unit
GIF    Graphics Interface
IPU    Image Processing Unit
MAC    Multiply-Accumulate
RDRAM  Rambus Dynamic RAM
SPRAM  Scratch-Pad RAM
VPU    Vector Processing Unit

Overview (PS2 architecture)
Emotion Engine components and their roles:
- CPU + FPU: program control
- VU0: thought simulation, AI, physics calculations (SIMD, VLIW architecture)
- VU1: fixed geometry calculations (SIMD, VLIW architecture)
- IPU: real-time image data decompression

Emotion Engine Features
- 300 MHz MIPS III CPU
  – Two-issue superscalar, 128-bit multimedia extensions
  – 16k, 2-way instruction cache
  – 8k, 2-way data cache
  – 16k "scratch pad" RAM
- Vector Units
  – Both have 4 FMACs + 1 FDIV
  – EFU (Elementary Functional Unit) in VU1: 1 FMAC + 1 FDIV
- 128-bit data bus
- IPU
  – MPEG2 decoder unit
- 10-channel DMA Controller

CPU Core Features
- MIPS III instruction set architecture
- 6-stage pipeline
  – PC Select | Fetch | Register | Exec | Cache Access | Write Back
- Two 64-bit integer ALUs
  – ALUs can be combined in "lock step" to execute 128-bit SIMD operations
- Load/Store Unit
- Branch Execution Unit
  – 64-entry two-branch prediction mechanism
- 32 128-bit registers

Vector unit performance
- Microarchitecturally identical
  – 4 FMACs
  – 1 FDIV
  – 1 Load/Store Unit
  – 1 ALU
  – 1 random number generator
- 2-issue VLIW (64-bit bundle)
- Two operating modes: VLIW and coprocessor mode
- Throughput
  – FMAC operation: 1 cycle
  – FDIV operation: 7 cycles
  – 4x4 matrix * vector: 4 cycles
  – 4x4 matrix * matrix: 16 cycles

VU Features
Feature             VU0                       VU1
Job                 Flexible calculations     Fixed 3D calculations
VLIW mode           Yes                       Yes
Coprocessor mode    Yes                       No
VPU components      4k instruction RAM        16k instruction RAM
                    4k data RAM               16k data RAM
                    VIF                       VIF
                                              GIF
                                              EFU
Performance         4 FMACs (2.4 Gflops)      4 FMACs (2.4 Gflops)
                    1 FDIV (0.04 Gflops)      1 FDIV (0.04 Gflops)
                                              1 EFU (0.64 Gflops)

VPU0 Design Strategy
- 2 modes: VLIW and coprocessor
- Runs mainly in coprocessor mode
  – Lower opcode always NOP
- Controlled by CPU
- Executes 32-bit MIPS coprocessor instructions
- Processes 4 parallel FP instructions

VPU1 Design Strategy
- Can only run in VLIW mode
- Executes 64-bit VLIW bundle
- Accessed by 3D display list
  – 3D display lists contain both instructions and data in the same structure

VU Instruction Formats
- Instruction bundle has 2 parts: "upper" (SIMD) + "lower"
- "Lower" execution unit
  – FP div/sqrt/reciprocal sqrt
  – Load/store
  – EFU (1 FMAC + 1 FDIV)
  – Jump/branch
  – Random number generator
- "Upper" execution unit
  – 4 parallel FP add/sub
  – 4 parallel FP mul
  – 4 parallel FP madd/msub

EE Teams
Team 1
- Handles physics, program control, AI and behavior calculations
- Members work closely together with each other
- Ease of communication through
  – 128-bit dedicated buses from CPU to FPU and VU0
  – SPRAM, which acts as the shared workspace of the CPU and VU0
Team 2
- Handles simple geometric calculations
- Members act as equal partners
- Dedicated 128-bit bus from VPU1 to GIF

Team Interoperation
- Serial connection
  – VPU0 acts as VPU1's coprocessor
  – SPRAM is used to transfer data to VPU1
  – VPU1 renders final image
- Parallel connection
  – GIF monitors the status of the graphics synthesizer
  – Both teams independently and asynchronously send display lists

References
- Atsushi Kunimatsu et al., "Vector Unit Architecture for Emotion Synthesis," IEEE Micro, Vol. 20, No. 2, March/April 2000, pp. 40-47
- K. Kutaragi et al., "A Microprocessor with 128b CPU, 10 Floating-Point MACs, 4 Floating-Point Dividers, and MPEG2 Decoder," ISSCC (Int'l Solid-State Circuits Conf.) Digest Tech. Papers, IEEE Press, Piscataway, New Jersey, Feb. 1999, pp. 256-257
- F. Michael Raam et al., "A High-Bandwidth Superscalar Microprocessor for Multimedia Applications," ISSCC Digest Tech. Papers, IEEE Press, Feb. 1999, pp. 258-259
- Jon Stokes, "Sound and Vision: A Technical Overview of the Emotion Engine," Ars Technica, http://arstechnica.com/reviews/1q00/playstation2/ee-1.html
- Jon Stokes, "The Playstation2 vs. the PC: A System-level Comparison of Two 3D Platforms," Ars Technica, http://arstechnica.com/cpu/2q00/ps2/ps2vspc-1.html
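
Appendix: checking the vector unit figures

The Gflops and cycle counts quoted for the vector units can be reproduced with simple arithmetic. The sketch below is illustrative only, not code from the cited sources. It takes the 300 MHz clock, the unit counts, and the 7-cycle FDIV figure from the slides. It assumes that an FMAC counts as two floating-point operations per cycle (multiply plus add) and that the FDIV is not pipelined; both assumptions are chosen because they reproduce the quoted numbers. The names `peak_gflops` and `matvec_4_cycles` are hypothetical.

```python
# Illustrative sketch: reproduces the slide figures under stated assumptions.
CLOCK_HZ = 300e6        # from the slides: 300 MHz
FLOPS_PER_FMAC = 2      # assumption: multiply and add counted separately
FDIV_CYCLES = 7         # from the slides: one FDIV result every 7 cycles


def peak_gflops(fmacs, fdivs):
    """Peak rate in Gflops for a unit with the given FMAC and FDIV counts."""
    fmac_rate = fmacs * FLOPS_PER_FMAC * CLOCK_HZ
    fdiv_rate = fdivs * CLOCK_HZ / FDIV_CYCLES
    return (fmac_rate + fdiv_rate) / 1e9


def matvec_4_cycles(matrix, vec):
    """Multiply a 4x4 matrix by a 4-vector, one 4-wide FMAC step per cycle.

    Each cycle broadcasts one vector component and accumulates
    column * component in all four lanes. Returns (result, cycles).
    """
    acc = [0.0] * 4
    cycles = 0
    for col in range(4):
        # The four lanes below run in parallel on the four FMACs.
        for lane in range(4):
            acc[lane] += matrix[lane][col] * vec[col]
        cycles += 1
    return acc, cycles


if __name__ == "__main__":
    print(round(peak_gflops(4, 0), 2))   # 4 FMACs
    print(round(peak_gflops(0, 1), 2))   # 1 FDIV
    print(round(peak_gflops(1, 1), 2))   # EFU: 1 FMAC + 1 FDIV
    identity = [[1.0 if r == c else 0.0 for c in range(4)] for r in range(4)]
    print(matvec_4_cycles(identity, [1.0, 2.0, 3.0, 4.0]))
```

Under these assumptions the three rates round to 2.4, 0.04, and 0.64 Gflops, matching the VU Features table, and the matrix-vector product takes 4 cycles. A 4x4 matrix * matrix product is four such matrix-vector products, which gives the 16 cycles quoted.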