Hardware Acceleration of EDA Algorithms - P11

Document information:

Hardware Acceleration of EDA Algorithms - P11: Single-threaded software applications have ceased to see significant gains in performance on a general-purpose CPU, even with further scaling in very large scale integration (VLSI) technology. This is a significant problem for electronic design automation (EDA) applications, since the design complexity of VLSI integrated circuits (ICs) is continuously growing. In this research monograph, we evaluate custom ICs, field-programmable gate arrays (FPGAs), and graphics processors as platforms for accelerating EDA algorithms, instead of the general-purpose single-threaded CPU....
Content extracted from the document:
12 Conclusions

[Fig. 12.2 Larrabee architecture from Intel. Blocks shown: multi-threaded wide SIMD cores, each with I$ and D$; L2 cache; memory controllers; fixed function logic; texture logic; display interface; system interface.]

[Fig. 12.3 Fermi architecture from NVIDIA. Blocks shown: shared multiprocessor cores; L2; Giga Thread; HOST I/F; six DRAM I/F blocks.]

... multiprocessor (SM). The block diagram of a single SM is shown in Fig. 12.4 and the block diagram of a core within an SM is shown in Fig. 12.5.

With these upcoming architectures, newer approaches for hardware acceleration of algorithms would become viable. These approaches could exploit the more general computing paradigm offered by the newer architectures. For example, the close coupling between the GPU and the CPU (which reside on the same die) would reduce the communication cost. Also, in these upcoming architectures the instruction dispatch unit is distributed, and the instruction set is more general purpose. These enhancements would enable a more general computing paradigm (in comparison to the SIMD paradigm for current GPUs), which in turn would enable acceleration opportunities for more EDA applications.

[Fig. 12.4 Block diagram of a single shared multiprocessor (SM) in Fermi. Blocks shown: instruction cache; two schedulers, each with a dispatch unit; register file; 32 cores; load/store units x 16; special function units x 4; interconnect network; 64K configurable cache/shared memory; uniform cache.]

[Fig. 12.5 Block diagram of a single processor (core) in SM. Blocks shown in the CUDA core: dispatch port; operand collector; FP unit; INT unit; result queue.]

The approaches presented in this monograph collectively aim to contribute toward enabling the CAD community to accelerate EDA algorithms on modern hardware platforms. Our work demonstrates techniques to rearchitect several EDA algorithms to maximally harness their performance on the alternative platforms under consideration.

References

1. http://www.cs.chalmers.se/cs/research/formalmethods/minisat/main.html. The MiniSAT Page
2. NVIDIA Tesla GPU Computing Processor. http://www.nvidia.com/object/IO_43499.html
3. OmegaSim Mixed-Signal Fast-SPICE Simulator. http://www.nascentric.com/product.html
4. Lee, H.K., Ha, D.S.: An efficient, forward fault simulation algorithm based on the parallel pattern single fault propagation. In: Proceedings of the IEEE International Test Conference on Test, pp. 946–955. IEEE Computer Society, Washington, DC (1991)
5. Seiler, L., Carmean, D., Sprangle, E., Forsyth, T., Abrash, M., Dubey, P., Junkins, S., Lake, A., Sugerman, J., Cavin, R., Espasa, R., Grochowski, E., Juan, T., Hanrahan, P.: Larrabee: A many-core x86 architecture for vis ...