Parallel computing
Parallel computing is a form of computation in which many instructions are carried out simultaneously ("in parallel"),[1] based on the principle that large problems can often be divided into smaller ones, which are then solved concurrently.
There are several different forms of parallel computing: bit-level, instruction-level, data, and task parallelism.
Parallel computing has been used for many years, mainly in high-performance computing, but its use has grown greatly in recent years because of the physical constraints that prevent further frequency scaling. Parallel computing has become the dominant model in computer architecture, mainly in the form of multi-core processors.[2] However, in recent years, power consumption by parallel computers has become a concern.[3]
Parallel computers can be classified according to the level at which the hardware supports parallelism: multi-core and multi-processor computers have multiple processing elements inside a single machine, while clusters, blades, MPPs, and grids use multiple computers working on the same task.
Parallel computer programs are more difficult to write than sequential ones,[4] because concurrency introduces several new classes of potential software bugs, of which race conditions and deadlocks are the most common. Many parallel programming languages have been created to make programming parallel computers easier. Even so, communication and synchronization between the different subtasks remain among the biggest obstacles to getting good parallel program performance.
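As a concrete illustration (a minimal sketch added here, not part of the original article), the following Python code divides a large summation into independent chunks and computes them with a process pool from the standard library; the function names parallel_sum and partial_sum are illustrative only. Because each subtask works on its own chunk and shares no mutable state, the only synchronization needed is waiting for the results, so race conditions and deadlocks are avoided by construction.

```python
# Hypothetical sketch: dividing a large problem into smaller ones
# that are solved concurrently, then combining the results.
from concurrent.futures import ProcessPoolExecutor

def partial_sum(bounds):
    """One independent subtask: sum the integers in [start, stop)."""
    start, stop = bounds
    return sum(range(start, stop))

def parallel_sum(n, workers=4):
    # Split [0, n) into non-overlapping chunks, one per worker.
    step = n // workers
    chunks = [(i * step, (i + 1) * step if i < workers - 1 else n)
              for i in range(workers)]
    # Each chunk is computed in a separate process; since the subtasks
    # share no mutable state, no locks are needed and the only
    # synchronization is waiting for all partial results.
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(partial_sum, chunks))

if __name__ == "__main__":
    print(parallel_sum(1_000_000))  # same result as sum(range(1_000_000))
```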
Media
A graphical representation of Amdahl's law. The speedup of a program from parallelization is limited by how much of the program can be parallelized. For example, if 90% of the program can be parallelized, the theoretical maximum speedup using parallel computing would be 10 times no matter how many processors are used.
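In symbols (a standard statement of Amdahl's law, added here for clarity): if a fraction P of a program can be parallelized and N processors are used, the speedup is

```latex
S(N) = \frac{1}{(1 - P) + \frac{P}{N}},
\qquad
\lim_{N \to \infty} S(N) = \frac{1}{1 - P},
```

so with P = 0.9 the speedup can never exceed 1 / 0.1 = 10, no matter how many processors are used.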
A graphical representation of Gustafson's law
A canonical processor without a pipeline. It takes five clock cycles to complete one instruction, so the processor achieves subscalar performance (IPC = 0.2 < 1).
A canonical five-stage pipelined processor. In the best-case scenario, it takes one clock cycle to complete one instruction, so the processor achieves scalar performance (IPC = 1).
A canonical five-stage pipelined processor with two execution units. In the best-case scenario, it takes one clock cycle to complete two instructions, so the processor achieves superscalar performance (IPC = 2 > 1).
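The IPC values in these captions follow directly from the usual definition of instructions per cycle (stated here for clarity):

```latex
\text{IPC} = \frac{\text{instructions completed}}{\text{clock cycles used}},
\qquad
\text{unpipelined: } \tfrac{1}{5} = 0.2,\quad
\text{pipelined: } \tfrac{1}{1} = 1,\quad
\text{two execution units: } \tfrac{2}{1} = 2.
```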
A logical view of a non-uniform memory access (NUMA) architecture. Processors in one directory can access that directory's memory with less latency than they can access memory in the other directory.
References
1. Almasi, G. S. and A. Gottlieb (1989). Highly Parallel Computing. Benjamin-Cummings Publishers, Redwood City, CA.
2. Asanovic, Krste et al. (December 18, 2006). "The Landscape of Parallel Computing Research: A View from Berkeley" (PDF). University of California, Berkeley. Technical Report No. UCB/EECS-2006-183. "Old [conventional wisdom]: Increasing clock frequency is the primary method of improving processor performance. New [conventional wisdom]: Increasing parallelism is the primary method of improving processor performance ... Even representatives from Intel, a company generally associated with the 'higher clock-speed is better' position, warned that traditional approaches to maximizing performance through maximizing clock speed have been pushed to their limit."
3. Asanovic et al.: "Old [conventional wisdom]: Power is free, but transistors are expensive. New [conventional wisdom] is [that] power is expensive, but transistors are 'free'."
4. Patterson, David A. and John L. Hennessy (1998). Computer Organization and Design, Second Edition. Morgan Kaufmann Publishers, p. 715. ISBN 1558604286.
Other websites
- Parallel computing at the Open Directory Project
- Lawrence Livermore National Laboratory: Introduction to Parallel Computing Archived 2013-06-10 at the Wayback Machine
- Designing and Building Parallel Programs, by Ian Foster
- Internet Parallel Computing Archive Archived 2002-10-12 at the Wayback Machine
- Parallel processing topic area at IEEE Distributed Computing Online Archived 2011-09-28 at the Wayback Machine
- Parallel Computing Works, a free on-line book Archived 2011-07-27 at the Wayback Machine
- Frontiers of Supercomputing, a free on-line book covering topics like algorithms and industrial applications