Definition
A parallel computing system is a computer with more than one processor used for parallel processing. In the past, each processor of a multiprocessor system was placed in a separate package; today, with the introduction of multi-core chips, multiple processors sit together in a single package. There are many different types of parallel computers, distinguished by how their processors and memory are connected. Flynn's taxonomy, one of the most widely accepted classifications of parallel computers, distinguishes machines whose processors all execute the same instruction on different data at the same time (Single Instruction, Multiple Data, or SIMD) from machines whose processors execute different instructions on different data (Multiple Instruction, Multiple Data, or MIMD).
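To make the two categories concrete, here is a small sketch (Python is used only for illustration, and the array part assumes NumPy is installed): the vectorized addition mirrors the SIMD idea of one instruction applied to many data elements, while the two processes running different functions on different data mirror MIMD.

import numpy as np
from multiprocessing import Process

def square(xs):
    print("squares:", [x * x for x in xs])

def negate(xs):
    print("negated:", [-x for x in xs])

if __name__ == "__main__":
    # SIMD-style data parallelism: the same addition is applied to every element.
    a = np.arange(8)
    print("a + 10 =", a + 10)

    # MIMD-style: two processes execute different instructions on different data.
    p1 = Process(target=square, args=([1, 2, 3],))
    p2 = Process(target=negate, args=([4, 5, 6],))
    p1.start(); p2.start()
    p1.join(); p2.join()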
Parallel Processing vs. Concurrency and Multitasking
Concurrency means that several tasks are in progress during the same period, but their execution can be interleaved: a task may be interrupted and resumed later, pausing part of the overall work. Concurrency arises in computing systems where multiple computational processes run at the same time and interact with each other (for example, through shared critical sections). The study of concurrency covers a wide range of systems, from tightly coupled, highly concurrent parallel systems to loosely coupled, asynchronous distributed systems. In parallel processing, on the other hand, a main task is divided into smaller sub-tasks that can be executed independently. For example, if two threads or processes run interleaved on a single processor core, that is concurrency; if they run at the same time on two processor cores, that is parallelism.
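The sketch below illustrates that last distinction (Python is used only as an example; the CPU-bound countdown is arbitrary): the two threads share one interpreter and are interleaved, which is concurrency, while the two processes can be placed on two cores and run truly in parallel.

import threading
from multiprocessing import Process

def count_down(n):
    while n > 0:
        n -= 1

if __name__ == "__main__":
    N = 10_000_000

    # Concurrency: two threads in one process; CPython interleaves them,
    # so their work overlaps in time but does not run truly in parallel.
    threads = [threading.Thread(target=count_down, args=(N,)) for _ in range(2)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    # Parallelism: two separate processes can run on two cores at the same time.
    procs = [Process(target=count_down, args=(N,)) for _ in range(2)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()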
Multitasking refers to the apparently simultaneous execution of two or more tasks by the central processing unit (CPU), which achieves this by switching between them.
The process works as follows:
1- The processor receives an interrupt signal.
2- It stops its current task and saves the work done so far, so that it can later resume from the same point.
3- It then services the device or program that raised the interrupt and processes the request.
4- After the interrupt has been handled, a scheduling interrupt is issued and the scheduler decides which task runs next.
Parallel Processing
Parallel processing is the simultaneous execution of parts of a computation, typically by dividing the work across multiple processors, in order to reach a solution more efficiently and quickly. Time-sharing on a single processor, where multiple processes appear to run at the same time, is sometimes mistakenly considered parallel processing. The underlying idea is that a problem can usually be divided into smaller sub-problems that are solved concurrently and whose results are then merged, producing the answer faster.
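A minimal divide-and-merge sketch (Python's standard multiprocessing module; the chunking scheme and worker count are arbitrary choices for the example):

from multiprocessing import Pool

def partial_sum(chunk):
    # Each worker solves one independent sub-problem.
    return sum(chunk)

if __name__ == "__main__":
    data = list(range(1_000_000))
    n_workers = 4
    chunks = [data[i::n_workers] for i in range(n_workers)]   # divide
    with Pool(n_workers) as pool:
        partials = pool.map(partial_sum, chunks)              # solve concurrently
    total = sum(partials)                                     # merge
    print(total == sum(data))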
The benefits of parallel processing over serial processing (the traditional method) include reduced computation time, the ability to solve larger problems, overcoming the memory limitations of a single processor, cost-effectiveness, and better use of modern hardware.
Advantages of Parallel Processing
Some of the advantages of parallel (supercomputing) systems, which are the primary drivers of their rapid growth, include:
Parallel Programming
Parallel programming was created to make better use of system resources and to increase the speed and performance of programs. In parallel programming, the parts of the main program that can execute at the same time are split into subprograms and run concurrently on multiple processors or threads; the parts that cannot be parallelized run sequentially on one processor. This division is the main difference between sequential and parallel programming, although parallel programming also introduces several concepts that ordinary sequential programming does not have to deal with.
One primary reason for using parallel programming is to increase program execution speed, but single-core processors have the following limitations:
History
Interprocess Communication
In parallel programming, processes need to communicate with each other, and the following methods are used:
Shared Memory
In the shared-memory model, parallel tasks communicate through a common address space that they can read and write asynchronously. Coordinating access to these shared addresses requires synchronization mechanisms such as locks, semaphores, and monitors.
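A minimal shared-memory sketch (Python's multiprocessing module, chosen only for illustration): the Value object lives in memory visible to all the processes, and the explicit Lock plays the role of the synchronization mechanism described above.

from multiprocessing import Process, Value, Lock

def deposit(balance, lock, times):
    for _ in range(times):
        with lock:                      # coordinate access to the shared address
            balance.value += 1

if __name__ == "__main__":
    balance = Value("i", 0)             # an integer placed in shared memory
    lock = Lock()
    workers = [Process(target=deposit, args=(balance, lock, 10_000)) for _ in range(4)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    print(balance.value)                # 40000 with the lock in place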
Message Passing
In this method, parallel tasks exchange data by sending messages, which can be done synchronously or asynchronously. In synchronous communication the sender waits for the receiver, while in asynchronous communication the sender sends the message without waiting for the receiver to be ready.
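A minimal message-passing sketch (Python's multiprocessing.Queue stands in here for a real message-passing library such as MPI): put is asynchronous in the sense above, it does not wait for the receiver, while get blocks until a message arrives.

from multiprocessing import Process, Queue

def producer(q):
    for i in range(5):
        q.put(i)              # asynchronous send: does not wait for the receiver
    q.put(None)               # sentinel marking the end of the stream

def consumer(q):
    while True:
        item = q.get()        # blocks until a message is available
        if item is None:
            break
        print("received", item)

if __name__ == "__main__":
    q = Queue()
    p = Process(target=producer, args=(q,))
    c = Process(target=consumer, args=(q,))
    p.start(); c.start()
    p.join(); c.join()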
Implicit Model
In this model, communication between tasks is handled without the programmer’s involvement; the compiler manages it.
Principles of Parallel Programming
According to Amdahl's law, the achievable speedup is limited by the portion of the program that remains serial. To find sufficient parallelism, the program must be divided into parallel and serial parts in such a way that the overhead of distributing work across threads or processors is smaller than the benefit gained from parallel execution.
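In its standard form, Amdahl's law bounds the speedup by S = 1 / ((1 - f) + f / p), where f is the fraction of the work that can be parallelized and p is the number of processors. The small sketch below (illustrative names only) shows how quickly the bound flattens out.

def amdahl_speedup(f, p):
    # Upper bound on speedup with a parallelizable fraction f on p processors.
    return 1.0 / ((1.0 - f) + f / p)

# Even with 95% of the program parallelized, 64 processors give only about 15x.
for p in (2, 8, 64):
    print(p, "processors:", round(amdahl_speedup(0.95, p), 1), "x")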
Granularity
When dividing the work, careful attention must be paid to the size of the parts that will run in parallel. If the tasks are too small and numerous, scheduling and communication overhead dominates; if they are too large, there are too few of them to keep all processors busy, execution becomes essentially sequential, and the speedup shrinks.
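A rough illustration of the trade-off (Python, arbitrary data size and worker count): dispatching every element as its own task maximizes scheduling overhead, while grouping the elements into a few large blocks amortizes it and gives the same result.

from multiprocessing import Pool

def tiny(x):
    return x * x                       # fine-grained: one element per task

def block(xs):
    return sum(x * x for x in xs)      # coarse-grained: a whole block per task

if __name__ == "__main__":
    data = list(range(100_000))
    with Pool(4) as pool:
        fine = sum(pool.map(tiny, data, chunksize=1))   # many tiny tasks, high overhead
        blocks = [data[i::4] for i in range(4)]
        coarse = sum(pool.map(block, blocks))           # four large tasks, low overhead
    print(fine == coarse)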
Locality
In the memory hierarchy, large memories have slower access times while small memories are faster. Programmers should therefore arrange their algorithms so that they mostly operate on data that is already in fast, local memory, which improves performance.
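A small locality sketch, assuming NumPy is available: the array is stored row by row, so traversing it row-wise touches contiguous memory, while traversing it column-wise jumps across memory and is typically noticeably slower.

import time
import numpy as np

a = np.random.rand(4000, 4000)         # stored in row-major (C) order

t0 = time.perf_counter()
s = sum(a[i, :].sum() for i in range(a.shape[0]))   # contiguous rows: cache-friendly
row_time = time.perf_counter() - t0

t0 = time.perf_counter()
s = sum(a[:, j].sum() for j in range(a.shape[1]))   # strided columns: poor locality
col_time = time.perf_counter() - t0

print(f"row-wise {row_time:.3f}s, column-wise {col_time:.3f}s")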
Load Imbalance
Load imbalance occurs when some processors sit idle during parts of the execution, either because there is not enough parallel work or because the work is distributed unevenly. Load balancing can be done statically, before execution, or dynamically, at runtime.
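The difference can be seen with a deliberately uneven workload (Python sketch, durations chosen only for illustration): handing the tasks out in large fixed chunks leaves one worker with all the slow ones, while handing them out one at a time lets whichever worker is free take the next task.

import time
from concurrent.futures import ProcessPoolExecutor

def task(seconds):
    time.sleep(seconds)                # stands in for work of uneven size
    return seconds

if __name__ == "__main__":
    uneven = [0.5, 0.5, 0.5, 0.5, 0.05, 0.05, 0.05, 0.05]

    # Static distribution: each of the two workers gets a fixed block of four tasks.
    with ProcessPoolExecutor(max_workers=2) as ex:
        t0 = time.perf_counter()
        list(ex.map(task, uneven, chunksize=4))
        print("static :", round(time.perf_counter() - t0, 2), "s")

    # Dynamic distribution: tasks are handed out one at a time as workers become free.
    with ProcessPoolExecutor(max_workers=2) as ex:
        t0 = time.perf_counter()
        list(ex.map(task, uneven, chunksize=1))
        print("dynamic:", round(time.perf_counter() - t0, 2), "s")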
Synchronization
Some parallel algorithms require synchronization of processors at certain points, such as after each iteration, to share intermediate results. One method for synchronization is using barriers, where processes wait for all others to reach the barrier before proceeding.
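A minimal barrier sketch using Python's threading.Barrier (worker and step counts are arbitrary): no thread starts the next step until every thread has finished the current one.

import threading

N_WORKERS = 4
barrier = threading.Barrier(N_WORKERS)

def worker(wid):
    for step in range(3):
        # ... compute this worker's share of the iteration here ...
        print(f"worker {wid} finished step {step}")
        barrier.wait()                 # block until all workers reach this point

threads = [threading.Thread(target=worker, args=(i,)) for i in range(N_WORKERS)]
for t in threads:
    t.start()
for t in threads:
    t.join()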
Race Conditions
Race conditions occur when multiple tasks access a shared resource concurrently and the outcome depends on the timing of those accesses, which leads to errors. Such errors are often non-deterministic and difficult to reproduce and detect. Hardware or software locks can be used to prevent race conditions.
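The classic lost-update race can be sketched as follows (Python threads; whether the loss actually appears on a given run depends on how the interpreter interleaves the threads, which is exactly the non-determinism mentioned above):

import threading

counter = 0
lock = threading.Lock()

def unsafe_increment(n):
    global counter
    for _ in range(n):
        tmp = counter                  # read
        counter = tmp + 1              # write: updates made in between are lost

def safe_increment(n):
    global counter
    for _ in range(n):
        with lock:                     # the lock makes the read-modify-write atomic
            counter += 1

def run(target):
    global counter
    counter = 0
    threads = [threading.Thread(target=target, args=(100_000,)) for _ in range(4)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter

print("without lock:", run(unsafe_increment))   # often less than 400000
print("with lock   :", run(safe_increment))     # always 400000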
Parallel Programming Tools
These tools let the programmer design the parallel execution of a program: they help manage shared variables, input/output dependencies, and communication between threads or processes, and they determine how computations, variables, and objects are distributed across the system.
Shared Memory Programming Tools
Distributed Memory Programming Tools
Parallel Programming Languages
Parallel programming languages, libraries, and models are designed for different memory architectures: shared, distributed, or hybrid. They provide programming interfaces for managing memory and expressing parallel computations.