Unit V – Data Processing Technology

92 Terms


A. Parallel Processing System


Parallel processing

refers to a system in which one or more independent operating systems manage multiple processors to perform multiple tasks simultaneously.


Parallel processing

is very fast and can share the memory unit.


Flynn’s taxonomy and classification by memory structure

are the leading examples of parallel processing system classification.


2. Flynn’s classification of parallel processing systems


2.1 Single instruction stream - single data stream

Single Instruction Stream Single Data Stream (SISD)

is a single processor system that sequentially processes an instruction and data, one at a time.


2.1 Single instruction stream - single data stream, Single Instruction Stream Single Data Stream (SISD)

It is the conventional computer architecture that follows von Neumann’s concept.


2.2 Single instruction stream - multiple data stream

Single Instruction Stream Multiple Data Stream (SIMD)

It is the structure in which a single instruction simultaneously performs the same operation on multiple data items.


2.2 Single instruction stream - multiple data stream

Single Instruction Stream Multiple Data Stream (SIMD)

It is also called an array processor, as it enables synchronous parallel processing.


2.3 Multiple instruction streams - single data stream, Multiple Instruction Stream Single Data Stream (MISD)

Each processing unit in the ____ parallel computing architecture runs different instructions and processes the same data. The pipeline architecture is an example. It is not a widely used architecture.


2.4 Multiple instruction streams - multiple data stream

Multiple Instruction Stream Multiple Data Stream (MIMD)

In a ____ structure, multiple processors process different programs and different data, and most parallel computers fall into this category. It can be classified into a shared memory model and a distributed memory model, depending on how it uses the memory.


3.1 Symmetric multiprocessor (SMP)

is a tightly-coupled system in which all processors use the main memory as the shared memory. It is easy to program since data transfer can use the shared memory.


3.2 Massive parallel processor (MPP)

is a distributed memory type in which each processor has an independent memory. The loosely coupled system exchanges data between processors through a network, such as Ethernet.


3.3 Non uniform memory access (NUMA)

is a structure that combines the advantages of SMP, a shared memory structure that makes it easier to develop programs, with those of the MPP structure, which offers excellent scalability.


4. Types of parallel processor technology

  • 4.1 Instruction pipelining

The technology improves CPU performance by dividing an operation into several stages and configuring a hardware unit to process each stage separately, so that different instructions are processed simultaneously.


instruction fetching (IF), instruction decoding (ID), operand fetching (OF), and execution (EX).

The stages of the four-stage instruction pipeline are


pipeline hazard

refers to a situation in which pipeline execution slows down exceptionally. _____s include the data hazard, the control hazard, and the structural hazard.


Data hazards

occur when the next instruction execution has to be delayed until the previous instruction has been completed because of the dependency between instruction operands.


Control hazards

are generated by branch instructions, like branch and jump, which change the execution order of the instructions.


Structural hazards

are generated when instructions cannot be processed in parallel in the same clock cycle, due to hardware limitations.


5. Parallel programming technology

  • 5.1 Compiler technology - OpenMP

is a compiler directive-based parallel programming API.


The execution model of OpenMP

is the fork/join model.


5. Parallel programming technology

  • 5.2 Message passing parallel programming model, MPI

is a parallel programming model suitable to a distributed memory system structure.


Parallel programming

tools for message passing include High Performance FORTRAN (HPF), Parallel Virtual Machine (PVM), and Message Passing Interface (MPI). MPI has become the standard.


5.3 Load balancing technologies - AMP, SMP, and BMP

distribute jobs appropriately across the cores in order to increase multi-core performance.


AMP , SMP , BMP model

_____ An OS is executed independently in each processor core.

______ An OS manages all processor cores simultaneously. Application programs can operate in any core.

_______ An OS manages all processor cores simultaneously, and an application program can run on a specific core.


6. Graphic processing technology

  • 6.1 Graphics processing unit (GPU)

The hardware specializes in computer graphics calculation and is mainly used for the rendering of 3D graphics.


GPU

dedicated to processing large-capacity image data generates results through parallel jobs using multiple cores.


6.2 General-purpose GPU (GPGPU)

Since a GPU shows high computational performance in the matrix and vector operations mostly used for graphics rendering, ____ computing intends to utilize GPUs in the general computing domain as well.


They include CUDA and OpenACC from NVIDIA, OpenCL from Khronos Group, and C++ AMP from Microsoft.

Many models supporting GPGPU programming have appeared.


CUDA

____ is a parallel computing platform and a programming model that can significantly improve computing speed with a large number of GPU cores.


CUDA

It provides intuitive GPU programming based on the C language, and it enables fast operation using shared memory.


CUDA

is expected to show an excellent performance improvement when applied to performing tasks suitable for parallel processing operations in various fields that require a large amount of computation, such as simulation.


7. GPU-based parallel programming technology

  • Open Computing Language (OpenCL)

maintained and managed by Khronos Group, is an open, general-purpose parallel computing framework developed by Apple, AMD, IBM, Intel, and NVIDIA.


Open Computing Language (OpenCL)

It is an industry standard programming model for heterogeneous computer systems, consisting of GPUs, CPUs, and other processors.


C++ Accelerated Massive Parallelism (C++ AMP)

was developed by Microsoft as an open programming specification for heterogeneous computing using the CPU and GPU. C++ AMP, first shipped with Visual Studio 2012, can increase the execution speed of C++ code using the GPU.


C++ Accelerated Massive Parallelism (C++ AMP)

intends to help developers create general-purpose programs using GPU without a high level of understanding or application capability about DirectX API.


OpenACC

NVIDIA introduced ____, a programming model based on compiler directives that abstracts CUDA. _____ is a programming model for higher productivity, since it provides a relatively simple programming environment for developers.


Direct attached storage (DAS)

The storage connects a computer system with disks directly through a fiber channel or SCSI cable in order to utilize the storage capacity. It allows the computer system to manage the file system directory.


Network attached storage (NAS)

The storage has a separate file system management server (controller) to manage the storage media such as HDD and SSD.


Storage area network (SAN)

was developed to overcome the disadvantages of DAS and NAS. It uses a dedicated fiber channel switch for fast connection, and it enables scaling up the number of connected servers and storage with less impact on the connected network load.


3. IP-SAN

This type of SAN uses the gigabit Ethernet Internet protocol (IP), instead of a fiber channel.


Fiber Channel over IP (FCIP)

is used to connect a remote SAN. It encapsulates data to TCP/IP for interconnection when transferring a frame to a remote location.


Internet fiber channel protocol (iFCP)

provides a TCP/IP connection dedicated to regional SAN, using the iFCP gateway.


Internet SCSI (iSCSI)

encapsulates SCSI commands into IP packets and transfers the I/O block data through TCP/IP. Technologies like IPsec ensure reliability.


4. Storage capacity management

  • Thin provisioning

The existing fixed-allocation storage technology uses a thick logical unit number (LUN) and wastes data storage space; thin provisioning instead allocates physical capacity only as data is actually written.


4. Storage capacity management

  • Data de-duplication

provides high efficiency of disk space use by removing duplicated data when saving the data.


5. Storage disk scheduling

  • Disk scheduling

A disk drive that stores data is a device using a rotating magnetic disk.


5. Storage disk scheduling

  • Disk scheduling

is a technique of efficiently processing I/O requests, when multiple users request them, in order to process different tasks.


-Maximization of I/O requests serviced during a unit time

-Maximization of throughput per unit time

-Minimization of the mean response time

-Minimization of response time

-Minimization of the variation of response time

Disk scheduling has the following purposes:


5. Storage disk scheduling

  • Disk performance measurement indicator

Disk performance can be compared using the following measurement indicators.


Disk performance measurement indicators

include the access time, seeking time, rotational delay or rotational latency, and data transfer time.


seeking time

indicates how long it takes to move the head from the current head position to the track containing the data.


rotational latency

indicates how long it takes, after the head reaches the track containing the data, for the disk to rotate until the sector that contains the data passes under the head.


data transfer time

indicates how long it takes to transfer the read data to the main memory. This section describes techniques to minimize the access time by minimizing the seeking time and the rotational latency.


First come first serve (FCFS) disk scheduling

services the requests in the order they are received. The head position moves in the order of the requested tracks in the disk standby queue.


Shortest seeking time first (SSTF) disk scheduling

The scheduling technique first services the request that is closest to the current head position, among the requested services waiting in the queue.


SCAN disk scheduling

The scheduling technique first services the request that has the shortest seeking distance from the current direction of the moving head.


LOOK disk scheduling

The technique is the same as SCAN disk scheduling, except that the head changes its direction at the last request in each direction instead of traveling all the way to the outermost or innermost cylinder.


5. Storage disk scheduling

  • Circular SCAN (C-SCAN) disk scheduling

It is a SCAN technique that services requests in one direction only, moving the head as if the inner and outer tracks were connected in a circular model.


Circular LOOK (C-LOOK) disk scheduling

It is a LOOK scheduling technique that connects the inner and outer tracks in an annular model in order to make the head move.


C. High Availability Storage

1. Redundant array of independent disks (RAID) technology


Large-capacity storage systems

generally have an error controller and backup function to safely store the massive volume of data.


RAID

is a storage technology that minimizes the factors that can cause failure and improves access performance by arranging a number of disks and linking them with each other into a separate logical disk unit.


are improved availability, increased capacity, and increased speed.

The main features of RAID



RAID-0 (Striped disk array without fault tolerance)

consists of two or more drives and uses disk striping, which divides data into pieces of a specific size and stores them across multiple disks at once.


RAID-1 (Mirroring and Duplexing)

uses a mirroring technique that redundantly stores data on two drives. Since data is stored redundantly, it can be restored even if a drive fails.


RAID-4

has a separate parity drive and collects and stores parities for data verification and recovery.


RAID-5

is an improvement of RAID-4 by distributing the load of the drive that stores the parities.


RAID-6 (Stripe set with dual distributed parity)

is similar to RAID-5, except that while RAID-5 stores one parity, RAID-6 redundantly stores a parity in two drives. The configuration is more durable than RAID-5 and can store data safely.


RAID-10 (Striping & Mirroring)

requires at least four drives and is a combination of RAID-0 and RAID-1 to improve I/O speed while providing data stability.


Linear tape-open (LTO)

is a standard open tape drive technology that supports high-speed data processing and a large capacity.


Virtual tape library (VTL)

is a backup solution that uses disk storage to emulate a virtual tape device, compensating for problems of physical tape such as limited performance, scalability, and recovery time.


D. Graphic Compression Technology

    1. Graphic compression type


1. Graphic compression type

Video data compression, which accounts for most of the traffic in a multimedia network, can be divided into lossless compression (reversible compression) and lossy compression (irreversible compression).


Lossless compression

is also called reversible compression.


Lossless compression

refers to a method of restoring a compressed image without information loss from the original data while decompressing.


Lossy compression

is also called irreversible compression.


Lossy compression

refers to a compression method in which the restored data does not match the original data before the compression, because some data is lost.


Lossless compression

Since the compression and decompression algorithms are exactly the opposite of each other, the compression method preserves the original data's integrity, and no part of the data is lost during processing.


Lossy compression

compromises some accuracy to increase the compression rate, by allowing the loss of redundant or unnecessary data. There are two types of lossy compression methods: prediction coding and transform coding.


The prediction coding method

is used for digitizing the analog signal. Instead of separately quantizing the PCM (Pulse Code Modulation) samples, it quantizes the difference between successive samples.


The transform coding method

transforms a signal from one domain (mainly a time and space domain) to another domain (mainly a frequency domain), then compresses it.


Multimedia data

includes text, image, video, and audio data. The text has the form of plain text and non-linear hypertext.


Unicode

The basic encoding for expressing symbols is _____, and text uses a lossless compression method.


Multimedia data

An image is called a still image and refers to a photo, fax page, or a video frame.


Multimedia data

In the transformation process, the JPEG uses DCT (Discrete Cosine Transform) in the first stage of compression, and the decompression uses the inverse DCT method.


Multimedia data

The transformation and inverse transformation apply to 8 × 8 blocks.

The quantization process rounds the real numbers of the DCT transform output to integers and converts some values to zero.


Multimedia data

The coding process arranges data in a zigzag order after quantization and before encoder input; then lossless compression is performed using run-length encoding and arithmetic coding.


Video compression standard

The Moving Picture Experts Group (MPEG) is an international standardization organization. The official name of the group is ISO/IEC JTC1/SC29/WG11.


MPEG

created several compression formats and additional standards.
