Breadcrumb

 
 

Studies on memory consistency and synchronization : failure detection in parallel programs

Title:

Studies on memory consistency and synchronization : failure detection in parallel programs

Fu, Taiqi (1998) Studies on memory consistency and synchronization : failure detection in parallel programs. PhD thesis, Concordia University.

[img]
Preview
PDF
6Mb

Abstract

We have studied two related issues in the design, execution and debugging of shared memory parallel programs: memory consistency and techniques to detect synchronization failure in parallel programs. Overlapping execution of operations from a thread is an efficient way to effectively tolerate memory latency in shared memory multiprocessor systems. for a programmer to reason about his program, a sequentially consistent machine with full generality must execute memory operations sequentially in program order, leading to poor performance. We propose Link Consistency which still preserves Sequential Consistency for parallel programs that contain no data race or synchronization failure while permitting out of (program) order execution of operations. By classifying synchronization operations according to whether they satisfy some distinct conditions ( Exclusive Producer and Exclusive Consumer ), Link Consistency allows richer relaxation of program orders than some previous work such as Release Consistency. Compared to other work that also tries to improve upon Release Consistency such as PLpc, Link Consistency is conceptually clearer for programmer, and possesses a rigorous basis for reasoning. We also address the following Synchronization Failure detection problem: Given a trace set from a complete (i.e. successful ) execution, whether there exists another execution that can be derived from the same trace set and is incomplete. The answer to the problem is useful in debugging parallel programs. After showing that the problem is NP-complete, we present some techniques to tackle the problem in practice. These include Trace Reduction, Local Failure Validation, and Partial Order Checking. Trace Reduction may reduce the size of a trace set up to a small fraction of its original size while still preserving its failure property. Local Failure Validation explores a heuristic that is based on individual semaphores rather than all semaphores together. Compared to detecting strategies based on interleaving semantics, Partial Order Checking completely avoids the state explosion problem caused by the large amount of concurrence in trace sets. Our experimental result shows that for many trace sets with synchronization failures, the above techniques solve the detection problem in time linear to the trace set sizes, which are measured by N n , the number of events or by N s x N t , the product of the number of semaphores by the number of program threads, in a trace set.

Divisions:Concordia University > Faculty of Engineering and Computer Science > Computer Science and Software Engineering
Item Type:Thesis (PhD)
Authors:Fu, Taiqi
Pagination:x, 196 leaves : ill. ; 29 cm.
Institution:Concordia University
Degree Name:Theses (Ph.D.)
Program:Dept. of Computer Science
Date:1998
Thesis Supervisor(s):Li, H. F
ID Code:416
Deposited By:Concordia University Libraries
Deposited On:27 Aug 2009 13:11
Last Modified:08 Dec 2010 10:14
Related URLs:
All items in Spectrum are protected by copyright, with all rights reserved. The use of items is governed by Spectrum's terms of access.

Repository Staff Only: item control page

Document Downloads

More statistics for this item...

Concordia University - Footer