Next: Operating Systems Laboratory Up: Department of Computer Previous: Language Processing Systems

# Distributed Parallel Processing Laboratory

/ N. N. Mirenkov / Professor
/ S. G. Sedukhin / Associate Professor
/ Shietung Peng / Assistant Professor
/ Sergey V. Ten / Assistant Professor

Our general program is high-performance computing and advanced information infrastructure. We are also thinking about a planetary machine and active knowledge being developed by humanity. We are undertaking research efforts in parallel and distributed algorithms/architectures, visual (multimedia) languages and tools. We lead joint research projects Massively Parallel Computation: Synchronous vs. Asynchronous Designs", Parallel Scientific Computations", Fine-grain parallelism for Cellular automata" and take part in projects: Aizu Supercomputer" and Efficient Algorithmic Design Techniques for Parallel Computing".

In addition, together with the Center for Language Research and students we are developing the project Advanced Computing and Communication Multimedia English-Japanese Glossary". We are studying the problem of parallel program portability, skeleton-based techniques and developing the supporting tools. Our theoretical work is related to efficient algorithms for parallel numerical computation including linear algebra and Fourier transformation as well as for fault tolerant routing on certain interconnection networks, like hypercubes, star graphs, and torus. Information about our activities can be found on the laboratory Home Page: http://www.u-aizu.ac.jp/labs/sw-dpp/, which contains links to three experimental laboratory WWW servers. These servers hold the High-Performance Computing and Communication topics, on-line CD-ROM Digital Library and serve as mirror sites of, for instance, the Computational Science Educational Project".

As one of our results we would like to mention the S4CAD system. S4CAD (System of Systolic Structures Synthesis) is a high level interactive software tool for designing, analysis and modeling of systolic VLSI-oriented array processors. Starting with a localized data dependence graph of the algorithm S4CAD automatically produces all permissible array processors, among which the designer may chose the optimal one. World Wide Web page of the S4CAD is available at http://www-dpp.u-aizu.ac.jp/HPCC/S4CAD/.

The important question of our research is parallel program transparency and accessibility of massively parallel computers to large user populations. We are developing a multimedia technology for the interactive specification of application algorithms. This technology is related to visualization, animation and sonification of data processing methods. In fact, it is a 'filmalazation' of application algorithms. Each film is based on a rather big peace of knowledge of a class of the algorithms. Our approach is close to the idea of algorithmic skeletons (higher order functions). However, skeletons leave a big gap between reality and a human (a multi-channel being). So, we use algorithmic multimedia skeletons based on both mathematical and physical abstractions.

Each film is a series of frames (computational steps) displaying one or more parameterized sets of nodes and/or moving objects in multi-dimensional space-time. Each frame highlights a subset of these nodes and/or moving objects. The films have their own sets of figures, sounds, colors and their own shape'' of the movement. Users recognize them due to a combination of all these features. Each film defines a partial order of scanning of the nodes or objects (and possibly, colors and sounds). As a rule, computation specified on different nodes (objects) of a frame is considered to be performed in parallel. Computation specified in different frames is performed sequentially. So, it is possible to say: the shorter film the better.

The user defines the specification by creating his new film. The corresponding program (sequential or parallel) is generated automatically. Each film itself is considered as a special-purpose application for which a parallel program is developed in advance as an item of the system library. The specification of a problem is considered as input data for such a parallel program. However, this data can introduce a composition of a few films, an irregularity of computation, special sizes of data, etc. It means that to reach efficiency it is impossible to have only one variant of the film implementation. So, each film is supported at least by a few parallel programs from the system library. Different programs are related to different conditions being defined by the specification. In other words, a film has a special space of decision-making. This space takes into account levels of problem granularity, irregularity, dynamics of the data decomposition, and some others related to algorithm features.

In fact, we are developing film machines where data, knowledge and algorithms (as well as results) are specified by films. Distributed film databases can be considered as active common knowledge/software being developed by humanity.

Refereed Journal Papers

1. Qian-Ping Gu and Shietung Peng. Fault tolerant routing in toroidal networks. IEICE Trans. on Information and Systems, to appear, 1996.

In this paper, we study the following node-to-node and node-to-set routing problems in $r$-dimensional torus $T^r_n$ with $r\geq 2$ and $n\geq 4$ nodes in each dimension: given at most $2r-1$ faulty nodes and non-faulty nodes $s$ and $t$ in $T^r_n$, find a fault-free path $s\rightarrow t$; and given at most $2r-k$ faulty nodes and non-faulty nodes $s$ and $t_1,\ldots,t_k$ in $T^r_n$, find $k$ fault-free node-disjoint paths $s\rightarrow t_i$, $1\leq i\leq k$. We give an $O(r^2)$ time algorithm which finds a fault-free path $s\rightarrow t$ of length at most $d(T^r_n)+1$ for the node-to-node routing, where $d(T^r_n)$ is the diameter of $T^r_n$. For node-to-set routing, we show an $O(r^3)$ time algorithm which finds $k$ fault-free node-disjoint paths $s\rightarrow t_i$, $1\leq i\leq k$, of length at most $d(T^r_n)+1$. The upper bounds on the length of the found paths are optimal. From this, Rabin diameter of $T^r_n$ is $d(T^r_n)+1$.

2. Qian-Ping Gu and Shietung Peng. Linear time algorithms for fault tolerant routing in hypercubes and star graphs. IEICE Trans. on Information and Systems, E78-D:1171--1177, 1995.

In this paper, we study the following node-to-node fault tolerant routing problem: In the presence of up to $n-1$ faulty nodes, find a fault-free path which connects any two non-faulty nodes $s$ and $t$ in an $n$-connected graph. For node-to-node fault tolerant routing in $n$-dimensional hypercubes $H_n$, we give an algorithm which finds a fault-free path $s\rightarrow t$ of length at most $d(s,t)+2\lceil\log \frac{n}{d(s,t)}\rceil$ in $O(n)$ time, where $d(s,t)$ is the distance between $s$ and $t$. We also show that a fault-free path $s\rightarrow t$ in $H_n$ of length at most $d(s,t)+2i$, $1\leq i< \lceil\log \frac{n}{d(s,t)}\rceil$, can be found in $O(d(s,t)\frac{n}{2^{i-1}}+n)$ time. For node-to-node fault tolerant routing in $n$-dimensional star graphs $G_n$, we give an algorithm which finds a fault-free path $s\rightarrow t$ of length at most $\min\{d(G_n)+3,d(s,t)+6\}$ in $O(n)$ time, where $d(G_n)=\lfloor \frac{3(n-1)}{2}\rfloor$ is the diameter of $G_n$. It is previously known that, in $H_n$, a fault-free path $s\rightarrow t$ of length at most $d(s,t)$ for $d(s,t)=n$ and at most $d(s,t)+2$ for $d(s,t)< n$ can be found in $O(d(s,t)n)$ time, and in $G_n$, a fault-free path $s\rightarrow t$ of length at most $\min\{d(G_n)+1,d(s,t)+4\}$ can be found in $O(d(s,t)n)$ time. When the time efficiency of finding the routing path is more important than the length of the path, the algorithms in this paper are better than the previous ones.

3. Qian-Ping Gu and Shietung Peng. Node-to-node cluster fault tolerant routing in star graphs. Information Processing Letters, 56:29--35, 1995.

This paper defines a new fault tolerant routing model, cluster fault tolerant routing, and gives the solutions for node-to-node cluster fault tolerant routing in star graphs.

4. Qian-Ping Gu and Shietung Peng. An efficient algorithm for node-to-node routing in hypercubes with faulty clusters. The Computer Journal, 39:14--19, 1996.

In this paper, we study node-to-node fault tolerant routing problem in $n$-dimensional hypercubes $H_n$ based on the cluster fault tolerant model. For a graph $G$, a fault cluster is a connected subgraph of $G$ such that all its nodes are faulty. In cluster fault tolerant routing problems, how many fault clusters and how large of those clusters can be tolerated are studied. It was proved that for node-to-node routing, $H_n$ can tolerate as many as $n-1$ fault clusters of diameter at most 1 with at most $2n-3$ fault nodes in total. In this paper, we give an algorithm which, given at most $n-1$ fault clusters of diameter at most 1 with $2n-3$ fault nodes in total and non-fault nodes $s$ and $t$ in $H_n$, finds a fault-free path $s\rightarrow t$ of length at most $n+2$ in $O(n)$ time. The upper bounds on the time complexity of the algorithm and the length of the fault-free path in this paper are optimal.

5. Qian-Ping Gu and Shietung Peng. Set-to-set fault tolerant routing in star graphs. IEICE Trans. on Information and Systems, E79-D:282--289, 1996.

In this paper, we give an algorithm which, given a set $F$ of at most $(n-1)-k$ faulty nodes, and two sets $S=\{s_1,\ldots,s_k\}$ and $T=\{t_1,\ldots,t_k\}$, $1\leq k\leq n-1$, of non-faulty nodes in $n$-dimensional star graphs $G_n$, finds $k$ fault-free node disjoint paths $s_i\rightarrow t_{j_i}$, where $(j_1,\ldots,j_k)$ is any permutation of $(1,\ldots,k)$, of length at most $d(G_n)+5$ in $O(kn)$ optimal time, where $d(G_n)=\lfloor \frac{3(n-1)}{2} \rfloor$ is the diameter of $G_n$.

6. Qian-Ping Gu, Satoshi Okawa, and Shietung Peng. Set-to-set fault tolerant routing in hypercubes. IEICE Trans. on Fundamentals, E79-A:483--488, 1996.

In this paper, we give an algorithm which, given a set $F$ of at most $n-k$ faulty nodes, and two sets $S=\{s_1,\ldots,s_k\}$ and $T=\{t_1,\ldots,t_k\}$, $1\leq k\leq n$, of non-faulty nodes in $n$-dimensional hypercubes $H_n$, finds $k$ fault-free node disjoint paths $s_i\rightarrow t_{j_i}$, where $(j_1,\ldots,j_k)$ is a permutation of $(1,\ldots,k)$, of length at most $n+k$ in $O(kn\log k)$ time. The result of this paper implies that $n$ disjoint paths of length at most $2n$ for set-to-set node disjoint path problem in $H_n$ can be found in $O(n^2\log n)$ time.

7. A. Kondratyev, M. Kishinevsky, A. Taubin, and S. Ten. Analysis of petri nets by ordering relations in reduced unfoldings. Formal Methods in System Design, page (paper is accepted for publication and has to appear in 1996), 1996.

A method of Petri Nets analysis based on reduced unfoldings is proposed. No restrictions are imposed on the class of general PNs. The new criterion for truncating the occurrence net significantly reduces the size of unfolding obtained by PN. We apply this method for checking implementability of Petri Nets with speed-independent ciruits. Since analysis is done by ordering relations rather than by traversal of states the state explosion problem is avoided.

Refereed Proceeding Papers

1. T. Ikedo and N. Mirenkov. Aizu supercomputer project. In M. H. Hamza, editor, Proceedings of the 7-th IASTED International Conference on Parallel and Distributed Computing and Systems, pages 433--438, Washington, DC, USA, October 1995. IASTED-ASTRA PRESS.

This paper describes a project of a massively parallel system related to Aizu's multimedia center. The system employs a highly parallel MIMD architecture using a conflict-free communication system as well as special-purpose units for graphic and sound processing and image buffering which support high-speed input-output operations. The scalable communication system is comprised by a few networks using electronic and optical links. The idea behind the project software is making use of animation films as communication units for computer-human dialog. Each film is related to series of frames (computational steps) and reflects some knowledge about data processing. Each frame highlights a substructure of data for which operations should be specified. As a rule, this substructure is a set of points and/or moving objects in a multi-dimensional space-time.

2. N. Mirenkov and T. Mirenkova. A vim film for cellular automation-like algorithms. In H. R. Arabnia, editor, Proceedings of the International Conference on Parallel and Distributed Processing Techniques and Applications, pages 616--625, Athens, Georgia, USA, November 1995. CSREA.

The idea behind VIM technology is the use of animation films for the interactive specification of application algorithms as well as for very-high-level parallel programming. Each film is a shape'' of computation (an algorithmic multimedia skeleton). It defines sets of points and/or moving objects in multi-dimensional space-time, and a partial order of scanning of these points and objects. This paper presents one such film related to cellular automation-like algorithms. Each frame of the film shows a few configurations and/or spatial substitutions highlighting the points where computation should be specified and/or color should be changed. To specify a problem the user employs one-click operations to delete, insert or merge frames and possibly provides some formulas for computation in the points. The corresponding program (sequential or parallel) is generated automatically.

3. N. Mirenkov and T. Mirenkova. Vim film system. In Proceedings of the 7-th International Conference on Tools with Artificial Intelligence, pages 49--54, Herndon, Virginia, USA, November 1995. IEEE CS Press.

We report here on an experimental version of the VIM film system that allows making use of animation films for the interactive specification of application algorithms as well as for the very-high-level parallel programming. Each film is a shape'' of computation (an algorithmic multimedia skeleton). It supports a set of parameterized computational schemes on a parameterized multi-dimensional structure. To specify a problem and corresponding task for a computer the user employs: modifications of the system films, compositions of these films, and seeing a film being created. The system can be used for the sequential and parallel programming.

4. N. Mirenkov and T. Mirenkova. Multimedia skeletons and filmification of methods. In Proceedings of the First International Conference on Visual Information Systems, Melbourne, Australia, pages 58--67, Melbourne, Victoria, Australia, February 1996. VUT.

The VIM system project is related to the development of a problem solving environment which integrate multimedia units for the specification (programming) of application algorithms, experiment control strategies, and visualization of data generated during the experiment. These multimedia units are special-purpose animation films where movement, color, sound and text each describe some aspect of algorithms, strategies, and visualization of data. This paper considers briefly a basis of the VIM system approach and describes the management of the films.

5. N. Mirenkov and T. Mirenkova. 1-d functions for 3-d computation. In A.M. Tentner, editor, Proceedings of the International Symposium on High-performance Computing, pages 232--237, New Orleans, USA, April 1996. The Society for Computer Simulation.

An approach based on a fundamental feature of continuous functions is proposed for consideration. It allows reducing a great part of the 3-D computation to the 1-D computation. 1-D arrays and tables can be evaluated in advance or before the main computation. Compositions of values of these arrays and tables are used for the evaluation of 3-D functions with hierarchic computational structures. The approach is supported by a set of optimization techniques to improve making use of the internal parallelism of a processor as well as the parallelism of multiple processors. To illustrate the approach some graphic computation problems are considered.

6. N. Mirenkov and T. Mirenkova. Film databases and film machines. In Proceedings of the 1996 Pacific Workshop on Distributed Multimedia Systems, Hong Kong, June 1996.

A concept of film databases as a core of film machines is proposed for consideration. Items of such databases are special-purpose animation or video films. These films are considered as units for data/knowledge acquisition and as communication units for computer-human dialog. Operations with data/knowledge units as well as retrieval operations on databases are specified by films, too. This paper presents briefly a basis of our approach and describes a film database being developed for specification (programming) application algorithms. Some details of a few films related to image processing and cellular automation-like methods are also discussed.

7. N. Mirenkov and T. Mirenkova. A vim film for linear algebra algorithms. In Proceedings of the International Conference on Parallel and Distributed Processing Techniques and Applications, Sunnyvale, California, USA, August 1996. CSREA.

The idea behind the VIM technology is making use of animation films for the interactive specification of application algorithms as well as for the very-high-level parallel programming. Each film is a shape'' of computation (an algorithmic multimedia skeleton). It defines sets of points and/or moving objects in multi-dimensional space-time, and a partial order of scanning of these points and objects. This paper presents one such a film related to linear algebra algorithms. Each frame of the film shows parameterized 2-D and 1-D structures and highlights subsets of nodes where computation should be defined. To specify a problem the user employs one-click operations to delete, insert or merge frames and provides some formulas for computation on the nodes. The corresponding program (sequential or parallel) is generated automatically.

8. Qian-Ping Gu and Shietung Peng. $k$-pairwise cluster fault tolerant routing in star graphs. In Proc. of the 11th International Conference of Systems Engineering (ICSE'96), 1996. Recently, cluster fault tolerant (abbreviated as CFT in what follows) routing model which is a natural extension of the well studied node fault tolerant routing was introduced. For a graph $G$, a cluster $C$ is a connected subgraph of $G$, and $C$ is called fault cluster if all nodes in $C$ are faulty. In CFT routing, the number of fault clusters and the size of the fault clusters that can be tolerated are studied. In this paper, we study the following $k$-pairwise CFT routing problem in $n$-dimensional star graphs $G_n$: Given a set ${\bf F}$ of fault clusters and $k$ pairs of $2k$ distinct non-fault nodes $(s_1,t_1),\ldots,(s_k,t_k)$ in $G_n$, find $k$ fault-free node disjoint paths that pairwisely connect, one path for one pair, the $k$ node pairs. We show in this paper that for $1\leq k\leq \lceil\frac{n-1}{2}\rceil$, $G_n$ can tolerate as many as $n-2k$ fault clusters of diameter at most 2. A cluster of diameter 2 in $G_n$ has as many as $n$ nodes. The result of this paper shows that for $k$-pairwise CFT routing, star graphs $G_n$ can tolerate as many as $n(n-2k)$ fault nodes under certain conditions. We also give $O(n^2)$ optimal time algorithms which, given a tolerable set ${\bf F}$, find $k$ fault-free node disjoint paths of length at most $d(G_n)+8$ for $k$-pairwise CFT routing problem in $G_n$, where $d(G_n)=\lfloor \frac{3(n-1)}{2}\rfloor$ is the diameter of $G_n$.

9. Qian-Ping Gu and Shietung Peng. Finding $d$-separated paths in hypercubes. In Proc. of the 11th International Conference of Systems Engineering (ICSE'96), 1996.

Two paths $P$ and $Q$ in a graph $G$ are $d$-separated if for any node $u$ in $P$ and any node $v$ in $Q$ except possibly the common end-nodes, the distance between $u$ and $v$ is at least $d$. In this paper, we study the following node-to-node routing problem in hypercubes: (1) finding 2-separated paths in free space (without obstacles); (2) finding disjoint paths with path-shaped obstacles. We give algorithms to construct the 2-separated/disjoint paths for node-to-node routing problems (1) and (2), and show that the numbers of paths constructed by our algorithms are maximum in the worst case.

10. Qian-Ping Gu and Shietung Peng. Set-to-set disjoint paths in hypercubes. In Proc. of 1996 International Conference on Parallel and Distributed Systems (ICPAD'96), IEEE Computer Society, 1996.

Set-to-set node-disjoint paths problem is that given two sets $S=\{s_1,\ldots,s_k\}$ and $T=\{t_1,\ldots,t_k\}$ of nodes in a graph, find $k$ node disjoint paths $s_i\rightarrow t_{j_i}$, where $(j_1,\ldots,j_k)$ is any permutation of $(1,\ldots,k)$. For general undirect graphs $G(V,E)$, this problem is usually solved by applying flow techniques which take Poly($|V|$) time. In this paper, we give an algorithm which, given $S=\{s_1,\ldots,s_k\}$ and $T=\{t_1,\ldots,t_k\}$, $1\leq k\leq n$, in an $n$-dimensional hypercube $H_n$ which has $2^n$ nodes, finds the $k$ disjoint paths $s_i\rightarrow t_{j_i}$ of length at most $n+\log k+2$ in $O(kn\log^* k)$ time. The previous results of the length of the $k$ paths and the run time of finding the $k$ paths in $H_n$ are $n+k$ and $O(kn\log k)$, respectively.

11. Qian-Ping Gu and Shietung Peng. Optimal algorithms for node fault tolerant routing in hypercubes. In Proc. of 1996 Joint Symposium on Parallel Processing (JSPP'96), 1996.

In this paper, we give an algorithm which, given at most $n-1$ arbitrary faulty nodes and non-faulty nodes $s$ and $t$ in the $n$-dimensional hypercube $H_n$, finds a fault-free path $s\rightarrow t$ of length at most $d(s,t)+2$ for $d(s,t)< n$ and of length $d(s,t)$ for $d(s,t)=n$ in $O(n)$ time, where $d(s,t)$ is the distance between $s$ and $t$. The previous results on the above problem are that\\ (1) Path length: $d(s,t)+2$ for $d(s,t)< n$ and $d(s,t)$ for $d(s,t)=n$. The time complexity: $O(n\log n)$.\\ (2) Path length: $d(s,t)+2\lceil\log\frac{n}{d(s,t)}\rceil$. The time complexity: $O(n)$.\\ We also give an algorithm which, given at most $n-1$ faulty clusters of diameter at most 1 with $2n-3$ faulty nodes in total and non-faulty nodes $s$ and $t$ in $H_n$, finds a fault-free path $s\rightarrow t$ of length at most $d(s,t)+4$ in $O(n)$ time, where a faulty cluster is a connected subgraph of $H_n$ such that all its nodes are faulty. The upper bounds on the length of the path and the time complexity obtained in this paper are optimal.

12. Qian-Ping Gu and Shietung Peng. $d$-separated paths in hypercubes and star graphs. In Proc. of the 2nd IEEE International Conference on Algorithms and Architectures for Parallel Processing (ICA$^3$PP'93), IEEE Computer Society, 1996.

In this paper, we consider a generalized node-disjoint paths problem: $d$-separated paths problem. In a graph $G$, given two distinct nodes $s$ and $t$, two paths $P$ and $Q$, connecting $s$ and $t$, are $d$-separated if $d_{G-\{s,t\}}(u,v)\geq d$ for any $u\in P-\{s,t\}$ and $v\in Q-\{s,t\}$, where $d_{G-\{s,t\}}(u,v)$ is the distance between $u$ and $v$ in the reduced graph $G-\{s,t\}$. $d$-separated paths problem is to find as many $d$-separated paths between $s$ and $t$ as possible. When $d=1$, $d$-separated paths problem becomes conventional node-to-node disjoint paths problem. In this paper, we give the following results on $d$-separated paths problems on $n$-dimensional hypercubes and star graphs $H_n$ and $G_n$. Given $s$ and $t$ in $H_n$, there are at least $(n-2)$ 2-separated paths between $s$ and $t$. $(n-2)$ is the maximum number of $2$-separated paths between $s$ and $t$ for $d(s,t)\geq 4$. Moreover, $(n-2)$ 2-separated paths of length at most $\min\{d(s,t)+2$, $n+1\}$ for $d(s,t)< n$ and of length $n$ for $d(s,t)=n$ between $s$ and $t$ can be constructed in $O(n^2)$ optimal time. For $d\geq 3$, $d$-separated paths in $H_n$ do not exist. Given $s$ and $t$ in $G_n$, there exist exactly $(n-1)$ 3-separated paths between $s$ and $t$. For $1\leq d\leq 3$, $(n-1)$ is the maximum number of $d$-separated paths between $s$ and $t$ in $G_n$. $(n-1)$ 3-separated paths of length at most $\min\{d(s,t)+4, d(G_n)+2\}$ between $s$ and $t$ can also be constructed in $O(n^2)$ optimal time, where $d(G_n)=\lfloor \frac{3(n-1)}{2}\rfloor$. For $d\geq 5$, $d$-separated paths in $G_n$ do not exist.

13. Qian-Ping Gu and Shietung Peng. An efficient algorithm for $k$-pairwise node-disjoint paths problem in hypercubes. In Proc. of the 7th IEEE Symposium on Parallel and Distributed Processing (SPDP'95), IEEE Computer Society, October 1995.

In this paper, we give an efficient algorithm for the following $k$-pairwise node disjoint path problem in $n$-dimensional hypercubes $H_n$: Given $k=\lceil \frac{n}{2}\rceil$ pairs of $2k$ distinct nodes $(s_1,t_1),\ldots,(s_k,t_k)$ in $H_n$, $n\geq 4$, find $k$ node disjoint paths, each path connecting one pair of nodes. Our algorithm finds the $k$ node disjoint paths in $O(n^2\log^* n)$ time which improves the previous result of $O(n^2\log n)$. The length of the paths constructed in our algorithm is at most $n+\lceil \log n\rceil +1$ which improves the previous result of $2n$ as well. The result of this paper shows that the $k$-pair-diameter %and $k$-set-diameter of $H_n$ satisfy $d^P_{\lceil\frac{n}{2}\rceil}(H_n)$ of $H_n$ satisfies $d^P_{\lceil\frac{n}{2}\rceil}(H_n)\leq n+\lceil\log n\rceil+1$.

14. Qian-Ping Gu and Shietung Peng. Node-to-node cluster fault tolerant routing in star networks. In Proc. of 1995 Joint Symposium on Parallel Processing (JSPP'95), June 1995.

In this paper, we study node-to-node fault tolerant routing problem in $n$-dimensional star graphs $G_n$ based on the cluster fault tolerant model which is a natural extension of the well studied node fault tolerant model. For a graph $G$, a cluster $C$ is a connected subgraph of $G$, and $C$ is called fault cluster if all nodes in $C$ are faulty. In cluster fault tolerant routing, the number of fault clusters and the size of the fault clusters that can be tolerated are studied. For node-to-node fault tolerant routing problem in $G_n$, we prove that $G_n$ can tolerate as many as $n-2$ arbitrary fault clusters of diameter at most 2. This result shows that star graphs $G_n$ can tolerate as many as $n(n-2)\approx (\log N/\log\log N)^2$ fault nodes, where $N=n!$ is the number of nodes in $G_n$, if the fault nodes can be partitioned into clusters. We also give an $O(n^2)$ optimal time algorithm which finds a fault-free routing path of length at most $d(G_n)+7$ for node-to-node cluster fault tolerant routing problem in $G_n$, where $d(G_n)! =\lfloor \frac{3(n-1)}{2}\rfloor$ is the diameter of $G_n$.

15. Qian-Ping Gu and Shietung Peng. Node-to-set routing with optimal path length in star graphs. In Proc. of International Conference on Parallel and Distributed Computing, September 1995.

Given a node $s$ and a set $T=\{t_1,\ldots,t_k\}$ of $k$ nodes in a $k$-connected graph $G$, node-to-set routing is to find $k$ node disjoint paths $P_i: s\rightarrow t_i,~1 \leq i \leq k$. In this paper, we give $O(n^2)$ time algorithms for node-to-set routing in $n$-dimensional star graphs $G_n$ which is $(n-1)$-connected. Our algorithms find $n-1$ node disjoint paths of length at most $d(G_n)+1$ for $n\neq 4,6$ and at most $d(G_n)+2$, otherwise, for node-to-set routing in $G_n$, where $d(G_n)=\lfloor \frac{3(n-1)}{2}\rfloor$ is the diameter of $G_n$. The optimal upper bound on the length of $k$ disjoint paths for node-to-set routing in a $k$-connected graph $G$ is known as {\it Rabin diameter} of $G$. In this paper, we also prove that Rabin diameter of $G_n$ is $d(G_n)+1$ for $n\neq 4,6$ and $d(G_n)+2$, otherwise.

16. Qian-Ping Gu and Shietung Peng. Finding a routing path of optimal length in hypercubes with fault clusters. In Proc. of the 7th IASTED International Conference on Parallel and Distributed Computing and Systems, October 1995.

In this paper, we give a linear time algorithm which finds a fault-free routing path of optimal length for node-to-node cluster fault tolerant routing in hypercubes. In cluster fault tolerant routing, the routing problem in a graph $G$ is considered under the situation that a certain number of subgraphs of $G$ failed. A cluster is a connected subgraph of $G$ and a cluster is faulty if all nodes in it are faulty. It was proved that for node-to-node routing, an $n$-dimensional hypercube $H_n$ can tolerate as many as $n-1$ fault clusters of diameter at most 1 with at most $2n-3$ fault nodes in total. Given a tolerable set of fault clusters and two distinct non-fault nodes $s$ and $t$ in $H_n$, it was proved that the optimal upper bound of the length of a fault-free path connecting $s$ and $t$ is $n+2$ and the fault-free path of length at most $n+2$ can be found in $O(n^2)$ time. In this paper, we present an $O(n)$ time algorithm which finds a fault-free path of length at most $n+2$ that connects $s$ and $t$.

17. S. Peng, S. Sedukhin, and I. Sedukhin. Design of optimal systolic arrays for 2-d discrete fourier transform. In M. Shimasaki and H. Sato, editors, Proceedings of the International Symposium on Parallel and Distributed Supercomputing (PDSC'95), pages 61--69. Computer Center of the Kyushu University, Japan, September 1995.

The design of systolic array processors for computing 2-dimensional DFT is investigated. We proposed three different schemes for designing systolic arrays using systematic approach. The systematic approach guarantees to find an optimal systolic array from large solution space in terms of the number of processing elements and I/O channels, the processing time, topology, pipeline period, etc. The optimal systolic array processors are scalable, modular and suitable for VLSI-realization.

18. S. Sedukhin. An algorithm and array processors for solving the systems of linear equations. In H. R. Arabnia, editor, Proceedings of Int. Conf. on Parallel and Distributed Processing Techniques and Applications -- PDPTA'95, pages 307--316, CSREA, 115 Avalon Drive, Athens, GA 30606, USA, November 1995. The University of Georgia, USA, CSREA Press.

The design of optimal array processors for fast and stable solution of system of the linear equation is presented. As against well-known solutions this array processors based on the numerical stable algorithm which performs the diagonalization of initial matrix without the division operations. Among many admissible designs the triangular array processor derived is optimal in terms of minimal number of processing elements and total computation time, pipelining period and I/O ports.

19. S. Peng and S. Sedukhin. Systolic algorithms/architectures for division-free linear system solving. In Proceedings of the IEEE Second Intern. Conference on Algorithms $\&$ Architectures for Parallel Processing (ICA3PP-96), 11-13 June, 1996, Singapore (accepted). IEEE Singapore Section, IEEE Computer Society Press, June 1996.

The design of systolic array processors for solving linear systems of equations using division-free algorithm is presented. The design is based on a systematic approach which constructs array processors systematically by first investigating parallel versions of the division-free algorithm and their three-dimensional graphs, and then generating the planar systolic arrays by projecting the dependency graph along properly selected directions.

20. S. Peng, S. Sedukhin, and I. Sedukhin. Parallel algorithm and architecture for two-step division-free gaussian elimination. In Proceedings of the International Conference on Application-specific Systems, Architectures and Processors (ASAP'96), 19-21 August, 1996, Chicago, Illinois, USA. IEEE Computer Society, August 1996.

The design of parallel algorithms and architectures for solving linear systems using two-step division-free Gaussian elimination method is considered. The two-step method circumvents the ordinary single-step division-free method by its greater numerical stability. In spite of the rather complicated computations needed at each iteration of the two-step method, we develop first an innovative regular iterative algorithm, then a two-dimensional array processor by deriving a localized dependency graph of the algorithm and adopting a systematic approach to investigate the set of all admissible solutions and obtain the optimal architecture under linear scheduling. The optimal array processor improves the previous systolic designs based on the widely used Gaussian elimination in term of numerical stability and the time-space complexity for VLSI implementation because of absence of division operations.

21. S. Sedukhin and I. Sedukhin. Parallel rendering with the network linda system. In H. R. Arabnia, editor, Proceedings of the International Conference on Parallel and Distributed Processing Techniques and Applications (PDPTA'96), August 9-11, 1996, Sunnyvale, California, USA (accepted), CSREA, 115 Avalon Drive, Athens, GA 30606, USA, August 1996. CSREA Press.

We proposed a parallel distributed solution of a compute-intensive problem: rendering of the functionally represented geometric objects. The Network Linda programming tool was used to distribute rendering on the network of workstations of the University of Aizu. The speedup results and the load-balancing technique, applied to complex objects, are studied.

22. S. V. Ten, A. V. Savchenko, A. A. Pasko, and V. V. Savchenko. Computer assisted animation of volumetric objects. In Proceedings SPIE v. 2656: Visual Data Exploration and Analysis III, pages 312--319. SPIE, February 1996.

Generation of animation sequence of a deformed volumetric object on networked workstations is discussed. We use several deformation types: space mapping controlled by points linked to features of an object, deformation with algebraic sum and metamorphosis. These deformations are directly applied to interpolated volume data with following polygonization of an isosurface for visualization.

23. A. Kondratyev, M. Kishinevsky, A. Taubin, and S. Ten. A structural approach for the analysis of petri nets by reduced unfoldings. In Accepted for publication in Proc. of International Conference on Application and Theory of Petri Nets, Osaka, Japan, 1996.

This paper suggests a way for Petri Net analysis by checking the ordering relations between places and transitions. The method is based on unfolding the original net into an equivalent acyclic description. In anunfolding the ordering relations can be determined directly by the structure of an underlying graph. The PN properties for analysis can be various: boundedness, safety, persistency etc. The practical example of the suggested approach is given in application to the asynchronous design. The circuit behavior is specified by an interpreted Petri net, called Signal Transition Graph (STG) which is then analyzed for the implementability by asynchronous hazard-free circuit. The experimental results show that for highly parallel STGs checking the implementability by unfolding is one to two orders of magnitude less time-consuming than checking it by symbolic BDD traversal of the corresponding State Graph.

24. A. Kondratyev, M. Kishinevsky, A. Taubin, and S. Ten. Deadlock prevention using petri net unfoldings. In Accepted for publication in Proc. of Computational Engineering in Systems Application (CESA'96) Conference, France, 1996.

A deadlock prevention procedure first detects deadlocks using the unfolding, then reduces the unfolding to a deadlock-free sub-unfolding, and finally folds the deadlock-free acyclic net into a cyclic net. The method is implemented as a subroutine in the SIS tool and is ftp-available. For optimizing the size and properties of the transformed deadlock-free net we use a Petri Net re-synthesis algorithm available in the tool Petrify.

25. A. Kondratyev, M. Kishinevsky, A. Taubin, and S. Ten. Verification of asynchronous systems based on petri net unfoldings. In Accepted for publication in Proc.of IEICE CST Conference, Japan, May 1996.

We review different methods of unfoldings (McMillan, Esparza, and the ours) and show how to use unfoldings in verification of asynchronous systems.

26. S. Ten, A. Kondratyev, M. Kishinevsky, and A. Taubin. Software tool for analysis of petri nets based on unfolding (second edition). In Accepted for software tools presentation at International Conference on Application and Theory of Petri Nets, Osaka, Japan, May 1996.

A new version of a software tool offering construction of reduced occurence nets and analysis of Petri Nets features is presented. An occurence net can be produced from the original Petri Net using unfolding method using different criteria for truncating the unfolding. This tool provides analysis of the following features of Petri Net: unboundness, safety, persistency, and deadlock detection. It is incorporated into the SIS system. Additionally this tool provides capabilities to use DOTTY graphical package for previewing the results of unfolding and for interactive manipulation with original Petry Net.

Technical Reports

1. Shietung Peng, Stanislav Sedukhin, and Igor Sedukhin. Designs of optimal systolic arrays for 2-d discrete fourier transform. Technical Report, 95-1-031, August 30. 12pgs, The University of Aizu, Aizu-Wakamatsu, Japan, 1995.

2. I.S. Sedukhin, S.G. Sedukhin, A.A. Pasko, V.V. Savchenko, and N.N. Mirenkov. Parallel rendering of functionally represented geometric objects with the network linda system. Technical Report, 95-1-001, January 18, 15pgs, The University of Aizu, Aizu-Wakamatsu, Japan, 1995.

3. Qian-Ping Gu and Shietung Peng. k-pairwise cluster fault tolerant routing in star graphs. Technical Report, 95-1-002, February 21, The University of Aizu, Aizu-Wakamatsu, Japan, 1995.

4. Sergey Ten, Andrey Savchenko, Alexander Pasko, and Vladimir Savchenko. Distributed animation of volumetric objects. Technical Report, 95-1-016, April 7. 11pgs, The University of Aizu, Aizu-Wakamatsu, Japan, 1995.

5. Stanislav G. Sedukhin. An algorithm and array processors for solving the systems of linear equations. Technical Report, 95-1-026, July 3, 13pgs, The University of Aizu, Aizu-Wakamatsu, Japan, 1995.

1. Nikolay N. Mirenkov, The Journal Programming, Nauka, Russia, 1995. Member of Editorial Board.

2. Nikolay N. Mirenkov, The Journal Scientific programming, John Wiley Publisher, USA, 1995. Member of the Editorial Advisory Board.

3. Nikolay N. Mirenkov, the Second Aizu International Symposium Parallel Algorithms/Architecture Synthesis (pAs'97), Aizu-Wakamatsu, Japan, 1996. Chairman of the Programming Committee and Member of the Organizing Committee.

4. Nikolay N. Mirenkov, ACM, IEEE and The New York Academy of Sciences, 1995. Member.

5. Nikolay N. Mirenkov, American Mathematical Society, 1995. Reviewer.

6. Nikolay N. Mirenkov, Nov., 1995. Invited lecturer at the New York Academy of Sciences.

7. Nikolay N. Mirenkov, International Conference ISPAN'96, Beijin, China, 1996. Member of the Program Committee.

8. Nikolay N. Mirenkov, International Conference PaCT95, S. Petersburg, Russia, 1995. Member of the Program Committee.

9. Stanislav G. Sedukhin, the Second Aizu International Symposium Parallel Algorithms/Architecture Synthesis (pAs'97), Aizu-Wakamatsu, Japan, 1996. Member of the Organizing and Program Committees.

10. Stanislav G. Sedukhin, the International Conference on Parallel and Distributed Processing Techniques and Applications (PDPTA'96), 1995. Member of the Program Committee.

11. Stanislav G. Sedukhin, the Third International Conference on Applications for Computer Systems (ACS'96), 1996. Member of Program Committee.

12. Stanislav G. Sedukhin, the International Conference on Parallel and Distributed Systems (ICPADS'96), 1995. Referee.

13. Stanislav G. Sedukhin, the Parallel Processing Letters International Journal, June 1995. Referee.

14. Stanislav G. Sedukhin, the Computers and Artificial Intelligence International Journal, August 1995. Referee.

15. Stanislav G. Sedukhin, the International Journal Parallel Processing Letters (World Scientific Publ., Singapore), 1995. Member.

16. Stanislav G. Sedukhin, the IEEE and IEEE Computer Society, 1995. Member.

17. Stanislav G. Sedukhin, the ACM, 1995. Member.

18. Stanislav G. Sedukhin, the SIAM, 1995. Member.

19. Stanislav G. Sedukhin, the SIAM Group on Supercomputing, 1995. Member.

20. Stanislav G. Sedukhin, the International Conferences on Computer-Aided Design of Discrete Devices (CAD DD'95), Minsk, Belarussia, 1995. Member of Program Committee.

21. Sergey V. Ten, the Async'96, Aizu-Wakamatsu, Japan, March 1996. Chairman of the Tools Presentation section.

Next: Operating Systems Laboratory Up: Department of Computer Previous: Language Processing Systems

www@u-aizu.ac.jp
October 1996