/ Edward A. Billard / Assistant Professor
/ Alice Riedmiller / Research Associate
The Operating Systems Laboratory is participating in both top-down education and state-of-the-art research.
In education, the final phase of development of a workstation-based laboratory has been completed. The laboratory has a special layout for programmers to share 19 Sun-based workstations and 4 X terminals, connected to the laboratory file server via a network. In addition, a multimedia Macintosh has a scanner, video camera, video recorder, and television monitor.
An office for students has been built next to the laboratory for lab assistants, lab members, and visiting students. The lab assistants are exceptional students who help with lab software and exercises; the lab members are students who have made significant progress on their projects; and the visiting students are from other laboratories who are doing temporary work in the Operating Systems Lab. The students use the office to work on special projects for the laboratory and for general homework. Many of these students are scheduled to act as teaching assistants for the Operating Systems course. They have been trained on special new software, developed in the lab, that demonstrates concurrent programming---an important aspect of the course.
In addition to concurrent programming, the laboratory has developed special simulation tools for students to learn the practical and theoretical aspects of queueing systems. The software is written in a new high-level script language that allows the rapid-prototyping of software, particularly educational software. It is expected that this script language will allow the laboratory to develop new, and better, software for educational purposes.
Students in SCCP have been invited into the laboratory to test and examine the new software; it is apparent that students can quickly grasp the essentials of queueing networks using a graphical user interface. All of the students have finished an initial set of 10 queueing exercises and have started on 20 operating system exercises. The advanced students are designing their own experiments and/or writing their own graphical user interfaces. This has prepared the students for future course work and research in both operating systems and performance evaluation.
In research, three new refereed journal articles have been accepted for publication this year (and two previously accepted journal articles appeared this year). The members of the lab also had four new refereed conference papers and an interesting paper in ACM Software Engineering Notes on the use of Tcl/Tk in the rapid development of graphical user interfaces.
The major themes of the research are stability, adaptation, and group formation under the constraint of delayed information. The general goal of the research in the laboratory is to establish principles that improve the performance of highly distributed systems. The systems require adaptability, reliability, and expandability; the solutions are characterized by self-organizing and learning behaviors. The agents, or controllers, in the system make autonomous decisions and are expected to maintain reasonable performance, even as the system operates under aged or misleading information or evolves into unforeseen environments.
Some of the topics covered are the
The goal of the laboratory is to continue the educational development of students and to continue the formulation of general models of distributed systems.
Refereed Journal Papers
Distributed decision makers are modeled as players in a game with two levels. High level decisions concern the game environment and determine the willingness of the players to form a coalition (or group). Low level decisions involve the actions to be implemented within the chosen environment. Coalition and action strategies are determined by probability distributions which are updated using learning automata schemes. The payoffs are also probabilistic and there is uncertainty in the state vector since information is delayed. The goal is to reach equilibrium in both levels of decision making. The results show the conditions for instability, based on the age of information, and a trade-off between optimality and stability.
Large distributed computing systems provide the opportunity for the synergistic use of shared resources, an example model is computational ecosystems. However, as networks continue to expand, it is not obvious that all of the agents within a network should participate in one interacting set. Although smaller groups may foreit potential synergy, they also avoid much of the complexity and overhead of coordination. The nature of the dynamics is complicated by the inherent delays in networks and the state update schemes applied by the agents. Large intelligent systems must provide for 1) the participation of agents in a smaller group and/or 2) the participation of agents in the system-wide group only a fraction of the time. Both of these behaviors are modeled by a system of differential equations with a discrete distribution of time delays. Stability analysis determines the boundaries for damped and persistent oscillations based on the average of the discrete time delays. In large systems, there is a fixed ratio between these boundaries: $e\pi/2$. The results also show that stability improves with diversity in the delays.
Players in a Prisoner's Dilemma are modeled as learning automata that receive feedback from the environment and coadaptively adjust their strategies. Theory and simulations show the coevolutionary dynamics of the reward-inaction and reward-penalty schemes. The players are assumed to be physically distributed or, at least, in an environment where the effects of decisions are lagged. These systems include biological and social systems with constraints on instantaneous information or where environmental responses do not necessarily reflect the true state of the system. Linear stability analysis determines the conditions for persistent oscillations in the players' mixed strategies. Using a parameterized stochastic version of the dilemma, the results indicate that if the environment modifies the payoffs, and thus ``releases'' the prisoners from their dilemma, the prisoners become prone to instabilities in their strategies given sufficient delays. Again, the prisoners fail to coordinate their actions.
In Automonous Decentralized Systems (ADS), there is a high degree of uncertainty, especially in global state information and the performance and reliability of various resources. We model a distributed computer system of local and shared global servers, each of which can process jobs. The local servers insure a degree of autonomy and the global servers provide added computational power and redundancy. When a job arrives, an agent decides whether to schedule the job locally, or whether to ship it to a global server. The agents have access to state information concerning current loads on the global servers and the response times of completed jobs, but because of decentralization this information is delayed, and perhaps lost, along communication channels. The agents must make good decisions under the constraint that service rates are unknown and dynamic. We evaluate a deterministic algorithm and two randomizing algorithms and show that the degree of uncertainty determines which one is the most successful.
A formal approach integrates the principles of optimality, stability, and the level of communication necessary in a decentralized system which has a high degree of uncertainty. The uncertainties are in state, strategy, and gain which arise from delays in communication, probabilistic strategies, and stochastic gains, respectively. It is assumed that agents within the system establish individual goals for their probabilistic strategies and that each agent autonomously pursues its own goal, which then may change due to multi-agent interactions. The resultant dynamical equation is analyzed using techniques for delay differential equations. The results show that the parameters that lead the system to optimality also make the system more susceptible to delay-initiated instabilities, with the theory predicting the boundary between stability and instability. Agents which communicate sufficiently can avoid instabilities and maintain their search for optimality.
A model is presented of the dynamics of many controllers in a distributed system, each controlling one element of a vector and, as a system, trying to optimize a function of the vector. It is assumed that the controllers apply an adaptive method to achieve the optimal result. The optimization is challenging since the controllers have delayed information concerning the individual elements of the control vector. The potential exists for each element to be viewed with its own unique delay. Linear stability analysis provides an upper bound of the conditions for damped oscillations and a lower bound on the conditions for persistent oscillations. The results indicate that controllers can improve performance with respect to stability by 1) communicating more frequently, 2) interacting in smaller groups, 3) slowing down the adaptive search process, and/or 4) creating more diversity in the delays.
In distributed databases, load sharing across transaction processors can improve performance, however, some transaction classes may have site affinities, or locality preferences, with respect to the data accessed by the transactions. With the load balanced, the performance can degrade due to accessing data at remote sites. In this study, we present a model of the dynamics of such systems. A regression parameter is assumed to describe the optimal balance between load sharing and locality preference. The sites adaptively route transactions to achieve the specified balance under the constraint of obsolete state information. Numerical solutions to the delay differential equation show the effects on processor utilizations based on the regression parameter, the locality preferences, the system utilization, the network size, the adaptive search rate, and the information obsolescence.