Anatomy of a distributed program

A distributed program is built on several layers. At the lowest level, a network connects a group of host computers so that they can communicate with each other. Network protocols, such as TCP/IP, let computers send data to each other over the network by providing the ability to packetize and address data for delivery to another machine. Higher-level services, such as directory services and security protocols, can be defined on top of the network protocol. Finally, the distributed program itself runs on top of these layers, using the higher-level services, the network protocols, and the hosts' operating systems to perform coordinated tasks across the network.

At the application level, a distributed program can be broken down into the following parts:

Processes

A typical operating system on a host computer can run several processes at the same time. A process is created by describing a sequence of steps in a programming language, compiling the program into an executable form, and running the executable on the operating system. While it runs, a process has access to the computer's resources (such as CPU time and input/output devices) through the operating system. A process can be entirely dedicated to a particular program, or several programs can use a single process to perform tasks.

Threads

Each process has at least one thread of control. Some operating systems support the creation of multiple threads of control in a single process. Each thread in a process can run independently of the other threads, although there is usually some synchronization between them. For example, one thread may monitor input arriving over a socket connection, while another listens for user events (keystrokes, mouse movements, etc.) and provides feedback to the user through output devices (monitor, speakers, etc.). At some point, the input arriving over the socket may require feedback from the user. When that happens, the two threads need to coordinate so that the input is brought to the user's attention.
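The coordination between the two threads described above can be sketched in Java with a blocking queue. This is only an illustration under stated assumptions: the class name ThreadHandoffSketch, the "<end>" marker, and the "ALERT: " prefix are invented for the example, and the socket is simulated by a plain list of strings so that the sketch runs without a network.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical sketch: one thread receives input, another presents it to the user.
class ThreadHandoffSketch {
    // Assumed marker telling the user-facing thread that no more input will arrive.
    private static final String END = "<end>";

    static List<String> run(List<String> incoming) {
        BlockingQueue<String> handoff = new LinkedBlockingQueue<>();
        List<String> shown = new ArrayList<>();

        // Stands in for the thread that monitors a socket; the "socket" is just a list here.
        Thread inputThread = new Thread(() -> {
            try {
                for (String message : incoming) {
                    handoff.put(message);
                }
                handoff.put(END);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        // Stands in for the thread that interacts with the user.
        Thread userThread = new Thread(() -> {
            try {
                String message;
                while (!(message = handoff.take()).equals(END)) {
                    shown.add("ALERT: " + message);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        inputThread.start();
        userThread.start();
        try {
            // Wait for both threads so the collected list is safe to read.
            inputThread.join();
            userThread.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return shown;
    }

    public static void main(String[] args) {
        System.out.println(run(Arrays.asList("transfer received", "session expiring")));
    }
}
```

The queue is the only point where the two threads meet: the input thread never touches the display, and the user-facing thread blocks until something needs the user's attention. A real program would read from a socket stream instead of a list and would need a shutdown signal that cannot collide with real message text.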

Objects

Programs written in object-oriented languages consist of interacting objects. One simple definition of an object is a group of related data with available methods for querying or modifying the data (getName(), setName()) or for performing certain actions based on the data (sendName(OutputStream o)). A process can consist of one or more objects, and these objects can be accessed by one or more threads in the process. And with the introduction of distributed object technology such as RMI and CORBA, an object can also be logically distributed across multiple processes on multiple computers.
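As a minimal sketch of that definition, the class below groups one piece of data with the three kinds of methods named above. The method names getName(), setName(), and sendName(OutputStream o) come from the text; the class name NamedThing, the demo class, and the choice of UTF-8 encoding are assumptions made for the example.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical class: one piece of data (a name) plus the methods that work on it.
class NamedThing {
    private String name;

    NamedThing(String name) {
        this.name = name;
    }

    // Query the data.
    String getName() {
        return name;
    }

    // Modify the data.
    void setName(String name) {
        this.name = name;
    }

    // Perform an action based on the data: write the name to any output stream.
    void sendName(OutputStream o) throws IOException {
        o.write(name.getBytes(StandardCharsets.UTF_8)); // UTF-8 is an assumed encoding
        o.flush();
    }
}

class NamedThingDemo {
    // Helper for the demo: sends the name into an in-memory stream and returns what was written.
    static String capture(NamedThing thing) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try {
            thing.sendName(buffer);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(buffer.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        NamedThing thing = new NamedThing("alpha");
        thing.setName("beta");
        System.out.println(thing.getName() + " / " + capture(thing));
    }
}
```

Because sendName() accepts any OutputStream, the same object could write its name to a file, a socket, or the in-memory buffer used here without changing the class.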

Agents

For the purpose of this book, we will use the term “agent” as a generic way of referring to the important functional elements of a distributed application.
While process, thread, and object are fairly well-defined entities, an agent (at least in the definition we will use for this book) is a higher-level system component defined around a particular function, utility, or role in the overall system. For example, an e-banking program can be broken down into a customer agent, a transaction agent, and an information brokerage agent. Agents can be distributed across multiple processes and can consist of multiple objects and threads in those processes. Our customer agent can consist of an object in a process running on the customer's desktop that listens for data and updates the local display, and an object in a process running on the bank server that issues queries and sends the data back to the desktop. These are two objects running in different processes on separate machines, but together we can think of them as one customer agent with client-side elements and server-side elements.
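The two-element agent described above can be sketched in a few lines of Java. Everything here is hypothetical: the class names, the sample account data, and in particular the link between the two objects, which is a plain in-process callback standing in for whatever network mechanism (sockets, RMI, and so on) a real system would use to connect two processes on separate machines.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical server-side element: would run in a process on the bank's server.
class BankSideElement {
    private final Map<String, Long> balancesInCents = new HashMap<>();
    private final Consumer<String> linkToDesktop; // stand-in for a network connection

    BankSideElement(Consumer<String> linkToDesktop) {
        this.linkToDesktop = linkToDesktop;
        balancesInCents.put("acct-1", 12_500L); // made-up sample data
    }

    // Queries the account data and sends the result back across the link.
    void sendBalance(String accountId) {
        Long balance = balancesInCents.get(accountId);
        linkToDesktop.accept(balance == null ? "unknown account" : "balance=" + balance);
    }
}

// Hypothetical client-side element: would run in a process on the customer's desktop.
class DesktopSideElement {
    private String display = "";

    // Called when data arrives; updates the local display.
    void onData(String data) {
        display = data;
    }

    String getDisplay() {
        return display;
    }
}

class CustomerAgentSketch {
    public static void main(String[] args) {
        DesktopSideElement desktop = new DesktopSideElement();
        BankSideElement bank = new BankSideElement(desktop::onData);
        bank.sendBalance("acct-1");
        System.out.println(desktop.getDisplay());
    }
}
```

Neither class is the agent on its own. The agent is the pair, defined by the role the two objects play together, which is why it can span processes and machines while still being treated as one functional unit.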

Thus, a distributed program can be viewed as a coordinated group of agents working to achieve a specific goal. Each of these agents can be distributed across multiple processes on remote hosts and can consist of multiple objects or threads of control. Agents can also belong to multiple programs at the same time. For example, you might be developing an ATM application that consists of an account database server and customer request agents distributed across the network that submit requests to it. The account server agent and the customer request agents are agents in the ATM program, but they can also serve agents residing at the financial institution's headquarters as part of an administrative program.