Preview only show first 10 pages with watermark. For full document please download

Message Passing

this presentation is based on distributed systems syllabus of pune university I tried to explain the message passing

   EMBED


Share

Transcript

Distributed Sys Systems tems Message Passing Prof. Aniruddha S Rumale Assistant Professor, Comp. Engg. Dept. Introduction Process means program in execution. Two comps in DS are communicating means two processes communicating with each other. Process communication is necessary to achieve some common goal. DS needs to provide InterProcess Communication ( IPC) IPC IPC requires information sharing among two or more processes. 1) Original sharing or shared data approach information to be shared is stored in common memory area 2) Copy Sharing or message passing information to to be shared is physically copied from sender process¶s address space to address spaces of all receiver processes in form of message. P1 Shared common memory Shared data approach P2 P1 P2 Message-passing approach Generally Computers in N/W do not n ot share memory. memory. So IPC in DS uses Message-passing Message-p assing over over shared data Approach. Message-passing system system provides message based IPC protocols by shielding the complex N/W protocols and also by shielding the multiple heterogeneous platforms from user user.. Features of Message-passing system 1. Si Simp mpli lici city ty should be simple and easy to use. It must be straightforward to construct new applications and communicate with existing ones using systems primitives. 2. Uni Unifor form m Sema Semanti ntics cs semantics of remote communications( communicating processes are on different nodes) should be as close as possible to local communications( communication s( communicating processes are on same node). Features of Message-passing system «continue 3.Efficiency an IPC protocol of a message-passing system system can be made efficient by reducing the number of message exchanges, as far as practicable, during the communication process. Avoiding the cost of establishing and terminating connections between same pair of processes each and every message exchange between betw een them. Minimizing the cost of maintaining the connections. Piggybacking the acknowledgement of previous messages with the next message during a connection involving several message exchanges between sender and receiver rec eiver.. Features of Message-passing system «continue 4. Reliabilty DS is prone to catastrophic catastroph ic events.( node crash, communication link (s) failures etc«) A reliable IPC protocol can cope with failure problems and guarantees the delivery of message. Handling of lost messages involves ACKs ACKs and retransmissions on the basis of timeout. Duplicate messages may be sent in the event of  failures or timeouts. A reliable IPC protocol must be capable of detecting and handling of duplicates. It involves generating and assigning sequence numbers to messages. Features of Message-passing system «continue 5.Correctness Correctness is a feature related to IPC protocols for  group communication. Atomicity : message sent to a group of receivers will be delivered to either all of them or none of them. Ordered Delivery: messages arrive at all receivers in order acceptable to an application. Survivability: guarantees that messages will be delivered correctly despite partial failures of  processes, machines, or communication links. Features of Message-passing system «continue 6.Flexibilty Not all applications require the same degree of  correctness and reliability of IPC protocols. Thus , the IPC protocol of message passing system must be flexible enough to cater to the various needs of  different applications. IPC primitives must also have the flexibility to permit any kind of control flow between the cooperating processes, including synchronous and asynchronous send/receive . Features of Message-passing system «continue 7. Security A good message-passing system must be capable of  providing a secure end-to-end communication. A message in transit on the network should not be accessible to any user other than those to whom it is addressed and the sender. This involves a) authentication of receiver(s) of a message by sender  b) authentication of sender of message by receiver(s) c) encryption of message before sending it over o ver N/W. N/W. Features of Message-passing system «continue 8. Portability the message-passing system system should itself be portable. the applications written by using the primitives of IPC protocols of message-passing system should be portable. Issues in IPC by message-passing Message is a block of information formatted by a sending process in such a manner that it is meaningful to the receiving process. Structural information Sequence Addresses  Actual Number  Sending Data or  Number  Receiving Or  Process Pointer  Of bytes/ Type P rocess Message  Address To data elements  Address ID A typical message structure Issues in IPC by message-passing «continued w ho is receiver?  who is sender and who How many receivers? One or many? Is the message guaranteed to have been accepted by its receiver(s)? Does the sender need to wait w ait for reply? How to handle catastrophic events ( node crash, link(s) failure(s), etc«) occurring during communication? If receiver is not ready; what to do with w ith message? Discard or  store in buffer? What to do if buffer is full? Can receiver choose order of acceptance to serve outstanding messages? Synchronization Central issue in communication structure is synchronization imposed on the communicating processes by the communication primitives. Two semantics Blocking and non-blocking can be used. A primitive is said to have non-blocking semantics if its invocation does not block the execution of its invoker invoker.. If execution of invoker is blocked, it is blocking semantics. semantics. These semantics are primarily used for send for send and receive primitives. Incase of blocking send primitive, sending process is blocked after  execution of send until it receives an ACK from receiver that the message is received. Incase of non-blocking send, process proceeds with its execution as soon as the message get copied to buffer .( transferred if Null-buffer is used) Synchronization« Synchronization «continued Incase of block-receive, receiving process is blocked until it receives a message (ACK). Incase of non-blocking receive, process proceeds with its execution as soon as receive primitive is executed. How non-blocking receive knows that message is arrived in buffer? 1. Polling : a test primitive is provided to allow allo w receiver to check the buffer status. A periodic execution of test is carried out called as polling. 2. Int nte errupt : when buffer get filled and becomes ready to be used by receiving process a software interrupt notifies this to receiver. receiver. Saves repeated unsuccessful check of polling. Synchronization «continued A variant variant to non-blocking non-blo cking receive primitive is conditional receive primitive. primitive. This returns control con trol to invoking process immediately,, either with a message or with an indicator of noimmediately message. In blocking-send primitive, sending process could get blocked forever if receiver crashes or if message loss due to other reasons. To To avoid this blocking primitives uses time-out value ( time-stamp, waiting-time) specifying interval of time after which the operation of blocking-primitive blocking-p rimitive ( blockingsend) is terminated with an error status. Time-out value is either default ( system calculated ) or user  defined ( human-time) with respect to communication criteria. Synchronization «continued Sender Receiver  When both send and receive Send (message) Receive (message) Execution primitives of a communication Execution suspended between two processes use suspended blocking semantics, the Execution message communication is said to be Blocked Resumed synchronous; otherwise it is State said to be asynchronous. Send (ACK) Synchronous communication Execution ACK comparatively easy to Resumed Executing State implement than asynchronous communication. Synchronous mode of communication with both send and receive having blocking semantics Synchronization «continued Synchronous communication is more reliable. If message get lost or is undelivered, un delivered, no backward error  recovery is necessary. It limits concurrency It is subject to communication deadlocks. Less flexible than asynchronous communication. Requires unnecessary waits for ACK. And it is slower than asynchronous communication. Buffering In communication, in some cases receiving process may not be ready to receive a message.Such messages need to be stored somewhere, usually in the buffer of receiver receiver,, for later reception and processing. Sending Process Message Null Buffering ( No Buffering) [ Synchronous] No temporary storage at receiver to store the message. The message remains in the sender process¶s address space and execution of send is delayed until the receiver  executes the corresponding receive.  ACK Or message is simply discarded and time-out mechanism Receiving Process is used to resend the message after a time-out period. Null Buffering Buffering« continued Single-Message Buffer [ Synchronous] Null buffer is not suitable for communication in DS if  receiver is not ready, ready, it may require more than two repeated message transfer of same message. Also receiver need to wait for time taken to transfer the message across the N/W N/W.. To avoid this a buffer with w ith a capacity to store one message is used at receivers end. This is because in synchronous mode an application module may have at most one message outstanding at a time. Sender  Buffer to Store One Message The message buffer may either be located in the kernel¶s address space or in the receiver process¶s address Receiver  space. Logical path of message transfer involves two copy operations. Buffering« continued Unbounded capacity buffer [ asynchronous] Sender  In asynchronous communication sender never  wait for receiver to be ready; causing many pending messages that yet not have been accepted by receiver; and thus requires unbounded unbou nded capacity buffer to store all unreceived messages. Unbounded capacity of buffer is practically impossible. So in practice asynchronous communication uses finite bound buffers. buffers. Buffer to Store Many Messages Receiver  Buffering« continued Finite bound or Multiple-message Multiple-message buffer [ asynchronous] 1. Unsu Unsucce ccessf ssful ul com communi municat cation ion : message transfers simply fail whenever there is no more buffer  space.send normally returns an error message to sender : receiver  buffer is full, message can¶t be delivered. makes message passing less reliable. 2. Flow-controlled communication : sender is blocked until the receiver accepts some messages. introduces synchronization : may results in deadlocks Communication based on Finite bound buffer implementation is more complex to implement and use than null buffer or single message buffer. Multidatagram messages All N/W has upper bound on data to be transmitted at a time, known as Maximum Transfer Unit (MTU). If sizeof ( message) > MTU : fragmentMTU( sizeof (eachfragment)<=MTU) : number each fragment serially : if each fragment numbered : send them in packets ( datagram) Packet = control information + message data. If MTU> message data to be send : single-datagram message, else multidatagram messages. Multidatagram messages « continued At receiver side check packets for sequence number  If packet is numbered store it in buffer  b uffer  Receive all packets with sequence numbers based on common control information. report any error to sender for retransmission of  missing packet arrange packets in order  accepts packets in order  reassemble packets to form complete message acknowledge the sender  Encoding and Decoding A message data should be meaningful to the receiving process. This needs preservation of program objects while transmission from senders address space to receivers address space. It is not not easy to achieve this on homogeneous systems and it is impossible to achieve this on heterogeneous systems. An absolute pointer value loses its meaning when w hen transferred from one process address space to another another.. Different program objects occupy varying amount of storage space. And to make a message meaningful to the receiver re ceiver,, there must be some way receiver to identify which program object is stored where w here in the message buffer and how much space each program object occupies. Encoding and Decoding «continued Due to problems in transferring program objects in their  th eir  original form, they get converted to a more mo re suitable stream form for transmission. Conversion process from original to stream form taking place at sender side is known as encoding. Conversion of stream form of program objects in to their  original form at receiver side prior to their use is known as Decoding. There are two types of representation used for encoding encodin g and decoding. Tagged Tagged and untagged representation. Encoding and Decoding «continued Tagged representation : type of each program object along with w ith its value is encoded in the message. Receiving process checks the type of each program object in message due to self-describing nature of coded data. More expensive than untagged representation. Untagged representation representation : message data only contains program objects. No information about type of any program object is given. Receiver must have prior knowledge to decode the received data as coded data format is not self-describing. Process Addressing Addressing ( Naming ) of parties p arties involved in communication is an important issue in message based communication. Explicit addressing : the process with which communication is desired is explicitly named as a parameter in the communication primitive used. Eg . Send ( process_id, message) Receive( process_id, message) In above we are sending/receiving message to/from a process identified using process_id Process Addressing «continued Implicit Addressing : a process willing to communicate does not explicitly name a process for communication in communication primitive used. Useful in client server communication when client is concerned with service and not the server from set of server-farm, who is going to serve its purpose. This is also known know n as functional addressing as address used is of service and not of process. Eg. Send_any( service_id, message) send message to any process which can provide the service identified by service_id. Receive_any(process_id, message) Receive message from any process and return the process_id on reception r eception of message. Failure handling DS is prone to following follow ing failures. sender Receiver  Send message request sender  Receiver  sender   Receiver  Send Send message received requestmessage request message lost lost Request message is Lost either because Of link failure or  Because receiver  Node may be down At time of request For communication Send response Crash restarted Receiver¶s node Response loss either due Crashed before To down sender node or  Receiving request Failed link of  For communication comunication Failure handling«continued Client To cope with these problems, IPC protocols are designed based on the idea of internal retransmissions of messages after time-outs. And prompt return of ACK messages from receiver. Four-message Server  Request ACK IPC protocol for client-server: 1. Cli Client ent send sends s reques requestt messa message ge to serv server er.. Reply 2. On rece recepti ption on of reque request st serve serverr sends sends ACK ACK to client. ACK 3. On proce processi ssing ng of client client¶¶s request request server server sends sends Reply containing results of processing to client. Blocked state 4. On recept reception ion of reply reply client client sends ACK to server  server  Executing state Failure handling«continued Three message reliable IPC protocol for client server : Client Server  Request 1. Clie Client nt sen sends ds a requ request est me mess ssag age e to server  2. On rec recept eption ion ser server ver pro proces cesses ses the request and prepares a reply and sends it to client; meanwhile client remain blocked. 3. On rec recept eption ion of repl reply y from from ser server  ver  client resumes its execution and sends a ACK to server server.. Server  remain blocked until the ACK from client. Reply ACK Blocked state Executing state Failure handling«continued Two message IPC protocol p rotocol for  client server communication : Client Server  Request 1. Clie Client nt sen sends ds a requ request est me mess ssag age e to server and enters in block state for time=time-out. 2. On re rece cept ptio ion n of req reque uest st fro from m client server processes it and prepares a reply and sends it to client. 3. Serve Servers rs ker kerne nell waits waits for for ACK ACK from clients kernel for time=timeout; In absence server  retransmits the reply to client. Reply Blocked state Executing state Failure handling «continued Unsuccessful Request server   execution client Request message Send request Time-out lost Send request Retransmit request message Time-out crash Send request Retransmit request message Time-out lost Send request  ACK Successful request Executions May produce Different results Send response Retransmit request message  ACK Send response Example of fault tolerant communication between client-server  Idempotency & handling duplicate request messages Idempotency means repeatabilty repeatabilty.. Idempotent operation produces same results without any side effects no matter how many times it is performed with the same arguments. Eg. sqrt(64)=8 for any repeated execution. Nonidempotent Nonidempote nt operations do not necessarily produce the same results when executed repeatedly with the same arguments. Eg. See the following follow ing code Debit(amount) { if ((balance-amount) >=( minbalance)) { balance=balance-amount; return(³success´,balance); } else return(³failure´,balance); }  // produces different results for amount=100 for every operation Nonidempotentt operation Nonidempoten client Send request Send request Time-out Debit(100) Debit( 100) Server  Minbalance Balance=1000 =200 lost Balance=1000--100 Balance=1000 =900 =9 00 Retransmit Debit(100) Debit(100) crash Send request Retransmit Debit(100) Debit(100) Send request Received Balance =700 =700 Desired=900 Desired=9 00 Balance=900 Balance=9 00--100 lost  ACK( Success,8 Success,800) 00) =800 =8 00 Retransmit Debit(100) Debit(100)  ACK( Success,700) Success,700) Balance=800--100 Balance=800 =700 =7 00 Example of nonidempotent operation without any measures for f ault detection Idempotency & handling duplicate request messages«continued Problem of nonidempotency can be solved using by avoiding orphan executions ( executions of client request done at server side, results of which won¶t reach to client and so may client keep retransmitting the same request; yielding in wrong result(s) ) of  requests from client. This can be achieved by using exactly-once semantics, semantics, which ensures that only one execution of server¶s server ¶s operation is performed for one request. Requires identification of orphan executions. Primitives based on exactly-once semantics are most desired but difficult to implement. Exactly-once semantics Request ID uses unique identifier for identifier for every request that a client makes. Sets up a reply cache in the kernel¶s address space on the server machine to cache replies. Execution Status Executed  / Before forwarding a request to server server,, kernel of server  Not machine checks to see if a reply already exists in reply received cache or not. If yes, yes, that means request is duplicate and already executed. So previously computed result is extracted from reply cache and new response is send to client. If no, no, request is forwarded to appropriate approp riate server server by kernel. Result obtained Reply-cache contents of Exactlyonce semantics Exactly-once operation Minbalance Server  =200 Balance=1000 client Debit(100) Debit( 100) Send request-1 request-1 lost Reqest-1 Reqest1 Send request-1 request-1 Retransmit Debit(100) Debit(100) Balance=1000 Balance= 1000--100 Time-out =900 =9 00 crash Send request-1 request-1 Retransmit Debit(100) Debit(100) Send request-1 request-1 Received Balance =900 =900 Desired=900 Desired=9 00 Reqest-1 Reqest-1 lost already executed  ACK( Success,9 Success,900) 00) Balance=900 Balance=9 00 Retransmit Debit(100) Debit(100) Reqest-1 Reqest1 already executed  ACK( Success,900) Success,900) Balance=900 Balance=9 00 Example of exactly-once operation Lost and out of sequence packets Keeping track of lost and out o ut of sequence packets is a issue in multidatagram messages. In multidatagram message transmission is said to be complete iff all packets are received by a process to which w hich it is sent. Simple way is acknowledge each packet p acket separately. separately. ( stopand-wait protocol) . This leads to communication overhead. Better approach is sending one acknowledgement ackno wledgement for  complete multidatagram message when all packets get received at receiver end.( end.( blast protocol) Lost and out of sequence packets «continued In blast protocol a node crash or a link failure may lead to following problems: one or more packets of multidatagram message are lost in communication the packets are received out of sequence by the receiver. Efficient mechanism is to use a bitmap to identify the packets of a message. In this approach header part of each eac h packet consists of two of two extra fields, fields, one of which w hich specifies the total number of packets in multidatagram message and other is the bitmap field that specifies the position of this packect in the complete message. Sender  Receiver  Type of  address address message Message ID No of  packets Packet Sequence no Rest of  Or bitmap message Lost and out of sequence packets «continued In multidatagram message a suitable buffer is set aside by receiver  rece iver  using No_of_packets field in first packet. Bitmap field gives information where exactly a received packet must be stored in set aside buffer for the particular message. selective repeat : After time-out, if all packets are not received, r eceived, Bitmap ids of nonreceived packets are communicated with the sender. On receiving this information sender sends only those packets that have not been received by receiver. The process get repeated until transmission of multidatagram message won¶t get completed. i.e. when all packets of message get received by receiver this retransmission of select packets stops. Sender of multidatagram multidatagram Message Mess age that consists of Five packets Receiver of multidatagram Message Send request message (M1,5,P1) lost Packets of  The Response Message Time-out (M1,5,P2) (M1,5,P3) Place this packet In position 3 (M1,5,P4) lost (M1,5,P5) Create buffer for five Packets and store this Packet in position 2 Buffer  For 5 Packets Place this packet In position 5 Missing packets info. (M1,5,P1) Resend missing packets (M1,5,P4) ACK Retransmit request Retransmit For missing packets Place this packet In position 1 Place this packet In position 4 M1- Messea Messeage ge ID=1 5=packets in M1 Send ACK P1,P2«= Ith packet 1 2 3 4 5 Use of bitmap to keep track of lost and out of sequence packets in multidatagram message transmission Group communication Elementary form of communication is one-to-one or unicast communication. DS require group communication (in addition to unicast) those are 1. One to many ( single sender and multiple multiple recei receiver ver ) multicast ( no of receivers are predefined and known) broadcast(( no of receivers are unknown and broadcast undefined) 2. Many to one ( multiple multiple sender sender and single single receiv receiver) er) 3. Many to many many ( multiple multiple sender and multiple multiple receiv receiver) er) Group Management In one to many communication, Receiver processes of a message form a group; closed and open. A closed group is one in which only the members of the group can send a message to the the group. In close group, group, an outside process cannot send a message to group as a whole, although it may send a message to an individual member  of group. Open group is one in which w hich any process in system can send message to group as a whole. Usage of close/open group is application specific and any flexible message passing system must support both types of groups. Facility of Dynamic of  Dynamic creation and deletion of groups is must. And a process must be allowed to enter or leave the group at any time. Group Managemen Managementt Simple mechanism for this is to use centralized group server  to manage groups and their membership information. Centralized server approach suffers from the problems of  poor reliability and poor scalability common to all centralized systems. Replication of group servers adds communication overhead in keeping group information of all group servers consistent. Group addressing Two level naming scheme is normally used. High level group name is in ASCII string that is independent of the location information of the processes in group. Low level group name depends to a large extent on underlying hardware. Special N/W address to which multiple machines can listen is called as multicast address, address, possible on some N/Ws. Packet sent to multicast address is delivered to the machines linked to multicast address. N/Ws, which can not create multicast address may have broadcasting facility by declaring a particular address such as a s zero as broadcast address.. Packet sent to broadcast address is delivered to all address machines in entire N/W N/W.. Message delivery to receiver processes User uses high level group names in programs. Centralized group server maintains mapping between high and low level group addresses ( names) along with w ith the process identifiers of  all processes for each group. Kernels of sender, receiver , and group server does appropriate mapping and unmapping operations with rest of other operations like encoding/decoding to deliver message to correctly to receiver. Sender is not at all aware of either size of group or actual mechanism used for group addressing. Sender simply sends the message to a group by specifying its high level name, and the OS takes the responsibility r esponsibility to deliver the message to all the group members. Buffered and unbuffered multicast Multicasting is asynchronous communication mechanism due to following reasons. 1. It is is unrealisti unrealistic c to expect expect a sending sending proces process s to wait until until all all the receiv receiving ing processes that belong to multicast multicast group are ready to receive rece ive the multicast message. 2. Sending Sending process may not be aware aware of all all receivi receiving ng processes processes that belon belong g to the multicast group. For an unbuffered multicast, multicast, the message is not buffered for the receiving process and is lost if receiving process is not in a state ready to receive it. So the message is received only by those processes of multicast group that are ready to receive it. or a a For  buffered multicast, multicast, the message is buffered for the receiving processes, so each process of multicast group eventually eventually receive the message. Same is true for broadcasting communication. Send to all and bulletin-board semantics  Send-to-all semantics : a copy of message is sent to each process of the multicast group and message is buffered until it is accepted by the process. Following two factors are ignored by send-to-all semantics. Relevance of a message to a particular receiver may depend on the receiver¶s state. Message not accepted within a certain time after transmission may no longer be useful; their value may depend on sender¶s state.  Bulletin-board semantics : a message to be multicast is addressed to a channel instead of being sent to every individual process of multicast group. Receiving process copies message from channel instead of  removing it when it makes a receive request on the channel. Process that have receive access right on the channel constitute the multicast group.Thus channel acts as a bulletin-board. Flexible reliability in multicast communication 1. The 00-re reli liabi abili lity ty : no response is expected by sender from any of the receiver. Useful in asynchronous multicast. 2. The 11-re reli liabi abili lity ty : sender expects the reply from any of  receivers. 3. m-out m-out-o -off-nn-re reli liabl able e : the multicast group consists of n receivers and sender expects a response from m ( 1