Introduction to the EGTP version 1 Architecture: document version 0.3 I. Overall Design A. Nonces and References Data is transmitted between EGTP nodes in the form of messages. Each message contains a unique (randomly generated) nonce which guarantees that it is unique among all messages ever transmitted in the history of the universe. When you receive a message from another EGTP node, and you wish to *reply* to that message so that the sender will get your reply, and will know that it is a reply to that particular message that she sent, you include a hash of the original message in your reply (this hash is called the "reference" -- it refers to the original message). (You always hash a canonical form of the original message, so that no issues of encoding, endianness, and so forth effect the result of the hash.) This allows the sender to know with certainty that your message is a reply to that particular query of his. It also serves as a cryptographically guaranteed ACK, so that the sender knows his message was received by you. This is also how the sender's EGTP Node associates queries with replies, and it also provides assurance that the reply messages received aren't a replay attack or some other active attack. * This seems to mean that EGTP does not deal with acks, so that it is un-reliable. Is this the case? We we implemented UDP over TCP? - I don't understand -- the reply messages *is* an ACK, and thus enables reliable (more precisely: retriable) messaging. How can I make this more clear? * Ok, but an ACK in TCP is sent even if there is no data to send. Is this the case here? ie. is it always Query, Reply... Or are you allowed to not reply. Or, put another way, if I send a packet, am I gaurenteed a reply? If I do not get a reply does egtp resend the original querry? How long does it wait? What is the retransmit strategy? How long before it decides the host is dead? etc... The nonces and references are included with the message, and are always present regardless of what communication substrate the message is delivered over, including TCP, Relay, or perhaps in the future UDP or wireless networking. Note that this design does not provide protection against active attacks for the node that receives the initial query, only for the node that sent the initial query and awaits the reply. (The node receiving the initial queries could protect themselves against replay attacks by remembering the hash of each message they have ever received, but this would require an unbounded amount of storage.) For example: Alice sends a request to Bob for a list of currently available e-texts. Bob sends back such a list in his response. Mallory (the malicious attacker) saves a copy of Bob's response. Mallory's intent is that in a few weeks' time, when Alice sends a request to Bob for a *new* list of currently available e-texts, Mallory will substitute the saved response for Bob's *new* response, thus effectively hiding any *new* e-text listings from Alice. Without the "reference" feature, Mallory might succeed at this ploy. With the "reference" feature, he will not. (As Alice will see that this reply is not a response to her *new* query.) For an example of how this design does not protect the recipient of a request, consider that Alice sends a request to Bob for a list of currently available e-texts. Mallory saves a copy of Alice's requests. Mallory can then resend that same copied request to Bob over and over, and each time Bob will think that it is a new request from Alice and answer it accordingly. In this particular example Mallory isn't able to do much damage, but in other examples he might. Fixing this problem turns out to be surprisingly subtle, at least judging by how long it has taken me, Zooko, to come up with a complete solution. That solution will form the basis of EGTP v2 design. * Do tell. ;) - In "NewArchitecture.txt", coming soon to a CVS repository near you. This problem is documented as a security bug: http://sourceforge.net/tracker/index.php?func=detail&aid=548075&group_id=44377&atid=439350 B. Timeouts Each query message that has been sent is associated with one kind of timeout, called a "hard" timeout. This is the interval after which the sender will free up all storage associated with the request. This means it will absolutely not do *anything* with an answer that might subsequently come in other than drop it on the floor. Each query message can also optionally be associated with a "soft" timeout. This is the interval after which the sender will get a notification that the soft-timeout interval has passed and no response has yet arrived. This is useful for what we call "impatience actions". The common example in Mnet is that you have requested a block of data from a peer. After the soft-timeout interval has passed, you start to suspect that the peer is unresponsive, so you request a *replacement* block from a *different* peer. If the original peer subsequently sends the block that you requested, then you go ahead and use it, just as if it had arrived before the soft-timeout. (This is something you cannot do with a hard timeout.) C. Lookup and Discovery The EGTP engine depends on two other services, a Lookup Service and a Discovery Service. The Lookup Service is used to get the current "comm strategy" for a given peerId. There are two reasons that you might have a peerId but not its associated comm strategy: 1. Someone told you "Hey, you ought to talk to that guy." and gave you the peerId but didn't give you the comm strategy. 2. The comm strategy that you had stopped working (perhaps because the peer changed IP addresses or relay servers, for example.) A good implementation of the Lookup Service might be a distributed hash table like Khashmir, storing digitally signed comm strategies. Another good implementation of a Lookup Service might be a system where whenever your IP address changes you send out your new IP address to your 5 closest friends, and everyone else knows who your friends are and knows to ask them for your current IP address. (This is the design advocated by the e-lang project's "Vat Location Service".) The Discovery Service is used to meet relay servers. Your EGTP Node invokes a method of the DiscoveryMan which says "Gee, I'd like to meet some nice new Relay Servers.". The DiscoveryMan subsequently invokes a method of your EGTP Node saying "Here are the peerIds of some nice new Relay Servers.". A good implementation of the Discovery Service might be a local configuration file that the user manually updates when she wants to change relay servers. II. General Implementation Stuff * Maybe this section should be called definitions and put first? - I don't think any of the terms in section II here are used in section I. Do you still think that would be better? * Hmm... you have changed my mind. A. nodeIds nodeIds are universally unique 20-byte identifiers. They are formed by taking the SHA1 hash of the public key of the Node, along with some extra fixed strings for cryptographic/engineering reasons. When you are talking about the nodeId of a peer (as opposed to the node Id of yourself), you call it a "peerId". B. Relay Relay is an important feature of EGTP, but it is best understood as a layer on *top* of the TCP part of EGTP. Ignore relay on your first pass through, and then once you understand EGTP-over-TCP you can understand relay as follows: To relay a message M to a recipient B through a relay R, you simply make a message M2, which says "recipient: B, message: M" and you send M2 to R. Now that you understand that much, you can look at CommStrat.Relay (implements the sending of relayed messages), RelayServerHandlers (implements the relay service) (whoops -- that file is missing from EGTP -- find it in mnet), and RelayListener (implements receiving relayed messages). C. Threading and Event Queue EGTP uses an event-driven architecture to simplify things. What this means is that there is an event queue, named "DoQ", and a single thread that runs it. The DoQ pops the next event off of the front of the queue, executes it, and then iterates. There are no other threads in the system, so any and all code that gets executed gets executed by being in an event that gets popped off the front of the DoQ. (That's not *completely* true. the DoQ. The exception is that the TCPConnection object interacts with the low-level sockets via a separate thread called "the asyncore thread". There are liberal precondition assertions to help you notice right away if you make a call from the wrong thread.) The reason event-driven programming is simpler is that when you are writing code, you do not have to worry about the twin threats of deadlock and thread inconsistency. You do not have to worry about deadlock because nothing ever locks. You do not have to worry about multi-threading inconsistency, because when you are writing (or reading) a function, you can look at the source code and see laid out right in front of you all of the operations that might effect the data during the execution of the function. These two simplifications make for a huge win in design and debugging time. Learn to love them. D. Throttling and Other Delays If a TCPConnection can't send a message, because the connection isn't open yet, or the outgoing buffer is full, or something, then it just holds the message in its own buffer and tries again when the asyncore notifies it to proceed. Throttling simply artificially enforces this situation: when throttled, the TCPConnection will not try to send any data until unthrottled. III. Specific Implementation Stuff (Classes) A. CommStrat There is a class named "CommStrat". It is a generalization of an address. Subclasses of it include CommStrat.TCP (which handles both the case that you have an IP address and port num, and the case that you have an open TCP socket), CommStrat.Relay (which means that messages get sent via a relay server), and CommStrat.Crypto (which means that messages get encrypted and then passed on to a "lower-level strategy" for actual delivery). I now believe that it was a mistake to make CommStrat.Crypto have the same role as the actual transport strats, but oh well. Strategy objects should have a send() method. B. TCPCommsHandler TCPCommsHandler is mostly responsible for managing when to open and close connections. It tries to be very clever about keeping connections open if it expects to use them again. It's complicated. We don't know for sure that the behavior is good enough to justify the performance, but on the other hand it is the product of three years of evolution, and it works, so please don't break it unless you need to. TCPCommsHandler holds a set of TCPConnection instances (one TCPConnection object per open TCP connection). It gets called on the DoQ every once in a while to clean up idle connections. C. TCPConnection TCPConnection is responsible for: a) twiddling the socket objects and managing buffers of incoming and outgoing data. Closing the connection. This is the part that most closely touches Asyncore. b) prepending lengths to outgoing messages and "chunking" incoming streams into known-length messages. Also letting the higher-level code know if an outgoing message couldn't be sent (known as "fast fail"). c) making calls to a "Throttler" object to enable bandwidth throttling. When do we fast fail? In general, you fast fail if and only if you couldn't figure out how to transport the data off of your own local machine. For example, if the only known comm strat to reach the recipient is a CommStrat.TCP, and nobody is accepting connections at that IP address, then you give up. In contrast, if you are sending via Relay then once you have sent the message off to the Relay Server you cannot then fast fail, because you can't know for sure that your message won't eventually reach the recipient (even if the Relay Server tells you that it couldn't deliver). Fast fail is a local operation that is not allowed to be vulnerable to failure or misbehavior on the part of any other agent including your Relay Server. IV. Advanced Stuff A. Retrying In EGTP v2, both sides will be safe from replay attacks. Therefore, both sides can safely *retry* messages in order to get more reliable delivery. In EGTP v1 this isn't true, in general, and operations that should be retried are coded to do so at the application level. For example, when trying to reconstruct a file, you might ask the same block server a second time for a given block. It wouldn't be safe for EGTP to do the message-retrying *for* you, because EGTP doesn't know whether it is safe for the recipient to perform the requested action twice or not. (Safe: "send me a copy the block with this blockId" (except for bandwidth resource management issues), unsafe: "Take 10 nanobux out of my account and put them into the account with this accountId".) * Ahhh.. This seems to answer the question I had above. Maybe above you should just make reference to this explanation. In EGTP v2, the EGTP layer will be able to automatically retry for you regardless of the application-layer content of the messages. * I believe this means that EGTP currently is a best effort system. Hence its like UDP. (Like my question above.) Am I right, or am I missing somthing?