Christopher Rohrs, Lime Peer Technologies LLC (LimeWire)
t o p
with ideas from Vincent Falco, Free Peers Inc. (BearShare)
We propose a revised
ping/pong scheme for the Gnutella network. Our scheme helps clients connect to
the network more easily, uses a small amount of bandwidth, and is compatible
with older clients. The key idea is to cache pongs and send
only the best N pongs to a client.
t o p
Gnutella's ping and pong messages serve two functions:
- Return IP addresses of other servents so you can establish more connections.
- Provide a measure of the horizon size, i.e., the number of hosts and files searchable.
Unfortunately the current ping/pong scheme has two problems. First, it
consumes an enormous amount of bandwidth - up to 50%
of the total Gnutella messaging bandwidth. Secondly, most returned addresses
are not reachable or not
accepting incoming connections. For this reason, it is difficult to connect
to the Gnutella network.
proposes an improved way of achieving the first function. Our method does not
introduce any new
messages and requires no modifications to the existing ping and pong message
formats; it simply changes when they are sent and how they are
interpreted. Our scheme has three
- Efficiency The amount of bandwidth used
for pings/pongs is capped at roughly 100 bytes/sec/connection, even in the
presence of an attacker or older clients.
- Quality If a host A receives a pong from
another host B, B was accepting incoming connections within the last 15
- Fairness If a host sends a ping and has
been playing by the rules, it is guaranteed a pong in response - unless
there are really no hosts accepting incoming connections.
Our scheme does not attempt to measure the horizon accurately. As long as a servent receives one pong
from each host in its horizon, pongs will consume a lot of bandwidth. As an example, receiving 4000 pongs
would require (4000 pongs)(37 bytes/pong)/(1 sec/5000 bytes)=30 seconds on a
typical 56kb/sec modem. However,
we do suggest some new ways to approximate the horizon in the appendix.
The key idea of our
scheme is to cache pongs to avoid broadcasting pings too rapidly. We also
introduce a technique called ping
multiplexing to avoid additional broadcasts while the cache is
filling. Additional refinements
include limiting the number of pongs sent in response to a ping, and throttling
incoming pings to defend against attackers and old clients. This
entire scheme can be implemented
easily and efficiently, and we give complete pseudocode at the end of this
t o p
Pong limiting is an
easy way to reduce the number of pongs on the network. First of all,
a host should only send a
pong with its own address if that host is not firewalled and can currently
accept incoming connections. This
is critical to our scheme. Remember that we are not trying
to provide information about the horizon
size; we are only interested in efficiently helping people connect to the
Second, a host
should only route a fixed number of pongs in response to a ping. If everyone
is using the above rule,
picking 10 or so pongs should be enough to help hosts connect quickly. The
question is which 10 pongs to send
back. Pongs with lower hops values
are more useful to unconnected clients trying to establish a permanent
connection, as these pongs are younger. On the
other hand, pongs with higher hops values are more useful to
neighbors trying to increase their horizon size through additional
connections. Therefore a good
approach is to try to distribute hops values equally and to send pongs from a
variety of connections. In this
case, that might mean sending one pong with a hops value of 1 from connection
1, one pong with a hops value of 2 from connection 2, etc. Clients may also strive
to send different pongs to each connection in response to each ping.
t o p
Pong limiting does not solve the efficiency problem if too
many pings are being sent. Consider a single servent A
with a stable set of good connections. Now another host B joins A and sends a
ping. A broadcasts the ping to all
hosts within its horizon and routes the pongs back. Now say that B disconnects
and is replaced by another host
C. As usual C send a ping to
A. Unless a lot of time has
elapsed since B sent its ping, the ping from C will travel to the same set of
hosts, and A will route the same set of pongs back. So twice as many messages
have been sent than necessary. More generally, any two pings that a
hosts sees within a small period of time are likely to have the same effect.
We can take advantage of this by caching the most recent pongs and avoiding the
broadcast of pings. In the above
example, A caches the pongs it saw in response to B's ping and simply sends
these back to C - without
forwarding C's ping. This is
illustrated to the right. Of
course A will have to set the GUID of each pong that it sends to C. If
every host on the network is caching
pongs, the amount of traffic will be greatly reduced.
To ensure that
hosts get fresh pongs, it is important that the cache expires every few
seconds. To refresh its cache, a
host pings all its neighbors every 3 seconds and stores all pongs sent in
response. The TTL on this ping is
the standard outgoing value, say 5 or 7. Broadcasting
a ping every few seconds may sound like a crazy idea, but
some quick analysis shows that very little bandwidth is actually used.
The trick is to observe that hosts
receiving a ping usually respond by simply dipping into their pong caches.
In other words, a ping is rarely a true
Let the cache
expiration time, in seconds, be T. Let N be the maximum number of pongs sent in
response to a ping. Then at most one ping and N pongs are
sent per connection per T seconds. Since a ping is 23 bytes and a pong is 37
bytes, the amount of bandwidth
used per connection is (23+N*37)/T bytes/sec.
Assuming T=3 and N=10, that works out to only 131 bytes/sec/connection.
So a typical modem user with 2
connections and a total bandwidth of 5 kbytes/sec will only have to devote 5%
of his bandwidth to pings and pongs - an order of magnitude less than the current
scheme. Reducing N or increasing T
can further decrease bandwidth requirements. Furthermore, no pong will be
older than M*T seconds, where M
is the maximum allowable TTL. For
M=5, that works out to only 15 seconds.
In an earlier
version of our scheme, if a host with an empty cache received a ping with TTL
i, it refreshed the cache by broadcasting the ping with a TTL of i-1—not M—and
storing the results. If the host
subsequently received a ping with a higher TTL, it would broadcast that ping
immediately. We called this
process a “TTL upgrade”. If there
is little demand for pongs, the earlier demand-driven scheme is more efficient
than our proposed scheme because only a small portion of the network is pinging
at any given time. However, if
pongs are in demand, it is less efficient because each node may send up to M
pings (and N*M pongs) per connection per T seconds.
For this reason, our proposed scheme does not use TTL
t o p
The above discussion assumes that hosts reply to a ping instantaneously. Obviously
this is not so. The worst case occurs when the caches of all hosts expire at the same
time. In this case, every ping results in a “cache miss”, so hosts must wait for pongs
to be routed from their originator. If the latency per connection is not significantly
smaller than the cache expiration time, a host may receive a pong shortly before its cache
expires. For this reason, it is advisable that T be no smaller than 3 seconds.
Furthermore, clients implementing message prioritization should make pings and pongs a
higher priority message than queries and replies.
Another difficulty arises when two pings
come in quick succession. Consider three hosts A, B, and C, connected in
a line as shown in the first picture to the right. Assume C is well-connected
to the network. A sends a ping to B and B broadcasts it
to C. Now before A has received
any replies to that ping, a new host D connects to B and sends a ping.
Because B's cache is still empty, B must broadcast D's ping if it wishes to
This second broadcast can be avoided through a technique we call ping multiplexing.
The basic idea is to “multiplex” many incoming pings into a single outgoing ping per
connection. Conversely we “demultiplex” a single incoming pong into multiple outgoing
pongs. In the above example, B does not broadcast D’s ping. Instead it waits until it
has received pongs from C and then send these along to both A and D. This is illustrated
in the last second and third pictures to the right. For the sake of simplicity, this
illustration ignores the forwarded ping sent from B to A in response to D’s ping.
To summarize: respond to a ping immediately with cached pongs, then set up information
so that later pongs will be routed to the connection as needed. This can be implemented
by maintaining two additional pieces of information for each connection: the GUID of the
last ping received from that connection and how many pongs the connection still needs.
The information is used as follows:
- When receiving a ping from a connection, store the GUID in the connection.
Respond immediately with as many cached pongs as possible, but not more than N.
Set the number of pongs the connection still needs appropriately.
- When receiving a pong from a connection, forward the pong along all connections
still needing more pongs, using the GUIDs stored on those connections.
With this implementation, traditional ping routing tables are not needed!
t o p
Dealing with Old Clients
If all clients used the above scheme, at most one ping and M
pongs would be sent per connection per T seconds. Unfortunately, new clients
will have to deal with old and
malicious clients that forward pings at unbounded speeds.
For this reason, our scheme limits the
rate that incoming pings are accepted; any pings exceeding that rate are
dropped. This is implemented by
maintaining the time that the last ping was accepted from a connection.
Again, we stress that this is solely to
defend against old clients and malicious attackers.
It is also wise to treat older clients specially when
sending pings and receiving pongs. First, a new client should
ping an old client no more than once every 30
seconds or so. The reason is that
pings to old clients result in true broadcasts and are harmful to the network.
Moreover, older clients will likely
return a flood of pongs, consuming a lot of the new client's bandwidth.
Second, pongs from older clients should
be placed in a special reserve cache.
Because these pongs include hosts that cannot accept incoming connections,
they should only be used when absolutely necessary.
For example, if a host lost all connections shortly after
the normal cache expired, the reserve cache could be used to establish new
Luckily we have a way to identify older clients.
As discussed at the O’Reilly Conference
and on the GDF, clients should handshake by initially sending pings with a
marked GUID. The ninth byte of
this GUID (i.e., GUID in C or Java) is 0xFF, and the last byte is the
protocol version number. (Note
that this ping handshaking has nothing to do with the “GNUTELLA CONNECT 0.4”
handshake string sent; unfortunately this string must be frozen for the sake of
backwards compatibility.) The
current protocol version number is 0, which stands for “Gnutella 0.5”.
We propose to use a version number of 1
for our ping/pong scheme and call this “Gnutella 0.6”.
t o p
Below is pseudocode
that puts all of the above ideas together.
In our pseudocode, cache refreshes are triggered when
receiving a ping T seconds after the last cache refill.
Another implementation is to use a
timer to refresh the cache exactly every T seconds.
In practice, both will behave nearly the same.
There are some
special cases below for dealing with old clients.
However, we have omitted some details, such as sending
periodic update pings to old clients and using the reserve cache when no new
pongs can be found. Also, there is
a special case for sending handshake pings and receiving pings from a network
- T: both the expire time for cached
pongs and the throttle time. Suggested value: 3 seconds.
- M: the max TTL. Suggested value: 7.
- N: the max number of pongs to return in response to a ping. Suggested
- CACHE: the pong cache. An M array list, each containing a
list of <address, connection> pairs. Associating an address with a
connection prevents us
from sending a cached pong down the path it came from.
- NEXT_EXPIRE: the time we will next
expire CACHE. At this time,
we will also forward pings to all connections.
- RESERVE_CACHE: the cache of expired
pongs and pongs from old clients. Used in case the cache empties and all
connections suddenly die.
Maintain the following variables per connection C:
- ACCEPT_TIME: the time we will next accept a ping from C.
- ACCEPT_GUID: the GUID of the last ping accepted on C, or null if not seen.
- NEEDED: an array of size M. NEEDED[i] is the number of pongs with hops value i needed by C.
When establishing a new connection C:
- If you don’t need any connections and have enough values in RESERVE_CACHE, send C a new ping with TTL 1 for handshaking purposes.
- Otherwise send C a new ping with TTL=M.
When receiving a ping P on connection C, even if C is a “temporary/reject connection”:
- Throttle incoming pings to defend against old clients and attackers.
If the current time is less than C.ACCEPT_TIME, drop P and return from this method.
Otherwise set C.ACCEPT_TIME to be the current time+T.
- Refresh the cache if necessary, being sure to synchronize properly.
If the current time is greater than NEXT_EXPIRE, do the following:
- Set NEXT_EXPIRE=current time+N.
- Empty the cache, copying the contents into RESERVE_CACHE.
- Send a ping with a new GUID and TTL M to all connections to newer clients. [sic]
- Special case for crawlers: if P.TTL=2 and P.HOPS=0, send pongs with the address and ports of all neighbors.
Return from this method.
- Setup pong demultiplexing tables. Set C.ACCEPT_GUID=P.GUID. For all 0P.TTL, set C.NEEDED[i]=0.
- Return a pong with my address if I can accept incoming connections right now.
- Return cached pongs. For all i between 1 and P.TTL, inclusive,
- Choose NEEDED[i] addresses from CACHE[i] as possible, or as many as possible. Be sure that no address was received from C.
- Decrement C.NEEDED[i] by the amount chosen.
- Send the addresses to C in pongs with P.GUID.
When receiving a pong P on connection C:
t o p
- If C is an old client, place in RESERVE_CACHE and return from this method.
- Add to CACHE[P.HOPS].
- Route/demultiplex P. For each connection C’ except for C,
- Check if C’ still needs a pong. If C’.NEEDED[P.HOPS]==0, continue to the next connection.
- Send P along C’ with GUID given by C’.ACCEPT_GUID.
- Decrement C’.NEEDED[P.HOPS]
Appendix: Estimating Horizon Size
Our ping/pong scheme does not provide a way to measure the horizon size. We do acknowledge, however, that there
is some value in providing reasonable horizon statistics. This information tells the user whether she is well-connected
to the network and may be useful in helping a client establish better connections. Here we outline some ways of measuring
the horizon size.
The easiest way is to maintain a large set of all pongs seen, including those from older connections. Eventually this
set will contain pongs from most non-firewalled hosts in the horizon. However, this set will grow very slowly, since at
most 10 pongs are added per 3 seconds per connection. Furthermore, the client should take care to expire entries that are
more than a few minutes old.
A similar approach is to maintain a set of all hosts responding to queries. (Remember that the query reply message
includes the address of the host sharing the file.) This also has some problems. First, this scheme only measures those
hosts in the horizon that are sharing files, although this is arguably as important as the total horizon size. Secondly
this set will also grow slowly since query replies only consume a small percentage of all messages. Finally, it is
difficult to measure the number of files being shared without adding extra information to the query hit descriptor.
The best approach is to introduce one or two messages solely for the purpose of communicating horizon size. Assume for
simplicity that the Gnutella network is a tree, i.e., there is only one path between any two hosts. A key observation is
that the number of hosts reachable from some host with TTL N is the sum of the number of hosts reachable from all its
neighbors with TTL N-1. For formally, let H[A, n, B] be the number of hosts within n hops of host A that are reachable
through B. Let H’[A, n, B] be the number of hosts within n hops of host A that are not reachable through B. Then for
any host A with neighbors N1...Nm,
H[A, n, Ni]=1+H’[Ni, n-1, A]
H’[A, n, Ni]=H[A, n, N1]+…+H[A, n, N(i-1)+H[A, n, N(i+1)]+…+H[A, n, Nm]
H’[A, 0, Ni]=0
Horizon estimation now works through a sort of dynamic programming algorithm. Each host maintains the horizon size
per connection, i.e., H[A, n, Ni] for all n and i. Every few seconds or so, hosts calculate their H’ tables from these
values, exchange them with neighbors, and use these values to update their H tables. The total horizon size reported to
the user is the sum over all i of H[A, n, Ni]. This scheme is accurate, efficient, and uses almost no bandwidth.
t o p