                 Out-of-band Shortcut for Query Hits

                          Raphael Manfredi
                    <Raphael_Manfredi@pobox.com>
                         December 31st, 2011


1. INTRODUCTION

Looking at the TCP traffic of a moderately well-connected ultra node (40 ultra
peers, 200 leaves) reveals that the size of relayed Gnutella messages splits
as follows:

	4.86% for Pongs, 44.34% for Queries and 50.79% for Query Hits

Seven years after the introduction of Out-of-band query hits, there are still
a number of Gnutella clients out there that do not honor the OOB flag in
queries and send their query hits through TCP.

This is a cause for concern because relaying query hits through Gnutella
creates traffic bottlenecks and triggers flow-control, which results in the
dropping of less important messages such as queries.  The delayed reception
of query hits also disturbs the dynamic querying algorithms, which monitor
the amount of hits received to measure the popularity of a query.

This OOB Shortcut for Query Hits ("OSQH" for short) proposal aims at fixing
some of these problems at the cost of more intensive processing and resource
consumption within ultra peers.  However, ultra peers are the backbone of
the Gnutella network, and these nodes can trade off some computing resources
to save TCP bandwidth.


2. OVERVIEW

The OSQH feature works thusly:

Ultra nodes already track all the queries that pass through them to avoid
duplicates.  For all the queries that are stored in their routing table,
they now need to track a few extra pieces of information:

- Whether the query requested OOB query hit delivery.
- The IPv4:port or IPv6:port where query hits should be delivered via OOB.
- For secure OOB, the query security token.
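
As an illustration only, the extra per-query state listed above could be held
in a record such as the following.  This is a minimal sketch: the class name,
the field names and the "done" marker are hypothetical and are not part of
this proposal.

```python
from dataclasses import dataclass
from ipaddress import IPv4Address, IPv6Address, ip_address
from typing import Optional, Union


@dataclass
class QueryRouteEntry:
    """Hypothetical routing-table record for one relayed query."""
    muid: bytes                               # 16-byte MUID of the query
    wants_oob: bool = False                   # query requested OOB hit delivery
    oob_addr: Optional[Union[IPv4Address, IPv6Address]] = None
    oob_port: Optional[int] = None            # UDP port for OOB delivery
    security_token: Optional[bytes] = None    # only set for secure OOB
    done: bool = False                        # querying party no longer interested


# Example: a query asking for OOB delivery to an IPv4 address (documentation range).
entry = QueryRouteEntry(
    muid=bytes(16),
    wants_oob=True,
    oob_addr=ip_address("192.0.2.10"),
    oob_port=6346,
)
```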

For IPv6 address information in queries, please refer to the IPv6-Ready
specifications which all modern servents should implement.

When query hits come back via TCP and they have more than one hop to go
before reaching their destination (TTL > 1 after local decrement), ultra nodes
look at whether the corresponding query was flagged with OOB by consulting
their routing table.

If not, the query hit is routed normally via TCP, as currently done.

If so, however, the query hit is buffered locally and, after some time (to
accumulate more hits), an OOB Reply Indication is sent to the OOB destination
(as remembered in the routing table) via a LIME/12v2 or LIME/12v3 message.
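
The routing decision described above can be sketched as follows.  The
function name and its return values are assumptions made for the example;
only the two conditions (remaining TTL and the OOB flag of the query) come
from the text.

```python
def route_query_hit(ttl_after_decrement: int, query_wants_oob: bool) -> str:
    """Decide how a query hit received via TCP is forwarded.

    Returns "buffer" when the hit should be held for an OOB shortcut,
    and "tcp" when it should be relayed along the normal return path.
    """
    # The shortcut only applies when the hit has more than one hop left to go.
    if ttl_after_decrement > 1 and query_wants_oob:
        return "buffer"
    return "tcp"
```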

The "flag" byte of the OOB Reply Indication message is extended and 0x2 is
OR-ed to the already defined "firewalled-flag" (0x1).  The 0x2 bit indicates
that the hits come from a relaying ultrapeer, and were not generated by that
ultrapeer.  This additional information can be ignored by the receiving party
or be used to activate special logic to claim all the reported hits and not
just a few.
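
A small sketch of how the flag byte could be composed and tested.  The bit
values 0x1 and 0x2 are the ones given above; the constant and function names
are hypothetical.

```python
FLAG_FIREWALLED = 0x1   # already defined "firewalled-flag"
FLAG_RELAYED = 0x2      # new bit: hits come from a relaying ultrapeer


def reply_indication_flags(firewalled: bool, relayed: bool) -> int:
    """Build the flag byte of an OOB Reply Indication."""
    flags = 0
    if firewalled:
        flags |= FLAG_FIREWALLED
    if relayed:
        flags |= FLAG_RELAYED
    return flags


def is_relayed(flags: int) -> bool:
    """True when the sender is relaying hits it did not generate itself."""
    return bool(flags & FLAG_RELAYED)
```

A receiver that does not know about the 0x2 bit simply never calls
something like is_relayed() and treats the indication as a regular one.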

Upon reception of the OOB Reply Indication message, the querying party will
then proceed with claiming the hits, in which case the ultra peer unbuffers
the requested amount of hits.  If more hits were buffered since the indication,
another OOB Reply Indication is sent after all the initial hits have been
claimed.

Note that the amount of hits advertised in LIME/12 is NOT the amount of
query hit messages but the sum of all the individual entries held in each
query hit.

If the servent claims 255 hits (the maximum that can possibly be requested),
the ultra peer delivers entire query hit messages until the limit of 255 hits
is reached, after which if there is additional information left, it will send
another OOB Reply Indication message.  This avoids flooding the receiving
end by controlling the amount of traffic that can be received.
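
The sketch below illustrates the hit accounting and the delivery of whole
query hit messages.  Each buffered message is represented only by the number
of entries it holds.  Two points are assumptions of the example, not
statements of this proposal: the advertised count is capped at 255, and
delivery stops before a message that would push the total over the limit.

```python
from collections import deque
from typing import Deque, List

MAX_CLAIM = 255  # maximum number of hits that can be requested


def advertised_count(buffered: Deque[int]) -> int:
    """Hits to advertise: the sum of entries, not the number of messages."""
    return min(MAX_CLAIM, sum(buffered))


def take_messages(buffered: Deque[int], limit: int) -> List[int]:
    """Remove and return whole messages while the total stays within limit."""
    sent: List[int] = []
    total = 0
    while buffered and total + buffered[0] <= limit:
        total += buffered[0]
        sent.append(buffered.popleft())
    return sent
```

Whatever is left in the buffer after a delivery would be the subject of the
next OOB Reply Indication.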

If the servent claims fewer than 255 hits, it means the querying party is no
longer interested and the query can be flagged as "done" in the routing table.
Any further hits received for that query can be safely discarded after the
requested hits have been sent back.

The limit of 255 hits is naturally adjusted down if there were fewer hits
advertised initially (the ultra peer must remember how many hits it indicated).
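
One possible reading of the two rules above, as a sketch with hypothetical
function names: a claim smaller than the effective limit means the query can
be marked as "done".

```python
def effective_limit(advertised: int) -> int:
    """The 255-hit limit, adjusted down to what was actually advertised."""
    return min(255, advertised)


def querying_party_done(claimed: int, advertised: int) -> bool:
    """True when the claim shows the querying party is no longer interested."""
    return claimed < effective_limit(advertised)
```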

Because UDP traffic can be lost and because this can result in the querying
party being shadowed from a potentially large amount of hits, the ultra peer
resends the OOB Reply Indication message two more times if it does not receive
any claim after some time, doubling the period each time.  There is no way for
the receiver to know that this is a re-emission, and it could be confused with
an indication that there are more hits to claim.
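
The re-emission schedule can be sketched as below.  The initial period is
left unspecified in this proposal ("some time"), so the value used in the
example is purely hypothetical.

```python
from typing import List


def indication_send_times(initial_period: float, resends: int = 2) -> List[float]:
    """Offsets (in seconds) at which the indication is sent, absent any claim.

    The first emission is at offset 0; the waiting period doubles after
    each re-emission.
    """
    times = [0.0]
    period = initial_period
    for _ in range(resends):
        times.append(times[-1] + period)
        period *= 2
    return times
```

With a hypothetical initial period of 5 seconds, the three emissions would
happen at offsets 0, 5 and 15 seconds.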

Therefore, the ultra node will send a new GTKG/11v1 message to indicate that
there are no more hits to claim for that query, which will also serve as an
acknowledgement to the querying party if it attempts to resend a claiming
request because it thinks the previous one got lost: lack of reception of any
query hit back for the query or lack of reception of a GTKG/11v1 after some
time will indeed indicate the likelihood that some messages were dropped.

The GTKG/11v1 is specific to this OSQH proposal and is not sent back during
the regular OOB query hit delivery that would happen between a node having
results to send back and a querying party.

If the querying party does not claim the buffered hits after some reasonable
time (and after 3 notifications), the hits are dropped and the query is marked
as "done".


3. SPECIFICATION DETAILS

3.1 Gnutella Connections

An ultrapeer supporting OSQH will advertise this in its "features supported"
vendor message to let its peers know about it ("OSQH", version 1).  Optionally
it can also be included in the Gnutella handshaking headers, in the X-Features
line, as version 0.1, such as:

	X-Features: OSQH/0.1

in case Gnutella ultra peers want to favor connections to other OSQH-supporting
peers to reduce their query hit routing.
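
A peer wishing to detect OSQH support from the handshake could parse the
header along these lines.  This is only a sketch: the function name is
hypothetical, and it assumes that multiple features on the X-Features line
are separated by commas.

```python
from typing import Dict


def parse_x_features(line: str) -> Dict[str, str]:
    """Map feature names (upper-cased) to version strings for an X-Features line."""
    name, _, value = line.partition(":")
    if name.strip().lower() != "x-features":
        return {}
    features: Dict[str, str] = {}
    for item in value.split(","):
        feature, _, version = item.strip().partition("/")
        if feature:
            features[feature.upper()] = version
    return features


def supports_osqh(line: str) -> bool:
    """True when the header line advertises OSQH, whatever the version."""
    return "OSQH" in parse_x_features(line)
```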


3.2 GTKG/11v1 -- OOB Hit Exhaustion

This vendor message is specified thusly:

	Name: "OOB Hit Exhaustion"
	Vendor: GTKG
	ID: 11
	Version: 1
	TTL: 1
	Payload: None

The MUID of the vendor message matches the query's MUID.

Upon reception of this message, the querying party knows that there will be
no more hits returned by the remote ultra peer for this query, so it can
stop asking.
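
For illustration, the message could be serialized as sketched below.  The
vendor, ID, version, TTL and empty payload come from the table above.  The
wire layout is an assumption of the example and should be checked against
the vendor message specifications: a 23-byte Gnutella header (MUID, function,
TTL, hops, little-endian length), function code 0x31 for vendor messages,
and an 8-byte vendor prefix (4-byte vendor code, then ID and version as
little-endian 16-bit values).

```python
import struct

VENDOR_MESSAGE_FUNCTION = 0x31   # assumed function code for vendor messages


def build_oob_hit_exhaustion(query_muid: bytes) -> bytes:
    """Serialize a GTKG/11v1 message carrying the MUID of the query."""
    if len(query_muid) != 16:
        raise ValueError("MUID must be 16 bytes")
    # Vendor prefix: "GTKG", ID 11, version 1; no further payload.
    body = b"GTKG" + struct.pack("<HH", 11, 1)
    # Header: function, TTL = 1, hops = 0, length of what follows the header.
    header = query_muid + struct.pack(
        "<BBBI", VENDOR_MESSAGE_FUNCTION, 1, 0, len(body)
    )
    return header + body
```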


3.3 UDP Traffic Compression

The exchanges made via OSQH are naturally suitable for UDP traffic compression,
as specified on August 13th, 2006.  This is a negotiated feature so it is not
a pre-requisite for OSQH, although its support is strongly advised.


3.4 Push-Proxy Concerns

It might be argued that OOB-shortcutting will prevent PUSH routes from being
established.

Although it is true that PUSH routes will not be established along the
full query hit return path (since it will be taking a shortcut), a PUSH
route is still possible if one realizes that the relaying ultra node is
actually an additional push-proxy: it has seen the query hit route so
far, so it can relay a PUSH if contacted about the servent ID.

This is actually no more fragile than what the original PUSH route would
have been since the loss of one ultrapeer in the chain would break the whole
route.

Naturally, this is less of an issue for legacy servents which support the
original push-proxy specifications and the later extensions, since the query
hit will contain a few of these push-proxies.  But the querying servent
receiving the hit through the shortcut can add that relaying hop as well.

The net conclusion is that OOB-shortcutting neither prevents PUSH routes
nor makes them more fragile.  It just requires the servents to use more
modern ways to issue PUSHes: via UDP for contacting push-proxies, and
through DHT lookups for refreshing the list of push-proxies.

Raphael
