We assert that, in many delay tolerant networks, duplicates may pose a larger problem: they hinder
the ability to partially process data within the network. In-network processing is seen as
desirable because it can dramatically reduce bandwidth requirements. Unfortunately, as we will
see, if data is coarsely aggregated within the network, it can be difficult to detect or eliminate
duplicates, which can lead to incorrect answers.
In-network processing has been proposed in a number of delay-prone environments. For example, in
sensor networks, bandwidth is generally scarce, especially at the edges of the network, and thus
doing some fusion or aggregation of sensor readings as data is routed is potentially beneficial. A
number of papers note the benefits of in-network aggregation, citing order-of-magnitude or greater
bandwidth reductions for some classes of operations given particular network topologies. Similarly,
when moving data between different classes of networks (e.g. the Internet and GPRS), it may be
useful to transcode or downsample data items, sometimes in a non-deterministic way, as when
dithering an image.
If the network cannot guarantee duplicate-free semantics, some in-network operations might produce
incorrect answers: consider a sensor network attempting to compute an average over the readings
from a number of sensors. If one of these readings is duplicated, it will skew the computed average.
We call such operations duplicate sensitive. Of course, some in-network operations are duplicate
insensitive: computing a minimum of a set of readings, for example, has this property.
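To make the distinction concrete, here is a minimal Python sketch (using made-up readings) showing how a single duplicated reading skews a duplicate-sensitive aggregate such as an average, while a duplicate-insensitive aggregate such as a minimum is unaffected:

```python
# Hypothetical temperature readings from three sensors.
readings = [20.0, 22.0, 24.0]

# The same readings, but one was delivered twice by the network.
duplicated = readings + [24.0]

# Average is duplicate sensitive: the duplicate skews the result.
avg = sum(readings) / len(readings)          # 22.0
avg_dup = sum(duplicated) / len(duplicated)  # 22.5 -- incorrect

# Minimum is duplicate insensitive: the duplicate has no effect.
min_ok = min(readings)      # 20.0
min_dup = min(duplicated)   # 20.0 -- still correct
```

The same contrast holds for other aggregates: sums and counts are duplicate sensitive, while maxima and set membership are duplicate insensitive.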
Thus, we have seen that, unless we wish to sacrifice the availability of our network, duplicates may
arise in disconnection-prone delay tolerant networks. Furthermore, because many such networks may
wish to perform in-network computation, duplicates can be more problematic than in traditional
networks. In the next section, we examine possible techniques for mitigating the overhead of