Channel: prevent sequence holes and ghost envelopes when sending on a dying outlet

RNSChannelOutlet.send() can return a packet that never reached the wire
(link not ACTIVE, no capable interface, etc). The old Channel.send()
queued the envelope in _tx_ring before calling outlet.send(), then
tried to rewind _next_sequence and remove the envelope if the outlet
returned a failed packet. Two problems:

- Between queueing and outlet.send() returning, _tx_ring held an
envelope with packet.raw=None. Any worker thread iterating the
ring (timeout fire, proof callback) crashed in get_packet_id's
packet.get_hash() with a TypeError on None.raw.

- The rewind was only safe for a single-threaded sender: it checked
"is _next_sequence one past mine?" and skipped the rewind otherwise.
Under concurrent senders, the rewind silently failed, leaving a
hole in the on-wire sequence stream. The receiver's contiguous
seqnum rule then stalled the channel permanently with no error.

This fix serializes the reservation-and-transmit pair with a per-channel
_send_lock so the rewind is always correct, and defers queueing until
outlet.send() returns a real packet so _tx_ring never contains a
packet-less envelope. _packet_tx_op() and get_packet_id() now also
defensively skip/return-None for packet-less envelopes.

Also handle the small race where a proof arrives between outlet.send()
registering the receipt and us installing the delivery callback: after
registration, re-read the receipt status and synthesize the
_packet_delivered() call if it's already DELIVERED.
This commit is contained in:
Jeremy O'Brien
2026-05-19 09:23:48 -04:00
parent ebf544d335
commit 794e437f6d
2 changed files with 131 additions and 57 deletions
+42 -1
View File
@@ -62,7 +62,7 @@ class Packet:
def set_delivered_callback(self, callback: Callable[[Packet], None]):
self.delivered_callback = callback
def delivered(self):
with self.lock:
self.state = MessageState.MSGSTATE_DELIVERED
@@ -265,6 +265,47 @@ class TestChannel(unittest.TestCase):
self.assertEqual(MessageState.MSGSTATE_FAILED, packet.state)
self.assertFalse(envelope.tracked)
def test_send_on_failing_outlet_does_not_corrupt_state(self):
# if outlet.send() returns a packet that never reached
# the wire (LinkChannelOutlet does this when the link is not ACTIVE; the
# returned packet has raw=None), Channel.send() must not consume a
# sequence number or leave a packetless envelope in _tx_ring. Before
# the fix, the envelope was queued before outlet.send() returned, so a
# "dead" return left a raw=None envelope in the ring and silently
# advanced _next_sequence, stalling the channel on the other end.
print("Channel test send on failing outlet")
original_send = self.h.outlet.send
def ghost_send(raw):
with self.h.outlet.lock:
packet = Packet(None)
packet.state = MessageState.MSGSTATE_FAILED
self.h.outlet.packets.append(packet)
return packet
self.h.outlet.send = ghost_send
pre_sequence = self.h.channel._next_sequence
self.assertEqual(0, len(self.h.channel._tx_ring))
with self.assertRaises(RNS.Channel.ChannelException):
self.h.channel.send(MessageTest())
# Sequence must not have been consumed.
self.assertEqual(pre_sequence, self.h.channel._next_sequence)
# _tx_ring must not contain a packetless envelope.
self.assertEqual(0, len(self.h.channel._tx_ring))
# A subsequent successful send should use the same sequence number as
# was reserved for the failed attempt.
self.h.outlet.send = original_send
envelope = self.h.channel.send(MessageTest())
self.assertEqual(pre_sequence, envelope.sequence)
self.assertIsNotNone(envelope.packet)
self.assertIsNotNone(envelope.packet.raw)
self.assertTrue(envelope in self.h.channel._tx_ring)
def test_multiple_handler(self):
print("Channel test multiple handler short circuit")