IRC-kosmi-relay/docs/GATEWAY_TIMING_FIX.md
cottongin db284d0677 Move troubleshooting and implementation docs to docs/
Relocate 30 non-essential .md files (investigation notes, fix summaries,
implementation details, status reports) from the project root into docs/
to reduce clutter. Core operational docs (README, quickstart guides,
configuration references) remain in the root.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-02-07 13:40:46 -05:00


CRITICAL FIX: Gateway Timing and Listener Start

The Problem

Messages were being queued but NEVER flushed, because the Remote channel never became ready:

17:21:23  📦 Remote channel not ready, queued message (1 in queue)
17:21:23  📤 Attempting to flush 1 queued messages
17:21:23  📦 Flushed 0 messages, 1 still queued  ← NEVER SENDS!

Root Cause: Starting Listener Too Early

We were starting listenForMessages() in Connect(), but Matterbridge's architecture requires this order:

  1. Connect() - Establish connection, set up client
  2. Router sets up gateway - Creates the b.Remote channel and wires everything together
  3. JoinChannel() - Called by router when gateway is READY
  4. THEN start receiving messages

By starting the listener in Connect(), we were receiving messages BEFORE the gateway was ready, so b.Remote wasn't set up yet.
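The ordering problem can be reproduced in isolation. The sketch below is not Matterbridge code: Message, bridge, and deliver are hypothetical stand-ins, and the only thing borrowed from the description above is the idea that Remote does not exist until the router has set up the gateway.

```go
package main

import "fmt"

// Message and bridge are hypothetical stand-ins, not Matterbridge types.
type Message struct{ Text string }

type bridge struct {
	Remote chan Message // nil until the router has wired up the gateway
}

// deliver tries to hand msg to Remote without blocking and reports
// whether the message was accepted.
func (b *bridge) deliver(msg Message) bool {
	if b.Remote == nil {
		return false // gateway not set up yet
	}
	select {
	case b.Remote <- msg:
		return true
	default:
		return false // Remote exists but is full
	}
}

func main() {
	b := &bridge{}

	// Step 1 only: Connect() has returned. A listener started here
	// would try to deliver before the gateway exists.
	fmt.Println("before gateway setup:", b.deliver(Message{Text: "early"}))

	// Step 2: the router creates the Remote channel.
	b.Remote = make(chan Message, 1)

	// Steps 3-4: JoinChannel() is called, then messages are received.
	fmt.Println("after gateway setup:", b.deliver(Message{Text: "on time"}))
}
```

The first call reports false and the second true, which mirrors the difference between starting the listener in Connect() and starting it in JoinChannel().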

The Solution

Delay starting the message listener until JoinChannel() is called.

Changes Made

  1. In graphql_ws_client.go:

    • Removed go c.listenForMessages() from Connect()
    • Added new method StartListening() that starts the listener
    • Added StartListening() to the KosmiClient interface
  2. In kosmi.go:

    • Call b.client.StartListening() in JoinChannel()
    • By this point the router has finished setting up the gateway

Code Changes

// graphql_ws_client.go
func (c *GraphQLWSClient) Connect() error {
    // ... connection setup ...
    
    c.log.Info("Native WebSocket client connected (listener will start when gateway is ready)")
    
    // DON'T start listener here!
    return nil
}

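// StartListening launches the message listener goroutine. Call it once the
// gateway is ready (from JoinChannel), not from Connect.
func (c *GraphQLWSClient) StartListening() {
    c.log.Info("Starting message listener...")
    go c.listenForMessages()
}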

// kosmi.go
func (b *Bkosmi) JoinChannel(channel config.ChannelInfo) error {
    b.Log.Infof("Channel %s is already connected via room URL", channel.Name)
    
    // NOW start listening - the gateway is ready!
    b.Log.Info("Gateway is ready, starting message listener...")
    if b.client != nil {
        b.client.StartListening()
    }
    
    return nil
}
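One thing to watch with this change: if the router calls JoinChannel() more than once (for example, once per configured channel, which is an assumption here rather than something verified in this document), StartListening() as written would spawn a second listener goroutine. A possible guard, sketched with a hypothetical client type standing in for GraphQLWSClient, is to wrap the start in sync.Once:

```go
package main

import (
	"fmt"
	"sync"
)

// client is a hypothetical stand-in for GraphQLWSClient.
type client struct {
	startOnce sync.Once
	wg        sync.WaitGroup
	mu        sync.Mutex
	started   int // number of listener goroutines that have run
}

// listenForMessages stands in for the real read loop; here it only
// records that it ran.
func (c *client) listenForMessages() {
	defer c.wg.Done()
	c.mu.Lock()
	c.started++
	c.mu.Unlock()
}

// StartListening starts the listener at most once, no matter how many
// times it is called.
func (c *client) StartListening() {
	c.startOnce.Do(func() {
		c.wg.Add(1)
		go c.listenForMessages()
	})
}

// listenerCount waits for started listeners to finish and returns how
// many ran.
func (c *client) listenerCount() int {
	c.wg.Wait()
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.started
}

func main() {
	c := &client{}
	c.StartListening() // first JoinChannel()
	c.StartListening() // second JoinChannel(), ignored
	fmt.Println("listeners started:", c.listenerCount())
}
```

Whether this guard is needed depends on how many channels the Kosmi bridge is configured with. With a single room per bridge instance, the unguarded version behaves the same.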

Why This Fixes Everything

Before (Broken)

1. Connect() starts
2. listenForMessages() starts immediately
3. Messages start arriving
4. Try to send to b.Remote → NOT READY YET
5. Queue messages
6. Router finishes setup (b.Remote now ready)
7. Flush is only attempted when a new message arrives → nothing triggers it
8. Messages stuck in queue forever

After (Fixed)

1. Connect() starts
2. Connection established, but NO listener yet
3. Router finishes setup (b.Remote ready)
4. JoinChannel() called
5. StartListening() called
6. listenForMessages() starts NOW
7. Messages arrive
8. Send to b.Remote → SUCCESS!
9. No queue needed (but available as safety net)
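Step 9 mentions that the queue stays in place as a safety net. The actual queueing code is not shown in this document, so the following is only a sketch of how such a fallback might behave. The relay type and its forward and flush methods are hypothetical names, and the sketch is single-goroutine only (a real bridge would need a mutex around the queue).

```go
package main

import "fmt"

// Message and relay are hypothetical stand-ins, not Matterbridge types.
type Message struct{ Text string }

type relay struct {
	Remote chan Message // nil until the gateway is ready
	queue  []Message    // messages waiting for Remote
}

// forward appends msg to the queue and then tries to flush, so queued
// messages keep their arrival order.
func (r *relay) forward(msg Message) {
	r.queue = append(r.queue, msg)
	r.flush()
}

// flush sends as many queued messages as Remote will accept without
// blocking and returns how many were sent.
func (r *relay) flush() int {
	sent := 0
	for len(r.queue) > 0 && r.Remote != nil {
		select {
		case r.Remote <- r.queue[0]:
			r.queue = r.queue[1:]
			sent++
		default:
			return sent // Remote is full; retry on the next flush
		}
	}
	return sent
}

func main() {
	r := &relay{}

	r.forward(Message{Text: "early"}) // Remote is nil, so this stays queued
	fmt.Println("queued before setup:", len(r.queue))

	r.Remote = make(chan Message, 4) // gateway becomes ready

	r.forward(Message{Text: "late"}) // flushes "early" first, then "late"
	fmt.Println("queued after setup:", len(r.queue))
	fmt.Println("delivered:", len(r.Remote))
}
```

With the listener started from JoinChannel(), the first branch (a nil Remote) should not occur in practice, so the queue only matters if Remote is temporarily full.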

Expected Behavior

Logs Should Show

17:20:43  Starting bridge: kosmi.hyperspaceout
17:20:43  Connecting to Kosmi
17:21:00  Successfully connected to Kosmi
17:21:01  Native WebSocket client connected (listener will start when gateway is ready)
17:21:01  Successfully connected to Kosmi
17:21:01  kosmi.hyperspaceout: joining main
17:21:01  Gateway is ready, starting message listener...
17:21:01  Starting message listener...
17:21:01  🎧 [KOSMI WEBSOCKET] Message listener started
17:21:23  📨 [KOSMI WEBSOCKET] Received: type=next id=subscribe-messages
17:21:23  Received message from Kosmi: [2025-11-01T17:21:23-04:00] cottongin: IKR
17:21:23  ✅ Message forwarded to Matterbridge  ← SUCCESS!

Why This Matters

This is a fundamental timing issue in how bridges integrate with Matterbridge's gateway system. The gateway must be fully initialized before messages start flowing; otherwise b.Remote is not set up yet and incoming messages cannot be routed.

This pattern should be followed by ALL bridges:

  • Connect and authenticate in Connect()
  • Start receiving messages in JoinChannel() (or equivalent)

Testing

go build
docker-compose build
docker-compose up -d
docker-compose logs -f matterbridge

Send messages immediately after the bot connects. They should:

  1. NOT be queued
  2. Be forwarded directly to IRC/other bridges
  3. Appear in all connected channels

No more "Remote channel not ready" messages!
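To check this without reading the logs by eye, a small helper can count occurrences of the warning text. This is a convenience sketch, not part of the project: countQueued is a hypothetical name, and the marker string is copied from the log excerpt at the top of this document.

```go
package main

import (
	"fmt"
	"strings"
)

// marker is the warning text taken from the broken-state log excerpt.
const marker = "Remote channel not ready"

// countQueued returns how many times the warning appears in logs.
func countQueued(logs string) int {
	return strings.Count(logs, marker)
}

func main() {
	// Sample lines taken from the log excerpts in this document.
	broken := "17:21:23  📦 Remote channel not ready, queued message (1 in queue)\n" +
		"17:21:23  📤 Attempting to flush 1 queued messages\n"
	fixed := "17:21:01  Gateway is ready, starting message listener...\n" +
		"17:21:23  ✅ Message forwarded to Matterbridge\n"

	fmt.Println("broken log warnings:", countQueued(broken))
	fmt.Println("fixed log warnings:", countQueued(fixed))
}
```

The same function could be pointed at captured output from docker-compose logs. A count of zero after sending a few test messages suggests the listener is starting at the right time.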