IRC-kosmi-relay/docs/LESSONS_LEARNED.md
cottongin db284d0677 Move troubleshooting and implementation docs to docs/
Relocate 30 non-essential .md files (investigation notes, fix summaries,
implementation details, status reports) from the project root into docs/
to reduce clutter. Core operational docs (README, quickstart guides,
configuration references) remain in the root.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-02-07 13:40:46 -05:00

Lessons Learned: WebSocket Interception in Headless Chrome

The Problem

When implementing the Kosmi bridge, we initially tried several approaches:

  1. Native Go WebSocket Client: Failed with 403 Forbidden due to missing session cookies
  2. HTTP POST with Polling: Worked for queries but not ideal for real-time subscriptions
  3. ChromeDP with Post-Load Injection: Connected but didn't capture messages

The Solution

The key insight came from examining the working Chrome extension's inject.js file. The solution required two critical components:

1. Hook the Raw WebSocket Constructor

Instead of trying to hook into Apollo Client or other high-level abstractions, we needed to hook the raw window.WebSocket constructor:

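const OriginalWebSocket = window.WebSocket;

// The queue must exist before the first message arrives.
window.__KOSMI_MESSAGE_QUEUE__ = window.__KOSMI_MESSAGE_QUEUE__ || [];

// Record a message without ever throwing into the page's own handlers.
function captureMessage(event, source) {
  let data = event.data;
  try {
    data = JSON.parse(event.data);
  } catch (e) {
    // Not JSON (for example a binary frame): keep the raw payload
  }
  window.__KOSMI_MESSAGE_QUEUE__.push({
    timestamp: Date.now(),
    data: data,
    source: source
  });
}

window.WebSocket = function(url, protocols) {
  const socket = new OriginalWebSocket(url, protocols);
  const target = String(url); // url may be a URL object, not a string
  
  if (target.includes('engine.kosmi.io') || target.includes('gql-ws')) {
    // Wrap addEventListener for 'message' events
    const originalAddEventListener = socket.addEventListener.bind(socket);
    const originalRemoveEventListener = socket.removeEventListener.bind(socket);
    socket.addEventListener = function(type, listener, options) {
      if (type === 'message' && typeof listener === 'function') {
        const wrappedListener = function(event) {
          captureMessage(event, 'addEventListener');
          return listener.call(this, event);
        };
        return originalAddEventListener(type, wrappedListener, options);
      }
      return originalAddEventListener(type, listener, options);
    };
    
    // Also wrap the onmessage property. The own property shadows the native
    // accessor, so the wrapper has to be registered explicitly or the
    // page's handler would never fire.
    let userHandler = null;
    let wrappedHandler = null;
    Object.defineProperty(socket, 'onmessage', {
      get: function() { return userHandler; },
      set: function(handler) {
        if (wrappedHandler) { originalRemoveEventListener('message', wrappedHandler); }
        userHandler = typeof handler === 'function' ? handler : null;
        wrappedHandler = userHandler && function(event) {
          captureMessage(event, 'onmessage');
          userHandler.call(socket, event);
        };
        if (wrappedHandler) { originalAddEventListener('message', wrappedHandler); }
      },
      configurable: true
    });
  }
  
  return socket;
};

// Keep instanceof checks and constants such as WebSocket.OPEN working.
window.WebSocket.prototype = OriginalWebSocket.prototype;
['CONNECTING', 'OPEN', 'CLOSING', 'CLOSED'].forEach(function(name) {
  window.WebSocket[name] = OriginalWebSocket[name];
});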
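
The hook only fills window.__KOSMI_MESSAGE_QUEUE__; something still has to read it. The sketch below shows one possible way to empty the queue in batches. It is an assumption, not a description of the bridge's actual code: the helper name drainKosmiQueue is hypothetical, and the global object is passed as a parameter so the example also runs outside a browser.

```javascript
// Hypothetical helper: removes and returns everything captured so far.
// globalObject is window in a browser; here it is a plain stand-in object.
function drainKosmiQueue(globalObject) {
  const queue = globalObject.__KOSMI_MESSAGE_QUEUE__;
  if (!Array.isArray(queue)) {
    return []; // hook not installed yet
  }
  // splice(0) empties the array in place, so the hook keeps pushing into
  // the same array object and no message is returned twice.
  return queue.splice(0);
}

// Usage with a stand-in for window:
const queueWindow = { __KOSMI_MESSAGE_QUEUE__: [] };
queueWindow.__KOSMI_MESSAGE_QUEUE__.push(
  { timestamp: 1, data: { type: 'next' }, source: 'onmessage' },
  { timestamp: 2, data: { type: 'next' }, source: 'addEventListener' }
);

const firstBatch = drainKosmiQueue(queueWindow);
const secondBatch = drainKosmiQueue(queueWindow);
console.log(firstBatch.length, secondBatch.length); // 2 0
```

If a helper like this were part of the injected script, the Go side could poll it with chromedp.Evaluate and unmarshal the returned array; draining in place avoids re-processing messages on the next poll.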

2. Inject Before Page Load

The most critical lesson: The WebSocket hook MUST be injected before any page JavaScript executes.

Wrong Approach (Post-Load Injection)

// This doesn't work - WebSocket is already created!
chromedp.Run(ctx,
    chromedp.Navigate(roomURL),
    chromedp.WaitReady("body"),
    chromedp.Evaluate(hookScript, nil), // Too late!
)

Why it fails: By the time the page loads and we inject the script, Kosmi has already created its WebSocket connection. Our hook never gets a chance to intercept it.
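
The timing problem can be reproduced without a browser. The following is a minimal sketch using a stand-in FakeWebSocket class and a plain object in place of window (both are assumptions made for illustration): a socket created before the constructor is replaced never passes through the hook.

```javascript
// FakeWebSocket is a stand-in so the sketch runs outside a browser.
class FakeWebSocket {
  constructor(url) {
    this.url = url;
  }
}

// Stand-in for the page's global object (window in a real browser).
const timingWindow = { WebSocket: FakeWebSocket };
const intercepted = [];

function installHook(globalObject) {
  const Original = globalObject.WebSocket;
  globalObject.WebSocket = function (url) {
    intercepted.push(String(url)); // record every socket created after the patch
    return new Original(url);
  };
  globalObject.WebSocket.prototype = Original.prototype;
}

// The page opens its socket first, as Kosmi does during load...
const early = new timingWindow.WebSocket('wss://example.invalid/gql-ws');

// ...and only then does the hook arrive (post-load injection).
installHook(timingWindow);

// Only sockets created from now on pass through the hook.
const late = new timingWindow.WebSocket('wss://example.invalid/gql-ws?late');

console.log(intercepted.length); // 1: only the late socket was seen
```

Replacing the constructor cannot reach back to instances that already exist, which is why the injection has to precede the page's own scripts.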

Correct Approach (Pre-Load Injection)

// Inject BEFORE navigation using Page.addScriptToEvaluateOnNewDocument.
// "page" is the github.com/chromedp/cdproto/page package.
chromedp.Run(ctx, chromedp.ActionFunc(func(ctx context.Context) error {
    _, err := page.AddScriptToEvaluateOnNewDocument(hookScript).Do(ctx)
    return err
}))

// Now navigate - the hook is already active!
chromedp.Run(ctx,
    chromedp.Navigate(roomURL),
    chromedp.WaitReady("body"),
)

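Why it works: Page.addScriptToEvaluateOnNewDocument is a Chrome DevTools Protocol method that registers a script to be evaluated in every frame as soon as the frame is created, before any of the frame's own scripts run. The registration stays in effect for later navigations in the same tab, so when Kosmi's JavaScript creates the WebSocket, our hook is already in place to intercept it.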

Implementation in chromedp_client.go

The final implementation:

func (c *ChromeDPClient) injectWebSocketHookBeforeLoad() error {
    script := c.getWebSocketHookScript()
    
    return chromedp.Run(c.ctx, chromedp.ActionFunc(func(ctx context.Context) error {
        // Use Page.addScriptToEvaluateOnNewDocument to inject before page load
        _, err := page.AddScriptToEvaluateOnNewDocument(script).Do(ctx)
        return err
    }))
}

func (c *ChromeDPClient) Connect() error {
    // ... context setup ...
    
    // Inject hook BEFORE navigation
    if err := c.injectWebSocketHookBeforeLoad(); err != nil {
        return fmt.Errorf("failed to inject WebSocket hook: %w", err)
    }
    
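    // Now navigate with the hook already active. Run on c.ctx, the same
    // context the hook was registered on; a different tab would not have it.
    if err := chromedp.Run(c.ctx,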
        chromedp.Navigate(c.roomURL),
        chromedp.WaitReady("body"),
    ); err != nil {
        return fmt.Errorf("failed to navigate to room: %w", err)
    }
    
    // ... rest of connection logic ...
}

Verification

To verify the hook is working correctly, check for these log messages:

INFO Injecting WebSocket interceptor (runs before page load)...
INFO Navigating to Kosmi room: https://app.kosmi.io/room/@hyperspaceout
INFO Page loaded, checking if hook is active...
INFO ✓ WebSocket hook confirmed installed
INFO Status: WebSocket connection intercepted

If you see "No WebSocket connection detected yet", the hook was likely injected too late.
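
A status check like the one these log lines imply could be built from a few flags set by the hook script. The sketch below is an assumption about how that might look, not the bridge's actual code: the names __KOSMI_WS_HOOK_INSTALLED__, __KOSMI_WS_CONNECTIONS__, markHookInstalled, markConnection and hookStatus are all hypothetical, and a plain object stands in for window.

```javascript
// Hypothetical flags; the real hook script may use different names.
function markHookInstalled(globalObject) {
  globalObject.__KOSMI_WS_HOOK_INSTALLED__ = true;
  globalObject.__KOSMI_WS_CONNECTIONS__ = 0;
}

// Would be called by the hook each time a matching socket is constructed.
function markConnection(globalObject) {
  globalObject.__KOSMI_WS_CONNECTIONS__ += 1;
}

// Returns a plain object, which serialises cleanly to the automation side.
function hookStatus(globalObject) {
  const queue = globalObject.__KOSMI_MESSAGE_QUEUE__;
  return {
    installed: globalObject.__KOSMI_WS_HOOK_INSTALLED__ === true,
    connections: globalObject.__KOSMI_WS_CONNECTIONS__ || 0,
    queued: Array.isArray(queue) ? queue.length : 0,
  };
}

// Usage with a stand-in for window:
const statusWindow = {};
const statusBefore = hookStatus(statusWindow);
markHookInstalled(statusWindow);
markConnection(statusWindow);
const statusAfter = hookStatus(statusWindow);
console.log(statusBefore.installed, statusAfter.installed, statusAfter.connections); // false true 1
```

Separating "installed" from "connections" distinguishes the two failure modes: a hook that never ran, and a hook that ran but saw no matching socket.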

Key Takeaways

  1. Timing is Everything: WebSocket interception must happen before the WebSocket is created
  2. Use the Right CDP Method: Page.addScriptToEvaluateOnNewDocument is specifically designed for this use case
  3. Hook at the Lowest Level: Hook window.WebSocket constructor, not higher-level abstractions
  4. Wrap Both Event Mechanisms: Intercept both addEventListener and onmessage property
  5. Test with Real Messages: The connection might succeed but messages won't appear if the hook isn't working
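
Takeaway 4 can be checked outside a browser before testing against real traffic. The following sketch is illustrative only: FakeSocket, wrapSocket and deliver are hypothetical names, the socket is a bare EventTarget rather than a real WebSocket, and it assumes the onmessage setter registers its wrapper through the original addEventListener.

```javascript
// Stand-in socket: EventTarget provides real addEventListener/dispatchEvent.
class FakeSocket extends EventTarget {}

const captured = [];

function wrapSocket(socket) {
  const add = socket.addEventListener.bind(socket);
  const remove = socket.removeEventListener.bind(socket);

  // Mechanism 1: addEventListener('message', ...)
  socket.addEventListener = function (type, listener, options) {
    if (type !== 'message' || typeof listener !== 'function') {
      return add(type, listener, options);
    }
    return add(type, function (event) {
      captured.push({ source: 'addEventListener', data: event.data });
      return listener.call(this, event);
    }, options);
  };

  // Mechanism 2: socket.onmessage = ...
  let handler = null;
  let wrapped = null;
  Object.defineProperty(socket, 'onmessage', {
    get() { return handler; },
    set(next) {
      if (wrapped) remove('message', wrapped);
      handler = typeof next === 'function' ? next : null;
      wrapped = handler && function (event) {
        captured.push({ source: 'onmessage', data: event.data });
        handler.call(socket, event);
      };
      if (wrapped) add('message', wrapped);
    },
    configurable: true,
  });
  return socket;
}

function deliver(socket, data) {
  const event = new Event('message');
  event.data = data; // stand-in for MessageEvent.data
  socket.dispatchEvent(event);
}

// A page that uses addEventListener...
const viaListener = wrapSocket(new FakeSocket());
viaListener.addEventListener('message', () => {});
deliver(viaListener, 'a');

// ...and a page that assigns onmessage are both captured.
const viaProperty = wrapSocket(new FakeSocket());
viaProperty.onmessage = () => {};
deliver(viaProperty, 'b');

console.log(captured.map((c) => c.source).join(',')); // addEventListener,onmessage
```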

Applying This Lesson to Other Projects

This pattern applies to any scenario where you need to intercept browser APIs in headless automation:

  1. Identify the API you need to intercept (WebSocket, fetch, XMLHttpRequest, etc.)
  2. Write a hook that wraps the constructor or method
  3. Inject using Page.addScriptToEvaluateOnNewDocument before navigation
  4. Verify the hook is active before the page creates the objects you want to intercept
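
As an illustration of steps 1 and 2 for a different API, here is a minimal sketch that wraps fetch in the same style. The names installFetchHook and __FETCH_LOG__ are hypothetical, and a plain object with a stub fetch stands in for window so the example runs outside a browser.

```javascript
// Hypothetical names: installFetchHook and __FETCH_LOG__ are illustrative only.
function installFetchHook(globalObject) {
  const originalFetch = globalObject.fetch;
  globalObject.__FETCH_LOG__ = [];

  globalObject.fetch = function (input, init) {
    // input may be a string, a URL, or a Request-like object with a url field
    const url = typeof input === 'object' && input !== null && 'url' in input
      ? input.url
      : String(input);
    globalObject.__FETCH_LOG__.push({ timestamp: Date.now(), url: url });
    // Delegate unchanged so the page behaves exactly as before.
    return originalFetch.call(globalObject, input, init);
  };
}

// Usage with a stand-in for window and a stub fetch:
const fetchWindow = {
  fetch: function (input) {
    return Promise.resolve({ ok: true, requested: String(input) });
  },
};
installFetchHook(fetchWindow);
const pending = fetchWindow.fetch('https://example.invalid/api');
console.log(fetchWindow.__FETCH_LOG__[0].url); // https://example.invalid/api
```

In a real page the script would be registered with Page.addScriptToEvaluateOnNewDocument and called as installFetchHook(window), exactly as with the WebSocket hook.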

This approach is more reliable than browser extensions for server-side automation because:

  • No browser extension installation required
  • Works in headless mode
  • Full control over the browser context
  • Can run on servers without a display