# Lessons Learned: WebSocket Interception in Headless Chrome

## The Problem

When implementing the Kosmi bridge, we initially tried several approaches:

1. **Native Go WebSocket Client**: Failed with 403 Forbidden due to missing session cookies
2. **HTTP POST with Polling**: Worked for queries but is a poor fit for real-time subscriptions
3. **ChromeDP with Post-Load Injection**: Connected but didn't capture messages

## The Solution

The key insight came from examining the working Chrome extension's `inject.js` file. The solution required two critical components:

### 1. Hook the Raw WebSocket Constructor

Instead of trying to hook into Apollo Client or other high-level abstractions, we needed to hook the **raw `window.WebSocket` constructor**:

```javascript
// Make sure the capture queue exists before the first message arrives.
window.__KOSMI_MESSAGE_QUEUE__ = window.__KOSMI_MESSAGE_QUEUE__ || [];

const OriginalWebSocket = window.WebSocket;
// Native accessor for `onmessage`, used to forward the wrapped handler.
const nativeOnMessage = Object.getOwnPropertyDescriptor(
    OriginalWebSocket.prototype, 'onmessage');

// Record one incoming message. Non-JSON payloads are stored as-is so that a
// parse failure never prevents the page's own handler from running.
function captureMessage(event, source) {
    let data = event.data;
    try {
        data = JSON.parse(event.data);
    } catch (e) {
        // Keep the raw payload.
    }
    window.__KOSMI_MESSAGE_QUEUE__.push({
        timestamp: Date.now(),
        data: data,
        source: source
    });
}

window.WebSocket = function(url, protocols) {
    const socket = new OriginalWebSocket(url, protocols);
    const target = String(url); // url may be a URL object rather than a string

    if (target.includes('engine.kosmi.io') || target.includes('gql-ws')) {
        // Wrap addEventListener for 'message' events
        const originalAddEventListener = socket.addEventListener.bind(socket);
        socket.addEventListener = function(type, listener, options) {
            if (type === 'message' && typeof listener === 'function') {
                const wrappedListener = function(event) {
                    captureMessage(event, 'addEventListener');
                    return listener.call(this, event);
                };
                return originalAddEventListener(type, wrappedListener, options);
            }
            return originalAddEventListener(type, listener, options);
        };

        // Also wrap the onmessage property
        let realOnMessage = null;
        Object.defineProperty(socket, 'onmessage', {
            get: function() { return realOnMessage; },
            set: function(handler) {
                realOnMessage = handler;
                // This instance property shadows the native accessor, so the
                // wrapped handler must be passed to the native setter explicitly.
                nativeOnMessage.set.call(socket, typeof handler === 'function'
                    ? function(event) {
                        captureMessage(event, 'onmessage');
                        return handler.call(socket, event);
                    }
                    : null);
            },
            configurable: true
        });
    }

    return socket;
};

// Keep `instanceof WebSocket` and constants such as WebSocket.OPEN working.
window.WebSocket.prototype = OriginalWebSocket.prototype;
['CONNECTING', 'OPEN', 'CLOSING', 'CLOSED'].forEach(function(name) {
    window.WebSocket[name] = OriginalWebSocket[name];
});
```

### 2. Inject Before Page Load

The most critical lesson: **The WebSocket hook MUST be injected before any page JavaScript executes.**

#### ❌ Wrong Approach (Post-Load Injection)

```go
// This doesn't work - WebSocket is already created!
chromedp.Run(ctx,
    chromedp.Navigate(roomURL),
    chromedp.WaitReady("body"),
    chromedp.Evaluate(hookScript, nil), // Too late!
)
```

**Why it fails**: By the time the page has loaded and we inject the script, Kosmi has already created its WebSocket connection. Our hook never gets a chance to intercept it.

#### ✅ Correct Approach (Pre-Load Injection)

```go
// Inject BEFORE navigation using Page.addScriptToEvaluateOnNewDocument.
// `page` is the package github.com/chromedp/cdproto/page.
chromedp.Run(ctx, chromedp.ActionFunc(func(ctx context.Context) error {
    _, err := page.AddScriptToEvaluateOnNewDocument(hookScript).Do(ctx)
    return err
}))

// Now navigate - the hook is already active!
chromedp.Run(ctx,
    chromedp.Navigate(roomURL),
    chromedp.WaitReady("body"),
)
```

**Why it works**: `Page.addScriptToEvaluateOnNewDocument` is a Chrome DevTools Protocol method that evaluates the script in every new document **before any page scripts** run. When Kosmi's JavaScript creates the WebSocket, our hook is already in place to intercept it.

## Implementation in chromedp_client.go

The final implementation:

```go
import (
    "context"
    "fmt"

    "github.com/chromedp/cdproto/page"
    "github.com/chromedp/chromedp"
)

func (c *ChromeDPClient) injectWebSocketHookBeforeLoad() error {
    script := c.getWebSocketHookScript()

    return chromedp.Run(c.ctx, chromedp.ActionFunc(func(ctx context.Context) error {
        // Use Page.addScriptToEvaluateOnNewDocument to inject before page load
        _, err := page.AddScriptToEvaluateOnNewDocument(script).Do(ctx)
        return err
    }))
}

func (c *ChromeDPClient) Connect() error {
    // ... context setup ...

    // Inject hook BEFORE navigation
    if err := c.injectWebSocketHookBeforeLoad(); err != nil {
        return fmt.Errorf("failed to inject WebSocket hook: %w", err)
    }

    // Now navigate with hook already active
    if err := chromedp.Run(ctx,
        chromedp.Navigate(c.roomURL),
        chromedp.WaitReady("body"),
    ); err != nil {
        return fmt.Errorf("failed to navigate to room: %w", err)
    }

    // ... rest of connection logic ...
}
```

## Verification

To verify the hook is working correctly, check for these log messages:

```
INFO Injecting WebSocket interceptor (runs before page load)...
INFO Navigating to Kosmi room: https://app.kosmi.io/room/@hyperspaceout
INFO Page loaded, checking if hook is active...
INFO ✓ WebSocket hook confirmed installed
INFO Status: WebSocket connection intercepted
```

If you see "No WebSocket connection detected yet", the hook was likely injected too late.

## Key Takeaways

1. **Timing is Everything**: WebSocket interception must happen before the WebSocket is created
2. **Use the Right CDP Method**: `Page.addScriptToEvaluateOnNewDocument` is specifically designed for this use case
3. **Hook at the Lowest Level**: Hook the `window.WebSocket` constructor, not higher-level abstractions
4. **Wrap Both Event Mechanisms**: Intercept both `addEventListener` and the `onmessage` property
5. **Test with Real Messages**: The connection might succeed, but no messages will be captured if the hook isn't working

## References

- Chrome DevTools Protocol: https://chromedevtools.github.io/devtools-protocol/
- `Page.addScriptToEvaluateOnNewDocument`: https://chromedevtools.github.io/devtools-protocol/tot/Page/#method-addScriptToEvaluateOnNewDocument
- chromedp documentation: https://pkg.go.dev/github.com/chromedp/chromedp
- Original Chrome extension: `.examples/chrome-extension/inject.js`

## Applying This Lesson to Other Projects

This pattern applies to any scenario where you need to intercept browser APIs in headless automation:

1. Identify the API you need to intercept (WebSocket, fetch, XMLHttpRequest, etc.)
2. Write a hook that wraps the constructor or method
3. Inject using `Page.addScriptToEvaluateOnNewDocument` **before navigation**
4. Verify the hook is active before the page creates the objects you want to intercept

This approach is more reliable than browser extensions for server-side automation because:

- ✅ No browser extension installation required
- ✅ Works in headless mode
- ✅ Full control over the browser context
- ✅ Can run on servers without a display
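
As a rough illustration of steps 1 and 2 for an API other than WebSocket, the sketch below wraps a plain method such as `fetch`. It is not taken from the Kosmi bridge: the helper name `installCallHook` and the queue name `__CALL_QUEUE__` are hypothetical, chosen only for this example, and the script would still have to be injected with `Page.addScriptToEvaluateOnNewDocument` as described above.

```javascript
// Hypothetical helper (not part of the Kosmi bridge): wraps target[name] so
// that every call is recorded in `queue` before being forwarded unchanged.
function installCallHook(target, name, queue) {
    const original = target[name];
    if (typeof original !== 'function') {
        throw new TypeError(name + ' is not a function on the target');
    }

    target[name] = function(...args) {
        queue.push({ timestamp: Date.now(), api: name, args: args });
        // Forward with the caller's `this` so the original behaves as before.
        return original.apply(this, args);
    };

    // Return an undo function so the hook can be removed again.
    return function restore() {
        target[name] = original;
    };
}

// Inside the injected script, the target would be `window`, for example:
//   window.__CALL_QUEUE__ = [];   // assumed queue name
//   installCallHook(window, 'fetch', window.__CALL_QUEUE__);
```

Passing the target object in as a parameter, instead of hard-coding `window`, keeps the helper easy to exercise outside a browser with a stand-in object before it is injected.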