# Lessons Learned: WebSocket Interception in Headless Chrome

## The Problem

When implementing the Kosmi bridge, we initially tried several approaches:

1. **Native Go WebSocket Client**: Failed with 403 Forbidden due to missing session cookies
2. **HTTP POST with Polling**: Worked for queries but not ideal for real-time subscriptions
3. **ChromeDP with Post-Load Injection**: Connected but didn't capture messages
## The Solution

The key insight came from examining the working Chrome extension's `inject.js` file. The solution required two critical components:

### 1. Hook the Raw WebSocket Constructor

Instead of trying to hook into Apollo Client or other high-level abstractions, we needed to hook the **raw `window.WebSocket` constructor**:

```javascript
const OriginalWebSocket = window.WebSocket;

// The hook pushes into this queue, so make sure it exists first.
window.__KOSMI_MESSAGE_QUEUE__ = window.__KOSMI_MESSAGE_QUEUE__ || [];

// Record one incoming frame; payloads that are not JSON are stored unparsed.
function captureMessage(event, source) {
  let data = event.data;
  try {
    data = JSON.parse(event.data);
  } catch (err) {
    // Not JSON (for example a binary frame): keep the raw payload.
  }
  window.__KOSMI_MESSAGE_QUEUE__.push({ timestamp: Date.now(), data, source });
}

window.WebSocket = function(url, protocols) {
  const socket = new OriginalWebSocket(url, protocols);
  const target = String(url); // url may be a URL object rather than a string

  if (target.includes('engine.kosmi.io') || target.includes('gql-ws')) {
    // Wrap addEventListener for 'message' events
    const originalAddEventListener = socket.addEventListener.bind(socket);
    socket.addEventListener = function(type, listener, options) {
      if (type === 'message') {
        const wrappedListener = function(event) {
          captureMessage(event, 'addEventListener');
          return listener.call(this, event);
        };
        return originalAddEventListener(type, wrappedListener, options);
      }
      return originalAddEventListener(type, listener, options);
    };

    // Also wrap the onmessage property
    let realOnMessage = null;
    Object.defineProperty(socket, 'onmessage', {
      get: function() { return realOnMessage; },
      set: function(handler) {
        // Drop the previous wrapper so reassigning onmessage does not double-fire.
        if (realOnMessage) { socket.removeEventListener('message', realOnMessage); }
        realOnMessage = function(event) {
          captureMessage(event, 'onmessage');
          if (handler) { handler.call(socket, event); }
        };
        // This own property shadows the native setter, so register the wrapper explicitly.
        originalAddEventListener('message', realOnMessage);
      },
      configurable: true
    });
  }

  return socket;
};

// Keep instanceof checks and the readyState constants working on the replacement.
window.WebSocket.prototype = OriginalWebSocket.prototype;
['CONNECTING', 'OPEN', 'CLOSING', 'CLOSED'].forEach(function(name) {
  window.WebSocket[name] = OriginalWebSocket[name];
});
```
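Captured messages accumulate in `window.__KOSMI_MESSAGE_QUEUE__` until something reads them. The following is a minimal sketch of a drain helper that the automation side could evaluate on each poll. The name `drainKosmiQueue` is hypothetical and does not come from `inject.js`; the `globalThis` fallback is only there so the sketch also runs outside a browser.

```javascript
// Sketch only: `drainKosmiQueue` is a hypothetical helper, not part of inject.js.
// `root` is `window` in a browser; globalThis keeps the sketch runnable in Node.
const root = typeof window !== 'undefined' ? window : globalThis;

// Make sure the queue exists even if no message has arrived yet.
root.__KOSMI_MESSAGE_QUEUE__ = root.__KOSMI_MESSAGE_QUEUE__ || [];

// Remove and return every queued message in arrival order, so
// repeated polling never delivers the same message twice.
function drainKosmiQueue() {
  return root.__KOSMI_MESSAGE_QUEUE__.splice(0);
}

// Usage: two captured messages are handed over once, then the queue is empty.
root.__KOSMI_MESSAGE_QUEUE__.push({ source: 'onmessage', data: { id: 1 } });
root.__KOSMI_MESSAGE_QUEUE__.push({ source: 'onmessage', data: { id: 2 } });
const batch = drainKosmiQueue();
console.log(batch.length, root.__KOSMI_MESSAGE_QUEUE__.length); // 2 0
```

From Go, evaluating `drainKosmiQueue()` through `chromedp.Evaluate` with a slice as the result destination would be one way to hand each batch to the bridge; the polling interval is a separate trade-off between latency and CPU use.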
### 2. Inject Before Page Load

The most critical lesson: **The WebSocket hook MUST be injected before any page JavaScript executes.**

#### ❌ Wrong Approach (Post-Load Injection)

```go
// This doesn't work - WebSocket is already created!
chromedp.Run(ctx,
	chromedp.Navigate(roomURL),
	chromedp.WaitReady("body"),
	chromedp.Evaluate(hookScript, nil), // Too late!
)
```

**Why it fails**: By the time the page has loaded and the script is injected, Kosmi has already created its WebSocket connection. Replacing `window.WebSocket` at that point only affects sockets created afterwards, so the hook never gets a chance to intercept the existing connection.
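The ordering problem can be reproduced without a browser. The sketch below is an illustration under simplifying assumptions: `FakeSocket` is a hypothetical stand-in for the real `WebSocket`, `env` plays the role of `window`, and the hook is reduced to a single listener. It only shows why a socket built before the hook is installed stays invisible to it.

```javascript
// Sketch only: FakeSocket stands in for the browser's WebSocket so the
// timing problem can be reproduced outside a browser.
class FakeSocket {
  constructor(url) {
    this.url = url;
    this.listeners = [];
  }
  addEventListener(type, fn) {
    if (type === 'message') this.listeners.push(fn);
  }
  // Simulate an incoming frame.
  emit(data) {
    this.listeners.forEach((fn) => fn({ data }));
  }
}

const env = { WebSocket: FakeSocket }; // plays the role of `window`
const captured = [];

// Simplified hook: every socket created through it reports its messages.
function installHook() {
  const Original = env.WebSocket;
  env.WebSocket = function (url) {
    const socket = new Original(url);
    socket.addEventListener('message', (event) => captured.push(event.data));
    return socket;
  };
}

// Post-load order: the page creates its socket first, the hook arrives later.
const early = new env.WebSocket('wss://example.invalid/early');
installHook();
early.emit('missed'); // not captured: `early` came from the original constructor

// Pre-load order: the hook is in place before the socket exists.
const late = new env.WebSocket('wss://example.invalid/late');
late.emit('seen'); // captured

console.log(captured); // [ 'seen' ]
```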
#### ✅ Correct Approach (Pre-Load Injection)

```go
// Inject BEFORE navigation using Page.addScriptToEvaluateOnNewDocument.
// `page` here is the github.com/chromedp/cdproto/page package.
chromedp.Run(ctx, chromedp.ActionFunc(func(ctx context.Context) error {
	_, err := page.AddScriptToEvaluateOnNewDocument(hookScript).Do(ctx)
	return err
}))

// Now navigate - the hook is already active!
chromedp.Run(ctx,
	chromedp.Navigate(roomURL),
	chromedp.WaitReady("body"),
)
```

**Why it works**: `Page.addScriptToEvaluateOnNewDocument` is a Chrome DevTools Protocol method that registers a script to be evaluated in every new document **before any page scripts** run. When Kosmi's JavaScript creates the WebSocket, our hook is already in place to intercept it.
## Implementation in chromedp_client.go

The final implementation:

```go
func (c *ChromeDPClient) injectWebSocketHookBeforeLoad() error {
	script := c.getWebSocketHookScript()

	return chromedp.Run(c.ctx, chromedp.ActionFunc(func(ctx context.Context) error {
		// Use Page.addScriptToEvaluateOnNewDocument to inject before page load
		_, err := page.AddScriptToEvaluateOnNewDocument(script).Do(ctx)
		return err
	}))
}

func (c *ChromeDPClient) Connect() error {
	// ... context setup ...

	// Inject hook BEFORE navigation
	if err := c.injectWebSocketHookBeforeLoad(); err != nil {
		return fmt.Errorf("failed to inject WebSocket hook: %w", err)
	}

	// Now navigate with hook already active
	if err := chromedp.Run(ctx,
		chromedp.Navigate(c.roomURL),
		chromedp.WaitReady("body"),
	); err != nil {
		return fmt.Errorf("failed to navigate to room: %w", err)
	}

	// ... rest of connection logic ...
}
```
## Verification

To verify the hook is working correctly, check for these log messages:

```
INFO Injecting WebSocket interceptor (runs before page load)...
INFO Navigating to Kosmi room: https://app.kosmi.io/room/@hyperspaceout
INFO Page loaded, checking if hook is active...
INFO ✓ WebSocket hook confirmed installed
INFO Status: WebSocket connection intercepted
```

If you see "No WebSocket connection detected yet", the hook was likely injected too late.
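One way such a check could work is for the injected script to leave markers behind that the automation side reads after the page has loaded. The sketch below is an assumption about how this might be done, not a description of the actual client: the marker names `__KOSMI_WS_HOOK_INSTALLED__` and `__KOSMI_WS_CONNECTED__` and both helper functions are hypothetical, and `root` stands for `window`.

```javascript
// Sketch only: the marker names and both helpers are hypothetical.

// Install a constructor hook that also records when it is first used.
function installMarkedHook(root) {
  const Original = root.WebSocket;
  root.WebSocket = function (url, protocols) {
    root.__KOSMI_WS_CONNECTED__ = true; // a socket was created through the hook
    return new Original(url, protocols);
  };
  root.__KOSMI_WS_HOOK_INSTALLED__ = true; // the hook itself is in place
}

// Report both facts separately so "installed but never used" is visible.
function hookStatus(root) {
  return {
    installed: root.__KOSMI_WS_HOOK_INSTALLED__ === true,
    connected: root.__KOSMI_WS_CONNECTED__ === true,
  };
}

// Usage with a stand-in for `window`.
const fakeWindow = { WebSocket: class { constructor(url) { this.url = url; } } };
installMarkedHook(fakeWindow);
console.log(hookStatus(fakeWindow)); // { installed: true, connected: false }
new fakeWindow.WebSocket('wss://example.invalid/socket');
console.log(hookStatus(fakeWindow)); // { installed: true, connected: true }
```

A status of `installed: true, connected: false` after the page has finished loading corresponds to the "No WebSocket connection detected yet" case: either the page has not opened its socket yet, or it opened one before the hook existed.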
## Key Takeaways

1. **Timing is Everything**: WebSocket interception must happen before the WebSocket is created
2. **Use the Right CDP Method**: `Page.addScriptToEvaluateOnNewDocument` is specifically designed for this use case
3. **Hook at the Lowest Level**: Hook the `window.WebSocket` constructor, not higher-level abstractions
4. **Wrap Both Event Mechanisms**: Intercept both `addEventListener` and the `onmessage` property
5. **Test with Real Messages**: The connection might succeed, but messages won't appear if the hook isn't working

## References

- Chrome DevTools Protocol: https://chromedevtools.github.io/devtools-protocol/
- `Page.addScriptToEvaluateOnNewDocument`: https://chromedevtools.github.io/devtools-protocol/tot/Page/#method-addScriptToEvaluateOnNewDocument
- chromedp documentation: https://pkg.go.dev/github.com/chromedp/chromedp
- Original Chrome extension: `.examples/chrome-extension/inject.js`
## Applying This Lesson to Other Projects

This pattern applies to any scenario where you need to intercept browser APIs in headless automation:

1. Identify the API you need to intercept (WebSocket, fetch, XMLHttpRequest, etc.)
2. Write a hook that wraps the constructor or method
3. Inject using `Page.addScriptToEvaluateOnNewDocument` **before navigation**
4. Verify the hook is active before the page creates the objects you want to intercept

This approach is more reliable than browser extensions for server-side automation because:

- ✅ No browser extension installation required
- ✅ Works in headless mode
- ✅ Full control over the browser context
- ✅ Can run on servers without a display
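As an illustration of steps 1 and 2 of the pattern for a different API, here is a minimal sketch of a `fetch` hook. It is not taken from the Kosmi bridge: `hookFetch` and the `log` array are hypothetical names, and `root` stands for `window`. In a real setup the script would still need to be injected with `Page.addScriptToEvaluateOnNewDocument` before navigation.

```javascript
// Sketch only: `hookFetch` and the `log` array are hypothetical names.
// `root` is `window` in a browser; any object with a `fetch` method works here.
function hookFetch(root, log) {
  const originalFetch = root.fetch;
  root.fetch = function (input, init) {
    // `input` may be a string, a URL, or a Request object.
    const url = input && typeof input === 'object' && 'url' in input
      ? input.url
      : String(input);
    log.push({ timestamp: Date.now(), url });
    // Delegate to the real implementation so page behaviour is unchanged.
    return originalFetch.call(root, input, init);
  };
}

// Usage with a stand-in for `window`.
const fakeWindow = { fetch: (url) => Promise.resolve('response for ' + url) };
const seen = [];
hookFetch(fakeWindow, seen);
fakeWindow.fetch('https://example.invalid/api');
console.log(seen[0].url); // https://example.invalid/api
```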