working v1
180
LESSONS_LEARNED.md
Normal file
@@ -0,0 +1,180 @@
# Lessons Learned: WebSocket Interception in Headless Chrome

## The Problem

When implementing the Kosmi bridge, we initially tried several approaches:

1. **Native Go WebSocket Client**: Failed with 403 Forbidden due to missing session cookies
2. **HTTP POST with Polling**: Worked for queries but not ideal for real-time subscriptions
3. **ChromeDP with Post-Load Injection**: Connected but didn't capture messages

## The Solution

The key insight came from examining the working Chrome extension's `inject.js` file. The solution required two critical components:

### 1. Hook the Raw WebSocket Constructor

Instead of trying to hook into Apollo Client or other high-level abstractions, we needed to hook the **raw `window.WebSocket` constructor**:

```javascript
const OriginalWebSocket = window.WebSocket;
window.__KOSMI_MESSAGE_QUEUE__ = window.__KOSMI_MESSAGE_QUEUE__ || [];

// Record one incoming message without ever throwing into the page's handlers.
function captureMessage(event, source) {
  let data = event.data;
  try {
    data = JSON.parse(event.data);
  } catch (e) {
    // Not JSON (or a binary frame): keep the raw payload.
  }
  window.__KOSMI_MESSAGE_QUEUE__.push({ timestamp: Date.now(), data: data, source: source });
}

window.WebSocket = function(url, protocols) {
  const socket = new OriginalWebSocket(url, protocols);
  const target = String(url); // url may be a URL object, not a string

  if (target.includes('engine.kosmi.io') || target.includes('gql-ws')) {
    // Wrap addEventListener for 'message' events
    const originalAddEventListener = socket.addEventListener.bind(socket);
    socket.addEventListener = function(type, listener, options) {
      if (type === 'message' && typeof listener === 'function') {
        const wrappedListener = function(event) {
          captureMessage(event, 'addEventListener');
          return listener.call(this, event);
        };
        return originalAddEventListener(type, wrappedListener, options);
      }
      return originalAddEventListener(type, listener, options);
    };

    // Also wrap the onmessage property. The instance property shadows the
    // native accessor, so the wrapper must be forwarded to the native setter
    // or the browser never dispatches messages to it.
    const nativeOnMessage = Object.getOwnPropertyDescriptor(OriginalWebSocket.prototype, 'onmessage');
    if (nativeOnMessage && nativeOnMessage.set) {
      let pageHandler = null;
      Object.defineProperty(socket, 'onmessage', {
        get: function() { return pageHandler; },
        set: function(handler) {
          pageHandler = handler;
          nativeOnMessage.set.call(socket, function(event) {
            captureMessage(event, 'onmessage');
            if (typeof handler === 'function') { return handler.call(socket, event); }
          });
        },
        configurable: true
      });
    }
  }

  return socket;
};

// Keep `instanceof WebSocket` and the readyState constants working for page code.
window.WebSocket.prototype = OriginalWebSocket.prototype;
['CONNECTING', 'OPEN', 'CLOSING', 'CLOSED'].forEach(function(name) {
  window.WebSocket[name] = OriginalWebSocket[name];
});
```

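One way the Go side might collect captured messages is to evaluate a small helper that returns the queue contents and empties it in one step. The sketch below is an assumption, not code from the bridge: `drainKosmiQueue` is a hypothetical name, and it takes the global object as a parameter only so that it can be exercised outside a browser.

```javascript
// Hypothetical helper: returns all captured messages and empties the queue.
// `root` defaults to the global object (`window` in a browser).
function drainKosmiQueue(root = globalThis) {
  const queue = root.__KOSMI_MESSAGE_QUEUE__;
  if (!Array.isArray(queue)) {
    return []; // hook not installed yet, nothing to drain
  }
  // splice(0) removes every element in place and returns them,
  // so the hook keeps pushing into the same array object.
  return queue.splice(0);
}

// Usage with a stand-in for `window`:
const fakeWindow = {
  __KOSMI_MESSAGE_QUEUE__: [
    { timestamp: 1, data: { type: 'next' }, source: 'onmessage' },
  ],
};
const drained = drainKosmiQueue(fakeWindow);
console.log(drained.length, fakeWindow.__KOSMI_MESSAGE_QUEUE__.length); // prints "1 0"
```

Because the helper returns plain JSON-serializable objects, its result could plausibly be read from Go by evaluating a call to it with `chromedp.Evaluate`.
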
### 2. Inject Before Page Load

The most critical lesson: **The WebSocket hook MUST be injected before any page JavaScript executes.**

#### ❌ Wrong Approach (Post-Load Injection)

```go
// This doesn't work - WebSocket is already created!
chromedp.Run(ctx,
	chromedp.Navigate(roomURL),
	chromedp.WaitReady("body"),
	chromedp.Evaluate(hookScript, nil), // Too late!
)
```

**Why it fails**: By the time the page has loaded and we inject the script, Kosmi has already created its WebSocket connection. Replacing `window.WebSocket` only affects sockets constructed afterwards, so the hook never sees the existing connection.

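The timing problem can be reproduced without a browser. The following is a minimal sketch under stated assumptions: `FakeSocket`, `installHook`, and `fakeWindow` are hypothetical stand-ins for the native WebSocket, the hook script, and `window`. It shows that swapping the constructor only affects sockets created after the swap.

```javascript
// Stand-in for the browser's native WebSocket (hypothetical, for illustration).
class FakeSocket {
  constructor(url) {
    this.url = url;
  }
}

const fakeWindow = { WebSocket: FakeSocket };
const intercepted = []; // URLs of sockets created through the hook

// Stand-in for the hook script: swaps the constructor on the given global.
function installHook(root) {
  const Original = root.WebSocket;
  root.WebSocket = function (url) {
    intercepted.push(url);
    return new Original(url);
  };
}

// Post-load injection: the page has already created its socket.
const early = new fakeWindow.WebSocket('wss://example.invalid/early');
installHook(fakeWindow);

// Any socket created after the hook is installed is seen by it.
const late = new fakeWindow.WebSocket('wss://example.invalid/late');

console.log(intercepted); // only the late socket's URL is recorded
```
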
#### ✅ Correct Approach (Pre-Load Injection)

```go
// Inject BEFORE navigation using Page.addScriptToEvaluateOnNewDocument.
// `page` is the github.com/chromedp/cdproto/page package.
chromedp.Run(ctx, chromedp.ActionFunc(func(ctx context.Context) error {
	_, err := page.AddScriptToEvaluateOnNewDocument(hookScript).Do(ctx)
	return err
}))

// Now navigate - the hook is already active!
chromedp.Run(ctx,
	chromedp.Navigate(roomURL),
	chromedp.WaitReady("body"),
)
```

**Why it works**: `Page.addScriptToEvaluateOnNewDocument` is a Chrome DevTools Protocol method that evaluates the script in every new document **before any page scripts run**. When Kosmi's JavaScript creates the WebSocket, our hook is already in place to intercept it.

## Implementation in chromedp_client.go

The final implementation:

```go
func (c *ChromeDPClient) injectWebSocketHookBeforeLoad() error {
	script := c.getWebSocketHookScript()

	return chromedp.Run(c.ctx, chromedp.ActionFunc(func(ctx context.Context) error {
		// Use Page.addScriptToEvaluateOnNewDocument to inject before page load
		_, err := page.AddScriptToEvaluateOnNewDocument(script).Do(ctx)
		return err
	}))
}

func (c *ChromeDPClient) Connect() error {
	// ... context setup ...

	// Inject hook BEFORE navigation
	if err := c.injectWebSocketHookBeforeLoad(); err != nil {
		return fmt.Errorf("failed to inject WebSocket hook: %w", err)
	}

	// Now navigate with hook already active (same context the hook was registered on)
	if err := chromedp.Run(c.ctx,
		chromedp.Navigate(c.roomURL),
		chromedp.WaitReady("body"),
	); err != nil {
		return fmt.Errorf("failed to navigate to room: %w", err)
	}

	// ... rest of connection logic ...
}
```

## Verification

To verify the hook is working correctly, check for these log messages:

```
INFO Injecting WebSocket interceptor (runs before page load)...
INFO Navigating to Kosmi room: https://app.kosmi.io/room/@hyperspaceout
INFO Page loaded, checking if hook is active...
INFO ✓ WebSocket hook confirmed installed
INFO Status: WebSocket connection intercepted
```

If you see "No WebSocket connection detected yet", the hook was likely injected too late.

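A check like the one behind the "hook confirmed installed" log line could be a small probe evaluated in the page after load. The sketch below is only an assumption about how such a probe might look: `kosmiHookStatus` and the marker property `__KOSMI_WS_HOOK_INSTALLED__` are hypothetical, the hook script would have to set that marker itself, and the bridge's real check may differ.

```javascript
// Hypothetical status probe; `root` defaults to the global object (`window` in a browser).
function kosmiHookStatus(root = globalThis) {
  const queue = root.__KOSMI_MESSAGE_QUEUE__;
  return {
    // Assumed marker that the hook script would set when it is installed.
    hookInstalled: root.__KOSMI_WS_HOOK_INSTALLED__ === true,
    queueReady: Array.isArray(queue),
    queuedMessages: Array.isArray(queue) ? queue.length : 0,
  };
}

// Usage with stand-ins for `window` before and after the hook has run:
const before = kosmiHookStatus({});
const after = kosmiHookStatus({
  __KOSMI_WS_HOOK_INSTALLED__: true,
  __KOSMI_MESSAGE_QUEUE__: [{ timestamp: 1, data: {}, source: 'onmessage' }],
});
console.log(before.hookInstalled, after.hookInstalled, after.queuedMessages); // prints "false true 1"
```

Reporting the queue length alongside the marker helps distinguish "hook installed but no traffic yet" from "hook missing", which are the two cases the log messages above separate.
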
## Key Takeaways

1. **Timing is Everything**: WebSocket interception must happen before the WebSocket is created
2. **Use the Right CDP Method**: `Page.addScriptToEvaluateOnNewDocument` is specifically designed for this use case
3. **Hook at the Lowest Level**: Hook the `window.WebSocket` constructor, not higher-level abstractions
4. **Wrap Both Event Mechanisms**: Intercept both `addEventListener` and the `onmessage` property
5. **Test with Real Messages**: The connection might succeed, but messages won't appear if the hook isn't working

## References

- Chrome DevTools Protocol: https://chromedevtools.github.io/devtools-protocol/
- `Page.addScriptToEvaluateOnNewDocument`: https://chromedevtools.github.io/devtools-protocol/tot/Page/#method-addScriptToEvaluateOnNewDocument
- chromedp documentation: https://pkg.go.dev/github.com/chromedp/chromedp
- Original Chrome extension: `.examples/chrome-extension/inject.js`

## Applying This Lesson to Other Projects

This pattern applies to any scenario where you need to intercept browser APIs in headless automation:

1. Identify the API you need to intercept (WebSocket, fetch, XMLHttpRequest, etc.)
2. Write a hook that wraps the constructor or method
3. Inject using `Page.addScriptToEvaluateOnNewDocument` **before navigation**
4. Verify the hook is active before the page creates the objects you want to intercept

This approach is more reliable than browser extensions for server-side automation because:

- ✅ No browser extension installation required
- ✅ Works in headless mode
- ✅ Full control over the browser context
- ✅ Can run on servers without a display

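As an illustration of the first two numbered steps for a different API, the sketch below wraps `fetch` on a given global object and records each requested URL. The names `installFetchHook` and `__FETCH_LOG__` are hypothetical, and a stub `fetch` stands in for the browser's so the example runs outside a browser; in a real page the hook would still need to be injected before navigation.

```javascript
// Hypothetical hook: wraps `fetch` on `root` and logs every requested URL.
function installFetchHook(root = globalThis) {
  const originalFetch = root.fetch;
  root.__FETCH_LOG__ = [];
  root.fetch = function (input, init) {
    // `input` may be a string, a URL, or a Request-like object with a `url` field.
    const url = typeof input === 'string' ? input : String(input.url || input);
    root.__FETCH_LOG__.push({ timestamp: Date.now(), url: url });
    // Delegate to the original so page behaviour is unchanged.
    return originalFetch.call(root, input, init);
  };
}

// Usage with a stub standing in for the browser's fetch:
const fakeRoot = {
  fetch: function (input) {
    return Promise.resolve({ ok: true, url: String(input) });
  },
};
installFetchHook(fakeRoot);
fakeRoot.fetch('https://example.invalid/api').then(function (res) {
  console.log(res.ok, fakeRoot.__FETCH_LOG__.length);
});
```
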