WebSocket 403 Error Analysis
Date: October 31, 2025
Issue: Direct WebSocket connection to wss://engine.kosmi.io/gql-ws returns 403 Forbidden
Tests Performed
Test 1: No Authentication
./test-websocket -mode 2
Result: 403 Forbidden ❌
Test 2: Origin Header Only
./test-websocket -mode 3
Result: 403 Forbidden ❌
Test 3: With JWT Token
./test-websocket-direct -token <CAPTURED_TOKEN>
Result: 403 Forbidden ❌
Test 4: With Session Cookies + Token
./test-session -room <URL> -token <TOKEN>
Result: 403 Forbidden ❌
Note: No cookies were set by visiting the room page
Analysis
Why 403?
The 403 error occurs during the WebSocket handshake, BEFORE we can send the connection_init message with the JWT token. This means:
- ❌ It's NOT about the JWT token (that's sent after connection)
- ❌ It's NOT about cookies (no cookies are set)
- ❌ It's NOT about the Origin header (we're sending the correct origin)
- ✅ It's likely a security measure at the WebSocket server or proxy level
Possible Causes
- Cloudflare/CDN Protection
  - Response headers show Server: "Cowboy" with "Via: 1.1 Caddy", which suggests a Caddy reverse proxy in front of the application server; nothing in these headers confirms Cloudflare specifically
  - The proxy layer may have bot protection that detects non-browser clients
  - Such protection may require a JavaScript challenge or proof-of-work
- TLS Fingerprinting
  - Server may be checking the TLS ClientHello fingerprint
  - Go's TLS implementation has a different fingerprint than browsers
  - This is commonly used to block bots
- WebSocket Sub-protocol Validation
  - May require a specific Sec-WebSocket-Protocol value or specific WebSocket extension headers
  - Browser sends additional headers that we're not replicating
- IP-based Rate Limiting
  - Previous requests from the same IP may have triggered protection
  - Would explain why the browser works but our client doesn't only if the two connect from different IPs
Evidence from ChromeDP
ChromeDP DOES work because:
- It's literally a real Chrome browser
- Has the correct TLS fingerprint
- Passes all JavaScript challenges
- Has complete browser context
Proposed Solution
Hybrid Approach: ChromeDP for Token, Native for WebSocket
Since:
- JWT tokens are valid for 1 year
- ChromeDP successfully obtains tokens
- Native WebSocket cannot bypass 403
Solution: Use ChromeDP to get the token once, then cache it:
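import (
	"errors"
	"sync"
	"time"
)

type TokenCache struct {
	mu         sync.RWMutex
	token      string
	expiration time.Time
}

// Get returns the cached token, refreshing it via ChromeDP when it is missing or expired.
func (c *TokenCache) Get() (string, error) {
	c.mu.RLock()
	token := c.token
	valid := token != "" && time.Now().Before(c.expiration)
	// Release the read lock before refreshing: sync.RWMutex is not reentrant,
	// so calling refreshToken while still holding RLock would deadlock.
	c.mu.RUnlock()
	if valid {
		return token, nil // Use cached token
	}
	// Token expired or missing, get new one via ChromeDP
	return c.refreshToken()
}

func (c *TokenCache) refreshToken() (string, error) {
	c.mu.Lock()
	defer c.mu.Unlock()
	// Another goroutine may have refreshed while we waited for the lock
	if c.token != "" && time.Now().Before(c.expiration) {
		return c.token, nil
	}
	// Launch ChromeDP, visit room, extract token
	token := extractTokenViaChromeDPOnce()
	if token == "" {
		return "", errors.New("ChromeDP returned an empty token")
	}
	// Cache for 11 months (give 1 month buffer)
	c.token = token
	c.expiration = time.Now().Add(11 * 30 * 24 * time.Hour)
	return token, nil
}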
Benefits:
- ✅ Only need ChromeDP once per year
- ✅ Native WebSocket for all subsequent connections, provided the handshake 403 can be resolved
- ✅ Lightweight after initial token acquisition
- ✅ Automatic token refresh when expired
Alternative: Keep ChromeDP
If we can't bypass the 403, we should optimize the ChromeDP approach instead:
- Reduce Memory Usage
  - Use headless-shell instead of full Chrome (~50MB vs ~200MB)
  - Disable unnecessary Chrome features
  - Clean up resources aggressively
- Reduce Startup Time
  - Keep Chrome instance alive between restarts
  - Use Chrome's remote debugging instead of launching new instance
- Accept the Trade-off
  - 200MB RAM is acceptable for a relay service
  - 3-5 second startup is one-time cost
  - It's the most reliable solution
Next Steps
Option A: Continue Investigation
- Try different TLS libraries (crypto/tls alternatives)
- Analyze browser's exact WebSocket handshake with Wireshark
- Try mimicking browser's TLS fingerprint
- Test from different IP addresses
Option B: Implement Hybrid Solution
- Extract token from ChromeDP session
- Implement token caching with expiration
- Try native WebSocket with cached token
- Verify if 403 still occurs
Option C: Optimize ChromeDP
- Switch to chromedp/headless-shell
- Implement Chrome instance pooling
- Optimize memory usage
- Document performance characteristics
Recommendation
Go with Option C: Optimize ChromeDP
Reasoning:
- ChromeDP is the only approach that has worked in testing
- Token caching won't help if WebSocket still returns 403
- The 403 is likely permanent without a real browser context
- Optimization can make ChromeDP acceptable for production
- ~100MB RAM for a bridge service is reasonable
Implementation:
# Dockerfile: use the chromedp/headless-shell Docker image
FROM chromedp/headless-shell:latest

// Go: optimize Chrome flags by extending chromedp's default allocator options
opts := append(chromedp.DefaultExecAllocatorOptions[:],
	chromedp.Flag("disable-gpu", true),
	chromedp.Flag("disable-dev-shm-usage", true),
	chromedp.Flag("single-process", true), // Reduce memory
	chromedp.Flag("no-zygote", true),      // Reduce memory
)

// Go: keep instance alive
func (b *Bkosmi) KeepAlive() {
	// Don't close Chrome between messages
	// Only restart if crashed
}
Conclusion
The 403 Forbidden error is likely a security measure that cannot be easily bypassed without a real browser context. The most pragmatic solution is to optimize and embrace the ChromeDP approach rather than trying to reverse engineer the security mechanism.
Status: ChromeDP remains the recommended implementation ✅