Guide to Going Component-Free: Implementing Pure Vanilla JavaScript WebSockets for Remote ESP32-CAM AV Control
Control an ESP32‑CAM from a browser without pulling in React, Vue, or any third‑party library. Using the native WebSocket API and plain HTML5, you can stream video, trigger snapshots, and send pan/tilt/zoom (PTZ) commands in real time while keeping the page lightweight and SEO‑friendly.
Why Go Vanilla?
- Performance: No bundle size overhead; the browser handles everything.
- SEO & Accessibility: Content is directly crawlable; no client‑side rendering delays.
- Maintainability: One HTML file, one script – easy to debug and extend.
- Portability: Works on any modern browser, from desktops to mobile.
Prerequisites
Hardware
- ESP32‑CAM module (AI‑Thinker board recommended)
- Power supply (5 V / 2 A)
- Wi‑Fi network (same subnet as your client)
Software
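- Arduino IDE (or PlatformIO) with ESP32 board support and the arduinoWebSockets library (listed as "WebSockets" by Markus Sattler in the Library Manager), which provides WebSocketsServer.h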
- Basic knowledge of JavaScript, HTML, and the ESP32‑CAM API
- A modern browser (Chrome, Edge, Firefox, Safari)
1️⃣ Setting Up the ESP32‑CAM WebSocket Server
The ESP32‑CAM runs a tiny WebSocket server that pushes JPEG frames and receives control commands. Below is a minimal sketch.
#include <WiFi.h>
#include <WebServer.h>
#include <WebSocketsServer.h>
#include <esp_camera.h>
// ---- Wi‑Fi credentials ----
const char* ssid = "YOUR_SSID";
const char* password = "YOUR_PASSWORD";
// ---- WebSocket on port 81 ----
WebSocketsServer webSocket = WebSocketsServer(81);
// ---- Camera configuration (AI‑Thinker) ----
// Fields are listed in the order camera_config_t declares them, which
// C++ requires for designated initialisers.
camera_config_t camConfig = {
  .pin_pwdn = 32,
  .pin_reset = -1,
  .pin_xclk = 0,
  .pin_sscb_sda = 26,
  .pin_sscb_scl = 27,
  .pin_d7 = 35,
  .pin_d6 = 34,
  .pin_d5 = 39,
  .pin_d4 = 36,
  .pin_d3 = 21,
  .pin_d2 = 19,
  .pin_d1 = 18,
  .pin_d0 = 5,
  .pin_vsync = 25,
  .pin_href = 23,
  .pin_pclk = 22,
  .xclk_freq_hz = 20000000,
  .ledc_timer = LEDC_TIMER_0,
  .ledc_channel = LEDC_CHANNEL_0,
  .pixel_format = PIXFORMAT_JPEG,
  .frame_size = FRAMESIZE_QVGA, // 320x240
  .jpeg_quality = 12,
  .fb_count = 2,
};
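// Handle a text command from the browser. `arg` is unused.
void handleWebSocketMessage(void *arg, uint8_t *data, size_t len) {
  // Simple command parser (e.g., "TAKE_SNAPSHOT")
  String msg = String((char*)data);
  if (msg == "TAKE_SNAPSHOT") {
    camera_fb_t *fb = esp_camera_fb_get();
    if (fb) {
      // Client slot 0 is hard-coded here; forward the client number from
      // onWebSocketEvent if the reply must reach the actual sender.
      webSocket.sendBIN(0, fb->buf, fb->len);
      esp_camera_fb_return(fb); // hand the frame buffer back to the driver
    }
  }
  // Add PTZ commands here (if you have a servo board attached)
}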
void onWebSocketEvent(uint8_t num, WStype_t type, uint8_t * payload, size_t length){
switch(type){
case WStype_TEXT:
handleWebSocketMessage(nullptr, payload, length);
break;
default: break;
}
}
void setup() {
Serial.begin(115200);
WiFi.begin(ssid, password);
while (WiFi.status() != WL_CONNECTED) delay(500);
Serial.println("WiFi connected: " + WiFi.localIP().toString());
// Init camera
esp_err_t err = esp_camera_init(&camConfig);
if (err != ESP_OK) {
Serial.printf("Camera init failed with error 0x%x\n", err);
return;
}
// Start WebSocket server
webSocket.begin();
webSocket.onEvent(onWebSocketEvent);
}
void loop() {
webSocket.loop();
// Broadcast live JPEG frames at ~10 FPS
static uint32_t lastMs = 0;
if (millis() - lastMs > 100) {
camera_fb_t *fb = esp_camera_fb_get();
if (fb) {
webSocket.broadcastBIN(fb->buf, fb->len);
esp_camera_fb_return(fb);
lastMs = millis();
}
}
}
This sketch does three things:
- Connects to Wi‑Fi.
- Initialises the camera in JPEG mode.
- Starts a WebSocket server on port 81 that continuously streams frames and listens for text commands.
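Since the sketch sends each frame as one binary WebSocket message, a quick sanity check from the browser side is to confirm that a received message starts with the JPEG start-of-image marker (bytes 0xFF 0xD8). The following is a minimal sketch of that check; `looksLikeJpeg` and `frameIsJpeg` are hypothetical helper names, not part of the sketch above or of any library.

```javascript
// Returns true when the bytes begin with the JPEG start-of-image marker.
function looksLikeJpeg(bytes) {
  return bytes.length >= 2 && bytes[0] === 0xff && bytes[1] === 0xd8;
}

// Hypothetical helper: accepts the Blob delivered by a WebSocket
// message event and inspects only its first two bytes.
async function frameIsJpeg(blob) {
  const header = new Uint8Array(await blob.slice(0, 2).arrayBuffer());
  return looksLikeJpeg(header);
}
```

In a message handler you could call `frameIsJpeg(event.data).then(ok => console.log('JPEG frame?', ok))` while debugging, then remove it once the stream is known to be good.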
2️⃣ Building the Pure Vanilla JavaScript Client
The client consists of a single HTML file with embedded CSS and JavaScript. No external libraries are loaded, which keeps the page fast and SEO‑friendly.
<!-- index.html – place this file on any web server (or open locally) -->
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>ESP32‑CAM Remote Control</title>
<meta name="description" content="Pure vanilla JavaScript WebSocket interface for ESP32‑CAM live streaming and AV control.">
<style>
body{font-family:Arial,Helvetica,sans-serif;background:#fafafa;margin:0;padding:0;}
.container{max-width:960px;margin:auto;padding:20px;}
.video-box{position:relative;width:100%;padding-top:56.25%;background:#000;border-radius:8px;overflow:hidden;box-shadow:0 4px 12px rgba(0,0,0,0.08);}
.video-box img{position:absolute;top:0;left:0;width:100%;height:100%;object-fit:cover;}
.controls{margin-top:15px;display:flex;flex-wrap:wrap;gap:10px;}
.btn{background:#6B7C3A;color:#fff;padding:10px 15px;border:none;border-radius:6px;cursor:pointer;transition:background .2s;}
.btn:hover{background:#566024;}
.status{margin-top:10px;font-size:0.9rem;color:#555;}
</style>
</head>
<body>
<div class="container">
<h1>ESP32‑CAM Remote Control</h1>
<div class="video-box">
<img id="liveFeed" src="" alt="Live stream from ESP32‑CAM">
</div>
<div class="controls">
<button class="btn" id="snapshotBtn">Take Snapshot</button>
<button class="btn" id="panLeftBtn">Pan Left</button>
<button class="btn" id="panRightBtn">Pan Right</button>
<button class="btn" id="tiltUpBtn">Tilt Up</button>
<button class="btn" id="tiltDownBtn">Tilt Down</button>
</div>
<div class="status" id="statusMsg">Connecting...</div>
</div>
<script>
// ----- Configuration -----
const ESP_IP = '192.168.1.45'; // replace with your ESP32‑CAM IP
const WS_PORT = 81;
const WS_URL = `ws://${ESP_IP}:${WS_PORT}`;
// ----- UI references -----
const liveFeed = document.getElementById('liveFeed');
const statusMsg = document.getElementById('statusMsg');
const snapshotBtn = document.getElementById('snapshotBtn');
const panLeftBtn = document.getElementById('panLeftBtn');
const panRightBtn = document.getElementById('panRightBtn');
const tiltUpBtn = document.getElementById('tiltUpBtn');
const tiltDownBtn = document.getElementById('tiltDownBtn');
// ----- WebSocket handling -----
let ws;
function initWebSocket() {
ws = new WebSocket(WS_URL);
ws.binaryType = 'blob'; // receive JPEG as Blob
ws.onopen = () => {
statusMsg.textContent = '✅ Connected to ESP32‑CAM';
};
ws.onmessage = event => {
// If the server sent binary data (JPEG frame)
if (event.data instanceof Blob) {
const url = URL.createObjectURL(event.data);
liveFeed.src = url;
// Revoke after a short delay to free memory
setTimeout(() => URL.revokeObjectURL(url), 250);
}
};
ws.onerror = err => {
console.error('WebSocket error:', err);
statusMsg.textContent = '⚠️ Connection error';
};
ws.onclose = () => {
statusMsg.textContent = '🔌 Disconnected – retrying...';
// Auto‑reconnect after 2 seconds
setTimeout(initWebSocket, 2000);
};
}
// ----- Command helpers -----
function sendCommand(cmd) {
if (ws && ws.readyState === WebSocket.OPEN) {
ws.send(cmd);
} else {
console.warn('WebSocket not ready – command ignored');
}
}
// ----- Button actions -----
snapshotBtn.onclick = () => sendCommand('TAKE_SNAPSHOT');
panLeftBtn.onclick = () => sendCommand('PAN_LEFT');
panRightBtn.onclick = () => sendCommand('PAN_RIGHT');
tiltUpBtn.onclick = () => sendCommand('TILT_UP');
tiltDownBtn.onclick = () => sendCommand('TILT_DOWN');
// ----- Initialise -----
initWebSocket();
</script>
</body>
</html>
Key points in the script:
- Set binaryType = 'blob' so binary JPEG frames arrive as Blob objects rather than ArrayBuffers.
- Use URL.createObjectURL for fast image rendering without base64 conversion.
- Automatic reconnection logic keeps the UI alive if the ESP32‑CAM restarts.
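The 250 ms timeout in the script is one way to release object URLs. An alternative, sketched below under the assumption that only the most recent frame matters, is to revoke the previous URL as soon as a new frame replaces it, so at most two URLs are alive at any moment. `createFrameSwapper` is a hypothetical name; the two optional parameters exist only so the URL functions can be swapped out in a test.

```javascript
// Builds a function that shows a frame on `img` and releases the
// object URL of the frame it replaced.
// createUrl / revokeUrl default to the browser's URL API.
function createFrameSwapper(
  img,
  createUrl = (blob) => URL.createObjectURL(blob),
  revokeUrl = (url) => URL.revokeObjectURL(url)
) {
  let currentUrl = null;
  return function showFrame(blob) {
    const previousUrl = currentUrl;
    currentUrl = createUrl(blob);
    img.src = currentUrl;
    // The old frame is no longer referenced by the <img>, so free it.
    if (previousUrl !== null) revokeUrl(previousUrl);
  };
}

// Browser usage (assumes the liveFeed element from the page above):
// const showFrame = createFrameSwapper(liveFeed);
// ws.onmessage = (event) => {
//   if (event.data instanceof Blob) showFrame(event.data);
// };
```

This removes the dependence on a fixed delay, which matters mostly if the frame interval changes or frames arrive in bursts.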
3️⃣ Adding PTZ (Pan/Tilt/Zoom) Support
If your ESP32‑CAM is paired with a servo board, extend the Arduino sketch to react to the new commands. Below is a quick addition.
// Assume two SG90 servos on GPIO 13 (pan) and GPIO 14 (tilt)
// The stock Arduino Servo library does not target the ESP32; this assumes
// the ESP32Servo library, which exposes the same Servo class.
#include <ESP32Servo.h>
Servo panServo, tiltServo;
// Call setupServos() once from setup()
void setupServos() {
  panServo.attach(13);
  tiltServo.attach(14);
  panServo.write(90); // center position
  tiltServo.write(90);
}
// Move a servo by `delta` degrees, keeping the angle within 0-180
void stepServo(Servo &servo, int delta) {
  servo.write(constrain(servo.read() + delta, 0, 180));
}
void handleWebSocketMessage(void *arg, uint8_t *data, size_t len) {
  String cmd = String((char*)data);
  if (cmd == "PAN_LEFT") stepServo(panServo, -10);
  else if (cmd == "PAN_RIGHT") stepServo(panServo, 10);
  else if (cmd == "TILT_UP") stepServo(tiltServo, -10);
  else if (cmd == "TILT_DOWN") stepServo(tiltServo, 10);
  else if (cmd == "TAKE_SNAPSHOT") {
    // existing snapshot logic
  }
}
Adjust the angle step (10°) to suit your hardware. The same sendCommand function on the client side works without modification.
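Because every control goes through sendCommand, other input methods can reuse it too. As one possibility, the sketch below maps keyboard keys to the same command strings; the key bindings and the names `KEY_COMMANDS` and `commandForKey` are assumptions made for illustration.

```javascript
// Hypothetical key bindings. The command strings match those the
// Arduino handler compares against.
const KEY_COMMANDS = {
  ArrowLeft: 'PAN_LEFT',
  ArrowRight: 'PAN_RIGHT',
  ArrowUp: 'TILT_UP',
  ArrowDown: 'TILT_DOWN',
  ' ': 'TAKE_SNAPSHOT', // KeyboardEvent.key for the space bar
};

// Returns the command for a KeyboardEvent.key value, or null if unbound.
function commandForKey(key) {
  return Object.prototype.hasOwnProperty.call(KEY_COMMANDS, key)
    ? KEY_COMMANDS[key]
    : null;
}

// Browser usage (assumes sendCommand from the page above):
// document.addEventListener('keydown', (e) => {
//   const cmd = commandForKey(e.key);
//   if (cmd) { e.preventDefault(); sendCommand(cmd); }
// });
```

Holding a key down fires repeated keydown events, so each repeat moves the servo another step; add your own rate limit if that proves too fast for the hardware.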
4️⃣ Security & Performance Best Practices
Secure the Connection
- Use wss:// when the page is served over HTTPS, since browsers generally block plain ws:// connections from secure pages. You can terminate TLS on a reverse proxy (e.g., Nginx) and proxy to the ESP32‑CAM.
- Implement a simple token handshake: the client sends a secret string right after onopen, and the ESP validates it before broadcasting.
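The client half of such a token handshake might look like the sketch below. The `AUTH:<token>` message format and the `sendAuthToken` name are assumptions; the Arduino sketch in this guide does not parse this message as written and would need a matching check added before it starts sending frames to a client.

```javascript
// Sends the shared secret as the first message on an open socket.
// Assumed wire format: "AUTH:<token>" (the server must be extended to match).
function sendAuthToken(socket, token) {
  const OPEN = 1; // numeric value of WebSocket.OPEN
  if (socket.readyState !== OPEN) return false;
  socket.send('AUTH:' + token);
  return true;
}

// Browser usage (assumes the ws and statusMsg variables from the page above):
// ws.onopen = () => {
//   sendAuthToken(ws, 'replace-with-your-secret');
//   statusMsg.textContent = 'Connected to ESP32-CAM';
// };
```

Over plain ws:// the token travels unencrypted, so this only deters casual access unless it is combined with wss://.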
Reduce Bandwidth
- Set .frame_size = FRAMESIZE_QVGA or even FRAMESIZE_QQVGA for low‑speed networks.
- Adjust .jpeg_quality (10‑20 range) to balance quality against size; lower numbers mean higher quality and larger frames.
- Throttle the broadcast interval (e.g., 150 ms for ~7 fps).
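To see how these settings trade off, it can help to estimate the stream's bandwidth from the frame size and broadcast interval. The sketch below does that arithmetic; the function names are hypothetical, and the 10,000-byte frame size in the usage lines is an illustrative assumption rather than a measured value. Log `event.data.size` in the browser to get real numbers for your scene.

```javascript
// Frames per second for a given broadcast interval in milliseconds.
function framesPerSecond(intervalMs) {
  return 1000 / intervalMs;
}

// Approximate payload bandwidth in kilobits per second, ignoring
// WebSocket and TCP framing overhead.
function estimateKilobitsPerSecond(frameBytes, intervalMs) {
  return (frameBytes * 8 * framesPerSecond(intervalMs)) / 1000;
}

// Illustrative numbers only: a 10,000-byte frame every 100 ms.
console.log(estimateKilobitsPerSecond(10000, 100)); // 800
// The 150 ms interval from the list above gives roughly 6.7 fps.
console.log(framesPerSecond(150).toFixed(1));
```

Halving the frame size or doubling the interval each halve the estimate, so the three settings above can be combined to hit a target bitrate.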
5️⃣ Debugging Tips
| Scenario | Check | Solution |
|---|---|---|
| WebSocket fails to open |