Building a Real-time Audio Chat Application with Go and WebSockets from Scratch
This guide will walk you through creating a real-time audio chat application using Go and WebSockets, built entirely from scratch without cloning an existing repository. We’ll set up the project structure, implement a WebSocket server to pair two clients for audio communication, and create a simple frontend to capture and play audio in the browser. By the end, you’ll have a functional application where two users can connect and talk in real-time.
Table of Contents
- Project Setup
- WebSocket Server Implementation
- Frontend Implementation
- Generating Self-Signed Certificates
- Running and Testing the Application
- Security Considerations
- Troubleshooting
Project Setup
Let’s start by creating the project directory and initializing it as a Go module.
Create the project directory and initialize the Go module: Open a terminal and run the following commands:
mkdir duoduel
cd duoduel
go mod init duoduel
This creates a directory named duoduel and initializes it as a Go module with the name duoduel.
Install the required dependency: We’ll use the Gorilla WebSocket library to handle WebSocket connections. Install it by running:
go get github.com/gorilla/websocket
Set up the directory structure: Create the following directories and subdirectories inside the duoduel directory:
mkdir server
mkdir server/websocket
mkdir server/cert
mkdir static
mkdir static/js
After running these commands, your project structure should look like this:
duoduel/
├── server/
│   ├── main.go (to be created)
│   ├── websocket/
│   │   ├── handler.go (to be created)
│   │   └── client.go (to be created)
│   └── cert/
│       ├── cert.pem (to be generated)
│       └── key.pem (to be generated)
└── static/
    ├── index.html (to be created)
    └── js/
        └── main.js (to be created)
We’ll populate these files in the next sections.
WebSocket Server Implementation
The server will handle WebSocket connections, pair clients together, and forward audio data between them. We’ll create three Go files to achieve this.
1. Creating server/main.go
Create a file named main.go inside the server directory and add the following code:
package main

import (
	"flag"
	"log"
	"net/http"
	"path/filepath"

	// The module is named duoduel and the package lives in server/websocket
	"duoduel/server/websocket"
)

func main() {
	// Command line flags
	var (
		useHTTPS = flag.Bool("https", true, "Use HTTPS")
		certFile = flag.String("cert", "cert/cert.pem", "Path to certificate file")
		keyFile  = flag.String("key", "cert/key.pem", "Path to key file")
		port     = flag.String("port", "8443", "Port to listen on")
	)
	flag.Parse()

	// Serve static files
	staticDir := filepath.Join("..", "static")
	fs := http.FileServer(http.Dir(staticDir))
	http.Handle("/", fs)

	// WebSocket endpoint
	http.HandleFunc("/ws", websocket.HandleConnection)

	// Start server
	if *useHTTPS {
		log.Printf("Server starting with HTTPS on port %s...", *port)
		err := http.ListenAndServeTLS(":"+*port, *certFile, *keyFile, nil)
		if err != nil {
			log.Fatal("ListenAndServeTLS: ", err)
		}
	} else {
		log.Printf("Server starting with HTTP on port 8080...")
		err := http.ListenAndServe(":8080", nil)
		if err != nil {
			log.Fatal("ListenAndServe: ", err)
		}
	}
}
Explanation:
- This file sets up an HTTP server.
- It serves static files (like index.html) from the ../static directory, relative to the server directory, so the server must be started from inside server/.
- It defines a /ws endpoint for WebSocket connections, handled by a function we’ll define next.
- It supports both HTTP and HTTPS, configurable via command-line flags, defaulting to HTTPS on port 8443. In HTTP mode the port is fixed at 8080 and the -port flag is ignored.
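To see how the two routes interact, here is a minimal sketch that reproduces the routing with stub handlers instead of the real file server and WebSocket handler. The names newMux and route are hypothetical helpers for this illustration only. It relies on the standard net/http rule that the "/" pattern matches every path not claimed by a more specific pattern, while "/ws" matches only that exact path.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// newMux mirrors the two routes registered in main.go, using stub handlers
// (assumption: the real handlers are replaced by fixed strings).
func newMux() *http.ServeMux {
	mux := http.NewServeMux()
	mux.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, "static")
	})
	mux.HandleFunc("/ws", func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, "websocket")
	})
	return mux
}

// route sends a GET request for path through the mux and returns the body.
func route(mux *http.ServeMux, path string) string {
	rec := httptest.NewRecorder()
	mux.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, path, nil))
	return rec.Body.String()
}

func main() {
	mux := newMux()
	for _, path := range []string{"/ws", "/index.html", "/js/main.js"} {
		fmt.Printf("%s -> %s\n", path, route(mux, path))
	}
}
```

Only /ws reaches the WebSocket stub; every other path falls through to the static handler.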
2. Creating server/websocket/handler.go
Create a file named handler.go inside the server/websocket directory and add the following code:
package websocket

import (
	"log"
	"net/http"
	"sync"
	"time"

	"github.com/gorilla/websocket"
)

// Global variables to manage connections
var (
	waitingClient *Client
	mu            sync.Mutex
)

// Keepalive timing: pings must be sent more often than the read deadline expires
const (
	pongWait   = 60 * time.Second
	pingPeriod = 30 * time.Second
)

// Configure the upgrader
var upgrader = websocket.Upgrader{
	CheckOrigin: func(r *http.Request) bool {
		return true // Allow connections from any origin (for development)
	},
	ReadBufferSize:   1024,
	WriteBufferSize:  1024,
	HandshakeTimeout: 10 * time.Second,
}

// HandleConnection handles new WebSocket connections
func HandleConnection(w http.ResponseWriter, r *http.Request) {
	log.Printf("New connection request from: %s", r.RemoteAddr)

	// Upgrade HTTP connection to WebSocket
	conn, err := upgrader.Upgrade(w, r, nil)
	if err != nil {
		log.Println("Failed to upgrade connection:", err)
		return
	}

	// Each pong from the browser extends the read deadline
	conn.SetReadDeadline(time.Now().Add(pongWait))
	conn.SetPongHandler(func(string) error {
		conn.SetReadDeadline(time.Now().Add(pongWait))
		return nil
	})

	// Create new client
	client := NewClient(conn)

	mu.Lock()
	if waitingClient == nil {
		// No waiting client; this client waits
		waitingClient = client
		mu.Unlock()
		client.SendJSON(map[string]string{"type": "waiting"})
	} else {
		// Pair with the waiting client
		peer := waitingClient
		waitingClient = nil
		client.Peer = peer
		peer.Peer = client
		mu.Unlock()
		client.SendJSON(map[string]string{"type": "connected"})
		peer.SendJSON(map[string]string{"type": "connected"})
	}

	// Block in the read loop until this client disconnects
	handleClientMessages(client)
}

// handleClientMessages reads messages and forwards audio data
func handleClientMessages(client *Client) {
	// Send periodic pings so the pong handler keeps extending the read deadline;
	// without them the connection would time out after pongWait.
	done := make(chan struct{})
	go func() {
		ticker := time.NewTicker(pingPeriod)
		defer ticker.Stop()
		for {
			select {
			case <-ticker.C:
				// WriteControl may be called concurrently with other writes
				deadline := time.Now().Add(10 * time.Second)
				if err := client.Conn.WriteControl(websocket.PingMessage, nil, deadline); err != nil {
					return
				}
			case <-done:
				return
			}
		}
	}()

	defer func() {
		close(done)
		client.Close()
		mu.Lock()
		if waitingClient == client {
			waitingClient = nil
		}
		mu.Unlock()
	}()

	for {
		messageType, message, err := client.Conn.ReadMessage()
		if err != nil {
			log.Println("Read error:", err)
			break
		}

		// Read the peer under the lock because pairing happens on another goroutine
		mu.Lock()
		peer := client.Peer
		mu.Unlock()

		if messageType == websocket.BinaryMessage && peer != nil {
			peer.SendBinary(websocket.BinaryMessage, message)
		}
		// Ignore other message types for now
	}
}
Explanation:
- This file manages WebSocket connections.
- HandleConnection upgrades HTTP requests to WebSocket connections and pairs clients:
  - If no client is waiting, the new client becomes the waitingClient and waits.
  - If a client is waiting, the new client is paired with it, and both are notified.
- handleClientMessages runs in a loop, reading messages from a client:
  - If the message is binary (audio data) and the client has a peer, it forwards the data to the peer.
  - It handles cleanup when a client disconnects.
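The pairing rule is easier to follow without any networking attached. The sketch below is a simplified model, not the handler itself: it replaces connections with string IDs, and the lobby type with its join and leave methods is a hypothetical stand-in for the waitingClient variable and its mutex.

```go
package main

import (
	"fmt"
	"sync"
)

// lobby holds at most one waiting participant, like waitingClient in handler.go.
type lobby struct {
	mu      sync.Mutex
	waiting string // empty means nobody is waiting
}

// join either parks id in the waiting slot (paired == false) or pairs it
// with whoever was waiting and frees the slot (paired == true).
func (l *lobby) join(id string) (peer string, paired bool) {
	l.mu.Lock()
	defer l.mu.Unlock()
	if l.waiting == "" {
		l.waiting = id
		return "", false
	}
	peer = l.waiting
	l.waiting = ""
	return peer, true
}

// leave frees the waiting slot if id still occupies it, mirroring the
// cleanup performed when a waiting client disconnects.
func (l *lobby) leave(id string) {
	l.mu.Lock()
	defer l.mu.Unlock()
	if l.waiting == id {
		l.waiting = ""
	}
}

func main() {
	var l lobby
	for _, id := range []string{"alice", "bob", "carol"} {
		if peer, ok := l.join(id); ok {
			fmt.Printf("%s paired with %s\n", id, peer)
		} else {
			fmt.Printf("%s is waiting\n", id)
		}
	}
	l.leave("carol") // carol disconnects before anyone else arrives
	_, ok := l.join("dave")
	fmt.Println("dave paired:", ok)
}
```

Because the slot holds a single client, the third arrival waits for a fourth, which is the same behavior the server shows when a third browser tab connects.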
3. Creating server/websocket/client.go
Create a file named client.go inside the server/websocket directory and add the following code:
package websocket

import (
	"sync"
	"time"

	"github.com/gorilla/websocket"
)

// Client wraps a WebSocket connection and a pointer to its paired peer
type Client struct {
	Conn *websocket.Conn
	mu   sync.Mutex // Serializes writes; the connection supports one writer at a time
	Peer *Client
}

func NewClient(conn *websocket.Conn) *Client {
	conn.SetReadLimit(1024 * 1024) // 1MB max message size
	conn.SetWriteDeadline(time.Now().Add(10 * time.Second))
	return &Client{
		Conn: conn,
		Peer: nil,
	}
}

func (c *Client) SendJSON(message interface{}) error {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.Conn.SetWriteDeadline(time.Now().Add(10 * time.Second))
	return c.Conn.WriteJSON(message)
}

func (c *Client) SendBinary(messageType int, data []byte) error {
	if len(data) < 10 {
		return nil // Ignore small messages to avoid noise
	}
	c.mu.Lock()
	defer c.mu.Unlock()
	c.Conn.SetWriteDeadline(time.Now().Add(15 * time.Second))
	return c.Conn.WriteMessage(messageType, data)
}

func (c *Client) Close() {
	// Detach both sides of the pair under the package-level mu declared in
	// handler.go, which also guards pairing in HandleConnection
	mu.Lock()
	peer := c.Peer
	c.Peer = nil
	if peer != nil {
		peer.Peer = nil
	}
	mu.Unlock()

	if peer != nil {
		peer.SendJSON(map[string]string{"type": "disconnected"})
	}
	msg := websocket.FormatCloseMessage(websocket.CloseNormalClosure, "Session ended")
	c.Conn.WriteControl(websocket.CloseMessage, msg, time.Now().Add(5*time.Second))
	c.Conn.Close()
}
Explanation:
- Defines a Client struct to represent a WebSocket client with a connection and a peer.
- NewClient initializes a client with a connection.
- SendJSON sends status messages (e.g., “waiting”, “connected”).
- SendBinary sends audio data to the client.
- Close handles cleanup, notifying the peer of disconnection.
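The status messages are small JSON objects with a single type field. As an optional alternative to the map literals, a typed struct would produce the same wire format; the sketch below shows the encoding with the standard library only. The statusMessage type and encodeStatus helper are hypothetical names, not part of the server code.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// statusMessage is a hypothetical typed equivalent of
// map[string]string{"type": "..."} used by SendJSON.
type statusMessage struct {
	Type string `json:"type"`
}

// encodeStatus returns the JSON text the browser would receive for kind.
func encodeStatus(kind string) (string, error) {
	b, err := json.Marshal(statusMessage{Type: kind})
	return string(b), err
}

func main() {
	for _, kind := range []string{"waiting", "connected", "disconnected"} {
		s, err := encodeStatus(kind)
		if err != nil {
			panic(err)
		}
		fmt.Println(s) // e.g. {"type":"waiting"}
	}
}
```

Since SendJSON accepts any value, a struct like this could be passed in place of the map, which lets the compiler catch a misspelled field name.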
Frontend Implementation
The frontend will connect to the WebSocket server, capture audio from the microphone, send it as PCM data, and play received audio from the peer.
1. Creating static/index.html
Create a file named index.html inside the static directory and add the following code:
<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>Duoduel Audio Chat</title>
</head>
<body>
    <h1>Duoduel Audio Chat</h1>
    <div id="status">Connecting...</div>
    <button id="mute">Mute</button>
    <script src="js/main.js"></script>
</body>
</html>
Explanation:
- A simple HTML page with:
  - A heading.
  - A status div to display connection status.
  - A mute button to toggle the microphone.
  - A script tag to include main.js.
2. Creating static/js/main.js
Create a file named main.js inside the static/js directory and add the following code:
let socket;
let audioContext;
let isMuted = false;
let stream;
let source;
let processor;
let gainNode;
const SAMPLE_RATE = 48000;
const CHUNK_SIZE = 4096;

function connectWebSocket() {
    socket = new WebSocket('wss://localhost:8443/ws');
    socket.onopen = () => {
        console.log('WebSocket connected');
        setupAudio();
    };
    socket.onmessage = (event) => {
        if (event.data instanceof Blob) {
            // Handle binary audio data
            event.data.arrayBuffer().then(buffer => {
                playAudioData(buffer);
            });
        } else {
            // Handle JSON status messages
            const message = JSON.parse(event.data);
            handleMessage(message);
        }
    };
    socket.onclose = () => {
        console.log('WebSocket closed');
        document.getElementById('status').textContent = 'Disconnected';
    };
    socket.onerror = (error) => {
        console.error('WebSocket error:', error);
    };
}

function handleMessage(message) {
    switch (message.type) {
        case 'waiting':
            document.getElementById('status').textContent = 'Waiting for partner...';
            break;
        case 'connected':
            document.getElementById('status').textContent = 'Connected to partner';
            break;
        case 'disconnected':
            document.getElementById('status').textContent = 'Partner disconnected';
            break;
        default:
            console.log('Unknown message type:', message.type);
    }
}

async function setupAudio() {
    try {
        audioContext = new (window.AudioContext || window.webkitAudioContext)({
            sampleRate: SAMPLE_RATE
        });
        if (isIOS()) {
            // For iOS, resume audio context on user interaction
            document.addEventListener('touchstart', () => {
                audioContext.resume();
            }, { once: true });
        }
        const constraints = {
            audio: {
                echoCancellation: false,
                noiseSuppression: false,
                autoGainControl: false,
                sampleRate: SAMPLE_RATE, // Request the same rate the AudioContext uses
                channelCount: 1
            },
            video: false
        };
        stream = await navigator.mediaDevices.getUserMedia(constraints);
        source = audioContext.createMediaStreamSource(stream);
        gainNode = audioContext.createGain();
        gainNode.gain.value = 2.5; // Amplify input
        source.connect(gainNode);
        processor = audioContext.createScriptProcessor(CHUNK_SIZE, 1, 1);
        processor.onaudioprocess = (e) => {
            if (!isMuted && socket.readyState === WebSocket.OPEN) {
                const inputData = e.inputBuffer.getChannelData(0);
                const pcmData = new Int16Array(inputData.length);
                for (let i = 0; i < inputData.length; i++) {
                    // Clamp to [-1, 1], then scale to the signed 16-bit range
                    const s = Math.max(-1, Math.min(1, inputData[i]));
                    pcmData[i] = s < 0 ? s * 32768 : s * 32767;
                }
                socket.send(pcmData.buffer);
            }
        };
        gainNode.connect(processor);
        processor.connect(audioContext.destination); // Required to trigger processing
    } catch (error) {
        console.error('Error setting up audio:', error);
    }
}

function playAudioData(buffer) {
    const pcmData = new Int16Array(buffer);
    const floatData = new Float32Array(pcmData.length);
    for (let i = 0; i < pcmData.length; i++) {
        floatData[i] = pcmData[i] / 32768;
    }
    const audioBuffer = audioContext.createBuffer(1, floatData.length, SAMPLE_RATE);
    audioBuffer.getChannelData(0).set(floatData);
    const source = audioContext.createBufferSource();
    source.buffer = audioBuffer;
    source.connect(audioContext.destination);
    source.start();
}

function toggleMute() {
    isMuted = !isMuted;
    document.getElementById('mute').textContent = isMuted ? 'Unmute' : 'Mute';
}

function isIOS() {
    return /iPad|iPhone|iPod/.test(navigator.userAgent);
}

// Initialize
connectWebSocket();
document.getElementById('mute').addEventListener('click', toggleMute);
Explanation:
- WebSocket Connection: Connects to wss://localhost:8443/ws and handles connection events.
- Audio Setup: Requests microphone access, converts audio to 16-bit PCM using ScriptProcessorNode, and sends it over WebSocket when not muted. ScriptProcessorNode is deprecated in favor of AudioWorklet, but it is still widely supported and keeps this example short.
- Audio Playback: Receives PCM data, converts it back to floating-point samples, and plays it using the Web Audio API.
- Mute Functionality: Stops or resumes sending microphone audio to the peer.
- iOS Compatibility: Resumes the audio context on touch for iOS devices.
Generating Self-Signed Certificates
Since the server uses HTTPS by default, we need self-signed certificates for development. Run these commands from the duoduel directory (mkdir -p avoids an error because the cert directory already exists from the project setup):
cd server
mkdir -p cert
openssl req -x509 -newkey rsa:4096 -keyout cert/key.pem -out cert/cert.pem -days 365 -nodes
- Follow the prompts to generate cert.pem and key.pem in server/cert.
- These are for development only; use proper certificates in production.
Running and Testing the Application
Start the server: Navigate to the server directory and run:
cd server
go run main.go -https=true -port=8443
This starts the server on https://localhost:8443.
Test the application:
- Open two browser tabs (e.g., Chrome or Firefox).
- In each tab, navigate to https://localhost:8443.
- Accept the self-signed certificate warning.
- The first tab will show “Waiting for partner…”.
- When the second tab connects, both will show “Connected to partner”, and you can talk.
- Use the “Mute” button to toggle your microphone.
Security Considerations
- HTTPS: Always use HTTPS in production to encrypt WebSocket traffic.
- Origin Checking: The CheckOrigin function allows all origins for simplicity. In production, restrict it to trusted domains.
- Certificates: Replace self-signed certificates with ones from a trusted authority in production.
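As a rough illustration of restricting origins, the function below has the same signature as the upgrader's CheckOrigin field and accepts only hosts on an allow-list. The allowedHosts map, its single entry, and the requestFrom helper are assumptions made for this sketch; a real deployment would list its own domains. Rejecting requests with no Origin header is a design choice here and would also block non-browser clients.

```go
package main

import (
	"fmt"
	"net/http"
	"net/url"
)

// allowedHosts is a hypothetical allow-list of origin hosts (host:port).
var allowedHosts = map[string]bool{
	"localhost:8443": true,
}

// checkOrigin matches the CheckOrigin signature: func(*http.Request) bool.
// It accepts a request only if its Origin header names an allowed host.
func checkOrigin(r *http.Request) bool {
	origin := r.Header.Get("Origin")
	if origin == "" {
		return false // design choice: refuse requests without an Origin header
	}
	u, err := url.Parse(origin)
	if err != nil {
		return false
	}
	return allowedHosts[u.Host]
}

// requestFrom builds a WebSocket handshake-like request carrying origin.
func requestFrom(origin string) *http.Request {
	r, err := http.NewRequest(http.MethodGet, "https://localhost:8443/ws", nil)
	if err != nil {
		panic(err)
	}
	if origin != "" {
		r.Header.Set("Origin", origin)
	}
	return r
}

func main() {
	for _, origin := range []string{"https://localhost:8443", "https://evil.example", ""} {
		fmt.Printf("%q allowed: %v\n", origin, checkOrigin(requestFrom(origin)))
	}
}
```

Assigning a function like this to the CheckOrigin field in handler.go would replace the allow-all version used during development.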
Troubleshooting
No Audio:
- Check the browser console (F12) for errors.
- Ensure microphone permissions are granted.
- Test with different browsers or devices.
Connection Fails:
- Verify the server is running and port 8443 is open.
- Check the WebSocket URL (wss://localhost:8443/ws).
- Ensure the certificate is accepted.
Choppy Audio:
- Adjust CHUNK_SIZE in main.js (e.g., try 2048 or 8192).
- Test with a faster network or locally.
Congratulations! You’ve built a real-time audio chat application from scratch. This basic version pairs two users, but you could extend it with features like multi-user rooms, better audio buffering, or a prettier UI. Happy coding! 🎤