Base64 Encoding Explained: How It Works and When to Use It
24 February, 2026 Security
Base64 is one of those encoding schemes every developer encounters but few take the time to fully understand. It shows up in JWT tokens, email attachments, embedded images, API authentication headers, and dozens of other places. This article explains exactly how Base64 works, why it exists, and when to reach for it - and when to avoid it.
What Is Base64?
Base64 is a binary-to-text encoding scheme. It converts arbitrary binary data into a string of printable ASCII characters. The name comes from the fact that it uses 64 distinct characters to represent data.
The reason Base64 exists is historical. Many older protocols - SMTP for email, HTTP headers, some database fields - were designed for ASCII text only. They either rejected bytes above 127, stripped control characters, or mangled line endings. Binary data could not survive transit through these channels unchanged. Base64 solves this by converting binary to a safe subset of ASCII that every system agrees on.
The 64-character alphabet consists of: uppercase letters A-Z (26), lowercase letters a-z (26), digits 0-9 (10), plus + and / (2). Together that is exactly 64 characters - enough to represent 6 bits of information per character.
How Base64 Works
The algorithm operates on 3 bytes at a time (24 bits total), converting them into 4 Base64 characters (6 bits each).
Step-by-step walkthrough encoding "Man":
The string "Man" in ASCII is three bytes:
- M = 0x4D = 01001101
- a = 0x61 = 01100001
- n = 0x6E = 01101110
Concatenate all 24 bits:
01001101 01100001 01101110
Split into four 6-bit groups:
010011 010110 000101 101110
Convert each 6-bit group to its decimal value:
- 010011 = 19
- 010110 = 22
- 000101 = 5
- 101110 = 46
Look up each value in the Base64 alphabet:
- 19 = T
- 22 = W
- 5 = F
- 46 = u
Result: "Man" encodes to "TWFu".
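The walkthrough above can be sketched in a few lines of Python. This is a minimal illustration, not a full encoder: the names ALPHABET and encode_group are made up for this example, and the function only handles a complete 3-byte group (padding is not covered here).

```python
import string

# Standard Base64 alphabet: A-Z, a-z, 0-9, then + and /
ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"


def encode_group(three_bytes: bytes) -> str:
    """Encode exactly 3 bytes into 4 Base64 characters (hypothetical helper)."""
    if len(three_bytes) != 3:
        raise ValueError("this sketch only handles complete 3-byte groups")
    # Concatenate the three bytes into one 24-bit integer
    bits = int.from_bytes(three_bytes, "big")
    # Take four 6-bit slices, most significant first
    indices = [(bits >> shift) & 0b111111 for shift in (18, 12, 6, 0)]
    # Look up each 6-bit value in the alphabet
    return "".join(ALPHABET[i] for i in indices)


print(encode_group(b"Man"))  # TWFu
```

The shifts 18, 12, 6 and 0 pick out the same four 6-bit groups shown in the manual steps.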
The Base64 Alphabet
| Value | Char | Value | Char | Value | Char | Value | Char |
|---|---|---|---|---|---|---|---|
| 0 | A | 16 | Q | 32 | g | 48 | w |
| 1 | B | 17 | R | 33 | h | 49 | x |
| 2 | C | 18 | S | 34 | i | 50 | y |
| 3 | D | 19 | T | 35 | j | 51 | z |
| 4 | E | 20 | U | 36 | k | 52 | 0 |
| 5 | F | 21 | V | 37 | l | 53 | 1 |
| 6 | G | 22 | W | 38 | m | 54 | 2 |
| 7 | H | 23 | X | 39 | n | 55 | 3 |
| 8 | I | 24 | Y | 40 | o | 56 | 4 |
| 9 | J | 25 | Z | 41 | p | 57 | 5 |
| 10 | K | 26 | a | 42 | q | 58 | 6 |
| 11 | L | 27 | b | 43 | r | 59 | 7 |
| 12 | M | 28 | c | 44 | s | 60 | 8 |
| 13 | N | 29 | d | 45 | t | 61 | 9 |
| 14 | O | 30 | e | 46 | u | 62 | + |
| 15 | P | 31 | f | 47 | v | 63 | / |
These 64 characters were chosen because they are printable ASCII, universally supported, and safe in most text-based protocols. The only two characters that vary between variants are positions 62 and 63 (+ and /).
Padding
Input data is not always a multiple of 3 bytes. When it isn't, Base64 uses = as a padding character.
- 1 remaining byte (8 bits): produces 2 Base64 characters and == at the end
- 2 remaining bytes (16 bits): produces 3 Base64 characters and = at the end
- 0 remaining bytes: no padding needed
Example: encoding "Ma" (2 bytes, 16 bits):
01001101 01100001
010011 010110 0001(00)
The last group is padded with two zero bits to form a valid 6-bit value: 000100 = 4 = E. Result: "TWE=".
Padding is required by the standard but some implementations - particularly JWT - omit it. Decoders must handle both cases.
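To see the three padding cases side by side, here is a short sketch using Python's standard base64 module. The helper add_padding is a hypothetical name introduced for this example; it shows one way a decoder might restore padding that the sender left off.

```python
import base64

# 1, 2 and 3 input bytes give two, one and zero padding characters
for data in (b"M", b"Ma", b"Man"):
    print(data, base64.b64encode(data).decode("ascii"))
# b'M' TQ==
# b'Ma' TWE=
# b'Man' TWFu


def add_padding(text: str) -> str:
    """Restore '=' padding so the length is a multiple of 4 (hypothetical helper)."""
    return text + "=" * (-len(text) % 4)


print(base64.b64decode(add_padding("TWE")))  # b'Ma'
```

The expression -len(text) % 4 evaluates to 0, 1 or 2 for valid unpadded input, so already padded strings pass through unchanged.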
Base64 Variants
| Variant | Chars 62-63 | Padding | Line Breaks | Primary Use |
|---|---|---|---|---|
| Standard (RFC 4648) | + / | Required | None | General purpose |
| URL-safe (RFC 4648 §5) | - _ | Required | None | URLs, JWT, OAuth |
| MIME (RFC 2045) | + / | Required | Every 76 chars | Email attachments |
| Base64url no-padding | - _ | Omitted | None | JWT specifically |
Standard Base64 is the default. Use it when encoding data for storage or APIs that accept Base64.
URL-safe Base64 replaces + with - and / with _. This is critical because + means a space in URL query strings and / is a path separator. JWT tokens, OAuth state parameters, and anything that travels in a URL must use this variant.
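MIME Base64 limits encoded lines to 76 characters, separated by CRLF line breaks. This is required for email attachments per RFC 2045, and most mail libraries apply the wrapping for you. Note that PHP's base64_encode() on its own does not insert line breaks; the usual idiom there is chunk_split(base64_encode($data)).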
Base64url without padding is used in JWT (JSON Web Token). The padding = character is a special character in URLs and would require percent-encoding. JWT drops padding entirely and relies on the decoder to infer it from the string length.
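The differences between the variants are easiest to see on input that actually hits alphabet positions 62 and 63. The sketch below uses Python's standard base64 module; the two-byte input is chosen purely for illustration. One caveat: base64.encodebytes wraps lines at 76 characters but ends them with "\n" rather than the CRLF that RFC 2045 specifies, so treat it as an approximation of MIME output.

```python
import base64

data = b"\xfb\xff"  # chosen so the output contains alphabet positions 62 and 63

standard = base64.b64encode(data).decode("ascii")          # '+/8='
url_safe = base64.urlsafe_b64encode(data).decode("ascii")  # '-_8='
no_padding = url_safe.rstrip("=")                          # '-_8'

# 100 zero bytes encode to 136 characters, which encodebytes splits across lines
wrapped = base64.encodebytes(bytes(100)).decode("ascii")
line_lengths = [len(line) for line in wrapped.splitlines()]

print(standard, url_safe, no_padding, line_lengths)
# +/8= -_8= -_8 [76, 60]
```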
Size Overhead
Every 3 bytes of input become 4 characters of output. That is a 33% size increase.
size_in_bytes * 4 / 3 (rounded up to next multiple of 4 with padding)
For a 1 MB image: 1,048,576 bytes * 4/3 ≈ 1,398,104 characters, or about 1.33 MB of Base64 text. This matters when embedding images in CSS or storing Base64 in a database column - plan for the overhead.
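The length formula can be checked directly. The function name encoded_length is an assumption for this sketch; it computes the padded output size without line breaks, using integer arithmetic to round up to whole 3-byte groups.

```python
import base64


def encoded_length(n_bytes: int) -> int:
    """Length of padded standard Base64 output for n_bytes of input (hypothetical helper)."""
    # (n_bytes + 2) // 3 is the number of 3-byte groups, rounded up
    return 4 * ((n_bytes + 2) // 3)


# Cross-check the formula against the standard library for small inputs
for n in range(50):
    assert encoded_length(n) == len(base64.b64encode(bytes(n)))

print(encoded_length(1_048_576))  # 1398104
```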
Common Use Cases
Data URIs in HTML and CSS - Embedding small images directly in markup avoids an extra HTTP request:
<img src="data:image/png;base64,iVBORw0KGgoAAAANS...">
This is practical for icons and tiny images. For anything over a few kilobytes, a separate file is more efficient.
Email attachments (MIME) - Binary email attachments are almost always Base64-encoded. MIME defines this encoding because SMTP was designed for 7-bit ASCII text. Your email client decodes it automatically.
JWT tokens - A JWT has three parts separated by dots: header.payload.signature. The header and payload are Base64url-encoded JSON. The signature is also Base64url-encoded bytes. Note that encoding is not encryption - the header and payload are trivially readable by anyone.
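To make the "trivially readable" point concrete, the sketch below builds an illustrative token and then reads it back without any key. Everything in it is made up for the example: the claims, the placeholder signature, and the helper names b64url_encode and b64url_decode. It does not verify signatures, which a real JWT library must do.

```python
import base64
import json


def b64url_encode(raw: bytes) -> str:
    """Base64url without padding, as used in JWT segments (hypothetical helper)."""
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def b64url_decode(segment: str) -> bytes:
    """Restore the omitted padding, then decode (hypothetical helper)."""
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


# Illustrative token: invented claims and a placeholder instead of a real signature
header = {"alg": "HS256", "typ": "JWT"}
payload = {"sub": "1234567890", "name": "Example User"}
token = ".".join([
    b64url_encode(json.dumps(header).encode("utf-8")),
    b64url_encode(json.dumps(payload).encode("utf-8")),
    b64url_encode(b"placeholder-signature"),
])

# Reading the header and payload needs no secret at all
header_seg, payload_seg, _signature_seg = token.split(".")
print(json.loads(b64url_decode(header_seg)))
print(json.loads(b64url_decode(payload_seg)))
```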
HTTP Basic Authentication - The Authorization header encodes credentials as username:password in Base64:
Authorization: Basic dXNlcjpwYXNzd29yZA==
This provides no security on its own - it is just encoding. Always use HTTPS.
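As a rough sketch of how such a header is built and how easily it is reversed (basic_auth_header is a hypothetical helper name, and the credentials are the same placeholder values as in the header above):

```python
import base64


def basic_auth_header(username: str, password: str) -> str:
    """Build the value of an HTTP Basic Authorization header (hypothetical helper)."""
    credentials = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(credentials).decode("ascii")


print(basic_auth_header("user", "password"))  # Basic dXNlcjpwYXNzd29yZA==

# Anyone who sees the header can reverse it without a key
print(base64.b64decode("dXNlcjpwYXNzd29yZA=="))  # b'user:password'
```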
Binary data in JSON - JSON has no native binary type. Storing image bytes, cryptographic keys, or any binary blob in JSON requires encoding it as a Base64 string.
API keys and tokens - Many API keys encode random bytes in Base64 to produce a URL-safe, copy-pasteable string.
When NOT to Use Base64
It is not encryption. Base64 is completely reversible with zero secret knowledge. Anyone who sees a Base64 string can decode it in seconds. Never use it to "hide" sensitive data.
Do not encode plain text unless required by a protocol. If you control the system, send UTF-8 text as-is. Encoding a JSON body in Base64 before putting it in another JSON field adds size and complexity for no benefit.
Avoid it for large binary file transfers. HTTP supports binary content natively via Content-Type and streaming. There is no reason to Base64-encode a video file for an HTTP upload - it will be 33% larger and slower.
Do not store large Base64 strings in relational databases unless you have a specific reason. Use a binary column (BLOB, BYTEA) for binary data and store decoded bytes.
Base64 vs Other Encodings
| Encoding | Size vs raw | Characters | Use case |
|---|---|---|---|
| Hex | 2x | 0-9, a-f | Checksums, hashes, debugging |
| Base32 | 1.6x | A-Z, 2-7 | OTP seeds, case-insensitive contexts |
| Base58 | ~1.37x | alphanumeric minus 0/O/I/l | Bitcoin addresses, human-readable IDs |
| Base64 | 1.33x | A-Z, a-z, 0-9, +/ | General binary-to-text encoding |
Hex encodes each byte as two characters (00-ff). Simpler to implement manually, but twice the size. Good for displaying checksums and hashes where readability matters more than efficiency.
Base32 uses only uppercase letters and digits 2-7, avoiding ambiguous characters like 0/O and 1/l/I. Useful when humans must type or read the encoded value - TOTP secret keys often use Base32.
Base58 (used by Bitcoin) removes 0, O, I, and l from the alphanumeric set to prevent visual confusion. Not standardised outside of cryptocurrency contexts.
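The size ratios in the table can be reproduced with Python's standard library. Base58 is left out because the standard library has no Base58 codec. The 15-byte input is an arbitrary choice that divides evenly into both 3-byte and 5-byte groups, so no padding distorts the ratios.

```python
import base64
import os

raw = os.urandom(15)  # 15 random bytes: a multiple of both 3 and 5

sizes = {
    "hex": len(raw.hex()),
    "base32": len(base64.b32encode(raw)),
    "base64": len(base64.b64encode(raw)),
}

for name, size in sizes.items():
    print(f"{name}: {size} chars, {size / len(raw):.2f}x")
# hex: 30 chars, 2.00x
# base32: 24 chars, 1.60x
# base64: 20 chars, 1.33x
```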
Code Examples
PHP
// Standard Base64
$encoded = base64_encode('Hello, World!');
// SGVsbG8sIFdvcmxkIQ==
$decoded = base64_decode('SGVsbG8sIFdvcmxkIQ==');
// Hello, World!
// Example bytes whose standard encoding contains both + and /
$data = "\xfb\xff";
// URL-safe Base64 (replace + with - and / with _)
$urlSafe = strtr(base64_encode($data), '+/', '-_');
// -_8=
// URL-safe without padding (for JWT)
$jwtSafe = rtrim(strtr(base64_encode($data), '+/', '-_'), '=');
// -_8
// Decode URL-safe
$decoded = base64_decode(strtr($jwtSafe, '-_', '+/'));
Python
import base64
# Standard Base64
encoded = base64.b64encode(b'Hello, World!')
# b'SGVsbG8sIFdvcmxkIQ=='
decoded = base64.b64decode(b'SGVsbG8sIFdvcmxkIQ==')
# b'Hello, World!'
# URL-safe Base64
url_encoded = base64.urlsafe_b64encode(b'Hello, World!')
# b'SGVsbG8sIFdvcmxkIQ==' (same here, differs for data with + or /)
# Decode URL-safe (add padding if missing)
def decode_base64url(s):
    padding = 4 - len(s) % 4
    if padding != 4:
        s += '=' * padding
    return base64.urlsafe_b64decode(s)
JavaScript
// Browser (works with strings only)
const encoded = btoa('Hello, World!');
// 'SGVsbG8sIFdvcmxkIQ=='
const decoded = atob('SGVsbG8sIFdvcmxkIQ==');
// 'Hello, World!'
// Node.js (handles binary data correctly)
const buf = Buffer.from('Hello, World!', 'utf8');
const nodeEncoded = buf.toString('base64');
// 'SGVsbG8sIFdvcmxkIQ=='
// URL-safe Base64 in JavaScript
function toBase64url(base64) {
  return base64.replace(/\+/g, '-').replace(/\//g, '_').replace(/=+$/, '');
}
function fromBase64url(base64url) {
  const base64 = base64url.replace(/-/g, '+').replace(/_/g, '/');
  const padding = base64.length % 4;
  return atob(padding ? base64 + '='.repeat(4 - padding) : base64);
}
Note on btoa in browsers: btoa only accepts Latin-1 strings. To encode Unicode text, convert to UTF-8 bytes first:
const encoded = btoa(
  String.fromCharCode(...new TextEncoder().encode('Hello'))
);
Decoding Without a Library
To decode "SGVsbG8=" manually:
- Look up each character in the alphabet: S = 18, G = 6, V = 21, s = 44, b = 27, G = 6, 8 = 60, = is padding
- Write the 6-bit values: 010010 000110 010101 101100 011011 000110 111100
- Concatenate and split into bytes (8 bits): 01001000 01100101 01101100 01101100 01101111
- Convert to ASCII: 0x48 = H, 0x65 = e, 0x6C = l, 0x6C = l, 0x6F = o
- Result: "Hello" - the = padding signals that the last two bits are filler and are discarded.
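The manual steps above translate almost line for line into code. This is a teaching sketch rather than a production decoder: decode_manual and ALPHABET are names invented for the example, and it does no validation of malformed input.

```python
import string

ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"


def decode_manual(text: str) -> bytes:
    """Decode standard Base64 by hand (hypothetical helper, no input validation)."""
    stripped = text.rstrip("=")
    # Map each character to its 6-bit value and join all the bits
    bits = "".join(format(ALPHABET.index(ch), "06b") for ch in stripped)
    # Keep only whole bytes; leftover filler bits are dropped
    usable = len(bits) - len(bits) % 8
    return bytes(int(bits[i:i + 8], 2) for i in range(0, usable, 8))


print(decode_manual("SGVsbG8="))  # b'Hello'
```

For "SGVsbG8=" the seven characters give 42 bits, of which 40 form five whole bytes and the last two are discarded.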
Conclusion
Base64 is a practical tool with a specific job: safely encoding binary data as printable text for protocols that cannot handle raw bytes. Understanding its mechanics - the 6-bit grouping, the alphabet, the variants, and the padding rules - lets you use it correctly and debug problems quickly when something comes out garbled.
The most important rules in practice: use URL-safe Base64 for anything that travels in a URL or JWT, never confuse encoding with encryption, and account for the 33% size overhead when dealing with large payloads.
Use the Base64 encoder to encode and decode values directly in your browser - no data is sent to any server.