What Is Base64 Encoding and When Should You Use It
If you work with APIs, email systems, or web development, you have encountered Base64 even if you did not recognise it. Those long strings of letters and numbers that look like gibberish at the start of an email attachment, a data: URL in CSS, or the middle segment of a JWT token? That is Base64. It is one of the oldest and most quietly load-bearing pieces of internet plumbing, and almost every piece of software you use leans on it somewhere.
A short history of Base64
Base64 is part of a family called "radix-64" or "printable encodings," whose job is to represent arbitrary bytes using only the small alphabet of characters that a text-based system is guaranteed to pass through unchanged. The earliest widely-used member is uuencode, written by Mary Ann Horton at UC Berkeley around 1980 to ship binary files over Usenet and email when those systems would corrupt anything above 7-bit ASCII.
The Base64 alphabet itself was first standardised in RFC 989 (1987) for Privacy-Enhanced Mail (PEM), an early attempt at signed and encrypted email. PEM died, but its encoding scheme survived and was canonised in RFC 1421 (1993) and then in the MIME specification (RFC 1521 and 1522 in 1993, revised to RFCs 2045-2049 in 1996). MIME made Base64 the default way to attach binary files to email, and from there the encoding spread to nearly every text-only transport on the internet.
In 2006, the IETF consolidated the scattered Base64 definitions into RFC 4648, which defines Base64, Base32 and Base16 in a single document and superseded the earlier RFC 3548 (2003). Section 5 of RFC 4648 defines the URL-safe variant, which swaps the two non-URL-friendly characters (+ and /) for - and _. JSON Web Tokens (RFC 7519, 2015) standardised on URL-safe Base64 with the padding stripped. Today, email attachments, PEM-encoded certificates, data: URLs, and JWTs all depend on Base64.
How Base64 works: the math
Base64 takes three input bytes (24 bits) and rewrites them as four output characters (6 bits each), using a 64-symbol alphabet. The mapping is fixed:
| Index range | Characters |
|---|---|
| 0-25 | A-Z |
| 26-51 | a-z |
| 52-61 | 0-9 |
| 62 | + (standard) or - (URL-safe) |
| 63 | / (standard) or _ (URL-safe) |
So Hello becomes:
- ASCII bytes: 0x48 0x65 0x6C 0x6C 0x6F (5 bytes, 40 bits)
- Binary: 01001000 01100101 01101100 01101100 01101111
- Re-grouped into 6-bit chunks: 010010 000110 010101 101100 011011 000110 1111
- Last chunk is short, so it is padded with two zero bits: 010010 000110 010101 101100 011011 000110 111100
- Lookup: S G V s b G 8 (7 characters: six full 6-bit groups covering 36 bits, plus one group built from the remaining 4 bits)
- Padding: add = to round up to a multiple of 4 output characters: SGVsbG8=
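The steps above can be sketched in a few lines of Python. This is a teaching sketch, not a replacement for the standard library: the function name encode_base64 is hypothetical, and real code should call base64.b64encode.

```python
import base64

# Standard alphabet from the table above, index 0 to 63.
ALPHABET = (
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789+/"
)

def encode_base64(data: bytes) -> str:
    """Hypothetical teaching encoder that mirrors the manual steps."""
    out = []
    for i in range(0, len(data), 3):
        chunk = data[i:i + 3]
        # Pack up to 3 bytes into a 24-bit integer, zero-filling on the right.
        n = int.from_bytes(chunk + b"\x00" * (3 - len(chunk)), "big")
        # Split into four 6-bit indices, most significant first.
        indices = [(n >> shift) & 0x3F for shift in (18, 12, 6, 0)]
        # A chunk of k bytes yields k + 1 real characters; the rest is "=".
        keep = len(chunk) + 1
        out.append("".join(ALPHABET[j] for j in indices[:keep]) + "=" * (4 - keep))
    return "".join(out)

print(encode_base64(b"Hello"))                # SGVsbG8=
print(base64.b64encode(b"Hello").decode())    # SGVsbG8=
```

Comparing the result with the standard library on a few inputs is a quick way to convince yourself that the regrouping logic is right.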
The padded output is always a multiple of 4 characters. If the input length modulo 3 is 1, you get two = padding characters; if it is 2, you get one =; if it is 0, no padding. Padding is sometimes stripped (notably in JWT and in URL fragments). Many decoders tolerate that, but strict ones reject unpadded input, so check what your decoder expects.
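A short Python sketch of the padding rule, plus one way to restore stripped padding before decoding. The helper name decode_unpadded is an assumption for illustration; Python's own decoder is one of the strict ones and raises an error on unpadded input.

```python
import base64

# Input length modulo 3 decides how many "=" characters appear.
for text in (b"abc", b"ab", b"a"):
    print(len(text) % 3, base64.b64encode(text).decode())
# 0 YWJj
# 2 YWI=
# 1 YQ==

def decode_unpadded(s: str) -> bytes:
    """Hypothetical helper: re-add stripped '=' so a strict decoder accepts s."""
    return base64.b64decode(s + "=" * (-len(s) % 4))

print(decode_unpadded("SGVsbG8"))  # b'Hello'
```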
The 33 % size overhead comes from this 3-to-4 expansion: every 3 bytes of input become 4 characters of output, an increase of one third. There is no way to reduce it without changing the alphabet (Base85 / Ascii85 expands by only 25 % using 85 printable characters, at the cost of a more complex encoder).
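The expansion can be stated as a formula: n input bytes produce 4 characters for every started group of 3 bytes. A minimal Python sketch, with encoded_length as a hypothetical helper name:

```python
import base64

def encoded_length(n_bytes: int) -> int:
    """Padded Base64 length: 4 characters per started group of 3 bytes."""
    return 4 * ((n_bytes + 2) // 3)

for n in (1, 3, 1_000, 1_000_000):
    out = encoded_length(n)
    # The relative overhead approaches one third as the input grows.
    print(n, out, f"{out / n - 1:.1%}")
```

For very short inputs the padding dominates, so the overhead is well above one third; it only settles near 33 % once the input is more than a few dozen bytes.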
Common use cases
Email attachments. SMTP, the protocol that carries email between servers, was designed in 1982 (RFC 821) for 7-bit ASCII. Binary attachments (an image, a PDF, a ZIP) are therefore normally Base64-encoded by your mail client before transmission and decoded by the recipient's. The MIME headers in an email tell the recipient which parts are Base64 and which are plain text.
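As an illustration, Python's standard email package applies Base64 to binary attachments by default. The addresses, subject, and filename below are placeholders.

```python
from email.message import EmailMessage

msg = EmailMessage()
msg["From"] = "sender@example.com"        # placeholder address
msg["To"] = "recipient@example.com"       # placeholder address
msg["Subject"] = "Report attached"
msg.set_content("See the attached file.")

# Arbitrary binary payload standing in for an image, PDF, or ZIP.
payload = bytes(range(256))
msg.add_attachment(
    payload,
    maintype="application",
    subtype="octet-stream",
    filename="blob.bin",
)

wire = msg.as_string()
print("Content-Transfer-Encoding: base64" in wire)

# The receiving side reads the MIME headers and decodes the part.
attachment = next(msg.iter_attachments())
print(attachment.get_content() == payload)
```

Printing wire shows the MIME structure: a plain-text part followed by a part whose body is wrapped Base64 text.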
Data URLs in HTML and CSS. A data:image/png;base64,iVBORw0KGgo... URL embeds a binary file directly in the document. Useful for small icons under 1-2 KB where the saved HTTP request outweighs the 33 % size overhead and the loss of caching.
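Building a data URL is a matter of string concatenation once the bytes are encoded. A minimal Python sketch, where to_data_url is a hypothetical helper and the input is just the 8-byte PNG signature rather than a complete image:

```python
import base64

def to_data_url(data: bytes, mime_type: str) -> str:
    """Hypothetical helper: wrap raw bytes in a Base64 data URL."""
    encoded = base64.b64encode(data).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"

# Every PNG file starts with these 8 bytes, which is why Base64 PNGs
# always begin with "iVBORw0KGgo".
png_signature = b"\x89PNG\r\n\x1a\n"
print(to_data_url(png_signature, "image/png"))
# data:image/png;base64,iVBORw0KGgo=
```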
API payloads. When a JSON or XML API needs to accept a binary value (a file upload, a signature, a profile picture), a common pattern is to Base64-encode the bytes and ship them as a string field. The receiver decodes them on the server side. OpenAI's API, for example, accepts image input as a Base64 data URL, and many cloud function platforms hand binary request bodies to your code as Base64 strings.
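A sketch of that pattern in Python. The field names filename and content_base64 are made up for illustration and do not correspond to any particular API.

```python
import base64
import json

def build_upload_body(filename: str, data: bytes) -> str:
    """Client side: embed binary data in a JSON document (hypothetical schema)."""
    return json.dumps({
        "filename": filename,
        "content_base64": base64.b64encode(data).decode("ascii"),
    })

def read_upload_body(body: str) -> tuple[str, bytes]:
    """Server side: parse the JSON and recover the original bytes."""
    doc = json.loads(body)
    # validate=True rejects characters outside the standard alphabet
    # instead of silently skipping them.
    return doc["filename"], base64.b64decode(doc["content_base64"], validate=True)

body = build_upload_body("avatar.bin", b"\x00\x01\xfe\xff")
print(body)
print(read_upload_body(body))
```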
HTTP Basic Authentication. The Authorization: Basic <token> header carries a Base64-encoded username:password pair (RFC 7617). This is encoding, not encryption: anyone who sees the header sees the password. Basic Auth should only be used over HTTPS for that reason.
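The following Python sketch builds and parses such a header, using the example credentials from RFC 7617. The function names are hypothetical; the point is how little work it takes to recover the password.

```python
import base64

def basic_auth_header(username: str, password: str) -> str:
    """Hypothetical helper: build the value of an Authorization header."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8"))
    return "Basic " + token.decode("ascii")

def parse_basic_auth(value: str) -> tuple[str, str]:
    """Hypothetical helper: recover the credentials from the header value."""
    scheme, _, token = value.partition(" ")
    if scheme.lower() != "basic":
        raise ValueError("not a Basic credential")
    # Split on the first colon only; the password may itself contain colons.
    username, _, password = base64.b64decode(token).decode("utf-8").partition(":")
    return username, password

header = basic_auth_header("Aladdin", "open sesame")
print(header)                    # Basic QWxhZGRpbjpvcGVuIHNlc2FtZQ==
print(parse_basic_auth(header))  # ('Aladdin', 'open sesame')
```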
Certificates and keys. PEM files (-----BEGIN CERTIFICATE----- ... -----END CERTIFICATE-----) wrap a Base64-encoded blob of DER-encoded ASN.1 bytes. TLS certificates, private keys, and code-signing certificates are most often distributed as Base64 inside a PEM envelope, although the same data can also be stored as raw DER.
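A rough Python sketch of the PEM envelope, assuming the common layout of 64-character Base64 lines between the BEGIN and END markers. The helper names are hypothetical, and the input bytes are a stand-in rather than a real DER certificate.

```python
import base64

def der_to_pem(der: bytes, label: str = "CERTIFICATE") -> str:
    """Hypothetical helper: wrap DER bytes in a PEM text envelope."""
    b64 = base64.b64encode(der).decode("ascii")
    # PEM bodies are conventionally wrapped at 64 characters per line.
    lines = [b64[i:i + 64] for i in range(0, len(b64), 64)]
    return f"-----BEGIN {label}-----\n" + "\n".join(lines) + f"\n-----END {label}-----\n"

def pem_to_der(pem: str) -> bytes:
    """Hypothetical helper: strip the envelope and decode the body."""
    lines = [line.strip() for line in pem.strip().splitlines()]
    body = "".join(line for line in lines if line and not line.startswith("-----"))
    return base64.b64decode(body)

fake_der = bytes(range(200))  # placeholder bytes, not a valid certificate
pem = der_to_pem(fake_der)
print(pem)
print(pem_to_der(pem) == fake_der)
```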
JWT tokens. A JWT is three URL-safe-Base64 segments separated by dots: <header>.<payload>.<signature>. The Base64 encoding lets a JWT travel safely in headers, URLs, and cookies.
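To see the structure, the sketch below assembles a token-shaped string in Python and then reads the header and payload back. It does not compute or verify a signature (the third segment is a dummy value), and the helper names are hypothetical; use a real JWT library for anything security-related.

```python
import base64
import json

def b64url_encode(data: bytes) -> str:
    """URL-safe Base64 with the '=' padding stripped, as JWT uses it."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")

def b64url_decode(segment: str) -> bytes:
    """Restore the stripped padding, then decode."""
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))

def peek_jwt(token: str) -> tuple[dict, dict]:
    """Read header and payload WITHOUT verifying the signature."""
    header_b64, payload_b64, _signature_b64 = token.split(".")
    return json.loads(b64url_decode(header_b64)), json.loads(b64url_decode(payload_b64))

header = {"alg": "HS256", "typ": "JWT"}
payload = {"sub": "1234567890"}
token = ".".join([
    b64url_encode(json.dumps(header, separators=(",", ":")).encode("utf-8")),
    b64url_encode(json.dumps(payload, separators=(",", ":")).encode("utf-8")),
    b64url_encode(b"not-a-real-signature"),  # dummy third segment
])
print(token)
print(peek_jwt(token))
```

Because the first two segments are only encoded, anyone holding a token can read its claims; the signature protects integrity, not confidentiality.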
How to encode and decode
- Choose encode or decode: select the direction of conversion.
- Paste text or upload a file: enter text directly or drag and drop a file (up to 5 MB for browser-side encoding).
- Pick the variant: standard Base64 for email and certificates, URL-safe for JWT and URL fragments. The tool defaults to standard.
- Copy the result: the output updates instantly. Copy it to your clipboard, or use the download button for long outputs.
Variants of Base64
Several Base64-like encodings exist for specific situations:
| Variant | Differences | Where it is used |
|---|---|---|
| Standard (RFC 4648 §4) | A-Z, a-z, 0-9, +, /, = padding | Email (MIME), PEM, generic binary-to-text |
| URL-safe (RFC 4648 §5) | + becomes -, / becomes _ | JWT, URL fragments, filenames |
| MIME (RFC 2045) | Lines of at most 76 chars, separated by CRLF | Email body, mail headers (with =?utf-8?B?...?=) |
| crypt(3) / htpasswd | Different alphabet (./0-9A-Za-z) | Old Unix password hashes (DES-based) |
| Base64Url no-padding | URL-safe without trailing = | JWT (per RFC 7515) |
| Base32 (RFC 4648 §6) | 32-char alphabet, case-insensitive | TOTP secrets, Onion addresses |
| Base58 | 58-char alphabet (no 0, O, I, l) | Bitcoin addresses, IPFS CIDs |
| Ascii85 / Base85 | 85-char alphabet, 25 % overhead | PDF, PostScript |
Most of the time you want either standard or URL-safe Base64. The others come up in specific protocols.
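Several of these variants are available in Python's standard base64 module, which makes the differences easy to see side by side. The three input bytes are chosen so that every output character is index 62 or 63.

```python
import base64

raw = bytes([0xFB, 0xFF, 0xFE])  # encodes to indices 62, 63, 63, 62

print(base64.b64encode(raw))          # b'+//+'  standard alphabet
print(base64.urlsafe_b64encode(raw))  # b'-__-'  URL-safe alphabet

# Base32: 5 input bytes become 8 characters from A-Z and 2-7.
print(base64.b32encode(b"Hello"))     # b'JBSWY3DP'

# MIME-style output: encodebytes breaks lines after 76 characters.
# Note that it emits "\n"; mail software converts to CRLF on the wire.
wrapped = base64.encodebytes(bytes(100)).decode("ascii")
print(wrapped)

# Ascii85: 4 input bytes become 5 characters.
print(len(base64.a85encode(b"abcd")))  # 5
```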
When to use Base64
Use it when:
- You need to embed a small image (a few KB at most) directly in HTML or CSS, saving one HTTP request.
- An API requires binary data as a text string in a JSON or XML payload.
- You are passing binary data through a system that only supports text (email, log entries, query parameters).
- You are encoding a JWT, certificate, key, or any structured binary blob.
- You need a deterministic, self-contained string representation that any language can decode.
Do not use it when:
- The file is large. Base64 adds 33 % overhead, prevents browser caching of the binary as a separate resource, and forces the entire blob through the page parser.
- You need security. Base64 is not encryption; it is trivially reversible.
- You can serve the file normally. A plain <img src="photo.jpg"> is more efficient than a Base64 data URL for anything over a few KB.
- You need a compact representation. Hex is simpler if size does not matter; Base85 is denser if it does.
Common pitfalls
- Confusing encoding with encryption. Base64 is reversible by anyone in milliseconds. Putting a "secret" through Base64 protects it from nothing except shoulder-surfing.
- The 33 % size penalty. A 1 MB image becomes a 1.33 MB string, and inline data URLs are downloaded with the parent HTML every visit, with no separate cache.
- Line breaks in MIME Base64. The MIME variant of Base64 inserts \r\n so that no line exceeds 76 characters. If you paste MIME Base64 into a JSON value or a URL it will fail; strip the newlines first.
- Padding stripping in JWT. JWT uses URL-safe Base64 with the = padding removed. A library that strictly requires padding will reject valid JWTs; one that emits padding will create tokens that other libraries reject. RFC 7515 mandates "no padding" for the JWS standard.
- URL-safe vs standard mix-up. Decoding a URL-safe string with a strict standard decoder fails on the - and _ characters; decoding a standard string with a strict URL-safe decoder fails on the + and /. Lenient decoders may instead skip the unknown characters and silently return the wrong bytes.
- Unicode input handling. Base64 operates on bytes, not characters. Before you Base64-encode a string containing emoji or other non-ASCII characters, you must first decide on a byte encoding (almost always UTF-8). Different runtimes have different defaults; specify it explicitly.
- Streaming partial decoders. A correctly-implemented Base64 stream decoder waits for groups of 4 input characters before emitting 3 output bytes. Naive implementations that decode one character at a time produce garbage.
- Trailing whitespace and BOM. Some text editors append a newline or a UTF-8 byte-order mark to the file when you save. That extra byte changes the Base64 output. Diff your encoded result against an upstream source if you see unexpected mismatches.
- + interpreted as space in URLs. Standard Base64's + becomes a space when a query string is decoded as form data (application/x-www-form-urlencoded). This is one of the main reasons URL-safe Base64 exists.
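Several of these pitfalls can be reproduced in a few lines of Python. This is a sketch using only the standard library; the query parameter name blob and the sample strings are arbitrary.

```python
import base64
import binascii
from urllib.parse import parse_qs

raw = bytes([0xFB, 0xFF, 0xFE])
standard = base64.b64encode(raw).decode("ascii")          # "+//+"
url_safe = base64.urlsafe_b64encode(raw).decode("ascii")  # "-__-"

# "+" turns into a space when a query string is form-decoded.
received = parse_qs(f"blob={standard}")["blob"][0]
print(repr(received))  # ' // '

# Variant mix-up: a strict standard decoder rejects "-" and "_".
try:
    base64.b64decode(url_safe, validate=True)
except binascii.Error as exc:
    print("rejected:", exc)

# Unicode: the same text gives different Base64 under different byte encodings.
text = "héllo 👋"
as_utf8 = base64.b64encode(text.encode("utf-8"))
as_utf16 = base64.b64encode(text.encode("utf-16-le"))
print(as_utf8 != as_utf16)  # True

# MIME line breaks: remove all whitespace before strict decoding.
mime = base64.encodebytes(bytes(100)).decode("ascii")
one_line = "".join(mime.split())
print(base64.b64decode(one_line, validate=True) == bytes(100))  # True
```

Passing validate=True is a deliberate choice here: without it, Python's decoder discards characters it does not recognise, which hides exactly the mistakes this list describes.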
Alternatives and adjacent encodings
Base64 is the default, not the only option. The right choice depends on the channel and the size budget.
| Encoding | Overhead | Strength | Best for |
|---|---|---|---|
| Hex (Base16) | 100 % | Trivial to read, every byte is two chars | Debug output, short identifiers, color codes |
| Base32 (RFC 4648) | 60 % | Case-insensitive, no look-alike characters | TOTP secrets, Onion addresses, voice dictation |
| Base64 standard | 33 % | Universal, every language has it | Email, PEM, generic transport |
| Base64 URL-safe | 33 % | URL- and filename-safe | JWT, URL fragments |
| Base58 | ~37 % | No 0/O/I/l confusion, no special chars | Bitcoin addresses, IPFS CIDs |
| Ascii85 / Base85 | 25 % | Denser than Base64 | PDF, PostScript |
| Base91 | ~22 % | Even denser, more complex | Rare, niche compression contexts |
| Multipart upload | 0 % | Native binary transport over HTTP | File uploads (browsers do this for you) |
| gzip + Base64 | varies | Sometimes smaller than raw Base64 | Pre-compressed payloads |
For most everyday work, the answer is Base64 (standard or URL-safe). For binary file uploads over HTTP, the right answer is usually multipart/form-data, which does not encode at all.
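The overhead column can be checked empirically for the encodings that ship with Python's standard library. The sketch below uses a fixed 3000-byte input with no zero bytes, because Python's Ascii85 encoder abbreviates all-zero groups and would otherwise skew the count.

```python
import base64

# 3000 bytes, divisible by 3, 4 and 5, so no encoder needs padding.
data = bytes(range(1, 61)) * 50

encoders = {
    "hex": base64.b16encode,
    "base32": base64.b32encode,
    "base64": base64.b64encode,
    "ascii85": base64.a85encode,
}

sizes = {name: len(encode(data)) for name, encode in encoders.items()}
for name, size in sizes.items():
    print(f"{name:8} {size:5} chars  overhead {size / len(data) - 1:.0%}")
```

Base58 and Base91 are not in the standard library, so they are left out here; their figures in the table come from the ratio of alphabet sizes rather than from this measurement.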
Privacy and the encoder
The Base64 encoder and decoder run entirely in your browser. The text or file you input is processed by JavaScript on your device, the result is rendered to the page, and nothing is sent to a server. Nothing is logged, nothing is stored after you navigate away, and no analytics tag sees the content. For things you might Base64-encode (PEM certificates, private keys, JWT payloads from production systems, draft API requests with real customer data), that local-only flow is the right default. The whole tool can run offline once the page is loaded, which you can verify by switching off your network and re-encoding the same input.
Frequently Asked Questions
Does Base64 encryption protect my data?
No. Base64 is encoding, not encryption. Anyone can decode a Base64 string in milliseconds, so it provides zero security. If you need to protect data, use actual encryption (AES, RSA, or higher-level tools like GnuPG and age).
Why does Base64 make files larger?
Base64 encoding increases data size by approximately 33%. Three bytes of binary data become four Base64 characters. This overhead is the trade-off for being able to transmit binary data safely as text through systems that may strip or mangle non-printable bytes.
Can I encode files, not just text?
Yes. Any file (images, PDFs, audio) can be encoded to Base64. This is commonly used for embedding small images directly in HTML or CSS as data URLs, and for shipping certificates and keys as PEM text.
When should I NOT use Base64?
Do not use it for large files. A 1 MB image becomes 1.33 MB as Base64 text, and the browser cannot cache it separately. For anything over a few KB, serving the file normally is more efficient.
What is the difference between standard Base64 and URL-safe Base64?
Standard Base64 (RFC 4648 section 4) uses the characters A-Z, a-z, 0-9, +, / with = padding. URL-safe Base64 (RFC 4648 section 5) swaps + for - and / for _ so the string is safe to drop into a URL or a filename without percent-encoding. JWT tokens use the URL-safe variant.
Why does Base64 sometimes have one or two = signs at the end?
The = is padding. Base64 encodes input in 3-byte groups; if the input length is not a multiple of 3, the last group is padded with zero bits and one or two = characters mark the missing bytes. One = means one missing byte, two = means two missing bytes.