Free Binary to Text Converter
Convert between binary and text instantly.
A Short History of Binary Representation
Binary is the fundamental language of computers (every character, number and instruction is ultimately represented as a sequence of 0s and 1s) but the idea predates computers by 250 years. Gottfried Wilhelm Leibniz wrote "Explication de l'Arithmétique Binaire" (submitted to the French Academy of Sciences in 1703, printed in the Mémoires in 1705), the first formal Western description of binary arithmetic. Leibniz was inspired in part by the trigrams and hexagrams of the Chinese I Ching, which encode every divinatory pattern as a six-line stack of broken or unbroken lines (essentially 6-bit binary). George Boole's An Investigation of the Laws of Thought (1854) gave binary the algebraic foundations (AND, OR, NOT, exclusive-or) that still underpin every digital circuit. Claude Shannon's MIT master's thesis "A Symbolic Analysis of Relay and Switching Circuits" (submitted August 1937) made the leap that connected Boolean algebra to electrical engineering: every relay-and-switch circuit corresponds to a Boolean expression, and vice versa. Shannon's thesis is widely considered the most influential master's thesis of the 20th century. The first electronic computers were a mix of decimal (ENIAC, 1945, used decimal counters internally) and binary (the Manchester Baby, June 1948, the first stored-program binary computer; EDSAC, May 1949, the first practical one); by the early 1950s binary was the universal default.
The 8-bit byte arrived later. The word "byte" was coined by Werner Buchholz at IBM in June 1956 during the design of the IBM Stretch computer, originally with various widths (6-bit, 8-bit, 9-bit). The 8-bit byte was standardised by IBM with the announcement of System/360 on 7 April 1964; from that point on, "one byte" was understood to mean 8 bits across the industry. (Older literature occasionally uses "octet" to be unambiguous, and IETF specs still prefer "octet" for this reason.)
Character Encodings: Turning Letters Into Bits
Binary is the bottom layer; the layer above is the character encoding that maps letters and symbols to specific bit patterns. Baudot code (Émile Baudot, invented 1870, patented 1874) was the first widely used binary text encoding: 5 bits per character, used by teleprinters and Telex networks for over a century. Five bits give only 32 codes, which forced the Baudot code to use shift characters (one for letters, one for numbers/punctuation) to expand the addressable set. ASCII (American Standard Code for Information Interchange) was published as ASA X3.4-1963 on 17 June 1963 by the American Standards Association (the body was renamed ANSI in 1969, which is why the same standard appears later as ANSI X3.4-1986). ASCII uses 7 bits to encode 128 characters: control codes (0-31), space, punctuation and digits (32-64), uppercase letters (65-90), more punctuation (91-96), lowercase letters (97-122), final punctuation (123-126) and the DEL control code (127). The 7-bit width was chosen for compatibility with paper-tape telegraphy hardware. ASCII became the dominant English-language encoding for the next two decades; the canonical revision ANSI X3.4-1986 is essentially identical and is what people mean today when they say "ASCII."
The extended ASCII / ISO 8859 family (ISO 8859-1 published 1987, the rest through the 1990s) filled the upper 128 values of an 8-bit byte with regional alphabets: ISO 8859-1 (Latin-1) covered Western European languages, 8859-2 (Latin-2) Eastern European, 8859-5 Cyrillic, 8859-6 Arabic, 8859-7 Greek, 8859-8 Hebrew, 8859-9 (Latin-5) Turkish, 8859-11 Thai. This produced fifteen mutually incompatible 8-bit encodings: 0xE9 meant é in Latin-1 but a different character in 8859-5 (Cyrillic), was undefined in strict ASCII, and was yet another character in Mac Roman. The mismatch produced the famous mojibake condition (Japanese: 文字化け, "character transformation"): corrupted-looking text caused by encoding mismatches.
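The 0xE9 example can be checked directly. The following is a minimal sketch using Python's standard-library codecs; the function name `decode_everywhere` is made up for illustration, and the codec names are Python's own spellings of the encodings mentioned above.

```python
# One byte, several legacy interpretations.
CODECS = ("latin_1", "iso8859_5", "mac_roman", "ascii")


def decode_everywhere(raw: bytes) -> dict:
    """Decode the same bytes under several single-byte encodings."""
    results = {}
    for codec in CODECS:
        try:
            results[codec] = raw.decode(codec)
        except UnicodeDecodeError:
            # Strict ASCII defines nothing above 0x7F.
            results[codec] = "<undefined>"
    return results


if __name__ == "__main__":
    for codec, text in decode_everywhere(bytes([0xE9])).items():
        print(f"{codec:10} 0xE9 -> {text}")
```

Each codec returns a different character for the same byte, and strict ASCII rejects it outright, which is the mismatch that produces mojibake.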
The Unicode project began as a response. The Unicode Consortium was incorporated on 3 January 1991; Unicode 1.0 was published in October 1991 with about 7,000 characters. By Unicode 16.0 (released 10 September 2024) the standard covers more than 154,000 characters across 168 scripts. Unicode is a code-point system (a unique numeric identifier for every character) but it is not directly a binary encoding. Multiple encodings of Unicode exist: UTF-32 (4 bytes per character, fixed-width), UTF-16 (2 or 4 bytes, variable), and the dominant one for the modern web: UTF-8.
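To make the distinction between a code point and its encodings concrete, here is a small sketch in Python. The helper name `encoded_sizes` is hypothetical; the codec names are standard-library ones.

```python
def encoded_sizes(char: str) -> dict:
    """Return the code point of one character and its byte length per encoding."""
    return {
        "code_point": f"U+{ord(char):04X}",
        # The "-be" variants are used so Python does not prepend a
        # byte order mark, which would inflate the byte counts.
        "utf-32": len(char.encode("utf-32-be")),
        "utf-16": len(char.encode("utf-16-be")),
        "utf-8": len(char.encode("utf-8")),
    }


if __name__ == "__main__":
    for ch in ("A", "\u00e9", "\u4e2d", "\U0001F389"):
        print(ch, encoded_sizes(ch))
```

The code point stays the same for each character; only the number of bytes changes with the encoding chosen.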
UTF-8 was designed by Ken Thompson with Rob Pike on a placemat in a New Jersey diner around 2 September 1992; Plan 9 was running on it by 8 September. UTF-8 is variable-length: 1 byte for ASCII characters (U+0000 to U+007F), 2 bytes for U+0080 to U+07FF, 3 bytes for U+0800 to U+FFFF, 4 bytes for U+10000 to U+10FFFF. The high bits of each byte indicate its position in a multi-byte sequence (0xxxxxxx = 1-byte ASCII, 110xxxxx = first byte of 2-byte sequence, 10xxxxxx = continuation byte, etc.), which makes UTF-8 self-synchronising: you can start decoding from any random position and find the next character boundary by looking at the next few bytes. UTF-8 is also backward-compatible with ASCII: every ASCII file is a valid UTF-8 file. As of 2026, W3Techs reports that approximately 98.9% of all web pages declare UTF-8 as their encoding; it is overwhelmingly the world's text encoding.
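The self-synchronising property can be sketched in a few lines of Python. This is an illustration only; `next_boundary` is a hypothetical helper, and it assumes the input is otherwise well-formed UTF-8.

```python
def next_boundary(data: bytes, pos: int) -> int:
    """Return the index of the first character start at or after pos."""
    # Continuation bytes always look like 10xxxxxx, so skipping them
    # lands on an ASCII byte or on the lead byte of the next sequence.
    while pos < len(data) and (data[pos] & 0b1100_0000) == 0b1000_0000:
        pos += 1
    return pos


if __name__ == "__main__":
    data = "a\u4e2db".encode("utf-8")  # 61 E4 B8 AD 62
    start = next_boundary(data, 2)     # index 2 is inside the 3-byte character
    print(start, data[start:].decode("utf-8"))
```

Because no lead byte can be mistaken for a continuation byte, the loop never needs to look backwards or know where the stream began.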
ASCII Binary Examples
A few representative ASCII characters in their 8-bit binary form (with the leading 0 since ASCII is technically 7-bit but byte-aligned):
- 'A' (capital A) = decimal 65 = 01000001
- 'a' (lowercase a) = decimal 97 = 01100001 (note: differs from 'A' by exactly one bit, bit 5, which is why the case-conversion XOR-with-0x20 trick works)
- '0' (digit zero) = decimal 48 = 00110000
- '9' (digit nine) = decimal 57 = 00111001
- ' ' (space) = decimal 32 = 00100000
- '\n' (newline / line feed) = decimal 10 = 00001010
- '!' (exclamation mark) = decimal 33 = 00100001
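The values in the list above can be reproduced with a short Python sketch. The function names `text_to_binary` and `toggle_ascii_case` are illustrative, not this tool's actual code (which runs as JavaScript in the browser).

```python
def text_to_binary(text: str) -> str:
    """Encode text as UTF-8 and render each byte as 8 binary digits."""
    return " ".join(f"{byte:08b}" for byte in text.encode("utf-8"))


def toggle_ascii_case(letter: str) -> str:
    """Flip bit 5 (0x20) of a single ASCII letter to switch its case."""
    if len(letter) != 1 or not (letter.isascii() and letter.isalpha()):
        raise ValueError("expected exactly one ASCII letter")
    return chr(ord(letter) ^ 0x20)


if __name__ == "__main__":
    for ch in ("A", "a", "0", "9", " ", "\n", "!"):
        print(repr(ch), ord(ch), text_to_binary(ch))
    print(toggle_ascii_case("A"), toggle_ascii_case("a"))
```

The XOR trick only makes sense for the letters A-Z and a-z, which is why the sketch rejects anything else rather than silently returning an unrelated character.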
"Hello" in ASCII becomes 01001000 01100101 01101100 01101100 01101111: five bytes, one per character. In UTF-8 it's identical because every ASCII character is also a valid 1-byte UTF-8 character. "Café" in UTF-8 is 01000011 01100001 01100110 11000011 10101001: four characters, but five bytes, because é (U+00E9) requires two bytes (11000011 10101001) under UTF-8.
UTF-8 Multi-Byte Encoding, Mechanically
UTF-8's encoding rules use specific high-bit patterns to indicate byte position:
- 1-byte (ASCII range U+0000 to U+007F): 0xxxxxxx. The high bit is 0; the remaining 7 bits are the code point.
- 2-byte (U+0080 to U+07FF): 110xxxxx 10xxxxxx. The first byte starts with 110 and the continuation byte starts with 10; the x bits combine to give the 11-bit code point.
- 3-byte (U+0800 to U+FFFF): 1110xxxx 10xxxxxx 10xxxxxx. The first byte starts with 1110 and is followed by two continuation bytes; the x bits give a 16-bit code point.
- 4-byte (U+10000 to U+10FFFF): 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx. The x bits give a 21-bit code point, which extends coverage to the whole Unicode space, including emoji.
The Russian letter п (U+043F) needs 2 bytes in UTF-8 (11010000 10111111); the Chinese character 中 (U+4E2D) needs 3 bytes; the emoji 🎉 (U+1F389) needs 4 bytes. Encoding text as UTF-8 binary always produces a valid sequence; decoding requires checking that continuation bytes have the 10 prefix (otherwise the input is invalid UTF-8 and the decoder typically replaces the bad sequence with the replacement character U+FFFD).
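The bit patterns above translate almost directly into shifts and masks. The following Python sketch encodes a single code point by hand; `utf8_encode_code_point` is a hypothetical name, and in practice Python's built-in `str.encode("utf-8")` does the same job.

```python
def utf8_encode_code_point(cp: int) -> bytes:
    """Encode one Unicode code point as UTF-8 using the high-bit patterns."""
    if cp < 0 or cp > 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
        # Surrogates are reserved for UTF-16 and are not valid in UTF-8.
        raise ValueError(f"not an encodable code point: {cp:#x}")
    if cp <= 0x7F:      # 0xxxxxxx
        return bytes([cp])
    if cp <= 0x7FF:     # 110xxxxx 10xxxxxx
        return bytes([0b1100_0000 | (cp >> 6),
                      0b1000_0000 | (cp & 0x3F)])
    if cp <= 0xFFFF:    # 1110xxxx 10xxxxxx 10xxxxxx
        return bytes([0b1110_0000 | (cp >> 12),
                      0b1000_0000 | ((cp >> 6) & 0x3F),
                      0b1000_0000 | (cp & 0x3F)])
    # 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return bytes([0b1111_0000 | (cp >> 18),
                  0b1000_0000 | ((cp >> 12) & 0x3F),
                  0b1000_0000 | ((cp >> 6) & 0x3F),
                  0b1000_0000 | (cp & 0x3F)])


if __name__ == "__main__":
    for cp in (0x41, 0x043F, 0x4E2D, 0x1F389):
        encoded = utf8_encode_code_point(cp)
        print(f"U+{cp:04X}", " ".join(f"{b:08b}" for b in encoded))
```

Each continuation byte carries 6 payload bits (mask 0x3F), so the lead byte holds whatever bits remain after shifting right by 6, 12 or 18.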
Notation Conventions
Binary text shows up in the wild in several conventions, all referring to the same underlying bytes:
- Space-separated bytes: 01001000 01100101 01101100 01101100 01101111. The most readable form, common in tutorials and puzzles.
- No-space continuous: 0100100001100101011011000110110001101111. More compact, but requires the reader to know that byte boundaries fall every 8 digits.
- Comma-separated: 01001000,01100101,01101100,01101100,01101111. Common in CSV-encoded test data.
- Hex pairs (Base16): 48 65 6c 6c 6f. The same bytes shown as two hex digits each instead of eight binary digits; much more compact and common in programmer-facing tools (hexdump, hex editors).
- Decimal: 72 101 108 108 111. The underlying integer value of each byte.
This tool accepts space-separated, comma-separated and continuous binary input for binary-to-text, and produces space-separated 8-bit binary for text-to-binary output.
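One way a parser might accept all three binary input forms is to discard the separators and regroup the digits into bytes. The Python sketch below illustrates that approach under one stated assumption: every group is zero-padded to 8 digits. It is not this tool's actual implementation, and the function names are made up.

```python
import re


def binary_to_bytes(binary: str) -> bytes:
    """Parse space-separated, comma-separated or continuous binary digits."""
    # Treat whitespace and commas as optional separators.
    digits = re.sub(r"[\s,]+", "", binary)
    if re.fullmatch(r"[01]*", digits) is None:
        raise ValueError("input may contain only 0, 1, spaces and commas")
    if len(digits) % 8 != 0:
        raise ValueError("bit count must be a multiple of 8")
    return bytes(int(digits[i:i + 8], 2) for i in range(0, len(digits), 8))


def binary_to_text(binary: str) -> str:
    """Decode the parsed bytes as UTF-8, substituting U+FFFD for bad sequences."""
    return binary_to_bytes(binary).decode("utf-8", errors="replace")


def as_hex(data: bytes) -> str:
    return " ".join(f"{b:02x}" for b in data)


def as_decimal(data: bytes) -> str:
    return " ".join(str(b) for b in data)
```

For example, `binary_to_text("01001000,01100101,01101100,01101100,01101111")` returns `"Hello"`, and `as_hex(b"Hello")` gives the hex-pair form of the same bytes.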
Common Use Cases
- Computer science education. Binary representations are foundational in CS101: students learn what a "byte" is by encoding their name and reading it back.
- Hidden message puzzles. Escape rooms, geocaching coordinates, ARG (alternate reality game) clues, capture-the-flag challenges all use binary-encoded text as a common cipher because it's instantly recognisable as "binary" but requires the reader to know the conversion.
- Steganography. Hiding messages in plain sight by encoding them as binary that looks like noise: a binary string in an image's least-significant-bit data, or hidden in the trailing whitespace of an email.
- Debugging encoding issues. When "Café" displays as "CafÃ©", or "€" displays as "â‚¬", the bytes reveal what happened: a UTF-8 sequence was interpreted as Latin-1 (or its superset Windows-1252), or vice versa. Looking at the actual binary representation makes the encoding mismatch obvious.
- Forensic analysis. Recovering text from corrupted files, partial backups or damaged storage often means looking at the raw bytes and decoding them by hand.
- Learning low-level programming. Bitwise operations, bit shifting and mask construction all become intuitive when you can see the binary representation of the values.
- Binary art and tattoos. "I love you" in binary as a wedding gift, an XKCD-style binary T-shirt, a binary number tattoo. The aesthetic of pure 0s and 1s has a small but real subculture.
Encoding Gotchas Worth Knowing About
Mojibake from encoding mismatch. The same byte sequence interpreted under different encodings produces different (and usually garbled) text. 0xE9 is é in Latin-1, but in UTF-8 it is the lead byte of a 3-byte sequence, so a lone 0xE9 that is not followed by two continuation bytes is invalid (any byte ≥ 0x80 must be either a continuation byte or the start of a multi-byte sequence). When a UTF-8 file is opened as Latin-1, every multi-byte UTF-8 character becomes 2-4 garbled Latin-1 characters: the canonical "Café" → "CafÃ©" failure mode.
BOM (Byte Order Mark). A 2-4 byte sequence at the very start of a Unicode file that indicates the encoding and byte order: UTF-16 little-endian starts with FF FE, UTF-16 big-endian with FE FF, and the UTF-8 BOM (rarely used, sometimes called "UTF-8 with BOM") is EF BB BF. The BOM is helpful for distinguishing Unicode encodings and matters most for UTF-16 and UTF-32, where it signals byte order, but it is optional even there; the IETF discourages adding it to UTF-8 files because it breaks tools that expect the file to start with normal content.
Endianness matters for UTF-16 and UTF-32 (where the bytes of a multi-byte code unit can be stored in either order) but not for UTF-8 (which is a byte stream whose order is determined by the spec, not by hardware).
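Both gotchas are easy to reproduce. The Python sketch below simulates the mismatch and checks for the three BOMs listed above; the helper names `mojibake`, `repair` and `sniff_bom` are hypothetical, while the `codecs.BOM_*` constants are part of the standard library.

```python
import codecs


def mojibake(text: str, wrong_codec: str = "latin_1") -> str:
    """Encode as UTF-8, then decode with the wrong single-byte codec."""
    return text.encode("utf-8").decode(wrong_codec)


def repair(garbled: str, wrong_codec: str = "latin_1") -> str:
    """Undo mojibake() by reversing the wrong decode step."""
    return garbled.encode(wrong_codec).decode("utf-8")


# Only the three byte order marks discussed above are checked.
BOMS = (
    (codecs.BOM_UTF8, "utf-8"),          # EF BB BF
    (codecs.BOM_UTF16_LE, "utf-16-le"),  # FF FE
    (codecs.BOM_UTF16_BE, "utf-16-be"),  # FE FF
)


def sniff_bom(data: bytes):
    """Return the encoding implied by a leading BOM, or None if absent."""
    for bom, name in BOMS:
        if data.startswith(bom):
            return name
    return None
```

The repair step works only when the wrong decode was lossless, as it is with Latin-1, which maps every byte value to a character. Once a tool has replaced undecodable bytes with U+FFFD, the original bytes cannot be recovered.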
ASCII Reference Table
Privacy: Why Browser-Only Matters Even Here
Binary conversions seem innocuous, but the text being converted is exactly the kind of thing where privacy matters: hidden message puzzles between friends, confidential phrases being encoded for steganography experiments, sensitive strings being debugged for encoding issues, or simply anything where the reader expects the binary to be a private representation. This tool runs entirely in your browser via JavaScript. To verify, watch DevTools' Network tab while you convert, or take the page offline (airplane mode) after it loads; the converter still works. Safe for puzzle clues, sensitive debugging strings or any text you wouldn't want copied onto a stranger's hard drive.
Frequently Asked Questions
What format should the binary input be in?
Three formats work: space-separated 8-bit groups (01001000 01100101 01101100 01101100 01101111: the most readable), no-space continuous (0100100001100101011011000110110001101111: automatically grouped into 8-bit bytes by the parser), or comma-separated (01001000,01100101,...). Each 8-bit group represents one byte in the output.
Does this support emoji or non-English characters?
Yes, via UTF-8 encoding. Text-to-binary expands every Unicode character into its UTF-8 byte sequence: ASCII characters (A-Z, 0-9, basic punctuation) are 1 byte each; accented Latin and Greek characters are 2 bytes; CJK ideographs are 3 bytes; emoji and supplementary-plane characters are 4 bytes. So "Café" produces 5 bytes (4 characters, with é taking 2 bytes in UTF-8), and a single 🎉 emoji produces 4 bytes. Binary-to-text decodes the UTF-8 byte sequence back into the original Unicode characters.
Why are there 8 digits per character?
Because the unit of computer data is the byte (8 bits), and that convention has been universal since IBM standardised it with the System/360 in 1964. ASCII actually uses only 7 bits (values 0-127), but it's stored byte-aligned with a leading zero. One byte can represent 256 distinct values (0-255), which covers all standard keyboard ASCII characters with room left over: the upper 128 values hold accented Latin characters in older single-byte encodings (Latin-1, Mac Roman, Windows-1252) and serve as lead and continuation bytes in UTF-8.
Where does binary actually come from?
The mathematical idea predates computers by 250 years. Gottfried Wilhelm Leibniz wrote the first formal Western description of binary arithmetic in 1703 ("Explication de l'Arithmétique Binaire"), inspired in part by the Chinese I Ching's hexagrams. George Boole's Laws of Thought (1854) gave binary its algebraic foundations (AND, OR, NOT). Claude Shannon's MIT thesis (1937) connected Boolean algebra to electrical relay circuits, the foundational moment for digital electronics. The first electronic binary computers came in the late 1940s (Manchester Baby June 1948, EDSAC May 1949).
What's the difference between ASCII and UTF-8?
ASCII (1963) is a 7-bit fixed-width encoding covering 128 characters: the basic English alphabet, digits, common punctuation and control codes. UTF-8 (Thompson + Pike, 1992) is a variable-length encoding of the entire Unicode standard (~155,000 characters as of Unicode 16.0 in September 2024). UTF-8 is backward-compatible with ASCII: every valid ASCII byte sequence is also valid UTF-8. The difference matters above the ASCII range, where é, 中 and 🎉 all need multiple bytes in UTF-8. As of 2026, ~98.9% of web pages declare UTF-8 as their encoding (per W3Techs).
Are my conversions sent anywhere?
No. Conversion runs entirely in your browser via JavaScript. The text and binary you paste never cross the network. To verify, watch DevTools' Network tab while you click Convert, or take the page offline (airplane mode) after it loads; the tool still works.