Binary Translator: text to binary, done right
Convert text to binary and back with correct UTF-8 encoding, inspect every character’s bytes, and switch between binary, decimal, hexadecimal, and octal. Accurate, private, and free.
Text and binary, without the bugs
Most online binary translators quietly mishandle anything beyond plain English. They read each character's code unit directly, which works for A to Z but produces wrong bytes for accented letters and breaks entirely on emoji, because those characters are not single bytes. This translator encodes text as true UTF-8, the standard the web runs on, so cafe with an accent and a smiling emoji convert to the exact bytes a computer would store and decode back perfectly. Type on either side and the other updates live; a per-character table shows every character's Unicode code point, decimal, hex, and binary; and a second mode converts whole numbers between binary, decimal, hexadecimal, and octal. Everything runs in your browser, with nothing uploaded.
How do you convert text to binary?
Each character is assigned a number by a standard, then that number is written in base 2. In Unicode, the letter H is code point 72, which UTF-8 stores as the single byte 01001000, and i is 105, or 01101001, so Hi becomes 01001000 01101001. Characters beyond the basic ASCII set use two to four bytes: an accented e (é) is two bytes, and most emoji are four. To go back, split the binary into 8-bit bytes, read each as a number, and decode those bytes as UTF-8. The translator above does all of this instantly in both directions.
The "How binary represents text" section below breaks down the character-to-bits pipeline step by step.
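The round trip described above can be sketched in a few lines of Python. This is an illustrative sketch, not the translator's own code; the function names `text_to_binary` and `binary_to_text` are made up for the example, and it assumes space-separated 8-bit groups.

```python
def text_to_binary(text: str) -> str:
    """Encode text as UTF-8 and write each byte as eight bits."""
    return " ".join(format(byte, "08b") for byte in text.encode("utf-8"))


def binary_to_text(bits: str) -> str:
    """Read space-separated 8-bit groups back into bytes, then decode as UTF-8."""
    data = bytes(int(group, 2) for group in bits.split())
    return data.decode("utf-8")


print(text_to_binary("Hi"))                 # 01001000 01101001
print(text_to_binary("\u00e9"))             # 11000011 10101001 (é, two bytes)
print(binary_to_text("01001000 01101001"))  # Hi
```

Because the encoding step works on bytes rather than characters, the same two functions handle an emoji (four bytes) without any special case.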
How to use the binary translator
Type or paste your text
Enter text in the left box and the binary appears on the right instantly, one byte per character for plain English and more for accented letters and emoji. The character count, byte count, and bit count update as you type.
Or paste binary to decode
Paste binary into the right box to read it back as text. Spaces, commas, or no separators all work; the translator strips formatting, regroups the bits into bytes, and decodes them as UTF-8.
Choose your format
Pick 8-bit or 7-bit grouping and a separator of space, comma, or none. Eight-bit with spaces is the common, readable default; seven-bit suits classic ASCII exercises.
Read the per-character breakdown
The table below the boxes shows every character with its Unicode code point, decimal value, hex, and binary, which turns the tool into a learning aid rather than a black box.
Copy, download, or share
Copy either side, download the result as a text file, or copy a shareable link that reopens the exact conversion. Nothing is stored on a server; the link carries the text itself.
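The grouping and separator options, and the forgiving decode step, could look roughly like the sketch below. The names `format_bits` and `parse_bits` are hypothetical. Rejecting any character other than 0, 1, whitespace, or a comma is an assumption made for the example, not a documented behavior of the tool.

```python
import re


def format_bits(data: bytes, width: int = 8, sep: str = " ") -> str:
    """Write each byte as a zero-padded group of `width` bits (7 or 8)."""
    if width not in (7, 8):
        raise ValueError("width must be 7 or 8")
    if width == 7 and any(b > 0x7F for b in data):
        raise ValueError("7-bit grouping only fits ASCII (values 0 to 127)")
    return sep.join(format(b, f"0{width}b") for b in data)


def parse_bits(text: str, width: int = 8) -> tuple[bytes, int]:
    """Strip spaces and commas, regroup by `width`, return (bytes, leftover_bits)."""
    if width not in (7, 8):
        raise ValueError("width must be 7 or 8")
    bits = re.sub(r"[\s,]", "", text)
    if re.search(r"[^01]", bits):
        # Assumption for this sketch: anything else is treated as an input error.
        raise ValueError("input may contain only 0, 1, spaces, and commas")
    leftover = len(bits) % width
    usable = bits[: len(bits) - leftover]
    values = [int(usable[i:i + width], 2) for i in range(0, len(usable), width)]
    return bytes(values), leftover


data, leftover = parse_bits("01001000,01101001")
print(data.decode("utf-8"), leftover)  # Hi 0
print(format_bits(b"Hi", width=7))     # 1001000 1101001
```

Returning the leftover bit count instead of silently dropping the tail lets the caller warn about truncated or misaligned input.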
How binary represents text
Computers store everything as bits, so text needs a numbering system: a character encoding. A character first maps to a number called a code point, that number is encoded into one or more bytes, and each byte is eight bits. The pipeline below traces a single character all the way down to the ones and zeros, and the place-value chart shows how eight bits add up to a byte's value.
The same four steps run for every character. For plain ASCII the code point fits in one byte; for the rest of Unicode, UTF-8 spreads it across two to four bytes using a defined bit pattern.
An 8-bit byte is read as a sum of powers of two. Each 1 contributes its column value; here 01001000 is 64 plus 8, which is 72, the code point for the capital letter H.
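The place-value sum can be checked with a short Python sketch. The helper name `place_values` is invented for this example.

```python
def place_values(byte_bits: str) -> list[int]:
    """Return the column value contributed by each 1 in an 8-bit string."""
    if len(byte_bits) != 8 or set(byte_bits) - {"0", "1"}:
        raise ValueError("expected exactly eight 0/1 digits")
    # The leftmost bit is worth 2**7 = 128, the rightmost 2**0 = 1.
    return [2 ** (7 - i) for i, bit in enumerate(byte_bits) if bit == "1"]


parts = place_values("01001000")
print(parts, sum(parts), chr(sum(parts)))  # [64, 8] 72 H
```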
UTF-8: why emoji and accents work here
ASCII only ever defined 128 characters, enough for English but not for the world's writing systems or emoji. UTF-8 extends it without breaking it: the original ASCII characters stay one byte, and everything else uses two, three, or four bytes with a self-describing bit pattern. A translator that assumes one byte per character will corrupt any text containing these multi-byte characters, a common bug in free binary tools. This one uses the browser's real UTF-8 encoder, so the bytes it shows are the bytes a computer actually stores.
| Code-point range | Bytes | Covers | Example |
|---|---|---|---|
| U+0000 to U+007F | 1 | Basic Latin (ASCII): A–Z, digits, punctuation | A → 01000001 |
| U+0080 to U+07FF | 2 | Latin accents, Greek, Cyrillic, Arabic, Hebrew | é → 11000011 10101001 |
| U+0800 to U+FFFF | 3 | Most CJK, many symbols | € → 11100010 10000010 10101100 |
| U+10000 to U+10FFFF | 4 | Emoji, rare scripts, supplementary planes | 😀 → 11110000 10011111 10011000 10000000 |
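To make the "self-describing bit pattern" concrete, here is a hand-rolled encoder for a single code point, cross-checked against Python's built-in UTF-8 codec. It is a teaching sketch (the name `utf8_encode` is invented here); in practice the built-in encoder is the one to use.

```python
def utf8_encode(code_point: int) -> bytes:
    """Encode one Unicode code point by hand, following the four UTF-8 ranges."""
    if code_point < 0 or code_point > 0x10FFFF or 0xD800 <= code_point <= 0xDFFF:
        raise ValueError("not an encodable Unicode scalar value")
    if code_point <= 0x7F:        # 0xxxxxxx
        return bytes([code_point])
    if code_point <= 0x7FF:       # 110xxxxx 10xxxxxx
        return bytes([0xC0 | (code_point >> 6),
                      0x80 | (code_point & 0x3F)])
    if code_point <= 0xFFFF:      # 1110xxxx 10xxxxxx 10xxxxxx
        return bytes([0xE0 | (code_point >> 12),
                      0x80 | ((code_point >> 6) & 0x3F),
                      0x80 | (code_point & 0x3F)])
    # 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return bytes([0xF0 | (code_point >> 18),
                  0x80 | ((code_point >> 12) & 0x3F),
                  0x80 | ((code_point >> 6) & 0x3F),
                  0x80 | (code_point & 0x3F)])


for ch in "A\u00e9\u20ac\U0001F600":  # A, é, €, 😀
    encoded = utf8_encode(ord(ch))
    assert encoded == ch.encode("utf-8")  # cross-check against Python's encoder
    print(f"U+{ord(ch):04X}", " ".join(f"{b:08b}" for b in encoded))
```

The leading bits of the first byte (0, 110, 1110, or 11110) announce how many bytes the character occupies, and every continuation byte starts with 10, which is what lets a decoder resynchronize after a damaged byte.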
Binary, decimal, hex, and octal
Binary is base 2, but the same value can be written in any base, and programmers lean on hexadecimal and octal because they pack binary neatly: one hex digit is exactly four bits, one octal digit exactly three. The Number bases tab converts a value between all four at once. The table shows the same small numbers across every base so the pattern is visible.
| Decimal | Binary | Hex | Octal |
|---|---|---|---|
| 0 | 0 | 0 | 0 |
| 1 | 1 | 1 | 1 |
| 2 | 10 | 2 | 2 |
| 3 | 11 | 3 | 3 |
| 4 | 100 | 4 | 4 |
| 5 | 101 | 5 | 5 |
| 6 | 110 | 6 | 6 |
| 7 | 111 | 7 | 7 |
| 8 | 1000 | 8 | 10 |
| 9 | 1001 | 9 | 11 |
| 10 | 1010 | A | 12 |
| 11 | 1011 | B | 13 |
| 12 | 1100 | C | 14 |
| 13 | 1101 | D | 15 |
| 14 | 1110 | E | 16 |
| 15 | 1111 | F | 17 |
| 16 | 10000 | 10 | 20 |
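Any row of the table above can be reproduced with Python's built-in `int` and `format`. The sketch below is one way to do it; `to_all_bases` is a hypothetical helper name, and it assumes non-negative whole numbers.

```python
def to_all_bases(value: str, base: int) -> dict[str, str]:
    """Parse a whole number written in `base` and render it in all four bases."""
    n = int(value, base)  # raises ValueError if a digit is invalid for the base
    if n < 0:
        raise ValueError("this sketch handles non-negative whole numbers only")
    return {
        "decimal": str(n),
        "binary": format(n, "b"),
        "hex": format(n, "X"),
        "octal": format(n, "o"),
    }


print(to_all_bases("1000", 2))
# {'decimal': '8', 'binary': '1000', 'hex': '8', 'octal': '10'}
print(to_all_bases("1000", 16))
# {'decimal': '4096', 'binary': '1000000000000', 'hex': '1000', 'octal': '10000'}
```

The two calls use the same digits with different bases, which is why a written number needs its base stated.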
ASCII to binary reference table
The printable ASCII characters and their binary codes are the backbone of text encoding, and in UTF-8 these are identical to the single-byte values. This is a working subset; the translator handles the full Unicode range, but these are the codes worth recognizing on sight.
| Char | Decimal | Hex | Binary | Char | Decimal | Hex | Binary |
|---|---|---|---|---|---|---|---|
| space | 32 | 0x20 | 00100000 | B | 66 | 0x42 | 01000010 |
| ! | 33 | 0x21 | 00100001 | Z | 90 | 0x5A | 01011010 |
| 0 | 48 | 0x30 | 00110000 | a | 97 | 0x61 | 01100001 |
| 1 | 49 | 0x31 | 00110001 | b | 98 | 0x62 | 01100010 |
| 9 | 57 | 0x39 | 00111001 | z | 122 | 0x7A | 01111010 |
| @ | 64 | 0x40 | 01000000 | { | 123 | 0x7B | 01111011 |
| A | 65 | 0x41 | 01000001 | ~ | 126 | 0x7E | 01111110 |
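Rows like these, extended to any Unicode character, can be generated with a short sketch such as the one below. The function name `breakdown` and the exact column formatting are assumptions chosen to mirror the table above.

```python
def breakdown(text: str) -> list[tuple[str, str, int, str, str]]:
    """Return (char, code point, decimal, hex, UTF-8 binary) for each character."""
    rows = []
    for ch in text:
        cp = ord(ch)
        bits = " ".join(f"{b:08b}" for b in ch.encode("utf-8"))
        rows.append((ch, f"U+{cp:04X}", cp, f"0x{cp:02X}", bits))
    return rows


for row in breakdown("A\u00e9"):
    print(row)
# ('A', 'U+0041', 65, '0x41', '01000001')
# ('é', 'U+00E9', 233, '0xE9', '11000011 10101001')
```

For ASCII the binary column equals the code point written in eight bits. For anything else the binary column shows the UTF-8 bytes, which differ from the code point's own binary form.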
What most binary translators get wrong
| Capability | Typical translator | This translator |
|---|---|---|
| Accented characters | Often wrong bytes | Correct UTF-8 multi-byte encoding |
| Emoji and symbols | Corrupted or dropped | Full four-byte UTF-8 round-trip |
| Per-character breakdown | Not shown | Live table with code point, decimal, hex, binary |
| Decode tolerant of formatting | Needs exact spacing | Accepts spaces, commas, or none |
| Number-base conversion | Separate tool | Binary, decimal, hex, octal in one tab |
| Bit-grouping options | Fixed | 8-bit or 7-bit, choosable separator |
| Shareable conversion | Rare | Encoded in the link |
| Privacy | Text often sent to a server | Fully client-side; works offline |
One translator, many reasons
Computer science homework
Encoding exercises ask you to convert text to binary by hand, then check the answer. The per-character table shows the working, code point to decimal to binary, so you can verify each step rather than just trusting a final string, and switch to 7-bit for classic ASCII assignments.
Debugging encodings
When text arrives garbled, the question is usually which bytes are really there. Paste it here to see the exact UTF-8 bytes and code points, spot the stray byte, and confirm whether a string is valid ASCII or carries multi-byte characters that a one-byte assumption would break.
Teaching how text is stored
The pipeline from character to code point to bytes to bits is abstract until it is visible. Type a word, show the table on screen, add an emoji to reveal four bytes, and the idea that everything is numbers underneath stops being a slogan.
Codes, ciphers, and curiosity
Binary shows up in puzzle hunts, escape rooms, and maker projects. Encode a hidden message, decode one you found, or convert between bases for a microcontroller register, all without installing anything or sending the text anywhere.
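For the debugging scenario described above, the same inspection can be done in Python. The sketch below simulates one familiar failure, UTF-8 bytes read back as Latin-1 (one byte per character), and then reverses it. The choice of Latin-1 as the wrong decoder and the helper name `describe` are assumptions for illustration only.

```python
def describe(text: str) -> dict:
    """Summarize a string: ASCII-only flag, character count, UTF-8 bytes in hex."""
    data = text.encode("utf-8")
    return {
        "ascii_only": text.isascii(),
        "chars": len(text),
        "bytes": len(data),
        "hex": data.hex(" "),
    }


original = "caf\u00e9"  # café
# Simulate the bug: UTF-8 bytes decoded with a one-byte-per-character encoding.
garbled = original.encode("utf-8").decode("latin-1")
# Reverse it: recover the raw bytes, then decode them as UTF-8.
repaired = garbled.encode("latin-1").decode("utf-8")

print(describe(original))
print(repr(garbled), repr(repaired))
```

Seeing five bytes for a four-character word is the clue that a multi-byte character is present and that any one-byte-per-character step in the pipeline will mangle it.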
Six binary conversion mistakes
Assuming one byte per character
It holds for English but fails the moment an accent or emoji appears, and a one-byte assumption turns those into wrong numbers or question marks.
Encode as UTF-8, which uses one to four bytes per character, as this translator does.
Mixing up bits and bytes
A byte is eight bits, so a 5-character ASCII word is 5 bytes but 40 bits; quoting one when you mean the other is a constant source of confusion.
Watch the live counters here: they report characters, bytes, and bits separately.
Dropping leading zeros
The code for a space is 100000, but as a byte it must be 00100000; trimming the leading zeros breaks fixed-width decoding.
Always pad each byte to its full width. The translator pads to 8 bits by default.
Confusing the bases
The string 1000 is eight in binary, a thousand in decimal, 512 in octal, and 4,096 in hexadecimal, so a written number is ambiguous until its base is stated.
Label the base, or use the Number bases tab to see all four at once.
Regrouping binary wrongly
Decoding requires splitting the stream into the same size groups it was encoded in; an off-by-one in grouping scrambles every character after it.
Keep the byte width consistent; this tool regroups by your chosen width and flags leftover bits.
Trusting tools that hide their work
A translator that only emits a final string gives you no way to catch an encoding error or learn from the result.
Prefer a tool that shows the per-character breakdown, so every byte is accountable.
Binary terms, defined
- Bit: A binary digit, 0 or 1, the smallest unit of information. Eight bits make a byte.
- Byte: A group of eight bits, able to represent 256 values (0 to 255). One ASCII character is one byte.
- Binary (base 2): A number system using only 0 and 1, where each position is a power of two: 1, 2, 4, 8, and so on.
- Code point: The number a character encoding assigns to a character, written like U+0041 for the letter A (decimal 65).
- ASCII: A 1960s standard mapping 128 characters to the numbers 0 to 127, covering English letters, digits, and punctuation.
- Unicode: The universal character set covering virtually every writing system and emoji, assigning each a unique code point.
- UTF-8: The dominant way to encode Unicode as bytes: ASCII stays one byte, other characters use two to four, and the web runs on it.
- Hexadecimal (base 16): A base-16 system using 0 to 9 then A to F, where one hex digit equals exactly four bits, making it a compact stand-in for binary.