Base Encoding Guide


Base encoding converts binary data into text using a limited set of characters. This guide covers the most common encoding formats, their algorithms, and practical applications.

Try the interactive Base Encoding Tool →

What is Base Encoding?

Base encoding represents binary data as text using printable ASCII characters. It’s not encryption—it provides no security. Use it for data transmission and compatibility, not confidentiality.

When to Use Base Encoding

Security Warning

Base encoding is not encryption. Anyone can decode it instantly. For security:

Format Comparison

| Format | Alphabet Size | Padding | Case Sensitive | Overhead | Best For |
| --- | --- | --- | --- | --- | --- |
| Base64 | 64 chars | Yes (=) | Yes | ~33% | General purpose, MIME, data URIs |
| Base64 URL | 64 chars | No | Yes | ~33% | URLs, filenames, JWT tokens |
| Base32 | 32 chars | Yes (=) | No | ~60% | Human input, TOTP secrets, Tor |
| Base58 | 58 chars | No | Yes | ~37% | Cryptocurrency, IPFS, short codes |
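
The overhead figures follow from how many bits each character carries: an alphabet of N characters holds log2(N) bits per character, while every input byte holds 8 bits. A minimal Python sketch of that arithmetic, ignoring padding (the function name `overhead_percent` is only illustrative):

```python
import math

def overhead_percent(alphabet_size: int) -> float:
    """Approximate size increase, in percent, when encoding raw bytes.

    Each output character carries log2(alphabet_size) bits and each
    input byte carries 8 bits. Padding characters are ignored.
    """
    bits_per_char = math.log2(alphabet_size)
    return (8 / bits_per_char - 1) * 100

for name, size in [("Base64", 64), ("Base32", 32), ("Base58", 58)]:
    print(f"{name}: ~{overhead_percent(size):.0f}%")
```

For short inputs the real overhead can be higher, because padding rounds the output up to a full block.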

Base64 (Standard)

Overview

Base64 is the most widely used encoding format. It converts every 3 bytes (24 bits) into 4 Base64 characters (6 bits each).

Alphabet: ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/

Padding: Uses = when input length isn’t divisible by 3

How it Works

  1. Convert input text to binary (UTF-8)
  2. Split binary into 6-bit groups
  3. Map each 6-bit value (0-63) to a character
  4. Add padding if needed

Example Encoding

Text:     "Hi!"
UTF-8:    0x48 0x69 0x21
Binary:   01001000 01101001 00100001
Grouped:  010010 000110 100100 100001
Decimal:  18     6      36     33
Base64:   S      G      k      h
Result:   "SGkh"

Example with Padding

Text:     "Hi"
UTF-8:    0x48 0x69
Binary:   01001000 01101001
Grouped:  010010 000110 1001[00] (2 bits short, pad with 0s)
Decimal:  18     6      36
Base64:   S      G      k
Padding:  Need 1 more char to make multiple of 4
Result:   "SGk="
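
The four steps can be sketched by hand in Python. This is an illustration of the algorithm rather than a replacement for a library: the helper name `base64_encode` is made up for this example, and the result is cross-checked against the standard library's `base64.b64encode`.

```python
import base64

ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"

def base64_encode(data: bytes) -> str:
    """Encode bytes by hand: 6-bit groups, zero-fill, then '=' padding."""
    # Step 1: concatenate the bits of every byte.
    bits = "".join(f"{byte:08b}" for byte in data)
    # Step 2: zero-fill the last group so the bit count is a multiple of 6.
    bits += "0" * (-len(bits) % 6)
    # Step 3: map each 6-bit value (0-63) to its alphabet character.
    chars = [ALPHABET[int(bits[i:i + 6], 2)] for i in range(0, len(bits), 6)]
    # Step 4: add '=' until the output length is a multiple of 4.
    chars.append("=" * (-len(chars) % 4))
    return "".join(chars)

raw = "Hi".encode("utf-8")
encoded = base64_encode(raw)
print(encoded)  # SGk=

# Cross-check against the standard library.
assert encoded == base64.b64encode(raw).decode("ascii")
```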

Use Cases

| Application | Why Base64 |
| --- | --- |
| Email attachments (MIME) | SMTP was designed for 7-bit ASCII text, so binary data must be encoded |
| Data URIs (data:image/png;base64,...) | Embed images/fonts in HTML/CSS |
| HTTP Basic Auth | Authorization: Basic base64(user:pass) |
| JSON/XML binary data | Both formats are text-based |
| Certificates (PEM format) | Base64 body between -----BEGIN CERTIFICATE----- and -----END CERTIFICATE----- lines |
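
As a concrete case, here is a sketch of building an HTTP Basic Auth header value with Python's standard `base64` module. The helper name `basic_auth_header` and the credentials are placeholders for illustration. The second half shows that the header is trivially reversible, which is why Basic Auth relies on HTTPS for confidentiality.

```python
import base64

def basic_auth_header(user: str, password: str) -> str:
    """Build the value of an HTTP Basic Authorization header."""
    credentials = f"{user}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(credentials).decode("ascii")

header = basic_auth_header("user", "pass")
print(header)  # Basic dXNlcjpwYXNz

# Anyone who sees the header can recover the credentials.
token = header.split(" ", 1)[1]
print(base64.b64decode(token).decode("utf-8"))  # user:pass
```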

Advantages & Disadvantages

Advantages:

Disadvantages:

Base64 URL-safe

Overview

A Base64 variant designed for URLs and filenames. Replaces problematic characters and removes padding.

Alphabet: ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_

Padding: None (removed)

Changes from Standard Base64: + becomes -, / becomes _, and trailing = padding is removed.

How it Works

  1. Encode using standard Base64
  2. Replace + with -
  3. Replace / with _
  4. Remove trailing = padding

Example

Text:           "Hi!"
Standard Base64: "SGkh"
URL-safe:       "SGkh" (no changes needed in this case)

Text:           ">>?"
Standard Base64: "Pj4/"
URL-safe:       "Pj4_" (/ becomes _)
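
In Python, the standard library's `base64.urlsafe_b64encode` handles the character swap but keeps the padding, so a sketch of the unpadded form has to strip it and restore it before decoding. The helper names `b64url_encode` and `b64url_decode` are illustrative.

```python
import base64

def b64url_encode(data: bytes) -> str:
    """URL-safe Base64 without trailing '=' padding."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")

def b64url_decode(text: str) -> bytes:
    """Decode unpadded URL-safe Base64 by restoring the padding first."""
    padded = text + "=" * (-len(text) % 4)
    return base64.urlsafe_b64decode(padded)

print(b64url_encode(b">>?"))  # Pj4_
print(b64url_encode(b"Hi"))   # SGk
print(b64url_decode("SGk"))   # b'Hi'
```

Restoring the padding works because the number of missing = characters can always be derived from the encoded length.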

Use Cases

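| Application | Why Base64 URL |
| --- | --- |
| JWT tokens | Tokens often passed in URL parameters |
| URL parameters | No encoding needed for - and _ |
| Filenames | Avoids / (the path separator); names that differ only by case can still collide on case-insensitive filesystems |
| URL shorteners | Compact and clean URLs |
| OAuth state parameters | Passed as URL query params |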

Advantages

Base32

Overview

Base32 uses uppercase letters and digits 2-7, making it case-insensitive and human-friendly. It converts every 5 bytes (40 bits) into 8 Base32 characters (5 bits each).

Alphabet: ABCDEFGHIJKLMNOPQRSTUVWXYZ234567

Padding: Uses = when input length isn’t divisible by 5

Case: Insensitive (decoding accepts lowercase)

How it Works

  1. Convert input to binary (UTF-8)
  2. Split binary into 5-bit groups
  3. Map each 5-bit value (0-31) to a character
  4. Add padding to make length a multiple of 8

Example Encoding

Text:     "Hi"
UTF-8:    0x48 0x69
Binary:   01001000 01101001
Grouped:  01001 00001 10100 1[0000] (pad to 5 bits)
Decimal:  9     1     20    16
Base32:   J     B     U     Q
Padding:  Need 4 more chars to make 8
Result:   "JBUQ===="
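
The same example can be reproduced with Python's standard `base64` module. The sketch also shows the case-insensitive decoding mentioned above, which in this library is opt-in through the `casefold` argument.

```python
import base64

encoded = base64.b32encode("Hi".encode("utf-8")).decode("ascii")
print(encoded)  # JBUQ====

# casefold=True lets the decoder accept lowercase input.
decoded = base64.b32decode("jbuq====", casefold=True)
print(decoded)  # b'Hi'
```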

Character Choice

Why A-Z and 2-7?

| Excluded | Reason |
| --- | --- |
| 0 (zero) | Looks like O (letter) |
| 1 (one) | Looks like I or l |
| 8, 9 | Not needed: 26 letters plus the six digits 2-7 already give 32 symbols |

This makes Base32 ideal for:

Use Cases

| Application | Why Base32 |
| --- | --- |
| TOTP/HOTP secrets (2FA) | Users manually enter seeds, case-insensitive helps |
| Tor hidden services | .onion addresses use Base32 |
| Recovery codes | Easy to read and type correctly |
| License keys | Unambiguous characters reduce support requests |
| DNS labels | Case-insensitive, RFC 4648 standard |

Advantages & Disadvantages

Advantages:

Disadvantages:

Base32 Alphabet Variants

RFC 4648 defines the standard Base32 alphabet, but several variants exist for specific use cases:

RFC 4648 Standard

Alphabet: ABCDEFGHIJKLMNOPQRSTUVWXYZ234567

The most common variant, used in:

RFC 4648 Extended Hex

Alphabet: 0123456789ABCDEFGHIJKLMNOPQRSTUV

Uses digits first, then letters:

z-base-32

Alphabet: ybndrfg8ejkmcpqxot1uwisza345h769

Designed for human usability:

Crockford’s Base32

Alphabet: 0123456789ABCDEFGHJKMNPQRSTVWXYZ

Created by Douglas Crockford for human-readable IDs:

Special features:

Bech32

Alphabet: qpzry9x8gf2tvdw0s3jn54khce6mua7l

Designed for Bitcoin SegWit addresses:

Key design choices:

Choosing a Base32 Variant

| Variant | Best For | Key Feature |
| --- | --- | --- |
| RFC 4648 | TOTP, general use | Standard, widest support |
| Extended Hex | Hex-compatible systems | Preserves sort order |
| z-base-32 | Verbal communication | Maximum clarity |
| Crockford | Short IDs, human input | Error tolerance |
| Bech32 | Crypto addresses | Error detection |
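
Several of these variants differ from the standard only in which character stands for each 5-bit value, so one way to experiment with them is to encode with the standard alphabet and swap characters by position. The sketch below assumes whole-byte input, the same 5-bit grouping as RFC 4648, and no padding. The helper name `encode_variant` is illustrative. Bech32 is left out because its address format also adds a human-readable prefix and a checksum, so an alphabet swap alone does not produce a valid address.

```python
import base64

RFC4648 = "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567"
VARIANTS = {
    "base32hex": "0123456789ABCDEFGHIJKLMNOPQRSTUV",
    "z-base-32": "ybndrfg8ejkmcpqxot1uwisza345h769",
    "crockford": "0123456789ABCDEFGHJKMNPQRSTVWXYZ",
}

def encode_variant(data: bytes, variant: str) -> str:
    """Encode with the standard alphabet, then swap characters by position.

    Padding is stripped for simplicity.
    """
    standard = base64.b32encode(data).decode("ascii").rstrip("=")
    return standard.translate(str.maketrans(RFC4648, VARIANTS[variant]))

print(encode_variant(b"Hi", "base32hex"))  # 91KG
print(encode_variant(b"Hi", "z-base-32"))  # jbwo
print(encode_variant(b"Hi", "crockford"))  # 91MG
```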

Base58

Overview

Base58 excludes visually ambiguous characters, making it ideal for manual entry and copy-paste. No padding needed.

Alphabet: 123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz

Padding: None

Case: Sensitive

Character Choice

Excluded Characters:

| Character | Reason |
| --- | --- |
| 0 (zero) | Looks like O (capital o) |
| O (capital o) | Looks like 0 (zero) |
| I (capital i) | Looks like l (lowercase L) or 1 |
| l (lowercase L) | Looks like I or 1 |

Advantages of this alphabet:

How it Works

Unlike Base64/Base32, Base58 treats input as a large number:

  1. Convert input bytes to a big integer
  2. Repeatedly divide by 58
  3. Map remainders to alphabet characters
  4. Preserve leading zero bytes (each one is encoded as a leading '1', the first alphabet character)

Example Encoding

Text:     "Hi"
UTF-8:    0x48 0x69
Integer:  0x4869 = 18537 (decimal)

18537 ÷ 58 = 319 remainder 35 → 'c'
319 ÷ 58 = 5 remainder 29   → 'W'
5 ÷ 58 = 0 remainder 5     → '6'

Result: "6Wc" (reversed remainders)
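
The division steps translate directly into Python, which has arbitrary-precision integers built in. This is a minimal sketch (the function name `base58_encode` is illustrative), using the Bitcoin alphabet listed above.

```python
ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"

def base58_encode(data: bytes) -> str:
    """Encode bytes as Base58 using repeated division by 58."""
    number = int.from_bytes(data, "big")
    digits = []
    while number > 0:
        number, remainder = divmod(number, 58)
        digits.append(ALPHABET[remainder])
    # Each leading zero byte is preserved as the first alphabet character, '1'.
    leading_zeros = len(data) - len(data.lstrip(b"\x00"))
    return "1" * leading_zeros + "".join(reversed(digits))

print(base58_encode(b"Hi"))
print(base58_encode(b"\x00\x00Hi"))  # same digits, prefixed with "11"
```

Because the whole input is treated as one number, the cost grows quadratically with input length, which is one reason Base58 is typically used for short values such as hashes and keys.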

Base58Check (Bitcoin variant)

Bitcoin uses Base58Check, which prepends a version byte (identifying the address type) and appends a checksum (the first 4 bytes of SHA256(SHA256(version + payload))) for error detection.

Example Bitcoin Address Structure

Version   Payload         Checksum
1 byte    20 bytes        4 bytes
  ↓          ↓               ↓
[00]      [pubkey hash]   [checksum]

Encoded as Base58Check:

1A1zP1eP5QGefi2DMPTfTL5SLmv7DivfNa
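
A sketch of that structure in Python, using `hashlib` for the double SHA-256. All function names are illustrative, and the all-zero payload is a placeholder rather than a real public-key hash.

```python
import hashlib
from typing import Tuple

ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"

def base58_encode(data: bytes) -> str:
    number = int.from_bytes(data, "big")
    digits = []
    while number > 0:
        number, remainder = divmod(number, 58)
        digits.append(ALPHABET[remainder])
    leading_zeros = len(data) - len(data.lstrip(b"\x00"))
    return "1" * leading_zeros + "".join(reversed(digits))

def base58_decode(text: str) -> bytes:
    number = 0
    for char in text:
        number = number * 58 + ALPHABET.index(char)
    body = number.to_bytes((number.bit_length() + 7) // 8, "big")
    leading_ones = len(text) - len(text.lstrip("1"))
    return b"\x00" * leading_ones + body

def checksum(data: bytes) -> bytes:
    """First 4 bytes of SHA256(SHA256(data))."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()[:4]

def base58check_encode(version: int, payload: bytes) -> str:
    data = bytes([version]) + payload
    return base58_encode(data + checksum(data))

def base58check_decode(text: str) -> Tuple[int, bytes]:
    raw = base58_decode(text)
    data, check = raw[:-4], raw[-4:]
    if checksum(data) != check:
        raise ValueError("checksum mismatch")
    return data[0], data[1:]

# Placeholder 20-byte payload (all zeros) standing in for a public-key hash.
address = base58check_encode(0x00, bytes(20))
version, payload = base58check_decode(address)
print(address, version, payload == bytes(20))
```

A mistyped character almost always changes the decoded bytes enough for the checksum comparison to fail, so `base58check_decode` raises instead of returning a wrong payload.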

Use Cases

| Application | Why Base58 |
| --- | --- |
| Bitcoin/crypto addresses | Unambiguous, includes checksum variant |
| IPFS content IDs (v0) | Human-readable content addressing |
| Blockchain explorers | Short, URL-safe identifiers |
| Vanity addresses | Users can “mine” custom prefixes |
| Short codes | More compact than Base32 |

Advantages & Disadvantages

Advantages:

Disadvantages:

Custom Alphabets

When to Use Custom Alphabets

Create a custom alphabet when:

Not for security. Custom alphabets provide obscurity, not cryptographic protection.

How to Design a Custom Alphabet

1. Choose your character set

Consider what constraints you have:

2. Pick the alphabet size

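Larger alphabets = more compact encoding: each character carries log2(N) bits, so Base16 carries 4 bits per character, Base32 carries 5, and Base64 carries 6.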

3. Order matters for sorting

If you need encoded values to sort the same as the original data, put the characters in ascending ASCII order, as the Extended Hex Base32 alphabet does.
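
To see the effect of alphabet order, the sketch below encodes a few single-byte values that are already in ascending order and checks whether the encoded strings sort the same way. It compares the standard Base32 alphabet with the Extended Hex one by swapping characters by position. The helper name `encode` is illustrative, and the comparison assumes equal-length inputs.

```python
import base64

RFC4648 = "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567"
BASE32HEX = "0123456789ABCDEFGHIJKLMNOPQRSTUV"  # already in ASCII order

def encode(data: bytes, alphabet: str) -> str:
    """Base32-encode, then swap in the given alphabet position by position."""
    standard = base64.b32encode(data).decode("ascii").rstrip("=")
    return standard.translate(str.maketrans(RFC4648, alphabet))

values = [bytes([n]) for n in (0, 100, 200, 255)]  # ascending byte values

standard = [encode(v, RFC4648) for v in values]
ordered = [encode(v, BASE32HEX) for v in values]

print(sorted(standard) == standard)  # False: '2'-'7' sort before 'A'-'Z' in ASCII
print(sorted(ordered) == ordered)    # True
```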

Design Checklist

Common Examples

Alphanumeric (Base36) - Case-insensitive, no special chars:

0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ

No vowels (Base29: digits 2-9 plus 21 consonants) - Avoid accidental words:

23456789BCDFGHJKLMNPQRSTVWXYZ

DNA encoding (Base4):

ACGT

Lowercase hex (Base16):

0123456789abcdef
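
Any of these alphabets can be tried out with a generic encoder that treats the input as one big integer, the same approach Base58 uses. This is a sketch under stated assumptions: the function name `encode_with_alphabet` is illustrative, and keeping leading zero bytes as repeats of the first alphabet character is a convention borrowed from Base58, not something every scheme does.

```python
def encode_with_alphabet(data: bytes, alphabet: str) -> str:
    """Encode bytes as a big integer written in base len(alphabet)."""
    if len(alphabet) < 2 or len(set(alphabet)) != len(alphabet):
        raise ValueError("alphabet needs at least 2 unique characters")
    base = len(alphabet)
    number = int.from_bytes(data, "big")
    digits = []
    while number > 0:
        number, remainder = divmod(number, base)
        digits.append(alphabet[remainder])
    # Assumed convention: one zero digit per leading zero byte, as in Base58.
    leading_zeros = len(data) - len(data.lstrip(b"\x00"))
    return alphabet[0] * leading_zeros + "".join(reversed(digits))

print(encode_with_alphabet(b"Hi", "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"))  # EAX
print(encode_with_alphabet(b"Hi", "ACGT"))                                  # CAGACGGC
print(encode_with_alphabet(b"Hi", "0123456789abcdef"))                      # 4869
```

The uniqueness check catches the most common design mistake, a duplicated character, which would make decoding ambiguous.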

Learn More


For specifications, see RFC 4648 (Base64, Base32, Base16) and Bitcoin Base58Check.