QR Code Internals: Encoding, Error Correction, and Capacity

16 March, 2026 Algorithms

QR codes are everywhere - on product packaging, restaurant menus, boarding passes, and payment terminals. Most developers treat them as black boxes: feed in a URL, get an image. But QR codes are a beautifully engineered standard with fascinating internals: polynomial arithmetic over finite fields, eight distinct mask patterns, four encoding modes, and a versioning system that scales from a 21x21 grid to a 177x177 matrix.

This article opens the black box. Understanding how QR codes work helps you make better decisions about capacity, error correction levels, print sizes, and logo embedding - and helps you build better QR code generators.

What Is a QR Code

QR (Quick Response) code was invented in 1994 by Masahiro Hara at Denso Wave, a subsidiary of Toyota supplier Denso. The original use case was tracking automotive parts in manufacturing. The name "Quick Response" refers to the design goal: fast decoding by industrial optical readers, not just handheld scanners.

The standard is defined by ISO/IEC 18004:2015. Denso Wave deliberately did not enforce their patent, making QR codes free to use. This open approach is a large part of why QR codes became the dominant 2D barcode format worldwide.

A QR code is a matrix barcode: a 2D grid of black and white square modules (pixels). Unlike 1D barcodes (which encode data in line widths and spacings), QR codes encode data in two dimensions, achieving much higher data density.

Key structural elements:

Finder patterns - three 7x7 black squares in corners (all except bottom-right), used to locate and orient the code
Separators - one-module white borders around each finder pattern
Timing patterns - alternating black/white stripes connecting the finder patterns, used to determine module size
Alignment patterns - smaller positioning markers in higher-version codes to correct distortion
Format information - encodes the error correction level and mask pattern
Data modules - the actual encoded payload, interleaved with error correction bytes

Version and Size

QR codes come in 40 versions. Version 1 is the smallest; version 40 is the largest.

The size formula is straightforward:

Side length = (4 × Version) + 17 modules

Common sizes:

Version	Formula	Size
1	(4×1) + 17 = 21	21 × 21
5	(4×5) + 17 = 37	37 × 37
10	(4×10) + 17 = 57	57 × 57
20	(4×20) + 17 = 97	97 × 97
40	(4×40) + 17 = 177	177 × 177

Each version adds 4 modules per side compared to the previous one. This consistent step is what gives the formula its simplicity.

Version selection is not arbitrary - it must be large enough to hold your data given your chosen error correction level and encoding mode. Libraries handle version selection automatically, but choosing a shorter URL or a more efficient encoding mode lets you use a smaller (lower-version) QR code, which scans faster and prints better at small sizes.

Data Encoding Modes

QR codes support multiple encoding modes, each optimised for a different character set. Choosing the right mode determines how efficiently your data is packed into the code.

Numeric Mode

Characters: digits 0-9 only
Encoding: three digits packed into 10 bits (3.33 bits per character)
Use case: phone numbers, serial numbers, product codes

Three digits are encoded as a single 10-bit binary number (000-999 requires at most 10 bits). Two remaining digits use 7 bits; one remaining digit uses 4 bits.
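
The grouping rule can be sketched in a few lines of Python. This is a minimal illustration, not a full encoder: the function name `encode_numeric` is made up for this article, and it produces only the data bits, leaving out the mode indicator and character count that a real QR bit stream puts in front.

```python
def encode_numeric(digits: str) -> str:
    """Return the numeric-mode data bits for a string of decimal digits."""
    if not (digits.isascii() and digits.isdigit()):
        raise ValueError("numeric mode accepts digits 0-9 only")
    # Bit width depends on how many digits are in the group
    widths = {3: 10, 2: 7, 1: 4}
    chunks = []
    for i in range(0, len(digits), 3):
        group = digits[i:i + 3]
        chunks.append(format(int(group), f"0{widths[len(group)]}b"))
    return "".join(chunks)


# "012" -> 10 bits, "345" -> 10 bits, "67" -> 7 bits
print(encode_numeric("01234567"))  # 000000110001010110011000011
```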

Alphanumeric Mode

Characters: 0-9, uppercase A-Z, and the symbols $, %, *, +, -, ., /, :, and space (45 characters total)
Encoding: two characters packed into 11 bits (5.5 bits per character)
Use case: URLs (if you uppercase them), simple text

Pairs of characters are converted to base-45 values and encoded together. Important: a URL fits alphanumeric mode only if every character is in the set above, so it must be fully uppercase and free of characters such as ?, =, & and _. A single lowercase letter forces byte mode.
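
As a rough sketch of the pairing step, assuming the standard character order (digits, then A-Z, then space and the eight symbols): each pair becomes `first * 45 + second` in 11 bits, and a lone trailing character takes 6 bits. The name `encode_alphanumeric` is illustrative, and the mode indicator and character count are again omitted.

```python
ALPHANUMERIC = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ $%*+-./:"


def encode_alphanumeric(text: str) -> str:
    """Return the alphanumeric-mode data bits (no mode or length header)."""
    # str.index raises ValueError for any character outside the 45-character set
    values = [ALPHANUMERIC.index(ch) for ch in text]
    chunks = []
    for i in range(0, len(values) - 1, 2):
        chunks.append(format(values[i] * 45 + values[i + 1], "011b"))
    if len(values) % 2:
        # An unpaired final character is written on its own in 6 bits
        chunks.append(format(values[-1], "06b"))
    return "".join(chunks)


# "AC" -> 10*45+12 = 462, "-4" -> 41*45+4 = 1849, "2" -> 2
print(encode_alphanumeric("AC-42"))  # 0011100111011100111001000010
```

Passing a lowercase string such as "abc" raises `ValueError`, which is the same situation in which a real encoder falls back to byte mode.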

Byte Mode

Characters: any byte value (typically UTF-8 or ISO-8859-1)
Encoding: 8 bits per character
Use case: URLs with lowercase letters, unicode text, arbitrary binary data

Most general-purpose QR code URLs use byte mode because lowercase letters are not in the alphanumeric character set. If you want the smallest possible code, use a URL shortener to reduce the payload length before encoding.

Kanji Mode

Characters: Japanese Shift JIS characters
Encoding: 13 bits per character
Use case: Japanese text

Kanji mode is highly efficient for Japanese but not useful for other scripts. Some implementations support ECI (Extended Channel Interpretation) mode to specify other character encodings.

Mode Efficiency Comparison

Mode	Bits per char	Example capacity (v10-L)
Numeric	3.33	652 digits
Alphanumeric	5.5	395 chars
Byte (Latin)	8	271 bytes
Kanji	13	167 characters

Libraries automatically select the most efficient mode for your input. Some advanced libraries switch modes mid-stream (mixed mode) to get even better efficiency for input like https://EXAMPLE.COM/12345.
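
To make the trade-off concrete, here is a simplified sketch of single-mode selection. It is an assumption about how a basic encoder might choose, not any particular library's logic; `pick_mode` and `payload_bits` are hypothetical names, Kanji and mixed-mode segmentation are left out, and the bit counts exclude the mode indicator and character count.

```python
import re

# Characters allowed in alphanumeric mode
ALNUM_RE = re.compile(r"[0-9A-Z $%*+\-./:]*")


def pick_mode(text: str) -> str:
    """Pick the densest single mode that can represent the whole string."""
    if text.isascii() and text.isdigit():
        return "numeric"
    if ALNUM_RE.fullmatch(text):
        return "alphanumeric"
    return "byte"


def payload_bits(text: str) -> int:
    """Data bits needed for the payload in the mode chosen by pick_mode."""
    mode = pick_mode(text)
    n = len(text)
    if mode == "numeric":
        return (n // 3) * 10 + {0: 0, 1: 4, 2: 7}[n % 3]
    if mode == "alphanumeric":
        return (n // 2) * 11 + (n % 2) * 6
    return 8 * len(text.encode("utf-8"))


for sample in ("0123456789", "HTTPS://EXAMPLE.COM/12345", "https://example.com/12345"):
    print(pick_mode(sample), payload_bits(sample))
```

The two 25-character URLs differ only in case, yet the uppercase one needs 138 bits and the lowercase one 200.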

Error Correction Levels

QR codes embed redundant data so they can be decoded even if part of the code is damaged, dirty, or obscured by a logo.

There are four error correction levels:

Level	Name	Recovery capacity	Data capacity cost
L	Low	~7% of codewords	Lowest penalty
M	Medium	~15% of codewords	Moderate
Q	Quartile	~25% of codewords	Significant
H	High	~30% of codewords	Highest penalty

"Recovery capacity" means: if up to roughly that percentage of the code's codewords (the 8-bit groups of modules that carry data and error correction) are damaged or unreadable, the code can still be fully decoded.

Choosing an Error Correction Level

Level L - use for QR codes displayed on screens, clean digital environments, or any context where physical damage is impossible. Maximum data capacity.
Level M - the default in most libraries. Good balance for printed codes in reasonably good condition.
Level Q - use for industrial or outdoor printing where codes may get dirty or worn.
Level H - use when you want to embed a logo inside the QR code. The logo occupies real module area, so you need H-level redundancy to compensate. The nominal 30% figure is an upper bound on damaged codewords, not a safe logo size: practical experience suggests covering well under 25% of the code area for reliable scanning.

Reed-Solomon Error Correction

The error correction in QR codes uses the Reed-Solomon algorithm, invented in 1960 by Irving S. Reed and Gustave Solomon at MIT Lincoln Laboratory. Their original paper was published in the Journal of the Society for Industrial and Applied Mathematics.

Reed-Solomon codes are not limited to QR codes. They are used in:

CDs and DVDs (scratch recovery)
DSL modems and RAID storage
Space communications - the Voyager 1 and 2 probes used Reed-Solomon coding
Digital television broadcasting (DVB)
Data Matrix and PDF417 barcodes

How It Works (Conceptually)

Reed-Solomon is based on polynomial arithmetic over finite fields (also called Galois fields). For QR codes, the finite field is GF(2^8) - a field with 256 elements (0-255), which matches a byte.

The core idea: treat your data as the coefficients of a polynomial. Then evaluate this polynomial at additional points beyond what you need. These extra evaluated points are the error correction codewords. If some values are lost or corrupted, you can reconstruct the original polynomial using the remaining correct values - like fitting a curve through known points when some points have been erased.
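
In practice QR encoders use the systematic form of this idea: the error correction codewords are the remainder left after dividing the data polynomial by a generator polynomial over GF(2^8). The sketch below assumes the field polynomial 0x11D and generator roots alpha^0 to alpha^(n-1), which is what QR uses. The function names are illustrative, and the sample data bytes are just an arbitrary 16-byte message.

```python
# Log/antilog tables for GF(2^8) with the primitive polynomial 0x11D
EXP = [0] * 512
LOG = [0] * 256
x = 1
for i in range(255):
    EXP[i] = x
    LOG[x] = i
    x <<= 1
    if x & 0x100:
        x ^= 0x11D
for i in range(255, 512):
    EXP[i] = EXP[i - 255]  # duplicate so gf_mul needs no modulo


def gf_mul(a: int, b: int) -> int:
    """Multiply two field elements using the log tables."""
    if a == 0 or b == 0:
        return 0
    return EXP[LOG[a] + LOG[b]]


def generator_poly(n_ec: int) -> list:
    """Product of (x + alpha^i) for i in 0..n_ec-1, highest degree first."""
    g = [1]
    for i in range(n_ec):
        nxt = g + [0]  # g * x
        for j, coef in enumerate(g):
            nxt[j + 1] ^= gf_mul(coef, EXP[i])  # plus g * alpha^i
        g = nxt
    return g


def rs_encode(data: list, n_ec: int) -> list:
    """Return n_ec error correction codewords for a list of data bytes."""
    gen = generator_poly(n_ec)
    rem = list(data) + [0] * n_ec  # data(x) * x^n_ec
    for i in range(len(data)):     # polynomial long division
        factor = rem[i]
        if factor:
            for j, g in enumerate(gen):
                rem[i + j] ^= gf_mul(g, factor)
    return rem[len(data):]


def poly_eval(poly: list, point: int) -> int:
    """Evaluate a polynomial (highest degree first) with Horner's rule."""
    y = 0
    for c in poly:
        y = gf_mul(y, point) ^ c
    return y


data = [32, 91, 11, 120, 209, 114, 220, 77, 67, 64, 236, 17, 236, 17, 236, 17]
ec = rs_encode(data, 10)
print(ec)
# A valid codeword evaluates to zero at every generator root
print(all(poly_eval(data + ec, EXP[i]) == 0 for i in range(10)))
```

The final check is what a decoder does first: non-zero results (syndromes) signal that some codewords were corrupted.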

Reed-Solomon can recover from two types of errors:

Erasures - corrupted bytes at known positions (e.g., you know module row 3 is torn off). More efficient to recover: one erasure "costs" one correction symbol.
Errors - corrupted bytes at unknown positions. Harder to recover: one error "costs" two correction symbols (you must locate it and fix it).

For QR codes, a decoder that can tell which modules it cannot read (dirt, damage) can treat them as erasures and use its correction budget more efficiently. A decoder that cannot tell where the damage is has to treat every bad codeword as an error.
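
The cost rule above amounts to a single inequality: twice the number of errors plus the number of erasures must not exceed the number of error correction codewords in a block. A minimal sketch, with `within_budget` as a made-up helper name; a real decoder may have slightly less headroom than this classical bound suggests.

```python
def within_budget(n_ec: int, errors: int, erasures: int) -> bool:
    """Classical Reed-Solomon bound: 2 * errors + erasures <= EC codewords."""
    return 2 * errors + erasures <= n_ec


# With 10 error correction codewords in a block:
print(within_budget(10, errors=5, erasures=0))   # True: 5 unknown-position errors
print(within_budget(10, errors=6, erasures=0))   # False: one error too many
print(within_budget(10, errors=0, erasures=10))  # True: 10 known-position erasures
print(within_budget(10, errors=3, erasures=4))   # True: 2*3 + 4 = 10
```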

Format Information

Every QR code encodes meta-information about itself in a format information strip: a pattern of 15 bits placed near the finder patterns, repeated twice for redundancy.

What Format Information Contains

The 5 data bits encode:

Error correction level (2 bits): L=01, M=00, Q=11, H=10
Mask pattern number (3 bits): 0-7

These 5 bits are protected by a BCH (Bose-Chaudhuri-Hocquenghem) error correction code, expanding to 15 bits total. The entire 15-bit string is then XOR'd with the fixed bit string 101010000010010, so that no combination of error correction level and mask pattern produces an all-zero format string.
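
The construction can be sketched as follows. The generator polynomial 10100110111 is not given above; it is the BCH(15,5) generator that the QR standard uses for format information. The function name `format_bits` is illustrative.

```python
EC_BITS = {"L": 0b01, "M": 0b00, "Q": 0b11, "H": 0b10}
GENERATOR = 0b10100110111      # BCH(15,5) generator, degree 10
XOR_MASK = 0b101010000010010   # fixed string applied after BCH encoding


def format_bits(level: str, mask: int) -> str:
    """Return the 15-bit format information string for a level and mask."""
    if not 0 <= mask <= 7:
        raise ValueError("mask pattern must be 0-7")
    data = (EC_BITS[level] << 3) | mask  # 2 level bits + 3 mask bits
    rem = data << 10                     # make room for 10 check bits
    for shift in range(4, -1, -1):       # binary polynomial division
        if rem & (1 << (shift + 10)):
            rem ^= GENERATOR << shift
    word = ((data << 10) | rem) ^ XOR_MASK
    return format(word, "015b")


print(format_bits("L", 0))  # 111011111000100
print(format_bits("M", 0))  # 101010000010010
```

Level M with mask 0 has all-zero data bits, so its BCH output is all zeros and the result is the XOR string itself, which is the case the final XOR exists to handle.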

Finder Patterns

The three finder patterns - 7x7 squares in the top-left, top-right, and bottom-left corners - are the most distinctive visual feature of a QR code. Their structure:

7x7 dark square border
5x5 white interior
3x3 dark centre

This specific concentric-square structure was chosen because it has a distinctive dark:light:dark:light:dark ratio of 1:1:3:1:1 that can be recognised regardless of the scanning angle or scale. A scanner can find these patterns in any orientation and use them to calculate the rotation and distortion of the code.
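
The ratio is easy to verify by building the pattern and measuring run lengths along a scan line. A small sketch with illustrative helper names, where 1 means dark and 0 means light:

```python
def finder_pattern() -> list:
    """Build the 7x7 finder pattern as rows of 1 (dark) and 0 (light)."""
    grid = []
    for r in range(7):
        row = []
        for c in range(7):
            # Distance from the centre in "rings": 3 = outer border,
            # 2 = light ring, 0-1 = dark 3x3 centre
            ring = max(abs(r - 3), abs(c - 3))
            row.append(0 if ring == 2 else 1)
        grid.append(row)
    return grid


def run_lengths(cells: list) -> list:
    """Collapse a line of modules into lengths of same-colour runs."""
    runs = []
    for value in cells:
        if runs and runs[-1][0] == value:
            runs[-1][1] += 1
        else:
            runs.append([value, 1])
    return [length for _, length in runs]


grid = finder_pattern()
print(run_lengths(grid[3]))                         # [1, 1, 3, 1, 1]
print(run_lengths([grid[i][i] for i in range(7)]))  # [1, 1, 3, 1, 1]
```

The horizontal and diagonal scan lines through the centre give the same proportions, which is why the pattern survives rotation.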

Timing Patterns

The timing patterns are single-module-wide alternating black/white stripes connecting the finder patterns horizontally and vertically. They start and end at the finder pattern separators. The scanner uses them to count module positions precisely, handling cases where the code is slightly distorted or the module size varies.

Masking

Raw data encoded into a QR code can create large uniform regions - big blocks of all-black or all-white modules, or regular patterns that look like finder patterns. These confuse scanners.

Masking fixes this by XOR-ing the data modules (not the structural elements) with one of eight predefined patterns:

Mask	Formula
0	(row + col) mod 2 == 0
1	row mod 2 == 0
2	col mod 3 == 0
3	(row + col) mod 3 == 0
4	(floor(row/2) + floor(col/3)) mod 2 == 0
5	(row*col) mod 2 + (row*col) mod 3 == 0
6	((row*col) mod 2 + (row*col) mod 3) mod 2 == 0
7	((row+col) mod 2 + (row*col) mod 3) mod 2 == 0

The encoder generates the QR code with all 8 masks, evaluates each against 4 penalty criteria, and selects the mask with the lowest total penalty score:

Rule 1 - penalise runs of 5 or more same-colour modules in a row or column
Rule 2 - penalise 2x2 blocks of same-colour modules
Rule 3 - penalise patterns that resemble finder patterns
Rule 4 - penalise unbalanced proportions of dark/light modules (ideal is 50/50)

The selected mask number is then stored in the format information strip so decoders know which mask to reverse.
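
A compact sketch of the mask conditions and of penalty rule 1. The helper names and the `reserved` grid (True for structural modules that must not be masked) are assumptions made for illustration. The weighting used for rule 1 is the standard one: 3 points for a run of exactly 5, plus 1 for each extra module.

```python
# Mask conditions 0-7; a data module is flipped where the condition is true
MASKS = [
    lambda r, c: (r + c) % 2 == 0,
    lambda r, c: r % 2 == 0,
    lambda r, c: c % 3 == 0,
    lambda r, c: (r + c) % 3 == 0,
    lambda r, c: (r // 2 + c // 3) % 2 == 0,
    lambda r, c: (r * c) % 2 + (r * c) % 3 == 0,
    lambda r, c: ((r * c) % 2 + (r * c) % 3) % 2 == 0,
    lambda r, c: ((r + c) % 2 + (r * c) % 3) % 2 == 0,
]


def apply_mask(modules: list, reserved: list, mask_id: int) -> list:
    """Return a copy of modules with non-reserved cells XOR-ed by the mask."""
    cond = MASKS[mask_id]
    return [
        [cell ^ 1 if cond(r, c) and not reserved[r][c] else cell
         for c, cell in enumerate(row)]
        for r, row in enumerate(modules)
    ]


def rule1_penalty(modules: list) -> int:
    """Score same-colour runs of 5 or more in every row and column."""
    def line_score(line):
        score, run = 0, 1
        for prev, cur in zip(line, line[1:]):
            if cur == prev:
                run += 1
            else:
                if run >= 5:
                    score += run - 2  # 3 for five modules, +1 per extra
                run = 1
        if run >= 5:
            score += run - 2
        return score

    columns = [list(col) for col in zip(*modules)]
    return sum(line_score(line) for line in modules + columns)


blank = [[0] * 6 for _ in range(6)]
nothing_reserved = [[False] * 6 for _ in range(6)]
masked = apply_mask(blank, nothing_reserved, 0)
print(rule1_penalty(blank), rule1_penalty(masked))  # 48 0
```

Because masking is an XOR, applying the same mask a second time restores the original modules, which is exactly what the decoder does.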

Capacity Table

Capacity depends on version, error correction level, and encoding mode. Here is a representative table:

Version	Error Level	Numeric digits	Alphanumeric	Bytes (UTF-8/Latin)
1	L	41	25	17
1	H	17	10	7
5	L	255	154	106
5	H	106	64	44
10	L	652	395	271
10	H	288	174	119
20	L	2061	1249	858
20	H	919	557	382
40	L	7089	4296	2953
40	H	3057	1852	1273

A typical HTTPS URL like https://example.com/products/item?ref=homepage is 46 characters. In byte mode with level M error correction, that fits in version 4 (33x33 modules). With level H (for logo embedding), you need version 6 or higher.
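
The version lookup can be sketched with a small table of byte-mode capacities. The figures below cover versions 1 to 6 only and come from the standard's capacity tables; `min_version` is a hypothetical helper, not a library function, and it assumes the whole payload is encoded in byte mode.

```python
# Byte-mode capacity for versions 1-6, indexed by error correction level
BYTE_CAPACITY = {
    "L": [17, 32, 53, 78, 106, 134],
    "M": [14, 26, 42, 62, 84, 106],
    "Q": [11, 20, 32, 46, 60, 74],
    "H": [7, 14, 24, 34, 44, 58],
}


def min_version(payload: str, level: str = "M"):
    """Smallest version (1-6) that holds the payload in byte mode, or None."""
    needed = len(payload.encode("utf-8"))
    for version, capacity in enumerate(BYTE_CAPACITY[level], start=1):
        if needed <= capacity:
            return version
    return None  # larger than this truncated table covers


url = "https://example.com/products/item?ref=homepage"
for level in "LMQH":
    version = min_version(url, level)
    print(level, version, f"{4 * version + 17}x{4 * version + 17}")
```

For the 46-byte example URL this reports version 4 at level M and version 6 at level H.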

Use Cases and Best Practices

URL Shortening

Long URLs force higher QR code versions: more modules, which means smaller modules at a given print size and less reliable scanning. Before encoding:

Use a URL shortener (bit.ly, your own short domain)
Remove unnecessary tracking parameters when possible
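Consider uppercasing the scheme and host (both are case-insensitive) for alphanumeric mode efficiency; paths and query strings are often case-sensitive and may contain characters outside the alphanumeric set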

Quiet Zone

Every QR code requires a quiet zone - a minimum 4-module-wide border of white space around the entire code. This is specified in ISO/IEC 18004 and is not optional. Without it, scanners may fail to locate the finder patterns.

In practice, add at least 4 modules (more is better) of white margin when placing a QR code on a coloured background.

Minimum Print Size

For reliable mobile phone scanning, QR codes need a minimum physical size:

2.5 cm (1 inch) minimum side length is the commonly cited rule of thumb
Higher version codes (more modules) need proportionally more physical space to maintain per-module legibility
Scanning distance matters: a code that works at arm's length needs to be larger to also work from 1 meter
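
A quick way to reason about these points is to compute the physical size of one module. This sketch assumes the printed side length includes the quiet zone; `module_size_mm` is an illustrative name.

```python
def module_size_mm(print_side_mm: float, version: int, quiet_zone: int = 4) -> float:
    """Physical width of one module when the printed side includes the quiet zone."""
    modules_across = 4 * version + 17 + 2 * quiet_zone
    return print_side_mm / modules_across


# The same 25 mm square at different versions
for version in (1, 4, 10, 20):
    print(version, round(module_size_mm(25, version), 2), "mm per module")
```

At 25 mm a version 1 code has modules of about 0.86 mm, while a version 20 code drops to about 0.24 mm.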

Logo Embedding

Embedding a logo is a popular design choice. The rules:

Use level H error correction - only level H (30% recovery) gives you meaningful protection against the modules hidden by the logo
Keep the logo under about 25% of the total area - even though H nominally allows 30%, real-world scanners have variable error correction budgets and partial damage elsewhere in the code may already consume some correction capacity
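Centre the logo - in low-version codes the centre holds data and error correction modules rather than structural elements, so centring minimises the chance of covering a finder or timing pattern. From version 7 upward an alignment pattern can sit at or near the centre, which is one more reason to keep the logo small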
Test extensively - scan with multiple devices and apps before printing at scale
Avoid covering finder patterns - the three corner squares are structural and cannot be replaced by error correction

Dynamic vs Static QR Codes

Static QR codes encode the destination URL directly. The QR code itself is immutable once printed.

Dynamic QR codes encode a short URL that redirects to the actual destination. The redirect target can be changed at any time without reprinting. Dynamic codes:

Are shorter (lower version, smaller size, faster scanning)
Provide scan analytics
Can be updated (new destination, seasonal campaigns)
Require an ongoing service (the short URL redirect must remain active)

For permanent use cases (business cards, packaging), dynamic codes are safer because they let you fix destination URLs after printing.

Code Examples

PHP: chillerlan/php-qrcode

<?php

declare(strict_types=1);

require 'vendor/autoload.php';

use chillerlan\QRCode\QRCode;
use chillerlan\QRCode\QROptions;
use chillerlan\QRCode\Common\EccLevel;

// Basic usage - returns a data URI for inline HTML
$data = 'https://example.com';
$dataUri = (new QRCode)->render($data);
echo '<img src="' . $dataUri . '" alt="QR Code">';

// With options
$options = new QROptions([
    'eccLevel'      => EccLevel::H,      // High error correction for logo embedding
    'imageBase64'   => true,
    'imageTransparent' => false,
    'scale'         => 10,               // 10 pixels per module
    'quietzoneSize' => 4,                // 4 modules quiet zone
    'outputType'    => QRCode::OUTPUT_IMAGE_PNG,
]);

$qrcode = new QRCode($options);

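// Render to a file: pass the path as render()'s second argument.
// With base64 output enabled, render() returns a data URI string, so writing
// its return value with file_put_contents() would not produce a valid PNG.
$dataUri = $qrcode->render($data, '/tmp/qr.png');

// $dataUri still holds the data URI for HTML embedding

// Get version info
// (option and method names differ between major versions of the library;
// check them against the version you have installed)
$matrix = $qrcode->getMatrix($data);
echo 'Version: ' . $matrix->version() . PHP_EOL;
echo 'Size: ' . $matrix->size() . ' × ' . $matrix->size() . ' modules' . PHP_EOL;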

Install with:

composer require chillerlan/php-qrcode

Python: qrcode

import qrcode
from qrcode.constants import ERROR_CORRECT_H  # also: ERROR_CORRECT_L, ERROR_CORRECT_M, ERROR_CORRECT_Q
from qrcode.image.styledpil import StyledPilImage
from qrcode.image.styles.moduledrawers import RoundedModuleDrawer

# Basic usage
img = qrcode.make('https://example.com')
img.save('/tmp/qr_basic.png')

# With full options
qr = qrcode.QRCode(
    version=None,           # Auto-select minimum version
    error_correction=ERROR_CORRECT_H,
    box_size=10,            # Pixels per module
    border=4,               # Modules of quiet zone
)
qr.add_data('https://example.com')
qr.make(fit=True)          # Auto-calculate version

img = qr.make_image(fill_color='black', back_color='white')
img.save('/tmp/qr.png')

# Inspect the version selected
print(f'Version: {qr.version}')
print(f'Size: {qr.modules_count} × {qr.modules_count} modules')

# Styled QR code with rounded modules (requires Pillow)
qr_styled = qrcode.QRCode(error_correction=ERROR_CORRECT_H)
qr_styled.add_data('https://example.com')
img_styled = qr_styled.make_image(
    image_factory=StyledPilImage,
    module_drawer=RoundedModuleDrawer()
)
img_styled.save('/tmp/qr_styled.png')

Install with:

pip install "qrcode[pil]"

JavaScript: qrcode (npm)

import QRCode from 'qrcode';

// Render to canvas (browser)
async function renderToCanvas(canvasElement, text) {
  await QRCode.toCanvas(canvasElement, text, {
    errorCorrectionLevel: 'H',
    margin: 4,
    scale: 8,
    color: {
      dark: '#000000',
      light: '#ffffff',
    },
  });
}

// Render to data URL (browser or Node.js)
async function renderToDataUrl(text) {
  const dataUrl = await QRCode.toDataURL(text, {
    errorCorrectionLevel: 'M',
    type: 'image/png',
    margin: 4,
    width: 300,
  });
  return dataUrl; // "data:image/png;base64,..."
}

// Render to SVG string (good for server-side rendering)
async function renderToSvg(text) {
  const svg = await QRCode.toString(text, {
    type: 'svg',
    errorCorrectionLevel: 'M',
    margin: 4,
  });
  return svg;
}

// Example usage
const url = 'https://example.com/product/12345';
const svgString = await renderToSvg(url);
console.log(svgString);

Install with:

npm install qrcode

For TypeScript:

npm install qrcode @types/qrcode

Checking QR Code Capacity in JavaScript

import QRCode from 'qrcode';

async function analyzeCapacity(data, ecLevel = 'M') {
  // QRCode.create() is synchronous and returns the whole QR symbol object
  // (version, modules, segments), not just the segments
  const qr = QRCode.create(data, { errorCorrectionLevel: ecLevel });
  const version = qr.version;
  const side = 4 * version + 17;

  console.log(`Data: "${data}" (${data.length} chars)`);
  console.log(`Error correction: ${ecLevel}`);
  console.log(`QR version: ${version}`);
  console.log(`Matrix size: ${side} × ${side} modules`);
}

// Compare error correction levels
for (const level of ['L', 'M', 'Q', 'H']) {
  await analyzeCapacity('https://example.com/some/path', level);
}

Common Mistakes

Using level H when you do not need it - if you are not embedding a logo, level H wastes capacity and forces a larger QR version. Use level M for most printed codes.

Forgetting the quiet zone - placing a QR code flush against a coloured border or the edge of a page is a common cause of scan failures. Always keep 4+ modules of white space around the code.

Encoding mixed case URLs in alphanumeric mode - alphanumeric mode only supports uppercase A-Z. A URL like https://example.com/Page contains lowercase letters and requires byte mode. Some tools silently fall back to byte mode; others throw an error.

Testing only on one device - different QR scanner apps implement the standard with different tolerances. Test with at least iOS Camera, Android Camera, and one dedicated scanning app.

Printing too small - a high-version code (v20+) printed at 2cm is essentially unscannable. Use a URL shortener to reduce version, or increase the physical print size.

Why the Internals Matter

QR codes pack a remarkable amount of engineering into a small printed square: Reed-Solomon polynomial codes invented for space communications, eight mathematically defined masking patterns, four encoding modes with different efficiency profiles, and a penalty-scoring system that automatically picks the best mask for your data.

Understanding these internals helps you make better engineering decisions: choosing the right error correction level, knowing when to shorten URLs, understanding the 30% logo budget, and setting the correct quiet zone. These details separate QR codes that scan reliably from ones that frustrate users.
