QR Code Internals: Encoding, Error Correction, and Capacity
16 March, 2026 Algorithms
QR codes are everywhere - on product packaging, restaurant menus, boarding passes, and payment terminals. Most developers treat them as black boxes: feed in a URL, get an image. But QR codes are a beautifully engineered standard with fascinating internals: polynomial arithmetic over finite fields, eight distinct mask patterns, four encoding modes, and a versioning system that scales from a 21x21 grid to a 177x177 matrix.
This article opens the black box. Understanding how QR codes work helps you make better decisions about capacity, error correction levels, print sizes, and logo embedding - and helps you build better QR code generators.
What Is a QR Code
QR (Quick Response) code was invented in 1994 by Masahiro Hara at Denso Wave, a subsidiary of Toyota supplier Denso. The original use case was tracking automotive parts in manufacturing. The name "Quick Response" refers to the design goal: fast decoding by industrial optical readers, not just handheld scanners.
The standard is defined by ISO/IEC 18004:2015. Denso Wave deliberately did not enforce their patent, making QR codes free to use. This open approach is a large part of why QR codes became the dominant 2D barcode format worldwide.
A QR code is a matrix barcode: a 2D grid of black and white square modules (pixels). Unlike 1D barcodes (which encode data in line widths and spacings), QR codes encode data in two dimensions, achieving much higher data density.
Key structural elements:
- Finder patterns - three 7x7 black squares in corners (all except bottom-right), used to locate and orient the code
- Separators - one-module white borders around each finder pattern
- Timing patterns - alternating black/white stripes connecting the finder patterns, used to determine module size
- Alignment patterns - smaller positioning markers in higher-version codes to correct distortion
- Format information - encodes the error correction level and mask pattern
- Data modules - the actual encoded payload, interleaved with error correction bytes
Version and Size
QR codes come in 40 versions. Version 1 is the smallest; version 40 is the largest.
The size formula is straightforward:
Side length = (4 × Version) + 17 modules
Common sizes:
| Version | Formula | Size |
|---|---|---|
| 1 | (4×1) + 17 = 21 | 21 × 21 |
| 5 | (4×5) + 17 = 37 | 37 × 37 |
| 10 | (4×10) + 17 = 57 | 57 × 57 |
| 20 | (4×20) + 17 = 97 | 97 × 97 |
| 40 | (4×40) + 17 = 177 | 177 × 177 |
Each version adds 4 modules per side compared to the previous one. This consistent step is what gives the formula its simplicity.
Version selection is not arbitrary - it must be large enough to hold your data given your chosen error correction level and encoding mode. Libraries handle version selection automatically, but choosing a shorter URL or a more efficient encoding mode lets you use a smaller (lower-version) QR code, which scans faster and prints better at small sizes.
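The size formula and its inverse fit in a few lines. This is a minimal sketch; the function names are illustrative and not taken from any library.

```python
def side_length(version: int) -> int:
    """Return the number of modules per side for a QR version (1-40)."""
    if not 1 <= version <= 40:
        raise ValueError("QR versions run from 1 to 40")
    return 4 * version + 17


def version_from_side(side: int) -> int:
    """Invert the formula: recover the version from a module count."""
    version, remainder = divmod(side - 17, 4)
    if remainder != 0 or not 1 <= version <= 40:
        raise ValueError(f"{side} is not a valid QR side length")
    return version


for v in (1, 5, 10, 20, 40):
    print(f"version {v}: {side_length(v)} x {side_length(v)} modules")
```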
Data Encoding Modes
QR codes support multiple encoding modes, each optimised for a different character set. Choosing the right mode determines how efficiently your data is packed into the code.
Numeric Mode
- Characters: digits 0-9 only
- Encoding: three digits packed into 10 bits (3.33 bits per character)
- Use case: phone numbers, serial numbers, product codes
Three digits are encoded as a single 10-bit binary number (000-999 requires at most 10 bits). Two remaining digits use 7 bits; one remaining digit uses 4 bits.
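The grouping rule above can be sketched as follows. The function name is hypothetical, and the sketch produces only the data bits: the mode indicator and character count field that precede them in a real symbol are left out.

```python
def encode_numeric(digits: str) -> str:
    """Return numeric-mode data bits as a string of '0'/'1' characters."""
    if any(ch not in "0123456789" for ch in digits):
        raise ValueError("numeric mode accepts only the digits 0-9")
    widths = {3: 10, 2: 7, 1: 4}  # bits used for a group of 3, 2 or 1 digits
    chunks = []
    for i in range(0, len(digits), 3):
        group = digits[i:i + 3]
        chunks.append(format(int(group), f"0{widths[len(group)]}b"))
    return "".join(chunks)


bits = encode_numeric("8675309")  # groups: 867, 530, 9
print(bits, len(bits))            # 10 + 10 + 4 = 24 bits
```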
Alphanumeric Mode
- Characters: 0-9, uppercase A-Z, and the symbols $, %, *, +, -, ., /, :, and space (45 characters total)
- Encoding: two characters packed into 11 bits (5.5 bits per character)
- Use case: URLs (if you uppercase them), simple text
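Pairs of characters are combined into a single value (45 × first + second) and encoded in 11 bits; a leftover single character takes 6 bits. Important: only URLs written entirely in uppercase, and using no symbols outside the set above, fit alphanumeric mode; URLs with lowercase letters require byte mode.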
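A short sketch of the pair packing, assuming the standard character ordering (digits, then A-Z, then space and the eight symbols). The function name is illustrative, and the mode indicator and character count field are omitted.

```python
# Position in this string is the character's alphanumeric-mode value (0-44).
ALPHANUMERIC = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ $%*+-./:"


def encode_alphanumeric(text: str) -> str:
    """Return alphanumeric-mode data bits as a string of '0'/'1' characters."""
    for ch in text:
        if ch not in ALPHANUMERIC:
            raise ValueError(f"{ch!r} is not in the alphanumeric set")
    values = [ALPHANUMERIC.index(ch) for ch in text]
    chunks = []
    for i in range(0, len(values) - 1, 2):
        # Two characters become one 11-bit number: 45 * first + second.
        chunks.append(format(values[i] * 45 + values[i + 1], "011b"))
    if len(values) % 2:
        # A trailing single character is written in 6 bits.
        chunks.append(format(values[-1], "06b"))
    return "".join(chunks)


print(encode_alphanumeric("HELLO WORLD"))  # 5 pairs + 1 single = 61 bits
```

Passing a lowercase string raises `ValueError`, which mirrors why such input has to fall back to byte mode.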
Byte Mode
- Characters: any byte value (typically UTF-8 or ISO-8859-1)
- Encoding: 8 bits per character
- Use case: URLs with lowercase letters, unicode text, arbitrary binary data
Most general-purpose QR code URLs use byte mode because lowercase letters are not in the alphanumeric character set. If you want to maximise capacity, use a URL shortener to reduce the payload length before encoding.
Kanji Mode
- Characters: Japanese Shift JIS characters
- Encoding: 13 bits per character
- Use case: Japanese text
Kanji mode is highly efficient for Japanese but not useful for other scripts. Some implementations support ECI (Extended Channel Interpretation) mode to specify other character encodings.
Mode Efficiency Comparison
| Mode | Bits per char | Example capacity (v10-L) |
|---|---|---|
| Numeric | 3.33 | 652 digits |
| Alphanumeric | 5.5 | 395 chars |
| Byte (Latin) | 8 | 271 bytes |
| Kanji | 13 | 167 characters |
Libraries automatically select the most efficient mode for your input. Some advanced libraries switch modes mid-stream (mixed mode) to get even better efficiency for input like https://EXAMPLE.COM/12345.
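To see how much the mode matters for a given input, the sketch below picks the most compact single mode and counts the payload bits. It is a simplification under stated assumptions: Kanji and mixed-mode segmentation are ignored, header bits (mode indicator, character count) are not counted, and the function names are made up for illustration.

```python
ALPHANUMERIC = set("0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ $%*+-./:")


def pick_mode(text: str) -> str:
    """Choose the most compact single mode that can represent the text."""
    if text and all(ch in "0123456789" for ch in text):
        return "numeric"
    if text and all(ch in ALPHANUMERIC for ch in text):
        return "alphanumeric"
    return "byte"


def payload_bits(text: str) -> int:
    """Count data bits for the chosen mode (headers excluded)."""
    mode = pick_mode(text)
    n = len(text)
    if mode == "numeric":
        return 10 * (n // 3) + (0, 4, 7)[n % 3]
    if mode == "alphanumeric":
        return 11 * (n // 2) + 6 * (n % 2)
    return 8 * len(text.encode("utf-8"))


for sample in ("0123456789", "HTTPS://EXAMPLE.COM", "https://example.com"):
    print(f"{sample!r}: {pick_mode(sample)}, {payload_bits(sample)} bits")
```

The same 19-character host costs 105 bits in uppercase and 152 bits in lowercase, which is the gap the table above describes.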
Error Correction Levels
QR codes embed redundant data so they can be decoded even if part of the code is damaged, dirty, or obscured by a logo.
There are four error correction levels:
| Level | Name | Recovery capacity | Data capacity cost |
|---|---|---|---|
| L | Low | ~7% of codewords | Lowest penalty |
| M | Medium | ~15% of codewords | Moderate |
| Q | Quartile | ~25% of codewords | Significant |
| H | High | ~30% of codewords | Highest penalty |
"Recovery capacity" means: if up to roughly that percentage of codewords (the 8-bit units the modules are grouped into) are damaged or unreadable, the code can still be fully decoded.
Choosing an Error Correction Level
- Level L - use for QR codes displayed on screens, clean digital environments, or any context where physical damage is impossible. Maximum data capacity.
- Level M - the default in most libraries. Good balance for printed codes in reasonably good condition.
- Level Q - use for industrial or outdoor printing where codes may get dirty or worn.
- Level H - use when you want to embed a logo inside the QR code. The logo occupies real module area, so you need H-level redundancy to compensate. The nominal 30% recovery limit suggests a logo could cover up to about 30% of the code area, but practical experience suggests staying under 25% for reliable scanning.
Reed-Solomon Error Correction
The error correction in QR codes uses the Reed-Solomon algorithm, invented in 1960 by Irving S. Reed and Gustave Solomon at MIT Lincoln Laboratory. Their original paper was published in the Journal of the Society for Industrial and Applied Mathematics.
Reed-Solomon codes are not limited to QR codes. They are used in:
- CDs and DVDs (scratch recovery)
- DSL modems and RAID storage
- Space communications - the Voyager 1 and 2 probes used Reed-Solomon coding
- Digital television broadcasting (DVB)
- Data Matrix and PDF417 barcodes
How It Works (Conceptually)
Reed-Solomon is based on polynomial arithmetic over finite fields (also called Galois fields). For QR codes, the finite field is GF(2^8) - a field with 256 elements (0-255), which matches a byte.
The core idea: treat your data as the coefficients of a polynomial. Then evaluate this polynomial at additional points beyond what you need. These extra evaluated points are the error correction codewords. If some values are lost or corrupted, you can reconstruct the original polynomial using the remaining correct values - like fitting a curve through known points when some points have been erased.
Reed-Solomon can recover from two types of errors:
- Erasures - corrupted bytes at known positions (e.g., you know module row 3 is torn off). More efficient to recover: one erasure "costs" one correction symbol.
- Errors - corrupted bytes at unknown positions. Harder to recover: one error "costs" two correction symbols (you must locate it and fix it).
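When a QR decoder can tell which modules it cannot read (dirt, damage), it can treat them as erasures and use its correction budget more efficiently. A decoder that cannot tell must treat the damage as errors at unknown positions, which halves the amount of damage it can correct.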
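The arithmetic can be sketched in a few dozen lines. QR encoders use the systematic form of the code: the error correction codewords are the remainder left after dividing the data polynomial by a generator polynomial whose roots are consecutive powers of the field generator. The sketch below assumes the field polynomial 0x11D from the QR specification; the function names are illustrative, and the input is the 16 data codewords of "HELLO WORLD" at version 1, level M, which takes 10 correction codewords.

```python
# Build exponent/log tables for GF(2^8) with the polynomial x^8+x^4+x^3+x^2+1.
EXP = [0] * 512
LOG = [0] * 256
value = 1
for power in range(255):
    EXP[power] = value
    LOG[value] = power
    value <<= 1
    if value & 0x100:
        value ^= 0x11D
for power in range(255, 512):
    EXP[power] = EXP[power - 255]  # duplicate so gf_mul needs no modulo


def gf_mul(a: int, b: int) -> int:
    """Multiply two field elements via the log tables."""
    if a == 0 or b == 0:
        return 0
    return EXP[LOG[a] + LOG[b]]


def generator_poly(n_ec: int) -> list:
    """(x + a^0)(x + a^1)...(x + a^(n_ec-1)), highest degree first."""
    poly = [1]
    for i in range(n_ec):
        nxt = poly + [0]                  # poly * x
        for j, coef in enumerate(poly):   # plus poly * a^i (addition is XOR)
            nxt[j + 1] ^= gf_mul(coef, EXP[i])
        poly = nxt
    return poly


def rs_ec_codewords(data: list, n_ec: int) -> list:
    """Remainder of data(x) * x^n_ec divided by the generator polynomial."""
    gen = generator_poly(n_ec)
    rem = list(data) + [0] * n_ec
    for i in range(len(data)):
        factor = rem[i]
        if factor:
            for j, g in enumerate(gen):
                rem[i + j] ^= gf_mul(g, factor)
    return rem[-n_ec:]


def poly_eval(poly: list, x: int) -> int:
    """Evaluate a polynomial (highest degree first) at x, Horner style."""
    y = 0
    for coef in poly:
        y = gf_mul(y, x) ^ coef
    return y


data = [32, 91, 11, 120, 209, 114, 220, 77, 67, 64, 236, 17, 236, 17, 236, 17]
ec = rs_ec_codewords(data, 10)
print("error correction codewords:", ec)
# An undamaged codeword evaluates to zero at every root of the generator.
print("syndromes:", [poly_eval(data + ec, EXP[i]) for i in range(10)])
```

A decoder runs the same syndrome check on what it scanned: any non-zero syndrome means at least one codeword was damaged, and the pattern of syndromes is what lets it locate and repair the damage.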
Format Information
Every QR code encodes meta-information about itself in a format information strip: a pattern of 15 bits placed next to the finder patterns and stored in two copies for redundancy.
What Format Information Contains
The 5 data bits encode:
- Error correction level (2 bits): L=01, M=00, Q=11, H=10
- Mask pattern number (3 bits): 0-7
These 5 bits are protected by a BCH (Bose-Chaudhuri-Hocquenghem) error correction code, which appends 10 check bits for 15 bits in total. The entire 15-bit string is then XOR'd with the fixed mask 101010000010010 so that no combination of error correction level and mask pattern produces an all-zero format string.
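The whole construction is small enough to compute by hand or in a few lines. The sketch below assumes the BCH generator polynomial 10100110111 from the QR specification (it is not given in this article); the function name is illustrative.

```python
EC_BITS = {"L": 0b01, "M": 0b00, "Q": 0b11, "H": 0b10}
BCH_GENERATOR = 0b10100110111   # degree-10 generator (assumed from the spec)
XOR_MASK = 0b101010000010010    # fixed mask applied to the finished string


def format_bits(level: str, mask: int) -> str:
    """Return the 15-bit format information string for a level and mask."""
    if not 0 <= mask <= 7:
        raise ValueError("mask pattern must be 0-7")
    data = (EC_BITS[level] << 3) | mask      # 2 level bits + 3 mask bits
    remainder = data << 10                   # make room for 10 check bits
    for shift in range(4, -1, -1):           # binary polynomial long division
        if remainder & (1 << (shift + 10)):
            remainder ^= BCH_GENERATOR << shift
    return format(((data << 10) | remainder) ^ XOR_MASK, "015b")


for level in "LMQH":
    print(level, format_bits(level, 0))
```

For level M with mask 0 the five data bits are all zero, so the check bits are zero too and the output is exactly the XOR mask.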
Finder Patterns
The three finder patterns - 7x7 squares in the top-left, top-right, and bottom-left corners - are the most distinctive visual feature of a QR code. Their structure:
- 7x7 dark square border
- 5x5 white interior
- 3x3 dark centre
This specific concentric-square structure was chosen because it has a distinctive dark:light:dark:light:dark ratio of 1:1:3:1:1 that can be recognised regardless of the scanning angle or scale. A scanner can find these patterns in any orientation and use them to calculate the rotation and distortion of the code.
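The ratio is easy to confirm by building the pattern and measuring the runs along a scan line. A minimal sketch with illustrative function names, where 1 stands for a dark module:

```python
def finder_pattern() -> list:
    """Build the 7x7 finder pattern as rows of 0 (light) and 1 (dark)."""
    grid = []
    for r in range(7):
        row = []
        for c in range(7):
            ring = max(abs(r - 3), abs(c - 3))  # 0 = centre, 3 = outer border
            row.append(0 if ring == 2 else 1)   # only the second ring is light
        grid.append(row)
    return grid


def run_lengths(cells: list) -> list:
    """Collapse a line of modules into the lengths of its same-colour runs."""
    runs = []
    previous = None
    for cell in cells:
        if cell == previous:
            runs[-1] += 1
        else:
            runs.append(1)
            previous = cell
    return runs


pattern = finder_pattern()
for row in pattern:
    print("".join("#" if cell else "." for cell in row))
print("middle row:", run_lengths(pattern[3]))
print("diagonal:  ", run_lengths([pattern[i][i] for i in range(7)]))
```

Both the horizontal and the diagonal scan line through the centre produce the same run pattern, which is why the pattern can be detected at any rotation.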
Timing Patterns
The timing patterns are single-module-wide alternating black/white stripes connecting the finder patterns horizontally and vertically. They start and end at the finder pattern separators. The scanner uses them to count module positions precisely, handling cases where the code is slightly distorted or the module size varies.
Masking
Raw data encoded into a QR code can create large uniform regions - big blocks of all-black or all-white modules, or regular patterns that look like finder patterns. These confuse scanners.
Masking fixes this by XOR-ing the data modules (not the structural elements) with one of eight predefined patterns:
| Mask | Formula |
|---|---|
| 0 | (row + col) mod 2 == 0 |
| 1 | row mod 2 == 0 |
| 2 | col mod 3 == 0 |
| 3 | (row + col) mod 3 == 0 |
| 4 | (floor(row/2) + floor(col/3)) mod 2 == 0 |
| 5 | (row × col) mod 2 + (row × col) mod 3 == 0 |
| 6 | ((row × col) mod 2 + (row × col) mod 3) mod 2 == 0 |
| 7 | ((row + col) mod 2 + (row × col) mod 3) mod 2 == 0 |
The encoder generates the QR code with all 8 masks, evaluates each against 4 penalty criteria, and selects the mask with the lowest total penalty score:
- Rule 1 - penalise runs of 5 or more same-colour modules in a row or column
- Rule 2 - penalise 2x2 blocks of same-colour modules
- Rule 3 - penalise patterns that resemble finder patterns
- Rule 4 - penalise unbalanced proportions of dark/light modules (ideal is 50/50)
The selected mask number is then stored in the format information strip so decoders know which mask to reverse.
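The selection loop can be sketched as follows. This is a simplified illustration, not a full encoder: it masks every module (a real encoder skips the structural ones), it scores only rules 1, 2 and 4 and leaves out the finder-lookalike rule 3, and the penalty weights (3, 3 and 10) are assumed from the QR specification rather than stated in this article.

```python
MASKS = [
    lambda r, c: (r + c) % 2 == 0,
    lambda r, c: r % 2 == 0,
    lambda r, c: c % 3 == 0,
    lambda r, c: (r + c) % 3 == 0,
    lambda r, c: (r // 2 + c // 3) % 2 == 0,
    lambda r, c: (r * c) % 2 + (r * c) % 3 == 0,
    lambda r, c: ((r * c) % 2 + (r * c) % 3) % 2 == 0,
    lambda r, c: ((r + c) % 2 + (r * c) % 3) % 2 == 0,
]


def apply_mask(grid, mask_id):
    """Flip every module where the mask formula is true (XOR with 1)."""
    mask = MASKS[mask_id]
    return [[cell ^ 1 if mask(r, c) else cell for c, cell in enumerate(row)]
            for r, row in enumerate(grid)]


def penalty_runs(grid):
    """Rule 1: a run of 5 costs 3, each extra module in the run costs 1."""
    score = 0
    for line in list(grid) + [list(col) for col in zip(*grid)]:
        run = 1
        for prev, cur in zip(line, line[1:]):
            if cur == prev:
                run += 1
            else:
                score += run - 2 if run >= 5 else 0
                run = 1
        score += run - 2 if run >= 5 else 0
    return score


def penalty_blocks(grid):
    """Rule 2: every 2x2 block of one colour costs 3 (overlaps count)."""
    n = len(grid)
    return 3 * sum(
        grid[r][c] == grid[r][c + 1] == grid[r + 1][c] == grid[r + 1][c + 1]
        for r in range(n - 1) for c in range(n - 1))


def penalty_balance(grid):
    """Rule 4: 10 points per full 5% deviation from a 50/50 dark ratio."""
    total = len(grid) * len(grid[0])
    dark = sum(map(sum, grid))
    return 10 * (abs(20 * dark - 10 * total) // total)


def best_mask(grid):
    """Try all eight masks and return the index with the lowest penalty."""
    def score(mask_id):
        masked = apply_mask(grid, mask_id)
        return (penalty_runs(masked) + penalty_blocks(masked)
                + penalty_balance(masked))
    return min(range(8), key=score)


blank = [[0] * 6 for _ in range(6)]  # worst case: one big uniform region
print("chosen mask:", best_mask(blank))
```

Because masking is an XOR, applying the same mask a second time restores the original modules, which is how the decoder undoes it.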
Capacity Table
Capacity depends on version, error correction level, and encoding mode. Here is a representative table:
| Version | Error Level | Numeric digits | Alphanumeric | Bytes (UTF-8/Latin) |
|---|---|---|---|---|
| 1 | L | 41 | 25 | 17 |
| 1 | H | 17 | 10 | 7 |
| 5 | L | 255 | 154 | 106 |
| 5 | H | 106 | 64 | 44 |
| 10 | L | 652 | 395 | 271 |
| 10 | H | 288 | 174 | 119 |
| 20 | L | 2061 | 1249 | 858 |
| 20 | H | 919 | 557 | 382 |
| 40 | L | 7089 | 4296 | 2953 |
| 40 | H | 3057 | 1852 | 1273 |
A typical HTTPS URL like https://example.com/products/item?ref=homepage is 46 characters long. In byte mode with level M error correction, that fits in version 4 (33x33 modules). With level H (for logo embedding), you need version 6 or higher.
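Version selection for byte-mode input amounts to a table lookup. The sketch below hard-codes the byte-mode capacities for versions 1 to 10 only, so it is an illustration rather than a replacement for a library; the function name is made up.

```python
# Byte-mode capacity per error correction level, versions 1-10.
BYTE_CAPACITY = {
    "L": [17, 32, 53, 78, 106, 134, 154, 192, 230, 271],
    "M": [14, 26, 42, 62, 84, 106, 122, 152, 180, 213],
    "Q": [11, 20, 32, 46, 60, 74, 86, 108, 130, 151],
    "H": [7, 14, 24, 34, 44, 58, 64, 84, 98, 119],
}


def smallest_version(text: str, level: str = "M") -> int:
    """Return the lowest version (1-10) that holds the text in byte mode."""
    needed = len(text.encode("utf-8"))
    for version, capacity in enumerate(BYTE_CAPACITY[level], start=1):
        if needed <= capacity:
            return version
    raise ValueError(f"{needed} bytes need a version above 10 at level {level}")


url = "https://example.com/products/item?ref=homepage"
for level in "LMQH":
    version = smallest_version(url, level)
    side = 4 * version + 17
    print(f"level {level}: version {version} ({side} x {side} modules)")
```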
Use Cases and Best Practices
URL Shortening
Long URLs force higher QR code versions: more modules, which means smaller modules at a given print size and less reliable scanning. Before encoding:
- Use a URL shortener (bit.ly, your own short domain)
- Remove unnecessary tracking parameters when possible
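- Consider an all-uppercase URL for alphanumeric mode efficiency (scheme and host names are case-insensitive, but paths often are not, and characters such as ? and = fall outside the alphanumeric set)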
Quiet Zone
Every QR code requires a quiet zone - a minimum 4-module-wide border of white space around the entire code. This is specified in ISO/IEC 18004 and is not optional. Without it, scanners may fail to locate the finder patterns.
In practice, add at least 4 modules (more is better) of white margin when placing a QR code on a coloured background.
Minimum Print Size
For reliable mobile phone scanning, QR codes need a minimum physical size:
- 2.5 cm (1 inch) minimum side length is the commonly cited rule of thumb
- Higher version codes (more modules) need proportionally more physical space to maintain per-module legibility
- Scanning distance matters: a code that works at arm's length needs to be larger to also work from 1 metre
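The relationship between version and print size is simple arithmetic: the printed width is shared by the code's modules plus the quiet zone on both sides. A small sketch, where the function names are illustrative and the target module size is something you choose for your printer and scanning distance, not a value from the standard:

```python
def module_size_mm(print_side_mm: float, version: int, quiet_zone: int = 4) -> float:
    """Width of one module when the code plus quiet zone fills print_side_mm."""
    modules = 4 * version + 17 + 2 * quiet_zone
    return print_side_mm / modules


def min_print_side_mm(version: int, module_mm: float, quiet_zone: int = 4) -> float:
    """Printed width needed to keep each module at least module_mm wide."""
    modules = 4 * version + 17 + 2 * quiet_zone
    return modules * module_mm


for version in (4, 10, 20):
    size = module_size_mm(25.0, version)
    print(f"version {version} printed at 25 mm: {size:.2f} mm per module")
```

At a fixed 25 mm print width, going from version 4 to version 20 shrinks each module to well under half its size, which is why shortening the payload is often the cheapest way to improve scan reliability.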
Logo Embedding
Embedding a logo is a popular design choice. The rules:
- Use level H error correction - only level H (30% recovery) gives you meaningful protection against the modules hidden by the logo
- Keep the logo under 25-30% of the total area - even though H nominally allows 30%, real-world scanners have variable error correction budgets and partial damage elsewhere in the code may already consume some correction capacity
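- Centre the logo - in low versions the centre of a QR code contains only data and error correction modules, so centring keeps the logo well away from the finder patterns in the corners; some higher versions (version 7, for example) place an alignment pattern at the centre, which the logo will then cover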
- Test extensively - scan with multiple devices and apps before printing at scale
- Avoid covering finder patterns - the three corner squares are structural and cannot be replaced by error correction
Dynamic vs Static QR Codes
Static QR codes encode the destination URL directly. The QR code itself is immutable once printed.
Dynamic QR codes encode a short URL that redirects to the actual destination. The redirect target can be changed at any time without reprinting. Dynamic codes:
- Are shorter (lower version, smaller size, faster scanning)
- Provide scan analytics
- Can be updated (new destination, seasonal campaigns)
- Require an ongoing service (the short URL redirect must remain active)
For permanent use cases (business cards, packaging), dynamic codes are safer because they let you fix destination URLs after printing.
Code Examples
PHP: chillerlan/php-qrcode
<?php
declare(strict_types=1);
require 'vendor/autoload.php';
use chillerlan\QRCode\QRCode;
use chillerlan\QRCode\QROptions;
use chillerlan\QRCode\Common\EccLevel;
// Basic usage - returns a data URI for inline HTML
$data = 'https://example.com';
$dataUri = (new QRCode)->render($data);
echo '<img src="' . $dataUri . '" alt="QR Code">';
// With options
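// Note: option, constant, and method names differ between major versions of
// this library; check the calls below against the version you have installed.
$options = new QROptions([
    'eccLevel'         => EccLevel::H, // High error correction for logo embedding
    'imageBase64'      => true,        // render() returns a base64 data URI
    'imageTransparent' => false,
    'scale'            => 10,          // 10 pixels per module
    'quietzoneSize'    => 4,           // 4 modules quiet zone
    'outputType'       => QRCode::OUTPUT_IMAGE_PNG,
]);
$qrcode = new QRCode($options);
// Render to a file: pass the path as the second argument. Writing the return
// value with file_put_contents() would store the data URI text, not a PNG.
$qrcode->render($data, '/tmp/qr.png');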
// Render to a data URI for HTML embedding
$dataUri = $qrcode->render($data);
// Get version info
$matrix = $qrcode->getMatrix($data);
echo 'Version: ' . $matrix->version() . PHP_EOL;
echo 'Size: ' . $matrix->size() . ' × ' . $matrix->size() . ' modules' . PHP_EOL;
Install with:
composer require chillerlan/php-qrcode
Python: qrcode
import qrcode
from qrcode.constants import ERROR_CORRECT_L, ERROR_CORRECT_M, ERROR_CORRECT_Q, ERROR_CORRECT_H
from qrcode.image.styledpil import StyledPilImage
from qrcode.image.styles.moduledrawers import RoundedModuleDrawer
# Basic usage
img = qrcode.make('https://example.com')
img.save('/tmp/qr_basic.png')
# With full options
qr = qrcode.QRCode(
    version=None,  # Auto-select minimum version
    error_correction=ERROR_CORRECT_H,
    box_size=10,  # Pixels per module
    border=4,  # Modules of quiet zone
)
qr.add_data('https://example.com')
qr.make(fit=True) # Auto-calculate version
img = qr.make_image(fill_color='black', back_color='white')
img.save('/tmp/qr.png')
# Inspect the version selected
print(f'Version: {qr.version}')
print(f'Size: {qr.modules_count} × {qr.modules_count} modules')
# Styled QR code with rounded modules (requires Pillow)
qr_styled = qrcode.QRCode(error_correction=ERROR_CORRECT_H)
qr_styled.add_data('https://example.com')
img_styled = qr_styled.make_image(
    image_factory=StyledPilImage,
    module_drawer=RoundedModuleDrawer(),
)
img_styled.save('/tmp/qr_styled.png')
Install with:
pip install qrcode[pil]
JavaScript: qrcode (npm)
import QRCode from 'qrcode';
// Render to canvas (browser)
async function renderToCanvas(canvasElement, text) {
  await QRCode.toCanvas(canvasElement, text, {
    errorCorrectionLevel: 'H',
    margin: 4,
    scale: 8,
    color: {
      dark: '#000000',
      light: '#ffffff',
    },
  });
}
// Render to data URL (browser or Node.js)
async function renderToDataUrl(text) {
  const dataUrl = await QRCode.toDataURL(text, {
    errorCorrectionLevel: 'M',
    type: 'image/png',
    margin: 4,
    width: 300,
  });
  return dataUrl; // "data:image/png;base64,..."
}
// Render to SVG string (good for server-side rendering)
async function renderToSvg(text) {
  const svg = await QRCode.toString(text, {
    type: 'svg',
    errorCorrectionLevel: 'M',
    margin: 4,
  });
  return svg;
}
// Example usage
const url = 'https://example.com/product/12345';
const svgString = await renderToSvg(url);
console.log(svgString);
Install with:
npm install qrcode
For TypeScript:
npm install qrcode @types/qrcode
Checking QR Code Capacity in JavaScript
import QRCode from 'qrcode';
// QRCode.create() is synchronous and returns the whole symbol object
// (version, modules, segments, ...), so no async/await is needed here.
function analyzeCapacity(data, ecLevel = 'M') {
  const qr = QRCode.create(data, { errorCorrectionLevel: ecLevel });
  const version = qr.version;
  const side = 4 * version + 17;
  console.log(`Data: "${data}" (${data.length} chars)`);
  console.log(`Error correction: ${ecLevel}`);
  console.log(`QR version: ${version}`);
  console.log(`Matrix size: ${side} × ${side} modules`);
}
// Compare error correction levels
for (const level of ['L', 'M', 'Q', 'H']) {
  analyzeCapacity('https://example.com/some/path', level);
}
Common Mistakes
Using level H when you do not need it - if you are not embedding a logo, level H wastes capacity and forces a larger QR version. Use level M for most printed codes.
Forgetting the quiet zone - placing a QR code flush against a coloured border or the edge of a page is a common cause of scan failures. Always keep 4+ modules of white space around the code.
Encoding mixed case URLs in alphanumeric mode - alphanumeric mode only supports uppercase A-Z. A URL like https://example.com/Page contains lowercase letters and requires byte mode. Some tools silently fall back to byte mode; others throw an error.
Testing only on one device - different QR scanner apps implement the standard with different tolerances. Test with at least iOS Camera, Android Camera, and one dedicated scanning app.
Printing too small - a high-version code (v20+) printed at 2cm is essentially unscannable. Use a URL shortener to reduce version, or increase the physical print size.
Why the Internals Matter
QR codes pack a remarkable amount of engineering into a small printed square: the same Reed-Solomon polynomial codes used in space communications, eight mathematically defined masking patterns, four encoding modes with different efficiency profiles, and a penalty-scoring system that automatically picks the best mask for your data.
Understanding these internals helps you make better engineering decisions: choosing the right error correction level, knowing when to shorten URLs, understanding the 30% logo budget, and setting the correct quiet zone. These details separate QR codes that scan reliably from ones that frustrate users.