MD5, SHA-1, SHA-256, SHA-512: Which Hashing Algorithm to Choose
4 March, 2026 Security
Hashing is one of the most misused concepts in software development. Developers reach for MD5 to "hash passwords" or use SHA-1 for "security" without understanding what these algorithms actually provide - or what attacks have already broken them. This article explains cryptographic hash functions from first principles, covers every major algorithm with its current security status, and gives you concrete guidance on which to use for each purpose.
What Is Hashing
A cryptographic hash function takes an input of arbitrary length and produces a fixed-length output called a digest or hash. It is a one-way function: given the output, you cannot recover the input.
Three properties define a cryptographically secure hash function:
Preimage resistance - Given a hash value h, it is computationally infeasible to find any input m such that hash(m) = h. This is the "one-way" property.
Second preimage resistance - Given an input m1, it is computationally infeasible to find a different input m2 such that hash(m1) = hash(m2). You cannot substitute one document for another while keeping the same hash.
Collision resistance - It is computationally infeasible to find any two distinct inputs m1 and m2 such that hash(m1) = hash(m2). Note that collisions necessarily exist (infinite inputs map to finite outputs), but finding them must be impractical.
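To make the point that collisions must exist more concrete, here is a minimal sketch that truncates SHA-256 to a small number of bits and searches for two inputs sharing the same truncated digest. The 24-bit truncation, the `message-N` inputs, and the helper names are assumptions chosen for illustration; against the full 256-bit output the same search would be infeasible.

```python
import hashlib


def truncated_sha256(data: bytes, bits: int) -> int:
    """Return the first `bits` bits of SHA-256(data) as an integer."""
    digest = int.from_bytes(hashlib.sha256(data).digest(), "big")
    return digest >> (256 - bits)


def find_truncated_collision(bits: int = 24) -> tuple[bytes, bytes, int]:
    """Hash counter strings until two of them share a truncated digest."""
    seen: dict[int, bytes] = {}
    counter = 0
    while True:
        message = f"message-{counter}".encode()
        tag = truncated_sha256(message, bits)
        if tag in seen:
            # Two distinct inputs, same truncated hash value
            return seen[tag], message, counter + 1
        seen[tag] = message
        counter += 1


if __name__ == "__main__":
    first, second, attempts = find_truncated_collision(24)
    print(f"{first!r} and {second!r} collide after {attempts} hashes")
```

With a 24-bit output the search typically ends after a few thousand hashes, far fewer than the 2^24 possible values. Every additional output bit makes the search harder, which is why real digests are hundreds of bits long.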
The avalanche effect means a tiny change in input - flipping a single bit - produces a completely different output. This makes hash functions useful for integrity verification: if even one byte changes, the hash changes unpredictably.
SHA-256("hello") = 2cf24dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824
SHA-256("Hello") = 185f8db32271fe25f561a6fc938b2e264306ec304eda518007d1764826381969
One capital letter - completely different hash.
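The avalanche effect can be checked directly. The following sketch hashes both strings with Python's standard `hashlib` module and counts how many of the 256 output bits differ; the helper name `bit_difference` is an assumption introduced for this example.

```python
import hashlib


def bit_difference(a: bytes, b: bytes) -> int:
    """Count the bit positions in which two equal-length digests differ."""
    if len(a) != len(b):
        raise ValueError("digests must have the same length")
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


lower = hashlib.sha256(b"hello").digest()
upper = hashlib.sha256(b"Hello").digest()

print(lower.hex())
print(upper.hex())
print(f"{bit_difference(lower, upper)} of {len(lower) * 8} bits differ")
```

For a well-behaved hash function, roughly half of the output bits are expected to flip for any change in the input, however small.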
MD5
MD5 (Message Digest Algorithm 5) was designed in 1991 by Ron Rivest as an improvement over MD4. It produces a 128-bit (16-byte) digest, typically represented as a 32-character hexadecimal string.
For most of the 1990s, MD5 was the standard choice for data integrity and digital signatures. That era is over.
The break: In 2004, Xiaoyun Wang and colleagues demonstrated practical collisions in MD5. Later refinements of the attack reduced the cost to approximately 2^24 operations - roughly 17 million hash computations - or fewer. On modern hardware, this takes seconds. In 2007, Marc Stevens and colleagues demonstrated "chosen-prefix collision" attacks: given two arbitrary prefixes, attackers can append crafted bytes to make both produce the same MD5 hash. This is far more dangerous than a simple collision, because the attacker controls the meaningful content of both colliding files.
CVE-2004-2761 documents the fundamental weakness of MD5 as used in digital certificates. In 2008, researchers used chosen-prefix MD5 collisions to forge a rogue certification authority certificate - they created a fake CA that browsers trusted. This directly led to the deprecation of MD5 in TLS certificates.
MD5's speed is often cited as an advantage for non-security uses (checksums, hash tables). For any security-relevant purpose, however, MD5 is completely broken and must not be used.
MD5 in PHP:
<?php
// Informational checksum only - never for security
$checksum = md5_file('/path/to/downloaded/file.zip');
echo $checksum; // 32-char hex string
// DO NOT use for passwords or security verification
// md5($password) is catastrophically insecure
SHA-1
SHA-1 (Secure Hash Algorithm 1) was designed in 1995 by the NSA and published by NIST. It produces a 160-bit (20-byte) digest.
SHA-1 succeeded MD5 as the standard algorithm for digital signatures, TLS certificates, and code signing throughout the 2000s. That era ended definitively in 2017.
Theoretical attacks began in 2005 when Xiaoyun Wang's team reduced the theoretical cost of SHA-1 collisions from 2^80 (the brute force bound) to approximately 2^63 operations.
The SHAttered attack (2017) was the first practical SHA-1 collision. Researchers from CWI Amsterdam and Google produced two different PDF files with identical SHA-1 hashes. The compute cost was approximately 9.2 x 10^18 SHA-1 compressions (about 2^63) - estimated at roughly $100,000 in cloud compute at 2017 prices. The cost has only dropped since.
CVE-2005-4900 covers the theoretical weakness. The real-world consequence was the industry-wide removal of SHA-1 from certificate chains: Chrome removed SHA-1 certificate support in January 2017, and Firefox followed in February 2017. TLS certificates signed with SHA-1 are now rejected by all modern browsers.
SHA-1 is still acceptable for non-security purposes where collision resistance is not required (Git object identifiers use SHA-1 - a known limitation the Git project is migrating away from). For anything security-relevant, SHA-1 is broken.
SHA-256
SHA-256 is part of the SHA-2 family, designed by the NSA and published by NIST in 2001. It produces a 256-bit (32-byte) digest.
SHA-256 is the current workhorse of cryptographic security. As of 2024, no practical attacks against SHA-256 are known. The best published attacks apply only to reduced-round variants, so finding a collision in the full function still costs about 2^128 operations (the birthday bound for 256-bit hashes).
SHA-256 is used for:
- Bitcoin - proof-of-work mining and transaction verification
- TLS 1.3 - the default hash for certificate signatures and HMAC
- Code signing - Windows Authenticode, macOS code signing, APT/RPM package verification
- Git (SHA-256 mode) - newer repositories migrating from SHA-1
- S/MIME and PGP - email signing
SHA-256 is the right choice for general-purpose cryptographic hashing where you need a well-audited, widely supported, still-secure algorithm.
SHA-256 in multiple languages:
<?php
declare(strict_types=1);
// PHP - hash() supports all SHA-2 variants
$hash = hash('sha256', 'my data');
echo $hash; // 64-char hex string
// HMAC for message authentication
$hmac = hash_hmac('sha256', $message, $secretKey);
// File hashing
$fileHash = hash_file('sha256', '/path/to/file');
import hashlib
# Basic SHA-256
digest = hashlib.sha256(b'my data').hexdigest()
# HMAC
import hmac
mac = hmac.new(secret_key.encode(), message.encode(), hashlib.sha256).hexdigest()
# File hashing
sha256 = hashlib.sha256()
with open('/path/to/file', 'rb') as f:
    for chunk in iter(lambda: f.read(65536), b''):
        sha256.update(chunk)
file_hash = sha256.hexdigest()
// Browser and Node.js - SubtleCrypto API
async function sha256(data) {
  const encoder = new TextEncoder();
  const buffer = encoder.encode(data);
  const hashBuffer = await crypto.subtle.digest('SHA-256', buffer);
  const hashArray = Array.from(new Uint8Array(hashBuffer));
  return hashArray.map(b => b.toString(16).padStart(2, '0')).join('');
}
// Node.js - crypto module
const crypto = require('crypto');
const hash = crypto.createHash('sha256').update('my data').digest('hex');
SHA-512
SHA-512 is also part of the SHA-2 family, using the same overall Merkle-Damgard construction as SHA-256 but operating on 64-bit words with a larger state. It produces a 512-bit (64-byte) digest.
SHA-512 is slower than SHA-256 on 32-bit hardware, but on 64-bit hardware the gap narrows considerably because SHA-512's 64-bit word operations map directly to native 64-bit instructions. On some 64-bit platforms, SHA-512 is actually faster than SHA-256 for long inputs.
SHA-512 is appropriate when:
- You need a larger security margin for long-lived documents or signatures
- You are identifying an extremely large number of distinct objects and want extra distance from the birthday bound (the bound depends on how many hashes are computed, not on how large each file is)
- The application has requirements specifying 512-bit digests (FIPS compliance, certain government standards)
For most applications, SHA-256's 256-bit output provides ample security. Typical web applications gain no practical security from switching to SHA-512 - it is an option when additional margin is required.
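The speed relationship between SHA-256 and SHA-512 is easy to measure on your own machine. Here is a rough sketch using Python's `hashlib` and `timeit`; the 8 MiB zero-filled payload, the repeat count, and the helper name `seconds_per_hash` are arbitrary assumptions, and the numbers depend heavily on the CPU and the underlying crypto library.

```python
import hashlib
import timeit


def seconds_per_hash(name: str, payload: bytes, repeats: int = 20) -> float:
    """Average wall-clock seconds to hash `payload` once with algorithm `name`."""
    total = timeit.timeit(lambda: hashlib.new(name, payload).digest(), number=repeats)
    return total / repeats


if __name__ == "__main__":
    payload = b"\x00" * (8 * 1024 * 1024)  # 8 MiB of zero bytes (arbitrary test input)
    for name in ("sha256", "sha512"):
        per_hash = seconds_per_hash(name, payload)
        print(f"{name}: {len(payload) / per_hash / 1e6:.0f} MB/s")
```

On CPUs with dedicated SHA-256 instructions the result may favour SHA-256 even on 64-bit hardware, so treat any single measurement as specific to that machine.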
SHA-3
SHA-3 (Keccak) was standardised by NIST in 2015 following a public competition that ran from 2007 to 2012. Unlike SHA-1 and SHA-2, which use the Merkle-Damgard construction, SHA-3 uses a "sponge construction" based on the Keccak-f permutation.
This architectural difference is significant: SHA-3 is immune to the length extension attacks that affect SHA-1 and untruncated SHA-2 functions such as SHA-256 and SHA-512. A length extension attack allows an attacker who knows hash(secret || message) and the length of secret || message to compute hash(secret || message || padding || extension) without knowing the secret. SHA-3's sponge construction keeps part of its internal state out of the output, which rules this attack out.
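Length extension matters in practice when a hash is used as a makeshift message authentication code. The sketch below contrasts the vulnerable secret-prefix pattern with HMAC, which is not affected by length extension even when built on SHA-256. The secret, the message, and the function names are hypothetical values chosen for illustration, and the snippet does not perform the attack itself.

```python
import hashlib
import hmac


def naive_mac(secret: bytes, message: bytes) -> str:
    """Secret-prefix construction: open to length extension with SHA-256."""
    return hashlib.sha256(secret + message).hexdigest()


def hmac_tag(secret: bytes, message: bytes) -> str:
    """HMAC-SHA-256: not affected by length extension."""
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def verify(secret: bytes, message: bytes, tag: str) -> bool:
    """Recompute the HMAC tag and compare it in constant time."""
    return hmac.compare_digest(hmac_tag(secret, message), tag)


secret = b"hypothetical-shared-secret"
message = b"amount=100&to=alice"
tag = hmac_tag(secret, message)

print(verify(secret, message, tag))                   # True
print(verify(secret, message + b"&to=mallory", tag))  # False
```

`hmac.compare_digest` is used instead of `==` so that the comparison time does not leak how many leading characters of a forged tag were correct.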
SHA-3 provides:
- SHA3-224, SHA3-256, SHA3-384, SHA3-512 - fixed-output hash functions
- SHAKE128, SHAKE256 - extendable-output functions (XOFs) with variable-length output
SHA-3 is not "better" than SHA-2 for most purposes today - both are secure, and SHA-2 has vastly more deployment, library support, and hardware acceleration. SHA-3's value is as an alternative construction: if a catastrophic flaw were discovered in the Merkle-Damgard construction (affecting both SHA-1 and SHA-2), SHA-3 would remain unaffected.
<?php
// PHP 7.1+ supports SHA-3
$hash = hash('sha3-256', 'my data');
$hash512 = hash('sha3-512', 'my data');
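For comparison, here is a short Python sketch of the same fixed-output functions plus a SHAKE extendable-output function, using only the standard `hashlib` module. The input string and variable names are illustrative.

```python
import hashlib

data = b"my data"

# Fixed-output SHA-3 functions
sha3_256_hex = hashlib.sha3_256(data).hexdigest()  # 64 hex characters
sha3_512_hex = hashlib.sha3_512(data).hexdigest()  # 128 hex characters

# SHAKE is an extendable-output function: the caller chooses the length in bytes
shake_16 = hashlib.shake_128(data).hexdigest(16)   # 16 bytes -> 32 hex characters
shake_32 = hashlib.shake_128(data).hexdigest(32)   # 32 bytes -> 64 hex characters

print(sha3_256_hex)
print(shake_32)
# A shorter SHAKE output is a prefix of a longer one for the same input
print(shake_32.startswith(shake_16))
```

Because shorter SHAKE outputs are prefixes of longer ones, outputs of different lengths from the same input are related and should not be treated as independent values.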
Collision Attacks: The Math
The birthday problem provides the mathematical basis for collision attacks. In a hash function with k-bit output, the number of possible outputs is 2^k. The birthday paradox tells us that if you generate random hashes, you expect to find a collision after approximately 2^(k/2) operations.
The collision probability formula:
P(collision) ≈ 1 - e^(-n²/2^(k+1))
Where:
- n is the number of hashes computed
- k is the number of output bits
- e is Euler's number (~2.718)
For SHA-256 (k = 256): you need approximately 2^128 (about 3.4 x 10^38) hashes to find a collision with reasonable probability. At 10^18 hashes per second (one exahash per second, a rate the entire Bitcoin network reached in the mid-2010s), this takes about 10^13 years - hundreds of times the age of the universe.
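The collision probability formula is simple to evaluate numerically. The following sketch implements the approximation as written, with `n` and `k` as defined above; the function names are assumptions introduced for this example.

```python
import math


def collision_probability(n: int, k: int) -> float:
    """Approximate P(collision) = 1 - exp(-n^2 / 2^(k+1)) for n hashes of k bits."""
    exponent = (n * n) / (2 ** (k + 1))  # exact integers, converted to float once
    return -math.expm1(-exponent)        # 1 - e^(-x), accurate even for tiny x


def hashes_for_even_odds(k: int) -> float:
    """Number of hashes at which the approximation reaches P = 0.5."""
    return math.sqrt(2 ** (k + 1) * math.log(2))


if __name__ == "__main__":
    for name, k in (("MD5", 128), ("SHA-1", 160), ("SHA-256", 256)):
        n = 2 ** (k // 2)  # the birthday bound for a k-bit hash
        p = collision_probability(n, k)
        print(f"{name}: P(collision) after 2^{k // 2} hashes = {p:.3f}")
```

At exactly n = 2^(k/2) the exponent is 1/2, so the probability is 1 - e^(-1/2), about 39%, regardless of k. This is why 2^(k/2) is quoted as the point where collisions become likely.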
For MD5 (k = 128): the birthday bound is 2^64, but refinements of Wang's 2004 attack find collisions in about 2^24 operations - a factor of 2^40, roughly a trillion times, faster than the birthday bound. This is what "collision resistance broken" means: attackers found a mathematical shortcut that bypasses the birthday bound entirely.
"Collision resistance broken" does not automatically mean preimage resistance is broken. MD5's preimage resistance remains intact in practice: the best known preimage attack needs about 2^123 operations, only slightly below the ideal 2^128. But broken collision resistance means an attacker can create two different documents with the same hash - sufficient to forge signatures, certificates, and integrity checks.
Hashing vs Encryption
This is a fundamental misconception that causes real security vulnerabilities.
Hashing is one-way and irreversible. Given hash("password123"), you cannot recover "password123". The hash function destroys information by design. No key, no decryption, no reversal.
Encryption is two-way and reversible. Given encrypt(key, "secret message"), you can recover "secret message" using the key. Encryption preserves information - it transforms it.
They serve completely different purposes:
- Hash functions verify integrity (has this data changed?) without revealing the original data; combined with a secret key (HMAC) or a digital signature, they also support authenticity
- Encryption provides confidentiality - only parties with the key can read the data
When a developer "hashes passwords for security," they are reaching for the right kind of tool, provided they use a dedicated password hashing function rather than a fast general-purpose hash. When a developer "encrypts passwords so we can recover them for the user," they have introduced a critical vulnerability - stored encrypted passwords can be mass-decrypted if the key is compromised. Passwords should never be recoverable.
The common confusion: both hashing and encryption "scramble" data so it is not human-readable. But hashing uses no key and cannot be undone, while encryption can be reversed by anyone who holds the key.
Password Hashing
Using MD5, SHA-1, SHA-256, or SHA-512 directly on passwords is a critical security error. Here is why:
Speed is the enemy. General-purpose hash functions are designed to be fast. SHA-256 can compute billions of hashes per second on modern GPU hardware. An attacker with a leaked password database can attempt every word in a dictionary, every common password, and large swaths of the input space in hours.
Rainbow tables. Precomputed tables that map hashes back to their inputs allow near-instant lookup for unsalted hashes. Publicly available MD5 tables of a few hundred gigabytes cover large alphanumeric password spaces. Without salting, md5("password") always produces the same hash, making rainbow table attacks trivial.
What password hashing requires:
- Intentional slowness - the algorithm must be configurable to take hundreds of milliseconds, making brute force impractical.
- Salting - a unique random value added to each password before hashing, eliminating rainbow table attacks.
- Work factors - the slowness must be adjustable as hardware improves.
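The three requirements above can be sketched with Python's standard library alone. Note the substitution: this example uses scrypt (via `hashlib.scrypt`), which is not one of the algorithms discussed in this article and is chosen only because it needs no third-party package. The parameter values and function names are assumptions for illustration, not tuned recommendations.

```python
import hashlib
import hmac
import os

# Hypothetical parameters for illustration; tune them for your own hardware.
SCRYPT_N = 2 ** 14  # CPU/memory work factor (must be a power of two)
SCRYPT_R = 8        # block size
SCRYPT_P = 1        # parallelisation factor


def hash_password(password: str) -> tuple[bytes, bytes]:
    """Return (salt, derived_key) using a fresh random salt for each password."""
    salt = os.urandom(16)
    key = hashlib.scrypt(password.encode("utf-8"), salt=salt,
                         n=SCRYPT_N, r=SCRYPT_R, p=SCRYPT_P, dklen=32)
    return salt, key


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Recompute the key with the stored salt and compare in constant time."""
    key = hashlib.scrypt(password.encode("utf-8"), salt=salt,
                         n=SCRYPT_N, r=SCRYPT_R, p=SCRYPT_P, dklen=32)
    return hmac.compare_digest(key, expected)


if __name__ == "__main__":
    salt, key = hash_password("correct horse battery staple")
    print(verify_password("correct horse battery staple", salt, key))
    print(verify_password("wrong guess", salt, key))
```

Hashing the same password twice yields different salts and therefore different keys, which is what defeats precomputed tables. Raising `SCRYPT_N` increases the cost of every guess, for the attacker and for the server alike.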
bcrypt
bcrypt (1999) was the first widely-adopted password hashing algorithm designed for this purpose. Its cost factor is a base-2 logarithm: cost 12 means 2^12 = 4,096 rounds of key expansion, so each increment doubles the work:
<?php
declare(strict_types=1);
// Password hashing with bcrypt
$hash = password_hash($password, PASSWORD_BCRYPT, ['cost' => 12]);
// Verification
$isValid = password_verify($password, $hash);
// Check if rehashing is needed (if cost factor changed)
if (password_needs_rehash($hash, PASSWORD_BCRYPT, ['cost' => 12])) {
    $hash = password_hash($password, PASSWORD_BCRYPT, ['cost' => 12]);
}
Cost 12 produces approximately 250ms per hash on typical server hardware (2024). This means an attacker can attempt roughly 4 guesses per second per CPU core - a dramatic reduction from SHA-256's billions.
Bcrypt's limitation: it truncates passwords at 72 bytes. Longer passwords are not rejected - they are silently truncated, which is a subtle security issue.
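Because the limit is counted in bytes rather than characters, passwords containing non-ASCII characters reach it sooner than their length suggests. A minimal sketch of a pre-check, assuming passwords are encoded as UTF-8 before hashing (the helper name is hypothetical):

```python
BCRYPT_MAX_BYTES = 72  # bcrypt ignores input beyond this many bytes


def exceeds_bcrypt_limit(password: str) -> bool:
    """True if the UTF-8 encoding of `password` is longer than 72 bytes."""
    return len(password.encode("utf-8")) > BCRYPT_MAX_BYTES


print(exceeds_bcrypt_limit("a" * 72))       # False: exactly 72 bytes
print(exceeds_bcrypt_limit("a" * 73))       # True: 73 bytes
print(exceeds_bcrypt_limit("\u00e9" * 40))  # True: 40 characters, 80 bytes in UTF-8
```

An application can use a check like this to reject over-long passwords explicitly instead of letting them be truncated silently.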
Argon2id
Argon2 won the Password Hashing Competition in 2015 and is OWASP's current recommendation. It has three variants: Argon2d (resistant to GPU attacks), Argon2i (resistant to side-channel attacks), Argon2id (combines both - use this).
Argon2id parameters:
- time - number of iterations
- memory - amount of RAM consumed (makes GPU attacks expensive)
- parallelism - number of parallel threads
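The parameters used in the examples below (time=3, memory=65536 KiB, i.e. 64 MiB, parallelism=4) match the second recommended option in RFC 9106. OWASP's Password Storage Cheat Sheet lists lower minimums, such as 19 MiB of memory, 2 iterations, and parallelism 1; check the current guidance before fixing values.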
<?php
declare(strict_types=1);
// Argon2id - recommended by OWASP
$hash = password_hash($password, PASSWORD_ARGON2ID, [
    'time_cost' => 3,
    'memory_cost' => 65536, // 64 MB
    'threads' => 4,
]);
$isValid = password_verify($password, $hash);
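from argon2 import PasswordHasher
from argon2.exceptions import VerifyMismatchError

ph = PasswordHasher(
    time_cost=3,
    memory_cost=65536,  # 64 MiB (value is in KiB)
    parallelism=4,
)
hash_str = ph.hash(password)
# verify() returns True on success and raises VerifyMismatchError on failure
try:
    is_valid = ph.verify(hash_str, password)
except VerifyMismatchError:
    is_valid = False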
The memory cost is Argon2's key advantage over bcrypt: it forces attackers to use large amounts of RAM per attempt, which does not scale efficiently on GPUs or ASICs (which have limited memory per core).
Algorithm Comparison Table
| Algorithm | Output Bits | Speed (relative) | Collision Resistance | Use Case |
|---|---|---|---|---|
| MD5 | 128 | Very fast | Broken (2^24) | Legacy checksums only |
| SHA-1 | 160 | Fast | Broken (2^63) | Legacy only, being deprecated |
| SHA-256 | 256 | Moderate | Secure (~2^128) | General purpose, TLS, code signing |
| SHA-512 | 512 | Moderate (fast on 64-bit) | Secure (~2^256) | High-security margin needed |
| SHA3-256 | 256 | Moderate | Secure (~2^128) | Alternative to SHA-256, no length extension |
| bcrypt | 184 | Slow (by design) | N/A - password KDF | Password storage |
| Argon2id | Variable | Slow (by design) | N/A - password KDF | Password storage (recommended) |
Which Algorithm to Choose
| Use Case | Algorithm | Notes |
|---|---|---|
| Password storage | Argon2id | OWASP recommendation. bcrypt is acceptable fallback. |
| Data integrity check (trusted source) | SHA-256 | Standard choice, widely supported. |
| Digital signatures | SHA-256 or SHA-512 | Per your PKI's requirements; SHA-256 is typical. |
| HMAC message authentication | SHA-256 or SHA-512 | SHA-256 is standard; SHA-512 for higher margin. |
| File checksums (downloads) | SHA-256 | Replace MD5 and SHA-1 checksums with SHA-256. |
| TLS certificates | SHA-256 | SHA-1 is rejected by browsers; SHA-256 is required. |
| Bitcoin / blockchain | SHA-256 | Domain-specific; already standardised. |
| Non-security checksums (hash tables) | MD5 or xxHash | MD5 is fast and fine for non-security uses. |
| Government / FIPS compliance | SHA-256 or SHA3-256 | Confirm specific FIPS requirements. |
| Untrusted input, no length extension | SHA3-256 | Or use HMAC with SHA-256 instead. |
Decision Criteria
Use SHA-256 as your default for any new security-relevant hashing unless you have a specific reason not to.
Use Argon2id for any password storage. Never use SHA-256 (or any fast hash) for passwords.
Avoid MD5 and SHA-1 for any security purpose. They are broken for collision resistance and deprecated across the industry.
Use SHA-512 when your use case specifically requires a larger output (certain compliance requirements, or hashing data that will outlive current security margins for decades).
Use SHA-3 when you need immunity to length extension attacks without using HMAC, or as a defense-in-depth measure when you want an alternative construction family.
What to Actually Use
The gap between "hashing" and "secure hashing" is large, and the difference between "secure hashing" and "password hashing" is even larger. MD5 and SHA-1 are cryptographically broken and must not be used for any security purpose. SHA-256 is the correct general-purpose choice. Argon2id is the correct password storage choice.
The birthday problem math makes it clear why bit count matters - doubling the output bits squares the attack cost. But mathematical security means nothing if you choose the wrong algorithm for the job: using SHA-256 for passwords is nearly as dangerous as using MD5, because speed is the vulnerability.
Generate and verify hashes instantly with the hash generator. Compute SHA-256, SHA-512, MD5 checksums, and compare hash values directly in your browser without sending data to any server.