How to Verify File Integrity with Hashes

When you download software, firmware, or important documents, how do you know the file is exactly what the publisher intended? File hashing gives you a cryptographic fingerprint: a short string that is, for practical purposes, unique to the file and changes if even a single byte is different. Verifying a hash takes seconds and can save you from installing tampered software, flashing broken firmware that bricks a device, or trusting a corrupted backup that fails when you actually need to restore from it.

A short history of hash functions

The idea of a checksum is older than computing itself: accountants used arithmetic digit-sums to detect transcription errors long before there were files to verify. Cryptographic hashes appeared in the late 1980s. Ron Rivest published MD4 in 1990 and MD5 in 1991; MD5 went on to serve as the de facto checksum for two decades. NIST standardised SHA-0 in 1993, then withdrew it after a flaw was discovered and replaced it with SHA-1 in 1995. The SHA-2 family (SHA-256, SHA-384, and SHA-512, with SHA-224 added in 2004) followed in 2001 to address the looming weakness of SHA-1.

Each generation was retired by attacks that turned out to be cheaper than expected. MD5 collisions were demonstrated in 2004, made practical for digital certificates in 2008, and are now trivial. SHA-1 fell to the SHAttered attack in 2017, when researchers at CWI Amsterdam and Google produced two different PDFs with identical SHA-1 digests. SHA-2 has held up since 2001, and SHA-3 (Keccak, standardised in 2015) provides a structurally different fallback in case SHA-2 ever breaks. The lesson is that hash algorithms have a shelf life; the right algorithm today may need replacing within a decade.

How file hashing works

A hash function reads every byte of a file and produces a fixed-length string. The same file always produces the same hash. Change one byte, and the hash changes completely. This is called the avalanche effect, and it is the property that makes hashes useful for verification.

Example:
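
A minimal sketch in Python, assuming only the standard library's hashlib; the sample message and the helper names are illustrative, not part of any tool described here. It hashes a short message and a copy with a single bit flipped, then counts how many of the 256 output bits differ (for a well-behaved hash, typically around half).

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as 64 hex characters."""
    return hashlib.sha256(data).hexdigest()


def flip_lowest_bit(data: bytes) -> bytes:
    """Return a copy of data with the lowest bit of its last byte flipped."""
    return data[:-1] + bytes([data[-1] ^ 0x01])


def differing_bits(hex_a: str, hex_b: str) -> int:
    """Count the bit positions at which two equal-length hex digests differ."""
    return bin(int(hex_a, 16) ^ int(hex_b, 16)).count("1")


original = b"The quick brown fox jumps over the lazy dog"
modified = flip_lowest_bit(original)  # differs from original by exactly one bit

hash_a = sha256_hex(original)
hash_b = sha256_hex(modified)

print(hash_a)
print(hash_b)
print(differing_bits(hash_a, hash_b), "of 256 output bits differ")
```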

The two hashes share no recognisable pattern even though the files differ by a single bit. That sensitivity is what makes verification possible: generate the hash, compare it to the published hash, and you know instantly whether the file is authentic.

Internally, hash functions in the SHA-2 family split the padded input into fixed-size blocks (64 bytes for SHA-256, 128 for SHA-512), feed each block through a compression function, and chain the state forward. The output is derived from the final state after the last block has been mixed in. Because the chain depends on every byte, an attacker has no shortcut for changing the content while keeping the hash the same.
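
The chained, block-by-block design is also why a large file can be hashed without loading it into memory. The sketch below, assuming Python's standard hashlib (the function name `hash_stream` and the sample data are illustrative), feeds a stream to the hasher in chunks and checks that the result equals a one-shot hash of the same bytes.

```python
import hashlib
import io


def hash_stream(stream, algorithm="sha256", chunk_size=1024 * 1024):
    """Hash a binary stream in fixed-size chunks and return the hex digest."""
    hasher = hashlib.new(algorithm)
    # Each update() mixes more bytes into the running state; hashlib handles
    # the internal 64- or 128-byte block boundaries and the final padding.
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        hasher.update(chunk)
    return hasher.hexdigest()


# Internal block sizes reported by hashlib, in bytes.
print(hashlib.sha256().block_size)  # 64
print(hashlib.sha512().block_size)  # 128

data = bytes(range(256)) * 1000  # 256,000 bytes of sample data
streamed = hash_stream(io.BytesIO(data), chunk_size=4096)
one_shot = hashlib.sha256(data).hexdigest()
print(streamed == one_shot)  # True: chunking does not change the result
```

For a file on disk, the same function can be called as `hash_stream(open(path, "rb"))`, ideally inside a `with` block so the file is closed afterwards.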

How to verify a file

  1. Find the official hash: the software publisher typically lists file hashes on their download page, often labelled "SHA-256 checksum" or provided as a "SHA256SUMS" file.
  2. Select your downloaded file: choose the file in the hash calculator. The hash is computed locally in your browser; the file never leaves your machine.
  3. Compare the hashes: if your calculated hash matches the official hash exactly, the file is authentic and uncorrupted. If the strings are long, paste both into a text diff rather than comparing by eye.
  4. Match the algorithm: SHA-256 hashes only match other SHA-256 hashes. If the publisher gives you SHA-512, generate SHA-512 too; mixing algorithms is the most common "they do not match" mistake.
  5. Verify the published hash itself when possible: a signed SHA256SUMS file (signed with the publisher's GPG key) tells you the hash list was not tampered with, which the bare hash on a download page does not.
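
The compute-and-compare part of these steps can be sketched in Python as follows. This is an illustration under stated assumptions, not the calculator's own code: `file_digest_hex` and `verify_file` are hypothetical helper names, and the length check is only a cheap guard against comparing digests from different algorithms (step 4). It does not verify a signature on the published hash (step 5).

```python
import hashlib


def file_digest_hex(path, algorithm="sha256", chunk_size=1024 * 1024):
    """Hash the file at path in chunks and return the hex digest."""
    hasher = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def verify_file(path, published_hex, algorithm="sha256"):
    """Return True if the file's digest equals the published hex digest."""
    # Tolerate surrounding whitespace and upper-case hex from copy-paste.
    published = published_hex.strip().lower()

    # A digest of the wrong length almost always means the publisher used a
    # different algorithm, so fail loudly instead of reporting a mismatch.
    expected_len = hashlib.new(algorithm).digest_size * 2
    if len(published) != expected_len:
        raise ValueError(
            f"{algorithm} digests have {expected_len} hex characters, "
            f"but the published value has {len(published)}"
        )

    return file_digest_hex(path, algorithm) == published
```

A call such as `verify_file("download.iso", published_value)` (both names are placeholders) returns True only when every character matches.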

When to verify file hashes

Supported algorithms

| Algorithm | Hash length | Output bits | Recommendation |
| --- | --- | --- | --- |
| MD5 | 32 hex chars | 128 | Legacy only; broken; use only to detect accidental corruption |
| SHA-1 | 40 hex chars | 160 | Legacy only; broken; do not trust for security |
| SHA-224 | 56 hex chars | 224 | Niche; prefer SHA-256 |
| SHA-256 | 64 hex chars | 256 | Recommended general-purpose standard |
| SHA-384 | 96 hex chars | 384 | High security; used in TLS 1.3 cipher suites |
| SHA-512 | 128 hex chars | 512 | Maximum strength of SHA-2; fast on 64-bit CPUs |
| SHA3-256 | 64 hex chars | 256 | Different internal design from SHA-2; future-proof |
| BLAKE2b/BLAKE3 | 128 or 64 hex chars by default | 512 (BLAKE2b) or 256 (BLAKE3) by default | Fastest modern hashes; BLAKE2 is used in WireGuard and Argon2 |
| CRC32 | 8 hex chars | 32 | Error detection only; not a security hash |

If you have a choice, SHA-256 is the right default. SHA-512 can be slightly faster on 64-bit machines that lack dedicated SHA-256 instructions (the algorithm is tuned for 64-bit words). Reach for BLAKE3 when throughput matters most: it can saturate modern NVMe SSDs in ways SHA-256 cannot.
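
The output sizes in the table above can be checked against Python's standard library. This sketch assumes a stock CPython build in which all of the listed algorithms are available (some locked-down builds disable MD5); BLAKE3 is omitted because it is not in the standard library, and `output_sizes` is an illustrative helper name.

```python
import hashlib
import zlib

ALGORITHMS = [
    "md5", "sha1", "sha224", "sha256",
    "sha384", "sha512", "sha3_256", "blake2b",
]


def output_sizes(name):
    """Return (hex characters, bits) for a hashlib algorithm's default digest."""
    digest_size = hashlib.new(name).digest_size  # size in bytes
    return digest_size * 2, digest_size * 8


for name in ALGORITHMS:
    hex_chars, bits = output_sizes(name)
    print(f"{name:<9} {hex_chars:>3} hex chars  {bits:>3} bits")

# CRC32 is not in hashlib; zlib.crc32 returns a 32-bit integer checksum,
# which formats as 8 hex characters.
crc = zlib.crc32(b"example")
print(f"crc32     {len(format(crc, '08x')):>3} hex chars   32 bits")
```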

Hash vs digital signature

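A hash tells you whether a file changed. A digital signature tells you whether it changed and who vouched for it. Conceptually, the publisher computes the file's hash and transforms it with their private key; you use their public key to check that the signature corresponds to the hash you recompute yourself. With RSA this is often described as encrypting the hash with the private key and decrypting it with the public key; other schemes, such as ECDSA and Ed25519, work differently but give the same guarantee. If the check passes, you know the file is intact and that the publisher (or someone with their private key) approved it.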
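
To make the hash-then-sign idea concrete, here is a deliberately toy sketch of textbook RSA in Python. Everything in it is an assumption for illustration: the primes are publicly known Mersenne primes, there is no padding, and the function names are made up, so it offers no security at all. Real verification should go through GPG or an established cryptography library.

```python
import hashlib

# TOY parameters for illustration only: two publicly known Mersenne primes.
# Real keys use secret random primes and a padding scheme such as RSA-PSS.
P = 2**521 - 1
Q = 2**607 - 1
N = P * Q                          # public modulus
E = 65537                          # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))  # private exponent (needs Python 3.8+)


def digest_as_int(data: bytes) -> int:
    """SHA-256 digest of data, interpreted as a big-endian integer."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big")


def sign(data: bytes) -> int:
    """Publisher side: transform the hash with the private exponent."""
    return pow(digest_as_int(data), D, N)


def verify(data: bytes, signature: int) -> bool:
    """User side: recover the signed hash with the public key and compare."""
    return pow(signature, E, N) == digest_as_int(data)


release = b"contents of the released file"
signature = sign(release)
print(verify(release, signature))                   # True
print(verify(release + b" (tampered)", signature))  # False
```

Only the holder of `D` can produce a value that passes `verify`, which is why an attacker who swaps the file and the displayed hash still cannot produce a matching signature.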

When a download page shows both a SHA-256 hash and a .sig or .asc file, the hash protects against corruption and accidental modification, but the signature is what protects against an attacker who breached the download server. The attacker can swap the file and update the displayed hash; they cannot forge a valid signature without the publisher's key.

Common pitfalls

Alternative tools and contexts

A web hash calculator is the fastest path when you have one file to check. For repeated use or scripting, command-line tools are the standard.

| Tool | Platform | Strength | Watch out for |
| --- | --- | --- | --- |
| Web hash calculator | Browser | No install, file never uploads | One file at a time |
| sha256sum | Linux | Fast, scriptable, GNU coreutils | --check reads SHA256SUMS files |
| shasum -a 256 | macOS, BSD | Bundled, same output format | Different binary name than Linux |
| Get-FileHash | Windows PowerShell | First-class on Windows | Output format differs from sha256sum |
| certutil -hashfile | Windows cmd | Available on every Windows | Verbose output needs parsing |
| openssl dgst -sha256 | Cross-platform | If you already have OpenSSL | Slower than dedicated tools |
| b3sum | Cross-platform | BLAKE3, multi-GB/s throughput | Newer, less ubiquitous |
| rhash | Cross-platform | Computes many algorithms at once | Extra install |

In CI/CD pipelines the same task usually runs as sha256sum file > file.sha256 during build and sha256sum -c file.sha256 during verify, sometimes wrapped in a signed manifest. The principle (compute, publish, compare on retrieval) is identical to what the browser tool does interactively.
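
The same build-then-verify flow can be sketched in Python for environments where sha256sum is not available. This is an approximation under stated assumptions: `write_manifest` and `check_manifest` are hypothetical helper names, the manifest uses the common "<hex digest>  <file name>" line layout, and the escaping that sha256sum applies to unusual file names is not handled.

```python
import hashlib


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Hash the file at path in chunks and return the hex digest."""
    hasher = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def write_manifest(paths, manifest_path):
    """Build step: write one '<hex digest>  <file name>' line per file."""
    with open(manifest_path, "w", encoding="utf-8") as out:
        for path in paths:
            out.write(f"{sha256_of_file(path)}  {path}\n")


def check_manifest(manifest_path):
    """Verify step: return {file name: True/False} for each manifest line."""
    results = {}
    with open(manifest_path, encoding="utf-8") as manifest:
        for line in manifest:
            line = line.rstrip("\n")
            if not line:
                continue
            expected, name = line.split(maxsplit=1)
            if name.startswith("*"):
                name = name[1:]  # a leading '*' marks binary mode
            results[name] = sha256_of_file(name) == expected.lower()
    return results
```

A pipeline would call `write_manifest` once at build time, publish the manifest (ideally signed) next to the artifacts, and fail the verify stage if any value returned by `check_manifest` is False.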

Privacy and the hash calculator

The hash calculator runs entirely in your browser. The file you select is read with the FileReader API and fed through the Web Crypto SubtleCrypto interface, and the resulting hash is shown to you. The file's bytes never travel to our servers: there is no upload, no log of which files were hashed, and no analytics on file sizes or extensions.

For sensitive material (contracts, medical records, private keys), the difference between a tool that uploads and one that hashes locally is the difference between trusting one third party and trusting none. A hash is a tiny output (64 hex characters for SHA-256), but the input it summarises can be very revealing. Keeping that input client-side is the right default for any verification task.

Frequently Asked Questions

How do I compare a file hash to the official one?

After generating the hash, compare it character by character with the hash published by the file's source (usually on the download page). If every character matches, the file is authentic and uncorrupted. A difference of even one character means the file is not the one the publisher hashed, or that the two hashes were generated with different algorithms.

Which hash algorithm should I use?

SHA-256 is the standard for file verification. Use whichever algorithm the publisher provides. If you have a choice, SHA-256 offers a good balance of security and performance.

Can a corrupted file have the correct hash?

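It is theoretically possible but statistically negligible with SHA-256. For a randomly corrupted file to produce the same 256-bit hash, the corruption would have to hit a 1-in-2^256 coincidence, so for all practical purposes matching SHA-256 hashes mean identical files. Deliberately constructed collisions are a separate matter and are the reason to avoid MD5 and SHA-1 for security.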

Is my file uploaded to a server?

No. The hash is calculated entirely in your browser. Your file never leaves your device, making it safe for any file including sensitive documents.

What is the difference between a hash and a digital signature?

A hash proves a file has not changed since the hash was computed; anyone can verify it. A digital signature proves both integrity and identity: the publisher signs the hash with their private key, and you verify with their public key. Hashes alone do not protect against an attacker who replaced both the file and the published hash on the same compromised mirror.

Why are MD5 and SHA-1 considered insecure?

Researchers have demonstrated practical collision attacks for both. In 2017 researchers at CWI Amsterdam and Google produced two different PDFs with identical SHA-1 hashes (the SHAttered attack), and MD5 collisions can be generated in seconds on a laptop. For deliberate-tamper detection use SHA-256 or stronger; MD5 and SHA-1 still work for catching accidental corruption but should never be trusted as security boundaries.