Cryptographic hash function

A cryptographic hash function at work. Even small changes in the source input (here in the word "over") drastically change the resulting output, by the so-called avalanche effect

A cryptographic hash function is a hash function which takes an input (or 'message') and returns a fixed-size string of bytes. The string is called the 'hash value', 'message digest', 'digital fingerprint', 'digest' or 'checksum'.
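
The fixed-size output can be seen in a minimal sketch. It assumes Python's standard hashlib module and uses SHA-256 as the example function; the helper name digest_hex is made up for illustration.

```python
import hashlib


def digest_hex(message: bytes) -> str:
    """Return the SHA-256 digest of message as a hexadecimal string."""
    return hashlib.sha256(message).hexdigest()


# Inputs of very different lengths all produce a 32-byte (256-bit) digest.
for message in (b"", b"abc", b"a" * 1_000_000):
    d = digest_hex(message)
    print(len(message), "input bytes ->", len(d) // 2, "digest bytes:", d[:16] + "...")
```

Whatever the input length, the digest printed here is always 32 bytes long.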

The ideal hash function has three main properties:

It is extremely easy to calculate a hash for any given data.
It is computationally infeasible to find a message that has a given hash.
It is extremely unlikely that two slightly different messages will have the same hash.
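
The effect of a small change in the input can be sketched by counting how many bits of the digest change. This example assumes Python's hashlib with SHA-256; the two messages and the helper bit_difference are hypothetical and chosen only for illustration.

```python
import hashlib


def bit_difference(a: bytes, b: bytes) -> int:
    """Count the bit positions in which two equal-length byte strings differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


# The two messages differ in a single character ("over" vs. "ouer").
m1 = b"The quick brown fox jumps over the lazy dog"
m2 = b"The quick brown fox jumps ouer the lazy dog"

d1 = hashlib.sha256(m1).digest()
d2 = hashlib.sha256(m2).digest()

changed = bit_difference(d1, d2)
print(f"{changed} of {len(d1) * 8} digest bits differ")
```

For a well-behaved hash function, roughly half of the digest bits are expected to change.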

Uses

Functions with these properties are used as hash functions for a variety of purposes, not only in cryptography. Practical applications include message integrity checks, digital signatures, authentication, and various information security applications.^[1]

A hash function takes a string of any length as input and produces a fixed-length string which acts as a kind of "signature" for the data provided. A person who knows only the "hash value" cannot work out the original message, but a person who knows the original message can show that the "hash value" was created from that message.
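
Checking a message against a known hash value might look like the sketch below. It assumes Python's hashlib and hmac modules and SHA-256; the function names fingerprint and matches and the sample messages are hypothetical.

```python
import hashlib
import hmac


def fingerprint(message: bytes) -> str:
    """Return the SHA-256 hash value of message as a hexadecimal string."""
    return hashlib.sha256(message).hexdigest()


def matches(message: bytes, expected_hex: str) -> bool:
    """Recompute the hash of message and compare it with the expected value."""
    # compare_digest avoids leaking, through timing, where the strings differ.
    return hmac.compare_digest(fingerprint(message), expected_hex)


published = fingerprint(b"meet at noon")

print(matches(b"meet at noon", published))  # the original message
print(matches(b"meet at nine", published))  # a different message
```

Only the hash value needs to be published. Note that a short or easily guessed message can still be found by trying candidates, so this hides the message only when it is hard to guess.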

A cryptographic hash function should behave as much as possible like a random function while still being deterministic and efficiently computable. A cryptographic hash function is considered "insecure" from a cryptographic point of view if either of the following is computationally feasible:

Finding a (previously unseen) message that matches a given hash value.
Finding "collisions", in which two different messages have the same hash value.

An attacker who can perform either of the above computations can use it to substitute an unauthorized message for an authorized one.^[2]
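
Why the digest must be long enough can be illustrated with a deliberately weakened hash. The sketch below assumes Python's hashlib and keeps only the first 3 bytes (24 bits) of SHA-256; weak_hash and find_collision are hypothetical names, and the search says nothing about the strength of full SHA-256.

```python
import hashlib
from itertools import count


def weak_hash(message: bytes, n_bytes: int = 3) -> bytes:
    """Deliberately weakened hash: only the first n_bytes of SHA-256."""
    return hashlib.sha256(message).digest()[:n_bytes]


def find_collision(n_bytes: int = 3) -> tuple[bytes, bytes]:
    """Search for two different messages with the same truncated hash."""
    seen: dict[bytes, bytes] = {}  # truncated hash -> first message seen
    for i in count():
        message = str(i).encode()
        h = weak_hash(message, n_bytes)
        if h in seen:
            return seen[h], message
        seen[h] = message


a, b = find_collision()
print(a, b, weak_hash(a).hex())
```

With a 24-bit output, a collision is typically found after a few thousand attempts. Each additional bit of output makes this kind of search harder, which is why practical digests are hundreds of bits long.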

Ideally, it should be impossible to find two different messages whose digests ("hash values") are similar. Also, one would not want an attacker to be able to learn anything useful about a message from its digest. Of course, the attacker learns at least one piece of information, the digest itself, by which the attacker can recognise if the same message occurs again.

For many years, the two most commonly used hash functions in standards and applications were MD5 and SHA-1.

In 2005, security defects were identified in both algorithms, suggesting that a mathematical weakness might exist that makes attacks practical. As a result, the use of a stronger hash function was recommended.

In 2007, the National Institute of Standards and Technology (NIST) announced a contest to design a hash function that would be given the name SHA-3 and be the subject of a FIPS standard.^[3]

Different hash algorithms

MD5: It was designed by Ronald Rivest in 1991 to replace the earlier MD4. It was specified in RFC 1321 in 1992.^[4]
SHA-1: It was developed as part of a project by the U.S. government.
RIPEMD-160: It stands for "RACE Integrity Primitives Evaluation Message Digest". It was developed by Hans Dobbertin, Antoon Bosselaers, and Bart Preneel at the COSIC research group at the Katholieke Universiteit Leuven in Leuven, Belgium, and it was published in 1996.
Whirlpool
SHA-2
SHA-3
BLAKE2
BLAKE3
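
Several of the algorithms listed above can be tried from one interface. The sketch below assumes Python's hashlib; the particular names in the list (for example sha256 standing in for SHA-2) and the helper digest_bits are choices made for illustration. Some algorithms, such as RIPEMD-160 and Whirlpool, depend on the underlying OpenSSL build and may be missing. BLAKE3 is left out because it is not part of the standard hashlib module at the time of writing.

```python
import hashlib

# One representative name per family; availability depends on the Python build.
names = ["md5", "sha1", "ripemd160", "whirlpool", "sha256", "sha3_256", "blake2b"]


def digest_bits(name: str):
    """Return the digest size in bits, or None if this build lacks the algorithm."""
    try:
        return hashlib.new(name).digest_size * 8
    except ValueError:  # raised for unsupported or disabled algorithms
        return None


for name in names:
    bits = digest_bits(name)
    print(name, "unavailable" if bits is None else f"{bits}-bit digest")
```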

Cryptographic Hash Function Media

The Merkle–Damgård hash construction

Related pages

Avalanche effect

References

[1] Shai Halevi and Hugo Krawczyk, Randomized Hashing and Digital Signatures. Archived 2009-06-20 at the Wayback Machine.
[2] Alexander Sotirov, Marc Stevens, Jacob Appelbaum, Arjen Lenstra, David Molnar, Dag Arne Osvik, Benne de Weger, MD5 considered harmful today: Creating a rogue CA certificate. Accessed 2009-03-29.
[3] NIST.gov - Computer Security Division - Computer Security Resource Center. Retrieved 2008-10-28.
[4] Ciampa, Mark. CompTIA Security+ 2008 in depth (2009). Australia; United States: Course Technology/Cengage Learning. p. 290. ISBN 978-1-59863-913-1.

Bibliography

Bruce Schneier. Applied Cryptography. John Wiley & Sons, 1996. ISBN 0-471-11709-9.

Other websites

Hash'em all! – free online text and file hashing with different algorithms
The Hash function lounge Archived 2008-12-25 at the Wayback Machine – a list of hash functions and known attacks
Helger Lipmaa's links on hash functions Archived 2008-12-21 at the Wayback Machine
Diagrams explaining cryptographic hash functions
An Illustrated Guide to Cryptographic Hashes by Steve Friedl
Cryptanalysis of MD5 and SHA: Time for a New Standard by Bruce Schneier
Hash collision Q&A
Attacking hash functions by poisoned messages (construction of multiple sensible Postscript messages with the same hash function) Archived 2006-08-08 at the Wayback Machine
What is a hash function? Archived 2006-12-06 at the Wayback Machine from RSA Laboratories
Password Hashing in PHP Archived 2012-11-29 at the Wayback Machine by James McGlinn at the PHP Security Consortium
The code monkey's guide to cryptographic hashes Archived 2009-06-09 at the Wayback Machine by Val Henson, "in language that any programmer (and even some managers) can understand."
