A hash function is a function that transforms data of arbitrary size into a short, fixed-size string of bytes that acts as a fingerprint of that data. It’s like being able to assign a practically unique identifier to anything. This might seem like nothing, but this simple building block is extremely useful for building many things in cryptography and Bitcoin! To be more precise, a hash function takes an arbitrary-length input (a file, a message, a video, and so on) and produces a fixed-length output (for example, 256 bits for SHA-256). Hashing the same input always produces the same output, called the digest or hash, while hashing two similar inputs gives two very different results. Because there are more possible inputs than outputs, two inputs can in theory share a digest (a collision), but with a secure hash function nobody should be able to find one in practice.
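To make the "arbitrary-length input, fixed-length output" idea concrete, here is a minimal sketch using Python's standard hashlib module. The helper name sha256_hex and the sample inputs are illustrative choices, not part of any particular tool.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hexadecimal string."""
    return hashlib.sha256(data).hexdigest()


# Inputs of very different sizes: empty, six bytes, one million bytes.
inputs = [b"", b"foobar", b"x" * 1_000_000]
digests = [sha256_hex(data) for data in inputs]

for data, digest in zip(inputs, digests):
    # Each hex character encodes 4 bits, so 64 characters means 256 bits.
    print(len(data), "bytes ->", len(digest) * 4, "bits:", digest)

# Hashing the same input twice gives the same digest.
print(sha256_hex(b"foobar") == sha256_hex(b"foobar"))
```

Whatever the input size, the digest is always 64 hexadecimal characters, that is, 256 bits.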
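The main property of a cryptographic hash function is that it cannot be reversed in practice: given only the output, one shouldn’t be able to find an input that produces it. Hash functions are one-way. If I publish the SHA-256 result of some data that is hard to guess, no one on this planet can feasibly compute the source I used to get that result. The caveat matters: if the input comes from a small set (a short password, a PIN), an attacker can simply hash every candidate and compare.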
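One-wayness depends on the input being hard to guess. The sketch below illustrates this under an assumed scenario: the secret is a 4-digit PIN, so there are only 10,000 candidates to try. The function name find_pin and the PIN value are hypothetical.

```python
import hashlib
from itertools import product
from typing import Optional


def find_pin(target_hex: str) -> Optional[str]:
    """Try every 4-digit PIN and return the one whose SHA-256 matches."""
    for digits in product("0123456789", repeat=4):
        candidate = "".join(digits)
        if hashlib.sha256(candidate.encode()).hexdigest() == target_hex:
            return candidate
    return None  # the input was not a 4-digit PIN


# Hypothetical secret: only its digest is "published".
target = hashlib.sha256(b"4821").hexdigest()
recovered = find_pin(target)
print(recovered)
```

The hash function itself is not reversed here. The search only succeeds because the space of possible inputs is tiny, and the same approach is hopeless against a long random input.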
SHA-2 and SHA-3 are the two most widely adopted families of hash functions. SHA-2 is based on the Merkle–Damgård construction, while SHA-3 is based on the sponge construction. SHA-2 comes in four main versions, producing outputs of 224, 256, 384 and 512 bits (SHA-256 is the most used), plus two truncated variants of SHA-512. Here is a good article to learn more about the mathematics behind SHA-256.
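The output sizes can be checked directly. This is a small sketch that assumes a Python 3.6+ interpreter, where hashlib ships both the SHA-2 and SHA-3 families; the names list and the sizes dictionary are illustrative.

```python
import hashlib

# Algorithm names as hashlib spells them: four SHA-2 and four SHA-3 variants.
names = [
    "sha224", "sha256", "sha384", "sha512",
    "sha3_224", "sha3_256", "sha3_384", "sha3_512",
]

# digest_size is expressed in bytes, so multiply by 8 to get bits.
sizes = {name: hashlib.new(name).digest_size * 8 for name in names}

for name, bits in sizes.items():
    print(f"{name}: {bits} bits")
```

Both families offer the same four output lengths, even though their internal constructions differ.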
Let’s compute the SHA-256 hash of the word “foobar” with this command:
echo -n foobar | sha256sum
The result is: “c3ab8ff13720e8ad9047dd39466b3c8974e592c2fa383d4a3960714caef0c4f2”.
Now, let’s calculate the SHA-256 value of “fopbar” with:
echo -n fopbar | sha256sum
The result is: “04006a569077f11c3d1e5f3f5994e10a40d50fb3679ab89b053d1236024002be”.
As you can see, changing a single letter gives a completely different result!
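The same comparison can be made more precise by counting how many of the 256 output bits differ between the two digests. This is a minimal sketch with Python's hashlib; the helper name bit_difference is an illustrative choice.

```python
import hashlib


def bit_difference(a: bytes, b: bytes) -> int:
    """Count the bits that differ between the SHA-256 digests of a and b."""
    digest_a = hashlib.sha256(a).digest()
    digest_b = hashlib.sha256(b).digest()
    # XOR each pair of bytes and count the 1 bits in the result.
    return sum(bin(x ^ y).count("1") for x, y in zip(digest_a, digest_b))


diff = bit_difference(b"foobar", b"fopbar")
print(f"{diff} of 256 bits differ")
```

For a well-behaved hash function, roughly half of the output bits are expected to flip when the input changes even slightly, which is why the two hex strings look unrelated.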
The SHA-1 hash function is now considered unsafe wherever collision resistance matters: in 2017, researchers achieved the first practical SHA-1 collision, generating two different PDF files with the same SHA-1 hash.