As a Python enthusiast, I’m constantly seeking ways to enhance my programming skills and explore new techniques. Today, I want to introduce you to the fascinating world of cryptographic hashing. More specifically, we’ll dive into how to create a SHA3-512 hash of a file in Python. With the ever-growing importance of data security, understanding hashing algorithms and their applications is crucial. So, let’s embark on this journey together and unlock the power of SHA3-512 hashing in Python.
Encryption and hashing have served as the foundation for new security modules, among other network security developments. One of the most used hash algorithms is the Secure Hash Algorithm(SHA) with a digest size of 512 bits, or SHA 512. Although there are numerous variations, SHA 512 has been the most often used in practical applications.
SHA-3 (Secure Hash Algorithm 3) is the latest member of the Secure Hash Algorithm family of standards. Although part of the same series of standards, SHA-3 is internally different from the MD5-like structure of SHA-1 and SHA-2. SHA-3 instances are drop-in replacements for SHA-2, intended to have identical security properties. The SHA-3 family consists of six hash functions with digests (hash values) that are 128, 224, 256, 384 or 512 bits: SHA3-224, SHA3-256, SHA3-384, SHA3-512, SHAKE128, SHAKE256.
The SHA-3 or Keccak algorithm is one of the most secure and efficient hashing algorithms and some claim that it won’t be cracked in the next 20 – 30 years. Developments in the quantum computing world might decrease that time frame but it is still one of the best hashing algorithm we have got right now.
The hash function generates the same output hash for the same input string. This means that, you can use this string to validate files or text or anything when you pass it across the network or even otherwise. SHA3-512 can act as a stamp or for checking if the data is valid or not.
The 512 in the name SHA3-512 refers to the final hash digest value, meaning that regardless of the amount of plaintext or cleartext, the hash value will always be 512 bits.
For example –
Input String | Output Hash |
---|---|
hi | 154013cb8140c753f0ac358da6110fe237481b26c75c3ddc1b59eaf9dd7b46a0a3aeb2cef164b3c82d65b38a4e26ea9930b7b2cb3c01da4ba331c95e62ccb9c3 |
debugpointer | ac09d8e98cd7d60927600334167ca22a79fad316a8af4fdc1d076c6cfe72c44778cc47223eb296f3ddc13dcfd4bb395f6346c0f29f2f2f4a46f4d9bcee63d1df |
computer science is amazing! I love it. | 308a9fd755db73031ddae5f58734c8b57db3bc5ea24bfc6de7df962158e24bbe697219a72ccdeaa03c1848f2c35ef46d050ec0b678465de309e8a43bdf248c64 |
Check this out – If you are looking for SHA3-512 hash of a String.
Create SHA3-512 hash of a file in Python
SHA3-512 hash can be created using the python’s default module hashlib
.
Incorrect Way to create SHA3-512 Hash of a file in Python
But, you have to note that you cannot create a hash of a file by just specifying the name of the file like this-
# this is NOT correct
import hashlib
print(hashlib.sha3_512("filename.jpg".encode('UTF-8')).hexdigest())
Output of the above code-
a244f26ddd042c8367d50b6ff377c52fa6d1dcb533acfdee42f6fc26ea46a25aec94dae95ab4f9d27168e7f4c317ee7a1666892e71bfd6a70fc5f4d80f872841
The above value is NOT the SHA3-512 hash of the file. But, it is the SHA3-512 hash of the string filename.jpg
itself.
Correct Way to create SHA3-512 Hash of a file in Python
You have to read the contents of the file to create SHA3-512 hash of the file itself. It’s simple, we can just read the contents of the file and create the hash.
The process of creating an SHA3-512 hash in python is very simple. First import hashlib, then encode your string that you want to hash i.e., converts the string into the byte equivalent using encode(), then pass it through the hashlib.sha3_512()
function. We print the hexdigest
value of the hash m
, which is the hexadecimal equivalent encoded string.
import hashlib
file_name = 'filename.jpg'
with open(file_name) as f:
data = f.read()
sha512hash = hashlib.sha3_512(data).hexdigest()
SHA3-512 Hash of Large Files in Python
In the above code, there is one problem. If the file is a 10 Gb file, let’s say a large log file or a dump of traffic or a Game like FIFA or others. If you want to compute SHA3-512 hash of it, it would probably chew up your memory.
Here is a memory optimized way of computing SHA3-512 hash, where we read chunks of 4096 bytes (can be customised as per your requirement, size of your system, size of your file etc.,). So, in this process we sequentially process the chunks and update the hash. So, in this process, let’s say there are 1000 such chunks of the file, the hash_sha512
is updated 1000 times.
At the end we return the hexdigest
value of the hash m
, which is the hexadecimal equivalent encoded string.
import hashlib
# A utility function that can be used in your code
def compute_sha512(file_name):
hash_sha512 = hashlib.sha3_512()
with open(file_name, "rb") as f:
for chunk in iter(lambda: f.read(4096), b""):
hash_sha512.update(chunk)
return hash_sha512.hexdigest()
Compare and Verify SHA3-512 hash of a file using python
You need to verify the SHA3-512 hash at the server or at some other point or logic in your code.
To verify the SHA3-512 hash you will have to create the SHA3-512 hash of the original file again.
Then compare the original SHA3-512 value that the source has generated and SHA3-512 that you generate.
import hashlib
file_name = 'filename.jpg'
original_sha512 = 'a244f26ddd042c8367d50b6ff377c52fa6d1dcb533acfdee42f6fc26ea46a25aec94dae95ab4f9d27168e7f4c317ee7a1666892e71bfd6a70fc5f4d80f872841'
with open(file_name) as f:
data = f.read()
sha512_returned = hashlib.sha3_512(data).hexdigest()
if original_sha512 == sha512_returned:
print "SHA3-512 verified."
else:
print "SHA3-512 verification failed."
The process of SHA3-512 creation and verification is easy as we discussed above. Happy Coding!
I’m glad that you found the content useful. We’ve reached the end of our journey into creating SHA3-512 hashes of files in Python. I hope you’ve found this exploration as captivating and enlightening as I have. By harnessing the power of cryptographic hashing, we can fortify data integrity and security in our applications. Remember, this is just the tip of the iceberg when it comes to the vast field of cryptography, but mastering the fundamentals sets us on a path towards more robust and secure programming practices. So, keep expanding your Python skills, keep exploring new techniques, and keep delving into the exciting world of cryptography. Happy coding!