Git workshop: BLOB object type

Hey, I'm a postgraduate in Cyber Security with practical experience in Software Engineering and DevOps Operations. The top player on the TryHackMe platform, multilingual speaker (Kazakh, Russian, English, Spanish, and Turkish), curious person, bookworm, geek, sports lover, and just a good guy to speak with!
In Git, data is stored as objects. Each object is identified by a hash of its contents (SHA-1 by default). There are four types of objects in Git: commit, tree, blob, and tag. Among these, the "blob" object type represents the content of a file.
Blob stands for "binary large object"; it's just a chunk of data. Each distinct version of a file's content in a Git repository corresponds to a blob. The blob holds the file data, but it doesn't contain any metadata about the file (like its name, its path, or its mode). That metadata is stored in a tree object that references the blob.
In essence, the blob is the most fundamental object in Git: it represents the content of a single version of a file.
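To make the point about missing metadata concrete, here is a minimal sketch. It assumes a POSIX shell with Git installed; the file names (report.txt, copy-of-report.txt) are made up for illustration. Two files with different names but identical content should hash to the same blob ID, because the name is not part of the blob.

```shell
# Work in a throwaway repository so nothing else is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
git init -q .

# Two files, different names, byte-for-byte identical content.
printf 'same content\n' > report.txt
printf 'same content\n' > copy-of-report.txt

# Without -w, git hash-object only prints the ID; nothing is stored.
hash_a=$(git hash-object report.txt)
hash_b=$(git hash-object copy-of-report.txt)

echo "report.txt:         $hash_a"
echo "copy-of-report.txt: $hash_b"
```

Both lines should show the same hash. If both files were added to the repository, Git would store that content only once.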
Workshop
Objective:
To understand the relationship between file content and Git blobs and to be able to inspect blobs directly.
Prerequisites:
A basic understanding of Git.
Git installed on your system.
Exercise Steps:
Set up a new Git repository:
mkdir git-blob-exercise
cd git-blob-exercise
git init
Create a new file and inspect its contents:
echo "Hello, Git BLOB!" > hello.txt
cat hello.txt
Add the file to Git and commit:
git add hello.txt
git commit -m "Initial commit"
Find the blob hash for the file:
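git ls-tree HEAD
You'll see output like:
100644 blob 1be678f79f1078a680269a9a4f30b69e29624dd7    hello.txt
Note down the 1be678f79f1078a680269a9a4f30b69e29624dd7 value (the third column, which is the blob hash).
Inspect the blob content: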
git cat-file -p 1be678f79f1078a680269a9a4f30b69e29624dd7
This should output:
Hello, Git BLOB!
Modify the file and inspect its new blob:
echo "Hello again, Git BLOB!" >> hello.txt
git add hello.txt
git commit -m "Update hello.txt"
git ls-tree HEAD
You'll see a new blob hash, 24fb00014f282b21890906c857d5ef719776efc8, for hello.txt.
Inspect the new blob content:
git cat-file -p 24fb00014f282b21890906c857d5ef719776efc8
This should output the updated content of the file.
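The copying of hashes by hand in the steps above can also be scripted. The following sketch, assuming a POSIX shell, repeats the exercise in a temporary directory and uses git rev-parse with the <revision>:<path> syntax to look up each blob hash directly. The identity passed via -c is a placeholder chosen for this example, so that the commits work on machines without a configured user.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
git init -q .

# Placeholder identity (an assumption for this sketch), so that
# git commit works even without a global user.name/user.email.
commit() {
    git -c user.name="Workshop" -c user.email="workshop@example.com" \
        -c commit.gpgsign=false commit -q -m "$1"
}

echo "Hello, Git BLOB!" > hello.txt
git add hello.txt
commit "Initial commit"

echo "Hello again, Git BLOB!" >> hello.txt
git add hello.txt
commit "Update hello.txt"

# <revision>:<path> names the blob stored for that path in that commit.
old_blob=$(git rev-parse HEAD~1:hello.txt)
new_blob=$(git rev-parse HEAD:hello.txt)

echo "old blob: $old_blob"
git cat-file -p "$old_blob"
echo "new blob: $new_blob"
git cat-file -p "$new_blob"
```

The old blob stays readable after the update: each version of the file is its own object, and the second commit simply points at a different one.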

Discussion:
Blobs represent the content of a file, not its path or name.
Different content will generate a different blob hash, even if the file name remains the same. This is because the blob's hash is generated based on its content.
Trees, another type of Git object, are responsible for holding the filenames and structuring directories. They reference blobs for the actual file content.
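The split between trees and blobs can be checked with a rename. This is a sketch under the assumption of a POSIX shell with awk available; it uses git write-tree to turn the staging area into a tree object without needing a commit. The file name greeting.txt is made up for the example.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
git init -q .

echo "Hello, Git BLOB!" > hello.txt
git add hello.txt
# git write-tree stores the current index as a tree object and prints its hash.
tree_before=$(git write-tree)
blob_before=$(git ls-tree "$tree_before" | awk '{print $3}')

# Rename the file without touching its content, then restage everything.
mv hello.txt greeting.txt
git add -A
tree_after=$(git write-tree)
blob_after=$(git ls-tree "$tree_after" | awk '{print $3}')

echo "tree before: $tree_before"
git ls-tree "$tree_before"
echo "tree after:  $tree_after"
git ls-tree "$tree_after"
```

The tree hash should change, because the tree records the file name, while the blob hash in both listings should stay the same.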
git hash-object and git cat-file commands
Both git hash-object and git cat-file are lower-level Git commands that deal with the internal workings of Git. Let's delve into each of them:
git hash-object
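The git hash-object command takes a file and computes the object ID Git would assign to its content. By default this is the SHA-1 of a short header (the word "blob", a space, the content size in bytes, and a NUL byte) followed by the content itself, so it differs from a plain SHA-1 checksum of the file. This ID is what Git uses to uniquely identify objects within its object database. Without further options, the command only prints the hash and stores nothing.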
# Syntax
git hash-object <file>
# Example
git hash-object README.md
Options:
-w: Writes the object to the object database in addition to printing its hash. This lets you create a blob manually and store it within your Git repository, without going through git add.
# Example: Compute hash and write the object to the database
git hash-object -w README.md
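To see what git hash-object does under the hood, the hash can be reproduced by hand. The sketch below makes a few assumptions: a repository using the default SHA-1 object format, no line-ending conversion or other filters configured, and the sha1sum tool from GNU coreutils (on macOS, shasum would be the usual substitute).

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
git init -q .

echo "Hello, Git BLOB!" > hello.txt

# Size of the content in bytes (tr strips padding some wc versions add).
size=$(wc -c < hello.txt | tr -d ' ')

# Hash the header "blob <size>\0" followed by the raw file content.
manual_hash=$({ printf 'blob %s\0' "$size"; cat hello.txt; } | sha1sum | cut -d' ' -f1)
git_hash=$(git hash-object hello.txt)

echo "manual: $manual_hash"
echo "git:    $git_hash"
```

Under those assumptions the two values should match, which shows that the hash depends only on the object type, the size, and the content.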
git cat-file
The git cat-file command is like the counterpart of git hash-object. It is used to view the type or the content of an object in the Git database given its hash.
# Syntax
git cat-file -t <hash> # Show type of object
git cat-file -p <hash> # Show content of object
# Example: Show type of an object
git cat-file -t 5d8265c
# Output: commit
# Example: Show content of an object
git cat-file -p 5d8265c
Most valuable options:
-t: Display the type of the object. This will return one of four types: 'blob', 'tree', 'commit', or 'tag'.
-p: "Pretty-print" the contents of the object. Useful for viewing commits and trees.
-s: Display the size of the object.
# Example: Show size of an object
git cat-file -s 5d8265c
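The three options can be tried on a blob whose content is known in advance. This is a small sketch assuming a POSIX shell; it stores the blob from standard input with git hash-object -w --stdin, so no file or commit is needed.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
git init -q .

# Store a small blob straight from standard input.
blob=$(echo "Hello, Git BLOB!" | git hash-object -w --stdin)

obj_type=$(git cat-file -t "$blob")
obj_size=$(git cat-file -s "$blob")
obj_content=$(git cat-file -p "$blob")

echo "type:    $obj_type"
echo "size:    $obj_size bytes"
echo "content: $obj_content"
```

The reported size is 17 bytes: 16 characters of text plus the trailing newline that echo adds.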
Working Together
These commands are usually used in sequence for debugging or script automation. For instance, you can store a file with git hash-object -w, and then use the returned hash with git cat-file to view the content and check that it matches the original file. Note that git cat-file can only read objects that exist in the object database, so the file must have been written with -w (or added with git add) first.
# Compute the hash of README.md and write the blob to the object database
hash=$(git hash-object -w README.md)
# Use that hash to retrieve the object's type and content
git cat-file -t "$hash"
git cat-file -p "$hash"
In summary, git hash-object and git cat-file provide a way to interact with Git's internal object model. They are particularly useful for debugging, scripting, or deep diving into how Git works.
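As a closing sketch, the two commands can be combined into a small integrity check. It assumes a POSIX shell with cmp available, and relies on git cat-file -e, which exits with status zero only if the object exists. The file name restored.txt is made up for the example.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
git init -q .

echo "Hello, Git BLOB!" > hello.txt

# Without -w the hash is only computed; the object is not stored.
hash=$(git hash-object hello.txt)
if git cat-file -e "$hash" 2>/dev/null; then
    stored_before=yes
else
    stored_before=no
fi

# With -w the same hash is returned and the blob is written.
git hash-object -w hello.txt > /dev/null
if git cat-file -e "$hash" 2>/dev/null; then
    stored_after=yes
else
    stored_after=no
fi

# Round trip: restore the blob to a new file and compare byte for byte.
git cat-file -p "$hash" > restored.txt
if cmp -s hello.txt restored.txt; then
    round_trip=identical
else
    round_trip=different
fi

echo "stored before -w: $stored_before"
echo "stored after -w:  $stored_after"
echo "round trip:       $round_trip"
```

In a fresh repository this should report that the object is missing before the -w call, present afterwards, and identical to the original file once restored.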





