The Structure of Git Blob Objects

Hey, I'm a postgraduate in Cyber Security with practical experience in Software Engineering and DevOps Operations. Top player on the TryHackMe platform, multilingual speaker (Kazakh, Russian, English, Spanish, and Turkish), curious person, bookworm, geek, sports lover, and just a good guy to talk with!

In Git, a blob (short for "binary large object") stores the content of a file. A blob is essentially a snapshot of that content at a given point in time. Unlike trees and commits, blobs are very straightforward: they contain only the file's bytes and carry no metadata such as the file name or permissions. That information, along with the directory structure, is maintained by tree objects. One consequence is that two files with identical content are represented by the same blob, whatever their names or locations.

Structure of a Blob Object

A blob object consists of a header followed by the content, and it is identified by a hash:

  1. Header: The header contains the object type ("blob") and the length of the content in bytes, separated by a space, and ending with a null byte (\0).

    • Example: If the content is "Hello, Git!" (11 bytes), the header is blob 11\0.
  2. Content: The raw bytes of the file, stored unmodified after the header. Compression is applied to the object as a whole (header plus content) when it is written to disk, not to the content alone.

    • Example: The text "Hello, Git!"
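  3. SHA-1 Hash: The SHA-1 hash of the blob serves as its identifier. It is computed over the uncompressed header and content together. The hash is not stored inside the object; it is the name under which the object is stored. (Repositories created with the SHA-256 object format use SHA-256 instead.)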
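The three parts above can be reproduced in a few lines of Python. This is a minimal sketch, assuming a classic SHA-1 repository; the function names blob_header and blob_id are made up for illustration and are not part of Git.

```python
import hashlib


def blob_header(content: bytes) -> bytes:
    """Build the header: object type, a space, the byte length in decimal, then NUL."""
    return b"blob " + str(len(content)).encode("ascii") + b"\x00"


def blob_id(content: bytes) -> str:
    """Hash the uncompressed header + content, as described for SHA-1 repositories."""
    return hashlib.sha1(blob_header(content) + content).hexdigest()


if __name__ == "__main__":
    content = b"Hello, Git!"        # 11 bytes, no trailing newline
    print(blob_header(content))     # b'blob 11\x00'
    print(blob_id(content))         # a 40-character hex string
    print(blob_id(b""))             # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 (the empty blob)
```

The length in the header is counted in bytes, not characters, so non-ASCII text must be encoded before its length is measured.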

A newly written blob is stored as a "loose" object: the header and content are compressed together with zlib and saved as a file under the .git/objects/ directory. The object's name is its SHA-1 hash written as 40 hexadecimal characters. The first two characters are used as the name of the subdirectory inside .git/objects/, and the remaining 38 characters serve as the filename within that directory. Git may later move loose objects into packfiles, in which case the individual file no longer exists.
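To make the on-disk layout concrete, here is a small sketch that derives the path and the compressed bytes for a loose blob. It only computes values and writes nothing to disk; loose_blob is a hypothetical helper name, and the compressed bytes may differ from what Git itself writes (for example because of the compression level), although they decompress to the same data.

```python
import hashlib
import zlib
from pathlib import PurePosixPath


def loose_blob(content: bytes) -> tuple[PurePosixPath, bytes]:
    """Return (relative path, zlib-compressed bytes) for a loose blob object."""
    # Header and content are concatenated first, then hashed and compressed.
    store = b"blob " + str(len(content)).encode("ascii") + b"\x00" + content
    digest = hashlib.sha1(store).hexdigest()
    # First two hex characters name the directory, the other 38 name the file.
    path = PurePosixPath(".git/objects") / digest[:2] / digest[2:]
    return path, zlib.compress(store)


if __name__ == "__main__":
    path, data = loose_blob(b"")
    print(path)                   # .git/objects/e6/9de29bb2d1d6434b8b29ae775ad8c2e48c5391
    print(zlib.decompress(data))  # b'blob 0\x00'
```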

Example

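For example, if you have a file called hello.txt whose content is exactly "Hello, Git!" (11 bytes, with no trailing newline), the blob object would have the following structure. Note that a file created with echo normally ends with a newline, which makes it 12 bytes and changes both the header and the hash.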

  1. Header: blob 11\0

  2. Content: Hello, Git!

  3. SHA-1 Hash: The hash generated by hashing the header and content together.

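You can use the git hash-object command to compute the SHA-1 hash for the content of hello.txt. By default it only prints the hash; add the -w option to also write the blob into the object database: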

git hash-object hello.txt

Once the blob exists in the object database (for example after git add or git hash-object -w), you can use the git cat-file command to inspect it:

git cat-file -p <blob_hash>

This prints the content of the file, Hello, Git!. Using -t or -s in place of -p prints the object's type (blob) or its size in bytes instead.
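For a loose object, what git cat-file does can be approximated by decompressing the file and splitting it at the null byte. The sketch below is an illustration under that assumption, not Git's actual implementation; parse_loose_object is a hypothetical name, and the sample input is built in memory rather than read from a real repository.

```python
import zlib


def parse_loose_object(compressed: bytes) -> tuple[str, bytes]:
    """Decompress a loose object and return (type, content)."""
    raw = zlib.decompress(compressed)
    # The header ends at the first NUL byte; everything after it is content.
    header, _, body = raw.partition(b"\x00")
    obj_type, _, size = header.partition(b" ")
    if int(size) != len(body):
        raise ValueError("size in header does not match content length")
    return obj_type.decode("ascii"), body


if __name__ == "__main__":
    sample = zlib.compress(b"blob 11\x00Hello, Git!")
    print(parse_loose_object(sample))  # ('blob', b'Hello, Git!')
```

To try it on a real repository, pass the bytes of a file under .git/objects/ read in binary mode. Objects that have been moved into a packfile cannot be read this way.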

Blob objects are designed to be simple containers for file content, while the complexity of relationships, history, and metadata is managed by tree and commit objects.

Maxat Akbanov's blog
