Git Objects and How to Add an Empty Directory to a Git Repository

1. Introduction

Git provides ways to specify whether and which local files and directories are part of a given repository. Still, Git itself tracks files, not directories. Thus, it’s non-trivial to add a directory without any files at all in it to a repository.

In this tutorial, we explore Git objects to come up with ways to add an empty or almost empty directory to a Git repository. First, we set up a sample repository. Next, we delve into the main Git objects. After that, we briefly mention the link between Git and filesystem objects. Finally, we go through different ways of creating or preserving an empty directory in Git.

We tested the code in this tutorial on Debian 12 (Bookworm) with GNU Bash 5.2.15. Unless otherwise specified, it should work in most POSIX-compliant environments.

2. Sample Repository

For demonstration purposes, let’s clone and list a sample Git repository:

$ git clone https://github.com/f/awesome-chatgpt-prompts && cd awesome-chatgpt-prompts
[...]
$ tree
.
├── CNAME
├── _config.yml
├── CONTRIBUTING.md
├── _layouts
│   └── default.html
├── LICENSE
├── prompts.csv
├── README.md
└── scripts
    └── find-prompt

2 directories, 8 files

Here, the tree command shows several files and two (2) directories: _layouts and scripts. Importantly, we don’t see the .git subdirectory or other hidden objects.

3. Git Objects

Git uses three kinds of objects to construct, organize, and maintain data:

blob: binary data storage
tree: low-level directory listing of blobs and other trees, not the commit history tree
commit: a snapshot of potentially multilevel trees

These objects usually live under .git/objects/.

Although annotated tags are also objects, we won’t go into them.

Let’s briefly understand each of the others.

3.1. Git Blob

The main function of blobs is to store file contents. Since Git identifies a blob by the hash of its contents, identical data is stored only once, which reduces redundancy.

New blobs appear in several situations:

add new file to track
add tracked file to staging
change synchronization

Blobs have a fairly basic format:

blob <SIZE_BYTES>\0<CONTENT_BINARY>
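
As a quick check of this format, we can rebuild a blob ID by hand. The following is a minimal sketch that assumes the default SHA-1 object format and the sha1sum tool from GNU coreutils. The hello content is only an example:

```shell
# Content to store; with the trailing newline it is 6 bytes long.
content='hello'

# Let Git compute the blob ID without writing anything to a repository.
git_id=$(printf '%s\n' "$content" | git hash-object --stdin)

# Rebuild the same ID by hand: header "blob 6", a NUL byte, then the content.
manual_id=$(printf 'blob 6\0%s\n' "$content" | sha1sum | cut -d' ' -f1)

echo "git:    $git_id"
echo "manual: $manual_id"
```

Both lines should print the same identifier, since the ID is just the hash of the header and the content.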

As an object, a blob usually remains until no other object points to it and Git garbage-collects it. For example, a rebase with a drop or squash might lead to such a situation.

3.2. Git Tree

Although users rarely need to understand or directly use them, Git is based around tree structures. In this case, we talk about the low-level trees that describe data in a repository.

Such trees in Git associate blobs with file paths and permissions that belong to them. Effectively, this is what a directory does in most conventional filesystems.
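
To see this association in practice, we can stage two files and ask Git for the resulting tree. This is only a sketch in a throwaway repository. The a.txt and docs/b.txt names are hypothetical and not part of the sample repository:

```shell
# Work in a throwaway repository (path chosen by mktemp).
cd "$(mktemp -d)" && git init -q

# Two files with identical content, one of them in a subdirectory.
mkdir docs
printf 'same data\n' > a.txt
printf 'same data\n' > docs/b.txt
git add a.txt docs/b.txt

# write-tree turns the staged files into tree objects and prints the root tree ID.
root=$(git write-tree)

# List the root tree recursively: mode, type, hash, path.
git ls-tree -r "$root"

# Resolve the blob behind each path.
blob_a=$(git rev-parse "$root:a.txt")
blob_b=$(git rev-parse "$root:docs/b.txt")
```

Since both files hold the same data, both paths should point to one and the same blob. Only the tree entries differ.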

Tree formats build on top of blob definitions by adding a tree header:

tree <SIZE_BYTES>\0
<FILE_1_MODE> <1_PATH>\0<1_BLOB_HASH>
<FILE_2_MODE> <2_PATH>\0<2_BLOB_HASH>
...
<FILE_N_MODE> <N_PATH>\0<N_BLOB_HASH>

Here, we can already understand why an empty, unreferenced directory won’t serve a purpose in Git.

In fact, commit operations trigger the creation of a new tree after a blob is already cached, i.e., a file is added. This is called a root tree and serves as a snapshot of the working directory:

$ git ls-tree 9f94573322353b1f1ccb298c7f8383fc64a589e8
040000 tree 112461b5254d5c2929e158e20f396e8594095ab2    .github
100644 blob 3571f7ca907e841f7aa19052d8ca842175ee8f50    CNAME
100644 blob 157836ef5c6f9f77330c4b775317b290985176d4    CONTRIBUTING.md
100644 blob 0e259d42c996742e9e3cba14c677129b2c1b6311    LICENSE
100644 blob 12ad110a86ef5d9d6195d251d1bf74a8c7d848ed    README.md
100644 blob 37c2dff3edf8dda54bbc5affaaaa39e72c0e772f    _config.yml
040000 tree ef29873b74edc934cf13b5cff2d498d6c1fbc5eb    _layouts
100644 blob afbe51859a3a341210262125756d603f444990b2    prompts.csv
040000 tree 285f36b3cb794bedc3ee98bea91455ee7deca681    scripts

Here, the ls-tree subcommand lists the root tree, which includes two object types:

tree, i.e., subdirectory definition
blob, i.e., file (change) definition

From left to right, each object has several fields:

permissions
type specifier
hash identifier
filesystem name

Similar to other objects, we can use the show subcommand on tree hashes:

$ git show 285f36b3cb794bedc3ee98bea91455ee7deca681
tree 285f36b3cb794bedc3ee98bea91455ee7deca681

find-prompt

In this case, we see that the find-prompt file is part of the given subdirectory tree. In this output, subdirectories have a / forward slash suffix.

3.3. Git Commits

Effectively, a commit just points to the root tree object of a given snapshot. From there, Git orients itself around the structure to handle the data.

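The ls-tree subcommand accepts a commit identifier (ID) as well as a tree ID. That’s because Git dereferences a commit to the root tree it records. Still, the commit ID and the root tree ID are different hashes, since a commit is a separate object. The ^{tree} suffix performs this dereference explicitly, and an object that’s already a tree, such as the scripts subtree, resolves to itself:

$ git rev-parse 285f36b3cb794bedc3ee98bea91455ee7deca681^{tree}
285f36b3cb794bedc3ee98bea91455ee7deca681

As the rev-parse subcommand shows, the identifiers are identical here only because the input is itself a tree. For a commit, the same syntax returns the ID of its root tree.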
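
A short sketch can show how a commit relates to its root tree. It runs in a throwaway repository, and the file name and the Example identity are placeholders rather than values from the sample repository:

```shell
# Throwaway repository with a single commit.
cd "$(mktemp -d)" && git init -q
printf 'data\n' > file.txt
git add file.txt
git -c user.name='Example' -c user.email='example@example.com' \
  commit -q -m 'first commit'

commit_id=$(git rev-parse HEAD)
tree_id=$(git rev-parse 'HEAD^{tree}')

# The raw commit object starts with a "tree <hash>" line.
recorded_tree=$(git cat-file -p "$commit_id" | sed -n '1s/^tree //p')

echo "commit: $commit_id"
echo "tree:   $tree_id"
```

The tree line inside the commit object should match the output of rev-parse with ^{tree}, while the commit ID itself is a different hash.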

4. Git Files and Directories

In general, filesystem objects are of two main types:

file: non-directory objects with data
directory: contains file lists as metadata

Even though both types are seen as files by an operating system (OS) like Linux, directories are actually file containers.

From the viewpoint of Git, repositories are local directories. However, the root directory of a repository isn’t of consequence.

What usually matters to Git is the file tree.

5. Attempt to Directly Add Empty Directory

Let’s attempt to add an empty subdirectory named void to the repository we have:

$ mkdir void
$ git add void

No issues so far. Also, we can see the directory in the local filesystem:

$ tree
.
├── CNAME
├── _config.yml
├── CONTRIBUTING.md
├── _layouts
│   └── default.html
├── LICENSE
├── prompts.csv
├── README.md
├── scripts
│   └── find-prompt
└── void

3 directories, 8 files

Now, we commit the changes:

$ git commit --all --message 'empty directory'
On branch main
Your branch is up to date with 'origin/main'.

nothing to commit, working tree clean

As expected, Git ignores void and refuses to add and track it as a new object in the repository since it’s not related to any files – the items that Git is supposed to track.
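
We can reproduce this behavior in isolation. The sketch below uses a fresh throwaway repository instead of the sample one and checks both the index and the status output:

```shell
# Fresh throwaway repository with a single empty directory.
cd "$(mktemp -d)" && git init -q
mkdir void
git add void    # exits without an error, but stages nothing

staged=$(git ls-files)              # paths recorded in the index
pending=$(git status --porcelain)   # staged, modified, and untracked entries

echo "staged: '$staged' pending: '$pending'"
```

Both variables should stay empty, since there is no file whose path Git could record.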

There are some situations in which this might become a problem. Depending on the context and restrictions, we can work around it in different ways.

6. Force-Add Empty Directory as Tree

Although not a standard or established workflow, we can add a completely empty directory to a Git repository via the *tree subcommands.

Again, we use void as the name of our empty directory:

$ EMPTYDIR=void

To begin with, we use the mktree subcommand to create an empty tree object and a parent tree that contains it:

$ emptytree=$(cat /dev/null | git mktree)
$ roottree=$(printf '040000 tree %s\t%s\n' "$emptytree" "$EMPTYDIR" | git mktree)

In both cases, we just pipe the correct syntax to git mktree.

Next, we commit the tree and check the structure:

$ initcommit=$(git commit-tree "$roottree" -m 'tree with empty directory')
$ git branch branch1 "$initcommit"
$ git checkout branch1

Let’s confirm the existence of the empty directory tree:

$ git ls-tree HEAD
040000 tree 4b825dc642cb6eb9a060e54bf8d69288fbee4904    void

At this point, we have an empty directory in the Git metadata but no reflection of that in the filesystem. This is rarely useful, but low-level plumbing such as mktree and commit-tree is the only way to have an empty directory object in Git at all.

All other methods involve at least one file.
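
The 4b825dc... identifier isn’t specific to our repository. An empty tree consists only of the header tree 0 and a NUL byte, so its hash is always the same. The following sketch verifies that in a throwaway repository and confirms that a checkout doesn’t create the directory. It assumes the default SHA-1 object format and the sha1sum tool, and the Example identity is a placeholder:

```shell
cd "$(mktemp -d)" && git init -q

# Hash the empty tree by hand and let Git create the same object.
empty_by_hand=$(printf 'tree 0\0' | sha1sum | cut -d' ' -f1)
empty_by_git=$(git mktree < /dev/null)

# Wrap it in a root tree, commit it, and check the new branch out.
root=$(printf '040000 tree %s\tvoid\n' "$empty_by_git" | git mktree)
commit=$(git -c user.name='Example' -c user.email='example@example.com' \
  commit-tree "$root" -m 'tree with empty directory')
git branch branch1 "$commit"
git checkout -q branch1

# The tree entry exists in the commit, but not in the working directory.
git ls-tree HEAD
if [ -e void ]; then on_disk=yes; else on_disk=no; fi
echo "void on disk: $on_disk"
```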

7. Using the .gitkeep Convention

When we need to preserve a directory regardless of its contents as part of a repository structure, a common convention is to add a .gitkeep file. Git itself doesn’t assign any special meaning to this name.

To clarify, let’s see how we go about doing that for the void empty directory:

$ mkdir void
$ touch void/.gitkeep

There are several benefits to using .gitkeep:

widely recognized convention
hidden in UNIX-like environments
empty file
filename relates to Git
filename declares purpose
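
To confirm that the placeholder makes Git record the directory, we can commit it in a throwaway repository. This is a sketch only, and the Example identity is a placeholder:

```shell
cd "$(mktemp -d)" && git init -q
mkdir void
touch void/.gitkeep
git add void
git -c user.name='Example' -c user.email='example@example.com' \
  commit -q -m 'keep void'

# The index lists the placeholder file.
tracked=$(git ls-files)

# The root tree of the commit now has a subtree entry for the directory.
entry_type=$(git cat-file -t 'HEAD:void')

echo "tracked: $tracked, void is a $entry_type"
```

Git creates the tree for void only as a side effect of tracking the file inside it.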

Still, there are times when we might want to ensure no other files are present in the directory.

8. Adding a .gitignore

Continuing the trend of placeholders, we can go a step further and use another Git-specific metafile:

$ mkdir void
$ touch void/.gitignore

While the ignore suffix can be confusing, there are several benefits to this approach as well:

standard Git file
hidden in UNIX-like environments
small file
filename relates to Git
ability to control directory contents with regard to Git

In particular, we can populate the .gitignore file with specific contents, assuming the empty directory is void:

$ cat void/.gitignore
# ignore all directory files
*
# except .gitignore
!.gitignore

This way, we ensure that Git ignores any other files in the directory we want empty.
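
A brief sketch in a throwaway repository can verify these rules. The cache.tmp file is a hypothetical stray file used only for illustration:

```shell
cd "$(mktemp -d)" && git init -q
mkdir void

# Ignore everything in the directory except the .gitignore file itself.
printf '%s\n' '*' '!.gitignore' > void/.gitignore
touch void/cache.tmp

# Adding the directory stages only the paths that aren't ignored.
git add void
tracked=$(git ls-files)

# check-ignore exits with status 0 when a path matches an ignore rule.
if git check-ignore -q void/cache.tmp; then ignored=yes; else ignored=no; fi

echo "tracked: $tracked, cache.tmp ignored: $ignored"
```

Only void/.gitignore should end up in the index, while the stray file stays out of the repository.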

9. Using a Placeholder

If a given directory has a special function apart from being needed by other parts of the repository, we can also add a placeholder file.

Let’s see an example:

$ mkdir void
$ touch void/placeholder

Of course, we can keep it hidden in UNIX-like environments by prefixing it with a period. Such a file would be akin to the .placeholder files in the standard cron directories.

In this case, we use the placeholder file to ensure Git tracks the directory that contains it:

$ git add void
$ git commit --all --message 'almost empty directory'
[main b506662] almost empty directory
 1 file changed, 0 insertions(+), 0 deletions(-)
 create mode 100644 void/placeholder

Even at this point, Git doesn’t separately refer to void, only to the file within.

One variation of this method is a README file with a description of the need for this directory.

This solution works when we don’t mind having a custom object within the empty directory.

10. Summary

In this article, we explored the concepts of Git objects with the main example of creating and maintaining an empty directory within a repository.

In conclusion, although Git doesn’t track directories, we have ways to work around this limitation and add an empty or almost empty directory to a Git repository.
