Appendix A — Git Submodules
A submodule is a Git repository embedded inside another Git repository. The outer repository (the superproject) records a specific commit SHA from the inner repository (the submodule) rather than copying the submodule's files. This makes submodules useful when a project depends on another project's source code at a fixed, controlled version — for example, a shared library, a vendored dependency, or a documentation theme.
When to Use Submodules
Submodules are appropriate when:
- You need to include another repository's source code at a pinned version that you control explicitly.
- The dependency is developed separately (perhaps by a different team or organisation) and has its own release cycle.
- You want contributors to be able to update the dependency deliberately, with the update recorded as a commit in the superproject.
Submodules are not a good fit when:
- You just need a package: use a language package manager (npm, pip, cargo, go mod) instead.
- The dependency changes frequently and you always want the latest version: the overhead of updating a submodule pointer exceeds the benefit.
- Your team is not familiar with submodule workflows: the extra conceptual overhead causes errors (forgetting to initialise, working in a detached HEAD in the submodule, pushing the superproject without pushing the submodule).
Adding a Submodule
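The general form is git submodule add <url> <path>. For example, to add a shared library at lib/mylib, run git submodule add <url> lib/mylib, substituting the library's repository URL.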
This does three things:
- Clones the submodule repository into lib/mylib/.
- Creates or updates .gitmodules at the superproject root, a configuration file that records the submodule name, path, and URL.
- Stages both .gitmodules and a special gitlink entry for lib/mylib (file mode 160000).
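Commit to record the addition with an ordinary git commit.
The new .gitmodules entry is a [submodule "lib/mylib"] section whose path and url keys hold the submodule's path and clone URL.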
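The gitlink entry in the superproject's tree stores only the submodule's commit SHA, not its files. Because lib/mylib sits under lib/, git cat-file -p HEAD^{tree} on the superproject shows only a tree entry for lib; run git ls-tree HEAD lib/ (or git cat-file -p HEAD:lib) to see the 160000 mode entry, of type commit, for the submodule path.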
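The add, commit, and inspect steps can be tried end to end in a throwaway sandbox. The following is a minimal sketch, not a recommended project layout: the repositories mylib and superproject are hypothetical local stand-ins for real remotes, the commits are empty, and the protocol.file.allow override is only there because recent Git versions refuse local-path submodule clones by default.

```shell
set -eu
sandbox=$(mktemp -d) && cd "$sandbox"
# Throwaway identity so commits work without any global Git config
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Hypothetical library repository with a single (empty) commit
git init -q mylib
git -C mylib commit -q --allow-empty -m "Initial library commit"

# Superproject that embeds the library at lib/mylib
git init -q superproject
cd superproject
git -c protocol.file.allow=always submodule add "$sandbox/mylib" lib/mylib

cat .gitmodules                  # name, path, and URL of the submodule
git ls-files --stage lib/mylib   # staged gitlink entry, mode 160000

git commit -q -m "Add mylib submodule"
git ls-tree HEAD lib/            # committed gitlink: 160000 commit <sha> lib/mylib

# The recorded SHA is the commit checked out inside the submodule
recorded=$(git rev-parse HEAD:lib/mylib)
checked_out=$(git -C lib/mylib rev-parse HEAD)
echo "recorded=$recorded"
echo "checked_out=$checked_out"
```

The two SHAs printed at the end should match, which is the whole content of the gitlink: a pointer to one commit in the other repository.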
Cloning a Repository with Submodules
When you clone a superproject, submodule directories exist but are empty by default:
git clone https://github.com/example/superproject.git
cd superproject
ls lib/mylib # empty directory
To populate them, initialise and fetch the submodule content:
git submodule init # register submodules from .gitmodules into .git/config
git submodule update # clone each submodule and check out the pinned commit
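Or do both in one step with git submodule update --init.
To recursively initialise nested submodules (submodules within submodules), add --recursive: git submodule update --init --recursive.
The most convenient approach is to clone with submodules fully populated from the start, using git clone --recurse-submodules <url>.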
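To compare the two routes side by side, the sketch below clones a hypothetical local superproject twice: once plainly, followed by git submodule update --init --recursive, and once with --recurse-submodules. The repository names, the lib.txt file, and the sandbox layout are assumptions made for illustration; protocol.file.allow is overridden only because the "remotes" are local paths.

```shell
set -eu
sandbox=$(mktemp -d) && cd "$sandbox"
# Throwaway identity so commits work without any global Git config
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Hypothetical library and superproject, both local stand-ins for real remotes
git init -q mylib
echo "library code" > mylib/lib.txt
git -C mylib add lib.txt
git -C mylib commit -q -m "Initial library commit"
git init -q superproject
git -C superproject -c protocol.file.allow=always submodule add "$sandbox/mylib" lib/mylib
git -C superproject commit -q -m "Add mylib submodule"

# Plain clone: the submodule directory exists but holds no files yet
git clone -q superproject plain
ls -A plain/lib/mylib | wc -l

# Initialise and populate in one step, recursing into nested submodules if any
git -C plain -c protocol.file.allow=always submodule update --init --recursive
ls plain/lib/mylib

# Or clone with everything populated from the start
git -c protocol.file.allow=always clone -q --recurse-submodules superproject full
ls full/lib/mylib
```

Both clones should end up with lib.txt inside lib/mylib; the only difference is whether the contributor has to remember the second command.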
The Detached HEAD in Submodules
After git submodule update, each submodule is in a detached HEAD state — it is checked out at the specific commit SHA recorded by the superproject, not on any branch. This is intentional: the superproject owns the precise version; the submodule has no branch to accidentally advance.
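If you need to make changes inside the submodule, explicitly check out a branch first (for example, cd lib/mylib and then git switch main), and commit your changes on that branch.
Then return to the superproject and update the pinned SHA to point to the new commit by staging the submodule path (git add lib/mylib) and committing.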
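The whole round trip (detached HEAD after cloning, switching to a branch, committing, then re-pinning in the superproject) can be sketched in a sandbox. The repository names and the main branch are assumptions for illustration, and the commits are empty to keep the example short.

```shell
set -eu
sandbox=$(mktemp -d) && cd "$sandbox"
# Throwaway identity so commits work without any global Git config
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Hypothetical library whose initial branch is named main
git init -q mylib
git -C mylib symbolic-ref HEAD refs/heads/main
git -C mylib commit -q --allow-empty -m "Initial library commit"

git init -q superproject
git -C superproject -c protocol.file.allow=always submodule add "$sandbox/mylib" lib/mylib
git -C superproject commit -q -m "Add mylib submodule"

# A fresh recursive clone leaves the submodule on a detached HEAD
git -c protocol.file.allow=always clone -q --recurse-submodules superproject work
cd work/lib/mylib
git symbolic-ref -q HEAD || echo "HEAD is detached"

# Check out a branch before committing, so the commit lands on that branch
git switch -q main
git commit -q --allow-empty -m "Library fix"

# Back in the superproject, pin the new commit
cd ../..
git add lib/mylib
git commit -q -m "Update mylib to the library fix"
git ls-tree HEAD lib/
```

In a real project the new submodule commit would also need to be pushed to the library's remote before the superproject commit is shared.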
Updating a Submodule to a Newer Commit
Fetch and move the pointer manually
cd lib/mylib
git fetch
git switch main
git pull
cd ../..
git add lib/mylib
git commit -m "Update mylib to latest main"
Use git submodule update --remote
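--remote fetches from the submodule's upstream and updates the working tree to the latest commit on the tracking branch, for example git submodule update --remote lib/mylib. The superproject then shows the submodule path as modified until you stage and commit the new pointer.
By default --remote tracks the branch that the submodule remote's HEAD points to, which is often main (older Git versions defaulted to master). You can configure a different branch per submodule by setting the branch key in that submodule's section of .gitmodules.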
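As an illustration of the per-submodule branch setting, the sketch below points a hypothetical submodule at a branch named stable and then runs update --remote. The branch name, repository names, and sandbox layout are all assumptions; git config -f .gitmodules is simply one way to write the branch key.

```shell
set -eu
sandbox=$(mktemp -d) && cd "$sandbox"
# Throwaway identity so commits work without any global Git config
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Hypothetical library with two branches: main and stable
git init -q mylib
git -C mylib symbolic-ref HEAD refs/heads/main
git -C mylib commit -q --allow-empty -m "Initial library commit"
git -C mylib branch stable

git init -q superproject
cd superproject
git -c protocol.file.allow=always submodule add "$sandbox/mylib" lib/mylib
git commit -q -m "Add mylib submodule"

# Upstream moves on: a new commit lands on the library's stable branch
git -C "$sandbox/mylib" switch -q stable
git -C "$sandbox/mylib" commit -q --allow-empty -m "Stable release"

# Track stable for this submodule, then pull in its latest commit
git config -f .gitmodules submodule.lib/mylib.branch stable
git -c protocol.file.allow=always submodule update --remote lib/mylib

# Record both the new branch setting and the new pointer
git add .gitmodules lib/mylib
git commit -q -m "Track mylib stable and update pointer"
git ls-tree HEAD lib/
```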
Checking Submodule Status
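Output format: a prefix character, the SHA of the currently checked-out commit (or the recorded commit if the submodule is not initialised), the path, and, for initialised submodules, the output of git describe for that commit in parentheses:
| Prefix | Meaning |
|---|---|
| (space) | Submodule is at the commit the superproject expects |
| - | Submodule has not been initialised (empty directory) |
| + | Submodule is at a different commit than the superproject expects |
| U | Submodule has merge conflicts |
The SHAs below are illustrative and abbreviated; Git prints all 40 characters.
git submodule status
#  a3f8c21d4e5b... lib/mylib (<describe output>)   # at the recorded commit
# +b1c2d3e4f5a6... lib/mylib (<describe output>)   # differs from the recorded commit
# -a3f8c21d4e5b... lib/mylib                       # not initialised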
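The prefixes can be observed directly by walking one submodule through three states. This is a sandbox sketch with hypothetical repository names; it captures only the first character of each status line, since the SHAs and describe output will differ on every run.

```shell
set -eu
sandbox=$(mktemp -d) && cd "$sandbox"
# Throwaway identity so commits work without any global Git config
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Hypothetical library and superproject
git init -q mylib
git -C mylib commit -q --allow-empty -m "Initial library commit"
git init -q superproject
git -C superproject -c protocol.file.allow=always submodule add "$sandbox/mylib" lib/mylib
git -C superproject commit -q -m "Add mylib submodule"

# State 1: plain clone, submodule not initialised
git clone -q superproject work
cd work
git submodule status
uninitialised=$(git submodule status | cut -c1)

# State 2: populated at the recorded commit
git -c protocol.file.allow=always submodule update --init
git submodule status
clean=$(git submodule status | cut -c1)

# State 3: a new commit inside the submodule moves it away from the recorded SHA
git -C lib/mylib commit -q --allow-empty -m "Local change"
git submodule status
moved=$(git submodule status | cut -c1)

echo "prefixes: [$uninitialised] [$clean] [$moved]"
```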
To see a summary of which commits changed across a git pull of the superproject:
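One way to get such a summary is git diff --submodule=log between the superproject commits before and after the pull (typically ORIG_HEAD and HEAD). The sketch below is an assumption-laden sandbox with hypothetical repository names; it compares HEAD~1 and HEAD instead, so that it runs without a remote to pull from.

```shell
set -eu
sandbox=$(mktemp -d) && cd "$sandbox"
# Throwaway identity so commits work without any global Git config
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Hypothetical library and superproject
git init -q mylib
git -C mylib commit -q --allow-empty -m "Initial library commit"
git init -q superproject
cd superproject
git -c protocol.file.allow=always submodule add "$sandbox/mylib" lib/mylib
git commit -q -m "Add mylib submodule"

# Advance the submodule by one commit and record the new pointer
git -C lib/mylib commit -q --allow-empty -m "Fix parser bug"
git add lib/mylib
git commit -q -m "Update mylib pointer"

# List the submodule commits introduced between the two superproject commits
summary=$(git diff --submodule=log HEAD~1 HEAD)
echo "$summary"
```

The output names the submodule path and the old and new SHAs, followed by one line per submodule commit. git submodule summary is an alternative that reports similar information.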
Pushing Superproject and Submodule Changes Together
A common mistake is pushing superproject commits that reference submodule commits that have not yet been pushed. Anyone who clones or pulls the superproject will get a gitlink pointing to a SHA that does not exist on the remote.
Guard against this by pushing with git push --recurse-submodules=check.
This aborts the push if any submodule commit referenced by the superproject has not been pushed. Use git push --recurse-submodules=on-demand instead to push submodules automatically before pushing the superproject.
To make this the default, run git config push.recurseSubmodules on-demand.
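The check mode can be exercised in a sandbox by recording a submodule commit that exists only locally and then attempting to push the superproject. Everything named here (mylib, superproject, the bare super.git remote) is hypothetical, and the sketch covers only check; on-demand would additionally need a submodule remote that accepts pushes.

```shell
set -eu
sandbox=$(mktemp -d) && cd "$sandbox"
# Throwaway identity so commits work without any global Git config
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Hypothetical library, plus a bare repository acting as the superproject's remote
git init -q mylib
git -C mylib commit -q --allow-empty -m "Initial library commit"
git init -q --bare super.git

git init -q superproject
cd superproject
git remote add origin "$sandbox/super.git"
git -c protocol.file.allow=always submodule add "$sandbox/mylib" lib/mylib
git commit -q -m "Add mylib submodule"

# Commit inside the submodule without pushing it, then record the new SHA
git -C lib/mylib commit -q --allow-empty -m "Unpushed library change"
git add lib/mylib
git commit -q -m "Update mylib pointer"

# check should refuse the push while the submodule commit is local-only
if git push -q --recurse-submodules=check origin HEAD 2>/dev/null; then
    result=pushed
else
    result=refused
fi
echo "push with --recurse-submodules=check: $result"
```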
Running Commands Across All Submodules
git submodule foreach executes a shell command in each submodule directory:
git submodule foreach git status
git submodule foreach git pull origin main
git submodule foreach 'echo "Submodule: $name at $(git rev-parse --short HEAD)"'
The $name variable expands to the submodule name defined in .gitmodules.
Removing a Submodule
Removing a submodule requires cleaning up in several places:
# 1. Unregister the submodule
git submodule deinit -f lib/mylib
# 2. Remove the submodule from the index and working tree
git rm -f lib/mylib
# 3. Remove the cached submodule metadata
rm -rf .git/modules/lib/mylib
# 4. Commit
git commit -m "Remove mylib submodule"
Skipping step 3 leaves stale metadata in .git/modules/ that can cause errors if you later try to add a submodule at the same path.
Common Pitfalls
Forgetting --recurse-submodules when cloning
Contributors who clone the repository without --recurse-submodules (or forget to run git submodule update --init) will have empty submodule directories and likely encounter build errors. Document the setup steps in your project's README.
Working in detached HEAD
After git submodule update, the submodule is in detached HEAD. Commits made there belong to no branch: as soon as you check out something else, or the next git submodule update moves HEAD, they are reachable only through the reflog and will eventually be garbage collected. Always git switch <branch> inside the submodule before making changes.
Forgetting to push the submodule before the superproject
The superproject commit points to a submodule SHA that only exists locally. Other contributors can clone the superproject but cannot fetch that SHA. Set push.recurseSubmodules to on-demand (or at least check) so that Git pushes, or insists on, the submodule commits first.
Nested submodules
Submodules can themselves contain submodules. Use --recurse-submodules with git clone, and --recursive with git submodule update and git submodule foreach, to ensure all levels are covered.
Quick Reference
# Add a submodule
git submodule add <url> <path>
# Clone including submodules
git clone --recurse-submodules <url>
# Initialise + populate after clone
git submodule update --init --recursive
# Check status of all submodules
git submodule status
# Update submodule to latest remote commit
git submodule update --remote <path>
# Run a command in every submodule
git submodule foreach <command>
# Push, ensuring submodule commits are pushed first
git push --recurse-submodules=on-demand
# Remove a submodule
git submodule deinit -f <path>
git rm -f <path>
rm -rf .git/modules/<path>
git commit -m "Remove <name> submodule"
Summary
- A submodule embeds a separate Git repository inside a superproject, pinned to a specific commit SHA.
- Use submodules for vendored dependencies or shared libraries with independent release cycles. Use a package manager for ordinary dependencies.
- Always clone with --recurse-submodules, or run git submodule update --init --recursive after cloning.
- Submodule working trees are in detached HEAD after update: check out a branch before committing.
- Protect against unpushed submodule references with push.recurseSubmodules = on-demand.
- Removing a submodule requires git submodule deinit, git rm, and deleting .git/modules/<name>.
Further reading: Git Tools — Submodules (Pro Git)