Skip to content

Avoid the_repository in merge-ort and replay#2048

Open
newren wants to merge 6 commits intogitgitgadget:masterfrom
newren:avoid_the_repository
Open

Avoid the_repository in merge-ort and replay#2048
newren wants to merge 6 commits intogitgitgadget:masterfrom
newren:avoid_the_repository

Conversation

@newren
Copy link

@newren newren commented Feb 18, 2026

Changes since v1:

  • Add a preparatory patch removing the_repository check from blob prefetching in both merge-ort and diff*; it's no longer necessary
  • Fix casing mismatch
  • Simplify the hammer a bit based on the new first patch, but add some simple comments explaining it

Remove explicit uses of the_repository and the_hash_algo from merge-ort, and since this has now been done multiple times for both merge-ort and replay, implement a small measure to prevent them from returning to either merge-ort or replay.

See https://lore.kernel.org/git/CABPp-BH7E1Bh2g0vR3T4NEsv34DvFQPzMuJSsqtOAaWY-fFCxg@mail.gmail.com/ and https://lore.kernel.org/git/CABPp-BFuwvqiCTCCpoyT6em9_1-qrgPWHWhrufQ3UuZ+Kfkb6A@mail.gmail.com/ for recent discussions on these.

As noted in the comments on v1, I actually do not know why prefetch_for_content_merges() needs to use the_repository. When I introduced it back in 2bff554 (merge-ort: add prefetching for content merges, 2021-06-22), I was just looking at diffcore_std() and trying to mimic how it did the prefetch, and it has such a comparison. If anyone knows why diffcore_std() needs to compare against the_repository, I'd love to hear...

Series overview:
Patches 1-3: Mostly mechanical removal of existing uses
Patches 4-5: Simple hammer to prevent the problem from returning

cc: "Kristoffer Haugsbakk" kristofferhaugsbakk@fastmail.com
cc: Patrick Steinhardt ps@pks.im
cc: Elijah Newren newren@gmail.com

@newren newren force-pushed the avoid_the_repository branch from e2edab9 to d75a71a Compare February 18, 2026 02:37
@newren
Copy link
Author

newren commented Feb 18, 2026

/submit

@gitgitgadget
Copy link

gitgitgadget bot commented Feb 18, 2026

Submitted as pull.2048.git.1771406115.gitgitgadget@gmail.com

To fetch this version into FETCH_HEAD:

git fetch https://github.com/gitgitgadget/git/ pr-2048/newren/avoid_the_repository-v1

To fetch this version to local tag pr-2048/newren/avoid_the_repository-v1:

git fetch --no-tags https://github.com/gitgitgadget/git/ tag pr-2048/newren/avoid_the_repository-v1

@gitgitgadget
Copy link

gitgitgadget bot commented Feb 19, 2026

User "Kristoffer Haugsbakk" <kristofferhaugsbakk@fastmail.com> has been added to the cc: list.

@newren newren force-pushed the avoid_the_repository branch from d75a71a to f9e2474 Compare February 19, 2026 12:59
@gitgitgadget
Copy link

gitgitgadget bot commented Feb 19, 2026

User Patrick Steinhardt <ps@pks.im> has been added to the cc: list.

@gitgitgadget
Copy link

gitgitgadget bot commented Feb 19, 2026

User Elijah Newren <newren@gmail.com> has been added to the cc: list.

@newren newren force-pushed the avoid_the_repository branch from f9e2474 to 6a78332 Compare February 19, 2026 18:46
Prefetching of blobs from promisor remotes was added to diff in
7fbbcb2 (diff: batch fetching of missing blobs, 2019-04-05).  In
that commit,

  https://lore.kernel.org/git/20190405170934.20441-1-jonathantanmy@google.com/

was squashed into

  https://lore.kernel.org/git/44de02e584f449481e6fb00cf35d74adf0192e9d.1553895166.git.jonathantanmy@google.com/

without the extra explanation about the squashed changes being added to
the commit message; in particular, this explanation from that first link
is absent:

> Also, prefetch only if the repository being diffed is the_repository
> (because we do not support lazy fetching for any other repository
>  anyway).

Then, later, this checking was spread from diff.c to diffcore-rename.c
and diffcore-break.c by 95acf11 (diff: restrict when prefetching
occurs, 2020-04-07) and then further split in d331dd3
(diffcore-rename: allow different missing_object_cb functions,
2021-06-22).  I also copied the logic from prefetching blobs from
diff.c to merge-ort.c in 2bff554 (merge-ort: add prefetching for
content merges, 2021-06-22).

The reason for all these checks was noted above -- we only supported
lazy fetching for the_repository.  However, that changed with
ef830cc (promisor-remote: teach lazy-fetch in any repo,
2021-06-17), so these checks are now unnecessary.  Remove them.

Signed-off-by: Elijah Newren <newren@gmail.com>
In order to get rid of a usage of the_repository, we need to know the
value of opt->repo; pass it along to write_tree().  Once we have the
repository, though, we no longer need to pass
opt->repo->hash_algo->rawsz, we can have write_tree() look up that value
itself.

Signed-off-by: Elijah Newren <newren@gmail.com>
We have a perfectly valid repository available and do not need to use
the_repository, except for one location in
prefetch_for_content_merges().

Signed-off-by: Elijah Newren <newren@gmail.com>
We have a perfectly valid repository available and do not need to use
the_hash_algo (a shorthand for the_repository->hash_algo), so use the
known repository instead.

Signed-off-by: Elijah Newren <newren@gmail.com>
Due to the use of DEFAULT_ABBREV, we cannot get rid of our usage of
USE_THE_REPOSITORY_VARIABLE.  However, we have removed all other uses of
the_repository in merge-ort a few times.  But they keep coming back.

Define the_repository to make it a compilation error so that they don't
come back any more.

Signed-off-by: Elijah Newren <newren@gmail.com>
Due to the use of DEFAULT_ABBREV, we cannot get rid of our usage of
USE_THE_REPOSITORY_VARIABLE.  We have removed all other uses of
the_repository before, but without removing that definition, they keep
coming back.

Define the_repository to make it a compilation error so that they don't
come back any more.

Signed-off-by: Elijah Newren <newren@gmail.com>
@newren newren force-pushed the avoid_the_repository branch from 6a78332 to 67db46f Compare February 20, 2026 00:26
@newren
Copy link
Author

newren commented Feb 20, 2026

/submit

@gitgitgadget
Copy link

gitgitgadget bot commented Feb 20, 2026

Submitted as pull.2048.v2.git.1771552788.gitgitgadget@gmail.com

To fetch this version into FETCH_HEAD:

git fetch https://github.com/gitgitgadget/git/ pr-2048/newren/avoid_the_repository-v2

To fetch this version to local tag pr-2048/newren/avoid_the_repository-v2:

git fetch --no-tags https://github.com/gitgitgadget/git/ tag pr-2048/newren/avoid_the_repository-v2

@@ -7176,7 +7176,7 @@ void diffcore_std(struct diff_options *options)
* If no prefetching occurs, diffcore_rename() will prefetch if it
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Patrick Steinhardt wrote on the Git mailing list (how to reply to this email):

On Fri, Feb 20, 2026 at 01:59:43AM +0000, Elijah Newren via GitGitGadget wrote:
> diff --git a/diff.c b/diff.c
> index 35b903a9a0..91d81f66ad 100644
> --- a/diff.c
> +++ b/diff.c
> @@ -7176,7 +7176,7 @@ void diffcore_std(struct diff_options *options)
>  	 * If no prefetching occurs, diffcore_rename() will prefetch if it
>  	 * decides that it needs inexact rename detection.
>  	 */
> -	if (options->repo == the_repository && repo_has_promisor_remote(the_repository) &&
> +	if (repo_has_promisor_remote(the_repository) &&

I wonder though -- shouldn't we also pass `options->repo` to
`repo_has_promisor_remote()` now? Otherwise we may support backfill
fetches from arbitrary repositories, but we'll only do them in case the
main repository has a promisor remote.

Patrick

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant

Comments