Commits · e411aae3850c59b5c32b16e339c4ce8af284183a · lvzhengyang / git2

03 Aug, 2020 6 commits

repo: honor the init.defaultBranch setting · e411aae3

As part of a push towards more inclusive language, git is reconsidering
using "master" as the default branch name. As a first step, this
setting will be configurable with the `init.defaultBranch` configuration
option. Honor this during repository initialization.

During initialization, we will create an initial branch:

1. Using the `initial_head` setting, if specified;
2. Using the `HEAD` configured in a template, if it exists;
3. Using the `init.defaultBranch` configuration option, if it is set; or
4. Using `master` in the absence of additional configuration.
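
The precedence above can be sketched as a small selection function. This is only an illustration, not libgit2's actual code: `choose_initial_branch` and `is_set` are hypothetical names, and the three inputs are assumed to be already-extracted branch names, with NULL or an empty string meaning "not specified".

```c
#include <stddef.h>

/* Hypothetical helper: nonzero if the string carries a value. */
static int is_set(const char *s)
{
	return s != NULL && s[0] != '\0';
}

/*
 * Hypothetical sketch of the precedence list; the arguments are assumed to
 * be branch names that the caller has already extracted.
 */
static const char *choose_initial_branch(
	const char *initial_head,   /* 1. explicit `initial_head` setting */
	const char *template_head,  /* 2. HEAD configured in a template */
	const char *config_default) /* 3. `init.defaultBranch` value */
{
	if (is_set(initial_head))
		return initial_head;
	if (is_set(template_head))
		return template_head;
	if (is_set(config_default))
		return config_default;
	return "master";            /* 4. fallback without configuration */
}
```

For example, `choose_initial_branch(NULL, NULL, "main")` selects `main`, while passing NULL for all three falls back to `master`.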

committed Aug 03, 2020

Merge pull request #5596 from libgit2/ethomson/sanitizer_ci · c71321a0
```
sanitizer ci: skip negotiate tests
```
Edward Thomson committed Aug 03, 2020

sanitizer ci: skip negotiate tests · 6973c570

We don't build with SPNEGO enabled on our focal-based sanitizer builds,
so we need to disable the negotiate tests.

committed Aug 03, 2020

Merge pull request #5569 from lhchavez/ci-sanitizers · 11a62973
```
Add CI support for Memory and UndefinedBehavior Sanitizers
```
Edward Thomson committed Aug 03, 2020

Merge pull request #5563 from pks-t/pks/worktree-heads · c5d41d46
```
Access HEAD via the refdb backends
```
Edward Thomson committed Aug 03, 2020

Merge pull request #5582 from libgit2/pks-config-map-optimization · 52ccbc5d
```
config_entries: Avoid excessive map operations
```
Edward Thomson committed Aug 03, 2020

13 Jul, 2020 1 commit

config_entries: Avoid excessive map operations · f2400a9c

When appending config entries, we currently always first get the
currently existing map entry and then afterwards update the map to
contain the current config value. In the common scenario where keys
aren't being overridden, this is the best we can do. But in case a key
gets set multiple times, then we'll also perform these two map
operations. In extreme cases, hashing the map keys will thus start to
dominate performance.

Let's optimize the pattern by using a separately allocated map entry.
Currently, we always put the current list entry into the map and update
it to get any overridden multivar. As these list entries are also used
to iterate config entries, we cannot update them in-place in the map and
are thus forced to always set the map to contain the new entry. But with
a separately allocated map entry, we can now create one once per config
key and insert it into the map. Whenever appending a new config value
with the same key, we can now just update the map entry in-place instead
of having to replace the map entry completely.

This reduces calls to the hashing function by half and trades the
improved runtime for one more allocation per unique config key. Given
that the refactoring arguably improves code readability by splitting
concerns of the `config_entry_list` type and not having to track it in
two different structures, this alone would already be reason enough to
take the trade.

Given a pathological case of a gitconfig with 100,000 repeated keys and
a section of length 10,000 characters, this reduces runtime by half from
approximately 14 seconds to 7 seconds as expected.
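
The idea of a separately allocated map entry can be sketched as follows. This is a toy model under stated assumptions, not the libgit2 implementation: all type and function names are made up for illustration, the map is a fixed-size open-addressing table without resizing, and `hash_calls` exists only to make the number of hash computations visible.

```c
#include <stdlib.h>
#include <string.h>

#define SLOTS 64 /* toy capacity: no resizing, keep unique keys below this */

/* One node per config value; the list keeps insertion order for iteration. */
typedef struct list_node {
	const char *key;
	const char *value;
	struct list_node *next;
} list_node;

/* Allocated once per unique key and updated in place on later appends. */
typedef struct {
	const char *key;
	list_node *latest; /* most recently appended value for this key */
	int multivar;      /* nonzero once the key has been set more than once */
} map_entry;

typedef struct {
	map_entry *slots[SLOTS];
	list_node *head, *tail;
	unsigned long hash_calls; /* instrumentation for this sketch only */
} entries;

static size_t hash_key(entries *e, const char *key)
{
	size_t h = 5381;

	e->hash_calls++;
	while (*key)
		h = h * 33 + (unsigned char)*key++;
	return h % SLOTS;
}

/* One hash per call: returns the slot holding `key`, or the free slot for it. */
static map_entry **find_slot(entries *e, const char *key)
{
	size_t i = hash_key(e, key);

	while (e->slots[i] && strcmp(e->slots[i]->key, key) != 0)
		i = (i + 1) % SLOTS;
	return &e->slots[i];
}

static int entries_append(entries *e, const char *key, const char *value)
{
	map_entry **slot = find_slot(e, key);
	list_node *node = calloc(1, sizeof(*node));

	if (!node)
		return -1;
	if (!*slot) {
		/* First value for this key: pay for one extra allocation. */
		if ((*slot = calloc(1, sizeof(**slot))) == NULL) {
			free(node);
			return -1;
		}
		(*slot)->key = key;
	} else {
		/* Repeated key: update in place, no second map operation. */
		(*slot)->multivar = 1;
	}
	(*slot)->latest = node;

	/* Append to the iteration list; list nodes are never rewritten. */
	node->key = key;
	node->value = value;
	if (e->tail)
		e->tail->next = node;
	else
		e->head = node;
	e->tail = node;
	return 0;
}

static const char *entries_get(entries *e, const char *key)
{
	map_entry **slot = find_slot(e, key);

	return *slot ? (*slot)->latest->value : NULL;
}
```

In this model every append hashes the key exactly once, whether or not the key already exists, where a get-then-set pattern would hash it twice. Keys and values are borrowed, so the caller must keep them alive.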

committed Jul 13, 2020

12 Jul, 2020 22 commits

Merge pull request #5396 from lhchavez/mwindow-file-limit · a83fd510
```
mwindow: set limit on number of open files
```
Edward Thomson committed Jul 12, 2020

Minor nits and style formatting · 92d42eb3
lhchavez committed Jul 12, 2020

tests: verify renaming branch really updates worktree HEAD · ce4cb073

When a branch is renamed, all HEADs of the main
repository and of its worktrees that point to the old branch need to be
updated to point to the new branch. We already do so and have a test for
this, but the test only verifies that we're able to look up the updated
HEAD, not what it contains.

Let's make the test more specific by verifying the updated HEAD also has
the correct updated symbolic target.

committed Jul 12, 2020

refs: remove function to read HEAD directly · 5434f9a3

With the last user of `git_reference__read_head` gone, let's remove it
as it's been reading references without consulting the refdb backends.

committed Jul 12, 2020

repository: retrieve worktree HEAD via refdb · 65895410

The function `git_repository_head_for_worktree` currently uses
`git_reference__read_head` to directly read a given worktree's HEAD from
the filesystem. This is broken in case the repository uses a different
refdb implementation than the filesystem-based one, so let's instead
open the worktree as a real repository and use `git_reference_lookup`.
This also fixes the case where the worktree's HEAD is not a symref, but
a detached HEAD, which would have resulted in an error previously.

committed Jul 12, 2020

repository: remove function to iterate over HEADs · d1f210fc

The function `git_repository_foreach_head` is broken, as it directly
interacts with the on-disk representation of the reference database,
thus assuming that no other refdb is used for the given repository. As
this is an internal function only and all users have been replaced,
let's remove this function.

committed Jul 12, 2020

branch: determine whether a branch is checked out via refdb · ac5fbe31

We currently determine whether a branch is checked out via
`git_repository_foreach_head`. As this function reads references
directly from the disk, it breaks our refdb abstraction in case the
repository uses a different reference backend implementation than the
filesystem-based one. So let's use `git_repository_foreach_worktree`
instead -- while it's less efficient, it is at least correct in all
corner cases.
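
The callback-driven check can be sketched with a toy model. Nothing here is the libgit2 API: `toy_worktree`, `foreach_worktree` and `branch_is_checked_out` are hypothetical stand-ins, and each worktree is reduced to the symbolic target of its HEAD. In the real change each worktree is opened as a repository and its HEAD is read through the refdb.

```c
#include <stddef.h>
#include <string.h>

/* Toy stand-in for a repository or worktree: only the symbolic target of
 * HEAD is modeled (NULL when HEAD is detached). */
typedef struct {
	const char *head_target;
} toy_worktree;

typedef int (*worktree_cb)(const toy_worktree *wt, void *payload);

/* Calls `cb` for every entry; a nonzero return stops the iteration early
 * and is handed back to the caller. */
static int foreach_worktree(const toy_worktree *all, size_t n,
	worktree_cb cb, void *payload)
{
	size_t i;
	int ret;

	for (i = 0; i < n; i++)
		if ((ret = cb(&all[i], payload)) != 0)
			return ret;
	return 0;
}

/* Callback: nonzero if this worktree's HEAD points at the given branch. */
static int head_points_to(const toy_worktree *wt, void *payload)
{
	const char *branch = payload;

	return wt->head_target != NULL && strcmp(wt->head_target, branch) == 0;
}

/* Returns 1 if any worktree, including the main one, has `branch` checked out. */
static int branch_is_checked_out(const toy_worktree *all, size_t n,
	const char *branch)
{
	return foreach_worktree(all, n, head_points_to, (void *)branch);
}
```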

committed Jul 12, 2020

refs: update HEAD references via refdb · 7216b048

When renaming a reference, we need to iterate over every HEAD and
potentially update it in case it is a symbolic reference pointing to the
previous name of the renamed reference. Most importantly, this doesn't
only include HEADs from the repo we're renaming the reference in, but we
also need to iterate over HEADs from linked worktrees.

In order to update the HEADs, we directly read them from the worktree's
gitdir and thus assume that both repository and worktrees use the
filesystem-based reference backend. But this breaks as soon as a
repository uses a different refdb, and it breaks our own abstractions. So
let's instead update HEAD references via the refdb by first opening each
worktree as a repository and then using the usual functions to read and
update HEADs. This is a lot less efficient than the current code, but
it's not like we can really help this: going via the refdb is mandatory.
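
The update step itself is simple once every HEAD is reachable through a uniform interface. The following is a toy sketch, not libgit2 code: `toy_worktree` and `update_heads_on_rename` are hypothetical, and the array stands in for "the main repository plus all linked worktrees, each opened as a repository".

```c
#include <stddef.h>
#include <string.h>

/* Toy stand-in: only the symbolic target of HEAD is modeled
 * (NULL when HEAD is detached). */
typedef struct {
	const char *head_target;
} toy_worktree;

/* Repoints every HEAD that symbolically targets `old_name` to `new_name`.
 * Detached HEADs and HEADs on other branches are left alone.
 * Returns the number of HEADs that were updated. */
static size_t update_heads_on_rename(toy_worktree *all, size_t n,
	const char *old_name, const char *new_name)
{
	size_t i, updated = 0;

	for (i = 0; i < n; i++) {
		if (all[i].head_target == NULL)
			continue;
		if (strcmp(all[i].head_target, old_name) == 0) {
			all[i].head_target = new_name;
			updated++;
		}
	}
	return updated;
}
```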

committed Jul 12, 2020

repository: introduce new function to iterate over all worktrees · 2fcb4f28

Given a Git repository, it's non-trivial to iterate over all worktrees
that are associated with it, including the "main" repository. This
commit adds a new internal function `git_repository_foreach_worktree`
that does this for us.

committed Jul 12, 2020

Merge pull request #5570 from libgit2/pks/refdb-refactorings · 26b9e489
```
refdb: a set of preliminary refactorings for the reftable backend
```
Edward Thomson committed Jul 12, 2020

refdb: avoid unlimited spinning in case of symref cycles · 34987447

To determine whether another reflog entry needs to be written for HEAD
on a reference update, we need to see whether HEAD directly or
indirectly points to the reference we're updating. The resolve logic is
currently completely unbounded unless an error occurs, which effectively
means that we'd be spinning forever in case we have a symref loop in the
repository refdb.

Let's fix the issue by using `git_refdb_resolve` instead, which is
always bounded.
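
Why a bound matters can be shown with a toy resolver. This is a sketch and not the body of `git_refdb_resolve`: `toy_ref`, `lookup`, `resolve_bounded` and the limit `MAX_DEPTH` are all assumptions made for illustration.

```c
#include <stddef.h>
#include <string.h>

#define MAX_DEPTH 10 /* assumed limit for this sketch */

typedef struct {
	const char *name;
	const char *symbolic_target; /* NULL for a direct reference */
} toy_ref;

static const toy_ref *lookup(const toy_ref *refs, size_t n, const char *name)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (strcmp(refs[i].name, name) == 0)
			return &refs[i];
	return NULL;
}

/* Inspects at most MAX_DEPTH references along the chain. Returns the direct
 * reference, or NULL if the chain is broken or too deep (e.g. a cycle). */
static const toy_ref *resolve_bounded(const toy_ref *refs, size_t n,
	const char *name)
{
	const toy_ref *ref = lookup(refs, n, name);
	int depth;

	for (depth = 0; ref != NULL && depth < MAX_DEPTH; depth++) {
		if (ref->symbolic_target == NULL)
			return ref;
		ref = lookup(refs, n, ref->symbolic_target);
	}
	return NULL;
}
```

With two references that point at each other, the loop gives up after `MAX_DEPTH` steps instead of spinning forever.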

committed Jul 12, 2020

refs: replace reimplementation of reference resolver · b895547c

The refs code currently has a second implementation that resolves
references in order to find any final symbolic reference pointing to a
nonexistent target branch. As we've just extended `git_refdb_resolve` to
also return such references, let's use that one instead in order to
reduce code duplication.

committed Jul 12, 2020

refdb: return resolved symbolic refs pointing to nonexistent refs · cf7dd05b

In some cases, resolving references requires us to also know about the
final symbolic reference that's pointing to a nonexistent branch, e.g.
in an empty repository where the main branch is yet unborn but HEAD
already points to it. Right now, the resolving logic is thus split up
into two, where one is the new refdb implementation and the second one
is an ad-hoc implementation inside "refs.c".

Let's extend `git_refdb_resolve` to also return such final dangling
references pointing to nonexistent branches so we can deduplicate the
resolving logic.
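
The "return the final dangling symbolic reference" behaviour can be modeled as below. This is a hedged sketch, not libgit2's code: `toy_ref`, `lookup`, `resolve_or_dangling` and `MAX_DEPTH` are hypothetical, and the caller is assumed to tell the two outcomes apart by checking whether the returned reference is still symbolic.

```c
#include <stddef.h>
#include <string.h>

#define MAX_DEPTH 10 /* assumed limit for this sketch */

typedef struct {
	const char *name;
	const char *symbolic_target; /* NULL for a direct reference */
} toy_ref;

static const toy_ref *lookup(const toy_ref *refs, size_t n, const char *name)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (strcmp(refs[i].name, name) == 0)
			return &refs[i];
	return NULL;
}

/* Resolves `name`. If the chain ends at a symbolic reference whose target
 * does not exist (an unborn branch), that symbolic reference is returned
 * instead of an error. Returns NULL if `name` itself is missing or the
 * chain is too deep. */
static const toy_ref *resolve_or_dangling(const toy_ref *refs, size_t n,
	const char *name)
{
	const toy_ref *ref = lookup(refs, n, name), *next;
	int depth;

	for (depth = 0; ref != NULL && depth < MAX_DEPTH; depth++) {
		if (ref->symbolic_target == NULL)
			return ref; /* fully resolved */
		next = lookup(refs, n, ref->symbolic_target);
		if (next == NULL)
			return ref; /* dangling: target does not exist yet */
		ref = next;
	}
	return NULL;
}
```

In an empty repository modeled as a single `HEAD` entry pointing at `refs/heads/main`, the function returns `HEAD` itself, so the caller can still read the name of the unborn branch from `symbolic_target`.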

committed Jul 12, 2020

refs: move resolving of references into the refdb · c54f40e4

Resolving of symbolic references is currently implemented inside the
"refs" layer. As a result, it's hard to call this function from
low-level parts that only have a refdb available, but no repository, as
the "refs" layer always operates on the repository-level. So let's move
the function into the generic "refdb" implementation to lift this
restriction.

committed Jul 12, 2020

Merge pull request #5547 from pks-t/pks/cmake-modernization-pt2 · ae30009e
```
CMake modernization pt2
```
Patrick Steinhardt committed Jul 12, 2020

tests: reflog: remove unused signature · 9703d26f

There are two tests that create a commit signature, but never make any use
of it. Let's remove these to avoid any confusion.

committed Jul 12, 2020

refdb: extract function to check whether to append HEAD to the reflog · 1f39593b

The logic to determine whether a reflog entry should be for the HEAD
reference is non-trivial. Currently, the only user of this is the
filesystem-based refdb, but with the advent of the reftable refdb we're
going to add a second user that's interested in having the same
behaviour.

Let's pull out a new function that checks whether a given reference
should cause an entry to be written to the HEAD reflog as a preparatory
step.

committed Jul 12, 2020

refdb: extract function to check whether a reflog should be written · e02478b1

The logic to determine whether a reflog should be written is
non-trivial. Currently, the only user of this is the filesystem-based
refdb, but with the advent of the reftable refdb we're going to add a
second user that's interested in having the same behaviour.

Let's pull out a new function that checks whether a given reference
should cause a reflog to be written as a preparatory step.
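
To give an idea of what such a predicate has to decide, here is a sketch modeled on the documented behaviour of Git's `core.logAllRefUpdates` option. It is not the extracted libgit2 function: the enum, the function names and the `has_log` parameter are hypothetical, and treating the "false" mode as "never write" is an assumption of this sketch.

```c
#include <string.h>

/* Hypothetical stand-in for the three `core.logAllRefUpdates` modes. */
enum log_all_ref_updates { LOG_FALSE, LOG_TRUE, LOG_ALWAYS };

static int has_prefix(const char *s, const char *prefix)
{
	return strncmp(s, prefix, strlen(prefix)) == 0;
}

/* `has_log` is nonzero if a reflog already exists for `name`.
 * Returns 1 if an update of `name` should be recorded in a reflog. */
static int should_write_reflog(enum log_all_ref_updates mode, int has_log,
	const char *name)
{
	switch (mode) {
	case LOG_ALWAYS:
		return 1;
	case LOG_TRUE:
		/* Existing logs are kept up to date; new ones are only
		 * created for HEAD, branches, remote refs and notes. */
		return has_log ||
			strcmp(name, "HEAD") == 0 ||
			has_prefix(name, "refs/heads/") ||
			has_prefix(name, "refs/remotes/") ||
			has_prefix(name, "refs/notes/");
	default:
		return 0; /* assumption: "false" means never write */
	}
}
```

Keeping the decision in one function means a second backend can ask the same question without duplicating the rules.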

committed Jul 12, 2020

cmake: remove CheckPrototypeDefinition module · 9bc6e655

In the past, we've imported the CheckPrototypeDefinition module into our own
module directory as it wasn't yet available in all supported CMake
versions. Now that we require at least CMake v3.5, we don't need to
bundle it anymore as it's included with the distribution already.

Let's drop the included modules and always use upstream's version.

committed Jul 12, 2020

cmake: use target-specific compile definitions · 4218403e

We set up some compile definitions as part of our src/CMakeLists.txt.
While the definitions are global, we really only need them as part of
the git2internal target which compiles all the objects. Let's thus use
`target_compile_definitions` instead of `add_definitions`.

committed Jul 12, 2020

cmake: use git2internal target to populate sources · 53911edd

Modern CMake is usually target-driven in that a target is first defined
and then the likes of `target_sources`, `target_include_directories`
etc. are used to further populate the target. We still use old-style
CMake, where we first set up a set of variables and then populate the
target in a single call.

Let's migrate to modern CMake usage by starting to populate the sources
of our git2internal target piece-by-piece. While this is a small step,
it allows us to convert to target-based build instructions
piece-by-piece.

committed Jul 12, 2020

cmake: specify project version · 19eb1e4b

We currently do not set up a project version within CMake, meaning that
it can't be used by other projects including libgit2 as a sub-project and
also not by other tools like IDEs.

This commit changes this to always set up a project version, but instead
of extracting it from the "version.h" header we now set it up directly.
This is mostly to avoid mis-use of the previous `LIBGIT2_VERSION`
variables, as we should now always use the `libgit2_VERSION` ones that
are set up by CMake if one provides the "VERSION" keyword to the
`project()` call. While this is one more moving target we need to adjust
on releases, this commit also adjusts our release script to verify that
the project version was incremented as expected.

committed Jul 12, 2020

09 Jul, 2020 4 commits

Add CI support for Memory and UndefinedBehavior Sanitizers · 6a917c04

This change adds two new build targets: MSan and UBSan. This is because
even though OSS-Fuzz is great and adds a lot of coverage, it only does
that for the fuzz targets, so the rest of the codebase is not
necessarily run with the Sanitizers ever :( So this change makes sure
that MSan/UBSan warnings don't make it into the codebase.

As part of this change, the Ubuntu focal container is introduced. It
builds mbedTLS and libssh2 as debug libraries into /usr/local and as
MSan-enabled libraries into /usr/local/msan. This latter part is needed
because MSan requires the binary and all its dependent libraries to be
built with MSan support so that memory allocations and deallocations are
tracked correctly to avoid false positives.

committed Jul 09, 2020

Merge pull request #5568 from lhchavez/ubsan · 325375e3
```
Make the tests run cleanly under UndefinedBehaviorSanitizer
```
Edward Thomson committed Jul 09, 2020

Merge pull request #5567 from lhchavez/msan · 2ffa426e
```
Make the tests pass cleanly with MemorySanitizer
```
Edward Thomson committed Jul 09, 2020

Merge pull request #5561 from A-Ovchinnikov-mx/a-ovchin/windres-rc · 60536163
```
Enable building git2.rc resource script with GCC
```
Edward Thomson committed Jul 09, 2020

02 Jul, 2020 1 commit

Merge pull request #5571 from lhchavez/ntlmclient-sanitizers · 8720ae8a
```
Make NTLMClient Memory and UndefinedBehavior Sanitizer-clean
```
Edward Thomson committed Jul 02, 2020

01 Jul, 2020 3 commits

Use __GNUC__ macro in the resource script · dc1deb3b
```
Fix the default LIBGIT2_FILENAME for GNU windres
```
Alexander Ovchinnikov committed Jul 01, 2020

Review: Rename the stringize macro · 71000441
Alexander Ovchinnikov committed Jul 01, 2020

Enable building git2.rc resource script with GCC · 5c40456b
Alexander Ovchinnikov committed Jul 01, 2020

30 Jun, 2020 3 commits

Make NTLMClient Memory and UndefinedBehavior Sanitizer-clean · 7c964416

This change makes the code pass the libgit2 tests cleanly when
MSan/UBSan are enabled. Notably:

* Changes malloc/memset combos into calloc for easier auditing.
* Makes `write_buf` return early if the buffer length is zero to avoid
  arithmetic with NULL pointers (which UBSan does not like).
* Initializes a few arrays that were sometimes being read before being
  written to.
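
The first two points can be illustrated with a minimal sketch. The code below is not taken from ntlmclient: `toy_buf` and `toy_buf_init` are hypothetical, and this `write_buf` is a simplified re-implementation that only shares its name with the function mentioned above.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical buffer type for this sketch. */
typedef struct {
	unsigned char *data;
	size_t pos, len;
} toy_buf;

/* calloc replaces a malloc + memset pair: the memory is zeroed in one call,
 * so no byte can be read before it has a defined value. */
static int toy_buf_init(toy_buf *b, size_t len)
{
	b->data = len ? calloc(1, len) : NULL;
	b->pos = 0;
	b->len = len;
	return (len && !b->data) ? -1 : 0;
}

static int write_buf(toy_buf *b, const unsigned char *src, size_t len)
{
	/* Nothing to copy: return before doing arithmetic on pointers that
	 * may be NULL, which UBSan reports. */
	if (len == 0)
		return 0;
	if (len > b->len - b->pos)
		return -1;
	memcpy(b->data + b->pos, src, len);
	b->pos += len;
	return 0;
}
```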

committed Jun 30, 2020

Make the tests pass cleanly with MemorySanitizer · 3a197ea7

This change:

* Initializes a few variables that were being read before being
  initialized.
* Includes https://github.com/madler/zlib/pull/393. As such,
  it only works reliably with `-DUSE_BUNDLED_ZLIB=ON`.

committed Jun 30, 2020

Make the tests run cleanly under UndefinedBehaviorSanitizer · d0656ac8

This change makes the tests run cleanly under
`-fsanitize=undefined,nullability`. It:

* Avoids some arithmetic with NULL pointers (which UBSan does not like).
* Avoids an overflow in a shift, due to a `uint8_t` being implicitly
  converted to a signed 32-bit integer when shifted by a
  32-bit signed integer.
* Avoids an unaligned read in libgit2.
* Ignores unaligned reads in the SHA1 library, since it only happens on
  Intel processors, where it is _still_ undefined behavior, but the
  semantics are moderately well-understood.
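
Two of these patterns have well-known portable fixes, sketched below. The helper names `read_be32` and `load_u32` are made up for illustration and are not the functions touched by this commit.

```c
#include <stdint.h>
#include <string.h>

/* `bytes[0] << 24` would first promote the uint8_t to a signed int, and
 * shifting a value >= 0x80 into the sign bit is undefined behavior.
 * Casting to uint32_t before shifting keeps the arithmetic unsigned. */
static uint32_t read_be32(const uint8_t *bytes)
{
	return ((uint32_t)bytes[0] << 24) |
	       ((uint32_t)bytes[1] << 16) |
	       ((uint32_t)bytes[2] << 8) |
	        (uint32_t)bytes[3];
}

/* Dereferencing a misaligned `uint32_t *` is undefined behavior; copying
 * the bytes into a local with memcpy works for any address and yields the
 * value in native byte order. */
static uint32_t load_u32(const void *p)
{
	uint32_t v;

	memcpy(&v, p, sizeof(v));
	return v;
}
```

Compilers typically turn the `memcpy` into a single load on platforms that tolerate unaligned access, so the portable form usually costs nothing.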

A notable omission is `-fsanitize=integer`, since there are lots of
warnings in zlib and the SHA1 library which probably don't make sense to
fix and which I could not figure out how to silence easily. libgit2 itself
also has hundreds of warnings which are mostly innocuous (e.g. use of enum
constants that only fit in a `uint32_t`, but there is no way to do that
in a simple fashion because the data type chosen for enumerated types is
implementation-defined), and investigating whether there are worrying
warnings would require reducing the noise significantly.

committed Jun 30, 2020
