1. 03 Aug, 2020 6 commits
  2. 13 Jul, 2020 1 commit
    • config_entries: Avoid excessive map operations · f2400a9c
      When appending config entries, we currently always first get the
      currently existing map entry and then afterwards update the map to
      contain the current config value. In the common scenario where keys
      aren't being overridden, this is the best we can do. But in case a key
      gets set multiple times, then we'll also perform these two map
      operations. In extreme cases, hashing the map keys will thus start to
      dominate performance.
      
      Let's optimize the pattern by using a separately allocated map entry.
      Currently, we always put the current list entry into the map and update
      it to get any overridden multivar. As these list entries are also used
      to iterate config entries, we cannot update them in-place in the map and
      are thus forced to always set the map to contain the new entry. But with
      a separately allocated map entry, we can now create one once per config
      key and insert it into the map. Whenever appending a new config value
      with the same key, we can now just update the map entry in-place instead
      of having to replace the map entry completely.
      
      This reduces calls to the hashing function by half and trades the
      improved runtime for one more allocation per unique config key. Given
      that the refactoring arguably improves code readability by splitting
      concerns of the `config_entry_list` type and not having to track it in
      two different structures, this alone would already be reason enough to
      take the trade.
      
      Given a pathological case of a gitconfig with 100,000 repeated keys and
      a section of length 10,000 characters, this reduces runtime by half from
      approximately 14 seconds to 7 seconds as expected.
      Patrick Steinhardt committed
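The in-place update described in the commit above can be illustrated with a minimal sketch. It assumes a toy chained hash map, and every name in it (`config_map`, `map_entry`, `config_append`, and so on) is hypothetical and not taken from libgit2. The `hash_calls` counter exists only to make the saved hashing visible, and cleanup is omitted for brevity.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define NBUCKETS 64

/* Hypothetical types for illustration; not libgit2's actual code. */
typedef struct list_node {
	const char *value;
	struct list_node *next;
} list_node;

/* Separately allocated map entry: one per unique config key. */
typedef struct map_entry {
	const char *key;
	list_node *first;
	list_node *last;
	struct map_entry *next_in_bucket;
} map_entry;

typedef struct {
	map_entry *buckets[NBUCKETS];
	unsigned long hash_calls; /* instrumentation only */
} config_map;

static unsigned long hash_key(config_map *map, const char *key)
{
	unsigned long h = 5381;
	map->hash_calls++;
	while (*key)
		h = h * 33 + (unsigned char)*key++;
	return h % NBUCKETS;
}

static map_entry *map_get(config_map *map, const char *key)
{
	map_entry *e = map->buckets[hash_key(map, key)];
	while (e && strcmp(e->key, key) != 0)
		e = e->next_in_bucket;
	return e;
}

static void map_insert(config_map *map, map_entry *entry)
{
	unsigned long b = hash_key(map, entry->key);
	entry->next_in_bucket = map->buckets[b];
	map->buckets[b] = entry;
}

/* Append a value for `key`. Returns 0 on success, -1 on allocation failure. */
static int config_append(config_map *map, const char *key, const char *value)
{
	map_entry *entry;
	list_node *node = calloc(1, sizeof(*node));

	if (node == NULL)
		return -1;
	node->value = value;

	if ((entry = map_get(map, key)) == NULL) {
		/* First occurrence of the key: one extra allocation and one insert. */
		if ((entry = calloc(1, sizeof(*entry))) == NULL) {
			free(node);
			return -1;
		}
		entry->key = key;
		entry->first = node;
		map_insert(map, entry);
	} else {
		/* Repeated key: update the entry in place, no second map operation. */
		assert(entry->last != NULL);
		entry->last->next = node;
	}
	entry->last = node;
	return 0;
}
```

In this sketch the first append of a key hashes twice (lookup plus insert), while every later append of the same key hashes only once.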
  3. 12 Jul, 2020 22 commits
    • Merge pull request #5396 from lhchavez/mwindow-file-limit · a83fd510
      mwindow: set limit on number of open files
      Edward Thomson committed
    • Minor nits and style formatting · 92d42eb3
      lhchavez committed
    • tests: verify renaming branch really updates worktree HEAD · ce4cb073
      When a branch is renamed, all HEADs of the main repository and of its
      worktrees that point to the old branch need to be updated to point to
      the new branch. We already do so and have a test for this, but the
      test only verifies that we're able to look up the updated HEAD, not
      what it contains.
      
      Let's make the test more specific by verifying the updated HEAD also has
      the correct updated symbolic target.
      Patrick Steinhardt committed
    • refs: remove function to read HEAD directly · 5434f9a3
      With the last user of `git_reference__read_head` gone, let's remove it
      as it's been reading references without consulting the refdb backends.
      Patrick Steinhardt committed
    • repository: retrieve worktree HEAD via refdb · 65895410
      The function `git_repository_head_for_worktree` currently uses
      `git_reference__read_head` to directly read a given worktree's HEAD from
      the filesystem. This is broken in case the repository uses a different
      refdb implementation than the filesystem-based one, so let's instead
      open the worktree as a real repository and use `git_reference_lookup`.
      This also fixes the case where the worktree's HEAD is not a symref, but
      a detached HEAD, which would have resulted in an error previously.
      Patrick Steinhardt committed
    • repository: remove function to iterate over HEADs · d1f210fc
      The function `git_repository_foreach_head` is broken, as it directly
      interacts with the on-disk representation of the reference database,
      thus assuming that no other refdb is used for the given repository. As
      this is an internal function only and all users have been replaced,
      let's remove this function.
      Patrick Steinhardt committed
    • branch: determine whether a branch is checked out via refdb · ac5fbe31
      We currently determine whether a branch is checked out via
      `git_repository_foreach_head`. As this function reads references
      directly from the disk, it breaks our refdb abstraction in case the
      repository uses a different reference backend implementation than the
      filesystem-based one. So let's use `git_repository_foreach_worktree`
      instead -- while it's less efficient, it is at least correct in all
      corner cases.
      Patrick Steinhardt committed
    • refs: update HEAD references via refdb · 7216b048
      When renaming a reference, we need to iterate over every HEAD and
      potentially update it in case it is a symbolic reference pointing to the
      previous name of the renamed reference. Most importantly, this doesn't
      only include HEADs from the repo we're renaming the reference in, but we
      also need to iterate over HEADs from linked worktrees.
      
      In order to update the HEADs, we directly read them from the worktree's
      gitdir and thus assume that both repository and worktrees use the
      filesystem-based reference backend. But this breaks as soon as one has a
      repository with a different refdb, and it breaks our own abstractions. So
      let's instead update HEAD references via the refdb by first opening each
      worktree as a repository and then using the usual functions to read and
      update HEADs. This is a lot less efficient than the current code, but
      it's not like we can really help this: going via the refdb is mandatory.
      Patrick Steinhardt committed
    • repository: introduce new function to iterate over all worktrees · 2fcb4f28
      Given a Git repository, it's non-trivial to iterate over all worktrees
      that are associated with it, including the "main" repository. This
      commit adds a new internal function `git_repository_foreach_worktree`
      that does this for us.
      Patrick Steinhardt committed
    • Merge pull request #5570 from libgit2/pks/refdb-refactorings · 26b9e489
      refdb: a set of preliminary refactorings for the reftable backend
      Edward Thomson committed
    • refdb: avoid unlimited spinning in case of symref cycles · 34987447
      To determine whether another reflog entry needs to be written for HEAD
      on a reference update, we need to see whether HEAD directly or
      indirectly points to the reference we're updating. The resolve logic is
      currently completely unbounded unless an error occurs, which effectively
      means that we'd be spinning forever in case we have a symref loop in the
      repository's refdb.
      
      Let's fix the issue by using `git_refdb_resolve` instead, which is
      always bounded.
      Patrick Steinhardt committed
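As a rough illustration of why a bounded resolver terminates on symref cycles, here is a minimal sketch over an in-memory reference table. It is not the implementation of `git_refdb_resolve`: the `ref` type, the helper names and the `MAX_NESTING` value are all assumptions made for this example.

```c
#include <stddef.h>
#include <string.h>

#define MAX_NESTING 5 /* assumed limit for this sketch */

/* Hypothetical in-memory reference: symbolic_target is NULL for direct refs. */
typedef struct {
	const char *name;
	const char *symbolic_target;
} ref;

static const ref *lookup(const ref *refs, size_t n, const char *name)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (strcmp(refs[i].name, name) == 0)
			return &refs[i];
	return NULL;
}

/*
 * Returns 1 if `start` directly or indirectly points to `target`,
 * 0 if it does not, and -1 if the chain exceeds MAX_NESTING hops,
 * which is what happens for a symref cycle.
 */
static int points_to(const ref *refs, size_t n, const char *start, const char *target)
{
	const ref *cur = lookup(refs, n, start);
	int depth;

	for (depth = 0; depth <= MAX_NESTING; depth++) {
		if (cur == NULL || cur->symbolic_target == NULL)
			return 0;
		if (strcmp(cur->symbolic_target, target) == 0)
			return 1;
		cur = lookup(refs, n, cur->symbolic_target);
	}
	return -1;
}
```

The loop counter is the whole point: an unbounded `while` over the same chain would never return for a table where two symbolic references point at each other.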
    • refs: replace reimplementation of reference resolver · b895547c
      The refs code currently has a second implementation that resolves
      references in order to find any final symbolic reference pointing to a
      nonexistent target branch. As we've just extended `git_refdb_resolve` to
      also return such references, let's use that one instead in order to
      reduce code duplication.
      Patrick Steinhardt committed
    • refdb: return resolved symbolic refs pointing to nonexistent refs · cf7dd05b
      In some cases, resolving references requires us to also know about the
      final symbolic reference that's pointing to a nonexistent branch, e.g.
      in an empty repository where the main branch is yet unborn but HEAD
      already points to it. Right now, the resolving logic is thus split up
      into two, where one is the new refdb implementation and the second one
      is an ad-hoc implementation inside "refs.c".
      
      Let's extend `git_refdb_resolve` to also return such final dangling
      references pointing to nonexistent branches so we can deduplicate the
      resolving logic.
      Patrick Steinhardt committed
    • refs: move resolving of references into the refdb · c54f40e4
      Resolving of symbolic references is currently implemented inside the
      "refs" layer. As a result, it's hard to call this function from
      low-level parts that only have a refdb available, but no repository, as
      the "refs" layer always operates on the repository-level. So let's move
      the function into the generic "refdb" implementation to lift this
      restriction.
      Patrick Steinhardt committed
    • Merge pull request #5547 from pks-t/pks/cmake-modernization-pt2 · ae30009e
      CMake modernization pt2
      Patrick Steinhardt committed
    • tests: reflog: remove unused signature · 9703d26f
      There are two tests that create a commit signature, but never make any
      use of it. Let's remove these to avoid any confusion.
      Patrick Steinhardt committed
    • refdb: extract function to check whether to append HEAD to the reflog · 1f39593b
      The logic to determine whether a reflog entry should be for the HEAD
      reference is non-trivial. Currently, the only user of this is the
      filesystem-based refdb, but with the advent of the reftable refdb we're
      going to add a second user that's interested in having the same
      behaviour.
      
      Let's pull out a new function that checks whether a given reference
      should cause an entry to be written to the HEAD reflog as a preparatory
      step.
      Patrick Steinhardt committed
    • refdb: extract function to check whether a reflog should be written · e02478b1
      The logic to determine whether a reflog should be written is
      non-trivial. Currently, the only user of this is the filesystem-based
      refdb, but with the advent of the reftable refdb we're going to add a
      second user that's interested in having the same behaviour.
      
      Let's pull out a new function that checks whether a given reference
      should cause a reflog to be written as a preparatory step.
      Patrick Steinhardt committed
    • cmake: remove CheckPrototypeDefinition module · 9bc6e655
      In the past, we've imported the CheckPrototypeDefinition into our own
      module directory as it wasn't yet available in all supported CMake
      versions. Now that we require at least CMake v3.5, we don't need to
      bundle it anymore as it's included with the distribution already.
      
      Let's drop the included modules and always use upstream's version.
      Patrick Steinhardt committed
    • cmake: use target-specific compile definitions · 4218403e
      We set up some compile definitions as part of our src/CMakeLists.txt.
      While the definitions are global, we really only need them as part of
      the git2internal target which compiles all the objects. Let's thus use
      `target_compile_definitions` instead of `add_definitions`.
      Patrick Steinhardt committed
    • cmake: use git2internal target to populate sources · 53911edd
      Modern CMake is usually target-driven in that a target is first defined
      and then the likes of `target_sources`, `target_include_directories`
      etc. are used to further populate the target. We still use old-style
      CMake, where we first set up a set of variables and then populate the
      target in a single call.
      
      Let's migrate to modern CMake usage by starting to populate the sources
      of our git2internal target piece-by-piece. While this is a small step,
      it allows us to convert to target-based build instructions
      piece-by-piece.
      Patrick Steinhardt committed
    • cmake: specify project version · 19eb1e4b
      We currently do not set up a project version within CMake, meaning that
      it can't be used by other projects that include libgit2 as a sub-project,
      nor by other tools like IDEs.
      
      This commit changes this to always set up a project version, but instead
      of extracting it from the "version.h" header we now set it up directly.
      This is mostly to avoid mis-use of the previous `LIBGIT2_VERSION`
      variables, as we should now always use the `libgit2_VERSION` ones that
      are set up by CMake if one provides the "VERSION" keyword to the
      `project()` call. While this is one more moving target we need to adjust
      on releases, this commit also adjusts our release script to verify that
      the project version was incremented as expected.
      Patrick Steinhardt committed
  4. 09 Jul, 2020 4 commits
  5. 02 Jul, 2020 1 commit
  6. 01 Jul, 2020 3 commits
  7. 30 Jun, 2020 3 commits
    • Make NTLMClient Memory and UndefinedBehavior Sanitizer-clean · 7c964416
      This change makes the code pass the libgit2 tests cleanly when
      MSan/UBSan are enabled. Notably:
      
      * Changes malloc/memset combos into calloc for easier auditing.
      * Makes `write_buf` return early if the buffer length is zero to avoid
        arithmetic with NULL pointers (which UBSan does not like).
      * Initializes a few arrays that were sometimes being read before being
        written to.
      lhchavez committed
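The first two bullet points of the commit above can be sketched as follows. This is not the NTLMClient code: `sketch_buf`, `sketch_buf_init` and `sketch_write_buf` are hypothetical names, and the buffer layout is an assumption chosen to show the two patterns (zero-filled allocation via `calloc`, and returning before any pointer arithmetic when there is nothing to write).

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical output buffer; data may be NULL when len == 0. */
typedef struct {
	unsigned char *data;
	size_t pos;
	size_t len;
} sketch_buf;

/* Returns 1 on success, 0 on allocation failure. */
static int sketch_buf_init(sketch_buf *buf, size_t len)
{
	buf->data = NULL;
	buf->pos = 0;
	buf->len = len;

	if (len == 0)
		return 1;

	/* calloc zero-fills, so no separate malloc + memset pair to audit. */
	buf->data = calloc(len, 1);
	if (buf->data == NULL) {
		buf->len = 0;
		return 0;
	}
	return 1;
}

/* Returns 1 on success, 0 if the bytes do not fit. */
static int sketch_write_buf(sketch_buf *buf, const unsigned char *src, size_t len)
{
	assert(buf->pos <= buf->len);

	/* Nothing to write: return before touching pointers that may be NULL. */
	if (len == 0)
		return 1;
	if (buf->len - buf->pos < len)
		return 0;

	memcpy(buf->data + buf->pos, src, len);
	buf->pos += len;
	return 1;
}
```

With the early return in place, a zero-length write on an empty buffer never evaluates `buf->data + buf->pos` or passes a NULL pointer to `memcpy`.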
    • Make the tests pass cleanly with MemorySanitizer · 3a197ea7
      This change:
      
      * Initializes a few variables that were being read before being
        initialized.
      * Includes https://github.com/madler/zlib/pull/393. As such,
        it only works reliably with `-DUSE_BUNDLED_ZLIB=ON`.
      lhchavez committed
    • Make the tests run cleanly under UndefinedBehaviorSanitizer · d0656ac8
      This change makes the tests run cleanly under
      `-fsanitize=undefined,nullability`. It:
      
      * Avoids some arithmetic with NULL pointers (which UBSan does not like).
      * Avoids an overflow in a shift, due to a `uint8_t` being implicitly
        promoted to a signed 32-bit integer when shifted by a 32-bit signed
        integer.
      * Avoids an unaligned read in libgit2.
      * Ignores unaligned reads in the SHA1 library, since it only happens on
        Intel processors, where it is _still_ undefined behavior, but the
        semantics are moderately well-understood.
      
      A notable omission is `-fsanitize=integer`, since there are lots of
      warnings in zlib and the SHA1 library which probably don't make sense to
      fix and which I could not figure out how to silence easily. libgit2
      itself also has hundreds of warnings which are mostly innocuous (e.g. use
      of enum constants that only fit in a `uint32_t`, but there is no way to
      do that in a simple fashion because the data type chosen for enumerated
      types is implementation-defined), and investigating whether there are
      worrying warnings would need reducing the noise significantly.
      lhchavez committed
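Two of the UBSan findings listed in the commit above, the shift overflow and the unaligned read, follow well-known patterns. The sketch below shows generic fixes for each. The commit message does not say which libgit2 functions were affected, so `read_be32` and `read_u32_unaligned` are hypothetical helpers written for this illustration.

```c
#include <stdint.h>
#include <string.h>

/*
 * Assemble a big-endian 32-bit value from four bytes.
 * Without the casts, p[0] is promoted to a signed int, and `p[0] << 24`
 * overflows that int whenever p[0] >= 0x80. Casting to uint32_t first
 * keeps the shift in unsigned arithmetic.
 */
static uint32_t read_be32(const unsigned char *p)
{
	return ((uint32_t)p[0] << 24) |
	       ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) |
	        (uint32_t)p[3];
}

/*
 * Read a 32-bit value in host byte order from an address that may not be
 * 4-byte aligned. memcpy avoids dereferencing a misaligned uint32_t pointer.
 */
static uint32_t read_u32_unaligned(const void *p)
{
	uint32_t v;

	memcpy(&v, p, sizeof(v));
	return v;
}
```

Compilers typically turn the `memcpy` into a single load on platforms that tolerate unaligned access, so this form usually costs nothing compared to the cast-and-dereference version it replaces.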