1. 27 Jul, 2021 2 commits
    • graph: Create `git_graph_reachable_from_any()` · ce5400cd
      This change introduces a new API function
      `git_graph_reachable_from_any()`, that answers the question whether a
      commit is reachable from any of the provided commits through following
      parent edges.
      
      This function can take advantage of optimizations provided by the
      existence of a `commit-graph` file, since it makes it faster to know
      whether, given two commits X and Y, X cannot possibly be an reachable
      from Y.
      
      Part of: #5757
      lhchavez committed
    • commit-graph: Introduce `git_commit_list_generation_cmp` · 6f544140
      This change makes calculations of merge-bases a bit faster when there
      are complex graphs and the commit times cause visiting nodes multiple
      times. This is done by visiting the nodes in the graph in reverse
      generation order when the generation number is available instead of
      commit timestamp. If the generation number is missing in any pair of
      commits, it can safely fall back to the old heuristic with no negative
      side-effects.
      
      Part of: #5757
      lhchavez committed
  2. 19 Jul, 2021 1 commit
  3. 03 Mar, 2021 1 commit
  4. 27 Nov, 2020 1 commit
  5. 09 Jun, 2020 2 commits
    • tree-wide: do not compile deprecated functions with hard deprecation · c6184f0c
      When compiling libgit2 with -DDEPRECATE_HARD, we add a preprocessor
      definition `GIT_DEPRECATE_HARD` which causes the "git2/deprecated.h"
      header to be empty. As a result, no function declarations are made
      available to callers, but the implementations are still available to
      link against. This has the problem that function declarations also
      aren't visible to the implementations, meaning that the symbol's
      visibility will not be set up correctly. As a result, the resulting
      library may not expose those deprecated symbols at all on some platforms
      and thus cause linking errors.
      
      Fix the issue by conditionally compiling deprecated functions, only.
      While it becomes impossible to link against such a library in case one
      uses deprecated functions, distributors of libgit2 aren't expected to
      pass -DDEPRECATE_HARD anyway. Instead, users of libgit2 should manually
      define GIT_DEPRECATE_HARD to hide deprecated functions. Using "real"
      hard deprecation still makes sense in the context of CI to test we don't
      use deprecated symbols ourselves and in case a dependant uses libgit2 in
      a vendored way and knows it won't ever use any of the deprecated symbols
      anyway.
      Patrick Steinhardt committed
    • tree-wide: mark local functions as static · a6c9e0b3
      We've accumulated quite some functions which are never used outside of
      their respective code unit, but which are lacking the `static` keyword.
      Add it to reduce their linkage scope and allow the compiler to optimize
      better.
      Patrick Steinhardt committed
  6. 01 Jun, 2020 1 commit
  7. 01 Apr, 2020 1 commit
    • merge: cache negative cache results for similarity metrics · 4dfcc50f
      When computing renames, we cache the hash signatures for each of the
      potentially conflicting entries so that we do not need to repeatedly
      read the file and can at least halfway efficiently determine whether two
      files are similar enough to be deemed a rename. In order to make the
      hash signatures meaningful, we require at least four lines of data to be
      present, resulting in at least four different hashes that can be
      compared. Files that are deemed too small are not cached at all and
      will thus be repeatedly re-hashed, which is usually not a huge issue.
      
      The issue with above heuristic is in case a file does _not_ have at
      least four lines, where a line is anything separated by a consecutive
      run of "\n" or "\0" characters. For example "a\nb" is two lines, but
      "a\0\0b" is also just two lines. Taken to the extreme, a file that has
      megabytes of consecutive space- or NUL-only may also be deemed as too
      small and thus not get cached. As a result, we will repeatedly load its
      blob, calculate its hash signature just to finally throw it away as we
      notice it's not of any value. When you've got a comparitively big file
      that you compare against a big set of potentially renamed files, then
      the cost simply expodes.
      
      The issue can be trivially fixed by introducing negative cache entries.
      Whenever we determine that a given blob does not have a meaningful
      representation via a hash signature, we store this negative cache marker
      and will from then on not hash it again, but also ignore it as a
      potential rename target. This should help the "normal" case already
      where you have a lot of small files as rename candidates, but in the
      above scenario it's savings are extraordinarily high.
      
      To verify we do not hit the issue anymore with described solution, this
      commit adds a test that uses the exact same setup described above with
      one 50 megabyte blob of '\0' characters and 1000 other files that get
      renamed. Without the negative cache:
      
      $ time ./libgit2_clar -smerge::trees::renames::cache_recomputation >/dev/null
      real    11m48.377s
      user    11m11.576s
      sys     0m35.187s
      
      And with the negative cache:
      
      $ time ./libgit2_clar -smerge::trees::renames::cache_recomputation >/dev/null
      real    0m1.972s
      user    0m1.851s
      sys     0m0.118s
      
      So this represents a ~350-fold performance improvement, but it obviously
      depends on how many files you have and how big the blob is. The test
      number were chosen in a way that one will immediately notice as soon as
      the bug resurfaces.
      Patrick Steinhardt committed
  8. 22 Nov, 2019 1 commit
  9. 10 Oct, 2019 1 commit
    • refs: fix locks getting forcibly removed · 3335a034
      The flag GIT_FILEBUF_FORCE currently does two things:
           1. It will cause the filebuf to create non-existing leading
              directories for the file that is about to be written.
           2. It will forcibly remove any pre-existing locks.
      While most call sites actually do want (1), they do not want to
      remove pre-existing locks, as that renders the locking mechanisms
      effectively useless.
      Introduce a new flag `GIT_FILEBUF_CREATE_LEADING_DIRS` to
      separate both behaviours cleanly from each other and convert
      callers to use it instead of `GIT_FILEBUF_FORCE` to have them
      honor locked files correctly.
      
      As this conversion removes all current users of `GIT_FILEBUF_FORCE`,
      this commit removes the flag altogether.
      Sebastian Henke committed
  10. 23 Aug, 2019 1 commit
  11. 24 Jun, 2019 1 commit
  12. 14 Jun, 2019 1 commit
    • Rename opt init functions to `options_init` · 0b5ba0d7
      In libgit2 nomenclature, when we need to verb a direct object, we name
      a function `git_directobject_verb`.  Thus, if we need to init an options
      structure named `git_foo_options`, then the name of the function that
      does that should be `git_foo_options_init`.
      
      The previous names of `git_foo_init_options` is close - it _sounds_ as
      if it's initializing the options of a `foo`, but in fact
      `git_foo_options` is its own noun that should be respected.
      
      Deprecate the old names; they'll now call directly to the new ones.
      Edward Thomson committed
  13. 10 Jun, 2019 1 commit
  14. 15 Feb, 2019 3 commits
    • oidmap: introduce high-level setter for key/value pairs · 2e0a3048
      Currently, one would use either `git_oidmap_insert` to insert key/value pairs
      into a map or `git_oidmap_put` to insert a key only. These function have
      historically been macros, which is why their syntax is kind of weird: instead of
      returning an error code directly, they instead have to be passed a pointer to
      where the return value shall be stored. This does not match libgit2's common
      idiom of directly returning error codes.Furthermore, `git_oidmap_put` is tightly
      coupled with implementation details of the map as it exposes the index of
      inserted entries.
      
      Introduce a new function `git_oidmap_set`, which takes as parameters the map,
      key and value and directly returns an error code. Convert all trivial callers of
      `git_oidmap_insert` and `git_oidmap_put` to make use of it.
      Patrick Steinhardt committed
    • oidmap: introduce high-level getter for values · 9694ef20
      The current way of looking up an entry from a map is tightly coupled with the
      map implementation, as one first has to look up the index of the key and then
      retrieve the associated value by using the index. As a caller, you usually do
      not care about any indices at all, though, so this is more complicated than
      really necessary. Furthermore, it invites for errors to happen if the correct
      error checking sequence is not being followed.
      
      Introduce a new high-level function `git_oidmap_get` that takes a map and a key
      and returns a pointer to the associated value if such a key exists. Otherwise,
      a `NULL` pointer is returned. Adjust all callers that can trivially be
      converted.
      Patrick Steinhardt committed
    • maps: use uniform lifecycle management functions · 351eeff3
      Currently, the lifecycle functions for maps (allocation, deallocation, resize)
      are not named in a uniform way and do not have a uniform function signature.
      Rename the functions to fix that, and stick to libgit2's naming scheme of saying
      `git_foo_new`. This results in the following new interface for allocation:
      
      - `int git_<t>map_new(git_<t>map **out)` to allocate a new map, returning an
        error code if we ran out of memory
      
      - `void git_<t>map_free(git_<t>map *map)` to free a map
      
      - `void git_<t>map_clear(git<t>map *map)` to remove all entries from a map
      
      This commit also fixes all existing callers.
      Patrick Steinhardt committed
  15. 22 Jan, 2019 1 commit
  16. 01 Dec, 2018 1 commit
  17. 28 Nov, 2018 1 commit
  18. 20 Oct, 2018 1 commit
  19. 19 Oct, 2018 2 commits
  20. 10 Jun, 2018 1 commit
  21. 04 Feb, 2018 2 commits
    • merge: virtual commit should be last argument to merge-base · 1403c612
      Our virtual commit must be the last argument to merge-base: since our
      algorithm pushes _both_ parents of the virtual commit, it needs to be
      the last argument, since merge-base:
      
      > Given three commits A, B and C, git merge-base A B C will compute the
      > merge base between A and a hypothetical commit M
      
      We want to calculate the merge base between the actual commit ("two")
      and the virtual commit ("one") - since one actually pushes its parents
      to the merge-base calculation, we need to calculate the merge base of
      "two" and the parents of one.
      Tyrie Vella committed
    • merge: reverse merge bases for recursive merge · b924df1e
      When the commits being merged have multiple merge bases, reverse the
      order when creating the virtual merge base.  This is for compatibility
      with git's merge-recursive algorithm, and ensures that we build
      identical trees.
      
      Git does this to try to use older merge bases first.  Per 8918b0c:
      
      > It seems to be the only sane way to do it: when a two-head merge is
      > done, and the merge-base and one of the two branches agree, the
      > merge assumes that the other branch has something new.
      >
      > If we start creating virtual commits from newer merge-bases, and go
      > back to older merge-bases, and then merge with newer commits again,
      > chances are that a patch is lost, _because_ the merge-base and the
      > head agree on it. Unlikely, yes, but it happened to me.
      Edward Thomson committed
  22. 21 Jan, 2018 1 commit
    • merge: recursive uses larger conflict markers · 185b0d08
      Git uses longer conflict markers in the recursive merge base - two more
      than the default (thus, 9 character long conflict markers).  This allows
      users to tell the difference between the recursive merge conflicts and
      conflicts between the ours and theirs branches.
      
      This was introduced in git d694a17986a28bbc19e2a6c32404ca24572e400f.
      
      Update our tests to expect this as well.
      Edward Thomson committed
  23. 11 Nov, 2017 2 commits
  24. 03 Jul, 2017 1 commit
    • Make sure to always include "common.h" first · 0c7f49dd
      Next to including several files, our "common.h" header also declares
      various macros which are then used throughout the project. As such, we
      have to make sure to always include this file first in all
      implementation files. Otherwise, we might encounter problems or even
      silent behavioural differences due to macros or defines not being
      defined as they should be. So in fact, our header and implementation
      files should make sure to always include "common.h" first.
      
      This commit does so by establishing a common include pattern. Header
      files inside of "src" will now always include "common.h" as its first
      other file, separated by a newline from all the other includes to make
      it stand out as special. There are two cases for the implementation
      files. If they do have a matching header file, they will always include
      this one first, leading to "common.h" being transitively included as
      first file. If they do not have a matching header file, they instead
      include "common.h" as first file themselves.
      
      This fixes the outlined problems and will become our standard practice
      for header and source files inside of the "src/" from now on.
      Patrick Steinhardt committed
  25. 21 Jun, 2017 1 commit
    • merge: fix potential free of uninitialized memory · 4dc87e72
      The function `merge_diff_mark_similarity_exact` may error our early and,
      when it does so, free the `ours_deletes_by_oid` and
      `theirs_deletes_by_oid` variables. While the first one can never be
      uninitialized due to the first call actually assigning to it, the second
      variable can be freed without being initialized.
      
      Fix the issue by initializing both variables to `NULL`.
      Patrick Steinhardt committed
  26. 17 May, 2017 1 commit
  27. 23 Mar, 2017 1 commit
  28. 13 Feb, 2017 1 commit
    • repository: rename `path_repository` and `path_gitlink` · 84f56cb0
      The `path_repository` variable is actually confusing to think
      about, as it is not always clear what the repository actually is.
      It may either be the path to the folder containing worktree and
      .git directory, the path to .git itself, a worktree or something
      entirely different. Actually, the intent of the variable is to
      hold the path to the gitdir, which is either the .git directory
      or the bare repository.
      
      Rename the variable to `gitdir` to avoid confusion. While at it,
      also rename `path_gitlink` to `gitlink` to improve consistency.
      Patrick Steinhardt committed
  29. 09 Feb, 2017 1 commit
  30. 01 Jan, 2017 1 commit
  31. 29 Dec, 2016 1 commit
  32. 14 Nov, 2016 1 commit
  33. 18 Oct, 2016 1 commit