1. 30 Jun, 2020 1 commit
  2. 18 Jan, 2020 1 commit
  3. 22 Nov, 2019 2 commits
  4. 20 Jul, 2019 1 commit
  5. 18 Jul, 2019 1 commit
  6. 15 Jun, 2019 1 commit
  7. 22 Jan, 2019 1 commit
  8. 01 Dec, 2018 1 commit
  9. 10 Jun, 2018 1 commit
  10. 03 Jan, 2018 1 commit
    • diff_generate: avoid excessive stats of .gitattribute files · d8896bda
      When generating a diff between two trees, for each file that is to be
      diffed we have to determine whether it shall be treated as text or as
      a binary file. While git has heuristics to determine which kind of diff
      to generate, users can also override that default behaviour by setting
      or unsetting the 'diff' attribute for specific files.
      
      Because of that, we have to query gitattributes in order to determine
      how to diff the current files. Instead of hitting the '.gitattributes'
      file every time we need to query an attribute, which can get expensive
      especially on networked file systems, we try to cache them instead. This
      works perfectly fine for every '.gitattributes' file that is found, but
      we hit cache invalidation problems when we determine that an attributes
      file does _not_ exist. We do create an entry in the cache for missing
      '.gitattributes' files, but as soon as we hit that file again we
      invalidate it and stat it again to see if it has now appeared.
      
      In the case of diffing large trees with each other, this behaviour is
      very suboptimal. For each pair of files that is to be diffed, we will
      repeatedly query every directory component leading towards their
      respective location for an attributes file. This leads to thousands or
      even hundreds of thousands of wasted syscalls.
      
      The attributes cache already has a mechanism to help in that scenario in
      the form of the `git_attr_session`. As long as the same attributes session
      is still active, we will not try to re-query the gitattributes files at
      all but simply retain our currently cached results. To fix our problem, we
      can create a session at the top-most level, which is the initialization
      of the `git_diff` structure, and use it in order to look up the correct
      diff driver. As the `git_diff` structure is used to generate patches for
      multiple files at once, this neatly solves our problem by retaining the
      session until patches for all files have been generated.
      
      The fix has been tested with linux.git by calling
      `git_diff_tree_to_tree` and `git_diff_to_buf` with v4.10^{tree} and
      v4.14^{tree}.
      
                      | time    | .gitattributes stats
          without fix | 33.201s | 844614
          with fix    | 30.327s | 4441
      
      While execution time only improved by roughly 10%, the stat(2) syscalls for
      .gitattributes files decreased by 99.5%. The benchmarks were quite
      simple with best-of-three timings on Linux ext4 systems. One can assume
      that for network based file systems the performance gain will be a lot
      larger due to a much higher latency.
      Patrick Steinhardt committed
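
The session idea described in the commit above can be modelled in a few lines. This is a minimal sketch, not libgit2's implementation: `attr_session`, `attr_cache_entry`, `fake_stat` and `attr_cache_lookup` are hypothetical names, and the stat call is simulated so that the number of filesystem checks can be counted.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; these are not libgit2 types. */
struct attr_session { int unused; };

struct attr_cache_entry {
    const char *path;                   /* directory we looked in for .gitattributes */
    int exists;                         /* result of the most recent check */
    const struct attr_session *seen_in; /* session that last verified this entry */
};

static unsigned long stat_calls; /* number of simulated stat() calls so far */

/* Simulated stat: pretend that no directory has a .gitattributes file. */
static int fake_stat(const char *path)
{
    (void)path;
    stat_calls++;
    return 0;
}

unsigned long attr_cache_stat_calls(void)
{
    return stat_calls;
}

/*
 * Look up a cache entry. Without a session the entry is re-checked on
 * every call; with a session it is checked once and then trusted for as
 * long as the same session is passed in.
 */
int attr_cache_lookup(struct attr_cache_entry *e,
                      const struct attr_session *session)
{
    assert(e != NULL);

    if (session != NULL && e->seen_in == session)
        return e->exists; /* same session: keep the cached result */

    e->exists = fake_stat(e->path); /* no session, or a new one: check again */
    e->seen_in = session;
    return e->exists;
}
```

Passing one session for the whole diff, as the commit does at the level of the `git_diff` structure, turns repeated checks of the same missing file into a single check per session.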
  11. 15 Dec, 2017 1 commit
    • diff_file: properly refcount blobs when initializing file contents · 2388a9e2
      When initializing a `git_diff_file_content` from a source whose data is
      derived from a blob, we simply assign the blob's pointer to the
      resulting struct without incrementing its refcount. Thus, the structure
      can only be used as long as the blob is kept alive by the caller.
      
      Fix the issue by using `git_blob_dup` instead of a direct assignment.
      This function will increment the refcount of the blob without allocating
      new memory, so it does exactly what we want. As
      `git_diff_file_content__unload` already frees the blob when
      `GIT_DIFF_FLAG__FREE_BLOB` is set, we don't need to add new code
      handling the free but only have to set that flag correctly.
      Patrick Steinhardt committed
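
The ownership rule in the commit above (take a reference on init, release it on unload when a flag is set) can be sketched as follows. All names here are hypothetical stand-ins: `struct blob` is not `git_blob`, and the `free_blob` bit only plays the role that `GIT_DIFF_FLAG__FREE_BLOB` has in the commit message.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical refcounted object; not libgit2's git_blob. */
struct blob {
    int refcount;
};

struct file_content {
    struct blob *blob;
    unsigned free_blob : 1; /* set when this struct owns a reference */
};

struct blob *blob_new(void)
{
    struct blob *b = malloc(sizeof(*b));
    if (b != NULL)
        b->refcount = 1;
    return b;
}

/* Take an additional reference without allocating new memory. */
struct blob *blob_dup(struct blob *b)
{
    assert(b != NULL);
    b->refcount++;
    return b;
}

/* Drop one reference and free on the last one; returns the remaining count. */
int blob_free(struct blob *b)
{
    int remaining;

    if (b == NULL)
        return 0;
    remaining = --b->refcount;
    if (remaining == 0)
        free(b);
    return remaining;
}

/* Take our own reference instead of borrowing the caller's pointer. */
void file_content_init(struct file_content *fc, struct blob *src)
{
    fc->blob = blob_dup(src);
    fc->free_blob = 1;
}

/* Release the reference only if we were flagged as owning one. */
void file_content_unload(struct file_content *fc)
{
    if (fc->free_blob)
        blob_free(fc->blob);
    fc->blob = NULL;
    fc->free_blob = 0;
}
```

With this arrangement the caller may release its own reference right after initialization, and the content structure stays valid until it is unloaded.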
  12. 03 Jul, 2017 1 commit
    • Make sure to always include "common.h" first · 0c7f49dd
      Next to including several files, our "common.h" header also declares
      various macros which are then used throughout the project. As such, we
      have to make sure to always include this file first in all
      implementation files. Otherwise, we might encounter problems or even
      silent behavioural differences due to macros or defines not being
      defined as they should be. So in fact, our header and implementation
      files should make sure to always include "common.h" first.
      
      This commit does so by establishing a common include pattern. Header
      files inside of "src" will now always include "common.h" as their first
      other file, separated by a newline from all the other includes to make
      it stand out as special. There are two cases for the implementation
      files. If they do have a matching header file, they will always include
      this one first, leading to "common.h" being transitively included as
      first file. If they do not have a matching header file, they instead
      include "common.h" as first file themselves.
      
      This fixes the outlined problems and will become our standard practice
      for header and source files inside of "src/" from now on.
      Patrick Steinhardt committed
  13. 29 Dec, 2016 1 commit
  14. 26 May, 2016 2 commits
  15. 03 Nov, 2015 1 commit
  16. 25 Jun, 2015 1 commit
  17. 22 Jun, 2015 2 commits
    • submodule: add an ignore option to status · c6f489c9
      This lets us specify in the status call which ignore rules we want to
      use (optionally falling back to whatever the submodule has in its
      configuration).
      
      This removes one of the reasons for having `_set_ignore()` set the value
      in-memory. We re-use the `IGNORE_RESET` value for this as it is no
      longer relevant but has a similar purpose to `IGNORE_FALLBACK`.
      
      Similarly, we remove `IGNORE_DEFAULT`, which has no use outside of
      initializers, and move that to fall back to the configuration as well.
      Carlos Martín Nieto committed
    • submodule: don't let status change an existing instance · 64bbd47a
      As submodules become more like values, we should not let a status
      check update their properties. Instead of taking a submodule, have
      status take a repo and submodule name.
      Carlos Martín Nieto committed
  18. 12 Jun, 2015 1 commit
    • diff: introduce binary diff callbacks · 8147b1af
      Introduce a new binary diff callback to provide the actual binary
      delta contents to callers.  Create this data from the diff contents
      (instead of directly from the ODB) to support binary diffs including
      the workdir, not just things coming out of the ODB.
      Edward Thomson committed
  19. 19 Feb, 2015 1 commit
  20. 20 May, 2014 1 commit
  21. 06 May, 2014 1 commit
    • Add filter options and ALLOW_UNSAFE · 5269008c
      Diff and status do not want core.safecrlf to actually raise an
      error regardless of the setting, so this extends the filter API
      with an additional options flags parameter and adds a flag so that
      filters can be applied with GIT_FILTER_OPT_ALLOW_UNSAFE, indicating
      that unsafe filter application should be downgraded from a failure
      to a warning.
      Russell Belfer committed
  22. 25 Mar, 2014 2 commits
    • Fix submodule leaks and invalid references · 591e8295
      This cleans up some places I missed that could hold onto submodule
      references and cleans up the way in which the repository cache is
      both reloaded and released so that existing submodule references
      aren't destroyed inappropriately.
      Russell Belfer committed
    • Make submodules externally refcounted · a15c7802
      `git_submodule` objects were already refcounted internally in case
      the submodule name was different from the path at which it was
      stored.  This makes that refcounting externally used as well, so
      `git_submodule_lookup` and `git_submodule_add_setup` return an
      object that requires a `git_submodule_free` when done.
      Russell Belfer committed
  23. 27 Feb, 2014 1 commit
    • Add buffer to buffer diff and patch APIs · 6789b7a7
      This adds `git_diff_buffers` and `git_patch_from_buffers`.  This
      also includes a bunch of internal refactoring to increase the
      shared code between these functions and the blob-to-blob and
      blob-to-buffer APIs, as well as some higher level assert helpers
      in the tests to also remove redundancy.
      Russell Belfer committed
  24. 25 Jan, 2014 1 commit
  25. 15 Oct, 2013 1 commit
    • Diff API cleanup · 10672e3e
      This lays groundwork for separating formatting options from diff
      creation options.  This groups the formatting flags separately
      from the diff list creation flags and reorders the options.  This
      also tweaks some APIs to further separate code that uses patches
      from code that just looks at git_diffs.
      Russell Belfer committed
  26. 11 Oct, 2013 1 commit
  27. 17 Sep, 2013 4 commits
    • Merge git_buf and git_buffer · a9f51e43
      This makes the git_buf struct that was used internally into an
      externally available structure and eliminates the git_buffer.
      
      As part of that, some of the special cases that arose with the
      externally used git_buffer were blended into the git_buf, such as
      being careful about git_buf objects that may have a NULL ptr and
      allowing for bufs with a valid ptr and size but zero asize as a
      way of referring to externally owned data.
      Russell Belfer committed
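
The "valid ptr and size but zero asize" convention mentioned in the commit above can be illustrated with a small model. This is a sketch under assumptions: `struct buf` and the `buf_*` helpers are hypothetical and only mirror the three fields named in the commit message, not the real `git_buf` API.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical model: asize == 0 means "ptr is not owned by this buffer". */
struct buf {
    char *ptr;    /* may be NULL, borrowed, or heap memory we own */
    size_t asize; /* allocated size; 0 for NULL or borrowed data */
    size_t size;  /* number of content bytes */
};

/* Point at externally owned data without copying it (const is cast away). */
void buf_borrow(struct buf *b, const char *data, size_t len)
{
    b->ptr = (char *)data;
    b->size = len;
    b->asize = 0;
}

int buf_is_owned(const struct buf *b)
{
    return b->asize > 0;
}

/* Make sure we own at least `want` bytes, copying borrowed data first. */
int buf_grow(struct buf *b, size_t want)
{
    char *mem;

    if (want < b->size + 1)
        want = b->size + 1; /* room for the content plus a NUL terminator */
    if (b->asize >= want)
        return 0;

    if (b->asize > 0) {
        mem = realloc(b->ptr, want); /* we own ptr, so realloc is safe */
        if (mem == NULL)
            return -1;
    } else {
        mem = malloc(want); /* never realloc memory we do not own */
        if (mem == NULL)
            return -1;
        if (b->ptr != NULL && b->size > 0)
            memcpy(mem, b->ptr, b->size);
    }

    mem[b->size] = '\0';
    b->ptr = mem;
    b->asize = want;
    return 0;
}

/* Free only memory we own, then reset to the empty state. */
void buf_dispose(struct buf *b)
{
    if (b->asize > 0)
        free(b->ptr);
    b->ptr = NULL;
    b->asize = 0;
    b->size = 0;
}
```

The design choice is that a single struct can describe an empty buffer, a borrowed view, and an owned allocation, with `asize` alone deciding whether freeing or reallocating is allowed.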
    • Add ident filter · 4b11f25a
      This adds the ident filter (that knows how to replace $Id$) and
      tweaks the filter APIs and code so that git_filter_source objects
      actually have the updated OID of the object being filtered when
      it is a known value.
      Russell Belfer committed
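
As a rough illustration of what replacing `$Id$` means, the sketch below expands the first `$Id$` keyword into git's `$Id: <object id> $` form. It is a simplified model with a hypothetical `ident_expand` helper, not the filter added by the commit; it handles only the first occurrence and takes the object id as a hex string.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Copy `in` to `out`, expanding the first "$Id$" into "$Id: <oid_hex> $".
 * Returns 0 on success, -1 if `out` is too small for the result.
 */
int ident_expand(char *out, size_t outlen, const char *in, const char *oid_hex)
{
    const char *hit = strstr(in, "$Id$");
    int n;

    if (hit == NULL) {
        /* No keyword present: pass the content through unchanged. */
        n = snprintf(out, outlen, "%s", in);
    } else {
        /* Text before the keyword, the expanded keyword, then the rest. */
        n = snprintf(out, outlen, "%.*s$Id: %s $%s",
                     (int)(hit - in), in, oid_hex, hit + 4);
    }

    return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}
```

The commit's point about `git_filter_source` carrying the updated OID matters here: the expansion can only be written if the filter knows the id of the object it is filtering.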
    • Extend public filter api with filter lists · 2a7d224f
      This moves the git_filter_list into the public API so that users
      can create, apply, and dispose of filter lists.  This allows more
      granular application of filters to user data outside of libgit2
      internals.
      
      This also converts all the internal usage of filters to the public
      APIs along with a few small tweaks to make it easier to use the
      public git_buffer stuff alongside the internal git_buf.
      Russell Belfer committed
    • Create public filter object and use it · 85d54812
      This creates include/sys/filter.h with a basic definition of a
      git_filter and then converts the internal code to use it.  There
      are related internal objects (git_filter_list) that we will want
      to publish at some point, but this is a first step.
      Russell Belfer committed
  28. 25 Jul, 2013 1 commit
    • Make rename detection file size fix better · effdbeb3
      The previous fix for checking file sizes with rename detection
      always loads the blob.  In this version, if the odb backend can
      get the object header without loading the whole thing into memory,
      then we'll just use that, so that we can eliminate possible rename
      sources & targets without loading them.
      Russell Belfer committed
  29. 24 Jul, 2013 1 commit
  30. 18 Jun, 2013 1 commit
    • Add "as_path" parameters to blob and buffer diffs · 74ded024
      This adds parameters to the four functions that allow for blob-to-
      blob and blob-to-buffer differencing (either via callbacks or by
      making a git_diff_patch object).  These parameters let you say
      what filename we should pretend the blob has while doing the diff.
      If you pass NULL, there should be no change from the existing
      behavior, which is to skip using attributes for file type checks
      and just look at content.  With the parameters, you can plug into
      the new diff driver functionality and get binary or non-binary
      behavior, plus function context regular expressions, etc.
      
      This commit also fixes things so that the git_diff_delta that is
      generated by these functions will actually be populated with the
      data that we know about the blobs (or buffers) so you can use it
      appropriately.  It also fixes a bug in generating patches from
      the git_diff_patch objects created via these functions.
      
      Lastly, there is one other behavior change that may matter.  If
      there is no difference between the two blobs, these functions no
      longer generate any diff callbacks / patches unless you have
      passed in GIT_DIFF_INCLUDE_UNMODIFIED.  This is pretty natural,
      but could potentially change the behavior of existing usage.
      Russell Belfer committed
  31. 12 Jun, 2013 1 commit
    • Fix diff header naming issues · 360f42f4
      This makes the git_diff_patch definition private to diff_patch.c
      and fixes a number of other header file naming inconsistencies to
      use `git_` prefixes on functions and structures that are shared
      between files.
      Russell Belfer committed
  32. 11 Jun, 2013 1 commit
    • Implement regex pattern diff driver · 5dc98298
      This implements the loading of regular expression pattern lists
      for diff drivers that search for function context in that way.
      This also changes the way that diff drivers update options and
      interface with xdiff APIs to make them a little more flexible.
      Russell Belfer committed
  33. 10 Jun, 2013 1 commit
    • Reorganize diff and add basic diff driver · 114f5a6c
      This is a significant reorganization of the diff code to break it
      into a set of more clearly distinct files and to document the new
      organization.  Hopefully this will make the diff code easier to
      understand and to extend.
      
      This adds a new `git_diff_driver` object that looks up diff driver
      information from the attributes and the config so that things like
      function context in diff headers can be provided.  The full driver
      spec is not implemented in the commit - this is focused on the
      reorganization of the code and putting the driver hooks in place.
      
      This also removes a few #includes from src/repository.h that were
      overbroad, but as a result required extra #includes in a variety
      of places since including src/repository.h no longer results in
      pulling in the whole world.
      Russell Belfer committed