1. 03 Jan, 2018 1 commit
    • diff_generate: avoid excessive stats of .gitattribute files · d8896bda
      When generating a diff between two trees, for each file that is to be
      diffed we have to determine whether it shall be treated as text or as
      binary. While git has heuristics to determine which kind of diff
      to generate, users can also override that default behaviour by setting
      or unsetting the 'diff' attribute for specific files.
      
      Because of that, we have to query gitattributes in order to determine
      how to diff the current files. Instead of hitting the '.gitattributes'
      file every time we need to query an attribute, which can get expensive
      especially on networked file systems, we try to cache them instead. This
      works perfectly fine for every '.gitattributes' file that is found, but
      we hit cache invalidation problems when we determine that an attributes
      file does _not_ exist. We do create an entry in the cache for missing
      '.gitattributes' files, but as soon as we hit that file again we
      invalidate it and stat it again to see if it has now appeared.
      
      In the case of diffing large trees with each other, this behaviour is
      very suboptimal. For each pair of files that is to be diffed, we will
      repeatedly query every directory component leading towards their
      respective location for an attributes file. This leads to thousands or
      even hundreds of thousands of wasted syscalls.
      
      The attributes cache already has a mechanism to help in that scenario in
      form of the `git_attr_session`. As long as the same attributes session
      is still active, we will not try to re-query the gitattributes files at all
      but simply retain our currently cached results. To fix our problem, we
      can create a session at the top-most level, which is the initialization
      of the `git_diff` structure, and use it in order to look up the correct
      diff driver. As the `git_diff` structure is used to generate patches for
      multiple files at once, this neatly solves our problem by retaining the
      session until patches for all files have been generated.
      
      The fix has been tested with linux.git by calling
      `git_diff_tree_to_tree` and `git_diff_to_buf` with v4.10^{tree} and
      v4.14^{tree}.
      
                      | time    | .gitattributes stats
          without fix | 33.201s | 844614
          with fix    | 30.327s | 4441
      
      While execution time only improved by roughly 10%, the stat(2) syscalls for
      .gitattributes files decreased by 99.5%. The benchmarks were quite
      simple with best-of-three timings on Linux ext4 systems. One can assume
      that for network based file systems the performance gain will be a lot
      larger due to a much higher latency.
      Patrick Steinhardt committed
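The effect of keeping one attributes session alive across a whole diff can be illustrated with a toy model. This is a minimal sketch, not libgit2's implementation: `attr_session`, `cache_entry`, `attr_cache`, `fake_stat`, `lookup_dir` and `count_stats` are all hypothetical names, and the stat call is simulated by a counter. It assumes that a "missing" cache entry is trusted only when it was verified by the currently active session.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified model; these are not libgit2's real structures. */
#define MAX_DIRS 16

struct attr_session { int key; };   /* non-zero key identifies a session */

struct cache_entry {
	const char *dir;     /* directory that was probed */
	bool nonexistent;    /* no .gitattributes was found there */
	int session_key;     /* session that last verified this entry */
};

struct attr_cache {
	struct cache_entry entries[MAX_DIRS];
	size_t count;
	unsigned long stat_calls;   /* stands in for real stat() syscalls */
};

/* Pretend to stat "<dir>/.gitattributes"; in this model it never exists. */
static bool fake_stat(struct attr_cache *cache)
{
	cache->stat_calls++;
	return false;
}

/* Query one directory, re-checking a missing file unless the active
 * session has already verified that it is missing. */
static void lookup_dir(struct attr_cache *cache,
	const struct attr_session *session, const char *dir)
{
	struct cache_entry *e = NULL;
	size_t i;

	for (i = 0; i < cache->count; i++)
		if (strcmp(cache->entries[i].dir, dir) == 0)
			e = &cache->entries[i];

	if (e && !e->nonexistent)
		return;   /* files that exist are cached normally */
	if (e && session && e->session_key == session->key)
		return;   /* verified missing in this session: trust the cache */

	if (!e) {
		if (cache->count == MAX_DIRS)
			return;
		e = &cache->entries[cache->count++];
		e->dir = dir;
	}
	e->nonexistent = !fake_stat(cache);
	e->session_key = session ? session->key : 0;
}

/* "Diff" nfiles files that all live below the same two directories and
 * report how many simulated stat calls were needed. */
static unsigned long count_stats(const struct attr_session *session, int nfiles)
{
	struct attr_cache cache;
	int i;

	memset(&cache, 0, sizeof(cache));
	for (i = 0; i < nfiles; i++) {
		lookup_dir(&cache, session, "src");
		lookup_dir(&cache, session, "src/util");
	}
	return cache.stat_calls;
}
```

Calling `count_stats(NULL, 1000)` probes both directories for every file, while passing a session with a non-zero key probes each directory only once, which mirrors the drop in `.gitattributes` stats reported in the table above.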
  2. 03 Jul, 2017 1 commit
    • Make sure to always include "common.h" first · 0c7f49dd
      Next to including several files, our "common.h" header also declares
      various macros which are then used throughout the project. As such, we
      have to make sure to always include this file first in all
      implementation files. Otherwise, we might encounter problems or even
      silent behavioural differences due to macros or defines not being
      defined as they should be. So in fact, our header and implementation
      files should make sure to always include "common.h" first.
      
      This commit does so by establishing a common include pattern. Header
      files inside of "src" will now always include "common.h" as their first
      include, separated by a newline from all the other includes to make
      it stand out as special. There are two cases for the implementation
      files. If they do have a matching header file, they will always include
      this one first, leading to "common.h" being transitively included as
      first file. If they do not have a matching header file, they instead
      include "common.h" as first file themselves.
      
      This fixes the outlined problems and will become our standard practice
      for header and source files inside of "src/" from now on.
      Patrick Steinhardt committed
  3. 17 Feb, 2017 2 commits
  4. 29 Dec, 2016 1 commit
  5. 06 Oct, 2016 1 commit
  6. 21 Jun, 2016 1 commit
  7. 26 May, 2016 1 commit
  8. 15 Aug, 2015 1 commit
  9. 10 Apr, 2015 1 commit
    • Fix checking of return value for regcomp. · 129022ee
      The regcomp function returns a non-zero value if compilation of
      a regular expression fails. In most places we only check for
      negative values, but positive values indicate an error, as well.
      Fix this tree-wide, fixing a segmentation fault when calling
      git_config_iterator_glob_new with an invalid regexp.
      Patrick Steinhardt committed
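The pitfall described above can be shown with plain POSIX `regcomp`. This is a minimal sketch; `compile_pattern` is a hypothetical helper and not a libgit2 function. The point is the comparison against zero rather than a `< 0` check.

```c
#include <regex.h>
#include <stdio.h>

/* Hypothetical helper: returns 0 on success and -1 if the pattern does
 * not compile. On failure *re must not be used or passed to regfree(). */
static int compile_pattern(regex_t *re, const char *pattern)
{
	int rc = regcomp(re, pattern, REG_EXTENDED);

	/* regcomp() reports errors as non-zero codes, so `rc < 0` would
	 * miss them; any non-zero value has to be treated as failure. */
	if (rc != 0) {
		char msg[128];
		regerror(rc, re, msg, sizeof(msg));
		fprintf(stderr, "invalid regexp '%s': %s\n", pattern, msg);
		return -1;
	}
	return 0;
}
```

A caller that only tested for negative return values would go on to use the uninitialized `regex_t` for a pattern such as `[a-z`, which is the kind of misuse that can end in a crash.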
  10. 03 Mar, 2015 1 commit
    • config: borrow refcounted references · 9a97f49e
      This changes the get_entry() method to return a refcounted version of
      the config entry, which you have to free when you're done.
      
      This allows us to avoid freeing the memory in which the entry is stored
      on a refresh, which may happen at any time for a live config.
      
      For this reason, get_string() has been forbidden on live configs and a
      new function get_string_buf() has been added, which stores the string in
      a git_buf which the user then owns.
      
      The functions which parse the string value take advantage of the
      borrowing to parse safely and then release the entry.
      Carlos Martín Nieto committed
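The borrowing scheme can be sketched with a toy refcounted entry. This is an illustration under stated assumptions, not libgit2's code: `entry`, `store`, `store_get_entry`, `store_refresh` and `store_get_long` are hypothetical names, the refresh is triggered explicitly to simulate a live config changing mid-parse, and the counter is not thread-safe.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified types; libgit2's real entries differ. */
struct entry {
	int refcount;
	char *value;
};

struct store {
	struct entry *current;   /* the store holds one reference */
};

static struct entry *entry_new(const char *value)
{
	struct entry *e = malloc(sizeof(*e));

	if (!e)
		return NULL;
	e->value = malloc(strlen(value) + 1);
	if (!e->value) {
		free(e);
		return NULL;
	}
	strcpy(e->value, value);
	e->refcount = 1;
	return e;
}

/* Drop one reference; the memory is freed with the last one. */
static void entry_release(struct entry *e)
{
	if (e && --e->refcount == 0) {
		free(e->value);
		free(e);
	}
}

/* Borrow the current entry; the caller must entry_release() it. */
static struct entry *store_get_entry(struct store *s)
{
	if (s->current)
		s->current->refcount++;
	return s->current;
}

/* A refresh swaps in a new entry and drops only the store's reference,
 * so entries that are still borrowed stay valid. */
static int store_refresh(struct store *s, const char *new_value)
{
	struct entry *fresh = entry_new(new_value);

	if (!fresh)
		return -1;
	entry_release(s->current);
	s->current = fresh;
	return 0;
}

/* Parse the value as an integer while a refresh happens in between. */
static long store_get_long(struct store *s, const char *refreshed_value)
{
	struct entry *borrowed = store_get_entry(s);
	long result;

	if (!borrowed)
		return 0;
	store_refresh(s, refreshed_value);            /* simulated refresh */
	result = strtol(borrowed->value, NULL, 10);   /* still safe to read */
	entry_release(borrowed);
	return result;
}
```

Without the extra reference taken in `store_get_entry`, the refresh would free the string that the parser is still reading.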
  11. 19 Feb, 2015 1 commit
  12. 15 Feb, 2015 1 commit
  13. 13 Feb, 2015 2 commits
  14. 16 May, 2014 1 commit
  15. 13 May, 2014 1 commit
  16. 12 May, 2014 1 commit
  17. 07 May, 2014 1 commit
  18. 18 Apr, 2014 1 commit
  19. 17 Apr, 2014 1 commit
  20. 27 Jan, 2014 1 commit
    • Update Javascript userdiff driver and tests · 082e82db
      Writing a sample Javascript driver pointed out some extra
      whitespace handling that needed to be done in the diff driver.
      This adds some tests with some sample javascript code that I
      pulled off of GitHub just to see what would happen.  Also, to
      clean up the userdiff test data, I did a "git gc" and packed
      up the test objects.
      Russell Belfer committed
  21. 24 Jan, 2014 4 commits
    • Got some permission to use userdiff patterns · c7c260a5
      I contacted a number of Git authors and lined up their permission
      to relicense their work for use in libgit2 and copied over their
      code for diff driver xfuncname patterns.  At this point, the code
      I've copied is taken verbatim from core Git although Thomas Rast
      warned me that the C++ patterns, at least, really need an update.
      I've left off patterns where I don't feel like I have permission
      at this point until I hear from more authors.
      Russell Belfer committed
    • Import git drivers and test HTML driver · 2c65602e
      Reorganize the builtin driver table slightly so that core Git
      builtin definitions can be imported verbatim.  Then take a few of
      the core Git drivers and pull them in.
      
      This also creates a test of diffs with the builtin HTML driver
      which led to some small error handling fixes in the driver
      selection logic.
      Russell Belfer committed
    • Initial take on builtin drivers with multiline · a5a38643
      This extends the diff driver parser to support multiline driver
      definitions along with ! prefixing for negated matches.  This
      brings the driver function pattern parsing in line with core Git.
      
      This also adds an internal table of driver definitions and a
      fallback code path that will look in that table for diff drivers
      that are set with attributes without having a definition in the
      config file.  Right now, I just populated the table with a kind
      of simple HTML definition that is similar to the core Git def.
      Russell Belfer committed
  22. 11 Dec, 2013 1 commit
    • Add config read fns with controlled error behavior · 9f77b3f6
      This adds `git_config__lookup_entry` which will look up a key in
      a config and return either the entry or NULL if the key was not
      present.  Optionally, it can either suppress all errors or can
      return them (although not finding the key is not an error for this
      function).  Unlike other accessors, this does not normalize the
      config key string, so it must only be used when the key is known
      to be in normalized form (i.e. all lower-case before the first dot
      and after the last dot, with no invalid characters).
      
      This also adds three high-level helper functions to look up config
      values with no errors and a fallback value.  The three functions
      are for string, bool, and int values, and will resort to the
      fallback value for any error that arises.  They are:
      
      * `git_config__get_string_force`
      * `git_config__get_bool_force`
      * `git_config__get_int_force`
      
      None of them normalize the config `key` either, so they can only
      be used for internal cases where the key is known to be in normal
      format.
      Russell Belfer committed
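The "force" behaviour, returning a fallback for any error, can be sketched with a toy key/value table. This is not the libgit2 implementation: `kv`, `lookup_entry` and `get_int_force` are hypothetical stand-ins, the lookup is an exact string comparison to mirror the lack of key normalization, and git's `k`/`m`/`g` integer suffixes are deliberately not handled.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a config: a flat table of normalized keys. */
struct kv {
	const char *key;
	const char *value;
};

/* Return the value for an already-normalized key, or NULL if the key is
 * absent. A missing key is not treated as an error here. */
static const char *lookup_entry(const struct kv *cfg, size_t n, const char *key)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (strcmp(cfg[i].key, key) == 0)   /* exact match, no normalization */
			return cfg[i].value;
	return NULL;
}

/* Return the integer stored under `key`, or `fallback` when the key is
 * missing or its value does not parse as an int. */
static int get_int_force(const struct kv *cfg, size_t n, const char *key, int fallback)
{
	const char *value = lookup_entry(cfg, n, key);
	char *end;
	long parsed;

	if (!value || !*value)
		return fallback;

	errno = 0;
	parsed = strtol(value, &end, 10);
	if (errno != 0 || *end != '\0' || parsed < INT_MIN || parsed > INT_MAX)
		return fallback;

	return (int)parsed;
}
```

Because the key is compared verbatim, looking up `Core.Abbrev` in a table that stores `core.abbrev` silently yields the fallback, which is why such helpers are only suitable for internal callers that pass normalized keys.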
  23. 08 Aug, 2013 1 commit
  24. 11 Jul, 2013 1 commit
  25. 05 Jul, 2013 1 commit
    • Diff hunk context off by one on long lines · a5f9b5f8
      The diff hunk context string that is returned to xdiff need not
      be NUL terminated because the xdiff code just copies the number of
      bytes that you report directly into the output.  There was an
      off-by-one error in the diff driver code: when the header context was
      longer than the output buffer size, the reported output length included
      the NUL byte, which was then copied into the hunk header.
      
      Fixes #1710
      Russell Belfer committed
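The contract described above, reporting exactly the number of bytes written and never counting a terminator, can be sketched like this. It is a minimal illustration and not the actual fix; `copy_hunk_context` is a hypothetical function.

```c
#include <string.h>

/* Copy at most out_size bytes of the context line into `out` without a
 * NUL terminator and return the number of bytes actually written. The
 * consumer copies exactly that many bytes into the hunk header. */
static long copy_hunk_context(char *out, long out_size,
	const char *ctx, size_t ctx_len)
{
	long n = (long)ctx_len;

	if (out_size <= 0)
		return 0;
	if (n > out_size)
		n = out_size;   /* truncate instead of reserving a NUL byte */

	memcpy(out, ctx, (size_t)n);
	return n;
}
```

A variant that NUL-terminated the truncated copy and still reported the full buffer size would hand a stray `\0` to the caller as the last byte of the header.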
  26. 12 Jun, 2013 4 commits
  27. 11 Jun, 2013 1 commit
    • Implement regex pattern diff driver · 5dc98298
      This implements the loading of regular expression pattern lists
      for diff drivers that search for function context in that way.
      This also changes the way that diff drivers update options and
      interface with xdiff APIs to make them a little more flexible.
      Russell Belfer committed
  28. 10 Jun, 2013 2 commits
    • Reorganize diff and add basic diff driver · 114f5a6c
      This is a significant reorganization of the diff code to break it
      into a set of more clearly distinct files and to document the new
      organization.  Hopefully this will make the diff code easier to
      understand and to extend.
      
      This adds a new `git_diff_driver` object that looks up diff driver
      information from the attributes and the config so that things like
      function context in diff headers can be provided.  The full driver
      spec is not implemented in this commit - this is focused on the
      reorganization of the code and putting the driver hooks in place.
      
      This also removes a few #includes from src/repository.h that were
      overbroad, but as a result required extra #includes in a variety
      of places since including src/repository.h no longer results in
      pulling in the whole world.
      Russell Belfer committed