1. 03 Jun, 2020 6 commits
    • config: ignore unreadable configuration files · c0bf387f
      Modified `config_file_open()` so it returns 0 if the config file is
      not readable, which happens on global config files under macOS
      sandboxing (note that for some reason `access(F_OK)` DOES work with
      sandboxing, but it is lying). Without this read check sandboxed
      applications on macOS can not open any repository, because
      `config_file_read()` will return GIT_ERROR when it cannot read the
      global /Users/username/.gitconfig file, and the upper layers will
      just completely abort on GIT_ERROR when attempting to load the
      global config file, so no repositories can be opened.
      Wil Shipley committed
    • Fix uninitialized stack memory and NULL ptr dereference in stash_to_index · 234a2e6e
      Caught by static analysis.
      Philip Kelley committed
    • Fix typo causing removal of symbol 'git_worktree_prune_init_options' · aee48d27
      Commit 0b5ba0d7 replaced this function with an "option_init"
      equivallent, but misspelled the replacement function. As a result, this
      symbol has been missing from libgit2.so ever since.
      Seth Junot committed
    • merge: cache negative cache results for similarity metrics · 1740bbb3
      When computing renames, we cache the hash signatures for each of the
      potentially conflicting entries so that we do not need to repeatedly
      read the file and can at least halfway efficiently determine whether two
      files are similar enough to be deemed a rename. In order to make the
      hash signatures meaningful, we require at least four lines of data to be
      present, resulting in at least four different hashes that can be
      compared. Files that are deemed too small are not cached at all and
      will thus be repeatedly re-hashed, which is usually not a huge issue.
      
      The issue with above heuristic is in case a file does _not_ have at
      least four lines, where a line is anything separated by a consecutive
      run of "\n" or "\0" characters. For example "a\nb" is two lines, but
      "a\0\0b" is also just two lines. Taken to the extreme, a file that has
      megabytes of consecutive space- or NUL-only may also be deemed as too
      small and thus not get cached. As a result, we will repeatedly load its
      blob, calculate its hash signature just to finally throw it away as we
      notice it's not of any value. When you've got a comparitively big file
      that you compare against a big set of potentially renamed files, then
      the cost simply expodes.
      
      The issue can be trivially fixed by introducing negative cache entries.
      Whenever we determine that a given blob does not have a meaningful
      representation via a hash signature, we store this negative cache marker
      and will from then on not hash it again, but also ignore it as a
      potential rename target. This should help the "normal" case already
      where you have a lot of small files as rename candidates, but in the
      above scenario it's savings are extraordinarily high.
      
      To verify we do not hit the issue anymore with described solution, this
      commit adds a test that uses the exact same setup described above with
      one 50 megabyte blob of '\0' characters and 1000 other files that get
      renamed. Without the negative cache:
      
      $ time ./libgit2_clar -smerge::trees::renames::cache_recomputation >/dev/null
      real    11m48.377s
      user    11m11.576s
      sys     0m35.187s
      
      And with the negative cache:
      
      $ time ./libgit2_clar -smerge::trees::renames::cache_recomputation >/dev/null
      real    0m1.972s
      user    0m1.851s
      sys     0m0.118s
      
      So this represents a ~350-fold performance improvement, but it obviously
      depends on how many files you have and how big the blob is. The test
      number were chosen in a way that one will immediately notice as soon as
      the bug resurfaces.
      Patrick Steinhardt committed
  2. 01 Apr, 2020 1 commit
  3. 28 Mar, 2020 2 commits
  4. 26 Mar, 2020 7 commits
  5. 23 Mar, 2020 3 commits
  6. 22 Mar, 2020 1 commit
  7. 21 Mar, 2020 1 commit
  8. 18 Mar, 2020 1 commit
  9. 17 Mar, 2020 1 commit
  10. 14 Mar, 2020 1 commit
    • cmake: use install directories provided via GNUInstallDirs · 87fc539f
      We currently hand-code logic to configure where to install our artifacts
      via the `LIB_INSTALL_DIR`, `INCLUDE_INSTALL_DIR` and `BIN_INSTALL_DIR`
      variables. This is reinventing the wheel, as CMake already provide a way
      to do that via `CMAKE_INSTALL_<DIR>` paths, e.g. `CMAKE_INSTALL_LIB`.
      This requires users of libgit2 to know about the discrepancy and will
      require special hacks for any build systems that handle these variables
      in an automated way. One such example is Gentoo Linux, which sets up
      these paths in both the cmake and cmake-utils eclass.
      
      So let's stop doing that: the GNUInstallDirs module handles it in a
      better way for us, especially so as the actual values are dependent on
      CMAKE_INSTALL_PREFIX. This commit removes our own set of variables and
      instead refers users to use the standard ones.
      
      As a second benefit, this commit also fixes our pkgconfig generation to
      use the GNUInstallDirs module. We had a bug there where we ignored the
      CMAKE_INSTALL_PREFIX when configuring the libdir and includedir keys, so
      if libdir was set to "lib64", then libdir would be an invalid path. With
      GNUInstallDirs, we can now use `CMAKE_INSTALL_FULL_LIBDIR`, which
      handles the prefix for us.
      Patrick Steinhardt committed
  11. 13 Mar, 2020 6 commits
  12. 10 Mar, 2020 4 commits
  13. 08 Mar, 2020 2 commits
  14. 06 Mar, 2020 2 commits
  15. 05 Mar, 2020 2 commits
    • Merge pull request #5432 from libgit2/ethomson/sslread · 76e45960
      httpclient: use a 16kb read buffer for macOS
      Patrick Steinhardt committed
    • httpclient: use a 16kb read buffer for macOS · 502e5d51
      Use a 16kb read buffer for compatibility with macOS SecureTransport.
      
      SecureTransport `SSLRead` has the following behavior:
      
      1. It will return _at most_ one TLS packet's worth of data, and
      2. It will try to give you as much data as you asked for
      
      This means that if you call `SSLRead` with a buffer size that is smaller
      than what _it_ reads (in other words, the maximum size of a TLS packet),
      then it will buffer that data for subsequent calls.  However, it will
      also attempt to give you as much data as you requested in your SSLRead
      call.  This means that it will guarantee a network read in the event
      that it has buffered data.
      
      Consider our 8kb buffer and a server sending us 12kb of data on an HTTP
      Keep-Alive session.  Our first `SSLRead` will read the TLS packet off
      the network.  It will return us the 8kb that we requested and buffer the
      remaining 4kb.  Our second `SSLRead` call will see the 4kb that's
      buffered and decide that it could give us an additional 4kb.  So it will
      do a network read.
      
      But there's nothing left to read; that was the end of the data.  The
      HTTP server is waiting for us to provide a new request.  The server will
      eventually time out, our `read` system call will return, `SSLRead` can
      return back to us and we can make progress.
      
      While technically correct, this is wildly ineffecient.  (Thanks, Tim
      Apple!)
      
      Moving us to use an internal buffer that is the maximum size of a TLS
      packet (16kb) ensures that `SSLRead` will never buffer and it will
      always return everything that it read (albeit decrypted).
      Edward Thomson committed