1. 05 Apr, 2020 1 commit
    • docs: add documentation for our coding style · ffb6a576
      For years, we've repeatedly had confusion about what our actual coding
      style is, not only among newcomers but also among the core contributors.
      This can mostly be attributed to the fact that we do not have any coding
      conventions written down. This is now a thing of the past with the
      introduction of a new document that gives an initial overview of our
      style and most important best practices for both our C codebase as well
      as for CMake.
      
      While the proposed coding style for our C codebase should be rather
      uncontroversial, the coding style for CMake might be controversial, for
      two reasons. First, the CMake code base doesn't really have any uniform
      coding style and is quite outdated in a lot of places.
      Second, the proposed coding style actually breaks with our existing one:
      we currently use all-uppercase function names and variables, but the
      documented coding style says we use all-lowercase function names but
      all-uppercase variables.
      
      It's common practice in CMake to write variables in all uppercase, and
      in fact all variables made available by CMake are exactly that. As
      variables are case-sensitive in CMake, we cannot and shouldn't break
      with this. In contrast, function calls are case-insensitive, and modern
      CMake always uses all-lowercase ones. I argue we should do the same to
      get in line with other codebases and to reduce the likelihood of
      repetitive strain injuries.
      
      So especially for CMake, the proposed coding style says something we
      don't have yet. I'm fine with that, as the document explicitly says that
      it's what we want to have and not what we have right now.
      Patrick Steinhardt committed
  2. 04 Apr, 2020 4 commits
  3. 03 Apr, 2020 1 commit
  4. 02 Apr, 2020 4 commits
  5. 01 Apr, 2020 10 commits
    • Merge pull request #5461 from pks-t/pks/refdb-fs-unused-header · b8eec0b2
      refdb_fs: remove unused header file
      Edward Thomson committed
    • Making get_delta_base() conform to the general error-handling pattern · ba59a4a2
      This makes get_delta_base() return the error code as the return value
      and the delta base as an out-parameter.
      lhchavez committed
    • pack: Improve error handling for get_delta_base() · f3273725
      This change moves the responsibility of setting the error upon failures
      of get_delta_base() to get_delta_base() instead of its callers. That
      way, the caller can always check if the return value is negative and
      mark the whole operation as an error instead of using garbage values,
      which can lead to crashes if the .pack files are malformed.
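
The pattern described in the two commits above can be sketched as follows. This is a minimal illustration under stated assumptions, not the actual libgit2 code: every `example_` name, the signature, and the bounds check are hypothetical. It only shows the shape of the convention, where the return value carries the error code, the result travels through an out-parameter, and the callee records the error itself.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are not libgit2's real types or functions. */
typedef long long example_off_t;

static const char *example_last_error = NULL;

static void example_set_error(const char *msg)
{
	example_last_error = msg;
}

/*
 * Returns 0 on success and a negative value on failure. The delta base is
 * written to *base_out only on success, and the function sets the error
 * message itself instead of leaving that to its callers.
 */
static int example_get_delta_base(example_off_t *base_out,
	example_off_t delta_offset, example_off_t encoded_distance)
{
	assert(base_out != NULL);

	/* Assumed sanity check: the base must lie before the delta. */
	if (encoded_distance <= 0 || encoded_distance > delta_offset) {
		example_set_error("delta base offset is out of bounds");
		return -1;
	}

	*base_out = delta_offset - encoded_distance;
	return 0;
}
```

With this shape, a caller can write `if ((error = example_get_delta_base(&base, off, dist)) < 0) return error;` and never touches `base` unless the call succeeded.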
      lhchavez committed
    • Merge pull request #5466 from pks-t/pks/patch-modechange-with-rename · 1c7fb212
      patch: correctly handle mode changes for renames
      Edward Thomson committed
    • Merge pull request #5474 from pks-t/pks/gitignore-cleanup · 85533f37
      gitignore: clean up patterns from old times
      Edward Thomson committed
    • Merge pull request #5478 from pks-t/pks/readme-ci-update · 2662da48
      README.md: update build matrix to reflect our latest releases
      Edward Thomson committed
    • cmake: streamline backend detection · 541de515
      We're currently doing unnecessary work to auto-detect backends even if
      the functionality is disabled altogether. Let's fix this by removing the
      extraneous FOO_BACKEND variables, instead letting auto-detection modify
      the variable itself.
      Patrick Steinhardt committed
    • Merge pull request #5471 from pks-t/pks/v1.0 · 7d3c7057
      Release v1.0
      Patrick Steinhardt committed
    • merge: cache negative cache results for similarity metrics · 4dfcc50f
      When computing renames, we cache the hash signatures for each of the
      potentially conflicting entries so that we do not need to repeatedly
      read the file and can at least halfway efficiently determine whether two
      files are similar enough to be deemed a rename. In order to make the
      hash signatures meaningful, we require at least four lines of data to be
      present, resulting in at least four different hashes that can be
      compared. Files that are deemed too small are not cached at all and
      will thus be repeatedly re-hashed, which is usually not a huge issue.
      
      The issue with the above heuristic arises when a file does _not_ have
      at least four lines, where a line is anything separated by a
      consecutive run of "\n" or "\0" characters. For example, "a\nb" is two
      lines, but "a\0\0b" is also just two lines. Taken to the extreme, a
      file that consists of megabytes of consecutive space or NUL characters
      only may also be deemed too small and thus not get cached. As a
      result, we will repeatedly load its blob and calculate its hash
      signature, just to finally throw it away as we notice it's not of any
      value. When you've got a comparatively big file that you compare
      against a big set of potentially renamed files, the cost simply
      explodes.
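
The line definition used in this paragraph can be made concrete with a small sketch. It assumes only what the message states, namely that runs of "\n" or "\0" separate lines and that four lines are the minimum; the function names are hypothetical, and the real hashing code may treat other characters differently.

```c
#include <assert.h>
#include <stddef.h>

#define EXAMPLE_MIN_LINES 4 /* threshold stated in the commit message */

/* Count runs of bytes that are neither '\n' nor '\0'. */
static size_t example_count_lines(const char *data, size_t len)
{
	size_t i, lines = 0;
	int in_line = 0;

	assert(data != NULL || len == 0);

	for (i = 0; i < len; i++) {
		int is_separator = (data[i] == '\n' || data[i] == '\0');

		/* A new line starts at the first byte after a separator run. */
		if (!is_separator && !in_line)
			lines++;
		in_line = !is_separator;
	}

	return lines;
}

/* A buffer counts as too small regardless of its size in bytes. */
static int example_is_too_small(const char *data, size_t len)
{
	return example_count_lines(data, len) < EXAMPLE_MIN_LINES;
}
```

Under this definition a buffer consisting solely of NUL bytes has zero lines, so it is rejected as too small no matter how many megabytes it spans.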
      
      The issue can be trivially fixed by introducing negative cache entries.
      Whenever we determine that a given blob does not have a meaningful
      representation via a hash signature, we store this negative cache marker
      and will from then on not hash it again, but also ignore it as a
      potential rename target. This should help the "normal" case already
      where you have a lot of small files as rename candidates, but in the
      above scenario its savings are extraordinarily high.
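
A negative cache entry of this kind might look roughly like the sketch below. It is an illustration under assumptions rather than the code from this commit: the slot states, the toy hash, and all `example_` names are made up. The point is that a third state, next to "empty" and "valid", lets the lookup skip the expensive hashing step on every later visit.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical cache slot states; none of these names come from libgit2. */
enum example_slot_state {
	EXAMPLE_SLOT_EMPTY = 0, /* signature not computed yet */
	EXAMPLE_SLOT_VALID,     /* signature computed and usable */
	EXAMPLE_SLOT_TOO_SMALL  /* negative entry: do not hash again */
};

struct example_slot {
	enum example_slot_state state;
	unsigned long signature;
};

/* Counts how often the expensive path runs, for demonstration only. */
static unsigned example_hash_calls = 0;

/* Toy stand-in for the expensive load-and-hash step. */
static int example_hash(unsigned long *out, const char *data, size_t len)
{
	size_t i, lines = 0;
	unsigned long h = 5381;
	int in_line = 0;

	example_hash_calls++;

	for (i = 0; i < len; i++) {
		int sep = (data[i] == '\n' || data[i] == '\0');

		if (!sep && !in_line)
			lines++;
		in_line = !sep;
		h = h * 33 + (unsigned char)data[i];
	}

	if (lines < 4)
		return -1; /* too small to yield a meaningful signature */

	*out = h;
	return 0;
}

/*
 * Returns 1 if a usable signature was written to *out, or 0 if the blob
 * should be ignored as a rename candidate.
 */
static int example_cached_signature(unsigned long *out,
	struct example_slot *slot, const char *data, size_t len)
{
	assert(out != NULL && slot != NULL);

	if (slot->state == EXAMPLE_SLOT_TOO_SMALL)
		return 0; /* negative cache hit, nothing is re-hashed */

	if (slot->state == EXAMPLE_SLOT_EMPTY) {
		if (example_hash(&slot->signature, data, len) < 0) {
			slot->state = EXAMPLE_SLOT_TOO_SMALL;
			return 0;
		}
		slot->state = EXAMPLE_SLOT_VALID;
	}

	*out = slot->signature;
	return 1;
}
```

In this sketch, looking up the same too-small blob twice runs `example_hash()` only once; the second lookup returns immediately from the negative entry.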
      
      To verify we do not hit the issue anymore with the described solution, this
      commit adds a test that uses the exact same setup described above with
      one 50 megabyte blob of '\0' characters and 1000 other files that get
      renamed. Without the negative cache:
      
      $ time ./libgit2_clar -smerge::trees::renames::cache_recomputation >/dev/null
      real    11m48.377s
      user    11m11.576s
      sys     0m35.187s
      
      And with the negative cache:
      
      $ time ./libgit2_clar -smerge::trees::renames::cache_recomputation >/dev/null
      real    0m1.972s
      user    0m1.851s
      sys     0m0.118s
      
      So this represents a ~350-fold performance improvement, but it obviously
      depends on how many files you have and how big the blob is. The test
      numbers were chosen in a way that one will immediately notice as soon as
      the bug resurfaces.
      Patrick Steinhardt committed
  6. 30 Mar, 2020 1 commit
  7. 28 Mar, 2020 2 commits
  8. 26 Mar, 2020 8 commits
  9. 25 Mar, 2020 1 commit
  10. 23 Mar, 2020 3 commits
  11. 22 Mar, 2020 1 commit
  12. 21 Mar, 2020 1 commit
  13. 18 Mar, 2020 1 commit
  14. 17 Mar, 2020 1 commit
  15. 14 Mar, 2020 1 commit
    • cmake: use install directories provided via GNUInstallDirs · 87fc539f
      We currently hand-code logic to configure where to install our artifacts
      via the `LIB_INSTALL_DIR`, `INCLUDE_INSTALL_DIR` and `BIN_INSTALL_DIR`
      variables. This is reinventing the wheel, as CMake already provides a
      way to do that via `CMAKE_INSTALL_<DIR>` paths, e.g.
      `CMAKE_INSTALL_LIBDIR`. Our custom variables require users of libgit2
      to know about the discrepancy and will
      require special hacks for any build systems that handle these variables
      in an automated way. One such example is Gentoo Linux, which sets up
      these paths in both the cmake and cmake-utils eclass.
      
      So let's stop doing that: the GNUInstallDirs module handles it in a
      better way for us, especially so as the actual values are dependent on
      CMAKE_INSTALL_PREFIX. This commit removes our own set of variables and
      instead refers users to use the standard ones.
      
      As a second benefit, this commit also fixes our pkgconfig generation to
      use the GNUInstallDirs module. We had a bug there where we ignored the
      CMAKE_INSTALL_PREFIX when configuring the libdir and includedir keys, so
      if libdir was set to "lib64", then libdir would be an invalid path. With
      GNUInstallDirs, we can now use `CMAKE_INSTALL_FULL_LIBDIR`, which
      handles the prefix for us.
      Patrick Steinhardt committed