1. 23 Feb, 2022 2 commits
  2. 17 Oct, 2021 1 commit
    • str: introduce `git_str` for internal, `git_buf` is external · f0e693b1
      libgit2 has two distinct requirements that were previously solved by
      `git_buf`.  We require:
      
      1. A general purpose string class that provides a number of utility APIs
         for manipulating data (eg, concatenating, truncating, etc).
      2. A structure that we can use to return strings to callers that they
         can take ownership of.
      
      By using a single class (`git_buf`) for both of these purposes, we have
      confused the API to the point that refactorings are difficult and
      reasoning about correctness is also difficult.
      
      Move the utility class `git_buf` to be called `git_str`: this represents
      its general purpose, as an internal string buffer class.  The name is
      also an homage to Junio Hamano ("gitstr").
      
      The public API remains `git_buf`, and has a much smaller footprint.  It
      is generally only used as an "out" param with strict requirements that
      follow the documentation.  (Exceptions exist for some legacy APIs to
      avoid breaking callers unnecessarily.)
      
      Utility functions exist to convert a user-specified `git_buf` to a
      `git_str` so that we can call internal functions, and then convert it
      back again.
      Edward Thomson committed
  3. 27 Sep, 2021 1 commit
  4. 11 May, 2021 1 commit
  5. 21 Sep, 2019 2 commits
    • buffer: fix printing into out-of-memory buffer · 174b7a32
      Before printing into a `git_buf` structure, we always call `ENSURE_SIZE`
      first. This macro will reallocate the buffer as needed depending on
      whether the current amount of allocated bytes is sufficient or not. If
      `asize` is big enough, then it will just do nothing, otherwise it will
      call out to `git_buf_try_grow`. But in fact, it is insufficient to only
      check `asize`.
      
      When we fail to allocate any more bytes e.g. via `git_buf_try_grow`,
      then we set the buffer's pointer to `git_buf__oom`. Note that we touch
      neither `asize` nor `size`. So if we just check `asize > targetsize`,
      then we will happily let the caller of `ENSURE_SIZE` proceed with an
      out-of-memory buffer. As a result, we will print all bytes into the
      out-of-memory buffer instead, resulting in an out-of-bounds write.
      
      Fix the issue by having `ENSURE_SIZE` verify that the buffer is not
      marked as OOM. Add a test to verify that we're not writing into the OOM
      buffer.
      Patrick Steinhardt committed
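The check described in this commit can be sketched roughly as follows. This is a minimal illustration, not libgit2's actual code: `sketch_buf`, `sketch_oom`, `sketch_mark_oom`, `sketch_try_grow` and `sketch_ensure_size` are hypothetical stand-ins for the buffer type, the `git_buf__oom` sentinel, `git_buf_try_grow` and the `ENSURE_SIZE` macro.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the buffer structure. */
typedef struct {
    char *ptr;
    size_t asize; /* allocated bytes */
    size_t size;  /* used bytes */
} sketch_buf;

/* Hypothetical sentinel that marks a buffer as out-of-memory. */
char sketch_oom[1];

/* Mark the buffer as OOM; note that asize and size are left untouched. */
void sketch_mark_oom(sketch_buf *buf)
{
    buf->ptr = sketch_oom;
}

/* Try to grow the allocation to `target` bytes; mark OOM on failure. */
int sketch_try_grow(sketch_buf *buf, size_t target)
{
    char *p = realloc(buf->ptr, target);

    if (p == NULL) {
        free(buf->ptr);
        sketch_mark_oom(buf);
        return -1;
    }

    buf->ptr = p;
    buf->asize = target;
    return 0;
}

/*
 * Checking asize alone is not enough: an OOM buffer can still report a
 * large asize, so the OOM marker has to be tested first.
 */
int sketch_ensure_size(sketch_buf *buf, size_t target)
{
    if (buf->ptr == sketch_oom)
        return -1;
    if (buf->asize >= target)
        return 0;
    return sketch_try_grow(buf, target);
}
```

With the OOM test in place, a caller that sees a non-zero return value knows it must not write into the buffer, even though `asize` may still look sufficient.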
    • buffer: fix infinite loop when growing buffers · 208f1d7a
      When growing buffers, we repeatedly multiply the currently allocated
      number of bytes by 1.5 until it exceeds the requested number of bytes.
      This has two major problems:
      
          1. If the current number of bytes is tiny and one wishes to resize
             to a comparatively huge number of bytes, then we may need to loop
             thousands of times.
      
          2. If resizing to a value close to `SIZE_MAX` (which would fail
             anyway), then we probably hit an infinite loop as multiplying the
             current amount of bytes will repeatedly result in integer
             overflows.
      
      When reallocating buffers, one typically chooses values close to 1.5 to
      enable re-use of resulting memory holes in later reallocations. But
      because of this, it really only makes sense to apply the factor of 1.5
      _once_ rather than looping until the result finally fits. Thus, we
      can completely avoid the loop and just opt for the much simpler
      algorithm of multiplying with 1.5 once and, if the result doesn't fit,
      just use the target size. This avoids both problems of looping
      extensively and hitting overflows.
      
      This commit also adds a test that would've previously resulted in an
      infinite loop.
      Patrick Steinhardt committed
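The simplified sizing rule from this commit might look something like the sketch below. The function name `sketch_grow_size` is hypothetical, and the explicit overflow guard is an assumption about how one could keep the multiplication safe; it is not taken from the commit itself.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: choose a new allocation size for a buffer that
 * currently has `current` bytes allocated and must hold `target` bytes.
 */
size_t sketch_grow_size(size_t current, size_t target)
{
    size_t grown;

    /* Nothing to do if the buffer is already large enough. */
    if (target <= current)
        return current;

    /* If multiplying by 1.5 would overflow, fall back to the target. */
    if (current > SIZE_MAX - current / 2)
        return target;

    /* Multiply by 1.5 exactly once; there is no loop. */
    grown = current + current / 2;

    /* If that still does not fit, just use the target size. */
    return grown >= target ? grown : target;
}
```

Because there is no loop, a tiny buffer resized to a huge target takes a single step, and a target near `SIZE_MAX` cannot cause repeated overflow.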
  6. 20 Jul, 2019 1 commit
  7. 10 Jun, 2018 1 commit
  8. 26 May, 2016 2 commits
  9. 17 Sep, 2015 1 commit
    • git_futils_mkdir_*: make a relative-to-base mkdir · ac2fba0e
      Untangle git_futils_mkdir from git_futils_mkdir_ext - the latter
      assumes that we own everything beneath the base, as if it were
      being called with a base of the repository or working directory,
      and is tailored towards checkout and ensuring that there is no
      bogosity beneath the base that must be cleaned up.
      
      This is (at best) slow and (at worst) unsafe in the larger context
      of a filesystem where we do not own things and cannot do things like
      unlink symlinks that are in our way.
      Edward Thomson committed
  10. 24 Jun, 2015 2 commits
  11. 22 Jun, 2015 1 commit
  12. 20 Jan, 2015 1 commit
  13. 21 Nov, 2014 1 commit
  14. 01 Oct, 2014 1 commit
  15. 15 Aug, 2014 1 commit
  16. 23 Jun, 2014 1 commit
    • crlf: pass-through mixed EOL buffers from LF->CRLF · 5a76ad35
      When checking out files, we're performing conversion into the user's
      native line endings, but we only want to do it for files which have
      consistent line endings. Refuse to perform the conversion for mixed-EOL
      files.
      
      The CRLF->LF filter is left as-is, as that conversion is considered to be
      normalization by git and should force a conversion of the line endings.
      Carlos Martín Nieto committed
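One way to detect a mixed-EOL buffer before the LF->CRLF conversion is sketched below. The function `sketch_has_mixed_eol` is hypothetical, and the definition of "mixed" used here (at least one bare LF and at least one CRLF, ignoring lone CRs) is an assumption made for illustration; the commit does not spell out the exact rule.

```c
#include <stddef.h>

/*
 * Hypothetical check: return 1 if the buffer contains both bare LF and
 * CRLF line endings, 0 otherwise. Lone CR bytes are ignored here.
 */
int sketch_has_mixed_eol(const char *data, size_t len)
{
    size_t lf = 0, crlf = 0, i;

    for (i = 0; i < len; i++) {
        if (data[i] != '\n')
            continue;

        if (i > 0 && data[i - 1] == '\r')
            crlf++;
        else
            lf++;
    }

    return lf > 0 && crlf > 0;
}
```

A checkout filter built on this idea would pass the buffer through unchanged when the check returns 1, and convert only when the line endings are consistent.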
  17. 23 Apr, 2014 1 commit
  18. 01 Apr, 2014 1 commit
  19. 20 Jan, 2014 1 commit
  20. 14 Nov, 2013 1 commit
  21. 17 Sep, 2013 1 commit
    • Start of filter API + git_blob_filtered_content · 0cf77103
      This begins the process of exposing git_filter objects to the
      public API.  This includes:
      
      * new public type and API for `git_buffer` through which an
        allocated buffer can be passed to the user
      * new API `git_blob_filtered_content`
      * make the git_filter type and GIT_FILTER_TO_... constants public
      Russell Belfer committed
  22. 19 Aug, 2013 1 commit
    • Skip UTF-8 BOM in binary detection · c0b01b75
      When a git_buf contains a UTF-8 BOM, the three bytes comprising
      that BOM are treated as unprintable characters.  For a small git_buf,
      the three BOM characters overwhelm the printable characters.  This
      is problematic when trying to check out a small file as the CR/LF
      filtering will not apply.
      Edward Thomson committed
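A rough sketch of skipping the BOM before gathering text statistics follows. The function names, the classification of bytes as printable, and the threshold `(printable >> 7) < nonprintable` are all assumptions chosen to reproduce the behaviour described above; they are not copied from libgit2.

```c
#include <stddef.h>
#include <string.h>

/* Return 3 if the data starts with a UTF-8 BOM (EF BB BF), else 0. */
size_t sketch_skip_utf8_bom(const unsigned char *data, size_t len)
{
    static const unsigned char bom[3] = { 0xEF, 0xBB, 0xBF };

    return (len >= 3 && memcmp(data, bom, 3) == 0) ? 3 : 0;
}

/*
 * Hypothetical binary heuristic: count printable and unprintable bytes
 * after any leading BOM, then compare them with an assumed threshold.
 */
int sketch_is_binary(const unsigned char *data, size_t len)
{
    size_t printable = 0, nonprintable = 0, i;

    for (i = sketch_skip_utf8_bom(data, len); i < len; i++) {
        unsigned char c = data[i];

        if (c == '\n' || c == '\r' || c == '\t' || (c >= 0x20 && c < 0x7F))
            printable++;
        else
            nonprintable++;
    }

    return (printable >> 7) < nonprintable;
}
```

Without the skip, the three BOM bytes of a short file would count as unprintable and tip this heuristic towards "binary", which is the failure mode the commit describes.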
  23. 31 Jul, 2013 1 commit
    • Major rename detection changes · d730d3f4
      After doing further profiling, I found that a lot of time was
      being spent attempting to insert hashes into the file hash
      signature when using the rolling hash because the rolling hash
      approach generates a hash per byte of the file instead of one
      per run/line of data.
      
      To optimize this, I decided to convert back to a run-based file
      signature algorithm which would be more like core Git.
      
      After changing this, a number of the existing tests started to
      fail.  In some cases, this appears to have been because the test
      was coded to be too specific to the particular results of the file
      similarity metric and in some cases there appear to have been bugs
      in the core rename detection code where only by the coincidence
      of the file similarity scoring were the expected results being
      generated.
      
      This renames all the variables in the core rename detection code
      to be more consistent and hopefully easier to follow which made it
      a bit easier to reason about the behavior of that code and fix the
      problems that I was seeing.  I think it's in better shape now.
      
      There are a couple of tests now that attempt to stress test the
      rename detection code and they are quite slow.  Most of the time
      is spent setting up the test data on disk and in the index.  When
      we roll out performance improvements for index insertion, it
      should also speed up these tests I hope.
      Russell Belfer committed
  24. 25 Mar, 2013 1 commit
    • Move crlf conversion into buf_text · 3658e81e
      This adds crlf/lf conversion functions into buf_text with more
      efficient implementations that bypass the high level buffer
      functions.  They attempt to minimize the number of reallocations
      done and they directly write the buffer data as needed if they
      know that there is enough memory allocated to memcpy data.
      
      Tests are added for these new functions.  The crlf.c code is
      updated to use the new functions.
      
      Removed the include of buf_text.h from filter.h and just include
      it more narrowly in the places that need it.
      Russell Belfer committed
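The idea of sizing the output once and then writing bytes directly can be illustrated with the sketch below. `sketch_lf_to_crlf` is a hypothetical function, not the `buf_text` API; it uses a plain `malloc` instead of libgit2's buffer type, and it does not guard against size overflow for very large inputs.

```c
#include <stddef.h>
#include <stdlib.h>

/*
 * Hypothetical LF->CRLF conversion: count the bare LFs first so that the
 * output can be allocated exactly once, then write the bytes directly.
 * Returns a NUL-terminated buffer the caller must free, or NULL on
 * allocation failure.
 */
char *sketch_lf_to_crlf(const char *src, size_t len, size_t *out_len)
{
    size_t bare_lf = 0, i, j = 0;
    char *dst;

    /* First pass: count LFs that are not already preceded by CR. */
    for (i = 0; i < len; i++)
        if (src[i] == '\n' && (i == 0 || src[i - 1] != '\r'))
            bare_lf++;

    /* Single allocation: one extra byte per bare LF plus a terminator. */
    dst = malloc(len + bare_lf + 1);
    if (dst == NULL)
        return NULL;

    /* Second pass: copy bytes, inserting CR before each bare LF. */
    for (i = 0; i < len; i++) {
        if (src[i] == '\n' && (i == 0 || src[i - 1] != '\r'))
            dst[j++] = '\r';
        dst[j++] = src[i];
    }

    dst[j] = '\0';
    *out_len = j;
    return dst;
}
```

The two-pass shape trades one extra scan of the input for the guarantee that no reallocation happens while writing.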
  25. 20 Feb, 2013 4 commits
  26. 29 Jan, 2013 1 commit
  27. 11 Jan, 2013 1 commit
    • Match binary file check of core git in diff · 0d65acad
      Core git just looks for NUL bytes in files when deciding about
      is-binary inside diff (although it uses a better algorithm in
      checkout, when deciding if CRLF conversion should be done).
      Libgit2 was using the better algorithm in both places, but that
      was causing some confusion.  For now, this makes diff just look
      for NUL bytes to decide if a file is binary by content in diff.
      Russell Belfer committed
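The NUL-byte test described here is simple enough to sketch in a few lines. The function name `sketch_contains_nul` is hypothetical, and how much of the file gets scanned is left to the caller.

```c
#include <stddef.h>
#include <string.h>

/* Return 1 if the first `len` bytes of `data` contain a NUL byte. */
int sketch_contains_nul(const char *data, size_t len)
{
    return len > 0 && memchr(data, '\0', len) != NULL;
}
```

Under this rule, content is treated as binary in diff if and only if the scanned bytes include a NUL, regardless of how many other unprintable bytes appear.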
  28. 28 Nov, 2012 1 commit
    • Consolidate text buffer functions · 7bf87ab6
      There are many scattered functions that look into the contents of
      buffers to do various text manipulations (such as escaping or
      unescaping data, calculating text stats, guessing if content is
      binary, etc).  This groups all those functions together into a
      new file and converts the code to use that.
      
      This has two enhancements to existing functionality.  The old
      text stats function is significantly rewritten and the BOM
      detection code was extended (although largely we can't deal with
      anything other than a UTF8 BOM).
      Russell Belfer committed
  29. 10 Oct, 2012 1 commit
  30. 23 Aug, 2012 1 commit
  31. 24 Jul, 2012 1 commit
  32. 12 Jul, 2012 1 commit
  33. 11 Jul, 2012 1 commit