1. 30 Mar, 2020 2 commits
    • fetchhead: strip credentials from remote URL · 25cd374d
      If fetching from an anonymous remote via its URL, then the URL gets
      written into the FETCH_HEAD reference. This is mainly done to give
      valuable context to some commands, like for example git-merge(1), which
      will put the URL into the generated MERGE_MSG. As a result, what gets
      written into FETCH_HEAD may become public in some cases. This is
      especially important considering that URLs may contain credentials, e.g.
      when cloning 'https://foo:bar@example.com/repo' we persist the complete
      URL into FETCH_HEAD and put it without any kind of sanitization into the
      MERGE_MSG. This is obviously bad, as your login data has now just leaked
      as soon as you do git-push(1).
      
      When writing the URL into FETCH_HEAD, upstream git does strip
      credentials first. Let's do the same by trying to parse the remote URL
      as a "real" URL, removing any credentials and then re-formatting the
      URL. In case this fails, e.g. when it's a file path or not a valid URL,
      we just fall back to using the URL as-is without any sanitization. Add
      tests to verify our behaviour.
      Patrick Steinhardt committed
    • Merge pull request #5467 from pks-t/pks/v0.28.5 · 1e5b1395
      Release v0.28.5
      Edward Thomson committed
  2. 26 Mar, 2020 38 commits
    • scripts: adjust expected SOVERSION for v0.28 branch · 9f4cb617
      The v0.28 branch still uses an old SOVERSION style which includes the
      minor version, only. Adjust the release script to reflect that.
      Patrick Steinhardt committed
    • scripts: add script to create releases · 2ac4b4a3
      The current release process is not documented in any way. As a result,
      it's not obvious how releases should be done at all, like e.g. which
      locations need adjusting.
      
      To fix this, let's introduce a new script that shall from now on be used
      to do all releases. As input it gets the tree that shall be released,
      the repository in which to do the release, credentials to
      authenticate against GitHub and the new version. E.g. executing the
      following will create a new release v0.32:
      
          $ ./script/release.py 0.32.0 --user pks-t --password ****
      
      While the password may currently be your usual GitLab password, it's
      recommended to use a personal access token intead.
      
      The script will then perform the following steps:
      
          1. Verify that "include/git2/version.h" matches the new version.
      
          2. Verify that "docs/changelog.md" has a section for that new
             version.
      
          3. Extract the changelog entries for the current release from
             "docs/changelog.md".
      
          4. Generate two archives in "tar.gz" and "zip" format via "git
             archive" from the tree passed by the user. If no tree was passed,
             we will use "HEAD".
      
          5. Create the GitHub release using the extracted changelog entries
             as well as tag and name information derived from the version
             passed by the used.
      
          6. Upload both code archives to that release.
      
      This should cover all steps required for a new release and thus ensures
      that nothing is missing that shouldn't be.
      Patrick Steinhardt committed
    • azure: only override PATH when building · b85983a2
      We currently unconditionally override the PATH variable with a custom
      path with the main intent of making available our own custom MinGW
      installation. This worked quite well so far, but is heavily dependent on
      the machine we're running this on. And naturally, it fails on the new
      Windows machines we need to upgrade to, as tools like CMake are not
      contained in the path we currently set up.
      
      Fix this by remodeling the way we set up the PATH environment. Instead
      of overriding it completely, we now override it only when executing
      the CMake build.
      Patrick Steinhardt committed
    • azure: upgrade to newer Windows VM images · d909524e
      This is a subset of commit 95f329b4 (azure: upgrade to newer hosted VM
      images, 2020-03-10), upgrading all of our Windows jobs to use
      'vs2017-win2016' machines and macOS to 'macos-10.14'. This is intended
      to keep our continuous integration builds from failing in the future, as
      these images will get deprecated on March 31st. As this is in
      preparation of a stable release, we do not want to upgrade any of the
      other machines like is done in the mentioned commit but keep the impact
      minimal.
      
      fixup! azure: upgrade to newer Windows VM images
      Patrick Steinhardt committed
    • refdb_fs: initialize backend version · 4dc93239
      While the `git_refdb_backend()` struct has a version, we do not
      initialize it correctly when calling `git_refdb_backend_fs()`. Fix this
      by adding the call to `git_refdb_init_backend()`.
      Patrick Steinhardt committed
    • Fix segfault when calling git_blame_buffer() · 786b29cb
      This change makes sure that the hunk is not null before trying to
      dereference it. This avoids segfaults, especially when blaming against a
      modified buffer (i.e. the index).
      
      Fixes: #5443
      lhchavez committed
    • Fix typo on GIT_USE_NEC · 345689ff
      Signed-off-by: Sven Strickroth <email@cs-ware.de>
      Sven Strickroth committed
    • refs: refuse to delete HEAD · aeda6786
      This requires adding a new symbolic ref to the testrepo fixture.
      Some of the existing tests attempt to delete HEAD, expecting a different failure. Introduce and use a non-HEAD symbolic ref instead.
      Adjust a few other tests as needed.
      
      Fixes #5357
      Josh Bleecher Snyder committed
    • smart_pkt: fix overflow resulting in OOB read/write of one byte · c2b617f3
      When parsing OK packets, we copy any information after the initial "ok "
      prefix into the resulting packet. As newlines act as packet boundaries,
      we also strip the trailing newline if there is any. We do not check
      whether there is any data left after the initial "ok " prefix though,
      which leads to a pointer overflow in that case as `len == 0`:
      
      	if (line[len - 1] == '\n')
      		--len;
      
      This out-of-bounds read is a rather useless gadget, as we can only
      deduce whether at some offset there is a newline character. In case
      there accidentally is one, we overflow `len` to `SIZE_MAX` and then
      write a NUL byte into an array indexed by it:
      
      	pkt->ref[len] = '\0';
      
      Again, this doesn't seem like something that's possible to be exploited
      in any meaningful way, but it may surely lead to inconsistencies or DoS.
      
      Fix the issue by checking whether there is any trailing data after the
      packet prefix.
      Patrick Steinhardt committed
    • global: convert to fiber-local storage to fix exit races · bb173381
      On Windows platforms, we automatically clean up the thread-local storage
      upon detaching a thread via `DllMain()`. The thing is that this happens
      for every thread of applications that link against the libgit2 DLL, even
      those that don't have anything to do with libgit2 itself. As a result,
      we cannot assume that these unsuspecting threads make use of our
      `git_libgit2_init()` and `git_libgit2_shutdow()` reference counting,
      which may lead to racy situations:
      
          Thread 1                    Thread 2
      
          git_libgit2_shutdown()
                                      DllMain(DETACH_THREAD)
                                      git__free_tls_data()
          git_atomic_dec() == 0
          git__free_tls_data()
          TlsFree(_tls_index)
                                      TlsGetValue(_tls_index)
      
      Due to the second thread never having executed `git_libgit2_init()`, the
      first thread will clean up TLS data and as a result also free the
      `_tls_index` variable. When detaching the second thread, we
      unconditionally access the now-free'd `_tls_index` variable, which is
      obviously not going to work out well.
      
      Fix the issue by converting the code to use fiber-local storage instead
      of thread-local storage. While FLS will behave the exact same as TLS if
      no fibers are in use, it does allow us to specify a destructor similar
      to the one that is accepted by pthread_key_create(3P). Like this, we do
      not have to manually free indices anymore, but will let the FLS handle
      calling the destructor. This allows us to get rid of `DllMain()`
      completely, as we only used it to keep track of when threads were
      exiting and results in an overall simplification of TLS cleanup.
      Patrick Steinhardt committed
    • patch_parse: fix out-of-bounds reads caused by integer underflow · 60d1f99e
      The patch format for binary files is a simple Base85 encoding with a
      length byte as prefix that encodes the current line's length. For each
      line, we thus check whether the line's actual length matches its
      expected length in order to not faultily apply a truncated patch. This
      also acts as a check to verify that we're not reading outside of the
      line's string:
      
      	if (encoded_len > ctx->parse_ctx.line_len - 1) {
      		error = git_parse_err(...);
      		goto done;
      	}
      
      There is the possibility for an integer underflow, though. Given a line
      with a single prefix byte, only, `line_len` will be zero when reaching
      this check. As a result, subtracting one from that will result in an
      integer underflow, causing us to assume that there's a wealth of bytes
      available later on. Naturally, this may result in an out-of-bounds read.
      
      Fix the issue by checking both `encoded_len` and `line_len` for a
      non-zero value. The binary format doesn't make use of zero-length lines
      anyway, so we need to know that there are both encoded bytes and
      remaining characters available at all.
      
      This patch also adds a test that works based on the last error message.
      Checking error messages is usually too tightly coupled, but in fact
      parsing the patch failed even before the change. Thus the only
      possibility is to use e.g. Valgrind, but that'd result in us not
      catching issues when run without Valgrind. As a result, using the error
      message is considered a viable tradeoff as we know that we didn't start
      decoding Base85 in the first place.
      Patrick Steinhardt committed
    • patch_parse: use paths from "---"/"+++" lines for binary patches · 8ff44c2a
      For some patches, it is not possible to derive the old and new file
      paths from the patch header's first line, most importantly when they
      contain spaces. In such a case, we derive both paths from the "---" and
      "+++" lines, which allow for non-ambiguous parsing. We fail to use these
      paths when parsing binary patches without data, though, as we always
      expect the header paths to be filled in.
      
      Fix this by using the "---"/"+++" paths by default and only fall back to
      header paths if they aren't set. If neither of those paths are set, we
      just return an error. Add two tests to verify this behaviour, one of
      which would have previously caused a segfault.
      Patrick Steinhardt committed
    • fileops: correct error return on p_lstat failures when mkdir · 128230ba
      IIRC I got a strange return once from lstat, which translated in a weird
      error class/message being reported. As a safety measure, enforce a -1 return in
      that case.
      Etienne Samson committed
    • patch_parse: fix segfault when header path contains whitespace only · 7ce231ff
      When parsing header paths from a patch, we reject any patches with empty
      paths as malformed patches. We perform the check whether a path is empty
      before sanitizing it, though, which may lead to a path becoming empty
      after the check, e.g. if we have trimmed whitespace. This may lead to a
      segfault later when any part of our patching logic actually references
      such a path, which may then be a `NULL` pointer.
      
      Fix the issue by performing the check after sanitizing. Add tests to
      catch the issue as they would have produced a segfault previosuly.
      Patrick Steinhardt committed
    • fix a bug introduced in 8a23597b · 9f2732cc
      romkatv committed
    • patch_parse: detect overflow when calculating old/new line position · 54f0a278
      When the patch contains lines close to INT_MAX, then it may happen that
      we end up with an integer overflow when calculating the line of the
      current diff hunk. Reject such patches as unreasonable to avoid the
      integer overflow.
      
      As the calculation is performed on integers, we introduce two new
      helpers `git__add_int_overflow` and `git__sub_int_overflow` that perform
      the integer overflow check in a generic way.
      Patrick Steinhardt committed
    • patch_parse: fix out-of-bounds read with No-NL lines · 608cb07d
      We've got two locations where we copy lines into the patch. The first
      one is when copying normal " ", "-" or "+" lines, while the second
      location gets executed when we copy "\ No newline at end of file" lines.
      While the first one correctly uses `git__strndup` to copy only until the
      newline, the other one doesn't. Thus, if the line occurs at the end of
      the patch and if there is no terminating NUL character, then it may
      result in an out-of-bounds read.
      
      Fix the issue by using `git__strndup`, as was already done in the other
      location. Furthermore, add allocation checks to both locations to detect
      out-of-memory situations.
      Patrick Steinhardt committed
    • patch_parse: reject empty path names · 3223f5de
      When parsing patch headers, we currently accept empty path names just
      fine, e.g. a line "--- \n" would be parsed as the empty filename. This
      is not a valid patch format and may cause `NULL` pointer accesses at a
      later place as `git_buf_detach` will return `NULL` in that case.
      
      Reject such patches as malformed with a nice error message.
      Patrick Steinhardt committed
    • patch_parse: reject patches with multiple old/new paths · db73191b
      It's currently possible to have patches with multiple old path name
      headers. As we didn't check for this case, this resulted in a memory
      leak when overwriting the old old path with the new old path because we
      simply discarded the old pointer.
      
      Instead of fixing this by free'ing the old pointer, we should reject
      such patches altogether. It doesn't make any sense for the "---" or
      "+++" markers to occur multiple times within a patch n the first place.
      This also implicitly fixes the memory leak.
      Patrick Steinhardt committed
    • patch_parse: handle patches without extended headers · fc60777e
      Extended header lines (especially the "index <hash>..<hash> <mode>") are
      not required by "git apply" so it import patches. So we allow the
      from-file/to-file lines (--- a/file\n+++ b/file) to directly follow the
      git diff header.
      
      This fixes #5267.
      Denis Laxalde committed
    • refs: unlock unmodified refs on transaction commit · fceedda5
      Refs which are locked in a transaction without an altered target,
      still should to be unlocked on `git_transaction_commit`.
      `git_transaction_free` also unlocks refs but the moment of calling of `git_transaction_free`
      cannot be controlled in all situations.
      Some binding libs call `git_transaction_free` on garbage collection or not at all if the
      application exits before and don't provide public access to `git_transaction_free`.
      It is better to release locks as soon as possible.
      Sebastian Henke committed
    • refs: fix locks getting forcibly removed · 5aca2444
      The flag GIT_FILEBUF_FORCE currently does two things:
           1. It will cause the filebuf to create non-existing leading
              directories for the file that is about to be written.
           2. It will forcibly remove any pre-existing locks.
      While most call sites actually do want (1), they do not want to
      remove pre-existing locks, as that renders the locking mechanisms
      effectively useless.
      Introduce a new flag `GIT_FILEBUF_CREATE_LEADING_DIRS` to
      separate both behaviours cleanly from each other and convert
      callers to use it instead of `GIT_FILEBUF_FORCE` to have them
      honor locked files correctly.
      
      As this conversion removes all current users of `GIT_FILEBUF_FORCE`,
      this commit removes the flag altogether.
      Sebastian Henke committed
    • patch_parse: handle patches with new empty files · 85ab27c8
      Patches containing additions of empty files will not contain diff data
      but will end with the index header line followed by the terminating
      sequence "-- ". We follow the same logic as in cc4c44a9 and allow "-- "
      to immediately follow the index header.
      Denis Laxalde committed
    • buffer: fix printing into out-of-memory buffer · 3c605da6
      Before printing into a `git_buf` structure, we always call `ENSURE_SIZE`
      first. This macro will reallocate the buffer as-needed depending on
      whether the current amount of allocated bytes is sufficient or not. If
      `asize` is big enough, then it will just do nothing, otherwise it will
      call out to `git_buf_try_grow`. But in fact, it is insufficient to only
      check `asize`.
      
      When we fail to allocate any more bytes e.g. via `git_buf_try_grow`,
      then we set the buffer's pointer to `git_buf__oom`. Note that we touch
      neither `asize` nor `size`. So if we just check `asize > targetsize`,
      then we will happily let the caller of `ENSURE_SIZE` proceed with an
      out-of-memory buffer. As a result, we will print all bytes into the
      out-of-memory buffer instead, resulting in an out-of-bounds write.
      
      Fix the issue by having `ENSURE_SIZE` verify that the buffer is not
      marked as OOM. Add a test to verify that we're not writing into the OOM
      buffer.
      Patrick Steinhardt committed
    • buffer: fix infinite loop when growing buffers · 93dc8a04
      When growing buffers, we repeatedly multiply the currently allocated
      number of bytes by 1.5 until it exceeds the requested number of bytes.
      This has two major problems:
      
          1. If the current number of bytes is tiny and one wishes to resize
             to a comparatively huge number of bytes, then we may need to loop
             thousands of times.
      
          2. If resizing to a value close to `SIZE_MAX` (which would fail
             anyway), then we probably hit an infinite loop as multiplying the
             current amount of bytes will repeatedly result in integer
             overflows.
      
      When reallocating buffers, one typically chooses values close to 1.5 to
      enable re-use of resulting memory holes in later reallocations. But
      because of this, it really only makes sense to use a factor of 1.5
      _once_, but not looping until we finally are able to fit it. Thus, we
      can completely avoid the loop and just opt for the much simpler
      algorithm of multiplying with 1.5 once and, if the result doesn't fit,
      just use the target size. This avoids both problems of looping
      extensively and hitting overflows.
      
      This commit also adds a test that would've previously resulted in an
      infinite loop.
      Patrick Steinhardt committed
    • buffer: fix memory leak if unable to grow buffer · 18ca62de
      If growing a buffer fails, we set its pointer to the static
      `git_buf__oom` structure. While we correctly free the old pointer if
      `git__malloc` returned an error, we do not free it if there was an
      integer overflow while calculating the new allocation size. Fix this
      issue by freeing the pointer to plug the memory leak.
      Patrick Steinhardt committed
    • iterator: remove duplicate memset · 35168571
      When allocating new tree iterator frames, we zero out the allocated
      memory twice. Remove one of the `memset` calls.
      Patrick Steinhardt committed
    • iterator: avoid leaving partially initialized frame on stack · f647d021
      When allocating tree iterator entries, we use GIT_ERROR_ALLOC_CHECK` to
      check whether the allocation has failed. The macro will cause the
      function to immediately return, though, leaving behind a partially
      initialized iterator frame.
      
      Fix the issue by manually checking for memory allocation errors and
      using `goto done` in case of an error, popping the iterator frame.
      Patrick Steinhardt committed
    • diff_generate: detect memory allocation errors when preparing opts · ad735bf3
      When preparing options for the two iterators that are about to be
      diffed, we allocate a common prefix for both iterators depending on
      the options passed by the user. We do not check whether the allocation
      was successful, though. In fact, this isn't much of a problem, as using
      a `NULL` prefix is perfectly fine. But in the end, we probably want to
      detect that the system doesn't have any memory left, as we're unlikely
      to be able to continue afterwards anyway.
      
      While the issue is being fixed in the newly created function
      `diff_prepare_iterator_opts`, it has been previously existing in the
      previous macro `DIFF_FROM_ITERATORS` already.
      Patrick Steinhardt committed
    • diff_generate: refactor `DIFF_FROM_ITERATORS` macro of doom · 7aa03e92
      While the `DIFF_FROM_ITERATORS` does make it shorter to implement the
      various `git_diff_foo_to_bar` functions, it is a complex and unreadable
      beast that implicitly assumes certain local variable names. This is not
      something desirable to have at all and obstructs understanding and more
      importantly debugging the code by quite a bit.
      
      The `DIFF_FROM_ITERATORS` macro basically removed the burden of having
      to derive the options for both iterators from a pair of iterator flags
      and the diff options. This patch introduces a new function that does the
      that exact and refactors all callers to manage the iterators by
      themselves.
      
      As we potentially need to allocate a shared prefix for the
      iterator, we need to tell the caller to allocate that prefix as soon as
      the options aren't required anymore. Thus, the function has a `char
      **prefix` out pointer that will get set to the allocated string and
      subsequently be free'd by the caller.
      
      While this patch increases the line count, I personally deem this to an
      acceptable tradeoff for increased readbiblity.
      Patrick Steinhardt committed
    • ignore: correct handling of nested rules overriding wild card unignore · 99b89a9c
      problem:
      filesystem_iterator loads .gitignore files in top-down order.
      subsequently, ignore module evaluates them in the order they are loaded.
      this creates a problem if we have unignored a rule (using a wild card)
      in a sub dir and ignored it again in a level further below (see the test
      included in this patch).
      
      solution:
      process ignores in reverse order.
      
      closes #4963
      buddyspike committed
    • apply: Test for EOFNL mishandling when several hunks are processed · 5e5a9cce
      Introduce an unit test to validate that git_apply__patch() properly
      handles EOFNL changes in case of patches with several hunks.
      Max Kostyukevich committed