1. 25 Oct, 2018 5 commits
    • commit: fix reading out of bounds when parsing encoding · 7655b2d8
      The commit message encoding is currently being parsed by the
      `git__prefixcmp` function. As this function does not accept a buffer
      length, it will happily read past a buffer's end if it is not `NUL`
      terminated.
      
      Fix the issue by using `git__prefixncmp` instead. Add a test that
      verifies that we are unable to parse the encoding field if it's cut off
      by the supplied buffer length.
      Patrick Steinhardt committed
    • tests: add tests that exercise commit parsing · c2e3d8ef
      We currently do not have any test suites dedicated to parsing commits
      from their raw representations. Add one based on `git_object__from_raw`
      to be able to test special cases more easily.
      Patrick Steinhardt committed
    • tag: fix out of bounds read when searching for tag message · ee11d47e
      When parsing tags, we skip all unknown fields that appear before the tag
      message. This skipping is done by using a plain `strstr(buffer, "\n\n")`
      to search for the two newlines that separate tag fields from tag
      message. As it is not possible to supply a buffer length to `strstr`,
      this call may skip over the buffer's end and thus result in an out of
      bounds read. As `strstr` may return a pointer that is out of bounds, the
      following computation of `buffer_end - buffer` will overflow and result
      in an allocation of an invalid length.
      
      Fix the issue by using `git__memmem` instead. Add a test that verifies
      parsing the tag fails not due to the allocation failure but due to the
      tag having no message.
      Patrick Steinhardt committed
    • tests: add tests that exercise tag parsing · 4c738e56
      While the tests in object::tag::read exercise reading and parsing valid
      tags from the ODB, they barely try to verify that the parser fails in a
      sane way when parsing invalid tags. Create a new test suite
      object::tag::parse that directly exercises the parser by using
      `git_object__from_raw` and add various tests for valid and invalid tags.
      Patrick Steinhardt committed
    • util: provide `git__memmem` function · 83e8a6b3
      Unfortunately, neither the `memmem` nor the `strnstr` functions are part
      of any C standard but are merely extensions of C that are implemented by
      e.g. glibc. Thus, there is no standardized way to search for a string in
      a block of memory with a limited size, and using `strstr` is to be
      considered unsafe in cases where the buffer has not been sanitized. In
      fact, there are some uses of `strstr` in exactly that unsafe way in our
      codebase.
      
      Provide a new function `git__memmem` that implements the `memmem`
      semantics. That is, in a given haystack of `n` bytes, search for the
      occurrence of a byte sequence of `m` bytes and return a pointer to the
      first occurrence. The implementation chosen is the "Not So Naive"
      algorithm from [1]. It was chosen as the implementation is comparatively
      simple while still being reasonably efficient in most cases.
      Preprocessing happens in constant time and space, searching has a time
      complexity of O(n*m) with a slightly sub-linear average case.
      
      [1]: http://www-igm.univ-mlv.fr/~lecroq/string/
      Patrick Steinhardt committed
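The bounded search that the tag and `git__memmem` commits above rely on can be sketched as follows. This is a minimal illustration of `memmem`-style semantics using a plain naive scan, not the "Not So Naive" algorithm the commit message describes. The names `bounded_memmem` and `usage_example` are hypothetical and are not part of libgit2, and treating an empty needle as "not found" is an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: naive bounded byte-sequence search.
 * Returns a pointer to the first occurrence of `needle` within the first
 * `haystack_len` bytes of `haystack`, or NULL if there is none.
 * It never reads past `haystack + haystack_len`, so the haystack does not
 * need to be NUL-terminated.
 */
const void *bounded_memmem(const void *haystack, size_t haystack_len,
                           const void *needle, size_t needle_len)
{
	const unsigned char *h = haystack;
	size_t i;

	/* Assumption of this sketch: an empty needle counts as "not found". */
	if (needle_len == 0 || haystack_len < needle_len)
		return NULL;

	/* Only start positions that leave room for the whole needle. */
	for (i = 0; i <= haystack_len - needle_len; i++) {
		if (memcmp(h + i, needle, needle_len) == 0)
			return h + i;
	}

	return NULL;
}

/* Usage: find the blank line that separates header fields from a message. */
void usage_example(void)
{
	/* Deliberately not NUL-terminated. */
	const char raw[] = { 't', 'a', 'g', ' ', 'v', '1', '\n', '\n', 'm', 's', 'g' };
	const char *sep = bounded_memmem(raw, sizeof(raw), "\n\n", 2);

	assert(sep == raw + 6);

	/* If the buffer is cut off inside the separator, nothing is found. */
	assert(bounded_memmem(raw, 7, "\n\n", 2) == NULL);
}
```

Unlike `strstr(buffer, "\n\n")`, the search stops at the supplied length, so a later `end - buffer` computation can only ever be applied to a pointer inside the buffer.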
  2. 17 Oct, 2018 3 commits
  3. 15 Oct, 2018 3 commits
  4. 13 Oct, 2018 1 commit
  5. 12 Oct, 2018 1 commit
  6. 11 Oct, 2018 4 commits
    • Apply code review feedback · 463c21e2
      Nelson Elhage committed
    • fuzzers: add object parsing fuzzer · a1d5fd06
      Add a simple fuzzer that exercises our object parser code. The fuzzer
      is quite trivial in that it simply passes the input data directly to
      `git_object__from_raw` for each of the four object types.
      Patrick Steinhardt committed
    • object: properly propagate errors on parsing failures · 6562cdda
      When failing to parse a raw object from its data, we free the
      partially parsed object but then fail to propagate the error to the
      caller. This may lead callers to operate on objects with invalid memory,
      which will sooner or later cause the program to segfault.
      
      Fix the issue by passing up the error code returned by `parse_raw`.
      Patrick Steinhardt committed
    • fuzzers: initialize libgit2 in standalone driver · 6956a954
      The standalone driver for libgit2's fuzzing targets makes use of
      functions from libgit2 itself. While this is totally fine to do, we need
      to make sure to always have libgit2 initialized via `git_libgit2_init`
      before we call out to any of these. While this happens in most cases as
      we call `LLVMFuzzerInitialize`, which is provided by our fuzzers and
      which right now always calls `git_libgit2_init`, one exception to this
      rule is our error path when not enough arguments have been given. In
      this case, we will call `git_vector_free_deep` without libgit2 having
      been initialized. As we did not set up our allocation functions in that
      case, this will lead to a segmentation fault.
      
      Fix the issue by always initializing and shutting down libgit2 in the
      standalone driver. Note that we cannot let this replace the
      initialization in `LLVMFuzzerInitialize`, as it is required when using
      the "real" fuzzers by LLVM without our standalone driver. It's no
      problem to call the initialization and deinitialization functions
      multiple times, though.
      Patrick Steinhardt committed
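The error propagation described in the "object: properly propagate errors on parsing failures" commit above follows a common C pattern, sketched here under stated assumptions. Every name in this block (`toy_object`, `toy_parse_raw`, `toy_object_from_raw`, `toy_object_free`) is hypothetical and only stands in for the real parser; the rule that valid input starts with "ok" is made up for illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a parsed object; not libgit2's type. */
struct toy_object {
	char *payload;
};

/* Hypothetical parser: fails with -1 unless the data starts with "ok". */
int toy_parse_raw(struct toy_object *obj, const char *data, size_t len)
{
	if (len < 2 || memcmp(data, "ok", 2) != 0)
		return -1;

	obj->payload = malloc(len);
	if (obj->payload == NULL)
		return -1;

	memcpy(obj->payload, data, len);
	return 0;
}

void toy_object_free(struct toy_object *obj)
{
	if (obj == NULL)
		return;
	free(obj->payload);
	free(obj);
}

/*
 * On a parse failure, free the partially parsed object AND hand the error
 * code back to the caller. `*out` is only set on success, so a caller can
 * never end up holding a pointer to freed memory.
 */
int toy_object_from_raw(struct toy_object **out, const char *data, size_t len)
{
	struct toy_object *obj;
	int error;

	assert(out != NULL);
	*out = NULL;

	obj = calloc(1, sizeof(*obj));
	if (obj == NULL)
		return -1;

	if ((error = toy_parse_raw(obj, data, len)) < 0) {
		toy_object_free(obj);
		return error; /* propagate instead of silently returning 0 */
	}

	*out = obj;
	return 0;
}
```

A caller then checks the return value before touching the object: a negative result means `*out` is `NULL` and nothing needs to be freed.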
  7. 09 Oct, 2018 2 commits
  8. 07 Oct, 2018 3 commits
  9. 06 Oct, 2018 1 commit
    • ignore unsupported http authentication schemes · 475db39b
      auth_context_match returns 0 instead of -1 for unknown schemes to
      not fail in situations where some authentication schemes are supported
      and others are not.
      
      apply_credentials is adjusted to handle auth_context_match returning
      0 without producing an authentication context.
      Anders Borum committed
  10. 05 Oct, 2018 12 commits
  11. 04 Oct, 2018 5 commits
    • Merge pull request #4829 from pks-t/pks/cmake-cmp0054 · b95c79ab
      cmake: enable new quoted argument policy CMP0054
      Edward Thomson committed
    • Merge pull request #4824 from palmin/packbuilder-interesting-blob · e41a0f7b
      fix check if blob is uninteresting when inserting tree to packbuilder
      Patrick Steinhardt committed
    • diff_stats: use git's formatting of renames with common directories · e5090ee3
      In cases where a file gets renamed such that the directories containing
      it before and after the rename have a common prefix, git will avoid
      printing this prefix twice and instead format the rename as
      "prefix/{old => new}". Until now, we did not do anything like that, but
      simply printed "prefix/old -> prefix/new".
      
      Adjust our behaviour to instead match upstream. Adjust the test for this
      behaviour to expect the new format.
      Patrick Steinhardt committed
    • tests: verify diff stats with renames in subdirectory · 3148efd2
      Until now, we didn't have any tests that verified that our format for
      renames in subdirectories is correct. While our current behaviour is no
      different than for renames that do not happen with a common prefix
      shared between old and new file name, we intend to change the format to
      instead match the format that upstream git uses.
      
      Add a test case for this to document our current behaviour and to show
      how the next commit will change that format.
      Patrick Steinhardt committed
    • cmake: enable new quoted argument policy CMP0054 · 633584b5
      Quoting from CMP0054's documentation:
      
          Only interpret if() arguments as variables or keywords when
          unquoted.
      
          CMake 3.1 and above no longer implicitly dereference variables or
          interpret keywords in an if() command argument when it is a Quoted
          Argument or a Bracket Argument.
      
          The OLD behavior for this policy is to dereference variables and
          interpret keywords even if they are quoted or bracketed. The NEW
          behavior is to not dereference variables or interpret keywords that
          have been quoted or bracketed.
      
      The previous behaviour could be quite unexpected. Quoted arguments might
      be expanded in cases where the value of the argument corresponds to a
      variable. E.g. `IF("MONKEY" STREQUAL "MONKEY")` would have been expanded
      to `IF("1" STREQUAL "1")` iff `SET(MONKEY 1)` was set. This behaviour
      was weird, and recent CMake versions have started to complain about this
      if they see ambiguous situations. Thus we want to disable it in favor of
      the new behaviour.
      Patrick Steinhardt committed
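The "prefix/{old => new}" rename format from the diff_stats commit above can be sketched as follows. This is only an illustration of the common-directory-prefix case named in that commit message, not the actual diff_stats code: `format_rename` is a hypothetical helper, the "old => new" fallback for paths without a shared directory is an assumption, and common trailing path components are deliberately not handled.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: write "prefix/{old => new}" into `out` when both
 * paths share one or more leading directories. Without a shared directory,
 * fall back to "old => new" (an assumption of this sketch).
 * Returns whatever snprintf returns.
 */
int format_rename(char *out, size_t out_size,
                  const char *old_path, const char *new_path)
{
	size_t i, prefix_len = 0;

	assert(out != NULL && old_path != NULL && new_path != NULL);

	/*
	 * Walk the shared leading bytes and remember the position just past
	 * the last '/', so that only whole directories count as the prefix.
	 */
	for (i = 0; old_path[i] != '\0' && old_path[i] == new_path[i]; i++) {
		if (old_path[i] == '/')
			prefix_len = i + 1;
	}

	if (prefix_len == 0)
		return snprintf(out, out_size, "%s => %s", old_path, new_path);

	return snprintf(out, out_size, "%.*s{%s => %s}",
	                (int)prefix_len, old_path,
	                old_path + prefix_len, new_path + prefix_len);
}
```

For example, renaming `dir/old.txt` to `dir/new.txt` yields `dir/{old.txt => new.txt}`, while renaming `old` to `new` yields `old => new`.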