1. 04 Oct, 2018 2 commits
    • diff_stats: use git's formatting of renames with common directories · e5090ee3
      In cases where a file gets renamed such that the directories containing
      it previous and after the rename have a common prefix, then git will
      avoid printing this prefix twice and instead format the rename as
      "prefix/{old => new}". We currently didn't do anything like that, but
      simply printed "prefix/old -> prefix/new".
      
      Adjust our behaviour to instead match upstream. Adjust the test for this
      behaviour to expect the new format.
      Patrick Steinhardt committed
    • tests: verify diff stats with renames in subdirectory · 3148efd2
      Until now, we didn't have any tests that verified that our format for
      renames in subdirectories is correct. While our current behaviour is no
      different than for renames that do not happen with a common prefix
      shared between old and new file name, we intend to change the format to
      instead match the format that upstream git uses.
      
      Add a test case for this to document our current behaviour and to show
      how the next commit will change that format.
      Patrick Steinhardt committed
  2. 06 Jul, 2018 1 commit
  3. 29 Jun, 2018 1 commit
    • delta: fix sign-extension of big left-shift · 7db25870
      Our delta code was originally adapted from JGit, which itself adapted it
      from git itself. Due to this heritage, we inherited a bug from git.git
      in how we compute the delta offset, which was fixed upstream in
      48fb7deb5 (Fix big left-shifts of unsigned char, 2009-06-17). As
      explained by Linus:
      
          Shifting 'unsigned char' or 'unsigned short' left can result in sign
          extension errors, since the C integer promotion rules means that the
          unsigned char/short will get implicitly promoted to a signed 'int' due to
          the shift (or due to other operations).
      
          This normally doesn't matter, but if you shift things up sufficiently, it
          will now set the sign bit in 'int', and a subsequent cast to a bigger type
          (eg 'long' or 'unsigned long') will now sign-extend the value despite the
          original expression being unsigned.
      
          One example of this would be something like
      
                  unsigned long size;
                  unsigned char c;
      
                  size += c << 24;
      
          where despite all the variables being unsigned, 'c << 24' ends up being a
          signed entity, and will get sign-extended when then doing the addition in
          an 'unsigned long' type.
      
          Since git uses 'unsigned char' pointers extensively, we actually have this
          bug in a couple of places.
      
      In our delta code, we inherited such a bogus shift when computing the
      offset at which the delta base is to be found. Due to the sign extension
      we can end up with an offset where all the bits are set. This can allow
      an arbitrary memory read, as the addition in `base_len < off + len` can
      now overflow if `off` has all its bits set.
      
      Fix the issue by casting the result of `*delta++ << 24UL` to an unsigned
      integer again. Add a test with a crafted delta that would actually
      succeed with an out-of-bounds read in case where the cast wouldn't
      exist.
      
      Reported-by: Riccardo Schirone <rschiron@redhat.com>
      Test-provided-by: Riccardo Schirone <rschiron@redhat.com>
      Patrick Steinhardt committed
  4. 18 Jun, 2018 1 commit
  5. 10 Jun, 2018 1 commit
  6. 05 May, 2018 1 commit
  7. 05 Apr, 2018 1 commit
  8. 20 Feb, 2018 3 commits
    • diff_tform: fix rename detection with rewrite/delete pair · ce7080a0
      A rewritten file can either be classified as a modification of its
      contents or of a delete of the complete file followed by an addition of
      the new content. This distinction becomes important when we want to
      detect renames for rewrites. Given a scenario where a file "a" has been
      deleted and another file "b" has been renamed to "a", this should be
      detected as a deletion of "a" followed by a rename of "a" -> "b". Thus,
      splitting of the original rewrite into a delete/add pair is important
      here.
      
      This splitting is represented by a flag we can set at the current delta.
      While the flag is already being set in case we want to break rewrites,
      we do not do so in case where the `GIT_DIFF_FIND_RENAMES_FROM_REWRITES`
      flag is set. This can trigger an assert when we try to match the source
      and target deltas.
      
      Fix the issue by setting the `GIT_DIFF_FLAG__TO_SPLIT` flag at the delta
      when it is a rename target and `GIT_DIFF_FIND_RENAMES_FROM_REWRITES` is
      set.
      Patrick Steinhardt committed
    • tests: add rename-rewrite scenarios to "renames" repository · 80e77b87
      Add two more scenarios to the "renames" repository. The first scenario
      has a major rewrite of a file and a delete of another file, the second
      scenario has a deletion of a file and rename of another file to the
      deleted file. Both scenarios will be used in the following commit.
      Patrick Steinhardt committed
    • tests: diff::rename: use defines for commit OIDs · d91da1da
      While we frequently reuse commit OIDs throughout the file, we do not
      have any constants to refer to these commits. Make this a bit easier to
      read by giving the commit OIDs somewhat descriptive names of what kind
      of commit they refer to.
      Patrick Steinhardt committed
  9. 15 Dec, 2017 1 commit
    • diff_file: properly refcount blobs when initializing file contents · 2388a9e2
      When initializing a `git_diff_file_content` from a source whose data is
      derived from a blob, we simply assign the blob's pointer to the
      resulting struct without incrementing its refcount. Thus, the structure
      can only be used as long as the blob is kept alive by the caller.
      
      Fix the issue by using `git_blob_dup` instead of a direct assignment.
      This function will increment the refcount of the blob without allocating
      new memory, so it does exactly what we want. As
      `git_diff_file_content__unload` already frees the blob when
      `GIT_DIFF_FLAG__FREE_BLOB` is set, we don't need to add new code
      handling the free but only have to set that flag correctly.
      Patrick Steinhardt committed
  10. 01 Sep, 2017 1 commit
  11. 26 Jun, 2017 1 commit
    • diff: implement function to calculate patch ID · 89a34828
      The upstream git project provides the ability to calculate a so-called
      patch ID. Quoting from git-patch-id(1):
      
          A "patch ID" is nothing but a sum of SHA-1 of the file diffs
          associated with a patch, with whitespace and line numbers ignored."
      
      Patch IDs can be used to identify two patches which are probably the
      same thing, e.g. when a patch has been cherry-picked to another branch.
      
      This commit implements a new function `git_diff_patchid`, which gets a
      patch and derives an OID from the diff. Note the different terminology
      here: a patch in libgit2 are the differences in a single file and a diff
      can contain multiple patches for different files. The implementation
      matches the upstream implementation and should derive the same OID for
      the same diff. In fact, some code has been directly derived from the
      upstream implementation.
      
      The upstream implementation has two different modes to calculate patch
      IDs, which is the stable and unstable mode. The old way of calculating
      the patch IDs was unstable in a sense that a different ordering the
      diffs was leading to different results. This oversight was fixed in git
      1.9, but as git tries hard to never break existing workflows, the old
      and unstable way is still default. The newer and stable way does not
      care for ordering of the diff hunks, and in fact it is the mode that
      should probably be used today. So right now, we only implement the
      stable way of generating the patch ID.
      Patrick Steinhardt committed
  12. 14 Mar, 2017 3 commits
    • diff_parse: correctly set options for parsed diffs · c0eba379
      The function `diff_parsed_alloc` allocates and initializes a
      `git_diff_parsed` structure. This structure also contains diff options.
      While we initialize its flags, we fail to do a real initialization of
      its values. This bites us when we want to actually use the generated
      diff as we do not se the option's version field, which is required to
      operate correctly.
      
      Fix the issue by executing `git_diff_init_options` on the embedded
      struct.
      Patrick Steinhardt committed
    • patch_parse: fix parsing minimal trailing diff line · ad5a909c
      In a diff, the shortest possible hunk with a modification (that is, no
      deletion) results from a file with only one line with a single character
      which is removed. Thus the following hunk
      
          @@ -1 +1 @@
          -a
          +
      
      is the shortest valid hunk modifying a line. The function parsing the
      hunk body though assumes that there must always be at least 4 bytes
      present to make up a valid hunk, which is obviously wrong in this case.
      The absolute minimum number of bytes required for a modification is
      actually 2 bytes, that is the "+" and the following newline. Note: if
      there is no trailing newline, the assumption will not be offended as the
      diff will have a line "\ No trailing newline" at its end.
      
      This patch fixes the issue by lowering the amount of bytes required.
      Patrick Steinhardt committed
    • patch_generate: fix `git_diff_foreach` only working with generated diffs · ace3508f
      The current logic of `git_diff_foreach` makes the assumption that all
      diffs passed in are actually derived from generated diffs. With these
      assumptions we try to derive the actual diff by inspecting either the
      working directory files or blobs of a repository. This obviously cannot
      work for diffs parsed from a file, where we do not necessarily have a
      repository at hand.
      
      Since the introduced split of parsed and generated patches, there are
      multiple functions which help us to handle patches generically, being
      indifferent from where they stem from. Use these functions and remove
      the old logic specific to generated patches. This allows re-using the
      same code for invoking the callbacks on the deltas.
      Patrick Steinhardt committed
  13. 09 Oct, 2016 1 commit
  14. 24 Aug, 2016 1 commit
  15. 26 Jun, 2016 4 commits
  16. 26 May, 2016 2 commits
  17. 02 Apr, 2016 1 commit
  18. 24 Mar, 2016 1 commit
  19. 23 Mar, 2016 3 commits
  20. 20 Mar, 2016 1 commit
  21. 03 Mar, 2016 1 commit
  22. 28 Feb, 2016 1 commit
  23. 12 Feb, 2016 1 commit
  24. 11 Feb, 2016 1 commit
  25. 01 Dec, 2015 1 commit
  26. 20 Nov, 2015 1 commit
  27. 03 Nov, 2015 1 commit
  28. 02 Nov, 2015 1 commit
  29. 21 Oct, 2015 1 commit