- 26 Oct, 2018 1 commit
-
-
(cherry picked from commit f9e28026)
Etienne Samson committed
-
- 05 Jul, 2018 1 commit
-
-
Our delta code was originally adapted from JGit, which itself adapted it from git itself. Due to this heritage, we inherited a bug from git.git in how we compute the delta offset, which was fixed upstream in 48fb7deb5 (Fix big left-shifts of unsigned char, 2009-06-17). As explained by Linus: Shifting 'unsigned char' or 'unsigned short' left can result in sign extension errors, since the C integer promotion rules means that the unsigned char/short will get implicitly promoted to a signed 'int' due to the shift (or due to other operations). This normally doesn't matter, but if you shift things up sufficiently, it will now set the sign bit in 'int', and a subsequent cast to a bigger type (eg 'long' or 'unsigned long') will now sign-extend the value despite the original expression being unsigned. One example of this would be something like unsigned long size; unsigned char c; size += c << 24; where despite all the variables being unsigned, 'c << 24' ends up being a signed entity, and will get sign-extended when then doing the addition in an 'unsigned long' type. Since git uses 'unsigned char' pointers extensively, we actually have this bug in a couple of places. In our delta code, we inherited such a bogus shift when computing the offset at which the delta base is to be found. Due to the sign extension we can end up with an offset where all the bits are set. This can allow an arbitrary memory read, as the addition in `base_len < off + len` can now overflow if `off` has all its bits set. Fix the issue by casting the result of `*delta++ << 24UL` to an unsigned integer again. Add a test with a crafted delta that would actually succeed with an out-of-bounds read in case where the cast wouldn't exist. Reported-by: Riccardo Schirone <rschiron@redhat.com> Test-provided-by: Riccardo Schirone <rschiron@redhat.com>
Patrick Steinhardt committed
-
- 30 May, 2018 1 commit
-
-
Erik van Zijst committed
-
- 20 Feb, 2018 3 commits
-
-
A rewritten file can either be classified as a modification of its contents or of a delete of the complete file followed by an addition of the new content. This distinction becomes important when we want to detect renames for rewrites. Given a scenario where a file "a" has been deleted and another file "b" has been renamed to "a", this should be detected as a deletion of "a" followed by a rename of "a" -> "b". Thus, splitting of the original rewrite into a delete/add pair is important here. This splitting is represented by a flag we can set at the current delta. While the flag is already being set in case we want to break rewrites, we do not do so in case where the `GIT_DIFF_FIND_RENAMES_FROM_REWRITES` flag is set. This can trigger an assert when we try to match the source and target deltas. Fix the issue by setting the `GIT_DIFF_FLAG__TO_SPLIT` flag at the delta when it is a rename target and `GIT_DIFF_FIND_RENAMES_FROM_REWRITES` is set.
Patrick Steinhardt committed -
Add two more scenarios to the "renames" repository. The first scenario has a major rewrite of a file and a delete of another file, the second scenario has a deletion of a file and rename of another file to the deleted file. Both scenarios will be used in the following commit.
Patrick Steinhardt committed -
While we frequently reuse commit OIDs throughout the file, we do not have any constants to refer to these commits. Make this a bit easier to read by giving the commit OIDs somewhat descriptive names of what kind of commit they refer to.
Patrick Steinhardt committed
-
- 15 Dec, 2017 1 commit
-
-
When initializing a `git_diff_file_content` from a source whose data is derived from a blob, we simply assign the blob's pointer to the resulting struct without incrementing its refcount. Thus, the structure can only be used as long as the blob is kept alive by the caller. Fix the issue by using `git_blob_dup` instead of a direct assignment. This function will increment the refcount of the blob without allocating new memory, so it does exactly what we want. As `git_diff_file_content__unload` already frees the blob when `GIT_DIFF_FLAG__FREE_BLOB` is set, we don't need to add new code handling the free but only have to set that flag correctly.
Patrick Steinhardt committed
-
- 01 Sep, 2017 1 commit
-
-
Patches which contain exact renames only will not contain an actual diff body, but only a list of files that were renamed. Thus, the patch header is immediately followed by the terminating sequence "-- ". We currently do not recognize this character sequence as a possible terminating sequence. Add it and create a test to catch the failure.
Patrick Steinhardt committed
-
- 26 Jun, 2017 1 commit
-
-
The upstream git project provides the ability to calculate a so-called patch ID. Quoting from git-patch-id(1): A "patch ID" is nothing but a sum of SHA-1 of the file diffs associated with a patch, with whitespace and line numbers ignored." Patch IDs can be used to identify two patches which are probably the same thing, e.g. when a patch has been cherry-picked to another branch. This commit implements a new function `git_diff_patchid`, which gets a patch and derives an OID from the diff. Note the different terminology here: a patch in libgit2 are the differences in a single file and a diff can contain multiple patches for different files. The implementation matches the upstream implementation and should derive the same OID for the same diff. In fact, some code has been directly derived from the upstream implementation. The upstream implementation has two different modes to calculate patch IDs, which is the stable and unstable mode. The old way of calculating the patch IDs was unstable in a sense that a different ordering the diffs was leading to different results. This oversight was fixed in git 1.9, but as git tries hard to never break existing workflows, the old and unstable way is still default. The newer and stable way does not care for ordering of the diff hunks, and in fact it is the mode that should probably be used today. So right now, we only implement the stable way of generating the patch ID.
Patrick Steinhardt committed
-
- 14 Mar, 2017 3 commits
-
-
The function `diff_parsed_alloc` allocates and initializes a `git_diff_parsed` structure. This structure also contains diff options. While we initialize its flags, we fail to do a real initialization of its values. This bites us when we want to actually use the generated diff as we do not se the option's version field, which is required to operate correctly. Fix the issue by executing `git_diff_init_options` on the embedded struct.
Patrick Steinhardt committed -
In a diff, the shortest possible hunk with a modification (that is, no deletion) results from a file with only one line with a single character which is removed. Thus the following hunk @@ -1 +1 @@ -a + is the shortest valid hunk modifying a line. The function parsing the hunk body though assumes that there must always be at least 4 bytes present to make up a valid hunk, which is obviously wrong in this case. The absolute minimum number of bytes required for a modification is actually 2 bytes, that is the "+" and the following newline. Note: if there is no trailing newline, the assumption will not be offended as the diff will have a line "\ No trailing newline" at its end. This patch fixes the issue by lowering the amount of bytes required.
Patrick Steinhardt committed -
The current logic of `git_diff_foreach` makes the assumption that all diffs passed in are actually derived from generated diffs. With these assumptions we try to derive the actual diff by inspecting either the working directory files or blobs of a repository. This obviously cannot work for diffs parsed from a file, where we do not necessarily have a repository at hand. Since the introduced split of parsed and generated patches, there are multiple functions which help us to handle patches generically, being indifferent from where they stem from. Use these functions and remove the old logic specific to generated patches. This allows re-using the same code for invoking the callbacks on the deltas.
Patrick Steinhardt committed
-
- 09 Oct, 2016 1 commit
-
-
Sim Domingo committed
-
- 24 Aug, 2016 1 commit
-
-
Ensure that `git_patch_from_diff` can return the patch for parsed diffs, not just generate a patch for a generated diff.
Edward Thomson committed
-
- 26 Jun, 2016 4 commits
-
-
When showing copy information because we are duplicating contents, for example, when performing a `diff --find-copies-harder -M100 -B100`, then show copy from/to lines in a patch, and do not show context. Ensure that we can also parse such patches.
Edward Thomson committed -
Edward Thomson committed
-
Edward Thomson committed
-
Test that we can create a diff file, then parse the results and that the two are identical in-memory.
Edward Thomson committed
-
- 26 May, 2016 2 commits
-
-
Parse diff files into a `git_diff` structure.
Edward Thomson committed -
Edward Thomson committed
-
- 02 Apr, 2016 1 commit
-
-
Test that submodules are found when the are included in a pathspec but have a trailing slash.
Edward Thomson committed
-
- 24 Mar, 2016 1 commit
-
-
Iterator tests were split over repo::iterator and diff::iterator, with duplication between the two. Move them to iterator::index, iterator::tree, and iterator::workdir.
Edward Thomson committed
-
- 23 Mar, 2016 3 commits
-
-
Jeff Hostetler committed
-
Refactored the tree iterator to never recurse; simply process the next entry in order in `advance`. Additionally, reduce the number of allocations and sorting as much as possible to provide a ~30% speedup on case-sensitive iteration. (The gains for case-insensitive iteration are less majestic.)
Edward Thomson committed -
Disambiguate the reset and reset_range functions. Now reset_range with a NULL path will clear the start or end; reset will leave the existing start and end unchanged.
Edward Thomson committed
-
- 20 Mar, 2016 1 commit
-
-
Instead of copying over the data into the individual entries, point to the originals, which are already in a format we can use.
Carlos Martín Nieto committed
-
- 03 Mar, 2016 1 commit
-
-
Carlos Martín Nieto committed
-
- 28 Feb, 2016 1 commit
-
-
Use legitimate (existing) object IDs in tests so that we have the ability to turn on strict object validation when running tests.
Edward Thomson committed
-
- 12 Feb, 2016 1 commit
-
-
Windows defines `timeval` with `long`, which we cannot sanely cope with. Instead, use a custom timeval struct.
Edward Thomson committed
-
- 11 Feb, 2016 1 commit
-
-
Arthur Schreiber committed
-
- 01 Dec, 2015 1 commit
-
-
When formatting a patch as email we do not include the commit's message in the formatted patch output. Implement this and add a test that verifies behavior.
Patrick Steinhardt committed
-
- 20 Nov, 2015 1 commit
-
-
Jacques Germishuys committed
-
- 03 Nov, 2015 1 commit
-
-
When `core.symlinks = false`, we write the symlinks content (target) to a regular file. We should ensure that when we later see that regular file, we treat it specially - and that changing that regular file would actually change the symlink target. (For compatibility with Git for Windows).
Edward Thomson committed
-
- 02 Nov, 2015 1 commit
-
-
Jason Haslam committed
-
- 21 Oct, 2015 1 commit
-
-
Vicent Marti committed
-
- 25 Sep, 2015 1 commit
-
-
git expects an empty line after the binary data: literal X ...binary data... <empty_line> The last literal block of the generated patches were not containing the required empty line. Example: diff --git a/binary_file b/binary_file index 3f1b3f9098131cfecea4a50ff8afab349ea66d22..86e5c1008b5ce635d3e3fffa4434c5eccd8f00b6 100644 GIT binary patch literal 8 Pc${NM&PdElPvrst3ey5{ literal 6 Nc${NM%g@i}0ssZ|0lokL diff --git a/binary_file2 b/binary_file2 index 31be99be19470da4af5b28b21e27896a2f2f9ee2..86e5c1008b5ce635d3e3fffa4434c5eccd8f00b6 100644 GIT binary patch literal 8 Pc${NM&PdElPvrst3ey5{ literal 13 Sc${NMEKbZyOexL+Qd|HZV+4u- git apply of that diff results in: error: corrupt binary patch at line 9: diff --git a/binary_file2 b/binary_file2 fatal: patch with only garbage at line 10 The proper formating is: diff --git a/binary_file b/binary_file index 3f1b3f9098131cfecea4a50ff8afab349ea66d22..86e5c1008b5ce635d3e3fffa4434c5eccd8f00b6 100644 GIT binary patch literal 8 Pc${NM&PdElPvrst3ey5{ literal 6 Nc${NM%g@i}0ssZ|0lokL diff --git a/binary_file2 b/binary_file2 index 31be99be19470da4af5b28b21e27896a2f2f9ee2..86e5c1008b5ce635d3e3fffa4434c5eccd8f00b6 100644 GIT binary patch literal 8 Pc${NM&PdElPvrst3ey5{ literal 13 Sc${NMEKbZyOexL+Qd|HZV+4u-
Guille -bisho- committed
-
- 12 Sep, 2015 1 commit
-
-
Ensure that a diff with the workdir is not erroneously returning directories.
Edward Thomson committed
-
- 31 Aug, 2015 1 commit
-
-
Some nicer refactoring for index iteration walks. The index iterator doesn't binary search through the pathlist space, since it lacks directory entries, and would have to binary search each index entry and all its parents (eg, when presented with an index entry of `foo/bar/file.c`, you would have to look in the pathlist for `foo/bar/file.c`, `foo/bar` and `foo`). Since the index entries and the pathlist are both nicely sorted, we walk the index entries in lockstep with the pathlist like we do for other iteration/diff/merge walks.
Edward Thomson committed
-
- 30 Aug, 2015 1 commit
-
-
When using literal pathspecs in diff with `GIT_DIFF_DISABLE_PATHSPEC_MATCH` turn on the faster iterator pathlist handling. Updates iterator pathspecs to include directory prefixes (eg, `foo/`) for compatibility with `GIT_DIFF_DISABLE_PATHSPEC_MATCH`.
Edward Thomson committed
-
- 29 Aug, 2015 1 commit
-
-
Document that `GIT_DIFF_PATHSPEC_DISABLE` is not necessarily about explicit path matching, but also includes matching of directory names. Enforce this in a test.
Edward Thomson committed
-