Commits · a4b332e8775a2e4fedf24c143b10f087ddf53d02 · lvzhengyang / git2

26 Oct, 2018 1 commit
- patch_parse: populate line numbers while parsing diffs · a051bce7
```
(cherry picked from commit f9e28026)
```
  Etienne Samson committed 6 years ago
05 Jul, 2018 1 commit

delta: fix sign-extension of big left-shift · 3f461902

Our delta code was originally adapted from JGit, which itself adapted it
from git itself. Due to this heritage, we inherited a bug from git.git
in how we compute the delta offset, which was fixed upstream in
48fb7deb5 (Fix big left-shifts of unsigned char, 2009-06-17). As
explained by Linus:

    Shifting 'unsigned char' or 'unsigned short' left can result in sign
    extension errors, since the C integer promotion rules means that the
    unsigned char/short will get implicitly promoted to a signed 'int' due to
    the shift (or due to other operations).

    This normally doesn't matter, but if you shift things up sufficiently, it
    will now set the sign bit in 'int', and a subsequent cast to a bigger type
    (eg 'long' or 'unsigned long') will now sign-extend the value despite the
    original expression being unsigned.

    One example of this would be something like

            unsigned long size;
            unsigned char c;

            size += c << 24;

    where despite all the variables being unsigned, 'c << 24' ends up being a
    signed entity, and will get sign-extended when then doing the addition in
    an 'unsigned long' type.

    Since git uses 'unsigned char' pointers extensively, we actually have this
    bug in a couple of places.

In our delta code, we inherited such a bogus shift when computing the
offset at which the delta base is to be found. Due to the sign extension
we can end up with an offset where all the bits are set. This can allow
an arbitrary memory read, as the addition in `base_len < off + len` can
now overflow if `off` has all its bits set.

Fix the issue by casting the result of `*delta++ << 24UL` to an unsigned
integer again. Add a test with a crafted delta that would actually
succeed with an out-of-bounds read if the cast were missing.

Reported-by: Riccardo Schirone <rschiron@redhat.com>
Test-provided-by: Riccardo Schirone <rschiron@redhat.com>
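
A minimal sketch of the promotion problem described above, not the actual libgit2 delta code. The helper names `offset_buggy`, `offset_fixed`, and `range_check_passes` are hypothetical. The signed intermediate is modeled explicitly with `int32_t` so the sketch does not itself shift into the sign bit; it assumes a two's-complement platform with a 32-bit `int`.

```c
#include <stdint.h>

/*
 * Models what `c << 24` yields on a typical platform with a 32-bit int:
 * the byte is promoted to a signed int, so a set top bit lands in the
 * sign bit. (Assumption: two's complement; the out-of-range conversion
 * to int32_t is implementation-defined.)
 */
static uint64_t offset_buggy(unsigned char c)
{
    int32_t promoted = (int32_t)((uint32_t)c << 24);
    return (uint64_t)promoted; /* widening a negative value sign-extends */
}

/* Casting to an unsigned type before widening keeps the upper bits clear. */
static uint64_t offset_fixed(unsigned char c)
{
    return (uint64_t)((uint32_t)c << 24);
}

/*
 * A bounds check written as `base_len < off + len` is bypassed when the
 * unsigned addition wraps around. Returns 1 if the check would let the
 * copy proceed, 0 if it would reject it.
 */
static int range_check_passes(uint64_t base_len, uint64_t off, uint64_t len)
{
    return !(base_len < off + len);
}
```

With `c = 0x80`, the modeled buggy offset has its upper 32 bits set, and adding a length of `0x80000000` wraps the sum to zero, so the naive check passes for any `base_len`. With the unsigned cast the same inputs are rejected.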

committed 6 years ago

30 May, 2018 1 commit
- typo: Fixed a trivial typo in test function. · 2569056d
  Erik van Zijst committed 6 years ago
20 Feb, 2018 3 commits

diff_tform: fix rename detection with rewrite/delete pair · ce7080a0

A rewritten file can be classified either as a modification of its
contents or as a deletion of the complete file followed by an addition of
the new content. This distinction becomes important when we want to
detect renames for rewrites. Given a scenario where a file "a" has been
deleted and another file "b" has been renamed to "a", this should be
detected as a deletion of "a" followed by a rename of "b" -> "a". Thus,
splitting the original rewrite into a delete/add pair is important
here.

This splitting is represented by a flag we can set on the current delta.
While the flag is already set when we want to break rewrites, we do not
set it when the `GIT_DIFF_FIND_RENAMES_FROM_REWRITES` flag is set. This
can trigger an assert when we try to match the source and target deltas.

Fix the issue by setting the `GIT_DIFF_FLAG__TO_SPLIT` flag on the delta
when it is a rename target and `GIT_DIFF_FIND_RENAMES_FROM_REWRITES` is
set.

committed 6 years ago

tests: add rename-rewrite scenarios to "renames" repository · 80e77b87

Add two more scenarios to the "renames" repository. The first scenario
has a major rewrite of a file and a delete of another file, the second
scenario has a deletion of a file and rename of another file to the
deleted file. Both scenarios will be used in the following commit.

committed 6 years ago

tests: diff::rename: use defines for commit OIDs · d91da1da

While we frequently reuse commit OIDs throughout the file, we do not
have any constants to refer to these commits. Make this a bit easier to
read by giving the commit OIDs somewhat descriptive names of what kind
of commit they refer to.

committed 6 years ago

15 Dec, 2017 1 commit

diff_file: properly refcount blobs when initializing file contents · 2388a9e2

When initializing a `git_diff_file_content` from a source whose data is
derived from a blob, we simply assign the blob's pointer to the
resulting struct without incrementing its refcount. Thus, the structure
can only be used as long as the blob is kept alive by the caller.

Fix the issue by using `git_blob_dup` instead of a direct assignment.
This function will increment the refcount of the blob without allocating
new memory, so it does exactly what we want. As
`git_diff_file_content__unload` already frees the blob when
`GIT_DIFF_FLAG__FREE_BLOB` is set, we don't need to add new code
handling the free but only have to set that flag correctly.
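
The ownership change can be sketched with a toy refcounted object. This is an illustration only, not libgit2's implementation: `toy_blob`, `toy_content`, and all functions below are hypothetical stand-ins, and the `free_blob` field merely plays the role that the commit message assigns to `GIT_DIFF_FLAG__FREE_BLOB`.

```c
#include <stdlib.h>

/* Hypothetical stand-in for a refcounted object such as a blob. */
typedef struct {
    int refcount;
} toy_blob;

static toy_blob *toy_blob_new(void)
{
    toy_blob *b = malloc(sizeof(*b));
    if (b)
        b->refcount = 1;
    return b;
}

/* Take an additional reference without allocating new memory. */
static toy_blob *toy_blob_dup(toy_blob *b)
{
    b->refcount++;
    return b;
}

/* Drop one reference; returns 1 if the object was actually freed. */
static int toy_blob_free(toy_blob *b)
{
    if (--b->refcount > 0)
        return 0;
    free(b);
    return 1;
}

/* Hypothetical holder modeled on a struct that keeps a blob pointer. */
typedef struct {
    toy_blob *blob;
    int free_blob; /* "release the blob on unload" marker */
} toy_content;

static void toy_content_init(toy_content *fc, toy_blob *src)
{
    fc->blob = toy_blob_dup(src); /* own a reference instead of borrowing */
    fc->free_blob = 1;
}

static void toy_content_unload(toy_content *fc)
{
    if (fc->free_blob && fc->blob)
        toy_blob_free(fc->blob);
    fc->blob = NULL;
    fc->free_blob = 0;
}
```

Because the holder owns its own reference, the caller can release its blob first and the holder's pointer stays valid until `toy_content_unload` runs.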

committed 7 years ago

01 Sep, 2017 1 commit

patch_parse: fix parsing patches only containing exact renames · cc4c44a9

Patches which contain exact renames only will not contain an actual diff
body, but only a list of files that were renamed. Thus, the patch header
is immediately followed by the terminating sequence "-- ". We currently
do not recognize this character sequence as a possible terminating
sequence. Add it and create a test to catch the failure.

committed 7 years ago

26 Jun, 2017 1 commit

diff: implement function to calculate patch ID · 89a34828

The upstream git project provides the ability to calculate a so-called
patch ID. Quoting from git-patch-id(1):

    A "patch ID" is nothing but a sum of SHA-1 of the file diffs
    associated with a patch, with whitespace and line numbers ignored.

Patch IDs can be used to identify two patches which are probably the
same thing, e.g. when a patch has been cherry-picked to another branch.

This commit implements a new function `git_diff_patchid`, which gets a
patch and derives an OID from the diff. Note the different terminology
here: a patch in libgit2 is the set of differences in a single file, and a diff
can contain multiple patches for different files. The implementation
matches the upstream implementation and should derive the same OID for
the same diff. In fact, some code has been directly derived from the
upstream implementation.

The upstream implementation has two different modes for calculating patch
IDs: stable and unstable. The old way of calculating patch IDs was
unstable in the sense that a different ordering of the diffs led to
different results. This oversight was fixed in git 1.9, but as git tries
hard to never break existing workflows, the old, unstable way is still
the default. The newer, stable way does not care about the ordering of
the diff hunks, and in fact it is the mode that should probably be used
today. So right now, we only implement the stable way of generating the
patch ID.
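
To illustrate why summing per-file hashes makes the result independent of ordering, here is a toy sketch. It is not the upstream or libgit2 algorithm: FNV-1a stands in for SHA-1, the whitespace and line-number normalization is omitted, and the names `toy_hash` and `toy_patch_id` are hypothetical.

```c
#include <stdint.h>
#include <stddef.h>

/* FNV-1a (64-bit), used here only as a small stand-in for SHA-1. */
static uint64_t toy_hash(const char *s)
{
    uint64_t h = 0xcbf29ce484222325ULL; /* FNV offset basis */
    while (*s) {
        h ^= (unsigned char)*s++;
        h *= 0x100000001b3ULL;          /* FNV prime */
    }
    return h;
}

/*
 * Combine the per-file hashes by (wrapping) addition. Addition is
 * commutative, so the order of the file diffs does not affect the ID.
 */
static uint64_t toy_patch_id(const char **file_diffs, size_t n)
{
    uint64_t id = 0;
    size_t i;

    for (i = 0; i < n; i++)
        id += toy_hash(file_diffs[i]);
    return id;
}
```

Calling `toy_patch_id` on the same set of file diffs in a different order yields the same value, which is the property the stable mode is described as providing.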

committed 7 years ago

14 Mar, 2017 3 commits

diff_parse: correctly set options for parsed diffs · c0eba379

The function `diff_parsed_alloc` allocates and initializes a
`git_diff_parsed` structure. This structure also contains diff options.
While we initialize its flags, we fail to do a real initialization of
its values. This bites us when we want to actually use the generated
diff as we do not set the option's version field, which is required to
operate correctly.

Fix the issue by executing `git_diff_init_options` on the embedded
struct.
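
The general pattern of a versioned options struct can be sketched as follows. Everything here is hypothetical and only models the idea: `toy_options`, `toy_options_init`, `toy_options_check`, and the default of 3 context lines are illustrative, not libgit2's definitions.

```c
#include <string.h>

#define TOY_OPTIONS_VERSION 1

/* Hypothetical options struct that carries its own version. */
typedef struct {
    unsigned int version;
    unsigned int flags;
    unsigned int context_lines;
} toy_options;

/* Fill in defaults and stamp the version; returns 0 on success, -1 otherwise. */
static int toy_options_init(toy_options *opts, unsigned int version)
{
    if (version != TOY_OPTIONS_VERSION)
        return -1;
    memset(opts, 0, sizeof(*opts));
    opts->version = version;
    opts->context_lines = 3; /* illustrative default */
    return 0;
}

/* Consumers reject structs whose version was never set. */
static int toy_options_check(const toy_options *opts)
{
    return (opts->version == TOY_OPTIONS_VERSION) ? 0 : -1;
}
```

Setting individual fields such as `flags` by hand leaves `version` at zero, so a later `toy_options_check` fails; running the init function first avoids that.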

committed 7 years ago

patch_parse: fix parsing minimal trailing diff line · ad5a909c

In a diff, the shortest possible hunk with a modification (that is, no
deletion) results from a file with only one line with a single character
which is removed. Thus the following hunk

    @@ -1 +1 @@
    -a
    +

is the shortest valid hunk modifying a line. The function parsing the
hunk body though assumes that there must always be at least 4 bytes
present to make up a valid hunk, which is obviously wrong in this case.
The absolute minimum number of bytes required for a modification is
actually 2 bytes, that is the "+" and the following newline. Note: if
there is no trailing newline, the assumption will not be violated as the
diff will have a line "\ No trailing newline" at its end.

This patch fixes the issue by lowering the number of bytes required.

committed 7 years ago

patch_generate: fix `git_diff_foreach` only working with generated diffs · ace3508f

The current logic of `git_diff_foreach` makes the assumption that all
diffs passed in are actually derived from generated diffs. With these
assumptions we try to derive the actual diff by inspecting either the
working directory files or blobs of a repository. This obviously cannot
work for diffs parsed from a file, where we do not necessarily have a
repository at hand.

Since the split of parsed and generated patches was introduced, there are
multiple functions which help us handle patches generically, regardless
of where they stem from. Use these functions and remove the old logic
specific to generated patches. This allows re-using the same code for
invoking the callbacks on the deltas.

committed 7 years ago

09 Oct, 2016 1 commit
- make git_diff_stats_to_buf not show 0 insertions or 0 deletions · dc5cfdba
  Sim Domingo committed 8 years ago
24 Aug, 2016 1 commit

Teach `git_patch_from_diff` about parsed diffs · b859faa6

Ensure that `git_patch_from_diff` can return the patch for parsed diffs,
not just generate a patch for a generated diff.

committed 8 years ago

26 Jun, 2016 4 commits
- patch: show copy information for identical copies · 1a79cd95
```
When showing copy information because we are duplicating contents,
for example, when performing a `diff --find-copies-harder -M100 -B100`,
then show copy from/to lines in a patch, and do not show context.
Ensure that we can also parse such patches.
```
  Edward Thomson committed 8 years ago
- patch::parse: test diff with exact rename and copy · 9eb19381
  Edward Thomson committed 8 years ago
- patch::parse: test diff with simple rename · 8a670dc4
  Edward Thomson committed 8 years ago
- diff::parse tests: test parsing a diff · e774d5af
```
Test that we can create a diff file, then parse the results and
that the two are identical in-memory.
```
  Edward Thomson committed 8 years ago
26 May, 2016 2 commits
- introduce `git_diff_from_buffer` to parse diffs · 7166bb16
```
Parse diff files into a `git_diff` structure.
```
  Edward Thomson committed 8 years ago
- git_diff_generated: abstract generated diffs · 9be638ec
  Edward Thomson committed 8 years ago
02 Apr, 2016 1 commit
- diff: test submodules are found with trailing `/` · 2e0391f4
```
Test that submodules are found when they are included in a pathspec
but have a trailing slash.
```
  Edward Thomson committed 8 years ago
24 Mar, 2016 1 commit

iterator: give the tests a proper hierarchy · de034cd2

Iterator tests were split over repo::iterator and diff::iterator,
with duplication between the two.  Move them to iterator::index,
iterator::tree, and iterator::workdir.

committed 8 years ago

23 Mar, 2016 3 commits

Added clar test for #3568 · df25daef
Jeff Hostetler committed 8 years ago

iterators: refactored tree iterator · be30387e

Refactored the tree iterator to never recurse; simply process the
next entry in order in `advance`.  Additionally, reduce the number of
allocations and sorting as much as possible to provide a ~30% speedup
on case-sensitive iteration.  (The gains for case-insensitive iteration
are less majestic.)

committed 8 years ago

iterator: disambiguate reset and reset_range · 684b35c4

Disambiguate the reset and reset_range functions.  Now reset_range
with a NULL path will clear the start or end; reset will leave the
existing start and end unchanged.

committed 8 years ago

20 Mar, 2016 1 commit
- tree: re-use the id and filename in the odb object · 60a194aa
```
Instead of copying over the data into the individual entries, point to
the originals, which are already in a format we can use.
```
  Carlos Martín Nieto committed 8 years ago
03 Mar, 2016 1 commit
- tests: take the version from our define · e23efa6d
  Carlos Martín Nieto committed 8 years ago
28 Feb, 2016 1 commit

tests: use legitimate object ids · 4afe536b

Use legitimate (existing) object IDs in tests so that we have the
ability to turn on strict object validation when running tests.

committed 8 years ago

12 Feb, 2016 1 commit
- win32: introduce p_timeval that isn't stupid · 35439f59
```
Windows defines `timeval` with `long`, which we cannot
sanely cope with.  Instead, use a custom timeval struct.
```
  Edward Thomson committed 8 years ago
11 Feb, 2016 1 commit
- Horrible fix for #3173. · 3679ebae
  Arthur Schreiber committed 8 years ago
01 Dec, 2015 1 commit

diff: include commit message when formatting patch · 254e0a33

When formatting a patch as email we do not include the commit's
message in the formatted patch output. Implement this and add a
test that verifies behavior.

committed 9 years ago

20 Nov, 2015 1 commit
- Fix some warnings · 87428c55
  Jacques Germishuys committed 9 years ago
03 Nov, 2015 1 commit

diff: test "symlinks" in wd are respected on win32 · f20480ab

When `core.symlinks = false`, we write the symlink's content (target)
to a regular file.  We should ensure that when we later see that
regular file, we treat it specially - and that changing that regular
file would actually change the symlink target.  (For compatibility
with Git for Windows).

committed 9 years ago

02 Nov, 2015 1 commit
- Add diff progress callback. · 3138ad93
  Jason Haslam committed 9 years ago
21 Oct, 2015 1 commit
- tests: Fix warnings · bbe1957b
  Vicent Marti committed 9 years ago
25 Sep, 2015 1 commit

Fix binary diffs · e4b2b919

git expects an empty line after the binary data:

	literal X
	...binary data...
	<empty_line>

The last literal block of the generated patches did not contain the required empty line. Example:

	diff --git a/binary_file b/binary_file
	index 3f1b3f9098131cfecea4a50ff8afab349ea66d22..86e5c1008b5ce635d3e3fffa4434c5eccd8f00b6 100644
	GIT binary patch
	literal 8
	Pc${NM&PdElPvrst3ey5{

	literal 6
	Nc${NM%g@i}0ssZ|0lokL
	diff --git a/binary_file2 b/binary_file2
	index 31be99be19470da4af5b28b21e27896a2f2f9ee2..86e5c1008b5ce635d3e3fffa4434c5eccd8f00b6 100644
	GIT binary patch
	literal 8
	Pc${NM&PdElPvrst3ey5{

	literal 13
	Sc${NMEKbZyOexL+Qd|HZV+4u-

git apply of that diff results in:

	error: corrupt binary patch at line 9: diff --git a/binary_file2 b/binary_file2
	fatal: patch with only garbage at line 10

The proper formatting is:

	diff --git a/binary_file b/binary_file
	index 3f1b3f9098131cfecea4a50ff8afab349ea66d22..86e5c1008b5ce635d3e3fffa4434c5eccd8f00b6 100644
	GIT binary patch
	literal 8
	Pc${NM&PdElPvrst3ey5{

	literal 6
	Nc${NM%g@i}0ssZ|0lokL

	diff --git a/binary_file2 b/binary_file2
	index 31be99be19470da4af5b28b21e27896a2f2f9ee2..86e5c1008b5ce635d3e3fffa4434c5eccd8f00b6 100644
	GIT binary patch
	literal 8
	Pc${NM&PdElPvrst3ey5{

	literal 13
	Sc${NMEKbZyOexL+Qd|HZV+4u-

committed 9 years ago

12 Sep, 2015 1 commit
- diff::workdir: ensure ignored files are not returned · 92f7d32b
```
Ensure that a diff with the workdir is not erroneously returning
directories.
```
  Edward Thomson committed 9 years ago
31 Aug, 2015 1 commit

iterator: saner pathlist matching for idx iterator · d53c8880

Some nicer refactoring for index iteration walks.

The index iterator doesn't binary search through the pathlist space,
since it lacks directory entries, and would have to binary search
each index entry and all its parents (eg, when presented with an index
entry of `foo/bar/file.c`, you would have to look in the pathlist for
`foo/bar/file.c`, `foo/bar` and `foo`).  Since the index entries and the
pathlist are both nicely sorted, we walk the index entries in lockstep
with the pathlist like we do for other iteration/diff/merge walks.
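
The lockstep idea can be sketched for the simplest case of exact matches between two sorted lists. This is an assumption-laden illustration, not the libgit2 iterator: the function name `count_lockstep_matches` is hypothetical, ordering is plain `strcmp`, and the directory-prefix matching mentioned above is deliberately left out.

```c
#include <string.h>
#include <stddef.h>

/*
 * Walk two sorted string arrays in lockstep and count the entries that
 * appear in both. Each list is traversed at most once, so no binary
 * search per entry is needed.
 */
static size_t count_lockstep_matches(
    const char **entries, size_t n_entries,
    const char **pathlist, size_t n_paths)
{
    size_t i = 0, j = 0, matches = 0;

    while (i < n_entries && j < n_paths) {
        int cmp = strcmp(entries[i], pathlist[j]);

        if (cmp < 0) {
            i++;        /* entry sorts first: it is not in the pathlist */
        } else if (cmp > 0) {
            j++;        /* pathlist item sorts first: no entry for it */
        } else {
            matches++;  /* same path in both lists */
            i++;
            j++;
        }
    }
    return matches;
}
```

Both inputs must already be sorted in the same order for the walk to be correct; an unsorted pathlist would silently miss matches.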

committed 9 years ago

30 Aug, 2015 1 commit

diff: use new iterator pathlist handling · 4a0dbeb0

When using literal pathspecs in diff with `GIT_DIFF_DISABLE_PATHSPEC_MATCH`
turn on the faster iterator pathlist handling.

Updates iterator pathspecs to include directory prefixes (eg, `foo/`)
for compatibility with `GIT_DIFF_DISABLE_PATHSPEC_MATCH`.

committed 9 years ago

29 Aug, 2015 1 commit

diff: better document GIT_DIFF_PATHSPEC_DISABLE · 3273ab3f

Document that `GIT_DIFF_PATHSPEC_DISABLE` is not necessarily about
explicit path matching, but also includes matching of directory
names.  Enforce this in a test.

committed 9 years ago