Commits · ccffea6bfa65f9fc9e986f53924a0fd3dc5dfac1 · lvzhengyang / git2

28 Nov, 2019 2 commits

patch_parse: fix out-of-bounds reads caused by integer underflow · 33e6c402

The patch format for binary files is a simple Base85 encoding with a
length byte as prefix that encodes the current line's length. For each
line, we thus check whether the line's actual length matches its
expected length in order to not faultily apply a truncated patch. This
also acts as a check to verify that we're not reading outside of the
line's string:

	if (encoded_len > ctx->parse_ctx.line_len - 1) {
		error = git_parse_err(...);
		goto done;
	}

There is the possibility for an integer underflow, though. Given a line
with a single prefix byte, only, `line_len` will be zero when reaching
this check. As a result, subtracting one from that will result in an
integer underflow, causing us to assume that there's a wealth of bytes
available later on. Naturally, this may result in an out-of-bounds read.

Fix the issue by checking both `encoded_len` and `line_len` for a
non-zero value. The binary format doesn't make use of zero-length lines
anyway, so we need to know that there are both encoded bytes and
remaining characters available at all.
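
The wraparound is easy to reproduce in isolation. The following is a minimal sketch, assuming that both lengths are of the unsigned type `size_t`; the function names `length_check_buggy` and `length_check_fixed` are hypothetical stand-ins and not the actual libgit2 code:

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical stand-ins for the check described above, not libgit2 code.
 * Both functions return true when the line should be REJECTED.
 */

/* Original shape of the check: wraps around when line_len is zero. */
static bool length_check_buggy(size_t encoded_len, size_t line_len)
{
	/* With line_len == 0, the unsigned "line_len - 1" becomes SIZE_MAX. */
	return encoded_len > line_len - 1;
}

/* Fixed shape: require both lengths to be non-zero before subtracting. */
static bool length_check_fixed(size_t encoded_len, size_t line_len)
{
	if (encoded_len == 0 || line_len == 0)
		return true;
	return encoded_len > line_len - 1;
}
```

With `line_len == 0`, the buggy variant accepts any `encoded_len`, while the fixed variant rejects the line before any decoding takes place.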

This patch also adds a test that works based on the last error message.
Checking error messages is usually too tightly coupled, but parsing the
patch in fact failed even before the change, so the return code alone
cannot tell the two cases apart. The only alternative would be to use
e.g. Valgrind, but that'd result in us not catching issues when run
without Valgrind. As a result, using the error message is considered a
viable tradeoff as we know that we didn't start decoding Base85 in the
first place.

committed 5 years ago

diff: make patchid computation work with all types of commits. · ece5bb5e

The current implementation of patchid does not compute a correct patch
ID when given a patch where, for example, a file is added or removed.
Some more corner cases need to be handled to get the same behavior as
the git patch-id command.
Add some more tests to cover those corner cases.

Signed-off-by: Gregory Herrero <gregory.herrero@oracle.com>

committed 5 years ago

19 Nov, 2019 1 commit

patch_parse: correct parsing of patch containing not shown binary data. · 048e94ad

When binary data that is not shown is added or removed in a patch, the
patch parser currently returns 'error -1 - corrupt git binary header at
line 4'. Fix it by correctly handling the case where binary data is
added or removed.

Signed-off-by: Gregory Herrero <gregory.herrero@oracle.com>

committed 5 years ago

10 Nov, 2019 1 commit

patch_parse: use paths from "---"/"+++" lines for binary patches · de7659cc

For some patches, it is not possible to derive the old and new file
paths from the patch header's first line, most importantly when they
contain spaces. In such a case, we derive both paths from the "---" and
"+++" lines, which allow for non-ambiguous parsing. We fail to use these
paths when parsing binary patches without data, though, as we always
expect the header paths to be filled in.

Fix this by using the "---"/"+++" paths by default and only fall back to
header paths if they aren't set. If neither of those paths are set, we
just return an error. Add two tests to verify this behaviour, one of
which would have previously caused a segfault.
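
The fallback order can be sketched as follows. This is only an illustration of the logic described above; `choose_path` and its parameters are hypothetical names, not the actual libgit2 function:

```c
#include <stddef.h>

/*
 * Hypothetical helper, not libgit2 code: pick the path for one side of a
 * patch. `marker_path` comes from the "---"/"+++" line and `header_path`
 * from the "diff --git" line; either may be NULL if it was not parsed.
 * Returns 0 and stores the chosen path, or -1 if neither path is set.
 */
static int choose_path(const char **out, const char *marker_path,
		       const char *header_path)
{
	const char *path = marker_path != NULL ? marker_path : header_path;

	if (path == NULL)
		return -1; /* nothing to fall back to: report an error */

	*out = path;
	return 0;
}
```

Returning an error when both inputs are `NULL` keeps later code from dereferencing a path that was never set.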

committed 5 years ago

05 Nov, 2019 1 commit

patch_parse: fix segfault when header path contains whitespace only · de543e29

When parsing header paths from a patch, we reject any patches with empty
paths as malformed patches. We perform the check whether a path is empty
before sanitizing it, though, which may lead to a path becoming empty
after the check, e.g. if we have trimmed whitespace. This may lead to a
segfault later when any part of our patching logic actually references
such a path, which may then be a `NULL` pointer.

Fix the issue by performing the check after sanitizing. Add tests to
catch the issue as they would have produced a segfault previously.
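
The ordering matters because sanitizing can shrink the path. A minimal sketch of "sanitize first, then check", assuming that sanitizing means trimming trailing whitespace; `rtrim_len` and `validate_header_path` are hypothetical names, not libgit2 functions:

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: length of `path` once trailing whitespace is cut. */
static size_t rtrim_len(const char *path, size_t len)
{
	while (len > 0 && isspace((unsigned char)path[len - 1]))
		len--;
	return len;
}

/* Returns 0 if the path is usable, -1 if it is empty after sanitizing. */
static int validate_header_path(const char *path, size_t len)
{
	len = rtrim_len(path, len); /* sanitize first ... */
	return len == 0 ? -1 : 0;   /* ... then check for emptiness */
}
```

A path consisting only of spaces has a non-zero length before trimming, so a check placed before the trim would let it through.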

committed 5 years ago

21 Oct, 2019 1 commit

patch_parse: detect overflow when calculating old/new line position · 37141ff7

When the patch contains line numbers close to INT_MAX, we may end up
with an integer overflow when calculating the line of the current diff
hunk. Reject such patches as unreasonable to avoid the integer overflow.

As the calculation is performed on integers, we introduce two new
helpers `git__add_int_overflow` and `git__sub_int_overflow` that perform
the integer overflow check in a generic way.
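
Such helpers can be written portably by testing against the limits before doing the arithmetic. The sketch below is an assumption about their general shape, not the libgit2 implementation; the names `add_int_overflow` and `sub_int_overflow` are hypothetical:

```c
#include <limits.h>
#include <stdbool.h>

/*
 * Hypothetical helpers, not the libgit2 implementation. Each returns true
 * if the operation would overflow and leaves *out untouched in that case;
 * otherwise it stores the result in *out and returns false.
 */
static bool add_int_overflow(int *out, int one, int two)
{
	if ((two > 0 && one > INT_MAX - two) ||
	    (two < 0 && one < INT_MIN - two))
		return true;
	*out = one + two;
	return false;
}

static bool sub_int_overflow(int *out, int one, int two)
{
	if ((two > 0 && one < INT_MIN + two) ||
	    (two < 0 && one > INT_MAX + two))
		return true;
	*out = one - two;
	return false;
}
```

The comparisons are arranged so that the limit checks themselves never overflow, which avoids the undefined behaviour of signed overflow in C.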

committed 5 years ago

19 Oct, 2019 3 commits

patch_parse: fix out-of-bounds read with No-NL lines · 468e3ddc

We've got two locations where we copy lines into the patch. The first
one is when copying normal " ", "-" or "+" lines, while the second
location gets executed when we copy "\ No newline at end of file" lines.
While the first one correctly uses `git__strndup` to copy only until the
newline, the other one doesn't. Thus, if the line occurs at the end of
the patch and if there is no terminating NUL character, then it may
result in an out-of-bounds read.

Fix the issue by using `git__strndup`, as was already done in the other
location. Furthermore, add allocation checks to both locations to detect
out-of-memory situations.
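
The difference between the two copy strategies shows up as soon as the source is not NUL-terminated. A minimal sketch of a length-bounded copy with an allocation check follows; `dup_line` is a hypothetical helper written with standard C only and is not `git__strndup` itself:

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper, not libgit2's git__strndup: copy exactly `len`
 * bytes of `line` and NUL-terminate the copy. It never reads past
 * line[len - 1], so `line` does not need a terminating NUL.
 * Returns NULL if the allocation fails; the caller must check for that.
 */
static char *dup_line(const char *line, size_t len)
{
	char *copy;

	if (len == SIZE_MAX) /* len + 1 would wrap around */
		return NULL;

	copy = malloc(len + 1);
	if (copy == NULL)
		return NULL;

	memcpy(copy, line, len);
	copy[len] = '\0';
	return copy;
}
```

An unbounded copy would instead keep scanning for a NUL byte, which is what leads to the out-of-bounds read when the line sits at the very end of the buffer.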

committed 5 years ago

patch_parse: reject empty path names · 6c6c15e9

When parsing patch headers, we currently accept empty path names just
fine, e.g. a line "--- \n" would be parsed as the empty filename. This
is not a valid patch format and may cause `NULL` pointer accesses at a
later place as `git_buf_detach` will return `NULL` in that case.

Reject such patches as malformed with a nice error message.

committed 5 years ago

patch_parse: reject patches with multiple old/new paths · 223e7e43

It's currently possible to have patches with multiple old path name
headers. As we didn't check for this case, this resulted in a memory
leak when overwriting the old old path with the new old path because we
simply discarded the old pointer.

Instead of fixing this by free'ing the old pointer, we should reject
such patches altogether. It doesn't make any sense for the "---" or
"+++" markers to occur multiple times within a patch in the first place.
This also implicitly fixes the memory leak.
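
Rejecting the duplicate amounts to refusing to overwrite a path that is already set. A minimal sketch, assuming a simplified parser state; `struct patch_paths` and `set_old_path` are hypothetical and not libgit2 types or functions:

```c
#include <stddef.h>

/* Hypothetical, simplified parser state; not a libgit2 structure. */
struct patch_paths {
	const char *old_path; /* set when a "---" line is parsed */
};

/* Returns 0 on success, -1 if an old path was already recorded. */
static int set_old_path(struct patch_paths *paths, const char *path)
{
	if (paths->old_path != NULL)
		return -1; /* second "---" header: reject, do not overwrite */

	paths->old_path = path;
	return 0;
}
```

Because the first pointer is never replaced, there is no previous allocation that could be lost.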

committed 5 years ago

16 Oct, 2019 1 commit

patch_parse: handle patches without extended headers · 11de594f

Extended header lines (especially the "index <hash>..<hash> <mode>") are
not required by "git apply" in order to import patches. So we allow the
from-file/to-file lines (--- a/file\n+++ b/file) to directly follow the
git diff header.

This fixes #5267.

committed 5 years ago

28 Aug, 2019 1 commit
- apply: Test for EOFNL mishandling when several hunks are processed · 585fbd74
```
Introduce a unit test to validate that git_apply__patch() properly
handles EOFNL changes in case of patches with several hunks.
```
  Max Kostyukevich committed 5 years ago
11 Jul, 2019 2 commits

patch_parse: ensure valid patch output with EOFNL · b0893282
Erik Aigner committed 5 years ago

patch_parse: handle missing newline indicator in old file · 3f855fe8

When either the old or new file contents have no newline at the end of
the file, then git-diff(1) will print out a "\ No newline at end of
file" indicator. While we do correctly handle this in the case where the
new file has this indicator, we fail to parse patches where the old file
is missing a newline at EOF.

Fix this bug by also handling missing newline indicators in the old
file. Add tests to verify that we can parse such files.

committed 5 years ago

05 Jul, 2019 1 commit

patch_parse: do not depend on parsed buffer's lifetime · dedf70ad

When parsing a patch from a buffer, we let the patch lines point into
the original buffer. While this is an efficient use of resources, it
also ties the lifetime of the parsed patch to the parsed buffer. As this
behaviour is not documented anywhere in our API, it is very surprising
to its users.

Untie the lifetime by duplicating the lines into the parsed patch. Add a
test that verifies that lifetimes are indeed independent of each other.

committed 5 years ago

06 Apr, 2019 1 commit
- patch_parse.c: Handle CRLF in parse_header_start · 30c06b60
  Drew DeVault committed 5 years ago
29 Mar, 2019 1 commit
- tests: diff: test parsing diffs with a new file with spaces in its path · 9d65360b
```
Add a test that verifies that we are able to parse patches which add a
new file that has spaces in its path.
```
  Erik Aigner committed 5 years ago
05 Nov, 2018 1 commit
- patch: add support for partial patch application · 72630572
```
Add hunk callback parameter to git_apply__patch to allow hunks to be skipped.
```
  Jason Haslam committed 6 years ago
10 Jun, 2018 1 commit
- Convert usage of `git_buf_free` to new `git_buf_dispose` · ecf4f33a
  Patrick Steinhardt committed 6 years ago
11 Nov, 2017 1 commit

patch_parse: allow parsing ambiguous patch headers · 80226b5f

The git patch format allows for having unquoted paths with whitespaces
inside. This format becomes ambiguous to parse, e.g. in the following
example:

    diff --git a/file b/with spaces.txt b/file b/with spaces.txt

While we cannot parse this in a correct way, we can instead use the
"---" and "+++" lines to retrieve the file names, as the path is not
followed by anything here but spans the complete remaining line. Because
of this, we can simply bail out when parsing the "diff --git" header here
without an actual error and then proceed to just take the paths from the
other headers.

committed 7 years ago

26 Jun, 2017 1 commit

diff: implement function to calculate patch ID · 89a34828

The upstream git project provides the ability to calculate a so-called
patch ID. Quoting from git-patch-id(1):

    "A "patch ID" is nothing but a sum of SHA-1 of the file diffs
    associated with a patch, with whitespace and line numbers ignored."

Patch IDs can be used to identify two patches which are probably the
same thing, e.g. when a patch has been cherry-picked to another branch.

This commit implements a new function `git_diff_patchid`, which gets a
diff and derives an OID from it. Note the different terminology here: a
patch in libgit2 is the set of differences in a single file, and a diff
can contain multiple patches for different files. The implementation
matches the upstream implementation and should derive the same OID for
the same diff. In fact, some code has been directly derived from the
upstream implementation.

The upstream implementation has two different modes to calculate patch
IDs, a stable and an unstable mode. The old way of calculating the patch
IDs was unstable in the sense that a different ordering of the diffs led
to different results. This oversight was fixed in git 1.9, but as git
tries hard to never break existing workflows, the old and unstable way
is still the default. The newer and stable way does not care about the
ordering of the diff hunks, and in fact it is the mode that should
probably be used today. So right now, we only implement the stable way
of generating the patch ID.
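
One way to obtain an order-independent result is to hash each part of the diff on its own and to combine the digests with a commutative operation. The sketch below assumes that digests are added up as multi-byte numbers with carry; `accumulate_digest` and `DIGEST_LEN` are hypothetical names, and the code illustrates the idea only, it is neither the libgit2 nor the git implementation:

```c
#include <stddef.h>
#include <stdint.h>

#define DIGEST_LEN 20 /* size of a SHA-1 digest in bytes */

/*
 * Hypothetical helper: add `digest` into `acc`, treating both as
 * little-endian multi-byte numbers and propagating the carry. As addition
 * is commutative, the final value of `acc` does not depend on the order
 * in which digests are accumulated.
 */
static void accumulate_digest(uint8_t acc[DIGEST_LEN],
			      const uint8_t digest[DIGEST_LEN])
{
	unsigned int carry = 0;
	size_t i;

	for (i = 0; i < DIGEST_LEN; i++) {
		carry += (unsigned int)acc[i] + digest[i];
		acc[i] = (uint8_t)(carry & 0xff);
		carry >>= 8;
	}
}
```

Feeding the same set of digests in any order yields the same accumulator, whereas hashing the whole diff as one stream would make the result depend on the order.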

committed 7 years ago

05 Sep, 2016 1 commit

diff: treat binary patches with no data special · adedac5a

When creating and printing diffs, deal with binary deltas that have
binary data specially, versus diffs that have a binary file but lack the
actual binary data.

committed 8 years ago

26 May, 2016 8 commits
- patch: differentiate not found and invalid patches · 94e488a0
  Edward Thomson committed 8 years ago
- git_patch_parse_ctx: refcount the context · 17572f67
  Edward Thomson committed 8 years ago
- patch: `git_patch_from_patchfile` -> `git_patch_from_buffer` · 440e3bae
  Edward Thomson committed 8 years ago
- apply: test postimages that grow/shrink original · 0ff723cc
```
Test with some postimages that actually grow/shrink from the
original, adding new lines or removing them.  (Also do so without
context to ensure that we can add/remove from a non-zero part of
the line vector.)
```
  Edward Thomson committed 8 years ago
- Introduce git_patch_options, handle prefixes · 82175084
```
Handle prefixes (in terms of number of path components) for patch
parsing.
```
  Edward Thomson committed 8 years ago
- patch_parse: test roundtrip patch parsing -> print · 2f3b922f
  Edward Thomson committed 8 years ago
- patch_parse: ensure we can parse a patch · 42b34428
  Edward Thomson committed 8 years ago
- apply: move patch data to patch_common.h · 8bca8b9e
  Edward Thomson committed 8 years ago