Commits · 3e2fcfee161ab369dae11fd07e884ba2a5cbeccc · lvzhengyang / git2

26 Mar, 2020 40 commits

attr: Update definition of binary macro · 3e2fcfee
Laurence McGlashan committed Mar 26, 2020

global: convert to fiber-local storage to fix exit races · bb173381

On Windows platforms, we automatically clean up the thread-local storage
upon detaching a thread via `DllMain()`. The thing is that this happens
for every thread of applications that link against the libgit2 DLL, even
those that don't have anything to do with libgit2 itself. As a result,
we cannot assume that these unsuspecting threads make use of our
`git_libgit2_init()` and `git_libgit2_shutdown()` reference counting,
which may lead to racy situations:

    Thread 1                    Thread 2

    git_libgit2_shutdown()
                                DllMain(DETACH_THREAD)
                                git__free_tls_data()
    git_atomic_dec() == 0
    git__free_tls_data()
    TlsFree(_tls_index)
                                TlsGetValue(_tls_index)

Due to the second thread never having executed `git_libgit2_init()`, the
first thread will clean up TLS data and as a result also free the
`_tls_index` variable. When detaching the second thread, we
unconditionally access the now-free'd `_tls_index` variable, which is
obviously not going to work out well.

Fix the issue by converting the code to use fiber-local storage instead
of thread-local storage. While FLS will behave the exact same as TLS if
no fibers are in use, it does allow us to specify a destructor similar
to the one that is accepted by pthread_key_create(3P). This way, we no
longer have to free indices manually, but let FLS handle calling the
destructor. This allows us to get rid of `DllMain()` completely, as we
only used it to keep track of when threads were exiting, and results in
an overall simplification of TLS cleanup.

committed Mar 26, 2020

patch_parse: fix out-of-bounds reads caused by integer underflow · 60d1f99e

The patch format for binary files is a simple Base85 encoding with a
length byte as prefix that encodes the current line's length. For each
line, we thus check whether the line's actual length matches its
expected length in order to not faultily apply a truncated patch. This
also acts as a check to verify that we're not reading outside of the
line's string:

	if (encoded_len > ctx->parse_ctx.line_len - 1) {
		error = git_parse_err(...);
		goto done;
	}

There is the possibility for an integer underflow, though. Given a line
that consists of only the prefix byte, `line_len` will be zero when reaching
this check. As a result, subtracting one from that will result in an
integer underflow, causing us to assume that there's a wealth of bytes
available later on. Naturally, this may result in an out-of-bounds read.

Fix the issue by checking both `encoded_len` and `line_len` for a
non-zero value. The binary format doesn't make use of zero-length lines
anyway, so we need to know that there are both encoded bytes and
remaining characters available at all.
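
The underflow and the corrected check can be sketched as follows. This is a minimal illustration rather than the actual libgit2 code: `binary_line_fits_unsafe` and `binary_line_fits` are hypothetical helpers, and both lengths are assumed to be of type `size_t`.

```c
#include <stddef.h>

/*
 * Hypothetical stand-in for the original check: returns 1 if the line
 * is considered long enough. With line_len == 0, `line_len - 1` wraps
 * around to SIZE_MAX, so every encoded_len is wrongly accepted.
 */
static int binary_line_fits_unsafe(size_t encoded_len, size_t line_len)
{
	return !(encoded_len > line_len - 1);
}

/*
 * Sketch of the fixed check: both lengths must be non-zero before the
 * subtraction is performed, so it can no longer wrap around.
 */
static int binary_line_fits(size_t encoded_len, size_t line_len)
{
	if (encoded_len == 0 || line_len == 0)
		return 0;
	return encoded_len <= line_len - 1;
}
```

With `line_len == 0`, the unsafe variant accepts any `encoded_len`, while the fixed variant rejects the line before any decoding starts.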

This patch also adds a test that works based on the last error message.
Checking error messages is usually too tightly coupled, but in fact
parsing the patch failed even before the change. Thus the only
possibility is to use e.g. Valgrind, but that'd result in us not
catching issues when run without Valgrind. As a result, using the error
message is considered a viable tradeoff as we know that we didn't start
decoding Base85 in the first place.

committed Mar 26, 2020

patch_parse: use paths from "---"/"+++" lines for binary patches · 8ff44c2a

For some patches, it is not possible to derive the old and new file
paths from the patch header's first line, most importantly when they
contain spaces. In such a case, we derive both paths from the "---" and
"+++" lines, which allow for non-ambiguous parsing. We fail to use these
paths when parsing binary patches without data, though, as we always
expect the header paths to be filled in.

Fix this by using the "---"/"+++" paths by default and only fall back to
header paths if they aren't set. If neither of those paths is set, we
just return an error. Add two tests to verify this behaviour, one of
which would have previously caused a segfault.
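
The fallback order described above might look roughly like this. `choose_path` is a hypothetical helper written for illustration only; the real parser keeps these paths in its own context structure.

```c
#include <stddef.h>

/*
 * Hypothetical helper: prefer the path parsed from the "---"/"+++"
 * line, fall back to the path from the patch header's first line, and
 * report an error (-1) if neither is available.
 */
static int choose_path(const char **out, const char *marker_path,
		       const char *header_path)
{
	if (marker_path != NULL)
		*out = marker_path;
	else if (header_path != NULL)
		*out = header_path;
	else
		return -1;
	return 0;
}
```

Returning an error when both inputs are `NULL` is what keeps later code from dereferencing a missing path.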

committed Mar 26, 2020

fileops: correct error return on p_lstat failures when mkdir · 128230ba

IIRC I got a strange return once from lstat, which translated in a weird
error class/message being reported. As a safety measure, enforce a -1 return in
that case.

committed Mar 26, 2020

patch_parse: fix segfault when header path contains whitespace only · 7ce231ff

When parsing header paths from a patch, we reject any patches with empty
paths as malformed patches. We perform the check whether a path is empty
before sanitizing it, though, which may lead to a path becoming empty
after the check, e.g. if we have trimmed whitespace. This may lead to a
segfault later when any part of our patching logic actually references
such a path, which may then be a `NULL` pointer.

Fix the issue by performing the check after sanitizing. Add tests to
catch the issue as they would have produced a segfault previously.
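
A minimal sketch of the corrected ordering, assuming that sanitizing amounts to trimming trailing whitespace. Both `trim_trailing_space` and `parse_header_path` are hypothetical names, not libgit2 functions.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical sanitizer: trims trailing whitespace in place and
 * returns the new length of the path.
 */
static size_t trim_trailing_space(char *path)
{
	size_t len = strlen(path);

	while (len > 0 && isspace((unsigned char)path[len - 1]))
		path[--len] = '\0';
	return len;
}

/*
 * Sketch of the corrected ordering: sanitize first, then reject the
 * path if nothing is left. Returns 0 on success, -1 for empty paths.
 */
static int parse_header_path(char *path)
{
	if (trim_trailing_space(path) == 0)
		return -1;
	return 0;
}
```

A path made up of whitespace only passes an emptiness check done before trimming, but is rejected here because the check runs on the sanitized result.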

committed Mar 26, 2020

fix a bug introduced in 8a23597b · 9f2732cc
romkatv committed Mar 26, 2020

Follow 308 redirect in WinHTTP transport · c32f5b00
pcpthm committed Mar 26, 2020

patch_parse: detect overflow when calculating old/new line position · 54f0a278

When the patch contains line numbers close to INT_MAX, we may end up
with an integer overflow when calculating the line position of the
current diff hunk. Reject such patches as unreasonable to avoid the
integer overflow.

As the calculation is performed on integers, we introduce two new
helpers `git__add_int_overflow` and `git__sub_int_overflow` that perform
the integer overflow check in a generic way.
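
Overflow-checking helpers of this kind can be written without relying on undefined behaviour. The sketch below uses hypothetical names and signatures (`add_int_overflow`, `sub_int_overflow`); the commit message does not show how the real helpers are declared.

```c
#include <limits.h>
#include <stdbool.h>

/*
 * Hypothetical equivalents of the helpers named above. Each returns
 * true if the operation would overflow and leaves *out untouched in
 * that case; otherwise it stores the result and returns false.
 */
static bool add_int_overflow(int *out, int a, int b)
{
	if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b))
		return true;
	*out = a + b;
	return false;
}

static bool sub_int_overflow(int *out, int a, int b)
{
	if ((b > 0 && a < INT_MIN + b) || (b < 0 && a > INT_MAX + b))
		return true;
	*out = a - b;
	return false;
}
```

The range is tested before the arithmetic is performed, so the signed overflow itself never happens.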

committed Mar 26, 2020

patch_parse: fix out-of-bounds read with No-NL lines · 608cb07d

We've got two locations where we copy lines into the patch. The first
one is when copying normal " ", "-" or "+" lines, while the second
location gets executed when we copy "\ No newline at end of file" lines.
While the first one correctly uses `git__strndup` to copy only until the
newline, the other one doesn't. Thus, if the line occurs at the end of
the patch and if there is no terminating NUL character, then it may
result in an out-of-bounds read.

Fix the issue by using `git__strndup`, as was already done in the other
location. Furthermore, add allocation checks to both locations to detect
out-of-memory situations.
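
The difference between a bounded and an unbounded copy can be illustrated with a small stand-in. `dup_line` is a hypothetical helper, similar in spirit to strndup(3), and is not the `git__strndup` implementation.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical length-bounded duplicate: copies exactly `len` bytes
 * and appends a NUL, so it never reads past the end of a buffer that
 * lacks a terminator. Returns NULL if the allocation fails.
 */
static char *dup_line(const char *line, size_t len)
{
	char *copy;

	if (len == SIZE_MAX)
		return NULL; /* len + 1 would overflow */
	copy = malloc(len + 1);
	if (copy == NULL)
		return NULL;
	memcpy(copy, line, len);
	copy[len] = '\0';
	return copy;
}
```

Callers are expected to check the result for `NULL`, which corresponds to the allocation checks mentioned above.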

committed Mar 26, 2020

patch_parse: reject empty path names · 3223f5de

When parsing patch headers, we currently accept empty path names just
fine, e.g. a line "--- \n" would be parsed as the empty filename. This
is not a valid patch format and may cause `NULL` pointer accesses at a
later place as `git_buf_detach` will return `NULL` in that case.

Reject such patches as malformed with a nice error message.

committed Mar 26, 2020

patch_parse: reject patches with multiple old/new paths · db73191b

It's currently possible to have patches with multiple old path name
headers. As we didn't check for this case, this resulted in a memory
leak when overwriting the old old path with the new old path because we
simply discarded the old pointer.

Instead of fixing this by free'ing the old pointer, we should reject
such patches altogether. It doesn't make any sense for the "---" or
"+++" markers to occur multiple times within a patch in the first place.
This also implicitly fixes the memory leak.
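
Rejecting a repeated header instead of overwriting the pointer might be sketched like this. `struct patch_paths` and `set_old_path` are hypothetical and reduced to the one field needed for the illustration.

```c
#include <stddef.h>

/* Hypothetical, reduced parser state holding only the old path. */
struct patch_paths {
	const char *old_path;
};

/*
 * Sketch: a second "---" line is treated as a malformed patch
 * instead of silently overwriting (and leaking) the first path.
 */
static int set_old_path(struct patch_paths *paths, const char *path)
{
	if (paths->old_path != NULL)
		return -1; /* path already set: reject the patch */
	paths->old_path = path;
	return 0;
}
```

Because the first pointer is never replaced, there is nothing left to leak.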

committed Mar 26, 2020

patch_parse: handle patches without extended headers · fc60777e

Extended header lines (especially the "index <hash>..<hash> <mode>") are
not required by "git apply" to import patches. So we allow the
from-file/to-file lines (--- a/file\n+++ b/file) to directly follow the
git diff header.

This fixes #5267.

committed Mar 26, 2020

refs: unlock unmodified refs on transaction commit · fceedda5

Refs that are locked in a transaction without an altered target should
still be unlocked on `git_transaction_commit`. `git_transaction_free`
also unlocks refs, but the time at which it is called cannot be
controlled in all situations. Some binding libraries call
`git_transaction_free` only on garbage collection, or not at all if the
application exits first, and don't provide public access to it.
It is better to release locks as soon as possible.

committed Mar 26, 2020

refs: fix locks getting forcibly removed · 5aca2444

The flag `GIT_FILEBUF_FORCE` currently does two things:

     1. It will cause the filebuf to create non-existing leading
        directories for the file that is about to be written.
     2. It will forcibly remove any pre-existing locks.

While most call sites actually do want (1), they do not want to
remove pre-existing locks, as that renders the locking mechanisms
effectively useless.

Introduce a new flag `GIT_FILEBUF_CREATE_LEADING_DIRS` to
separate both behaviours cleanly from each other and convert
callers to use it instead of `GIT_FILEBUF_FORCE` to have them
honor locked files correctly.

As this conversion removes all current users of `GIT_FILEBUF_FORCE`,
this commit removes the flag altogether.

committed Mar 26, 2020

patch_parse: handle patches with new empty files · 85ab27c8

Patches containing additions of empty files will not contain diff data
but will end with the index header line followed by the terminating
sequence "-- ". We follow the same logic as in cc4c44a9 and allow "-- "
to immediately follow the index header.

committed Mar 26, 2020

buffer: fix printing into out-of-memory buffer · 3c605da6

Before printing into a `git_buf` structure, we always call `ENSURE_SIZE`
first. This macro will reallocate the buffer as-needed depending on
whether the current amount of allocated bytes is sufficient or not. If
`asize` is big enough, then it will just do nothing, otherwise it will
call out to `git_buf_try_grow`. But in fact, it is insufficient to only
check `asize`.

When we fail to allocate any more bytes e.g. via `git_buf_try_grow`,
then we set the buffer's pointer to `git_buf__oom`. Note that we touch
neither `asize` nor `size`. So if we just check `asize > targetsize`,
then we will happily let the caller of `ENSURE_SIZE` proceed with an
out-of-memory buffer. As a result, we will print all bytes into the
out-of-memory buffer instead, resulting in an out-of-bounds write.

Fix the issue by having `ENSURE_SIZE` verify that the buffer is not
marked as OOM. Add a test to verify that we're not writing into the OOM
buffer.
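
The idea behind the fixed check can be sketched with a reduced buffer type. `struct buf`, `buf_oom_marker` and `buf_can_write` are hypothetical stand-ins for `git_buf`, `git_buf__oom` and the `ENSURE_SIZE` condition; the real macro also triggers the reallocation.

```c
#include <stddef.h>

/* Hypothetical, reduced buffer type; not the real `git_buf`. */
struct buf {
	char *ptr;
	size_t asize; /* allocated bytes */
	size_t size;  /* used bytes */
};

/* Sentinel that marks a buffer whose last allocation failed. */
static char buf_oom_marker[1];

/*
 * Sketch of the fixed check: a buffer is only writable if it is not
 * marked as OOM *and* has enough allocated space. Checking `asize`
 * alone would accept an OOM buffer whose `asize` was left untouched.
 */
static int buf_can_write(const struct buf *b, size_t target_size)
{
	if (b->ptr == buf_oom_marker)
		return 0;
	return b->asize >= target_size;
}
```

An OOM buffer with a stale `asize` of 16 is rejected here even when only 8 bytes are requested.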

committed Mar 26, 2020

buffer: fix infinite loop when growing buffers · 93dc8a04

When growing buffers, we repeatedly multiply the currently allocated
number of bytes by 1.5 until it exceeds the requested number of bytes.
This has two major problems:

    1. If the current number of bytes is tiny and one wishes to resize
       to a comparatively huge number of bytes, then we may need to loop
       thousands of times.

    2. If resizing to a value close to `SIZE_MAX` (which would fail
       anyway), then we probably hit an infinite loop as multiplying the
       current amount of bytes will repeatedly result in integer
       overflows.

When reallocating buffers, one typically chooses values close to 1.5 to
enable re-use of resulting memory holes in later reallocations. But
because of this, it really only makes sense to use a factor of 1.5
_once_, but not looping until we finally are able to fit it. Thus, we
can completely avoid the loop and just opt for the much simpler
algorithm of multiplying with 1.5 once and, if the result doesn't fit,
just use the target size. This avoids both problems of looping
extensively and hitting overflows.

This commit also adds a test that would've previously resulted in an
infinite loop.
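
The simplified growth strategy described above can be sketched as a single function. `next_alloc_size` is a hypothetical name, and the explicit overflow guard is an assumption about how one might keep the multiplication safe.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: grow by a factor of 1.5 exactly once. If that
 * is not enough, or if the multiplication would overflow, use the
 * requested size directly. No loop is involved.
 */
static size_t next_alloc_size(size_t current, size_t target)
{
	size_t grown;

	if (current > SIZE_MAX - current / 2)
		return target; /* current * 1.5 would overflow */
	grown = current + current / 2;
	return grown >= target ? grown : target;
}
```

Growing from 100 bytes to fit 120 yields 150, while growing from 8 bytes to fit a million yields the target directly instead of looping.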

committed Mar 26, 2020

buffer: fix memory leak if unable to grow buffer · 18ca62de

If growing a buffer fails, we set its pointer to the static
`git_buf__oom` structure. While we correctly free the old pointer if
`git__malloc` returned an error, we do not free it if there was an
integer overflow while calculating the new allocation size. Fix this
issue by freeing the pointer to plug the memory leak.

committed Mar 26, 2020

open:move all cleanup code to cleanup label in git_repository_open_ext · 51fb0c15
Laurence McGlashan committed Mar 26, 2020

open:fix memory leak when passing NULL to git_repository_open_ext · 8c99ccc5
Laurence McGlashan committed Mar 26, 2020

iterator: remove duplicate memset · 35168571

When allocating new tree iterator frames, we zero out the allocated
memory twice. Remove one of the `memset` calls.

committed Mar 26, 2020

iterator: avoid leaving partially initialized frame on stack · f647d021

When allocating tree iterator entries, we use `GIT_ERROR_ALLOC_CHECK` to
check whether the allocation has failed. The macro will cause the
function to immediately return, though, leaving behind a partially
initialized iterator frame.

Fix the issue by manually checking for memory allocation errors and
using `goto done` in case of an error, popping the iterator frame.
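
The cleanup pattern can be sketched with a reduced iterator. Everything here is hypothetical (`struct iter`, `push_frame`, the `alloc_fn` callback and `failing_alloc`); the callback exists only so that an allocation failure can be simulated.

```c
#include <stddef.h>
#include <stdlib.h>

#define MAX_DEPTH 8

/* Hypothetical, reduced iterator with a fixed-depth frame stack. */
struct frame { int *entries; };
struct iter { struct frame frames[MAX_DEPTH]; size_t depth; };

/* Allocator callback so that callers can simulate failures. */
typedef void *(*alloc_fn)(size_t count, size_t size);

/* Always fails; only used to demonstrate the error path. */
static void *failing_alloc(size_t count, size_t size)
{
	(void)count;
	(void)size;
	return NULL;
}

/*
 * Sketch: push a frame, then allocate its entries. On failure we jump
 * to `done`, which pops the frame again, so the stack never keeps a
 * partially initialized frame. An early return would skip that step.
 */
static int push_frame(struct iter *it, size_t count, alloc_fn alloc)
{
	struct frame *f;
	int error = 0;

	if (it->depth == MAX_DEPTH)
		return -1;
	f = &it->frames[it->depth++];

	f->entries = alloc(count, sizeof(*f->entries));
	if (f->entries == NULL) {
		error = -1;
		goto done;
	}

done:
	if (error < 0)
		it->depth--; /* pop the half-built frame */
	return error;
}
```

Passing `calloc` as the allocator exercises the success path, and passing `failing_alloc` shows that the depth is unchanged after a failed push.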

committed Mar 26, 2020

diff_generate: detect memory allocation errors when preparing opts · ad735bf3

When preparing options for the two iterators that are about to be
diffed, we allocate a common prefix for both iterators depending on
the options passed by the user. We do not check whether the allocation
was successful, though. In fact, this isn't much of a problem, as using
a `NULL` prefix is perfectly fine. But in the end, we probably want to
detect that the system doesn't have any memory left, as we're unlikely
to be able to continue afterwards anyway.

While the issue is being fixed in the newly created function
`diff_prepare_iterator_opts`, it already existed in the former
`DIFF_FROM_ITERATORS` macro.

committed Mar 26, 2020

diff_generate: refactor `DIFF_FROM_ITERATORS` macro of doom · 7aa03e92

While the `DIFF_FROM_ITERATORS` macro does make it shorter to implement the
various `git_diff_foo_to_bar` functions, it is a complex and unreadable
beast that implicitly assumes certain local variable names. This is not
something desirable to have at all and obstructs understanding and more
importantly debugging the code by quite a bit.

The `DIFF_FROM_ITERATORS` macro basically removed the burden of having
to derive the options for both iterators from a pair of iterator flags
and the diff options. This patch introduces a new function that does
exactly that and refactors all callers to manage the iterators by
themselves.

As we potentially need to allocate a shared prefix for the
iterator, we need to tell the caller to free that prefix as soon as
the options aren't required anymore. Thus, the function has a
`char **prefix` out pointer that will get set to the allocated string
and subsequently be free'd by the caller.

While this patch increases the line count, I personally deem this to be an
acceptable tradeoff for increased readability.

committed Mar 26, 2020

ignore: correct handling of nested rules overriding wild card unignore · 99b89a9c

problem:
filesystem_iterator loads .gitignore files in top-down order.
subsequently, ignore module evaluates them in the order they are loaded.
this creates a problem if we have unignored a rule (using a wild card)
in a sub dir and ignored it again in a level further below (see the test
included in this patch).

solution:
process ignores in reverse order.

closes #4963
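
The effect of evaluating rules in reverse order can be sketched as follows. This is a strongly simplified model with hypothetical types (`struct rule`, `struct rule_file`, `is_ignored`): it assumes that all files apply to the path being checked and it uses POSIX fnmatch(3) for wildcard matching, which is not necessarily what the ignore module does.

```c
#include <fnmatch.h>
#include <stddef.h>

/* Hypothetical rule: a glob pattern, optionally negated ("!pattern"). */
struct rule { const char *pattern; int negate; };

/* Rules of one ignore file, in the order they appear in the file. */
struct rule_file { const struct rule *rules; size_t count; };

/*
 * Sketch: `files` is ordered top-down, as loaded. Walk it in reverse
 * so that the deepest file is consulted first; the first matching
 * rule decides. Returns 1 if ignored, 0 otherwise.
 */
static int is_ignored(const struct rule_file *files, size_t nfiles,
		      const char *name)
{
	size_t i, j;

	for (i = nfiles; i-- > 0; ) {
		for (j = files[i].count; j-- > 0; ) {
			const struct rule *r = &files[i].rules[j];
			if (fnmatch(r->pattern, name, 0) == 0)
				return !r->negate;
		}
	}
	return 0;
}
```

With a top-level `*.txt` rule, a `!*.txt` rule one level down and a plain rule for one file name below that, the deepest rule wins for that file while other `.txt` files stay unignored.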

committed Mar 26, 2020

apply: Test for EOFNL mishandling when several hunks are processed · 5e5a9cce
```
Introduce a unit test to validate that git_apply__patch() properly
handles EOFNL changes in case of patches with several hunks.
```
Max Kostyukevich committed Mar 26, 2020

apply: Fix a patch corruption related to EOFNL handling · 0126e3fc

Use of apply's API can lead to an improper patch application and a corruption
of the modified file.

The issue is caused by mishandling of end-of-file changes if there are
several hunks to apply. The newline character is added to a line from the
wrong hunk.

The solution is to modify apply_hunk() to add the newline character at the end
of a line from the correct hunk.

committed Mar 26, 2020

apply: free test data · ae9b333a
Edward Thomson committed Mar 26, 2020

apply: Test for git_apply_to_tree failures when new files are added · deda897a
```
Introduce a unit test to validate if git_apply_to_tree() fails when an
applied patch adds new files.
```
Max Kostyukevich committed Mar 26, 2020

apply: git_apply_to_tree fails to apply patches that add new files · d6e5c44f

git_apply_to_tree() cannot be used to apply patches with new files. An attempt
to apply such a patch fails because git_apply_to_tree() tries to remove a
non-existing file from an old index.

The solution is to modify git_apply_to_tree() to call git_index_remove() when
the patch states that the modified file is removed.

committed Mar 26, 2020

config: check if we are running in a sandboxed environment · 30cd1e1f

On macOS the $HOME environment variable returns the path to the sandbox container instead of the actual user $HOME for sandboxed apps. To get the correct path, we have to get it from the password file entry.

committed Mar 26, 2020

patch_parse: fix segfault due to line containing static contents · c159cceb

With commit dedf70ad (patch_parse: do not depend on parsed buffer's
lifetime, 2019-07-05), all lines of the patch are allocated with
`strdup` to make lifetime of the parsed patch independent of the buffer
that is currently being parsed. In patch b0893282 (patch_parse: ensure
valid patch output with EOFNL, 2019-07-11), we introduced another
code location where we add lines to the parsed patch. But as that one
was implemented via a separate pull request, it wasn't converted to use
`strdup`, as well. As a consequence, we generate a segfault when trying
to deallocate the potentially static buffer that's now in some of the
lines.

Use `git__strdup` to fix the issue.

committed Mar 26, 2020

patch_parse: ensure valid patch output with EOFNL · 16dbedc9
Erik Aigner committed Mar 26, 2020

patch_parse: handle missing newline indicator in old file · fe012c60

When either the old or new file contents have no newline at the end of
the file, then git-diff(1) will print out a "\ No newline at end of
file" indicator. While we do correctly handle this in the case where the
new file has this indicator, we fail to parse patches where the old file
is missing a newline at EOF.

Fix this bug by also handling missing newline indicators in the old file.
Add tests to verify that we can parse such files.

committed Mar 26, 2020

apply: refactor to use a switch statement · b8339912
Patrick Steinhardt committed Mar 26, 2020

diff: ignore EOFNL for computing patch IDs · ef1651e6

The patch ID is supposed to be mostly context-insignificant and
thus only includes added or deleted lines. As such, we shouldn't honor
end-of-file-without-newline markers in diffs.

Ignore such lines to fix how we compute the patch ID for such diffs.

committed Mar 26, 2020

patch_parse: do not depend on parsed buffer's lifetime · 782bc334

When parsing a patch from a buffer, we let the patch lines point into
the original buffer. While this is an efficient use of resources, this also
ties the lifetime of the parsed patch to the parsed buffer. As this
behaviour is not documented anywhere in our API it is very surprising to
its users.

Untie the lifetime by duplicating the lines into the parsed patch. Add a
test that verifies that lifetimes are indeed independent of each other.

committed Mar 26, 2020

ci: add flaky test re-execution on Windows · 7786d7e9

Our online tests are occasionally flaky since they hit real network
endpoints.  Re-run them up to 5 times if they fail, to allow us to
avoid having to fail the whole build.

committed Mar 26, 2020

ci: add flaky test re-execution on Unix · f8a09985

Our online tests are occasionally flaky since they hit real network
endpoints.  Re-run them up to 5 times if they fail, to allow us to
avoid having to fail the whole build.

committed Mar 26, 2020
