- 03 Jan, 2018 8 commits
-
-
hash: openssl: check return values of SHA1_* functions
Edward Thomson committed -
diff_generate: avoid excessive stats of .gitattribute files
Edward Thomson committed -
When generating a diff between two trees, for each file that is to be diffed we have to determine whether it shall be treated as a text or as a binary file. While git has heuristics to determine which kind of diff to generate, users can also override that default behaviour by setting or unsetting the 'diff' attribute for specific files. Because of that, we have to query gitattributes in order to determine how to diff the current files. Instead of hitting the '.gitattributes' file every time we need to query an attribute, which can get expensive especially on networked file systems, we try to cache them instead. This works perfectly fine for every '.gitattributes' file that is found, but we hit cache invalidation problems when we determine that an attributes file does _not_ exist. We do create an entry in the cache for missing '.gitattributes' files, but as soon as we hit that file again we invalidate it and stat it again to see if it has now appeared. In the case of diffing large trees with each other, this behaviour is very suboptimal. For each pair of files that is to be diffed, we will repeatedly query every directory component leading towards their respective location for an attributes file. This leads to thousands or even hundreds of thousands of wasted syscalls. The attributes cache already has a mechanism to help in that scenario in the form of the `git_attr_session`. As long as the same attributes session is still active, we will not try to re-query the gitattributes files at all but simply retain our currently cached results. To fix our problem, we can create a session at the top-most level, which is the initialization of the `git_diff` structure, and use it in order to look up the correct diff driver. As the `git_diff` structure is used to generate patches for multiple files at once, this neatly solves our problem by retaining the session until patches for all files have been generated.
The fix has been tested with linux.git by calling `git_diff_tree_to_tree` and `git_diff_to_buf` with v4.10^{tree} and v4.14^{tree}.
            | time    | .gitattributes stats
without fix | 33.201s | 844614
with fix    | 30.327s | 4441
While execution time only improved by roughly 10%, the stat(2) syscalls for .gitattributes files decreased by 99.5%. The benchmarks were quite simple, with best-of-three timings on Linux ext4 systems. One can assume that for network-based file systems the performance gain will be a lot larger due to the much higher latency.
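The effect of holding one session for the whole diff can be illustrated with a small, self-contained sketch. This is not libgit2 code: `attr_cache`, `attr_file_exists` and `fake_stat` are hypothetical stand-ins, the simulated filesystem never contains an attributes file, and found files are trusted without re-validation, which is a simplification.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for illustration; these are not libgit2 types. */
#define CACHE_SLOTS 8

struct attr_cache_entry {
	const char *path;  /* directory whose attributes file was looked up */
	bool exists;       /* false records a "known missing" file */
	unsigned session;  /* id of the session that last validated the entry */
};

struct attr_cache {
	struct attr_cache_entry entries[CACHE_SLOTS];
	size_t count;
	unsigned long stat_calls; /* number of simulated stat() calls */
};

/* Simulated filesystem check: no attributes file exists anywhere. */
static bool fake_stat(struct attr_cache *cache, const char *path)
{
	(void)path;
	cache->stat_calls++;
	return false;
}

/*
 * Report whether `path` has an attributes file. A session of 0 means
 * "no session": a cached miss is re-validated on every lookup. With a
 * non-zero session, a miss recorded in the same session is trusted.
 */
static bool attr_file_exists(struct attr_cache *cache, const char *path,
	unsigned session)
{
	struct attr_cache_entry *e = NULL;
	size_t i;

	for (i = 0; i < cache->count; i++) {
		if (strcmp(cache->entries[i].path, path) == 0) {
			e = &cache->entries[i];
			break;
		}
	}

	if (e && e->exists)
		return true; /* found files are trusted as-is in this sketch */
	if (e && session != 0 && e->session == session)
		return false; /* cached miss, same session: no stat needed */

	if (!e) {
		if (cache->count == CACHE_SLOTS)
			return fake_stat(cache, path); /* cache full: do not record */
		e = &cache->entries[cache->count++];
		e->path = path;
	}

	e->exists = fake_stat(cache, path);
	e->session = session;
	return e->exists;
}
```

In this sketch, looking up the same directory 100 times without a session performs 100 simulated stats, while the same lookups under one session perform a single stat.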
Patrick Steinhardt committed -
cmake: create a dummy file for Xcode
Patrick Steinhardt committed -
The function `ERR_error_string` can be invoked without providing a buffer, in which case OpenSSL will simply return a string printed into a static buffer. Obviously and as documented in ERR_error_string(3), this is not thread-safe at all. As libgit2 is a library, though, it is easily possible that other threads may be using OpenSSL at the same time, which might lead to clobbered error strings. Fix the issue by instead using a stack-allocated buffer. According to the documentation, the caller has to provide a buffer of at least 256 bytes. While we do so, make sure that the buffer can never overflow by switching to `ERR_error_string_n`, which takes the buffer's size.
Patrick Steinhardt committed -
The OpenSSL functions `SHA1_Init`, `SHA1_Update` and `SHA1_Final` all return 1 for success and 0 otherwise, but we never check their return values. Do so.
Patrick Steinhardt committed -
docs: git_treebuilder_insert validates entries
Patrick Steinhardt committed -
tree: standard error messages are lowercase
Patrick Steinhardt committed
-
- 01 Jan, 2018 1 commit
-
-
winhttp: properly support ntlm and negotiate
Edward Thomson committed
-
- 31 Dec, 2017 2 commits
-
-
Our standard error messages begin with a lower case letter so that they can be prefixed or embedded nicely. These error messages were missed during the standardization pass since they use the `tree_error` helper function.
Edward Thomson committed -
The documentation for `git_treebuilder_insert` erroneously states that we do not validate that the entry being inserted exists. We do, as of https://github.com/libgit2/libgit2/pull/3633. Update the documentation to reflect the new reality.
Edward Thomson committed
-
- 30 Dec, 2017 6 commits
-
-
Support using notes via a commit rather than a ref
Edward Thomson committed -
Transfer fewer objects on push and local fetch
Edward Thomson committed -
refs: traverse symlinked directories
Edward Thomson committed -
Inflate large loose blobs
Edward Thomson committed -
Ensure that we can recurse into directories via symbolic links.
Edward Thomson committed -
Perform some error checking when examining symlink directories.
Edward Thomson committed
-
- 29 Dec, 2017 2 commits
-
-
Native Git allows symlinked directories under .git/refs. This change allows libgit2 to also look for references that live under symlinked directories. Signed-off-by: Andy Doan <andy@opensourcefoundries.com>
Andy Doan committed -
When parsing unauthorized responses, properly parse headers looking for both NTLM and Negotiate challenges. Set the HTTP credentials to default credentials (using a `NULL` username and password) with the schemes supported by ourselves and the server.
Edward Thomson committed
-
- 28 Dec, 2017 1 commit
-
-
FETCH_HEAD and multiple refspecs
Edward Thomson committed
-
- 26 Dec, 2017 4 commits
-
-
Carlos Martín Nieto committed
-
We treat each refspec on its own, but the code currently overwrites the contents of FETCH_HEAD so we end up with the entries for the last refspec we processed. Instead, truncate it before performing the updates and append to it when updating the references.
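The truncate-once, append-per-refspec pattern described above can be sketched with plain stdio. The helper names `fetchhead_truncate` and `fetchhead_append` are hypothetical and the entry format is simplified to one opaque line per entry; libgit2's actual FETCH_HEAD writer is not shown here.

```c
#include <stdio.h>

/* Empty the file once, before any refspec is processed. */
static int fetchhead_truncate(const char *path)
{
	FILE *fp = fopen(path, "w"); /* "w" truncates or creates the file */

	if (!fp)
		return -1;
	return fclose(fp) == 0 ? 0 : -1;
}

/* Append one entry; entries written for earlier refspecs are kept. */
static int fetchhead_append(const char *path, const char *line)
{
	FILE *fp = fopen(path, "a"); /* "a" always writes at the end */
	int error = 0;

	if (!fp)
		return -1;
	if (fprintf(fp, "%s\n", line) < 0)
		error = -1;
	if (fclose(fp) != 0)
		error = -1;
	return error;
}
```

Calling `fetchhead_truncate` once and then `fetchhead_append` for each refspec leaves one entry per refspec in the file, instead of only the entries of the last refspec processed.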
Carlos Martín Nieto committed -
We want to do this in order to get FETCH_HEAD to be empty when we start updating it due to fetching from the remote.
Carlos Martín Nieto committed -
Carlos Martín Nieto committed
-
- 23 Dec, 2017 7 commits
-
-
patch_parse: fix parsing unquoted filenames with spaces
Edward Thomson committed -
Fix unpack double free
Edward Thomson committed -
If an element has been cached, but then the call to packfile_unpack_compressed() fails, its data is freed right away but the element is not removed from the cache, which later frees the data again. This change sets obj->data to NULL to avoid the double-free. It also stops trying to resolve deltas after two consecutive failed rounds of resolution, and adds a test for this.
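A minimal sketch of the pointer-clearing idiom follows. The `raw_object` struct and both functions are hypothetical and only mimic the shape of the problem; they are not the packfile code itself.

```c
#include <stdlib.h>

/* Hypothetical object holder, standing in for a cached element. */
struct raw_object {
	void *data;
	size_t len;
};

/* Generic cleanup, as a cache might run when it drops an element. */
static void raw_object_dispose(struct raw_object *obj)
{
	free(obj->data); /* free(NULL) is a no-op, so a cleared pointer is safe */
	obj->data = NULL;
	obj->len = 0;
}

/*
 * Simulated unpack step. On failure the buffer is released and the
 * pointer is cleared, so a later raw_object_dispose() cannot free the
 * same allocation a second time.
 */
static int unpack_or_fail(struct raw_object *obj, int simulate_failure)
{
	obj->data = malloc(16);
	if (!obj->data)
		return -1;
	obj->len = 16;

	if (simulate_failure) {
		free(obj->data);
		obj->data = NULL; /* without this line, dispose would double-free */
		obj->len = 0;
		return -1;
	}

	return 0;
}
```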
lhchavez committed -
Free OpenSSL peer certificate
Edward Thomson committed -
libFuzzer: Prevent a potential shift overflow
Edward Thomson committed -
cmake: let USE_ICONV be optional on macOS
Edward Thomson committed -
Do not attempt to check out submodule as blob when merging a submodule modify/delete conflict
Edward Thomson committed
-
- 20 Dec, 2017 9 commits
-
-
Writing very large files may be slow, particularly on inefficient filesystems and when running instrumented code to detect invalid memory accesses (e.g. within valgrind or similar tools). Introduce `GITTEST_SLOW` so that tests that are slow can be skipped by the CI system.
Edward Thomson committed -
Teach the CommonCrypto hash mechanisms to support large files. The hash primitives take a `CC_LONG` (aka `uint32_t`) at a time. So loop, giving the hash function at most an unsigned 32-bit integer's worth of bytes at a time, until we have hashed the entire file.
Edward Thomson committed -
Teach the win32 hash mechanisms to support large files. The hash primitives take at most `ULONG_MAX` bytes at a time. Loop, giving the hash function the maximum supported number of bytes, until we have hashed the entire file.
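The chunking loop that both hash backends need can be sketched without any platform API. `toy_hash` and `toy_update` are hypothetical stand-ins for a primitive whose length parameter is 32 bits wide; `max_chunk` is a parameter only so that the loop can be exercised with small buffers (a real caller would pass `UINT32_MAX` or `ULONG_MAX`).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical hash state; the "hash" is a toy rolling sum. */
struct toy_hash {
	uint64_t sum;
	unsigned long calls; /* how often the primitive was invoked */
};

/* Stand-in for a primitive that accepts at most a 32-bit length. */
static void toy_update(struct toy_hash *h, const unsigned char *data,
	uint32_t len)
{
	uint32_t i;

	for (i = 0; i < len; i++)
		h->sum = h->sum * 31 + data[i];
	h->calls++;
}

/* Feed a size_t-sized buffer to the primitive in bounded chunks. */
static void toy_update_large(struct toy_hash *h, const void *data,
	size_t len, uint32_t max_chunk)
{
	const unsigned char *p = data;

	if (max_chunk == 0)
		return; /* a zero chunk size would never make progress */

	while (len > 0) {
		uint32_t chunk = len > max_chunk ? max_chunk : (uint32_t)len;

		toy_update(h, p, chunk);
		p += chunk;
		len -= chunk;
	}
}
```

Because the state carries over between calls, hashing in chunks yields the same result as one call over the whole buffer.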
Edward Thomson committed -
Check the size of objects being read from the loose odb backend and reject those that would not fit in memory with an error message that reflects the actual problem, instead of error'ing later with an unintuitive error message regarding truncation or invalid hashes.
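A size check of this kind might look like the sketch below. The function name is hypothetical, and reserving one extra byte for a trailing NUL is an assumption made for illustration, not something the commit message states.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: convert an on-disk object size to size_t, or
 * fail early if it cannot be represented in memory. One byte is kept
 * in reserve for a trailing NUL (an assumption of this sketch).
 */
static int object_size_fits_in_memory(uint64_t object_size, size_t *out)
{
	if (object_size > (uint64_t)SIZE_MAX - 1)
		return -1; /* caller can report "object too large" here */

	*out = (size_t)object_size;
	return 0;
}
```

Failing at this point lets the caller report the real cause instead of a later truncation or hash mismatch.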
Edward Thomson committed -
Instead of paging to zlib in INT_MAX sized chunks, we can give it as many as UINT_MAX bytes at a time. zlib doesn't care how big a buffer we give it; this simply results in fewer calls into zlib.
Edward Thomson committed -
zlib will only inflate/deflate an `int`'s worth of data at a time. We need to loop through large files in order to ensure that we inflate the entire file, not just an `int`'s worth of data. Thankfully, we already have this loop in our `git_zstream` layer. Handle large objects using the `git_zstream`.
Edward Thomson committed -
Introduce an internal API to get the object type based on a length-specified (not null terminated) string representation. This can be used to compare the (space terminated) object type name in a loose object. Reimplement `git_object_string2type` based on this API.
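As a rough illustration of a length-specified lookup, the sketch below matches a type name that is terminated by a space rather than a NUL, as in a loose object header such as `blob 12`. The `toy_` names and enum values are hypothetical and do not mirror libgit2's internal API.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical object types; values are illustrative only. */
enum toy_otype {
	TOY_BAD = -1,
	TOY_COMMIT = 1,
	TOY_TREE = 2,
	TOY_BLOB = 3,
	TOY_TAG = 4
};

/* Look up a type from `len` bytes of `str`; no NUL terminator needed. */
static enum toy_otype toy_string2type_n(const char *str, size_t len)
{
	static const struct {
		const char *name;
		enum toy_otype type;
	} table[] = {
		{ "commit", TOY_COMMIT },
		{ "tree", TOY_TREE },
		{ "blob", TOY_BLOB },
		{ "tag", TOY_TAG }
	};
	size_t i;

	if (!str || len == 0)
		return TOY_BAD;

	for (i = 0; i < sizeof(table) / sizeof(table[0]); i++) {
		/* Require an exact length match so "blo" or "blobby" fail. */
		if (strlen(table[i].name) == len &&
		    memcmp(str, table[i].name, len) == 0)
			return table[i].type;
	}

	return TOY_BAD;
}

/* The NUL-terminated variant becomes a thin wrapper. */
static enum toy_otype toy_string2type(const char *str)
{
	return str ? toy_string2type_n(str, strlen(str)) : TOY_BAD;
}
```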
Edward Thomson committed -
Introduce a test for very large objects in the ODB. Write a large object (5 GB) and ensure that the write succeeds and provides us the expected object ID. Introduce a test that writes that file and ensures that we can subsequently read it.
Edward Thomson committed -
Introduce `git__prefixncmp` that will search up to the first `n` characters of a string to see if it is prefixed by another string. This is useful for examining if a non-null terminated character array is prefixed by a particular substring. Consolidate the various implementations of `git__prefixcmp` around a single core implementation and add some test cases to validate its behavior.
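A length-bounded prefix comparison could be sketched as follows. `toy_prefixncmp` is a hypothetical name, and its return convention (0 when the prefix matches, otherwise a non-zero difference) is an assumption of this sketch rather than a statement about libgit2's implementation.

```c
#include <stddef.h>

/*
 * Compare at most `str_n` bytes of `str` against the NUL-terminated
 * `prefix`. Returns 0 if those bytes start with `prefix`, a non-zero
 * value otherwise. `str` itself does not need to be NUL-terminated.
 */
static int toy_prefixncmp(const char *str, size_t str_n, const char *prefix)
{
	while (str_n--) {
		unsigned char p = (unsigned char)*prefix++;
		unsigned char s;

		if (!p)
			return 0; /* whole prefix matched */

		s = (unsigned char)*str++;
		if (s != p)
			return (int)s - (int)p;
	}

	/* Ran out of input: a match only if the prefix is exhausted too. */
	return 0 - (int)(unsigned char)*prefix;
}
```

For example, the seven bytes of `blob 12` are prefixed by `blob`, but their first three bytes alone are not.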
Edward Thomson committed
-