- 24 Jun, 2019 5 commits
-
-
Create a separate `git_hash_sha1_ctx` structure that is specific to the SHA1 implementation and move all SHA1 functions over to use that one instead of the generic `git_hash_ctx`. The `git_hash_ctx` for now simply has a union containing this single SHA1 implementation, only, without any mechanism to distinguish between different algortihms.
Patrick Steinhardt committed -
As a preparatory step to allow multiple hashing APIs to exist at the same time, split the hashing functions into one layer for generic hashing and one layer for SHA1-specific hashing. Right now, this is simply an additional indirection layer that doesn't yet serve any purpose. In the future, the generic API will be extended to allow for choosing which hash to use, though, by simply passing an enum to the hash context initialization function. This is necessary as a first step to be ready for Git's move to SHA256.
Patrick Steinhardt committed -
As we will include additional hash algorithms in the future due to upstream git discussing a move away from SHA1, we should accomodate for that and prepare for the move. As a first step, move all SHA1 implementations into a common subdirectory. Also, create a SHA1-specific header file that lives inside the hash folder. This header will contain the SHA1-specific header includes, function declarations and the SHA1 context structure.
Patrick Steinhardt committed -
The hash source files have circular include dependencies right now, which shows by our broken generic hash implementation. The "hash.h" header declares two functions and the `git_hash_ctx` typedef before actually including the hash backend header and can only declare the remaining hash functions after the include due to possibly static function declarations inside of the implementation includes. Let's break this cycle and help maintainability by creating a real implementation file for each of the hash implementations. Instead of relying on the exact include order, we now especially avoid the use of `GIT_INLINE` for function declarations.
Patrick Steinhardt committed -
The structure `git_hash_prov` is only ever used by the Win32 SHA1 backend. As such, it doesn't make much sense to expose it via the generic "hash.h" header, as it is an implementation detail of the Win32 backend only. Move the typedef of `git_hash_prov` into "hash/sha1/win32.h" to fix this.
Patrick Steinhardt committed
-
- 16 Jun, 2019 1 commit
-
-
Our enumeration values are not generally suffixed with `T`. Further, our enumeration names are generally more descriptive.
Edward Thomson committed
-
- 15 Jun, 2019 13 commits
-
-
The majority of functions are named `from_something` (with an underscore) instead of `fromsomething`. Update the tag function for consistency with the rest of the library.
Edward Thomson committed -
The majority of functions are named `from_something` (with an underscore) instead of `fromsomething`. Update the index functions for consistency with the rest of the library.
Edward Thomson committed -
We don't use double-underscores in the public API.
Edward Thomson committed -
The majority of functions are named `from_something` (with an underscore) instead of `fromsomething`. Update the blob functions for consistency with the rest of the library.
Edward Thomson committed -
The only function that is named `issomething` (without underscore) was `git_oid_iszero`. Rename it to `git_oid_is_zero` for consistency with the rest of the library.
Edward Thomson committed -
The `fnmatch` code has now been completely replaced by `wildmatch`, same as upstream git.git has been doing in 2014. Remove it.
Patrick Steinhardt committed -
Upstream git has converted to use `wildmatch` instead of `fnmatch`. Convert our gitattributes logic to use `wildmatch` as the last user of `fnmatch`. Please, don't expect I know what I'm doing here: the fnmatch parser is one of the most fun things to play around with as it has a sh*tload of weird cases. In all honesty, I'm simply relying on our tests that are by now rather comprehensive in that area. The conversion actually fixes compatibility with how git.git parser "**" patterns when the given path does not contain any directory separators. Previously, a pattern "**.foo" erroneously wouldn't match a file "x.foo", while git.git would match. Remove the new-unused LEADINGDIR/NOLEADINGDIR flags for `git_attr_fnmatch`.
Patrick Steinhardt committed -
We currently use `p_fnmatch` to compute whether a given "gitdir:" or "gitdir/i:" conditional matches the current configuration file path. As git.git has moved to use `wildmatch` instead of `p_fnmatch` throughout its complete codebase, we evaluate conditionals inconsistently with git.git in some special cases. Convert `p_fnmatch` to use `wildmatch`. The `FNM_LEADINGDIR` flag cannot be translated to `wildmatch`, but in fact git.git doesn't use it here either. And in fact, dropping it while we go increases compatibility with git.git.
Patrick Steinhardt committed -
When evaluating "gitdir:" and "gitdir/i:" conditionals, we currently compare the given pattern with the value of `git_repository_path`. Thing is though that `git_repository_path` returns the gitdir path with trailing '/', while we actually need to match against the gitdir without it. Fix this issue by stripping the trailing '/' previous to matching. Add various tests to ensure we get this right.
Patrick Steinhardt committed -
The function `do_match_gitdir` has some horribly named parameters and variables. Rename them to improve readability. Furthermore, fix a potentially undetected out-of-memory condition when appending "**" to the pattern.
Patrick Steinhardt committed -
Upstream git.git has converted its codebase to use wildcard in favor of fnmatch in commit 70a8fc999d (stop using fnmatch (either native or compat), 2014-02-15). To keep our own regex-matching in line with what git does, convert all trivial instances of `fnmatch` usage to use `wildcard`, instead. Trivial usage is defined to be use of `fnmatch` with either no flags or flags that have a 1:1 equivalent in wildmatch (PATHNAME, IGNORECASE).
Patrick Steinhardt committed -
We're about to phase out our bundled fnmatch implementation as git.git has moved to wildmatch long ago in 2014. To make it easier to spot which files are stilll using fnmatch, remove the implicit "fnmatch.h" include in "posix.h" and instead include it explicitly.
Patrick Steinhardt committed -
In commit 70a8fc999d (stop using fnmatch (either native or compat), 2014-02-15), upstream git has switched over all code from their internal fnmatch copy to its new wildmatch code. We haven't followed suit, and thus have developed some incompatibilities in how we match regular expressions. Import git's wildmatch from v2.22.0 and add a test suite based on their t3070-wildmatch.sh tests.
Patrick Steinhardt committed
-
- 14 Jun, 2019 4 commits
-
-
By now, we have repeatedly failed to provide a nice cross-platform implementation of `p_fallocate`. Recent tries to do that escalated quite fast to a set of different CMake checks, implementations, fallbacks, etc., which started to look real awkward to maintain. In fact, `p_fallocate` had only been introduced in commit 4e3949b7 (tests: test that largefiles can be read through the tree API, 2019-01-30) to support a test with large files, but given the maintenance costs it just seems not to be worht it. As we have removed the sole user of `p_fallocate` in the previous commit, let's drop it altogether.
Patrick Steinhardt committed -
The interactions between `USE_HTTPS` and `SHA1_BACKEND` have been streamlined. Previously we would have accepted not quite working configurations (like, `-DUSE_HTTPS=OFF -DSHA1_BACKEND=OpenSSL`) and, as the OpenSSL detection only ran with `USE_HTTPS`, the link would fail. The detection was moved to a new `USE_SHA1`, modeled after `USE_HTTPS`, which takes the values "CollisionDetection/Backend/Generic", to better match how the "hashing backend" is selected, the default (ON) being "CollisionDetection". Note that, as `SHA1_BACKEND` is still used internally, you might need to check what customization you're using it for.
Etienne Samson committed -
Edward Thomson committed
-
In libgit2 nomenclature, when we need to verb a direct object, we name a function `git_directobject_verb`. Thus, if we need to init an options structure named `git_foo_options`, then the name of the function that does that should be `git_foo_options_init`. The previous names of `git_foo_init_options` is close - it _sounds_ as if it's initializing the options of a `foo`, but in fact `git_foo_options` is its own noun that should be respected. Deprecate the old names; they'll now call directly to the new ones.
Edward Thomson committed
-
- 13 Jun, 2019 6 commits
-
-
Our bundled http-parser includes bugfixes, therefore we should prefer our http-parser until such time as we can identify that the system http-parser has these bugfixes (using a version check). Since these bugs are - at present - minor, retain the ability for users to force that they want to use the system http-parser anyway. This does change the cmake specification so that people _must_ opt-in to the new behavior knowingly.
Edward Thomson committed -
In our attributes pattern parsing code, we have a comment that states we might have to convert '\' characters to '/' to have proper POSIX paths. But in fact, '\' characters are valid inside the string and act as escape mechanism for various characters, which is why we never want to convert those to POSIX directory separators. Furthermore, gitignore patterns are specified to only treat '/' as directory separators. Remove the comment to avoid future confusion.
Patrick Steinhardt committed -
When determining the trailing space length, we need to honor whether spaces are escaped or not. Currently, we do not check whether the escape itself is escaped, though, which might generate an off-by-one in that case as we will simply treat the space as escaped. Fix this by checking whether the backslashes preceding the space are themselves escaped.
Patrick Steinhardt committed -
When parsing attribute patterns, we will eventually unescape the parsed pattern. This is required because we require custom escapes for whitespace characters, as normally they are used to terminate the current pattern. Thing is, we don't only unescape those whitespace characters, but in fact all escaped sequences. So for example if the pattern was "\*", we unescape that to "*". As this is directly passed to fnmatch(3) later, fnmatch would treat it as a simple glob matching all files where it should instead only match a file with name "*". Fix the issue by unescaping spaces, only. Add a bunch of tests to exercise escape parsing.
Patrick Steinhardt committed -
When parsing attributes, we need to search for the first unescaped whitespace character to determine where the pattern is to be cut off. The scan fails to account for the case where the escaping '\' character is itself escaped, though, and thus we would not recognize the cut-off point in patterns like "\\ ". Refactor the scanning loop to remember whether the last character was an escape character. If it was and the next character is a '\', too, then we will reset to non-escaped mode again. Thus, we now handle escaped whitespaces as well as escaped wildcards correctly.
Patrick Steinhardt committed -
Windows-based systems treat paths starting with '\' as absolute, either referring to the current drive's root (e.g. "\foo" might refer to "C:\foo") or to a network path (e.g. "\\host\foo"). On the other hand, (most?) systems that are not based on Win32 accept backslashes as valid characters that may be part of the filename, and thus we cannot treat them to identify absolute paths. Change the logic to only paths starting with '\' as absolute on the Win32 platform. Add tests to avoid regressions and document behaviour.
Patrick Steinhardt committed
-
- 11 Jun, 2019 1 commit
-
-
Update our copy of sha1dc to the upstream commit 855827c (Detect endianess on HP-UX, 2019-05-09). Changes include fixes to endian detection on AIX and HP-UX systems as well as a define that allows us to force aligned access, which we're not using yet.
Patrick Steinhardt committed
-
- 10 Jun, 2019 10 commits
-
-
When we send HTTP credentials but the server rejects them, tear down the authentication context so that we can start fresh. To maintain this state, additionally move all of the authentication handling into `on_auth_required`.
Edward Thomson committed -
When we're issuing a CONNECT to a proxy, we expect to keep-alive to the proxy. However, during authentication negotiations, the proxy may close the connection. Reconnect if the server closes the connection.
Edward Thomson committed -
When we have a keep-alive connection to the server, that server may legally drop the connection for any reason once a successful request and response has occurred. It's common for servers to drop the connection after some amount of time or number of requests have occurred.
Edward Thomson committed -
We stop the read loop when we have read all the data. We should also consider the server's feelings. If the server hangs up on us, we need to stop our read loop. Otherwise, we'll try to read from the server - and fail - ad infinitum.
Edward Thomson committed -
Instead of using `is_complete` to decide whether we have connection or request affinity for authentication mechanisms, set a boolean on the mechanism definition itself.
Edward Thomson committed -
For request-based authentication mechanisms (Basic, Digest) we should keep the authentication context alive across socket connections, since the authentication headers must be transmitted with every request. However, we should continue to remove authentication contexts for mechanisms with connection affinity (NTLM, Negotiate) since we need to reauthenticate for every socket connection.
Edward Thomson committed -
Hold an individual authentication context instead of trying to maintain all the contexts; we can select the preferred context during the initial negotiation. Subsequent authentication steps will re-use the chosen authentication (until such time as it's rejected) instead of trying to manage multiple contexts when all but one will never be used (since we can only authenticate with a single mechanism at a time.) Also, when we're given a 401 or 407 in the middle of challenge/response handling, short-circuit immediately without incrementing the retry count. The multi-step authentication is expected, and not a "retry" and should not be penalized as such. This means that we don't need to keep the contexts around and ensures that we do not unnecessarily fail for too many retries when we have challenge/response auth on a proxy and a server and potentially redirects in play as well.
Edward Thomson committed -
Edward Thomson committed
-
A "connection" to a server is transient, and we may reconnect to a server in the midst of authentication failures (if the remote indicates that we should, via `Connection: close`) or in a redirect.
Edward Thomson committed -
Edward Thomson committed
-