- 12 Oct, 2018 1 commit
-
-
The `git_odb_stream` members `declared_size` and `received_bytes` are both of the type `git_off_t`, which we usually defined to be a 64 bit signed integer. Thus, passing these members to "PRIdZ" formatters is not correct, as they are not guaranteed to accept big enough numbers. Instead, use the "PRId64" formatter, which is able to represent 64 bit signed integers. (cherry picked from commit 0fcd0563)
Patrick Steinhardt committed
-
- 23 Mar, 2018 1 commit
-
-
In commit 7ec7aa4a (odb: assert on logic errors when writing objects, 2018-02-01), the check for whether we are trying to overflowing the fake stream buffer was changed from returning an error to raising an assert. The conversion forgot though that the logic around `assert`s are basically inverted. Previously, if the statement stream->written + len > steram->size evaluated to true, we would return a `-1`. Now we are asserting that this statement is true, and in case it is not we will raise an error. So the conversion to the `assert` in fact changed the behaviour to the complete opposite intention. Fix the assert by inverting its condition again and add a regression test.
Patrick Steinhardt committed
-
- 09 Feb, 2018 7 commits
-
-
Patrick Steinhardt committed
-
Return an error to the caller when we can't create an object header for some reason (printf failure) instead of simply asserting.
Edward Thomson committed -
There's no recovery possible if we're so confused or corrupted that we're trying to overwrite our memory. Simply assert.
Edward Thomson committed -
Edward Thomson committed
-
Provide error messages on hash failures: assert when given invalid input instead of failing with a user error; provide error messages on program errors.
Edward Thomson committed -
It's unlikely that we'll fail to allocate a single byte, but let's check for allocation failures for good measure. Untangle `-1` being a marker of not having found the hardcoded odb object; use that to reflect actual errors.
Edward Thomson committed -
At the moment, we're swallowing the allocation failure. We need to return the error to the caller.
Edward Thomson committed
-
- 02 Feb, 2018 1 commit
-
-
The streaming read functionality should provide the length and the type of the object, like the normal read functionality does.
Edward Thomson committed
-
- 26 Jan, 2018 1 commit
-
-
The null OID (hash with all zeroes) indicates a missing object in upstream git and is thus not a valid object ID. Add defensive measurements to avoid writing such a hash to the object database in the very unlikely case where some data results in the null OID. Furthermore, add shortcuts when reading the null OID from the ODB to avoid ever returning an object when a faulty repository may contain the null OID.
Patrick Steinhardt committed
-
- 03 Jul, 2017 1 commit
-
-
Next to including several files, our "common.h" header also declares various macros which are then used throughout the project. As such, we have to make sure to always include this file first in all implementation files. Otherwise, we might encounter problems or even silent behavioural differences due to macros or defines not being defined as they should be. So in fact, our header and implementation files should make sure to always include "common.h" first. This commit does so by establishing a common include pattern. Header files inside of "src" will now always include "common.h" as its first other file, separated by a newline from all the other includes to make it stand out as special. There are two cases for the implementation files. If they do have a matching header file, they will always include this one first, leading to "common.h" being transitively included as first file. If they do not have a matching header file, they instead include "common.h" as first file themselves. This fixes the outlined problems and will become our standard practice for header and source files inside of the "src/" from now on.
Patrick Steinhardt committed
-
- 12 Jun, 2017 1 commit
-
-
When looking for an object by prefix, we query all the backends so that we can ensure that there is no ambiguity. We need to reset the `error` value between backends; otherwise the first backend may find an object by prefix, but subsequent backends may not. If we do not reset the `error` value then it will remain at `GIT_ENOTFOUND` and `read_prefix_1` will fail, despite having actually found an object.
Edward Thomson committed
-
- 15 May, 2017 2 commits
-
-
The fields `declared_size` and `received_bytes` of the `git_odb_stream` are both of type `git_off_t` which is defined as a signed integer. When passing these values to a printf-style string in `git_odb_stream__invalid_length`, though, we format these as PRIuZ, which is unsigned. Fix the issue by using PRIdZ instead, silencing warnings on macOS.
Patrick Steinhardt committed -
The `error` variable is used as a return value in the out-section of both `odb_read_1` and `read_prefix_1`. While the value will actually always be initialized inside of this section, GCC fails to realize this due to interactions with the `found` variable: if `found` is set, the error will always be initialized. If it is not, we return early without reaching the out-statements. Shut up the warnings by initializing the error variable, even though it is unnecessary.
Patrick Steinhardt committed
-
- 28 Apr, 2017 4 commits
-
-
While the function reading an object from the complete OID already verifies OIDs, we do not yet do so for reading objects from a partial OID. Do so when strict OID verification is enabled.
Patrick Steinhardt committed -
The read_prefix_1 function has several return statements springled throughout the code. As we have to free memory upon getting an error, the free code has to be repeated at every single retrun -- which it is not, so we have a memory leak here. Refactor the code to use the typical `goto out` pattern, which will free data when an error has occurred. While we're at it, we can also improve the error message thrown when multiple ambiguous prefixes are found. It will now include the colliding prefixes.
Patrick Steinhardt committed -
Verifying hashsums of objects we are reading from the ODB may be costly as we have to perform an additional hashsum calculation on the object. Especially when reading large objects, the penalty can be as high as 35%, as can be seen when executing the equivalent of `git cat-file` with and without verification enabled. To mitigate for this, we add a global option for libgit2 which enables the developer to turn off the verification, e.g. when he can be reasonably sure that the objects on disk won't be corrupted.
Patrick Steinhardt committed -
The upstream git.git project verifies objects when looking them up from disk. This avoids scenarios where objects have somehow become corrupt on disk, e.g. due to hardware failures or bit flips. While our mantra is usually to follow upstream behavior, we do not do so in this case, as we never check hashes of objects we have just read from disk. To fix this, we create a new error class `GIT_EMISMATCH` which denotes that we have looked up an object with a hashsum mismatch. `odb_read_1` will then, after having read the object from its backend, hash the object and compare the resulting hash to the expected hash. If hashes do not match, it will return an error. This obviously introduces another computation of checksums and could potentially impact performance. Note though that we usually perform I/O operations directly before doing this computation, and as such the actual overhead should be drowned out by I/O. Running our test suite seems to confirm this guess. On a Linux system with best-of-five timings, we had 21.592s with the check enabled and 21.590s with the ckeck disabled. Note though that our test suite mostly contains very small blobs only. It is expected that repositories with bigger blobs may notice an increased hit by this check. In addition to a new test, we also had to change the odb::backend::nonrefreshing test suite, which now triggers a hashsum mismatch when looking up the commit "deadbeef...". This is expected, as the fake backend allocated inside of the test will return an empty object for the OID "deadbeef...", which will obviously not hash back to "deadbeef..." again. We can simply adjust the hash to equal the hash of the empty object here to fix this test.
Patrick Steinhardt committed
-
- 03 Mar, 2017 1 commit
-
-
Freshen the tree object that a commit points to during commit time.
Edward Thomson committed
-
- 02 Mar, 2017 1 commit
-
-
Edward Thomson committed
-
- 29 Dec, 2016 1 commit
-
-
Error messages should be sentence fragments, and therefore: 1. Should not begin with a capital letter, 2. Should not conclude with punctuation, and 3. Should not end a sentence and begin a new one
Edward Thomson committed
-
- 14 Nov, 2016 1 commit
-
-
Patrick Steinhardt committed
-
- 05 Aug, 2016 1 commit
-
-
Only provide the empty tree internally, which matches git's behavior. If we provide the empty blob then any users trying to write it with libgit2 would omit it from actually landing in the odb, which appear to git proper as a broken repository (missing that object).
Edward Thomson committed
-
- 04 Aug, 2016 1 commit
-
-
When writing an object, we calculate its OID and see if it exists in the object database. If it does, we need to freshen the file that contains it.
Edward Thomson committed
-
- 20 Jun, 2016 1 commit
-
-
Sim Domingo committed
-
- 26 May, 2016 1 commit
-
-
Move the delta application functions into `delta.c`, next to the similar delta creation functions. Make the `git__delta_apply` functions adhere to other naming and parameter style within the library.
Edward Thomson committed
-
- 09 Mar, 2016 4 commits
-
-
Vicent Marti committed
-
Vicent Marti committed
-
Vicent Marti committed
-
The old implementation had two issues: 1. OIDs that were too short as to be ambiguous were not being handled properly. 2. If the last OID to expand in the array was missing from the ODB, we would leak a `GIT_ENOTFOUND` error code from the function.
Vicent Marti committed
-
- 08 Mar, 2016 2 commits
-
-
Take (and write to) an array of a struct, `git_odb_expand_id`.
Edward Thomson committed -
Edward Thomson committed
-
- 07 Mar, 2016 2 commits
-
-
Query the object database for multiple objects at a time, given their object ID (which may be abbreviated) and optional type.
Edward Thomson committed -
When looking up an abbreviated oid, show the actual (abbreviated) oid the caller passed instead of a full (but ambiguously truncated) oid.
Edward Thomson committed
-
- 14 Oct, 2015 2 commits
-
-
For most real use cases, repositories with alternates use them as main object storage. Checking the alternate for objects before the main repository should result in measurable speedups. Because of this, we're changing the sorting algorithm to prioritize alternates *in cases where two backends have the same priority*. This means that the pack backend for the alternate will be checked before the pack backend for the main repository *but* both of them will be checked before any loose backends.
Vicent Marti committed -
In the current implementation of ODB backends, each backend is tasked with refreshing itself after a failed lookup. This is standard Git behavior: we want to e.g. reload the packfiles on disk in case they have changed and that's the reason we can't find the object we're looking for. This behavior, however, becomes pathological in repositories where multiple alternates have been loaded. Given that each alternate counts as a separate backend, a miss in the main repository (which can potentially be very frequent in cases where object storage comes from the alternate) will result in refreshing all its packfiles before we move on to the alternate backend where the object will most likely be found. To fix this, the code in `odb.c` has been refactored as to perform the refresh of all the backends externally, once we've verified that the object is nowhere to be found. If the refresh is successful, we then perform the lookup sequentially through all the backends, skipping the ones that we know for sure weren't refreshed (because they have no refresh API). The on-disk pack backend has been adjusted accordingly: it no longer performs refreshes internally.
Vicent Marti committed
-
- 30 Sep, 2015 1 commit
-
-
As refdb and odb backends can be allocated by client code, libgit2 can’t know whether an alternative memory allocator was used, and thus should not try to call `git__free` on those objects. Instead, odb and refdb backend implementations must always provide their own `free` functions to ensure memory gets freed correctly.
Arthur Schreiber committed
-
- 29 Jun, 2015 1 commit
-
-
Edward Thomson committed
-
- 02 Jun, 2015 1 commit
-
-
Pierre-Olivier Latour committed
-