- 01 May, 2017 1 commit
-
-
Edward Thomson committed
-
- 28 Apr, 2017 4 commits
-
-
Verifying hashsums of objects we are reading from the ODB may be costly, as we have to perform an additional hashsum calculation on the object. Especially when reading large objects, the penalty can be as high as 35%, as can be seen when executing the equivalent of `git cat-file` with and without verification enabled. To mitigate this, we add a global option for libgit2 which enables the developer to turn off the verification, e.g. when they can be reasonably sure that the objects on disk won't be corrupted.
Patrick Steinhardt committed
-
The upstream git.git project verifies objects when looking them up from disk. This avoids scenarios where objects have somehow become corrupt on disk, e.g. due to hardware failures or bit flips. While our mantra is usually to follow upstream behavior, we do not do so in this case, as we never check hashes of objects we have just read from disk. To fix this, we create a new error class `GIT_EMISMATCH` which denotes that we have looked up an object with a hashsum mismatch. `odb_read_1` will then, after having read the object from its backend, hash the object and compare the resulting hash to the expected hash. If the hashes do not match, it will return an error. This obviously introduces another computation of checksums and could potentially impact performance. Note though that we usually perform I/O operations directly before doing this computation, and as such the actual overhead should be drowned out by I/O. Running our test suite seems to confirm this guess. On a Linux system with best-of-five timings, we had 21.592s with the check enabled and 21.590s with the check disabled. Note though that our test suite contains mostly very small blobs. Repositories with bigger blobs are expected to see a larger performance hit from this check. In addition to a new test, we also had to change the odb::backend::nonrefreshing test suite, which now triggers a hashsum mismatch when looking up the commit "deadbeef...". This is expected, as the fake backend allocated inside of the test will return an empty object for the OID "deadbeef...", which will obviously not hash back to "deadbeef..." again. We can simply adjust the hash to equal the hash of the empty object here to fix this test.
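The read-then-verify step described above could be sketched roughly as follows. This is not the code from `odb_read_1`: the names `sketch_hash`, `sketch_verify_object` and `SKETCH_EMISMATCH` are hypothetical, and a small FNV-1a hash stands in for the real object hash so that the example stays self-contained.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical error code for this sketch; not libgit2's actual value. */
#define SKETCH_EMISMATCH (-1)

/* FNV-1a, used here only as a small stand-in for the real object hash. */
static uint32_t sketch_hash(const unsigned char *data, size_t len)
{
    uint32_t h = 2166136261u;
    for (size_t i = 0; i < len; i++) {
        h ^= data[i];
        h *= 16777619u;
    }
    return h;
}

/*
 * Meant to run right after a backend has returned an object's contents:
 * rehash the data and compare it with the id the caller asked for.
 */
static int sketch_verify_object(uint32_t expected_id,
                                const unsigned char *data, size_t len)
{
    return sketch_hash(data, len) == expected_id ? 0 : SKETCH_EMISMATCH;
}
```

Flipping a single bit in the stored data changes the computed hash, so the lookup fails instead of silently returning corrupted contents.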
Patrick Steinhardt committed
-
We currently have no tests which check whether we fail reading corrupted objects. Add one which modifies contents of an object stored on disk and then tries to read the object.
Patrick Steinhardt committed
-
The object::lookup tests use the "testrepo.git" repository in a read-only way, so we do not set up the repository as a sandbox but simply open it. But in a future commit, we will want to test looking up objects which are corrupted in some way, which requires us to modify the on-disk data. Doing this in a repository without creating the sandbox would modify the contents of our libgit2 repository, though. Create the repository in a sandbox to avoid this.
Patrick Steinhardt committed
-
- 14 Nov, 2016 1 commit
-
-
We do not currently use the sorted version of this input in the function, which means we produce bad results.
Carlos Martín Nieto committed
-
- 09 Aug, 2016 1 commit
-
-
Patrick Steinhardt committed
-
- 20 Jun, 2016 1 commit
-
-
Patrick Steinhardt committed
-
- 24 May, 2016 1 commit
-
-
When we remove all entries in a tree, we should remove that tree from its parent rather than include the empty tree.
Carlos Martín Nieto committed
-
- 19 May, 2016 2 commits
-
-
Carlos Martín Nieto committed
-
This gives us trees with subdirectories, which the new test needs.
Carlos Martín Nieto committed
-
- 17 May, 2016 1 commit
-
-
Instead of going through the usual steps of reading a tree recursively into an index, modifying it and writing it back out as a tree, introduce a function to perform simple updates more efficiently. `git_tree_create_updated` avoids reading trees which are not modified and supports upsert and delete operations. It is not as versatile as modifying the index, but it makes some common operations much more efficient.
Carlos Martín Nieto committed
-
- 25 Apr, 2016 1 commit
-
-
While no extra header fields are defined for tags, git accepts them by ignoring them and continuing the search for the message. There are a few tags like this in the wild which git parses just fine, so we should do the same.
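A minimal sketch of the idea, assuming the parser has already consumed the headers it knows about and `rest` points just past them. The function name `sketch_tag_message` is hypothetical and this is not the library's parser; it only shows how unknown header lines can be skipped by searching for the blank line that ends the header.

```c
#include <stddef.h>
#include <string.h>

/*
 * `rest` points just past the last header this sketch recognizes.
 * Returns the start of the message, or NULL if no message is found.
 */
static const char *sketch_tag_message(const char *rest)
{
    if (*rest == '\0')
        return NULL;            /* tag without a message */

    if (*rest != '\n') {
        /* Unknown header lines follow: skip ahead to the blank line. */
        const char *sep = strstr(rest, "\n\n");
        if (sep == NULL)
            return NULL;        /* no blank line, so no message */
        rest = sep + 1;         /* now at the blank line's newline */
    }

    return rest + 1;            /* the message starts after the blank line */
}
```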
Carlos Martín Nieto committed
-
- 22 Mar, 2016 3 commits
-
-
The callback mechanism makes it awkward to write data from an IO source; move to `_fromstream()` which lets the caller remain in control, in the same vein as we prefer iterators over foreach callbacks.
Carlos Martín Nieto committed
-
By returning when the count goes to zero rather than below it, setting `howmany` to 7 in fact writes out the string 6 times. Correct the termination condition to write out the string the number of times we specify.
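The off-by-one can be reproduced with a small stand-alone sketch. The helper names and the `sketch_text` contents below are hypothetical and only mimic the shape of a chunk callback that decrements a counter before deciding whether to stop.

```c
#include <stddef.h>
#include <string.h>

static const char sketch_text[] = "chunk\n";

typedef int (*chunk_fn)(char *out, size_t max, int *howmany);

/* Buggy variant: stops as soon as the counter reaches zero. */
static int next_chunk_buggy(char *out, size_t max, int *howmany)
{
    (void)max;
    (*howmany)--;
    if (*howmany == 0)
        return 0;
    strcpy(out, sketch_text);
    return (int)strlen(sketch_text);
}

/* Corrected variant: stops only once the counter drops below zero. */
static int next_chunk_fixed(char *out, size_t max, int *howmany)
{
    (void)max;
    (*howmany)--;
    if (*howmany < 0)
        return 0;
    strcpy(out, sketch_text);
    return (int)strlen(sketch_text);
}

/* Calls `next` until it reports the end and counts the chunks produced. */
static int count_chunks(chunk_fn next, int howmany)
{
    char buf[64];
    int written = 0;
    while (next(buf, sizeof(buf), &howmany) > 0)
        written++;
    return written;
}
```

With `howmany` set to 7, the first variant produces 6 chunks and the second produces 7.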
Carlos Martín Nieto committed
-
The pair of `git_blob_create_frombuffer()` and `git_blob_create_frombuffer_commit()` is meant to replace `git_blob_create_fromchunks()` by providing a way for a user to write a new blob when they want filtering or they do not know the size. This approach allows the caller to retain control over when to add data to this buffer and is a more natural fit for higher-level languages' own stream abstractions, instead of having to handle IO wait in the callback. The in-memory buffer size of 2MB is chosen somewhat arbitrarily to be a round multiple of usual page sizes and a value where most blobs seem likely to be either well below or well above that size. This implementation re-uses the helper we have from `_fromchunks()` so we end up writing everything to disk, but hopefully more efficiently than with a default filebuf. A later optimisation can be to avoid writing the in-memory contents to disk, with some extra complexity.
Carlos Martín Nieto committed
-
- 20 Mar, 2016 1 commit
-
-
Instead of copying over the data into the individual entries, point to the originals, which are already in a format we can use.
Carlos Martín Nieto committed
-
- 04 Mar, 2016 1 commit
-
-
Submodules don't exist in the objectdb and the code is making us try to look for a blob with its commit id, which is obviously not going to work. Skip the test if the user wants to insert a submodule.
Carlos Martín Nieto committed
-
- 28 Feb, 2016 3 commits
-
-
Edward Thomson committed
-
Use legitimate (existing) object IDs in tests so that we have the ability to turn on strict object validation when running tests.
Edward Thomson committed
-
When `GIT_OPT_ENABLE_STRICT_OBJECT_CREATION` is turned on, validate the tree and parent ids given to treebuilder insertion.
Edward Thomson committed
-
- 28 May, 2015 1 commit
-
-
Edward Thomson committed
-
- 04 Jan, 2015 1 commit
-
-
Carlos Martín Nieto committed
-
- 27 Dec, 2014 1 commit
-
-
This function is a constructor, so let's name it like one and leave _create() for the reference functions, which do create/write the reference.
Carlos Martín Nieto committed
-
- 17 Dec, 2014 1 commit
-
-
Path validation may be influenced by `core.protectHFS` and `core.protectNTFS` configuration settings, thus treebuilders can take a repository to influence their configuration.
Edward Thomson committed
-
- 22 Nov, 2014 1 commit
-
-
There are some combinations of objects and target types which we know cannot be fulfilled. Return EINVALIDSPEC for those to signify that there is a mismatch between the user-provided data and what the object model is capable of satisfying. If we start at a tag and in the course of peeling find out that we cannot reach a particular type, we return EPEEL.
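An up-front check of this kind might look like the sketch below. The enum, the error constant and the exact set of rejected combinations are assumptions made for illustration, based on the object model (a commit points to a tree, while blobs and trees point to no commit or tag); they are not taken from the library's source.

```c
/* Hypothetical object types and error code for this sketch. */
enum sketch_type { SK_COMMIT, SK_TREE, SK_BLOB, SK_TAG };
#define SKETCH_EINVALIDSPEC (-1)

/*
 * Returns 0 if peeling `from` towards `target` could possibly succeed,
 * or SKETCH_EINVALIDSPEC if the combination can never be satisfied.
 */
static int sketch_check_peel(enum sketch_type from, enum sketch_type target)
{
    if (from == target)
        return 0;

    switch (from) {
    case SK_BLOB:
    case SK_TREE:
        /* A blob or tree cannot be peeled to any other type. */
        return SKETCH_EINVALIDSPEC;
    case SK_COMMIT:
        /* From a commit, only its tree is reachable by peeling. */
        return target == SK_TREE ? 0 : SKETCH_EINVALIDSPEC;
    case SK_TAG:
        /* A tag may point to anything; failure is only known while peeling. */
        return 0;
    }
    return SKETCH_EINVALIDSPEC;
}
```

The tag case passes the check because the outcome depends on what the tag actually points to, which is where a peel error would be reported instead.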
Carlos Martín Nieto committed
-
- 16 Sep, 2014 1 commit
-
-
Ciro Santilli committed
-
- 18 Aug, 2014 1 commit
-
-
The old `allocfmt` is of no use to callers, as they are not able to free the returned buffer. Export a new API that returns a static string that doesn't need to be freed.
Vicent Marti committed
-
- 01 Jul, 2014 1 commit
-
-
Edward Thomson committed
-
- 10 Jun, 2014 1 commit
-
-
Finding a filename in a vector means we need to re-sort it every time we want to read from it, which includes every time we want to write to it as well, as we want to find duplicate keys. A hash-map fits what we want to do much more accurately, as we do not care about sorting, but just the particular filename. We still keep removed entries around, as the interface lets you assume they were going to be around until the treebuilder is cleared or freed, but in this case that involves an append to a vector in the filter case, which can now fail. The only time we care about sorting is when we write out the tree, so let's make that the only time we do any sorting.
Carlos Martín Nieto committed
-
- 07 Jun, 2014 1 commit
-
-
Philip Kelley committed
-
- 18 May, 2014 1 commit
-
-
The comment char is configurable and we need to provide a way for the user to specify which comment char they chose for their message.
Carlos Martín Nieto committed
-
- 08 May, 2014 1 commit
-
-
This adds in missing calls to `git_buf_sanitize` and fixes a number of places where `git_buf` APIs could inadvertently write NUL terminator bytes into invalid buffers. This also changes the behavior of `git_buf_sanitize` to NUL terminate a buffer if it can and of `git_buf_shorten` to do nothing if it can. It also adds tests of the filtering code with a zeroed (i.e. unsanitized) buffer, which previously triggered a segfault.
Russell Belfer committed
-
- 06 May, 2014 1 commit
-
-
Diff and status do not want core.safecrlf to actually raise an error regardless of the setting, so this extends the filter API with an additional options flags parameter and adds a flag so that filters can be applied with GIT_FILTER_OPT_ALLOW_UNSAFE, indicating that unsafe filter application should be downgraded from a failure to a warning.
Russell Belfer committed
-
- 29 Apr, 2014 1 commit
-
-
The current versions of the commit creation and amend functions are unsafe to use when passing the update_ref parameter, as they do not check that the reference at the moment of update points to what the user expects. Make sure that we're moving history forward when we ask the library to update the reference for us by checking that the first parent of the new commit is the current value of the reference. We also make sure that the ref we're updating hasn't moved between the read and the write. Similarly, when amending a commit, make sure that the current tip of the branch is the commit we're amending.
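The core of the check can be illustrated with a simplified, single-threaded sketch. `struct sketch_ref`, `sketch_update_ref` and `SKETCH_EMODIFIED` are hypothetical names, ids are plain strings, and the real implementation would also have to guard against the reference moving between the read and the write.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical error code and reference type for this sketch. */
#define SKETCH_EMODIFIED (-1)

struct sketch_ref {
    char target[41];    /* hex id the reference currently points to */
};

/*
 * Move `ref` to `new_commit` only if the commit's first parent is what the
 * reference currently points to, i.e. only if history moves forward.
 */
static int sketch_update_ref(struct sketch_ref *ref,
                             const char *first_parent,
                             const char *new_commit)
{
    if (strcmp(ref->target, first_parent) != 0)
        return SKETCH_EMODIFIED;

    snprintf(ref->target, sizeof(ref->target), "%s", new_commit);
    return 0;
}
```

A second update that still names the old tip as its first parent is rejected and leaves the reference untouched.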
Carlos Martín Nieto committed
-
- 10 Mar, 2014 1 commit
-
-
Jiri Pospisil committed
-
- 05 Mar, 2014 1 commit
-
-
This finds a short id string that will unambiguously select the given object, starting with the core.abbrev length (usually 7) and growing until it is no longer ambiguous.
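The growing-prefix idea can be sketched over a plain array of hex strings. The function name `sketch_shortest_prefix` is hypothetical, and the real lookup goes through the object database rather than a list held in memory; the sketch assumes `others` does not contain the target id itself.

```c
#include <stddef.h>
#include <string.h>

/*
 * Returns the length of the shortest prefix of `target`, at least `min_len`
 * characters long, that no id in `others` shares.
 */
static size_t sketch_shortest_prefix(const char *target,
                                     const char *const *others, size_t n,
                                     size_t min_len)
{
    size_t full = strlen(target);
    size_t len = min_len < full ? min_len : full;

    while (len < full) {
        int ambiguous = 0;
        for (size_t i = 0; i < n; i++) {
            if (strncmp(target, others[i], len) == 0) {
                ambiguous = 1;
                break;
            }
        }
        if (!ambiguous)
            break;
        len++;          /* grow by one character and try again */
    }
    return len;
}
```

Starting at 7 characters, an id that shares its first 7 characters with another id needs at least 8 to be unambiguous.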
Russell Belfer committed
-
- 08 Feb, 2014 1 commit
-
-
This adds an API to amend an existing commit, basically a shorthand for creating a new commit filling in missing parameters from the values of an existing commit. As part of this, I also added a new "sys" API to create a commit using a callback to get the parents. This allowed me to rewrite all the other commit creation APIs so that temporary allocations are no longer needed.
Russell Belfer committed
-
- 27 Jan, 2014 1 commit
-
-
A lot of the tests were checking for overflow, which we don't have anymore, so we can remove them.
Carlos Martín Nieto committed
-
- 25 Jan, 2014 1 commit
-
-
This was not converted when we converted the rest, so do it now.
Carlos Martín Nieto committed
-