1. 01 May, 2017 1 commit
  2. 28 Apr, 2017 4 commits
    • odb: add option to turn off hash verification · 35079f50
      Verifying hashsums of objects we are reading from the ODB may be costly
      as we have to perform an additional hashsum calculation on the object.
      Especially when reading large objects, the penalty can be as high as
      35%, as can be seen when executing the equivalent of `git cat-file` with
      and without verification enabled. To mitigate this, we add a global
      option for libgit2 which enables the developer to turn off the
      verification, e.g. when they can be reasonably sure that the objects on
      disk won't be corrupted.
      Patrick Steinhardt committed
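
The idea of a process-wide switch that skips the extra hashing pass can be sketched as follows. This is a self-contained model under stated assumptions, not libgit2's code: `toy_set_hash_verification`, `toy_checksum`, `toy_read`, and the `OPT_*` codes are hypothetical names, and the additive checksum only stands in for the real object hash.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical result codes for this sketch. */
enum { OPT_OK = 0, OPT_EMISMATCH = -1 };

/* Hypothetical global switch; verification stays on unless turned off. */
static bool toy_verify_hashes = true;

void toy_set_hash_verification(bool enabled)
{
    toy_verify_hashes = enabled;
}

/* Trivial additive checksum, standing in for the real object hash. */
unsigned toy_checksum(const unsigned char *data, size_t len)
{
    unsigned sum = 0;
    for (size_t i = 0; i < len; i++)
        sum += data[i];
    return sum;
}

/* Skip the extra hashing pass entirely when verification is disabled. */
int toy_read(const unsigned char *data, size_t len, unsigned expected)
{
    assert(data != NULL || len == 0);

    if (!toy_verify_hashes)
        return OPT_OK;
    return toy_checksum(data, len) == expected ? OPT_OK : OPT_EMISMATCH;
}
```

With the switch off, a corrupted object is returned without complaint, which is the trade-off the commit message describes.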
    • odb: verify object hashes · 28a0741f
      The upstream git.git project verifies objects when looking them up from
      disk. This avoids scenarios where objects have somehow become corrupt on
      disk, e.g. due to hardware failures or bit flips. While our mantra is
      usually to follow upstream behavior, we do not do so in this case, as we
      never check hashes of objects we have just read from disk.
      
      To fix this, we create a new error class `GIT_EMISMATCH` which denotes
      that we have looked up an object with a hashsum mismatch. `odb_read_1`
      will then, after having read the object from its backend, hash the
      object and compare the resulting hash to the expected hash. If hashes do
      not match, it will return an error.
      
      This obviously introduces another computation of checksums and could
      potentially impact performance. Note though that we usually perform I/O
      operations directly before doing this computation, and as such the
      actual overhead should be drowned out by I/O. Running our test suite
      seems to confirm this guess. On a Linux system with best-of-five
      timings, we had 21.592s with the check enabled and 21.590s with the
      check disabled. Note though that our test suite mostly contains very
      small blobs only. It is expected that repositories with bigger blobs may
      see a larger slowdown from this check.
      
      In addition to a new test, we also had to change the
      odb::backend::nonrefreshing test suite, which now triggers a hashsum
      mismatch when looking up the commit "deadbeef...". This is expected, as
      the fake backend allocated inside of the test will return an empty
      object for the OID "deadbeef...", which will obviously not hash back to
      "deadbeef..." again. We can simply adjust the hash to equal the hash of
      the empty object here to fix this test.
      Patrick Steinhardt committed
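
The read-then-verify flow described in this commit can be sketched in a few lines. This is a minimal model, not the actual `odb_read_1` implementation: `toy_object`, `toy_hash`, `toy_read_verified`, and `TOY_EMISMATCH` are hypothetical names, and FNV-1a is used purely as a cheap stand-in for the real object hash.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result codes; the commit's real error class is GIT_EMISMATCH. */
enum { TOY_OK = 0, TOY_EMISMATCH = -1 };

/* An object as returned by a backend: raw bytes plus their length. */
typedef struct {
    const unsigned char *data;
    size_t len;
} toy_object;

/* FNV-1a, used here only as a stand-in for the real object hash. */
uint32_t toy_hash(const unsigned char *data, size_t len)
{
    uint32_t h = 2166136261u;
    for (size_t i = 0; i < len; i++) {
        h ^= data[i];
        h *= 16777619u;
    }
    return h;
}

/* Re-hash what the backend returned and compare it to the requested ID. */
int toy_read_verified(const toy_object *obj, uint32_t expected_id)
{
    assert(obj != NULL);

    if (toy_hash(obj->data, obj->len) != expected_id)
        return TOY_EMISMATCH;
    return TOY_OK;
}
```

The same model shows why the nonrefreshing test had to change: an empty object never hashes to an arbitrary ID such as "deadbeef...", so the lookup reports a mismatch.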
    • tests: object: test looking up corrupted objects · d59dabe5
      We currently have no tests which check whether we fail reading corrupted
      objects. Add one which modifies contents of an object stored on disk and
      then tries to read the object.
      Patrick Steinhardt committed
    • tests: object: create sandbox · 86c03552
      The object::lookup tests do use the "testrepo.git" repository in a
      read-only way, so we do not set up the repository as a sandbox but
      simply open it. But in a future commit, we will want to test looking up
      objects which are corrupted in some way, which requires us to modify the
      on-disk data. Doing this in a repository without creating the sandbox
      will modify contents of our libgit2 repository, though.
      
      Create the repository in a sandbox to avoid this.
      Patrick Steinhardt committed
  3. 14 Nov, 2016 1 commit
  4. 09 Aug, 2016 1 commit
  5. 20 Jun, 2016 1 commit
  6. 24 May, 2016 1 commit
  7. 19 May, 2016 2 commits
  8. 17 May, 2016 1 commit
    • Introduce a function to create a tree based on a different one · 9464f9eb
      Instead of going through the usual steps of reading a tree recursively
      into an index, modifying it and writing it back out as a tree, introduce
      a function to perform simple updates more efficiently.
      
      `git_tree_create_updated` avoids reading trees which are not modified
      and supports upsert and delete operations. It is not as versatile as
      modifying the index, but it makes some common operations much more
      efficient.
      Carlos Martín Nieto committed
  9. 25 Apr, 2016 1 commit
  10. 22 Mar, 2016 3 commits
    • blob: remove _fromchunks() · 6669e3e8
      The callback mechanism makes it awkward to write data from an IO
      source; move to `_fromstream()` which lets the caller remain in control,
      in the same vein as we prefer iterators over foreach callbacks.
      Carlos Martín Nieto committed
    • blob: fix fromchunks iteration counter · 35e68606
      By returning when the count goes to zero rather than below it, setting
      `howmany` to 7 in fact writes out the string 6 times.
      
      Correct the termination condition to write out the string the number of
      times we specify.
      Carlos Martín Nieto committed
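
The off-by-one described above is easy to reproduce in isolation. The sketch below is an assumption about the shape of the bug, not the original test code: `chunk_cb`, `chunk_cb_buggy`, `chunk_cb_fixed`, and `count_chunks` are hypothetical names, and a callback returning 0 is taken to mean "no more data".

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical chunk callback: returns bytes produced, or 0 when finished. */
typedef int (*chunk_cb)(char *out, size_t max_len, void *payload);

static const char CHUNK[] = "hey\n";

/* Copy one chunk into the output buffer and report its length. */
static int emit_chunk(char *out, size_t max_len)
{
    size_t n = sizeof(CHUNK) - 1;
    if (n > max_len)
        n = max_len;
    memcpy(out, CHUNK, n);
    return (int)n;
}

/* Buggy: stops as soon as the counter reaches zero, one chunk too early. */
int chunk_cb_buggy(char *out, size_t max_len, void *payload)
{
    int *count = payload;
    assert(count != NULL);
    (*count)--;
    if (*count == 0)
        return 0;
    return emit_chunk(out, max_len);
}

/* Fixed: stops only once the counter has gone below zero. */
int chunk_cb_fixed(char *out, size_t max_len, void *payload)
{
    int *count = payload;
    assert(count != NULL);
    (*count)--;
    if (*count < 0)
        return 0;
    return emit_chunk(out, max_len);
}

/* Drive a callback until it reports completion; return the chunk count. */
int count_chunks(chunk_cb cb, int howmany)
{
    char buf[16];
    int written = 0;
    while (cb(buf, sizeof(buf), &howmany) > 0)
        written++;
    return written;
}
```

Calling `count_chunks(chunk_cb_buggy, 7)` yields 6 chunks, while `count_chunks(chunk_cb_fixed, 7)` yields 7.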
    • blob: introduce creating a blob by writing into a stream · 0a5c6028
      The pair of `git_blob_create_fromstream()` and
      `git_blob_create_fromstream_commit()` is meant to replace
      `git_blob_create_fromchunks()` by providing a way for a user to write a
      new blob when they want filtering or they do not know the size.
      
      This approach allows the caller to retain control over when to add data
      to this buffer and fits more naturally into higher-level languages' own
      stream abstractions, instead of having to handle IO wait in the callback.
      
      The in-memory buffer size of 2MB is chosen somewhat arbitrarily: it is a
      round multiple of usual page sizes, and most blobs seem likely to be
      either well below or well over that size.
      
      This implementation re-uses the helper we have from `_fromchunks()` so
      we end up writing everything to disk, but hopefully more efficiently
      than with a default filebuf. A later optimisation can be to avoid
      writing the in-memory contents to disk, with some extra complexity.
      Carlos Martín Nieto committed
  11. 20 Mar, 2016 1 commit
  12. 04 Mar, 2016 1 commit
  13. 28 Feb, 2016 3 commits
  14. 28 May, 2015 1 commit
  15. 04 Jan, 2015 1 commit
  16. 27 Dec, 2014 1 commit
  17. 17 Dec, 2014 1 commit
  18. 22 Nov, 2014 1 commit
    • peel: reject bad queries with EINVALIDSPEC · 753e17b0
      There are some combinations of objects and target types which we know
      cannot be fulfilled. Return EINVALIDSPEC for those to signify that there
      is a mismatch in the user-provided data and what the object model is
      capable of satisfying.
      
      If we start at a tag and in the course of peeling find out that we
      cannot reach a particular type, we return EPEEL.
      Carlos Martín Nieto committed
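
An up-front check of this kind can be sketched as a small lookup on the (object type, target type) pair. The rules below are an assumption derived from the usual git object model (tags may point at anything, commits lead to a tree, trees and blobs peel to nothing else); `toy_type`, `toy_check_peel`, and `TOY_EINVALIDSPEC` are hypothetical names, not libgit2's.

```c
#include <assert.h>

/* Hypothetical stand-ins for libgit2's object types and error code. */
enum toy_type { TOY_COMMIT, TOY_TREE, TOY_BLOB, TOY_TAG };
enum { TOY_PEEL_OK = 0, TOY_EINVALIDSPEC = -1 };

/* Reject (object type, target type) pairs that can never be satisfied. */
int toy_check_peel(enum toy_type from, enum toy_type target)
{
    /* Callers must pass one of the known types. */
    assert((int)target >= TOY_COMMIT && (int)target <= TOY_TAG);

    if (from == target)
        return TOY_PEEL_OK;

    switch (from) {
    case TOY_TAG:
        /* A tag may point at anything, so only actual peeling can tell. */
        return TOY_PEEL_OK;
    case TOY_COMMIT:
        /* A commit leads only to its tree. */
        return target == TOY_TREE ? TOY_PEEL_OK : TOY_EINVALIDSPEC;
    case TOY_TREE:
    case TOY_BLOB:
        /* Trees and blobs peel to nothing but themselves. */
        return TOY_EINVALIDSPEC;
    }
    return TOY_EINVALIDSPEC;
}
```

A tag passes this check for any target; if peeling then fails to reach the requested type, that is the separate EPEEL case the commit message mentions.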
  19. 16 Sep, 2014 1 commit
  20. 18 Aug, 2014 1 commit
  21. 01 Jul, 2014 1 commit
  22. 10 Jun, 2014 1 commit
    • treebuilder: use a map instead of vector to store the entries · 4d3f1f97
      Finding a filename in a vector means we need to re-sort it every time we
      want to read from it, which includes every time we want to write to it
      as well, as we want to find duplicate keys.
      
      A hash-map fits what we want to do much more accurately, as we do not
      care about sorting, but just the particular filename.
      
      We still keep removed entries around, as the interface lets you assume
      they will be around until the treebuilder is cleared or freed. In the
      filter case this involves an append to a vector, which can now fail.
      
      The only time we care about sorting is when we write out the tree, so
      let's make that the only time we do any sorting.
      Carlos Martín Nieto committed
  23. 07 Jun, 2014 1 commit
  24. 18 May, 2014 1 commit
  25. 08 May, 2014 1 commit
    • Be more careful with user-supplied buffers · 1e4976cb
      This adds in missing calls to `git_buf_sanitize` and fixes a
      number of places where `git_buf` APIs could inadvertently write
      NUL terminator bytes into invalid buffers.  This also changes the
      behavior of `git_buf_sanitize` to NUL terminate a buffer if it can
      and of `git_buf_shorten` to do nothing if it can.
      
      Adds tests of filtering code with a zeroed (i.e. unsanitized) buffer,
      which previously triggered a segfault.
      Russell Belfer committed
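
The sanitizing behavior described above, "NUL terminate a buffer if it can", might look roughly like the following. This is a sketch under assumptions rather than the real `git_buf_sanitize`: `toy_buf`, `toy_initbuf`, and `toy_buf_sanitize` are hypothetical names, and the struct layout (pointer, allocated size, used size) is assumed for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical buffer: data pointer, allocated size, used size. */
typedef struct {
    char *ptr;
    size_t asize;
    size_t size;
} toy_buf;

/* Shared empty string that zeroed buffers are pointed at. */
static char toy_initbuf[1];

/* Make a user-supplied buffer safe to read as a C string. */
void toy_buf_sanitize(toy_buf *buf)
{
    assert(buf != NULL);

    if (buf->ptr == NULL) {
        /* Zeroed buffer: point at the shared empty string. */
        buf->ptr = toy_initbuf;
        buf->asize = 0;
        buf->size = 0;
    } else if (buf->asize > buf->size) {
        /* There is room for a terminator, so write one. */
        buf->ptr[buf->size] = '\0';
    }
    /* Otherwise the buffer is full; writing a NUL would overflow it. */
}
```

The last branch is the important one: a terminator is written only when the allocation has room for it, which avoids writing into invalid memory.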
  26. 06 May, 2014 1 commit
    • Add filter options and ALLOW_UNSAFE · 5269008c
      Diff and status do not want core.safecrlf to actually raise an
      error regardless of the setting, so this extends the filter API
      with an additional options flags parameter and adds a flag so that
      filters can be applied with GIT_FILTER_OPT_ALLOW_UNSAFE, indicating
      that unsafe filter application should be downgraded from a failure
      to a warning.
      Russell Belfer committed
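
The downgrade from failure to warning can be modeled with a single flag bit. This is an illustrative sketch only: `TOY_FILTER_OPT_ALLOW_UNSAFE`, `toy_apply_filter`, and the `warned` out-parameter are hypothetical and do not reflect the actual filter API signature.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical result codes and option flags for this sketch. */
enum { TOY_FILTER_OK = 0, TOY_FILTER_ERROR = -1 };
enum {
    TOY_FILTER_OPT_DEFAULT = 0u,
    TOY_FILTER_OPT_ALLOW_UNSAFE = 1u << 0
};

/*
 * Decide the outcome of a filter application. An unsafe conversion is an
 * error by default; with ALLOW_UNSAFE it succeeds and sets *warned instead.
 */
int toy_apply_filter(int conversion_is_unsafe, unsigned flags, int *warned)
{
    assert(warned != NULL);
    *warned = 0;

    if (!conversion_is_unsafe)
        return TOY_FILTER_OK;

    if (flags & TOY_FILTER_OPT_ALLOW_UNSAFE) {
        *warned = 1;
        return TOY_FILTER_OK;
    }
    return TOY_FILTER_ERROR;
}
```

In this model, callers such as diff and status would pass the flag so that an unsafe conversion never aborts their work.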
  27. 29 Apr, 2014 1 commit
    • commit: safer commit creation with reference update · 217c029b
      The current versions of the commit creation and amend functions are unsafe
      to use when passing the update_ref parameter, as they do not check that
      the reference at the moment of update points to what the user expects.
      
      Make sure that we're moving history forward when we ask the library to
      update the reference for us by checking that the first parent of the new
      commit is the current value of the reference. We also make sure that the
      ref we're updating hasn't moved between the read and the write.
      
      Similarly, when amending a commit, make sure that the current tip of the
      branch is the commit we're amending.
      Carlos Martín Nieto committed
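
The check described above amounts to a compare-and-swap on the reference: update it only if it still points at the value the caller expects (the new commit's first parent, or the commit being amended). The sketch below models this in memory; `toy_ref`, `toy_ref_update`, and `REF_EMODIFIED` are hypothetical names, and real reference updates also involve locking that is not shown here.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical result codes for this sketch. */
enum { REF_OK = 0, REF_EMODIFIED = -1 };

/* Hypothetical in-memory reference holding a hex object ID. */
typedef struct {
    char target[41];
} toy_ref;

/*
 * Move the reference to new_id only if it still points at expected_old.
 * Otherwise leave it untouched and report that it has moved.
 */
int toy_ref_update(toy_ref *ref, const char *expected_old, const char *new_id)
{
    assert(ref != NULL && expected_old != NULL && new_id != NULL);

    if (strcmp(ref->target, expected_old) != 0)
        return REF_EMODIFIED;

    snprintf(ref->target, sizeof(ref->target), "%s", new_id);
    return REF_OK;
}
```

A second writer that still holds the old value gets `REF_EMODIFIED` instead of silently overwriting the first writer's commit.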
  28. 10 Mar, 2014 1 commit
  29. 05 Mar, 2014 1 commit
  30. 08 Feb, 2014 1 commit
    • Add git_commit_amend API · 80c29fe9
      This adds an API to amend an existing commit, basically a shorthand
      for creating a new commit filling in missing parameters from the
      values of an existing commit.  As part of this, I also added a new
      "sys" API to create a commit using a callback to get the parents.
      This allowed me to rewrite all the other commit creation APIs so
      that temporary allocations are no longer needed.
      Russell Belfer committed
  31. 27 Jan, 2014 1 commit
  32. 25 Jan, 2014 1 commit