1. 22 Jun, 2018 9 commits
    • indexer: extract function reading stream objects · be41c384
      The loop inside of `git_indexer_append` iterates over every object that
      is to be stored as part of the index. While the logic to retrieve every
      object from the packfile stream is rather involved, it is currently just
      part of the loop, making it unnecessarily hard to follow.
      
      Move the logic into its own function `read_stream_object`, which unpacks
      a single object from the stream. Note that there is some subtlety here
      involving the special error `GIT_EBUFS`, which indicates to the indexer
      that no more data is currently available. So instead of returning an
      error and aborting the whole loop in that case, we do have to catch that
      value and return successfully to wait for more data to be read.
      Patrick Steinhardt committed
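      The control flow described in the commit above can be sketched roughly
      as follows. This is a standalone illustration, not the libgit2 code:
      `SKETCH_EBUFS`, `struct sketch_stream`, `sketch_read_object` and
      `sketch_append` are hypothetical names, and the fixed object size is an
      assumption made only to keep the example small. The one idea taken from
      the commit message is that the "no more data yet" code is caught and
      turned into a successful return.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the "not enough buffered data" error code. */
#define SKETCH_EBUFS (-6)

/* Hypothetical stream: `avail` bytes are buffered, `pos` bytes are consumed. */
struct sketch_stream {
	size_t avail;
	size_t pos;
};

/*
 * Pretend every object occupies exactly `obj_size` bytes. Returns 0 when one
 * object was consumed and SKETCH_EBUFS when fewer bytes than that are left.
 */
static int sketch_read_object(struct sketch_stream *s, size_t obj_size)
{
	assert(s != NULL && obj_size > 0);

	if (s->avail - s->pos < obj_size)
		return SKETCH_EBUFS;

	s->pos += obj_size;
	return 0;
}

/*
 * Consume as many whole objects as are buffered, up to `total`. Running out
 * of data is not a failure: stop, report success, and let the caller feed
 * more bytes before calling again.
 */
static int sketch_append(struct sketch_stream *s, size_t obj_size,
	size_t total, size_t *indexed)
{
	while (*indexed < total) {
		int error = sketch_read_object(s, obj_size);

		if (error == SKETCH_EBUFS)
			return 0; /* wait for more data */
		if (error < 0)
			return error; /* genuine error: abort the loop */

		(*indexed)++;
	}

	return 0;
}
```

      Calling `sketch_append` again after more bytes have been buffered
      resumes where the previous call stopped, since the progress counter
      lives outside the function.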
    • indexer: remove useless local variable · 6568f374
      The `processed` variable local to `git_indexer_append` counts how many
      objects have already been processed. But actually, whenever it gets
      assigned to, we are also assigning the same value to the
      `stats->indexed_objects` struct member. So the variable is in fact
      redundant, as it always has the same value as the `indexed_objects`
      member, and it makes the code a bit harder to understand. We can just
      remove the variable to fix that.
      Patrick Steinhardt committed
    • object: implement function to parse raw data · ca4db5f4
      Now that we have implemented functions to parse all git objects from raw
      data, we can implement a generic function `git_object__from_raw` to
      create a structure of type `git_object`. This allows us to parse and
      interpret objects from raw data without having to touch the ODB at all,
      which is especially useful for object verification prior to accepting
      them into the repository.
      Patrick Steinhardt committed
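      A generic entry point of the kind described in the commit above
      typically allocates the object and dispatches on the requested type.
      The following is a minimal sketch under that assumption. All names
      (`sketch_type`, `sketch_object`, `sketch_object_from_raw` and the two
      parsers) are hypothetical, and the "validation" is a placeholder that
      only rejects empty input; it does not reflect what
      `git_object__from_raw` actually checks.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical object kinds; these are not libgit2's type constants. */
enum sketch_type { SKETCH_COMMIT, SKETCH_TREE, SKETCH_BLOB, SKETCH_TAG };

/* Hypothetical generic object: remembers its kind and the borrowed buffer. */
struct sketch_object {
	enum sketch_type type;
	const char *data;
	size_t size;
};

/* Blobs are opaque, so any buffer (even an empty one) is accepted. */
static int sketch_parse_blob(struct sketch_object *o, const char *data, size_t size)
{
	o->data = data;
	o->size = size;
	return 0;
}

/* Placeholder for commit, tree and tag parsing: only rejects empty input. */
static int sketch_parse_structured(struct sketch_object *o, const char *data, size_t size)
{
	if (size == 0)
		return -1;

	o->data = data;
	o->size = size;
	return 0;
}

/* Allocate an object and hand the buffer to the parser for `type`. */
static int sketch_object_from_raw(struct sketch_object **out,
	const char *data, size_t size, enum sketch_type type)
{
	struct sketch_object *o;
	int error;

	assert(out != NULL);
	*out = NULL;

	if ((o = calloc(1, sizeof(*o))) == NULL)
		return -1;
	o->type = type;

	switch (type) {
	case SKETCH_BLOB:
		error = sketch_parse_blob(o, data, size);
		break;
	case SKETCH_COMMIT:
	case SKETCH_TREE:
	case SKETCH_TAG:
		error = sketch_parse_structured(o, data, size);
		break;
	default:
		error = -1;
		break;
	}

	if (error < 0) {
		free(o); /* nothing is handed out on failure */
		return error;
	}

	*out = o;
	return 0;
}
```

      Because no object database is involved, a caller can run this on
      untrusted bytes and simply discard the result if parsing fails.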
    • tree: implement function to parse raw data · 73bd6411
      Currently, parsing objects is strictly tied to having an ODB object
      available. This makes it hard to parse an object when all that is
      available is its raw object and size. Furthermore, hacking around that
      limitation by directly creating an ODB structure either on stack or on
      heap does not really work that well due to ODB objects being reference
      counted and then automatically free'd when reaching a reference count of
      zero.
      
      Implement a function `git_tree__parse_raw` to parse a tree object from a
      pair of `data` and `size`.
      Patrick Steinhardt committed
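      To illustrate what parsing from a `data` and `size` pair involves, here
      is a small sketch that reads a single entry of git's tree format: an
      octal mode, a space, the entry name, a NUL byte, then the raw object
      ID. It assumes 20-byte SHA-1 object IDs. The names
      `sketch_tree_entry` and `sketch_parse_tree_entry` are hypothetical, and
      this is not the code of `git_tree__parse_raw`.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumption: raw SHA-1 object IDs of 20 bytes. */
#define SKETCH_OID_RAWSZ 20

/* Hypothetical parsed entry; `name` and `oid` point into the caller's buffer. */
struct sketch_tree_entry {
	unsigned int mode;
	const char *name;
	size_t name_len;
	const unsigned char *oid;
};

/*
 * Parse one entry from `data`, never reading past `data + size`. Returns the
 * number of bytes consumed, or -1 if the entry is malformed or truncated.
 */
static long sketch_parse_tree_entry(struct sketch_tree_entry *out,
	const char *data, size_t size)
{
	const char *p = data, *end = data + size, *nul;
	unsigned int mode = 0;

	assert(out != NULL && data != NULL);

	/* Octal mode, terminated by a single space. */
	while (p < end && *p >= '0' && *p <= '7')
		mode = (mode << 3) | (unsigned int)(*p++ - '0');
	if (p == data || p == end || *p != ' ')
		return -1;
	p++;

	/* NUL-terminated, non-empty entry name. */
	nul = memchr(p, '\0', (size_t)(end - p));
	if (nul == NULL || nul == p)
		return -1;

	/* The raw object ID must fit into what is left of the buffer. */
	if ((size_t)(end - (nul + 1)) < SKETCH_OID_RAWSZ)
		return -1;

	out->mode = mode;
	out->name = p;
	out->name_len = (size_t)(nul - p);
	out->oid = (const unsigned char *)(nul + 1);

	return (long)((nul + 1 + SKETCH_OID_RAWSZ) - data);
}
```

      A full tree parser would call such a function in a loop, advancing by
      the returned byte count until the buffer is exhausted.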
    • tag: implement function to parse raw data · af5cd936
      Currently, parsing objects is strictly tied to having an ODB object
      available. This makes it hard to parse an object when all that is
      available is its raw object and size. Furthermore, hacking around that
      limitation by directly creating an ODB structure either on stack or on
      heap does not really work that well due to ODB objects being reference
      counted and then automatically free'd when reaching a reference count of
      zero.
      
      Implement a function `git_tag__parse_raw` to parse a tag object from a
      pair of `data` and `size`.
      Patrick Steinhardt committed
    • commit: implement function to parse raw data · ab265a35
      Currently, parsing objects is strictly tied to having an ODB object
      available. This makes it hard to parse an object when all that is
      available is its raw object and size. Furthermore, hacking around that
      limitation by directly creating an ODB structure either on stack or on
      heap does not really work that well due to ODB objects being reference
      counted and then automatically free'd when reaching a reference count of
      zero.
      
      Implement a function `git_commit__parse_raw` to parse a commit object
      from a pair of `data` and `size`.
      Patrick Steinhardt committed
    • blob: implement function to parse raw data · 9ac79ecc
      Currently, parsing objects is strictly tied to having an ODB object
      available. This makes it hard to parse an object when all that is
      available is its raw object and size. Furthermore, hacking around that
      limitation by directly creating an ODB structure either on stack or on
      heap does not really work that well due to ODB objects being reference
      counted and then automatically free'd when reaching a reference count of
      zero.
      
      On some occasions, parsing raw objects without touching the ODB
      is actually required, though. One use case is, for example, object
      verification, where we want to ensure that an object is valid before
      inserting it into the ODB or writing it into the git repository.
      
      As a first step towards that, introduce a distinction between raw and ODB
      objects for blobs. Creation of ODB objects stays the same by simply
      using `git_blob__parse`, but a new function `git_blob__parse_raw` has
      been added that creates a blob from a pair of data and size. By setting
      a new flag inside of the blob, we can now distinguish whether it is a
      raw or ODB object and treat it accordingly in several places.
      
      Note that the blob data passed in is not being copied. Because of that,
      callers need to make sure to keep it alive during the blob's lifetime.
      This is being used to avoid unnecessarily increasing the memory
      footprint when parsing largish blobs.
      Patrick Steinhardt committed
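      The raw versus ODB distinction from the commit above could look
      roughly like the sketch below: a flag selects which backing store is
      valid, raw data is borrowed rather than copied, and accessors hide the
      difference. The struct layout and every `sketch_` name are assumptions
      made for illustration; they are not the definitions used by libgit2.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an object owned by the object database. */
struct sketch_odb_object {
	const void *data;
	size_t size;
};

/* Hypothetical blob: `is_raw` tells which member of the union is valid. */
struct sketch_blob {
	unsigned int is_raw : 1;
	union {
		struct sketch_odb_object *odb;
		struct {
			const void *data; /* borrowed: never copied or freed here */
			size_t size;
		} raw;
	} u;
};

/* Back the blob by an ODB object, as in the pre-existing code path. */
static void sketch_blob_init_odb(struct sketch_blob *b, struct sketch_odb_object *odb)
{
	assert(b != NULL && odb != NULL);
	b->is_raw = 0;
	b->u.odb = odb;
}

/* Back the blob by caller-owned memory; the caller must keep it alive. */
static void sketch_blob_init_raw(struct sketch_blob *b, const void *data, size_t size)
{
	assert(b != NULL);
	b->is_raw = 1;
	b->u.raw.data = data;
	b->u.raw.size = size;
}

/* Accessors so that callers never need to check the flag themselves. */
static const void *sketch_blob_rawcontent(const struct sketch_blob *b)
{
	return b->is_raw ? b->u.raw.data : b->u.odb->data;
}

static size_t sketch_blob_rawsize(const struct sketch_blob *b)
{
	return b->is_raw ? b->u.raw.size : b->u.odb->size;
}
```

      Since the raw variant stores only a pointer, a large buffer is not
      duplicated, at the cost of the lifetime requirement noted in the commit
      message.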
    • blob: use getters to get raw blob content and size · bbbe8441
      Going forward, we will have to change how blob sizes are calculated
      based on whether the blob is a cached object that is part of the ODB or not. In
      order to not have to distinguish between those two object types
      repeatedly when accessing the blob's data or size, encapsulate all
      existing direct uses of those fields by instead using
      `git_blob_rawcontent` and `git_blob_rawsize`.
      Patrick Steinhardt committed
    • pack-objects: make `git_walk_object` internal to pack-objects · 4e8dc055
      The `git_walk_object` structure is currently only being used inside of
      the pack-objects.c file, but is declared in its header. This has
      actually been the case since its inception in 04a36fef (pack-objects:
      fill a packbuilder from a walk, 2014-10-11) and has never really
      changed.
      
      Move the struct declaration into pack-objects.c to improve code
      encapsulation.
      Patrick Steinhardt committed
  2. 18 Jun, 2018 5 commits
  3. 17 Jun, 2018 1 commit
  4. 16 Jun, 2018 1 commit
  5. 15 Jun, 2018 24 commits