Commits · c16556aaddffc1d663c6403747d793adc0819e0a · lvzhengyang / git2

22 Jun, 2018 11 commits

indexer: introduce options struct to `git_indexer_new` · c16556aa

We strive to pass an options structure to many functions so that we can
extend their options in the future without breaking the API.
`git_indexer_new` doesn't have one right now, but we want to be able to
add an option for enabling strict packfile verification.

Add a new `git_indexer_options` structure and adjust callers to use
that.
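
The versioned options-struct pattern described above can be sketched as follows. This is a minimal, self-contained illustration and not libgit2's actual definitions: every `demo_`/`DEMO_` name, the `verify` field, and the return codes are assumptions made for the example.

```c
#include <stddef.h>

/* Hypothetical names throughout; these are not libgit2's definitions. */
#define DEMO_INDEXER_OPTIONS_VERSION 1

typedef struct {
	unsigned int version;  /* tells the callee which struct layout the caller uses */
	unsigned char verify;  /* an option that can be added without a new signature */
} demo_indexer_options;

#define DEMO_INDEXER_OPTIONS_INIT { DEMO_INDEXER_OPTIONS_VERSION, 0 }

typedef struct {
	unsigned char verify;
} demo_indexer;

/*
 * Returns 0 on success and -1 on an unsupported options version.
 * Passing NULL for `opts` selects the defaults.
 */
int demo_indexer_new(demo_indexer *out, const demo_indexer_options *opts)
{
	demo_indexer_options defaults = DEMO_INDEXER_OPTIONS_INIT;

	if (opts == NULL)
		opts = &defaults;
	if (opts->version != DEMO_INDEXER_OPTIONS_VERSION)
		return -1;

	out->verify = opts->verify;
	return 0;
}
```

Because callers initialize the struct through the init macro, new fields can be appended later and existing call sites keep compiling with the defaults.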

committed Jun 22, 2018

indexer: check pack file connectivity · a616fb16

When passing `--strict` to `git-unpack-objects`, core git will verify
the pack file that is currently being read. In addition to the typical
checksum verification, this in particular causes it to verify object
connectivity of the received pack file. That is, it checks, for every
received object, whether all the objects it references are either part
of the local object database or part of the pack file. In libgit2, we
currently have
no such mechanism, which leaves us unable to verify received pack files
prior to writing them into our local object database.

This commit introduces the concept of `expected_oids` to the indexer.
When pack file verification is turned on by a new flag, the indexer will
try to parse each received object first. If the object has any links to
other objects, it will check if those links are already satisfied by
known objects either part of the object database or objects it has
already seen as part of that pack file. If not, it will add them to the
list of `expected_oids`. Furthermore, the indexer will remove the
current object from the `expected_oids` if it is currently being
expected.

This way, we are able to verify whether all object links are being
satisfied. As soon as we hit the end of the object stream and have
resolved all objects as well as deltified objects, we assert that
`expected_oids` is in fact empty. This should always be the case for a
valid pack file with full connectivity.
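
The bookkeeping described above can be sketched roughly as below. This is a simplified model and not the indexer's code: object ids are plain integers instead of OIDs, the sets are small fixed-size arrays, and all names (`id_set`, `pack_object`, `pack_is_connected`) are hypothetical.

```c
#include <stddef.h>

#define MAX_IDS 64

/* Tiny fixed-capacity id set; the sketch assumes fewer than MAX_IDS distinct ids. */
typedef struct {
	int ids[MAX_IDS];
	size_t len;
} id_set;

static int set_contains(const id_set *s, int id)
{
	size_t i;
	for (i = 0; i < s->len; i++)
		if (s->ids[i] == id)
			return 1;
	return 0;
}

static void set_add(id_set *s, int id)
{
	if (!set_contains(s, id) && s->len < MAX_IDS)
		s->ids[s->len++] = id;
}

static void set_remove(id_set *s, int id)
{
	size_t i;
	for (i = 0; i < s->len; i++) {
		if (s->ids[i] == id) {
			s->ids[i] = s->ids[--s->len];
			return;
		}
	}
}

typedef struct {
	int id;
	const int *links;  /* ids of the objects this object references */
	size_t nlinks;
} pack_object;

/* Returns 1 if every link is satisfied by `odb` or by the pack itself, else 0. */
int pack_is_connected(const pack_object *objs, size_t n, const id_set *odb)
{
	id_set seen = { {0}, 0 }, expected = { {0}, 0 };
	size_t i, j;

	for (i = 0; i < n; i++) {
		for (j = 0; j < objs[i].nlinks; j++) {
			int link = objs[i].links[j];
			/* Unsatisfied so far: remember that we still expect it. */
			if (!set_contains(odb, link) && !set_contains(&seen, link))
				set_add(&expected, link);
		}
		set_add(&seen, objs[i].id);
		set_remove(&expected, objs[i].id);
	}

	/* A fully connected pack leaves nothing outstanding. */
	return expected.len == 0;
}
```

Objects may arrive before or after the objects that reference them, which is why an id is both recorded as seen and removed from the expected set when it shows up.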

committed Jun 22, 2018

indexer: extract function reading stream objects · be41c384

The loop inside of `git_indexer_append` iterates over every object that
is to be stored as part of the index. While the logic to retrieve every
object from the packfile stream is rather involved, it is currently just
part of the loop, making it unnecessarily hard to follow.

Move the logic into its own function `read_stream_object`, which unpacks
a single object from the stream. Note that there is some subtlety here
involving the special error `GIT_EBUFS`, which indicates to the indexer
that no more data is currently available. So instead of returning an
error and aborting the whole loop in that case, we do have to catch that
value and return successfully to wait for more data to be read.
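
The "not enough data yet" handling can be illustrated with a toy stream format. The sketch below is an assumption-laden model, not the indexer's implementation: `DEMO_EBUFS` is an arbitrary stand-in for `GIT_EBUFS`, the one-length-byte record format is invented, and `demo_stream` and `demo_append` are hypothetical names.

```c
#include <stddef.h>

/* Arbitrary placeholder values; DEMO_EBUFS stands in for GIT_EBUFS. */
#define DEMO_EBUFS (-6)
#define DEMO_ERROR (-1)

typedef struct {
	const unsigned char *data;
	size_t len;      /* bytes received so far */
	size_t off;      /* bytes consumed so far */
	size_t objects;  /* objects unpacked so far */
} demo_stream;

/*
 * Toy format: one length byte followed by that many payload bytes.
 * A zero length is treated as corrupt input.
 */
static int read_stream_object(demo_stream *s)
{
	size_t avail = s->len - s->off, need;

	if (avail < 1)
		return DEMO_EBUFS;
	need = s->data[s->off];
	if (need == 0)
		return DEMO_ERROR;
	if (avail < 1 + need)
		return DEMO_EBUFS;

	s->off += 1 + need;
	s->objects++;
	return 0;
}

/* Consumes as many complete objects as are available. */
int demo_append(demo_stream *s, size_t expected_objects)
{
	int error;

	while (s->objects < expected_objects) {
		if ((error = read_stream_object(s)) != 0) {
			if (error == DEMO_EBUFS)
				break;  /* not a failure: return and wait for more data */
			return error;
		}
	}
	return 0;
}
```

The caller can invoke `demo_append` again once more bytes have arrived; the stream state records where unpacking left off.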

committed Jun 22, 2018

indexer: remove useless local variable · 6568f374

The `processed` variable local to `git_indexer_append` counts how many
objects have already been processed. But actually, whenever it gets
assigned to, we are also assigning the same value to the
`stats->indexed_objects` struct member. So in fact, it is useless, as it
always has the same value as the `indexed_objects` member, and it makes
the code a bit harder to understand. We can just remove the variable to
fix that.

committed Jun 22, 2018

object: implement function to parse raw data · ca4db5f4

Now that we have implemented functions to parse all git objects from raw
data, we can implement a generic function `git_object__from_raw` to
create a structure of type `git_object`. This allows us to parse and
interpret objects from raw data without having to touch the ODB at all,
which is especially useful for object verification prior to accepting
them into the repository.
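
A generic entry point of this kind typically dispatches on the object type to a per-type parser. The sketch below only illustrates that shape: the `demo_` names are hypothetical, and the per-type "parsers" merely check the leading header that commit (`tree `) and tag (`object `) objects start with, rather than parsing anything fully.

```c
#include <stddef.h>
#include <string.h>

typedef enum {
	DEMO_OBJ_COMMIT,
	DEMO_OBJ_TREE,
	DEMO_OBJ_BLOB,
	DEMO_OBJ_TAG
} demo_otype;

typedef struct {
	demo_otype type;
	const char *data;  /* borrowed from the caller */
	size_t size;
} demo_object;

static int has_prefix(const char *data, size_t size, const char *prefix)
{
	size_t n = strlen(prefix);
	return size >= n && memcmp(data, prefix, n) == 0;
}

/* Stand-ins for the per-type raw parsers; each returns 0 or -1. */
static int parse_commit_raw(const char *data, size_t size)
{
	return has_prefix(data, size, "tree ") ? 0 : -1;
}

static int parse_tag_raw(const char *data, size_t size)
{
	return has_prefix(data, size, "object ") ? 0 : -1;
}

static int parse_tree_raw(const char *data, size_t size)
{
	(void)data; (void)size;
	return 0;
}

static int parse_blob_raw(const char *data, size_t size)
{
	(void)data; (void)size;
	return 0;  /* any byte sequence is a valid blob */
}

/* Builds an object from raw data without consulting any object database. */
int demo_object_from_raw(demo_object *out, const char *data, size_t size, demo_otype type)
{
	int error;

	switch (type) {
	case DEMO_OBJ_COMMIT: error = parse_commit_raw(data, size); break;
	case DEMO_OBJ_TREE:   error = parse_tree_raw(data, size);   break;
	case DEMO_OBJ_BLOB:   error = parse_blob_raw(data, size);   break;
	case DEMO_OBJ_TAG:    error = parse_tag_raw(data, size);    break;
	default:              return -1;
	}
	if (error < 0)
		return error;

	out->type = type;
	out->data = data;
	out->size = size;
	return 0;
}
```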

committed Jun 22, 2018

tree: implement function to parse raw data · 73bd6411

Currently, parsing objects is strictly tied to having an ODB object
available. This makes it hard to parse an object when all that is
available is its raw object and size. Furthermore, hacking around that
limitation by directly creating an ODB structure either on stack or on
heap does not really work that well due to ODB objects being reference
counted and then automatically free'd when reaching a reference count of
zero.

Implement a function `git_tree__parse_raw` to parse a tree object from a
pair of `data` and `size`.
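
To make the `data`/`size` interface concrete, here is a small sketch that walks the raw tree format, in which each entry is an octal mode, a space, a NUL-terminated name, and a raw object id. It assumes 20-byte (SHA-1) ids; `demo_tree_entry` and `demo_tree_parse_raw` are hypothetical names and do not reflect how `git_tree__parse_raw` is written.

```c
#include <stddef.h>
#include <string.h>

#define DEMO_OID_RAWSZ 20  /* assumes SHA-1 object ids */

typedef struct {
	unsigned int mode;
	const char *name;          /* points into the caller's buffer */
	size_t name_len;
	const unsigned char *oid;  /* points into the caller's buffer */
} demo_tree_entry;

/* Parses up to `max` entries; returns the entry count, or -1 on malformed input. */
int demo_tree_parse_raw(demo_tree_entry *out, size_t max, const char *data, size_t size)
{
	const char *p = data, *end = data + size;
	size_t n = 0;

	while (p < end) {
		unsigned int mode = 0;
		const char *nul;

		if (n == max)
			return -1;

		/* Mode: one or more octal digits terminated by a space. */
		if (*p < '0' || *p > '7')
			return -1;
		while (p < end && *p >= '0' && *p <= '7')
			mode = (mode << 3) | (unsigned int)(*p++ - '0');
		if (p == end || *p++ != ' ')
			return -1;

		/* Name: non-empty and NUL-terminated, followed by the raw id. */
		nul = memchr(p, '\0', (size_t)(end - p));
		if (nul == NULL || nul == p)
			return -1;
		if ((size_t)(end - (nul + 1)) < DEMO_OID_RAWSZ)
			return -1;

		out[n].mode = mode;
		out[n].name = p;
		out[n].name_len = (size_t)(nul - p);
		out[n].oid = (const unsigned char *)(nul + 1);
		n++;

		p = nul + 1 + DEMO_OID_RAWSZ;
	}
	return (int)n;
}
```

Nothing is copied: the entries point into the caller's buffer, so the buffer has to outlive them.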

committed Jun 22, 2018

tag: implement function to parse raw data · af5cd936

Currently, parsing objects is strictly tied to having an ODB object
available. This makes it hard to parse an object when all that is
available is its raw object and size. Furthermore, hacking around that
limitation by directly creating an ODB structure either on stack or on
heap does not really work that well due to ODB objects being reference
counted and then automatically free'd when reaching a reference count of
zero.

Implement a function `git_tag__parse_raw` to parse a tag object from a
pair of `data` and `size`.

committed Jun 22, 2018

commit: implement function to parse raw data · ab265a35

Currently, parsing objects is strictly tied to having an ODB object
available. This makes it hard to parse an object when all that is
available is its raw object and size. Furthermore, hacking around that
limitation by directly creating an ODB structure either on stack or on
heap does not really work that well due to ODB objects being reference
counted and then automatically free'd when reaching a reference count of
zero.

Implement a function `git_commit__parse_raw` to parse a commit object
from a pair of `data` and `size`.

committed Jun 22, 2018

blob: implement function to parse raw data · 9ac79ecc

Currently, parsing objects is strictly tied to having an ODB object
available. This makes it hard to parse an object when all that is
available is its raw object and size. Furthermore, hacking around that
limitation by directly creating an ODB structure either on stack or on
heap does not really work that well due to ODB objects being reference
counted and then automatically free'd when reaching a reference count of
zero.

On some occasions, parsing raw objects without touching the ODB is
actually required, though. One use case is object verification, where
we want to ensure that an object is valid before inserting it into the
ODB or writing it into the git repository.

As a first step towards that, introduce a distinction between raw and ODB
objects for blobs. Creation of ODB objects stays the same by simply
using `git_blob__parse`, but a new function `git_blob__parse_raw` has
been added that creates a blob from a pair of data and size. By setting
a new flag inside of the blob, we can distinguish whether it is a raw
or ODB object and treat it accordingly in several places.

Note that the blob data passed in is not being copied. Because of that,
callers need to make sure to keep it alive during the blob's lifetime.
This avoids unnecessarily increasing the memory footprint when parsing
largish blobs.
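
One way to picture the raw/ODB distinction is a flag plus a union, with accessors that hide which variant is in use. The layout below is an assumption for illustration only; the `demo_` types and functions are hypothetical and do not mirror the real blob structure.

```c
#include <stddef.h>

/* Stand-in for an ODB-backed object. */
typedef struct {
	const void *data;
	size_t size;
} demo_odb_object;

typedef struct {
	unsigned int raw : 1;  /* set when the blob wraps caller-owned memory */
	union {
		const demo_odb_object *odb;
		struct {
			const void *data;  /* borrowed, not copied */
			size_t size;
		} raw;
	} data;
} demo_blob;

/* ODB-backed blob: content and size come from the ODB object. */
void demo_blob_parse(demo_blob *blob, const demo_odb_object *odb)
{
	blob->raw = 0;
	blob->data.odb = odb;
}

/* Raw blob: `data` must stay alive for as long as the blob is in use. */
void demo_blob_parse_raw(demo_blob *blob, const void *data, size_t size)
{
	blob->raw = 1;
	blob->data.raw.data = data;
	blob->data.raw.size = size;
}

/* Accessors so that callers never need to check the flag themselves. */
const void *demo_blob_rawcontent(const demo_blob *blob)
{
	return blob->raw ? blob->data.raw.data : blob->data.odb->data;
}

size_t demo_blob_rawsize(const demo_blob *blob)
{
	return blob->raw ? blob->data.raw.size : blob->data.odb->size;
}
```

Routing every read through the two accessors keeps the flag check in one place, at the cost of requiring callers to manage the lifetime of raw buffers.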

committed Jun 22, 2018

blob: use getters to get raw blob content and size · bbbe8441

Going forward, we will have to change how blob sizes are calculated
based on whether the blob is a cached object that is part of the ODB or
not. In order to not have to distinguish between those two object types
repeatedly when accessing the blob's data or size, encapsulate all
existing direct uses of those fields by instead using
`git_blob_rawcontent` and `git_blob_rawsize`.

committed Jun 22, 2018

pack-objects: make `git_walk_object` internal to pack-objects · 4e8dc055

The `git_walk_objects` structure is currently only being used inside of
the pack-objects.c file, but is declared in its header. This has
actually been the case since its inception in 04a36fef (pack-objects:
fill a packbuilder from a walk, 2014-10-11) and has never really
changed.

Move the struct declaration into pack-objects.c to improve code
encapsulation.

committed Jun 22, 2018

18 Jun, 2018 5 commits
- Merge pull request #4685 from csware/no-git_buf_free · e212011b
```
Fix last references to deprecated git_buf_free
```
  Edward Thomson committed Jun 18, 2018
- Merge pull request #4606 from libgit2/cmn/revwalk-iteration · cc9c9522
```
revwalk: avoid walking the entire history when output is unsorted
```
  Edward Thomson committed Jun 18, 2018
- Fix last references to deprecated git_buf_free · b5818dda
```
Signed-off-by: Sven Strickroth <email@cs-ware.de>
```
  Sven Strickroth committed Jun 18, 2018
- revwalk: formatting updates · ff98fec0
  Edward Thomson committed Jun 18, 2018
- Merge pull request #4586 from emilio/mailmap · 96882f20
```
Add mailmap support.
```
  Edward Thomson committed Jun 18, 2018
17 Jun, 2018 1 commit
- Require the length argument to git_mailmap_from_buffer and make mailmap_add_buffer internal · f98131be
  Nika Layzell committed Jun 17, 2018
16 Jun, 2018 1 commit
- Merge pull request #4683 from pks-t/pks/tree-unused-functions · 0ecf0e33
```
tree: remove unused functions
```
  Edward Thomson committed Jun 16, 2018
15 Jun, 2018 22 commits
- tree: remove unused function `git_tree__prefix_position` · f0a1d76a
  Patrick Steinhardt committed Jun 15, 2018
- tree: remove unused function `git_tree_entry_icmp` · 31f6b529
  Patrick Steinhardt committed Jun 15, 2018
- Merge pull request #4678 from staticfloat/sf/mbedtls_linkage · 678fa45b
```
Link `mbedTLS` libraries in when `SHA1_BACKEND` == "mbedTLS"
```
  Patrick Steinhardt committed Jun 15, 2018
- Merge pull request #4676 from libgit2/editorconfig · c103616f
```
editorconfig: allow trailing whitespace in markdown
```
  Patrick Steinhardt committed Jun 15, 2018
- mailmap: git_buf_free => git_buf_dispose · 9faf36a6
  Nika Layzell committed Jun 14, 2018
- mailmap: Hide EEXISTS to simplify git_mailmap_add_entry callers · d91d2968
  Nika Layzell committed Jun 14, 2018
- mailmap: Free the mailmap vector · c1a85ae2
  Nika Layzell committed Jun 14, 2018
- mailmap: API and style cleanup · 56303e1a
  Nika Layzell committed Jun 14, 2018
- mailmap: Updates tests for new API and features · a140c138
  Nika Layzell committed Jun 14, 2018
- mailmap: Rewrite API to support accurate mailmap resolution · 8ff0504d
  Nika Layzell committed Jun 14, 2018
- mailmap: API and style cleanup · 18ff9bab
  Nika Layzell committed Jun 14, 2018
- mailmap: Switch mailmap parsing to use the git_parse module · 57cfeab9
  Nika Layzell committed Jun 14, 2018
- mailmap: Clean up the mailmap fixture's .gitted directory · aa3a24a4
  Nika Layzell committed Jun 14, 2018
- mailmap: Fix some other minor style nits · 5c6c8a9b
  Emilio Cobos Álvarez committed Jun 14, 2018
- mailmap: Fix more bugs which snuck in when I rebased · 4ff44be8
  Nika Layzell committed Jun 14, 2018
- mailmap: Add a bunch of tests for the new mailmap functionality · 983b8c2d
  Nika Layzell committed Jun 14, 2018
- mailmap: Integrate mailmaps with blame and signatures · e3dcaca5
  Nika Layzell committed Jun 14, 2018
- mailmap: Make everything a bit more style conforming · b05fbba3
  Nika Layzell committed Jun 14, 2018
- mailmap: Support path fixtures in cl_git_repository_init() · 939d8d57
  Nika Layzell committed Jun 14, 2018
- mailmap: Add some super-basic tests · b88cbf8c
  Emilio Cobos Álvarez committed Jun 14, 2018
- mailmap: Don't error out when there's junk at the end of the line · 7bafd175
```
Also matches git.
```
  Emilio Cobos Álvarez committed Jun 14, 2018
- mailmap: Don't return a freed pointer, even if we return an error code · 59fbf9cf
  Emilio Cobos Álvarez committed Jun 14, 2018