Commits · bb278ceb3722ee57fd87835e93d2f896bfcb177f · lvzhengyang / git2

18 Jan, 2020 1 commit

iterator: update enum type name for consistency · b59c71d8

libgit2 does not use `type_t` suffixes as it's redundant; thus, rename
`git_iterator_type_t` to `git_iterator_t` for consistency.

committed 5 years ago

19 Nov, 2019 1 commit

diff_print: add a new 'print_index' flag when printing diff. · accd7848

Add a new 'print_index' flag to let the caller decide whether or not
'index <oid>..<oid>' should be printed.
Since a patch ID does not need to include the index line when hashing a
patch, this will be useful soon.

Signed-off-by: Gregory Herrero <gregory.herrero@oracle.com>

committed 5 years ago

01 Feb, 2018 1 commit

consistent header guards · abb04caa

use consistent names for the #include / #define header guard pattern.

Edward Thomson committed 6 years ago

03 Jan, 2018 1 commit

diff_generate: avoid excessive stats of .gitattributes files · d8896bda

When generating a diff between two trees, for each file that is to be
diffed we have to determine whether it shall be treated as text or as
a binary file. While git has heuristics to determine which kind of diff
to generate, users can also override that default behaviour by setting or
unsetting the 'diff' attribute for specific files.

Because of that, we have to query gitattributes in order to determine
how to diff the current files. Instead of hitting the '.gitattributes'
file every time we need to query an attribute, which can get expensive
especially on networked file systems, we try to cache them instead. This
works perfectly fine for every '.gitattributes' file that is found, but
we hit cache invalidation problems when we determine that an attributes
file does _not_ exist. We do create an entry in the cache for missing
'.gitattributes' files, but as soon as we hit that file again we
invalidate it and stat it again to see if it has now appeared.

In the case of diffing large trees with each other, this behaviour is
very suboptimal. For each pair of files that is to be diffed, we will
repeatedly query every directory component leading towards their
respective location for an attributes file. This leads to thousands or
even hundreds of thousands of wasted syscalls.

The attributes cache already has a mechanism to help in that scenario in
form of the `git_attr_session`. As long as the same attributes session
is still active, we will not try to re-query the gitattributes files at all
but simply retain our currently cached results. To fix our problem, we
can create a session at the top-most level, which is the initialization
of the `git_diff` structure, and use it in order to look up the correct
diff driver. As the `git_diff` structure is used to generate patches for
multiple files at once, this neatly solves our problem by retaining the
session until patches for all files have been generated.

The fix has been tested with linux.git by calling
`git_diff_tree_to_tree` and `git_diff_to_buf` with v4.10^{tree} and
v4.14^{tree}.

                | time    | .gitattributes stats
    without fix | 33.201s | 844614
    with fix    | 30.327s | 4441

While execution time only improved by roughly 9%, the stat(2) syscalls for
.gitattributes files decreased by 99.5%. The benchmarks were quite
simple with best-of-three timings on Linux ext4 systems. One can assume
that for network based file systems the performance gain will be a lot
larger due to a much higher latency.
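
The caching behaviour described above can be modelled with a small standalone sketch. It is not libgit2's attribute cache: every `demo_` name is hypothetical, the "file" never exists, and a session is reduced to a plain integer id (0 meaning "no session"). It only illustrates why a negative cache entry is re-checked on every lookup unless a session pins it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of one cached .gitattributes lookup result. */
enum { DEMO_UNKNOWN, DEMO_MISSING, DEMO_PRESENT };

typedef struct {
    int state;       /* what the cache currently believes about the file */
    int checked_in;  /* id of the session that last checked it, 0 = none */
} demo_cache_entry;

static int demo_stat_calls; /* counts simulated stat() calls */

/* Pretend to stat the file; in this sketch the file never exists. */
static int demo_stat(void)
{
    demo_stat_calls++;
    return DEMO_MISSING;
}

/* Look up the entry. A "missing" result is only trusted while the same
 * non-zero session that produced it is still active. */
int demo_lookup(demo_cache_entry *e, int session)
{
    assert(e != NULL);
    if (e->state == DEMO_PRESENT)
        return e->state; /* simplification: found files stay cached */
    if (e->state == DEMO_MISSING && session != 0 && e->checked_in == session)
        return e->state; /* same session: keep the negative result */
    e->state = demo_stat();
    e->checked_in = session;
    return e->state;
}

/* Run `lookups` lookups against a fresh entry and report the stat count. */
int demo_count_stats(int lookups, int session)
{
    demo_cache_entry e = { DEMO_UNKNOWN, 0 };
    int i;
    demo_stat_calls = 0;
    for (i = 0; i < lookups; i++)
        demo_lookup(&e, session);
    return demo_stat_calls;
}
```

In this model, 1000 lookups without a session cost 1000 simulated stats, while the same lookups inside one session cost a single stat.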

committed 7 years ago

03 Jul, 2017 1 commit

Make sure to always include "common.h" first · 0c7f49dd

Next to including several files, our "common.h" header also declares
various macros which are then used throughout the project. As such, we
have to make sure to always include this file first in all
implementation files. Otherwise, we might encounter problems or even
silent behavioural differences due to macros or defines not being
defined as they should be. So in fact, our header and implementation
files should make sure to always include "common.h" first.

This commit does so by establishing a common include pattern. Header
files inside of "src" will now always include "common.h" as their first
included file, separated by a blank line from all the other includes to make
it stand out as special. There are two cases for the implementation
files. If they do have a matching header file, they will always include
this one first, leading to "common.h" being transitively included as
first file. If they do not have a matching header file, they instead
include "common.h" as first file themselves.

This fixes the outlined problems and will become our standard practice
for header and source files inside of the "src/" from now on.
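
The "silent behavioural differences" mentioned above can be reproduced in a single translation unit. The following is only an illustrative sketch: `DEMO_FEATURE` stands in for a macro that "common.h" would define, and both functions are hypothetical. Their bodies are identical, yet they behave differently depending on whether the definition has been seen before they are compiled.

```c
#include <assert.h>

/* Compiled before the stand-in for "common.h" has been seen. */
int demo_feature_before(void)
{
#ifdef DEMO_FEATURE
    return 1;
#else
    return 0;
#endif
}

/* Stand-in for a macro that including "common.h" would provide. */
#define DEMO_FEATURE 1

/* Same body, but compiled after the definition is visible. */
int demo_feature_after(void)
{
#ifdef DEMO_FEATURE
    return 1;
#else
    return 0;
#endif
}

/* The two functions disagree although their source text is identical. */
void demo_self_check(void)
{
    assert(demo_feature_before() == 0);
    assert(demo_feature_after() == 1);
}
```

No compiler diagnostic is produced for the first function, which is why a fixed include order is the safer convention.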

committed 7 years ago

24 Aug, 2016 1 commit

Teach `git_patch_from_diff` about parsed diffs · b859faa6

Ensure that `git_patch_from_diff` can return the patch for parsed diffs,
not just generate a patch for a generated diff.

committed 8 years ago

26 May, 2016 2 commits

introduce `git_diff_from_buffer` to parse diffs · 7166bb16

Parse diff files into a `git_diff` structure.

Edward Thomson committed 8 years ago

git_diff_generated: abstract generated diffs · 9be638ec
Edward Thomson committed 8 years ago

23 Nov, 2015 1 commit

checkout: only consider nsecs when built that way · 25e84f95

When examining the working directory and determining whether it's
up-to-date, only consider the nanoseconds in the index entry when
built with `GIT_USE_NSEC`. This prevents us from believing that
the working directory is always dirty when the index was originally
written with a git client that understands nsecs (like git 2.x).

committed 9 years ago

26 Jun, 2015 1 commit

Only write index if updated when passing GIT_DIFF_UPDATE_INDEX · c2e1b058

When diffing the index with the workdir and GIT_DIFF_UPDATE_INDEX has been passed,
the previous implementation was always writing the index to disk even if it wasn't
modified.

committed 9 years ago

23 Jun, 2015 2 commits

stash: save the workdir file when deleted in index · 90177111

When stashing the workdir tree, examine the index as well.  Using
a mechanism similar to `git_diff_tree_to_workdir_with_index`
allows us to determine that a file was added in the index and
subsequently modified in the working directory.  Without examining
the index, we would erroneously believe that this file was
untracked and fail to include it in the working directory tree.

Use a slightly modified `git_diff_tree_to_workdir_with_index` in
order to avoid some of the behavior custom to `git diff`.  In
particular, be sure to include the working directory side of a
file when it was deleted in the index.

committed 9 years ago

git_diff__merge: allow pluggable diff merges · 5ef43d41
Edward Thomson committed 9 years ago

20 Jun, 2015 1 commit

diff: preserve original mode in the index · 96dd171e

When updating the index during a diff, preserve the original mode,
which prevents us from dropping the mode to what we have interpreted
as on our system (eg, what the working directory claims it to be,
which may be a lie on some systems.)

committed 9 years ago

02 May, 2014 5 commits

Remove trace / add git_diff_perfdata struct + api · 9c8ed499
Russell Belfer committed 10 years ago

Add GIT_STATUS_OPT_UPDATE_INDEX and use trace API · cd424ad5

This adds an option to refresh the stat cache while generating
status. It also rips out the GIT_PERF stuff I had and makes use
of the trace API to keep statistics about what happens during diff.

committed 10 years ago

Add diff option to update index stat cache · 94fb4aad

When diff is scanning the working directory, if it finds a file
where it is not sure if the index entry matches the working dir,
it will recalculate the OID (which is pretty expensive). This
adds a new flag to diff so that if the OID calculation finds that
the file actually has not changed (i.e. just the modified time was
altered or such), then it will refresh the stat cache in the index
so that future calls to diff will not have to check the oid again.

committed 10 years ago

Lay groundwork for updating stat cache in diff · 0fc8e1f6

This reorganized the diff OID calculation to make it easier to
correctly update the stat cache during a diff once the flags to
do so are enabled.

This includes marking the path of a git_index_entry as const so
we can make a "fake" git_index_entry with a "const char *" path
and not get warnings.  I was a little surprised at how unobtrusive
this change was, but I think it's probably a good thing.

committed 10 years ago

Add build option for diff internal statistics · 240f4af3
Russell Belfer committed 10 years ago

15 Apr, 2014 1 commit

Introduce git_diff_format_email and git_diff_commit_as_email · d8cc1fb6
Jacques Germishuys committed 10 years ago

25 Jan, 2014 1 commit

diff: rename the file's 'oid' to 'id' · 9950bb4e

In the same vein as the previous commits in this series.

Carlos Martín Nieto committed 10 years ago

01 Nov, 2013 1 commit

Fix --assume-unchanged support · 3e57069e

This was never really working right because we were checking the
wrong flag and not checking it in all the places that we need to
be checking it.  I finally got around to writing a test and adding
actual support for it.

committed 11 years ago

11 Oct, 2013 1 commit

Rename diff objects and split patch.h · 3ff1d123

This makes no functional change to diff but renames a couple of
the objects and splits the new git_patch (formerly git_diff_patch)
into a new header file.

committed 11 years ago

25 Jul, 2013 1 commit

Make rename detection file size fix better · effdbeb3

The previous fix for checking file sizes with rename detection
always loads the blob.  In this version, if the odb backend can
get the object header without loading the whole thing into memory,
then we'll just use that, so that we can eliminate possible rename
sources & targets without loading them.
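
One way sizes alone can rule out a candidate pair is sketched below. The bound used here (similarity can be at most 100 * smaller / larger) is an assumption chosen for illustration and is not claimed to be the formula libgit2 uses; `demo_sizes_may_match` is a hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>

/* Decide from object sizes alone whether two blobs could still reach
 * a similarity threshold given in percent (0..100).
 * Assumption of this sketch: at most min(a, b) bytes can be shared, so
 * the similarity is bounded by 100 * min(a, b) / max(a, b).
 * Returns 1 if the pair must still be compared, 0 if it can be skipped
 * without loading either blob. */
int demo_sizes_may_match(size_t a, size_t b, unsigned threshold)
{
    unsigned long long lo = a < b ? a : b;
    unsigned long long hi = a < b ? b : a;

    assert(threshold <= 100);
    /* Cross-multiplied to avoid division; overflows only for sizes far
     * beyond what this sketch is meant to handle. */
    return lo * 100ULL >= hi * (unsigned long long)threshold;
}
```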

committed 11 years ago

23 Jul, 2013 1 commit

Add hunk/file headers to git_diff_patch_size · 197b8966

This allows git_diff_patch_size to account for hunk headers and
file headers in the returned size.  This required some refactoring
of the code that is used to print file headers so that it could be
invoked by the git_diff_patch_size API.

Also this increases the test coverage and fixes an off-by-one bug
in the size calculation when newline changes happen at the end of
the file.

committed 11 years ago

10 Jul, 2013 1 commit

Add git_pathspec_match_diff API · 2b672d5b

This adds an additional pathspec API that will match a pathspec
against a diff object.  This is convenient if you want to handle
renames (so you need the whole diff and can't use the pathspec
constraint built into the diff API) but still want to tell if the
diff had any files that matched the pathspec.

When the pathspec is matched against a diff, instead of keeping
a list of filenames that matched, the API keeps the list of
git_diff_deltas that matched, which can be retrieved via the
new API git_pathspec_match_list_diff_entry.

There are a couple of other minor API extensions here that were
mostly for the sake of convenience and to reduce dependencies
on knowing the internal data structure between files inside the
library.

committed 11 years ago

17 Jun, 2013 2 commits

More tests and bug fixes for status with rename · a1683f28

This changes the behavior of the status RENAMED flags so that they
will be combined with the MODIFIED flags if appropriate.  If a file
is modified in the index and also renamed, then the status code
will have both the GIT_STATUS_INDEX_MODIFIED and INDEX_RENAMED bits
set.  If it is renamed but the OID has not changed, then just the
GIT_STATUS_INDEX_RENAMED bit will be set.  Similarly, the flags
GIT_STATUS_WT_MODIFIED and GIT_STATUS_WT_RENAMED can both be set
independently of one another.
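
The flag combination can be illustrated with a short sketch. The `DEMO_` constants are placeholders with arbitrary bit positions, not the real `GIT_STATUS_*` values, and `demo_index_status` is a hypothetical helper.

```c
#include <assert.h>

/* Placeholder bits; the real GIT_STATUS_* values are not assumed here. */
#define DEMO_INDEX_MODIFIED (1u << 0)
#define DEMO_INDEX_RENAMED  (1u << 1)

/* Build the index-side status for one file. The two bits are set
 * independently, so a renamed-and-edited file carries both. */
unsigned demo_index_status(int renamed, int id_changed)
{
    unsigned status = 0;

    assert((DEMO_INDEX_MODIFIED & DEMO_INDEX_RENAMED) == 0);
    if (renamed)
        status |= DEMO_INDEX_RENAMED;
    if (id_changed)
        status |= DEMO_INDEX_MODIFIED;
    return status;
}
```

Callers then test each bit separately, for example `status & DEMO_INDEX_RENAMED`, rather than comparing the whole value against a single constant.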

This fixes a serious bug where the check for unmodified files that
was done at data load time could end up erasing the RENAMED state
of a file that was renamed with no changes.

Lastly, this contains a bunch of new tests for status with renames,
including tests where the only rename changes are case changes.
The expected results of these tests have to vary by whether the
platform uses a case sensitive filesystem or not, so the expected
data covers those platform differences separately.

committed 11 years ago

Improve case handling in git_diff__paired_foreach · 351888cf

This commit reinstates some changes to git_diff__paired_foreach
that were discarded during the rebase (because the diff_output.c
file had gone away), and also adjusts the case insensitivity
logic slightly to hopefully deal with both mismatched icase
diffs and other case insensitivity scenarios.

committed 11 years ago

10 Jun, 2013 1 commit

Reorganize diff and add basic diff driver · 114f5a6c

This is a significant reorganization of the diff code to break it
into a set of more clearly distinct files and to document the new
organization.  Hopefully this will make the diff code easier to
understand and to extend.

This adds a new `git_diff_driver` object that looks up diff driver
information from the attributes and the config so that things like
function content in diff headers can be provided.  The full driver
spec is not implemented in the commit - this is focused on the
reorganization of the code and putting the driver hooks in place.

This also removes a few #includes from src/repository.h that were
overbroad, but as a result required extra #includes in a variety
of places since including src/repository.h no longer results in
pulling in the whole world.

committed 11 years ago

23 May, 2013 1 commit

More diff rename tests; better split swap handling · 67db583d

This adds a couple more tests of different rename scenarios.

Also, this fixes a problem with the case where you have two
"split" deltas and the left half of one matches the right half of
the other.  That case was already being handled, but in the wrong
order in a way that could result in bad output.  Also, if the swap
also happened to put the other two halves into the correct place
(i.e. two files exchanged places with each other), then the second
delta was left with the SPLIT flag set when it really should be
cleared.

committed 11 years ago

22 May, 2013 1 commit

Significant rename detection rewrite · a21cbb12

This flips rename detection around so that, instead of creating a
forward mapping from deltas to possible rename targets, it
creates a reverse mapping, looking at possible targets and
trying to find a source that they could have been renamed or
copied from.  This is important because each output can only
have a single source, but a given source could map to multiple
outputs (in the form of COPIED records).
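
The reverse mapping can be sketched as a loop over targets that picks the best-scoring source for each. This is a simplified illustration with hypothetical names: the similarity scores are supplied as a precomputed matrix, and the real code's handling of split deltas and rename versus copy is left out.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_NO_SOURCE ((size_t)-1)

/* For each target, pick the single best source whose score reaches the
 * threshold. `sim` is a row-major n_targets x n_sources matrix of
 * similarity scores (0..100). Each target ends up with at most one
 * source, while one source may be chosen by several targets. */
void demo_match_targets(const unsigned *sim, size_t n_targets,
                        size_t n_sources, unsigned threshold,
                        size_t *best_source)
{
    size_t t, s;

    assert(sim != NULL && best_source != NULL);
    for (t = 0; t < n_targets; t++) {
        unsigned best = 0;
        best_source[t] = DEMO_NO_SOURCE;
        for (s = 0; s < n_sources; s++) {
            unsigned score = sim[t * n_sources + s];
            if (score >= threshold && score > best) {
                best = score;
                best_source[t] = s;
            }
        }
    }
}
```

Because the outer loop runs over targets, a result such as "A was renamed to both B and C" cannot appear as two renames of the same target; B and C each simply record A as their one source.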

Additionally, this makes a couple of tweaks to the public rename
detection APIs, mostly renaming a couple of options that control
the behavior to make more sense and to be more like core Git.

I walked through the tests looking at the exact results and
updated the expectations based on what I saw.  The new code is
different from the old because it cannot give some nonsense
results (like A was renamed to both B and C) which were part of
the outputs previously.

committed 11 years ago

07 May, 2013 1 commit

Add GIT_DIFF_LINE_CONTEXT_EOFNL · e35e2684

This adds a new line origin constant for the special line that
is used when both files end without a newline.

In the course of writing the tests for this, I was having problems
with modifying a file but not having diff notice because it was
the same size and modified less than one second from the start of
the test, so I decided to start working on nanosecond timestamp
support.  This commit doesn't contain the nanosecond support, but
it contains the reorganization of maybe_modified and the hooks so
that if the nanosecond data were being read by stat() (or rather
being copied by git_index_entry__init_from_stat), then the nsec
would be taken into account.

This new stuff could probably use some more tests, although there
is some amount of it here.

committed 11 years ago

30 Apr, 2013 1 commit

renames! · 0462fba5
Edward Thomson committed 11 years ago

20 Feb, 2013 1 commit

Replace diff delta binary with flags · 71a3d27e

Previously the git_diff_delta recorded if the delta was binary.
This replaces that (with no net change in structure size) with
a full set of flags.  The flag values that were already in use
for individual git_diff_file objects are reused for the delta
flags, too (along with renaming those flags to make it clear that
they are used more generally).

This (a) makes things somewhat more consistent (because I was
using a -1 value in the "boolean" binary field to indicate unset,
whereas now I can just use the flags that are easier to understand),
and (b) will make it easier for me to add some additional flags to
the delta object in the future, such as marking the results of a
copy/rename detection or other deltas that might want a special
indicator.
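
As a rough illustration of the change, the tri-state "boolean" can be expressed with two flag bits, leaving the remaining bits free for later additions. All `DEMO_` names and bit positions below are hypothetical and do not reproduce the library's actual flag values.

```c
#include <assert.h>

/* Hypothetical delta flags; neither binary bit set means "not yet known". */
#define DEMO_FLAG_BINARY           (1u << 0)
#define DEMO_FLAG_NOT_BINARY       (1u << 1)
#define DEMO_FLAG_RENAME_CANDIDATE (1u << 2) /* example of a later addition */

/* Map the flags back to the old tri-state value for comparison:
 * 1 = binary, 0 = text, -1 = not determined yet. */
int demo_binary_state(unsigned flags)
{
    /* The two bits are mutually exclusive. */
    assert(!((flags & DEMO_FLAG_BINARY) && (flags & DEMO_FLAG_NOT_BINARY)));
    if (flags & DEMO_FLAG_BINARY)
        return 1;
    if (flags & DEMO_FLAG_NOT_BINARY)
        return 0;
    return -1;
}
```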

While making this change, I officially moved some of the flags that
were internal only into the private diff header.

This also allowed me to remove a gross hack in rename/copy detect
code where I was overwriting the status field with an internal
value.

committed 11 years ago

08 Jan, 2013 1 commit

update copyrights · 359fc2d2
Edward Thomson committed 12 years ago

10 Dec, 2012 1 commit

Clean up iterator APIs · 9950d27a

This removes the need to explicitly pass the repo into iterators
where the repo is implied by the other parameters.  This moves
the repo to be owned by the parent struct.  Also, this has some
iterator related updates to the internal diff API to lay the
groundwork for checkout improvements.

committed 12 years ago

01 Dec, 2012 1 commit

Deploy GITERR_CHECK_VERSION · c7231c45
Ben Straub committed 12 years ago

30 Nov, 2012 1 commit

Deploy GIT_DIFF_OPTIONS_INIT · 2f8d30be
Ben Straub committed 12 years ago

15 Nov, 2012 2 commits

Add explicit git_index ptr to diff and checkout · bbe6dbec

A number of diff APIs and the `git_checkout_index` API take a
`git_repository` object and operate on the index.  This updates
them to take a `git_index` pointer explicitly and only fall back
on the `git_repository` index if the index input is NULL.  This
makes it easier to operate on a temporary index.
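
The fallback rule can be sketched with stand-in types. `demo_repo`, `demo_index_handle` and `demo_resolve_index` are hypothetical and only model the "explicit index, else the repository's index" decision.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for an index and a repository that owns one. */
typedef struct { const char *label; } demo_index_handle;
typedef struct { demo_index_handle index; } demo_repo;

/* Use the explicitly supplied index when there is one; otherwise fall
 * back on the index that belongs to the repository. */
demo_index_handle *demo_resolve_index(demo_repo *repo,
                                      demo_index_handle *index)
{
    assert(repo != NULL);
    return index != NULL ? index : &repo->index;
}
```

Passing a temporary index lets a caller diff or check out against it without touching the repository's own index.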

committed 12 years ago

Add iterator for git_index object · bad68c0a

The index iterator could previously only be created from a repo
object, but this allows creating an iterator from a `git_index`
object instead (while keeping, though renaming, the old function).

committed 12 years ago

09 Nov, 2012 1 commit

Fix various cross-platform build issues · 0f3def71

This fixes a number of warnings and problems with cross-platform
builds.  Among other things, it's not safe to name a member of a
structure "strcmp" because that may be #defined.
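
The naming issue arises because the C standard permits library functions to be provided additionally as function-like macros, so a call written as `s->strcmp(a, b)` may be rewritten by the preprocessor. A minimal sketch of the safe alternative, using a hypothetical `demo_sorter` struct with a neutrally named member:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct {
    /* Named "compare" rather than "strcmp": a member called "strcmp"
     * can break if the platform defines strcmp as a function-like
     * macro, because member calls would then be macro-expanded. */
    int (*compare)(const char *, const char *);
} demo_sorter;

void demo_sorter_init(demo_sorter *s)
{
    assert(s != NULL);
    /* No "(" follows the name here, so a function-like macro would not
     * expand and the real function's address is taken. */
    s->compare = strcmp;
}

int demo_sorter_cmp(const demo_sorter *s, const char *a, const char *b)
{
    assert(s != NULL && s->compare != NULL);
    return s->compare(a, b);
}
```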

committed 12 years ago
