- 18 Jan, 2020 1 commit
-
-
libgit2 does not use `type_t` suffixes as it's redundant; thus, rename `git_iterator_type_t` to `git_iterator_t` for consistency.
Edward Thomson committed
-
- 19 Nov, 2019 1 commit
-
-
Add a new 'print_index' flag to let the caller decide whether or not 'index <oid>..<oid>' should be printed. Since patch id needs not to have index when hashing a patch, it will be useful soon. Signed-off-by: Gregory Herrero <gregory.herrero@oracle.com>
Gregory Herrero committed
-
- 01 Feb, 2018 1 commit
-
-
use consistent names for the #include / #define header guard pattern.
Edward Thomson committed
-
- 03 Jan, 2018 1 commit
-
-
When generating a diff between two trees, for each file that is to be diffed we have to determine whether it shall be treated as text or as binary files. While git has heuristics to determine which kind of diff to generate, users can also that default behaviour by setting or unsetting the 'diff' attribute for specific files. Because of that, we have to query gitattributes in order to determine how to diff the current files. Instead of hitting the '.gitattributes' file every time we need to query an attribute, which can get expensive especially on networked file systems, we try to cache them instead. This works perfectly fine for every '.gitattributes' file that is found, but we hit cache invalidation problems when we determine that an attribuse file is _not_ existing. We do create an entry in the cache for missing '.gitattributes' files, but as soon as we hit that file again we invalidate it and stat it again to see if it has now appeared. In the case of diffing large trees with each other, this behaviour is very suboptimal. For each pair of files that is to be diffed, we will repeatedly query every directory component leading towards their respective location for an attributes file. This leads to thousands or even hundreds of thousands of wasted syscalls. The attributes cache already has a mechanism to help in that scenario in form of the `git_attr_session`. As long as the same attributes session is still active, we will not try to re-query the gitmodules files at all but simply retain our currently cached results. To fix our problem, we can create a session at the top-most level, which is the initialization of the `git_diff` structure, and use it in order to look up the correct diff driver. As the `git_diff` structure is used to generate patches for multiple files at once, this neatly solves our problem by retaining the session until patches for all files have been generated. The fix has been tested with linux.git by calling `git_diff_tree_to_tree` and `git_diff_to_buf` with v4.10^{tree} and v4.14^{tree}. | time | .gitattributes stats without fix | 33.201s | 844614 with fix | 30.327s | 4441 While execution only improved by roughly 10%, the stat(3) syscalls for .gitattributes files decreased by 99.5%. The benchmarks were quite simple with best-of-three timings on Linux ext4 systems. One can assume that for network based file systems the performance gain will be a lot larger due to a much higher latency.
Patrick Steinhardt committed
-
- 03 Jul, 2017 1 commit
-
-
Next to including several files, our "common.h" header also declares various macros which are then used throughout the project. As such, we have to make sure to always include this file first in all implementation files. Otherwise, we might encounter problems or even silent behavioural differences due to macros or defines not being defined as they should be. So in fact, our header and implementation files should make sure to always include "common.h" first. This commit does so by establishing a common include pattern. Header files inside of "src" will now always include "common.h" as its first other file, separated by a newline from all the other includes to make it stand out as special. There are two cases for the implementation files. If they do have a matching header file, they will always include this one first, leading to "common.h" being transitively included as first file. If they do not have a matching header file, they instead include "common.h" as first file themselves. This fixes the outlined problems and will become our standard practice for header and source files inside of the "src/" from now on.
Patrick Steinhardt committed
-
- 24 Aug, 2016 1 commit
-
-
Ensure that `git_patch_from_diff` can return the patch for parsed diffs, not just generate a patch for a generated diff.
Edward Thomson committed
-
- 26 May, 2016 2 commits
-
-
Parse diff files into a `git_diff` structure.
Edward Thomson committed -
Edward Thomson committed
-
- 23 Nov, 2015 1 commit
-
-
When examining the working directory and determining whether it's up-to-date, only consider the nanoseconds in the index entry when built with `GIT_USE_NSEC`. This prevents us from believing that the working directory is always dirty when the index was originally written with a git client that uinderstands nsecs (like git 2.x).
Edward Thomson committed
-
- 26 Jun, 2015 1 commit
-
-
When diffing the index with the workdir and GIT_DIFF_UPDATE_INDEX has been passed, the previous implementation was always writing the index to disk even if it wasn't modified.
Pierre-Olivier Latour committed
-
- 23 Jun, 2015 2 commits
-
-
When stashing the workdir tree, examine the index as well. Using a mechanism similar to `git_diff_tree_to_workdir_with_index` allows us to determine that a file was added in the index and subsequently modified in the working directory. Without examining the index, we would erroneously believe that this file was untracked and fail to include it in the working directory tree. Use a slightly modified `git_diff_tree_to_workdir_with_index` in order to avoid some of the behavior custom to `git diff`. In particular, be sure to include the working directory side of a file when it was deleted in the index.
Edward Thomson committed -
Edward Thomson committed
-
- 20 Jun, 2015 1 commit
-
-
When updating the index during a diff, preserve the original mode, which prevents us from dropping the mode to what we have interpreted as on our system (eg, what the working directory claims it to be, which may be a lie on some systems.)
Edward Thomson committed
-
- 02 May, 2014 5 commits
-
-
Russell Belfer committed
-
This adds an option to refresh the stat cache while generating status. It also rips out the GIT_PERF stuff I had an makes use of the trace API to keep statistics about what happens during diff.
Russell Belfer committed -
When diff is scanning the working directory, if it finds a file where it is not sure if the index entry matches the working dir, it will recalculate the OID (which is pretty expensive). This adds a new flag to diff so that if the OID calculation finds that the file actually has not changed (i.e. just the modified time was altered or such), then it will refresh the stat cache in the index so that future calls to diff will not have to check the oid again.
Russell Belfer committed -
This reorganized the diff OID calculation to make it easier to correctly update the stat cache during a diff once the flags to do so are enabled. This includes marking the path of a git_index_entry as const so we can make a "fake" git_index_entry with a "const char *" path and not get warnings. I was a little surprised at how unobtrusive this change was, but I think it's probably a good thing.
Russell Belfer committed -
Russell Belfer committed
-
- 15 Apr, 2014 1 commit
-
-
Jacques Germishuys committed
-
- 25 Jan, 2014 1 commit
-
-
In the same vein as the previous commits in this series.
Carlos Martín Nieto committed
-
- 01 Nov, 2013 1 commit
-
-
This was never really working right because we were checking the wrong flag and not checking it in all the places that we need to be checking it. I finally got around to writing a test and adding actual support for it.
Russell Belfer committed
-
- 11 Oct, 2013 1 commit
-
-
This makes no functional change to diff but renames a couple of the objects and splits the new git_patch (formerly git_diff_patch) into a new header file.
Russell Belfer committed
-
- 25 Jul, 2013 1 commit
-
-
The previous fix for checking file sizes with rename detection always loads the blob. In this version, if the odb backend can get the object header without loading the whole thing into memory, then we'll just use that, so that we can eliminate possible rename sources & targets without loading them.
Russell Belfer committed
-
- 23 Jul, 2013 1 commit
-
-
This allows git_diff_patch_size to account for hunk headers and file headers in the returned size. This required some refactoring of the code that is used to print file headers so that it could be invoked by the git_diff_patch_size API. Also this increases the test coverage and fixes an off-by-one bug in the size calculation when newline changes happen at the end of the file.
Russell Belfer committed
-
- 10 Jul, 2013 1 commit
-
-
This adds an additional pathspec API that will match a pathspec against a diff object. This is convenient if you want to handle renames (so you need the whole diff and can't use the pathspec constraint built into the diff API) but still want to tell if the diff had any files that matched the pathspec. When the pathspec is matched against a diff, instead of keeping a list of filenames that matched, instead the API keeps the list of git_diff_deltas that matched and they can be retrieved via a new API git_pathspec_match_list_diff_entry. There are a couple of other minor API extensions here that were mostly for the sake of convenience and to reduce dependencies on knowing the internal data structure between files inside the library.
Russell Belfer committed
-
- 17 Jun, 2013 2 commits
-
-
This changes the behavior of the status RENAMED flags so that they will be combined with the MODIFIED flags if appropriate. If a file is modified in the index and also renamed, then the status code will have both the GIT_STATUS_INDEX_MODIFIED and INDEX_RENAMED bits set. If it is renamed but the OID has not changed, then just the GIT_STATUS_INDEX_RENAMED bit will be set. Similarly, the flags GIT_STATUS_WT_MODIFIED and GIT_STATUS_WT_RENAMED can both be set independently of one another. This fixes a serious bug where the check for unmodified files that was done at data load time could end up erasing the RENAMED state of a file that was renamed with no changes. Lastly, this contains a bunch of new tests for status with renames, including tests where the only rename changes are case changes. The expected results of these tests have to vary by whether the platform uses a case sensitive filesystem or not, so the expected data covers those platform differences separately.
Russell Belfer committed -
This commit reinstates some changes to git_diff__paired_foreach that were discarded during the rebase (because the diff_output.c file had gone away), and also adjusts the case insensitively logic slightly to hopefully deal with either mismatched icase diffs and other case insensitivity scenarios.
Russell Belfer committed
-
- 10 Jun, 2013 1 commit
-
-
This is a significant reorganization of the diff code to break it into a set of more clearly distinct files and to document the new organization. Hopefully this will make the diff code easier to understand and to extend. This adds a new `git_diff_driver` object that looks of diff driver information from the attributes and the config so that things like function content in diff headers can be provided. The full driver spec is not implemented in the commit - this is focused on the reorganization of the code and putting the driver hooks in place. This also removes a few #includes from src/repository.h that were overbroad, but as a result required extra #includes in a variety of places since including src/repository.h no longer results in pulling in the whole world.
Russell Belfer committed
-
- 23 May, 2013 1 commit
-
-
This adds a couple more tests of different rename scenarios. Also, this fixes a problem with the case where you have two "split" deltas and the left half of one matches the right half of the other. That case was already being handled, but in the wrong order in a way that could result in bad output. Also, if the swap also happened to put the other two halves into the correct place (i.e. two files exchanged places with each other), then the second delta was left with the SPLIT flag set when it really should be cleared.
Russell Belfer committed
-
- 22 May, 2013 1 commit
-
-
This flips rename detection around so instead of creating a forward mapping from deltas to possible rename targets, instead it creates a reverse mapping, looking at possible targets and trying to find a source that they could have been renamed or copied from. This is important because each output can only have a single source, but a given source could map to multiple outputs (in the form of COPIED records). Additionally, this makes a couple of tweaks to the public rename detection APIs, mostly renaming a couple of options that control the behavior to make more sense and to be more like core Git. I walked through the tests looking at the exact results and updated the expectations based on what I saw. The new code is different from the old because it cannot give some nonsense results (like A was renamed to both B and C) which were part of the outputs previously.
Russell Belfer committed
-
- 07 May, 2013 1 commit
-
-
This adds a new line origin constant for the special line that is used when both files end without a newline. In the course of writing the tests for this, I was having problems with modifying a file but not having diff notice because it was the same size and modified less than one second from the start of the test, so I decided to start working on nanosecond timestamp support. This commit doesn't contain the nanosecond support, but it contains the reorganization of maybe_modified and the hooks so that if the nanosecond data were being read by stat() (or rather being copied by git_index_entry__init_from_stat), then the nsec would be taken into account. This new stuff could probably use some more tests, although there is some amount of it here.
Russell Belfer committed
-
- 30 Apr, 2013 1 commit
-
-
Edward Thomson committed
-
- 20 Feb, 2013 1 commit
-
-
Previously the git_diff_delta recorded if the delta was binary. This replaces that (with no net change in structure size) with a full set of flags. The flag values that were already in use for individual git_diff_file objects are reused for the delta flags, too (along with renaming those flags to make it clear that they are used more generally). This (a) makes things somewhat more consistent (because I was using a -1 value in the "boolean" binary field to indicate unset, whereas now I can just use the flags that are easier to understand), and (b) will make it easier for me to add some additional flags to the delta object in the future, such as marking the results of a copy/rename detection or other deltas that might want a special indicator. While making this change, I officially moved some of the flags that were internal only into the private diff header. This also allowed me to remove a gross hack in rename/copy detect code where I was overwriting the status field with an internal value.
Russell Belfer committed
-
- 08 Jan, 2013 1 commit
-
-
Edward Thomson committed
-
- 10 Dec, 2012 1 commit
-
-
This removes the need to explicitly pass the repo into iterators where the repo is implied by the other parameters. This moves the repo to be owned by the parent struct. Also, this has some iterator related updates to the internal diff API to lay the groundwork for checkout improvements.
Russell Belfer committed
-
- 01 Dec, 2012 1 commit
-
-
Ben Straub committed
-
- 30 Nov, 2012 1 commit
-
-
Ben Straub committed
-
- 15 Nov, 2012 2 commits
-
-
A number of diff APIs and the `git_checkout_index` API take a `git_repository` object an operate on the index. This updates them to take a `git_index` pointer explicitly and only fall back on the `git_repository` index if the index input is NULL. This makes it easier to operate on a temporary index.
Russell Belfer committed -
The index iterator could previously only be created from a repo object, but this allows creating an iterator from a `git_index` object instead (while keeping, though renaming, the old function).
Russell Belfer committed
-
- 09 Nov, 2012 1 commit
-
-
This fixes a number of warnings and problems with cross-platform builds. Among other things, it's not safe to name a member of a structure "strcmp" because that may be #defined.
Russell Belfer committed
-