- 23 Feb, 2022 1 commit
-
-
Edward Thomson committed
-
- 12 Feb, 2022 3 commits
-
-
When we know a file's size and that size changes, fail.
Edward Thomson committed -
Skip new_file_size non-zero test, custom error message if file changed in workdir
Co-authored-by: Edward Thomson <ethomson@github.com>
Iliyas Jorio committed -
"diff_file_content_load_workdir_file()" maps a file from the workdir into memory. It uses git_diff_file.size to determine the size of the memory mapping. If this value goes stale, the mmaped area would be sized incorrectly. This could occur if an external program changes the contents of the file after libgit2 had cached its size. This used to segfault if the file becomes smaller (mmaped area too large). This patch causes diff_file_content_load_workdir_file to fail without crashing if it detects that the file size has changed.
Iliyas Jorio committed
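The stale-size check described above can be sketched in isolation. This is a minimal illustration under stated assumptions, not libgit2's code: `sketch_check_size_unchanged` is a hypothetical helper, and it uses POSIX `stat` on a path, whereas the real function may inspect an already opened file.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/stat.h>

/*
 * Hypothetical helper (not part of libgit2): compare the size we cached
 * earlier against the size the file has right now. Returns 0 when the
 * sizes match and -1 when the file is missing or its size has changed,
 * so the caller can fail cleanly instead of mapping a wrongly sized area.
 */
static int sketch_check_size_unchanged(const char *path, uint64_t cached_size)
{
    struct stat st;

    assert(path != NULL);

    if (stat(path, &st) != 0)
        return -1;

    if (st.st_size < 0 || (uint64_t)st.st_size != cached_size) {
        fprintf(stderr, "file changed before we could read it\n");
        return -1;
    }

    return 0;
}
```

A caller would run a check like this immediately before creating the mapping and propagate the error rather than trusting the cached size.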
-
- 17 Oct, 2021 1 commit
-
-
libgit2 has two distinct requirements that were previously solved by `git_buf`. We require: 1. A general purpose string class that provides a number of utility APIs for manipulating data (e.g., concatenating, truncating, etc.). 2. A structure that we can use to return strings to callers that they can take ownership of. By using a single class (`git_buf`) for both of these purposes, we have confused the API to the point that refactorings are difficult and reasoning about correctness is also difficult. Rename the utility class `git_buf` to `git_str`: this represents its general purpose, as an internal string buffer class. The name also is an homage to Junio Hamano ("gitstr"). The public API remains `git_buf`, and has a much smaller footprint. It is generally only used as an "out" param with strict requirements that follow the documentation. (Exceptions exist for some legacy APIs to avoid breaking callers unnecessarily.) Utility functions exist to convert a user-specified `git_buf` to a `git_str` so that we can call internal functions, and then convert it back again.
Edward Thomson committed
-
- 06 May, 2021 1 commit
-
-
Introduce `git_filter_list__convert_buf` which behaves like the old implementation of `git_filter_list__apply_data`, where it might move the input data buffer over into the output data buffer space for efficiency. This new implementation will do so in a more predictable way, always freeing the given input buffer (either moving it to the output buffer or filtering it into the output buffer first). Convert internal users to it.
Edward Thomson committed
-
- 28 Apr, 2021 1 commit
-
-
The new git_repository_workdir_path function does error checking on working directory inputs on Windows; use it to construct paths within working directories.
Edward Thomson committed
-
- 30 Jun, 2020 1 commit
-
-
This change:
* Initializes a few variables that were being read before being initialized.
* Includes https://github.com/madler/zlib/pull/393. As such, it only works reliably with `-DUSE_BUNDLED_ZLIB=ON`.
lhchavez committed
-
- 18 Jan, 2020 1 commit
-
-
libgit2 does not use `type_t` suffixes as it's redundant; thus, rename `git_iterator_type_t` to `git_iterator_t` for consistency.
Edward Thomson committed
-
- 22 Nov, 2019 2 commits
-
-
Instead of using a signed type (`off_t`) use `uint64_t` for the maximum size of files.
Edward Thomson committed -
Instead of using a signed type (`off_t`) use a new `git_object_size_t` for the sizes of objects.
Edward Thomson committed
-
- 20 Jul, 2019 1 commit
-
-
Our file utils functions all have a "futils" prefix, e.g. `git_futils_touch`. One would thus naturally guess that their definitions and implementation would live in files "futils.h" and "futils.c", respectively, but in fact they live in "fileops.h". Rename the files to match expectations.
Patrick Steinhardt committed
-
- 18 Jul, 2019 1 commit
-
-
`cvar` is an unhelpful name. Refactor its usage to `configmap` for more clarity.
Patrick Steinhardt committed
-
- 15 Jun, 2019 1 commit
-
-
The only function that is named `issomething` (without underscore) was `git_oid_iszero`. Rename it to `git_oid_is_zero` for consistency with the rest of the library.
Edward Thomson committed
-
- 22 Jan, 2019 1 commit
-
-
Move to the `git_error` name in the internal API for error-related functions.
Edward Thomson committed
-
- 01 Dec, 2018 1 commit
-
-
Use the new object_type enumeration names within the codebase.
Edward Thomson committed
-
- 10 Jun, 2018 1 commit
-
-
Patrick Steinhardt committed
-
- 03 Jan, 2018 1 commit
-
-
When generating a diff between two trees, for each file that is to be diffed we have to determine whether it shall be treated as a text or as a binary file. While git has heuristics to determine which kind of diff to generate, users can also override that default behaviour by setting or unsetting the 'diff' attribute for specific files. Because of that, we have to query gitattributes in order to determine how to diff the current files. Instead of hitting the '.gitattributes' file every time we need to query an attribute, which can get expensive especially on networked file systems, we try to cache them instead. This works perfectly fine for every '.gitattributes' file that is found, but we hit cache invalidation problems when we determine that an attributes file does _not_ exist. We do create an entry in the cache for missing '.gitattributes' files, but as soon as we hit that file again we invalidate it and stat it again to see if it has now appeared. In the case of diffing large trees with each other, this behaviour is very suboptimal. For each pair of files that is to be diffed, we will repeatedly query every directory component leading towards their respective location for an attributes file. This leads to thousands or even hundreds of thousands of wasted syscalls. The attributes cache already has a mechanism to help in that scenario in the form of the `git_attr_session`. As long as the same attributes session is still active, we will not try to re-query the gitattributes files at all but simply retain our currently cached results. To fix our problem, we can create a session at the top-most level, which is the initialization of the `git_diff` structure, and use it in order to look up the correct diff driver. As the `git_diff` structure is used to generate patches for multiple files at once, this neatly solves our problem by retaining the session until patches for all files have been generated.
The fix has been tested with linux.git by calling `git_diff_tree_to_tree` and `git_diff_to_buf` with v4.10^{tree} and v4.14^{tree}.

|             | time    | .gitattributes stats |
|-------------|---------|----------------------|
| without fix | 33.201s | 844614               |
| with fix    | 30.327s | 4441                 |

While execution only improved by roughly 10%, the stat(3) syscalls for .gitattributes files decreased by 99.5%. The benchmarks were quite simple with best-of-three timings on Linux ext4 systems. One can assume that for network based file systems the performance gain will be a lot larger due to a much higher latency.
Patrick Steinhardt committed
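The effect of a session on lookups of missing attribute files can be sketched with a toy cache. Everything here is an assumption made for illustration: the `sketch_` types and functions are hypothetical, `sketch_fake_stat` stands in for a real filesystem query and always reports the file as missing, and existing files are simply trusted once cached. It is not how the libgit2 attribute cache is implemented.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define SKETCH_MAX_ENTRIES 16

/* One cached answer to the question "does this attributes file exist?". */
typedef struct {
    const char *path;
    bool exists;
    unsigned int session; /* session that recorded the answer, 0 = none */
} sketch_entry;

typedef struct {
    sketch_entry entries[SKETCH_MAX_ENTRIES];
    size_t count;
    unsigned long stat_calls; /* how often we went to the "filesystem" */
} sketch_cache;

/* Stand-in for stat(): counts the call and reports the file as missing. */
static bool sketch_fake_stat(sketch_cache *cache, const char *path)
{
    (void)path;
    cache->stat_calls++;
    return false;
}

/*
 * Look up whether an attributes file exists. A cached "missing" answer is
 * only reused when it was recorded by the same non-zero session; without
 * a session, every lookup of a missing file queries the filesystem again.
 */
static bool sketch_attr_file_exists(sketch_cache *cache, const char *path,
                                    unsigned int session)
{
    sketch_entry *entry = NULL;
    size_t i;

    for (i = 0; i < cache->count; i++) {
        if (strcmp(cache->entries[i].path, path) == 0) {
            entry = &cache->entries[i];
            break;
        }
    }

    if (entry && (entry->exists || (session != 0 && entry->session == session)))
        return entry->exists;

    if (!entry) {
        assert(cache->count < SKETCH_MAX_ENTRIES);
        entry = &cache->entries[cache->count++];
        entry->path = path;
    }

    entry->exists = sketch_fake_stat(cache, path);
    entry->session = session;
    return entry->exists;
}
```

With session `0`, three lookups of the same missing path cost three simulated stat calls. With one shared non-zero session they cost one, which is the kind of saving the commit reports for a session held at the `git_diff` level.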
-
- 15 Dec, 2017 1 commit
-
-
When initializing a `git_diff_file_content` from a source whose data is derived from a blob, we simply assign the blob's pointer to the resulting struct without incrementing its refcount. Thus, the structure can only be used as long as the blob is kept alive by the caller. Fix the issue by using `git_blob_dup` instead of a direct assignment. This function will increment the refcount of the blob without allocating new memory, so it does exactly what we want. As `git_diff_file_content__unload` already frees the blob when `GIT_DIFF_FLAG__FREE_BLOB` is set, we don't need to add new code handling the free but only have to set that flag correctly.
Patrick Steinhardt committed
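The difference between assigning a pointer and taking a counted reference can be sketched with a toy refcounted object. The `sketch_blob` type and its functions are hypothetical and only mirror the idea behind `git_blob_dup`; the real function has a different signature and operates on libgit2 objects.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy refcounted object standing in for a blob (hypothetical type). */
typedef struct {
    int refcount;
    const char *data;
} sketch_blob;

/* Allocate a blob with a single reference, or return NULL on failure. */
static sketch_blob *sketch_blob_new(const char *data)
{
    sketch_blob *blob = malloc(sizeof(*blob));
    if (!blob)
        return NULL;
    blob->refcount = 1;
    blob->data = data;
    return blob;
}

/* Take another reference: no new memory, only a bumped counter. */
static sketch_blob *sketch_blob_dup(sketch_blob *blob)
{
    assert(blob != NULL && blob->refcount > 0);
    blob->refcount++;
    return blob;
}

/* Drop one reference. Returns 1 if the memory was released, 0 otherwise. */
static int sketch_blob_free(sketch_blob *blob)
{
    assert(blob != NULL && blob->refcount > 0);
    if (--blob->refcount == 0) {
        free(blob);
        return 1;
    }
    return 0;
}
```

A holder that stores the result of `sketch_blob_dup` can keep using the object after the original caller has released its own reference, which a plain pointer assignment would not guarantee.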
-
- 03 Jul, 2017 1 commit
-
-
Next to including several files, our "common.h" header also declares various macros which are then used throughout the project. As such, we have to make sure to always include this file first in all implementation files. Otherwise, we might encounter problems or even silent behavioural differences due to macros or defines not being defined as they should be. So in fact, our header and implementation files should make sure to always include "common.h" first. This commit does so by establishing a common include pattern. Header files inside of "src" will now always include "common.h" as their first include, separated by a newline from all the other includes to make it stand out as special. There are two cases for the implementation files. If they do have a matching header file, they will always include this one first, leading to "common.h" being transitively included as the first file. If they do not have a matching header file, they instead include "common.h" as the first file themselves. This fixes the outlined problems and will become our standard practice for header and source files inside of "src/" from now on.
Patrick Steinhardt committed
-
- 29 Dec, 2016 1 commit
-
-
Error messages should be sentence fragments, and therefore: 1. Should not begin with a capital letter, 2. Should not conclude with punctuation, and 3. Should not end a sentence and begin a new one
Edward Thomson committed
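The three rules above can be expressed as a small checker. This is a rough sketch under stated assumptions: `sketch_is_fragment_style` is a hypothetical helper, it only treats `.`, `!` and `?` as concluding punctuation, and it approximates "ends a sentence and begins a new one" by looking for a period followed by a space.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical style check for error messages written as sentence
 * fragments. Returns true when the message follows all three rules.
 */
static bool sketch_is_fragment_style(const char *msg)
{
    size_t len;
    char last;

    assert(msg != NULL);

    len = strlen(msg);
    if (len == 0)
        return false;

    /* Rule 1: should not begin with a capital letter. */
    if (isupper((unsigned char)msg[0]))
        return false;

    /* Rule 2: should not conclude with punctuation. */
    last = msg[len - 1];
    if (last == '.' || last == '!' || last == '?')
        return false;

    /* Rule 3: should not end one sentence and begin another. */
    if (strstr(msg, ". ") != NULL)
        return false;

    return true;
}
```

For instance, "could not open file" passes, while "Could not open file", "could not open file." and "could not open file. try again" are each rejected by one of the rules.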
-
- 26 May, 2016 2 commits
-
-
Edward Thomson committed
-
Now that `git_diff_delta` data can be produced by reading patch file data, which may have an abbreviated oid, allow consumers to know that the id is abbreviated.
Edward Thomson committed
-
- 03 Nov, 2015 1 commit
-
-
On platforms that lack `core.symlinks`, we should not go looking for symbolic links and `p_readlink` their target. Instead, we should examine the file's contents.
Edward Thomson committed
-
- 25 Jun, 2015 1 commit
-
-
Fallback describes the mechanism, while unspecified explains what the user is thinking.
Carlos Martín Nieto committed
-
- 22 Jun, 2015 2 commits
-
-
This lets us specify in the status call which ignore rules we want to use (optionally falling back to whatever the submodule has in its configuration). This removes one of the reasons for having `_set_ignore()` set the value in-memory. We re-use the `IGNORE_RESET` value for this as it is no longer relevant but has a similar purpose to `IGNORE_FALLBACK`. Similarly, we remove `IGNORE_DEFAULT` which does not have use outside of initializers and move that to fall back to the configuration as well.
Carlos Martín Nieto committed -
As submodules become more like values, we should not let a status check update their properties. Instead of taking a submodule, have status take a repo and submodule name.
Carlos Martín Nieto committed
-
- 12 Jun, 2015 1 commit
-
-
Introduce a new binary diff callback to provide the actual binary delta contents to callers. Create this data from the diff contents (instead of directly from the ODB) to support binary diffs including the workdir, not just things coming out of the ODB.
Edward Thomson committed
-
- 19 Feb, 2015 1 commit
-
-
For consistency with the rest of the library, where an opt is an options *structure*.
Edward Thomson committed
-
- 20 May, 2014 1 commit
-
-
Alan Rogers committed
-
- 06 May, 2014 1 commit
-
-
Diff and status do not want core.safecrlf to actually raise an error regardless of the setting, so this extends the filter API with an additional options flags parameter and adds a flag so that filters can be applied with GIT_FILTER_OPT_ALLOW_UNSAFE, indicating that unsafe filter application should be downgraded from a failure to a warning.
Russell Belfer committed
-
- 25 Mar, 2014 2 commits
-
-
This cleans up some places I missed that could hold onto submodule references and cleans up the way in which the repository cache is both reloaded and released so that existing submodule references aren't destroyed inappropriately.
Russell Belfer committed -
`git_submodule` objects were already refcounted internally in case the submodule name was different from the path at which it was stored. This makes that refcounting externally used as well, so `git_submodule_lookup` and `git_submodule_add_setup` return an object that requires a `git_submodule_free` when done.
Russell Belfer committed
-
- 27 Feb, 2014 1 commit
-
-
This adds `git_diff_buffers` and `git_patch_from_buffers`. This also includes a bunch of internal refactoring to increase the shared code between these functions and the blob-to-blob and blob-to-buffer APIs, as well as some higher level assert helpers in the tests to also remove redundancy.
Russell Belfer committed
-
- 25 Jan, 2014 1 commit
-
-
In the same vein as the previous commits in this series.
Carlos Martín Nieto committed
-
- 15 Oct, 2013 1 commit
-
-
This lays groundwork for separating formatting options from diff creation options. This groups the formatting flags separately from the diff list creation flags and reorders the options. This also tweaks some APIs to further separate code that uses patches from code that just looks at git_diffs.
Russell Belfer committed
-
- 11 Oct, 2013 1 commit
-
-
This makes no functional change to diff but renames a couple of the objects and splits the new git_patch (formerly git_diff_patch) into a new header file.
Russell Belfer committed
-
- 17 Sep, 2013 3 commits
-
-
This makes the git_buf struct that was used internally into an externally available structure and eliminates the git_buffer. As part of that, some of the special cases that arose with the externally used git_buffer were blended into the git_buf, such as being careful about git_buf objects that may have a NULL ptr and allowing for bufs with a valid ptr and size but zero asize as a way of referring to externally owned data.
Russell Belfer committed -
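The convention of a valid pointer and size with a zero allocated size can be sketched as follows. The `sketch_buf` type and its helpers are hypothetical; the field names follow the commit message (`ptr`, `size`, `asize`), but the logic is only an illustration of the idea, not the library's implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Toy buffer: asize == 0 with a non-NULL ptr means "externally owned". */
typedef struct {
    char *ptr;
    size_t asize; /* bytes allocated by the buffer itself, 0 if borrowed */
    size_t size;  /* bytes of valid data */
} sketch_buf;

/* True when the buffer refers to data it did not allocate. */
static bool sketch_buf_is_borrowed(const sketch_buf *buf)
{
    assert(buf != NULL);
    return buf->ptr != NULL && buf->asize == 0;
}

/* Fill an empty buffer with an owned copy of text. Returns 0 or -1. */
static int sketch_buf_set_owned(sketch_buf *buf, const char *text)
{
    size_t len = strlen(text);
    char *copy = malloc(len + 1);

    if (!copy)
        return -1;

    memcpy(copy, text, len + 1);
    buf->ptr = copy;
    buf->asize = len + 1;
    buf->size = len;
    return 0;
}

/* Release owned memory only; borrowed data is left untouched. */
static void sketch_buf_dispose(sketch_buf *buf)
{
    assert(buf != NULL);
    if (buf->asize > 0)
        free(buf->ptr);
    buf->ptr = NULL;
    buf->asize = 0;
    buf->size = 0;
}
```

Because disposal keys off `asize`, the same cleanup call is safe for both owned and borrowed buffers, and for an empty buffer whose `ptr` is NULL.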
This adds the ident filter (that knows how to replace $Id$) and tweaks the filter APIs and code so that git_filter_source objects actually have the updated OID of the object being filtered when it is a known value.
Russell Belfer committed -
This moves the git_filter_list into the public API so that users can create, apply, and dispose of filter lists. This allows more granular application of filters to user data outside of libgit2 internals. This also converts all the internal usage of filters to the public APIs along with a few small tweaks to make it easier to use the public git_buffer stuff alongside the internal git_buf.
Russell Belfer committed
-