Commits · 9c3edca5bf56b5144e38c5ba3d91ae293270f5a3 · lvzhengyang / git2

23 Feb, 2022 1 commit
- refactor: `src` is now `src/libgit2` · ef4ab298
  Edward Thomson committed 2 years ago
  
12 Feb, 2022 3 commits

diff: fail generation if a file changes size · d2ce981f
```
When we already know a file's size and that size changes, fail.
```
Edward Thomson committed 3 years ago

diff_file: Apply suggestions from code review · 3bac68ff

Skip the new_file_size non-zero test; use a custom error message if the file changed in the workdir.

Co-authored-by: Edward Thomson <ethomson@github.com>

committed 3 years ago

diff_file: fix crash if size of diffed file changes in workdir · 0a0cd67d

"diff_file_content_load_workdir_file()" maps a file from the workdir
into memory. It uses git_diff_file.size to determine the size of the
memory mapping.

If this value goes stale, the mmapped area is sized incorrectly.
This can occur if an external program changes the contents of the
file after libgit2 has cached its size. This used to segfault if the
file became smaller (mmapped area too large).

This patch causes diff_file_content_load_workdir_file to fail without
crashing if it detects that the file size has changed.

committed 3 years ago
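
The check described above can be sketched as follows. This is a minimal illustration, not libgit2's code: `check_cached_size` and its error text are hypothetical, and the sketch uses plain `stat` on a path rather than whatever the library does internally.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/stat.h>

/* Hypothetical helper (not a libgit2 function): returns 0 when the file
 * at `path` still has the size that was recorded earlier, -1 otherwise. */
static int check_cached_size(const char *path, uint64_t cached_size)
{
	struct stat st;

	assert(path != NULL);

	if (stat(path, &st) < 0)
		return -1; /* file vanished or cannot be inspected */

	if ((uint64_t)st.st_size != cached_size) {
		/* Refuse to map the file using a stale size. */
		fprintf(stderr, "file changed before we could read it\n");
		return -1;
	}

	return 0;
}
```

A caller would run such a check immediately before mapping the file and propagate the failure instead of mapping. Checking the already-open descriptor with `fstat` would narrow the window in which the file can change again.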

17 Oct, 2021 1 commit

str: introduce `git_str` for internal, `git_buf` is external · f0e693b1

libgit2 has two distinct requirements that were previously solved by
`git_buf`.  We require:

1. A general purpose string class that provides a number of utility APIs
   for manipulating data (eg, concatenating, truncating, etc).
2. A structure that we can use to return strings to callers that they
   can take ownership of.

By using a single class (`git_buf`) for both of these purposes, we have
confused the API to the point that refactorings are difficult and
reasoning about correctness is also difficult.

Move the utility class `git_buf` to be called `git_str`: this represents
its general purpose, as an internal string buffer class.  The name also
is an homage to Junio Hamano ("gitstr").

The public API remains `git_buf`, and has a much smaller footprint.  It
is generally only used as an "out" param with strict requirements that
follow the documentation.  (Exceptions exist for some legacy APIs to
avoid breaking callers unnecessarily.)

Utility functions exist to convert a user-specified `git_buf` to a
`git_str` so that we can call internal functions, and then to convert it
back again.

committed 3 years ago
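
The round trip between the public "out" buffer and the internal string class might look roughly like this. All names here (`out_buf`, `int_str`, `buf_to_str`, `str_to_buf`, `str_puts`, `describe`) are hypothetical stand-ins chosen for the sketch; they are not the library's actual types or functions.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for the public buffer and the internal string. */
typedef struct { char *ptr; size_t reserved; size_t size; } out_buf;
typedef struct { char *ptr; size_t asize; size_t size; } int_str;

/* Take over the caller's buffer so internal helpers can work on it. */
static void buf_to_str(int_str *str, out_buf *buf)
{
	str->ptr = buf->ptr;
	str->asize = buf->reserved;
	str->size = buf->size;
	memset(buf, 0, sizeof(*buf));
}

/* Hand the result back to the caller, who now owns the memory. */
static void str_to_buf(out_buf *buf, int_str *str)
{
	buf->ptr = str->ptr;
	buf->reserved = str->asize;
	buf->size = str->size;
	memset(str, 0, sizeof(*str));
}

/* Internal utility API: append a NUL-terminated string. */
static int str_puts(int_str *str, const char *s)
{
	size_t len = strlen(s), need = str->size + len + 1;

	if (need > str->asize) {
		char *p = realloc(str->ptr, need);
		if (p == NULL)
			return -1;
		str->ptr = p;
		str->asize = need;
	}
	memcpy(str->ptr + str->size, s, len + 1);
	str->size += len;
	return 0;
}

/* Public-style function: only the conversion touches the out param. */
static int describe(out_buf *out)
{
	int_str str;
	int error;

	assert(out != NULL);
	buf_to_str(&str, out);
	error = str_puts(&str, "hello");
	if (error == 0)
		error = str_puts(&str, ", world");
	str_to_buf(out, &str);
	return error;
}
```

The point of the split in this sketch is that all string manipulation happens on the internal type, while the public type is touched only at the boundary.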

06 May, 2021 1 commit

filter: internal git_buf filter handling function · 31d9c24b

Introduce `git_filter_list__convert_buf` which behaves like the old
implementation of `git_filter_list__apply_data`, where it might move the
input data buffer over into the output data buffer space for efficiency.

This new implementation will do so in a more predictable way, always
freeing the given input buffer (either moving it to the output buffer or
filtering it into the output buffer first).

Convert internal users to it.

committed 3 years ago
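
The "always consume the input" contract can be illustrated with a small sketch. The `buffer` type, `convert_buf`, and the `to_upper` filter are hypothetical and only model the ownership rule described above, not the real filter machinery.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical minimal buffer; not the library's buffer type. */
typedef struct { char *ptr; size_t size; } buffer;

/* A filter writes a freshly allocated result into `out`. */
typedef int (*filter_fn)(buffer *out, const buffer *in);

/* Example filter for the sketch: upper-case every byte. */
static int to_upper(buffer *out, const buffer *in)
{
	size_t i;

	out->ptr = malloc(in->size + 1);
	if (out->ptr == NULL)
		return -1;
	for (i = 0; i < in->size; i++)
		out->ptr[i] = (char)toupper((unsigned char)in->ptr[i]);
	out->ptr[in->size] = '\0';
	out->size = in->size;
	return 0;
}

/* Always consumes `in`: it is either moved into `out` or freed. */
static int convert_buf(buffer *out, buffer *in, filter_fn filter)
{
	int error = 0;

	assert(out != NULL && in != NULL);

	if (filter == NULL) {
		*out = *in;              /* nothing to filter: move the allocation */
	} else {
		error = filter(out, in); /* filter into a new allocation */
		free(in->ptr);           /* then release the input */
	}

	in->ptr = NULL;
	in->size = 0;
	return error;
}
```

Because the input is emptied on every path, callers in this sketch never need to decide whether they still own it afterwards.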

28 Apr, 2021 1 commit

diff: use git_repository_workdir_path · 91156a0f

The new git_repository_workdir_path function does error checking on
working directory inputs on Windows; use it to construct paths within
working directories.

committed 3 years ago

30 Jun, 2020 1 commit

Make the tests pass cleanly with MemorySanitizer · 3a197ea7

This change:

* Initializes a few variables that were being read before being
  initialized.
* Includes https://github.com/madler/zlib/pull/393. As such,
  it only works reliably with `-DUSE_BUNDLED_ZLIB=ON`.

committed 4 years ago

18 Jan, 2020 1 commit

iterator: update enum type name for consistency · b59c71d8

libgit2 does not use `type_t` suffixes as it's redundant; thus, rename
`git_iterator_type_t` to `git_iterator_t` for consistency.

committed 5 years ago

22 Nov, 2019 2 commits
- futils_filesize: use `uint64_t` for object size · fb2198db
```
Instead of using a signed type (`off_t`) use `uint64_t` for the maximum
size of files.
```
  Edward Thomson committed 5 years ago
- blob: use `git_object_size_t` for object size · 4334b177
```
Instead of using a signed type (`off_t`) use a new `git_object_size_t`
for the sizes of objects.
```
  Edward Thomson committed 5 years ago
20 Jul, 2019 1 commit

fileops: rename to "futils.h" to match function signatures · e54343a4

Our file utils functions all have a "futils" prefix, e.g.
`git_futils_touch`. One would thus naturally guess that their
definitions and implementation would live in files "futils.h" and
"futils.c", respectively, but in fact they live in "fileops.h".

Rename the files to match expectations.

committed 5 years ago

18 Jul, 2019 1 commit
- configuration: cvar -> configmap · 658022c4
```
`cvar` is an unhelpful name.  Refactor its usage to `configmap` for more
clarity.
```
  Patrick Steinhardt committed 5 years ago
15 Jun, 2019 1 commit

oid: `is_zero` instead of `iszero` · 5d92e547

The only function that is named `issomething` (without underscore) was
`git_oid_iszero`.  Rename it to `git_oid_is_zero` for consistency with
the rest of the library.

committed 5 years ago

22 Jan, 2019 1 commit
- git_error: use new names in internal APIs and usage · f673e232
```
Move to the `git_error` name in the internal API for error-related
functions.
```
  Edward Thomson committed 6 years ago
01 Dec, 2018 1 commit
- object_type: use new enumeration names · 168fe39b
```
Use the new object_type enumeration names within the codebase.
```
  Edward Thomson committed 6 years ago
10 Jun, 2018 1 commit
- Convert usage of `git_buf_free` to new `git_buf_dispose` · ecf4f33a
  Patrick Steinhardt committed 6 years ago
  
03 Jan, 2018 1 commit

diff_generate: avoid excessive stats of .gitattributes files · d8896bda

When generating a diff between two trees, for each file that is to be
diffed we have to determine whether it shall be treated as a text or as
a binary file. While git has heuristics to determine which kind of diff
to generate, users can also override that default behaviour by setting
or unsetting the 'diff' attribute for specific files.

Because of that, we have to query gitattributes in order to determine
how to diff the current files. Instead of hitting the '.gitattributes'
file every time we need to query an attribute, which can get expensive
especially on networked file systems, we try to cache them instead. This
works perfectly fine for every '.gitattributes' file that is found, but
we hit cache invalidation problems when we determine that an attributes
file does _not_ exist. We do create an entry in the cache for missing
'.gitattributes' files, but as soon as we hit that file again we
invalidate it and stat it again to see if it has now appeared.

In the case of diffing large trees with each other, this behaviour is
very suboptimal. For each pair of files that is to be diffed, we will
repeatedly query every directory component leading towards their
respective location for an attributes file. This leads to thousands or
even hundreds of thousands of wasted syscalls.

The attributes cache already has a mechanism to help in that scenario in
the form of the `git_attr_session`. As long as the same attributes session
is still active, we will not try to re-query the gitattributes files at all
but simply retain our currently cached results. To fix our problem, we
can create a session at the top-most level, which is the initialization
of the `git_diff` structure, and use it in order to look up the correct
diff driver. As the `git_diff` structure is used to generate patches for
multiple files at once, this neatly solves our problem by retaining the
session until patches for all files have been generated.

The fix has been tested with linux.git by calling
`git_diff_tree_to_tree` and `git_diff_to_buf` with v4.10^{tree} and
v4.14^{tree}.

                | time    | .gitattributes stats
    without fix | 33.201s | 844614
    with fix    | 30.327s | 4441

While execution only improved by roughly 10%, the stat(3) syscalls for
.gitattributes files decreased by 99.5%. The benchmarks were quite
simple with best-of-three timings on Linux ext4 systems. One can assume
that for network based file systems the performance gain will be a lot
larger due to a much higher latency.

committed 7 years ago
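
The effect of a session on lookups for missing attribute files can be modelled with a toy cache. Everything below is a simplified, hypothetical model (`attr_cache`, `attr_file_exists`, `fake_stat`, and the numeric session id are all invented for the sketch); the real attribute cache is more involved, and existing files are simply trusted here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_ENTRIES 16

typedef struct {
	const char *path;
	int exists;
	unsigned session_id; /* session that last checked this path; 0 = none */
} cache_entry;

typedef struct {
	cache_entry entries[MAX_ENTRIES];
	size_t count;
	unsigned long stat_calls;
} attr_cache;

/* Stand-in for stat(): counts calls and pretends the file is missing. */
static int fake_stat(attr_cache *cache, const char *path)
{
	(void)path;
	cache->stat_calls++;
	return 0;
}

/* Returns 1 if the attributes file exists, 0 if it is missing. */
static int attr_file_exists(attr_cache *cache, const char *path,
	unsigned session_id)
{
	cache_entry *e = NULL;
	size_t i;

	for (i = 0; i < cache->count; i++) {
		if (strcmp(cache->entries[i].path, path) == 0) {
			e = &cache->entries[i];
			break;
		}
	}

	/* Reuse a "missing" result only within the session that recorded it. */
	if (e != NULL &&
	    (e->exists || (session_id != 0 && e->session_id == session_id)))
		return e->exists;

	if (e == NULL) {
		assert(cache->count < MAX_ENTRIES);
		e = &cache->entries[cache->count++];
		e->path = path;
	}

	e->exists = fake_stat(cache, path);
	e->session_id = session_id;
	return e->exists;
}
```

In this model, three lookups without a session cost three stat calls, while three lookups sharing one session cost a single call. That is the same shape as the drop in `.gitattributes` stats reported in the table above.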

15 Dec, 2017 1 commit

diff_file: properly refcount blobs when initializing file contents · 2388a9e2

When initializing a `git_diff_file_content` from a source whose data is
derived from a blob, we simply assign the blob's pointer to the
resulting struct without incrementing its refcount. Thus, the structure
can only be used as long as the blob is kept alive by the caller.

Fix the issue by using `git_blob_dup` instead of a direct assignment.
This function will increment the refcount of the blob without allocating
new memory, so it does exactly what we want. As
`git_diff_file_content__unload` already frees the blob when
`GIT_DIFF_FLAG__FREE_BLOB` is set, we don't need to add new code
handling the free but only have to set that flag correctly.

committed 7 years ago
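
The ownership rule described above can be sketched with a toy refcounted object. The types and helpers (`blob`, `blob_dup`, `blob_free`, `file_content`, `FLAG_FREE_BLOB`) are hypothetical simplifications; in particular, the real `git_blob_dup` has a different signature.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical refcounted object standing in for a blob. */
typedef struct { int refcount; const char *data; } blob;

#define FLAG_FREE_BLOB (1u << 0)

typedef struct { blob *src_blob; unsigned flags; } file_content;

/* Take another reference without allocating new memory. */
static blob *blob_dup(blob *b)
{
	assert(b != NULL);
	b->refcount++;
	return b;
}

/* Drop one reference; free the object when the last one goes away. */
static void blob_free(blob *b)
{
	if (b != NULL && --b->refcount == 0)
		free(b);
}

/* Take our own reference and remember that unload must release it. */
static void content_init_from_blob(file_content *fc, blob *src)
{
	fc->src_blob = blob_dup(src);
	fc->flags |= FLAG_FREE_BLOB;
}

static void content_unload(file_content *fc)
{
	if (fc->flags & FLAG_FREE_BLOB) {
		blob_free(fc->src_blob);
		fc->src_blob = NULL;
		fc->flags &= ~FLAG_FREE_BLOB;
	}
}
```

With the extra reference, the content structure in this sketch stays valid even after the caller releases its own reference.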

03 Jul, 2017 1 commit

Make sure to always include "common.h" first · 0c7f49dd

Next to including several files, our "common.h" header also declares
various macros which are then used throughout the project. As such, we
have to make sure to always include this file first in all
implementation files. Otherwise, we might encounter problems or even
silent behavioural differences due to macros or defines not being
defined as they should be. So in fact, our header and implementation
files should make sure to always include "common.h" first.

This commit does so by establishing a common include pattern. Header
files inside of "src" will now always include "common.h" as their first
include, separated by a newline from all the other includes to make
it stand out as special. There are two cases for the implementation
files. If they do have a matching header file, they will always include
this one first, leading to "common.h" being transitively included as
first file. If they do not have a matching header file, they instead
include "common.h" as first file themselves.

This fixes the outlined problems and will become our standard practice
for header and source files inside of the "src/" from now on.

committed 7 years ago

29 Dec, 2016 1 commit

giterr_set: consistent error messages · 909d5494

Error messages should be sentence fragments, and therefore:

1. Should not begin with a capital letter,
2. Should not conclude with punctuation, and
3. Should not end a sentence and begin a new one

committed 8 years ago
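
The three rules above could be checked mechanically with a rough heuristic like the one below. `is_sentence_fragment` is a hypothetical helper written for this illustration; it is not part of the project, and it would wrongly reject messages that legitimately start with an upper-case identifier.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Hypothetical lint: returns 1 if `msg` follows the three rules, else 0. */
static int is_sentence_fragment(const char *msg)
{
	size_t len;

	assert(msg != NULL);
	len = strlen(msg);

	if (len == 0)
		return 0;
	if (isupper((unsigned char)msg[0]))
		return 0; /* rule 1: no leading capital letter */
	if (strchr(".!?", msg[len - 1]) != NULL)
		return 0; /* rule 2: no trailing punctuation */
	if (strstr(msg, ". ") != NULL)
		return 0; /* rule 3: no second sentence */
	return 1;
}
```

For example, "could not open file" passes, while "Could not open file", "could not open file." and "failed to read. try again" are all rejected.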

26 May, 2016 2 commits
- git_diff_generated: abstract generated diffs · 9be638ec
  Edward Thomson committed 8 years ago
  
- diff: include oid length in deltas · d68cb736
```
Now that `git_diff_delta` data can be produced by reading patch
file data, which may have an abbreviated oid, allow consumers to
know that the id is abbreviated.
```
  Edward Thomson committed 8 years ago
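
One way a consumer might use the abbreviation length is to compare ids only on the digits both sides actually have. This is a sketch under assumptions: the `file_id` struct and `ids_match` are hypothetical, ids are held as hex strings for simplicity, and the hex values are made up.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical view of a file id: hex digits plus how many are valid. */
typedef struct {
	const char *hex;
	unsigned short id_abbrev; /* number of meaningful hex digits */
} file_id;

/* Compare two ids on the shorter of their known prefixes. */
static int ids_match(const file_id *a, const file_id *b)
{
	size_t n;

	assert(a != NULL && b != NULL);
	n = a->id_abbrev < b->id_abbrev ? a->id_abbrev : b->id_abbrev;

	return n > 0 && strncmp(a->hex, b->hex, n) == 0;
}
```

A full 40-digit id and a 7-digit abbreviation from a patch file match here when the first 7 digits agree. Without the length, the consumer could not tell a short id from a full one.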
03 Nov, 2015 1 commit

diff: on win32, treat fake "symlinks" specially · 6b0fc6ab

On platforms that lack `core.symlinks`, we should not go looking for
symbolic links and `p_readlink` their target.  Instead, we should
examine the file's contents.

committed 9 years ago

25 Jun, 2015 1 commit
- Rename FALLBACK to UNSPECIFIED · c2418f46
```
Fallback describes the mechanism, while unspecified explains what the
user is thinking.
```
  Carlos Martín Nieto committed 9 years ago
22 Jun, 2015 2 commits

submodule: add an ignore option to status · c6f489c9

This lets us specify in the status call which ignore rules we want to
use (optionally falling back to whatever the submodule has in its
configuration).

This removes one of the reasons for having `_set_ignore()` set the value
in-memory. We re-use the `IGNORE_RESET` value for this as it is no
longer relevant but has a similar purpose to `IGNORE_FALLBACK`.

Similarly, we remove `IGNORE_DEFAULT`, which has no use outside of
initializers, and move that to fall back to the configuration as well.

committed 9 years ago

submodule: don't let status change an existing instance · 64bbd47a

As submodules become more like values, we should not let a status
check update their properties. Instead of taking a submodule, have
status take a repo and submodule name.

committed 9 years ago

12 Jun, 2015 1 commit

diff: introduce binary diff callbacks · 8147b1af

Introduce a new binary diff callback to provide the actual binary
delta contents to callers.  Create this data from the diff contents
(instead of directly from the ODB) to support binary diffs including
the workdir, not just things coming out of the ODB.

committed 9 years ago

19 Feb, 2015 1 commit
- git_filter_opt_t -> git_filter_flag_t · 795eaccd
```
For consistency with the rest of the library, where an opt is an
options *structure*.
```
  Edward Thomson committed 9 years ago
20 May, 2014 1 commit
- Start adding GIT_DELTA_UNREADABLE and GIT_STATUS_WT_UNREADABLE. · 61bef72d
  Alan Rogers committed 10 years ago
  
06 May, 2014 1 commit

Add filter options and ALLOW_UNSAFE · 5269008c

Diff and status do not want core.safecrlf to actually raise an
error regardless of the setting, so this extends the filter API
with an additional options flags parameter and adds a flag so that
filters can be applied with GIT_FILTER_OPT_ALLOW_UNSAFE, indicating
that unsafe filter application should be downgraded from a failure
to a warning.

committed 10 years ago
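
The downgrade from failure to warning can be sketched as a flag check. The names below (`FILTER_ALLOW_UNSAFE`, `check_safecrlf`, the `warned` out parameter) are hypothetical and only model the behaviour described above; they are not the library's filter API.

```c
#include <assert.h>
#include <stddef.h>

#define FILTER_DEFAULT      0u
#define FILTER_ALLOW_UNSAFE (1u << 0)

/* Hypothetical check: returns 0 to proceed, -1 to fail.
 * `irreversible` says the conversion would lose information,
 * `safecrlf` says the user asked for such conversions to be errors. */
static int check_safecrlf(int irreversible, int safecrlf, unsigned flags,
	int *warned)
{
	assert(warned != NULL);
	*warned = 0;

	if (!irreversible || !safecrlf)
		return 0; /* nothing unsafe, or the user does not care */

	if (flags & FILTER_ALLOW_UNSAFE) {
		*warned = 1; /* downgrade: report it but keep going */
		return 0;
	}

	return -1; /* unsafe and not allowed: fail */
}
```

Callers such as diff and status would pass the allow-unsafe flag in this model, so an unsafe conversion is reported but does not abort the operation.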

25 Mar, 2014 2 commits

Fix submodule leaks and invalid references · 591e8295

This cleans up some places I missed that could hold onto submodule
references and cleans up the way in which the repository cache is
both reloaded and released so that existing submodule references
aren't destroyed inappropriately.

committed 10 years ago

Make submodules externally refcounted · a15c7802

`git_submodule` objects were already refcounted internally in case
the submodule name was different from the path at which it was
stored.  This makes that refcounting externally used as well, so
`git_submodule_lookup` and `git_submodule_add_setup` return an
object that requires a `git_submodule_free` when done.

committed 10 years ago

27 Feb, 2014 1 commit

Add buffer to buffer diff and patch APIs · 6789b7a7

This adds `git_diff_buffers` and `git_patch_from_buffers`.  This
also includes a bunch of internal refactoring to increase the
shared code between these functions and the blob-to-blob and
blob-to-buffer APIs, as well as some higher level assert helpers
in the tests to also remove redundancy.

committed 10 years ago

25 Jan, 2014 1 commit
- diff: rename the file's 'oid' to 'id' · 9950bb4e
```
In the same vein as the previous commits in this series.
```
  Carlos Martín Nieto committed 11 years ago
15 Oct, 2013 1 commit

Diff API cleanup · 10672e3e

This lays groundwork for separating formatting options from diff
creation options.  This groups the formatting flags separately
from the diff list creation flags and reorders the options.  This
also tweaks some APIs to further separate code that uses patches
from code that just looks at git_diffs.

committed 11 years ago

11 Oct, 2013 1 commit

Rename diff objects and split patch.h · 3ff1d123

This makes no functional change to diff but renames a couple of
the objects and splits the new git_patch (formerly git_diff_patch)
into a new header file.

committed 11 years ago

17 Sep, 2013 3 commits

Merge git_buf and git_buffer · a9f51e43

This makes the git_buf struct that was used internally into an
externally available structure and eliminates the git_buffer.

As part of that, some of the special cases that arose with the
externally used git_buffer were blended into the git_buf, such as
being careful about git_buf objects that may have a NULL ptr and
allowing for bufs with a valid ptr and size but zero asize as a
way of referring to externally owned data.

committed 11 years ago
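
The "valid ptr and size but zero asize" convention can be sketched as follows. The `buf` struct and its helpers are hypothetical and only illustrate how a zero allocated size can mark data the buffer does not own.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical buffer: asize == 0 with a non-NULL ptr means "borrowed". */
typedef struct { char *ptr; size_t asize; size_t size; } buf;

/* Point the buffer at memory owned by someone else. */
static void buf_attach_external(buf *b, char *data, size_t len)
{
	assert(b != NULL);
	b->ptr = data;
	b->size = len;
	b->asize = 0;
}

static int buf_owns_memory(const buf *b)
{
	return b->ptr != NULL && b->asize > 0;
}

/* Free only what the buffer allocated itself, then reset it. */
static void buf_dispose(buf *b)
{
	if (buf_owns_memory(b))
		free(b->ptr);
	b->ptr = NULL;
	b->asize = 0;
	b->size = 0;
}
```

Disposing a borrowed buffer in this sketch leaves the external data untouched, while an owned buffer is freed as usual.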

Add ident filter · 4b11f25a

This adds the ident filter (that knows how to replace $Id$) and
tweaks the filter APIs and code so that git_filter_source objects
actually have the updated OID of the object being filtered when
it is a known value.

committed 11 years ago

Extend public filter api with filter lists · 2a7d224f

This moves the git_filter_list into the public API so that users
can create, apply, and dispose of filter lists.  This allows more
granular application of filters to user data outside of libgit2
internals.

This also converts all the internal usage of filters to the public
APIs along with a few small tweaks to make it easier to use the
public git_buffer stuff alongside the internal git_buf.

committed 11 years ago
