Commits · 2098c490c8b8bfb88adab30575ff82fa157e52a3 · lvzhengyang / git2

12 Feb, 2023 1 commit
- packfile: handle sha256 packfiles · 479c8c8c
```
Teach the packfile machinery to cope with SHA256.
```
  Edward Thomson committed a year ago
09 Jul, 2022 1 commit

pack: don't pretend we support pack files v3 · 4597b869

Pack files v3 are introduced in the SHA256 hash transition document
https://github.com/git/git/blob/master/Documentation/technical/hash-function-transition.txt

Obviously we do not support these yet. Stop pretending that we do.

committed 2 years ago


10 Apr, 2022 1 commit
- pack: use raw oid data · 4fc3ce15
```
A packfile contains arrays of raw oid data, use a byte array to index
into them.
```
  Edward Thomson committed 2 years ago
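
As a rough illustration of the raw oid change (4fc3ce15), the sketch below indexes into a flat byte array of fixed-size raw oids and binary-searches it. The `demo_*` names and the sorted-table layout are assumptions made for this example; they are not libgit2's actual functions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return a pointer to the i-th raw oid in a flat table of fixed-size entries. */
static const unsigned char *demo_raw_oid_at(
	const unsigned char *table, size_t oid_size, size_t i)
{
	return table + i * oid_size;
}

/*
 * Binary-search a sorted table of raw oids. Returns the position of `key`,
 * or -1 when it is absent. `oid_size` would be 20 for SHA1, 32 for SHA256.
 */
static long demo_raw_oid_find(
	const unsigned char *table, size_t count, size_t oid_size,
	const unsigned char *key)
{
	size_t lo = 0, hi = count;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;
		int cmp = memcmp(demo_raw_oid_at(table, oid_size, mid), key, oid_size);

		if (cmp == 0)
			return (long)mid;
		if (cmp < 0)
			lo = mid + 1;
		else
			hi = mid;
	}
	return -1;
}
```

Passing the oid size as a parameter keeps the same lookup usable for both hash widths.
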
23 Feb, 2022 1 commit
- refactor: `src` is now `src/libgit2` · ef4ab298
  Edward Thomson committed 2 years ago
  
27 Aug, 2021 2 commits
- midx: Add a way to write multi-pack-index files · 9d117e38
```
This change adds the git_midx_writer_* functions, which allow writing
`multi-pack-index` files from `.idx`/`.pack` files.

Part of: #5399
```
  lhchavez committed 3 years ago
- Review feedback · 366115e0
  lhchavez committed 3 years ago
  
27 Jul, 2021 1 commit

midx: Add a way to write multi-pack-index files · fff209c4

This change adds the git_midx_writer_* functions, which allow writing
`multi-pack-index` files from `.idx`/`.pack` files.

Part of: #5399

committed 3 years ago


06 Dec, 2020 1 commit
- threads: rename git_atomic to git_atomic32 · 37763d38
```
Clarify the `git_atomic` type and functions now that we have a 64 bit
version as well (`git_atomic64`).
```
  Edward Thomson committed 4 years ago
29 Nov, 2020 1 commit

Make the pack and mwindow implementations data-race-free · 322c15ee

This change fixes a packfile heap corruption that can happen when
interacting with multiple packfiles concurrently across multiple
threads. This is exacerbated by setting a lower mwindow open file limit.

This change:

* Renames most of the internal methods in pack.c to clearly indicate
  that they expect to be called with a certain lock held, making
  reasoning about the state of locks a bit easier.
* Splits the `git_pack_file` lock in two: the one in `git_pack_file`
  only protects the `index_map`. The protection to `git_mwindow_file` is
  now in that struct.
* Explicitly checks for freshness of the `git_pack_file` in
  `git_packfile_unpack_header`: this allows the mwindow implementation
  to close files whenever there is enough cache pressure, and
  `git_packfile_unpack_header` will reopen the packfile if needed.
* After a call to `p_munmap()`, the `data` and `len` fields are poisoned
  with `NULL` to make use-after-frees more evident and crash rather than
  being open to the possibility of heap corruption.
* Adds a test case to prevent this from regressing in the future.

Fixes: #5591

committed 4 years ago

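The poisoning step described in 322c15ee can be sketched as follows. This is a minimal illustration, not libgit2's code: `demo_map` is a hypothetical type, and `calloc`/`free` stand in for mapping and unmapping a file so the example stays portable.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a mapped region; not libgit2's actual type. */
typedef struct {
	unsigned char *data;
	size_t len;
} demo_map;

/* Stand-in for mapping a file: allocate a zero-filled buffer. */
static int demo_map_open(demo_map *map, size_t len)
{
	map->data = calloc(1, len);
	if (map->data == NULL)
		return -1;
	map->len = len;
	return 0;
}

/*
 * Release the region, then poison the fields so that a stale reader
 * dereferences NULL and crashes instead of touching released memory.
 */
static void demo_map_close(demo_map *map)
{
	free(map->data);
	map->data = NULL;
	map->len = 0;
}
```

A crash on a NULL pointer is easier to diagnose than silent heap corruption, which is the trade-off the commit message describes.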

27 Nov, 2020 1 commit
- pack: use GIT_ASSERT · 7cd0bf65
  Edward Thomson committed 4 years ago
  
05 Oct, 2020 1 commit

multipack: Introduce a parser for multi-pack-index files · 005e7715

This change is the first in a series to add support for git's
multi-pack-index. This should speed up large repositories significantly.

Part of: #5399

committed 4 years ago


01 Apr, 2020 1 commit
- Making get_delta_base() conform to the general error-handling pattern · ba59a4a2
```
This makes get_delta_base() return the error code as the return value
and the delta base as an out-parameter.
```
  lhchavez committed 4 years ago
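
The error-handling pattern named in ba59a4a2 (error code as the return value, result as an out-parameter) might look roughly like this. The function below is a simplified, hypothetical stand-in for illustration; its name, parameters, and validation rule are assumptions and do not mirror `get_delta_base()` itself.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_OK 0
#define DEMO_ERROR -1

/*
 * Compute the base offset of an offset-delta entry located at
 * `delta_offset` whose header says the base lies `distance` bytes earlier.
 * Returns an error code; `*base_out` is written on success only.
 */
static int demo_delta_base(int64_t *base_out, int64_t delta_offset, int64_t distance)
{
	/* The base must lie strictly before the delta, at a non-negative offset. */
	if (distance <= 0 || distance > delta_offset)
		return DEMO_ERROR;

	*base_out = delta_offset - distance;
	return DEMO_OK;
}
```

Keeping the return value for the error code means a valid offset can never be mistaken for a failure.
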
09 Jan, 2020 1 commit

pack: refactor streams to use `git_zstream` · 0edc26c8

While we do have a `git_zstream` abstraction that encapsulates all the
calls to zlib as well as its error handling, we do not use it in our
pack file code. Refactor it to make the code a lot easier to understand.

committed 5 years ago


25 Nov, 2019 1 commit
- internal: use off64_t instead of git_off_t · 6460e8ab
```
Prefer `off64_t` internally.
```
  Edward Thomson committed 5 years ago
15 Feb, 2019 1 commit

maps: use uniform lifecycle management functions · 351eeff3

Currently, the lifecycle functions for maps (allocation, deallocation, resize)
are not named in a uniform way and do not have a uniform function signature.
Rename the functions to fix that, and stick to libgit2's naming scheme of saying
`git_foo_new`. This results in the following new interface for allocation:

- `int git_<t>map_new(git_<t>map **out)` to allocate a new map, returning an
  error code if we ran out of memory

- `void git_<t>map_free(git_<t>map *map)` to free a map

- `void git_<t>map_clear(git_<t>map *map)` to remove all entries from a map

This commit also fixes all existing callers.

committed 5 years ago

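The new/free/clear lifecycle from 351eeff3 can be sketched with a toy container. `demo_intmap` and all of its functions are hypothetical; the sketch only mirrors the shape of the three signatures listed in the commit message, and `demo_intmap_add` exists solely to give `clear` something to remove.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical toy container standing in for a `git_<t>map`. */
typedef struct {
	int *keys;
	size_t size;
	size_t capacity;
} demo_intmap;

/* Allocate a new, empty map; returns -1 if allocation fails. */
static int demo_intmap_new(demo_intmap **out)
{
	demo_intmap *map = calloc(1, sizeof(*map));
	if (map == NULL)
		return -1;
	*out = map;
	return 0;
}

/* Append a key, growing the storage as needed (helper for this example). */
static int demo_intmap_add(demo_intmap *map, int key)
{
	if (map->size == map->capacity) {
		size_t new_cap = map->capacity ? map->capacity * 2 : 4;
		int *keys = realloc(map->keys, new_cap * sizeof(*keys));
		if (keys == NULL)
			return -1;
		map->keys = keys;
		map->capacity = new_cap;
	}
	map->keys[map->size++] = key;
	return 0;
}

/* Remove all entries but keep the map usable. */
static void demo_intmap_clear(demo_intmap *map)
{
	map->size = 0;
}

/* Free the map and its storage; accepts NULL like free(). */
static void demo_intmap_free(demo_intmap *map)
{
	if (map == NULL)
		return;
	free(map->keys);
	free(map);
}
```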

01 Dec, 2018 1 commit
- object_type: use new enumeration names · 168fe39b
```
Use the new object_type enumeration names within the codebase.
```
  Edward Thomson committed 6 years ago
10 Jun, 2018 1 commit

pack: rename `git_packfile_stream_free` · c8ee5270

The function `git_packfile_stream_free` frees all state of the packfile
stream without freeing the structure itself. This naming makes it hard
to spot whether it will try to free the pointer itself or not, causing
potential future errors. For this reason, we have decided to call a
function that frees state without freeing the actual structure a "dispose"
function.

Rename `git_packfile_stream_free` to `git_packfile_stream_dispose` as a
first example following this rule.

committed 6 years ago

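The "dispose" versus "free" distinction from c8ee5270 can be sketched like this. `demo_stream` is a hypothetical type used only to show the naming rule; it is not the packfile stream struct.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stream type for illustration. */
typedef struct {
	char *buffer;
	size_t len;
} demo_stream;

/* Initialize a caller-owned stream, allocating its internal buffer. */
static int demo_stream_init(demo_stream *s, size_t len)
{
	s->buffer = malloc(len);
	if (s->buffer == NULL)
		return -1;
	s->len = len;
	return 0;
}

/* "dispose": release internal state only; the struct's storage is not freed. */
static void demo_stream_dispose(demo_stream *s)
{
	free(s->buffer);
	memset(s, 0, sizeof(*s));
}

/* "free": release internal state and the heap-allocated struct itself. */
static void demo_stream_free(demo_stream *s)
{
	if (s == NULL)
		return;
	demo_stream_dispose(s);
	free(s);
}
```

With this convention, a stream that lives on the stack or inside another struct is paired with `dispose`, while only heap-allocated instances are passed to `free`.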

03 Jul, 2017 1 commit

Make sure to always include "common.h" first · 0c7f49dd

Besides including several files, our "common.h" header also declares
various macros which are then used throughout the project. As such, we
have to make sure to always include this file first in all
implementation files. Otherwise, we might encounter problems or even
silent behavioural differences due to macros or defines not being
defined as they should be. So in fact, our header and implementation
files should make sure to always include "common.h" first.

This commit does so by establishing a common include pattern. Header
files inside of "src" will now always include "common.h" as their first
include, separated by a newline from all the other includes to make
it stand out as special. There are two cases for the implementation
files. If they do have a matching header file, they will always include
this one first, leading to "common.h" being transitively included as
first file. If they do not have a matching header file, they instead
include "common.h" as first file themselves.

This fixes the outlined problems and will become our standard practice
for header and source files inside of "src/" from now on.

committed 7 years ago


21 Jan, 2017 1 commit
- indexer: introduce `git_packfile_close` · bf339ab0
```
Encapsulation!
```
  Edward Thomson committed 7 years ago
04 Aug, 2016 1 commit

odb: only freshen pack files every 2 seconds · 27051d4e

Since multiple objects being written may all already exist in a single
packfile, avoid freshening that packfile repeatedly in a tight loop.
Instead, only freshen pack files every 2 seconds.

committed 8 years ago

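One way to picture the two-second throttle from 27051d4e is the sketch below. The `demo_pack` struct, the function name, and the exact comparison are assumptions for illustration; the real code may track and compare timestamps differently.

```c
#include <assert.h>
#include <time.h>

#define DEMO_FRESHEN_INTERVAL 2 /* seconds, per the commit message */

/* Hypothetical pack record; only the field this sketch needs. */
typedef struct {
	time_t last_freshen;
} demo_pack;

/*
 * Returns 1 if the pack was freshened at `now`, 0 if the call was skipped
 * because the previous freshen happened less than the interval ago.
 */
static int demo_pack_freshen(demo_pack *pack, time_t now)
{
	if (now - pack->last_freshen < DEMO_FRESHEN_INTERVAL)
		return 0;

	/* A real implementation would update the pack file's mtime here. */
	pack->last_freshen = now;
	return 1;
}
```

Taking `now` as a parameter rather than calling `time()` inside keeps the sketch deterministic and easy to test.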

13 Jan, 2016 1 commit
- Make packfile_unpack_compressed a private API · b644e223
  P.S.V.R committed 9 years ago
  
11 Mar, 2015 1 commit

Reorder some khash declarations · b63b76e0

Keep the definitions in the headers, while putting the declarations in
the C files. Putting the function definitions in headers causes
them to be duplicated if you include two headers with them.

committed 9 years ago


15 Feb, 2015 1 commit

Remove extra semicolon outside of a function · c8e02b87

Without this change, compiling with gcc and `-pedantic` generates a warning:
ISO C does not allow extra ‘;’ outside of a function.

committed 9 years ago


23 Jun, 2014 1 commit

Share packs across repository instances · b3b66c57

Opening the same repository multiple times will currently open the same
file multiple times, as well as map the same region of the file multiple
times. This is not necessary, as the packfile data is immutable.

Instead of opening and closing packfiles directly, introduce an
indirection and allocate packfiles globally. This does mean locking on
each packfile open, but we already use this lock for the global mwindow
list so it doesn't introduce a new contention point.

committed 10 years ago


13 May, 2014 1 commit
- pack: expose a cached delta base directly · a3ffbf23
```
Instead of going through a special entry in the chain, let's pass it as
an output parameter.
```
  Carlos Martín Nieto committed 10 years ago
09 May, 2014 2 commits

pack: use a cache for delta bases when unpacking · a332e91c

Bring back the use of the delta base cache for unpacking objects. When
generating the delta chain, we stop when we find a delta base in the
pack's cache and use that as the starting point.

committed 10 years ago


pack: unpack using a loop · 2acdf4b8

We currently make use of recursive function calls to unpack an object,
resolving the deltas as we come back down the chain. This means that we
have unbounded stack growth as we look up objects in a pack.

This is now done in two steps: first we figure out what the dependency
chain is by looking up the delta bases until we reach a non-delta
object, pushing the information we need onto a stack and then we pop
from that stack and apply the deltas until there are no more left.

This version of the code does not make use of the delta base cache so it
is slower than what's in the mainline. A later commit will reintroduce
it.

committed 10 years ago

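The two-step, non-recursive unpacking described in 2acdf4b8 can be sketched on a toy model. Everything here is hypothetical: `demo_entry` reduces an object to an integer, "applying a delta" is just an addition, and the fixed-size stack is a simplification chosen for this example.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_MAX_CHAIN 64

/*
 * Toy object: `base` is the index of the delta base, or -1 for a full
 * object. For a full object `value` is its content; for a delta it is an
 * amount to add to the base's content (a stand-in for applying a delta).
 */
typedef struct {
	int base;
	int value;
} demo_entry;

/* Resolve entry `idx` without recursion. Returns 0 on success, -1 on error. */
static int demo_unpack(int *out, const demo_entry *entries, size_t count, int idx)
{
	int stack[DEMO_MAX_CHAIN];
	size_t depth = 0;
	int result;

	/* Step 1: follow delta bases down to a full object, recording the chain. */
	for (;;) {
		if (idx < 0 || (size_t)idx >= count)
			return -1;
		if (entries[idx].base < 0)
			break;
		if (depth == DEMO_MAX_CHAIN)
			return -1; /* chain too long (or cyclic) for this sketch */
		stack[depth++] = idx;
		idx = entries[idx].base;
	}

	/* Step 2: start from the full object and apply deltas in reverse order. */
	result = entries[idx].value;
	while (depth > 0)
		result += entries[stack[--depth]].value;

	*out = result;
	return 0;
}
```

The explicit stack bounds memory use by chain length instead of growing the call stack with each delta.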

23 Jan, 2014 1 commit
- Drop parsing pack filename SHA1 part, no one cares about the filename · 8610487c
  Linquize committed 10 years ago
  
01 Nov, 2013 2 commits
- pack: `__object_header` always returns unsigned values · 51a3dfb5
  Vicent Marti committed 11 years ago
  
- Fix warning on win64 · 3343b5ff
  Linquize committed 11 years ago
  
04 Oct, 2013 1 commit
- pack: move the object header function here · 51e82492
  Carlos Martín Nieto committed 11 years ago
  
22 Apr, 2013 3 commits

Consolidate packfile allocation further · 5d2d21e5

Rename git_packfile_check to git_packfile_alloc since it is now
being used more in that capacity.  Fix the various places that use
it.  Consolidate some repeated code in odb_pack.c related to the
allocation of a new pack_backend.

committed 11 years ago


Further threading fixes · 53607868

This builds on the earlier thread safety work to make it so that
setting the odb, index, refdb, or config for a repository is done
in a threadsafe manner with minimized locking time.  This is done
by adding a lock to the repository object and using it to guard
the assignment of the above listed pointers.  The lock is only
held to assign the pointer value.

This also contains some minor fixes to the other work with pack
files to reduce the time that locks are held and to fix an
apparent memory leak.

committed 11 years ago


Add mutex around mapping and unmapping pack files · 24c70804

When I was writing threading tests for the new cache, the main
error I kept running into was a pack file having its content
unmapped underneath the running thread.  This adds a lock around
the routines that map and unmap the pack data so that threads can
effectively reload the data when they need it.

This also required reworking the error handling paths in a couple of
places in the code, which I tried to make consistent.

committed 11 years ago


03 Mar, 2013 1 commit

indexer: use a hashtable for keeping track of offsets · 0e040c03

These offsets are needed for REF_DELTA objects, which encode which
object they use as a base, but not where it lies in the packfile, so
we need a list.

These objects are mostly from older packfiles, before OFS_DELTA was
widespread. The time spent in indexing these packfiles is greatly
reduced, though it remains above what git is able to do.

committed 11 years ago


12 Jan, 2013 1 commit

indexer: properly free the packfile resources · 96c9b9f0

The indexer needs to call the packfile's free function so it takes care of
freeing the caches.

We still need to close the mwf descriptor manually so we can rename the
packfile to its final name on Windows.

committed 12 years ago


11 Jan, 2013 4 commits
- Revert "pack: packfile_free -> git_packfile_free and use it in the indexers" · 80d647ad
```
This reverts commit f289f886, which
makes the tests fail on Windows. Revert until we can figure out a
solution.
```
  Carlos Martín Nieto committed 12 years ago
- pack: That declaration · d0b14cea
  Vicent Marti committed 12 years ago
  
- pack: limit the amount of memory the base delta cache can use · 0ed75620
```
Currently limited to 16MB (like git) and to objects up to 1MB in
size.
```
  Carlos Martín Nieto committed 12 years ago
- pack: abstract out the cache into its own functions · c8f79c2b
  Carlos Martín Nieto committed 12 years ago
  
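
The memory limits from 0ed75620 (a 16MB cache holding objects of up to 1MB) suggest an admission check along these lines. This is a sketch under assumptions: `demo_cache` and `demo_cache_admit` are hypothetical, and the sketch refuses an object that does not fit, whereas a real cache would more likely evict older entries.

```c
#include <assert.h>
#include <stddef.h>

/* Limits named in the commit message. */
#define DEMO_CACHE_LIMIT  (16u * 1024u * 1024u) /* total cache budget */
#define DEMO_OBJECT_LIMIT (1u * 1024u * 1024u)  /* largest cacheable object */

/* Hypothetical cache bookkeeping; only tracks bytes in use. */
typedef struct {
	size_t used;
} demo_cache;

/*
 * Try to account for an object of `size` bytes. Returns 1 if admitted,
 * 0 if it is too large or would push the cache past its budget.
 */
static int demo_cache_admit(demo_cache *cache, size_t size)
{
	if (size > DEMO_OBJECT_LIMIT)
		return 0;
	if (size > DEMO_CACHE_LIMIT - cache->used)
		return 0;
	cache->used += size;
	return 1;
}
```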