- 16 Aug, 2017 2 commits
-
-
In our CMakeLists.txt, we have to check multiple functions in order to determine whether we have to use our own implementation or whether we can use the platform-provided one. For two of these functions, namely `regcomp_l()` and `futimens()`, the resulting macros are actually used inside of the header file "src/unix/posix.h". As such, these macros are not only required by the library, but also by our test suite, which makes use of internal headers. To prepare for the CMakeLists.txt split, move these two defines into the "features.h" header.
Patrick Steinhardt committed -
In a future commit, we will split out the build instructions for our library directory and move them into a subdirectory. One of the benefits is fixing scoping issues, where e.g. defines do not leak to build targets where they do not belong. Unfortunately, this also poses the problem of how to propagate defines which are required by both the library and the test suite. One way would be to create another variable keeping track of all added defines and declare it inside of the parent scope. While this is the most obvious and simplest way of going ahead, it is rather unfortunate: these defines become implicit dependencies between the build targets. By simply observing a define inside of the CMakeLists.txt file, one cannot tell whether this define is only required by the current target or whether it is required by other targets as well. Another approach would be to use an internal header file keeping track of all defines shared between targets. While configuring the library, we set various variables and let CMake configure the file, adding or removing defines based on what has been configured. Like this, one can easily keep track of the current environment by simply inspecting the header file. Furthermore, these dependencies become explicit inside the CMakeLists.txt, as instead of simply adding a define, we now call e.g. `SET(GIT_THREADSAFE 1)`. Having this header file, though, requires us to make sure it is always included before any "#ifdef" preprocessor checks are executed. As we have already refactored code to always include the "common.h" header file before any other statement in a file, this becomes easy: just make sure "common.h" includes the new "features.h" header file first.
Patrick Steinhardt committed
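
To illustrate the approach, here is a minimal sketch of what the generated "features.h" could look like; the `GIT_USE_*` macro names are assumptions for illustration, only `GIT_THREADSAFE` is taken from the commit message above:

```c
/* Hypothetical sketch of a CMake-generated "features.h" (the macro names for
 * the platform probes are assumptions, not the project's actual defines). */
#ifndef INCLUDE_features_h__
#define INCLUDE_features_h__

/* Set via `SET(GIT_THREADSAFE 1)` in CMakeLists.txt and written out by
 * CMake when configuring the file. */
#define GIT_THREADSAFE 1

/* Results of the platform probes, shared by the library and the test suite: */
#define GIT_USE_FUTIMENS 1
/* #undef GIT_USE_REGCOMP_L */

#endif
```

With "common.h" including this file before anything else, every `#ifdef` check sees the same configuration regardless of which build target compiles the file.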
-
- 09 Aug, 2017 3 commits
-
-
The open-coded version was inherited from git.git. But it turns out it was based on an older version of glibc, whose memcmp was not very optimized. Modern glibc does much better, and some compilers (like gcc 7) can even inline the memcmp into a series of multi-byte xors. Upstream is switching to using memcmp in git/git@0b006014c87f400bd9a86267ed30fd3e7b383884.
Jeff King committed -
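
A hedged sketch of the kind of change described above (function names are illustrative, not the library's exact symbols): the open-coded byte loop is replaced by a plain `memcmp` over the 20 raw bytes of a SHA-1.

```c
#include <string.h>

#define GIT_OID_RAWSZ 20 /* a SHA-1 is 20 raw bytes */

/* Before: open-coded comparison inherited from git.git. */
static int hashcmp_loop(const unsigned char *a, const unsigned char *b)
{
	int i;
	for (i = 0; i < GIT_OID_RAWSZ; i++, a++, b++) {
		if (*a != *b)
			return *a - *b;
	}
	return 0;
}

/* After: let the (nowadays well-optimized) libc routine do the work. */
static int hashcmp_memcmp(const unsigned char *a, const unsigned char *b)
{
	return memcmp(a, b, GIT_OID_RAWSZ);
}
```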
This was pulled over from git.git, and is an experiment in making binary-searching lists of sha1s faster. It was never compiled by default (nor was it used upstream by default without a special environment variable). Unfortunately, it is actually slower in practice, and upstream is planning to drop it in git/git@f1068efefe6dd3beaa89484db5e2db730b094e0b (which has some timing results). It's worth doing the same here for simplicity.
Jeff King committed -
If we enter the sha1_position() function with "lo == hi", we have no elements. But the do-while loop means that we'll enter the loop body once anyway, picking "mi" at that same value and comparing nonsense to our desired key. This is unlikely to match in practice, but we still shouldn't be looking at the memory in the first place. This bug is inherited from git.git; it was fixed there in e01580cfe01526ec2c4eb4899f776a82ade7e0e1.
Jeff King committed
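
A sketch of the fix, assuming a simplified prototype for `sha1_position()`: bail out when the range is empty, before the do-while loop dereferences any element.

```c
#include <string.h>

/* Hedged sketch; the real function's signature and return convention may differ. */
static int sha1_position(const unsigned char *table, size_t stride,
                         unsigned lo, unsigned hi, const unsigned char *key)
{
	if (lo == hi)
		return -(int)lo - 1; /* empty range: nothing to compare, nothing to read */

	do {
		unsigned mi = (lo + hi) / 2;
		int cmp = memcmp(table + mi * stride, key, 20);

		if (!cmp)
			return (int)mi;  /* found */
		if (cmp > 0)
			hi = mi;         /* key sorts before element mi */
		else
			lo = mi + 1;     /* key sorts after element mi */
	} while (lo < hi);

	return -(int)lo - 1;         /* not found; encode the insertion point */
}
```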
-
- 26 Jul, 2017 2 commits
-
-
When using the `do_with_retries` macro for retrying filesystem operations in the POSIX emulation layer, allow the remediation function to return `GIT_RETRY`, meaning that the error is believed to be remediated and the operation should be retried immediately, without a sleep. This is a slightly more general solution to the problem fixed in #4312.
Edward Thomson committed -
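
A rough sketch of the retry pattern, written as a plain function rather than the library's actual `do_with_retries` macro; the retry count, sleep interval, and the `GIT_RETRY` value are assumptions for illustration.

```c
#include <unistd.h>

#define GIT_RETRY 1 /* assumed sentinel: "remediated, retry without sleeping" */

static int do_with_retries_sketch(int (*op)(void *), int (*remediate)(void *), void *data)
{
	int tries, rc;

	for (tries = 0; tries < 5; tries++) {
		if (op(data) == 0)
			return 0;        /* operation succeeded */

		rc = remediate(data);
		if (rc == GIT_RETRY)
			continue;        /* error believed fixed: retry immediately */
		if (rc < 0)
			break;           /* remediation failed: give up */

		usleep(10000);           /* transient error: back off briefly, then retry */
	}

	return -1;
}
```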
Fixed an issue where the retry logic in p_unlink slept before trying to set the file to write mode, causing an unnecessary slowdown.
Carson Howard committed
-
- 21 Jul, 2017 1 commit
-
-
The conditional `run < minrun` can never be true directly after assigning `run = minrun`. Remove it to avoid confusion.
Patrick Steinhardt committed
-
- 15 Jul, 2017 6 commits
-
-
Modifying variables pulled in by an included file currently succeeds, but it doesn't actually do what one would expect, as refreshing the configuration will cause the old values to reappear. As we are currently not really able to support this use case, we will instead just return an error when deleting or setting variables which were brought in via an include.
Patrick Steinhardt committed -
Right now, we have multiple call sites which initialize a `reader` structure. As the structure is only actually used inside of `config_read`, we can move the reader into the `config_read` function itself and instead just pass the configuration file to `config_read`, which improves code readability.
Patrick Steinhardt committed -
Currently, we only re-parse the top-level configuration file when it has changed itself. This can cause problems when an include is changed, as we do not update all values correctly. Instead of conditionally re-parsing only refreshed files, the logic becomes much clearer and easier to follow if we always re-parse the top-level configuration file when either the file itself or one of its included configuration files has changed on disk. This commit implements that logic. Note that this might impact performance in some cases, as we need to re-read all configuration files whenever any of the included files changed. Re-parsing only the include files which have actually changed could improve performance, but it would compromise the maintainability of the code without much gain. The only case where we would gain anything is when we actually use includes and only these includes are updated, which is probably too unusual a scenario to be worth optimizing.
Patrick Steinhardt committed -
The backend passed to `config_read` is never actually used anymore, so we can remove it from the function and the `parse_data` structure.
Patrick Steinhardt committed -
Previously, the callbacks passed to `config_parse` got the reader via a pointer to a pointer. This allowed the callbacks to update the caller's `reader` variable when the array holding it had been reallocated. As the array is no longer present, we can simplify the code by making the reader a simple pointer.
Patrick Steinhardt committed -
Current code for configuration files uses the `reader` structure to parse configuration files and store additional metadata like the file's path and checksum. These structures are stored within an array in the backend itself, which causes multiple problems. First, it does not make sense to keep around the file's contents with the backend itself. While this data is usually freed before being added to the backend, this brings along somewhat intricate lifecycle problems. A better solution is to store only the file paths as well as the checksum of the currently parsed content. The second problem is that the `reader` structures are stored inside an array. When re-parsing configuration files due to changed contents, we may cause this array to be reallocated, requiring us to update pointers held by callers. Furthermore, we do not keep track of includes which are already associated with a reader inside of this array. This causes us to add readers multiple times to the backend, e.g. in the scenario of refreshing configurations. This commit fixes these shortcomings. We introduce a split between the parsing data and the configuration file's metadata: the `reader` will now only hold the file's contents and the parser state, while the new `config_file` structure holds the file's path and checksum. Furthermore, the new structure is recursive in that it also holds references to the files it directly includes. The diskfile is changed to only store the top-level configuration file. These changes allow further refactorings and greatly simplify understanding the code.
Patrick Steinhardt committed
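
A rough sketch of the split, with assumed field names, showing the separation between transient parse state and per-file metadata, and the recursive tracking of included files:

```c
#include <stddef.h>

/* Transient parser state: only alive while a file is being parsed. */
struct reader {
	char *buffer;         /* the file's contents */
	const char *read_ptr; /* current parse position within the buffer */
	int line_number;
};

/* Per-file metadata kept by the backend; recursive via `includes`. */
struct config_file {
	char *path;                   /* where the file lives on disk */
	unsigned char checksum[20];   /* checksum of the contents last parsed */
	struct config_file *includes; /* files this one directly includes */
	size_t includes_len;
};
```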
-
- 12 Jul, 2017 1 commit
-
-
If the signature is invalid but we only detect that after allocating the strings, we free them. We however leave the pointers dangling in the structure the caller gave us, which can lead to a double-free. Set these pointers to `NULL` after freeing their memory to avoid this.
Carlos Martín Nieto committed
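
A minimal, generic illustration of the fix (not the library's actual signature parser): clear the caller-visible pointers after freeing them, so a later cleanup by the caller cannot free them twice.

```c
#include <stdlib.h>

struct signature { char *name; char *email; };

static int reset_on_invalid(struct signature *sig, int valid)
{
	if (!valid) {
		free(sig->name);
		free(sig->email);
		sig->name = NULL;  /* do not leave dangling pointers behind ... */
		sig->email = NULL; /* ... or the caller's own cleanup double-frees */
		return -1;
	}
	return 0;
}
```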
-
- 10 Jul, 2017 1 commit
-
-
Pointers to arbitrary data should usually be passed as a void pointer, such that it is possible to hand in variables of a different pointer type without the need to cast. This is the case when creating patches from buffers, where the buffers may contain arbitrary data. Instead of requiring the caller to care whether their buffer is e.g. `char *` or `unsigned char *`, we should instead just accept a `void *`. This is also consistent with how we treat other types, for example `git_blob`, which also just exposes its raw contents as a void pointer.
Patrick Steinhardt committed
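
A small illustration of the principle with a purely hypothetical helper: with a `const void *` parameter, both `char *` and `unsigned char *` buffers can be passed without a cast.

```c
#include <stddef.h>

/* Counts newlines in an arbitrary buffer; the parameter type places no
 * demands on what pointer type the caller happens to hold. */
static size_t count_newlines(const void *data, size_t len)
{
	const unsigned char *p = data; /* implicit conversion from void * */
	size_t i, n = 0;

	for (i = 0; i < len; i++)
		if (p[i] == '\n')
			n++;
	return n;
}
```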
-
- 03 Jul, 2017 3 commits
-
-
Next to including several files, our "common.h" header also declares various macros which are then used throughout the project. As such, we have to make sure to always include this file first in all implementation files. Otherwise, we might encounter problems or even silent behavioural differences due to macros or defines not being defined as they should be. So in fact, our header and implementation files should make sure to always include "common.h" first. This commit does so by establishing a common include pattern. Header files inside of "src" will now always include "common.h" as their first other file, separated by a newline from all the other includes to make it stand out as special. There are two cases for the implementation files. If they do have a matching header file, they will always include this one first, leading to "common.h" being transitively included as the first file. If they do not have a matching header file, they instead include "common.h" as the first file themselves. This fixes the outlined problems and will become our standard practice for header and source files inside of "src/" from now on.
Patrick Steinhardt committed -
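
A sketch of the resulting include pattern, using hypothetical file and header names:

```c
/* example.h -- headers include "common.h" first, set apart by a blank line */
#ifndef INCLUDE_example_h__
#define INCLUDE_example_h__

#include "common.h"

#include "some_other_header.h"

int example_do_something(void);

#endif
```

The matching example.c then includes "example.h" as its first include, pulling in "common.h" transitively; implementation files without a matching header include "common.h" directly as their first include.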
Some implementation files were missing the license headers. This commit adds them.
Patrick Steinhardt committed -
Some of our header files are not included at all by any of their implementing counterparts. Including them inside of these files leads to some compile errors, mostly due to unknown types because of missing includes. But there's also one case where a declared function does not match the implementation's prototype. Fix all these errors by fixing up the prototype and adding missing includes. This is preparatory work for fixing up missing includes in the implementation files.
Patrick Steinhardt committed
-
- 30 Jun, 2017 2 commits
-
-
The current order of declarations and includes between "common.h" and "w32_crtdbg_stacktrace.h" is rather complicated. Both header files make use of things defined in the other one and are thus circularly dependent on each other. This makes it currently impossible to compile the "w32_crtdbg_stacktrace.c" file when including "common.h" inside of "w32_crtdbg_stacktrace.h". We can disentangle the mess by moving the declarations of the inline crtdbg functions into the "w32_crtdbg_stacktrace.h" file and adding additional includes inside of it, such that all required functions are available to it. This allows us to break the dependency cycle.
Patrick Steinhardt committed -
Andrey Davydov committed
-
- 26 Jun, 2017 1 commit
-
-
The upstream git project provides the ability to calculate a so-called patch ID. Quoting from git-patch-id(1): "A 'patch ID' is nothing but a sum of SHA-1 of the file diffs associated with a patch, with whitespace and line numbers ignored." Patch IDs can be used to identify two patches which are probably the same thing, e.g. when a patch has been cherry-picked to another branch. This commit implements a new function `git_diff_patchid`, which gets a diff and derives an OID from it. Note the different terminology here: in libgit2, a patch is the set of differences in a single file, and a diff can contain multiple patches for different files. The implementation matches the upstream implementation and should derive the same OID for the same diff; in fact, some code has been directly derived from the upstream implementation. The upstream implementation has two different modes to calculate patch IDs, a stable and an unstable one. The old way of calculating patch IDs was unstable in the sense that a different ordering of the diffs led to different results. This oversight was fixed in git 1.9, but as git tries hard to never break existing workflows, the old and unstable way is still the default. The newer and stable way does not care about the ordering of the diff hunks, and in fact it is the mode that should probably be used today. So right now, we only implement the stable way of generating the patch ID.
Patrick Steinhardt committed
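
A hedged usage sketch of the new API (error handling trimmed; the diff is assumed to have been computed elsewhere, and passing `NULL` for the options is assumed to select the defaults, i.e. the stable mode):

```c
#include <stdio.h>
#include <git2.h>

static int print_patchid(git_diff *diff)
{
	git_oid id;
	char hex[GIT_OID_HEXSZ + 1];

	if (git_diff_patchid(&id, diff, NULL) < 0)
		return -1;

	git_oid_tostr(hex, sizeof(hex), &id);
	printf("patch id: %s\n", hex);
	return 0;
}
```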
-
- 23 Jun, 2017 1 commit
-
-
`sin_port` should be in network byte order.
Ian Douglas Scott committed
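
A small sketch of the point being made: `sin_port` holds the port in network byte order, so assign it via `htons()`.

```c
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>
#include <netinet/in.h>

static void init_addr(struct sockaddr_in *addr, uint16_t port)
{
	memset(addr, 0, sizeof(*addr));
	addr->sin_family = AF_INET;
	addr->sin_port = htons(port); /* without htons() the port is wrong on little-endian hosts */
}
```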
-
- 21 Jun, 2017 1 commit
-
-
The function `merge_diff_mark_similarity_exact` may error out early and, when it does so, free the `ours_deletes_by_oid` and `theirs_deletes_by_oid` variables. While the first one can never be uninitialized, as the first call actually assigns to it, the second variable can be freed without ever being initialized. Fix the issue by initializing both variables to `NULL`.
Patrick Steinhardt committed
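
The underlying pattern, as a generic sketch: initialize both pointers to `NULL` so the shared error path can free them unconditionally, even when an early error happens before the second one was ever assigned.

```c
#include <stdlib.h>

static int build_two_maps(void)
{
	void *ours = NULL, *theirs = NULL; /* both start out NULL ... */
	int error = -1;

	if ((ours = malloc(64)) == NULL)
		goto done;
	if ((theirs = malloc(64)) == NULL)
		goto done;

	/* ... use the maps ... */
	error = 0;

done:
	free(ours);   /* ... so freeing here is safe on every path, */
	free(theirs); /* since free(NULL) is a no-op */
	return error;
}
```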
-
- 19 Jun, 2017 2 commits
-
-
Ariel Davis committed
-
Ariel Davis committed
-
- 13 Jun, 2017 1 commit
-
-
Mohseen Mukaddam committed
-
- 12 Jun, 2017 4 commits
-
-
When in an in-memory repository - without a configuration file - do not fail to create a configuration object.
Edward Thomson committed -
Disambiguate error values: return `GIT_ENOTFOUND` when the item cannot exist in the repository (perhaps because the repository is in-memory or otherwise not backed by a filesystem), and return `-1` when there is a hard failure.
Edward Thomson committed -
When looking for an object by prefix, we query all the backends so that we can ensure that there is no ambiguity. We need to reset the `error` value between backends; otherwise the first backend may find an object by prefix, but subsequent backends may not. If we do not reset the `error` value then it will remain at `GIT_ENOTFOUND` and `read_prefix_1` will fail, despite having actually found an object.
Edward Thomson committed -
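
A hedged sketch of the loop shape described above, with the backends reduced to plain function pointers: each backend's result overwrites `error`, so an earlier `GIT_ENOTFOUND` cannot mask a later hit.

```c
#include <stddef.h>

#define GIT_ENOTFOUND  (-3) /* libgit2's "not found" error code */
#define GIT_EAMBIGUOUS (-5) /* libgit2's "ambiguous prefix" error code */

static int read_prefix_sketch(int (*lookup[])(void), size_t nbackends)
{
	int error = GIT_ENOTFOUND;
	size_t i, found = 0;

	for (i = 0; i < nbackends; i++) {
		int res = lookup[i](); /* a fresh result for every backend ... */

		if (res == 0) {
			found++;
			error = 0;         /* ... so a hit is not masked by an earlier ENOTFOUND */
		} else if (res != GIT_ENOTFOUND) {
			return res;        /* hard failure: propagate immediately */
		}
	}

	return found > 1 ? GIT_EAMBIGUOUS : error;
}
```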
Edward Thomson committed
-
- 10 Jun, 2017 3 commits
-
-
Only ignore `EBUSY` from `rmdir` when the `GIT_RMDIR_SKIP_NONEMPTY` bit is set.
Edward Thomson committed -
When deleting a directory during checkout, do not simply delete the directory, since there may be untracked files. Instead, go into the iterator and examine each file. In the original code (the code with the faulty assumption), we look to see if there's an index entry beneath the directory that we want to remove. E.g., it looks to see whether we have a workdir entry foo and an index entry foo/bar.txt. If this is not the case, then the working directory must have precious files in that directory. This part is okay. The part that's not okay is if there is an index entry foo/bar.txt: it just blows away the whole damned directory. That's not cool. Instead, by simply pushing the directory itself onto the stack and iterating each entry, we will deal with the files one by one - whether they're in the index (and can be force removed) or not (and are precious). The original code was a bad optimization, assuming that we didn't need to git_iterator_advance_into if there was any index entry in the folder. That's wrong - we could have optimized this iff all folder entries are in the index. Instead, we need to simply dig into the directory and analyze its entries.
Edward Thomson committed -
Fix ACK parsing in wait_while_ack() internal function. This patch handles the case where multi_ack_detailed mode sends 'ready' ACKs. The existing functionality would bail out too early, thus causing the processing of the ensuing packfile to fail if/when 'ready' ACKs were sent.
Roger Gee committed
-
- 08 Jun, 2017 6 commits
-
-
Initially, the setting was solely used to enable the use of `fsync()` when creating objects. Since then, its use has been extended to also cover references and index files. As the option is not yet part of any release, we can still correct this by renaming the option to something more sensible that does not suggest it relates only to objects. This commit renames the option to `GIT_OPT_ENABLE_FSYNC_GITDIR`. We also move the variable from the object to the repository source code.
Patrick Steinhardt committed -
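
Assuming the usual `git_libgit2_opts()` calling convention, enabling the renamed option looks roughly like this:

```c
#include <git2.h>

int main(void)
{
	git_libgit2_init();

	/* fsync files written into the gitdir: objects, references, the index */
	git_libgit2_opts(GIT_OPT_ENABLE_FSYNC_GITDIR, 1);

	/* ... use the library ... */

	git_libgit2_shutdown();
	return 0;
}
```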
Patrick Steinhardt committed
-
Patrick Steinhardt committed
-
Patrick Steinhardt committed
-
Patrick Steinhardt committed
-
Patrick Steinhardt committed
-