- 20 Jan, 2015 1 commit
-
-
Main change: Don't treat chars > 128 as non-printable (common in UTF-8 files) Signed-off-by: Sven Strickroth <email@cs-ware.de>
Sven Strickroth committed
-
- 21 Nov, 2014 1 commit
-
-
Vicent Marti committed
-
- 01 Oct, 2014 1 commit
-
-
Vicent Marti committed
-
- 15 Aug, 2014 1 commit
-
-
Decode base64-encoded text into a git_buf
Edward Thomson committed
-
- 23 Jun, 2014 1 commit
-
-
When checking out files, we're performing conversion into the user's native line endings, but we only want to do it for files which have consistent line endings. Refuse to perform the conversion for mixed-EOL files. The CRLF->LF filter is left as-is, as that conversion is considered to be normalization by git and should force a conversion of the line endings.
Carlos Martín Nieto committed
-
- 23 Apr, 2014 1 commit
-
-
Edward Thomson committed
-
- 01 Apr, 2014 1 commit
-
-
There are a few places where we need to join three strings to assemble a path. This adds a simple join3 function to avoid the comparatively expensive join_n (which calls strlen on each string twice).
Russell Belfer committed
-
- 20 Jan, 2014 1 commit
-
-
Patrick Reynolds committed
-
- 14 Nov, 2013 1 commit
-
-
Ben Straub committed
-
- 17 Sep, 2013 1 commit
-
-
This begins the process of exposing git_filter objects to the public API. This includes: * new public type and API for `git_buffer` through which an allocated buffer can be passed to the user * new API `git_blob_filtered_content` * make the git_filter type and GIT_FILTER_TO_... constants public
Russell Belfer committed
-
- 19 Aug, 2013 1 commit
-
-
When a git_buf contains a UTF-8 BOM, the three bytes comprising that BOM are treated as unprintable characters. For a small git_buf, the three BOM characters overwhelm the printable characters. This is problematic when trying to check out a small file as the CR/LF filtering will not apply.
Edward Thomson committed
-
- 31 Jul, 2013 1 commit
-
-
After doing further profiling, I found that a lot of time was being spent attempting to insert hashes into the file hash signature when using the rolling hash because the rolling hash approach generates a hash per byte of the file instead of one per run/line of data. To optimize this, I decided to convert back to a run-based file signature algorithm which would be more like core Git. After changing this, a number of the existing tests started to fail. In some cases, this appears to have been because the test was coded to be too specific to the particular results of the file similarity metric and in some cases there appear to have been bugs in the core rename detection code where only by the coincidence of the file similarity scoring were the expected results being generated. This renames all the variables in the core rename detection code to be more consistent and hopefully easier to follow which made it a bit easier to reason about the behavior of that code and fix the problems that I was seeing. I think it's in better shape now. There are a couple of tests now that attempt to stress test the rename detection code and they are quite slow. Most of the time is spent setting up the test data on disk and in the index. When we roll out performance improvements for index insertion, it should also speed up these tests I hope.
Russell Belfer committed
-
- 25 Mar, 2013 1 commit
-
-
This adds crlf/lf conversion functions into buf_text with more efficient implementations that bypass the high level buffer functions. They attempt to minimize the number of reallocations done and they directly write the buffer data as needed if they know that there is enough memory allocated to memcpy data. Tests are added for these new functions. The crlf.c code is updated to use the new functions. Removed the include of buf_text.h from filter.h and just include it more narrowly in the places that need it.
Russell Belfer committed
-
- 20 Feb, 2013 4 commits
-
-
This plugs in the three basic similarity strategies for handling whitespace via internal use of the pluggable API. In so doing, I realized that the use of git_buf in the hashsig API was not needed and actually just made it harder to use, so I tweaked that API as well. Note that the similarity metric is still not hooked up in the find_similarity code - this is just setting out the function that will be used.
Russell Belfer committed -
Seems to be working pretty well...
Russell Belfer committed -
This moves the similarity metric code out of buf_text and into a new file. Also, this implements a different approach to similarity measurement based on a Rabin-Karp rolling hash where we only keep the top 100 and bottom 100 hashes. In theory, that should be sufficient samples to given a fairly accurate measurement while limiting the amount of data we keep for file signatures no matter how large the file is.
Russell Belfer committed -
This adds a new `git_buf_text_hashsig` type and functions to generate these hash signatures and compare them to give a similarity score. This can be plugged into diff similarity scoring.
Russell Belfer committed
-
- 29 Jan, 2013 1 commit
-
-
Russell Belfer committed
-
- 11 Jan, 2013 1 commit
-
-
Core git just looks for NUL bytes in files when deciding about is-binary inside diff (although it uses a better algorithm in checkout, when deciding if CRLF conversion should be done). Libgit2 was using the better algorithm in both places, but that is causing some confusion. For now, this makes diff just look for NUL bytes to decide if a file is binary by content in diff.
Russell Belfer committed
-
- 28 Nov, 2012 1 commit
-
-
There are many scattered functions that look into the contents of buffers to do various text manipulations (such as escaping or unescaping data, calculating text stats, guessing if content is binary, etc). This groups all those functions together into a new file and converts the code to use that. This has two enhancements to existing functionality. The old text stats function is significantly rewritten and the BOM detection code was extended (although largely we can't deal with anything other than a UTF8 BOM).
Russell Belfer committed
-
- 10 Oct, 2012 1 commit
-
-
Russell Belfer committed
-
- 23 Aug, 2012 1 commit
-
-
Russell Belfer committed
-
- 24 Jul, 2012 1 commit
-
-
yorah committed
-
- 12 Jul, 2012 1 commit
-
-
Russell Belfer committed
-
- 11 Jul, 2012 1 commit
-
-
* `git_buf_rfind` (with tests and tests for `git_buf_rfind_next`) * `git_buf_puts_escaped` and `git_buf_puts_escaped_regex` (with tests) to copy strings into a buffer while injecting an escape sequence (e.g. '\') in front of particular characters.
Russell Belfer committed
-
- 15 May, 2012 1 commit
-
-
The goal of this work is to rewrite git_status_file to use the same underlying code as git_status_foreach. This is done in 3 phases: 1. Extend iterators to allow ranged iteration with start and end prefixes for the range of file names to be covered. 2. Improve diff so that when there is a pathspec and there is a common non-wildcard prefix of the pathspec, it will use ranged iterators to minimize excess iteration. 3. Rewrite git_status_file to call git_status_foreach_ext with a pathspec that covers just the one file being checked. Since ranged iterators underlie the status & diff implementation, this is actually fairly efficient. The workdir iterator does end up loading the contents of all the directories down to the single file, which should ideally be avoided, but it is pretty good.
Russell Belfer committed
-
- 17 Apr, 2012 1 commit
-
-
This updates to the latest clar which includes the helpers `cl_assert_equal_s` and `cl_assert_equal_i`. Convert the code over to use those and remove the old libgit2-only helpers.
Russell Belfer committed
-
- 21 Mar, 2012 1 commit
-
-
Cleaned up some other issues.
Russell Belfer committed
-
- 27 Feb, 2012 1 commit
-
-
This makes so much sense that I can't believe it hasn't been done before. Kill the old `git_fbuffer` and read files straight into `git_buf` objects. Also: In order to fully support 4GB files in 32-bit systems, the `git_buf` implementation has been changed from using `ssize_t` for storage and storing negative values on allocation failure, to using `size_t` and changing the buffer pointer to a magical pointer on allocation failure. Hopefully this won't break anything.
Vicent Martí committed
-
- 25 Jan, 2012 1 commit
-
-
Clay is the name of a programming language on the makings, and we want to avoid confusions. Sorry for the huge diff!
Vicent Martí committed
-
- 08 Dec, 2011 1 commit
-
-
This converts virtually all of the places that allocate GIT_PATH_MAX buffers on the stack for manipulating paths to use git_buf objects instead. The patch is pretty careful not to touch the public API for libgit2, so there are a few places that still use GIT_PATH_MAX. This extends and changes some details of the git_buf implementation to add a couple of extra functions and to make error handling easier. This includes serious alterations to all the path.c functions, and several of the fileops.c ones, too. Also, there are a number of new functions that parallel existing ones except that use a git_buf instead of a stack-based buffer (such as git_config_find_global_r that exists alongsize git_config_find_global). This also modifies the win32 version of p_realpath to allocate whatever buffer size is needed to accommodate the realpath instead of hardcoding a GIT_PATH_MAX limit, but that change needs to be tested still.
Russell Belfer committed
-
- 30 Nov, 2011 4 commits
-
-
This streamlines git_buf_join and removes the join-append behavior, opting instead for a very compact join-replace of the git_buf contents. The unit tests had to be updated to remove the join-append tests and have a bunch more exhaustive tests added.
Russell Belfer committed -
Taking a page from core git's strbuf, this introduces git_buf_initbuf which is an empty string that is used to initialize the git_buf ptr value even for new buffers. Now the git_buf ptr will always point to a valid NUL-terminated string. This change required jumping through a few hoops for git_buf_grow and git_buf_free to distinguish between a actual allocated buffer and the global initial value. Also, this moves the allocation related functions to be next to each other near the top of buffer.c.
Russell Belfer committed -
Russell Belfer committed
-
At a tiny cost of 1 extra byte per allocation, this makes git_buf_cstr into basically a noop, which simplifies error checking when trying to convert things to use dynamic allocation. This patch also adds a new function (git_buf_copy_cstr) for copying the cstr data directly into an external buffer.
Russell Belfer committed
-
- 28 Nov, 2011 3 commits
-
-
* replace some ints with size_ts * update NULL checks in various places
Russell Belfer committed -
This commit addresses two of the comments: * renamed existing n-input git_buf_join to git_buf_join_n * added new git_buf_join that always takes two inputs * moved some parameter error checking to asserts * extended unit tests to cover new version of git_buf_join
Russell Belfer committed -
Add new functions to git_buf for: * initializing a buffer from a string * joining one or more strings onto a buffer with separators * swapping two buffers in place * extracting data from a git_buf (leaving it empty) Also, make git_buf_free leave a git_buf back in its initted state, and slightly tweak buffer allocation sizes and thresholds. Finally, port unit tests to clay and extend with lots of new tests for the various git_buf functions.
Russell Belfer committed
-