Commits · d730d3f4f0efb269dd760a3100ae86c460b8ba36 · lvzhengyang / git2

31 Jul, 2013 1 commit

Major rename detection changes · d730d3f4

After doing further profiling, I found that a lot of time was
being spent attempting to insert hashes into the file hash
signature when using the rolling hash because the rolling hash
approach generates a hash per byte of the file instead of one
per run/line of data.

To optimize this, I decided to convert back to a run-based file
signature algorithm which would be more like core Git.
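
To make the cost difference concrete, here is a minimal sketch, not libgit2's actual implementation. It assumes a "run" is a newline-terminated line and uses FNV-1a as a stand-in hash; `line_signature` and `rolling_hash_count` are hypothetical names introduced only for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: hash each newline-terminated run with FNV-1a and
 * store up to `cap` hashes in `out`.  Returns the number of runs seen.
 * This yields one hash per line, not one per byte. */
size_t line_signature(const char *buf, size_t len, uint32_t *out, size_t cap)
{
    size_t i, runs = 0;
    uint32_t h = 2166136261u;           /* FNV-1a offset basis */
    int in_run = 0;

    for (i = 0; i < len; i++) {
        if (buf[i] == '\n') {
            if (runs < cap)
                out[runs] = h;
            runs++;
            h = 2166136261u;            /* start a fresh run */
            in_run = 0;
        } else {
            h = (h ^ (unsigned char)buf[i]) * 16777619u; /* FNV prime */
            in_run = 1;
        }
    }
    if (in_run) {                       /* final line without a newline */
        if (runs < cap)
            out[runs] = h;
        runs++;
    }
    return runs;
}

/* A rolling hash over a fixed window emits one hash per window
 * position, which is roughly one per byte of input. */
size_t rolling_hash_count(size_t len, size_t window)
{
    return (window == 0 || len < window) ? 0 : len - window + 1;
}
```

For a file of many short lines the two counts differ by about the average line length, which is the number of signature insertions saved.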

After changing this, a number of the existing tests started to
fail.  In some cases, this appears to have been because the test
was coded too specifically to the particular results of the file
similarity metric.  In other cases, there appear to have been bugs
in the core rename detection code, where the expected results were
being generated only by coincidence of the file similarity scoring.

This renames all the variables in the core rename detection code
to be more consistent and hopefully easier to follow, which made it
a bit easier to reason about the behavior of that code and fix the
problems that I was seeing.  I think it's in better shape now.

There are now a couple of tests that attempt to stress test the
rename detection code, and they are quite slow.  Most of the time
is spent setting up the test data on disk and in the index.  When
we roll out performance improvements for index insertion, I hope
that will also speed up these tests.

committed Jul 31, 2013

26 Jul, 2013 1 commit
- Fix some warnings · 8dd8aa48
  Russell Belfer committed Jul 26, 2013
  
25 Jul, 2013 3 commits

Fix rename detection to use actual blob size · a16e4172

The size data in the index may not reflect the actual size of the
blob data from the ODB when content filtering comes into play.
This commit fixes rename detection to use the actual blob size when
calculating data signatures instead of the value from the index.
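
One way the two sizes can diverge is line-ending conversion: a working-directory file with CRLF line endings may be stored as a blob with LF only. The sketch below is purely illustrative; `size_after_crlf_filter` is a hypothetical helper, not a libgit2 function, and real filters may do more than this.

```c
#include <stddef.h>

/* Hypothetical: the size a buffer would have after converting CRLF to
 * LF, as a line-ending filter might do when writing a blob. */
size_t size_after_crlf_filter(const char *buf, size_t len)
{
    size_t i, out = 0;

    for (i = 0; i < len; i++) {
        /* drop a CR only when it is immediately followed by LF */
        if (buf[i] == '\r' && i + 1 < len && buf[i + 1] == '\n')
            continue;
        out++;
    }
    return out;
}
```

A 6-byte working file containing two CRLF-terminated one-character lines would filter down to 4 bytes, so a size taken from the on-disk file would overstate the blob size.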

Because of a misunderstanding on my part, I first converted the
git_index_add_bypath API to use the post-filtered blob data size
in creating the index entry.  I backed that change out, but I
kept the overall refactoring of that routine and the new internal
git_blob__create_from_paths API because it eliminates an extra
stat() call from the code that adds a file to the index.

The existing tests actually cover this code path, at least when
running on Windows, so at this point I'm not adding new tests to
cover the changes.

committed Jul 25, 2013

Make rename detection file size fix better · effdbeb3

The previous fix for checking file sizes with rename detection
always loads the blob.  In this version, if the odb backend can
get the object header without loading the whole thing into memory,
then we'll just use that, so that we can eliminate possible rename
sources & targets without loading them.
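
The fallback described here might look roughly like the following sketch. The `backend` struct and its callbacks are hypothetical stand-ins, not libgit2's ODB backend interface, and the demo callbacks exist only so the example is self-contained.

```c
#include <stddef.h>

/* Hypothetical backend: read_header may be NULL if unsupported. */
struct backend {
    int (*read_header)(size_t *size_out, const char *id);
    int (*read)(size_t *size_out, const char *id); /* loads full object */
};

/* Prefer the cheap header lookup; fall back to a full read.
 * Sets *loaded to 1 only when the whole object had to be read. */
int object_size(size_t *size_out, int *loaded,
                const struct backend *b, const char *id)
{
    *loaded = 0;
    if (b->read_header != NULL)
        return b->read_header(size_out, id);
    *loaded = 1;
    return b->read(size_out, id);
}

/* Demo callbacks that report a fixed size. */
int demo_header(size_t *size_out, const char *id)
{
    (void)id;
    *size_out = 42;
    return 0;
}

int demo_read(size_t *size_out, const char *id)
{
    (void)id;
    *size_out = 42;
    return 0;
}
```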

committed Jul 24, 2013

Fix rename detection for tree-to-tree diffs · a5140f4d

The performance improvements I introduced for rename detection
did not take effect for tree-to-tree diffs: the blob size was not
known early enough, so the file signature always had to be
calculated anyway.

This change separates loading blobs into memory from calculating
the signature.  I can't avoid having to load the large blobs into
memory, but by moving it forward, I'm able to avoid the signature
calculation if the blob won't come into play for renames.
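
As a rough sketch of that separation (under assumptions, not the actual libgit2 code): loading records the size, and the signature is computed lazily only for pairs that survive a size check. The `file_info` struct, the toy hash, and the `max_size_gap` criterion are all hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-file record used during rename detection. */
struct file_info {
    const char *data;   /* blob content, set by the load step */
    size_t size;        /* known as soon as the blob is loaded */
    int has_sig;
    uint32_t sig;       /* toy stand-in for a real signature */
    int sig_calcs;      /* how many times the signature was computed */
};

/* Step 1: load only.  No signature work happens here. */
void file_load(struct file_info *f, const char *data, size_t size)
{
    f->data = data;
    f->size = size;
}

/* Step 2: compute the (toy) signature on first use and cache it. */
uint32_t file_signature(struct file_info *f)
{
    if (!f->has_sig) {
        size_t i;
        uint32_t h = 0;
        for (i = 0; i < f->size; i++)
            h = h * 31u + (unsigned char)f->data[i];
        f->sig = h;
        f->has_sig = 1;
        f->sig_calcs++;
    }
    return f->sig;
}

/* Returns 1 if the toy signatures match.  Pairs whose sizes differ by
 * more than max_size_gap are rejected before any signature is built. */
int maybe_compare(struct file_info *a, struct file_info *b,
                  size_t max_size_gap)
{
    size_t gap = a->size > b->size ? a->size - b->size
                                   : b->size - a->size;
    if (gap > max_size_gap)
        return 0;
    return file_signature(a) == file_signature(b);
}
```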

committed Jul 24, 2013

24 Jul, 2013 6 commits
- Fix incorrect comment · f5c4d022
  Russell Belfer committed Jul 24, 2013
  
- Add rename test that used to be really slow · 397357a0
```
Before the optimization commits, this test used to take about 20
seconds to run on my machine.  Afterwards, there are still a couple
of seconds of data setup, but the actual diff and rename detection
run in a fraction of a second.
```
  Russell Belfer committed Jul 24, 2013
- Use local variables in hash calc to avoid aliasing · 427cc255
  Russell Belfer committed Jul 24, 2013
  
- Don't check rename if file size difference is huge · 18e9efc4
  Russell Belfer committed Jul 24, 2013
  
- Don't do text diff unless content will be used · 69c66b55
  Russell Belfer committed Jul 24, 2013
  
- Don't unload diff data unless loaded · 39a1a662
  Russell Belfer committed Jul 24, 2013
  
23 Jul, 2013 4 commits
- Merge pull request #1745 from libgit2/doc-fixes · cdbcb8dd
```
Doc fixes
```
  Russell Belfer committed Jul 23, 2013
- remote: fix git_remote_download() documentation · 64061d4a
```
The description of what the function does hasn't been true for quite a
while. Change it to reflect the way it currently works.

While here, remove an even older comment about missing features that
have been implemented.
```
  Carlos Martín Nieto committed Jul 23, 2013
- Clean up some documentation · c05a55b0
```
clang's docparser highlighted these.
```
  Carlos Martín Nieto committed Jul 23, 2013
- Merge pull request #1732 from libgit2/revwalk-glob-should-ignore-invalid · e5bdf829
```
Invalid refs on disk cause revwalk globbing to fail
```
  Vicent Martí committed Jul 22, 2013
22 Jul, 2013 4 commits

Update init and clean for revwalk::basic tests · 4cee9b86

The new tests don't always want to use the same fixture data as
the old ones so this makes it configurable on a per-test basis.

committed Jul 22, 2013

Fix warning message about mismatched types · 989710d9
Russell Belfer committed Jul 22, 2013

Use pool for loose refdb string allocations · c77342ef

Instead of using lots of strdup calls, this adds a memory pool to
the loose refs iteration code and uses it for keeping track of the
loose refs array.  Memory usage could probably be reduced even
further by eliminating the vector and just scanning by adding the
strlen of each ref, but that would be a more intrusive change.

This also updates the error handling to be more thorough about
checking for failed allocations, etc.
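
The idea of replacing many strdup calls with a pool can be sketched as follows. This is a deliberately simplified, fixed-capacity pool written for illustration; it is not libgit2's pool implementation, and `str_pool`, `pool_init`, `pool_strdup`, and `pool_free` are hypothetical names.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical fixed-capacity string pool: one allocation holds many
 * strings, and one free releases them all. */
struct str_pool {
    char *buf;
    size_t used, cap;
};

/* Returns 0 on success, -1 if the allocation failed. */
int pool_init(struct str_pool *p, size_t cap)
{
    p->buf = malloc(cap);
    p->used = 0;
    p->cap = p->buf ? cap : 0;
    return p->buf ? 0 : -1;
}

/* Copy `s` into the pool; returns NULL if there is not enough room. */
char *pool_strdup(struct str_pool *p, const char *s)
{
    size_t n = strlen(s) + 1;   /* include the terminating NUL */
    char *dst;

    if (n > p->cap - p->used)
        return NULL;
    dst = p->buf + p->used;
    memcpy(dst, s, n);
    p->used += n;
    return dst;
}

/* Release every string at once. */
void pool_free(struct str_pool *p)
{
    free(p->buf);
    p->buf = NULL;
    p->used = p->cap = 0;
}
```

Because strings are packed back to back, callers check each `pool_strdup` result for NULL, which mirrors the more thorough allocation checking the commit mentions.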

committed Jul 22, 2013

git_reference_next_name must match git_reference_next · b7107131

The git_reference_next API silently skips invalid references when
scanning the loose refs.  The git_reference_next_name API should
skip the same ones even though it isn't creating the reference
object.

This adds a test with an invalid loose reference and makes sure
that both APIs skip the same entries and generate the same results.
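
One way to guarantee that two iteration functions skip the same entries is to route both through a single stepping routine. The sketch below illustrates that design only; the validity rule, `ref_iter`, and the `iter_*` functions are hypothetical and do not reflect how libgit2 decides that a loose reference is invalid.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical validity rule, for illustration only: a name must start
 * with "refs/" and must not end with ".lock". */
int ref_name_is_valid(const char *name)
{
    size_t n = strlen(name);

    if (strncmp(name, "refs/", 5) != 0)
        return 0;
    if (n >= 5 && strcmp(name + n - 5, ".lock") == 0)
        return 0;
    return 1;
}

struct ref_iter {
    const char **names;
    size_t count, pos;
};

struct ref {
    const char *name;
};

/* Shared stepping logic: returns the next valid name, or NULL when
 * the iteration is exhausted.  Invalid entries are silently skipped. */
const char *iter_advance(struct ref_iter *it)
{
    while (it->pos < it->count) {
        const char *name = it->names[it->pos++];
        if (ref_name_is_valid(name))
            return name;
    }
    return NULL;
}

/* Name-only variant. */
const char *iter_next_name(struct ref_iter *it)
{
    return iter_advance(it);
}

/* Object-producing variant; returns 0 on success, -1 at the end. */
int iter_next(struct ref *out, struct ref_iter *it)
{
    const char *name = iter_advance(it);

    if (name == NULL)
        return -1;
    out->name = name;
    return 0;
}
```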

committed Jul 22, 2013

19 Jul, 2013 7 commits
- Merge pull request #1743 from ethomson/readme · 1cd9dc29
```
Clarify when to use github issues
```
  Martin Woodward committed Jul 19, 2013
- Update README.md · bef59b1b
  Edward Thomson committed Jul 19, 2013
  
- Merge pull request #1726 from crazymaster/development · 97309dd0
```
git_buf_text_gather_stats doesn't work for multi-byte characters
```
  Ben Straub committed Jul 19, 2013
- Clarify when to use github issues · 41a93cc6
```
Suggest that github issues are to be used for bug reports, while questions about usage should be directed to StackOverflow.
```
  Edward Thomson committed Jul 19, 2013
- Merge pull request #1742 from martinwoodward/Refresh-Readme · 847b8e0e
```
Refresh readme and contributing guidance
```
  Ben Straub committed Jul 19, 2013
- Update contributing guidance to explain PR flow · 6ca83665
```
Updating the contributing guidance to explain a bit more about how we use
PRs
```
  Martin Woodward committed Jul 19, 2013
- Tidy up the methods of contacting the project · 3e3d332b
```
Updated the methods of getting involved with the project and asking
questions.
```
  Martin Woodward committed Jul 19, 2013
18 Jul, 2013 3 commits
- Typo · 275d8d55
  Ben Straub committed Jul 18, 2013
  
- Merge pull request #1736 from ben/default-to-cdecl · 79400365
```
Switch default calling convention to cdecl
```
  Vicent Martí committed Jul 18, 2013
- Merge pull request #1722 from libgit2/ntk/fix/issue_1722 · 99a9c86c
```
git_revparse_ext: should return a NULL reference when the revparse expression doesn't lead to a reference
```
  Ben Straub committed Jul 17, 2013
17 Jul, 2013 3 commits
- Merge pull request #1735 from ethomson/ignored_are_not_rename_candidates · d2db351c
```
don't include ignored as rename candidates
```
  Vicent Martí committed Jul 17, 2013
- don't include ignored as rename candidates · d55bed1a
  Edward Thomson committed Jul 17, 2013
  
- Switch default calling convention to cdecl. · e49dc687
  Ben Straub committed Jul 17, 2013
  
16 Jul, 2013 2 commits
- Merge pull request #1731 from alindeman/patch-1 · 4e05fa7d
```
Small grammar fix in docs
```
  Ben Straub committed Jul 15, 2013
- Small grammar fix in docs · 51b0397a
  Andy Lindeman committed Jul 15, 2013
  
15 Jul, 2013 6 commits
- Merge pull request #1728 from ivoire/small_fixes · f5385150
```
Small fixes
```
  Vicent Martí committed Jul 15, 2013
- Merge pull request #1729 from tiennou/remote-owner · 3f8086e0
```
Add `git_remote_owner`.
```
  Vicent Martí committed Jul 15, 2013
- Add `git_remote_owner` · 85e1eded
  Etienne Samson committed Jul 15, 2013
  
- Fix some more memory leaks in error path · c6451624
  Rémi Duraffort committed Jul 15, 2013
  
- pack: fix memory leak in error path · 050af8bb
  Rémi Duraffort committed Jul 15, 2013
  
- index: fix potential memory leaks · 8d6ef4bf
  Rémi Duraffort committed Jul 15, 2013
  