1. 31 Aug, 2020 2 commits
  2. 28 Aug, 2020 1 commit
  3. 27 Aug, 2020 1 commit
  4. 24 Aug, 2020 4 commits
  5. 21 Aug, 2020 2 commits
  6. 18 Aug, 2020 1 commit
  7. 05 Aug, 2020 4 commits
    • zstream: handle Z_BUF_ERROR appropriately in get_output_chunk · 9bb61bad
      Our processing loop in `git_zstream_get_output_chunk` does not handle
      `Z_BUF_ERROR` appropriately at the end of a compressed window.
      
      From the zlib manual, inflate will return:
      
      > Z_BUF_ERROR if no progress was possible or if there was not enough
      > room in the output buffer when Z_FINISH is used. Note that Z_BUF_ERROR
      > is not fatal, and inflate() can be called again with more input and
      > more output space to continue decompressing.
      
      In our loop, we were waiting until we got the expected size, then
      ensuring that we were at `Z_STREAM_END`.  We are not guaranteed to be,
      since zlib may be in the `Z_BUF_ERROR` state where it has consumed a
      full window's worth of data, but it doesn't know that it's really at the
      end of the stream.  There _could_ be more compressed data, and zlib
      cannot _know_ that there is none until we make a subsequent call.
      
      We can change the loop to look for the end of stream instead of our
      expected size.  This allows us to call inflate one last time when we are
      at the end of a window (and in the `Z_BUF_ERROR` state), allowing it to
      recognize the end of the stream, and move from the `Z_BUF_ERROR` state
      to the `Z_STREAM_END` state.
      
      If we do this, we need another exit condition: when `bytes == 0`, then
      no progress could be made and we should stop trying to inflate.  This
      will be an error case, caught by the size and/or end-of-stream test.
      Edward Thomson committed
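
The loop shape described in this commit message can be sketched outside of libgit2. The following is only an illustration using Python's standard `zlib` module, not the actual C change: `inflate_exact`, `expected_size` and `chunk_size` are hypothetical names, and Python's `decompressobj` hides the raw `Z_BUF_ERROR` return code, so the "not yet at end of stream" state shows up here as `eof` still being false.

```python
import zlib


def inflate_exact(compressed: bytes, expected_size: int, chunk_size: int = 16) -> bytes:
    """Inflate `compressed`, looping until end of stream rather than until
    `expected_size` bytes have been produced (hypothetical helper)."""
    inflater = zlib.decompressobj()
    out = bytearray()
    pending = compressed

    # Loop on end-of-stream, not on the expected size: even when `out` is
    # already full, one more call lets zlib notice that the stream is over.
    while not inflater.eof:
        # Ask for at most `chunk_size` output bytes per call.
        chunk = inflater.decompress(pending, chunk_size)
        pending = inflater.unconsumed_tail
        if not chunk:
            # Second exit condition: no output was produced, so stop
            # retrying and let the checks below decide success or failure.
            break
        out += chunk

    # Size and end-of-stream must both hold, otherwise the data is bad.
    if len(out) != expected_size or not inflater.eof:
        raise ValueError("truncated or mis-sized zlib stream")
    return bytes(out)


data = b"libgit2 zstream " * 64
packed = zlib.compress(data)
restored = inflate_exact(packed, len(data))
```

The small `chunk_size` is chosen so that the output fills up exactly at a chunk boundary, which is the situation the commit describes. A truncated input, or an `expected_size` that does not match the stream, ends in the `ValueError` branch instead of looping forever.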
  8. 03 Aug, 2020 5 commits
  9. 13 Jul, 2020 1 commit
    • config_entries: Avoid excessive map operations · f2400a9c
      When appending a config entry, we always look up the existing map
      entry first and then update the map to point at the new config value.
      In the common scenario where keys aren't being overridden, this is the
      best we can do. But when a key gets set multiple times, we still
      perform both map operations for every value. In extreme cases, hashing
      the map keys will thus start to dominate performance.
      
      Let's optimize the pattern by using a separately allocated map entry.
      Currently, we always put the current list entry into the map and update
      it to get any overridden multivar. As these list entries are also used
      to iterate config entries, we cannot update them in-place in the map and
      are thus forced to always set the map to contain the new entry. But with
      a separately allocated map entry, we can now create one once per config
      key and insert it into the map. Whenever appending a new config value
      with the same key, we can now just update the map entry in-place instead
      of having to replace the map entry completely.
      
      This reduces calls to the hashing function by half and trades the
      improved runtime for one more allocation per unique config key. Given
      that the refactoring arguably improves code readability by splitting
      concerns of the `config_entry_list` type and not having to track it in
      two different structures, this alone would already be reason enough to
      take the trade.
      
      Given a pathological case of a gitconfig with 100,000 repeated keys and
      a section of length 10,000 characters, this halves runtime from
      approximately 14 seconds to 7 seconds, as expected.
      Patrick Steinhardt committed
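
The effect of a separately allocated map entry on hashing can be illustrated with a small model. This is a sketch in Python rather than the C data structures from the commit: `CountingKey`, `ListEntry`, `MapHead` and both `append_*` functions are hypothetical names, and a Python `dict` stands in for the string map.

```python
class CountingKey:
    """Hypothetical key wrapper that counts how often it gets hashed."""
    hash_calls = 0

    def __init__(self, name: str) -> None:
        self.name = name

    def __hash__(self) -> int:
        CountingKey.hash_calls += 1
        return hash(self.name)

    def __eq__(self, other: object) -> bool:
        return isinstance(other, CountingKey) and self.name == other.name


class ListEntry:
    """One config value; `previous` links to the value it overrides."""
    def __init__(self, value, previous=None):
        self.value = value
        self.previous = previous


class MapHead:
    """Separately allocated map entry, created once per unique key."""
    def __init__(self):
        self.latest = None


def append_replacing(entries, index, key, value):
    # Old pattern: the list entry itself lives in the map, so every
    # append needs a lookup followed by a full replacement.
    previous = index.get(key)       # map operation 1
    entry = ListEntry(value, previous)
    index[key] = entry              # map operation 2
    entries.append(entry)


def append_in_place(entries, index, key, value):
    # New pattern: look up the map head and update it in place.
    head = index.get(key)           # map operation 1
    if head is None:
        head = MapHead()
        index[key] = head           # only when the key is first seen
    entry = ListEntry(value, head.latest)
    head.latest = entry
    entries.append(entry)


def count_hash_calls(append, repeats):
    """Append `repeats` values under one repeated key, return hash count."""
    CountingKey.hash_calls = 0
    entries, index = [], {}
    for i in range(repeats):
        append(entries, index, CountingKey("core.editor"), str(i))
    return CountingKey.hash_calls


replacing = count_hash_calls(append_replacing, 1000)
in_place = count_hash_calls(append_in_place, 1000)
```

For a repeated key, `in_place` comes out at roughly half of `replacing`, at the cost of one extra `MapHead` allocation per unique key, which mirrors the trade-off the commit message describes.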
  10. 12 Jul, 2020 19 commits