1. 15 Jun, 2019 1 commit
    • wildmatch: import wildmatch from git.git · a9f57629
      In commit 70a8fc999d (stop using fnmatch (either native or
      compat), 2014-02-15), upstream git has switched over all code
      from their internal fnmatch copy to its new wildmatch code. We
      haven't followed suit, and thus have developed some
      incompatibilities in how we match wildcard patterns.
      
      Import git's wildmatch from v2.22.0 and add a test suite based on
      their t3070-wildmatch.sh tests.
      Patrick Steinhardt committed
  2. 14 Jun, 2019 3 commits
    • posix: remove `p_fallocate` abstraction · 2d85c7e8
      By now, we have repeatedly failed to provide a nice
      cross-platform implementation of `p_fallocate`. Recent tries to
      do that escalated quite fast to a set of different CMake checks,
      implementations, fallbacks, etc., which started to look really
      awkward to maintain. In fact, `p_fallocate` had only been
      introduced in commit 4e3949b7 (tests: test that largefiles can
      be read through the tree API, 2019-01-30) to support a test with
      large files, but given the maintenance costs it just does not
      seem to be worth it.
      
      As we have removed the sole user of `p_fallocate` in the previous
      commit, let's drop it altogether.
      Patrick Steinhardt committed
    • Rename opt init functions to `options_init` · 0b5ba0d7
      In libgit2 nomenclature, when we need to verb a direct object, we name
      a function `git_directobject_verb`.  Thus, if we need to init an options
      structure named `git_foo_options`, then the name of the function that
      does that should be `git_foo_options_init`.
      
      The previous name, `git_foo_init_options`, is close - it _sounds_ as
      if it's initializing the options of a `foo`, but in fact
      `git_foo_options` is its own noun that should be respected.
      
      Deprecate the old names; they'll now call directly to the new ones.
      Edward Thomson committed
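The naming rule and the deprecation wrapper described in the commit above can be sketched as follows. This is a minimal illustration only: `git_foo_options`, its fields, and `GIT_FOO_OPTIONS_VERSION` are hypothetical placeholders taken from the commit message's own `foo` example, not a real libgit2 structure, and the `-1` error code is an assumption.

```c
#include <string.h>

/* Hypothetical options structure following the git_foo_options pattern. */
typedef struct {
	unsigned int version;
	int flags;
} git_foo_options;

#define GIT_FOO_OPTIONS_VERSION 1

/* New-style name: git_<directobject>_<verb>, i.e. "init" the "foo_options". */
int git_foo_options_init(git_foo_options *opts, unsigned int version)
{
	if (opts == NULL || version != GIT_FOO_OPTIONS_VERSION)
		return -1; /* assumed error code for this sketch */

	memset(opts, 0, sizeof(*opts));
	opts->version = version;
	return 0;
}

/* Deprecated old-style name, kept as a thin wrapper around the new one. */
int git_foo_init_options(git_foo_options *opts, unsigned int version)
{
	return git_foo_options_init(opts, version);
}
```

Keeping the old name as a forwarding wrapper means existing callers keep compiling while new code can adopt the `options_init` spelling.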
  3. 19 May, 2019 8 commits
  4. 22 Feb, 2019 1 commit
  5. 15 Feb, 2019 9 commits
    • oidmap: remove legacy low-level interface · bd66925a
      Remove the low-level interface that was exposing implementation details of
      `git_oidmap` to callers. From now on, only the high-level functions shall be
      used to retrieve or modify values of a map. Adjust remaining existing callers.
      Patrick Steinhardt committed
    • strmap: remove legacy low-level interface · fdfabdc4
      Remove the low-level interface that was exposing implementation details of
      `git_strmap` to callers. From now on, only the high-level functions shall be
      used to retrieve or modify values of a map. Adjust remaining existing callers.
      Patrick Steinhardt committed
    • maps: provide high-level iteration interface · 18cf5698
      Currently, our headers need to leak some implementation details of maps due to
      their direct use of indices in the implementation of their foreach macros. This
      makes it impossible to completely hide the map structures away, and also makes
      it impossible to include the khash implementation header in the C files of the
      respective map only.
      
      This is now being fixed by providing a high-level iteration interface
      `map_iterate`, which takes as inputs the map that shall be iterated over, an
      iterator as well as the locations where keys and values shall be stored. For
      simplicity's sake, the iterator is a simple `size_t` that shall be initialized to
      `0` on the first call. All existing foreach macros are then adjusted to make use
      of this new function.
      Patrick Steinhardt committed
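To illustrate the iteration interface described above, here is a small sketch of a map that hides its slot indices behind a `size_t` iterator. The `toy_map` type, the slot layout, the parameter order, and the `TOY_ITEROVER` return code are all assumptions made for this example; they are not the actual libgit2 or khash definitions.

```c
#include <stddef.h>

#define TOY_MAP_SLOTS 8
#define TOY_ITEROVER (-1) /* hypothetical "no more entries" code */

/* Toy map with a fixed slot array; only the iterate function looks at slots. */
typedef struct {
	const char *keys[TOY_MAP_SLOTS];
	void *values[TOY_MAP_SLOTS];
	int used[TOY_MAP_SLOTS];
} toy_map;

/*
 * Find the next occupied slot at or after *iter, store its key and value in
 * the given locations, and advance *iter past it. The caller initializes
 * *iter to 0 before the first call and never interprets its value.
 */
int toy_map_iterate(const char **key, void **value, const toy_map *map, size_t *iter)
{
	size_t i = *iter;

	while (i < TOY_MAP_SLOTS && !map->used[i])
		i++;
	if (i >= TOY_MAP_SLOTS)
		return TOY_ITEROVER;

	if (key)
		*key = map->keys[i];
	if (value)
		*value = map->values[i];
	*iter = i + 1;
	return 0;
}
```

A caller would loop with `size_t iter = 0; while (toy_map_iterate(&k, &v, &map, &iter) == 0) { ... }`, so a foreach macro built on top needs no knowledge of the map's internal layout.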
    • oidmap: introduce high-level setter for key/value pairs · 2e0a3048
      Currently, one would use either `git_oidmap_insert` to insert key/value pairs
      into a map or `git_oidmap_put` to insert a key only. These functions have
      historically been macros, which is why their syntax is kind of weird: instead of
      returning an error code directly, they instead have to be passed a pointer to
      where the return value shall be stored. This does not match libgit2's common
      idiom of directly returning error codes. Furthermore, `git_oidmap_put` is tightly
      coupled with implementation details of the map as it exposes the index of
      inserted entries.
      
      Introduce a new function `git_oidmap_set`, which takes as parameters the map,
      key and value and directly returns an error code. Convert all trivial callers of
      `git_oidmap_insert` and `git_oidmap_put` to make use of it.
      Patrick Steinhardt committed
    • oidmap: introduce high-level getter for values · 9694ef20
      The current way of looking up an entry from a map is tightly coupled with the
      map implementation, as one first has to look up the index of the key and then
      retrieve the associated value by using the index. As a caller, you usually do
      not care about any indices at all, though, so this is more complicated than
      really necessary. Furthermore, it invites errors if the correct
      error checking sequence is not being followed.
      
      Introduce a new high-level function `git_oidmap_get` that takes a map and a key
      and returns a pointer to the associated value if such a key exists. Otherwise,
      a `NULL` pointer is returned. Adjust all callers that can trivially be
      converted.
      Patrick Steinhardt committed
    • strmap: introduce high-level setter for key/value pairs · 03555830
      Currently, one would use the function `git_strmap_insert` to insert key/value
      pairs into a map. This function has historically been a macro, which is why its
      syntax is kind of weird: instead of returning an error code directly, it instead
      has to be passed a pointer to where the return value shall be stored. This does
      not match libgit2's common idiom of directly returning error codes.
      
      Introduce a new function `git_strmap_set`, which takes as parameters the map,
      key and value and directly returns an error code. Convert all callers of
      `git_strmap_insert` to make use of it.
      Patrick Steinhardt committed
    • strmap: introduce `git_strmap_get` and use it throughout the tree · ef507bc7
      The current way of looking up an entry from a map is tightly coupled with the
      map implementation, as one first has to look up the index of the key and then
      retrieve the associated value by using the index. As a caller, you usually do
      not care about any indices at all, though, so this is more complicated than
      really necessary. Furthermore, it invites errors if the correct
      error checking sequence is not being followed.
      
      Introduce a new high-level function `git_strmap_get` that takes a map and a key
      and returns a pointer to the associated value if such a key exists. Otherwise,
      a `NULL` pointer is returned. Adjust all callers that can trivially be
      converted.
      Patrick Steinhardt committed
    • maps: provide a uniform entry count interface · 7e926ef3
      There currently exist two different function names for getting the entry count
      of maps, where offset maps and string maps use `num_entries` and OID maps use
      `size`. In most programming languages with built-in map types, this is simply
      called `size`, which is also shorter to type. Thus, this commit renames the
      other two functions `num_entries` to match the common way and adjusts all
      callers.
      Patrick Steinhardt committed
    • maps: use uniform lifecycle management functions · 351eeff3
      Currently, the lifecycle functions for maps (allocation, deallocation, resize)
      are not named in a uniform way and do not have a uniform function signature.
      Rename the functions to fix that, and stick to libgit2's naming scheme of saying
      `git_foo_new`. This results in the following new interface for allocation:
      
      - `int git_<t>map_new(git_<t>map **out)` to allocate a new map, returning an
        error code if we ran out of memory
      
      - `void git_<t>map_free(git_<t>map *map)` to free a map
      
      - `void git_<t>map_clear(git_<t>map *map)` to remove all entries from a map
      
      This commit also fixes all existing callers.
      Patrick Steinhardt committed
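The map commits in this group converge on a small, uniform interface: `new`, `free`, `clear`, `size`, plus a setter that returns an error code and a getter that returns the value or `NULL`. The sketch below mirrors that shape with a hypothetical `toy_strmap` backed by a plain array and linear search. It is an illustration of the interface only, not libgit2's khash-based implementation, and the `-1` error code is an assumption.

```c
#include <stdlib.h>
#include <string.h>

/* Toy string map; keys are borrowed pointers, not copied. */
typedef struct {
	const char **keys;
	void **values;
	size_t size, capacity;
} toy_strmap;

/* Allocate an empty map; returns -1 if we ran out of memory. */
int toy_strmap_new(toy_strmap **out)
{
	toy_strmap *map = calloc(1, sizeof(*map));
	if (map == NULL)
		return -1;
	*out = map;
	return 0;
}

/* Remove all entries but keep the allocated storage. */
void toy_strmap_clear(toy_strmap *map)
{
	map->size = 0;
}

void toy_strmap_free(toy_strmap *map)
{
	if (map == NULL)
		return;
	free(map->keys);
	free(map->values);
	free(map);
}

size_t toy_strmap_size(const toy_strmap *map)
{
	return map->size;
}

/* Return the value stored for key, or NULL if the key does not exist. */
void *toy_strmap_get(const toy_strmap *map, const char *key)
{
	size_t i;
	for (i = 0; i < map->size; i++)
		if (strcmp(map->keys[i], key) == 0)
			return map->values[i];
	return NULL;
}

/* Insert or overwrite a key/value pair; the error code is returned directly. */
int toy_strmap_set(toy_strmap *map, const char *key, void *value)
{
	size_t i;

	for (i = 0; i < map->size; i++) {
		if (strcmp(map->keys[i], key) == 0) {
			map->values[i] = value;
			return 0;
		}
	}

	if (map->size == map->capacity) {
		size_t cap = map->capacity ? map->capacity * 2 : 4;
		const char **keys;
		void **values;

		if ((keys = realloc(map->keys, cap * sizeof(*keys))) == NULL)
			return -1;
		map->keys = keys;
		if ((values = realloc(map->values, cap * sizeof(*values))) == NULL)
			return -1;
		map->values = values;
		map->capacity = cap;
	}

	map->keys[map->size] = key;
	map->values[map->size] = value;
	map->size++;
	return 0;
}
```

Note that callers never see an index: lookups and insertions go through the key alone, which is the decoupling the commit messages argue for.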
  6. 25 Jan, 2019 2 commits
  7. 22 Jan, 2019 1 commit
  8. 06 Jan, 2019 1 commit
  9. 28 Nov, 2018 4 commits
    • stream registration: take an enum type · 02bb39f4
      Accept an enum (`git_stream_t`) during custom stream registration that
      indicates whether the registration structure should be used for standard
      (non-TLS) streams or TLS streams.
      Edward Thomson committed
    • stream: provide generic registration API · df2cc108
      Update the new stream registration API to be `git_stream_register`
      which takes a registration structure and a TLS boolean.  This allows
      callers to register non-TLS streams as well as TLS streams.
      
      Provide `git_stream_register_tls` that takes just the init callback for
      backward compatibility.
      Edward Thomson committed
    • tls: introduce a wrap function · 43b592ac
      Introduce `git_tls_stream_wrap` which will take an existing `stream`
      with an already connected socket and begin speaking TLS on top of it.
      This is useful if you've built a connection to a proxy server and you
      wish to begin CONNECT over it to tunnel a TLS connection.
      
      Also update the pluggable TLS stream layer so that it can accept a
      registration structure that provides an `init` and `wrap` function,
      instead of a single initialization function.
      Edward Thomson committed
    • khash: remove intricate knowledge of khash types · 852bc9f4
      Instead of using the `khiter_t`, `git_strmap_iter` and `khint_t` types,
      simply use `size_t` instead. This decouples code from the khash stuff
      and makes it possible to move the khash includes into the implementation
      files.
      Patrick Steinhardt committed
  10. 14 Nov, 2018 1 commit
    • strntol: fix out-of-bounds reads when parsing numbers with leading sign · 4209a512
      When parsing a number, we accept a leading plus or minus sign to return
      a positive or negative number. When the parsed string has such a leading
      sign, we set up a flag indicating whether the number is negative and
      advance the pointer to the next character in that string. This misses
      updating the number of bytes in the string, though, which is why the
      parser may later on do an out-of-bounds read.
      
      Fix the issue by correctly updating both the pointer and the number of
      remaining bytes. Furthermore, we need to check whether we actually have
      any bytes left after having advanced the pointer, as otherwise the
      auto-detection of the base may do an out-of-bounds access. Add a test
      that detects the out-of-bounds read.
      
      Note that this is not actually security critical. While there are a lot
      of places where the function is called, all of these places are guarded
      or irrelevant:
      
      - commit list: this operates on objects from the ODB, which are always
        NUL terminated and may thus not trigger the off-by-one OOB read.
      
      - config: the configuration is NUL terminated.
      
      - curl stream: user input is being parsed that is always NUL terminated.
      
      - index: the index is read via `git_futils_readbuffer`, which always NUL
        terminates it.
      
      - loose objects: used to parse the length from the object's header. As
        we check previously that the buffer contains a NUL byte, this is safe.
      
      - rebase: this parses numbers from the rebase instruction sheet. As the
        rebase code uses `git_futils_readbuffer`, the buffer is always NUL
        terminated.
      
      - revparse: this parses a user provided buffer that is NUL terminated.
      
      - signature: this parses the header information of objects. As objects
        read from the ODB are always NUL terminated, this is a non-issue. The
        constructor `git_signature_from_buffer` does not accept a length
        parameter for the buffer, so the buffer needs to be NUL terminated, as
        well.
      
      - smart transport: the buffer that is parsed is NUL terminated.
      
      - tree cache: this parses the tree cache from the index extension. The
        index itself is read via `git_futils_readbuffer`, which always NUL
        terminates it.
      
      - winhttp transport: user input is being parsed that is always NUL
        terminated.
      Patrick Steinhardt committed
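The fix described above boils down to advancing the pointer and decrementing the remaining length together, then checking that bytes are still left. A minimal sketch under simplifying assumptions: `toy_parse_signed` is a hypothetical helper, handles base 10 only, and does no overflow checking. It is not the `git__strntol` implementation.

```c
#include <stddef.h>

/*
 * Parse a decimal number from at most len bytes starting at p.
 * Returns 0 on success and -1 if no digit was found within bounds.
 * Overflow is deliberately not handled in this sketch.
 */
int toy_parse_signed(long *out, const char *p, size_t len)
{
	int neg = 0;
	long n = 0;
	size_t digits = 0;

	if (len > 0 && (*p == '-' || *p == '+')) {
		neg = (*p == '-');
		p++;
		len--; /* keep the remaining length in sync with the pointer */
	}

	/* len is checked before every dereference, so "+" alone is safe. */
	while (len > 0 && *p >= '0' && *p <= '9') {
		n = n * 10 + (*p - '0');
		p++;
		len--;
		digits++;
	}

	if (digits == 0)
		return -1;

	*out = neg ? -n : n;
	return 0;
}
```

Because the length is decremented with the sign, a buffer such as `-129` with a stated length of 3 parses as `-12` and the fourth byte is never read.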
  11. 02 Nov, 2018 2 commits
    • strntol: fix detection and skipping of base prefixes · 50d09407
      The `git__strntol` family of functions has the ability to auto-detect
      a number's base if the string has either the common '0x' prefix for
      hexadecimal numbers or '0' prefix for octal numbers. The detection of
      such prefixes and their subsequent handling has two major issues, though,
      that are being fixed in one go now.
      
      - We do not do any bounds checking prior to verifying the '0x' base.
        While we do verify beforehand that there is at least one digit
        available, we fail to verify that there are two digits available and
        thus may do an out-of-bounds read when parsing this
        two-character prefix.
      
      - When skipping the prefix of such numbers, we only update the pointer
        without also updating the number of remaining bytes. Thus if we
        try to parse a number '0x1' of total length 3, we will first skip the
        first two bytes and then try to read 3 bytes starting at '1'.
      
      Fix both issues by disentangling the logic. Instead of doing the
      detection and skipping of such prefixes in one go, we will now first try
      to detect the base while also honoring how many bytes are left. Only if
      we have a valid base that is either 8 or 16 and have one of the known
      prefixes, we will now advance the pointer and update the remaining bytes
      in one step.
      
      Add some tests that verify that no out-of-bounds parsing happens and
      that autodetection works as advertised.
      Patrick Steinhardt committed
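The disentangled logic described above, detecting the base first while honoring the remaining length and only then skipping the prefix, might look roughly like this. `toy_detect_base` is a hypothetical helper written for illustration. As a simplification, it skips only the two-character `0x` prefix and leaves an octal number's leading `0` in place, since that is itself a valid octal digit.

```c
#include <stddef.h>

/*
 * Determine the base of the number in the first *len bytes at *p.
 * A base of 0 requests auto-detection. When a hexadecimal prefix is
 * present, the pointer and the remaining length are updated together.
 */
int toy_detect_base(const char **p, size_t *len, int base)
{
	const char *s = *p;
	size_t n = *len;
	/* Both prefix bytes must lie within bounds before s[1] is read. */
	int has_hex_prefix = n >= 2 && s[0] == '0' && (s[1] == 'x' || s[1] == 'X');

	if (base == 0) {
		if (has_hex_prefix)
			base = 16;
		else if (n >= 1 && s[0] == '0')
			base = 8;
		else
			base = 10;
	}

	if (base == 16 && has_hex_prefix) {
		s += 2;
		n -= 2; /* never advance the pointer without shrinking the length */
	}

	*p = s;
	*len = n;
	return base;
}
```

For the `0x1` example from the commit message with a total length of 3, this leaves the pointer at `1` with one byte remaining, rather than three.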
    • strntol: fix out-of-bounds read when skipping leading spaces · 41863a00
      The `git__strntol` family of functions accepts leading spaces and will
      simply skip them. The skipping will not honor the provided buffer's
      length, though, which may lead it to read outside of the provided
      buffer's bounds if it is not a simple NUL-terminated string.
      Furthermore, if leading space is trimmed, the function will further
      advance the pointer but not update the number of remaining bytes, which
      may also lead to out-of-bounds reads.
      
      Fix the issue by properly paying attention to the buffer length and
      updating it when stripping leading whitespace characters. Add a test
      that verifies that we won't read past the provided buffer length.
      Patrick Steinhardt committed
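A bounded whitespace skip of the kind this fix calls for can be sketched in a few lines. The helper name `toy_skip_spaces` is made up for this example; the point is only that the length is checked before each read and decremented with each skipped byte.

```c
#include <ctype.h>
#include <stddef.h>

/*
 * Skip leading whitespace within the first *len bytes of p. Returns the
 * advanced pointer and stores the number of bytes that remain in *len.
 */
const char *toy_skip_spaces(const char *p, size_t *len)
{
	while (*len > 0 && isspace((unsigned char)*p)) {
		p++;
		(*len)--;
	}
	return p;
}
```

With a buffer that consists only of spaces and has no NUL terminator, the loop stops when the length reaches zero instead of running off the end.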
  12. 25 Oct, 2018 1 commit
    • util: provide `git__memmem` function · 83e8a6b3
      Unfortunately, neither the `memmem` nor the `strnstr` functions are part
      of any C standard but are merely extensions of C that are implemented by
      e.g. glibc. Thus, there is no standardized way to search for a string in
      a block of memory with a limited size, and using `strstr` is to be
      considered unsafe in cases where the buffer has not been sanitized. In
      fact, there are some uses of `strstr` in exactly that unsafe way in our
      codebase.
      
      Provide a new function `git__memmem` that implements the `memmem`
      semantics. That is, in a given haystack of `n` bytes, search for the
      occurrence of a byte sequence of `m` bytes and return a pointer to the
      first occurrence. The implementation chosen is the "Not So Naive"
      algorithm from [1]. It was chosen as the implementation is comparably
      simple while still being reasonably efficient in most cases.
      Preprocessing happens in constant time and space, searching has a time
      complexity of O(n*m) with a slightly sub-linear average case.
      
      [1]: http://www-igm.univ-mlv.fr/~lecroq/string/
      Patrick Steinhardt committed
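To make the `memmem` semantics concrete, here is a deliberately simple sketch. Note that it uses a plain brute-force comparison at every offset, not the "Not So Naive" algorithm the commit chose, and `toy_memmem` is a hypothetical name. Treating an empty needle as "not found" is a choice made for this sketch.

```c
#include <stddef.h>
#include <string.h>

/*
 * Search the first n bytes of haystack for the m-byte sequence needle.
 * Returns a pointer to the first occurrence, or NULL if there is none.
 * Neither buffer needs to be NUL terminated.
 */
const void *toy_memmem(const void *haystack, size_t n, const void *needle, size_t m)
{
	const unsigned char *h = haystack;
	size_t i;

	if (m == 0 || n < m)
		return NULL;

	/* i + m <= n guarantees that memcmp stays inside the haystack. */
	for (i = 0; i + m <= n; i++)
		if (memcmp(h + i, needle, m) == 0)
			return h + i;

	return NULL;
}
```

Unlike `strstr`, the search is bounded by the explicit lengths, so it is safe on buffers that have not been sanitized or terminated.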
  13. 19 Oct, 2018 1 commit
    • util: fix out of bounds read in error message · ea19efc1
      When an integer that is parsed with `git__strntol32` is too big to fit
      into an int32, we will generate an error message that includes the
      actual string that failed to parse. This does not acknowledge the fact
      that the string may either not be NUL terminated or alternatively include
      additional characters after the number that is to be parsed. We may thus
      end up printing characters into the buffer that aren't the number or,
      worse, read out of bounds.
      
      Fix the issue by utilizing the `endptr` that was set by
      `git__strntol64`. This pointer is guaranteed to be set to the first
      character following the number, and we can thus use it to compute the
      width of the number that shall be printed. Create a test to verify that
      we correctly truncate the number.
      Patrick Steinhardt committed
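The idea of using `endptr` to bound what gets printed can be sketched with the standard `%.*s` conversion, which writes at most the given number of bytes from the string. The function name and the message text below are assumptions made for illustration and do not reproduce libgit2's actual error string.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Format an error message that quotes only the parsed number, i.e. the
 * bytes in [nptr, endptr). Nothing at or after endptr is read.
 */
int toy_format_overflow_error(char *out, size_t outlen, const char *nptr, const char *endptr)
{
	int width = (int)(endptr - nptr);

	return snprintf(out, outlen, "number too large: '%.*s'", width, nptr);
}
```

Given a buffer holding `9999xy` and an `endptr` pointing at `x`, the message quotes just `9999`, regardless of whether the buffer is NUL terminated.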
  14. 18 Oct, 2018 3 commits
  15. 05 Oct, 2018 2 commits