1. 03 Oct, 2018 7 commits
    • smart_pkt: fix buffer overflow when parsing "ok" packets · a9f1ca09
      There are two different buffer overflows present when parsing "ok"
      packets. First, we never verify whether the line already ends after
      "ok", but directly go ahead and also try to skip the expected space
      after "ok". Second, we then go ahead and use `strchr` to scan for the
      terminating newline character. But if the line isn't terminated
      correctly, this can overflow the line buffer.
      
      Fix the issues by using `git__prefixncmp` to check for the "ok " prefix
      and only checking for a trailing '\n' instead of using `strchr`. This
      also fixes the issue of us always requiring a trailing '\n'.
      
      Reported by oss-fuzz, issue 9749:
      
      Crash Type: Heap-buffer-overflow READ {*}
      Crash Address: 0x6310000389c0
      Crash State:
        ok_pkt
        git_pkt_parse_line
        git_smart__store_refs
      
      Sanitizer: address (ASAN)
      Patrick Steinhardt committed
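The two checks described in this commit can be sketched roughly as follows. This is a minimal illustration, not the library's code: `prefix_ncmp` is a hypothetical stand-in for a length-aware prefix comparison, and `parse_ok_line` is an assumed name for a parser that never reads past `len` bytes and treats the trailing newline as optional.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical length-aware prefix check: returns 0 only if `str`, which
 * holds `len` bytes and need not be NUL-terminated, starts with `prefix`. */
static int prefix_ncmp(const char *str, size_t len, const char *prefix)
{
	size_t plen = strlen(prefix);

	if (len < plen)
		return -1;
	return memcmp(str, prefix, plen) != 0;
}

/* Parses an "ok <ref>" line of exactly `len` bytes. On success, stores the
 * start and length of the ref name and returns 0; returns -1 otherwise. */
static int parse_ok_line(const char **ref, size_t *ref_len,
	const char *line, size_t len)
{
	assert(line && ref && ref_len);

	/* Checks "ok" and the following space in one bounded comparison. */
	if (prefix_ncmp(line, len, "ok ") != 0)
		return -1;
	line += 3;
	len -= 3;

	/* Only look at the last byte; no unbounded scan for a newline. */
	if (len > 0 && line[len - 1] == '\n')
		len--;

	*ref = line;
	*ref_len = len;
	return 0;
}
```

Because the function only ever indexes below `len`, a line that ends right after "ok" or lacks a terminating newline is rejected or accepted without touching memory outside the buffer.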
    • smart_pkt: fix buffer overflow when parsing "ACK" packets · bc349045
      We are being quite lenient when parsing "ACK" packets. First, we didn't
      correctly verify that we're not overrunning the provided buffer length,
      which we fix here by using `git__prefixncmp` instead of
      `git__prefixcmp`. Second, we do not verify that the actual contents make
      any sense at all, as we simply ignore errors when parsing the ACK's OID
      and any unknown status strings. This may result in a parsed packet
      structure with invalid contents, which is being silently passed to the
      caller. This is being fixed by performing proper input validation and
      checking of return codes.
      Patrick Steinhardt committed
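A rough sketch of what strict validation of an "ACK" line could look like is given below. All names are hypothetical, the OID is assumed to be a 40-character hex SHA-1, and the set of accepted status words ("continue", "common", "ready") is an assumption taken from the Git pack protocol rather than from this commit.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define OID_HEXSZ 40 /* assumed: hex length of a SHA-1 object ID */

enum ack_status { ACK_NONE, ACK_CONTINUE, ACK_COMMON, ACK_READY };

static int is_hex(char c)
{
	return (c >= '0' && c <= '9') || (c >= 'a' && c <= 'f') ||
	       (c >= 'A' && c <= 'F');
}

/* Returns 1 if the `len` bytes at `s` equal the NUL-terminated `word`. */
static int span_equals(const char *s, size_t len, const char *word)
{
	return len == strlen(word) && memcmp(s, word, len) == 0;
}

/* Parses "ACK <oid>[ <status>]" from exactly `len` bytes; returns 0 on
 * success and -1 on any malformed input instead of ignoring the error. */
static int parse_ack_line(char oid_hex[OID_HEXSZ + 1],
	enum ack_status *status, const char *line, size_t len)
{
	size_t i;

	assert(line && oid_hex && status);

	if (len < 4 || memcmp(line, "ACK ", 4) != 0)
		return -1;
	line += 4;
	len -= 4;

	/* The OID must be complete and consist of hex digits only. */
	if (len < OID_HEXSZ)
		return -1;
	for (i = 0; i < OID_HEXSZ; i++) {
		if (!is_hex(line[i]))
			return -1;
		oid_hex[i] = line[i];
	}
	oid_hex[OID_HEXSZ] = '\0';
	line += OID_HEXSZ;
	len -= OID_HEXSZ;

	if (len > 0 && line[len - 1] == '\n')
		len--;
	if (len == 0) {
		*status = ACK_NONE;
		return 0;
	}

	/* Anything after the OID must be a space and a known status. */
	if (line[0] != ' ')
		return -1;
	line++;
	len--;

	if (span_equals(line, len, "continue"))
		*status = ACK_CONTINUE;
	else if (span_equals(line, len, "common"))
		*status = ACK_COMMON;
	else if (span_equals(line, len, "ready"))
		*status = ACK_READY;
	else
		return -1;
	return 0;
}
```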
    • smart_pkt: adjust style of "ref" packet parsing function · 5edcf5d1
      While the function parsing ref packets doesn't have any immediately
      obvious buffer overflows, its style is different to all the other
      parsing functions. Instead of checking buffer length while we go, it
      does a check up-front. This causes the code to seem a lot more magical
      than it really is due to some magic constants. Refactor the function to
      instead make use of the style of the other packet parsers and verify
      buffer lengths as we go.
      Patrick Steinhardt committed
    • smart_pkt: check whether error packets are prefixed with "ERR " · 786426ea
      In the `git_pkt_parse_line` function, we determine what kind of packet
      a given packet line contains by simply checking for the prefix of that
      line. Except for "ERR" packets, we always only check for the immediate
      identifier without the trailing space (e.g. we check for an "ACK"
      prefix, not for "ACK "). But for "ERR" packets, we do in fact include
      the trailing space in our check. This is not really much of a problem at
      all, but it is inconsistent with all the other packet types and thus
      causes confusion when the `err_pkt` function just immediately skips the
      space without checking whether it overflows the line buffer.
      
      Adjust the check in `git_pkt_parse_line` to not include the trailing
      space and instead move it into `err_pkt` for consistency.
      Patrick Steinhardt committed
    • smart_pkt: explicitly avoid integer overflows when parsing packets · 40fd84cc
      When parsing data, progress or error packets, we need to copy the
      contents of the rest of the current packet line into the flex-array of
      the parsed packet. To keep track of this array's length, we then assign
      the remaining length of the packet line to the structure. We do have a
      mismatch of types here, as the structure's `len` field is a signed
      integer, while the length that we are assigning has type `size_t`.
      
      On nearly all platforms, this shouldn't pose any problems at all. The
      line length can at most be 16^4 - 1, as the line's length is being
      encoded by exactly four hex digits. But on platforms with 16-bit integers,
      this assignment could cause an overflow. While such platforms will
      probably only exist in the embedded ecosystem, we still want to avoid
      this potential overflow. Thus, we now simply change the structure's
      `len` member to be of type `size_t` to avoid any integer promotion.
      Patrick Steinhardt committed
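The type mismatch discussed in this commit can be illustrated with a small sketch. The structure and function names below are hypothetical; the point is only that a `size_t` length field takes a `size_t` value without narrowing, and that the allocation size for the flexible array member is checked before use.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical packet with a flexible array member for the payload. */
struct data_pkt {
	size_t len;  /* size_t, so assigning a size_t length cannot narrow */
	char data[]; /* payload bytes followed by a NUL terminator */
};

/* Copies `len` payload bytes into a newly allocated packet. Returns NULL
 * if the allocation size would overflow or if allocation fails. */
static struct data_pkt *data_pkt_new(const char *payload, size_t len)
{
	struct data_pkt *pkt;

	assert(payload || len == 0);

	/* Header + payload + NUL must be representable in a size_t. */
	if (len > SIZE_MAX - sizeof(*pkt) - 1)
		return NULL;

	pkt = malloc(sizeof(*pkt) + len + 1);
	if (!pkt)
		return NULL;

	pkt->len = len;
	if (len)
		memcpy(pkt->data, payload, len);
	pkt->data[len] = '\0';
	return pkt;
}
```

Had `len` been declared as `int` on a platform with 16-bit integers, a payload length above 32767 would not fit into the field, which is the situation the commit avoids.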
    • smart_pkt: honor line length when determining packet type · 4a5804c9
      When we parse the packet type of an incoming packet line, we do not
      verify that we don't overflow the provided line buffer. Fix this by
      using `git__prefixncmp` instead and passing in `len`. As we have
      previously already verified that `len <= linelen`, we thus won't ever
      overflow the provided buffer length.
      Patrick Steinhardt committed
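As an illustration of honoring the declared length, the sketch below first decodes the four-hex-digit length prefix, verifies that it fits into the buffer, and only then inspects the payload, never looking beyond the declared length. The names and the reduced set of packet types are assumptions made for the example; the real parser distinguishes more types and error codes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum pkt_type { PKT_FLUSH, PKT_ACK, PKT_NAK, PKT_OTHER };

static int hex_value(char c)
{
	if (c >= '0' && c <= '9') return c - '0';
	if (c >= 'a' && c <= 'f') return c - 'a' + 10;
	if (c >= 'A' && c <= 'F') return c - 'A' + 10;
	return -1;
}

/* Classifies the packet at `line`, a buffer holding `linelen` bytes.
 * Returns 0 on success and -1 if the line is malformed or incomplete. */
static int pkt_classify(enum pkt_type *type, const char *line, size_t linelen)
{
	size_t i, len = 0;

	assert(type && line);

	/* The length prefix is exactly four hex digits (at most 0xffff). */
	if (linelen < 4)
		return -1;
	for (i = 0; i < 4; i++) {
		int d = hex_value(line[i]);
		if (d < 0)
			return -1;
		len = (len << 4) | (size_t)d;
	}

	if (len == 0) {
		*type = PKT_FLUSH;
		return 0;
	}

	/* The length counts the prefix itself and must fit into the buffer. */
	if (len < 4 || len > linelen)
		return -1;
	line += 4;
	len -= 4;

	/* Compare against the declared payload length, not the buffer size. */
	if (len >= 3 && memcmp(line, "ACK", 3) == 0)
		*type = PKT_ACK;
	else if (len >= 3 && memcmp(line, "NAK", 3) == 0)
		*type = PKT_NAK;
	else
		*type = PKT_OTHER;
	return 0;
}
```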
    • tests: verify parsing logic for smart packets · 365d2720
      The commits following this commit are about to introduce quite a lot of
      refactoring and tightening of the smart packet parser. Unfortunately, we
      do not yet have any tests apart from our online tests that verify that our
      parser does not regress upon changes. This is doubly unfortunate as our
      online tests aren't executed by default.
      
      Add new tests that exercise the smart parsing logic directly by
      executing `git_pkt_parse_line`.
      Patrick Steinhardt committed
  2. 29 Sep, 2018 3 commits
  3. 28 Sep, 2018 15 commits
    • Merge pull request #4767 from pks-t/pks/config-mem · 0530d7d9
      In-memory configuration
      Carlos Martín Nieto committed
    • config: introduce new read-only in-memory backend · 2be39cef
      Now that we have abstracted away how to store and retrieve config
      entries, it has become trivial to implement a new in-memory backend by
      making use of this. And thus we do so.
      
      This commit implements a new read-only in-memory backend that can parse
      a chunk of memory into a `git_config_backend` structure.
      Patrick Steinhardt committed
    • config_entries: refactor entries iterator memory ownership · b78f4ab0
      Right now, the config file code requires us to pass in its backend to
      the config entry iterator. This is required with the current code, as
      the config file backend will first create a read-only snapshot which is
      then passed to the iterator just for that purpose. So after the iterator
      is freed, the code needs to make sure that the snapshot is freed as
      well.
      
      By now, though, we can easily refactor the code to be more efficient and
      remove the reverse dependency from iterator to backend. Instead of
      creating a read-only snapshot (which also requires us to re-parse the
      complete configuration file), we can simply duplicate the config entries
      and pass those to the iterator. That way, the iterator only needs to
      make sure to free the duplicated config entries, which is trivial to do
      and makes memory ownership much clearer.
      Patrick Steinhardt committed
    • config_entries: internalize structure declarations · d49b1365
      Access to the config entries is now completely done via the module's
      function interface and no caller messes with the struct's internals. We
      can thus completely move the structure declarations into the
      implementation file so that nobody even has a chance to mess with the
      members.
      Patrick Steinhardt committed
    • config_entries: abstract away reference counting · 123e5963
      Instead of directly calling `git_atomic_inc` in users of the config
      entries store, provide a `git_config_entries_incref` function to further
      decouple the interfaces. Convert the refcount to a `git_refcount`
      structure while at it.
      Patrick Steinhardt committed
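The idea of hiding the reference count behind functions might look like the following sketch. It uses C11 atomics and hypothetical names (`entries_new`, `entries_incref`, `entries_free`); the library's own refcount type and function signatures are not reproduced here.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical entry store; only the reference count is shown. */
struct entries {
	atomic_int refcount;
	/* ... entry storage omitted ... */
};

/* Creates a store that is owned by exactly one reference. */
static struct entries *entries_new(void)
{
	struct entries *e = calloc(1, sizeof(*e));

	if (e)
		atomic_init(&e->refcount, 1);
	return e;
}

/* Callers take a reference through this function instead of touching
 * the counter directly. */
static void entries_incref(struct entries *e)
{
	assert(e);
	atomic_fetch_add(&e->refcount, 1);
}

/* Drops one reference; frees the store when the last one is gone.
 * Returns 1 if the store was actually freed, 0 otherwise. */
static int entries_free(struct entries *e)
{
	if (!e)
		return 0;
	/* atomic_fetch_sub returns the value before the decrement. */
	if (atomic_fetch_sub(&e->refcount, 1) != 1)
		return 0;
	free(e);
	return 1;
}
```

With the counter private to these functions, its representation can change later without touching any caller.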
    • config_entries: abstract away iteration over entries · 5a7e0b3c
      The nice thing about our `git_config_iterator` interfaces is that nobody
      needs to know anything about the implementation details. All that is
      required is to obtain the iterator via any backend and then use it by
      executing generic functions. We can thus completely internalize all the
      implementation details of how to iterate over entries into the config
      entries store and simply create such an iterator in our config file
      backend when we want to iterate its entries. This further decouples the
      config file backend from the config entries store.
      Patrick Steinhardt committed
    • config_entries: abstract away retrieval of config entries · 60ebc137
      The code accessing config entries in the `git_config_entries` structure
      is still much too intimate with implementation details, directly
      accessing the maps and handling indices. Provide two new functions to
      get config entries from the internal map structure to decouple the
      interfaces and use them in the config file code.
      
      The function `git_config_entries_get` will simply look up the entry by
      name and, in the case of a multi-value, return the last occurrence of
      that entry. The second function, `git_config_entries_get_unique`, will
      only return an entry if it is unique and not included via another
      configuration file. This one is required to properly implement write
      operations for single entries, as we refuse to write to or delete a
      single entry if it is not clear which one was meant.
      Patrick Steinhardt committed
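The difference between the two lookups can be sketched as below. This is a simplified model under stated assumptions: entries live in a plain array rather than a map, and the `include_depth` field is an assumed way of marking entries that came from an included file. The function names are shortened stand-ins, not the library's signatures.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct entry {
	const char *name;
	const char *value;
	int include_depth; /* assumed: > 0 if read from an included file */
};

/* Returns the last entry named `name`, so for a multi-value the most
 * recently defined value wins. Returns NULL if there is no such entry. */
static const struct entry *entries_get(const struct entry *list, size_t n,
	const char *name)
{
	const struct entry *found = NULL;
	size_t i;

	assert(name && (list || n == 0));

	for (i = 0; i < n; i++)
		if (strcmp(list[i].name, name) == 0)
			found = &list[i];
	return found;
}

/* Returns the entry only if it occurs exactly once and was not included
 * from another file; otherwise it is unclear which one is meant. */
static const struct entry *entries_get_unique(const struct entry *list,
	size_t n, const char *name)
{
	const struct entry *found = NULL;
	size_t i, count = 0;

	assert(name && (list || n == 0));

	for (i = 0; i < n; i++) {
		if (strcmp(list[i].name, name) == 0) {
			found = &list[i];
			count++;
		}
	}
	if (count != 1 || found->include_depth > 0)
		return NULL;
	return found;
}
```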
    • config_entries: rename functions and structure · fb8a87da
      The previous commit simply moved all code that is required to handle
      config entries to a new module without yet adjusting any of the function
      and structure names to help readability. We now rename things
      accordingly to have a common "git_config_entries" prefix instead of the
      old "diskfile_entries" one.
      Patrick Steinhardt committed
    • config_entries: pull out implementation of entry store · 04f57d51
      The configuration entry store that is used for configuration files needs
      to keep track of all entries in two different structures:
      
      - a singly linked list is being used to be able to iterate through
        configuration files in the order they have been found
      
      - a string map is being used to efficiently look up configuration
        entries by their key
      
      This store is thus something that may be used by other, future backends
      as well to abstract away implementation details and iteration over the
      entries.
      
      Pull out the necessary functions from "config_file.c" and move them
      into their own "config_entries.c" module. For now, this is simply moving
      over code without any renames and/or refactorings to help reviewing.
      Patrick Steinhardt committed
    • config_file: remove unnecessary snapshot indirection · d75bbea1
      The implementation for config file snapshots has an unnecessary
      redirection from `config_snapshot` to `git_config_file__snapshot`.
      Inline the call to `git_config_file__snapshot` and remove it.
      Patrick Steinhardt committed
    • config: rename "config_file.h" to "config_backend.h" · b944e137
      The header "config_file.h" has a list of inline-functions to access the
      contents of a config backend without directly messing with the struct's
      function pointers. While all these functions are called
      "git_config_file_*", they are in fact completely backend-agnostic and
      don't care whether it is a file or not. Rename all the functions to
      instead be backend-agnostic versions called "git_config_backend_*" and
      rename the header to match.
      Patrick Steinhardt committed
    • config: move function normalizing section names into "config.c" · 1aeff5d7
      The function `git_config_file_normalize_section` is never being used in
      any file other than "config.c", but it is implemented in
      "config_file.c". Move it over and make the symbol static.
      Patrick Steinhardt committed
    • config: make names backend-agnostic · a5562692
      As a last step to make variables and structures more backend-agnostic
      for our `git_config` structure, rename local variables to not be called
      `file` anymore.
      Patrick Steinhardt committed
    • Merge pull request #4803 from tiennou/fix/4802 · 367f6243
      index: release the snapshot instead of freeing the index
      Patrick Steinhardt committed
  4. 26 Sep, 2018 2 commits
  5. 25 Sep, 2018 2 commits
  6. 24 Sep, 2018 1 commit
  7. 22 Sep, 2018 1 commit
  8. 21 Sep, 2018 5 commits
    • Merge pull request #4794 from marcin-krystianc/mkrystianc/prune_perf · a54043b7
      git_remote_prune to be O(n * log n)
      Patrick Steinhardt committed
    • config: rename `file_internal` and its `file` member · 83733aeb
      Same as with the previous commit, the `file_internal` struct is used to
      keep track of all the backends that are added to a `git_config` struct.
      Rename it to `backend_internal` and rename its `file` member to
      `backend` to make the implementation more backend-agnostic.
      Patrick Steinhardt committed
    • config: rename `files` vector to `backends` · 633cf40c
      Originally, the `git_config` struct was a collection of all the parsed
      configuration files from different scopes (system-wide config,
      user-specific config as well as the repo-specific config files).
      Historically, we didn't and don't yet have any other configuration
      backends than the one for files, which is why the field holding the
      config backends is called `files`. But in fact, nothing dictates that
      the vector of backends actually holds file backends only, as they are
      generic and custom backends can be implemented by users.
      
      Rename the member to be called `backends` to clarify that there is
      nothing specific to files here.
      Patrick Steinhardt committed
    • config_parse: avoid unused static declared values · b9affa32
      The variables `git_config_escaped` and `git_config_escapes` are both
      defined as static const character pointers in "config_parse.h". If
      "config_parse.h" is included but those two variables are not being
      used, the compiler will thus complain about defined but unused
      variables. Fix this by declaring them as external and moving the actual
      initialization to the C file.
      
      Note that it is not possible to simply make this a #define, as we are
      indexing into those arrays.
      Patrick Steinhardt committed
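The header/source split described here can be sketched in a single listing. The file names in the comments, the variable names, and the table contents are illustrative assumptions; what matters is that the header only declares the pointers, one C file defines them, and indexing still works, which a `#define` of a string literal would make awkward.

```c
#include <assert.h>
#include <string.h>

/* parse.h (hypothetical): declarations only, so including the header
 * without using the variables no longer triggers "unused" warnings. */
extern const char *cfg_escapes; /* characters that may follow a backslash */
extern const char *cfg_escaped; /* the characters they stand for */

/* parse.c (hypothetical): the single definition of both tables. */
const char *cfg_escapes = "ntb\"\\";
const char *cfg_escaped = "\n\t\b\"\\";

/* Maps the character after a backslash to its unescaped form by looking
 * up its position in one table and indexing into the other. Returns 0
 * for unknown escapes. */
static char unescape(char c)
{
	const char *p;

	/* Both tables must stay the same length for the indexing to work. */
	assert(strlen(cfg_escapes) == strlen(cfg_escaped));

	/* strchr would match the terminator for c == '\0', so guard it. */
	p = c ? strchr(cfg_escapes, c) : NULL;
	return p ? cfg_escaped[p - cfg_escapes] : 0;
}
```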
    • submodule: fix submodule names depending on config-owned memory · 0b9c68b1
      When populating the list of submodule names, we use the submodule
      configuration entry's name as the key in the map of submodule names.
      This creates a hidden dependency on the lifetime of the configuration
      that was used to parse the submodule, which is fragile and unexpected.
      
      Fix the issue by duplicating the string before writing it into the
      submodule name map.
      Patrick Steinhardt committed
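The fix amounts to owning a copy of the key instead of borrowing a pointer. A minimal sketch follows, using a hypothetical fixed-size `name_set` in place of the actual submodule name map:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MAX_NAMES 16

/* Hypothetical container that owns every string it stores. */
struct name_set {
	char *names[MAX_NAMES];
	size_t count;
};

/* Stores a private copy of `name`, so the caller's buffer (for example
 * memory owned by a configuration object) may be freed afterwards. */
static int name_set_add(struct name_set *set, const char *name)
{
	size_t size;
	char *copy;

	assert(set && name);

	if (set->count == MAX_NAMES)
		return -1;

	size = strlen(name) + 1;
	copy = malloc(size);
	if (!copy)
		return -1;
	memcpy(copy, name, size);

	set->names[set->count++] = copy;
	return 0;
}

/* Releases all copies held by the set. */
static void name_set_clear(struct name_set *set)
{
	size_t i;

	assert(set);

	for (i = 0; i < set->count; i++)
		free(set->names[i]);
	set->count = 0;
}
```

The cost is one extra allocation per name, in exchange for the set no longer depending on how long the source of the string stays alive.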
  9. 19 Sep, 2018 2 commits
  10. 18 Sep, 2018 2 commits