1. 19 Oct, 2018 2 commits
    • win32: refactor `git_win32_path_remove_namespace` · a34f5b0d
      Update `git_win32_path_remove_namespace` to disambiguate the prefix
      being removed versus the prefix being added.  Now we remove the
      "namespace", and (may) add a "prefix" in its place.  Eg, we remove the
      `\\?\` namespace.  We remove the `\\?\UNC\` namespace, and replace it
      with the `\\` prefix.  This aids readability somewhat.
      
      Additionally, use pointer arithmetic instead of offsets, which seems to
      also help readability.
      Edward Thomson committed
    • win32: rename `git_win32__canonicalize_path` · b2e85f98
      The internal API `git_win32__canonicalize_path` is far, far too easily
      confused with the internal API `git_win32_path_canonicalize`.  The
      former removes the namespace prefix from a path (eg, given
      `\\?\C:\Temp\foo`, it returns `C:\Temp\foo`, and given
      `\\?\UNC\server\share`, it returns `\\server\share`).  As such, rename
      it to `git_win32_path_remove_namespace`.
      
      `git_win32_path_canonicalize` remains unchanged.
      Edward Thomson committed
  2. 30 Sep, 2018 1 commit
  3. 28 Sep, 2018 12 commits
    • config: introduce new read-only in-memory backend · 2be39cef
      Now that we have abstracted away how to store and retrieve config
      entries, it became trivial to implement a new in-memory backend by
      making use of this. And thus we do so.
      
      This commit implements a new read-only in-memory backend that can parse
      a chunk of memory into a `git_config_backend` structure.
      Patrick Steinhardt committed
    • config_entries: refactor entries iterator memory ownership · b78f4ab0
      Right now, the config file code requires us to pass in its backend to
      the config entry iterator. This is required with the current code, as
      the config file backend will first create a read-only snapshot which is
      then passed to the iterator just for that purpose. So after the iterator
      is getting free'd, the code needs to make sure that the snapshot gets
      free'd, as well.
      
      By now, though, we can easily refactor the code to be more efficient and
      remove the reverse dependency from iterator to backend. Instead of
      creating a read-only snapshot (which also requires us to re-parse the
      complete configuration file), we can simply duplicate the config entries
      and pass those to the iterator. Like that, the iterator only needs to
      make sure to free the duplicated config entries, which is trivial to do
      and clears up memory ownership by a lot.
      Patrick Steinhardt committed
    • config_entries: internalize structure declarations · d49b1365
      Access to the config entries is now completely done via the modules
      function interface and no caller messes with the struct's internals. We
      can thus completely move the structure declarations into the
      implementation file so that nobody even has a chance to mess with the
      members.
      Patrick Steinhardt committed
    • config_entries: abstract away reference counting · 123e5963
      Instead of directly calling `git_atomic_inc` in users of the config
      entries store, provide a `git_config_entries_incref` function to further
      decouple the interfaces. Convert the refcount to a `git_refcount`
      structure while at it.
      Patrick Steinhardt committed
    • config_entries: abstract away iteration over entries · 5a7e0b3c
      The nice thing about our `git_config_iterator` interfaces is that nobody
      needs to know anything about the implementation details. All that is
      required is to obtain the iterator via any backend and then use it by
      executing generic functions. We can thus completely internalize all the
      implementation details of how to iterate over entries into the config
      entries store and simply create such an iterator in our config file
      backend when we want to iterate its entries. This further decouples the
      config file backend from the config entries store.
      Patrick Steinhardt committed
    • config_entries: abstract away retrieval of config entries · 60ebc137
      The code accessing config entries in the `git_config_entries` structure
      is still much too intimate with implementation details, directly
      accessing the maps and handling indices. Provide two new functions to
      get config entries from the internal map structure to decouple the
      interfaces and use them in the config file code.
      
      The function `git_config_entries_get` will simply look up the entry by
      name and, in the case of a multi-value, return the last occurrence of
      that entry. The second function, `git_config_entries_get_unique`, will
      only return an entry if it is unique and not included via another
      configuration file. This one is required to properly implement write
      operations for single entries, as we refuse to write to or delete a
      single entry if it is not clear which one was meant.
      Patrick Steinhardt committed
    • config_entries: rename functions and structure · fb8a87da
      The previous commit simply moved all code that is required to handle
      config entries to a new module without yet adjusting any of the function
      and structure names to help readability. We now rename things
      accordingly to have a common "git_config_entries" entries instead of the
      old "diskfile_entries" one.
      Patrick Steinhardt committed
    • config_entries: pull out implementation of entry store · 04f57d51
      The configuration entry store that is used for configuration files needs
      to keep track of all entries in two different structures:
      
      - a singly linked list is being used to be able to iterate through
        configuration files in the order they have been found
      
      - a string map is being used to efficiently look up configuration
        entries by their key
      
      This store is thus something that may be used by other, future backends
      as well to abstract away implementation details and iteration over the
      entries.
      
      Pull out the necessary functions from "config_file.c" and moves them
      into their own "config_entries.c" module. For now, this is simply moving
      over code without any renames and/or refactorings to help reviewing.
      Patrick Steinhardt committed
    • config_file: remove unnecessary snapshot indirection · d75bbea1
      The implementation for config file snapshots has an unnecessary
      redirection from `config_snapshot` to `git_config_file__snapshot`.
      Inline the call to `git_config_file__snapshot` and remove it.
      Patrick Steinhardt committed
    • config: rename "config_file.h" to "config_backend.h" · b944e137
      The header "config_file.h" has a list of inline-functions to access the
      contents of a config backend without directly messing with the struct's
      function pointers. While all these functions are called
      "git_config_file_*", they are in fact completely backend-agnostic and
      don't care whether it is a file or not. Rename all the function to
      instead be backend-agnostic versions called "git_config_backend_*" and
      rename the header to match.
      Patrick Steinhardt committed
    • config: move function normalizing section names into "config.c" · 1aeff5d7
      The function `git_config_file_normalize_section` is never being used in
      any file different than "config.c", but it is implemented in
      "config_file.c". Move it over and make the symbol static.
      Patrick Steinhardt committed
    • config: make names backend-agnostic · a5562692
      As a last step to make variables and structures more backend agnostic
      for our `git_config` structure, rename local variables to not be called
      `file` anymore.
      Patrick Steinhardt committed
  4. 26 Sep, 2018 2 commits
  5. 25 Sep, 2018 2 commits
  6. 21 Sep, 2018 4 commits
    • config: rename `file_internal` and its `file` member · 83733aeb
      Same as with the previous commit, the `file_internal` struct is used to
      keep track of all the backends that are added to a `git_config` struct.
      Rename it to `backend_internal` and rename its `file` member to
      `backend` to make the implementation more backend-agnostic.
      Patrick Steinhardt committed
    • config: rename `files` vector to `backends` · 633cf40c
      Originally, the `git_config` struct is a collection of all the parsed
      configuration files from different scopes (system-wide config,
      user-specific config as well as the repo-specific config files).
      Historically, we didn't and don't yet have any other configuration
      backends than the one for files, which is why the field holding the
      config backends is called `files`. But in fact, nothing dictates that
      the vector of backends actually holds file backends only, as they are
      generic and custom backends can be implemented by users.
      
      Rename the member to be called `backends` to clarify that there is
      nothing specific to files here.
      Patrick Steinhardt committed
    • config_parse: avoid unused static declared values · b9affa32
      The variables `git_config_escaped` and `git_config_escapes` are both
      defined as static const character pointers in "config_parse.h". In case
      where "config_parse.h" is included but those two variables are not being
      used, the compiler will thus complain about defined but unused
      variables. Fix this by declaring them as external and moving the actual
      initialization to the C file.
      
      Note that it is not possible to simply make this a #define, as we are
      indexing into those arrays.
      Patrick Steinhardt committed
    • submodule: fix submodule names depending on config-owned memory · 0b9c68b1
      When populating the list of submodule names, we use the submodule
      configuration entry's name as the key in the map of submodule names.
      This creates a hidden dependency on the liveliness of the configuration
      that was used to parse the submodule, which is fragile and unexpected.
      
      Fix the issue by duplicating the string before writing it into the
      submodule name map.
      Patrick Steinhardt committed
  7. 17 Sep, 2018 2 commits
    • revwalk: only check the first commit in the list for an earlier timestamp · 12a1790d
      This is not a big deal, but it does make us match git more closely by checking
      only the first. The lists are sorted already, so there should be no functional
      difference other than removing a possible check from every iteration in the
      loop.
      Carlos Martín Nieto committed
    • revwalk: use the max value for a signed integer · 46f35127
      When porting, we overlooked that the difference between git's and our's time
      representation and copied their way of getting the max value.
      
      Unfortunately git was using unsigned integers, so `~0ll` does correspond to
      their max value, whereas for us it corresponds to `-1`. This means that we
      always consider the last date to be smaller than the current commit's and always
      think commits are interesting.
      
      Change the initial value to the macro that gives us the maximum value on each
      platform so we can accurately consider commits interesting or not.
      Carlos Martín Nieto committed
  8. 12 Sep, 2018 1 commit
    • path validation: `char` is not signed by default. · 44291868
      ARM treats its `char` type as `unsigned type` by default; as a result,
      testing a `char` value as being `< 0` is always false.  This is a
      warning on ARM, which is promoted to an error given our use of
      `-Werror`.
      
      Per ISO 9899:199, section "6.2.5 Types":
      
      > The three types char, signed char, and unsigned char are collectively
      > called the character types. The implementation shall define char to
      > have the same range, representation, and behavior as either signed
      > char or unsigned char.
      >
      ...
      
      > Irrespective of the choice made, char is a separate type from the other
      > two and is not compatible with either.
      Edward Thomson committed
  9. 11 Sep, 2018 1 commit
  10. 07 Sep, 2018 1 commit
  11. 06 Sep, 2018 1 commit
    • config_file: fix quadratic behaviour when adding config multivars · f2694635
      In case where we add multiple configuration entries with the same key to
      a diskfile backend, we always need to iterate the list of this key to
      find the last entry due to the list being a singly-linked list. This
      is obviously quadratic behaviour, and this has sure enough been found by
      oss-fuzz by generating a configuration file with 50k lines, where most
      of them have the same key. While the issue will not arise with "sane"
      configuration files, an adversary may trigger it by providing a crafted
      ".gitmodules" file, which is delivered as part of the repo and also
      parsed by the configuration parser.
      
      The fix is trivial: store a pointer to the last entry of the list in its
      head. As there are only two locations now where we append to this data
      structure, mainting this pointer is trivial, too. We can also optimize
      retrieval of a single value via `config_get`, where we previously had to
      chase the `next` pointer to find the last entry that was added.
      
      Using our configuration file fozzur with a corpus that has a single file
      with 50000 "-=" lines previously took around 21s. With this optimization
      the same file scans in about 0.053s, which is a nearly 400-fold
      improvement. But in most cases with a "normal" amount of same-named keys
      it's not going to matter anyway.
      Patrick Steinhardt committed
  12. 05 Sep, 2018 1 commit
    • Prevent heap-buffer-overflow · d22cd1f4
      When running repack while doing repo writes, `packfile_load__cb()` can see some temporary files in the directory that are bigger than the usual, and makes `memcmp` overflow on the `p->pack_name` string. ASAN detected this. This just uses `strncmp`, that should not have any performance impact and is safe for comparing strings of different sizes.
      
      ```
      ERROR: AddressSanitizer: heap-buffer-overflow on address 0x61200001a3f3 at pc 0x7f4a9e1976ec bp 0x7ffc1f80e100 sp 0x7ffc1f80d8b0
      READ of size 89 at 0x61200001a3f3 thread T0
      SCARINESS: 26 (multi-byte-read-heap-buffer-overflow)
          #0 0x7f4a9e1976eb in __interceptor_memcmp.part.78 (/build/cfgr-admin#link-tree/libtools_build_sanitizers_asan-ubsan-py.so+0xcf6eb)
          #1 0x7f4a518c5431 in packfile_load__cb /build/libgit2/0.27.0/src/libgit2-0.27.0/src/odb_pack.c:213
          #2 0x7f4a518d9582 in git_path_direach /build/libgit2/0.27.0/src/libgit2-0.27.0/src/path.c:1134
          #3 0x7f4a518c58ad in pack_backend__refresh /build/libgit2/0.27.0/src/libgit2-0.27.0/src/odb_pack.c:347
          #4 0x7f4a518c1b12 in git_odb_refresh /build/libgit2/0.27.0/src/libgit2-0.27.0/src/odb.c:1511
          #5 0x7f4a518bff5f in git_odb__freshen /build/libgit2/0.27.0/src/libgit2-0.27.0/src/odb.c:752
          #6 0x7f4a518c17d4 in git_odb_stream_finalize_write /build/libgit2/0.27.0/src/libgit2-0.27.0/src/odb.c:1415
          #7 0x7f4a51b9d015 in Repository_write /build/pygit2/0.27.0/src/pygit2-0.27.0/src/repository.c:509
      ```
      bisho committed
  13. 03 Sep, 2018 1 commit
  14. 02 Sep, 2018 1 commit
  15. 01 Sep, 2018 1 commit
  16. 25 Aug, 2018 1 commit
  17. 24 Aug, 2018 1 commit
  18. 21 Aug, 2018 1 commit
  19. 20 Aug, 2018 3 commits
  20. 17 Aug, 2018 1 commit