1. 20 Jul, 2019 1 commit
  2. 19 Jul, 2019 7 commits
    • repository: do not initialize HEAD if it's provided by templates · 9d46f167
      When using templates to initialize a git repository, then git-init(1)
      will copy over all contents of the template directory. These will be
      preferred over the default ones created by git-init(1). While we mostly
      do the same, there is the exception of "HEAD". While we do copy over the
      template's HEAD file, afterwards we'll immediately re-initialize its
      contents with either the default "ref: refs/origin/master" or the init
      option's `initial_head` field.
      
      Let's fix the inconsistency with upstream git-init(1) by not overwriting
      the template HEAD, but only if the user hasn't set `opts.initial_head`.
      If the `initial_head` field has been supplied, we should use that
      indifferent from whether the template contained a HEAD file or not. Add
      tests to verify we correctly use the template directory's HEAD file and
      that `initial_head` overrides the template.
      Patrick Steinhardt committed
    • repository: update error handling in `init_ext` · f3134a84
      Update `git_repository_init_ext` to use our typical style of error
      handling. The function had multiple statements which didn't `goto out`
      immediately but instead deferred it to later calls combined with `if`
      statements.
      Patrick Steinhardt committed
    • repository: avoid swallowing error codes in `create_head` · 869ae5a3
      The error handling in `git_repository_create_head` completely swallows
      all error codes. While probably not too much of a problem, this also
      violates our usual coding style.
      
      Refactor the code to use a local `error` variable with the typical `goto
      out` statements.
      Patrick Steinhardt committed
    • tests: repo: refactor setup of templates and repos · 0d12b8dd
      All tests in repo::template have a common pattern of first setting up
      templates, then settung up the repository that makes use of those
      templates via several init options. Refactor this pattern into two
      functions `setup_templates` and `setup_repo` that handle most of that
      logic to make it easier to spot what a test actually wants to check.
      
      Furthermore, this also refactors how we clean up after the tests.
      Previously, it was a combination of manually calling
      `cl_fixture_cleanup` and `cl_set_cleanup`, which really is kind of hard
      to read. This commit refactors this to instead provide the cleaning
      parameters in the setup functions. All cleanups are then performed in
      the suite's cleanup function.
      Patrick Steinhardt committed
    • tests: repo: refactor template path handling · 3b79ceaf
      The repo::template test suite makes use of quite a few local variables
      that could be consolidated. Do so to make the code easier to read.
      Patrick Steinhardt committed
    • tests: repo: move template tests into their own suite · ee193480
      There's quite a lot of supporting code for our templates and they are an
      obvious standalone feature. Thus, let's extract those tests into their
      own suite to also make refactoring of them easier.
      Patrick Steinhardt committed
    • Merge pull request #5138 from libgit2/ethomson/cvar · 3424c210
      configuration: cvar -> configmap
      Patrick Steinhardt committed
  3. 18 Jul, 2019 8 commits
  4. 17 Jul, 2019 3 commits
    • Merge pull request #5170 from bk2204/packbuilder-efficient-realloc · 51124a5b
      Allocate memory more efficiently when packing objects
      Edward Thomson committed
    • cache: evict items more efficiently · 770b91b1
      When our object cache is full, we pick eight items (or the whole cache,
      if there are fewer) and evict them. For small cache sizes, this is fine,
      but when we're dealing with a large number of objects, we can repeatedly
      exhaust the cache and spend a large amount of time in git_oidmap_iterate
      trying to find items to evict.
      
      Instead, let's assume that if the cache gets full, we have a large
      number of objects that we're handling, and be more aggressive about
      evicting items. Let's remove one item for every 2048 items, but not less
      than 8. This causes us to scale our evictions in proportion to the size
      of the cache and significantly reduces the time we spend in
      git_oidmap_iterate.
      
      Before this change, a full pack of all the non-blob objects in the Linux
      repository took in excess of 30 minutes and spent 62.3% of total runtime
      in odb_read_1 and its children, and 44.3% of the time in
      git_oidmap_iterate. With this change, the same operation now takes 14
      minutes and 44 seconds, and odb_read_1 accounts for only 35.9% of total
      time, whereas git_oidmap_iterate consists of 6.2%.
      
      Note that we do spend a little more time inflating objects and a decent
      amount more time in memcmp. However, overall, the time taken is
      significantly improved, and time in pack building is now dominated by
      git_delta_create_from_index (33.7%), which is what we would expect.
      brian m. carlson committed
    • pack-objects: allocate memory more efficiently · c4df926b
      The packbuilder code allocates memory in chunks. When it needs to
      allocate, it tries to add 1024 to the number of objects and multiply by
      3/2. However, it actually multiplies by 1 instead, since it performs an
      integral division in the expression "3 / 2" and only then multiplies by
      the increased number of objects.
      
      The current behavior causes the code to waste massive amounts of time
      copying memory when it reallocates, causing inserting all non-blob
      objects in the Linux repository into a new pack to take some
      indeterminate time greater than 5 minutes instead of 52 seconds.
      
      Correct this error by first dividing by two, and only then multiplying
      by 3. We still check for overflow for the multiplication, which is the
      only part that can overflow. This appears to be the only place in the
      code base which has this problem.
      brian m. carlson committed
  5. 16 Jul, 2019 1 commit
  6. 12 Jul, 2019 11 commits
  7. 11 Jul, 2019 9 commits
    • fileops: fix creation of directory in filesystem root · 5ae22a63
      In commit 45f24e78 (git_repository_init: stop traversing at
      windows root, 2019-04-12), we have fixed `git_futils_mkdir` to
      correctly handle the case where we create a directory in
      Windows-style filesystem roots like "C:\repo".
      
      The problem here is an off-by-one: previously, to that commit,
      we've been checking wether the parent directory's length is equal
      to the root directory's length incremented by one. When we call
      the function with "/example", then the parent directory's length
      ("/") is 1, but the root directory offset is 0 as the path is
      directly rooted without a drive prefix. This resulted in `1 == 0 +
      1`, which was true. With the change, we've stopped incrementing
      the root directory length, and thus now compare `1 <= 0`, which
      is false.
      
      The previous way of doing it was kind of finicky any non-obvious,
      which is also why the error was introduced. So instead of just
      re-adding the increment, let's explicitly add a condition that
      aborts finding the parent if the current parent path is "/".
      
      Making this change causes Azure Pipelines to fail the testcase
      repo::init::nonexistent_paths on Unix-based systems. This is because we
      have just fixed creating directories in the filesystem root, which
      previously didn't work. As Docker-based tests are running as root user,
      we are thus able to create the non-existing path and will now succeed to
      create the repository that was expected to actually fail.
      
      Let's split this up into three different tests:
      
      - A test to verify that we do not create repos in a non-existing parent
        directoy if the flag `GIT_REPOSITORY_INIT_MKPATH` is not set.
      
      - A test to verify that we fail if the root directory does not exist. As
        there is a common root directory on Unix-based systems that always
        exist, we can only test for this on Windows-based systems.
      
      - A test to verify that we fail if trying to create a repository in an
        unwriteable parent directory. We can only test this if not running
        tests as root user, as CAP_DAC_OVERRIDE will cause us to ignore
        permissions when creating files.
      Patrick Steinhardt committed
    • Merge pull request #5134 from pks-t/pks/config-parser-separation · a6ad9e8a
      Config parser separation
      Patrick Steinhardt committed
    • config_parse: provide parser init and dispose functions · dbeadf8a
      Right now, all configuration file backends are expected to
      directly mess with the configuration parser's internals in order
      to set it up. Let's avoid doing that by implementing both a
      `git_config_parser_init` and `git_config_parser_dispose` function
      to clearly define the interface between configuration backends
      and the parser.
      
      Ideally, we would make the `git_config_parser` structure
      definition private to its implementation. But as that would
      require an additional memory allocation that was not required
      before we just live with it being visible to others.
      Patrick Steinhardt committed
    • config_file: refactor error handling in `config_write` · 32157526
      Error handling in `config_write` is rather convoluted and does
      not match our current code style. Refactor it to make it easier
      to understand.
      Patrick Steinhardt committed
    • config_file: internalize `git_config_file` struct · 820fa1a3
      With the previous commits, we have finally separated the config
      parsing logic from the specific configuration file backend. Due
      to that, we can now move the `git_config_file` structure into the
      config file backend's implementation so that no other code may
      accidentally start using it again. Furthermore, we rename the
      structure to `diskfile` to make it obvious that it is internal,
      only, and to unify it with naming scheme of the other diskfile
      structures.
      Patrick Steinhardt committed
    • config_parse: remove use of `git_config_file` · 6e6da75f
      The config parser code needs to keep track of the current parsed
      file's name so that we are able to provide proper error messages
      to the user. Right now, we do that by storing a `git_config_file`
      in the parser structure, but as that is a specific backend and
      the parser aims to be generic, it is a layering violation.
      
      Switch over to use a simple string to fix that.
      Patrick Steinhardt committed
    • config_file: embed file in diskfile parse data · 54d350e0
      The config file code needs to keep track of the actual
      `git_config_file` structure, as it not only contains the path
      of the current configuration file, but it also keeps tracks of
      all includes of that file. Right now, we keep track of that
      structure via the `git_config_parser`, but as that's supposed to
      be a backend generic implementation of configuration parsing it's
      a layering violation to have it in there.
      
      Switch over the config file backend to use its own config file
      structure that's embedded in the backend parse data. This allows
      us to switch over the generic config parser to avoid using the
      `git_config_file` structure.
      Patrick Steinhardt committed
    • config_parse: rename `data` parameter to `payload` for clarity · 76749dfb
      By convention, parameters that get passed to callbacks are
      usually named `payload` in our codebase. Rename the `data`
      parameters in the configuration parser callbacks to `payload` to
      avoid confusion.
      Patrick Steinhardt committed
    • Merge pull request #5132 from pks-t/pks/config-stat-cache · ba9725a2
      config_file: implement stat cache to avoid repeated rehashing
      Patrick Steinhardt committed