1. 12 Jun, 2019 7 commits
    • Disable hash-table sanitization for mem stats maps. · ff7b3aa5
      2019-06-12  Martin Liska  <mliska@suse.cz>
      
      	* ggc-common.c (ggc_prune_overhead_list): Do not sanitize
      	the created map.
      	* hash-map.h: Add sanitize_eq_and_hash into ::hash_map.
      	* mem-stats.h (mem_alloc_description::mem_alloc_description):
      	Do not sanitize created maps.
      
      From-SVN: r272183
      Martin Liska committed
    • re PR target/90811 ([nvptx] ptxas error on OpenMP offloaded code) · 26d7a5e6
      	PR target/90811
      	* cfgexpand.c (align_local_variable): Add really_expand argument,
      	don't SET_DECL_ALIGN if it is false.
      	(add_stack_var): Add really_expand argument, pass it through to
      	align_local_variable.
      	(expand_one_stack_var_1): Pass true as really_expand to
      	align_local_variable.
      	(expand_one_ssa_partition): Pass true as really_expand to
      	add_stack_var.
      	(expand_one_var): Pass really_expand through to add_stack_var.
      
      From-SVN: r272181
      Jakub Jelinek committed
    • [arm] Implement usadv16qi and ssadv16qi standard names · 84ae7213
      
      This patch implements the usadv16qi and ssadv16qi standard names for arm.
      
      The V16QImode variant is important as it is the most commonly used pattern:
      reducing vectors of bytes into an int.
      The midend expects the optab to compute the absolute differences of operands 1
      and 2 and reduce them while widening along the way up to SImode. So the inputs
      are V16QImode and the output is V4SImode.
      
      I've based my solution on the current AArch64 implementation of the
      usadv16qi and ssadv16qi standard names (r260437). It emits the following
      sequence of instructions:
      
              VABDL.u8        tmp, op1, op2   # op1, op2 lowpart
              VABAL.u8        tmp, op1, op2   # op1, op2 highpart
              VPADAL.u16      op3, tmp
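
As a rough illustration of what the three-instruction sequence above computes, here is a scalar C model of the unsigned case. It is only a sketch and is not part of the patch: the names `usad16_model`, `sad_blocked` and `sad_reference` are hypothetical, the lane ordering is simplified to plain array indices, and `sad_blocked` assumes the length is a multiple of 16.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical scalar model of one usadv16qi step: add the widened
   absolute differences of 16 byte pairs into four 32-bit accumulators.  */
static void
usad16_model (const uint8_t op1[16], const uint8_t op2[16], uint32_t acc[4])
{
  uint16_t tmp[8];
  int i;

  /* VABDL.u8: absolute difference of the low 8 bytes, widened to 16 bits.  */
  for (i = 0; i < 8; i++)
    tmp[i] = (uint16_t) abs (op1[i] - op2[i]);

  /* VABAL.u8: accumulate the absolute difference of the high 8 bytes.
     The maximum per lane is 255 + 255 = 510, so 16 bits cannot overflow.  */
  for (i = 0; i < 8; i++)
    tmp[i] += (uint16_t) abs (op1[i + 8] - op2[i + 8]);

  /* VPADAL.u16: add adjacent 16-bit lanes pairwise into 32-bit lanes.  */
  for (i = 0; i < 4; i++)
    acc[i] += (uint32_t) tmp[2 * i] + tmp[2 * i + 1];
}

/* Sum of absolute differences computed 16 bytes at a time, followed by
   a final reduction of the four partial sums.  */
static uint32_t
sad_blocked (const uint8_t *a, const uint8_t *b, int n)
{
  uint32_t acc[4] = { 0, 0, 0, 0 };
  int i;

  assert (n % 16 == 0);  /* Assumption of this sketch: no scalar tail.  */
  for (i = 0; i < n; i += 16)
    usad16_model (a + i, b + i, acc);
  return acc[0] + acc[1] + acc[2] + acc[3];
}

/* Straightforward byte-by-byte reference for comparison.  */
static uint32_t
sad_reference (const uint8_t *a, const uint8_t *b, int n)
{
  uint32_t sum = 0;
  int i;

  for (i = 0; i < n; i++)
    sum += (uint32_t) abs (a[i] - b[i]);
  return sum;
}
```

Calling `sad_blocked` and `sad_reference` on the same pair of buffers should return the same total. The final addition of the four accumulators plays the role of the `vadd.i32`/`vpadd.i32` reduction that follows the loop in the generated code.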
      
      So, for the code below, compiled with:
      
      $ arm-none-linux-gnueabihf-gcc -S -O3 -march=armv8-a+simd -mfpu=auto -mfloat-abi=hard usadv16qi.c -dp
      
      #define N 1024
      unsigned char pix1[N];
      unsigned char pix2[N];
      
      int
      foo (void)
      {
        int i_sum = 0;
        int i;
        for (i = 0; i < N; i++)
          i_sum += __builtin_abs (pix1[i] - pix2[i]);
        return i_sum;
      }
      
      we now generate on arm:
      foo:
              movw    r3, #:lower16:pix2      @ 57    [c=4 l=4]  *arm_movsi_vfp/3
              movt    r3, #:upper16:pix2      @ 58    [c=4 l=4]  *arm_movt/0
              vmov.i32        q9, #0  @ v4si  @ 3     [c=4 l=4]  *neon_movv4si/2
              movw    r2, #:lower16:pix1      @ 59    [c=4 l=4]  *arm_movsi_vfp/3
              movt    r2, #:upper16:pix1      @ 60    [c=4 l=4]  *arm_movt/0
              add     r1, r3, #1024   @ 8     [c=4 l=4]  *arm_addsi3/4
      .L2:
              vld1.8  {q11}, [r3]!    @ 11    [c=8 l=4]  *movmisalignv16qi_neon_load
              vld1.8  {q10}, [r2]!    @ 10    [c=8 l=4]  *movmisalignv16qi_neon_load
              cmp     r1, r3  @ 21    [c=4 l=4]  *arm_cmpsi_insn/2
              vabdl.u8        q8, d20, d22    @ 12    [c=8 l=4]  neon_vabdluv8qi
              vabal.u8        q8, d21, d23    @ 15    [c=88 l=4]  neon_vabaluv8qi
              vpadal.u16      q9, q8  @ 16    [c=8 l=4]  neon_vpadaluv8hi
              bne     .L2             @ 22    [c=16 l=4]  arm_cond_branch
              vadd.i32        d18, d18, d19   @ 24    [c=120 l=4]  quad_halves_plusv4si
              vpadd.i32       d18, d18, d18   @ 25    [c=8 l=4]  neon_vpadd_internalv2si
              vmov.32 r0, d18[0]      @ 30    [c=12 l=4]  vec_extractv2sisi/1
      
      instead of:
      foo:
              @ args = 0, pretend = 0, frame = 0
              @ frame_needed = 0, uses_anonymous_args = 0
              @ link register save eliminated.
              movw    r3, #:lower16:pix1
              movt    r3, #:upper16:pix1
              vmov.i32        q9, #0  @ v4si
              movw    r2, #:lower16:pix2
              movt    r2, #:upper16:pix2
              add     r1, r3, #1024
      .L2:
              vld1.8  {q8}, [r3]!
              vld1.8  {q11}, [r2]!
              vmovl.u8 q10, d16
              cmp     r1, r3
              vmovl.u8 q8, d17
              vmovl.u8 q12, d22
              vmovl.u8 q11, d23
              vsub.i16        q10, q10, q12
              vsub.i16        q8, q8, q11
              vabs.s16        q10, q10
              vabs.s16        q8, q8
              vaddw.s16       q9, q9, d20
              vaddw.s16       q9, q9, d21
              vaddw.s16       q9, q9, d16
              vaddw.s16       q9, q9, d17
              bne     .L2
              vadd.i32        d18, d18, d19
              vpadd.i32       d18, d18, d18
              vmov.32 r0, d18[0]
      
      2019-06-12  Przemyslaw Wirkus  <przemyslaw.wirkus@arm.com>
      
              * config/arm/iterators.md (VABAL): New int iterator.
              * config/arm/neon.md (<sup>sadv16qi): New define_expand.
              * config/arm/unspecs.md ("unspec"): Define UNSPEC_VABAL_S, UNSPEC_VABAL_U
              values.
      
              * gcc.target/arm/ssadv16qi.c: New test.
              * gcc.target/arm/usadv16qi.c: Likewise.
      
      From-SVN: r272180
      Przemyslaw Wirkus committed
    • Remove wrong assert about single value profiler. · d134323b
      2019-06-12  Martin Liska  <mliska@suse.cz>
      
      	* value-prof.c (stream_out_histogram_value): Only first value
      	can't be negative.
      
      From-SVN: r272179
      Martin Liska committed
    • re PR c/90760 (ICE on attributes section and alias in set_section, at symtab.c:1573) · f3139680
      	PR c/90760
      	* symtab.c (symtab_node::set_section): Allow being called on aliases
      	as long as they aren't analyzed yet.
      
      	* gcc.dg/pr90760.c: New test.
      
      From-SVN: r272178
      Jakub Jelinek committed
    • Daily bump. · bfde1e21
      From-SVN: r272177
      GCC Administrator committed
  2. 11 Jun, 2019 19 commits
  3. 10 Jun, 2019 14 commits