- 29 Oct, 2019 14 commits
This patch adds support for arm_sve.h. I've tried to split all the groundwork out into separate patches, so this is mostly adding new code rather than changing existing code.

The C++ frontend seems to handle correct ACLE code without modification, even in length-agnostic mode. The C frontend is close; the only correct construct I know it doesn't handle is initialisation. E.g.:

    svbool_t pg = svptrue_b8 ();

produces:

    variable-sized object may not be initialized

although:

    svbool_t pg; pg = svptrue_b8 ();

works fine. This can be fixed by changing:

      {
        /* A complete type is ok if size is fixed. */
    -   if (TREE_CODE (TYPE_SIZE (TREE_TYPE (decl))) != INTEGER_CST
    +   if (!poly_int_tree_p (TYPE_SIZE (TREE_TYPE (decl)))
            || C_DECL_VARIABLE_SIZE (decl))
          {
            error ("variable-sized object may not be initialized");

in c/c-decl.c:start_decl.

Invalid code is likely to trigger ICEs, so this isn't ready for general use yet. However, it seemed better to apply the patch now and deal with diagnosing invalid code as a follow-up. For one thing, it means that we'll be able to provide testcases for middle-end changes related to SVE vectors, which has been a problem until now. (I already have a series of such patches lined up.)

The patch includes some tests, but the main ones need to wait until the PCS support has been applied.

2019-10-29 Richard Sandiford <richard.sandiford@arm.com> Kugan Vivekanandarajah <kugan.vivekanandarajah@linaro.org> Prathamesh Kulkarni <prathamesh.kulkarni@linaro.org>

gcc/ * config.gcc (aarch64*-*-*): Add arm_sve.h to extra_headers. Add aarch64-sve-builtins.o, aarch64-sve-builtins-shapes.o and aarch64-sve-builtins-base.o to extra_objs. Add aarch64-sve-builtins.h and aarch64-sve-builtins.cc to target_gtfiles. * config/aarch64/t-aarch64 (aarch64-sve-builtins.o): New rule. (aarch64-sve-builtins-shapes.o): Likewise. (aarch64-sve-builtins-base.o): New rules. * config/aarch64/aarch64-c.c (aarch64_pragma_aarch64): New function. (aarch64_resolve_overloaded_builtin): Likewise.
(aarch64_check_builtin_call): Likewise. (aarch64_register_pragmas): Install aarch64_resolve_overloaded_builtin and aarch64_check_builtin_call in targetm. Register the GCC aarch64 pragma. * config/aarch64/aarch64-protos.h (AARCH64_FOR_SVPRFOP): New macro. (aarch64_svprfop): New enum. (AARCH64_BUILTIN_SVE): New aarch64_builtin_class enum value. (aarch64_sve_int_mode, aarch64_sve_data_mode): Declare. (aarch64_fold_sve_cnt_pat, aarch64_output_sve_prefetch): Likewise. (aarch64_output_sve_cnt_pat_immediate): Likewise. (aarch64_output_sve_ptrues, aarch64_sve_ptrue_svpattern_p): Likewise. (aarch64_sve_sqadd_sqsub_immediate_p, aarch64_sve_ldff1_operand_p) (aarch64_sve_ldnf1_operand_p, aarch64_sve_prefetch_operand_p) (aarch64_ptrue_all_mode, aarch64_convert_sve_data_to_pred): Likewise. (aarch64_expand_sve_dupq, aarch64_replace_reg_mode): Likewise. (aarch64_sve::init_builtins, aarch64_sve::handle_arm_sve_h): Likewise. (aarch64_sve::builtin_decl, aarch64_sve::builtin_type_p): Likewise. (aarch64_sve::mangle_builtin_type): Likewise. (aarch64_sve::resolve_overloaded_builtin): Likewise. (aarch64_sve::check_builtin_call, aarch64_sve::gimple_fold_builtin) (aarch64_sve::expand_builtin): Likewise. * config/aarch64/aarch64.c (aarch64_sve_data_mode): Make public. (aarch64_sve_int_mode): Likewise. (aarch64_ptrue_all_mode): New function. (aarch64_convert_sve_data_to_pred): Make public. (svprfop_token): New function. (aarch64_output_sve_prefetch): Likewise. (aarch64_fold_sve_cnt_pat): Likewise. (aarch64_output_sve_cnt_pat_immediate): Likewise. (aarch64_sve_move_pred_via_while): Use gen_while with UNSPEC_WHILE_LO instead of gen_while_ult. (aarch64_replace_reg_mode): Make public. (aarch64_init_builtins): Call aarch64_sve::init_builtins. (aarch64_fold_builtin): Handle AARCH64_BUILTIN_SVE. (aarch64_gimple_fold_builtin, aarch64_expand_builtin): Likewise. (aarch64_builtin_decl, aarch64_builtin_reciprocal): Likewise. (aarch64_mangle_type): Call aarch64_sve::mangle_type. 
(aarch64_sve_sqadd_sqsub_immediate_p): New function. (aarch64_sve_ptrue_svpattern_p): Likewise. (aarch64_sve_pred_valid_immediate): Check aarch64_sve_ptrue_svpattern_p. (aarch64_sve_ldff1_operand_p, aarch64_sve_ldnf1_operand_p) (aarch64_sve_prefetch_operand_p, aarch64_output_sve_ptrues): New functions. * config/aarch64/aarch64.md (UNSPEC_LDNT1_SVE, UNSPEC_STNT1_SVE) (UNSPEC_LDFF1_GATHER, UNSPEC_PTRUE, UNSPEC_WHILE_LE, UNSPEC_WHILE_LS) (UNSPEC_WHILE_LT, UNSPEC_CLASTA, UNSPEC_UPDATE_FFR) (UNSPEC_UPDATE_FFRT, UNSPEC_RDFFR, UNSPEC_WRFFR) (UNSPEC_SVE_LANE_SELECT, UNSPEC_SVE_CNT_PAT, UNSPEC_SVE_PREFETCH) (UNSPEC_SVE_PREFETCH_GATHER, UNSPEC_SVE_COMPACT, UNSPEC_SVE_SPLICE): New unspecs. * config/aarch64/iterators.md (SI_ONLY, DI_ONLY, VNx8HI_ONLY) (VNx2DI_ONLY, SVE_PARTIAL, VNx8_NARROW, VNx8_WIDE, VNx4_NARROW) (VNx4_WIDE, VNx2_NARROW, VNx2_WIDE, PRED_HSD): New mode iterators. (UNSPEC_ADR, UNSPEC_BRKA, UNSPEC_BRKB, UNSPEC_BRKN, UNSPEC_BRKPA) (UNSPEC_BRKPB, UNSPEC_PFIRST, UNSPEC_PNEXT, UNSPEC_CNTP, UNSPEC_SADDV) (UNSPEC_UADDV, UNSPEC_FMLA, UNSPEC_FMLS, UNSPEC_FEXPA, UNSPEC_FTMAD) (UNSPEC_FTSMUL, UNSPEC_FTSSEL, UNSPEC_COND_CMPEQ_WIDE): New unspecs. (UNSPEC_COND_CMPGE_WIDE, UNSPEC_COND_CMPGT_WIDE): Likewise. (UNSPEC_COND_CMPHI_WIDE, UNSPEC_COND_CMPHS_WIDE): Likewise. (UNSPEC_COND_CMPLE_WIDE, UNSPEC_COND_CMPLO_WIDE): Likewise. (UNSPEC_COND_CMPLS_WIDE, UNSPEC_COND_CMPLT_WIDE): Likewise. (UNSPEC_COND_CMPNE_WIDE, UNSPEC_COND_FCADD90, UNSPEC_COND_FCADD270) (UNSPEC_COND_FCMLA, UNSPEC_COND_FCMLA90, UNSPEC_COND_FCMLA180) (UNSPEC_COND_FCMLA270, UNSPEC_COND_FMAX, UNSPEC_COND_FMIN): Likewise. (UNSPEC_COND_FMULX, UNSPEC_COND_FRECPX, UNSPEC_COND_FSCALE): Likewise. (UNSPEC_LASTA, UNSPEC_ASHIFT_WIDE, UNSPEC_ASHIFTRT_WIDE): Likewise. (UNSPEC_LSHIFTRT_WIDE, UNSPEC_LDFF1, UNSPEC_LDNF1): Likewise. (Vesize): Handle partial vector modes. (self_mask, narrower_mask, sve_lane_con, sve_lane_pair_con): New mode attributes. (UBINQOPS, ANY_PLUS, SAT_PLUS, ANY_MINUS, SAT_MINUS): New code iterators. 
(s, paired_extend, inc_dec): New code attributes. (SVE_INT_ADDV, CLAST, LAST): New int iterators. (SVE_INT_UNARY): Add UNSPEC_RBIT. (SVE_FP_UNARY, SVE_FP_UNARY_INT): New int iterators. (SVE_FP_BINARY, SVE_FP_BINARY_INT): Likewise. (SVE_COND_FP_UNARY): Add UNSPEC_COND_FRECPX. (SVE_COND_FP_BINARY): Add UNSPEC_COND_FMAX, UNSPEC_COND_FMIN and UNSPEC_COND_FMULX. (SVE_COND_FP_BINARY_INT, SVE_COND_FP_ADD): New int iterators. (SVE_COND_FP_SUB, SVE_COND_FP_MUL): Likewise. (SVE_COND_FP_BINARY_I1): Add UNSPEC_COND_FMAX and UNSPEC_COND_FMIN. (SVE_COND_FP_BINARY_REG): Add UNSPEC_COND_FMULX. (SVE_COND_FCADD, SVE_COND_FP_MAXMIN, SVE_COND_FCMLA) (SVE_COND_INT_CMP_WIDE, SVE_FP_TERNARY_LANE, SVE_CFP_TERNARY_LANE) (SVE_WHILE, SVE_SHIFT_WIDE, SVE_LDFF1_LDNF1, SVE_BRK_UNARY) (SVE_BRK_BINARY, SVE_PITER): New int iterators. (optab): Handle UNSPEC_SADDV, UNSPEC_UADDV, UNSPEC_FRECPE, UNSPEC_FRECPS, UNSPEC_RSQRTE, UNSPEC_RSQRTS, UNSPEC_RBIT, UNSPEC_SMUL_HIGHPART, UNSPEC_UMUL_HIGHPART, UNSPEC_FMLA, UNSPEC_FMLS, UNSPEC_FCMLA, UNSPEC_FCMLA90, UNSPEC_FCMLA180, UNSPEC_FCMLA270, UNSPEC_FEXPA, UNSPEC_FTSMUL, UNSPEC_FTSSEL, UNSPEC_COND_FCADD90, UNSPEC_COND_FCADD270, UNSPEC_COND_FCMLA, UNSPEC_COND_FCMLA90, UNSPEC_COND_FCMLA180, UNSPEC_COND_FCMLA270, UNSPEC_COND_FMAX, UNSPEC_COND_FMIN, UNSPEC_COND_FMULX, UNSPEC_COND_FRECPX and UNSPEC_COND_FSCALE. (maxmin_uns): Handle UNSPEC_COND_FMAX and UNSPEC_COND_FMIN. (binqops_op, binqops_op_rev, last_op): New int attributes. (su): Handle UNSPEC_SADDV and UNSPEC_UADDV. (fn, ab): New int attributes. (cmp_op): Handle UNSPEC_COND_CMP*_WIDE and UNSPEC_WHILE_*. (while_optab_cmp, brk_op, sve_pred_op): New int attributes. (sve_int_op): Handle UNSPEC_SMUL_HIGHPART, UNSPEC_UMUL_HIGHPART, UNSPEC_ASHIFT_WIDE, UNSPEC_ASHIFTRT_WIDE, UNSPEC_LSHIFTRT_WIDE and UNSPEC_RBIT. 
(sve_fp_op): Handle UNSPEC_FRECPE, UNSPEC_FRECPS, UNSPEC_RSQRTE, UNSPEC_RSQRTS, UNSPEC_FMLA, UNSPEC_FMLS, UNSPEC_FEXPA, UNSPEC_FTSMUL, UNSPEC_FTSSEL, UNSPEC_COND_FMAX, UNSPEC_COND_FMIN, UNSPEC_COND_FMULX, UNSPEC_COND_FRECPX and UNSPEC_COND_FSCALE. (sve_fp_op_rev): Handle UNSPEC_COND_FMAX, UNSPEC_COND_FMIN and UNSPEC_COND_FMULX. (rot): Handle UNSPEC_COND_FCADD* and UNSPEC_COND_FCMLA*. (brk_reg_con, brk_reg_opno): New int attributes. (sve_pred_fp_rhs1_operand, sve_pred_fp_rhs2_operand): Handle UNSPEC_COND_FMAX, UNSPEC_COND_FMIN and UNSPEC_COND_FMULX. (sve_pred_fp_rhs2_immediate): Handle UNSPEC_COND_FMAX and UNSPEC_COND_FMIN. (max_elem_bits): New int attribute. (min_elem_bits): Handle UNSPEC_RBIT. * config/aarch64/predicates.md (subreg_lowpart_operator): Handle TRUNCATE as well as SUBREG. (ascending_int_parallel, aarch64_simd_reg_or_minus_one) (aarch64_sve_ldff1_operand, aarch64_sve_ldnf1_operand) (aarch64_sve_prefetch_operand, aarch64_sve_ptrue_svpattern_immediate) (aarch64_sve_qadd_immediate, aarch64_sve_qsub_immediate) (aarch64_sve_gather_immediate_b, aarch64_sve_gather_immediate_h) (aarch64_sve_gather_immediate_w, aarch64_sve_gather_immediate_d) (aarch64_sve_sqadd_operand, aarch64_sve_gather_offset_b) (aarch64_sve_gather_offset_h, aarch64_sve_gather_offset_w) (aarch64_sve_gather_offset_d, aarch64_gather_scale_operand_b) (aarch64_gather_scale_operand_h): New predicates. * config/aarch64/constraints.md (UPb, UPd, UPh, UPw, Utf, Utn, vgb) (vgd, vgh, vgw, vsQ, vsS): New constraints. * config/aarch64/aarch64-sve.md: Add a note on the FFR handling. (*aarch64_sve_reinterpret<mode>): Allow any source register instead of requiring an exact match. 
(*aarch64_sve_ptruevnx16bi_cc, *aarch64_sve_ptrue<mode>_cc) (*aarch64_sve_ptruevnx16bi_ptest, *aarch64_sve_ptrue<mode>_ptest) (aarch64_wrffr, aarch64_update_ffr_for_load, aarch64_copy_ffr_to_ffrt) (aarch64_rdffr, aarch64_rdffr_z, *aarch64_rdffr_z_ptest) (*aarch64_rdffr_ptest, *aarch64_rdffr_z_cc, *aarch64_rdffr_cc) (aarch64_update_ffrt): New patterns. (@aarch64_load_<ANY_EXTEND:optab><VNx8_WIDE:mode><VNx8_NARROW:mode>) (@aarch64_load_<ANY_EXTEND:optab><VNx4_WIDE:mode><VNx4_NARROW:mode>) (@aarch64_load_<ANY_EXTEND:optab><VNx2_WIDE:mode><VNx2_NARROW:mode>) (@aarch64_ld<fn>f1<mode>): New patterns. (@aarch64_ld<fn>f1_<ANY_EXTEND:optab><VNx8_WIDE:mode><VNx8_NARROW:mode>) (@aarch64_ld<fn>f1_<ANY_EXTEND:optab><VNx4_WIDE:mode><VNx4_NARROW:mode>) (@aarch64_ld<fn>f1_<ANY_EXTEND:optab><VNx2_WIDE:mode><VNx2_NARROW:mode>) (@aarch64_ldnt1<mode>): New patterns. (gather_load<mode>): Use aarch64_sve_gather_offset_<Vesize> for the scalar part of the address. (mask_gather_load<SVE_S:mode>): Use aarch64_sve_gather_offset_w for the scalar part of the address and add an alternative for handling nonzero offsets. (mask_gather_load<SVE_D:mode>): Likewise aarch64_sve_gather_offset_d.
(*mask_gather_load<mode>_sxtw, *mask_gather_load<mode>_uxtw) (@aarch64_gather_load_<ANY_EXTEND:optab><VNx4_WIDE:mode><VNx4_NARROW:mode>) (@aarch64_gather_load_<ANY_EXTEND:optab><VNx2_WIDE:mode><VNx2_NARROW:mode>) (*aarch64_gather_load_<ANY_EXTEND:optab><VNx2_WIDE:mode><VNx2_NARROW:mode>_sxtw) (*aarch64_gather_load_<ANY_EXTEND:optab><VNx2_WIDE:mode><VNx2_NARROW:mode>_uxtw) (@aarch64_ldff1_gather<SVE_S:mode>, @aarch64_ldff1_gather<SVE_D:mode>) (*aarch64_ldff1_gather<mode>_sxtw, *aarch64_ldff1_gather<mode>_uxtw) (@aarch64_ldff1_gather_<ANY_EXTEND:optab><VNx4_WIDE:mode><VNx4_NARROW:mode>) (@aarch64_ldff1_gather_<ANY_EXTEND:optab><VNx2_WIDE:mode><VNx2_NARROW:mode>) (*aarch64_ldff1_gather_<ANY_EXTEND:optab><VNx2_WIDE:mode><VNx2_NARROW:mode>_sxtw) (*aarch64_ldff1_gather_<ANY_EXTEND:optab><VNx2_WIDE:mode><VNx2_NARROW:mode>_uxtw) (@aarch64_sve_prefetch<mode>): New patterns. (@aarch64_sve_gather_prefetch<SVE_I:mode><VNx4SI_ONLY:mode>) (@aarch64_sve_gather_prefetch<SVE_I:mode><VNx2DI_ONLY:mode>) (*aarch64_sve_gather_prefetch<SVE_I:mode><VNx2DI_ONLY:mode>_sxtw) (*aarch64_sve_gather_prefetch<SVE_I:mode><VNx2DI_ONLY:mode>_uxtw) (@aarch64_store_trunc<VNx8_NARROW:mode><VNx8_WIDE:mode>) (@aarch64_store_trunc<VNx4_NARROW:mode><VNx4_WIDE:mode>) (@aarch64_store_trunc<VNx2_NARROW:mode><VNx2_WIDE:mode>) (@aarch64_stnt1<mode>): New patterns. (scatter_store<mode>): Use aarch64_sve_gather_offset_<Vesize> for the scalar part of the address. (mask_scatter_store<SVE_S:mode>): Use aarch64_sve_gather_offset_w for the scalar part of the address and add an alternative for handling nonzero offsets. (mask_scatter_store<SVE_D:mode>): Likewise aarch64_sve_gather_offset_d.
(*mask_scatter_store<mode>_sxtw, *mask_scatter_store<mode>_uxtw) (@aarch64_scatter_store_trunc<VNx4_NARROW:mode><VNx4_WIDE:mode>) (@aarch64_scatter_store_trunc<VNx2_NARROW:mode><VNx2_WIDE:mode>) (*aarch64_scatter_store_trunc<VNx2_NARROW:mode><VNx2_WIDE:mode>_sxtw) (*aarch64_scatter_store_trunc<VNx2_NARROW:mode><VNx2_WIDE:mode>_uxtw): New patterns. (vec_duplicate<mode>): Use QI as the mode of the input operand. (extract_last_<mode>): Generalize to... (@extract_<LAST:last_op>_<mode>): ...this. (*<SVE_INT_UNARY:optab><mode>2): Rename to... (@aarch64_pred_<SVE_INT_UNARY:optab><mode>): ...this. (@cond_<SVE_INT_UNARY:optab><mode>): New expander. (@aarch64_pred_sxt<SVE_HSDI:mode><SVE_PARTIAL:mode>): New pattern. (@aarch64_cond_sxt<SVE_HSDI:mode><SVE_PARTIAL:mode>): Likewise. (@aarch64_pred_cnot<mode>, @cond_cnot<mode>): New expanders. (@aarch64_sve_<SVE_FP_UNARY_INT:optab><mode>): New pattern. (@aarch64_sve_<SVE_FP_UNARY:optab><mode>): Likewise. (*<SVE_COND_FP_UNARY:optab><mode>2): Rename to... (@aarch64_pred_<SVE_COND_FP_UNARY:optab><mode>): ...this. (@cond_<SVE_COND_FP_UNARY:optab><mode>): New expander. (*<SVE_INT_BINARY_IMM:optab><mode>3): Rename to... (@aarch64_pred_<SVE_INT_BINARY_IMM:optab><mode>): ...this. (@aarch64_adr<mode>, *aarch64_adr_sxtw): New patterns. (*aarch64_adr_uxtw_unspec): Likewise. (*aarch64_adr_uxtw): Rename to... (*aarch64_adr_uxtw_and): ...this. (@aarch64_adr<mode>_shift): New expander. (*aarch64_adr_shift_sxtw): New pattern. (aarch64_<su>abd<mode>_3): Rename to... (@aarch64_pred_<su>abd<mode>): ...this. (<su>abd<mode>_3): Update accordingly. (@aarch64_cond_<su>abd<mode>): New expander. (@aarch64_<SBINQOPS:su_optab><optab><mode>): New pattern. (@aarch64_<UBINQOPS:su_optab><optab><mode>): Likewise. (*<su>mul<mode>3_highpart): Rename to... (@aarch64_pred_<optab><mode>): ...this. (@cond_<MUL_HIGHPART:optab><mode>): New expander. (*cond_<MUL_HIGHPART:optab><mode>_2): New pattern. (*cond_<MUL_HIGHPART:optab><mode>_z): Likewise. 
(*<SVE_INT_BINARY_SD:optab><mode>3): Rename to... (@aarch64_pred_<SVE_INT_BINARY_SD:optab><mode>): ...this. (cond_<SVE_INT_BINARY_SD:optab><mode>): Add a "@" marker. (@aarch64_bic<mode>, @cond_bic<mode>): New expanders. (*v<ASHIFT:optab><mode>3): Rename to... (@aarch64_pred_<ASHIFT:optab><mode>): ...this. (@aarch64_sve_<SVE_SHIFT_WIDE:sve_int_op><mode>): New pattern. (@cond_<SVE_SHIFT_WIDE:sve_int_op><mode>): New expander. (*cond_<SVE_SHIFT_WIDE:sve_int_op><mode>_m): New pattern. (*cond_<SVE_SHIFT_WIDE:sve_int_op><mode>_z): Likewise. (@cond_asrd<mode>): New expander. (*cond_asrd<mode>_2, *cond_asrd<mode>_z): New patterns. (sdiv_pow2<mode>3): Expand to *cond_asrd<mode>_2. (*sdiv_pow2<mode>3): Delete. (@cond_<SVE_COND_FP_BINARY_INT:optab><mode>): New expander. (*cond_<SVE_COND_FP_BINARY_INT:optab><mode>_2): New pattern. (*cond_<SVE_COND_FP_BINARY_INT:optab><mode>_any): Likewise. (@aarch64_sve_<SVE_FP_BINARY:optab><mode>): New pattern. (@aarch64_sve_<SVE_FP_BINARY_INT:optab><mode>): Likewise. (*<SVE_COND_FP_BINARY_REG:optab><mode>3): Rename to... (@aarch64_pred_<SVE_COND_FP_BINARY_REG:optab><mode>): ...this. (@aarch64_pred_<SVE_COND_FP_BINARY_INT:optab><mode>): New pattern. (cond_<SVE_COND_FP_BINARY:optab><mode>): Add a "@" marker. (*add<SVE_F:mode>3): Rename to... (@aarch64_pred_add<SVE_F:mode>): ...this and add alternatives for SVE_STRICT_GP. (@aarch64_pred_<SVE_COND_FCADD:optab><mode>): New pattern. (@cond_<SVE_COND_FCADD:optab><mode>): New expander. (*cond_<SVE_COND_FCADD:optab><mode>_2): New pattern. (*cond_<SVE_COND_FCADD:optab><mode>_any): Likewise. (*sub<SVE_F:mode>3): Rename to... (@aarch64_pred_sub<SVE_F:mode>): ...this and add alternatives for SVE_STRICT_GP. (@aarch64_pred_abd<SVE_F:mode>): New expander. (*fabd<SVE_F:mode>3): Rename to... (*aarch64_pred_abd<SVE_F:mode>): ...this. (@aarch64_cond_abd<SVE_F:mode>): New expander. (*mul<SVE_F:mode>3): Rename to... (@aarch64_pred_<SVE_F:optab><mode>): ...this and add alternatives for SVE_STRICT_GP. 
(@aarch64_mul_lane_<SVE_F:mode>): New pattern. (*<SVE_COND_FP_MAXMIN_PUBLIC:optab><mode>3): Rename and generalize to... (@aarch64_pred_<SVE_COND_FP_MAXMIN:optab><mode>): ...this. (*<LOGICAL:optab><PRED_ALL:mode>3_ptest): New pattern. (*<nlogical><PRED_ALL:mode>3): Rename to... (aarch64_pred_<nlogical><PRED_ALL:mode>_z): ...this. (*<nlogical><PRED_ALL:mode>3_cc): New pattern. (*<nlogical><PRED_ALL:mode>3_ptest): Likewise. (*<logical_nn><PRED_ALL:mode>3): Rename to... (aarch64_pred_<logical_nn><mode>_z): ...this. (*<logical_nn><PRED_ALL:mode>3_cc): New pattern. (*<logical_nn><PRED_ALL:mode>3_ptest): Likewise. (*fma<SVE_I:mode>4): Rename to... (@aarch64_pred_fma<SVE_I:mode>): ...this. (*fnma<SVE_I:mode>4): Rename to... (@aarch64_pred_fnma<SVE_I:mode>): ...this. (@aarch64_<sur>dot_prod_lane<vsi2qi>): New pattern. (*<SVE_FP_TERNARY:optab><mode>4): Rename to... (@aarch64_pred_<SVE_FP_TERNARY:optab><mode>): ...this. (cond_<SVE_FP_TERNARY:optab><mode>): Add a "@" marker. (@aarch64_<SVE_FP_TERNARY_LANE:optab>_lane_<mode>): New pattern. (@aarch64_pred_<SVE_COND_FCMLA:optab><mode>): Likewise. (@cond_<SVE_COND_FCMLA:optab><mode>): New expander. (*cond_<SVE_COND_FCMLA:optab><mode>_4): New pattern. (*cond_<SVE_COND_FCMLA:optab><mode>_any): Likewise. (@aarch64_<FCMLA:optab>_lane_<mode>): Likewise. (@aarch64_sve_tmad<mode>): Likewise. (vcond_mask_<SVE_ALL:mode><vpred>): Add a "@" marker. (*aarch64_sel_dup<mode>): Rename to... (@aarch64_sel_dup<mode>): ...this. (@aarch64_pred_cmp<cmp_op><SVE_I:mode>_wide): New pattern. (*aarch64_pred_cmp<cmp_op><SVE_I:mode>_wide_cc): Likewise. (*aarch64_pred_cmp<cmp_op><SVE_I:mode>_wide_ptest): Likewise. (@while_ult<GPI:mode><PRED_ALL:mode>): Generalize to... (@while_<while_optab_cmp><GPI:mode><PRED_ALL:mode>): ...this. (*while_ult<GPI:mode><PRED_ALL:mode>_cc): Generalize to. (*while_<while_optab_cmp><GPI:mode><PRED_ALL:mode>_cc): ...this. (*while_<while_optab_cmp><GPI:mode><PRED_ALL:mode>_ptest): New pattern. (*fcm<cmp_op><mode>): Rename to... 
(@aarch64_pred_fcm<cmp_op><mode>): ...this. Make operand order match @aarch64_pred_cmp<cmp_op><SVE_I:mode>. (*fcmuo<mode>): Rename to... (@aarch64_pred_fcmuo<mode>): ...this. Make operand order match @aarch64_pred_cmp<cmp_op><SVE_I:mode>. (@aarch64_pred_fac<cmp_op><mode>): New expander. (@vcond_mask_<PRED_ALL:mode><mode>): New pattern. (fold_extract_last_<mode>): Generalize to... (@fold_extract_<last_op>_<mode>): ...this. (@aarch64_fold_extract_vector_<last_op>_<mode>): New pattern. (*reduc_plus_scal_<SVE_I:mode>): Replace with... (@aarch64_pred_reduc_<optab>_<mode>): ...this pattern, making the DImode result explicit. (reduc_plus_scal_<mode>): Update accordingly. (*reduc_<optab>_scal_<SVE_I:mode>): Rename to... (@aarch64_pred_reduc_<optab>_<SVE_I:mode>): ...this. (*reduc_<optab>_scal_<SVE_F:mode>): Rename to... (@aarch64_pred_reduc_<optab>_<SVE_F:mode>): ...this. (*aarch64_sve_tbl<mode>): Rename to... (@aarch64_sve_tbl<mode>): ...this. (@aarch64_sve_compact<mode>): New pattern. (*aarch64_sve_dup_lane<mode>): Rename to... (@aarch64_sve_dup_lane<mode>): ...this. (@aarch64_sve_dupq_lane<mode>): New pattern. (@aarch64_sve_splice<mode>): Likewise. (aarch64_sve_<perm_insn><mode>): Rename to... (@aarch64_sve_<perm_insn><mode>): ...this. (*aarch64_sve_ext<mode>): Rename to... (@aarch64_sve_ext<mode>): ...this. (aarch64_sve_<su>unpk<perm_hilo>_<SVE_BHSI:mode>): Add a "@" marker. (*aarch64_sve_<optab>_nontrunc<SVE_F:mode><SVE_HSDI:mode>): Rename to... (@aarch64_sve_<optab>_nontrunc<SVE_F:mode><SVE_HSDI:mode>): ...this. (*aarch64_sve_<optab>_trunc<VNx2DF_ONLY:mode><VNx4SI_ONLY:mode>): Rename to... (@aarch64_sve_<optab>_trunc<VNx2DF_ONLY:mode><VNx4SI_ONLY:mode>): ...this. (@cond_<optab>_nontrunc<SVE_F:mode><SVE_HSDI:mode>): New expander. (@cond_<optab>_trunc<VNx2DF_ONLY:mode><VNx4SI_ONLY:mode>): Likewise. (*cond_<optab>_trunc<VNx2DF_ONLY:mode><VNx4SI_ONLY:mode>): New pattern. (*aarch64_sve_<optab>_nonextend<SVE_HSDI:mode><SVE_F:mode>): Rename to... 
(@aarch64_sve_<optab>_nonextend<SVE_HSDI:mode><SVE_F:mode>): ...this. (aarch64_sve_<optab>_extend<VNx4SI_ONLY:mode><VNx2DF_ONLY:mode>): Add a "@" marker. (@cond_<optab>_nonextend<SVE_HSDI:mode><SVE_F:mode>): New expander. (@cond_<optab>_extend<VNx4SI_ONLY:mode><VNx2DF_ONLY:mode>): Likewise. (*cond_<optab>_extend<VNx4SI_ONLY:mode><VNx2DF_ONLY:mode>): New pattern. (*aarch64_sve_<optab>_trunc<SVE_SDF:mode><SVE_HSF:mode>): Rename to... (@aarch64_sve_<optab>_trunc<SVE_SDF:mode><SVE_HSF:mode>): ...this. (@cond_<optab>_trunc<SVE_SDF:mode><SVE_HSF:mode>): New expander. (*cond_<optab>_trunc<SVE_SDF:mode><SVE_HSF:mode>): New pattern. (aarch64_sve_<optab>_nontrunc<SVE_HSF:mode><SVE_SDF:mode>): Add a "@" marker. (@cond_<optab>_nontrunc<SVE_HSF:mode><SVE_SDF:mode>): New expander. (*cond_<optab>_nontrunc<SVE_HSF:mode><SVE_SDF:mode>): New pattern. (aarch64_sve_punpk<perm_hilo>_<mode>): Add a "@" marker. (@aarch64_brk<SVE_BRK_UNARY:brk_op>): New pattern. (*aarch64_brk<SVE_BRK_UNARY:brk_op>_cc): Likewise. (*aarch64_brk<SVE_BRK_UNARY:brk_op>_ptest): Likewise. (@aarch64_brk<SVE_BRK_BINARY:brk_op>): Likewise. (*aarch64_brk<SVE_BRK_BINARY:brk_op>_cc): Likewise. (*aarch64_brk<SVE_BRK_BINARY:brk_op>_ptest): Likewise. (@aarch64_sve_<SVE_PITER:sve_pred_op><mode>): Likewise. (*aarch64_sve_<SVE_PITER:sve_pred_op><mode>_cc): Likewise. (*aarch64_sve_<SVE_PITER:sve_pred_op><mode>_ptest): Likewise. (aarch64_sve_cnt_pat): Likewise. (@aarch64_sve_<ANY_PLUS:inc_dec><DI_ONLY:mode>_pat): Likewise. (*aarch64_sve_incsi_pat): Likewise. (@aarch64_sve_<SAT_PLUS:inc_dec><SI_ONLY:mode>_pat): Likewise. (@aarch64_sve_<ANY_PLUS:inc_dec><VNx2DI_ONLY:mode>_pat): Likewise. (@aarch64_sve_<ANY_PLUS:inc_dec><VNx4SI_ONLY:mode>_pat): Likewise. (@aarch64_sve_<ANY_PLUS:inc_dec><VNx8HI_ONLY:mode>_pat): New expander. (*aarch64_sve_<ANY_PLUS:inc_dec><VNx8HI_ONLY:mode>_pat): New pattern. (@aarch64_sve_<ANY_MINUS:inc_dec><DI_ONLY:mode>_pat): Likewise. (*aarch64_sve_decsi_pat): Likewise. 
(@aarch64_sve_<SAT_MINUS:inc_dec><SI_ONLY:mode>_pat): Likewise. (@aarch64_sve_<ANY_MINUS:inc_dec><VNx2DI_ONLY:mode>_pat): Likewise. (@aarch64_sve_<ANY_MINUS:inc_dec><VNx4SI_ONLY:mode>_pat): Likewise. (@aarch64_sve_<ANY_MINUS:inc_dec><VNx8HI_ONLY:mode>_pat): New expander. (*aarch64_sve_<ANY_MINUS:inc_dec><VNx8HI_ONLY:mode>_pat): New pattern. (@aarch64_pred_cntp<mode>): Likewise. (@aarch64_sve_<ANY_PLUS:inc_dec><DI_ONLY:mode><PRED_ALL:mode>_cntp): New expander. (*aarch64_sve_<ANY_PLUS:inc_dec><DI_ONLY:mode><PRED_ALL:mode>_cntp) (*aarch64_incsi<PRED_ALL:mode>_cntp): New patterns. (@aarch64_sve_<SAT_PLUS:inc_dec><SI_ONLY:mode><PRED_ALL:mode>_cntp): New expander. (*aarch64_sve_<SAT_PLUS:inc_dec><SI_ONLY:mode><PRED_ALL:mode>_cntp): New pattern. (@aarch64_sve_<ANY_PLUS:inc_dec><VNx2DI_ONLY:mode>_cntp): New expander. (*aarch64_sve_<ANY_PLUS:inc_dec><VNx2DI_ONLY:mode>_cntp): New pattern. (@aarch64_sve_<ANY_PLUS:inc_dec><VNx4SI_ONLY:mode>_cntp): New expander. (*aarch64_sve_<ANY_PLUS:inc_dec><VNx4SI_ONLY:mode>_cntp): New pattern. (@aarch64_sve_<ANY_PLUS:inc_dec><VNx8HI_ONLY:mode>_cntp): New expander. (*aarch64_sve_<ANY_PLUS:inc_dec><VNx8HI_ONLY:mode>_cntp): New pattern. (@aarch64_sve_<ANY_MINUS:inc_dec><DI_ONLY:mode><PRED_ALL:mode>_cntp): New expander. (*aarch64_sve_<ANY_MINUS:inc_dec><DI_ONLY:mode><PRED_ALL:mode>_cntp) (*aarch64_incsi<PRED_ALL:mode>_cntp): New patterns. (@aarch64_sve_<SAT_MINUS:inc_dec><SI_ONLY:mode><PRED_ALL:mode>_cntp): New expander. (*aarch64_sve_<SAT_MINUS:inc_dec><SI_ONLY:mode><PRED_ALL:mode>_cntp): New pattern. (@aarch64_sve_<ANY_MINUS:inc_dec><VNx2DI_ONLY:mode>_cntp): New expander. (*aarch64_sve_<ANY_MINUS:inc_dec><VNx2DI_ONLY:mode>_cntp): New pattern. (@aarch64_sve_<ANY_MINUS:inc_dec><VNx4SI_ONLY:mode>_cntp): New expander. (*aarch64_sve_<ANY_MINUS:inc_dec><VNx4SI_ONLY:mode>_cntp): New pattern. (@aarch64_sve_<ANY_MINUS:inc_dec><VNx8HI_ONLY:mode>_cntp): New expander. (*aarch64_sve_<ANY_MINUS:inc_dec><VNx8HI_ONLY:mode>_cntp): New pattern. 
* config/aarch64/arm_sve.h: New file. * config/aarch64/aarch64-sve-builtins.h: Likewise. * config/aarch64/aarch64-sve-builtins.cc: Likewise. * config/aarch64/aarch64-sve-builtins.def: Likewise. * config/aarch64/aarch64-sve-builtins-base.h: Likewise. * config/aarch64/aarch64-sve-builtins-base.cc: Likewise. * config/aarch64/aarch64-sve-builtins-base.def: Likewise. * config/aarch64/aarch64-sve-builtins-functions.h: Likewise. * config/aarch64/aarch64-sve-builtins-shapes.h: Likewise. * config/aarch64/aarch64-sve-builtins-shapes.cc: Likewise. gcc/testsuite/ * g++.target/aarch64/sve/acle/aarch64-sve-acle.exp: New file. * g++.target/aarch64/sve/acle/general-c++: New test directory. * gcc.target/aarch64/sve/acle/aarch64-sve-acle.exp: New file. * gcc.target/aarch64/sve/acle/general: New test directory. * gcc.target/aarch64/sve/acle/general-c: Likewise. Co-Authored-By: Kugan Vivekanandarajah <kuganv@linaro.org> Co-Authored-By: Prathamesh Kulkarni <prathamesh.kulkarni@linaro.org> From-SVN: r277563
Richard Sandiford committed -
This is tested by the main SVE ACLE patches, but since it affects the evpc routines, it seemed worth splitting out. 2019-10-29 Richard Sandiford <richard.sandiford@arm.com> gcc/ * config/aarch64/aarch64-sve.md (@aarch64_sve_rev<PRED_ALL:mode>): New pattern. * config/aarch64/aarch64.c (aarch64_evpc_rev_global): Handle all SVE modes. From-SVN: r277562
Richard Sandiford committed -
This patch adds the First Fault Register to the AArch64 port, as well as a fake register known as the FFR Token or FFRT. The main ACLE patch explains what the FFRT does and how it works. 2019-10-29 Richard Sandiford <richard.sandiford@arm.com> gcc/ * config/aarch64/aarch64.md (FFR_REGNUM, FFRT_REGNUM): New constants. * config/aarch64/aarch64.h (FIRST_PSEUDO_REGISTER): Bump to FFRT_REGNUM + 1. (FFR_REGS, PR_AND_FFR_REGS): New register classes. (REG_CLASS_NAMES, REG_CLASS_CONTENTS): Add entries for them. * config/aarch64/aarch64.c (pr_or_ffr_regnum_p): New function. (aarch64_hard_regno_nregs): Handle the new register classes. (aarch64_hard_regno_mode_ok): Likewise. (aarch64_regno_regclass): Likewise. (aarch64_class_max_nregs): Likewise. (aarch64_register_move_cost): Likewise. (aarch64_conditional_register_usage): Don't treat FFR and FFRT as general register_operands. From-SVN: r277561
Richard Sandiford committed -
2019-10-29 Martin Liska <mliska@suse.cz> * ggc-common.c: One can't subtract unsigned types in compare function. From-SVN: r277560
Martin Liska committed -
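The pitfall named in the ggc-common.c change above (r277560) can be sketched in isolation. The following is an illustration only and is not the GCC code: struct usage, wrapped_difference, compare_bytes and sort_usage are hypothetical names. It shows why "return a - b" is unreliable in a qsort-style compare function when the operands are unsigned, and one common alternative that never subtracts.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical record used only for this illustration.  */
struct usage
{
  size_t bytes;
};

/* What a subtraction-based comparator starts from: unsigned subtraction
   wraps around, so 1 - 2 yields SIZE_MAX rather than a negative value.
   Narrowing that result to int does not reliably recover the sign.  */
size_t
wrapped_difference (size_t a, size_t b)
{
  return a - b;
}

/* Comparator that avoids subtraction: returns -1, 0 or 1.  */
int
compare_bytes (const void *pa, const void *pb)
{
  const struct usage *a = (const struct usage *) pa;
  const struct usage *b = (const struct usage *) pb;
  return (a->bytes > b->bytes) - (a->bytes < b->bytes);
}

/* Sort N records in ascending order of their byte counts.  */
void
sort_usage (struct usage *v, size_t n)
{
  qsort (v, n, sizeof *v, compare_bytes);
}
```

The two comparisons in compare_bytes each evaluate to 0 or 1 as int, so their difference is always in range, whatever the magnitude of the operands.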
2019-10-29 Martin Liska <mliska@suse.cz> * cgraphunit.c (symbol_table::compile): Pass title as dump_memory_report argument. * toplev.c (dump_memory_report): New argument. (finalize): Pass new argument. * toplev.h (dump_memory_report): Add argument. 2019-10-29 Martin Liska <mliska@suse.cz> * lto.c (do_whole_program_analysis): Pass title as dump_memory_report argument. From-SVN: r277559
Martin Liska committed -
2019-10-29 Martin Liska <mliska@suse.cz> * ggc-common.c: Move Leak to the first column. From-SVN: r277558
Martin Liska committed -
2019-10-29 Martin Liska <mliska@suse.cz> * cgraphunit.c (symbol_table::compile): Remove argument for dump_memory_report. * ggc-common.c (dump_ggc_loc_statistics): Likewise. (compare_final): Remove in order to make report better readable. * ggc.h (dump_ggc_loc_statistics): Remove argument. * mem-stats.h (mem_alloc_description::get_list): Do not pass cmp. (mem_alloc_description::dump): Likewise here. * toplev.c (dump_memory_report): Remove final argument. (finalize): Likewise. * toplev.h (dump_memory_report): Remove argument. 2019-10-29 Martin Liska <mliska@suse.cz> * lto.c (do_whole_program_analysis): Remove argument. From-SVN: r277557
Martin Liska committed -
The SVE ACLE has convenience functions that take scalar arguments instead of vectors. This patch makes it easier to implement the shift and compare functions by making the associated immediate queries work for scalar immediates as well as vector duplicates of them. The "const" codes in the predicates were a holdover from an early version of the SVE port in which we used (const ...) wrappers for variable-length vector constants. I'll remove other instances of them in a separate patch. 2019-10-29 Richard Sandiford <richard.sandiford@arm.com> gcc/ * config/aarch64/aarch64.c (aarch64_sve_cmp_immediate_p) (aarch64_simd_shift_imm_p): Accept scalars as well as vectors. * config/aarch64/predicates.md (aarch64_sve_cmp_vsc_immediate) (aarch64_sve_cmp_vsd_immediate): Accept "const_int", but don't accept "const". From-SVN: r277556
Richard Sandiford committed -
Similarly to the simulate_builtin_function_decl patch, this one adds a hook for simulating an enum declaration in the source language. Again, the main SVE ACLE patch has tests for various error conditions. 2019-10-29 Richard Sandiford <richard.sandiford@arm.com> gcc/ * coretypes.h (string_int_pair): New typedef. * langhooks-def.h (LANG_HOOKS_SIMULATE_ENUM_DECL): Define. (LANG_HOOKS_FOR_TYPES_INITIALIZER): Include it. * langhooks.h (lang_hooks_for_types::simulate_enum_decl): New hook. gcc/c/ * c-tree.h (c_simulate_enum_decl): Declare. * c-decl.c (c_simulate_enum_decl): New function. * c-objc-common.h (LANG_HOOKS_SIMULATE_ENUM_DECL): Define to the above. gcc/cp/ * cp-objcp-common.h (cxx_simulate_enum_decl): Declare. (LANG_HOOKS_SIMULATE_ENUM_DECL): Define to the above. * decl.c (cxx_simulate_enum_decl): New function. From-SVN: r277555
Richard Sandiford committed -
Although it's possible to define the SVE intrinsics in a normal header file, it's much more convenient to define them directly in the compiler. This also speeds up compilation and gives better error messages.

The idea is therefore for arm_sve.h (the main intrinsics header file) to have the pragma:

    #pragma GCC aarch64 "arm_sve.h"

telling GCC to define (almost) everything arm_sve.h needs to define. The target then needs a way of injecting new built-in function declarations during compilation.

The main hook for defining built-in functions is add_builtin_function. This is designed for use at start-up, and so has various features that are correct in that context but not for the pragma above:

(1) the location is always BUILTINS_LOCATION, whereas for arm_sve.h it ought to be the location of the pragma.

(2) the function is only immediately visible if it's in the implementation namespace, whereas the pragma is deliberately injecting functions into the general namespace.

(3) there's no attempt to emulate a normal function declaration in C or C++, whereas functions declared by the pragma should be checked in the same way as an open-coded declaration would be. E.g. we should get an error if there was a previous incompatible declaration.

(4) in C++, the function is treated as extern "C" and so can't be overloaded, whereas SVE intrinsics do use function overloading.

This patch therefore adds a hook that targets can use to inject the equivalent of a source-level function declaration, but bound to a BUILT_IN_MD function.

The main SVE intrinsic patch has tests to make sure that we report an error for conflicting definitions that appear either before or after including arm_sve.h.

2019-10-29 Richard Sandiford <richard.sandiford@arm.com>

gcc/ * langhooks.h (lang_hooks::simulate_builtin_function_decl): New hook. (simulate_builtin_function_decl): Declare. * langhooks-def.h (LANG_HOOKS_SIMULATE_BUILTIN_FUNCTION_DECL): Define. (LANG_HOOKS_INITIALIZER): Include it.
* langhooks.c (add_builtin_function_common): Rename to... (build_builtin_function): ...this. Add a location parameter and use it instead of BUILTINS_LOCATION. Remove the hook parameter and return the decl instead. (add_builtin_function): Update accordingly, passing the returned decl to the lang hook. (add_builtin_function_ext_scope): Likewise. (simulate_builtin_function_decl): New function. gcc/c/ * c-tree.h (c_simulate_builtin_function_decl): Declare. * c-decl.c (c_simulate_builtin_function_decl): New function. * c-objc-common.h (LANG_HOOKS_SIMULATE_BUILTIN_FUNCTION_DECL): Define to the above. gcc/cp/ * cp-tree.h (cxx_simulate_builtin_function_decl): Declare. * decl.c (cxx_simulate_builtin_function_decl): New function. * cp-objcp-common.h (LANG_HOOKS_SIMULATE_BUILTIN_FUNCTION_DECL): Define to the above. From-SVN: r277554
Richard Sandiford committed -
2019-10-29 Richard Biener <rguenther@suse.de> PR tree-optimization/92241 * gcc.dg/torture/pr92241-2.c: New testcase. From-SVN: r277553
Richard Biener committed -
install.texi (--enable-offload-targets): Fix up a typo in the example, use actual names of supported offload targets. * doc/install.texi (--enable-offload-targets): Fix up a typo in the example, use actual names of supported offload targets. From-SVN: r277552
Jakub Jelinek committed -
PR target/92258 * config/i386/sse.md (iptr): Revert 2019-10-27 change. * gcc.target/i386/pr92258.c: New test. From-SVN: r277551
Jakub Jelinek committed -
From-SVN: r277550
GCC Administrator committed
-
- 28 Oct, 2019 24 commits
-
-
gcc/ChangeLog: * tree-ssa-strlen.c (get_addr_stridx): Add argument and use it. (handle_store): Pass argument to get_addr_stridx. gcc/testsuite/ChangeLog: * gcc.dg/strlenopt-89.c: New test. * gcc.dg/strlenopt-90.c: New test. * gcc.dg/Wstringop-overflow-20.c: New test. From-SVN: r277546
Martin Sebor committed -
gcc/testsuite/ChangeLog: PR tree-optimization/92226 * gcc.dg/strlenopt-88.c: New test. gcc/ChangeLog: PR tree-optimization/92226 * tree-ssa-strlen.c (compare_nonzero_chars): Return -1 also when the offset is in the open range outlined by SI's length. From-SVN: r277545
Martin Sebor committed -
gcc/ChangeLog: PR c/66970 * doc/cpp.texi (__has_builtin): Document. * doc/extend.texi (__builtin_frob_return_addr): Correct spelling. gcc/c/ChangeLog: PR c/66970 * c-decl.c (names_builtin_p): Define a new function. gcc/c-family/ChangeLog: PR c/66970 * c-common.c (c_common_nodes_and_builtins): Call c_define_builtins even when only preprocessing. * c-common.h (names_builtin_p): Declare new function. * c-lex.c (init_c_lex): Set has_builtin. (c_common_has_builtin): Define a new function. * c-ppoutput.c (init_pp_output): Set has_builtin. gcc/cp/ChangeLog: PR c/66970 * cp-objcp-common.c (names_builtin_p): Define new function. gcc/testsuite/ChangeLog: PR c/66970 * c-c++-common/cpp/has-builtin-2.c: New test. * c-c++-common/cpp/has-builtin-3.c: New test. * c-c++-common/cpp/has-builtin.c: New test. From-SVN: r277544
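As a rough illustration of the `__has_builtin` operator that this commit documents, the sketch below guards a use of `__builtin_popcount` and falls back to portable code when the built-in is not reported as available. The function name `count_bits` is hypothetical, and the fallback `#define` is an assumption for compilers that lack the operator entirely.

```c
/* Compilers without __has_builtin treat every query as "not available".
   This fallback definition is an assumption for portability.  */
#ifndef __has_builtin
#define __has_builtin(x) 0
#endif

/* Hypothetical helper: count the set bits in X.  */
unsigned int
count_bits (unsigned int x)
{
#if __has_builtin (__builtin_popcount)
  /* The compiler reports the built-in, so use it directly.  */
  return (unsigned int) __builtin_popcount (x);
#else
  /* Portable fallback: clear the lowest set bit until none remain.  */
  unsigned int n = 0;
  for (; x != 0; x &= x - 1)
    n++;
  return n;
#endif
}
```

Both branches compute the same result, so callers do not need to know which one was selected at preprocessing time.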
Martin Sebor committed -
PR target/82981 * config/mips/mips.md (<u>mulditi3): Generate patterns for high doubleword and low doubleword result of multiplication on MIPS64R6. * gcc.target/mips/mips64r6-ti-mult.c: New test. From-SVN: r277537
Mihailo Stojanovic committed -
cp-demangle.c (d_print_mod): Add a space before printing `complex` and `imaginary`, as opposed to after. * cp-demangle.c (d_print_mod): Add a space before printing `complex` and `imaginary`, as opposed to after. * testsuite/demangle-expected: Adjust test. From-SVN: r277535
Miguel Saldivar committed -
* config/mips/mips.c (DIRECT_BUILTIN_PURE): New macro. Add a pure qualifier to the built-in. (MSA_BUILTIN_PURE): New macro. Add a pure qualifier to the MSA built-ins. (struct mips_builtin_description): Add is_pure flag. (mips_init_builtins): Mark built-in as pure if the flag in the corresponding mips_builtin_description struct is set. * gcc.target/mips/mips-builtins-pure.c: New test. From-SVN: r277534
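To show what a pure qualifier permits in general, here is a minimal sketch using the GNU `pure` function attribute on an ordinary C function. It does not use the MIPS built-ins themselves; `sum_bytes` and `twice_sum` are hypothetical names chosen for illustration.

```c
/* Hypothetical function: it reads memory through P but has no side
   effects, so it can be declared pure (a GNU C extension).  */
__attribute__ ((pure)) int
sum_bytes (const unsigned char *p, int n)
{
  int total = 0;
  for (int i = 0; i < n; i++)
    total += p[i];
  return total;
}

/* Because sum_bytes is pure and nothing is stored between the two
   calls, the compiler is allowed (but not required) to evaluate it
   once and reuse the result.  */
int
twice_sum (const unsigned char *p, int n)
{
  return sum_bytes (p, n) + sum_bytes (p, n);
}
```

Whether the second call is actually removed depends on the optimization level and the compiler version.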
Mihailo Stojanovic committed -
mips-msa.md (msa_insert_<msaftm_f>): Add an alternative which covers the floating-point input value. * config/mips/mips-msa.md (msa_insert_<msaftm_f>): Add an alternative which covers the floating-point input value. Also forbid the split of insert.d pattern for floating-point values. * gcc.target/mips/msa-insert-split.c: New test. From-SVN: r277533
Mihailo Stojanovic committed -
When using the -msave-restore flag we end up with calls to _riscv_save_0 and _riscv_restore_0. These functions adjust the stack and save or restore the return address. Because multiple save/restore stub functions are grouped together, the save/restore 0 calls actually save s0, s1, s2, and the return address, but only the return address actually matters. Leaf functions don't call the save/restore stubs, so whenever we do see a call to the save/restore stubs, the store of the return address is required. Looking at riscv_expand_prologue and riscv_expand_epilogue in gcc/config/riscv/riscv.c, it would be reasonably easy to adjust these functions to avoid the calls to the save/restore stubs in the cases where we are about to call _riscv_save_0 and _riscv_restore_0. However, the actual code size saving this would give is debatable: with linker relaxation, the calls to save/restore are often just 4 bytes, and can sometimes even be 2 bytes, while leaving the stack adjust and return address save inline is always going to be 4 bytes. The interesting case is when we call _riscv_save_0 and _riscv_restore_0, and also have a frame that would (without save/restore) have resulted in a tail call. In this case, if we could remove the save/restore calls and restore the tail call, then we would get a real size saving. The problem is that the choice of generating a tail call or not is made during the gimple expand pass, at which point we don't know how many registers we need to save (or restore). This patch offers a partial solution to this problem. By using the TARGET_MACHINE_DEPENDENT_REORG pass to implement a very limited form of pattern matching, we identify functions that call _riscv_save_0 and _riscv_restore_0 and which could be converted to make use of a tail call. These functions are then converted to the non save/restore tail call form.
This should result in a code size reduction when compiling with -Os and with the -msave-restore flag. gcc/ChangeLog: * config.gcc: Add riscv-sr.o to extra_objs for riscv. * config/riscv/riscv-sr.c: New file. * config/riscv/riscv.c (riscv_reorg): New function. (TARGET_MACHINE_DEPENDENT_REORG): Define. * config/riscv/riscv.h (SIBCALL_REG_P): Define. (riscv_remove_unneeded_save_restore_calls): Declare. * config/riscv/t-riscv (riscv-sr.o): New build rule. gcc/testsuite/ChangeLog: * gcc.target/riscv/save-restore-2.c: New file. * gcc.target/riscv/save-restore-3.c: New file. * gcc.target/riscv/save-restore-4.c: New file. * gcc.target/riscv/save-restore-5.c: New file. * gcc.target/riscv/save-restore-6.c: New file. * gcc.target/riscv/save-restore-7.c: New file. * gcc.target/riscv/save-restore-8.c: New file. From-SVN: r277527
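The kind of source function the description is concerned with can be sketched as follows. The names `callee` and `caller` are hypothetical, and whether a particular compiler emits the save/restore stubs or a tail call for this code depends on its version and options (for example -Os with -msave-restore on RISC-V); this is a minimal sketch of the shape, not one of the testcases added by the commit.

```c
/* Hypothetical callee.  The GNU noinline attribute keeps the call in
   CALLER visible instead of letting it be inlined away.  */
__attribute__ ((noinline)) int
callee (int x)
{
  return x * 2;
}

/* The call to CALLEE is the last action and its result is returned
   unchanged, so the call is in tail position.  Without save/restore
   stubs this needs no frame; it only has to jump to CALLEE.  */
int
caller (int x)
{
  return callee (x + 1);
}
```

A function of this shape only needs its return address preserved across the call, which is the situation where replacing the stub calls with a direct tail call saves code size.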
Andrew Burgess committed -
2019-10-28 Prathamesh Kulkarni <prathamesh.kulkarni@linaro.org> PR tree-optimization/92163 * tree-ssa-dse.c (delete_dead_or_redundant_assignment): New param need_eh_cleanup with default value NULL. Gate on need_eh_cleanup before calling bitmap_set_bit. (dse_optimize_redundant_stores): Pass global need_eh_cleanup to delete_dead_or_redundant_assignment. (dse_dom_walker::dse_optimize_stmt): Likewise. * tree-ssa-dse.h (delete_dead_or_redundant_assignment): Adjust prototype. testsuite/ * gcc.dg/tree-ssa/pr92163.c: New test. From-SVN: r277525
Prathamesh Kulkarni committed -
2019-10-28 Prathamesh Kulkarni <prathamesh.kulkarni@linaro.org> PR middle-end/91272 * tree-vect-stmts.c (vectorizable_condition): Support EXTRACT_LAST_REDUCTION with fully-masked loops. testsuite/ * gcc.target/aarch64/sve/clastb_1.c: Add dg-scan. * gcc.target/aarch64/sve/clastb_2.c: Likewise. * gcc.target/aarch64/sve/clastb_3.c: Likewise. * gcc.target/aarch64/sve/clastb_4.c: Likewise. * gcc.target/aarch64/sve/clastb_5.c: Likewise. * gcc.target/aarch64/sve/clastb_6.c: Likewise. * gcc.target/aarch64/sve/clastb_7.c: Likewise. * gcc.target/aarch64/sve/clastb_8.c: Likewise. From-SVN: r277524
Prathamesh Kulkarni committed -
2019-10-28 Richard Biener <rguenther@suse.de> PR tree-optimization/92252 * tree-vect-slp.c (vect_get_and_check_slp_defs): Adjust STMT_VINFO_REDUC_IDX when swapping operands. * gcc.dg/torture/pr92252.c: New testcase. From-SVN: r277517
Richard Biener committed -
2019-10-28 Richard Biener <rguenther@suse.de> PR tree-optimization/92241 * tree-vect-loop.c (vect_fixup_scalar_cycles_with_patterns): When we failed to update the reduction index do not use the pattern stmts for the reduction chain. (vectorizable_reduction): When the reduction chain is corrupt, fail. * tree-vect-patterns.c (vect_mark_pattern_stmts): Stop when we fail to update the reduction chain. * gcc.dg/torture/pr92241.c: New testcase. From-SVN: r277516
Richard Biener committed -
https://gcc.gnu.org/ml/gcc-patches/2019-10/msg01962.html We use an eof_token global variable as a sentinel on a deferred parse (such as in-class function definitions, or default args). This complicates retrieving the next token in certain places. As such deferred parses always nest properly and completely before resuming the outer lexer, we can simply morph the token after the deferred buffer into a CPP_EOF token and restore it afterwards. I finally got around to implementing it with this patch. One complication is that we have to change the discriminator for when the token's value is a tree. We can't look at the token's type because it might have been overwritten. I add a bool flag to the token (there's several spare bits), and use that. This does simplify the discriminator because we just check a single bit, rather than a set of token types. * parser.h (struct cp_token): Drop {ENUM,BOOL}_BITFIELD C-ism. Add tree_check_p flag, use as nested union discriminator. (struct cp_lexer): Add saved_type & saved_keyword fields. * parser.c (eof_token): Delete. (cp_lexer_new_main): Always init last_token to last token of buffer. (cp_lexer_new_from_tokens): Overlay EOF token at end of range. (cp_lexer_destroy): Restore token under the EOF. (cp_lexer_previous_token_position): No check for eof_token here. (cp_lexer_get_preprocessor_token): Clear tree_check_p. (cp_lexer_peek_nth_token): Check CPP_EOF not eof_token. (cp_lexer_consume_token): Assert not CPP_EOF, no check for eof_token. (cp_lexer_purge_token): Likewise. (cp_lexer_purge_tokens_after): No check for EOF token. (cp_parser_nested_name_specifier, cp_parser_decltype) (cp_parser_template_id): Set tree_check_p. From-SVN: r277514
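The overlay technique described above can be sketched in plain C, outside the real parser. All types and function names below (`struct token`, `struct sub_lexer`, `sub_lexer_begin`, and so on) are hypothetical stand-ins, with `TOK_EOF` playing the role of CPP_EOF and `saved_type` mirroring the idea of the new saved_type field. The tree_check_p discriminator is not modelled.

```c
/* Hypothetical token kinds; TOK_EOF plays the role of CPP_EOF.  */
enum tok_type { TOK_NAME, TOK_NUMBER, TOK_EOF };

struct token
{
  enum tok_type type;
  int value;
};

/* A lexer over a sub-range of an existing token buffer.  */
struct sub_lexer
{
  struct token *next;        /* Next token to hand out.  */
  struct token *end;         /* Token temporarily overlaid with TOK_EOF.  */
  enum tok_type saved_type;  /* Original type of *END.  */
};

/* Start lexing [FIRST, END).  END must point at a real token in the
   buffer; its type is saved and then overwritten with TOK_EOF.  */
void
sub_lexer_begin (struct sub_lexer *lex, struct token *first,
                 struct token *end)
{
  lex->next = first;
  lex->end = end;
  lex->saved_type = end->type;
  end->type = TOK_EOF;
}

/* Finish the nested parse and put the overwritten token type back.  */
void
sub_lexer_end (struct sub_lexer *lex)
{
  lex->end->type = lex->saved_type;
}

/* Count the tokens in BUF[FIRST..END-1] by reading until TOK_EOF.  */
int
count_deferred_tokens (struct token *buf, int first, int end)
{
  struct sub_lexer lex;
  int n = 0;

  sub_lexer_begin (&lex, buf + first, buf + end);
  while (lex.next->type != TOK_EOF)
    {
      lex.next++;
      n++;
    }
  sub_lexer_end (&lex);
  return n;
}

/* Demo: returns the token count, or -1 if the overlaid token was not
   restored to its original type.  */
int
deferred_parse_demo (void)
{
  struct token buf[] = {
    { TOK_NAME, 0 }, { TOK_NUMBER, 1 }, { TOK_NAME, 2 }, { TOK_NUMBER, 3 }
  };
  int n = count_deferred_tokens (buf, 1, 3);
  return buf[3].type == TOK_NUMBER ? n : -1;
}
```

The design relies on nested parses finishing completely before the outer lexer resumes, so at most one saved type per active sub-lexer is needed and no separate sentinel object has to be compared against.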
Nathan Sidwell committed -
2019-10-28 Richard Biener <rguenther@suse.de> * tree-vect-loop.c (vect_create_epilog_for_reduction): Use STMT_VINFO_REDUC_IDX from the actual stmt. (vect_transform_reduction): Likewise. (vectorizable_reduction): Compute the reduction chain length, do not recompute the reduction operand index. Remove no longer necessary restriction for condition reduction chains. From-SVN: r277513
Richard Biener committed -
2019-10-28 Richard Biener <rguenther@suse.de> PR c/92249 * gimple-parser.c (c_parser_parse_gimple_body): Make current_bb the entry block initially to easier recover from errors. (c_parser_gimple_compound_statement): Adjust. From-SVN: r277512
Richard Biener committed -
PR target/92225 * config/i386/sse.md (REDUC_SSE_SMINMAX_MODE): Use TARGET_SSE4_2 condition for V2DImode. testsuite/ChangeLog: PR target/92225 * gcc.target/i386/pr92225.c: New test. From-SVN: r277510
Uros Bizjak committed -
* config/i386/sse.md (sse_cvtss2si<rex64namesuffix>_2): Remove %k operand modifier. (*vec_extractv2df_1_sse): Remove %q operand modifier. From-SVN: r277509
Uros Bizjak committed -
where LIM interacts with foo10. On 64-bit, LIM doesn't do the problematic change for whatever reason, but it seems better to disable LIM altogether, which requires a minor change in the testcase. From-SVN: r277508
Michael Matz committed -
r266734 has introduced a new instance of jump threading pass in order to take advantage of opportunities that combine opens up. It was perceived back then that it was beneficial to delay it after reload, since that might produce even more such opportunities. Unfortunately jump threading interferes with hot/cold partitioning. In the code from PR92007, it converts the following

  +-------------------------- 2/HOT ------------------------+
  |                                                         |
  v                                                         v
3/HOT --> 5/HOT --> 8/HOT --> 11/COLD --> 6/HOT --EH--> 16/HOT
            |                               ^
            |                               |
            +-------------------------------+

into the following:

  +---------------------- 2/HOT ------------------+
  |                                               |
  v                                               v
3/HOT --> 8/HOT --> 11/COLD --> 6/COLD --EH--> 16/HOT

This makes hot bb 6 dominated by cold bb 11, and because of this fixup_partitions makes bb 6 cold as well, which in turn makes EH edge 6->16 a crossing one. Not only can't we have crossing EH edges, we are also not allowed to introduce new crossing edges after reload in general, since it might require extra registers on some targets. Therefore, move the jump threading pass between combine and hot/cold partitioning. Building SPEC 2006 and SPEC 2017 with the old and the new code indicates that:

* When doing jump threading right after reload, 3889 edges are threaded.
* When doing jump threading right after combine, 3918 edges are threaded.

This means this change will not introduce performance regressions. gcc/ChangeLog: 2019-10-28 Ilya Leoshkevich <iii@linux.ibm.com> PR rtl-optimization/92007 * cfgcleanup.c (thread_jump): Add an assertion that we don't call it after reload if hot/cold partitioning has been done. (class pass_postreload_jump): Rename to pass_jump_after_combine. (make_pass_postreload_jump): Rename to make_pass_jump_after_combine. * passes.def (pass_postreload_jump): Move before reload, rename to pass_jump_after_combine. * tree-pass.h (make_pass_postreload_jump): Rename to make_pass_jump_after_combine.
gcc/testsuite/ChangeLog: 2019-10-28 Ilya Leoshkevich <iii@linux.ibm.com> PR rtl-optimization/92007 * g++.dg/opt/pr92007.C: New test (from Arseny Solokha). From-SVN: r277507
Ilya Leoshkevich committed -
PR ipa/92242 * ipa-fnsummary.c (ipa_merge_fn_summary_after_inlining): Check for missing EDGE_REF. * ipa-prop.c (update_jump_functions_after_inlining): Likewise. From-SVN: r277504
Jan Hubicka committed -
* testsuite/libgomp.oacc-fortran/abort-1.f90: Add 'dg-do run'. * testsuite/libgomp.oacc-fortran/abort-2.f90: Ditto. * testsuite/libgomp.oacc-fortran/acc_on_device-1-1.f90: Ditto. * testsuite/libgomp.oacc-fortran/acc_on_device-1-2.f90: Ditto. * testsuite/libgomp.oacc-fortran/acc_on_device-1-3.f90: Ditto. * testsuite/libgomp.oacc-fortran/lib-1.f90: Ditto. * testsuite/libgomp.oacc-fortran/common-block-1.f90: Use 'stop' not abort(). * testsuite/libgomp.oacc-fortran/common-block-2.f90: Ditto. * testsuite/libgomp.oacc-fortran/common-block-3.f90: Ditto. * testsuite/libgomp.oacc-fortran/data-1.f90: Ditto. * testsuite/libgomp.oacc-fortran/data-2.f90: Ditto. * testsuite/libgomp.oacc-fortran/data-5.f90: Ditto. * testsuite/libgomp.oacc-fortran/dummy-array.f90: Ditto. * testsuite/libgomp.oacc-fortran/gemm-2.f90: Ditto. * testsuite/libgomp.oacc-fortran/gemm.f90: Ditto. * testsuite/libgomp.oacc-fortran/host_data-2.f90: Ditto. * testsuite/libgomp.oacc-fortran/host_data-3.f90: Ditto. * testsuite/libgomp.oacc-fortran/host_data-4.f90: Ditto. * testsuite/libgomp.oacc-fortran/kernels-collapse-3.f90: Ditto. * testsuite/libgomp.oacc-fortran/kernels-collapse-4.f90: Ditto. * testsuite/libgomp.oacc-fortran/kernels-independent.f90: Ditto. * testsuite/libgomp.oacc-fortran/kernels-loop-1.f90: Ditto. * testsuite/libgomp.oacc-fortran/kernels-map-1.f90: Ditto. * testsuite/libgomp.oacc-fortran/kernels-parallel-loop-data-enter-exit.f95: Ditto. * testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-gang-1.f90: Ditto. * testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-gang-2.f90: Ditto. * testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-gang-3.f90: Ditto. * testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-gang-6.f90: Ditto. * testsuite/libgomp.oacc-fortran/kernels-private-vars-vector-1.f90: Ditto. * testsuite/libgomp.oacc-fortran/kernels-private-vars-vector-2.f90: Ditto. * testsuite/libgomp.oacc-fortran/kernels-private-vars-worker-1.f90: Ditto. 
* testsuite/libgomp.oacc-fortran/kernels-private-vars-worker-2.f90: Ditto. * testsuite/libgomp.oacc-fortran/kernels-private-vars-worker-3.f90: Ditto. * testsuite/libgomp.oacc-fortran/kernels-private-vars-worker-4.f90: Ditto. * testsuite/libgomp.oacc-fortran/kernels-private-vars-worker-5.f90: Ditto. * testsuite/libgomp.oacc-fortran/kernels-private-vars-worker-6.f90: Ditto. * testsuite/libgomp.oacc-fortran/kernels-private-vars-worker-7.f90: Ditto. * testsuite/libgomp.oacc-fortran/kernels-reduction-1.f90: Ditto. * testsuite/libgomp.oacc-fortran/lib-12.f90: Ditto. * testsuite/libgomp.oacc-fortran/lib-13.f90: Ditto. * testsuite/libgomp.oacc-fortran/lib-14.f90: Ditto. * testsuite/libgomp.oacc-fortran/kernels-acc-loop-reduction-2.f90: Likewise and also add 'dg-do run'. * testsuite/libgomp.oacc-fortran/kernels-acc-loop-reduction.f90: Ditto. From-SVN: r277503
Tobias Burnus committed -
PR fortran/91863 * trans-expr.c (gfc_conv_gfc_desc_to_cfi_desc): Don't free data memory as that's done on the Fortran side. (gfc_conv_procedure_call): Handle void* pointers from gfc_conv_gfc_desc_to_cfi_desc. PR fortran/91863 * gfortran.dg/bind-c-intent-out.f90: New. From-SVN: r277502
Tobias Burnus committed -
In PR88760, there was some discussion about improving or tuning the unroller for targets, and the agreement was to enable the unroller for small loops at -O2 first. We see a performance improvement (~10%) for the code below: ``` subroutine foo (i, i1, block) integer :: i, i1 integer :: block(9, 9, 9) block(i:9,1,i1) = block(i:9,1,i1) - 10 end subroutine foo ``` This kind of code occurs a few times in the exchange2 benchmark. Similar C code: ``` for (i = 0; i < n; i++) arr[i] = arr[i] - 10; ``` On powerpcle, enabling -funroll-loops at -O2 with PARAM_MAX_UNROLL_TIMES=2 and PARAM_MAX_UNROLLED_INSNS=20 gives a >2% overall improvement on SPEC2017. This patch is only for rs6000, where we see a visible performance improvement. gcc/ 2019-10-25 Jiufu Guo <guojiufu@linux.ibm.com> PR tree-optimization/88760 * config/rs6000/rs6000-common.c (rs6000_option_optimization_table): Enable -funroll-loops for -O2 and above. * config/rs6000/rs6000.c (rs6000_option_override_internal): Set PARAM_MAX_UNROLL_TIMES to 2 and PARAM_MAX_UNROLLED_INSNS to 20, and do not turn on web and rngreg implicitly, if the unroller is not explicitly enabled. gcc/testsuite/ 2019-10-25 Jiufu Guo <guojiufu@linux.ibm.com> PR tree-optimization/88760 * gcc.target/powerpc/small-loop-unroll.c: New test. * c-c++-common/tsan/thread_leak2.c: Update test. * gcc.dg/pr59643.c: Update test. * gcc.target/powerpc/loop_align.c: Update test. * gcc.target/powerpc/ppc-fma-1.c: Update test. * gcc.target/powerpc/ppc-fma-2.c: Update test. * gcc.target/powerpc/ppc-fma-3.c: Update test. * gcc.target/powerpc/ppc-fma-4.c: Update test. * gcc.target/powerpc/pr78604.c: Update test. From-SVN: r277501
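To make the effect of an unroll factor of 2 concrete, the sketch below places the C loop from the message next to a hand-unrolled equivalent. The function names are hypothetical, and the unrolled form is only an approximation written by hand; the code the unroller actually generates may be structured differently.

```c
/* The loop from the commit message, wrapped in a function.  */
void
sub10 (int *arr, int n)
{
  for (int i = 0; i < n; i++)
    arr[i] = arr[i] - 10;
}

/* Hand-unrolled by 2: two elements per iteration, followed by a
   single leftover element when N is odd.  */
void
sub10_unrolled2 (int *arr, int n)
{
  int i = 0;
  for (; i + 1 < n; i += 2)
    {
      arr[i] = arr[i] - 10;
      arr[i + 1] = arr[i + 1] - 10;
    }
  if (i < n)
    arr[i] = arr[i] - 10;
}

/* Return 1 if both versions agree on a 5-element array, else 0.  */
int
unroll_results_match (void)
{
  int a[5] = { 10, 20, 30, 40, 50 };
  int b[5] = { 10, 20, 30, 40, 50 };

  sub10 (a, 5);
  sub10_unrolled2 (b, 5);
  for (int i = 0; i < 5; i++)
    if (a[i] != b[i])
      return 0;
  return 1;
}
```

The unrolled version executes roughly half as many loop-control branches, which is where the benefit for such small loop bodies comes from.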
Jiufu Guo committed -
From-SVN: r277499
GCC Administrator committed
-
- 27 Oct, 2019 2 commits
-
-
From-SVN: r277492
Jakub Jelinek committed -
2019-10-27 Andreas Tobler <andreast@gcc.gnu.org> * gcc.c-torture/execute/fprintf-2.c: Silence a Free/NetBSD libc warning. * gcc.c-torture/execute/printf-2.c: Likewise. * gcc.c-torture/execute/user-printf.c: Likewise. From-SVN: r277491
Andreas Tobler committed
-