Commit 313465bb by Diego Novillo

Add more C++ support in gengtype.

This patch combines the changes from
http://gcc.gnu.org/ml/gcc-patches/2012-08/msg02016.html with other
additions to support C++ inside GTY'd structures.

The main changes wrt Aaron's original patch are:

- Support for function declarations inside classes.

- Support scoping in identifiers.  This does not mean that gengtype
  supports scopes, it just knows that 'Foo::id' is a single entity.

- Explicit non-support for typedef and enum inside class/struct.
  Since gengtype does not really know about scopes, it cannot
  understand these types, but it knows enough to recognize and reject
  them.  A GTY'd struct/class that needs to typedef its own types
  should use GTY((user)).

- Documentation on what is and is not supported.
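
As a rough illustration of the scoping bullet above, the sketch below
mirrors the ID pattern this patch adds to gengtype-lex.l, using
std::regex instead of flex.  The function name match_scoped_id is
hypothetical and is not part of gengtype; the real lexer is generated
by flex and may differ in detail.

```cpp
#include <regex>
#include <string>

// Approximates these flex definitions from gengtype-lex.l:
//   CID  [[:alpha:]_][[:alnum:]_]*
//   HWS  [ \t\r\v\f]*
//   ID   {CID}({HWS}::{HWS}{CID})*
// Returns the longest scoped identifier found at the start of TEXT,
// or an empty string if TEXT does not start with an identifier.
// (Hypothetical helper for illustration only.)
std::string match_scoped_id(const std::string &text)
{
  static const std::regex id_re(
      R"(^[A-Za-z_][A-Za-z0-9_]*(?:[ \t\r\v\f]*::[ \t\r\v\f]*[A-Za-z_][A-Za-z0-9_]*)*)");
  std::smatch m;
  if (std::regex_search(text, m, id_re))
    return m.str(0);
  return std::string();
}
```

With this pattern, 'Foo::id' is returned as one token, but nothing is
resolved: the result is just a string, which matches the commit
message's point that gengtype recognizes scoped names without
supporting scopes.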

There is one check I needed to remove that gave me some trouble.
When a ctor is detected, we have already parsed the name of the
ctor as a type, which is then registered in the list of structures.

We go on to recognize it as a ctor *after* the type has been
registered.  We reject the field in declarator() and it is never
added to the list of fields for the class.

However, when we reach the end of the class, we find that the
type we created while parsing the ctor has line number
information in it (the line where the ctor was) and gengtype
thinks that it is a duplicate structure definition.

I took out this check for two reasons: (a) it is actually
unnecessary, because if there really were duplicate definitions of
this structure the code would not compile, and (b) all the other
alternatives required making the parser much more convoluted, and
I'm trying hard not to make the gengtype parser too smart.
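
The effect of removing the check can be sketched with a simplified
model of the condition in new_structure.  StructInfo and both function
names are hypothetical stand-ins: the real code tests
s->u.s.line.file and s->u.s.lang_struct->u.s.bitmap, and this model
collapses the lang_struct pointer into a plain bitmap field.

```cpp
// Simplified, hypothetical stand-in for the fields new_structure inspects.
struct StructInfo {
  const char *file = nullptr;  // models s->u.s.line.file
  unsigned lang_bitmap = 0;    // models s->u.s.lang_struct->u.s.bitmap
};

// Condition before the patch: a recorded source location alone is
// enough to report a duplicate definition.
bool is_duplicate_old(const StructInfo &s, unsigned bitmap)
{
  return s.file != nullptr || (s.lang_bitmap & bitmap) != 0;
}

// Condition after the patch: only the per-language bitmap is checked,
// so a type that merely picked up a location while a ctor was being
// parsed is no longer flagged.
bool is_duplicate_new(const StructInfo &s, unsigned bitmap)
{
  return (s.lang_bitmap & bitmap) != 0;
}
```

In this model, a structure whose only "history" is a location left
behind by its ctor is rejected by the old condition and accepted by
the new one, while a structure already claimed by the same language
bitmap is rejected by both.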

2012-10-12  Aaron Gray <aaronngray.lists@gmail.com>
	    Diego Novillo <dnovillo@google.com>

	* gengtype-lex.l: Support for C++ single line comments.
	Support for classes.
	(CXX_KEYWORD): New.  Support C++ keywords inline, public,
	protected, private, template, operator, friend, &, ~.
	(TYPEDEF): New.  Support typedef.
	* gengtype-parse.c: Update 'token_names[]'.
	(direct_declarator): Add support for parsing functions
	and ctors.

2012-10-12  Diego Novillo  <dnovillo@google.com>

	* doc/gty.texi: Document C++ limitations in gengtype.
	* gengtype-lex.l (CID): Rename from ID.
	(ID): Include scoping '::' as part of the identifier name.
	* gengtype-parse.c (token_names): Update.
	(token_value_format): Update.
	(consume_until_eos): Rename from consume_until_semi.
	Remove unused argument IMMEDIATE.  Update all callers.
	Also consider '}' as a finalizer.
	(consume_until_comma_or_eos): Rename from
	consume_until_comma_or_semi.
	Remove unused argument IMMEDIATE.  Update all callers.
	Also consider '}' as a finalizer.
	(direct_declarator): Add documentation on ctor support.
	Add argument IN_STRUCT.
	If the token following ID is a '(', consider ID a
	function and return NULL.
	If the token following '(' is not a '*', and IN_STRUCT is
	true, conclude that this is a ctor and return NULL.
	If the token is IGNORABLE_CXX_KEYWORD, return NULL.
	(inner_declarator): Add argument IN_STRUCT.
	Update all callers.
	(declarator): Add argument IN_STRUCT with default value
	false.  Update all callers.
	(type): Document argument NESTED.
	Skip over C++ inheritance specifiers.
	If a token TYPEDEF is found, emit an error.
	If an enum is found inside a class/structure, emit an
	error.
	(typedefs, structures, param_structs, variables): Initialize.
	(new_structure): Do not complain about duplicate
	structures if S has a line location set.
	* gengtype-state.c (write_state_type): Remove default
	handler.  Add handler for TYPE_NONE.
	(read_state_scalar_char_type):
	* gengtype.c: Fix spacing.
	* gengtype.h (enum gty_token): Add name.  Add token
	IGNORABLE_CXX_KEYWORD.
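
The three direct_declarator rules listed above can be sketched as a
small classifier over the tokens that follow the type.  This is only
an approximation under stated assumptions: DeclKind,
classify_declarator and the string-based tokens are hypothetical, and
the real parser works on lexer tokens and returns NULL for rejected
declarators rather than an enum.

```cpp
#include <cctype>
#include <set>
#include <string>
#include <vector>

// Hypothetical result type; the real parser returns NULL for everything
// except Field.
enum class DeclKind { Field, Function, Ctor, IgnoredKeyword };

// Spellings the lexer reports as IGNORABLE_CXX_KEYWORD inside a struct.
static bool is_ignorable(const std::string &tok)
{
  static const std::set<std::string> words = {
      "inline", "public:", "private:", "protected:",
      "template", "operator", "friend", "~", "&"};
  return words.count(tok) != 0;
}

static bool is_identifier(const std::string &tok)
{
  return !tok.empty()
         && (std::isalpha(static_cast<unsigned char>(tok[0])) || tok[0] == '_');
}

// TOKS holds the declarator tokens that follow the already-parsed type.
// For a ctor 'Foo ();' the name 'Foo' was consumed as the type, so the
// declarator starts directly with '('.
DeclKind classify_declarator(const std::vector<std::string> &toks,
                             bool in_struct)
{
  if (toks.empty())
    return DeclKind::Field;  // nothing here for these rules to reject
  if (is_ignorable(toks[0]))
    return DeclKind::IgnoredKeyword;
  if (toks[0] == "(")
    {
      // '(' not followed by '*' inside a struct is taken to be a ctor;
      // '(' followed by '*' is a parenthesized pointer declarator.
      if (in_struct && (toks.size() < 2 || toks[1] != "*"))
        return DeclKind::Ctor;
      return DeclKind::Field;
    }
  if (is_identifier(toks[0]) && toks.size() > 1 && toks[1] == "(")
    return DeclKind::Function;  // ID followed by '(' is a function
  return DeclKind::Field;
}
```

Only declarators classified as Field would be added to the field list;
the other three kinds correspond to the cases where the ChangeLog says
direct_declarator returns NULL.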

From-SVN: r192405
parent b09e6a70
2012-10-12 Aaron Gray <aaronngray.lists@gmail.com>
Diego Novillo <dnovillo@google.com>
* gengtype-lex.l: Support for C++ single line comments.
Support for classes.
(CXX_KEYWORD): New. Support C++ keywords inline, public,
protected, private, template, operator, friend, &, ~.
(TYPEDEF): New. Support typedef.
* gengtype-parser.c: updated 'token_names[]'
(direct_declarator): Add support for parsing functions
and ctors.
2012-10-12 Diego Novillo <dnovillo@google.com>
* doc/gty.texi: Document C++ limitations in gengtype.
* gengtype-lex.l (CID): Rename from ID.
(ID): Include scoping '::' as part of the identifier name.
* gengtype-parse.c (token_names): Update.
(token_value_format): Update.
(consume_until_eos): Rename from consume_until_semi.
Remove unused argument IMMEDIATE. Update all callers.
Also consider '}' as a finalizer.
(consume_until_comma_or_eos): Rename from
consume_until_comma_or_semi.
Remove unused argument IMMEDIATE. Update all callers.
Also consider '}' as a finalizer.
(direct_declarator): Add documentation on ctor support.
Add argument IN_STRUCT.
If the token following ID is a '(', consider ID a
function and return NULL.
If the token following '(' is not a '*', and IN_STRUCT is
true, conclude that this is a ctor and return NULL.
If the token is IGNORABLE_CXX_KEYWORD, return NULL.
(inner_declarator): Add argument IN_STRUCT.
Update all callers.
(declarator): Add argument IN_STRUCT with default value
false. Update all callers.
(type): Document argument NESTED.
Skip over C++ inheritance specifiers.
If a token TYPEDEF is found, emit an error.
If an enum is found inside a class/structure, emit an
error.
(typedefs, structures, param_structs, variables): Initialize.
(new_structure): Do not complain about duplicate
structures if S has a line location set.
* gengtype-state.c (write_state_type): Remove default
handler. Add handler for TYPE_NONE.
(read_state_scalar_char_type):
* gengtype.c: Fix spacing.
* gengtype.h (enum gty_token): Add name. Add token
IGNORABLE_CXX_KEYWORD.
2012-10-12 Chung-Lin Tang <cltang@codesourcery.com>
* config/arm/arm.md (get_thread_pointersi): Moved to place with
......@@ -65,6 +65,27 @@ The parser understands simple typedefs such as
@code{typedef int @var{name};}.
These don't need to be marked.
Since @code{gengtype}'s understanding of C++ is limited, there are
several constructs and declarations that are not supported inside
classes/structures marked for automatic GC code generation. The
following C++ constructs produce a @code{gengtype} error on
structures/classes marked for automatic GC code generation:
@itemize @bullet
@item
Type definitions inside classes/structures are not supported.
@item
Enumerations inside classes/structures are not supported.
@end itemize
If you have a class or structure using any of the above constructs,
you need to mark that class as @code{GTY ((user))} and provide your
own marking routines (see section @ref{User GC} for details).
It is always valid to include function definitions inside classes.
Those are always ignored by @code{gengtype}, as it only cares about
data members.
@menu
* GTY Options:: What goes inside a @code{GTY(())}.
* GGC Roots:: Making global variables GGC roots.
......
......@@ -50,12 +50,15 @@ update_lineno (const char *l, size_t len)
%}
-ID [[:alpha:]_][[:alnum:]_]*
+CID [[:alpha:]_][[:alnum:]_]*
WS [[:space:]]+
HWS [ \t\r\v\f]*
IWORD short|long|(un)?signed|char|int|HOST_WIDE_INT|HOST_WIDEST_INT|bool|size_t|BOOL_BITFIELD|CPPCHAR_SIGNED_T|ino_t|dev_t|HARD_REG_SET
ITYPE {IWORD}({WS}{IWORD})*
+/* Include '::' in identifiers to capture C++ scope qualifiers. */
+ID {CID}({HWS}::{HWS}{CID})*
EOID [^[:alnum:]_]
+CXX_KEYWORD inline|public:|private:|protected:|template|operator|friend
%x in_struct in_struct_comment in_comment
%option warn noyywrap nounput nodefault perf-report
......@@ -83,6 +86,10 @@ EOID [^[:alnum:]_]
BEGIN(in_struct);
return UNION;
}
^{HWS}class/{EOID} {
BEGIN(in_struct);
return STRUCT;
}
^{HWS}extern/{EOID} {
BEGIN(in_struct);
return EXTERN;
......@@ -93,18 +100,27 @@ EOID [^[:alnum:]_]
}
}
/* Parsing inside a struct, union or class declaration. */
<in_struct>{
"/*" { BEGIN(in_struct_comment); }
"//".*\n { lexer_line.line++; }
{WS} { update_lineno (yytext, yyleng); }
\\\n { lexer_line.line++; }
"const"/{EOID} /* don't care */
{CXX_KEYWORD}/{EOID} |
"~" |
"&" {
*yylval = XDUPVAR (const char, yytext, yyleng, yyleng + 1);
return IGNORABLE_CXX_KEYWORD;
}
"GTY"/{EOID} { return GTY_TOKEN; }
"VEC"/{EOID} { return VEC_TOKEN; }
"union"/{EOID} { return UNION; }
"struct"/{EOID} { return STRUCT; }
"class"/{EOID} { return STRUCT; }
"typedef"/{EOID} { return TYPEDEF; }
"enum"/{EOID} { return ENUM; }
"ptr_alias"/{EOID} { return PTR_ALIAS; }
"nested_ptr"/{EOID} { return NESTED_PTR; }
......@@ -127,7 +143,6 @@ EOID [^[:alnum:]_]
return SCALAR;
}
{ID}/{EOID} {
*yylval = XDUPVAR (const char, yytext, yyleng, yyleng+1);
return ID;
......@@ -148,7 +163,7 @@ EOID [^[:alnum:]_]
}
"..." { return ELLIPSIS; }
-[(){},*:<>;=%|-] { return yytext[0]; }
+[(){},*:<>;=%|+-] { return yytext[0]; }
/* ignore pp-directives */
^{HWS}"#"{HWS}[a-z_]+[^\n]*\n {lexer_line.line++;}
......@@ -159,6 +174,7 @@ EOID [^[:alnum:]_]
}
"/*" { BEGIN(in_comment); }
"//".*\n { lexer_line.line++; }
\n { lexer_line.line++; }
{ID} |
"'"("\\".|[^\\])"'" |
......@@ -172,6 +188,7 @@ EOID [^[:alnum:]_]
[^*\n] /* do nothing */
"*"/[^/] /* do nothing */
}
<in_comment>"*/" { BEGIN(INITIAL); }
<in_struct_comment>"*/" { BEGIN(in_struct); }
......
......@@ -961,6 +961,8 @@ write_state_type (type_p current)
current->state_number = state_written_type_count;
switch (current->kind)
{
case TYPE_NONE:
gcc_unreachable ();
case TYPE_STRUCT:
write_state_struct_type (current);
break;
......@@ -988,9 +990,6 @@ write_state_type (type_p current)
case TYPE_STRING:
write_state_string_type (current);
break;
default:
fatal ("Unexpected type...");
}
}
......@@ -1318,7 +1317,6 @@ read_state_scalar_char_type (type_p *type)
read_state_common_type_content (*type);
}
/* Read the string_type. */
static void
read_state_string_type (type_p *type)
......
......@@ -497,10 +497,10 @@ struct type scalar_char = {
/* Lists of various things. */
-pair_p typedefs;
-type_p structures;
-type_p param_structs;
-pair_p variables;
+pair_p typedefs = NULL;
+type_p structures = NULL;
+type_p param_structs = NULL;
+pair_p variables = NULL;
static type_p find_param_structure (type_p t, type_p param[NUM_PARAM]);
static type_p adjust_field_tree_exp (type_p t, options_p opt);
......@@ -611,6 +611,7 @@ resolve_typedef (const char *s, struct fileloc *pos)
return create_user_defined_type (s, pos);
}
/* Create and return a new structure with tag NAME at POS with fields
FIELDS and options O. The KIND of structure must be one of
TYPE_STRUCT, TYPE_UNION or TYPE_USER_STRUCT. */
......@@ -676,8 +677,7 @@ new_structure (const char *name, enum typekind kind, struct fileloc *pos,
structures = s;
}
-if (s->u.s.line.file != NULL
-|| (s->u.s.lang_struct && (s->u.s.lang_struct->u.s.bitmap & bitmap)))
+if (s->u.s.lang_struct && (s->u.s.lang_struct->u.s.bitmap & bitmap))
{
error_at_line (pos, "duplicate definition of '%s %s'",
isunion ? "union" : "struct", s->u.s.tag);
......@@ -763,6 +763,7 @@ create_scalar_type (const char *name)
return &scalar_nonchar;
}
/* Return a pointer to T. */
type_p
......@@ -2636,7 +2637,7 @@ walk_type (type_p t, struct walk_type_data *d)
/* If a pointer type is marked as "atomic", we process the
field itself, but we don't walk the data that they point to.
There are two main cases where we walk types: to mark
pointers that are reachable, and to relocate pointers when
writing a PCH file. In both cases, an atomic pointer is
......@@ -3514,7 +3515,7 @@ write_func_for_structure (type_p orig_s, type_p s, type_p *param,
{
oprintf (d.of, " %s (x);\n", mark_hook_name);
}
d.prev_val[2] = "*x";
d.indent = 6;
if (orig_s->kind != TYPE_USER_STRUCT)
......
......@@ -308,7 +308,6 @@ struct type {
type_p param[NUM_PARAM]; /* The actual parameter types. */
struct fileloc line; /* The source location. */
} param_struct;
} u;
};
......@@ -444,38 +443,38 @@ extern void parse_file (const char *name);
extern bool hit_error;
/* Token codes. */
enum
{
EOF_TOKEN = 0,
/* Per standard convention, codes in the range (0, UCHAR_MAX]
represent single characters with those character codes. */
CHAR_TOKEN_OFFSET = UCHAR_MAX + 1,
GTY_TOKEN = CHAR_TOKEN_OFFSET,
TYPEDEF,
EXTERN,
STATIC,
UNION,
STRUCT,
ENUM,
VEC_TOKEN,
ELLIPSIS,
PTR_ALIAS,
NESTED_PTR,
USER_GTY,
PARAM_IS,
NUM,
SCALAR,
ID,
STRING,
CHAR,
ARRAY,
/* print_token assumes that any token >= FIRST_TOKEN_WITH_VALUE may have
a meaningful value to be printed. */
FIRST_TOKEN_WITH_VALUE = PARAM_IS
};
enum gty_token
{
EOF_TOKEN = 0,
/* Per standard convention, codes in the range (0, UCHAR_MAX]
represent single characters with those character codes. */
CHAR_TOKEN_OFFSET = UCHAR_MAX + 1,
GTY_TOKEN = CHAR_TOKEN_OFFSET,
TYPEDEF,
EXTERN,
STATIC,
UNION,
STRUCT,
ENUM,
VEC_TOKEN,
ELLIPSIS,
PTR_ALIAS,
NESTED_PTR,
USER_GTY,
PARAM_IS,
NUM,
SCALAR,
ID,
STRING,
CHAR,
ARRAY,
IGNORABLE_CXX_KEYWORD,
/* print_token assumes that any token >= FIRST_TOKEN_WITH_VALUE may have
a meaningful value to be printed. */
FIRST_TOKEN_WITH_VALUE = PARAM_IS
};
/* Level for verbose messages, e.g. output file generation... */
......