Commit 313465bb by Diego Novillo

Add more C++ support in gengtype.

This patch combines the changes from
http://gcc.gnu.org/ml/gcc-patches/2012-08/msg02016.html with other
additions to support C++ inside GTY'd structures.

The main changes wrt Aaron's original patch are:

- Support for function declarations inside classes.

- Support scoping in identifiers.  This does not mean that gengtype
  supports scopes, it just knows that 'Foo::id' is a single entity.

- Explicit non-support for typedef and enum inside class/struct.
  Since gengtype does not really know about scopes, it cannot
  understand these types, but it knows enough to recognize and reject
  them.  GTY'd structs/classes that need to define their own types
  should use GTY((user)).

- Documentation on what is and is not supported.
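
As an illustration of the list above, here is a minimal sketch of the
class shapes involved.  It is not taken from the patch: the names
counter, palette, ns::node and is_scoped_id are hypothetical, GTY is
defined away so the snippet compiles outside GCC, and the regular
expression is a hand transcription of the new ID lexer rule that
assumes the C locale for [[:alpha:]] and [[:alnum:]].

```cpp
#include <cassert>
#include <regex>
#include <string>

// Stand-in so the sketch compiles outside GCC; gengtype reads the real
// GTY annotation straight from the source text.
#define GTY(x)

namespace ns { struct node { int value; }; }

// Shape the patch is meant to accept: data members next to member
// function declarations.  gengtype only cares about the data members.
class GTY(()) counter
{
public:
  counter ();        // ctor declaration: skipped
  int bump ();       // function declaration: skipped
  int hits;          // data member
  ns::node *head;    // 'ns::node' is lexed as a single identifier
};

counter::counter () : hits (0), head (nullptr) {}
int counter::bump () { return ++hits; }

// A nested typedef or enum is rejected, so a class that needs one is
// marked GTY((user)) and supplies its own marking routines (omitted).
class GTY((user)) palette
{
public:
  enum color { RED, GREEN };
  typedef int index_t;
  index_t slot;
  color current;
};

// Hand transcription of the lexer rule ID = {CID}({HWS}::{HWS}{CID})*.
bool
is_scoped_id (const std::string &text)
{
  static const std::string cid = "[A-Za-z_][A-Za-z0-9_]*";
  static const std::string hws = "[ \t\r\v\f]*";
  static const std::regex id (cid + "(" + hws + "::" + hws + cid + ")*");
  return std::regex_match (text, id);
}
```

The regular expression only shows that 'Foo::id' is accepted as one
token; as noted above, that is not the same as understanding scopes.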

There is one check I needed to remove that gave me some trouble.
When a ctor is detected, we have already parsed the name of the
ctor as a type, which is then registered in the list of structures.

We go on to recognize it as a ctor *after* the type has been
registered.  We reject the field in declarator() and it is never
added to the list of fields for the class.

However, when we reach the end of the class, we find that the
type we created while parsing the ctor has line number
information in it (the line where the ctor was) and gengtype
thinks that it is a duplicate structure definition.

I took out this check for two reasons: (a) It is actually
unnecessary because if there were really duplicate definitions of
this structure, the code would not compile, and (b) all the other
alternatives required making the parser much more convoluted and
I'm trying hard not to make the gengtype parser too smart.
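
The false positive described above can be replayed with a toy model.
This is only a sketch under stated assumptions: toy_registry and
replay_class_with_ctor are hypothetical names, the registry is a plain
map rather than gengtype's structure list, and the lang_struct bitmap
test that the patch keeps is not modelled.

```cpp
#include <cassert>
#include <map>
#include <string>

// Toy model only; none of these names exist in gengtype.
struct toy_registry
{
  // Tag -> line where the tag was first seen as a type name.
  std::map<std::string, int> seen_at;
  bool location_means_duplicate;   // true models the removed check
  int errors;

  explicit toy_registry (bool strict)
    : location_means_duplicate (strict), errors (0) {}

  // Called when a name is parsed in type position, e.g. the 'Foo' in
  // the ctor declaration 'Foo ();'.
  void note_type_name (const std::string &tag, int line)
  {
    seen_at.emplace (tag, line);
  }

  // Called at the closing brace, when the structure is really defined.
  void define (const std::string &tag, int line)
  {
    if (location_means_duplicate && seen_at.count (tag) != 0)
      ++errors;                    // the spurious "duplicate definition"
    seen_at[tag] = line;
  }
};

// Replays 'class Foo { Foo (); int x; };' and returns the error count.
int
replay_class_with_ctor (bool strict)
{
  toy_registry reg (strict);
  reg.note_type_name ("Foo", 2);   // ctor name parsed as a type on line 2
  reg.define ("Foo", 4);           // end of class reached on line 4
  return reg.errors;
}
```

With the location test enabled the replay reports one error; without
it the same input is accepted.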

2012-10-12  Aaron Gray <aaronngray.lists@gmail.com>
	    Diego Novillo <dnovillo@google.com>

	* gengtype-lex.l: Support for C++ single line comments.
	Support for classes.
	(CXX_KEYWORD): New.  Support C++ keywords inline, public,
	protected, private, template, operator, friend, &, ~.
	(TYPEDEF): New.  Support typedef.
	* gengtype-parse.c: Update 'token_names[]'.
	(direct_declarator): Add support for parsing functions
	and ctors.

2012-10-12  Diego Novillo  <dnovillo@google.com>

	* doc/gty.texi: Document C++ limitations in gengtype.
	* gengtype-lex.l (CID): Rename from ID.
	(ID): Include scoping '::' as part of the identifier name.
	* gengtype-parse.c (token_names): Update.
	(token_value_format): Update.
	(consume_until_eos): Rename from consume_until_semi.
	Remove unused argument IMMEDIATE.  Update all callers.
	Also consider '}' as a finalizer.
	(consume_until_comma_or_eos): Rename from
	consume_until_comma_or_semi.
	Remove unused argument IMMEDIATE.  Update all callers.
	Also consider '}' as a finalizer.
	(direct_declarator): Add documentation on ctor support.
	Add argument IN_STRUCT.
	If the token following ID is a '(', consider ID a
	function and return NULL.
	If the token following '(' is not a '*', and IN_STRUCT is
	true, conclude that this is a ctor and return NULL.
	If the token is IGNORABLE_CXX_KEYWORD, return NULL.
	(inner_declarator): Add argument IN_STRUCT.
	Update all callers.
	(declarator): Add argument IN_STRUCT with default value
	false.  Update all callers.
	(type): Document argument NESTED.
	Skip over C++ inheritance specifiers.
	If a token TYPEDEF is found, emit an error.
	If an enum is found inside a class/structure, emit an
	error.
	(typedefs, structures, param_structs, variables): Initialize.
	(new_structure): Do not complain about duplicate
	structures if S has a line location set.
	* gengtype-state.c (write_state_type): Remove default
	handler.  Add handler for TYPE_NONE.
	(read_state_scalar_char_type): Fix spacing.
	* gengtype.c: Fix spacing.
	* gengtype.h (enum gty_token): Add name.  Add token
	IGNORABLE_CXX_KEYWORD.
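
The direct_declarator rules listed above can be sketched as a small
classifier.  This is written from the ChangeLog wording only, not from
gengtype-parse.c: decl_kind and classify_declarator are hypothetical
names, tokens are modelled as plain strings, and the real function
returns NULL where this sketch returns a non-field kind.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

enum class decl_kind { field, function, ctor, ignored };

// TOKENS holds what follows the already-parsed type.  For a ctor such
// as 'Foo (int);' the name 'Foo' was consumed as the type, so the
// declarator itself starts with '('.
decl_kind
classify_declarator (const std::vector<std::string> &tokens, bool in_struct)
{
  // Spellings the lexer returns as IGNORABLE_CXX_KEYWORD.
  static const std::set<std::string> ignorable = {
    "inline", "public:", "private:", "protected:",
    "template", "operator", "friend", "~", "&"
  };

  if (tokens.empty ())
    return decl_kind::field;       // nothing to classify in this sketch

  const std::string &first = tokens[0];
  const std::string second = tokens.size () > 1 ? tokens[1] : std::string ();

  if (ignorable.count (first) != 0)
    return decl_kind::ignored;
  if (first == "(")
    // '(' not followed by '*' inside a struct is taken to be a ctor;
    // '(*name)' stays an ordinary parenthesized declarator.
    return (in_struct && second != "*") ? decl_kind::ctor : decl_kind::field;
  if (second == "(")
    return decl_kind::function;    // ID followed by '('
  return decl_kind::field;
}
```

Only the field kind would end up in the list of fields for the class;
the other three kinds are skipped.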

From-SVN: r192405
parent b09e6a70
@@ -65,6 +65,27 @@ The parser understands simple typedefs such as
 @code{typedef int @var{name};}.
 These don't need to be marked.
+Since @code{gengtype}'s understanding of C++ is limited, there are
+several constructs and declarations that are not supported inside
+classes/structures marked for automatic GC code generation. The
+following C++ constructs produce a @code{gengtype} error on
+structures/classes marked for automatic GC code generation:
+@itemize @bullet
+@item
+Type definitions inside classes/structures are not supported.
+@item
+Enumerations inside classes/structures are not supported.
+@end itemize
+If you have a class or structure using any of the above constructs,
+you need to mark that class as @code{GTY ((user))} and provide your
+own marking routines (see section @ref{User GC} for details).
+It is always valid to include function definitions inside classes.
+Those are always ignored by @code{gengtype}, as it only cares about
+data members.
 @menu
 * GTY Options:: What goes inside a @code{GTY(())}.
 * GGC Roots:: Making global variables GGC roots.
......
@@ -50,12 +50,15 @@ update_lineno (const char *l, size_t len)
 %}
-ID [[:alpha:]_][[:alnum:]_]*
+CID [[:alpha:]_][[:alnum:]_]*
 WS [[:space:]]+
 HWS [ \t\r\v\f]*
 IWORD short|long|(un)?signed|char|int|HOST_WIDE_INT|HOST_WIDEST_INT|bool|size_t|BOOL_BITFIELD|CPPCHAR_SIGNED_T|ino_t|dev_t|HARD_REG_SET
 ITYPE {IWORD}({WS}{IWORD})*
+/* Include '::' in identifiers to capture C++ scope qualifiers. */
+ID {CID}({HWS}::{HWS}{CID})*
 EOID [^[:alnum:]_]
+CXX_KEYWORD inline|public:|private:|protected:|template|operator|friend
 %x in_struct in_struct_comment in_comment
 %option warn noyywrap nounput nodefault perf-report
@@ -83,6 +86,10 @@ EOID [^[:alnum:]_]
 BEGIN(in_struct);
 return UNION;
 }
+^{HWS}class/{EOID} {
+BEGIN(in_struct);
+return STRUCT;
+}
 ^{HWS}extern/{EOID} {
 BEGIN(in_struct);
 return EXTERN;
@@ -93,18 +100,27 @@ EOID [^[:alnum:]_]
 }
 }
+/* Parsing inside a struct, union or class declaration. */
 <in_struct>{
 "/*" { BEGIN(in_struct_comment); }
+"//".*\n { lexer_line.line++; }
 {WS} { update_lineno (yytext, yyleng); }
 \\\n { lexer_line.line++; }
 "const"/{EOID} /* don't care */
+{CXX_KEYWORD}/{EOID} |
+"~" |
+"&" {
+*yylval = XDUPVAR (const char, yytext, yyleng, yyleng + 1);
+return IGNORABLE_CXX_KEYWORD;
+}
 "GTY"/{EOID} { return GTY_TOKEN; }
 "VEC"/{EOID} { return VEC_TOKEN; }
 "union"/{EOID} { return UNION; }
 "struct"/{EOID} { return STRUCT; }
+"class"/{EOID} { return STRUCT; }
+"typedef"/{EOID} { return TYPEDEF; }
 "enum"/{EOID} { return ENUM; }
 "ptr_alias"/{EOID} { return PTR_ALIAS; }
 "nested_ptr"/{EOID} { return NESTED_PTR; }
@@ -127,7 +143,6 @@ EOID [^[:alnum:]_]
 return SCALAR;
 }
 {ID}/{EOID} {
 *yylval = XDUPVAR (const char, yytext, yyleng, yyleng+1);
 return ID;
@@ -148,7 +163,7 @@ EOID [^[:alnum:]_]
 }
 "..." { return ELLIPSIS; }
-[(){},*:<>;=%|-] { return yytext[0]; }
+[(){},*:<>;=%|+-] { return yytext[0]; }
 /* ignore pp-directives */
 ^{HWS}"#"{HWS}[a-z_]+[^\n]*\n {lexer_line.line++;}
@@ -159,6 +174,7 @@ EOID [^[:alnum:]_]
 }
 "/*" { BEGIN(in_comment); }
+"//".*\n { lexer_line.line++; }
 \n { lexer_line.line++; }
 {ID} |
 "'"("\\".|[^\\])"'" |
@@ -172,6 +188,7 @@ EOID [^[:alnum:]_]
 [^*\n] /* do nothing */
 "*"/[^/] /* do nothing */
 }
 <in_comment>"*/" { BEGIN(INITIAL); }
 <in_struct_comment>"*/" { BEGIN(in_struct); }
......
@@ -961,6 +961,8 @@ write_state_type (type_p current)
 current->state_number = state_written_type_count;
 switch (current->kind)
 {
+case TYPE_NONE:
+gcc_unreachable ();
 case TYPE_STRUCT:
 write_state_struct_type (current);
 break;
@@ -988,9 +990,6 @@ write_state_type (type_p current)
 case TYPE_STRING:
 write_state_string_type (current);
 break;
-default:
-fatal ("Unexpected type...");
 }
 }
@@ -1318,7 +1317,6 @@ read_state_scalar_char_type (type_p *type)
 read_state_common_type_content (*type);
 }
 /* Read the string_type. */
 static void
 read_state_string_type (type_p *type)
......
@@ -497,10 +497,10 @@ struct type scalar_char = {
 /* Lists of various things. */
-pair_p typedefs;
-type_p structures;
-type_p param_structs;
-pair_p variables;
+pair_p typedefs = NULL;
+type_p structures = NULL;
+type_p param_structs = NULL;
+pair_p variables = NULL;
 static type_p find_param_structure (type_p t, type_p param[NUM_PARAM]);
 static type_p adjust_field_tree_exp (type_p t, options_p opt);
@@ -611,6 +611,7 @@ resolve_typedef (const char *s, struct fileloc *pos)
 return create_user_defined_type (s, pos);
 }
 /* Create and return a new structure with tag NAME at POS with fields
 FIELDS and options O. The KIND of structure must be one of
 TYPE_STRUCT, TYPE_UNION or TYPE_USER_STRUCT. */
@@ -676,8 +677,7 @@ new_structure (const char *name, enum typekind kind, struct fileloc *pos,
 structures = s;
 }
-if (s->u.s.line.file != NULL
-|| (s->u.s.lang_struct && (s->u.s.lang_struct->u.s.bitmap & bitmap)))
+if (s->u.s.lang_struct && (s->u.s.lang_struct->u.s.bitmap & bitmap))
 {
 error_at_line (pos, "duplicate definition of '%s %s'",
 isunion ? "union" : "struct", s->u.s.tag);
@@ -763,6 +763,7 @@ create_scalar_type (const char *name)
 return &scalar_nonchar;
 }
 /* Return a pointer to T. */
 type_p
@@ -2636,7 +2637,7 @@ walk_type (type_p t, struct walk_type_data *d)
 /* If a pointer type is marked as "atomic", we process the
 field itself, but we don't walk the data that they point to.
 There are two main cases where we walk types: to mark
 pointers that are reachable, and to relocate pointers when
 writing a PCH file. In both cases, an atomic pointer is
@@ -3514,7 +3515,7 @@ write_func_for_structure (type_p orig_s, type_p s, type_p *param,
 {
 oprintf (d.of, " %s (x);\n", mark_hook_name);
 }
 d.prev_val[2] = "*x";
 d.indent = 6;
 if (orig_s->kind != TYPE_USER_STRUCT)
......
@@ -308,7 +308,6 @@ struct type {
 type_p param[NUM_PARAM]; /* The actual parameter types. */
 struct fileloc line; /* The source location. */
 } param_struct;
 } u;
 };
@@ -444,38 +443,38 @@ extern void parse_file (const char *name);
 extern bool hit_error;
 /* Token codes. */
-enum
+enum gty_token
 {
 EOF_TOKEN = 0,
 /* Per standard convention, codes in the range (0, UCHAR_MAX]
 represent single characters with those character codes. */
 CHAR_TOKEN_OFFSET = UCHAR_MAX + 1,
 GTY_TOKEN = CHAR_TOKEN_OFFSET,
 TYPEDEF,
 EXTERN,
 STATIC,
 UNION,
 STRUCT,
 ENUM,
 VEC_TOKEN,
 ELLIPSIS,
 PTR_ALIAS,
 NESTED_PTR,
 USER_GTY,
 PARAM_IS,
 NUM,
 SCALAR,
 ID,
 STRING,
 CHAR,
 ARRAY,
+IGNORABLE_CXX_KEYWORD,
 /* print_token assumes that any token >= FIRST_TOKEN_WITH_VALUE may have
 a meaningful value to be printed. */
 FIRST_TOKEN_WITH_VALUE = PARAM_IS
 };
 /* Level for verbose messages, e.g. output file generation... */
......