Commit 688a9b51 by Robert Dewar Committed by Arnaud Charlet

einfo.ads, einfo.adb (Default_Aspect_Component_Value): Is on base type only.

2013-10-14  Robert Dewar  <dewar@adacore.com>

	* einfo.ads, einfo.adb (Default_Aspect_Component_Value): Is on base type
	only.
	* exp_aggr.adb (Expand_Array_Aggregate): Handle proper
	initialization of <> component.
	* exp_ch3.adb, exp_tss.adb: Minor reformatting
	* sem_ch13.adb (Default_Aspect_Component_Value, Default_Aspect_Value):
	Is on base type only.
	* sinfo.ads: Minor comment revision.

2013-10-14  Robert Dewar  <dewar@adacore.com>

	* g-decstr.adb (Decode_Wide_Wide_Character): Fix failure
	to detect invalid sequences where longer than necessary
	sequences are used for encoding.
	(Validate_Wide_Character):
	Call Decode_Wide_Character to get the above validations.
	(Validate_Wide_Wide_Character): Same fix
	* g-decstr.ads: Add documentation making it clear that the UTF-8
	implementation here recognizes all valid UTF-8 sequences, rather
	than the well-formed subset corresponding to characters defined
	in Unicode.
	(Next_Wide_Character): Remove comment about this
	being more efficient than Decode_Wide_Character (because this
	no longer the case).
	(Prev_Wide_Character): Add note that valid encoding is assumed.

2013-10-14  Robert Dewar  <dewar@adacore.com>

	* a-wichha.adb (Character_Set_Version): New function.
	* a-wichha.ads: Remove comments for pragma Pure (final RM has
	this).
	(Character_Set_Version): New function.
	* gnat_rm.texi: Update doc.

From-SVN: r203527
parent 124092ee
2013-10-14 Robert Dewar <dewar@adacore.com>
* einfo.ads, einfo.adb (Default_Aspect_Component_Value): Is on base type
only.
* exp_aggr.adb (Expand_Array_Aggregate): Handle proper
initialization of <> component.
* exp_ch3.adb, exp_tss.adb: Minor reformatting
* sem_ch13.adb (Default_Aspect_Component_Value, Default_Aspect_Value):
Is on base type only.
* sinfo.ads: Minor comment revision.
2013-10-14 Robert Dewar <dewar@adacore.com>
* g-decstr.adb (Decode_Wide_Wide_Character): Fix failure
to detect invalid sequences where longer than necessary
sequences are used for encoding.
(Validate_Wide_Character):
Call Decode_Wide_Character to get the above validations.
(Validate_Wide_Wide_Character): Same fix
* g-decstr.ads: Add documentation making it clear that the UTF-8
implementation here recognizes all valid UTF-8 sequences, rather
than the well-formed subset corresponding to characters defined
in Unicode.
(Next_Wide_Character): Remove comment about this
being more efficient than Decode_Wide_Character (because this
no longer the case).
(Prev_Wide_Character): Add note that valid encoding is assumed.
2013-10-14 Robert Dewar <dewar@adacore.com>
* a-wichha.adb (Character_Set_Version): New function.
* a-wichha.ads: Remove comments for pragma Pure (final RM has
this).
(Character_Set_Version): New function.
* gnat_rm.texi: Update doc.
2013-10-14 Hristian Kirtchev <kirtchev@adacore.com>
* einfo.adb: Flag263 is now known as Has_Null_Refinement.
......
......@@ -33,6 +33,11 @@ with Ada.Wide_Characters.Unicode; use Ada.Wide_Characters.Unicode;
package body Ada.Wide_Characters.Handling is
function Character_Set_Version return String is
begin
return "Unicode 6.2";
end Character_Set_Version;
---------------------
-- Is_Alphanumeric --
---------------------
......
......@@ -15,10 +15,12 @@
package Ada.Wide_Characters.Handling is
pragma Pure;
-- This package is clearly intended to be Pure, by analogy with the
-- base Ada.Characters.Handling package. The version in the RM does
-- not yet have this pragma, but that is a clear omission. This will
-- be fixed in a future version of AI05-0266-1.
function Character_Set_Version return String;
pragma Inline (Character_Set_Version);
-- Returns an implementation-defined identifier that identifies the version
-- of the character set standard that is used for categorizing characters
-- by the implementation. For GNAT this is "Unicode v.v".
function Is_Control (Item : Wide_Character) return Boolean;
pragma Inline (Is_Control);
......
......@@ -853,13 +853,13 @@ package body Einfo is
function Default_Aspect_Component_Value (Id : E) return N is
begin
pragma Assert (Is_Array_Type (Id));
return Node19 (Id);
return Node19 (Base_Type (Id));
end Default_Aspect_Component_Value;
function Default_Aspect_Value (Id : E) return N is
begin
pragma Assert (Is_Scalar_Type (Id));
return Node19 (Id);
return Node19 (Base_Type (Id));
end Default_Aspect_Value;
function Default_Expr_Function (Id : E) return E is
......@@ -3456,13 +3456,13 @@ package body Einfo is
procedure Set_Default_Aspect_Component_Value (Id : E; V : E) is
begin
pragma Assert (Is_Array_Type (Id));
pragma Assert (Is_Array_Type (Id) and then Is_Base_Type (Id));
Set_Node19 (Id, V);
end Set_Default_Aspect_Component_Value;
procedure Set_Default_Aspect_Value (Id : E; V : E) is
begin
pragma Assert (Is_Scalar_Type (Id));
pragma Assert (Is_Scalar_Type (Id) and then Is_Base_Type (Id));
Set_Node19 (Id, V);
end Set_Default_Aspect_Value;
......
......@@ -738,13 +738,13 @@ package Einfo is
-- subprograms, this returns the {function,procedure}_specification, not
-- the subprogram_declaration.
-- Default_Aspect_Component_Value (Node19)
-- Default_Aspect_Component_Value (Node19) [base type only]
-- Defined in array types. Holds the static value specified in a
-- default_component_value aspect specification for the array type.
-- Default_Component_Value aspect specification for the array type.
-- Default_Aspect_Value (Node19)
-- Default_Aspect_Value (Node19) [base type only]
-- Defined in scalar types. Holds the static value specified in a
-- default_value aspect specification for the type.
-- Default_Value aspect specification for the type.
-- Default_Expr_Function (Node21)
-- Defined in parameters. It holds the entity of the parameterless
......@@ -5171,7 +5171,7 @@ package Einfo is
-- E_Array_Type
-- E_Array_Subtype
-- First_Index (Node17)
-- Default_Aspect_Component_Value (Node19)
-- Default_Aspect_Component_Value (Node19) (base type only)
-- Component_Type (Node20) (base type only)
-- Original_Array_Type (Node21)
-- Component_Size (Uint22) (base type only)
......@@ -5354,7 +5354,7 @@ package Einfo is
-- Lit_Indexes (Node15) (root type only)
-- Lit_Strings (Node16) (root type only)
-- First_Literal (Node17)
-- Default_Aspect_Value (Node19)
-- Default_Aspect_Value (Node19) (base type only)
-- Scalar_Range (Node20)
-- Enum_Pos_To_Rep (Node23) (type only)
-- Static_Predicate (List25)
......@@ -5386,7 +5386,7 @@ package Einfo is
-- E_Floating_Point_Subtype
-- Digits_Value (Uint17)
-- Float_Rep (Uint10) (Float_Rep_Kind)
-- Default_Aspect_Value (Node19)
-- Default_Aspect_Value (Node19) (base type only)
-- Scalar_Range (Node20)
-- Machine_Emax_Value (synth)
-- Machine_Emin_Value (synth)
......@@ -5564,7 +5564,7 @@ package Einfo is
-- E_Modular_Integer_Type
-- E_Modular_Integer_Subtype
-- Modulus (Uint17) (base type only)
-- Default_Aspect_Value (Node19)
-- Default_Aspect_Value (Node19) (base type only)
-- Original_Array_Type (Node21)
-- Scalar_Range (Node20)
-- Static_Predicate (List25)
......@@ -5599,7 +5599,7 @@ package Einfo is
-- E_Ordinary_Fixed_Point_Type
-- E_Ordinary_Fixed_Point_Subtype
-- Delta_Value (Ureal18)
-- Default_Aspect_Value (Node19)
-- Default_Aspect_Value (Node19) (base type only)
-- Scalar_Range (Node20)
-- Small_Value (Ureal21)
-- Has_Small_Clause (Flag67)
......@@ -5853,7 +5853,7 @@ package Einfo is
-- E_Signed_Integer_Type
-- E_Signed_Integer_Subtype
-- Default_Aspect_Value (Node19)
-- Default_Aspect_Value (Node19) (base type only)
-- Scalar_Range (Node20)
-- Static_Predicate (List25)
-- Has_Biased_Representation (Flag139)
......
......@@ -4878,6 +4878,43 @@ package body Exp_Aggr is
Check_Same_Aggr_Bounds (N, 1);
end if;
-- STEP 1d
-- If we have a default component value, or simple initialization is
-- required for the component type, then we replace <> in component
-- associations by the required default value.
declare
Default_Val : Node_Id;
Assoc : Node_Id;
begin
if (Present (Default_Aspect_Component_Value (Typ))
or else Needs_Simple_Initialization (Ctyp))
and then Present (Component_Associations (N))
then
Assoc := First (Component_Associations (N));
while Present (Assoc) loop
if Nkind (Assoc) = N_Component_Association
and then Box_Present (Assoc)
then
Set_Box_Present (Assoc, False);
if Present (Default_Aspect_Component_Value (Typ)) then
Default_Val := Default_Aspect_Component_Value (Typ);
else
Default_Val := Get_Simple_Init_Val (Ctyp, N);
end if;
Set_Expression (Assoc, New_Copy_Tree (Default_Val));
Analyze_And_Resolve (Expression (Assoc), Ctyp);
end if;
Next (Assoc);
end loop;
end if;
end;
-- STEP 2
-- Here we test for is packed array aggregate that we can handle at
......
......@@ -4940,7 +4940,7 @@ package body Exp_Ch3 is
Next_Elmt (Discr);
end loop;
-- Now collect values of initialized components.
-- Now collect values of initialized components
Comp := First_Component (Full_Type);
while Present (Comp) loop
......@@ -4957,11 +4957,11 @@ package body Exp_Ch3 is
Next_Component (Comp);
end loop;
-- Finally, box-initialize remaining components.
-- Finally, box-initialize remaining components
Append_To (Component_Associations (Aggr),
Make_Component_Association (Loc,
Choices => New_List (Make_Others_Choice (Loc)),
Choices => New_List (Make_Others_Choice (Loc)),
Expression => Empty));
Set_Box_Present (Last (Component_Associations (Aggr)));
Set_Expression (N, Aggr);
......
......@@ -6,7 +6,7 @@
-- --
-- B o d y --
-- --
-- Copyright (C) 1992-2010, Free Software Foundation, Inc. --
-- Copyright (C) 1992-2013, Free Software Foundation, Inc. --
-- --
-- GNAT is free software; you can redistribute it and/or modify it under --
-- terms of the GNU General Public License as published by the Free Soft- --
......@@ -211,7 +211,7 @@ package body Exp_Tss is
begin
return Present (BIP)
and then (Restriction_Active (No_Default_Initialization)
or else not Is_Null_Init_Proc (BIP));
or else not Is_Null_Init_Proc (BIP));
end Has_Non_Null_Base_Init_Proc;
---------------
......
......@@ -6,7 +6,7 @@
-- --
-- S p e c --
-- --
-- Copyright (C) 2007-2010, AdaCore --
-- Copyright (C) 2007-2013, AdaCore --
-- --
-- GNAT is free software; you can redistribute it and/or modify it under --
-- terms of the GNU General Public License as published by the Free Soft- --
......@@ -192,6 +192,11 @@ package body GNAT.Decode_String is
elsif (U and 2#11100000#) = 2#110_00000# then
W := U and 2#00011111#;
Get_UTF_Byte;
if W not in 16#00_0080# .. 16#00_07FF# then
Bad;
end if;
Result := Wide_Wide_Character'Val (W);
-- 16#00_0800#-16#00_ffff#: 1110xxxx 10xxxxxx 10xxxxxx
......@@ -200,6 +205,11 @@ package body GNAT.Decode_String is
W := U and 2#00001111#;
Get_UTF_Byte;
Get_UTF_Byte;
if W not in 16#00_0800# .. 16#00_FFFF# then
Bad;
end if;
Result := Wide_Wide_Character'Val (W);
-- 16#01_0000#-16#10_FFFF#: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
......@@ -211,6 +221,10 @@ package body GNAT.Decode_String is
Get_UTF_Byte;
end loop;
if W not in 16#01_0000# .. 16#10_FFFF# then
Bad;
end if;
Result := Wide_Wide_Character'Val (W);
-- 16#0020_0000#-16#03FF_FFFF#: 111110xx 10xxxxxx 10xxxxxx
......@@ -223,6 +237,10 @@ package body GNAT.Decode_String is
Get_UTF_Byte;
end loop;
if W not in 16#0020_0000# .. 16#03FF_FFFF# then
Bad;
end if;
Result := Wide_Wide_Character'Val (W);
-- All other cases are invalid, note that this includes:
......@@ -304,100 +322,10 @@ package body GNAT.Decode_String is
-------------------------
procedure Next_Wide_Character (Input : String; Ptr : in out Natural) is
Discard : Wide_Character;
pragma Unreferenced (Discard);
begin
if Ptr < Input'First then
Past_End;
end if;
-- Special efficient encoding for UTF-8 case
if Encoding_Method = WCEM_UTF8 then
UTF8 : declare
U : Unsigned_32;
procedure Getc;
pragma Inline (Getc);
-- Gets the character at Input (Ptr) and returns code in U as
-- Unsigned_32 value. On return Ptr is bumped past the character.
procedure Skip_UTF_Byte;
pragma Inline (Skip_UTF_Byte);
-- Skips past one encoded byte which must be 2#10xxxxxx#
----------
-- Getc --
----------
procedure Getc is
begin
if Ptr > Input'Last then
Past_End;
else
U := Unsigned_32 (Character'Pos (Input (Ptr)));
Ptr := Ptr + 1;
end if;
end Getc;
-------------------
-- Skip_UTF_Byte --
-------------------
procedure Skip_UTF_Byte is
begin
Getc;
if (U and 2#11000000#) /= 2#10_000000# then
Bad;
end if;
end Skip_UTF_Byte;
-- Start of processing for UTF-8 case
begin
-- 16#00_0000#-16#00_007F#: 0xxxxxxx
Getc;
if (U and 2#10000000#) = 2#00000000# then
return;
-- 16#00_0080#-16#00_07FF#: 110xxxxx 10xxxxxx
elsif (U and 2#11100000#) = 2#110_00000# then
Skip_UTF_Byte;
-- 16#00_0800#-16#00_ffff#: 1110xxxx 10xxxxxx 10xxxxxx
elsif (U and 2#11110000#) = 2#1110_0000# then
Skip_UTF_Byte;
Skip_UTF_Byte;
-- Any other code is invalid, note that this includes:
-- 16#01_0000#-16#10_FFFF#: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
-- 16#0020_0000#-16#03FF_FFFF#: 111110xx 10xxxxxx 10xxxxxx
-- 10xxxxxx 10xxxxxx
-- 16#0400_0000#-16#7FFF_FFFF#: 1111110x 10xxxxxx 10xxxxxx
-- 10xxxxxx 10xxxxxx 10xxxxxx
-- since Wide_Character does not allow codes > 16#FFFF#
else
Bad;
end if;
end UTF8;
-- Non-UTF-8 case
else
declare
Discard : Wide_Character;
begin
Decode_Wide_Character (Input, Ptr, Discard);
end;
end if;
Decode_Wide_Character (Input, Ptr, Discard);
end Next_Wide_Character;
------------------------------
......@@ -405,110 +333,10 @@ package body GNAT.Decode_String is
------------------------------
procedure Next_Wide_Wide_Character (Input : String; Ptr : in out Natural) is
Discard : Wide_Wide_Character;
pragma Unreferenced (Discard);
begin
-- Special efficient encoding for UTF-8 case
if Encoding_Method = WCEM_UTF8 then
UTF8 : declare
U : Unsigned_32;
procedure Getc;
pragma Inline (Getc);
-- Gets the character at Input (Ptr) and returns code in U as
-- Unsigned_32 value. On return Ptr is bumped past the character.
procedure Skip_UTF_Byte;
pragma Inline (Skip_UTF_Byte);
-- Skips past one encoded byte which must be 2#10xxxxxx#
----------
-- Getc --
----------
procedure Getc is
begin
if Ptr > Input'Last then
Past_End;
else
U := Unsigned_32 (Character'Pos (Input (Ptr)));
Ptr := Ptr + 1;
end if;
end Getc;
-------------------
-- Skip_UTF_Byte --
-------------------
procedure Skip_UTF_Byte is
begin
Getc;
if (U and 2#11000000#) /= 2#10_000000# then
Bad;
end if;
end Skip_UTF_Byte;
-- Start of processing for UTF-8 case
begin
if Ptr < Input'First then
Past_End;
end if;
-- 16#00_0000#-16#00_007F#: 0xxxxxxx
Getc;
if (U and 2#10000000#) = 2#00000000# then
null;
-- 16#00_0080#-16#00_07FF#: 110xxxxx 10xxxxxx
elsif (U and 2#11100000#) = 2#110_00000# then
Skip_UTF_Byte;
-- 16#00_0800#-16#00_ffff#: 1110xxxx 10xxxxxx 10xxxxxx
elsif (U and 2#11110000#) = 2#1110_0000# then
Skip_UTF_Byte;
Skip_UTF_Byte;
-- 16#01_0000#-16#10_FFFF#: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
elsif (U and 2#11111000#) = 2#11110_000# then
for K in 1 .. 3 loop
Skip_UTF_Byte;
end loop;
-- 16#0020_0000#-16#03FF_FFFF#: 111110xx 10xxxxxx 10xxxxxx
-- 10xxxxxx 10xxxxxx
elsif (U and 2#11111100#) = 2#111110_00# then
for K in 1 .. 4 loop
Skip_UTF_Byte;
end loop;
-- Any other code is invalid, note that this includes:
-- 16#0400_0000#-16#7FFF_FFFF#: 1111110x 10xxxxxx 10xxxxxx
-- 10xxxxxx 10xxxxxx 10xxxxxx
-- since Wide_Wide_Character does not allow codes > 16#03FF_FFFF#
else
Bad;
end if;
end UTF8;
-- Non-UTF-8 case
else
declare
Discard : Wide_Wide_Character;
begin
Decode_Wide_Wide_Character (Input, Ptr, Discard);
end;
end if;
Decode_Wide_Wide_Character (Input, Ptr, Discard);
end Next_Wide_Wide_Character;
--------------
......
......@@ -6,7 +6,7 @@
-- --
-- S p e c --
-- --
-- Copyright (C) 2007-2010, AdaCore --
-- Copyright (C) 2007-2013, AdaCore --
-- --
-- GNAT is free software; you can redistribute it and/or modify it under --
-- terms of the GNU General Public License as published by the Free Soft- --
......@@ -47,6 +47,17 @@
-- does not make any assumptions about the character coding. See also the
-- packages Ada.Wide_[Wide_]Characters.Unicode for unicode specific functions.
-- In particular, in the case of UTF-8, all valid UTF-8 encodings, as listed
-- in table 3.6 of the Unicode Standard, version 6.2.0, are recognized as
-- legitimate. This includes the full range 16#0000_0000# .. 16#03FF_FFFF#.
-- This includes codes in the range 16#D800# - 16#DFFF#. These codes all
-- have UTF-8 encoding sequences that are well-defined (e.g. the encoding for
-- 16#D800# is ED A0 80). But these codes do not correspond to defined Unicode
-- characters and are thus considered to be "not well-formed" (see table 3.7
-- of the Unicode Standard). If you need to exclude these codes, you must do
-- that manually, e.g. use Decode_Wide_Character/Decode_Wide_String and check
-- that the resulting code(s) are not in this range.
-- Note on the use of brackets encoding (WCEM_Brackets). The brackets encoding
-- method is ambiguous in the context of this package, since there is no way
-- to tell if ["1234"] is eight unencoded characters or one encoded character.
......@@ -86,7 +97,6 @@ package GNAT.Decode_String is
-- will be raised.
function Decode_Wide_Wide_String (S : String) return Wide_Wide_String;
pragma Inline (Decode_Wide_Wide_String);
-- Same as above function but for Wide_Wide_String output
procedure Decode_Wide_Wide_String
......@@ -124,16 +134,17 @@ package GNAT.Decode_String is
(Input : String;
Ptr : in out Natural;
Result : out Wide_Wide_Character);
pragma Inline (Decode_Wide_Wide_Character);
-- Same as above procedure but with Wide_Wide_Character input
procedure Next_Wide_Character (Input : String; Ptr : in out Natural);
pragma Inline (Next_Wide_Character);
-- This procedure examines the input string starting at Input (Ptr), and
-- advances Ptr past one character in the encoded string, so that on return
-- Ptr points to the next encoded character. Constraint_Error is raised if
-- an invalid encoding is encountered, or the end of the string is reached
-- or if Ptr is less than String'First on entry, or if the character
-- skipped is not a valid Wide_Character code. This call may be more
-- efficient than calling Decode_Wide_Character and discarding the result.
-- skipped is not a valid Wide_Character code.
procedure Prev_Wide_Character (Input : String; Ptr : in out Natural);
-- This procedure is similar to Next_Encoded_Character except that it moves
......@@ -149,8 +160,12 @@ package GNAT.Decode_String is
-- WCEM_Brackets). For all other encodings, we work by starting at the
-- beginning of the string and moving forward till Ptr is reached, which
-- is correct but slow.
--
-- Note: this routine assumes that the sequence prior to Ptr is correctly
-- encoded, it does not have a defined behavior if this is not the case.
procedure Next_Wide_Wide_Character (Input : String; Ptr : in out Natural);
pragma Inline (Next_Wide_Wide_Character);
-- Similar to Next_Wide_Character except that codes skipped must be valid
-- Wide_Wide_Character codes.
......
......@@ -770,17 +770,9 @@ package body Sem_Ch13 is
Set_Has_Default_Aspect (Base_Type (Ent));
if Is_Scalar_Type (Ent) then
Set_Default_Aspect_Value (Ent, Expr);
-- Place default value of base type as well, because that is
-- the semantics of the aspect. It is convenient to link the
-- aspect to both the (possibly anonymous) base type and to
-- the given first subtype.
Set_Default_Aspect_Value (Base_Type (Ent), Expr);
else
Set_Default_Aspect_Component_Value (Ent, Expr);
Set_Default_Aspect_Component_Value (Base_Type (Ent), Expr);
end if;
end Analyze_Aspect_Default_Value;
......@@ -9457,6 +9449,7 @@ package body Sem_Ch13 is
-- Default_Component_Value
if Is_Array_Type (Typ)
and then Is_Base_Type (Typ)
and then Has_Rep_Item (Typ, Name_Default_Component_Value, False)
and then Has_Rep_Item (Typ, Name_Default_Component_Value)
then
......@@ -9468,6 +9461,7 @@ package body Sem_Ch13 is
-- Default_Value
if Is_Scalar_Type (Typ)
and then Is_Base_Type (Typ)
and then Has_Rep_Item (Typ, Name_Default_Value, False)
and then Has_Rep_Item (Typ, Name_Default_Value)
then
......
......@@ -3596,7 +3596,7 @@ package Sinfo is
-- Sloc points to first selector name
-- Choices (List1)
-- Loop_Actions (List2-Sem)
-- Expression (Node3)
-- Expression (Node3) (empty if Box_Present)
-- Box_Present (Flag15)
-- Inherited_Discriminant (Flag13)
......
Markdown is supported
0% or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment