Sending arrays of octets between C and Ada

The Ada language and the C language have a very different concept of strings and characters. I'll try to follow Ossasepia and use the term octet for an 8 bit integer and use char for C and Character for Ada. In Ada the Character is defined as an enumerated type ranging from the Ada.Characters.Latin_1.NUL (Character'Val (0)) character to the Ada.Characters.Latin_1.LC_Y_Diaeresis (Character'Val (255)) character. This range is exactly the same as the range of all valid octets and so characters can be stored as octets. As characters are supposed to represent another domain as natural numbers these need to be converted back and forth through the Character'Val and Character'Ord functions. In C all is a lot more muddier and a char can either be seen as an 8 bit integer or as a symbol for a language, it all depends on the context. So far for context, now to address the cause, how to convert arrays of characters between C and Ada1. I consider the Char_Ptr support from the ada standard library out of bounds2, this investigation is based on the 'char_array' type from the Ada package `Interfaces.C. First, the code;

The code defines 5 different methods to interact with strings between C and Ada;

  1. The basic method, call a function from Ada to C. The character_array in Ada, will turn into a char * pointer in C. A parameter needs to be added to pass the length of the array.
  2. The basic method, now two way. Ada will call C and C will immediately call Ada. The Ada function uses a constrained character array so no count parameter is needed for it.
  3. A GNAT specific method, import a C function but use the Ada calling convention. The character_array in Ada will be a structure in C. This layout of this structure is based on how GNAT does this internally.
  4. Call to C as in (1), but then call Ada as in (3). Note that Ada methods can be exported using the Ada calling convention.
  5. Like (2) but now the Ada procedure does not have a constrained character array as parameter but an unconstrained one, so a count parameter is needed for Ada too.

First, to define the procedures (please also read the calling conventions section of the GNAT documentation):

with Interfaces.C; use Interfaces.C;

package C_Array is

        -- The basic method, call C using a pointer and a count
        procedure C_Fill_1(CH : in out char_array; Count : Integer);
        pragma Import(C, C_Fill_1, "c_fill_1");

        -- Same method as 'C_Fill_1', but the C function will call Ada.
        procedure C_Fill_2(CH : in out char_array; Count : Integer);
        pragma Import(C, C_Fill_2, "c_fill_2");

        -- Same method as 'C_Fill_1', but the C function will call Ada using Ada calling conventions
        procedure C_Fill_3(CH : in out char_array; Count : Integer);
        pragma Import(C, C_Fill_3, "c_fill_3");

        -- Call to C using Ada calling conventions
        procedure C_Fill_4(CH : in out char_array);
        pragma Import(Ada, C_Fill_4, "c_fill_4");

        -- Same method as 'C_Fill_1', the C function will call Ada using an unconstrained array and a count.
        procedure C_Fill_5(CH : in out char_array; Count : Integer);
        pragma Import(C, C_Fill_5, "c_fill_5");

        -- For method 2, the C function will make a call to a function with a constrained array parameter
        subtype constrained_char_array is char_array(0 .. 100);
        procedure ADA_Fill_2(CH : in out constrained_char_array);
        pragma Export(C, ADA_Fill_2, "ada_fill_2");

        -- For method 3, the C function will make a call to Ada usinging Ada calling conventions.
        procedure ADA_Fill_3(CH : in out char_array);
        pragma Export(Ada, ADA_Fill_3, "ada_fill_3");

        -- For method 5, the C function will make a call to a function with a unconstrained array parameter
        subtype constrained_char_array is char_array(0 .. 100);
        procedure ADA_Fill_5(CH : in out char_array; Count: Integer);
        pragma Export(C, ADA_Fill_5, "ada_fill_5");

end C_Array;

Next, the more interesting bit, the C functions;

#include <stdint.h>
#include <stdio.h>

typedef struct B {
        size_t LB0;
        size_t UB0;
} B_t;

typedef struct U {
        char * P_ARRAY;
        B_t * P_BOUNDS;
} U_t;

void ada_fill_2(char * buffer);
void ada_fill_3(U_t array, int count);
void ada_fill_5(char * buffer, int count);

void c_fill_1(char * buffer, int count) {
        int i;

        printf("c_fill_1; buffer = %p, count = %d\n", buffer, count);

        for(i = 0; i < count; i++) {
                buffer[i] = '1';
        }
}

void c_fill_2(char * buffer, int count) {
        printf("c_fill_2; buffer = %p, count = %d\n", buffer, count);
        ada_fill_2(buffer);
}

void c_fill_3(char * buffer, int count) {
        B_t b;
        U_t a;
        b.LB0 = 0;
        b.UB0 = count;
        a.P_ARRAY = buffer;
        a.P_BOUNDS = &b;

        printf("c_fill_3; buffer = %p, count = %d\n", buffer, count);

        ada_fill_3(a, count);
}

void c_fill_4(U_t array) {
        int i = 0;
        char * buffer = array.P_ARRAY;

        printf("c_fill_4; buffer = %p, count = %d\n", array.P_ARRAY, array.P_BOUNDS->UB0 - array.P_BOUNDS->LB0);

        for(i = array.P_BOUNDS->LB0; i <= array.P_BOUNDS->UB0; i++) {
                buffer[i] = '4';
        }
}

void c_fill_5(char * buffer, int count) {
        printf("c_fill_5; buffer = %p, count = %d\n", buffer, count);
        ada_fill_5(buffer, count);
}

The first 2 functions are simple. Because the array is constrained in Ada there is no need for the count parameter, however the actual length of the array in C must always be the same as the one in Ada. Next the two methods that took the most time to figure out. I could not find any useful description of the so called Ada Calling Convention. No such convention seems to be specified, and every ada implementation is free to implement this as they see fit. The C code will be tight to GNAT when using this method and maybe even specific versions of GNAT. The Ada calling convention for arrays is implemented in the interface between the GNAT frontend and the GCC backend3. In the interface, the GNAT code tree is converted into a GCC code tree, and most of these functions are recursive and try to get information from different parts of both trees. In short, reading this code is not that easy, but from the code I could determine that unconstrained arrays are send as a structure with two fields, one field is a pointer to the start of the array, and the other is a pointer to a structure having again two fields. This last structure has a field for the lower bound and one for the upper bound of the array. The exact layout of the structure was a bit harder to determine so an extra flag for the compiler was needed -fdump-tree-original4. From that dump, I could determine the structure5. The C function is not more secure with this structure, but the Ada implementation will be. Finally, we finish with the more usual way of calling an Ada function with an unconstrained character_array and a count variable.

For reference, the Ada implementation. Note that for the fifth case we cannot use the upper bound of the array. This upper bound is undefined (and in practice will be the maximum value of the given range type).

with Ada.Text_IO; use Ada.Text_IO;
with Ada.Integer_Text_IO; use Ada.Integer_Text_IO;
with Ada.Long_Integer_Text_IO; use Ada.Long_Integer_Text_IO;

package body C_Array is

        -- We have a statically defined length so the range will be fine.
        -- The call in C code to this procedure must use a buffer with at least the constrained range.
        procedure ADA_Fill_2(CH : in out constrained_char_array) is
        begin
                Put("ada_fill_2;");
                Put(" lb=" & size_t'Image(CH'First));
                Put(" ub=" & size_t'Image(CH'Last));
                New_line;

                for I in CH'Range loop
                        CH(I) := To_C('2');
                end loop;
        end Ada_Fill_2;

        -- The call in the C code needs to send an Ada array.
        procedure ADA_Fill_3(CH : in out char_array) is
        begin
                Put("ada_fill_3;");
                Put(" lb=" & size_t'Image(CH'First));
                Put(" ub=" & size_t'Image(CH'Last));
                New_line;

                for I in CH'Range loop
                        CH(I) := To_C('3');
                end loop;
        end Ada_Fill_3;

        -- For calls from C without a constained type or ada array, an extra count parameter is needed.
        procedure ADA_Fill_5(CH : in out char_array; Count: Integer) is
        begin
                Put("ada_fill_5; count="); Put(Count);
                Put(" lb=" & size_t'Image(CH'First));
                Put(" ub=" & size_t'Image(CH'Last));
                New_line;

                -- the Range cannot be used, the 'Last index is wrong.
                for I in ch'First .. size_t(Count) loop
                        CH(I) := To_C('5');
                end loop;
        end Ada_Fill_5;
end C_Array;

The code includes a simple main program that calls all five functions;

with C_Array; use C_Array;
with Ada.Text_IO; use Ada.Text_IO;
with Ada.Integer_Text_IO; use Ada.Integer_Text_IO;

with Interfaces.C; use Interfaces.C;

procedure Array_Main is
        work_array : char_array(0 .. 100);
        output_string : String(1 .. 101) := (others => ' ');
        C : Integer := 0;
begin
        Put("start");
        New_Line;
        C_fill_1(work_array,100);
        To_Ada(work_array, output_string, C, False);
        Put("c_fill_1 output ="); Put(output_string);
        New_Line;

        C_fill_2(work_array,100);
        To_Ada(work_array, output_string, C, False);
        Put("c_fill_2 output ="); Put(output_string);
        New_Line;

        C_fill_3(work_array,100);
        To_Ada(work_array, output_string, C, False);
        Put("c_fill_4 output ="); Put(output_string);
        New_Line;

        C_fill_4(work_array);
        To_Ada(work_array, output_string, C, False);
        Put("c_fill_4 output ="); Put(output_string);
        New_Line;

        C_fill_5(work_array,100);
        To_Ada(work_array, output_string, C, False);
        Put("c_fill_5 output ="); Put(output_string);
        New_Line;
end;

The output will be;

start
c_fill_1; buffer = 0x7ffdf87fd910, count = 100
c_fill_1 output =1111111111111111111111111111...
c_fill_2; buffer = 0x7ffdf87fd910, count = 100
ada_fill_2; lb= 0 ub= 100
c_fill_2 output =2222222222222222222222222222...
c_fill_3; buffer = 0x7ffdf87fd910, count = 100
ada_fill_3; lb= 0 ub= 100
c_fill_4 output =3333333333333333333333333333...
c_fill_4; buffer = 0x7ffdf87fd910, count = 100
c_fill_4 output =4444444444444444444444444444...
c_fill_5; buffer = 0x7ffdf87fd910, count = 100
ada_fill_5; count=        100 lb= 0 ub= 18446744073709551615
c_fill_5 output =5555555555555555555555555555...

A random remark; it is not a good idea to call To_Ada on the unconstrained array from method 56. First, To_Ada is not more efficient than a character by character conversion, in fact it is implemented as such. Second, To_Ada will use the Last parameter of the `character_array and that parameter is set to the maximum value of size_t (Ada will check on this bound but a segmentation fault will happen first). Either copy the character_array to a constrained character array first, or write a custom conversion function.

Another random remark; the Ada standard library can be studied with the Ada 2012 LRM and understood better with the GNAT source code. It helps to have a cross-referenced, browser readable version of the GNAT source code at hand (there is one in the GNAT Book, but that one is incomplete). To make such a version do:

0) Make a directory and go to it

    mkdir ada-html
    cd ada-html

1) Find the gnat runtime library
    (i.e. the directory containing adainclude and adalib)
    It should be in the gnat install directory,
    as the lib/gcc/<machine>/<gcc version>/ directory

    For AdaCore 2016, (linux 64bit) this directory can set with:

    RT_DIR = $(dirname `which gnatmake`)/../lib/gcc/x86_64-pc-linux-gnu/4.9.4

2) Copy the source files from adainclude:

    cp $RT_DIR/adainclude/*.ad* .

3) Copy the ali files from adalib (needed for cross references):

    cp $RT_DIR/adalib/*.ali .

4) Make the html files with the `gnathtml.pl` script:

    gnathtml.pl -f -D *.ad*

5) Go to the 'html' directory and look (open index.htm in a browser):

    cd html
    ls
  1. I've done this a couple of times and my knowledge thus gained was largely anecdotal. This won't do for republic business so hence this article. []
  2. please read the GNAT source code file i-cstrin.adb'. This should put you off the idea of using the Char_Ptr []
  3. all the code can be found in the gcc/ada/gcc-interfaces directory []
  4. This flag will dump the gcc version of the tree in a somewhat readable fashion. The dump does omit information so it also needs to be followed by a recompile and dump with -fdump-tree-original-raw (see also the GCC Command Line Switches). The second dump can be used to determine the types of the fields in the structures []
  5. For every kind of array this structure will have the same field but the fields will have different types. For integer array, the pointer will be an integer pointer. For Strings the boundary fields (LB0/UB0) will be 32 bit integers. So this kind of interfacing needs to be repeated on a case by case bases []
  6. This can be determined from reading the code in the i-c.adb file []

Leave a Reply