GNAT Zero Foot Print - Take 4 - Introduction of the platform

August 13th, 2018

An Ada runtime library provides a standard interface to different operating systems and hardware. The ZFP library already supports two ways of compilation: (1) based on the C library and (2) based on assembly code. Either version can be had by pressing a different node of the v-tree. Although this works, it all becomes complicated when I want to add the same file to both systems and have to maintain multiple branches. Also, I want to add a version of the library with no OS support, one with 64-bit ARM support, probably one for MIPS, and so on and so forth.

I needed to do a major overhaul of the code to support different platforms. An option was added to the gprbuild project file, and this option selects the source directories used to compile the library. All the sources have been distributed over different directories: one directory, adainclude, for generic (non-platform specific) code and multiple directories under the platform directory for all those files that differ per system. Now that the source files are spread over different directories, the runtime can only be used once it is installed[1].

To use this new code, download the patch and signature;

After pressing, run the following magic commands in the zfp directory[2];

make clean MODE=x86_64-asm

make MODE=x86_64-asm

make install MODE=x86_64-asm PREFIX=prefix-asm

To check;

cd examples

make clean

make RTS=../prefix-asm

This will build the assembly based GNAT library. For the C based library, run the following in the zfp directory;

make clean

make MODE=x86_64-c

make install MODE=x86_64-c PREFIX=prefix-c

Again, to check

cd examples

make clean

make RTS=../prefix-c

Once built and installed into a prefix directory, the C and asm libraries can be used to build the examples, just like the default GNAT runtime. The only thing to set is the runtime directory, with the RTS variable.

  1. At installation time the source files will nicely be put into the target adainclude directory by the gprinstall command.
  2. make is necessary; gprbuild is fine for building Ada libraries and executables, but when it comes to a simple rule to copy a file to a new name (so that gprinstall can pick it up and install that file) you can forget about it.

GNAT Zero Foot Print - Take 3 - Regrind

August 7th, 2018

No new code in this installment. Instead, a regrind of all 3 patches, after a helpful suggestion to do so by Diana Coman. With this regrind, I updated the patches to follow the current thinking in vpatch management: the whole package under a common subdirectory, addition of a manifest, and all files hashed with Keccak.

You can download and press the files with

v.pl init http://ave1.org/code/zfp

but you will have to comment out the hash checking code.

GNAT Zero Foot Print - Take 2 - No C

July 6th, 2018

"Libc gotta go."

—Stanislav Datskovskiy

And it will. In an as yet unknown number of steps, the C library can be ripped out of the Ada runtime library and replaced with Ada and assembly code. In the first step, all C calls need to be replaced with Ada code and possibly some assembly to perform system calls to the Linux kernel. The second step is then to replace the C library specific start-up code with code for Ada.

I start with the previous version of a minimal ZFP library for Linux. This library uses only two calls to the C library, one to output characters and the other to exit the program. Both are replaced with a direct system call[1]. The second change is to include a file with startup code[2]. The resulting code is published in the following vpatch (with signature).

Combine this patch with those from the previous installment, press and build it. Building the code needs to be done with the Makefile[3].

<<create a directory and put a .wot directory in it with at least my key>>

v.pl init http://ave1.org/code/zfp

v.pl p a zfp_2_noc.vpatch

cd a

make

cd examples

make

All system calls can be found in the adainclude/s-syscal.adb file. The Write function (used for outputting characters) is implemented as a single assembly statement, syscall, with the parameter list specified to fill the right processor registers. The function starts with a conversion from characters to bytes[4] and ends with a check of the return value. After a completed system call the RAX register will be filled with a return code. If an error occurred during execution of the system call, the register will contain the error code as a negative number (always between -1 and -4096). If the execution was successful, the register will contain 0 or any other 64-bit number outside of the range -1 to -4096.

function Write (fd : in Int; S : in String; E : out ErrorCode) return Int is
    type byte is mod 2**8;
    B : array (S'Range) of byte;
    R : Int := 0;
 begin
    for I in S'Range loop
       B (I) := Character'Pos (S (I));
    end loop;
    Asm
      ("syscall",
       Outputs => (Int'Asm_Output ("=a", R)),
       Inputs  =>
         (Int'Asm_Input ("a", SYSCALL_WRITE),
          Int'Asm_Input ("D", fd),
          System.Address'Asm_Input ("S", B'Address),
          Int'Asm_Input ("d", B'Length)),
       Volatile => True);
    if R < 0 and R >= -(2**12) then
       E := ErrorCode'Val (-R);
       R := -1;
    else
       E := OK;
    end if;
    return R;
 end Write;
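The return-code convention described above can also be modelled outside Ada. What follows is a minimal Python sketch and not part of the patch: `to_signed64` and `decode_syscall_result` are hypothetical helpers that take the raw 64-bit value left in RAX and apply the same test as the Write function. The -4096 bound is taken from the Ada code above.

```python
import errno

ERROR_FLOOR = -(2 ** 12)  # same bound as the Ada code: -4096


def to_signed64(raw):
    """Interpret an unsigned 64-bit register value as a signed number."""
    raw &= 0xFFFFFFFFFFFFFFFF
    return raw - (1 << 64) if raw & (1 << 63) else raw


def decode_syscall_result(raw_rax):
    """Return (result, error_code) in the style of the Ada Write function.

    Hypothetical helper: on error the result is -1 and error_code is the
    positive error number, otherwise error_code is 0 (OK in the Ada code).
    """
    r = to_signed64(raw_rax)
    if ERROR_FLOOR <= r < 0:
        return -1, -r
    return r, 0


if __name__ == "__main__":
    # 12 bytes written, no error
    print(decode_syscall_result(12))
    # -EBADF as it would sit in RAX (two's complement)
    raw = (1 << 64) - errno.EBADF
    result, code = decode_syscall_result(raw)
    print(result, errno.errorcode[code])
```

The sketch only shows the arithmetic; the real check stays in the Ada function, directly after the syscall statement.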

The a-textio.adb and last_chance_handler.adb files have been updated to use the system calls instead of the C library. The s-maccod.ads file was added from the GNAT runtime library to support the inline assembly code. The other addition is the startup.S file. In its simplest working form it just needs to contain one definition of a global (_start), a call to a main function and a syscall to exit the program;

.global _start

_start:
  call main

  /* exit code */
  mov $60, %rax
  mov $0, %rdi
  syscall

The version in the patch also stores the argument count and a pointer to the argument array in two globals. Both globals are unused for now but will be needed for future parsing of any command line arguments.

The final noteworthy change is the inclusion of a runtime.xml file. The gprbuild command will use this file to set flags for all projects that are built with the runtime library. For reasons, this file is written as an XML file containing gprbuild project statements;

<?xml version="1.0" ?>

<gprconfig>
  <configuration>
   <config>
   package Linker is
      for Required_Switches use Linker'Required_Switches &amp;
        ("${RUNTIME_DIR(ada)}/adalib/libgnat.a") &amp;
        ("-nostdlib", "-nodefaultlibs", "-lgcc");

      for Required_Switches use Linker'Required_Switches &amp;
          ("${RUNTIME_DIR(ada)}/adalib/start.o");
   end Linker;

   package Binder is
      for Required_Switches ("Ada") use Binder'Required_Switches ("Ada") &amp;
       ("-nostdlib") ;
   end Binder;
   </config>
  </configuration>
</gprconfig>

The linker flags are set so that no standard C library or startup code is included in the resulting binary. As we are then lacking the default startup code, an extra line is added to include the start.o code with every compile.

In the end, the fun part: a working binary. The hello world example from the previous installment can be built and its size inspected. It is now at 2.6k (down from 54k) on my computer[5].

In the final end, I will include another reference to AdaCore's configurable runtime documentation. The GNAT documentation has been very helpful for learning the GNAT system and developing this library.

  1. The main difficulty in doing so is to learn how the Linux system calls work and to get a better understanding of the inline assembly statements. Stan's demo.asm posted in the logs proved very helpful for this process.
  2. This file is now written in assembly, although (upon reflection) it should be possible to rewrite it in Ada.
  3. I did not find a method to compile one separate file into an object file with gprbuild.
  4. Which in practice will be a copy operation.
  5. Of course, this minimal library is too minimal. In some cases (for example when a string is concatenated) the compiler will generate memcpy or memset calls. We need to provide replacement Ada functions for each. This is not difficult as the ada 2017 code contains pure Ada versions for all of these.

Building GNAT on MUSL, updated tar line

June 3rd, 2018

An update on the previous version.

The produced gcc compiler builds static executables and no dynamically linked executables.

For detailed instructions on how to run the script see the readme-2018-06-01.txt.

PGPy a review

May 29th, 2018

The code of PGPy[1] sucks.

A good indication of the quality of a Python package is the 'requirements.txt' file, reproduced here;

cryptography>=1.1
enum34
pyasn1
six>=1.9.0
singledispatch

The cryptography package will need to be reviewed separately. A quick look at the PyPI package index for cryptography is already good for some lulz[2]. The enum34 package brings the Python 3 enumeration type to Python 2. Only one object is used from the package pyasn1 and the functionality provided in this object could all have been implemented in an hour in PGPy. If you see six as a requirement, you know you are in trouble. The six package is for when a package author wants to program in Python 2 but also wants to make it's[3] program work in Python 3 without any conversion. So six indicates that you will be reading code that is not Python 2 and will use the from __future__ import print_function, the from __future__ import division and more. Any author writing packages requiring six can be safely negrated. The singledispatch package is again something from Python 3[4]. Based on these requirements alone, I conclude that PGPy sucks.

Next, the types.py file in the pgpy directory. The code in this file failed to run on my systems and so triggered this review. The first class definition therein is an Armorable class. Clearly the authors did not know it is forbidden to define any -able class in Python. The Armorable class contains the full implementation of converting objects into armored[5] text and vice versa. This is a mistake, as -able stands for "capable of being ..."; the being in that fragment will need to be implemented by something else. If something is drinkable it usually does not drink itself, but it has properties that make it drinkable to someone. And which of its many properties make it drinkable is determined by the drinker, not by the drunk. Based on this class alone, I conclude that PGPy sucks.

Two classes in types.py are defined with a meta class (the Armorable and the MetaDispatchable). The whole metaclass mess is defined in PEP-3119. Go and read it if you want to waste your time. The definition of MetaDispatchable provides for an extra complex and custom object-orientation. Remember, we are reviewing a package to handle PGP code. Another strike against PGPy and I will not bore you with more.

  1. Yes, not pgpy or PGPY but PGPy.
  2. The first and only example is a Fernet symmetric encryption recipe, as if that is something.
  3. Yes, it's.
  4. I'm not following Python 3, but clearly the development of Python 3 has gone off the deep end.
  5. A PGPism.

Convert a TMSR key to PGP

May 29th, 2018

A script is floating around to convert the TMSR key format (e,n,comment) to a PGP key for digesting in phuctor. This script did not work on the machines I tried it on. Of course, the script is fine, it's PGPy that is broken. I could not get it to install. As I'm programming in Python for a living and have all kinds of stupid in me, I decided to try and fix the PGPy code that failed to install. An hour was thus spent and some material gathered for a future blog post, but no working code[1].

After that I decided to spend another hour making an alternative that uses only standard Python modules. I read RFC 4880 a month ago, and it left me with a headache back then. The thing is unreadable. So to make this script, I made extensive use of the search function in my browser and only read those lines that helped in writing the script.

The script;

import struct
import time
import sys
import base64
import math

# some format strings for the struct module
# these are used to encode integers and shorts to arrays of bytes
# '>' stands for big-endian as this is what is used in the PGP format
openpgp_publickey_format = ">BIB"
mpi_format = ">H"
packet_length_format = ">I"
crc_format = ">I"

# determine the index of the highest bit set to 1 in a number
def count_bits(B):
  R = 0
  i = 0
  while B > 0:
    i += 1
    if B & 0x1:
      R = i
    B >>= 1
  return R

# Convert a number to an array of bytes
# The bytes in the array are stored in big-endian order.
# The most significant byte is stored as the first item
# in the array
def number_to_bytes(B):
  R = []
  bits = 0
  while B > 0xff:
    bits += 8
    R.append(B & 0xff)
    B >>= 8
  R.append(B)
  bits += count_bits(B)
  return bits, ''.join(map(chr, reversed(R)))

# An MPI is a byte array that starts with a two byte
# length header. The length is given in bits.
def number_to_mpi(B):
  C, A = number_to_bytes(B)
  return struct.pack(mpi_format, C) + A

# A PGP public key header consists of a byte "4",
# an integer (4 bytes) to denote the timestamp
# and a byte "1" (RSA).
def public_key_header(T):
  return struct.pack(openpgp_publickey_format, 4, T, 1)

# A public key packet is the public key header
# plus 2 MPI numbers, the RSA modulus (N) and
# the RSA exponent (e).
def public_key_packet(t, n, e):
  return ''.join((public_key_header(t), number_to_mpi(n), number_to_mpi(e),))

# A comment or userid packet is a string encoded as utf-8
def userid_packet(s):
  return s.encode('utf8')

# The PGP format is a stream of "packets".
# Each packet has a header. This header consists of a tag
# and a length field. The tag has flags to determine if it is a
# "new" or "old" packet.
# The only supported encoding in this script is "new".
def encode_packet(packet_bytes, tag = 6):
  # 0x80, 8th bit always set, 7th bit set --> new packet
  h = 0x80 | 0x40
  # 0-5 bits -> the tag
  h |= tag

  # convert the integer to a byte
  header = chr(h)

  # dude, this is totally how you may save 2 or 3 bytes with minimal complexity
  l = len(packet_bytes)
  if l < 192:
    header += chr(l)
  elif l < 8384:
    l -= 192
    o1 = l >> 8
    o2 = l & 0xff
    header += chr(o1 + 192) + chr(o2)
  else:
    header += chr(0xff) + struct.pack(packet_length_format, l)

  return header + packet_bytes

# When you encode binary data as an ascii text with base64
# this data becomes fragile. So a CRC code is needed to
# fix this.
def crc24(s):
  R = 0xB704CE
  for char in s:
    B = ord(char)
    R ^=  B << 16
    for i in range(8):
      R <<= 1
      if R & 0x1000000:
        R ^= 0x1864CFB
  return R & 0xFFFFFF

# Create a public key for consumption by Phuctor.
# The public key needs to contain 2 packets
# one for the key data (n, e)
# one for the comment
# It must be in the armor / ascii format.
def enarmored_public_key(n, e, comment, t):
  R = []
  # the header
  R.append("-----BEGIN PGP PUBLIC KEY BLOCK-----")
  R.append("")

  # the packets in bytes
  A = encode_packet(public_key_packet(t, n, e), 6)
  A += encode_packet(userid_packet(comment), 13)

  # the packets in base64 encoding with line length max 76
  s=base64.b64encode(A)
  i = 0
  while i < len(s):
    R.append(s[i:i+76])
    i += 76

  # the CRC
  R.append("="+base64.b64encode(struct.pack(crc_format, crc24(A))[1:]))

  # the footer
  R.append("")
  R.append("-----END PGP PUBLIC KEY BLOCK-----")

  return '\n'.join(R)

# read a file with comma separated lines
# each line is in the TMSR format: e,n,comment
if __name__ == "__main__":
  ser = 1
  for x in sys.stdin:
    x = x.strip()

    # ignore empty lines
    if len(x) == 0 or x.startswith('#'):
      continue

    # the comment may contain commas so split on the first 2
    e,n,comment = x.split(',', 2)

    t0 = int(time.time())
    with open("{0}.txt".format(ser), "wb") as stream:
      stream.write(enarmored_public_key(int(n), int(e), comment, t0))

    ser += 1
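The three encodings the script relies on (the new-format packet length, the MPI and the CRC-24) can be checked in isolation. The following is a minimal sketch in Python 3, whereas the script above is Python 2, so it works on bytes instead of str. The function names are my own and do not appear in the script. The decoder exists only to show that the length encoding round-trips.

```python
import struct


def encode_new_length(n):
    """New-format packet body length (RFC 4880, section 4.2.2)."""
    if n < 192:
        return bytes([n])
    if n < 8384:
        n -= 192
        return bytes([(n >> 8) + 192, n & 0xFF])
    return b"\xff" + struct.pack(">I", n)


def decode_new_length(data):
    """Inverse of encode_new_length; returns (length, octets consumed)."""
    first = data[0]
    if first < 192:
        return first, 1
    if first < 224:
        return ((first - 192) << 8) + data[1] + 192, 2
    if first == 255:
        return struct.unpack(">I", data[1:5])[0], 5
    raise ValueError("partial body lengths are not handled in this sketch")


def mpi(n):
    """Two-octet bit count followed by the big-endian magnitude."""
    bits = n.bit_length()
    return struct.pack(">H", bits) + n.to_bytes((bits + 7) // 8, "big")


def crc24(data):
    """CRC-24 as used for the armor checksum (RFC 4880, section 6.1)."""
    crc = 0xB704CE
    for octet in data:
        crc ^= octet << 16
        for _ in range(8):
            crc <<= 1
            if crc & 0x1000000:
                crc ^= 0x1864CFB
    return crc & 0xFFFFFF


if __name__ == "__main__":
    # every boundary of the three length forms must survive a round trip
    for n in (0, 191, 192, 8383, 8384, 100000):
        encoded = encode_new_length(n)
        assert decode_new_length(encoded) == (n, len(encoded))
    print(mpi(511).hex())  # 000901ff, the MPI example given in RFC 4880
    print(hex(crc24(b"123456789")))
```

The two-octet form is the one worth testing: the first octet carries the high bits of (length - 192), so the shift must be by 8 bits.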

And the patch itself with signature;

  1. I've been reading code (both open and closed source) for a large part of my life. I started this whole career by typing over BASIC programs into my father's Commodore 128 and then stumbled along. The code I read in these popular security programs (pgpy, openssl, openssh, pgp) is markedly worse than any I encountered before. I can only imagine the kind of cockroaches that are attracted to this foul smelling mess.

Building GNAT on MUSL, now always static

May 28th, 2018

An update on the previous version.

The produced gcc compiler builds static executables and no dynamically linked executables[1].

For detailed instructions on how to run the script see the readme-2018-05-28.txt.

Updated!, the 2018-05-28 file contained a broken patch

New version!

  1. Building dynamically linked executables is controlled by a specfile. GCC has a builtin specfile, code for this file can be found in the gcc/config directory of the gcc source. The C++ and Ada front-ends have a slightly different handling of these settings and for these also some code had to be changed. The changes can be found in the 'gcc-4.9.adacore2016-2-musl.diff' file.

Building GNAT on MUSL, now with partial and parallel build support

May 15th, 2018

An update on the previous version.

The previous version needed git for getting the source of libelf-compat, this was ripped out[1].
The previous version cleaned and reused the build directory for different stages, this was changed to use a separate build directory for each stage and no more cleanup. Last, the script now builds with parallel make options, further speeding up the build process[2].

  1. A tarred version of libelf-compat does exist on the internets, however that version does not match the one in the git repository.
  2. The problem with building in parallel seemed to be with some of the Ada specific build rules. But changing those rules did nothing to fix the problem. In fact, the installation of a previous step had failed. This was a direct result of using the environment variable MAKEFLAGS: this environment variable is used in the scripts but is also read by the make program. So, make install was run with parallel jobs and promptly failed. The actions of one of the rules used a variable in a loop and that variable was changed by an action in another rule. The fix was 2-fold: use MAKEOPTS, and change the install step to always use make -j1.

Building GNAT on MUSL, now with ARM 64-bit support

April 30th, 2018

An update on the previous version. I thought that version already supported ARM 64-bit processors[1], but unfortunately it did not.

So another debug cycle ensued. As it turned out, the code that is used to generate aarch64 instructions had a wrong #ifdef line. Once this bug was fixed, the next bug cropped up, and with the time between a possible fix and the correct fix running into days, the whole exercise took weeks[2]. After a week or so, the cross-compiler seemed to work. Next, I wanted to compile a native compiler for the target platform with the help of the cross-compiler. Again, time was spent, and fortunately a compiler could be built and after some more work[3] it can build the FFA code on aarch64 systems.

Some small things are left to do; the scripts can do with some clean-up and the native compilers are not tarred to a file, finally, I want to try little-endian ppc.

Update: The native compilers do not contain the gprbuild tool, this is still on the todo list.

  1. Build scripts designate these processors with 'aarch64'.
  2. Try the fix, start a rebuild which will take an hour if unsuccessful and 3 if successful, look at it after half a day or more, see that it failed, try something else, etc. etc.
  3. It also needs an assembler and a C library.

Sending arrays of octets between C and Ada

March 2nd, 2018

The Ada language and the C language have very different concepts of strings and characters. I'll try to follow Ossasepia and use the term octet for an 8-bit integer, char for C and Character for Ada. In Ada the Character is defined as an enumerated type ranging from the Ada.Characters.Latin_1.NUL (Character'Val (0)) character to the Ada.Characters.Latin_1.LC_Y_Diaeresis (Character'Val (255)) character. This range is exactly the same as the range of all valid octets and so characters can be stored as octets. As characters are supposed to represent a different domain than the natural numbers, these need to be converted back and forth with the Character'Val and Character'Pos attributes. In C it is all a lot muddier and a char can either be seen as an 8-bit integer or as a symbol for a language, it all depends on the context. So far for context, now to address the cause: how to convert arrays of characters between C and Ada[1]. I consider the Char_Ptr support from the Ada standard library out of bounds[2]; this investigation is based on the char_array type from the Ada package Interfaces.C. First, the code;

The code defines 5 different methods to interact with strings between C and Ada;

  1. The basic method, call a function from Ada to C. The char_array in Ada will turn into a char * pointer in C. A parameter needs to be added to pass the length of the array.
  2. The basic method, now two way. Ada will call C and C will immediately call Ada. The Ada function uses a constrained character array so no count parameter is needed for it.
  3. Call to C as in (1), but then C calls Ada using the Ada calling convention, a GNAT specific method. The char_array parameter of the Ada procedure is a structure in C, and the layout of this structure is based on how GNAT does this internally. Note that Ada procedures can be exported using the Ada calling convention.
  4. The reverse of (3): import a C function but use the Ada calling convention, so the C function receives the same structure.
  5. Like (2) but now the Ada procedure does not have a constrained character array as parameter but an unconstrained one, so a count parameter is needed for Ada too.

First, to define the procedures (please also read the calling conventions section of the GNAT documentation):

with Interfaces.C; use Interfaces.C;

package C_Array is

        -- The basic method, call C using a pointer and a count
        procedure C_Fill_1(CH : in out char_array; Count : Integer);
        pragma Import(C, C_Fill_1, "c_fill_1");

        -- Same method as 'C_Fill_1', but the C function will call Ada.
        procedure C_Fill_2(CH : in out char_array; Count : Integer);
        pragma Import(C, C_Fill_2, "c_fill_2");

        -- Same method as 'C_Fill_1', but the C function will call Ada using Ada calling conventions
        procedure C_Fill_3(CH : in out char_array; Count : Integer);
        pragma Import(C, C_Fill_3, "c_fill_3");

        -- Call to C using Ada calling conventions
        procedure C_Fill_4(CH : in out char_array);
        pragma Import(Ada, C_Fill_4, "c_fill_4");

        -- Same method as 'C_Fill_1', the C function will call Ada using an unconstrained array and a count.
        procedure C_Fill_5(CH : in out char_array; Count : Integer);
        pragma Import(C, C_Fill_5, "c_fill_5");

        -- For method 2, the C function will make a call to a function with a constrained array parameter
        subtype constrained_char_array is char_array(0 .. 100);
        procedure ADA_Fill_2(CH : in out constrained_char_array);
        pragma Export(C, ADA_Fill_2, "ada_fill_2");

        -- For method 3, the C function will make a call to Ada using Ada calling conventions.
        procedure ADA_Fill_3(CH : in out char_array);
        pragma Export(Ada, ADA_Fill_3, "ada_fill_3");

        -- For method 5, the C function will make a call to a function with an unconstrained array parameter
        procedure ADA_Fill_5(CH : in out char_array; Count: Integer);
        pragma Export(C, ADA_Fill_5, "ada_fill_5");

end C_Array;

Next, the more interesting bit, the C functions;

#include <stdint.h>
#include <stdio.h>

typedef struct B {
        size_t LB0;
        size_t UB0;
} B_t;

typedef struct U {
        char * P_ARRAY;
        B_t * P_BOUNDS;
} U_t;

void ada_fill_2(char * buffer);
void ada_fill_3(U_t array, int count);
void ada_fill_5(char * buffer, int count);

void c_fill_1(char * buffer, int count) {
        int i;

        printf("c_fill_1; buffer = %p, count = %d\n", buffer, count);

        for(i = 0; i < count; i++) {
                buffer[i] = '1';
        }
}

void c_fill_2(char * buffer, int count) {
        printf("c_fill_2; buffer = %p, count = %d\n", buffer, count);
        ada_fill_2(buffer);
}

void c_fill_3(char * buffer, int count) {
        B_t b;
        U_t a;
        b.LB0 = 0;
        b.UB0 = count;
        a.P_ARRAY = buffer;
        a.P_BOUNDS = &b;

        printf("c_fill_3; buffer = %p, count = %d\n", buffer, count);

        ada_fill_3(a, count);
}

void c_fill_4(U_t array) {
        int i = 0;
        char * buffer = array.P_ARRAY;

        printf("c_fill_4; buffer = %p, count = %d\n", array.P_ARRAY, (int)(array.P_BOUNDS->UB0 - array.P_BOUNDS->LB0));

        for(i = array.P_BOUNDS->LB0; i <= array.P_BOUNDS->UB0; i++) {
                buffer[i] = '4';
        }
}

void c_fill_5(char * buffer, int count) {
        printf("c_fill_5; buffer = %p, count = %d\n", buffer, count);
        ada_fill_5(buffer, count);
}

The first 2 functions are simple. Because the array is constrained in Ada there is no need for the count parameter in the call to Ada; however, the actual length of the array in C must always be the same as the one in Ada.

Next, the two methods that took the most time to figure out. I could not find any useful description of the so-called Ada calling convention. No such convention seems to be specified, and every Ada implementation is free to implement it as it sees fit. The C code will be tied to GNAT when using this method and maybe even to specific versions of GNAT. The Ada calling convention for arrays is implemented in the interface between the GNAT frontend and the GCC backend[3]. In this interface, the GNAT code tree is converted into a GCC code tree, and most of these functions are recursive and try to get information from different parts of both trees. In short, reading this code is not that easy, but from the code I could determine that unconstrained arrays are sent as a structure with two fields: one field is a pointer to the start of the array, and the other is a pointer to a second structure, which again has two fields, one for the lower bound and one for the upper bound of the array. The exact layout of the structure was a bit harder to determine, so an extra compiler flag was needed, -fdump-tree-original[4]. From that dump, I could determine the structure[5]. The C function is not more secure with this structure, but the Ada implementation will be.

Finally, we finish with the more usual way of calling an Ada function with an unconstrained char_array and a count variable.
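To make the two-pointer structure concrete, here is a minimal sketch in Python using ctypes. It only models the layout of B_t and U_t from the C code above, and it calls neither GNAT nor the C functions. The class and function names (Bounds, FatPointer, make_fat_pointer, fill) are my own, chosen for illustration.

```python
import ctypes


class Bounds(ctypes.Structure):
    # mirrors B_t: lower and upper bound, both size_t for char_array
    _fields_ = [("LB0", ctypes.c_size_t), ("UB0", ctypes.c_size_t)]


class FatPointer(ctypes.Structure):
    # mirrors U_t: pointer to the data and pointer to the bounds
    _fields_ = [
        ("P_ARRAY", ctypes.POINTER(ctypes.c_char)),
        ("P_BOUNDS", ctypes.POINTER(Bounds)),
    ]


def make_fat_pointer(buffer, lower=0):
    """Wrap a ctypes char buffer the way c_fill_3 does before calling Ada."""
    bounds = Bounds(lower, lower + len(buffer) - 1)
    fat = FatPointer(
        ctypes.cast(buffer, ctypes.POINTER(ctypes.c_char)),
        ctypes.pointer(bounds),
    )
    return fat, bounds  # keep 'bounds' alive for as long as 'fat' is used


def fill(fat, value):
    """Fill every element, using only the information in the fat pointer."""
    lb = fat.P_BOUNDS.contents.LB0
    ub = fat.P_BOUNDS.contents.UB0
    for offset in range(ub - lb + 1):
        fat.P_ARRAY[offset] = value


if __name__ == "__main__":
    work = ctypes.create_string_buffer(101)  # like char_array (0 .. 100)
    fat, bounds = make_fat_pointer(work)
    fill(fat, b"3")
    print(bounds.LB0, bounds.UB0, work.raw[:8])
```

Note that the upper bound is inclusive, as in Ada, so a buffer of 101 elements gets the bounds 0 and 100.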

For reference, the Ada implementation. Note that for the fifth case we cannot use the upper bound of the array. This upper bound is undefined (and in practice will be the maximum value of the given range type).

with Ada.Text_IO; use Ada.Text_IO;
with Ada.Integer_Text_IO; use Ada.Integer_Text_IO;
with Ada.Long_Integer_Text_IO; use Ada.Long_Integer_Text_IO;

package body C_Array is

        -- We have a statically defined length so the range will be fine.
        -- The call in C code to this procedure must use a buffer with at least the constrained range.
        procedure ADA_Fill_2(CH : in out constrained_char_array) is
        begin
                Put("ada_fill_2;");
                Put(" lb=" & size_t'Image(CH'First));
                Put(" ub=" & size_t'Image(CH'Last));
                New_line;

                for I in CH'Range loop
                        CH(I) := To_C('2');
                end loop;
        end Ada_Fill_2;

        -- The call in the C code needs to send an Ada array.
        procedure ADA_Fill_3(CH : in out char_array) is
        begin
                Put("ada_fill_3;");
                Put(" lb=" & size_t'Image(CH'First));
                Put(" ub=" & size_t'Image(CH'Last));
                New_line;

                for I in CH'Range loop
                        CH(I) := To_C('3');
                end loop;
        end Ada_Fill_3;

        -- For calls from C without a constrained type or Ada array, an extra count parameter is needed.
        procedure ADA_Fill_5(CH : in out char_array; Count: Integer) is
        begin
                Put("ada_fill_5; count="); Put(Count);
                Put(" lb=" & size_t'Image(CH'First));
                Put(" ub=" & size_t'Image(CH'Last));
                New_line;

                -- the Range cannot be used, the 'Last index is wrong.
                for I in ch'First .. size_t(Count) loop
                        CH(I) := To_C('5');
                end loop;
        end Ada_Fill_5;
end C_Array;

The code includes a simple main program that calls all five functions;

with C_Array; use C_Array;
with Ada.Text_IO; use Ada.Text_IO;
with Ada.Integer_Text_IO; use Ada.Integer_Text_IO;

with Interfaces.C; use Interfaces.C;

procedure Array_Main is
        work_array : char_array(0 .. 100);
        output_string : String(1 .. 101) := (others => ' ');
        C : Integer := 0;
begin
        Put("start");
        New_Line;
        C_fill_1(work_array,100);
        To_Ada(work_array, output_string, C, False);
        Put("c_fill_1 output ="); Put(output_string);
        New_Line;

        C_fill_2(work_array,100);
        To_Ada(work_array, output_string, C, False);
        Put("c_fill_2 output ="); Put(output_string);
        New_Line;

        C_fill_3(work_array,100);
        To_Ada(work_array, output_string, C, False);
        Put("c_fill_3 output ="); Put(output_string);
        New_Line;

        C_fill_4(work_array);
        To_Ada(work_array, output_string, C, False);
        Put("c_fill_4 output ="); Put(output_string);
        New_Line;

        C_fill_5(work_array,100);
        To_Ada(work_array, output_string, C, False);
        Put("c_fill_5 output ="); Put(output_string);
        New_Line;
end;

The output will be;

start
c_fill_1; buffer = 0x7ffdf87fd910, count = 100
c_fill_1 output =1111111111111111111111111111...
c_fill_2; buffer = 0x7ffdf87fd910, count = 100
ada_fill_2; lb= 0 ub= 100
c_fill_2 output =2222222222222222222222222222...
c_fill_3; buffer = 0x7ffdf87fd910, count = 100
ada_fill_3; lb= 0 ub= 100
c_fill_3 output =3333333333333333333333333333...
c_fill_4; buffer = 0x7ffdf87fd910, count = 100
c_fill_4 output =4444444444444444444444444444...
c_fill_5; buffer = 0x7ffdf87fd910, count = 100
ada_fill_5; count=        100 lb= 0 ub= 18446744073709551615
c_fill_5 output =5555555555555555555555555555...

A random remark; it is not a good idea to call To_Ada on the unconstrained array from method 5[6]. First, To_Ada is not more efficient than a character by character conversion; in fact it is implemented as such. Second, To_Ada will use the 'Last attribute of the char_array, and that is set to the maximum value of size_t (Ada will check against this bound, but a segmentation fault will happen first). Either copy the char_array to a constrained char_array first, or write a custom conversion function.

Another random remark; the Ada standard library can be studied with the Ada 2012 LRM and understood better with the GNAT source code. It helps to have a cross-referenced, browser readable version of the GNAT source code at hand (there is one in the GNAT Book, but that one is incomplete). To make such a version do:

0) Make a directory and go to it

    mkdir ada-html
    cd ada-html

1) Find the gnat runtime library
    (i.e. the directory containing adainclude and adalib)
    It should be in the gnat install directory,
    as the lib/gcc/<machine>/<gcc version>/ directory

    For AdaCore 2016 (Linux 64-bit), this directory can be set with:

    RT_DIR=$(dirname $(which gnatmake))/../lib/gcc/x86_64-pc-linux-gnu/4.9.4

2) Copy the source files from adainclude:

    cp $RT_DIR/adainclude/*.ad* .

3) Copy the ali files from adalib (needed for cross references):

    cp $RT_DIR/adalib/*.ali .

4) Make the html files with the `gnathtml.pl` script:

    gnathtml.pl -f -D *.ad*

5) Go to the 'html' directory and look (open index.htm in a browser):

    cd html
    ls
  1. I've done this a couple of times and the knowledge thus gained was largely anecdotal. This won't do for republic business, hence this article.
  2. Please read the GNAT source code file i-cstrin.adb. This should put you off the idea of using the Char_Ptr.
  3. All the code can be found in the gcc/ada/gcc-interface directory.
  4. This flag will dump the gcc version of the tree in a somewhat readable fashion. The dump does omit information so it also needs to be followed by a recompile and dump with -fdump-tree-original-raw (see also the GCC Command Line Switches). The second dump can be used to determine the types of the fields in the structures.
  5. For every kind of array this structure will have the same fields but the fields will have different types. For an integer array, the pointer will be an integer pointer. For Strings the boundary fields (LB0/UB0) will be 32-bit integers. So this kind of interfacing needs to be repeated on a case by case basis.
  6. This can be determined from reading the code in the i-c.adb file.