Writing A Cygwin Friendly Package

One reason to use the Cygwin support offered by GNU Autotools in your own package is to have it compile nicely on both Unix and Windows; another is to tweak the configuration of existing packages which use GNU Autotools but which do not compile under Cygwin, or do not behave quite right after compilation. There are several things you need to be aware of in order to design a package to work seamlessly under Cygwin, and several more if portability to DOS and (non-Cygwin) Windows is important too. We discussed many of these issues in the section called Unix/Windows Issues in the chapter called Writing Portable C with GNU Autotools. In this section, we will expand on those issues with ways in which GNU Autotools can help deal with them.

If you only need to build executables and static libraries, then Cygwin provides an environment close enough to Unix that any packages which ship with a relatively recent configuration will compile pretty much out of the box, except for a few peculiarities of Windows which are discussed throughout the rest of this section. If you want to build a package which has not been maintained for a while, and which consequently uses an old Autoconf, then it is usually just a matter of removing the generated files, rebootstrapping the package with the installed (up to date!) Autoconf, and rerunning the configure script. On occasion some tweaks will be needed in the configure.in to satisfy the newer autoconf, but autoconf will almost always diagnose these for you while it is being run.

Text vs Binary Modes

As discussed in the section called Text and Binary Files in the chapter called Writing Portable C with GNU Autotools, text and binary files are different on Windows. Lines in a Windows text file end in a carriage return/line feed pair, but a C program reading the file in text mode will see a single line feed.
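The difference can be observed directly. The following is a minimal sketch, not part of any library: the function name newline_size_on_disk and the scratch file name used with it are hypothetical. It writes one newline through fopen in the requested mode, then counts the bytes that actually reached the disk by rereading the file in binary mode.

```c
#include <assert.h>
#include <stdio.h>

/* Write a single "\n" to `path' using fopen mode `mode', then return
   the number of bytes stored in the file, or -1 on error.  The file
   is reread in binary mode so that no translation hides the result.
   The scratch file is removed afterwards.  */
long
newline_size_on_disk (const char *path, const char *mode)
{
  FILE *fp;
  long size = 0;

  assert (path != NULL && mode != NULL);

  fp = fopen (path, mode);
  if (fp == NULL)
    return -1;
  fputc ('\n', fp);
  if (fclose (fp) != 0)
    return -1;

  fp = fopen (path, "rb");
  if (fp == NULL)
    return -1;
  while (fgetc (fp) != EOF)
    ++size;
  fclose (fp);
  remove (path);

  return size;
}
```

Called with mode "wb", the function should report one byte everywhere. Called with mode "w", it reports one byte on Unix, and would be expected to report two on a system that translates text files to CR LF.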

Cygwin has several ways to hide this dichotomy, and the solution(s) you choose will depend on how you plan to use your program. I will outline the relative tradeoffs you make with each choice:

mounting

Before installing an operating system to your hard drive, you must first organise the disk into partitions. Under Windows, you might only have a single partition on the disk, which would be called C: [1] . Provided that some media is present, Windows allows you to access the contents of any drive letter - that is you can access A: when there is a floppy disk in the drive, and F: provided you divided your available drives into sufficient partitions for that letter to be in use. With Unix, things are somewhat different: hard disks are still divided into partitions (typically several), but there is only a single filesystem mounted under the root directory. You can use the mount command to hook a partition (or floppy drive or CD-ROM, etc.) into a subdirectory of the root filesystem:
          $ mount /dev/fd0 /mnt/floppy
          $ cd /mnt/floppy
           

Until the directory is unmounted, the contents of the floppy disk will be available as part of the single Unix filesystem in the directory, /mnt/floppy. This is in contrast with Windows' multiple root directories which can be accessed by changing filesystem root - to access the contents of a floppy disk:
          C:\WINDOWS\> A:
          A:> DIR
          ...
           

Cygwin has a mounting facility to allow Cygwin applications to see a single unified file system starting at the root directory, by mounting drive letters to subdirectories. When mounting a directory you can set a flag to determine whether TEXT and BINARY mode files in that partition should be treated the same. Mounting a file system to treat TEXT files the same as BINARY files means that Cygwin programs can behave in the same way as they might on Unix and treat all files as equal. Mounting a file system to treat TEXT files properly will cause Cygwin programs to translate between Windows CR-LF line end sequences and Unix LF line endings, which plays havoc with file seeking, and with many programs which make assumptions about the size of a char in a FILE stream. However, binmode is the default method because it is the only way to interoperate between Windows binaries and Cygwin binaries.

You can get a list of which drive letters are mounted to which directories, and the modes they are mounted with, by running the mount command without arguments:
          BASH.EXE-2.04$ mount
          Device              Directory            Type        flags
          C:\cygwin           /                    user        binmode
          C:\cygwin\bin       /usr/bin             user        binmode
          C:\cygwin\lib       /usr/lib             user        binmode
          D:\home             /home                user        binmode
           

As you can see, the Cygwin mount command allows you to `mount' arbitrary Windows directories as well as simple drive letters into the single filesystem seen by Cygwin applications.

binmode

The CYGWIN environment variable holds a space separated list of setup options which exert some minor control over the way the cygwin1.dll (or cygwinb19.dll etc.) behaves. One such option is the binmode setting; if CYGWIN contains the binmode option, files which are opened through cygwin1.dll without an explicit text or binary mode, will default to binary mode which is closest to how Unix behaves.
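Since the value is just a space separated list, a program that wants to know whether a given option is present can scan it token by token. The sketch below is an illustration only: the function name option_list_contains is hypothetical, and it takes the list as a parameter (for example the result of getenv ("CYGWIN"), or an empty string if that is NULL) rather than reading the environment itself.

```c
#include <assert.h>
#include <string.h>

/* Return 1 if the space separated `list' contains `option' as a
   complete word, and 0 otherwise.  Partial matches do not count,
   so "nobinmode" is not mistaken for "binmode".  */
int
option_list_contains (const char *list, const char *option)
{
  size_t want;

  assert (list != NULL && option != NULL);

  want = strlen (option);
  while (*list != '\0')
    {
      size_t len;

      /* Skip the spaces that separate options.  */
      while (*list == ' ')
        ++list;

      /* Measure the next word and compare it with `option'.  */
      len = strcspn (list, " ");
      if (len > 0 && len == want && strncmp (list, option, len) == 0)
        return 1;

      list += len;
    }

  return 0;
}
```

Matching whole words matters here because option names can share a common suffix.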

system calls

cygwin1.dll, GNU libc and other modern C API implementations accept extra flags for fopen and open calls to determine in which mode a file is opened. On Unix it makes no difference, and sadly most Unix programmers are not aware of this subtlety, so this tends to be the first thing that needs to be fixed when porting a Unix program to Cygwin. The best way to use these calls portably is to use the following macros with a package's configure.in to be sure that the extra arguments are available:

     # _AB_AC_FUNC_FOPEN(b | t, USE_FOPEN_BINARY | USE_FOPEN_TEXT)
     # -----------------------------------------------------------
     define([_AB_AC_FUNC_FOPEN],
     [AC_CACHE_CHECK([whether fopen accepts "$1" mode], [ab_cv_func_fopen_$1],
     [AC_TRY_RUN([#include <stdio.h>
     int
     main ()
     {
        FILE *fp = fopen ("conftest.bin", "w$1");
        fprintf (fp, "\n");
        fclose (fp);
        return 0;
     }],
                 [ab_cv_func_fopen_$1=yes],
                 [ab_cv_func_fopen_$1=no],
                 [ab_cv_func_fopen_$1=no])])
     if test x$ab_cv_func_fopen_$1 = xyes; then
       AC_DEFINE([$2], 1,
                 [Define this if we can use the "$1" mode for fopen safely.])
     fi[]dnl
     ])# _AB_AC_FUNC_FOPEN
     
     # AB_AC_FUNC_FOPEN_BINARY
     # -----------------------
     # Test whether fopen accepts a "b" in the mode string for binary file
     # opening.  This makes no difference on most unices, but some OSes
     # convert every newline written to a file to two bytes (CR LF), and
     # every CR LF read from a file is silently converted to a newline.
     AC_DEFUN([AB_AC_FUNC_FOPEN_BINARY], [_AB_AC_FUNC_FOPEN(b, USE_FOPEN_BINARY)])
     
     # AB_AC_FUNC_FOPEN_TEXT
     # ---------------------
     # Test whether fopen accepts a "t" in the mode string for text file
     # opening.  This makes no difference on most unices, but other OSes
     # use it to assert that every newline written to a file writes two
     # bytes (CR LF), and every CR LF read from a file are silently
     # converted to a newline.
     AC_DEFUN([AB_AC_FUNC_FOPEN_TEXT],   [_AB_AC_FUNC_FOPEN(t, USE_FOPEN_TEXT)])
     
     
     # _AB_AC_FUNC_OPEN(O_BINARY|O_TEXT)
     # ---------------------------------
     AC_DEFUN([_AB_AC_FUNC_OPEN],
     [AC_CACHE_CHECK([whether fcntl.h defines $1], [ab_cv_header_fcntl_h_$1],
     [AC_EGREP_CPP([$1],
                   [#include <sys/types.h>
     #include <sys/stat.h>
     #include <fcntl.h>
     $1
     ],
                   [ab_cv_header_fcntl_h_$1=no],
                   [ab_cv_header_fcntl_h_$1=yes])
     if test "x$ab_cv_header_fcntl_h_$1" = xno; then
       AC_EGREP_CPP([_$1],
                    [#include <sys/types.h>
     #include <sys/stat.h>
     #include <fcntl.h>
     _$1
     ],
                     [ab_cv_header_fcntl_h_$1=0],
                     [ab_cv_header_fcntl_h_$1=_$1])
     fi])
     if test "x$ab_cv_header_fcntl_h_$1" != xyes; then
       AC_DEFINE_UNQUOTED([$1], [$ab_cv_header_fcntl_h_$1],
         [Define this to a usable value if the system provides none])
     fi[]dnl
     ])# _AB_AC_FUNC_OPEN
     
     
     # AB_AC_FUNC_OPEN_BINARY
     # ----------------------
     # Test whether open accepts O_BINARY in the mode string for binary
     # file opening.  This makes no difference on most unices, but some
     # OSes convert every newline written to a file to two bytes (CR LF),
     # and every CR LF read from a file is silently converted to a newline.
     #
     AC_DEFUN([AB_AC_FUNC_OPEN_BINARY], [_AB_AC_FUNC_OPEN([O_BINARY])])
     
     
     # AB_AC_FUNC_OPEN_TEXT
     # --------------------
     # Test whether open accepts O_TEXT in the mode string for text file
     # opening.  This makes no difference on most unices, but other OSes
     # use it to assert that every newline written to a file writes two
     # bytes (CR LF), and every CR LF read from a file are silently
     # converted to a newline.
     #
     AC_DEFUN([AB_AC_FUNC_OPEN_TEXT],   [_AB_AC_FUNC_OPEN([O_TEXT])])
     
     
      

Add the following preprocessor code to a common header file that will be included by any sources that use fopen calls:
     #define fopen	rpl_fopen
      

Save the following function to a file, and link that into your program so that in combination with the preprocessor magic above, you can always specify text or binary mode to open and fopen, and let this code take care of removing the flags on machines which do not support them:
     #if HAVE_CONFIG_H
     #  include <config.h>
     #endif
     
     #include <stdio.h>
     
     /* Use the system size_t if it has one, or fallback to config.h */
     #if STDC_HEADERS || HAVE_STDDEF_H
     #  include <stddef.h>
     #endif
     #if HAVE_SYS_TYPES_H
     #  include <sys/types.h>
     #endif
     
     /* One of the following headers will have prototypes for malloc
        and free on most systems.  If not, we don't add explicit
        prototypes which may generate a compiler warning in some
        cases -- explicit  prototypes would certainly cause
        compilation to fail with a type clash on some platforms. */
     #if STDC_HEADERS || HAVE_STDLIB_H
     #  include <stdlib.h>
     #endif
     #if HAVE_MEMORY_H
     #  include <memory.h>
     #endif
     
     #if HAVE_STRING_H
     #  include <string.h>
     #else
     #  if HAVE_STRINGS_H
     #    include <strings.h>
     #  endif /* !HAVE_STRINGS_H */
     #endif /* !HAVE_STRING_H */
     
     #if ! HAVE_STRCHR
     
     /* BSD based systems have index() instead of strchr() */
     #  if HAVE_INDEX
     #    define strchr index
     #  else /* ! HAVE_INDEX */
     
     /* Very old C libraries have neither index() or strchr() */
     #    define strchr rpl_strchr
     
     static inline const char *strchr (const char *str, int ch);
     
     static inline const char *
     strchr (const char *str, int ch)
     {
       const char *p = str;
       while (p && *p && *p != (char) ch)
         {
           ++p;
         }
     
       return (*p == (char) ch) ? p : 0;
     }
     #  endif /* HAVE_INDEX */
     
     #endif /* HAVE_STRCHR */
     
     /* BSD based systems have bcopy() instead of strcpy() */
     #if ! HAVE_STRCPY
     # define strcpy(dest, src)        bcopy(src, dest, strlen(src) + 1)
     #endif
     
     /* Very old C libraries have no strdup(). */
     #if ! HAVE_STRDUP
     # define strdup(str)                strcpy(malloc(strlen(str) + 1), str)
     #endif
     
     FILE *
     rpl_fopen (const char *pathname, const char *mode)
     {
         FILE *result = NULL;
         const char *p = mode;
     
         /* Scan to the end of mode until we find 'b' or 't'. */
         while (*p && *p != 'b' && *p != 't')
           {
             ++p;
           }
     
         if (!*p)
           {
             fprintf(stderr,
                 "*WARNING* rpl_fopen called without mode 'b' or 't'\n");
           }
     
     #if USE_FOPEN_BINARY && USE_FOPEN_TEXT
         result = fopen(pathname, mode);
     #else
         {
             char ignore[3]= "bt";
             char *newmode = strdup(mode);
             char *q       = newmode;
     
             p = newmode;
     
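     #  if USE_FOPEN_TEXT
             /* "t" is accepted, so only "b" needs to be removed. */
             strcpy(ignore, "b");
     #  endif
     #  if USE_FOPEN_BINARY
             /* "b" is accepted, so only "t" needs to be removed. */
             strcpy(ignore, "t");
     #  endif
     
             /* Copy characters from mode to newmode missing out
                b and/or t.  Test *p before calling strchr, which
                would otherwise match the terminating NUL and run
                past the end of the string. */
             while (*p)
               {
                 if (!strchr(ignore, *p))
                   {
                     *q++ = *p;
                   }
                 ++p;
               }
             *q = '\0';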
     
             result = fopen(pathname, newmode);
     
             free(newmode);
         }
     #endif /* USE_FOPEN_BINARY && USE_FOPEN_TEXT */
     
         return result;
     }
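The character filtering at the heart of rpl_fopen can be exercised on its own, away from any file handling. The following is a small standalone sketch of that step; the name strip_mode_chars and its three-argument interface are hypothetical, and the caller supplies the output buffer so that no allocation is involved.

```c
#include <assert.h>
#include <string.h>

/* Copy `mode' into `out', leaving out every character that appears
   in `ignore'.  `out' must have room for strlen (mode) + 1 bytes.
   Returns `out'.  */
char *
strip_mode_chars (const char *mode, const char *ignore, char *out)
{
  const char *p;
  char *q = out;

  assert (mode != NULL && ignore != NULL && out != NULL);

  /* The loop stops at the terminating NUL before strchr ever sees
     it, so the terminator is never treated as an ignored character.  */
  for (p = mode; *p != '\0'; ++p)
    {
      if (strchr (ignore, *p) == NULL)
        *q++ = *p;
    }
  *q = '\0';

  return out;
}
```

With ignore set to "bt" a mode of "wb" becomes "w", while with ignore set to "t" a mode of "ab+" passes through unchanged.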
     
      

The correct operation of the file above relies on several things having been checked by the configure script, so you will also need to ensure that the following macros are present in your configure.in before you use this code:
     # configure.in -- Process this file with autoconf to produce configure
     AC_INIT(rpl_fopen.c)
     
     AC_PROG_CC
     AC_HEADER_STDC
     AC_CHECK_HEADERS(string.h strings.h, break)
     AC_CHECK_HEADERS(stdlib.h stddef.h sys/types.h memory.h)
     
     AC_C_CONST
     AC_TYPE_SIZE_T
     
     AC_CHECK_FUNCS(strchr index strcpy strdup)
     AB_AC_FUNC_FOPEN_BINARY
     AB_AC_FUNC_FOPEN_TEXT
     
      

File System Limitations

We discussed some differences between Unix and Windows file systems in the section called File system Issues in the chapter called Writing Portable C with GNU Autotools. This section expands on that discussion, covering filename differences and separator and drive letter distinctions.

8.3 Filenames

As discussed earlier, DOS file systems have severe restrictions on possible file names: they must follow an 8.3 format. See the section called DOS Filename Restrictions in the chapter called Writing Portable C with GNU Autotools.

This is quite a severe limitation, and affects some of the inner workings of GNU Autotools in two ways. The first is handled automatically, in that if .libs isn't a legal directory name on the host system, Libtool and Automake will use the directory _libs instead. The other is that the traditional config.h.in file is not legal under this scheme, and it must be worked around with a little known feature of Autoconf:
     AC_CONFIG_HEADER(config.h:config.hin)
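To see why config.h.in fails where config.hin passes, it can help to state the shape of an 8.3 name as code. The function below is a rough sketch under simplifying assumptions: the name fits_8_3 is hypothetical, only the lengths and the number of dots are checked, and the restrictions DOS places on which characters may appear are ignored.

```c
#include <assert.h>
#include <string.h>

/* Return 1 if `name' has the 8.3 shape: a base of 1 to 8 characters,
   optionally followed by one dot and an extension of 1 to 3
   characters.  Return 0 otherwise.  Character set rules are not
   checked.  */
int
fits_8_3 (const char *name)
{
  const char *dot;
  size_t base_len, ext_len;

  assert (name != NULL);

  dot = strchr (name, '.');
  if (dot == NULL)
    {
      base_len = strlen (name);
      return base_len >= 1 && base_len <= 8;
    }

  /* A second dot can never appear in an 8.3 name.  */
  if (strchr (dot + 1, '.') != NULL)
    return 0;

  base_len = (size_t) (dot - name);
  ext_len = strlen (dot + 1);
  return base_len >= 1 && base_len <= 8 && ext_len >= 1 && ext_len <= 3;
}
```

Under these rules config.h.in is rejected for its second dot and .libs for its empty base, while config.hin and _libs are both accepted.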
      

Separators and Drive Letters

As discussed earlier (see the section called Windows Separators and Drive Letters in the chapter called Writing Portable C with GNU Autotools), the Windows file systems use different delimiters for separating directories and path elements than their Unix cousins. There are three places where this has an effect:

the shell command line

Up until Cygwin b20.1, it was possible to refer to drive letter prefixed paths from the shell using the //c/path/to/file syntax to refer to the directory root at C:\path\to\file. Unfortunately, the Windows kernel confused this with its own network share notation, causing the shell to pause for a short while to look for a machine named c in its network neighbourhood. Since release 1.0 of Cygwin, the //c/path/to/file notation now really does refer to a machine named c from Cygwin as well as from Windows. To refer to drive letter rooted paths on the local machine from Cygwin there is a new hybrid c:/path/to/file notation. This notation also works in Cygwin b20, and is probably the system you should use.

On the other hand, using the new hybrid notation in shell scripts means that they won't run on old Cygwin releases. Shell code embedded in configure.in scripts should test whether the hybrid notation works, and use an alternate macro to translate hybrid notation to the old style if necessary.

I must confess that from the command line I now use the longer /cygdrive/c/path/to/file notation, since TAB completion doesn't yet work for the newer hybrid notation. It is important to use the new notation in shell scripts however, or they will fail on the latest releases of Cygwin.

shell scripts

For a shell script to work correctly on non-Cygwin development environments, it needs to be aware of and handle Windows path and directory separators and drive letters. The Libtool scripts use the following idiom:
          case "$path" in
          # Accept absolute paths.
          [\\/]* | [A-Za-z]:[\\/]*)
            # take care of absolute paths
            : insert some code here
            ;;
          *)
            # what is left must be a relative path
            : insert some code here
            ;;
          esac

source code

When porting Unix software to Cygwin, this is much less of an issue because these differences are hidden beneath the emulation layer, and by the mount command respectively; although I have found that GCC, for example, returns a mixed mode / and \ delimited include path which upsets Automake's dependency tracking on occasion.

Cygwin provides convenience functions to convert back and forth between the different notations, which we call POSIX paths or path lists, and WIN32 paths or path lists:

- Function: int posix_path_list_p (const char *path)

Return 0, unless path is a / and : separated path list. The determination is rather simplistic, in that a string which contains a ; or begins with a single letter followed by a : causes the 0 return.

- Function: void cygwin_win32_to_posix_path_list (const char *win32, char *posix)

Converts the \ and ; delimiters in win32, into the equivalent / and : delimiters while copying into the buffer at address posix. This buffer must be preallocated before calling the function.

- Function: void cygwin_conv_to_posix_path (const char *path, char *posix_path)

If path is a \ delimited path, the equivalent, / delimited path is written to the buffer at address posix_path. This buffer must be preallocated before calling the function.

- Function: void cygwin_conv_to_full_posix_path (const char *path, char *posix_path)

If path is a, possibly relative, \ delimited path, the equivalent, absolute, / delimited path is written to the buffer at address posix_path. This buffer must be preallocated before calling the function.

- Function: void cygwin_posix_to_win32_path_list (const char *posix, char *win32)

Converts the / and : delimiters in posix, into the equivalent \ and ; delimiters while copying into the buffer at address win32. This buffer must be preallocated before calling the function.

- Function: void cygwin_conv_to_win32_path (const char *path, char *win32_path)

If path is a / delimited path, the equivalent, \ delimited path is written to the buffer at address win32_path. This buffer must be preallocated before calling the function.

- Function: void cygwin_conv_to_full_win32_path (const char *path, char *win32_path)

If path is a, possibly relative, / delimited path, the equivalent, absolute, \ delimited path is written to the buffer at address win32_path. This buffer must be preallocated before calling the function.

You can use these functions something like this:
     void
     display_canonical_path(const char *maybe_relative_or_win32)
     {
         char buffer[MAX_PATH];
         cygwin_conv_to_full_posix_path(maybe_relative_or_win32,
                                        buffer);
         printf("canonical path for %s:  %s\n",
                maybe_relative_or_win32, buffer);
     }

For your code to be fully portable however, you cannot rely on these Cygwin functions as they are not implemented on Unix, or even mingw or DJGPP. Instead you should add the following to a shared header, and be careful to use it when processing and building paths and path lists:
     #if defined __CYGWIN32__ && !defined __CYGWIN__
        /* For backwards compatibility with Cygwin b19 and
           earlier, we define __CYGWIN__ here, so that
           we can rely on checking just for that macro. */
     #  define __CYGWIN__  __CYGWIN32__
     #endif     
     #if defined _WIN32 && !defined __CYGWIN__
        /* Use Windows separators on all _WIN32 defining
           environments, except Cygwin. */
     #  define DIR_SEPARATOR_CHAR		'\\'
     #  define DIR_SEPARATOR_STR		"\\"
     #  define PATH_SEPARATOR_CHAR		';'
     #  define PATH_SEPARATOR_STR		";"
     #endif
     #ifndef DIR_SEPARATOR_CHAR
        /* Assume that not having this is an indicator that all
           are missing. */
     #  define DIR_SEPARATOR_CHAR		'/'
     #  define DIR_SEPARATOR_STR		"/"
     #  define PATH_SEPARATOR_CHAR		':'
     #  define PATH_SEPARATOR_STR		":"
     #endif /* !DIR_SEPARATOR_CHAR */

With this in place we can use the macros defined above to write code which will compile and work just about anywhere:
     char path[MAXBUFLEN];
     snprintf(path, MAXBUFLEN, "%ctmp%c%s",
              DIR_SEPARATOR_CHAR, DIR_SEPARATOR_CHAR, foo);
     file = fopen(path, "w+t");
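The path list separator can be used in the same spirit. As a minimal sketch, the following function counts the elements of a path list such as the value of PATH; the name count_path_elements is hypothetical, and a reduced copy of the separator definition is repeated only so that the fragment compiles on its own.

```c
#include <assert.h>
#include <string.h>

/* Reduced copy of the shared header's definition, repeated here so
   that this fragment is self-contained.  */
#ifndef PATH_SEPARATOR_CHAR
#  if defined _WIN32 && !defined __CYGWIN__
#    define PATH_SEPARATOR_CHAR ';'
#  else
#    define PATH_SEPARATOR_CHAR ':'
#  endif
#endif

/* Return the number of elements in `path_list', where elements are
   delimited by PATH_SEPARATOR_CHAR.  An empty string has no elements;
   an empty element between two adjacent separators is counted.  */
size_t
count_path_elements (const char *path_list)
{
  const char *p;
  size_t count;

  assert (path_list != NULL);

  if (*path_list == '\0')
    return 0;

  /* One element, plus one more for every separator found.  */
  count = 1;
  for (p = strchr (path_list, PATH_SEPARATOR_CHAR);
       p != NULL;
       p = strchr (p + 1, PATH_SEPARATOR_CHAR))
    ++count;

  return count;
}
```

Because the separator comes from the macro rather than a literal ':', the same source should count a Windows style list correctly when built for a non-Cygwin Windows environment.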

Executable Filename Extensions

As I already noted in the section called Package Installation, the fact that Windows requires all program files to be named with the extension .exe is the cause of several inconsistencies in package behaviour between Windows and Unix.

For example, where Libtool is involved, if a package builds an executable which is linked against an as yet uninstalled library, libtool puts the real executable in the .libs (or _libs) subdirectory, and writes a shell script to the original destination of the executable [2] , which ensures the runtime library search paths are adjusted to find the correct (uninstalled) libraries that it depends upon. On Windows, only a PE-COFF executable is allowed to bear the .exe extension, so the wrapper script has to be named differently to the executable it is substituted for (i.e. the script is only executed correctly by the operating system if it does not have an .exe extension). The result of this confusion is that the Makefile can't see some of the executables it builds with Libtool because the generated rules assume an .exe extension will be in evidence. This problem will be addressed in some future revision of Automake and Libtool. In the mean time, it is sometimes necessary to move the executables from the .libs directory to their install destination by hand. The continual rebuilding of wrapped executables at each invocation of make is another symptom of using wrapper scripts with a different name to the executable which they represent.

It is very important to correctly add the .exe extension to program file names in your Makefile.am, otherwise many of the generated rules will not work correctly while they await a file without the .exe extension. Fortunately, Automake will do this for you wherever it is able to tell that a file is a program - everything listed in bin_PROGRAMS for example. Occasionally you will find cases where there is no way for Automake to be sure of this, in which case you must be sure to add the $(EXEEXT) suffix. By structuring your Makefile.am carefully, this can be avoided in the majority of cases:
     TESTS = $(check_SCRIPTS) script-test bin1-test$(EXEEXT)
      

could be rewritten as:
     check_PROGRAMS = bin1-test
     TESTS = $(check_SCRIPTS) script-test $(check_PROGRAMS)
      

The value of EXEEXT is always set correctly with respect to the host machine if you use Libtool in your project. If you don't use Libtool, you must manually call the Libtool macro, AC_EXEEXT, in your configure.in to make sure that it is initialised correctly. If you don't call this macro (either directly or implicitly with AC_PROG_LIBTOOL), your project will almost certainly not build correctly on Cygwin.

Notes

[1]

Typically you would also have a floppy drive named A:, and a CD-ROM named D:.

[2]

See the section called Executing Uninstalled Binaries.