From 1fd5e000ace55b323124c7e556a7a864b972a5c4 Mon Sep 17 00:00:00 2001
From: Christopher Faylor
Date: Thu, 17 Feb 2000 19:38:33 +0000
Subject: import winsup-2000-02-17 snapshot

---
 winsup/doc/textbinary.sgml | 181 +++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 181 insertions(+)
 create mode 100644 winsup/doc/textbinary.sgml

(limited to 'winsup/doc/textbinary.sgml')

diff --git a/winsup/doc/textbinary.sgml b/winsup/doc/textbinary.sgml
new file mode 100644
index 000000000..cf6fc1b36
--- /dev/null
+++ b/winsup/doc/textbinary.sgml
@@ -0,0 +1,181 @@
+Text and Binary modes
+
+ The Issue
+
+On a UNIX system, when an application reads from a file it gets
+exactly what's in the file on disk and the converse is true for writing.
+The situation is different in the DOS/Windows world where a file can
+be opened in one of two modes, binary or text. In binary mode the
+system behaves exactly as in UNIX. However, in text mode there are
+major differences:
+
+
+
+On writing in text mode, a NL (\n, ^J) is transformed into the
+sequence CR (\r, ^M) NL.
+
+
+
+On reading in text mode, a CR that is followed by an NL is deleted and
+a ^Z character signals the end of file.
+
+
+
+This can wreak havoc with the seek/fseek calls since the number
+of bytes actually in the file may differ from that seen by the
+application.
+
+The mode can be specified explicitly as explained in the Programming
+section below. In an ideal DOS/Windows world, all programs using lines as
+records (such as bash, make,
+sed ...) would open files (and change the mode of their
+standard input and output) as text. All other programs (such as
+cat, cmp, tr ...)
+would use binary mode. In practice with Cygwin, programs that deal
+explicitly with object files specify binary mode (this is the case for
+od, which is helpful for diagnosing CR problems). Most
+other programs (such as cat, cmp,
+tr) use the default mode.
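To make the two text-mode translations concrete, the following C sketch emulates them on in-memory buffers. It is an illustration only, not Cygwin's implementation; the helper names `text_mode_write` and `text_mode_read` are hypothetical and were chosen for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: emulate a text-mode write by expanding every NL
 * into the pair CR NL. dst must have room for up to 2 * n bytes.
 * Returns the number of bytes that would end up on disk. */
size_t text_mode_write(const char *src, size_t n, char *dst)
{
    size_t out = 0;
    size_t i;

    assert(src != NULL && dst != NULL);
    for (i = 0; i < n; i++) {
        if (src[i] == '\n')
            dst[out++] = '\r';   /* insert the CR in front of the NL */
        dst[out++] = src[i];
    }
    return out;
}

/* Hypothetical helper: emulate a text-mode read by dropping each CR that
 * is immediately followed by NL and stopping at the first ^Z (0x1A).
 * dst must have room for n bytes.
 * Returns the number of bytes the application would see. */
size_t text_mode_read(const char *src, size_t n, char *dst)
{
    size_t out = 0;
    size_t i;

    assert(src != NULL && dst != NULL);
    for (i = 0; i < n; i++) {
        if (src[i] == 0x1A)
            break;               /* ^Z is treated as end of file */
        if (src[i] == '\r' && i + 1 < n && src[i + 1] == '\n')
            continue;            /* drop the CR of a CR NL pair */
        dst[out++] = src[i];
    }
    return out;
}
```

For the six-byte string "ab\ncd\n", `text_mode_write` produces eight bytes, which is why an offset computed by the application need not match the byte position in the file.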
+
+
+
+The default Cygwin behavior
+
+The Cygwin system gives us some flexibility in deciding how files
+are to be opened when the mode is not specified explicitly.
+The rules are evolving; this section gives the design goals.
+
+
+a. If the file appears to reside on a file system that is mounted
+(i.e. if its pathname starts with a directory displayed by
+mount), then the default is specified by the mount
+flag. If the file is a symbolic link, the mode of the target file system
+applies.
+
+
+b. If the file appears to reside on a file system that is not mounted
+(as can happen when the path contains a drive letter), the default is text.
+
+
+
+c. Pipes and non-file devices are opened in binary mode,
+except if the CYGWIN environment variable contains
+nobinmode.
+Warning! In b20.1 of 12/98, a file will be opened
+in binary mode if any of the following conditions hold:
+
+binary mode is specified in the open call
+
+CYGWIN contains binmode
+
+the file resides in a binary-mounted partition
+
+the file is not a disk file
+
+
+
+
+
+
+d. When a Cygwin program is launched by a shell, its standard input,
+output and error are in binary mode if the CYGWIN variable
+contains tty, else in text mode, except if they are piped
+or redirected.
+ When redirecting, the Cygwin shells use rules (a-c). For
+these shells the relevant value of CYGWIN is that at the time
+the shell was launched and not that at the time the program is executed.
+Non-Cygwin shells always pipe and redirect with binary mode. With
+non-Cygwin shells the commands cat filename | program
+and program < filename are not equivalent when
+filename is on a text-mounted partition.
+
+
+
+
+Example
+To illustrate the various rules, we provide scripts to delete CRs
+from files by using the tr program, which can only write
+to standard output.
+The script
+
+#!/bin/sh
+# Remove \r from the file given as argument
+tr -d '\r' < "$1" > "$1".nocr
+
+ will not work on a text-mounted system because the \r will be
+reintroduced on writing. However, scripts such as
+
+#!/bin/sh
+# Remove \r from the file given as argument
+tr -d '\r' < "$1" | gzip | gunzip > "$1".nocr
+
+and the .bat file
+
+REM Remove \r from the file given as argument
+@echo off
+tr -d \r < %1 > %1.nocr
+
+ work fine. In the first case (assuming the pipes are binary)
+we rely on gunzip to set its output to binary mode,
+possibly overriding the mode used by the shell.
+In the second case we rely on the DOS shell to redirect in binary mode.
+
+
+
+Binary or text?
+
+UNIX programs that have been written for maximum portability
+will know the difference between text and binary files and act
+appropriately under Cygwin. For those programs, the text mode default
+is a good choice. Programs included in official Cygnus distributions
+should work well in the default mode.
+
+Text mode makes it much easier to mix files between Cygwin and
+Windows programs, since Windows programs will usually use the CRLF
+format. Unfortunately you may still have some problems with text
+mode. First, some of the utilities included with Cygwin do not yet
+specify binary mode when they should, e.g. cat will
+not work with binary files (input will stop at ^Z, CRs will be
+introduced in the output). Second, you will introduce CRs in text
+files you write, which can cause problems when moving them back to a
+UNIX system.
+
+If you are mounting a remote file system from a UNIX machine,
+or moving files back and forth to a UNIX machine, you may want to
+access the files in binary mode. The text files found there will normally
+be in UNIX NL format, and you would want any files put there by Cygwin
+programs to be stored in a format understood by UNIX.
+Be sure to remove CRs from all Makefiles and
+shell scripts and make sure that you only edit the files with
+DOS/Windows editors that can cope with and preserve NL-terminated lines.
+
+
+Note that you can decide this on a disk-by-disk basis (for
+example, mounting local disks in text mode and network disks in binary
+mode). You can also divide a disk between modes, for example by mounting
+c: in text mode, and c:\home
+in binary mode.
+
+
+
+Programming
+
+In the open() function call, binary mode can be
+specified with the flag O_BINARY and text mode with
+O_TEXT. These symbols are defined in
+fcntl.h.
+
+In the fopen() function call, binary mode can be
+specified by adding a b to the mode string. There is no
+direct way to specify text mode.
+
+The mode of a file can be changed by the call
+setmode(fd,mode) where fd is a file
+descriptor (an integer) and mode is
+O_BINARY or O_TEXT. The function
+returns O_BINARY or O_TEXT depending
+on the mode before the call, and EOF on error.
+
+
+
+
--
cgit v1.2.3
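The calls described in the Programming section can be sketched in C as follows. This is a minimal illustration under stated assumptions, not code from the Cygwin sources: the helper names `write_binary`, `size_binary` and `stdout_to_binary` are hypothetical, the fallback definition of `O_BINARY` as 0 is only there so the sketch also builds on systems without the flag, and the `setmode` part is compiled only on Cygwin, where the declaration is assumed to come from `io.h`.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* On Cygwin, O_BINARY is defined in fcntl.h. Elsewhere we assume a
 * value of 0 so that the flag has no effect (portability assumption). */
#ifndef O_BINARY
#define O_BINARY 0
#endif

/* Hypothetical helper: write n bytes to path using open() in binary
 * mode, so that no NL is expanded to CR NL. Returns 0 on success. */
int write_binary(const char *path, const void *buf, size_t n)
{
    int fd;
    ssize_t written;

    assert(path != NULL && buf != NULL);
    fd = open(path, O_WRONLY | O_CREAT | O_TRUNC | O_BINARY, 0644);
    if (fd < 0)
        return -1;
    written = write(fd, buf, n);
    if (close(fd) != 0)
        return -1;
    return (written == (ssize_t)n) ? 0 : -1;
}

/* Hypothetical helper: count the bytes of a file through fopen() with
 * a "b" in the mode string, so CRs and ^Z are read as ordinary bytes.
 * Returns the byte count, or -1 if the file cannot be opened. */
long size_binary(const char *path)
{
    FILE *fp;
    long count = 0;

    assert(path != NULL);
    fp = fopen(path, "rb");
    if (fp == NULL)
        return -1;
    while (fgetc(fp) != EOF)
        count++;
    fclose(fp);
    return count;
}

#ifdef __CYGWIN__
#include <io.h>
/* Hypothetical helper: switch standard output to binary mode.
 * Returns the previous mode as reported by setmode(). */
int stdout_to_binary(void)
{
    return setmode(STDOUT_FILENO, O_BINARY);
}
#endif
```

A filter that must not alter its data could call `stdout_to_binary()` before writing, rather than relying on the mode chosen by the shell or the mount table.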