Fog Creek Software
Discussion Board




Cross UNIX Portability - Facts and Myths

The purpose of this post is to lay out the various issues that pertain
to writing cross-platform UNIX applications. It is not meant to tell
you exactly how to avoid these problems. That is where books come in, like
the excellent "Porting UNIX Software" by Greg Lehey
( http://www.oreilly.com/catalog/port/ ) or "Advanced Programming in the
UNIX Environment" by W. Richard Stevens (which I haven't read yet but have
heard only positive things about). However, it is meant to be exhaustive.

1. Cross-Architecture Programmers need to be aware of several things

Some architectures are little-endian (like Intel x86), others are
big-endian (like PowerPC or SPARC). Some are 32-bit, others are 64-bit.
Padding of struct fields varies between architectures as well. A clueful
developer can easily avoid running into these things, but he needs to be
aware of them.
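
As a rough illustration of the first two points, here is a minimal C sketch that probes byte order and word size at run time. The function names are made up for this example and do not come from any library.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>

/* Returns 1 on a little-endian host and 0 otherwise, by looking at
 * the first byte in memory of an integer whose value is 1. */
int host_is_little_endian(void)
{
    unsigned int probe = 1;
    unsigned char first_byte;

    assert(sizeof probe > 1);   /* the probe needs a multi-byte int */
    memcpy(&first_byte, &probe, 1);
    return first_byte == 1;
}

/* Width of a long in bits: typically 32 on 32-bit UNIXes and
 * 64 on 64-bit (LP64) ones. */
size_t long_width_bits(void)
{
    return sizeof(long) * CHAR_BIT;
}
```

A probe like this is mostly useful in diagnostics and tests. File and network formats are better served by fixing the byte order explicitly than by branching on the host's.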

One incident that I remember involved some Windows code where the content of
a struct was written straight to the disk, thus defining the file format
we used and distributed (i.e., "write(my_file_handle, &mystruct,
sizeof(mystruct));"). This is one thing that an experienced UNIX programmer
won't do. And I was told the code was written by someone who was better at
Object Oriented Programming than anyone else in Tel Aviv.

2. Writing software that compiles everywhere is hard; writing software
that runs perfectly everywhere is trickier.

Getting software to compile on a different UNIX platform is _relatively_ easy.
Tools like GNU Autoconf/Automake/Libtool can help you with it.

Getting software to run equally well everywhere is tricky. I recall a time
when a simple sockets program that my partner and I wrote, and which ran
perfectly on Linux, failed for some reason on Solaris. We did not know
what the problem was for a long time, until we Googled it and found that
our program had a bug that showed up only on Solaris and not on Linux.
We were then easily able to write a workaround that worked equally well on
both platforms.

Of course, had it been a larger program with many more system calls, then
making sure it was portable would have been much harder.

However, if your program is written well and actively tested by your QA
engineers or volunteers, it should run equally well on every UNIX platform.

3. The GNU Autotools - a Two-Edged Sword

Autoconf/Automake/Libtool contain some very good functionality which is a
must for writing portable code. However, they are very, very hard to use as
they assume only the lowest common denominator of a working UNIX system. One
can work with them, but it takes a lot of time and frustration to do so.
Recently, we have seen an inflation of open source "start-up" projects that
aim to replace them with something better. Most of them have not seen
significant use yet. For a partial list, check:

http://vipe.technion.ac.il/~shlomif/software-tools/

4. UNIXes come in all shapes and sizes

UNIXes vary in their quality, compatibility and feature set. It is generally
assumed that Linux and Solaris (and as far as I know the BSDs too) are the most
complete and "just working" of them, and from there it is going steadily
downhill. I once heard an experienced hacker say that "HP-UX is not UNIX
and AIX is even less than that".

Again, the books I recommended may give some perspective on what to expect
in this regard.

5. Abstraction Libraries - a Partial Solution that is better than nothing.

Many abstraction libraries have been developed for UNIX (see
http://vipe.technion.ac.il/~shlomif/abstraction/ ). They wrap the
system functionality in a set of layers above it that are usually
saner, better documented, and documented in one place. Several software
packages use them instead of implementing their own ad-hoc proprietary
abstraction logic, but it's hard to tell whether this trend is growing or
not. It is up to you to evaluate their suitability for your projects.
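
To give a feel for what such a layer looks like from the inside, here is a
minimal sketch in C: one table of function pointers per platform, plus a
capability bitmask that callers can query at run time. It is not modelled on
any particular library; every name in it is invented for illustration, and
only a POSIX backend is shown.

```c
#define _POSIX_C_SOURCE 200809L  /* ask for POSIX declarations under strict -std */
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Optional features that a backend may or may not provide. */
enum file_cap { FILE_CAP_TRUNCATE = 1 << 0 };

/* One table like this exists per platform backend. */
struct file_ops {
    unsigned caps;                                 /* bitmask of file_cap */
    int  (*do_open)(const char *path);             /* handle, or -1 on error */
    long (*do_write)(int handle, const void *buf, size_t len);
    int  (*do_truncate)(int handle, long size);    /* NULL if unsupported */
    int  (*do_close)(int handle);
};

static int posix_open(const char *path)
{
    return open(path, O_RDWR | O_CREAT, 0600);
}

static long posix_write(int handle, const void *buf, size_t len)
{
    return (long)write(handle, buf, len);
}

static int posix_truncate(int handle, long size)
{
    return ftruncate(handle, (off_t)size);
}

static int posix_close(int handle)
{
    return close(handle);
}

/* The POSIX backend; another platform would fill in its own table. */
const struct file_ops posix_file_ops = {
    FILE_CAP_TRUNCATE, posix_open, posix_write, posix_truncate, posix_close
};

/* Runtime inquiry: does this backend support the given feature? */
int file_ops_supports(const struct file_ops *ops, enum file_cap cap)
{
    assert(ops != NULL);
    return (ops->caps & (unsigned)cap) != 0;
}
```

Application code then calls through the table only, and checks
file_ops_supports() before relying on an optional operation.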

6. Portability with Win32 - Impossible but Doable

So far, a great deal of prominent UNIX software (either open source or
commercial) has been ported to Win32. Usually this involved a great deal of
difficulty, as Win32 does not natively support a large number of UNIX
mechanisms that pure UNIX hackers take for granted. Nevertheless, it is
doable. However, not all projects have found it a high priority to do so,
for various reasons.

7. C++ Portability - g++ or bust

C++ compilers vary a great deal in their implementation of the C++ standards
and in which features they support. Mozilla's C++ Portability Guide defines
the common subset of the language that they all accept:

http://www.mozilla.org/hacking/portable-cpp.html

As can easily be seen, this is C++ that is not quite C++. As an anecdote,
a friend of mine (whose skill as a programmer I highly admire) got
burned once when he wrote code with "friend" methods that compiled fine on
Linux, but did not work properly with a certain proprietary compiler that
was needed to compile it for the Solaris target platform. He had to eliminate
the use of friend classes.

The only real solution is to restrict yourself to the only cross-platform,
open source compiler that implements a broad enough subset of the language
to leave you breathing room: g++ of the GNU Compiler Collection. This compiler
is written in ANSI C and offers true compatibility across platforms, with
good code performance. So my suggestion is to restrict your work to g++,
and possibly to relatively new versions of the other compilers as well.

8. Perl/Python/PHP/Ruby/Tcl - The Pot at the end of the Rainbow

Perl and friends are high-level P-code interpreted virtual machines which
transparently abstract all the system's functionality in neat, convenient
packages. Each such language has its own advantages and disadvantages,
and discussing their pros and cons is out of the scope of this document.

The important thing to remember is that with them it is much easier to write
programs that work as well on one modern platform as on any other. As such,
and because of their good support for other useful paradigms, they have seen
wide deployment by the "hackers" crowd, who use them for writing more and
more software. This trend seems to be growing.

Note that sometimes, speed of execution prevents various distributable software
from being written in these languages, as they tend to perform quite poorly
compared to the most optimized C code (or in many cases even to very sloppily
written C code). Nevertheless, they do fill a growing niche quite competently.

Still, one has to stress that even with these languages, portability can

9. Java - nice try, but...

Java was over-hyped as a "write once, run everywhere" solution. As someone
once noted, it was more of a "write once, debug everywhere" one, at least at
its beginning. Now it is much more stable (and less hyped than it had been),
and so seems to be used appropriately.

Java tried to combine the best elements of C++ and those of high-level
languages like Smalltalk. It was a good attempt, but most Perl and Python
hackers felt it hardly went far enough for it to be usable. See for instance
Paul Graham's analysis of it in his "Java's Cover" essay:

http://www.paulgraham.com/javacover.html

Java probably has its niche, but it seems that most of the uses Sun would
have wanted us to believe it fits can be better fulfilled in
Perl, Python and friends, or alternatively in carefully written C or C++
code. Add to that the licensing and availability issues that its vendor imposes
on it, and you'll understand why it isn't very popular.

10. Conclusion

Writing software that will run on most modern UNIX systems out there is
possible, but requires clueful, knowledgeable developers (and users or
QA engineers who will help them test it everywhere it needs to run).
Restricting yourself primarily to a certain number of UNIX flavours
(probably just Linux) will make your life much easier.

On the other hand, writing code that will run on Windows requires testing
on a great many Windows variants after compilation. And when developing
solely for Windows, your code will probably never be able to run anywhere
except there (not including the recent focus on .NET/Mono).

Writing software for UNIX or a particular UNIX flavour offers a more
open-source environment than Windows, reduced cost of tools (albeit perhaps
not reduced TCO - no one seems to know for sure), components that as a
general rule, "just work", and if you're using the right tools and
methodologies, even portability to Win32. UNIX Portability is a fact and
not an ideal, as many open source packages are known to compile and
work on a large number of modern and legacy systems.

Shlomi Fish
Saturday, January 03, 2004

Just to note that I wrote this comment inspired by what Joel said here:

http://www.joelonsoftware.com/articles/LordPalmerston.html

Namely:

<<<
I think this person was trying to say that in the Linux world they don't write setup programs. Well, I hate to disappoint you, but you have something just as complicated: imake, make, config files, and all that stuff, and when you're done, you still distribute applications with a 20KB INSTALL file full of witty instructions
>>>

First of all, Joel clearly shows that he's a bit out of date on where Linux is (most people now use Autoconf instead of imake). But otherwise, I just wanted to bring some facts into this.

Shlomi Fish
Saturday, January 03, 2004

Yeah, unix portability is a very interesting problem indeed ;-).

THE Best Practice to reduce this to a minimum is to write on more than one unix from the start, and have daily builds and tests run so that you become aware of problems early on. Personally, I think that one Linux flavor, one BSD flavor (especially Net/OpenBSD, they're more "traditional") and a commercial unix or two (Solaris + one of your choice) is the best mix. Just get a cheap used Sun Ultra 30 or so and an SGI Indigo2 and you're set: 32-bit, 64-bit, endian-ness and ass-backward-ness are all covered...

Saruman
Saturday, January 03, 2004

Excellent article, Shlomi!

John
Saturday, January 03, 2004

> And I was told the code was written by someone who was better at Object Oriented Programming than anyone else in Tel Aviv

I live in Tel Aviv and I don't recall participating in a poll to determine my relative rank as an OOP developer. Moreover, being good or bad in OOP does not necessarily improve your ability to write portable code. All I can say is that:

write(my_file_handle, &mystruct, sizeof(mystruct));

is not good OOP. I would expect the "best Object Oriented Programmer" to write something like:

my_file << myclass;

Anyway, at the company I work for we managed to take a non-trivial win32 app, written in C++, and port it to Linux (tested on RedHat and Suse), Solaris, SCO UnixWare and OpenServer, HP-UX, AIX and OpenVMS. Easy no, doable yes.

Dan Shappir
Saturday, January 03, 2004

Just a note: For cross-platform scripting, the Rebol language is very impressive. This very tiny, quick interpreter runs identically on dozens of OSes:

http://www.rebol.com/platforms.shtml
http://www.rebol.com/view-platforms.shtml

Rebol prizes platform independence above all else, which is nice for some tasks. The flip side of this is that it comes at the expense of OS specific system calls.

Likes long walks, short piers
Saturday, January 03, 2004

Shlomi Fish, have you ever worked on a real Java project? It's obvious to me that you haven't, but correct me if I'm wrong. There are a lot of things that Java sucks at, but a couple of things it's great at. The last two Java projects I've worked on wouldn't have been possible in Perl without some major, custom modules (which would probably have to be done in C or C++?), and yes, it could have been done in C++, but it would have taken 3 times as long. I'd also guess that the equivalent of 300k lines of Java in C++ is something like 600k+. As for portability? All the developers have their own development environments on their WINDOWS laptops, and our staging and production environments are beefed-up Linux servers. And guess what? With the exception of one problem the first time we moved over, everything works!

vince
Saturday, January 03, 2004

Another option is a combination of interpreted script with compiled code where speed is really needed. I suspect that for most applications, the need for optimized C is more in a C programmer's imagination.

Tom Hathaway
Saturday, January 03, 2004

Dan Shappir: I realize that mastery of Object Oriented Programming and knowing how to avoid the write-a-struct stuff are orthogonal. But I mentioned it to show that the guy who wrote it was far from clueless, and still made this (possibly common) mistake.

Actually it was much worse, as he decided to use a padding of 1 byte in the internal DLL (to save space on the disk?) and a padding of 4 bytes in the application. This forced me to write an ugly conversion program.

In any case, the safest UNIXish way to write a binary data file is to serialize each relevant member of the struct individually. You can do it in C++ by declaring a stream operator (as you've demonstrated), and in C by declaring a couple of nifty functions. Not too hard, but it still requires a clueful developer.
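
The "couple of nifty functions" in C could look roughly like the sketch below. The record layout, the names, and the choice of big-endian order on disk are all assumptions made for the example; they are not the file format from the incident.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record, for illustration only. */
struct record {
    uint8_t  kind;
    uint32_t length;
};

/* On-disk size: 1 byte for kind + 4 bytes for length, no padding. */
#define RECORD_DISK_SIZE 5

/* Writes rec into buf one member at a time; length goes out
 * most significant byte first, whatever the host byte order is. */
void record_pack(const struct record *rec, unsigned char buf[RECORD_DISK_SIZE])
{
    assert(rec != NULL && buf != NULL);
    buf[0] = rec->kind;
    buf[1] = (unsigned char)(rec->length >> 24);
    buf[2] = (unsigned char)(rec->length >> 16);
    buf[3] = (unsigned char)(rec->length >> 8);
    buf[4] = (unsigned char)(rec->length);
}

/* Rebuilds rec from the bytes produced by record_pack(). */
void record_unpack(struct record *rec, const unsigned char buf[RECORD_DISK_SIZE])
{
    assert(rec != NULL && buf != NULL);
    rec->kind = buf[0];
    rec->length = ((uint32_t)buf[1] << 24) | ((uint32_t)buf[2] << 16)
                | ((uint32_t)buf[3] << 8)  |  (uint32_t)buf[4];
}
```

The file then contains exactly RECORD_DISK_SIZE bytes per record, independent of the compiler's padding and of the machine's endianness.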

Shlomi Fish
Saturday, January 03, 2004

vince: good for you and your Java project. In any case, I still think Java is used (mainly in the industry, but not only) for many tasks that are otherwise very easily done in Perl and friends. I worked on a Java project a while back:

MikMod for Java - http://t2.technion.ac.il/~shlomif/jmikmod/

It was written for JDK 1.0.x, but can still be easily ported to JDK 1.4.x. My impressions of the language were not entirely positive (but, OTOH, not entirely negative either). Back then, I found it vastly incompatible with C or C++, it was very slow after I eliminated all the pointer arithmetic, and it's quite verbose (even though I did not feel it too much). The fact that all structs are references causes it to have a huge memory overhead sometimes. Its proprietary nature is also a huge drawback.

I'm not saying it does not have its uses, just that it doesn't go as far as Perl and friends do. For that matter, I also found C++ lacking in some respects and used ANSI C for one of my projects:

http://vipe.technion.ac.il/~shlomif/lecture/Freecell-Solver/slides/why_not_cpp.html

Finally, when I see a Java project on Freshmeat, I almost always avoid trying it out. Those things are usually a dependency hell, difficult to debug, and cause too many problems.

Shlomi Fish
Saturday, January 03, 2004

Tom Hathaway: most applications need not be very fast, and can easily be written in Perl. Some applications, however, seriously do.

When working with Perl, you need to understand that some operations are fast (like regexp search and replace or built-in functions), while stupid loops, conditionals, etc. have a lot of overhead. It's kind of like working in Matlab, where you have to think in matrix operations and cannot simply use loops all over the place (or it will be dog slow).

Shlomi Fish
Saturday, January 03, 2004

As a Windows programmer I have always been frustrated by Windows mechanisms not available in unix. I hadn't really considered that it goes the other way also. What are these unix mechanisms you mention?

Curious Windows Kernel Programmer
Saturday, January 03, 2004

Curious Windows Kernel Programmer: well, here are a few things off the top of my head:

1. A fork()-equivalent system call is not available in Win32. What fork() does is clone the current process into a parent and a child that differ only in the function return value.

fork is very useful for creating Internet servers and for security, and the entire UNIX multi-processing is based on fork() and execve().

2. All major GUI libraries for UNIX have better geometry managers than plain X,Y, like tables or packing (and naturally still support X,Y). The Win32 API supports only X,Y, and creating windows that can be easily resized usually requires implementing this logic yourself. And different fonts can make your dialog unusable.

3. I personally find the Win32 API to be over-engineered and much less usable than the UNIX API. Why do you need to pass 7 arguments to the core function that opens a file? (In UNIX it's 3 for open(), or 2 for fopen().)

4. There are far fewer signals available on Windows than on UNIX (the useful SIGUSR1 and SIGUSR2 signals, for example, are missing).

5. Some user-id/process-id functions have no Windows equivalents.

6. Windows deviated from the "everything is a file" philosophy of UNIX. As a result, working with drivers and coding such is much harder than on UNIX. On UNIX, a device driver is very straightforward to write, and you can later use open(), read(), write(), ioctl(), etc. to manipulate it (possibly even from Perl). In Windows, it is much more difficult.

There may be more.
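
To make item 1 concrete, here is a minimal sketch of fork() used on its own, without exec: the child runs a function in its own copy of the process and reports back through its exit status. sample_job() and run_in_child() are hypothetical names for this example.

```c
#define _POSIX_C_SOURCE 200809L  /* ask for POSIX declarations under strict -std */
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical job, standing in for "handle one request". */
int sample_job(void)
{
    return 7;
}

/* Runs job() in a forked child and returns the child's exit status,
 * or -1 if fork()/waitpid() failed or the child died abnormally. */
int run_in_child(int (*job)(void))
{
    pid_t pid;
    int status;

    assert(job != NULL);
    pid = fork();
    if (pid < 0)
        return -1;              /* fork failed, no child was created */
    if (pid == 0)
        _exit(job() & 0xff);    /* child: fork() returned 0 */

    /* parent: fork() returned the child's process id */
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Whatever the child does to its memory stays in the child, which is what makes this pattern attractive for servers and for isolating risky work.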

Shlomi Fish
Sunday, January 04, 2004

I'm not anti-Perl, I think it has some great uses, but it also has its limitations. I've seen them occur in complex, multi-server environments. It's true that a lot of people use Java when it would be way more appropriate to use Perl, or something else. Not always though. I'm not sure what you meant about "eliminating all pointer arithmetics".

How does all "structs" (I think you mean objects, btw) being references cause memory problems? The fact that they are references means it's easy to cache objects in a single JVM. Not only that, since when does memory matter? If the choices are to have a team of programmers spend two weeks tuning for performance, or to buy a couple of gigs more RAM, I'm sure the latter comes out cheaper.
Now, maybe we're talking about different types of projects. I think Perl is going to be superior for any single-server application. If you're going to have multiple servers, possibly interacting with more than one database, I would definitely go with Java.

I also disagree with your assessment of Java being "proprietary". While it *was* developed by Sun, it's currently supported by a number of companies, and in addition, almost EVERY major software company has an application server (Macromedia, Novell, Oracle, IBM, Apple, BEA, etc.). Not only that, there's a huge open source following. And when Sun's shrinking empire finally comes crashing down, I'm sure IBM will pick up Java. ;-)

MikMod for Java - http://t2.technion.ac.il/~shlomif/jmikmod/

vince
Sunday, January 04, 2004

"1. A fork()-equivalent system call is not available in Win32. What fork() does is clone the current process into a parent
and a child that differ only in the function return value.

fork is very useful for creating Internet servers and for security, and the entire UNIX multi-processing is based on fork() and execve()."

As a general question it is my understanding that forking is more expensive on Windows, but threads are better on Windows than Unix.  Is that true for the most part?

Mike
Sunday, January 04, 2004

Mike: a fork() system call is non-existent in the Win32 sub-system (as targeted by Visual C++, Mingw32 and friends). What cygwin does to emulate it is create a new process and pass the memory contents to it through a pipe. As such it is extremely costly and inefficient. Just try to run a configure script (which uses fork() extensively) on cygwin and see how dog slow it is in comparison to Linux or FreeBSD.

As for threads: I don't know what you mean by "better" on Windows. The API on Windows is a bit better, from what I was told. The semantics are rumoured to be vastly different.

Note that fork() and threads do not replace each other; rather, each has its own place. As such, the unwillingness of the Win32 designers to supply programmers with a good ForkProcess() system call made it much less usable as a programming system.

I should also add that Win32 does not have symbolic links. One notable place where this shows is in keeping various versions of the same shared library. In UNIX it's a no-brainer (using a tree of symbolic links), but in Win32 it is very hard.
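
A minimal sketch of the symbolic link approach, assuming a POSIX system. The helper name and the library file names mentioned below are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L  /* ask for POSIX declarations under strict -std */
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Makes link_path a symbolic link to target, replacing an older link
 * if one is there. Returns 0 on success, -1 on error (errno is set). */
int repoint_symlink(const char *target, const char *link_path)
{
    assert(target != NULL && link_path != NULL);
    if (unlink(link_path) < 0 && errno != ENOENT)
        return -1;              /* something is there and we cannot remove it */
    return symlink(target, link_path);
}
```

For example, repoint_symlink("libfoo.so.1.1", "libfoo.so.1") would switch everything that loads libfoo.so.1 over to the newer build while the older file stays on disk. Note that unlink() followed by symlink() is not atomic; a real tool would more likely create the link under a temporary name and rename() it into place.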

My friend conjectured that Microsoft did that on purpose so vendors of UNIX software will have a harder time porting their software to Win32, which will give Microsoft an edge.

Shlomi Fish
Sunday, January 04, 2004

> I should also add that Win32 does not have symbolic links.

There's an API (for NTFS) called CreateHardLink

Christopher Wells
Sunday, January 04, 2004

vince: re "multi-server environments" - have you looked at Perl's POE or Python's Twisted? Maybe they can help.

"Eliminating all pointer arithmetics": the MikMod code used a lot of pointer arithmetic for optimization. Something like this:

*(int_ptr++) = (int)*(short_ptr++);

And quite a lot of similar things. I converted MikMod from ANSI C to Java through several stages of C++. When I eliminated the pointer arithmetic and used plain arrays (or arrays+indices), I got a slowdown of about a factor of 6. My program used to consume 5% of the CPU on my Pentium 166 MHz computer, and afterwards it took 30% (even in C++). It is possible that modern Java implementations can optimize this away, and it's probable that the faster computers that are common today would be able to handle this load without a squintch.
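
To show the two styles side by side, here is a small sketch in C. It is not MikMod code; the function names are invented, and both functions do the same short-to-int widening copy as the line quoted above.

```c
#include <assert.h>
#include <stddef.h>

/* Pointer-arithmetic version, in the style of the line quoted above. */
void widen_with_pointers(int *dst, const short *src, size_t count)
{
    assert(dst != NULL && src != NULL);
    while (count-- > 0)
        *(dst++) = (int)*(src++);
}

/* Array-and-index version, the form that also maps onto Java. */
void widen_with_indices(int dst[], const short src[], size_t count)
{
    size_t i;

    assert(dst != NULL && src != NULL);
    for (i = 0; i < count; i++)
        dst[i] = (int)src[i];
}
```

Both produce the same result; any speed difference between them depends on the compiler and the machine, so it is worth measuring rather than assuming.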

As for structs, I suppose it makes sense to designate them all as references (that is how it's done in Perl and friends as well). However, from my experience, reducing the memory consumption can actually increase speed considerably, due to fewer cache misses, less indirect access, etc. These are micro-optimizations which are usually not relevant to the uses Java is considered for.

My point here is that Java may appear to be C++-like, but is a radically different language.

As for its proprietary nature: Sun did not release the JRE and the JDK under an open source license, and the license (the SCSL) does not allow much of the collaboration that open source technologies offer. I meant "proprietary" in the Stallmanic/"Free Software Foundation" sense.

So, while there are many open source projects around Java, the core JDK and several core technologies are not open source, and as such are not distributed with many Linux distributions and have to be installed explicitly (in a much less convenient way than Perl's CPAN modules).

Regarding Sun's Future: I would not start mourning it right now. Sun recently won a very big contract to install a great deal of its Suse Linux+GNOME+Java desktops in China, and it's highly possible that its temporary losses in the past year or two will be reversed soon. One of the options I see happening is that when and if Linux becomes dominant, then computer vendors, other than Intel, will market their computers for the home and office market offering true source compatibility with x86 computers. With proper marketing and good prices, Sun, SGI, etc. can gain a large percentage of the market now owned by Intel and other vendors. (in a way Apple already does).

Shlomi Fish
Sunday, January 04, 2004

Shlomi: Actually it was much worse, as he decided to use a padding of 1 byte in the internal DLL (to save space on the disk ?) and use 4 bytes padding by the application.

This, the endian issue you also mentioned, and potentially different methods of representing data types, are what usually lead me to prefer textual formats over binary formats. Plus, it is always helpful to be able to view the data in a text editor. To a great extent, this is XML's reason for being. The same is true for communication protocols BTW.

Sure, text-based formats are slower but:
1. If you work correctly, they are often fast enough (e.g. use SAX instead of DOM)
2. In many cases you don't care (e.g. server start-time goes up by 10 seconds).
3. When you do care, you can compensate by caching data in memory.
4. Handling versions, errors, updates is a lot easier.

For example, we needed a routine to read data from very large files in the .ini format (format used for historical reasons). I rewrote the code in portable C++ (using STL). Time to read, parse, and construct a hash table from a 12MB file, with > 10000 sections:

1.2 seconds

Write file after any modifications, while preserving comments:

1.5 seconds

Ok, ran these tests on my fast laptop (2.4GHz Pentium 4 with 500MB RAM using Windows XP), but even if it ran 10 times slower, still good enough for me.

With regard to scripting: I love scripting, but I would not like to be the one required to maintain a significant server application written in Perl.

Dan Shappir
Sunday, January 04, 2004

"With regard to scripting: I love scripting, but I would not like to be the one required to maintain a significant server application written in Perl"

I agree. But I expect we've all seen some pretty unmaintainable C programs as well. My point was that it doesn't have to be all one or the other; pick the right tool for the job. 

Tom Hathaway
Sunday, January 04, 2004

Shlomi, good article, thanks.

Regarding fork() on Windows. Can you think of a good implementation of fork() for a process with multiple threads? There is also no practical difference between fork+execv and CreateProcess.

Regarding fopen and CreateFile. First of all, fopen is available on Win as well. But the point is that you should compare CreateFile to the system call open() plus one function I do not remember right now (Sunday morning :)) that lets you change the behaviour of a file descriptor.

Regarding signals. That really hurts. Console signals on Win are not even close to Unix signals. In my current project someone ported signals using Win32 events and additional thread with WaitForMultipleObjects. Ugly but it works.

Michael Popov
Sunday, January 04, 2004

the thing about writing raw structures to file got me a bit confused.

Is there a difference between 'portability' codewise and 'interoperability' e.g. datawise between different compiled versions?

If the said files were not intended to be portable, e.g. temp files or local config files or whatnot, then why are raw structures in files a bad thing?

i like i
Sunday, January 04, 2004

Two points:

1.  A good way to deal with OS incompatibilities is to
abstract OS features (typically disk I/O and/or various
threads implementations) in an abstraction
library, which presents a typical interface paradigm, but
which has implementations for each platform.  Good
abstraction libraries have runtime inquiry as to what's
supported by the particular library as well, ie "do you
support ftruncate?"  I have had to do this several times,
and have a FileOpen interface, an open() interface, an
fopen() interface (for a BIOS which is written using the
STDIO I/O calls and doesn't support the Posix calls
open(), etc), and a custom interface written in
assembly language to Flash.

2.  As for binary files, these are tricky to do portably but
not impossible, although you have to impose some sort
of grammar on them and have some way to store
metadata.  Also, it helps greatly if you have a header
(which never changes, at least up to the part where the
header lives) on the file which indicates the version of
your software that created it.  This way, fancier software
can convert the files to the newer format, and less
fancy software can at least gracefully reject them
without a ton of expensive -  and hard to QA - error
checking.
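
A minimal sketch of the versioned-header idea from point 2, in C. The signature bytes, the one-byte version field and all the names are assumptions made for the example.

```c
#include <assert.h>
#include <string.h>

#define HDR_MAGIC   "FBDB"  /* hypothetical 4-byte file signature */
#define HDR_SIZE    5       /* 4 signature bytes + 1 version byte */
#define HDR_NEWEST  2       /* newest format version this reader knows */

/* Fills in the fixed header for a file of the given format version. */
void header_write(unsigned char buf[HDR_SIZE], unsigned char version)
{
    assert(buf != NULL);
    memcpy(buf, HDR_MAGIC, 4);
    buf[4] = version;
}

/* Returns the format version (1..HDR_NEWEST), or -1 if the file is
 * not ours or was written by newer software than this reader. */
int header_check(const unsigned char buf[HDR_SIZE])
{
    assert(buf != NULL);
    if (memcmp(buf, HDR_MAGIC, 4) != 0)
        return -1;
    if (buf[4] < 1 || buf[4] > HDR_NEWEST)
        return -1;
    return buf[4];
}
```

A reader calls header_check() before touching anything else in the file: an older version number can trigger a conversion, and -1 lets it reject the file gracefully.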

foobarista
Sunday, January 04, 2004

During a debate on ArsTechnica over the relative merits of XML, PeterB constantly championed ASN.1 as an alternative to XML. I didn't have time to research it, but the gist I got from his arguments is that you declare a document structure, like a DTD or Schema for XML, and then the ASN.1 libraries handle streaming it out to any particular format you need, be it binary or even XML.

This would seem like a good way to go if you had a need to store binary data in a cross platform way.  Of course, it's not going to be as fast as dumping in-memory structs to disk.  It's the whole "for every task, there is an algorithm that is simple, fast, and WRONG" phenomenon.

Richard P
Monday, January 05, 2004

Michael Popov: a thread-aware fork() is present in Linux and in other thread-enabled UNIXes. I don't know what its policy is. I think the main problem in forking is marking all the memory as copy-on-write, and handling the threads is the easy part, whether you fork just one thread or all of them
(don't know, I'm not a professional kernel programmer).

As for fork+execve == CreateProcess - you are right. However, fork() alone is useful for other things, like handling queries in a separate process, and others. On Win32 you have CreateProcess and execve, and you need some pretty dirty tricks to implement fork() (e.g., what cygwin does). On UNIX, you have fork() and execve(), and implementing CreateProcess is a no-brainer.
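
As a rough sketch of that last claim, a "start this program as a new process" helper on UNIX is only a few lines. The name spawn_and_wait() is made up, and the sketch uses execvp() (a PATH-searching relative of execve()) and waits for the child, which the real CreateProcess does not do.

```c
#define _POSIX_C_SOURCE 200809L  /* ask for POSIX declarations under strict -std */
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Starts argv[0] (looked up in PATH) as a new process, waits for it,
 * and returns its exit status, or -1 on failure. */
int spawn_and_wait(char *const argv[])
{
    pid_t pid;
    int status;

    assert(argv != NULL && argv[0] != NULL);
    pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* child: redirections, setuid() etc. would go here */
        execvp(argv[0], argv);
        _exit(127);             /* only reached if execvp() failed */
    }
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The gap between fork() and execvp() is where a UNIX program adjusts the child's environment, which is the job that CreateProcess handles through its extra arguments.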

As for fopen() and open() on Win32: they both exist in the C Standard Run-time Library. However, as far as I know their use is a bit frowned upon, and in serious programs (and ones where you don't want to depend on Microsoft bugs and idiosyncrasies) you need to use the native Win32 functions, or at least an emulation layer above them (like apr, NSPR, ACE, etc.).

Shlomi Fish
Tuesday, January 06, 2004

i like i: the problem with writing raw data structures to the disk is this:

Imagine you have this structure:

struct hello
{
    char my_char;
    long my_integer;
};

Fair enough? Now suppose you write it to the disk. The first problem you encounter is that the compiler wouldn't put my_integer immediately after my_char, because char takes only 1 byte. So there would be padding in between (usually 3 bytes). Then my_integer can be 4 bytes or 8 bytes long (depending on the platform). Furthermore, my_integer can have its byte components arranged as "UNIX" or as "XINU" (or in PDP-11 as "NUXI", but that's not common anymore). So if you write it directly to disk, and try to read it on a substantially different architecture, it won't be read correctly.
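
You can ask the compiler what it actually did with this struct. The sketch below uses the standard offsetof macro; the two helper function names are invented for the example.

```c
#include <assert.h>
#include <stddef.h>

struct hello
{
    char my_char;
    long my_integer;
};

/* Number of padding bytes the compiler put between my_char and
 * my_integer on this platform. */
size_t hello_padding(void)
{
    assert(offsetof(struct hello, my_char) == 0);
    return offsetof(struct hello, my_integer) - sizeof(char);
}

/* Number of bytes that write(fd, &h, sizeof(h)) would put on disk. */
size_t hello_raw_size(void)
{
    return sizeof(struct hello);
}
```

On a typical 32-bit x86 system these give 3 and 8, and on a typical 64-bit Linux system 7 and 16, so the very same source code produces two incompatible files.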

As for temp files, etc - this could be a valid use. Just don't use this scheme as your file format, or the UNIX police will hunt you and imprison you for life.

Shlomi Fish
Tuesday, January 06, 2004

"As for temp files, etc - this could be a valid use. Just don't use this scheme as your file format, or the UNIX police will hunt you and imprison you for life."

There are more ways to write a binary file than just writing
a struct in this fashion - you can use htonl() to give you
a canonical long, and once you have this, you can do most
things portably - as long as your loader uses ntohl() to
get the int back when it reads it in.  Also, anyone familiar
with flat structures should know about struct padding and
how to avoid it, ie declaring ints at the top and padding
out characters to word-aligned boundaries.
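
A minimal sketch of the htonl()/ntohl() round trip, assuming a POSIX system
with <arpa/inet.h>. The helper names put_u32() and get_u32() are made up for
the example.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Stores value at buf in network (big-endian) byte order. */
void put_u32(unsigned char *buf, uint32_t value)
{
    uint32_t wire;

    assert(buf != NULL);
    wire = htonl(value);
    memcpy(buf, &wire, sizeof wire);  /* memcpy avoids alignment problems */
}

/* Loads a value that was stored by put_u32(), on any host. */
uint32_t get_u32(const unsigned char *buf)
{
    uint32_t wire;

    assert(buf != NULL);
    memcpy(&wire, buf, sizeof wire);
    return ntohl(wire);
}
```

The writer calls put_u32() for every integer it stores and the loader calls
get_u32(), so the bytes in the file are the same no matter which machine
wrote them.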

Float/double is another, more annoying problem, although
far less than it once was since most things use IEEE
floating format nowadays.  If there's a chance that you'll
have floats on a non-IEEE platform, you'll need to either
roll your own interface, ie with a cartesian fixpoint scheme
using two htonl()'ed ints, or just punt and use a text
representation and all the joy contained therein.

Disclaimer: #include <std/genuflect.h>

Binary files are only necessary if high speed and an
efficient persistent representation are required.  I'm
usually using them when writing database engine
storage managers, so I actually need small, efficient
representations with fast mid-file update capability and
a minimum of parsing and formatting when reading or
writing the file.  My requirements are probably
representative of about 0.01% of software development
nowadays - which is exactly where I want to be :)

foobarista
Wednesday, January 07, 2004

fork: as a Windows programmer, working with (porting) unix's fork has always been a pain. Though to be fair, I hazily recall my recent problem was Solaris's two different fork implementations (i.e., does the new process just have the current thread or does it have all the threads? and who closes my ports?). The NT kernel can implement complete fork() semantics, it just doesn't for Win32 (it does for POSIX). The kernel actually creates processes much like "fork", but those details just aren't exposed to Win32. I know that there exist some completely functional workarounds that use the NT APIs, but I don't know who does 'em. As long as you are just doing fork+exec, it's simple enough to use CreateProcess. CreateProcess also supports overrides for the std handles, so you don't need to do that in between your fork and exec, and it also does detach from console (you can do directly in CreateProcess several things you might normally do between fork and exec). The thread impersonation stuff is complicated enough that most folks avoid it. As I understand it, fork predates threading. But now Windows threading is what I'm used to, and everything else is "dumb". You can get the same behaviour, but it's not a simple port.

gui: Yeah, I'm a kernel/console programmer, I avoid GUIs. I wrote some X stuff in college - didn't like it. Have written a bunch of Windows API, MFC, ATL code too, ugh. I write my GUIs in Java or (d)HTML these days.

arguments: Too many arguments is kinda a pain, but you get used to passing NULL for the defaults. CreateWindowEx is a doozy. Course, these extra arguments are a godsend when you want sharing or inherited security or whatever. Itanium doesn't/didn't like CreateWindowEx 'cause it has too many args. I don't remember what the workaround was...

signals: Not a unix developer, so I don't even think in signals, and don't miss 'em. But I love the WaitFor/CompletionPorts/ThreadPools APIs.

user/process: user stuff is much more complicated in Windows. I don't know how many times I have had to use the Lsa APIs (aka NT APIs) for seemingly straightforward stuff. The whole attached/unattached parent/child thingie is weird for me. My first thought when I saw this was: should my child die just 'cause my parent died?

everything a file/drivers: never written unix drivers. Have written quite a few for Windows; you can create and work with a lot of objects as if they were files, but not all of 'em. Most of this access-like-a-file stuff is even mentioned in the DDK. Windows drivers are complicated, especially the ordering stuff in Win2000. I don't know how many times I have taken out a machine with a buggy filter driver attaching in the wrong place. There are some nice tricks you can do by opening the NT equivalent of kmem... (kmem is the raw memory space in unix, right?). Besides, "everything's a file" is a "leaky abstraction", eh?

ini file format: the Windows .ini file APIs are surprisingly thread (and process) safe, and pretty quick.

localisation: a pain on unix and windows.  Console mode localisation a double pain in windows.

Curious Windows Kernel Programmer
Wednesday, January 07, 2004
