Fog Creek Software
Discussion Board




Writing device drivers

Every now and again, I notice people talking about writing device drivers, and specifically about how it differs from writing other software.

I've never looked at what's involved, so I was wondering if somebody could easily explain the difference between writing applications and servers, and writing device drivers.

As far as I understand it, a device driver provides implementations of standard functions like open(), read(), write() and close(). In doing so, it may read or write to various I/O ports linked to the processor, possibly waiting to poll some data in.
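
To make that picture concrete, here is a minimal user-space sketch in plain C of a driver as a table of function pointers that the OS calls through. It is not a real kernel API: `struct fake_dev`, `struct driver_ops`, and the `fake_*` functions are hypothetical names invented for illustration, and an in-memory buffer stands in for the hardware.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical device state: a small buffer standing in for real hardware. */
struct fake_dev {
    char   data[64];
    size_t len;      /* bytes currently stored */
    int    is_open;
};

/* The "standard functions" a driver fills in; callers go through this table. */
struct driver_ops {
    int  (*open)(struct fake_dev *dev);
    long (*read)(struct fake_dev *dev, char *buf, size_t count);
    long (*write)(struct fake_dev *dev, const char *buf, size_t count);
    int  (*close)(struct fake_dev *dev);
};

static int fake_open(struct fake_dev *dev)
{
    if (dev->is_open)
        return -1;                  /* already in use */
    dev->is_open = 1;
    return 0;
}

static long fake_write(struct fake_dev *dev, const char *buf, size_t count)
{
    if (!dev->is_open)
        return -1;
    if (count > sizeof dev->data)
        count = sizeof dev->data;   /* short write: accept only what fits */
    memcpy(dev->data, buf, count);
    dev->len = count;
    return (long)count;
}

static long fake_read(struct fake_dev *dev, char *buf, size_t count)
{
    if (!dev->is_open)
        return -1;
    if (count > dev->len)
        count = dev->len;           /* short read: return only what exists */
    memcpy(buf, dev->data, count);
    return (long)count;
}

static int fake_close(struct fake_dev *dev)
{
    if (!dev->is_open)
        return -1;
    dev->is_open = 0;
    return 0;
}

/* One table per driver; the caller never needs to know which device it is. */
static const struct driver_ops fake_ops = {
    .open  = fake_open,
    .read  = fake_read,
    .write = fake_write,
    .close = fake_close,
};
```

A caller would do something like `fake_ops.open(&dev)` followed by `fake_ops.write(&dev, "abc", 3)`; in a real driver the bodies would touch I/O ports or registers instead of a buffer.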

Where I think some programmers will fall down here is that it's vitally important to do the right thing at the right time, because the various control signals to and from the processor don't hang around waiting for you. Using functions to encapsulate information becomes more of an issue because you might be wasting vital CPU cycles getting the arguments set up on the stack.

Does that sound about right or have I missed the plot entirely?

Better Than Being Unemployed...
Tuesday, December 09, 2003

My device driver years are long behind me (and I only wrote one), but the biggest issue in my head was:

Don't screw up. You screw this up and it brings the whole machine down, since it's running in Ring 0 (or whatever it was called).

Mark Hoffman
Tuesday, December 09, 2003

The thing that's most difficult about developing device drivers is that by definition they're integral to the operating system and therefore create an unstable debugging environment. Most application source level debugging solutions don't work in a device driver situation. Basically, it's usually extremely difficult to understand what's happening when code doesn't work correctly.

The most difficult conceptual aspect of device driver design is that they generally need to be implemented as state machines that are reentered continually. Application developers aren't used to having to operate at this level.
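
A rough sketch of what "a state machine that is reentered continually" can look like, in plain C. Everything here is hypothetical (the states, the events, and `xfer_step` are invented for illustration); the point is only that each call handles exactly one event and returns immediately instead of looping or blocking.

```c
/* Hypothetical transfer states and the events that drive them. */
enum xfer_state { ST_IDLE, ST_BUSY, ST_DONE, ST_ERROR };
enum xfer_event { EV_START, EV_IRQ_DATA, EV_IRQ_ERROR, EV_RESET };

struct xfer {
    enum xfer_state state;
    unsigned        remaining;   /* units still expected from the device */
};

/* Called once per event (request, interrupt, ...). Never waits.
   Returns the state the machine is in after handling the event. */
static enum xfer_state xfer_step(struct xfer *x, enum xfer_event ev, unsigned arg)
{
    if (ev == EV_RESET) {            /* reset is legal from any state */
        x->state = ST_IDLE;
        x->remaining = 0;
        return x->state;
    }

    switch (x->state) {
    case ST_IDLE:
        if (ev == EV_START && arg > 0) {
            x->remaining = arg;
            x->state = ST_BUSY;
        }
        break;                       /* stray interrupts while idle are ignored */
    case ST_BUSY:
        if (ev == EV_IRQ_ERROR) {
            x->state = ST_ERROR;
        } else if (ev == EV_IRQ_DATA) {
            if (--x->remaining == 0)
                x->state = ST_DONE;
        }
        break;                       /* a second EV_START while busy is ignored */
    case ST_DONE:
    case ST_ERROR:
        break;                       /* only EV_RESET leaves these states */
    }
    return x->state;
}
```

An application programmer would write this as a loop that waits for each piece of data; the driver version has to remember where it was between calls, which is the part people aren't used to.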

Drivers that interface to hardware devices need to buffer the application (which simply wants a series of bytes delivered as a result of a call) from hardware which may have very complex low level interfacing requirements that have nothing to do with the desired application level view of the data. 
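
One common way to do that buffering is a ring buffer: the interrupt side pushes bytes as the hardware delivers them, and the read() side pulls out however many the application asked for. The sketch below is plain C with hypothetical names (`struct ring`, `ring_push`, `ring_read`), and it deliberately leaves out the memory barriers or interrupt masking that a real driver would need around the shared indices.

```c
#include <stddef.h>

#define RING_SIZE 8u   /* must be a power of two so the index mask works */

struct ring {
    unsigned char buf[RING_SIZE];
    unsigned      head;   /* total bytes ever pushed (producer side) */
    unsigned      tail;   /* total bytes ever popped (consumer side) */
};

/* Producer: what an interrupt handler would call for each received byte.
   Returns 0 on success, -1 if the buffer is full and the byte was dropped. */
static int ring_push(struct ring *r, unsigned char byte)
{
    if (r->head - r->tail == RING_SIZE)
        return -1;
    r->buf[r->head & (RING_SIZE - 1u)] = byte;
    r->head++;
    return 0;
}

/* Consumer: what read() would call. Copies up to count bytes into out
   and returns how many were actually copied (possibly zero). */
static size_t ring_read(struct ring *r, unsigned char *out, size_t count)
{
    size_t n = 0;
    while (n < count && r->tail != r->head) {
        out[n++] = r->buf[r->tail & (RING_SIZE - 1u)];
        r->tail++;
    }
    return n;
}
```

The application just sees "a series of bytes"; when and in what chunks the hardware produced them stays hidden behind `ring_read`.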

So, a device driver author needs to have that "generalist" background (hardware and software) that's been discussed recently.

A key example of a device family that is difficult to develop for is USB. USB is a software link control protocol that greatly exceeds Ethernet/TCP/IP in complexity.

About issues like rationing usage of the stack to save execution time: I have seen and worked around absolutely horrible low level code that was kludged around concepts like avoiding use of automatic variables, etc., in the interest of "speed". What tends to happen in industry is that low level code developers place so much emphasis on tuning for nonexistent performance issues up front that the design lacks solidity.

Bored Bystander
Tuesday, December 09, 2003

Depending on the device, performance can be a real issue. 

Simple reads and writes to some registers (talking to a serial port for instance) aren't really timing critical.  The fun stuff happens when you get to high speed devices with DMA.

The thing to remember with a driver is that you're exposing functionality.  You have little control over how that functionality will be used.  God only knows what a user-mode application will be doing.  And if your driver chokes, the system goes down.

Depending on the OS, writing a driver isn't that hard.  Windows is one of the tougher nuts to crack, but with a good book and some patience, you can muddle your way through it.  Third parties make toolkits to help you write Windows drivers.  Some of them are really good.

Honestly, the real reason I see a lot of crappy drivers is that they were written by hardware engineers who hacked something together as an afterthought.  Another common one is that the driver development was outsourced, and the hardware company doesn't have the expertise to know if the outsourcer did a good job or not.

That's why Microsoft is pushing WHQL.

Myron A. Semack
Tuesday, December 09, 2003

Performance and reliability are essential, good documentation can be scarce, and they are a bitch to debug.

Mitch & Murray (from downtown)
Tuesday, December 09, 2003

Hardware interfaces are often poorly defined,
buggy, and can fail in ways people often don't
anticipate. This can make it very challenging
when you need infinite performance and infinite
reliability.

son of parnas
Tuesday, December 09, 2003

Each hardware device represents its own challenge.  Sometimes big, sometimes small.  The problem is not in understanding the OS, because relative to new hardware, the OS concepts change slowly.

However, each device is completely different than the last and the learning curve is a pain in the neck - never mind that the hardware mfg's these days rarely supply good docs, even if you pay them for it.

hoser
Tuesday, December 09, 2003

To further the "poor docs" comment, I'd say half the device drivers I've written have been designed and mostly implemented either before there was hardware, or with alpha silicon.  I actually had one of my bug reports show up on an Intel Ethernet chip many many (many many many...) moons ago.

"Wherever your snot freezes, there we are"
  --- Columbia Sportswear ad

Snotnose
Tuesday, December 09, 2003

Oops.  I had one of my bug reports show up as an Intel chip errata once, they didn't implement it in silicon.

Snotnose
Tuesday, December 09, 2003

Writing device drivers is hard.

a) There are limits on what and how much you can do, when, and where. For example, in some contexts (interrupt handlers), you can only access "non paged" memory, and cannot allocate/deallocate anything. This "non paged" memory is a very limited resource, and you have to plan how to use it. You can't use files in most contexts; you can't block waiting for something in many contexts. You can't properly use stuff like C++ exceptions. Almost no common library or framework is available.
b) The system is intolerant. Almost any bug in the device driver will BSOD your system, or worse - it can, e.g., overwrite disk buffers and harm the integrity of your file system.
c) Documentation and examples are harder to find.
d) Development tools, libraries, debuggers, etc. aren't comparable to their user-mode equivalents.
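
To illustrate point (a): since you can't allocate in an interrupt handler, a typical workaround is to set aside a fixed pool of blocks up front and hand them out later without ever calling the allocator. This is only a sketch in plain C under that assumption; `struct pool`, `pool_get`, and `pool_put` are hypothetical names, and a real driver would also have to protect the free list against concurrent access.

```c
#include <stddef.h>

#define POOL_BLOCKS     4
#define POOL_BLOCK_SIZE 32

/* All storage is reserved once, at a time when allocation is still allowed.
   Note: blocks are raw bytes; alignment for arbitrary types is not guaranteed. */
struct pool {
    unsigned char storage[POOL_BLOCKS][POOL_BLOCK_SIZE];
    void         *free_list[POOL_BLOCKS];   /* stack of available blocks */
    int           free_count;
};

/* Run at driver start-up, outside any restricted context. */
static void pool_init(struct pool *p)
{
    for (int i = 0; i < POOL_BLOCKS; i++)
        p->free_list[i] = p->storage[i];
    p->free_count = POOL_BLOCKS;
}

/* Safe to call where allocation is forbidden: constant time, no allocator.
   Returns NULL when the pool is exhausted, which the caller must handle. */
static void *pool_get(struct pool *p)
{
    if (p->free_count == 0)
        return NULL;
    return p->free_list[--p->free_count];
}

/* Return a block previously obtained from pool_get. */
static void pool_put(struct pool *p, void *block)
{
    if (block != NULL && p->free_count < POOL_BLOCKS)
        p->free_list[p->free_count++] = block;
}
```

The "you have to plan how to use it" part shows up as `POOL_BLOCKS`: you pick the number in advance, and running out is a case your code has to survive.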

When writing device drivers, having an easy and reliable way to printf (which will be visible even if the system BSODs) is a blessing, and rarely ever available. If you don't have VMware or UserModeLinux (or something comparable), you'll have to wait for your system to reload each time it crashes (which might be often).
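
When no real printf is available, one fallback is to format messages into a small fixed-size buffer in memory and inspect it afterwards with a debugger or from a crash dump. The following is a user-space sketch in plain C, assuming `vsnprintf` is usable (it often isn't inside a kernel); `struct trace_log`, `trace`, and `trace_recent` are hypothetical names.

```c
#include <stdarg.h>
#include <stdio.h>

#define TRACE_SLOTS 4    /* how many recent messages are kept */
#define TRACE_LEN   48   /* maximum length of one message, including the NUL */

struct trace_log {
    char     slot[TRACE_SLOTS][TRACE_LEN];
    unsigned next;       /* total messages ever logged */
};

/* Format a message into the next slot, overwriting the oldest one.
   No allocation, no file I/O; long messages are truncated. */
static void trace(struct trace_log *t, const char *fmt, ...)
{
    va_list ap;
    va_start(ap, fmt);
    vsnprintf(t->slot[t->next % TRACE_SLOTS], TRACE_LEN, fmt, ap);
    va_end(ap);
    t->next++;
}

/* Return the i-th most recent message (0 = newest), or NULL if it
   was never written or has already been overwritten. */
static const char *trace_recent(const struct trace_log *t, unsigned i)
{
    if (i >= TRACE_SLOTS || i >= t->next)
        return NULL;
    return t->slot[(t->next - 1u - i) % TRACE_SLOTS];
}
```

It only helps if the memory survives the crash or can be dumped, so it is a poor substitute for a real console, but it costs almost nothing to leave in.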

Ori Berger
Tuesday, December 09, 2003

If you're writing a device driver, you should be using a debugger.  In recent months I've fallen in love with SoftICE.  I wouldn't want to develop a driver without it (Linus be damned).

Myron A. Semack
Tuesday, December 09, 2003

Device Drivers need to have no bugs. None. Nada. Zip.
That's the first challenge.

Then they have to be small, fast, and very optimized. Ok, that's more doable.

Assembly language will be required. Not an issue for embedded guys but could be one more problem for some people.

Hardware documentation: some is good and some is bad, but every device driver I have written has been for hardware I designed or helped design, and typically I am also the guy who writes the documentation, since no one else will do it. Maybe mine is a special case. Where your hardware device interacts with the PCI or ISA bus, or does DMA with other hardware, is where you start running into the bugs in other hardware. Getting the desktop phone number of a senior chip designer at Intel is hard because 99.999% of bug reports to them are not really problems with their chip. But 0.001% are, and once you prove that, you are allowed to speak with the designers directly, after you've signed the nondisclosure agreement promising not to tell anyone about the bugs that they quietly fix or provide a workaround for.

Debuggers? I don't know. I've not found debuggers helpful. Maybe newer debuggers are better. What you need is a logic analyzer. Decent ones start at $20,000. Some people use CPU emulators as well, but those aren't available for the high end chips.

Dennis Atkins
Tuesday, December 09, 2003

Dennis,

All software that gets deployed above 1000 installs needs to be bug free - at least with respect to a SEV 1 bug.  I guess that's the point you're making: all device driver bugs are SEV 1.

I tell ya, device drivers build disciplines that every software engineer should be ingrained with.

1. Everything should have a context (scope).  You're operating on this variable within the scope of a device or a file open() context - each of which has an instance.  A bare minimum of globals from which all contexts are built.
2. All locks and their executable context must be documented, understood and their state known - 'provable' is a word Ori Berger has used regarding threads and locks.
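
As a rough illustration of point 1, here is a plain C sketch in which every piece of state lives in either a per-device context or a per-open context, and no function touches a global. The names (`struct device_ctx`, `struct open_ctx`, `ctx_open`, and so on) are hypothetical, and the locking from point 2 is left out to keep it short.

```c
#include <stddef.h>

/* One instance per physical device. */
struct device_ctx {
    unsigned base_port;    /* where this particular device lives */
    unsigned open_count;   /* how many open() contexts refer to it */
};

/* One instance per open(); always points back at its device. */
struct open_ctx {
    struct device_ctx *dev;
    size_t             position;   /* per-open state, not shared */
};

/* Bind a fresh open context to a device. Returns 0 on success, -1 on bad input. */
static int ctx_open(struct device_ctx *dev, struct open_ctx *h)
{
    if (dev == NULL || h == NULL)
        return -1;
    h->dev = dev;
    h->position = 0;
    dev->open_count++;
    return 0;
}

/* Operations take the context explicitly instead of reaching for globals. */
static void ctx_advance(struct open_ctx *h, size_t n)
{
    h->position += n;
}

/* Detach from the device. Returns -1 if the context is not currently open. */
static int ctx_close(struct open_ctx *h)
{
    if (h == NULL || h->dev == NULL)
        return -1;
    h->dev->open_count--;
    h->dev = NULL;
    return 0;
}
```

Two devices, or two opens of the same device, can't step on each other here because nothing is shared unless a context says so.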

That whole missive of ESR's which Joel quoted as to why processes are better than threads - of course they are if the thread has no limitations imposed by the author.  All those stinking globals running amok without any thought to limiting their access.  Disgusting.

I've seen so much crap passed off as code.  Often this comes from hardware manufacturers:
1. Deciding that they'd rather write code than document a device.
2. VxWorks.  VxWorks and its so-called flat memory model (another way of saying there is "no kernel" or security), which allows for globals galore and imposes no restrictions on anything, leads to really awful code.

I could rant for days on this.

hoser
Tuesday, December 09, 2003

I am just in the middle of developing a device driver on Windows without any prior driver experience.  My background is Mac product development.
The first thing I did was read the DDK docs, and then I networked with people who had device driver experience.
I worked out what had to be developed for the device. As the driver architecture is layer based, I found out where my code should fit according to the device specification.
I used samples from the DDK and tried stuff out. It worked!!!
That was 2 months ago, and now I've had a beta release with zero crashes.
But it was a struggle when I began, since I had to understand all the hardware terminology.
I say device driver development is not as hard as they say it is. If you are a good programmer, it's just another job.

Cooler
Wednesday, December 10, 2003

Cooler,

You must be right. Driver development is pretty simple and there are no special problems associated with it, as you say.

Dennis Atkins
Wednesday, December 10, 2003

I used to do some display driver work, both in Windows and OS/2.  It was, back then, a major bitch:

- Buggy hardware, undocumented hardware, and incorrectly-documented hardware.  This was one of the worst aspects.  You'd do something that should work, and get a lockup.  Or it simply wouldn't work.  And you'd go over your code with a fine-tooth comb, look at the manual for the hardware and verify that you were correctly setting up each register and I/O port, and finally talk to the hardware maker and have one of their guys tell you "oh yeah, that's gonna be fixed in the next rev, but for now do this...".

- The hardware is a black box state machine.  Every driver exit must leave the hardware in a "known good" state so the next call to the driver doesn't result in a screwed-up display, BSOD, or hard lockup.

- Debugging was hairy, even with decent tools like SoftICE.  (When I started working in Java, I was amazed at the information I could get.  Stack traces printed to the console... what a treat!)  Especially timing-related bugs and hard lockups.  One common trick was to set a breakpoint and step by step "walk" it forward through the code until your machine hung before hitting the breakpoint.  Then walk it backwards in smaller increments until you hit it.  Then walk it forward again in still smaller increments.

- Performance optimizations were, more often than not, rigged towards artificial benchmarks whose impact (in the consumer world) was far out of proportion to their real value.  Yeah okay, we got line draws that are 18% faster than the next fastest driver... who cares when the screen is repainted in less than a tenth of a second anyway?
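
On the "known good" state point above, one way to enforce it is to funnel every return from a driver entry point through a single cleanup path that puts the hardware back. This is a plain C sketch with a made-up register block; `struct fake_regs`, `MODE_IDLE`, `MODE_DRAW`, and `draw_line` are hypothetical and don't correspond to any real display hardware.

```c
/* Hypothetical register block standing in for a display controller. */
struct fake_regs {
    unsigned mode;
    unsigned clip;
    unsigned busy;
};

#define MODE_IDLE 0u
#define MODE_DRAW 1u

/* Draw a horizontal span from x0 to x1. Whatever happens inside, the
   hardware is back in MODE_IDLE with its original clip setting on exit.
   Returns 0 on success, -1 on invalid arguments. */
static int draw_line(struct fake_regs *hw, int x0, int x1)
{
    int      rc = 0;
    unsigned saved_clip = hw->clip;   /* remember what we are about to change */

    hw->mode = MODE_DRAW;
    hw->clip = 0;
    hw->busy = 1;

    if (x1 < x0) {                    /* error path still goes through cleanup */
        rc = -1;
        goto out;
    }

    /* ... the actual register writes for the line would go here ... */

out:
    hw->busy = 0;
    hw->clip = saved_clip;
    hw->mode = MODE_IDLE;             /* the "known good" state for the next call */
    return rc;
}
```

The early-return bug this avoids is exactly the kind that shows up as a screwed-up display on the next, unrelated call.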

It was a good experience for me, but I'll not willingly go back to driver work.

-Thomas

Thomas
Wednesday, December 10, 2003

Myron: I don't know what kind of driver development you are doing, but if you can employ UserModeLinux you'll be much happier with it than with VMware. You just can't beat running kernel mode in user mode for convenience. [You won't get interrupts and timers, but if you only need to read/write memory addresses and IO ports, UserModeLinux is for you].

Cooler: The DDK documentation is incomplete. Either you were extremely lucky, your driver is extremely simple, or your testing methodology is inadequate. The problematic situations are hard to test - e.g., you have to put the system under considerable load to have a good chance of two interrupt handlers active at the same time. An SMP machine makes these conditions more probable, but it's still hard to rigorously test all the preemption and concurrency interactions.

Ori Berger
Wednesday, December 10, 2003

Ori, I don't understand what you're saying.  I'm not using VMware for development.  I've found actually that it gets in the way of certain things.  UserMode Linux is cute, but not really useful for serious driver development.

The best tool (for Windows) I've found is SoftICE, coupled with Compuware's DriverStudio.

My comment about "Linus be damned" was referring to the fact that Linus doesn't think people should use debuggers.

I've never needed a logic analyzer for driver development.  BIOS development yes, but not driver stuff.  A simple oscilloscope should be all you need.  But even then, I've only had to break one out when the hardware wasn't debugged yet.  Maybe I'm just lucky to have good hardware people here.

My biggest gripes about the DDK are:
- BUILD is a pretty crude way to work.
- To do a proper driver, you need way too much boilerplate code
- There are so many examples that it's a pain in the ass to know which one to start with.

Myron A. Semack
Wednesday, December 10, 2003

Myron,

A logic analyzer is needed when the driver is for interfacing with new silicon and you are looking at timing issues. Real timing issues, not issues with the code running on the cpu, but with the hardware the driver is for.

An analyzer also comes in handy for debugging the behavior of chipsets that are not acting as their specs say they should.

And it's also necessary if your hardware device has a mode to emulate a legacy device and you need to find out how that legacy device is really operating, which is never as its specs say.

Dennis Atkins
Thursday, December 11, 2003
