Assembled from various sources by Marcus Overhagen.
R3 API The old BeOS R3 soundcard driver API. To be used with BeOS R5, it needs to be slightly modified. See below.
R3 API updating Modifying a BeOS R3 driver to be used with BeOS R4 and R5.
R5 (multiaudio) API The new multiaudio API. Not fully supported in BeOS R5, and never finished.
General Driver writing reference
More Information:
The updated R3 API publishes in /dev/audio/old/yourdriver
The multiaudio R5 API publishes in /dev/audio/multi/yourdriver/1
The undocumented R4 API publishes in /dev/audio/mix/yourdriver, /dev/audio/mux/yourdriver, and /dev/audio/raw/yourdriver
The Multiaudio description lists a number of ioctls, but not all of them are usable. Keep this in mind when reading the documentation:
B_MULTI_GET_DESCRIPTION working.
B_MULTI_GET_BUFFERS working.
B_MULTI_SET_BUFFERS not implemented.
B_MULTI_BUFFER_EXCHANGE the playback part works, recording seems to be broken.
B_MULTI_GET_MIX not implemented.
B_MULTI_SET_MIX not implemented.
B_MULTI_LIST_MIX_CHANNELS not implemented.
B_MULTI_LIST_MIX_CONTROLS not implemented.
B_MULTI_LIST_MIX_CONNECTIONS not implemented.
The R5 multiaudio API description & header file
BE ENGINEERING INSIGHTS: Writing A Sound Card Driver
A number of people have expressed interest in writing sound card drivers for BeOS. This article describes the interface used by the current audio server to communicate with sound card drivers. Note that this interface is for the current audio server, and that the current audio server will be replaced with something better in a future release. But if you want to write a sound card driver and test it with the current audio server, this is what you will have to support.
These are the ioctl codes used by the audio server:
#include <Drivers.h>
enum {
    SOUND_GET_PARAMS = B_DEVICE_OP_CODES_END,
    SOUND_SET_PARAMS,
    SOUND_SET_PLAYBACK_COMPLETION_SEM,
    SOUND_SET_CAPTURE_COMPLETION_SEM,
    SOUND_RESERVED_1,   /* unused */
    SOUND_RESERVED_2,   /* unused */
    SOUND_DEBUG_ON,     /* unused */
    SOUND_DEBUG_OFF,    /* unused */
    SOUND_WRITE_BUFFER,
    SOUND_READ_BUFFER,
    SOUND_LOCK_FOR_DMA
};
The SOUND_SET_PLAYBACK_COMPLETION_SEM ioctl takes a (sem_id*) argument which points to a semaphore that must be released once for each buffer written. The semaphore should be released when the data in the buffer is no longer needed.
The SOUND_SET_CAPTURE_COMPLETION_SEM ioctl takes a (sem_id*) argument which points to a semaphore that must be released once for each buffer read. The semaphore should be released when the data in the buffer is valid.
The SOUND_WRITE_BUFFER ioctl takes an (audio_buffer_header*) argument which is defined in MediaDefs.h:
typedef struct audio_buffer_header {
    int32 buffer_number;
    int32 subscriber_count;
    bigtime_t time;
    int32 reserved_1;
    int32 reserved_2;
    int32 reserved_3;
    int32 reserved_4;
} audio_buffer_header;
The audio data immediately follows the audio_buffer_header in memory and is in stereo signed 16-bit linear native-endian format. The size in bytes of the audio data plus the audio_buffer_header is stored in the "reserved_1" slot of the audio_buffer_header (the audio server was written before the size argument to the ioctl call was implemented). The size and address of the audio data can be derived this way:
audio_buffer_header* header = (audio_buffer_header*) ioctl_arg;
int32 bytes_of_data = header->reserved_1 - sizeof(*header);
int16* addr_of_data = (int16*) (header + 1);
The driver can ignore the "buffer_number" and "subscriber_count" slots of the buffer header and should store an estimate of the system_time() corresponding to the beginning of the buffer in the "time" slot of the buffer header.
The SOUND_WRITE_BUFFER call is allowed to return before the data in the buffer has been consumed but the playback completion semaphore must be released when the buffer can be recycled.
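The size and address derivation above can be wrapped in a small bounds-checked helper. The following is only a sketch: the typedefs are local stand-ins for the BeOS headers so that it compiles elsewhere, and `audio_payload` and `locate_payload` are hypothetical names that do not come from the original article.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for the BeOS types so the sketch compiles anywhere;
   a real driver would include the BeOS headers instead. */
typedef int32_t int32;
typedef int16_t int16;
typedef int64_t bigtime_t;

typedef struct audio_buffer_header {
    int32 buffer_number;
    int32 subscriber_count;
    bigtime_t time;
    int32 reserved_1;   /* total size: header plus audio data, in bytes */
    int32 reserved_2;
    int32 reserved_3;
    int32 reserved_4;
} audio_buffer_header;

/* Hypothetical helper type: where the samples start and how many bytes follow. */
typedef struct audio_payload {
    int16 *data;
    int32 bytes;
} audio_payload;

/* Hypothetical helper: locate the sample data that follows the header.
   Returns 0 on success, -1 if the recorded size is smaller than the header. */
static int locate_payload(audio_buffer_header *header, audio_payload *out)
{
    int32 bytes = header->reserved_1 - (int32)sizeof(*header);

    if (bytes < 0)
        return -1;
    out->data = (int16 *)(header + 1);  /* data starts right after the header */
    out->bytes = bytes;
    return 0;
}
```

A control hook could call `locate_payload()` on the ioctl argument before queuing the samples, and reject the request when it returns -1.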
The SOUND_READ_BUFFER ioctl takes an (audio_buffer_header*) argument. The size and address of the data can be computed as above. The "time" slot should be written with the estimate of the system_time() corresponding to the beginning of the buffer. The call is allowed to return before the buffer is full but the capture completion semaphore must be released as soon as the buffer is full.
The SOUND_GET_PARAMS and SOUND_SET_PARAMS ioctls read and write a set of parameters which correspond to the settings in the sound preferences panel. The ioctl argument is a (sound_setup*) in the following format:
enum adc_source { line = 0, cd, mic, loopback };
enum sample_rate {
    kHz_8_0 = 0, kHz_5_51, kHz_16_0, kHz_11_025, kHz_27_42,
    kHz_18_9, kHz_32_0, kHz_22_05, kHz_37_8 = 9,
    kHz_44_1 = 11, kHz_48_0, kHz_33_075, kHz_9_6, kHz_6_62
};
enum sample_format {};  /* obsolete */
struct channel {
    enum adc_source adc_source; /* adc input source */
    char adc_gain;              /* 0..15 adc gain, in 1.5 dB steps */
    char mic_gain_enable;       /* non-zero enables 20 dB MIC input gain */
    char cd_mix_gain;           /* 0..31 cd mix to output gain in -1.5dB steps */
    char cd_mix_mute;           /* non-zero mutes cd mix */
    char aux2_mix_gain;         /* unused */
    char aux2_mix_mute;         /* unused */
    char line_mix_gain;         /* 0..31 line mix to output gain in -1.5dB steps */
    char line_mix_mute;         /* non-zero mutes line mix */
    char dac_attn;              /* 0..61 dac attenuation, in -1.5 dB steps */
    char dac_mute;              /* non-zero mutes dac output */
};
typedef struct sound_setup {
    struct channel left;                /* left channel setup */
    struct channel right;               /* right channel setup */
    enum sample_rate sample_rate;       /* sample rate */
    enum sample_format playback_format; /* ignore (always 16bit-linear) */
    enum sample_format capture_format;  /* ignore (always 16bit-linear) */
    char dither_enable;                 /* non-zero enables dither on 16 => 8 bit */
    char mic_attn;                      /* 0..64 mic input level */
    char mic_enable;                    /* non-zero enables mic input */
    char output_boost;                  /* ignore (always on) */
    char highpass_enable;               /* ignore (always on) */
    char mono_gain;                     /* 0..64 mono speaker gain */
    char mono_mute;                     /* non-zero mutes speaker */
} sound_setup;
On PPC systems the "mic_attn" and "mic_enable" parameters are used to control the amount of adc to dac "loopback" instead of the microphone input level.
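Because the sample_rate field is an enumerator rather than a frequency, a driver usually needs a lookup step before it can program the hardware. Here is a minimal sketch of such a lookup; the function name `sample_rate_to_hz` is hypothetical, and the Hz values are simply read off the enumerator names, so the exact rates a particular codec produces may differ slightly.

```c
#include <stddef.h>

/* Copied from the sound_setup definition above. */
enum sample_rate {
    kHz_8_0 = 0, kHz_5_51, kHz_16_0, kHz_11_025, kHz_27_42,
    kHz_18_9, kHz_32_0, kHz_22_05, kHz_37_8 = 9,
    kHz_44_1 = 11, kHz_48_0, kHz_33_075, kHz_9_6, kHz_6_62
};

/* Hypothetical helper: nominal rate in Hz for each enumerator.
   The values are an assumption derived from the enumerator names. */
static long sample_rate_to_hz(enum sample_rate rate)
{
    switch (rate) {
    case kHz_8_0:    return 8000;
    case kHz_5_51:   return 5510;
    case kHz_16_0:   return 16000;
    case kHz_11_025: return 11025;
    case kHz_27_42:  return 27420;
    case kHz_18_9:   return 18900;
    case kHz_32_0:   return 32000;
    case kHz_22_05:  return 22050;
    case kHz_37_8:   return 37800;
    case kHz_44_1:   return 44100;
    case kHz_48_0:   return 48000;
    case kHz_33_075: return 33075;
    case kHz_9_6:    return 9600;
    case kHz_6_62:   return 6620;
    }
    return -1;  /* 8 and 10 are not defined by the enum */
}
```

Note that the enumerators are not in frequency order and that the values 8 and 10 are skipped, so a driver should reject anything the lookup does not recognize.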
The SOUND_LOCK_FOR_DMA ioctl takes an (audio_buffer_header*) argument but can be ignored except on Macintosh. A Macintosh driver should call lock_memory() on the audio buffer with the B_DMA_IO flag.
The audio server opens the driver named "/dev/old/sound". So when you have implemented these ioctl calls and you want to test your driver with the audio server, you can either name it "/dev/old/sound" or name it something like "/dev/old/mydriver" and create a symbolic link in your UserBootscript file:
ln -s /dev/old/mydriver /dev/old/sound
The "old" in this path name is to remind you that this is an interface which will be deprecated in the future.
DEVELOPERS' WORKSHOP: Updating An Old Sound Card Driver
By Marc Ferguson - marc@be.com
"Developers' Workshop" is a weekly feature that provides answers to our developers' questions, or topic requests. To submit a question, visit
[http://www.be.com/developers/suggestion_box.html].
The new Media Kit contains a compatibility interface which lets you use a pre-R4 sound card driver in R4 with only minor modifications to the driver. Here is a description of those modifications, assuming that you are starting with a driver written to the old API described in the Newsletter article:
http://www.be.com/aboutbe/benewsletter/volume_II/Issue22.html
The most important change is that the driver must publish a "sample clock" which allows the Media Kit to synchronize to the sound card. The sample clock corresponds to performance time measured by the DAC (or ADC) clock in microseconds. If the nominal sample rate is 44100 samples per second then the sample clock moves at (1000000 / 44100) microseconds per sample processed by the DAC (or ADC). If the DAC is actually processing 44109 samples per second then the DAC's sample clock will run slightly faster than real-time.
The sample clock is returned by the driver in the audio_buffer_header structure which now looks like this:
typedef struct audio_buffer_header {
    int32 buffer_number;
    int32 subscriber_count;
    bigtime_t time;
    int32 reserved_1;
    int32 reserved_2;
    bigtime_t sample_clock;
} audio_buffer_header;
As before, the SOUND_WRITE_BUFFER and SOUND_READ_BUFFER ioctls should store an estimate of the system_time() corresponding to the beginning of the buffer in the "time" slot of the audio_buffer_header. They should also fill the "sample_clock" slot with the sample clock for the beginning of the buffer.
To calculate the sample clock, keep track of the number of samples processed and multiply by the sample clock rate:
header->sample_clock = (samples_processed * 1000000LL) / sample_rate;
As long as buffers are flowing continuously, the sample clock will move at the same rate as the performance time of the buffers. But if there is an interruption between buffers then the sample clock must account for the length of the interruption. One way to do this is to measure the duration of the interruption and increment "samples_processed" by the number of samples that would have been played during that time:
samples_processed += (time_skipped * sample_rate) / 1000000;
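The two formulas can be combined into a small bookkeeping structure. This is a sketch under stated assumptions: the struct and function names are hypothetical, and a real driver would also need to protect the counter against concurrent access from its interrupt handler.

```c
#include <stdint.h>

/* Hypothetical per-stream state for the sample clock. */
typedef struct sample_clock_state {
    int64_t samples_processed;  /* samples played (or captured) so far */
    int32_t sample_rate;        /* nominal rate in samples per second */
} sample_clock_state;

/* Sample clock in microseconds for the current position. */
static int64_t sample_clock_usecs(const sample_clock_state *s)
{
    return (s->samples_processed * 1000000LL) / s->sample_rate;
}

/* Advance the counter for an interruption of time_skipped microseconds,
   as if samples had kept flowing at the nominal rate. */
static void account_for_gap(sample_clock_state *s, int64_t time_skipped)
{
    s->samples_processed += (time_skipped * s->sample_rate) / 1000000;
}
```

For example, after one second of audio at 44100 samples per second followed by a half-second interruption, the counter stands at 66150 samples and the sample clock reads 1500000 microseconds.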
There are four new ioctl codes which a driver may optionally support to negotiate an optimal buffer size with the Media Kit. They appear at the end of this list:
#include <Drivers.h>
enum {
    SOUND_GET_PARAMS = B_DEVICE_OP_CODES_END,
    SOUND_SET_PARAMS,
    SOUND_SET_PLAYBACK_COMPLETION_SEM,
    SOUND_SET_CAPTURE_COMPLETION_SEM,
    SOUND_RESERVED_1,   /* unused */
    SOUND_RESERVED_2,   /* unused */
    SOUND_DEBUG_ON,     /* unused */
    SOUND_DEBUG_OFF,    /* unused */
    SOUND_WRITE_BUFFER,
    SOUND_READ_BUFFER,
    SOUND_LOCK_FOR_DMA,
    SOUND_SET_CAPTURE_PREFERRED_BUF_SIZE,
    SOUND_SET_PLAYBACK_PREFERRED_BUF_SIZE,
    SOUND_GET_CAPTURE_PREFERRED_BUF_SIZE,
    SOUND_GET_PLAYBACK_PREFERRED_BUF_SIZE
};
The SOUND_SET_CAPTURE_PREFERRED_BUF_SIZE and SOUND_SET_PLAYBACK_PREFERRED_BUF_SIZE ioctls take an (int32) argument containing the buffer size (without the header) that the Media Kit plans to send. If the driver is using a circular DMA buffer it may want to set the size of the DMA buffer to twice the preferred buffer size to minimize latency.
The SOUND_GET_CAPTURE_PREFERRED_BUF_SIZE and SOUND_GET_PLAYBACK_PREFERRED_BUF_SIZE ioctls take an (int32*) argument in which the driver can return the current setting of the preferred buffer size.
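A sketch of how the playback pair might be handled follows. It assumes, based on the (int32) versus (int32*) wording above, that the SET call passes the size in the argument itself while the GET call passes a pointer. The op-code values, the `playback_state` struct, and the function name are all placeholders; a real driver would use the SOUND_* constants from the enum above inside its control hook.

```c
#include <stdint.h>

typedef int32_t int32;  /* stand-in for the BeOS type */

/* Placeholder op codes; the real values follow B_DEVICE_OP_CODES_END. */
enum {
    SKETCH_SET_PLAYBACK_PREFERRED_BUF_SIZE = 1,
    SKETCH_GET_PLAYBACK_PREFERRED_BUF_SIZE
};

/* Hypothetical driver state for the playback side. */
typedef struct playback_state {
    int32 preferred_buf_size;   /* size the Media Kit plans to send */
    int32 dma_buf_size;         /* size of the circular DMA buffer */
} playback_state;

/* Returns 0 on success, -1 for an unknown op or a bad size. */
static int playback_buf_size_control(playback_state *st, int op, void *arg)
{
    switch (op) {
    case SKETCH_SET_PLAYBACK_PREFERRED_BUF_SIZE: {
        /* Assumption: the size travels in the argument itself. */
        int32 size = (int32)(intptr_t)arg;
        if (size <= 0)
            return -1;
        st->preferred_buf_size = size;
        st->dma_buf_size = 2 * size;    /* twice the preferred size */
        return 0;
    }
    case SKETCH_GET_PLAYBACK_PREFERRED_BUF_SIZE:
        *(int32 *)arg = st->preferred_buf_size;
        return 0;
    }
    return -1;
}
```

The capture pair would look the same with its own state.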
Devices supporting the above API should be published under the "/dev/audio/old/" directory where they will be found by the "legacy" media add-on.
Adding a sample clock to your old R3 sound card driver will get it up and running again under R4.
BE ENGINEERING INSIGHTS: Multiaudio API
By Steven Olson solson@be.com
I'm in the family bunker deep in an Idaho hillside. Y2K is officially over but I'm still not sure if it's safe to come out yet. The only communication line I have is a fiber optic link to Be's world headquarters in Menlo Park. Because of some poor planning on my part, though, there's no gas for the electric generator. Instead, I have to use a candle (signal) and deck of playing cards (modulator) to send this article. Fiber optic cable was the right choice for the bunker because of its high speed and bandwidth capabilities. Coincidentally, BeOS is also known for high speed and bandwidth capabilities. In particular, the new multiaudio driver API takes full advantage of BeOS's superior speed and bandwidth.
Multi
Not all of you may be familiar with the multiaudio API, as it's intended only for driver writers. User level add-ons and applications will talk to the multiaudio node, or "multinode" as it's called. The advantage of the multiaudio API over other APIs (such as legacy and game) is that it's ideally suited for professional and semiprofessional audio cards. These cards generally have more inputs and outputs, higher sampling rates (typically 48 kHz or 96 kHz), and greater bit depths (up to 32 bits per sample) than typical game sound cards. They may also have support for other audio formats such as S/PDIF and ADAT. The "complete" documentation for the driver API is included in three files: multi_audio.h, multi_audio.gb, and multi_audio.txt. To get these files, write to trinity@be.com. Without rehashing all the information in the files, I'd like to hit the highlights of the new API.
Highlights
The API is different from the other audio APIs in that read/write calls are combined into a single ioctl, B_MULTI_BUFFER_EXCHANGE. This call is synchronous -- it returns after playback data has been transferred (or queued) and capture data (if any) is present in the capture buffer. A ping-pong type of buffer management is normally used -- one buffer is being played (or filled with capture data) while a second buffer is being prepared. In some cases, a ring buffer may be required.
Another area of interest is the mixer ioctls. Extensible mixer implementation is tricky, and this API is no exception. However, the burden is now off the driver writer and on the node and add-on writers instead. That's good news for us driver writers. (There should be example code in the future to make nodes and add-ons easier too.) The ioctls B_MULTI_LIST_MIX_CHANNELS, B_MULTI_LIST_MIX_CONTROLS, and B_MULTI_LIST_MIX_CONNECTIONS return mixer components associated with the device. This allows the mixer GUI and implementation to be handled by the user mode components.
Masters
One item that's slightly different from the documentation is the ganging together of controls. The multi_mix_control structure has a member named "master." It is intended that this value be 0 if the control is not slaved. If it is ganged, then the ID of the master control goes here. In order to facilitate the implementation of parameter webs, I suggest that the master control use its own control ID here and not 0. This reduces to using 0 if the control is not ganged or slaved, and the master control ID if it is. The documentation will be updated to reflect these changes.
Complications
Some things are tricky. For example, say your card supports a 32 kHz sampling rate, but only if you're sample-locked to an incoming 32 kHz S/PDIF signal. You should report the actual sample rate (32 kHz) when requested (B_MULTI_GET_GLOBAL_FORMAT) but don't show 32 kHz in the B_MULTI_GET_DESCRIPTION ioctl. This will prevent users from manually trying to set the rate to 32 kHz.
Confusion
Some people have reported confusion over the terms "bus" and "channel" as used in the multiaudio API. Busses are the connectors to the world outside the computer, while channels carry audio data inside the computer. Busses may be digital (e.g., ADAT) or analog. Channels are what's "processed" by the computer. In a typical capture scenario, an analog signal is converted to digital PCM data by an ADC. The digital PCM data is associated with the input channel, while the analog signal is part of the input bus.
Latency
The purpose of the multiaudio API is to create an environment in which high-performance audio cards can excel. So the question "How fast is it?" invariably arises. The answer is "It depends." How fast is your CPU? How fast is your hard disk? How fast is your memory bus? Do you want multitrack hard disk recording or are you more interested in "real time" (<10 ms) effects processing? In audio, the primary concern is latency. How long does it take to get an analog signal into the computer, process it, and then send it back out? What if you also wish to write processed data to disk? There are many items which contribute to the latency of the entire system, including the OS, the API, A/D D/A converter latency, buffer size, etc... (If the OS is not designed properly, you may be able to get two channels of audio in and out fairly quickly, but you may not be able to access the disk simultaneously.) So how fast is the new API? The only way to know is to test your machine and card.
Test
Everyone has a preferred method for testing audio latency and I'm no exception. I recommend the following: input a sine wave from a signal generator to the sound card, run a loopback program that doesn't bypass the converters, then measure the phase difference with an oscilloscope. Make sure that the period of the input signal is greater than the buffer playback time. A very simple loopback program is available at <ftp://ftp.be.com/pub/samples/drivers/multiaudio_test.zip>. This program does not currently use the multinode but instead talks directly to the driver. Consequently, it should give driver writers a basic understanding of how the new driver API is used and the extremely low latencies that are possible with a good sound card and the new multiaudio API.
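The arithmetic behind the oscilloscope measurement is short enough to write down. The following sketch uses hypothetical function names; it assumes the true latency is shorter than one period of the test signal, since otherwise the phase reading wraps around and becomes ambiguous.

```c
/* Latency in milliseconds implied by a measured phase lag.
   Only valid while the latency is shorter than one signal period. */
static double latency_ms_from_phase(double phase_deg, double signal_hz)
{
    double period_ms = 1000.0 / signal_hz;
    return (phase_deg / 360.0) * period_ms;
}

/* Playback time of one buffer in milliseconds, used to pick a test
   signal whose period is longer than the buffer. */
static double buffer_playback_ms(long frames, double sample_rate_hz)
{
    return 1000.0 * (double)frames / sample_rate_hz;
}
```

For example, a 90 degree lag on a 100 Hz sine wave corresponds to 2.5 ms, and a 441-frame buffer at 44100 frames per second plays for 10 ms, so a test signal for that buffer size should be below 100 Hz.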
DEVELOPERS' WORKSHOP: Using and Writing Device Drivers on the BeOS
By Jon Watte - hplus@be.com
"Developers' Workshop" is a weekly feature that provides answers to our developers' questions, or topic requests. To submit a question, visit
[http://www.be.com/developers/suggestion_box.html].
This is the second in a series of Developer Workshop articles to help people program with the new Media Kit. It was written by Be's Director of Media Technology. The article first appeared on <http://www.b500.com/bepage/>, and possible future updates will be posted there.
Rather than use a BeOS device driver directly, user-level applications instead use some higher-level API which calls a user-level add-on, which calls the driver. There's nothing, however, that prevents an application from talking directly to a driver just as an appropriate add-on would. Indeed, in cases where there is no appropriate add-on API you have to talk to the driver directly from an application. It's also useful to talk directly to the driver while developing and testing it. In this article, we'll call the entity (application or add-on) that is using the driver a "client" of the driver.
The first thing you need to do is to find the device. The driver will export one or more devices in subdirectories of the /dev directory. For instance, the sonic_vibes audio card driver exports in /dev/audio/raw/sonic_vibes/ as well as in other locations.
Because most BeOS drivers support handling more than one installed card of the same kind, the convention is to number the installed cards, starting at 1. Thus, the first installed sonic_vibes card is found as /dev/audio/raw/sonic_vibes/1. You can use the BDirectory class or the opendir() C function to look through a directory for available devices.
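As a sketch of the opendir() approach, the following function prints every entry in a device directory and returns how many it found. The function name `list_devices` is hypothetical; the calls themselves are plain POSIX.

```c
#include <dirent.h>
#include <stdio.h>
#include <string.h>

/* Print each entry under dir_path and return the number of entries,
   or -1 if the directory cannot be opened. */
static int list_devices(const char *dir_path)
{
    DIR *dir = opendir(dir_path);
    struct dirent *entry;
    int count = 0;

    if (dir == NULL)
        return -1;
    while ((entry = readdir(dir)) != NULL) {
        /* Skip the directory's own "." and ".." entries. */
        if (strcmp(entry->d_name, ".") == 0 || strcmp(entry->d_name, "..") == 0)
            continue;
        printf("%s/%s\n", dir_path, entry->d_name);
        count++;
    }
    closedir(dir);
    return count;
}
```

Calling `list_devices("/dev/audio/raw/sonic_vibes")` on a machine with one such card installed would be expected to print a single path ending in /1.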
Once you know what device you want to use, you should open it using the open() C call:
int fd = open("/dev/audio/raw/sonic_vibes/1", O_RDWR);
You'll use this file descriptor to refer to the open device from now on. The file descriptor should be closed with close() when you're done with it. If the process (team) that opened the device crashes or otherwise goes away without closing the file descriptor, it will be garbage collected and closed by the kernel.
Many devices implement the read() and write() protocols. Thus, to record some audio from the default input device, you just do this:
short * data = (short *)malloc(200000);
ssize_t rd = read(fd, data, 200000);
rd will contain the number of bytes actually read, or -1 if an error occurred (in which case the thread-local variable errno will contain the error code).
The format of the data returned by the device varies with
the device; the default format of the sonic_vibes driver is
stereo 16-bit signed native-endian 44.1 kHz PCM sample data.
To play back this data using the write() call, do this:
ssize_t wr = write(fd, data, rd);
wr will contain the actual number of bytes written, or -1 for error, in which case errno contains the error code.
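Since write() reports the number of bytes actually written, a client that must deliver a whole buffer typically loops until everything has gone out. Here is a minimal sketch of such a loop; `write_all` is a hypothetical helper name, and retrying on EINTR is a common POSIX convention rather than something the article prescribes.

```c
#include <errno.h>
#include <unistd.h>

/* Write len bytes from buf to fd, retrying after short writes.
   Returns the number of bytes written, or -1 with errno set on error. */
static ssize_t write_all(int fd, const void *buf, size_t len)
{
    const char *p = (const char *)buf;
    size_t left = len;

    while (left > 0) {
        ssize_t wr = write(fd, p, left);
        if (wr < 0) {
            if (errno == EINTR)
                continue;       /* interrupted before any data moved: retry */
            return -1;
        }
        if (wr == 0)
            break;              /* no progress possible: report what was sent */
        p += wr;
        left -= (size_t)wr;
    }
    return (ssize_t)(len - left);
}
```

With this helper, the playback call above becomes `write_all(fd, data, rd)`.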
Many devices do not work well with the simple read() and write() protocol; for instance, video capture cards often require a contiguous locked area of memory, which typically is not found in a buffer passed in by the user to read() or write(). Then you can implement your protocol as ioctl() selectors. There are a number of well-defined ioctl() values that your device can implement if they make sense for the class of device you're dealing with; specific subdirectories of /dev may require certain ioctl() protocols to be implemented (such as /dev/joystick, /dev/midi, or /dev/audio).
Suppose we're using a video capture driver which implements the following protocol:
enum {
    drvOpSetBuffers = B_DEVICE_OP_CODES_END+10001,
    drvOpStart,
    drvOpStop,
    drvOpWaitForFrame,
};
struct drv_buffer_info {
    color_space in_space;
    int in_width;
    int in_height;
    int in_rowbytes;
    void * in_buffers[2];   /* even, odd */
};
struct drv_frame_info {
    int out_frame_number;
};
The client could then configure the driver like so:
drv_buffer_info buf_info;
buf_info.in_space = B_YUV422;
buf_info.in_width = 640;
buf_info.in_height = 240;
buf_info.in_rowbytes = 640;
area_id buf_area = create_area("capture buffers",
    &buf_info.in_buffers[0], B_ANY_KERNEL_ADDRESS,
    buf_info.in_rowbytes*buf_info.in_height*2,
    B_CONTIGUOUS, B_READ_AREA|B_WRITE_AREA);
buf_info.in_buffers[1] = ((char *)buf_info.in_buffers[0])+
    buf_info.in_rowbytes*buf_info.in_height;
int err = ioctl(fd, drvOpSetBuffers, &buf_info);
if (err == -1) err = errno;
It would start video capture like so:
int err = ioctl(fd, drvOpStart);
if (err == -1) err = errno;
It would wait for each frame to arrive like so:
while (running) {
    drv_frame_info frm_info;
    int err = ioctl(fd, drvOpWaitForFrame, &frm_info);
    if (err == -1) err = errno;
    process_frame(frm_info.out_frame_number,
        buf_info.in_buffers[frm_info.out_frame_number & 1]);
}
Last, it would stop the capture like so:
int err = ioctl(fd, drvOpStop);
if (err == -1) err = errno;
In real life, a typical protocol is more capable, and thus more complicated, than shown here, but it should be enough to give you an idea of how the protocol between a user-level client and a driver can be structured.
OK, now that you know how to use a device driver, and have some idea how to structure the protocol between the client and the driver, it's time to get down and dirty with the actual process of creating a driver. Creating a driver on BeOS is done using ANSI C; the C++ language requires certain support which is not available in the BeOS kernel environment.
If you already have a large C++ library that talks to your hardware device and want to port it to BeOS, we suggest that you make your driver very shallow and use it just to read/write card registers and service interrupts, and put all your C++ code in a user-level add-on. Some readers may know "interrupts" by the name "IRQ"; we'll call them "interrupts" because that's the terminology used by the BeOS kernel kit.
A driver is a loadable shared library (add-on) which exports certain well-known function names such as init_driver() and publish_devices(). The driver gets loaded by the "devfs" file system (which runs in the kernel) in response to some client calling file system functions opendir(), open(), and others. A driver may get loaded and unloaded several times, not necessarily being opened just because it's loaded. It will, however, never be unloaded while it is open. The moral of this story is that you cannot expect global or static variables to retain their values after uninit_driver() has been called, or before init_driver() is called.
First, you have to decide what to call your driver and your devices. Typically, one driver will service any number of installed cards of the same type, and each of those cards may cause the driver to publish multiple device names under /dev. These device names will be referred to as "devices"; the actual binary add-on will be called the "driver"; and the pieces of hardware serviced by the driver will be called the "hardware."
Typically, you'll name your driver something similar to the name of the main chip serviced by the driver. The sonic_vibes driver drives the S3 Sonic Vibes chip; the bt848 driver drives the Brooktree Bt848/878 chips; the awe64 driver drives the Creative Labs SoundBlaster AWE32/64 cards; etc.
Your device names will then be derived from the protocols they implement, as well as the driver name. Thus, sonic_vibes publishes devices in /dev/audio/raw/sonic_vibes, /dev/audio/old/sonic_vibes, /dev/audio/mix/sonic_vibes, /dev/audio/mux/sonic_vibes, /dev/midi/sonic_vibes, and /dev/joystick/sonic_vibes, each device implementing the protocol that's defined for that part of the /dev directory tree. If there is no protocol defined for your device, you can implement whatever protocol you wish. Try to be consistent in your naming, though. For instance, a video capture driver for a chip named Pixtor might publish devices in /dev/video/pixtor/. By convention, each card will be numbered from 1 and up, so the first "pixtor" device would be called /dev/video/pixtor/1.
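The naming convention can be sketched as a helper that builds one name per detected card. Everything here is illustrative: `build_device_names`, `name_storage`, and the limit of four cards are assumptions, and the sketch uses snprintf() from the standard C library, which may not be available in every kernel environment. The names are written relative to /dev, which is the form a driver publishes.

```c
#include <stdio.h>

#define MAX_CARDS 4     /* hypothetical limit on supported cards */

static char name_storage[MAX_CARDS][64];
static const char *names[MAX_CARDS + 1];   /* NULL-terminated list */

/* Fill "names" with "<class_dir>/<driver>/<n>" for n = 1..cards_found.
   Returns the number of names written. */
static int build_device_names(const char *class_dir, const char *driver,
                              int cards_found)
{
    int i;

    if (cards_found > MAX_CARDS)
        cards_found = MAX_CARDS;
    for (i = 0; i < cards_found; i++) {
        snprintf(name_storage[i], sizeof(name_storage[i]), "%s/%s/%d",
                 class_dir, driver, i + 1);     /* cards are numbered from 1 */
        names[i] = name_storage[i];
    }
    names[i] = NULL;    /* terminate the list after the last card */
    return i;
}
```

For two Pixtor cards, `build_device_names("video", "pixtor", 2)` leaves "video/pixtor/1" and "video/pixtor/2" in the list, followed by the terminating NULL.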
If your device is of some irregular kind, you can always publish in /dev/misc/your-name. Please avoid publishing directly under /dev and avoid inventing new classes of devices under /dev/. If you feel you have to, contact BeOS developer support or your favorite Be engineer first, to check that your scheme will work well with the rest of the system.
The first function called in your driver, if you implement and export it, is the init_hardware() hook. Please refer to the skeleton driver for the C function prototype of each driver function.
init_hardware() will only be called the first time your driver is loaded, to find and reset your hardware and get it into some known state, if necessary. Many drivers can do without implementing this hook at all. If you implement this hook, but don't find any of your cards installed, you should return a negative error code, such as ENODEV.
Note: on BeOS, all the POSIX error codes (EXXX) are negative numbers, so you should return them as-is to signify error.
The next driver hook being called is init_driver(), which definitely should be implemented by all drivers. If your device is an ISA card, you'll want to call get_module() on the ISA bus manager module to initialize a global variable to refer to that module for easy access (typically, this variable will be named "isa"). For PCI cards, use the PCI bus manager module, found in <PCI.h>.
Then, use the bus manager module to iterate over available hardware, looking for instances of the hardware you support. For each piece of hardware, make sure you enable its PCI bus interface in the configuration registers if it isn't already. Then allocate whatever memory you need to keep track of the hardware and the devices that hardware will cause to be published, and make sure the hardware is in some safe, well-behaved state and not generating spurious interrupts or other bad behavior. To allocate memory, use malloc(). To later deallocate this memory, use free().
/* a global variable for the PCI module */
pci_module_info * pci;

/* in init_driver() */
pci_info info;
int ix = 0;
int cards_found = 0;
if (get_module(B_PCI_MODULE_NAME, (module_info **)&pci) < 0)
    return ENOSYS;
/* get_nth_pci_info() returns B_OK while there are more devices */
while ((*pci->get_nth_pci_info)(ix, &info) == B_OK) {
    if (info.vendor_id == MY_VENDOR &&
        info.device_id == MY_DEVICE) {
        /* index by card number, not by PCI device index */
        my_card_array[cards_found].info = info;
        cards_found++;
    }
    ix++;
    if (cards_found == MAX_CARDS)
        break;
}
if (cards_found < 1)
    return ENODEV;
names[cards_found] = NULL;

/* in uninit_driver() */
put_module(B_PCI_MODULE_NAME);
If you find no hardware, return ENODEV. If you find hardware, but something is wrong and you're not prepared to publish any devices, return ENOSYS or ENOENT. If all is OK, return B_OK.
Next, the hook publish_devices() will be called. It should return a pointer to an array of C string pointers, one per device you want to publish, and terminated by a NULL pointer. For a hypothetical "Pixtor" driver which publishes one device per installed hardware card, up to a maximum of four installed cards, you'll typically have a global variable "names", like so:
static char * names[5] = {
    "video/pixtor/1",
    "video/pixtor/2",
    "video/pixtor/3",
    "video/pixtor/4",
    NULL    /* init_driver() sets unavailable slots to NULL */
};
In init_driver() you will allocate a name string per device you find (unless a static array will work, as shown), and make the corresponding slot in "names" point to that string. Then you can just return the "names" array in publish_devices():
const char **
publish_devices(void)
{
    if (names[0] == NULL)
        return NULL;
    /* "names" is declared as char *[], so cast to the return type */
    return (const char **)names;
}
Note that the names assume they live under "/dev/" and thus should NOT contain that part; a typical name may be "video/pixtor/1".
How does the devfs file system know which driver to open when a program asks for the device named "/dev/foo/bar/1"? Under R3, devfs opened all drivers when the system booted and called their publish_devices() function, so it could know what devices were available. However, this mechanism doesn't scale well with an increasing number of drivers available for BeOS, and a new mechanism was introduced in R4.
Inside /system/add-ons/kernel/drivers (and ~/config/add-ons/kernel/drivers) there are now two folders, "dev" and "bin". All driver binaries go into "bin", and symlinks to the drivers go into the appropriate subdirectory of "dev". Thus the hypothetical Pixtor driver would put the driver in ...kernel/drivers/bin, and put a symlink to that driver in ...kernel/drivers/dev/video. The symlink has to be put there by the installation program or script for the driver, or, for development purposes, by the driver build process.
Thus, when a client calls open("/dev/video/pixtor/1") or opendir("/dev/video/"), devfs will scan all symlinks found in ...kernel/drivers/dev/video (and subdirectories thereof) and open the referenced drivers to call their init_driver() and publish_devices() functions, in order to figure out which driver(s) publish devices that would interest the client. Devfs is reasonably smart about only doing this once, and it uses the modification date of the driver in .../bin to do that, so when you replace your driver with a newer copy, subsequent open() calls for your driver will cause devfs to load the new version (once all the old clients have closed the old driver).
Not having to reboot for the new driver to be found is one of my favorite features of BeOS for driver development.
When a client decides to open one of your devices, the kernel calls your find_device() hook with the name in question. It's up to you to map this name (which you previously published in publish_devices()) to the right device type within your driver. If you support only one device type, this is easy; even if you support more than one, a simple strncmp() is typically sufficient.
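The strncmp() mapping might look like the following sketch. It deliberately avoids the real device_hooks type so that it compiles outside BeOS: `sketch_hooks` is a stand-in, and the "misc/pixtor/" prefix is a hypothetical second device type invented for the example.

```c
#include <stddef.h>
#include <string.h>

/* Stand-in for the real device_hooks struct from <Drivers.h>. */
typedef struct sketch_hooks {
    const char *label;
} sketch_hooks;

static sketch_hooks video_hooks = { "video" };
static sketch_hooks misc_hooks = { "misc" };   /* hypothetical second type */

/* Map a published name (relative to /dev) to the hooks for its type.
   Returns NULL if this driver did not publish the name. */
static sketch_hooks *sketch_find_device(const char *name)
{
    static const char video_prefix[] = "video/pixtor/";
    static const char misc_prefix[] = "misc/pixtor/";

    /* sizeof - 1 compares the prefix without its terminating NUL. */
    if (strncmp(name, video_prefix, sizeof(video_prefix) - 1) == 0)
        return &video_hooks;
    if (strncmp(name, misc_prefix, sizeof(misc_prefix) - 1) == 0)
        return &misc_hooks;
    return NULL;
}
```

Comparing only the prefix means the same branch handles every card number, so "video/pixtor/1" and "video/pixtor/2" share one set of hooks.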
A "device type" consists of a set of function pointers that define the interface for a device. In <Drivers.h> you'll find the struct device_hooks, which is what should be returned from find_device(). The hooks for open, close, free, read, write, and control must be implemented; the hooks for select, deselect, readv, and writev are optional. If you don't implement readv/writev, the kernel will emulate these functions by repeatedly calling your read/write hooks, which may be less efficient than if you supported the readv/writev functions directly. Don't confuse the hook name "select()" with the Net Kit function "select()"; currently they have nothing to do with each other.
Once you've returned a device_hooks structure, the kernel calls the open hook therein, letting you turn the device name into a unique "cookie" which your other hooks will use to find the open device in other hook calls. The open mode is O_RDONLY, O_WRONLY, or O_RDWR. Depending on your driver's capabilities, you might want to ensure exclusive access to reading and writing respectively, and return an EPERM error if someone tries to open the same device with the same mode twice in a row. It may, however, make sense to allow one open() for O_RDONLY and another open() for O_WRONLY.
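One way to sketch the exclusive-access check is to track the reading and writing sides separately. The struct and function names below are hypothetical, and the sketch leaves out locking; a real open hook would need to guard this state against two clients opening the device at the same moment.

```c
#include <errno.h>
#include <fcntl.h>

/* Hypothetical per-device record of which sides are currently open. */
typedef struct open_state {
    int reader_open;
    int writer_open;
} open_state;

/* Claim the sides requested by an open() mode.
   Returns 0 on success, or EPERM if a requested side is already taken. */
static int claim_open_mode(open_state *st, int flags)
{
    int mode = flags & O_ACCMODE;
    int wants_read = (mode == O_RDONLY || mode == O_RDWR);
    int wants_write = (mode == O_WRONLY || mode == O_RDWR);

    if ((wants_read && st->reader_open) || (wants_write && st->writer_open))
        return EPERM;
    if (wants_read)
        st->reader_open = 1;
    if (wants_write)
        st->writer_open = 1;
    return 0;
}
```

With this scheme one client can hold the device O_RDONLY while another holds it O_WRONLY, but a third open in either mode is refused until the matching side is released in the close or free hook.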
The kernel never dereferences the "cookie" value, so it can be a pointer to some private data you malloc(), or a pointer to an element in a global array, or just an index of some sort. Suffice it to say that you must be able to get all necessary state information for the open device, and the hardware associated with it, when given this cookie in later hook function calls.
Your open() hook will typically need to call install_io_interrupt_handler() to install an interrupt service routine for the hardware in question the first time it is opened, if you didn't already do that in init_driver(). For PCI devices, you find the values to pass to this function in the pci_info struct for your hardware. The "data" value will be passed to your interrupt handler, and thus is typically your "cookie" value.
Note that the device_hooks structure may acquire more functions in later versions of BeOS. To tell the kernel what version of the interface you were compiled with, you should export an int32 variable named api_version, initialized to B_CUR_DRIVER_API_VERSION. The value of B_CUR_DRIVER_API_VERSION changes whenever the size of the device_hooks struct changes. Assuming you put your device_hooks structs in static or global memory, the compiler will set any trailing slots you don't define to NULL for the version of the device_hooks struct you compile with. Just adding this line to your driver is enough, as long as you include <Drivers.h> before it:
int32 api_version = B_CUR_DRIVER_API_VERSION;
When the user is done with your device, he calls close() on the file descriptor that references it. When the file descriptor is closed (or when the last file descriptor is closed, if the user uses dup()), the kernel calls your close() hook. You should start shutting down the device; set a status bit so that future read(), write(), and control() hook calls will return an error, and preferably un-wedge any outstanding blocking I/O requests and have them return EINTR. One technique for doing this is to simply delete the semaphores you use for synchronizing I/O. The acquire_sem() calls in your driver hooks should then detect the B_BAD_SEM_ID error and take that to mean that the device is being shut down, and return EINTR to the calling client.
Once all outstanding I/O requests have returned from your driver, the free() hook is called. Here is where you can deallocate all memory you allocated in open() or during the course of dealing with the specific open device (as indicated by the cookie), and reset your driver to accept a future open() for that device name. Note that there will be exactly one call to the free() hook for each call to the open() hook, and that a call to free() for the cookie returned by open() will always come after a call to close() for that cookie. There is no relation between different cookies returned by different calls to open(); as far as the kernel knows they are independent.
The free() hook is a good place to call remove_io_interrupt_handler() to remove the interrupt handler for your device if you installed it in open(). If you allow multiple open()s, it's easier to install the handler (once) in init_driver() and remove it in uninit_driver(); don't install a handler more than once for the same hardware! Pass the same "data" value as you passed to install_io_interrupt_handler() in open() (i.e., for most devices, your "cookie" value).
Your interrupt handler is called whenever an interrupt on your interrupt number occurs. Because of interrupt sharing, your hardware may not be the hardware that generated the interrupt. Your interrupt handler will be called on to figure out whether the interrupt was caused by your hardware, and if so, to handle it. The first thing you do in the interrupt handler should be to read the appropriate status register on your hardware, and if the interrupt was not generated by your hardware, immediately return B_UNHANDLED_INTERRUPT. This lets the kernel move on to other interrupt handlers installed for the same interrupt number and see if they can handle the interrupt.
If the interrupt was indeed generated by your hardware, you can go ahead and handle the interrupt, and then return B_HANDLED_INTERRUPT.
While your interrupt handler runs, interrupts are turned off. Thus, threads cannot be rescheduled, and other interrupts cannot be handled. This means that your interrupt handler should run as fast as possible. A typical interrupt handler just acquires a spinlock (for mutual exclusion with user-level threads), adjusts some internal data structure, and quite possibly releases a semaphore which the user thread (in read(), write(), or control()) is waiting for.
Because rescheduling with interrupts disabled can cause a total system hang, you should release semaphores using release_sem_etc() and pass the B_DO_NOT_RESCHEDULE flag, like so:
release_sem_etc(my_cookie->some_semaphore, 1, B_DO_NOT_RESCHEDULE);
The scheduling quantum on BeOS is 3000 microseconds. Thus, if you release a semaphore without rescheduling, it may take up to 3 milliseconds before the next reschedule gives the scheduler a chance to notice that your semaphore has become available and to schedule the thread waiting on it. If this is too long (for low-latency media devices like audio and MIDI, for example), your interrupt handler routine can return the special value B_INVOKE_SCHEDULER, which means that you handled the interrupt and want a thread reschedule to happen at the earliest possible time. The kernel then calls resched() as soon as it leaves interrupt level, which gives the scheduler a chance to notice that your semaphore has been released and your waiting thread is now ready to run.
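The three return values can be put together in a handler along the following lines. This is a sketch under assumptions, not code from any real driver: the sketch_card struct, the status bit, and the write-back acknowledge are all hypothetical, and outside BeOS the B_ constants are replaced by stand-ins so that the logic can be compiled and tried in isolation.

```c
#if defined(__BEOS__) || defined(__HAIKU__)
#include <KernelExport.h>
#else
/* Stand-ins so the sketch compiles elsewhere; the real names (and
   their actual values) come from <KernelExport.h>. */
typedef int int32;
typedef unsigned int uint32;
enum {
    B_UNHANDLED_INTERRUPT = 0,
    B_HANDLED_INTERRUPT = 1,
    B_INVOKE_SCHEDULER = 2
};
#endif

#define SKETCH_IRQ_PENDING 0x01u    /* hypothetical status bit */

/* Hypothetical per-card state, passed as the "data" value. */
typedef struct {
    volatile uint32 *status_reg;    /* mapped hardware status register */
    int low_latency;                /* nonzero: waiter needs a prompt wakeup */
    int buffers_done;               /* bookkeeping touched by the handler */
} sketch_card;

int32 sketch_interrupt(void *data)
{
    sketch_card *card = (sketch_card *)data;
    uint32 status = *card->status_reg;

    if ((status & SKETCH_IRQ_PENDING) == 0)
        return B_UNHANDLED_INTERRUPT;   /* shared line: not our card */

    /* Acknowledge; clearing the bit by writing it back is an assumption. */
    *card->status_reg = status & ~SKETCH_IRQ_PENDING;
    card->buffers_done++;

    /* A real handler would wake the waiting thread here with
       release_sem_etc(sem, 1, B_DO_NOT_RESCHEDULE). */
    return card->low_latency ? B_INVOKE_SCHEDULER : B_HANDLED_INTERRUPT;
}
```

The status register is read before anything else, so an interrupt raised by another device on the same line costs only one register read.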
Note that, because of multithreading and thread priorities, your thread may not be the thread chosen to run just because a reschedule happens. If you have really low latency requirements, and can't afford to have lower-priority threads come between your interrupt handler and your waiting thread getting scheduled, you have to use real-time priority for the thread waiting for the interrupt. Using real-time priority for threads is dangerous, however, because they may completely lock out other threads from the system, including the graphics threads that draw to the screen, making the system appear "hung" if your real-time thread does too much work without synchronizing with a blocking primitive (like a semaphore).
Now that you know how your device is loaded and unloaded, and how to handle interrupts generated by your hardware, you can design the rest of your device API to be used by user-level clients.
The read() hook is called in response to a call to the user-level function read() on a file descriptor that references your device. The cookie for your device will be passed to the read() hook, as well as the current position, as maintained by the kernel file descriptor layer. If your device does not support positioning (seeking) you can ignore the position parameter.
Your job inside the read() hook is to transfer data into the buffer passed into the read() hook. The buffer has a size of *numBytes. You should transfer at most that many bytes, and then set *numBytes to the number of bytes transferred. If any bytes were transferred, return B_OK. If an error occurred and/or no data was transferred, set *numBytes to 0 and return a negative error code.
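The *numBytes contract can be illustrated with a small portable sketch. Everything named sketch_ or SKETCH_ is hypothetical: the device state is a plain array standing in for data already captured into kernel memory, and the status type and constants are stand-ins for status_t, B_OK, and a negative error code from the BeOS headers.

```c
#include <stddef.h>
#include <string.h>

/* Stand-ins for status_t, B_OK and a negative error code, so the
   sketch compiles anywhere. */
typedef int sketch_status;
#define SKETCH_OK     0
#define SKETCH_ERROR (-1)

/* Hypothetical device state: data already captured into kernel memory. */
typedef struct {
    char   data[256];
    size_t available;       /* bytes waiting in data[] */
} sketch_device;

sketch_status sketch_read(sketch_device *dev, void *buffer, size_t *numBytes)
{
    size_t count = *numBytes;

    if (count > dev->available)
        count = dev->available;     /* never exceed what we have */
    if (count == 0) {
        *numBytes = 0;              /* nothing transferred: report error */
        return SKETCH_ERROR;
    }
    memcpy(buffer, dev->data, count);
    /* Shift the remaining bytes to the front of the array. */
    memmove(dev->data, dev->data + count, dev->available - count);
    dev->available -= count;
    *numBytes = count;              /* tell the caller what it got */
    return SKETCH_OK;
}
```

A short read is not an error: the caller asks for up to *numBytes bytes and learns from the updated value how many actually arrived.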
Please note that the buffer pointed at by "data" will typically be in the user space of the team calling read(). It will typically be in discontiguous memory, and it will not be locked in physical RAM. Thus, it is not accessible from an interrupt service routine, nor can you DMA directly into it without first locking the buffer and getting the physical memory mapping for it:
status_t
read_hook(void *cookie, off_t position, void *buffer, size_t *numBytes)
{
    /* room for one entry per page the buffer may touch */
    long entries = 2 + *numBytes / B_PAGE_SIZE;
    physical_entry * pe =
        (physical_entry *) malloc(sizeof(physical_entry) * entries);
    status_t err;

    if (pe == NULL) {
        *numBytes = 0;
        return B_NO_MEMORY;
    }
    lock_memory(buffer, *numBytes, B_DMA_IO);
    entries = get_memory_map(buffer, *numBytes, pe, entries);
    /* set up and start your DMA here */
    ...
    /* assume your interrupt handler will release this semaphore when DMA done */
    err = acquire_sem(my_dma_semaphore);
    unlock_memory(buffer, *numBytes, B_DMA_IO);
    if (err < B_OK) {
        *numBytes = 0;
    }
    free(pe);
    return err;
}
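The table size of 2 + *numBytes / B_PAGE_SIZE in the listing above can be checked with a little arithmetic: a buffer that starts and ends in the middle of a page touches up to two more pages than its length alone suggests, and in the worst case every page is a separate physical run. The sketch below is a portable illustration of that bound; the 4096-byte page size and the sketch_ names are assumptions.

```c
#include <stddef.h>

/* Assumed page size; BeOS calls the real constant B_PAGE_SIZE. */
#define SKETCH_PAGE_SIZE 4096u

/* Number of pages touched by `length` bytes starting at address `start`. */
size_t sketch_pages_spanned(size_t start, size_t length)
{
    size_t first, last;

    if (length == 0)
        return 0;
    first = start / SKETCH_PAGE_SIZE;
    last = (start + length - 1) / SKETCH_PAGE_SIZE;
    return last - first + 1;
}

/* The estimate used in the listing: never less than the true page count. */
size_t sketch_entry_estimate(size_t length)
{
    return 2 + length / SKETCH_PAGE_SIZE;
}
```

For example, 4098 bytes starting one byte before a page boundary touch three pages, which is exactly what the estimate allows for.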
The same rules apply for the write() hook, except that the data transfer direction is from the buffer passed by the client to your hardware.
An alternative is to use a contiguous buffer in kernel space that you allocate and copy to/from in read() and write(). If you have a sound card that uses a cyclic auto-repeat DMA buffer, for example, this is often a good solution. However, if the data rate is high, such as for live video or fast mass storage devices, you want to avoid copies. You might choose to just have read() and write() return an error, and use ioctl() exclusively for communicating with your device. Another option is to make ioctl() the preferred protocol, but have read() and write() call the appropriate ioctl() functions for convenience.
These kinds of decisions are easier if you're implementing a device for which Be has defined a protocol, because then you just follow the protocol. However, if you're implementing a driver for a device for which there is no predefined protocol, or if your device will have significantly better performance using some other protocol, you'll have to design the driver protocol on your own.
The control() hook is called in response to the user-level client calling the ioctl() function:
struct the_args {
    int a;
    int * b;
};

int foo;
struct the_args args;
args.a = 1;
args.b = &foo;
err = ioctl(fd, SOME_CONSTANT, &args);
if (err == -1) err = errno;
The control() hook receives the integer constant passed to ioctl(), as well as the pointer argument. Currently, the "size" argument will always be 0 when passed to the hook, so you can ignore it. Assume that the pointer argument is correct for the integer constant in question.
You can start numbering your own operation constants from B_DEVICE_OP_CODES_END+1 (in <Drivers.h>). If you want to avoid the risk of clashing with someone trying to use a protocol you do not know about on your device, you can choose an arbitrary larger number to start numbering from, such as your birthday, as long as the numbers (when read as signed 32-bit integers) are larger than B_DEVICE_OP_CODES_END.
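A private opcode range chosen this way might be declared as in the following sketch. The opcode names and the date-like offset are hypothetical; outside BeOS a stand-in replaces the real B_DEVICE_OP_CODES_END from <Drivers.h> so that the fragment compiles on its own.

```c
#if defined(__BEOS__) || defined(__HAIKU__)
#include <Drivers.h>
#else
/* Stand-in so the sketch compiles elsewhere; the real constant comes
   from <Drivers.h>. */
enum { B_DEVICE_OP_CODES_END = 9999 };
#endif

/* Hypothetical private opcodes, offset well past the reserved range. */
enum {
    SKETCH_OP_BASE = B_DEVICE_OP_CODES_END + 19700101,
    SKETCH_GET_GAIN = SKETCH_OP_BASE,
    SKETCH_SET_GAIN,
    SKETCH_RESET
};
```

Deriving every opcode from one base constant keeps the whole range above B_DEVICE_OP_CODES_END even if you later pick a different offset.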
For the example above, your device control hook can look like this:
status_t
control_hook(void * cookie, uint32 operation, void * data, size_t length)
{
    my_device * md = (my_device *)cookie;
    status_t err = B_OK;

    switch (operation) {
    case SOME_CONSTANT: {
        struct the_args * ta = (struct the_args *)data;
        int i;
        /* clamp the requested count to what the device has */
        if (ta->a > MAX_INDEX_FOR_MY_DEVICE) {
            ta->a = MAX_INDEX_FOR_MY_DEVICE;
        }
        if (ta->b == NULL) {
            err = B_BAD_VALUE;
        }
        else {
            err = acquire_sem(md->lock_sem);
            if (err < B_OK) {
                return err;
            }
            for (i=0; i<ta->a; i++) {
                ta->b[i] = md->some_value[i];
            }
            /* release only after the whole copy is done */
            release_sem(md->lock_sem);
        }
        } break;
    default:
        err = B_DEV_INVALID_IOCTL;
        break;
    }
    return err;
}
Semaphores may cause a reschedule to another thread when released. Thus, you should not release a semaphore from an interrupt handler, or with interrupts disabled, without passing the B_DO_NOT_RESCHEDULE flag (using release_sem_etc()).
It is generally a good idea to put as much code as possible at the user level, and make your driver as shallow as possible even if you aren't forced to by porting C++ code. The less code there is in the driver, the less locked memory will be used, and the less code there is that may crash the kernel. All of your driver's code and global/static data, as well as all memory returned by malloc() called from a driver, is locked (and thus safe to access from an interrupt handler). Be gentle on the system.
Disabling interrupts is NOT sufficient to guarantee atomicity, because on an SMP system, the other CPU may be calling into your driver at the same time. For synchronization with data accessed by interrupt handlers, you have to use a spinlock. Spinlocks are the most primitive synchronization mechanism available; basically they use some atomic memory operation to test-and-set a variable. When the test-and-set fails, the calling thread just keeps trying (busy-waiting) until it succeeds. Thus, contention for spinlocks can be quite CPU intensive. Therefore, they should be used sparingly, and only to synchronize data that really has to be touched by an interrupt handler (since semaphores cannot be used by interrupt handlers).
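The test-and-set idea itself can be sketched with C11 atomics. This is only a portable illustration of the mechanism, not the BeOS kernel's spinlock implementation; in a driver you would use the kernel's own acquire_spinlock() and release_spinlock(), and all the sketch_ names here are hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Illustration only: a portable test-and-set lock, not the BeOS
   kernel's spinlock type. */
typedef struct {
    atomic_flag held;
} sketch_spinlock;

#define SKETCH_SPINLOCK_INIT { ATOMIC_FLAG_INIT }

/* One attempt: returns true if the lock was free and is now ours. */
bool sketch_spin_try(sketch_spinlock *lock)
{
    return !atomic_flag_test_and_set_explicit(&lock->held,
                                              memory_order_acquire);
}

/* Busy-wait until the test-and-set succeeds. */
void sketch_spin_acquire(sketch_spinlock *lock)
{
    while (!sketch_spin_try(lock))
        ;   /* spin: this is where contention burns CPU time */
}

void sketch_spin_release(sketch_spinlock *lock)
{
    atomic_flag_clear_explicit(&lock->held, memory_order_release);
}
```

The empty loop in sketch_spin_acquire() is the busy-wait described above, which is why the time spent holding such a lock should be kept as short as possible.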
A spinlock is simply an int32 value in some permanent storage (a global, or some memory you malloc() as part of opening your device) that is initialized to 0 before being used the first time. To acquire a spinlock, you turn off interrupts and then call acquire_spinlock():
/* these are global variables */
int32 my_spinlock = 0;
char protected_data[128];
int protected_ctr = 0;

    /* Acquire spinlock. */
    cpu_status cp = disable_interrupts();
    acquire_spinlock(&my_spinlock);
    /* Do protected operations -- this should be fast and not cause */
    /* any reschedule, so don't call malloc() or any semaphore operations */
    /* or any function that may call these functions. */
    protected_data[protected_ctr++] = 0;
    protected_ctr = protected_ctr & 127;
    /* Release spinlock. */
    release_spinlock(&my_spinlock);
    restore_interrupts(cp);

    /* in your interrupt handler */
    /* serialize with user code, possibly on other CPUs */
    acquire_spinlock(&my_spinlock);
    /* Do protected operations like hardware register access */
    release_spinlock(&my_spinlock);
If you fail to disable interrupts before acquiring the spinlock, you'll deadlock on single-CPU machines, because your interrupt handler may then be called (and try to acquire_spinlock() your spinlock) while the regular thread is holding the spinlock. That would be bad.
Many people find it convenient to wrap spin-locking into two general-purpose lock() and unlock() routines to not forget to turn off interrupts. You can use the same routines inside your interrupt handler, because calling disable_interrupts() and later restore_interrupts() is OK even inside an interrupt handler (even though interrupt handlers run with interrupts already turned off). Spinlocks, like semaphores, don't nest like that, however, so think about what you're doing and don't call functions that may lock a spinlock from some function that already holds the same spinlock.
/* Assuming you keep state information about your hardware */
/* in a struct named my_card, with a 0-initialized */
/* spinlock named hardware_lock */
cpu_status
lock_hardware(my_card * card)
{
    cpu_status ret = disable_interrupts();
    acquire_spinlock(&card->hardware_lock);
    return ret;
}

void
unlock_hardware(my_card * card, cpu_status previous)
{
    release_spinlock(&card->hardware_lock);
    restore_interrupts(previous);
}
It's important to not disable interrupts for a long time. As a general rule, no more than 50 microseconds is allowable. If you disable interrupts for longer, you'll jeopardize the overall performance of the BeOS and the machine it's running on. Similarly, your interrupt service routine should not run for more than 50 microseconds (and less is, of course, better).
You may find the lack of deferred procedure calls disturbing if you come from some other driver architecture. However, the time it takes for BeOS to service an interrupt, release a semaphore, and cause a reschedule into a user-level real-time thread is often less than the time it takes for other operating systems just to handle the interrupt and get to the deferred procedure call level. Thus, we prefer to do what needs to be done in the user-level client threads that call into the device hooks.
If at all possible, let the user of your device spawn whatever threads your device needs. Kernel threads are very tricky and live by different, undocumented rules. If you find a need for a periodic task, look into using timer interrupts (available as of BeOS R4.1). Look for add_timer() and cancel_timer() in <KernelExport.h>.
If you wish to use a kernel thread in your driver, there are several pitfalls that make doing this a bad idea. The kernel team is a team just like any other team, and your kernel thread will have a stack in the upper half of the kernel team address space. This stack (and stacks of other kernel threads) is not accessible from user programs, and thus is not accessible from device hooks called by user programs. Only the lower half of the kernel team address space (0x0-0x7fffffff) is accessible to all teams when they enter the kernel.
You also cannot use wait_for_thread() in your close() or free() hooks, because doing so causes a deadlock with the psycho_killer thread, which is responsible both for reaping dead threads and for freeing file descriptors and their associated devices. Thus, it is impossible for you to be perfectly sure that your kernel thread has terminated before your free() hook returns. This is a big enough problem that you should reconsider using kernel threads at all in your driver, if there is some other possibility. This specific problem will be fixed in a future version of the BeOS, but all the other problems with kernel threads will still remain; also, R4 will be the baseline BeOS for some time to come, so a design without kernel threads is more widely compatible.
Again: Put the threads you need in the user-level client (add-on, application, whatever). If the driver needs to perform periodic tasks not in response to hardware interrupts, use timer interrupts. They are even more lightweight than threads, and have fewer of the problems mentioned above. As with any interrupt routine, timer interrupts still cannot access memory that is not in the kernel space; thus if you need to write into user-supplied buffers, do it in your driver read(), write() and control() hooks.
Good luck!