Linux has taken the embedded marketplace by
storm. According to industry analysts, one third to one half of
new embedded 32- and 64-bit designs leverage Linux. Embedded Linux
already dominates multiple application spaces, including SOHO
networking and imaging/multi-function peripherals, and is making
vast strides in storage (NAS/SAN), digital home entertainment
(HDTV/PVR/DVR/STB), and handheld/wireless, especially in digital
mobile phones.
New embedded Linux applications do not spring, Minerva-like,
from the heads of developers - a majority of projects must accommodate
thousands, even millions of lines of legacy source code. While
hundreds of embedded projects have successfully ported existing
code from platforms like Wind River VxWorks, pSOS, VRTX,
Nucleus, and other RTOSes across to Linux, the exercise is still
non-trivial.
To date, the majority of literature on migration from legacy
RTOS applications to embedded Linux has focused on RTOS APIs,
tasking, and scheduling models and how they map onto Linux user-space
equivalents. Equally important in the I/O-intensive sphere of
embedded programming is porting of RTOS application hardware
interface code to the more formal Linux device driver model.
This article will survey several common approaches to memory-mapped
I/O frequently found in legacy embedded applications. These
range from ad hoc use of ISRs and user-thread hardware access
to the semi-formal driver models found in some RTOS repertoires.
It will also present heuristics and methodologies for transforming
RTOS code into well-formed Linux device drivers. In particular,
the article will focus on memory-mapping in RTOS code vs. Linux,
porting queue-based I/O schemes, and redefining RTOS I/O into
native Linux drivers and daemons.
RTOS I/O Concepts
The word that best describes most I/O in RTOS-based
systems is "informal". Most RTOSes were designed for
older MMU-less CPUs, ignore memory management even when an MMU
is present, and so make no distinction between logical and physical
addressing. Most RTOSes also execute entirely in privileged
state (system mode), ostensibly to enhance performance. As such,
all RTOS application and system code has access to the entire
machine address space, memory-mapped devices, and I/O instructions.
Indeed, it is very difficult to distinguish RTOS application
code from driver code even when such distinctions exist.
This informal architecture leads to ad hoc implementation of
I/O, and in many cases the complete absence of a recognizable
device driver model. In light of this egalitarian non-partitioning
of work, it is instructive to review a few key concepts and
practices as they apply to RTOS-based software:
In-line Memory-Mapped Access
When commercial RTOS products became available in the mid 1980s,
most embedded software consisted of big mainline loops with
polled I/O and ISRs for time-critical operations. Developers
designed RTOSes and executives into their projects mostly to
enhance concurrency and aid in synchronization of multi-tasking,
but eschewed any other constructs that "got in the way".
As such, even when an RTOS offered I/O formalisms, embedded
programmers continued to perform I/O in-line:
#define DATA_REGISTER 0xF00000F5

/* volatile stops the compiler from caching or eliding the access */
char getchar(void) {
    return (*((volatile char *) DATA_REGISTER)); /* read from port */
}

void putchar(char c) {
    *((volatile char *) DATA_REGISTER) = c; /* write to port */
}
More disciplined developers usually segregate all such in-line
I/O code from h/w independent code, but I have encountered plenty
of I/O spaghetti as well.
When faced with pervasive in-line memory-mapped I/O usage,
embedded developers that are new to Linux always face the temptation
to port all such code as-is to user space, converting the #define
of register addresses to calls to mmap(). This approach works
fine for some types of prototyping, but cannot support interrupt
processing, has limited real-time responsiveness, is not particularly
secure, and so is not suitable for commercial deployment.
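One way to keep such a port manageable, whichever side of the kernel boundary the code finally lands on, is to funnel every register access through a small accessor layer instead of scattering casts through the application. The following is only a sketch under stated assumptions: reg_base_t, reg_read8(), reg_write8(), port_getchar(), port_putchar() and the zero register offset are hypothetical names and values invented for illustration, and the base pointer would be whatever mmap() returns in a user-space prototype.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical accessor layer; none of these names come from a real API. */
typedef volatile uint8_t *reg_base_t;   /* start of the mapped register window */

#define DATA_REG_OFFSET 0x0             /* assumed offset of the data register */

/* Read one byte from the register at 'offset' inside the window. */
static inline uint8_t reg_read8(reg_base_t base, size_t offset)
{
    return base[offset];
}

/* Write one byte to the register at 'offset' inside the window. */
static inline void reg_write8(reg_base_t base, size_t offset, uint8_t value)
{
    base[offset] = value;
}

/* Ported equivalents of the legacy in-line getchar()/putchar(). */
char port_getchar(reg_base_t base)
{
    return (char)reg_read8(base, DATA_REG_OFFSET);
}

void port_putchar(reg_base_t base, char c)
{
    reg_write8(base, DATA_REG_OFFSET, (uint8_t)c);
}
```

Because the base address is a parameter rather than a #define, the same code can be exercised against an ordinary buffer on a development host and later pointed at a real mapping.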
RTOS ISRs
In Linux, interrupt service is exclusively the domain of the
kernel; with an RTOS, ISR code is free-form and often indistinguishable
from application code (other than the return sequence). Many
RTOSes offer a system call or macro that lets code detect its
own context (e.g., Wind River VxWorks intContext()). Common also
is the use of standard libraries by ISRs, with accompanying
reentrancy and portability challenges.
Most RTOSes support registration of ISR code and handle interrupt
arbitration and ISR dispatch. Some very primitive embedded executives,
however, only support direct insertion of ISR start addresses
into hardware vector tables.
Even if you attempt to perform read and write operations in-line
in user space, you will have to put your Linux ISR into kernel
space.
RTOS I/O Subsystems
Most RTOSes ship with a customized standard C run-time library
(e.g., pREPC for pSOS), or selectively patch C libraries (libc)
from compiler ISVs or do the same for glibc. Thus, at a minimum,
most RTOSes support a subset of standard C-style I/O (open/close/read/write/ioctl).
In most cases, these calls and their derivatives resolve to
a very thin wrapper around I/O primitives. Interestingly, since
most RTOSes did not support file systems, those platforms that
do offer file abstractions for flash or rotating media often
use completely different code and/or different APIs (e.g., pHILE
for pSOS).
Wind River VxWorks goes farther than most RTOS platforms in
offering a feature-rich I/O subsystem, principally to overcome
hurdles in integration and generalization of networking interfaces/media.
Deferred Processing
Many RTOSes also support a "bottom half" mechanism,
that is, some means of deferring I/O processing to an interruptible
and/or preemptible context. Others do not, but may instead support
mechanisms like interrupt nesting to achieve comparable ends.
Typical RTOS Application I/O Architecture
A typical I/O scheme (input only) and the data
delivery path to the main application is diagramed below. Processing
proceeds as follows:
* A h/w interrupt triggers execution of an ISR.
* The ISR does basic processing and either completes the input
operation locally or lets the RTOS schedule deferred handling.
In some cases, deferred processing is handled by what Linux
would call a "user thread", herein an ordinary RTOS
task.
* Whenever and wherever the data is ultimately acquired (ISR
or deferred context), ready data is put into a queue (yes, RTOS
ISRs can access application queue APIs and other IPCs - see
API table below).
* One or more application tasks then read messages from the
queue to consume the delivered data.

Comparison between typical I/O and data delivery in a legacy
RTOS and Linux
Output is often accomplished with comparable mechanisms - instead
of using write() or comparable system calls, one or more RTOS
application tasks put ready data into a queue. The queue is
then drained by an I/O routine or ISR that responds to a "ready-to-send"
interrupt, a system timer, or another application task that
waits pending on queue contents and then performs I/O directly
(either polled or via DMA).
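The input path described above can be modeled on a development host with ordinary threads. The sketch below is an illustration only: struct rtos_queue, rq_init(), rq_put(), rq_get(), fake_isr() and consume_all() are hypothetical names, the queue depth and sample count are arbitrary, and a pthread stands in for the ISR or deferred handler. A real ISR would use a non-blocking queue post rather than waiting on a condition variable.

```c
#include <pthread.h>

#define QUEUE_DEPTH 8    /* arbitrary depth for the sketch */
#define NUM_SAMPLES 16   /* number of fake "interrupts" to deliver */

/* Hypothetical stand-in for an RTOS message queue. */
struct rtos_queue {
    int             slots[QUEUE_DEPTH];
    int             head, tail, count;
    pthread_mutex_t lock;
    pthread_cond_t  not_empty, not_full;
};

void rq_init(struct rtos_queue *q)
{
    q->head = q->tail = q->count = 0;
    pthread_mutex_init(&q->lock, NULL);
    pthread_cond_init(&q->not_empty, NULL);
    pthread_cond_init(&q->not_full, NULL);
}

/* Producer side: block while the queue is full, then append one value. */
void rq_put(struct rtos_queue *q, int value)
{
    pthread_mutex_lock(&q->lock);
    while (q->count == QUEUE_DEPTH)
        pthread_cond_wait(&q->not_full, &q->lock);
    q->slots[q->tail] = value;
    q->tail = (q->tail + 1) % QUEUE_DEPTH;
    q->count++;
    pthread_cond_signal(&q->not_empty);
    pthread_mutex_unlock(&q->lock);
}

/* Consumer side: block while the queue is empty, then remove one value. */
int rq_get(struct rtos_queue *q)
{
    int value;

    pthread_mutex_lock(&q->lock);
    while (q->count == 0)
        pthread_cond_wait(&q->not_empty, &q->lock);
    value = q->slots[q->head];
    q->head = (q->head + 1) % QUEUE_DEPTH;
    q->count--;
    pthread_cond_signal(&q->not_full);
    pthread_mutex_unlock(&q->lock);
    return value;
}

/* Simulated ISR/deferred handler: queues the values 1..NUM_SAMPLES. */
static void *fake_isr(void *arg)
{
    struct rtos_queue *q = arg;

    for (int i = 1; i <= NUM_SAMPLES; i++)
        rq_put(q, i);
    return NULL;
}

/* Application task: drains the queue and returns the sum of the samples,
 * or -1 if the producer thread could not be started. */
int consume_all(void)
{
    struct rtos_queue q;
    pthread_t producer;
    int sum = 0;

    rq_init(&q);
    if (pthread_create(&producer, NULL, fake_isr, &q) != 0)
        return -1;
    for (int i = 0; i < NUM_SAMPLES; i++)
        sum += rq_get(&q);
    pthread_join(producer, NULL);
    return sum;
}
```

Build with -pthread. The point of the model is the shape of the data path: the producer never touches application state directly, and the consumer only ever sees complete queue entries.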
Mapping RTOS I/O onto Linux
The queue-based producer/consumer I/O model described
above is just one of many ad hoc approaches employed in legacy
designs. Let us continue to use this straightforward example
to discuss several possible (re)implementations under embedded
Linux:
Wholesale Port to User Space
Developers who are reluctant to learn the particulars of Linux
driver design, or who are in a great hurry, will likely try
to port most of such a queue-based design, intact, into a user-space
paradigm. In this driver mapping scheme, memory-mapped physical
I/O occurs in user context via a pointer supplied by mmap().
#include <fcntl.h>
#include <sys/mman.h>

#define REG_SIZE   0x4          /* device register size */
#define REG_OFFSET 0xFA400000   /* physical address of device */

void *mem_ptr;   /* de-reference for memory-mapped access */
int fd;

fd = open("/dev/mem", O_RDWR);  /* open physical memory (must be root) */
mem_ptr = mmap((void *)0x0, REG_SIZE, PROT_READ | PROT_WRITE,
               MAP_SHARED, fd, REG_OFFSET);   /* actual call to mmap() */
/* check for fd < 0 and mem_ptr == MAP_FAILED before using mem_ptr */
A process-based user thread performs the same processing as
the RTOS-based ISR or deferred task would, and then uses the
SVR4 IPC msgsnd() to queue a message for receipt by another
local thread or by another process via msgrcv().
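As a rough illustration of the queueing half of this scheme, the following sketch wraps the System V calls msgget(), msgsnd(), msgrcv() and msgctl() in a few helpers. The helper names (queue_create() and friends), the struct sample_msg layout, the 32-byte payload and the message type value are assumptions made for the example, not part of any legacy API.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/msg.h>

#define SAMPLE_MSG_TYPE 1L      /* arbitrary; message types must be > 0 */
#define PAYLOAD_SIZE    32      /* assumed fixed payload size */

/* System V messages start with a long type field followed by the payload. */
struct sample_msg {
    long mtype;
    char payload[PAYLOAD_SIZE];
};

/* Create a private queue; returns the queue id or -1 on error. */
int queue_create(void)
{
    return msgget(IPC_PRIVATE, IPC_CREAT | 0600);
}

/* Producer side: copy 'text' into a message and queue it (0 on success). */
int queue_send(int qid, const char *text)
{
    struct sample_msg m;

    m.mtype = SAMPLE_MSG_TYPE;
    strncpy(m.payload, text, sizeof(m.payload) - 1);
    m.payload[sizeof(m.payload) - 1] = '\0';
    return msgsnd(qid, &m, sizeof(m.payload), 0);
}

/* Consumer side: block until a message arrives, then copy it to 'out'. */
int queue_receive(int qid, char *out, size_t out_len)
{
    struct sample_msg m;

    if (out_len == 0)
        return -1;
    if (msgrcv(qid, &m, sizeof(m.payload), SAMPLE_MSG_TYPE, 0) < 0)
        return -1;
    strncpy(out, m.payload, out_len - 1);
    out[out_len - 1] = '\0';
    return 0;
}

/* Remove the queue from the system when the application shuts down. */
int queue_destroy(int qid)
{
    return msgctl(qid, IPC_RMID, NULL);
}
```

A typical sequence is queue_create(), then queue_send() from the I/O thread and queue_receive() from each consumer. System V queues persist until removed, so the queue_destroy() call matters.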
While such a quick and dirty approach is good for prototyping,
it presents significant challenges for building deployable code.
Foremost is the need to field interrupts in user space. Projects
like DOSEMU offer signal-based interrupt I/O with SIG (the Silly
Interrupt Generator), but user-space interrupt processing is
very slow (millisecond latencies instead of tens of microseconds
for a kernel-based ISR). Furthermore, user-context scheduling,
even with the preemptible Linux kernel and real-time policies
in place, cannot guarantee 100% timely execution of user-space
I/O threads.
Re-architecting to Use Linux Drivers
It is highly preferable to bite the bullet and write at least
a simple Linux driver to handle interrupt processing at kernel
level. A basic character or block driver can field application
interrupt data directly in the "top half" or defer
processing to a tasklet, kernel thread or to the newer work-queue
bottom half mechanism in the 2.6 kernel. One or more application
threads/processes can open the device and then perform synchronous
reads, just as the RTOS application made synchronous queue receive
calls. Note that this approach will require at least recoding
consumer thread I/O to use device reads instead of queue receive
operations.
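On the application side, the recoding usually amounts to swapping a queue receive for a blocking read() on the open device. The helper below is a minimal sketch that assumes the driver delivers a plain byte stream and that the application wants fixed-size records; device_read_record() is a hypothetical name, and a driver that returns exactly one record per read() would not need the accumulation loop.

```c
#include <errno.h>
#include <unistd.h>

/* Block until exactly 'len' bytes have been read from 'fd'.
 * Returns 0 on success, -1 on error or end of file.
 * Retries when a signal interrupts the read (EINTR). */
int device_read_record(int fd, void *buf, size_t len)
{
    char *p = buf;

    while (len > 0) {
        ssize_t n = read(fd, p, len);

        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted by a signal: try again */
            return -1;          /* real I/O error */
        }
        if (n == 0)
            return -1;          /* end of file before a full record */
        p += n;
        len -= (size_t)n;
    }
    return 0;
}
```

The descriptor would normally come from open() on the device node created for the new driver. Any readable descriptor, such as a pipe, can stand in during host-side testing.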
Preserving an RTOS Queue-based I/O Architecture
To reduce the impact of porting to embedded Linux, you could
also leave a queue-based scheme in place and add an additional
thread or daemon process that waits for I/O on the newly-minted
device. When data is ready, that thread/daemon wakes up and
en-queues the received data for use by the consuming application
threads or processes.
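A bridging thread of this kind can be quite small. The sketch below assumes fixed-size records and a System V message queue on the application side; bridge_thread(), consumer_receive(), struct bridge_args, struct bridge_msg, the four-byte record size and the message type are all hypothetical choices made for the example.

```c
#include <pthread.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/msg.h>

#define RECORD_SIZE     4       /* assumed size of one device record */
#define BRIDGE_MSG_TYPE 1L      /* arbitrary; message types must be > 0 */

struct bridge_msg {
    long mtype;
    char data[RECORD_SIZE];
};

/* Everything the bridge needs: the open device and the target queue. */
struct bridge_args {
    int dev_fd;
    int qid;
};

/* Thread body: block on the device and forward each full record to the
 * queue. A short read, end of file, or any error ends the thread, which
 * keeps the sketch small but is too simple for production use. */
void *bridge_thread(void *arg)
{
    struct bridge_args *a = arg;
    struct bridge_msg m;

    m.mtype = BRIDGE_MSG_TYPE;
    while (read(a->dev_fd, m.data, RECORD_SIZE) == RECORD_SIZE) {
        if (msgsnd(a->qid, &m, RECORD_SIZE, 0) < 0)
            break;
    }
    return NULL;
}

/* Legacy-style consumer call: still a blocking queue receive. */
int consumer_receive(int qid, char out[RECORD_SIZE])
{
    struct bridge_msg m;

    if (msgrcv(qid, &m, RECORD_SIZE, BRIDGE_MSG_TYPE, 0) < 0)
        return -1;
    memcpy(out, m.data, RECORD_SIZE);
    return 0;
}
```

Build with -pthread. The consuming tasks keep their queue-receive structure, at the cost of one extra copy and one extra context switch per record.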
Porting Approaches
Porting RTOS code to embedded Linux does not
differ conceptually from enterprise application migration. After
the logistics of porting have been addressed (make/build scripts
and methods, compiler compatibility, location of include files,
etc.), code-level porting challenges turn on the issues of application
architecture and API usage.
For the purposes of the discussion at hand, let us assume that
the "application" part (everything except I/O-specific
code) will migrate from the RTOS-based system into a single
Linux process; RTOS tasks will map onto Linux threads and inter-task
IPCs will map onto Linux inter-process and inter-thread equivalents.
Mapping RTOS Tasks onto Linux Process-based Threads
While the basic shape of the port is easy to understand,
the devil is in the details. And the most salient details are
the RTOS APIs in use and how to accommodate them with Linux
constructs.
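For the task-to-thread part of the mapping, a thin wrapper over pthread_create() is often enough to get started. The sketch below is illustrative only: task_entry_t, task_spawn(), task_join() and sample_task() are hypothetical names and do not correspond to any particular RTOS API. Task priorities are deliberately left out; mapping them onto SCHED_FIFO or SCHED_RR thread attributes normally requires extra privileges.

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical RTOS-style entry point: returns nothing, takes one argument. */
typedef void (*task_entry_t)(void *arg);

/* Carries the legacy entry point across the pthread start-routine signature. */
struct task_tramp {
    task_entry_t entry;
    void        *arg;
};

/* Adapts void (*)(void *) to the void *(*)(void *) that pthreads expects. */
static void *task_trampoline(void *p)
{
    struct task_tramp t = *(struct task_tramp *)p;

    free(p);
    t.entry(t.arg);
    return NULL;
}

/* Start 'entry' as a thread in the current process.
 * Returns 0 on success, non-zero on failure. */
int task_spawn(pthread_t *tid, task_entry_t entry, void *arg)
{
    struct task_tramp *t = malloc(sizeof(*t));
    int rc;

    if (t == NULL)
        return -1;
    t->entry = entry;
    t->arg = arg;
    rc = pthread_create(tid, NULL, task_trampoline, t);
    if (rc != 0)
        free(t);
    return rc;
}

/* Wait for a spawned task to finish. */
int task_join(pthread_t tid)
{
    return pthread_join(tid, NULL);
}

/* Example task body: increments the integer it is handed. */
void sample_task(void *arg)
{
    *(int *)arg += 1;
}
```

Build with -pthread. Calling task_spawn(&tid, sample_task, &counter) followed by task_join(tid) runs the legacy-style entry point as an ordinary thread of the hosting process.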
Holistic Approach - Re-architecting
If your project is not highly time-constrained, and if your
goal is to produce portable code for future project iterations,
then you will want to spend some time analyzing the current
structure of your RTOS application and how/if it "fits"
into the Linux paradigm. For RTOS application code, you will
want to consider the viability of one-to-one mapping of RTOS
tasks onto Linux process-based threads, and whether to repartition
the RTOS application into multiple Linux processes. Depending
on that decision, you will want to review the RTOS IPCs in use
to determine proper intra-process vs. inter-process scope.
On the driver level, you will definitely want to convert any
informal in-line RTOS code into proper drivers. If your legacy
application is already well-partitioned, either using RTOS I/O
APIs or at least segregated into a distinct layer, your task
will be much easier. If ad hoc I/O code is sprinkled liberally
throughout your legacy code base, you've got your work cut out
for you.
API-based Approach
Developers in a hurry to move off a legacy RTOS, or those just
trying to glue together a prototype, are more likely to attempt
to map or convert as many RTOS APIs to Linux equivalents in
situ. Entities in common port nearly transparently (comparable
APIs, IPCs, system data types, etc.). Others can be addressed
with #define redefinition and macros. Those remaining will need
to be recoded, ideally as part of an abstraction layer.
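The split between "redefine with a macro" and "recode behind an abstraction layer" might look like the sketch below. All of the rtos_sem_* names, RTOS_SEM, RTOS_OK and RTOS_ERROR are hypothetical stand-ins for whatever a legacy API actually calls them; only the POSIX semaphore calls underneath (sem_init(), sem_post(), sem_wait(), sem_trywait(), sem_destroy()) are real.

```c
#include <errno.h>
#include <semaphore.h>

/* Hypothetical legacy names mapped onto POSIX semaphores. */
typedef sem_t RTOS_SEM;
#define RTOS_OK     0
#define RTOS_ERROR  (-1)

/* Calls with matching semantics can be redefined directly. */
#define rtos_sem_create(sem, initial)  sem_init((sem), 0, (initial))
#define rtos_sem_give(sem)             sem_post(sem)
#define rtos_sem_delete(sem)           sem_destroy(sem)

/* Calls whose semantics differ need a small function instead.
 * Here the assumed legacy "take" has a wait/no-wait flag, so it maps to
 * either sem_wait() (restarted after signals) or sem_trywait(). */
static inline int rtos_sem_take(RTOS_SEM *sem, int wait_forever)
{
    int rc;

    if (wait_forever) {
        do {
            rc = sem_wait(sem);
        } while (rc != 0 && errno == EINTR);
    } else {
        rc = sem_trywait(sem);
    }
    return rc == 0 ? RTOS_OK : RTOS_ERROR;
}
```

Build with -pthread. Keeping such definitions in a single header makes it easier to replace the shims one at a time with native Linux code later.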
You can get a head start on API-based porting by using emulation
libraries that accompany many embedded Linux distributions (like
my company's libraries for Wind River VxWorks and pSOS) or by
using third-party API mapping packages from companies like MapuSoft.

Multi-pronged approach to porting RTOS code
and APIs to Linux
Most projects take a hybrid approach, mapping all comparable
or easily translatable APIs, re-architecting where it doesn't
slow things down, and playing "whack a mole" with
the remaining code until it builds and runs.
Available APIs in Kernel and User Space
For both intensive re-architecting and for quicker-and-dirtier
API approaches, you will still have to (re)partition your RTOS
application and I/O code to fit the Linux kernel and user-space
paradigm.
The following table illustrates how Linux is much stricter
about privileged operations than a legacy RTOS and will guide
you in the (re)partitioning process:
| | IPCs | Synchronization | Tasking | Name Space |
|---|---|---|---|---|
| RTOS Application | Queues, Signals, Mailboxes; Informal Shared Memory | Semaphores, Mutexes | Full RTOS Tasking Repertoire | Full Application, Libraries and System (Link-time) |
| RTOS Driver | Queues, Signals, Mailboxes; Informal Shared Memory | Semaphores, Mutexes | Full RTOS Tasking Repertoire | Full Application, Libraries and System (Link-time) |
| Linux Application | Queues, Signals, Pipes; Intra-Process Shared Memory; Shared System Memory | Semaphores, Mutexes | Process and Threads APIs | Local Process, Static and Shared Libraries |
| Linux Driver (Static) | Shared System Memory; Read/Write Process Memory | Kernel Semaphores, Spin Locks | Kernel Threads, Tasklets | Full Kernel |
| Linux Module (Dynamic) | Shared System Memory; Read/Write Process Memory | Kernel Semaphores, Spin Locks | Kernel Threads, Tasklets | Module-local and Exported Kernel Symbols |
There are two very important distinctions called out in the table:
* RTOSes are very egalitarian, letting application and I/O
code "touch" any address and perform almost any activity,
whereas Linux is much more hierarchical and restrictive.
* Legacy RTOS code can "see" every symbol or entry-point
in the system (at least at link-time), whereas Linux user code
is isolated from and built separately from kernel code and its
accompanying name-space.
The consequence of the Linux hierarchy of privileged access
is that normally only kernel code (drivers) actually accesses
physical memory, and that user code that also does so must
run as root.
In general, user-space code is isolated from the Linux kernel
and cannot link against kernel symbols at all; at most it can
inspect the exported symbols listed in /proc/ksyms (/proc/kallsyms
in 2.6 kernels). Moreover, system calls to the kernel are not
invoked directly, but via calls to user library code. This
segregation is intentional, enhancing stability and security
in Linux.
When you write a driver, the opposite is true. Statically-linked
drivers are privy to the ENTIRE kernel name-space (not just
exports), but have zero visibility into user-space process-based
symbols and entry points. And, when you encapsulate driver code
in run-time loadable modules, your program can only leverage
interfaces explicitly exported in the kernel via the EXPORT_SYMBOL
macro.
Migrating Network Drivers
As indicated above, porting character and block
device drivers to Linux is a straightforward if time-consuming activity.
Porting network drivers, though, can seem much more daunting.
Remember that while Linux grew up with TCP/IP, most RTOSes
had networking grafted onto them in the late nineties. As such,
legacy networking often only presents bare bones capabilities,
like being able only to handle a single session or instance
on a single port, or only to support a physical interface to
a single network medium. In some cases networking architecture
was generalized after the fact (as with Wind River VxWorks MUX
code) to allow for multiple interfaces and types of physical
connection.
The bad news is that you will likely have to rewrite most or
all of your existing network interfaces. The good news is that
re-partitioning for Linux is not hard and that you have dozens
of Open Source examples of network device drivers to choose
from.
Your porting task will be to populate the areas at the bottom
of the following diagram with suitable packet formatting and
interface code:

Block Diagram of Linux Network Drivers
Writing network drivers is not for beginners. Since, however,
many RTOS network drivers actually were derived from existing
GPL Linux interfaces, you might find the process facilitated
by the code itself. Moreover, there is a large and still-growing
community of integrators and consultants focused on making a
business of helping embedded developers move their applications
to Linux, for reasonable fees.
Conclusion
The goal of this article has been to give embedded
developers some insight into both the challenges they will face
and benefits they will realize from moving their entire software
stack from a legacy RTOS to Linux. The span of 2800 words or
so is too brief to delve into many of the details of driver
porting (driver APIs for bus interfaces, address translation,
etc.) but the wealth of existing Open Source GPL driver code
serves as both documentation and template for your migration
efforts. The guidelines presented here will definitely help
your team scope the effort involved in a port from an RTOS to Linux,
and provide heuristics for re-partitioning code for best native
fit into embedded Linux.