Porting RTOS Device Drivers to Embedded Linux

By Bill Weinberg, MontaVista Software, Inc.


Linux has taken the embedded marketplace by storm. According to industry analysts, one third to one half of new embedded 32- and 64-bit designs leverage Linux. Embedded Linux already dominates multiple application spaces, including SOHO networking and imaging/multi-function peripherals, and is making vast strides in storage (NAS/SAN), digital home entertainment (HDTV/PVR/DVR/STB), and handheld/wireless, especially in digital mobile phones.

New embedded Linux applications do not spring, Minerva-like, from the heads of developers - a majority of projects must accommodate thousands, even millions of lines of legacy source code. While hundreds of embedded projects have successfully ported existing code from platforms like Wind River VxWorks and pSOS, VRTX, Nucleus and other RTOSes across to Linux, the exercise is still non-trivial.

To date, the majority of literature on migration from legacy RTOS applications to embedded Linux has focused on RTOS APIs, tasking, and scheduling models and how they map onto Linux user-space equivalents. Equally important in the I/O-intensive sphere of embedded programming is the porting of RTOS application hardware interface code to the more formal Linux device driver model.

This article will survey several common approaches to memory-mapped I/O frequently found in legacy embedded applications. These range from ad hoc use of ISRs and user-thread hardware access to the semi-formal driver models found in some RTOS repertoires. It will also present heuristics and methodologies for transforming RTOS code into well-formed Linux device drivers. In particular, the article will focus on memory-mapping in RTOS code vs. Linux, porting queue-based I/O schemes, and redefining RTOS I/O into native Linux drivers and daemons.

RTOS I/O Concepts

The word that best describes most I/O in RTOS-based systems is "informal". Most RTOSes were designed for older MMU-less CPUs, ignore memory management even when an MMU is present, and so make no distinction between logical and physical addressing. Most RTOSes also execute entirely in privileged state (system mode), ostensibly to enhance performance. As such, all RTOS application and system code has access to the entire machine address space, memory-mapped devices, and I/O instructions. Indeed, it is very difficult to distinguish RTOS application code from driver code even when such distinctions exist.

This informal architecture leads to ad hoc implementation of I/O, and in many cases the complete absence of a recognizable device driver model. In light of this egalitarian non-partitioning of work, it is instructive to review a few key concepts and practices as they apply to RTOS-based software:

In-line Memory-Mapped Access
When commercial RTOS products became available in the mid 1980s, most embedded software consisted of big mainline loops with polled I/O and ISRs for time-critical operations. Developers designed RTOSes and executives into their projects mostly to enhance concurrency and aid in synchronization of multi-tasking, but eschewed any other constructs that "got in the way". As such, even when an RTOS offered I/O formalisms, embedded programmers continued to perform I/O in-line:
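#define DATA_REGISTER 0xF00000F5 /* memory-mapped data port */

/* volatile keeps the compiler from caching or eliding port accesses */
char getchar(void) {
    return (*((volatile char *) DATA_REGISTER)); /* read from port */
}

void putchar(char c) {
    *((volatile char *) DATA_REGISTER) = c; /* write to port */
}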

More disciplined developers usually segregate all such in-line I/O code from h/w independent code, but I have encountered plenty of I/O spaghetti as well.

When faced with pervasive in-line memory-mapped I/O usage, embedded developers that are new to Linux always face the temptation to port all such code as-is to user space, converting the #define of register addresses to calls to mmap(). This approach works fine for some types of prototyping, but cannot support interrupt processing, has limited real-time responsiveness, is not particularly secure, and so is not suitable for commercial deployment.

RTOS ISRs
In Linux, interrupt service is exclusively the domain of the kernel; with an RTOS, ISR code is free-form and often indistinguishable from application code (other than the return sequence). Many RTOSes offer a system call or macro that lets code detect its own context (e.g., Wind River VxWorks intContext()). Common also is the use of standard libraries by ISRs, with accompanying reentrancy and portability challenges.

Most RTOSes support registration of ISR code and handle interrupt arbitration and ISR dispatch. Some very primitive embedded executives, however, only support direct insertion of ISR start addresses into hardware vector tables.

Even if you attempt to perform read and write operations in-line in user space, you will have to put your Linux ISR into kernel space.

RTOS I/O Subsystems
Most RTOSes ship with a customized standard C run-time library (e.g., pREPC for pSOS), or selectively patch C libraries (libc) from compiler ISVs or do the same for glibc. Thus, at a minimum, most RTOSes support a subset of standard C-style I/O (open/close/read/write/ioctl). In most cases, these calls and their derivatives resolve to a very thin wrapper around I/O primitives. Interestingly, since most RTOSes did not support file systems, those platforms that do offer file abstractions for flash or rotating media often use completely different code and/or different APIs (e.g., pHILE for pSOS).

Wind River VxWorks goes farther than most RTOS platforms in offering a feature-rich I/O subsystem, principally to overcome hurdles in integration and generalization of networking interfaces/media.

Deferred Processing
Many RTOSes also support a "bottom half" mechanism, that is, some means of deferring I/O processing to an interruptible and/or preemptible context. Others do not, but may instead support mechanisms like interrupt nesting to achieve comparable ends.

Typical RTOS Application I/O Architecture

A typical I/O scheme (input only) and the data delivery path to the main application is diagramed below. Processing proceeds as follows:

* A h/w interrupt triggers execution of an ISR.

* The ISR does basic processing and either completes the input operation locally or lets the RTOS schedule deferred handling. In some cases, deferred processing is handled by what Linux would call a "user thread", herein an ordinary RTOS task.

* Whenever and wherever the data is ultimately acquired (ISR or deferred context), ready data is put into a queue (yes, RTOS ISRs can access application queue APIs and other IPCs - see API table below).

* One or more application tasks then read messages from the queue to consume the delivered data.


Comparison between typical I/O and data delivery in a legacy RTOS and Linux


Output is often accomplished with comparable mechanisms - instead of using write() or comparable system calls, one or more RTOS application tasks put ready data into a queue. The queue is then drained by an I/O routine or ISR that responds to a "ready-to-send" interrupt, a system timer, or another application task that waits pending on queue contents and then performs I/O directly (either polled or via DMA).

Mapping RTOS I/O onto Linux

The queue-based producer/consumer I/O model described above is just one of many ad hoc approaches employed in legacy designs. Let us continue to use this straightforward example to discuss several possible (re)implementations under embedded Linux:

Wholesale Port to User Space
Developers who are reluctant to learn the particulars of Linux driver design, or who are in a great hurry, will likely try to port most of such a queue-based design, intact, into a user-space paradigm. In this driver mapping scheme, memory-mapped physical I/O occurs in user context via a pointer supplied by mmap().


#include <fcntl.h>
#include <sys/mman.h>

#define REG_AREA_SIZE 0x4 /* size of device register area */
#define REG_OFFSET 0xFA400000 /* physical address of device */

void *mem_ptr; /* de-reference for memory-mapped access */
int fd;

fd = open("/dev/mem", O_RDWR); /* open physical memory (must be root) */

/* actual call to mmap(); check fd < 0 and mem_ptr == MAP_FAILED before use */
mem_ptr = mmap((void *)0x0, REG_AREA_SIZE, PROT_READ | PROT_WRITE,
               MAP_SHARED, fd, REG_OFFSET);


A process-based user thread performs the same processing as the RTOS-based ISR or deferred task would, and then uses the SVR4 IPC msgsnd() to queue a message for receipt by another local thread or by another process via msgrcv().
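The queue hand-off just described might look roughly like the following. This is a minimal sketch, not code from any particular application: the io_msg record layout, the IO_* constants and the io_queue_* helper names are all hypothetical, and a private System V queue is assumed.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/msg.h>

#define IO_MSG_TYPE    1L   /* hypothetical message type tag */
#define IO_PAYLOAD_MAX 64   /* hypothetical payload limit in bytes */

/* System V messages must begin with a long message type. */
struct io_msg {
    long mtype;
    char payload[IO_PAYLOAD_MAX];
};

/* Create a queue private to this process and its children. */
int io_queue_create(void)
{
    return msgget(IPC_PRIVATE, IPC_CREAT | 0600);
}

/* Producer side: queue len bytes of device data. Returns 0 or -1. */
int io_queue_put(int qid, const void *data, size_t len)
{
    struct io_msg m;

    if (len > IO_PAYLOAD_MAX)
        return -1;
    m.mtype = IO_MSG_TYPE;
    memcpy(m.payload, data, len);
    return msgsnd(qid, &m, len, 0);     /* blocks if the queue is full */
}

/* Consumer side: block until a message arrives.
 * Returns the payload length, or -1 on error or if buf is too small. */
ssize_t io_queue_get(int qid, void *buf, size_t max)
{
    struct io_msg m;
    ssize_t n = msgrcv(qid, &m, IO_PAYLOAD_MAX, IO_MSG_TYPE, 0);

    if (n < 0 || (size_t)n > max)
        return -1;
    memcpy(buf, m.payload, (size_t)n);
    return n;
}

/* System V queues persist until explicitly removed. */
int io_queue_destroy(int qid)
{
    return msgctl(qid, IPC_RMID, NULL);
}
```

The I/O thread would call io_queue_put() where the RTOS code called its queue-send primitive, and consumers would call io_queue_get() in place of the RTOS queue receive.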

While such a quick and dirty approach is good for prototyping, it presents significant challenges for building deployable code. Foremost is the need to field interrupts in user space. Projects like DOSEMU offer signal-based interrupt I/O with SIG (the Silly Interrupt Generator), but user-space interrupt processing is very slow (millisecond latencies instead of tens of microseconds for a kernel-based ISR). Furthermore, user-context scheduling, even with the preemptible Linux kernel and real-time policies in place, cannot guarantee 100% timely execution of user-space I/O threads.

Re-architecting to Use Linux Drivers
It is highly preferable to bite the bullet and write at least a simple Linux driver to handle interrupt processing at kernel level. A basic character or block driver can field application interrupt data directly in the "top half" or defer processing to a tasklet, kernel thread or to the newer work-queue bottom half mechanism in the 2.6 kernel. One or more application threads/processes can open the device and then perform synchronous reads, just as the RTOS application made synchronous queue receive calls. Note that this approach will require at least recoding consumer thread I/O to use device reads instead of queue receive operations.
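On the application side, the recoding can be small. The sketch below assumes a driver that delivers fixed-size records; the struct sample layout and the sample_receive() name are hypothetical, and the descriptor would come from opening whatever device node your driver registers.

```c
#include <errno.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical fixed-size record delivered by the driver. */
struct sample {
    uint32_t sequence;
    uint32_t value;
};

/*
 * Block until one complete record has been read from fd.
 * Returns 1 on success, 0 if end of file arrives before a full record,
 * -1 on error (errno is left as set by read()).
 */
int sample_receive(int fd, struct sample *out)
{
    unsigned char *p = (unsigned char *)out;
    size_t got = 0;

    while (got < sizeof(*out)) {
        ssize_t n = read(fd, p + got, sizeof(*out) - got);

        if (n == 0)
            return 0;           /* device closed or no more data */
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted by a signal: retry */
            return -1;
        }
        got += (size_t)n;
    }
    return 1;
}
```

A consumer thread that used to loop on a queue receive call would instead loop on sample_receive(), sleeping inside read() until the driver has data.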

Preserving an RTOS Queue-based I/O Architecture
To reduce the impact of porting to embedded Linux, you could also leave a queue-based scheme in place and add an additional thread or daemon process that waits for I/O on the newly-minted device. When data is ready, that thread/daemon wakes up and en-queues the received data for use by the consuming application threads or processes.
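The body of such a thread or daemon can be quite short. The following is one possible shape, assuming the application's existing queue-send call can be wrapped in a callback; enqueue_fn, pump_once(), pump_run() and the in-memory byte_sink used for illustration are all hypothetical names.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

#define PUMP_CHUNK 128   /* hypothetical upper bound on one device read */

/* Hook standing in for the application's existing queue-send call. */
typedef int (*enqueue_fn)(void *queue, const void *data, size_t len);

/* Read one chunk from the device and enqueue it.
 * Returns bytes forwarded, 0 at end of file, -1 on read or enqueue failure. */
long pump_once(int dev_fd, enqueue_fn enqueue, void *queue)
{
    char buf[PUMP_CHUNK];
    ssize_t n;

    do {
        n = read(dev_fd, buf, sizeof(buf));   /* sleeps until data is ready */
    } while (n < 0 && errno == EINTR);

    if (n <= 0)
        return (long)n;
    if (enqueue(queue, buf, (size_t)n) != 0)
        return -1;
    return (long)n;
}

/* Daemon body: forward data until the device closes or an error occurs.
 * Returns total bytes forwarded, or -1 on error. */
long pump_run(int dev_fd, enqueue_fn enqueue, void *queue)
{
    long n, total = 0;

    while ((n = pump_once(dev_fd, enqueue, queue)) > 0)
        total += n;
    return n < 0 ? -1 : total;
}

/* Trivial in-memory "queue", for illustration only. */
struct byte_sink {
    char   data[256];
    size_t used;
};

int byte_sink_put(void *queue, const void *data, size_t len)
{
    struct byte_sink *s = queue;

    if (len > sizeof(s->data) - s->used)
        return -1;                            /* sink is full */
    memcpy(s->data + s->used, data, len);
    s->used += len;
    return 0;
}
```

In a real port the callback would wrap the queue primitive the consumers already use, so the consuming threads stay untouched.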

Porting Approaches

Porting RTOS code to embedded Linux does not differ conceptually from enterprise application migration. After the logistics of porting have been addressed (make/build scripts and methods, compiler compatibility, location of include files, etc.), code-level porting challenges turn on the issues of application architecture and API usage.

For the purposes of the discussion at hand, let us assume that the "application" part (everything except I/O-specific code) will migrate from the RTOS-based system into a single Linux process; RTOS tasks will map onto Linux threads and inter-task IPCs will map onto Linux inter-process and inter-thread equivalents.

Mapping RTOS Tasks on to Linux Process-based Threads

While the basic shape of the port is easy to understand, the devil is in the details. And the most salient details are the RTOS APIs in use and how to accommodate them with Linux constructs.

Holistic Approach - Re-architecting
If your project is not highly time-constrained, and if your goal is to produce portable code for future project iterations, then you will want to spend some time analyzing the current structure of your RTOS application and how/if it "fits" into the Linux paradigm. For RTOS application code, you will want to consider the viability of one-to-one mapping of RTOS tasks onto Linux process-based threads, and whether to repartition the RTOS application into multiple Linux processes. Depending on that decision, you will want to review the RTOS IPCs in use to determine proper intra-process vs. inter-process scope.

On the driver level, you will definitely want to convert any informal in-line RTOS code into proper drivers. If your legacy application is already well-partitioned, either using RTOS I/O APIs or at least segregated into a distinct layer, your task will be much easier. If ad hoc I/O code is sprinkled liberally throughout your legacy code base, you've got your work cut out for you.

API-based Approach
Developers in a hurry to move off a legacy RTOS, or those just trying to glue together a prototype, are more likely to attempt to map or convert as many RTOS APIs to Linux equivalents in situ. Entities in common port nearly transparently (comparable APIs, IPCs, system data types, etc.). Others can be addressed with #define redefinition and macros. Those remaining will need to be recoded, ideally as part of an abstraction layer.
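As a small illustration of the macro-plus-shim idea, consider a tick-based task delay. Everything named legacy_* or LEGACY_* here, along with TASK_DELAY and the 100 Hz tick rate, is a hypothetical stand-in rather than any vendor's actual API.

```c
#include <errno.h>
#include <time.h>

/* All names below are hypothetical stand-ins for a legacy RTOS API. */
#define LEGACY_TICKS_PER_SEC 100L
#define LEGACY_OK            0
#define LEGACY_ERROR         (-1)

/* Convert RTOS ticks into the timespec that nanosleep() expects. */
static struct timespec legacy_ticks_to_timespec(unsigned long ticks)
{
    struct timespec ts;

    ts.tv_sec  = (time_t)(ticks / LEGACY_TICKS_PER_SEC);
    ts.tv_nsec = (long)(ticks % LEGACY_TICKS_PER_SEC)
                 * (1000000000L / LEGACY_TICKS_PER_SEC);
    return ts;
}

/* Recoded call: delay the calling thread for a number of ticks. */
static int legacy_task_delay(unsigned long ticks)
{
    struct timespec req = legacy_ticks_to_timespec(ticks);
    struct timespec rem;

    while (nanosleep(&req, &rem) != 0) {
        if (errno != EINTR)
            return LEGACY_ERROR;
        req = rem;              /* resume with the time still remaining */
    }
    return LEGACY_OK;
}

/* Macro redefinition: the legacy spelling now resolves to the shim. */
#define TASK_DELAY(ticks) legacy_task_delay(ticks)
```

Call sites that used the legacy spelling compile unchanged, while the Linux-specific behavior lives in one place that can be revisited later.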

You can get a head start on API-based porting by using emulation libraries that accompany many embedded Linux distributions (like my company's libraries for Wind River VxWorks and pSOS) or by using third-party API mapping packages from companies like MapuSoft.

Multi-pronged approach to porting RTOS code and APIs to Linux


Most projects take a hybrid approach, mapping all comparable or easily translatable APIs, re-architecting where it doesn't slow things down, and playing "whack a mole" with the remaining code until it builds and runs.

Available APIs in Kernel and User Space
For both intensive re-architecting and for quicker-and-dirtier API approaches, you will still have to (re)partition your RTOS application and I/O code to fit the Linux kernel and user-space paradigm.

The following table illustrates how Linux is much stricter about privileged operations than a legacy RTOS and will guide you in the (re)partitioning process:

Context | IPCs | Synchronization | Tasking | Name Space
RTOS Application | Queues, Signals, Mailboxes, Informal Shared Memory | Semaphores, Mutexes | Full RTOS Tasking Repertoire | Full Application, Libraries and System (Link-time)
RTOS Driver | Queues, Signals, Mailboxes, Informal Shared Memory | Semaphores, Mutexes | Full RTOS Tasking Repertoire | Full Application, Libraries and System (Link-time)
Linux Application | Queues, Signals, Pipes, Intra-Process Shared Memory, Shared System Memory | Semaphores, Mutexes | Process and Threads APIs | Local Process, Static and Shared Libraries
Linux Driver (Static) | Shared System Memory, Read/Write Process Memory | Kernel Semaphores, Spin Locks | Kernel Threads, Tasklets | Full Kernel
Linux Module (Dynamic) | Shared System Memory, Read/Write Process Memory | Kernel Semaphores, Spin Locks | Kernel Threads, Tasklets | Module-local and Exported Kernel Symbols

There are two very important distinctions called out in the table:

* RTOSes are very egalitarian, letting application and I/O code "touch" any address and perform almost any activity, whereas Linux is much more hierarchical and restrictive.

* Legacy RTOS code can "see" every symbol or entry-point in the system (at least at link-time), whereas Linux user code is isolated from and built separately from kernel code and its accompanying name-space.

The consequence of the Linux hierarchy of privileged access is that normally only kernel code (drivers) actually accesses physical memory, and that user code that also does so must run as root.

In general, user-space code is isolated from the Linux kernel and can only "see" explicitly exported symbols as they appear in /proc/ksyms (replaced by /proc/kallsyms in 2.6 kernels). Moreover, visible system calls to the kernel are not invoked directly, but via calls to user library code. This segregation is intentional, enhancing stability and security in Linux.
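The library indirection is easy to observe from user space. In this small sketch, the two function names are hypothetical; getpid() and syscall() are the standard glibc entry points.

```c
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Usual path: the C library wrapper issues the system call for us. */
pid_t pid_via_library(void)
{
    return getpid();
}

/* Same kernel service by system call number. Note that syscall() is
 * itself a library routine; user code never jumps to a kernel symbol. */
pid_t pid_via_syscall(void)
{
    return (pid_t)syscall(SYS_getpid);
}
```

Both functions return the same process ID. Neither one links against a kernel symbol, which is the isolation the paragraph above describes.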

When you write a driver, the opposite is true. Statically-linked drivers are privy to the ENTIRE kernel name-space (not just exports), but have zero visibility into user-space process-based symbols and entry points. And, when you encapsulate driver code in run-time loadable modules, your program can only leverage interfaces explicitly exported in the kernel via the EXPORT_SYMBOL macro.

Migrating Network Drivers

As indicated above, porting character and block device drivers to Linux is a straightforward, if time-consuming, activity. Porting network drivers, though, can seem much more daunting.

Remember that while Linux grew up with TCP/IP, most RTOSes had networking grafted onto them in the late nineties. As such, legacy networking often only presents bare bones capabilities, like being able only to handle a single session or instance on a single port, or only to support a physical interface to a single network medium. In some cases networking architecture was generalized after the fact (as with Wind River VxWorks MUX code) to allow for multiple interfaces and types of physical connection.

The bad news is that you will likely have to rewrite most or all of your existing network interfaces. The good news is that re-partitioning for Linux is not hard and that you have dozens of Open Source examples of network device drivers to choose from.

Your porting task will be to populate the areas at the bottom of the following diagram with suitable packet formatting and interface code:


Block Diagram of Linux Network Drivers

Writing network drivers is not for beginners. Since, however, many RTOS network drivers actually were derived from existing GPL Linux interfaces, you might find the process facilitated by the code itself. Moreover, there is a large and still-growing community of integrators and consultants focused on making a business of helping embedded developers move their applications to Linux, for reasonable fees.

Conclusion

The goal of this article has been to give embedded developers some insight into both the challenges they will face and benefits they will realize from moving their entire software stack from a legacy RTOS to Linux. The span of 2800 words or so is too brief to delve into many of the details of driver porting (driver APIs for bus interfaces, address translation, etc.) but the wealth of existing Open Source GPL driver code serves as both documentation and template for your migration efforts. The guidelines presented here will definitely help your team scope the effort involved in a port of RTOS to Linux, and provide heuristics for re-partitioning code for best native fit into embedded Linux.

