Jim Starkey: Creator of InterBase and Pioneer of MVCC Databases

In the mid-1980s, while most database engineers were still grappling with pessimistic locking schemes that forced transactions to wait in line like customers at a single checkout counter, Jim Starkey was busy reinventing how relational databases handle concurrency. His invention of multi-version concurrency control — MVCC — eliminated the need for readers to block writers and writers to block readers, a breakthrough so fundamental that it took the rest of the industry twenty-five years to catch up. Today, every major database system from PostgreSQL to Oracle to MySQL’s InnoDB engine relies on some form of MVCC architecture. But long before these systems adopted his ideas, Starkey had already built InterBase, invented the BLOB data type, and laid the groundwork for distributed, elastic database engines that would come to define cloud-era data management.

Early Life and Education

James “Jim” Starkey was born on January 6, 1949, in Illinois. He grew up during the dawn of the computer age, a time when mainframes filled entire rooms and programming meant punching cards. Starkey showed an early aptitude for mathematics and logical thinking, pursuits that would eventually lead him to the University of Wisconsin at Madison, where he earned a Bachelor of Arts in Mathematics.

Even before completing his degree, Starkey demonstrated his talent for systems-level programming. In 1965, at just sixteen years old, he wrote STOP — an assembler emulator used by the Illinois Institute of Technology for undergraduate computer science instruction. This was not a toy project; it was production software used in an academic setting, hinting at the engineering discipline that would define his later career.

After graduating, Starkey joined the Computer Corporation of America (CCA), where he worked on a research project to build a database machine for the fledgling ARPAnet. This early exposure to networked database systems — at a time when the internet itself was still an experimental military network — gave Starkey a perspective on distributed data that was decades ahead of the mainstream industry.

Career and Technical Contributions

The DEC Years: Building the Foundation (1975–1984)

In 1975, Starkey joined Digital Equipment Corporation (DEC), then one of the most influential computer companies in the world. At DEC, he became the architect behind an impressive series of data products that would shape enterprise computing for years to come.

His first major achievement was DATATRIEVE, a query and report-writing language for DEC’s PDP-11 and VAX systems. DATATRIEVE Version 1 shipped for the PDP-11 in 1977, followed by VAX DATATRIEVE in 1981 as part of the VAX Information Architecture. These were not merely database tools — they represented an entire philosophy of making data accessible to non-programmers through a high-level, English-like query interface.

Starkey also designed the DEC Standard Relational Interface and created Rdb/ELN, a relational database for embedded systems. This work presaged the later rise of embedded databases like SQLite, created by Richard Hipp, though Starkey’s focus was on real-time embedded applications running on DEC hardware.

Perhaps most remarkably, it was during his DEC years that Starkey invented the BLOB — the Binary Large Object data type. The origin story is delightfully accidental: his boss at DEC, Barry Rubinson, kept wandering around the office muttering about needing “blobs” in the database. When Starkey asked what a blob actually was, Rubinson pointed out that Starkey was the architect and figuring that out was his job. Stranded in Colorado Springs during a snowstorm and unable to make progress on his work on transaction consistency theory, Starkey invented the BLOB instead. That single data type — the ability to store images, documents, audio, and other unstructured binary data directly in a relational database — became one of the most widely used features in database technology.
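To make the idea concrete, here is a minimal sketch of storing and reading back binary data through a BLOB column. It uses Python's built-in sqlite3 module purely as a convenient stand-in; it is not InterBase or DEC code, and the table name, column names, and payload are hypothetical.

```python
import sqlite3

# In-memory SQLite database; SQLite stands in for any engine with a BLOB type.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE documents (id INTEGER PRIMARY KEY, name TEXT, body BLOB)"
)

# Arbitrary binary payload (hypothetical; the first four bytes mimic a PNG signature).
payload = bytes([0x89, 0x50, 0x4E, 0x47]) + bytes(range(256))

# The sqlite3 module binds Python bytes objects as BLOB values.
conn.execute(
    "INSERT INTO documents (name, body) VALUES (?, ?)", ("logo.png", payload)
)
conn.commit()

# Read the row back; length() on a BLOB returns its size in bytes.
name, body, size = conn.execute(
    "SELECT name, body, length(body) FROM documents WHERE id = 1"
).fetchone()
```

The database treats the column as opaque bytes: it stores and returns them unchanged, and interpreting them as an image or document is left to the application.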

Technical Innovation: InterBase and MVCC

In 1984, Starkey left DEC and co-founded Groton Database Systems alongside Ann Harrison and Don DePalma, operating initially out of a house at 297 Reedy Meadow Road in Groton, Massachusetts. The company later moved to an office above a dry cleaner and was renamed InterBase Software Corporation in 1986.

It was here that Starkey made his most significant contribution to computer science: multi-version concurrency control (MVCC). Traditional databases of the era used lock-based concurrency, where a transaction writing to a row would lock it, preventing any other transaction from reading or writing that row until the lock was released. This approach created serious bottlenecks in multi-user environments.

Starkey’s insight was elegant: instead of locking rows, the database could maintain multiple versions of each record. When a transaction modifies a row, it creates a new version rather than overwriting the existing one. Other transactions that started before the modification continue to see the original version, providing a consistent snapshot of the data without any locking overhead. Each transaction receives a monotonically incrementing transaction ID, and the database uses these IDs to determine which version of a record should be visible to each transaction.

Here is a simplified conceptual model of how MVCC record versioning works in InterBase’s architecture:

-- Conceptual view of MVCC record versioning in InterBase
-- Each row maintains a chain of versions linked by transaction IDs

-- Transaction T1 (TxID=100) reads account balance
SELECT balance FROM accounts WHERE account_id = 1001;
-- Sees: balance = 5000 (version created by TxID=95)

-- Transaction T2 (TxID=101) updates the same row
UPDATE accounts SET balance = 4500 WHERE account_id = 1001;
-- Creates NEW version: {TxID=101, balance=4500}
-- Old version: {TxID=95, balance=5000} becomes a "back version"

-- Transaction T1 (TxID=100) reads AGAIN — still sees original!
SELECT balance FROM accounts WHERE account_id = 1001;
-- Still sees: balance = 5000 (snapshot isolation)
-- T1 started before T2, so T2's changes are invisible to T1

-- Transaction T2 (TxID=101) commits its change
COMMIT;

-- Transaction T3 (TxID=102) starts AFTER T2 commits
SELECT balance FROM accounts WHERE account_id = 1001;
-- Sees: balance = 4500 (the committed version from T2)

-- Garbage collection removes old versions when no active
-- transaction can possibly need them anymore
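The same walkthrough can be sketched as a small in-memory model. This is a toy illustration under stated assumptions, not InterBase's actual on-disk format or algorithm: the class and method names (`MVCCStore`, `Version`, `garbage_collect`) are hypothetical, each transaction's snapshot is simply the set of transaction IDs committed when it began, and write-write conflict detection is left out entirely.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Version:
    txid: int                          # transaction that created this version
    value: int
    prev: Optional["Version"] = None   # older "back version", if any


class MVCCStore:
    def __init__(self, next_txid: int = 1):
        self.next_txid = next_txid
        self.committed = set()   # IDs of committed transactions
        self.active = {}         # txid -> snapshot (IDs committed at start)
        self.heads = {}          # key -> newest Version in the chain

    def begin(self) -> int:
        txid = self.next_txid
        self.next_txid += 1
        self.active[txid] = frozenset(self.committed)
        return txid

    def commit(self, txid: int) -> None:
        self.committed.add(txid)
        del self.active[txid]

    def write(self, txid: int, key, value) -> None:
        # Never overwrite: push a new version in front of the old one.
        self.heads[key] = Version(txid, value, self.heads.get(key))

    def read(self, txid: int, key):
        # Walk newest-to-oldest until a version is visible to this snapshot.
        snapshot = self.active[txid]
        v = self.heads.get(key)
        while v is not None:
            if v.txid == txid or v.txid in snapshot:
                return v.value
            v = v.prev
        return None

    def garbage_collect(self, key) -> None:
        # The newest committed version that every active snapshot can see
        # hides everything behind it, so older versions can be dropped.
        v = self.heads.get(key)
        while v is not None:
            if v.txid in self.committed and all(
                v.txid in snap for snap in self.active.values()
            ):
                v.prev = None
                return
            v = v.prev


store = MVCCStore(next_txid=95)
setup = store.begin()                    # TxID 95 creates the original row
store.write(setup, 1001, 5000)
store.commit(setup)

store.next_txid = 100                    # skip ahead so IDs match the walkthrough
t1 = store.begin()                       # TxID 100
t2 = store.begin()                       # TxID 101
first_read = store.read(t1, 1001)        # T1 sees 5000
store.write(t2, 1001, 4500)              # T2 adds a new version
second_read = store.read(t1, 1001)       # T1 still sees 5000
t2_own_read = store.read(t2, 1001)       # T2 sees its own write
store.commit(t2)
t3 = store.begin()                       # TxID 102, starts after T2 commits
t3_read = store.read(t3, 1001)           # T3 sees 4500

store.garbage_collect(1001)              # T1 is still active
kept_back_version = store.heads[1001].prev is not None
store.commit(t1)
store.garbage_collect(1001)              # now the 5000 version can go
```

Note that no reader ever waits in this model: the cost of MVCC shows up instead as extra versions that must be retained until the last transaction that could see them has finished.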

InterBase became the first commercial relational database to fully implement multi-versioning. But MVCC was not the only innovation baked into the system. InterBase also pioneered event alerting mechanisms (allowing the database to notify applications of changes), native array column types, and stored triggers — features that would later become standard in virtually all relational database systems.

Why It Mattered

The impact of MVCC cannot be overstated. Before Starkey’s work, database concurrency was a zero-sum game: you could have consistency or you could have performance, but getting both simultaneously was considered impossible by many in the field. MVCC fundamentally changed this calculus.

Consider the difference in approach. In a traditional lock-based system, a long-running analytical report could block dozens of transactional operations. In an MVCC system, the report runs against a consistent snapshot while transactions continue unimpeded. This is not merely an optimization — it is a qualitatively different model of how databases can operate.
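A back-of-the-envelope model makes the contrast visible. The sketch below is a deliberately simplified assumption, not a benchmark: time is counted in abstract ticks, the report is assumed to hold a shared table lock for its whole duration in the locking case, writers do not contend with each other, and the function name and arrival times are hypothetical.

```python
def writer_wait_ticks(report_start, report_duration, writer_arrivals, use_mvcc):
    """Return how many ticks each writer waits before it can run.

    Locking case: the report holds a shared lock from report_start until it
    finishes, and a writer arriving in that window waits for the lock.
    MVCC case: the report reads a snapshot and holds no lock, so nobody waits.
    """
    report_end = report_start + report_duration
    waits = []
    for arrival in writer_arrivals:
        blocked = (not use_mvcc) and report_start <= arrival < report_end
        waits.append(report_end - arrival if blocked else 0)
    return waits


arrivals = [0, 2, 5, 9, 12]   # hypothetical writer arrival times, in ticks

locking_waits = writer_wait_ticks(1, 10, arrivals, use_mvcc=False)
mvcc_waits = writer_wait_ticks(1, 10, arrivals, use_mvcc=True)

total_locking_wait = sum(locking_waits)
total_mvcc_wait = sum(mvcc_waits)
```

With a report running from tick 1 to tick 11, the three writers that arrive during the report wait 9, 6, and 2 ticks under locking, and none waits under MVCC. The longer the report, the larger the gap becomes.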

Michael Stonebraker’s PostgreSQL adopted MVCC as its concurrency model, as did Oracle (whose multi-version read consistency arrived in version 4). Michael “Monty” Widenius’s MySQL gained MVCC capability through the InnoDB storage engine. Today, MVCC is the dominant concurrency control mechanism in relational databases worldwide, a testament to how right Starkey was in 1984.

Starkey himself identified MVCC as his most significant innovation. He further recognized that MVCC is the core enabling technology for true distributed database systems, since it eliminates the need for distributed locking protocols that would otherwise create unacceptable latency across network boundaries.

Other Notable Contributions

The Ashton-Tate and Borland Era

InterBase Software Corporation was acquired by Ashton-Tate in 1991, and when Ashton-Tate was itself acquired by Borland shortly thereafter, InterBase came along for the ride. Under Borland’s stewardship, InterBase continued to develop but never achieved the market dominance that its technical merits deserved — a familiar story in technology, where marketing muscle often matters more than engineering excellence.

In a pivotal move in 2000, Borland released the InterBase 6.0 source code under an open-source license. This decision gave birth to the Firebird open-source database project, which continues to this day and carries forward many of Starkey’s original architectural decisions. Starkey is known affectionately as “The Wolf” among Firebird developers — a nod to his role as the original architect of the codebase.

Netfrastructure: The Unified Web Platform

In 2000, Starkey founded Netfrastructure, Inc., an ambitious attempt to build a unified platform for web application development. The Netfrastructure system was architecturally remarkable — it combined a relational database engine, an integrated full-text search engine, an embedded Java virtual machine, and a high-performance context-sensitive page generator into a single, cohesive platform.

This vision of a unified application platform, in which the database, the application server, and the web server are deeply integrated rather than loosely coupled, was arguably ahead of its time. Later platforms that bundle a database with application logic echo Netfrastructure’s original aim of reducing architectural complexity.

Falcon: The MySQL Storage Engine

In 2006, MySQL AB acquired Netfrastructure, and Starkey joined MySQL as a senior software architect. His mission was to build Falcon, a transactional storage engine intended to compete with InnoDB as MySQL’s default storage engine. Falcon was based on the Netfrastructure codebase and incorporated Starkey’s decades of experience with MVCC and transaction processing.

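Here is an illustrative example of how a Falcon configuration might have looked in MySQL’s my.cnf; the values are examples rather than recommendations, and option names varied between pre-release builds: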

# MySQL configuration for Falcon storage engine (circa 2007)
# my.cnf settings for the Falcon transactional engine

[mysqld]
# Enable Falcon as the default storage engine
default-storage-engine = falcon

# Falcon-specific memory allocation
# Record cache for MVCC version management
falcon_record_memory_max    = 512M
falcon_record_scavenge_threshold = 67

# Page cache for disk I/O optimization
falcon_page_cache_size      = 256M

# Serial log for crash recovery
# Similar in concept to InnoDB's redo log
falcon_serial_log_dir       = /var/lib/mysql/falcon_logs

# Transaction management
falcon_max_transaction_backlog = 1000
# Scavenger schedule uses a cron-like format with a leading seconds field
falcon_scavenge_schedule    = "15,45 * * * * *"

# Tablespace configuration (page size in bytes)
falcon_page_size            = 16384

However, Falcon never progressed beyond beta release. Sun Microsystems acquired MySQL AB in early 2008, and Starkey departed that June. The Falcon project was eventually shelved in favor of continuing with InnoDB. While Falcon itself did not survive, the engineering insights Starkey brought to MySQL’s internal architecture contributed to the broader understanding of how transactional storage engines should be designed.

NuoDB: Elastic SQL for the Cloud

Never one to rest, Starkey incorporated a new database company called NimbusDB in 2008, which was formally renamed NuoDB in 2011. NuoDB represented Starkey’s most ambitious vision yet: an elastic, distributed SQL database designed from the ground up for cloud deployment.

Starkey invented what he called the “Emergent Architecture” for NuoDB — a peer-to-peer design where database processes could be added or removed dynamically, with the system automatically rebalancing load and data distribution. This was MVCC taken to its logical conclusion: not just multiple versions of records within a single server, but a fully distributed, multi-versioned architecture spanning multiple nodes.

Starkey retired from NuoDB at the end of 2012, shortly before the product’s commercial launch. The company, co-founded with Barry S. Morris, continued development and NuoDB remains in active use as a cloud-native distributed SQL database.

Philosophy and Key Principles

Throughout his career spanning more than four decades, Starkey has maintained a consistent set of engineering principles that have guided his work.

Question conventional wisdom. Starkey’s entire career has been defined by challenging what the database establishment considered settled science. When the prevailing wisdom said that locking was the only way to achieve transactional consistency, he invented MVCC. When the industry said relational databases could not scale horizontally, he built NuoDB. His advice to young engineers echoes this philosophy — he believes in teaching people to look beyond what everybody currently accepts as true.

Elegance through simplicity. MVCC is, at its core, a remarkably elegant idea: instead of coordinating access through locks, simply let every transaction see a consistent snapshot of the data. The implementation is complex, but the conceptual model is clean and intuitive. This preference for conceptually simple solutions to hard problems runs through all of Starkey’s work, from DATATRIEVE’s English-like query syntax to NuoDB’s emergent architecture.

Integration over composition. From Netfrastructure’s unified platform to NuoDB’s peer-to-peer architecture, Starkey has consistently favored tightly integrated systems over loosely coupled component architectures. He sees unnecessary boundaries between system components as sources of both complexity and performance overhead.

Think in decades, not quarters. MVCC took twenty-five years to become ubiquitous. Starkey has never optimized for short-term market success, instead focusing on building architectures that are fundamentally sound. This long-term orientation has meant that some of his ventures, like Falcon, did not achieve commercial success, but his core ideas have invariably proven correct given enough time.

Legacy and Impact

Jim Starkey’s influence on modern database technology is both profound and pervasive. Every time a web application serves a page while simultaneously processing a transaction — without either operation blocking the other — it is leveraging the concurrency model Starkey invented in 1984.

The lineage of his work traces through some of the most important database systems in computing history. Edgar Codd provided the relational model, and Jim Gray established the theory of transaction processing, but it was Starkey who solved the practical problem of making relational transactions fast enough for real-world concurrent workloads. Without MVCC, the relational database might well have been replaced by less rigorous alternatives long before NoSQL databases emerged as a genuine competitor.

The Firebird open-source project carries his original InterBase architecture forward, serving as a living testament to the durability of his design choices. Modern distributed databases such as CockroachDB, co-created by Peter Mattis, and Apache Cassandra, co-created by Avinash Lakshman, grapple with the same fundamental concurrency challenges that Starkey addressed, and some, CockroachDB among them, incorporate MVCC directly.

His invention of the BLOB data type — born from a snowstorm and a boss’s vague demands — enabled relational databases to store multimedia content, documents, and other unstructured data, bridging the gap between structured relational data and the messy reality of real-world information. Without BLOBs, the entire architecture of web applications that store images, PDFs, and media files in databases would have developed very differently.

At the age of seventy-seven, Starkey remains active in database research, working on a new database model called AmorphousDB — proof that the restless engineering mind that invented MVCC four decades ago continues to push against the boundaries of what databases can do.

Key Facts

Detail | Information
Full Name | James “Jim” Starkey
Born | January 6, 1949, Illinois, USA
Education | B.A. in Mathematics, University of Wisconsin–Madison
Known For | Creating InterBase, inventing MVCC, inventing the BLOB data type
Key Employers | Computer Corporation of America, Digital Equipment Corporation (DEC), MySQL AB
Companies Founded | Groton Database Systems / InterBase (1984), Netfrastructure (2000), NuoDB / NimbusDB (2008)
Major Products | DATATRIEVE, Rdb/ELN, InterBase, Netfrastructure, Falcon, NuoDB
Key Innovations | Multi-Version Concurrency Control (MVCC), BLOB data type, database event alerts, native array columns
Spouse | Ann Harrison (InterBase contributor)
Nickname | “The Wolf” (among Firebird developers)
Current Project | AmorphousDB

Frequently Asked Questions

What is MVCC and why is Jim Starkey credited with inventing it?

Multi-Version Concurrency Control (MVCC) is a database technique where the system maintains multiple versions of each data record so that reading transactions and writing transactions do not block each other. Instead of using locks to serialize access, each transaction sees a consistent snapshot of the database as it existed when the transaction began. Jim Starkey implemented MVCC in InterBase in the mid-1980s, making it the first commercial relational database with full multi-versioning support. While the theoretical concept of multi-versioning had been discussed in academic literature, Starkey was the first to build it into a production-grade relational database system, and his implementation became the model that PostgreSQL, Oracle, and other major databases later followed.

What happened to the Falcon storage engine for MySQL?

Falcon was a transactional storage engine developed by Jim Starkey after MySQL AB acquired his company Netfrastructure in 2006. It was intended to become MySQL’s default storage engine, replacing the third-party InnoDB engine. Falcon was based on Starkey’s decades of experience with MVCC and incorporated the Netfrastructure codebase. However, after Sun Microsystems acquired MySQL AB in early 2008, priorities shifted. Starkey left the project in June 2008, and Falcon never progressed beyond beta. Oracle’s subsequent acquisition of Sun — and with it, both MySQL and InnoDB — effectively ended any possibility of Falcon replacing InnoDB as MySQL’s primary transactional engine.

How does InterBase relate to the Firebird database?

Firebird is a direct descendant of InterBase. In 2000, Borland (which had acquired InterBase through its purchase of Ashton-Tate) released the source code for InterBase 6.0 under an open-source license. A community of developers forked this codebase and created the Firebird project, which has continued independent development ever since. Firebird retains InterBase’s core architecture, including its MVCC implementation, and has been enhanced with additional features over the past two decades. The Firebird community holds Jim Starkey in particularly high regard, referring to him as “The Wolf” in recognition of his role as the original architect of the system they continue to develop.

What was Jim Starkey’s role in the invention of the BLOB data type?

Jim Starkey invented the BLOB (Binary Large Object) data type while working at Digital Equipment Corporation in the late 1970s or early 1980s. The origin story is memorable: his boss Barry Rubinson kept insisting the database needed “blobs” without specifying what that meant. When pressed, Rubinson told Starkey — as the architect — that defining BLOBs was his responsibility. Stranded in Colorado Springs during a snowstorm and stuck on his work on transaction consistency theory, Starkey invented the BLOB as a way to store arbitrary binary data within a relational database. This seemingly accidental invention became one of the most important data types in database history, enabling applications to store images, documents, video, and other unstructured content alongside structured relational data.