Package Manager with SQLite

Prove v1.4 ships a package manager. Packages are distributed as .prvpkg files — SQLite databases containing pre-checked AST, type signatures, and comptime-resolved assets. The entire implementation uses only Python’s standard library: sqlite3, urllib, hashlib, struct. No git, pip or pesky node_modules.

This post covers why we made the choices we did and how the pieces fit together.

The Problem

Prove is a compiled language with strong contracts, a verb system, and comptime evaluation. Sharing code between projects meant copying .prv files around. That works for a while — until you want versioning, dependency resolution, or the ability to share a library that uses comptime-loaded CSV data.

The constraint was clear: whatever we built needed to feel native to Prove’s existing toolchain. No shelling out to git. No requiring users to install extra tools. Just proof package add math-helpers and build.

Why SQLite as the Package Format

Most package managers use tarballs. We chose SQLite databases for three reasons:

1. Queryable without full deserialization. The checker needs type signatures to validate imports from packages. With SQLite, we read the exports table — a few rows of function names, parameter types, and return types — without ever touching the AST blobs. This is the “fast path”: the checker sees a package import the same way it sees a stdlib import, backed by a SQL query instead of parsing source.

2. Migratable across compiler versions. Prove’s AST changes between releases. With a tarball, you’d need to re-serialize the entire package. With SQLite, we ship SQL migration files that ALTER TABLE or UPDATE the schema in place. When you upgrade your compiler, your cached packages upgrade themselves.

3. Inspectable. Anyone can open a .prvpkg with sqlite3 and poke around. What modules does this package export? What version of Prove compiled it? It’s all right there in the meta table.

The schema:

CREATE TABLE meta (key TEXT PRIMARY KEY, value TEXT);
CREATE TABLE exports (
    module TEXT, kind TEXT, name TEXT, verb TEXT,
    params TEXT, return_type TEXT, can_fail INTEGER, doc TEXT
);
CREATE TABLE strings (id INTEGER PRIMARY KEY, value TEXT);
CREATE TABLE module_ast (module TEXT PRIMARY KEY, data BLOB);
CREATE TABLE dependencies (name TEXT, version_constraint TEXT);
CREATE TABLE assets (key TEXT PRIMARY KEY, data BLOB);

Six tables. That’s the entire package format.

String-Interned Binary AST

Prove’s AST has ~50 concrete node types (functions, types, match arms, lookup entries, contracts, etc.). Serializing them as JSON would work but would be massive — every field name repeated on every node.

Instead, we built a tagged-tuple binary format with string interning. Each AST node is encoded as:

A uint8 tag identifying the node type
Fields written inline via struct.pack
Strings stored once in a StringIntern table, referenced by uint32 ID

Lists are a uint32 count followed by elements. None is 0xFF. Bools are uint8. The result is roughly 50% smaller than keyed JSON before any compression, and it serializes/deserializes in a single recursive walk — the same pattern we already use for AST dumping in the CLI.

The string table is stored in SQLite’s strings table, and the binary blob goes into module_ast. The key insight: strings dominate Prove AST size (identifiers, type names, doc comments, contract texts), so interning them once gives us most of the compression benefit with zero algorithmic complexity.

Two-Speed Loading

The package loader has two modes, and this split is the architectural decision that makes the whole system feel fast:

Fast path (checker): Read the exports table. Build FunctionSignature and Type objects from the tabular data. No binary deserialization at all. This is what happens on every proof check and every keystroke in the LSP server. It’s a few SQL queries per package.

Slow path (emitter): Deserialize the full AST from module_ast blobs, reconstruct the string table, run the checker on the reconstituted AST to build a symbol table, then hand it to the C emitter. This only happens during proof build, and only for packages the project actually uses.

# Fast path — checker just needs signatures
info = read_package(pkg_path)  # reads meta + exports tables
for exp in info.exports:
    # Already has: name, verb, params, return_type, can_fail

# Slow path — emitter needs full AST
module = load_package_module(pkg_path, "MathHelpers")  # deserializes binary AST

This means prove check on a project with 10 package dependencies does essentially zero extra work — it’s reading pre-computed signatures from SQLite.

Flat Dependency Resolution

For v1, we chose flat resolution: each dependency name resolves to exactly one version across the entire tree. No diamond dependencies. If package A needs json-utils 0.3.0 and package B needs json-utils 0.4.0, that’s a conflict and the resolver tells you.

This is intentionally simple. Prove packages can’t contain native code (no foreign blocks allowed), so there’s no ABI compatibility concern that would push us toward allowing multiple versions. And for a language where the verb system makes function roles explicit (creates, transforms, inputs), having two versions of the same function with the same verb in scope would be confusing, not helpful.

The resolver supports version constraints: exact (0.3.0), minimum (>=0.2.0), range (>=0.1.0,<1.0.0), and caret (^0.2.0). It walks dependencies breadth-first, checks the lockfile for reusable pins, and falls back to the registry for new packages.

Static HTTP Registry

The registry is a directory of files:

{registry}/packages/{name}/index.json
{registry}/packages/{name}/{version}.prvpkg

No API server, authentication (yet) or database. Host it on any CDN, S3 bucket, or even python -m http.server. The registry client is ~130 lines of urllib code.

This means anyone can mirror the registry, host private packages on their own infrastructure, or develop entirely offline using local path dependencies:

[dependencies]
math-helpers = { version = "*", path = "../math-helpers" }

Local path dependencies are first-class: the resolver builds a .prvpkg from the local project on the fly, running the full check pipeline, and caches it alongside remote packages.

Purity Validation

When you prove package publish, the system validates that your package is pure:

All modules must pass prove check with no errors
No foreign blocks — packages can’t contain arbitrary C code
All imports must resolve to stdlib or declared [dependencies]
IO verbs (inputs/outputs) are allowed — the safety boundary is FFI, not IO

This is a deliberate design choice. Prove’s verb system already makes IO visible and trackable. A function declared as inputs read_file(path String) String! is honest about what it does. The danger is foreign blocks — arbitrary C code that could do anything. So we drew the purity line there.

Comptime Resolution at Publish Time

Prove has compile-time evaluation: comptime read("data.csv") loads and processes data during compilation. For packages, all comptime is fully resolved before serialization. The AST in a .prvpkg is post-evaluation — CSV data is already materialized into LookupEntry nodes, computed constants are already values.

Consumers never re-evaluate comptime. They get the final result. Any large data blobs that store-backed types need at runtime go into the assets table.

The CLI

Seven subcommands, all discoverable:

prove package init      — add [dependencies] to prove.toml
prove package add       — add dep, resolve, download, update lock
prove package remove    — remove dep, re-resolve
prove package install   — fetch all deps from lockfile
prove package publish   — validate purity, create .prvpkg
prove package list      — show dependency tree
prove package clean     — clear ~/.prove/cache/packages/

A typical workflow:

prove package add math-helpers 0.1.0 --path ../math-helpers
prove build

That’s it. The add command writes the dependency to prove.toml, resolves versions, downloads (or builds from local path), writes prove.lock, and you’re ready to build.

What We Shipped

8 new source files, 6 new test files, ~4,500 lines total. The implementation breaks down into clean layers:

Layer	File	Purpose
Serialization	`ast_serial.py`	Binary AST ↔ bytes with string interning
Package format	`package.py`	Create/read `.prvpkg` SQLite databases
Config	`config.py`	`[dependencies]` parsing in `prove.toml`
Lockfile	`lockfile.py`	`prove.lock` read/write
Registry	`registry.py`	Static HTTP client, caching, checksums
Resolution	`resolver.py`	Flat version resolution with constraint solving
Integration	`package_loader.py`	Bridge between packages and checker/emitter
Migrations	`migrations/`	SQL-based AST schema upgrades

Each layer depends only on the ones above it, and every layer has its own test file.

What’s Not in V1

C-extension packages (need a [native] section)
Private registries with authentication
Workspaces / monorepo support
Full semver constraint solver (just flat resolution)
Package yanking / deprecation
Signed packages

These are all solvable, and the SQLite foundation makes most of them straightforward to add. But v1 is about getting the core right: share pure Prove libraries, with type safety, fast checking, and zero ceremony.

The package manager will be available in Prove v1.4. Try proof package init in your project to get started.

For more on Prove’s design and development, see the full roadmap or browse all Prove posts.

The Problem

Why SQLite as the Package Format

String-Interned Binary AST

Two-Speed Loading

Flat Dependency Resolution

Static HTTP Registry

Purity Validation

Comptime Resolution at Publish Time

The CLI

What We Shipped

What’s Not in V1

Could we add you to our mailing list!

Comments

Leave a Reply Cancel reply

Your basket (items: 0)