Skip to content

Schema Validation

Schema validation is optional. TalaDB is schemaless by default — you can store any document in any collection without declaring a schema first. When you do want runtime type safety (catching bad data at the boundary before it reaches storage), pass a schema option to db.collection().

Basic usage

Pass a schema as the second argument to db.collection(). Any object with a parse(data: unknown): T method works — Zod, Valibot, or a hand-rolled validator.

ts
import { z } from 'zod'

const userSchema = z.object({
  name: z.string().min(1),
  age:  z.number().int().nonnegative(),
  email: z.string().email().optional(),
})

type User = z.infer<typeof userSchema>

const users = db.collection<User>('users', { schema: userSchema })

Without the second argument, the collection behaves exactly as before — no validation, no overhead.

Validation on write

When a schema is set, every call to insert and insertMany runs the document through schema.parse() before sending it to storage. If validation fails, a TalaDbValidationError is thrown and nothing is written.

ts
// ✓ valid — stored normally
await users.insert({ name: 'Alice', age: 30 })

// ✗ invalid — throws TalaDbValidationError, nothing stored
await users.insert({ name: '', age: 30 })

// ✗ insertMany — throws on first invalid doc, nothing is stored
await users.insertMany([
  { name: 'Bob', age: 25 },
  { name: '',    age: 0  },  // fails at index 1 → error, nothing committed
])

TalaDbValidationError

Import TalaDbValidationError to catch schema errors specifically:

ts
import { TalaDbValidationError } from 'taladb'

try {
  await users.insert({ name: '', age: -1 })
} catch (err) {
  if (err instanceof TalaDbValidationError) {
    console.error('Validation failed:', err.message)
    console.error('Caused by:', err.cause)
  }
}

err.message includes the context (insert, insertMany[2], etc.) and the underlying schema library's error message. err.cause is the raw error thrown by schema.parse().

Validate on read (optional)

By default, documents returned by find and findOne are not parsed — they come back as stored. Set validateOnRead: true if you want to catch schema drift (old documents written before a schema change):

ts
const users = db.collection<User>('users', {
  schema: userSchema,
  validateOnRead: true,
})

// find() and findOne() now run every returned document through schema.parse()
const all = await users.find()   // throws if any doc fails

This is useful during migrations or when evolving a schema on existing data. For most production use cases, leave it off — validateOnRead adds a parse call per document on every read.

Works with Valibot

The schema option is duck-typed — it only requires a parse(data: unknown): T method. Valibot schemas expose this via v.parse(), but you can also pass a wrapped version:

ts
import * as v from 'valibot'

const schema = v.object({
  name: v.string(),
  age:  v.number(),
})

const users = db.collection<v.InferOutput<typeof schema>>('users', {
  schema: {
    parse: (data) => v.parse(schema, data),
  },
})

Or any custom validator:

ts
const users = db.collection<User>('users', {
  schema: {
    parse(data: unknown): User {
      const doc = data as Record<string, unknown>
      if (typeof doc.name !== 'string') throw new Error('name must be a string')
      return doc as User
    },
  },
})

CollectionOptions reference

OptionTypeDefaultDescription
schema{ parse(data: unknown): T }undefinedSchema validator. When set, insert and insertMany run every document through schema.parse(). Compatible with Zod, Valibot, or any object with a parse method.
validateOnReadbooleanfalseWhen true, documents returned by find and findOne are also passed through schema.parse().

Cloudflare Workers

@taladb/cloudflare supports the same schema option on db.collection():

ts
import { TalaDBDurableObject, TalaDbValidationError } from '@taladb/cloudflare'
import { z } from 'zod'

const noteSchema = z.object({ title: z.string(), body: z.string() })
type Note = z.infer<typeof noteSchema>

export class NoteDB extends TalaDBDurableObject {
  async fetch(request: Request): Promise<Response> {
    const db = await this.getDB()
    const notes = db.collection<Note>('notes', { schema: noteSchema })

    if (request.method === 'POST') {
      try {
        const body = await request.json<unknown>()
        const id = await notes.insert(body as Omit<Note, '_id'>)
        await db.flush()
        return Response.json({ id }, { status: 201 })
      } catch (err) {
        if (err instanceof TalaDbValidationError) {
          return Response.json({ error: err.message }, { status: 400 })
        }
        throw err
      }
    }

    return Response.json(await notes.find())
  }
}

FAQ

Does the schema affect storage? No. The schema only runs at the TypeScript layer — TalaDB stores documents exactly as returned by schema.parse(). The underlying storage is still schemaless.

Can I change the schema later? Yes. Existing documents are not re-validated when you update the schema definition — only new writes are checked. Use validateOnRead: true temporarily to audit existing data, then run a migration to fix any non-conforming documents.

Is there a performance cost? Only on operations where validation runs (insert, insertMany, and find/findOne when validateOnRead is true). Collections without a schema option have zero overhead — the code path is identical to before.