Fresh test data in every CI run
Your pipeline should not depend on a fixtures file that rots or a copied production dump that is a GDPR liability. SeedBase generates deterministic, foreign-key-consistent data and loads it into an ephemeral database, in one command, before your tests run.
Why CI test data is a problem
- Fixtures rot. Add one NOT NULL column in a migration and fixtures break across the whole suite.
- Production dumps in CI are a security and GDPR liability, and they are huge.
- Stale shared data makes tests flaky: a row another test changed, an ID that drifted.
What a pipeline wants is a fresh, correct database per run: generated rather than copied, and identical every time, so a failing test points at the code and not at the data.
One command in your pipeline
Spin up an ephemeral database, generate a deterministic dataset, load it, run your tests. A GitHub Actions job:
# .github/workflows/test.yml
jobs:
  test:
    runs-on: ubuntu-latest
    services:
      db:
        image: postgres:16
        env: { POSTGRES_PASSWORD: postgres }
        ports: ["5432:5432"]
    env:
      DATABASE_URL: postgres://postgres:postgres@localhost:5432/postgres
      SEEDBASE_TOKEN: ${{ secrets.SEEDBASE_TOKEN }}
    steps:
      - uses: actions/checkout@v4
      - run: pip install seedbase
      - name: Seed the test database
        run: |
          seedbase generate --project ${{ vars.SEEDBASE_PROJECT }} \
            --seed 20240601 --format postgresql --wait
          seedbase pull data --target "$DATABASE_URL" --replace
      - run: pytest -q
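The Postgres service container is discarded when the job ends, so nothing leaks between builds. If you seed a longer-lived database instead, drop the schema in a final step guarded by if: always() so cleanup runs even when tests fail. The same shape works on GitLab CI, CircleCI or any other runner: the CLI is just pip install seedbase.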
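On other runners the two CLI calls stay the same, and one way to keep them in a single place is a small Python wrapper. This is a minimal sketch that assumes only the commands and flags shown in the workflow above; build_seed_commands and seed_database are hypothetical helper names, not part of SeedBase.

```python
import subprocess


def build_seed_commands(project, seed, database_url):
    """Return the two CLI invocations from the workflow above as argument lists."""
    generate = [
        "seedbase", "generate",
        "--project", project,
        "--seed", str(seed),
        "--format", "postgresql",
        "--wait",
    ]
    pull = ["seedbase", "pull", "data", "--target", database_url, "--replace"]
    return [generate, pull]


def seed_database(project, seed, database_url):
    """Run generate, then pull; check=True raises on the first failing command."""
    for command in build_seed_commands(project, seed, database_url):
        subprocess.run(command, check=True)
```

A CI step could then call seed_database(os.environ["SEEDBASE_PROJECT"], 20240601, os.environ["DATABASE_URL"]), assuming the project ID is exposed as an environment variable of that name. Passing argument lists rather than a shell string avoids quoting problems with the database URL.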
Deterministic by seed, reproducible across PRs
Generation is deterministic: pin SEEDBASE_SEED and every pull request gets the exact same dataset, so a failing test is a real failure, not data drift. Commit the generation config as code next to your migrations and the whole team regenerates the same data from the same schema.
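To make the idea concrete, here is a minimal sketch of seed-determined, foreign-key-consistent generation in plain Python. It illustrates the principle only and is not SeedBase's generator; generate_dataset and the customers/orders layout are hypothetical.

```python
import random


def generate_dataset(seed, n_customers=3, n_orders=5):
    """Build customers and orders from a private, seeded random generator."""
    rng = random.Random(seed)  # private instance: global random state is untouched
    customers = [
        {"id": i, "name": f"customer-{rng.randrange(10_000)}"}
        for i in range(1, n_customers + 1)
    ]
    customer_ids = [c["id"] for c in customers]
    orders = [
        {
            "id": i,
            # The foreign key is always drawn from parent rows that exist.
            "customer_id": rng.choice(customer_ids),
            "total_cents": rng.randrange(100, 100_000),
        }
        for i in range(1, n_orders + 1)
    ]
    return {"customers": customers, "orders": orders}
```

Two calls with the same seed return equal datasets, and every order references an existing customer because child rows are generated after their parents from the same generator.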
Or pull it straight into your tests, with the pytest plugin
Install SeedBase next to pytest and request a fixture. It is opt-in and side-effect free at import, so it cannot break an existing suite, and it skips (not fails) if credentials are missing.
pip install seedbase pytest

def test_orders_have_customers(seeded_data):
    dump = seeded_data.decode("utf-8")
    assert "INSERT INTO orders" in dump
    assert "INSERT INTO customers" in dump


def test_generation_is_deterministic(seedbase_generation):
    assert seedbase_generation["status"] == "completed"
Fixtures: seedbase_client, seedbase_project, seedbase_generation, seeded_data. Configure via env vars, pytest.ini / pyproject.toml, or CLI flags.
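The substring assertions above only confirm that both tables appear in the dump. A stricter check loads the SQL and counts orphaned rows. The sketch below uses the standard-library sqlite3 module; SAMPLE_DUMP and its table layout are hypothetical stand-ins for the decoded seeded_data, and a real PostgreSQL-format dump may contain syntax SQLite rejects, in which case the same query can be run against the seeded database instead.

```python
import sqlite3

# Hypothetical stand-in for seeded_data.decode("utf-8").
SAMPLE_DUMP = """
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE orders (
    id INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customers(id)
);
INSERT INTO customers (id, name) VALUES (1, 'Ada'), (2, 'Lin');
INSERT INTO orders (id, customer_id) VALUES (10, 1), (11, 2), (12, 1);
"""


def count_orphan_orders(dump_sql):
    """Load a SQL dump into in-memory SQLite and count orders with no customer."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(dump_sql)
        (orphans,) = conn.execute(
            "SELECT COUNT(*) FROM orders o "
            "LEFT JOIN customers c ON c.id = o.customer_id "
            "WHERE c.id IS NULL"
        ).fetchone()
        return orphans
    finally:
        conn.close()
```

Under those assumptions, a test could assert count_orphan_orders(seeded_data.decode("utf-8")) == 0. SQLite does not enforce foreign keys by default, which is why the check is an explicit query rather than a reliance on insert errors.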
Works with your stack
- Any CI: GitHub Actions, GitLab CI, CircleCI, any runner. The CLI is one pip install.
- Postgres and MySQL: export as SQL, CSV or JSON, or push straight into the database.
- Python and Node SDKs, plus an MCP server if your AI agent drives the pipeline.
Seed your CI database on every run.
Create an API key, drop two commands into your pipeline, and get fresh FK-consistent data per build. Free tier, no card.
Get an API key, free
More: CLI & CI docs · For AI agents · why Faker breaks at scale · Django test data