<Aphelion />
by Algomimic

Enterprise Synthetic Data
at 1/400th the Cost

$49/year vs. $20,000+ enterprise platforms

34/34 PostgreSQL types (ltree, PostGIS, ranges, hstore) + 28/28 MySQL types (JSON, spatial, ENUM/SET, partitioning)

62 total exotic types. 84 production-ready tables across 6 industries. Zero foreign key violations.
Built for healthcare, fintech, and SaaS teams who refuse to overpay.

Save $19,951/year vs. Tonic.ai
Download Free CLI

No Credit Card • 1,000 Rows Per Table

Free forever for local development

curl -L https://algomimic.com/api/download/free -o aphelion && chmod +x aphelion

Or download binary for your OS

Works perfectly with your modern stack

Native CLI integration for Docker, CI/CD, and Seed Scripts

Docker
Node.js
PostgreSQL
MySQL
React
GitHub Actions

Comprehensive Database Coverage

100% exotic type support for both PostgreSQL and MySQL/MariaDB

PostgreSQL

34/34 Types

Exotic Types Supported:

  • PostGIS: geometry, geography
  • Hierarchical: ltree paths
  • Key-Value: hstore
  • Network: inet, cidr, macaddr, macaddr8
  • Ranges: int4range, tsrange, daterange (6 types)
  • Geometric: point, line, polygon, circle (7 types)
  • Full-Text: tsvector, tsquery
  • Advanced: arrays, JSONB, UUID, XML, money

Perfect For:

  • ✓ Complex hierarchical data
  • ✓ Geospatial applications
  • ✓ Full-text search
  • ✓ Advanced analytics

MySQL/MariaDB

28/28 Types

Exotic Types Supported:

  • JSON: with validation
  • Spatial: POINT, LINESTRING, POLYGON
  • Coded Values: ENUM, SET
  • Search: FULLTEXT indexes
  • Binary: BIT, BINARY, VARBINARY, BLOB
  • Precision: DECIMAL for money
  • Scale: PARTITION BY RANGE
  • Computed: GENERATED columns

Perfect For:

  • ✓ High-volume transactions
  • ✓ E-commerce platforms
  • ✓ Partitioned time-series
  • ✓ ENUM/SET applications

84 Production-Ready Tables Across 6 Industries

Available for both PostgreSQL and MySQL/MariaDB

🏥 Healthcare (15 tables) 💰 Finance (16 tables) 🛒 E-commerce (17 tables) 🏢 Insurance (11 tables) 📱 Telecom (13 tables) ⚖️ Legal (12 tables)

Trusted by developers

  • 2.5M+ Rows Generated
  • 500+ Developers
  • 99.9% Constraint Success
  • 9 Industry Templates

"Finally, realistic healthcare data without HIPAA violations. The ICD-10 and LOINC generators are perfect."

SJ
Sarah J.
Senior Engineer, HealthTech

"Saved us 40+ hours per sprint. CI/CD auto-approve mode is a game-changer for our testing pipeline."

MK
Mike K.
DevOps Lead, FinTech

"The constraint-safe generation is incredible. No more foreign key violations. Just works."

AL
Alex L.
CTO, E-commerce Startup

Compliance-Ready Data Generation

HIPAA-Ready
Synthetic PHI Generation
PCI-DSS Safe
Fake Payment Data
GDPR-Ready
No Real PII
Privacy-First
Runs Locally

Aphelion generates 100% synthetic data to help you maintain compliance. No real patient data, financial records, or personal information is used or required. Learn more →

Why Developers, Startups & Enterprises Love Us

Built by engineers tired of SQL seeds. Perfect for MVP velocity and Enterprise scale.

Constraint Safe

Never worry about foreign key violations. Topological dependency graph ensures perfect referential integrity.
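As a rough illustration of how a topological dependency graph yields a safe insert order, here is a minimal Python sketch. It is not Aphelion's implementation; the table names and the `foreign_keys` mapping are hypothetical, and it relies only on the standard-library `graphlib` module (Python 3.9+).

```python
from graphlib import TopologicalSorter

# Hypothetical schema: each table maps to the set of tables it references
# through foreign keys (its prerequisites).
foreign_keys = {
    "users": set(),
    "products": set(),
    "orders": {"users"},
    "items": {"orders", "products"},
}


def insertion_order(fks):
    """Return tables ordered so every referenced table precedes its dependents."""
    return list(TopologicalSorter(fks).static_order())


order = insertion_order(foreign_keys)
print(order)  # parents such as users and products appear before orders and items
```

`graphlib` raises `CycleError` when tables reference each other in a loop, so circular dependencies need a separate strategy on top of this ordering.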

Partition Support

Intelligent data generation for partitioned tables. Automatically respects date-range and list partition constraints.

Generated Columns

Automatically computes values for generated/computed columns based on defined expressions.

Data Masking

HIPAA/PCI-DSS compliant. Hash, redact, partial masking. Auto-detect SSN, email, phone, credit cards.
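The three masking styles named above can be sketched in a few lines of Python. This is an illustrative sketch only: the function names, the `demo-salt` value, and the SSN pattern are assumptions, not Aphelion's actual rules.

```python
import hashlib
import re


def hash_value(value: str, salt: str = "demo-salt") -> str:
    """One-way hash: equal inputs give equal tokens, so joins still line up."""
    return hashlib.sha256((salt + value).encode("utf-8")).hexdigest()[:16]


def redact(value: str) -> str:
    """Replace every character with an asterisk."""
    return "*" * len(value)


def partial_mask(value: str, visible: int = 4) -> str:
    """Keep the last `visible` alphanumeric characters and any separators."""
    total = sum(ch.isalnum() for ch in value)
    seen = 0
    out = []
    for ch in value:
        if ch.isalnum():
            seen += 1
            out.append(ch if seen > total - visible else "*")
        else:
            out.append(ch)  # separators such as '-' or ' ' stay as they are
    return "".join(out)


# Hypothetical detector for US SSN-formatted strings.
SSN_PATTERN = re.compile(r"^\d{3}-\d{2}-\d{4}$")


def looks_like_ssn(value: str) -> bool:
    return bool(SSN_PATTERN.match(value))


print(partial_mask("123-45-6789"))  # ***-**-6789
```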

Realistic Distributions

Zipfian, power-law, weighted distributions. Data skew, correlation, temporal patterns match reality.
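To make "Zipfian" concrete: item popularity falls off as 1/rank^s, so a few items receive most of the references. The sketch below assumes a hypothetical list of ten products and an exponent `s = 1.0`; it shows the general technique, not Aphelion's generator.

```python
import random
from collections import Counter


def zipf_weights(n: int, s: float = 1.0) -> list:
    """Weight for rank r is 1 / r**s, so rank 1 is the most popular."""
    return [1.0 / (rank ** s) for rank in range(1, n + 1)]


def sample_zipf(items, k: int, s: float = 1.0, seed: int = 42):
    """Draw k items with Zipf-weighted probabilities from a seeded RNG."""
    rng = random.Random(seed)
    return rng.choices(items, weights=zipf_weights(len(items), s), k=k)


products = [f"product_{i}" for i in range(1, 11)]
counts = Counter(sample_zipf(products, k=10_000))
print(counts.most_common(3))  # the lowest-ranked products dominate
```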

Telecom Generators

IMSI, IMEI (Luhn-valid), MSISDN, ICCID, CDRs, billing invoices, network topology (2G-5G).
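"Luhn-valid" means the last digit of the IMEI is a checksum over the first 14. A minimal Python sketch of that check follows; the `random_imei` helper is hypothetical and fills the body with random digits rather than a real allocation code.

```python
import random


def luhn_check_digit(payload: str) -> int:
    """Compute the Luhn check digit for a digit string (check digit excluded)."""
    total = 0
    # Walk right to left; the digit adjacent to the check digit is doubled first.
    for i, ch in enumerate(reversed(payload)):
        d = int(ch)
        if i % 2 == 0:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return (10 - total % 10) % 10


def is_luhn_valid(number: str) -> bool:
    """True when the final digit matches the Luhn checksum of the rest."""
    return number.isdigit() and luhn_check_digit(number[:-1]) == int(number[-1])


def random_imei(rng: random.Random) -> str:
    """Hypothetical helper: 14 random digits plus a Luhn check digit."""
    body = "".join(str(rng.randrange(10)) for _ in range(14))
    return body + str(luhn_check_digit(body))


imei = random_imei(random.Random(42))
print(imei, is_luhn_valid(imei))
```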

Financial Features

Fraud detection (6 types), geolocation, device fingerprints, velocity metrics, PCI-DSS tokenization.

Rich Content

Markdown, code snippets, regex patterns, formatted text for social platforms and forums.

Healthcare Generators

MRN (5 formats), ICD-10, RxNorm, SNOMED, LOINC, NDC codes. Comprehensive OMOP CDM & OpenMRS support.

Temporal Constraints

Dates make sense across tables. Bookings before flights, appointments after registration.
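One simple way to keep dates consistent across tables is to derive the child date from the parent date instead of sampling both independently. The sketch below assumes hypothetical registration and appointment columns and an arbitrary 90-day maximum gap.

```python
import random
from datetime import date, timedelta


def registration_and_appointment(rng, start=date(2023, 1, 1),
                                 span_days=365, max_gap_days=90):
    """Pick a registration date, then an appointment strictly after it."""
    registered = start + timedelta(days=rng.randrange(span_days))
    appointment = registered + timedelta(days=rng.randint(1, max_gap_days))
    return registered, appointment


rng = random.Random(42)
pairs = [registration_and_appointment(rng) for _ in range(1000)]
print(all(appt > reg for reg, appt in pairs))  # True
```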

Hierarchies & Trees

Deep trees (5-11 levels), ltree support, cycle detection, HierarchyID paths, JSONB trees.

Multi-Tenant & Sharding

Shard keys, tenant isolation, realistic data skew. 60% primary, 25% satellite, 15% remote.

Advanced Constraints

CHECK constraints, domains, composite keys, multi-column uniqueness, enum-like types.

Complex Data Types

XML columns, JSONB trees, ltree paths, HierarchyID, custom domains, PostgreSQL extensions.

Industry-Specific Solutions

Pre-built generators for healthcare, finance, e-commerce, and more.

Works With Your Existing Schema

No configuration needed to start. We introspect your database, detect types, and map them to realistic Faker generators automatically.

  • Smart Type Detection: Maps `user_email` to `internet.email` automatically
  • Zero Config Start: Just point it at your DB URL and go
  • JSON Export: Export layout to JSON for fine-tuning
bash — 80x24
~ aphelion introspect postgres://localhost/myapp
> Connected to database 'myapp'
> Found 14 tables
> Detected 3 circular dependencies
> Generating schema map... Done
~ aphelion generate --rows 1000 --seed 42
> Generating data plan
> Phase 1: Base tables (users, products)...
> Phase 2: Dependent tables (orders, items)...
> Phase 3: Resolving circular refs...
> Successfully generated 14,000 rows in 1.2s
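Name-based type detection of the kind described above can be approximated with a small rule table. In this Python sketch, only the `user_email` to `internet.email` mapping comes from this page; the other patterns, generator names, and the `lorem.word` fallback are assumptions for illustration.

```python
import re

# Illustrative rules: (column-name pattern, generator name).
RULES = [
    (re.compile(r"(^|_)email($|_)"), "internet.email"),
    (re.compile(r"(^|_)phone($|_)"), "phone.number"),
    (re.compile(r"(^|_)first_name$"), "person.firstName"),
    (re.compile(r"(^|_)city$"), "location.city"),
]


def detect_generator(column_name: str, default: str = "lorem.word") -> str:
    """Return the first generator whose pattern matches the column name."""
    name = column_name.lower()
    for pattern, generator in RULES:
        if pattern.search(name):
            return generator
    return default


print(detect_generator("user_email"))  # internet.email
```

Matching on whole underscore-separated words keeps a column such as `emailed` from being treated as an email address.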

5-Minute Quickstart

From empty DB to seeded test environment in 3 steps.

bash

# 1. Download the CLI and make it executable (put it on your PATH to run it as `aphelion`)

$ curl -L https://algomimic.com/api/download/free -o aphelion
$ chmod +x aphelion

# 2. Introspect your existing database (or use our templates)

$ aphelion introspect postgres://user:pass@localhost:5432/mydb

> Created schema.json with 42 detected tables

# 3. Generate 10k rows of constraint-safe data

$ aphelion generate --schema schema.json --rows 10000 --seed 123

> Generating... Done! (1.4s)

Perfect For

  • Database Seeding & Cloning: You need to fill a complex Postgres schema with 10M+ rows that respect FKs and constraints.
  • Integration Testing: You need deterministic data for CI/CD pipelines. Seed 42 always produces the exact same users.
  • Regulated Industries: We specialize in Healthcare (HIPAA), Finance (PCI), and Telecom schemas.

Not Designed For

  • ML Model Training: We generate structured relational data, not statistical duplicates of production data distributions for ML.
  • Unstructured Media: We don't generate synthetic images, video, or long-form generated text/audio.
  • SaaS Hosting: Aphelion is a CLI tool that runs in your infrastructure. We don't host your data.

Why Aphelion is Different

We fill the gap between hacking together scripts and expensive enterprise platforms.

Vs. Scripts & Libraries

Relational Integrity (FKs)

  • Aphelion: Automated
  • Faker.js / Seeds: Manual ID tracking
  • Custom SQL Scripts: Complex CTEs needed

Circular Dependencies

  • Aphelion: Handled
  • Faker.js / Seeds: Impossible
  • Custom SQL Scripts: ⚠️ Very hard to write

Maintenance

  • Aphelion: Zero (auto-introspects schema)
  • Faker.js / Seeds: High (breaks on schema change)
  • Custom SQL Scripts: High (rewrite queries on change)

Vs. Enterprise Platforms

Primary Focus

  • Aphelion: Relational Structure (perfect DB seeding & foreign keys)
  • Enterprise AI Platforms (Gretel, MOSTLY AI, Tonic): Statistical Similarity (ML model training & privacy)

Developer Experience

  • Aphelion: CLI Native (runs locally, works in CI)
  • Enterprise AI Platforms: Web UI / SaaS (upload data to cloud)

Postgres Depth

  • Aphelion: Native Support (ltree, hierarchyid, jsonb, ranges)
  • Enterprise AI Platforms: Generic SQL (often treats everything as tables)

Price

  • Aphelion: Free / $49 per year
  • Enterprise AI Platforms: $20k+ per year

Simple, Transparent Pricing

Start free on your local machine. Scale when your team grows.

Free / Local

$0/forever

Perfect for local development and testing.

  • Unlimited Tables
  • 1,000 Rows per Table (hard limit)
  • Full CLI Functionality
  • Local Development Only
  • Constraint-Safe Generation
Download & Start Building
POPULAR

Pro Team

$49/year

💰 Save $19,951/year vs. Tonic.ai

For teams automating CI/CD pipelines.

  • Up to 1.5M Rows (per dataset)
  • CI/CD Auto-Approve Mode
  • Priority Email Support
  • Advanced Custom Generators
  • HIPAA/PCI-DSS Compliance

Secure payment via Stripe

🔒 You get realistic data without inheriting production risk.

We never copy, store, hash, or transform real data — we observe structure and generate new data from scratch. All PII is automatically detected and replaced with safe synthetic values.

Scale Transparency: Tested and proven with up to 1.5M rows (100K patients in healthcare demos). Production-ready for datasets up to 250K patients (~3.75M rows) with the current configuration. For larger datasets, we offer a streaming implementation and direct database loading options. View technical details.

Frequently Asked Questions

Everything you need to know about Aphelion

How is Aphelion different from Faker.js?

Faker.js generates random data but doesn't understand database constraints. Aphelion introspects your schema to ensure zero foreign key violations, handles circular dependencies, and generates realistic healthcare/finance codes (ICD-10, LOINC, etc.) that Faker.js doesn't support.

Is the generated data truly realistic?

Yes! Aphelion uses weighted distributions, temporal patterns, and industry-specific generators. Healthcare data includes real ICD-10 codes, LOINC lab tests, and MRN formats. Financial data includes realistic transaction patterns and account hierarchies. It's designed to mirror production data without the compliance risk.

Can I use this in production?

No. Aphelion is for testing, development, and staging environments only. The data is synthetic and realistic, but not suitable for production use. It's designed to replace production data in non-production environments to maintain HIPAA/PCI-DSS compliance.

What databases are supported?

Currently, Aphelion supports PostgreSQL (including complex features like ltree, JSONB, arrays, and enums) and MySQL/MariaDB (including JSON, spatial types, ENUM/SET, and partitioning). Support for SQL Server and Oracle is on the roadmap.

How does deterministic generation work?

Use the --seed flag to generate identical data every time. Same seed = same data. Perfect for reproducible testing, CI/CD pipelines, and debugging. Different team members can generate the exact same dataset.
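The idea behind seeded generation can be shown with a small Python sketch: every random choice is drawn from one generator initialized with the seed, so the same seed replays the same sequence. The `generate_users` function and its fields are hypothetical and only illustrate the principle, not Aphelion's internals.

```python
import random


def generate_users(seed: int, count: int) -> list:
    """Build `count` fake user rows from a private, seeded random generator."""
    rng = random.Random(seed)  # private RNG, so global random state is not touched
    users = []
    for user_id in range(1, count + 1):
        users.append({
            "id": user_id,
            "age": rng.randint(18, 90),
            "plan": rng.choice(["free", "pro"]),
        })
    return users


print(generate_users(42, 3) == generate_users(42, 3))  # True
```

Because rows are drawn in a fixed order, asking for more rows with the same seed extends the dataset rather than reshuffling it.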

Do I need to write code or configuration files?

No coding required! Aphelion introspects your database schema automatically. Just point it at your database, and it generates a JSON configuration with smart defaults. You can customize if needed, but it works out of the box.

What's included in the Pro tier?

Pro ($49/year) includes: up to 1.5M rows per dataset, CI/CD auto-approve mode (no manual confirmations), priority email support, and advanced custom generators. Perfect for teams with automated testing pipelines.

How fast is data generation?

Aphelion generates ~10,000 rows/second on modern hardware. A 100K row dataset typically takes 10-15 seconds. The constraint-safe algorithm adds minimal overhead while ensuring perfect referential integrity.

Still have questions?

Contact Us →

Contact Sales & Support
