feat(DestinationAppwrite): tolerate existing schema on Skip/Upsert re-migration#171
feat(DestinationAppwrite): tolerate existing schema on Skip/Upsert re-migration#171premtsd-code wants to merge 10 commits intomainfrom
Conversation
Re-migration needs the destination to tolerate existing schema resources.
The database library stays minimal — migration owns the reconciliation
decision since it has both source and destination metadata in hand.
Per-resource behavior:
- Database: pre-check `databases` metadata. If exists, hydrate sequence
from the existing document and return. Both Skip and Upsert tolerate
unconditionally — databases contain the entire tree of collections +
rows, dropping would destroy everything.
- Table: pre-check `database_{sequence}` metadata. Same pattern —
tolerate unconditionally, hydrate sequence, never drop. Attribute and
index reconciliation happens per-resource at the lower layer.
- Attribute: pre-check `attributes` metadata by composite id
({databaseSeq}_{tableSeq}_{key}). Skip tolerates. Upsert compares
source's updatedAt vs destination's — if source is strictly newer the
column is dropped (via `dbForDatabases->deleteAttribute`) and recreated
so the spec matches source. Column data is wiped on drop; rows will be
repopulated by the row-level Upsert path that follows.
- Index: same pattern against `indexes` metadata. Drop is cheap because
indexes carry no user data — only rebuild time.
Approach is pre-check rather than try/catch for control flow: migration
is sequential single-writer, so the race condition justification for
try/catch doesn't apply, and pre-check reads top-to-bottom with no
exception-as-control-flow.
Row-level dispatch (unchanged, already committed):
OnDuplicate::Upsert → $dbForDatabases->upsertDocuments(...)
OnDuplicate::Skip → $dbForDatabases->skipDuplicates(fn () => createDocuments(...))
OnDuplicate::Fail → plain createDocuments
Both row-level primitives are existing library APIs — no database-library
changes are required for this feature.
Helpers:
shouldTolerateSchemaDuplicate() // onDuplicate !== Fail
sourceSpecIsNewer($src, $dst) // strtotime compare, false on empty
Greptile SummaryThis PR introduces Skip/Upsert re-migration tolerance for
Confidence Score: 2/5Not safe to merge — three P1 ordering bugs will cause Upsert re-migrations to fail at schema limits. The src/Migration/Destinations/Appwrite.php — specifically the ordering of the index count/IndexValidator calls relative to the drop block, and the stale Important Files Changed
|
Guard against malformed datetime strings (e.g. "0000-00-00 00:00:00") that PHP's strtotime() returns false for. Without the explicit check, false gets loose-compared as 0 and the helper silently returns false, meaning a corrupted source updatedAt would always tolerate the existing destination entry and never trigger drop+recreate. Appwrite itself always emits parseable RFC 3339 timestamps, so this is mainly defensive for non-Appwrite sources (Supabase, NHost, CSV). Flagged by greptile P2.
On a first-run migration createField writes TWO metadata documents for a
two-way relationship attribute: parent-side at
{dbSeq}_{tableSeq}_{key} (line 763) and child-side at
{dbSeq}_{relatedTableSeq}_{twoWayKey} (line 819). Migration operates
directly on the database via $dbForProject — it doesn't rely on
Appwrite's attribute-event workers to derive the child side — so both
documents are physically written by migration itself.
The Upsert drop block was only removing the parent-side document +
column. On the recreate pass that followed, line 819 re-wrote the
child-side document and immediately hit DuplicateException because the
stale child doc from the previous migration run was still sitting there.
The inner catch rolled back the parent and the whole attribute aborted
with "Attribute already exists" — breaking Upsert re-migration for every
two-way relationship whose source spec had changed.
Fix: in the Upsert drop branch, when the resource is a two-way
relationship, also delete the child-side metadata document and the
child-side physical column on the related table, then purge caches.
Both deletes are wrapped in try/catch to tolerate the case where a prior
interrupted run (or utopia-php/database's relationship handling) already
cleaned one side — the goal is to guarantee a clean slate for the
downstream recreate, not to require both sides to still exist.
Flagged by greptile P1. Reproducing scenario:
- OnDuplicate::Upsert
- Destination already contains the attribute
- Source's two-way relationship spec was modified between runs
(source.updatedAt > dest.updatedAt)
DestinationAppwrite's shouldTolerateSchemaDuplicate() and sourceSpecIsNewer() were private helpers that encoded mode-specific behavior. Moving them onto OnDuplicate itself puts the behavior on the type that owns it and makes the call sites self-documenting: $this->onDuplicate->toleratesSchemaDuplicate() $this->onDuplicate->shouldReconcileSchema($src, $dst) shouldReconcileSchema() also subsumes the explicit Skip check at the call sites — it returns false for Skip and Fail, so the compound "`Skip || !sourceSpecIsNewer(...)`" condition collapses to a single "`!shouldReconcileSchema(...)`". No behavior change. Lint + PHPStan L3 clean.
OnDuplicate previously exposed two methods (toleratesSchemaDuplicate +
shouldReconcileSchema) that call sites had to combine. That split meant
the decision was spread across two methods and the call site — adding a
new mode later would require updating all three places in sync.
Replace with a single resolveSchemaAction() that returns one of three
SchemaAction enum cases (Create / Tolerate / DropAndRecreate). The
SchemaAction enum lives in the same file as OnDuplicate so the complete
decision surface is in one place.
For data-preserving containers (databases, tables) the method accepts
allowDrop=false so it can never return DropAndRecreate — callers that
use that path don't need to worry about accidentally dropping the
container. In practice the container call sites just use the inline
`$this->onDuplicate !== OnDuplicate::Fail` gate and then a simple
"is empty?" branch; resolveSchemaAction() is reserved for the leaves
(attributes, indexes) that actually benefit from the three-way choice.
Fail-mode short-circuit is preserved at every call site: no
getDocument pre-check when onDuplicate is Fail, so first-time
migrations (the common default-mode path) pay zero overhead. The
short-circuit stays at the call site rather than inside the enum method
so the "don't query on Fail" invariant is visible where the query is
issued.
Call sites:
- createDatabase / createEntity: inline `!== Fail` gate, tolerate on
existing. No drop ever.
- createField: Fail short-circuit, else resolveSchemaAction + match.
DropAndRecreate triggers the two-way relationship cleanup.
- createIndex: same pattern, no two-way concern.
…havioral comments
Three review improvements to the schema-tolerance work:
1. OnDuplicate::resolveSchemaAction — remove the $allowDrop parameter.
No caller passes it; all four call sites rely on the default. Containers
(databases, tables) bypass the method entirely with an inline
"!== Fail" check. The parameter advertised tuning that wasn't real,
confusing readers about what was actually gated. Docstring tightened
to mention that this method is only safe for leaf resources.
2. Extract deleteAttributeCompletely() as a reusable primitive that
fully removes an attribute from destination — both metadata documents
AND physical columns, mirroring createField's two-document write for
two-way relationships. The Upsert drop block in createField shrinks
from ~40 nested lines to a single method call; the two-way cleanup
is now a named concern that future non-Upsert paths (Merge mode,
explicit rollback, etc.) can reuse without duplication.
3. Replace line-number comments ("the first-run createField wrote TWO
metadata docs (parent at line 763, child at line 819)") with
behavioral descriptions ("createField writes a second `attributes`
document on the related table"). Line references rot on every edit;
behavioral descriptions don't.
Pure refactor — no functional change. Pint + PHPStan L3 clean.
… handling resolveSchemaAction is the load-bearing decision point for re-migration tolerance. Unit test locks the full mode × existence × timestamp matrix so a future refactor can't silently shift the mapping. 14 tests, sub-millisecond to run, no destination spin-up required. Coverage: - Not-exists returns Create for all modes. - Fail + exists → Create (routed through create flow so library DuplicateException surfaces as designed, not silently tolerated). - Skip ignores timestamps entirely — "don't touch" contract enforced by verifying both timestamp orderings produce Tolerate. - Upsert with source strictly newer → DropAndRecreate. - Upsert with dest newer or equal timestamps → Tolerate (strict > gate avoids unnecessary destructive rebuild when nothing has changed). - Upsert with unparseable timestamps → Tolerate (conservative, dataset covers empty strings, '0000-00-00 00:00:00', and non-date garbage). - OnDuplicate::values() case ordering guarded against accidental rename/reorder that would break SDK param validators. Writing the test surfaced a real gap in sourceIsNewer: PHP's strtotime() accepts '0000-00-00 00:00:00' leniently and returns a large negative epoch (-62169984000) rather than false. The previous `=== false` check missed the MySQL zero-date sentinel and would have returned true when source was a valid date and dest was '0000-00-00', triggering a destructive drop on garbage input. Harden sourceIsNewer to additionally reject non-positive epochs (<= 0) — any legitimate Appwrite updatedAt is well past 1970-01-01.
Extends the Upsert reconciliation to cover container metadata drift.
Previously databases and tables always tolerated an existing destination
under Upsert — source renames, enable toggles, and (for tables)
permission changes silently never reached destination on re-migration.
Now Upsert + source strictly newer triggers an updateDocument on the
container's own metadata doc, leaving children (rows, sub-resources)
untouched.
Mechanism:
- New SchemaAction::UpdateInPlace case alongside Create / Tolerate /
DropAndRecreate.
- OnDuplicate::resolveSchemaAction gains a canDrop parameter (default
false — safer default, destructive reconciliation must be opt-in):
canDrop: true → DropAndRecreate on Upsert-newer (attributes, indexes)
canDrop: false → UpdateInPlace on Upsert-newer (databases, tables)
- createField / createIndex pass canDrop: true.
- createDatabase / createEntity rely on the default and handle
SchemaAction::UpdateInPlace by calling updateDocument with a scoped
field set:
Database → name, search, enabled, type, originalId, database, $updatedAt
Table → name, search, enabled, $permissions, documentSecurity, $updatedAt
Immutable fields ($id, $createdAt, internal sequences) are never touched.
Source-wins semantics for table permissions follow the full-Upsert
contract: if the caller wants destination drift preserved across re-runs,
they should pick Skip mode, not Upsert. Row Upsert already works this
way; containers now match.
Unit tests cover both canDrop branches and confirm Skip never returns
UpdateInPlace regardless of timestamps (reinforcing the "don't touch"
contract). 16 tests, pint + phpstan L3 clean.
Summary
Re-migration against a destination Appwrite project needs the destination to tolerate existing schema resources — otherwise the second run throws
DuplicateExceptionon every collection/attribute/index that already exists. This PR makesDestinationAppwritehandle that reconciliation locally, without requiring any changes toutopia-php/database.Per-resource behavior
attributes→ toleratesource.updatedAt > dest.updatedAt→deleteAttribute+ recreate; else tolerateindexes→ tolerateupdatedAtgate →deleteIndex+ recreate; else tolerateskipDuplicates(createDocuments)upsertDocumentscreateDocumentsWhy pre-check over try/catch
Migration is a sequential single-writer. The race-condition justification for exception-driven control flow doesn't apply, so pre-check reads top-to-bottom without using exceptions for control flow. On
OnDuplicate::Fail(the default) the pre-check is gated off — fresh-migration paths pay zero overhead.Why only attributes/indexes get the
updatedAtdrop+recreateZero library changes
All of this uses existing
utopia-php/databaseAPIs:getDocument(collection, id)for pre-checkdeleteAttribute/deleteIndexfor the Upsert drop pathskipDuplicates()(5.3.22) for row-level SkipupsertDocuments()(existing) for row-level UpsertNo database library bump required.
Test plan
testAppwriteMigrationRowsOnDuplicatein appwrite/appwrite PR #11910 passes end-to-end (three scenarios: fresh, Skip re-run, Upsert re-run).vendor/bin/pint --testpasses.vendor/bin/phpstan analyse --level 3 srcpasses.