Require project_id on list endpoints for the four hot tables#1250
Require project_id on list endpoints for the four hot tables#1250
Conversation
Adds `require_project_for_list` flag to ProjectMixin (default False) and opts in ClassificationViewSet, OccurrenceViewSet, DetectionViewSet, and SourceImageViewSet. Also sets require_project=True on SummaryView and removes the unfiltered else branch. Context: on 2026-02-16 the Facebook crawler hit /api/v2/classifications/ 13k+ times without a project filter, causing 3.6 min COUNT(*) queries and P99 > 2 min. The crawler is now blocked at nginx, but unfiltered list queries from any client remain a risk. This is the opt-in alternative to #1133: flipping the mixin default to True has too wide a blast radius (pagination calls get_active_project on every ProjectMixin viewset, so `/projects/`, `/algorithms/`, etc. would all 400 without opt-outs). Opt-in keeps the change scoped to the hot tables we actually want to protect. Co-Authored-By: Claude <noreply@anthropic.com>
✅ Deploy Preview for antenna-ssec canceled.
|
✅ Deploy Preview for antenna-preview canceled.
|
📝 WalkthroughWalkthroughThe pull request refactors project context enforcement across API endpoints by introducing a Changes
Sequence DiagramsequenceDiagram
participant Client
participant ViewSet as ViewSet<br/>(ProjectMixin)
participant ProjectMixin as ProjectMixin<br/>.get_active_project()
participant Serializer as Serializer<br/>(Validation)
Client->>ViewSet: GET /api/v2/classifications/ (list)
ViewSet->>ProjectMixin: get_active_project()<br/>(action="list")
ProjectMixin->>ProjectMixin: Compute required flag<br/>(require_project_for_list=True)
alt project_id provided & valid
ProjectMixin->>Serializer: Validate project_id
Serializer->>Serializer: Valid
Serializer-->>ViewSet: Validated project
ViewSet-->>Client: 200 OK<br/>(project-scoped data)
else project_id missing or invalid
ProjectMixin->>ViewSet: Return None or raise
ViewSet->>Serializer: Validate required project_id
Serializer->>Serializer: Raise ValidationError
Serializer-->>Client: 400 BAD_REQUEST
end
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Suggested labels
Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Pull request overview
Adds an opt-in mechanism to require project_id on selected high-cost list endpoints to prevent unfiltered full-table scans, while keeping existing list endpoints unchanged unless explicitly opted in.
Changes:
- Introduces
require_project_for_list(defaultFalse) inProjectMixinand enforces it viaget_active_project()foraction == "list". - Opts in the four “hot table” viewsets (classifications/occurrences/detections/captures) to require
project_idfor list requests; makes/status/summary/require a project and removes the unscoped branch. - Adds API tests to assert 400/200 behavior for the protected endpoints and regression-guard unaffected list endpoints.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
ami/base/views.py |
Adds require_project_for_list and updates get_active_project() enforcement logic. |
ami/main/api/views.py |
Opts specific viewsets into list-time project enforcement; requires project for SummaryView. |
ami/main/tests.py |
Adds tests covering required/accepted behavior for protected list endpoints and guards unaffected endpoints. |
Comments suppressed due to low confidence (1)
ami/main/api/views.py:926
DetectionViewSetopts intorequire_project_for_list, but the view never callsget_active_project()inlist()/get_queryset(). With DRF LimitOffsetPagination, the expensivequeryset.count()happens inpaginate_queryset()beforeget_paginated_response()callsview.get_active_project(), so an unfiltered request can still trigger a full-table COUNT even though it ultimately returns 400. To actually prevent the scan, forceself.get_active_project()to run before pagination (e.g., at the start oflist()or insideget_queryset()).
require_project_for_list = True # Unfiltered list scans are too expensive on this table
queryset = Detection.objects.exclude(NULL_DETECTIONS_FILTER).select_related("source_image", "detection_algorithm")
serializer_class = DetectionSerializer
filterset_fields = ["source_image", "detection_algorithm", "source_image__project"]
ordering_fields = ["created_at", "updated_at", "detection_score", "timestamp"]
def get_serializer_class(self):
"""
Return different serializers for list and detail views.
"""
if self.action == "list":
return DetectionListSerializer
else:
return DetectionSerializer
@extend_schema(parameters=[project_id_doc_param])
def list(self, request, *args, **kwargs):
return super().list(request, *args, **kwargs)
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
@coderabbitai ready for review |
|
✅ Actions performedFull review triggered. |
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (4)
ami/main/tests.py (2)
2479-2489:⚠️ Potential issue | 🔴 CriticalUpdate this test for the guarded occurrence list.
/api/v2/occurrences/is one of the hot list endpoints now requiringproject_id; this test still expects the old unfiltered 200 response and is failing CI.🐛 Proposed fix
- def test_no_project_id_returns_all(self): - """Without project_id, threshold falls back to 0.0 and returns all occurrences""" + def test_no_project_id_requires_project(self): + """Occurrence list requires project_id to avoid unfiltered hot-table scans.""" url = "/api/v2/occurrences/?limit=1000" res = self.client.get(url) - self.assertEqual(res.status_code, status.HTTP_200_OK) - - # Check that our test occurrences are present (don't assume all in DB are ours) - expected_ids = {occ.pk for occ in list(self.high_occurrences) + list(self.low_occurrences)} - returned_ids = {o["id"] for o in res.data["results"]} - - self.assertTrue(expected_ids.issubset(returned_ids), f"Missing occurrence IDs: {expected_ids - returned_ids}") + self.assertEqual(res.status_code, status.HTTP_400_BAD_REQUEST) + self.assertIn("project_id", res.data)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@ami/main/tests.py` around lines 2479 - 2489, The test test_no_project_id_returns_all must be updated because /api/v2/occurrences/ now requires a project_id; modify the test to supply a valid project_id query param (e.g., append ?project_id=<test_project.pk>&limit=1000 or include project_id in the existing URL construction) and adjust assertions to only expect occurrences for that project (build expected_ids from self.high_occurrences/self.low_occurrences filtered by that project or use the specific test_project occurrences), then assert the returned_ids contain that filtered expected_ids and that the response status remains HTTP_200_OK.
2315-2354:⚠️ Potential issue | 🔴 CriticalUpdate stale global summary assertions.
SummaryView.require_project=Truemakesglobal_urlreturn 400 now, so this test’s global-summary block contradicts the new contract and is failing CI. Keep the project-scoped visibility assertions, and remove or rewrite the global comparison as an explicit “requires project_id” assertion.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@ami/main/tests.py` around lines 2315 - 2354, The global-summary assertions are now stale because SummaryView.require_project=True; replace the existing global block that calls global_url and asserts 200 with assertions that the endpoint requires a project_id: call _auth_get(self.outsider, global_url) and _auth_get(self.member, global_url) and assert their status_code is 400 (or the view's error code) and optionally assert the response message indicates "requires project_id" (check outsider_global_response.json() / member_global_response.json() for the exact error field); leave the project-scoped checks (project_url, draft visibility) untouched and remove any comparisons between outsider_global_response and member_global_response that assume successful global summaries.ami/main/api/views.py (2)
909-926:⚠️ Potential issue | 🟠 MajorAdd missing
require_project_for_listflag and wire project scoping toDetectionViewSet.The code doesn't currently set
require_project_for_list = Trueand lacks aget_queryset()method to enforce project scoping. Without both, the/api/v2/detections/list endpoint bypasses project isolation despite havingproject_idas a filterable field.Add the flag and implement project filtering following the established pattern in
SourceImageViewSet:Proposed fix
class DetectionViewSet(DefaultViewSet, ProjectMixin): """ API endpoint that allows detections to be viewed or edited. """ + require_project_for_list = True # Unfiltered list scans are too expensive on this table queryset = Detection.objects.exclude(NULL_DETECTIONS_FILTER).select_related("source_image", "detection_algorithm") serializer_class = DetectionSerializer filterset_fields = ["source_image", "detection_algorithm", "source_image__project"] ordering_fields = ["created_at", "updated_at", "detection_score", "timestamp"] + def get_queryset(self): + qs = super().get_queryset() + project = self.get_active_project() + if project: + qs = qs.filter(source_image__project=project) + return qs + def get_serializer_class(self):🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@ami/main/api/views.py` around lines 909 - 926, DetectionViewSet is missing the require_project_for_list guard and a project-aware queryset; set require_project_for_list = True and implement get_queryset(self) (or override get_queryset in the class that currently defines queryset/select_related/serializer_class and list logic) to read the current project from self.request (or use the same project-extraction helper used by SourceImageViewSet), filter Detection.objects.exclude(NULL_DETECTIONS_FILTER) by source_image__project equal to the current project when a project is present, preserve the existing select_related("source_image", "detection_algorithm") and ordering/filterset behavior, and ensure list uses this filtered queryset so the /api/v2/detections/ endpoint enforces project scoping.
1814-1822:⚠️ Potential issue | 🟠 MajorEnsure
useStatushook contract matches API requirement: makeprojectIdmandatory or add error handling.The backend's
SummaryViewenforcesrequire_project = True, returning a 400 ValidationError whenproject_idis missing. However, theuseStatushook still declaresprojectId?: stringas optional.Current call sites (
summary.tsxanduseNavItems.ts) safely passprojectId, but the hook signature invites undefined values. WhenprojectIdis undefined,getFetchUrlomits theproject_idparameter, triggering a 400 response. Neither caller currently handles theerrorfield returned by the hook.Either make
projectIdrequired in the hook signature or ensure all callers explicitly handle the 400 error scenario.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@ami/main/api/views.py` around lines 1814 - 1822, SummaryView sets require_project = True (returns 400 if project_id missing) but the useStatus hook declares projectId?: string optional and its getFetchUrl can omit project_id causing 400s; update the hook signature to require projectId (change useStatus to accept projectId: string) and update all callers (summary.tsx and useNavItems.ts) to pass a non-null projectId, or alternatively keep the optional signature but add explicit error handling around getFetchUrl/returned error to handle 400 responses; ensure references to SummaryView/require_project and getFetchUrl are accounted for when making the change.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Outside diff comments:
In `@ami/main/api/views.py`:
- Around line 909-926: DetectionViewSet is missing the require_project_for_list
guard and a project-aware queryset; set require_project_for_list = True and
implement get_queryset(self) (or override get_queryset in the class that
currently defines queryset/select_related/serializer_class and list logic) to
read the current project from self.request (or use the same project-extraction
helper used by SourceImageViewSet), filter
Detection.objects.exclude(NULL_DETECTIONS_FILTER) by source_image__project equal
to the current project when a project is present, preserve the existing
select_related("source_image", "detection_algorithm") and ordering/filterset
behavior, and ensure list uses this filtered queryset so the /api/v2/detections/
endpoint enforces project scoping.
- Around line 1814-1822: SummaryView sets require_project = True (returns 400 if
project_id missing) but the useStatus hook declares projectId?: string optional
and its getFetchUrl can omit project_id causing 400s; update the hook signature
to require projectId (change useStatus to accept projectId: string) and update
all callers (summary.tsx and useNavItems.ts) to pass a non-null projectId, or
alternatively keep the optional signature but add explicit error handling around
getFetchUrl/returned error to handle 400 responses; ensure references to
SummaryView/require_project and getFetchUrl are accounted for when making the
change.
In `@ami/main/tests.py`:
- Around line 2479-2489: The test test_no_project_id_returns_all must be updated
because /api/v2/occurrences/ now requires a project_id; modify the test to
supply a valid project_id query param (e.g., append
?project_id=<test_project.pk>&limit=1000 or include project_id in the existing
URL construction) and adjust assertions to only expect occurrences for that
project (build expected_ids from self.high_occurrences/self.low_occurrences
filtered by that project or use the specific test_project occurrences), then
assert the returned_ids contain that filtered expected_ids and that the response
status remains HTTP_200_OK.
- Around line 2315-2354: The global-summary assertions are now stale because
SummaryView.require_project=True; replace the existing global block that calls
global_url and asserts 200 with assertions that the endpoint requires a
project_id: call _auth_get(self.outsider, global_url) and _auth_get(self.member,
global_url) and assert their status_code is 400 (or the view's error code) and
optionally assert the response message indicates "requires project_id" (check
outsider_global_response.json() / member_global_response.json() for the exact
error field); leave the project-scoped checks (project_url, draft visibility)
untouched and remove any comparisons between outsider_global_response and
member_global_response that assume successful global summaries.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 3110f623-f547-4a5b-b900-b3e0df9f15a9
📒 Files selected for processing (3)
ami/base/views.pyami/main/api/views.pyami/main/tests.py
…re pagination COUNT, update stale tests - ami/base/views.py: rename require_project/require_project_for_list comments to "Opt-in" so they read correctly given the False default (per Copilot). - ami/main/api/views.py: call self.get_active_project() at the top of DetectionViewSet.list() so the missing-project_id 400 fires before DRF pagination's queryset.count() runs a full-table scan. The other three hot viewsets already call get_active_project() inside get_queryset(). - ami/main/tests.py: update two pre-existing tests that relied on the old unguarded contract — global /status/summary/ is now 400, and /api/v2/occurrences/ without project_id is now 400. Co-Authored-By: Claude <noreply@anthropic.com>
|
Tested on staging. All UI routes to these "hot" endpoints appears to already pass |
Summary
Opt-in alternative to #1133. Adds a
require_project_for_listflag toProjectMixin(default False) and opts in only the four list endpoints that cause real damage when scanned unfiltered:ClassificationViewSetOccurrenceViewSetDetectionViewSetSourceImageViewSetAlso sets
require_project = TrueonSummaryViewand removes the unfilteredelsebranch.Context
On 2026-02-16 the Facebook crawler (
meta-externalagent) hit/api/v2/classifications/13k+ times in a day without a project filter. The resultingCOUNT(*)on the classifications table took ~3.6 min per request and dragged P99 latency past 2 min. The crawler is now blocked at nginx, but unfiltered list queries from any client remain a foot-gun on the four largest tables.Why opt-in, not opt-out
The draft in #1133 flips the
ProjectMixindefault torequire_project_for_list = True. That sounds narrower than it is:LimitOffsetPaginationWithPermissions(the global default pagination class,ami/base/pagination.py:29-30) callsview.get_active_project()on every paginated list response for anyProjectMixinviewset — not just viewsets that call it themselves. Flipping the default therefore affects ~20 viewsets, and silently breaks any of them that don't get an explicit opt-out, including:GET /api/v2/projects/(listing projects themselves — used on UI load)GET /api/v2/algorithms/,/api/v2/pipelines/(admin ML views)GET /api/v2/sites/,/api/v2/devices/,/api/v2/storage/Opt-in keeps the change scoped to the tables we actually want to protect and needs no audit of the other 15+ viewsets. The default-flip can come as a follow-up once those opt-outs are verified.
Changes
ami/base/views.pyrequire_project_for_list = FalsetoProjectMixin; enforce inget_active_project()whenaction == "list"ami/main/api/views.pyrequire_project_for_list = Trueon Classification/Occurrence/Detection/SourceImage viewsetsami/main/api/views.pyDetectionViewSet.list()callsself.get_active_project()up front so the 400 fires before DRF pagination'squeryset.count()runs a full-table scan. The other three hot viewsets already callget_active_project()insideget_queryset(), so the check lands before pagination's COUNT naturally.ami/main/api/views.pySummaryView:require_project = True+ delete unfilteredelsebranchami/main/tests.pyTestProjectRequiredOnListEndpoints: 4 hot endpoints 400/200, summary 400/200, regression guard for/projects/and/taxa/ami/main/tests.pyTestDraftProjectPermissions.test_summary_countsasserts 400 on global/status/summary/, andTestProjectDefaultThresholdFilter.test_no_project_id_requires_projectasserts 400 on unfiltered/api/v2/occurrences/Behavior
GET /api/v2/classifications/→ 400{"project_id": ["This field is required."]}GET /api/v2/classifications/?project_id=1→ 200GET /api/v2/classifications/123/→ 200 (detail view unaffected)GET /api/v2/projects/→ 200 (unchanged)GET /api/v2/taxa/→ 200 (unchanged)GET /api/v2/status/summary/→ 400GET /api/v2/status/summary/?project_id=1→ 200Test plan
TestProjectRequiredOnListEndpointstests pass (4 cases) — 400 on missing project_id for the 4 hot endpoints, 200 with project_id, regression guard for/projects/and/taxa/.docker compose -f docker-compose.ci.yml run --rm django python manage.py test ami.main.tests.TestDraftProjectPermissions.test_summary_counts ami.main.tests.TestProjectDefaultThresholdFilter ami.main.tests.TestProjectRequiredOnListEndpoints.project_id.Follow-ups (not in this PR)
main_detection(occurrence_id, timestamp DESC)for the /captures/ join hot path (separate migration PR).useStatushook typing —projectId?: string→projectId: string— so the 400 contract onSummaryViewis reflected at the call-site level (current callers already pass it).ProjectMixinviewsets if/when we flip the mixin default.