Re-ticketed from Slack msg
Currently, this gets scraped:
"PageImages": [
"https://www.services.rcmp-grc.gc.ca/missing-disparus/showImage?id=2064&thumb=medium",
"https://www.services.rcmp-grc.gc.ca/missing-disparus/showImage?id=2064&thumb=small",
"https://www.services.rcmp-grc.gc.ca/missing-disparus/showImage?id=2065&thumb=medium",
"https://www.services.rcmp-grc.gc.ca/missing-disparus/showImage?id=2065&thumb=small"
],
but this would actually return the full image:
https://www.services.rcmp-grc.gc.ca/missing-disparus/showImage?id=2065
We should strip the thumb query param from each image URL, then dedup the resulting list so each full image appears once.
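A rough sketch of what that cleanup could look like, assuming the scraper is in Python and PageImages is a plain list of URL strings. The function name full_image_urls is hypothetical and not taken from the existing codebase; it only uses the standard library.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def full_image_urls(urls):
    """Drop the `thumb` query param and remove duplicates, keeping first-seen order.

    Hypothetical helper for illustration; not part of the existing scraper.
    """
    seen = set()
    result = []
    for url in urls:
        parts = urlsplit(url)
        # Keep every query parameter except `thumb`.
        query = [
            (key, value)
            for key, value in parse_qsl(parts.query, keep_blank_values=True)
            if key != "thumb"
        ]
        cleaned = urlunsplit(parts._replace(query=urlencode(query)))
        # After stripping, the medium and small variants collapse to the same URL.
        if cleaned not in seen:
            seen.add(cleaned)
            result.append(cleaned)
    return result


if __name__ == "__main__":
    base = "https://www.services.rcmp-grc.gc.ca/missing-disparus/showImage"
    scraped = [
        f"{base}?id=2064&thumb=medium",
        f"{base}?id=2064&thumb=small",
        f"{base}?id=2065&thumb=medium",
        f"{base}?id=2065&thumb=small",
    ]
    for url in full_image_urls(scraped):
        print(url)
```

Stripping before deduping matters here: the medium and small thumbnail URLs only become identical once the thumb param is gone.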