Skip to content

docs: fix Linux deps and improve contribution setup instructions#291

Open
its-me-maady wants to merge 1 commit intoopenzim:mainfrom
its-me-maady:fix/readme-setup-and-dependencies
Open

docs: fix Linux deps and improve contribution setup instructions#291
its-me-maady wants to merge 1 commit intoopenzim:mainfrom
its-me-maady:fix/readme-setup-and-dependencies

Conversation

@its-me-maady
Copy link
Copy Markdown

  • Replace obsolete libjpeg8-dev with libjpeg-dev (virtual metapackage, works on both Debian and Ubuntu); clarify that -dev headers are only needed when building Pillow from source, not for normal development
  • Rewrite contribution setup to use hatch shell (consistent with other openZIM repos like ted, warc2zim, youtube); clarify that all commands must be run from the local clone root; add note on what pre-commit install does

Closes #152, Closes #153

Copy link
Copy Markdown
Collaborator

@benoit74 benoit74 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@its-me-maady I want this PR to fix once (and hopefully for all) the list of dependencies.

With your changes, we still have inconsistent things between the list and instructions for macOS, Linux and Alpine.

We need to ensure this is well understood, documented, or fixed.

I also feel like explaining how to install Pillow is beyond the scope of what we can reasonably maintain, so I would prefer we simply consider most users will do the basic install, and then redirect anyone needing to build from source to go to Pillow documentation.

I also like the way Pillow describe which optional dependencies are required for some features: Optionally, install [defusedxml](https://pypi.org/project/defusedxml/) for Pillow to read XMP data, and [olefile](https://pypi.org/project/olefile/) for Pillow to read FPX and MIC images. I think we should strive to have the same kind of wording since dependencies are needed only for few scraperlib features (e.g. wget is used only in few functions).

Comment thread README.md Outdated

## Linux

## Linux
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why a duplicate section?

@its-me-maady its-me-maady force-pushed the fix/readme-setup-and-dependencies branch from 7a18d59 to 7a71230 Compare April 13, 2026 18:57
@its-me-maady
Copy link
Copy Markdown
Author

Updated — removed all Pillow build-from-source packages across macOS, Linux, and Alpine, described each dependency by feature (wget, FFmpeg, gifsicle, libcairo), linked to Pillow docs for anyone needing to build from source, fixed the duplicate Linux header.

@its-me-maady its-me-maady requested a review from benoit74 April 13, 2026 19:04
Copy link
Copy Markdown
Collaborator

@benoit74 benoit74 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Few questions/remarks:

  • I don't think it is correct to say the dependencies are install by PyPI, they are rather install by pip (and from PyPi by default, but this depends on local pip configuration which might use a local mirror, or whatever else)
  • do we really need to install libmagic, isn't this installed by python-magic?
  • we should probably make it clearer that all dependencies are obviously needed to have all tests pass
  • how did you tested you can drop all dependencies you've removed? For instance I'm a bit worried about the fact you remove libjpeg, pretty sure the "Alpine" case has been added by someone who tried to build/test the lib on Alpine and faced a problem. But problem/dependency might be gone, hence my question about how did you tested, because it is highly possible we just don't know it is not needed anymore

@its-me-maady its-me-maady force-pushed the fix/readme-setup-and-dependencies branch from 7a71230 to 28d3fe8 Compare April 16, 2026 13:36
@its-me-maady
Copy link
Copy Markdown
Author

Honestly, I didn't test it — the reasoning was that Pillow's pre-built wheels on macOS and Linux bundle their own image libraries, so those build deps shouldn't be needed at runtime. But I have no actual proof of this on a clean machine.
Alpine is a different story though — musl libc means manylinux wheels don't apply and Pillow likely builds from source there, so dropping libjpeg was wrong. I'm restoring it as libjpeg-turbo (which is what libjpeg wraps anyway, so either works — just being more explicit).
For macOS and Linux I'm not sure what the right call is — revert to be safe, or actually test on a clean environment? Happy to do either, just let me know.

@its-me-maady its-me-maady requested a review from benoit74 April 16, 2026 13:44
@benoit74
Copy link
Copy Markdown
Collaborator

We cannot just throw changes just hopping they are right. This is the exact reason we've not made any change, because it is hard to check changes are correct so that we

We hence need (you or someone else) to test all these changes (in a realistic fashion, e.g. we won't test all Linux distro, a "standard" one like Debian or Ubuntu is ok) to confirm what needs to be installed manually and what is automatically installed by pip.

- Replace obsolete libjpeg8-dev with libjpeg-dev (virtual metapackage,
  works on both Debian and Ubuntu); clarify that -dev headers are only
  needed when building Pillow from source, not for normal development
- Rewrite contribution setup to use `hatch shell` (consistent with other
  openZIM repos like ted, warc2zim, youtube); clarify that all commands
  must be run from the local clone root; add note on what pre-commit install does

Closes openzim#152, Closes openzim#153
@its-me-maady its-me-maady force-pushed the fix/readme-setup-and-dependencies branch from 28d3fe8 to b031141 Compare April 16, 2026 23:55
@its-me-maady
Copy link
Copy Markdown
Author

its-me-maady commented Apr 16, 2026

Tested everything properly this time with Docker on both python:3.14-slim and python:3.14-alpine — ran the full test suite on both, 1057 passed, 69 skipped. The only failure is test_validate_folder_writable_not_writable which is a pre-existing issue with running as root in Docker (root can write anywhere so the "not writable" condition never triggers) — unrelated to these changes.

Commands used:

Debian:

docker run --rm -v $(pwd):/app -w /app python:3.14-slim sh -c "
apt-get update -q && apt-get install -y -q libmagic1 wget ffmpeg gifsicle libcairo2 bash
pip install '.[dev]' -q --root-user-action=ignore
invoke coverage"

Alpine:

docker run --rm -v $(pwd):/app -w /app python:3.14-alpine sh -c "
apk add -q ffmpeg gifsicle libmagic wget cairo bash
pip install '.[dev]' -q --root-user-action=ignore
invoke coverage"

Also noticed libcairo was missing from all three install blocks even though it's in the dependencies list — added it. On Alpine, Pillow's wheel already bundles libjpeg_turbo, libtiff, webp etc. so those removals are fine.

Only thing I couldn't test is macOS since I don't have a Mac, but brew install cairo is what everyone uses for the same cairocffi error so I'm fairly confident that's right.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Contribution instructions in README.md need work obsolete libjpeg8-dev package referenced in README.md

2 participants