Repository Provider, Memory Cap and Source Checksum Features #93
If the provider is appended, should it not only be fetched if a recipe for a target is missing? Generally, this seems like a very useful feature!
The entire repository defined by the package needs to be fetched once it is encountered in the dependency tree and should be processed in the order in which packages appear in the list and before other packages are processed. The idea is to be able to version entire repositories (and defaults found therein) just like any other package. This would also remove the need to specify an explicit search_path in bits.rc (and we might even remove the need for bits.rc in the future).
…e to the current architecture in target installation directory
Ok, I see. Thanks for the explanation!
Updated acknowledgement section and configuration paths.
New features addressing all currently open issues
This is a major upgrade to bits that adds many new features while still aiming to remain backwards compatible. It introduces six major features — CVMFS publishing pipeline integration, persistent workDir cache management, repository provider discovery, source checksum verification, memory-aware job capping, and build manifests — together with a full Makeflow parallel-build overhaul, comprehensive documentation, and an extensive new test suite (19 new test files).
New commands
`bits cleanup`
Evicts stale packages from the persistent workDir used by build runners. Two eviction modes may be combined:
- `--max-age N`: removes packages whose sentinel mtime is older than N days. Intended as a nightly cron job on the build runner host.
- `--min-free G`: removes least-recently-used packages until free space exceeds G GiB. Called as a pre-build hook in the CI pipeline.
Concurrency safety is implemented via `flock` advisory locks on sentinel files. A package actively in use by a concurrent build is never evicted; the OS releases stale locks automatically so no cleanup on job start is needed.
`bits publish`
Copies a built package's immutable INSTALLROOT to a scratch directory, relocates all embedded paths to the final CVMFS target, streams the result to the ingestion spool (`incoming/<pkg-id>/` + `.done` sentinel), and removes the scratch copy. The original INSTALLROOT is never modified. When `--cvmfs-prefix` was passed at build time the relocation step is skipped entirely (`--no-relocate`).
Docker/CVMFS no-relocation builds
A new `--cvmfs-prefix PATH` flag for `bits build --docker` bind-mounts the persistent workDir at the final CVMFS releases path inside the container (e.g. `/cvmfs/sft.cern.ch/lcg/releases`). Packages therefore compile with their deployment RPATHs already embedded, eliminating the relocation step at publish time. `bits publish --no-relocate` is used in conjunction to stream pre-positioned artifacts directly to the spool without path rewriting.
Repository provider feature
A recipe can declare `provides_repository: true` to make its `source:` URL a recipe repository rather than source code. When bits encounters such a package during dependency resolution it clones the repo into `$BITS_WORK_DIR/REPOS/<pkg>/<hash>/`, prepends or appends it to the search path (controlled by `repository_position:`), and restarts dependency scanning. The process repeats until the graph is stable, supporting nested providers. The provider's commit hash is folded into the build hash of every package whose recipe came from that provider, so upgrading a provider triggers a rebuild of all affected packages even when recipe text is unchanged. Always-on providers (loaded unconditionally before dependency resolution) are also supported.
The default provider registry is bits-providers, a repository of `provides_repository: true` recipes covering the major HEP experiment stacks (LCG/EP-SFT, ALICE, LHCb, Key4Hep, and the common HEP library set). Each recipe points to the corresponding `*.bits` recipe repository. The repository also carries a `registry.json` index consumed by bits-console to populate the Package Browser with human-readable names, descriptions, and tags for each known stack. Groups can maintain a private fork of bits-providers to add internal recipe repositories without touching the upstream registry.
Source and patch checksum verification
Checksums can be declared inline in recipe `sources:` and `patches:` entries using a comma-suffix (`url,sha256:...`) or in a separate `checksums/<pkg>.checksum` file per recipe repository. The external file wins over inline entries when both exist. Three enforcement levels are available and can be set per-recipe or globally via CLI flag:
- `--check-checksums`: verify declared checksums; warn on mismatch.
- `--enforce-checksums`: verify and abort on mismatch; also abort when any source or patch carries no checksum declaration.
- `--print-checksums`: compute and print checksums in ready-to-paste YAML format.
The checksum store also supports pinning the expected git commit SHA (`tag:` field), verified after checkout.
Memory-aware job capping
A new `mem_per_job` recipe field specifies the expected peak RSS per parallel compiler process (plain integer = MiB, or string with unit: `"2 GiB"`). An optional `mem_utilisation` field (default 0.9) caps the fraction of available memory bits may commit. When set, `$JOBS` is automatically lowered so the total committed memory stays within the available budget, preventing kernel swap on memory-hungry packages such as LLVM or ROOT.
Build manifests
Every build run writes an incremental JSON manifest to `$WORK_DIR/MANIFESTS/bits-manifest-<timestamp>.json` (with a `latest.json` symlink updated atomically after each package). The manifest records the bits version, architecture, defaults, provider list with commit hashes, and per-package entries including checksums and CVMFS target paths. A completed manifest can be replayed exactly with `bits build --from-manifest`.
Makeflow parallel-build improvements
- `--makeflow-jobs N` flag caps the number of parallel Makeflow build jobs on the local machine.
- `--parallel-sources N` flag enables concurrent source URL downloads within a single package.
- `checkout_runner.py` module allows source checkouts to run as independent, fully parallel Makeflow tasks rather than sequentially during the Python preparation phase.
- Remote store backends (`sync.py`) gain `upload_shell_command()` methods that emit inline shell commands for Makeflow `.upload` rules, allowing artefact upload to run asynchronously and in parallel with the next build stage.
- `fetch_source()` and `upload_source()` for source archive caching. The CVMFS backend is read-only for uploads; S3 and rsync backends support both directions.
Architecture and utility improvements
- `pkg_to_shell_id()`: derives a valid shell identifier from any package name.
- `effective_arch()` / `compute_combined_arch()`: correct handling of architecture-independent packages and `qualify_arch` suffixes.
- `ver_rev()`: centralises version-revision directory segment computation including `force_revision` from defaults profiles.
- `resolve_pkg_family()`: resolves package family from defaults metadata; correctly excludes `defaults-*` pseudo-packages.
- `getPackageList()`: extended to accept `provider_dirs` so that recipes sourced from provider checkouts are tagged with `recipe_provider` and `recipe_provider_hash`.
Documentation
- `README.md` (new, Markdown) and `README.rst` (revised): quick-start guide covering installation, basic commands, configuration, and recipe writing.
- `REFERENCE.md` (new, ~3 700 lines): comprehensive reference covering all commands, recipe fields, remote store backends, Docker support, CVMFS publishing pipeline, build manifest, defaults profiles, and design principles.
- `WORKFLOWS.md` (new): phase-by-phase development-to-deployment walkthrough from local build through group CI and CVMFS publication via bits-console, with ASCII end-to-end summary.
- `docs/bits-workflow.svg`: visual overview of the full workflow with shared recipe repos, developer workstation, GitLab CI pipeline (group-admin vs individual-user namespaces), and two parallel CVMFS publication paths.
Packaging and infrastructure
- … `alidist` to `build-bits` in `pyproject.toml` and `setup.py`.
- `botocore` added to `requirements.txt` for S3 backend support.
- `tox.ini` updated: README renderer updated for Markdown.
Tests
19 new test files and 4 expanded existing files covering all new features:
- `test_always_on_providers.py`
- `test_async_build.py`
- `test_checksum.py`
- `test_checksum_store.py`
- `test_cleanup.py`
- `test_container_workdir.py`: `--cvmfs-prefix` workDir bind-mount in Docker
- `test_defaults_requires_provider.py`
- `test_download_sentinels.py`
- `test_manifest.py`
- `test_memory.py`: `mem_per_job` parsing and job capping logic
- `test_new_args.py`
- `test_package_family.py`
- `test_pkg_to_shell_id.py`
- `test_provider_staleness.py`
- `test_qualify_arch.py`
- `test_repo_provider.py`
- `test_shared_arch.py`
- `test_source_cache.py`
- `test_store_integrity.py`
Related repositories
- `provides_repository: true` recipes for LCG, ALICE, LHCb, Key4Hep, and common HEP stacks, plus `registry.json` for the bits-console Package Browser.
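
For readers who want a concrete picture of the restart-until-stable provider resolution described under "Repository provider feature", here is a minimal Python sketch. It is not the bits implementation: recipe repositories are modelled as plain dictionaries instead of git clones, the helper names `find_recipe` and `resolve` are hypothetical, and treating "append" as the default `repository_position` is an assumption. It only illustrates the idea that encountering a provider extends the search path and restarts the scan.

```python
# Hypothetical in-memory model: repo name -> {package name -> recipe dict}.
# Real providers are git repositories cloned by bits; none of these helper
# names are taken from the bits code base.

def find_recipe(pkg, search_path, repos):
    """Return (repo_name, recipe) from the first repo on the path defining pkg."""
    for repo_name in search_path:
        recipe = repos[repo_name].get(pkg)
        if recipe is not None:
            return repo_name, recipe
    raise KeyError(f"no recipe for {pkg!r} on search path {search_path}")


def resolve(root, search_path, repos):
    """Scan dependencies, restarting whenever a provider extends the search path."""
    search_path = list(search_path)
    while True:
        resolved = {}  # package name -> repo that supplied its recipe
        stack = [root]
        path_changed = False
        while stack:
            pkg = stack.pop()
            if pkg in resolved:
                continue
            repo_name, recipe = find_recipe(pkg, search_path, repos)
            resolved[pkg] = repo_name
            if recipe.get("provides_repository"):
                provided = recipe["source"]
                if provided not in search_path:
                    # Assumption: "append" is used when no position is given.
                    if recipe.get("repository_position", "append") == "prepend":
                        search_path.insert(0, provided)
                    else:
                        search_path.append(provided)
                    path_changed = True
                    break  # restart scanning with the extended search path
            # Reverse so that dependencies are visited in declaration order.
            stack.extend(reversed(recipe.get("requires", [])))
        if not path_changed:
            return search_path, resolved


repos = {
    "main": {
        "top": {"requires": ["hep-provider", "ROOT"]},
        "hep-provider": {"provides_repository": True, "source": "hep.bits"},
    },
    "hep.bits": {"ROOT": {"requires": ["zlib"]}, "zlib": {}},
}
path, resolved = resolve("top", ["main"], repos)
print(path)
print(resolved["ROOT"])
```

In this toy setup `ROOT` has no recipe in `main`; it only resolves because `hep-provider` is listed before it and pulls `hep.bits` onto the search path during the first scan. The loop terminates because each restart adds a repository that was not yet on the path.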