Currently the repository at https://download.opensuse.org/distribution/leap/16.0/repo/oss/ provides, as Leap 15.x did, repository metadata covering all four primary architectures (aarch64, x86_64, ppc64le, s390x). This increases both the size and the complexity of the repository metadata. Just parsing primary.xml today takes excessive time:
xmllint --timing --stream 0ad0e264fcbb98937b6b8d6e6bb8230d3d370fb444c6bd45bc43af5389d8c5ae0c22fec9862aaac8af5e978de8f51a53ad981ced57b212184988d29efd78e8a3-primary.xml
Parsing took 1396 ms
This is the timing on a relatively modern machine; it can easily be 5 s or more on less powerful hardware. With single-arch metadata, the timing would be closer to 350 ms, which is a significant win. Of course libxml2 could be optimized, but that is unlikely to yield more than a few percent here and there, because the project has a very wide, standards-compliant focus and parsing speed is secondary.
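To get a feel for how much of a multi-arch primary.xml a single-arch client actually needs, one could count package entries per architecture. This is only a minimal sketch: it assumes the standard rpm-md "common" namespace, and the inline SAMPLE document is made up for illustration, not taken from the real Leap 16.0 metadata.

```python
import io
import xml.etree.ElementTree as ET
from collections import Counter

# Namespace used by rpm-md primary.xml files.
COMMON_NS = "{http://linux.duke.edu/metadata/common}"


def count_packages_by_arch(source):
    """Stream through a primary.xml file object and count <package> entries per <arch>."""
    counts = Counter()
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == COMMON_NS + "package":
            arch = elem.findtext(COMMON_NS + "arch", default="unknown")
            counts[arch] += 1
            elem.clear()  # drop the parsed subtree to keep memory bounded
    return counts


# Made-up sample; a real primary.xml carries many more fields per package.
SAMPLE = b"""<metadata xmlns="http://linux.duke.edu/metadata/common" packages="3">
  <package type="rpm"><name>a</name><arch>x86_64</arch></package>
  <package type="rpm"><name>b</name><arch>aarch64</arch></package>
  <package type="rpm"><name>c</name><arch>noarch</arch></package>
</metadata>
"""

if __name__ == "__main__":
    print(count_packages_by_arch(io.BytesIO(SAMPLE)))
```

Running the same function against an uncompressed primary.xml opened in binary mode would show which share of the entries belongs to foreign architectures.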
So basically move to a Fedora/RHEL style repository structure where you have independent repositories for each architecture?
Honestly, that would be great. It would also make the Mock configs for SUSE distributions so much less complicated.
Metadata Update from @Pharaoh_Atem: - Issue set to the milestone: 16.0
Hello Dirk, this is not so easy at the moment. We'd have to build Leap in a similar way to Tumbleweed (in separate projects).
We'll discuss this on the next rel-eng call https://etherpad.opensuse.org/p/ReleaseEngineering-20250219
The conflict here is between storage needs (especially on mirrors) and primary.xml size.
For example, Backports SLE-15-SP7 x86_64 currently has:
23 GB noarch
15 GB x86_64
If we split the archs and push them to different places at different times, noarch rpms get duplicated for every architecture by default. src.rpms most likely also need to be duplicated if we want them to match the exact rpm binaries at all times.
An alternative solution would be to enhance rpmmd.xml and at least the createrepo_c and zypp tooling. We could have primary_$arch files so that only the matching architecture would be downloaded (we could also switch to JSON there to speed up parsing). The full primary xml could stay to support clients that cannot handle the per-architecture ones.
(Since such an implementation is backward compatible, it could even be rolled out for existing SLE 15 repos.)
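The backward-compatible lookup described above could be sketched roughly as follows. This is an assumption-laden illustration: the `primary_<arch>` data type name and the file names in SAMPLE are hypothetical and not part of any existing createrepo_c or libzypp release; only the `repomd` namespace and the `data`/`location` layout follow the existing rpm-md index format.

```python
import xml.etree.ElementTree as ET

# Namespace used by the rpm-md repomd.xml index.
REPO_NS = "{http://linux.duke.edu/metadata/repo}"


def pick_primary_href(repomd_xml, arch):
    """Return the href of the arch-specific primary file, or of the full one."""
    root = ET.fromstring(repomd_xml)
    hrefs = {}
    for data in root.findall(REPO_NS + "data"):
        location = data.find(REPO_NS + "location")
        if location is not None:
            hrefs[data.get("type")] = location.get("href")
    # "primary_<arch>" is a hypothetical type name used only in this sketch.
    arch_specific = hrefs.get("primary_" + arch)
    if arch_specific is not None:
        return arch_specific
    # Clients on other architectures (or old repos) fall back to the full primary.
    return hrefs["primary"]


# Made-up index with one hypothetical per-arch entry.
SAMPLE = """<repomd xmlns="http://linux.duke.edu/metadata/repo">
  <data type="primary"><location href="repodata/primary.xml.gz"/></data>
  <data type="primary_x86_64"><location href="repodata/primary_x86_64.xml.gz"/></data>
</repomd>"""

if __name__ == "__main__":
    print(pick_primary_href(SAMPLE, "x86_64"))
    print(pick_primary_href(SAMPLE, "s390x"))
```

A client that does not know about the extra entries would simply keep reading the `primary` entry, which is what makes the scheme backward compatible.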
The way this is typically solved is by hard-linking everything. And rsync can ensure that hard-link is carried over when mirroring.
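As a rough illustration of the hard-linking idea, the sketch below links one noarch package into several per-architecture directories and checks that all paths share a single inode, so the payload is stored once. The directory layout and file name are made up for the example. On the mirroring side, rsync's `-H` (`--hard-links`) option preserves such links, but only among files that are part of the same transfer.

```python
import os
import tempfile


def link_into_arch_dirs(src_path, arch_dirs):
    """Hard-link src_path into each directory and return the new paths."""
    created = []
    for directory in arch_dirs:
        os.makedirs(directory, exist_ok=True)
        dst = os.path.join(directory, os.path.basename(src_path))
        os.link(src_path, dst)  # new directory entry, same inode, no extra data
        created.append(dst)
    return created


def demo():
    """Return (number of distinct inodes, link count of the original file)."""
    with tempfile.TemporaryDirectory() as root:
        # Hypothetical package name, used only for this illustration.
        src = os.path.join(root, "noarch", "example-1.0-1.noarch.rpm")
        os.makedirs(os.path.dirname(src))
        with open(src, "wb") as handle:
            handle.write(b"placeholder payload")
        arch_dirs = [os.path.join(root, arch) for arch in ("x86_64", "aarch64")]
        links = link_into_arch_dirs(src, arch_dirs)
        inodes = {os.stat(path).st_ino for path in [src, *links]}
        return len(inodes), os.stat(src).st_nlink


if __name__ == "__main__":
    print(demo())
```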
Please don't do this.
Also, SRPMs can be their own "arch" rather than duplicated everywhere.
Doable, but the clear disadvantage is that tools using local metadata after a zypper ref will break, e.g. Suma's repo-sync. They expect to see all the packages the repo offers.
Well, tools that are not adapted would still see the original primary.xml containing everything. Only adapted tools would get the speedup, though.
However, I have now implemented a mechanism to generate multiple repodata directories during a build. Arch-specific repodata lives in the architecture subdirectories (e.g. x86_64/repodata/... contains only x86_64 and noarch references). The disadvantage is that no single URL works for all architectures anymore, and more tools that modify the repodata will likely need to be adapted.
@Pharaoh_Atem: mirrors reported that they lose the hardlinks due to independent rsync runs for each directory, and therefore drop all non-x86_64 archs.
Do we not provide guidance on rsync and how to sync rsync modules? There are definitely ways to avoid this problem with the right flags. We're the only RPM distribution that doesn't split things up like this, so if other distributions aren't seeing this problem en masse, then I don't think it's that significant of a problem.
@lkocman Will you update the openSUSE services to use the ../${basearch} URLs then? This would IMO be smarter than letting zypper use the split primaries. Plugins downloading the repo's changes and filelists would benefit from being directed to the smaller arch-specific versions.
This is the plan, Michael; however, we have to have a working Leap 16.0 build first. The recent builds are still red because of libzypp / https://bugzilla.suse.com/show_bug.cgi?id=1237172
https://github.com/openSUSE/openSUSE-repos/pull/77
The new setup is as follows:
The ftp-trees are built as part of https://build.opensuse.org/package/show/openSUSE:Leap:16.0/000productcompose
while offline media is built separately in 000productcompose.all, as it utilizes the all feature. https://build.opensuse.org/package/show/openSUSE:Leap:16.0/000productcompose.all
Isn't it the other way around?
000productcompose.all is a single build building the ftp tree (aka online rpmmd repository).
000productcompose is building the iso files for offline installation via multibuild flavors.
Btw, it may make sense to rename 000productcompose.all; it was just my first shot. The name is maybe not helpful, and in the end it has no influence on the build.
https://lists.opensuse.org/archives/list/factory@lists.opensuse.org/thread/PBIS3DRT5GGXJXKOIOZXBZUDNEDDYADA/
Alpha users that were affected can "easily" grab binaries directly from https://build.opensuse.org/projects/Base:System/packages/openSUSE-repos:openSUSE-repos-Leap/repositories/16.0/binaries
Closing.
The legacy repodata at https://download.opensuse.org/distribution/leap/16.0/repo/oss/repodata/ will continue to exist alongside
https://download.opensuse.org/distribution/leap/16.0/repo/oss/x86_64/repodata/ or, generally, repo/$basearch/repodata.
This will help ease migration. Users are advised to use $basearch/repodata.
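For tools that build repository URLs themselves, the two layouts could be handled with a small helper like the one below. It is a sketch only: the function name `repomd_url` is invented here, and it assumes the index file is the usual `repomd.xml` inside each repodata directory.

```python
from typing import Optional


def repomd_url(repo_base: str, basearch: Optional[str] = None) -> str:
    """Build the repomd.xml URL for the legacy or the arch-specific layout."""
    base = repo_base.rstrip("/")
    if basearch:
        # Arch-specific layout: <base>/<basearch>/repodata/
        return f"{base}/{basearch}/repodata/repomd.xml"
    # Legacy layout covering all architectures: <base>/repodata/
    return f"{base}/repodata/repomd.xml"


if __name__ == "__main__":
    oss = "https://download.opensuse.org/distribution/leap/16.0/repo/oss/"
    print(repomd_url(oss, "x86_64"))
    print(repomd_url(oss))
```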
Metadata Update from @lkocman: - Issue close_status updated to: Completed - Issue status updated to: Closed (was: Open)