Q: What is sbosrcarch?
A: sbosrcarch is "The SlackBuilds.Org Source Archive". It contains copies
of the source files listed in the .info files for all (or almost all)
the builds on SlackBuilds.org.
sbosrcarch is also the name of the software that created and maintains
the archive (more about this later, near the end of this FAQ).
Q: What is sbosrcarch for?
A: It's intended to be a backup location for source files that can't be
downloaded from their original URLs. This happens mainly for these reasons:
- The upstream web site goes down, is moved, or has connectivity
issues (intermittent or long-term).
- Upstream moves or removes the source when they release a new version.
Also, the archive is hosted on a fast, well-connected host. Sometimes
you might choose to use the archive just for faster downloads.
A side benefit of the archiving process is that the archive maintenance
software produces a log of failed downloads, which can then be sent
to the slackbuilds-users mailing list and/or build maintainer so it
can be fixed quickly.
Q: Who is responsible for sbosrcarch?
A: The archive server is operated by Darren Austin, aka "Tadgy"
on Freenode IRC. The archive script was written by B. Watson, aka
"Urchlay" on Freenode. Both of us keep an eye on the logs and keep the
archive healthy.
The best way to contact us is using an IRC client to connect to
Freenode and join the ##slackware or #slackbuilds channel.
We can also be reached by email:
B. Watson <yalhcru@gmail.com>
Darren Austin <mirrors (at) slackware.uk>
Please read this entire FAQ before asking us questions. Chances are,
you'll find the answer here. If not, or if the answer isn't clear
enough, we'll be happy to help.
Note that the SlackBuilds.org team is NOT responsible for the
archive. PLEASE don't bother them with questions about sbosrcarch,
they're already busy enough maintaining the actual SlackBuilds site!
Same goes for individual build maintainers.
Q: Why create a giant archive like this? Isn't it better to fix the
SlackBuilds whose sources can't be downloaded?
A: Sort of. Yes, if a SlackBuild references a no-longer-existing
source download URL, it should be updated. Usually the SlackBuild
maintainer is responsible for this. Sometimes the SBo admins take
care of it instead. Sometimes, it takes longer than expected to
update a SlackBuild: the new version uses a different build system,
or requires some dependency to be updated first, or the maintainer
is too busy with Real Life and can't spare the time just at the moment.
Once the build is updated, it still doesn't appear instantly on the
site. It has to sit in the "pending" queue until it's been reviewed by
the admins, and then in the "ready" queue until the next public update.
The SBo update process is complex, and requires coordination between
the various admins. Generally this means that site updates ("Public
www update" in the git log) only happen once a week.
During the time it takes for the SlackBuild to get updated for the
new download URL (and possibly new version), users won't be able to
download the source as listed on the SBo site.
That's what the archive is mainly intended for. It's a fallback,
a stop-gap solution, that allows builds to keep working during the
period between the source disappearing and the build being updated.
Usually this is only a week or less, but sometimes things slip through
the cracks...
Q: How do I use the archive?
A: Several answers here:
- Using a tool that supports the archive, such as sbopkg or sbotools.
This is by far the easiest way: they automatically use the archive
if they need to, without you having to do any extra work.
- Manually with a web browser. The easy way is to start at:
http://slackware.uk/sbosrcarch/by-name/
...which shows a list of category directories (academic, accessibility,
audio, etc). Choose a category, then within the category
you'll see a list of build name directories. Each of these will
contain the source file(s) for the build.
Example: you can't download the source to system/atari800
from its original URL, so you go to the by-name page, click on
"system", then "atari800". There you'll see the file you wanted,
atari800-3.1.0.tar.gz (unless it's been updated since I wrote this).
- With a download tool like wget or curl. You could do this using the
same by-name tree as you would for manual lookups, but it's better to
do this by md5sum. The base URL for this is:
https://slackware.uk/sbosrcarch/by-md5/
In the build's .info file, take the 'filename' part of each download
URL. Example: "atari800-3.1.0.tar.gz", where the link is
http://downloads.sourceforge.net/project/atari800/atari800/3.1.0/atari800-3.1.0.tar.gz
Now take the MD5SUM (or MD5SUM_x86_64 if you're using DOWNLOAD_x86_64).
Use each of its first two characters as a nested subdirectory name,
followed by the full md5sum. Example: we have
MD5SUM="354f8756a7f33cf5b7a56377d1759e41"
in the .info file. The directory for this would be:
3/5/354f8756a7f33cf5b7a56377d1759e41
Add this to the base URL and get:
https://slackware.uk/sbosrcarch/by-md5/3/5/354f8756a7f33cf5b7a56377d1759e41/
Now add the filename part from DOWNLOAD or DOWNLOAD_x86_64, and you get:
https://slackware.uk/sbosrcarch/by-md5/3/5/354f8756a7f33cf5b7a56377d1759e41/atari800-3.1.0.tar.gz
This is the exact URL for the file, if it's actually present in the
archive. Most likely, it will be, and your download will succeed. If
the download fails, the file's not in the archive.
Of course, all these steps should be automated. You'll end up writing
a script in your favorite language to do the job. Or:
- Using the sbosrc script
Same as above, except someone's already written it for you. Download
it here:
https://slackware.uk/~urchlay/repos/sbostuff/plain/sbosrc
...or, it'd be better to use git:
git clone https://slackware.uk/~urchlay/repos/sbostuff
Make it executable (chmod +x) and place it somewhere on your $PATH,
such as /usr/local/bin.
Whenever you need to download something from the archive, change
to the directory containing the .info file (same place as the
.SlackBuild) and just run:
sbosrc
...which will check the current architecture (32-bit or 64-bit),
parse the info file, calculate the URL as above, and download the
file to the current directory.
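
For illustration, here's a minimal Python sketch of the by-md5 lookup
described above. It is NOT the sbosrc script, just an approximation of
the same steps. The function names (parse_info, archive_urls) are made
up for this example, the .info parsing is simplified, and falling back
to DOWNLOAD when DOWNLOAD_x86_64 is empty is an assumption. The base URL
uses the https form from the worked example.

```python
import re
from urllib.parse import urlsplit

BASE_URL = "https://slackware.uk/sbosrcarch/by-md5/"

# Trimmed-down .info contents, using the atari800 values from this FAQ.
EXAMPLE_INFO = '''PRGNAM="atari800"
DOWNLOAD="http://downloads.sourceforge.net/project/atari800/atari800/3.1.0/atari800-3.1.0.tar.gz"
MD5SUM="354f8756a7f33cf5b7a56377d1759e41"
DOWNLOAD_x86_64=""
MD5SUM_x86_64=""
'''


def parse_info(text):
    """Return {KEY: [words]} for every KEY="..." assignment in a .info file."""
    fields = {}
    for key, value in re.findall(r'^(\w+)="([^"]*)"', text, flags=re.MULTILINE):
        # Builds with several sources continue the value on the next line
        # with a trailing backslash, so drop the backslashes before splitting.
        fields[key] = value.replace("\\", " ").split()
    return fields


def archive_urls(info_text, use_x86_64=False):
    """Build the by-md5 archive URL for each source listed in the .info text."""
    fields = parse_info(info_text)
    # Assumption: use the x86_64 fields only when they are non-empty.
    suffix = "_x86_64" if use_x86_64 and fields.get("DOWNLOAD_x86_64") else ""
    downloads = fields.get("DOWNLOAD" + suffix, [])
    md5sums = fields.get("MD5SUM" + suffix, [])
    urls = []
    for download, md5 in zip(downloads, md5sums):
        # The 'filename' part is whatever follows the last slash in the URL path.
        filename = urlsplit(download).path.rsplit("/", 1)[-1]
        urls.append(f"{BASE_URL}{md5[0]}/{md5[1]}/{md5}/{filename}")
    return urls


for url in archive_urls(EXAMPLE_INFO):
    print(url)
# prints:
# https://slackware.uk/sbosrcarch/by-md5/3/5/354f8756a7f33cf5b7a56377d1759e41/atari800-3.1.0.tar.gz
```

The result can be handed to wget or curl. If the download fails, the
file's not in the archive.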
Q: I need a specific older version of a source file, not the latest
version that's packaged on SBo. Will the archive have it?
A: Probably not. Old versions don't disappear immediately when new
ones are archived, but they do get purged monthly... or, almost:
old files are deleted on the 30th of every month, and February is
only 28 or 29 days long!
Use the by-md5 tree if you're looking for an old version, since some
builds use unversioned filenames (new one will overwrite the old,
in the by-name tree).
If you know the exact filename and/or md5sum, you can always try a
Google search for them. Use "quotes" around the filename.
Q: How do I know it's safe to use files downloaded from the archive?
A: The same way you know it's safe to use any file you downloaded for
use with a SlackBuild: check the downloaded file's md5sum against
the MD5SUM line in the build's .info file.
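
If you'd rather script that check than run md5sum by hand, something
like the following Python sketch would do it. The function names
(md5_of_file, matches_info) are made up for this example. You pass in
the path of the downloaded file and the value copied from the MD5SUM
line.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex md5 digest of a file, reading it in 1MB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_info(path, expected_md5):
    """True if the file's md5sum equals the MD5SUM value from the .info file."""
    return md5_of_file(path) == expected_md5.strip().lower()
```

For the atari800 example used elsewhere in this FAQ, the call would be
matches_info("atari800-3.1.0.tar.gz", "354f8756a7f33cf5b7a56377d1759e41").
If it returns False, don't use the file.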
Q: How do I use the archive with automated tools such as sbopkg and sbotools?
A: For sbopkg and sbotools, you just run them normally. They'll automatically
search the archive if a source download fails.
Q: How complete is the archive?
A: Currently (2018-06-26), the by-md5 tree is 100% complete. This does
NOT count blacklisted sources (see next question).
For a more up-to-date answer, see the archive status page:
http://slackware.uk/sbosrcarch/STATUS
This gets updated nightly.
Q: Why are some sources missing from the archive?
A: Multiple answers:
- The archiver couldn't download the file. Maybe the site was down
when it tried, or the upstream developers removed the file. Generally
this will require the build's maintainer to fix the .info file or
update the SlackBuild to a newer version (that actually exists).
In some cases, the archive operator will find the file and manually
add it to the archive.
- The archiver downloaded the file, but the download's md5sum doesn't
match. The build maintainer will have to fix the .info file. We
won't archive any files we can't verify by md5sum.
- There is some software that can't be automatically downloaded
(requires account creation on the upstream site) or whose license
doesn't allow us to redistribute it.
The classic example of both is development/jdk: Oracle's license
requires that users download the file directly from their site and
doesn't allow us (or anyone else) to offer it for download. Also,
downloading from Oracle requires creating an Oracle account, so
the archiver couldn't auto-download it even if it were allowed.
Sources we can't download are blacklisted by the archiver, and
don't count towards the completion percentage on the status page.
The current blacklist is:
academic/novocraft
academic/wehi-weasel
development/amd-app-sdk
development/decklink-sdk
development/jdk
development/J-Link
development/sqlcl
development/sqldeveloper
office/treesheets
system/displaylink
system/oracle-instantclient-devel
system/oracle-xe
system/oracle-instantclient-basic
If you find a file in the archive that shouldn't be there due to
its license not allowing redistribution, PLEASE let us know so we
can remove and blacklist it. It is not our intention to violate
anyone's license.
Q: Why do some of the by-name directories have filenames ending in ".x86_64"?
A: This is due to a design flaw in the archive structure. We assumed that
download filenames would either be unique within an .info file, or else
that 2 files with the same filename were in fact the same file.
For 4 of the SlackBuilds, this turns out to be a bad assumption. Example:
development/p4's .info file has this:
DOWNLOAD="https://www.perforce.com/downloads/perforce/r18.1/bin.linux26x86/p4"
DOWNLOAD_x86_64="https://www.perforce.com/downloads/perforce/r18.1/bin.linux26x86_64/p4"
Notice that both URLs end in "/p4". The directory parts of the URL are
different, but the filenames are the same. In the archive, the 32-bit
download will be called "p4" and the 64-bit one will be "p4.x86_64".
The archive script successfully downloads these files and stores them
in the by-md5 tree in the correct directories. But when it tries to
store them in the by-name tree, it's trying to save two files in the
same directory with the same name. If it didn't use a different name,
the second one would overwrite the first.
The current list of builds affected by this is:
academic/ucsc-blat
development/p4
development/p4d
libraries/p4api
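
As a rough illustration of the naming rule described above, here's a
Python sketch. It is only a guess at the logic based on this answer,
not code from the sbosrcarch script. The function names are made up,
and the md5sums in the example are placeholders, not the real ones
for p4.

```python
from urllib.parse import urlsplit


def url_filename(url):
    """Return the part of the URL path after the last slash."""
    return urlsplit(url).path.rsplit("/", 1)[-1]


def by_name_filenames(sources, sources_x86_64):
    """Map md5sum -> filename inside one build's by-name directory.

    Both arguments are lists of (url, md5sum) pairs taken from the
    DOWNLOAD/MD5SUM and DOWNLOAD_x86_64/MD5SUM_x86_64 fields.
    """
    names = {}
    for url, md5 in sources:
        names[md5] = url_filename(url)
    for url, md5 in sources_x86_64:
        name = url_filename(url)
        # Same md5sum means it's the same file, so nothing new is stored.
        # A different file with an already-used name gets the suffix.
        if md5 not in names and name in names.values():
            name += ".x86_64"
        names.setdefault(md5, name)
    return names


# Placeholder md5sums, for illustration only.
print(by_name_filenames(
    [("https://www.perforce.com/downloads/perforce/r18.1/bin.linux26x86/p4", "a" * 32)],
    [("https://www.perforce.com/downloads/perforce/r18.1/bin.linux26x86_64/p4", "b" * 32)],
))
```

With those inputs, the 32-bit file is named "p4" and the 64-bit one
"p4.x86_64", matching what you see in the by-name directory.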
Q: I'm a SlackBuild maintainer, and the download URL for one of my builds
has disappeared. Can I use the archive URL as the DOWNLOAD in my .info
file?
A: Yes, but only as a temporary measure or a last resort.
It's better to do one of these:
- Find another copy of the source. Try a Google search for the exact
filename (in "quotes"), or the md5sum.
- Host the source yourself, if you have access to a web or ftp server.
- Ask on the slackbuilds-users mailing list. Someone will probably
volunteer to host the source for you, provided you have a copy of
it to send them (and if you don't, hey, there's this handy source
archive you can probably get it from...)
Using the archive as the DOWNLOAD results in less redundancy. Nobody
is currently mirroring the archive that we know of. Ideally, we want
every source file to have two working URLs: the original plus the
sbosrcarch one.
Q: I'm a SlackBuild maintainer, and one of my builds keeps showing up
on the sbosrcarch STATUS as missing. How can I prevent this?
A: This usually happens for one of these reasons:
1. You made a mistake in your submission. Double-check the DOWNLOAD URL(s)
and MD5SUM(s) in the .info file. If they're wrong, resubmit your build.
2. The filename in the download URL is "unversioned", meaning the version
number isn't part of the filename (e.g. "thingy-latest.tar.gz"). At
some point after you last updated your .info file, but before the
SBo public update, the file changed on the server. Actually, this
occasionally happens even for files that have the version number
in the filename: upstream makes a mistake (leaves a file out of the
tarball, for instance) and a day or so later, they fix it without
changing the version number. When the archiver downloads the file,
it checks the md5sum against your .info file and sees a mismatch,
so it won't archive the file.
3. Upstream made a new release after you updated your build, but before
the SBo public update, and they removed the old version from their
server (or, possibly, moved it to a different location like /archives/
or /old-versions/). When the archiver tries to download the file, it
gets a '404 Not Found' error.
For (2) and (3), the problem is really the same: the web is a moving
target. Your download URLs and their md5sums were valid, but they got
changed on the server sometime after you submitted your build.
The solution is the same for both: find somewhere else to host your
source downloads. Either use your own web or ftp server if you have
one, or ask on the mailing list and someone will probably volunteer
to host it for you. Once you have the file(s) hosted somewhere,
update your .info file to point to the new location.
Before you do this, make sure the license allows you to: if it
doesn't allow redistribution, you can't host the download somewhere
else... and neither can we, so the build should be added to the
sbosrcarch blacklist (let us know if this is the case).
4. The file on the server is 'protected', because the server checks
the HTTP Referer and/or User-Agent headers in the request. Typically
this means the download will work when using a browser, but will
fail when using wget or curl. Usually when this happens, one of
the sbosrcarch operators will manually download the file and add
it to the archive within a day or two. If not, let us know and
we'll get to it ASAP. Again, check the license of the download
file: if redistribution is not allowed, it should be added to the
blacklist and not kept in the archive.
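
If you suspect case (4) for one of your builds, one way to check is
to compare the server's response with and without those headers. The
Python sketch below is only an illustration of that idea: the function
names are made up, the archiver itself does not necessarily work this
way, and some servers reject HEAD requests, in which case you'd need a
normal GET instead.

```python
import urllib.error
import urllib.request


def build_request(url, referer=None, user_agent=None):
    """Create a HEAD request, optionally with Referer/User-Agent headers."""
    headers = {}
    if referer:
        headers["Referer"] = referer
    if user_agent:
        headers["User-Agent"] = user_agent
    return urllib.request.Request(url, headers=headers, method="HEAD")


def status_code(request, timeout=30):
    """Send the request and return the HTTP status code (e.g. 200, 403, 404)."""
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code
```

If status_code(build_request(url)) returns an error such as 403, but
the same URL with a browser-like User-Agent and the download page as
Referer returns 200, the server is probably checking those headers.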
Q: How do I create my own archive?
A: Two choices:
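- Mirror the directory the usual way, with rsync. Using wget
would be possible, but it would use about twice the bandwidth and
storage. This is because sbosrcarch makes extensive use of hard links
(each file appears in both the by-name and by-md5 trees). rsync can
preserve them with its -H (--hard-links) option, while wget would
download and store every file twice.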
- Get a copy of the sbosrcarch script and run it on your web server.
This will be more work on your part, but your archive will be
independent: it'll keep updating itself even if the original archive
at slackware.uk goes away someday.
The script lives here:
git clone https://slackware.uk/~urchlay/repos/sbostuff
It's written in Perl, and has extensive documentation. Run it as
"sbosrcarch --help" to see the docs.
If you're thinking about running a sbosrcarch instance, please
contact me (yalhcru@gmail.com). I've got a list (with only one
entry in it) and I'd like it to include all the archives eventually.
Also I'm pretty good at troubleshooting, if you're having problems
with the script.
Q: How much disk space will I need for my archive mirror/instance?
A: Currently (2018-06-26), the archive is 93GB. Measured separately, the
by-name and by-md5 trees each appear to be 93GB, but that's because every
file is hard-linked between the two trees, so the data is only stored once.
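
To see why the numbers work out that way, here's a small Python sketch
(the function name tree_sizes is made up for this example). It adds up
file sizes under a directory twice: once counting every directory
entry, and once counting each underlying file (device and inode pair)
only once, which is roughly how du treats hard links within a single
run.

```python
import os
import stat


def tree_sizes(root):
    """Return (apparent_bytes, unique_bytes) for regular files under root.

    apparent_bytes counts every directory entry; unique_bytes counts each
    inode once, so hard-linked copies are not counted twice.
    """
    apparent = 0
    unique = 0
    seen = set()
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            info = os.lstat(os.path.join(dirpath, filename))
            if not stat.S_ISREG(info.st_mode):
                continue  # skip symlinks, sockets, etc.
            apparent += info.st_size
            key = (info.st_dev, info.st_ino)
            if key not in seen:
                seen.add(key)
                unique += info.st_size
    return apparent, unique
```

Run on the top of an archive, the first number would come out at about
double the second. Run on by-name or by-md5 alone, the two numbers
would be about equal, since each tree on its own contains one link to
each file.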
If you're using the sbosrcarch script to create your archive, you can
run a smaller (incomplete) archive. The config file (sbosrcarch.conf)
has a "maxfilemegs" setting. Any file larger than this won't be
downloaded and archived. You can also blacklist builds (or whole
categories) to save space.