This code is preliminary but works. As noted in the title, it:
1. Generates a Test Repo and then
2. Adds the test repo to pkg
Part of this process involves running a web server. You can specify both the web server to run and fallback web servers to run (or install) if for some reason the specified web server won't run.
The fallback logic might need more work, but currently the web server specified is "busybox httpd", which I covered in a previous post. Since the Puppy version of busybox has this web server, no other web servers should be run (or installed) unless the user changes the specified web server in the script.
I created this code to test the repo update scripts in sc0ttman's package manager (i.e. pkg). I'll also want to try it with other web servers, as a way to test package installation while also testing the repo update scripts. The code which is the subject of this thread is a good demonstration of what can be done with "pkg".
2.0 Cherry Picking Items for the Test Repo
The code to select the repo items has three parts:
2.1. Identify the items of interest
2.2. Randomly pick a few of the items of interest
2.3. Filter the Repo DB Doc File to select only those randomly selected items of interest.
After the Repo Doc File has been filtered, the script will:
3.1 download only the items in the filtered repo db doc file
3.2 start the web server
3.3 add the new repo to pkg. This adds the item to ~/.pkg/sources, ~/.pkg/sources-all and /etc/apt/sources.list and then converts the repo doc file into puppy format.
There are two scripts which are part of pkg to convert the repo into puppy format. They are ppa2pup and ppa2pup_gawk. The latter gawk version is many times faster for a large repo but not necessarily faster if there are only a few items. The gawk version is part of the main branch but not yet part of an official release of pkg.
2.1 Cherry Picking Items of Interest
As noted above, the first step is to identify the items of interest for testing. In our case we are interested in packages which include the epoch number in the version (see the deb-version manpage). Historically, the puppy package manager has stripped the epoch number from the repo database, but this information could be useful for version comparison. The following awk program extracts the first three fields from a puppy "repo db doc file" (e.g. /var/packages/Packages-ubuntu-bionic-main) but only for the packages of interest, which are the ones that have a colon in their version number. The colon means that the version number includes the epoch.
Code: Select all
AWK_PRG_1=\
'BEGIN {FS="|"; OFS="|"}
# Keep records whose first field contains a colon (the epoch separator)
$1 ~ /^[^|]+:[^|]+$/ {
  print $1, $2, $3  # We might want to use some of the other fields for a different application
}'
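To see what the filter keeps, it can be run against a few made-up records. This is only a sketch: the sample records below are hypothetical and not taken from a real repo db doc file, and the epoch split at the end is just one way the retained epoch might be used.

```shell
#!/bin/sh
# Hypothetical sample records in puppy repo db format (not from a real repo).
sample_db='vim-2:8.0.1453|vim|2:8.0.1453|
bash-4.4.18|bash|4.4.18|
libgcc1-1:8.4.0|libgcc1|1:8.4.0|'

# Same test as in the post: keep records whose first field contains a colon.
AWK_PRG_1='BEGIN {FS="|"; OFS="|"}
$1 ~ /^[^|]+:[^|]+$/ { print $1, $2, $3 }'

selected=$(printf '%s\n' "$sample_db" | awk "$AWK_PRG_1")
printf '%s\n' "$selected"

# Split the epoch from the rest of the version for the first match.
first_version=$(printf '%s\n' "$selected" | head -n 1 | cut -d'|' -f3)
epoch=${first_version%%:*}      # text before the first colon
upstream=${first_version#*:}    # text after the first colon
echo "epoch=$epoch upstream=$upstream"
```

Only the vim and libgcc1 records pass the filter, because the bash record has no colon in its first field.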
As noted above, step two is to randomly pick some of these packages of interest and programmatically generate AWK code to select only these randomly picked items of interest.
Code: Select all
function echo_filter_line(){
  # Read one package name from stdin and emit it as an awk array assignment
  read -r a_pkg_name
  echo "pkg_filter[\"${a_pkg_name}\"]=\"true\""
}
while read -r pkg_record; do
  # Field 2 of each record is the package name
  echo "$pkg_record" | cut -f2 -d'|' | echo_filter_line
done < <( awk "$AWK_PRG_1" "$REPO_DB_DOC_FILE_in" ) \
 | sort -R | head -n 3 >> "$filter_lines_path"
The random selection is done by this part of the pipeline:
Code: Select all
sort -R | head -n 3
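The pick-then-wrap idea can be tried on its own with a short list of names. This is a minimal sketch rather than the script's actual code: the package names are placeholders, to_filter_line is a hypothetical helper, and shuf is used when available as an alternative to sort -R.

```shell
#!/bin/sh
# Hypothetical package names standing in for the output of the epoch filter.
names='libmythes-dev
libgcc1-ppc64el-cross
libreoffice-l10n-nso
vim
git'

# Wrap each package name read from stdin in an awk array assignment.
to_filter_line() {
    while read -r name; do
        printf 'pkg_filter["%s"]="true"\n' "$name"
    done
}

# Shuffle and keep three names; fall back to sort -R if shuf is missing.
if command -v shuf >/dev/null 2>&1; then
    picked=$(printf '%s\n' "$names" | shuf -n 3)
else
    picked=$(printf '%s\n' "$names" | sort -R | head -n 3)
fi

filter_lines=$(printf '%s\n' "$picked" | to_filter_line)
printf '%s\n' "$filter_lines"
```

Shuffling before wrapping means the helper only runs for the names that are actually kept.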
2.3 Filter the Repo DB doc file for only the items of interest.
Here is an example of the code generated by my script:
Code: Select all
#!/usr/bin/gawk -f
function init_filter(){
pkg_filter["libreoffice-l10n-nso"]="true"
pkg_filter["libmythes-dev"]="true"
pkg_filter["libgcc1-ppc64el-cross"]="true"
}
function filter_accept(s){ #Return true if we are to print the result
if ( pkg_filter[s] == "true" ){
return "true"
} else {
return "false"
}
}
BEGIN {init_filter()}
/^Package:/ { PKG=$0; sub(/^Package: /,"",PKG); FILTER_ACTION=filter_accept(PKG)}
{if (FILTER_ACTION == "true"){
print $0
}
}
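Each line in the body of init_filter() is one of the generated filter lines, for example:
Code: Select all
pkg_filter["libreoffice-l10n-nso"]="true"
These lines are inserted into the script when it is generated, using command substitution:
Code: Select all
$(cat "$filter_lines_path")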
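Putting the generation step together, one way to splice the filter lines into an awk script is an unquoted heredoc. This is a sketch under assumptions: the temporary paths, the shortened filter (using awk's "in" operator instead of filter_accept) and the two-package Packages excerpt are all made up for illustration.

```shell
#!/bin/sh
# Hypothetical paths; the original script's file names may differ.
work_dir=$(mktemp -d)
filter_lines_path="$work_dir/filter_lines"
filter_script="$work_dir/filter.awk"

printf 'pkg_filter["libmythes-dev"]="true"\n' > "$filter_lines_path"

# Unquoted heredoc delimiter: $(...) is expanded by the shell,
# so awk's own $0 has to be escaped as \$0.
cat > "$filter_script" <<EOF
function init_filter(){
$(cat "$filter_lines_path")
}
BEGIN { init_filter() }
/^Package:/ { PKG=\$0; sub(/^Package: /,"",PKG); KEEP=(PKG in pkg_filter) }
KEEP { print \$0 }
EOF

# Hypothetical two-package excerpt of a Debian Packages file.
packages='Package: libmythes-dev
Version: 2:1.2.4-3
Filename: pool/main/m/mythes/libmythes-dev_1.2.4-3_amd64.deb

Package: bash
Version: 4.4.18-2
Filename: pool/main/b/bash/bash_4.4.18-2_amd64.deb'

filtered=$(printf '%s\n' "$packages" | awk -f "$filter_script")
printf '%s\n' "$filtered"
rm -rf "$work_dir"
```

Only the libmythes-dev stanza survives, since it is the only package named in the filter lines.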
3.1 download only the items in the filtered repo db doc file
The code to download only the filtered items is quite simple.
Code: Select all
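AWK_PRG_3=\
'/^Filename:/ {
  FPATH=$2  # path of the package file, relative to the repo root
  # Assumes RURL is the base URL that the Filename paths are relative to
  system("mkdir -p \"$(dirname \"" RROOT "/" FPATH "\")\"")
  system("wget --quiet \"" RURL "/" FPATH "\" -O \"" RROOT "/" FPATH "\"")
}'
awk -v "RROOT=$repo_root_path" -v "RURL=$repo_url_in" \
 "$AWK_PRG_3" "${doc_path}/Packages"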
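Before fetching anything, it can help to print which URL would be downloaded to which path. The following dry run is only a sketch: the base URL, the repo root and the one-package excerpt are hypothetical, and it assumes the Filename field is relative to the repo's base URL.

```shell
#!/bin/sh
# Hypothetical values; the real script takes these from its own variables.
repo_url_in='http://archive.ubuntu.com/ubuntu'
repo_root_path='/var/www/html/test-repo'

# Hypothetical one-package excerpt of a filtered Packages file.
packages='Package: libmythes-dev
Filename: pool/main/m/mythes/libmythes-dev_1.2.4-3_amd64.deb'

# Print "source-url destination-path" for each Filename entry.
plan=$(printf '%s\n' "$packages" | awk \
    -v "RURL=$repo_url_in" -v "RROOT=$repo_root_path" \
    '/^Filename:/ { print RURL "/" $2, RROOT "/" $2 }')
printf '%s\n' "$plan"
```

Keeping the pool/ directory layout under the local root means the filtered Packages file can be served unchanged.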
3.2 start the web server
Given that there are fallback web servers both to run and/or install, the full code to start the web server is quite complicated. But in my example the basic code to start the web server is as follows:
Code: Select all
httpd -h /var/www/html
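The fallback idea can be reduced to "use the first candidate that exists". The sketch below is not the script's actual fallback logic: pick_web_server is a hypothetical helper, the candidate list and start commands are assumptions, and it only prints the command instead of running it. Note that on non-Puppy systems a command named httpd may be Apache rather than the busybox applet.

```shell
#!/bin/sh
# Return the first candidate command that exists on this system.
pick_web_server() {
    for candidate in "$@"; do
        if command -v "$candidate" >/dev/null 2>&1; then
            echo "$candidate"
            return 0
        fi
    done
    return 1
}

# Assumed preference order: the specified server first, then fallbacks.
server=$(pick_web_server httpd busybox python3) || server=''

case "$server" in
    httpd)   start_cmd='httpd -h /var/www/html' ;;
    busybox) start_cmd='busybox httpd -h /var/www/html' ;;
    python3) start_cmd='python3 -m http.server 80 --directory /var/www/html' ;;
    *)       start_cmd='' ;;
esac

echo "would run: ${start_cmd:-no web server found}"
```

Installing a missing server would be a further step after this check and is left out here.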
3.3 add the new repo to pkg
The code to add a new repo to pkg is straightforward. For instance, on Debian systems the node.js repo can be added as follows:
Code: Select all
pkg add-repo https://deb.nodesource.com/node_9.x stretch main
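In the test script the repo details are held in variables, and TEST_CMD selects which conversion script is used for the repo update:
Code: Select all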
TEST_CMD=ppa2pup_gawk
...
( exec <<< "$repo_name_out"
pkg add-repo "$repo_url_out" "$distro_ver_out" "$stream_out" )
...
case "$TEST_CMD" in
ppa2pup) PKG_PPA2PUP_FN=ppa2pup pkg --repo-update; ;;
ppa2pup_gawk) pkg --repo-update; ;;
esac
This coding exercise has given me some examples of how I can filter a Debian repo and automatically create a mirror of the filtered packages with sc0ttman's package manager (i.e. pkg). It will be useful for testing sc0ttman's package manager, and I will also be able to adapt the code to other applications. The biggest weakness is perhaps the complexity of using fallback web server packages, but I think this fallback approach will be useful for testing and that there are other things I can learn from these fallback techniques.