
#1 2006-07-12 13:08:27

djscholl
Member
From: Michigan, USA
Registered: 2006-05-24
Posts: 56

pacman + other packaging systems; this time it's CRAN

Developer Damir Perisa wrote on Flyspray:
http://bugs.archlinux.org/task/4910
"... by the way: if you know an intelligent way to pack R packages (CRAN pkgs) to arch packages, i'm interested to maintain the most popular ones. i have set up now some in my local repository (instead of directly installing them from R... because then pacman does not know who these files belong to - you cannot do pacman -Qo /whatever/file and get a pkg - one of the most useful tools in arch)..."

This made me wonder what solution could be found while keeping things simple and consistent.

I have found other examples of NPEP (non-pacman extension packages) discussed in Arch:  CPAN, CTAN, ruby gems, eclipse plugins.

Asking about a unified solution (2004) http://bbs.archlinux.org/viewtopic.php?t=8214

Making pkgbuilds for Perl modules (2005) http://bbs.archlinux.org/viewtopic.php?t=15128

A Perl script for CPAN-pacman integration (2006) http://bbs.archlinux.org/viewtopic.php?t=21048

Confusion caused by NPEP (2006) http://bbs.archlinux.org/viewtopic.php?t=21145

General discussion of NPEP (2006) http://bbs.archlinux.org/viewtopic.php?t=17571

CTAN status update (2006) http://www.archlinux.org/pipermail/arch … 11015.html

My summary of the preceding links:

A recurring problem is that updating the main package via pacman erases the NPEP. The user is responsible for updating NPEP, i.e., they are not handled by pacman -Su. When a user needs a new extension package, she may have to search two places: the Arch repository, and then possibly also the NPEP repository. NPEP often must be upgraded, or at least rebuilt, after the main package is upgraded via pacman. I believe that most, if not all, of these issues are relevant to R extension packages.

The native NPEP systems in Perl and Ruby are particularly good at handling complicated dependencies. This has been leveraged in the  CPAN-pacman integration tool, which automates the preparation of pkgbuild  files. People have gone to a lot of trouble to extract information from Perl's native packaging system and feed it to pacman. I'm less clear on the current status of Ruby. The R extension packages have interdependencies, but I don't know whether they are as difficult to handle  manually as those in Perl and Ruby, so I can't judge the need for an R-based tool to create pkgbuild files. (R is not the duct tape of the internet, so it seems like it would be more difficult to implement such a tool in R than it was in Perl.)

In short, the implementation details differ, but there is a strong consensus in favor of using pacman whenever possible, and this seems to apply with the same force to R extension packages.

The CRAN packages are supplied as source code for all Unix versions so they will be cross-platform, and also because the CRAN admins have no way to build or check the safety of Unix binaries. Therefore, it should be OK for Arch to distribute binary packages. Arch has a trusted binary infrastructure and is platform specific.

The question now becomes, is it possible to use pacman for the CRAN extension packages, and if so, to what extent can we follow the other examples in Arch?

CRAN packages should be built by R. In other words, R performs the equivalent of tar -zxf, configure, make, and make install. This is analogous to the way extension package builds are handled in Perl (perl Makefile.PL) and Ruby (ruby setup.rb). It can be handled by calling R from the pkgbuild build script, in the same way that people currently are calling Perl or Ruby. To accomplish this, we can write the pkgbuild to download the tarball into $startdir, and as the active ingredient of our build script, use:

R CMD INSTALL --library=$startdir/pkg/usr/lib/R/library foo_1.0.tar.gz 

R does its own extraction and apparently uses /tmp for the build, so there is no need for makepkg to extract into src; however, it does no harm that I know of. Running makepkg will then build the package and install it into $startdir/pkg/usr/lib/R/library/foo. As you will have guessed, the --library= option is the equivalent of the DESTDIR= option for make. The subdirectory foo can then be installed in /usr/lib/R/library/foo by pacman. Uninstalling will mean removing the subdirectory /usr/lib/R/library/foo, which will be handled by pacman, with no foo.install script needed.
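As a rough sketch of the staging step described above, the install could be wrapped in a small helper. The function name stage_cran_tarball is hypothetical (it is not part of makepkg or R), and creating the library directory beforehand is a precaution based on the assumption that R may refuse to install into a library path that does not exist yet.

```shell
# Hypothetical helper, not part of makepkg or R.
# $1 = CRAN tarball, $2 = staging root (the thread's $startdir/pkg)
stage_cran_tarball() {
  tarball=$1
  libdir=$2/usr/lib/R/library

  # Assumption: the target library directory should exist before R is called.
  mkdir -p "$libdir" || return 1

  # --library plays the role that DESTDIR plays for "make install".
  R CMD INSTALL --library="$libdir" "$tarball" || return 1
}

# Example call from inside build():
#   stage_cran_tarball foo_1.0.tar.gz "$startdir/pkg" || return 1
```

Everything R writes then ends up under the staging root, which is what lets pacman record ownership of the files.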

CRAN also contains bundles, which build into several directories installed at the same time; bundles could also be handled in the same way.

It seems to me that this potential solution meets our goals in terms of simplicity, consistency, and information available to pacman -Qo. Accordingly, I have written and tested a pkgbuild file for a CRAN extension package chosen at random. I pasted it inline below. It's not complete, but it builds without errors and makes what looks like a valid pkg on my system. I am not proposing this for the AUR; rather, it is meant as a discussion point in response to Damir Perisa's comments. A picture is worth a thousand words, or if I'm posting it, a picture is accompanied by a thousand words...

CRAN has worldwide download mirrors, but no equivalent to dl.sourceforge.net. Would it make sense for us to put the URL for the package's list of mirrors in a comment under the source line? This would be a reminder and convenience for people making their own builds. It's usually a waste of time to go to the URL to look for source mirrors when you are hacking a pkgbuild, because most projects don't have them, so it seems considerate for us to alert users when mirrors exist.

It would be good to have some method for identifying R extension packages in the Arch repositories for easy searching. Would a phrase like R-Extension in the pkgdesc work, or is there already a better way?

In addition to the chron directory, the build creates a file R.css, which is a duplicate of a file installed by the R package. I have the build script delete this so that it will not be removed at uninstall.

I had trouble with the underscore and dash in the tarball filename, so I hardcoded them. If there is a better way, no doubt someone will know what it is and (I hope) will clue me in.

pkgname=chron
pkgver=2.3.4
pkgrel=1
pkgdesc="R-Extension: Chronological objects which can handle dates and times"
url="http://cran.r-project.org/src/contrib/Descriptions/chron.html"
license="GPL"
depends=('r>=1.6.0')
source=("http://cran.r-project.org/src/contrib/chron_2.3-4.tar.gz")
# please use the mirror list at http://cran.r-project.org/mirrors.html
md5sums=('e53556d1d2ef049efc7ca0380ceaaa0e')

build() {
  R CMD INSTALL --library=$startdir/pkg/usr/lib/R/library chron_2.3-4.tar.gz
  rm $startdir/pkg/usr/lib/R/library/R.css
}

Corrections, comments, etc. are welcome.


#2 2006-07-12 23:16:02

Snowman
Developer/Forum Fellow
From: Montreal, Canada
Registered: 2004-08-20
Posts: 5,212

Re: pacman + other packaging systems; this time it's CRAN

To fix the underscore problem use braces:
source=("http://cran.r-project.org/src/contrib/${pkgname}_2.3-4.tar.gz")
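For anyone wondering why the braces matter: underscores and digits are legal in shell variable names, so without braces the shell looks up a different, longer name. A minimal sketch using the chron filename from the first post:

```shell
pkgname=chron

# Without braces the shell reads the variable name as "pkgname_2",
# which is unset, so it expands to an empty string.
unbraced="$pkgname_2.3-4.tar.gz"

# With braces the variable name ends at the closing brace.
braced="${pkgname}_2.3-4.tar.gz"

echo "$unbraced"   # prints ".3-4.tar.gz"
echo "$braced"     # prints "chron_2.3-4.tar.gz"
```

The dash is not a problem in itself, since a dash cannot be part of a variable name; it is the underscore directly after $pkgname that causes the trouble.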


#3 2006-07-12 23:23:10

dp
Member
From: Zürich, Switzerland
Registered: 2003-05-27
Posts: 3,378

Re: pacman + other packaging systems; this time it's CRAN

thank you djscholl for this nice review. i'm happy to see that you came to the same solution i came to about 2 years ago... in my private repos, i built cran pkgs in exactly the way you described... only without removing the R.css (i forgot about it, because i never had to remove a cran-lib till now).

so here i have updated my general PKGBUILD.cran

# $Id: PKGBUILD,v 1.5 2006/02/23 23:29:07 damir Exp $
# Maintainer :  damir <damir>
# Contributor: Damir Perisa <damir>

r_lib_name=Biodem # the name you need to type in R to load the lib: library(this_name)
r_lib_ver=0.1

pkgname=r-cran-`echo ${r_lib_name} | tr 'A-Z' 'a-z'`
pkgver=`echo ${r_lib_ver} | tr _- .`
pkgrel=1
pkgdesc="R-Extension: a number of functions for Biodemographycal analysis"
url="http://cran.r-project.org/src/contrib/Descriptions/${r_lib_name}.html"
license="GPL"
depends=('r>=1.8.0')
source=("http://cran.r-project.org/src/contrib/${r_lib_name}_${r_lib_ver}.tar.gz")

build() {
  R CMD INSTALL \
        --library=$startdir/pkg/usr/lib/R/library \
        ${r_lib_name}_${r_lib_ver}.tar.gz || return 1
  # remove R.css, which duplicates a file already installed by the r package
  rm $startdir/pkg/usr/lib/R/library/R.css
}

what do you think?
the idea with "R-Extension" in the description is a good thing for searching for these... but we can also make a "cran" group and assign all these r-libs to it. here i'm not perfectly sure what is best. you may have realised that my pkgname starts with a prefix. this is how i primarily marked this kind of pkgs for myself, and it is also how debian handles them.

any suggestions for further changes? if not, we should maybe make a list of libs we want to have packaged. it would even be possible to automate it and package all of them, but i don't see a reason for that if some of them are not used at all. i myself was/am using r-cran-biodem and r-cran-ape (ape has dependencies r-cran-gee, r-cran-nlme and r-cran-lattice) ... these are already prepared pkgs i can just move to [extra]. your r-cran-chron looks also useful :)
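To illustrate the automation idea, here is a minimal sketch of a generator that prints a pkgbuild in the shape of PKGBUILD.cran for one library. The function name make_cran_pkgbuild is hypothetical, and the license and depends lines are placeholders that would have to be checked for each library; md5sums and dependencies on other r-cran-* pkgs are not handled at all.

```shell
# Hypothetical generator, not an existing Arch tool.
# $1 = CRAN name, $2 = CRAN version, $3 = one-line description (no double quotes)
make_cran_pkgbuild() {
  cat <<EOF
r_lib_name=$1
r_lib_ver=$2

pkgname=r-cran-\$(echo \${r_lib_name} | tr 'A-Z' 'a-z')
pkgver=\$(echo \${r_lib_ver} | tr _- .)
pkgrel=1
pkgdesc="$3"
url="http://cran.r-project.org/src/contrib/Descriptions/\${r_lib_name}.html"
# placeholder: check the license of each library
license="GPL"
# placeholder: add the required R version and any r-cran-* dependencies
depends=('r')
source=("http://cran.r-project.org/src/contrib/\${r_lib_name}_\${r_lib_ver}.tar.gz")

build() {
  R CMD INSTALL \\
        --library=\$startdir/pkg/usr/lib/R/library \\
        \${r_lib_name}_\${r_lib_ver}.tar.gz || return 1
  rm -f \$startdir/pkg/usr/lib/R/library/R.css
}
EOF
}

# Example: make_cran_pkgbuild Biodem 0.1 "some description" > PKGBUILD
```

Only the three arguments are expanded when the text is generated; everything written with an escaped dollar sign is left for makepkg to expand when the pkgbuild is run.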


The impossible missions are the only ones which succeed.


#4 2006-07-13 14:33:53

djscholl
Member
From: Michigan, USA
Registered: 2006-05-24
Posts: 56

Re: pacman + other packaging systems; this time it's CRAN

I'm glad to see that we are converging on a solution. I do have a few comments on some of the details.

On second thought, R-Extension: is too long, since the field is limited to 80 characters. R-lib: or R-CRAN: would be better because they are shorter. However, the guidelines suggest not repeating the package name in the description. Since you are prefixing the name with r-cran, it would probably be best to leave it out of pkgdesc altogether.

There have been several recommendations made in the Forums to convert from backticks to the more modern $() syntax in pkgbuilds, and I haven't seen any arguments against it.
http://bbs.archlinux.org/viewtopic.php?t=22494
http://bbs.archlinux.org/viewtopic.php?t=21058
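As a small illustration of the difference (a sketch, with _name as a made-up variable): both forms give the same result for a simple command, but $() can be nested without backslash escaping. The tr ranges are quoted here as a precaution, since an unquoted [A-Z] is also a glob pattern that the shell could expand if a matching one-letter filename exists in the current directory.

```shell
_name=Biodem

# Backtick form, as found in older pkgbuilds.
old=`echo ${_name} | tr 'A-Z' 'a-z'`

# $() form: same result for a simple command.
new=$(echo "${_name}" | tr 'A-Z' 'a-z')

# $() nests cleanly; the inner substitution needs no escaping.
nested=$(echo "r-cran-$(echo "${_name}" | tr 'A-Z' 'a-z')")

echo "$old $new $nested"   # prints "biodem biodem r-cran-biodem"
```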

I am going to propose that you change the name of your local variables, even though I know that variable names are often strong personal preferences. However, perhaps you will agree with my reasons.

I have already seen the use of local variables for package name and version a number of times, even in my short time using Arch. In my opinion, this is an area where pkgbuild syntax, specifically, the rule of no local variables unless absolutely required,  is too restrictive. makepkg is brilliantly simple, but I think in this case it would be easier to use if it were slightly less simple. It is very common for version numbers out on the internet to be different from X.X.X, and fairly common for package names to include caps. In the AUR, we have to hardcode the exceptions, because the AUR website displays the source URL without running the pkgbuild through bash. This means that, unless we generate the pkgbuild from a script, we have to type a non-compliant name and/or version multiple times, at least once in the source and once in the build script. Typing multiple copies of the same information is something computers are good at, but humans are not. When the package is updated, we have to find and update every instance of the version string manually. This is an obvious application for a variable placed at the top of the pkgbuild. You don't have the same restrictions, since you are preparing binary packages, but your need for a local name and a local version different from the Arch name and Arch version are common needs. If you used a name that wasn't specific to R, but was general, then it could be adopted as a general solution.

The Arch Packaging Guidelines suggest that local variables begin with an underscore so as not to interfere with the makepkg namespace.
http://wiki.archlinux.org/index.php/Arc … _Standards
If you adopt this practice, then you don't need to rely on the chance that the makepkg developers will never adopt variables called r_lib_name or r_lib_ver, and you can use names that are more generic.

Given this name policy, the simplest variable name meaning "local package name" would be _pkgname, and the simplest variable name meaning "local package version" would be _pkgver. I first saw this usage in the swiftfox-amd package in the AUR, but it may be used elsewhere as well. These names could be used in a consistent way by anyone facing non-Arch-like names or version strings. The ultimate benefit would be if the AUR website developers saw these names come into common use, and decided to parse them in the same way they currently parse $pkgname and $pkgver. If they did this, AUR contributors and users could share the benefit of having only one place to type in names and versions. It would never be practical for the AUR website to support hundreds of different local variables, but they might be willing to add support for two, namely, _pkgname and _pkgver. Or, if they don't want to break the underscore-equals-private rule, they might be willing to add two new names, such as locpkgname and locpkgver.

In the short-term, then, we could achieve greater consistency in the ABS, and in the long-term, have a practical possibility to make the AUR consistent with the same pattern.

With these changes, your pkgbuild would become:

_pkgname=Biodem # the name you need to type in R to load the lib: library(this_name)
_pkgver=0.1

pkgname=r-cran-$(echo ${_pkgname} | tr 'A-Z' 'a-z')
pkgver=$(echo ${_pkgver} | tr _- .)
pkgrel=1
pkgdesc="A number of functions for Biodemographycal analysis"
url="http://cran.r-project.org/src/contrib/Descriptions/${_pkgname}.html"
license="GPL"
depends=('r>=1.8.0')
source=("http://cran.r-project.org/src/contrib/${_pkgname}_${_pkgver}.tar.gz")

build() {
  R CMD INSTALL \
        --library=$startdir/pkg/usr/lib/R/library \
        ${_pkgname}_${_pkgver}.tar.gz || return 1
  # remove R.css, which duplicates a file already installed by the r package
  rm $startdir/pkg/usr/lib/R/library/R.css
}
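The same scheme should also cover the chron case from my first post, where the CRAN version contains a dash and I had hardcoded the filename. A minimal sketch, with arch_name and arch_ver as hypothetical helper names used only to make the two conversions easy to check:

```shell
# Hypothetical helpers, shown only to isolate the two conversions.
arch_name() { printf 'r-cran-%s\n' "$(printf '%s' "$1" | tr 'A-Z' 'a-z')"; }
arch_ver()  { printf '%s\n' "$1" | tr '_-' '..'; }

_pkgname=chron
_pkgver=2.3-4

pkgname=$(arch_name "$_pkgname")
pkgver=$(arch_ver "$_pkgver")
tarball="${_pkgname}_${_pkgver}.tar.gz"

echo "$pkgname $pkgver $tarball"   # prints "r-cran-chron 2.3.4 chron_2.3-4.tar.gz"
```

The upstream name and version are typed once at the top, and both the Arch-style fields and the tarball filename are derived from them.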


#5 2006-07-13 16:53:26

dp
Member
From: Zürich, Switzerland
Registered: 2003-05-27
Posts: 3,378

Re: pacman + other packaging systems; this time it's CRAN

ok, most of the things you mentioned are details. i agree with you that $(echo | processing) is better... i have still some old pkgbuilds i have to adapt to it. the local variables are also minor (and i'm against confusing _pkgname with pkgname by calling them almost the same - i'm not against prefixing local variables that may interfere with makepkg with "_" but i like common sense *smile*) and the R-Extension is not needed in pkgdesc, i agree.

what about suggesting possible useful cran pkgs archers need?




#6 2006-07-13 23:11:17

djscholl
Member
From: Michigan, USA
Registered: 2006-05-24
Posts: 56

Re: pacman + other packaging systems; this time it's CRAN

Fair enough. You are writing the pkgbuilds, so the final call is obviously yours. Thanks for taking the time to discuss the issues with me. Now that we have discussed your plans on the forum, anyone who is thinking he might need to build CRAN packages should be able to find our discussion and get oriented.

I can't recommend any specific packages, as so far I have only used the functions that come packaged with R, such as Pearson's correlation. I use R to provide statistical functionality to my Python scripts. It is possible that I will need some packages in the coming months, but I don't know yet what they will be. I could have used the pls package on my former project, if it hadn't been canceled. I just want to make sure that I have a road map for how to access them when I get in the middle of an interesting problem and need one. I don't see any download statistics, or list of popular modules, on the r-project web site, so I don't see any guidance from that direction either. Perhaps a more experienced R user than I will post with some suggestions.


#7 2006-07-14 06:54:44

dp
Member
From: Zürich, Switzerland
Registered: 2003-05-27
Posts: 3,378

Re: pacman + other packaging systems; this time it's CRAN

ok no problem - next time i do arch things, i will move the libs i have in my local repo to [extra]... then we can see further candidates. this thread can be the request/discussion catcher on suggestions.... i watch this topic now


