I know you guys prefer seeing results over discussion, but I'm currently a bit frazzled from working too hard on it and would like some feedback from others.
I've been in an anti-Python sort of mindset recently, and I enjoy fiddling around with programming as a side distraction, so I decided I would try rewriting namcap in Ruby as a personal project. The problem I encountered was that Ruby doesn't have the equivalent of Python's tarfile library. The Archive::Minitar module found in the third-party Facets library can extract files, but it is incapable of properly extracting symlinks. I was reluctant to call the external tar command to do this for me, but there was no other choice.
I read more into the tar command and I realised something interesting. The command tar -tvf, when run on a package, produces a detailed listing of what the gzipped file contains--permission bits, owner, size, modification date, name, etc. The method which namcap uses is to wholly extract the package and run its rules on the result. It made me think--how many of these rules could use the information from tar -tvf instead of working directly on the extracted files?
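To make the idea concrete, here is a rough sketch of turning one line of that listing into something a rule could inspect. It assumes the GNU tar layout (mode, owner/group, size, date, time, name) and ignores hard-link entries; TarEntry and parse_tar_listing_line are made-up names for illustration, not anything from namcap.

```ruby
# Sketch only: assumes GNU tar's `tar -tvf` line layout, e.g.
#   -rwxr-xr-x root/root     12345 2008-12-20 18:33 usr/bin/foo
# TarEntry and parse_tar_listing_line are hypothetical names.
TarEntry = Struct.new(:mode, :owner, :size, :mtime, :name, :link_target)

def parse_tar_listing_line(line)
  # Awk-style split with a limit keeps spaces inside the file name intact.
  mode, owner, size, date, time, rest = line.chomp.split(' ', 6)
  return nil if rest.nil? # too few fields, so not a listing line

  # Symlinks are printed as "name -> target".
  name, target = mode.start_with?('l') ? rest.split(' -> ', 2) : [rest, nil]
  TarEntry.new(mode, owner, size.to_i, "#{date} #{time}", name, target)
end
```

Feeding it each line of the listing (for example from IO.popen) would give every rule a list of entries to work on without touching the disk.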
After reading through the source code for the depends rule I surmised that the only files in a package which will have shared dependencies are either executables or shared libraries (.so extension). Basically, anything that generates some sensible output from 'readelf -d'. I wondered if it would be possible to generate a list of files which meet these criteria, extract only those files, and apply all other rules on the information from tar -tvf. The command to extract specific files to the sandbox directory would be something like the following, where package is the path to the package file:
'tar -C ' + sandbox + ' -xf ' + package + ' ' + listoffiles.join(' ')
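A slightly more careful version of that idea might look like the sketch below. It guesses ELF candidates from the listing alone (an executable bit or a .so-style name, which is only a heuristic and will also pick up scripts) and builds the tar arguments as an array so that file names with spaces need no shell quoting. elf_candidate? and tar_extract_argv are hypothetical helper names.

```ruby
# Sketch only: both helpers are hypothetical, not part of namcap.

# Guess from the mode string and name whether an entry is worth
# extracting for `readelf -d`. Heuristic: regular file that is either
# executable or named like a shared library (libfoo.so, libfoo.so.1.2).
def elf_candidate?(mode, name)
  return false unless mode.start_with?('-') # regular files only
  !!(mode =~ /[xs]/ || name =~ /\.so(\.\d+)*\z/)
end

# Build the argument list for extracting only the given members of the
# package into the sandbox directory.
def tar_extract_argv(package, sandbox, files)
  ['tar', '-C', sandbox, '-xf', package, *files]
end
```

The array can be passed straight to Kernel#system, as in system(*tar_extract_argv(package, sandbox, files)), which bypasses the shell entirely.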
I believe this approach would be faster because it would not extract all files, and the other rules would not have to examine any extracted files but would instead process the tar -tvf information. So, for example, the permissions rule would read the permission string in the tar -tvf listing (something like -rwxr-xr-x) instead of checking the mode of every extracted file. The emptydirectory rule will be fun to write, I know.
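As a rough illustration of how those two rules could work from listing data only, here is a sketch. It assumes the mode string is the ten-character form tar prints (-rwxr-xr-x) and that directory names end with a slash, as GNU tar lists them. Setuid, setgid and sticky bits are ignored, and both function names are invented for the example.

```ruby
# Sketch only: hypothetical helpers working on `tar -tvf` data.

# Turn the nine permission characters of a mode string into an integer,
# so "-rwxr-xr-x" becomes 0755. Special bits (s, S, t, T) are not
# reported; lowercase s/t still count as the underlying execute bit.
def mode_string_to_bits(mode)
  bits = 0
  mode[1, 9].each_char.with_index do |c, i|
    bits |= (1 << (8 - i)) unless '-ST'.include?(c)
  end
  bits
end

# Given every member name in the package, return the directories that
# have nothing underneath them. Quadratic, which is fine for a sketch.
def empty_directories(names)
  dirs = names.select { |n| n.end_with?('/') }
  dirs.reject { |d| names.any? { |n| n != d && n.start_with?(d) } }
end
```

A permissions rule could then compare mode_string_to_bits(mode) against whatever modes it considers acceptable, and the emptydirectory rule would simply report whatever empty_directories returns.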
So far my progress has been very modest. I know it would be far better to make the changes to namcap itself, but I really wanted to try it in Ruby. (I originally intended this to be for fun anyway.) I have written the permissions rule, and have spent the past... four days on the depends rule. It is the largest and most elaborate rule, so I have focused my early efforts on it... with increasing levels of frustration while debugging it. It is starting to not be fun anymore, sadly. I have rewritten most of it--the last couple of lines are proving quite difficult to understand and debug.
I'm in no way suggesting that namcap gets a significant rewrite for this. It works well enough and speed should be a secondary concern with it. I'm posting this to see what people think of my idea and would like to know if it has any fatal flaws I have failed to account for. I actually don't know for sure yet if it will be faster in the end.
edit: oops. I reread the source code for some of the rules. It appears that namcap does pretty much the same thing using the Python tarfile library with tarfile.getnames. Oh well, forget this then.
Last edited by mentallaxative (2008-12-20 18:33:45)